Reading list
In no particular order:
Statisticians invent data science
Data Cubes
Streaming/Sampling
- Hellerstein et al., Online Aggregation
- Spark
- BlinkDB
- Kim et al., Rapid Sampling for Visualizations with Ordering Guarantees
- Trust Me, I’m Partially Right
- Sample-Oriented Task-Driven Visualizations
Additional reading
Approximating population distributions by carefully-chosen samples:
Vis Systems
Books
- Tukey’s Exploratory Data Analysis
- Gutierrez’s Data Scientists at Work