Wednesday 12th October, 2016
10:30am to 11:20am
How can a small team with a limited budget enable the analysis of large volumes of data?
Lindsey and Phil will explain how the Guardian has used Apache Spark and PrestoDB on AWS to support simple ingestion and fast querying of a wide range of datasets. Learn why it’s important to decouple storage from compute, and raw data sources from optimised query formats, and why there’s still no single perfect solution.
Mainly software, sometimes politics, fatherhood, games. Currently at Facebook after many years with @gdndevelopers. (Bio from Twitter)