I will describe the challenges we faced when designing a MongoDB database for processing large data streams and the solutions we applied. The difficulties included write-intensive loads, uneven access patterns (posts with many followers get far more hits than posts with few followers), and non-trivial privacy support. I will describe the schema design choices we made to optimize writes and to support efficient querying and retrieval. I will also talk about indexing strategies, the tradeoffs we made to work around MongoDB's design, and the reasoning we applied to find the best denormalization of collections.