Straggler Free Data Processing in Cloud Dataflow

A session at QCon London 2017

One of the main causes of performance problems in distributed data processing systems (from the original MapReduce to modern Spark and Flink) is "stragglers." Stragglers are parts of the input that take an unexpectedly long time to process, delaying the completion of the whole job, and wasting resources that stay idle. Stragglers can happen due to imbalance of data distribution or processing complexity, hardware/networking anomalies, and a variety of other factors.
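To make the cost of a straggler concrete, here is a minimal sketch with made-up task durations. It assumes each task runs on its own worker and the job finishes only when the slowest task does. The function name `job_stats` and the numbers are illustrative and are not taken from the talk.

```python
def job_stats(task_seconds):
    """Return (wall-clock seconds, utilization) for tasks run one per worker.

    Assumption for illustration: the job completes only when the slowest
    task completes, and workers that finish early sit idle until then.
    """
    wall_clock = max(task_seconds)      # the straggler sets the finish time
    busy = sum(task_seconds)            # total useful work done
    utilization = busy / (wall_clock * len(task_seconds))
    return wall_clock, utilization


# Three tasks take 10 s each, one straggler takes 40 s.
print(job_stats([10, 10, 10, 40]))
```

In this hypothetical case the job takes 40 seconds even though three of the four workers were done after 10, so less than half of the reserved worker time does useful work.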

Google Cloud Dataflow is the first system to address the problem of stragglers in a fully general way. It dynamically redistributes parts of already launched work from straggler workers onto idle workers to maximize utilization, while preserving data consistency and minimizing re-execution.
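The idea of handing off part of an already running task can be sketched as follows. This is a simplified illustration, not Cloud Dataflow's actual API: the class `RangeTask` and its methods `try_claim` and `try_split` are hypothetical names, and the input is assumed to be a simple range of record offsets. The running worker keeps everything it has already claimed, and only the unclaimed tail is given away, so no record is processed twice or dropped.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RangeTask:
    """Hypothetical unit of work covering record offsets [start, stop)."""
    start: int
    stop: int
    claimed: int = 0  # number of records already claimed from the front

    def try_claim(self) -> Optional[int]:
        """Claim the next offset to process, or None if the range is exhausted."""
        offset = self.start + self.claimed
        if offset >= self.stop:
            return None
        self.claimed += 1
        return offset

    def try_split(self, fraction: float) -> Optional["RangeTask"]:
        """Give away the tail of the unclaimed records.

        `fraction` (0 <= fraction < 1) is the share of the unclaimed records
        this task keeps. Returns the residual task for an idle worker, or
        None if there is nothing left to give away.
        """
        first_unclaimed = self.start + self.claimed
        remaining = self.stop - first_unclaimed
        split_at = first_unclaimed + int(remaining * fraction)
        if split_at >= self.stop:
            return None
        residual = RangeTask(split_at, self.stop)
        self.stop = split_at  # shrink this task; claimed work is untouched
        return residual


# A straggler has processed 10 of 100 records when an idle worker appears.
task = RangeTask(0, 100)
for _ in range(10):
    task.try_claim()
residual = task.try_split(0.5)
print(task, residual)
```

Because the split only moves the end of the running task, nothing the straggler has already done needs to be redone. In this sketch the idle worker simply starts claiming offsets from the residual task.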

This talk describes the theory and practice behind Cloud Dataflow's approach to straggler elimination, as well as the associated non-obvious challenges, benefits, and implications of the technique.

About the speaker

Eugene Kirpichov

I like distributed systems, functional programming and academic music. (Bio from Twitter)

QCon London 2017

London, England

6th–10th March 2017

Date: Wed 8th March 2017
