Towards the True Elasticity of Spark

A session at Spark Summit 2015

  • Michael Le

Monday 15th June, 2015

5:00pm to 5:30pm

How well an analytics engine responds to changing workload demands and resource availability largely determines its usefulness and adoption rate. In this talk, we present a study of how effective Spark's elasticity is when deployed on popular resource managers such as Mesos and YARN. In particular, we investigate how Spark workloads running on Mesos and YARN clusters behave as nodes are added to and removed from the clusters. Key measurements include workload runtime, resource utilization delay, average task waiting time, disk I/O, and network bandwidth consumption. We then analyze the impact of changing key scheduling parameters (e.g., locality wait time, locality preference, granularity of locality wait time, speculation, and resource re-offer interval) on these measurements. Lessons from this work will enable the building of effective auto-scaling infrastructure for Spark in a cloud environment.
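As a rough illustration of the kind of parameter sweep the abstract describes, the sketch below builds spark-submit command lines that vary one scheduling setting at a time. The mapping from the abstract's terms to Spark configuration properties (spark.locality.wait and its per-level variants, spark.speculation, spark.scheduler.revive.interval) is an assumption, as are all values, the helper names, and the application file name; the talk does not state which settings or values were actually tested.

```python
import shlex

# Assumed mapping from the abstract's terms to Spark properties.
# All values are hypothetical placeholders, not settings from the talk.
scheduling_conf = {
    "spark.locality.wait": "3s",              # locality wait time
    "spark.locality.wait.node": "3s",         # per-level granularity of the wait
    "spark.locality.wait.rack": "1s",
    "spark.speculation": "true",              # speculative task execution
    "spark.scheduler.revive.interval": "1s",  # resource re-offer interval
}


def spark_submit_args(conf, app="app.py"):
    """Turn a property dict into a spark-submit argument list."""
    args = ["spark-submit"]
    for key in sorted(conf):  # sorted so the output is deterministic
        args += ["--conf", f"{key}={conf[key]}"]
    args.append(app)
    return args


def with_locality_wait(conf, wait):
    """Return a copy of conf with a different base locality wait."""
    updated = dict(conf)
    updated["spark.locality.wait"] = wait
    return updated


if __name__ == "__main__":
    # Sweep the base locality wait while holding other settings fixed.
    for wait in ["0s", "3s", "10s"]:
        args = spark_submit_args(with_locality_wait(scheduling_conf, wait))
        print(" ".join(shlex.quote(a) for a in args))
```

Each printed command could be run against a cluster while nodes are added or removed, and the resulting runtime and task waiting time recorded for comparison across settings.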

About the speakers

Min Li

Research Staff Member at IBM TJ Watson Research Center

Michael Le




Time 5:00pm to 5:30pm PST

Date Mon 15th June 2015
