SparkR: The past, the present and the future

A session at Spark Summit 2015

  • Shivaram Venkataraman

Tuesday 16th June, 2015

4:30pm to 5:00pm

The SparkR project provides language bindings and runtime support that let users run scalable computations from R using Apache Spark. SparkR has an active set of contributors from many companies, and a number of recent developments have improved its performance and usability. These improvements include (a) a new R-to-JVM bridge that enables easy deployment to YARN clusters, (b) serialization and deserialization routines that enable integration with other Spark components such as ML Pipelines, (c) a complete RDD API, with DataFrame support on the way, and (d) performance improvements for various operations, including shuffles. This talk will present an overview of the project, outline some of the technical contributions, and discuss new features we plan to build over the next year. We will also present a demo showing how SparkR can be used to seamlessly process large datasets on a cluster directly from the R console.

About the speakers

Shivaram Venkataraman
Rui Sun

Architect at Intel - Big Data



Time 4:30pm to 5:00pm PST

Date Tue 16th June 2015
