A more scalable way of making recommendations with MLlib

A session at Spark Summit 2015

Tuesday 16th June, 2015

3:30pm to 4:00pm PST

Recommendation systems are among the most popular applications of machine learning. MLlib implements alternating least squares (ALS), a widely used collaborative filtering algorithm for making recommendations. We use Spark's in-memory caching and a special partitioning strategy to make ALS efficient and scalable. MLlib's ALS runs 10x faster than Apache Mahout's implementation and scales up to billions of ratings. In this talk, we present a more scalable implementation of ALS, designed to address the issues we experienced with the old implementation, along with scalability results on 100 billion ratings. We will review the ALS algorithm, describe the internal data storage used in the new implementation, and cover the techniques used to accelerate the computation and improve JVM efficiency. We will also discuss the next steps for recommendation algorithms in MLlib.
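
For readers unfamiliar with ALS, the alternating step the abstract refers to can be sketched on a single machine as follows. This is only an illustrative sketch under stated assumptions, not MLlib's implementation: it uses no Spark, no partitioning, and no caching, and the names `solve`, `update_factors`, `als`, `predict`, and `rmse` are hypothetical helpers introduced here. Each pass fixes one side's factors and solves a small regularized least-squares problem per user (then per item).

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def update_factors(observed, fixed, rank, reg):
    """Recompute one side's factors while the other side stays fixed.

    observed maps an id to a list of (other_id, rating) pairs.
    For each id we solve (sum v v^T + reg * I) x = sum rating * v.
    """
    out = {}
    for key, obs in observed.items():
        A = [[reg if p == q else 0.0 for q in range(rank)] for p in range(rank)]
        b = [0.0] * rank
        for other, rating in obs:
            v = fixed[other]
            for p in range(rank):
                b[p] += rating * v[p]
                for q in range(rank):
                    A[p][q] += v[p] * v[q]
        out[key] = solve(A, b)
    return out


def als(ratings, rank=2, reg=0.1, iterations=10, seed=0):
    """ratings is a list of (user, item, rating) triples."""
    rng = random.Random(seed)
    by_user, by_item = {}, {}
    for u, i, r in ratings:
        by_user.setdefault(u, []).append((i, r))
        by_item.setdefault(i, []).append((u, r))
    # Random initial item factors; user factors come from the first update.
    items = {i: [rng.random() for _ in range(rank)] for i in by_item}
    users = {}
    for _ in range(iterations):
        users = update_factors(by_user, items, rank, reg)  # items fixed
        items = update_factors(by_item, users, rank, reg)  # users fixed
    return users, items


def predict(users, items, u, i):
    """Predicted rating is the dot product of the two factor vectors."""
    return sum(a * b for a, b in zip(users[u], items[i]))


def rmse(ratings, users, items):
    """Root-mean-square error of the predictions on the given ratings."""
    sq = sum((r - predict(users, items, u, i)) ** 2 for u, i, r in ratings)
    return (sq / len(ratings)) ** 0.5
```

Calling `als(ratings)` on a list of `(user, item, rating)` triples returns two dictionaries of factor vectors, and `predict` scores any user/item pair seen during training. Each per-user and per-item solve is independent of the others, which is what makes the algorithm a natural fit for a distributed setting.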

About the speaker

Xiangrui Meng

Software Engineer at Databricks
