Appraiser: How Airbnb generates complex models in Spark for Demand Prediction

A session at Spark Summit 2015

Monday 15th June, 2015

3:00pm to 3:30pm

Many open source machine learning frameworks exist, such as Spark's MLlib and the Hadoop-based Mahout project. These frameworks are great for getting started with ML in products, but because they are so generic they may lack certain production-driven features. In this talk we will present the ML framework used to generate Appraiser and discuss some production-driven concepts that inform the development of the framework, such as:

Configurable feature engineering
  Feature code is written once and configured using text files through a feature transformation pipeline
  Interactions between features are picked to make sense, and thus we can scale boosting to many millions of bushy trees

Debuggability
  Boosted random forests are hard to debug
  Product quantization enables engineers to rapidly debug models and check for data quality

Production constraints
  Creating smooth models
  Enforcing monotonicity (e.g. demand should always decrease with increasing price)
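To make the monotonicity constraint concrete, here is a minimal sketch of one generic way to force a demand curve to be non-increasing in price: the pool-adjacent-violators algorithm (isotonic regression). This is an illustration only, not the method used in the Appraiser framework, which the abstract does not describe. The function name enforce_non_increasing and the sample demand values are hypothetical.

```python
from typing import List, Sequence


def enforce_non_increasing(values: Sequence[float]) -> List[float]:
    """Least-squares projection of `values` onto non-increasing sequences.

    Hypothetical helper: `values` are assumed to be demand estimates
    ordered by increasing price bucket.
    """
    # Each block is [sum of pooled values, number of pooled values].
    blocks: list = []
    for v in values:
        blocks.append([float(v), 1])
        # A violation occurs when a later block's mean exceeds the
        # previous block's mean; pool the two blocks until it is gone.
        while (len(blocks) > 1
               and blocks[-1][0] / blocks[-1][1] > blocks[-2][0] / blocks[-2][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand each block back to one value (its mean) per original point.
    result: List[float] = []
    for total, count in blocks:
        result.extend([total / count] * count)
    return result


# Made-up raw demand estimates for six increasing price buckets.
raw_demand = [10, 8, 9, 5, 6, 2]
print(enforce_non_increasing(raw_demand))
# prints [10.0, 8.5, 8.5, 5.5, 5.5, 2.0]
```

Pooling replaces each pair of violating neighbours with their average, so the output is the closest non-increasing curve to the input in the least-squares sense. A production system might instead build the constraint into training.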

About the speaker

Yang Li, Hector Yee

Machine Learning & Recommendations bio from LinkedIn

Time 3:00pm to 3:30pm PST

Date Mon 15th June 2015
