Monday 15th June, 2015
3:00pm to 3:30pm
Machine learning workflows are often complex. This talk discusses Pipelines, which were introduced in Spark 1.2 and 1.3 to facilitate ML development. We will cover the basic concepts, usage examples, a few implementation details, and plans for the future. Key takeaways: (1) Motivation: ML workflows are complex, and Pipelines simplify constructing them. (2) Concepts: Pipelines are sequences of ML algorithms that transform datasets. (3) Datasets: Pipelines use DataFrames as ML datasets, so they support diverse data types. (4) Usage: This talk will give examples of how Pipelines are used and of the API.
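
The idea in takeaway (2), a pipeline as a sequence of stages that each transform a dataset, can be sketched in a few lines. This is a toy illustration in plain Python, not Spark's API: the class names (Tokenizer, WordCounter, Pipeline) are hypothetical and defined here, and a list of dicts stands in for a DataFrame.

```python
# Toy illustration only: plain Python, not Spark's API.
# A list of dicts stands in for a DataFrame; all class names are hypothetical.


class Tokenizer:
    """Stage that adds a 'words' column by splitting the 'text' column."""

    def transform(self, rows):
        # Build new rows so the input dataset is left unchanged.
        return [dict(row, words=row["text"].lower().split()) for row in rows]


class WordCounter:
    """Stage that adds a 'num_words' column computed from 'words'."""

    def transform(self, rows):
        return [dict(row, num_words=len(row["words"])) for row in rows]


class Pipeline:
    """Applies its stages in order, feeding each stage's output to the next."""

    def __init__(self, stages):
        self.stages = list(stages)

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


dataset = [
    {"id": 0, "text": "Spark ML Pipelines"},
    {"id": 1, "text": "hello world"},
]

pipeline = Pipeline([Tokenizer(), WordCounter()])
result = pipeline.transform(dataset)

for row in result:
    print(row["id"], row["words"], row["num_words"])
```

Each stage only needs to agree on the column names it reads and writes, which is what lets stages be reordered or swapped without touching the rest of the workflow.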