Today, tooling for ad-hoc data science is fairly well understood. But when you build a repeated process such as an analytics or prediction system, things change over time, and how to handle that change is not always clear. Columns and features are added and removed. New models are developed. Data errors are discovered and corrected. How can we build a data pipeline system that handles these demands? This talk will discuss some of the systems challenges and solutions that arise when building evolving data science products, and we'll see how they are addressed at Twitter.