From Data Science to Production - deploy, scale, enjoy

A session at Crunch Conference 2016

Data science is quite a young field. One definition of a data scientist is a person who is better at statistics than any software engineer and better at software engineering than any statistician. Hence, it is important to talk not only about best practices for feature generation and avoiding overfitting, but also about software engineering topics.

The talk is based on our experience of data science development at Stylight, an international fashion e-commerce company that operates in 15 countries worldwide. We refer to our data applications written in R, Python, and Scala, but the content is not limited to these languages and applies to others as well.

The talk consists of three main parts. The first part introduces development best practices: how to structure your development, how to make deployment easy and reproducible, and how to set up continuous integration and commit-triggered deployments. The second part covers production deployment to the AWS stack, focusing in particular on the concepts of immutable infrastructure and infrastructure as code. The last part is about using a serverless architecture for data applications. We introduce an example, our outlier detection system, which scales automatically based on this approach.

About the speaker

Sergii Khomenko

Data scientist at STYLIGHT

Data scientist at STYLIGHT, one of the biggest fashion communities. Data analysis and visualisation hobbyist who works on problems not only during working hours but also in free time, for fun and for personal data visualisations.

Ex-deputy director/lecturer at HP International Institute of Technology, Kiev.

Speaker at various conferences: Berlin Buzzwords 2014, ApacheCon Europe 2014, Puppet Camp London 2015, Munich Developer Camp, Berlin Buzzwords 2015, Tableau Conference on Tour - Berlin 2015, Budapest BI forum 2015, CrunchConf 2015, FOSDEM 2016, PyData Amsterdam 2016.

Founder and speaker at Munich Golang User Group, Munich Tableau User Group. Speaker at Munich UseR Group, Munich Search User Group, Munich Quantified Self Meetup, Munich Datageeks, AWS User Group.
