Spotify is currently one of the most popular music streaming services in the world, with over 100 million monthly active users. Over the last few years we have seen phenomenal growth, which has now pushed our backend infrastructure out of our data centers and into the cloud. Earlier this year we announced that we are transitioning all of our backend to Google Cloud Platform (GCP).
In this talk we will give a brief overview of what our Data Infrastructure tribe provides at Spotify. Then we will take a deeper dive into some of the Data Infrastructure components:
Event Delivery - Our event delivery system is a key component of our data infrastructure. It delivers complete data with predictable latency and a well-defined interface for our developers. This data is used to produce Discover Weekly, Spotify Party, Year in Music and many more Spotify features. Here we will focus on the evolution of the event delivery service, the lessons learned, and some of the reasoning behind moving to Google Cloud Pub/Sub and into the cloud.
Datamon - Another key component of our data infrastructure is Datamon. Datamon provides an easy overview of the data delivered, not just by our event delivery system but by all systems producing data into our central storage. Datamon also integrates with PagerDuty to help with our Data Operations.
Styx - Any data infrastructure needs a way to schedule applications. Styx enables distributed and scalable scheduling of Docker containers. Styx evolved out of our extensive use of Luigi and the need for more specialized tools in our infrastructure. Styx is built using the Spotify Apollo framework and uses Kubernetes for container invocations.