Efficiently storing and real-time querying TBs of time series data with Paul Dix

A session at NY Open Stats Meetup November 2013: Efficiently storing and real-time querying TBs of time series data with Paul Dix

The Talk:

In this talk I'll introduce InfluxDB, a distributed time series database that we open sourced, based on our backend infrastructure at Errplane. I'll talk about why you'd want a database specifically for time series data, and cover the API and some of the key features of InfluxDB, including:

• Stores metrics (like Graphite) and events (like page views, exceptions, deploys)

• No external dependencies (self-contained binary)

• Fast. Handles many thousands of writes per second on a single node

• HTTP API for reading and writing data

• SQL-like query language

• Distributed to scale out to many machines

• Built-in aggregate and statistics functions

• Built-in downsampling

I'll talk about the underlying technology and some of the tradeoffs we made in the design to help it scale with time series data.
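To make the downsampling feature mentioned above concrete, here is a minimal Python sketch of the general idea: raw points are grouped into fixed time windows and each window is reduced to a single aggregate value. This is not InfluxDB's implementation or query syntax; the `downsample` function, the `(timestamp, value)` tuple layout, and the choice of the mean as the aggregate are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_seconds):
    """Average (timestamp, value) pairs over fixed-size time windows.

    Hypothetical helper for illustration only; timestamps are assumed
    to be integer seconds.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Align each timestamp to the start of the window it falls in.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[window_start].append(value)
    # Emit one averaged point per window, ordered by window start.
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Five raw points reduced to one point per 60-second window.
points = [(0, 1.0), (30, 3.0), (60, 10.0), (119, 20.0), (120, 5.0)]
result = downsample(points, 60)
print(result)
```

Storing only the per-window aggregates trades resolution for space, which is one reason a time series database might offer downsampling as a built-in feature.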

Paul Dix is co-founder and CEO of the Y Combinator-backed company Errplane. Paul is the series editor for Addison-Wesley's "Data & Analytics" series and the author of “Service-Oriented Design with Ruby and Rails.” He is a frequent speaker at conferences and user groups, including Web 2.0, RubyConf, RailsConf, and GoRuCo. Paul is the founder and organizer of the NYC Machine Learning Meetup. In the past he has worked at startups and at larger companies such as Google, Microsoft, and McAfee. He lives in New York City.


Date Thu 14th November 2013
