Flume: An Introduction

A session at Chicago Data Summit

Tuesday 26th April, 2011

3:45pm to 4:30pm (CST)

Flume is an open-source, distributed, streaming log collection system designed for ingesting large quantities of data into large-scale data storage and analytics platforms such as Apache Hadoop. It was built with four goals in mind: reliability, scalability, extensibility, and manageability. Its horizontally scalable architecture offers fault-tolerant, end-to-end delivery guarantees, supports low-latency event processing, provides a centralized management interface, and exposes metrics for ingest monitoring and reporting. It natively supports writing data to Hadoop's HDFS, but it also has a simple extension interface that allows it to write to other scalable data systems such as low-latency datastores or incremental search indexers.

About the speaker

Jonathan Hsieh

Apache HBase committer. Apache Flume Founder. Engineer @ Cloudera. Ski bum.
