Kafka - A distributed publish/subscribe messaging system

A session at ApacheCon North America 2011

Kafka is a distributed publish-subscribe messaging system that provides a scalable, high-throughput, low-latency solution for log aggregation and activity stream processing at LinkedIn. Written in Scala and built on Apache ZooKeeper, Kafka provides a unified stream for both real-time and offline consumption. We provide a mechanism for parallel data load into Hadoop as well as the ability to partition real-time consumption over a cluster of machines. Kafka combines the benefits of traditional log aggregators and messaging systems and has been used successfully in production for 8 months. It provides an API similar to that of a messaging system and allows applications to consume log events in real time. Written by the SNA team at LinkedIn, Kafka is open sourced under the Apache 2.0 License and is being prepared for submission as an Apache Incubator project. In this presentation, we will highlight the core design principles of the system, how it fits into LinkedIn's data ecosystem, and some of the products and monitoring applications it supports.

About the speaker

Neha Narkhede

Search, Distributed Systems, Stream data processing, LinkedIn

When

Time 2:00pm to 2:50pm PST

Date Fri 11th November 2011

Short URL

lanyrd.com/skdtp

Official session page

na11.apachecon.com/…19358
