Hadoop, Pig, and Twitter

A session at NoSQL East 2009

Massive growth in the size of business datasets leads many companies to Hadoop, an emerging architecture for parallel data processing. However, the migration path can be challenging, in part because MapReduce analyses are written in programming languages like Java and Python rather than SQL. Apache Pig is a high-level framework built on top of Hadoop that offers a powerful yet vastly simplified way to analyze the data stored there. It lets businesses harness the power of Hadoop through a simple language readily learnable by anyone who understands SQL. In this presentation, I will introduce Pig and show how it has been used at Twitter to solve numerous analytics challenges that became intractable with our former MySQL-based architecture.

About the speaker

Kevin Weil

VP of Product for Revenue at Twitter. Former big data engineer. Digital advertising, ultra-marathons, particle physics, diet mountain dew, hadoop, lolcats. (Bio from Twitter.)

Official event site

nosqleast.com/2009/
