Sessions at NoSQL East 2009 about Twitter


  • Hadoop, Pig, and Twitter

    by Kevin Weil

    Massive growth in the size of business datasets leads many companies to Hadoop, an emerging architecture for parallel data processing. However, the migration path can be challenging, in part because MapReduce analyses are written in programming languages like Java and Python rather than SQL. Apache Pig is a high-level framework built on top of Hadoop that offers a powerful yet vastly simplified way to analyze data in Hadoop. It allows businesses to leverage the power of Hadoop in a simple language readily learnable by anyone who understands SQL. In this presentation, I will introduce Pig and show how it has been used at Twitter to solve numerous analytics challenges that became intractable with our former MySQL-based architecture.
