Introduction to Apache Hadoop

A session at Strata 2012

Tuesday 28th February, 2012

9:00am to 12:30pm (PST)

This tutorial provides a solid foundation for those seeking to understand large-scale data processing with MapReduce, Hadoop and the associated ecosystem. It is intended for those who are new to Hadoop and want to understand where Hadoop is appropriate and how it fits with existing systems.

The agenda will include:

  • The rationale for Hadoop
  • Understanding the Hadoop Distributed File System (HDFS) and MapReduce
  • Common Hadoop use cases, including recommendation engines, ETL, time-series analysis and more
  • How Hadoop integrates with other systems, such as relational databases and data warehouses
  • An overview of the other components in a typical Hadoop “stack”, such as the Apache projects Hive, Pig, HBase, Sqoop, Flume and Oozie

About the speaker

Sarah Sproehnle

Cloudera, Inc

Next session in Ballroom CD

1:30pm The Two Most Important Algorithms in Predictive Modeling Today by Mike Bowles and Jeremy Howard


Strata 2012

Santa Clara, United States

28th February to 1st March 2012


Time 9:00am to 12:30pm PST

Date Tue 28th February 2012


Ballroom CD, Santa Clara Convention Center
