This tutorial provides a solid foundation for understanding large-scale data processing with MapReduce, Hadoop, and the associated ecosystem. The session is intended for those who are new to Hadoop and want to understand where Hadoop is appropriate and how it fits with existing systems.
The agenda will include:
- The rationale for Hadoop
- Understanding the Hadoop Distributed File System (HDFS) and MapReduce
- Common Hadoop use cases, including recommendation engines, ETL, time-series analysis, and more
- How Hadoop integrates with other systems, such as relational databases and data warehouses
- An overview of the other components in a typical Hadoop “stack”, such as the Apache projects Hive, Pig, HBase, Sqoop, Flume, and Oozie
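To preview the MapReduce model the session covers, here is a minimal sketch in plain Python: an in-process simulation of the map, shuffle, and reduce phases using word count, the canonical Hadoop example. The function names are illustrative only, not Hadoop APIs; in a real job the framework distributes these phases across a cluster and handles the shuffle itself.

```python
from collections import defaultdict

def map_phase(document):
    # Emit a (word, 1) pair for each word, as a Hadoop Mapper would.
    for word in document.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    # Group values by key; Hadoop performs this step between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Sum the counts for one key, as a Hadoop Reducer would.
    return (key, sum(values))

documents = ["the quick brown fox", "the lazy dog", "the fox"]
pairs = [p for doc in documents for p in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts["the"])  # 3
```

The appeal of the model is that the mapper and reducer are the only pieces a developer writes; partitioning the input, shuffling intermediate pairs, and re-running failed tasks are all handled by the framework.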