Introducing Apache Hadoop: The Modern Data Operating System

A session at Digital London

Tuesday 13th March, 2012

10:00am to 10:50am (GMT)

Sophisticated data instrumentation and collection technologies are leading to unprecedented data growth. Data-driven organizations need to scale their data storage and perform complex processing on the collected data (i.e., not just "queries"). Given the unstructured nature of the source data and the need to stay agile, organizations must also be able to change their schemas dynamically (at read time rather than at write time). Apache Hadoop is an open-source, distributed, fault-tolerant system that leverages commodity hardware to achieve large-scale, agile data storage and processing. In this presentation, Dr. Amr Awadallah will introduce the design principles behind Apache Hadoop and explain the architecture of its core sub-systems (the Hadoop Distributed File System and MapReduce). Awadallah will also contrast Hadoop with relational database systems and illustrate how the two complement each other. Finally, he will cover the Hadoop ecosystem at large, which includes a number of projects that together form a cohesive Data Operating System for the modern data center, and he will outline how this fits into existing data infrastructures.

About the speaker

Amr Awadallah

Founder/CTO @Cloudera. Nerd. Chubby. Smart. Gamer. Rough. Happy. Masterchief. Husband. Dad. PhD. Stanford. Egyptian. American. Muslim. (Bio from Twitter)

Next session in Seminar Room B

11am Advanced Reporting & Analysis for Big Data by Jaspersoft Corp.


Digital London

London, England

13th to 14th March 2012





Seminar Room B, ExCeL London
