
How to develop Big Data Pipelines for Hadoop

A session at Strata 2012

Wednesday 29th February, 2012

2:20pm to 3:00pm (PST)

Hadoop is not an island. To deliver a complete Big Data solution, a data pipeline needs to be developed that incorporates and orchestrates many diverse technologies.

A Hadoop-focused data pipeline must not only coordinate the running of multiple Hadoop jobs (MapReduce, Hive, or Pig), but also encompass real-time data acquisition and the analysis of reduced data sets extracted into relational/NoSQL databases or dedicated analytical engines.

Using real-time weblog processing as an example, this session will demonstrate how the open-source Spring Batch and Spring Integration projects can be used to build manageable and robust pipeline solutions around Hadoop.

About the speaker

Mark Pollack

SpringSource/VMware

Strata 2012

Santa Clara, United States

28th February to 1st March 2012

Short URL

lanyrd.com/smmwz

Books by speaker

  • Spring Data
