Your current filters are…
This tutorial provides a solid foundation for those seeking to understand large scale data processing with MapReduce and Hadoop, plus its associated ecosystem. This session is intended for those who are new to Hadoop and are seeking to understand where Hadoop is appropriate and how it fits with existing systems.
The agenda will include:
This tutorial will explain how to leverage a Hadoop cluster to do data analysis using Java MapReduce, Apache Hive and Apache Pig. It is recommended that participants have experience with some programming language. Topics include:
NetApp is a fast growing provider of storage technology. Its devices “phone home” regularly, sending unstructured auto-support log and configuration data back to centralized data centers. This data is used to provide timely support, to improve sales, and to plan product improvements. To allow this, data is collected, organized, and analyzed. The system currently ingests 5 TB of compressed data per week, which is growing 40% per year. NetApp was previously storing flat files on disk volumes and keeping summary data in relational databases. Now NetApp is working with Think Big Analytics, deploying Hadoop, HBase and related technologies to ingest, organize, transform and present auto-support data. This will enable business users to make decisions and provide timely response, and will enable automated response based on predictive models. Key requirements include:
In this session we look at the the lessons learned while designing and implementing a system to:
28th February to 1st March 2012