Petabyte Scale, Automated Support for Remote Devices

A session at Strata 2012

Thursday 1st March, 2012

4:00pm to 4:40pm (PST)

NetApp is a fast-growing provider of storage technology. Its devices “phone home” regularly, sending unstructured auto-support log and configuration data back to centralized data centers. This data is used to provide timely support, to improve sales, and to plan product improvements. To support these uses, the data is collected, organized, and analyzed. The system currently ingests 5 TB of compressed data per week, a volume that is growing 40% per year.

NetApp previously stored flat files on disk volumes and kept summary data in relational databases. It is now working with Think Big Analytics to deploy Hadoop, HBase and related technologies to ingest, organize, transform and present auto-support data. This will enable business users to make decisions and respond promptly, and will enable automated responses based on predictive models. Key requirements include:

  • Query data in seconds within 5 minutes of event occurrence.
  • Execute complex ad hoc queries to investigate issues and plan accordingly.
  • Build models to predict support issues and capacity limits so that action can be taken before issues arise.
  • Build models for cross-sell opportunities.
  • Expose data to applications through REST interfaces.

In this session we look at the lessons learned while designing and implementing a system to:

  • Collect 1,000 compressed messages of 20 MB each per minute.
  • Store 2 PB of incoming support events by 2015.
  • Provide low-latency access to support information and configuration changes in HBase at scale within 5 minutes of event arrival.
  • Support complex ad hoc queries that join multiple large-scale data sets, both structured and unstructured.
  • Operate efficiently at scale.
  • Integrate with a data warehouse in Oracle.
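The figures quoted above can be related with some back-of-the-envelope arithmetic. The following is a minimal sketch, not the speakers' own sizing method. It assumes decimal units, treats the 1,000 messages per minute as a peak rate, and assumes the 5 TB weekly ingest steps up by 40% once per year. The function names are hypothetical.

```python
# Figures taken from the abstract.
WEEKLY_INGEST_TB = 5.0           # compressed data ingested per week today
ANNUAL_GROWTH = 0.40             # 40% growth per year
PEAK_MESSAGES_PER_MINUTE = 1000  # assumed here to be a peak rate
MESSAGE_SIZE_MB = 20             # compressed size of each message
WEEKS_PER_YEAR = 52              # assumption: 52 ingest weeks per year


def peak_ingest_mb_per_second(messages_per_minute, message_size_mb):
    """Sustained bandwidth needed to absorb the peak message rate."""
    return messages_per_minute * message_size_mb / 60.0


def yearly_ingest_tb(weekly_tb, growth, years):
    """Per-year ingest in TB, assuming the weekly rate grows once per year."""
    return [weekly_tb * WEEKS_PER_YEAR * (1 + growth) ** year
            for year in range(years)]


if __name__ == "__main__":
    peak = peak_ingest_mb_per_second(PEAK_MESSAGES_PER_MINUTE, MESSAGE_SIZE_MB)
    print(f"Peak ingest: {peak:.1f} MB/s")

    running_total = 0.0
    for year, volume in enumerate(yearly_ingest_tb(WEEKLY_INGEST_TB, ANNUAL_GROWTH, 4)):
        running_total += volume
        print(f"Year {year + 1}: {volume:.1f} TB ingested, {running_total:.1f} TB cumulative")
```

Under these assumptions, four years of ingest accumulate to roughly 1.85 PB, which is the same order of magnitude as the 2 PB target. The peak message rate works out to about 333 MB/s, far above the current weekly average, which is why it is treated here as a burst target and not as a sustained load.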

About the speakers

Kumar Palaniappan


Ron Bodkin

Think Big Analytics


Strata 2012

Santa Clara, United States

28th February to 1st March 2012



Ballroom CD, Santa Clara Convention Center
