Apache HBase is a rapidly evolving, random-access distributed data store built on top of Apache Hadoop’s HDFS and Apache ZooKeeper. Drawing from real-world support experiences, this talk gives administrators insight into improving HBase’s availability and recovering from situations where HBase is unavailable. We share tips on the common root causes of unavailability, explain how to diagnose them, and prescribe measures for maximizing the availability of an HBase cluster. We discuss new features that improve recovery time, such as distributed log splitting, as well as supportability improvements. We also describe utilities, including new failure recovery tools that we have developed and contributed, that can be used to diagnose and repair rare corruption problems on live HBase systems.
Software Engineer — Cloudera
Jonathan is a Software Engineer with Cloudera, currently focused on the Apache HBase project. He is an Apache HBase committer and PMC member, a committer on the Apache Sqoop (incubating) project, and a committer and founder of the Apache Flume (incubating) project. Jonathan has an M.S. in Computer Science from the University of Washington, as well as an M.S. and a B.S. in Electrical and Computer Engineering from Carnegie Mellon University.
Training Program Lead — Cloudera
Jeff has worked at Cloudera for almost two years as a Solutions Architect, Technical Account Manager, and Education Program Lead. His Hadoop experience comes from helping customers in support, services, and education capacities.