Wednesday 24th October, 2012
5:00pm to 5:50pm
The initial implementation of a highly-available NameNode was completed and merged to Apache Hadoop trunk in February 2012. The design featured an Active NameNode with a hot Standby NameNode.
In this initial implementation, failure of the Active NameNode was not detected automatically, requiring an operator to initiate a failover from Active to Standby. This provided the Hadoop operator the ability to dramatically reduce the frequency of planned HDFS downtime due to configuration changes, software upgrades, hardware maintenance, etc. However, requiring operator intervention is clearly insufficient for preventing unplanned HDFS downtime. Since the initial implementation of HDFS HA, we have developed a system for monitoring the health of the Active NameNode and automatically triggering a failover to the Standby NameNode when the Active is no longer able to provide service.
In order to share pertinent file system state between the Active and Standby NameNodes, the initial implementation of HDFS HA relied on a highly-available NFS filer. While many organizations are willing to purchase and administer such a service, removing this dependency can lower the administrative overhead and potentially improve the reliability of operating HDFS.
This session will discuss the design and implementation of these features, as well as give an overview of how to deploy these new features.
Engineer at Cloudera, Hadoop/HBase committer, former Erlang hacker, machine learning enthusiast, Brown CS alumnus
Hadoop PMC member/committer, engineer at Cloudera, retired rapper, Tech Lead for kanyezone.com.