HFile is the file format HBase uses to store its data: a block-indexed format for sorted key-value pairs. These properties, together with the ability to generate HFiles from Hadoop jobs, make it useful for storing and serving data even outside HBase itself. This talk explains an approach to building a data server that uses HFiles as its storage format.
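To make "block-indexed, sorted key-value pairs" concrete, here is a minimal in-memory sketch of the idea in Python. It is a toy illustration only: the function names `build_blocks` and `lookup`, the `block_size` parameter, and the sample keys are all hypothetical, and it does not reproduce HFile's actual on-disk layout or API.

```python
from bisect import bisect_right


def build_blocks(sorted_pairs, block_size=2):
    """Split sorted (key, value) pairs into blocks and index each block by its first key.

    Assumption: blocks hold a fixed number of entries, purely to keep the sketch short.
    """
    blocks = [
        sorted_pairs[i:i + block_size]
        for i in range(0, len(sorted_pairs), block_size)
    ]
    index = [block[0][0] for block in blocks]  # first key of every block
    return blocks, index


def lookup(blocks, index, key):
    """Use the index to pick the single block that could hold `key`, then scan it."""
    # Last block whose first key is <= key.
    pos = bisect_right(index, key) - 1
    if pos < 0:
        return None  # key sorts before the first block
    for k, v in blocks[pos]:
        if k == key:
            return v
    return None


pairs = [("a", "1"), ("c", "2"), ("e", "3"), ("g", "4"), ("i", "5")]
blocks, index = build_blocks(pairs, block_size=2)
print(lookup(blocks, index, "g"))
```

Because the pairs are sorted, a lookup only needs a binary search over the small index and a scan of one block, which is roughly why a format of this shape can serve reads without loading the whole file.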
Marc is currently a Data Engineer at Last.fm, solving data-intensive problems using Hadoop. Previously he worked on semantics, distributed systems and grid technologies at the Barcelona Supercomputing Center.
Data engineer and grindcore / sludge / drone fanatic. I also love zombies. @lant