
HDF5 is for Lovers

A session at PyData Silicon Valley 2013

Monday 18th March, 2013

9:00am to 10:45am (PST)

HDF5 is a hierarchical, binary database format that has become a de facto standard for scientific computing. While the specification may be used in a relatively simple way (persistence of static arrays), it also supports several high-level features that prove invaluable. These include chunking, ragged data, extensible data, parallel I/O, compression, complex selection, and in-core calculations. Moreover, HDF5 bindings exist for almost every language, including two Python libraries (PyTables and h5py).
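As a rough illustration of three of the features listed above (chunking, compression, and extensible data), here is a minimal sketch using PyTables. It assumes the PyTables 3.x naming (`open_file`, `create_earray`); the 2.x series spelled these calls `openFile` and `createEArray`. The file name `demo.h5`, the node name `samples`, the chunk shape, and both helper functions are hypothetical choices made for this example, not part of the tutorial material.

```python
import os
import tempfile

import numpy as np
import tables  # PyTables 3.x naming; 2.x used openFile / createEArray


def write_compressed_extensible(path, blocks):
    """Append 2-D blocks to a chunked, zlib-compressed, extensible array.

    Hypothetical helper for illustration only.
    """
    filters = tables.Filters(complevel=5, complib="zlib")
    with tables.open_file(path, mode="w") as h5:
        # A 0 in the shape marks the dimension that may grow on append.
        earr = h5.create_earray(
            h5.root,
            "samples",
            atom=tables.Float64Atom(),
            shape=(0, 4),
            filters=filters,
            chunkshape=(256, 4),  # illustrative chunk size, not a tuned value
        )
        for block in blocks:
            earr.append(block)
        return int(earr.nrows)


def read_rows(path, start, stop):
    """Read a row range; only the chunks covering it need to be decompressed."""
    with tables.open_file(path, mode="r") as h5:
        return h5.root.samples[start:stop]


path = os.path.join(tempfile.mkdtemp(), "demo.h5")
blocks = [np.full((1000, 4), float(i)) for i in range(3)]
total_rows = write_compressed_extensible(path, blocks)
middle = read_rows(path, 1000, 1005)
```

Here the array starts with zero rows and grows by 1000 rows per append, while the slice read at the end pulls back five rows from the second block without loading the whole dataset into memory.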

This tutorial will discuss tools, strategies, and hacks for squeezing every ounce of performance out of HDF5 in new or existing projects. It will also go over fundamental limitations in the specification and provide creative and subtle strategies for getting around them. Overall, this tutorial will show how HDF5 plays nicely with all parts of an application, making both the code and the data faster and smaller. With such powerful features at the developer's disposal, what is not to love?!

This tutorial is targeted at a more advanced audience with prior knowledge of Python and NumPy. Knowledge of C or C++ and basic HDF5 is recommended but not required.

This tutorial requires Python 2.7, IPython 0.12+, NumPy 1.5+, and PyTables 2.3+. ViTables and Matplotlib are also recommended. All of these may be found in Linux package managers and are also available through EPD or easy_install, although ViTables may need to be installed independently.

About the speaker

Anthony Scopatz

Pretending was a way of learning. (Bio from Twitter)

Short URL

lanyrd.com/scfxkt

Official event site

sv2013.pydata.org
