Wednesday 7th March, 2012
1:20pm to 4:40pm
The tutorial will give a hands-on introduction to manipulating and analyzing large and small structured data sets in Python using the pandas library. While the focus will be on learning the nuts and bolts of the library's features, I also aim to demonstrate a different way of thinking regarding structuring data in memory for manipulation and analysis.
The tutorial will teach the mechanics of the most important features of pandas. It will be focused on the nuts and bolts of the two main data structures, Series (1D) and DataFrame (2D), as they relate to a variety of common data handling problems in Python. The tutorial will be supplemented by a collection of scripts and example data sets for the users to run while following along with the material. As such a significant part of the tutorial will be spend doing interactive data exploration and working examples from within the IPython console.
The tutorial will also teach participants best practices for structuring data in memory and the do's and don'ts of high performance computing with large data sets in Python. For participants who have never used IPython, this will also provide a gentle introduction to interactive scientific computing with IPython.
Author of pandas and upcoming book Python for Data Analysis. Building powerful tools for finance, data analysis, and statistics. Cofounder of @LambdaFoundry bio from Twitter
Sign in to add slides, notes or videos to this session