by Allen Downey
This tutorial is an introduction to Bayesian statistics using Python. My goal is to help participants understand the concepts and solve real problems. We will use material from my book, Think Stats: Probability and Statistics for Programmers (O’Reilly Media).
Bayesian statistical methods are becoming more common and more important, but there are not many resources to help beginners get started. People who know Python can use their programming skills to get a head start.
I will present simple programs that demonstrate the concepts of Bayesian statistics, and apply them to a range of example problems. Participants will work hands-on with example code and practice on example problems.
Students should have at least basic level Python and basic statistics. If you learned about Bayes’s Theorem and probability distributions at some time, that’s enough, even if you don’t remember it! Students should be comfortable with logarithms and plotting data on a log scale.
Students should bring a laptop with Python 2.x and matplotlib. You can work in any environment; you just need to be able to download a Python program and run it.
Outline:
by Wes McKinney
The tutorial will give a hands-on introduction to manipulating and analyzing large and small structured data sets in Python using the pandas library. While the focus will be on learning the nuts and bolts of the library's features, I also aim to demonstrate a different way of thinking regarding structuring data in memory for manipulation and analysis.
The tutorial will teach the mechanics of the most important features of pandas. It will be focused on the nuts and bolts of the two main data structures, Series (1D) and DataFrame (2D), as they relate to a variety of common data handling problems in Python. The tutorial will be supplemented by a collection of scripts and example data sets for the users to run while following along with the material. As such a significant part of the tutorial will be spend doing interactive data exploration and working examples from within the IPython console.
The tutorial will also teach participants best practices for structuring data in memory and the do's and don'ts of high performance computing with large data sets in Python. For participants who have never used IPython, this will also provide a gentle introduction to interactive scientific computing with IPython.
by Ian Ozsvald
At EuroPython 2011 I ran a very hands-on tutorial for High Performance Python techniques. This updated tutorial will cover profiling, PyPy, Cython, numpy, NumExpr, ShedSkin, multiprocessing, ParallelPython and pyCUDA. Here's a 55 page PDF write-up of the EuroPython material: http://ianozsvald.com/2011/07/25...
At EuroPython 2011 I ran a very hands-on tutorial for High Performance Python techniques. This updated tutorial will cover:
I plan to expand the original material and to maybe also cover other tools like execnet and PyPy-numpy.
In this tutorial, I will cover how to write very fast Python code for data analysis. I will briefly introduce NumPy and illustrate how fast code for Python is written in SciPy using tools like Fwrap / F2py and Cython. I will also describe interesting new approaches to creating fast code that is leading changes to NumPy on a fundamental level.
In this tutorial, I will cover how to write very fast Python code for data analysis including making use of NumPy and using GPUs. I will largely focus on writing extensions to Python using hand-wrapping and Cython but will touch also on using tools like weave, Instant, ShedSkin and compare them to PyPy. I will also spend the last part of the tutorial on using GPUs with Python and discuss the performance trade-offs of the technology. This will be a high-level overview of the space with deep dives in Cython and GPUs
Outline: