by Matt Spitz
There are a plethora of options when it comes to deciding how to add a machine learning component to your python application. In this talk, I'll discuss why python as a language is well-suited to solving these particular problems, the tradeoffs of different machine learning solutions for python applications, and some tricks you can use to get the most out of whatever package you decide to use.
This is the age of data. As more companies expose their datasets through APIs, it's becoming increasingly easier to pull information about users, places, and things. But having this data isn't always enough; we want to understand it, find correlations, and identify trends. Fortunately, the area of computer science known as machine learning has a variety of algorithms specifically designed to do this sort of data wrangling. For the python application developer, there are many off-the-shelf toolkits that include implementations of these algorithms (Orange, NLTK, SHOGUN, PyML and scikit-learn to name just a few), but choosing which one to use can be daunting.
There are a number of tradeoffs one makes when making a selection, depending on the specifics of the implementation and the needs of the application. In this talk, I'll give an overview of some of the packages available and discuss what factors might go into deciding which one to use. I'll also offer some python-specific tricks you can use to work with large amounts of data efficiently.
7th–15th March 2012