Sessions at PyCon US 2012 in D3

Wednesday 7th March 2012

  • Introduction to Event Driven Programming Using Twisted

    by Jean-Paul Calderone

    This tutorial introduces programmers with basic Python skills to the concepts and techniques of event-driven programming. The focus is on understanding an event loop and handling the events related to TCP connections. Twisted is introduced as a re-usable event loop implementation, and the abstract concepts of event-driven programming are related to specific uses of the Twisted library.

    • What is event driven programming
    • What is it an alternative to
    • What are its advantages
    • How does an event loop work
    • Build one step by step to demonstrate
    • Demonstrate a server which can handle many clients
    • Demonstrate a client which can run in the same event loop
    • Demonstrate timed events in the event loop
    • How are event handlers connected to form a program
    • Callback functions
    • Deferreds
    • Generator tricks - inlineCallbacks
    • Coroutines - stackless, corotwine
    • More
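
    As a rough illustration of the "how does an event loop work" items above, here is a minimal sketch of a loop that handles timed events and socket readiness using only the standard library. It is not Twisted's reactor and not the tutorial's material; `ToyEventLoop`, `call_later`, `add_reader` and `demo` are names made up for this example.

```python
import heapq
import selectors
import socket
import time


class ToyEventLoop:
    """A deliberately small event loop: timers plus socket readiness.

    Hypothetical teaching sketch, not Twisted's reactor.
    """

    def __init__(self):
        self._selector = selectors.DefaultSelector()
        self._timers = []    # heap of (deadline, sequence, callback)
        self._sequence = 0   # tie-breaker so callbacks are never compared
        self._running = False

    def call_later(self, delay, callback):
        """Schedule callback() to run after `delay` seconds."""
        self._sequence += 1
        deadline = time.monotonic() + delay
        heapq.heappush(self._timers, (deadline, self._sequence, callback))

    def add_reader(self, sock, callback):
        """Run callback(sock) whenever `sock` has data to read."""
        self._selector.register(sock, selectors.EVENT_READ, callback)

    def remove_reader(self, sock):
        self._selector.unregister(sock)

    def stop(self):
        self._running = False

    def run(self):
        self._running = True
        while self._running:
            # Wait until the next timer is due or a socket becomes readable.
            timeout = None
            if self._timers:
                timeout = max(0.0, self._timers[0][0] - time.monotonic())
            for key, _ in self._selector.select(timeout):
                key.data(key.fileobj)  # key.data holds the registered callback
            # Fire every timer whose deadline has passed.
            now = time.monotonic()
            while self._timers and self._timers[0][0] <= now:
                _, _, callback = heapq.heappop(self._timers)
                callback()
        self._selector.close()


def demo():
    """One 'client' write and one 'server' read inside the same loop."""
    events = []
    loop = ToyEventLoop()
    left, right = socket.socketpair()

    def on_readable(sock):
        events.append(("read", sock.recv(1024)))
        loop.remove_reader(sock)

    loop.add_reader(right, on_readable)
    loop.call_later(0.01, lambda: left.sendall(b"ping"))  # timed event
    loop.call_later(0.05, loop.stop)                      # end the demo
    loop.run()
    left.close()
    right.close()
    return events
```

    The program never blocks on a single connection: it asks the operating system which events are ready and dispatches each one to a callback. Deferreds and inlineCallbacks are ways of structuring those callbacks and are not shown here.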

    At 9:00am to 12:20pm, Wednesday 7th March

    In D3, Santa Clara Convention Center

  • Graph Analysis from the Ground Up

    by Van Lindberg

    Graphs are a fundamental datatype, but typical developers don't get as much exposure to using and working with graphs as with other datatypes like tables and queues. This is a from-the-ground-up working session; by the end, attendees should have the tools and experience to model and analyze problems with graphs.

    This tutorial is intended to bring somebody with Python experience but limited or no experience using graph-based algorithms to a place where they:

    • Understand the basics of graph theory and why it can be helpful;
    • Are familiar with the available tools for dealing with graphs;
    • Recognize how to model a problem in terms of a graph; and
    • Have a first hands-on experience applying the theory and the tools to solve an interesting real-world problem.

    To do this, the tutorial is divided into four sections, each corresponding to one of the objectives above. Each portion will have a hands-on exercise pertaining to the exact subject, with part 4 as a crowning workshop bringing together various skills and points raised throughout the session; after having a few minutes to work on their own code and ask questions, the class as a whole will walk through a solution.
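
    To give a flavour of what modelling a problem as a graph can look like, here is a small sketch that stores an undirected graph as a dictionary of neighbour sets and finds a shortest path with breadth-first search. The helper names `build_graph` and `shortest_path` and the sample edges are assumptions made for this illustration, not the tutorial's exercises.

```python
from collections import deque


def build_graph(edges):
    """Return an undirected graph as {node: set_of_neighbours}."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Made-up sample data: two routes from A to D, one shorter than the other.
sample = build_graph([("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D")])
```

    Calling `shortest_path(sample, "A", "D")` returns the three-node route through `E` rather than the longer one through `B` and `C`.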

    At 1:20pm to 4:40pm, Wednesday 7th March

    In D3, Santa Clara Convention Center

Thursday 8th March 2012

  • Social Network Analysis with Python

    by Maksim Tsvetovat

    Social Network data permeates our world -- yet we often don't know what to do with it. In this tutorial, I will introduce both theory and practice of Social Network Analysis -- gathering, analyzing and visualizing data using Python and other open-source tools. I will walk the attendees through an entire project, from gathering and cleaning data to presenting results.

    SNA techniques are derived from sociological and social-psychological theories and take into account the whole network (or, in the case of very large networks such as Twitter, a large segment of the network). Thus, we may arrive at results that seem counter-intuitive -- e.g. that Justin Bieber (7.5 mil. followers) and Lady Gaga (7.2 mil. followers) have relatively little actual influence despite their celebrity status -- while a middle-of-the-road blogger with 30K followers is able to generate tweets that "go viral" and result in millions of impressions.

    In this tutorial, we will conduct social network analysis of a real dataset, from gathering and cleaning data to analysis and visualization of results. We will use Python and a set of open-source libraries, including NetworkX, NumPy and Matplotlib.

    Outline:

    • Introduction. Why should we do this? What is the data like? Why is this different from other techniques? What can we learn?
    • Centralities: Degree, closeness, betweenness, PageRank, Klout Score
    • Beyond Klout Score: Finding communities of interest, finding clusters in networks
    • Information diffusion in networks -- how do things go viral?
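
    Two of the centrality measures named in the outline can be sketched in plain Python. This is an illustrative approximation under simple assumptions (an undirected graph stored as a dict of neighbour sets, a fixed number of PageRank iterations); the tutorial itself uses NetworkX, and Klout Score is not attempted here. The function names and the star-shaped sample network are made up for the example.

```python
def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power iteration; every neighbour must also be a key of `graph`."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in graph}
        for node, neighbours in graph.items():
            if neighbours:
                # Share this node's rank equally among its neighbours.
                share = damping * rank[node] / len(neighbours)
                for neighbour in neighbours:
                    new_rank[neighbour] += share
            else:
                # A node with no links spreads its rank over everyone.
                for other in graph:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


# Made-up sample network: one hub connected to three otherwise isolated nodes.
star = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
}
degrees = degree_centrality(star)
ranks = pagerank(star)
```

    In this toy network both measures agree that `hub` is the most central node. On real data they can disagree, which is part of why more than one centrality is worth computing.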

    At 9:00am to 12:20pm, Thursday 8th March

    In D3, Santa Clara Convention Center

  • Introduction to NLTK

    by Jacob Perkins

    Learn the basics of natural language processing with NLTK, the Natural Language ToolKit. First we'll cover tokenization, stemming and WordNet. Next we'll get into part-of-speech tagging, chunking & named entity recognition. Then we'll close with text classification and sentiment analysis. You'll walk out with new super-powers and an appreciation of the difficulties of analyzing human language.

    This tutorial will be a hands-on approach to learning natural language processing using NLTK, the Natural Language ToolKit. We will cover everything from tokenizing sentences to phrase extraction, from splitting words to training your own text classifiers for sentiment analysis. Please come prepared with NLTK already installed so we can dive into the code & data immediately.

    Hour 1: Tokenization, Stemming & Corpora

    Tokenization & familiarity with corpus readers and models are required knowledge before you can get into the more interesting aspects of NLTK. This first hour will include:

    • an overview of modules & data
    • loading pickled models
    • sentence & word tokenization
    • stemming & lemmatization
    • an overview of WordNet and other included corpora
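
    For readers new to the terms, the following sketch shows what sentence tokenization, word tokenization and stemming do, using only regular expressions. It is a crude stand-in written for this listing, not NLTK's tokenizers or stemmers, which handle many cases this ignores; `split_sentences`, `split_words` and `crude_stem` are hypothetical names.

```python
import re


def split_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace (naive)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_words(sentence):
    """Separate runs of word characters from individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def crude_stem(word):
    """Strip a few common English suffixes, keeping at least three letters."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


text = "Cats chased mice. Dogs were barking!"
sentences = split_sentences(text)
tokens = [split_words(sentence) for sentence in sentences]
stems = [[crude_stem(token) for token in sentence] for sentence in tokens]
```

    Even this small example shows why the problem is hard: the naive rules turn "chased" into "chas" and would split a sentence at an abbreviation such as "Dr.".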

    Hour 2: Part-of-Speech Tagging & Chunking/NER

    Using tokenization and a working knowledge of corpus readers & pickled models, we'll dive into part-of-speech tagging and chunking/NER, including:

    • using a part-of-speech tagger
    • an overview of tags and tagged corpora
    • training a custom tagger with nltk-trainer
    • using a chunker for phrase extraction and named entity recognition
    • an overview of chunked corpora
    • training a custom chunker with nltk-trainer
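
    The idea behind training a simple tagger can be sketched without any library: remember the most frequent tag seen for each word, and fall back to a default tag for unknown words. This is only an illustration of the concept, not NLTK's or nltk-trainer's implementation; the function names and the tiny hand-tagged training set are made up, with tags following the Penn Treebank convention.

```python
from collections import Counter, defaultdict


def train_unigram_tagger(tagged_sentences):
    """Map each lower-cased word to the tag it was seen with most often."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag_words(words, model, default_tag="NN"):
    """Tag known words from the model; back off to `default_tag` otherwise."""
    return [(word, model.get(word.lower(), default_tag)) for word in words]


# Hypothetical hand-tagged training data, far too small for real use.
training = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]
model = train_unigram_tagger(training)
tagged = tag_words(["The", "dog", "sleeps", "quietly"], model)
```

    The unseen word "quietly" receives the default noun tag, which is wrong here. Backing off through a chain of progressively simpler taggers is a common way to reduce such errors.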

    Hour 3: Text Classification & Sentiment Analysis

    After using classifiers for training part-of-speech taggers and chunkers, this final hour will explain text classification in greater detail with:

    • an overview of classified corpora
    • text feature extraction
    • an overview of classification algorithms & when to use them
    • training a sentiment analysis classifier on movie reviews with nltk-trainer
    • using a classifier for sentiment analysis
    • hierarchical classification for sentiment analysis
    • binary vs multi-label classification
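
    Feature extraction and classification can be illustrated with a small bag-of-words naive Bayes classifier written from scratch. This is a sketch under simplifying assumptions (whitespace tokenization, add-one smoothing, two labels), not the classifier used in the tutorial; the class name and the four training sentences are invented for the example.

```python
import math
from collections import Counter, defaultdict


def bag_of_words(text):
    """Feature extraction: lower-cased word counts, split on whitespace."""
    return Counter(text.lower().split())


class ToyNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.label_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()

    def train(self, labelled_texts):
        for text, label in labelled_texts:
            self.label_counts[label] += 1
            for word, count in bag_of_words(text).items():
                self.word_counts[label][word] += count
                self.vocabulary.add(word)

    def classify(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Log prior plus the log likelihood of each observed word.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            denominator = total_words + len(self.vocabulary)
            for word, count in bag_of_words(text).items():
                probability = (self.word_counts[label][word] + 1) / denominator
                score += count * math.log(probability)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training sentences, labelled by hand for this example.
classifier = ToyNaiveBayes()
classifier.train([
    ("a wonderful moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("a dull boring film", "neg"),
    ("terrible acting and a weak story", "neg"),
])
```

    With this training set, `classifier.classify("a great film")` returns `"pos"` and `classifier.classify("a boring story")` returns `"neg"`. A real sentiment classifier would need far more data, such as a corpus of movie reviews.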

    Wrapping Up

    Now that you know how to use NLTK to process some of the included English corpora, we'll wrap up by covering:

    • non-English corpora included with NLTK
    • other Python libraries for NLP
    • custom corpus creation

    At 1:20pm to 4:40pm, Thursday 8th March

    In D3, Santa Clara Convention Center