Sessions at PyCon US 2012 about Python on Sunday 11th March

  • Parsing Horrible Things with Python

    by Erik Rose

    If you've ever wanted to get started with parsers, here's your chance for a ground-floor introduction. A harebrained spare-time project gives birth to a whirlwind journey from basic algorithms to Python libraries and, at last, to a parser for one of the craziest syntaxes out there: the MediaWiki grammar that drives Wikipedia.

    Some languages were designed to be parsed. The most obvious example is Lisp and its relatives, which are practically parsed when they hit the page. However, many others, including most wiki grammars, grow organically and get turned into HTML by sedimentary strata of regular expressions, all backtracking and warring with one another, making it difficult to output other formats or make changes to the language.
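
    To make the Lisp remark concrete, here is a minimal sketch of a reader for a Lisp-like syntax. It is not taken from the talk; the function names `tokenize` and `parse` are illustrative, and the grammar is assumed to contain only parentheses and whitespace-separated atoms.

```python
def tokenize(source):
    """Split an S-expression string into parentheses and atoms."""
    return source.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front of the list and return one nested expression."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        # Recurse until the matching ")" is at the front of the list.
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing closing parenthesis")
        tokens.pop(0)  # discard the ")"
        return items
    if token == ")":
        raise SyntaxError("unexpected closing parenthesis")
    return token  # an atom


tree = parse(tokenize("(define (square x) (* x x))"))
print(tree)
```

    The whole grammar fits in one short recursive function, which is the sense in which such languages are "practically parsed" already. A wiki grammar offers no such regularity.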

    We will explore the tools and techniques necessary to attack one of the hairiest lingual challenges out there: MediaWiki syntax. Join me for an introduction to the general classes of parsing algorithms, from the birth of the field to the state of the art. Learn how to pick the right one. Have a comparative look at a dozen different Python parsing toolkits. And finally, learn some optimization tricks to get a grammar going at a reasonable clip.

    At 12:00pm to 12:30pm, Sunday 11th March

    In E3, Santa Clara Convention Center

  • Transifex: Beautiful Python app localization

    by Dimitris Glezos

    Localization of Python apps used to be hard, but not any more. This talk will offer a short intro on software localization in Python and discuss today's best practices. It will present Transifex, a modern, Django-based SaaS, also referred to as 'The GitHub of translations', used by 2,000 open-source projects including Django, Mercurial, Fedora and Firefox.

    This talk targets software developers of Python apps published to an international audience, such as developers of web and desktop apps, games, and frameworks such as Django, presenting and demo-ing a painless way to get their apps localized.

    We will briefly introduce software localization (L10n): what it is, why it matters and how it's being done with Gettext and libraries like Babel.
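
    As a rough illustration of the Gettext approach, the sketch below uses only the standard library `gettext` module. The domain name "myapp", the "locale" directory and the "el" language code are hypothetical placeholders, not details from the talk.

```python
import gettext

# fallback=True returns a NullTranslations object when no compiled
# catalog (.mo file) is found, so the original strings pass through.
translation = gettext.translation(
    "myapp",             # hypothetical message domain
    localedir="locale",  # hypothetical catalog directory
    languages=["el"],    # hypothetical target language
    fallback=True,
)
_ = translation.gettext

# Strings wrapped in _() are what extraction tools collect for translators.
greeting = _("Hello, world!")

# ngettext picks the singular or plural form for a given count.
count = 3
files_label = translation.ngettext("%d file", "%d files", count) % count

print(greeting)
print(files_label)
```

    With no catalog installed the strings come back untranslated. Once a compiled catalog for the chosen language exists under the catalog directory, the same code returns the translated text.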

    We'll then present Transifex, a Django-based open-source social localization tool, which developers use to integrate localization in their workflow and reach out to an established community of translators.

    At 12:00pm to 12:30pm, Sunday 11th March

    In E1, Santa Clara Convention Center

  • Writing GIMP Plug-ins in Python

    by Akkana Peck

    Learn how to write Python plug-ins for GIMP, the GNU Image Manipulation Program. With PyGIMP, you can automate simple image processing tasks in just a few lines of Python, develop elaborate plug-ins that do low-level pixel manipulation, or anything in between.

    Much of the power of GIMP, the GNU Image Manipulation Program, comes from its plug-in architecture. Most of the functions you use in GIMP, including everything in the Filters menu, are plug-ins.

    In this session, you'll learn how to write GIMP plug-ins in Python using the PyGIMP package. Python is rapidly becoming the language of choice for GIMP plug-ins because of its flexibility and clean API. You'll see how Python's access to raw pixel data in an image gives it a huge advantage over GIMP's other scripting language, Script-Fu, and how you can use Python-GTK and Python's wealth of other libraries to create user interfaces far beyond GIMP's usual plug-in dialogs.

    Basic Python knowledge is assumed, and familiarity with GIMP at a user level is helpful, but you don't need advanced knowledge of either one to write useful GIMP plug-ins.

    At 12:00pm to 12:30pm, Sunday 11th March

    In E2, Santa Clara Convention Center

  • Building A Python-Based Search Engine

    by Daniel Lindsley

    Search is an increasingly common request in all types of applications as the amount of data all of us deal with continues to grow. The technology and architecture behind search engines are wildly different from what many developers expect. This talk will give a solid grounding in the fundamentals of providing search, using Python to flesh out these concepts in a simple library.

    • Core concepts
    • Terminology
    • Document-based
    • Show basic starting code for a document
    • Inverted Index
    • Show a simple inverted index class
    • Stemming
    • N-gram
    • Show a tokenizer/n-gram processor
    • Fields
    • Show a document handler which ties it all together
    • Searching
    • Show a simple searcher (& the whole thing working together)
    • Faceting (likely no demo)
    • Boost (likely no demo)
    • More Like This
    • Wrap up
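
    The outline above can be sketched in a few lines. This is not the speaker's library; the names `tokenize`, `edge_ngrams` and `InvertedIndex` are illustrative, and the example assumes edge n-grams (word prefixes) as the n-gram variant, with no stemming, fields, faceting or boost.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def edge_ngrams(word, min_size=3):
    """Return the prefixes of a word so that partial queries can match."""
    if len(word) < min_size:
        return [word]
    return [word[:size] for size in range(min_size, len(word) + 1)]


class InvertedIndex:
    """Map each term to the set of document ids that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.documents = {}

    def add(self, doc_id, text):
        self.documents[doc_id] = text
        for word in tokenize(text):
            for gram in edge_ngrams(word):
                self.postings[gram].add(doc_id)

    def search(self, query):
        """Return ids of documents matching every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        results = None
        for term in terms:
            matches = set(self.postings.get(term, ()))
            results = matches if results is None else results & matches
        return results


index = InvertedIndex()
index.add(1, "Python search engines")
index.add(2, "Searching with inverted indexes")

print(index.search("sear"))
print(index.search("python sear"))
```

    The index is "inverted" because it maps terms to documents rather than documents to terms, which makes a lookup a dictionary access instead of a scan over every document.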

    At 1:30pm to 2:10pm, Sunday 11th March

    In E3, Santa Clara Convention Center

  • web2py: ideas we stole and ideas we had

    by Massimo Di Pierro

    In this talk we will provide an overview of some of the web2py design decisions and its newest features. In particular we will discuss which design decisions were inspired by other frameworks (Django, TurboGears, Flask) and which were not and why.

    This talk will be an occasion to acknowledge the importance played by other frameworks in the design of web2py and thank them. It will also be a way to explain the motivation behind some of the controversial design decisions and which unique features in web2py depend on them.

    At 1:30pm to 2:10pm, Sunday 11th March

    In E4, Santa Clara Convention Center

  • What's New in Python on Windows

    by Brian Curtin

    With nearly 1.5 million downloads per month, the CPython installers for Windows account for a huge amount of the traffic through python.org, and they're the most common way for Windows users to obtain Python. Take a look at what's going on with Python on Windows and see what the road ahead looks like for Python 3.3.

    Abstract

    It's often said that we've passed the point where we're surprised about where Python is being used. From satellites out in space to fighter jets much closer to earth, Python is everywhere, so it's no surprise it appears on Windows desktops. However, did you know CPython's Windows installers are downloaded almost 1.5 million times every month? Let's take a look at what's going into nearly 18 million downloads per year, especially the upcoming CPython 3.3.

    The Download Numbers

    A look into the python.org download numbers shows some interesting trends (based on an in-progress sample), including the doubling of 3.x downloads with the release of 3.2. Let's take a look at what the release calendar means for download rates, and what the future looks like for 3.3.

    Installer Changes

    For years, users have been asking for Python's addition to the system path, and countless guides have been written to help users figure out how to do that. The rise of freely available Python education materials has steadily increased the number of first-timers around the community, many of whom see immediate failure by typing "python" at a command prompt only to get an error message. Python should be as helpful as possible and provide sensible options at install time, and with Python 3.3, we're bringing you the ability to add Python to the system path, complete with a quick demo and explanation of the options.

    New and Recent Features

    As nearly all of the CPython developers are on Linux-based systems, features tend to show up there first while Windows plays catch-up. Python 3.2 added Windows implementations of os.symlink for Windows Vista and beyond, os.kill using control handlers, and several others, and 3.3 will try to fill in more gaps.

    PEP 3151 Changes to WindowsError

    If you've written cross-platform code that needs to handle WindowsError, you've probably done a few dances to properly handle it. PEP 3151 reworks the OS and IO exception hierarchy and makes some changes to how WindowsError works, so we'll look into what it means for your code.
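
    A minimal sketch of what the reworked hierarchy allows, assuming Python 3.3 or later. The helper `describe_open_failure` is hypothetical and only shows the pattern: PEP 3151 merges IOError and WindowsError into OSError and adds subclasses such as FileNotFoundError and PermissionError, so code can catch a specific failure instead of inspecting errno by hand.

```python
import errno
import os
import tempfile


def describe_open_failure(path):
    """Try to open a file and report which kind of failure occurred."""
    try:
        with open(path):
            return "opened"
    except FileNotFoundError:
        # Previously: catch IOError/OSError and test exc.errno == errno.ENOENT.
        return "missing"
    except PermissionError:
        return "denied"
    except OSError as exc:
        # Any other OS-level failure still carries an errno value.
        return "other: " + errno.errorcode.get(exc.errno, "unknown")


with tempfile.TemporaryDirectory() as directory:
    result = describe_open_failure(os.path.join(directory, "no-such-file.txt"))

print(result)
print(IOError is OSError)
```

    Because the old names are aliases of OSError, existing `except IOError` clauses keep working. The specific subclasses are the part that removes the errno dance.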

    PEP 397 Launcher

    Bringing Linux-like shebang functionality to a Windows computer near you. The ability to launch the proper 2 or 3 interpreter based on a hint in your code is just another way to ease startup issues for users, so we'll take a look at what's going on there.

    Alternative Implementations on Windows (quick mention, likely IronPython focused)

    IronPython is caught up on the 2.7 line and working towards a 3.x release.

    At 1:30pm to 2:10pm, Sunday 11th March

    In E1, Santa Clara Convention Center

  • More than just a pretty web framework, the Tornado IOLoop

    by Gavin M. Roy

    Tornado, often thought of as a web development framework and toolset, is built on top of a protocol-agnostic IOLoop, presenting an alternative to Twisted as a foundation for asynchronous application development in Python. This talk covers the Tornado IOLoop, its features and the process of writing drivers and applications using it.

    Outline
    (30 Minutes)

    • tornado.IOLoop and tornado.IOStream Introduction (5 Minutes)
    • Building an event-driven server using IOStream (10 Minutes)
        • Options for Socket Reading: read_until_regex, read_until, read_bytes, read_until_close
        • Callbacks and handling events
        • Inspecting state
        • SSL streams
    • Diving Deeper, Using the IOLoop Directly (10 Minutes)
        • Registering events on the loop
            • When data is available
            • When we can write to the socket
            • When there are errors on the socket
        • Using timers, timeouts and callbacks
        • Inspecting the stack and debugging
    • Performance Considerations and Closing (5 Minutes)

    At 1:55pm to 2:35pm, Sunday 11th March

  • Diversity in practice: How the Boston Python Meetup grew to 1000 people and over 15% women

    by Asheesh Laroia and Jessica McKellar

    How do you bring more women into programming communities with long-term, measurable results? In this talk we'll analyze our successful effort, the Boston Python Workshop, which brought over 200 women into Boston's Python community this year. We'll talk about lessons learned running the workshop, the dramatic effect it has had on the local user group, and how to run a workshop in your city.

    The Boston Python Workshop is a project-driven introduction to Python for women and their friends. It has run 6 times with the Boston Python Meetup in the last 12 months, bringing over 200 women into the local Python community. By being fully integrated into the main user group, the workshop has helped the Meetup grow to over 2**10 members and consistently draw over 15% women at its events. We'll talk about lessons learned running the workshop, the dramatic effect it has had on the Boston Python Meetup, and what it takes to run an outreach event in your city.

    At 2:10pm to 2:55pm, Sunday 11th March

    In E2, Santa Clara Convention Center

  • Parsing sentences with the OTHER natural language tool: LinkGrammar

    by Jeff Elmore

    Many of you are probably familiar with NLTK, the wonderful Natural Language Toolkit for Python. You may not be familiar with Linkgrammar, which is a sentence parsing system created at Carnegie Mellon University. Linkgrammar is quite robust and works "out of the box" in a way that NLTK does not for sentence parsing.

    Abstract

    NLTK is a fantastic library with broad capabilities. But often I find that I want something that will just do what I want without my having to figure out all of the details. An example of this is sentence parsing. A quick Google search for parsing sentences with NLTK returns a number of articles describing how to write your own grammar, define a parser based on that grammar, and parse sentences. This is great for toy problems and education, but if you actually need to parse sentences "from the wild," writing your own grammar is a huge undertaking.

    Enter Linkgrammar. Linkgrammar was developed at Carnegie Mellon University and is now maintained by the developers of AbiWord as the basis for their grammar checking capabilities. It works nicely out of the box and is tolerant of irregularities found in authentic text.

    At 2:10pm to 2:55pm, Sunday 11th March

    In E3, Santa Clara Convention Center

  • Python, Linkers, and Virtual Memory

    by Brandon Rhodes

    Why does “top” show that your Python process uses 110 MB of virtual memory but has a resident set size of 9 MB? Does it consume more memory to spawn several interpreters, or to run one Python and have it fork() further workers? What is an “undefined symbol,” anyway? Learn about how an operating system manages memory, loads shared libraries, and what this means for Python servers and applications.

    Abstract

    If you have ever seen the error “Undefined symbol” when running a Python program, then you have encountered dynamic linking: a feature of modern operating systems by which they minimize program size and maximize the memory shared between processes, but that requires software to have been compiled against exactly the right version of a third party library.

    This talk will tackle modern operating system memory management from the ground up, steadily building a picture of its impact on Python performance. By considering how this very limited resource is partitioned and managed by the operating system, we will arrive at very specific recommendations about how your Python program should be debugged, deployed, and monitored.

    Many topics will be covered:

    The invention of virtual memory. The memory space of each Python process is a fiction sustained by the operating system and processor hardware. Why is this fiction, which originated in the 1960s, necessary? What does it accomplish? How much does it cost? And which parts of your Python application are allocated to the text segment, the stack, and the heap?

    Memory, caching, and swap. We will examine the hierarchy of storage media on a modern computer system, and how quickly costs grow as information moves several levels from the processor. We will contrast swap, which persists physical memory pages not currently in active use, with disk buffers by which information already on disk is brought much closer to the processor. When a machine begins thrashing, we will learn, our Python application gets clobbered.

    Linking to shared libraries. To avoid the expense of recompiling large programs, programmers invented linking so that they could combine pre-compiled objects together. After looking at how symbol tables are used in static linking, we will explore what happens when linking takes place at runtime instead — and how disk space and physical memory can be saved as a result. This will teach us why “dev” packages are so often necessary to compile Python extensions, and why undefined symbols result when libraries are missing, or we try to mix-and-match shared library versions.

    Measuring the Memory Footprint. Stepping back to see how these pieces fit together to produce a typical Python process, we will learn how to measure its memory usage as best we can. Basic tools like “top” and the Windows Tasklist will be examined, as well as more low-level tools like the /proc filesystem and tools like ps_mem.py.
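
    As one hedged illustration of the /proc approach on Linux, the sketch below pulls the virtual size and resident set size out of the text of /proc/<pid>/status. The helper names are hypothetical, and the sample text is invented to mirror the 110 MB and 9 MB figures quoted in the summary; it is not output from a real process.

```python
import os


def parse_proc_status(text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(value.split()[0])
    return fields


def current_process_usage(status_path="/proc/self/status"):
    """Return the fields for this process, or None where /proc is unavailable."""
    if not os.path.exists(status_path):
        return None
    with open(status_path) as handle:
        return parse_proc_status(handle.read())


# Invented sample: 112640 kB is 110 MB virtual, 9216 kB is 9 MB resident.
sample = "Name:\tpython\nVmSize:\t  112640 kB\nVmRSS:\t    9216 kB\n"
usage = parse_proc_status(sample)

print(usage)
print(current_process_usage())
```

    VmSize counts every page the process has mapped, including shared libraries that may never be touched, while VmRSS counts only the pages currently in physical memory. That difference is why the two numbers can sit so far apart.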

    Taking Control of Compilation. What if you want to make your own decisions about what the Python interpreter does and does not include? What if you want to include a library statically to give you safety and portability, and give up the benefits of dynamic linking? We will answer these questions by looking at how Python itself, and also individual extension modules, can choose to statically link against important libraries instead of leaving them dynamic — our specific example will be my pyzmq-static package on PyPI. Finally, we will consider how forking on Linux can result in a set of Python processes with as much memory in common as possible, so long as the application is careful to generate as much shared state as possible before the fork.

    At 2:10pm to 2:55pm, Sunday 11th March

    In D5, Santa Clara Convention Center

  • What's new and interesting in standard library

    by Senthil Kumaran

    This talk distills some interesting stuff from the What's New documents for 2.7, 3.2 and the upcoming 3.3 release. Look out for those new arguments to your favorite methods and functions that add the wow! factor to your code. Heard of @lru_cache?

    Abstract

    Lots of interesting stuff has gone into the Python standard library in the 2.7, 3.1, 3.2 and 3.3 releases. Some of the features that went in really make programmers' lives easier and can bring a 'wow' factor to their code. Additionally, they can help external library developers take a fresh look at their libraries and use the new facilities available from standard library modules.

    This talk distills stuff from the What's New documents for 2.7, 3.2 and 3.3 and presents some of the choicest new features from the Python standard library. Since a lot has gone in since 2.7, the focus will be on those features that were well discussed on the tracker or on python-dev and were, in general, the most sought after.
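
    For readers who have not met @lru_cache, here is a small sketch of the decorator, which was added to functools in Python 3.2. The Fibonacci function is just a convenient illustration and is not taken from the talk.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """Return the n-th Fibonacci number, caching every intermediate result."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


value = fib(30)
info = fib.cache_info()

print(value)
print(info)
```

    Without the decorator this recursion makes over a million calls for n = 30. With it, each distinct argument is computed once and later requests are served from the cache, which `cache_info()` reports as hits and misses.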

    At 2:10pm to 2:55pm, Sunday 11th March

    In E1, Santa Clara Convention Center
