Sessions at PyCon US 2012 about Python

Friday 9th March 2012

  • Graph Processing in Python

    by Van Lindberg

    Graphs are everywhere - from your distributed source code control to Twitter analytics. This session presents a set of three problems and shows how they can be decomposed into operations on graphs, and then demonstrates solutions using the various graph libraries available for (or accessible to) Python.

    Graphs are a fundamental computer science datatype, and graphs show up in all sorts of models in all sorts of places. So when you have a graph, what can you do with it? Particularly if it is really big?

    Thirty minutes isn't a lot of time to discuss graph processing as a topic, so there won't be much discussion of graph theory in general or of graph terminology. Instead, this is inspired by Raymond Hettinger's "mastering team play" - a series of exercises showing the lowering of a problem into a graph representation, followed by a demonstration of how the problem can be solved through graph processing. There will also be a little bit of compare-and-contrast between the available graph libraries to show differences. Each problem will be given 8-10 minutes.

    Problem 1: Python's (legal) history
    Python has developed over time under a number of organizations - each with its own license. What portions of Python's codebase are under each license?

    • The CVS/SVN/HG trees as graphs modeling change in time
    • Identifying and labeling node types
    • Graphing and reporting on results

    Problem 2: Development Cliques
    Linux is famously developed with "lieutenants" in charge of different subsystems of the kernel. Python doesn't have lieutenants... or does it? Put another way, if you have a patch, who should you submit it to?

    • Mailing list connections as a graph
    • Analysis of connections, cliques, and centrality
    • Graphing and reporting on results

    Problem 3: Let's get social
    Your employer has decided that its website should be turned into a social network - you know, because there aren't enough of those.

    • Bootstrapping a graph by looking at pairwise analysis of products
    • How to suggest who people "might know"?
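
    As a rough illustration of the "connections and centrality" step in Problem 2, here is a minimal sketch using only the standard library. The reply pairs and the helper names (build_graph, degree_centrality) are hypothetical and are not taken from the talk, which uses dedicated graph libraries.

```python
from collections import defaultdict

# Hypothetical (sender, replied_to) pairs; not data from the talk.
replies = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("bob", "alice"),
    ("dave", "bob"),
    ("carol", "alice"),
]


def build_graph(pairs):
    """Build an undirected adjacency map from (sender, recipient) pairs."""
    graph = defaultdict(set)
    for sender, recipient in pairs:
        graph[sender].add(recipient)
        graph[recipient].add(sender)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    others = len(graph) - 1
    return {node: len(neighbors) / others for node, neighbors in graph.items()}


graph = build_graph(replies)
centrality = degree_centrality(graph)
most_central = max(centrality, key=centrality.get)
```

    Degree centrality is only one of several possible measures; it is used here because it needs nothing beyond a neighbor count.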

    At 10:50am to 11:30am, Friday 9th March

    In E4, Santa Clara Convention Center

    Coverage video

  • Introduction to Metaclasses

    by Luke Sneeringer

    Python's metaclasses grant the Python OOP ecosystem all the power of more complex object inheritance systems in other languages, while retaining for most uses the simplicity of the straightforward class structures most developers learn when being introduced to object-oriented programming. This talk is an explanation of metaclasses: first, what they are, and second, how to use them.

    • Metaclasses
    • Introduction (2.5m)
    • Python's metaclasses grant the Python OOP ecosystem all the power of more complex object inheritance systems in other languages, while retaining the simplicity of the straightforward class structure that traditional C++ and Java programmers learned and that is taught in programming courses.
    • Classes are Objects, Too! (5m)
    • Classes are first-class objects in Python, like functions/methods
    • Classes, like other objects, can be assigned to variables and passed as arguments
    • ...and this ability is one of the tricks in the reusable code toolbox
    • Concept: Metaclasses generate classes. (5m)
    • The hierarchy starts with "type"
    • Classes are themselves instances of their metaclasses
    • By extension, classes provide code that runs when instances are created, while metaclasses provide code that runs when classes are created.
    • Remember the "analogies" section on standardized tests in the United States (and many other countries)?
    • Babylon 5 : J. Michael Straczynski :: Star Trek : ___
    • Instances : Classes :: Classes : Metaclasses
    • Think about a self-enclosed machine that creates, say, t-shirts. The machine is the class; the individual shirts are the instances. The guy who builds the t-shirt machines is the metaclass.
    • Concrete Code Examples (10m)
    • will cover 3.0 and 2.7
    • (stub: I haven't decided what my example will be yet)
    • Is metaclassing wise? (2.5m)
    • There's nothing inherently wrong or bad about it. Furthermore, sometimes it's by far the best way to solve a problem.
    • Beware, though: Some people find metaclassing confusing.
    • "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." (Brian Kernighan)
    • Questions (5m)
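
    The "Instances : Classes :: Classes : Metaclasses" analogy can be sketched as follows. This is a minimal illustration in Python 3 syntax, not the speaker's example (the outline notes that one had not been chosen yet); Registry and Shirt are hypothetical names. In Python 2.7 the metaclass would be set with a __metaclass__ class attribute instead of the metaclass keyword.

```python
class Registry(type):
    """Hypothetical metaclass that records the name of every class it creates."""

    created = []

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.created.append(name)  # runs once, when a class is created
        return cls


class Shirt(metaclass=Registry):
    def __init__(self, size):
        self.size = size  # runs every time an instance is created


shirt = Shirt("M")

# The chain from instance to class to metaclass to "type".
chain = (type(shirt), type(Shirt), type(Registry))
```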

    At 10:50am to 11:30am, Friday 9th March

    In E3, Santa Clara Convention Center

    Coverage video

  • Stop Mocking, Start Testing

    by Augie Fackler and Nathaniel Manista

    Project Hosting at Google Code is a large, well-established system written mostly in Python. We'll share our battle-born convictions about creating tests for test-unfriendly code and the larger topic of testing.

    When launched, Project Hosting’s testing consisted of the stock Subversion test suite and a handful of ad hoc smoke test scripts that required starting the entire system and manually inspecting the test’s output.

    Over six years of codebase evolution, tests have been added with varying degrees of coverage and maintainability. Early system design decisions made adding tests difficult: the first tests added to the system used mock objects unwisely and large numbers of mock objects made refactoring costly in time and effort.

    Frustration with the difficulty of enhancing the service led us to reevaluate our testing practice and led to the discovery of better ways to test applications of this complexity. We will share our experiences with testing and discuss designing for maintainability and testability and appropriate use of testing tools such as frameworks and mocks.

    At 10:50am to 11:30am, Friday 9th March

    In D5, Santa Clara Convention Center

    Coverage video

  • Extracting musical information from sound

    by Adrian Holovaty

    Music Information Retrieval technology has gotten good enough that you can extract musical metadata from your sound files with some degree of accuracy. Find out how to use Python (along with third-party APIs) to determine everything from the key/tempo of a song to the pitch/timbre of individual notes. Then we'll do some amusing analysis of popular tunes.

    Getting basic data about sounds.
    Visualizing waveforms.
    Parsing musical information at the level of song.
    Detecting individual notes ("segments").
    What fun can we have?

    At 11:30am to 12:10pm, Friday 9th March

    In E2, Santa Clara Convention Center

    Coverage video

  • Fast Test, Slow Test

    by Gary Bernhardt

    Most unit tests aren't and their authors suffer for it. What is a unit test, really? How can writing them prevent classic testing problems? If you do write them, what trade-offs are you implicitly making?

    Your unit test suite takes three minutes to run 500 tests. On a modern CPU, that's just shy of a billion instructions per test. Can something that takes a billion CPU instructions really be called a "unit"? What is that suite really testing?
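
    The arithmetic behind that figure can be checked in a few lines. The clock rate (2.5 GHz) and the rate of one instruction per cycle are assumptions chosen for illustration; the abstract does not state them.

```python
suite_seconds = 3 * 60        # "three minutes"
test_count = 500              # "500 tests"
clock_hz = 2.5e9              # assumed clock rate, not given in the abstract
instructions_per_cycle = 1    # assumed; real CPUs vary widely

seconds_per_test = suite_seconds / test_count
instructions_per_test = seconds_per_test * clock_hz * instructions_per_cycle
```

    Under those assumptions each test takes 0.36 seconds, which is about 900 million instructions.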

    Many (and probably most) unit testing failures are related to size. Most common are the suite that takes half an hour to run (so no one runs it), the suite whose runtime scales like lines_of_code^2 (so, again, no one runs it), and the suite that requires huge maintenance for small changes (leading to the "testing is slow" myth).

    This talk is about the "unit" in "unit test": what does it really mean, why do we care, and how does it prevent the three crippling problems above? And, of course, if we do shift our focus toward unit tests, what trade-offs are we really making?

    At 11:30am to 12:10pm, Friday 9th March

    In D5, Santa Clara Convention Center

  • Scalability at YouTube

    by Shannon -jj Behrens and Mike Solomon

    This talk covers scalability at YouTube. It's given by one of the original engineers at YouTube, Mike Solomon. It's a rare glimpse into the heart of YouTube, which is one of the largest websites in the world and one of the few extremely large websites to be written in Python.

    Abstract

    Every day, people watch an average of 3 billion videos on YouTube. Every minute, people upload an average of 48 hours of video to YouTube. YouTube operates at a scale that few other websites will ever see, and it's written mostly in Python.

    Mike Solomon is one of the original engineers at YouTube. In this informal, high-level talk, he'll give an overview of the lessons he's learned as he's brought YouTube to scale. He'll also point out ways in which his philosophy on scaling, testing, and writing Python flies in the face of accepted wisdom. Last of all, we'll also be giving a very short introduction to YouTube APIs and how you can integrate your application with YouTube.

    At 11:30am to 12:10pm, Friday 9th March

    In E1, Santa Clara Convention Center

  • The Art of Subclassing

    by Raymond Hettinger

    All problems have simple, easy-to-understand, logical wrong answers. Subclassing in Python is no exception. Avoid the common pitfalls and learn everything you need to know about making effective use of inheritance in Python.

    Avoid the common pitfalls and learn everything you need to know about how to subclass in Python.

    • Overriding and extending
    • Calling your parents
    • The ellipse / circle problem
    • What does a subclass mean?
    • Liskov Substitution Principle
    • Open Closed Principle
    • Facts of life when subclassing builtin types
    • Cooperative Multiple Inheritance
    • Common subclassing patterns
    • Use of the double underscore
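
    The "Calling your parents" and "Cooperative Multiple Inheritance" items can be illustrated with a small sketch. The class names are hypothetical and the example is not from the talk; it only shows how super() follows the method resolution order so that each class in a diamond runs exactly once.

```python
class Base:
    def describe(self):
        return ["Base"]


class Loud(Base):
    def describe(self):
        # super() delegates to the next class in the MRO, not necessarily Base.
        return ["Loud"] + super().describe()


class Polite(Base):
    def describe(self):
        return ["Polite"] + super().describe()


class Both(Loud, Polite):
    def describe(self):
        return ["Both"] + super().describe()


order = Both().describe()
mro_names = [cls.__name__ for cls in Both.__mro__]
```

    When called on a Both instance, the super() call inside Loud reaches Polite rather than Base, which is what makes the scheme cooperative.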

    At 11:30am to 12:10pm, Friday 9th March

    In E3, Santa Clara Convention Center

    Coverage video

  • Practical Machine Learning in Python

    by Matt Spitz

    There is a plethora of options when it comes to deciding how to add a machine learning component to your Python application. In this talk, I'll discuss why Python as a language is well-suited to solving these particular problems, the tradeoffs of different machine learning solutions for Python applications, and some tricks you can use to get the most out of whatever package you decide to use.

    This is the age of data. As more companies expose their datasets through APIs, it's becoming increasingly easy to pull information about users, places, and things. But having this data isn't always enough; we want to understand it, find correlations, and identify trends. Fortunately, the area of computer science known as machine learning has a variety of algorithms specifically designed to do this sort of data wrangling. For the Python application developer, there are many off-the-shelf toolkits that include implementations of these algorithms (Orange, NLTK, SHOGUN, PyML and scikit-learn to name just a few), but choosing which one to use can be daunting.

    There are a number of tradeoffs one makes when making a selection, depending on the specifics of the implementation and the needs of the application. In this talk, I'll give an overview of some of the packages available and discuss what factors might go into deciding which one to use. I'll also offer some Python-specific tricks you can use to work with large amounts of data efficiently.

    At 12:10pm to 12:40pm, Friday 9th March

    In E1, Santa Clara Convention Center

  • Stepping Through CPython

    by Larry Hastings

    Ever wondered how CPython actually works internally? This talk will show you. We start with a simple Python program, then slowly step through CPython, showing in exhaustive detail what happens when it runs that program. Along the way we'll examine the design and implementation of various major CPython subsystems and see how they fit together. The audience should be conversant in C and Python.

    The goal of the talk is to sufficiently familiarize the audience with CPython's internal structure such that a programmer versed in C and Python but having never dealt with an interpreter would be able to comfortably dive in and start hacking on CPython.

    The program examined will be simple but deliberately designed to exercise most of CPython's runtime behavior. This will include loading modules implemented in C and in Python, loading bytecode cached on disk, and a cross-section of bytecodes. (For example, I only need to examine one of the BINARY_* math opcodes; I don't need to walk through every single one.)

    Areas I expect to examine:

    • built-in modules, including ones that are automatically loaded before your program starts
    • bytecode, including
    • the various implementations of the inner loop (switch statement, labels-as-values)
    • the peephole optimizer
    • on-disk format
    • marshal
    • the magic version number
    • mention lnotab but probably skip the gory details
    • the stack machine
    • unwinding the stack after an exception (and producing tracebacks)
    • contrast CPython's approach with Stackless
    • All the possible fields of PyObject, an overview of fields in PyType
    • built-in types
    • the implementations of a few key internal types
    • list, dict, tuple, str, bytes, int, bool, None
    • though not to the level of detail that Hettinger or Rhodes did in past talks
    • interned values
    • the GIL and reference counting
    • weakrefs
    • garbage collection
    • Py_TRASHCAN
    • CPython's small-block and arena allocators
    • The parser, though I don't want to spend a lot of time on it (runtime is where the fun is ;)
    • Internal utility functions like PyArg_Parse
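
    Several of the bytecode items above (marshal, the magic version number, the BINARY_* opcodes) can be poked at from Python itself. The following is a small sketch written for a current Python 3 rather than the 3.2 the talk targets; importlib.util.MAGIC_NUMBER and the exact opcode names depend on the interpreter version, and the add function is a made-up example.

```python
import dis
import importlib.util
import marshal

source = "def add(a, b):\n    return a + b\n"
module_code = compile(source, "<example>", "exec")

# The magic number identifies the bytecode version at the start of a cached .pyc file.
magic = importlib.util.MAGIC_NUMBER

# marshal is the serialization used for code objects in the on-disk format.
payload = marshal.dumps(module_code)
restored = marshal.loads(payload)

namespace = {}
exec(restored, namespace)
add = namespace["add"]

# The names of the opcodes the interpreter's inner loop will dispatch on.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
```

    Calling dis.dis(add) prints the same instructions in a readable table.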

    I'll be giving the talk based on CPython 3.2.

    At 12:10pm to 12:55pm, Friday 9th March

    In E2, Santa Clara Convention Center

    Coverage video

  • Stop Writing Classes

    by Jack Diederich

    Classes are great but they are also overused. This talk will describe examples of class overuse taken from real world code and refactor the unnecessary classes, exceptions, and modules out of them.

    Classes must be nouns but not every noun must be a class. If your class only has two methods and one of them is __init__ you probably meant to write a function.

    MuffinMail recently refactored their API; it went from 20 classes scattered in 22 modules down to 1 class just 15 lines long. It was a welcome change, but we'll further refactor that down to a single function 3 lines long.
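
    The "two methods, one of them __init__" pattern looks roughly like the sketch below. Greeting and greet are hypothetical stand-ins, not the MuffinMail code discussed in the talk.

```python
import functools


class Greeting:
    """A class with only __init__ and one other method."""

    def __init__(self, greeting="Hello"):
        self.greeting = greeting

    def greet(self, name):
        return "%s, %s!" % (self.greeting, name)


def greet(name, greeting="Hello"):
    """The same behavior as a plain function."""
    return "%s, %s!" % (greeting, name)


# If a preset argument is wanted, functools.partial covers it without a class.
say_hi = functools.partial(greet, greeting="Hi")

from_class = Greeting().greet("PyCon")
from_function = greet("PyCon")
```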

    The Python stdlib is an example of a namespace that is relatively flat. You won't find packages that consist of a single module defining an exception, and you won't find many exceptions at all - just 165 kinds in 200k lines of code. That's a tiny ratio compared to most projects including Django.

    Of course there are things, like containers, that should be classes. As a final example we'll add a Heap type to the heapq module (admit it, you already have one in your utils.py).
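
    A Heap container along those lines might look like the following minimal sketch. It is an assumption about the general shape of such a wrapper, not the implementation shown in the talk; the method names push, pop, and peek are illustrative.

```python
import heapq


class Heap:
    """Thin container around the heapq functions (min-heap)."""

    def __init__(self, items=()):
        self._items = list(items)
        heapq.heapify(self._items)

    def push(self, item):
        heapq.heappush(self._items, item)

    def pop(self):
        return heapq.heappop(self._items)

    def peek(self):
        return self._items[0]

    def __len__(self):
        return len(self._items)


heap = Heap([5, 1, 4])
heap.push(2)
smallest = heap.peek()
drained = [heap.pop() for _ in range(len(heap))]
```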

    At 12:10pm to 12:40pm, Friday 9th March

    In E3, Santa Clara Convention Center

    Coverage video

  • Advanced Security Topics

    by Paul McMillan

    If your Python application has users, you should be worried about security. This talk will cover advanced material, highlighting common mistakes. Topics will include hashing and salts, timing attacks, serialization, and much more. Expect eye opening demos, and an urge to go fix your code right away.

    If your Python application has users (even if it's used offline), you should be worried about security. This talk will cover advanced material, highlighting common mistakes.

    Hashing and encryption can be tricky to get right. We'll discuss when to use hashing to sign data, and how to choose the right encryption algorithm (spoiler: don't). We'll demonstrate length extension attacks, and discuss how to prevent them.

    Another common mistake is the incorrect use of pseudo-random number generators. We'll discuss the fix, and some of the dangers associated with it.

    Timing attacks are relatively exotic, but as applications move into shared data centers (and shared virtual machines) they have become easier to implement and more dangerous. They're a very common class of bugs, but fixing them (and proving they're fixed) can be difficult.

    Pickle is a common and easy to use serialization format for Python objects. Unfortunately, it's also insecure when attackers can send or modify the pickled data. We'll discuss strategies for signing pickled objects, and alternate serialization formats.
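
    One possible shape for signing pickled data is sketched below, using HMAC from the standard library. The key, the function names, and the "signature followed by payload" layout are assumptions for illustration, not the speaker's code. The HMAC construction is not subject to length extension the way a bare hash of key plus data is, and hmac.compare_digest compares in constant time, which addresses the timing concern mentioned above.

```python
import hashlib
import hmac
import pickle

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical key for illustration
SIGNATURE_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def dumps_signed(obj, key=SECRET_KEY):
    """Pickle obj and prepend an HMAC-SHA256 signature of the payload."""
    payload = pickle.dumps(obj)
    signature = hmac.new(key, payload, hashlib.sha256).digest()
    return signature + payload


def loads_signed(data, key=SECRET_KEY):
    """Verify the signature before unpickling; raise ValueError on mismatch."""
    signature, payload = data[:SIGNATURE_SIZE], data[SIGNATURE_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)


signed = dumps_signed({"user": "ada"})
restored = loads_signed(signed)
```

    Signing only helps while the key stays secret; anyone holding the key can still craft a malicious pickle.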

    The final portion of the talk will discuss a meta security problem within the Python community. I'll be demonstrating live code that can compromise even the most locked down of servers, and discussing the steps we need to take as a community to mitigate this threat moving forward.

    At 1:45pm to 2:40pm, Friday 9th March

    In E1, Santa Clara Convention Center

    Coverage video

  • The Magic of Metaprogramming

    by Jeff Rush

    Learn the magic of writing programs that monitor, alter and react to the execution of program code by responding to imports, changes to variables, calls to functions and invocations of the builtins. This talk goes beyond the static world of metaclasses and class decorators.

    We'll cover how to slide a class underneath a module to intercept reads/writes, place automatic type checking over your object attributes and use stack peeking to make selected attributes private to their owning class. We'll cover import hacking, metaclasses, descriptors and decorators and graphically describe how they work internally. Includes source examples and color technical diagrams.

    Table-of-Contents
    What is Metaprogramming?
    Tools At Our Disposal
    Orientation Diagram: What is Metaprogramming
    First Third of Talk: Import Hooking
    Sample Problem #1: Subclassing an Embedded Class
    A Solution to #1: Post-Import Hooking
    A Solution to #1 (Packaged Up)
    Alternate Solution: Pre-Import Hooking
    What Does a Subclassed Module Look Like?
    Some Benefits of Subclassing Modules
    2nd Third of Talk: Metaclasses
    Orientation Diagram: Instances, Classes and Metaclasses
    Facts About Metaclasses
    Example #2: Define a Class from an SQL Table Definition
    Example Problem #2 (cont'd)
    Metaclasses versus Class Decorators
    About Meta-Inheritance
    Example #3: Log the Arguments/Return Value of Method Calls
    Lull After Metaclasses, Before Descriptors
    Last Third of Talk: Descriptors
    Python's Mechanism of Attribute Lookup
    When to Use Which Lookup Mechanism
    Example 4: Overriding __getattr__
    Example 4: Using a Descriptor Instead
    Python's Mechanism of Attribute Lookup (descriptors)
    So What is a descriptor again?
    Where are descriptors used?
    Example 5: Caching an Attribute Value
    Example 6: Declare an Attribute Private to a Class
    Example 7: Tracking Changes in a Value
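
    To give a flavor of the descriptor portion, here is a minimal sketch in the spirit of "Example 5: Caching an Attribute Value". It is not the code from the talk; cached_attribute and Report are hypothetical names.

```python
class cached_attribute:
    """Non-data descriptor: compute the value once, then store it on the instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        value = self.func(instance)
        # The instance __dict__ entry shadows this descriptor on later lookups,
        # because the descriptor defines no __set__.
        instance.__dict__[self.name] = value
        return value


class Report:
    computations = 0

    @cached_attribute
    def total(self):
        Report.computations += 1
        return sum(range(1000))


report = Report()
first = report.total
second = report.total
```

    The trick relies on Python's attribute lookup order: instance attributes win over non-data descriptors but lose to data descriptors.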

    At 1:45pm to 2:40pm, Friday 9th March

    In E3, Santa Clara Convention Center

  • Build reliable, traceable, distributed systems with ZeroMQ

    by Jérôme Petazzoni

    We will show how to build simple yet powerful RPC code with ZeroMQ, with very few (if any!) modifications to existing code. We will build fan-in and fan-out topologies with ZeroMQ special socket types to implement PUB/SUB patterns and scale up job-processing tasks. Thanks to introspection, the resulting services will be self-documented. Finally, we will show how to implement distributed tracing.

    We will show how to leverage ZeroMQ to build a simple yet powerful RPC for Python code. We will focus on simplicity, the goal being to expose almost any Python module or class to network calls – with very few (if any!) modifications to existing code.

    We will then explain the purpose and show some use-cases for ZeroMQ special socket types (PUSH/PULL, PUB/SUB, ROUTER/DEALER) to build fan-in and fan-out topologies, as well as asynchronous processing (to avoid blocking when doing long-running requests). A by-product is the ability to scale up job-processing tasks with a message queue, which can even be made broker-less (you don’t have to deploy heavy machinery if you don’t need it).

    We will also demonstrate how introspection can make development and debugging easier, exposing docstrings, and providing a few command-line helpers to poke, debug, and experiment directly from the shell.

    At the end of the talk (or in a separate talk), we will explain how to implement a tracing framework for distributed RPC. By hooking into the right places, we will show how to get full tracebacks and profiling information; more precisely:

    • how each complex call (involving multiple subcalls) can be accurately traced;
    • how to handle exceptions, and know easily when and where they happened (without checking dozens of log files);
    • which complex calls take too long, and where they spend their time (distributed profiling).

    These guidelines are the result of ongoing development work at dotCloud and are actively used at the core of our leading Platform-as-a-Service offering.

    We don’t expect the audience to be familiar with ZeroMQ or RPC. However, it will certainly help to have basic knowledge of serialization (e.g. pickle) and sockets.

    At 2:00pm to 2:40pm, Friday 9th March

    In E4, Santa Clara Convention Center

  • Code Generation in Python: Dismantling Jinja

    by Armin Ronacher

    For many DSLs such as templating languages it's important to use code generation to achieve acceptable performance in Python. The current version of Jinja went through many different iterations to end up where it is currently. This talk walks through the design of Jinja2's compiler infrastructure and why it works the way it works and how one can use newer Python features for better results.

    Why Code Generation?
    It seems like the general consensus for code generation in many dynamic language communities is: eval is evil, do not use it. However if done properly code generation solves a lot of problems easily, securely and with much better performance than an interpreter written on top of an interpreted language like Python.

    Code generation is what powers most template languages in Python, what powers object relational mappers and more. It is also an excellent tool to simplify debugging.

    Why Codegen is no Silver Bullet
    Just because you generate code does not mean you're faster than an interpreter written in Python. This part of the talk focuses on why compiling Django templates to Python bytecode does not automatically make it fast.

    Design of Jinja2
    Jinja2 underwent multiple design iterations, most of which were made to improve either performance or debuggability. The internals, however, are largely undocumented and confusing unless you're familiar with the code. Hidden in them, however, are a few gems and interesting tricks to make code generation work in the best possible way.

    Python's Support for Code Generation
    Over the years Python's support for code generation has steadily improved, with different ways to access the abstract syntax tree and to compile it back to bytecode. This section highlights some alternative ways to do code generation that are not yet fully implemented in Jinja2 but are otherwise widely used.
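
    To make the idea of code generation concrete, here is a deliberately tiny sketch that turns a template string into Python source and compiles it. It is not how Jinja2's compiler works; compile_template and the {{ name }} syntax handled here are illustrative assumptions only.

```python
import re


def compile_template(template):
    """Generate Python source for a render(context) function, then compile it."""
    lines = ["def render(context):", "    parts = []"]
    # Splitting on a capturing group alternates literal text and variable names.
    pieces = re.split(r"\{\{\s*(\w+)\s*\}\}", template)
    for index, piece in enumerate(pieces):
        if index % 2 == 0:
            if piece:
                lines.append("    parts.append(%r)" % piece)
        else:
            lines.append("    parts.append(str(context[%r]))" % piece)
    lines.append("    return ''.join(parts)")
    source = "\n".join(lines)
    namespace = {}
    exec(compile(source, "<template>", "exec"), namespace)
    return namespace["render"], source


render, generated_source = compile_template(
    "Hello, {{ name }}! You have {{ count }} messages."
)
rendered = render({"name": "Ada", "count": 3})
```

    Template text only ever reaches the generated source through repr(), and variable names are limited to word characters, so the template cannot inject arbitrary code in this sketch.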

    At 2:00pm to 2:40pm, Friday 9th March

    In E2, Santa Clara Convention Center

  • Apache Cassandra and Python

    by Jeremiah Jordan

    Using Apache Cassandra from Python is easy to do. This talk will cover setting up and using a local development instance of Cassandra from Python. It will cover using the low level thrift interface, as well as using the higher level pycassa library.

    • Very brief intro to Apache Cassandra
    • What is Apache Cassandra and where do I get it?
    • Using the Cassandra CLI to set up a keyspace (table) to hold our data
    • Installing the Cassandra thrift API module
    • Using Cassandra from the thrift API
    • Connecting
    • Writing
    • Reading
    • Batch operations
    • Installing the pycassa module
    • Using Cassandra from the pycassa module
    • Connecting
    • Reading
    • Writing
    • Batch operations
    • Indexing in Cassandra
    • Automatic vs Rolling your own
    • Using Composite Columns
    • Setting them up from the CLI
    • How to use them from pycassa
    • Lessons learned

    At 2:40pm to 3:20pm, Friday 9th March

    In E2, Santa Clara Convention Center

  • Interfaces and Python

    by Eric Snow

    In 2.6, Python introduced the Abstract Base Classes. Before that we had "protocols" (and we still do). In this talk we'll look at how the general concept of interfaces fits into today's Python. We'll also look at some of the alternate proposals of the past, some of the controversies around ABCs, and the direction interfaces might go in the future.

    Talk Outline:

    • What are Interfaces? (3 min)
    • modeling strict abstraction
    • precedents in other languages
    • Interfaces in Python (6 min)
    • duck-typing
    • Python "protocols"
    • past proposals (PEP 245)
    • how Python "interfaces" are different
    • Newer Interface Support (11 min)
    • annotations
    • Abstract Base Classes
    • why run-time validation?
    • ABC vs. duck-typing
    • Third-party Libraries (5 min)
    • Peak's PyProtocols
    • zope.interface
    • Twisted
    • What Next? (3 min)
    • strict interfaces
    • compile-time validation
    • an example interface library
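
    The contrast between Abstract Base Classes and duck typing in the outline can be sketched with the standard abc module. Shape, Square, and DuckSquare are hypothetical names. The ABC helper class used here arrived in Python 3.4; in 2.6 the same effect required setting ABCMeta as the metaclass.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    @abstractmethod
    def area(self):
        """Return the area of the shape."""


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class DuckSquare:
    """Unrelated to Shape by inheritance; it merely has the right method."""

    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


# Run-time validation: an ABC with unimplemented abstract methods cannot be instantiated.
try:
    Shape()
except TypeError:
    abstract_instantiation_failed = True
else:
    abstract_instantiation_failed = False

duck_before = isinstance(DuckSquare(2), Shape)
Shape.register(DuckSquare)  # declare DuckSquare a "virtual subclass"
duck_after = isinstance(DuckSquare(2), Shape)
```

    Note that register() takes the class at its word: nothing checks that DuckSquare really implements area().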

    For more comprehensive coverage of interfaces in Python, check out this reference: http://readthedocs.org/docs/refe...

    At 2:40pm to 3:20pm, Friday 9th March

    In E3, Santa Clara Convention Center

    Coverage video

  • Introduction to PDB

    by Chris McDonough

    PDB is an interactive debugging environment for Python programs. It allows you to pause your program, look at the values of variables, and watch program execution step-by-step, so you can understand what your program is actually doing, as opposed to what you think it's doing. This talk will show novice and intermediate Python users how to use PDB to troubleshoot existing code.

    PDB is an interactive debugging environment for Python programs. It allows you to pause your program, look at the values of variables, and watch program execution step-by-step, so you can understand what your program is actually doing, as opposed to what you think it's doing.

    Effectively using PDB is arguably the most important skill a new Python developer can learn. This talk will show novice and intermediate Python users how to use PDB to troubleshoot existing code.

    When is it reasonable to use PDB?

    "I don't use a debugger"

    When is it really not reasonable?

    Modes of pdb usage

    set_trace mode, e.g. pdb.set_trace()

    postmortem mode, e.g. python -m pdb buggy.py or pdb.pm()

    run mode, e.g. pdb.run('some.expression()').

    Getting help

    Shortcut aliases (c vs. continue)

    The workhorse commands (list, print, pretty-print, next, continue, step, return, until, where, up, down):

    list: displaying code in your current execution context

    p and pp: displaying objects

    continue, step, next, return, until: execution control

    where: showing the current location in the frame stack

    up, down: navigating the frame stack

    Managing breakpoints (break, tbreak, ignore, enable, disable, clear):

    break, tbreak, ignore, enable, disable, and clear: Managing breakpoints

    Lesser-used commands (args, !-prefixing, debug)

    debug: recursive debugging

    !-prefixing: modifying variables

    args: printing args to the current function

    commands: scripting pdb

    ~/.pdbrc and PDB aliases

    Debugging in the face of threads (i.e. web apps).

    "Purple bags"

    Enhanced shells: ipdb, pudb, winpdb

    In-editor debugger integration (Wing, Eclipse PyDev, PyCharm, etc)

    At 2:40pm to 3:20pm, Friday 9th March

    Coverage video

  • pytest - rapid and simple testing with Python

    by Holger Krekel

    The py.test tool presents a rapid and simple way to write tests. This talk introduces common testing terms, basic examples and unique pytest features for writing unit or functional tests: assertions and dependency injection mechanisms. We also look at other features like distributing test load, new plugins and reasons why some Red Hat and Mozilla people choose pytest over other approaches.

    The py.test tool presents a rapid and simple way to write tests for your Python code. This talk introduces some common testing terminology and basic pytest usage. Moreover, it discusses some unique pytest features for writing unit or functional tests. For unit tests, the simple Python "assert" statement is used for coding assertions. As of 2011, this assert support has been perfected for Python 2.6 or higher, finally removing what some people have formerly called "magic side effects". For writing functional or acceptance tests, py.test features a unique dependency injection mechanism for managing test resources. The talk shows how to set up these resources and how to configure them via command line options. More recently, QA people from Mozilla and Red Hat have come to appreciate these unique features and the general customizability. The talk concludes with a look at other features like distributing test load and other recently released plugins.

    This is the planned series of topics:

    unit- and functional testing

    why pytest and not Python's bundled unittest package?

    simple test example and assertions

    example of dependency injection

    example usage from webqa mozilla project

    mocking and monkeypatching

    distributing test load to processors

    non-python test discovery

    customized reporting

    outlook on future releases

    At 2:40pm to 3:20pm, Friday 9th March

    In D5, Santa Clara Convention Center

  • Throwing Together Distributed Services With Gevent

    by Jeff Lindsay

    In this talk we learn how to throw together a distributed system using gevent and a simple framework called gservice. We'll go from nothing to a distributed messaging system ready for production deployment based on experiences building scalable, distributed systems at Twilio.

    As some have found, gevent is one of the best kept secrets of Python. It gives you fast, evented network programming without messes of callbacks, code that is more Pythonic, and lets you use most regular Python networking libraries and protocol implementations. Now, let's build on this.

    In this talk we learn how to throw together distributed services using gevent and a simple framework called gservice. We'll go from nothing to a distributed messaging system based on experiences building scalable, distributed systems at Twilio.

    This talk will be full of code, live coding, and real production applications with guest appearances by other fun technologies like ZeroMQ, WebSocket, and Doozer.

    At 2:40pm to 3:20pm, Friday 9th March

    In E4, Santa Clara Convention Center

    Coverage video

  • Decorators and Context Managers

    by Dave Brondsema

    Learn how decorators and context managers work, see several popular examples, and get a brief intro to writing your own. Decorators wrap your functions to easily add more functionality. Context managers use the 'with' statement to make indented blocks magical. Both are very powerful parts of the Python language; come learn how to use them in your code.

    Decorators wrap or modify your functions. They are functions themselves, so you can pass parameters to them. You can wrap a function with multiple decorators too. We'll cover how to use decorators in all those situations and how to write them too. Examples will be demonstrated with:

    • @property
    • @memoize
    • mock/patch
    • TurboGears
    • Django
    • Allura
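    A minimal sketch of the two decorator shapes described above, not code from the talk: memoize is a plain decorator, and repeat is a hypothetical decorator that takes a parameter. Both names and the cache attribute are illustrative assumptions.

```python
import functools


def memoize(func):
    """Cache results of func keyed by its positional arguments (hashable only)."""
    cache = {}

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    wrapper.cache = cache  # exposed only so this sketch can be inspected
    return wrapper


def repeat(times):
    """Decorator factory: the outer call receives the parameter."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return [func(*args, **kwargs) for _ in range(times)]
        return wrapper
    return decorator


@memoize
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


@repeat(times=3)
def greet(name):
    return "hello " + name
```

    Because the recursive calls inside fib go through the decorated name, each value is computed only once.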

    Context managers wrap a block of code using the 'with' statement to do something on the way into the block and something on the way out. Opening and closing a file is a very common case, but there is a lot more you can do. Examples include:

    • modifying state
    • capturing stdout
    • mock/patch
    • locks
    • timing
    • transactions
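    Two of the cases listed above (capturing stdout and timing) can be sketched with the standard library alone. This is an illustration under assumptions, not the speaker's code; capture_stdout and Timer are hypothetical names.

```python
import contextlib
import io
import sys
import time


@contextlib.contextmanager
def capture_stdout():
    """Temporarily replace sys.stdout and yield the buffer collecting output."""
    buffer = io.StringIO()
    original = sys.stdout
    sys.stdout = buffer          # on the way into the block
    try:
        yield buffer
    finally:
        sys.stdout = original    # on the way out, even if the block raised


class Timer:
    """Class-based context manager that records elapsed wall-clock time."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self.start
        return False  # do not swallow exceptions raised in the block


with Timer() as timer:
    with capture_stdout() as out:
        print("inside the block")
```

    The generator form suits small one-off managers; the class form is handy when the manager needs to keep state such as elapsed.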

    More about decorators: http://www.python.org/dev/peps/p...

    More about context managers: http://www.python.org/dev/peps/p...
    http://docs.python.org/library/c...

    At 3:20pm to 3:50pm, Friday 9th March

    In E3, Santa Clara Convention Center

  • Fake It Til You Make It: Unit Testing Patterns With Mocks and Fakes

    by Brian K. Jones

    In this talk, aimed at intermediate Pythonistas, we'll have a look at some common, simple patterns in code, and the testing patterns that go with them. We'll also discover what makes some code more testable than others, and how mocks and fakes can help isolate the code to be tested (and why you want to do that). Finally, we'll touch on some tools to help make writing and running tests easier.

    Overview
    You've heard the gospel of 'test, test, test!' over and over again, and may have even felt some jealousy or guilt because you're not using Test-Driven Development. Maybe you've even seen talks or read blog posts about writing 'testable code', but it just hasn't sunk in.

    The reality is that writing effective unit tests can be somewhat difficult to wrap your head around. What's a unit test? When is a unit test not a unit test? What's a functional test? When is a Mock really a fake or stub? There's a good bit of lingo, a fair amount of religion, but not enough instruction on effective testing patterns and idiomatic, Pythonic testing practices.

    As programming and application architecture is heavily influenced by the use of patterns, it's only logical that those patterns produce the side effect of making the way they'll be tested more predictable, and yet discussions of patterns regularly leave out coverage of testing, and most testing talks fail to link a methodology to patterns in the code. This changes now.

    In this talk, aimed at intermediate Pythonistas, we'll have a look at some common, simple patterns in code, and then have a look at the testing patterns that go with them. We'll also discover what makes some code more testable than others, and how mocks and fakes can help truly isolate the code to be tested (and why you really want to do that). Finally, we'll touch on some tools to help make writing and running tests easier.

    Outline
    What is a unit test? (3 minutes)
    Unit Test definition
    Unit tests vs. functional, integration, and acceptance tests
    Why Unit Tests? (3 minutes)
    "Why isolate the code?" (I get this question a lot)
    "But, I use functional tests & have 100% coverage!"
    Three pieces of code, and how to make them more testable. (8 minutes)
    Patterns in code, patterns in tests (15 minutes)
    A Simple datetime abstraction library, its patterns and tests.
    A REST Client Module, its patterns and tests
    A cmd-module-based command shell, its patterns and tests
    A microframework-based service, its patterns and tests
    Tools You Want to Use (5 minutes)
    Mock
    Nose
    Coverage
    Tox
    TBD (Possibly PyCharm's test support, which is getting good w/ 2.0, but there are many candidates)
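    To make the fake-versus-mock distinction concrete, here is a small sketch built around the "datetime abstraction" idea from the outline. It is an assumption about what such a test could look like, not material from the talk; greeting and FakeClock are hypothetical. It uses unittest.mock, which ships with Python 3.3 and later (in 2012 the same API lived in the standalone Mock package).

```python
import datetime
from unittest import mock


def greeting(clock=datetime.datetime):
    """Return a greeting based on the hour reported by the injected clock."""
    hour = clock.now().hour
    return "Good morning" if hour < 12 else "Good afternoon"


class FakeClock:
    """A fake: a tiny working stand-in with canned behaviour."""

    def __init__(self, fixed):
        self.fixed = fixed

    def now(self):
        return self.fixed


def test_greeting_with_fake():
    clock = FakeClock(datetime.datetime(2012, 3, 9, 9, 0))
    assert greeting(clock) == "Good morning"


def test_greeting_with_mock():
    clock = mock.Mock()
    clock.now.return_value = datetime.datetime(2012, 3, 9, 15, 20)
    assert greeting(clock) == "Good afternoon"
    clock.now.assert_called_once_with()  # a mock also records how it was used
```

    Passing the clock in as a parameter is what makes the function testable in isolation: neither test depends on the real time of day.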

    At 3:20pm to 4:05pm, Friday 9th March

    In D5, Santa Clara Convention Center

  • Make Sure Your Programs Crash

    by Moshe Zadka

    With Python, segmentation faults and the like simply don't happen -- programs do not crash. However, the world is a messy, chaotic place. What happens when your programs crash? I will talk about how to make sure that your application survives crashes, reboots and other nasty problems.

    Handling crashes is divided into two parts -- resilience (making sure that your software maintains correctness in the face of crashes) and speed of recovery (optimizing the time it takes to get back to full working condition). I will talk about techniques to allow for resilience -- separating master data from cache data, minimizing the amount of master data, using atomic file operations, using databases and persisting structures in the right order. Then I will talk about speedy recovery techniques, among them process separation, working while restarting and more. I will conclude by surveying the options for testing all of these things so that the crashes are made to happen in the development/testing environment.
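    One common way to get an atomic file operation is to write a temporary file and rename it over the target. The sketch below assumes that approach (the talk may have used a different one); atomic_write_json is a hypothetical helper name.

```python
import json
import os
import tempfile


def atomic_write_json(path, data):
    """Write data to path so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # the rename is atomic on POSIX systems
    except BaseException:
        # Remove the partial temp file; the original file is left untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

    If the process dies before os.replace, the old file is still intact; if it dies after, the new file is complete. A half-written target never becomes visible.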

    Outline:

    • Ways Python programs can crash
    • Infinite loops
    • Getting stuck
    • Memory leaks
    • Exceptions
    • Catching exceptions considered scary
    • Thread deadlocks
    • Minimizing effects of a crash
    • Atomic file operations
    • Databases
    • Vertical process splitting
    • Horizontal process splitting
    • Limiting process lifetime
    • Detecting crashes
    • Process death
    • Process unresponsiveness
    • Test communication
    • Helper checker processes
    • Restarting processes
    • Minimize master data
    • Boot-up speed
    • Order of start-up and communication
    • Testing by killing processes
    • Testing by pausing processes
    • Conclusions
    • Python processes can still crash
    • Plan for crashes
    • Test your plan for crashes

    At 3:20pm to 3:50pm, Friday 9th March

    In E1, Santa Clara Convention Center

  • Putting Python in PostgreSQL

    by Frank Wiles

    PostgreSQL is pretty powerful all on its own, but did you know you can use Python as a stored procedure language? Not only does using a familiar language make development easier, but you get the power of the standard library and PyPI to boot. Come learn the ins and outs of putting Python in your DB.

    Pushing logic into your database can be a blessing or a curse. While these techniques aren't appropriate for most users in most situations, it's good to know what kind of power you have at your disposal with PL/Python if the need ever arises.

    Learn about:

    • Using triggers to offload data processing tasks
    • Firing off email and log alerts based on what is happening in your tables
    • Creating more granular and detailed constraints, such as verifying email addresses or credit card checksums at the database level
    • Replacing batch jobs with in-database triggers
    • When and when NOT to use these techniques
    • Debugging techniques
    • Security concerns
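    As an illustration of the credit card checksum idea mentioned above, the Luhn check below is plain Python of the kind that could serve as the body of a PL/Python function. The SQL wrapper (CREATE FUNCTION ... LANGUAGE plpython3u) is deliberately left out, and luhn_valid is a hypothetical name, not something from the talk.

```python
def luhn_valid(number):
    """Return True if the digits in the string pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

    A check like this only catches typing mistakes; it says nothing about whether the card exists.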

    At 3:20pm to 4:05pm, Friday 9th March

    In E2, Santa Clara Convention Center

  • Static analysis of Python extension modules using GCC

    by Dave Malcolm

    Want to analyse C/C++ code using Python? I've written a plugin for GCC that embeds Python inside the compiler, allowing you to write new C/C++ compilation passes in Python. I've used this to build a static analysis tool that understands the CPython extension API, and can automatically detect reference-counting bugs, and other errors.

    I've written a plugin for GCC that embeds Python inside the compiler, allowing you to write new C/C++ compilation passes in Python.

    I've used this to build a static analysis tool that understands the CPython extension API, and can automatically detect various errors (e.g. reference counting mistakes).

    I'll be talking about how to use the GCC plugin to analyse C and C++ code with Python scripts, and giving a guided tour of the static analysis tool on some real-world Python extension modules.

    At 3:20pm to 4:05pm, Friday 9th March

    In E4, Santa Clara Convention Center

  • Permission or Forgiveness?

    by Alex Martelli

    Grace Murray Hopper's famous motto, "It's easier to ask forgiveness than permission", has many useful applications -- in Python, in concurrency, in networking, as well of course as in real life. However, it's not universally valid. This talk explores both useful and damaging applications of this principle.

    I'll start by introducing the motto "It's easier to ask forgiveness than permission" and the woman who used it, Rear Admiral Grace Murray Hopper, also known as the "mother of COBOL" and the author of the first ever programming-language compiler.

    I then move on to the Python context, where the motto supports the proper usage of exception-catching rather than preliminary checks; and the "rule that proves the exception" introduced by abstract base classes.
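    The contrast between preliminary checks and exception-catching can be sketched as follows. The function names and the default port are assumptions made for the example.

```python
def port_lbyl(config):
    """Ask permission: check every precondition before acting."""
    if (
        "port" in config
        and isinstance(config["port"], str)
        and config["port"].isdigit()
    ):
        return int(config["port"])
    return 8080


def port_eafp(config):
    """Ask forgiveness: act first, then handle the failure."""
    try:
        return int(config["port"])
    except (KeyError, ValueError, TypeError):
        return 8080
```

    The second form states the normal path directly and names the failures it tolerates, rather than trying to predict them all in advance.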

    Expanding the subject, I show how "optimistic concurrency" applies that motto (while locking would "ask permission", in essence, STM "asks forgiveness"), and how collision-detection focused networking protocols have similarly triumphed over more highly structured, "ask permission" ones like token-ring.
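    A toy sketch of optimistic concurrency: each writer computes its update without holding a lock and commits only if nobody else committed in the meantime, retrying otherwise. This is not STM; the short lock inside compare_and_set merely stands in for a hardware compare-and-swap, and VersionedCell is a hypothetical class.

```python
import threading


class VersionedCell:
    """A value with a version number; a commit succeeds only if it is current."""

    def __init__(self, value):
        self._value = value
        self._version = 0
        self._lock = threading.Lock()  # guards only the brief commit step

    def read(self):
        with self._lock:
            return self._value, self._version

    def compare_and_set(self, expected_version, new_value):
        with self._lock:
            if self._version != expected_version:
                return False  # someone else committed first
            self._value = new_value
            self._version += 1
            return True


def optimistic_update(cell, func):
    """Compute without holding a lock; retry if the commit is rejected."""
    while True:
        value, version = cell.read()
        if cell.compare_and_set(version, func(value)):
            return


def add_many(cell, count):
    for _ in range(count):
        optimistic_update(cell, lambda v: v + 1)


counter = VersionedCell(0)
workers = [threading.Thread(target=add_many, args=(counter, 200)) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

    No increment is lost: a rejected commit is simply recomputed from the fresh value.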

    Moving to the fuzzier context of real life, I then show how this daring approach does not work quite as well as in the technical realm -- except when applied correctly, in the right circumstances... and I try to evince a general law describing what the right circumstances for its application are, comparing and contrasting with the similar issue of "do it right the first time" versus "launch and iterate" (and the latter's cognate "fail, but fail fast" principle).

    At 4:25pm to 5:20pm, Friday 9th March

    In E1, Santa Clara Convention Center

  • Python Metaprogramming for Mad Scientists and Evil Geniuses

    by Walker Hale

    This talk covers the power and metaprogramming features of Python that cater to mad scientists and evil geniuses. This will also be of interest to others who just want to use Python in a more powerful (or power-hungry) way. The core concept is that you can synthesize functions, classes and modules without a direct correspondence to source code. You can also mutate third-party objects and apps.

    This talk covers the power and metaprogramming features of Python that cater to mad scientists and evil geniuses. This will also be of interest to others who just want to use Python in a more powerful (or power-hungry) way.

    Users of Python are not limited to the usual model of a one-to-one correspondence between source code and live objects. Python allows you to synthesize functions, classes and modules without a direct correspondence to source code. You can mutate third-party objects, classes, modules and applications through monkey patching -- changing their behavior without altering their source code. You can even "chop-up" third-party objects to create new objects from the pieces. Find out how to unleash your inner Mad Scientist!
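    A hedged sketch of what synthesis and monkey patching can look like with the standard library only. The names double, Point and toolbox are invented for the example, and the json module merely plays the role of third-party code.

```python
import json
import types

# Synthetic function: execute source text held in a string, then pull the
# resulting function object out of the namespace it was defined in.
namespace = {}
exec("def double(x):\n    return x * 2", namespace)
double = namespace["double"]


# Synthetic class: type(name, bases, attributes) builds a class at run time.
def _init(self, x, y):
    self.x, self.y = x, y


Point = type("Point", (object,), {
    "__init__": _init,
    "norm1": lambda self: abs(self.x) + abs(self.y),
})

# Synthetic module: a module object with no file behind it.
toolbox = types.ModuleType("toolbox")
toolbox.double = double

# Monkey patching: swap a function on an existing module, then restore it.
original_dumps = json.dumps


def sorted_dumps(obj, **kwargs):
    kwargs.setdefault("sort_keys", True)
    return original_dumps(obj, **kwargs)


json.dumps = sorted_dumps
try:
    patched_output = json.dumps({"b": 1, "a": 2})
finally:
    json.dumps = original_dumps  # always undo the patch
```

    Restoring the original in a finally block keeps the patch from leaking into unrelated code.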

    Thesis: Python is an ideal language for both:

    • Mad Scientists
    • Evil Geniuses
    • Mad Scientist versus Evil Genius
    • Mad Scientist: creating new things because it's cool
    • Evil Genius: practical applications
    • Typical Mad Science Goals
    • Create new living code objects from scraps without corresponding source code.
    • Mutate third-party code to suit our purposes without modifying the third-party source code.
    • Synthetics
    • Synthetic Functions
    • Synthetic Classes
    • Synthetic Modules
    • Applications of Synthetics
    • Monkey Patching
    • Monkey Patching Modules
    • Monkey Patching Classes
    • Monkey Patching Instances
    • sitecustomize.py
    • Dealing with Angry Villagers
    • Limitations: When not to do this
    • For the Evil Geniuses

    Although most of the material is presented from the point of view of the Mad Scientist, it is equally useful to the Evil Genius.

    Since the Python community prides itself on diversity, I should emphasize that the sane, the non-evil, and "do-gooders" are all welcome.

    At 4:25pm to 5:20pm, Friday 9th March

    In E3, Santa Clara Convention Center


  • Sage: Open Source Math in Python

    by Christopher Swenson

    A quick introduction to Sage, an open-source mathematics package for experimentation in all areas of mathematics. There will be some brief remarks and demos of what Sage is capable of.

    This talk is meant to give users a taste of what Sage is like, and why they might use it in the future.

    Intended for all levels of users, but the more Python you know, the easier it will be to use Sage.

    Sage is meant to be the open source platform for developing mathematical software, combining the best of other utilities. It uses Python as its primary development language, to give it a clean and powerful interface.

    This talk will introduce attendees to Sage, and demo some of its capabilities.

    Specifically, I'd cover some of the following features:

    Very quick intro: what Sage is and isn't, and how it compares to the Big Ms: Mathematica, Matlab, Magma, etc.
    Sage interpreter and notebook interface
    Standard Sage functions
    Plotting
    Algebra and Polynomials
    Number Theory
    Extending Sage: Python & Cython
    Other things the audience cares about!
    This tutorial will not require any advanced math to understand and enjoy, but I will try to touch on as much of Sage as I can during the talk.

    I'm not a Sage expert, but merely an enthusiast.

    At 4:40pm to 5:20pm, Friday 9th March

    In E2, Santa Clara Convention Center

  • Introspecting Running Python Processes

    by Adam Lowry

    Understanding the internal state of a running system can be vital to maintaining a high performance, stable system, but conventional approaches such as logging and error handling only expose so much. This talk will touch on how to instrument Python programs in order to observe the state of the system, measure performance, and identify ongoing problems.

    Something is wrong with your web application. The time it’s taking to serve requests is growing. Your logs don’t contain enough. Your database appears bored. How do you know what’s going wrong?

    In high-performance production servers it’s vital to know as much about the internals of your system as possible. Traditionally this is done by simple methods like logging anything of potential interest or sending error emails with unexpected exceptions. These methods are insufficient, both due to the level of noise inherent in such systems and because of the difficulty in anticipating what metrics are important during an incident.

    Environments such as the JVM and .Net VM have advanced tools for communicating with the VM and for applications to expose internal state, but CPython has lacked similar tooling.

    This talk will cover what options CPython application developers have for introspecting their programs; new tools for instrumenting, exposing, and compiling performance and behavior metrics; and techniques for diagnosing runtime issues without restarting the process.
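    One low-tech way to look inside a running CPython process is to dump the stack of every thread. The sketch below is an assumption about the kind of technique meant, not the speaker's tooling; dump_thread_stacks is a hypothetical helper built on sys._current_frames().

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a text report of what every thread is doing right now."""
    names = {thread.ident: thread.name for thread in threading.enumerate()}
    lines = []
    # sys._current_frames() maps each thread id to its topmost stack frame.
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %s (%s)" % (names.get(thread_id, "unknown"), thread_id))
        lines.extend(line.rstrip() for line in traceback.format_stack(frame))
    return "\n".join(lines)
```

    On Unix, a function like this could be registered as a signal handler so that a report can be requested from outside without restarting the process. Python 3.3 and later also ship the faulthandler module for similar purposes.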

    At 5:20pm to 6:00pm, Friday 9th March

    In E1, Santa Clara Convention Center

  • pandas: Powerful data analysis tools for Python

    by Wes McKinney

    pandas is a Python library providing fast, expressive data structures for working with structured or relational data sets. In addition to being used for general purpose data manipulation and data analysis, it has also been designed to enable Python to become a competitive statistical computing platform. In this talk, I will discuss the library's features and show a variety of topical examples.

    At 5:20pm to 6:00pm, Friday 9th March

    In E2, Santa Clara Convention Center


Saturday 10th March 2012
