The Django framework is a fast, flexible, easy to learn, and easy to use framework for designing and deploying web sites and services using Python. In this session, we'll cover the fundamentals of development with Django, generate a Django data model, and put together a simple web site using the framework.
This tutorial introduces programmers with basic Python skills to the concepts and techniques of event-driven programming. The focus is on understanding an event loop and handling the events related to TCP connections. Twisted is introduced as a reusable event loop implementation, and the abstract concepts of event-driven programming are related to specific uses of the Twisted library.
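To make the idea of an event loop concrete, here is a minimal sketch using only the standard library's `selectors` module. It is not Twisted's API; `run_event_loop` and `on_readable` are hypothetical names, and `socket.socketpair()` stands in for an accepted TCP connection.

```python
import selectors
import socket


def run_event_loop(selector, max_iterations):
    """Wait for ready sockets and dispatch each one to its callback."""
    for _ in range(max_iterations):
        if not selector.get_map():
            break  # nothing left to watch
        for key, _mask in selector.select(timeout=1.0):
            callback = key.data  # the handler registered for this socket
            callback(selector, key.fileobj)


received = []


def on_readable(selector, conn):
    """Handle a 'data received' or 'connection lost' event."""
    data = conn.recv(1024)
    if data:
        received.append(data)
    else:
        # An empty read means the peer closed the connection.
        selector.unregister(conn)
        conn.close()


# socketpair() is an assumption standing in for a real TCP connection.
client, server = socket.socketpair()
server.setblocking(False)

sel = selectors.DefaultSelector()
sel.register(server, selectors.EVENT_READ, data=on_readable)

client.sendall(b"hello")
client.close()
run_event_loop(sel, max_iterations=10)
sel.close()
```

The loop itself knows nothing about the protocol; all behaviour lives in the callback, which is the separation an event-driven library builds on.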
IPython provides tools for interactive and parallel computing that are widely used in scientific computing, but can benefit any Python developer. We will show how to use IPython in different ways, as: an interactive shell, an embedded shell, a graphical console, a network-aware VM in GUIs, a web-based notebook with code, graphics and rich HTML, and a high-level framework for parallel computing.
IPython started as a better interactive Python interpreter in 2001, but over the last decade it has grown into a rich and powerful set of interlocking tools aimed at maximizing developer productivity with Python while using the language interactively.
Today, IPython consists of a kernel that executes the user code and controls the user's namespace, and a collection of tools to control this kernel either in-process or out-of-process thanks to a well-specified communications protocol implemented over ZeroMQ. The kernel can do much more than execute user code, including introspection of objects in the user's namespace, detailed error reporting with rich tracebacks, history logging of inputs and outputs with an SQLite backend, a user-extensible system of commands for interactive control that don't collide with user variables, and much more.
Our communications architecture allows these same features to be accessed via a variety of clients, each providing unique functionality tuned to a specific use case. We expose a number of directly usable applications:
An interactive, terminal-based shell with many capabilities far beyond the default Python interactive interpreter (this is the default application opened by the ipython command that most users are familiar with).
A Qt console that provides the look and feel of a terminal, but adds support for inline figures, graphical calltips, a persistent session that can survive crashes (even segfaults) of the kernel process, and more. A user-based review of some of these features can be found here.
A web-based notebook that can execute code and also contain rich text and figures, mathematical equations and arbitrary HTML. This notebook controls the same kernel as the other two applications, but instead of offering a linear, terminal-like workflow, it presents a document-like view with cells where code is executed but that can be edited in-place, reordered, mixed with explanatory text and figures, etc. This model is a kind of literate programming environment popular in scientific computing and pioneered by the Mathematica system, that allows for the creation of rich documents that combine computational experimentation and results with other explanatory elements. A detailed review of this system can be found here.
A high-performance, low-latency system for parallel computing that supports the control of a cluster of IPython engines communicating over ZeroMQ, with optimizations that minimize unnecessary copying of large objects (especially numpy arrays). These engines can be controlled interactively while developing and doing exploratory work, or can run in batch mode either on a local machine or in a large cluster/supercomputing environment via a batch scheduler.
In this hands-on, in-depth tutorial, we will briefly describe IPython's architecture and will then show how to use and configure each of the above components. We will also discuss how to use the underlying IPython libraries in your own application to provide interactive control.
An outline of the tutorial follows:
A short listing of other features not covered in this tutorial, as guidance for users to later learn about on their own.
For full details about IPython including documentation, previous presentations and videos of talks, please see the project website.
A tutorial that goes beyond all other Django tutorials; we'll dive deep into the guts of the framework, and learn how each commonly-used component -- ORM, templates, HTTP handling, views and the admin -- works from the bottom up, covering both public and internal APIs in excruciating detail.
by Ian Ozsvald
At EuroPython 2011 I ran a very hands-on tutorial for High Performance Python techniques. This updated tutorial will cover profiling, PyPy, Cython, numpy, NumExpr, ShedSkin, multiprocessing, ParallelPython and pyCUDA. Here's a 55 page PDF write-up of the EuroPython material: http://ianozsvald.com/2011/07/25...
At EuroPython 2011 I ran a very hands-on tutorial for High Performance Python techniques. This updated tutorial will cover:
I plan to expand the original material and to maybe also cover other tools like execnet and PyPy-numpy.
We go from how the operating system handles your requests through to design principles for using concurrency and parallelism to optimize your program's performance and scalability. We will cover processes, threads, generators, coroutines, non-blocking IO, and the gevent library.
How processes, threads, coroutines, and non-blocking IO work from the operating system through code implementation and design principles to optimize Python programs. The difference between parallelism and concurrency and when to use each.
The premise is that to make an informed decision you need to know what is happening under the hood. Once you understand the low level functionality, you can make the correct decision in the design phase.
The emphasis is on practical application to solve real world problems.
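As a small illustration of concurrency without parallelism, the sketch below interleaves two generator-based tasks on a single thread. The names `countdown` and `run_round_robin` are hypothetical, and this is a toy scheduler rather than how gevent is implemented.

```python
from collections import deque


def countdown(name, n, log):
    """A task that does one unit of work, then yields control."""
    while n > 0:
        log.append((name, n))
        n -= 1
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Run each task in turn until all of them have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)  # resume the task until its next yield
        except StopIteration:
            continue  # task finished, do not reschedule it
        queue.append(task)


log = []
run_round_robin([countdown("a", 2, log), countdown("b", 3, log)])
```

Only one task runs at any moment, yet both make progress, which is the distinction between concurrency and parallelism.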
by Edgar Roman
Our challenge was to create a login system for little people who might barely read, maybe no email, perhaps no home computer. And we had to watch out for privacy laws - especially tough for minors. But these kids want to play games, write stories, and create online avatars to share and compete against their buddies. Listen to how we developed the PBS KIDS login and moderation system in Django.
by Idan Gazit
The ultimate goal of data visualization is to tell a story and supply meaning. There are tools and science that can inform your choice of data to present and how best to present it. We reflexively evaluate data and fit it into a narrative which aids decisionmaking; learn how to take advantage of this tendency in order to deliver meaning, not just numbers and charts.
Data visualization is a hot field right now—and for good reason. In our age of info-saturation, true value is found in distilling large amounts of data into a form that is easy to comprehend and act upon. This talk provides an overview of tools and techniques which you can use to level up your data presentation, regardless of application.
As humans, we are adept at evaluating visual information. From an early age, we learn to make inferences about things based on their visual properties—large and small, near and far, motion, direction, and other attributes. Taking advantage of the visual process we’ve been practicing since birth is an easy way to optimize delivery of your data into the brains of your audience.
Unfortunately, it isn’t enough to appeal to the part of our brains responsible for figuring out whether we can successfully hit an animal with a rock. A great visualization must appeal to our sense of beauty. Structure, layout, typography, and color are all tools which can be used (and abused) to delight your audience and direct their attention where you want it to go.
Whether you’re building an information dashboard for a webapp or presenting scientific data, an understanding of these techniques will make your data more accessible to your audience, and more of a delight to read and learn from.
by Matt Spitz
There are a plethora of options when it comes to deciding how to add a machine learning component to your python application. In this talk, I'll discuss why python as a language is well-suited to solving these particular problems, the tradeoffs of different machine learning solutions for python applications, and some tricks you can use to get the most out of whatever package you decide to use.
This is the age of data. As more companies expose their datasets through APIs, it's becoming increasingly easy to pull information about users, places, and things. But having this data isn't always enough; we want to understand it, find correlations, and identify trends. Fortunately, the area of computer science known as machine learning has a variety of algorithms specifically designed to do this sort of data wrangling. For the python application developer, there are many off-the-shelf toolkits that include implementations of these algorithms (Orange, NLTK, SHOGUN, PyML and scikit-learn to name just a few), but choosing which one to use can be daunting.
There are a number of tradeoffs one makes when making a selection, depending on the specifics of the implementation and the needs of the application. In this talk, I'll give an overview of some of the packages available and discuss what factors might go into deciding which one to use. I'll also offer some python-specific tricks you can use to work with large amounts of data efficiently.
by Erik Rose
Mozilla's projects have thousands of tests, so we've had to venture beyond vanilla test runners to keep things manageable. Our secret sauce can be used with your project as well. Reach beyond the test facilities that came with your project, harnessing pluggable test frameworks, dynamically reordering tests for speed, exploring various mocking libraries, and profiling your way to testing nirvana.
A partial outline:
Motivation: a test not run is no test at all.
For most web apps, the easiest test speed win is a conquest of I/O.
The nose testrunner
Test discovery lets you organize tests well.
Gluing to projects with custom testrunners: django-nose and test-utils
Compare to nose. Nose forked from it. Explain history.
Very cool assertion re-evaluation
Plugin compatibility between py.test and nose
Start here. Premature optimization sucks.
time on the commandline to divide CPU from I/O
Killing I/O for speedy justice: case study of support.mozilla.com
Fixture speed hacks (a 5x improvement!)
How to use DB transactions to avoid repetitive I/O
Dynamic test reordering and fixture sharing
DB reuse and other startup optimizations
37,583 queries to 4,116. Watch them fly by!
What to do instead of fixtures: the model-maker pattern
Using mocking to kill the fixtures altogether
mock, the canonical lib
fudge, new declarative hotness
Example: oedipus, a better API for the Sphinx search engine. I used fudge to unit-test oedipus without requiring devs to set up and populate Sphinx.
Dangers of mocking
Don't mock out your caching unless your invalidation is perfect.
Some of our mistakes in oedipus
The nose-progressive display engine
Test results that are a pain to read don't get read.
Elision of junk frames
Easier round-tripping from test failure to source code
Next steps: what to do once you're CPU-bound
Multithreading really buys you no speed bump for CPU-bound (or I/O bound?) tasks in Python due to the GIL. (Ref: PyCodeConf talk by David Beazley.)
State of multiprocess plugins in various testrunners.
Mozilla's Jenkins test farm
QA's big stacks of Mac Minis
What global warming? ;-)
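The mocking idea from the outline above can be sketched with `unittest.mock`, the standard-library descendant of the mock library. The function `count_results` and its `search_client` argument are hypothetical and are not part of oedipus or any Mozilla code.

```python
from unittest import mock


def count_results(search_client, term):
    """Code under test: asks a search backend and counts the hits."""
    return len(search_client.query(term))


# Replace the real backend with a mock so no search server or
# fixture data is needed to run the test.
fake_client = mock.Mock()
fake_client.query.return_value = ["doc1", "doc2"]

hits = count_results(fake_client, "python")

# The mock records how it was called, so the interaction can be checked.
fake_client.query.assert_called_once_with("python")
```

The test touches no disk or network, but it also only proves that the code agrees with the mock, which is one of the dangers the outline mentions.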
We will show how to build simple yet powerful RPC code with ZeroMQ, with very few (if any!) modifications to existing code. We will build fan-in and fan-out topologies with ZeroMQ special socket types to implement PUB/SUB patterns and scale up job-processing tasks. Thanks to introspection, the resulting services will be self-documented. Finally, we will show how to implement distributed tracing.
We will show how to leverage ZeroMQ to build a simple yet powerful RPC for Python code. We will focus on simplicity, the goal being to expose almost any Python module or class to network calls – with very few (if any!) modifications to existing code.
We will then explain the purpose and show some use-cases for ZeroMQ special socket types (PUSH/PULL, PUB/SUB, ROUTER/DEALER) to build fan-in and fan-out topologies, as well as asynchronous processing (to avoid blocking when doing long-running requests). A by-product is the ability to scale up job-processing tasks with a message queue, which can even be made broker-less (you don’t have to deploy heavy machinery if you don’t need it).
We will also demonstrate how introspection can make development and debugging easier, exposing docstrings, and providing a few command-line helpers to poke, debug, and experiment directly from the shell.
At the end of the talk (or in a separate talk), we will explain how to implement a tracing framework for distributed RPC. By hooking into the right places, we will show how to get full tracebacks and profiling information; more precisely:
how each complex call (involving multiple subcalls) can be accurately traced;
how to handle exceptions, and know easily when and where they happened (without checking dozens of log files);
which complex calls take too long, and where they spend their time (distributed profiling).
These guidelines are the result of ongoing development work at dotCloud, and are actively used and implemented at the core of our leading Platform-as-a-Service offering.
We don’t expect the audience to be familiar with ZeroMQ or RPC. However, it will certainly help to have basic knowledge of serialization (e.g. pickle) and sockets.
For many DSLs such as templating languages it's important to use code generation to achieve acceptable performance in Python. The current version of Jinja went through many different iterations to end up where it is currently. This talk walks through the design of Jinja2's compiler infrastructure and why it works the way it works and how one can use newer Python features for better results.
Why Code Generation?
It seems like the general consensus for code generation in many dynamic language communities is: eval is evil, do not use it. However if done properly code generation solves a lot of problems easily, securely and with much better performance than an interpreter written on top of an interpreted language like Python.
Code generation is what powers most template languages in Python, what powers object relational mappers and more. It is also an excellent tool to simplify debugging.
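As a rough illustration of the approach, the sketch below turns a tiny template into Python source, compiles it once, and then calls the resulting function. It is a toy, not Jinja2's compiler; `compile_template`, the `{{ name }}` syntax handling, and the generated `render` function are assumptions made for this example.

```python
import re

# Matches {{ variable }} placeholders.
TOKEN = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def compile_template(source):
    """Translate a template into Python source, then compile it once."""
    lines = ["def render(context):", "    parts = []"]
    pos = 0
    for match in TOKEN.finditer(source):
        literal = source[pos:match.start()]
        if literal:
            # repr() produces a safely quoted string literal.
            lines.append("    parts.append(%r)" % literal)
        lines.append("    parts.append(str(context[%r]))" % match.group(1))
        pos = match.end()
    if source[pos:]:
        lines.append("    parts.append(%r)" % source[pos:])
    lines.append("    return ''.join(parts)")

    code = compile("\n".join(lines), "<template>", "exec")
    namespace = {}
    exec(code, namespace)  # defines render() inside namespace
    return namespace["render"]


render = compile_template("Hello {{ name }}, you have {{ count }} messages.")
output = render({"name": "Ada", "count": 3})
```

The template is parsed only once; every later call runs plain Python code, which is where the speed advantage over re-interpreting the template comes from.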
Why Codegen is no Silver Bullet
Just because you generate code does not mean you're faster than an interpreter written in Python. This part of the talk focuses on why compiling Django templates to Python bytecode does not automatically make it fast.
Design of Jinja2
Jinja2 underwent multiple design iterations, most of which were made to improve either performance or debuggability. The internals, however, are largely undocumented and confusing unless you're familiar with the code. Hidden in them, though, are a few gems and interesting tricks that make code generation work in the best possible way.
Python's Support for Code Generation
Over the years, Python's support for code generation has steadily improved, with different ways to access the abstract syntax tree and to compile it back to bytecode. This section highlights some alternative ways to do code generation that are not yet fully implemented in Jinja2 but are otherwise widely used.
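One such route is the standard library's `ast` module: parse source into a tree, modify the tree, and compile it back to bytecode. The sketch below is only an illustration of that round trip; the `SwapAddForSub` transformer is a hypothetical example and says nothing about what Jinja2 does internally.

```python
import ast


class SwapAddForSub(ast.NodeTransformer):
    """Rewrite every a + b in the tree into a - b."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node


tree = ast.parse("result = 10 + 4")
tree = SwapAddForSub().visit(tree)
ast.fix_missing_locations(tree)  # fill in line numbers for new nodes

code = compile(tree, "<generated>", "exec")
namespace = {}
exec(code, namespace)
```

Working on the tree instead of on strings avoids quoting and indentation mistakes in the generated code.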
by Jeremiah Jordan
Using Apache Cassandra from Python is easy to do. This talk will cover setting up and using a local development instance of Cassandra from Python. It will cover using the low level thrift interface, as well as using the higher level pycassa library.
Learn how decorators and context managers work, see several popular examples, and get a brief intro to writing your own. Decorators wrap your functions to easily add more functionality. Context managers use the 'with' statement to make indented blocks magical. Both are very powerful parts of the python language; come learn how to use them in your code.
Decorators wrap or modify your functions. They are functions themselves, so you can pass parameters to them. You can wrap a function with multiple decorators too. We'll cover how to use decorators in all those situations and how to write them too. Examples will be demonstrated with:
Context managers wrap a block of code using the 'with' statement to do something on the way into the block and something on the way out. Opening and closing a file is a very common case, but there is a lot more you can do. Examples include:
More about decorators: http://www.python.org/dev/peps/p...
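A minimal sketch of both ideas follows: a decorator that takes a parameter, and a context manager built with `contextlib.contextmanager`. The names `repeat`, `greet`, and `tracked` are hypothetical and are not taken from the talk.

```python
import functools
from contextlib import contextmanager


def repeat(times):
    """Decorator factory: call the wrapped function `times` times."""
    def decorator(func):
        @functools.wraps(func)  # keep the original name and docstring
        def wrapper(*args, **kwargs):
            return [func(*args, **kwargs) for _ in range(times)]
        return wrapper
    return decorator


@repeat(times=3)
def greet(name):
    return "hi " + name


events = []


@contextmanager
def tracked(label):
    """Record something on the way into the block and on the way out."""
    events.append("enter " + label)
    try:
        yield
    finally:
        events.append("exit " + label)  # runs even if the block raises


with tracked("block"):
    events.append("inside")
```

Calling `greet("bob")` returns a list with three greetings, and `events` shows the enter and exit steps wrapped around the body of the `with` block.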
by Moshe Zadka
With Python, segmentation faults and the like simply don't happen -- programs do not crash. However, the world is a messy, chaotic place. What happens when your programs crash? I will talk about how to make sure that your application survives crashes, reboots and other nasty problems.
Handling crashes is divided into two parts -- resilience (making sure that your software maintains correctness in the face of crashes) and speed of recovery (optimizing the time it takes to get back to full working condition). I will talk about techniques to allow for resilience -- separating master data from cache data, minimizing the amount of master data, using atomic file operations, using databases and persisting structures in the right order. Then I will talk about speedy recovery techniques, among them process separation, working while restarting and more. I will conclude with surveying the options in testing all of these things so that the crashes are made to happen in the development/testing environment.
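One common form of an atomic file operation is to write to a temporary file and then rename it over the target. The sketch below assumes that pattern; `atomic_write` is a hypothetical helper, not code from the talk.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` so readers see the old or new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # the rename is the atomic step
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process crashes before `os.replace`, the previous version of the file is still intact; only a stray temporary file may be left behind.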
by Frank Wiles
PostgreSQL is pretty powerful all on its own, but did you know you can use Python as a stored procedure language? Not only does using a familiar language make development easier, but you get the power of the standard library and PyPI to boot. Come learn the ins and outs of putting Python in your DB.
Pushing logic into your database can be a blessing or a curse. While these techniques aren't appropriate for most users in most situations, it's good to know what kind of power you have at your disposal with PL/Python if the need ever arises.
Learn about how to:
- Use triggers to offload data processing tasks
- Fire off email and log alerts based on what is happening in your tables
- Create more granular and detailed constraints, such as verifying email addresses or credit card checksums at the database level
- Replace batch jobs with in-database triggers
- Learn when and when NOT to use these techniques
- Debugging techniques
- Security concerns
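As an example of the kind of logic such a constraint might call, here is a plain-Python sketch of the Luhn checksum used for credit card numbers. The function name `luhn_valid` is hypothetical, and wrapping it in a PL/Python function and a table constraint is not shown.

```python
def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in str(number) if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0
```

Spaces and dashes are ignored, so `luhn_valid("4111 1111 1111 1111")` accepts the usual formatted test number.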
by Adam Lowry
Understanding the internal state of a running system can be vital to maintaining a high performance, stable system, but conventional approaches such as logging and error handling only expose so much. This talk will touch on how to instrument Python programs in order to observe the state of the system, measure performance, and identify ongoing problems.
Something is wrong with your web application. The time it’s taking to serve requests is growing. Your logs don’t contain enough. Your database appears bored. How do you know what’s going wrong?
In high-performance production servers it’s vital to know as much about the internals of your system as possible. Traditionally this is done by simple methods like logging anything of potential interest or sending error emails with unexpected exceptions. These methods are insufficient, both due to the level of noise inherent in such systems and because of the difficulty in anticipating what metrics are important during an incident.
Environments such as the JVM and .Net VM have advanced tools for communicating with the VM and for applications to expose internal state, but CPython has lacked similar tooling.
This talk will cover what options CPython application developers have for introspecting their programs; new tools for instrumenting, exposing, and compiling performance and behavior metrics; and techniques for diagnosing runtime issues without restarting the process.
by David Cramer
Practice iterative development like the pros. Release sooner, faster, and more often.
Continuous deployment (and testing) has started to become a reality for many companies. It brings to light one of the many problems that face large product teams, but also creates some of its own.
This talk will focus on the pros and cons of continuous deployment, how DISQUS switched from the recurring release cycle to continuous releases, as well as providing tips and arguments for adopting it in your workplace.
Django's template language is designed to strike a balance between power and ease of use; learn how to use this balance to create awesome looking websites. This talk will cover the basics and best practices of Django templating, from custom tag and filter creation, to the finer points of template rendering and loading, and even to replacing the default templating engine itself.
Harness the power of Django templates to help present your data with ease! Learn about:
Basic block formations, common patterns, and using includes wisely.
Tips and tricks in using the built-in template tags and filters.
How to make custom tags and filters: examples, what you should and shouldn’t do, and tools to help the process such as django-classy-tags.
Different ways to load and render templates.
Replacing Django’s default template language: pros and cons
Django Form processing often takes a back seat to flashier, more visible parts of the framework. But Django forms, fully leveraged, can help developers be more productive and write more cohesive code. This talk will dive deep into the stock Django forms package, as well as discuss a strategy for abstracting validation for forms, and the use of unit and integration tests with forms.
Django Form processing often takes a back seat to flashier, more visible parts of the framework. But Django forms are an integral part of the framework that can help developers be more productive and write more cohesive, well tested code. This talk will dive deep into the stock Django forms package, providing examples of:
We'll also discuss ways to build on Django forms, including:
* writing unit and integration tests for forms, and how writing tests can help you understand code cohesion
* abstracting validation for forms to provide tiered validation (for example, one set of criteria to save, additional criteria to publish)
* approaches to working with multiple, heterogeneous forms simultaneously
Python has great Unicode support, but it's still your responsibility to handle it properly. I'll do a quick overview of what Unicode is, but only enough to get your program working properly. I'll describe strategies to make your code work, and keep it working, without getting too far afield in Unicode la-la-land.
Python has great Unicode support, but it's still your responsibility to handle it properly. Even expert programmers get tripped up with the encodings and decodings that can happen implicitly, throwing errors in unexpected places.
This talk will present a quick overview of what Unicode is, why it exists, and how it works, but only enough to get your program working properly. Unicode can be intricate and fascinating, but really, who cares? You just want your code to work without throwing a UnicodeEncodeError every time an accented character sneaks in somehow.
I'll describe strategies to make your code work, and keep it working, without getting too far afield in Unicode la-la-land.
How Unicode is handled is one of the biggest changes in Python 3. I'll touch on what those changes are, and how you can use them to keep even your Python 2 code running smoothly.
Bytes vs. text
ASCII, 8859-1, etc.
Python 2: str vs unicode
encode and decode
Python 3: bytes vs str
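The bytes-versus-text distinction in the outline above can be shown in a few lines. This sketch uses Python 3 syntax, where `str` is text and `bytes` is encoded data; the variable names are illustrative only.

```python
text = "caf\u00e9"  # four characters of text, the last one accented

# Encoding turns text into bytes; the result depends on the codec.
utf8_bytes = text.encode("utf-8")         # b'caf\xc3\xa9', five bytes
latin1_bytes = text.encode("iso-8859-1")  # b'caf\xe9', four bytes

# Decoding turns bytes back into text, given the right codec.
round_trip = utf8_bytes.decode("utf-8")

# ASCII cannot represent the accented character, so encoding fails.
try:
    text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc
```

Decoding bytes as early as possible and encoding as late as possible keeps the implicit conversions, and their surprise errors, at the edges of the program.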
by Carl Meyer
A deep dive into writing tests with Django, covering Django's custom test-suite-runner and the testing utilities in Django, what all they actually do, how you should and shouldn't use them (and some you shouldn't use at all!). Also, guidelines for writing good tests (with or without Django), and my least favorite things about testing in Django (and how I'd like to fix them).
Django has a fair bit of custom test code: a custom TestSuiteRunner, custom TestCase subclasses, some test-only monkeypatches to core Django code, and a raft of testing utilities. I'll cover as much of that code as I find interesting and non-trivial, taking a close look at what it's actually doing and what that means for your tests.
This will be a highly opinionated talk. There are some things in Django's test code I really don't like; I'll talk about why, and how I'd like to see them changed. As a natural part of this, I'll also be outlining some principles I try to follow for writing effective and maintainable tests, and note where Django makes it easy or hard.
This is an "extreme" talk, so I'll be assuming you've used Django and done some testing, and you're familiar with the basic concepts of each. This won't be an introductory "testing with Django" howto.
Datums! Coordinate systems! Map projections! Topologies! Spatial applications are a nebulous, daunting concept to most Pythonistas. This talk is a gentle introduction to the concepts, terminology and tools to demystify the world of the world.
This talk will have multiple parts:
New Python web developers seem to love running benchmarks on WSGI servers. Reality is that they often have no idea what they are doing or what to look at. This talk will look at a range of factors which can influence the performance of your Python web application. This includes the impact of using threads vs processes, number of processors, memory available, the GIL and slow HTTP clients.
A benchmark of a hello world application is often what developers use to make the all important decision of what web hosting infrastructure they use. Worse is that in many cases this is the only sort of performance testing or monitoring they will ever do. When it comes to their production applications they are usually flying blind and have no idea of how it is performing and what they need to do to tune their web application stack.
This talk will discuss different limiting factors or bottlenecks within your WSGI server stack and system that can affect the performance of your Python web application. It will illustrate the impacts of these by looking at typical configurations for the more popular WSGI hosting mechanisms of Apache/mod_wsgi, gunicorn and uWSGI, seeing how they perform under various types of traffic and request loads and then tweaking the configurations to see whether they perform better or worse.
Factors that will be discussed include:
Use of threads vs processes.
Number of processors available.
Python global interpreter lock (GIL)
Amount of memory available.
Slow HTTP browsers/clients.
Browser keep alive connections.
Need to handle static assets.
From this an attempt will be made to provide some general guidelines of what is a good configuration/architecture to use for different types of Python web applications. The importance of continuous production monitoring will also be covered to ensure that you know when the performance of your system is dropping off due to changing traffic patterns as well as code changes you have made in your actual web application.
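For reference, the kind of "hello world" WSGI application that such benchmarks typically exercise looks roughly like the sketch below. The names `hello_app` and `fake_start_response` are hypothetical; the application is called directly here, without Apache/mod_wsgi, gunicorn, or uWSGI, using a test environ from `wsgiref`.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A minimal WSGI application: fixed status, headers, and body."""
    body = b"Hello, world!\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


captured = {}


def fake_start_response(status, headers):
    """Stand-in for the server's callback; records what the app sent."""
    captured["status"] = status
    captured["headers"] = dict(headers)


environ = {}
setup_testing_defaults(environ)  # fills in a plausible request environ
chunks = hello_app(environ, fake_start_response)
```

An application this small does no I/O and allocates almost no memory, so a benchmark of it says little about the bottlenecks listed above.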
by Kurt Grandis
Has your garden been ravaged by the marauding squirrel hordes? Has your bird feeder been pillaged? Tired of shaking your fist at the neighbor children? Learn how to use Python to tap into computer vision libraries and build an automated sentry water cannon capable of soaking intruders.
Using the Python bindings for the computer vision library, OpenCV, we will investigate the components and steps needed to power a sentry gun. In addition to basic object and motion tracking, concepts of object recognition (friend or foe) will be discussed. Communication and control of the underlying hardware is performed using Python and will also be covered.
Additional peace-time applications of the above technology will be demonstrated.
Providing full-featured REST APIs is an increasingly popular request. Tastypie allows you to easily implement a customizable REST API for your Python or Django applications.
Who am I? (Primary author of Tastypie)
A touch of philosophy
Use HTTP the best we can
Flexible serialization (not everyone wants JSON)
What you can GET should be able to be POST/PUT
Should be reasonable by default but easy to extend
Works with Django
Any data source (Not just ORM)
Designed to be extensible
Supports a variety of serialization formats (JSON/XML/YAML/bplist)
URIs everywhere by default
Lots of hooks for customization
Demonstrate a simple setup
Then explore the API based on that trivial setup
Demonstrate adding authentication/authorization
Demonstrate adding custom serialization
Demonstrate adding a different data source
Demonstrate adding a custom endpoint
Twitter's new scalable, fault-tolerant, and simple(ish) stream programming system... with Python!
Storm is a high-volume, continuous, reliable stream processing system developed at BackType and recently open-sourced by Twitter. Though most of the system (and its documentation) is written in Java-based languages, it is possible to use in a Python environment with Python-based analysis code. At DotCloud (our application-platform-as-a-service) we're doing just that, and we'll be showing how you can too.
We collect a lot of data: we have tens of thousands of customers, many of whom have dozens of services running on our platform, each of which in turn produces dozens of metrics every second. All in all, we're dealing with millions of datapoints per minute. Storm will be the third iteration of our metrics system, an attempt at standardizing a number of previously-distinct pieces of our infrastructural software, to enable automated, real-time reactions to changes in the platform's state.
We'll start by touching on what problems Storm is (and isn't) trying to solve and why its model is so powerful, informed by our previous attempts to solve the stream processing problem. We'll then move on to a deep dive into how to get Storm up and running with the most Python and least Java-induced pain possible and finish up with tips to solve some of the challenges we've encountered while adopting Storm into our Python-based development process.
What is stream processing?
High volume, Continuous, Reliable data analysis
How do people solve this today?
Storm's overall model
Why is this solution better?
The Hard Part, Made Simple:
Build a topology
Code your processors
The Simple Part, Made Hard (made simple):
Clojure (less ugh)
by Paul Smith
Spatial data are often seen as opaque to most developers, and while dealing with them does require a shift in approach from the data types we most regularly handle, they needn’t be the domain of specialists. High-quality Python libraries and Python-based applications exist for operating on and transforming spatial data, and for creating visualizations, including maps for presentation on the web.
This talk will be an overview of the Python libraries and applications available for handling spatial and geospatial data and creating maps for the web. It will cover libraries for opening and transforming spatial data formats and representations, spatial operators and predicates for queries and relationships, spatial indexes for efficient queries, and compositing and rendering map tiles, as well as desktop applications extensible with Python that replace much of the functionality of "enterprise" GIS software.
Java is in some ways a bogeyman to the Python community -- the language that parents scare their children with, the Cobol of the 21st century. But if we look past the cesspool of JEE it turns out that Java has quietly become an excellent systems environment, one that is still in many ways ahead of its time.
by Erik Rose
If you've ever wanted to get started with parsers, here's your chance for a ground-floor introduction. A harebrained spare-time project gives birth to a whirlwind journey from basic algorithms to Python libraries and, at last, to a parser for one of the craziest syntaxes out there: the MediaWiki grammar that drives Wikipedia.
Some languages were designed to be parsed. The most obvious example is Lisp and its relatives which are practically parsed when they hit the page. However, many others—including most wiki grammars—grow organically and get turned into HTML by sedimentary strata of regular expressions, all backtracking and warring with one another, making it difficult to output other formats or make changes to the language.
We will explore the tools and techniques necessary to attack one of the hairiest lingual challenges out there: MediaWiki syntax. Join me for an introduction to the general classes of parsing algorithms, from the birth of the field to the state of the art. Learn how to pick the right one. Have a comparative look at a dozen different Python parsing toolkits. And finally, learn some optimization tricks to get a grammar going at a reasonable clip.
7th–15th March 2012