by Edgar Roman
Our challenge was to create a login system for young children who might barely be able to read, might have no email address, and perhaps have no home computer. We also had to watch out for privacy laws, which are especially tough where minors are concerned. But these kids want to play games, write stories, and create online avatars to share with and compete against their buddies. Listen to how we developed the PBS KIDS login and moderation system in Django.
by Shannon -jj Behrens and Mike Solomon
This talk covers scalability at YouTube. It's given by one of the original engineers at YouTube, Mike Solomon. It's a rare glimpse into the heart of YouTube, which is one of the largest websites in the world and one of the few extremely large websites to be written in Python.
Abstract
Every day, people watch an average of 3 billion videos on YouTube. Every minute, people upload an average of 48 hours of video to YouTube. YouTube operates at a scale that few other websites will ever see, and it's written mostly in Python.
Mike Solomon is one of the original engineers at YouTube. In this informal, high-level talk, he'll give an overview of the lessons he's learned as he's brought YouTube to scale. He'll also point out ways in which his philosophy on scaling, testing, and writing Python flies in the face of accepted wisdom. Last of all, we'll give a very short introduction to YouTube APIs and how you can integrate your application with YouTube.
by Matt Spitz
There is a plethora of options when it comes to deciding how to add a machine learning component to your Python application. In this talk, I'll discuss why Python as a language is well-suited to solving these particular problems, the tradeoffs of different machine learning solutions for Python applications, and some tricks you can use to get the most out of whatever package you decide to use.
This is the age of data. As more companies expose their datasets through APIs, it's becoming increasingly easy to pull information about users, places, and things. But having this data isn't always enough; we want to understand it, find correlations, and identify trends. Fortunately, the area of computer science known as machine learning has a variety of algorithms specifically designed to do this sort of data wrangling. For the Python application developer, there are many off-the-shelf toolkits that include implementations of these algorithms (Orange, NLTK, SHOGUN, PyML and scikit-learn to name just a few), but choosing which one to use can be daunting.
There are a number of tradeoffs one makes when making a selection, depending on the specifics of the implementation and the needs of the application. In this talk, I'll give an overview of some of the packages available and discuss what factors might go into deciding which one to use. I'll also offer some Python-specific tricks you can use to work with large amounts of data efficiently.
If your Python application has users, you should be worried about security. This talk will cover advanced material, highlighting common mistakes. Topics will include hashing and salts, timing attacks, serialization, and much more. Expect eye-opening demos, and an urge to go fix your code right away.
If your Python application has users (even if it's used offline), you should be worried about security. This talk will cover advanced material, highlighting common mistakes.
Hashing and encryption can be tricky to get right. We'll discuss when to use hashing to sign data, and how to choose the right encryption algorithm (spoiler: don't). We'll demonstrate length extension attacks, and discuss how to prevent them.
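The signing point above can be illustrated with a minimal sketch, which is not taken from the talk itself. It contrasts a naive "hash of key plus message" signature, the construction that length extension attacks target, with the standard library's hmac module. SECRET_KEY, naive_sign and sign are hypothetical names chosen for this example.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real key would be random and kept secret.
SECRET_KEY = b"example-secret-key"


def naive_sign(message: bytes) -> str:
    """Sign by hashing key + message.

    With Merkle-Damgard hashes such as SHA-256, this construction lets an
    attacker who knows the signature of a message compute a valid signature
    for that message plus extra data, without knowing the key.
    """
    return hashlib.sha256(SECRET_KEY + message).hexdigest()


def sign(message: bytes) -> str:
    """Sign with HMAC, which is designed to resist length extension."""
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
```

Both functions return a 64-character hex digest, so switching from the naive form to HMAC usually needs no change to how signatures are stored.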
Another common mistake is the incorrect use of pseudo-random number generators. We'll discuss the fix, and some of the dangers associated with it.
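As a rough illustration of the mistake, and of one commonly recommended fix (not necessarily the one presented in the talk), the sketch below compares a token drawn from the random module with one drawn from the operating system's entropy source through the secrets module. It assumes Python 3.6 or later; weak_token and strong_token are hypothetical names.

```python
import random
import secrets


def weak_token(seed: int) -> str:
    """Token from the Mersenne Twister PRNG.

    The output is fully determined by the seed, so anyone who learns or
    guesses the generator state can predict every token.
    """
    rng = random.Random(seed)
    return "%032x" % rng.getrandbits(128)


def strong_token() -> str:
    """Token from the OS-provided cryptographic random source."""
    return secrets.token_hex(16)  # 16 random bytes, rendered as 32 hex characters
```

The same seed always yields the same weak token, which is exactly the property an attacker exploits.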
Timing attacks are relatively exotic, but as applications move into shared data centers (and shared virtual machines) they have become easier to implement and more dangerous. They're a very common class of bugs, but fixing them (and proving they're fixed) can be difficult.
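To make the idea concrete, here is a small sketch under the assumption that the secret being checked is a byte string such as a signature. The first comparison returns as soon as it finds a mismatch, so its running time depends on how many leading bytes are correct; the second uses hmac.compare_digest from the standard library (Python 3.3 and later). The function names are hypothetical.

```python
import hmac


def leaky_equals(expected: bytes, supplied: bytes) -> bool:
    """Early-exit comparison: running time depends on the first mismatch position."""
    if len(expected) != len(supplied):
        return False
    for x, y in zip(expected, supplied):
        if x != y:
            return False  # returning here is what leaks timing information
    return True


def constant_time_equals(expected: bytes, supplied: bytes) -> bool:
    """Comparison whose running time does not depend on where the inputs differ."""
    return hmac.compare_digest(expected, supplied)
```

Both functions give the same answers; only their timing behavior differs, which is why this kind of bug is hard to spot in ordinary tests.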
Pickle is a common and easy to use serialization format for Python objects. Unfortunately, it's also insecure when attackers can send or modify the pickled data. We'll discuss strategies for signing pickled objects, and alternate serialization formats.
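One possible signing strategy, sketched here under assumptions and not taken from the talk, is to prepend an HMAC to the pickled bytes and to refuse to unpickle anything whose signature does not verify. SECRET_KEY, dumps_signed and loads_signed are hypothetical names.

```python
import hashlib
import hmac
import pickle

# Hypothetical key for illustration only.
SECRET_KEY = b"example-secret-key"
SIGNATURE_SIZE = hashlib.sha256().digest_size  # 32 bytes


def dumps_signed(obj) -> bytes:
    """Pickle obj and prefix the payload with its HMAC-SHA256 signature."""
    payload = pickle.dumps(obj)
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return signature + payload


def loads_signed(blob: bytes):
    """Verify the signature first; unpickle only if it matches."""
    signature, payload = blob[:SIGNATURE_SIZE], blob[SIGNATURE_SIZE:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)
```

Signing only helps while the key stays secret. For data that crosses a trust boundary, a format that cannot execute code on load, such as JSON, is often the simpler choice.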
The final portion of the talk will discuss a meta security problem within the Python community. I'll be demonstrating live code that can compromise even the most locked down of servers, and discussing the steps we need to take as a community to mitigate this threat moving forward.
by Adam Lowry
Understanding the internal state of a running system can be vital to keeping it stable and performing well, but conventional approaches such as logging and error handling only expose so much. This talk will touch on how to instrument Python programs in order to observe the state of the system, measure performance, and identify ongoing problems.
Something is wrong with your web application. The time it’s taking to serve requests is growing. Your logs don’t contain enough. Your database appears bored. How do you know what’s going wrong?
In high-performance production servers it’s vital to know as much about the internals of your system as possible. Traditionally this is done by simple methods like logging anything of potential interest or sending error emails with unexpected exceptions. These methods are insufficient, both due to the level of noise inherent in such systems and because of the difficulty in anticipating what metrics are important during an incident.
Environments such as the JVM and the .NET VM have advanced tools for communicating with the VM and for applications to expose internal state, but CPython has lacked similar tooling.
This talk will cover what options CPython application developers have for introspecting their programs; new tools for instrumenting, exposing, and compiling performance and behavior metrics; and techniques for diagnosing runtime issues without restarting the process.
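As a rough sketch of what in-process instrumentation can look like, and not one of the tools presented in the talk, the decorator below records call counts, error counts and cumulative time per function, and exposes them through a snapshot that could be queried while the process keeps running. All names here (instrumented, snapshot, handle_request) are hypothetical.

```python
import functools
import time
from collections import defaultdict

# Per-function counters, keyed by function name.
_metrics = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def instrumented(func):
    """Record how often func is called, how often it raises, and how long it takes."""
    name = func.__name__

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            _metrics[name]["errors"] += 1
            raise
        finally:
            entry = _metrics[name]
            entry["calls"] += 1
            entry["total_seconds"] += time.perf_counter() - start

    return wrapper


def snapshot():
    """Return a copy of the current metrics without resetting them."""
    return {name: dict(values) for name, values in _metrics.items()}


@instrumented
def handle_request(n):
    """Stand-in for real request handling work."""
    return sum(range(n))
```

A snapshot like this could be served from an admin endpoint or written out periodically, which avoids the noise of logging every request.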
by David Mertz
This talk traces lightweight concurrency from Python 2.2's generators, which enabled semi-coroutines as a mechanism for scheduling "weightless" threads; to PEP 342, which created true coroutines, and hence made event-driven programming easier; to third-party libraries built around coroutines, from older GTasklet and peak.events to the current Greenlet/gevent and Twisted Reactor.
This talk aims to provide both a practical guide and theoretical underpinnings to the use of generator-based lightweight concurrency in Python.
Lightning tour of generator constructs. Why generator-based scheduling is particularly useful for event-based programming.
Simple example of a "trampoline" or scheduler.
Slightly fleshed out example of scheduler with discussion of data-passing issues.
Examples using GTasklet to make coroutine code look more like familiar sequential code (the framework is based on greenlets rather than generators, but accomplishes a similar purpose).
Brief examples of Twisted Reactors and Deferreds.
Limits of generator-based concurrency (i.e. doesn't help with multiple cores and multiple servers). "Throw at the wall" list of ways to generalize to larger scales than single cores.
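The "simple example of a trampoline or scheduler" item in the outline above might look something like the following minimal sketch. It is a plain round-robin scheduler over generators, written for this listing and not taken from the talk; scheduler, worker and run_demo are hypothetical names.

```python
from collections import deque


def scheduler(tasks):
    """Run generator-based tasks round-robin until all of them finish.

    Each task gives up control by yielding; the scheduler resumes the next
    task in the queue and re-queues any task that is still alive.
    """
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)  # run the task up to its next yield
        except StopIteration:
            continue  # task finished, do not re-queue it
        queue.append(task)


def worker(name, steps, log):
    """A "weightless thread": does one unit of work, then yields control."""
    for i in range(steps):
        log.append((name, i))
        yield


def run_demo():
    """Interleave two workers and return the order in which they ran."""
    log = []
    scheduler([worker("a", 2, log), worker("b", 3, log)])
    return log
```

Calling run_demo() returns [("a", 0), ("b", 0), ("a", 1), ("b", 1), ("b", 2)], which shows the two workers interleaving on a single thread.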
by Andrea O. K. Wright
I won't just demonstrate how to use projects that bridge programming languages; I'll walk through the lower-level code that allows inter-language communication to happen.
This talk is written to engage developers interested in exploring the different forms that polyglot programs can take. The examples all demonstrate techniques for integrating Python with Scala, but the concepts are applicable beyond the specific use case of Python/Scala interop, and the talk does not assume prior knowledge of Scala.
Topics include:
Brief intro to Scala (enough to be able to follow the examples)
Basic script hosting APIs
Python/Scala integration via Jython
JEPP (CPython/JVM bridge)
Leveraging Python’s support for metaprogramming to make foreign function calls virtually indistinguishable from host language function calls
Passing a Python function to a Scala function that takes a Scala function as an argument
Interface Definition Language (IDL) and IDL-based strategies
Using TCP for inter-language communication
Attendees should be familiar with Python's metaprogramming capabilities because I won’t be providing background information about adding or modifying attributes dynamically or interrogating a Python object for attribute details.
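As a taste of the metaprogramming involved, here is a minimal sketch, not taken from the talk, of a proxy that uses __getattr__ to turn ordinary-looking method calls into calls forwarded to another runtime. ForeignProxy is a hypothetical name, and fake_transport is a stand-in for a real bridge such as a Jython, JEPP or TCP link.

```python
class ForeignProxy:
    """Forward any unknown attribute access as a call to a foreign function."""

    def __init__(self, transport):
        # transport is any callable taking (function_name, args).
        self._transport = transport

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so _transport
        # itself is found in the instance dictionary and never lands here.
        def call(*args):
            return self._transport(name, args)

        call.__name__ = name
        return call


def fake_transport(name, args):
    """Stand-in for a real inter-language bridge (assumption for this example)."""
    table = {
        "add": lambda a, b: a + b,
        "upper": lambda s: s.upper(),
    }
    return table[name](*args)
```

With this in place, ForeignProxy(fake_transport).add(2, 3) reads like a local method call even though the work is done on the other side of the transport.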
Java is in some ways a bogeyman to the Python community -- the language that parents scare their children with, the Cobol of the 21st century. But if we look past the cesspool of JEE it turns out that Java has quietly become an excellent systems environment, one that is still in many ways ahead of its time.
Localization of Python apps used to be hard, but not any more. This talk will offer a short intro on software localization in Python and discuss today's best practices. It will present Transifex, a modern, Django-based SaaS, also referred to as 'The GitHub of translations', used by 2,000 open-source projects including Django, Mercurial, Fedora and Firefox.
This talk targets software developers of Python apps published to an international audience, such as developers of web and desktop apps, games, and frameworks such as Django, presenting and demo-ing a painless way to get their apps localized.
We will briefly introduce software localization (L10n): what it is, why it matters and how it's being done with gettext and libraries like Babel.
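For readers new to gettext, a minimal sketch of the usual pattern with the standard library's gettext module follows. The "myapp" domain, the "locale" directory and the greeting function are assumptions made for this example; with fallback=True the original strings are returned when no compiled catalog is found.

```python
import gettext

# Hypothetical domain and catalog directory. With fallback=True, a missing
# catalog yields a NullTranslations object that returns messages unchanged.
translation = gettext.translation(
    "myapp", localedir="locale", languages=["de"], fallback=True
)
_ = translation.gettext


def greeting(name):
    """Return a translatable greeting; the marked string is what translators see."""
    return _("Hello, %(name)s!") % {"name": name}
```

Using named placeholders such as %(name)s lets translators reorder the pieces of a sentence, which positional placeholders make awkward.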
We'll then present Transifex, a Django-based open-source social localization tool, which developers use to integrate localization in their workflow and reach out to an established community of translators.
by Brian Curtin
With nearly 1.5 million downloads per month, the CPython installers for Windows account for a huge amount of the traffic through python.org, and they're the most common way for Windows users to obtain Python. Take a look at what's going on with Python on Windows and see what the road ahead looks like for Python 3.3.
Abstract
It's often said that we've passed the point where we're surprised about where Python is being used. From satellites out in space to fighter jets much closer to earth, Python is everywhere, so it's no surprise it appears on Windows desktops. However, did you know CPython's Windows installers are downloaded almost 1.5 million times every month? Let's take a look at what's going into nearly 18 million downloads per year, especially the upcoming CPython 3.3.
The Download Numbers
A look into the python.org download numbers shows some interesting trends (based on an in-progress sample), including the doubling of 3.x downloads with the release of 3.2. Let's take a look at what the release calendar means for download rates, and what the future looks like for 3.3.
Installer Changes
For years, users have been asking for Python's addition to the system path and countless guides have been written to help users figure out how to do that. The rise of freely available Python education materials has steadily increased the number of first-timers around the community, many of whom see immediate failure by typing "python" at a command prompt only to get an error message. Python should be as helpful as possible and provide sensible options at install time, and with Python 3.3, we're bringing you the ability to add Python to the system path, complete with a quick demo and explanation of the options.
New and Recent Features
As nearly all of the CPython developers are on Linux-based systems, features tend to show up there first while Windows plays catch-up. Python 3.2 added Windows implementations of os.symlink for Windows Vista and beyond, os.kill using control handlers, and several others, and 3.3 will try to fill in more gaps.
PEP 3151 Changes to WindowsError
If you've written cross-platform code that needs to handle WindowsError, you've probably done a few dances to properly handle it. PEP 3151 reworks the OS and IO exception hierarchy and makes some changes to how WindowsError works, so we'll look into what it means for your code.
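A small before-and-after sketch may help show the kind of change PEP 3151 brings. It assumes Python 3.3 or later, where FileNotFoundError is a subclass of OSError and WindowsError is an alias of OSError; the two function names are hypothetical.

```python
import errno
import os


def remove_if_exists_old(path):
    """Pre-PEP 3151 style: catch the broad exception and inspect errno."""
    try:
        os.remove(path)
    except OSError as exc:
        if exc.errno != errno.ENOENT:
            raise  # some other failure, such as a permission problem
        return False
    return True


def remove_if_exists_new(path):
    """PEP 3151 style: catch the specific subclass on every platform."""
    try:
        os.remove(path)
    except FileNotFoundError:
        return False
    return True
```

Both versions behave the same for a missing file; the second avoids the errno check and any platform-specific exception name.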
PEP 397 Launcher
Bringing Linux-like shebang functionality to a Windows computer near you. The ability to launch the proper 2 or 3 interpreter based on a hint in your code is just another way to ease startup issues for users, so we'll take a look at what's going on there.
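The "hint in your code" is a shebang line at the top of the script. The sketch below shows one form the PEP 397 launcher is specified to recognize; the function is a hypothetical helper that simply reports which interpreter ended up running the file.

```python
#! python3
# The first line is the hint: the PEP 397 launcher reads it and starts a
# Python 3 interpreter. Unix-style forms such as "#!/usr/bin/env python3"
# are described by the PEP as well.
import sys


def describe_interpreter():
    """Report the major.minor version of the interpreter running this script."""
    return "Python %d.%d" % sys.version_info[:2]
```

On systems without the launcher the line is an ordinary comment, so the same file stays usable everywhere.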
Alternative Implementations on Windows (quick mention, likely IronPython focused)
IronPython is caught up on the 2.7 line and working towards a 3.x release.
This talk distills some interesting material from the What's New documents for 2.7, 3.2 and the upcoming 3.3 release. Look out for new arguments to your favorite methods and functions that add the wow factor to your code. Heard of @lru_cache?
Abstract
Lots of interesting stuff has gone into the Python standard library in the 2.7, 3.1, 3.2 and 3.3 releases. Some of these features really make programmers' lives easier and can bring a 'wow' factor to their code. They can also help external library developers take a fresh look at their libraries and use the new facilities available in standard library modules.
This talk distills material from the What's New documents for 2.7, 3.2 and 3.3 and presents some of the choicest new features of the Python standard library. Since a lot has gone in since 2.7, the focus will be on features that were well discussed on the tracker or on python-dev and were, in general, the most sought after.
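For those who have not met @lru_cache, which was added to functools in Python 3.2, here is a minimal sketch of what it does. The Fibonacci function is just a convenient illustration chosen for this listing, not an example from the talk.

```python
import functools


@functools.lru_cache(maxsize=None)
def fib(n):
    """Naive recursive Fibonacci; the cache makes repeated subcalls free."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

Without the decorator, fib(30) makes over a million recursive calls; with it, each value is computed once, and fib.cache_info() reports the hits and misses.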