Django's template language is designed to strike a balance between power and ease of use; learn how to use this balance to create awesome-looking websites. This talk will cover the basics and best practices of Django templating, from custom tag and filter creation, to the finer points of template rendering and loading, and even to replacing the default templating engine itself.
Harness the power of Django templates to help present your data with ease! Learn about:
Basic block formations, common patterns, and using includes wisely.
Tips and tricks in using the built-in template tags and filters.
How to make custom tags and filters: examples, what you should and shouldn’t do, and tools to help the process such as django-classy-tags.
Different ways to load and render templates.
Replacing Django’s default template language: pros and cons
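The custom-filter half of that list is small enough to sketch here. The filter name (`initials`) and its behavior are invented for illustration; in a real project the function would live in an app's `templatetags/` package and be registered with `@register.filter`:

```python
# A minimal sketch of a custom Django template filter (the name
# "initials" and its behavior are hypothetical). In a real app this
# would live in yourapp/templatetags/yourtags.py, decorated with
# @register.filter after `register = template.Library()`; the filter
# body itself is plain Python, shown here standalone.

def initials(value, sep="."):
    """Reduce a name to its initials: "Ada Lovelace" -> "A.L." """
    parts = [word[0].upper() for word in str(value).split() if word]
    return sep.join(parts) + sep if parts else ""

# In a template, after {% load yourtags %}:  {{ author.name|initials }}
```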
by Peter Kropf
Arduino is an open-source electronics prototyping platform based on flexible, easy-to-use hardware and software. Python is our favorite programming language, one that lets you integrate systems effectively. Learn how to use Python to communicate with an Arduino and interact with sensors, solenoids and motors.
This talk will introduce the Arduino microcontroller and show how to interact with it using Python. With a serial-line command protocol, Python code can easily drive digital I/O pins to light LEDs, change the pulse width modulation (PWM) to alter brightness, or move a stepper motor. Examples will be shown of a small robot whose pair of two-axis gimbals serve as eyes, and of controlling fire-effect sequencing.
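As a rough sketch of the serial-line protocol idea (the command names and framing below are hypothetical; the real wire format depends on the Arduino sketch on the other end), the Python side can be as simple as functions that build command bytes:

```python
# Hypothetical serial-line commands for an Arduino, in the style the
# talk describes. The "DW"/"PWM" names and framing are invented; the
# Arduino sketch on the other end would parse them. Actual transmission
# would use pyserial, e.g.:
#   import serial
#   port = serial.Serial("/dev/ttyUSB0", 9600)
#   port.write(digital_write(13, True))   # light the onboard LED

def digital_write(pin, high):
    """Command to set a digital I/O pin high or low (e.g. an LED)."""
    return ("DW %d %d\n" % (pin, 1 if high else 0)).encode("ascii")

def set_pwm(pin, duty):
    """Command to set a PWM duty cycle 0-255 (e.g. LED brightness)."""
    if not 0 <= duty <= 255:
        raise ValueError("duty must be 0-255")
    return ("PWM %d %d\n" % (pin, duty)).encode("ascii")
```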
by Ken Elkabany
The recent cloud buzz has hugely benefited Python web devs. But, for Python's formidable scientific community, the cloud has been less ambitious--until now. PiCloud is a Python-based cloud platform that tackles a noble cause: giving every scientist in the world instant access to a supercomputer. The talk will cover how Python inspired the design of PiCloud, which has now processed over 100M jobs.
by Taavi Burns
`time`, `datetime`, and `calendar` from the standard library are a bit messy. Find out what to use where and how (particularly when you have users in many timezones), and what extra modules you might want to look into.
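As a taste, here is the core discipline for multi-timezone users, using only the standard library (fixed offsets stand in for a real tz database here; production code should prefer `zoneinfo` or `pytz`, which handle DST correctly):

```python
from datetime import datetime, timedelta, timezone

# The core rule for multi-timezone users: store and compute in UTC,
# convert to a local zone only for display. Fixed offsets are used so
# this example needs no tz database installed.

utc_now = datetime(2012, 3, 9, 17, 30, tzinfo=timezone.utc)  # stored value
eastern = timezone(timedelta(hours=-5), "EST")               # display zone

local = utc_now.astimezone(eastern)
assert local.hour == 12          # 17:30 UTC is 12:30 at UTC-5
assert local == utc_now          # aware datetimes compare by instant

# Naive datetimes (no tzinfo) cannot be compared with aware ones:
try:
    datetime(2012, 3, 9, 17, 30) < utc_now
except TypeError:
    pass                         # mixing naive and aware raises
```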
One of the goals of PyPy is to make existing Python code faster; an even broader goal is to make it possible to write things in Python that previously would have needed to be written in C or another low-level language. This talk will show examples of this, describe how they represent the tremendous progress PyPy has made, and explain what it means for people looking to use PyPy.
In this talk we'll cover PyPy's present status and the kinds of applications for which it might be useful. We'll also show examples of things that are possible with PyPy that were impossible with Python before, like real-time video processing done in pure Python. Our objective with each of the examples will be to highlight the type of work that is sped up and why this represents both a boon for existing Python programmers and an opportunity for Python to expand to new audiences.
We'll also briefly explain PyPy's accomplishments over the past year that make all of this possible, its current status, its goals, and the near future.
Django Form processing often takes a back seat to flashier, more visible parts of the framework. But Django forms, fully leveraged, can help developers be more productive and write more cohesive code. This talk will dive deep into the stock Django forms package, as well as discuss a strategy for abstracting validation for forms, and the use of unit and integration tests with forms.
Django form processing often takes a back seat to flashier, more visible parts of the framework. But Django forms are an integral part of the framework that can help developers be more productive and write more cohesive, well-tested code. This talk will dive deep into the stock Django forms package, providing examples of:
We'll also discuss ways to build on Django forms, including:
* writing unit and integration tests for forms, and how writing tests can help you understand code cohesion
* abstracting validation for forms to provide tiered validation (for example, one set of criteria to save, additional criteria to publish)
* approaches to working with multiple, heterogeneous forms simultaneously
How do you take the big step from casual SQLAlchemy user, who treats your database as a mysterious object store, to advanced power user, who optimizes critical queries, plans indexing and migrations, and generates efficient reports? This talk will teach you how databases think; why humanity invented the Relational Algebra; and how SQLAlchemy grants you access to relational power.
While drawing enlightening comparisons between SQLAlchemy, the Django ORM, and several NoSQL databases, this talk will focus most of its energy on understanding relational databases — a technology foundation that has been steadily improved since the early 1970s. The talk will first tackle the big questions that all databases have to answer, then teach specific SQLAlchemy techniques for taking advantage of relational queries.
Records and Indexes: whether a database is relational, key-value, document-based, or hierarchical, it must both store some kind of record, and also allow indexes to be built across its collection of records. After reviewing why hardware speeds make indexes a necessity, we will consider their structure, performance, cost (especially for writing), and the trade-offs between building them directly from data versus through functions or map-reduce mechanisms.
The Relational Algebra and Query Optimization: we will learn — using concrete, well-illustrated examples — how relational databases use a combination of powerful indexes and intelligent query planning to support “normalized” data storage. This will be briefly contrasted with the different normalization approaches encouraged by modern document databases.
Building Queries: with brief glances at pure SQL syntax for the very curious, we will learn how SQLAlchemy lets you build SQL queries as a series of Python method calls. We will see how queries can return either raw rows or ORM class instances, and how the use of joins reduces the number of round-trips to the database.
Advanced ORM: finally, we will use our knowledge of database structure and queries to see how high-level ORM operations can benefit from eager loading, query-specific indexes, query logging, and the EXPLAIN operator to learn how your database is — or is not — optimizing your operations. To close, we will review the transactional nature of relational databases and learn about the SQLAlchemy object cache (pointing out the big difference between it and the Django ORM) and the SQLAlchemy unit-of-work construct, and how these can vastly confuse you if you are not prepared for their behavior.
The Python community is abuzz about the major speed gains PyPy can offer pure Python code. But how does the PyPy JIT actually work? This talk will discuss how the PyPy JIT is implemented, including descriptions of the tracing, optimization, and assembly generation phases. I will demonstrate each step with an example loop.
Strictly speaking, PyPy doesn't have a JIT; it has a JIT generator. I will describe how the JIT is generated during the translation of the interpreter as well as the advantages of the meta-JIT method.
Next, we'll explore tracing, in which the JIT records operations in a hot loop. I'll introduce the meta interpreter and the JIT IR (intermediate representation).
Optimization is the next step of JITing. The focus will be on the most important optimizations for dynamic languages: virtuals and virtualizables. However, PyPy also includes the standard set of compiler peephole optimizations, like strength reduction, as well as some more complicated loop optimizations, like constant hoisting.
The final topic will be assembly generation. We'll see register allocation and how specific high-level Python operations are compiled down to tight x86 instructions.
Time permitting, additional topics may include how the garbage collector is integrated with the JIT and how the JIT bails back to the interpreter when a guard fails.
Along the way, I will demonstrate how each phase of JITing acts on an example Python loop. This will also allow me to introduce the jitviewer, a program to view how PyPy is compiling your loops.
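For reference, the example loop could be as plain as the pure-Python function below (this particular loop is my illustration, not necessarily the one from the talk). Under CPython every iteration re-dispatches bytecode; PyPy's tracer records one iteration of the hot body, optimizes it, and emits machine code for the rest:

```python
# The kind of loop a tracing JIT specializes: a simple hot loop whose
# body does the same typed operations on every iteration.

def total(n):
    s = 0
    i = 0
    while i < n:
        s += i * i      # the traced "hot" body: int multiply and add
        i += 1
    return s

assert total(10) == 285   # 0 + 1 + 4 + ... + 81
```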
by Hugo Boyer
Digital fabrication is the art of translating digital designs into physical objects. This talk is a walkthrough from 3D models to machine motion, showing how we can use Python to write G-code generators that create endless form. Using personal machines controlled via software, live demonstrations of CNC milling and 3D printing will also be performed.
This talk is aimed at beginner to intermediate Python developers because it covers basic uses of certain parts of the standard library. It should also be of interest to people who are curious about digital fabrication and geometry.
First, two personal fabrication machines are presented: a Sherline milling machine and the latest Makerbot dual-extruder 3D printer. A very quick overview of how they work is given, showing how electric pulses are translated into XYZ movements of the toolheads, as well as the difference between extruding hot plastic onto a surface and milling material away.
Then the rationale for machine-controller software is explained (being able to control the machine in terms of distances and feed rates rather than pulses), and an overview of G-code is provided. The first Python scripts demonstrated generate G-code for milling operations. They are based on the code behind http://machinetouch.appspot.com
http://machinetouch.appspot.com/... is presented with 2 or 3 slides of code, and a Python logo is then engraved (live demo, 5 min) while an overview of the code is given:
* how primitives map to Python methods and G-code
* how easy it was to move from a console app to a GUI and to a web app using only standard library features
* how this parametric approach to design allows the user to change certain aspects (materials, forms) and regenerate the tool path automatically (as opposed to a static drawing or the usual CAM-based approach)
The second part of the talk deals with 3D printing. The 3D printer is demoed first (loading a 3D model file, creating slices with the desired density, and launching the tool), because the print will last about 10 minutes. As the machine runs, the following Python code will be shown:
* extracting triangles from an STL file using generators
* slicing: finding intersection contours at a given height
* pathing: generating patterns for the extruder
* hard-core parallelization of certain tasks using PyOpenCL (TBD, depending on the code base available)
The presentation ends by assembling the new object made entirely from a digital representation (except for the LED circuit).
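The first of those steps, for the ASCII STL variant, can be sketched in a few lines (binary STL would use the `struct` module instead; this is an illustrative sketch, not the talk's actual code):

```python
# Extracting triangles from an ASCII STL file with a generator: yield
# one tuple of three (x, y, z) vertices per facet, lazily, so a large
# model never has to be fully loaded into memory.

def triangles(lines):
    verts = []
    for line in lines:
        parts = line.split()
        if parts[:1] == ["vertex"]:
            verts.append(tuple(float(v) for v in parts[1:4]))
            if len(verts) == 3:        # an STL facet has exactly 3 vertices
                yield tuple(verts)
                verts = []

stl = """solid demo
 facet normal 0 0 1
  outer loop
   vertex 0 0 0
   vertex 1 0 0
   vertex 0 1 0
  endloop
 endfacet
endsolid demo""".splitlines()

tris = list(triangles(stl))
assert tris == [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))]
```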
The code will be based on Skeinforge or Makerbot’s new slicer (in development in Python 3 right now).
Makerbot operators are growing in numbers (more than 4000 units of open hardware sold) and all use Python to slice their 3D models and print them.
by Jonathan Rocher
Analyzing, storing and visualizing time series efficiently are recurring yet difficult tasks in many areas of scientific data analysis, such as meteorological forecasting, financial modeling, ... In this talk we will explore the current Python ecosystem for doing this effectively, comparing options and using only open source packages that are mature yet still under active development.
SQLAlchemy is the object relational mapper and database toolkit for Python, first introduced in 2005. In this talk I'll describe why SQLAlchemy has always been called a "toolkit", detailing the software construction mindset for which SQLAlchemy was designed to be used with - what I am currently referring to as the "Hand Coded" approach.
by Jim Baker
As a dynamic language, Python is difficult to optimize. In addition, these dynamic features make using Python code from Java currently too complex. However, Java 7 adds the invokedynamic bytecode and corresponding library support, making it possible to finally address these problems in Jython. This talk will describe work in progress to make Jython faster and better (improving Java integration).
Jython demonstrates that it is quite possible to fit Python’s dynamic features onto the Java Virtual Machine (JVM) to provide seamless integration, while also taking advantage of the breadth of the Java platform. My favorite aspect of this blending is certainly the amazing java.util.concurrent package.
However, from a performance perspective, it’s an awkward fit. In certain cases, the JVM is able to aggressively inline Python codepaths through its JIT. But generally it cannot for a variety of technical reasons. In addition, while Jython can conveniently call Java code, and support callbacks, it’s not at all convenient right now to go the opposite way. This mismatch is perhaps best seen now in Jython’s lack of support for Java annotations.
Consider how you, as a human translator, might attempt to optimize Python code for the JVM, or to fix the Java integration issues. Your plan of attack is simple: translate idiomatic Python to similarly idiomatic (and highly JIT-able) Java. Python developers signal their intent through a variety of mechanisms. They use builtin names like True/False or range/xrange, which, through common convention, no one would seriously expect to see changed. They rarely monkey patch namespaces (action at a distance), although import time can get quite interesting. Importing packages from the java.* namespace means something when integrating from Jython. The challenge would be in supporting the dynamic functionality, however. Rewriting truly dynamic code into statically typed code is just not the right way. It is non-trivial, error prone, and certainly not fun.
Enter the invokedynamic bytecode and the java.lang.invoke package, introduced with Java 7. Together they enable a wide range of optimizations while allowing the correctness of Python code, with its full range of dynamic features, no matter how crazy, to be maintained. There are some obvious wins, such as ensuring that at a callsite (the point in the code where a call to a given function is invoked), a MethodHandle to the function in that namespace is linked, with all parameters properly permuted so that it’s a straight call through the Java calling convention. If there’s a namespace change, simply relink.
But there are also opportunities to use static analysis. For example, iterating over an xrange looks like a Java for loop, and can be optimistically compiled as such. If the builtin is rebound, the controlling SwitchPoint is invalidated and a continuation is set up to re-execute in an interpreter using unoptimized code (actually running Python bytecode). Other static analysis opportunities include being able to control the construction of frames for functions, use of decorators and function annotations to describe gradual typing (especially useful for Java integration), and so forth. This talk will cover a variety of these translations, demonstrate how we support both the fast and slow paths, and also describe some of the current performance benchmarks. I will also describe the pitfalls: obvious optimizations frequently result in bad performance due to the number of moving parts.
In addition, this talk will cover the state of Jython 2.6+. In particular, I will describe forthcoming changes to the Jython API (for embedding into Java). These include limited backwards-breaking changes in ThreadState and PySystemState to support increased performance, clean up APIs, and remove issues in the garbage collection of ClassLoader objects.
Python has great Unicode support, but it's still your responsibility to handle it properly. I'll do a quick overview of what Unicode is, but only enough to get your program working properly. I'll describe strategies to make your code work, and keep it working, without getting too far afield in Unicode la-la-land.
Python has great Unicode support, but it's still your responsibility to handle it properly. Even expert programmers get tripped up with the encodings and decodings that can happen implicitly, throwing errors in unexpected places.
This talk will present a quick overview of what Unicode is, why it exists, and how it works, but only enough to get your program working properly. Unicode can be intricate and fascinating, but really, who cares? You just want your code to work without throwing a UnicodeEncodeError every time an accented character sneaks in somehow.
I'll describe strategies to make your code work, and keep it working, without getting too far afield in Unicode la-la-land.
How Unicode is handled is one of the biggest changes in Python 3. I'll touch on what those changes are, and how you can use them to keep even your Python 2 code running smoothly.
Bytes vs. text
ASCII, 8859-1, etc.
Python 2: str vs unicode
encode and decode
Python 3: bytes vs str
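The whole bytes-vs-text story compresses into a few lines: decode at the input boundary, work in text, encode on the way out (the example values below are illustrative):

```python
# Decode bytes at the boundary, work in text, encode on the way out.

raw = b"caf\xc3\xa9"             # UTF-8 bytes, e.g. read from a file or socket
text = raw.decode("utf-8")       # -> 'café' (Python 3 str / Python 2 unicode)
assert len(text) == 4            # four characters, even though five bytes

back = text.encode("utf-8")
assert back == raw

# The same bytes decoded with the wrong codec silently give mojibake:
assert raw.decode("latin-1") == "caf\xc3\xa9"   # 'café' becomes 'cafÃ©'
```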
by Thomas Smith
Project Gado is an initiative which aims to create an open-source archival scanning robot which small archives can purchase for $500 and use to autonomously scan their photographic collections. This talk presents the Gado 2, a prototype scanning robot built around Python and Arduino, and shares lessons learned from using Python as the primary language in a large-scale archival scanning project.
The archives of the Afro American Newspaper in Baltimore, MD contain over 1.5 million historical photos spanning 115 years of the city’s African American history. One of the largest Black history collections in the world, the Afro’s archives include thousands of photos which have never been seen by the public.
Why? Of the paper’s 1.5 million photos, only around 10,000 exist in a digital form; the Afro, like many small archives, simply does not have the human resources to manually digitize its collections. As a result, photos with incredible value for scholars, educators and community members alike are available only to the select few with the access, specialized skills, and time to travel to the physical archive and locate them.
Project Gado was founded in 2010 to address these challenges. The project seeks to create an open source archival scanning robot which small organizations like the Afro can use to autonomously digitize their photographic holdings. The Gado 1, a proof-of-concept machine built using Python and Arduino, has successfully scanned over 1,000 photos to date.
by Carl Meyer
A deep dive into writing tests with Django, covering Django's custom test-suite-runner and the testing utilities in Django, what all they actually do, how you should and shouldn't use them (and some you shouldn't use at all!). Also, guidelines for writing good tests (with or without Django), and my least favorite things about testing in Django (and how I'd like to fix them).
Django has a fair bit of custom test code: a custom TestSuiteRunner, custom TestCase subclasses, some test-only monkeypatches to core Django code, and a raft of testing utilities. I'll cover as much of that code as I find interesting and non-trivial, taking a close look at what it's actually doing and what that means for your tests.
This will be a highly opinionated talk. There are some things in Django's test code I really don't like; I'll talk about why, and how I'd like to see them changed. As a natural part of this, I'll also be outlining some principles I try to follow for writing effective and maintainable tests, and note where Django makes it easy or hard.
This is an "extreme" talk, so I'll be assuming you've used Django and done some testing, and you're familiar with the basic concepts of each. This won't be an introductory "testing with Django" howto.
Datums! Coordinate systems! Map projections! Topologies! Spatial applications are a nebulous, daunting concept to most Pythonistas. This talk is a gentle introduction into the concepts, terminology and tools to demystify the world of the world.
This talk will have multiple parts:
by David Mertz
This talk traces lightweight concurrency from Python 2.2's generators, which enabled semi-coroutines as a mechanism for scheduling "weightless" threads; to PEP 342, which created true coroutines, and hence made event-driven programming easier; to 3rd party libraries built around coroutines, from older GTasklet and peak.events to the current Greenlet/gevent and Twisted Reactor.
This talk aims to provide both a practical guide and theoretical underpinnings to the use of generator-based lightweight concurrency in Python.
Lightning tour of generator constructs. Why generator-based scheduling is particularly useful for event-based programming.
Simple example of a "trampoline" or scheduler.
Slightly fleshed out example of scheduler with discussion of data-passing issues.
Examples using GTasklet to make coroutine code look more like familiar sequential code (the framework is based on greenlets rather than generators, but accomplishes similar purpose).
Brief examples of Twisted Reactors and Deferreds.
Limits of generator-based concurrency (i.e. doesn't help with multiple cores and multiple servers). "Throw at the wall" list of ways to generalize to larger scales than single cores.
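A minimal version of such a trampoline, assuming round-robin scheduling and generators as tasks, looks like this:

```python
from collections import deque

# A minimal "trampoline": a round-robin scheduler that drives several
# generator-based tasks, treating each yield as a voluntary context
# switch back to the scheduler.

def run(tasks):
    ready = deque(tasks)
    log = []
    while ready:
        task = ready.popleft()
        try:
            log.append(next(task))   # run the task up to its next yield
        except StopIteration:
            continue                 # task finished; drop it
        ready.append(task)           # otherwise requeue it
    return log

def worker(name, steps):
    for i in range(steps):
        yield "%s:%d" % (name, i)

print(run([worker("a", 2), worker("b", 3)]))
# -> ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

Data passing (the fleshed-out version in the talk) would use `task.send(value)` instead of bare `next(task)`.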
This talk will show you how to develop a game using Kinect from Python. I'll start with an introduction to the Kinect API, including skeleton tracking, normal video, depth video, and audio APIs including speech recognition. I’ll then show how the Kinect APIs can be incorporated into a game using PyGame. After the talk you’ll be able to start developing your own Python-based Kinect games!
Kinect API Overview
Audio / Voice Recognition
Post Kinect events to PyGame event queue
Processing skeleton data, drawing skeletons, working with the video stream
by Mike Müller
The presentation introduces the possibilities for using HDF5 (Hierarchical Data Format) from Python. HDF5 is one of the fastest ways to store large amounts of numerical data. The talk is for scientists who would like to store their measured or calculated data, as well as for programmers who are interested in non-relational data storage.
HDF5 (Hierarchical Data Format) makes it possible to store large amounts of data fast. Many scientists use HDF5 for numerical data. Multidimensional arrays and database-like tables can be nested. This makes HDF5 useful for other user groups, such as people working with image data.
The main objective of HDF5 is the storage of data in the GB and TB range. An HDF5 file has a hierarchical structure with groups and sub-groups, similar to a file system with directories and sub-directories. The analogues of files are homogeneous, multidimensional arrays or database-like tables. The hierarchical structure uses B-trees and may span several files.
HDF5 comes with compression options that allow compact data storage. Because compressed data moves fewer bytes to and from disk, effective write and read rates can exceed the raw maximum rate of the hard drive.
Users from scientific and technical fields like to use HDF5. It has proven valuable for a variety of applications. The speed is often considerably higher than that of user defined binary formats. HDF5 is very attractive because its storage capacity is practically unlimited and the data access is very convenient. In addition, there are many tools that help visualize and interpret data stored in HDF5 files.
HDF5 can be interesting not only for scientific applications. Multidimensional arrays can be stored in tables. This opens new possibilities for efficient and easy storage of image data, including indexing. Another application could be platform-independent virtual file systems based on HDF5.
There are HDF5 libraries for different programming languages such as C, C++ and Fortran. There are two libraries for Python:
h5py exposing the full C-API with all options to Python and
PyTables that adds pythonic features to simplify especially the work with tables.
This presentation gives examples of how to work with both libraries. Python programs for reading and writing HDF5 data are typically several times shorter than their counterparts in C or Fortran. Combining the elegance of Python with the extraordinary speed of HDF5 makes programming, as well as program execution, highly effective.
New Python web developers seem to love running benchmarks on WSGI servers. Reality is that they often have no idea what they are doing or what to look at. This talk will look at a range of factors which can influence the performance of your Python web application. This includes the impact of using threads vs processes, number of processors, memory available, the GIL and slow HTTP clients.
A benchmark of a hello-world application is often what developers use to make the all-important decision of what web hosting infrastructure to use. Worse, in many cases this is the only sort of performance testing or monitoring they will ever do. When it comes to their production applications they are usually flying blind, with no idea of how they are performing or what to do to tune their web application stack.
This talk will discuss different limiting factors or bottlenecks within your WSGI server stack and system that can affect the performance of your Python web application. It will illustrate the impacts of these by looking at typical configurations for the more popular WSGI hosting mechanisms of Apache/mod_wsgi, gunicorn and uWSGI, seeing how they perform under various types of traffic and request loads and then tweaking the configurations to see whether they perform better or worse.
Such factors that will be discussed will include:
Use of threads vs processes.
Number of processors available.
Python global interpreter lock (GIL).
Amount of memory available.
Slow HTTP browsers/clients.
Browser keep alive connections.
Need to handle static assets.
From this an attempt will be made to provide some general guidelines of what is a good configuration/architecture to use for different types of Python web applications. The importance of continuous production monitoring will also be covered to ensure that you know when the performance of your system is dropping off due to changing traffic patterns as well as code changes you have made in your actual web application.
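For reference, the hello-world application such benchmarks exercise is tiny; the same callable mounts unchanged under Apache/mod_wsgi, gunicorn, or uWSGI, which is part of what makes it a tempting (if misleading) benchmark target:

```python
# The canonical hello-world WSGI application that such benchmarks
# exercise: a callable taking the request environ and a start_response
# function, returning an iterable of body bytes (PEP 3333).

def application(environ, start_response):
    body = b"Hello, world!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

# Served standalone with the stdlib reference server:
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, application).serve_forever()
```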
by Eric Snow
To really take advantage of Python you must understand how imports work and how to use them effectively. In this talk we'll discuss both of these. After a short introduction to imports, we'll dive right in and look at how customizing import behavior can make all your wildest dreams come true.
Python's import statement has been a powerful feature since the first release, and it has only gotten better with age. Understanding how imports work under the hood will let you take advantage of that power.
A big key to customizing Python's imports is the importer machinery introduced by PEP 302. That's a tool you want in your belt! We'll cover such import hooks, as well as a couple of other customization methods.
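In its modern form (PEP 451, the successor to PEP 302's original protocol), such a hook is a finder/loader pair on `sys.meta_path`. The sketch below manufactures a hypothetical module named `virtualmod` out of thin air; the same mechanism powers imports from zip files and frozen modules:

```python
import sys
from importlib.abc import Loader, MetaPathFinder
from importlib.util import spec_from_loader

# A toy import hook: a finder/loader on sys.meta_path that creates the
# (hypothetical) module "virtualmod" without any file on disk.

class VirtualFinder(MetaPathFinder, Loader):
    def find_spec(self, fullname, path, target=None):
        if fullname == "virtualmod":
            return spec_from_loader(fullname, self)
        return None                      # let other finders handle it

    def create_module(self, spec):
        return None                      # use the default module object

    def exec_module(self, module):
        module.answer = 42               # "load" the module's contents

sys.meta_path.insert(0, VirtualFinder())

import virtualmod
print(virtualmod.answer)   # -> 42
```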
by Zain Memon
Python makes it easy to store, query, and transform geodata. We will run through a handful of useful GIS libraries and patterns that let you do magical things with your maps. If you want to make maps that are more interactive and more interesting, this talk is for you.
This talk will demystify the different parts of a usual map stack, including:
GeoSpatial Datastores (RDBMS & NoSQL)
Map servers (that query the geodata)
Tile servers (that chunk the data into tiles and cache it)
by Kurt Grandis
Has your garden been ravaged by the marauding squirrel hordes? Has your bird feeder been pillaged? Tired of shaking your fist at the neighbor children? Learn how to use Python to tap into computer vision libraries and build an automated sentry water cannon capable of soaking intruders.
Using the Python bindings for the computer vision library, OpenCV, we will investigate the components and steps needed to power a sentry gun. In addition to basic object and motion tracking, concepts of object recognition (friend or foe) will be discussed. Communication and control of the underlying hardware is performed using Python and will also be covered.
Additional peace-time applications of the above technology will be demonstrated.
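Stripped of OpenCV, the motion-tracking idea reduces to frame differencing; the sketch below works on grayscale frames represented as nested lists (the real system would use `cv2` operations such as `absdiff` on camera frames, so treat this as a conceptual stand-in):

```python
# The core of the sentry's motion tracking: difference two grayscale
# frames, threshold the change, and aim at the centroid of the pixels
# that changed.

def motion_centroid(prev, curr, thresh=30):
    moving = [(x, y)
              for y, (row_p, row_c) in enumerate(zip(prev, curr))
              for x, (p, c) in enumerate(zip(row_p, row_c))
              if abs(c - p) > thresh]
    if not moving:
        return None                       # nothing to shoot at
    n = len(moving)
    return (sum(x for x, _ in moving) / n,
            sum(y for _, y in moving) / n)

still = [[0] * 4 for _ in range(4)]       # an empty 4x4 scene
squirrel = [row[:] for row in still]
squirrel[1][2] = 255                      # a bright blob appears
squirrel[2][2] = 255

assert motion_centroid(still, still) is None
assert motion_centroid(still, squirrel) == (2.0, 1.5)
```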
Providing full-featured REST APIs is an increasingly popular request. Tastypie allows you to easily implement a customizable REST API for your Python or Django applications.
Who am I? (Primary author of Tastypie)
A touch of philosophy
Use HTTP the best we can
Flexible serialization (not everyone wants JSON)
What you can GET should be able to be POST/PUT
Should be reasonable by default but easy to extend
Works with Django
Any data source (Not just ORM)
Designed to be extensible
Supports a variety of serialization formats (JSON/XML/YAML/bplist)
URIs everywhere by default
Lots of hooks for customization
Demonstrate a simple setup
Then explore the API based on that trivial setup
Demonstrate adding authentication/authorization
Demonstrate adding custom serialization
Demonstrate adding a different data source
Demonstrate adding a custom endpoint
Twitter's new scalable, fault-tolerant, and simple(ish) stream programming system... with Python!
Storm is a high-volume, continuous, reliable stream processing system developed at BackType and recently open-sourced by Twitter. Though most of the system (and its documentation) is written in JVM-based languages, it is possible to use it in a Python environment with Python-based analysis code. At DotCloud (our application platform-as-a-service) we're doing just that, and we'll show how you can too.
We collect a lot of data: we have tens of thousands of customers, many of whom have dozens of services running on our platform, each of which in turn produces dozens of metrics every second. All in all, we're dealing with millions of datapoints per minute. Storm will be the third iteration of our metrics system, an attempt at standardizing a number of previously-distinct pieces of our infrastructural software, to enable automated, real-time reactions to changes in the platform's state.
We'll start by touching on what problems Storm is (and isn't) trying to solve and why its model is so powerful, informed by our previous attempts to solve the stream processing problem. We'll then move on to a deep dive into how to get Storm up and running with the most Python and the least Java-induced pain possible, and finish up with tips for solving some of the challenges we've encountered while adopting Storm into our Python-based development process.
What is stream processing?
High volume, Continuous, Reliable data analysis
How do people solve this today?
Storm's overall model
Why is this solution better?
The Hard Part, Made Simple:
Build a topology
Code your processors
The Simple Part, Made Hard (made simple):
Clojure (less ugh)
by Ask Solem
This talk will delve deep into advanced aspects of the Celery task queue and ecosystem. Previous experience with task queues and message oriented middleware is beneficial.
We will look at task examples and rewrite them to better fit the distributed paradigms.
Celery + Eventlet
Monitoring and troubleshooting.
Logging (syslog, sentry, error-emails).
Tracing memory leaks.
Writing a Celery worker in Ruby using celeryd as a proxy.
Clustering and HA
by Noah Silas and Jacob Burch
This talk aims to briefly introduce the core concepts of caching and covers best practices for implementing them, using a small variety of Python web frameworks (Flask, Django) for example code.
"Are you caching?" is a question asked early on in any yarn of web-scaling advice. These conversations are much better steered by asking the more open and difficult questions: "What is your caching strategy?" and "How are you implementing it?" This talk briefly introduces the core concepts of caching and quickly moves on to cover best practices, using Django and Flask for example code. We will let the audience know what the important questions to ask are, give advice on how to implement the right answers, and, when even the built-in core backend isn't enough, point to more advanced techniques and the right third-party tools.
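The pattern nearly every answer builds on is cache-aside: get, miss, compute, set. A framework-free sketch of it as a TTL memoizer (Django's `cache.get`/`cache.set` and the Flask caching extensions follow the same shape; only the backend differs):

```python
import functools
import time

# Cache-aside as a decorator: check the store, recompute on a miss or
# when the entry has expired, and write the fresh value back.

def cached(ttl, clock=time.monotonic):
    def decorator(func):
        store = {}                             # key -> (expires_at, value)
        @functools.wraps(func)
        def wrapper(*args):
            hit = store.get(args)
            if hit is not None and hit[0] > clock():
                return hit[1]                  # fresh cache hit
            value = func(*args)                # miss (or stale): recompute
            store[args] = (clock() + ttl, value)
            return value
        return wrapper
    return decorator

calls = []

@cached(ttl=60)
def expensive(x):
    calls.append(x)
    return x * x

assert expensive(3) == 9 and expensive(3) == 9
assert calls == [3]        # the second call was served from the cache
```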
Important questions covered:
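The most basic of those core concepts is the cache-aside pattern: check the cache, and only on a miss do the expensive work and store the result. A minimal sketch, with a plain dict standing in for a real backend such as memcached (in Django you would use `django.core.cache.cache.get`/`set`; Flask extensions expose a similar API):

```python
# Cache-aside at its simplest; CACHE stands in for a real backend.

CACHE = {}

def expensive_query(user_id):
    # Placeholder for a slow database hit or remote call.
    return {"id": user_id, "name": "user-%d" % user_id}

def get_user(user_id):
    key = "user:%d" % user_id     # key naming is part of your strategy
    value = CACHE.get(key)
    if value is None:             # miss: compute and store
        value = expensive_query(user_id)
        CACHE[key] = value
    return value                  # hit: served from the cache
```

A real backend adds the parts this sketch omits: timeouts, invalidation, and the question of what happens when everyone misses at once.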
Exploring and analyzing data can be daunting and time-consuming, even for data lovers. Python can make the process fun and exciting. We will present techniques of data analysis, along with python tools that help you explore and map data. Our talk includes examples that show how python libraries such as csvkit, matplotlib, scipy, networkx and pysal can help you dig into and make sense of your data.
Learn about powerful python libraries for analyzing all types of data, including spatial data, through the following illustrated examples.
Example 1: Explore data
Problem: I have a large voter data file in CSV format. I want to examine it, check the column headings and data types, and do some basic stats, but I don’t want to pull it into Excel or Access. What are my options?
Solution: csvkit - I can explore my data, chop it up, sort it, summarize it, and prepare it for import to postgis.
Bonus: Developers and journalists have been working hard to add functionality to csvkit. You can contribute!
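csvkit is usually driven from the command line (`csvlook`, `csvcut`, `csvstat` and friends). As a rough standard-library sketch of the kind of summary `csvstat` automates, using a tiny stand-in for the voter file:

```python
import csv
from io import StringIO

# A tiny stand-in for a large voter CSV; csvstat would be run as
# `csvstat voters.csv` from the shell and infer all of this for you.
DATA = StringIO(
    "name,age,party\n"
    "Ada,36,A\n"
    "Ben,52,B\n"
    "Cyd,29,A\n"
)

reader = csv.DictReader(DATA)
rows = list(reader)

ages = [int(row["age"]) for row in rows]   # a numeric column
summary = {
    "rows": len(rows),
    "columns": reader.fieldnames,          # column headings
    "age_min": min(ages),
    "age_max": max(ages),
    "age_mean": sum(ages) / float(len(ages)),
}
```

csvkit does this (plus type inference, slicing, sorting, and SQL/postgis export) without you writing any of it.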
Example 2: Analyze data
Problem: I have a bunch of data points from Twitter. How do I make sense of what I have in front of me, and where do I start?
Solutions: matplotlib, networkx
Bonus: Learn about how python libraries are plug and play with each other.
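A small taste of the networkx side, using hypothetical @mention pairs as edges; degree centrality is a quick first look at who sits at the center of the conversation:

```python
import networkx as nx

# Hypothetical @mention edges pulled from a batch of tweets.
mentions = [
    ("alice", "bob"), ("alice", "carol"),
    ("bob", "carol"), ("dave", "alice"),
]

G = nx.Graph()
G.add_edges_from(mentions)

# Who is most connected? Degree centrality normalizes degree by the
# number of other nodes in the graph.
centrality = nx.degree_centrality(G)
most_central = max(centrality, key=centrality.get)

# nx.draw(G) would hand the layout straight to matplotlib -- this is
# the plug-and-play the talk refers to.
```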
Example 3: Map data
Problem: I have a year’s worth of crime incidents for a large city. I want to explore global and local patterns in the data and identify clusters.
Solutions: PySal (Numpy, Scipy)
Bonus: We’ll look at the full ESDA (Exploratory Spatial Data Analysis) module in PySal, and we’ll briefly touch on a selection of the rest of PySal’s functionality.
To wrap up the talk, we'll give some tips on using postgis and geodjango to go from data analysis and mapping to building a web application.
Due to its robust namespacing, Python uniquely equips developers to write and distribute reusable code. The Python community has a tool for this: the Python Package Index. PyPI is a massive repository of code, and in this talk you'll learn how to take code that you've written and register and distribute it for use by others.
Sharing is Caring: Posting to the Python Package Index
I had been writing Python professionally for over a year before I learned how to do this. If I’d known how easy it was, perhaps I would have made more code available sooner.
What is the Python Package Index? (5m)
A repository of Python code written by thousands of different contributors
Brief history (“the Cheese Shop”), and bare-bones explanation of pip
Ultimately, it’s a place to help you re-invent as few wheels as possible
Reminders: Reusable Code (5m)
Everyone has a different use case.
Some best practices for this.
The bottom line: Think about general needs, not just your specific needs.
Python internal documentation
...and write your own, too!
Walkthrough of Posting a Package (12.5m)
An explanation of setup.py
The setup function
Things you need: name, author, version, packages, scripts
Registering and uploading your project.
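The walkthrough items above boil down to a file like the following. This is a minimal sketch; `mypackage` and all its metadata are placeholders, not a real PyPI project.

```python
# setup.py -- a minimal sketch; every value here is a placeholder.
from distutils.core import setup

setup(
    name="mypackage",
    version="0.1.0",
    author="Your Name",
    author_email="you@example.com",
    url="http://pypi.python.org/pypi/mypackage/",
    description="One line on what the package does.",
    packages=["mypackage"],          # importable package directories
    scripts=["bin/mypackage-cli"],   # command-line entry points, if any
)
```

With this in place, `python setup.py register` claims the name on PyPI and `python setup.py sdist upload` builds and uploads a source distribution that anyone can then `pip install`.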
Ensuring consistency and repeatability in setting up a team's development environments helps avoid errors by automating repetitive tasks. It also lowers the entry barrier for new developers, and lets existing developers focus on development tasks without having to worry about infrastructure or process issues.
This talk will show how you can use fabric to standardize the different tasks that are commonly performed during the development process. Such tasks can be grouped into categories like:
bootstrap: initializing the development environment
database: managing the database
testing: running tests
lint: validating the code against different standards
deploy: automating the deployment process
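The grouping above can be sketched in plain Python. In a real fabfile each function would be a fabric task shelling out via `local()` or `run()`; here a tiny runner returns the command string instead of executing it, so the sketch is inspectable without fabric installed, and every command shown is a placeholder for whatever your project actually needs.

```python
# Plain-Python sketch of a fabfile's task groups; commands are
# placeholders, and dry_run=True returns them instead of running them.
import subprocess

def sh(cmd, dry_run=True):
    if not dry_run:
        subprocess.check_call(cmd, shell=True)
    return cmd

def bootstrap():   # initialize the development environment
    return sh("pip install -r requirements.txt")

def database():    # manage the database
    return sh("python manage.py syncdb --noinput")

def test():        # run the test suite
    return sh("python manage.py test")

def lint():        # validate the code against standards
    return sh("flake8 .")

def deploy():      # automate the deployment process
    return sh("git push production master")
```

With fabric these become `fab bootstrap`, `fab test`, and so on: one shared, versioned vocabulary for the whole team's workflow.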
7th–15th March 2012