Python MapReduce Programming with Pydoop

A session at EuroPython 2011

  • Simone Leo

Friday 24th June, 2011

9:00am to 10:30am (CET)

Hadoop is the leading open source implementation of MapReduce,
Google's large-scale distributed computing paradigm. Hadoop's native
API is in Java, and its built-in options for Python programming --
Streaming and Jython -- have several drawbacks: the former gives access
to only a small subset of Hadoop's features, while the latter
carries with it all of the limitations of Jython with respect to
CPython.

Pydoop (http://pydoop.sourceforge.net) is an API for Hadoop that makes
most of its features available to Python programmers while allowing
CPython development. Its core consists of Boost.Python wrappers for
Hadoop's C/C++ interface.

The talk consists of a MapReduce/Hadoop tutorial and a presentation of
the Pydoop API, with the main goal of bridging the gap between the
Hadoop and Python communities. A basic knowledge of distributed
programming is helpful but not strictly required.
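
As a rough illustration of the MapReduce model that the tutorial part covers, here is a single-process word count sketch in plain Python. It uses neither Hadoop nor the Pydoop API, and every function name in it is hypothetical, chosen only for this example. In a real cluster the framework would run the map and reduce steps on many nodes and do the grouping itself.

```python
from collections import defaultdict


def map_words(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Shuffle phase: group emitted values by key (done by the framework in Hadoop)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reduce phase: combine all values emitted for a single key."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_words(line))
    return dict(reduce_counts(w, c) for w, c in shuffle(pairs).items())
```

Calling word_count on a list of strings returns a dictionary mapping each lowercased word to its number of occurrences.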

About the speaker

Simone Leo

EuroPython 2011

Florence, Italy

20th to 26th June 2011
