Python MapReduce Programming with Pydoop

A session at EuroPython 2011

  • Simone Leo

Friday 24th June, 2011

9:00am to 10:30am (CET)

Hadoop is the leading open-source implementation of MapReduce,
Google's large-scale distributed computing paradigm. Hadoop's native
API is in Java, and its built-in options for Python programming --
Streaming and Jython -- have several drawbacks: the former gives
access to only a small subset of Hadoop's features, while the latter
carries with it all of the limitations of Jython with respect to
CPython.

Pydoop (http://pydoop.sourceforge.net) is an API for Hadoop that makes
most of its features available to Python programmers while allowing
CPython development. Its core consists of Boost.Python wrappers for
Hadoop's C/C++ interface.

The talk consists of a MapReduce/Hadoop tutorial and a presentation of
the Pydoop API, with the main goal of bridging the gap between the
Hadoop and Python communities. A basic knowledge of distributed
programming is helpful but not strictly required.
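
For readers new to the paradigm, the map, shuffle and reduce steps that the tutorial part covers can be illustrated with a word count in plain Python. This is only a single-process sketch of the idea: it does not use Hadoop or the Pydoop API, and the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as a framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hadoop and python", "python and pydoop"]))
```

In a real cluster the map and reduce calls would run on different nodes and the grouping step would be handled by the framework; here everything runs in one process so that the data flow is easy to follow.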

EuroPython 2011

Florence, Italy

20th to 26th June 2011
