RHadoop, R meets Hadoop

A session at Strata 2012

Wednesday 29th February, 2012

10:40am to 11:20am (PST)

RHadoop is an open source project spearheaded by Revolution Analytics to give data scientists access to Hadoop’s scalability from their favorite language, R. RHadoop consists of three packages:

  • rhdfs provides file-level manipulation for HDFS, the Hadoop file system
  • rhbase provides access to HBase, the Hadoop database
  • rmr lets you write mapreduce programs in R

rmr allows R developers to program in the mapreduce framework, and it offers all developers an alternative way to implement mapreduce programs that strikes a delicate compromise between power and usability. It lets you write general mapreduce programs, with the full power and ecosystem of an existing, established programming language. It doesn’t force you to replace the R interpreter with a special run-time; it is just a library. You can write logistic regression in half a page and even understand it. It feels and behaves almost like the usual R iteration and aggregation primitives. It consists of a handful of functions with a modest number of arguments and sensible defaults that combine in many useful ways. But there is no way to prove that an API works: one can only show examples of what it lets you do, and we will do that, covering a few from machine learning and statistics. Finally, we will discuss how to get involved.

About the speaker

Antonio Piccolboni

Revolution Analytics

Strata 2012

Santa Clara, United States

28th February to 1st March 2012

Short URL

lanyrd.com/sqfym
