This talk covers Pandas and IPython for beginners. You may have heard of the R programming language for statistical analysis, and you may even have tried it out. But while R is fantastic for statistics, it is not so great for data munging and preparation. Oh, and R requires that you learn yet another programming language.
Together, Pandas and IPython provide a familiar (or easy to learn) programming environment in which you can not only prepare the data but also analyse it interactively, with access to the numerous libraries that exist in Python. Come find out how easy and fun data analysis can be.
Hadoop is about more than MapReduce these days. The toolset is growing, the technology is maturing, and it’s being stretched and squeezed in all sorts of new directions. Variety and growth are great, but they also open up a whole new set of questions.
How can you use new languages like Clojure, F#, Pig and HQL to get the best out of huge amounts of data? How can you use massive clusters of CPUs to build real-time apps with new frameworks like YARN and Tez that make up Hadoop 2.0? By the end of this session you’ll know how and, more importantly, you’ll know when it’s a good idea.
28th March 2014