Wednesday 24th October, 2012
5:00pm to 5:50pm
The exponential growth in the study of graph-based data dependencies is fueling the need for large-scale machine learning frameworks and techniques. These computations are iterative and compute-intensive. Recently, frameworks such as Apache Giraph, Apache Hama, and CMU’s GraphLab have emerged to perform these computations in a distributed manner at commercial scale. But feeding data to these frameworks is a huge challenge in itself. Since graph construction is a data-parallel problem, Hadoop is well suited to the task, but it lacks some elements that would make things easier for MapReduce programmers.
In this talk, Nilesh will introduce GraphBuilder, a graph construction library for Apache Hadoop. GraphBuilder makes the job easier by providing services for transforming unstructured data into graphs, cleaning graphs, formatting output, and partitioning graphs ahead of cluster ingress.
Nilesh will review emerging frameworks for graph-based machine learning and explain the benefits of GraphBuilder by sharing end-to-end case studies for complex machine learning applications, such as sentiment analysis and perceptual computing. Finally, he will explain how his work is evolving to accommodate more frameworks and complex ingress structures.
Sr. Research Scientist, Intel Corp