Wednesday 29th February, 2012
11:30am to 12:10pm
Hadoop is gaining momentum: most companies have already deployed Hadoop in some fashion or are testing it in the lab. But many aspects of Hadoop are still not fully understood or appreciated, including how Hadoop can easily be leveraged by non-programmers, how simple algorithms running on large data can outperform complex models, how to integrate Hadoop into existing environments, and the two-step process for using legacy applications with Hadoop.
During the session, Ted Dunning will show that, counterintuitive as it may seem, as data size increases simple algorithms often perform better than complex models built on small data. This can greatly simplify the development and deployment of Hadoop applications, and the talk will include several examples of machine learning deployments across multiple industries.
This session will also cover recent developments that make Hadoop accessible to rank-and-file users, extending access beyond programmers by letting standard applications view and manipulate cluster data. The session will provide detailed descriptions of the following:
1) Getting data into and out of the Hadoop cluster as quickly as possible
2) Allowing real-time components to easily access cluster data
3) Using well-known and understood standard tools to access cluster data
4) Making Hadoop easier to use and operate
5) Leveraging existing code in map-reduce settings (see the sketch below)
6) Integrating map-reduce systems into existing analytic systems
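As a concrete illustration of point 5, here is a minimal sketch, not taken from the session itself, of one common pattern for leveraging existing code in a map-reduce setting: wrapping an existing single-record routine in a Hadoop mapper. The LegacyScorer class is a hypothetical stand-in for pre-existing, Hadoop-unaware library code.

import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class LegacyWrapperMapper
        extends Mapper<LongWritable, Text, Text, DoubleWritable> {

    /** Hypothetical stand-in for an existing, Hadoop-unaware library class. */
    static class LegacyScorer {
        double score(String record) {
            return record.length(); // placeholder for the real legacy logic
        }
    }

    // Reuse the existing single-record logic unchanged.
    private final LegacyScorer scorer = new LegacyScorer();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Adapt Hadoop's key/value types to the legacy API's plain strings,
        // then emit the record alongside its score.
        double score = scorer.score(value.toString());
        context.write(value, new DoubleWritable(score));
    }
}

The point of the pattern is that the legacy class needs no changes: only the thin mapper shim knows about Hadoop's types, so the same code keeps running in its original, non-cluster setting.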