Topics in any discipline that focuses on quantitative or technical data have always depended on the datasets available at the time. Crowdsourcing has changed that, democratizing the data-collection process and cutting researchers’ reliance on stagnant, overused datasets. Tools like Amazon Mechanical Turk allow anyone to gather data overnight, rather than waiting years.
Learn how to leverage data exhaust, the digital byproduct of our online activities, to solve problems and discover insights about the world around you. We will walk through a real-world example that combines several datasets and statistical techniques to make predictions about attendees at O'Reilly Strata.
by Dustin Kirk
When faced with endless data and the need to manage it, designers can draw on a variety of proven design patterns to create usable, efficient, and effective interfaces. From distributing workload to reducing sensory overload, we’ll review the techniques that are enabling the highly scalable user interfaces of today and tomorrow.
Data modeling competitions allow companies and researchers to post a problem and have it scrutinised by the world's best data scientists. By exposing a problem to a wide audience, competitions are a great way to get the most out of a dataset. In just a few months, Kaggle's competitions have helped to progress the state of the art in chess ratings and HIV research.
1st–3rd February 2011