by Doug Cutting
Apache Hadoop forms the kernel of an operating system for Big Data. This ecosystem of interdependent projects enables institutions to affordably explore ever vaster quantities of data. The platform is young, but it is strong and vibrant, built to evolve.
by Dave Campbell
In a world where data is increasing 10x every 5 years and 85% of that information is coming from new data sources, how do our existing technologies for managing and analyzing data stack up? This talk discusses some of the key implications that Big Data will have for our existing technology infrastructure, and where we need to go as a community and ecosystem to make the most of the opportunity that lies ahead.
How big data tools and technologies give us back our individual identity … because if you didn’t know you were unique and special, well, you are. Big data can be applied to solving socio-economic problems that rival the scale and importance of building ad optimization models.
by Mike Olson
Tools for attacking big data problems originated at consumer internet companies, but the number and variety of big data problems have spread across industries and around the world. I’ll present a brief summary of some of the critical social and business problems that we’re attacking with the open source Apache Hadoop platform.
Back in the late 80s artificial intelligence was set to take over the world; it didn’t happen. In 2012, AI has been stripped down, dressed up and reborn as machine learning. Will it take over the world this time? What makes a Big Data – Machine Learning solution ‘better’? Can machine learning happen with legacy tools? What exactly does it mean to be fully parallel? Do I care? Will I be any better if I get it right?
This talk sponsored by HPCC Systems
Our education system is not preparing students for college. There is an urgent need to improve academic outcomes and equip students with critical 21st century skills. Evidence from top-performing schools shows that data, analysis, and feedback are our best tools for improvement. The increasing use of online software and digital devices in classrooms presents an opportunity to collect high-frequency data for mining. Today’s analytics techniques could be used to develop a deeper understanding of how students learn, recommend personalized learning plans, and identify early warning flags. Rich data, analytics, and feedback enable a process of iteration and continuous improvement, where educators become learners, and we figure out how to improve education. We are at the beginning of a wave of data-driven change in education, with important social consequences and fantastic opportunities.
So you’ve hoarded the world’s data within your enterprise. Now what? Author and digital marketing evangelist Avinash Kaushik shares lessons from the nascent world of Web Analytics on how multiplicity, scale and outsourcing power a data democracy, and how that in turn drives business action.
by Ben Goldacre
I am a doctor and a data geek. I worry that data geeks are too easily seduced by the glamour of laboratory science and forget about clinics. Randomised controlled trials are the best tool we have in medicine for finding out if a treatment works or not. Lots of trials are done. Unfortunately, the results of these trials can go missing in action after they are completed.
Missing data is always a challenge, but we also know that “negative results” are more likely to go missing. This means we have a biased sample, overestimating the benefits of treatments. To prevent all this happening, people have set up registers of trial protocols, to be completed before trials begin. These have not been correctly used, and they are not matched against published trials, which would show up what data has been left unpublished.
I will describe a small project to fix this, illustrate how that can lead on to fixing other similar problems in medicine, and make a cry for help.
by Jon Gosier
Big data isn’t just an abstract problem for corporations, financial firms, and tech companies. To your mother, a ‘big data’ problem might simply be too much email, or a lost file on her computer.
We need to democratize access to the tools used for understanding information by taking the hard work out of drawing insight from excessive quantities of information. We need to help humans process content more efficiently and capture more of their world.
Tools to effectively do this need to be visual, intuitive, and quick. This talk looks at some of the data visualization platforms that are helping to solve big data problems for normal people.
How are businesses using big data to connect with their customers, deliver new products or services faster and create a competitive advantage? Luke Lonergan, co-founder & CTO, Greenplum, a division of EMC, gives insight into the changing nature of customer intimacy and how the technologies and techniques around big data analysis provide business advantage in today’s social, mobile environment – and why it is imperative to adopt a big data analytics strategy.
This session is sponsored by Greenplum, a division of EMC²
by Coco Krumme
Why data can tell us only so much about food, flavor, and our preferences.
by Pete Warden
Why unstructured data beats structured.
by Usman Haque
The expected massive growth of connected device, appliance and sensor markets in the coming years – often called ‘The Internet of Things’ – will need a richer concept of ‘open data’ than is currently common. When data is generated through the activities of people doing things inside their homes and outside in public in their cities, the question of who owns the data becomes almost irrelevant next to the questions of who has access to the data, what they do with it, and how citizens manage and make sense of their data while retaining the ‘openness’ that we’ve seen drive creativity and business on the web over the last few years.
by Gary Lang
Big Data is about extracting value from fast, huge, varied, complex data sets. But simply crunching data is only the first step. As adoption of MapReduce and data analytics technologies increases, forward-thinking companies are starting to build applications on their core data assets. In this keynote, MarkLogic’s Gary Lang will explore what these Big Data Applications look like, offering some tantalizing real-world glimpses at what data wrapped in applications makes possible.
This keynote is sponsored by MarkLogic
by Hal Varian
Google Insights for Search provides an index of search activity for millions of queries. These queries can sometimes help understand consumer behavior. Hal describes some of the issues that arise in trying to use this data for short-term economic forecasts and provides examples.
Santa Clara, United States
28th February to 1st March 2012