by Alex Lundry, Dan Siroker, Josh Hendler, Kristen Soltis and Patrick Ruffini
Despite the advent of new media, campaigns for President still measure the electorate in pretty much the same way they did 40 years ago, through traditional polls to landline phones. That could all change this year. The hottest job in today’s Presidential campaigns is the Data Mining Scientist -- whose job it is to sort through terabytes of data and billions of behaviors tracked in voter files, consumer databases, and site logs. They’ll use the numbers to uncover hidden patterns that predict how you’ll vote, if you’ll pony up with a donation, and if you’ll influence your friends to support a candidate. This panel will delve deep into the world of real-time data on Presidential campaigns, showing how it’ll be used to make decisions on everything from the layout of a signup form to where to spend millions of advertising dollars in the closing days of a campaign. Forget about which candidate has the most likes on Facebook or followers on Twitter -- and learn why 2012 will be the year of Big Data in American politics.
by Erik Swan and Michael Wilde
WTF is Big Data & Why Should I Care?
Love that smartphone? Navigate with your GPS? Tweeting about this session? Everything other than brushing your teeth is generating data. Every action we take generates data & a record of that action. According to a recent study by McKinsey, 15 out of 17 industry sectors in the US have more data stored per company than the Library of Congress. The sheer volume of data, driven by new devices & disparate data sources, requires a shift in how to capture & analyze information. If you could mine data generated by your audience, what questions might you ask? Improving your perspective on what users are doing or how they're interacting with you can yield some amazing returns. Analyzing big data can be as easy as surfing the web. We'll show some cool ways to ask questions, in real time, of some fun data sources & get amazing answers. See how to turn data into information, information into knowledge & knowledge into action.
by Dhruv Bansal and Flip Kromer
Where are all the coffee shops in my neighborhood?
Seemingly easy questions can become complex when you consider ambiguity. This one sounds simple until you consider that folks may define “coffee shop” differently and draw the boundaries of your “neighborhood” differently. One person’s Central Austin may be someone else’s South Dallas.
Instead of working too hard to define the parameters in an attempt to completely remove the ambiguity, how about we look at what people do, interact with and talk about? We can watch what people do and decide from there what a coffee shop is and where the boundaries of your neighborhood are. It might not be the “truth”, but it can be darn close.
When we learn to embrace ambiguity, not only can we still find the answers to our questions, but we can also find answers to questions we hadn’t even thought to ask.
by Arnab Gupta
Using Big Data Takes Machines & Humans
Man vs. machine – usually, good (man) versus evil (machine) – has long been the stuff of scary science fiction. And now as machines master more advanced processes, the prospect that thinking machines will outperform and ultimately replace thinking humans becomes more real and threatening. Example: IBM’s Watson, an advanced AI machine that’s squared off against Jeopardy’s best human contestants and won.
But Arnab Gupta, CEO of Big Data analytics firm Opera Solutions, believes “humans vs. machines” is the wrong construct. Humans PLUS machines is far more powerful. Marrying machines’ ability to discern patterns in Big Data with humans’ ability to derive meaning from this output enables far better decisions. It’s the next wave in productivity.
How to accomplish this, when machines and people speak different languages and “think” differently? At SXSW, Arnab will explore the power of “machine + humans” and discuss ways to create collaboration.
by Paul Lamere
Data mining is the process of extracting patterns and knowledge from large data sets. It has already helped revolutionize fields as diverse as advertising and medicine. In this talk we dive into mega-scale music data such as the Million Song Dataset (a recently released, freely-available collection of detailed audio features and metadata for a million contemporary popular music tracks) to help us get a better understanding of the music and the artists that perform the music.
We explore how we can use music data mining for tasks such as automatic genre detection, song similarity for music recommendation, and data visualization for music exploration and discovery. We use these techniques to try to answer questions about music such as: Which drummers use click tracks to help set the tempo? Is music really faster and louder than it used to be? Finally, we look at techniques and challenges in processing these extremely large datasets.
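To make the click-track question concrete, here is a minimal sketch of one way it could be approached. It is not the method from the talk. It assumes each track comes with a list of beat onset times in seconds, and it treats very regular spacing between beats as a hint that a click track was used. The function names and the 1% threshold are illustrative assumptions.

```python
from statistics import mean, pstdev


def tempo_stability(beat_times):
    """Return (mean tempo in BPM, coefficient of variation of beat intervals).

    beat_times is assumed to be an ascending list of beat onsets in seconds.
    """
    if len(beat_times) < 3:
        raise ValueError("need at least three beat times")
    # Gaps between consecutive beats, in seconds.
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    avg = mean(intervals)
    if avg <= 0:
        raise ValueError("beat times must be strictly increasing on average")
    bpm = 60.0 / avg
    # Spread of the gaps relative to their mean; 0 means perfectly even.
    cv = pstdev(intervals) / avg
    return bpm, cv


def looks_click_tracked(beat_times, max_cv=0.01):
    """Heuristic guess: True when beat spacing varies by at most max_cv.

    max_cv is a hypothetical threshold, not a calibrated value.
    """
    _, cv = tempo_stability(beat_times)
    return cv <= max_cv


if __name__ == "__main__":
    steady = [i * 0.5 for i in range(16)]  # evenly spaced beats
    print(tempo_stability(steady), looks_click_tracked(steady))
```

A real analysis would need to handle tempo changes within a song and errors in beat detection, so a single whole-track number like this is only a starting point.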