by John Hugg
In this talk, we will introduce a simple formula for all Big Data applications: Big Data = Fast Data + Deep Data. Through a use-case format, we will discuss the specialized requirements for real-time (“fast”) and analytic (“deep”) data management.
by Jay Kreps
The last few years have brought a wealth of new data technologies organized around horizontal scalability. This talk will cover the essential infrastructure areas: real-time stream processing, offline data crunching, large-scale data deployments and live serving. The focus will be on how these ingredients come together to enable innovative data-driven products at LinkedIn.
by Bill Fox
A big data case study with the NY Medicaid Inspector General's Office and HPCC Systems from LexisNexis.
by Tom Wilkie
The standard Linux storage stack wasn't designed for write-heavy big data workloads, nor is it well suited to modern hardware: large, slow SATA disks, SSDs, or many cores. Castle, an open-source project, is a ground-up overhaul of RAID, file systems, and the POSIX interface.
Building large data applications presents a unique set of technical challenges: approaches that work well in a conventional development environment can become incredibly arduous or expensive at a much larger scale. This talk will cover some of those challenges and potential solutions for each.
by Erik Onnen
This talk will cover lessons learned in building Urban Airship's large-scale data warehouse in EC2, including PostgreSQL, Kafka, Cassandra, HBase, and Hadoop.
25th–27th July 2011