A discussion of Big Data approaches to analysis problems in marketing, forecasting, academia, and enterprise computing. We focus on practices that enhance collaboration and employ rich statistical methods: a Magnetic, Agile, and Deep (MAD) approach to analytics. Although the approach is language-agnostic, we show that sophisticated statistics can be scaled easily in traditional environments like SQL.
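As a rough illustration of the claim about SQL (a sketch, not material from the session itself): a simple least-squares line can be fitted using nothing but standard SQL aggregates, because the fit depends only on a handful of sums that the database can compute in one pass. The table name `obs`, the toy data, and the use of SQLite through Python are all assumptions made for the example.

```python
import sqlite3

# Hypothetical toy data: (x, y) points lying on the line y = 2x + 1.
points = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

# In-memory database with an illustrative table named "obs".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (x REAL, y REAL)")
conn.executemany("INSERT INTO obs VALUES (?, ?)", points)

# A single aggregate query returns the sufficient statistics
# for simple linear regression: n, sum(x), sum(y), sum(x^2), sum(xy).
n, sx, sy, sxx, sxy = conn.execute(
    "SELECT COUNT(*), SUM(x), SUM(y), SUM(x * x), SUM(x * y) FROM obs"
).fetchone()

# Closed-form ordinary least squares for one predictor.
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n

print(slope, intercept)
```

Only the five sums leave the database, so the same query shape applies whether the table holds four rows or many millions; the final arithmetic is constant-size.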
by Doug Cutting
Apache Avro provides an expressive, efficient standard for representing large data sets. Avro data is programming-language neutral and MapReduce-friendly. The hope is that it can replace gzipped CSV-like formats as a dominant data format.
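For a sense of what "expressive" means here: Avro schemas are themselves written in JSON, so any language with a JSON parser can inspect them. The sketch below uses only Python's standard `json` module to load a hypothetical record schema; the record name `PageView`, its namespace, and its fields are invented for illustration, and actually reading or writing Avro files would need an Avro library, which is not shown.

```python
import json

# Hypothetical Avro record schema; all names here are illustrative only.
schema_json = """
{
  "type": "record",
  "name": "PageView",
  "namespace": "example",
  "fields": [
    {"name": "url", "type": "string"},
    {"name": "timestamp", "type": "long"},
    {"name": "referrer", "type": ["null", "string"], "default": null}
  ]
}
"""

# The schema is plain JSON, so the standard parser is enough to read it.
schema = json.loads(schema_json)

# List the declared field names in order.
field_names = [field["name"] for field in schema["fields"]]
print(field_names)
```

Unlike a CSV header, the schema carries types with the field names, and the `["null", "string"]` union marks `referrer` as optional.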
1st–3rd February 2011