Big Analytics Beyond the Elephants

A session at Strata 2012

  • Paul Brown

Thursday 1st March, 2012

4:50pm to 5:30pm (PST)

Scientists had been dealing with big data and big analytics for at least a decade before the business world precipitated buzzwords like ‘Big Data’, ‘Data Tsunami’ and ‘the Industrial Revolution of data’ from the strange broth of their marketing solution and came to realize they had the same problems. The scientific world and the commercial world share requirements for a high-performance informatics platform supporting the collection, curation, collaboration, exploration, and analysis of massive datasets.

In this talk we will sketch the design of SciDB, explain how it differs from Hadoop-based systems, SQL DBMS products, and NoSQL platforms, and explain why that matters. We will present benchmarking data and a computational genomics use case that showcase SciDB’s massively scalable parallel analytics.

SciDB is an emerging open source analytical database that runs on a commodity hardware grid or in the cloud. SciDB natively supports:

• An array data model – a flexible, compact, extensible data model for rich, highly dimensional data

• Massively scalable math – non-embarrassingly parallel operations, such as linear algebra on matrices too large to fit in memory, as well as transparently scalable R, MATLAB, and SAS style analytics without requiring code for data distribution or parallel computation

• Versioning and Provenance – Data is updated, but never overwritten. The raw data, the derived data, and the derivation are kept for reproducibility, what-if modeling, back-testing, and re-analysis

• Uncertainty support – data carry error bars, probability distributions, or confidence metrics that can be propagated through calculations

• Smart storage – compact storage for both dense and sparse data that is efficient for location-based, time-series, and instrument data
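To make a few of the ideas in the list above concrete, here is a minimal sketch in plain Python. It is not SciDB code and does not use SciDB's API; the names `Measured` and `VersionedArray` are hypothetical, invented for illustration. It assumes measurement errors are independent, so they combine in quadrature under addition, and it models an array as a sparse mapping from coordinates to cells where every update creates a new version instead of overwriting the old one.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Measured:
    """A value with a one-sigma error bar (hypothetical helper, not a SciDB type)."""
    value: float
    sigma: float

    def __add__(self, other: "Measured") -> "Measured":
        # Assumption: the two errors are independent, so they add in quadrature.
        return Measured(self.value + other.value,
                        math.hypot(self.sigma, other.sigma))


Coord = Tuple[int, int]


@dataclass
class VersionedArray:
    """Append-only sparse 2-D array: updates add versions, nothing is overwritten."""
    versions: List[Dict[Coord, Measured]] = field(default_factory=list)

    def update(self, cells: Dict[Coord, Measured]) -> int:
        # Copy the latest version, apply the changes, and store the result
        # as a new version. Earlier versions are left untouched.
        latest = dict(self.versions[-1]) if self.versions else {}
        latest.update(cells)
        self.versions.append(latest)
        return len(self.versions)  # 1-based version number

    def read(self, version: Optional[int] = None) -> Dict[Coord, Measured]:
        # Default to the latest version; any older version stays readable.
        index = len(self.versions) if version is None else version
        return self.versions[index - 1]


arr = VersionedArray()
v1 = arr.update({(0, 0): Measured(10.0, 3.0)})
v2 = arr.update({(0, 0): Measured(12.0, 3.0), (0, 1): Measured(5.0, 4.0)})

# The error bar is propagated through the sum of two cells.
total = arr.read()[(0, 0)] + arr.read()[(0, 1)]
print(total)
# The original value is still available for re-analysis.
print(arr.read(v1)[(0, 0)])
```

A real system would store versions as deltas and distribute the cells across nodes; copying the whole mapping on each update is only acceptable in a toy example like this one.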

About the speaker

Paul Brown

Paradigm4 Inc

Strata 2012

Santa Clara, United States

28th February to 1st March 2012

Ballroom CD, Santa Clara Convention Center
