Faced with the costs of vertically scaling their relational database systems, developers are increasingly turning to Apache Cassandra as an alternative. Cassandra addresses the scaling problem by partitioning data across nodes, expanding horizontally, and replicating data with tunable consistency. Using Cassandra effectively requires developers to take a different approach to modeling the data used in their applications. This presentation will explain how Cassandra achieves scale and reliability, and give an example of porting a SQL schema to Cassandra.
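As a rough illustration of the partitioning and replication idea mentioned above, the following is a minimal sketch of a hash ring in Python. It is not Cassandra's actual implementation: the node names, the use of MD5 as the hash, the single token per node, and the replication factor of 3 are all assumptions made for the example.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumed choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: each node owns one token; rows go to the next nodes clockwise."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Sort nodes by their token so we can binary-search the ring.
        self._entries = sorted((token(name), name) for name in nodes)
        self._tokens = [t for t, _ in self._entries]

    def replicas(self, row_key: str):
        """Return the nodes that would hold copies of row_key."""
        if not self._entries:
            return []
        size = len(self._entries)
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect.bisect_left(self._tokens, token(row_key)) % size
        count = min(self.replication_factor, size)
        return [self._entries[(start + i) % size][1] for i in range(count)]


# Hypothetical four-node cluster.
ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
owners = ring.replicas("user:42")
print(owners)
```

Adding a node to such a ring moves only the keys that fall into the new node's token range, which is the property that makes horizontal expansion cheap compared with rehashing every row.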
Big Data solutions, such as Apache Hadoop and Apache Cassandra, are growing up and are moving from a grassroots movement to widespread adoption. Unfortunately, most of the technical expertise still lies in the hands of the open source project contributors, and most solutions are tackled from the bottom up, starting with the technical problems. The collateral that is presently available comes largely from the social media giants, which tout solutions built on 10,000-node clusters that process petabytes of data a day. The reality? The average person cannot relate to these examples or intuitively draw parallels to their own business problems.
While Big Data solutions are worthwhile long before you reach petabyte-scale data, just getting started can be a challenge in itself. New open source projects that tackle a variety of Big Data issues are released regularly, and some differ only slightly from existing technologies. Just how does one navigate the plethora of technologies to design workable solutions to business problems? What if you only have gigabytes or terabytes of "medium" data on a small cluster? This panel features Solution Architects from a variety of key companies in the Big Data space, who will provide deep-dive technical discussions of real solutions we've employed for our customers across a variety of industries, starting with the business problems.
11th–15th March 2011