Sessions at Strange Loop 2010 about NoSQL


  • Adopting Apache Cassandra

    by Eben Hewitt

    The Cassandra database is distributed, highly-available, fault-tolerant, and offers an elastic scaling model—all of which make Cassandra a powerful proposition for mission-critical applications. It’s used by many of the world’s biggest web properties, including Facebook, Twitter, Digg, StumbleUpon, Reddit, Cisco, and others.

    This is all fantastic, but there’s no free lunch: Cassandra is not a relational database, but rather follows in the footsteps of Google’s Bigtable, a column-oriented store, and Amazon’s Dynamo, a distributed key-value store. As such, getting your head around how Cassandra works can be daunting, to say the least: there’s a lot of new terminology (what’s a Hinted Handoff? What’s a SuperColumn?? What do I need to know about Vector Clocks??? Argh!). There are some complex algorithms in Cassandra, and new ways of handling basic operations in order to achieve the benefits mentioned above. Cassandra only recently emerged from Incubator status, and there aren’t a lot of tools available yet to smooth your path toward adoption. This talk can help you understand what you need to know to get started using Cassandra. We’ll sort out the terminology and foundational concepts, and then dive into a practical set of ways to start putting Cassandra to work in your applications today.

  • Enterprise NoSQL: Silver Bullet or Poison Pill?

    by Billy Newport

    NoSQL has become the latest darling technology. We will examine its roots, why it became popular in that context, and whether it can extend its reach into mainstream enterprise applications.

    Coverage slide deck

  • HyperGraphDB - Data Management for Complex Systems

    by Borislav Iordanov

    While the problem of handling massive amounts of data has been at the forefront of database research both in industry and in academia, addressing the complexity of domain models has remained solely a concern of application architects forced to align often highly incompatible problem and solution domains.

    HyperGraphDB is a database with a unique memory/data model based on generalized hypergraphs. Those are graphs where edges can point to an arbitrary number of nodes and even to other edges. Higher-order relationships are thus expressed naturally, which solves most headaches related to domain data modeling. Entities (nodes and edges) have arbitrary values managed by a comprehensive type system that is itself embedded as a hypergraph.
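
    As a rough illustration of the generalized hypergraph idea described above, here is a minimal in-memory sketch in Python. It is not HyperGraphDB's API (HyperGraphDB is a Java library); the names `Atom`, `HyperGraph`, `add`, `get`, and `incident` are hypothetical and exist only for this example. The one point it tries to show is that a link can target any number of atoms, including other links.

```python
from dataclasses import dataclass
from itertools import count
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Atom:
    """A node or a link. A node has no targets; a link has one or more."""
    handle: int
    value: Any
    targets: Tuple[int, ...] = ()


class HyperGraph:
    """Hypothetical toy store: every atom gets an integer handle."""

    def __init__(self) -> None:
        self._atoms: Dict[int, Atom] = {}
        self._incidence: Dict[int, List[int]] = {}  # handle -> links pointing at it
        self._handles = count(1)

    def add(self, value: Any, *targets: int) -> int:
        """Add a node (no targets) or a link (targets may be nodes or links)."""
        for target in targets:
            if target not in self._atoms:
                raise KeyError(f"unknown target handle: {target}")
        handle = next(self._handles)
        self._atoms[handle] = Atom(handle, value, tuple(targets))
        self._incidence[handle] = []
        # Record the new link once per distinct target.
        for target in dict.fromkeys(targets):
            self._incidence[target].append(handle)
        return handle

    def get(self, handle: int) -> Atom:
        return self._atoms[handle]

    def incident(self, handle: int) -> List[int]:
        """Handles of all links that point at the given atom."""
        return list(self._incidence[handle])


g = HyperGraph()
alice, bob, carol = g.add("Alice"), g.add("Bob"), g.add("Carol")
meeting = g.add("met-with", alice, bob, carol)  # one link, three targets
claim = g.add("reported-by", meeting, carol)    # a link that targets another link
```

    Because links are atoms like any other, the higher-order statement "Carol reported the meeting" needs no extra join entity; it is simply a link whose first target is the `meeting` link.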

    In a sense, HyperGraphDB is a dynamic-schema database general enough to easily accommodate any meta-model and integrate entities of different formal representations while maintaining high performance through aggressive indexing. In that respect, it is as much a knowledge management system suitable for AI applications as it is a database for conventional enterprise systems. Key to such capability are its open architecture and extremely general formal basis.

    In this talk, I will present some of the more interesting aspects of the HyperGraphDB architecture and discuss some of the subtleties in balancing generality, practicality, and efficiency in such an open-ended, yet highly organized memory model. I will compare it to other graph databases and put it in the larger context of the recent NoSQL movement.

  • NoSQL At Twitter

    by Kevin Weil

    Non-relational data stores are growing in popularity, due both to the massive growth in the size of business datasets and to non-traditional access patterns caused by, for example, social graphs. Twitter faces both challenges, so it is no surprise that we are making increasing use of NoSQL systems such as Hadoop, Cassandra, Redis, and our own open source social graph store, Flock. In this presentation, I will focus on how we use these systems at Twitter, with specific examples of where we ran into problems with a traditional MySQL-based architecture.

  • Panel: "Non-Relational Data Stores"

    by Mike Malone, Chris Biow, Roger Bodamer, Ken Sipe, Steve Harris and Rusty Klophaus

    This panel will be moderated by Ken Sipe and will focus on the future of NoSQL and other non-relational data stores.

  • Real World Modeling with MongoDB

    by Steve Smith

    Learn to break out of the habits of relational databases and model your data in a better, more meaningful way using MongoDB. Find out where the flexibility of MongoDB can let you rethink how you interact with your data, and how this flexibility makes your application cleaner, faster, and better.

    Coverage slide deck

  • Riak: From Small to Large

    by Rusty Klophaus

    Riak ( http://wiki.basho.com ), a Dynamo-inspired, open-source key/value datastore, was built to scale from a single machine to a 100+ server cluster without driving you or your operations team crazy. This presentation points out the characteristics of Riak that become important in small, medium, and large clusters, and then demonstrates the Riak API via the Python client library.

    Coverage slide deck

  • Unifying the Search Engine and NoSQL DBMS with a Universal Index

    by Chris Biow

    In contrast to single-function architectures, MarkLogic Server takes an unusual approach to collapsing the usual hierarchies of types of servers that make up a complete application, combining Search, a NoSQL DBMS, and an application server in a single kernel. The computational foundation for this hybrid is the Universal Index.

    In this talk, we'll begin with the familiar text indexing data structures and algorithms that underlie search engine technologies. We'll extend that index to cover document structure and semantics, add scalar range indexing in one and two dimensions (including geospatial application), and then incorporate "reverse" indexing of queries. We will demonstrate a novel type of "matchmaking" query whose evaluation is based on a composition of forward and reverse index evaluation, in a true "strange loop" path through abstraction levels. Finally, we'll explore the means by which all of this indexing may efficiently run concurrently with querying, using MultiVersion Concurrency Control and Log-Structured Merge Trees, providing ACID transactions together with lock-free query evaluation, built-in sharding, terabyte-per-server scale-out, replication, and query distribution.
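
    To make the forward/reverse distinction above concrete, here is a small Python sketch. It is not MarkLogic's implementation or API; the classes `ForwardIndex` and `ReverseIndex`, the whitespace tokenizer, the all-terms-must-match query semantics, and the sample data are all assumptions made for illustration. A forward index maps terms to documents, a reverse index maps terms to stored queries, and one possible reading of a "matchmaking" query is the intersection of the two lookups.

```python
from collections import defaultdict


def tokenize(text):
    """Naive tokenizer (assumption): lowercase, split on whitespace."""
    return set(text.lower().split())


class ForwardIndex:
    """term -> ids of documents containing that term."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Documents that contain every term of the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        return set.intersection(*(self.postings.get(t, set()) for t in terms))


class ReverseIndex:
    """term -> ids of stored queries that mention that term."""

    def __init__(self):
        self.queries = {}
        self.by_term = defaultdict(set)

    def register(self, query_id, query):
        terms = tokenize(query)
        self.queries[query_id] = terms
        for term in terms:
            self.by_term[term].add(query_id)

    def matching_queries(self, text):
        """Stored queries that the given document text satisfies."""
        doc_terms = tokenize(text)
        candidates = set()
        for term in doc_terms:
            candidates |= self.by_term.get(term, set())
        return {q for q in candidates if self.queries[q] <= doc_terms}


# Hypothetical data: each person has a profile and a query describing whom they seek.
profiles = {"ann": "python developer chicago", "bob": "java developer boston"}
wants = {"ann": "java developer", "bob": "developer chicago"}

forward, reverse = ForwardIndex(), ReverseIndex()
for person, text in profiles.items():
    forward.add(person, text)
for person, query in wants.items():
    reverse.register(person, query)


def mutual_matches(person):
    """People whose profile fits my query AND whose query fits my profile."""
    they_fit_me = forward.search(wants[person])
    i_fit_them = reverse.matching_queries(profiles[person])
    return (they_fit_me & i_fit_them) - {person}
```

    In this toy data, `mutual_matches("ann")` evaluates Ann's query against the forward index and Ann's profile against the reverse index, then keeps only the people found by both.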

    We will conclude with examples of production applications built on this architecture for geospatial service discovery at warriorgateway.org, social networking at bx.businessweek.com, and knowledge management on a US Air Force application.

    Coverage slide deck