by Erik Meijer
For the past decade, I have been on a quest to democratize the development of data-intensive distributed applications. My secret weapon for slaying the complexity dragon has been category theory and monads, and in particular the concept of duality. As it turns out, the data domain is an extremely rich source of all kinds of interesting dualities. These dualities are not just theoretical curiosities; they actually solve many practical problems and help to uncover deep similarities between concepts that at first look totally unrelated. In this talk I will illustrate several of the dualities I have encountered during my journey, and show how they culminated in a novel paper, “A co-Relational Model of Data for Large Shared Data Banks”.
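One duality central to this line of work is the one between pull-based enumeration and push-based observation: reversing the direction of the arrows in an iterator interface yields an observer interface. The sketch below illustrates the idea in miniature; the class and function names are purely illustrative, not from the talk.

```python
# A toy sketch of the pull/push duality: an iterable is pulled from,
# while an observable pushes values to an observer. The observer
# interface is obtained by "reversing the arrows" of iteration.

class Observer:
    """Push-based consumer: the dual of a pull-based iterator."""
    def __init__(self):
        self.received = []
        self.done = False

    def on_next(self, value):    # dual of next() returning a value
        self.received.append(value)

    def on_completed(self):      # dual of exhausting the iterator
        self.done = True


def to_observable(iterable):
    """Turn a pull-based iterable into a push-based subscription."""
    def subscribe(observer):
        for value in iterable:       # pulling from the source...
            observer.on_next(value)  # ...becomes pushing to the observer
        observer.on_completed()
    return subscribe


obs = Observer()
to_observable([1, 2, 3])(obs)
print(obs.received)  # [1, 2, 3]
```

The same sequence of values flows through both shapes; only who drives the computation changes, which is exactly the kind of symmetry duality makes visible.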
by Nathan Marz
Storm makes it easy to write and scale complex realtime computations on a cluster of computers, doing for realtime processing what Hadoop did for batch processing. Storm guarantees that every message will be processed. And it’s fast — you can process millions of messages per second with a small cluster. Best of all, you can write Storm topologies using any programming language.
Storm has a wide range of use cases. The basic use case is “stream processing”: processing a stream of new data and updating databases in realtime. Unlike the standard approach of doing stream processing with queues and workers, Storm is fault-tolerant and scalable.
Another use case is “continuous computation”: streaming the results of a query to clients for realtime visualization. An example is streaming trending topics on Twitter into browsers.
A third use case is “distributed RPC”: computing an intensive query on the fly, in parallel. With distributed RPC, a Storm topology is a distributed function that you can invoke like a normal function.
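The basic “stream processing” use case above can be pictured with a toy, single-process sketch: consume a stream of messages and update a store as each one arrives. This is not Storm’s actual API (a real topology distributes this work across a cluster); the names here are purely illustrative.

```python
# Toy sketch of the stream-processing pattern: each incoming message
# updates a running word count, the way a Storm topology would update
# a database in realtime as tuples flow through it.

from collections import defaultdict

def process_stream(messages, counts):
    """Update word counts as each message arrives on the stream."""
    for message in messages:
        for word in message.split():
            counts[word] += 1

counts = defaultdict(int)
process_stream(["storm processes streams", "storm is fast"], counts)
print(counts["storm"])  # 2
```

In Storm, the loop body would run inside parallel workers with the framework handling partitioning, fault tolerance, and message-processing guarantees.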
In this talk, I’ll release Storm as open-source. I’ll show how Storm’s simple programming model makes realtime computation easy, robust, and even fun.
18th–20th September 2011