by Nathan Marz
Storm makes it easy to write and scale complex realtime computations on a cluster of computers, doing for realtime processing what Hadoop did for batch processing. Storm guarantees that every message will be processed. And it’s fast — you can process millions of messages per second with a small cluster. Best of all, you can write Storm topologies using any programming language.
Storm has a wide range of use cases. The basic use case is “stream processing”: processing a stream of new data and updating databases in realtime. Unlike the standard approach of doing stream processing with queues and workers, Storm is fault-tolerant and scalable.
Another use case is “continuous computation”: streaming the results of a query to clients to visualize in realtime. An example is streaming trending topics on Twitter into browsers.
A third use case is “distributed RPC”: parallelizing a computationally intense query on the fly. With distributed RPC, a Storm topology acts as a distributed function that you can invoke like a normal function.
In this talk, I’ll release Storm as open source. I’ll show how Storm’s simple programming model makes realtime computation easy, robust, and even fun.
18th–20th September 2011