by Evan Elias
When correctly architected, horizontal partitioning of data can lead to dramatically better fault tolerance and isolation, while significantly improving the manageability and flexibility of a web site. A well-sharded MySQL architecture can perform comparably to newer NoSQL solutions, while still offering the expressiveness of SQL and the time-tested robustness of InnoDB.
The road to complete sharding of Tumblr’s primary data has been a long one, and I’ll discuss how, when, and why we transitioned from a single database, to several functionally-partitioned datasets, and finally to a massively horizontally-sharded architecture.
Along the way, we’ve developed a set of tools that has made the sharding process significantly easier to execute and monitor. The bulk of this session will focus on the practical aspects of sharding, digging into the nitty-gritty of the tools we use to:
The session will broadly appeal to engineers at rapidly-growing, MySQL-backed sites. The information presented is all drawn from first-hand experience, and mostly cannot be found in existing books or web sites. Some level of audience familiarity with MySQL is assumed, but previous experience with sharding, InnoDB internals, or MySQL replication is not required.