There’s been a surprising amount of drama (in some circles, at least) about database technology recently. I shouldn’t be surprised, given the volume of reactions to the “I Want a New Datastore” post that I wrote. (Hint: I still hear from folks pitching the newest data storage systems.)
The two things that caught my eye recently involve Cassandra and MongoDB (and, indirectly, MySQL). First was what I read as a poorly thought-out and whiny critique of MongoDB’s durability model: “MongoDB Performance & Durability.” Just because something is the default doesn’t mean you have to use it that way. Thankfully there was reasoned discussion and reaction elsewhere, including the Hacker News thread about it.
Look. Building fast, feature-rich, scalable systems is Really Hard Work. You’re always making tradeoffs. You can have the ultimate in single-server durability (with all the fancy hardware that dictates) but you’re going to really sacrifice performance (or budget!). But at least you won’t have a lot of complexity. Or you can build something that scales out really well using many machines. But that adds a lot of complexity and different sacrifices.
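To make the durability-versus-performance tradeoff a bit more concrete, here is a minimal sketch in Python. It is not how MongoDB, Cassandra, or MySQL actually persist data; it only contrasts forcing every write to disk with letting the operating system buffer writes. The function names (write_records, timed_write) and the record format are made up for illustration.

```python
import os
import tempfile
import time


def write_records(path, records, fsync_each):
    """Append records to path; optionally force each one to disk before continuing."""
    with open(path, "ab") as f:
        for record in records:
            f.write(record + b"\n")
            if fsync_each:
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to push it to the device
    return len(records)


def timed_write(records, fsync_each):
    """Return (seconds elapsed, number of lines found in the file afterwards)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        start = time.perf_counter()
        write_records(path, records, fsync_each)
        elapsed = time.perf_counter() - start
        with open(path, "rb") as f:
            lines = sum(1 for _ in f)
        return elapsed, lines
    finally:
        os.remove(path)


if __name__ == "__main__":
    # Hypothetical workload: 200 small records.
    records = [b"record-%d" % i for i in range(200)]
    for mode in (False, True):
        elapsed, lines = timed_write(records, fsync_each=mode)
        print(f"fsync_each={mode}: {lines} records in {elapsed:.4f}s")
```

On most hardware the fsync-per-write run will be noticeably slower, though by how much depends heavily on the disk, the filesystem, and any caching in between. Both runs end up with the same data in the file; the difference is how much could be lost if the machine died partway through.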
Next comes the Twitter Engineering blog post “Cassandra at Twitter Today,” in which we learn that Twitter loves Cassandra but is opting to use its sharded MySQL infrastructure for storing tweets. This surprised a lot of people and even became “news” at TechCrunch. To me, it’s hardly surprising. The long version of why I say that is captured in the Reddit comments on the story.
But if you’re not interested in reading the 80+ comments currently there, maybe I can simplify it a bit. Have you ever wondered why there are so damned many NoSQL systems out there?
Simple. Different circumstances dictate making different choices when presented with the list of tradeoffs. This includes durability, performance, data model, scalability, richness of query language, replication model, atomicity, indexing, transactions, administration and support, etc.
I’m not saying any of this to promote MySQL and knock Cassandra or MongoDB. I lost at least a day of work last week due to some legacy MySQL issues that seem completely insane in the modern world. But years ago those issues were edge cases. Nowadays they’re very easy to hit.
I’ve actually spent some time recently playing with both Cassandra and MongoDB in the hopes of replacing a big (in data size, not query volume) MySQL cluster. Both are impressive (and frustrating) in different ways. But ultimately, I do expect that one of them will work quite nicely in this role, and possibly others later on. Not having to contemplate another multi-week ALTER TABLE will be a welcome change!
Which one? Stay tuned. :-)
Maybe what I should do in the meantime is spend more time reading stories about what works WELL for people, instead of how they’re unhappy with their choice of tool. All this drama is a real time sink.