I Want a New Data Store

While there is a dizzying array of technologies that have the “NoSQL” label applied to them, I’m looking for one to replace a MySQL cluster. This particular cluster has roughly a billion records in it, uses a few TB of disk space, and is growing all the time. It is currently a MySQL master with several slaves and handles a fairly light query volume. The reasons we’d like to change it are:

  • ALTER TABLE takes an unreasonably long time, so we can’t add or remove indexes or columns. Changes take over a month.
  • The data footprint is large enough that it requires RAID and we seem to be really good at breaking RAID controllers.
  • Queries are not very efficient, due partly to the underlying data organization, and due partly to the sheer amount of it compared to the available RAM.
  • The data really isn’t relational anymore, so a document store is more appropriate.  It’s just that when it was set up, MySQL was the tool of choice for just about all data storage.

I’ve spent some time looking around at several options, including MongoDB, CouchDB, and Cassandra. And I like aspects of all of them. If I could pick and choose features from all of them, here’s what it might look like:

  • The high-level abstractions provided by CouchDB and MongoDB. Cassandra makes you think more about low-level details and performance (which can be good).
  • The performance of Cassandra. By all accounts it is very fast and got faster in the 0.6 release.
  • A clear understanding of the performance and storage tradeoffs in various schema/indexing designs.
  • The “no single point of failure” and multi-machine replication and sharding features of Cassandra. They’re VERY compelling. CouchDB Lounge is a step in the right direction, but I’d rather see it as part of the core system.
  • Map/Reduce support for ad-hoc data analysis and queries.
  • Persistent indexes to speed our most common queries (see the sketch after this list). Cassandra’s approach feels very “roll your own”, which is consistent with its lower-level nature. CouchDB’s “all views, all the time” model feels a bit odd too. MongoDB seems to get this right, providing traditional indexes and support for more ad-hoc operations similar to CouchDB.
  • The documentation of MongoDB. CouchDB’s is pretty good too, and there are books available. Cassandra’s docs require a bit more trial and error on the developer’s part, since things change with each release.
  • A corporate entity that can provide support, consulting, and possibly custom development. Both CouchDB and MongoDB have this. Cassandra is more of a community project; the bulk of contributions come from developers employed by tech companies, but none of those companies appear to be in the business of supporting Cassandra itself.
  • I’d love to be able to influence the organization of records on disk so that the most common queries will require very few seeks. I’d like to cluster around a particular key that may not be the primary key. I can see ways to do this with Cassandra. I’m not sure about the low-level details of MongoDB or CouchDB.
  • I’d love native compression for large text fields. A substantial portion of our data in this cluster is text.
  • Full-text indexing would be a nice-to-have. We’re pretty good with Sphinx already, but having reasonable full-text indexing integrated would simplify things.
  • A good Perl API.
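
To make the persistent-index item above a bit more concrete, here is a minimal sketch of the kind of thing I have in mind, written against the MongoDB Perl driver. The connection details, collection, and field names are invented for illustration, and I’m going from the driver’s documented interface rather than anything we actually run:

  use strict;
  use warnings;
  use MongoDB;
  use Tie::IxHash;

  # hypothetical connection and names; adjust for a real deployment
  my $conn  = MongoDB::Connection->new(host => 'localhost', port => 27017);
  my $posts = $conn->get_database('cl')->get_collection('posts');

  # a persistent compound index so "newest posts in a category" touches few pages
  $posts->ensure_index(Tie::IxHash->new(category => 1, posted_at => -1));

  # the common query the index is meant to serve
  my $cursor = $posts->query({ category => 'apartments' })
                     ->sort(Tie::IxHash->new(posted_at => -1))
                     ->limit(20);
  while (my $doc = $cursor->next) {
      print "$doc->{title}\n";
  }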

Given all that, what else should I be looking at? What misconceptions do I have? What’s your experience been with any or all of them?

About Jeremy Zawodny

I'm a software engineer and pilot. I work at craigslist by day, hacking on various bits of back-end software and data systems. As a pilot, I fly Glastar N97BM, Just AirCraft SuperSTOL N119AM, Bonanza N200TE, and high performance gliders in the northern California and Nevada area. I'm also the original author of "High Performance MySQL" published by O'Reilly Media. I still speak at conferences and user groups on occasion.

81 Responses to I Want a New Data Store

  1. Here are some hopefully helpful links.

    http://github.com/rnewson/couchdb-lucene

    http://spyced.blogspot.com/2010/04/and-now-for-something-completely.html

    (I’m mostly a CouchDB man, but I have no performance needs to speak of. I love its elegance. Also, something like Lounge will eventually get integrated; the only unsure thing is the timeline.)

    Not sure what’s odd about the “all views, all the time”? It’s been very powerful for what I’ve done with it.

  2. Lukas says:

    Good summary that also mirrors my perception .. aka Cassandra is raw but powerful, while MongoDB and CouchDB are higher level and polished for ease of use.

    As for fulltext .. afaik CouchDB has Lucene integration.

  3. For what it’s worth, Cassandra has corporate backing now from http://riptano.com/, which may lead to fixing a couple of your issues.

  4. I’d like to see “Elastic Search” included in your review 🙂

  5. Roger says:

    I have a project involving about 10 million records which is 2GB of raw JSON. I started with CouchDB as it is very friendly but ultimately had to give up on it due to implementation details. I love the way CouchDB does replication. Just point instances at each other and things effectively flood fill. Unfortunately the file format is append only which means any change consists of appending changed btree nodes and a new root node in addition to the data. This resulted in humungous disk consumption (eg 25GB for data, 43GB for views). You can mitigate it somewhat by presorting data. Compaction will reclaim space but you need both the before and after space, and you need it on every server instance when replicating. Additionally performance isn’t very good. Loading the data on a 4 core machine with raid 0 disks took 75 minutes of elapsed time. MongoDB took 6 CPU minutes (I didn’t measure elapsed time as it is so quick).

    I do like MongoDB a lot better and it uses a reasonable amount of space (6GB for my data and indices) and has great performance. The replication options are single master with one or more slaves and various forms of automatic failover and master promotion. They also take a different view on individual machines. For example CouchDB and other traditional databases try their hardest to ensure that if the power is pulled etc then the database can recover on boot. MongoDB says not to trust the machine – if something untoward happens then you discard the machine and certainly do not trust the data on it. And for practical reasons it is pointless to do so anyway since the time for machines to reboot is too long, and the single most likely cause of problems is machine hardware issues – ie you can’t trust it anyway. See http://blog.mongodb.org where there are lots of postings about distributed concurrency and a February 10th 2010 posting about durability with better explanations. The March 14th 2010 posting about query language is good too. MongoDB’s performance for searches even without indices is excellent while CouchDB takes way too long to generate views for non-trivial numbers of documents.

    My recommendation based on experience so far is to use CouchDB when you need offline mode and personal use, and you count your documents in the thousands.

    Use MongoDB when you have multiple servers available, no disconnected use, and count your documents in the millions or more.

    It does take a while to get your head around how to use these effectively. For example, doing ‘OR’ queries is practically impossible in either (short of precalculating them with map/reduce). The classic design problem is how you would store a blog – in particular the relationship between postings and comments. This post is a great example of the thinking http://www.cmlenz.net/archives/2007/10/couchdb-joins and is applicable to both CouchDB and MongoDB (a MongoDB-flavoured sketch follows at the end of this comment).

    For a while I did have an abstraction layer in my code that allowed transparent access to both CouchDB and MongoDB. Last week I had to rip it out as a lowest common denominator behaviour was far more trouble than it was worth.
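
    For MongoDB, the shape of that posting/comments solution is roughly: keep comments as their own documents that carry the ID of the post they belong to, index that field, and pull them with one query. A hypothetical sketch with the Perl driver (all names invented):

    use strict;
    use warnings;
    use MongoDB;
    use Tie::IxHash;

    my $db       = MongoDB::Connection->new(host => 'localhost')->get_database('blog');
    my $comments = $db->get_collection('comments');

    # comments are separate documents that point at their post
    $comments->ensure_index(Tie::IxHash->new(post_id => 1, created => 1));
    $comments->insert({ post_id => 'post-1', author => 'roger', body => 'Nice post.', created => time() });

    # "join" at read time: one indexed query returns a post's comments in order
    my $cursor = $comments->query({ post_id => 'post-1' })
                          ->sort(Tie::IxHash->new(created => 1));
    while (my $c = $cursor->next) {
        print "$c->{author}: $c->{body}\n";
    }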

  6. Sundar says:

    I believe we cannot get everything. You have to trade off some of your wishes for the others.

  7. Kristina says:

    I maintain the MongoDB Perl driver, and in my unbiased opinion it’s a terrific interface 🙂 Seriously, though, it does have a great community, it’s pretty mature, and I think it is fairly easy to use. (http://search.cpan.org/dist/MongoDB/)

    It would be interesting to do a speed comparison… I would imagine MongoDB is actually faster than Cassandra.

    Also, just so you know, MongoDB never, ever makes you have a single point of failure. Every node can have at least one (and, in the near future, arbitrarily many) backup nodes with automatic failover.
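
    For anyone curious what the Perl interface looks like, here is a minimal, hypothetical sketch based on the driver’s documented API (names and values are made up):

    use strict;
    use warnings;
    use MongoDB;

    my $conn = MongoDB::Connection->new(host => 'localhost', port => 27017);
    my $coll = $conn->get_database('test')->get_collection('people');

    # insert a document and read it back
    $coll->insert({ name => 'kristina', langs => [ 'perl', 'c' ] });
    my $doc = $coll->find_one({ name => 'kristina' });
    print "found: $doc->{name}\n";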

  8. Tracy says:

    Just came across this re: commercial cassandra support:

    Riptano is the leading expert for Apache Cassandra, providing support, training, and professional services for all things Cassandra. Riptano is obsessed with providing great customer service.

    http://riptano.com/

  9. Joe Hildebrand says:
  10. Roger says:

    @Joe: The problem with Riak is that it is not completely open source. I certainly understand why they keep some stuff back from a business point of view, but it also means that you are completely at their mercy for those features. There won’t be alternatives that can jump in and help should you need them.

    Jeremy didn’t state open source requirements in the list, but all the candidates listed are 100% open. Perhaps a posting on his views especially after the MySQL experience would be apropos.

    • fedesilva says:

      Can you provide links to the parts that are not open? If I understand correctly, only these are not:
      + Management Tools
      + SNMP Monitoring Support
      + Masterless multi-site replication

      All of them could be provided by a third party; maybe the third is harder.

      I am currently looking at Riak and would like to know more about what people think of it being a single company’s product.

      • Justin Sheehy says:

        Your understanding is correct. More is on the way as well, and we welcome additions from others. Given the Apache license, nothing at all prevents others from providing such things.

        Even though Riak is in one sense a single company’s product, it has a healthy community and has taken improvements from a number of outside contributors.

    • Justin Sheehy says:

      @Roger: If someone added something of their own on top of something you consider “100% open” and sold that addition, would you consider the original thing no longer entirely open source?

      Riak provides an entirely capable open source solution, with all of the features a developer would want included and available under the Apache 2 license.

      Enterprise customers of Basho get some add-ons that big companies typically want and developers couldn’t care less about, such as monitoring tools, many-cluster replication, and so on. Such features developed at customer request sometimes go directly to open source; others are proprietary at first and later migrate into the open source trunk.

      • Roger says:

        I would consider the non-open additions to not be open. Using your monitoring tools as an example, no one else except Basho can jump on a plane/get in a car and come over to troubleshoot, produce patches etc. There were/are companies providing that sort of thing for MySQL.

        I have no idea what Jeremy’s requirements in this area are and there are obviously people who are happy with things the way they are. Heck people even buy completely proprietary stuff!

        My own preferences as a customer are that I prefer open source over proprietary, and that I prefer 100% open source for all parts vs mostly open source. But these are not deal breakers. They do however ensure the least amount of dependence on a single entity.

      • For the record, Open Source is a very big plus for our use. We use a lot of Open Source software–in fact, we’re close to being an entirely open source shop for our production code (unless I’m missing something in my mental inventory).

  11. Marshall says:

    There was a NoSQL Smackdown panel at SXSW in which advocates for Cassandra, CouchDB, and MongoDB went head-to-head in a rip-roaring debate. Lots of great details surfaced in the discussion.

    Audio and video of this are hosted over on the Changelog:

    http://thechangelog.com/post/457259567/episode-0-1-8-nosql-smackdown

  12. Well, ignoring everything but the Perl API part: my company is willing to work with you on trying KiokuDB plus the MongoDB backend (or one of the others) to satisfy your requirements here. Note that we didn’t write KiokuDB, but we are friends with the original authors and early adopters.

    Kioku is an Object Graph store written in Moose (Perl’s Post-Modern Object System) and it invisibly persists Moose objects. I’ve not yet had to scale it past a couple million records, nor used it in a distributed fashion, but I would love to help solve those problems.

  13. John Allspaw says:

    Jeremy –

    Any reason why you wouldn’t just shard the data that you have, using MySQL? Yes, you’d have to do some work on the application side (a rough sketch of that kind of routing follows this thread), but you’d gain greater control over where the data is stored (on which shards) as opposed to giving it all up to the datastore. I know at least one site that’s got about 30TB of MySQL data, sharded, whose ALTERs don’t take long at all. 🙂

    Outside of that, we’re seeing (non-sharded, mind you) excellent performance and behaviour with MongoDB at Etsy, FWIW.

    • Thanks for the MongoDB comment, John.

      As for the sharding of MySQL, we *could* do that, but I’m trying to explore the other options that are out there. I have this nagging sense that There Are Better Ways *and* this whole cluster needs a lot of love anyway, so I might as well cast a wide net before pouring a lot of time and energy into fixing it.
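
      For the record, the application-side work John means is roughly this sort of key-to-shard routing. A toy sketch (the DSN list and hashing choice are hypothetical, not how we’d necessarily do it):

      use strict;
      use warnings;
      use Digest::MD5 qw(md5_hex);

      # hypothetical shard list; in reality this would come from configuration
      my @shards = map { "DBI:mysql:database=posts;host=db$_" } 1 .. 8;

      # map an application key (e.g. a posting ID) to a shard deterministically
      sub shard_for {
          my ($key) = @_;
          my $bucket = hex(substr(md5_hex($key), 0, 8)) % @shards;
          return $shards[$bucket];
      }

      print shard_for('posting:12345'), "\n";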

  14. Pingback: I Want a New Data Store (for Craigslist)

  15. Ian Kallen says:

    I’ve been most impressed with Cassandra’s peer-to-peer architecture; I think it makes incremental scaling much smoother. The writes-never-fail aspect of Cassandra is also very attractive. I’m dealing with large analytics requirements and I’m implementing a hadoop-to-cassandra pipeline, ask me in a week or so how it’s going 🙂

    While I agree that MongoDB has much nicer interfaces, cool operators (atomic addToSet, increment, etc) and way better documentation, relying on master/slaves or master/master topologies doesn’t impress me for HA/FO.
    -Ian

    • Totally agreed on the writes-never-fail and the peer architecture. Those are fantastic features that I’d *love* to see in so many other systems.

    • luckytaxi says:

      My understanding is that MongoDB should have sharding pretty soon. Of course if you need it now, you’re out of luck for the time being.

  16. joff says:

    Have you thought of swapping out MySQL for another relational database that isn’t completely rubbish? I’m betting that any of PostgreSQL, MS SQL or Oracle would be up to the task (my personal preference being Postgres).

    You say your data isn’t really relational anymore – is this the result of having to remove constraints because they were too slow in MySQL? In any case, this might buy you some time while you consider your next move.

  17. Doug Judd says:

    How about a Bigtable implementation like Hypertable? Bigtable underpins over 100 services at Google. Hypertable is a faithful implementation of Bigtable and is very high performance. It’s fully consistent: when you successfully write to it, you can rest assured that the data will be there when you later ask for it. ALTER TABLE is instantaneous. It supports what are known as ACCESS GROUPS, which give you control over the physical layout of columns on disk. The system stores data compressed (the default compressor is lzo) and you can configure it to use the bmz compressor, which results in a great compression ratio for large blocks of similar text such as crawl data. Prefix compression of keys has been implemented and will be available in the next release. Hypertable does have a single Master, but no client data moves through the master, so it can be down for short periods without anyone noticing. And it has a Perl interface. The documentation is quite good (see http://www.hypertable.org/documentation.html). And there is a commercial support organization (Hypertable, Inc.) that provides 24/7 support and consulting.

  18. I agree with the suggestion that PostgreSQL is worth a look. But one of the NoSQL options might be better for you.

  19. Jeremy, have you seen this old article on how FriendFeed handled this problem:

    http://bret.appspot.com/entry/how-friendfeed-uses-mysql

    It’s a bit low-tech compared to using a fancy new datastore, but low-tech (ie simple, reliable) seems like a nice quality for something as important as Craigslist. It also meets your needs of having a company behind it, and you already presumably have lots of experience with MySQL.

    • Hi Aaron,

      Yeah, I’ve definitely seen that posting. In fact, I remember the day it was first published. 🙂

      The situation I’m in now is that no matter what we do, it’s gonna be a bunch of work. So I’m trying to figure out what the options look like.

      Thanks for the pointer.

  20. San says:

    The SenSage backend does just about everything you asked for (distributed Perl too). Not free though.

  21. Paul Prescod says:

    There is work under way on secondary indexes in Cassandra. The JIRA issue CASSANDRA-749 has patches for it.

    Cassandra 0.7 is integrated with Hadoop for map/reduce.

    My gut feeling is that Cassandra is taking the right approach of building in performance and scalability first and then adding features on top. It’s the opposite of the approach that SQL databases take, and therefore offers you a very clear choice. CouchDB is more of a middle ground, which is fine, but I’m glad someone is taking a radically different approach.

  22. Jacques Chester says:

    I would blame MySQL before I blamed relational algebra.

    I realise that NoSQL is shiny and such, but remember that you will simply wind up reimplementing consistency guarantees yourself; only with less testing and more undiscovered corner cases.

    If we were all being honest with ourselves, it would be called ‘NoMySQL’.

  23. Jacques Chester says:

    On reflection, I wish I could subtract about 84.5% of my smug tone on the previous message.

    Is there a particular reason you don’t want to trial other RDBMSes?

    • Well, the short version is that the data is really non-relational at this point and Relational Database systems are built around the idea that your data IS relational. So it seems prudent to look at a non-relational replacement in which to store our non-relational data.

      The longer version would be difficult at this hour… it’s past my bed time! 🙂

  24. sandeep says:

    Looks like you are leaning heavily towards Cassandra because of its performance.

    With the kind of financial resources that Craigslist has, is it not possible to throw a bunch of money at Cassandra and get (for example) MongoDB-like indexes built on top of it?

    I understand this is still an early stage eval – but 50-100k is peanuts to Craigslist for blue sky development.

    Maybe you could fund both MongoDB and Cassandra and choose one !

    • Spending money is not out of the question.

      We’ve paid for MySQL support and also have paid for Sphinx support and development, just to name a couple.

      • sandeep says:

        In that case I suspect you could change the title to “I want a new data store: 5k bounty (each)”

        A bounty is a little more open-ended than paying for support, but it is an interesting social + technical experiment: Even if your features are not patched in.. it could be an indication of community interest + architectural flexibility (or lack thereof)

  25. Steven Roussey says:

    And there is the Cassandra and Lucene mashup “Lucandra”, FYI.

  26. Tit Petric says:

    It was the year 2001. I had barely started out with PHP and eventually MySQL. A client had this wish: to log and analyze its web traffic (a substantial amount of it), and because of a clustered environment, and also the lack of detail in existing solutions, the only option then was to develop our own analytics package with PHP and MySQL. While at first we didn’t have problems scaling, the log data quickly grew and we eventually hit barriers such as disk space availability, concurrency issues (MyISAM …) and then the query speed itself for analyzing data (sum, count, avg …).

    A few years back, when we were still trying to save that sinking ship, I actually started to write a MySQL-backed statistics collection daemon, which took some of the load out of the computation by doing in-memory statistics on each log query that passed through it and putting the results into a rolling-average database. Sure, it wasn’t 100% reliable across power outages and wouldn’t pass an ACID test to save its life – but it was working with what we had.

    While this approach is very technical, it is also very specific. I wasn’t trying to solve the world’s problems – I was trying to save my own. While we eventually gave up (there’s this neat thing now we call Google Analytics), we still do some of our own very specific and detailed analytics. It isn’t much of a problem when you take a step back and figure out what you really need. For us it was a data size and frequency problem, but we didn’t realize until much later that the data we collected was mostly irrelevant and will probably never be looked at in retrospect. If data like this drives business decisions, I find that life goes on even if you reduce the data to a fraction of what it once was, and even then it is probably too big for what people need.

    Sure, a MapReduce algorithm to process all the data and collect meaningful statistics is useful for almost anyone, but unless you’re analyzing DNA strands or doing carrier-grade network tracing (afaik, there are some EU directives that force ISPs to collect about two years’ worth of all ISP traffic) you most probably don’t need what you set out to have – but then again, who the hell doesn’t want a fault-tolerant distributed database with amazing storage, indexing, processing and search performance? I just find that reality means you will end up with a desk piled full of post-it notes and will never achieve database nirvana, since the mighty Cthulhu will eat all your disk space, turn all your CPUs into glowsticks and dance on the ashes of your datacentre. That’s about as realistic as “The Perfect Database”.

  27. Tit Petric says:

    p.s. slightly more on topic (of that digested rant), why don’t you roll your own? I’m sure a small community based effort would give you a more custom-tailored solution which would work perfectly for Craigslist. If I remember correctly, didn’t you roll your own transaction engine on top of MySQL 3.x? It’s not like you don’t have experience in working around problems instead of trying to replace them with a whole new set of problems 🙂

  28. Jeremy,
    After having built and struggled to economically cluster many rapid-growth SQL/MySQL sites ourselves in previous lives, we started looking at all of the above and others well before the term NoSQL emerged. But every option was a compromise, most notably in the re-engineering required and the limited query capabilities. The best of both worlds for MySQL is an embedded, elastically scalable NoSQL storage engine for MySQL, and v5.1 enabled this. At high risk of self-promotion, our view is that SQL-NoSQL coexistence and interoperability is the future, and immediate consistency with full self-healing replication is a prerequisite. The solution we developed, GenieDB, is not (yet) open source, but if your site is relatively simple and you don’t mind your developers changing the way they or their apps work, then my vote is to do as John Quinn did when they Cassandra’d Digg – but it took 6 months to do so! I would watch this space for the new generation of cloud-compatible, scalable datastores that speak SQL, just around the corner.
    Kind regards, Dr Jack Kreindler

  29. Andraz Tori says:

    Hi Tit,
    fun description of that project :), and I have to agree with the solution – “just fix your problem, don’t save the world”.

    However having your data in a datastore like Hadoop and being able to do novel analysis with Hive/Pig/your own map-reduce, that really can make the product development more agile (in terms of ability to gather and analyze data on it). It’s liberating.

    Otherwise, if MySQL would just do a bit more, that wouldn’t hurt either… online ALTER TABLE, function-based indexes, pure hash indexes (when you _know_ your keys aren’t orderable), joint-index-stores, per-column compression. And fine tuning of how big blobs are handled. Ah, one can dream.

  30. Tit Petric says:

    @Andraz: mainly I would like to see some function-based indexes also maintained in a rolling-average fashion. Things like:

    select avg(render_time) from stats_pageloads;

    This should be recognized by the DB engine (it could also be specified manually in configs) and be answered in a way which doesn’t even need a table or index scan, except for the first time the query is seen.

    All you need for the above is

    1) a handler which catches the queries going into the stats_pageloads table and
    2) keeps count of the number of inserted records so it can update a
    3) total render_time counter!

    so when an engine would see an

    insert into stats_pageloads (render_time) values (0.012);

    it would update a rolling average counter:

    count ++;
    total_render_time+=render_time;

    and return the total_render_time / count;

    And voila – no table scan, no index scan. Except for the first time, of course. 🙂

    Sum, count, avg, math – all could be done in this fashion, but as far as I am aware, nobody is doing this. It might not be what you want, but in some cases, it can be quite useful, to bring down the load on the database.

    The memory footprint should be quite low (we’re talking way less than 1MB since we’re talking about single row results) and in fact it could use the same memory pool as the MySQL query cache – depending on how hard it would be to update it :).

    If you also consider UPDATE and DELETE statements, these values can be updated too, of course with additional cost at the write level, since each row has to be matched against the query’s WHERE clauses.

    I can only dream of a database this nice 🙂
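
    In plain Perl, the bookkeeping I am describing amounts to nothing more than this (a toy sketch that ignores persistence, UPDATE/DELETE handling and WHERE-clause matching):

    use strict;
    use warnings;

    my %agg;    # per-metric running totals: { count => ..., total => ... }

    sub record {
        my ($metric, $value) = @_;
        $agg{$metric}{count}++;
        $agg{$metric}{total} += $value;
    }

    sub average {
        my ($metric) = @_;
        my $a = $agg{$metric} or return;
        return $a->{count} ? $a->{total} / $a->{count} : undef;
    }

    record('render_time', 0.012);
    record('render_time', 0.020);
    printf "avg render_time: %.3f\n", average('render_time');    # prints 0.016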

  31. pablo says:

    did you consider Redis? [redis.googlecode.com]
    I’m quite happy with it.
    If you did, what is your opinion about it?

  32. Brian says:

    There are other database models that are worth looking at for what you want:
    If you want the fast direct performance of a NoSQL database along with the advantages of virtual metadata and complex data modelling, multivalue databases (MVDBMS) like UniVerse (until recently an IBM product) and OpenQM have been around for years offering all of that and more. Take any industry sector and you will find commercial MVDBMS powering a huge number of mission-critical business applications; nobody ever hears about them because they ain’t sexy and they just work…

    • STEVE says:

      One day, Brian, some bright spark will take the MVDBMS elegance to the cloud.. I totally agree with you. Our systems for 82 travel operators large and small work and are great, but we are having to make them sexy.. MV needs to somehow be sexier? What do you recommend for web application dev for MV?

  33. Larry says:

    Why don’t you buy some Oracle licenses?

  34. jeo says:

    Fwiw, MS SQL adds a new column instantly to a 10-million row table. It seems to be independent of table size.

    That said, it doesn’t seem to me that Craigslist really requires the kind of arbitrary transactions or complex queries that a relational DB provides.

  35. At Cloudant we’ve taken the CouchDB data model, view engine, REST API, etc. and, using open source software, built in a partitioning system inspired by Amazon’s Dynamo. This allows you to spread a DB over hundreds of nodes transparently and elastically, with no single point of failure. This clustering layer is built as a native Erlang OTP application just like CouchDB, and when we open-source it (early Summer), we hope it gains widespread adoption as *the* clustering layer for CouchDB.

    We also conveniently act as the “corporate entity” you mention for providing support, consulting (and hosting). Currently we’re managing distributed CouchDB deployments, in production, for customers quite a bit larger than your requirements. Our search product allows you to upload your Lucene strategy and index your documents in near real-time, and will soon be released as a beta add-on product. If you like the CouchDB interface and philosophy and you’re looking to add scalability, you should check us out.

  36. Nick says:

    Riptano is a new startup providing cassandra support. Brand new company but seems extremely promising in my opinion.

    http://www.riptano.com

  37. Enrique says:

    It seems to me like you’re not really doing your homework… you want to migrate away from MySQL, fine. But you’re only looking at these new alternatives, when there is an easier path; there are other RDBMSs out there far more powerful than MySQL. PostgreSQL comes to mind; did you even consider such options?

  38. Whoops, looks like I opened up a can of worms by posting this to reddit.

    Anyways, I’ve had good luck with MongoDB in development, although it does require 64-bit hosts (there go most virtualized environments and older hardware) and doesn’t break/error as gracefully as I’d like. Mongo could use some maturity, for sure.

  39. Lori Fassman says:

    Pardon a sales pitch, but the Caché database by InterSystems is *very* fast, scalable, flexible, and has a long track record of rock-solid reliability. It’s #1 in healthcare worldwide, which means it’s used in mission-critical systems where people’s lives are literally at stake if the systems fall down. It has pretty much all the features you’re looking for. Give a shout if you have any questions!

  40. Ed D. says:

    Ha, Lori beat me to it. I meant UniVerse, however–that is the more robust product found at that same URL.

  41. Pingback: Top Posts — WordPress.com

  42. Frank Leahy says:

    We’re looking at Clustrix, a stealth startup that claims to have solved a bunch of problems with MySql, including on-the-fly schema changes, automatically spreading data across multiple nodes, query parallelization across multiple nodes, eliminating the need for sharding, etc. I believe they’re announcing at Web 2.0 Summit in SF next week.

  43. Will Sargent says:

    If you’re looking for an interim solution to the ALTER TABLE problems that MySQL has, you can look at Tungsten:

    http://www.continuent.com/

    It’s basically a cheaper GoldenGate replication system, and I’ve heard great things about it.

  44. Have you looked at MarkLogic?
    http://www.marklogic.com
    Disclaimer: I work there

    Think of us as a document database with extensive indexing/search capabilities. Our server scales to hundreds of TBs while providing sub-second performance. We are also “schemaless” and update the database and indexes in real time.

    Much more information on our website. You can download a community version here as well:
    developer.marklogic.com

  45. Dave T says:

    Re: Oracle

    Oracle + fast SANs would probably solve most of your problems unless your dataset is really large. Also, while Oracle can provide master-slave setups, it doesn’t have any built-in support for master-master.

    That said, it is terribly expensive.

  46. We are planning to migrate able2know.org from MySQL to Cassandra. As mentioned above, Riptano is a new company providing commercial Cassandra support, and it was founded by the lead Cassandra developer if I am not mistaken.

    Scalr tells me that they are also planning on providing Cassandra as a service, à la AWS’s MySQL service; that should make adoption much easier and is something I’ve been nagging AWS and Rackspace to do.

  47. nunojob says:

    >> Given all that,
    >> what else should I be looking at?
    Given the scenario you described: MarkLogic Server.

    If you are talking semi-structured, multi-TB scale with full-text search, I can’t think of any other company that can do all that off the shelf.

    It’s a commercial product, but the maturity level is incomparable and it does a lot of things that Mongo, Couch or Cassandra simply don’t (and some things they haven’t even thought about yet, like integrated full-text search. Another example would be the same clustered system taking advantage of search and database indexes at the same time – the performance benefits are huge).

    >> Does it scale?
    Yes. It uses techniques that resemble the open-source NoSQL databases.

    >> Open source?
    A lot of things are open (APIs, application services, etc.), everything except the core code. A considerable amount of open-source code is available at github.com/marklogic

    >> Disclaimer
    I work for the company, but I’m talking on my _own_ behalf and based on my technical knowledge of NoSQL products (namely MarkLogic, Couch, Mongo and Cassandra).
    I’m no spokesman for the product or company, nor do my views or opinions reflect the company’s. At all… 🙂

  48. Bradford says:

    Hey there,

    I read your post, and have been there before! My startup is building something very similar to what you desire. It’s built from the ground-up to be distributed, and has integrated full-text search, as well as a query language for traditional RDBMS-ish tasks.

    Drop me a line sometime, would love to pick your brain about what things you’re looking for. I’m an engineer, not a salesman 🙂

  49. Pierce Lamb says:

    Jeremy, don’t rewrite your apps or give up consistency for NoSQL alternatives; get all the benefits of a distributed in-memory solution while maintaining the strength of your MySQL deployment: http://bit.ly/bng6ot

  50. sharon says:

    hi jeremy,

    I need some help. Can you please contact me at my email sharonsingh23@yahoo.com?

  51. Eric says:

    Have you investigated Amazon Web Services?

    http://aws.amazon.com/

  52. Josh says:

    1. ALTER TABLE takes an unreasonably long time, so we can’t add or remove indexes or columns. Changes take over a month.

    First, have you considered performance-testing an RDBMS other than MySQL?

    Second, this sounds like you might benefit from sharding rather than having a monolithic database.

    2. The data footprint is large enough that it requires RAID and we seem to be really good at breaking RAID controllers.

    What does that have to do with your database? Either stop buying cheap storage hardware or drop hardware RAID and rely on a smarter, more flexible solution like ZFS.

    3. Queries are not very efficient, due partly to the underlying data organization, and due partly to the sheer amount of it compared to the available RAM.

    This really sounds like a problem with your schema, not your RDBMS.

    4. The data really isn’t relational anymore, so a document store is more appropriate. It’s just that when it was set up, MySQL was the tool of choice for just about all data storage.

    See my response to #3. Honestly, I fail to see how your data couldn’t be relational. Each post is associated with a locality, a category (maybe multiple categories), and a particular user. I could go on, but I don’t really feel like it.

    You mention wanting things like transparent compression, and, again, I have to point my finger at MySQL. Other RDBMS support compression, encryption, full-text indexing, and more. You can also deal with compression at the file system level (ZFS again, or a similar FS).

    It really sounds to me like your problems are caused by a poor DB design implemented with a less-than-stellar RDBMS. Both were “good enough” when you started, but “good enough” becomes more and more painful as time goes on.

  53. Marshall says:

    Josh — you might want to note that the humble author of this blog not only literally wrote the book on “High Performance MySQL,” but has also worked on several of the world’s largest MySQL installations. If anyone has the depth of experience required to issue a genuine critique of MySQL and participate in the scalability side of the NoSQL debate, it’s Jeremy.

    http://oreilly.com/catalog/9780596003067

    You might want to show a little respect to your elders and keep that sassy tongue in check.

  54. Julian Morrison says:

    One possible point that distinguishes MongoDB: the combination of complex atomic query/updates and complex data structures means that you can scrape together a sort of transactionality (bounded by the rather large limits of one “document” and all its sub-structures) without using any global locks or lock-managers. By using in-document lists as mementos of partially completed operations, you can make protocols that can be cleanly detected and resumed after a program restart.

    Also, versus the GET/PUT style, updates done the MongoDB way implicitly commute – no need to lock while incrementing, for example.
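
    To illustrate, a single update can combine several of those operators atomically. A hypothetical sketch with the Perl driver (collection and field names invented):

    use strict;
    use warnings;
    use MongoDB;

    my $conn  = MongoDB::Connection->new(host => 'localhost');
    my $posts = $conn->get_database('cl')->get_collection('posts');

    $posts->insert({ _id => 'post-1', views => 0, pending => [] });

    # one atomic update: bump a counter and record a "to do" marker together
    $posts->update(
        { _id => 'post-1' },
        { '$inc'      => { views   => 1 },
          '$addToSet' => { pending => 'notify_email' } },
    );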

  55. About 10 years ago I ran into a similar problem at Shopzilla. I was looking for an effective flat database engine and found a couple of very good ones: KDB from KX Systems (http://www.kx.com) and Sybase IQ Multiplex (not to be confused with the regular Sybase databases).
    While KDB was insanely, ridiculously, unimaginably fast and optimized, its learning curve for a large team was unacceptably steep and long.
    I opted instead for Sybase IQ Multiplex, a columnar database with SQL syntax, good solid tools, and all the necessary APIs including (at the time) DBD::Sybase that I slightly modified to hit IQ Multiplex with great success.

    Both KDB and IQ Multiplex are used extensively in the finance industry, and are very robust. They’re not open source, but they solve a certain category of problems very well.

  56. Bob Zurek says:

    More than happy to have you give Infobright a try. Great compression, fast query performance. If we can help, let us know.

  57. Pingback: MongoDB Early Impressions « Jeremy Zawodny's blog

  58. Pingback: Database Drama « Jeremy Zawodny's blog

  59. Richard says:

    One thing that you mentioned got my attention:

    “Cassandra makes you think more about low-level details and performance (which can be good).”

    I’d actually say that when anyone gets to the point where they’re seriously looking at needing one of these products, the less abstraction the better (to a certain point). I’ve been digging into Cassandra as well, and love the fact that it appears to handle most of the “tech-stack” worries (failure, rollover, etc) and allows (even forces) me to handle the data structure and access portions.

    I’m sure that there are some people who are putting in a small rack of servers somewhere who just want to flip a switch and trust everything to their DB vendor, but I’d say that those deployments are few and far between 🙂

    Then again, not that long ago running with even 10G of OLTP RDBMS fell into that category… now kids use that for their pr0n collections. How times change…

  60. Pingback: MongoDB users and use cases | DBMS 2 : DataBase Management System Services

  61. SCC says:

    Great summary; unfortunately, there is always going to be some sort of tradeoff.


