With a name like Mongo, it has to be good

I just got back from MongoSF, which was awesome. Over 200 Mongo geeks, three tracks, and language-specific workshops all day.

I got to the conference early and took a picture of the venue... an hour later it was packed.

The highlight, for me, was Eliot Horowitz’s talk on sharding. He set up a MongoDB cluster of 25 large EC2 instances and started hammering them.

He pulled up an incredibly snazzy sharding GUI (okay, I wrote it, it’s not actually that snazzy) that displayed a row of stats for each shard. Each shard had about the same level of operations per second, so you could see that Mongo was doing pretty well distributing the data across the shards.

Eliot scrolled down to the bottom of the stats table where it showed the total number of operations per second across all the shards. The cluster was doing 8 million operations per second.

8,000,000 operations per second.

The whole audience burst into applause.

That’s about 320,000 operations per second per box, which is about what would be expected for Mongo on a powerful server (like a large EC2 instance).

Cool things about this:

  • If your app needs more than 8 million ops/sec, you can just add more machines and Mongo will take care of redistributing the load.
  • Your app doesn’t need to know if it’s talking to 1, 25, or 7,000 servers. You can focus on writing your app and let Mongo focus on scaling it.

Speaker's dinner the night before

If you missed out on MongoSF, there are a bunch of other Mongo conferences coming up: MongoNY at the end of May and MongoUK and MongoFR in June.

  • Chris

    The sharding talk sounds interesting. Will any videos be posted?

  • Chris

    The sharding talk sounds interesting. Will any videos be posted?

  • Simo

    Nice! was this achieved using the experimental auto-sharding functionality or is this readily available in the stable version?

  • Simo

    Nice! was this achieved using the experimental auto-sharding functionality or is this readily available in the stable version?

  • @Chris there was video taken, I’m not sure if/when/where it’ll be up, but I’ll link to it if it is. There are slides at http://www.slideshare.net/mongosf, but these don’t include the demo, of course.

    @Slimo this was done with the latest 1.5 branch build, which is what I’d recommend using for anyone trying sharding.

  • @Chris there was video taken, I’m not sure if/when/where it’ll be up, but I’ll link to it if it is. There are slides at http://www.slideshare.net/mongosf, but these don’t include the demo, of course.

    @Slimo this was done with the latest 1.5 branch build, which is what I’d recommend using for anyone trying sharding.

  • Looks like videos are up at http://www.10gen.com/event_mongosf_10apr30

  • Looks like videos are up at http://www.10gen.com/event_mongosf_10apr30

  • Any idea on what was the setup of the cluster and what kind of workload it was?

    320k ops seems awesome but if the dataset was small (like fewer than 1 millions documents) and all in memory with only reads, the results are not that surprising.

  • Any idea on what was the setup of the cluster and what kind of workload it was?

    320k ops seems awesome but if the dataset was small (like fewer than 1 millions documents) and all in memory with only reads, the results are not that surprising.

  • IIRC, it was ~20,000 each of writes, deletes, updates, inserts, commands, & get mores and ~200,000 reads. I think that it was a quite large data set, but I’ll check with Eliot tomorrow.

  • IIRC, it was ~20,000 each of writes, deletes, updates, inserts, commands, & get mores and ~200,000 reads. I think that it was a quite large data set, but I’ll check with Eliot tomorrow.

  • I agree – an absolutely awesome demo! 🙂

    @Nicolas, I think it was five shards of five replica-set servers each. I might misremember.

  • I agree – an absolutely awesome demo! 🙂

    @Nicolas, I think it was five shards of five replica-set servers each. I might misremember.

  • ian

    was at MongoSF.. the sharding demo was definitely cool, a lot of the other talks/demos were very good as well. overall a great conference, esp. for the price. thanks again 🙂

  • ian

    was at MongoSF.. the sharding demo was definitely cool, a lot of the other talks/demos were very good as well. overall a great conference, esp. for the price. thanks again 🙂

  • @Noah unfortunately, replica sets aren’t ready yet. So, it was just 25 shards with no replication, which was actually kind of a problem because Amazon thought that we were DDOSing ourselves, so kept shutting down our instances, thus breaking the demo.

    About replica sets: I am totally in love with them, so if you’re interested subscribe to my RSS feed because I’ll be doing a blog post on how to use them ~3 seconds after they’re available in the nightly build.

  • @Noah unfortunately, replica sets aren’t ready yet. So, it was just 25 shards with no replication, which was actually kind of a problem because Amazon thought that we were DDOSing ourselves, so kept shutting down our instances, thus breaking the demo.

    About replica sets: I am totally in love with them, so if you’re interested subscribe to my RSS feed because I’ll be doing a blog post on how to use them ~3 seconds after they’re available in the nightly build.

  • 8 million ops/s is 320000 ops/s per instance.
    As you wrote this is far above a local Mongo, Redis or OrientDB instance.
    And this on EC2 with more network letency??
    Sounsd impossible. What’s behind this magic?
    Regards
    Stefan Edlich

  • 8 million ops/s is 320000 ops/s per instance.
    As you wrote this is far above a local Mongo, Redis or OrientDB instance.
    And this on EC2 with more network letency??
    Sounsd impossible. What’s behind this magic?
    Regards
    Stefan Edlich

  • As I wrote, ~320,000 is reasonable for a powerful server. There were lots of clients sending requests and MongoDB can do reads in parallel, which is what there were the most of.

  • As I wrote, ~320,000 is reasonable for a powerful server. There were lots of clients sending requests and MongoDB can do reads in parallel, which is what there were the most of.

  • Kristina,

    Those numbers sound amazing, but there are a lot of missing details ;-).

    1. what was the concurrency level?
    2. where there concurrent CRUD operations? (basically I expect things to behave quite differently when sets of READS are (almost completely) separated from sets of WRITES)
    3. what was the average data size/op?

    You know such benchmarks can be “fabricated” and missing information tends to encourage that kind of thought.

  • Kristina,

    Those numbers sound amazing, but there are a lot of missing details ;-).

    1. what was the concurrency level?
    2. where there concurrent CRUD operations? (basically I expect things to behave quite differently when sets of READS are (almost completely) separated from sets of WRITES)
    3. what was the average data size/op?

    You know such benchmarks can be “fabricated” and missing information tends to encourage that kind of thought.

  • 1. I have no idea.
    2. Yes.
    3. I have no idea.

    I think people are taking this post a bit too seriously… I just thought it was a really cool demo of what Mongo’s going to be able to do: scale horizontally to practically infinite capacity. It’s certainly not meant to be a benchmark.

    I think we’re working on a more interesting application (one has actual functionality other than hammering the database) that we’ll open source and use as a demo in the future. We’ll also do some real benchmarks against sharding before releasing 1.6, obviously.

  • 1. I have no idea.
    2. Yes.
    3. I have no idea.

    I think people are taking this post a bit too seriously… I just thought it was a really cool demo of what Mongo’s going to be able to do: scale horizontally to practically infinite capacity. It’s certainly not meant to be a benchmark.

    I think we’re working on a more interesting application (one has actual functionality other than hammering the database) that we’ll open source and use as a demo in the future. We’ll also do some real benchmarks against sharding before releasing 1.6, obviously.

  • i think it's blog class ho, meet girls on webcams for free, %O,

  • Mike

    Do you have any more details on the “incredibly snazzy sharding GUI” that you mentioned in the post? I'm interested in doing some load testing to see how MongoDB will perform for my system and it sounds like that GUI was giving some pretty valuable data — anyway to get access to that program? Thanks!

  • kristina1

    Unfortunately, it's not publicly available yet, but we'll probably release it when 1.6 comes out. It doesn't give any information you can't get from the shell (it basically runs the serverStatus command once per second for each machine), it just presents it in a pretty way. I'm hoping to add a chunk viewer and stuff, too, but it's very, very, very alpha at the moment.

  • Bob

    If you are looking for blog post suggestions, how about a post about MongoDB and memory.

  • Anonymous

    What are you interested in, in particular?

  • Anonymous

    Yes, videos are available here: http://www.10gen.com/video

  • Anonymous

    It was a mix of reads and writes, I think it was 4:1 reads to writes. I’d imagine almost everything was in memory.

  • Pingback: ehcache.net()

  • Pingback: Reflections on MongoSF 2012 « Meghan Gill()

  • Kay

    It seems that Eliot’s presentation of the 25 node cluster has been removed from the whole internet. I have doubts about the performance of 320.000 ops/sec/mongod (back in 2010) and would have loved to see his presentation to have some insights into his test setup.

    Is there any chance to get the video back or is it gone for ever? Thanks!

  • kristina1

    I have no idea about the video availability, the 10gen site’s been recreated a few times since this was posted. For advice on speeding thing up, I recommend starting with http://docs.mongodb.org/manual/administration/optimization/ and MongoDB conferences often have talks about performance tuning.

    (I know this is a fairly unsatisfying answer, but I was busy being sick with nerves for my talk and wasn’t there when they were setting up the cluster.)

kristina chodorow's blog