There must be 50 ways to start your Mongo

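This blog post covers four major ones:

  • Starting up a single database
  • Setting up master-slave replication
  • Getting auto-failover with replica pairs
  • Making a sharded cluster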

Feel free to jump to the ones that interest you (for instance, sharding).

Just start up a database, Ace

Starting up a vanilla MongoDB instance is super easy: it just needs a port it can listen on and a directory where it can save your info. By default, Mongo listens on port 27017, which should work fine (it’s not a very commonly used port). We’ll create a new directory for database files:

$ mkdir -p ~/dbs/mydb # -p creates parent directories if they don't exist
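If you like to double-check things first, here is a rough sketch of a preflight check for those two requirements (a usable port and a writable directory). It assumes a POSIX shell, and the names valid_port and preflight are made up for this example; they are not part of MongoDB.

```shell
# Hypothetical helpers (not part of MongoDB): sanity-check a port and a data
# directory before handing them to mongod.
valid_port() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    # TCP ports are 16-bit numbers, so the usable range is 1-65535.
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

preflight() {
    port=$1
    dbpath=$2
    valid_port "$port" || { echo "bad port: $port"; return 1; }
    [ -d "$dbpath" ] && [ -w "$dbpath" ] || { echo "cannot write to: $dbpath"; return 1; }
    echo "ok: port $port, dbpath $dbpath"
}
```

Once the functions are defined in your shell, running "preflight 27017 ~/dbs/mydb" prints a line starting with "ok:" if both checks pass. It does not check whether something else is already listening on the port.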

And then start up our database:

$ cd 
$ bin/mongod --dbpath ~/dbs/mydb

…and you’ll see a bunch of output:

$ bin/mongod --dbpath ~/dbs/mydb
Fri Apr 23 11:59:07 Mongo DB : starting : pid = 9831 port = 27017 dbpath = /data/db/ master = 0 slave = 0  32-bit 
 
** NOTE: when using MongoDB 32 bit, you are limited to about 2 gigabytes of data
**       see http://blog.mongodb.org/post/137788967/32-bit-limitations for more
 
Fri Apr 23 11:59:07 db version v1.5.1-pre-, pdfile version 4.5
Fri Apr 23 11:59:07 git version: f86d93fd949777d5fbe00bf9784ec0947d6e75b9
Fri Apr 23 11:59:07 sys info: Linux ubuntu 2.6.31-15-generic #50-Ubuntu SMP Tue Nov 10 14:54:29 UTC 2009 i686 BOOST_LIB_VERSION=1_38
Fri Apr 23 11:59:07 waiting for connections on port 27017
Fri Apr 23 11:59:07 web admin interface listening on port 28017

Now, Mongo will “freeze” like this, which confuses some people. Don’t worry, it’s just waiting for requests. You’re all set to go.

Set up a slave, Maeve

As we’re running master and slave on the same machine, they’ll need separate ports. We’ll use port 10000 for the master and 20000 for the slave. We also need separate directories for data, so we’ll create those:

$ mkdir ~/dbs/master ~/dbs/slave

Now we start the master database:

$ bin/mongod --master --port 10000 --dbpath ~/dbs/master

And then the slave, in a different terminal:

$ bin/mongod --slave --port 20000 --dbpath ~/dbs/slave --source localhost:10000

The “source” option specifies where the master is that the slave should replicate data from.

Now, if we want to add another slave, we need to go through the herculean effort of choosing a port and creating a new directory:

$ mkdir ~/dbs/slave2
$ bin/mongod --slave --port 20001 --dbpath ~/dbs/slave2 --source localhost:10000

Tada! Two slaves, one master. For more information on master-slave, see the core docs on it and my previous post.

This example puts the master server and slave server on the same machine, but people generally have a master on one machine and a slave on another. It works fine to put them on a single machine, but it defeats the point a bit.
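If you find yourself adding slaves a lot, the "pick a port, make a directory" routine is easy to script. The following is only a sketch: slave_commands is a made-up helper, it numbers the directories slave1, slave2, and so on (slightly different from the names used above), and it only prints the commands rather than running them, since each mongod still wants its own terminal.

```shell
# Hypothetical helper: print the setup commands for N slaves of one master.
# It prints the commands only; nothing is created or started.
slave_commands() {
    master=$1        # host:port of the master, e.g. localhost:10000
    first_port=$2    # port for the first slave
    count=$3         # how many slaves to set up
    i=0
    while [ "$i" -lt "$count" ]; do
        port=$((first_port + i))
        dir="~/dbs/slave$((i + 1))"   # tilde kept literal so it prints as-is
        echo "mkdir -p $dir"
        echo "bin/mongod --slave --port $port --dbpath $dir --source $master"
        i=$((i + 1))
    done
}
```

For example, "slave_commands localhost:10000 20000 2" prints a mkdir line and a mongod line for each of two slaves, on ports 20000 and 20001.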

Get auto-failover… Rover

Okay, so there aren’t many people named Rover, but you come up with a rhyme for “auto-failover” (I tried “replica”, too).

Replica pairs are cool because they’re like master-slave, but you get automatic failover: if the master becomes unavailable, the slave will become a master. So, it’s basically the same as master-slave, but the servers know about each other and there is, optionally, an arbiter server that doesn’t do anything other than resolve “disputes” over who is master.

When could the arbiter come in handy? Suppose the master’s network cable is pulled. The server still thinks it’s master, but no one else knows it’s there. The slave becomes master and the rest of the world goes along happily. When the master’s network cable gets plugged back in, both servers think they’re master! In this case, the arbiter steps in and gently informs the master who’s behind the times that he is now a slave.
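To make the "who's behind the times" idea concrete, here is a toy model of that decision. It is emphatically not MongoDB's actual arbitration logic: the arbitrate function and the name:last_write format are invented for this sketch, and last_write is just a pretend counter where a bigger number means more recent data.

```shell
# Toy model only: this is NOT how MongoDB's arbiter is implemented.
# Each argument is name:last_write, where last_write is a made-up counter
# (bigger means that server has seen more recent writes).
arbitrate() {
    a_name=${1%%:*}; a_seen=${1##*:}
    b_name=${2%%:*}; b_seen=${2##*:}
    # The server with the more recent data keeps the master role.
    # On a tie, the first argument wins.
    if [ "$a_seen" -ge "$b_seen" ]; then
        echo "$a_name stays master, $b_name becomes slave"
    else
        echo "$b_name stays master, $a_name becomes slave"
    fi
}
```

In the unplugged-cable story, the old master stopped receiving writes while it was offline, so it would be the one with the smaller counter.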

You don’t have to set up an arbiter, but we will since it’s good practice:

$ mkdir ~/dbs/arbiter ~/dbs/replica1 ~/dbs/replica2
$ bin/mongod --port 50000 --dbpath ~/dbs/arbiter

Now, in separate terminals, you start each of the replicas:

$ bin/mongod --port 60000 --dbpath ~/dbs/replica1 --pairwith localhost:60001 --arbiter localhost:50000

And then the other one:

$ bin/mongod --port 60001 --dbpath ~/dbs/replica2 --pairwith localhost:60000 --arbiter localhost:50000

After they’ve been running for a bit, try killing (Ctrl-C) one, then restarting it, then killing the other one, back and forth.

For more information on replica pairs, see the core docs.

What’s this? Replica pairs are evolving! *voop* *voop* *voop*

Replica pairs have evolved into… replica sets! Well, okay, they haven’t yet, but they’re coming soon. Then you’ll be able to have an arbitrary number of servers in the auto-failover ring.

Make a new cluster, Buster

For the grand finale, sharding. Sharding is how you distribute data with Mongo. If you don’t know what sharding is, check out my previous post explaining how it works.

First of all, download the latest 1.5.x nightly build from the website. Sharding is changing rapidly, so you want the latest and greatest here, not stable.

We’re going to be creating a three-node cluster. So, same as ever, create your database directories. We want one directory for the cluster configuration and three directories for our shards (nodes):

$ mkdir ~/dbs/config ~/dbs/shard1 ~/dbs/shard2 ~/dbs/shard3

The config server keeps track of what’s where, so we need to start that up first:

$ bin/mongod --configsvr --port 30000 --dbpath ~/dbs/config

The mongos is just a request router that runs on top of the config server. It doesn’t even need a data directory; we just tell it where to look for the configuration:

$ bin/mongos --configdb localhost:30000

Note the “s”: the router is called “mongos”, not “mongod”. We haven’t specified a port for it, so it’ll listen on the default port (27017).

Okay! Now, we need to set up our shards. Start these each up in separate terminals:

$ bin/mongod --shardsvr --port 40000 --dbpath ~/dbs/shard1
$ bin/mongod --shardsvr --port 40001 --dbpath ~/dbs/shard2
$ bin/mongod --shardsvr --port 40002 --dbpath ~/dbs/shard3

mongos doesn’t actually know about the shards yet; you need to tell it to add these servers to the cluster. The easiest way is to fire up a mongo shell:

$ bin/mongo
MongoDB shell version: 1.5.1-pre-
url: test
connecting to: test
type "help" for help
>

Now, we add each shard to the cluster. The addshard command goes to the admin database on mongos (the default port, 27017), not to the config server:

> db = connect("localhost:27017/admin");
connecting to: localhost:27017
admin
> db.runCommand({addshard : "localhost:40000", allowLocal : true})
{
    "added" : "localhost:40000",
    "ok" : 1
}
> db.runCommand({addshard : "localhost:40001", allowLocal : true})
{
    "added" : "localhost:40001",
    "ok" : 1
}
> db.runCommand({addshard : "localhost:40002", allowLocal : true})
{
    "added" : "localhost:40002",
    "ok" : 1
}
>
>

mongos expects shards to be on remote machines and by default won’t allow you to add local shards (i.e., shards with “localhost” in the name). Since we’re just playing around, we specify “allowLocal” to override this behavior. (Note that “addshard” IS NOT camel-case, and allowLocal IS camel-case, because we’re consistent like that.)
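Typing the same command once per shard gets old quickly, so here is a small sketch that prints one addshard command for each server you give it. addshard_commands is a hypothetical helper, and the host names in the usage example below are placeholders, not servers from this post.

```shell
# Hypothetical helper: print one addshard command per host:port argument.
# allowLocal is included because this post's examples run on localhost;
# drop it for shards on remote machines.
addshard_commands() {
    for shard in "$@"; do
        printf 'db.runCommand({addshard : "%s", allowLocal : true})\n' "$shard"
    done
}
```

For instance, "addshard_commands shard-a.example:27018 shard-b.example:27018" prints two lines that can be pasted into the admin shell opened above.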

Congratulations, you’re running a distributed database!

What do you do now? Well, use it just like a normal database! Connect to “localhost:27017” and proceed normally (or, as normally as possible… please report any bugs to our bugtracker!). Try the tutorial (since you’ve already got the shell open) or connect through your favorite driver and play around.

Connecting to mongos should be an identical experience to connecting to a normal Mongo server. Behind the scenes, it splits up your requests/data across the shards so you can concentrate on making your application, not scaling it.
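If you're curious what "splits up your requests" might look like, here is a deliberately tiny toy model of range-based routing. It is not how mongos is implemented: the route function, the integer keys, and the boundaries (100 and 200) are all invented for illustration.

```shell
# Toy model only: this is NOT how mongos is implemented. It routes an integer
# key to one of three shards using made-up range boundaries:
#   key < 100          -> shard1
#   100 <= key < 200   -> shard2
#   key >= 200         -> shard3
route() {
    key=$1
    if [ "$key" -lt 100 ]; then
        echo shard1
    elif [ "$key" -lt 200 ]; then
        echo shard2
    else
        echo shard3
    fi
}
```

The point is only that the router, not your application, decides which server a given key lives on.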


P.S. Obviously, this example setup is full of single points of failure, but that’s completely avoidable. I can go over how to set up distributed MongoDB with zero single points of failure in a follow-up post, if people are interested.

  • Kristina,
    Your blog is kind of awesome, particularly the comics.

    I would like to express interest in the “how to set up distributed MongoDB with zero single points of failure” post, and I have a question relating to our own deployment of Mongo servers.

    We would like to have two MongoDBs as paired replicas, and a third acting as both a slave and arbiter (this is used for backups). Is this possible with 1.4 or are we waiting for the next stable minor version increment, 1.6?

    Peace,
    Chris

  • @Chris Thank you! Good enough for me, I’ll probably write the post next week.

    Unfortunately, you cannot have a replica pair with a slave at the moment. You can only have a master-slave setup OR a replica pair setup. This is one of the reasons we’re upgrading to replica sets, which will support this behavior: some db servers can be set to be “potential masters” (if the master goes down, they could take over) and others can be set to be “backup only” and will never become master.

    Unfortunately, you’ll have to wait until 1.6, but it shouldn’t be too long!

  • javadoc

    i just setup a sharding config from this post. works nice. one problem though: it looks like one collection is not sharded, but the collections go to different servers. what i was looking for was the “main” collection i am going to have to be sharded through the nodes, so that one single query on a collection can be faster. is this possible?

  • javadoc

    oh, never mind- stupid me. i found it here:
    http://www.mongodb.org/display/DOCS/Sharding+In

    thanks again

  • Eric

    Quick question, in the auto-failover section, the arbiter was started on port 50000 but the servers were told that it was on port 27017. Is that correct?

    $ bin/mongod --port 50000 --dbpath ~/dbs/arbiter

    $ bin/mongod --port 60000 --dbpath ~/dbs/replica1 --pairwith localhost:60001 --arbiter 27017

  • Hi,

    great Post! I would also like to see the zero single points of failure solution because I always thought that's what differentiates couch and mongo db.

  • kristina1

    Thanks, fixed. It should have been localhost:50000.

  • kristina1

    I'm just waiting on replica sets and then I'll write the post. The main difference in replication between Couch and Mongo is that Couch supports multi-master, which means that you can always write _somewhere_ but you can also end up with conflicts.

  • Alex

    Wonderful!!! Can't wait to see your replicaset / sharding article. Nice one!

  • Pooria

    Great. Many thanks.

  • This was a very simple tutorial for a newbie like me 🙂

kristina chodorow's blog