There must be 50 ways to start your Mongo
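This blog post covers four major ones: a vanilla single server, master-slave replication, replica pairs with automatic failover, and sharding.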
Feel free to jump to the ones that interest you (for instance, sharding).
Starting up a vanilla MongoDB instance is super easy: it just needs a port it can listen on and a directory where it can save your data. By default, Mongo listens on port 27017, which should work fine (it’s not a very commonly used port). We’ll create a new directory for database files:
$ mkdir -p ~/dbs/mydb # -p creates parent directories if they don't exist
And then start up our database:
$ cd
$ bin/mongod --dbpath ~/dbs/mydb
…and you’ll see a bunch of output:
$ bin/mongod --dbpath ~/dbs/mydb
Fri Apr 23 11:59:07 Mongo DB : starting : pid = 9831 port = 27017 dbpath = /data/db/ master = 0 slave = 0 32-bit
** NOTE: when using MongoDB 32 bit, you are limited to about 2 gigabytes of data
** see http://blog.mongodb.org/post/137788967/32-bit-limitations for more
Fri Apr 23 11:59:07 db version v1.5.1-pre-, pdfile version 4.5
Fri Apr 23 11:59:07 git version: f86d93fd949777d5fbe00bf9784ec0947d6e75b9
Fri Apr 23 11:59:07 sys info: Linux ubuntu 2.6.31-15-generic #50-Ubuntu SMP Tue Nov 10 14:54:29 UTC 2009 i686 BOOST_LIB_VERSION=1_38
Fri Apr 23 11:59:07 waiting for connections on port 27017
Fri Apr 23 11:59:07 web admin interface listening on port 28017
Now, Mongo will “freeze” like this, which confuses some people. Don’t worry, it’s just waiting for requests. You’re all set to go.
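One gotcha with the steps above: mongod won’t start if the directory you hand to --dbpath doesn’t exist. Here’s a minimal sketch of a pre-flight check. The prepare_dbpath name is made up for this post (it doesn’t ship with MongoDB), and it only prints the command instead of running it, so treat it as an illustration rather than a finished tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of MongoDB): make sure the data directory
# exists and is writable, then print the mongod command without running it.
prepare_dbpath() {
    dbpath=$1
    mkdir -p "$dbpath" || return 1
    if [ ! -w "$dbpath" ]; then
        echo "cannot write to $dbpath" >&2
        return 1
    fi
    printf 'bin/mongod --dbpath %s\n' "$dbpath"
}

# Example with a throwaway directory; swap in ~/dbs/mydb for real use.
scratch=$(mktemp -d)
prepare_dbpath "$scratch/dbs/mydb"
rm -rf "$scratch"
```

Once the printed command looks right, run it by hand exactly as shown earlier.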
As we’re running master and slave on the same machine, they’ll need separate ports. We’ll use port 10000 for the master and 20000 for the slave. We also need separate directories for data, so we’ll create those:
$ mkdir ~/dbs/master ~/dbs/slave
Now we start the master database:
$ bin/mongod --master --port 10000 --dbpath ~/dbs/master
And then the slave, in a different terminal:
$ bin/mongod --slave --port 20000 --dbpath ~/dbs/slave --source localhost:10000
The “source” option tells the slave which master it should replicate data from.
Now, if we want to add another slave, we need to go through the herculean effort of choosing a port and creating a new directory:
$ mkdir ~/dbs/slave2
$ bin/mongod --slave --port 20001 --dbpath ~/dbs/slave2 --source localhost:10000
Tada! Two slaves, one master. For more information on master-slave, see the core docs on it and my previous post.
This example puts the master server and slave server on the same machine, but people generally have a master on one machine and a slave on another. It works fine to put them on a single machine; it just defeats the point a bit.
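Since every extra slave is just “pick the next port, make the next directory”, the pattern is easy to script. Here’s a rough sketch that prints the commands for one master and however many slaves you ask for. The master_slave_cmds name and the slave1, slave2, … directory naming are my own assumptions (the walkthrough above calls the first one plain “slave”); the flags are the ones already used in this post.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the mongod commands for one master and
# N slaves. Master on port 10000, slaves on 20000, 20001, ... as in the post.
master_slave_cmds() {
    slaves=$1
    echo "bin/mongod --master --port 10000 --dbpath ~/dbs/master"
    i=1
    while [ "$i" -le "$slaves" ]; do
        port=$((20000 + i - 1))
        # Directory names slave1, slave2, ... are an assumption of this sketch.
        echo "bin/mongod --slave --port $port --dbpath ~/dbs/slave$i --source localhost:10000"
        i=$((i + 1))
    done
}

master_slave_cmds 3
```

Each printed line still needs its own terminal and its data directory created first, same as before.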
Okay, so there aren’t many people named Rover, but you come up with a rhyme for “auto-failover” (I tried “replica”, too).
Replica pairs are cool because it’s like master-slave, but you get automatic failover: if the master becomes unavailable, the slave will become a master. So, it’s basically the same as master-slave, but the servers know about each other and there is, optionally, an arbiter server that doesn’t do anything other than resolve “disputes” over who is master.
When could the arbiter come in handy? Suppose the master’s network cable is pulled. The server still thinks it’s master, but no one else knows it’s there. The slave becomes master and the rest of the world goes along happily. When the master’s network cable gets plugged back in, now both servers think they’re master! In this case, the arbiter steps in and gently informs the master who’s behind the times that he is now a slave.
You don’t have to set up an arbiter, but we will since it’s good practice:
$ mkdir ~/dbs/arbiter ~/dbs/replica1 ~/dbs/replica2
$ bin/mongod --port 50000 --dbpath ~/dbs/arbiter
Now, in separate terminals, you start each of the replicas:
$ bin/mongod --port 60000 --dbpath ~/dbs/replica1 --pairwith localhost:60001 --arbiter localhost:50000
And then the other one:
$ bin/mongod --port 60001 --dbpath ~/dbs/replica2 --pairwith localhost:60000 --arbiter localhost:50000
After they’ve been running for a bit, try killing (Ctrl-C) one, then restarting it, then killing the other one, back and forth.
For more information on replica pairs, see the core docs.
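The easy thing to get wrong with a pair is the mirroring: each replica’s --pairwith has to point at the other one, and both have to name the same arbiter. Here’s a small sketch that prints the three commands from just the three ports, so the cross-references can’t drift. pair_cmds is a hypothetical name, and it’s a dry run that starts nothing.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the arbiter command and the two
# mirrored replica commands for a replica pair.
pair_cmds() {
    port_a=$1
    port_b=$2
    arbiter=$3
    printf 'bin/mongod --port %s --dbpath ~/dbs/arbiter\n' "$arbiter"
    # Each replica points --pairwith at the *other* replica's port.
    printf 'bin/mongod --port %s --dbpath ~/dbs/replica1 --pairwith localhost:%s --arbiter localhost:%s\n' "$port_a" "$port_b" "$arbiter"
    printf 'bin/mongod --port %s --dbpath ~/dbs/replica2 --pairwith localhost:%s --arbiter localhost:%s\n' "$port_b" "$port_a" "$arbiter"
}

pair_cmds 60000 60001 50000
```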
What’s this? Replica pairs are evolving! *voop* *voop* *voop*
Replica pairs have evolved into… replica sets! Well, okay, they haven’t yet, but they’re coming soon. Then you’ll be able to have an arbitrary number of servers in the auto-failover ring.
For the grand finale, sharding. Sharding is how you distribute data with Mongo. If you don’t know what sharding is, check out my previous post explaining how it works.
First of all, download the latest 1.5.x nightly build from the website. Sharding is changing rapidly, so you want the latest and greatest here, not stable.
We’re going to be creating a three-node cluster. So, same as ever, create your database directories. We want one directory for the cluster configuration and three directories for our shards (nodes):
$ mkdir ~/dbs/config ~/dbs/shard1 ~/dbs/shard2 ~/dbs/shard3
The config server keeps track of what’s where, so we need to start that up first:
$ bin/mongod --configsvr --port 70000 --dbpath ~/dbs/config
The mongos is just a request router that runs on top of the config server. It doesn’t even need a data directory; we just tell it where to look for the configuration:
$ bin/mongos --configdb localhost:70000
Note the “s”: the router is called “mongos”, not “mongod”. We haven’t specified a port for it, so it’ll listen on the default port (27017).
Okay! Now, we need to set up our shards. Start these each up in separate terminals:
$ bin/mongod --shardsvr --port 71000 --dbpath ~/dbs/shard1
$ bin/mongod --shardsvr --port 71001 --dbpath ~/dbs/shard2
$ bin/mongod --shardsvr --port 71002 --dbpath ~/dbs/shard3
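One thing worth checking before pasting these in: TCP port numbers are 16 bits wide, so the highest port a server can listen on is 65535. If your mongod refuses the 70000-range ports, swap in any free ports below that limit and keep them consistent across the config server, mongos, and shard commands. Here’s a minimal sketch of a range check; port_in_range is a made-up helper, not a MongoDB tool.

```shell
#!/bin/sh
# Hypothetical check (not a MongoDB tool): succeed only if the argument is a
# whole number inside the valid TCP port range, 1-65535.
port_in_range() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Check the ports used in the sharding walkthrough.
for port in 27017 70000 71000 71001 71002; do
    if port_in_range "$port"; then
        echo "$port ok"
    else
        echo "$port out of range"
    fi
done
```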
mongos doesn’t actually know about the shards yet, so you need to tell it to add these servers to the cluster. The easiest way is to fire up a mongo shell:
$ bin/mongo
MongoDB shell version: 1.5.1-pre-
url: test
connecting to: test
type "help" for help
>
Now, we add each shard to the cluster:
> db = connect("localhost:70000/admin");
connecting to: localhost:70000 admin
> db.runCommand({addshard : "localhost:71000", allowLocal : true})
{ "added" : "localhost:71000", "ok" : 1 }
> db.runCommand({addshard : "localhost:71001", allowLocal : true})
{ "added" : "localhost:71001", "ok" : 1 }
> db.runCommand({addshard : "localhost:71002", allowLocal : true})
{ "added" : "localhost:71002", "ok" : 1 }
>
mongos expects shards to be on remote machines and by default won’t allow you to add local shards (i.e., shards with “localhost” in the name). Since we’re just playing around, we specify “allowLocal” to override this behavior. (Note that “addshard” IS NOT camel-case, and allowLocal IS camel-case, because we’re consistent like that.)
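Typing the same addshard command once per shard gets old fast if you have more than a few. Here’s a sketch that prints one command per port, ready to paste into the shell session above. addshard_cmds is a hypothetical helper name; the command text itself is copied from the session shown earlier, inconsistent capitalization and all.

```shell
#!/bin/sh
# Hypothetical helper: print one addshard command per shard port, in the
# same form as the shell session in this post. Nothing is sent to a server.
addshard_cmds() {
    for port in "$@"; do
        printf 'db.runCommand({addshard : "localhost:%s", allowLocal : true})\n' "$port"
    done
}

addshard_cmds 71000 71001 71002
```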
Congratulations, you’re running a distributed database!
What do you do now? Well, use it just like a normal database! Connect to “localhost:27017” and proceed normally (or, as normally as possible… please report any bugs to our bugtracker!). Try the tutorial (since you’ve already got the shell open) or connect through your favorite driver and play around.
Connecting to mongos should be an identical experience to connecting to a normal Mongo server. Behind the scenes, it splits up your requests/data across the shards so you can concentrate on making your application, not scaling it.
P.S. Obviously, this example setup is full of single points of failure, but that’s completely avoidable. I can go over how to set up distributed MongoDB with zero single points of failure in a follow-up post, if people are interested.


