Return of the Mongo Mailbag

On the mongodb-user mailing list last week, someone asked (basically):

I have 4 servers and I want two shards. How do I set it up?

A lot of people have been asking questions about configuring replica sets and sharding, so here’s how to do it in nitty-gritty detail.

The Architecture

Prerequisites: if you aren’t too familiar with replica sets, see my blog post on them. The rest of this post won’t make much sense unless you know what an arbiter is. Also, you should know the basics of sharding.

Each shard should be a replica set, so we’ll need two replica sets (we’ll call them foo and bar). We want our cluster to be okay if one machine goes down or gets separated from the herd (network partition), so we’ll spread out each set among the available machines. Replica sets are color-coded and machines are imaginatively named server1-4.

Each replica set has two hosts and an arbiter. This way, if a server goes down, no functionality is lost (and there won’t be two masters on a single server).

To set this up, run:

server1

$ mkdir -p ~/dbs/foo ~/dbs/bar
$ ./mongod --dbpath ~/dbs/foo --replSet foo
$ ./mongod --dbpath ~/dbs/bar --port 27019 --replSet bar --oplogSize 1

server2

$ mkdir -p ~/dbs/foo
$ ./mongod --dbpath ~/dbs/foo --replSet foo

server3

$ mkdir -p ~/dbs/foo ~/dbs/bar
$ ./mongod --dbpath ~/dbs/foo --port 27019 --replSet foo --oplogSize 1
$ ./mongod --dbpath ~/dbs/bar --replSet bar

server4

$ mkdir -p ~/dbs/bar
$ ./mongod --dbpath ~/dbs/bar --replSet bar

Note that arbiters have an oplog size of 1. By default, oplog size is ~5% of your hard disk, but arbiters don’t need to hold any data so that’s a huge waste of space.

Putting together the replica sets

Now, we’ll start up our two replica sets. Start the mongo shell and type:

> db = connect("server1:27017/admin")
connecting to: server1:27017
admin
> rs.initiate({"_id" : "foo", "members" : [
... {"_id" : 0, "host" : "server1:27017"},
... {"_id" : 1, "host" : "server2:27017"},
... {"_id" : 2, "host" : "server3:27019", arbiterOnly : true}]})
{
        "info" : "Config now saved locally.  Should come online in about a minute.",
        "ok" : 1
}
> db = connect("server3:27017/admin")
connecting to: server3:27017
admin
> rs.initiate({"_id" : "bar", "members" : [
... {"_id" : 0, "host" : "server3:27017"},
... {"_id" : 1, "host" : "server4:27017"},
... {"_id" : 2, "host" : "server1:27019", arbiterOnly : true}]})
{
        "info" : "Config now saved locally.  Should come online in about a minute.",
        "ok" : 1
}

Okay, now we have two replica set running. Let’s create a cluster.

Setting up Sharding

Since we’re trying to set up a system with no single points of failure, we’ll use three configuration servers. We can have as many mongos processes as we want (one on each appserver is recommended), but we’ll start with one.

server2

$ mkdir ~/dbs/config
$ ./mongod --dbpath ~/dbs/config --port 20000

server3

$ mkdir ~/dbs/config
$ ./mongod --dbpath ~/dbs/config --port 20000

server4

$ mkdir ~/dbs/config
$ ./mongod --dbpath ~/dbs/config --port 20000
$ ./mongos --configdb server2:20000,server3:20000,server4:20000 --port 30000

Now we’ll add our replica sets to the cluster. Connect to the mongos and and run the addshard command:

> mongos = connect("server4:30000/admin")
connecting to: server4:30000
admin
> mongos.runCommand({addshard : "foo/server1:27017,server2:27017"})
{ "shardAdded" : "foo", "ok" : 1 }
> mongos.runCommand({addshard : "bar/server3:27017,server4:27017"})
{ "shardAdded" : "bar", "ok" : 1 }

Edit: you must list all of the non-arbiter hosts in the set for now. This is very lame, because given one host, mongos really should be able to figure out everyone in the set, but for now you have to list them.

Tada! As you can see, you end up with one “foo” shard and one “bar” shard. (I actually added that functionality on Friday, so you’ll have to download a nightly to get the nice names. If you’re using an older version, your shards will have the thrilling names “shard0000” and “shard0001”.)

Now you can connect to “server4:30000” in your application and use it just like a “normal” mongod. If you want to add more mongos processes, just start them up with the same configdb parameter used above.

kristina chodorow's blog