Why Command Helpers Suck

This is a rant from my role as a driver developer and person who gives support on the mailing list/IRC.

Command helpers are terrible. They confuse users, result in a steeper learning curve, and make MongoDB’s interface seem arbitrary.

The basics: what are command helpers?

Command helpers are wrappers for database commands. Database commands are everything other than CRUD (create, retrieve, update, delete) that you can do with MongoDB. This includes things like dropping a collection, doing a MapReduce, adding a member to a replica set, seeing what arguments you started mongod with, and finding out if the last write operation succeeded. They’re everywhere: if you’ve used MongoDB, you’ve run a database command (even if you weren’t aware of it).

So, what are command helpers? They are wrappers around the raw command, turning something like db.adminCommand({serverStatus:1}) into db.serverStatus(). This makes the command slightly quicker to type and makes it look “nicer”. However, there are honey bunches of reasons that they’re a bad idea and should be avoided whenever possible.

Database helpers are unportable

Helpers are extremely unportable. If you know how to run db.serverStatus() in the shell, that’s great, but all you know is how to do it in the shell. If you know how to run the serverStatus command, you know how to get the server status in every language you’ll ever use.

Similarly, each language handles command options differently. Take a command like group: the shell helper chooses one order of options (a single argument “options”, incidentally) and the Python driver chooses another (“key”, “condition”, “initial”, “reduce”, “finalize”) and the PHP driver another (“key”, “initial”, “reduce”, “options”). If you just learn the group command itself, you can execute it in any language you please.
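To make that concrete, here is a minimal sketch of the group command written out as a plain document, which is the thing every helper ends up sending regardless of how it orders its arguments. The function name buildGroupCommand and the "orders" collection are made up for illustration, and the field names (ns, key, cond, initial, $reduce, finalize) are the group command's fields as I remember them, so check them against your server's documentation.

```javascript
// Hypothetical function: assembles the document for the group command.
// Field names are assumed from the group command's documented shape.
function buildGroupCommand(ns, spec) {
  const body = {
    ns: ns,                 // collection to group over
    key: spec.key,          // fields to group by
    initial: spec.initial,  // starting value of each group's result
    $reduce: spec.reduce    // function folding each document into the result
  };
  // Optional fields are only included when the caller supplies them.
  if (spec.cond !== undefined) body.cond = spec.cond;
  if (spec.finalize !== undefined) body.finalize = spec.finalize;
  return { group: body };
}

// Example: count orders per status, only for orders over 10.
const groupCmd = buildGroupCommand("orders", {
  key: { status: 1 },
  initial: { count: 0 },
  reduce: function (doc, out) { out.count++; },
  cond: { total: { $gt: 10 } }
});
```

Once you have a document like groupCmd, the only driver-specific part left is the one call that sends a command document to the server.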

This affects almost everyone using MongoDB, as almost everyone uses at least two languages (JavaScript and something else). I have seen hundreds of questions of the form “How do I run <shell function> using my driver?” If these users knew it was a database command (and knew what a database command was), they wouldn’t have to ask.

Database helpers lock you to a certain API, often an out-of-date one

Suppose the database changes the options for a command. All of the drivers that support helpers for that command are suddenly out-of-date. Conversely, if you have a recent version of a driver and an old version of the database, you can have helpers for features that don’t exist yet or have different options.

An example of old driver/new database: MapReduce’s options changed in version 1.7.4. As far as I know, none of the drivers support the new options yet.
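A rough sketch of how that plays out, under a couple of assumptions: mapReduceHelper is a made-up helper whose signature was frozen when "out" could only be a collection name, and the newer option is assumed to take a document such as {merge: "..."}. The collection names are invented too.

```javascript
// Hypothetical helper frozen at the older API: `out` is only a collection name.
function mapReduceHelper(collection, map, reduce, outName) {
  return { mapreduce: collection, map: map, reduce: reduce, out: outName };
}

// Map and reduce functions; they would run on the server, not here.
const mapFn = function () { emit(this.status, 1); };
const reduceFn = function (key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) total += values[i];
  return total;
};

// The helper has no slot for anything but a collection name.
const viaHelper = mapReduceHelper("orders", mapFn, reduceFn, "order_counts");

// Written as a raw command document, any option the server understands fits.
// The {merge: ...} shape is an assumption about the newer `out` options.
const rawCommand = {
  mapreduce: "orders",
  map: mapFn,
  reduce: reduceFn,
  out: { merge: "order_counts" }
};
```

The helper is stuck until the driver ships a new release; the raw document only needs the server to understand it.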

You can’t support database helpers for everything

Next, there’s just the sheer volume of database commands, which makes it impossible to implement helpers for all of them. Everyone has their favorites: aggregation is important to some people, administration helpers are important to others, etc. If all of them had helpers, not only would there be a ridiculous number of methods polluting the API documentation, but it would lead to tons of compatibility problems between the driver and the database (as mentioned above).

Database helpers conceal what’s going on, giving users fewer options

Finally, using command helpers keeps people from understanding what’s actually going on, which is pointless and can lead to problems. It’s pointless to conceal the gory details because the details aren’t very gory: all database commands are queries. This means you can deconstruct command helpers as follows (example in PHP):

// the command helper
$db->lastError();
// is the same as
$db->command(array("getlasterror" => 1));
// is the same as
$db->selectCollection('$cmd')->findOne(array("getlasterror" => 1));
// is the same as
$db->selectCollection('$cmd')->find(array("getlasterror" => 1))->limit(1)->getNext();

Every command helper is just a find() in disguise! This means you can do (almost) anything with a database command that you could with a query.
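The same layering can be sketched in JavaScript without a server. FakeDb below is a made-up in-memory stand-in, not a real driver: it only records the queries it is asked to run and returns a canned reply, which is enough to show that the helper and the hand-written query on $cmd end up as the same request.

```javascript
// Hypothetical in-memory stand-in for a database handle. It is not a real
// driver; it only records the queries it is asked to run.
class FakeDb {
  constructor() {
    this.queries = [];
  }

  // The one primitive everything else is built on.
  find(collection, query, limit) {
    this.queries.push({ collection: collection, query: query, limit: limit });
    return [{ ok: 1 }]; // canned reply standing in for the server's answer
  }

  // findOne is find with a limit of 1.
  findOne(collection, query) {
    return this.find(collection, query, 1)[0];
  }

  // A command is a findOne on the special $cmd collection.
  command(cmd) {
    return this.findOne("$cmd", cmd);
  }

  // A helper is a command with the document filled in for you.
  lastError() {
    return this.command({ getlasterror: 1 });
  }
}

const fakeDb = new FakeDb();
fakeDb.lastError();                           // the helper
fakeDb.find("$cmd", { getlasterror: 1 }, 1);  // the deconstructed query

// Both calls were recorded as the same query.
const sameQuery =
  JSON.stringify(fakeDb.queries[0]) === JSON.stringify(fakeDb.queries[1]);
console.log(sameQuery); // prints true in this sketch
```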

This gives you more control. Not only can you use whatever options you want, you can do a few other things:

  • By default, drivers send all commands to the master, even if slaveOkay is set. If you want to send a command to a slave, you can deconstruct it into a query to bypass the driver’s commands-go-to-master logic.
  • Suppose you have a command that takes a long time to execute and it times out on the client side. If you deconstruct the command into a query, you can (for some drivers) set the client-side timeout on the cursor.
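The routing point from the first bullet can be modelled in a few lines. FakeConnection is a made-up model of that rule, not any real driver's API, and the "orders" collection is invented; it assumes only what the bullet says, namely that the command path ignores slaveOkay while the query path honors it.

```javascript
// Hypothetical model of the routing rule described above; not a real driver.
class FakeConnection {
  constructor(slaveOkay) {
    this.slaveOkay = slaveOkay;
  }

  // Query path: honors the slaveOkay setting.
  find(collection, query) {
    return {
      sentTo: this.slaveOkay ? "slave" : "master",
      collection: collection,
      query: query
    };
  }

  // Command path: always goes to the master, whatever slaveOkay says.
  command(cmd) {
    return { sentTo: "master", collection: "$cmd", query: cmd };
  }
}

const conn = new FakeConnection(true);

// Same command document, two different destinations.
const viaCommand = conn.command({ count: "orders" });
const viaQuery = conn.find("$cmd", { count: "orders" });
```

In this model viaCommand.sentTo is "master" and viaQuery.sentTo is "slave", even though both carry the same command document to the same $cmd collection.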

And if you’re using an unfamiliar driver, you might not know what its helpers are called, but all drivers have a find() method, so you can always use that.

Exceptions

There are a couple of command helpers worth implementing. I think that count and drop (at both the database and collection levels) are common enough to be worth having helpers for. Also, at a higher level (e.g., frameworks on top of the driver and admin GUIs) I think helpers are absolutely fine. However, as someone who has been maintaining a driver and supporting users for the last few years, I think that, at a driver level, command helpers are a terrible idea.

kristina chodorow's blog