Oh, the Mistakes I’ve Seen

A slow database is easily fixed
If you make good choices of fields indexed.
Sometimes the answer is simpler still,
A quick code change may fit the bill.

I’ll be giving an O’Reilly webcast, Scaling with MongoDB, on Friday (9/17). Please sign up if you’re interested in learning some more advanced optimization than what this post gets into. This webcast is, in part, to pimp MongoDB: The Definitive Guide, which will be coming out next week!

These are a few basic tips on making your application better/faster/stronger without knowing anything about indexes or sharding.

Connecting

Connecting to the database is a (relatively) expensive operation. Try to minimize the number of times you connect and disconnect: use persistent connections or connection pooling (depending on your language).

To avoid wasting connections, you have to know what your driver is doing. I see a lot of code like this in PHP:

$connection = new Mongo();
$connection->connect();

What this does is:

  1. The constructor connects to the database.
  2. connect() sees that you’re already connected and assumes you want to reset the connection.
  3. Disconnects from the database.
  4. Connects again.

Gah! You just doubled your execution time.
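The connect-once idea can be sketched without a real driver. In the JavaScript below, openConnection is a made-up stand-in that only counts how often it is called, and getConnection and connectionCache are hypothetical helper names, not part of any MongoDB driver. It is a rough illustration of reusing one connection, not how any particular driver implements pooling:

```javascript
// Hypothetical stand-in for a driver's connect call; it only counts calls.
let connectCount = 0;
function openConnection(uri) {
  connectCount += 1;
  return { uri: uri, id: connectCount };
}

// Cache one connection object per URI so later requests reuse it.
const connectionCache = new Map();
function getConnection(uri) {
  if (!connectionCache.has(uri)) {
    connectionCache.set(uri, openConnection(uri));
  }
  return connectionCache.get(uri);
}

const first = getConnection("mongodb://localhost:27017");
const second = getConnection("mongodb://localhost:27017");
console.log(first === second); // true: the same object is handed out twice
console.log(connectCount);     // 1: only one "connect" actually happened
```

Asking for the connection twice costs one connect instead of two, which is the whole point.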

ObjectIds

ObjectIds seem to make people vaguely uncomfortable, so they convert their ObjectIds into strings (the macaroni and cheese of data types). The problem is, an ObjectId takes up 12 bytes but its string representation takes up 29 bytes in BSON: 24 hex characters, plus a 4-byte length prefix and a trailing null byte (almost two and a half times bigger). The lesson: suck it up and eat your spinachy ObjectIds. You’ll learn to like ’em.
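Here is the arithmetic behind those numbers, as a small Node.js sketch. The helper name bsonStringValueSize and the sample hex string are made up for illustration. The sizes count only the value itself, not the field name or type byte, which cost the same either way:

```javascript
// BSON stores an ObjectId value as 12 raw bytes.
const OBJECT_ID_BYTES = 12;

// A BSON string value is a 4-byte length prefix, the UTF-8 bytes,
// and one trailing null byte.
function bsonStringValueSize(str) {
  return 4 + Buffer.byteLength(str, "utf8") + 1;
}

// Made-up 24-character hex string standing in for an ObjectId's string form.
const hexId = "4c8a331bda76c559ef000004";
const stringBytes = bsonStringValueSize(hexId);

console.log(stringBytes);                                // 29
console.log((stringBytes / OBJECT_ID_BYTES).toFixed(2)); // 2.42
```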

Also, an ObjectId won’t sneakily convert itself into a string on the fly. I see a lot of code like:

id = new ObjectId();
db.foo.insert({"_id" : new ObjectId(id)});
// or, even sillier
db.foo.insert({"_id" : new ObjectId(id.toString())});

If you created an ObjectId and haven’t messed with it, it’s still an ObjectId.

Numbers vs. Strings

MongoDB is type-sensitive and it’s important to use the correct type: numbers for numeric values and strings for strings.

If you have large numbers and you save them as strings (“1234567890” instead of 1234567890), MongoDB may slow down as it strcmps the entire length of the number instead of doing a quicker numeric comparison. Also, “12” is going to be sorted as less than “9”, because MongoDB will use string, not numeric, comparison on the values. This can lead to some surprising results.
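You can see the same kind of surprise in plain JavaScript, no database required. This sketch uses JavaScript's own string comparison, which orders character by character much like the string comparison described above; it is an analogy, not MongoDB's actual sort code, and the sample values are arbitrary:

```javascript
// Strings compare character by character, so "1..." sorts before "9".
const asStrings = ["9", "12", "100"].sort();

// Numbers need a numeric comparator to sort by value.
const asNumbers = [9, 12, 100].sort((a, b) => a - b);

console.log(asStrings);  // [ '100', '12', '9' ]
console.log(asNumbers);  // [ 9, 12, 100 ]
console.log("12" < "9"); // true
console.log(12 < 9);     // false
```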

Driver-specific

Find out if your driver has particular weaknesses (or strengths). For instance, the Perl driver is one of the fastest drivers, but it sucks at decoding Date types (Perl’s DateTime objects take a long time to create). So, if you want fast Perl programs, avoid dates like the plague or you’ll be puttering along with the Ruby programmers. (Just kidding, Rubyists! Sort of.)

The most important thing is to get to know the documentation for your language’s driver and ask if you have any questions.

kristina chodorow's blog