Part 3 of the replication internals series: three handy tricks.
MongoDB has a type of query that behaves like the tail -f command: it shows you new data as it’s written to a collection. This is great for the oplog, where you want to see new records as they pop up and don’t want to query over and over.
If you want this type of ongoing query, MongoDB returns a tailable cursor. When this cursor gets to the end of the result set it will hang around and wait for more elements to be added to the collection. As they’re added, the cursor will return them. If no elements are added for a while, the cursor will time out and the client has to requery if they want more results.
Using your knowledge of the oplog’s format, you can use a tailable cursor to do a long pull for activities in a certain collection, of a certain type, at a certain time… almost any criteria you can imagine.
Suppose your database goes down, but you have a fairly recent backup. You could put a backup into production, but it’ll be a bit behind. You can bring it up-to-date using your oplog entries.
If you use the trigger mechanism (described above) to capture the entire oplog and send it to a non-capped collection on another server, you can then use an oplog replayer to play the oplog over your dump, bringing it as up-to-date as possible.
Pick a time pre-dump and start replaying the oplog from there. It’s okay if you’re not sure exactly when the dump was taken because the oplog is idempotent: you can apply it to your data as many times as you want and your data will end up the same.
Also, warning: I haven’t tried out the oplog replayer I linked to, it’s just the first one I found. There are a few different ones out there and they’re pretty easy to write.
The local database contains data that is local to a given server: it won’t be replicated anywhere. This is one reason why it holds all of the replication info.
local isn’t reserved for replication stuff: you can put your own data there, too. If you do a write in the local database and then check the oplog, you’ll notice that there’s no record of the write. The oplog doesn’t track changes to the local database, since they won’t be replicated.
And now I’m all replicated out. If you’re interested in learning more about replication, check out the core documentation on it. There’s also core documentation on tailable cursors and language-specific instructions in the driver documentation.