Popping Timestamps into ObjectIds

ObjectIds contain a timestamp, which tells you when the document was created. Because the _id field is always indexed, that means you have a “free” index on your “created at” time (unless you have persnickety requirements for creation times, like resolutions of less than a second, synchronization across app servers, etc.).

Actually using this index can seem daunting (how do you use an ObjectId to query for a certain date?) so let’s run through an example.

First, let’s insert 100 sample docs, 10 docs/second.

> for (i=0; i<10; i++) { 
... print(i+": "+Date.now()); 
... for (j=0; j<10; j++) { 
...    db.foo.insert({x:i,y:j}); 
... } 
... sleep(1000); 
... }
0: 1324417241111
1: 1324417242112
2: 1324417243112
3: 1324417244113
4: 1324417245114
5: 1324417246115
6: 1324417247115
7: 1324417248116
8: 1324417249117
9: 1324417250117

Let’s find all entries created after 1324417246115 (when i=5).

The time is currently in milliseconds (that’s how JavaScript does dates), so we’ll have to convert it to seconds:

> secs = Math.floor(1324417246115/1000)
1324417246

(Your secs will be different than mine, of course.)

ObjectIds can be constructed from a 24-character string, each two characters representing a byte (e.g., “ff” is 255). So, we need to convert secs to hexidecimal, which luckily is super-easy in JavaScript:

> hexSecs = secs.toString(16)
4ef100de

Now, we create an ObjectId from this:

> id = ObjectId(hexSecs+"0000000000000000")
ObjectId("4ef100de0000000000000000")

If you get the wrong number of zeros here, you’ll get an error message that is, er, hard to miss.

Now, we query for everything created after this timestamp:

> db.foo.find({_id : {$gt : id}})
{ "_id" : ObjectId("4ef100de7d435c39c3016405"), "x" : 5, "y" : 0 }
{ "_id" : ObjectId("4ef100de7d435c39c3016406"), "x" : 5, "y" : 1 }
{ "_id" : ObjectId("4ef100de7d435c39c3016407"), "x" : 5, "y" : 2 }
{ "_id" : ObjectId("4ef100de7d435c39c3016408"), "x" : 5, "y" : 3 }
{ "_id" : ObjectId("4ef100de7d435c39c3016409"), "x" : 5, "y" : 4 }
{ "_id" : ObjectId("4ef100de7d435c39c301640a"), "x" : 5, "y" : 5 }
{ "_id" : ObjectId("4ef100de7d435c39c301640b"), "x" : 5, "y" : 6 }
{ "_id" : ObjectId("4ef100de7d435c39c301640c"), "x" : 5, "y" : 7 }
{ "_id" : ObjectId("4ef100de7d435c39c301640d"), "x" : 5, "y" : 8 }
{ "_id" : ObjectId("4ef100de7d435c39c301640e"), "x" : 5, "y" : 9 }
{ "_id" : ObjectId("4ef100df7d435c39c301640f"), "x" : 6, "y" : 0 }
{ "_id" : ObjectId("4ef100df7d435c39c3016410"), "x" : 6, "y" : 1 }
{ "_id" : ObjectId("4ef100df7d435c39c3016411"), "x" : 6, "y" : 2 }
{ "_id" : ObjectId("4ef100df7d435c39c3016412"), "x" : 6, "y" : 3 }
{ "_id" : ObjectId("4ef100df7d435c39c3016413"), "x" : 6, "y" : 4 }
{ "_id" : ObjectId("4ef100df7d435c39c3016414"), "x" : 6, "y" : 5 }
{ "_id" : ObjectId("4ef100df7d435c39c3016415"), "x" : 6, "y" : 6 }
{ "_id" : ObjectId("4ef100df7d435c39c3016416"), "x" : 6, "y" : 7 }
{ "_id" : ObjectId("4ef100df7d435c39c3016417"), "x" : 6, "y" : 8 }
{ "_id" : ObjectId("4ef100df7d435c39c3016418"), "x" : 6, "y" : 9 }
Type "it" for more

If we look at the explain for the query, you can see that it’s using the index:

> db.foo.find({_id:{$gt:id}}).explain()
{
	"cursor" : "BtreeCursor _id_",
	"nscanned" : 50,
	"nscannedObjects" : 50,
	"n" : 50,
	"millis" : 0,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"isMultiKey" : false,
	"indexOnly" : false,
	"indexBounds" : {
		"_id" : [
			[
				ObjectId("4ef100de0000000000000000"),
				ObjectId("ffffffffffffffffffffffff")
			]
		]
	}
}

We’re not quite done, because we’re actually not returning what we wanted: we’re getting all docs greater than or equal to the “created at” time, not just greater than. To fix this, we’d just need to add 1 to the secs before doing anything else. Or I can claim that we were querying for documents created after i=4 all along.

kristina chodorow's blog