Popping Timestamps into ObjectIds

ObjectIds contain a timestamp, which tells you when the document was created. Because the _id field is always indexed, that means you have a “free” index on your “created at” time (unless you have persnickety requirements for creation times, like resolutions of less than a second, synchronization across app servers, etc.).

Actually using this index can seem daunting (how do you use an ObjectId to query for a certain date?) so let’s run through an example.

First, let’s insert 100 sample docs, 10 docs/second.

> for (i=0; i<10; i++) { 
... print(i+": "+Date.now()); 
... for (j=0; j<10; j++) { 
...    db.foo.insert({x:i,y:j}); 
... } 
... sleep(1000); 
... }
0: 1324417241111
1: 1324417242112
2: 1324417243112
3: 1324417244113
4: 1324417245114
5: 1324417246115
6: 1324417247115
7: 1324417248116
8: 1324417249117
9: 1324417250117

Let’s find all entries created after 1324417246115 (when i=5).

The time is currently in milliseconds (that’s how JavaScript does dates), so we’ll have to convert it to seconds:

> secs = Math.floor(1324417246115/1000)
1324417246

(Your secs will be different than mine, of course.)

ObjectIds can be constructed from a 24-character string, each two characters representing a byte (e.g., “ff” is 255). So, we need to convert secs to hexidecimal, which luckily is super-easy in JavaScript:

> hexSecs = secs.toString(16)
4ef100de

Now, we create an ObjectId from this:

> id = ObjectId(hexSecs+"0000000000000000")
ObjectId("4ef100de0000000000000000")

If you get the wrong number of zeros here, you’ll get an error message that is, er, hard to miss.

Now, we query for everything created after this timestamp:

> db.foo.find({_id : {$gt : id}})
{ "_id" : ObjectId("4ef100de7d435c39c3016405"), "x" : 5, "y" : 0 }
{ "_id" : ObjectId("4ef100de7d435c39c3016406"), "x" : 5, "y" : 1 }
{ "_id" : ObjectId("4ef100de7d435c39c3016407"), "x" : 5, "y" : 2 }
{ "_id" : ObjectId("4ef100de7d435c39c3016408"), "x" : 5, "y" : 3 }
{ "_id" : ObjectId("4ef100de7d435c39c3016409"), "x" : 5, "y" : 4 }
{ "_id" : ObjectId("4ef100de7d435c39c301640a"), "x" : 5, "y" : 5 }
{ "_id" : ObjectId("4ef100de7d435c39c301640b"), "x" : 5, "y" : 6 }
{ "_id" : ObjectId("4ef100de7d435c39c301640c"), "x" : 5, "y" : 7 }
{ "_id" : ObjectId("4ef100de7d435c39c301640d"), "x" : 5, "y" : 8 }
{ "_id" : ObjectId("4ef100de7d435c39c301640e"), "x" : 5, "y" : 9 }
{ "_id" : ObjectId("4ef100df7d435c39c301640f"), "x" : 6, "y" : 0 }
{ "_id" : ObjectId("4ef100df7d435c39c3016410"), "x" : 6, "y" : 1 }
{ "_id" : ObjectId("4ef100df7d435c39c3016411"), "x" : 6, "y" : 2 }
{ "_id" : ObjectId("4ef100df7d435c39c3016412"), "x" : 6, "y" : 3 }
{ "_id" : ObjectId("4ef100df7d435c39c3016413"), "x" : 6, "y" : 4 }
{ "_id" : ObjectId("4ef100df7d435c39c3016414"), "x" : 6, "y" : 5 }
{ "_id" : ObjectId("4ef100df7d435c39c3016415"), "x" : 6, "y" : 6 }
{ "_id" : ObjectId("4ef100df7d435c39c3016416"), "x" : 6, "y" : 7 }
{ "_id" : ObjectId("4ef100df7d435c39c3016417"), "x" : 6, "y" : 8 }
{ "_id" : ObjectId("4ef100df7d435c39c3016418"), "x" : 6, "y" : 9 }
Type "it" for more

If we look at the explain for the query, you can see that it’s using the index:

> db.foo.find({_id:{$gt:id}}).explain()
{
	"cursor" : "BtreeCursor _id_",
	"nscanned" : 50,
	"nscannedObjects" : 50,
	"n" : 50,
	"millis" : 0,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"isMultiKey" : false,
	"indexOnly" : false,
	"indexBounds" : {
		"_id" : [
			[
				ObjectId("4ef100de0000000000000000"),
				ObjectId("ffffffffffffffffffffffff")
			]
		]
	}
}

We’re not quite done, because we’re actually not returning what we wanted: we’re getting all docs greater than or equal to the “created at” time, not just greater than. To fix this, we’d just need to add 1 to the secs before doing anything else. Or I can claim that we were querying for documents created after i=4 all along.

  • http://www.justaprogrammer.net Justin Dearing

    Great post. A while back I created an ObjectId 
    implementation JavaScript in Javascript
    . It requires a browser to store use html5 local storage or a cookie to store the machine id (a random number) since I could not get a hostname/ip/mac address.

    The .NET driver lets you query an Objectid, as well as the php driver (which you knew since you wrote it). Maybe the shell version of ObjectId should have a convenience functions to get the timestamp out of an ObjectId.

  • Anonymous

    Thanks! You actually can get the timestamp out of an ObjectId pretty easily in the shell: ObjectId().getTimestamp().  I thought it would be pedagogically clearer for people to start with a timestamp and convert it to an ObjectId, instead of ObjectId->timestamp->ObjectId.

  • http://techbrahmana.blogspot.com Brahmana (Srirang)

    This is neat trick. A plain listing by creation time can be done without an additional index this way.

    Although, what would be even awesome is if MongoDB could use multiple indexes for the same query (index intersection if I am not wrong). That is however beyond the scope of this post. :)

  • Anonymous

    ..deleted..

kristina chodorow's blog