Popping Timestamps into ObjectIds
ObjectIds contain a timestamp, which tells you when the document was created. Because the _id field is always indexed, that means you have a “free” index on your “created at” time (unless you have persnickety requirements for creation times, like resolutions of less than a second, synchronization across app servers, etc.).
Actually using this index can seem daunting (how do you use an ObjectId to query for a certain date?) so let’s run through an example.
First, let’s insert 100 sample docs, 10 docs/second.
> for (i=0; i<10; i++) { ... print(i+": "+Date.now()); ... for (j=0; j<10; j++) { ... db.foo.insert({x:i,y:j}); ... } ... sleep(1000); ... } 0: 1324417241111 1: 1324417242112 2: 1324417243112 3: 1324417244113 4: 1324417245114 5: 1324417246115 6: 1324417247115 7: 1324417248116 8: 1324417249117 9: 1324417250117 |
Let’s find all entries created after 1324417246115 (when i=5).
The time is currently in milliseconds (that’s how JavaScript does dates), so we’ll have to convert it to seconds:
> secs = Math.floor(1324417246115/1000) 1324417246 |
(Your secs will be different than mine, of course.)
ObjectIds can be constructed from a 24-character string, each two characters representing a byte (e.g., “ff” is 255). So, we need to convert secs to hexidecimal, which luckily is super-easy in JavaScript:
> hexSecs = secs.toString(16) 4ef100de |
Now, we create an ObjectId from this:
> id = ObjectId(hexSecs+"0000000000000000") ObjectId("4ef100de0000000000000000") |
If you get the wrong number of zeros here, you’ll get an error message that is, er, hard to miss.
Now, we query for everything created after this timestamp:
> db.foo.find({_id : {$gt : id}}) { "_id" : ObjectId("4ef100de7d435c39c3016405"), "x" : 5, "y" : 0 } { "_id" : ObjectId("4ef100de7d435c39c3016406"), "x" : 5, "y" : 1 } { "_id" : ObjectId("4ef100de7d435c39c3016407"), "x" : 5, "y" : 2 } { "_id" : ObjectId("4ef100de7d435c39c3016408"), "x" : 5, "y" : 3 } { "_id" : ObjectId("4ef100de7d435c39c3016409"), "x" : 5, "y" : 4 } { "_id" : ObjectId("4ef100de7d435c39c301640a"), "x" : 5, "y" : 5 } { "_id" : ObjectId("4ef100de7d435c39c301640b"), "x" : 5, "y" : 6 } { "_id" : ObjectId("4ef100de7d435c39c301640c"), "x" : 5, "y" : 7 } { "_id" : ObjectId("4ef100de7d435c39c301640d"), "x" : 5, "y" : 8 } { "_id" : ObjectId("4ef100de7d435c39c301640e"), "x" : 5, "y" : 9 } { "_id" : ObjectId("4ef100df7d435c39c301640f"), "x" : 6, "y" : 0 } { "_id" : ObjectId("4ef100df7d435c39c3016410"), "x" : 6, "y" : 1 } { "_id" : ObjectId("4ef100df7d435c39c3016411"), "x" : 6, "y" : 2 } { "_id" : ObjectId("4ef100df7d435c39c3016412"), "x" : 6, "y" : 3 } { "_id" : ObjectId("4ef100df7d435c39c3016413"), "x" : 6, "y" : 4 } { "_id" : ObjectId("4ef100df7d435c39c3016414"), "x" : 6, "y" : 5 } { "_id" : ObjectId("4ef100df7d435c39c3016415"), "x" : 6, "y" : 6 } { "_id" : ObjectId("4ef100df7d435c39c3016416"), "x" : 6, "y" : 7 } { "_id" : ObjectId("4ef100df7d435c39c3016417"), "x" : 6, "y" : 8 } { "_id" : ObjectId("4ef100df7d435c39c3016418"), "x" : 6, "y" : 9 } Type "it" for more |
If we look at the explain for the query, you can see that it’s using the index:
> db.foo.find({_id:{$gt:id}}).explain() { "cursor" : "BtreeCursor _id_", "nscanned" : 50, "nscannedObjects" : 50, "n" : 50, "millis" : 0, "nYields" : 0, "nChunkSkips" : 0, "isMultiKey" : false, "indexOnly" : false, "indexBounds" : { "_id" : [ [ ObjectId("4ef100de0000000000000000"), ObjectId("ffffffffffffffffffffffff") ] ] } } |
We’re not quite done, because we’re actually not returning what we wanted: we’re getting all docs greater than or equal to the “created at” time, not just greater than. To fix this, we’d just need to add 1 to the secs before doing anything else. Or I can claim that we were querying for documents created after i=4 all along.
Subscribe