Using MongoDB for Real-time Analytics

Aug 25 • Posted 4 years ago

Some MongoDB developers use the database to track real-time performance metrics for their websites (page views, uniques, etc.). Tools like Google Analytics are great but not real-time, so it is sometimes useful to build a secondary system that provides basic real-time stats.

Using the Mongo upsert and $inc features, we can efficiently solve the problem.  When an app server renders a page, the app server can send one or more updates to the database to update statistics.

We can do this efficiently for a few reasons.  First, we send a single message to the server for the update.  The message is an “upsert”: if the object exists, we increment the counters; if it does not, the object is created.  Second, we do not wait for a response; we simply send the operation and immediately return to other work at hand.  As the data is simply page counters, we do not need to wait and see if the operation completes (we wouldn’t report such an error to our website user anyway).  Third, the special $inc operator lets us efficiently update an existing object without requiring a much more expensive query/modify/update sequence.
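To make the upsert-plus-$inc behavior concrete, here is a minimal in-memory sketch in plain JavaScript. It is only a model of the semantics described above, not how MongoDB implements the operation; the function name `upsertInc` and the array standing in for a collection are assumptions made for illustration.

```javascript
// Illustrative in-memory model of an upsert combined with $inc.
// `upsertInc` and the plain array "collection" are hypothetical stand-ins,
// not part of MongoDB or any of its drivers.
function upsertInc(collection, query, inc) {
  // Find the first document whose fields all equal the query's fields.
  let doc = collection.find((d) =>
    Object.keys(query).every((k) => d[k] === query[k])
  );
  if (!doc) {
    // No match: create the document from the query fields (the "insert" half).
    doc = { ...query };
    collection.push(doc);
  }
  // Apply each increment; a counter that does not exist yet starts from 0.
  for (const [field, amount] of Object.entries(inc)) {
    doc[field] = (doc[field] || 0) + amount;
  }
  return doc;
}

const stats = [];
// First call creates the document, second call only increments it.
upsertInc(stats, { hour: "2009-03-05T10", site: "abc" }, { uniques: 1, pageviews: 1 });
upsertInc(stats, { hour: "2009-03-05T10", site: "abc" }, { uniques: 0, pageviews: 1 });
console.log(stats.length, stats[0].uniques, stats[0].pageviews); // 1 1 2
```

The point of the model is that the caller never reads the document first: one call covers both the "create" and the "increment" case, which is what removes the query/modify/update round trip.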

The example below demonstrates this using the mongo shell syntax (analogous steps can be done in any programming language for which one has a Mongo driver).
$ ./mongo
> c = db.uniques_by_hour;
> c.find();
> cur_hour = new Date("Mar 05 2009 10:00:00")
> c.ensureIndex( { hour : 1, site : 1 } );
> c.update( { hour : cur_hour, site : "abc" },
            { $inc : { uniques:1, pageviews: 1} },
            { upsert : true } )
> c.find();
{"_id" : "49aff5c62f47a38ee77aa5cf" ,
 "hour" : "Thu Mar 05 2009 10:00:00 GMT-0500 (EST)" ,
 "site" : "abc" , "uniques" : 1 ,
 "pageviews" : 1}
> c.update( { hour : cur_hour, site : "abc" },
            { $inc : { uniques:1, pageviews: 1} },
            { upsert : true } )
> c.find();
{"_id" : "49aff5c62f47a38ee77aa5cf" ,
 "hour" : "Thu Mar 05 2009 10:00:00 GMT-0500 (EST)" ,
 "site" : "abc" , "uniques" : 2 , "pageviews" : 2}
> c.update( { hour : cur_hour, site : "abc" },
            { $inc : { uniques:0, pageviews: 1} },
            { upsert : true } )
> c.find();
{"_id" : "49aff5c62f47a38ee77aa5cf" ,
 "hour" : "Thu Mar 05 2009 10:00:00 GMT-0500 (EST)" ,
 "site" : "abc" , "uniques" : 2 , "pageviews" : 3}
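
In an application, the hour bucket and the choice between incrementing uniques by 1 or 0 would be computed per request rather than typed by hand. The sketch below shows one way to build the three arguments used in the session above. The helper names `hourBucket` and `pageViewUpdate` are hypothetical, the bucket is aligned to the UTC hour, and how the app decides that a visitor is new (for example, via a cookie) is assumed to happen elsewhere.

```javascript
// Hypothetical helpers that build the arguments for the hourly upsert.
// `hourBucket` and `pageViewUpdate` are illustrative names, not driver APIs.
const HOUR_MS = 60 * 60 * 1000;

function hourBucket(date) {
  // Truncate the timestamp to the start of its UTC hour.
  return new Date(Math.floor(date.getTime() / HOUR_MS) * HOUR_MS);
}

function pageViewUpdate(site, when, isNewVisitor) {
  return {
    // Selects (or, on upsert, creates) the counter document for this site and hour.
    query: { hour: hourBucket(when), site: site },
    // Every page view counts; uniques only moves for a first-time visitor.
    update: { $inc: { uniques: isNewVisitor ? 1 : 0, pageviews: 1 } },
    options: { upsert: true },
  };
}

const args = pageViewUpdate("abc", new Date("2009-03-05T15:42:10Z"), true);
console.log(args.query.hour.toISOString()); // 2009-03-05T15:00:00.000Z
// In the shell session above these would be passed as:
//   c.update(args.query, args.update, args.options)
```

Keeping the bucket computation in one helper means every app server derives the same `hour` value for a given request time, so all of their updates land on the same counter document.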