Holy Large Hadron Collider, Batman!

Valentin Kuznetsov just presented a paper at the International Conference on Computational Science on CERN’s use of MongoDB for Large Hadron Collider data. The paper, The CMS Data Aggregation System, is available as a PDF at ScienceDirect.
A summary
“CMS” stands for Compact Muon Solenoid, a general-purpose particle physics detector built on the Large Hadron Collider. The CMS project posted a few comics which provide a nice, simple (if somewhat cheesy) explanation of what the CMS/LHC does.
The LHC generates massive amounts of data of all different varieties, which is distributed across a worldwide grid. It sends status messages to some of the computers, job monitoring info to other computers, bookkeeping info still elsewhere, and so on.
This means that each location has specialized queries it can do on the data it has, but up until now it’s been very difficult to query across the whole grid. Enter the Data Aggregation System, designed to allow anything to be queried across all of the machines.
How it works
Read More →