MongoDB stores documents (objects) in a format called BSON. BSON is a binary serialization of JSON-like documents. BSON stands for “Binary JSON”, but it also contains extensions that allow representation of data types that are not part of JSON. For example, BSON has a Date data type and a BinData type.
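To make the format concrete, the following is a minimal sketch of a BSON encoder in plain Python. It is not the code of any MongoDB driver. The function names (encode_element, encode_document) are made up for illustration, only a handful of types are covered, and datetime values are assumed to be timezone-aware. The type bytes and layout follow the published BSON specification as understood here.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_element(name: str, value) -> bytes:
    """Encode one field as: type byte, NUL-terminated name, then the value."""
    key = name.encode("utf-8") + b"\x00"
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return b"\x08" + key + (b"\x01" if value else b"\x00")
    if isinstance(value, int):
        if -2**31 <= value < 2**31:
            return b"\x10" + key + struct.pack("<i", value)  # int32
        return b"\x12" + key + struct.pack("<q", value)      # int64
    if isinstance(value, float):
        return b"\x01" + key + struct.pack("<d", value)
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    if isinstance(value, dict):
        return b"\x03" + key + encode_document(value)
    if isinstance(value, bytes):
        # BinData: payload length, subtype byte (0x00 = generic), payload
        return b"\x05" + key + struct.pack("<i", len(value)) + b"\x00" + value
    if isinstance(value, datetime):
        # Date: signed 64-bit count of milliseconds since the Unix epoch
        # (assumes a timezone-aware datetime)
        millis = (value - EPOCH) // timedelta(milliseconds=1)
        return b"\x09" + key + struct.pack("<q", millis)
    if value is None:
        return b"\x0A" + key
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def encode_document(doc: dict) -> bytes:
    """Encode a dict as: int32 total length, elements, trailing NUL byte."""
    body = b"".join(encode_element(k, v) for k, v in doc.items())
    # The total length counts the 4-byte prefix and the trailing NUL.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"
```

With this sketch, encode_document({"hello": "world"}) yields a 22-byte document. The Date and BinData branches show the two extensions named above, neither of which has a JSON equivalent.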
The MongoDB client drivers perform the serialization and deserialization. For a given language, the driver translates the language’s “object” (ordered associative array) data representation to BSON, and back. While the client performs this work, the database understands the internals of the format and can “reach into” BSON objects when appropriate: for example, to build index keys or to match an object against a query expression. That is to say, MongoDB is not just a blob store.
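As an illustration of what “reaching into” a BSON object can look like, the sketch below scans a serialized document for one top-level field without decoding the others. It is a simplified illustration, not how the MongoDB server is implemented. The names find_field, value_size and FIXED_SIZES are hypothetical, and only a subset of BSON types is handled.

```python
import struct

# Value sizes in bytes for some fixed-width BSON types:
# double, boolean, UTC datetime, null, int32, int64
FIXED_SIZES = {0x01: 8, 0x08: 1, 0x09: 8, 0x0A: 0, 0x10: 4, 0x12: 8}


def value_size(data: bytes, pos: int, type_byte: int) -> int:
    """Return how many bytes the value starting at pos occupies."""
    if type_byte in FIXED_SIZES:
        return FIXED_SIZES[type_byte]
    (n,) = struct.unpack_from("<i", data, pos)
    if type_byte == 0x02:          # string: int32 length + bytes (incl. NUL)
        return 4 + n
    if type_byte in (0x03, 0x04):  # document/array: length includes itself
        return n
    if type_byte == 0x05:          # binary: int32 length + subtype + payload
        return 4 + 1 + n
    raise ValueError(f"type 0x{type_byte:02x} not handled in this sketch")


def find_field(data: bytes, name: str):
    """Return (type_byte, raw_value_bytes) for a top-level field, or None."""
    (total,) = struct.unpack_from("<i", data, 0)
    target = name.encode("utf-8")
    pos = 4
    while pos < total - 1:                   # stop at the trailing NUL
        type_byte = data[pos]
        name_end = data.index(b"\x00", pos + 1)
        key = data[pos + 1:name_end]
        value_pos = name_end + 1
        size = value_size(data, value_pos, type_byte)
        if key == target:
            return type_byte, data[value_pos:value_pos + size]
        pos = value_pos + size               # skip the value undecoded
    return None


# Hand-written BSON for {"hello": "world", "n": 42}
sample = (b"\x1d\x00\x00\x00"
          b"\x02hello\x00\x06\x00\x00\x00world\x00"
          b"\x10n\x00\x2a\x00\x00\x00"
          b"\x00")
type_byte, raw = find_field(sample, "n")
n_value = struct.unpack("<i", raw)[0]
```

Because every value either has a fixed width or carries its own length prefix, the scanner can step over the "hello" string without decoding it. This property is what makes operations such as extracting an index key from a stored document practical.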
Thus, BSON is a language-independent data interchange format.
The BSON serialization code from any MongoDB driver can be used to serialize and deserialize BSON, even in applications where the Mongo database proper is completely uninvolved. This usage is encouraged, and we would be happy to work with others on making the format as generically useful as possible.
BSON's key advantage over XML and JSON is efficiency, in both space and compute time, because it is a binary format.
BSON can also be compared to binary interchange formats such as Protocol Buffers. BSON is more “schemaless” than Protocol Buffers, which is both an advantage in flexibility and a slight disadvantage in space, since BSON carries a little overhead for field names within the serialized data.
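The field-name overhead can be estimated with simple arithmetic. In BSON, each element stores its field name as a NUL-terminated UTF-8 string, so every document repeats its names in full. The helper below is a hypothetical sketch (field_name_overhead is not part of any library) that counts those bytes for a collection of documents sharing the same fields.

```python
def field_name_overhead(field_names, num_documents: int) -> int:
    """Bytes spent on field names across num_documents identical-shape documents."""
    # Each name costs its UTF-8 length plus one terminating NUL byte.
    per_document = sum(len(name.encode("utf-8")) + 1 for name in field_names)
    return per_document * num_documents


# Three fields repeated in one million documents
overhead = field_name_overhead(["user_id", "created_at", "status"], 1_000_000)
```

In this example the names cost 8 + 11 + 7 = 26 bytes per document, or 26,000,000 bytes in total. A schema-based format such as Protocol Buffers avoids most of this by identifying fields with small numeric tags, at the cost of requiring the schema to interpret the data.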