Announcing NoSQL Live from Boston: March 11, 2010

Jan 26 • Posted 4 years ago

Clear your calendars for NoSQL Live, hosted by 10gen in Boston on March 11th. It’s not your ordinary NoSQL meetup. Rather than introducing attendees to the basic functions of the tools out there, NoSQL Live will bring together people using MongoDB and a number of other non-relational databases to discuss real use cases in production systems. The full-day conference will feature panel discussions, lightning talks, networking sessions, and a NoSQL Lab where attendees can get a practical view of programming with NoSQL databases. Cloudant, which provides a hosted database and data analytics platform based on Apache CouchDB, has been confirmed as a co-sponsor.

More information on speakers, panels, and schedules to come. For those interested in presenting or sponsoring, contact Meghan Gill, event coordinator, at Visit to register.

"Partial Object Updates" will be an Important NoSQL Feature

Dec 30 • Posted 4 years ago

It’s nice that in SQL we can do things like


We term this a “partial object update”: we updated the value of X without sending a full row update to the server.
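A minimal sketch of such a statement, run through Python's built-in sqlite3 module. The table name `persons` and the columns `id`, `name`, and `x` are hypothetical, chosen only for illustration. The point is that the statement names only the column being changed.

```python
import sqlite3

# Hypothetical table and column names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, x INTEGER)")
conn.execute("INSERT INTO persons (id, name, x) VALUES (1, 'alice', 10)")

# Partial update: only column x is mentioned. The client never reads or
# resends the rest of the row.
conn.execute("UPDATE persons SET x = x + 1 WHERE id = ?", (1,))
conn.commit()

row = conn.execute("SELECT id, name, x FROM persons WHERE id = 1").fetchone()
print(row)
```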

This may seem like a very simple thing to be discussing, yet some NoSQL solutions do not support it (others do).

In these new datastores, the average stored object size (whether it be a document, a key/value blob, or a row) tends to be larger than the traditional database row.  The data is not fully normalized, so we are packing more data into a single storage object than before.

This means the cost of full updates is higher.  If we have a 100KB document and want to set a single value within it, passing the full 100KB in both directions over the network for the operation is expensive.
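To make the size difference concrete, here is a rough sketch that compares the two payloads. JSON is used only as a stand-in for a wire format, and the document contents and the shape of the update message (`query`, `update`) are assumptions made up for this example.

```python
import json

# A hypothetical document of roughly 100KB: one small counter plus a large body.
doc = {"_id": 1, "views": 41, "body": "x" * 100_000}

# Full update: the whole document travels to the server.
full_payload = json.dumps(doc)

# Partial update: only a selector and the change itself travel.
partial_payload = json.dumps({"query": {"_id": 1}, "update": {"$inc": {"views": 1}}})

print(len(full_payload), len(partial_payload))
```

The partial message stays a few dozen bytes no matter how large the stored document grows.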

MongoDB supports partial updates in its update operation via a set of special $ operators: $inc, $set, $push, etc.  More of these operators will be added in the future.
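The semantics of these operators can be sketched with a toy in-memory function. This is not MongoDB's implementation: `apply_update` is a hypothetical helper that handles only top-level fields and only the three operators named above, and it ignores details such as nested field paths.

```python
import copy

def apply_update(doc, update):
    """Return a copy of doc with $set, $inc, and $push applied (top-level fields only)."""
    result = copy.deepcopy(doc)
    for op, changes in update.items():
        for field, value in changes.items():
            if op == "$set":
                result[field] = value                           # overwrite or create
            elif op == "$inc":
                result[field] = result.get(field, 0) + value    # numeric increment
            elif op == "$push":
                result.setdefault(field, []).append(value)      # append to an array
            else:
                raise ValueError(f"unsupported operator: {op}")
    return result

doc = {"_id": 1, "views": 41, "tags": ["nosql"]}
update = {"$inc": {"views": 1}, "$set": {"title": "hello"}, "$push": {"tags": "mongodb"}}
updated = apply_update(doc, update)
print(updated)
```

Note that the update describes only what changes; the caller never needs to hold the full current document.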

There are further benefits to the technique too.  First, we get easy (single document) atomicity for these operations (consider $inc).  Second, replication is made cheaper: when a partial update occurs, MongoDB replicates the partial update rather than the full changed object.  This makes replication much less expensive and less network-intensive.

Databases Should be Dynamically Typed

Oct 17 • Posted 5 years ago

Software developers often debate the pros and cons of static versus dynamic typing in programming languages.  Yet what about databases?

Of course, static typing is traditional for databases.  In a relational database we usually declare our columns and the datatype of each column’s values.

However, we now see in the NoSQL space what are known as “schemaless” databases. Technically, these products often have some schema: for example, in MongoDB we define collections and indexes.  However, we do not predefine the structure of objects within those collections: they may all be different, or all the same.  The typing is dynamic.
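A small sketch of what this looks like in practice, with a collection modeled as a plain Python list of dicts. The field names and values are made up for illustration.

```python
# A "collection" modeled as a plain list of dicts; no structure is declared up front.
people = []
people.append({"name": "Ann", "age": 31})
people.append({"name": "Bob", "email": "bob@example.com"})  # different fields
people.append({"name": "Cy", "age": "unknown"})             # same field, different type

# The flip side: queries must tolerate missing fields and mixed types.
adults = [p["name"] for p in people
          if isinstance(p.get("age"), int) and p["age"] >= 18]
print(adults)
```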

Dynamically typed databases are a good fit with dynamically typed programming languages.

It certainly feels like it would be a win to have a dynamically typed db when using a dynamically typed programming language (Ruby, PHP, Python, Erlang, …).  How suboptimal it would be to have all the flexibility of dynamic typing in our code, and then hit a “brick wall” when we go to persist the data and have to statically spec everything out!  There is synergy to be had between the dynamically typed programming language and the dynamically typed database.

Dynamically typed databases can be a good thing when using statically typed programming languages.

The best thing about static typing with compilers is that errors are reported at compile/development time.  This is a big win for statically typed languages such as Java and C++.  However, even with a statically typed database, type mismatch errors when storing data are only reported at runtime!  (That is, our Java compiler doesn’t check our MySQL schema.)

Thus some of the power of static typing in programming is lost at the storage layer.  We still retain some benefits: some assurance of consistency in the stored data.  But any failure to honor such a contract is only reported at runtime.  Thus, it is more than worth considering using a “schemaless” database with, say, Java, and getting out of the business of writing data migration scripts with each release.  (Yes, some of that work stays, but we can eliminate the majority.)

Relational databases could be dynamically typed.

While existing RDBMSes are statically typed, this is not an inherent limitation of the relational model.  One could imagine a relational database with tables where one can dynamically insert a row with an extra column value at any time, and where values of cells in the same column of a table may have different types.
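SQLite offers a partial taste of the second half of this idea: by default it attaches a type to each stored value rather than strictly to the column. The sketch below uses Python's built-in sqlite3 module, with a hypothetical table `t` and column `v`. It does not show the first half of the idea, since SQLite still rejects an insert that names an undeclared column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The column is declared INTEGER, but SQLite treats that as a preference, not a constraint.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v INTEGER)")
conn.execute("INSERT INTO t (id, v) VALUES (1, ?)", (42,))
conn.execute("INSERT INTO t (id, v) VALUES (2, ?)", ("hello",))
conn.execute("INSERT INTO t (id, v) VALUES (3, ?)", (1.5,))

# typeof() reports the storage type of each individual value.
types = [r[0] for r in conn.execute("SELECT typeof(v) FROM t ORDER BY id")]
print(types)
```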

What is the Right Data Model?

Jul 16 • Posted 5 years ago

There is certainly plenty of activity in the nonrelational (“NOSQL”) db space right now.  We know for these projects the data model is not relational.  But what is the data model?  What is the right model?

There are many possibilities, the most popular of which are:

Key/Value. Pure key/value stores are blobs stored by key.

Tabular. Some projects use a Google BigTable-like data model which we call “tabular” here — or one can think of it as “multidimensional tabular”.

Document-Oriented. Typical of these are JSON-style data stores.
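As a rough sketch, here is the same record expressed in each of the three models using plain Python structures. The key format, column names, and field names are all hypothetical; real systems differ in many details.

```python
import json

# Key/value: an opaque blob stored under a key; the store cannot see inside it.
kv_store = {"user:1": json.dumps({"name": "Ann", "langs": ["en", "fr"]}).encode("utf-8")}

# Tabular (BigTable-like): row key -> column name -> value.
tabular = {"user:1": {"info:name": "Ann", "langs:en": "1", "langs:fr": "1"}}

# Document-oriented: a JSON-style object whose nested structure the store understands.
documents = [{"_id": 1, "name": "Ann", "langs": ["en", "fr"]}]

# Reading the name back from each model.
name_from_kv = json.loads(kv_store["user:1"])["name"]   # client must decode the blob
name_from_tabular = tabular["user:1"]["info:name"]      # addressed by row and column
name_from_doc = next(d["name"] for d in documents if d["_id"] == 1)
print(name_from_kv, name_from_tabular, name_from_doc)
```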

We think this is a very important topic.  What is the right data model?  Should there be standardization?

Below are some thoughts on the approaches above.  Of course, as MongoDB committers, we are biased — you know which one we’re going to like.

Key/value has the advantage of being simple.  It is easy to make such systems fast and scalable.  The downside is that it is too simple for easy implementation of some real-world problems.  We’d like to see something more general purpose.

The tabular space brings more flexibility.  But why are we sticking to tables?  Shouldn’t we do something closer to the data model of our programming languages?  Tabular jettisons the theoretical underpinnings of relational algebra, yet we still have significant mapping work from program objects to “tables”.  If I were going to work with tables, I’d really like to have full relational power.

We really like the document-oriented approach.  The programming languages we use today, not to mention web services, map very nicely to, say, JSON.  A JSON store gives us an object-like representation, yet it is not tied too tightly to any single language; such a tie would seem wrong for a database.
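A quick sketch of that mapping, using Python's standard json module. The object below is invented for illustration: a nested program structure goes to JSON text and back with no mapping layer in between.

```python
import json

# A hypothetical nested program object: strings, a list, and a list of sub-objects.
post = {
    "title": "What is the Right Data Model?",
    "tags": ["nosql", "data model"],
    "comments": [{"author": "reader", "text": "Interesting."}],
}

encoded = json.dumps(post)     # language-neutral text that any client can parse
decoded = json.loads(encoded)  # back to native dicts and lists
print(decoded == post)
```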

Would love to hear the thoughts of others.

See also: the BSON blog post
