February 8, 2010

Practical MongoDB Training with Kyle Banker

10gen is offering day-long MongoDB training sessions in San Francisco and New York City! Kyle Banker, a software engineer at 10gen, will be leading both sessions. Kyle has presented MongoDB in numerous forums, most recently at Chicago Ruby, and is excited to share his expertise. Kyle is preparing several interesting and challenging projects so that attendees can really get their hands dirty. Whether you are brand new to MongoDB or you’ve played with it already, you will leave this course with a comprehensive understanding of how to build applications with MongoDB.

More details available on the 10gen website.

February 4, 2010

Hosting Center Update

Update on supported hosting options:

  • DreamHost is now offering instant configuration and deployment of MongoDB to DreamHost PS customers
  • Webfaction and Linode have recently published instructions for installing MongoDB on their respective systems

Check out the Hosting Center for more details. If you’re interested in support from other hosting providers, please let us know which ones you’d like to see in the comments.

January 26, 2010

Announcing NoSQL Live from Boston: March 11, 2010

Clear your calendars for NoSQL Live, hosted by 10gen in Boston on March 11th. It’s not your ordinary NoSQL meetup. Rather than introducing attendees to the basic functions of the tools out there, NoSQL Live will bring together people using MongoDB and a number of different non-relational databases to discuss real use cases in production systems. The full-day conference will feature panel discussions, lightning talks, networking sessions, and a NoSQL Lab where attendees can get a practical view of programming with NoSQL databases. Cloudant, which provides a hosted database and data analytics platform based on Apache CouchDB, has been confirmed as a co-sponsor.

More information on speakers, panels, and schedules to come. For those interested in presenting or sponsoring, contact Meghan Gill, event coordinator, at meghan@10gen.com. Visit http://nosqlboston.eventbrite.com/ to register.

December 30, 2009

"Partial Object Updates" will be an Important NoSQL Feature

It’s nice that in SQL we can do things like

UPDATE PERSONS SET X = X + 1

We term this a “partial object update”: we updated the value of X without sending a full row update to the server.

This seems like a very simple thing to be discussing, yet some NoSQL solutions do not support it (others do).

In these new datastores, the average stored object size (whether it be a document, a key/value blob, or a row) tends to be larger than the traditional database row.  The data is not fully normalized, so we are packing more data into a single storage object than before.

This means the cost of full updates is higher.  If we have a 100KB document and want to set a single value within it, passing the full 100KB in both directions over the network for the operation is expensive.

MongoDB supports partial updates in its update operation via a set of special $ operators: $inc, $set, $push, etc.  More of these operators will be added in the future.
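As a rough illustration of what these operators do, the sketch below applies $inc, $set, and $push to a plain in-memory object. The helper `applyUpdate`, the sample document, and its field names are hypothetical and exist only for this example; this is not MongoDB's implementation. In the shell, the equivalent request would look something like db.persons.update( { _id : 1 }, { $inc : { x : 1 } } ), where the collection name is an assumption.

```javascript
// Illustration only: a tiny in-memory stand-in for how $inc, $set and $push
// modify a document. applyUpdate is a hypothetical helper, not a MongoDB API.
function applyUpdate(doc, update) {
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + amount; // add to a numeric field
  }
  for (const [field, value] of Object.entries(update.$set || {})) {
    doc[field] = value; // overwrite a single field
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    if (!Array.isArray(doc[field])) doc[field] = [];
    doc[field].push(value); // append to an array field
  }
  return doc;
}

// A document with a large payload that we do not want to resend on every change.
const person = { _id: 1, x: 41, tags: ["a"], bio: "z".repeat(100 * 1024) };
const update = { $inc: { x: 1 }, $set: { city: "NYC" }, $push: { tags: "b" } };

applyUpdate(person, update);

// Character counts of the JSON forms, used as a rough proxy for message size.
const updateSize = JSON.stringify(update).length;
const fullSize = JSON.stringify(person).length;
console.log(person.x, person.city, person.tags, updateSize, fullSize);
```

The point of the comparison at the end is that only the small update description has to travel to the server, while the large document stays where it is.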

There are further benefits to the technique too.  First, we get easy (single document) atomicity for these operations (consider $inc).  Second, replication is made cheaper: when a partial update occurs, MongoDB replicates the partial update rather than the full object changed.  This makes replication much less expensive and network intensive.

December 10, 2009

NoSQL and the future of cloud databases

From news.cnet.com: One of the cloud-related trends that developers have been paying attention to is “NoSQL,” a set of operational-data technologies based on nonrelational technology. According to Dwight Merriman, CEO of 10gen (the commercial team behind the open-source MongoDB project), we’ll see NoSQL complement existing applications for the foreseeable future.

November 18, 2009

Fast Updates with MongoDB (update-in-place)

One nice feature with MongoDB is that updates can happen “in place” — the database does not have to allocate and write a full new copy of the object.

This can be very fast for use cases with frequent updates.  For example, incrementing a counter is a highly efficient operation.  We need not fetch the document from the server; we can simply send an increment operation over:

db.my_collection.update( { _id : ... }, { $inc : { y : 2 } } ); // increment y by 2

MongoDB disk writes are lazy.  If we receive 1,000 increments in one second for the object, it will only be written once.  Physical writes occur a couple of seconds after the operation.
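The effect of lazy writes can be sketched with a small write-coalescing cache. This is only a general illustration of the idea, not MongoDB's storage layer: `LazyStore`, its fields, and the explicit `flush()` call (standing in for the delayed physical write) are all hypothetical.

```javascript
// Hypothetical write-back cache: updates touch memory immediately and the
// "disk" copy is only written when flush() runs.
class LazyStore {
  constructor() {
    this.memory = new Map(); // current state of each object
    this.disk = new Map();   // stand-in for the datafile
    this.dirty = new Set();  // ids changed since the last flush
    this.diskWrites = 0;     // number of simulated physical writes
  }

  inc(id, field, amount) {
    const doc = this.memory.get(id) || {};
    doc[field] = (doc[field] || 0) + amount;
    this.memory.set(id, doc);
    this.dirty.add(id); // remember that this object needs writing
  }

  flush() {
    for (const id of this.dirty) {
      this.disk.set(id, { ...this.memory.get(id) });
      this.diskWrites += 1; // one write per dirty object, however many updates it saw
    }
    this.dirty.clear();
  }
}

const store = new LazyStore();
for (let i = 0; i < 1000; i++) store.inc("counter", "y", 1);
store.flush();
console.log(store.disk.get("counter").y, store.diskWrites);
```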

One question is what happens when an object grows.  If the object fits in its previous allocation space, it will update in place.  If it does not, it will be moved to a new location in the datafile, and its index keys must be updated, which is slower.  Because of this, Mongo uses an adaptive algorithm to try to minimize moves on an update.  The database computes a padding factor for each collection based on how often items grow and move.  The more often the objects grow, the larger the padding factor will be; when less frequent, smaller.
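The adaptive padding idea can be sketched roughly as follows. The bounds and step sizes are assumptions chosen for illustration, and `PaddingTracker` is a hypothetical class; the post does not give the actual values or algorithm.

```javascript
// Illustrative constants only; the real values are not stated in the post.
const MIN_PADDING = 1.0;  // no extra space
const MAX_PADDING = 2.0;  // at most double the object size
const GROW_STEP = 0.5;    // raise padding when an object had to move
const SHRINK_STEP = 0.01; // lower padding slowly when updates fit in place

class PaddingTracker {
  constructor() {
    this.paddingFactor = MIN_PADDING;
  }

  // Call after each update; moved is true when the object outgrew its slot.
  recordUpdate(moved) {
    const next = moved
      ? this.paddingFactor + GROW_STEP
      : this.paddingFactor - SHRINK_STEP;
    this.paddingFactor = Math.min(MAX_PADDING, Math.max(MIN_PADDING, next));
  }

  // Space to reserve for a new object of the given size.
  allocationSize(docBytes) {
    return Math.ceil(docBytes * this.paddingFactor);
  }
}

const tracker = new PaddingTracker();
tracker.recordUpdate(true); // an object grew and had to move
const afterMove = tracker.paddingFactor;
const reserved = tracker.allocationSize(1000);
tracker.recordUpdate(false); // a later update fit in place
console.log(afterMove, reserved, tracker.paddingFactor);
```

Growing quickly and shrinking slowly is one plausible design choice: a move is expensive, so a single move is treated as stronger evidence than a single in-place update.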

See also:

http://www.mongodb.org/display/DOCS/Updating

http://blog.mongodb.org/post/171353301/using-mongodb-for-real-time-analytics

November 12, 2009

Webinar recording posted

The recording of the webinar on MongoDB by Dwight Merriman (10gen) & Ian White (Business Insider) is available here: http://vivu.tv/portal/archive.jsp?flow=527-472-7945&id=1256920226675 

November 10, 2009

10gen is looking for a full time Java developer

10gen, which provides commercial support for MongoDB, is hiring a Java developer to work full time on the JVM languages in NYC.

This job entails maintenance and improvement of the Java driver, as well as work with JVM languages like Scala and Clojure.

If you’re interested, you can send an email to info at 10gen dot com. If you’re really interested, you should submit a patch to the driver (http://github.com/mongodb/mongo-java-driver/).

-Eliot

November 2, 2009

Joyent

A prebuilt binary for Joyent (labeled “Solaris64”) is now available on the mongodb.org downloads page.

See http://www.mongodb.org/display/DOCS/Joyent for more information including an example of installation.

October 22, 2009

More than 10 Indexes now Supported

The 10 index limit per collection has been raised to 40.  This is available in the latest daily build.

Please consider it “alpha” for now (like any daily build), but let us know how it works.

October 21, 2009

Webinar on MongoDB on Oct 30th

We’re doing a webinar on MongoDB on Oct 30, 2009 at noon EST. It’ll be an overview of MongoDB and will also have Ian White from Business Insider talking about how they are using MongoDB in production.

Details & register at: http://mongodb1.eventbrite.com/

(The webinar is FREE)

We’ve been speaking about MongoDB at physical events like conferences and meetups. But since there’s interest in MongoDB from many different geographical locations, we thought we’d also do a webinar.  This will be an interactive live web event. Look forward to seeing you there!

If you have questions about the webinar or ideas for other such webinars, shoot us an email at info@10gen.com.


October 17, 2009

Databases Should be Dynamically Typed

Software developers often debate the pros and cons of static versus dynamic typing in programming languages.  Yet what about databases?

Of course, static typing is traditional for databases.  In a relational database we usually declare our columns and the datatype of each column’s values.

However, in the NoSQL space we now see what are known as “schemaless” databases. Technically, these products often have some schema: for example, in MongoDB we define collections and indexes.  However, we do not predefine the structure of objects within those collections; they may all be different, or all the same.  The typing is dynamic.
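A minimal sketch of what dynamic typing means for stored objects: below, a plain array stands in for a collection, and the documents in it differ in both fields and value types. The names and data are made up for illustration, and the array is not a MongoDB API.

```javascript
// A plain array stands in for a collection; nothing declares its structure.
const people = [
  { name: "Ann", age: 34 },
  { name: "Bob", age: "unknown", emails: ["bob@example.com"] }, // age is a string here
  { name: "Cy", address: { city: "Boston" } },                  // no age at all
];

// Queries simply skip documents that lack the field or hold another type.
const withNumericAge = people.filter((p) => typeof p.age === "number");
const inBoston = people.filter((p) => p.address && p.address.city === "Boston");

console.log(withNumericAge.map((p) => p.name), inBoston.map((p) => p.name));
```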

Dynamically typed databases are a good fit with dynamically typed programming languages.

It certainly feels like it would be a win to have a dynamically typed database when using a dynamically typed programming language (Ruby, PHP, Python, Erlang, …).  How suboptimal it would be to have all the flexibility of dynamic typing in our code, and then hit a “brick wall” when we go to persist the data and have to statically spec everything out!  There is synergy to be had between the dynamically typed programming language and the dynamically typed database.

Dynamically typed databases can be a good thing when using statically typed programming languages.

The best thing about static typing with compilers is that errors are reported at compile/development time.  This is a big win for statically typed languages such as Java and C++.  However, even with a statically typed database, type-matching errors when storing data are only reported at runtime!  (That is, our Java compiler doesn’t check our MySQL schema.)

Thus some of the power of static typing in programming is lost at the storage layer.  We still retain some benefits: assurance of some consistency in the data stored.  But any failure to honor such a contract is only reported at runtime.  Thus, it is more than worth considering using a “schemaless” database with, say, Java, and getting out of the business of writing data migration scripts with each release.  (Yes, some of that work stays, but we can eliminate the majority.)

Relational databases could be dynamically typed.

While existing RDBMSes are statically typed, this is not an inherent limitation of the relational model.  One could imagine a relational database with tables where one can dynamically insert a row with an extra column value at any time, and where values of cells in the same column of a table may have different types.

September 30, 2009

Upcoming Conferences for the MongoDB Team

We try to speak about MongoDB at as many conferences and meetups as possible. If you’re interested in learning more about MongoDB or in meeting some of the people who work on it, then you should try to make it out to one. Our schedule for the next couple of months is below. If you know of (or are organizing) a conference/meetup where you’d like to hear from us, shoot us an email at info@10gen.com!

  • 10/5/2009 | NYC | NoSQL NYC | Dwight will be presenting about MongoDB and Eliot will be on a panel discussion, but all of us will be at the event

  • 10/16/2009 | DC | DC Hadoop Meetup | Mike will be talking about MongoDB

  • 10/23/2009 | St Louis | Strange Loop Conference | Mike will be discussing MongoDB

  • 10/24/2009 | Foz do Iguaçu, Brazil | Latinoware | Kristina will be talking about MongoDB for web applications

  • 10/27/2009 | NYC | NY PHP | Kristina will be talking about using MongoDB from PHP

  • 11/7/2009 | Poznań, Poland | RuPy | Mike will be talking about using MongoDB from Ruby and Python

  • 11/14/2009 | Portland | OpenSQLCamp Portland | Mike will be in Portland for OpenSQLCamp

  • 11/17/2009 | NYC | Web 2.0 Expo | Eliot will be talking about shifting to non-relational databases

  • 11/19/2009 | San Francisco | RubyConf | Mike will be talking about using MongoDB from Ruby

  • 11/19/2009 | NYC | Interop New York | Dwight will be talking about data in the cloud

September 9, 2009

Storing Large Objects and Files in MongoDB

Large objects, or “files”, are easily stored in MongoDB.  It is no problem to store 100MB videos in the database.  For example, MusicNation uses MongoDB to store its videos.

This has a number of advantages over files stored in a file system.  Unlike a file system, the database will have no problem dealing with millions of objects.  Additionally, we get the power of the database when dealing with this data: we can do advanced queries to find a file, using indexes; we can also do neat things like replication of the entire file set.

MongoDB stores objects in a binary format called BSON.  BinData is a BSON data type for a binary byte array.  However, MongoDB objects are typically limited to 4MB in size.  To deal with this, files are “chunked” into multiple objects that are less than 4MB each.  This has the added advantage of letting us efficiently retrieve a specific range of the given file.
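The chunking idea, and why it makes range reads cheap, can be sketched as below. The helpers `chunkFile` and `readRange` are hypothetical, the chunk size of 4 bytes is chosen only to keep the example small, and the chunk fields (`files_id`, `n`, `data`) are modeled on the GridFS chunk layout; treat the details as assumptions rather than a driver implementation.

```javascript
// Split a byte array into numbered chunk objects of at most chunkSize bytes.
function chunkFile(filesId, bytes, chunkSize) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < bytes.length; offset += chunkSize, n++) {
    chunks.push({ files_id: filesId, n, data: bytes.subarray(offset, offset + chunkSize) });
  }
  return chunks;
}

// Read bytes [start, end) by touching only the chunks that overlap the range.
// Assumes 0 <= start < end <= file length.
function readRange(chunks, chunkSize, start, end) {
  const out = [];
  const first = Math.floor(start / chunkSize);
  const last = Math.floor((end - 1) / chunkSize);
  for (let n = first; n <= last; n++) {
    const base = n * chunkSize; // file offset where chunk n begins
    const from = Math.max(start, base) - base;
    const to = Math.min(end, base + chunks[n].data.length) - base;
    out.push(...chunks[n].data.subarray(from, to));
  }
  return Uint8Array.from(out);
}

const file = Uint8Array.from({ length: 10 }, (_, i) => i); // bytes 0..9
const chunks = chunkFile("video-1", file, 4);              // chunks of 4, 4 and 2 bytes
const middle = readRange(chunks, 4, 3, 9);                 // bytes at offsets 3..8
console.log(chunks.length, Array.from(middle));
```

Because the chunk number can be computed from the byte offset, a reader only needs to fetch the chunks that overlap the requested range instead of the whole file.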

While we could write our own chunking code, a standard format for this chunking is predefined, called GridFS.  GridFS support is included in many MongoDB drivers and also in the mongofiles command line utility.

A good way to do a quick test of this facility is to try out the mongofiles utility.  See the MongoDB documentation for more information on GridFS.

August 27, 2009

1.0 GA Released

The MongoDB team is very happy to announce that we have released MongoDB version 1.0.0.

MongoDB 1.0.0 is production ready for single master, master/slave and replica pair environments.  While there are many more features that people want and that we are working on, 1.0 is very stable and the code base has been used in production for over 18 months.

As usual, you can get it here: http://www.mongodb.org/display/DOCS/Downloads

Note: No changes have been made between 0.9.10 and 1.0.0.  There is a v1.0 branch on github for the 1.0.x releases.  See http://www.mongodb.org/display/DOCS/Version+Numbers for more notes about version numbers.
