This is a guest post by Stéphane Godbillon, software architect at Zenexity _ I am very excited to introduce ReactiveMongo, a brand new Scala driver for MongoDB. More than just yet-another-async-driver, it’s a _reactive driver that allows you to design very scalable applications unleashing MongoDB capabilities like streaming infinite live collections and files for modern Realtime Web applications.
I/O operations may be handled in the following ways: * synchronously: each time a request is sent, the running thread is blocked, waiting for the response to arrive. When the response is received, the execution flow resumes. * asynchronously: the code handling the response may be not run simultaneously (often in a closure). Still, a thread may be blocked, but it may not be the same thread that did the request. * non-blocking: sending a request does not block any thread.
Synchronous database drivers do not perform very well in heavily loaded applications - they spend a lot of time waiting for a response. When you get 2000 or more clients performing each several database non-trivial requests, such a component becomes the main bottleneck of your application. Moreover, what’s the point of using a nifty, powerful, fully asynchronous/non-blocking web framework (like Play!) if all your database accesses are blocking?
The problem is, almost all MongoDB drivers perform synchronous IO. Particularly, there are no fully implemented non-blocking drivers on the JVM. A year ago, Brendan McAdams from 10gen did actually start an initiative to implement the mongo protocol in a non-blocking manner, Hammersmith. This gave us motivation to work on ReactiveMongo.
ReactiveMongo is designed to avoid any kind of blocking request. Every operation returns immediately, freeing the running thread and resuming execution when it is over. Accessing the database is not a bottleneck anymore.
More than enabling non-blocking database requests, ReactiveMongo is streaming capable. Everywhere it makes sense, you can provide streams of documents instead of one, and you can get streams of documents instead of a classic cursor or list. It is very powerful when you want to serve many clients at the same time, without filling up the memory and in an efficient way. This feature is crucial to build modern web applications. Indeed, the future of the web is in streaming data to a very large number of clients simultaneously. Twitter Stream API is a good example of this paradigm shift that is radically altering the way data is consumed all over the web.
ReactiveMongo enables you to build such a web application right now. It allows you to stream data both into and from your MongoDB servers. Using the well known Iteratee/Enumerator pattern, MongoDB can now be seen as a consumer (or producer) of streams of documents.
Here are some examples of what you can do with ReactiveMongo and Iteratees/Enumerators.
With ReactiveMongo, you can stream the content of a collection. That’s particularly useful when dealing with websockets. For example, you can get a tailable cursor over a capped collection, asynchronously fetch newly inserted documents and push them into a websocket. There is a good example of doing this here.
Imagine that your application receives a lot of documents to insert new documents continuously. Instead of making one database call for each, it is better to build a bulk of documents and then send it to the database. ReactiveMongo enables you to do this in perfectly non-blocking way.
MongoDB provides a way to deal with files, GridFS. That’s great when you want to read and write files, without directly using the filesystem and to profit of MongoDB replication/sharding capabilities. But in order to be very scalable, a web application should be able to stream files, without filling up the memory unnecessarily or blocking one thread per upload/download. ReactiveMongo is the perfect tool to do this. It is designed to be natively streaming capable.
All the database operations that ReactiveMongo performs are non-blocking. No thread may be blocked waiting for a response.
Streaming capability is essential for building modern, reactive web applications. ReactiveMongo implements it with enumerators and iteratees (immutable Producer/Consumer pattern). The API is designed to provide this capability for every feature where it may be interesting: cursors, (bulk) insertion, GridFS…
Like modern web framework try to be conform to the HTTP protocol in their architecture (notably by being RESTful), ReactiveMongo is designed to stay as close as possible of the Mongo Wire Protocol.
ReactiveMongo’s API is designed to be very easy to play with, and to make easier to build libraries upon it.
ReactiveMongo is still under heavy development, but the following are already implemented: * ReplicaSet support * Authentication support * GridFS support (streaming capable) * Cursors (providing a stream of documents) * Bulk inserts * Database commands support * Indexing operations
You can get the code from the Github repository. The scaladoc is also available.
Note that it targets Scala 2.10, but can be tested with Scala 2.9.2 right now.
The first release is expected within the coming weeks. What remains to do:
There is a tutorial on writing a web application with ReactiveMongo and Play Framework. There are also some good examples that cover ReactiveMongo’s capabilities:
There is a ReactiveMongo Google Group. Don’t hesitate to post your questions, bug reports and comments. All kinds of contribution are very welcomed.