Aug 13
By Trisha Gee, MongoDB Java Engineer and Evangelist
You may have heard that the JVM team at 10gen is working on a 3.0 version of the Java driver. We’ve actually been working on it since the end of last year, and it’s probably as surprising to you as it is to me that we still haven’t finished it yet. But this is a bigger project than it might seem, and we’re working hard to get it right.
So why update the driver? What are we trying to achieve?
Well, the requirements are:
- More maintainable
- More extensible
- Better support for ODMs, third party libraries and other JVM languages
- More idiomatic for Java developers
That’s all very nice, but it’s a bit fluffy. You can basically summarise that as “better all round”, which is probably the requirement of any major upgrade. Since it’s too fluffy to guide us in our development, we came up with the following design goals.
- Cleaner design
- Intuitive API
- Understandable Exceptions
- Test Friendly
- Backwards compatible
Java developers using the driver will have encountered a number of inconsistencies: the way you do things in the shell, or in other drivers, is not always the same way you do things in the Java driver. Even using just the Java driver, methods are confusingly named (what’s the difference between ensureIndex, for example?); the order of parameters is frequently different; often methods are overloaded but sometimes you chain methods; there are helpers such as QueryBuilder but sometimes you need to manually construct a DBObject, and so on.
If you’re working within the driver, the inconsistencies in the code will drive you mad if you’re even slightly OCD: use of whitespace, position of curly braces, position of fields, mixed field name conventions and so on. All of this may seem pedantic to some people, but it makes life unnecessarily difficult if you’re learning to use the driver, and it means that adding features or fixing bugs takes longer than it should.
It’s easy to assume that the driver has a single, very simple, function - to serialise Java to BSON and back again. After all, its whole purpose is to act as a facilitator between your application and MongoDB, so surely that’s all it does - turn your method call and Java objects into wire-protocol messages and vice versa. And while this is an important part of what the driver does, it’s not its only function. MongoDB is horizontally scalable, so that means your application might not be talking to just a single physical machine - you could be reading from one of many secondaries, you could be writing to and reading from a sharded environment, you could be working with a single server. The driver aims to make this as transparent as possible to your application, so it does things like server discovery, selects the appropriate server, and tries to reuse the right connection where appropriate. It also takes care of connection pooling. So as well as serialisation and deserialisation, there’s a whole connection management piece.
The driver also aims to provide the right level of abstraction between the protocol and your application - the driver has a domain of its own, and should be designed to represent that domain in a sane way - with Documents, Collections, Databases and so on exposed to your application in a way that you can intuitively use.
But it’s not just application developers that are using the driver. By implementing the right shaped design for the driver, we can make it easier for other libraries and drivers to reuse some of the low-level code (e.g. BSON protocol, connection management, etc) but put their own API on the front of it - think Spring Data, Morphia, and other JVM languages like Scala. Instead of thinking of the Java driver as the default way for Java developers to access MongoDB, we can think of this as the default JVM driver, on top of which you can build the right abstractions. So we need to make it easier for other libraries to reuse the internals without necessarily having to wrap the whole driver.
All this has led us to design the driver so that there is a Core, around which you can wrap an API - in our case, we’re providing a backward-compatible API that looks very much like the old driver’s API, and we’re working on a new fluent API (more on that in the next section). This Core layer (with its own public API) is what ODMs and other drivers can talk to in order to reuse the common functionality while providing their own API. Using the same core across multiple JVM drivers and libraries should give consistency to how the driver communicates with the database, while allowing application developers to use the library with the most intuitive API for their own needs.
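To make the layering concrete, here is a minimal sketch of one core with two different APIs wrapped around it. Every name in it (SketchCore, InMemoryCore, LegacyStyleCollection, FluentStyleCollection) is invented for illustration and is not the driver's actual API; the in-memory core simply stands in for the real protocol and connection code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical core layer: the low-level operations every API on top can share.
interface SketchCore {
    void insert(String collection, Map<String, Object> document);
    List<Map<String, Object>> findAll(String collection);
}

// In-memory stand-in so the sketch runs without a server.
class InMemoryCore implements SketchCore {
    private final Map<String, List<Map<String, Object>>> data = new HashMap<>();

    @Override
    public void insert(String collection, Map<String, Object> document) {
        data.computeIfAbsent(collection, name -> new ArrayList<>()).add(document);
    }

    @Override
    public List<Map<String, Object>> findAll(String collection) {
        List<Map<String, Object>> copy = new ArrayList<>();
        List<Map<String, Object>> stored = data.get(collection);
        if (stored != null) {
            copy.addAll(stored);
        }
        return copy;
    }
}

// One API around the core: method names in the style of an older driver (assumed).
class LegacyStyleCollection {
    private final SketchCore core;
    private final String name;

    LegacyStyleCollection(SketchCore core, String name) {
        this.core = core;
        this.name = name;
    }

    void save(Map<String, Object> document) {
        core.insert(name, document);
    }

    int count() {
        return core.findAll(name).size();
    }
}

// A second API around the same core: chainable calls (also assumed).
class FluentStyleCollection {
    private final SketchCore core;
    private final String name;

    FluentStyleCollection(SketchCore core, String name) {
        this.core = core;
        this.name = name;
    }

    FluentStyleCollection insert(Map<String, Object> document) {
        core.insert(name, document);
        return this; // returning this is what allows chaining
    }

    List<Map<String, Object>> all() {
        return core.findAll(name);
    }
}

class CoreDemo {
    public static void main(String[] args) {
        SketchCore core = new InMemoryCore();
        new LegacyStyleCollection(core, "people")
                .save(Collections.<String, Object>singletonMap("name", "Ada"));
        int total = new FluentStyleCollection(core, "people")
                .insert(Collections.<String, Object>singletonMap("name", "Grace"))
                .all()
                .size();
        System.out.println(total); // prints 2: both facades wrote through the same core
    }
}
```

Neither facade knows about the other; they only share the core, which is the property an ODM or another JVM language driver would rely on.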
We want an API that:
- Feels natural to Java developers
- Is logical if you’ve learnt how to talk to MongoDB via the shell (since most of our documentation references the shell)
- Is consistent with the other language drivers.
Given those requirements, it might not be a surprise that it’s taking us a while to come up with something that fits all of them, and this process is still in progress. However, from a Java point of view, we would like the following:
- Static typing is an advantage of Java, and we don’t want to lose that. In particular, we’re keen for the IDE to help you out when you’re trying to decide which methods to use and what their parameters are. We want Cmd+space to give you the right answers.
- Generics. They’ve been around for nearly 10 years, so we should probably use them in the driver.
- We want to use names and terms that are familiar in the MongoDB world. So, no more DBObject, please welcome
- More helpers to create queries and objects in a way that makes sense and is self-describing
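As a rough illustration of what static typing, generics and self-describing helpers could buy you, here is a small sketch. TypedCollection, InMemoryTypedCollection, Person and PersonFilters are hypothetical names made up for this example; since the API is still evolving, none of this should be read as what the driver will actually look like.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical typed collection: T is the document type the application works with.
interface TypedCollection<T> {
    void insert(T document);
    List<T> find(Predicate<T> filter);
}

// In-memory stand-in so the sketch runs without a server.
class InMemoryTypedCollection<T> implements TypedCollection<T> {
    private final List<T> documents = new ArrayList<>();

    @Override
    public void insert(T document) {
        documents.add(document);
    }

    @Override
    public List<T> find(Predicate<T> filter) {
        List<T> matches = new ArrayList<>();
        for (T document : documents) {
            if (filter.test(document)) {
                matches.add(document);
            }
        }
        return matches;
    }
}

// Example application type (assumed).
class Person {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

// Hypothetical helper whose name says what the query does.
class PersonFilters {
    static Predicate<Person> olderThan(int age) {
        return person -> person.age > age;
    }
}

class GenericsDemo {
    public static void main(String[] args) {
        TypedCollection<Person> people = new InMemoryTypedCollection<>();
        people.insert(new Person("Ada", 36));
        people.insert(new Person("Tim", 12));

        // No casts: the compiler and the IDE both know this is a List<Person>.
        List<Person> adults = people.find(PersonFilters.olderThan(17));
        System.out.println(adults.get(0).name); // prints Ada
    }
}
```

Because the element type is carried by the collection, passing the wrong kind of filter or assigning the result to the wrong type fails at compile time rather than at runtime.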
The API is still evolving, and what’s on GitHub WILL change. You can take a look if you want to see where we are right now, but we make zero guarantees that what’s there now will make it into any release.
When you’re troubleshooting someone’s problems, it becomes obvious that some of the exceptions thrown by the driver are not that helpful. In particular, it’s quite hard to understand whether it’s the server that threw an error (e.g. you’re trying to write to a secondary, which is not allowed) or the driver (e.g. can’t connect to the server, or can’t serialise that sort of Java object). So we’ve introduced the concept of Client and Server Exceptions. We’ve also introduced a lot more exceptions, so that instead of getting a MongoException with some message that you might have to parse and figure out what to do, we’re throwing specific exceptions for specific cases (for example,
This should be helpful for anyone using the driver - whether you’re using it directly from your application, whether a third party is wrapping the driver and needs to figure out what to do in an exceptional case, or whether you’re working on the driver itself - after all, the code is open source and anyone can submit a pull request.
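One way to picture the client/server split is as an exception hierarchy that calling code can branch on without parsing messages. The class names below are all hypothetical (prefixed with Sketch to make that obvious) and are not the driver's real exception types; the point is only the shape of the hierarchy.

```java
// Hypothetical base type for everything the driver throws.
class SketchMongoException extends RuntimeException {
    SketchMongoException(String message) {
        super(message);
    }
}

// Raised by the driver itself, e.g. it cannot connect or cannot serialise an object.
class SketchClientException extends SketchMongoException {
    SketchClientException(String message) {
        super(message);
    }
}

// Raised because the server reported an error.
class SketchServerException extends SketchMongoException {
    SketchServerException(String message) {
        super(message);
    }
}

// A specific server-side case: writing to a secondary is not allowed.
class SketchWriteToSecondaryException extends SketchServerException {
    SketchWriteToSecondaryException(String message) {
        super(message);
    }
}

// A specific client-side case: the server could not be reached.
class SketchConnectionException extends SketchClientException {
    SketchConnectionException(String message) {
        super(message);
    }
}

class ErrorHandling {
    // Runs an operation and reports which side failed, most specific case first.
    static String describe(Runnable operation) {
        try {
            operation.run();
            return "ok";
        } catch (SketchWriteToSecondaryException e) {
            return "server: retry against the primary";
        } catch (SketchServerException e) {
            return "server: " + e.getMessage();
        } catch (SketchClientException e) {
            return "client: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        String result = describe(() -> {
            throw new SketchConnectionException("cannot connect");
        });
        System.out.println(result); // prints client: cannot connect
    }
}
```

A wrapping library could catch the broad client or server type, while an application that cares about one particular failure can catch the specific subclass.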
The first thing I tried to do when I wrote my first MongoDB & Java application was mock the driver - while you’ll want some integration tests, you may also want to mock or stub the driver so you can test your application in isolation from MongoDB. But you can’t. All the classes are final and there are no interfaces. While there’s nothing wrong with performing system/integration/functional tests on your database, there’s often a need to test areas in isolation to have simple, fast-running tests that verify something is working as expected.
The new driver makes use of interfaces at the API level so that you can mock the driver to test your application, and the cleaner, decoupled design makes it easier to create unit tests for the internals of the driver. And now, after a successful spike, we’ve started implementing Spock tests, both functional and unit, to improve the coverage and readability of the internal driver tests.
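Here is a small sketch of why interfaces at the API level matter for testing. DocumentCollection, SignupService and RecordingCollection are hypothetical names invented for this example, and the stub is written by hand rather than with a mocking library to keep the sketch self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical driver-facing interface the application depends on.
interface DocumentCollection {
    void insert(Map<String, Object> document);
}

// Application code under test: it only knows about the interface.
class SignupService {
    private final DocumentCollection users;

    SignupService(DocumentCollection users) {
        this.users = users;
    }

    // Stores the user and returns true only if the email looks plausible.
    boolean signUp(String email) {
        if (email == null || !email.contains("@")) {
            return false;
        }
        Map<String, Object> document = new HashMap<>();
        document.put("email", email);
        users.insert(document);
        return true;
    }
}

// Hand-written stub: records inserts instead of talking to MongoDB.
class RecordingCollection implements DocumentCollection {
    final List<Map<String, Object>> inserted = new ArrayList<>();

    @Override
    public void insert(Map<String, Object> document) {
        inserted.add(document);
    }
}

class MockingDemo {
    public static void main(String[] args) {
        RecordingCollection stub = new RecordingCollection();
        SignupService service = new SignupService(stub);

        service.signUp("ada@example.com");
        service.signUp("not-an-email");

        System.out.println(stub.inserted.size()); // prints 1: only the valid signup was stored
    }
}
```

With final classes and no interfaces there is nothing for RecordingCollection to implement, which is exactly the problem described above.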
In addition, we’re trying to implement more acceptance tests (which are in Java, not Groovy/Spock). The goal here is to have living documentation for the driver - not only for how to do things (“this is what an insert statement looks like”) but also to document what happens when things don’t go to plan (“this is the error you see when you pass null values in”). These tests are still very much a work in progress, but we hope to see them grow and evolve over time.
Last, but by no means least, all this massive overhaul of design, architecture, and API MUST be backwards compatible. We are committed to all our existing users, and we don’t want them to have to do a big bang upgrade simply to get the new and improved driver. We believe in providing users with an upgrade path which lets them migrate gradually from the old driver, and the old API, to the new driver and new API. This has made development a little bit more tricky, but we think it’s made it easier to validate the design of the new driver - not least because we can run existing test suites against the compatible new driver (the compatible-mode driver exposes the old API but uses the new architecture), to verify that the behaviour is the same as it used to be, other than deprecated functionality.
It was time for the Java Driver for MongoDB to have a bit of a facelift. To ensure a quality product, the drivers team at 10gen decided on a set of design goals for the new driver and have been hard at work creating a driver that meets these criteria.
In the next post, we’ll cover the new features in the 3.0 driver and show you where to find it.