Posts tagged: nosql database

Crittercism: Scaling To Billions Of Requests Per Day On MongoDB

Feb 20 • Posted 6 months ago

This is a guest post by Mike Chesnut, Director of Operations Engineering at Crittercism. This June, Mike will present at MongoDB World on how Crittercism scaled to 30,000 requests/second (and beyond) on MongoDB.

MongoDB is capable of scaling to meet your business needs; its name, after all, comes from the word humongous. This doesn't mean there aren't some growing pains you'll encounter along the way, of course. At Crittercism, we've seen a huge amount of growth over the past 2 years and have hit some bumps in the road, but we've also learned some important lessons that we hope will be of use to others.

Background

Crittercism provides the world's first and leading Mobile Application Performance Management (mAPM) solution. Our SDK is embedded in tens of thousands of applications and used by nearly a billion users worldwide. We collect performance data such as error reporting, crash diagnostics details, network breadcrumbs, device/carrier/OS statistics, and user behavior. This data is largely unstructured and varies greatly among different applications, versions, devices, and usage patterns.

Storing all of this data in MongoDB allows us to collect raw information that may be of use in a variety of ways to our customers, while also providing the analytics necessary to summarize the data down to digestible, actionable metrics.

As our request volume has grown, so too has our data footprint; over the course of 18 months our daily request volume increased more than 40x.

Our primary MongoDB cluster now houses over 20TB of data, and getting to this point has helped us learn a few things along the way.

Routing

The MongoDB documentation suggests that the most common topology is to include a router — a mongos process — on each client system. We started doing this and it worked well for quite a while.

As the number of front-end application servers in production grew from tens to several hundred, we found that we were creating heavy load via hundreds (or sometimes thousands) of connections between our mongos routers and our mongod shard servers. This meant that whenever chunk balancing occurred (an integral part of maintaining a well-balanced, sharded MongoDB cluster), the chunk location information stored in the config database took a long time to propagate. This is because every mongos router needs a full picture of where in the cluster all of the chunks reside.

So what did we learn? We found that we could alleviate this issue by consolidating mongos routers onto a few hosts. Our production infrastructure is in AWS, so we deployed 2 mongos servers per availability zone. This gave us redundancy per AZ and offered the shortest possible network path from the clients to the mongos routers. We were concerned about adding an extra hop to the request path, but using Chef to configure all of our clients to talk only to the mongos routers in their own AZ helped minimize this issue.
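As a concrete example, a client in the us-east-1a availability zone would be configured with a connection string listing only that zone's two routers (the hostnames here are illustrative, not our actual configuration):

```
mongodb://mongos-1a-1.example.internal:27017,mongos-1a-2.example.internal:27017
```

Listing both routers in the seed list lets the driver fail over between them without ever leaving the AZ.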

Making this topology change greatly reduced the number of open connections to our mongod shards (which we were able to measure using MMS), without any noticeable reduction in application performance. At the same time, several improvements to MongoDB itself made both the mongos updates and the internal consistency checks more efficient. Combined with the new infrastructure, this meant we could now balance chunks throughout our cluster without causing performance problems.

Shard Replacement

Another scenario we've encountered is the need to replace mongod servers on the fly in order to migrate to larger shards. Again following recommended deployment practice, we deploy MongoDB onto server instances with large RAID10 arrays running xfs; specifically, m2.4xlarge instances in AWS with 16 disks. We've used basic Linux mdadm for performance, at the expense of flexibility in disk configuration. As a result, when we are ready to allocate more capacity to our shards, we need to perform a migration procedure that can sometimes take several days. This means not only that we need to plan ahead appropriately, but also that we need to understand the full process so we can monitor it and react when things go wrong.
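For reference, building one of these arrays looks roughly like the following; the device names, mount point, and options are illustrative rather than our exact configuration:

```
# Mirror and stripe 16 disks into a single RAID10 array (device names illustrative)
mdadm --create /dev/md0 --level=10 --raid-devices=16 /dev/xvd[b-q]
mkfs.xfs /dev/md0
mount -o noatime /dev/md0 /data/mongodb
```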

We start with a replica set where all replicas are at approximately the same disk utilization. We first create a new server instance with more disk allocated to it, and add it to this replica set with rs.add().
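From a mongo shell connected to the primary, that step is simply (hostname illustrative):

```
// Run against the current primary; the hostname is illustrative
rs.add("shard0-replacement-1.example.internal:27017")
```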

The new replica will enter the STARTUP2 state and remain there for a long time (in our case, usually 2-3 days) while it first clones data, then catches up via oplog replication and builds indexes. During this time, index builds will often pause the replication process (note that this behavior is set to change in MongoDB 2.6), so the replication lag does not strictly decrease the whole time: it will steadily decrease for a while, then an index build will cause it to pause and fall further behind, and once the index build completes replication will resume. It's worth noting that while index builds occur, mongostat and anything else that requires a read lock will be blocked as well.
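The new member's state and lag can be watched from the shell while all of this happens:

```
// Print each member's state, then how far each secondary is behind the primary
rs.status().members.forEach(function (m) {
    print(m.name + "  " + m.stateStr + "  " + m.optimeDate);
});
rs.printSlaveReplicationInfo()
```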

Eventually the replica will enter the SECONDARY state and will be fully functional. At this point we can rs.stepDown() one of the old replicas, shut down the mongod process running on it, and then remove it from the replica set via rs.remove(), making the server ready for retirement.
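In shell terms, retiring the old member looks like this (hostname illustrative):

```
// If the member being retired is currently the primary, step it down first
rs.stepDown()
// After shutting down the old mongod, remove the member from the set
rs.remove("shard0-old-1.example.internal:27017")
```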

We then repeat the process for each member of the replica set, until all have been replaced with the new instances with larger disks.

While this process can be time-consuming and somewhat tedious, it gives us a graceful way to grow our database footprint without any customer-facing impact.

Conclusion

Operating MongoDB at scale, as with any other technology, requires some knowledge that you can gain from documentation and some that you have to learn from experience. By being willing to experiment with different strategies, such as those described above, you can discover flexibility that may not have been obvious before. Consolidating the mongos router tier was a big win for Crittercism's Ops team in terms of both performance and manageability, and developing the migration procedure described above has enabled us to continue to grow to meet our needs without affecting our service or our customers.

See how Crittercism, Stripe, Carfax, Intuit, Parse and Sailthru are building the next generation of applications at MongoDB World. Register now and join the MongoDB Community in New York City this June.

The MongoDB Java Driver 3.0: What’s Changing

Aug 30 • Posted 1 year ago

By Trisha Gee, MongoDB Java Engineer and Evangelist

In the last post, we covered the design goals for the new MongoDB Java Driver. In this one, we’re going to go into a bit more detail on the changes you can expect to see, and how to start playing with an alpha version of the driver. Please note, however, that the driver is still a work in progress, and not ready for production.

New features

Other than the overall design changes covered in the last post, the 3.0 driver has the following new features:

  • Pluggable Codecs: This means you can make simple changes to serialisation/deserialisation, like telling the driver to use Joda Time instead of java.util.Date, or you can take almost complete control of how to turn your Java objects into BSON. This should be particularly useful for ODMs or other libraries, as they can write their own codecs to convert Java objects to BSON bytes (see the sketch after this list).
  • Predictable cluster management: We’ve done quite a lot of work around discovering the servers in your cluster and determining which ones to talk to. In particular, the driver doesn’t have to wait for all servers to become available before it can start using the ones that are definitely there. The design is event-based, so as soon as a server notifies the driver of its state, the driver can take appropriate action: use it if it’s active, or start ignoring it if it’s no longer available.
  • Additional Connection Pool features: We’ve added support for additional connection pool settings, and a number of other improvements around connection management. Here’s the full list.
  • Deprecated methods/classes will be removed: In the next 2.x release a number of methods and classes will be deprecated. These, along with existing deprecated methods, will be removed in the 3.0 driver. This should point you in the right direction to help you migrate from 2.x to 3.x.
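To make the codec idea concrete, here is a minimal sketch of a Joda Time codec, written against the Codec interface as it eventually shipped in the 3.0 driver (the alpha API may differ in detail):

```java
import org.bson.BsonReader;
import org.bson.BsonWriter;
import org.bson.codecs.Codec;
import org.bson.codecs.DecoderContext;
import org.bson.codecs.EncoderContext;
import org.joda.time.DateTime;

// Stores Joda Time DateTimes as ordinary BSON dates
public class JodaDateTimeCodec implements Codec<DateTime> {

    @Override
    public void encode(BsonWriter writer, DateTime value, EncoderContext encoderContext) {
        writer.writeDateTime(value.getMillis()); // BSON dates are millis since the epoch
    }

    @Override
    public DateTime decode(BsonReader reader, DecoderContext decoderContext) {
        return new DateTime(reader.readDateTime());
    }

    @Override
    public Class<DateTime> getEncoderClass() {
        return DateTime.class;
    }
}
```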

Speaking of Migration…

We’ve worked hard to maintain backwards compatibility whilst moving forwards with the architecture of the Java driver for MongoDB. We want to make migration as painless as possible; in many cases it should be a simple drop-in replacement if you want to keep using the existing API. We hope to provide a step-by-step guide to migrating from 2.x to 3.0 in the very near future. For now, it’s worth mentioning that upgrading will be easiest if you update to 2.12 (to be released soon), migrate any code that uses deprecated features, and then move to the compatible mode of the new driver.

Awesome! Can I try it?

Yes you can! You can try out an alpha of the new driver right now, but as you’d expect there are CAVEATS: this is an alpha; it does not support all current features (notably aggregation); and although it has been tested, it is still in development and we can’t guarantee everything will work as you expect. Features which have been or will be deprecated in the 2.x driver are missing completely from the 3.0 driver. Please don’t use it in production. However, if you do want to play with it in a development environment, or want to run your existing test suite against it, please do send us any feedback you have.

If you want to use the compatible mode, with the old API (minus deprecations) and new architecture:

Maven
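The dependency looks something like the following (the coordinates below are illustrative; substitute the current alpha version):

```xml
<!-- Illustrative coordinates; substitute the current alpha version -->
<dependency>
    <groupId>org.mongodb</groupId>
    <artifactId>mongo-java-driver</artifactId>
    <version>3.0.0-SNAPSHOT</version>
</dependency>
```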

Gradle
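Or the Gradle equivalent (again, illustrative coordinates):

```groovy
// Illustrative coordinates; substitute the current alpha version
dependencies {
    compile 'org.mongodb:mongo-java-driver:3.0.0-SNAPSHOT'
}
```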

You should be able to do a drop-in replacement with this dependency: use it instead of your existing MongoDB driver, run it in your test environment, and see how ready you are to use the new driver.

If you want to play with the new, ever-changing, not-at-all-final API, then you can use the new driver with the new API. Because we wanted to be able to support both APIs without a big-bang switchover, there’s a subtle difference in the location of the driver with the updated API; see if you can spot it:

Maven
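Something like the following; the artifactId shown here is an assumption, chosen to illustrate that the new API lives at different coordinates:

```xml
<!-- Illustrative coordinates; note the different artifactId -->
<dependency>
    <groupId>org.mongodb</groupId>
    <artifactId>mongodb-driver</artifactId>
    <version>3.0.0-SNAPSHOT</version>
</dependency>
```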

Gradle
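And the Gradle equivalent (same caveat about the coordinates):

```groovy
// Illustrative coordinates; note the different artifactId
dependencies {
    compile 'org.mongodb:mongodb-driver:3.0.0-SNAPSHOT'
}
```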

Note that if you use the new API version, you don’t have access to the old compatible API.

Of course, the code is on GitHub.

In Summary

For 3.0, we will deliver the updated, simplified architecture with the same API as the existing driver, as well as work towards a more fluent style of API. This means that although you will have the option of using the new API in future, you should also be able to do a simple drop-in replacement of your driver jar file and have your application work as before.

A release date for the 3.0 driver has not been finalized, but keep your eyes open for it.

All Hail the new Java driver!

The MongoDB Java Driver 3.0

Aug 13 • Posted 1 year ago

By Trisha Gee, MongoDB Java Engineer and Evangelist

You may have heard that the JVM team at 10gen is working on a 3.0 version of the Java driver. We’ve actually been working on it since the end of last year, and it’s probably as surprising to you as it is to me that we haven’t finished it yet. But this is a bigger project than it might seem, and we’re working hard to get it right.

So why update the driver? What are we trying to achieve?

Well, the requirements are:

  • More maintainable
  • More extensible
  • Better support for ODMs, third party libraries and other JVM languages
  • More idiomatic for Java developers