MongoDB: Powering the Magic and the Monsters at Stripe

May 2

Update: Watch the video of Greg Brockman’s talk on MongoDB for High Availability at MongoSF ’12.

Stripe offers a simple platform for developers to accept online payments. The company is a long-time user of MongoDB and has built a powerful and flexible system for enabling transactions on the web. In advance of his talk at MongoSF on MongoDB for high availability, Stripe engineer Greg Brockman spoke with us about what’s going on with MongoDB at Stripe.

Stripe has a heavy write load with large query volumes. Can you give us some insight into your tips and tricks for wrangling MongoDB’s replica sets on your system?

Getting replica sets up and running is actually incredibly easy. I used to run MySQL clusters where configuring and maintaining replication was a pain, and it was a joy to just be able to run “rs.add(node)” and watch it join the cluster.

To avoid losing any operations even if we lose our database primary, we structure our application so that all writes are idempotent. We then wrap our calls to the MongoDB driver in a retry block: if the call fails because our MongoDB cluster is currently reconfiguring, we try the operation again (with the usual backoff and timeout you’d expect from a scheme like this).

We’re also very careful to avoid operations that could evict hot data from the cache. Running unindexed queries is an obvious example, but we’ve also found that running a large multi-update can have production impact.
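As a rough illustration of the retry scheme described above (not Stripe’s actual code), here is a minimal Python sketch. The names `TransientClusterError`, `with_retries`, and `make_flaky_upsert` are hypothetical, and an in-memory dict stands in for the database. A real application would catch whatever exception its driver raises during a failover.

```python
import random
import time


class TransientClusterError(Exception):
    """Hypothetical stand-in for the error a driver raises during a failover."""


def with_retries(operation, timeout=30.0, base_delay=0.1, max_delay=5.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call operation() until it succeeds or the timeout would be exceeded.

    Only safe for idempotent operations: a write may already have been
    applied by the time the failure is reported, so it can run twice.
    """
    deadline = clock() + timeout
    attempt = 0
    while True:
        try:
            return operation()
        except TransientClusterError:
            # Exponential backoff with jitter, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            delay *= random.uniform(0.5, 1.0)
            if clock() + delay > deadline:
                raise  # give up: the next attempt would exceed the timeout
            sleep(delay)
            attempt += 1


def make_flaky_upsert(store, failures):
    """Build an idempotent write that reports an error `failures` times."""
    remaining = {"n": failures}

    def upsert():
        # Setting a document by key is idempotent: repeating it is harmless.
        store["charge_1"] = {"amount": 500}
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise TransientClusterError("cluster is reconfiguring")
        return store["charge_1"]

    return upsert
```

Because the write sets a value by key rather than, say, incrementing a counter, running it three times leaves the same state as running it once, which is what makes blind retrying safe.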

When we need to change our schema for an entire collection of documents, we usually avoid a single large multi-update and instead run a slower, lower-impact document-by-document migration at the application level.
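A document-by-document migration of this kind might look roughly like the following Python sketch. The schema change (renaming a field `amt` to `amount`), the function names, and the pause length are all assumptions made for illustration. A plain list of dicts stands in for the collection.

```python
import time


def migrate_document(doc):
    """Hypothetical schema change: rename 'amt' to 'amount'.

    Returns True if the document was changed. Already-migrated documents
    are left alone, so the migration can be stopped and rerun safely.
    """
    if "amt" not in doc:
        return False
    doc["amount"] = doc.pop("amt")
    return True


def migrate_collection(docs, pause=0.01, sleep=time.sleep):
    """Migrate one document at a time, pausing after each write.

    The pause spreads the write load out so the migration does not
    compete with production traffic the way one big update might.
    """
    changed = 0
    for doc in docs:
        if migrate_document(doc):
            changed += 1
            sleep(pause)
    return changed
```

Since old and new shapes coexist while the migration runs, application code has to be able to read both until the last document is converted.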

Let’s take a step back to your past talk at MongoSV ’11. What are you doing with Monster (Stripe’s native events processing system for payments)?

Monster is our framework for event production and event consumption, which uses MongoDB as a highly-available, persistent queue. With Monster, our engineers can start logging a new type of event with only a few lines of code, and at any time in the future can add a consumer that will automatically be passed relevant events (possibly even historical ones). We use it for a variety of purposes: structured logging, incremental updating of state (such as people’s graphs of payment volume), and background jobs.
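Monster’s internals are not described here, so the following is only a toy Python sketch of the general pattern: an append-only event log with named consumers that each track their own position and can therefore be added later and still see historical events. The `EventLog` class and its methods are hypothetical, and an in-memory list stands in for the MongoDB-backed queue.

```python
class EventLog:
    """Toy append-only event log with replayable, named consumers."""

    def __init__(self):
        self._events = []     # stands in for a persistent collection
        self._offsets = {}    # consumer name -> index of next unread event
        self._consumers = {}  # consumer name -> (event_type, handler)

    def produce(self, event_type, payload):
        """Log a new event; producers need no knowledge of consumers."""
        self._events.append({"type": event_type, "payload": payload})

    def register(self, name, event_type, handler, from_start=True):
        """Add a consumer, optionally replaying events logged before it existed."""
        self._consumers[name] = (event_type, handler)
        self._offsets[name] = 0 if from_start else len(self._events)

    def run_once(self):
        """Deliver every unread, relevant event to each consumer."""
        for name, (event_type, handler) in self._consumers.items():
            for event in self._events[self._offsets[name]:]:
                if event["type"] == event_type:
                    handler(event["payload"])
                # Advance only after the handler returns without raising.
                self._offsets[name] += 1
```

For example, a consumer that maintains payment-volume totals could be registered after charges have already been logged and would still receive them on its first run.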

Lots of people are innovating in the financial space — in particular building APIs for mobile payments. For those just starting up, why should they use MongoDB?

As a payments processor, our uptime is incredibly important. We were initially drawn to MongoDB because replica sets make it incredibly easy to run your database in a highly-available fashion. I came from a world where my database master could never be rebooted, since there was no zero-downtime failover strategy even for routine maintenance — MongoDB gives you this almost out of the box.

MongoDB also makes it easy to do zero-downtime migrations, with features such as background index builds and allowing multiple schemas in a single collection. Anyone caring about their availability should look very hard at MongoDB.

How are you guys using the Ruby driver in your system? Anything interesting?

We’ve banged on the Ruby driver in a variety of configurations, ensuring that it behaves properly when exposed to all the possible failures we can imagine (or have noticed) our database servers experiencing. These days, we’re very happy with how robust the Ruby driver is against the wide variety of failure modes of the distributed MongoDB nodes.

What’s your wish list for the Ruby driver?

I wish there were a configuration option for forcing reads from a secondary. (Right now, you can request that reads be on a secondary if one is available, but they’ll start reading from the primary if no secondary is available.)
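The difference between the two behaviors can be sketched in a few lines of Python. This models node selection only and is not the Ruby driver’s API. The `pick_read_node` function, its `strict` flag, and the tuple layout are hypothetical.

```python
class NoSecondaryAvailable(Exception):
    """Raised in strict mode when no secondary can serve the read."""


def pick_read_node(nodes, strict=False):
    """Choose a node for a read.

    nodes is a list of (name, role, is_up) tuples. With strict=False a
    secondary is preferred but the primary is used as a fallback; with
    strict=True the read fails rather than touching the primary.
    """
    secondaries = [name for name, role, up in nodes
                   if role == "secondary" and up]
    if secondaries:
        return secondaries[0]
    if strict:
        raise NoSecondaryAvailable("refusing to read from the primary")
    primaries = [name for name, role, up in nodes
                 if role == "primary" and up]
    if not primaries:
        raise NoSecondaryAvailable("no node is available for reads")
    return primaries[0]
```

Strict mode trades read availability for a guarantee that read traffic never adds load to the primary.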

What’s on Stripe’s engineering roadmap?

While making Stripe available outside the US is our top priority, our biggest engineering challenges at the moment are scaling our systems to keep up with the phenomenal growth we’ve been experiencing.

Many thanks to Greg for taking the time to tell us a bit about the magic at Stripe.