Oct 24 • Posted 5 months ago
This is a guest post by Chris Merz & Garrett Clark, SolidFire
We recently had a large enterprise customer implement a MongoDB sharded cluster on SolidFire as the backend for a global e-commerce system. By leveraging solid-state drive technology with features like storage virtualization, Quality of Service (guaranteed IOPS per volume), and horizontal scaling, the customer was looking to combine the benefits of dedicated storage performance with the simplicity and scalability of a MongoDB environment.
During the implementation the customer reached out to us with some performance and tuning questions, requesting our assistance with the configuration. After meeting with the team and reviewing the available statistics, we discovered response times that were deemed out of range for the application’s performance requirements. Response times were ~13-20ms (with an average of 15-17 ms). While this is considered acceptable latency in many implementations, the team was targeting < 5ms average query response times.
When troubleshooting any storage latency performance issue it is important to focus on two critical aspects of the end-to-end chain: potential i/o queue depth bottlenecks and the main contributors to the overall latency in the chain. A typical end-to-end sequence with attached storage can be described by:
MongoDB > OS > NIC > Network > Storage > Network > NIC > OS > MongoDB
First off, we looked for any i/o queue depth bottlenecks and found the first one at the operating system layer. MongoDB was periodically sending an i/o queue depth of >100 to the operating system and, by default, iSCSI could only pass along a queue depth of 32 per iSCSI session. This drop from an i/o queue depth of >100 to 32 caused frames to stall at the operating system layer while they waited to continue down the chain.
We alleviated the issue by increasing the number of iSCSI sessions to the volume from 1 to 4, which proportionally increased the queue depth exiting the operating system to 128 (32*4). This enabled all frames coming off the application layer to immediately pass through the operating system and NIC, and decreased the overall latency from ~15ms to ~4ms. Despite the average latency being 4ms, performance was still rather variable.
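The session arithmetic above can be sketched in a few lines of Python. This is only an illustration of the sizing logic, not a tool we used: the function names are hypothetical, and the 32-deep per-session limit is the default described above.

```python
import math

# Default per-session queue depth described in this post.
PER_SESSION_QUEUE_DEPTH = 32


def sessions_needed(app_queue_depth, per_session_depth=PER_SESSION_QUEUE_DEPTH):
    """Smallest number of iSCSI sessions whose combined depth covers the app's depth."""
    return math.ceil(app_queue_depth / per_session_depth)


def aggregate_queue_depth(sessions, per_session_depth=PER_SESSION_QUEUE_DEPTH):
    """Total queue depth that can leave the OS across all sessions."""
    return sessions * per_session_depth


if __name__ == "__main__":
    # MongoDB was periodically sending a queue depth just over 100.
    needed = sessions_needed(100)
    print(needed, aggregate_queue_depth(needed))
```

With a peak depth just over 100, this gives 4 sessions and an aggregate depth of 128, which matches the change we made.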
We then turned our focus to pinpointing the sources of the remaining end-to-end latency. We were able to determine the latency factors in the stack through the study of three latency loops:
First, the complete chain of: MongoDB > OS > NIC > Network > Storage > Network > NIC > OS > MongoDB. This loop took an average of 3.9ms to complete.
Second, the subset loop of: OS > NIC > Network > Storage > Network > NIC > OS. This loop took ~1.1ms to complete. We determined the latency of this loop from the output of “iostat -xk 1”, grepping for the corresponding volume.
The last loop segment, latency on the storage subsystem, was 0.7ms and was obtained through a polling API command issued to the SolidFire unit.
Our analysis pointed to the first layers of the stack contributing the largest share (>70%) of the end-to-end latency, so we decided to start there and continue downstream.
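The per-layer split follows from subtracting each nested loop from the one that contains it. A minimal sketch of that arithmetic, using the three measurements above, is shown here. It assumes the loops nest cleanly, so that the difference between two loops can be attributed to the outer layer; the layer labels are ours.

```python
# Measured loop latencies from this post, in milliseconds.
FULL_LOOP_MS = 3.9     # MongoDB > OS > ... > OS > MongoDB
OS_LOOP_MS = 1.1       # OS > NIC > ... > NIC > OS (from iostat)
STORAGE_LOOP_MS = 0.7  # storage subsystem only (from the SolidFire API)


def layer_breakdown(full_ms, os_ms, storage_ms):
    """Attribute latency to each layer by subtracting the nested inner loop."""
    return {
        "mongodb": full_ms - os_ms,
        "os_nic_network": os_ms - storage_ms,
        "storage": storage_ms,
    }


def shares_percent(breakdown):
    """Express each layer's latency as a percentage of the total, to one decimal."""
    total = sum(breakdown.values())
    return {layer: round(ms / total * 100, 1) for layer, ms in breakdown.items()}


if __name__ == "__main__":
    parts = layer_breakdown(FULL_LOOP_MS, OS_LOOP_MS, STORAGE_LOOP_MS)
    for layer, pct in shares_percent(parts).items():
        print(f"{layer}: {parts[layer]:.1f} ms ({pct}%)")
```

The MongoDB layer comes out at roughly 2.8ms of the 3.9ms total, which is where the >70% figure comes from.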
We reviewed the OS configuration and tuning, with an eye towards both SolidFire/iSCSI best practices and MongoDB performance. Several OS-level tunables were found that could be tweaked to ensure optimal throughput for this type of deployment. Unfortunately, none of these resulted in any major reduction in the end-to-end latency for mongo.
Having eliminated the obvious, we were left with what remained: MongoDB itself. A phrase oft-quoted by the famous fictional detective Sherlock Holmes came to mind: “When you have eliminated the impossible, whatever remains, however improbable, must be the truth.”
Upon going over the collected statistics runs with a fine-toothed comb, we noticed that the latency spikes had intervals of almost exactly 60 seconds. That’s when the light bulb went off…
The MongoDB flush interval. The architecture of MongoDB was developed in the context of spinning disk, a vastly slower storage technology requiring batched file syncs to minimize query latency. The syncdelay setting defaults to 60 seconds for this very reason. In the documentation, it is clearly stated “In almost every situation you should not set this value and use the default setting”. ‘Almost’ was the key to our solution, in this particular case. It should be noted that changing syncdelay is an advanced tuning, and should be carefully evaluated and tested on a per-deployment basis.
Little’s Law (IOPS = Queue Depth / Latency) indicated that lowering the flush interval would reduce the variance in queue depth thereby smoothing the overall latency. In lab testing, we had found that, under maximum load, decreasing the syncdelay to 1 second would force a ‘continuous flush’ behavior usually repeating every 6-7 seconds, reducing i/o spikes in the storage path. We had seen this as a useful technique for controlling IOPS throughput variability, but had not typically viewed it as a latency reduction technique.
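To make the Little's Law reasoning concrete, the sketch below rearranges the formula to latency = queue depth / IOPS and evaluates it at a few queue depths. The IOPS figure is an assumed midpoint of the ~4-5k range this application runs at, and the queue depths are hypothetical values chosen only to show how a burst of queued i/o inflates latency when throughput is roughly fixed.

```python
def latency_ms(queue_depth, iops):
    """Little's Law rearranged: latency = queue depth / IOPS, in milliseconds."""
    return queue_depth * 1000.0 / iops


def queue_depth_for(iops, latency_in_ms):
    """Little's Law rearranged the other way: queue depth = IOPS * latency."""
    return iops * latency_in_ms / 1000.0


if __name__ == "__main__":
    assumed_iops = 4500  # assumption: midpoint of ~4-5k IOPS per mongod
    for depth in (4, 32, 128):  # hypothetical queue depths, not measurements
        print(f"queue depth {depth:>3}: {latency_ms(depth, assumed_iops):.2f} ms")
```

At a fixed throughput, latency scales linearly with queue depth, so a large batched flush that briefly deepens the queue shows up directly as a latency spike. Spreading the same writes over more frequent, smaller flushes keeps the queue shallow.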
After implementing the change, the customer excitedly reported that they were seeing average end-to-end MongoDB response times of 1.2ms, with a throughput of ~4-5k IOPS per mongod (normal for this application), and NO obvious increase in extraneous i/o.
By increasing the number of iSCSI sessions, normalizing the flush rate and removing the artificial 60s buffer, we reduced average latency more than an order of magnitude, proving out the architecture at scale in a global production environment. Increasing the iSCSI sessions increased parallelism, and decreased the latency by 3.5-4x. The reduction in syncdelay had the effect of smoothing the average queue depth being sent to the storage system, decreasing latency by slightly more than 3x.
This customer’s experience is a good example of how engaging the MongoDB team early on can ensure a smooth product launch. As of today, we’re excited to announce that SolidFire is a MongoDB partner. Learn more about the SolidFire and MongoDB integration on our Database Solutions page. To learn more about performance tuning MongoDB on SolidFire, register for our upcoming webinar on November 6 with MongoDB.
For more information on MongoDB performance, check out Performance Considerations for MongoDB, which covers other topics such as indexes and application patterns and offers insight into achieving MongoDB performance at scale.