Deployment Best Practices: Monitor your resources

Mar 28 • Posted 1 year ago

When you’re preparing a MongoDB deployment, you should try to understand how your application is going to hold up in production. It’s a good idea to develop a consistent, repeatable approach to managing your deployment environment so that you can minimize any surprises once you’re in production.

The best approach incorporates prototyping your setup, conducting load testing, monitoring key metrics, and using that information to scale your setup. The key part of the approach is to proactively monitor your entire system. This will help you understand how your production system will hold up before deploying, and determine where you’ll need to add capacity. Having insight into potential spikes in your memory usage, for example, could help put out a write-lock fire before it starts.

Read more

MongoDB 2.4 JavaScript Changes

Mar 27 • Posted 1 year ago

The upcoming release of MongoDB 2.4 brings an exciting change to the JavaScript engine. Previously, MongoDB ran SpiderMonkey 1.7, but going forward, MongoDB will be running V8, the open-source high-performance JavaScript engine from Google. This means that from now on, whenever JavaScript is executed, V8 will be running the show.

In this post we’ll examine the following primary impacts of this change:

  1. concurrency improvements
  2. modernized JavaScript implementation
  3. impacted features

Concurrency improvements

Previously, every query/command that used the JS interpreter had to acquire a mutex, thus serializing all JS work. Now, with V8 we have improved concurrency by allowing each JavaScript job to run on a separate core.

For example, if a user’s workload commonly involved 24 concurrent $where queries (each from a unique client), and they have a server with 24 cores, they should expect query execution times to be reduced by (roughly) a factor of 24.
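The factor-of-24 figure follows from a simple idealized model, sketched here in plain JavaScript. It assumes every job takes the same time, is purely CPU-bound, and that nothing else contends for cores or locks. The function names and the 50 ms job time are illustrative assumptions, not measurements.

```javascript
// Idealized wall-clock model: every JavaScript job takes the same time (jobMs).

// With a single interpreter mutex, jobs run strictly one after another.
function serializedWallTime(jobs, jobMs) {
  return jobs * jobMs;
}

// Without the mutex, up to `cores` jobs run at once, in successive waves.
function concurrentWallTime(jobs, jobMs, cores) {
  return Math.ceil(jobs / cores) * jobMs;
}

// Hypothetical workload: 24 concurrent $where queries on a 24-core server.
const jobs = 24, cores = 24, jobMs = 50;
const before = serializedWallTime(jobs, jobMs);
const after = concurrentWallTime(jobs, jobMs, cores);
console.log(`speedup: ${before / after}x`);
```

Real workloads will usually see less than the ideal speedup, since queries also compete for memory, disk, and other locks.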

Read more

MongoDB 2.4 Released

Mar 19 • Posted 1 year ago

The MongoDB Engineering Team is pleased to announce the release of MongoDB 2.4. This is the latest stable release, following the September 2012 release of MongoDB 2.2. This release contains key new features along with performance improvements and bug fixes. We have outlined some of the key features below. For additional details about the release:

Highlights of MongoDB 2.4 include:

  • Hash-based Sharding
  • Capped Arrays
  • Text Search (Beta)
  • Geospatial Enhancements
  • Faster Counts
  • Working Set Analyzer
  • V8 JavaScript engine
  • Security

Read more

MongoDB Tip: The touch Command

Mar 6 • Posted 1 year ago

MongoDB 2.2 introduced the touch command, which loads data from the data storage layer into memory. The touch command will load a collection’s documents, indexes or both into memory. This can be ideal to preheat a newly started server, in order to avoid page faults and slow performance once the server is brought into production. You can also use this when adding a new secondary to an existing replica set to ensure speedy subsequent reads.

Note that while the touch command is running, a replica set member will enter the RECOVERING state to prevent reads from clients. When the operation completes, the secondary will return to the SECONDARY(2) state.

You invoke the touch command through the following syntax:

db.runCommand({ touch: "collection_name", data: true, index: true })
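As a minimal sketch, the command document can also be built by a small helper so that the choice between documents, indexes, or both is explicit. The helper name `touchCommand` and the collection name `records` are hypothetical, and the guard against loading nothing is an assumption of this sketch, not documented server behavior.

```javascript
// Hypothetical helper (not part of MongoDB) that builds a touch command document.
// The command name must be the first key, so `touch` is added first.
function touchCommand(collection, options = {}) {
  const { data = true, index = true } = options;
  if (!data && !index) {
    // Assumption: asking to load neither documents nor indexes is a mistake.
    throw new Error("nothing to load: set data, index, or both to true");
  }
  return { touch: collection, data: data, index: index };
}

// Load only the indexes of a hypothetical "records" collection.
console.log(JSON.stringify(touchCommand("records", { data: false })));
```

The resulting document is what you would pass to db.runCommand() in the mongo shell.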

Read more

MongoDB and Hadoop: A Step-by-Step Tutorial Using the Mortar Development Framework

Feb 19 • Posted 1 year ago

The following is a guest post from Jeremy Karn. This article is excerpted from ‘MongoDB + Hadoop: A Step-by-Step Tutorial’. Jeremy is a cofounder at Mortar Data, a Hadoop-as-a-service provider, and creator of mortar, an open source framework for data processing.

People who are worried about scalability often find themselves looking at two tools: MongoDB for storing large amounts of data easily and Hadoop for processing that data. But a common question is: “How do I combine these two to really get the most out of my data?”

Read more

Analyzing Your MongoDB Data with Analytica

Jan 29 • Posted 1 year ago

This is a guest post by Nosh Petigara, president of Analytica.

Analytica is an analytics platform that makes it easy to analyze and report on data like user profiles, event logs, product catalogs, user-generated content, financial assets, or anything else you may have stored in your MongoDB database.

Analytica is built from the ground up for rich document-type data and uses a JSON-like representation throughout its architecture. You use Analytica Script, a declarative expression language tailored for JSON data, to tell Analytica how to perform calculations and how to filter, group, and transform your documents into the results you want. You can interact with Analytica using a plug-in to Microsoft Excel or a command line shell. Analytica can also be used through its REST API. Browser-based and mobile interfaces are coming soon.

Read more

Checking Disk Performance with the mongoperf Utility

Jan 17 • Posted 1 year ago

Note: while this blog post uses some Linux commands in its examples, mongoperf runs and is useful on just about all operating systems.

mongoperf is a utility for checking the disk I/O performance of a server independent of MongoDB. It performs simple timed random disk I/O operations.

mongoperf has a couple of modes: mmf:false and mmf:true

mmf:false mode is a completely generic random physical I/O test — there is effectively no MongoDB code involved.
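As a rough sketch, a configuration for an mmf:false read test could be generated like this. The helper name `mongoperfConfig` is hypothetical, and the option names (nThreads, fileSizeMB, r, w) come from the mongoperf documentation rather than from this post, so check them against your version. The values are illustrative only.

```javascript
// Hypothetical helper that builds a mongoperf configuration document.
// Option names are taken from the mongoperf documentation; values are examples.
function mongoperfConfig(overrides = {}) {
  const defaults = {
    nThreads: 4,      // number of concurrent I/O threads
    fileSizeMB: 1000, // size of the test file
    mmf: false,       // false: generic physical I/O test; true: memory-mapped files
    r: true,          // issue reads
    w: false          // do not issue writes
  };
  return Object.assign({}, defaults, overrides);
}

// Print the configuration as JSON so it can be fed to mongoperf on standard input.
console.log(JSON.stringify(mongoperfConfig({ nThreads: 16 })));
```

Saved as a script, the output could be piped to mongoperf, which reads its configuration from standard input.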

Read more

MongoDB Text Search: Experimental Feature in MongoDB 2.4

Jan 14 • Posted 1 year ago

Text search (SERVER-380) is one of the most requested features for MongoDB. 10gen is working on an experimental text-search feature, to be released in v2.4, and we’re already seeing some talk in the community about the native implementation within the server. We view this as an important step towards fulfilling a community need.

MongoDB text search is still in its infancy and we encourage you to try it out on your datasets. Many applications use both MongoDB and Solr/Lucene, but realize that there is still a feature gap. For some applications, the basic text search that we are introducing may be sufficient. As you get to know text search, you can determine when MongoDB has crossed the threshold for what you need.

Setting up Text Search

You can configure text search in the mongo shell:

db.adminCommand( { setParameter : 1, textSearchEnabled : true } )

Or enable it from the command line when starting mongod:

mongod --setParameter textSearchEnabled=true


Read more

MongoDB Schema Design: Insights and Tradeoffs from Jetlore

MongoDB’s flexible schema is a powerful feature, and to build a successful first application you need to know how to leverage this feature to its full extent. In this presentation, Montse Medina outlines lessons learned from building Jetlore, a social content marketing platform. Some performance tips from this video:

  • Sometimes it’s OK to randomize your shard key. When you have lots of users who want to read from other users, you’ll need to randomize it in order to have fewer disk seeks per shard.
  • Reduce collection size by always using short field names as a convention. This will help you save memory over time.
  • Always test your queries with .explain() to check that you’re hitting the right index.
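To get a feel for the short-field-name tip, here is a back-of-the-envelope sketch. BSON stores every field name as a NUL-terminated string inside each document, so the saving per field is the difference in name length. The field names and the collection size are hypothetical.

```javascript
// Bytes saved per document when each long field name is replaced by a short one.
// BSON stores the field name in every document, so the saving per field is the
// difference in UTF-8 byte length between the two names.
function bytesSavedPerDocument(renames) {
  return Object.entries(renames).reduce(
    (total, [longName, shortName]) =>
      total + Buffer.byteLength(longName, "utf8") - Buffer.byteLength(shortName, "utf8"),
    0
  );
}

// Hypothetical renames and collection size.
const renames = { userName: "u", createdAt: "c", followerCount: "f" };
const perDoc = bytesSavedPerDocument(renames);
const docs = 10000000;
console.log(`${perDoc} bytes per document, ${perDoc * docs} bytes across ${docs} documents`);
```

The estimate ignores padding and index sizes, so treat it as a lower-effort sanity check rather than a capacity plan.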

Performance Tips: MongoDB at Firescope

Dec 19 • Posted 1 year ago

Guest post by Pete Whitney

Starting to work with any new technology or new API is always challenging at first. You’re often not quite sure of the best ways to get things done or whether you’re using the new technology in the most efficient manner. Furthermore, the early learning process is often littered with trial-and-error improvements that unfortunately take time to rework and reengineer into more optimal solutions. It sure would be nice if we could short-circuit this learning process and simply arrive at nirvana on the first cut. While I won’t proclaim that the destination of this blog is nirvana, I will try to short-circuit the learning process by sharing four specific performance-related tips that we learned at FireScope Inc. when we transitioned from MySQL to MongoDB for our improved cloud-based Stratis product. This blog will share the shorthand version of these tips and point the reader to a more in-depth rendering if further understanding is desired.

  1. Through a comparative analysis, FireScope found that accessing MongoDB via the MongoDB Java driver was three times faster than doing the same access using SpringData. While SpringData yields many benefits, it accomplishes its job using a reflection-based solution that runs on top of the native MongoDB Java driver. For FireScope’s performance-centric needs, paying a 3X performance penalty for those benefits was not a tradeoff we were willing to make.

Read more

3D Repo Runs MongoDB

Dec 12 • Posted 1 year ago

If you’re an architectural or engineering firm, you’ve undoubtedly confronted the difficulty of managing and collaborating on 3D assets like CAD drawings. Sharing massive files is hard but feasible; what significantly complicates it is the inability to determine whether you’re using the latest version. For the CAD-inclined, there’s hope. Jozef Dobos, a doctoral student at University College London (UCL), has applied the geospatial indexing capabilities of MongoDB to a version control system for 3D assets called 3D Repo. Sponsored by Arup Foresight, the built environment innovation division of Arup Group Limited, a global design and business consulting firm with offices in over 30 countries, 3D Repo leverages the flexibility of MongoDB’s data model, not to mention its geospatial capabilities, to make collaboration on 3D assets easy.

Read more

November Driver Releases

Dec 10 • Posted 1 year ago

On November 27, all 10gen-supported drivers were updated with new error checking and reporting defaults. Each driver now has a MongoClient connection class to handle the error checking. On the same day, there was also a server release with fixes for 2.2.

MongoQP: MongoDB Slow Query Profiler

Dec 5 • Posted 1 year ago

Twice a year, 10gen’s Drivers and Innovations team gathers for a face-to-face meeting to work together and set goals for the upcoming six months. This year the team broke up into groups for an evening hackathon. MongoQP, a query profiler, was one of the hacks presented by Jeremy Mikola, PHP Evangelist at 10gen.

Logging slow queries is essential for any database application, and MongoDB makes doing so relatively painless with its database profiler. Unfortunately, making sense of the system.profile collection and tying its contents back to your application requires a bit more effort. The heart of mongoqp (Mongo Query Profiler) is a bit of map/reduce JS that aggregates those queries by their BSON skeleton (i.e. keys preserved, but values removed). With queries reduced to their bare structure, any of their statistics can be aggregated, such as average query time, index scans, counts, etc.
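To make the skeleton idea concrete, here is a minimal sketch in plain JavaScript. It is not mongoqp’s code: it uses an ordinary loop instead of map/reduce, the function names are hypothetical, the sample entries are made up, and it only handles plain JSON-style values.

```javascript
// Replace every value in a query with a placeholder, keeping only the keys.
function skeleton(value) {
  if (Array.isArray(value)) {
    return value.map(skeleton);
  }
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const key of Object.keys(value)) {
      out[key] = skeleton(value[key]);
    }
    return out;
  }
  return "?"; // scalar value removed
}

// Group profiler-style entries by query skeleton and average their millis.
function averageMillisBySkeleton(entries) {
  const groups = new Map();
  for (const { query, millis } of entries) {
    const key = JSON.stringify(skeleton(query));
    const group = groups.get(key) || { count: 0, totalMillis: 0 };
    group.count += 1;
    group.totalMillis += millis;
    groups.set(key, group);
  }
  const result = {};
  for (const [key, group] of groups) {
    result[key] = { count: group.count, avgMillis: group.totalMillis / group.count };
  }
  return result;
}

// Made-up entries shaped loosely like system.profile documents.
const entries = [
  { query: { user: "ann", age: { $gt: 30 } }, millis: 120 },
  { query: { user: "bob", age: { $gt: 41 } }, millis: 80 },
  { query: { status: "active" }, millis: 10 }
];
console.log(averageMillisBySkeleton(entries));
```

The first two queries differ only in their values, so they collapse into one skeleton and their timings are averaged together.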


Read more

Lessons Learnt Building mongoengine

Nov 29 • Posted 1 year ago

Recently, I attended both PyCon UK and PyCon Ireland to talk about the lessons I have learnt while maintaining mongoengine. The conferences were both excellent and surprisingly different. PyCon UK had quite an “unconference” feel, with some exciting sprint rooms. I wish I had had more time, as by all reports the educational jam was inspirational. PyCon Ireland, in contrast, felt more slick, with booths from DemonWare, Amazon and Facebook. If you can, I’d advise going to both conferences as they really complement each other.

Read more

Introducing MongoClient

Nov 27 • Posted 1 year ago

Today we are releasing updated versions of most of the officially supported MongoDB drivers with new error checking and reporting defaults. See below for more information on these changes, and check your driver docs for specifics.

Over the past several years, it’s become evident that MongoDB’s previous default behavior (where write messages did not wait for a return code from the server by default) wasn’t intuitive and has caused confusion for MongoDB users. We want to rectify that with minimal disruption to the MongoDB apps already in production.

Read more