MongoDB’s New Matcher

May 28 • Posted 1 year ago

MongoDB 2.5.0 (an unstable dev build) has a new implementation of the “Matcher”. The Matcher is the bit of code in Mongo that takes a query and decides whether a document matches the query expression. It also has to understand indexes so that it can do things like create subsets of queries suitable for index covering. However, the structure of the Matcher code hadn’t changed significantly in more than four years and, until this release, it could not be extended easily. It was also structured in such a way that its knowledge could not be reused for query optimization. It was clearly ready for a rewrite.

The “New Matcher” in 2.5.0 is a total rewrite. It contains three separate pieces: an abstract syntax tree (hereafter ‘AST’) for match expressions, a parser from BSON into said AST, and a Matcher API layer that simulates the old Matcher interface while using all new internals. This new version is much easier to extend, easier to reason about, and will allow us to use the same structure for matching as for query analysis and rewriting.
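
To make the three pieces concrete, here is a toy sketch in plain JavaScript. It is not the server's code: the node types, the `parse` function and the `Matcher` class are hypothetical names chosen for illustration, and only equality and `$gt` on top-level fields are handled.

```javascript
// Hypothetical mini-matcher; names are illustrative, not MongoDB internals.

// 1. AST node constructors for match expressions.
const eq = (path, value) => ({ type: "EQ", path, value });
const gt = (path, value) => ({ type: "GT", path, value });
const and = (children) => ({ type: "AND", children });

// 2. Parser: turn a query document into an AST.
//    Supports only {field: value} and {field: {$gt: value}}.
function parse(query) {
  const children = Object.entries(query).map(([path, cond]) => {
    if (cond !== null && typeof cond === "object" && "$gt" in cond) {
      return gt(path, cond.$gt);
    }
    return eq(path, cond);
  });
  return and(children);
}

// Evaluator: walk the AST and test one document against it.
function evaluate(node, doc) {
  switch (node.type) {
    case "AND":
      return node.children.every((child) => evaluate(child, doc));
    case "EQ":
      return doc[node.path] === node.value;
    case "GT":
      return doc[node.path] > node.value;
    default:
      throw new Error("unknown node type: " + node.type);
  }
}

// 3. Matcher facade: callers still just ask "does this document match?".
class Matcher {
  constructor(query) {
    this.root = parse(query);
  }
  matches(doc) {
    return evaluate(this.root, doc);
  }
}

const matcher = new Matcher({ status: "A", qty: { $gt: 5 } });
console.log(matcher.matches({ status: "A", qty: 10 })); // true
console.log(matcher.matches({ status: "A", qty: 3 }));  // false
```

Because the query lives in a tree rather than in the matching code itself, the same tree could in principle be inspected or rewritten by other code, which is the reuse the paragraph above describes.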

Read more

New Geo Features in MongoDB 2.4 

May 21 • Posted 1 year ago


Geometric processing as a field of study has many applications, and has resulted in lots of research and powerful tools. Many modern web applications have location-based components, and require data storage engines capable of managing geometric information. Typically this requires the introduction of an additional storage engine into your infrastructure, which can be a time-consuming and expensive operation.

MongoDB has a set of geometric storage and search features. The MongoDB 2.4 release brought several improvements to MongoDB’s existing geo capabilities and the introduction of the 2dsphere index.

The primary conceptual difference (though there are also many functional differences) between the 2d and 2dsphere indexes is the type of coordinate system that they consider. Planar coordinate systems are useful for certain applications, and can serve as a simplifying approximation of spherical coordinates. As you consider larger geometries, or geometries near the meridians and poles, however, the requirement to use proper spherical coordinates becomes important.
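
A small sketch in plain JavaScript shows why the coordinate system matters. It does not use MongoDB: the function names are made up for this example, the Earth is assumed to be a sphere of radius 6371 km, and points are written as [longitude, latitude] pairs.

```javascript
const EARTH_RADIUS_KM = 6371; // mean Earth radius, an approximation

const toRadians = (deg) => (deg * Math.PI) / 180;

// Planar: treat longitude/latitude as x/y, scaling degrees to km at the equator.
function planarDistanceKm([lon1, lat1], [lon2, lat2]) {
  const kmPerDegree = (2 * Math.PI * EARTH_RADIUS_KM) / 360;
  return Math.hypot(lon2 - lon1, lat2 - lat1) * kmPerDegree;
}

// Spherical: great-circle distance via the haversine formula.
function sphericalDistanceKm([lon1, lat1], [lon2, lat2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Two points 10 degrees of longitude apart: once on the equator, once at 80 degrees north.
for (const lat of [0, 80]) {
  const a = [0, lat];
  const b = [10, lat];
  console.log(
    `lat ${lat}: planar ${planarDistanceKm(a, b).toFixed(0)} km, ` +
      `spherical ${sphericalDistanceKm(a, b).toFixed(0)} km`
  );
}
```

On the equator the two answers agree closely. At 80 degrees north the planar figure is unchanged while the spherical distance is several times smaller, which is the kind of error that appears near the poles.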

In addition to this major conceptual difference, there are also significant functional differences, which are outlined in some depth in the Geospatial Indexes and Queries section of the MongoDB documentation. This post will discuss the new features that have been added in the 2.4 release.

Read more

MongoDB, build parties, and deploying your web application at 11am on a Wednesday

May 12 • Posted 1 year ago

This is a guest post by Sean Reilly. Release your applications with MongoDB more often and get closer to the ultimate goal of deploying applications anytime. Why not at 11am on a Wednesday morning?

What you will learn…

This article explores how to use MongoDB’s characteristics to avoid the downtime traditionally required by migration scripts in the SQL world, and so get closer to the goal of deploying applications with no downtime.

Read more

ODBC Connector for MongoDB

May 7 • Posted 1 year ago

This is a guest post by NYU Information Systems (MSIS) Graduate students Kyle Galloway, Pravish Sood and Dylan Kelemen.

We are pleased to announce the Mongo-ODBC project. As NYU MSIS students in Courant Institute’s Information Technology Projects course, we are working under the guidance of 10gen and our Professor Evan Korth to develop an ODBC (Open-Database-Connectivity) driver for MongoDB.

ODBC was created to facilitate the movement of data between applications with different file structures. Although it is not as popular as it once was, in part due to more flexible alternatives like MongoDB, many programs maintain ODBC compliance. The goal of our project is to create an ODBC driver that supports the ODBC functions that can be carried out on MongoDB. This will allow users of programs that don’t yet offer MongoDB support some access to data in MongoDB databases. We believe this will be particularly useful for new users and those dependent on programs like Excel and Tableau for simple business analysis reporting.

Read more

The MEAN Stack: MongoDB, ExpressJS, AngularJS and Node.js

Apr 30 • Posted 1 year ago

By Valeri Karpov, Kernel Tools engineer at MongoDB and co-founder of the Ascot Project

A few weeks ago, a friend of mine asked me for help with PostgreSQL. As someone who’s been blissfully SQL-free for a year, I was quite curious to find out why he wasn’t just using MongoDB instead. It turns out that he thinks MongoDB is too difficult to use for a quick weekend hack, and this couldn’t be further from the truth. I just finished my second 24-hour hackathon using Mongo and NodeJS (the FinTech Hackathon co-sponsored by 10gen) and can confidently say that there is no reason to use anything else for your next hackathon or REST API hack.

Read more

10 questions to ask (and answer) when hosting MongoDB on AWS

Apr 22 • Posted 1 year ago

This is a guest post from Dharshan Rangegowda, founder of Scalegrid, creators of MongoDirector. This originally appeared on the MongoDirector blog.

Are you hosting your production MongoDB instances on Amazon AWS? At MongoDirector.com we manage hundreds of production MongoDB instances on AWS and have learnt a few things along the way. Here is a set of 10 questions you need to ask yourself and answer as you continue to manage your deployment. Almost all of the information below is applicable to other cloud service providers as well.

Read more

New Hash-based Sharding Feature in MongoDB 2.4

Apr 10 • Posted 1 year ago

Lots of MongoDB users enjoy the flexibility of custom shard keys in organizing a sharded collection’s documents. For certain common workloads though, like key/value lookup, using the natural choice of _id as a shard key isn’t optimal because default ObjectIds are ascending, resulting in poor write distribution. Creating randomized _ids or choosing another well-distributed field is always possible, but this adds complexity to an app and is another place where something could go wrong.

To help keep these simple workloads simple, in 2.4 MongoDB added the new Hash-based shard key feature. The idea behind Hash-based shard keys is that MongoDB will do the work to randomize data distribution for you, based on whatever kind of document identifier you like. So long as the identifier has a high cardinality, the documents in your collection will be spread evenly across the shards of your cluster. For heavy workloads with lots of individual document writes or reads (e.g. key/value), this is usually the best choice. For workloads where getting ranges of documents is more important (e.g. find recent documents from all users), other choices of shard key may be better suited.
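
The effect can be sketched with a small simulation in plain JavaScript. Everything here is an assumption made for illustration: four shards, integer keys standing in for ascending ObjectIds, fixed-width key ranges, and the MurmurHash3 32-bit finalizer as a stand-in hash (it is not the hash function MongoDB uses). In the shell, a hashed key is requested with a call of the form `sh.shardCollection("mydb.mycoll", { _id: "hashed" })`, where the namespace is a placeholder.

```javascript
const SHARDS = 4;

// MurmurHash3 32-bit finalizer: a stand-in hash, for illustration only.
function fmix32(h) {
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0; // force an unsigned 32-bit result
}

// Range placement: shard i owns keys in [i * width, (i + 1) * width);
// the last shard also owns everything above the highest boundary.
function rangeShard(key, width) {
  return Math.min(Math.floor(key / width), SHARDS - 1);
}

// Hashed placement: split the 32-bit hash space into equal ranges.
function hashedShard(key) {
  return Math.floor(fmix32(key) / (2 ** 32 / SHARDS));
}

// Count how many writes each shard receives for a list of keys.
function countWrites(place, keys) {
  const counts = new Array(SHARDS).fill(0);
  for (const key of keys) counts[place(key)]++;
  return counts;
}

// 1000 new documents whose keys keep ascending past the existing data.
const newKeys = Array.from({ length: 1000 }, (_, i) => 4000 + i);

const rangeCounts = countWrites((key) => rangeShard(key, 1000), newKeys);
const hashedCounts = countWrites(hashedShard, newKeys);

console.log("range: ", rangeCounts); // every write lands on the last shard
console.log("hashed:", hashedCounts); // writes are spread across all shards
```

The trade-off from the paragraph above is visible in the same sketch: neighbouring keys end up on different shards under the hashed scheme, so a range query over keys has to visit all of them.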

Read more

Deployment Best Practices: Monitor your resources

Mar 28 • Posted 1 year ago

When you’re preparing a MongoDB deployment, you should try to understand how your application is going to hold up in production. It’s a good idea to develop a consistent, repeatable approach to managing your deployment environment so that you can minimize any surprises once you’re in production.

The best approach incorporates prototyping your setup, conducting load testing, monitoring key metrics, and using that information to scale your setup. The key part of the approach is to proactively monitor your entire system - this will help you understand how your production system will hold up before deploying, and determine where you’ll need to add capacity. Having insight into potential spikes in your memory usage, for example, could help put out a write-lock fire before it starts.

Read more

MongoDB 2.4 JavaScript Changes

Mar 27 • Posted 1 year ago

The upcoming release of MongoDB 2.4 brings an exciting change to the JavaScript engine. Previously, MongoDB ran SpiderMonkey 1.7, but going forward, MongoDB will be running V8, the open-source high-performance JavaScript engine from Google. This means that from now on, whenever JavaScript is executed, V8 will be running the show.

In this post we’ll examine the following primary impacts of this change:

  1. concurrency improvements
  2. modernized JavaScript implementation
  3. impacted features

Concurrency improvements

Previously, every query/command that used the JS interpreter had to acquire a mutex, thus serializing all JS work. Now, with V8 we have improved concurrency by allowing each JavaScript job to run on a separate core.

For example, if a user’s workload commonly involved 24 concurrent $where queries (each from a unique client), and they have a server with 24 cores, they should expect query execution times to be reduced by (roughly) a factor of 24.
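
The arithmetic behind that estimate can be written down as a deliberately simplified model. It assumes every job takes the same time and that nothing else competes for the CPU; the function name and the 100 ms job time are made up for this example.

```javascript
// Simplified model: identical jobs, no other contention on the machine.
function wallClockMs({ jobs, cores, jobMs, globalMutex }) {
  // With a global mutex only one job runs at a time, whatever the core count.
  const parallelism = globalMutex ? 1 : Math.min(jobs, cores);
  // Jobs complete in "waves" of `parallelism` jobs each.
  return Math.ceil(jobs / parallelism) * jobMs;
}

const workload = { jobs: 24, cores: 24, jobMs: 100 };

const before = wallClockMs({ ...workload, globalMutex: true });  // 24 waves of 1 job
const after = wallClockMs({ ...workload, globalMutex: false });  // 1 wave of 24 jobs

console.log(before, after, before / after); // 2400 100 24
```

Real workloads will fall short of the ideal factor, since queries differ in cost and share memory, disk and locks with everything else running on the server.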

Read more

MongoDB 2.4 Released

Mar 19 • Posted 1 year ago

The MongoDB Engineering Team is pleased to announce the release of MongoDB 2.4. This is the latest stable release, following the September 2012 release of MongoDB 2.2. This release contains key new features along with performance improvements and bug fixes. We have outlined some of the key features below. For additional details about the release:

Highlights of MongoDB 2.4 include:

  • Hash-based Sharding
  • Capped Arrays
  • Text Search (Beta)
  • Geospatial Enhancements
  • Faster Counts
  • Working Set Analyzer
  • V8 JavaScript engine
  • Security

Read more

MongoDB Tip: The touch Command

Mar 6 • Posted 1 year ago

MongoDB 2.2 introduced the touch command, which loads data from the data storage layer into memory. The touch command will load a collection’s documents, indexes or both into memory. This can be ideal to preheat a newly started server, in order to avoid page faults and slow performance once the server is brought into production. You can also use this when adding a new secondary to an existing replica set to ensure speedy subsequent reads.

Note that while the touch command is running, a replica set member will enter into a RECOVERING state to prevent reads from clients. When the operation completes, the secondary will return to the SECONDARY(2) state.

You invoke the touch command through the following syntax:

db.runCommand({ touch: "collection_name", data: true, index: true })

Read more

MongoDB and Hadoop: A Step-by-Step Tutorial Using the Mortar Development Framework

Feb 19 • Posted 1 year ago

The following is a guest post from Jeremy Karn. This article is excerpted from ‘MongoDB + Hadoop: A Step-by-Step Tutorial’. Jeremy is a cofounder at Mortar Data, a Hadoop-as-a-service provider, and creator of mortar, an open source framework for data processing.

People who are worried about scalability often find themselves looking at two tools: MongoDB for storing large amounts of data easily and Hadoop for processing that data. But a common question is: “How do I combine these two to really get the most out of my data?”

Read more

Analyzing Your MongoDB Data with Analytica

Jan 29 • Posted 1 year ago

This is a guest post by Nosh Petigara, president of Analytica

Analytica is an analytics platform that makes it easy to analyze and report on data like user profiles, event logs, product catalogs, user-generated content, financial assets, or anything else you may have stored in your MongoDB database.

Analytica is built from the ground up for rich document-type data and uses a JSON-like representation throughout its architecture. You use Analytica Script, a declarative expression language tailored for JSON data, to tell Analytica how to perform calculations, filter, group, and transform your documents into the results you want. You can interact with Analytica using a plug-in to Microsoft Excel or a command line shell. Analytica can also be used through its REST API. Browser-based and mobile interfaces are coming soon.

Read more

Checking Disk Performance with the mongoperf Utility

Jan 17 • Posted 1 year ago

Note: while this blog post uses some Linux commands in its examples, mongoperf runs and is useful on just about all operating systems.

mongoperf is a utility for checking the disk I/O performance of a server independent of MongoDB. It performs simple timed random disk I/O operations.

mongoperf has a couple of modes: mmf:false and mmf:true  

mmf:false mode is a completely generic random physical I/O test — there is effectively no MongoDB code involved.
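
mongoperf reads its settings as a JSON document on standard input. As a rough sketch, the script below prints such a document; the option names follow the mongoperf documentation, while the values and the file name `mongoperf-config.js` are arbitrary choices for this example.

```javascript
// Option names follow the mongoperf documentation; values are arbitrary examples.
const config = {
  nThreads: 16,     // number of concurrent I/O threads
  fileSizeMB: 1000, // size of the test file in megabytes
  r: true,          // issue reads
  w: false,         // do not issue writes
  mmf: false,       // false: plain physical I/O; true: use memory-mapped files
};

// Serialize once so the same string can be printed or piped.
const json = JSON.stringify(config);
console.log(json);
```

Assuming Node.js is installed, the output can be piped straight into the tool with `node mongoperf-config.js | mongoperf`. Flipping `mmf` to `true` selects the other mode.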

Read more

MongoDB Text Search: Experimental Feature in MongoDB 2.4

Jan 14 • Posted 1 year ago

Text search (SERVER-380) is one of the most requested features for MongoDB. 10gen is working on an experimental text-search feature, to be released in v2.4, and we’re already seeing some talk in the community about the native implementation within the server. We view this as an important step towards fulfilling a community need.

MongoDB text search is still in its infancy and we encourage you to try it out on your datasets. Many applications use both MongoDB and Solr/Lucene, but realize that there is still a feature gap. For some applications, the basic text search that we are introducing may be sufficient. As you get to know text search, you can determine when MongoDB has crossed the threshold for what you need.

Setting up Text Search

You can configure text search in the mongo shell:

db.adminCommand( { setParameter : 1, textSearchEnabled : true } )

Or enable it from the command line when starting mongod:

mongod --setParameter textSearchEnabled=true


Read more