Mongeez Migration Gotchas

Mongeez is a MongoDB migration management library that lets you manage changes to the documents in a MongoDB database.

Grails supports MongoDB out of the box, and it works nicely with GORM. For the Grails + Hibernate combination there is the DB Migration plugin; for Grails + MongoDB there is the Mongeez plugin, which uses the Mongeez library to run migrations.

While writing migrations with the Grails Mongeez plugin, you may run into a few problems, which are listed below:

Using the variable “db”

Consider the following migration:

// changeset John:disable-all-non-admin-accounts
// Disable every account first; the admin accounts are re-enabled below
db.user.update({}, {$set: {enabled: false}}, {multi: true});

var usernames = ["admin1", "admin2", "admin3", "admin4"];
for (var i = 0; i < usernames.length; i++) {
   var username = usernames[i];
   db.user.update({username: username}, {$set: {enabled: true}});
}

When you run the above migration directly in the MongoDB shell, it works properly. When you run it through the Grails Mongeez plugin, however, it fails with an error saying that the variable “db” is not defined inside the loop. To fix this, modify the code as follows:

// changeset John:disable-all-non-admin-accounts
var dbRef = db;
// Disable every account first; the admin accounts are re-enabled below
db.user.update({}, {$set: {enabled: false}}, {multi: true});

var usernames = ["admin1", "admin2", "admin3", "admin4"];
for (var i = 0; i < usernames.length; i++) {
   var username = usernames[i];
   dbRef.user.update({username: username}, {$set: {enabled: true}});
}

Here, we store a reference to the “db” variable in dbRef before the loop and use dbRef inside the loop, so that the migration also works when it is run through the Grails Mongeez plugin.
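Another option, assuming the problem only affects references to “db” inside the loop body, is to avoid the loop altogether by matching all admin usernames with a single $in filter. This is a minimal sketch that has not been verified against the Mongeez plugin; the names adminFilter and enableUpdate are illustrative, and the typeof guard is only there so that the snippet can be evaluated outside the MongoDB shell without throwing.

```javascript
var usernames = ["admin1", "admin2", "admin3", "admin4"];

// One filter that matches every admin account, so no loop is needed
var adminFilter = {username: {$in: usernames}};
var enableUpdate = {$set: {enabled: true}};

// "db" only exists inside the MongoDB shell or a changeset;
// skip the call when the snippet is evaluated anywhere else
if (typeof db !== "undefined") {
    db.user.update(adminFilter, enableUpdate, {multi: true});
}
```

Because the update is issued once at the top level of the script, it never references “db” from inside a loop.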

Using “snapshot()” method with MongoDB cursor

Consider a migration that iterates over the 50,000 records of a collection with a MongoDB cursor and modifies the same collection at the same time. You may end up with junk data after the migration completes.

For example, suppose the user collection has 50,000 records and you want to change the status value from ACTIVE to ENABLED and from INACTIVE to DISABLED.

// changeset John:change-status-number
var statusMapping = {
   // Old Status: New Status
   "ACTIVE": "ENABLED",
   "INACTIVE": "DISABLED"
};

db.user.find().forEach(function(userData) {
   db.user.update({
       _id: userData._id
   }, {
       $set: {
           // Updating old status to new status
           status: statusMapping[userData.status]
       }
   });
});

Now, when you run this migration, some records in the user collection may end up with undefined in the status field. Why?

When you don’t specify a sort order, the MongoDB cursor returns documents in natural order, and a document that is modified during the iteration may be returned more than once by the forEach loop. Suppose the status of the record with email john@example.com is updated from ACTIVE to ENABLED. Because there is no sort order, the same record may be returned again. This time its status is ENABLED, statusMapping has no key for ENABLED, and the record is updated with undefined.
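The double lookup can be reproduced in plain JavaScript, without a database. The sketch below only simulates the two visits to the same record; the variable names afterFirstVisit and afterSecondVisit are illustrative.

```javascript
var statusMapping = {
    "ACTIVE": "ENABLED",
    "INACTIVE": "DISABLED"
};

// First visit: the record still has its old status
var afterFirstVisit = statusMapping["ACTIVE"];

// Second visit: the cursor returns the already-updated record,
// and the mapping has no "ENABLED" key
var afterSecondVisit = statusMapping[afterFirstVisit];

console.log(afterFirstVisit);   // ENABLED
console.log(afterSecondVisit);  // undefined
```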

This is the more serious of the two problems, because it can leave your database with junk data in an unrecoverable state. You can use the sort() method on the cursor to overcome it, but that is not always beneficial. Instead, use the snapshot() method on the cursor, which ensures that the query will not return a document multiple times, even if intervening write operations move the document because it has grown in size. Note that snapshot() was deprecated in MongoDB 3.6 and removed in 4.0, so it is only available on older server versions.

Modifying the query as follows avoids the problem:

// changeset John:change-status-number
var statusMapping = {
   // Old Status: New Status
   "ACTIVE": "ENABLED",
   "INACTIVE": "DISABLED"
};

db.user.find().snapshot().forEach(function(userData) {
   db.user.update({
       _id: userData._id
   }, {
       $set: {
           // Updating old status to new status
           status: statusMapping[userData.status]
       }
   });
});
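As an extra safety net, the lookup itself can be made safe to repeat, so that a record that is visited twice keeps its new status instead of being overwritten with undefined. This is a minimal sketch; migrateStatus is a hypothetical helper name, not part of Mongeez or MongoDB.

```javascript
var statusMapping = {
    "ACTIVE": "ENABLED",
    "INACTIVE": "DISABLED"
};

// Hypothetical helper: returns the new status, or the current value
// unchanged when the mapping has no entry for it
function migrateStatus(currentStatus) {
    if (Object.prototype.hasOwnProperty.call(statusMapping, currentStatus)) {
        return statusMapping[currentStatus];
    }
    return currentStatus;
}

// Inside the forEach callback of the changeset, the $set would become:
//     status: migrateStatus(userData.status)
```

With this helper, migrating an already-migrated record is a no-op, which also makes the changeset safe to run again.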

