
Field-level encryption in MongoDB community server, using Node JS and Mongoose


Field Level Encryption (FLE)

Simply put, it’s a kind of encryption where we encrypt specific columns or fields in the database, instead of encrypting the whole table or document.

Unlike encryption at rest, FLE does not encrypt the whole database.

Encryption at rest protects the files on disk, but anyone with sufficient database access still sees the data in plaintext. These people could be

  • DBA
  • A third-party provider which hosts the MongoDB cluster
  • A third-party data analytics firm that has access to data that includes private, personal, or confidential information

FLE mitigates this risk because the sensitive fields are stored encrypted in the DB, so database access alone is not enough to read them.

Automatic FLE in MongoDB

Automatic FLE is available only in enterprise servers with version 4.2 or later.

How does this work?

MongoDB Enterprise provides a service called `mongocryptd`, which the driver consults on the application side before a query is sent to the DB.

This service is used to automate the encryption and decryption process.

mongocryptd parses the JSON schema defined for the collection and marks the fields that need to be encrypted; the driver then fetches the encryption keys through the provided KMS and encrypts those fields.

This saves the overhead of handling encryption at the application level.
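
To make the JSON schema mentioned above more concrete, here is a rough sketch of what automatic encryption rules for a users collection might look like, written as a JavaScript object. The collection layout and the key ID are assumptions for illustration only; `'<data-key-id>'` is a placeholder for the UUID of a data encryption key that would have to be created beforehand.

```javascript
// Sketch of automatic-encryption rules for a hypothetical "users" collection.
// ASSUMPTION: '<data-key-id>' is a placeholder, not a working value. In a real
// setup it is the UUID of a data encryption key stored in the key vault.
const usersEncryptionSchema = {
  bsonType: 'object',
  encryptMetadata: { keyId: ['<data-key-id>'] },
  properties: {
    // Deterministic: equal plaintexts give equal ciphertexts, so equality
    // queries on the field keep working.
    email: {
      encrypt: {
        bsonType: 'string',
        algorithm: 'AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic',
      },
    },
    // Random: stronger, but the field can no longer be queried by value.
    phone: {
      encrypt: {
        bsonType: 'string',
        algorithm: 'AEAD_AES_256_CBC_HMAC_SHA_512-Random',
      },
    },
  },
};

console.log(Object.keys(usersEncryptionSchema.properties));
```

The choice between the deterministic and the random algorithm is the same trade-off we face later when implementing encryption by hand: a field can only be searched by value if its encryption is deterministic.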

Image taken from MongoDB Documentation

How to handle FLE in community server with Mongoose

When using FLE in the community server, we need to handle encryption and decryption at the application level.
For this, we need to define a standard and secure encryption and decryption algorithm.
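
The `./cipher` module imported by the schemas in this post is not shown, so the following is only a minimal sketch of what it might look like, using Node's built-in crypto module. Everything in it is an assumption: AES-256-CBC as the algorithm, keys derived from a hard-coded demo passphrase, and an IV derived from the plaintext with an HMAC so that the same plaintext always produces the same stored value (which is what makes equality searches on an encrypted field possible). The output format is `iv:ciphertext` in hex.

```javascript
const crypto = require('crypto');

// ASSUMPTION: demo keys derived from a hard-coded passphrase. In a real
// application the keys would come from a KMS or a secret store.
const ENC_KEY = crypto.scryptSync('demo-passphrase', 'enc-salt', 32);
const IV_KEY = crypto.scryptSync('demo-passphrase', 'iv-salt', 32);

// Encrypts a string and returns "ivHex:ciphertextHex".
// The IV is an HMAC of the plaintext, so the result is deterministic.
function encrypt(plaintext) {
  if (plaintext === null || plaintext === undefined) return plaintext;
  const text = String(plaintext);
  const iv = crypto
    .createHmac('sha256', IV_KEY)
    .update(text)
    .digest()
    .subarray(0, 16);
  const cipher = crypto.createCipheriv('aes-256-cbc', ENC_KEY, iv);
  const encrypted = Buffer.concat([cipher.update(text, 'utf8'), cipher.final()]);
  return `${iv.toString('hex')}:${encrypted.toString('hex')}`;
}

// Reverses encrypt(). Values that do not look encrypted are returned as-is.
function decrypt(stored) {
  if (typeof stored !== 'string' || !stored.includes(':')) return stored;
  const [ivHex, dataHex] = stored.split(':');
  const decipher = crypto.createDecipheriv(
    'aes-256-cbc',
    ENC_KEY,
    Buffer.from(ivHex, 'hex')
  );
  const decrypted = Buffer.concat([
    decipher.update(Buffer.from(dataHex, 'hex')),
    decipher.final(),
  ]);
  return decrypted.toString('utf8');
}

module.exports = { encrypt, decrypt };
```

Two trade-offs of this sketch are worth keeping in mind. Deterministic encryption reveals which documents share the same value, and CBC on its own does not detect tampering with the stored ciphertext. A random IV avoids the first problem, but then the field can no longer be matched by value.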

Defining getters and setters on fields

Mongoose getters and setters allow you to execute custom logic when getting or setting a property on a Mongoose document. Getters let you transform data in MongoDB into a more user-friendly form, and setters let you transform user data before it gets to MongoDB.

const mongoose = require('mongoose');
const Schema = mongoose.Schema;
const { encrypt, decrypt } = require('./cipher');

const userSchema = new Schema(
    {
        name: String,
        phone: { type: String, set: encrypt, get: decrypt },
        email: { type: String, set: encrypt, get: decrypt },
    },

    {
        versionKey: false,
    }
);

const User = mongoose.model('users', userSchema, 'users');
module.exports = User;

We need to add some options to the schema, which tell Mongoose to apply the getters and setters when documents are serialized and when queries are run.

const mongoose = require('mongoose');
const Schema = mongoose.Schema;
const { encrypt, decrypt } = require('./cipher');

const userSchema = new Schema(
    {
        name: String,
        phone: { type: String, set: encrypt, get: decrypt },
        email: { type: String, set: encrypt, get: decrypt },
    },

    {
        versionKey: false,

        // Following options will enable us to use getters and setters on almost all queries
        toObject: { getters: true, setters: true },
        toJSON: { getters: true, setters: true },
        runSettersOnQuery: true,
    }
);

const User = mongoose.model('users', userSchema, 'users');
module.exports = User;

Writing a document

const user = new User({
    name: 'Test User',
    email: 'sample@example.com',
    phone: '9999999999',
});
// save() returns a promise; await it (or chain .then()) before reading the document back
user.save();

Encrypted values of data are stored in DB

Fetching the Document

User.findOne({ email: "sample@example.com" });
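Notice that we did not have to search for the email by its encrypted value, because the runSettersOnQuery option passed in the schema makes Mongoose run the setter on the query filter. (In Mongoose 5 and later, setters run on queries by default and this option no longer exists.) This only works if encrypt is deterministic, that is, if the same plaintext always produces the same stored value.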

Problem with this approach

  • As you might have guessed, for data to be encrypted or decrypted, it needs to go through the getters and setters of mongoose model.
  • This does not happen in 2 cases
    • find queries with lean
    • aggregation queries
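  • In both cases, the JSON data is returned directly from MongoDB without being converted into Mongoose documents, and hence the getter function is not executed. With aggregate(), the pipeline is also sent to MongoDB as-is, so the setter is not executed on the values in it either.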

lean()

User.findOne({ email: "sample@example.com" }).lean();
Data was not decrypted when lean was used

aggregate()

User.aggregate([
    {
        $match: {
            email: 'sample@example.com',
        },
    },
    {
        $project: {
            name: 1,
            phone: 1,
        },
    },
]);

The result of the above query is empty, because $match compares the plaintext email against the encrypted values stored in the DB.

Solution 

lean()

We need to make sure that the getter function defined in the schema is called every time we use lean.

The npm package mongoose-lean-getters can be used to achieve this.

An option needs to be passed to lean to invoke the package.

The plugin is used as follows:

const mongoose = require('mongoose');
const Schema = mongoose.Schema;
const { encrypt, decrypt } = require('./cipher');

// Adding the package
const mongooseLeanGetter = require('mongoose-lean-getters');


const userSchema = new Schema(
    {
        name: String,
        phone: { type: String, set: encrypt, get: decrypt },
        email: { type: String, set: encrypt, get: decrypt },
    },

    {
        versionKey: false,
        toObject: { getters: true, setters: true },
        toJSON: { getters: true, setters: true },
        runSettersOnQuery: true,
    }
);

// Using the package
userSchema.plugin(mongooseLeanGetter);

const User = mongoose.model('users', userSchema, 'users');
module.exports = User;

Query to the collection will look like the following

User.find({ email: "sample@example.com" }).lean({ 
    getters: true 
});
After using mongoose-lean-getters, data is decrypted

aggregate()

We need to encrypt and decrypt manually at 2 points of an aggregation query

  • Entry point
  • After getting the result

Entry Point

If the aggregation pipeline filters on an encrypted field, we need to encrypt the value ourselves before matching. Here, we encrypt the email and then do the search, as follows:

User.aggregate([
    {
        $match: {
            email: 'f373a715d2b545f3f78422f64293539:0431e1ceb0c9373a9e30313c55306c6e2cdf32f5bc1b00b686c468f50fdd2a81',
        },
    },
    {
        $project: {
            name: 1,
            phone: 1,
        },
    },
]);
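
Instead of pasting an encrypted literal into the pipeline, the $match filter can be built with the same encrypt function that the schema uses. The helper below is a hypothetical sketch, not part of Mongoose: `encryptMatch` and its parameters are names made up for illustration, and `fakeEncrypt` is a stand-in so the example runs on its own. It only handles plain equality on top-level string values; operators such as $in would need extra handling.

```javascript
// Hypothetical helper: returns a copy of `filter` in which the values of the
// listed fields have been passed through `encrypt`.
function encryptMatch(filter, encryptedFields, encrypt) {
  const out = {};
  for (const [field, value] of Object.entries(filter)) {
    const shouldEncrypt =
      encryptedFields.includes(field) && typeof value === 'string';
    out[field] = shouldEncrypt ? encrypt(value) : value;
  }
  return out;
}

// ASSUMPTION: stand-in for the real encrypt() exported by ./cipher.
const fakeEncrypt = (value) => `enc(${value})`;

const pipeline = [
  {
    $match: encryptMatch(
      { email: 'sample@example.com', name: 'Test User' },
      ['email', 'phone'],
      fakeEncrypt
    ),
  },
  { $project: { name: 1, phone: 1 } },
];

console.log(JSON.stringify(pipeline[0]));
```

In the application, the resulting pipeline would be passed to User.aggregate() with the real encrypt function in place of the stand-in.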

After getting the result

We need to manually decrypt the data returned in the above query

phone was not decrypted
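
One way to do this decryption is a small mapping step over the returned documents. The following is a sketch under assumptions: `decryptResults` is a hypothetical helper, `fakeDecrypt` stands in for the real decrypt function so the example runs on its own, and the result array is made up. The caller passes the field names as they appear in the output, which also covers fields that were renamed in $project.

```javascript
// Hypothetical helper: returns copies of the result documents with the listed
// top-level fields passed through `decrypt`.
function decryptResults(docs, fieldNames, decrypt) {
  return docs.map((doc) => {
    const copy = { ...doc };
    for (const name of fieldNames) {
      if (typeof copy[name] === 'string') {
        copy[name] = decrypt(copy[name]);
      }
    }
    return copy;
  });
}

// ASSUMPTION: stand-in for the real decrypt() exported by ./cipher.
const fakeDecrypt = (value) => value.replace(/^enc\((.*)\)$/, '$1');

// Made-up aggregation output in which phone was projected as "contact".
const rawResults = [{ name: 'Test User', contact: 'enc(9999999999)' }];

const results = decryptResults(rawResults, ['contact'], fakeDecrypt);
console.log(results);
```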

Caveats

  • Aggregation queries need to be handled separately at each instance.
  • The reasons are:
    • When we use aggregation, we might project the fields with some other name than that defined in the schema
    • If there is a deeply nested array in the result, we need to recursively traverse the array and check the fields that need to be decrypted. This traversal causes a significant performance hit
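
The recursive traversal described in the last point could look roughly like the sketch below. `decryptDeep` is a hypothetical helper and `fakeDecrypt` is a stand-in for the real decrypt function; the sample result is made up. Only arrays and plain objects are walked, so values such as dates are returned untouched. Since every key of every nested object is visited, the cost grows with the size of the result, which is where the performance hit comes from.

```javascript
// Hypothetical helper: walks arrays and plain objects recursively and
// decrypts string values stored under one of the given field names.
function decryptDeep(value, encryptedFields, decrypt) {
  if (Array.isArray(value)) {
    return value.map((item) => decryptDeep(item, encryptedFields, decrypt));
  }
  const isPlainObject =
    value !== null &&
    typeof value === 'object' &&
    Object.getPrototypeOf(value) === Object.prototype;
  if (!isPlainObject) return value; // primitives, dates, etc. stay as they are

  const out = {};
  for (const [key, child] of Object.entries(value)) {
    out[key] =
      encryptedFields.has(key) && typeof child === 'string'
        ? decrypt(child)
        : decryptDeep(child, encryptedFields, decrypt);
  }
  return out;
}

// ASSUMPTION: stand-in for the real decrypt() exported by ./cipher.
const fakeDecrypt = (value) => value.replace(/^enc\((.*)\)$/, '$1');

// Made-up aggregation output with an encrypted field inside a nested array.
const nestedResult = [
  {
    team: 'Team A',
    members: [
      { name: 'Test User', phone: 'enc(9999999999)', joined: new Date(0) },
    ],
  },
];

const plain = decryptDeep(nestedResult, new Set(['phone', 'email']), fakeDecrypt);
console.log(plain[0].members[0].phone);
```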

Benchmarking

  • Process
    • Query –
      User.findOne({ email: "sample@example.com" })
      .lean()
    • Used npm package autocannon
    • Created API to perform the query
    • Hit the API from 5 nodes, 100 requests per node
  • Git repo for this can be found here
Without Encryption
With Encryption and using mongoose-lean-getters plugin
  • Performance hit with encryption
    • 99th percentile = ~7%
    • Avg = ~12%

Conclusion

  • Automatic FLE in MongoDB is only available in Enterprise Server with version 4.2 or higher
  • In the community server of MongoDB, FLE needs to be implemented at the application level
    When using Node JS and the Mongoose ODM, this can be achieved by using
    • Mongoose getters and setters
    • mongoose-lean-getters
  • Aggregation queries need to be handled separately at every instance
  • There is an expected performance hit when we introduce FLE in the application
