MongoDB Best Practices for Sort Queries

Pavneet Kaur
5 min readFeb 23, 2021

We all have encountered the sort() method for sorting data over particular fields either in ascending or descending order.

Have you ever wondered how MongoDB performs sorting?
Umm, a brief idea will help us to digest why we should be cautious while querying data in some sorted order.

MongoDB does in-memory sorting. All documents stored on disk are read into RAM, then a sorting algorithm is performed. In-memory sorting is inefficient, time-consuming & sorting a large number of documents in memory is quite expensive. Moreover, MongoDB aborts the operation if memory usage is more than 32MB.

Let’s see the winning plan for a simple query with sort method:-

Important things to consider:-

  • totalDocsExamined to nReturned ratio, totalDocsExamined should be closer to nReturned documents for better performance.
  • totalKeysExamined, here zero indicates, no index is used for this query.
  • executionTimeMillis, 27ms, time-taken for the execution of the query.
  • Query Plan for the above query, COLLSCAN → SORT
    - COLLSCAN stage refers to the Collection scan.
    - SORT
    stage refers to in-memory sorting.
  • memUsage: memory used for sorting documents w.r.t fields in the query & should be lesser than memLimit, else operation will be aborted.
    memLimit: Max memory size mongo can use for sorting.

If collection size increases, sort query will result in scanning more documents, higher execution time, or may result in aborting the query if the memory used exceeds memory limit.

Sad!

How to deal with these problems?

MongoDB Indexes, the life saviour. Index fields are ordered with the value specified during index creation i.e. either in ascending(1) or descending(-1) order. If sorting is done on an indexed field, then there is no need to perform an extra sort operation. The server will fetch data in sorted order automatically. Wooh, we have a perfect solution to avoid in-memory sort! But sorting doesn’t work in the same manner for all index types.

How sorting is done for different index types?

Sorting a single key index field can be done both in the backward or forward direction.

/* Creating index on 'last_name' field with ascending order */db.getCollection('people').createIndex({last_name: 1})This index works for below queries:-/* sorting last_name in ascending order */db.getCollection('people').find({}).sort({last_name: 1})/* sorting last_name in descending order */db.getCollection('people').find({}).sort({last_name: -1})

Sorting the compound index fields works a bit differently. The order of field matters, also prefixes can be used independently in queries.

/* Creating compound index on fields 
a: ascending order,
b: descending order,
c: ascending order
*/
db.getCollection('people').createIndex({a: 1, b: -1, c: 1})This index works for below queries:-// forward direction sort with prefix
db.getCollection('people').find({}).sort({a: 1})
db.getCollection('people').find({}).sort({a: 1, b: -1})
db.getCollection('people').find({}).sort({a: 1, b: -1, c: 1})
// backward direction sort
db.getCollection('people').find({}).sort({a: -1})
db.getCollection('people').find({}).sort({a: -1, b: 1})
db.getCollection('people').find({}).sort({a: -1, b: 1, c: -1})
Doesn't work for below queries:-db.getCollection('people').find({}).sort({b: -1})
db.getCollection('people').find({}).sort({b: 1})
db.getCollection('people').find({}).sort({c: 1})
db.getCollection('people').find({}).sort({b: -1, a: 1})
db.getCollection('people').find({}).sort({b: -1, c: 1}) db.getCollection('people').find({}).sort({c: -1, b: 1, a: -1})

Look into the difference in stats of the same query, but with compound index.

After creating index on company_id_1_last_name_-1,

  • totalDocsExamined to nReturned ratio is reduced to 1, i.e. only docs scanned are returned.
  • totalKeysExamined, 63 indicates that index is used in query execution.
  • executionTimeMillis, reduced to nearly 0ms from 27ms.
  • Query Plan for the above query, IXSCAN → FETCH
    - IXSCAN determines the index used while querying.
    - FETCH converts records to actual documents.
  • direction determines the order of index scan, forward or backward.

Insane difference in performance, right!

So, in-memory sort can be avoided based on how indexes are created. Do consider the following points to achieve the best results:-

1. Index on with equality filter and sort fields works well, but with range queries, sort field indexes don’t work.

In case of creating an index on range and sort field, the query plan is:-

IXSCAN → SORT → FETCH

Here, executionStages include ‘SORT’ stage, which is completely off-track. But, this is how it works!

In such cases, it’s better to use the index only on the sort field. It avoids in-memory sort & reduces executionTimeMillis.

IXSCAN → FETCH

But, totalDocsExamined & totalKeysExamined number is quite higher.

So depending on the requirement, create indexes wisely!

2. A bit complex query can involve Equality, sort, range filters. Order of index matters here, so while building indexes keep this in mind. Indexes should be built as per ESR Rule.

What is ESR Rule?

Build index in order of equality(exact match), sort, range fields. Queries build on the ESR rule generally work best in terms of execution time and also avoids in-memory sort.
Basic difference observed while querying on different compound indexes:-

db.getCollection('people').find({
emp_id: { $gt: 20 },
company_id: '57d7a121fa937f710a7d486d'
}).sort({last_name:-1})

Though, building an index through the ESR rule results in examining more keys but it reduces execution time.

Now, we have a clear idea of which index is to be created and when. Create indexes, cross-check the winning plan of the query over a sufficient amount of data-set. Good Luck!

References:-

https://docs.mongodb.com/manual/indexes/
https://blog.mlab.com/2012/06/cardinal-ins/
https://www.mongodb.com/blog/post/performance-best-practices-indexing

--

--