Updated March 8, 2023
Introduction to MongoDB aggregation pipeline
The MongoDB aggregation pipeline works by chaining stage operators such as $match, $group, $sort, and $geoNear. An efficient way to group documents by one or more fields is the $group operator, which can also apply various aggregation functions to the grouped data.
In this article, we discuss the aggregation pipeline and study the $group operator in detail, with an example of grouping by multiple fields inside a document. MongoDB offers three approaches to aggregation: single-purpose aggregation operations, the aggregation pipeline, and the map-reduce programming model. We also look at the operators used along with aggregation and how to apply them with the help of examples.
Syntax of $group operator:
The input and output of the $group operator are MongoDB documents. It reads the documents passed to it and returns one document for each distinct group. To refer to the value of a field, prefix the field name with a dollar sign ($), as in $name. The following operators can be used when grouping by multiple fields:
- $first – Returns the value from the first document in each group. It is mostly used after sorting, since the result is meaningful only when the documents are in a defined order.
- $push – Appends the value of a field from each document in the group to an array in the result.
- $last – Returns the value from the last document in each group.
- $addToSet – Adds a value to the array in the resulting document only if it is not already present, so the array contains no duplicates.
- $min – Returns the smallest of the values supplied for the group.
- $avg – Calculates the average of the numeric values of the specified field across the group.
- $max – Returns the largest of the values supplied for the group.
- $sum – Calculates the total of the numeric values of the specified field across the group.
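As a minimal sketch of how these operators fit together, the stage below groups by two fields at once. The collection name orders and the fields store_id, city, amount, and customer are hypothetical, chosen only for illustration. The db call is guarded so that the snippet also loads outside the mongo shell.

```javascript
// Hypothetical collection and field names, chosen only for illustration.
const groupByStoreAndCity = {
  $group: {
    // A compound _id groups on several fields at once.
    _id: { store: "$store_id", city: "$city" },
    totalAmount: { $sum: "$amount" },
    averageAmount: { $avg: "$amount" },
    smallestAmount: { $min: "$amount" },
    largestAmount: { $max: "$amount" },
    firstCustomer: { $first: "$customer" },
    lastCustomer: { $last: "$customer" },
    allCustomers: { $push: "$customer" },
    uniqueCustomers: { $addToSet: "$customer" }
  }
};

// Sorting before grouping gives $first and $last a defined order to work with.
const multiFieldPipeline = [{ $sort: { amount: 1 } }, groupByStoreAndCity];

// Runs only where a db object exists, that is, inside the mongo shell.
if (typeof db !== "undefined") {
  db.orders.aggregate(multiFieldPipeline).forEach(printjson);
}
```

Each output document carries the store and city pair in _id, followed by one field per accumulator.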
Aggregation Pipeline
The other method of aggregating data in MongoDB is the pipeline, a concept similar to pipes in a Unix shell. Documents flow through the pipeline in a fixed order. The aggregation pipeline consists of stages: when a stage has processed the documents, it passes its result to the next stage, and so on.
With an aggregation pipeline, we can filter out the documents that satisfy given criteria, for example with the $match stage. A pipeline can also change the form of the output documents.
Each stage of the aggregation pipeline is defined by a stage operator. A stage operator can use expression operators internally, for example to calculate an average or a sum, or to concatenate values. The result of the last stage is the final output of the pipeline, which can also be stored in a collection if needed.
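The flow of documents from one stage to the next can be sketched as follows. The collection orders and the fields status, store_name, and amount are assumptions made for this illustration, and store_name is assumed to hold a string so that $concat can use it.

```javascript
// Hypothetical collection (orders) and fields (status, store_name, amount).
const stagedPipeline = [
  // Stage 1: keep only the documents that satisfy the criteria.
  { $match: { status: "paid" } },
  // Stage 2: group the remaining documents and compute a sum and an average.
  {
    $group: {
      _id: "$store_name",
      total: { $sum: "$amount" },
      average: { $avg: "$amount" }
    }
  },
  // Stage 3: reshape the output; $concat builds a label from the group key.
  {
    $project: {
      _id: 0,
      label: { $concat: ["Store: ", "$_id"] },
      total: 1,
      average: 1
    }
  },
  // Stage 4: order the final output by total, largest first.
  { $sort: { total: -1 } }
];

// Runs only where a db object exists, that is, inside the mongo shell.
if (typeof db !== "undefined") {
  db.orders.aggregate(stagedPipeline).forEach(printjson);
}
```

Appending an $out stage with a collection name as the last element of the array would store the results in that collection instead of returning them.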
Processing Flow
When processing data with the db.collection.aggregate() function, we need to consider the following points:
- db.collection.aggregate() can run multiple pipeline stages in a single call.
- We do not need to write custom JavaScript routines to get behavior similar to SQL GROUP BY, because db.collection.aggregate() performs the aggregation internally and efficiently and supports many operations inside it.
- Each stage in a pipeline is limited to 100 MB of memory. An error occurs if a stage exceeds this limit. To process large amounts of data, set the allowDiskUse option to true so that stages which exceed the 100 MB limit write their data to temporary files.
- db.collection.aggregate() can run against sharded collections, although the $out stage cannot write its results to a sharded collection. Map-reduce, by contrast, could both read from and write to sharded collections.
- db.collection.aggregate() returns its results in the form of a cursor, which can be used directly in the mongo shell.
- Each document in the result is subject to the BSON document size limit of 16 MB. Because the results are returned through a cursor, the result set as a whole can be larger than that.
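The memory limit and the cursor behavior described above can be sketched like this. The collection orders and the fields store_id and amount are hypothetical. The option name allowDiskUse is the one accepted by db.collection.aggregate().

```javascript
// Hypothetical collection (orders) and fields (store_id, amount).
const largePipeline = [
  { $group: { _id: "$store_id", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } }
];

// allowDiskUse lets stages that exceed the memory limit write to temporary files.
const aggregateOptions = { allowDiskUse: true };

// Runs only where a db object exists, that is, inside the mongo shell.
if (typeof db !== "undefined") {
  const cursor = db.orders.aggregate(largePipeline, aggregateOptions);
  // The cursor hands back the results one document at a time.
  while (cursor.hasNext()) {
    printjson(cursor.next());
  }
}
```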
Grouping Method
The db.collection.group() method (deprecated in MongoDB 3.4 and removed in 4.2) is quite similar to the SQL GROUP BY clause. It has three main parameters:
Key – The field or fields to group by.
Initial – The initial value of the result document that represents each group.
Reduce – The aggregation function that runs for every document. It accepts two parameters: the current document and the aggregate result document for the group.
There are also other optional parameters, such as cond, keyf, and finalize.
Example – Consider a collection of customer details in which each document records the store it belongs to in the store_id field.
We have to calculate the total bill amount for each of the stores.
Consider the following statement:
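db.users.group({
  key: { store_id: true },
  initial: { totalBillAmount: 0 },
  // billAmount is an assumed field name; use the field that holds the bill amount
  reduce: function (currentValue, result) { result.totalBillAmount += currentValue.billAmount; }
})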
The output contains one document for each store_id, together with its totalBillAmount.
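On MongoDB versions where group() is not available (it was removed in 4.2), the same per-store total can be sketched with a $group stage. The field name billAmount is an assumption, since the article does not show the customer documents; replace it with the field that actually holds the bill amount.

```javascript
// billAmount is an assumed field name; the source documents are not shown.
const totalPerStore = [
  {
    $group: {
      _id: "$store_id",
      totalBillAmount: { $sum: "$billAmount" }
    }
  }
];

// Runs only where a db object exists, that is, inside the mongo shell.
if (typeof db !== "undefined") {
  db.users.aggregate(totalPerStore).forEach(printjson);
}
```

Here $sum plays the role of the reduce function, and the _id expression plays the role of the key parameter.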
Conclusion – MongoDB aggregation pipeline
The aggregation pipeline in MongoDB includes the $group operator and many others that are useful for producing aggregated results.