Learn MongoDB 4.x

Using the Mongo shell for common database operations

Whether you are a database administrator or a developer, the most important shell methods to master are those that create a database, add documents to collections, and perform queries. We will start by creating a new learn database.

Creating a new database and collection

Oddly, you will not find a shell method for creating a new database listed among the database commands! Instead, a new database is created implicitly when you first insert a document into one of its collections. Using just two commands, you will create the database, create the collection within the new database, and add a document!

Inserting documents

To insert a document, you can use either the db.collection.insertOne() method or, for an array of documents, the db.collection.insertMany() method. The syntax is quite simple: you specify the name of the collection into which you plan to insert a document, followed by the document to be inserted, using JavaScript Object Notation (JSON) syntax, as illustrated in the following diagram:

In this example, you will create a new learn database and a collection called chapters, as follows:

use learn;
db.chapters.insertOne({
"chapterNumber" : 3,
"chapterName" : "Essential MongoDB Administration Techniques"
});

Here is the output:

If the featureCompatibilityVersion parameter is set to 4.2 or less (for example, you're running MongoDB 4.2 or below), the maximum size of the combined database name plus the collection name cannot exceed 120 bytes. If, however, it is set to 4.4 or greater (for example, you're running MongoDB 4.4 or above), the maximum size of the combined database name plus the collection name can go up to 255 bytes.

Querying a collection

The main command used to query a collection from the mongo shell is db.<collection_name>.find(). If you are only interested in a single result, you can alternatively use db.<collection_name>.findOne(). The find() and findOne() commands have two basic parameters: the query (also referred to as filter) and the projection, as illustrated in the following diagram:

The result of a query is a cursor. A cursor is an iterator over the results: the mongo shell displays the first batch of documents (20 by default), and you type the it helper command to see the next batch.
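The batching behavior can be pictured with a small plain-JavaScript model. This is only a sketch that runs without a database: makeCursor and nextBatch are hypothetical names, not part of the shell, and the batch size of 20 is assumed to mirror the shell's default display size.

```javascript
// Hypothetical stand-in for a shell cursor: hands out results in batches.
function makeCursor(results, batchSize = 20) {
  let position = 0;
  return {
    hasNext: () => position < results.length,
    // Return the next batch, much as typing "it" shows the next set of documents.
    nextBatch() {
      const batch = results.slice(position, position + batchSize);
      position += batch.length;
      return batch;
    },
  };
}

// 45 made-up documents stand in for a query result.
const results = Array.from({ length: 45 }, (_, i) => ({ n: i + 1 }));
const cursor = makeCursor(results);

const batchSizes = [];
while (cursor.hasNext()) {
  batchSizes.push(cursor.nextBatch().length);
}
console.log(batchSizes); // three batches: 20, 20, and 5 documents
```

In this model, each call to nextBatch() plays the role of one it command.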

As mentioned, the db.collection.find() shell method produces a cursor over which you can iterate to retrieve all results. A number of cursor methods can be applied to the cursor, each of which modifies the final results. The examples in this section demonstrate a few of the most important of these methods. For more information, refer to https://docs.mongodb.com/manual/reference/method/js-cursor/#cursor-methods.

If you are using a programming language driver, you can use a loop to iterate through a cursor result.

Simple queries

As you can see from the preceding diagram, the query (also referred to as query criteria or filter) can be as simple as using JSON syntax to identify a document field, and the value it equals. As an example, we will use the customers collection in the sweetscomplete database. When you specify a query filter, you can provide a hardcoded value, as shown here:

db.customers.findOne({"phoneNumber":"+44-118-652-0519"});

However, you can also simply wrap the target value into a regular expression by surrounding the target string with delimiters ("/"). Here is a simple example, whereby we use the db.collection.findOne() command to return a single customer record where the target phone number is located using a /44-118-652-0519/ regular expression:
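The difference between the hardcoded value and the regular expression can be checked in plain JavaScript, without a database. The sketch below only illustrates the matching rules; in the shell, the corresponding filter would be written as {"phoneNumber": /44-118-652-0519/}. The variable names are illustrative assumptions.

```javascript
const storedPhone = "+44-118-652-0519"; // value taken from the query shown earlier

// Hardcoded value: the whole string must be equal, so omitting "+" fails.
const exactMatch = storedPhone === "44-118-652-0519";

// Regular expression: succeeds if the pattern occurs anywhere in the value.
const regexMatch = /44-118-652-0519/.test(storedPhone);

// "+" is a regex metacharacter, so it must be escaped to match a literal plus sign.
const withPlus = /\+44-118-652-0519/.test(storedPhone);

console.log(exactMatch, regexMatch, withPlus); // false true true
```

This is why the regular expression in the text leaves out the leading plus sign: unescaped, it would be read as a quantifier rather than as a character.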

Let's say that the management has asked you to produce a count of customers. In this case, you can use a cursor method to operate on the results. To form this query, simply execute db.collection.find().count() without any parameters, as shown here:

IMPORTANT: The db.collection.find() method returns a cursor, whereas db.collection.findOne() returns a single document (or null if nothing matches). For more information, consult the Iterate a Cursor in the mongo Shell documentation article (https://docs.mongodb.com/manual/tutorial/iterate-a-cursor/#iterate-a-cursor-in-the-mongo-shell).

Building complex queries using query operators

Below the dotted line in the preceding diagram, you will note a more complex form of stating the query. In the diagram, $op would represent one of the query operators (also referred to as query selectors) that are available in MongoDB.

Query operators fall into the following categories:

  • Comparison ($eq, $gt, $gte, $in, $lt, $lte, $ne, $nin)
  • Logical ($and, $not, $or, $nor)
  • Element ($exists, $type)
  • Evaluation ($expr, $jsonSchema, $mod, $regex, $text, $where)
  • Geospatial ($geoIntersects, $geoWithin, $near, $nearSphere)
  • Array ($all, $elemMatch, $size)
  • Bitwise ($bitsAllClear, $bitsAllSet, $bitsAnyClear, $bitsAnySet)
  • Comments ($comment)

As an example, let's say that the management wants a count of customers over the age of 50 who live in countries where English is not the majority language. The first thing we can do is create a JavaScript array variable of country codes for majority-English-speaking countries. As a reference, we draw the list of countries from a document produced by the British government (https://www.gov.uk/english-language/exemptions). The code can be seen in the following snippet:

maj_english = ["AG","AU","BS","BB","BZ","CA","DM","GB","GD",
"GY","IE","JM","NZ","KN","LC","VC","TT","US"];

Next, we create two clauses. The first addresses customers not in ($nin) the preceding list, as follows:

{"country" : { "$nin": maj_english }}

The second clause addresses the date of birth. Note the use of the less-than ($lt) operator in the following code snippet:

{"dateOfBirth" : {"$lt":"1968-01-01"}}
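Because the value being compared is a string, this clause relies on string ordering rather than true date arithmetic. The plain-JavaScript sketch below shows why that works, under the assumption that dateOfBirth is stored as a zero-padded YYYY-MM-DD string. The bornBefore helper and the sample values are made up for illustration.

```javascript
// Assumption: dateOfBirth is stored as a zero-padded "YYYY-MM-DD" string.
const cutoff = "1968-01-01";

// String comparison proceeds character by character, which agrees with date
// order only when every value uses the same zero-padded format.
const bornBefore = (dateOfBirth) => dateOfBirth < cutoff;

console.log(bornBefore("1967-12-31")); // true
console.log(bornBefore("1968-01-01")); // false
console.log(bornBefore("1975-06-15")); // false

// A value in another format compares incorrectly:
console.log(bornBefore("5/14/1960")); // false, although 1960 is earlier
```

If the field held values in mixed formats, a string comparison like this one would silently skip some customers, so it is worth confirming the stored format before trusting the count.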

We then join the two clauses, using the $and query operator, for the final complete query, as follows:

db.customers.find({
    "$and" : [
        {"country" : { "$nin": maj_english }},
        {"dateOfBirth" : {"$lt":"1968-01-01"}}
    ]
}).count();

Here is the result:

Subsequent chapters in this book will cover other query operators of increasing degrees of complexity. For more information on query operators, see this page: https://docs.mongodb.com/manual/reference/operator/query/#query-and-projection-operators.

Applying projections to your query

The second argument to the db.collection.find() and db.collection.findOne() commands, also using JSON syntax, is the projection argument. The projection argument allows you to control which fields appear or do not appear in the final output. 

Here is a summary of the projection options:

For this example, we will assume that the management has asked for the name and email address of all customers from Quebec. Accordingly, you construct a query that uses the projection argument to include only the fields pertinent to the name, email address, and location. You also suppress the _id field, as it would only confuse those in management.

As an example, we first define a JavaScript variable that represents the query portion. Next, we define a variable that represents the projection. Finally, we issue the db.collection.find() command, adding the pretty() cursor modifier (discussed next) to improve the output appearance. Here is how that query might appear:

query = { "stateProvince" : "QC" }
projection = { "_id":0, "firstName":1, "lastName":1, "email":1, "stateProvince":1, "country":1 }
db.customers.find(query,projection).pretty();

Here is the output:

In addition to values of 1 or 0, you can also use projection operators to perform more granular operations on the projection. These are especially useful when working with embedded arrays, covered in later chapters of this book.
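Returning to the plain 1 and 0 values, the effect of an inclusion projection can be modeled in a few lines of JavaScript. This is a rough sketch, not MongoDB's implementation: applyInclusionProjection is a hypothetical helper, the sample document is made up, and only top-level fields are handled.

```javascript
// Hypothetical helper modelling an inclusion projection; not a MongoDB API.
function applyInclusionProjection(doc, projection) {
  const output = {};
  // _id is returned by default unless the projection sets it to 0.
  if (projection._id !== 0 && "_id" in doc) {
    output._id = doc._id;
  }
  // Copy only the fields flagged with 1 that actually exist in the document.
  for (const [field, flag] of Object.entries(projection)) {
    if (field !== "_id" && flag === 1 && field in doc) {
      output[field] = doc[field];
    }
  }
  return output;
}

// Made-up sample document, for illustration only.
const customer = {
  _id: 1,
  firstName: "Ann",
  lastName: "Example",
  email: "ann@example.com",
  stateProvince: "QC",
  country: "CA",
  phoneNumber: "000-000-0000",
};
const projection = { _id: 0, firstName: 1, lastName: 1, email: 1, stateProvince: 1, country: 1 };

// Prints the document without _id and phoneNumber.
console.log(applyInclusionProjection(customer, projection));
```

Fields that are not named in the projection, such as phoneNumber here, simply drop out of the result, while _id has to be suppressed explicitly.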

Later in this book, we will address more sophisticated ways of using the cursor when we discuss the MongoDB aggregation pipeline (see Chapter 9, Handling Complex Queries in MongoDB). Here is the documentation on projection (https://docs.mongodb.com/manual/reference/method/db.collection.find/#projection) and projection operators (https://docs.mongodb.com/manual/reference/operator/projection/#projection-operators).

MongoDB 4.4 introduces a new allowDiskUse() method that allows MongoDB to write temporary files to disk if a query sort operation exceeds the default allowed memory size of 100 megabytes (MB). allowDiskUse() is appended after the sort() method, which itself would follow a query (for example, db.collection.find().sort().allowDiskUse()).

Updating a document

Updating a document involves two mandatory arguments: the filter and the update document. The third (optional) argument is options. The primary shell methods used are db.collection.updateOne(), which (as the name implies) updates only a single document, and db.collection.updateMany(), which lets you perform updates on all documents matching the filter. The syntax of the former is illustrated in the following diagram:

The filter has exactly the same syntax as the query (that is, the first argument to the db.collection.find() command) discussed previously, so we will not repeat that discussion here. The update usually takes the form of a partial document, containing only the fields that need to be replaced with respect to the original. Finally, options are summarized here:

There is a db.collection.replaceOne() shell method that can be used as an alternative to db.collection.updateOne(). It is used in situations where you need to replace the entire document, not just update a subset of fields within an existing document.

If the update operation includes a field that does not already exist, due to the flexible nature of MongoDB document structuring, MongoDB will add the field to the document. This capability allows you to perform ad hoc database schema adjustments without loss of data. No more messy legacy Relational Database Management System (RDBMS) data conversions!

Documentation on the write concern document can be found at https://docs.mongodb.com/manual/reference/write-concern/#write-concern-specification. Documentation on the collation document can be found at https://docs.mongodb.com/manual/reference/collation/#collation-document. These are both described in greater detail in Chapter 13, Deploying a Replica Set.

Update operators

In order to perform an update, the update document needs to use update operators. At a minimum, you need to use the $set update operator to assign the new value. The update operators are broken out into categories, as follows:

  • Fields ($currentDate, $inc, $min, $max, $mul, $rename, $set, $setOnInsert, $unset)
  • Array ($, $[], $[<identifier>], $addToSet, $pop, $pull, $push, $pullAll)
  • Bitwise ($bit)

In addition, there are modifiers ($each, $position, $slice, $sort) that are used in conjunction with $push so that updates can affect all or a subset of an embedded array. Further, the $each modifier can be used with $addToSet to add multiple elements to an embedded array. A more detailed discussion of the array and modifier operators is presented in Chapter 10, Working with Complex Documents across Collections. In this section, we will only deal with the fields update operators.
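As a rough model of the most common fields operator, $set, the following plain-JavaScript sketch shows how named fields are overwritten, missing fields are added, and everything else is left alone. The applySet helper and the sample document are hypothetical, and nested (dot notation) field names are deliberately not handled.

```javascript
// Hypothetical helper modelling { $set: {...} } on one document; not a MongoDB API.
function applySet(doc, update) {
  // Fields named in $set overwrite existing values or are added if missing;
  // all other fields are carried over unchanged.
  return { ...doc, ...update.$set };
}

// Made-up sample document, for illustration only.
const before = { name: "Sample Customer", email: "old@example.com" };

const after = applySet(before, {
  $set: { email: "new@example.com", phoneNumber: "000-000-0000" },
});

console.log(after.name);        // Sample Customer (untouched)
console.log(after.email);       // new@example.com (overwritten)
console.log(after.phoneNumber); // 000-000-0000 (added)
```

The original object is not modified in this model; the database, by contrast, changes the stored document in place.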

Bitwise update operators let you perform AND, OR, and XOR operations, at the binary level, on integer values. For more information, see the MongoDB documentation on the $bit update operator. You can find more about update operators at https://docs.mongodb.com/manual/reference/operator/update/#update-operators.

Updating customer contact information

The simplest, and probably most common, type of update is a change to contact information. First, it's extremely important to test your update filter by using it in a db.collection.find() command. We use db.collection.find() instead of db.collection.findOne() because find() returns every matching document, which lets you confirm that your update will affect only the one customer! Thus, if we wish to change the email address and phone number for a fictitious customer named Ola Mann, we start with this command:

db.customers.find({ "email" : "omann137@Chunghwa.com" });

Now that we have verified the existing customer email is correct and that our filter only affects one document, we are prepared to issue the update, as follows:

db.customers.updateOne(
    {"email" : "omann137@Chunghwa.com"},
    { $set: {
        "email" : "ola.mann22@somenet.com",
        "phoneNumber" : "+94-111-222-3333"
    }}
);

Here is the result:

Let's now cover data cleanup.

Data cleanup

Updates are also often performed to clean up data. As an example, you notice that for some customers, the buildingName, floor, and roomApartmentCondoNumber fields are set to null. This can cause problems when customer reports are generated, and the management wants you to replace the null values with an empty string instead.

Because you will be using the updateMany() command, the potential for a total database meltdown disaster is great! Accordingly, to be on the safe side, here are the actions you can perform:

  • Back up the database (discussed in the next section).
  • Run a find() command that tests the query filter, ensuring the correct documents are affected.
  • Make a note of the first customer key on the list produced by the find() command.
  • Build and run an updateOne() command using the tested filter.
  • Use findOne() to confirm that the first customer document was changed correctly.
  • Rerun the first find() command, appending count(), to get a count of the documents remaining to be updated.
  • Run an updateMany() command using the tested syntax, checking to see if the count matches.
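The checks in this list (everything after the backup) can be rehearsed on an in-memory array before touching real data. The sketch below is a plain-JavaScript model only: the documents and customerKey values are made up, and array methods stand in for the find(), updateOne(), and updateMany() shell methods.

```javascript
// Made-up documents; the customerKey values are hypothetical.
const customers = [
  { customerKey: "A1", buildingName: null },
  { customerKey: "B2", buildingName: "Tower" },
  { customerKey: "C3", buildingName: null },
  { customerKey: "D4", buildingName: null },
];

// The filter under test: buildingName is explicitly null.
const isNullBuilding = (doc) => doc.buildingName === null;

// Test the filter and note the first key on the list.
const firstKey = customers.filter(isNullBuilding)[0].customerKey;

// Update only the first match, as updateOne() would.
customers.find(isNullBuilding).buildingName = "";

// Confirm that the noted document was changed correctly.
const confirmed =
  customers.find((doc) => doc.customerKey === firstKey).buildingName === "";

// Count the documents remaining to be updated.
const remaining = customers.filter(isNullBuilding).length;

// Update the rest, as updateMany() would, and compare the counts.
let modified = 0;
for (const doc of customers.filter(isNullBuilding)) {
  doc.buildingName = "";
  modified += 1;
}

console.log(firstKey, confirmed, remaining, modified === remaining); // A1 true 2 true
```

The final comparison is the important one: if the number of modified documents differs from the count taken just before the bulk update, something other than the intended documents was touched.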

As mentioned previously, you should first craft a query to ensure that only those documents where the field's type is null pass the filter. Here is the query for the buildingName field:

db.customers.find({"buildingName":{"$type":"null"}},{"customerKey":1,"buildingName":1});

Here is the output:

Make a note of the customerKey field for the first entry on the list produced from the query. You can then run an updateOne() command to reset the null value to an empty string. You then confirm the change was successful using the customerKey field. Here is a screenshot of this sequence of operations:

Before the final update, perform a query to get a count of how many documents remain to be modified, as follows:

db.customers.find({"buildingName":{"$type":"null"}}).count();

You are now ready to perform the final update, using this command:

db.customers.updateMany({"buildingName":{"$type":"null"}},
    {"$set":{"buildingName":""}});

And here is the result:

You could perform conditional updates on all three fields simultaneously using the MongoDB Aggregation Framework and the $ifNull aggregation pipeline operator. This topic is covered in detail in Chapter 9, Handling Complex Queries in MongoDB.

Deleting a document

The command used to delete a single document is db.collection.deleteOne(). If you want to delete more than one document, use db.collection.deleteMany() instead. If you wish to delete an entire collection, the most efficient command is db.collection.drop().

Document deletion has features in common with both update and query operations. There are two arguments. The first is the filter, which is the same as described in the preceding section on updates. The second consists of options, which are a limited subset of the options available during an update: only the writeConcern and collation options are supported by a delete operation.

The following diagram maps out the generic syntax for the db.collection.deleteOne() shell method:

For the purposes of illustration, let's turn our attention back to the learn database featured in the first section in this chapter on inserting documents. First, we use the load() shell helper method to restore the sample data for the learn database, as illustrated in the following screenshot:

You will notice that there is a duplicate document for Chapter 3. If we were to specify either the chapter number or chapter name as the query filter, we would end up deleting both documents. The only truly unique field in this example is the _id field. For this illustration, we frame a deleteOne() command using the _id field as the deletion criterion. However, as you can see from the following screenshot, you cannot simply state the value of the _id field as a string: the deletion fails, the deletedCount value is 0, and the duplicate entry still remains:

Granted—we could frame a delete query filter using the chapter number and rely upon deleteOne() to only delete the first match; however, that's a risky strategy. It's much better to exactly locate the document to be deleted.

The delete operation fails because the _id field is actually an ObjectId instance, not a string! Accordingly, if we reframe the delete query filter as an ObjectId, the operation is successful. Here is the final command and result:
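The type mismatch can be illustrated with a small plain-JavaScript model. IdModel and sameId are hypothetical stand-ins written for this sketch (the real ObjectId type is supplied by MongoDB), and the hex value is made up. In the shell, the working filter takes the form {"_id": ObjectId("<hex string>")}.

```javascript
// Hypothetical stand-in for an ObjectId value; the real type comes from MongoDB.
class IdModel {
  constructor(hex) {
    this.hex = hex;
  }
}

// Two values match only when both the type and the content agree.
function sameId(stored, wanted) {
  if (stored instanceof IdModel && wanted instanceof IdModel) {
    return stored.hex === wanted.hex;
  }
  return stored === wanted;
}

const hex = "0123456789abcdef01234567"; // made-up 24-character hex value
const storedId = new IdModel(hex);

console.log(sameId(storedId, hex));              // false: a plain string is a different type
console.log(sameId(storedId, new IdModel(hex))); // true: same type, same value
```

The first comparison corresponds to the failed deletion, where the filter held a string. The second corresponds to the successful one, where the filter was wrapped in the matching type.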

Let's now have a look at the essential operations of backup and restore.