MongoDB has a flexible storage system, which means stored objects are not necessarily required to have the same structure or fields. MongoDB also has some optimization features, which distributes the data collections across, being overall a more balanced and performance focused system.
MongoDB is a schema-free, document-oriented database written in C++. As a document store based, it stores values (referred to as documents) in the form of encoded data.
The choice of encoded format in MongoDB is JSON. This means that even if the data is nested inside JSON documents, it will still be queryable and indexable.
Sharding is the partitioning and distributing of data across multiple nodes. A shard is a collection of MongoDB nodes. Using shards also means the ability to make an horizontally scalation across multiple nodes. In the case that there is an application using a single database server, it can be converted to sharded cluster with very few changes to the original application. Software is almost completely decoupled from the public APIs exposed to the client side.
Each one holds a copy of the metadata indicating which shard contains what data.
A group of servers acting as a main interface for one or more clients. A router dipatch the client requests to the appropriate shards using the configuration servers to know who owns the required information.
Read and write actions are sent from the clients to one of the router servers in the cluster, and are automatically routed by that server to the appropriate shards that contain the data with the help of the configuration servers.
A shard in MongoDB has a data replication scheme, which creates a copy set of each shard that holds exactly the same data. There are two types of replication schemes in MongoDB: Master-Slave replication and Replica-Set replication. Replica-Set provides more automation and better failure handling, while Master-Slave requires the administrator intervention more often. Regardless of the replication scheme, at any point in time in a replica set, only one shard acts as the primary shard, all other replica shards are secondary shards. All write and read operations go to the primary shard, and are then distributed evenly (if needed) to the other secondary shards in the set.
In the graphic below, we see the MongoDB architecture explained, showing the router servers in green, the configuration servers in yellow, and the shards that contain the blue MongoDB nodes.
It should be noted that sharding (or sharing the data between shards) in MongoDB is completely automatic, which reduces the failure rate and makes it a highly scalable database management system.
Linux, Unix, BSD, Windows, Mac OSX, Android
Document based: BSON (Binary JSON)
WiredTiger (default from v3.2): document-level concurrency model, checkpointing, and compression, among other features. In MongoDB Enterprise, WiredTiger also supports Encryption at Rest.
MMAPv1 (the original): performs well on workloads with high volumes of reads and writes, as well as in-place updates.
HTTP verbs mapping:
Some of the common HTTP result codes are often used inside REST APIs:
Partitioning and distributing data across multiple nodes (sharding). Data can be duplicated across shards (replica sets). Data is written in a single shard instance and is successivelly distributed in the replica set.
Very good horizontal scalability. Shards can be added easily on running systems. Safer, easier and less expensive than vertical scaling (cpu,storage,ram).
When a single write operation modifies multiple documents, the modification of each document is atomic, but the operation as a whole is not atomic and other operations may interleave. For cases where a sequence of write operations must operate as if in a single transaction, you can implement a two-phase commit in your application. Using two-phase commit ensures data consistency.