
NoSQL Comparison 2021: Couchbase Server, MongoDB, and Cassandra (DataStax)

Our new benchmark report includes performance results under a short-range scan workload, simulating activity typical for an e-commerce app.

What was compared?

Today, we are announcing the results of our latest research paper that compares the performance of three NoSQL databases: Couchbase Server v6.60, MongoDB v4.2.11, and DataStax Enterprise v6.8.3 (Cassandra). The evaluation was conducted on three different cluster configurations (4, 10, and 20 nodes) as well as under four different workloads.

Couchbase Server is both a document-oriented and a key-value distributed NoSQL database. It guarantees high performance with a built-in object-level cache, a SQL-like query language, asynchronous replication, ACID transactions, and data persistence. The database is designed to automatically scale resources such as CPU and RAM depending on the workload.

MongoDB is a document-oriented NoSQL database. It has extensive support for a variety of secondary indexes and API-based ad-hoc queries, as well as strong features for manipulating JSON documents. The database uses a separate and incremental approach to data replication and partitioning, which occur as completely independent processes.

DataStax Enterprise (Cassandra) is a wide-column store designed to handle large amounts of data across multiple commodity servers, providing high availability with no single point of failure.

In this blog post, we share the performance results of the databases under a short-range scan workload.

Workload with 95% scans and 5% updates

A short-range scan workload simulates threaded conversations typical in an e-commerce application, where each scan goes through the posts in a given thread. The workload was executed with the following settings:

  • The scan/update ratio was 95% to 5%.
  • The size of a data set was scaled in accordance with the cluster size: 50 million records on a 4-node cluster, 100 million records on a 10-node cluster, and 250 million records on a 20-node cluster. Each record is 1 KB in size, consisting of 10 fields and a key.
  • The maximum scan length reached 100 records.
  • The Zipfian request distribution was used.
  • The uniform distribution was used for the scan length.
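The settings above can be sketched as a small operation generator. This is only an illustration under stated assumptions, not the benchmark tool used in the report: the `ZipfianKeyChooser` class, the `user...` key format, and the Zipfian exponent of 0.99 are hypothetical choices, while the 95% scan share and the maximum scan length of 100 come from the list above.

```python
import bisect
import itertools
import random

SCAN_PROPORTION = 0.95  # scan/update ratio from the settings above
MAX_SCAN_LENGTH = 100   # maximum scan length from the settings above


class ZipfianKeyChooser:
    """Picks record indexes so that low indexes are requested far more often."""

    def __init__(self, record_count, exponent=0.99, rng=None):
        # exponent=0.99 is an assumption for illustration, not a value from the report.
        self.rng = rng if rng is not None else random.Random()
        weights = [1.0 / (rank ** exponent) for rank in range(1, record_count + 1)]
        self.cumulative = list(itertools.accumulate(weights))

    def next_index(self):
        # Inverse-CDF sampling: find the first rank whose cumulative weight covers the draw.
        target = self.rng.random() * self.cumulative[-1]
        return bisect.bisect_left(self.cumulative, target)


def next_operation(chooser, rng):
    """Returns ("scan", start_key, length) or ("update", key, None)."""
    key = f"user{chooser.next_index():012d}"  # hypothetical key format
    if rng.random() < SCAN_PROPORTION:
        # Scan length is drawn uniformly between 1 and the maximum.
        return ("scan", key, rng.randint(1, MAX_SCAN_LENGTH))
    return ("update", key, None)
```

Calling `next_operation` in a loop yields a stream in which roughly 19 out of 20 operations are scans that start at a popular key and read up to 100 records.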

In real-world situations, an example of a scan operation in an e-commerce app is viewing a product catalog. An update operation, in turn, can be a change to an existing product in the catalog, for instance, adding a model in a new color or changing the price.

The sizes of the data sets in our tests for the 4-node, 10-node, and 20-node clusters were 50 GB, 100 GB, and 250 GB, respectively. Our findings may prove useful to organizations that are evaluating a NoSQL system for an existing data set or optimizing a data set size to fit the database in use.
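The data set sizes follow directly from the record counts and the 1 KB record size. The short calculation below reproduces them, assuming decimal units (1 KB = 1,000 bytes) and ignoring keys, replicas, and index overhead; the per-node figure is a derived illustration rather than a number from the report.

```python
RECORD_SIZE_BYTES = 1_000  # 1 KB per record; decimal units are an assumption


def data_set_size_gb(record_count, record_size_bytes=RECORD_SIZE_BYTES):
    """Raw data set size in GB, without replicas or index overhead."""
    return record_count * record_size_bytes / 1_000_000_000


# (node count, record count) pairs taken from the workload settings.
CLUSTERS = [(4, 50_000_000), (10, 100_000_000), (20, 250_000_000)]

for nodes, records in CLUSTERS:
    total_gb = data_set_size_gb(records)
    print(f"{nodes:>2} nodes: {total_gb:.0f} GB total, {total_gb / nodes:.1f} GB per node")
```

Under these assumptions, each node holds between 10 GB and 12.5 GB of raw data, so the per-node load stays roughly constant while the cluster grows.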

The following queries were used to perform the short-range scan workload.

Couchbase N1QL

    SELECT RAW meta().id
    FROM `ycsb`
    WHERE meta().id >= $1
    ORDER BY meta().id
    LIMIT $2

MongoDB Query

    db.ycsb.find({
      _id: { $gte: $1 }
    }, {
      _id: 1
    }).sort({
      _id: 1
    }).limit($2)

Cassandra CQL

    SELECT id
    FROM table
    WHERE token(id) >= token($1)
    LIMIT $2
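In all three queries, `$1` is the start key and `$2` is the scan length. The in-memory sketch below illustrates the two scan semantics involved: the Couchbase and MongoDB queries return keys in key order, whereas the Cassandra query compares tokens, so rows come back in token (hash) order. The function names are hypothetical, and MD5 is used only as a stand-in hash (Cassandra's default partitioner uses Murmur3).

```python
import bisect
import hashlib


def scan_by_key(sorted_keys, start_key, limit):
    """Keys >= start_key in key order, at most `limit` of them."""
    start = bisect.bisect_left(sorted_keys, start_key)
    return sorted_keys[start:start + limit]


def token(key):
    # Stand-in hash for illustration only; not Cassandra's actual partitioner.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def scan_by_token(keys, start_key, limit):
    """Keys whose token is >= token(start_key), in token order."""
    ordered = sorted(keys, key=token)
    tokens = [token(k) for k in ordered]
    start = bisect.bisect_left(tokens, token(start_key))
    return ordered[start:start + limit]


keys = [f"user{i:04d}" for i in range(100)]
print(scan_by_key(keys, "user0010", 3))    # neighbors in key order
print(scan_by_token(keys, "user0010", 3))  # neighbors in token order
```

Both scans start at the requested key, but the token-based scan continues with whichever keys hash next, not with the lexicographically following ones.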

Short-range scan performance results

Couchbase demonstrated good scalability, with throughput growing nearly in proportion to the number of cluster nodes: from 9,625 ops/sec on a 4-node cluster to 22,580 ops/sec on a 10-node cluster. On a 20-node cluster, the throughput reached 33,095 ops/sec, which is about 47% more than on a 10-node cluster, with the request latency decreasing from 34 ms to about 13 ms, due to the usage of a primary index and replication of the Index Service. Request latency refers to the real-time delay in viewing a catalog or updating a product.
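To make the scaling claim concrete, the throughput figures quoted above can be compared with ideal linear scaling. The helper below is a simple sketch; `scaling_report` is a hypothetical name, and "efficiency" here means measured speedup divided by the increase in node count relative to the smallest cluster.

```python
def scaling_report(throughput_by_nodes):
    """Returns (nodes, ops per node, speedup, efficiency) relative to the smallest cluster."""
    nodes = sorted(throughput_by_nodes)
    base_nodes = nodes[0]
    base_ops = throughput_by_nodes[base_nodes]
    rows = []
    for n in nodes:
        ops = throughput_by_nodes[n]
        speedup = ops / base_ops   # measured throughput growth
        ideal = n / base_nodes     # growth under perfectly linear scaling
        rows.append((n, ops / n, speedup, speedup / ideal))
    return rows


# Couchbase throughput (ops/sec) by cluster size, as quoted in the text.
couchbase = {4: 9_625, 10: 22_580, 20: 33_095}

for n, per_node, speedup, efficiency in scaling_report(couchbase):
    print(f"{n:>2} nodes: {per_node:7.1f} ops/sec per node, "
          f"speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

The same function can be applied to the Cassandra figures given later in this post.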

Throughput and latency for the short-range scan workload

MongoDB's throughput grew only slightly, from 18,255 ops/sec to 21,440 ops/sec. The results were relatively the same regardless of cluster and data set sizes. MongoDB performed better than Couchbase on a 4-node cluster, but worse on 10- and 20-node clusters.

"Based on our tests, Couchbase scales better than MongoDB on larger clusters. Couchbase uses a peer-to-peer structure, enabling direct access to nodes. Meanwhile, MongoDB has master-slave relationships, where certain operations have to call Mongoose, an Object Document Mapper, and a configuration server to access a node, creating a queue." - Artsiom Yudovin, Altoros

Cassandra did not perform so well, with 2,570 ops/sec on a 4-node cluster, 4,230 ops/sec on a 10-node cluster, and 6,563 ops/sec on a 20-node cluster. However, Cassandra achieved a linear performance increase across all clusters and data sets. This can be explained by the fact that coordinator nodes send scan requests to other nodes in the cluster responsible for specific token ranges. The more nodes a cluster has, the less data falls in the target range on each node, and thus the less data each node has to return. This resulted in reduced per-node request processing time. As the coordinator sends the requests in parallel, the overall request processing time depends on the request latency of each cluster node, which decreases with cluster growth. This is confirmed by the gradual decrease of request latencies from 173 ms on a 4-node cluster to 104 ms on a 10-node cluster and 63 ms on a 20-node cluster.
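The reasoning above can be expressed as a toy latency model. It is a sketch of the explanation, not of Cassandra's internals: the cost constants are hypothetical, the rows in the requested range are assumed to be spread evenly across nodes, and the model is not meant to reproduce the measured latencies.

```python
def node_time_ms(rows_on_node, fixed_overhead_ms=5.0, per_row_ms=0.5):
    """Time one node needs to serve its share; both constants are hypothetical."""
    return fixed_overhead_ms + rows_on_node * per_row_ms


def scan_latency_ms(scan_rows, node_count, **costs):
    """Latency of one scan fanned out by a coordinator to all nodes in parallel."""
    # Assumption: rows in the target range are distributed evenly across nodes.
    base, extra = divmod(scan_rows, node_count)
    rows_per_node = [base + (1 if i < extra else 0) for i in range(node_count)]
    # Requests run in parallel, so the slowest node determines the overall latency.
    return max(node_time_ms(rows, **costs) for rows in rows_per_node)


for nodes in (4, 10, 20):
    print(f"{nodes:>2} nodes: {scan_latency_ms(100, nodes):.1f} ms (model)")
```

In this model, adding nodes shrinks each node's share of the scan, so the slowest node finishes sooner and the overall latency drops, which matches the direction of the measured trend.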

To sum up, MongoDB performed better than Couchbase on relatively small-sized clusters and data sets, but remained flat irrespective of the cluster size. On the other hand, Couchbase outscaled and outperformed MongoDB on 10- and 20-node clusters, showing throughput growth across data sets of 100 and 250 million records. MongoDB showed the ability to handle the increasing amount of data with the throughput remaining the same. Cassandra had linear performance growth, but lagged behind Couchbase and MongoDB in terms of performance on scan and update operations.

Unlike MongoDB and Couchbase, which are document-oriented databases, Cassandra is a wide-column store. Its structure and architecture design are better suited for write and read operations that in a real-life scenario correspond to creating a new product or viewing a particular product out of the whole catalog.

To learn more about how each database was configured, as well as how each performed in the evaluation, check out our full report. In addition to a short-range scan, the databases were tested across the update-heavy (50% reads and 50% updates), pagination (a query with a single filtering option to which an offset and a limit are applied), and JOIN (with grouping and ordering applied) workloads.

Download the full report here.

Further reading

  • Performance Evaluation of NoSQL Databases 2021: Couchbase Server, DataStax Enterprise (Cassandra), and MongoDB
  • Performance Evaluation of NoSQL Databases as a Service: Couchbase Capella, MongoDB Atlas, and Amazon DynamoDB
  • Comparing Database Query Languages in MySQL, Couchbase, and MongoDB

This blog post was written by Carlo Gutierrez with assistance from Artsiom Yudovin,
edited by Sophia Turol and Alex Khizhniak.