NoSQL Databases — learn.surkar.in

01 / NoSQL Categories

The Six Families of NoSQL

NoSQL is not a single technology but a family of storage paradigms, each optimized for a different access pattern. Choosing the right one depends on how you query, not how you store.

Document

Stores JSON/BSON documents with flexible schemas. Queries on nested fields. Examples: MongoDB, Couchbase.

Key-Value

Simple get/put by key. Extremely fast, often in-memory. Examples: Redis, Memcached, DynamoDB (simple mode).

Column-Family

Wide rows with dynamic columns grouped into families. Optimized for write-heavy, time-series-like workloads. Examples: Cassandra, HBase.

Graph

Nodes and edges as first-class citizens. Excels at traversals and relationship queries. Examples: Neo4j, Amazon Neptune.

Time-Series

Optimized for append-heavy, timestamp-indexed data with downsampling and retention policies. Examples: InfluxDB, TimescaleDB.

Search Engine

Full-text search over inverted indexes with scoring and aggregations. Examples: Elasticsearch, Apache Solr.

Key Insight

Many production systems use polyglot persistence: a relational database for transactions, Redis for caching, Elasticsearch for search, and a graph database for recommendations -- all in the same application.

02 / MongoDB

MongoDB: The Document Store

MongoDB stores data as BSON (Binary JSON) documents inside collections. There is no enforced schema at write time -- this is called schema-on-read. You decide how to interpret the data when you query it, not when you insert it.

BSON Document Structure

{
  "_id": ObjectId("64a7f..."),
  "name": "Alice",
  "orders": [
    { "item": "Widget", "qty": 3, "price": 9.99 },
    { "item": "Gadget", "qty": 1, "price": 24.50 }
  ],
  "address": { "city": "Bangalore", "zip": "560001" }
}

Aggregation Pipeline

MongoDB's aggregation pipeline is a sequence of stages that transform documents. Each stage feeds its output to the next, similar to Unix pipes.

Aggregation Pipeline Flow

$match

→

$group

→

$sort

→

$project

→

$limit

Replica Sets & Sharding

Replica sets provide high availability: one primary node accepts writes, and secondary nodes replicate asynchronously. If the primary fails, an automatic election promotes a secondary.

Sharding distributes data across multiple machines by a shard key. Each shard holds a range (or hash) of keys. A mongos router directs queries to the correct shard.

Sharded Cluster

Client

→

mongos (Router)

→

Shard A

Client

→

mongos (Router)

→

Shard B

Client

→

mongos (Router)

→

Shard C

Watch Out

A poorly chosen shard key causes "hot spots" where one shard receives disproportionate traffic. Choose a key with high cardinality and even distribution.

03 / Redis & DynamoDB

Key-Value Powerhouses

Redis

Redis is an in-memory data structure store. Unlike simple key-value caches, Redis supports rich data structures: strings, lists, sets, sorted sets, hashes, streams, bitmaps, and HyperLogLogs.

Data Structure	Use Case	Example Command
String	Caching, counters	`SET key val`, `INCR key`
Hash	Object storage	`HSET user:1 name "Alice"`
List	Queues, feeds	`LPUSH queue task1`
Sorted Set	Leaderboards, ranking	`ZADD board 100 "alice"`
Stream	Event log, pub/sub	`XADD stream * field val`

Persistence: Redis offers two persistence modes. RDB takes point-in-time snapshots at intervals. AOF (Append Only File) logs every write operation. You can combine both for durability with fast recovery.

Redis Cluster: Data is divided into 16,384 hash slots distributed across master nodes. Each master has replicas for failover. Clients use hash slot mapping to route commands directly to the right node.

DynamoDB

Amazon DynamoDB is a fully managed key-value and document database. Every table must have a partition key (hash key). Optionally, you add a sort key to enable range queries within a partition.

DynamoDB Table Structure

Partition Key (PK)

Sort Key (SK)

→

Attributes (flexible)

Global Secondary Indexes (GSI): A GSI lets you query on a completely different partition key and sort key than the base table. DynamoDB replicates data to the GSI asynchronously, so GSI reads are eventually consistent.

Best Practice

In DynamoDB, design your access patterns first, then model your table. Use single-table design with composite keys (e.g., PK=USER#123, SK=ORDER#456) to avoid joins entirely.

04 / Cassandra

Cassandra: Masterless Wide-Column Store

Apache Cassandra is designed for massive write throughput and linear horizontal scalability. Every node in the cluster is equal -- there is no single master or leader.

Gossip Protocol

Nodes communicate cluster state through a gossip protocol: each node periodically exchanges metadata (health, token ranges, schema versions) with a few random peers. Information propagates exponentially, reaching all nodes quickly.

Data Model

Cassandra tables have a partition key that determines which node stores the data, and optional clustering columns that determine sort order within a partition. Data within a single partition is stored together on disk for fast sequential reads.

CREATE TABLE orders (
  customer_id UUID,       -- partition key
  order_date TIMESTAMP,   -- clustering column
  order_id UUID,          -- clustering column
  total DECIMAL,
  PRIMARY KEY (customer_id, order_date, order_id)
) WITH CLUSTERING ORDER BY (order_date DESC);

Tunable Consistency

Cassandra lets you choose consistency per query. With a replication factor of 3:

Consistency Level	Nodes Required	Trade-off
`ONE`	1 replica	Fastest, lowest consistency
`QUORUM`	2 of 3 replicas	Strong consistency if R+W > N
`ALL`	3 replicas	Strongest, but any node failure blocks writes
`LOCAL_QUORUM`	Quorum in local DC	Multi-DC deployments, low latency

Consistency Formula

If R + W > N (reads + writes > replication factor), you get strong consistency. For example, QUORUM reads (2) + QUORUM writes (2) > 3 replicas = strong consistency.

05 / Neo4j

Neo4j: The Graph Database

Neo4j stores data as a property graph: nodes (entities) connected by relationships (edges). Both nodes and relationships can hold key-value properties. Relationships are always directed and typed.

Property Graph Example

(:Person {name:"Alice"})

-[:FRIENDS_WITH]->

(:Person {name:"Bob"})

(:Person {name:"Alice"})

-[:PURCHASED]->

(:Product {name:"Widget"})

Cypher Query Language

Cypher uses ASCII-art syntax to describe graph patterns. Nodes are parentheses, relationships are arrows with brackets.

// Find friends of friends who bought the same product
MATCH (alice:Person {name: "Alice"})-[:FRIENDS_WITH]->(friend)
      -[:FRIENDS_WITH]->(fof)-[:PURCHASED]->(product)
WHERE alice <> fof
RETURN fof.name, product.name;

Index-Free Adjacency

In Neo4j, each node physically stores pointers to its adjacent relationships. Traversing a relationship is a direct pointer follow, not a global index lookup. This means traversal cost is proportional to the local neighborhood size, not the total graph size -- a key advantage over relational JOIN-based graph queries.

When to Use a Graph DB

Graph databases shine when the relationships ARE the data: social networks, fraud detection (following money trails), recommendation engines, knowledge graphs, and access control hierarchies.

06 / Theoretical Foundations

ACID vs BASE, CAP & PACELC

ACID vs BASE

Property	ACID (Relational)	BASE (NoSQL)
Atomicity / Basically Available	All-or-nothing transactions	System guarantees availability; partial failures are tolerated
Consistency / Soft state	DB moves from one valid state to another	State may change over time even without new input (replicas converging)
Isolation / Eventual consistency	Concurrent transactions don't interfere	Reads may see stale data temporarily, but will converge
Durability	Committed data survives crashes	(Still durable, but consistency is relaxed)

CAP Theorem

In a distributed system experiencing a network Partition, you must choose between Consistency (every read returns the most recent write) and Availability (every request gets a response). You cannot have both during a partition.

CAP Trade-offs

CP: MongoDB, HBase

AP: Cassandra, DynamoDB

CA: Single-node RDBMS (no partition)

Common Misconception

CAP does not mean you pick two and lose one permanently. Partitions are rare events. During normal operation, you can have all three. CAP only forces a choice during a partition.

PACELC

PACELC extends CAP: if there is a Partition, choose Availability or Consistency. Else (normal operation), choose Latency or Consistency.

System	During Partition (PA/PC)	Else (EL/EC)
DynamoDB	PA (available)	EL (low latency)
Cassandra	PA (available)	EL (low latency)
MongoDB	PC (consistent)	EC (consistent)
HBase	PC (consistent)	EC (consistent)

Time-Series & Search Engines

InfluxDB and TimescaleDB are optimized for time-stamped data. InfluxDB uses a custom TSM (Time-Structured Merge) storage engine with built-in downsampling and retention policies. TimescaleDB extends PostgreSQL with hypertables that auto-partition by time, giving you full SQL over time-series data.

Elasticsearch builds an inverted index: for each unique term, it stores a sorted list of document IDs containing that term. At query time, it scores documents using BM25 (Best Matching 25), a probabilistic relevance function that considers term frequency, inverse document frequency, and document length normalization.

Inverted Index

"nosql" → [doc1, doc4, doc7]

"database" → [doc1, doc2, doc4, doc9]

"redis" → [doc3, doc7]

BM25 Scoring

BM25 improves on TF-IDF by adding saturation (diminishing returns for high term frequency) and document length normalization. A short document with two mentions of a term ranks higher than a long document with three mentions.

Test Yourself

Score: 0 / 10

Question 01

What does "schema-on-read" mean in MongoDB?

Schema-on-read means MongoDB accepts any valid BSON document regardless of structure. The application interprets and validates the shape of data at query/read time, not at insertion.

Question 02

In Cassandra with replication factor 3, which consistency level guarantees strong consistency when both reads and writes use it?

With RF=3, QUORUM requires 2 nodes. R(2) + W(2) = 4 > 3 (N), satisfying the strong consistency formula R + W > N.

Question 03

What is "index-free adjacency" in Neo4j?

Index-free adjacency means each node has direct pointers to its neighbors. Traversals follow pointers rather than performing global index lookups, making traversal cost proportional to the local neighborhood, not the entire graph.

Question 04

How many hash slots does a Redis Cluster partition data into?

Redis Cluster uses exactly 16,384 hash slots. Each key is mapped to a slot via CRC16(key) mod 16384, and slots are distributed across master nodes.

Question 05

In DynamoDB, what does a Global Secondary Index (GSI) allow you to do?

A GSI creates an alternate view of the table with a different partition key and optional sort key, enabling queries on attributes that are not part of the base table's primary key. GSI reads are eventually consistent.

Question 06

What does the CAP theorem state about distributed systems during a network partition?

CAP states that during a network partition (P), a system must choose between Consistency (C) -- every read sees the latest write -- and Availability (A) -- every request gets a response. During normal operation, you can have both.

Question 07

How do nodes in a Cassandra cluster discover each other's state?

Cassandra uses a peer-to-peer gossip protocol. Each node periodically shares metadata with random peers, and information propagates exponentially through the cluster without any central coordinator.

Question 08

What data structure does Elasticsearch use internally to enable fast full-text search?

Elasticsearch builds an inverted index that maps each unique term to a sorted list of document IDs. This allows extremely fast lookups: given a search term, Elasticsearch immediately knows which documents contain it.

Question 09

In PACELC, what trade-off does a system face during normal (non-partition) operation?

PACELC says: during a Partition, choose A or C. Else (normal operation), choose between Latency (L) and Consistency (C). For example, DynamoDB favors low latency (PA/EL), while MongoDB favors consistency (PC/EC).

Question 10

Which Redis persistence mode logs every write operation for maximum durability?

AOF (Append Only File) logs every write operation. With the "always" fsync policy, it provides the strongest durability guarantee but at the cost of throughput. RDB only takes periodic snapshots.