NoSQL and Alternatives: When SQL Isn't the Answer
This chapter helps you recognize when the relational model is the wrong fit for a problem and pick a better store. It covers the main NoSQL families and how PostgreSQL stacks up against SQLite and MySQL.
The Three-Second Pitch per Family
There are four broad families of non-relational databases, each solving a specific problem. Use them alongside PostgreSQL, not instead of it, unless you have a clear reason otherwise.
- Key-value stores (Redis, Memcached): get/set by key, in-memory, sub-millisecond.
- Document stores (MongoDB, Couchbase): nested JSON-ish documents.
- Column stores (ClickHouse, BigQuery, Cassandra): analytical scans over huge tables.
- Graph stores (Neo4j, Dgraph): traversing relationships (friend-of-friend, recommendation chains).
Key-Value: Redis
Redis is an in-memory key-value store with data structures (strings, lists, sets, hashes, sorted sets, streams). It's fast because it's in memory and because its commands are simple.
Use it for:
- Session storage. Small, short-lived, read constantly. Moving these off Postgres removes write pressure.
- Rate limiting. INCR a counter per user per minute, expire it. Trivial in Redis, awkward in SQL.
- Queues. Lists (LPUSH/BRPOP) or Streams (XADD/XREADGROUP). See also RabbitMQ or SQS for heavier needs.
- Caching. Store the result of an expensive query, key by its arguments, expire in N seconds.
Don't use it for:
- Anything where the source of truth can't be rebuilt. Redis is memory; treat its contents as transient.
- Complex queries. There's no SQL equivalent; you work by known keys.
A quick taste:
redis-cli
> SET session:abc123 "user_id=42;expires=..."
OK
> GET session:abc123
"user_id=42;expires=..."
> EXPIRE session:abc123 3600
(integer) 1
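The rate-limiting pattern mentioned above is just as short. A sketch in the same redis-cli style; the key name (one counter per user per minute) is illustrative:

```
> INCR ratelimit:user42:2024-01-15T10:03
(integer) 1
> EXPIRE ratelimit:user42:2024-01-15T10:03 60
(integer) 1
> INCR ratelimit:user42:2024-01-15T10:03
(integer) 2
```

If the counter returned by INCR exceeds your threshold within the window, reject the request; the EXPIRE makes the window clean itself up.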
Document: MongoDB
A document store keeps nested JSON-like objects. Good for data that doesn't fit neatly into rows and columns, especially when the schema changes often.
Use it for:
- Content-heavy schemas with deep nesting (product catalogs, CMS content, per-tenant custom fields).
- Event payloads where each event type has a different shape.
Don't use it for:
- Strongly relational data. Joins are awkward; referential integrity is on you.
- Transactions across documents. MongoDB added multi-document transactions in 4.0 but they're slower and not the paved path.
Before reaching for MongoDB, try PostgreSQL's JSONB:
CREATE TABLE events (
id SERIAL PRIMARY KEY,
type TEXT NOT NULL,
payload JSONB NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX ON events USING GIN (payload jsonb_path_ops);
INSERT INTO events (type, payload) VALUES
('checkout', '{"user_id": 42, "total": 89.50, "items": 3}'),
('signup', '{"user_id": 43, "source": "newsletter"}');
-- Filter by a key
SELECT id, payload->>'source' AS source
FROM events
WHERE type = 'signup';
-- Filter by containment (uses the GIN index)
SELECT * FROM events WHERE payload @> '{"user_id": 42}';
You get relational + document in one database. Most "we need MongoDB" requirements dissolve once a team sees JSONB.
Column: ClickHouse / BigQuery
Column stores lay out data column by column instead of row by row. Analytical queries (SUM over 10 billion rows) are much faster because the engine reads only the columns it needs.
Use them for:
- Analytics and BI over large historical datasets.
- Log and metric storage (billions of rows with a few hundred columns).
- Any query that touches a few columns across most rows.
Don't use them for:
- Transactional workloads. Writes are slow; updates and deletes often rewrite the partition.
- Point lookups. A column store is the wrong tool for SELECT * WHERE id = 42.
Typical setup: OLTP in PostgreSQL, replicate to ClickHouse (or export to BigQuery) for analytics.
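On the ClickHouse side, that split might look like the sketch below. The table and column names are illustrative (loosely mirroring the events table from the JSONB example):

```sql
-- ClickHouse: columnar table, ordered for time-range scans
CREATE TABLE events
(
    id         UInt64,
    type       String,
    user_id    UInt64,
    total      Decimal(10, 2),
    created_at DateTime
)
ENGINE = MergeTree
ORDER BY (type, created_at);

-- The kind of query a column store is built for:
-- touches two columns, scans most rows
SELECT type, count(), sum(total)
FROM events
WHERE created_at >= now() - INTERVAL 30 DAY
GROUP BY type;
```

The aggregate reads only the type, total, and created_at columns; the rest of the table is never touched, which is where the speedup comes from.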
Graph: Neo4j
Graph databases traverse edges efficiently. "Friends of my friends who live in Berlin and liked this post" is a three-hop query: in a relational schema that's three JOINs; in a graph DB it's a walk along edges.
Use them for:
- Recommendation engines with rich relationship patterns.
- Fraud detection (rings of connected accounts).
- Knowledge graphs.
Don't use them for:
- General OLTP. You lose transactions and SQL familiarity.
Most applications don't need a graph DB. PostgreSQL handles 3 or 4 hops fine with good indexes.
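For the occasional multi-hop query, a recursive CTE in PostgreSQL goes a long way. A sketch, assuming a hypothetical friendships(user_id, friend_id) table:

```sql
-- Friends-of-friends-of-friends: everyone reachable within 3 hops
WITH RECURSIVE reachable AS (
    SELECT friend_id, 1 AS hops
    FROM friendships
    WHERE user_id = 42
  UNION
    SELECT f.friend_id, r.hops + 1
    FROM friendships f
    JOIN reachable r ON f.user_id = r.friend_id
    WHERE r.hops < 3
)
SELECT DISTINCT friend_id
FROM reachable
WHERE friend_id <> 42;
```

With an index on friendships(user_id), each hop is an index lookup; it's when traversals go much deeper or touch a large fraction of the graph that a dedicated graph store starts to pay for itself.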
SQLite: The One You Already Have
SQLite is a relational database in a single file, embedded in your process. No server, no network, no accounts.
Use it for:
- Mobile and desktop apps. iOS and Android ship it.
- CLI tools that need local storage.
- Tests. Spinning up SQLite per test is cheap. (Caveat: test-vs-prod divergence; see below.)
- Small websites. SQLite handles modest write loads on modern hardware. Really.
Don't use it for:
- High concurrent writes. SQLite serializes writes; one writer at a time.
- Anything needing a network client. It's embedded, not a server.
Key differences from PostgreSQL:
| | PostgreSQL | SQLite |
|---|---|---|
| Server | required | none (library) |
| Concurrent writes | many | one at a time |
| Data types | strict | dynamic (column types are hints) |
| Full outer join | yes | yes (3.39+) |
| Enforced FK | yes | yes (pragma foreign_keys=ON) |
| Max database size | TB+ fine | ~281 TB theoretical, much less practical |
Testing against SQLite when prod is PostgreSQL is a classic trap: they disagree on quoting, on NULL sort order, on type coercion, on supported functions. Prefer a real PostgreSQL in tests (Docker, pg_tmp, testcontainers).
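A small taste of the type-coercion divergence: the same INSERT behaves differently in the two engines. A sketch you can run in the sqlite3 shell:

```sql
-- SQLite: column types are hints, so this succeeds
CREATE TABLE t (n INTEGER);
INSERT INTO t VALUES ('not a number');  -- stored as TEXT, no error
SELECT n, typeof(n) FROM t;

-- PostgreSQL rejects the same INSERT:
-- ERROR:  invalid input syntax for type integer: "not a number"
```

A test suite running on SQLite would pass here; production on PostgreSQL would throw. That's the trap in miniature.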
MySQL: The Other Big Relational DB
MySQL (and its fork MariaDB) is the other open-source relational heavyweight. Most of what you learn in this tutorial transfers. The differences that trip people up:
- Default isolation is REPEATABLE READ (Postgres defaults to READ COMMITTED).
- String handling. MySQL's default collations compare strings case-insensitively; Postgres compares case-sensitively by default.
- Full-text search. MySQL has it built in; Postgres uses tsvector. Both work; shapes differ.
- Storage engines. MySQL has InnoDB (transactional) and others (MyISAM, MEMORY). Use InnoDB unless you have a reason. Postgres has one storage engine, and it just works.
- ON CONFLICT vs ON DUPLICATE KEY. Syntax differs. Upserts are spelled differently.
- JSON type. Both support JSON. Postgres JSONB is binary and indexable; MySQL's JSON is similar in 8.0+.
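The two upsert spellings, side by side. A sketch, assuming a counters(name, hits) table with name as the primary key:

```sql
-- PostgreSQL
INSERT INTO counters (name, hits) VALUES ('home', 1)
ON CONFLICT (name) DO UPDATE SET hits = counters.hits + 1;

-- MySQL
INSERT INTO counters (name, hits) VALUES ('home', 1)
ON DUPLICATE KEY UPDATE hits = hits + 1;
```

Same intent, different grammar: Postgres names the conflicting constraint's columns explicitly, MySQL triggers on any unique-key collision.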
PostgreSQL tends to be more standards-compliant and feature-rich; MySQL tends to be easier to operate at scale out of the box. Both are fine choices. Pick based on your team, your hosting, and what your stack expects.
How to Pick
A rough decision tree:
- Transactional, relational, flexible queries? PostgreSQL.
- Transactional, embedded, single-user? SQLite.
- Fast key lookups or ephemeral state? Redis, alongside your primary DB.
- Analytical scans over huge historical data? A column store like ClickHouse or BigQuery, fed from your primary DB.
- Deep relationship traversal as the core workload? A graph DB, possibly Neo4j.
- Semi-structured data with variable schemas? PostgreSQL with JSONB first. MongoDB if JSONB is genuinely not enough.
The meta-rule: most systems are PostgreSQL + Redis. Adding a third database is a decision with real ongoing cost (operations, backups, monitoring, people who know it). Measure before adding.
Common Pitfalls
"We chose MongoDB because our data is JSON." JSONB exists. Look at it first.
"We need a graph DB because we have relationships." Relational means "has relationships". If you're doing 1-2 hops, Postgres is fine.
"We'll use Redis as our primary store." Redis is memory. Machines restart. Data expires. Don't make it the source of truth unless you understand AOF and RDB and their failure modes.
Assuming features match across engines. Window functions, CTEs, array types, JSON functions, full-text search: all spelled differently, with different capabilities. Porting SQL across engines always takes longer than you planned.
Next Steps
Continue to 12-best-practices.md for the habits that keep databases healthy in production.