System Design Tutorial

A practical tutorial on designing backend systems that survive scale and failure. Covers the building blocks (load balancers, caches, queues, databases), scaling patterns, consistency and consensus, reliability and observability, architecture patterns, and a worked design example.

Tutorial · Difficulty: Intermediate · 12 chapters · Updated Apr 19, 2026

About this tutorial

A practical tour of designing backend systems that survive real scale and real failure, from load balancers to consensus to a full worked example.

Who This Is For

  • Developers who've shipped services but never sat down with the distributed systems literature
  • Engineers preparing for system design interviews who want the real engineering behind the whiteboard answers
  • Anyone who has watched a production system melt and wants a mental toolkit for the next one

Contents

Fundamentals

  1. Introduction: Vocabulary (latency, throughput, availability), trade-offs, the shape of system design problems
  2. Building Blocks: Clients, DNS, CDN, load balancers, reverse proxies, API gateways

Core Concepts

  3. Scaling: Vertical vs horizontal, statelessness, sticky sessions, load balancing strategies
  4. Caching: Where to cache, invalidation, TTLs, cache stampedes, Redis patterns
  5. Databases: SQL vs NoSQL, replication, partitioning, sharding
  6. Queues and Events: Message queues, streams, pub/sub, event-driven patterns, backpressure

Advanced

  7. Consistency and Consensus: CAP, PACELC, quorum, Raft basics, eventual consistency
  8. Reliability: Failure modes, retries, timeouts, circuit breakers, idempotency, graceful degradation
  9. Observability: Logs, metrics, traces, SLOs, alerting that doesn't cry wolf

Ecosystem

  10. Architecture Patterns: Monolith, modular monolith, microservices, CQRS, event sourcing

Mastery

  11. Designing a System: Worked example (a URL shortener), back-of-envelope math, step-by-step
  12. Best Practices: Evaluation patterns, common traps, anti-patterns

How to Use This Tutorial

  1. Read sequentially for a complete learning path
  2. Sketch the diagrams. A whiteboard and a marker beat reading alone
  3. Tie each concept to a real system. Every pattern here runs in systems you use daily; name one when you meet a new term

Quick Reference

Back-of-Envelope Numbers

Every system design conversation uses these. Learn them.

L1 cache reference              1 ns
L2 cache reference              4 ns
Main memory reference           100 ns
SSD random read                 100 µs
Round trip in same datacenter   500 µs
Disk seek                       10 ms
Round trip intercontinental     150 ms

1 byte                          1 B
1 kilobyte                      10^3 B
1 megabyte                      10^6 B
1 gigabyte                      10^9 B
1 terabyte                      10^12 B

1 second                        10^9 nanoseconds
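
One way to put the latency figures above to work is to add them up along a request's hot path. The sketch below is only an illustration: the dictionary keys, the `request_latency_ms` helper, and the example request (three sequential in-datacenter calls plus two SSD reads) are hypothetical, and it assumes every step runs sequentially with no queueing or processing time.

```python
# Latencies from the table above, expressed in nanoseconds.
LATENCY_NS = {
    "l1_cache": 1,
    "l2_cache": 4,
    "main_memory": 100,
    "ssd_random_read": 100_000,
    "datacenter_round_trip": 500_000,
    "disk_seek": 10_000_000,
    "intercontinental_round_trip": 150_000_000,
}


def request_latency_ms(steps: dict) -> float:
    """Sum the latency of sequential steps, given {operation: count}.

    Assumes the steps run one after another; parallel calls would
    instead contribute only the slowest branch.
    """
    total_ns = sum(LATENCY_NS[op] * count for op, count in steps.items())
    return total_ns / 1_000_000  # nanoseconds -> milliseconds


# Hypothetical hot path: 3 service hops inside one datacenter, 2 SSD reads.
hot_path = {"datacenter_round_trip": 3, "ssd_random_read": 2}
print(f"{request_latency_ms(hot_path):.2f} ms")
```

Swapping one of those hops for an intercontinental round trip adds 150 ms on its own, which is why the slowest entries in the table tend to dominate a latency budget.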

Availability Cheat Sheet

Availability    Downtime per year   Downtime per month
99%             3.65 days           7.2 hours
99.9%           8.76 hours          43.2 minutes
99.99%          52.56 minutes       4.32 minutes
99.999%         5.26 minutes        26 seconds
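
The figures in this table follow from one formula: allowed downtime = (1 - availability) x period. A minimal sketch, assuming a 365-day year and a 30-day month (the assumptions that reproduce the numbers above); the function name `downtime_seconds` is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def downtime_seconds(availability_percent: float, period_days: float) -> float:
    """Seconds of downtime allowed over period_days at the given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_days * SECONDS_PER_DAY


# Assumption: 365-day year, 30-day month.
for target in (99.0, 99.9, 99.99, 99.999):
    per_year_hours = downtime_seconds(target, 365) / 3600
    per_month_minutes = downtime_seconds(target, 30) / 60
    print(f"{target}%: {per_year_hours:.2f} h/year, {per_month_minutes:.2f} min/month")
```

The same function works for any window, for example a 7-day on-call rotation or a 90-day SLO period.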

The Six Questions to Ask About Any Design

1. What's the read/write ratio?
2. What's the request rate at peak?
3. What's the data volume over a year?
4. What's the latency budget for the hot path?
5. What's the consistency requirement?
6. What happens when a component fails?

Answer these and most of the design draws itself.
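
The first three questions turn into numbers with a few lines of arithmetic. The sketch below is an illustration only: the workload (10 million writes per day, 100 reads per write, 500 bytes per record, peak traffic at twice the average) is invented for the example, and the `estimate` helper is hypothetical. Questions 4 to 6 need judgment rather than arithmetic and are not covered here.

```python
SECONDS_PER_DAY = 86_400


def estimate(writes_per_day, reads_per_write, bytes_per_record, peak_factor):
    """Back-of-envelope peak request rates and yearly storage growth.

    All inputs are assumptions supplied by the caller.
    """
    avg_write_qps = writes_per_day / SECONDS_PER_DAY
    avg_read_qps = avg_write_qps * reads_per_write
    return {
        "peak_write_qps": avg_write_qps * peak_factor,
        "peak_read_qps": avg_read_qps * peak_factor,
        # 1 gigabyte = 10^9 B, as in the Quick Reference.
        "storage_gb_per_year": writes_per_day * 365 * bytes_per_record / 10**9,
    }


# Hypothetical workload, chosen only to exercise the formulas.
result = estimate(
    writes_per_day=10_000_000,
    reads_per_write=100,
    bytes_per_record=500,
    peak_factor=2,
)
for name, value in result.items():
    print(f"{name}: {value:,.0f}")
```

With these made-up inputs the read path sees roughly a hundred times the write traffic, which already points toward caching and read replicas before any diagram is drawn.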

Learning Path Suggestions

Daily fluency (roughly 8 hours)

  1. Chapters 01 to 06 for the core building blocks
  2. Chapter 08 for reliability
  3. Chapter 11 for the worked example

Interview prep (roughly 10 hours)

  1. All 12 chapters
  2. Chapter 11 twice. Then design three more systems on your own
  3. Practice the six questions above on a system you already know

Going deeper (roughly 15+ hours)

  1. All chapters plus the additional resources below
  2. Read the papers referenced in chapters 05 and 07
  3. Build a small but real distributed thing (a key-value store, a pub/sub broker)

Additional Resources

A Note on Examples

Examples throughout use concrete systems (PostgreSQL, Redis, Kafka, NGINX, Envoy) rather than generic "component X". The patterns transfer across providers, but pointing at specific tools makes the trade-offs real.