System Design Interview Preparation Plan
If you’re here, you’re probably preparing for system design interviews.
When I was switching jobs from Google last year, I felt overwhelmed by the sheer amount of system design content scattered across the internet. There was advice everywhere, but no clear path to follow.
So I decided to create a preparation plan for myself—structured, practical, and focused on what actually matters in interviews. I’m sharing that plan here so others can use it too.
Go through the entire plan, commit to it, and use it as a steady guide throughout your preparation. Note: I have also shared resources at the end to point you toward quality content.
1. Core Concepts & Fundamentals
| Topic | Brief Information | Examples |
|---|---|---|
| Load Parameters | Metrics that define system scale and requirements | Requests per second (avg vs peak), Read/write ratios, Concurrent WebSocket users, Cache hit rates, Twitter: Follower distribution affects fan-out |
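| CAP Theorem | A distributed system can guarantee only 2 of 3: Consistency, Availability, Partition Tolerance. Since network partitions cannot be ruled out, the practical choice during a partition is between consistency and availability | CP Systems: Banking, booking systems · AP Systems: Social feeds, catalogs |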
| Consistency Models | How data stays synchronized across systems | Transactional: 2PC, 3PC for atomic operations · Sequential: Event queues with ordered processing |
| Eventual Consistency | System becomes consistent after some time | Most social media, content platforms |
| Fault Tolerance | System's ability to continue operating despite failures | Backup nodes for servers/databases |
| Data Replication | Copying data across multiple nodes | Single leader · Multi-leader · Leaderless (Dynamo-style) |
| Synchronous vs Asynchronous Replication | Trade-off between consistency and performance | Sync: Strong consistency, higher latency · Async: Eventual consistency, better performance |
| Data Partitioning | Splitting data across multiple nodes horizontally | Range-based: Good for range queries · Hash-based: Uniform distribution |
| Hot Partitions | Partitions receiving disproportionate traffic | Celebrity users, viral content |
| Consistent Hashing | Minimizes data movement when nodes change | Load balancers, distributed caches |
| Leader Election | Choosing coordinator in distributed systems | Raft, Paxos algorithms |
| Logical Clocks | Ordering events without synchronized time | Lamport Clock · Vector Clock |
| Quorum | Minimum votes needed for operation to proceed | W + R > N for strong consistency |
| Merkle Trees | Detect inconsistencies between replicas efficiently by comparing hashes | Replica synchronization (anti-entropy) in Cassandra and Dynamo-style stores |
| Gossip Protocol | Peer-to-peer information propagation | Cluster membership, service discovery |
| Bloom Filters | Probabilistic set membership testing | Cache lookups, reducing DB queries |
| Probabilistic Counting | Approximate counting with less memory | Unique visitors, cardinality estimation |
| Circuit Breaker | Prevent cascading failures | Microservices fault tolerance |
| GeoHashing | Encode geographic coordinates | Location-based services, proximity search |
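
To make the Consistent Hashing row above concrete, here is a minimal sketch of a hash ring with virtual nodes. The class name `HashRing`, the `vnodes` parameter, and the choice of MD5 as the ring hash are assumptions made for illustration, not a reference to any particular library.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (any stable hash would do)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Illustrative consistent-hash ring; each node owns several virtual points."""

    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._points.remove(point)
            del self._owner[point]

    def get_node(self, key: str) -> str:
        """Return the owner of the first ring point clockwise from the key."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

When a node is added, only the keys that now fall just before one of its ring points change owner, and they all move to the new node. With N nodes that is roughly 1/N of the keys, instead of nearly all of them as with `hash(key) % N`.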
2. Building Blocks & Components
Load Balancer
Purpose: Distribute requests across servers
| Aspect | Details |
|---|---|
| Algorithms | Round Robin / Weighted Round Robin · URL Hash · IP Hash · Least Connections |
| Types | Layer 4 (TCP/UDP level) · Layer 7 (HTTP/application level) · GSLB (Geographic distribution) · LLB (Within data center) |
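
As a rough illustration of two of the algorithms in the table above, the sketch below implements Round Robin and Least Connections. The class names and the `pick`/`acquire`/`release` interface are hypothetical; a real load balancer would also handle health checks and concurrency.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Send each request to the server with the fewest in-flight requests."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self) -> str:
        # Ties go to the server that was registered first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        # Call when the request finishes so the count stays accurate.
        self.active[server] -= 1
```

Round Robin assumes requests cost about the same. Least Connections adapts better when some requests are long-lived, such as WebSocket sessions.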
Rate Limiter
Purpose: Control request rates from clients
| Algorithm | Description |
|---|---|
| Token Bucket | Tokens added at fixed rate, consumed per request |
| Leaky Bucket | Requests queue up and are processed at a constant rate |
| Fixed Window | Count requests in fixed time windows |
| Sliding Window Log | Track timestamps of recent requests |
| Sliding Window Counter | Hybrid of fixed window and sliding log |
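
The Token Bucket row can be sketched in a few lines. This is a single-process illustration under assumed names (`TokenBucket`, `rate`, `capacity`, and an injectable `clock` to make it testable); a distributed limiter would typically keep this state in a shared store.

```python
import time


class TokenBucket:
    """Tokens refill at `rate` per second up to `capacity`; each request spends one."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start full so an initial burst is allowed
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill lazily based on elapsed time instead of running a timer.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Here `capacity` bounds the burst size while `rate` bounds the sustained throughput, which is the main reason to pick this algorithm over a fixed window.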
Databases
Purpose: Persistent data storage
| Type | Use Case | Examples |
|---|---|---|
| SQL | Relational, ACID, complex queries | PostgreSQL, MySQL |
| Document | Flexible schema | MongoDB |
| Key-Value | Simple fast lookups | DynamoDB, Redis |
| Wide-Column Store | Time-series data | Cassandra |
Cache
Purpose: In-memory fast data access
| Aspect | Options |
|---|---|
| Write Policies | Write-through (cache + DB simultaneously) · Write-back (cache first, DB later) · Write-around (DB only, cache on read) |
| Eviction | LRU (Least Recently Used) · LFU (Least Frequently Used) |
| Invalidation | Active expiration (daemon) · Passive expiration (on access) |
| Implementation | LRU: Doubly Linked List + HashMap |
| Examples | Redis, Memcached |
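
The "Doubly Linked List + HashMap" row refers to the usual way of building an LRU cache with O(1) reads, writes, and evictions. The sketch below is one way to write it. The names `LRUCache` and `_Node` are illustrative, and it is not thread-safe.

```python
class _Node:
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None


class LRUCache:
    """The dict gives O(1) lookup; the linked list keeps recency order."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._map = {}
        self._head = _Node()  # sentinel on the most-recently-used side
        self._tail = _Node()  # sentinel on the least-recently-used side
        self._head.next = self._tail
        self._tail.prev = self._head

    def _unlink(self, node):
        node.prev.next = node.next
        node.next.prev = node.prev

    def _push_front(self, node):
        node.next = self._head.next
        node.prev = self._head
        self._head.next.prev = node
        self._head.next = node

    def get(self, key, default=None):
        node = self._map.get(key)
        if node is None:
            return default
        self._unlink(node)      # a hit makes the entry most recently used
        self._push_front(node)
        return node.value

    def put(self, key, value):
        node = self._map.get(key)
        if node is not None:
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self._map) >= self.capacity:
            lru = self._tail.prev  # evict the least recently used entry
            self._unlink(lru)
            del self._map[lru.key]
        node = _Node(key, value)
        self._map[key] = node
        self._push_front(node)
```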
CDN
Purpose: Deliver static content from edge locations
| Type | Use Case |
|---|---|
| Push CDN | Static content |
| Pull CDN | Frequently changing content |
Consistency: Periodic polling (TTR) · Time to Live (TTL)
Message Queue
Purpose: Decouple services with async messaging
| Aspect | Options |
|---|---|
| Delivery | Push to consumers (less reliable) · Pull by consumers (reliable, scalable) |
| Ordering | Best-effort ordering · Strict ordering (FIFO) |
| Concurrency | Locking mechanisms · Sequential processing |
| Use Cases | Email/notifications · Data post-processing · Recommendation systems |
| Examples | Kafka, SQS, RabbitMQ, PubSub |
Other Essential Components
| Component | Purpose | Details |
|---|---|---|
| Sequencer / ID Generator | Generate unique IDs in distributed systems | Snowflake ID Structure, Twitter Snowflake, Apache Flake |
| Blob/File Store | Store unstructured data | S3, images, videos, files |
| Distributed Search | Full-text search across distributed data | Elasticsearch, Apache Solr |
| Distributed Task Scheduling | Schedule and execute tasks across nodes | Cron jobs at scale |
| Distributed Logging | Centralized log aggregation | ELK stack, debugging |
| Sharded Counter | High-throughput counting | Likes, views counters |
| Service Discovery | Find and communicate with services dynamically | ZooKeeper for coordination |
| Service Monitoring | Track system health and metrics | Time-series databases, Prometheus |
| Business Analytics | Process large datasets for insights | Data warehouses, OLAP |
| Batch Processing | Process large data in batches | Hadoop, MapReduce |
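
For the Sequencer / ID Generator row, a Snowflake-style ID packs a millisecond timestamp, a worker ID, and a per-millisecond sequence number into one 64-bit integer. The sketch below assumes the commonly described 41/10/12 bit split. The class name, the epoch constant, and the injectable clock are assumptions for illustration.

```python
import threading
import time

EPOCH_MS = 1_288_834_974_657  # assumed custom epoch; any fixed past instant works
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    """Illustrative generator of time-sortable IDs, unique per worker_id."""

    def __init__(self, worker_id: int, clock_ms=lambda: time.time_ns() // 1_000_000):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock_ms = clock_ms
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock_ms()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: spin until the next one.
                    while now <= self._last_ms:
                        now = self._clock_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - EPOCH_MS
            return (
                (timestamp << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )
```

Because the timestamp occupies the high bits, IDs from one worker sort by creation time, and no coordination between workers is needed as long as each has a distinct `worker_id`.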
3. System Design Patterns
| Pattern | Description | When to Use |
|---|---|---|
| Fan-out | One write generates many derived updates | Social media posts, notifications to followers |
| CQRS | Separate read and write data models | Different optimization needs for reads vs writes |
| Event Sourcing | Store events instead of current state | Audit trails, replay capability, banking |
| Saga Pattern | Distributed transactions as sequence of local transactions | Order processing across microservices |
| API Gateway | Single entry point for all client requests | Routing, auth, rate limiting |
| Circuit Breaker | Fail fast when dependencies unavailable | Prevent cascading failures |
| Async Messaging | Communicate via message queues | Decoupling, reliability, buffering |
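
The Circuit Breaker pattern in the table is easier to reason about as a small state machine: closed, open, and half-open. The following is a minimal sketch. The class names, the thresholds, and the injectable `clock` are hypothetical, and production libraries add metrics, per-error policies, and thread safety.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected without contacting the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: fail fast instead of waiting on a broken dependency.
                raise CircuitOpenError("dependency marked unavailable")
            # Timeout elapsed: half-open, so let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each outbound request, for example `breaker.call(fetch_profile, user_id)` with a hypothetical `fetch_profile`, and serve a fallback when `CircuitOpenError` is raised.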
4. Practice: System Design for Isolated Features
| Feature | Key Concepts |
|---|---|
| Real-time Leaderboard | Redis sorted sets, real-time updates, tie handling |
| Top Items Over Duration | Sliding window, count-min sketch, time-series aggregation |
| Likes Counter | Sharded counters, eventual consistency, periodic aggregation |
| Autocomplete | Trie data structure, prefix matching, caching, ranking |
| Search | Inverted index, ranking, sharding, relevance scoring |
| Live Streaming | Low-latency protocols, transcoding, CDN distribution |
| Video Uploading | Chunking, resumable uploads, processing pipeline |
| Video Streaming | Adaptive bitrate, HLS/DASH, CDN · Techniques: DNS redirection, Anycast, Client multiplexing |
| Large File Upload/Download | Chunking, resumable, multipart, progress tracking |
| Collaborative Editing | Operational transformation, CRDTs, conflict resolution |
| GeoMap/GeoHash | Spatial indexing, proximity search, QuadTree |
| File Sharing | Access control, permissions, sharing links |
| Messaging | WebSockets, message queues, delivery guarantees |
| Notifications | Push notifications, multi-channel delivery, preferences |
| URL Shortener | ID generation, Base62 encoding, redirect, analytics |
| Web Crawler | URL frontier, politeness, deduplication, distributed crawling |
| Rate Limiter | Token bucket, distributed limiting, per-user/per-IP |
| Proximity Service | Geohash, QuadTree, nearby search |
| Distributed Cache | Consistent hashing, replication, eviction |
| Ticketing System | Seat reservation, concurrency, queue, payment |
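
For the URL Shortener row, the Base62 step usually means turning a numeric ID into a short string over digits and letters. Here is a small sketch. The alphabet ordering and function names are assumptions, since any fixed 62-character alphabet works as long as encode and decode agree.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE = len(ALPHABET)
_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode_base62(n: int) -> str:
    """Convert a non-negative integer ID into a Base62 short code."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Convert a short code back to the integer ID (raises KeyError on bad characters)."""
    n = 0
    for ch in code:
        n = n * BASE + _INDEX[ch]
    return n
```

Seven characters cover 62^7 IDs, roughly 3.5 trillion, which is a handy figure to quote when sizing the key space in an interview.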
5. System Designs Practice
Products to Design
| Product | Key Focus Areas |
|---|---|
| Twitter/Instagram | News feed, follower/following, fan-out strategies, media storage |
| Social Network | Friends feed, relationship graph, privacy controls |
| Netflix | Video streaming, transcoding, CDN, recommendations |
| Messaging App | Real-time messaging, group chat, online status, message persistence |
| YouTube | Video upload, processing pipeline, streaming, search |
| Dropbox/Google Drive | File sync, chunking, deduplication, conflict resolution, versioning |
| Uber | Geolocation, driver-rider matching, dynamic pricing, real-time tracking |
| TikTok | Short video feed, recommendations, viral content distribution |
| Tinder/Bumble | Profile matching, geolocation, swipe mechanism, messaging |
| Stock Trading Platform | Order matching, low latency, strong consistency, market data |
| Investment Platform | Portfolio management, real-time prices, order execution |
| Google Meet | Video conferencing, WebRTC, screen sharing, recording |
| Google Calendar | Event scheduling, recurring events, notifications, timezone handling |
| Gmail | Email storage, search, spam filtering, attachments |
| Ad Platform | Real-time bidding, targeting, click/impression tracking, budget pacing |
6. Preparation Approach
- Study Core Concepts - Understand all fundamentals thoroughly.
- Learn Building Blocks - Know when and how to use each component.
- Master Patterns - Recognize common design patterns.
- Start the Job Hunt - Find jobs at quality tech companies on OpenShot while you continue with the next preparation steps.
- Practice Isolated Feature Designs - Design individual features of products before tackling whole products.
- Practice Designs - Start with simpler products and progress to complex ones.
- Mock Interviews - Practice explaining designs under time pressure.
7. Things to Remember
- Problem Understanding - Understand the problem statement thoroughly before jumping into the design. Ask questions to clarify requirements and constraints.
- Defining Scope/Features - Pin down the features and requirements the system must support. This narrows the problem and lets you focus on the aspects most relevant to the problem statement.
- Estimation - Estimate the scale of the system (traffic, storage, bandwidth) so that the design is sized for its requirements.
- Trade-offs - There are no right or wrong answers in system design. It's all about understanding the trade-offs, making informed decisions, and communicating those trade-offs to the interviewer. So while studying, it's important to ask yourself: "Is there any other way to solve this?" and "What are the trade-offs of this approach?"
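
For the Estimation step above, a back-of-the-envelope calculation might look like the sketch below. Every input (daily active users, writes per user, read/write ratio, payload size, peak factor, retention) is a hypothetical assumption you would state aloud in the interview, and the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400


def estimate_load(daily_active_users, writes_per_user_per_day, reads_per_write,
                  bytes_per_write, peak_factor=2.0, retention_days=365):
    """Return rough average and peak QPS plus storage for the stated assumptions."""
    writes_per_day = daily_active_users * writes_per_user_per_day
    avg_write_qps = writes_per_day / SECONDS_PER_DAY
    avg_read_qps = avg_write_qps * reads_per_write
    storage_bytes = writes_per_day * bytes_per_write * retention_days
    return {
        "avg_write_qps": avg_write_qps,
        "peak_write_qps": avg_write_qps * peak_factor,
        "avg_read_qps": avg_read_qps,
        "peak_read_qps": avg_read_qps * peak_factor,
        "storage_tb": storage_bytes / 1e12,
    }


# Hypothetical inputs: 10M DAU, 2 posts per user per day,
# 100 reads per post, 1 KB per post, kept for one year.
numbers = estimate_load(10_000_000, 2, 100, 1_000)
```

With these inputs the result is roughly 230 writes per second, about 23,000 reads per second, and 7.3 TB of new data per year, which already suggests a read-heavy design where caching matters more than write throughput.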
8. Key Resources
| Category | Resource |
|---|---|
| Book | Designing Data-Intensive Applications by Martin Kleppmann (Highly Recommended) |
| Websites | High Scalability · Hello Interview |
| GitHub | System Design Primer |
| YouTube | JordanHasNoLife · Hussein Nasser |
| Practice | Pramp — Mock interviews with peers |
| Job Search | OpenShot — Jobs at quality technology companies |
Preparation is only half the battle; finding a company that is actually hiring is the other. I built OpenShot to solve this. Every listing is updated daily to ensure you never waste time on a ghost job again.
When you're ready to put this prep into practice, start your search on OpenShot.
Good luck with your preparation!