Capacity Estimation in System Design
Capacity estimation ensures that a system can handle its expected load and perform efficiently. It involves calculating the resources needed for processing (traffic handling), storage, and network bandwidth.
Key Rules for System Design Estimation Calculations
Rounding Approximations
Simplify calculations by rounding to more manageable numbers.
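Example: Instead of calculating with 86,400 seconds in a day, round to 100,000 (10^5) seconds. The resulting per-second rates come out about 14% low, which is acceptable for a back-of-the-envelope estimate.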
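The cost of rounding can be checked directly. The following is a minimal sketch that compares an exact requests-per-second figure with one computed from a day rounded to 100,000 seconds; the helper name relative_error and the 1 million requests per day input are illustrative assumptions, not part of the original text.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
ROUNDED_SECONDS_PER_DAY = 100_000  # nearest power of ten (assumed rounding)


def relative_error(exact: float, approx: float) -> float:
    """Fractional error introduced by using approx in place of exact."""
    return abs(approx - exact) / exact


daily_requests = 1_000_000  # illustrative input
exact_rps = daily_requests / SECONDS_PER_DAY
rounded_rps = daily_requests / ROUNDED_SECONDS_PER_DAY

print(f"exact:   {exact_rps:.2f} req/s")
print(f"rounded: {rounded_rps:.2f} req/s")
print(f"error:   {relative_error(exact_rps, rounded_rps):.1%}")
```

The rounded figure understates the rate, so it is worth adding headroom when using it for provisioning.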
Powers of 2 and 10
Familiarize yourself with powers of 2 and 10 for quick estimations.
Example values for powers of 2: 2, 4, 8, 16, 32, 64, etc.
Example values for powers of 10:
10^1 = 10
10^2 = 100
10^3 = 1,000
10^6 = 1,000,000 (1 million)
10^9 = 1,000,000,000 (1 billion)
10^12 = 1,000,000,000,000 (1 trillion)
Metric System
Use metric system units for large numbers:
1 million = 10^6
1 billion = 10^9
1 trillion = 10^12
Storage Capacity
Understand common storage units:
1 KB = 10^3 bytes
1 MB = 10^6 bytes
1 GB = 10^9 bytes
1 TB = 10^12 bytes
1 PB = 10^15 bytes
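Keeping every intermediate result in raw bytes and converting to a named unit only for display helps avoid unit slips. Here is a small sketch of such a converter using the decimal units listed above; the function name format_bytes is a hypothetical helper introduced for illustration.

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB"]


def format_bytes(n: float) -> str:
    """Render a byte count using decimal (power-of-10) units."""
    value = float(n)
    for unit in UNITS:
        # Stop at the first unit where the value drops below 1,000,
        # or at the largest unit we know about.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000
    return f"{value:g} {UNITS[-1]}"  # unreachable; keeps type checkers happy


print(format_bytes(1_500))
print(format_bytes(448 * 10**9))
print(format_bytes(10**15))
```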
Key Metrics to Memorize
1 million requests per day ≈ 12 requests/second
1 million requests per day ≈ 700 requests/minute
1 million requests per day ≈ 42,000 requests/hour
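Conversions like these can also be recomputed from first principles rather than memorized. The sketch below assumes a hypothetical helper, rate, that re-expresses a request count observed over one time window as a rate over another.

```python
SECONDS_PER_MINUTE = 60
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 86_400


def rate(requests: float, window_s: int, per_s: int = 1) -> float:
    """Requests arriving over window_s seconds, expressed per per_s seconds."""
    return requests * per_s / window_s


MILLION = 1_000_000

# 1 million requests per day, expressed per second, per minute, and per hour.
print(round(rate(MILLION, SECONDS_PER_DAY)))
print(round(rate(MILLION, SECONDS_PER_DAY, SECONDS_PER_MINUTE)))
print(round(rate(MILLION, SECONDS_PER_DAY, SECONDS_PER_HOUR)))

# 1 million requests per hour and per minute, expressed per second.
print(round(rate(MILLION, SECONDS_PER_HOUR)))
print(round(rate(MILLION, SECONDS_PER_MINUTE)))
```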
Latency Numbers
Familiarize yourself with common latency benchmarks to make informed decisions during system design.
Note: A reference table of common latency numbers is easy to find with a web search.
Table for Powers of 2 and 10
Power of 2    Value    Power of 10    Value
2^1           2        10^1           10
2^2           4        10^2           100
2^3           8        10^3           1,000
2^4           16       10^6           1,000,000
2^5           32       10^9           1,000,000,000
2^6           64       10^12          1,000,000,000,000
Let’s go through the capacity estimation process step-by-step using a hypothetical Twitter-like application as an example.
1. Traffic Estimation
Monthly Active Users (MAU): The number of unique users who use the application in a month.
Daily Active Users (DAU): The number of unique users who use the application in a day.
Example:
MAU: 300 million
DAU: 100 million (assume 1/3 of MAUs are active daily)
2. Read and Write Requests
To estimate read and write requests, we need to make some assumptions about user behavior.
Average Tweets per Day per User:
Let’s assume each active user tweets 2 times per day.
Read Requests:
Each user reads 100 tweets per day (including their feed, replies, and notifications).
Write Requests:
Each tweet is a write request.
Additional write requests for likes, retweets, and replies. Assume 2 additional write requests per active user per day.
Calculations:
Daily Write Requests:
Daily Write Requests = DAU × (Average Tweets per User + Additional Writes per User)
Daily Write Requests = 100 million × (2 + 2) = 400 million
Daily Read Requests:
Daily Read Requests = DAU × Average Reads per User
Daily Read Requests = 100 million × 100 = 10 billion
Requests per Second (RPS):
There are 86,400 seconds in a day.
Write RPS = Daily Write Requests / 86,400
Write RPS = 400 million / 86,400 ≈ 4,630 writes per second
Read RPS = Daily Read Requests / 86,400
Read RPS = 10 billion / 86,400 ≈ 115,740 reads per second
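The request arithmetic above can be reproduced in a few lines. This is a minimal sketch under the stated assumptions (100 million DAU, 2 tweets, 2 additional writes, and 100 reads per active user per day); the variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400

dau = 100_000_000
tweets_per_user = 2
extra_writes_per_user = 2  # likes, retweets, replies (assumed per user per day, as in the formula)
reads_per_user = 100

daily_writes = dau * (tweets_per_user + extra_writes_per_user)
daily_reads = dau * reads_per_user

write_rps = daily_writes / SECONDS_PER_DAY
read_rps = daily_reads / SECONDS_PER_DAY

print(f"daily writes: {daily_writes:,}")
print(f"daily reads:  {daily_reads:,}")
print(f"write RPS:    {write_rps:,.0f}")
print(f"read RPS:     {read_rps:,.0f}")
print(f"read/write ratio: {daily_reads / daily_writes:.0f}:1")
```

The read-to-write ratio of 25:1 falls out of the same inputs and indicates a read-heavy workload.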
3. Storage Requirements
Assume the following for storage calculations:
Average size of a tweet: 280 bytes
Retention period: 1 year (365 days)
Additional storage for metadata (likes, retweets, etc.): 3 times the tweet size
Calculations:
Daily Storage for Tweets:
Daily Storage = Daily Write Requests × Average Tweet Size × 4 (the tweet itself plus 3 times its size in metadata)
Daily Storage = 400 million × 280 bytes × 4 = 448 billion bytes ≈ 448 GB
Annual Storage:
Annual Storage = Daily Storage × 365
Annual Storage = 448 GB × 365 ≈ 163 TB
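Because a misplaced factor of 1,000 is easy to introduce when mixing millions with KB, GB, and TB, it can help to compute storage in raw bytes and convert only for display. A minimal sketch under the assumptions above (400 million writes per day, 280 bytes per tweet, a 4x multiplier for tweet plus metadata, 365 days of retention); the constant names are illustrative.

```python
GB = 10**9
TB = 10**12

DAILY_WRITES = 400_000_000
TWEET_SIZE_BYTES = 280
SIZE_MULTIPLIER = 4  # tweet plus 3x its size in metadata
RETENTION_DAYS = 365

daily_storage_bytes = DAILY_WRITES * TWEET_SIZE_BYTES * SIZE_MULTIPLIER
annual_storage_bytes = daily_storage_bytes * RETENTION_DAYS

print(f"daily storage:  {daily_storage_bytes / GB:,.0f} GB")
print(f"annual storage: {annual_storage_bytes / TB:,.1f} TB")
```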
4. Bandwidth Requirements
Assume each read and write request has the following average sizes:
Write request: 1 KB (including metadata)
Read request: 10 KB (average size of tweets fetched in a read)
Calculations:
Daily Bandwidth for Writes:
Daily Write Bandwidth = Daily Write Requests × 1 KB
Daily Write Bandwidth = 400 million × 1 KB = 400 GB
Daily Bandwidth for Reads:
Daily Read Bandwidth = Daily Read Requests × 10 KB
Daily Read Bandwidth = 10 billion × 10 KB = 100 TB
Total Bandwidth per Day:
Total Daily Bandwidth = Daily Write Bandwidth + Daily Read Bandwidth
Total Daily Bandwidth = 400 GB + 100 TB ≈ 100.4 TB
Bandwidth per Second:
Bandwidth per Second = Total Daily Bandwidth / 86,400
Bandwidth per Second = 100.4 TB / 86,400 ≈ 1.16 GB/s
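The bandwidth figures can be verified the same way, working in bytes throughout. This sketch assumes the request counts and payload sizes stated above (400 million writes at 1 KB, 10 billion reads at 10 KB per day); the variable names are illustrative.

```python
KB = 10**3
GB = 10**9
TB = 10**12
SECONDS_PER_DAY = 86_400

daily_writes = 400_000_000
daily_reads = 10_000_000_000
write_size_bytes = 1 * KB
read_size_bytes = 10 * KB

daily_write_bytes = daily_writes * write_size_bytes
daily_read_bytes = daily_reads * read_size_bytes
total_daily_bytes = daily_write_bytes + daily_read_bytes

bytes_per_second = total_daily_bytes / SECONDS_PER_DAY

print(f"daily write bandwidth: {daily_write_bytes / GB:,.0f} GB")
print(f"daily read bandwidth:  {daily_read_bytes / TB:,.0f} TB")
print(f"total per day:         {total_daily_bytes / TB:,.1f} TB")
print(f"average throughput:    {bytes_per_second / GB:.2f} GB/s")
```

Network links are usually rated in bits per second, so the average throughput here corresponds to roughly 9.3 Gbit/s.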
Summary
For a Twitter-like application with 100 million daily active users:
Daily Write Requests: 400 million
Daily Read Requests: 10 billion
Write RPS: ~4,630 writes/second
Read RPS: ~115,740 reads/second
Annual Storage Requirement: ~163 TB (about 448 GB per day × 365 days)
Bandwidth Requirement: ~1.16 GB/s (about 100.4 TB per day)