How to Design a URL Shortener (e.g., bit.ly) – System Design Explained

Ramesh Choudhary
Feb 14

1. Introduction

URL shorteners like Bitly and TinyURL transform long, unwieldy URLs into concise links, optimizing sharing on platforms with character limits (e.g., Twitter) and enabling click tracking. These systems are foundational to modern web infrastructure, balancing simplicity with scalability.

Key Features:

Short URL generation
Redirect to original URLs
Custom aliases (e.g., bit.ly/myslug)
Analytics (clicks, geographic data)

Use Cases:

Social media marketing (tracking campaign performance).
IoT devices with limited display capabilities.
Enterprise internal link management.

2. System Requirements and Goals

Functional Requirements:

Generate short URLs from long URLs.
Redirect short URLs to original URLs with HTTP 301 (permanent, cacheable by browsers) or 302 (temporary, so every click reaches the server and can be tracked).
Support custom slugs (user-defined short paths).
Track click metrics (timestamp, device, location).

Non-Functional Requirements:

Availability: 99.99% uptime.
Latency: <100ms for redirection.
Scalability: Handle 100M daily requests, store 1B URLs.
Durability: Zero data loss.

Constraints:

Assume 10:1 read:write ratio (redirects dominate).
Short URLs expire after 1 year by default.
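
The requirements above can be turned into rough capacity numbers. The following is a back-of-the-envelope sketch, not a sizing recommendation: the request volume, ratio, and URL count come from the lists above, while the 500-byte average record size is an assumption made only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

daily_requests = 100_000_000   # from the scalability requirement
read_write_ratio = 10          # 10 reads per write, from the constraints
total_urls = 1_000_000_000     # stored URLs, from the scalability requirement
avg_record_bytes = 500         # ASSUMPTION: average size of one stored mapping

avg_rps = daily_requests / SECONDS_PER_DAY
writes_per_day = daily_requests / (read_write_ratio + 1)
reads_per_day = daily_requests - writes_per_day
storage_gb = total_urls * avg_record_bytes / 1e9

print(f"average load: {avg_rps:,.0f} requests/s")
print(f"writes/day:   {writes_per_day:,.0f}")
print(f"reads/day:    {reads_per_day:,.0f}")
print(f"storage:      {storage_gb:,.0f} GB")
```

Average load comes out at roughly 1,200 requests per second. Peak traffic is usually several times the average, so the average is a floor rather than a target.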

3. High-Level Design Overview

Core Components:

API Gateway: Routes requests (shorten, redirect, analytics).
URL Shortening Service: Generates short codes via hashing or counters.
Database: Stores URL mappings (short ↔ long).
Cache (Redis): Accelerates frequent redirects.
Load Balancer: Distributes traffic across servers.

Communication Flow:

Shortening: Client → API Gateway → URL Service → DB → Return short URL.
Redirection: Client → API Gateway → Cache → DB (if cache miss) → Redirect.
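
The redirection flow above can be sketched as a single lookup function. This is a minimal illustration in which plain dictionaries stand in for Redis and the database. The names `cache`, `database`, and `resolve`, and the choice of 404 and 410 for missing and expired links, are assumptions and not part of the original design.

```python
import time

# Hypothetical in-memory stand-ins for Redis and the database.
cache = {}
database = {
    "abc1234": {"long_url": "https://example.com/a/very/long/path", "expires_at": None},
}

def resolve(short_code, now=None):
    """Return (status, location) for a redirect request."""
    now = time.time() if now is None else now
    record = cache.get(short_code)
    if record is None:                  # cache miss: fall back to the database
        record = database.get(short_code)
        if record is None:
            return 404, None            # unknown short code
        cache[short_code] = record      # populate the cache for later requests
    expires_at = record["expires_at"]
    if expires_at is not None and expires_at <= now:
        return 410, None                # link has expired
    return 302, record["long_url"]

print(resolve("abc1234"))
print(resolve("missing"))
```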

4. URL Shortening Process

Approach 1: Base62 Encoding

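Convert a unique integer (e.g., an auto-incrementing database ID) to a short string over the 62-character alphabet [a-zA-Z0-9]. Seven characters cover 62^7 ≈ 3.5 trillion IDs, well beyond the 1B-URL target.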
Code Example (Python):

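def base62_encode(num):
    """Encode a non-negative integer as a base62 string."""
    charset = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    if num == 0:
        return charset[0]  # the loop below would otherwise return ""
    short_url = []
    while num > 0:
        num, rem = divmod(num, 62)
        short_url.append(charset[rem])  # least-significant digit first
    return ''.join(reversed(short_url))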
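
Decoding is the inverse operation, and it is a convenient way to check how far 7 characters go. The sketch below repeats the encoder (with zero handled explicitly) so that it runs on its own. `base62_decode` and `CHARSET` are illustrative names and do not appear in the original.

```python
CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def base62_encode(num):
    """Encode a non-negative integer as a base62 string."""
    if num == 0:
        return CHARSET[0]
    digits = []
    while num > 0:
        num, rem = divmod(num, 62)
        digits.append(CHARSET[rem])
    return "".join(reversed(digits))

def base62_decode(code):
    """Decode a base62 string back to the integer it represents."""
    num = 0
    for ch in code:
        num = num * 62 + CHARSET.index(ch)
    return num

print(base62_encode(125))             # 21
print(base62_decode("21"))            # 125
print(62 ** 7)                        # 3521614606208 distinct 7-character codes
print(base62_encode(62 ** 7 - 1))     # zzzzzzz
```

IDs at or above 62^7 produce 8 or more characters, so the code length is a property of the ID range, not of the encoding itself.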

Approach 2: Hashing (SHA-256)

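Hash the long URL and take the first 7 characters of the digest. A hex digest prefix yields only 16^7 ≈ 268M distinct codes, so encoding the hash bytes in base62 is preferable when the full 7-character space is needed.
Collision Handling: If the code is already taken by a different URL, append a salt and rehash.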
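
A minimal sketch of the hash-and-retry idea, assuming a dictionary `store` stands in for the database and an attempt counter serves as the salt. The function names, the `|` separator, and the retry limit are assumptions for illustration only.

```python
import hashlib

def short_code_from_hash(long_url, attempt=0, length=7):
    """Return the first `length` hex characters of SHA-256(url + salt)."""
    salted = f"{long_url}|{attempt}".encode("utf-8")   # attempt number acts as the salt
    return hashlib.sha256(salted).hexdigest()[:length]

def shorten(long_url, store, max_attempts=5):
    """Find a free code for long_url, rehashing with a new salt on collision."""
    for attempt in range(max_attempts):
        code = short_code_from_hash(long_url, attempt)
        existing = store.get(code)
        if existing is None:
            store[code] = long_url
            return code
        if existing == long_url:
            return code            # the same URL was shortened before
        # otherwise: a different URL owns this code, so retry with the next salt
    raise RuntimeError("could not find a free short code")

store = {}
code = shorten("https://example.com/a", store)
print(code, len(code))
```

Because the code is derived from the URL, shortening the same URL twice returns the same code without a separate deduplication lookup.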

Trade-Offs:

Method	Pros	Cons
Counter	Predictable, no collisions	Requires distributed ID generation (e.g., Snowflake)
Hashing	No coordination needed	Collision risk; reducing it requires longer codes
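
To make the "distributed ID generation" entry concrete, here is a Snowflake-style generator sketched under stated assumptions: a 41-bit millisecond timestamp, a 10-bit worker ID, and a 12-bit per-millisecond sequence. The class name and the custom epoch are illustrative choices, and the clock handling is deliberately simplified.

```python
import threading
import time

class SnowflakeLikeGenerator:
    """IDs laid out as: timestamp (ms) | 10-bit worker ID | 12-bit sequence."""

    EPOCH_MS = 1_577_836_800_000   # ASSUMPTION: custom epoch, 2020-01-01 00:00:00 UTC

    def __init__(self, worker_id):
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            # Never go backwards, even if the wall clock does.
            now = max(int(time.time() * 1000), self.last_ms)
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF
                if self.sequence == 0:
                    now += 1       # sequence exhausted: move to the next millisecond slot
            else:
                self.sequence = 0
            self.last_ms = now
            return ((now - self.EPOCH_MS) << 22) | (self.worker_id << 12) | self.sequence

gen = SnowflakeLikeGenerator(worker_id=1)
print(gen.next_id())
```

IDs from different workers never collide as long as each worker has a distinct `worker_id`. Note that a 63-bit ID needs up to 11 base62 characters, so this scheme trades the 7-character target for coordination-free generation.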

5. Database Design

Schema:

Column	Type	Description
short_url	VARCHAR	Primary key (e.g., "a3dFg7x")
long_url	TEXT	Original URL
created_at	DATETIME	Creation timestamp
expires_at	DATETIME	Expiration date
clicks	BIGINT	Total redirects

Database Choice:

NoSQL (Cassandra): Scales writes horizontally, tunable consistency.
Relational (MySQL): ACID compliance for analytics.

Indexing:

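Primary index on short_url for fast point lookups (near O(1) with hash partitioning, O(log n) with a B-tree).
Secondary index on long_url (or on a hash of it, since long TEXT values are costly to index) for deduplication.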
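
The schema and both indexes can be tried out locally with SQLite, used here only as a convenient stand-in for MySQL or Cassandra. The `long_hash` column, the index name, and the `get_or_create` helper are assumptions added for this sketch; they illustrate deduplication through an indexed hash of the long URL.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE urls (
        short_url  TEXT PRIMARY KEY,
        long_url   TEXT NOT NULL,
        long_hash  TEXT NOT NULL,   -- ASSUMPTION: extra column holding SHA-256 of long_url
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        expires_at TEXT,
        clicks     INTEGER NOT NULL DEFAULT 0
    )
""")
# Unique index on the hash: fast duplicate checks without indexing the full URL text.
conn.execute("CREATE UNIQUE INDEX idx_urls_long_hash ON urls (long_hash)")

def get_or_create(long_url, new_code):
    """Return the existing short code for long_url, or store new_code for it."""
    long_hash = hashlib.sha256(long_url.encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT short_url FROM urls WHERE long_hash = ?", (long_hash,)
    ).fetchone()
    if row:
        return row[0]
    conn.execute(
        "INSERT INTO urls (short_url, long_url, long_hash) VALUES (?, ?, ?)",
        (new_code, long_url, long_hash),
    )
    return new_code

print(get_or_create("https://example.com/x", "aaaaaaa"))   # aaaaaaa
print(get_or_create("https://example.com/x", "bbbbbbb"))   # aaaaaaa (deduplicated)
```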

6. Scalability and Optimization

1M Requests/Second:

Horizontal Scaling: Deploy stateless URL services behind load balancers.
Caching: Cache the most frequently requested ~20% of short URLs in Redis, targeting a ~90% hit rate.

1B Requests/Second:

Sharding: Split database by short_url hash (e.g., 10K shards).
Distributed Queues: Use Kafka to batch writes and reduce DB load.
Geo-Replication: Deploy regional caches (e.g., Redis Cluster).
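
Sharding by a hash of short_url comes down to a stable mapping from code to shard number. The sketch below assumes simple modulo placement; `shard_for` is an illustrative name. Python's built-in `hash()` is avoided because string hashing is randomized per process by default, so it would not give the same shard on every server.

```python
import hashlib

def shard_for(short_code, num_shards=10_000):
    """Map a short code to a shard index that is stable across processes."""
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

print(shard_for("abc1234"))
print(shard_for("abc1234", num_shards=16))
```

With modulo placement, changing the number of shards remaps most keys. Consistent hashing is the usual way to limit that movement when shards are added or removed.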

7. Trade-Offs and Bottlenecks

Consistency vs. Availability (CAP):

Redirects: Prioritize availability (return stale data if DB is down).
Shortening: Prioritize consistency (ensure uniqueness).

Bottlenecks:

DB Writes: Mitigate with sharding and async writes.
Cache Failures: Use the cache-aside pattern, falling back to the database on a miss or cache outage, and set TTLs so stale entries expire.

8. Security and Fault Tolerance

Rate Limiting: Allow 100 requests/minute per IP (using token buckets).
URL Validation: Block phishing links with Google Safe Browsing API.
Data Replication: Multi-region DB replication and backups (e.g., Cassandra with a replication factor of 3).
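
The rate-limiting rule above (100 requests per minute per IP) maps directly onto a token bucket. This is a single-process sketch. A production limiter would typically keep bucket state in a shared store such as Redis. The class and function names and the injectable `now` parameter are assumptions made for illustration and testing.

```python
import time

class TokenBucket:
    """Holds up to `capacity` tokens and refills them at a steady rate."""

    def __init__(self, capacity=100, refill_per_second=100 / 60, now=None):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)          # start with a full bucket
        self.updated_at = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        # Refill for the elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1                   # spend one token for this request
            return True
        return False

buckets = {}   # one bucket per client IP

def allow_request(ip, now=None):
    bucket = buckets.get(ip)
    if bucket is None:
        bucket = buckets[ip] = TokenBucket(now=now)
    return bucket.allow(now)

results = [allow_request("203.0.113.7", now=0.0) for _ in range(101)]
print(results.count(True), results[-1])   # 100 False
```

A burst of 100 requests is accepted immediately, the 101st is rejected, and capacity returns at about 1.67 tokens per second.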

9. Analytics and Logging

Real-Time Pipeline:

Click Events: Log to Kafka.
Stream Processing: Aggregate metrics with Apache Flink.
Storage: Write to OLAP databases (e.g., ClickHouse).
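
The aggregation step is easiest to see on a small batch. The sketch below counts clicks per one-minute tumbling window, short code, and country in plain Python. It only illustrates the kind of computation a stream processor such as Flink would run continuously. The event fields (`ts`, `short_code`, `country`) and the function name are assumptions.

```python
from collections import Counter

def aggregate_clicks(events, window_seconds=60):
    """Count clicks per (window_start, short_code, country)."""
    counts = Counter()
    for event in events:
        # Align the timestamp to the start of its tumbling window.
        window_start = event["ts"] - event["ts"] % window_seconds
        counts[(window_start, event["short_code"], event["country"])] += 1
    return counts

events = [
    {"ts": 1000, "short_code": "abc1234", "country": "IN"},
    {"ts": 1010, "short_code": "abc1234", "country": "IN"},
    {"ts": 1075, "short_code": "abc1234", "country": "US"},
]

for key, clicks in sorted(aggregate_clicks(events).items()):
    print(key, clicks)
# (960, 'abc1234', 'IN') 2
# (1020, 'abc1234', 'US') 1
```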

Metrics Tracked:

Click-through rate (CTR).
Geographic hotspots (using MaxMind DB).

10. Real-World Scenarios

Bitly’s Scaling Challenge:

Problem: Handling viral links (e.g., 10M redirects/hour).
Solution: Pre-warm caches with predicted hot links.

11. Sample Code

Base62 Encoder (Python):

def base62_encode(num):
    ...  # Same implementation as in Section 4 (Approach 1)

API Endpoint (Node.js):

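const express = require('express');
const app = express();
app.use(express.json()); // Parse JSON bodies so req.body.url is populated
// generateUniqueId, base62_encode (a JS port), and db are assumed to be defined elsewhere.
app.post('/shorten', async (req, res) => {
    const longUrl = req.body.url;
    if (typeof longUrl !== 'string' || !/^https?:\/\//i.test(longUrl)) {
        return res.status(400).json({ error: 'A valid http(s) URL is required' });
    }
    const id = await generateUniqueId(); // Distributed ID
    const shortCode = base62_encode(id);
    await db.insert({ shortCode, longUrl });
    res.json({ shortUrl: `https://short.xyz/${shortCode}` });
});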

12. Interview-Focused Insights

Common Questions:

Q: How would you handle duplicate long URLs?
A: Hash the long URL and check the database before creating a new entry.
Q: How do you ensure global uniqueness with distributed counters?
A: Use a distributed ID generator (e.g., Twitter Snowflake).

Tips:

Emphasize trade-offs (e.g., “We chose NoSQL for write scalability”).
Discuss real-world metrics (e.g., “Bitly’s 600M monthly links”).

13. Conclusion

Designing a URL shortener requires balancing scalability, latency, and durability. By leveraging distributed systems principles (sharding, caching, async processing), engineers can build a system capable of handling billions of requests. Trade-offs like eventual consistency for redirects and strong consistency for shortening ensure reliability without sacrificing performance.

Data-Driven Insights:

Cache Performance: Redis can achieve ~0.2ms read latency under favorable network conditions.
Sharding: Cassandra can handle on the order of 1M writes/sec across ~100 nodes, depending on hardware and workload.

Final Note: Mastery of these patterns not only prepares you for interviews but also equips you to architect systems that power the modern web.

“Simplicity is the ultimate sophistication.” – Leonardo da Vinci 🚀🔗
