API Design Tradeoffs

REST vs gRPC vs GraphQL vs WebSockets vs SSE vs HTTP Long Polling vs WebHooks

Aspect	REST	gRPC	GraphQL	WebSockets	SSE	HTTP Long Polling	WebHooks
Protocol	HTTP/1.1 or HTTP/2; JSON	HTTP/2; Protocol Buffers	HTTP/1.1 or HTTP/2; JSON query	HTTP/1.1 Upgrade handshake, then WebSocket framing over TCP	HTTP/1.1 chunked stream	HTTP/1.1 repeated requests	HTTP/1.1 POST callbacks
Direction	Request-response (client asks)	Request-response (client asks)	Request-response (client asks)	Bi-directional (client ↔ server)	One-way push (server → client)	Polling (client repeatedly asks)	One-way push (server → client)
Payload Size	Large (JSON verbose)	Small (binary protobuf)	Varies (only requested fields)	Small (binary framing)	Medium (text events)	Large (repeated requests + headers)	Medium (JSON)
Latency	Moderate (text parsing)	Low (binary, multiplexing)	Moderate (parsing, resolving)	Very low (~ms, bidirectional)	Low (~ms, server→client)	Moderate to high (one request cycle per update; seconds if polls are spaced out)	N/A (async, eventual)
Bandwidth	High (verbose JSON)	Low (compact binary)	Medium (flexible selection)	Low (binary, efficient)	Medium (text, header overhead)	Very High (repeated requests, headers)	Medium
Connection Model	Stateless (request/response)	Stateless (request/response)	Stateless (request/response)	Persistent, stateful	Persistent, stateful	Stateless (repeated connections)	No connection (event-driven)
Learning Curve	Easy (HTTP verbs, JSON)	Moderate (protobuf, proto files)	Moderate-Hard (query language)	Moderate (WebSocket API, backpressure)	Easy (EventSource API)	Very Easy (just loop + sleep)	Easy (HTTP POST)
Caching	Easy (HTTP caching, ETags)	Difficult (POST-based, binary)	Difficult (queries vary)	Difficult (stateful)	Difficult (streaming)	Difficult (cache busting needed)	N/A (events)
Browser Support	Native	Requires gRPC-Web proxy	Native (via HTTP)	Native (most modern)	Native (most modern)	Native (oldest browsers)	N/A (server-side)
Flexibility	Fixed endpoints (over/under-fetch)	Fixed schema (efficient)	Highly flexible (client specifies)	Flexible (custom messages)	Fixed event types	Fixed endpoints	Event-based (no control)
Streaming	Not native (chunked transfer)	Bi-directional streaming	Subscriptions (separate WS/SSE)	Full bi-directional	One-way server→client	No streaming	N/A
Scalability	Good (stateless, scales horizontally)	Good (stateless, scales well)	Good (stateless, query-dependent)	Complex (stateful, sticky sessions)	Excellent (HTTP-friendly, no sticky sessions)	Poor (too many requests, server load)	Excellent (fire-and-forget)
Error Handling	HTTP status codes (200-5xx)	gRPC codes (granular)	Always 200; errors in body	Custom in message framing	No error feedback (one-way)	HTTP status codes	Implicit (no ack)
Use When	CRUD, public APIs, general web	Microservices, internal, high-throughput	Mobile clients, flexible schemas, multiple shapes	Chat, gaming, collaborative editing, real-time trading	Live dashboards, tickers, notifications, progress	Legacy browsers, fallback mechanism, simple updates	Webhooks, async events, integrations
Drawbacks	Verbose; over/under-fetch; extra round trips for nested data	Complex setup; not browser-native; proto versioning	N+1 queries; expensive queries; caching hard	Stateful; sticky sessions; connection mgmt; proxy issues	One-way only; message size limits; no request-response	High latency & bandwidth; wasted requests; server overload	No guaranteed delivery; unidirectional; ordering issues
Examples	Stripe, GitHub REST, AWS	Kubernetes, Google Cloud internal, Etsy	GitHub GraphQL, Shopify, Slack	Slack, Discord, Figma, Google Docs	Stock tickers, live scores, monitoring dashboards	Old Gmail, older Slack, polling fallback	GitHub, Stripe, Twilio webhooks
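
The WebHooks column lists "no guaranteed delivery" and "ordering issues" as drawbacks. Providers commonly retry failed deliveries, so a receiver may see the same event more than once. The sketch below shows one way a receiver might deduplicate by event id. The class name `WebhookReceiver` and the event shape (`id`, `type`) are assumptions for illustration, not any provider's actual format.

```python
class WebhookReceiver:
    """Processes each webhook event at most once, keyed by event id."""

    def __init__(self):
        self.seen_ids = set()   # a real receiver would use a durable store
        self.processed = []     # stands in for real side effects

    def handle(self, event: dict) -> int:
        """Return an HTTP-style status code for one delivery."""
        event_id = event.get("id")
        if event_id is None:
            return 400          # malformed delivery: nothing to deduplicate on
        if event_id in self.seen_ids:
            return 200          # duplicate retry: acknowledge, do nothing
        self.seen_ids.add(event_id)
        self.processed.append(event.get("type"))
        return 200


receiver = WebhookReceiver()
first = receiver.handle({"id": "evt_1", "type": "payment.succeeded"})
retry = receiver.handle({"id": "evt_1", "type": "payment.succeeded"})
```

Acknowledging the duplicate with a success code stops the sender from retrying again while keeping the side effect single.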

Quick Decision Guide:

REST: Default for public/web APIs, CRUD-heavy, browser support, simple operations
gRPC: Internal service-to-service, need low latency/bandwidth, polyglot microservices
GraphQL: Mobile clients, flexible queries, complex nested data, multiple client types
WebSockets: Interactive, bidirectional real-time (chat, gaming, collab editing, trading)
SSE: Server push only, simple one-way updates (dashboards, notifications, tickers)
HTTP Long Polling: Legacy browser support, simple fallback, tolerate high latency/bandwidth
WebHooks: Async event notifications, third-party integrations, fire-and-forget

REST (Representational State Transfer)

Pros:

Simple, intuitive HTTP verbs (GET, POST, PUT, DELETE)
Excellent browser support and debugging tools
Native HTTP caching with ETags, cache headers
Stateless; highly scalable
Mature ecosystem and widespread adoption
Easy API versioning (v1, v2 in URL)

Cons:

Verbose JSON payloads; high bandwidth usage
Over-fetching (unnecessary fields returned)
Under-fetching (need multiple requests)
Hard to evolve without breaking clients
No fine-grained field selection
Not ideal for complex, nested data relationships

When to Use:

Public APIs for third-party developers (Stripe, AWS, GitHub REST API)
Simple CRUD operations on resources
High cache-hit scenarios (product catalogs, static content)
Team expertise in REST; no special infrastructure
Browser-based clients or mobile clients that benefit from HTTP standards

Example:


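GET /api/users/123        → { id, name, email, createdAt, ... }  # Over-fetch: every field is returned even if only the name is needed
GET /api/users/123/posts  → [{ id, title, body, ... }]           # Under-fetch: a second round trip is needed for the user's posts

# With REST you get the fields each endpoint defines, or you call several endpoints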
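
The "native HTTP caching with ETags" advantage listed above can be sketched as a conditional GET. This is a simplified illustration: the function names are hypothetical, and real servers also handle weak validators and comma-separated `If-None-Match` lists, which are omitted here.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """ETag derived from the response body (quoted, as HTTP requires)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(request_headers: dict, body: bytes):
    """Return (status, headers, body) for a GET that honours If-None-Match."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # Client copy is still current: send no body at all
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Cache-Control": "max-age=60"}, body


body = b'{"id": 123, "name": "Ada"}'
status, headers, payload = conditional_get({}, body)
status2, _, payload2 = conditional_get({"If-None-Match": headers["ETag"]}, body)
```

The second request costs only headers, which is where the bandwidth saving for high cache-hit workloads comes from.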

gRPC (Google Remote Procedure Call)

Pros:

Binary Protocol Buffers: compact, fast serialization
HTTP/2 multiplexing: multiple requests over one connection
Bi-directional streaming (client→server, server→client)
Strong typing via proto definitions
Low latency and bandwidth (binary payloads are typically smaller than equivalent JSON; the saving depends on the data)
Service generation: auto-generate client/server stubs
Client-side load balancing and pluggable name resolution for service discovery

Cons:

Not browser-native (requires gRPC-Web proxy)
Steeper learning curve (Protocol Buffers, proto versioning)
Binary payloads not human-readable; harder to debug
HTTP/2 required (older infrastructure might struggle)
Harder to cache than REST
Overkill for simple, infrequent APIs
Requires dedicated tooling and code generation

When to Use:

Internal service-to-service communication (microservices)
High-throughput, latency-sensitive systems (real-time, finance)
Mobile apps needing bandwidth efficiency (2G/3G connections)
Streaming requirements (file uploads, real-time updates)
Organizations with polyglot microservices (language-agnostic)

Example:


// Proto definition
service UserService {
  rpc GetUser(UserId) returns (User);
  rpc StreamPosts(UserId) returns (stream Post);  // Server streams posts
  rpc UploadProfilePic(stream ImageChunk) returns (ProfileUrl);  // Client streams chunks
}

// Result: Type-safe, compact, multiplexed over HTTP/2
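
To make the "compact" claim concrete, the sketch below hand-encodes a single integer field the way the Protocol Buffers wire format does (a tag byte followed by a base-128 varint) and compares it with the JSON equivalent. It assumes a hypothetical `UserId` message with one unsigned integer field numbered 1; the proto above does not show its message definitions. Real code would use generated classes rather than manual encoding.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_uint_field(field_number: int, value: int) -> bytes:
    """Encode one varint field: tag (field number, wire type 0), then value."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)


binary = encode_uint_field(1, 123)            # assumed field: id = 1
as_json = json.dumps({"id": 123}).encode()    # b'{"id": 123}'
```

Here the binary form is 2 bytes against 11 for JSON, because field names are replaced by numeric tags. The ratio on realistic messages varies with how much of the payload is strings.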

GraphQL

Pros:

Client specifies exactly what fields needed; no over/under-fetch
Single endpoint; no API versioning headaches
Strong typing via schema; excellent IDE support
Nested queries in single request (relationships)
Self-documenting via introspection; built-in schema exploration
Great for mobile clients with bandwidth constraints
Easier API evolution (new fields without breaking old clients)

Cons:

Resolver complexity (N+1 query problem if not careful)
Query cost hard to predict (expensive queries possible)
Caching is non-trivial (GET via query string vs POST)
Large query payloads possible (more parsing overhead)
Learning curve (schema design, resolvers, federation)
Overkill for simple CRUD APIs
Requires monitoring query depth/complexity to prevent abuse
Subscription support needs separate WebSocket infrastructure

When to Use:

Mobile/web clients needing flexible field selection
Multiple clients with different data shape requirements (web, mobile, TV)
Complex, highly-related data (social graphs, e-commerce product hierarchies)
API that evolves frequently without breaking clients
Reduce bandwidth for mobile apps

Example:


# Client requests only needed fields
query {
  user(id: 123) {
    id
    name
    posts {
      id
      title
      comments {
        text
      }
    }
  }
}

# Fetches nested data in single request; only returns what's asked for
# No over-fetching unnecessary fields
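
The N+1 problem listed under Cons shows up when a nested field such as `comments` is resolved once per parent `post`. The sketch below contrasts a per-post resolver with a batched one, using an in-memory list in place of a database. All names and rows are hypothetical; libraries in the DataLoader style apply the same batching idea.

```python
from collections import defaultdict

QUERY_LOG = []  # records every simulated database round trip

COMMENTS = [  # hypothetical rows: (post_id, text)
    (1, "first"), (1, "nice"), (2, "thanks"),
]


def comments_for_post(post_id):
    """Naive resolver: one query per post (the N in N+1)."""
    QUERY_LOG.append(f"SELECT ... WHERE post_id = {post_id}")
    return [text for pid, text in COMMENTS if pid == post_id]


def comments_for_posts(post_ids):
    """Batched resolver: one query for all posts, grouped in memory."""
    QUERY_LOG.append(f"SELECT ... WHERE post_id IN {tuple(post_ids)}")
    grouped = defaultdict(list)
    for pid, text in COMMENTS:
        if pid in post_ids:
            grouped[pid].append(text)
    return {pid: grouped[pid] for pid in post_ids}


post_ids = [1, 2, 3]

QUERY_LOG.clear()
naive = {pid: comments_for_post(pid) for pid in post_ids}
naive_queries = len(QUERY_LOG)      # one query per post

QUERY_LOG.clear()
batched = comments_for_posts(post_ids)
batched_queries = len(QUERY_LOG)    # a single query for the whole list
```

Both paths return the same data; only the number of round trips differs, and that gap grows with the number of parent rows.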

WebSockets vs Server-Sent Events (SSE)

Aspect	WebSockets	SSE (Server-Sent Events)	When to Use
Direction	Bi-directional (client ↔ server)	One-way (server → client)	WebSockets: interactive; SSE: notifications
Connection Type	Full-duplex persistent TCP (HTTP upgrade)	One-way persistent HTTP	WebSockets: chat/gaming; SSE: streaming updates
Latency	Very low (~ms)	Low (~ms, but one-way)	Both excellent for real-time
Protocol	Custom binary framing after HTTP upgrade	Plain HTTP with chunked transfer	WebSockets for low-latency; SSE for simplicity
Browser Support	Native (modern browsers)	Native (most modern browsers)	Both have good support
Fallback	Requires custom polyfill (long-polling)	Auto-reconnect, built-in retry	SSE has better fallback semantics
Bandwidth	Low (small frame headers; text or binary payloads)	Medium (text-only events; HTTP headers sent once per connection)	WebSockets more efficient
Scalability	More connections; stateful session	Fewer resources; HTTP-friendly	SSE scales better with many clients
Proxy/LB Compat	Needs sticky sessions, WS-aware proxies	Works with standard HTTP load balancers	SSE better for cloud/CDN deployment
Use When	Chat, collaborative editing, multiplayer games, real-time trading	Live dashboards, notifications, live feeds, progress tracking
Drawbacks	Stateful; complex backpressure; sticky sessions; old proxies drop connections	One-way only (client can't stream to server); message size limits on some servers; no native request-response
Examples	Slack, Discord, Google Docs collab, Twitch chat	Stock price tickers, live sports scores, GitHub live feeds, Sentry error notifications

WebSockets Deep Dive

Pros:

True bi-directional communication (client ↔ server, simultaneously)
Very low latency; minimal overhead after handshake
Binary framing; efficient protocol
Ideal for interactive, high-frequency updates (chat, gaming, collaborative editing)
Single persistent connection; reduces connection overhead vs long-polling
Built-in ping/pong keepalive

Cons:

Stateful connections; harder to scale (sticky sessions, in-memory state)
Requires WS-aware load balancers/proxies; older infrastructure may drop connections
Complex backpressure handling; no built-in flow control
Manual reconnect logic and state sync on disconnect
Memory overhead per connection (not suitable for millions of idle connections)
Harder to debug (binary protocol, custom framing)
Usually needs its own endpoint and proxy configuration (Upgrade headers, idle timeouts), although it shares ports 80/443 with HTTP

When to Use:

Real-time collaborative applications (Google Docs, Figma, Miro)
Chat and messaging systems (Slack, Discord, WhatsApp Web)
Multiplayer browser games (real-time rather than turn-based)
Live trading/financial platforms (stock prices, forex)
Real-time notifications requiring bidirectional interaction
High-frequency, low-latency requirements

Example:


// Client
const ws = new WebSocket('wss://api.example.com/ws');

// Client→Server: send only once the connection is open,
// otherwise send() throws because the socket is still connecting
ws.onopen = () => {
  ws.send(JSON.stringify({ action: 'move', x: 100, y: 200 }));
};

// Server→Client: other players' movements, game state, etc.
ws.onmessage = (event) => {
  const message = JSON.parse(event.data);
  updateGameState(message); // application-defined handler
};
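
The "HTTP upgrade" mentioned above is a one-time handshake: the client sends a random `Sec-WebSocket-Key` header and the server must answer with a matching `Sec-WebSocket-Accept`. The sketch below computes that value as RFC 6455 specifies. The function name is illustrative, and the sample key is the one used in the RFC's own example.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Value the server returns in the Sec-WebSocket-Accept header."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


accept = websocket_accept("dGhlIHNhbXBsZSBub25jZQ==")
# accept == "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

After this exchange the connection stops speaking HTTP, which is why intermediaries that do not understand the upgrade can drop it.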

Server-Sent Events (SSE) Deep Dive

Pros:

Simpler than WebSockets; uses standard HTTP
Built-in reconnect: EventSource retries automatically after a delay the server can set with the retry field
Works with standard HTTP load balancers; no sticky sessions needed
Lower memory overhead per connection (HTTP semantics)
Works through CDNs and proxies seamlessly
Text-based; easy to debug (plain HTTP stream)
Event IDs and retry semantics built-in
Perfect for unidirectional server→client streaming

Cons:

One-way only; client cannot stream to server (need separate channel)
Text-based payload; less efficient than binary WebSocket framing
Per-event text framing overhead (field names on every event); over HTTP/1.1, browsers also cap concurrent connections per domain (about six)
Limited message size on some servers
Older browsers need polyfill
Per-connection memory still grows with concurrent clients (but less than WS)
Not ideal for request-response patterns

When to Use:

Live dashboards and monitoring (real-time metrics, system status)
Notifications (deployment status, payment events relayed from the server to the browser)
Live feeds (Twitter live tweets, news tickers)
Progress tracking (video transcoding, long-running jobs)
Server → browser notifications (system alerts, real-time updates)
Scenarios where client rarely needs to send data

Example:


// Client
const eventSource = new EventSource('/api/live/scores');
eventSource.addEventListener('score-update', (e) => {
  const data = JSON.parse(e.data);
  console.log(`Goal! ${data.team}: ${data.score}`);
});

eventSource.addEventListener('game-over', (e) => {
  console.log('Final score:', e.data);
  eventSource.close();
});

// Server (Node.js with Express)
const express = require('express');
const app = express();

app.get('/api/live/scores', (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  
  // Send initial connection
  res.write('retry: 10000\n\n');
  
  // Stream events
  const interval = setInterval(() => {
    const score = getLatestScore();
    res.write(`id: ${score.id}\n`);
    res.write(`event: score-update\n`);
    res.write(`data: ${JSON.stringify(score)}\n\n`);
  }, 1000);
  
  req.on('close', () => clearInterval(interval));
});
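
The `res.write` calls above produce the SSE wire format by hand: optional `retry`, `id`, and `event` fields, one `data:` line per line of payload, and a blank line to end the event. A small helper can capture those rules. This is a sketch; `format_sse` and its parameter names are hypothetical.

```python
import json


def format_sse(data, event=None, event_id=None, retry_ms=None) -> str:
    """Serialize one event in text/event-stream format."""
    lines = []
    if retry_ms is not None:
        lines.append(f"retry: {retry_ms}")
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    payload = data if isinstance(data, str) else json.dumps(data)
    # A payload containing newlines must be split across several data: lines
    for part in payload.split("\n"):
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"  # blank line terminates the event


frame = format_sse({"team": "Home", "score": 2}, event="score-update", event_id=7)
multi = format_sse("line one\nline two")
```

Splitting multi-line payloads matters because a bare newline inside `data:` would otherwise end the field early.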

HTTP Long Polling Deep Dive

Pros:

Simple to implement; no special infrastructure
Works in old browsers (IE6+); no WebSocket support needed
Natural fallback for WebSocket failures
No sticky sessions needed; each poll is an ordinary HTTP request that any instance can answer
Works through proxies and firewalls without special config
Easy to debug (plain HTTP requests/responses)

Cons:

Higher latency than a persistent stream (every update costs a full request/response cycle; if the client pauses between polls, add 1-30 seconds)
Wasted bandwidth (repeated requests with headers even if no data)
Server overhead (many open connections; each polls)
Inefficient for high-frequency updates
Race conditions (requests in flight when new data arrives)
Terrible user experience for real-time needs

When to Use:

Fallback for WebSocket-incompatible environments
Legacy browser support (IE9, older Android)
When other real-time techniques unavailable
Low-frequency, non-critical updates (once per minute acceptable)
Simple integration with existing REST APIs

When NOT to Use:

Anything requiring < 1 second latency
High-frequency updates (stock prices, gaming)
Large concurrent user bases (bandwidth killer)
Real-time collaboration
Any modern application (just use WebSockets/SSE)

Example:


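// Client: long-poll loop. Re-issue the request as soon as the previous one returns.
// USER_ID and a numeric msg.id are assumptions about the application.
let lastId = 0;        // highest message id seen so far
let retryDelay = 1000; // delay before retrying after a failure, in ms

function longPoll() {
  fetch(`/api/messages?user_id=${USER_ID}&last_id=${lastId}`)
    .then(res => res.json())
    .then(data => {
      // Process messages
      data.messages.forEach(msg => {
        console.log('New message:', msg);
        displayMessage(msg);
        lastId = Math.max(lastId, msg.id);
      });

      retryDelay = 1000;
      longPoll(); // the server holds this request until data arrives or it times out
    })
    .catch(err => {
      console.error('Poll failed:', err);
      // Retry with exponential backoff, capped at 30 seconds
      setTimeout(longPoll, retryDelay);
      retryDelay = Math.min(retryDelay * 2, 30000);
    });
}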

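// Server: return immediately with data, or hold the request until data arrives
const express = require('express');
const app = express();

app.get('/api/messages', (req, res) => {
  const userId = req.query.user_id;
  const lastSeenId = Number(req.query.last_id) || 0;

  // Get unread messages (db is an application-provided store that also emits events)
  const messages = db.getMessages(userId, lastSeenId);
  if (messages.length > 0) {
    return res.json({ messages }); // data available; return immediately
  }

  const eventName = `user:${userId}:new-message`;
  const cleanup = () => {
    clearTimeout(timeout);
    db.off(eventName, onMessage);
  };
  const finish = (payload) => {
    cleanup(); // ensure neither the timer nor the listener fires after the response
    res.json({ messages: payload });
  };
  const onMessage = (msg) => finish([msg]);

  // No data; wait up to 30 seconds for new messages
  const timeout = setTimeout(() => finish([]), 30000);
  db.on(eventName, onMessage);

  // Client went away or response finished: release the timer and listener
  res.on('close', cleanup);
});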
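
The hold-until-data-or-timeout behaviour of the server above can be shown in isolation with a condition variable. This is a minimal in-process sketch, not a web server; `Mailbox` and its methods are hypothetical names.

```python
import threading


class Mailbox:
    """Holds messages and lets a request block until a newer one arrives."""

    def __init__(self):
        self._messages = []                 # list of (id, text)
        self._cond = threading.Condition()

    def publish(self, text: str) -> int:
        with self._cond:
            msg_id = len(self._messages) + 1
            self._messages.append((msg_id, text))
            self._cond.notify_all()         # wake every held request
            return msg_id

    def wait_for_messages(self, last_id: int, timeout: float):
        """Return messages newer than last_id, waiting up to `timeout` seconds."""
        def newer():
            return [m for m in self._messages if m[0] > last_id]

        with self._cond:
            self._cond.wait_for(lambda: bool(newer()), timeout=timeout)
            return newer()                  # empty list means the hold timed out


box = Mailbox()
empty = box.wait_for_messages(last_id=0, timeout=0.05)   # nothing yet: times out

threading.Timer(0.05, box.publish, args=("hello",)).start()
got = box.wait_for_messages(last_id=0, timeout=2.0)       # returns once published
```

The second call returns as soon as the message is published rather than after the full timeout, which is the property that separates long polling from polling on a fixed interval.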

Bandwidth comparison (1000 users, 5-second polling):


Long Polling:
- 1000 users × 720 requests/hour (3600 s ÷ 5 s) = 720K requests/hour
- × 500 bytes/request (headers + small response) = 360MB/hour
- Server holds open connection per user = 1000 connections

WebSocket/SSE:
- 1000 users × 1 connection = 1000 connections
- Only sends when data available = 10MB/hour (events only)
- Result: ~36x less bandwidth; 720K requests/hour replaced by 1000 long-lived connections
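
The polling side of this comparison is plain arithmetic and can be checked with a short script. At one poll every 5 seconds, each user makes 3600 / 5 = 720 requests per hour. The 10MB/hour streaming figure is taken from the comparison as given, not derived, and the function name is illustrative.

```python
def polling_cost(users: int, interval_s: int, bytes_per_request: int):
    """Requests and bytes per hour for clients polling on a fixed interval."""
    requests_per_user_hour = 3600 // interval_s
    total_requests = users * requests_per_user_hour
    total_bytes = total_requests * bytes_per_request
    return total_requests, total_bytes


requests, total_bytes = polling_cost(users=1000, interval_s=5, bytes_per_request=500)
polling_mb = total_bytes / 1_000_000   # decimal megabytes
streaming_mb = 10                      # assumed events-only traffic from the comparison
ratio = polling_mb / streaming_mb
```

Changing `interval_s` shows how quickly the cost falls as polls become less frequent, at the price of higher latency.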

Real-Time API Pattern Comparison Summary

Scenario	Best Choice	Why
Chat application	WebSocket	Bidirectional, low latency, always connected
Live stock ticker	SSE	Server→client only, simpler, load balancer friendly
Collaborative document editing	WebSocket	Bidirectional edits, conflict resolution, low latency
Live sports scores	SSE	One-way push, no client input, high client count
Multiplayer game	WebSocket	Bidirectional, precise timing, state sync
System monitoring dashboard	SSE	Metrics pushed from server, no client control needed
Real-time notifications	SSE or WebSocket	SSE if no response; WebSocket if acknowledge/interact
Video call (WebRTC)	WebSocket (signaling)	WebSocket for SDP/ICE exchange; RTC for media

Interview Questions & Answers

Q1: Design Slack. Which API patterns would you combine?

Answer: Layered approach:

REST for static operations

- User signup/login
- Workspace/channel creation
- Profile updates

WebSockets for real-time chat

- Bidirectional message exchange
- Typing indicators
- Online status updates
- Low-latency (sub-second) requirements

Webhooks for integrations

- GitHub integration (code notifications)
- Jira integration (ticket updates)
- External event notifications

Architecture:


User → REST login → Get auth token
User → WebSocket connect → Join chat room
Chat message → WebSocket broadcast → All users in room

External system → Webhook POST → Slack notification
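
The "WebSocket broadcast → all users in room" step can be sketched as an in-memory hub that tracks room membership and fans each message out. This is an illustration under simplifying assumptions: `ChatHub` is a hypothetical name, connections are plain ids, and appending to an outbox stands in for a real socket send.

```python
from collections import defaultdict


class ChatHub:
    """Tracks which connections are in which room and fans messages out."""

    def __init__(self):
        self.rooms = defaultdict(set)       # room name -> connection ids
        self.outbox = defaultdict(list)     # connection id -> delivered messages

    def join(self, room, conn_id):
        self.rooms[room].add(conn_id)

    def leave(self, room, conn_id):
        self.rooms[room].discard(conn_id)

    def broadcast(self, room, sender, text):
        """Deliver to every member of the room; returns the recipient count."""
        message = {"room": room, "from": sender, "text": text}
        for conn_id in self.rooms[room]:
            self.outbox[conn_id].append(message)   # real code would send on the socket
        return len(self.rooms[room])


hub = ChatHub()
hub.join("general", "alice")
hub.join("general", "bob")
hub.join("random", "carol")
delivered = hub.broadcast("general", "alice", "hi all")
```

Because this membership map lives in one process, running several chat servers needs a shared channel between them, which is the stateful-scaling cost noted for WebSockets earlier.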

Q2: Should you use REST, gRPC, or GraphQL for your microservices?

Answer: By use case:

REST: Public/external APIs, simple CRUD
gRPC: Internal service-to-service, high throughput
GraphQL: API Gateway, multiple client needs

Recommendation: Hybrid approach


Client (Web/Mobile) → REST Gateway → Converts to gRPC internally

Benefits:
- External clients use simple REST
- Internal services use efficient gRPC
- No protocol mismatch
- Best of both worlds

Example:


GET /api/users/123 (REST)
  ↓
  API Gateway converts to:
  ↓
GetUser(id=123) (gRPC to user-service)
  ↓
GetUserPosts(user_id=123) (gRPC to post-service)
  ↓
Combine responses → Return REST response
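
The gateway flow above might look like the following in code. The two functions `get_user` and `get_user_posts` are hypothetical stand-ins for generated gRPC client stubs and return canned data here so the sketch runs on its own.

```python
import json


# Hypothetical stand-ins for gRPC stubs of user-service and post-service.
def get_user(user_id):
    return {"id": user_id, "name": "Ada"}


def get_user_posts(user_id):
    return [{"id": 1, "title": "Hello"}, {"id": 2, "title": "World"}]


def handle_get_user(path: str):
    """Translate GET /api/users/<id> into two internal calls and merge them."""
    prefix = "/api/users/"
    if not path.startswith(prefix) or not path[len(prefix):].isdigit():
        return 404, json.dumps({"error": "not found"})
    user_id = int(path[len(prefix):])
    user = get_user(user_id)
    user["posts"] = get_user_posts(user_id)
    return 200, json.dumps(user)


status, body = handle_get_user("/api/users/123")
```

The two internal calls are independent, so a production gateway would likely issue them concurrently; they are sequential here for clarity.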

Q3: You need to push 100K events/second to web browsers. REST, SSE, or WebSockets?

Answer: SSE is optimal because:

Server-to-client only (browsers don't send events back)
Standard HTTP (works with CDN, load balancers)
Scales to 100K connections across instances without sticky sessions (connections are long-lived, but a client can reconnect to any instance)

Avoid WebSockets because:

Stateful = sticky sessions
100K WebSocket connections need complex infrastructure

Architecture:


Event stream (Kafka) → Event broadcaster → SSE server (multiple instances)
                                         ↓
                                      Browser 1
                                      Browser 2
                                      ...
                                      Browser 100K

Each SSE connection:
- Is one long-lived HTTP response (a single stream when served over HTTP/2)
- Needs no session state on the server beyond the open connection
- Load balancer can distribute freely
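
What lets a load balancer distribute SSE clients freely is that a reconnecting browser sends a `Last-Event-ID` header, so any instance with access to recent events can replay what was missed. The sketch below keeps a bounded in-memory log for that purpose. In the architecture above the log would live in shared infrastructure such as the Kafka stream; `EventLog` is a hypothetical name.

```python
from collections import deque


class EventLog:
    """Keeps the most recent events so a reconnecting client can catch up."""

    def __init__(self, capacity=1000):
        self._events = deque(maxlen=capacity)   # (event_id, data) pairs
        self._next_id = 1

    def append(self, data):
        event_id = self._next_id
        self._next_id += 1
        self._events.append((event_id, data))   # oldest entry is evicted when full
        return event_id

    def since(self, last_event_id):
        """Events with an id greater than last_event_id, oldest first."""
        return [e for e in self._events if e[0] > last_event_id]


log = EventLog(capacity=3)
for price in (101, 102, 103, 104):
    log.append(price)

missed = log.since(2)        # client reconnects with Last-Event-ID: 2
everything = log.since(0)    # event 1 has already been evicted
```

The capacity bounds how long a client may stay disconnected and still resume without gaps.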

Q4: GraphQL vs REST for mobile app. Which is better and why?

Answer: GraphQL wins for mobile because:

Bandwidth: Request only needed fields

# Instead of: GET /users/123 → 50KB (name, bio, profile pic, friends count, etc.)
# Mobile requests:
query { user(id: 123) { name, profile_pic } }   # → 5KB

Fewer requests: Fetch related data in one query

query { user(id: 123) { posts { id, title }, friends { name, avatar } } }
# One request, multiple data types

Network efficiency: Critical on 4G/LTE

Caveats:

Resolver complexity (N+1 problem)
Need proper query depth limits

Mitigation:

Set limits:

Max depth: 5 levels
Max fields per query: 100
Timeout: 30 seconds
Query cost analysis (expensive queries rejected)
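
A "max depth: 5 levels" limit can be approximated by counting how deeply selection sets nest. The sketch below does this by counting braces. It is deliberately naive: real GraphQL servers walk the parsed query, and braces inside string arguments or fragment definitions would be miscounted here. Function names are illustrative.

```python
def query_depth(query: str) -> int:
    """Maximum nesting of selection sets, counted by braces (naive)."""
    depth = max_depth = 0
    for ch in query:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced braces")
    if depth != 0:
        raise ValueError("unbalanced braces")
    return max_depth


def check_depth(query: str, max_allowed: int = 5) -> bool:
    """True if the query is within the allowed nesting depth."""
    return query_depth(query) <= max_allowed


shallow = "query { user(id: 123) { name } }"
deep = "query { a { b { c { d { e { f } } } } } }"
allowed = check_depth(shallow)   # depth 2, within the limit
rejected = check_depth(deep)     # depth 6, over the limit
```

Rejecting a query before execution is cheaper than relying on the timeout alone, since no resolver runs at all.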

Q5: How would you design real-time notifications for 10 million users?

Answer:

Architecture:

Notification source → Message queue (Kafka)
  ↓
Event processor → Millions of notifications/sec
  ↓
Delivery layer (choose by user preference):
- Push notification (mobile) → Firebase Cloud Messaging
- Email → Email service
- In-app WebSocket → Real-time updates
- SSE → Server-sent events

For 10M users:

- 10M users × 1% active = 100K concurrent connections
- SSE more scalable than WebSockets
- Use Redis Pub/Sub for fan-out

Implementation:

import json

# queue, user_prefs and redis are application-provided clients (not defined here)

# Notification triggered: enqueue and return immediately
def send_notification(user_id, message):
    queue.push("notifications", {"user_id": user_id, "msg": message})

# Worker processes notifications
def process_notifications():
    while True:
        notification = queue.pop("notifications")
        user_id = notification["user_id"]

        # Route by user preference
        method = user_prefs[user_id].method
        if method == "websocket":
            broadcast_via_websocket(user_id, notification)
        elif method == "sse":
            broadcast_via_sse(user_id, notification)
        elif method == "push":
            send_push_notification(user_id, notification)

# Broadcast via Redis Pub/Sub
def broadcast_via_sse(user_id, notification):
    # Every SSE server subscribed to this user's channel receives the message
    # and forwards it to that user's open browser connections
    redis.publish(f"user:{user_id}:notifications", json.dumps(notification))


Key points:

- Message queue decouples producers from consumers
- Redis Pub/Sub for fan-out (one publish reaches every subscribed server)
- SSE scales better than WebSockets for this volume
