Backend Development from First Principles → Advanced (Theory + Practical Code)

By Janmajay Kumar

25 min read
January 20, 2026

Introduction

Why this guide?

Most backend resources either stay too high-level (“use FastAPI”) or jump into code without a clean mental model. This post builds backend engineering from first principles — so a motivated non-computer-science learner can follow it, while still using correct CS language and real production patterns.

Who this is for: self-taught developers, scientists transitioning into software, and Python backend interview prep. Examples use FastAPI + Python, but the principles apply to any stack.

How to read: skim the headings once, then re-read with the code blocks and build a tiny demo API as you go.

The one mental model

Every backend topic here fits the request lifecycle: HTTP → routing → parsing → validation → auth → business logic → data layer → response, plus cross-cutting concerns (middleware, caching, security, observability) that wrap the whole pipeline.

What you'll be able to explain after this: every stage of that lifecycle, plus REST design, pagination, validation, auth, middleware/CORS, caching, scaling and concurrency, the data layer, background jobs, testing/CI, security, and data-intensive architectures.

0) First principles: what a backend is

A backend exists to do four fundamental jobs:

  1. Expose capabilities via a stable interface (usually HTTP APIs).
  2. Enforce correctness (validation + business rules).
  3. Control access (authentication + authorization).
  4. Manage state reliably (databases, caches, queues) and operate under load (performance, scaling, observability).

Everything else (frameworks, ORMs, caches, message queues) is a tool to serve these jobs.

1) The ground: network + HTTP

1.1 Request → Response

HTTP is a message protocol: the client sends a request (method, path, headers, optional body) and the server returns a response (status code, headers, optional body).
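A minimal exchange (illustrative endpoint and payload):

POST /tasks HTTP/1.1
Host: api.example.com
Content-Type: application/json

{"title": "write blog post", "done": false}

HTTP/1.1 201 Created
Content-Type: application/json
Location: /tasks/42

{"id": 42, "title": "write blog post", "done": false}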

1.2 Methods (verbs)

Common methods:

  • GET: read a resource
  • POST: create a resource or trigger an action
  • PUT: replace a resource entirely
  • PATCH: update part of a resource
  • DELETE: remove a resource

1.3 Status codes (API "physics")

Use status codes consistently:

  • 2xx success: 200 OK, 201 Created, 202 Accepted (queued), 204 No Content
  • 3xx redirection/caching: 304 Not Modified
  • 4xx client errors: 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 409 Conflict, 422 Unprocessable Entity, 429 Too Many Requests
  • 5xx server errors: 500 Internal Server Error (never leak internals)

2) What is a REST API?

REST is an architecture style defined by constraints. It's not a library.

2.1 REST = Representation + State + Transfer

A client never touches the server's internal objects; it exchanges representations (usually JSON) of a resource's state, transferred over HTTP.

Example: GET /users/42 transfers a JSON representation of the current state of the user resource; PUT /users/42 transfers a new representation back to replace that state.

2.2 REST constraints

  1. Client–Server separation
    • UI logic stays on the client; data + rules stay on the server.
  2. Uniform interface
    • consistent endpoints, methods, status codes, and payload shapes.
  3. Layered system
    • intermediaries (load balancer, gateway, proxy) can exist; each layer interacts with adjacent layer only.
  4. Cacheable
    • responses explicitly declare if caching is allowed and for how long.
  5. Stateless
    • server does not rely on stored client context between requests (unless you choose sessions explicitly).
  6. Code on demand (optional)
    • server may send executable code (e.g., JavaScript) to extend client functionality.

3) Method semantics: safe + idempotent

These properties matter for retries, caching, and correctness.

3.1 Safe

A safe operation should not change server state: GET, HEAD, and OPTIONS are safe; POST, PUT, PATCH, and DELETE are not.

3.2 Idempotent

An operation is idempotent if repeating it yields the same end state: GET, PUT, and DELETE are idempotent (PUT /tasks/1 twice leaves the same document; DELETE /tasks/1 twice still leaves it deleted). POST is not idempotent by default: two identical POSTs can create two resources.

Key Insight

Why it matters: if the client retries due to network failure, idempotent methods prevent duplicate side effects.
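One practical application is an idempotency key for POST: the client sends a unique key per logical operation, and on a retry the server returns the stored result instead of repeating the side effect. A minimal in-memory sketch (the /payments endpoint is hypothetical; production would keep keys in Redis/DB with a TTL):

from typing import Dict
from fastapi import FastAPI, Header

app = FastAPI()

# idempotency key → previously computed result (use Redis/DB + TTL in production)
_results: Dict[str, dict] = {}

@app.post("/payments", status_code=201)
def create_payment(amount: int, idempotency_key: str = Header(..., alias="Idempotency-Key")):
    if idempotency_key in _results:
        # Retry of an operation we already performed: same response, no second charge
        return _results[idempotency_key]
    result = {"payment_id": len(_results) + 1, "amount": amount}
    _results[idempotency_key] = result
    return result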

4) Resources: design by nouns

A resource is any noun-like business object: users, tasks, orders, invoices, each addressable by its own URL.

4.1 Good URL patterns

Keep URLs:

  • noun-based and plural: /tasks, /users/42
  • lowercase, hyphenated, verb-free: /users/42/orders, not /getUserOrders
  • hierarchical where ownership is real: /users/42/orders/7

4.2 CRUD mapping

POST   /tasks        → create
GET    /tasks        → list
GET    /tasks/{id}   → read one
PUT    /tasks/{id}   → replace
PATCH  /tasks/{id}   → partial update
DELETE /tasks/{id}   → delete

4.3 Beyond CRUD (actions)

Sometimes you need an action that doesn't map cleanly to CRUD:

POST /tasks/{id}/complete
POST /orders/{id}/cancel

Prefer modeling as state change (e.g., done=true) when possible.

5) API interface design in practice (Postman/Insomnia)

Tools like Postman or Insomnia help you: send requests with auth headers and bodies, inspect responses, save requests as collections, and switch environments (local/staging/prod).

A professional habit: maintain one shared collection per API, with a saved request for every endpoint and environment variables for base URLs and tokens, so the collection doubles as executable documentation.

6) Pagination + sorting + filtering

6.1 Why pagination is not optional

Without pagination: one request can return the entire table, so response size, memory use, and DB time grow with the data, and the endpoint gets slower every week it's in production.

6.2 Offset pagination (page + limit)

Query: GET /tasks?page=2&limit=20 → SELECT ... LIMIT 20 OFFSET 20

Rules: validate page ≥ 1, give limit a sensible default (e.g., 20), and cap it (e.g., ≤ 100) so clients cannot request unbounded pages.

Pros: easy
Cons: slow/unstable for deep pages, duplicates when data changes

6.3 Cursor pagination (best for infinite scroll)

Instead of page, you use a cursor token (like created_at + id): GET /tasks?cursor=<opaque-token>&limit=20, where the cursor marks the last item the client has seen.

Pros: stable and scalable
Cons: more complex

6.4 Implementing cursor pagination

Cursor pagination usually needs:

  • a stable total order with a tie-breaker: ORDER BY created_at DESC, id DESC
  • an opaque cursor token encoding the last-seen (created_at, id)
  • a next_cursor field in every response (null when the data runs out)

Endpoint idea: GET /tasks?cursor=<token>&limit=20 → { "items": [...], "next_cursor": "..." }

Pseudo-implementation

# where tasks are ordered by created_at desc, id desc
# cursor = "2026-01-20T12:00:00Z|<id>"

# Fetch items with (created_at, id) < cursor tuple in same order.

In production you also: base64-encode (and ideally sign) the cursor so clients cannot tamper with it, reject invalid cursors with 400, and cap limit. A concrete fetch sketch follows.
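A sketch of the fetch step, assuming a psycopg-style DB cursor and a Postgres tasks table; the row-value comparison (created_at, id) < (%s, %s) implements the keyset, and the cursor format is the simple created_at|id string from above:

from typing import Optional, Tuple

def list_tasks_page(cur, cursor: Optional[str], limit: int = 20) -> Tuple[list, Optional[str]]:
    if cursor:
        created_at, last_id = cursor.split("|")
        cur.execute(
            """
            SELECT id, title, created_at FROM tasks
            WHERE (created_at, id) < (%s, %s)   -- Postgres row-value comparison
            ORDER BY created_at DESC, id DESC
            LIMIT %s
            """,
            (created_at, int(last_id), limit),
        )
    else:  # first page
        cur.execute(
            "SELECT id, title, created_at FROM tasks "
            "ORDER BY created_at DESC, id DESC LIMIT %s",
            (limit,),
        )
    rows = cur.fetchall()
    # Only hand out a cursor if this page was full (there may be more rows)
    next_cursor = f"{rows[-1][2].isoformat()}|{rows[-1][0]}" if len(rows) == limit else None
    return rows, next_cursor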

7) Serialization & deserialization

In FastAPI, Pydantic handles both directions: deserialization (parsing and validating the request body into typed models, returning 422 on failure) and serialization (converting returned objects to JSON, filtered through response_model).
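A tiny round-trip sketch (assuming Pydantic v2 method names):

from pydantic import BaseModel

class Task(BaseModel):
    id: int
    title: str
    done: bool = False

# Deserialization: untrusted dict/JSON → validated, typed object (raises on bad input)
task = Task.model_validate({"id": 1, "title": "write post"})

# Serialization: typed object → plain dict / JSON string for the HTTP response
print(task.model_dump())       # {'id': 1, 'title': 'write post', 'done': False}
print(task.model_dump_json())  # {"id":1,"title":"write post","done":false}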

8) Validation (trust nothing from the network)

Validation enforces an input contract: structure (schema), types, and constraints. In backend systems it is a trust boundary: every request payload is untrusted until it passes checks at the API edge and domain rules inside the application.

What validation protects against

Validation reduces failures caused by unexpected payloads (bugs), inconsistent/partial inputs (corrupted data), and hostile or abusive requests (security, including oversized bodies and injection attempts).

8.1 What to validate (contract + invariants)

  • structure: required fields present, unknown fields rejected or ignored deliberately
  • types: string vs int vs datetime, no silent coercion surprises
  • constraints: lengths, ranges, formats (e.g., email), enum membership
  • domain invariants: rules that must always hold (unique email, non-negative balance)

8.2 Validation vs sanitization vs escaping

  • Validation rejects input that violates the contract (fail fast with 422/400).
  • Sanitization transforms input into a normalized/safe form (trim whitespace, lowercase emails).
  • Escaping encodes data for a specific output context (HTML, SQL, shell) at output time, not input time.

Security note

Validation is not a complete injection defense. For SQL use parameterized queries; for HTML/templates use correct escaping. Never concatenate raw user input into SQL.

8.3 Validation across layers (Controller → Service → Repository)

In a layered backend architecture, validation is defense in depth. Each layer validates what it owns:

  • Controller: request shape, types, and formats (Pydantic → 422)
  • Service: business rules and invariants (e.g., uniqueness, state transitions → 409/400)
  • Repository/DB: integrity constraints (UNIQUE, NOT NULL, foreign keys) that hold even under concurrency

Why multiple layers?

Controller validation avoids wasting resources on bad input, service validation encodes business meaning, and repository/DB constraints guarantee correctness even under concurrency.

8.4 Error semantics (API behavior)

Good validation errors should be specific, consistent, and safe (do not leak internal stack traces, raw DB errors, or secrets).

8.5 FastAPI example (layered validation with clean error mapping)

This example shows: (1) Pydantic boundary validation, (2) service business checks, (3) repository integrity as a last guardrail.

from fastapi import FastAPI, HTTPException, status
from pydantic import BaseModel, EmailStr, Field
from typing import Dict

app = FastAPI()

# ---------- Domain error ----------
class ConflictError(Exception):
    pass

# ---------- 1) Controller schema ----------
class UserCreateIn(BaseModel):
    email: EmailStr
    name: str = Field(min_length=1, max_length=60)

class UserOut(BaseModel):
    id: int
    email: EmailStr
    name: str

# ---------- 3) Repository (integrity) ----------
class UserRepository:
    def __init__(self):
        self._users_by_email: Dict[str, dict] = {}
        self._id = 0

    def email_exists(self, email: str) -> bool:
        return email.lower() in self._users_by_email

    def insert_user(self, email: str, name: str) -> dict:
        # In real DB: UNIQUE(email) enforces this under concurrency.
        if self.email_exists(email):
            raise ConflictError("email already exists")

        self._id += 1
        user = {"id": self._id, "email": email.lower(), "name": name}
        self._users_by_email[email.lower()] = user
        return user

# ---------- 2) Service (business rules) ----------
class UserService:
    def __init__(self, repo: UserRepository):
        self.repo = repo

    def create_user(self, email: str, name: str) -> dict:
        # Business invariant: email must be unique
        if self.repo.email_exists(email):
            raise ConflictError("email already exists")
        return self.repo.insert_user(email=email, name=name)

repo = UserRepository()
svc = UserService(repo)

@app.post("/users", response_model=UserOut, status_code=status.HTTP_201_CREATED)
def create_user(payload: UserCreateIn):
    try:
        return svc.create_user(payload.email, payload.name)
    except ConflictError as e:
        raise HTTPException(status_code=409, detail=str(e))

8.6 Practical limits (defense against abuse)

Validation also includes resource bounding: even “valid” inputs can be abusive if they are too large, too frequent, or too expensive to process. Limits preserve availability and stable latency.
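A minimal sketch of one such limit: rejecting oversized bodies by Content-Length (the 1 MB threshold is an assumption; requests without a Content-Length header, e.g., chunked uploads, need a streaming check instead):

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()

MAX_BODY_BYTES = 1_000_000  # 1 MB; tune per endpoint

@app.middleware("http")
async def limit_body_size(request: Request, call_next):
    content_length = request.headers.get("content-length")
    if content_length and int(content_length) > MAX_BODY_BYTES:
        return JSONResponse(status_code=413, content={"detail": "Payload too large"})
    return await call_next(request)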

Interview one-liner

“In FastAPI, I validate request shape at the boundary with Pydantic (422), enforce business invariants in the service, rely on DB constraints in the repository for integrity under concurrency, and apply resource limits (payload size, rate limits, pagination caps, timeouts) to protect availability.”

9) Authentication and Authorization

Authentication answers: Who are you?
Authorization answers: What are you allowed to do?

Core concept

AuthN (authentication) establishes identity. AuthZ (authorization) enforces permissions on resources. You can be authenticated but not authorized (e.g., logged in but forbidden).

9.0 Typical HTTP status codes

  • 401 Unauthorized: authentication missing or invalid (send WWW-Authenticate: Bearer for token APIs)
  • 403 Forbidden: authenticated, but not allowed to perform this action

9.1 Typical approaches

1) Session cookie (stateful)

Server stores session state (e.g., in Redis/DB). Client holds a session ID cookie.

2) JWT Bearer token (stateless)

Client sends Authorization: Bearer <jwt>. JWT contains claims (user id, roles, expiry), signed by server. No session lookup is required for each request.

3) API keys (simple but limited)

Key identifies the client/application, often used for service-to-service or public APIs.

9.2 Cookies (short but important)

For browser-based auth, cookies must be configured to reduce XSS/CSRF risks: HttpOnly (JavaScript cannot read the cookie), Secure (sent only over HTTPS), and SameSite (limits cross-site sending).

Cookie example (secure session cookie)

Set-Cookie: session_id=abc123;
  HttpOnly;
  Secure;
  SameSite=Lax;
  Path=/;
CSRF note

If you use cookies for authentication, you must consider CSRF defenses: SameSite, CSRF tokens, and verifying Origin/Referer for sensitive requests.

9.3 Authorization models (how permissions are expressed)

  • RBAC (role-based): permissions attach to roles (user, admin); users hold roles.
  • ABAC (attribute-based): policies evaluate attributes of user, resource, and context (e.g., resource.owner_id == user.id).
  • ACLs: an explicit permission list per resource.

9.4 Example: protect endpoint + role check (FastAPI)

from fastapi import FastAPI, Depends, HTTPException, status

app = FastAPI()

def get_current_user():
    # verify token/session and return user object
    return {"id": "u1", "role": "user"}

def require_admin(user=Depends(get_current_user)):
    if user["role"] != "admin":
        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Forbidden")
    return user

@app.delete("/admin/users/{user_id}")
def delete_user(user_id: str, admin=Depends(require_admin)):
    return {"deleted": user_id}
Interview one-liner

“Authentication proves identity (401 if missing/invalid). Authorization enforces permissions (403 if not allowed). For browsers I prefer secure session cookies + CSRF defenses; for APIs JWT bearer tokens are common with short expiry and rotation.”

9.5 Why “HTTP is stateless” matters

HTTP is stateless, meaning each request is independent and the server does not automatically remember any client state between requests. Request #2 must contain all information needed to handle it, or the server must be able to look up required context using identifiers provided by the client.

Stateless ≠ no state

Applications still need state (login sessions, shopping carts). Stateless means the protocol does not preserve that state automatically. State is carried by the client (cookies/tokens) or stored in a server-side database/cache and retrieved per request.

Where state lives in practice

Stateful vs stateless authentication

Interview one-liner

“HTTP is stateless: every request must be self-contained. We implement user state using cookies or tokens, and if we use sessions, the server retrieves state from a shared store like Redis.”

Diagram: how “stateless HTTP” still supports login state

HTTP is stateless, so the server doesn’t remember you automatically. The client must send context on every request (cookie/token), and the server may fetch state from storage.

STATEFUL AUTH (Session Cookie + Server-side session store)
--------------------------------------------------------
Browser                 API Server                    Redis/DB
  |  POST /login           |                            |
  |----------------------->| create session             |
  |                        |--------------------------->| SET session:abc = {user_id, roles, ...}
  |                        |<---------------------------|
  |  Set-Cookie: session_id=abc                          |
  |<-----------------------|                            |
  |
  |  GET /profile
  |  Cookie: session_id=abc
  |----------------------->| lookup session by ID       |
  |                        |--------------------------->| GET session:abc
  |                        |<---------------------------| {user_id, roles, ...}
  |                        | authorize + respond         |
  |<-----------------------| 200 OK                      |


STATELESS AUTH (JWT Bearer Token)
---------------------------------
Client                 API Server
  |  POST /login          |
  |---------------------->| issue JWT (signed)
  |<----------------------| 200 OK + access_token
  |
  |  GET /profile
  |  Authorization: Bearer <jwt>
  |---------------------->| verify signature + exp
  |                        (no session lookup needed)
  |<----------------------| 200 OK

9.6 JWT (Bearer token) + RBAC in FastAPI (minimal example)

This section shows the core idea of JWT-based authentication and role-based authorization (RBAC) in FastAPI. The flow is:

  1. the client POSTs credentials to /login
  2. the server verifies the password hash and issues a signed JWT
  3. the client sends Authorization: Bearer <jwt> on every request
  4. a dependency verifies the signature + expiry and extracts the user
  5. a second dependency enforces the role (RBAC)

Production warning

This is intentionally minimal to teach the concept. Real production JWT systems require stronger controls: key rotation (kid/JWKS), issuer/audience validation, refresh tokens, revocation strategy, secure secret management, and careful claim validation.

Install

pip install "python-jose[cryptography]" "passlib[bcrypt]"

Conceptual model (claims you care about)

  • sub: the subject (user id/username)
  • role: used for RBAC checks
  • iat: issued-at timestamp
  • exp: expiry; verification must reject expired tokens

AuthN vs AuthZ reminder

JWT verification gives you authentication (who the user is). Role checks implement authorization (what the user is allowed to do). You will usually return 401 for invalid/missing tokens and 403 for “authenticated but not allowed.”

Minimal FastAPI JWT + RBAC code

The example below includes: (1) password verification (bcrypt), (2) token issuance (/login), (3) dependency that extracts the current user from the Bearer token, (4) an admin-only endpoint.

from datetime import datetime, timedelta, timezone
from typing import Optional, Dict

from fastapi import FastAPI, Depends, HTTPException, status
from fastapi.security import OAuth2PasswordBearer, OAuth2PasswordRequestForm
from jose import jwt, JWTError
from passlib.context import CryptContext
from pydantic import BaseModel

app = FastAPI()

# -----------------------------
# Minimal config (DO NOT hardcode secrets in production)
# -----------------------------
SECRET_KEY = "change-me-in-production"
ALGORITHM = "HS256"
ACCESS_TOKEN_EXPIRE_MINUTES = 30

pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="login")

# -----------------------------
# Fake user store (replace with DB)
# -----------------------------
# In a real app: store password hashes, not plain passwords.
# Here we hash at startup for demo clarity.
fake_users_db: Dict[str, dict] = {
    "alice": {"username": "alice", "role": "user", "password_hash": pwd_context.hash("alicepass")},
    "admin": {"username": "admin", "role": "admin", "password_hash": pwd_context.hash("adminpass")},
}

class TokenOut(BaseModel):
    access_token: str
    token_type: str = "bearer"

class User(BaseModel):
    username: str
    role: str

# -----------------------------
# Helpers
# -----------------------------
def verify_password(plain_password: str, password_hash: str) -> bool:
    return pwd_context.verify(plain_password, password_hash)

def authenticate_user(username: str, password: str) -> Optional[User]:
    record = fake_users_db.get(username)
    if not record:
        return None
    if not verify_password(password, record["password_hash"]):
        return None
    return User(username=record["username"], role=record["role"])

def create_access_token(*, sub: str, role: str, expires_minutes: int) -> str:
    now = datetime.now(timezone.utc)
    payload = {
        "sub": sub,
        "role": role,
        "iat": int(now.timestamp()),
        "exp": int((now + timedelta(minutes=expires_minutes)).timestamp()),
    }
    return jwt.encode(payload, SECRET_KEY, algorithm=ALGORITHM)

# -----------------------------
# Auth dependency: parse and validate JWT
# -----------------------------
def get_current_user(token: str = Depends(oauth2_scheme)) -> User:
    cred_error = HTTPException(
        status_code=status.HTTP_401_UNAUTHORIZED,
        detail="Invalid authentication credentials",
        headers={"WWW-Authenticate": "Bearer"},
    )

    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
        username: str = payload.get("sub")
        role: str = payload.get("role")
        if not username or not role:
            raise cred_error
    except JWTError:
        raise cred_error

    # Optional: verify user still exists (common in production)
    if username not in fake_users_db:
        raise cred_error

    return User(username=username, role=role)

# -----------------------------
# Authorization dependency: RBAC
# -----------------------------
def require_admin(user: User = Depends(get_current_user)) -> User:
    if user.role != "admin":
        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Forbidden")
    return user

# -----------------------------
# Routes
# -----------------------------
@app.post("/login", response_model=TokenOut)
def login(form: OAuth2PasswordRequestForm = Depends()):
    user = authenticate_user(form.username, form.password)
    if not user:
        raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Incorrect username or password")

    token = create_access_token(sub=user.username, role=user.role, expires_minutes=ACCESS_TOKEN_EXPIRE_MINUTES)
    return TokenOut(access_token=token)

@app.get("/me")
def read_me(user: User = Depends(get_current_user)):
    return {"username": user.username, "role": user.role}

@app.get("/admin/metrics")
def admin_metrics(admin: User = Depends(require_admin)):
    return {"ok": True, "message": f"Hello {admin.username}, you are an admin."}

How to test quickly (curl)

# 1) login to get JWT
curl -X POST http://localhost:8000/login \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "username=admin&password=adminpass"

# 2) use the token on protected endpoint
curl http://localhost:8000/admin/metrics \
  -H "Authorization: Bearer <PASTE_TOKEN_HERE>"

Minimum production checklist

  • secrets from the environment or a secret manager, never hardcoded
  • short access-token expiry plus refresh tokens and rotation
  • key rotation (kid/JWKS) and issuer/audience (iss/aud) validation
  • a revocation strategy (short-lived tokens and/or a denylist)
  • never log tokens

Interview one-liner

“JWT gives stateless authentication: verify signature + expiry, extract sub and claims. Authorization is enforced separately (RBAC dependency). Invalid/missing token → 401; insufficient role → 403.”

10) Middleware & CORS (cross-cutting concerns)

Some backend problems are not “business logic.” They are concerns that apply to every request: logging, timing, auth, security headers, compression, request IDs, CORS, etc. Instead of repeating the same code in every endpoint, backends use middleware.

What middleware is

Middleware is code that runs around your endpoints: before the request reaches the route handler and/or after the handler returns a response. Think of it as a pipeline: request → middleware chain → route handler → middleware chain → response.

10.1 Why middleware exists (real-world reasons)

  • avoid duplicating the same code in every endpoint (logging, timing, auth checks, headers)
  • guarantee consistency: every response carries the same request ID and security headers
  • keep handlers focused on business logic while cross-cutting concerns live in one place

10.2 FastAPI middleware example: request ID + timing

This adds a correlation ID (useful for logs) and exposes response time. In production you’d also log it (or send to tracing/metrics).

import time, uuid
from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def add_request_id_and_timing(request: Request, call_next):
    request_id = request.headers.get("X-Request-ID") or str(uuid.uuid4())
    start = time.perf_counter()

    response = await call_next(request)

    duration_ms = (time.perf_counter() - start) * 1000
    response.headers["X-Request-ID"] = request_id
    response.headers["X-Response-Time-ms"] = f"{duration_ms:.2f}"
    return response
Pro habit

When debugging production: request ID + structured logs can reduce “guessing time” massively.

10.3 CORS: what it actually is (and what it is NOT)

CORS (Cross-Origin Resource Sharing) is a browser security rule. It controls whether a web page running on one origin (domain) is allowed to call APIs on another origin.

Critical misconception

CORS is not authentication and not a server security boundary. It only restricts what browsers allow. Non-browser clients (curl, Postman) can call your API regardless of CORS. You still need AuthN/AuthZ on the server.

10.4 Preflight (OPTIONS): why the browser sends it

For some requests, the browser sends a preflight request: OPTIONS /endpoint asks the server which methods/headers are allowed. This happens for "non-simple" requests, e.g., custom headers like Authorization, methods other than GET/HEAD/POST, or a POST with Content-Type: application/json.

Typical flow (browser):

1) OPTIONS /api/secure
   Origin: http://localhost:3000
   Access-Control-Request-Method: GET
   Access-Control-Request-Headers: Authorization

2) Server replies with:
   Access-Control-Allow-Origin: http://localhost:3000
   Access-Control-Allow-Methods: GET
   Access-Control-Allow-Headers: Authorization

3) Browser then sends the real GET request

10.5 FastAPI CORS configuration (recommended patterns)

If you control the frontend origins, whitelist them explicitly. Avoid wildcard * in production.

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

ALLOWED_ORIGINS = [
    "http://localhost:3000",
    "https://www.janmajay.de",
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=ALLOWED_ORIGINS,
    allow_credentials=True,  # needed if you use cookies
    allow_methods=["GET", "POST", "PUT", "PATCH", "DELETE", "OPTIONS"],
    allow_headers=["Authorization", "Content-Type", "X-Request-ID"],
)

10.6 Cookies + CORS (the part that breaks people)

If you use cookie-based auth across origins, you must set:

  • server: Access-Control-Allow-Credentials: true plus an exact allowed origin (the wildcard * is rejected with credentials)
  • cookie: SameSite=None; Secure so the browser sends it cross-site
  • client: credentials: "include" on the fetch call

fetch("https://api.example.com/me", {
  method: "GET",
  credentials: "include"
});
Security note

Cookies across origins raise CSRF risk. If you use cookies for auth, use SameSite + CSRF protections for sensitive actions.

10.7 Practical CORS rules (safe defaults)

  • whitelist exact origins; avoid * in production, and never combine * with credentials
  • allow only the methods and headers you actually use
  • set Access-Control-Max-Age so browsers cache preflight responses
  • remember CORS is a browser policy, not server security: keep AuthN/AuthZ regardless

Interview one-liner

“Middleware handles cross-cutting concerns like logging, timing, headers, and auth uniformly. CORS is a browser policy for cross-origin calls; it’s not auth. In production I whitelist origins and handle preflight correctly.”

11) Caching (speed by remembering)

Caching is a performance technique where we store the result of an expensive operation (DB query, API call, computation) so repeated requests can reuse it instead of recomputing. In CS terms, caching trades space (memory/storage) for time (lower latency) and reduces load on upstream systems.

Caching can exist at many layers (each with different scope and consistency guarantees):

  • browser/HTTP cache: per client (Cache-Control, ETag)
  • CDN edge cache: shared, closest to users
  • reverse proxy cache: in front of your app (Nginx)
  • application cache: Redis, in-process memory
  • database internals: buffer pools and indexes (managed by the DB, not by you)

Important: do not cache everything. Caching introduces the risk of stale data. You must define a freshness policy (e.g., TTL), invalidation strategy, or revalidation mechanism.

11.0 Cache vocabulary: hit, miss, TTL

  • Hit: the value is in the cache; serve it on the fast path.
  • Miss: the value is absent; pay the full cost, then usually populate the cache.
  • TTL (time to live): how long an entry remains valid; it bounds how stale data can get.

11.1 HTTP caching with ETag (best for GET)

HTTP caching is especially effective for GET endpoints and static resources. One robust strategy is revalidation using ETag.

Idea:

Combine ETag with Cache-Control for explicit freshness: Cache-Control: public, max-age=60 means the response can be reused for 60 seconds before revalidation.

FastAPI example (ETag + Cache-Control):

from fastapi import FastAPI, Request, Response
import hashlib, json

app = FastAPI()

@app.get("/api/config")
def get_config(request: Request, response: Response):
    payload = {"featureA": True, "version": 3}
    body = json.dumps(payload, separators=(",", ":")).encode()

    etag = hashlib.sha256(body).hexdigest()

    if request.headers.get("if-none-match") == etag:
        response.status_code = 304
        return

    response.headers["ETag"] = etag
    response.headers["Cache-Control"] = "public, max-age=60"
    return payload

11.2 CDN caching (edge cache, closest to the user)

A CDN caches responses at edge locations near users. It reduces latency and load on your origin server. CDNs work best for static assets and cacheable public GET responses.

11.3 Reverse proxy caching with Nginx (cache in front of the app)

A reverse proxy (e.g., Nginx) can cache upstream responses so your app/DB does not get hit for repeated requests. This is useful for public GET endpoints and for absorbing traffic bursts.

Minimal Nginx proxy cache example:

# inside http { ... }
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=api_cache:10m
                 max_size=1g inactive=60m use_temp_path=off;

server {
    listen 80;
    server_name example.com;

    location /api/ {
        proxy_pass http://127.0.0.1:8000;

        proxy_cache api_cache;
        proxy_cache_key "$scheme$request_method$host$request_uri";

        # cache only successful responses
        proxy_cache_valid 200 10m;
        proxy_cache_valid 404 1m;

        # do not cache when auth/cookies exist (safety rule)
        proxy_no_cache $http_authorization $http_cookie;
        proxy_cache_bypass $http_authorization $http_cookie;

        add_header X-Cache-Status $upstream_cache_status always;
    }
}

Debug tip: the first request usually shows X-Cache-Status: MISS, the next shows HIT.

11.4 Application caching with Redis (cache-aside / lazy loading)

Redis is commonly used as an application cache because it is fast, supports TTL, and provides atomic operations. A standard approach is cache-aside:

  1. read from cache
  2. if miss → read from DB
  3. store result in cache (with TTL)
  4. return result

Python + Redis example (cache-aside):

import json
from redis import Redis

r = Redis(host="localhost", port=6379, decode_responses=True)

def get_user_profile(user_id: str) -> dict:
    key = f"user:profile:{user_id}"

    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    # expensive operation (DB query)
    profile = db_fetch_user_profile(user_id)

    # TTL limits staleness
    r.set(key, json.dumps(profile), ex=60)
    return profile

Consistency note: TTL-based caching may serve stale data for up to TTL seconds. For stronger consistency, invalidate the relevant cache keys on writes/updates.

11.5 Cache stampede (thundering herd) + mitigation

A cache stampede occurs when many requests miss simultaneously (e.g., a popular key expires), causing a burst of DB load. Common mitigations:

  • single-flight locking: only one request recomputes; the rest wait or serve stale data
  • TTL jitter: randomize expiry so hot keys don't all expire at once
  • background refresh: proactively recompute hot keys before they expire

Best-effort Redis lock example (single-flight + TTL jitter):

import json, random, time
from redis import Redis

r = Redis(decode_responses=True)

def get_with_lock(key: str, ttl_s: int, compute_fn):
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    lock_key = key + ":lock"
    got_lock = r.set(lock_key, "1", nx=True, ex=10)  # lock auto-expires

    if got_lock:
        try:
            value = compute_fn()
            jitter = random.randint(0, 10)
            r.set(key, json.dumps(value), ex=ttl_s + jitter)
            return value
        finally:
            r.delete(lock_key)

    # someone else recomputing: wait briefly and retry
    for _ in range(5):
        time.sleep(0.05)
        cached2 = r.get(key)
        if cached2 is not None:
            return json.loads(cached2)

    # fallback policy choice
    return compute_fn()

11.6 DB indexes are not cache (but essential for performance)

A DB index is a data structure (e.g., B-tree) maintained by the database to accelerate queries. It is not a cache because it is part of the DB engine’s storage and changes query complexity (often from scan to logarithmic lookup). Indexes improve read performance but usually increase write cost and storage usage.

12) Scaling: vertical vs horizontal

12.1 Vertical scaling (scale up)

You increase resources on one machine: more CPU cores, more RAM, faster disks, a bigger DB instance.

Example: upgrading the API server from 4 vCPU / 8 GB RAM to 16 vCPU / 64 GB RAM. No code changes needed.

Pros: simple
Cons: hard limit; single point of failure

12.2 Horizontal scaling (scale out)

You run multiple replicas of your service: several identical instances behind a load balancer, each stateless, sharing state through the DB/cache.

Pros: scalable + resilient
Cons: requires stateless design + shared state in DB/cache

12.3 Horizontal scaling example with Docker + Nginx load balancing

docker-compose.yml

services:
  api:
    build: .
    deploy:
      replicas: 3  # (works in swarm; for local dev use multiple services or docker compose scale)
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/app
    depends_on:
      - db

  nginx:
    image: nginx:alpine
    ports:
      - "8080:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - api

  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: app

nginx.conf

events {}

http {
  upstream api_upstream {
    # in real setups, you'd list service DNS names or use service discovery
    # Example conceptually:
    server api:8000;
  }

  server {
    listen 80;

    location / {
      proxy_pass http://api_upstream;
      proxy_set_header Host $host;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
  }
}

12.4 Concurrency vs Parallelism (backend mental model)

Many beginners confuse concurrency with parallelism. Backends care about both — but for different reasons.

Two definitions
  • Concurrency: handling multiple requests in overlapping time (good for I/O waits).
  • Parallelism: doing multiple computations at the same time (needs multiple CPU cores).

Why backends are mostly “I/O bound”

A typical request spends most time waiting on database, network calls, or disk, not executing Python code. While you wait, concurrency lets you serve other requests.

Request timeline (typical)
--------------------------
parse+validate:   2ms
DB query:       120ms  (waiting)
serialize:        3ms
total:          125ms

Main lesson: DB/network waiting dominates.

3 execution models used in real backends

  1. Sync workers (thread/process per request): simple, but each worker blocks while it waits.
  2. Async event loop (async/await): one process overlaps many I/O-bound requests.
  3. Worker pools / job queues: separate processes for CPU-heavy or long-running work.

Key rule

Async improves concurrency for I/O. It does not make CPU-heavy work faster. CPU-heavy work needs parallelism (multiple processes) or a background job queue.

Async vs Background jobs (common confusion)

Use async when:     waiting on DB, waiting on HTTP, waiting on Redis
Use background jobs: PDF processing, video conversion, ML inference, embeddings, long pipelines

Python reality check: GIL (one sentence only)

In CPython, CPU-bound Python code does not run truly in parallel in threads due to the GIL. For CPU-heavy work, prefer multiple processes or move work to background workers.

Concurrency "tools" backends use: async/await (overlap I/O waits), thread pools (wrap blocking libraries), process pools (CPU parallelism), and queues + workers (durable background work).

Interview one-liner

“Concurrency is overlapping requests (great for I/O waits); parallelism is true simultaneous execution (CPU cores). Async helps I/O-bound endpoints; CPU-heavy tasks go to background workers or multi-process scaling.”

Latency is mostly waiting

Imagine a request takes 30ms total, but only 3ms is actual CPU work. The other 27ms is usually waiting on the database/network.

A typical CPU runs around 3–4 GHz. At 3.5 GHz, in 30ms a single core has about 105 million CPU cycles available — and your handler might use only a small fraction of them. In a synchronous/blocking design, your thread just sits there waiting.

This is why concurrency matters: while one request waits on I/O, the server can make progress on other requests instead of wasting time.

12.4.1 FastAPI example: concurrency with async I/O (DB/HTTP waiting)

Below, both endpoints do the same thing: call an external API and return the result. The async version can keep handling other requests while waiting on the network. The blocking version ties up a worker while it waits.

Important

Async only helps if the work is truly I/O wait and the libraries are async-friendly. If you call blocking code inside an async def, you can still block the event loop.

from fastapi import FastAPI
import time
import httpx
import requests

app = FastAPI()

# ---------------------------
# BAD for high concurrency (blocking I/O)
# ---------------------------
@app.get("/blocking-weather")
def blocking_weather():
    # This blocks the worker while waiting on the network.
    r = requests.get("https://httpbin.org/delay/1", timeout=3)
    return {"status": r.status_code}

# ---------------------------
# GOOD for high concurrency (async I/O)
# ---------------------------
@app.get("/async-weather")
async def async_weather():
    # This yields control while waiting, so the server can handle other requests.
    async with httpx.AsyncClient(timeout=3.0) as client:
        r = await client.get("https://httpbin.org/delay/1")
    return {"status": r.status_code}

Practical rule: If your endpoint spends time waiting (DB/HTTP/Redis), prefer async I/O libraries.


12.4.2 FastAPI example: parallelism for CPU-heavy work (process pool)

For CPU-heavy work (hashing, image processing, ML inference), async does not help. You need parallelism using multiple CPU cores. A simple pattern is to offload CPU work to a process pool.

Why process pool?

CPython threads are limited for CPU-bound code by the GIL. A ProcessPool uses multiple OS processes → true parallel CPU execution across cores.

from fastapi import FastAPI
from concurrent.futures import ProcessPoolExecutor
import asyncio
import hashlib

app = FastAPI()

# A global pool (one per app process)
cpu_pool = ProcessPoolExecutor(max_workers=4)

def heavy_cpu_task(n: int) -> str:
    # Artificial CPU work: repeated hashing
    x = b"hello"
    for _ in range(n):
        x = hashlib.sha256(x).digest()
    return x.hex()

@app.get("/cpu-sync")
def cpu_sync(n: int = 200_000):
    # This blocks the worker CPU (bad under load)
    out = heavy_cpu_task(n)
    return {"result": out[:16]}

@app.get("/cpu-parallel")
async def cpu_parallel(n: int = 200_000):
    # Offload CPU work to another process (parallelism)
    loop = asyncio.get_running_loop()
    out = await loop.run_in_executor(cpu_pool, heavy_cpu_task, n)
    return {"result": out[:16]}
Production note

For real systems, CPU-heavy work is often better as a background job (Celery/RQ), especially if it may take seconds+ or needs retries. Use process pools for “medium” CPU tasks that must return quickly.


12.4.3 One clean decision table

Problem type → Best tool → Why
  • I/O wait (DB/HTTP/Redis) → async/await + async libs → free the server while waiting
  • CPU heavy (hashing, ML, image/PDF) → multi-process / process pool / job queue → use multiple cores (true parallelism)
  • Long-running pipeline (seconds-minutes) → background jobs (Celery/RQ) → durable + retries + doesn't block requests
Interview one-liner

“Async increases concurrency for I/O-bound endpoints by letting the server do other work while waiting. CPU-heavy work needs parallelism (processes) or background workers — async won’t make CPU faster.”

Critical Requirement

In horizontal scaling, your API must be stateless (or store session state in Redis / DB).

13) Performance: what matters most

Backend performance is primarily about latency (time per request) and throughput (requests per second). In practice, most slow backends are not slow because of Python itself — they are slow because the request path spends time waiting on I/O (database, network) or doing too much work per request.

Mental model

A request handler is a pipeline: parse → validate → query/compute → respond. Performance work is about finding the dominant cost in that pipeline and reducing it.

13.1 The backend performance hierarchy (typical bottlenecks)

The following "hierarchy" is a useful rule of thumb: when an endpoint is slow, these are usually the reasons, in roughly decreasing frequency:

  1. Database queries: missing indexes, N+1 patterns, unbounded result sets.
  2. External network calls: no timeouts, sequential fan-out to other services.
  3. Payload size and serialization: returning far more data than needed.
  4. CPU-heavy work done inline in the request path.
  5. Framework/language overhead: rarely the real bottleneck.

Rule of thumb

Optimize the biggest wait first: if you spend 300ms in the DB and 10ms in Python code, optimizing the Python part won’t move the needle.

13.2 Measure first: where time actually goes

Performance tuning without measurement is guessing. The minimal professional approach:

  • add per-request timing visibility (middleware below)
  • track percentiles (p50/p95/p99), not averages
  • log DB time and external-call time separately from total time
  • inspect the worst offenders with slow-query logs and EXPLAIN

FastAPI example: timing middleware (quick visibility)

import time
from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def timing_middleware(request: Request, call_next):
    start = time.perf_counter()
    resp = await call_next(request)
    duration_ms = (time.perf_counter() - start) * 1000
    resp.headers["X-Response-Time-ms"] = f"{duration_ms:.2f}"
    return resp

13.3 Practical rules (high-impact improvements)

1) Paginate lists (never return unbounded collections)

Returning “all rows” is a common performance and memory failure. Pagination bounds work per request and improves perceived performance. Prefer cursor-based pagination for large datasets; offset pagination is simpler but slows down at high offsets.

from fastapi import Query

@app.get("/items")
def list_items(limit: int = Query(20, ge=1, le=100), offset: int = Query(0, ge=0)):
    # SELECT ... LIMIT :limit OFFSET :offset
    return db_list_items(limit=limit, offset=offset)

2) Index columns used in filters/sorts

Indexes speed up lookups and sorting, but cost extra work on writes. Index columns that appear frequently in: WHERE, JOIN, and ORDER BY. Verify with query plans rather than guessing.

Index tradeoff

More indexes → faster reads, slower writes, more storage. Use indexes based on real query patterns.
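For example, a composite index matching a common filter + sort, then a plan check (hypothetical orders table; syntax shown for Postgres):

-- speeds up: WHERE user_id = ? ORDER BY created_at DESC
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at DESC);

-- verify with the query plan instead of guessing
EXPLAIN ANALYZE
SELECT id, total
FROM orders
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 20;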

3) Avoid N+1 queries

The N+1 problem happens when you fetch a list (1 query), then for each item fetch related data (N queries). It is common with ORMs if relationships are lazily loaded. Fix it by using joins, eager loading, or batch queries.

Example pattern:

Bad:
  1 query: fetch 100 posts
  100 queries: fetch author for each post
Total: 101 queries (slow)

Good:
  1 query: fetch posts + authors (JOIN / eager load)
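With SQLAlchemy, a sketch of the fix (reusing the session/model shapes from section 14; the Post.author relationship is a hypothetical example):

from sqlalchemy import select
from sqlalchemy.orm import selectinload

# 2 queries total: one for the posts, one batched IN-query for all their authors
posts = (
    db.execute(select(Post).options(selectinload(Post.author)).limit(100))
    .scalars()
    .all()
)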

4) Cache expensive reads (but handle staleness)

If the same expensive data is requested repeatedly, cache it (Redis, Nginx cache, HTTP caching). Use TTL to limit staleness and consider invalidation on writes for critical correctness.

import json
from redis import Redis

r = Redis(decode_responses=True)

def get_stats():
    key = "stats:v1"
    cached = r.get(key)
    if cached:
        return json.loads(cached)

    data = db_compute_stats()          # expensive query/aggregation
    r.set(key, json.dumps(data), ex=30) # cache for 30s
    return data

5) Use async for I/O waits, not for CPU-heavy work

async/await improves concurrency when your handler spends time waiting on I/O (HTTP calls, DB calls). It does not make CPU-heavy code faster. CPU-heavy tasks should be moved to: background workers (Celery/RQ), or optimized with native libraries, or parallelized safely.

Practical rule

Async helps when you wait on the network; background jobs help when you burn CPU.

6) Add timeouts for external calls (performance + reliability)

External services can be slow or hang. Always set timeouts, and consider retries with backoff for transient failures. Without timeouts, slow dependencies can saturate workers and cascade into outages.

import httpx

def fetch_user_from_partner(user_id: str):
    with httpx.Client(timeout=3.0) as client:
        r = client.get(f"https://partner.example/api/users/{user_id}")
        r.raise_for_status()
        return r.json()

13.4 A realistic performance scenario (end-to-end)

Suppose GET /orders is slow. A typical optimization workflow:

  1. Measure: log DB time + external call time + total time (p50/p95).
  2. Fix query shape: avoid selecting unused columns, limit result size, paginate.
  3. Add/adjust indexes: on user_id, created_at if used in filters/sorts.
  4. Remove N+1: join related tables or eager load.
  5. Cache: cache expensive aggregates (e.g., summary totals) with TTL.
  6. Protect dependencies: add timeouts/retries for external services.
Interview one-liner

“Most backend latency is DB + network. I measure first (percentiles), then fix query patterns (pagination, indexes, avoid N+1), cache expensive reads, use async for I/O waits, and always set timeouts on external calls.”

14) Data layer: ORM design (FastAPI + SQLAlchemy)

An ORM (Object–Relational Mapper) is a programming abstraction that maps relational database tables (rows/columns) to language objects (Python classes/instances). Instead of writing raw SQL for every operation, you work with objects and relations, and the ORM generates SQL and tracks changes for you.

Core mapping idea

Table users ↔ Python class User
Row in users ↔ instance of User
Column email ↔ attribute User.email

14.1 Why ORMs are used (benefits)

  • far less boilerplate for common CRUD than hand-written SQL
  • objects and relationships instead of manual row-to-object mapping
  • change tracking and transaction management through the session
  • easier refactoring and a degree of portability across database dialects

14.2 What an ORM does under the hood (unit of work + identity map)

Mature ORMs (including SQLAlchemy ORM) implement two key ideas:

  • Identity map: within one session, each database row corresponds to exactly one in-memory object, so repeated lookups return the same instance.
  • Unit of work: the session records the changes you make and flushes them as INSERT/UPDATE/DELETE statements at commit time.

Important tradeoffs

ORMs are not “free performance.” You still must understand SQL, indexes, and query patterns (especially to avoid N+1 queries and accidental full-table scans).

14.3 A small theoretical example: objects vs tables

Suppose you have a relational table:

CREATE TABLE posts (
  id INTEGER PRIMARY KEY,
  title TEXT NOT NULL,
  created_at TIMESTAMP NOT NULL,
  updated_at TIMESTAMP NOT NULL
);

In an ORM, you represent this table as a class. The ORM maps class attributes to columns and generates SQL for you. When you create an object and commit, the ORM emits an INSERT. When you modify an attribute and commit, it emits an UPDATE.

14.4 created_at / updated_at (timestamps for auditing)

In production systems, created_at and updated_at are common auditing fields:

  • created_at: set once at INSERT and never changed
  • updated_at: set at INSERT and refreshed on every UPDATE (e.g., via onupdate)

14.5 Minimal FastAPI + SQLAlchemy ORM stack (SQLite demo)

This is a realistic minimal stack: FastAPI for the HTTP layer, Pydantic for validation, SQLAlchemy ORM for the data layer, and SQLite as demo storage.

SQLite vs Postgres

SQLite is great for demos and local dev. In production, Postgres is preferred for concurrency, robustness, and advanced indexing/features. The ORM layer remains similar, but performance and operational behavior differ.

14.6 Minimal code example (model + session + sorting)

The code below shows: (1) an ORM model, (2) automatic timestamps, and (3) sorting by created_at.

from datetime import datetime
from fastapi import FastAPI, Depends, Query
from sqlalchemy import create_engine, Column, Integer, String, DateTime, select, desc, asc
from sqlalchemy.orm import declarative_base, sessionmaker, Session

DATABASE_URL = "sqlite:///./app.db"

engine = create_engine(
    DATABASE_URL,
    connect_args={"check_same_thread": False}  # needed for SQLite + threads
)
SessionLocal = sessionmaker(bind=engine, autocommit=False, autoflush=False)
Base = declarative_base()

class Post(Base):
    __tablename__ = "posts"

    id = Column(Integer, primary_key=True, index=True)
    title = Column(String(120), nullable=False)

    created_at = Column(DateTime, nullable=False, default=datetime.utcnow)
    updated_at = Column(DateTime, nullable=False, default=datetime.utcnow, onupdate=datetime.utcnow)

Base.metadata.create_all(bind=engine)

def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

app = FastAPI()

@app.post("/posts")
def create_post(title: str, db: Session = Depends(get_db)):
    post = Post(title=title)
    db.add(post)
    db.commit()
    db.refresh(post)
    return {"id": post.id, "title": post.title, "created_at": post.created_at}

@app.get("/posts")
def list_posts(
    sort: str = Query("desc", pattern="^(asc|desc)$"),
    db: Session = Depends(get_db)
):
    order = desc(Post.created_at) if sort == "desc" else asc(Post.created_at)
    posts = db.execute(select(Post).order_by(order).limit(50)).scalars().all()
    return [{"id": p.id, "title": p.title, "created_at": p.created_at, "updated_at": p.updated_at} for p in posts]

14.7 Common ORM pitfalls (fast interview checklist)

  • N+1 queries from lazily loaded relationships (fix: joins/eager loading)
  • loading unbounded result sets (fix: always limit/paginate)
  • long-lived sessions holding stale objects and locks
  • trusting generated SQL blindly: inspect queries and indexes for hot paths

Interview one-liner

“An ORM maps tables to objects and uses a session (identity map + unit of work) to generate SQL and manage transactions. It improves productivity, but you still need SQL awareness to avoid N+1 queries and slow scans.”

15) Background jobs (RQ / Celery) for heavy tasks

A background job is work that runs outside the HTTP request–response lifecycle. The API handler enqueues a task and returns quickly; the heavy/slow part is executed by a separate worker process (often on another machine). This design increases reliability and throughput for real-world systems.

Definition

Background jobs are tasks executed asynchronously after the API response, typically via a queue (Redis/RabbitMQ/SQS) and workers that consume tasks.

15.1 Why background jobs exist

  • keep request latency low: respond immediately, do heavy work later
  • reliability: retries with backoff happen out of band, not while a user waits
  • isolation: heavy CPU/memory work doesn't starve the API process
  • independent scaling: add workers without adding API replicas

15.2 Typical use cases

  • sending emails and notifications
  • PDF/report generation, image and video processing
  • ML inference and embeddings pipelines
  • data exports, webhook fan-out, scheduled batch work

15.3 Common pattern (Producer → Queue → Worker)

The web server acts as a producer and enqueues jobs. A queue/broker stores jobs. A worker acts as a consumer and executes them. Results are stored in a DB/cache and can be queried through a status endpoint.

Async vs Background (important)

async/await is primarily about non-blocking I/O inside the same process. Background jobs mean the work happens in separate execution (workers), potentially durable and retriable.

15.4 Response semantics

For heavy tasks, the API should usually return 202 Accepted with a job_id. This indicates the request was accepted for processing, but is not complete yet.

Example response:

{
  "status": "queued",
  "job_id": "a1b2c3d4"
}

15.5 Minimal in-process background tasks (FastAPI BackgroundTasks)

Framework background tasks (e.g., FastAPI BackgroundTasks) are useful for small, best-effort jobs but they are not a durable queue (tasks can be lost if the server restarts).

from fastapi import FastAPI, BackgroundTasks

app = FastAPI()

def send_verification_email(to_email: str) -> None:
    # call SMTP/provider here
    pass

@app.post("/signup")
def signup(email: str, background_tasks: BackgroundTasks):
    # create user in DB ...
    background_tasks.add_task(send_verification_email, email)
    return {"status": "created"}
Limitations of in-process tasks

Not durable (lost on crash), competes with API for CPU/memory, limited visibility/retry control. For heavy tasks, use a real queue (RQ/Celery).

15.6 Celery + Redis example (durable queue + workers)

Celery uses a broker (Redis/RabbitMQ) to store jobs and worker processes to execute them. This is a standard production pattern for background processing.

Worker: define task (tasks.py)

from celery import Celery

celery_app = Celery(
    "worker",
    broker="redis://localhost:6379/0",
    backend="redis://localhost:6379/1",
)

@celery_app.task(bind=True, max_retries=3)
def build_embeddings(self, document_id: str):
    try:
        # heavy pipeline:
        # 1) load document
        # 2) chunk text
        # 3) generate embeddings
        # 4) store vectors + build index
        return {"document_id": document_id, "status": "done"}
    except Exception as exc:
        # exponential backoff for transient failures
        raise self.retry(exc=exc, countdown=2 ** self.request.retries)

API server: enqueue job (app.py)

from fastapi import FastAPI
from tasks import build_embeddings, celery_app

app = FastAPI()

@app.post("/documents/{document_id}/embed")
def embed_document(document_id: str):
    job = build_embeddings.delay(document_id)  # enqueue
    return {"status": "queued", "job_id": job.id}

@app.get("/jobs/{job_id}")
def job_status(job_id: str):
    res = celery_app.AsyncResult(job_id)
    return {"state": res.state, "result": res.result}

15.7 Reliability topics

  • idempotent tasks: safe to run twice after a retry or redelivery
  • retries with exponential backoff for transient failures
  • dead-letter queue (DLQ) for jobs that keep failing
  • observability: job states, durations, retry counts, failure rates

Interview one-liner

“For heavy work, return 202 + job_id, process via queue + workers, and design tasks to be idempotent with retries/backoff and good observability.”


16) Testing with pytest (backend quality)

Install:

pip install pytest httpx

Example test using FastAPI TestClient:

from fastapi.testclient import TestClient
from main import app

client = TestClient(app)

def test_create_and_get_task():
    r = client.post("/tasks", json={"title": "hello", "done": False})
    assert r.status_code == 201
    task = r.json()
    assert task["title"] == "hello"

    r2 = client.get(f"/tasks/{task['id']}")
    assert r2.status_code == 200
    assert r2.json()["id"] == task["id"]

Test principles:

  • test through the public API: assert behavior, not internals
  • isolate state: fresh DB/fixtures per test so order doesn't matter
  • cover error paths (422/401/403/404/409), not just happy paths
  • keep tests fast and deterministic so CI stays trustworthy

17) CI (GitHub Actions)

.github/workflows/ci.yml

name: CI

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: pytest -q

18) Security essentials (production mindset)

Security is not one feature — it’s a collection of defaults that limit damage when something goes wrong. The goal is simple: reduce attack surface, prevent easy mistakes, and fail safely under bad inputs, leaked credentials, and broken dependencies.

Threat model in one sentence

Assume: inputs are malicious, credentials leak, dependencies fail, and traffic spikes — then design defaults so the system degrades safely.

18.1 Don’t leak internals (errors, stack traces, debug mode)

In production, never expose stack traces, file paths, raw SQL errors, or secrets to users. Return a generic error to clients and log the details internally with a request ID.

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
import logging

app = FastAPI()
log = logging.getLogger("app")

@app.exception_handler(Exception)
async def catch_all(request: Request, exc: Exception):
    log.exception("Unhandled error")  # log full stack trace internally
    return JSONResponse(status_code=500, content={"detail": "Internal Server Error"})

18.2 Secrets (env vars, rotation, and “never log tokens”)

Secrets include DB passwords, JWT signing keys, API keys, OAuth client secrets. One rule covers 90% of incidents: secrets must not live in Git or logs.

import os

DATABASE_URL = os.environ["DATABASE_URL"]
SECRET_KEY = os.environ["SECRET_KEY"]  # JWT signing key
Common mistake

“It’s fine, it’s only on my server.” If it’s in Git history, HTML, or logs, it eventually leaks.

18.3 Browser threats (XSS vs CSRF) — why cookies need extra care

If you use cookies for authentication, understand these two common web threats:

Practical takeaway: cookie-based auth needs good cookie flags and CSRF defenses for sensitive actions.

Set-Cookie: session_id=...; HttpOnly; Secure; SameSite=Lax; Path=/;

See also: Authentication/Cookies section for the meaning of HttpOnly, Secure, and SameSite.

18.4 HTTPS/TLS (security + correctness)

HTTPS is not optional for real systems. Without HTTPS, credentials and tokens can be intercepted, and cookies are unsafe (the Secure flag becomes meaningless).

Nginx: redirect HTTP → HTTPS (minimal)

server {
  listen 80;
  server_name example.com;
  return 301 https://$host$request_uri;
}

18.5 Security headers (cheap, high impact)

Security headers reduce browser attack surface. They don’t replace validation/auth, but they harden defaults.

from fastapi import Request

@app.middleware("http")
async def security_headers(request: Request, call_next):
    resp = await call_next(request)
    resp.headers["X-Content-Type-Options"] = "nosniff"
    resp.headers["X-Frame-Options"] = "DENY"
    resp.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"
    # Start simple; CSP needs tuning:
    # resp.headers["Content-Security-Policy"] = "default-src 'self';"
    return resp

18.6 Abuse controls (rate limits + payload limits + timeouts)

Many “attacks” are just resource exhaustion: too many requests, huge bodies, or slow upstream calls. Apply hard limits to preserve availability.

Nginx: cap request body size

server {
  client_max_body_size 5m;
}
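At the application layer, a minimal fixed-window rate limiter sketch using Redis counters (the limit, window, and keying by client IP are assumptions; a token bucket is smoother but longer):

from fastapi import FastAPI, HTTPException, Request
from redis import Redis

app = FastAPI()
r = Redis(decode_responses=True)

LIMIT = 100    # max requests ...
WINDOW_S = 60  # ... per 60-second window

def check_rate_limit(identity: str) -> None:
    key = f"ratelimit:{identity}"
    count = r.incr(key)          # atomic increment
    if count == 1:
        r.expire(key, WINDOW_S)  # first hit starts the window
    if count > LIMIT:
        raise HTTPException(status_code=429, detail="Too Many Requests")

@app.get("/api/data")
def get_data(request: Request):
    check_rate_limit(request.client.host)  # or key by user id / API key
    return {"ok": True}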

18.7 File uploads (the forgotten attack surface)

Uploads create real risk: large payload DoS, zip bombs, malicious file types, path traversal. Safe defaults (a sketch follows):

  • enforce size caps at the proxy and in the app
  • allowlist content types/extensions; reject everything else
  • store files under server-generated random names, outside the web root
  • never trust or reuse the client-supplied filename or path
  • validate/scan files before any further processing
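A minimal sketch of those defaults in FastAPI (the allowlist, cap, and upload directory are assumptions; requires python-multipart, and very large files should be streamed in chunks rather than read fully into memory):

import uuid
from pathlib import Path
from fastapi import FastAPI, HTTPException, UploadFile

app = FastAPI()

ALLOWED_TYPES = {"application/pdf", "image/png", "image/jpeg"}
MAX_BYTES = 5 * 1024 * 1024            # matches the 5m proxy cap above
UPLOAD_DIR = Path("/var/app/uploads")  # outside the web root

@app.post("/upload", status_code=201)
async def upload(file: UploadFile):
    if file.content_type not in ALLOWED_TYPES:
        raise HTTPException(status_code=415, detail="Unsupported media type")
    data = await file.read()           # fine under a small cap
    if len(data) > MAX_BYTES:
        raise HTTPException(status_code=413, detail="File too large")
    # Server-generated name: the client filename is never trusted (path traversal)
    name = f"{uuid.uuid4().hex}.bin"
    (UPLOAD_DIR / name).write_bytes(data)
    return {"stored_as": name}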

18.8 Dependency hygiene (silent killer)

Many real incidents come from outdated dependencies. Pin versions, update regularly, and audit in CI.

pip install pip-audit
pip-audit

18.9 Security checklist (fast revision)

  • no internals in error responses; log details with request IDs
  • secrets in env vars/secret manager, never in Git or logs
  • HTTPS everywhere; secure cookie flags + CSRF defenses
  • security headers set; CORS origins whitelisted
  • rate limits, body-size caps, and timeouts on external calls
  • uploads constrained; dependencies audited in CI

Interview one-liner

“I treat security as safe defaults: strict boundaries (validation/auth), no internal leakage, secrets outside Git/logs, HTTPS everywhere, hardened browser surface (cookies/headers/CSRF), abuse limits (rate/body/timeouts), safe uploads, and dependency hygiene.”

19) Observability

Observability means you can answer "what is happening in production?" without attaching a debugger. The three pillars: structured logs (with request IDs for correlation), metrics (latency percentiles like p95/p99, error rates, queue depth), and traces that follow one request across services. The request-ID and timing middleware from section 10.2 is the minimal starting point; ship those signals somewhere central and alert on them.
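A minimal structured-logging sketch (stdlib only; the field names are assumptions) that pairs with the request-ID middleware from section 10.2:

import json
import logging
import sys
import time

logger = logging.getLogger("app")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(sys.stdout))

def log_request(request_id: str, method: str, path: str, status: int, duration_ms: float) -> None:
    # One JSON object per line: easy to parse, filter, and join on request_id
    logger.info(json.dumps({
        "ts": time.time(),
        "request_id": request_id,
        "method": method,
        "path": path,
        "status": status,
        "duration_ms": round(duration_ms, 2),
    }))

log_request("9f1c0a", "GET", "/tasks", 200, 12.4)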

20) Quick Review Table: 10 backend concepts

This table is a fast revision checklist. For each row, you should be able to explain: (1) what it is, (2) why it matters, (3) one real example.

1) Authentication vs Authorization
   What: AuthN proves identity ("who are you?"); AuthZ enforces permissions ("what can you do?"). HTTP: 401 vs 403.
   Why + tools: prevents unauthorized access and defines the security model. Tools: session cookies, JWT bearer tokens, OAuth2, RBAC/ABAC policies.

2) Rate limiting
   What: bounds requests per identity (IP/user/token) using algorithms like token bucket / leaky bucket.
   Why + tools: protects availability, prevents brute force, stabilizes latency. Tools: Nginx/Cloudflare rate limits, Redis counters, API gateways.

3) Database indexing
   What: DB-managed data structures (often B-trees) that accelerate query lookup and ordering.
   Why + tools: faster reads, but increased write cost + storage; index based on query patterns, not everything. Tools: EXPLAIN query plans, composite indexes.

4) Transactions + ACID
   What: a transaction is an atomic unit of work. ACID: Atomicity, Consistency, Isolation, Durability.
   Why + tools: guarantees correctness under concurrency; prevents partial updates. Tools: DB transactions, isolation levels, row locks, optimistic locking.

5) Caching
   What: stores results to avoid recomputation (space ↔ time tradeoff). Key issues: staleness, invalidation, TTL.
   Why + tools: lower latency and reduced DB/origin load; risk of stale reads and stampedes. Tools: Redis, Nginx cache, CDN cache, HTTP cache (ETag/Cache-Control).

6) Message queues
   What: producer → queue → consumer model for async work; jobs processed by workers with ack/retry semantics.
   Why + tools: handles heavy tasks reliably, decouples services, smooths spikes. Tools: Celery/RQ, Redis/RabbitMQ/SQS, DLQ, idempotency patterns.

7) Load balancing
   What: distributes traffic across instances. Strategies: round-robin, least-connections, hashing, sticky sessions.
   Why + tools: improves availability and throughput; enables horizontal scaling. Tools: Nginx/HAProxy/cloud LB, autoscaling, health checks.

8) CAP theorem
   What: under a network partition, choose between Consistency and Availability; partition tolerance is required.
   Why + tools: guides distributed DB/service design tradeoffs (CP vs AP). Tools: consensus (Raft), eventual consistency, quorum reads/writes.

9) Reverse proxy
   What: the front door for apps: routes requests to upstreams and can terminate TLS, cache, compress, and filter traffic.
   Why + tools: a central place for security + performance controls; improves deployability. Tools: Nginx, Envoy, Traefik (TLS, caching, rate limiting, routing).

10) CDN
   What: a distributed edge network that caches/serves content near users; reduces origin load and latency.
   Why + tools: faster global delivery, better burst handling; set caching rules carefully. Tools: Cloudflare/Akamai/Fastly, cache rules, TTL, purge/invalidation.
30-second drill

For each row: say one definition sentence, one tradeoff sentence, and one tool/example sentence. That’s usually enough to answer most backend interview “concept” questions cleanly.

21) Production basics: Docker Compose + Nginx reverse proxy

A common production setup is: Nginx as a reverse proxy in front of your app container. Nginx terminates HTTP traffic, handles routing, and can add TLS, compression, caching, and rate limiting. Your FastAPI app runs behind it (often with Uvicorn/Gunicorn).

Typical architecture

Client → Nginx (reverse proxy) → FastAPI (app) → DB/Redis

Docker Compose example (FastAPI + Nginx)

This Compose file runs two services: app (FastAPI) and nginx (reverse proxy). Nginx forwards requests to the app using the Docker service name app on port 8000.

version: "3.9"

services:
  app:
    build: .
    container_name: fastapi_app
    expose:
      - "8000"
    environment:
      - ENV=production
    restart: unless-stopped

  nginx:
    image: nginx:1.27-alpine
    container_name: nginx_proxy
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
    depends_on:
      - app
    restart: unless-stopped

Minimal Nginx reverse proxy config

This configuration forwards all requests to the FastAPI app. It also forwards common proxy headers so your app can read the real client IP and scheme (useful for logs, redirects, auth callbacks).

server {
  listen 80;
  server_name _;

  location / {
    proxy_pass http://app:8000;

    proxy_http_version 1.1;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Real-IP $remote_addr;

    # reasonable timeouts for upstream
    proxy_connect_timeout 5s;
    proxy_read_timeout 60s;
    proxy_send_timeout 60s;
  }
}

Production notes (short but essential)

  • expose only Nginx publicly; the app stays on the internal network (expose, not ports)
  • run the app with multiple workers (e.g., Gunicorn managing Uvicorn workers)
  • keep restart policies and add health checks so crashed containers recover automatically

HTTPS hint

In real production you should serve HTTPS. A common pattern is Nginx + Let’s Encrypt (Certbot) or a managed edge proxy. Keep HTTP (80) only for redirecting to HTTPS (443).

22) Data-intensive backends (real-world architectures + technology choices)

A data-intensive backend is a system where the hard part is not CRUD — the hard part is moving + transforming + serving data reliably at scale. These systems fail in different ways: duplicates, out-of-order events, partial writes, overloaded downstreams, long tail latency, and “one bad tenant” issues.

The mental model: Hot path vs Cold path
HOT PATH (user-facing, strict latency)     COLD PATH (heavy, async, reliable)
API → validate → read/cache → respond   |  ingest → transform → index/aggregate → publish

Strong backends keep the hot path boring and predictable, and push heavy work to the cold path.

22.1 Real examples of "data-intensive" systems

  • search: crawl/ingest → clean → index → serve queries
  • analytics/event pipelines: events → aggregate → dashboards
  • document/media processing: upload → transform → derived outputs
  • ML/RAG pipelines: documents → chunk → embed → vector index → retrieval

22.2 Reference architectures

A) Queue-based “job pipeline” (most common prototype)

Client
  → API (FastAPI)
      → Postgres (metadata + job state)
      → Object storage (S3/MinIO/local) for large payloads/files
      → Queue (Redis/RabbitMQ/Kafka)
          → Workers (Celery/RQ/Arq) for heavy processing
      → Cache (Redis) for hot reads + rate limit

B) Streaming/event-driven pipeline (Kafka-style)

Producers → Kafka topics → stream processors (Flink/Spark/ksqlDB)
                        → sinks (ClickHouse/BigQuery/Postgres/Elastic)
                        → API reads optimized stores
The hidden rule

Most teams don’t need Kafka on day 1. Start with queue + workers. Add streaming only when you truly need: huge throughput, event ordering/partitioning, or many downstream consumers.

22.3 Technology choices: what goes where (practical mapping)

Storage selection (quick guide)
  • Postgres/MySQL: metadata, transactions, job states, permissions, audit logs
  • S3/MinIO/local FS: big blobs (PDFs, images, exports, embeddings files)
  • Redis: cache, rate limit, locks, queues (small pipelines)
  • ClickHouse / BigQuery: analytics queries, aggregations, time-series at scale
  • Elasticsearch/OpenSearch: full-text search + filters

22.4 Data modeling for pipelines (the part people miss)

  • separate raw data (object storage) from derived data (DB/index) so you can always reprocess
  • model jobs as an explicit state machine: queued → running → success/failed, plus retry_count and last_error
  • version derived outputs (e.g., doc_id + version) so reprocessing never corrupts current readers
  • carry tenant_id everywhere for isolation and fair scheduling

22.5 Reliability patterns (real production painkillers)

1) Idempotency (avoid duplicates under retry)

Networks fail. Clients retry. Workers retry. Without idempotency, you will duplicate jobs and corrupt derived data.

Idempotency key examples:
- upload_id
- tenant_id + file_sha256
- order_id + operation_type
- doc_id + version + chunk_index

2) Retry policy (transient vs permanent)

Retry transient failures (timeouts, 5xx, broker hiccups) with exponential backoff and a capped attempt count. Do not retry permanent failures (validation errors, malformed input); route those to a dead-letter queue (DLQ) for inspection.

3) Backpressure (systems die without it)

When producers outpace consumers, queues grow without bound until something falls over. Bound queue sizes, shed or reject excess load early (429), cap per-tenant concurrency, and alert on queue depth so you scale workers before the backlog explodes.

4) Outbox pattern (don’t lose events)

If you write to Postgres and also publish to a queue, you can lose one of them on crash. Outbox stores the message in the same DB transaction, and a dispatcher publishes later.

Transaction:
  INSERT job row
  INSERT outbox row (event to publish)
Commit
Dispatcher reads outbox → publishes → marks delivered

22.6 Performance engineering: where time really goes

Intuition: CPU “red light” time

If a request is waiting on DB/network for ~30ms, your CPU can do ~tens of millions of cycles in that time. Concurrency wins by not wasting waiting time, not by “making Python faster”.

22.7 A concrete prototype: “Document Processing Service” (real backend model)

This is a strong interview demo because it includes: file upload, object storage, job queue, worker processing, and polling/streaming status.

Endpoints:
- POST /documents            → upload metadata + get presigned URL (or direct upload)
- POST /documents/{id}/ingest → enqueue processing job (returns 202 + job_id)
- GET  /jobs/{job_id}         → status: queued/running/success/failed
- GET  /documents/{id}        → returns derived outputs (text, index status, etc.)

FastAPI sketch (enqueue + status)

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import uuid
import time

app = FastAPI()

# Pretend stores (replace with Postgres + Redis queue in real code)
JOBS = {}
DOCS = {}

class IngestReq(BaseModel):
    tenant_id: str
    object_key: str  # path in S3/MinIO/local
    idempotency_key: str

@app.post("/documents/ingest", status_code=202)
def ingest(req: IngestReq):
    # Idempotency: return existing job if same key was already used
    for job_id, job in JOBS.items():
        if job["idempotency_key"] == req.idempotency_key:
            return {"job_id": job_id, "status": job["status"]}

    doc_id = str(uuid.uuid4())
    job_id = str(uuid.uuid4())
    DOCS[doc_id] = {"tenant_id": req.tenant_id, "object_key": req.object_key}

    JOBS[job_id] = {
        "doc_id": doc_id,
        "tenant_id": req.tenant_id,
        "status": "queued",
        "created_at": time.time(),
        "idempotency_key": req.idempotency_key,
        "retry_count": 0,
        "last_error": None,
    }

    # In real system: publish job_id into Redis/RabbitMQ/Kafka
    return {"job_id": job_id, "doc_id": doc_id, "status": "queued"}

@app.get("/jobs/{job_id}")
def job_status(job_id: str):
    job = JOBS.get(job_id)
    if not job:
        raise HTTPException(404, "job not found")
    return job

Production upgrade: Postgres for JOBS/DOCS, Redis/RabbitMQ for queue, Celery/RQ workers to process, S3/MinIO for file storage, and structured logs + metrics for visibility.

22.8 Observability (what you log/measure in real data pipelines)

  • structured logs per job: job_id, tenant_id, state transitions, retry counts, last_error
  • latency percentiles (p50/p95/p99) on the hot path
  • queue depth and consumer lag: the earliest warning signal
  • per-stage throughput and failure/DLQ rates

Quick view on data-intensive backend

“I design a hot path with predictable latency and a cold path with queues/workers. I use idempotency + retries + DLQ + backpressure to survive failures, store raw vs derived separately (object store + DB/index), and I measure p95/p99 plus queue depth to keep the system stable under load.”

Final checklist: backend maturity

When you build any feature, ask:

  1. What is the resource + contract?
  2. What validation and invariants must hold?
  3. What authn/authz rules apply?
  4. Where is truth stored (DB)?
  5. How will it scale (stateless + cache + queue)?
  6. How is it tested and deployed?
  7. How do I observe it in production?