Introduction
Most backend resources either stay too high-level (“use FastAPI”) or jump into code without a clean mental model. This post builds backend engineering from first principles — so a motivated non-computer-science learner can follow it, while still using correct CS language and real production patterns.
Who this is for: self-taught developers, scientists transitioning into software, and Python backend interview prep. Examples use FastAPI + Python, but the principles apply to any stack.
How to read: skim the headings once, then re-read with the code blocks and build a tiny demo API as you go.
Every backend topic here fits the request lifecycle: HTTP → routing → parsing → validation → auth → business logic → data layer → response, plus cross-cutting concerns (middleware, caching, security, observability) that wrap the whole pipeline.
What you’ll be able to explain after this:
- Why HTTP is stateless and how apps still store user state (cookies/JWT/sessions)
- How to design clean REST resources, status codes, and pagination
- Validation as a trust boundary (controller → service → repository)
- AuthN vs AuthZ + common security failures (CSRF, token leakage, brute force)
- Caching layers (ETag/CDN/Nginx/Redis) and staleness/invalidation tradeoffs
- Where performance really goes (DB queries, N+1, indexes, timeouts)
- When to use background jobs and queues (Celery/RQ), retries, idempotency
- What “production-ish” looks like (Docker Compose + Nginx reverse proxy)
Table of contents
- 0. First principles: the backend’s 4 jobs
- 1. HTTP fundamentals: requests, responses, and status codes
- 2. REST architecture: constraints and why they matter
- 3. HTTP method semantics: safe, idempotent, retry-friendly
- 4. Resource design: nouns, URLs, and CRUD mapping
- 5. Working with APIs: Postman/Insomnia + failure modes
- 6. Lists done right: pagination, sorting, filtering
- 7. JSON contract: serialization & deserialization
- 8. Validation: the trust boundary (controller → service → DB)
- 9. Access control: authentication vs authorization (cookies/JWT)
- 10. Middleware & CORS: cross-cutting concerns
- 11. Caching layers: HTTP/CDN/proxy/Redis and staleness
- 12. Scaling: vertical vs horizontal (stateless design)
- 13. Performance: measure first, then fix the big costs
- 14. Data layer: ORM, transactions, indexes (FastAPI + SQLAlchemy)
- 15. Background jobs: queues, retries, idempotency
- 16. Testing: pytest, integration tests, contracts
- 17. CI: GitHub Actions basics
- 18. Security essentials: production mindset
- 19. Observability: logs, metrics, tracing
- 20. Production basics: Docker Compose + Nginx reverse proxy
- 21. Quick review: 10 backend concepts (interview drill)
- 22. Data-intensive backends (performance + reliability patterns)
0) First principles: what a backend is
A backend exists to do four fundamental jobs:
- Expose capabilities via a stable interface (usually HTTP APIs).
- Enforce correctness (validation + business rules).
- Control access (authentication + authorization).
- Manage state reliably (databases, caches, queues) and operate under load (performance, scaling, observability).
Everything else (frameworks, ORMs, caches, message queues) is a tool to serve these jobs.
1) The ground: network + HTTP
1.1 Request → Response
HTTP is a message protocol:
- client sends a request (method, path, headers, body)
- server sends a response (status code, headers, body)
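On the wire, one full exchange looks like this (illustrative endpoint and payload):
GET /tasks/123 HTTP/1.1
Host: api.example.com
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json

{"id": 123, "title": "write blog post", "done": false}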
1.2 Methods (verbs)
Common methods:
- GET: read
- POST: create / submit action
- PUT: replace
- PATCH: partial update
- DELETE: delete
- HEAD: like GET but no body
- OPTIONS: capabilities / CORS preflight
1.3 Status codes (API "physics")
Use status codes consistently:
- 200 OK: successful read/update
- 201 Created: successful create
- 202 Accepted: accepted for async processing
- 204 No Content: success with no response body
- 400 Bad Request: invalid request format
- 401 Unauthorized: not authenticated
- 403 Forbidden: authenticated but not allowed
- 404 Not Found: resource not found
- 409 Conflict: duplicates / version conflict
- 422 Unprocessable Entity: validation errors (FastAPI default)
- 429 Too Many Requests: rate limited
- 500 Internal Server Error: unexpected failure
2) What is a REST API?
REST is an architecture style defined by constraints. It's not a library.
2.1 REST = Representation + State + Transfer
- Representation (RE): how the resource is represented (JSON, HTML, XML).
- State (S): current properties of the resource.
- Transfer (T): movement of representation via HTTP (GET/POST/…).
Example:
GET /tasks/123 transfers a JSON representation of task #123.
2.2 REST constraints
- Client–Server separation
- UI logic stays on the client; data + rules stay on the server.
- Uniform interface
- consistent endpoints, methods, status codes, and payload shapes.
- Layered system
- intermediaries (load balancer, gateway, proxy) can exist; each layer interacts with adjacent layer only.
- Cacheable
- responses explicitly declare if caching is allowed and for how long.
- Stateless
- server does not rely on stored client context between requests (unless you choose sessions explicitly).
- Code on demand (optional)
- server may send executable code (e.g., JavaScript) to extend client functionality.
3) Method semantics: safe + idempotent
These properties matter for retries, caching, and correctness.
3.1 Safe
A safe operation should not change server state:
GET, HEAD, OPTIONS
3.2 Idempotent
An operation is idempotent if repeating it yields the same end state:
- GET: idempotent (and safe)
- PUT: idempotent (replace)
- DELETE: idempotent (deleting again → still deleted)
- POST: usually not idempotent (creates a new resource each time)
- PATCH: depends on implementation (often not guaranteed)
Why it matters: if the client retries due to network failure, idempotent methods prevent duplicate side effects.
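A tiny sketch of the difference (in-memory store, illustrative only): retrying the PUT converges to the same end state, while retrying the POST creates duplicates.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
tasks = {}
counter = {"next_id": 0}

class TaskIn(BaseModel):
    title: str

@app.post("/tasks", status_code=201)
def create_task(payload: TaskIn):
    # Not idempotent: every retry allocates a new ID and a new task
    counter["next_id"] += 1
    task = {"id": counter["next_id"], "title": payload.title}
    tasks[task["id"]] = task
    return task

@app.put("/tasks/{task_id}")
def replace_task(task_id: int, payload: TaskIn):
    # Idempotent: sending this twice leaves exactly the same end state
    tasks[task_id] = {"id": task_id, "title": payload.title}
    return tasks[task_id]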
4) Resources: design by nouns
A resource is any noun-like business object:
users, tasks, tags, orders, documents
4.1 Good URL patterns
- collection: /tasks
- item: /tasks/{id}
- nested: /users/{id}/tasks
Keep URLs:
- noun-based (no verbs in path if possible)
- stable
- consistent across the API
4.2 CRUD mapping
- Create: POST /tasks
- Read: GET /tasks, GET /tasks/{id}
- Update (full): PUT /tasks/{id}
- Update (partial): PATCH /tasks/{id}
- Delete: DELETE /tasks/{id}
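A partial sketch of this mapping in FastAPI (in-memory store; create and update follow the same pattern):
from fastapi import FastAPI, HTTPException, Response

app = FastAPI()
tasks = {}

@app.get("/tasks")
def list_tasks():
    return list(tasks.values())

@app.get("/tasks/{task_id}")
def get_task(task_id: int):
    if task_id not in tasks:
        raise HTTPException(status_code=404, detail="Task not found")
    return tasks[task_id]

@app.delete("/tasks/{task_id}", status_code=204)
def delete_task(task_id: int):
    # Idempotent: deleting an already-deleted task still ends in "deleted"
    tasks.pop(task_id, None)
    return Response(status_code=204)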
4.3 Beyond CRUD (actions)
Sometimes you need an action that doesn't map cleanly onto CRUD:
- POST /tasks/{id}/complete
- POST /payments/{id}/refund
Prefer modeling as state change (e.g., done=true) when possible.
5) API interface design in practice (Postman/Insomnia)
Tools like Postman and Insomnia help you:
- test endpoints and payloads
- validate status codes
- keep "collections" as a living contract
A professional habit:
- test success and all failure modes:
- invalid input → 422
- unauthenticated → 401
- unauthorized → 403
- not found → 404
- conflict → 409
6) Pagination + sorting + filtering
6.1 Why pagination is not optional
Without pagination:
- responses become huge
- DB gets overloaded
- UI becomes slow (especially infinite scroll)
6.2 Offset pagination (page + limit)
Query:
GET /tasks?limit=20&page=2&sort=-created_at
Rules:
- limit must have bounds (e.g., 1..100)
- page starts at 1
- provide defaults: limit=20, page=1, sort=-created_at
Pros: easy
Cons: slow/unstable for deep pages, duplicates when data changes
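A sketch of such an endpoint with bounded parameters (db_list_tasks is a hypothetical query helper):
from fastapi import FastAPI, Query

app = FastAPI()

@app.get("/tasks")
def list_tasks(
    limit: int = Query(20, ge=1, le=100),  # default 20, bounded 1..100
    page: int = Query(1, ge=1),            # pages start at 1
    sort: str = Query("-created_at"),
):
    offset = (page - 1) * limit
    # Translates to: SELECT ... ORDER BY created_at DESC LIMIT :limit OFFSET :offset
    return db_list_tasks(limit=limit, offset=offset, sort=sort)  # hypothetical helper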
6.3 Cursor pagination (best for infinite scroll)
Instead of a page number, the client sends a cursor token that encodes the last item seen (typically created_at + id):
GET /tasks?limit=20&cursor=2026-01-20T12:00:00Z|a1b2...
Pros: stable and scalable
Cons: more complex
Cursor pagination usually needs:
- a stable sort: (created_at, id)
- a cursor token: "created_at|id"
Pseudo-implementation
# where tasks are ordered by created_at desc, id desc
# cursor = "2026-01-20T12:00:00Z|<id>"
# Fetch items with (created_at, id) < cursor tuple in same order.
In production you also:
- sign/encrypt the cursor token
- validate token format
- return next_cursor in the response
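A runnable sketch of the keyset query with SQLAlchemy (minimal Task model for illustration; cursor signing/validation deliberately omitted):
from datetime import datetime
from typing import Optional
from sqlalchemy import Column, DateTime, Integer, String, select, tuple_
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Task(Base):
    __tablename__ = "tasks"
    id = Column(Integer, primary_key=True)
    title = Column(String(120), nullable=False)
    created_at = Column(DateTime, nullable=False)

def list_tasks(session: Session, cursor: Optional[str] = None, limit: int = 20):
    # Stable sort: (created_at desc, id desc)
    stmt = select(Task).order_by(Task.created_at.desc(), Task.id.desc()).limit(limit)
    if cursor:
        created_at_s, id_s = cursor.split("|")
        # Keyset condition: rows strictly "after" the cursor in this ordering.
        # Row-value comparison works on Postgres; emulate with an OR expression on engines without it.
        stmt = stmt.where(
            tuple_(Task.created_at, Task.id) < (datetime.fromisoformat(created_at_s), int(id_s))
        )
    rows = session.execute(stmt).scalars().all()
    next_cursor = f"{rows[-1].created_at.isoformat()}|{rows[-1].id}" if rows else None
    return rows, next_cursor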
7) Serialization & deserialization
- Deserialization: request JSON → typed objects
- Serialization: typed objects → response JSON
In FastAPI, Pydantic handles:
- type coercion
- validation
- schema generation (OpenAPI docs)
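A tiny round-trip sketch: Pydantic deserializes and validates the request body into a typed object, and response_model serializes the return value back to JSON.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class TaskIn(BaseModel):    # deserialization target: request JSON → object
    title: str
    done: bool = False

class TaskOut(BaseModel):   # serialization source: object → response JSON
    id: int
    title: str
    done: bool

@app.post("/tasks", response_model=TaskOut, status_code=201)
def create_task(payload: TaskIn):
    # payload is already typed and validated at this point
    return TaskOut(id=1, title=payload.title, done=payload.done)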
8) Validation (trust nothing from the network)
Validation enforces an input contract: structure (schema), types, and constraints. In backend systems it is a trust boundary: every request payload is untrusted until it passes checks at the API edge and domain rules inside the application.
Validation reduces failures caused by unexpected payloads (bugs), inconsistent/partial inputs (corrupted data), and hostile or abusive requests (security, including oversized bodies and injection attempts).
8.1 What to validate (contract + invariants)
- Presence: required fields must exist
- Types: string vs integer vs list/object
- Constraints: min/max length, numeric ranges, allowed enums
- Formats: email, UUID, URL, ISO-8601 datetime
- Normalization: trim whitespace, lowercase emails, canonical forms
- Cross-field rules: e.g., start_date < end_date, min ≤ max
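Cross-field rules can live on the schema itself; here is a sketch of the start_date < end_date rule (Pydantic v2 syntax):
from datetime import date
from pydantic import BaseModel, model_validator

class BookingIn(BaseModel):
    start_date: date
    end_date: date

    @model_validator(mode="after")
    def check_date_order(self):
        # Cross-field invariant: start must come before end
        if self.start_date >= self.end_date:
            raise ValueError("start_date must be before end_date")
        return self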
8.2 Validation vs sanitization vs escaping
- Validation rejects inputs that violate the contract (correctness gate)
- Sanitization transforms inputs into a canonical safer form (trim/normalize)
- Escaping / parameterization prevents injection when input is used in a context (SQL/HTML)
Validation is not a complete injection defense. For SQL use parameterized queries; for HTML/templates use correct escaping. Never concatenate raw user input into SQL.
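For example, with SQLAlchemy's text() the value travels as a bound parameter, separate from the SQL (a sketch; the users table is assumed):
from sqlalchemy import create_engine, text

engine = create_engine("sqlite:///./app.db")

def find_user_by_email(email: str):
    # Bound parameter: the driver sends query and value separately,
    # so user input can never change the query structure.
    with engine.connect() as conn:
        return conn.execute(
            text("SELECT id, email FROM users WHERE email = :email"),
            {"email": email},
        ).first()

# NEVER do this (classic injection vector):
#   text(f"SELECT id, email FROM users WHERE email = '{email}'")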
8.3 Validation across layers (Controller → Service → Repository)
In a layered backend architecture, validation is defense in depth. Each layer validates what it owns:
- Controller (FastAPI route + Pydantic): boundary validation of untrusted network input (schema, types, basic constraints). Invalid payloads typically return 422.
- Service (domain/business rules): semantic validation (uniqueness, state transitions, cross-field domain rules, permission decisions). These map to stable application errors (e.g., 409 Conflict).
- Repository/DB (integrity): final enforcement using constraints and transactions (UNIQUE/NOT NULL/CHECK/FK). This layer prevents race-condition inconsistencies and translates DB exceptions into domain errors.
Controller validation avoids wasting resources on bad input, service validation encodes business meaning, and repository/DB constraints guarantee correctness even under concurrency.
8.4 Error semantics (API behavior)
- 400 Bad Request: malformed JSON / invalid syntax
- 422 Unprocessable Content: valid JSON but fails schema/constraint validation (FastAPI's default)
- 409 Conflict: well-formed request but violates a business invariant (e.g., duplicate unique field)
Good validation errors should be specific, consistent, and safe (do not leak internal stack traces, raw DB errors, or secrets).
8.5 FastAPI example (layered validation with clean error mapping)
This example shows: (1) Pydantic boundary validation, (2) service business checks, (3) repository integrity as a last guardrail.
from fastapi import FastAPI, HTTPException, status
from pydantic import BaseModel, EmailStr, Field
from typing import Dict
app = FastAPI()
# ---------- Domain error ----------
class ConflictError(Exception):
pass
# ---------- 1) Controller schema ----------
class UserCreateIn(BaseModel):
email: EmailStr
name: str = Field(min_length=1, max_length=60)
class UserOut(BaseModel):
id: int
email: EmailStr
name: str
# ---------- 3) Repository (integrity) ----------
class UserRepository:
def __init__(self):
self._users_by_email: Dict[str, dict] = {}
self._id = 0
def email_exists(self, email: str) -> bool:
return email.lower() in self._users_by_email
def insert_user(self, email: str, name: str) -> dict:
# In real DB: UNIQUE(email) enforces this under concurrency.
if self.email_exists(email):
raise ConflictError("email already exists")
self._id += 1
user = {"id": self._id, "email": email.lower(), "name": name}
self._users_by_email[email.lower()] = user
return user
# ---------- 2) Service (business rules) ----------
class UserService:
def __init__(self, repo: UserRepository):
self.repo = repo
def create_user(self, email: str, name: str) -> dict:
# Business invariant: email must be unique
if self.repo.email_exists(email):
raise ConflictError("email already exists")
return self.repo.insert_user(email=email, name=name)
repo = UserRepository()
svc = UserService(repo)
@app.post("/users", response_model=UserOut, status_code=status.HTTP_201_CREATED)
def create_user(payload: UserCreateIn):
try:
return svc.create_user(payload.email, payload.name)
except ConflictError as e:
raise HTTPException(status_code=409, detail=str(e))
8.6 Practical limits (defense against abuse)
Validation also includes resource bounding: even “valid” inputs can be abusive if they are too large, too frequent, or too expensive to process. Limits preserve availability and stable latency.
- Max request size: cap body size to prevent memory pressure and payload DoS
- Rate limiting: protect expensive endpoints (login, search) from brute force and spikes
- Pagination limits: cap page_size (e.g., max 100) to avoid large scans
- Timeouts: apply timeouts to DB calls/external APIs to avoid stuck workers
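As a sketch, the body-size limit as an app-level guard (a reverse proxy should enforce it too, e.g., Nginx's client_max_body_size):
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()
MAX_BODY_BYTES = 1_000_000  # illustrative 1 MB cap

@app.middleware("http")
async def limit_body_size(request: Request, call_next):
    # Cheap early rejection based on the declared Content-Length
    declared = request.headers.get("content-length")
    if declared and declared.isdigit() and int(declared) > MAX_BODY_BYTES:
        return JSONResponse(status_code=413, content={"detail": "Payload too large"})
    return await call_next(request)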
“In FastAPI, I validate request shape at the boundary with Pydantic (422), enforce business invariants in the service, rely on DB constraints in the repository for integrity under concurrency, and apply resource limits (payload size, rate limits, pagination caps, timeouts) to protect availability.”
9) Authentication and Authorization
Authentication answers: Who are you?
Authorization answers: What are you allowed to do?
AuthN (authentication) establishes identity. AuthZ (authorization) enforces permissions on resources. You can be authenticated but not authorized (e.g., logged in but forbidden).
9.0 Typical HTTP status codes
- 401 Unauthorized: not authenticated (missing/invalid credentials)
- 403 Forbidden: authenticated but not authorized
9.1 Typical approaches
1) Session cookie (stateful)
Server stores session state (e.g., in Redis/DB). Client holds a session ID cookie.
- Pros: easy logout/invalidation, good for browsers
- Cons: requires server-side state and storage; scaling needs shared session store
2) JWT Bearer token (stateless)
Client sends Authorization: Bearer <jwt>. JWT contains claims (user id, roles, expiry),
signed by server. No session lookup is required for each request.
- Pros: scalable; works well across services
- Cons: revocation is harder (needs denylist/short expiry); token leakage is serious
3) API keys (simple but limited)
Key identifies the client/application, often used for service-to-service or public APIs.
- Pros: simple to implement
- Cons: weak identity model (often no user context), rotation and leakage risks
9.2 Cookies (short but important)
For browser-based auth, cookies must be configured to reduce XSS/CSRF risks:
- HttpOnly: prevents JavaScript from reading the cookie (mitigates token theft via XSS)
- Secure: cookie is only sent over HTTPS
- SameSite: reduces CSRF by restricting cross-site cookie sending
Cookie example (secure session cookie)
Set-Cookie: session_id=abc123; HttpOnly; Secure; SameSite=Lax; Path=/
If you use cookies for authentication, you must consider CSRF defenses: SameSite, CSRF tokens, and verifying Origin/Referer for sensitive requests.
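Setting that cookie from FastAPI might look like this (sketch; session creation itself omitted):
from fastapi import FastAPI, Response

app = FastAPI()

@app.post("/login")
def login(response: Response):
    session_id = "abc123"  # in reality: a random ID stored server-side (e.g., Redis)
    response.set_cookie(
        key="session_id",
        value=session_id,
        httponly=True,   # not readable by JavaScript (XSS mitigation)
        secure=True,     # sent over HTTPS only
        samesite="lax",  # reduces CSRF exposure
        path="/",
    )
    return {"status": "logged in"}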
9.3 Authorization models (how permissions are expressed)
- RBAC (Role-Based Access Control): roles like admin/editor/viewer
- ABAC (Attribute-Based): policies based on attributes (user, resource, context)
- Resource-based checks: “user can access only their own objects”
9.4 Example: protect endpoint + role check (FastAPI)
from fastapi import FastAPI, Depends, HTTPException, status
app = FastAPI()
def get_current_user():
# verify token/session and return user object
return {"id": "u1", "role": "user"}
def require_admin(user=Depends(get_current_user)):
if user["role"] != "admin":
raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Forbidden")
return user
@app.delete("/admin/users/{user_id}")
def delete_user(user_id: str, admin=Depends(require_admin)):
return {"deleted": user_id}
“Authentication proves identity (401 if missing/invalid). Authorization enforces permissions (403 if not allowed). For browsers I prefer secure session cookies + CSRF defenses; for APIs JWT bearer tokens are common with short expiry and rotation.”
9.5 Why “HTTP is stateless” matters
HTTP is stateless, meaning each request is independent and the server does not automatically remember any client state between requests. Request #2 must contain all information needed to handle it, or the server must be able to look up required context using identifiers provided by the client.
Applications still need state (login sessions, shopping carts). Stateless means the protocol does not preserve that state automatically. State is carried by the client (cookies/tokens) or stored in a server-side database/cache and retrieved per request.
Where state lives in practice
- Client-side: cookies or Authorization headers are sent with every request
- Server-side: session data stored in Redis/DB, fetched by a session ID from the cookie
Stateful vs stateless authentication
- Session cookie (stateful): cookie contains session ID, server loads session from storage
- JWT bearer (stateless): token contains claims, server verifies signature without DB lookup
“HTTP is stateless: every request must be self-contained. We implement user state using cookies or tokens, and if we use sessions, the server retrieves state from a shared store like Redis.”
Diagram: how “stateless HTTP” still supports login state
HTTP is stateless, so the server doesn’t remember you automatically. The client must send context on every request (cookie/token), and the server may fetch state from storage.
STATEFUL AUTH (Session Cookie + Server-side session store)
--------------------------------------------------------
Browser API Server Redis/DB
| POST /login | |
|----------------------->| create session |
| |--------------------------->| SET session:abc = {user_id, roles, ...}
| |<---------------------------|
| Set-Cookie: session_id=abc |
|<-----------------------| |
|
| GET /profile
| Cookie: session_id=abc
|----------------------->| lookup session by ID |
| |--------------------------->| GET session:abc
| |<---------------------------| {user_id, roles, ...}
| | authorize + respond |
|<-----------------------| 200 OK |
STATELESS AUTH (JWT Bearer Token)
---------------------------------
Client API Server
| POST /login |
|---------------------->| issue JWT (signed)
|<----------------------| 200 OK + access_token
|
| GET /profile
| Authorization: Bearer <jwt>
|---------------------->| verify signature + exp
| (no session lookup needed)
|<----------------------| 200 OK
9.6 JWT (Bearer token) + RBAC in FastAPI (minimal example)
This section shows the core idea of JWT-based authentication and role-based authorization (RBAC) in FastAPI. The flow is:
- User logs in → backend verifies credentials → issues a signed JWT
- Client sends Authorization: Bearer <token> on each request
- Backend verifies JWT signature + expiry → extracts identity (sub) and role → enforces permissions
This is intentionally minimal to teach the concept. Real production JWT systems require stronger controls: key rotation (kid/JWKS), issuer/audience validation, refresh tokens, revocation strategy, secure secret management, and careful claim validation.
Install
pip install python-jose[cryptography] passlib[bcrypt]
Conceptual model (claims you care about)
- sub: subject (user identifier)
- role: authorization role (e.g., user/admin)
- exp: expiration time (token lifetime)
JWT verification gives you authentication (who the user is). Role checks implement
authorization (what the user is allowed to do). You will usually return 401 for invalid/missing tokens
and 403 for “authenticated but not allowed.”
Minimal FastAPI JWT + RBAC code
The example below includes:
(1) password verification (bcrypt),
(2) token issuance (/login),
(3) dependency that extracts the current user from the Bearer token,
(4) an admin-only endpoint.
from datetime import datetime, timedelta, timezone
from typing import Optional, Dict
from fastapi import FastAPI, Depends, HTTPException, status
from fastapi.security import OAuth2PasswordBearer, OAuth2PasswordRequestForm
from jose import jwt, JWTError
from passlib.context import CryptContext
from pydantic import BaseModel
app = FastAPI()
# -----------------------------
# Minimal config (DO NOT hardcode secrets in production)
# -----------------------------
SECRET_KEY = "change-me-in-production"
ALGORITHM = "HS256"
ACCESS_TOKEN_EXPIRE_MINUTES = 30
pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="login")
# -----------------------------
# Fake user store (replace with DB)
# -----------------------------
# In a real app: store password hashes, not plain passwords.
# Here we hash at startup for demo clarity.
fake_users_db: Dict[str, dict] = {
"alice": {"username": "alice", "role": "user", "password_hash": pwd_context.hash("alicepass")},
"admin": {"username": "admin", "role": "admin", "password_hash": pwd_context.hash("adminpass")},
}
class TokenOut(BaseModel):
access_token: str
token_type: str = "bearer"
class User(BaseModel):
username: str
role: str
# -----------------------------
# Helpers
# -----------------------------
def verify_password(plain_password: str, password_hash: str) -> bool:
return pwd_context.verify(plain_password, password_hash)
def authenticate_user(username: str, password: str) -> Optional[User]:
record = fake_users_db.get(username)
if not record:
return None
if not verify_password(password, record["password_hash"]):
return None
return User(username=record["username"], role=record["role"])
def create_access_token(*, sub: str, role: str, expires_minutes: int) -> str:
now = datetime.now(timezone.utc)
payload = {
"sub": sub,
"role": role,
"iat": int(now.timestamp()),
"exp": int((now + timedelta(minutes=expires_minutes)).timestamp()),
}
return jwt.encode(payload, SECRET_KEY, algorithm=ALGORITHM)
# -----------------------------
# Auth dependency: parse and validate JWT
# -----------------------------
def get_current_user(token: str = Depends(oauth2_scheme)) -> User:
cred_error = HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid authentication credentials",
headers={"WWW-Authenticate": "Bearer"},
)
try:
payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
username: str = payload.get("sub")
role: str = payload.get("role")
if not username or not role:
raise cred_error
except JWTError:
raise cred_error
# Optional: verify user still exists (common in production)
if username not in fake_users_db:
raise cred_error
return User(username=username, role=role)
# -----------------------------
# Authorization dependency: RBAC
# -----------------------------
def require_admin(user: User = Depends(get_current_user)) -> User:
if user.role != "admin":
raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Forbidden")
return user
# -----------------------------
# Routes
# -----------------------------
@app.post("/login", response_model=TokenOut)
def login(form: OAuth2PasswordRequestForm = Depends()):
user = authenticate_user(form.username, form.password)
if not user:
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Incorrect username or password")
token = create_access_token(sub=user.username, role=user.role, expires_minutes=ACCESS_TOKEN_EXPIRE_MINUTES)
return TokenOut(access_token=token)
@app.get("/me")
def read_me(user: User = Depends(get_current_user)):
return {"username": user.username, "role": user.role}
@app.get("/admin/metrics")
def admin_metrics(admin: User = Depends(require_admin)):
return {"ok": True, "message": f"Hello {admin.username}, you are an admin."}
How to test quickly (curl)
# 1) login to get JWT
curl -X POST http://localhost:8000/login \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "username=admin&password=adminpass"
# 2) use the token on protected endpoint
curl http://localhost:8000/admin/metrics \
-H "Authorization: Bearer <PASTE_TOKEN_HERE>"
Minimum production checklist
- Validate claims: check exp (and in production also iss and aud)
- Key management: rotate keys; consider a kid header + JWKS for multiple keys
- Token lifetime: short-lived access tokens; refresh tokens for longer sessions
- Revocation strategy: denylist or session store for “logout everywhere”
- Secure transport: HTTPS everywhere; never log tokens
- RBAC vs ABAC: roles are simple; attribute/policy checks may be needed for fine-grained control
“JWT gives stateless authentication: verify signature + expiry, extract sub and claims.
Authorization is enforced separately (RBAC dependency). Invalid/missing token → 401; insufficient role → 403.”
10) Middleware & CORS (cross-cutting concerns)
Some backend problems are not “business logic.” They are concerns that apply to every request: logging, timing, auth, security headers, compression, request IDs, CORS, etc. Instead of repeating the same code in every endpoint, backends use middleware.
Middleware is code that runs around your endpoints:
before the request reaches the route handler and/or after the handler returns a response.
Think of it as a pipeline: request → middleware chain → route handler → middleware chain → response.
10.1 Why middleware exists (real-world reasons)
- Consistency: apply headers/logging/auth rules uniformly
- Observability: add request IDs, timing, metrics
- Security: add security headers, block oversized bodies, enforce HTTPS behind proxy
- Performance: caching headers, compression, rate limiting (often at proxy)
10.2 FastAPI middleware example: request ID + timing
This adds a correlation ID (useful for logs) and exposes response time. In production you’d also log it (or send to tracing/metrics).
import time, uuid
from fastapi import FastAPI, Request
app = FastAPI()
@app.middleware("http")
async def add_request_id_and_timing(request: Request, call_next):
request_id = request.headers.get("X-Request-ID") or str(uuid.uuid4())
start = time.perf_counter()
response = await call_next(request)
duration_ms = (time.perf_counter() - start) * 1000
response.headers["X-Request-ID"] = request_id
response.headers["X-Response-Time-ms"] = f"{duration_ms:.2f}"
return response
When debugging production: request ID + structured logs can reduce “guessing time” massively.
10.3 CORS: what it actually is (and what it is NOT)
CORS (Cross-Origin Resource Sharing) is a browser security rule. It controls whether a web page running on one origin (domain) is allowed to call APIs on another origin.
- Origin = scheme + host + port (e.g., https://app.com)
- If your frontend is on http://localhost:3000 and your API on http://localhost:8000, that is cross-origin.
CORS is not authentication and not a server security boundary. It only restricts what browsers allow. Non-browser clients (curl, Postman) can call your API regardless of CORS. You still need AuthN/AuthZ on the server.
10.4 Preflight (OPTIONS): why the browser sends it
For some requests, the browser first sends a preflight request (OPTIONS /endpoint) to ask the server which methods and headers are allowed. This happens for "non-simple" requests: custom headers like Authorization, methods like PUT/DELETE, or a Content-Type such as application/json.
Typical flow (browser):
1) OPTIONS /api/secure
Origin: http://localhost:3000
Access-Control-Request-Method: GET
Access-Control-Request-Headers: Authorization
2) Server replies with:
Access-Control-Allow-Origin: http://localhost:3000
Access-Control-Allow-Methods: GET
Access-Control-Allow-Headers: Authorization
3) Browser then sends the real GET request
10.5 FastAPI CORS configuration (recommended patterns)
If you control the frontend origins, whitelist them explicitly. Avoid wildcard * in production.
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
app = FastAPI()
ALLOWED_ORIGINS = [
"http://localhost:3000",
"https://www.janmajay.de",
]
app.add_middleware(
CORSMiddleware,
allow_origins=ALLOWED_ORIGINS,
allow_credentials=True, # needed if you use cookies
allow_methods=["GET", "POST", "PUT", "PATCH", "DELETE", "OPTIONS"],
allow_headers=["Authorization", "Content-Type", "X-Request-ID"],
)
10.6 Cookies + CORS (the part that breaks people)
If you use cookie-based auth across origins, you must set:
- allow_credentials=True in the CORS middleware
- SameSite=None; Secure on the cookie (requires HTTPS)
- Frontend must send credentials (fetch: credentials: "include")
fetch("https://api.example.com/me", {
method: "GET",
credentials: "include"
});
Cookies across origins raise CSRF risk. If you use cookies for auth, use SameSite + CSRF protections for sensitive actions.
10.7 Practical CORS rules (safe defaults)
- Whitelist exact origins (don't use * in production)
- Only allow the headers you need (especially Authorization)
- Don't confuse CORS with security: AuthN/AuthZ are still required
- Handle preflight (OPTIONS) or your frontend will "mysteriously fail"
“Middleware handles cross-cutting concerns like logging, timing, headers, and auth uniformly. CORS is a browser policy for cross-origin calls; it’s not auth. In production I whitelist origins and handle preflight correctly.”
11) Caching (speed by remembering)
Caching is a performance technique where we store the result of an expensive operation (DB query, API call, computation) so repeated requests can reuse it instead of recomputing. In CS terms, caching trades space (memory/storage) for time (lower latency) and reduces load on upstream systems.
Caching can exist at many layers (each with different scope and consistency guarantees):
- Browser cache (HTTP caching, client-side)
- CDN cache (edge caching, near end users)
- Reverse proxy cache (Nginx/Varnish in front of your app)
- Application cache (Redis/Memcached, app-controlled)
- DB indexes (not a cache; query-acceleration structures inside the DB engine)
Important: do not cache everything. Caching introduces the risk of stale data. You must define a freshness policy (e.g., TTL), invalidation strategy, or revalidation mechanism.
11.0 Cache vocabulary: hit, miss, TTL
- Cache hit: data exists in cache → fast response
- Cache miss: not in cache → fetch from origin/DB → store → return
- TTL (Time-To-Live): expiry time for a cached entry (limits staleness)
11.1 HTTP caching with ETag (best for GET)
HTTP caching is especially effective for GET endpoints and static resources.
One robust strategy is revalidation using ETag.
Idea:
- server responds with an ETag representing the current resource version (often a hash)
- client later sends If-None-Match with that ETag
- server returns 304 Not Modified if unchanged (no body, saves bandwidth)
- server returns 200 with new content + a new ETag if changed
Combine ETag with Cache-Control for explicit freshness:
Cache-Control: public, max-age=60 means the response can be reused for 60 seconds before revalidation.
FastAPI example (ETag + Cache-Control):
from fastapi import FastAPI, Request, Response
import hashlib, json
app = FastAPI()
@app.get("/api/config")
def get_config(request: Request, response: Response):
payload = {"featureA": True, "version": 3}
body = json.dumps(payload, separators=(",", ":")).encode()
etag = hashlib.sha256(body).hexdigest()
    if request.headers.get("if-none-match") == etag:
        # A 304 must have an empty body, so return a bare Response
        return Response(status_code=304)
response.headers["ETag"] = etag
response.headers["Cache-Control"] = "public, max-age=60"
return payload
11.2 CDN caching (edge cache, closest to the user)
A CDN caches responses at edge locations near users. It reduces latency and load on your origin server. CDNs work best for static assets and cacheable public GET responses.
- High impact for global audiences (lower round-trip time)
- Use TTL and cache rules carefully
- Avoid caching private/user-specific responses as public
11.3 Reverse proxy caching with Nginx (cache in front of the app)
A reverse proxy (e.g., Nginx) can cache upstream responses so your app/DB does not get hit for repeated requests. This is useful for public GET endpoints and for absorbing traffic bursts.
Minimal Nginx proxy cache example:
# inside http { ... }
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=api_cache:10m
max_size=1g inactive=60m use_temp_path=off;
server {
listen 80;
server_name example.com;
location /api/ {
proxy_pass http://127.0.0.1:8000;
proxy_cache api_cache;
proxy_cache_key "$scheme$request_method$host$request_uri";
# cache only successful responses
proxy_cache_valid 200 10m;
proxy_cache_valid 404 1m;
# do not cache when auth/cookies exist (safety rule)
proxy_no_cache $http_authorization $http_cookie;
proxy_cache_bypass $http_authorization $http_cookie;
add_header X-Cache-Status $upstream_cache_status always;
}
}
Debug tip: the first request usually shows X-Cache-Status: MISS, the next shows HIT.
11.4 Application caching with Redis (cache-aside / lazy loading)
Redis is commonly used as an application cache because it is fast, supports TTL, and provides atomic operations. A standard approach is cache-aside:
- read from cache
- if miss → read from DB
- store result in cache (with TTL)
- return result
Python + Redis example (cache-aside):
import json
from redis import Redis
r = Redis(host="localhost", port=6379, decode_responses=True)
def get_user_profile(user_id: str) -> dict:
key = f"user:profile:{user_id}"
cached = r.get(key)
if cached is not None:
return json.loads(cached)
# expensive operation (DB query)
profile = db_fetch_user_profile(user_id)
# TTL limits staleness
r.set(key, json.dumps(profile), ex=60)
return profile
Consistency note: TTL-based caching may serve stale data for up to TTL seconds. For stronger consistency, invalidate the relevant cache keys on writes/updates.
11.5 Cache stampede (thundering herd) + mitigation
A cache stampede occurs when many requests miss simultaneously (e.g., popular key expires), causing a burst of DB load. Common mitigations:
- single-flight locking: only one request recomputes, others wait and reuse
- TTL jitter: add small random noise to TTL to avoid synchronized expirations
- stale-while-revalidate: serve slightly stale data while refreshing in the background
Best-effort Redis lock example (single-flight + TTL jitter):
import json, random, time
from redis import Redis
r = Redis(decode_responses=True)
def get_with_lock(key: str, ttl_s: int, compute_fn):
cached = r.get(key)
if cached is not None:
return json.loads(cached)
lock_key = key + ":lock"
got_lock = r.set(lock_key, "1", nx=True, ex=10) # lock auto-expires
if got_lock:
try:
value = compute_fn()
jitter = random.randint(0, 10)
r.set(key, json.dumps(value), ex=ttl_s + jitter)
return value
finally:
r.delete(lock_key)
# someone else recomputing: wait briefly and retry
for _ in range(5):
time.sleep(0.05)
cached2 = r.get(key)
if cached2 is not None:
return json.loads(cached2)
# fallback policy choice
return compute_fn()
11.6 DB indexes are not cache (but essential for performance)
A DB index is a data structure (e.g., B-tree) maintained by the database to accelerate queries. It is not a cache because it is part of the DB engine’s storage and changes query complexity (often from scan to logarithmic lookup). Indexes improve read performance but usually increase write cost and storage usage.
12) Scaling: vertical vs horizontal
12.1 Vertical scaling (scale up)
You increase resources on one machine:
- more CPU
- more RAM
- faster disk
Example
- A single VM: upgrade from 2 CPU / 4GB RAM → 8 CPU / 16GB RAM
Pros: simple
Cons: hard limit; single point of failure
12.2 Horizontal scaling (scale out)
You run multiple replicas of your service:
- 2, 4, 10 backend instances
- a load balancer distributes requests
Pros: scalable + resilient
Cons: requires stateless design + shared state in DB/cache
12.3 Horizontal scaling example with Docker + Nginx load balancing
docker-compose.yml
services:
api:
build: .
deploy:
replicas: 3 # (works in swarm; for local dev use multiple services or docker compose scale)
environment:
- DATABASE_URL=postgresql://postgres:postgres@db:5432/app
depends_on:
- db
nginx:
image: nginx:alpine
ports:
- "8080:80"
volumes:
- ./nginx.conf:/etc/nginx/nginx.conf:ro
depends_on:
- api
db:
image: postgres:16
environment:
POSTGRES_PASSWORD: postgres
POSTGRES_DB: app
nginx.conf
events {}
http {
upstream api_upstream {
# in real setups, you'd list service DNS names or use service discovery
# Example conceptually:
server api:8000;
}
server {
listen 80;
location / {
proxy_pass http://api_upstream;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
}
}
12.4 Concurrency vs Parallelism (backend mental model)
Many beginners confuse concurrency with parallelism. Backends care about both — but for different reasons.
- Concurrency: handling multiple requests in overlapping time (good for I/O waits).
- Parallelism: doing multiple computations at the same time (needs multiple CPU cores).
Why backends are mostly “I/O bound”
A typical request spends most time waiting on database, network calls, or disk, not executing Python code. While you wait, concurrency lets you serve other requests.
Request timeline (typical)
--------------------------
parse+validate: 2ms
DB query: 120ms (waiting)
serialize: 3ms
total: 125ms
Main lesson: DB/network waiting dominates.
3 execution models used in real backends
- 1) Thread-per-request (classic): simple mental model; good for blocking I/O; too many threads can hurt.
- 2) Async event loop (async/await): one thread can juggle many in-flight I/O waits efficiently.
- 3) Multi-process workers: uses multiple CPU cores; good isolation; common with Gunicorn.
Async improves concurrency for I/O. It does not make CPU-heavy work faster. CPU-heavy work needs parallelism (multiple processes) or a background job queue.
Async vs Background jobs (common confusion)
- async/await: “I can serve other requests while waiting for DB/HTTP.”
- background jobs: “This task should not run in the request path at all.”
Use async when: waiting on DB, waiting on HTTP, waiting on Redis
Use background jobs: PDF processing, video conversion, ML inference, embeddings, long pipelines
Python reality check: GIL (one sentence only)
In CPython, CPU-bound Python code does not run truly in parallel in threads due to the GIL. For CPU-heavy work, prefer multiple processes or move work to background workers.
Concurrency “tools” backends use
- Connection pools (DB): limit concurrent DB connections (prevents overload)
- Timeouts: don’t let requests hang forever
- Backpressure: reject/queue work when overloaded
- Rate limiting: protects scarce resources
“Concurrency is overlapping requests (great for I/O waits); parallelism is true simultaneous execution (CPU cores). Async helps I/O-bound endpoints; CPU-heavy tasks go to background workers or multi-process scaling.”
Imagine a request takes 30ms total, but only 3ms is actual CPU work. The other 27ms is usually waiting on the database/network.
A typical CPU runs around 3–4 GHz. At 3.5 GHz, in 30ms a single core has about 105 million CPU cycles available — and your handler might use only a small fraction of them. In a synchronous/blocking design, your thread just sits there waiting.
This is why concurrency matters: while one request waits on I/O, the server can make progress on other requests instead of wasting time.
12.4.1 FastAPI example: concurrency with async I/O (DB/HTTP waiting)
Below, both endpoints do the same thing: call an external API and return the result. The async version can keep handling other requests while waiting on the network. The blocking version ties up a worker while it waits.
Async only helps if the work is truly I/O wait and the libraries are async-friendly.
If you call blocking code inside an async def, you can still block the event loop.
from fastapi import FastAPI
import time
import httpx
import requests
app = FastAPI()
# ---------------------------
# BAD for high concurrency (blocking I/O)
# ---------------------------
@app.get("/blocking-weather")
def blocking_weather():
# This blocks the worker while waiting on the network.
r = requests.get("https://httpbin.org/delay/1", timeout=3)
return {"status": r.status_code}
# ---------------------------
# GOOD for high concurrency (async I/O)
# ---------------------------
@app.get("/async-weather")
async def async_weather():
# This yields control while waiting, so the server can handle other requests.
async with httpx.AsyncClient(timeout=3.0) as client:
r = await client.get("https://httpbin.org/delay/1")
return {"status": r.status_code}
Practical rule: If your endpoint spends time waiting (DB/HTTP/Redis), prefer async I/O libraries.
12.4.2 FastAPI example: parallelism for CPU-heavy work (process pool)
For CPU-heavy work (hashing, image processing, ML inference), async does not help. You need parallelism using multiple CPU cores. A simple pattern is to offload CPU work to a process pool.
CPython threads are limited for CPU-bound code by the GIL. A ProcessPool uses multiple OS processes → true parallel CPU execution across cores.
from fastapi import FastAPI
from concurrent.futures import ProcessPoolExecutor
import hashlib
app = FastAPI()
# A global pool (one per app process)
cpu_pool = ProcessPoolExecutor(max_workers=4)
def heavy_cpu_task(n: int) -> str:
# Artificial CPU work: repeated hashing
x = b"hello"
for _ in range(n):
x = hashlib.sha256(x).digest()
return x.hex()
@app.get("/cpu-sync")
def cpu_sync(n: int = 200_000):
# This blocks the worker CPU (bad under load)
out = heavy_cpu_task(n)
return {"result": out[:16]}
@app.get("/cpu-parallel")
async def cpu_parallel(n: int = 200_000):
# Offload CPU work to another process (parallelism)
import asyncio
loop = asyncio.get_running_loop()
out = await loop.run_in_executor(cpu_pool, heavy_cpu_task, n)
return {"result": out[:16]}
For real systems, CPU-heavy work is often better as a background job (Celery/RQ), especially if it may take seconds+ or needs retries. Use process pools for “medium” CPU tasks that must return quickly.
12.4.3 One clean decision table
| Problem type | Best tool | Why |
|---|---|---|
| I/O wait (DB/HTTP/Redis) | async/await + async libs | Free the server while waiting |
| CPU heavy (hashing, ML, image/PDF) | multi-process / process pool / job queue | Use multiple cores (true parallelism) |
| Long-running pipeline (seconds-minutes) | background jobs (Celery/RQ) | Durable + retries + doesn’t block requests |
“Async increases concurrency for I/O-bound endpoints by letting the server do other work while waiting. CPU-heavy work needs parallelism (processes) or background workers — async won’t make CPU faster.”
In horizontal scaling, your API must be stateless (or store session state in Redis / DB).
13) Performance: what matters most
Backend performance is primarily about latency (time per request) and throughput (requests per second). In practice, most slow backends are not slow because of Python itself — they are slow because the request path spends time waiting on I/O (database, network) or doing too much work per request.
A request handler is a pipeline: parse → validate → query/compute → respond. Performance work is about finding the dominant cost in that pipeline and reducing it.
13.1 The backend performance hierarchy (typical bottlenecks)
The following “hierarchy” is a useful rule-of-thumb: when an endpoint is slow, these are usually the reasons, in roughly decreasing frequency.
- Database queries dominate latency: poor queries, missing indexes, large result sets, and N+1 query patterns often outweigh everything else.
- External API calls dominate latency: network round trips and third-party services introduce unpredictable latency and failures.
- CPU-heavy work blocks worker threads/processes: serialization, large JSON transformations, PDF/image processing, or ML inference can saturate CPU and reduce throughput.
Optimize the biggest wait first: if you spend 300ms in the DB and 10ms in Python code, optimizing the Python part won’t move the needle.
13.2 Measure first: where time actually goes
Performance tuning without measurement is guessing. The minimal professional approach:
- Add timing logs around DB calls and external HTTP calls.
- Inspect query plans (e.g., EXPLAIN) for slow database queries.
- Track percentiles: p50 vs p95 vs p99 latency (tail latency matters in production).
FastAPI example: timing middleware (quick visibility)
import time
from fastapi import FastAPI, Request
app = FastAPI()
@app.middleware("http")
async def timing_middleware(request: Request, call_next):
start = time.perf_counter()
resp = await call_next(request)
duration_ms = (time.perf_counter() - start) * 1000
resp.headers["X-Response-Time-ms"] = f"{duration_ms:.2f}"
return resp
13.3 Practical rules (high-impact improvements)
1) Paginate lists (never return unbounded collections)
Returning “all rows” is a common performance and memory failure. Pagination bounds work per request and improves perceived performance. Prefer cursor-based pagination for large datasets; offset pagination is simpler but slows down at high offsets.
from fastapi import Query
@app.get("/items")
def list_items(limit: int = Query(20, ge=1, le=100), offset: int = Query(0, ge=0)):
# SELECT ... LIMIT :limit OFFSET :offset
return db_list_items(limit=limit, offset=offset)
2) Index columns used in filters/sorts
Indexes speed up lookups and sorting, but cost extra work on writes. Index columns that appear frequently in WHERE, JOIN, and ORDER BY clauses. Verify with query plans rather than guessing.
More indexes → faster reads, slower writes, more storage. Use indexes based on real query patterns.
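As a sketch, a composite index declared in SQLAlchemy to match a common filter + sort pattern (hypothetical Task model):
from sqlalchemy import Column, DateTime, Index, Integer
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Task(Base):
    __tablename__ = "tasks"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, nullable=False)
    created_at = Column(DateTime, nullable=False)

    # Matches: SELECT ... WHERE user_id = ? ORDER BY created_at DESC
    __table_args__ = (
        Index("ix_tasks_user_id_created_at", "user_id", "created_at"),
    )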
3) Avoid N+1 queries
The N+1 problem happens when you fetch a list (1 query), then for each item fetch related data (N queries). It is common with ORMs if relationships are lazily loaded. Fix it by using joins, eager loading, or batch queries.
Example pattern:
Bad:
1 query: fetch 100 posts
100 queries: fetch author for each post
Total: 101 queries (slow)
Good:
1 query: fetch posts + authors (JOIN / eager load)
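With SQLAlchemy, eager loading turns that pattern into a bounded number of queries. A sketch (illustrative Post/Author models):
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base, relationship, selectinload

Base = declarative_base()

class Author(Base):
    __tablename__ = "authors"
    id = Column(Integer, primary_key=True)
    name = Column(String(80), nullable=False)

class Post(Base):
    __tablename__ = "posts"
    id = Column(Integer, primary_key=True)
    title = Column(String(120), nullable=False)
    author_id = Column(Integer, ForeignKey("authors.id"), nullable=False)
    author = relationship(Author)

engine = create_engine("sqlite:///./app.db")

with Session(engine) as session:
    # selectinload fetches all needed authors in one extra query: 2 queries total
    stmt = select(Post).options(selectinload(Post.author)).limit(100)
    posts = session.execute(stmt).scalars().all()
    author_names = [p.author.name for p in posts]  # no lazy queries triggered here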
4) Cache expensive reads (but handle staleness)
If the same expensive data is requested repeatedly, cache it (Redis, Nginx cache, HTTP caching). Use TTL to limit staleness and consider invalidation on writes for critical correctness.
import json
from redis import Redis
r = Redis(decode_responses=True)
def get_stats():
key = "stats:v1"
cached = r.get(key)
if cached:
return json.loads(cached)
data = db_compute_stats() # expensive query/aggregation
r.set(key, json.dumps(data), ex=30) # cache for 30s
return data
5) Use async for I/O waits, not for CPU-heavy work
async/await improves concurrency when your handler spends time waiting on I/O (HTTP calls, DB calls).
It does not make CPU-heavy code faster. CPU-heavy tasks should be moved to:
background workers (Celery/RQ), or optimized with native libraries, or parallelized safely.
Async helps when you wait on the network; background jobs help when you burn CPU.
6) Add timeouts for external calls (performance + reliability)
External services can be slow or hang. Always set timeouts, and consider retries with backoff for transient failures. Without timeouts, slow dependencies can saturate workers and cascade into outages.
import httpx
def fetch_user_from_partner(user_id: str):
with httpx.Client(timeout=3.0) as client:
r = client.get(f"https://partner.example/api/users/{user_id}")
r.raise_for_status()
return r.json()
13.4 A realistic performance scenario (end-to-end)
Suppose GET /orders is slow. A typical optimization workflow:
- Measure: log DB time + external call time + total time (p50/p95).
- Fix query shape: avoid selecting unused columns, limit result size, paginate.
- Add/adjust indexes: on user_id, created_at if used in filters/sorts.
- Remove N+1: join related tables or eager load.
- Cache: cache expensive aggregates (e.g., summary totals) with TTL.
- Protect dependencies: add timeouts/retries for external services.
“Most backend latency is DB + network. I measure first (percentiles), then fix query patterns (pagination, indexes, avoid N+1), cache expensive reads, use async for I/O waits, and always set timeouts on external calls.”
14) Data layer: ORM design (FastAPI + SQLAlchemy)
An ORM (Object–Relational Mapper) is a programming abstraction that maps relational database tables (rows/columns) to language objects (Python classes/instances). Instead of writing raw SQL for every operation, you work with objects and relations, and the ORM generates SQL and tracks changes for you.
Table users ↔ Python class User
Row in users ↔ instance of User
Column email ↔ attribute User.email
14.1 Why ORMs are used (benefits)
- Productivity: CRUD operations become concise and less error-prone
- Maintainability: domain model lives in code (types, relationships, constraints)
- Portability: same ORM code can target SQLite/Postgres/MySQL (with caveats)
- Safety: parameterized queries by default reduce SQL injection risk
- Transactions: ORMs integrate well with unit-of-work + session patterns
14.2 What an ORM does under the hood (unit of work + identity map)
Mature ORMs (including SQLAlchemy ORM) implement two key ideas:
- Identity map: within a session, each DB row is represented by a single Python object. If you query the same row twice, you usually get the same object instance (consistency inside the session).
- Unit of work: the ORM tracks changes you make to objects and flushes them as SQL (INSERT/UPDATE/DELETE) on commit().
ORMs are not “free performance.” You still must understand SQL, indexes, and query patterns (especially to avoid N+1 queries and accidental full-table scans).
14.3 A small theoretical example: objects vs tables
Suppose you have a relational table:
CREATE TABLE posts (
id INTEGER PRIMARY KEY,
title TEXT NOT NULL,
created_at TIMESTAMP NOT NULL,
updated_at TIMESTAMP NOT NULL
);
In an ORM, you represent this table as a class. The ORM maps class attributes to columns and generates SQL for you.
When you create an object and commit, the ORM emits an INSERT. When you modify an attribute and commit,
it emits an UPDATE.
14.4 created_at / updated_at (timestamps for auditing)
In production systems, created_at and updated_at are common auditing fields:
- created_at: time the row was created (immutable)
- updated_at: time the row was last modified (changes on update)
14.5 Minimal FastAPI + SQLAlchemy ORM stack (SQLite demo)
This is a realistic minimal stack:
- SQLAlchemy ORM for mapping classes ↔ tables
- SQLite for demo (swap to Postgres in production)
SQLite is great for demos and local dev. In production, Postgres is preferred for concurrency, robustness, and advanced indexing/features. The ORM layer remains similar, but performance and operational behavior differ.
14.6 Minimal code example (model + session + sorting)
The code below shows: (1) an ORM model, (2) automatic timestamps, and (3) sorting by created_at.
from datetime import datetime
from fastapi import FastAPI, Depends, Query
from sqlalchemy import create_engine, Column, Integer, String, DateTime, select, desc, asc
from sqlalchemy.orm import declarative_base, sessionmaker, Session
DATABASE_URL = "sqlite:///./app.db"
engine = create_engine(
DATABASE_URL,
connect_args={"check_same_thread": False} # needed for SQLite + threads
)
SessionLocal = sessionmaker(bind=engine, autocommit=False, autoflush=False)
Base = declarative_base()
class Post(Base):
__tablename__ = "posts"
id = Column(Integer, primary_key=True, index=True)
title = Column(String(120), nullable=False)
created_at = Column(DateTime, nullable=False, default=datetime.utcnow)
updated_at = Column(DateTime, nullable=False, default=datetime.utcnow, onupdate=datetime.utcnow)
Base.metadata.create_all(bind=engine)
def get_db():
db = SessionLocal()
try:
yield db
finally:
db.close()
app = FastAPI()
@app.post("/posts")
def create_post(title: str, db: Session = Depends(get_db)):
post = Post(title=title)
db.add(post)
db.commit()
db.refresh(post)
return {"id": post.id, "title": post.title, "created_at": post.created_at}
@app.get("/posts")
def list_posts(
sort: str = Query("desc", pattern="^(asc|desc)$"),
db: Session = Depends(get_db)
):
order = desc(Post.created_at) if sort == "desc" else asc(Post.created_at)
posts = db.execute(select(Post).order_by(order).limit(50)).scalars().all()
return [{"id": p.id, "title": p.title, "created_at": p.created_at, "updated_at": p.updated_at} for p in posts]
14.7 Common ORM pitfalls (fast interview checklist)
- N+1 queries: fetching relationships in a loop; fix with joins/eager loading
- Unbounded queries: missing pagination/limits
- Missing indexes: slow filters/sorts without indexes (verify with query plans)
- Session misuse: long-lived sessions or leaking sessions across requests
“An ORM maps tables to objects and uses a session (identity map + unit of work) to generate SQL and manage transactions. It improves productivity, but you still need SQL awareness to avoid N+1 queries and slow scans.”
15) Background jobs (RQ / Celery) for heavy tasks
A background job is work that runs outside the HTTP request–response lifecycle. The API handler enqueues a task and returns quickly; the heavy/slow part is executed by a separate worker process (often on another machine). This design increases reliability and throughput for real-world systems.
Background jobs are tasks executed asynchronously after the API response, typically via a queue (Redis/RabbitMQ/SQS) and workers that consume tasks.
15.1 Why background jobs exist
- Keep API fast: return response quickly (low latency)
- Prevent timeouts: avoid long blocking operations inside web workers
- Improve throughput: free request handlers to serve more traffic
- Enable retries safely: transient failures can be retried with backoff
- Isolate resources: heavy CPU/RAM work runs in worker pool, not API processes
15.2 Typical use cases
- Email/SMS: verification email after signup, password reset
- RAG pipelines: chunking documents, generating embeddings, indexing vectors
- Media processing: resizing images, transcoding video/audio
- Analytics: event ingestion, aggregation, periodic reports
- Webhooks: delivery with retries and exponential backoff
15.3 Common pattern (Producer → Queue → Worker)
The web server acts as a producer and enqueues jobs. A queue/broker stores jobs. A worker acts as a consumer and executes them. Results are stored in a DB/cache and can be queried through a status endpoint.
- POST /jobs → enqueue job → returns job_id (202 Accepted)
- GET /jobs/{job_id} → job state + result/error
async/await is primarily about non-blocking I/O inside the same process.
Background jobs mean the work happens in separate execution (workers), potentially durable and retriable.
15.4 Response semantics
For heavy tasks, the API should usually return 202 Accepted with a job_id.
This indicates the request was accepted for processing, but is not complete yet.
Example response:
{
"status": "queued",
"job_id": "a1b2c3d4"
}
15.5 Minimal in-process background tasks (FastAPI BackgroundTasks)
Framework background tasks (e.g., FastAPI BackgroundTasks) are useful for small, best-effort jobs
but they are not a durable queue (tasks can be lost if the server restarts).
from fastapi import FastAPI, BackgroundTasks
app = FastAPI()
def send_verification_email(to_email: str) -> None:
# call SMTP/provider here
pass
@app.post("/signup")
def signup(email: str, background_tasks: BackgroundTasks):
# create user in DB ...
background_tasks.add_task(send_verification_email, email)
return {"status": "created"}
Not durable (lost on crash), competes with API for CPU/memory, limited visibility/retry control. For heavy tasks, use a real queue (RQ/Celery).
15.6 Celery + Redis example (durable queue + workers)
Celery uses a broker (Redis/RabbitMQ) to store jobs and worker processes to execute them. This is a standard production pattern for background processing.
Worker: define task (tasks.py)
from celery import Celery
celery_app = Celery(
"worker",
broker="redis://localhost:6379/0",
backend="redis://localhost:6379/1",
)
@celery_app.task(bind=True, max_retries=3)
def build_embeddings(self, document_id: str):
try:
# heavy pipeline:
# 1) load document
# 2) chunk text
# 3) generate embeddings
# 4) store vectors + build index
return {"document_id": document_id, "status": "done"}
except Exception as exc:
# exponential backoff for transient failures
raise self.retry(exc=exc, countdown=2 ** self.request.retries)
API server: enqueue job (app.py)
from fastapi import FastAPI
from tasks import build_embeddings, celery_app
app = FastAPI()
@app.post("/documents/{document_id}/embed")
def embed_document(document_id: str):
job = build_embeddings.delay(document_id) # enqueue
return {"status": "queued", "job_id": job.id}
@app.get("/jobs/{job_id}")
def job_status(job_id: str):
res = celery_app.AsyncResult(job_id)
return {"state": res.state, "result": res.result}
15.7 Reliability topics
- Idempotency: tasks may retry; ensure re-running does not create duplicates
- Retry policy: retry transient errors; fail fast on permanent input errors
- Dead-letter queue (DLQ): move repeatedly failing jobs for later inspection
- Observability: log job_id, duration, failures; track metrics
- Ordering/priority: some systems need priority queues and rate limiting
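To make the idempotency point concrete: a minimal claim-before-work guard, assuming Redis as the claim store (key names and the process function are illustrative, not from a specific library):
import redis
r = redis.Redis(host="localhost", port=6379, db=2)
def run_job_once(idempotency_key: str, payload: dict) -> bool:
    # SET NX EX: only the first worker to claim the key proceeds,
    # so duplicate deliveries and client retries become no-ops
    claimed = r.set(f"job:claim:{idempotency_key}", "1", nx=True, ex=3600)
    if not claimed:
        return False  # already processed or currently in flight
    try:
        process(payload)  # hypothetical task body
    except Exception:
        r.delete(f"job:claim:{idempotency_key}")  # release so a retry can run
        raise
    return True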
“For heavy work, return 202 + job_id, process via queue + workers, and design tasks to be idempotent with retries/backoff and good observability.”
(For more queue and worker patterns, see the data-intensive backends section near the end of this post.)
16) Testing with pytest (backend quality)
Install:
pip install pytest httpx
Example test using FastAPI TestClient:
from fastapi.testclient import TestClient
from main import app
client = TestClient(app)
def test_create_and_get_task():
r = client.post("/tasks", json={"title": "hello", "done": False})
assert r.status_code == 201
task = r.json()
assert task["title"] == "hello"
r2 = client.get(f"/tasks/{task['id']}")
assert r2.status_code == 200
assert r2.json()["id"] == task["id"]
Test principles:
- unit tests for pure functions (fast)
- integration tests for API endpoints
- database tests using a temporary DB or test containers (see the fixture sketch below)
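A sketch of the temporary-DB idea, assuming the app exposes a get_db dependency and a SQLAlchemy Base (names borrowed from the data-layer section; adjust to your project):
import pytest
from fastapi.testclient import TestClient
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from main import app, get_db, Base  # assumed names from the app under test
@pytest.fixture()
def client(tmp_path):
    # fresh SQLite file per test: isolated, fast, no shared state
    engine = create_engine(f"sqlite:///{tmp_path}/test.db")
    Base.metadata.create_all(engine)
    TestSession = sessionmaker(bind=engine)
    def override_get_db():
        db = TestSession()
        try:
            yield db
        finally:
            db.close()
    app.dependency_overrides[get_db] = override_get_db
    yield TestClient(app)
    app.dependency_overrides.clear()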
17) CI (GitHub Actions)
.github/workflows/ci.yml
name: CI
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.11"
- run: pip install -r requirements.txt
- run: pytest -q
18) Security essentials (production mindset)
Security is not one feature — it’s a collection of defaults that limit damage when something goes wrong. The goal is simple: reduce attack surface, prevent easy mistakes, and fail safely under bad inputs, leaked credentials, and broken dependencies.
Assume: inputs are malicious, credentials leak, dependencies fail, and traffic spikes — then design defaults so the system degrades safely.
18.1 Don’t leak internals (errors, stack traces, debug mode)
In production, never expose stack traces, file paths, raw SQL errors, or secrets to users. Return a generic error to clients and log the details internally with a request ID.
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
import logging
app = FastAPI()
log = logging.getLogger("app")
@app.exception_handler(Exception)
async def catch_all(request: Request, exc: Exception):
log.exception("Unhandled error") # log full stack trace internally
return JSONResponse(status_code=500, content={"detail": "Internal Server Error"})
- Dev: detailed errors help you
- Prod: detailed errors help attackers
18.2 Secrets (env vars, rotation, and “never log tokens”)
Secrets include DB passwords, JWT signing keys, API keys, OAuth client secrets. One rule covers 90% of incidents: secrets must not live in Git or logs.
- Store: environment variables or a secret manager
- Rotate: treat leaks as inevitable; rotation is your recovery path
- Log hygiene: never log Authorization headers, cookies, or passwords
import os
DATABASE_URL = os.environ["DATABASE_URL"]
SECRET_KEY = os.environ["SECRET_KEY"] # JWT signing key
“It’s fine, it’s only on my server.” If it’s in Git history, HTML, or logs, it eventually leaks.
18.3 Browser threats (XSS vs CSRF) — why cookies need extra care
If you use cookies for authentication, understand these two common web threats:
- XSS: attacker injects JavaScript into your site → tries to steal data or perform actions
- CSRF: browser automatically sends cookies → attacker triggers actions from another site
Practical takeaway: cookie-based auth needs good cookie flags and CSRF defenses for sensitive actions.
Set-Cookie: session_id=...; HttpOnly; Secure; SameSite=Lax; Path=/;
See also: Authentication/Cookies section for the meaning of HttpOnly, Secure, and SameSite.
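In FastAPI these flags map directly onto Response.set_cookie; a minimal sketch (the session value is a placeholder for a real server-side session id):
from fastapi import FastAPI, Response
app = FastAPI()
@app.post("/login")
def login(response: Response):
    # ... verify credentials and create a server-side session first ...
    response.set_cookie(
        key="session_id",
        value="opaque-random-session-id",  # placeholder
        httponly=True,   # not readable by JavaScript (limits XSS impact)
        secure=True,     # only sent over HTTPS
        samesite="lax",  # withheld on most cross-site requests (CSRF help)
        path="/",
    )
    return {"status": "ok"}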
18.4 HTTPS/TLS (security + correctness)
HTTPS is not optional for real systems. Without HTTPS, credentials and tokens can be intercepted,
and cookies are unsafe (the Secure flag becomes meaningless).
Nginx: redirect HTTP → HTTPS (minimal)
server {
listen 80;
server_name example.com;
return 301 https://$host$request_uri;
}
18.5 Security headers (cheap, high impact)
Security headers reduce browser attack surface. They don’t replace validation/auth, but they harden defaults.
from fastapi import Request
@app.middleware("http")
async def security_headers(request: Request, call_next):
resp = await call_next(request)
resp.headers["X-Content-Type-Options"] = "nosniff"
resp.headers["X-Frame-Options"] = "DENY"
resp.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"
# Start simple; CSP needs tuning:
# resp.headers["Content-Security-Policy"] = "default-src 'self';"
return resp
18.6 Abuse controls (rate limits + payload limits + timeouts)
Many “attacks” are just resource exhaustion: too many requests, huge bodies, or slow upstream calls. Apply hard limits to preserve availability.
- Rate limiting: protect login/search and expensive endpoints
- Max request size: avoid huge payload DoS
- Timeouts: external calls must not hang workers
Nginx: cap request body size
server {
client_max_body_size 5m;
}
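Nginx caps what enters the system; on the app side, every outbound call also needs a deadline. A sketch using httpx (the limits and URL are illustrative):
import httpx
# bound connect time and total time so a slow upstream cannot hang a worker
TIMEOUT = httpx.Timeout(5.0, connect=2.0)
def fetch_profile(user_id: str) -> dict | None:
    try:
        r = httpx.get(f"https://api.example.com/users/{user_id}", timeout=TIMEOUT)
        r.raise_for_status()
        return r.json()
    except httpx.TimeoutException:
        return None  # degrade gracefully instead of blocking the worker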
18.7 File uploads (the forgotten attack surface)
Uploads create real risk: large-payload DoS, zip bombs, malicious file types, path traversal. Safe defaults (sketched after this list):
- limit size (proxy + app)
- validate content type (extension is not enough)
- store outside web root (don’t serve raw uploads directly)
- randomize filenames (avoid collisions and path tricks)
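A sketch of those defaults in FastAPI (requires python-multipart; paths and limits are illustrative, and real type checks should sniff file content rather than trust the declared content type):
import uuid
from pathlib import Path
from fastapi import FastAPI, HTTPException, UploadFile
app = FastAPI()
UPLOAD_DIR = Path("/srv/uploads")            # outside the web root
ALLOWED_TYPES = {"image/png", "image/jpeg"}
MAX_BYTES = 5 * 1024 * 1024                  # match the proxy limit
@app.post("/upload")
async def upload(file: UploadFile):
    # content_type is client-supplied: treat it as a first filter only
    if file.content_type not in ALLOWED_TYPES:
        raise HTTPException(415, "unsupported media type")
    data = await file.read()
    if len(data) > MAX_BYTES:
        raise HTTPException(413, "file too large")
    # random filename: no collisions, no path tricks from user input
    dest = UPLOAD_DIR / f"{uuid.uuid4().hex}.bin"
    dest.write_bytes(data)
    return {"stored_as": dest.name}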
18.8 Dependency hygiene (silent killer)
Many real incidents come from outdated dependencies. Pin versions, update regularly, and audit in CI.
pip install pip-audit
pip-audit
18.9 Security checklist (fast revision)
- Input boundary: validate early; reject bad payloads
- Access control: AuthN + AuthZ; least privilege
- No leaks: no stack traces / debug in prod; safe error responses
- Secrets: env/secret manager; rotate; never log tokens/passwords
- Browser hardening: cookie flags, CSRF awareness, security headers
- Transport: HTTPS everywhere
- Abuse limits: rate limits, body caps, timeouts
- Uploads: strict limits and safe storage
- Dependencies: pin + audit + update discipline
“I treat security as safe defaults: strict boundaries (validation/auth), no internal leakage, secrets outside Git/logs, HTTPS everywhere, hardened browser surface (cookies/headers/CSRF), abuse limits (rate/body/timeouts), safe uploads, and dependency hygiene.”
19) Observability
- structured logs (JSON logs)
- request IDs / correlation IDs (see the middleware sketch below)
- metrics (latency, error rate, throughput)
- traces (distributed tracing if microservices)
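A minimal middleware sketch combining the first three points (field names are illustrative):
import json
import logging
import time
import uuid
from fastapi import FastAPI, Request
app = FastAPI()
log = logging.getLogger("app")
@app.middleware("http")
async def request_logging(request: Request, call_next):
    request_id = request.headers.get("X-Request-ID", uuid.uuid4().hex)
    start = time.perf_counter()
    response = await call_next(request)
    response.headers["X-Request-ID"] = request_id  # echo for correlation
    log.info(json.dumps({
        "request_id": request_id,
        "method": request.method,
        "path": request.url.path,
        "status": response.status_code,
        "duration_ms": round((time.perf_counter() - start) * 1000, 1),
    }))
    return response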
20) Quick Review Table: 10 backend concepts
This table is a fast revision checklist. For each row, you should be able to explain: (1) what it is, (2) why it matters, (3) one real example.
| Concept | What it is (theory) | Why it matters + typical tools |
|---|---|---|
| 1) Authentication vs Authorization | AuthN proves identity (“who are you?”). AuthZ enforces permissions (“what can you do?”). HTTP: 401 vs 403. | Prevents unauthorized access and defines the security model. Tools: session cookies, JWT bearer tokens, OAuth2, RBAC/ABAC policies. |
| 2) Rate limiting | Bounds requests per identity (IP/user/token) using algorithms like token bucket/leaky bucket. | Protects availability, prevents brute force, stabilizes latency. Tools: Nginx/Cloudflare rate limits, Redis counters, API gateways. |
| 3) Database indexing | Indexes are DB-managed data structures (often B-trees) that accelerate query lookup and ordering. | Faster reads, but increased write cost + storage. Don’t index everything; index based on query patterns. Tools: EXPLAIN query plans, composite indexes. |
| 4) Transactions + ACID | Transaction = atomic unit of work. ACID: Atomicity, Consistency, Isolation, Durability. | Guarantees correctness under concurrency; prevents partial updates. Tools: DB transactions, isolation levels, row locks, optimistic locking. |
| 5) Caching | Stores results to avoid recomputation (space ↔ time tradeoff). Key issues: staleness, invalidation, TTL. | Lower latency and reduced DB/origin load; risk of stale reads and stampedes. Tools: Redis, Nginx cache, CDN cache, HTTP cache (ETag/Cache-Control). |
| 6) Message queues | Producer → queue → consumer model for async work; jobs processed by workers with ack/retry semantics. | Handles heavy tasks reliably, decouples services, smooths spikes. Tools: Celery/RQ, Redis/RabbitMQ/SQS, DLQ, idempotency patterns. |
| 7) Load balancing | Distributes traffic across instances. Strategies: round-robin, least-connections, hashing, sticky sessions. | Improves availability and throughput; enables horizontal scaling. Tools: Nginx/HAProxy/Cloud LB, autoscaling, health checks. |
| 8) CAP theorem | Under network partition, choose between Consistency and Availability; Partition tolerance is required. | Guides distributed DB/service design tradeoffs (CP vs AP). Tools: consensus (Raft), eventual consistency, quorum reads/writes. |
| 9) Reverse proxy | Front door for apps: routes requests to upstreams and can terminate TLS, cache, compress, and filter traffic. | Central place for security + performance controls; improves deployability. Tools: Nginx, Envoy, Traefik (TLS, caching, rate limiting, routing). |
| 10) CDN | Distributed edge network that caches/serves content near users; reduces origin load and latency. | Faster global delivery, better burst handling; must set caching rules carefully. Tools: Cloudflare/Akamai/Fastly, cache rules, TTL, purge/invalidation. |
For each row: say one definition sentence, one tradeoff sentence, and one tool/example sentence. That’s usually enough to answer most backend interview “concept” questions cleanly.
21) Production basics: Docker Compose + Nginx reverse proxy
A common production setup puts Nginx as a reverse proxy in front of your app container. Nginx accepts client traffic, routes requests, and can add TLS termination, compression, caching, and rate limiting. Your FastAPI app runs behind it (often with Uvicorn/Gunicorn).
Client → Nginx (reverse proxy) → FastAPI (app) → DB/Redis
Docker Compose example (FastAPI + Nginx)
This Compose file runs two services: app (FastAPI) and nginx (reverse proxy).
Nginx forwards requests to the app using the Docker service name app on port 8000.
version: "3.9"
services:
app:
build: .
container_name: fastapi_app
expose:
- "8000"
environment:
- ENV=production
restart: unless-stopped
nginx:
image: nginx:1.27-alpine
container_name: nginx_proxy
ports:
- "80:80"
volumes:
- ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
depends_on:
- app
restart: unless-stopped
Minimal Nginx reverse proxy config
This configuration forwards all requests to the FastAPI app. It also forwards common proxy headers so your app can read the real client IP and scheme (useful for logs, redirects, auth callbacks).
server {
listen 80;
server_name _;
location / {
proxy_pass http://app:8000;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Real-IP $remote_addr;
# reasonable timeouts for upstream
proxy_connect_timeout 5s;
proxy_read_timeout 60s;
proxy_send_timeout 60s;
}
}
Production notes (short but essential)
- Don’t run debug: use production settings and proper logging.
- Run multiple workers: for CPU-bound scaling, prefer Gunicorn with Uvicorn workers (or scale containers horizontally behind Nginx).
- Health checks: add a /health endpoint (sketched below) and configure monitoring.
- TLS/HTTPS: terminate TLS at Nginx or use a managed proxy (e.g., Cloudflare). For real production, add HTTPS.
- Secrets: never bake API keys into images; use env vars or secret managers.
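A minimal health/readiness sketch (dependencies_ok is a hypothetical check against your DB/Redis):
from fastapi import FastAPI
from fastapi.responses import JSONResponse
app = FastAPI()
@app.get("/health")
def health():
    # liveness: cheap and dependency-free; load balancers poll this
    return {"status": "ok"}
@app.get("/ready")
def ready():
    # readiness: only accept traffic when dependencies respond
    if not dependencies_ok():  # hypothetical DB/Redis ping
        return JSONResponse(status_code=503, content={"status": "degraded"})
    return {"status": "ready"}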
In real production you should serve HTTPS. A common pattern is Nginx + Let’s Encrypt (Certbot) or a managed edge proxy. Keep HTTP (80) only for redirecting to HTTPS (443).
22) Data-intensive backends (real-world architectures + technology choices)
A data-intensive backend is a system where the hard part is not CRUD — the hard part is moving + transforming + serving data reliably at scale. These systems fail in different ways: duplicates, out-of-order events, partial writes, overloaded downstreams, long tail latency, and “one bad tenant” issues.
HOT PATH (user-facing, strict latency) COLD PATH (heavy, async, reliable)
API → validate → read/cache → respond | ingest → transform → index/aggregate → publish
Strong backends keep the hot path boring and predictable, and push heavy work to the cold path.
22.1 Real examples of “data-intensive” systems
- Analytics/event tracking: clickstream → Kafka → warehouse → dashboards
- Media/OCR pipelines: upload → queue → OCR/ETL → searchable index
- Search/recommendation: ingest content → compute features → serve ranked results
- Payments/orders: state machines + idempotency + auditability
- IoT/telemetry: high-frequency writes + aggregation + downsampling
22.2 Reference architectures
A) Queue-based “job pipeline” (most common prototype)
Client
→ API (FastAPI)
→ Postgres (metadata + job state)
→ Object storage (S3/MinIO/local) for large payloads/files
→ Queue (Redis/RabbitMQ/Kafka)
→ Workers (Celery/RQ/Arq) for heavy processing
→ Cache (Redis) for hot reads + rate limit
B) Streaming/event-driven pipeline (Kafka-style)
Producers → Kafka topics → stream processors (Flink/Spark/ksqlDB)
→ sinks (ClickHouse/BigQuery/Postgres/Elastic)
→ API reads optimized stores
Most teams don’t need Kafka on day 1. Start with queue + workers. Add streaming only when you truly need: huge throughput, event ordering/partitioning, or many downstream consumers.
22.3 Technology choices: what goes where (practical mapping)
- Postgres/MySQL: metadata, transactions, job states, permissions, audit logs
- S3/MinIO/local FS: big blobs (PDFs, images, exports, embeddings files)
- Redis: cache, rate limit, locks, queues (small pipelines)
- ClickHouse / BigQuery: analytics queries, aggregations, time-series at scale
- Elasticsearch/OpenSearch: full-text search + filters
22.4 Data modeling for pipelines (the part people miss)
- Immutable events are easier than mutable state. Store “what happened”, derive views later.
- Version everything: doc_version, schema_version, pipeline_version.
- Separate raw vs derived: raw input (object store) vs derived artifacts (DB/index/warehouse).
- Explicit job states: queued → running → success/failed (+ retry_count + last_error; see the sketch after this list)
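A sketch of these conventions in plain Python (swap the dataclass for a SQLAlchemy model in a real service):
import enum
from dataclasses import dataclass
class JobState(str, enum.Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"
@dataclass
class Job:
    job_id: str
    doc_id: str
    schema_version: int = 1           # version everything
    pipeline_version: str = "v1"
    state: JobState = JobState.QUEUED
    retry_count: int = 0
    last_error: str | None = None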
22.5 Reliability patterns (real production painkillers)
1) Idempotency (avoid duplicates under retry)
Networks fail. Clients retry. Workers retry. Without idempotency, you will duplicate jobs and corrupt derived data.
Idempotency key examples:
- upload_id
- tenant_id + file_sha256
- order_id + operation_type
- doc_id + version + chunk_index
2) Retry policy (transient vs permanent)
- Retry: timeouts, 5xx, connection resets
- Fail fast: invalid input, forbidden access, schema mismatch
- DLQ: after N retries → dead letter queue for manual inspection
3) Backpressure (systems die without it)
- Queue + 202: accept request, return job_id, process async
- 429 / rate limit: protect DB, workers, and external APIs
- Load shedding: degrade features (“no rerank”, “no export”) under overload
4) Outbox pattern (don’t lose events)
If you write to Postgres and also publish to a queue, you can lose one of them on crash. Outbox stores the message in the same DB transaction, and a dispatcher publishes later.
Transaction:
INSERT job row
INSERT outbox row (event to publish)
Commit
Dispatcher reads outbox → publishes → marks delivered
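A sketch of the outbox write and dispatch loop, assuming a SQLAlchemy session and hypothetical jobs/outbox tables:
import json
from sqlalchemy import text
def create_job_with_outbox(session, job_id: str) -> None:
    # one transaction: job row and outbox row commit (or roll back) together
    session.execute(
        text("INSERT INTO jobs (id, status) VALUES (:id, 'queued')"),
        {"id": job_id},
    )
    session.execute(
        text("INSERT INTO outbox (id, event) VALUES (:id, :event)"),
        {"id": job_id,
         "event": json.dumps({"type": "job.created", "job_id": job_id})},
    )
    session.commit()
def dispatch_outbox(session, publish) -> None:
    # separate loop: publish undelivered events, then mark them delivered
    rows = session.execute(
        text("SELECT id, event FROM outbox WHERE delivered = false LIMIT 100")
    ).fetchall()
    for row in rows:
        publish(row.event)  # e.g., push to Redis/RabbitMQ/Kafka
        session.execute(
            text("UPDATE outbox SET delivered = true WHERE id = :id"),
            {"id": row.id},
        )
    session.commit()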
22.6 Performance engineering: where time really goes
- Tail latency (p95/p99) matters more than average
- Batch I/O: fewer round-trips beats micro-optimizing Python
- Connection pooling: DB pools & HTTP client pools are critical (see the sketch below)
- Use async for I/O waits, not for CPU-heavy work
- Cache what’s safe: hot reads, precomputed views, aggregated results
If a request is waiting on DB/network for ~30 ms, a ~3 GHz CPU core could execute on the order of 100 million cycles in that time. Concurrency wins by not wasting waiting time, not by “making Python faster”.
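The pooling point in code, assuming SQLAlchemy (pool sizes are illustrative):
from sqlalchemy import create_engine
# one engine per process: it owns the connection pool;
# never create an engine (or a raw connection) per request
engine = create_engine(
    "postgresql+psycopg2://user:pass@db:5432/app",
    pool_size=10,        # steady-state connections
    max_overflow=5,      # temporary burst headroom
    pool_timeout=5,      # fail fast instead of queueing forever
    pool_pre_ping=True,  # transparently drop dead connections
)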
22.7 A concrete prototype: “Document Processing Service” (real backend model)
This is a strong interview demo because it includes: file upload, object storage, job queue, worker processing, and polling/streaming status.
Endpoints:
- POST /documents → upload metadata + get presigned URL (or direct upload)
- POST /documents/{id}/ingest → enqueue processing job (returns 202 + job_id)
- GET /jobs/{job_id} → status: queued/running/success/failed
- GET /documents/{id} → returns derived outputs (text, index status, etc.)
FastAPI sketch (enqueue + status)
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import uuid
import time
app = FastAPI()
# Pretend stores (replace with Postgres + Redis queue in real code)
JOBS = {}
DOCS = {}
class IngestReq(BaseModel):
tenant_id: str
object_key: str # path in S3/MinIO/local
idempotency_key: str
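# Note: this sketch merges document creation and ingestion into one
# endpoint; the endpoint list above splits them into two calls.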
@app.post("/documents/ingest", status_code=202)
def ingest(req: IngestReq):
# Idempotency: return existing job if same key was already used
for job_id, job in JOBS.items():
if job["idempotency_key"] == req.idempotency_key:
return {"job_id": job_id, "status": job["status"]}
doc_id = str(uuid.uuid4())
job_id = str(uuid.uuid4())
DOCS[doc_id] = {"tenant_id": req.tenant_id, "object_key": req.object_key}
JOBS[job_id] = {
"doc_id": doc_id,
"tenant_id": req.tenant_id,
"status": "queued",
"created_at": time.time(),
"idempotency_key": req.idempotency_key,
"retry_count": 0,
"last_error": None,
}
# In real system: publish job_id into Redis/RabbitMQ/Kafka
return {"job_id": job_id, "doc_id": doc_id, "status": "queued"}
@app.get("/jobs/{job_id}")
def job_status(job_id: str):
job = JOBS.get(job_id)
if not job:
raise HTTPException(404, "job not found")
return job
Production upgrade: Postgres for JOBS/DOCS, Redis/RabbitMQ for queue, Celery/RQ workers to process, S3/MinIO for file storage, and structured logs + metrics for visibility.
22.8 Observability (what you log/measure in real data pipelines)
- Request: request_id, tenant_id, endpoint, status, duration_ms
- Queue: queue depth, enqueue rate, worker concurrency, retry counts
- Jobs: success rate, p95 processing time, failure reasons, DLQ size
- DB: slow query logs, connection pool saturation, locks
- Cost: external API calls per tenant/day (if any)
“I design a hot path with predictable latency and a cold path with queues/workers. I use idempotency + retries + DLQ + backpressure to survive failures, store raw vs derived separately (object store + DB/index), and I measure p95/p99 plus queue depth to keep the system stable under load.”
Final checklist: backend maturity
When you build any feature, ask:
- What is the resource + contract?
- What validation and invariants must hold?
- What authn/authz rules apply?
- Where is truth stored (DB)?
- How will it scale (stateless + cache + queue)?
- How is it tested and deployed?
- How do I observe it in production?