API Gateway & Service Mesh

Break a monolith into microservices and a new problem appears: every service now needs the same boring-but-critical machinery — authentication, rate limiting, TLS, routing, retries, metrics. Re-implementing that in every service is wasteful and inconsistent. Two patterns extract it into infrastructure: the API gateway handles traffic coming into the system from clients, and the service mesh handles traffic between services. Together they let your service code focus on business logic.

⚡ Quick Takeaways

Both solve cross-cutting concerns — auth, rate limiting, TLS, routing, retries, observability — so individual services don't each reimplement them.
API gateway = the front door — a single entry point for clients that routes to backend services and handles auth, rate limiting, TLS termination, and aggregation.
North-south vs east-west — the gateway governs client↔system (north-south) traffic; the mesh governs service↔service (east-west) traffic.
Service mesh = a sidecar proxy per service — it transparently adds mTLS, retries, load balancing, and metrics to internal calls, with a control plane configuring all the proxies.
BFF (Backend for Frontend) — a gateway tailored per client type (web, mobile) to shape responses.
The cost is a hop + complexity — meshes especially add latency and operational overhead; don't add one before you need it.

tldr

An API gateway is the single entry point for external clients: it routes requests to services and centralizes auth, rate limiting, TLS termination, and request aggregation (north-south traffic). A service mesh pushes service-to-service concerns (mTLS, retries, load balancing, tracing) into a sidecar proxy deployed next to every service, configured by a control plane (east-west traffic). Both move cross-cutting logic out of application code — at the cost of an extra network hop and operational complexity.

The Problem: Cross-Cutting Concerns

In a microservices architecture, a set of concerns applies to every service: verifying who's calling (authn/authz), limiting abusive traffic, terminating TLS, routing requests, retrying transient failures, and emitting metrics/traces. If each team bakes these into their service, you get duplicated effort and — worse — inconsistency: ten slightly different rate limiters, five auth implementations, uneven observability. The fix is to lift these concerns into a shared layer. Where that layer sits depends on the direction of the traffic.

North-South vs East-West Traffic

A useful mental model from networking:

North-south traffic crosses the boundary of your system — external clients calling in (and responses going out). The API gateway sits here, at the edge.
East-west traffic flows between your internal services. The service mesh governs this.

The gateway is the guarded front door; the mesh is the internal road network. They're complementary, not competing.

The API Gateway

An API gateway is a single entry point that sits in front of your backend services. Clients talk only to the gateway, which authenticates and routes each request to the right service and applies edge policies. Its responsibilities:

Routing — map incoming paths/hosts to backend services.
Authentication & authorization — validate tokens/keys once at the edge (often with OAuth/JWT) so services trust the gateway.
Rate limiting & throttling — protect backends from abuse and overload.
TLS termination — handle HTTPS at the edge.
Request aggregation — combine calls to several services into one client response (reducing chatty mobile round-trips).
Protocol translation — e.g. expose REST externally while talking gRPC internally.
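
To make the responsibilities above concrete, here is a minimal sketch of three of them (authentication, rate limiting, routing) chained in one request path. It is an illustration under stated assumptions, not how any particular gateway product is implemented: the route table, the API keys, the `Gateway` and `TokenBucket` classes, and the status strings are all hypothetical, and API-key lookup stands in for real OAuth/JWT validation. Time is passed in explicitly so the behavior is deterministic.

```python
# Hypothetical route table: path prefix -> backend service name.
ROUTES = {
    "/orders": "order-svc",
    "/users": "user-svc",
    "/payments": "payment-svc",
}

# Hypothetical API keys; a real gateway would validate OAuth/JWT tokens instead.
API_KEYS = {"key-alice": "alice", "key-bob": "bob"}


class TokenBucket:
    """Token-bucket limiter: refills `rate` tokens per second, stores up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = None  # time of the previous request, if any

    def allow(self, now):
        # Refill for the time elapsed since the previous request, capped at capacity.
        if self.updated is not None:
            self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def resolve_backend(path):
    """Longest-prefix match of the request path against ROUTES."""
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    return ROUTES[max(matches, key=len)] if matches else None


class Gateway:
    def __init__(self, rate=1.0, burst=2):
        self.rate, self.burst = rate, burst
        self.buckets = {}  # one bucket per authenticated caller

    def handle(self, path, api_key, now):
        """Return (status, detail); checks run in order: auth, rate limit, routing."""
        caller = API_KEYS.get(api_key)
        if caller is None:
            return 401, "unauthenticated"
        bucket = self.buckets.setdefault(caller, TokenBucket(self.rate, self.burst))
        if not bucket.allow(now):
            return 429, "rate limit exceeded"
        backend = resolve_backend(path)
        if backend is None:
            return 404, "no route"
        # A real gateway would now proxy the request and forward the caller identity.
        return 200, f"forward to {backend} as {caller}"
```

With `Gateway(rate=1.0, burst=2)`, a caller can make two requests at the same instant, the third is rejected with 429, and one more is allowed a second later. Rejecting unauthenticated callers before touching the rate limiter is a design choice in this sketch; gateways often also rate limit by client IP before authentication.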

API gateway: the single front door

clients ─▶ ┌─────────────── API GATEWAY ───────────────┐
           │ TLS term · authn · rate limit · routing   │
           └───┬──────────────┬──────────────┬─────────┘
               ▼              ▼              ▼
          order-svc      user-svc       payment-svc
   (services trust the gateway; they don't each re-auth the client)

The BFF Pattern

A web app, a mobile app, and a partner API often want different shapes of data from the same backends. The Backend for Frontend (BFF) pattern runs a separate gateway per client type, each tailoring aggregation and response shaping to that client's needs — so the mobile BFF can return a compact, battery-friendly payload while the web BFF returns a richer one, without bloating the backend services.
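
The response shaping described above can be sketched as two small handlers over the same backend data. Everything here is hypothetical: `fetch_user` and `fetch_orders` stand in for calls to backend services, and the field names are invented for illustration.

```python
import json


def fetch_user(user_id):
    # Hypothetical stand-in for a call to a user service.
    return {
        "id": user_id,
        "name": "Ada",
        "email": "ada@example.com",
        "avatar_url": "https://example.com/a.png",
    }


def fetch_orders(user_id):
    # Hypothetical stand-in for a call to an order service.
    return [
        {"id": 1, "total_cents": 1250, "status": "shipped", "items": ["book", "pen"]},
        {"id": 2, "total_cents": 990, "status": "pending", "items": ["mug"]},
    ]


def web_bff_profile(user_id):
    """Web BFF: richer payload with the full user record and order details."""
    return {"user": fetch_user(user_id), "orders": fetch_orders(user_id)}


def mobile_bff_profile(user_id):
    """Mobile BFF: compact payload with only the fields a small screen shows."""
    user = fetch_user(user_id)
    orders = fetch_orders(user_id)
    return {
        "name": user["name"],
        "order_count": len(orders),
        "latest_status": orders[-1]["status"] if orders else None,
    }


def payload_bytes(payload):
    """Size of the JSON-encoded payload, as a rough proxy for transfer cost."""
    return len(json.dumps(payload).encode("utf-8"))
```

Both handlers aggregate the same two backend calls; only the shape of the response differs, which is why the backends themselves need no client-specific code.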

The Service Mesh

The gateway handles the edge, but inside a large system, services call each other constantly, and those calls need the same reliability and security features. A service mesh provides them without changing application code, using the sidecar pattern: a small proxy (e.g. Envoy) is deployed alongside every service instance, and all of that service's network traffic flows through its sidecar. The sidecars form the data plane, which is configured centrally by a control plane; meshes such as Istio and Linkerd package the two together.

service mesh: sidecar proxies + control plane

          ┌── control plane (config, policy, certs) ──┐
          │              Istio / Linkerd               │
          ▼                                             ▼
   ┌─ service A ─┐   mTLS, retries, LB, metrics   ┌─ service B ─┐
   │  app  │ proxy│◀══════════════════════════════▶│ proxy │ app │
   └───────┴──────┘   (all A↔B traffic via sidecars)└──────┴─────┘

   app code makes a plain call to "B"; the sidecar does the rest
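
The split in the diagram, with a control plane that configures and sidecars that carry the traffic, can be sketched as a toy model. This is only an illustration: `ControlPlane`, `SidecarProxy`, and the weight map are hypothetical names, and real meshes distribute configuration over their own APIs rather than through direct method calls.

```python
import random


class SidecarProxy:
    """Data plane: routes each outbound call using the config it was last given."""

    def __init__(self, name):
        self.name = name
        self.weights = {}  # destination version -> relative traffic weight

    def apply_config(self, weights):
        self.weights = dict(weights)

    def pick_version(self, rng):
        # Weighted random choice among destination versions.
        versions = list(self.weights)
        return rng.choices(versions, weights=[self.weights[v] for v in versions], k=1)[0]


class ControlPlane:
    """Control plane: holds the desired config and pushes it to every registered proxy."""

    def __init__(self):
        self.proxies = []

    def register(self, proxy):
        self.proxies.append(proxy)

    def push(self, weights):
        for proxy in self.proxies:
            proxy.apply_config(weights)


# Usage: one push changes routing in every sidecar, with no application change.
control_plane = ControlPlane()
sidecar_a, sidecar_b = SidecarProxy("service-a"), SidecarProxy("service-b")
control_plane.register(sidecar_a)
control_plane.register(sidecar_b)
control_plane.push({"v1": 90, "v2": 10})  # send roughly 10% of calls to the new version
```

The point of the sketch is the direction of control: the application never sees the weights, and changing them is a single operation on the control plane.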

Because every call goes through a sidecar, the mesh transparently provides: mutual TLS (service-to-service traffic is encrypted and authenticated automatically); retries, timeouts, and circuit breaking; client-side load balancing; traffic shifting (for canary deploys); and uniform observability (every hop is traced and measured, see observability). The application stays oblivious: it just calls "service B" and the sidecar handles encryption, retries, and routing.
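
As a rough picture of the retry and circuit-breaking behavior a sidecar takes off the application's hands, here is a minimal sketch. It assumes a simplified policy invented for illustration (retry only on `ConnectionError`, open the breaker after a fixed number of consecutive failures, allow a trial request after a cool-down); `SidecarClient` and its parameters are hypothetical, and real proxies expose more detailed policies and typically wait between attempts.

```python
class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being tried."""


class SidecarClient:
    """Wraps an upstream call with retries and a simple circuit breaker."""

    def __init__(self, call, max_attempts=3, failure_threshold=5, reset_after=30.0):
        self.call = call
        self.max_attempts = max_attempts
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.consecutive_failures = 0
        self.opened_at = None  # time the breaker opened, or None while closed

    def request(self, payload, now):
        # Open breaker: fail fast until the cool-down has elapsed.
        if self.opened_at is not None and now - self.opened_at < self.reset_after:
            raise CircuitOpenError("upstream marked unhealthy; failing fast")
        last_error = None
        for _ in range(self.max_attempts):
            try:
                result = self.call(payload)
            except ConnectionError as exc:  # treated as transient, so retried
                last_error = exc
                self.consecutive_failures += 1
                if self.consecutive_failures >= self.failure_threshold:
                    self.opened_at = now  # open (or re-open) the breaker
                    break
            else:
                # Success closes the breaker and clears the failure count.
                self.consecutive_failures = 0
                self.opened_at = None
                return result
        raise last_error
```

Retries are only safe for requests that can be repeated without side effects, and the breaker exists so that a struggling upstream is not hit with even more traffic from retries.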

Gateway vs Mesh

Aspect	API Gateway	Service Mesh
Traffic	North-south (client ↔ system)	East-west (service ↔ service)
Form	Central edge service	Sidecar proxy per service + control plane
Main jobs	Auth, rate limit, routing, aggregation	mTLS, retries, LB, traffic shifting
Client-facing?	Yes — external clients	No — internal only
Examples	Kong, NGINX, AWS API Gateway	Istio (Envoy data plane), Linkerd (its own linkerd2-proxy data plane)

Trade-offs and When to Use Which

Both add an extra network hop and moving parts, so they're not free. An API gateway is almost always worth it once you have more than a couple of services exposed to clients — centralizing auth and rate limiting alone justifies it (just don't let it become a bloated monolith of business logic; keep it to cross-cutting concerns). A service mesh is heavier: the sidecar adds latency to every internal call and the control plane is real operational complexity. It pays off at scale — dozens-plus services where you want uniform mTLS, retries, and observability without touching every codebase — but for a handful of services it's overkill, and a few good libraries plus the gateway suffice. The honest interview answer: adopt a mesh when the cross-cutting needs across many services outweigh its operational cost, not by default.

the sidecar trade-off

The sidecar's superpower — intercepting all traffic transparently — is also its cost: every request now traverses two extra proxies (caller's sidecar → callee's sidecar), adding latency and CPU. Newer "sidecar-less"/ambient mesh designs aim to reduce this overhead, but the fundamental tension between transparency and an extra hop remains.

Pitfalls

Gateway as a monolith — stuffing business logic into the gateway recreates the monolith you split up; keep it to cross-cutting concerns.
Gateway as a SPOF — it's on the critical path for all traffic; run it highly available and behind load balancers.
Premature mesh — adopting a service mesh for five services adds more complexity than it removes.
Latency blindness — extra hops (gateway + two sidecars) add up; measure the tail latency they introduce.
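
To see why the extra hops deserve measurement, a back-of-the-envelope model helps. The sketch below assumes each proxy traversal is independently slow with some small probability and adds a fixed average overhead; both are simplifications, the function names are hypothetical, and the numbers you plug in should come from your own measurements.

```python
def proxy_traversals(internal_calls: int) -> int:
    """Proxy traversals on one request path: 1 gateway + 2 sidecars per internal call.

    Assumes the internal calls happen one after another on the critical path.
    """
    return 1 + 2 * internal_calls


def p_slow_request(p_slow_hop: float, hops: int) -> float:
    """Probability that a request hits at least one slow hop, assuming independence."""
    return 1.0 - (1.0 - p_slow_hop) ** hops


def added_latency_ms(per_hop_ms: float, hops: int) -> float:
    """Total average overhead if every traversal adds `per_hop_ms` milliseconds."""
    return per_hop_ms * hops
```

Under these assumptions, a 1% chance of a slow traversal becomes roughly a 3% chance for a request that crosses the gateway and two sidecars, and it passes 10% once the request makes five sequential internal calls. That is why the average overhead per hop can look negligible while the tail latency does not.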

takeaway

API gateways and service meshes both extract cross-cutting concerns out of services — the gateway at the north-south edge (auth, rate limiting, routing, aggregation for external clients) and the mesh on east-west internal traffic (mTLS, retries, load balancing, observability via sidecars). Add a gateway early; add a mesh only when many services make its uniformity worth the latency and operational cost.

🎯 interview hot-takes

API gateway vs service mesh? Gateway handles north-south (client→system) traffic as a central edge; mesh handles east-west (service→service) traffic via sidecar proxies.
What does a gateway do? Single entry point: routing, authentication, rate limiting, TLS termination, request aggregation, protocol translation.
What's the sidecar pattern? A proxy deployed next to each service that intercepts all its traffic, adding mTLS/retries/LB/metrics without changing app code; a control plane configures all sidecars.
What's a BFF? A gateway per client type (web, mobile) that tailors aggregation and response shape to that client.
When NOT to use a mesh? With few services — the per-call latency and operational overhead of sidecars outweigh the benefit; a gateway plus libraries is enough.