Nginx Reverse Proxy and Load Balancing: upstream, Algorithms, and Health Checks


nginx upstream blocks define backend server pools. proxy_pass routes traffic to them. Load balancing algorithms: round-robin (default), least_conn (best for long-lived connections), ip_hash (session affinity), weighted. upstream keepalive avoids per-request TCP handshakes. nginx does passive health checks by default — active checks require nginx Plus or a module.


Reverse proxy setup

nginx receives client requests and forwards them to one or more backend servers. The client only sees nginx — backend addresses are never exposed:

server {
    listen 80;
    server_name app.example.com;

    location / {
        proxy_pass http://backend;

        # Forward original client info to backend
        proxy_set_header Host              $host;
        proxy_set_header X-Real-IP         $remote_addr;
        proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

proxy_pass http://backend refers to an upstream block by name. Without an upstream block, it proxies to a single address directly.
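For comparison, a single-backend setup needs no upstream block at all; proxy_pass can point at an address directly (the address below is illustrative):

```nginx
# No upstream block — proxy every request to one backend directly
server {
    listen 80;
    server_name app.example.com;

    location / {
        # Proxying straight to an IP:port (or hostname) works, but skips
        # pool features: weights, backup servers, keepalive, failover
        proxy_pass http://127.0.0.1:8080;
    }
}
```

As soon as you need a second backend or retry behavior, switch to a named upstream block.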

upstream block and load balancing algorithms

The upstream block defines the backend pool. nginx selects a server on each request using the configured algorithm:

# Round-robin (default) — distributes requests evenly in sequence
upstream backend {
    server app1.internal:8080;
    server app2.internal:8080;
    server app3.internal:8080;
}

# Weighted round-robin — app1 gets 3× the requests of app2
upstream backend_weighted {
    server app1.internal:8080 weight=3;
    server app2.internal:8080 weight=1;
}

# Least connections — sends to the server with fewest active connections
upstream backend_least_conn {
    least_conn;
    server app1.internal:8080;
    server app2.internal:8080;
}

# IP hash — same client IP always routes to the same server
upstream backend_ip_hash {
    ip_hash;
    server app1.internal:8080;
    server app2.internal:8080;
}

| Algorithm | How it works | Best for |
|---|---|---|
| Round-robin (default) | Cycles through servers in order | Stateless services with uniform request cost |
| Weighted | Round-robin with per-server multiplier | Mixed-capacity backends |
| least_conn | Sends to server with fewest active connections | Long-lived connections (WebSocket, file uploads) |
| ip_hash | Hash of client IP determines server | Session affinity without sticky cookies |
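Open-source nginx also has a generic hash directive (since 1.7.2) that balances on any key you choose; hashing on the request URI, as sketched here, is a common choice in front of caching backends:

```nginx
# Generic hash — same request URI always routes to the same server,
# which keeps per-server caches warm. "consistent" enables ketama
# consistent hashing, so adding or removing a server remaps only a
# fraction of keys instead of reshuffling everything.
upstream backend_uri_hash {
    hash $request_uri consistent;
    server app1.internal:8080;
    server app2.internal:8080;
}
```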

keepalive connections

By default, nginx opens a new TCP connection to upstream for every proxied request. keepalive maintains a pool of persistent connections:

upstream backend {
    server app1.internal:8080;
    server app2.internal:8080;

    # Keep up to 32 idle connections per worker process
    keepalive 32;
}

server {
    location / {
        proxy_pass http://backend;

        # Required for keepalive to work — tells upstream to keep the connection
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}

Without keepalive, a high-traffic nginx instance opens and tears down thousands of TCP connections per second to upstream — visible as TIME_WAIT backlog in ss -s. With keepalive 32, each nginx worker maintains up to 32 idle connections that get reused.
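Newer nginx versions (1.15.3+) also let you bound how long an idle upstream connection lives and how many requests it serves before being recycled; the values below are the documented defaults, shown explicitly:

```nginx
upstream backend {
    server app1.internal:8080;
    server app2.internal:8080;

    keepalive 32;
    # Close an idle upstream connection after 60s (the default)
    keepalive_timeout 60s;
    # Recycle a connection after it has served this many requests —
    # periodic recycling frees per-connection memory on both sides
    keepalive_requests 1000;
}
```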

nginx passive health checks mark servers as failed for fail_timeout — active checks require nginx Plus


nginx open-source only does passive health checks: if a request to a server fails (connection refused, timeout, 5xx with proxy_next_upstream), nginx marks it failed for fail_timeout seconds (default: 10s) after max_fails failures (default: 1). It does not proactively probe the server — it discovers failure only when real traffic fails. nginx Plus adds active health checks that probe a /health endpoint on a configurable interval without waiting for real requests to fail.

Prerequisites

  • nginx upstream
  • TCP connection handling
  • health check patterns

Key Points

  • max_fails=1 fail_timeout=10s (defaults): one failure removes the server for 10 seconds.
  • passive check only: nginx discovers failure when a real request fails, not proactively.
  • proxy_next_upstream: controls which error codes trigger failover to the next upstream.
  • Workaround for active checks in open-source nginx: use lua-resty-upstream-healthcheck module or an external health check sidecar.
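One way the lua-resty-upstream-healthcheck workaround can look — this is a sketch based on that module's documented API, requires OpenResty (or nginx built with lua-nginx-module), and the /health endpoint and timing values are assumptions to verify against your version:

```nginx
http {
    # Shared memory zone the health checker uses to store server state
    lua_shared_dict healthcheck 1m;

    init_worker_by_lua_block {
        local hc = require "resty.upstream.healthcheck"
        local ok, err = hc.spawn_checker{
            shm = "healthcheck",      -- shared dict declared above
            upstream = "backend",     -- name of the upstream block to probe
            type = "http",
            http_req = "GET /health HTTP/1.0\r\nHost: backend\r\n\r\n",
            interval = 2000,          -- probe every 2 seconds
            fall = 3,                 -- 3 consecutive failures => mark down
            rise = 2,                 -- 2 consecutive successes => mark up
        }
        if not ok then
            ngx.log(ngx.ERR, "failed to spawn health checker: ", err)
        end
    }
}
```

Unlike passive checks, this probes backends on a timer, so a dead server is removed before any real request has to fail against it.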

Configuring failure behavior

upstream backend {
    server app1.internal:8080 max_fails=3 fail_timeout=30s;
    server app2.internal:8080 max_fails=3 fail_timeout=30s;

    # Backup: only used when all primary servers are unavailable
    server app3.internal:8080 backup;
}

server {
    location / {
        proxy_pass http://backend;

        # Retry on these errors — fail over to next server
        proxy_next_upstream error timeout http_500 http_502 http_503;

        # Limit total attempts to 2 — the first try plus at most one retry
        proxy_next_upstream_tries 2;

        proxy_connect_timeout 2s;
        proxy_read_timeout    30s;
    }
}

proxy_next_upstream is the passive check mechanism: when a request to one upstream server returns these errors, nginx retries on the next available server. Without it, the client immediately sees the upstream error.
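One retry subtlety: since nginx 1.9.13, non-idempotent requests (POST, LOCK, PATCH) are not retried by default even when proxy_next_upstream matches, because replaying them can duplicate side effects. Opting in looks like this:

```nginx
location / {
    proxy_pass http://backend;

    # non_idempotent allows retrying POST/LOCK/PATCH on failover —
    # only safe if the backend deduplicates, e.g. via idempotency keys
    proxy_next_upstream error timeout http_502 http_503 non_idempotent;
}
```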

An nginx upstream has 3 servers using round-robin. Server 2 is handling a long file upload (30 seconds). 6 new requests arrive during that time. How many go to server 2?


Round-robin cycles through servers in order regardless of current connection count. least_conn would behave differently.

  • A. 2 — round-robin distributes evenly, so each server gets 2 of the 6 new requests
    Correct! Round-robin doesn't track active connections — it cycles regardless of server load. With 3 servers and 6 new requests: server 1 gets requests 1 and 4; server 2 gets requests 2 and 5; server 3 gets requests 3 and 6. Server 2 receives 2 new requests even though it's already handling the long upload. This is why least_conn is better for workloads with variable request durations — it would route new requests away from server 2 until its connection count drops.
  • B. 0 — nginx detects server 2 is busy and skips it
    Incorrect. Round-robin has no visibility into connection count or server load. It doesn't skip busy servers — that's what least_conn does.
  • C. 6 — all requests go to the next server in sequence after the slow one
    Incorrect. Round-robin doesn't queue requests behind a slow server. It cycles through all servers in sequence regardless of their current state.
  • D. 1 — nginx uses least_conn automatically when one server is slow
    Incorrect. nginx doesn't automatically switch algorithms. Round-robin is used unless explicitly configured otherwise with least_conn or ip_hash.

Hint: What information does round-robin use to select a server? What information does it ignore?