TLS: How the Handshake Works and What Goes Wrong in Production
TLS 1.3 cut the handshake to one round-trip and made forward secrecy mandatory. Understanding the mechanics — certificate chains, cipher negotiation, OCSP stapling — turns certificate errors from mysterious to diagnosable.
What TLS actually protects
TLS provides three guarantees:
- Confidentiality: data is encrypted in transit. An eavesdropper on the network sees ciphertext.
- Integrity: a tampered message is detected. The MAC or AEAD authentication tag on each record catches modification.
- Authentication: the server proves it holds the private key for the certificate. The client confirms it is talking to the intended server, not an impersonator.
What TLS does not provide: security at the endpoints. If a user's browser is compromised, or your server's private key is leaked, TLS cannot help. It protects the wire, not the machines on either end.
The TLS record and handshake layers
TLS operates in two layers: the handshake protocol (negotiates keys and authenticates) and the record protocol (encrypts application data using the negotiated keys). Once the handshake completes, every subsequent byte is encrypted and integrity-checked.
Prerequisites
- Public/private key cryptography
- Symmetric vs asymmetric encryption
- TCP
Key Points
- Asymmetric cryptography (RSA, ECDHE) is used only during the handshake, to authenticate the server and establish shared keys.
- Symmetric encryption (AES-GCM, ChaCha20) encrypts the actual data — much faster than asymmetric.
- Each TLS record has a sequence number used in the MAC, so reordered or replayed records are detected.
- TLS sessions can be resumed without a full handshake using session tickets or session IDs.
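The sequence-number point can be illustrated with a small sketch. This is not the real TLS record format: it assumes a bare HMAC-SHA256 over an 8-byte counter plus the payload, whereas TLS 1.2 also covers the record type, version, and length, and TLS 1.3 folds the counter into the AEAD nonce instead. The names `protect` and `verify` are hypothetical.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def protect(key: bytes, seq: int, payload: bytes) -> bytes:
    """Return the payload followed by a MAC over (sequence number || payload)."""
    tag = hmac.new(key, struct.pack(">Q", seq) + payload, hashlib.sha256).digest()
    return payload + tag


def verify(key: bytes, expected_seq: int, record: bytes) -> bytes:
    """Recompute the MAC with the receiver's own counter; raise on mismatch."""
    payload, tag = record[:-TAG_LEN], record[-TAG_LEN:]
    expected = hmac.new(
        key, struct.pack(">Q", expected_seq) + payload, hashlib.sha256
    ).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad record MAC")
    return payload


key = b"k" * 32               # illustrative key, not derived from a handshake
r0 = protect(key, 0, b"hello")
r1 = protect(key, 1, b"world")
```

Because the receiver supplies the sequence number from its own counter, a record that arrives out of order or is replayed later fails verification even though its bytes were never modified.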
TLS 1.2 vs TLS 1.3: the practical differences
TLS 1.2 (2008) requires two round-trips before application data flows:
Client → Server: ClientHello (supported ciphers, client random)
Server → Client: ServerHello + Certificate + ServerKeyExchange + ServerHelloDone
Client → Server: ClientKeyExchange + ChangeCipherSpec + Finished
Server → Client: ChangeCipherSpec + Finished
Client → Server: [application data]
TLS 1.3 (2018) collapses this to one round-trip:
Client → Server: ClientHello (key share included)
Server → Client: ServerHello + EncryptedExtensions + Certificate + CertificateVerify + Finished
Client → Server: Finished + [application data immediately]
The client includes a key share in its first message, guessing which group the server will accept. If the server supports that group (it almost always does), the key exchange completes in one round-trip; if not, the server replies with a HelloRetryRequest and the handshake costs one extra round-trip. For high-latency connections or APIs making many short-lived connections, this difference is measurable.
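As a rough back-of-the-envelope model of that difference, the sketch below counts round-trips before the client can send its first application byte. It assumes one round-trip for the TCP handshake, no HelloRetryRequest, no session resumption, and zero processing time; the function name is hypothetical.

```python
TLS12_ROUND_TRIPS = 2  # full TLS 1.2 handshake
TLS13_ROUND_TRIPS = 1  # full TLS 1.3 handshake, key share accepted


def handshake_delay_ms(rtt_ms: float, tls_round_trips: int,
                       tcp_round_trips: int = 1) -> float:
    """Time before the client can send its first application byte."""
    return (tcp_round_trips + tls_round_trips) * rtt_ms


if __name__ == "__main__":
    for rtt in (20, 100, 250):
        print(rtt,
              handshake_delay_ms(rtt, TLS12_ROUND_TRIPS),
              handshake_delay_ms(rtt, TLS13_ROUND_TRIPS))
```

Under these assumptions, a 100 ms round-trip time gives 300 ms of setup for TLS 1.2 and 200 ms for TLS 1.3, a saving of one full round-trip on every new connection.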
TLS 1.2 vs TLS 1.3
Most new deployments should prefer TLS 1.3. The main reason to retain 1.2 support is legacy client compatibility.
TLS 1.2:
- Two round-trips before application data
- Supports RSA key exchange: private key exposure decrypts recorded traffic
- Large cipher suite surface including weak options (RC4, 3DES, export ciphers)
- Required for clients on older OS/browser versions
TLS 1.3:
- One round-trip (0-RTT available for session resumption, with replay caveats)
- Forward secrecy mandatory: ephemeral key exchange only
- Cipher suite list reduced to five strong options
- Encrypts more of the handshake, including server certificate
Enable TLS 1.3 on all new infrastructure. Keep TLS 1.2 as a fallback only if your user base includes clients that cannot support 1.3. Drop TLS 1.0 and 1.1 — they are deprecated and should not be accepted.
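On the client side, the same policy can be sketched with Python's standard `ssl` module. This is a minimal example, not a full client: `make_client_context` and `negotiated_version` are hypothetical helper names, and the host you pass in is up to you.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client context that refuses TLS 1.0/1.1 and prefers the newest version."""
    ctx = ssl.create_default_context()            # verifies chain and hostname
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # 1.2 kept only as a fallback
    return ctx


def negotiated_version(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect and report the protocol version the two sides agreed on."""
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()  # a string such as "TLSv1.3"
```

Calling `negotiated_version` against your own endpoints is a quick way to confirm that TLS 1.3 is actually being negotiated rather than silently falling back to 1.2.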
Certificate chains: the part that breaks most deployments
A browser does not trust your server certificate directly. It trusts a small set of root CAs whose certificates are embedded in the operating system and browser. Your server certificate is issued by an intermediate CA, whose own certificate is signed by a root CA.
Root CA (in browser trust store)
└── Intermediate CA (issued by Root CA)
    └── Your server certificate (issued by Intermediate CA)
The server must present its certificate and the intermediate CA certificate during the handshake. Do not rely on the client to find the intermediate on its own: some clients fetch or cache missing intermediates, but many do not.
The most common certificate misconfiguration: the server certificate is installed but the intermediate CA is not bundled with it. Browsers see a broken chain and throw NET::ERR_CERT_AUTHORITY_INVALID. This passes all local tests (your browser may have cached the intermediate) but fails for users who have never visited a site signed by that intermediate.
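Verify your chain is complete:
# List the subject (s:) and issuer (i:) of every certificate the server sends
openssl s_client -connect yourdomain.com:443 -servername yourdomain.com -showcerts </dev/null 2>/dev/null | \
grep -E '^ *[0-9]+ s:|^ +i:'
You should see at least two numbered entries in the output: entry 0 is your leaf certificate and entry 1 is the intermediate CA, and the issuer of entry 0 should equal the subject of entry 1.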
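To see why a missing intermediate breaks validation, here is a toy model of chain building. It is a sketch under heavy assumptions: certificates are reduced to subject and issuer names, no signatures are checked, and `Cert` and `build_chain` are hypothetical names rather than any library's API.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str


def build_chain(leaf: Cert, presented: List[Cert],
                trusted_roots: Set[str]) -> Optional[List[Cert]]:
    """Follow issuer links from the leaf until a trusted root is reached."""
    by_subject = {c.subject: c for c in presented}
    chain = [leaf]
    current = leaf
    # Bound the walk so a loop of issuers cannot run forever.
    for _ in range(len(presented) + 1):
        if current.issuer in trusted_roots:
            return chain
        issuer_cert = by_subject.get(current.issuer)
        if issuer_cert is None:
            return None  # gap in the chain: the client cannot reach a root
        chain.append(issuer_cert)
        current = issuer_cert
    return None


roots = {"Example Root CA"}
intermediate = Cert("Example Intermediate CA", "Example Root CA")
leaf = Cert("yourdomain.com", "Example Intermediate CA")
```

With the intermediate in `presented`, the walk reaches the root; with only the leaf, it stops at the first missing issuer, which is the situation a client with no cached intermediates is in.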
Certificate verification: what the client checks
When the client receives the server certificate chain:
- Chain validation: each certificate must be signed by the one above it, up to a trusted root.
- Expiry: the current time must fall between the certificate's notBefore and notAfter fields. An expired certificate fails immediately.
- Subject / SAN match: the hostname must match one of the certificate's Subject Alternative Names (modern browsers no longer fall back to the Common Name). Wildcard certificates (*.example.com) match one level deep only: api.example.com matches, v1.api.example.com does not.
- Revocation: the CA may have revoked the certificate (key compromise, mis-issuance). Checked via CRL (certificate revocation list) or OCSP.
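The one-level wildcard rule from the list above can be sketched as a label-by-label comparison. This is a simplified illustration, assuming the wildcard appears only as the entire left-most label; it ignores internationalized names, IP address SANs, and other details that real validators handle. The function names are hypothetical.

```python
from typing import Iterable


def san_matches(pattern: str, hostname: str) -> bool:
    """Compare a SAN entry with a hostname, one DNS label at a time."""
    p_labels = pattern.lower().rstrip(".").split(".")
    h_labels = hostname.lower().rstrip(".").split(".")
    if len(p_labels) != len(h_labels):
        return False  # a wildcard never spans more than one label
    if p_labels[0] == "*":
        # The wildcard stands for exactly one non-empty left-most label.
        return h_labels[0] != "" and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


def hostname_in_sans(hostname: str, sans: Iterable[str]) -> bool:
    """True if any SAN entry covers the hostname."""
    return any(san_matches(san, hostname) for san in sans)
```

Note that `*.example.com` does not cover the bare `example.com` either, which is why certificates commonly list both names as separate SAN entries.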
OCSP stapling: solving the revocation latency problem
Checking OCSP at handshake time adds a round-trip to a CA server, which adds latency and creates a liveness dependency on the CA. OCSP stapling moves this check to the server: the server periodically fetches a signed OCSP response from the CA and attaches ("staples") it to the TLS handshake. The client gets the freshness proof without the extra round-trip.
Enable it in nginx:
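# Staple a cached OCSP response to each handshake
ssl_stapling on;
# Verify the response first (needs the issuer chain, e.g. via ssl_trusted_certificate)
ssl_stapling_verify on;
# DNS resolver nginx uses to look up the CA's OCSP responder
resolver 8.8.8.8 valid=300s;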
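Many CAs operate OCSP responders, but not all do, so confirm that yours does before relying on stapling. Without stapling, browsers that check OCSP typically soft-fail: if the responder cannot be reached they accept the certificate anyway to avoid blocking the user, and some skip the online check entirely. In practice this means revoked certificates often still work.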
Common production failures
Certificate expiry: Let's Encrypt certificates are valid for 90 days. Set a monitoring alert for when fewer than 30 days of validity remain. Automated renewal via certbot or other ACME clients (cert-manager in Kubernetes) handles this, but the automation itself can fail silently, so check the renewal logs.
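An expiry alert of this kind can be sketched with the standard library alone. The helper names (`days_remaining`, `needs_alert`, `fetch_not_after`) are hypothetical, and the 30-day default simply mirrors the threshold above; `ssl.cert_time_to_seconds` and `getpeercert()` are standard `ssl` module calls.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_remaining(not_after: str, now: datetime) -> float:
    """Days between `now` and a notAfter string as returned by getpeercert()."""
    expires = ssl.cert_time_to_seconds(not_after)  # seconds since the epoch
    return (expires - now.timestamp()) / 86400


def needs_alert(not_after: str, now: datetime, threshold_days: int = 30) -> bool:
    """True when the certificate has fewer than `threshold_days` left."""
    return days_remaining(not_after, now) < threshold_days


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read the notAfter field of the certificate a live server presents."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `needs_alert(fetch_not_after(host), datetime.now(timezone.utc))` for each hostname. If the certificate has already expired, the connection in `fetch_not_after` fails with a verification error, which should also be treated as an alert.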
SNI mismatch: When multiple HTTPS sites share an IP, the client sends the hostname in the server_name TLS extension before the certificate is presented. If the server does not support SNI or selects the wrong certificate, it presents one that does not match the requested hostname, and the client rejects the connection with a name-mismatch error. In nginx, each server block's ssl_certificate must cover the names in its server_name.
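The selection step can be pictured with a toy model. This is not how nginx is implemented; it only assumes a mapping from hostnames to certificates with a default for everything else, and `select_certificate` and the certificate labels are hypothetical.

```python
from typing import Dict, Optional


def select_certificate(sni: Optional[str], vhosts: Dict[str, str],
                       default_cert: str) -> str:
    """Pick the certificate for the requested name, else fall back to the default."""
    if sni is None:
        return default_cert  # client sent no server_name extension
    return vhosts.get(sni.lower(), default_cert)


vhosts = {
    "a.example.com": "cert-for-a",
    "b.example.com": "cert-for-b",
}
```

A request for `b.example.com` that arrives without SNI, or for a name missing from the mapping, gets the default certificate. The server completes its side without complaint, and the mismatch only surfaces as an error on the client.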
Missing intermediate CA on new server: Common when restoring from backup or provisioning new instances with only the leaf certificate copied over. The chain must include the intermediate.
mTLS for service-to-service: In mTLS (mutual TLS), the client also presents a certificate. The server validates the client certificate against a trusted CA. This is well suited to internal service-to-service calls: it is stronger than shared secrets and avoids the rotation burden of API keys.
# nginx mTLS: require client certificate
ssl_client_certificate /etc/nginx/client-ca.crt;
ssl_verify_client on;
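For a service written in Python rather than fronted by nginx, a roughly equivalent server-side setup might look like the sketch below. The function name is hypothetical, and the file paths are left optional only so the context can be built without files on disk; a real server needs all three.

```python
import ssl
from typing import Optional


def make_mtls_server_context(cert_file: Optional[str] = None,
                             key_file: Optional[str] = None,
                             client_ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Server context that rejects clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED  # same intent as ssl_verify_client on
    if cert_file is not None:
        # The server's own certificate (with intermediates) and private key.
        ctx.load_cert_chain(cert_file, key_file)
    if client_ca_file is not None:
        # CA that signs client certificates, like ssl_client_certificate.
        ctx.load_verify_locations(cafile=client_ca_file)
    return ctx
```

With `CERT_REQUIRED` on a server context, the handshake fails for any client that presents no certificate or one that does not chain to the configured CA.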
From your machine the TLS handshake succeeds, but users with fresh browsers on mobile devices report 'Certificate not trusted' errors. Your local browser shows the certificate as valid. What is the most likely cause?
Difficulty: medium
The certificate was issued 6 months ago by a commercial CA. It has not expired. The domain name matches.
A. The certificate uses the SHA-1 signature algorithm, which mobile browsers have deprecated
Incorrect. SHA-1 deprecation is a real issue, but publicly trusted CAs stopped issuing SHA-1 certificates years ago, so a certificate issued six months ago will not use it, and your local browser would flag it as well.
B. The intermediate CA certificate is missing from the server's certificate bundle
Correct! Desktop browsers often cache intermediate CAs from previous visits to other sites signed by the same CA. Mobile browsers on fresh devices have empty caches and cannot complete the chain unless the server sends the intermediate. This is the classic 'works on my machine' certificate problem.
C. The certificate's not-before date is in the future due to clock skew on the server
Incorrect. The not-before date is baked into the certificate at issuance and is unaffected by the server's clock. Clients check validity against their own clocks, and a certificate issued six months ago is well past its not-before date.
D. Mobile browsers require EV (Extended Validation) certificates
Incorrect. Mobile browsers accept DV (domain-validated) certificates. EV status affects the browser UI, not whether the certificate is trusted.
Hint: Think about what a brand-new browser has in its cache versus a browser that has visited many HTTPS sites.