API Gateway Payload Limits: The 10MB Ceiling and What To Do When You Hit It


API Gateway enforces a hard 10MB payload limit on both requests and responses. Oversized requests are rejected with a 413; oversized responses fail, typically as a 502 or a truncated body. Large file transfers need a different architecture (presigned URLs, multipart uploads, or streaming), not bigger payloads.


The limits

API Gateway enforces the following payload size limits. These are hard limits — you cannot increase them by request:

| API Type | Request payload | Response payload |
|---|---|---|
| REST API | 10 MB | 10 MB |
| HTTP API | 10 MB | 10 MB |
| WebSocket API | 128 KB per message | 128 KB per message |

For REST and HTTP APIs, exceeding the request limit returns HTTP 413 ({"message": "Request Too Large"}). Exceeding the response limit truncates the response — the client receives a partial body, usually causing a JSON parse error or a 502 from API Gateway.

Why API Gateway has a fixed payload ceiling


API Gateway is a managed proxy service. It buffers the full request payload before forwarding to Lambda or HTTP integrations, and buffers the full response before returning to the client. The 10MB limit exists to bound memory allocation per request on the shared API Gateway infrastructure.

Prerequisites

  • HTTP request/response lifecycle
  • API Gateway integration types
  • S3 presigned URLs

Key Points

  • The limit applies to the body only — headers and query parameters don't count against it.
  • Base64-encoded binary data counts toward the limit at its encoded size (about 33% larger than the raw binary).
  • The WebSocket 128KB limit applies per message, not per connection session. Messages larger than 32KB must be split into frames of 32KB or less.
  • Lambda's own payload limit (6MB synchronous, 256KB async) may hit before the API Gateway limit.
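
The base64 overhead mentioned in the list above can be worked out directly. The following is a minimal sketch: the function names are illustrative, and it ignores any JSON envelope wrapped around the encoded body, so real headroom is somewhat smaller.

```python
API_GATEWAY_LIMIT = 10 * 1024 * 1024  # 10 MB payload limit
LAMBDA_SYNC_LIMIT = 6 * 1024 * 1024   # 6 MB synchronous invocation limit


def base64_size(raw_bytes: int) -> int:
    """Length of the padded base64 encoding of raw_bytes bytes."""
    # Every 3 input bytes (rounded up) become 4 output characters
    return 4 * ((raw_bytes + 2) // 3)


def max_raw_bytes(limit: int) -> int:
    """Largest binary size whose base64 encoding still fits within limit."""
    return (limit // 4) * 3
```

Under these assumptions, `max_raw_bytes(API_GATEWAY_LIMIT)` is 7,864,320 bytes (7.5 MB) and `max_raw_bytes(LAMBDA_SYNC_LIMIT)` is 4,718,592 bytes (4.5 MB), so binary payloads run out of room well before their raw size reaches the nominal limit.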

Lambda's own payload limits compound the problem

Lambda has separate, lower limits that hit before API Gateway's:

  • Synchronous invocation (what API Gateway uses): 6MB request payload, 6MB response payload
  • Asynchronous invocation: 256KB

If your Lambda integration processes large payloads, Lambda's 6MB limit is the binding constraint, not API Gateway's 10MB. A 7MB file upload fails with a Lambda error, not a 413 from API Gateway.
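
One way to make this failure explicit is to measure the serialized response inside the function before returning it. The sketch below is illustrative only: `guard_response` and the 5.5 MB threshold are assumptions rather than AWS-defined values, and the exact overhead Lambda counts against its limit may differ slightly from this estimate.

```python
import json

LAMBDA_SYNC_LIMIT = 6 * 1024 * 1024      # 6 MB synchronous response limit
SAFE_THRESHOLD = int(5.5 * 1024 * 1024)  # assumed safety margin, not an AWS value


def serialized_size(response: dict) -> int:
    """Approximate size in bytes of the JSON-serialized proxy response."""
    return len(json.dumps(response).encode("utf-8"))


def guard_response(response: dict, threshold: int = SAFE_THRESHOLD) -> dict:
    """Return the response unchanged if it fits, else a small explicit error."""
    size = serialized_size(response)
    if size <= threshold:
        return response
    return {
        "statusCode": 500,
        "body": json.dumps({
            "error": "response_too_large",
            "size_bytes": size,
            "hint": "store the payload in S3 and return a presigned URL",
        }),
    }
```

Wrapping the handler's return value as `return guard_response(result)` turns an opaque gateway error into a response that names the cause.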

Handling large file uploads: presigned URLs

The standard pattern for large file uploads bypasses API Gateway entirely. The client requests a presigned URL from your API, then uploads directly to S3:

Client → API Gateway → Lambda: "give me an upload URL"
Lambda → S3: create presigned PUT URL
Lambda → API Gateway → Client: presigned URL

Client → S3: PUT file directly (no API Gateway involved)

import json

import boto3

def handler(event, context):
    s3 = boto3.client('s3')

    # Generate presigned URL for direct client upload
    presigned_url = s3.generate_presigned_url(
        'put_object',
        Params={
            'Bucket': 'uploads-bucket',
            'Key': f"uploads/{event['pathParameters']['user_id']}/{event['queryStringParameters']['filename']}",
            'ContentType': event['queryStringParameters']['content_type'],
        },
        ExpiresIn=300  # 5 minutes
    )

    return {
        'statusCode': 200,
        'body': json.dumps({'upload_url': presigned_url})
    }

The client uploads directly to S3 with a PUT request using the presigned URL. The file size limit is S3's: 5GB for a single PUT, 5TB for multipart upload.
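
On the client side, the upload is an ordinary HTTP PUT. A minimal sketch using only the Python standard library follows; `build_upload_request` and `upload_file` are hypothetical helper names. Because the handler signs `ContentType` into the URL, the client is expected to send the same `Content-Type` header, otherwise S3 rejects the request with a signature mismatch.

```python
import urllib.request


def build_upload_request(upload_url: str, data: bytes, content_type: str) -> urllib.request.Request:
    """Build the PUT request for a presigned S3 URL without sending it."""
    return urllib.request.Request(
        upload_url,
        data=data,
        method="PUT",
        # Must match the ContentType that was signed into the presigned URL
        headers={"Content-Type": content_type},
    )


def upload_file(upload_url: str, path: str, content_type: str) -> int:
    """Upload a local file and return the HTTP status code."""
    with open(path, "rb") as f:
        data = f.read()  # reads the whole file into memory; acceptable for a sketch
    request = build_upload_request(upload_url, data, content_type)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Reading the whole file into memory keeps the example short. For files approaching the single-PUT limit, a multipart upload is the more practical route.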

After upload completes, trigger processing via an S3 event notification to Lambda — keeping the heavy computation out of the synchronous API request path entirely.
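
A hedged sketch of the function such a notification might invoke is shown below. The event shape follows the S3 notification format, and `process_upload` is a placeholder for your own logic. Object keys arrive URL-encoded in the event, so they are decoded before use.

```python
from urllib.parse import unquote_plus


def extract_objects(event: dict) -> list[tuple[str, str, int]]:
    """Return (bucket, key, size) for each object in an S3 notification event."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Keys are URL-encoded in the event (spaces arrive as '+')
        key = unquote_plus(s3_info["object"]["key"])
        size = s3_info["object"].get("size", 0)
        objects.append((bucket, key, size))
    return objects


def process_upload(bucket: str, key: str, size: int) -> None:
    """Placeholder: replace with real processing of the uploaded object."""
    print(f"processing s3://{bucket}/{key} ({size} bytes)")


def handler(event, context):
    # Invoked asynchronously by S3, outside the API request path
    for bucket, key, size in extract_objects(event):
        process_upload(bucket, key, size)
```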

Handling large responses: presigned download URLs

The same pattern applies to downloads. Instead of returning large data through API Gateway, return a presigned GET URL:

import json

import boto3


def handler(event, context):
    s3 = boto3.client('s3')
    report_key = f"reports/{event['pathParameters']['report_id']}.csv"

    # Check the file size first
    head = s3.head_object(Bucket='reports-bucket', Key=report_key)
    size_mb = head['ContentLength'] / (1024 * 1024)

    if size_mb > 5:  # stay below Lambda's 6MB response limit, which binds before API Gateway's 10MB
        # Return a presigned URL for direct download
        download_url = s3.generate_presigned_url(
            'get_object',
            Params={'Bucket': 'reports-bucket', 'Key': report_key},
            ExpiresIn=900  # 15 minutes
        )
        return {
            'statusCode': 202,
            'body': json.dumps({'download_url': download_url})
        }
    else:
        # Small enough to return inline
        obj = s3.get_object(Bucket='reports-bucket', Key=report_key)
        return {
            'statusCode': 200,
            'headers': {'Content-Type': 'text/csv'},
            'body': obj['Body'].read().decode('utf-8')
        }

Lambda response streaming: bypass the buffering limit

Lambda response streaming (launched 2023) lets a function send its response to the client incrementally instead of buffering it in full. This allows responses larger than 6MB and lowers time-to-first-byte.

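import json

# Conceptual sketch only: the managed Python runtime does not stream a
# generator's output. Native streaming is available in the Node.js managed
# runtimes; Python needs a custom runtime or the Lambda Web Adapter.
def handler(event, context):
    # The first chunk is sent before the dataset is generated
    yield json.dumps({"status": "starting"}).encode()

    # generate_large_dataset() is a placeholder for your own chunk source
    for chunk in generate_large_dataset():
        yield chunk.encode()

    yield json.dumps({"status": "complete"}).encode()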

Streaming requires:

  • A Lambda function URL (not a standard API Gateway proxy integration)
  • Invoke mode set to RESPONSE_STREAM in the function URL configuration
  • The client must handle chunked transfer encoding

For standard REST/HTTP API Gateway integrations, streaming is not supported — the full response must fit within 10MB.

WebSocket APIs and the 128KB message limit

WebSocket APIs have a much smaller per-message limit. For real-time applications sending large data:

# Split large data into chunks before sending over WebSocket
import json

def send_large_payload(apigw_client, connection_id, data: str):
    MAX_CHUNK = 100_000  # characters, not bytes: assumes mostly plain ASCII data so the encoded message stays below 128KB
    chunks = [data[i:i+MAX_CHUNK] for i in range(0, len(data), MAX_CHUNK)]
    total = len(chunks)

    for i, chunk in enumerate(chunks):
        message = json.dumps({
            "type": "chunk",
            "index": i,
            "total": total,
            "data": chunk
        })
        apigw_client.post_to_connection(
            ConnectionId=connection_id,
            Data=message.encode()
        )

The client reassembles chunks. This is simpler than it sounds for structured data — the alternative is an out-of-band download URL sent over WebSocket, which is cleaner for large one-time transfers.
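
For the receiving side, a minimal reassembly sketch is shown here in Python for consistency with the sender, although a browser client would do the same in JavaScript. `ChunkAssembler` is a hypothetical helper. It assumes one transfer at a time per connection, since the message format above carries no transfer ID.

```python
import json


class ChunkAssembler:
    """Collects chunk messages and returns the full payload once complete."""

    def __init__(self):
        self._chunks = {}
        self._total = None

    def add(self, raw_message: str):
        """Feed one message; return the reassembled string, or None if incomplete."""
        message = json.loads(raw_message)
        if message.get("type") != "chunk":
            return None
        self._total = message["total"]
        # Index by position so out-of-order or duplicate delivery is harmless
        self._chunks[message["index"]] = message["data"]
        if len(self._chunks) < self._total:
            return None
        data = "".join(self._chunks[i] for i in range(self._total))
        # Reset so the same assembler can handle the next transfer
        self._chunks.clear()
        self._total = None
        return data
```

Keying chunks by `index` rather than appending them in arrival order means the client does not depend on delivery order, at the cost of holding all chunks in memory until the last one arrives.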

A Lambda function behind API Gateway generates a 12MB CSV report and tries to return it as the response body. The function succeeds (no Lambda error), but clients receive a 502 error. What is happening?

Difficulty: easy

The Lambda function itself doesn't error. CloudWatch logs show the function completing successfully. API Gateway logs show a 502 response.

  • A. Lambda timed out before returning the full response
    Incorrect. A timeout produces a specific error in CloudWatch and a different error message. The function completing successfully in logs rules this out.
  • B. API Gateway's 10MB response payload limit was exceeded by the 12MB response; API Gateway returns 502 when the response exceeds the limit
    Correct! When a Lambda integration returns a response larger than API Gateway's limit, API Gateway returns a 502 Bad Gateway to the client. The Lambda function completes without error: it successfully returned the payload to API Gateway. API Gateway then rejects it because it exceeds the 10MB limit. The fix is to store the report in S3 and return a presigned download URL instead of the report content.
  • C. The CSV content type is not supported by API Gateway
    Incorrect. API Gateway supports arbitrary content types for responses. Content type is not a factor here.
  • D. Lambda's 6MB response limit was exceeded, causing the error
    Incorrect. Lambda's 6MB limit applies. But since the response is 12MB, API Gateway's 10MB limit also applies, and either would cause a failure. In practice Lambda's 6MB limit would hit first, producing a Lambda error, not a successful Lambda execution followed by a 502.

Hint: The function succeeds but the 502 comes from API Gateway. What limit does API Gateway enforce on responses?