S3 Glacier Storage Classes: Retrieval Tradeoffs, Lifecycle Policies, and Vault Lock — Blog

Three storage classes, not one service

"S3 Glacier" is a category of storage, not a single option. There are three distinct storage classes with different access patterns, retrieval speeds, and costs:

| Storage Class | Minimum Storage Duration | Retrieval Time | Use Case |
|---|---|---|---|
| S3 Glacier Instant Retrieval | 90 days | Milliseconds | Quarterly access, immediate when needed |
| S3 Glacier Flexible Retrieval | 90 days | 1–5 minutes (expedited), 3–5 hours (standard), 5–12 hours (bulk) | Annual access, acceptable latency |
| S3 Glacier Deep Archive | 180 days | Within 12 hours (standard), within 48 hours (bulk) | 7–10 year retention, compliance archives |

All three cost significantly less per GB than S3 Standard, with Deep Archive being the cheapest object storage AWS offers. The cost tradeoff is retrieval latency and retrieval pricing.

How Glacier storage works under the hood

Glacier storage classes use a tiered media infrastructure. Glacier Instant Retrieval keeps data on disk in a warm tier. Flexible Retrieval and Deep Archive keep data on colder media (tape or cold disk) and stage it back to disk when a retrieval is requested; the retrieval time reflects that staging step, which must finish before the data can be served.

Prerequisites

S3 storage classes
S3 lifecycle policies
data durability concepts

Key Points

All Glacier classes offer 11 nines (99.999999999%) durability — same as S3 Standard.
Retrieval from Flexible Retrieval and Deep Archive is asynchronous: initiate a restore, wait for staging, then download.
Minimum storage duration means you pay for the full minimum even if you delete early.
Retrieval costs apply per GB retrieved, plus per-request fees. Factor these into TCO for frequently retrieved archives.
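
To make the retrieval-cost point concrete, here is a rough sketch of a per-restore estimate. The function name and both prices are hypothetical placeholders, not AWS list prices; substitute current figures for your Region and retrieval tier.

```python
def retrieval_cost(gb_retrieved, requests, price_per_gb, price_per_1000_requests):
    """Estimate one restore's cost: a per-GB data charge plus a per-request charge."""
    data_charge = gb_retrieved * price_per_gb
    request_charge = (requests / 1000) * price_per_1000_requests
    return data_charge + request_charge


# Hypothetical placeholder prices (NOT real AWS pricing).
EXAMPLE_PRICE_PER_GB = 0.01
EXAMPLE_PRICE_PER_1000_REQUESTS = 0.05

# Restoring 500 GB that is stored as 2,000 separate objects.
cost = retrieval_cost(500, 2000, EXAMPLE_PRICE_PER_GB, EXAMPLE_PRICE_PER_1000_REQUESTS)
print(f"Estimated retrieval cost: ${cost:.2f}")
```

Multiplying this by the number of restores you expect per year gives the retrieval component of TCO, which can outweigh the storage savings for archives that are read often.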

Retrieval mechanics for Flexible Retrieval and Deep Archive

Flexible Retrieval and Deep Archive require initiating a retrieval job before the data is downloadable:

# Initiate a retrieval job
aws s3api restore-object \
  --bucket my-archive-bucket \
  --key "backups/2023/database-backup.tar.gz" \
  --restore-request '{"Days": 3, "GlacierJobParameters": {"Tier": "Standard"}}'

# Check status
aws s3api head-object \
  --bucket my-archive-bucket \
  --key "backups/2023/database-backup.tar.gz"

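# While the restore is running, the response includes:
# "Restore": "ongoing-request=\"true\""
# Once it completes, it shows when the temporary copy expires:
# "Restore": "ongoing-request=\"false\", expiry-date=\"Thu, 01 Jan 2026 00:00:00 GMT\""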

Days: 3 specifies how long the restored copy remains available in S3 before expiring. The restored object is a temporary copy; the original stays in its Glacier storage class.
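
The Restore value returned by head-object is a single string, so a polling script has to pick it apart. The following is a minimal sketch of that parsing, assuming the string has the key="value" shape shown in the CLI output; `RestoreStatus` and `parse_restore_header` are hypothetical names, and the sample strings are illustrative rather than captured from a real bucket.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional


@dataclass
class RestoreStatus:
    in_progress: bool
    expiry: Optional[datetime]  # None while no temporary copy exists yet


def parse_restore_header(value: str) -> RestoreStatus:
    """Parse a string such as: ongoing-request="false", expiry-date="<HTTP date>"."""
    fields = dict(re.findall(r'([\w-]+)="([^"]*)"', value))
    in_progress = fields.get("ongoing-request") == "true"
    expiry_text = fields.get("expiry-date")
    expiry = parsedate_to_datetime(expiry_text) if expiry_text else None
    return RestoreStatus(in_progress=in_progress, expiry=expiry)


done = parse_restore_header(
    'ongoing-request="false", expiry-date="Thu, 01 Jan 2026 00:00:00 GMT"'
)
pending = parse_restore_header('ongoing-request="true"')
print(done.in_progress, done.expiry)
print(pending.in_progress, pending.expiry)
```

A download step would run only once `in_progress` is false and `expiry` is set.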

Retrieval tiers for Flexible Retrieval:

Expedited: 1–5 minutes. Higher per-GB cost. Good for urgent, small-volume retrieval.
Standard: 3–5 hours. Default pricing.
Bulk: 5–12 hours. Lowest cost. For large-scale, non-urgent retrieval.
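
The tier list above lends itself to a simple rule: pick the cheapest tier whose window still meets your deadline. The sketch below encodes that rule using the upper bounds quoted above; those windows are typical rather than guaranteed, and `cheapest_tier_within` and `restore_request` are hypothetical helper names.

```python
# Upper bounds of the Flexible Retrieval windows listed above, in hours,
# ordered from the cheapest tier to the most expensive.
FLEXIBLE_RETRIEVAL_TIERS = [
    ("Bulk", 12.0),
    ("Standard", 5.0),
    ("Expedited", 5 / 60),
]


def cheapest_tier_within(deadline_hours):
    """Return the cheapest tier whose upper bound fits the deadline, or None."""
    for tier, max_hours in FLEXIBLE_RETRIEVAL_TIERS:
        if max_hours <= deadline_hours:
            return tier
    return None


def restore_request(tier, days):
    """Build the JSON payload for `aws s3api restore-object --restore-request`."""
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


print(cheapest_tier_within(24))
print(restore_request(cheapest_tier_within(6), days=3))
```

A deadline shorter than five minutes returns None, which signals that no Flexible Retrieval tier fits and the data belongs in a class with immediate access.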

Lifecycle policies: automating transitions

Any of the three classes can be set directly at upload time through the storage class parameter (GLACIER_IR, GLACIER, or DEEP_ARCHIVE). More commonly, objects start in S3 Standard and are moved by lifecycle rules:

{
  "Rules": [
    {
      "ID": "archive-old-logs",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        },
        {
          "Days": 365,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "Expiration": {
        "Days": 2555
      }
    }
  ]
}

The Expiration block deletes objects 2,555 days (about 7 years) after creation. The same rule in Terraform:

resource "aws_s3_bucket_lifecycle_configuration" "archive" {
  bucket = aws_s3_bucket.logs.id

  rule {
    id     = "archive-old-logs"
    status = "Enabled"

    filter {
      prefix = "logs/"
    }

    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }

    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    transition {
      days          = 365
      storage_class = "DEEP_ARCHIVE"
    }

    expiration {
      days = 2555
    }
  }
}

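S3 evaluates lifecycle rules about once a day, with eligibility calculated against midnight UTC. Objects are not transitioned at exactly day 30; they become eligible at the first midnight UTC after day 30 from creation, and the transition itself can take additional time to complete.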
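
A small sketch of that timing, assuming the rounding rule described in the S3 lifecycle documentation (add the rule's days to the creation time, then round up to the next midnight UTC). The function names are hypothetical, the handling of an object created exactly at midnight is an assumption, and the result is the earliest eligibility time, not a guaranteed transition time.

```python
from datetime import datetime, timedelta, timezone


def eligible_at(created: datetime, days: int) -> datetime:
    """Add `days` to a timezone-aware creation time, then round up to midnight UTC."""
    target = created.astimezone(timezone.utc) + timedelta(days=days)
    midnight = target.replace(hour=0, minute=0, second=0, microsecond=0)
    # Assumption: a time already exactly at midnight is not pushed a further day.
    return midnight if target == midnight else midnight + timedelta(days=1)


def lifecycle_schedule(created: datetime, rule: dict) -> list:
    """List (action, earliest eligibility time) pairs for one lifecycle rule."""
    steps = [
        (t["StorageClass"], eligible_at(created, t["Days"]))
        for t in rule.get("Transitions", [])
    ]
    if "Expiration" in rule:
        steps.append(("EXPIRE", eligible_at(created, rule["Expiration"]["Days"])))
    return steps


rule = {
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"},
        {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
    ],
    "Expiration": {"Days": 2555},
}
created = datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc)
for action, when in lifecycle_schedule(created, rule):
    print(f"{action:<13} eligible from {when:%Y-%m-%d %H:%M} UTC")
```

For an object created on 15 January at 10:30 UTC, the 30-day transition becomes eligible at 00:00 UTC on 15 February, roughly 13.5 hours after the object turns 30 days old.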

Vault Lock: write-once compliance enforcement

Vault Lock belongs to the legacy Glacier vault API, where it enforces a write-once policy on a vault. For objects stored in S3 buckets, including those in the Glacier storage classes, the equivalent feature is S3 Object Lock.

S3 Object Lock provides two retention modes:

Compliance mode: once set, no user — including the root account — can delete or overwrite the object until the retention period expires. Used for SEC Rule 17a-4, FINRA, HIPAA, and similar regulatory requirements.

Governance mode: protects objects from accidental deletion. Users with s3:BypassGovernanceRetention permission can override it. Useful for operational protection without compliance-grade immutability.
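
The difference between the two modes can be summarized as a small decision function. This is a simplified model of the rules described above, not a call into any AWS API: `can_delete_version` is a hypothetical name, and the sketch ignores legal holds and ordinary IAM permissions.

```python
from datetime import datetime, timezone


def can_delete_version(mode, retain_until, now, has_bypass_permission=False):
    """Model whether a locked object version may be deleted at time `now`."""
    if now >= retain_until:
        return True  # retention has expired, so the lock no longer applies
    if mode == "GOVERNANCE":
        # Only principals holding s3:BypassGovernanceRetention may override.
        return has_bypass_permission
    if mode == "COMPLIANCE":
        return False  # nobody, including the root account, until expiry
    raise ValueError(f"unknown retention mode: {mode}")


retain_until = datetime(2031, 1, 1, tzinfo=timezone.utc)
today = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(can_delete_version("GOVERNANCE", retain_until, today, has_bypass_permission=True))
print(can_delete_version("COMPLIANCE", retain_until, today, has_bypass_permission=True))
```

The bypass permission changes the outcome only in governance mode, which is why compliance mode is the one used for regulatory retention.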

# Enable Object Lock when creating a bucket (this also turns on versioning)
aws s3api create-bucket \
  --bucket compliance-archive \
  --object-lock-enabled-for-bucket

# Set retention on a specific object
aws s3api put-object-retention \
  --bucket compliance-archive \
  --key "audit/2024/report.pdf" \
  --retention '{"Mode": "COMPLIANCE", "RetainUntilDate": "2031-01-01T00:00:00Z"}'

Critical: Object Lock requires versioning, and once it is enabled on a bucket it cannot be turned off. Historically it could only be enabled at bucket creation; AWS has since added support for enabling it on existing buckets, but enabling it at creation remains the simplest path. If you have compliance requirements, plan for this upfront.

The combination of S3 Glacier Deep Archive + Compliance Object Lock produces storage that is immutable for the retention period at the lowest possible cost — the standard approach for 7-year financial record retention.

Choosing between Glacier classes

Glacier Instant Retrieval when:

Data must be retrievable immediately when needed (medical imaging, financial records that occasionally require audit access)
Access is quarterly at most; for more frequent access, Standard-IA usually works out cheaper because its retrieval fees are lower
You want Glacier pricing without managing retrieval jobs

Glacier Flexible Retrieval when:

Data is accessed annually or less
Several hours of retrieval time is acceptable
You run large-volume periodic restores (DR testing, annual audits) where bulk retrieval saves significant cost

Glacier Deep Archive when:

Data is accessed rarely if ever — regulatory retention, long-term backup of replaced systems
Minimum 180-day storage duration is acceptable
12–48 hour retrieval time is acceptable
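
The three lists can be condensed into a rough decision helper. The thresholds below (more than 4 accesses a year, a 5-hour wait, a 12-hour wait) are assumptions chosen to mirror the guidance above, not AWS rules, and `suggest_storage_class` is a hypothetical name.

```python
def suggest_storage_class(accesses_per_year, max_wait_hours):
    """Map expected access frequency and tolerable wait onto a storage class."""
    if accesses_per_year > 4:
        # More often than quarterly: outside the Glacier guidance above.
        return "STANDARD_IA"
    if max_wait_hours < 5:
        # Cannot wait out a Standard restore, so reads must be immediate.
        return "GLACIER_IR"
    if accesses_per_year < 1 and max_wait_hours >= 12:
        # Rarely read, and a 12-hour or longer restore is acceptable.
        return "DEEP_ARCHIVE"
    return "GLACIER"  # Flexible Retrieval


print(suggest_storage_class(accesses_per_year=4, max_wait_hours=0))
print(suggest_storage_class(accesses_per_year=1, max_wait_hours=8))
print(suggest_storage_class(accesses_per_year=0.1, max_wait_hours=48))
```

Treat the output as a starting point for a pricing comparison rather than a final answer, since object size and retrieval volume also affect which class is cheapest.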

The minimum storage duration is an often-overlooked cost. Storing 1 TB in Deep Archive for 10 days and then deleting it still costs as if it was stored for 180 days.
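
Here is the arithmetic behind that example as a short sketch. The price is a hypothetical placeholder rather than a real AWS rate, the 30-day month is a simplification, and early deletion is modeled as simply billing for the full minimum duration.

```python
def billed_days(days_stored, minimum_days):
    """Days you are charged for: never fewer than the class minimum."""
    return max(days_stored, minimum_days)


def storage_charge(gb, days_stored, minimum_days, price_per_gb_month, days_per_month=30):
    """Storage charge in dollars, including the minimum-duration effect."""
    months_billed = billed_days(days_stored, minimum_days) / days_per_month
    return gb * price_per_gb_month * months_billed


EXAMPLE_PRICE_PER_GB_MONTH = 0.001  # hypothetical placeholder, NOT real AWS pricing
DEEP_ARCHIVE_MINIMUM_DAYS = 180

# 1 TB (1,024 GB) stored for 10 days, then deleted.
charged = storage_charge(1024, 10, DEEP_ARCHIVE_MINIMUM_DAYS, EXAMPLE_PRICE_PER_GB_MONTH)
without_minimum = storage_charge(1024, 10, 0, EXAMPLE_PRICE_PER_GB_MONTH)
print(f"Charged: ${charged:.2f}; without a minimum it would be ${without_minimum:.2f}")
```

Whatever the actual price, the ratio is what matters: deleting after 10 days costs 18 times what those 10 days alone would have cost.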

An organization stores database backups in S3 Glacier Flexible Retrieval. After a production incident, they need to restore last night's backup. They initiate a Standard retrieval. Three hours later, the restore still shows 'ongoing-request=true'. They escalate to change the retrieval tier. What should they do?

Difficulty: easy

The backup is 500 GB. The incident is ongoing and the database needs to be restored as quickly as possible. The organization has no budget constraints for this recovery.

A. Wait. Standard retrieval takes 3–5 hours and the job is still within that window.
Incorrect. It is true that Standard takes 3–5 hours. But given an ongoing incident with no budget constraint, moving to an Expedited retrieval is the right call, not waiting.

B. Switch to an Expedited retrieval for the same object instead of waiting on the Standard job.
Correct! Expedited retrieval is meant for exactly this scenario: urgent, unplanned recovery where speed matters more than cost. S3 has no call to cancel an in-progress restore; instead, issuing a second restore request for the same object with the Expedited tier upgrades the running job. The 1–5 minute figure applies to smaller objects (AWS quotes it for all but the largest, 250 MB and up), so a 500 GB backup will take longer, but Expedited is still the fastest option available. If you regularly need fast recovery, consider Glacier Instant Retrieval or keeping a recent copy in S3 Standard.

C. Move the backup to S3 Standard first, then download it.
Incorrect. You cannot change the storage class of an object mid-retrieval. You'd need to wait for the retrieval to complete, then copy to S3 Standard, which would take longer than just using Expedited retrieval.

D. Use the S3 console instead of the CLI, because the console uses a faster retrieval path.
Incorrect. The retrieval tier (Expedited/Standard/Bulk) determines retrieval speed. The interface used to initiate the retrieval doesn't affect it.

Hint: What retrieval tier is designed for urgent, time-sensitive access?