GitHub Actions Caching: What to Cache and What Not To

Why caching in CI is different from caching in production

CI caches exist to avoid re-downloading packages that haven't changed. The cache must be invalidated when dependencies change (new package added, version bumped) and reused when they haven't.

The two failure modes:

No invalidation: the cache keeps serving stale packages → builds run against the wrong dependencies, and either fail or pass misleadingly.
Over-invalidation: cache key changes on every run → no hits → slow builds.

The cache key is the mechanism that controls this. Get it wrong in either direction and the cache either corrupts your build or provides no benefit.
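
To make the two failure modes concrete, here is a sketch of three candidate keys for the same cache step. The deps- prefix, the ~/.npm path, and the package-lock.json lockfile are illustrative assumptions rather than part of any particular project.

```yaml
# Three candidate values for the `key` input of one cache step.
# Only the last one is active; the first two are anti-patterns kept as comments.
- name: Cache dependencies
  uses: actions/cache@v4
  with:
    path: ~/.npm
    # No invalidation: a constant key never changes, so the first cache
    # ever saved keeps being restored, even after dependencies change.
    # key: deps-${{ runner.os }}
    #
    # Over-invalidation: github.sha is unique per commit, so the key
    # never matches a cache saved by an earlier commit.
    # key: deps-${{ runner.os }}-${{ github.sha }}
    #
    # Balanced: the key changes only when the lockfile content changes.
    key: deps-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
```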

GitHub Actions cache mechanics

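actions/cache saves a directory to GitHub's cache storage under a key. On a later run, an exact match on that key restores the directory. If there is no exact match, the action tries each restore-keys entry in order as a key prefix and restores the most recently created cache that matches, so a stale-but-close cache can serve as a starting point. After a miss or a partial match, a new cache is saved under the primary key when the job finishes successfully; after an exact hit, nothing is saved.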
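
The restore behaviour can be observed through the cache-hit output of actions/cache. The sketch below assumes an npm project; the step id deps-cache is a hypothetical name chosen for illustration.

```yaml
- name: Restore dependency cache
  id: deps-cache            # hypothetical step id, referenced below
  uses: actions/cache@v4
  with:
    path: ~/.npm
    key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      npm-${{ runner.os }}-

# cache-hit is 'true' only when the primary key matched exactly.
# A restore through restore-keys does not count as a hit.
- name: Report cache result
  run: echo "exact cache hit: ${{ steps.deps-cache.outputs.cache-hit }}"
```

Logging this value for a few runs is a quick way to tell whether a key is over-invalidating.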

Prerequisites

GitHub Actions workflow basics
npm/yarn dependency management

Key Points

Cache key must change when dependencies change. Use a hash of the lockfile as the key.
Cache key must NOT change when only application code changes.
restore-keys allows fallback to a prefix match — gets a stale cache rather than nothing, then only newly-added packages are downloaded.
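Caches are scoped by branch. A run can restore caches created on its own branch or on the default branch, and a pull request run can also restore caches from its base branch. Caches created by a pull request run are not visible to the base branch or to other pull requests.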
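
Because pull requests read the default branch's caches but cannot write to them, one option is to restore everywhere and save only on the default branch. This is a sketch using the actions/cache/restore and actions/cache/save sub-actions; it assumes the default branch is named main, and the step id restore is a hypothetical name.

```yaml
- name: Restore cache
  id: restore               # hypothetical step id
  uses: actions/cache/restore@v4
  with:
    path: ~/.npm
    key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      npm-${{ runner.os }}-

- name: Install dependencies
  run: npm ci

- name: Save cache (default branch only)
  # Skip the save on other branches and when the exact key already exists.
  if: github.ref == 'refs/heads/main' && steps.restore.outputs.cache-hit != 'true'
  uses: actions/cache/save@v4
  with:
    path: ~/.npm
    key: ${{ steps.restore.outputs.cache-primary-key }}
```

The trade-off is that feature branches never store their own caches, which saves cache storage but means a branch that changes the lockfile starts from the fallback on every run.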

Node.js: cache the npm download cache, not node_modules

npm ci deletes any existing node_modules before it installs, so a restored node_modules directory is thrown away immediately. Caching it costs upload and download time and buys nothing.

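The right approach: cache ~/.npm, npm's local download cache (the default location on Linux and macOS; npm config get cache prints the actual path). When npm ci runs, it pulls package tarballs from that on-disk cache instead of fetching them from the registry over the network.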

- name: Cache npm download cache
  uses: actions/cache@v4
  with:
    path: ~/.npm
    key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      npm-${{ runner.os }}-

- name: Install dependencies
  run: npm ci

Key breakdown:

npm-${{ runner.os }}-: prefix ensures cross-OS isolation.
${{ hashFiles('**/package-lock.json') }}: exact hash of the lockfile. Any package change produces a new hash → cache miss → fresh download. Same lockfile → cache hit → fast install.

The restore-keys fallback (npm-${{ runner.os }}-) matches the most recent cache for this OS. After a lockfile change, npm downloads only the packages missing from that cache, which is much faster than downloading everything fresh.
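
The ~/.npm path is the default on Linux and macOS runners only. For workflows that also run on Windows, a common pattern is to ask npm for its cache directory instead of hard-coding it. The step id npm-cache-dir and the output name dir below are hypothetical names.

```yaml
- name: Get npm cache directory
  id: npm-cache-dir         # hypothetical step id
  shell: bash
  run: echo "dir=$(npm config get cache)" >> "$GITHUB_OUTPUT"

- name: Cache npm download cache
  uses: actions/cache@v4
  with:
    # Resolved at runtime, so the same step works on every runner OS.
    path: ${{ steps.npm-cache-dir.outputs.dir }}
    key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      npm-${{ runner.os }}-

- name: Install dependencies
  run: npm ci
```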

Simpler alternative: actions/setup-node has built-in caching:

- uses: actions/setup-node@v4
  with:
    node-version: '20'
    cache: 'npm'  # handles the above automatically

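This is close to equivalent and requires less boilerplate: setup-node caches npm's download cache under a key derived from the lockfile hash. One difference to be aware of: at the time of writing it restores only on an exact key match, with no restore-keys-style fallback, so a lockfile change starts from an empty cache.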
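
When the lockfile does not sit at the repository root, setup-node accepts a cache-dependency-path input that tells it which file to hash. The frontend/ directory below is a hypothetical monorepo layout used only for illustration.

```yaml
- uses: actions/setup-node@v4
  with:
    node-version: '20'
    cache: 'npm'
    # Hypothetical layout: hash only this lockfile when building the key.
    cache-dependency-path: frontend/package-lock.json

- name: Install dependencies
  working-directory: frontend
  run: npm ci
```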

Go modules

- name: Cache Go modules
  uses: actions/cache@v4
  with:
    path: |
      ~/go/pkg/mod
      ~/.cache/go-build
    key: go-${{ runner.os }}-${{ hashFiles('**/go.sum') }}
    restore-keys: |
      go-${{ runner.os }}-

Two directories to cache:

~/go/pkg/mod: downloaded module source. Changes when go.sum changes.
~/.cache/go-build: compiled artifacts. Speeds up compilation, not just downloads.

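The build cache (go-build) is keyed on the content of each package's inputs rather than on go.sum: a source change invalidates the edited package and the packages that import it, while compiled dependencies stay reusable. For most repos, the module download savings are the most predictable. The build cache savings vary more, partly because the key above changes only with go.sum, so the saved build cache drifts out of date as the source code moves on.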
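
Since the two directories change at different rates, one option is to cache them under separate keys. The sketch below is an assumption about how that split could look, not a recommendation from the Go toolchain; the go-mod- and go-build- prefixes are hypothetical names.

```yaml
# Module downloads: stable until go.sum changes.
- name: Cache Go modules
  uses: actions/cache@v4
  with:
    path: ~/go/pkg/mod
    key: go-mod-${{ runner.os }}-${{ hashFiles('**/go.sum') }}
    restore-keys: |
      go-mod-${{ runner.os }}-

# Build cache: the commit SHA makes the exact key miss on each new commit,
# so the prefix fallback restores the latest build cache and a fresh copy
# is saved when the job finishes.
- name: Cache Go build cache
  uses: actions/cache@v4
  with:
    path: ~/.cache/go-build
    key: go-build-${{ runner.os }}-${{ github.sha }}
    restore-keys: |
      go-build-${{ runner.os }}-
```

The cost is one new cache entry per commit, which counts against the repository's cache storage and pushes older entries out sooner. Using github.sha is reasonable here only because the restore-keys prefix supplies the actual hits.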

Python (pip)

- name: Cache pip
  uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    key: pip-${{ runner.os }}-${{ hashFiles('**/requirements*.txt') }}
    restore-keys: |
      pip-${{ runner.os }}-

- run: pip install -r requirements.txt
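
As with Node.js, there is a lower-boilerplate option: actions/setup-python can manage the pip cache itself. The Python version below is only an example value.

```yaml
- uses: actions/setup-python@v5
  with:
    python-version: '3.12'   # example version, adjust to your project
    cache: 'pip'             # caches pip's download cache
    # Files hashed to build the cache key.
    cache-dependency-path: '**/requirements*.txt'

- run: pip install -r requirements.txt
```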

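For poetry, use poetry.lock as the hash source and cache the virtualenv. The step below has no restore-keys, which suits a virtualenv: it is an installed environment rather than a download cache, so a partial match could carry over packages the current lockfile no longer lists.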

- name: Cache poetry virtualenv
  uses: actions/cache@v4
  with:
    path: ~/.cache/pypoetry/virtualenvs
    key: poetry-${{ runner.os }}-${{ hashFiles('**/poetry.lock') }}
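
A virtualenv is also tied to the interpreter that created it, so if the Python version can change it helps to put the version in the key. This sketch assumes Poetry is already installed on the runner and configured to use the interpreter from setup-python; the step id setup-python and the py segment in the key are hypothetical names.

```yaml
- uses: actions/setup-python@v5
  id: setup-python          # hypothetical step id, referenced in the key
  with:
    python-version: '3.12'  # example version

- name: Cache poetry virtualenv
  uses: actions/cache@v4
  with:
    path: ~/.cache/pypoetry/virtualenvs
    # The resolved interpreter version is part of the key, so a Python
    # upgrade produces a fresh virtualenv instead of reusing an old one.
    key: poetry-${{ runner.os }}-py${{ steps.setup-python.outputs.python-version }}-${{ hashFiles('**/poetry.lock') }}

# With a restored virtualenv this finds everything installed and finishes quickly.
- name: Install dependencies
  run: poetry install --no-interaction
```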

Docker layer caching in GitHub Actions

Docker builds are expensive in CI. Each filesystem-changing instruction in a Dockerfile (RUN, COPY, ADD) produces a layer that can be cached.

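GitHub Actions does not preserve Docker's local cache between runs by default, because each runner is a fresh machine. There are two approaches, and both rely on BuildKit cache export, so the job typically needs a docker/setup-buildx-action step before the build: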

Registry-based caching (recommended for ECR/GHCR):

- name: Build and push
  uses: docker/build-push-action@v5
  with:
    cache-from: type=registry,ref=ghcr.io/myorg/myapp:cache
    cache-to: type=registry,ref=ghcr.io/myorg/myapp:cache,mode=max
    push: true
    tags: ghcr.io/myorg/myapp:latest

GitHub Actions cache backend (uses actions/cache storage):

- name: Build
  uses: docker/build-push-action@v5
  with:
    cache-from: type=gha
    cache-to: type=gha,mode=max

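mode=max exports cache for all layers, including those of intermediate build stages. mode=min (the default) exports only the layers that end up in the final image. mode=max uses more cache storage but usually gets better hit rates, especially with multi-stage builds.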
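
The build steps above are fragments. A minimal sketch of a complete job might look like the following, assuming the image is pushed to GHCR with the workflow's GITHUB_TOKEN and the Dockerfile sits at the repository root; the job name build and the image name are placeholders.

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write        # needed to push to GHCR with GITHUB_TOKEN
    steps:
      - uses: actions/checkout@v4

      # Buildx provides a builder that can import and export cache.
      - uses: docker/setup-buildx-action@v3

      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ghcr.io/myorg/myapp:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max
```

Swapping the two cache lines for the type=registry variants gives the registry-based setup with the same surrounding steps.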

The key optimization: order Dockerfile instructions from least-frequently-changing to most. COPY requirements.txt + RUN pip install before COPY . . — dependencies only reinstall when requirements change, not when application code changes.

A GitHub Actions workflow uses actions/cache with key: build-${{ github.sha }}. The cache restore rate is 0% — every run is a cache miss. What is wrong?

Difficulty: easy

The workflow runs on every push. Each push has a unique commit SHA.

A. The cache path is wrong
Incorrect. The path determines what is cached, not whether a key matches. With a path that exists but points at the wrong directory, the cache would still be saved and restored; it would just hold the wrong content.
B. The cache key changes on every push because github.sha is the commit hash, which is unique per commit
Correct! github.sha is the commit hash, which is unique for every commit, so the key never matches a cache saved by a previous run. The correct key for dependency caching is a hash of the dependency file (package-lock.json, go.sum, requirements.txt), which changes only when dependencies change, not on every code push.
C. GitHub Actions caches expire after 24 hours
Incorrect. GitHub Actions caches are evicted after 7 days without being accessed, not 24 hours. Expiry is not the issue here.
D. The restore-keys field is missing, which is required for cache to work
Incorrect. restore-keys is optional. It provides fallback matching when the exact key is not found. Without it, the cache still works for exact key matches.

Hint: What does github.sha represent, and how often does it change?