CircleCI Workspace

Author: Bowen Y

What Exactly Are CircleCI Workspaces?

CircleCI Workspaces let you share files and directories between jobs within a workflow. Workspaces are not the same as caching or artifacts; they serve a distinct role. Workspaces are mutable in an additive sense: each job can persist more data as the workflow progresses, and downstream jobs that attach the Workspace see the combined result. Think of it as a temporary storage unit that the jobs of a single workflow can write to and read from.
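
To make that concrete, here is a minimal sketch of a two-job workflow that passes one file through a Workspace. The job names, the `workspace` directory, and the `build-info.txt` file are made up for illustration; `persist_to_workspace` and `attach_workspace` are the two built-in steps that do the work.

```yaml
version: 2.1

jobs:
  build:
    docker:
      - image: cimg/base:stable
    steps:
      - checkout
      - run:
          name: Produce a file to share
          command: |
            mkdir -p workspace
            echo "built at $(date)" > workspace/build-info.txt
      # Upload workspace/build-info.txt so later jobs can attach it
      - persist_to_workspace:
          root: workspace
          paths:
            - build-info.txt

  use:
    docker:
      - image: cimg/base:stable
    steps:
      # Download everything upstream jobs persisted into /tmp/workspace
      - attach_workspace:
          at: /tmp/workspace
      - run: cat /tmp/workspace/build-info.txt

workflows:
  build-and-use:
    jobs:
      - build
      - use:
          requires:
            - build
```

The `requires` entry matters: a job only sees data persisted by jobs upstream of it in the workflow graph.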

Practical Use Cases for Workspaces

Let’s look at a couple of scenarios where Workspaces can really help:

1. Sharing Build Artifacts Across Jobs

Suppose you have a microservice architecture, and you need to build multiple Docker images. Let's say you have a build job that compiles your code and packages it. You can store your compiled binaries in a Workspace, allowing subsequent jobs to use those binaries to build Docker images in parallel. This means you avoid re-building artifacts in multiple jobs, which is a huge time saver.
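
A rough sketch of that fan-out might look like the following. It assumes a hypothetical `make build` target that writes binaries into `./dist`, and hypothetical `services/<name>/Dockerfile` files that copy from `dist`; adjust both to your own layout.

```yaml
version: 2.1

jobs:
  build:
    docker:
      - image: cimg/base:stable   # swap in your toolchain image
    steps:
      - checkout
      # Hypothetical build command, assumed to write binaries into ./dist
      - run: make build
      - persist_to_workspace:
          root: .
          paths:
            - dist

  docker-image:
    parameters:
      service:
        type: string
    docker:
      - image: cimg/base:stable
    steps:
      - checkout
      # Restores ./dist from the build job into the working directory
      - attach_workspace:
          at: .
      - setup_remote_docker
      - run:
          name: Build image for << parameters.service >>
          command: |
            docker build \
              -f services/<< parameters.service >>/Dockerfile \
              -t << parameters.service >>:local .

workflows:
  build-images:
    jobs:
      - build
      # Both image jobs depend only on build, so they run in parallel
      - docker-image:
          name: image-api
          service: api
          requires: [build]
      - docker-image:
          name: image-worker
          service: worker
          requires: [build]
```

The code is compiled once in `build`; each image job only downloads the finished binaries.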

2. Running Extensive Test Suites

Another common scenario is splitting tests across multiple jobs to reduce total build time. Say you have unit tests, integration tests, and some UI tests. Each of these requires the same codebase but involves different environments or configurations. You could have a prepare-test job that installs dependencies and builds the environment, storing everything in a Workspace. Then, the unit tests, integration tests, and UI tests can all start from the same state, cutting down on redundant setup time.

Some Gotchas You Should Watch Out For

1. Size Limitations

One of the biggest misconceptions is that you can just dump everything into a Workspace without considering size. Large files can dramatically slow down the workflow, because every persist uploads the data to CircleCI's storage and every job that attaches the Workspace downloads it again. Keep the data shared via a Workspace as lightweight as possible. For instance, avoid placing full Docker images in a Workspace; use caching for that instead.

2. Mutable State Challenges

A Workspace is mutable, which means that every job that persists to it changes what downstream jobs see. If several jobs write to the same Workspace, unexpected changes can lead to flaky workflows. Imagine job A persists some test results and job B later persists a file at the same path: jobs downstream of B no longer see what A produced, and that is a recipe for non-deterministic behavior. To counter this, be explicit about which paths each job persists and which ones it reads, and give each job its own directory.
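
One way to keep writes from colliding is to give every job its own subdirectory. The sketch below assumes a hypothetical `./scripts/test.sh` runner and a `results/<suite>/` layout; neither comes from CircleCI itself.

```yaml
version: 2.1

jobs:
  run-tests:
    parameters:
      suite:
        type: string
    docker:
      - image: cimg/base:stable
    steps:
      - checkout
      - run:
          name: Run << parameters.suite >> tests
          command: |
            mkdir -p results/<< parameters.suite >>
            # Hypothetical test runner; replace with your own command
            ./scripts/test.sh << parameters.suite >> > results/<< parameters.suite >>/report.txt
      # Each suite persists only its own directory, so paths never overlap
      - persist_to_workspace:
          root: .
          paths:
            - results/<< parameters.suite >>

  report:
    docker:
      - image: cimg/base:stable
    steps:
      - attach_workspace:
          at: /tmp/workspace
      - run: ls -R /tmp/workspace/results

workflows:
  test:
    jobs:
      - run-tests:
          name: unit
          suite: unit
      - run-tests:
          name: integration
          suite: integration
      - report:
          requires:
            - unit
            - integration
```

Because `unit` and `integration` run concurrently, having them persist the same filename would be asking for trouble; separate directories sidestep the problem entirely.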

Best Practices

  • Isolate Data by Workflow Stage: A workflow has a single Workspace, so instead of dumping everything into one undifferentiated directory, organize what you persist into separate directories by the nature of the job, for example build artifacts, dependencies, and test results. This can make your workflow easier to manage and debug.
  • Choose the Right Mechanism: Make sure you’re not using Workspaces unnecessarily. Caching is often more suitable, especially for read-only data that stays the same across runs. Use Workspaces for data that is produced during the workflow and needs to be shared across its jobs.
  • Minimize Workspace Data: Only persist what's absolutely necessary. If you need specific build artifacts, don’t store the entire build directory—filter the data before persisting it to the Workspace.
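
As an illustration of the last point, `persist_to_workspace` takes a `root` and a list of `paths` relative to it, so you can name exactly what you want to keep. The `make build` target and the `bin/app` and `config/*.json` paths below are hypothetical.

```yaml
version: 2.1

jobs:
  build:
    docker:
      - image: cimg/base:stable
    steps:
      - checkout
      # Hypothetical build command, assumed to write its output under ./build
      - run: make build
      - persist_to_workspace:
          root: build
          paths:
            - bin/app           # a single file
            - "config/*.json"   # a glob, relative to root

workflows:
  main:
    jobs:
      - build
```

Everything else under `build/` (object files, intermediate output) stays behind in the job's container and is never uploaded.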

Why Can't I Access Files in a Workspace When My Job Has No Internet Access?

Workspaces do rely on an internet connection. CircleCI is a cloud-based service, and Workspaces are uploaded to and downloaded from CircleCI's storage during the workflow. If your job has no internet connection, it will not be able to upload data to or retrieve data from the Workspace, which will likely cause your workflow to fail. To ensure smooth usage of Workspaces, all jobs that interact with the Workspace must have a stable internet connection.

Workspaces vs. Caching in CircleCI

While both Workspaces and caching help in optimizing your CI/CD pipeline, they serve different purposes and have distinct behaviors:

  • Workspaces are used to share files and directories between jobs within the same workflow. They are mutable, meaning jobs can modify the data as they progress. Workspaces are ideal when different jobs within a workflow need to access and modify the same set of files.
  • Caching, on the other hand, is used to save dependencies or data between different workflow runs. The cache is immutable, meaning once data is cached, it cannot be changed. Caching is particularly useful for saving build dependencies, like packages from npm install or pip install, to speed up subsequent builds. Unlike Workspaces, the cache is meant for storing data that remains constant across different runs to avoid repeated downloads or builds.

In practical terms, use Workspaces when you need to pass data between jobs in a single workflow and allow for modification. Use Caching when you want to speed up jobs by reusing previously downloaded or generated data across different workflows or branches.
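
For contrast with the Workspace steps, a typical dependency cache might look something like this. It assumes an npm project with a `package-lock.json`; the key prefix `npm-deps-v1` is an arbitrary name chosen for the example.

```yaml
version: 2.1

jobs:
  install:
    docker:
      - image: cimg/node:lts
    steps:
      - checkout
      # Reuse the npm download cache from an earlier run, if the lockfile matches
      - restore_cache:
          keys:
            - npm-deps-v1-{{ checksum "package-lock.json" }}
      - run: npm ci
      # Written only if no cache exists for this key yet; existing keys are never overwritten
      - save_cache:
          key: npm-deps-v1-{{ checksum "package-lock.json" }}
          paths:
            - ~/.npm

workflows:
  main:
    jobs:
      - install
```

Because the key includes a checksum of the lockfile, a changed lockfile produces a new key and a fresh cache, which is how you "update" something that is immutable.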

How Long Will Data Be Stored in the Workspace and When Will It Be Removed?

The data stored in the Workspace is temporary and scoped to a single workflow: it exists to share data between that workflow's jobs, not to serve as long-term storage. It is not deleted the moment the workflow finishes, though. CircleCI retains Workspace data for up to 15 days by default, so a re-run of the workflow can reuse the data without having to regenerate it. After the retention period, the Workspace data is deleted automatically. If you need data to outlive the workflow, use caching for dependencies you want to reuse across runs, or artifacts for outputs you want to keep and download.
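
If the goal is to keep an output around for inspection after the workflow ends, artifacts are the usual route. A minimal sketch, assuming a hypothetical `./scripts/test.sh` that writes reports into `./test-results`:

```yaml
version: 2.1

jobs:
  test:
    docker:
      - image: cimg/base:stable
    steps:
      - checkout
      - run:
          name: Run tests
          # Hypothetical script, assumed to write reports into ./test-results
          command: ./scripts/test.sh
      # Upload the directory so it can be browsed from the job's Artifacts tab
      - store_artifacts:
          path: test-results
          destination: test-results

workflows:
  main:
    jobs:
      - test
```

Unlike a Workspace, artifacts are not meant to be consumed by later jobs in the workflow; they are for people (or external tools) to download afterwards.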