The History of Nix at Bellroy
Bellroy relies heavily on Nix as an important part of our developer tooling. It provides us with reproducible environments for developer shells and CI runs, as well as a build environment for our statically linked Haskell code. Our tech team works in a moderately conservative Haskell dialect, so this level of Nix dependence might seem surprising and incongruous. In this post, I will explain how and why our Nix usage evolved the way it did, and point out useful tricks and tools at each stage of adoption.
Phase 1: Developer Shells via shell.nix
Nix has an intimidating learning curve, but most of this comes from writing Nix expressions. Developers can be easily taught to use Nix-based infrastructure once it’s been set up. Our first use of Nix was writing shell.nix files for use with nix-shell. Nix uses these files to create reproducible development environments containing correct versions of tools like ruby, ghc, etc., depending on the project. This is a great way to get started because it doesn’t ask developers to radically change their workflows, and allows them to trial Nix at their own pace. There are some subtleties to be aware of when setting up these shell expressions:
- For true reproducibility, you need to store, in each project’s source control, a reference to the exact version of nixpkgs used. This is called “pinning nixpkgs”. We initially did this using the niv tool, and later by using Nix flakes.
- As we have several developers using macOS, we pinned nixpkgs commits from the nixpkgs-*-darwin branches. We found that this improved the cache hit rate for our macOS-using developers, and reduced the amount of software they had to build locally.
- For Ruby and npm projects, we found it too difficult to capture all of their dependencies as Nix expressions. Packages in private repositories and on private package registries were the biggest challenge, as many foo2nix tools only support public package repositories. As a workaround, our shells provide the language runtime (e.g., ruby) and its packaging tool (e.g., bundler), but leave fetching language-level dependencies to that language’s own tool. We have found this to be a reasonable trade-off between correctness and practicality. The access-tokens setting in modern versions of Nix might help us if we revisit this.
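For illustration, a minimal sketch of the kind of pinned shell.nix this leads to might look like the following. It assumes niv has already written nix/sources.json for the project, and the tools listed are hypothetical:

let
  # nix/sources.nix is generated by niv and pins an exact nixpkgs revision.
  sources = import ./nix/sources.nix;
  pkgs = import sources.nixpkgs { };
in
pkgs.mkShell {
  # Tools this (hypothetical) project needs in its development shell.
  buildInputs = [
    pkgs.ruby
    pkgs.bundler
  ];
}

Running nix-shell in the project root then drops a developer into an environment containing exactly the pinned versions of these tools, regardless of what is installed on the host machine.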
Phase 2: Building Haskell Deployment Packages using haskell.nix
We were comfortable just using Nix for developer shells for a fairly long time, until a confluence of several constraints forced us into a more elaborate Nix setup.
Most of our Haskell code is deployed to AWS Lambda. To build binaries for this environment, we originally used the lambci/lambda Docker container to build in an environment close to what AWS provides at runtime. This ceased to be viable once we started using Apache Kafka: the Haskell client we use (hw-kafka-client) binds to librdkafka, which is not provided by the AWS runtime environment. Instead of wrangling third-party RPM repositories or Lambda Layers, we used the excellent haskell.nix framework to build statically linked, UPX-compressed deployment packages. We published example Nix code which does this, as part of our wai-handler-hal project.
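To give a flavour of the approach (our published example is more complete), a condensed sketch of such a haskell.nix expression might look like the following. The package name my-lambda and the executable name bootstrap are placeholders, and in practice you would pin haskell.nix to a fixed revision rather than master:

let
  # Pull in haskell.nix and its pinned nixpkgs.
  haskellNix = import (builtins.fetchTarball
    "https://github.com/input-output-hk/haskell.nix/archive/master.tar.gz") { };
  pkgs = import haskellNix.sources.nixpkgs-unstable haskellNix.nixpkgsArgs;
  # Building against musl (via pkgsCross.musl64) yields statically linked executables.
  project = pkgs.pkgsCross.musl64.haskell-nix.cabalProject {
    src = ./.;
    compiler-nix-name = "ghc928";
  };
in
project.my-lambda.components.exes.bootstrap

The resulting store path contains a self-contained bootstrap executable that can be zipped (and optionally UPX-compressed) into a Lambda deployment package.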
Phase 3: Private Binary Cache using Amazon S3 and GitHub Actions
Nix + haskell.nix was a reliable way to generate deployment packages for our Haskell services, but even after adding IOG’s binary cache we often had a lot of cache misses, leading to very long build times (particularly on macOS). It was time to bite the bullet and set up our own private cache. Nix links against the AWS SDK for C++ and can use S3-compatible object stores as binary caches, so an S3 Bucket was an obvious place to store our derivations. We needed a way to populate the cache, and weren’t ready to tackle Nix-native solutions like Hydra, so we built out a caching workflow using GitHub Actions’ hosted Linux and macOS runners. Behind this simple idea are a lot of details worth getting right, so we’ve tried to capture as many of them here as we can.
Setting up the Bucket
The bucket is just a normal S3 bucket. Because Nix uses the S3 API, we can block all public access and leave website hosting turned off.
- It might be worth creating the bucket in the region closest to most of your developers.
- Many derivations stop being relevant shortly after they’ve been built, so it might be worth setting up an S3 Lifecycle Configuration to migrate old derivations from the “Standard” Storage Class to “Standard-Infrequent Access”, and possibly even “Glacier Instant Retrieval”. Be careful of increased retrieval charges when using these storage classes.
- S3 Intelligent-Tiering might also be worth considering; be careful of its automation charges.
- It is possible to use a Lifecycle Configuration to delete very old derivations, but this can confuse the caches of Nix clients. It might also confuse Hydra (which keeps records of which derivations it has built).
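As a sketch, such a tiering rule could be applied with the AWS CLI like so (the 90-day cutoff is an arbitrary example, not a recommendation):

aws s3api put-bucket-lifecycle-configuration \
  --bucket example-nix-cache \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "tier-old-derivations",
        "Status": "Enabled",
        "Filter": {},
        "Transitions": [{ "Days": 90, "StorageClass": "STANDARD_IA" }]
      }
    ]
  }'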
The Nix manual provides example AWS Identity and Access Management (AWS IAM) Policy Documents for read-only and read/write access to an S3 Bucket. Actually providing credentials to Nix that have these permissions can be tricky, due to constraints imposed by Nix:

- We cannot use regular credentials to assume a more restricted role, because the C++ SDK that Nix uses does not support assume_role entries in ~/.aws/config.
- The Nix daemon runs as the root user, so we need to configure credentials in root’s home directory, and cannot use interactive ways of providing credentials.
- In the AWS cloud, Nix should be able to access credentials in the normal way (e.g., EC2 Instance Profiles).
- Nix on non-cloud machines (e.g., developer laptops) is more difficult. We are basically forced into using long-lived aws_access_key_id and aws_secret_access_key pairs. This is not best practice, so we don’t want these keypairs to be able to do too much. We recommend creating entirely separate IAM Users that can only access the cache bucket, and creating a separate User for each developer or server that needs access. Automating key rotation, or setting up the access-keys-rotated rule in AWS Config, can help ensure that keys are rotated regularly.
- The GitHub Actions Workflow that populates the cache will assume an AWS IAM Role with permissions to read and write the cache bucket. We don’t create an IAM User for the workflow, because GitHub Actions supports OpenID Connect and provides a guide for configuring OpenID Connect between GitHub and AWS. (A sketch of such a role’s trust policy follows this list.)
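For illustration, the trust policy on such a role might look something like the following, per GitHub’s guide; the account ID (123456789012) and organisation name (example-org) are placeholders:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:example-org/*"
        }
      }
    }
  ]
}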
Setting up Keys
Nix uses public/private key pairs to know which derivations to trust: our builder will sign derivations with the private key before uploading them to S3, and clients will know to trust the corresponding public key.
We generated a cache key pair following the recommendation in the Nix manual:
$ nix-store --generate-binary-cache-key example-nix-cache-1 key.private key.public
The private key was stored as a GitHub Actions Secret.
The public key was set in the nixConfig setting of our flakes, which means that it applies to only our repositories. This speeds up cache checking for other builds, as Nix clients will only check our bucket when it makes sense:

{
  description = "A flake";
  inputs = ...;
  outputs = ...;
  nixConfig = {
    extra-substituters = [
      "s3://example-nix-cache?profile=bellroy"
      "https://cache.iog.io"
    ];
    extra-trusted-public-keys = [
      "example-nix-cache-1:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="
      "hydra.iohk.io:f/Ea+s+dFdN+3Y/G+FDgSq+a5NEWhJGzdjvKNGv0/EQ="
    ];
  };
}
Setting up Clients
For multi-user Nix installations (the default), these AWS keys need to be loaded by the user running the Nix daemon (by default, this is root). They can be set by running commands like:
sudo -H aws configure --profile bellroy set aws_access_key_id AKYOURACCESSKEY
sudo -H aws configure --profile bellroy set aws_secret_access_key YOURSECRETKEY
macOS updates tend to remove files in ~root, including AWS config files. One way to permanently provide credentials to the Nix daemon is (thanks @lrworth):
- Create AWS config and credentials files at /etc/nix/aws/config and /etc/nix/aws/credentials.
- Edit /Library/LaunchDaemons/org.nixos.nix-daemon.plist, adding the following lines under <key>EnvironmentVariables</key>:

<key>AWS_CONFIG_FILE</key>
<string>/etc/nix/aws/config</string>
<key>AWS_SHARED_CREDENTIALS_FILE</key>
<string>/etc/nix/aws/credentials</string>

- Run sudo -i sh -c 'launchctl remove org.nixos.nix-daemon && launchctl load /Library/LaunchDaemons/org.nixos.nix-daemon.plist' to restart the Nix daemon.
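As a quick sanity check (assuming Nix 2.4 or later with the nix-command experimental feature enabled), something like the following should print store information rather than an authentication error; running it via sudo -i makes it use root’s credentials, like the daemon does:

sudo -i nix store ping --store 's3://example-nix-cache?profile=bellroy'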
Setting up the Workflow
Here is a YAML description of a sample workflow, derived from the workflow that we previously used to update our cache:
name: Populate nix shell cache
on:
  schedule:
    - cron: "0 0 * * 0"
  workflow_dispatch: {}
jobs:
  populate-cache:
    strategy:
      fail-fast: false
      matrix:
        os:
          - ubuntu-latest
          - macos-latest
    runs-on: "${{ matrix.os }}"
    steps:
      - uses: "actions/checkout@93ea575cb5d8a053eaa0ac8fa3b40d7e05a33cc8"
      - uses: "aws-actions/configure-aws-credentials"
        with:
          aws-region: "${{ env.AWS_REGION }}"
          role-to-assume: "${{ secrets.AWS_OIDC_ROLE_ARN }}"
      - uses: "cachix/install-nix-action@daddc62a2e67d1decb56e028c9fa68344b9b7c2a"
        with:
          extra_nix_config: |
            post-build-hook = /etc/nix/upload-to-cache.sh
            substituters = https://cache.nixos.org/ https://cache.iog.io s3://example-nix-cache
            trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= hydra.iohk.io:f/Ea+s+dFdN+3Y/G+FDgSq+a5NEWhJGzdjvKNGv0/EQ= example-nix-cache-1:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
          install_url: https://releases.nixos.org/nix/nix-2.7.0/install
          nix_path: nixpkgs=channel:nixpkgs-22.11-darwin
      - name: Set up nix signing key
        run: "echo ${{ secrets.NIX_CACHE_NIX_SIGNING_KEY }} | sudo tee /etc/nix/example-nix-cache.private > /dev/null"
      - name: Set up post-build hook
        run: |
          sudo tee /etc/nix/upload-to-cache.sh <<EOF > /dev/null
          #!/bin/sh
          set -eu
          set -f # disable globbing
          export IFS=' '
          echo "Uploading paths" \$OUT_PATHS
          exec $(which nix) copy --to 's3://example-nix-cache?region=wherever&secret-key=/etc/nix/example-nix-cache.private&compression=zstd&parallel-compression=true' \$OUT_PATHS
          EOF
          sudo chmod u+x /etc/nix/upload-to-cache.sh
      - name: Restart nix-daemon
        run: |
          case $RUNNER_OS in
            Linux) sudo systemctl restart nix-daemon.service ;;
            macOS) sudo launchctl kickstart -k system/org.nixos.nix-daemon ;;
          esac
      - name: Install nix-build-uncached
        run: |
          nix-env -iE '_: import (builtins.fetchTarball {
            url = "https://github.com/Mic92/nix-build-uncached/archive/77fe5c8c4c5c7a1fa3f9baa042474b98f2456652.tar.gz";
            sha256 = "sha256:04hqiw3rhz01qqyz2x1q14aml1ifk3m97pldf4v5vhd5hg73k1zn";
          }) {}'
      - name: Build shells
        run: |
          nix-build-uncached -build-flags '-L --keep-going' -E '(import ./.).devShells.${builtins.currentSystem}'
As with the rest of this process, the basic idea is simple (run the workflow to build all derivations required by our development shells), but the devil is in the details:
- A Nix post-build hook signs and uploads any derivations we build.
- We used nix-build-uncached (now deprecated) to build only the derivations that we could not find in S3, preventing lots of redundant downloads. nix-build-uncached does not support flakes, so we invoked the build through a default.nix which uses flake-compat (see the sketch after this list).
- The deprecation notice in nix-build-uncached’s README.md suggests more modern alternatives:
  - nix-fast-build has a --skip-cached flag, but a comment on a Nix issue says that nix-eval-jobs (which powers nix-fast-build) can be problematic when lots of import-from-derivation (IFD) is required, as in haskell.nix.
  - The comments on Nix issue #3946 suggest that nix build --store $remote_store --builders auto might (eventually?) work.
- If we were doing this again, we’d probably consider Determinate Systems’ Magic Nix Cache to evaluate Nix expressions more quickly before the build begins.
- We ran the workflow weekly as a trade-off between cache freshness and billable minutes, and enabled manual workflow dispatch for when we upgraded GHC versions or major packages.
- We use Zstandard compression when we upload to S3, because it’s very light on CPU time and we found that XZ was very slow on large derivations.
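The default.nix shim mentioned above is small. A minimal sketch using flake-compat (which, again, you would pin to a fixed revision in practice) looks like:

let
  # flake-compat exposes a flake's outputs to non-flake commands like nix-build.
  flake-compat = builtins.fetchTarball
    "https://github.com/edolstra/flake-compat/archive/master.tar.gz";
in
(import flake-compat { src = ./.; }).defaultNix

This is what lets the workflow’s final step evaluate (import ./.).devShells.${builtins.currentSystem} without flake support.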
Conclusion
You don’t have to adopt Nix all at once to get good value out of it. Simple development shells prevent a lot of headaches and tend to have good cache hit rates, which means that it’s fine to delay private caching until much later. Our initial shell.nix files served us well for over a year before we started adding more sophisticated tooling, and even then we only did so because we were forced to. Our moves to haskell.nix and GitHub-Actions-based caching were made in response to genuine needs, and we learned as we went. We did eventually move to a Hydra-based CI system, but that’s a story for another time.