Nix patterns I

Introduction

I’ve been using Nix and NixOS for many years now, both personally and professionally, on workstations and servers. While the tools aren’t without their warts, I strongly believe the model espoused by Nix for package management is a leap ahead of other available tools. However, the Nix language can be very unstructured, and knowing how to use it in an effective and composable way can involve a lot of searching and deep-diving through nixpkgs. This series aims to collect patterns and good practices I encounter or devise, largely for my own future benefit, but hopefully other people will find it helpful too.

In this entry, I aim to document a pattern for defining services in NixOS that I’ve found useful for separating concerns and producing reüsable code.

The problem

Service configuration often mixes together service inputs and outputs. Inputs are things that the user cares about configuring — things that change the user-visible behaviour of the service. Outputs are things the user doesn’t care about, except to make sure that they are passed around correctly — examples include socket paths or database names.

Nix teaches us that only the inputs to a build process should be user-provided. Outputs (such as installation paths) can be generated by the process itself; the user shouldn’t have to care about them. All we need to do is make sure the outputs can be passed effectively from the process to its caller.

But NixOS modules often forget this. For example, when setting up an NGINX configuration to proxy to a backend application, we typically see:

services.backend-service.address = "address";
services.nginx
  .virtualHosts."host"
  .locations."/location"
  .proxyPass = "address";

There are two problems here. Firstly, it is impossible to set up multiple instances of the backend-service. This makes sense for some services, for which it is meaningless to have more than one per machine, but in general we’d like to be able to have as many instances as we need. The second issue is that the administrator has to deal with the address "address" and keep it consistent between output (our backend-service) and input (NGINX).

Both of these problems have the same root cause: both services.backend-service and "address" are essentially global variables. "address" is particularly bad, as it must be plumbed around by hand, and a mismatch will produce a configuration that activates correctly but whose resulting system components are unable to connect to each other.

User-defined instances

The first part of the problem can be solved by letting the user define their own names for their service instances. This takes the form of expecting an attrset (with user-defined keys) as configuration for our service module, instead of just a single instance’s configuration. At this juncture we could actually use a list, as the key names are largely useless, but having a human-readable name to attach to things belonging to the instance helps significantly with debugging. In the next section we will want to refer back to the name, and it is much less fragile to do this with a user-chosen name than an integer list index.

services.backend-service.instance1 = {
  enable = true;
  address = "address1";
};
services.backend-service.instance2 = {
  enable = true;
  address = "address2";
};

services.nginx
  .virtualHosts."host"
  .locations."/instance1"
  .proxyPass = "address1";
services.nginx
  .virtualHosts."host"
  .locations."/instance2"
  .proxyPass = "address2";

The module interface to allow this kind of usage looks something like:

{ lib, config, ... }:
let
  # Build the system configuration for one named instance. `lib.mkIf`
  # is applied to the individual option rather than to the whole
  # result, so the result keeps plain top-level keys such as `systemd`.
  instance = name: cfg: {
    # per-instance system configuration here, e.g.:
    systemd.services."backend-service.${name}" = lib.mkIf cfg.enable {
      # …
      serviceConfig.ExecStart = ''
        ${cfg.package}/bin/backend-service \
          --address "${cfg.address}"
      '';
      # …
    };
    # …
  };
in
{
  options.services.backend-service = lib.mkOption {
    description = ''
      Named instances of backend-service to run.
    '';
    type = lib.types.attrsOf (lib.types.submodule {
      options = {
        # instance configuration options here, e.g.
        enable = lib.mkEnableOption "…";
        package = lib.mkOption { … };
        address = lib.mkOption { … };
        # …
      };
    });
  };

  # `lib.mkMerge` combines all our instance configs into a larger
  # config object using the module system's merge rules (a plain
  # `lib.mergeAttrsList` would let the last instance's `systemd` win)
  config = lib.mkMerge
    # `lib.mapAttrsToList` applies our `instance` function to each of
    # the instance configuration attrsets and returns the result as a
    # list of attrsets
    (lib.mapAttrsToList
      instance
      config.services.backend-service);
}

… at least, that’s what we’d like to write. Unfortunately, attrset keys in Nix are strict: to learn which keys config contains, Nix must first evaluate config.services.backend-service, which is itself part of config. The result is an infinite recursion.

Instead, we need to expressly restrict the keys that can appear in the output:

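config = let
  c = lib.mapAttrsToList
    instance
    config.services.backend-service;
in {
  # `lib.catAttrs` only sees plain top-level keys, so `instance` must
  # apply `lib.mkIf` inside its `systemd` attrset rather than around
  # its whole result.
  systemd = lib.mkMerge (lib.catAttrs "systemd" c);
};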
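To check that restricting the output keys really avoids the recursion, the pattern can be evaluated outside NixOS with lib.evalModules. The following is a minimal sketch, not the real module: it assumes <nixpkgs> is on NIX_PATH, and systemdStub is a hypothetical stand-in for the real systemd.services option so that the example stays small. It also applies lib.mkIf to the individual unit rather than to the whole per-instance attrset, so that lib.catAttrs can find the systemd key.

```nix
let
  # Assumption: <nixpkgs> is on NIX_PATH; only its `lib` is needed.
  lib = import <nixpkgs/lib>;

  # Hypothetical stand-in for the real NixOS `systemd.services` option.
  systemdStub = {
    options.systemd.services = lib.mkOption {
      default = { };
      type = lib.types.attrsOf (lib.types.attrsOf lib.types.anything);
    };
  };

  backendModule = { config, ... }:
    let
      # Per-instance configuration; `mkIf` sits on the unit itself so
      # the returned attrset has a plain `systemd` key.
      instance = name: cfg: {
        systemd.services."backend-service.${name}" = lib.mkIf cfg.enable {
          description = "backend-service instance ${name}";
        };
      };
      perInstance =
        lib.mapAttrsToList instance config.services.backend-service;
    in
    {
      options.services.backend-service = lib.mkOption {
        default = { };
        type = lib.types.attrsOf (lib.types.submodule {
          options.enable = lib.mkEnableOption "this instance";
        });
      };

      # Only `systemd` is defined here, so the keys of this module's
      # `config` are known without reading `config` itself.
      config.systemd = lib.mkMerge (lib.catAttrs "systemd" perInstance);
    };

  evaluated = lib.evalModules {
    modules = [
      systemdStub
      backendModule
      {
        services.backend-service.instance1.enable = true;
        services.backend-service.instance2.enable = false;
      }
    ];
  };
in
# Names of the units that were actually generated.
builtins.attrNames evaluated.config.systemd.services
```

Saved to a file and evaluated with nix-instantiate --eval --strict, this should list only the unit for the enabled instance, since the disabled instance's definition is discarded by lib.mkIf.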

This solves the first problem, but it’s still unpleasant to have to plumb "address1" and "address2" around by hand when we really don’t care about their values. One can work around this by having backend-service understand NGINX, and indeed this pattern is used in several places in nixpkgs, but it’s rather clumsy to push knowledge of the outer proxy into the backend service.

Passing data back out of instances

A helpful insight is that modules can update their own configuration attrset as well as read from it, a fact that we can use to implement a sort of out-param pattern. This example assumes we’re using UNIX sockets; finding a fresh TCP port is much harder. As John Day notes in Patterns in Network Architecture, the system of TCP port numbers essentially commits the sin we note here on a much larger scale.

options.services.backend-service = lib.mkOption {
  default = { };
  description = ''
    Named instances of backend-service to run.
  '';
  # The submodule is a function, so it receives the instance's `name`.
  type = lib.types.attrsOf (lib.types.submodule ({ name, ... }: {
    options = {
      enable = lib.mkEnableOption "…";
      package = lib.mkOption { … };
      address = lib.mkOption {
        description = "Read-only attribute!";
        …
      };
    };

    # Derive the socket path from the instance name alone.
    config.address = "/var/lib/backend-service/${name}.sock";
  }));
};

Here we use the parameterized submodule functionality to generate the socket attribute locally, guaranteeing that the keys of our submodules can only depend on their names, not other keys.
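The out-param behaviour can be tried in isolation with lib.evalModules. This is a sketch under a few assumptions: <nixpkgs> is on NIX_PATH, the option set is trimmed to enable and address, and the option is marked with mkOption's readOnly flag, which is one possible way to enforce the "read-only" note in the description rather than something the module above is known to do.

```nix
let
  # Assumption: <nixpkgs> is on NIX_PATH; only its `lib` is needed.
  lib = import <nixpkgs/lib>;

  backendModule = {
    options.services.backend-service = lib.mkOption {
      default = { };
      type = lib.types.attrsOf (lib.types.submodule ({ name, ... }: {
        options = {
          enable = lib.mkEnableOption "this backend-service instance";
          address = lib.mkOption {
            type = lib.types.str;
            # Assumption: enforce the read-only intent via `readOnly`.
            readOnly = true;
            description = "Socket path derived from the instance name.";
          };
        };

        # The only definition of `address`; it depends on `name` alone.
        config.address = "/var/lib/backend-service/${name}.sock";
      }));
    };
  };

  evaluated = lib.evalModules {
    modules = [
      backendModule
      { services.backend-service.instance1.enable = true; }
    ];
  };
in
# Expected to evaluate to "/var/lib/backend-service/instance1.sock".
evaluated.config.services.backend-service.instance1.address
```

With readOnly set, a user who also assigns services.backend-service.instance1.address themselves should get an evaluation error instead of silently overriding the generated path.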

With this, our end-user can use config.services.backend-service."some-identifier".address to refer to the address of the service they’ve defined, without having to manually devise that address and plumb it through:

services.backend-service.instance1.enable = true;
services.backend-service.instance2.enable = true;

services.nginx
  .virtualHosts."host"
  .locations."/instance1"
  .proxyPass
  = config.services.backend-service.instance1.address;
services.nginx
  .virtualHosts."host"
  .locations."/instance2"
  .proxyPass
  = config.services.backend-service.instance2.address;

We do unfortunately still have to use the instance1 and instance2 names, despite them only being used in one place, but this is a much better scenario: an illegal instance name will be caught at configuration-build time, and not result in a broken system.

Imagining a better solution

An even better way to write this might be to use something like algebraic effects to allow us to write:

services.nginx
  .virtualHosts."host"
  .locations."/instance"
  .proxyPass
  = mk-backend-service { … };

where mk-backend-service is an effectful function that both registers the service configuration in the global system configuration attrset and returns the address of the registered service directly to its caller (i.e. a state effect). This way we can avoid naming the expression if we like, and if we do want to refer to it multiple times we can re-use the Nix identifier binding system, e.g.

let
  instance1 = mk-backend-service { … };
in {
  services.nginx
    .virtualHosts."host1"
    .locations."/instance1"
    .proxyPass
    = instance1;

  services.nginx
    .virtualHosts."host2"
    .locations."/instance1"
    .proxyPass
    = instance1;
}

This is not currently supported by the Nix language, though, and the encoding of it (using a manual CPS transform to pass the remainder of the configuration to the function) is probably too clumsy to be worthwhile.