K8s on NixOS - Chapter 1: Getting the nodes ready

Posted on Mon 17 March 2025 in software

If you missed the first part of this guide, find it here: K8s on NixOS - Chapter 0: Preface

IPAM

Before we start, let’s write down the IP addresses we will use so we don’t get them confused. I will use documentation prefixes throughout this guide. Replace them with your own ranges accordingly.

IP address/network Usage
2606:4700:4700::1111/128 Cloudflare DNS server
2620:fe::fe/128 Quad9 DNS server
2001:db8::/56 Provided prefix by ISP/Hoster
├─ 2001:db8::1/64 Default Gateway
│ ├─ 2001:db8:0:a::/64 Nodes
│ │ ├─ 2001:db8:0:a::1/128 Node-01
│ │ ├─ 2001:db8:0:a::2/128 Node-02
│ │ └─ 2001:db8:0:a::3/128 Node-03
│ ├─ 2001:db8:0:b::/112 Kubernetes services (65536 service IPs)
│ └─ 2001:db8:0:c::/64 Kubernetes pods
│   ├─ 2001:db8:0:c:1::/80 Kubernetes pods on node-01 (2^48 pod IPs)
│   ├─ 2001:db8:0:c:2::/80 Kubernetes pods on node-02 (2^48 pod IPs)
│   └─ 2001:db8:0:c:3::/80 Kubernetes pods on node-03 (2^48 pod IPs)
└─ 2001:db8:ff::1:2:3/128 Example IP for a jump host or client
1.1.1.1/32 Cloudflare fallback DNS server
9.9.9.9/32 Quad9 fallback DNS server
192.0.2.0/24 Node network
├─ 192.0.2.254/32 Default Gateway
├─ 192.0.2.1/32 Node-01
├─ 192.0.2.2/32 Node-02
└─ 192.0.2.3/32 Node-03
198.51.100.123/32 Example IP for a jump host or client

Getting the nodes ready

For simplicity, all nodes in this guide are QEMU VMs. In reality these can be anything, like hardware servers or Raspberry Pis. Just adapt the setup to your needs. Especially when you use ARM SBCs, check out flake-utils, which was mentioned in the last chapter.

NixOS installation

First, we have to run through the NixOS installation of these nodes. We only need a basic setup for now.

Download the NixOS ISO from nixos.org/download and set up a QEMU VM with virt-manager.

Create a new VM with the wizard like this:

  1. Choose Local install media (ISO image or CDROM)
  2. Use the ISO and choose NixOS 24.11 or NixOS Unstable as OS
  3. Use at least 4GB of memory and 4 cores
  4. Create at least a 64GB disk to have enough storage for the Nix store and container images
  5. Call it k8s-node-01 and hit Finish, or check Customize configuration before install to review and add more options

I usually customize the config like this:

  • Overview:
    • Use UEFI Firmware
  • CPUs:
    • Topology:
      • Set 1 Socket and 4 Cores
  • Boot options:
    • Start VM on host boot up
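
If you prefer the command line over the wizard, roughly the same VM can be created with virt-install. This is a sketch, assuming a recent virt-install; the ISO filename is a placeholder, and you may need to adjust --osinfo for your osinfo-db:

virt-install \
  --name k8s-node-01 \
  --memory 4096 \
  --vcpus 4,sockets=1,cores=4 \
  --disk size=64 \
  --cdrom nixos-minimal-24.11-x86_64-linux.iso \
  --boot uefi \
  --osinfo detect=on,require=off \
  --autostart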

Boot the NixOS installer and set up the partitions like this for a UEFI setup:

sudo cfdisk /dev/vda
# Select label type: gpt
# 1GB type EFI System
# ~63GB type Linux filesystem
# Write and Quit
sudo mkdosfs -F32 -n ESP /dev/vda1
sudo mkfs.ext4 -L NIXOS /dev/vda2
sudo mount /dev/vda2 /mnt
sudo mkdir /mnt/boot
sudo mount /dev/vda1 /mnt/boot

You should end up with a setup like this:

# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sr0     11:0    1 1024M  0 rom
vda    253:0    0   64G  0 disk
├─vda1 253:1    0    1G  0 part /mnt/boot
└─vda2 253:2    0   63G  0 part /mnt

We will not use any swap space, as running without swap is what Kubernetes traditionally recommends. It is supported in newer versions though, so if you want to experiment with that, add a swap partition like so:

# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sr0     11:0    1 1024M  0 rom
vda    253:0    0   64G  0 disk
├─vda1 253:1    0    1G  0 part /mnt/boot
├─vda2 253:2    0    4G  0 part [SWAP]
└─vda3 253:3    0   59G  0 part /mnt
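
If you go down that route, format and activate the swap partition before running nixos-generate-config later, so it gets picked up automatically:

sudo mkswap -L SWAP /dev/vda2
sudo swapon /dev/vda2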

If you want to use a traditional BIOS setup, simply use a DOS partition table in cfdisk and skip the creation of an ESP partition, so you’ll end up with just the root partition, or root and swap. Note that systemd-boot only works on UEFI systems, so the loader config would need to switch to GRUB; a sketch follows.
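
A minimal sketch of that variant:

sudo cfdisk /dev/vda
# Select label type: dos
# ~64GB type Linux filesystem
# Write and Quit
sudo mkfs.ext4 -L NIXOS /dev/vda1
sudo mount /dev/vda1 /mnt

And in configuration.nix, replace the systemd-boot/EFI loader settings with GRUB:

boot.loader.grub.device = "/dev/vda";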

Now generate the NixOS config:

sudo nixos-generate-config --root /mnt
$EDITOR /mnt/etc/nixos/configuration.nix

Create a minimal Nix configuration like this:

{ config, lib, pkgs, modulesPath, ... }:

{
  imports = [
    (modulesPath + "/profiles/qemu-guest.nix")
    ./hardware-configuration.nix
  ];

  nix.settings.experimental-features = [ "nix-command" "flakes" ];

  boot.loader = {
    systemd-boot.enable = true;
    efi.canTouchEfiVariables = true;
  };

  networking = {
    hostName = "node-01";
    firewall.enable = true;
    useDHCP = true;
    useNetworkd = true;
    dhcpcd.enable = false;
    nftables.enable = true;
  };
  services.resolved.enable = true;

  time.timeZone = "Europe/Berlin";

  i18n.defaultLocale = "en_US.UTF-8";
  console = {
    font = "Lat2-Terminus16";
    useXkbConfig = true;
  };

  users.users.root.openssh.authorizedKeys.keys = [
    "ssh-ed25519 Put-your-ssh-keys-in-here"
  ];

  environment.systemPackages = with pkgs; [
    vim # or any other editor you like
    git
  ];

  programs.vim = { # or any other editor you like
    enable = true;
    defaultEditor = true;
  };

  services.openssh.enable = true;

  system.stateVersion = "24.11"; # never touch that unless you know what you're doing
}

Tip: Get your SSH keys from GitHub or GitLab with:

curl -sL github.com/your-username.keys
curl -sL gitlab.com/your-username.keys

Start the installation:

cd /mnt
sudo nixos-install

In the end, the installer will ask you for a root password. Choose a strong one. Then reboot and run the installation on the remaining VMs.

Yes, we could reuse the image from the first machine for the others, but for that we would have to prepare a few more things to make the clones truly unique and working properly. As we only have 3 VMs, it is probably faster to install them one by one than to set up a proper template from the first image. But feel free to do so if you want to scale out further.

I won’t do it here to keep the guide shorter. If you’re interested in this topic, have a look at Packer or nixos-generators.
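
For a taste, nixos-generators can build a ready-to-import qcow2 image straight from a NixOS configuration. A sketch, with flags as per its README:

nix run github:nix-community/nixos-generators -- -f qcow -c ./configuration.nix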

Update our flake

After all VMs are installed and ready we will update them from our flake. Here is the new flake.nix we will use:

{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.11";
    agenix = {
      url = "github:ryantm/agenix";
      inputs.nixpkgs.follows = "nixpkgs";
    };
  };

  outputs = { self, nixpkgs, agenix }:
    let
      system = "x86_64-linux";
      pkgs = import nixpkgs { inherit system; };
    in
    {
      formatter.${system} = pkgs.nixpkgs-fmt;
      devShells.${system}.default = pkgs.mkShell {
        packages = with pkgs; [
          agenix.packages.${system}.default
          age # secrets debugging
          gnumake
        ];
      };
      nixosConfigurations =
      let
        clusterNodes = [
          {
            hostName = "node-01";
            ip6Address = "2001:db8:0:a::1";
            ip4Address = "192.0.2.1";
            podCidr = "2001:db8:0:c:1::/80";

            root-uuid = "237cca24-aaaa-bbbb-cccc-b33ce7e1c60c";
            boot-uuid = "3AA9-ABCD";
            swap-uuid = "2acde6ef-aaaa-bbbb-cccc-e5dbf51b098e";

            stateVersion = "24.11";
          }
          {
            hostName = "node-02";
            ip6Address = "2001:db8:0:a::2";
            ip4Address = "192.0.2.2";
            podCidr = "2001:db8:0:c:2::/80";

            root-uuid = "764670d1-aaaa-bbbb-cccc-2f1f0345c9ec";
            boot-uuid = "62F5-ABCD";
            swap-uuid = "48e980b8-aaaa-bbbb-cccc-4ccb3e753f56";

            stateVersion = "24.11";
          }
          {
            hostName = "node-03";
            ip6Address = "2001:db8:0:a::3";
            ip4Address = "192.0.2.3";
            podCidr = "2001:db8:0:c:3::/80";

            root-uuid = "1927f79a-aaaa-bbbb-cccc-54f619bd5466";
            boot-uuid = "F67C-ABCD";
            swap-uuid = "09893a08-aaaa-bbbb-cccc-c9b76fac5d94";

            stateVersion = "24.11";
          }
        ];
        cidrs = {
          nodeCidr6 = "2001:db8:0:a::/64";
          nodeCidr4 = "192.0.2.0/24";
          serviceCidr = "2001:db8:0:b::/112";
          podCidr = "2001:db8:0:c::/64";
        };
        k8s_secrets = {};
      in
      {
        "node-01" =
        let
          node = builtins.head (builtins.filter (n: n.hostName == "node-01") clusterNodes);
        in
          nixpkgs.lib.nixosSystem {
          inherit system;
          specialArgs = {
            inherit node;
            inherit clusterNodes;
            inherit cidrs;
          };
          modules = [
            agenix.nixosModules.default
            # install agenix system-wide
            { environment.systemPackages = [ agenix.packages.${system}.default ]; }
            {
              age.secrets = {} // k8s_secrets;
            }
            ./configuration.nix
          ];
        };
        "node-02" =
        let
          node = builtins.head (builtins.filter (n: n.hostName == "node-02") clusterNodes);
        in
        nixpkgs.lib.nixosSystem {
          inherit system;
          specialArgs = {
            inherit node;
            inherit clusterNodes;
            inherit cidrs;
          };
          modules = [
            agenix.nixosModules.default
            # install agenix system-wide
            { environment.systemPackages = [ agenix.packages.${system}.default ]; }
            {
              age.secrets = {} // k8s_secrets;
            }
            ./configuration.nix
          ];
        };
        "node-03" =
        let
          node = builtins.head (builtins.filter (n: n.hostName == "node-03") clusterNodes);
        in
        nixpkgs.lib.nixosSystem {
          inherit system;
          specialArgs = {
            inherit node;
            inherit clusterNodes;
            inherit cidrs;
          };
          modules = [
            agenix.nixosModules.default
            # install agenix system-wide
            { environment.systemPackages = [ agenix.packages.${system}.default ]; }
            {
              age.secrets = {} // k8s_secrets;
            }
            ./configuration.nix
          ];
        };
      };
    };
}

The important additions are the nixosConfigurations for our 3 nodes. Each of them will receive some secrets we will define later on. I also combined the configuration.nix and hardware-configuration.nix that the installer generated into a single file.

All node-specific variables are defined in clusterNodes. The cidrs variable holds the networks, and the node variable only holds the object from clusterNodes for the node we’re currently building.

So for node-specific config the node variable can be used, and if information about all cluster members is needed, we can iterate over the clusterNodes variable.
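
For example, a hypothetical module (not part of the config in this chapter) could derive /etc/hosts entries for all cluster members like this:

{ clusterNodes, ... }:

{
  # one /etc/hosts entry per cluster member, derived from clusterNodes
  networking.hosts = builtins.listToAttrs (map (n: {
    name = n.ip6Address;
    value = [ "${n.hostName}.k8s.example.com" n.hostName ];
  }) clusterNodes);
}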

We can also split cluster-wide and node-specific secrets that way.
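
As an aside: the three nearly identical nixosConfigurations entries could also be generated from clusterNodes. A sketch with nixpkgs.lib.genAttrs, behavior-equivalent to the explicit version, which would replace the attribute set after the in:

# generate one nixosSystem per entry in clusterNodes
nixosConfigurations = nixpkgs.lib.genAttrs (map (n: n.hostName) clusterNodes) (name:
  let
    node = builtins.head (builtins.filter (n: n.hostName == name) clusterNodes);
  in
  nixpkgs.lib.nixosSystem {
    inherit system;
    specialArgs = { inherit node clusterNodes cidrs; };
    modules = [
      agenix.nixosModules.default
      # install agenix system-wide
      { environment.systemPackages = [ agenix.packages.${system}.default ]; }
      { age.secrets = {} // k8s_secrets; }
      ./configuration.nix
    ];
  });

I’ll keep the explicit version in this guide for readability.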

New configuration

Here is the new configuration.nix file:

{ config, lib, pkgs, modulesPath, node, ... }:

{
  imports = [
    (modulesPath + "/profiles/qemu-guest.nix")
  ];

  fileSystems = {
    "/" = {
      device = "/dev/disk/by-uuid/${node.root-uuid}";
      fsType = "ext4";
      options = [ "noatime" ];
    };
    "/boot" = {
      device = "/dev/disk/by-uuid/${node.boot-uuid}";
      fsType = "vfat";
      options = [ "noatime" "fmask=0022" "dmask=0022" ];
    };
  };

  swapDevices = [{ device = "/dev/disk/by-uuid/${node.swap-uuid}"; }];

  boot = {
    loader = {
      systemd-boot.enable = true;
      efi.canTouchEfiVariables = true;
    };
    initrd = {
      availableKernelModules = [ "ahci" "xhci_pci" "virtio_pci" "virtio_scsi" "sr_mod" "virtio_blk" ];
      kernelModules = [ ];
    };
    kernelModules = [ "kvm-intel" ];
    extraModulePackages = [ ];
    kernelPackages = pkgs.linuxPackages_latest;
  };

  nixpkgs.hostPlatform = lib.mkDefault "x86_64-linux";
  nix.settings.experimental-features = [ "nix-command" "flakes" ];

  networking = {
    hostName = node.hostName;
    domain = "k8s.example.com";
    nameservers = [
      "2606:4700:4700::1111"
      "2620:fe::fe"
      "1.1.1.1" # Fallback
      "9.9.9.9" # Fallback
    ];
    useDHCP = false;
    useNetworkd = true;
    dhcpcd.enable = false;
    nftables.enable = true;
    defaultGateway6 = {
      address = "2001:db8::1";
      interface = "enp1s0";
    };
    defaultGateway = {
      address = "192.0.2.254";
      interface = "enp1s0";
    };
    tempAddresses = "disabled";
    interfaces.enp1s0 = {
      ipv6.addresses = [{ address = node.ip6Address; prefixLength = 64; }];
      ipv4.addresses = [{ address = node.ip4Address; prefixLength = 24; }];
    };
    firewall = {
      enable = true;
      # allowedTCPPorts = [ ... ];
      # allowedUDPPorts = [ ... ];
    };
  };
  services.resolved = {
    enable = true;
    dnssec = "allow-downgrade";
    dnsovertls = "opportunistic";
  };

  time.timeZone = "Europe/Berlin";

  i18n.defaultLocale = "en_US.UTF-8";
  console = {
    font = "Lat2-Terminus16";
    # keyMap = "us";
    useXkbConfig = true;
  };

  users.users.root = {
    openssh.authorizedKeys.keys = [
      "ssh-ed25519 Put-your-ssh-keys-in-here"
    ];
  };

  environment.systemPackages = with pkgs; [
    vim # or any other editor you like
    git
    fastfetch # just for fun
  ];

  programs = {
    vim = { # or any other editor you like
      enable = true;
      defaultEditor = true;
    };
    mtr.enable = true; # network debugging
    direnv.enable = true; # handy for local flake
    htop = {
      enable = true;
      settings = {
        highlight_base_name = true;
        show_cpu_frequency = true;
        #show_cpu_temperature = true; # not relevant in a VM
        update_process_names = true;
        color_scheme = 6;
      };
    };
    tmux = { # keep your session if you lose the network during an update
      enable = true;
      clock24 = true;
      shortcut = "a";
      terminal = "screen-256color";
      plugins = with pkgs.tmuxPlugins; [
        prefix-highlight
        yank
      ];
      historyLimit = 10000;
    };
  };

  services = {
    openssh.enable = true;
    sshguard = { # reduce logspam from annoying script kiddies
      enable = true;
      services = [ "sshd" ];
      whitelist = [
        # add the fixed IPs of your jump hosts or clients so you don't ban yourself
        "2001:db8::1:2:3"
        "198.51.100.123"
      ];
    };
  };

  system.stateVersion = node.stateVersion;
}

The important bits here are:

  • boot.kernelPackages = pkgs.linuxPackages_latest: Lets us use the latest kernel for best performance and driver support. This might be more relevant on real hardware than in a VM.
  • We will switch to fixed IP addresses, so update these settings accordingly:
    • networking.domain: choose/buy a nice domain or simply use your local one (e.g. fritz.box). Avoid using cluster.local as this will be used by Kubernetes internally.
    • networking.nameservers: can be set to your local router for a home lab or to any other DNS server you like. I chose the Cloudflare and Quad9 servers for performance and privacy reasons.
    • networking.useDHCP = false: Disable DHCP for static IP addressing.
    • networking.defaultGateway6: Set a fixed gateway.
    • networking.defaultGateway: Our nodes also need IPv4 connectivity for legacy workloads.
    • networking.tempAddresses = "disabled": This is a privacy feature which is not needed on a server.
    • services.resolved.dnssec = "allow-downgrade": This will try to use DNSSEC but fallback if the domain is not signed.
    • services.resolved.dnsovertls = "opportunistic": This will try to encrypt the DNS connection if the server supports it.

I also added some tools: mtr for network debugging, htop to monitor processes, tmux to keep our session even if we accidentally kill our network connection, and sshguard to shield us against brute-force attacks on our SSH server.

IPv6 temporary addresses

An IPv6 temporary address includes a randomly generated 64-bit number as the interface ID instead of deriving it from the interface’s MAC address. You can use temporary addresses for any interface on an IPv6 node that you want to keep anonymous. They are enabled by default on most client systems and are very useful there.

We’re setting up a server that doesn’t need this feature. Even worse, services like etcd will use these temporary addresses for outgoing connections and then fail to connect because we will explicitly allow only the fixed addresses of our nodes to connect to each other.
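
To verify on a node that only the fixed address is in use, check for the temporary flag (this should print nothing):

ip -6 addr show dev enp1s0 | grep temporary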

Enforce DoT

If you chose DNS servers that support it, you can enforce DNS over TLS by setting services.resolved.dnsovertls = "true". If you do so, you also have to set valid domain names so that the certificates can be validated. So in this example you have to adapt the config like this:

networking.nameservers = [
    "2606:4700:4700::1111#one.one.one.one"
    "2620:fe::fe#dns.quad9.net"
    "1.1.1.1#one.one.one.one"
    "9.9.9.9#dns.quad9.net"
];
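
To verify that lookups really leave the host encrypted, you can watch the DoT port 853. tcpdump isn’t part of the config above, so run it ad hoc:

nix shell nixpkgs#tcpdump --command tcpdump -ni enp1s0 port 853
# in a second terminal: resolvectl query nixos.org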

Update Makefile

The new Makefile looks like this:

.DEFAULT_GOAL := help

.PHONY: all
all: update-flake k8s ## Update flake inputs and deploy k8s cluster

.PHONY: update-flake
update-flake: ## Update nix flake
    nix flake update

.PHONY: k8s
k8s: node-01 node-02 node-03 ## Deploy k8s cluster

.PHONY: node-01
node-01: ## Deploy node-01
    nix run "nixpkgs#nixos-rebuild" -- switch --build-host node-01 --target-host node-01 --flake ".#node-01"
    ssh node-01 'nix run nixpkgs#nvd -- --color=always diff $$(ls -d1v /nix/var/nix/profiles/system-*-link|tail -n 2)'

.PHONY: node-02
node-02: ## Deploy node-02
    nix run "nixpkgs#nixos-rebuild" -- switch --build-host node-02 --target-host node-02 --flake ".#node-02"
    ssh node-02 'nix run nixpkgs#nvd -- --color=always diff $$(ls -d1v /nix/var/nix/profiles/system-*-link|tail -n 2)'

.PHONY: node-03
node-03: ## Deploy node-03
    nix run "nixpkgs#nixos-rebuild" -- switch --build-host node-03 --target-host node-03 --flake ".#node-03"
    ssh node-03 'nix run nixpkgs#nvd -- --color=always diff $$(ls -d1v /nix/var/nix/profiles/system-*-link|tail -n 2)'

.PHONY: help
help: ## Display this help
    @grep -h -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}'

This should work on NixOS as well as on any other OS with Nix installed. On non-NixOS systems you won’t have the nixos-rebuild command, but the Makefile already invokes it via nix run, so it works unchanged. You might have to enable the experimental features by adding the following line to your nix.conf:
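
# ~/.config/nix/nix.conf (or /etc/nix/nix.conf)
experimental-features = nix-command flakes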

The config will be built on the remote host. That has the advantage that the final closure doesn’t have to be transferred to the remote host, but also the disadvantage that you have to build the config three times, because the remote hosts can’t reuse builds from your local Nix store. Try removing the --build-host argument and test whether that works better for you.
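
Without --build-host, nixos-rebuild builds locally and only copies the result to the node, e.g.:

nix run "nixpkgs#nixos-rebuild" -- switch --target-host node-01 --flake ".#node-01"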

SSH config

The nodes are configured in the SSH client config (~/.ssh/config) so it’s easier to connect to them:

Host node-01
  User root
  HostName 2001:db8:0:a::1
  Port 22
  IdentitiesOnly yes
  IdentityFile /home/user/.ssh/id_ed25519

Host node-02
  User root
  HostName 2001:db8:0:a::2
  Port 22
  IdentitiesOnly yes
  IdentityFile /home/user/.ssh/id_ed25519

Host node-03
  User root
  HostName 2001:db8:0:a::3
  Port 22
  IdentitiesOnly yes
  IdentityFile /home/user/.ssh/id_ed25519

Update nodes from local flake

Now with everything prepared let’s update our VMs by running make all.

This should update our flake and deploy all 3 nodes, one at a time.
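
A quick sanity check of the network setup on a node doesn’t hurt at this point:

networkctl status enp1s0   # fixed addresses, routes and gateways
resolvectl status          # DNS servers, DNSSEC and DNS-over-TLS state per link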

As proof of success, log in to one of them and run the fastfetch command so you can show it to all your Reddit friends. :P

[root@node-01:~]# fastfetch
          ▗▄▄▄       ▗▄▄▄▄    ▄▄▄▖             root@node-01
          ▜███▙       ▜███▙  ▟███▛             ------------
           ▜███▙       ▜███▙▟███▛              OS: NixOS 24.11 (Vicuna) x86_64
            ▜███▙       ▜██████▛               Host: KVM/QEMU Standard PC (Q35 + ICH9, 2009) (pc-q35-9.1)
     ▟█████████████████▙ ▜████▛     ▟▙         Kernel: Linux 6.13.6
    ▟███████████████████▙ ▜███▙    ▟██▙        Uptime: 6 days, 1 hour, 24 mins
           ▄▄▄▄▖           ▜███▙  ▟███▛        Packages: 379 (nix-system)
          ▟███▛             ▜██▛ ▟███▛         Shell: bash 5.2.37
         ▟███▛               ▜▛ ▟███▛          Display (Virtual-1): 1024x768
▟███████████▛                  ▟██████████▙    Terminal: /dev/pts/0
▜██████████▛                  ▟███████████▛    CPU: Intel(R) Xeon(R) E5-2699C v4 (16) @ 2.20 GHz
      ▟███▛ ▟▙               ▟███▛             GPU: Red Hat, Inc. QXL paravirtual graphic card
     ▟███▛ ▟██▙             ▟███▛              Memory: 1.11 GiB / 62.78 GiB (2%)
    ▟███▛  ▜███▙           ▝▀▀▀▀               Swap: 0 B / 4.00 GiB (0%)
    ▜██▛    ▜███▙ ▜██████████████████▛         Disk (/): 17.79 GiB / 57.77 GiB (31%) - ext4
     ▜▛     ▟████▙ ▜████████████████▛          Local IP (enp1s0): 192.0.2.1/24
           ▟██████▙       ▜███▙                Locale: en_US.UTF-8
          ▟███▛▜███▙       ▜███▙
         ▟███▛  ▜███▙       ▜███▙
         ▝▀▀▀    ▀▀▀▀▘       ▀▀▀▘

As a last step, check out the repo with our flake on each node. I prefer to put it under /root/nix-config. Then delete the /etc/nixos directory and replace it with a symlink to our git repo like this:

rm -r /etc/nixos
ln -s /root/nix-config /etc/nixos

With this setup you’re also able to log into a VM, enter the repo dir, and roll out locally by running nixos-rebuild switch:
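
cd /etc/nixos   # follows the symlink into the repo
nixos-rebuild switch --flake .#node-01
# plain `nixos-rebuild switch` works too, it picks up /etc/nixos/flake.nix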

This should be enough for this chapter. Lean back and enjoy your new VMs. Next time we will set up our PKI.