Self-Built AMD Ryzen NAS

It's been almost 5 years since I built my Xeon D NAS, and my Dell T20 started to fail. (Well, as it turned out, just the RAM was failing; the rest is still fine.)

And here is what I ended up with this round:

  • ASRock X570D4I-2T
  • AMD Ryzen 5 4500 AM4, 3.60 GHz, 6 Core
  • Corsair Vengeance 2 x 32GB, 3200 MHz, DDR4-RAM, SO-DIMM
  • Case: Supermicro SC721 TQ-350B2

I was looking for something new with AMD, since the Intel boards I'd seen over the last few years were less exciting. The ASRock X570D4I-2T comes with 10GbE, is mini-ITX sized, and takes AM4 socket CPUs. It can even handle up to 128GB of memory, which is great since 24GB was not enough to build my FreeBSD packages. I went with the same Supermicro case since I was very happy with it the first time around: it provides space for 4 hot-swappable HDDs, the form factor is great, and it already contains a power supply. I debated building an NVMe-only NAS, but I guess that's an idea for later.

Let's talk about the things I learned/hated. Starting with minor things: OCuLink is an interesting connector, and since I was not sure whether an OCuLink-to-SATA adapter ships with the mainboard, I went ahead and ordered one. In the process I found out that OCuLink is not that popular and kinda hard to buy. I think OCuLink is cool from a technical standpoint and easy to use, but annoying to shop for since it is not widespread yet.

You might have spotted the 8pin(DC-IN)+4pin(ATX) power connector in the specs. This was super unclear to me, and even the documentation provided in paper form did not mention it. I had to consult the full documentation, where on page 25 there is a drawing for it. Apparently for 12V you can connect a single 8-pin connector, but for a normal ATX power supply you need to connect one 4-pin into the 8-pin connector and another 4-pin for the CPU. In the end I'm just happy I didn't fry my mainboard while trying to get power to this board.

Things got worse: I now own 3 CPU coolers for this CPU. The first one is the stock fan which came with the CPU, which I had intended to use. When I started assembling, it became clear that it would not fit and that I needed an LGA115x cooler. Fair enough, I missed that the first time I looked at the specs. I got one from AliExpress, because it is super hard to find coolers for LGA115x. A week later, when I tried to install that fan, I learned that only some LGA115x coolers work, unless I removed the pre-installed cooler backplate. This, at least to me, is ASRock's worst design decision: not going for a standard cooler mount. I ended up with a "Cooler Master I30 I50 I50c MINI CPU Cooler 2600 RPM Quiet Fan For Intel LGA115X 1200 And M-ATX Radiator". The good news: while a lot of people talk about overheating issues, as far as my few days of testing show, this case and fan do a good job of keeping things cool.

Last but not least, I'm not sure whose fault this is, but the case's front panel connector does not match the mainboard's System and Auxiliary Panel headers, so I had to wire the two components up manually. That meant first finding documentation for the connector, which was harder than expected: it is not in the manual for my case. I had to look up a mainboard which uses this connector and search its documentation instead. Luckily the X10SDV Mini-ITX series documentation contains the pin layout. Apparently this thing is called JF1, and the manual explains how to connect it.

Here is how it looks:

self-built connector between JF1 and the AMD System Panel

In summary, it has been running for a few days now, and I'm slowly making sure it handles all the things I need, mainly backups and building FreeBSD packages. I can report that everything runs smoothly, and the extra CPU and memory help build my packages faster. A lot of people online had problems with heat, which is something I have not observed yet; everything runs cool.

Refresh GPG Key

No worries, my complaining about NixOS isn't done :). But for this blog post we take a break and complain about GPG instead. I use and love pass as my primary password manager. It uses gpg to encrypt and decrypt the password files, which are tracked via git. (I even have a small rofi script to access passwords quickly; a sketch of the idea follows below.)
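
The rofi script is essentially just a dmenu-style picker in front of pass; a minimal sketch (not my exact script) could look like this:

#!/bin/sh
# list all entries in the password store, let rofi pick one, copy it to the clipboard
store="${PASSWORD_STORE_DIR:-$HOME/.password-store}"
entry=$(find "$store" -name '*.gpg' -printf '%P\n' | sed 's/\.gpg$//' | rofi -dmenu -p pass)
[ -n "$entry" ] && pass -c "$entry"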

At some point my previous gpg key expired, and since it was 10 years old it was time for an update. A task that sounds easy enough:

gpg --full-generate-key
pass init $keyid

and that should be it. Depending on your gpg version and settings, it might even be good enough. The only remaining step is to export this key and put it on my phone, to unlock my password manager there. I use a combination of OpenKeychain: Easy PGP and Password Store.

Exporting the key and importing it on the phone via a USB stick was easy enough:

gpg --export-secret-key $keyid > gpgprivate.key

Only to then be greeted with:

Encountered OpenPGP Exception during operation!

The New OpenPGP AEAD Mode

AEAD is Authenticated Encryption with Associated Data, which, as far as I understand it, is a way to have unencrypted data (for example a packet header) be part of your authenticated message. Meaning the receiver of the message can check whether the header was modified.

And there are a bunch of incompatible modes / implementations of this. (See https://articles.59.ca/doku.php?id=pgpfan:schism for way more details.) In summary, GnuPG defaults to a mode called OCB, which is not part of the standard, and implementations like the Android app do not support it.
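
If you want to check what a given file was actually encrypted with, gpg can dump the packet structure (output varies between gpg versions, and the path here is just an example):

gpg --list-packets ~/.password-store/example/entry.gpg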

The Arch wiki contains a good description of how to disable AEAD on an existing key:

$ gpg --expert --edit-key <FINGERPRINT>
gpg> showpref
[ultimate] (1). Foobar McFooface (test) <foobar@mcfooface.com>
    Cipher: AES256, AES192, AES, 3DES
    AEAD: OCB
    Digest: SHA512, SHA384, SHA256, SHA224, SHA1
    Compression: ZLIB, BZIP2, ZIP, Uncompressed
    Features: MDC, AEAD, Keyserver no-modify

gpg> setpref AES256 AES192 AES SHA512 SHA384 SHA256 SHA224 ZLIB BZIP2 ZIP
Set preference list to:
    Cipher: AES256, AES192, AES, 3DES
    AEAD:
    Digest: SHA512, SHA384, SHA256, SHA224, SHA1
    Compression: ZLIB, BZIP2, ZIP, Uncompressed
    Features: MDC, Keyserver no-modify
Really update the preferences? (y/N) y

gpg> save

(source: https://wiki.archlinux.org/title/GnuPG#Disable_unsupported_AEAD_mechanism)

Now, if we had already updated the key in pass, we unfortunately need to re-encrypt all files again.

# decrypt every entry, re-encrypt it with the new key preferences, then restore the file names
# (-r "$USER" assumes your key's uid matches your username)
for filename in ./*/*.gpg; do gpg --decrypt "${filename}" > "${filename}.decrypt"; done
for filename in ./*/*.decrypt; do gpg --encrypt -r "$USER" "${filename}"; done
for filename in ./*/*.decrypt.gpg; do mv "${filename}" "${filename/.decrypt.gpg/}"; done

git commit -am "re-encrypt passwords without AEAD mode"
# removes the leftover plaintext *.decrypt files
git clean -dfx

(source: https://github.com/open-keychain/open-keychain/issues/2096)

And tada 🎉, our key now works on Android as well.

Build Custom NixOS Raspberry Pi Images

Since you might not be interested in me hating NixOS, Linux, and the world in general, I put my little rant at the end of this article. The first part covers how to cross-build a NixOS image for a Raspberry Pi 3 B+ from Fedora, compiling through binfmt QEMU: my Fedora laptop is an x86 system and we need to build an AArch64 image.

I assume that nix is already installed and that binfmt is set up and works. And spoiler/warning: I have no idea if this is the proper or a good way to do it; it is just the way that worked for me.

$ nix --version
nix (Nix) 2.15.1

$ ls /proc/sys/fs/binfmt_misc/ | grep aarch64
qemu-aarch64

$ systemctl status systemd-binfmt.service

We need to configure nix to use this, so I added the following config to /etc/nix/nix.conf:

extra-platforms = aarch64-linux
extra-sandbox-paths = /usr/bin/qemu-aarch64-static

After that we need to restart the nix daemon.

$ systemctl restart nix-daemon.service

Then we are ready to create the config file:

$ cat configuration.sdImage.nix
{ config, pkgs, lib, ... }:
{
  nixpkgs.overlays = [
    (final: super: {
      makeModulesClosure = x:
        super.makeModulesClosure (x // { allowMissing = true; });
    })
  ];
  system.stateVersion = lib.mkDefault "23.11";

  imports = [
    <nixpkgs/nixos/modules/installer/sd-card/sd-image-aarch64.nix>
  ];

  nixpkgs.hostPlatform.system = "aarch64-linux";
  sdImage.compressImage = false;

  # NixOS wants to enable GRUB by default
  boot.loader.grub.enable = false;
  # Enables the generation of /boot/extlinux/extlinux.conf
  boot.loader.generic-extlinux-compatible.enable = true;

  # Set to specific linux kernel version
  boot.kernelPackages = pkgs.linuxPackages_rpi3;

  # Needed for the virtual console to work on the RPi 3, as the default of 16M doesn't seem to be enough.
  # If X.org behaves weirdly (I only saw the cursor) then try increasing this to 256M.
  # On a Raspberry Pi 4 with 4 GB, you should either disable this parameter or increase to at least 64M if you want the USB ports to work.
  boot.kernelParams = ["cma=256M"];

  # Settings
  # The rest of your config things

  # Use less privileged nixos user
  users.users.nixos = {
    isNormalUser = true;
    extraGroups = [ "wheel" "networkmanager" "video" ];
    # Allow the graphical user to login without password
    initialHashedPassword = "";
  };

  # Allow the user to log in as root without a password.
  users.users.root.initialHashedPassword = "";
}

The overlay is quite important, due to some issue which I don't fully understand. If it is not added, the build fails with an error like this, where a kernel module is not found:

modprobe: FATAL: Module ahci not found in directory /nix/store/8bsagfwwxdvp9ybz37p092n131vnk8wz-linux-aarch64-unknown-linux-gnu-6.1.21-1.20230405-modules/lib/modules/6.1.21
error: builder for '/nix/store/jmb55l06cvdpvwwivny97aldzh147jwx-linux-aarch64-unknown-linux-gnu-6.1.21-1.20230405-modules-shrunk.drv' failed with exit code 1;
       last 3 log lines:
       > kernel version is 6.1.21
       > root module: ahci
       > modprobe: FATAL: Module ahci not found in directory /nix/store/8bsagfwwxdvp9ybz37p092n131vnk8wz-linux-aarch64-unknown-linux-gnu-6.1.21-1.20230405-modules/lib/modules/6.1.21
       For full logs, run 'nix log /nix/store/jmb55l06cvdpvwwivny97aldzh147jwx-linux-aarch64-unknown-linux-gnu-6.1.21-1.20230405-modules-shrunk.drv'.
error: 1 dependencies of derivation '/nix/store/ndd1yhiy68c2av64gwn8zfpn3yg07iq5-stage-1-init.sh.drv' failed to build
error: 1 dependencies of derivation '/nix/store/j2gmvl3vaj083ww87lwfrnx81g6vias2-initrd-linux-aarch64-unknown-linux-gnu-6.1.21-1.20230405.drv' failed to build
building '/nix/store/vs0cg5kzbislprzrd3ya16n1xd532763-zfs-user-2.1.12-aarch64-unknown-linux-gnu.drv'...
error: 1 dependencies of derivation '/nix/store/gjhfjh9bb3ha0v03k7b4r3wvw4nxm7r3-nixos-system-aegaeon-23.11pre493358.a30520bf8ea.drv' failed to build
error: 1 dependencies of derivation '/nix/store/x5mnb1xfxk7kp0mbjw7ahxrz2yiv922s-ext4-fs.img-aarch64-unknown-linux-gnu.drv' failed to build
error: 1 dependencies of derivation '/nix/store/8qbjy9mnkrbyhj4kvl50m8ynzpgwmrpz-nixos-sd-image-23.11pre493358.a30520bf8ea-aarch64-linux.img-aarch64-unknown-linux-gnu.drv' failed to build

Don't forget to add your customization after # Settings. This is the place where you set up your user, enable required services, and configure networking; a small illustrative example follows below. In my case that's where most of the config from this blog post went: Build a simple dns with a Raspberry Pi and NixOS.
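
As an illustration, it could look like this (these exact options are made-up examples, adapt them to your needs):

# Settings
networking.hostName = "mypi";
services.openssh.enable = true;
time.timeZone = "Europe/Zurich";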

After that we can build (this takes some time!) and flash the image.

nix-build '<nixpkgs/nixos>' -A config.system.build.sdImage -I nixos-config=./configuration.sdImage.nix --option sandbox false --argstr system aarch64-linux

and

sudo -s
cat /path/to/img > /dev/sdX
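
The built image ends up behind the ./result symlink, typically under result/sd-image/ (and /dev/sdX is of course a placeholder; double-check the device before writing):

ls -l result/sd-image/
sudo sh -c 'cat result/sd-image/*.img > /dev/sdX && sync'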

The Rant

Why am I building an image myself instead of using the official image and just doing what I wrote in my earlier blog post, Build a simple dns with a Raspberry Pi and NixOS? The answer is part of my rant: somehow NixOS is not able to upgrade to / build 23.11 on a Raspberry Pi. It crashes for me either while downloading some packages, or with some process that deadlocks or hangs for longer than I was willing to wait (more than 6 hours).

After I decided to try cross-building, it was a real struggle to figure out how to do it. There are a lot of resources out there, and many of them are badly structured or outdated, which makes it very hard for a beginner like me to figure out where to start.

But with all this ranting I also want to point out that most NixOS users seem to want to help you out. Thanks to makefu for answering all my stupid NixOS questions, and to nova for pointing me to the correct GitHub issue.

Incus GitLab Runner

With the recent fork and drama around LXD it might be time to give Incus a chance.

Using Incus as a GitLab runner is nice because it provides you with a simple interface to run containers and VMs for the cases where Docker is not enough. Helpfully, there is a custom LXD GitLab runner provided by GitLab.

Based on that I created a custom Incus GitLab runner. Check it out: https://github.com/fliiiix/gitlab-incus-runner

This can easily be integrated into the deployment system used to set up GitLab runners (think Ansible); a registration sketch follows below.
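
For reference, the relevant piece of the runner's config.toml would look roughly like this; the script paths are placeholders, check the repository for the actual names and options:

[[runners]]
  name = "incus-runner"
  url = "https://gitlab.example.com"
  token = "REDACTED"
  executor = "custom"
  builds_dir = "/builds"
  cache_dir = "/cache"
  [runners.custom]
    # placeholder paths: point these at the scripts from gitlab-incus-runner
    prepare_exec = "/opt/gitlab-incus-runner/prepare.sh"
    run_exec = "/opt/gitlab-incus-runner/run.sh"
    cleanup_exec = "/opt/gitlab-incus-runner/cleanup.sh"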

It also assumes that you have already installed Incus on the runner. To achieve that you can follow the official documentation, or take some inspiration from the next section.

Installing Incus

Look at the official documentation; this is just a quick summary of how I did it.

mkdir -p /etc/apt/keyrings/
curl -fsSL https://pkgs.zabbly.com/key.asc -o /etc/apt/keyrings/zabbly.asc

sh -c 'cat <<EOF > /etc/apt/sources.list.d/zabbly-incus-stable.sources
Enabled: yes
Types: deb
URIs: https://pkgs.zabbly.com/incus/stable
Suites: $(. /etc/os-release && echo ${VERSION_CODENAME})
Components: main
Architectures: $(dpkg --print-architecture)
Signed-By: /etc/apt/keyrings/zabbly.asc

EOF'

apt-get update
apt-get install incus

sudo adduser ubuntu incus-admin

Big kudos to Zabbly and Stéphane Graber for providing pre-built packages!

And to set up Incus I used a cloud-config runcmd.

#cloud-config
runcmd:
 - 'incus admin init --preseed < /etc/incus.seed && touch /etc/incus.init'

This assumes that you created your preseed config:

$ cat /etc/incus.seed
config: {}
networks:
- config:
    ipv4.address: auto
    ipv6.address: none
  description: ""
  name: incusbr0
  type: ""
  project: default
storage_pools:
- config:
    size: 200GiB
  description: ""
  name: default
  driver: zfs
profiles:
- config: {}
  description: ""
  devices:
    eth0:
      name: eth0
      network: incusbr0
      type: nic
    root:
      path: /
      pool: default
      type: disk
  name: default
projects: []
cluster: null
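
Once the preseed has been applied, a quick smoke test (the image name is just an example) shows whether networking and storage came up properly:

incus launch images:debian/12 smoke-test
incus list
incus delete -f smoke-test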

ZFS on NixOS

As I am still experimenting with my NixOS setup, I thought it would be nice to move the user data onto a separate NVMe SSD. The plan was to use ZFS and put /var/lib on it. This would allow me to create snapshots which can be pushed or pulled to my other ZFS systems. That all sounded easy enough, but it took way longer than expected.

Hardware

It all starts with a new NVMe SSD. I got a WD Blue SN570 2000 GB, M.2 2280, because it was very cheap. And here is my first learning: apparently one should re-run nixos-generate-config, or add the nvme module to the hardware config by hand (boot.initrd.availableKernelModules), to allow NixOS to correctly detect the new hardware. (I lost a lot of time figuring this out.) The by-hand variant is sketched below.
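
For reference, a minimal sketch of the by-hand variant; the surrounding modules in the list are illustrative and will differ per machine, the important part is that nvme is present:

# hardware-configuration.nix
boot.initrd.availableKernelModules = [ "nvme" "xhci_pci" "ahci" "usb_storage" "sd_mod" ];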

Software

Creating the ZFS pool is the usual procedure. One thing to note is the device name: since NixOS imports pools using the /dev/disk/by-id/ path, it is recommended to use that path when creating the pool. The by-id name should also stay consistent across hardware changes, while other mappings might change and lead to a broken pool. At least that is my understanding of it. (Source: people on the internet, and Inconsistent Device Names Across Reboot Cause Mount Failure Or Incorrect Mount in Linux.)
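
To find the stable by-id path for the new disk:

ls -l /dev/disk/by-id/ | grep nvme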

sudo zpool create -f -O atime=off -O utf8only=on -O normalization=formD -O aclinherit=passthrough -O compression=zstd -O recordsize=1m -O exec=off tank /dev/disk/by-id/nvme-eui.e8238fa6bf530001001b448b4e246dab

Move data

On the new pool we create datasets and mount them.

zfs create tank/var -o canmount=on
zfs create tank/var/lib -o canmount=on
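
The datasets inherit their mountpoint from the pool, so for now they show up under /tank; this can be verified with:

zfs list -o name,mountpoint,canmount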

Then we can copy over all the current data from /var/lib.

# 1. stop all services accessing `/var/lib`
# 2. move the data (-a preserves ownership and permissions, which matters for service state)
sudo cp -a /var/lib/* /tank/var/lib/
sudo rm -rf /var/lib/
sudo zfs set mountpoint=/var/lib tank/var/lib

And here is the rest of my NixOS config for ZFS:

# Setup ZFS
# Official resources:
# - https://wiki.nixos.org/wiki/ZFS
# - https://openzfs.github.io/openzfs-docs/Getting%20Started/NixOS/index.html#installation

# Enable support for ZFS and always use a compatible kernel
boot.supportedFilesystems = [ "zfs" ];
boot.zfs.forceImportRoot = false;
boot.kernelPackages = config.boot.zfs.package.latestCompatibleLinuxPackages;

# head -c 8 /etc/machine-id
# The primary use case is to ensure when using ZFS 
# that a pool isn’t imported accidentally on a wrong machine.
networking.hostId = "aaaaaaaa";

# Enable scrubbing once a week
# https://openzfs.github.io/openzfs-docs/man/master/8/zpool-scrub.8.html
services.zfs.autoScrub.enable = true;

# Names of the pools to import
boot.zfs.extraPools = [ "tank" ];

And in the end, run sudo nixos-rebuild switch to build the configuration and switch to it.

Fucked up ZFS Pool

In the end I did everything again and started fresh, because my system did not import my ZFS pool after a reboot. Here are the key things I learned.

NixOS imports the pools by-id, running a command like this (source):

zpool import -d "/dev/disk/by-id" -N tank

This can be configured via boot.zfs.devNodes (source). It took me a while to figure that out, since I usually just run zpool import tank.
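
In config terms that means the following (the value shown is already the NixOS default):

# where NixOS looks for devices when importing pools at boot
boot.zfs.devNodes = "/dev/disk/by-id";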

And the behavior I saw was:

zpool import tank <- works
zpool import -d "/dev/disk/by-id" -N tank <- fails

As it turns out, wipefs does not necessarily remove all zpool information from a disk.

$ sudo wipefs -a /dev/nvme0n1
/dev/nvme0n1: 8 bytes were erased at offset 0x1d1c10abc00 (zfs_member): 0c b1 ba 00 00 00 00 00
/dev/nvme0n1: 8 bytes were erased at offset 0x1d1c10a9800 (zfs_member): 0c b1 ba 00 00 00 00 00
/dev/nvme0n1: 8 bytes were erased at offset 0x1d1c10a8000 (zfs_member): 0c b1 ba 00 00 00 00 00
...

While wipefs reports everything as deleted, we can check with zdb that there is in fact still a ZFS label on the disk.

$ sudo zdb -l /dev/nvme0n1
failed to unpack label 0
------------------------------------
LABEL 1
------------------------------------
    version: 5000
    name: 'tank'
    state: 1
    txg: 47
    pool_guid: 16638860066397443734
    errata: 0
    hostid: 2138265770
    hostname: 'telesto'
    top_guid: 4799150557898763025
    guid: 4799150557898763025
    vdev_children: 1
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 4799150557898763025
        path: '/dev/disk/by-id/nvme-eui.e8238fa6bf530001001b448b4e246dab'
        whole_disk: 0
        metaslab_array: 64
        metaslab_shift: 34
        ashift: 9
        asize: 2000394125312
        is_log: 0
        create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 1 2 3

And the way to clear that is by dd-ing the right spots at the front and the back of the disk (ZFS keeps two of its four labels at the start of the device and two at the end).

# clear the two labels at the start of the disk
sudo dd if=/dev/zero of=/dev/nvme0n1 count=4 bs=512k
# clear the two labels at the end (the seek offset is specific to this 2 TB disk)
sudo dd if=/dev/zero of=/dev/nvme0n1 oseek=3907027120
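
For what it's worth, ZFS also ships a command intended for exactly this; I did not verify it on this particular disk, but it might save you the offset math:

sudo zpool labelclear -f /dev/nvme0n1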

There is a Super User answer which shows how that works. And here is my lengthy back and forth where we figured out that this was the issue.

Other Resources Which Were Helpful