glusterfs setup on c1:
* for h in c1 c2 c3; do ssh $h sudo mkdir -p /persist/glusterfs/compute; done
* gluster peer probe c2
* gluster peer probe c3
* gluster volume create compute replica 3 c{1,2,3}:/persist/glusterfs/compute/brick1
* gluster volume start compute
* gluster volume bitrot compute enable
(verification commands at the end of these notes)

mysql credentials
* Put secrets/mysql_root_password into a Nomad var named secrets/mysql.root_password

postgres credentials
* Put secrets/postgres_password into a Nomad var named secrets/postgresql.postgres_password
(example nomad CLI invocations at the end of these notes)

adding a new gluster node (zippy) to the compute volume, with c3 having failed:
(instructions from https://icicimov.github.io/blog/high-availability/Replacing-GlusterFS-failed-node/)
* zippy: sudo mkdir -p /persist/glusterfs/compute
* c1: gluster peer probe 192.168.1.2 (by IP, because "zippy" resolved to a tailscale address)
* c1: gluster volume replace-brick compute c3:/persist/glusterfs/compute/brick1 192.168.1.2:/persist/glusterfs/compute/brick1 commit force
* c1: gluster volume heal compute full (heal-progress check at the end of these notes)
* c1: gluster peer detach c3
The same procedure was later used to replace 192.168.1.2 with 192.168.1.73.

replacing a failed/reinstalled gluster node (c1 in this case); all commands run on c2:
* gluster volume remove-brick compute replica 2 c1:/persist/glusterfs/compute/brick1 force
* gluster peer detach c1
* gluster peer probe 192.168.1.71 (not "c1", because switching to IPs to avoid DNS/tailscale issues)
* gluster volume add-brick compute replica 3 192.168.1.71:/persist/glusterfs/compute/brick1

kopia repository server setup (on a non-NixOS host at the time):
* kopia repository create filesystem --path /backup/persist
* kopia repository connect filesystem --path=/backup/persist
* kopia server user add root@zippy
  then add the password to secrets/zippy.yaml -- the key needs to be "kopia"
* kopia server start --address 0.0.0.0:51515 --tls-cert-file ~/kopia-certs/kopia.cert --tls-key-file ~/kopia-certs/kopia.key --tls-generate-cert (first time)
* kopia server start --address 0.0.0.0:51515 --tls-cert-file ~/kopia-certs/kopia.cert --tls-key-file ~/kopia-certs/kopia.key (subsequent starts)
  [TLS is mandatory for the kopia server; a client connect example is at the end of these notes]

NFS services server setup (one-time on the NFS server host, e.g. zippy):
* sudo btrfs subvolume create /persist/services
* sudo mkdir -p /persist/root/.ssh
* sudo ssh-keygen -t ed25519 -f /persist/root/.ssh/btrfs-replication -N "" -C "root@$(hostname)-replication"
* Get the public key: sudo cat /persist/root/.ssh/btrfs-replication.pub
  Then add this public key to each standby's nfsServicesStandby.replicationKeys option
(replication sketch at the end of these notes)

NFS services standby setup (one-time on each standby host, e.g. c1):
* sudo btrfs subvolume create /persist/services-standby

Moving NFS server role between hosts (e.g. from zippy to c1):
See docs/NFS_FAILOVER.md for the detailed procedure. Summary:
1. On the current primary: create a final snapshot and send it to the new primary
2. On the new primary: promote the snapshot to /persist/services (promotion sketch at the end of these notes)
3. Update configs: remove nfs-services-server.nix from the old primary, add it to the new primary
4. Update configs: add nfs-services-standby.nix to the old primary (with replication keys)
5. Deploy the old primary first (to demote it), then the new primary (to promote it)
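
verifying the gluster volume (standard gluster CLI checks, not part of the original run):
* gluster peer status            (each peer should show "Peer in Cluster (Connected)")
* gluster volume info compute    (expect Type: Replicate, Number of Bricks: 1 x 3 = 3)
* gluster volume status compute  (all brick processes should be online)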
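
loading the mysql/postgres credentials -- one possible nomad CLI invocation. This assumes the var path is secrets/mysql with item root_password (and likewise for postgres); adjust if the jobs' templates expect a different path/key split:
* nomad var put secrets/mysql root_password="$(cat secrets/mysql_root_password)"
* nomad var put secrets/postgresql postgres_password="$(cat secrets/postgres_password)"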
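
after a replace-brick or add-brick, heal progress can be watched with the standard gluster CLI; wait for the entry counts to reach 0 before detaching or replacing anything else:
* gluster volume heal compute info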
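
connecting a kopia client to the repository server above (standard kopia CLI; the TLS cert fingerprint is printed when the server starts, the username/hostname must match the user added with "kopia server user add", and the snapshot path here is only an example):
* kopia repository connect server --url=https://zippy:51515 --server-cert-fingerprint=<sha256 from server log>
* kopia snapshot create /persist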
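
the standby replication amounts to incremental btrfs send/receive over ssh. A minimal manual sketch with illustrative snapshot names (the nix modules define the real naming and scheduling):
* on zippy: sudo btrfs subvolume snapshot -r /persist/services /persist/services-snap-new
* on zippy: sudo btrfs send -p /persist/services-snap-prev /persist/services-snap-new | ssh -i /persist/root/.ssh/btrfs-replication root@c1 btrfs receive /persist/services-standby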
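
step 2 of the failover summary ("promote the snapshot to /persist/services"): received snapshots are read-only, so promotion is a writable snapshot of the final received one (illustrative name; docs/NFS_FAILOVER.md has the authoritative commands):
* on c1: sudo btrfs subvolume snapshot /persist/services-standby/<final-snapshot> /persist/services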