# Cluster Architecture Revamp
**Status**: Planning complete, ready for review and refinement
## Key Decisions
**Replication**: 5-minute intervals (incremental btrfs send)
**WordPress**: Currently syncthing → will use `/data/services` via NFS
**Media**: Only media.hcl needs `/data/media`, constrained to fractal
**Unifi**: Floating (no constraint needed)
**Sunny**: Standalone, ethereum data stays local (not replicated)
**Quorum**: 5 servers (c1, c2, c3, fractal, zippy)
**NFS Failover**: Via Consul DNS (`services.service.consul`)
## Table of Contents
1. [End State Architecture](#end-state-architecture)
2. [Migration Steps](#migration-steps)
3. [Service Catalog](#service-catalog)
4. [Failover Procedures](#failover-procedures)
---
## End State Architecture
### Cluster Topology
**5-Server Quorum (Consul + Nomad server+client):**
- **c1, c2, c3**: Cattle nodes - x86_64, run most stateless workloads
- **fractal**: Storage node - x86_64, 6x spinning drives, runs media workloads
- **zippy**: Stateful anchor - x86_64, runs database workloads (via affinity), primary NFS server
**Standalone Nodes (not in quorum):**
- **sunny**: x86_64, ethereum node + staking, base NixOS configs only
- **chilly**: x86_64, Home Assistant VM, base NixOS configs only
**Quorum Math:**
- 5 servers → quorum requires 3 healthy nodes
- Can tolerate 2 simultaneous failures
- Bootstrap expect: 3
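Once all five servers are enrolled, the quorum math can be sanity-checked from any node. A quick sketch (assumes the `consul` and `nomad` CLIs can reach the local agents):
```bash
consul operator raft list-peers    # expect 5 voters, exactly one leader
nomad server members               # expect 5 alive servers
nomad operator raft list-peers
```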
### Storage Architecture
**Primary Storage (zippy):**
- `/persist/services` - btrfs subvolume
- Contains: mysql, postgres, redis, clickhouse, mongodb, app data
- Exported via NFS to: `services.service.consul:/persist/services`
- Replicated via **btrfs send** to c1 and c2 every **5 minutes** (incremental)
**Standby Storage (c1, c2):**
- `/persist/services-standby` - btrfs subvolume
- Receives replicated snapshots from zippy via incremental btrfs send
- Can be promoted to `/persist/services` and exported as NFS during failover
- Maximum data loss: **5 minutes** (last replication interval)
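A quick replication-freshness check on the standbys, assuming snapshots are received under `/persist/services-standby/` as configured in Phase 2 (snapshot names embed their creation timestamp):
```bash
ssh c1 'ls -dt /persist/services-standby/services@* | head -1'
ssh c2 'ls -dt /persist/services-standby/services@* | head -1'
```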
**Standalone Storage (sunny):**
- `/persist/ethereum` - local btrfs subvolume (or similar)
- Contains: ethereum blockchain data, staking keys
- **NOT replicated** - too large/expensive to replicate full ethereum node
- Backed up via kopia to fractal (if feasible/needed)
**Media Storage (fractal):**
- `/data/media` - existing spinning drive storage
- Exported via Samba (existing)
- Mounted on c1, c2, c3 via CIFS (existing)
- Local access on fractal for media workloads
**Shared Storage (fractal):**
- `/data/shared` - existing spinning drive storage
- Exported via Samba (existing)
- Mounted on c1, c2, c3 via CIFS (existing)
### Network Services
**NFS Primary (zippy):**
```nix
services.nfs.server = {
enable = true;
exports = ''
/persist/services 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
'';
};
services.consul.extraConfig.services = [{
name = "services";
port = 2049;
checks = [{ tcp = "localhost:2049"; interval = "30s"; }];
}];
```
**NFS Client (all nodes):**
```nix
fileSystems."/data/services" = {
device = "services.service.consul:/persist/services";
fsType = "nfs";
options = [ "x-systemd.automount" "noauto" "x-systemd.idle-timeout=60" ];
};
```
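For the `services.service.consul` device to resolve at mount time, each node's system resolver must forward the `.consul` domain to the local Consul DNS interface (port 8600); how that is wired (systemd-resolved, dnsmasq, etc.) is outside this snippet. A quick resolution check, assuming Consul DNS on localhost:8600:
```bash
dig @localhost -p 8600 +short services.service.consul   # direct query against Consul DNS
getent hosts services.service.consul                     # via the system resolver (what mount actually uses)
```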
**Samba Exports (fractal - existing):**
- `//fractal/media` → `/data/media`
- `//fractal/shared` → `/data/shared`
### Nomad Job Placement Strategy
**Affinity-based (prefer zippy, allow c1/c2):**
- mysql, postgres, redis - stateful databases
- Run on zippy normally, can failover to c1/c2 if zippy down
**Constrained (must run on fractal):**
- **media.hcl** - radarr, sonarr, bazarr, plex, qbittorrent
- Reason: Heavy /data/media access, benefits from local storage
- **prometheus.hcl** - metrics database with 30d retention
- Reason: Large time-series data, spinning disks OK, saves SSD space
- **loki.hcl** - log aggregation with 31d retention
- Reason: Large log data, spinning disks OK
- **clickhouse.hcl** - analytics database for plausible
- Reason: Large time-series data, spinning disks OK
**Floating (can run anywhere on c1/c2/c3/fractal/zippy):**
- All other services including:
- traefik, authentik, web apps
- **grafana** (small data, just dashboards/config, queries prometheus for metrics)
- databases (mysql, postgres, redis)
- vector (system job, runs everywhere)
- Nomad schedules based on resources and constraints
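A rough placement audit after redeployment, to confirm constraints and affinities landed where intended (a sketch; assumes the `nomad` CLI is pointed at the cluster):
```bash
nomad job status                                      # all jobs at a glance
nomad job status media | grep -A 10 '^Allocations'    # media allocations should be on fractal
nomad node status -verbose                            # per-node view with attributes
```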
### Data Migration
**Path changes needed in Nomad jobs:**
- `/data/compute/appdata/*` → `/data/services/*`
- `/data/compute/config/*` → `/data/services/*`
- `/data/sync/wordpress` → `/data/services/wordpress`
**No changes needed:**
- `/data/media/*` - stays the same (CIFS mount from fractal, used only by media services)
- `/data/shared/*` - stays the same (CIFS mount from fractal)
**Deprecated after migration:**
- `/data/sync/wordpress` - currently managed by syncthing to avoid slow GlusterFS
- Will be replaced by NFS mount at `/data/services/wordpress`
- Syncthing configuration for this can be removed
- Final sync: copy from syncthing to `/persist/services/wordpress` on zippy before cutover
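A sketch of that final cutover sync (paths from this plan; assumes the Nomad job is named `wordpress` and is stopped first so the files are quiescent):
```bash
nomad job stop wordpress
rsync -av --delete /data/sync/wordpress/ zippy:/persist/services/wordpress/
# redeploy with the /data/services/wordpress volume path in Phase 4
```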
---
## Migration Steps
**Important path simplification note:**
- All service paths use `/data/services/*` directly (not `/data/services/appdata/*`)
- Example: `/data/compute/appdata/mysql` → `/data/services/mysql`
- Simpler, cleaner, easier to manage
### Phase 0: Preparation
**Duration: 1-2 hours**
1. **Backup everything**
```bash
# On all nodes, ensure kopia backups are current
kopia snapshot list
# Backup glusterfs data manually
rsync -av /data/compute/ /backup/compute-pre-migration/
```
2. **Document current state**
```bash
# Save current nomad job list
nomad job status -json > /backup/nomad-jobs-pre-migration.json
# Save consul service catalog
consul catalog services > /backup/consul-services-pre-migration.txt
```
3. **Review this document**
- Verify all services are cataloged
- Confirm priority assignments
- Adjust as needed
### Phase 1: Convert fractal to NixOS
**Duration: 6-8 hours**
**Current state:**
- Proxmox on ZFS
- System pool: `rpool` (~500GB, will be wiped)
- Data pools (preserved):
- `double1` - 3.6T (homes, shared)
- `double2` - 7.2T (backup - kopia repo, PBS)
- `double3` - 17T (media, torrent)
- Services: Samba (homes, shared, media), Kopia server, PBS
- Bind mounts: `/data/{homes,shared,media,torrent}` → ZFS datasets
**Goal:** Fresh NixOS on rpool, preserve data pools, join cluster
#### Step-by-step procedure:
**1. Pre-migration documentation**
```bash
# Capture fractal's ZFS layout (script is built locally, then copied to fractal and run there)
cat > /tmp/detect-zfs.sh << 'EOF'
#!/bin/bash
echo "=== ZFS Pools ==="
zpool status
echo -e "\n=== ZFS Datasets ==="
zfs list -o name,mountpoint,used,avail,mounted -r double1 double2 double3
echo -e "\n=== Bind mounts ==="
cat /etc/fstab | grep double
echo -e "\n=== Data directories ==="
ls -la /data/
echo -e "\n=== Samba users/groups ==="
getent group shared compute
getent passwd compute
EOF
chmod +x /tmp/detect-zfs.sh
scp /tmp/detect-zfs.sh fractal:/tmp/detect-zfs.sh
ssh fractal /tmp/detect-zfs.sh > /backup/fractal-zfs-layout.txt
# Save samba config
scp fractal:/etc/samba/smb.conf /backup/fractal-smb.conf
# Save kopia certs and config
scp -r fractal:~/kopia-certs /backup/fractal-kopia-certs/
scp fractal:~/.config/kopia/repository.config /backup/fractal-kopia-repository.config
# Verify kopia backups are current
ssh fractal "kopia snapshot list --all"
```
**2. Stop services on fractal**
```bash
ssh fractal "systemctl stop smbd nmbd kopia"
# Don't stop PBS yet (in case we need to restore)
```
**3. Install NixOS**
- Boot NixOS installer USB
- **IMPORTANT**: Do NOT touch double1, double2, double3 during install!
- Install only on `rpool` (or create new pool if needed)
```bash
# In NixOS installer
# Option A: Reuse rpool (wipe and recreate)
zpool destroy rpool
# Option B: Use different disk if available
# Then follow standard NixOS btrfs install on that disk
```
- Use standard encrypted btrfs layout (matching other hosts)
- Minimal install first, will add cluster configs later
**4. First boot - import ZFS pools**
```bash
# SSH into fresh NixOS install
# Import pools (read-only first, to be safe)
zpool import -f -o readonly=on double1
zpool import -f -o readonly=on double2
zpool import -f -o readonly=on double3
# Verify datasets
zfs list -r double1 double2 double3
# Example output should show:
# double1/homes
# double1/shared
# double2/backup
# double3/media
# double3/torrent
# If everything looks good, export and reimport read-write
zpool export double1 double2 double3
zpool import double1
zpool import double2
zpool import double3
# Set ZFS mountpoints (if needed)
# These may already be set from Proxmox
zfs set mountpoint=/double1 double1
zfs set mountpoint=/double2 double2
zfs set mountpoint=/double3 double3
```
**5. Create fractal NixOS configuration**
```nix
# hosts/fractal/default.nix
{ config, pkgs, ... }:
{
imports = [
../../common/encrypted-btrfs-layout.nix
../../common/global
../../common/cluster-node.nix # Consul + Nomad (will add in step 7)
../../common/nomad.nix # Both server and client
./hardware.nix
];
networking.hostName = "fractal";
# ZFS support
boot.supportedFilesystems = [ "zfs" ];
boot.zfs.extraPools = [ "double1" "double2" "double3" ];
# boot.zfs.extraPools generates zfs-import-<pool> units, so no extra import service is needed;
# the bind mounts below order themselves after zfs-mount.service
# Bind mounts for /data (matching Proxmox setup)
fileSystems."/data/homes" = {
device = "/double1/homes";
fsType = "none";
options = [ "bind" "x-systemd.requires=zfs-mount.service" ];
};
fileSystems."/data/shared" = {
device = "/double1/shared";
fsType = "none";
options = [ "bind" "x-systemd.requires=zfs-mount.service" ];
};
fileSystems."/data/media" = {
device = "/double3/media";
fsType = "none";
options = [ "bind" "x-systemd.requires=zfs-mount.service" ];
};
fileSystems."/data/torrent" = {
device = "/double3/torrent";
fsType = "none";
options = [ "bind" "x-systemd.requires=zfs-mount.service" ];
};
fileSystems."/backup" = {
device = "/double2/backup";
fsType = "none";
options = [ "bind" "x-systemd.requires=zfs-mount.service" ];
};
# Create data directory structure
systemd.tmpfiles.rules = [
"d /data 0755 root root -"
];
# Users and groups for samba
users.groups.shared = { gid = 1001; };
users.groups.compute = { gid = 1002; };
users.users.compute = {
isSystemUser = true;
uid = 1002;
group = "compute";
};
# Ensure ppetru is in shared group
users.users.ppetru.extraGroups = [ "shared" ];
# Samba server
services.samba = {
enable = true;
openFirewall = true;
extraConfig = ''
workgroup = WORKGROUP
server string = fractal
netbios name = fractal
security = user
map to guest = bad user
'';
shares = {
homes = {
comment = "Home Directories";
browseable = "no";
path = "/data/homes/%S";
"read only" = "no";
};
shared = {
path = "/data/shared";
"read only" = "no";
browseable = "yes";
"guest ok" = "no";
"create mask" = "0775";
"directory mask" = "0775";
"force group" = "+shared";
};
media = {
path = "/data/media";
"read only" = "no";
browseable = "yes";
"guest ok" = "no";
"create mask" = "0755";
"directory mask" = "0755";
};
};
};
# Kopia backup server
systemd.services.kopia-server = {
description = "Kopia Backup Server";
wantedBy = [ "multi-user.target" ];
after = [ "network.target" "zfs-mount.service" ];
serviceConfig = {
User = "ppetru";
Group = "users";
ExecStart = ''
${pkgs.kopia}/bin/kopia server start \
--address 0.0.0.0:51515 \
--tls-cert-file /home/ppetru/kopia-certs/kopia.cert \
--tls-key-file /home/ppetru/kopia-certs/kopia.key
'';
Restart = "on-failure";
};
};
# Kopia nightly snapshot (from cron)
systemd.services.kopia-snapshot = {
description = "Kopia snapshot of homes and shared";
serviceConfig = {
Type = "oneshot";
User = "ppetru";
Group = "users";
ExecStart = ''
${pkgs.kopia}/bin/kopia --config-file=/home/ppetru/.config/kopia/repository.config \
snapshot create /data/homes /data/shared \
--log-level=warning --no-progress
'';
};
};
systemd.timers.kopia-snapshot = {
wantedBy = [ "timers.target" ];
timerConfig = {
OnCalendar = "22:47";
Persistent = true;
};
};
# Keep kopia config and certs persistent
environment.persistence."/persist" = {
directories = [
"/home/ppetru/.config/kopia"
"/home/ppetru/kopia-certs"
];
};
networking.firewall.allowedTCPPorts = [
139 445 # Samba
51515 # Kopia
];
networking.firewall.allowedUDPPorts = [
137 138 # Samba
];
}
```
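One thing the declarative config above cannot cover: Samba passwords live in Samba's own passdb, so they must be re-added imperatively after the first deploy. A sketch (user names taken from the config above; add any other SMB accounts as needed):
```bash
ssh fractal 'sudo smbpasswd -a ppetru'    # prompts for the share password
# repeat for other accounts that log in over SMB
```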
**6. Deploy initial config (without cluster)**
```bash
# First, deploy without cluster-node.nix to verify storage works
# Comment out cluster-node import temporarily
deploy -s '.#fractal'
# Verify mounts
ssh fractal "df -h | grep data"
ssh fractal "ls -la /data/"
# Test samba
smbclient -L fractal -U ppetru
# Test kopia
ssh fractal "systemctl status kopia-server"
```
**7. Join cluster (add to quorum)**
```bash
# Uncomment cluster-node.nix import in fractal config
# Update all cluster configs for 5-server quorum
# (See step 8 below)
deploy # Deploy to all nodes
# Verify quorum
consul members
nomad server members
```
**8. Update cluster configs for 5-server quorum**
```nix
# common/consul.nix
servers = ["c1" "c2" "c3" "fractal" "zippy"];
bootstrap_expect = 3;
# common/nomad.nix
servers = ["c1" "c2" "c3" "fractal" "zippy"];
bootstrap_expect = 3;
```
**9. Verify fractal is fully operational**
```bash
# Check all services
ssh fractal "systemctl status samba kopia-server kopia-snapshot.timer"
# Verify ZFS pools
ssh fractal "zpool status"
ssh fractal "zfs list"
# Test accessing shares from another node
ssh c1 "ls /data/media /data/shared"
# Verify kopia clients can still connect
kopia repository status --server=https://fractal:51515
# Check nomad can see fractal
nomad node status | grep fractal
# Verify quorum
consul members # Should see c1, c2, c3, fractal (zippy joins in Phase 2)
nomad server members # Should see 4 servers (5 once zippy joins in Phase 2)
```
### Phase 2: Setup zippy storage layer
**Duration: 2-3 hours**
**Goal:** Prepare zippy for NFS server role, setup replication
1. **Create btrfs subvolume on zippy**
```bash
ssh zippy
sudo btrfs subvolume create /persist/services
sudo chown ppetru:users /persist/services
```
2. **Update zippy configuration**
```nix
# hosts/zippy/default.nix
imports = [
../../common/encrypted-btrfs-layout.nix
../../common/global
../../common/cluster-node.nix # Adds to quorum
../../common/nomad.nix
./hardware.nix
];
# NFS server
services.nfs.server = {
enable = true;
exports = ''
/persist/services 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
'';
};
# Consul service registration for NFS
services.consul.extraConfig.services = [{
name = "services";
port = 2049;
checks = [{ tcp = "localhost:2049"; interval = "30s"; }];
}];
# Btrfs replication to standbys (incremental after first full send)
systemd.services.replicate-to-c1 = {
description = "Replicate /persist/services to c1";
script = ''
SNAP=/persist/services@$(date +%Y%m%d-%H%M%S)
${pkgs.btrfs-progs}/bin/btrfs subvolume snapshot -r /persist/services $SNAP
# Snapshot names embed their timestamp, so a name sort gives chronological order
LATEST=$(ls -d /persist/services@* | sort | tail -n 1)
PREV=$(ls -d /persist/services@* | sort | tail -n 2 | head -n 1)
if [ "$LATEST" != "$PREV" ]; then
# Incremental send against the previous snapshot
${pkgs.btrfs-progs}/bin/btrfs send -p $PREV $LATEST | ${pkgs.openssh}/bin/ssh c1 "${pkgs.btrfs-progs}/bin/btrfs receive /persist/services-standby/"
else
# First snapshot: full send
${pkgs.btrfs-progs}/bin/btrfs send $LATEST | ${pkgs.openssh}/bin/ssh c1 "${pkgs.btrfs-progs}/bin/btrfs receive /persist/services-standby/"
fi
# Cleanup old snapshots on the sender (keep last 24 hours)
find /persist -maxdepth 1 -name 'services@*' -mtime +1 -exec ${pkgs.btrfs-progs}/bin/btrfs subvolume delete {} \;
'';
};
systemd.timers.replicate-to-c1 = {
wantedBy = [ "timers.target" ];
timerConfig = {
OnCalendar = "*:0/5"; # Every 5 minutes (incremental after first full send)
Persistent = true;
};
};
# Same for c2
systemd.services.replicate-to-c2 = { ... };
systemd.timers.replicate-to-c2 = { ... };
```
3. **Setup standby storage on c1 and c2**
```bash
# On c1 and c2
ssh c1 sudo btrfs subvolume create /persist/services-standby
ssh c2 sudo btrfs subvolume create /persist/services-standby
```
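Before the 5-minute timers can do incremental sends, each standby needs one full send as a parent snapshot. A manual seeding sketch, run from a workstation with ssh access (assumes root or passwordless sudo on all three hosts):
```bash
ssh zippy 'sudo btrfs subvolume snapshot -r /persist/services /persist/services@seed'
ssh zippy 'sudo btrfs send /persist/services@seed' | ssh c1 'sudo btrfs receive /persist/services-standby/'
ssh zippy 'sudo btrfs send /persist/services@seed' | ssh c2 'sudo btrfs receive /persist/services-standby/'
```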
4. **Deploy and verify**
```bash
deploy -s '.#zippy'
# Verify NFS export
showmount -e zippy
# Verify Consul registration
dig @localhost -p 8600 services.service.consul
```
5. **Verify quorum is now 5 servers**
```bash
consul members # Should show c1, c2, c3, fractal, zippy
nomad server members
```
### Phase 3: Migrate from GlusterFS to NFS
**Duration: 3-4 hours**
**Goal:** Move all data, update mounts, remove GlusterFS
1. **Copy data from GlusterFS to zippy**
```bash
# On any node with /data/compute mounted
rsync -av --progress /data/compute/ zippy:/persist/services/
# Verify
ssh zippy du -sh /persist/services
```
2. **Update all nodes to mount NFS**
```nix
# Update common/glusterfs-client.nix → common/nfs-client.nix
# OR update common/cluster-node.nix to import nfs-client instead
fileSystems."/data/services" = {
device = "services.service.consul:/persist/services";
fsType = "nfs";
options = [ "x-systemd.automount" "noauto" "x-systemd.idle-timeout=60" ];
};
# Remove old GlusterFS mount
# fileSystems."/data/compute" = ... # DELETE
```
3. **Deploy updated configs**
```bash
deploy -s '.#c1' '.#c2' '.#c3' '.#fractal' '.#zippy'
```
4. **Verify NFS mounts**
```bash
for host in c1 c2 c3 fractal zippy; do
ssh $host "df -h | grep services"
done
```
5. **Stop all Nomad jobs temporarily**
```bash
# Get list of running jobs
nomad job status | grep running | awk '{print $1}' > /tmp/running-jobs.txt
# Stop all (they'll be restarted with updated paths in Phase 4)
cat /tmp/running-jobs.txt | xargs -I {} nomad job stop {}
```
6. **Remove GlusterFS from cluster**
```bash
# On c1 (or any gluster server)
gluster volume stop compute
gluster volume delete compute
# On all nodes
for host in c1 c2 c3; do
ssh $host "sudo systemctl stop glusterd; sudo systemctl disable glusterd"
done
```
7. **Remove GlusterFS from NixOS configs**
```nix
# common/compute-node.nix - remove ./glusterfs.nix import
# Deploy again
deploy
```
### Phase 4: Update and redeploy Nomad jobs
**Duration: 2-4 hours**
**Goal:** Update all Nomad job paths, add constraints/affinities, redeploy
1. **Update job specs** (see Service Catalog below for details)
- Change `/data/compute` → `/data/services`
- Add constraints for media jobs → fractal
- Add affinities for database jobs → zippy
2. **Deploy critical services first**
```bash
# Core infrastructure
nomad run services/mysql.hcl
nomad run services/postgres.hcl
nomad run services/redis.hcl
nomad run services/traefik.hcl
nomad run services/authentik.hcl
# Verify
nomad job status mysql
consul catalog services
```
3. **Deploy high-priority services**
```bash
nomad run services/prometheus.hcl
nomad run services/grafana.hcl
nomad run services/loki.hcl
nomad run services/vector.hcl
nomad run services/unifi.hcl
nomad run services/gitea.hcl
```
4. **Deploy medium-priority services**
```bash
# See service catalog for full list
nomad run services/wordpress.hcl
nomad run services/ghost.hcl
nomad run services/wiki.hcl
# ... etc
```
5. **Deploy low-priority services**
```bash
nomad run services/media.hcl # Will run on fractal due to constraint
# ... etc
```
6. **Verify all services healthy**
```bash
nomad job status
consul catalog services
# Check traefik dashboard for health
```
### Phase 5: Convert sunny to NixOS (Optional, can defer)
**Duration: 6-8 hours for Stage 1; Stage 2 deferred (TBD)**
**Current state:**
- Proxmox with ~1.5TB ethereum node data
- 2x LXC containers: besu (execution client), lighthouse (consensus beacon)
- 1x VM: Rocketpool smartnode (docker containers for validator, node, MEV-boost, etc.)
- Running in "hybrid mode" - managing own execution/consensus, rocketpool manages the rest
**Goal:** Get sunny on NixOS quickly, preserve ethereum data, defer "perfect" native setup
---
#### Stage 1: Quick NixOS Migration (containers)
**Duration: 6-8 hours**
**Goal:** NixOS + containerized ethereum stack, minimal disruption
**1. Pre-migration backup and documentation**
```bash
# Document current setup
ssh sunny "pct list" > /backup/sunny-containers.txt
ssh sunny "qm list" > /backup/sunny-vms.txt
# Find ethereum data locations in LXC containers
ssh sunny "pct config BESU_CT_ID" > /backup/sunny-besu-config.txt
ssh sunny "pct config LIGHTHOUSE_CT_ID" > /backup/sunny-lighthouse-config.txt
# Document rocketpool VM volumes
ssh sunny "qm config ROCKETPOOL_VM_ID" > /backup/sunny-rocketpool-config.txt
# Estimate ethereum data size
ssh sunny "du -sh /path/to/besu/data"
ssh sunny "du -sh /path/to/lighthouse/data"
# Backup rocketpool config (docker-compose, wallet keys, etc.)
# This is in the VM - need to access and backup critical files
```
**2. Extract ethereum data from containers/VM**
```bash
# Stop ethereum services to get consistent state
# (This will pause validation! Plan for attestation penalties)
# Copy besu data out of LXC
ssh sunny "pct stop BESU_CT_ID"
rsync -av --progress sunny:/var/lib/lxc/BESU_CT_ID/rootfs/path/to/besu/ /backup/sunny-besu-data/
# Copy lighthouse data out of LXC
ssh sunny "pct stop LIGHTHOUSE_CT_ID"
rsync -av --progress sunny:/var/lib/lxc/LIGHTHOUSE_CT_ID/rootfs/path/to/lighthouse/ /backup/sunny-lighthouse-data/
# Copy rocketpool data out of VM
# This includes validator keys, wallet, node config
# Access VM and copy out: ~/.rocketpool/data
```
**3. Install NixOS on sunny**
- Fresh install with btrfs + impermanence
- Create large `/persist/ethereum` for 1.5TB+ data
- **DO NOT** try to resync from network (takes weeks!)
**4. Restore ethereum data to NixOS**
```bash
# After NixOS install, copy data back
ssh sunny "mkdir -p /persist/ethereum/{besu,lighthouse,rocketpool}"
rsync -av --progress /backup/sunny-besu-data/ sunny:/persist/ethereum/besu/
rsync -av --progress /backup/sunny-lighthouse-data/ sunny:/persist/ethereum/lighthouse/
# Rocketpool data copied later
```
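Before moving on, it is worth confirming the restored data matches the sizes recorded in step 1 (a simple sanity check):
```bash
ssh sunny 'du -sh /persist/ethereum/besu /persist/ethereum/lighthouse'
# compare against the pre-migration estimates from step 1
```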
**5. Create sunny NixOS config (container-based)**
```nix
# hosts/sunny/default.nix
{ config, pkgs, ... }:
{
imports = [
../../common/encrypted-btrfs-layout.nix
../../common/global
./hardware.nix
];
networking.hostName = "sunny";
# NO cluster-node import - standalone for now
# Can add to quorum later if desired
# Container runtime
virtualisation.podman = {
enable = true;
dockerCompat = true; # Provides 'docker' command
defaultNetwork.settings.dns_enabled = true;
};
# Besu execution client (container)
virtualisation.oci-containers.containers.besu = {
image = "hyperledger/besu:latest";
volumes = [
"/persist/ethereum/besu:/var/lib/besu"
];
ports = [
"8545:8545" # HTTP RPC
"8546:8546" # WebSocket RPC
"30303:30303" # P2P
];
cmd = [
"--data-path=/var/lib/besu"
"--rpc-http-enabled=true"
"--rpc-http-host=0.0.0.0"
"--rpc-ws-enabled=true"
"--rpc-ws-host=0.0.0.0"
"--engine-rpc-enabled=true"
"--engine-host-allowlist=*"
"--engine-jwt-secret=/var/lib/besu/jwt.hex"
# Add other besu flags as needed
];
autoStart = true;
};
# Lighthouse beacon client (container)
virtualisation.oci-containers.containers.lighthouse-beacon = {
image = "sigp/lighthouse:latest";
volumes = [
"/persist/ethereum/lighthouse:/data"
"/persist/ethereum/besu/jwt.hex:/jwt.hex:ro"
];
ports = [
"5052:5052" # HTTP API
"9000:9000" # P2P
];
cmd = [
"lighthouse"
"beacon"
"--datadir=/data"
"--http"
"--http-address=0.0.0.0"
"--execution-endpoint=http://besu:8551"
"--execution-jwt=/jwt.hex"
# Add other lighthouse flags
];
dependsOn = [ "besu" ];
autoStart = true;
};
# Rocketpool stack (podman-compose for multi-container setup)
# TODO: This requires converting docker-compose to NixOS config
# For now, can run docker-compose via systemd service
systemd.services.rocketpool = {
description = "Rocketpool Smartnode Stack";
after = [ "podman.service" "lighthouse-beacon.service" ];
wantedBy = [ "multi-user.target" ];
serviceConfig = {
Type = "oneshot";
RemainAfterExit = "yes";
WorkingDirectory = "/persist/ethereum/rocketpool";
ExecStart = "${pkgs.docker-compose}/bin/docker-compose up -d";
ExecStop = "${pkgs.docker-compose}/bin/docker-compose down";
};
};
# Ensure ethereum data persists
environment.persistence."/persist" = {
directories = [
"/persist/ethereum"
];
};
# Firewall for ethereum
networking.firewall = {
allowedTCPPorts = [
30303 # Besu P2P
9000 # Lighthouse P2P
# Add rocketpool ports
];
allowedUDPPorts = [
30303 # Besu P2P
9000 # Lighthouse P2P
];
};
}
```
**6. Setup rocketpool docker-compose on NixOS**
```bash
# After NixOS is running, restore rocketpool config
ssh sunny "mkdir -p /persist/ethereum/rocketpool"
# Copy rocketpool data (wallet, keys, config)
rsync -av /backup/sunny-rocketpool-data/ sunny:/persist/ethereum/rocketpool/
# Create docker-compose.yml for rocketpool stack
# Based on rocketpool hybrid mode docs
# This runs: validator, node software, MEV-boost, prometheus, etc.
# Connects to your besu + lighthouse containers
```
**7. Deploy and test**
```bash
deploy -s '.#sunny'
# Verify containers are running
ssh sunny "podman ps"
# Check besu sync status
ssh sunny "curl -X POST -H 'Content-Type: application/json' --data '{\"jsonrpc\":\"2.0\",\"method\":\"eth_syncing\",\"params\":[],\"id\":1}' http://localhost:8545"
# Check lighthouse sync status
ssh sunny "curl http://localhost:5052/eth/v1/node/syncing"
# Monitor rocketpool
ssh sunny "cd /persist/ethereum/rocketpool && docker-compose logs -f"
```
**8. Monitor and stabilize**
- Ethereum should resume from where it left off (not resync!)
- Validation will resume once the beacon node is synced
- May have missed a few attestations during migration (minor penalty)
---
#### Stage 2: Native NixOS Services (Future)
**Duration: TBD (do this later when time permits)**
**Goal:** Convert to native NixOS services using ethereum-nix
**Why defer this:**
- Complex (rocketpool not fully packaged for Nix)
- Current container setup works fine
- Can migrate incrementally (besu → native, then lighthouse, etc.)
- No downtime once Stage 1 is stable
**When ready:**
1. Research ethereum-nix support for besu + lighthouse + rocketpool
2. Test on separate machine first
3. Migrate one service at a time with minimal downtime
4. Document in separate migration plan
**For now:** Stage 1 gets sunny on NixOS with base configs, managed declaratively, just using containers instead of native services.
### Phase 6: Verification and cleanup
**Duration: 1 hour**
1. **Test failover procedure** (see Failover Procedures below)
2. **Verify backups are working**
```bash
kopia snapshot list
# Check that /persist/services is being backed up
```
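A hedged example of pulling `/persist/services` into the existing kopia schedule (assumes a kopia client on zippy is already configured against the fractal repository):
```bash
ssh zippy 'kopia snapshot create /persist/services'
ssh zippy 'kopia snapshot list /persist/services'
```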
3. **Update documentation**
- Update README.md
- Document new architecture
- Update stateful-commands.txt
4. **Clean up old GlusterFS data**
```bash
# Only after verifying everything works!
for host in c1 c2 c3; do
ssh $host "sudo rm -rf /persist/glusterfs"
done
```
---
## Service Catalog
**Legend:**
- **Priority**: CRITICAL (must be up) / HIGH (important) / MEDIUM (nice to have) / LOW (can wait)
- **Target**: Where it should run (constraint or affinity)
- **Data**: What data it needs access to
- **Changes**: What needs updating in the .hcl file
### Core Infrastructure
#### mysql
- **File**: `services/mysql.hcl`
- **Priority**: CRITICAL
- **Current**: Uses `/data/compute/appdata/mysql`
- **Target**: Affinity for zippy, allow c1/c2
- **Data**: `/data/services/mysql` (NFS from zippy)
- **Changes**:
- ✏️ Volume path: `/data/compute/appdata/mysql` → `/data/services/mysql`
- ✏️ Add affinity:
```hcl
affinity {
attribute = "${node.unique.name}"
value = "zippy"
weight = 100
}
```
- ✏️ Add constraint to allow fallback:
```hcl
constraint {
attribute = "${node.unique.name}"
operator = "regexp"
value = "zippy|c1|c2"
}
```
- **Notes**: Core database, needs to stay up. Consul DNS `mysql.service.consul` unchanged.
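A quick check that the affinity actually put mysql on zippy after redeploy (a sketch; `<alloc-id>` comes from the job status output):
```bash
nomad job status mysql | grep -A 5 '^Allocations'
nomad alloc status -short <alloc-id> | grep -i node   # node name should be zippy
```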
#### postgres
- **File**: `services/postgres.hcl`
- **Priority**: CRITICAL
- **Current**: Uses `/data/compute/appdata/postgres`, `/data/compute/appdata/pgadmin`
- **Target**: Affinity for zippy, allow c1/c2
- **Data**: `/data/services/postgres`, `/data/services/pgadmin` (NFS)
- **Changes**:
- ✏️ Volume paths: `/data/compute/appdata/*` → `/data/services/*`
- ✏️ Add affinity and constraint (same as mysql)
- **Notes**: Core database for authentik, gitea, plausible, netbox, etc.
#### redis
- **File**: `services/redis.hcl`
- **Priority**: CRITICAL
- **Current**: Uses `/data/compute/appdata/redis`
- **Target**: Affinity for zippy, allow c1/c2
- **Data**: `/data/services/redis` (NFS)
- **Changes**:
- ✏️ Volume path: `/data/compute/appdata/redis` → `/data/services/redis`
- ✏️ Add affinity and constraint (same as mysql)
- **Notes**: Used by authentik, wordpress. Should co-locate with databases.
#### traefik
- **File**: `services/traefik.hcl`
- **Priority**: CRITICAL
- **Current**: Uses `/data/compute/config/traefik`
- **Target**: Float on c1/c2/c3 (keepalived handles HA)
- **Data**: `/data/services/config/traefik` (NFS)
- **Changes**:
- ✏️ Volume path: `/data/compute/config/traefik` → `/data/services/config/traefik`
- **Notes**: Reverse proxy, has keepalived for VIP failover. Critical for all web access.
#### authentik
- **File**: `services/authentik.hcl`
- **Priority**: CRITICAL
- **Current**: No persistent volumes (stateless, uses postgres/redis)
- **Target**: Float on c1/c2/c3
- **Data**: None (uses postgres.service.consul, redis.service.consul)
- **Changes**: None needed
- **Notes**: SSO for most services. Must stay up.
### Monitoring Stack
#### prometheus
- **File**: `services/prometheus.hcl`
- **Priority**: HIGH
- **Current**: Uses `/data/compute/appdata/prometheus`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/prometheus` (NFS)
- **Changes**:
- ✏️ Volume path: `/data/compute/appdata/prometheus` → `/data/services/prometheus`
- **Notes**: Metrics database. Important for monitoring but not critical for services.
#### grafana
- **File**: `services/grafana.hcl`
- **Priority**: HIGH
- **Current**: Uses `/data/compute/appdata/grafana`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/grafana` (NFS)
- **Changes**:
- ✏️ Volume path: `/data/compute/appdata/grafana` → `/data/services/grafana`
- **Notes**: Monitoring UI. Depends on prometheus.
#### loki
- **File**: `services/loki.hcl`
- **Priority**: HIGH
- **Current**: Uses `/data/compute/appdata/loki`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/loki` (NFS)
- **Changes**:
- ✏️ Volume path: `/data/compute/appdata/loki` → `/data/services/loki`
- **Notes**: Log aggregation. Important for debugging.
#### vector
- **File**: `services/vector.hcl`
- **Priority**: MEDIUM
- **Current**: No persistent volumes, type=system (runs on all nodes)
- **Target**: System job (runs everywhere)
- **Data**: None (ephemeral logs, ships to loki)
- **Changes**:
- ❓ Check if glusterfs log path is still needed: `/var/log/glusterfs:/var/log/glusterfs:ro`
- ✏️ Remove glusterfs log collection after GlusterFS is removed
- **Notes**: Log shipper. Can tolerate downtime.
### Databases (Specialized)
#### clickhouse
- **File**: `services/clickhouse.hcl`
- **Priority**: HIGH
- **Current**: Uses `/data/compute/appdata/clickhouse`
- **Target**: Affinity for zippy (large dataset), allow c1/c2/c3
- **Data**: `/data/services/clickhouse` (NFS)
- **Changes**:
- ✏️ Volume path: `/data/compute/appdata/clickhouse` → `/data/services/clickhouse`
- ✏️ Add affinity for zippy (optional, but helps with performance)
- **Notes**: Used by plausible. Large time-series data. Important but can be recreated.
#### mongodb
- **File**: `services/unifi.hcl` (embedded in unifi job)
- **Priority**: HIGH
- **Current**: Uses `/data/compute/appdata/unifi/mongodb`
- **Target**: Float on c1/c2/c3 (with unifi)
- **Data**: `/data/services/unifi/mongodb` (NFS)
- **Changes**: See unifi below
- **Notes**: Only used by unifi. Should stay with unifi controller.
### Web Applications
#### wordpress
- **File**: `services/wordpress.hcl`
- **Priority**: HIGH
- **Current**: Uses `/data/sync/wordpress` (syncthing-managed to avoid slow GlusterFS)
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/wordpress` (NFS from zippy)
- **Changes**:
- ✏️ Volume path: `/data/sync/wordpress` → `/data/services/wordpress`
- 📋 **Before cutover**: Copy data from syncthing to zippy: `rsync -av /data/sync/wordpress/ zippy:/persist/services/wordpress/`
- 📋 **After migration**: Remove syncthing configuration for wordpress sync
- **Notes**: Production website. Important but can tolerate brief downtime during migration.
#### ghost
- **File**: `services/ghost.hcl`
- **Priority**: no longer used, should wipe
- **Current**: Uses `/data/compute/appdata/ghost`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/ghost` (NFS)
- **Changes**:
- ✏️ Volume path: `/data/compute/appdata/ghost` → `/data/services/ghost`
- **Notes**: Blog platform (alo.land). Can tolerate downtime.
#### gitea
- **File**: `services/gitea.hcl`
- **Priority**: HIGH
- **Current**: Uses `/data/compute/appdata/gitea/data`, `/data/compute/appdata/gitea/config`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/gitea/*` (NFS)
- **Changes**:
- ✏️ Volume paths: `/data/compute/appdata/gitea/*` → `/data/services/gitea/*`
- **Notes**: Git server. Contains code repositories. Important.
#### wiki (tiddlywiki)
- **File**: `services/wiki.hcl`
- **Priority**: HIGH
- **Current**: Uses `/data/compute/appdata/wiki` via host volume mount
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/wiki` (NFS)
- **Changes**:
- ✏️ Volume mount path in `volume_mount` blocks
- ⚠️ Uses `exec` driver with host volumes - verify NFS mount works with this
- **Notes**: Multiple tiddlywiki instances. Personal wikis. Can tolerate downtime.
#### code-server
- **File**: `services/code-server.hcl`
- **Priority**: LOW
- **Current**: Uses `/data/compute/appdata/code`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/code` (NFS)
- **Changes**:
- ✏️ Volume path: `/data/compute/appdata/code` → `/data/services/code`
- **Notes**: Web IDE. Low priority, for development only.
#### beancount (fava)
- **File**: `services/beancount.hcl`
- **Priority**: MEDIUM
- **Current**: Uses `/data/compute/appdata/beancount`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/beancount` (NFS)
- **Changes**:
- ✏️ Volume path: `/data/compute/appdata/beancount` → `/data/services/beancount`
- **Notes**: Finance tracking. Low priority.
#### adminer
- **File**: `services/adminer.hcl`
- **Priority**: LOW
- **Current**: Stateless
- **Target**: Float on c1/c2/c3
- **Data**: None
- **Changes**: None needed
- **Notes**: Database admin UI. Only needed for maintenance.
#### plausible
- **File**: `services/plausible.hcl`
- **Priority**: HIGH
- **Current**: Stateless (uses postgres and clickhouse)
- **Target**: Float on c1/c2/c3
- **Data**: None (uses postgres.service.consul, clickhouse.service.consul)
- **Changes**: None needed
- **Notes**: Website analytics. Nice to have but not critical.
#### evcc
- **File**: `services/evcc.hcl`
- **Priority**: HIGH
- **Current**: Uses `/data/compute/appdata/evcc/evcc.yaml`, `/data/compute/appdata/evcc/evcc`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/evcc/*` (NFS)
- **Changes**:
- ✏️ Volume paths: `/data/compute/appdata/evcc/*` → `/data/services/evcc/*`
- **Notes**: EV charging controller. Important for daily use.
#### vikunja
- **File**: `services/vikunja.hcl` (assumed to exist based on README)
- **Priority**: no longer used, should delete
- **Current**: Likely uses `/data/compute/appdata/vikunja`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/vikunja` (NFS)
- **Changes**:
- ✏️ Volume paths: Update to `/data/services/vikunja`
- **Notes**: Task management. Low priority.
#### leantime
- **File**: `services/leantime.hcl`
- **Priority**: no longer used, should delete
- **Current**: Likely uses `/data/compute/appdata/leantime`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/leantime` (NFS)
- **Changes**:
- ✏️ Volume paths: Update to `/data/services/leantime`
- **Notes**: Project management. Low priority.
### Network Infrastructure
#### unifi
- **File**: `services/unifi.hcl`
- **Priority**: HIGH
- **Current**: Uses `/data/compute/appdata/unifi/data`, `/data/compute/appdata/unifi/mongodb`
- **Target**: Float on c1/c2/c3/fractal/zippy
- **Data**: `/data/services/unifi/*` (NFS)
- **Changes**:
- ✏️ Volume paths: `/data/compute/appdata/unifi/*` → `/data/services/unifi/*`
- **Notes**: UniFi network controller. Critical for network management. Has keepalived VIP for stable inform address. Floating is fine.
### Media Stack
#### media (radarr, sonarr, bazarr, plex, qbittorrent)
- **File**: `services/media.hcl`
- **Priority**: MEDIUM
- **Current**: Uses `/data/compute/appdata/radarr`, `/data/compute/appdata/sonarr`, etc. and `/data/media`
- **Target**: **MUST run on fractal** (local /data/media access)
- **Data**:
- `/data/services/radarr` (NFS) - config data
- `/data/media` (local disk on fractal; other nodes use the CIFS mount)
- **Changes**:
- ✏️ Volume paths: `/data/compute/appdata/*` → `/data/services/*`
- ✏️ **Add constraint**:
```hcl
constraint {
attribute = "${node.unique.name}"
value = "fractal"
}
```
- **Notes**: Heavy I/O to /data/media. Must run on fractal for performance. Has keepalived VIP.
### Utility Services
#### weewx
- **File**: `services/weewx.hcl`
- **Priority**: HIGH
- **Current**: Likely uses `/data/compute/appdata/weewx`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/weewx` (NFS)
- **Changes**:
- ✏️ Volume paths: Update to `/data/services/weewx`
- **Notes**: Weather station data collection.
#### maps
- **File**: `services/maps.hcl`
- **Priority**: MEDIUM
- **Current**: Likely uses `/data/compute/appdata/maps`
- **Target**: Float on c1/c2/c3 (or fractal if large tile data)
- **Data**: `/data/services/maps` (NFS) or `/data/media/maps` if large
- **Changes**:
- ✏️ Volume paths: Check data size, may want to move to /data/media
- **Notes**: Map tiles. Low priority.
#### netbox
- **File**: `services/netbox.hcl`
- **Priority**: LOW
- **Current**: Likely uses `/data/compute/appdata/netbox`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/netbox` (NFS)
- **Changes**:
- ✏️ Volume paths: Update to `/data/services/netbox`
- **Notes**: IPAM/DCIM. Low priority, for documentation.
#### farmos
- **File**: `services/farmos.hcl`
- **Priority**: LOW
- **Current**: Likely uses `/data/compute/appdata/farmos`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/farmos` (NFS)
- **Changes**:
- ✏️ Volume paths: Update to `/data/services/farmos`
- **Notes**: Farm management. Low priority.
#### urbit
- **File**: `services/urbit.hcl`
- **Priority**: LOW
- **Current**: Likely uses `/data/compute/appdata/urbit`
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/urbit` (NFS)
- **Changes**:
- ✏️ Volume paths: Update to `/data/services/urbit`
- **Notes**: Urbit node. Experimental, low priority.
#### webodm
- **File**: `services/webodm.hcl`
- **Priority**: LOW
- **Current**: Likely uses `/data/compute/appdata/webodm`
- **Target**: Float on c1/c2/c3 (or fractal if processing large imagery from /data/media)
- **Data**: `/data/services/webodm` (NFS)
- **Changes**:
- ✏️ Volume paths: Update to `/data/services/webodm`
- 🤔 May benefit from running on fractal if it processes files from /data/media
- **Notes**: Drone imagery processing. Low priority.
#### velutrack
- **File**: `services/velutrack.hcl`
- **Priority**: LOW
- **Current**: Likely minimal state
- **Target**: Float on c1/c2/c3
- **Data**: Minimal
- **Changes**: Verify if any volume paths need updating
- **Notes**: Vehicle tracking. Low priority.
#### resol-gateway
- **File**: `services/resol-gateway.hcl`
- **Priority**: HIGH
- **Current**: Likely minimal state
- **Target**: Float on c1/c2/c3
- **Data**: Minimal
- **Changes**: Verify if any volume paths need updating
- **Notes**: Solar thermal controller.
#### igsync
- **File**: `services/igsync.hcl`
- **Priority**: MEDIUM
- **Current**: Likely uses `/data/compute/appdata/igsync` or `/data/media`
- **Target**: Float on c1/c2/c3 (or fractal if storing to /data/media)
- **Data**: Check if it writes to `/data/media` or `/data/services`
- **Changes**:
- ✏️ Volume paths: Verify and update
- **Notes**: Instagram sync. Low priority.
#### jupyter
- **File**: `services/jupyter.hcl`
- **Priority**: LOW
- **Current**: Stateless or minimal state
- **Target**: Float on c1/c2/c3
- **Data**: Minimal
- **Changes**: Verify if any volume paths need updating
- **Notes**: Notebook server. Low priority, for experimentation.
#### whoami
- **File**: `services/whoami.hcl`
- **Priority**: LOW
- **Current**: Stateless
- **Target**: Float on c1/c2/c3
- **Data**: None
- **Changes**: None needed
- **Notes**: Test service. Can be stopped during migration.
#### tiddlywiki (if separate from wiki.hcl)
- **File**: `services/tiddlywiki.hcl`
- **Priority**: MEDIUM
- **Current**: Likely same as wiki.hcl
- **Target**: Float on c1/c2/c3
- **Data**: `/data/services/tiddlywiki` (NFS)
- **Changes**: Same as wiki.hcl
- **Notes**: May be duplicate of wiki.hcl.
### Backup Jobs
#### mysql-backup
- **File**: `services/mysql-backup.hcl`
- **Priority**: HIGH
- **Current**: Likely writes to `/data/compute` or `/data/shared`
- **Target**: Float on c1/c2/c3
- **Data**: Should write to `/data/shared` (backed up to fractal)
- **Changes**:
- ✏️ Verify backup destination, should be `/data/shared/backups/mysql`
- **Notes**: Important for disaster recovery. Should run regularly.
#### postgres-backup
- **File**: `services/postgres-backup.hcl`
- **Priority**: HIGH
- **Current**: Likely writes to `/data/compute` or `/data/shared`
- **Target**: Float on c1/c2/c3
- **Data**: Should write to `/data/shared` (backed up to fractal)
- **Changes**:
- ✏️ Verify backup destination, should be `/data/shared/backups/postgres`
- **Notes**: Important for disaster recovery. Should run regularly.
#### wordpress-backup
- **File**: `services/wordpress-backup.hcl`
- **Priority**: MEDIUM
- **Current**: Likely writes to `/data/compute` or `/data/shared`
- **Target**: Float on c1/c2/c3
- **Data**: Should write to `/data/shared` (backed up to fractal)
- **Changes**:
- ✏️ Verify backup destination
- **Notes**: Periodic backup job.
---
## Failover Procedures
### NFS Server Failover (zippy → c1 or c2)
**When to use:** zippy is down and not coming back soon
**Prerequisites:**
- c1 and c2 have been receiving btrfs snapshots from zippy
- Last successful replication is recent (verify timestamps; should be within the 5-minute interval)
**Procedure:**
1. **Choose standby node** (c1 or c2)
```bash
# Check replication freshness
ssh c1 "ls -lt /persist/services-standby@* | head -5"
ssh c2 "ls -lt /persist/services-standby@* | head -5"
# Choose the one with most recent snapshot
# For this example, we'll use c1
```
2. **On standby node (c1), promote standby to primary**
```bash
ssh c1
# Stop NFS client mount (if running)
sudo systemctl stop data-services.mount
# Find latest snapshot
LATEST=$(ls -dt /persist/services-standby/services@* | head -1)
# Create writable subvolume from snapshot
sudo btrfs subvolume snapshot $LATEST /persist/services
# Verify
ls -la /persist/services
```
3. **Deploy c1-nfs-server configuration**
```bash
# From your workstation
deploy -s '.#c1-nfs-server'
# This activates:
# - NFS server on c1
# - Consul service registration for "services"
# - Firewall rules
```
4. **On c1, verify NFS is running**
```bash
ssh c1
sudo systemctl status nfs-server
showmount -e localhost
dig @localhost -p 8600 services.service.consul # Should show c1's IP
```
5. **On other nodes, remount NFS**
```bash
# Nodes should auto-remount via Consul DNS, but you can force it:
for host in c2 c3 fractal zippy; do
ssh $host "sudo systemctl restart data-services.mount"
done
```
6. **Verify Nomad jobs are healthy**
```bash
nomad job status mysql
nomad job status postgres
# Check all critical services
```
7. **Update monitoring/alerts**
- Note in documentation that c1 is now primary NFS server
- Set up alert to remember to fail back to zippy when it's repaired
**Recovery Time Objective (RTO):** ~10-15 minutes
**Recovery Point Objective (RPO):** Last snapshot interval (**5 minutes** max)
### Failing Back to zippy
**When to use:** zippy is repaired and ready to resume primary role
**Procedure:**
1. **Sync data from c1 back to zippy**
```bash
# On c1 (current primary)
sudo btrfs subvolume snapshot -r /persist/services /persist/services@failback-$(date +%Y%m%d-%H%M%S)
FAILBACK=$(ls -t /persist/services@failback-* | head -1)
sudo btrfs send $FAILBACK | ssh zippy "sudo btrfs receive /persist/"
# On zippy, make it writable (if a stale /persist/services still exists, move it aside first)
ssh zippy "sudo btrfs subvolume snapshot /persist/$(basename $FAILBACK) /persist/services"
```
2. **Deploy zippy back to NFS server role**
```bash
deploy -s '.#zippy'
# Consul will register services.service.consul → zippy again
```
3. **Demote c1 back to standby**
```bash
deploy -s '.#c1'
# This removes NFS server, restores NFS client mount
```
4. **Verify all nodes are mounting from zippy**
```bash
dig @c1 -p 8600 services.service.consul # Should show zippy's IP
for host in c1 c2 c3 fractal; do
ssh $host "df -h | grep services"
done
```
### Database Job Failover (automatic via Nomad)
**When to use:** zippy is down, database jobs need to run elsewhere
**What happens automatically:**
1. Nomad detects zippy is unhealthy
2. Jobs with constraint `zippy|c1|c2` are rescheduled to c1 or c2
3. Jobs start on new node, accessing `/data/services` (now via NFS from promoted standby)
**Manual intervention needed:**
- None if NFS failover completed successfully
- If jobs are stuck: `nomad job stop mysql && nomad job run services/mysql.hcl`
**What to check:**
```bash
nomad job status mysql
nomad job status postgres
nomad job status redis
# Verify they're running on c1 or c2, not zippy
nomad alloc status <alloc-id>
```
### Complete Cluster Failure (lose quorum)
**Scenario:** 3 or more servers go down, quorum lost
**Prevention:** This is why we have 5 servers (need 3 for quorum)
**Recovery:**
1. **Bring up at least 3 servers** (any 3 from c1, c2, c3, fractal, zippy)
2. **If that's not possible, bootstrap new cluster:**
```bash
# On one surviving server, force bootstrap
consul force-leave <failed-node>
nomad operator raft list-peers
nomad operator raft remove-peer <failed-peer>
```
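If too many peers are permanently gone for that to work, the documented last resort is manual recovery with a `peers.json` file. A sketch, assuming the default data dirs (`/var/lib/consul` for Consul, `<data-dir>/server/raft/` for Nomad); every value is a placeholder to be read from the surviving nodes:
```bash
# Stop consul on every surviving server, then on each one write raft/peers.json
sudo systemctl stop consul
sudo tee /var/lib/consul/raft/peers.json <<'EOF'
[
  { "id": "<node-id from /var/lib/consul/node-id>", "address": "<server-ip>:8300", "non_voter": false }
]
EOF
sudo systemctl start consul
# Nomad uses the same mechanism: <data-dir>/server/raft/peers.json, port 4647
```
List one entry per surviving server, start them all, then re-check `consul operator raft list-peers`.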
3. **Restore from backups** (worst case)
---
## Post-Migration Verification Checklist
- [ ] All 5 servers in quorum: `consul members` shows c1, c2, c3, fractal, zippy
- [ ] NFS mounts working: `df -h | grep services` on all nodes
- [ ] Btrfs replication running: Check systemd timers on zippy
- [ ] Critical services up: mysql, postgres, redis, traefik, authentik
- [ ] Monitoring working: Prometheus, Grafana, Loki accessible
- [ ] Media stack on fractal: `nomad alloc status` shows media job on fractal
- [ ] Database jobs on zippy: `nomad alloc status` shows mysql/postgres on zippy
- [ ] Consul DNS working: `dig @localhost -p 8600 services.service.consul`
- [ ] Backups running: Kopia snapshots include `/persist/services`
- [ ] GlusterFS removed: No glusterfs processes, volumes deleted
- [ ] Documentation updated: README.md, architecture diagrams
---
## Rollback Plan
**If migration fails catastrophically:**
1. **Stop all new Nomad jobs**
```bash
nomad job stop -purge <new-jobs>
```
2. **Restore GlusterFS mounts**
```bash
# On all nodes, re-enable GlusterFS client
deploy # With old configs
```
3. **Restart old Nomad jobs**
```bash
# With old paths pointing to /data/compute
nomad run services/*.hcl # Old versions from git
```
4. **Restore data if needed**
```bash
rsync -av /backup/compute-pre-migration/ /data/compute/
```
**Important:** Keep GlusterFS running until Phase 4 is complete and verified!
---
## Questions Answered
1. ✅ **Where is `/data/sync/wordpress` mounted from?**
- **Answer**: Syncthing-managed to avoid slow GlusterFS
- **Action**: Migrate to `/data/services/wordpress`, remove syncthing config
2. ✅ **Which services use `/data/media` directly?**
- **Answer**: Only media.hcl (radarr, sonarr, plex, qbittorrent)
- **Action**: Constrain media.hcl to fractal, everything else uses CIFS mount
3. ✅ **Do we want unifi on fractal or floating?**
- **Answer**: Floating is fine
- **Action**: No constraint needed
4. ✅ **What's the plan for sunny's existing data?**
- **Answer**: Ethereum data stays local, not replicated (too expensive)
- **Action**: Back up and restore the data during the NixOS conversion (resyncing from the network would take weeks)
## Questions Still to Answer
1. **Backup retention for btrfs snapshots?**
- Current plan: Keep 24 hours of snapshots on zippy
- Is this enough? Or do we want more for safety?
- This should be fine -- snapshots are just for hot recovery. More/older backups are kept via kopia on fractal.
2. **c1-nfs-server vs c1 config - same host, different configs?**
- Recommendation: Use same hostname, different flake output
- `c1` = normal config with NFS client
- `c1-nfs-server` = variant with NFS server enabled
- Both in flake.nix, deploy appropriate one based on role
- Answer: recommendation makes sense.
3. **Should we verify webodm, igsync, maps don't need /data/media access?**
- neither of them needs /data/media
- maps needs /data/shared
---
## Timeline Estimate
**Total duration: 15-22 hours** excluding the optional sunny migration (can be split across multiple sessions)
- Phase 0 (Prep): 1-2 hours
- Phase 1 (fractal): 6-8 hours
- Phase 2 (zippy storage): 2-3 hours
- Phase 3 (GlusterFS → NFS): 3-4 hours
- Phase 4 (Nomad jobs): 2-4 hours
- Phase 5 (sunny, Stage 1): 6-8 hours (optional, can be done later)
- Phase 6 (Cleanup): 1 hour
**Suggested schedule:**
- **Day 1**: Phases 0-1 (fractal conversion, establish quorum)
- **Day 2**: Phases 2-3 (zippy storage, data migration)
- **Day 3**: Phase 4 (Nomad job updates and deployment)
- **Day 4**: Phases 5-6 (sunny + cleanup) or take a break and do later
**Maintenance windows needed:**
- Phase 3: ~1 hour downtime (all services stopped during data migration)
- Phase 4: Rolling (services come back up as redeployed)