# Cluster Architecture Revamp
|
|
|
|
**Status**: Planning complete, ready for review and refinement
|
|
|
|
## Key Decisions
|
|
|
|
✅ **Replication**: 5-minute intervals (incremental btrfs send)
|
|
✅ **WordPress**: Currently syncthing → will use `/data/services` via NFS
|
|
✅ **Media**: Only media.hcl needs `/data/media`, constrained to fractal
|
|
✅ **Unifi**: Floating (no constraint needed)
|
|
✅ **Sunny**: Standalone, ethereum data stays local (not replicated)
|
|
✅ **Quorum**: 5 servers (c1, c2, c3, fractal, zippy)
|
|
✅ **NFS Failover**: Via Consul DNS (`services.service.consul`)
|
|
|
|
## Table of Contents
|
|
1. [End State Architecture](#end-state-architecture)
|
|
2. [Migration Steps](#migration-steps)
|
|
3. [Service Catalog](#service-catalog)
|
|
4. [Failover Procedures](#failover-procedures)
|
|
|
|
---
|
|
|
|
## End State Architecture
|
|
|
|
### Cluster Topology
|
|
|
|
**5-Server Quorum (Consul + Nomad server+client):**
|
|
- **c1, c2, c3**: Cattle nodes - x86_64, run most stateless workloads
|
|
- **fractal**: Storage node - x86_64, 6x spinning drives, runs media workloads
|
|
- **zippy**: Stateful anchor - x86_64, runs database workloads (via affinity), primary NFS server
|
|
|
|
**Standalone Nodes (not in quorum):**
|
|
- **sunny**: x86_64, ethereum node + staking, base NixOS configs only
|
|
- **chilly**: x86_64, Home Assistant VM, base NixOS configs only
|
|
|
|
**Quorum Math:**
|
|
- 5 servers → quorum requires 3 healthy nodes
|
|
- Can tolerate 2 simultaneous failures
|
|
- Bootstrap expect: 3
|
|
|
|
### Storage Architecture
|
|
|
|
**Primary Storage (zippy):**
|
|
- `/persist/services` - btrfs subvolume
|
|
- Contains: mysql, postgres, redis, clickhouse, mongodb, app data
|
|
- Exported via NFS to: `services.service.consul:/persist/services`
|
|
- Replicated via **btrfs send** to c1 and c2 every **5 minutes** (incremental)
|
|
|
|
**Standby Storage (c1, c2):**
|
|
- `/persist/services-standby` - btrfs subvolume
|
|
- Receives replicated snapshots from zippy via incremental btrfs send
|
|
- Can be promoted to `/persist/services` and exported as NFS during failover
|
|
- Maximum data loss: **5 minutes** (last replication interval)
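A quick way to check how fresh the standby copies are; the snapshot path follows the replication layout set up in Phase 2, so treat the exact names as assumptions until that is in place:

```bash
# Show the newest received snapshot on each standby (names per the Phase 2 setup)
for host in c1 c2; do
  echo "-- $host"
  ssh "$host" 'ls -dt /persist/services-standby/services@* 2>/dev/null | head -1'
done
```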
|
|
|
|
**Standalone Storage (sunny):**
|
|
- `/persist/ethereum` - local btrfs subvolume (or similar)
|
|
- Contains: ethereum blockchain data, staking keys
|
|
- **NOT replicated** - too large/expensive to replicate full ethereum node
|
|
- Backed up via kopia to fractal (if feasible/needed)
|
|
|
|
**Media Storage (fractal):**
|
|
- `/data/media` - existing spinning drive storage
|
|
- Exported via Samba (existing)
|
|
- Mounted on c1, c2, c3 via CIFS (existing)
|
|
- Local access on fractal for media workloads
|
|
|
|
**Shared Storage (fractal):**
|
|
- `/data/shared` - existing spinning drive storage
|
|
- Exported via Samba (existing)
|
|
- Mounted on c1, c2, c3 via CIFS (existing)
|
|
|
|
### Network Services
|
|
|
|
**NFS Primary (zippy):**
|
|
```nix
|
|
services.nfs.server = {
|
|
enable = true;
|
|
exports = ''
|
|
/persist/services 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
|
|
'';
|
|
};
|
|
|
|
services.consul.extraConfig.services = [{
|
|
name = "services";
|
|
port = 2049;
|
|
checks = [{ tcp = "localhost:2049"; interval = "30s"; }];
|
|
}];
|
|
```
|
|
|
|
**NFS Client (all nodes):**
|
|
```nix
|
|
fileSystems."/data/services" = {
|
|
device = "services.service.consul:/persist/services";
|
|
fsType = "nfs";
|
|
options = [ "x-systemd.automount" "noauto" "x-systemd.idle-timeout=60" ];
|
|
};
|
|
```
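Mounting by `services.service.consul` assumes every node can already resolve `*.consul` names through the local Consul agent. If that forwarding is not in place yet, a minimal sketch using dnsmasq might look like the following (option names per the current NixOS dnsmasq module; the whole block is an assumption about the existing DNS setup):

```nix
# Sketch: forward *.consul queries to the local Consul agent's DNS port (8600)
services.dnsmasq = {
  enable = true;
  settings.server = [ "/consul/127.0.0.1#8600" ];
};
```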
|
|
|
|
**Samba Exports (fractal - existing):**
|
|
- `//fractal/media` → `/data/media`
|
|
- `//fractal/shared` → `/data/shared`
|
|
|
|
### Nomad Job Placement Strategy
|
|
|
|
**Affinity-based (prefer zippy, allow c1/c2):**
|
|
- mysql, postgres, redis - stateful databases
|
|
- Run on zippy normally, can failover to c1/c2 if zippy down
|
|
|
|
**Constrained (must run on fractal):**
|
|
- **media.hcl** - radarr, sonarr, bazarr, plex, qbittorrent
|
|
- Reason: Heavy /data/media access, benefits from local storage
|
|
- **prometheus.hcl** - metrics database with 30d retention
|
|
- Reason: Large time-series data, spinning disks OK, saves SSD space
|
|
- **loki.hcl** - log aggregation with 31d retention
|
|
- Reason: Large log data, spinning disks OK
|
|
- **clickhouse.hcl** - analytics database for plausible
|
|
- Reason: Large time-series data, spinning disks OK
|
|
|
|
**Floating (can run anywhere on c1/c2/c3/fractal/zippy):**
|
|
- All other services including:
|
|
- traefik, authentik, web apps
|
|
- **grafana** (small data, just dashboards/config, queries prometheus for metrics)
|
|
- databases (mysql, postgres, redis)
|
|
- vector (system job, runs everywhere)
|
|
- Nomad schedules based on resources and constraints
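In job-spec terms, the difference between these three groups is simply which of the following stanzas appears; this mirrors the per-job examples in the Service Catalog below:

```hcl
# Stateful jobs: prefer zippy but allow failover to the standbys
affinity {
  attribute = "${node.unique.name}"
  value     = "zippy"
  weight    = 100
}

# Media jobs: hard-pin to fractal for local /data/media access
constraint {
  attribute = "${node.unique.name}"
  value     = "fractal"
}

# Floating jobs: no placement stanza at all; Nomad picks a node by resources
```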
|
|
|
|
### Data Migration
|
|
|
|
**Path changes needed in Nomad jobs:**
|
|
- `/data/compute/appdata/*` → `/data/services/*`
|
|
- `/data/compute/config/*` → `/data/services/*`
|
|
- `/data/sync/wordpress` → `/data/services/wordpress`
|
|
|
|
**No changes needed:**
|
|
- `/data/media/*` - stays the same (CIFS mount from fractal, used only by media services)
|
|
- `/data/shared/*` - stays the same (CIFS mount from fractal)
|
|
|
|
**Deprecated after migration:**
|
|
- `/data/sync/wordpress` - currently managed by syncthing to avoid slow GlusterFS
|
|
- Will be replaced by NFS mount at `/data/services/wordpress`
|
|
- Syncthing configuration for this can be removed
|
|
- Final sync: copy from syncthing to `/persist/services/wordpress` on zippy before cutover
|
|
|
|
---
|
|
|
|
## Migration Steps
|
|
|
|
**Important path simplification note:**
|
|
- All service paths use `/data/services/*` directly (no intermediate `appdata/` or `config/` level)
|
|
- Example: `/data/compute/appdata/mysql` → `/data/services/mysql`
|
|
- Simpler, cleaner, easier to manage
|
|
|
|
### Phase 0: Preparation
|
|
**Duration: 1-2 hours**
|
|
|
|
1. **Backup everything**
|
|
```bash
|
|
# On all nodes, ensure kopia backups are current
|
|
kopia snapshot list
|
|
|
|
# Backup glusterfs data manually
|
|
rsync -av /data/compute/ /backup/compute-pre-migration/
|
|
```
|
|
|
|
2. **Document current state**
|
|
```bash
|
|
# Save current nomad job list
|
|
nomad job status -json > /backup/nomad-jobs-pre-migration.json
|
|
|
|
# Save consul service catalog
|
|
consul catalog services > /backup/consul-services-pre-migration.txt
|
|
```
|
|
|
|
3. **Review this document**
|
|
- Verify all services are cataloged
|
|
- Confirm priority assignments
|
|
- Adjust as needed
|
|
|
|
### Phase 1: Convert fractal to NixOS
|
|
**Duration: 6-8 hours**
|
|
|
|
**Current state:**
|
|
- Proxmox on ZFS
|
|
- System pool: `rpool` (~500GB, will be wiped)
|
|
- Data pools (preserved):
|
|
- `double1` - 3.6T (homes, shared)
|
|
- `double2` - 7.2T (backup - kopia repo, PBS)
|
|
- `double3` - 17T (media, torrent)
|
|
- Services: Samba (homes, shared, media), Kopia server, PBS
|
|
- Bind mounts: `/data/{homes,shared,media,torrent}` → ZFS datasets
|
|
|
|
**Goal:** Fresh NixOS on rpool, preserve data pools, join cluster
|
|
|
|
#### Step-by-step procedure:
|
|
|
|
**1. Pre-migration documentation**
|
|
```bash
|
|
# Create this inventory script on fractal (e.g., in an ssh session), then run it remotely below
|
|
cat > /tmp/detect-zfs.sh << 'EOF'
|
|
#!/bin/bash
|
|
echo "=== ZFS Pools ==="
|
|
zpool status
|
|
|
|
echo -e "\n=== ZFS Datasets ==="
|
|
zfs list -o name,mountpoint,used,avail,mounted -r double1 double2 double3
|
|
|
|
echo -e "\n=== Bind mounts ==="
|
|
cat /etc/fstab | grep double
|
|
|
|
echo -e "\n=== Data directories ==="
|
|
ls -la /data/
|
|
|
|
echo -e "\n=== Samba users/groups ==="
|
|
getent group shared compute
|
|
getent passwd compute
|
|
EOF
|
|
chmod +x /tmp/detect-zfs.sh
|
|
ssh fractal /tmp/detect-zfs.sh > /backup/fractal-zfs-layout.txt
|
|
|
|
# Save samba config
|
|
scp fractal:/etc/samba/smb.conf /backup/fractal-smb.conf
|
|
|
|
# Save kopia certs and config
|
|
scp -r fractal:~/kopia-certs /backup/fractal-kopia-certs/
|
|
scp fractal:~/.config/kopia/repository.config /backup/fractal-kopia-repository.config
|
|
|
|
# Verify kopia backups are current
|
|
ssh fractal "kopia snapshot list --all"
|
|
```
|
|
|
|
**2. Stop services on fractal**
|
|
```bash
|
|
ssh fractal "systemctl stop smbd nmbd kopia"
|
|
# Don't stop PBS yet (in case we need to restore)
|
|
```
|
|
|
|
**3. Install NixOS**
|
|
- Boot NixOS installer USB
|
|
- **IMPORTANT**: Do NOT touch double1, double2, double3 during install!
|
|
- Install only on `rpool` (or create new pool if needed)
|
|
|
|
```bash
|
|
# In NixOS installer
|
|
# Option A: Reuse rpool (wipe and recreate)
|
|
zpool destroy rpool
|
|
|
|
# Option B: Use different disk if available
|
|
# Then follow standard NixOS btrfs install on that disk
|
|
```
|
|
|
|
- Use standard encrypted btrfs layout (matching other hosts)
|
|
- Minimal install first, will add cluster configs later
|
|
|
|
**4. First boot - import ZFS pools**
|
|
```bash
|
|
# SSH into fresh NixOS install
|
|
|
|
# Import pools (read-only first, to be safe)
|
|
zpool import -f -o readonly=on double1
|
|
zpool import -f -o readonly=on double2
|
|
zpool import -f -o readonly=on double3
|
|
|
|
# Verify datasets
|
|
zfs list -r double1 double2 double3
|
|
|
|
# Example output should show:
|
|
# double1/homes
|
|
# double1/shared
|
|
# double2/backup
|
|
# double3/media
|
|
# double3/torrent
|
|
|
|
# If everything looks good, export and reimport read-write
|
|
zpool export double1 double2 double3
|
|
zpool import double1
|
|
zpool import double2
|
|
zpool import double3
|
|
|
|
# Set ZFS mountpoints (if needed)
|
|
# These may already be set from Proxmox
|
|
zfs set mountpoint=/double1 double1
|
|
zfs set mountpoint=/double2 double2
|
|
zfs set mountpoint=/double3 double3
|
|
```
|
|
|
|
**5. Create fractal NixOS configuration**
|
|
```nix
|
|
# hosts/fractal/default.nix
|
|
{ config, pkgs, ... }:
|
|
{
|
|
imports = [
|
|
../../common/encrypted-btrfs-layout.nix
|
|
../../common/global
|
|
../../common/cluster-node.nix # Consul + Nomad (will add in step 7)
|
|
../../common/nomad.nix # Both server and client
|
|
./hardware.nix
|
|
];
|
|
|
|
networking.hostName = "fractal";
|
|
|
|
# ZFS support
|
|
boot.supportedFilesystems = [ "zfs" ];
|
|
boot.zfs.extraPools = [ "double1" "double2" "double3" ];
|
|
|
|
# Pools listed in boot.zfs.extraPools are imported via generated
# zfs-import-<pool>.service units, ordered before zfs-mount.service,
# so no extra import unit is needed here.
|
|
|
|
# Bind mounts for /data (matching Proxmox setup)
|
|
fileSystems."/data/homes" = {
|
|
device = "/double1/homes";
|
|
fsType = "none";
|
|
options = [ "bind" "x-systemd.requires=zfs-mount.service" ];
|
|
};
|
|
|
|
fileSystems."/data/shared" = {
|
|
device = "/double1/shared";
|
|
fsType = "none";
|
|
options = [ "bind" "x-systemd.requires=zfs-mount.service" ];
|
|
};
|
|
|
|
fileSystems."/data/media" = {
|
|
device = "/double3/media";
|
|
fsType = "none";
|
|
options = [ "bind" "x-systemd.requires=zfs-mount.service" ];
|
|
};
|
|
|
|
fileSystems."/data/torrent" = {
|
|
device = "/double3/torrent";
|
|
fsType = "none";
|
|
options = [ "bind" "x-systemd.requires=zfs-mount.service" ];
|
|
};
|
|
|
|
fileSystems."/backup" = {
|
|
device = "/double2/backup";
|
|
fsType = "none";
|
|
options = [ "bind" "x-systemd.requires=zfs-mount.service" ];
|
|
};
|
|
|
|
# Create data directory structure
|
|
systemd.tmpfiles.rules = [
|
|
"d /data 0755 root root -"
|
|
];
|
|
|
|
# Users and groups for samba
|
|
users.groups.shared = { gid = 1001; };
|
|
users.groups.compute = { gid = 1002; };
|
|
users.users.compute = {
|
|
isSystemUser = true;
|
|
uid = 1002;
|
|
group = "compute";
|
|
};
|
|
|
|
# Ensure ppetru is in shared group
|
|
users.users.ppetru.extraGroups = [ "shared" ];
|
|
|
|
# Samba server
|
|
services.samba = {
|
|
enable = true;
|
|
openFirewall = true;
|
|
|
|
extraConfig = ''
|
|
workgroup = WORKGROUP
|
|
server string = fractal
|
|
netbios name = fractal
|
|
security = user
|
|
map to guest = bad user
|
|
'';
|
|
|
|
shares = {
|
|
homes = {
|
|
comment = "Home Directories";
|
|
browseable = "no";
|
|
path = "/data/homes/%S";
|
|
"read only" = "no";
|
|
};
|
|
|
|
shared = {
|
|
path = "/data/shared";
|
|
"read only" = "no";
|
|
browseable = "yes";
|
|
"guest ok" = "no";
|
|
"create mask" = "0775";
|
|
"directory mask" = "0775";
|
|
"force group" = "+shared";
|
|
};
|
|
|
|
media = {
|
|
path = "/data/media";
|
|
"read only" = "no";
|
|
browseable = "yes";
|
|
"guest ok" = "no";
|
|
"create mask" = "0755";
|
|
"directory mask" = "0755";
|
|
};
|
|
};
|
|
};
|
|
|
|
# Kopia backup server
|
|
systemd.services.kopia-server = {
|
|
description = "Kopia Backup Server";
|
|
wantedBy = [ "multi-user.target" ];
|
|
after = [ "network.target" "zfs-mount.service" ];
|
|
|
|
serviceConfig = {
|
|
User = "ppetru";
|
|
Group = "users";
|
|
ExecStart = ''
|
|
${pkgs.kopia}/bin/kopia server start \
|
|
--address 0.0.0.0:51515 \
|
|
--tls-cert-file /home/ppetru/kopia-certs/kopia.cert \
|
|
--tls-key-file /home/ppetru/kopia-certs/kopia.key
|
|
'';
|
|
Restart = "on-failure";
|
|
};
|
|
};
|
|
|
|
# Kopia nightly snapshot (from cron)
|
|
systemd.services.kopia-snapshot = {
|
|
description = "Kopia snapshot of homes and shared";
|
|
serviceConfig = {
|
|
Type = "oneshot";
|
|
User = "ppetru";
|
|
Group = "users";
|
|
ExecStart = ''
|
|
${pkgs.kopia}/bin/kopia --config-file=/home/ppetru/.config/kopia/repository.config \
|
|
snapshot create /data/homes /data/shared \
|
|
--log-level=warning --no-progress
|
|
'';
|
|
};
|
|
};
|
|
|
|
systemd.timers.kopia-snapshot = {
|
|
wantedBy = [ "timers.target" ];
|
|
timerConfig = {
|
|
OnCalendar = "22:47";
|
|
Persistent = true;
|
|
};
|
|
};
|
|
|
|
# Keep kopia config and certs persistent
|
|
environment.persistence."/persist" = {
|
|
directories = [
|
|
"/home/ppetru/.config/kopia"
|
|
"/home/ppetru/kopia-certs"
|
|
];
|
|
};
|
|
|
|
networking.firewall.allowedTCPPorts = [
|
|
139 445 # Samba
|
|
51515 # Kopia
|
|
];
|
|
networking.firewall.allowedUDPPorts = [
|
|
137 138 # Samba
|
|
];
|
|
}
|
|
```
|
|
|
|
**6. Deploy initial config (without cluster)**
|
|
```bash
|
|
# First, deploy without cluster-node.nix to verify storage works
|
|
# Comment out cluster-node import temporarily
|
|
|
|
deploy -s '.#fractal'
|
|
|
|
# Verify mounts
|
|
ssh fractal "df -h | grep data"
|
|
ssh fractal "ls -la /data/"
|
|
|
|
# Test samba
|
|
smbclient -L fractal -U ppetru
|
|
|
|
# Test kopia
|
|
ssh fractal "systemctl status kopia-server"
|
|
```
|
|
|
|
**7. Join cluster (add to quorum)**
|
|
```bash
|
|
# Uncomment cluster-node.nix import in fractal config
|
|
# Update all cluster configs for 5-server quorum
|
|
# (See step 3 in existing Phase 1 docs)
|
|
|
|
deploy # Deploy to all nodes
|
|
|
|
# Verify quorum
|
|
consul members
|
|
nomad server members
|
|
```
|
|
|
|
**8. Update cluster configs for 5-server quorum**
|
|
```nix
|
|
# common/consul.nix
|
|
servers = ["c1" "c2" "c3" "fractal" "zippy"];
|
|
bootstrap_expect = 3;
|
|
|
|
# common/nomad.nix
|
|
servers = ["c1" "c2" "c3" "fractal" "zippy"];
|
|
bootstrap_expect = 3;
|
|
```
|
|
|
|
**9. Verify fractal is fully operational**
|
|
```bash
|
|
# Check all services
|
|
ssh fractal "systemctl status samba kopia-server kopia-snapshot.timer"
|
|
|
|
# Verify ZFS pools
|
|
ssh fractal "zpool status"
|
|
ssh fractal "zfs list"
|
|
|
|
# Test accessing shares from another node
|
|
ssh c1 "ls /data/media /data/shared"
|
|
|
|
# Verify kopia clients can still connect
|
|
kopia repository status --server=https://fractal:51515
|
|
|
|
# Check nomad can see fractal
|
|
nomad node status | grep fractal
|
|
|
|
# Verify quorum
|
|
consul members # Should see c1, c2, c3, fractal
|
|
nomad server members # Should see 4 servers
|
|
```
|
|
|
|
### Phase 2: Setup zippy storage layer
|
|
**Duration: 2-3 hours**
|
|
|
|
**Goal:** Prepare zippy for NFS server role, setup replication
|
|
|
|
1. **Create btrfs subvolume on zippy**
|
|
```bash
|
|
ssh zippy
|
|
sudo btrfs subvolume create /persist/services
|
|
sudo chown ppetru:users /persist/services
|
|
```
|
|
|
|
2. **Update zippy configuration**
|
|
```nix
|
|
# hosts/zippy/default.nix
|
|
imports = [
|
|
../../common/encrypted-btrfs-layout.nix
|
|
../../common/global
|
|
../../common/cluster-node.nix # Adds to quorum
|
|
../../common/nomad.nix
|
|
./hardware.nix
|
|
];
|
|
|
|
# NFS server
|
|
services.nfs.server = {
|
|
enable = true;
|
|
exports = ''
|
|
/persist/services 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
|
|
'';
|
|
};
|
|
|
|
# Consul service registration for NFS
|
|
services.consul.extraConfig.services = [{
|
|
name = "services";
|
|
port = 2049;
|
|
checks = [{ tcp = "localhost:2049"; interval = "30s"; }];
|
|
}];
|
|
|
|
# Btrfs replication to standbys (incremental after first full send)
|
|
systemd.services.replicate-to-c1 = {
|
|
description = "Replicate /persist/services to c1";
|
|
script = ''
  set -euo pipefail

  # Take a new read-only snapshot named after the current time
  SNAP=/persist/services@$(date +%Y%m%d-%H%M%S)
  ${pkgs.btrfs-progs}/bin/btrfs subvolume snapshot -r /persist/services $SNAP

  # Timestamped names sort chronologically, so sort by name
  # (ls -d lists the snapshot directories themselves, not their contents)
  LATEST=$(ls -d /persist/services@* | sort | tail -1)
  PREV=$(ls -d /persist/services@* | sort | tail -2 | head -1)

  if [ "$LATEST" != "$PREV" ]; then
    # Incremental send, using the previous snapshot as parent
    ${pkgs.btrfs-progs}/bin/btrfs send -p $PREV $LATEST | ${pkgs.openssh}/bin/ssh c1 "${pkgs.btrfs-progs}/bin/btrfs receive /persist/services-standby/"
  else
    # Only one snapshot exists: first run, full send
    ${pkgs.btrfs-progs}/bin/btrfs send $LATEST | ${pkgs.openssh}/bin/ssh c1 "${pkgs.btrfs-progs}/bin/btrfs receive /persist/services-standby/"
  fi

  # Keep only the newest 288 snapshots on the sender (~24 hours at 5-minute intervals)
  ls -d /persist/services@* | sort | head -n -288 | xargs -r -n1 ${pkgs.btrfs-progs}/bin/btrfs subvolume delete
'';
|
|
};
|
|
|
|
systemd.timers.replicate-to-c1 = {
|
|
wantedBy = [ "timers.target" ];
|
|
timerConfig = {
|
|
OnCalendar = "*:0/5"; # Every 5 minutes (incremental after first full send)
|
|
Persistent = true;
|
|
};
|
|
};
|
|
|
|
# Same for c2
|
|
systemd.services.replicate-to-c2 = { ... };
|
|
systemd.timers.replicate-to-c2 = { ... };
|
|
```
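To keep the c1 and c2 units identical, the per-target service and timer could be generated from one definition; the sketch below is an assumption about how to factor it (`replicationScript` stands for the snapshot + incremental-send script above, parameterized on the target host):

```nix
# Sketch: generate replicate-to-<host> service/timer pairs from one definition
let
  standbys = [ "c1" "c2" ];
in {
  systemd.services = lib.listToAttrs (map (target: {
    name = "replicate-to-${target}";
    value = {
      description = "Replicate /persist/services to ${target}";
      script = replicationScript target;  # hypothetical helper holding the script above
    };
  }) standbys);

  systemd.timers = lib.listToAttrs (map (target: {
    name = "replicate-to-${target}";
    value = {
      wantedBy = [ "timers.target" ];
      timerConfig = { OnCalendar = "*:0/5"; Persistent = true; };
    };
  }) standbys);
}
```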
|
|
|
|
3. **Setup standby storage on c1 and c2**
|
|
```bash
|
|
# On c1 and c2
|
|
ssh c1 sudo btrfs subvolume create /persist/services-standby
|
|
ssh c2 sudo btrfs subvolume create /persist/services-standby
|
|
```
|
|
|
|
4. **Deploy and verify**
|
|
```bash
|
|
deploy -s '.#zippy'
|
|
|
|
# Verify NFS export
|
|
showmount -e zippy
|
|
|
|
# Verify Consul registration
|
|
dig @localhost -p 8600 services.service.consul
|
|
```
|
|
|
|
5. **Verify quorum is now 5 servers**
|
|
```bash
|
|
consul members # Should show c1, c2, c3, fractal, zippy
|
|
nomad server members
|
|
```
|
|
|
|
### Phase 3: Migrate from GlusterFS to NFS
|
|
**Duration: 3-4 hours**
|
|
|
|
**Goal:** Move all data, update mounts, remove GlusterFS
|
|
|
|
1. **Copy data from GlusterFS to zippy**
|
|
```bash
|
|
# On any node with /data/compute mounted
|
|
rsync -av --progress /data/compute/ zippy:/persist/services/
|
|
|
|
# Verify
|
|
ssh zippy du -sh /persist/services
|
|
```
|
|
|
|
2. **Update all nodes to mount NFS**
|
|
```nix
|
|
# Update common/glusterfs-client.nix → common/nfs-client.nix
|
|
# OR update common/cluster-node.nix to import nfs-client instead
|
|
|
|
fileSystems."/data/services" = {
|
|
device = "services.service.consul:/persist/services";
|
|
fsType = "nfs";
|
|
options = [ "x-systemd.automount" "noauto" "x-systemd.idle-timeout=60" ];
|
|
};
|
|
|
|
# Remove old GlusterFS mount
|
|
# fileSystems."/data/compute" = ... # DELETE
|
|
```
|
|
|
|
3. **Deploy updated configs**
|
|
```bash
|
|
deploy -s '.#c1' '.#c2' '.#c3' '.#fractal' '.#zippy'
|
|
```
|
|
|
|
4. **Verify NFS mounts**
|
|
```bash
|
|
for host in c1 c2 c3 fractal zippy; do
|
|
ssh $host "df -h | grep services"
|
|
done
|
|
```
|
|
|
|
5. **Stop all Nomad jobs temporarily**
|
|
```bash
|
|
# Get list of running jobs
|
|
nomad job status | grep running | awk '{print $1}' > /tmp/running-jobs.txt
|
|
|
|
# Stop all (they'll be restarted with updated paths in Phase 4)
|
|
cat /tmp/running-jobs.txt | xargs -I {} nomad job stop {}
|
|
```
|
|
|
|
6. **Remove GlusterFS from cluster**
|
|
```bash
|
|
# On c1 (or any gluster server)
|
|
gluster volume stop compute
|
|
gluster volume delete compute
|
|
|
|
# On all nodes
|
|
for host in c1 c2 c3; do
|
|
ssh $host "sudo systemctl stop glusterd; sudo systemctl disable glusterd"
|
|
done
|
|
```
|
|
|
|
7. **Remove GlusterFS from NixOS configs**
|
|
```nix
|
|
# common/compute-node.nix - remove ./glusterfs.nix import
|
|
# Deploy again
|
|
deploy
|
|
```
|
|
|
|
### Phase 4: Update and redeploy Nomad jobs
|
|
**Duration: 2-4 hours**
|
|
|
|
**Goal:** Update all Nomad job paths, add constraints/affinities, redeploy
|
|
|
|
1. **Update job specs** (see Service Catalog below for details; a grep/sed sketch for bulk path updates follows this list)
|
|
- Change `/data/compute` → `/data/services`
|
|
- Add constraints for media jobs → fractal
|
|
- Add affinities for database jobs → zippy
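One way to find and rewrite the old paths in bulk, assuming the job files live under `services/` as referenced throughout this plan (review the diff before committing):

```bash
# Find job files still referencing the old GlusterFS/syncthing paths
grep -rln -e '/data/compute' -e '/data/sync/wordpress' services/

# Rewrite the common prefixes in place (GNU sed), then review with git diff
sed -i \
  -e 's|/data/compute/appdata|/data/services|g' \
  -e 's|/data/compute/config|/data/services|g' \
  -e 's|/data/sync/wordpress|/data/services/wordpress|g' \
  services/*.hcl
```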
|
|
|
|
2. **Deploy critical services first**
|
|
```bash
|
|
# Core infrastructure
|
|
nomad run services/mysql.hcl
|
|
nomad run services/postgres.hcl
|
|
nomad run services/redis.hcl
|
|
nomad run services/traefik.hcl
|
|
nomad run services/authentik.hcl
|
|
|
|
# Verify
|
|
nomad job status mysql
|
|
consul catalog services
|
|
```
|
|
|
|
3. **Deploy high-priority services**
|
|
```bash
|
|
nomad run services/prometheus.hcl
|
|
nomad run services/grafana.hcl
|
|
nomad run services/loki.hcl
|
|
nomad run services/vector.hcl
|
|
|
|
nomad run services/unifi.hcl
|
|
nomad run services/gitea.hcl
|
|
```
|
|
|
|
4. **Deploy medium-priority services**
|
|
```bash
|
|
# See service catalog for full list
|
|
nomad run services/wordpress.hcl
|
|
nomad run services/ghost.hcl
|
|
nomad run services/wiki.hcl
|
|
# ... etc
|
|
```
|
|
|
|
5. **Deploy low-priority services**
|
|
```bash
|
|
nomad run services/media.hcl # Will run on fractal due to constraint
|
|
# ... etc
|
|
```
|
|
|
|
6. **Verify all services healthy**
|
|
```bash
|
|
nomad job status
|
|
consul catalog services
|
|
# Check traefik dashboard for health
|
|
```
|
|
|
|
### Phase 5: Convert sunny to NixOS (Optional, can defer)
|
|
**Duration: 6-10 hours (split across 2 stages)**
|
|
|
|
**Current state:**
|
|
- Proxmox with ~1.5TB ethereum node data
|
|
- 2x LXC containers: besu (execution client), lighthouse (consensus beacon)
|
|
- 1x VM: Rocketpool smartnode (docker containers for validator, node, MEV-boost, etc.)
|
|
- Running in "hybrid mode" - managing own execution/consensus, rocketpool manages the rest
|
|
|
|
**Goal:** Get sunny on NixOS quickly, preserve ethereum data, defer "perfect" native setup
|
|
|
|
---
|
|
|
|
#### Stage 1: Quick NixOS Migration (containers)
|
|
**Duration: 6-8 hours**
|
|
**Goal:** NixOS + containerized ethereum stack, minimal disruption
|
|
|
|
**1. Pre-migration backup and documentation**
|
|
```bash
|
|
# Document current setup
|
|
ssh sunny "pct list" > /backup/sunny-containers.txt
|
|
ssh sunny "qm list" > /backup/sunny-vms.txt
|
|
|
|
# Find ethereum data locations in LXC containers
|
|
ssh sunny "pct config BESU_CT_ID" > /backup/sunny-besu-config.txt
|
|
ssh sunny "pct config LIGHTHOUSE_CT_ID" > /backup/sunny-lighthouse-config.txt
|
|
|
|
# Document rocketpool VM volumes
|
|
ssh sunny "qm config ROCKETPOOL_VM_ID" > /backup/sunny-rocketpool-config.txt
|
|
|
|
# Estimate ethereum data size
|
|
ssh sunny "du -sh /path/to/besu/data"
|
|
ssh sunny "du -sh /path/to/lighthouse/data"
|
|
|
|
# Backup rocketpool config (docker-compose, wallet keys, etc.)
|
|
# This is in the VM - need to access and backup critical files
|
|
```
|
|
|
|
**2. Extract ethereum data from containers/VM**
|
|
```bash
|
|
# Stop ethereum services to get consistent state
|
|
# (This will pause validation! Plan for attestation penalties)
|
|
|
|
# Copy besu data out of LXC
|
|
ssh sunny "pct stop BESU_CT_ID"
|
|
rsync -av --progress sunny:/var/lib/lxc/BESU_CT_ID/rootfs/path/to/besu/ /backup/sunny-besu-data/
|
|
|
|
# Copy lighthouse data out of LXC
|
|
ssh sunny "pct stop LIGHTHOUSE_CT_ID"
|
|
rsync -av --progress sunny:/var/lib/lxc/LIGHTHOUSE_CT_ID/rootfs/path/to/lighthouse/ /backup/sunny-lighthouse-data/
|
|
|
|
# Copy rocketpool data out of VM
|
|
# This includes validator keys, wallet, node config
|
|
# Access VM and copy out: ~/.rocketpool/data
|
|
```
|
|
|
|
**3. Install NixOS on sunny**
|
|
- Fresh install with btrfs + impermanence
|
|
- Create large `/persist/ethereum` for 1.5TB+ data
|
|
- **DO NOT** try to resync from network (takes weeks!)
|
|
|
|
**4. Restore ethereum data to NixOS**
|
|
```bash
|
|
# After NixOS install, copy data back
|
|
ssh sunny "mkdir -p /persist/ethereum/{besu,lighthouse,rocketpool}"
|
|
|
|
rsync -av --progress /backup/sunny-besu-data/ sunny:/persist/ethereum/besu/
|
|
rsync -av --progress /backup/sunny-lighthouse-data/ sunny:/persist/ethereum/lighthouse/
|
|
# Rocketpool data copied later
|
|
```
|
|
|
|
**5. Create sunny NixOS config (container-based)**
|
|
```nix
|
|
# hosts/sunny/default.nix
|
|
{ config, pkgs, ... }:
|
|
{
|
|
imports = [
|
|
../../common/encrypted-btrfs-layout.nix
|
|
../../common/global
|
|
./hardware.nix
|
|
];
|
|
|
|
networking.hostName = "sunny";
|
|
|
|
# NO cluster-node import - standalone for now
|
|
# Can add to quorum later if desired
|
|
|
|
# Container runtime
|
|
virtualisation.podman = {
|
|
enable = true;
|
|
dockerCompat = true; # Provides 'docker' command
|
|
defaultNetwork.settings.dns_enabled = true;
|
|
};
|
|
|
|
# Besu execution client (container)
|
|
virtualisation.oci-containers.containers.besu = {
|
|
image = "hyperledger/besu:latest";
|
|
volumes = [
|
|
"/persist/ethereum/besu:/var/lib/besu"
|
|
];
|
|
ports = [
|
|
"8545:8545" # HTTP RPC
|
|
"8546:8546" # WebSocket RPC
|
|
"30303:30303" # P2P
|
|
];
|
|
cmd = [
|
|
"--data-path=/var/lib/besu"
|
|
"--rpc-http-enabled=true"
|
|
"--rpc-http-host=0.0.0.0"
|
|
"--rpc-ws-enabled=true"
|
|
"--rpc-ws-host=0.0.0.0"
|
|
"--engine-rpc-enabled=true"
|
|
"--engine-host-allowlist=*"
|
|
"--engine-jwt-secret=/var/lib/besu/jwt.hex"
|
|
# Add other besu flags as needed
|
|
];
|
|
autoStart = true;
|
|
};
|
|
|
|
# Lighthouse beacon client (container)
|
|
virtualisation.oci-containers.containers.lighthouse-beacon = {
|
|
image = "sigp/lighthouse:latest";
|
|
volumes = [
|
|
"/persist/ethereum/lighthouse:/data"
|
|
"/persist/ethereum/besu/jwt.hex:/jwt.hex:ro"
|
|
];
|
|
ports = [
|
|
"5052:5052" # HTTP API
|
|
"9000:9000" # P2P
|
|
];
|
|
cmd = [
|
|
"lighthouse"
|
|
"beacon"
|
|
"--datadir=/data"
|
|
"--http"
|
|
"--http-address=0.0.0.0"
|
|
"--execution-endpoint=http://besu:8551"
|
|
"--execution-jwt=/jwt.hex"
|
|
# Add other lighthouse flags
|
|
];
|
|
dependsOn = [ "besu" ];
|
|
autoStart = true;
|
|
};
|
|
|
|
# Rocketpool stack (podman-compose for multi-container setup)
|
|
# TODO: This requires converting docker-compose to NixOS config
|
|
# For now, can run docker-compose via systemd service
|
|
systemd.services.rocketpool = {
|
|
description = "Rocketpool Smartnode Stack";
|
|
after = [ "podman.service" "lighthouse-beacon.service" ];
|
|
wantedBy = [ "multi-user.target" ];
|
|
|
|
serviceConfig = {
|
|
Type = "oneshot";
|
|
RemainAfterExit = "yes";
|
|
WorkingDirectory = "/persist/ethereum/rocketpool";
|
|
ExecStart = "${pkgs.docker-compose}/bin/docker-compose up -d";
|
|
ExecStop = "${pkgs.docker-compose}/bin/docker-compose down";
|
|
};
|
|
};
|
|
|
|
# Ethereum data lives directly under /persist/ethereum, which is already the
# persistent subvolume, so no environment.persistence mapping is needed for it
|
|
|
|
# Firewall for ethereum
|
|
networking.firewall = {
|
|
allowedTCPPorts = [
|
|
30303 # Besu P2P
|
|
9000 # Lighthouse P2P
|
|
# Add rocketpool ports
|
|
];
|
|
allowedUDPPorts = [
|
|
30303 # Besu P2P
|
|
9000 # Lighthouse P2P
|
|
];
|
|
};
|
|
}
|
|
```
|
|
|
|
**6. Setup rocketpool docker-compose on NixOS**
|
|
```bash
|
|
# After NixOS is running, restore rocketpool config
|
|
ssh sunny "mkdir -p /persist/ethereum/rocketpool"
|
|
|
|
# Copy rocketpool data (wallet, keys, config)
|
|
rsync -av /backup/sunny-rocketpool-data/ sunny:/persist/ethereum/rocketpool/
|
|
|
|
# Create docker-compose.yml for rocketpool stack
|
|
# Based on rocketpool hybrid mode docs
|
|
# This runs: validator, node software, MEV-boost, prometheus, etc.
|
|
# Connects to your besu + lighthouse containers
|
|
```
|
|
|
|
**7. Deploy and test**
|
|
```bash
|
|
deploy -s '.#sunny'
|
|
|
|
# Verify containers are running
|
|
ssh sunny "podman ps"
|
|
|
|
# Check besu sync status
|
|
ssh sunny "curl -X POST -H 'Content-Type: application/json' --data '{\"jsonrpc\":\"2.0\",\"method\":\"eth_syncing\",\"params\":[],\"id\":1}' http://localhost:8545"
|
|
|
|
# Check lighthouse sync status
|
|
ssh sunny "curl http://localhost:5052/eth/v1/node/syncing"
|
|
|
|
# Monitor rocketpool
|
|
ssh sunny "cd /persist/ethereum/rocketpool && docker-compose logs -f"
|
|
```
|
|
|
|
**8. Monitor and stabilize**
|
|
- Ethereum should resume from where it left off (not resync!)
|
|
- Validation will resume once beacon is sync'd
|
|
- May have missed a few attestations during migration (minor penalty)
|
|
|
|
---
|
|
|
|
#### Stage 2: Native NixOS Services (Future)
|
|
**Duration: TBD (do this later when time permits)**
|
|
**Goal:** Convert to native NixOS services using ethereum-nix
|
|
|
|
**Why defer this:**
|
|
- Complex (rocketpool not fully packaged for Nix)
|
|
- Current container setup works fine
|
|
- Can migrate incrementally (besu → native, then lighthouse, etc.)
|
|
- No downtime once Stage 1 is stable
|
|
|
|
**When ready:**
|
|
1. Research ethereum-nix support for besu + lighthouse + rocketpool
|
|
2. Test on separate machine first
|
|
3. Migrate one service at a time with minimal downtime
|
|
4. Document in separate migration plan
|
|
|
|
**For now:** Stage 1 gets sunny on NixOS with base configs, managed declaratively, just using containers instead of native services.
|
|
|
|
### Phase 6: Verification and cleanup
|
|
**Duration: 1 hour**
|
|
|
|
1. **Test failover procedure** (see Failover Procedures below)
|
|
|
|
2. **Verify backups are working**
|
|
```bash
|
|
kopia snapshot list
|
|
# Check that /persist/services is being backed up
|
|
```
|
|
|
|
3. **Update documentation**
|
|
- Update README.md
|
|
- Document new architecture
|
|
- Update stateful-commands.txt
|
|
|
|
4. **Clean up old GlusterFS data**
|
|
```bash
|
|
# Only after verifying everything works!
|
|
for host in c1 c2 c3; do
|
|
ssh $host "sudo rm -rf /persist/glusterfs"
|
|
done
|
|
```
|
|
|
|
---
|
|
|
|
## Service Catalog
|
|
|
|
**Legend:**
|
|
- **Priority**: CRITICAL (must be up) / HIGH (important) / MEDIUM (nice to have) / LOW (can wait)
|
|
- **Target**: Where it should run (constraint or affinity)
|
|
- **Data**: What data it needs access to
|
|
- **Changes**: What needs updating in the .hcl file
|
|
|
|
### Core Infrastructure
|
|
|
|
#### mysql
|
|
- **File**: `services/mysql.hcl`
|
|
- **Priority**: CRITICAL
|
|
- **Current**: Uses `/data/compute/appdata/mysql`
|
|
- **Target**: Affinity for zippy, allow c1/c2
|
|
- **Data**: `/data/services/mysql` (NFS from zippy)
|
|
- **Changes**:
|
|
- ✏️ Volume path: `/data/compute/appdata/mysql` → `/data/services/mysql`
|
|
- ✏️ Add affinity:
|
|
```hcl
|
|
affinity {
|
|
attribute = "${node.unique.name}"
|
|
value = "zippy"
|
|
weight = 100
|
|
}
|
|
```
|
|
- ✏️ Add constraint to allow fallback:
|
|
```hcl
|
|
constraint {
|
|
attribute = "${node.unique.name}"
|
|
operator = "regexp"
|
|
value = "zippy|c1|c2"
|
|
}
|
|
```
|
|
- **Notes**: Core database, needs to stay up. Consul DNS `mysql.service.consul` unchanged.
|
|
|
|
#### postgres
|
|
- **File**: `services/postgres.hcl`
|
|
- **Priority**: CRITICAL
|
|
- **Current**: Uses `/data/compute/appdata/postgres`, `/data/compute/appdata/pgadmin`
|
|
- **Target**: Affinity for zippy, allow c1/c2
|
|
- **Data**: `/data/services/postgres`, `/data/services/pgadmin` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume paths: `/data/compute/appdata/*` → `/data/services/*`
|
|
- ✏️ Add affinity and constraint (same as mysql)
|
|
- **Notes**: Core database for authentik, gitea, plausible, netbox, etc.
|
|
|
|
#### redis
|
|
- **File**: `services/redis.hcl`
|
|
- **Priority**: CRITICAL
|
|
- **Current**: Uses `/data/compute/appdata/redis`
|
|
- **Target**: Affinity for zippy, allow c1/c2
|
|
- **Data**: `/data/services/redis` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume path: `/data/compute/appdata/redis` → `/data/services/redis`
|
|
- ✏️ Add affinity and constraint (same as mysql)
|
|
- **Notes**: Used by authentik, wordpress. Should co-locate with databases.
|
|
|
|
#### traefik
|
|
- **File**: `services/traefik.hcl`
|
|
- **Priority**: CRITICAL
|
|
- **Current**: Uses `/data/compute/config/traefik`
|
|
- **Target**: Float on c1/c2/c3 (keepalived handles HA)
|
|
- **Data**: `/data/services/traefik` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume path: `/data/compute/config/traefik` → `/data/services/traefik`
|
|
- **Notes**: Reverse proxy, has keepalived for VIP failover. Critical for all web access.
|
|
|
|
#### authentik
|
|
- **File**: `services/authentik.hcl`
|
|
- **Priority**: CRITICAL
|
|
- **Current**: No persistent volumes (stateless, uses postgres/redis)
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: None (uses postgres.service.consul, redis.service.consul)
|
|
- **Changes**: None needed
|
|
- **Notes**: SSO for most services. Must stay up.
|
|
|
|
### Monitoring Stack
|
|
|
|
#### prometheus
|
|
- **File**: `services/prometheus.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Uses `/data/compute/appdata/prometheus`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/prometheus` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume path: `/data/compute/appdata/prometheus` → `/data/services/prometheus`
|
|
- **Notes**: Metrics database. Important for monitoring but not critical for services.
|
|
|
|
#### grafana
|
|
- **File**: `services/grafana.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Uses `/data/compute/appdata/grafana`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/grafana` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume path: `/data/compute/appdata/grafana` → `/data/services/grafana`
|
|
- **Notes**: Monitoring UI. Depends on prometheus.
|
|
|
|
#### loki
|
|
- **File**: `services/loki.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Uses `/data/compute/appdata/loki`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/loki` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume path: `/data/compute/appdata/loki` → `/data/services/loki`
|
|
- **Notes**: Log aggregation. Important for debugging.
|
|
|
|
#### vector
|
|
- **File**: `services/vector.hcl`
|
|
- **Priority**: MEDIUM
|
|
- **Current**: No persistent volumes, type=system (runs on all nodes)
|
|
- **Target**: System job (runs everywhere)
|
|
- **Data**: None (ephemeral logs, ships to loki)
|
|
- **Changes**:
|
|
- ❓ Check if glusterfs log path is still needed: `/var/log/glusterfs:/var/log/glusterfs:ro`
|
|
- ✏️ Remove glusterfs log collection after GlusterFS is removed
|
|
- **Notes**: Log shipper. Can tolerate downtime.
|
|
|
|
### Databases (Specialized)
|
|
|
|
#### clickhouse
|
|
- **File**: `services/clickhouse.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Uses `/data/compute/appdata/clickhouse`
|
|
- **Target**: Affinity for zippy (large dataset), allow c1/c2/c3
|
|
- **Data**: `/data/services/clickhouse` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume path: `/data/compute/appdata/clickhouse` → `/data/services/clickhouse`
|
|
- ✏️ Add affinity for zippy (optional, but helps with performance)
|
|
- **Notes**: Used by plausible. Large time-series data. Important but can be recreated.
|
|
|
|
#### mongodb
|
|
- **File**: `services/unifi.hcl` (embedded in unifi job)
|
|
- **Priority**: HIGH
|
|
- **Current**: Uses `/data/compute/appdata/unifi/mongodb`
|
|
- **Target**: Float on c1/c2/c3 (with unifi)
|
|
- **Data**: `/data/services/unifi/mongodb` (NFS)
|
|
- **Changes**: See unifi below
|
|
- **Notes**: Only used by unifi. Should stay with unifi controller.
|
|
|
|
### Web Applications
|
|
|
|
#### wordpress
|
|
- **File**: `services/wordpress.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Uses `/data/sync/wordpress` (syncthing-managed to avoid slow GlusterFS)
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/wordpress` (NFS from zippy)
|
|
- **Changes**:
|
|
- ✏️ Volume path: `/data/sync/wordpress` → `/data/services/wordpress`
|
|
- 📋 **Before cutover**: Copy data from syncthing to zippy: `rsync -av /data/sync/wordpress/ zippy:/persist/services/wordpress/`
|
|
- 📋 **After migration**: Remove syncthing configuration for wordpress sync
|
|
- **Notes**: Production website. Important but can tolerate brief downtime during migration.
|
|
|
|
#### ghost
|
|
- **File**: `services/ghost.hcl`
|
|
- **Priority**: no longer used, should wipe
|
|
- **Current**: Uses `/data/compute/appdata/ghost`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/ghost` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume path: `/data/compute/appdata/ghost` → `/data/services/ghost`
|
|
- **Notes**: Blog platform (alo.land). No longer used; plan to decommission rather than migrate.
|
|
|
|
#### gitea
|
|
- **File**: `services/gitea.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Uses `/data/compute/appdata/gitea/data`, `/data/compute/appdata/gitea/config`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/gitea/*` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume paths: `/data/compute/appdata/gitea/*` → `/data/services/gitea/*`
|
|
- **Notes**: Git server. Contains code repositories. Important.
|
|
|
|
#### wiki (tiddlywiki)
|
|
- **File**: `services/wiki.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Uses `/data/compute/appdata/wiki` via host volume mount
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/wiki` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume mount path in `volume_mount` blocks
|
|
- ⚠️ Uses `exec` driver with host volumes - verify NFS mount works with this
|
|
- **Notes**: Multiple tiddlywiki instances. Personal wikis. Can tolerate downtime.
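If the wiki job keeps using the `exec` driver with a Nomad host volume, the host volume definition in the Nomad client config also needs to point at the new path; a sketch of that client stanza (the volume name is an assumption):

```hcl
# Nomad client config on the nodes that may run the wiki job
client {
  host_volume "wiki" {
    path      = "/data/services/wiki"
    read_only = false
  }
}
```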
|
|
|
|
#### code-server
|
|
- **File**: `services/code-server.hcl`
|
|
- **Priority**: LOW
|
|
- **Current**: Uses `/data/compute/appdata/code`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/code` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume path: `/data/compute/appdata/code` → `/data/services/code`
|
|
- **Notes**: Web IDE. Low priority, for development only.
|
|
|
|
#### beancount (fava)
|
|
- **File**: `services/beancount.hcl`
|
|
- **Priority**: MEDIUM
|
|
- **Current**: Uses `/data/compute/appdata/beancount`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/beancount` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume path: `/data/compute/appdata/beancount` → `/data/services/beancount`
|
|
- **Notes**: Finance tracking. Low priority.
|
|
|
|
#### adminer
|
|
- **File**: `services/adminer.hcl`
|
|
- **Priority**: LOW
|
|
- **Current**: Stateless
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: None
|
|
- **Changes**: None needed
|
|
- **Notes**: Database admin UI. Only needed for maintenance.
|
|
|
|
#### plausible
|
|
- **File**: `services/plausible.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Stateless (uses postgres and clickhouse)
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: None (uses postgres.service.consul, clickhouse.service.consul)
|
|
- **Changes**: None needed
|
|
- **Notes**: Website analytics. Nice to have but not critical.
|
|
|
|
#### evcc
|
|
- **File**: `services/evcc.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Uses `/data/compute/appdata/evcc/evcc.yaml`, `/data/compute/appdata/evcc/evcc`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/evcc/*` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume paths: `/data/compute/appdata/evcc/*` → `/data/services/evcc/*`
|
|
- **Notes**: EV charging controller. Important for daily use.
|
|
|
|
#### vikunja
|
|
- **File**: `services/vikunja.hcl` (assumed to exist based on README)
|
|
- **Priority**: no longer used, should delete
|
|
- **Current**: Likely uses `/data/compute/appdata/vikunja`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/vikunja` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume paths: Update to `/data/services/vikunja`
|
|
- **Notes**: Task management. No longer used; delete rather than migrate.
|
|
|
|
#### leantime
|
|
- **File**: `services/leantime.hcl`
|
|
- **Priority**: no longer used, should delete
|
|
- **Current**: Likely uses `/data/compute/appdata/leantime`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/leantime` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume paths: Update to `/data/services/leantime`
|
|
- **Notes**: Project management. No longer used; delete rather than migrate.
|
|
|
|
### Network Infrastructure
|
|
|
|
#### unifi
|
|
- **File**: `services/unifi.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Uses `/data/compute/appdata/unifi/data`, `/data/compute/appdata/unifi/mongodb`
|
|
- **Target**: Float on c1/c2/c3/fractal/zippy
|
|
- **Data**: `/data/services/unifi/*` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume paths: `/data/compute/appdata/unifi/*` → `/data/services/unifi/*`
|
|
- **Notes**: UniFi network controller. Critical for network management. Has keepalived VIP for stable inform address. Floating is fine.
|
|
|
|
### Media Stack
|
|
|
|
#### media (radarr, sonarr, bazarr, plex, qbittorrent)
|
|
- **File**: `services/media.hcl`
|
|
- **Priority**: MEDIUM
|
|
- **Current**: Uses `/data/compute/appdata/radarr`, `/data/compute/appdata/sonarr`, etc. and `/data/media`
|
|
- **Target**: **MUST run on fractal** (local /data/media access)
|
|
- **Data**:
|
|
- `/data/services/radarr` (NFS) - config data
|
|
- `/data/media` (local disk on fractal; CIFS mount on the other nodes)
|
|
- **Changes**:
|
|
- ✏️ Volume paths: `/data/compute/appdata/*` → `/data/services/*`
|
|
- ✏️ **Add constraint**:
|
|
```hcl
|
|
constraint {
|
|
attribute = "${node.unique.name}"
|
|
value = "fractal"
|
|
}
|
|
```
|
|
- **Notes**: Heavy I/O to /data/media. Must run on fractal for performance. Has keepalived VIP.
|
|
|
|
### Utility Services
|
|
|
|
#### weewx
|
|
- **File**: `services/weewx.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Likely uses `/data/compute/appdata/weewx`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/weewx` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume paths: Update to `/data/services/weewx`
|
|
- **Notes**: Weather station data logger.
|
|
|
|
#### maps
|
|
- **File**: `services/maps.hcl`
|
|
- **Priority**: MEDIUM
|
|
- **Current**: Likely uses `/data/compute/appdata/maps`
|
|
- **Target**: Float on c1/c2/c3 (or fractal if large tile data)
|
|
- **Data**: `/data/services/maps` (NFS) or `/data/media/maps` if large
|
|
- **Changes**:
|
|
- ✏️ Volume paths: Check data size, may want to move to /data/media
|
|
- **Notes**: Map tiles. Low priority.
|
|
|
|
#### netbox
|
|
- **File**: `services/netbox.hcl`
|
|
- **Priority**: LOW
|
|
- **Current**: Likely uses `/data/compute/appdata/netbox`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/netbox` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume paths: Update to `/data/services/netbox`
|
|
- **Notes**: IPAM/DCIM. Low priority, for documentation.
|
|
|
|
#### farmos
|
|
- **File**: `services/farmos.hcl`
|
|
- **Priority**: LOW
|
|
- **Current**: Likely uses `/data/compute/appdata/farmos`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/farmos` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume paths: Update to `/data/services/farmos`
|
|
- **Notes**: Farm management. Low priority.
|
|
|
|
#### urbit
|
|
- **File**: `services/urbit.hcl`
|
|
- **Priority**: LOW
|
|
- **Current**: Likely uses `/data/compute/appdata/urbit`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/urbit` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume paths: Update to `/data/services/urbit`
|
|
- **Notes**: Urbit node. Experimental, low priority.
|
|
|
|
#### webodm
|
|
- **File**: `services/webodm.hcl`
|
|
- **Priority**: LOW
|
|
- **Current**: Likely uses `/data/compute/appdata/webodm`
|
|
- **Target**: Float on c1/c2/c3 (or fractal if processing large imagery from /data/media)
|
|
- **Data**: `/data/services/webodm` (NFS)
|
|
- **Changes**:
|
|
- ✏️ Volume paths: Update to `/data/services/webodm`
|
|
- 🤔 May benefit from running on fractal if it processes files from /data/media
|
|
- **Notes**: Drone imagery processing. Low priority.
|
|
|
|
#### velutrack
|
|
- **File**: `services/velutrack.hcl`
|
|
- **Priority**: LOW
|
|
- **Current**: Likely minimal state
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: Minimal
|
|
- **Changes**: Verify if any volume paths need updating
|
|
- **Notes**: Vehicle tracking. Low priority.
|
|
|
|
#### resol-gateway
|
|
- **File**: `services/resol-gateway.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Likely minimal state
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: Minimal
|
|
- **Changes**: Verify if any volume paths need updating
|
|
- **Notes**: Solar thermal controller.
|
|
|
|
#### igsync
|
|
- **File**: `services/igsync.hcl`
|
|
- **Priority**: MEDIUM
|
|
- **Current**: Likely uses `/data/compute/appdata/igsync` or `/data/media`
|
|
- **Target**: Float on c1/c2/c3 (or fractal if storing to /data/media)
|
|
- **Data**: Check if it writes to `/data/media` or `/data/services`
|
|
- **Changes**:
|
|
- ✏️ Volume paths: Verify and update
|
|
- **Notes**: Instagram sync. Low priority.
|
|
|
|
#### jupyter
|
|
- **File**: `services/jupyter.hcl`
|
|
- **Priority**: LOW
|
|
- **Current**: Stateless or minimal state
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: Minimal
|
|
- **Changes**: Verify if any volume paths need updating
|
|
- **Notes**: Notebook server. Low priority, for experimentation.
|
|
|
|
#### whoami
|
|
- **File**: `services/whoami.hcl`
|
|
- **Priority**: LOW
|
|
- **Current**: Stateless
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: None
|
|
- **Changes**: None needed
|
|
- **Notes**: Test service. Can be stopped during migration.
|
|
|
|
#### tiddlywiki (if separate from wiki.hcl)
|
|
- **File**: `services/tiddlywiki.hcl`
|
|
- **Priority**: MEDIUM
|
|
- **Current**: Likely same as wiki.hcl
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: `/data/services/tiddlywiki` (NFS)
|
|
- **Changes**: Same as wiki.hcl
|
|
- **Notes**: May be duplicate of wiki.hcl.
|
|
|
|
### Backup Jobs
|
|
|
|
#### mysql-backup
|
|
- **File**: `services/mysql-backup.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Likely writes to `/data/compute` or `/data/shared`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: Should write to `/data/shared` (backed up to fractal)
|
|
- **Changes**:
|
|
- ✏️ Verify backup destination, should be `/data/shared/backups/mysql`
|
|
- **Notes**: Important for disaster recovery. Should run regularly.
|
|
|
|
#### postgres-backup
|
|
- **File**: `services/postgres-backup.hcl`
|
|
- **Priority**: HIGH
|
|
- **Current**: Likely writes to `/data/compute` or `/data/shared`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: Should write to `/data/shared` (backed up to fractal)
|
|
- **Changes**:
|
|
- ✏️ Verify backup destination, should be `/data/shared/backups/postgres`
|
|
- **Notes**: Important for disaster recovery. Should run regularly.
|
|
|
|
#### wordpress-backup
|
|
- **File**: `services/wordpress-backup.hcl`
|
|
- **Priority**: MEDIUM
|
|
- **Current**: Likely writes to `/data/compute` or `/data/shared`
|
|
- **Target**: Float on c1/c2/c3
|
|
- **Data**: Should write to `/data/shared` (backed up to fractal)
|
|
- **Changes**:
|
|
- ✏️ Verify backup destination
|
|
- **Notes**: Periodic backup job.
|
|
|
|
---
|
|
|
|
## Failover Procedures
|
|
|
|
### NFS Server Failover (zippy → c1 or c2)
|
|
|
|
**When to use:** zippy is down and not coming back soon
|
|
|
|
**Prerequisites:**
|
|
- c1 and c2 have been receiving btrfs snapshots from zippy
|
|
- Last successful replication is recent (verify snapshot timestamps; expected every 5 minutes)
|
|
|
|
**Procedure:**
|
|
|
|
1. **Choose standby node** (c1 or c2)
|
|
```bash
|
|
# Check replication freshness
|
|
ssh c1 "ls -lt /persist/services-standby@* | head -5"
|
|
ssh c2 "ls -lt /persist/services-standby@* | head -5"
|
|
|
|
# Choose the one with most recent snapshot
|
|
# For this example, we'll use c1
|
|
```
|
|
|
|
2. **On standby node (c1), promote standby to primary**
|
|
```bash
|
|
ssh c1
|
|
|
|
# Stop NFS client mount (if running)
|
|
sudo systemctl stop data-services.mount
|
|
|
|
# Find latest snapshot
|
|
LATEST=$(ls -dt /persist/services-standby/services@* | head -1)
|
|
|
|
# Create writable subvolume from snapshot
|
|
sudo btrfs subvolume snapshot $LATEST /persist/services
|
|
|
|
# Verify
|
|
ls -la /persist/services
|
|
```
|
|
|
|
3. **Deploy c1-nfs-server configuration**
|
|
```bash
|
|
# From your workstation
|
|
deploy -s '.#c1-nfs-server'
|
|
|
|
# This activates:
|
|
# - NFS server on c1
|
|
# - Consul service registration for "services"
|
|
# - Firewall rules
|
|
```
|
|
|
|
4. **On c1, verify NFS is running**
|
|
```bash
|
|
ssh c1
|
|
sudo systemctl status nfs-server
|
|
showmount -e localhost
|
|
dig @localhost -p 8600 services.service.consul # Should show c1's IP
|
|
```
|
|
|
|
5. **On other nodes, remount NFS**
|
|
```bash
|
|
# Nodes should auto-remount via Consul DNS, but you can force it:
|
|
for host in c2 c3 fractal zippy; do
|
|
ssh $host "sudo systemctl restart data-services.mount"
|
|
done
|
|
```
|
|
|
|
6. **Verify Nomad jobs are healthy**
|
|
```bash
|
|
nomad job status mysql
|
|
nomad job status postgres
|
|
# Check all critical services
|
|
```
|
|
|
|
7. **Update monitoring/alerts**
|
|
- Note in documentation that c1 is now primary NFS server
|
|
- Set up alert to remember to fail back to zippy when it's repaired
|
|
|
|
**Recovery Time Objective (RTO):** ~10-15 minutes
|
|
|
|
**Recovery Point Objective (RPO):** Last snapshot interval (**5 minutes** max)
|
|
|
|
### Failing Back to zippy
|
|
|
|
**When to use:** zippy is repaired and ready to resume primary role
|
|
|
|
**Procedure:**
|
|
|
|
1. **Sync data from c1 back to zippy**
|
|
```bash
|
|
# On c1 (current primary)
|
|
sudo btrfs subvolume snapshot -r /persist/services /persist/services@failback-$(date +%Y%m%d-%H%M%S)
|
|
FAILBACK=$(ls -dt /persist/services@failback-* | head -1)
|
|
sudo btrfs send $FAILBACK | ssh zippy "sudo btrfs receive /persist/"
|
|
|
|
# On zippy, make it writable (delete or rename any stale /persist/services subvolume first)
|
|
ssh zippy "sudo btrfs subvolume snapshot /persist/$(basename $FAILBACK) /persist/services"
|
|
```
|
|
|
|
2. **Deploy zippy back to NFS server role**
|
|
```bash
|
|
deploy -s '.#zippy'
|
|
# Consul will register services.service.consul → zippy again
|
|
```
|
|
|
|
3. **Demote c1 back to standby**
|
|
```bash
|
|
deploy -s '.#c1'
|
|
# This removes NFS server, restores NFS client mount
|
|
```
|
|
|
|
4. **Verify all nodes are mounting from zippy**
|
|
```bash
|
|
dig @c1 -p 8600 services.service.consul # Should show zippy's IP
|
|
|
|
for host in c1 c2 c3 fractal; do
|
|
ssh $host "df -h | grep services"
|
|
done
|
|
```
|
|
|
|
### Database Job Failover (automatic via Nomad)
|
|
|
|
**When to use:** zippy is down, database jobs need to run elsewhere
|
|
|
|
**What happens automatically:**
|
|
1. Nomad detects zippy is unhealthy
|
|
2. Jobs with constraint `zippy|c1|c2` are rescheduled to c1 or c2
|
|
3. Jobs start on new node, accessing `/data/services` (now via NFS from promoted standby)
|
|
|
|
**Manual intervention needed:**
|
|
- None if NFS failover completed successfully
|
|
- If jobs are stuck: `nomad job stop mysql && nomad job run services/mysql.hcl`
|
|
|
|
**What to check:**
|
|
```bash
|
|
nomad job status mysql
|
|
nomad job status postgres
|
|
nomad job status redis
|
|
|
|
# Verify they're running on c1 or c2, not zippy
|
|
nomad alloc status <alloc-id>
|
|
```
|
|
|
|
### Complete Cluster Failure (lose quorum)
|
|
|
|
**Scenario:** 3 or more servers go down, quorum lost
|
|
|
|
**Prevention:** This is why we have 5 servers (need 3 for quorum)
|
|
|
|
**Recovery:**
|
|
1. **Bring up at least 3 servers** (any 3 from c1, c2, c3, fractal, zippy)
|
|
2. **If a failed server cannot be recovered, remove it from the peer set:**
|
|
```bash
|
|
# On one surviving server, force bootstrap
|
|
consul force-leave <failed-node>
|
|
nomad operator raft list-peers
|
|
nomad operator raft remove-peer <failed-peer>
|
|
```
|
|
3. **Restore from backups** (worst case)
|
|
|
|
---
|
|
|
|
## Post-Migration Verification Checklist
|
|
|
|
- [ ] All 5 servers in quorum: `consul members` shows c1, c2, c3, fractal, zippy
|
|
- [ ] NFS mounts working: `df -h | grep services` on all nodes
|
|
- [ ] Btrfs replication running: Check systemd timers on zippy
|
|
- [ ] Critical services up: mysql, postgres, redis, traefik, authentik
|
|
- [ ] Monitoring working: Prometheus, Grafana, Loki accessible
|
|
- [ ] Media stack on fractal: `nomad alloc status` shows media job on fractal
|
|
- [ ] Database jobs on zippy: `nomad alloc status` shows mysql/postgres on zippy
|
|
- [ ] Consul DNS working: `dig @localhost -p 8600 services.service.consul`
|
|
- [ ] Backups running: Kopia snapshots include `/persist/services`
|
|
- [ ] GlusterFS removed: No glusterfs processes, volumes deleted
|
|
- [ ] Documentation updated: README.md, architecture diagrams
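A rough helper that runs the command-line checks from this list in one pass; hostnames and job names follow this plan, so adjust as needed:

```bash
#!/usr/bin/env bash
# Post-migration spot checks (sketch); run from a node with consul/nomad/ssh access
set -u

echo "== Quorum =="
consul members
nomad server members

echo "== NFS mounts =="
for host in c1 c2 c3 fractal zippy; do
  echo "-- $host"
  ssh "$host" 'df -h | grep services || echo "no /data/services mount"'
done

echo "== Consul DNS =="
dig +short @localhost -p 8600 services.service.consul

echo "== Critical jobs =="
for job in mysql postgres redis traefik authentik; do
  echo "-- $job"
  nomad job status -short "$job"
done
```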
|
|
|
|
---
|
|
|
|
## Rollback Plan
|
|
|
|
**If migration fails catastrophically:**
|
|
|
|
1. **Stop all new Nomad jobs**
|
|
```bash
|
|
nomad job stop -purge <new-jobs>
|
|
```
|
|
|
|
2. **Restore GlusterFS mounts**
|
|
```bash
|
|
# On all nodes, re-enable GlusterFS client
|
|
deploy # With old configs
|
|
```
|
|
|
|
3. **Restart old Nomad jobs**
|
|
```bash
|
|
# With old paths pointing to /data/compute
|
|
nomad run services/*.hcl # Old versions from git
|
|
```
|
|
|
|
4. **Restore data if needed**
|
|
```bash
|
|
rsync -av /backup/compute-pre-migration/ /data/compute/
|
|
```
|
|
|
|
**Important:** Keep GlusterFS running until Phase 4 is complete and verified!
|
|
|
|
---
|
|
|
|
## Questions Answered
|
|
|
|
1. ✅ **Where is `/data/sync/wordpress` mounted from?**
|
|
- **Answer**: Syncthing-managed to avoid slow GlusterFS
|
|
- **Action**: Migrate to `/data/services/wordpress`, remove syncthing config
|
|
|
|
2. ✅ **Which services use `/data/media` directly?**
|
|
- **Answer**: Only media.hcl (radarr, sonarr, plex, qbittorrent)
|
|
- **Action**: Constrain media.hcl to fractal, everything else uses CIFS mount
|
|
|
|
3. ✅ **Do we want unifi on fractal or floating?**
|
|
- **Answer**: Floating is fine
|
|
- **Action**: No constraint needed
|
|
|
|
4. ✅ **What's the plan for sunny's existing data?**
|
|
- **Answer**: Ethereum data stays local, not replicated (too expensive)
|
|
- **Action**: Back up and restore the data during the NixOS conversion (do not resync from the network; that would take weeks)
|
|
|
|
## Questions Still to Answer
|
|
|
|
1. **Backup retention for btrfs snapshots?**
|
|
- Current plan: Keep 24 hours of snapshots on zippy
|
|
- Is this enough? Or do we want more for safety?
|
|
- This should be fine -- snapshots are just for hot recovery. More/older backups are kept via kopia on fractal.
|
|
|
|
2. **c1-nfs-server vs c1 config - same host, different configs?**
|
|
- Recommendation: Use same hostname, different flake output
|
|
- `c1` = normal config with NFS client
|
|
- `c1-nfs-server` = variant with NFS server enabled
|
|
- Both in flake.nix, deploy appropriate one based on role
|
|
- Answer: recommendation makes sense.
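A sketch of what the two flake outputs could look like, sharing the normal c1 host config and layering an NFS-server module on top; module paths and names here are assumptions:

```nix
# flake.nix (sketch): same host, two deployable roles
nixosConfigurations = {
  c1 = nixpkgs.lib.nixosSystem {
    system = "x86_64-linux";
    modules = [ ./hosts/c1 ];                                  # normal role: NFS client
  };
  c1-nfs-server = nixpkgs.lib.nixosSystem {
    system = "x86_64-linux";
    modules = [ ./hosts/c1 ./common/nfs-server-failover.nix ]; # failover role: NFS server + Consul registration
  };
};
```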
|
|
|
|
3. **Should we verify webodm, igsync, maps don't need /data/media access?**
|
|
- neither of them needs /data/media
|
|
- maps needs /data/shared
|
|
|
|
---
|
|
|
|
## Timeline Estimate
|
|
|
|
**Total duration: 15-22 hours, excluding the optional sunny migration** (can be split across multiple sessions)
|
|
|
|
- Phase 0 (Prep): 1-2 hours
|
|
- Phase 1 (fractal): 6-8 hours
|
|
- Phase 2 (zippy storage): 2-3 hours
|
|
- Phase 3 (GlusterFS → NFS): 3-4 hours
|
|
- Phase 4 (Nomad jobs): 2-4 hours
|
|
- Phase 5 (sunny): 6-8 hours for Stage 1 (optional, can be done later)
|
|
- Phase 6 (Cleanup): 1 hour
|
|
|
|
**Suggested schedule:**
|
|
- **Day 1**: Phases 0-1 (fractal conversion, establish quorum)
|
|
- **Day 2**: Phases 2-3 (zippy storage, data migration)
|
|
- **Day 3**: Phase 4 (Nomad job updates and deployment)
|
|
- **Day 4**: Phases 5-6 (sunny + cleanup) or take a break and do later
|
|
|
|
**Maintenance windows needed:**
|
|
- Phase 3: ~1 hour downtime (all services stopped during data migration)
|
|
- Phase 4: Rolling (services come back up as redeployed)
|