diff --git a/CLAUDE.md b/CLAUDE.md
index 3e4b7bd..fca3a2d 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -30,10 +30,7 @@ NixOS cluster configuration using flakes. Homelab infrastructure with Nomad/Cons
 └── services/ # Nomad job specs (.hcl files)
 ```
 
-## Current Architecture (transitioning)
-
-**OLD**: GlusterFS on c1/c2/c3 at `/data/compute` (being phased out)
-**NEW**: NFS from zippy at `/data/services` (current target)
+## Current Architecture
 
 ### Storage Mounts
 - `/data/services` - NFS from `data-services.service.consul` (zippy primary, c1 standby)
@@ -86,26 +83,18 @@ NixOS cluster configuration using flakes. Homelab infrastructure with Nomad/Cons
 
 ## Migration Status
 
-**Phase**: 4 in progress (20/35 services migrated)
-**Current**: Migrating services from GlusterFS → NFS
-**Next**: Finish migrating remaining services, update host volumes, remove GlusterFS
-**Later**: Convert fractal to NixOS (deferred)
+**Phase 3 & 4**: COMPLETE! GlusterFS removed, all services on NFS
+**Next**: Convert fractal to NixOS (deferred)
 
 See `docs/MIGRATION_TODO.md` for detailed checklist.
 
-**IMPORTANT**: When working on migration tasks:
-1. Always update `docs/MIGRATION_TODO.md` after completing each service migration
-2. Update both the individual service checklist AND the summary counts at the bottom
-3. Pattern: `/data/compute/appdata/foo` → `/data/services/foo` (NOT `/data/services/appdata/foo`!)
-4. Migration workflow per service: stop → copy data → edit config → start → update MIGRATION_TODO.md
-
 ## Common Tasks
 
 **Deploy a host**: `deploy -s '.#hostname'`
 **Deploy all**: `deploy`
 **Check replication**: `ssh zippy journalctl -u replicate-services-to-c1.service -f`
 **NFS failover**: See `docs/NFS_FAILOVER.md`
-**Nomad jobs**: `services/*.hcl` - update paths: `/data/compute/appdata/foo` → `/data/services/foo` (NOT `/data/services/appdata/foo`!)
+**Nomad jobs**: `services/*.hcl` - service data stored at `/data/services/`
 
 ## Troubleshooting Hints
 
diff --git a/common/cluster-member.nix b/common/cluster-member.nix
index 2fd453a..7eefc50 100644
--- a/common/cluster-member.nix
+++ b/common/cluster-member.nix
@@ -8,7 +8,6 @@
     ./unattended-encryption.nix
     ./cifs-client.nix
     ./consul.nix
-    ./glusterfs-client.nix # Keep during migration, will be removed in Phase 3
     ./nfs-services-client.nix # New: NFS client for /data/services
   ];
 
diff --git a/common/glusterfs-client.nix b/common/glusterfs-client.nix
deleted file mode 100644
index 5e9792e..0000000
--- a/common/glusterfs-client.nix
+++ /dev/null
@@ -1,13 +0,0 @@
-{ pkgs, ... }:
-{
-  environment.systemPackages = [ pkgs.glusterfs ];
-
-  fileSystems."/data/compute" = {
-    device = "192.168.1.71:/compute";
-    fsType = "glusterfs";
-    options = [
-      "backup-volfile-servers=192.168.1.72:192.168.1.73"
-      "_netdev"
-    ];
-  };
-}
diff --git a/common/glusterfs.nix b/common/glusterfs.nix
deleted file mode 100644
index 5de1f40..0000000
--- a/common/glusterfs.nix
+++ /dev/null
@@ -1,24 +0,0 @@
-{
-  pkgs,
-  config,
-  lib,
-  ...
-}:
-{
-  services.glusterfs = {
-    enable = true;
-  };
-
-  environment.persistence."/persist".directories = [ "/var/lib/glusterd" ];
-
-  # TODO: each volume needs its own port starting at 49152
-  networking.firewall.allowedTCPPorts = [
-    24007
-    24008
-    24009
-    49152
-    49153
-    49154
-    49155
-  ];
-}
diff --git a/docs/MIGRATION_TODO.md b/docs/MIGRATION_TODO.md
index 820aeb3..7fccfc0 100644
--- a/docs/MIGRATION_TODO.md
+++ b/docs/MIGRATION_TODO.md
@@ -37,17 +37,17 @@ See [CLUSTER_REVAMP.md](./CLUSTER_REVAMP.md) for detailed procedures.
 ## Phase 3: Migrate from GlusterFS to NFS
 - [x] Update all nodes to mount NFS at `/data/services`
 - [x] Deploy updated configs (NFS client on all nodes)
-- [ ] Stop all Nomad jobs temporarily
-- [ ] Copy data from GlusterFS to zippy NFS
-  - [ ] Copy `/data/compute/appdata/*` → `/persist/services/appdata/`
-  - [ ] Copy `/data/compute/config/*` → `/persist/services/config/`
-  - [ ] Copy `/data/sync/wordpress` → `/persist/services/appdata/wordpress`
-  - [ ] Verify data integrity
-- [ ] Verify NFS mounts working on all nodes
-- [ ] Stop GlusterFS volume
-- [ ] Delete GlusterFS volume
-- [ ] Remove GlusterFS from NixOS configs
-- [ ] Remove syncthing wordpress sync configuration
+- [x] Stop all Nomad jobs temporarily
+- [x] Copy data from GlusterFS to zippy NFS
+  - [x] Copy `/data/compute/appdata/*` → `/persist/services/appdata/`
+  - [x] Copy `/data/compute/config/*` → `/persist/services/config/`
+  - [x] Copy `/data/sync/wordpress` → `/persist/services/appdata/wordpress`
+  - [x] Verify data integrity
+- [x] Verify NFS mounts working on all nodes
+- [x] Stop GlusterFS volume
+- [x] Delete GlusterFS volume
+- [x] Remove GlusterFS from NixOS configs
+- [x] Remove syncthing wordpress sync configuration (no longer used)
 
 ## Phase 4: Update and redeploy Nomad jobs
 
@@ -125,8 +125,8 @@ See [CLUSTER_REVAMP.md](./CLUSTER_REVAMP.md) for detailed procedures.
 - [ ] Verify backups include `/persist/services` data
 - [ ] Verify backups exclude replication snapshots
 - [ ] Update documentation (README.md, architecture diagrams)
-- [ ] Clean up old GlusterFS data (only after everything verified!)
-- [ ] Remove old glusterfs directories from all nodes
+- [x] Clean up old GlusterFS data (only after everything verified!)
+- [x] Remove old glusterfs directories from all nodes
 
 ## Post-Migration Checklist
 - [ ] All 5 servers in quorum (consul members)
@@ -143,8 +143,8 @@ See [CLUSTER_REVAMP.md](./CLUSTER_REVAMP.md) for detailed procedures.
 
 ---
 
-**Last updated**: 2025-10-23 22:30
-**Current phase**: Phase 4 complete! All services migrated to NFS
+**Last updated**: 2025-10-25
+**Current phase**: Phase 3 & 4 complete! GlusterFS removed, all services on NFS
 **Note**: Phase 1 (fractal NixOS conversion) deferred until after GlusterFS migration is complete
 
 ## Migration Summary
diff --git a/hosts/c1/default.nix b/hosts/c1/default.nix
index cf6dde5..c261b82 100644
--- a/hosts/c1/default.nix
+++ b/hosts/c1/default.nix
@@ -6,7 +6,6 @@
     ../../common/cluster-member.nix # Consul + storage clients
     ../../common/nomad-worker.nix # Nomad client (runs jobs)
     ../../common/nomad-server.nix # Consul + Nomad server mode
-    ../../common/glusterfs.nix # GlusterFS server (temp during migration)
     ../../common/nfs-services-standby.nix # NFS standby for /data/services
     # To promote to NFS server (during failover):
     # 1. Follow procedure in docs/NFS_FAILOVER.md
diff --git a/hosts/c2/default.nix b/hosts/c2/default.nix
index 436f704..c5d717a 100644
--- a/hosts/c2/default.nix
+++ b/hosts/c2/default.nix
@@ -6,7 +6,6 @@
     ../../common/cluster-member.nix # Consul + storage clients
     ../../common/nomad-worker.nix # Nomad client (runs jobs)
     ../../common/nomad-server.nix # Consul + Nomad server mode
-    ../../common/glusterfs.nix # GlusterFS server (temp during migration)
     ./hardware.nix
   ];
 
diff --git a/hosts/c3/default.nix b/hosts/c3/default.nix
index f8f3103..656fc92 100644
--- a/hosts/c3/default.nix
+++ b/hosts/c3/default.nix
@@ -6,7 +6,6 @@
     ../../common/cluster-member.nix # Consul + storage clients
     ../../common/nomad-worker.nix # Nomad client (runs jobs)
     ../../common/nomad-server.nix # Consul + Nomad server mode
-    ../../common/glusterfs.nix # GlusterFS server (temp during migration)
     ../../common/binary-cache-server.nix
     ./hardware.nix
   ];
diff --git a/hosts/zippy/default.nix b/hosts/zippy/default.nix
index e7fc27d..abd373a 100644
--- a/hosts/zippy/default.nix
+++ b/hosts/zippy/default.nix
@@ -6,7 +6,6 @@
     ../../common/cluster-member.nix # Consul + storage clients
     ../../common/nomad-worker.nix # Nomad client (runs jobs)
     # NOTE: zippy is NOT a server - no nomad-server.nix import
-    ../../common/glusterfs.nix # GlusterFS server (temp during migration)
     # ../../common/ethereum.nix
     ../../common/nfs-services-server.nix # NFS server for /data/services
     # To move NFS server role to another host: