# Claude Code Quick Reference

NixOS cluster configuration using flakes. Homelab infrastructure with Nomad/Consul orchestration.

## Project Structure

```
├── common/
│   ├── global/                   # Applied to all hosts (backup, sops, users, etc.)
│   ├── minimal-node.nix          # Base (ssh, user, boot, impermanence)
│   ├── cluster-member.nix        # Consul agent + storage mounts (NFS/CIFS)
│   ├── nomad-worker.nix          # Nomad client (runs jobs) + Docker + NFS deps
│   ├── nomad-server.nix          # Enables Consul + Nomad server mode
│   ├── cluster-tools.nix         # Just CLI tools (nomad, wander, damon)
│   ├── workstation-node.nix      # Dev tools (wget, deploy-rs, docker, nix-ld)
│   ├── desktop-node.nix          # Hyprland + GUI environment
│   ├── nfs-services-server.nix   # NFS server + btrfs replication
│   └── nfs-services-standby.nix  # NFS standby + receive replication
├── hosts/                        # Host configs - check imports for roles
├── docs/
│   ├── CLUSTER_REVAMP.md         # Master plan for architecture changes
│   ├── MIGRATION_TODO.md         # Tracking checklist for migration
│   ├── NFS_FAILOVER.md           # NFS failover procedures
│   └── AUTH_SETUP.md             # Authentication (Pocket ID + Traefik OIDC)
└── services/                     # Nomad job specs (.hcl files)
```

## Current Architecture

### Storage Mounts

- `/data/services` - NFS from `data-services.service.consul` (check nfs-services-server.nix for primary)
- `/data/media` - CIFS from fractal
- `/data/shared` - CIFS from fractal

### Cluster Roles (check hosts/*/default.nix for each host's imports)

- **Quorum**: hosts importing `nomad-server.nix` (3 expected for consensus)
- **Workers**: hosts importing `nomad-worker.nix` (run Nomad jobs)
- **NFS server**: host importing `nfs-services-server.nix` (affinity for direct disk access like DBs)
- **Standby**: hosts importing `nfs-services-standby.nix` (receive replication)

## Config Architecture

**Modular role-based configs** (compose as needed):
- `minimal-node.nix` - Base for all systems (SSH, user, boot, impermanence)
- `cluster-member.nix` - Consul agent + shared storage mounts (no Nomad)
- `nomad-worker.nix` - Nomad client to run jobs (requires cluster-member)
- `nomad-server.nix` - Enables Consul + Nomad server mode (for quorum members)
- `cluster-tools.nix` - Just CLI tools (no services)

**Machine type configs** (via flake profile):
- `workstation-node.nix` - Dev tools (deploy-rs, docker, nix-ld, emulation)
- `desktop-node.nix` - Extends workstation + Hyprland/GUI

**Composition patterns**:
- Quorum member: `cluster-member + nomad-worker + nomad-server`
- Worker only: `cluster-member + nomad-worker`
- CLI only: `cluster-member + cluster-tools` (Consul agent, no Nomad service)
- NFS primary: `cluster-member + nomad-worker + nfs-services-server`
- Standalone: `minimal-node` only (no cluster membership)

**Key insight**: Profiles (workstation/desktop) don't imply cluster roles. Check imports for actual roles.

## Key Patterns

**NFS Server/Standby**:
- Primary: imports `nfs-services-server.nix`, sets `standbys = [...]`
- Standby: imports `nfs-services-standby.nix`, sets `replicationKeys = [...]`
- Replication: btrfs send/receive every 5min, incremental with fallback to full
- Check host configs for current primary/standby assignments

**Backups**:
- Kopia client on all nodes → Kopia server on fractal
- Backs up `/persist` hourly via btrfs snapshot
- Excludes: `services@*` and `services-standby/services@*` (replication snapshots)

**Secrets**:
- SOPS for secrets, files in `secrets/`
- Keys managed per-host

**Authentication**:
- Pocket ID (OIDC provider) at `pocket-id.v.paler.net`
- Traefik uses `traefik-oidc-auth` plugin for SSO
- Services add `middlewares=oidc-auth@file` tag to protect
- See `docs/AUTH_SETUP.md` for details

## Migration Status

**Phase 3 & 4**: COMPLETE! GlusterFS removed, all services on NFS
**Next**: Convert fractal to NixOS (deferred)

See `docs/MIGRATION_TODO.md` for detailed checklist.

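
As a rough illustration of the composition patterns listed under Config Architecture, a quorum member's host file might look like the sketch below. The host name `example-host` and the relative import paths are assumptions inferred from the project structure, not copied from a real host config; check `hosts/*/default.nix` for the actual layout.

```nix
# hosts/example-host/default.nix (hypothetical host, for illustration only)
{ ... }:
{
  imports = [
    # Paths assume the layout shown under Project Structure.
    ../../common/cluster-member.nix # Consul agent + shared storage mounts
    ../../common/nomad-worker.nix   # Nomad client (runs jobs)
    ../../common/nomad-server.nix   # Consul + Nomad server mode (quorum)
  ];

  # Standard NixOS option; the value is a placeholder.
  networking.hostName = "example-host";
}
```

Dropping the `nomad-server.nix` import from this sketch would correspond to the worker-only pattern.
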
## Common Tasks

**Deploy a host**: `deploy -s '.#hostname'`
**Deploy all**: `deploy`
**Check replication**: Identify the NFS primary host, then `ssh <primary-host> journalctl -u replicate-services-to-*.service -f`
**NFS failover**: See `docs/NFS_FAILOVER.md`
**Nomad jobs**: `services/*.hcl` - service data stored at `/data/services/`

## Troubleshooting Hints

- Replication errors with "empty stream": SSH key restricted to `btrfs receive`, can't run other commands
- NFS split-brain protection: nfs-server checks Consul before starting
- Btrfs snapshots: nested snapshots appear as empty dirs in parent snapshots
- Kopia: uses temporary snapshot for consistency, doesn't back up nested subvolumes

## Important Files

- `common/global/backup.nix` - Kopia backup configuration
- `common/nfs-services-server.nix` - NFS server role (check hosts for which imports this)
- `common/nfs-services-standby.nix` - NFS standby role (check hosts for which imports this)
- `flake.nix` - Host definitions, nixpkgs inputs

---

*Auto-generated reference for Claude Code. Keep concise. Update when architecture changes.*