# Claude Code Quick Reference

NixOS cluster configuration using flakes. Homelab infrastructure with Nomad/Consul orchestration.

## Project Structure

```
├── common/
│   ├── global/                      # Applied to all hosts (backup, sops, users, etc.)
│   ├── compute-node.nix             # Nomad client + Consul agent + NFS client
│   ├── cluster-node.nix             # Nomad server + Consul server (for quorum members)
│   ├── nfs-services-server.nix      # NFS server + btrfs replication (zippy)
│   └── nfs-services-standby.nix     # NFS standby + receive replication (c1, c2)
├── hosts/
│   ├── c1/, c2/, c3/                # Cattle nodes (compute, quorum members)
│   ├── zippy/                       # Primary storage + NFS server + stateful workloads
│   ├── fractal/                     # (Proxmox, will become NixOS storage node)
│   ├── sunny/                       # (Standalone ethereum node, not in cluster)
│   └── chilly/                      # (Home Assistant VM, not in cluster)
├── docs/
│   ├── CLUSTER_REVAMP.md            # Master plan for architecture changes
│   ├── MIGRATION_TODO.md            # Tracking checklist for migration
│   └── NFS_FAILOVER.md              # NFS failover procedures
└── services/                        # Nomad job specs (.hcl files)
```

## Current Architecture (transitioning)

**OLD**: GlusterFS on c1/c2/c3 at `/data/compute` (being phased out)
**NEW**: NFS from zippy at `/data/services` (current target)

### Storage Mounts
- `/data/services` - NFS from `data-services.service.consul` (zippy primary, c1 standby)
- `/data/media` - CIFS from fractal (existing, unchanged)
- `/data/shared` - CIFS from fractal (existing, unchanged)

### Hosts
- **c1, c2, c3**: Cattle nodes, run most workloads, Nomad/Consul quorum
- **zippy**: Primary NFS server, runs databases (affinity), replicates to c1 every 5min
- **fractal**: Storage node (Proxmox/ZFS), will join quorum after GlusterFS removed
- **sunny**: Standalone ethereum staking node
- **chilly**: Home Assistant VM

## Key Patterns

**NFS Server/Standby**:
- Primary (zippy): imports `nfs-services-server.nix`, sets `standbys = ["c1"]`
- Standby (c1): imports `nfs-services-standby.nix`, sets `replicationKeys = [...]`
- Replication: btrfs send/receive every 5min, incremental with fallback to full
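
The incremental-with-fallback replication described above can be sketched roughly as follows. This is an illustrative shell sketch, not the repository's actual replication unit: the function name `replicate_snapshot`, the `RUN` dry-run hook, and every snapshot and destination path are assumptions made for the example.

```shell
# Hypothetical sketch: incremental btrfs send with fallback to a full send.
# RUN executes a pipeline string; set RUN=echo to print it instead of running it.
RUN="${RUN:-eval}"

replicate_snapshot() {
  local new_snap="$1" parent_snap="$2" standby="$3" dest_dir="$4"
  local receive="ssh $standby btrfs receive $dest_dir"

  # Incremental: send only the delta between parent_snap and new_snap.
  if [ -n "$parent_snap" ] &&
     $RUN "btrfs send -p $parent_snap $new_snap | $receive"; then
    return 0
  fi

  # Fallback: send the full snapshot (e.g. the parent is missing on the standby).
  # A real script would also clean up any partial subvolume left by the failed attempt.
  $RUN "btrfs send $new_snap | $receive"
}
```

Setting `RUN=echo` before calling `replicate_snapshot` prints the pipeline that would run, which is a cheap way to review the commands before touching real snapshots.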

**Backups**:
- Kopia client on all nodes → Kopia server on fractal
- Backs up `/persist` hourly via btrfs snapshot
- Excludes: `services@*` and `services-standby/services@*` (replication snapshots)

**Secrets**:
- SOPS for secrets, files in `secrets/`
- Keys managed per-host

## Migration Status

**Phase**: 4 in progress (20/35 services migrated)
**Current**: Migrating services from GlusterFS → NFS
**Next**: Finish migrating remaining services, update host volumes, remove GlusterFS
**Later**: Convert fractal to NixOS (deferred)

See `docs/MIGRATION_TODO.md` for detailed checklist.

**IMPORTANT**: When working on migration tasks:
1. Always update `docs/MIGRATION_TODO.md` after completing each service migration
2. Update both the individual service checklist AND the summary counts at the bottom
3. Pattern: `/data/compute/appdata/foo` → `/data/services/foo` (NOT `/data/services/appdata/foo`!)
4. Migration workflow per service: stop → copy data → edit config → start → update MIGRATION_TODO.md
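
As a rough illustration of the per-service workflow, the sketch below prints the steps for one service without executing anything. The helper name `migration_plan` is hypothetical, and it assumes the Nomad job is named after the service and lives in `services/<name>.hcl`, and that `rsync` is an acceptable copy tool; none of that is confirmed by this document.

```shell
# Hypothetical helper: print (not execute) the per-service migration steps.
# Assumes the Nomad job is named after the service and lives in services/<name>.hcl.
migration_plan() {
  local svc="$1"
  local old="/data/compute/appdata/$svc"
  local new="/data/services/$svc"      # note: no "appdata" component
  local job="services/$svc.hcl"

  echo "nomad job stop $svc"                      # 1. stop
  echo "rsync -a $old/ $new/"                     # 2. copy data
  echo "# edit $job: $old -> $new"                # 3. edit config
  echo "nomad job run $job"                       # 4. start
  echo "# update docs/MIGRATION_TODO.md: tick $svc, bump summary counts"  # 5. record
}
```

Calling `migration_plan foo` gives a reviewable list of commands and reminders, in the same order as the workflow above.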

## Common Tasks

**Deploy a host**: `deploy -s '.#hostname'`
**Deploy all**: `deploy`
**Check replication**: `ssh zippy journalctl -u replicate-services-to-c1.service -f`
**NFS failover**: See `docs/NFS_FAILOVER.md`
**Nomad jobs**: `services/*.hcl` - update paths: `/data/compute/appdata/foo` → `/data/services/foo` (NOT `/data/services/appdata/foo`!)
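
The path rule for Nomad job specs can be expressed as a small filter plus a check for leftovers. This is a sketch under assumptions: `rewrite_paths` and `find_bad_paths` are hypothetical helper names, and it assumes the old prefix only ever appears as a literal `/data/compute/appdata/` string in the `.hcl` files.

```shell
# Hypothetical helper: rewrite old GlusterFS appdata paths on stdin to the NFS layout.
# /data/compute/appdata/foo becomes /data/services/foo (the "appdata" level is dropped).
rewrite_paths() {
  sed -e 's|/data/compute/appdata/|/data/services/|g'
}

# Hypothetical helper: list lines that still use the old path, or the mistaken
# /data/services/appdata/ form. Reads the given files, or stdin if none are given.
find_bad_paths() {
  grep -nE '/data/compute/appdata/|/data/services/appdata/' "$@"
}
```

For example, `rewrite_paths < services/foo.hcl` previews the edited job spec, and `find_bad_paths services/*.hcl` lists any lines that still need attention.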

## Troubleshooting Hints

- Replication errors with "empty stream": the replication SSH key is restricted to `btrfs receive` and can't run other commands
- NFS split-brain protection: nfs-server checks Consul before starting
- Btrfs snapshots: nested snapshots appear as empty dirs in parent snapshots
- Kopia: uses temporary snapshot for consistency, doesn't back up nested subvolumes
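
For the split-brain hint, one plausible shape of a "check Consul before starting" guard is sketched below. The real check lives in the repository's Nix modules and may work differently; the function name `may_start_nfs` and the rule it applies (start only if the service is unregistered or already points at this host) are assumptions for illustration.

```shell
# Hypothetical sketch of a "check Consul before starting" guard.
# registered: address Consul currently returns for the NFS service (empty if none)
# self:       this host's own address
may_start_nfs() {
  local registered="$1" self="$2"
  # Safe to start if nobody holds the service, or if this host already holds it.
  [ -z "$registered" ] || [ "$registered" = "$self" ]
}
```

A caller could feed it the result of a DNS lookup such as `dig +short data-services.service.consul | head -n1`, assuming the host resolves `.consul` names, and refuse to start nfs-server when the function returns non-zero.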

## Important Files

- `common/global/backup.nix` - Kopia backup configuration
- `hosts/zippy/default.nix` - NFS server config, replication targets
- `hosts/c1/default.nix` - NFS standby config, authorized replication keys
- `flake.nix` - Host definitions, nixpkgs inputs

---
*Auto-generated reference for Claude Code. Keep concise. Update when architecture changes.*