# NFS Services Failover Procedures

This document describes how to fail over the `/data/services` NFS server between hosts and how to fail back.

## Architecture Overview

- **Primary NFS Server**: Typically `zippy`
  - Exports `/persist/services` via NFS
  - Has local bind mount: `/data/services` → `/persist/services` (same path as clients)
  - Registers `data-services.service.consul` in Consul
  - Sets Nomad node meta: `storage_role = "primary"`
  - Replicates snapshots to standbys every 5 minutes via `btrfs send`
  - **Safety check**: Refuses to start if another NFS server is already active in Consul

- **Standby**: Typically `c1`
  - Receives snapshots at `/persist/services-standby/services@<timestamp>`
  - Can be promoted to NFS server during failover
  - No special Nomad node meta (not primary)

- **Clients**: All cluster nodes (c1, c2, c3, zippy)
  - Mount `/data/services` from `data-services.service.consul:/persist/services`
  - Automatically connect to whoever is registered in Consul
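
To see at a glance which host currently holds the primary role, you can ask Consul directly. A quick sketch, assuming the local Consul agent's DNS interface is on the default port 8600 and the `consul` CLI is installed:

```bash
# Which host is Consul currently advertising as the NFS server?
dig +short @localhost -p 8600 data-services.service.consul

# Same answer via the catalog (also shows the node name)
consul catalog nodes -service=data-services
```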
### Nomad Job Constraints

Jobs that need to run on the primary storage node should use:

```hcl
constraint {
  attribute = "${meta.storage_role}"
  value     = "primary"
}
```

This is useful for:

- Database jobs (mysql, postgres, redis) that benefit from local storage
- Jobs that need guaranteed fast disk I/O

During failover, the `storage_role = "primary"` meta attribute moves to the new NFS server, and Nomad automatically reschedules constrained jobs to the new primary.
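
A minimal sketch for confirming which node currently carries `storage_role = "primary"`, assuming the Nomad API is reachable on its default port 4646 and `jq` is installed:

```bash
# List every node with its storage_role meta value (or "-" if unset)
for id in $(curl -s http://localhost:4646/v1/nodes | jq -r '.[].ID'); do
  curl -s "http://localhost:4646/v1/node/$id" \
    | jq -r '[.Name, (.Meta.storage_role // "-")] | @tsv'
done
```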
## Prerequisites

- Standby has been receiving snapshots (check: `ls /persist/services-standby/services@*`)
- Last successful replication was recent: ideally within the 5-minute replication interval; anything much older than ~10 minutes needs investigation (see the check below)
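
A quick way to check both at once, in the same style as the steps below (a sketch; assumes the snapshot directory is readable via `sudo` and GNU `stat` is available on the standby):

```bash
ssh c1
# Newest received snapshot and its age in minutes
latest=$(sudo ls -dt /persist/services-standby/services@* | head -1)
echo "newest: $latest"
echo "age: $(( ( $(date +%s) - $(sudo stat -c %Y "$latest") ) / 60 )) minutes"
```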
---

## Failover: Promoting Standby to Primary

**Scenario**: `zippy` is down and you need to promote `c1` to be the NFS server.

### Step 1: Choose Latest Snapshot

On the standby (c1):

```bash
ssh c1
# -d lists the snapshot subvolumes themselves rather than their contents
sudo ls -ldt /persist/services-standby/services@* | head -5
```

Find the most recent snapshot. Note the timestamp to estimate data loss (typically < 5 minutes).
### Step 2: Promote Snapshot to Read-Write Subvolume

On c1:

```bash
# Find the latest snapshot (-d lists the snapshot directories, not their contents)
LATEST=$(sudo ls -dt /persist/services-standby/services@* | head -1)

# Create a writable subvolume from the snapshot
sudo btrfs subvolume snapshot "$LATEST" /persist/services

# Verify
ls -la /persist/services
```
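
Optionally, confirm the promoted subvolume really is read-write before pointing services at it; `btrfs property` reports the read-only flag:

```bash
# Should print "ro=false" for the newly promoted subvolume
sudo btrfs property get /persist/services ro
```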
### Step 3: Update NixOS Configuration

Edit your configuration to swap the NFS server role:

**In `hosts/c1/default.nix`**:

```nix
imports = [
  # ... existing imports ...
  # ../../common/nfs-services-standby.nix  # REMOVE THIS
  ../../common/nfs-services-server.nix      # ADD THIS
];

# Add standbys if desired (optional - can leave empty during emergency)
nfsServicesServer.standbys = []; # Or ["c2"] to add a new standby
```

**Optional: Prepare zippy config for when it comes back**:

In `hosts/zippy/default.nix` (can do this later too):

```nix
imports = [
  # ... existing imports ...
  # ../../common/nfs-services-server.nix    # REMOVE THIS
  ../../common/nfs-services-standby.nix     # ADD THIS
];

# Add the replication key from c1 (get it from c1:/persist/root/.ssh/btrfs-replication.pub)
nfsServicesStandby.replicationKeys = [
  "ssh-ed25519 AAAA... root@c1-replication"
];
```
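
To grab that replication key without hunting for it interactively, a one-liner like this should do (same key path as in the comment above):

```bash
# Print c1's replication public key so it can be pasted into the zippy config
ssh c1 sudo cat /persist/root/.ssh/btrfs-replication.pub
```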
### Step 4: Deploy Configuration

```bash
# From your workstation
deploy -s '.#c1'

# If zippy is still down, updating its config will fail, but that's okay
# You can update it later when it comes back
```
### Step 5: Verify NFS Server is Running

On c1:

```bash
sudo systemctl status nfs-server
sudo showmount -e localhost
dig @localhost -p 8600 data-services.service.consul  # Should show c1's IP
```
### Step 6: Verify Clients Can Access

From any node:

```bash
df -h | grep services
ls /data/services
```

The mount should automatically reconnect via Consul DNS.
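
To check several clients in one pass rather than node by node, a small loop works (a sketch; adjust the host list to whichever nodes are up):

```bash
# Confirm the NFS mount is present and readable on each client
for host in c1 c2 c3; do
  echo "== $host =="
  ssh "$host" 'df -h | grep services && ls /data/services > /dev/null && echo OK'
done
```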
### Step 7: Check Nomad Jobs

```bash
nomad job status mysql
nomad job status postgres
# Verify critical services are healthy

# Jobs constrained to ${meta.storage_role} = "primary" will automatically
# reschedule to c1 once it's deployed with the NFS server module
```

- **Recovery Time Objective (RTO)**: ~10-15 minutes
- **Recovery Point Objective (RPO)**: Last replication interval (5 minutes max)

**Note**: Jobs with the `storage_role = "primary"` constraint will automatically move to c1 because it now has that node meta attribute. No job spec changes needed!
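
If you don't want to wait for Nomad's own scheduling loop, you can nudge the affected jobs explicitly. A hedged sketch, using the example job names from above (`-force-reschedule` also retries failed allocations):

```bash
# Trigger a fresh scheduling evaluation for the storage-constrained jobs
nomad job eval -force-reschedule mysql
nomad job eval -force-reschedule postgres
```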
---

## What Happens When zippy Comes Back?

**IMPORTANT**: If zippy reboots while still configured as NFS server, it will **refuse to start** the NFS service because it detects that c1 is already active in Consul.

You'll see this error in `journalctl -u nfs-server`:

```
ERROR: Another NFS server is already active at 192.168.1.X
This host (192.168.1.2) is configured as NFS server but should be standby.
To fix:
1. If this is intentional (failback), first demote the other server
2. Update this host's config to use nfs-services-standby.nix instead
3. Sync data from active server before promoting this host
```

This is a **safety feature** to prevent split-brain and data corruption.

### Options when zippy comes back:

**Option A: Keep c1 as primary** (zippy becomes standby)

1. Update zippy's config to use `nfs-services-standby.nix` (with c1's replication key in `nfsServicesStandby.replicationKeys`)
2. Add `"zippy"` to `nfsServicesServer.standbys` in c1's config so replication targets it
3. Deploy both hosts
4. c1 will start replicating to zippy

**Option B: Fail back to zippy as primary**

Follow the "Failing Back to Original Primary" procedure below.

---
## Failing Back to Original Primary

**Scenario**: `zippy` is repaired and you want to move the NFS server role back from `c1` to `zippy`.

### Step 1: Sync Latest Data from c1 to zippy

Consider stopping or draining jobs that write to `/data/services` before taking the failback snapshot; anything written after the snapshot will not make it to zippy.

On c1 (current primary):

```bash
# Create readonly snapshot of current state
sudo btrfs subvolume snapshot -r /persist/services /persist/services@failback-$(date +%Y%m%d-%H%M%S)

# Find the snapshot
FAILBACK=$(sudo ls -dt /persist/services@failback-* | head -1)

# Send to zippy (use root SSH key if available, or generate temporary key)
sudo btrfs send "$FAILBACK" | ssh root@zippy "btrfs receive /persist/"
```

On zippy:

```bash
# Verify snapshot arrived
ls -la /persist/services@failback-*

# If an old /persist/services subvolume still exists from before the outage,
# move it aside first (otherwise the snapshot below lands inside it), e.g.:
sudo mv /persist/services /persist/services.old

# Create writable subvolume from the snapshot
FAILBACK=$(ls -dt /persist/services@failback-* | head -1)
sudo btrfs subvolume snapshot "$FAILBACK" /persist/services

# Verify
ls -la /persist/services
```
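
As a rough sanity check before switching roles, compare the tree sizes on both hosts (a sketch; the numbers won't match exactly, but they should be in the same ballpark):

```bash
# Totals should be close on both sides after the send completes
ssh c1 sudo du -sh /persist/services
ssh zippy sudo du -sh /persist/services
```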
### Step 2: Update NixOS Configuration

Swap the roles back:

**In `hosts/zippy/default.nix`**:

```nix
imports = [
  # ... existing imports ...
  # ../../common/nfs-services-standby.nix  # REMOVE THIS
  ../../common/nfs-services-server.nix      # ADD THIS
];

nfsServicesServer.standbys = ["c1"];
```

**In `hosts/c1/default.nix`**:

```nix
imports = [
  # ... existing imports ...
  # ../../common/nfs-services-server.nix    # REMOVE THIS
  ../../common/nfs-services-standby.nix     # ADD THIS
];

nfsServicesStandby.replicationKeys = [
  "ssh-ed25519 AAAA... root@zippy-replication" # Get from zippy:/persist/root/.ssh/btrfs-replication.pub
];
```
### Step 3: Deploy Configurations

```bash
# IMPORTANT: Deploy c1 FIRST to demote it
deploy -s '.#c1'

# Wait for c1 to stop the NFS server
ssh c1 sudo systemctl status nfs-server  # Should be inactive

# Then deploy zippy to promote it
deploy -s '.#zippy'
```

The order matters! If you deploy zippy first, it will see that c1 is still active and refuse to start.
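
Between the two deploys you can also confirm, from Consul's point of view, that c1 has actually dropped the registration (a quick sketch; an empty answer is what you want before promoting zippy):

```bash
# Should return no address once c1 has deregistered data-services
dig +short @localhost -p 8600 data-services.service.consul
```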
### Step 4: Verify Failback

Check that Consul DNS points to zippy:

```bash
dig @c1 -p 8600 data-services.service.consul  # Should show zippy's IP
```

Check that clients are mounting from zippy:

```bash
for host in c1 c2 c3; do
  ssh $host "df -h | grep services"
done
```
### Step 5: Clean Up Temporary Snapshots

On c1 (only once clients are confirmed to be mounting from zippy):

```bash
# Remove the failback snapshot and the promoted subvolume
sudo btrfs subvolume delete /persist/services@failback-*
sudo btrfs subvolume delete /persist/services
```

---
## Adding a New Standby

**Scenario**: You want to add `c2` as an additional standby.

### Step 1: Create Standby Subvolume on c2

```bash
ssh c2
sudo btrfs subvolume create /persist/services-standby
```

### Step 2: Update c2 Configuration

**In `hosts/c2/default.nix`**:

```nix
imports = [
  # ... existing imports ...
  ../../common/nfs-services-standby.nix
];

nfsServicesStandby.replicationKeys = [
  "ssh-ed25519 AAAA... root@zippy-replication" # Get from current NFS server
];
```

### Step 3: Update NFS Server Configuration

On the current NFS server (e.g., zippy), update the standbys list:

**In `hosts/zippy/default.nix`**:

```nix
nfsServicesServer.standbys = ["c1" "c2"]; # Added c2
```

### Step 4: Deploy

```bash
deploy -s '.#c2'
deploy -s '.#zippy'
```

The next replication cycle (within 5 minutes) will do a full send to c2, then switch to incremental.
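
After that first cycle, it's worth confirming the full send actually landed on c2. A sketch, assuming the replication unit follows the `replicate-services-to-<host>` naming used elsewhere in this document:

```bash
# Snapshots should start appearing on the new standby
ssh c2 'sudo ls -ldt /persist/services-standby/services@* | head -3'

# And the send should show up in the server-side logs
ssh zippy 'sudo journalctl -u replicate-services-to-c2 --since "1 hour ago"'
```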
---

## Troubleshooting

### Replication Failed

Check the replication service logs:

```bash
# On the NFS server
sudo journalctl -u replicate-services-to-c1 -f
```

Common issues:

- SSH key not found → Run key generation step (see stateful-commands.txt)
- Permission denied → Check authorized_keys on the standby
- Snapshot already exists → Old snapshot with the same timestamp; wait for the next cycle
### Clients Can't Mount

Check Consul:

```bash
dig @localhost -p 8600 data-services.service.consul
consul catalog services | grep data-services
```

If Consul isn't resolving:

- NFS server might not have registered → Check `sudo systemctl status nfs-server`
- Consul agent might be down → Check `sudo systemctl status consul`
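
If Consul resolves but clients still can't mount, a quick reachability probe of the NFS port from a client can narrow things down (a sketch; assumes `nc` from netcat is installed):

```bash
# Resolve the active server via Consul DNS, then probe TCP/2049 (NFS)
server=$(dig +short @localhost -p 8600 data-services.service.consul | head -1)
nc -zv "$server" 2049
```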
### Mount is Stale

Force remount:

```bash
sudo systemctl restart data-services.mount
```

Or unmount and let automount handle it:

```bash
sudo umount /data/services
ls /data/services  # Triggers automount
```

### Split-Brain Prevention: NFS Server Won't Start

If you see:

```
ERROR: Another NFS server is already active at 192.168.1.X
```

This is **intentional** - the safety check is working! You have two options:

1. **Keep the other server as primary**: Update this host's config to be a standby instead
2. **Fail back to this host**: First demote the other server, sync data, then deploy both hosts in the correct order

---
## Monitoring

### Check Replication Status

On the NFS server:

```bash
# List recent snapshots (-d lists the snapshot subvolumes themselves)
sudo ls -ldt /persist/services@* | head

# Check last replication run
sudo systemctl status replicate-services-to-c1

# Check replication logs
sudo journalctl -u replicate-services-to-c1 --since "1 hour ago"
```

On the standby:

```bash
# List received snapshots
sudo ls -ldt /persist/services-standby/services@* | head

# Check how old the latest snapshot is
sudo stat -c '%y' "$(sudo ls -dt /persist/services-standby/services@* | head -1)"
```
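
If the replication units are timer-driven (an assumption about how the module schedules the 5-minute cycle), `systemctl list-timers` gives a quick last-run / next-run summary:

```bash
# When did replication last fire, and when is the next run due?
systemctl list-timers | grep -i replicate
```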
### Verify NFS Exports

```bash
sudo showmount -e localhost
```

Should show:

```
/persist/services 192.168.1.0/24
```

### Check Consul Registration

```bash
consul catalog services | grep data-services
dig @localhost -p 8600 data-services.service.consul
```