CI/CD Setup for Nomad Services
Guide for adding automated builds and deployments to a service.
Prerequisites
1. Service Repository
Your service needs a flake.nix that exports a Docker image:
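{
  outputs = { self, nixpkgs, ... }:
    let
      # pkgs must be bound before use; x86_64-linux matches the CI runner
      pkgs = nixpkgs.legacyPackages.x86_64-linux;
    in {
      # The workflow looks for this output by default
      dockerImage = pkgs.dockerTools.buildImage {
        name = "gitea.v.paler.net/ppetru/<service>";
        tag = "latest";
        # ... image config
      };
    };
}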
Important: Use extraCommands instead of runAsRoot in your Docker build. runAsRoot runs its commands inside a VM, which requires KVM, and the CI runner doesn't have KVM.
2. Nomad Job
Your job in services/<name>.hcl needs:
job "<service>" {
  # Required: UUID changes trigger deployments
  meta {
    uuid = uuidv4()
  }
  # Required: enables deployment tracking and auto-rollback
  update {
    max_parallel     = 1
    health_check     = "checks"
    min_healthy_time = "30s"
    healthy_deadline = "5m"
    auto_revert      = true
  }
  # Tasks must sit inside a group; the group name here is a placeholder
  group "<service>" {
    # Required: pulls new image on each deployment
    task "app" {
      config {
        force_pull = true
      }
      # Recommended: health check for deployment validation
      service {
        check {
          type     = "http"
          path     = "/healthz"
          interval = "10s"
          timeout  = "5s"
        }
      }
    }
  }
}
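Before wiring up CI, it can help to confirm that a job file contains the three required settings listed above. The following is a minimal sketch, not part of the shared workflow: check_job_file is a hypothetical helper that does a plain text search (it does not parse HCL, so a commented-out setting would still pass).

```shell
# check_job_file FILE
# Hypothetical helper: greps a Nomad job file for the settings this guide
# marks as required. Prints one line per missing setting and returns
# non-zero if any are missing. Text search only; it does not parse HCL.
check_job_file() {
  file=$1
  missing=0
  grep -Fq 'uuidv4()' "$file" \
    || { echo "missing: meta uuid = uuidv4()"; missing=1; }
  grep -Eq 'auto_revert[[:space:]]*=[[:space:]]*true' "$file" \
    || { echo "missing: update auto_revert = true"; missing=1; }
  grep -Eq 'force_pull[[:space:]]*=[[:space:]]*true' "$file" \
    || { echo "missing: config force_pull = true"; missing=1; }
  return "$missing"
}
```

Usage would look like check_job_file services/<name>.hcl, run from the alo-cluster checkout.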
Quick Start
1. Create Workflow
Add .gitea/workflows/deploy.yaml to your service repo:
name: Deploy
on:
  push:
    branches: [master]
  workflow_dispatch:
jobs:
  deploy:
    uses: ppetru/alo-cluster/.gitea/workflows/deploy-nomad.yaml@master
    with:
      service_name: <your-service>  # Must match Nomad job ID
    secrets: inherit
2. Add Secrets
In Gitea → Your Repo → Settings → Actions → Secrets, add:
| Secret | Value |
|---|---|
| REGISTRY_USERNAME | ppetru |
| REGISTRY_PASSWORD | Gitea access token with packages:write |
| NOMAD_ADDR | http://nomad.service.consul:4646 |
3. Push
Push to the master branch. The workflow will:
- Build your Docker image with Nix
- Push to Gitea registry
- Update the Nomad job to trigger deployment
- Monitor until deployment succeeds or fails
Workflow Options
The shared workflow accepts these inputs:
| Input | Default | Description |
|---|---|---|
| service_name | (required) | Nomad job ID |
| flake_output | dockerImage | Flake output to build |
| registry | gitea.v.paler.net | Container registry |
Example with custom flake output:
jobs:
  deploy:
    uses: ppetru/alo-cluster/.gitea/workflows/deploy-nomad.yaml@master
    with:
      service_name: myservice
      flake_output: packages.x86_64-linux.docker
    secrets: inherit
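To make the effect of these inputs concrete, the sketch below shows how they might map to a flake build target and an image reference. This is an assumption about how the inputs are consumed, not the shared workflow's actual code. The helper names flake_target and image_ref are hypothetical; the ppetru owner and latest tag are taken from the image reference used elsewhere in this guide.

```shell
# Hypothetical helpers mirroring the documented inputs and their defaults.

# flake_target [FLAKE_OUTPUT]
# Prints the installable passed to `nix build` (default output: dockerImage).
flake_target() {
  printf '.#%s\n' "${1:-dockerImage}"
}

# image_ref SERVICE [REGISTRY]
# Prints the image reference for a service (default registry: gitea.v.paler.net).
# The "ppetru" owner and "latest" tag follow the reference shown in this guide.
image_ref() {
  service=$1
  registry=${2:-gitea.v.paler.net}
  printf '%s/ppetru/%s:latest\n' "$registry" "$service"
}
```

With the custom output from the example above, flake_target packages.x86_64-linux.docker prints .#packages.x86_64-linux.docker, which is the form nix build expects.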
How It Works
Push to master
↓
Build: nix build .#dockerImage
↓
Push: skopeo → gitea.v.paler.net/ppetru/<service>:latest
↓
Deploy: Update job meta.uuid → Nomad creates deployment
↓
Monitor: Poll deployment status for up to 5 minutes
↓
Success: Deployment healthy
OR
Failure: Nomad auto-reverts to previous version
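The monitor step can be approximated locally with a small polling loop. This is a sketch under stated assumptions, not the shared workflow's implementation: it assumes the nomad CLI and jq are installed, that nomad job deployments -latest -json prints a single deployment object with a Status field, and the function names fetch_status and wait_for_deployment are hypothetical.

```shell
# fetch_status JOB
# Assumption: the latest deployment is printed as one JSON object with a
# "Status" field (for example running, successful, failed, cancelled).
fetch_status() {
  nomad job deployments -latest -json "$1" | jq -r '.Status'
}

# wait_for_deployment JOB [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Polls until the deployment finishes or the timeout elapses.
# Returns 0 on success, 1 on failed/cancelled, 2 on timeout.
wait_for_deployment() {
  job=$1
  timeout=${2:-300}   # 5 minutes, matching the limit described above
  interval=${3:-5}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    status=$(fetch_status "$job")
    case "$status" in
      successful) return 0 ;;
      failed|cancelled) return 1 ;;
    esac
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 2
}
```

Keeping the status lookup in its own function means the loop can be exercised with a stub, and the timeout return code stays distinct from a failed deployment so a caller can tell the two apart.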
Troubleshooting
Build fails with KVM error
Required system: 'x86_64-linux' with features {kvm}
Use extraCommands instead of runAsRoot in your docker.nix:
# Bad - requires KVM
runAsRoot = ''
  mkdir -p /tmp
'';
# Good - no KVM needed
extraCommands = ''
  mkdir -p tmp
  chmod 1777 tmp
'';
No deployment created
Ensure your Nomad job has the update stanza with auto_revert = true.
Image not updating
Check that force_pull = true is set in the Nomad job's Docker config.
Deployment fails health checks
- Check your /healthz endpoint works
- Increase healthy_deadline if startup is slow
- Check nomad alloc logs <alloc-id> for errors
Workflow can't access alo-cluster
If Gitea can't pull the reusable workflow, you may need to make alo-cluster public or use a token. As a fallback, copy the workflow content directly.
Manual Deployment
If CI fails, you can deploy manually:
cd <service-repo>
nix build .#dockerImage
skopeo copy --dest-authfile ~/.docker/config.json \
  docker-archive:result \
  docker://gitea.v.paler.net/ppetru/<service>:latest
nomad run /path/to/alo-cluster/services/<service>.hcl
Rollback
Nomad auto-reverts on health check failure. For manual rollback:
nomad job history <service> # List versions
nomad job revert <service> <version> # Revert to specific version