# CI/CD Setup for Nomad Services

Guide for adding automated builds and deployments to a service.

## Prerequisites

### 1. Service Repository

Your service needs a `flake.nix` that exports a Docker image:

```nix
{
  outputs = { self, nixpkgs, ... }:
    let
      # pkgs must be bound before use; x86_64-linux is assumed here
      pkgs = nixpkgs.legacyPackages.x86_64-linux;
    in {
      # The workflow looks for this output by default
      dockerImage = pkgs.dockerTools.buildImage {
        name = "gitea.v.paler.net/alo/<service>";
        tag = "latest";
        # ... image config
      };
    };
}
```

**Important**: Use `extraCommands` instead of `runAsRoot` in your Docker build - the CI runner doesn't have KVM.

### 2. Nomad Job

Your job in `services/<name>.hcl` needs:

```hcl
job "<service>" {
  # Required: UUID changes trigger deployments
  meta {
    uuid = uuidv4()
  }

  # Required: enables deployment tracking and auto-rollback
  update {
    max_parallel     = 1
    health_check     = "checks"
    min_healthy_time = "30s"
    healthy_deadline = "5m"
    auto_revert      = true
  }

  # Required: pulls new image on each deployment
  task "app" {
    config {
      force_pull = true
    }

    # Recommended: health check for deployment validation
    service {
      check {
        type     = "http"
        path     = "/healthz"
        interval = "10s"
        timeout  = "5s"
      }
    }
  }
}
```

## Quick Start

### 1. Create Workflow

Add `.gitea/workflows/deploy.yaml` to your service repo:

```yaml
name: Deploy

on:
  push:
    branches: [master]
  workflow_dispatch:

jobs:
  deploy:
    uses: alo/alo-cluster/.gitea/workflows/deploy-nomad.yaml@master
    with:
      service_name: <your-service>  # Must match Nomad job ID
    secrets: inherit
```

### 2. Add Secrets

In Gitea → Your Repo → Settings → Actions → Secrets, add:

| Secret | Value |
|--------|-------|
| `REGISTRY_USERNAME` | Your Gitea username |
| `REGISTRY_PASSWORD` | Gitea access token with `packages:write` |
| `NOMAD_ADDR` | `http://nomad.service.consul:4646` |

### 3. Push

Push to the `master` branch. The workflow will:

1. Build your Docker image with Nix
2. Push it to the Gitea registry
3. Update the Nomad job to trigger a deployment
4. Monitor until the deployment succeeds or fails

## Workflow Options

The shared workflow accepts these inputs:

| Input | Default | Description |
|-------|---------|-------------|
| `service_name` | (required) | Nomad job ID |
| `flake_output` | `dockerImage` | Flake output to build |
| `registry` | `gitea.v.paler.net` | Container registry |

Example with custom flake output:

```yaml
jobs:
  deploy:
    uses: alo/alo-cluster/.gitea/workflows/deploy-nomad.yaml@master
    with:
      service_name: myservice
      flake_output: packages.x86_64-linux.docker
    secrets: inherit
```
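
As a rough illustration of how these inputs plausibly map onto the build target and the image reference, the sketch below applies the defaults from the table. The function names `flake_target` and `image_ref` are hypothetical, and the `alo` owner segment and `latest` tag are taken from the examples in this guide rather than from the shared workflow itself, which may compute them differently.

```bash
# Hypothetical helpers; the shared workflow may compute these differently.

# Build target passed to `nix build`, defaulting to the dockerImage output.
flake_target() {
  local flake_output="${1:-dockerImage}"
  printf '.#%s\n' "$flake_output"
}

# Full image reference; the registry defaults to gitea.v.paler.net.
# The "alo" owner and "latest" tag are assumed from this guide's examples.
image_ref() {
  local service_name="$1"
  local registry="${2:-gitea.v.paler.net}"
  printf '%s/alo/%s:latest\n' "$registry" "$service_name"
}
```

For the example above, `flake_target packages.x86_64-linux.docker` yields `.#packages.x86_64-linux.docker`, and `image_ref myservice` yields `gitea.v.paler.net/alo/myservice:latest`.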

## How It Works

```
Push to master
    ↓
Build: nix build .#dockerImage
    ↓
Push: skopeo → gitea.v.paler.net/alo/<service>:latest
    ↓
Deploy: Update job meta.uuid → Nomad creates deployment
    ↓
Monitor: Poll deployment status for up to 5 minutes
    ↓
Success: Deployment healthy
    OR
Failure: Nomad auto-reverts to previous version
```
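
The monitor step can be approximated from a shell. This is a minimal sketch, not the shared workflow's actual implementation: it assumes `nomad` and `jq` are installed, and that `nomad job deployments -latest -json` prints a single deployment object with a `Status` field. The function names and the 10-second poll interval are hypothetical.

```bash
# Map a Nomad deployment status to what the poller should do next.
deployment_verdict() {
  case "$1" in
    successful) echo ok ;;
    failed|cancelled) echo fail ;;
    *) echo wait ;;   # running, paused, pending, or not yet available
  esac
}

# Poll the latest deployment of a job for up to 5 minutes (30 x 10s).
wait_for_deployment() {
  job="$1"
  attempt=0
  while [ "$attempt" -lt 30 ]; do
    # Assumption: -latest -json emits one object with a Status field.
    status=$(nomad job deployments -latest -json "$job" | jq -r '.Status')
    case "$(deployment_verdict "$status")" in
      ok) echo "Deployment healthy"; return 0 ;;
      fail) echo "Deployment $status" >&2; return 1 ;;
    esac
    sleep 10
    attempt=$((attempt + 1))
  done
  echo "Timed out waiting for deployment" >&2
  return 1
}
```

Calling `wait_for_deployment <service>` after updating the job returns non-zero on failure or timeout, which is enough to fail a CI step.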

## Troubleshooting

### Build fails with KVM error

```
Required system: 'x86_64-linux' with features {kvm}
```

Use `extraCommands` instead of `runAsRoot` in your `docker.nix`:

```nix
# Bad - requires KVM
runAsRoot = ''
  mkdir -p /tmp
'';

# Good - no KVM needed
extraCommands = ''
  mkdir -p tmp
  chmod 1777 tmp
'';
```

### No deployment created

Ensure your Nomad job has the `update` stanza with `auto_revert = true`.

### Image not updating

Check that `force_pull = true` is set in the Nomad job's Docker config.

### Deployment fails health checks

- Check your `/healthz` endpoint works
- Increase `healthy_deadline` if startup is slow
- Check `nomad alloc logs <alloc-id>` for errors
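
These checks can be run by hand. The sketch below is one possible sequence, assuming `curl` and the `nomad` CLI are available. The `is_passing` and `probe_healthz` names are hypothetical, and treating only 2xx responses as passing is an assumption about how the HTTP check is evaluated.

```bash
# Succeeds only for 2xx HTTP status codes (assumed passing range).
is_passing() {
  [ "$1" -ge 200 ] 2>/dev/null && [ "$1" -lt 300 ]
}

# Probe /healthz with the same 5s timeout as the job's check block.
probe_healthz() {
  base_url="$1"   # e.g. http://<alloc-ip>:<port>
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$base_url/healthz")
  if is_passing "$code"; then
    echo "healthy ($code)"
  else
    echo "unhealthy ($code)" >&2
    return 1
  fi
}

# To find the failing allocation and read its logs:
#   nomad job status <service>
#   nomad alloc status <alloc-id>
#   nomad alloc logs -stderr <alloc-id>
```

If the probe passes from the host but the deployment still fails, the likely suspects are a slow startup (raise `healthy_deadline`) or a check pointed at the wrong port.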

### Workflow can't access alo-cluster

If Gitea can't pull the reusable workflow, you may need to make alo-cluster public or use a token. As a fallback, copy the workflow content directly.

## Manual Deployment

If CI fails, you can deploy manually:

```bash
cd <service-repo>
nix build .#dockerImage
skopeo copy --dest-authfile ~/.docker/config.json \
  docker-archive:result \
  docker://gitea.v.paler.net/alo/<service>:latest
nomad run /path/to/alo-cluster/services/<service>.hcl
```

## Rollback

Nomad auto-reverts on health check failure. For manual rollback:

```bash
nomad job history <service>           # List versions
nomad job revert <service> <version>  # Revert to specific version
```