Migrating from Docker Compose
This guide helps you convert an existing docker-compose.yaml into an hpc-compose spec for Slurm clusters with Enroot and Pyxis.
At a glance
| Docker Compose feature | hpc-compose equivalent |
|---|---|
| `image` | `image` (same syntax, auto-prefixed with `docker://`) |
| `command` | `command` (string or list, same syntax) |
| `entrypoint` | `entrypoint` (string or list, same syntax) |
| `environment` | `environment` (map or list, same syntax) |
| `volumes` | `volumes` (`host:container` bind mounts, same syntax) |
| `depends_on` | `depends_on` (list or map with `condition: service_started` / `service_healthy`) |
| `working_dir` | `working_dir` (requires explicit `command` or `entrypoint`) |
| `build` | Not supported. Use `image` + `x-enroot.prepare.commands` instead. |
| `ports` | Not supported. Use host networking semantics instead. `127.0.0.1` works only when both sides run on the same node. |
| `networks` / `network_mode` | Not supported. There is no Docker-style overlay network or service-name DNS layer. |
| `restart` | Not supported as a Compose key. Use `services.<name>.x-slurm.failure_policy`. |
| `deploy` | Not supported. Use `x-slurm` for resource allocation. |
| `healthcheck` | Supported for a constrained TCP/HTTP subset and normalized into `readiness`; use explicit `readiness` for anything more complex. |
| Resource limits (`cpus`, `mem_limit`) | Use `x-slurm.cpus_per_task`, `x-slurm.mem`, `x-slurm.gpus` |
Side-by-side: web app + Redis
Docker Compose
version: "3.9"
services:
  redis:
    image: redis:7
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5
  app:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      redis:
        condition: service_healthy
    environment:
      REDIS_HOST: redis
    volumes:
      - ./app:/workspace
    working_dir: /workspace
    command: python -m main
hpc-compose
name: my-app
x-slurm:
  job_name: my-app
  time: "01:00:00"
  mem: 8G
  cpus_per_task: 4
  cache_dir: /shared/$USER/hpc-compose-cache
services:
  redis:
    image: redis:7
    command: redis-server --save "" --appendonly no
    readiness:
      type: tcp
      host: 127.0.0.1
      port: 6379
      timeout_seconds: 30
  app:
    image: python:3.11-slim
    depends_on:
      redis:
        condition: service_healthy
    environment:
      REDIS_HOST: 127.0.0.1
    volumes:
      - ./app:/workspace
    working_dir: /workspace
    command: python -m main
    x-enroot:
      prepare:
        commands:
          - pip install --no-cache-dir redis fastapi uvicorn
Key changes
- `build: .` → `image: python:3.11-slim` + `x-enroot.prepare.commands` for dependencies.
- `ports` → Removed. Services communicate via `127.0.0.1` because they run on the same node.
- `REDIS_HOST: redis` → `REDIS_HOST: 127.0.0.1`. No DNS service names; use localhost.
- `healthcheck` → `readiness` with `type: tcp`.
- Added `x-slurm` block for Slurm resource allocation (time, memory, CPUs).
- Added `x-slurm.cache_dir` for shared image storage.
Key differences
Networking
Docker Compose creates isolated networks where services find each other by name. In hpc-compose, helper services on the same node share the host network directly, and multi-node distributed steps must use explicit rendezvous addresses. Replace service hostnames with 127.0.0.1 only when both sides intentionally stay on one node. For multi-node runs, derive the rendezvous host from /hpc-compose/job/allocation/primary_node or HPC_COMPOSE_PRIMARY_NODE.
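Application code that used to connect to a service name can pick its peer address at startup instead. The following is a minimal sketch, assuming `HPC_COMPOSE_PRIMARY_NODE` is visible as an environment variable inside the service; the function name `rendezvous_host` and the loopback fallback are illustrative choices, not part of hpc-compose.

```python
import os
from typing import Mapping, Optional


def rendezvous_host(env: Optional[Mapping[str, str]] = None,
                    same_node_default: str = "127.0.0.1") -> str:
    """Pick the address peers should connect to.

    Uses HPC_COMPOSE_PRIMARY_NODE when it is set and non-empty. Otherwise
    falls back to loopback, which is only valid when both sides run on
    the same node (an assumption of this sketch).
    """
    env = os.environ if env is None else env
    primary = env.get("HPC_COMPOSE_PRIMARY_NODE", "").strip()
    return primary or same_node_default


if __name__ == "__main__":
    # Example: build a Redis URL without relying on service-name DNS.
    print(f"redis://{rendezvous_host()}:6379")
```

Passing the environment as a parameter keeps the helper easy to test outside a Slurm allocation.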
Building images
Docker Compose uses build: to build an image from a Dockerfile. hpc-compose starts from a base image and customizes it with x-enroot.prepare.commands instead:
# Docker Compose
app:
  build:
    context: .
    dockerfile: Dockerfile
# hpc-compose
app:
  image: python:3.11-slim
  x-enroot:
    prepare:
      commands:
        - pip install --no-cache-dir -r /tmp/requirements.txt
      mounts:
        - ./requirements.txt:/tmp/requirements.txt
Prefer volumes for fast-changing source code and x-enroot.prepare.commands for slower-changing dependencies.
Health checks vs readiness
Docker Compose uses healthcheck with a test command, interval, timeout, and retries. hpc-compose accepts a constrained healthcheck subset and normalizes it into readiness. You can also write readiness explicitly in one of these forms:
# TCP: wait for a port to accept connections
readiness:
  type: tcp
  host: 127.0.0.1
  port: 6379
  timeout_seconds: 30
# Log: wait for a pattern in service output
readiness:
  type: log
  pattern: "Server started"
  timeout_seconds: 60
# Sleep: fixed delay
readiness:
  type: sleep
  seconds: 5
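To make "wait for a port to accept connections" concrete, here is a rough Python sketch of what a TCP readiness probe does conceptually. It is not hpc-compose's implementation; the function name `wait_for_tcp` and the `poll_interval` parameter are hypothetical, and only `host`, `port`, and `timeout_seconds` mirror keys from the spec.

```python
import socket
import time


def wait_for_tcp(host: str, port: int, timeout_seconds: float,
                 poll_interval: float = 0.2) -> bool:
    """Return True once host:port accepts a connection, False on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            # A successful connect is the whole check; close it right away.
            with socket.create_connection((host, port),
                                          timeout=min(remaining, 1.0)):
                return True
        except OSError:
            # Refused or timed out: wait briefly, then try again.
            time.sleep(min(poll_interval, max(remaining, 0.0)))


if __name__ == "__main__":
    ready = wait_for_tcp("127.0.0.1", 6379, timeout_seconds=30)
    print("redis ready" if ready else "redis not ready after 30s")
```

A probe like this only proves that something is listening on the port. If the service opens its port before it can serve requests, a log-based check may be the better fit.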
Supported healthcheck migration patterns:
- `["CMD", "nc", "-z", HOST, PORT]`
- `["CMD-SHELL", "nc -z HOST PORT"]`
- recognized `curl` probes against `http://` or `https://` URLs
- recognized `wget --spider` probes against `http://` or `https://` URLs
Still unsupported in v1:
- arbitrary custom command probes
- `interval`
- `retries`
- `start_period`
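If you are converting many files by hand, a small helper can tell you which healthchecks map cleanly onto a `tcp` readiness block. The sketch below handles only the two `nc -z` shapes listed above and returns `None` for everything else. It illustrates the mapping and is not hpc-compose's own normalizer; the function name and the default `timeout_seconds=30` are assumptions.

```python
import shlex
from typing import Optional, Sequence, Union


def healthcheck_to_tcp_readiness(test: Union[str, Sequence],
                                 timeout_seconds: int = 30) -> Optional[dict]:
    """Map an `nc -z HOST PORT` healthcheck test to a tcp readiness dict.

    Returns None for any other probe, meaning the readiness block has to
    be written by hand.
    """
    if isinstance(test, str):
        # Compose treats a bare string test like CMD-SHELL.
        argv = shlex.split(test)
    elif len(test) == 2 and test[0] == "CMD-SHELL":
        argv = shlex.split(str(test[1]))
    elif len(test) >= 1 and test[0] == "CMD":
        # YAML may parse the port as an int, so normalize to strings.
        argv = [str(part) for part in test[1:]]
    else:
        return None

    if len(argv) != 4 or argv[0] != "nc" or argv[1] != "-z":
        return None
    host, port = argv[2], argv[3]
    if not port.isdigit():
        return None
    return {
        "type": "tcp",
        "host": host,
        "port": int(port),
        "timeout_seconds": timeout_seconds,
    }
```

A `redis-cli ping` test, as in the Redis example earlier, falls outside this subset and returns `None`, which is why that example was rewritten with an explicit `readiness` block.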
Resource allocation
Docker Compose uses deploy.resources or the service-level cpus and mem_limit keys. hpc-compose uses Slurm-native resource settings, written under x-slurm at the top level and, where needed, per service:
x-slurm:
  time: "02:00:00"
  mem: 32G
  cpus_per_task: 8
  gpus: 1
services:
  app:
    x-slurm:
      cpus_per_task: 4
      gpus: 1
Restart policies
Docker Compose supports restart: always, on-failure, etc. hpc-compose does not accept the Compose restart: key, but it does support per-service restart behavior through services.<name>.x-slurm.failure_policy.
services:
  app:
    image: python:3.11-slim
    x-slurm:
      failure_policy:
        mode: restart_on_failure
        max_restarts: 3
        backoff_seconds: 5
restart_on_failure retries only on non-zero exits. Use mode: fail_job (default) for fail-fast behavior, or mode: ignore for non-critical sidecars.
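The interaction between `max_restarts` and `backoff_seconds` is easier to reason about as a loop. This Python sketch models the behavior described above under one loudly stated assumption: `max_restarts` counts retries after the first attempt, so at most `max_restarts + 1` runs happen. The function name and the injectable `sleep` parameter are hypothetical and exist only to make the model testable.

```python
import time
from typing import Callable


def run_with_restart_on_failure(run_once: Callable[[], int],
                                max_restarts: int = 3,
                                backoff_seconds: float = 5.0,
                                sleep: Callable[[float], None] = time.sleep) -> int:
    """Run `run_once` until it exits 0 or the restart budget is spent.

    Returns the last exit code. Assumes max_restarts counts retries after
    the first attempt (an assumption of this sketch, not a documented rule).
    """
    restarts = 0
    while True:
        code = run_once()
        # A zero exit never triggers a restart; neither does an empty budget.
        if code == 0 or restarts >= max_restarts:
            return code
        restarts += 1
        sleep(backoff_seconds)
```

With the values from the example (`max_restarts: 3`, `backoff_seconds: 5`), a service that fails twice and then succeeds would wait twice for five seconds each and finish with exit code 0 under this model.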
What to do about unsupported features
| Feature | Alternative |
|---|---|
| `build` | Use `image` + `x-enroot.prepare.commands`. Mount build context files with `x-enroot.prepare.mounts` if needed. |
| `ports` | Not needed. Services share `127.0.0.1` on one node. |
| `networks` / `network_mode` | Not needed. All services are on the same host network. |
| `restart` | Use `services.<name>.x-slurm.failure_policy` (`fail_job`, `ignore`, `restart_on_failure`). |
| `deploy` | Use `x-slurm` for resources. |
| Service DNS names | Use 127.0.0.1 for same-node helpers, or explicit host metadata such as HPC_COMPOSE_PRIMARY_NODE for distributed runs. |
| Named volumes | Use host-path bind mounts in volumes. |
| `.env` file | Supported. `.env` in the compose file directory is loaded automatically. |
Migration checklist
- Remove `build:`. Replace with `image:` pointing to a base image. Move dependency installation to `x-enroot.prepare.commands`.
- Remove `ports:`. Use host-network semantics instead of container port publishing.
- Remove `networks:` / `network_mode:`. There is no Docker-style overlay network or service-name DNS layer.
- Remove Compose `restart:`. Use `services.<name>.x-slurm.failure_policy` when you need per-service restart behavior.
- Remove `deploy:`. Use `x-slurm` for resource allocation.
- Replace service hostnames. Change any service-name references (e.g. `redis`, `postgres`) to `127.0.0.1` for same-node helpers, or to explicit allocation metadata for distributed runs.
- Replace `healthcheck:`. Convert to `readiness:` with `type: tcp`, `type: log`, or `type: sleep`.
- Add `x-slurm:`. Set `time`, `mem`, `cpus_per_task`, and optionally `gpus`, `partition`, `account`.
- Set `cache_dir`. Point `x-slurm.cache_dir` to shared storage visible from login and compute nodes.
- Validate. Run `hpc-compose validate -f compose.yaml` to check the converted spec.
- Inspect. Run `hpc-compose inspect --verbose -f compose.yaml` to confirm the planner understood your intent.
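Before running the real validator, you may want a quick pre-flight pass over an existing Compose file. The sketch below works on an already-parsed Compose document (a plain dict, however you load it) and reports the keys this guide lists as unsupported, plus environment values that still name another service. It is a hypothetical helper, not part of hpc-compose, and it does not replace `hpc-compose validate`.

```python
from collections.abc import Mapping
from typing import Dict, List

# Compose service keys this guide lists as unsupported, with the suggested fix.
UNSUPPORTED_SERVICE_KEYS: Dict[str, str] = {
    "build": "use image + x-enroot.prepare.commands",
    "ports": "remove; use host-network semantics",
    "networks": "remove; no overlay network or service-name DNS",
    "network_mode": "remove; no overlay network or service-name DNS",
    "restart": "use x-slurm.failure_policy",
    "deploy": "use x-slurm for resource allocation",
}


def _environment_as_dict(env) -> Dict[str, str]:
    """Normalize Compose's map form or KEY=VALUE list form into a dict."""
    if isinstance(env, Mapping):
        return {str(k): "" if v is None else str(v) for k, v in env.items()}
    result: Dict[str, str] = {}
    for item in env or []:
        key, _, value = str(item).partition("=")
        result[key] = value
    return result


def migration_findings(compose: Mapping) -> List[str]:
    """List the checklist items that still apply to a parsed Compose file."""
    services = compose.get("services") or {}
    service_names = set(services)
    findings: List[str] = []
    for name, service in services.items():
        for key, hint in UNSUPPORTED_SERVICE_KEYS.items():
            if key in service:
                findings.append(f"{name}.{key}: {hint}")
        env = _environment_as_dict(service.get("environment"))
        for var, value in env.items():
            # A value equal to a service name is almost always a hostname.
            if value in service_names:
                findings.append(
                    f"{name}.environment.{var}: replace service name "
                    f"'{value}' with 127.0.0.1 or explicit allocation metadata"
                )
    return findings
```

Run against the web app + Redis example, this would flag `redis.ports`, `app.build`, `app.ports`, and `app.environment.REDIS_HOST`. It only catches service names used as a whole environment value, so hostnames embedded in URLs still need a manual check.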