Migrating from Docker Compose

This guide helps you convert an existing docker-compose.yaml into an hpc-compose spec for Slurm clusters with Enroot and Pyxis.

At a glance

| Docker Compose feature | hpc-compose equivalent |
| --- | --- |
| image | image (same syntax, auto-prefixed with docker://) |
| command | command (string or list, same syntax) |
| entrypoint | entrypoint (string or list, same syntax) |
| environment | environment (map or list, same syntax) |
| volumes | volumes (host:container bind mounts, same syntax) |
| depends_on | depends_on (list or map with condition: service_started / service_healthy) |
| working_dir | working_dir (requires explicit command or entrypoint) |
| build | Not supported. Use image + x-enroot.prepare.commands instead. |
| ports | Not supported. Use host networking semantics instead. 127.0.0.1 works only when both sides run on the same node. |
| networks / network_mode | Not supported. There is no Docker-style overlay network or service-name DNS layer. |
| restart | Not supported as a Compose key. Use services.<name>.x-slurm.failure_policy. |
| deploy | Not supported. Use x-slurm for resource allocation. |
| healthcheck | Supported for a constrained TCP/HTTP subset and normalized into readiness; use explicit readiness for anything more complex. |
| Resource limits (cpus, mem_limit) | Use x-slurm.cpus_per_task, x-slurm.mem, x-slurm.gpus |
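
As a rough illustration of the "same syntax" rows, the sketch below uses both accepted forms of command, environment, and depends_on in a single spec. The service names, images, and Python module names are placeholders chosen for this example; only the keys themselves come from the table.

```yaml
services:
  cache:
    image: redis:7
    # command written as a list
    command: ["redis-server", "--save", "", "--appendonly", "no"]
    readiness:
      type: tcp
      host: 127.0.0.1
      port: 6379
      timeout_seconds: 30

  app:
    image: python:3.11-slim
    # command written as a string
    command: python -m main
    # environment written as a map
    environment:
      REDIS_HOST: 127.0.0.1
    # depends_on written as a map with an explicit condition
    depends_on:
      cache:
        condition: service_healthy

  report:
    image: python:3.11-slim
    command: python -m report
    # environment written as a list
    environment:
      - REPORT_FORMAT=json
    # depends_on written as a plain list
    depends_on:
      - app
```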

Side-by-side: web app + Redis

Docker Compose

version: "3.9"
services:
  redis:
    image: redis:7
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

  app:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      redis:
        condition: service_healthy
    environment:
      REDIS_HOST: redis
    volumes:
      - ./app:/workspace
    working_dir: /workspace
    command: python -m main

hpc-compose

name: my-app

x-slurm:
  job_name: my-app
  time: "01:00:00"
  mem: 8G
  cpus_per_task: 4
  cache_dir: /shared/$USER/hpc-compose-cache

services:
  redis:
    image: redis:7
    command: redis-server --save "" --appendonly no
    readiness:
      type: tcp
      host: 127.0.0.1
      port: 6379
      timeout_seconds: 30

  app:
    image: python:3.11-slim
    depends_on:
      redis:
        condition: service_healthy
    environment:
      REDIS_HOST: 127.0.0.1
    volumes:
      - ./app:/workspace
    working_dir: /workspace
    command: python -m main
    x-enroot:
      prepare:
        commands:
          - pip install --no-cache-dir redis fastapi uvicorn

Key changes

  1. build: . → image: python:3.11-slim + x-enroot.prepare.commands for dependencies.
  2. ports → Removed. Services communicate via 127.0.0.1 because they run on the same node.
  3. REDIS_HOST: redis → REDIS_HOST: 127.0.0.1. No DNS service names; use localhost.
  4. healthcheck → readiness with type: tcp.
  5. Added x-slurm block for Slurm resource allocation (time, memory, CPUs).
  6. Added x-slurm.cache_dir for shared image storage.

Key differences

Networking

Docker Compose creates isolated networks where services find each other by name. In hpc-compose, helper services on the same node share the host network directly, and multi-node distributed steps must use explicit rendezvous addresses. Replace service hostnames with 127.0.0.1 only when both sides intentionally stay on one node. For multi-node runs, derive the rendezvous host from /hpc-compose/job/allocation/primary_node or HPC_COMPOSE_PRIMARY_NODE.
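
For the same-node case, the change usually amounts to swapping the service name for 127.0.0.1 in whatever connection settings the application reads. A minimal sketch, assuming a PostgreSQL helper and an application that reads a DATABASE_URL variable (both are illustrative choices, not hpc-compose requirements):

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example   # placeholder credential for the sketch
    readiness:
      type: tcp
      host: 127.0.0.1
      port: 5432
      timeout_seconds: 60

  api:
    image: python:3.11-slim
    depends_on:
      db:
        condition: service_healthy
    environment:
      # In Docker Compose the host part would be the service name "db".
      DATABASE_URL: postgresql://postgres:example@127.0.0.1:5432/postgres
    command: python -m api   # hypothetical application module
```

This only holds while db and api stay on one node; a distributed step would need the rendezvous host taken from the allocation metadata instead.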

Building images

Docker Compose uses build: to run a Dockerfile. hpc-compose uses x-enroot.prepare.commands instead:

# Docker Compose
app:
  build:
    context: .
    dockerfile: Dockerfile

# hpc-compose
app:
  image: python:3.11-slim
  x-enroot:
    prepare:
      commands:
        - pip install --no-cache-dir -r /tmp/requirements.txt
      mounts:
        - ./requirements.txt:/tmp/requirements.txt

Prefer volumes for fast-changing source code and x-enroot.prepare.commands for slower-changing dependencies.
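
Put together, that split might look like the following sketch: dependencies are installed during prepare, while the source tree is bind-mounted at run time. The ./src directory and the requirements.txt file are assumed names for this example.

```yaml
services:
  app:
    image: python:3.11-slim
    # Fast-changing source code: bind-mounted when the service runs
    volumes:
      - ./src:/workspace
    working_dir: /workspace
    # working_dir requires an explicit command or entrypoint
    command: python -m main
    x-enroot:
      prepare:
        # Slower-changing dependencies: installed during prepare
        commands:
          - pip install --no-cache-dir -r /tmp/requirements.txt
        mounts:
          - ./requirements.txt:/tmp/requirements.txt
```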

Health checks vs readiness

Docker Compose uses healthcheck with a test command, interval, timeout, and retries. hpc-compose accepts a constrained healthcheck subset and normalizes it into readiness. The explicit readiness forms are:

# TCP: wait for a port to accept connections
readiness:
  type: tcp
  host: 127.0.0.1
  port: 6379
  timeout_seconds: 30

# Log: wait for a pattern in service output
readiness:
  type: log
  pattern: "Server started"
  timeout_seconds: 60

# Sleep: fixed delay
readiness:
  type: sleep
  seconds: 5

Supported healthcheck migration patterns:

  • ["CMD", "nc", "-z", HOST, PORT]
  • ["CMD-SHELL", "nc -z HOST PORT"]
  • recognized curl probes against http:// or https:// URLs
  • recognized wget --spider probes against http:// or https:// URLs

Still unsupported in v1:

  • arbitrary custom command probes
  • interval
  • retries
  • start_period
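
To make the first supported pattern concrete, here is a sketch of a Compose-style healthcheck in the nc -z form next to an explicit readiness block expressing the same TCP check. How hpc-compose maps Compose timing fields onto timeout_seconds is not covered here, so the healthcheck omits them and the timeout_seconds value is only illustrative.

```yaml
# Compose-style healthcheck within the supported subset
services:
  redis:
    image: redis:7
    healthcheck:
      test: ["CMD", "nc", "-z", "127.0.0.1", "6379"]
---
# Explicit readiness form expressing the same TCP check
services:
  redis:
    image: redis:7
    readiness:
      type: tcp
      host: 127.0.0.1
      port: 6379
      timeout_seconds: 30   # illustrative value
```

A probe such as ["CMD", "redis-cli", "ping"] is an arbitrary command, so it falls outside the subset and has to be rewritten in the explicit form.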

Resource allocation

Docker Compose uses deploy.resources or top-level cpus/mem_limit. hpc-compose uses Slurm-native resource settings:

x-slurm:
  time: "02:00:00"
  mem: 32G
  cpus_per_task: 8
  gpus: 1

services:
  app:
    x-slurm:
      cpus_per_task: 4
      gpus: 1
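
As a sketch of how a Compose resource block could be carried over, the example below maps deploy.resources.limits onto the x-slurm keys shown above. The specific values are illustrative. Note that Compose limits cap a running container, whereas the Slurm settings describe what the job requests from the scheduler, and time has no Compose counterpart.

```yaml
# Docker Compose
services:
  app:
    image: python:3.11-slim
    deploy:
      resources:
        limits:
          cpus: "4"
          memory: 16G
---
# hpc-compose
x-slurm:
  time: "02:00:00"   # wall-clock limit; must be chosen for the job
  mem: 16G

services:
  app:
    image: python:3.11-slim
    x-slurm:
      cpus_per_task: 4
```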

Restart policies

Docker Compose supports restart: always, on-failure, etc. hpc-compose does not accept the Compose restart: key, but it does support per-service restart behavior through services.<name>.x-slurm.failure_policy.

services:
  app:
    image: python:3.11-slim
    x-slurm:
      failure_policy:
        mode: restart_on_failure
        max_restarts: 3
        backoff_seconds: 5

restart_on_failure retries only on non-zero exits. Use mode: fail_job (default) for fail-fast behavior, or mode: ignore for non-critical sidecars.
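
The other two modes can be sketched the same way. In this hypothetical spec the main service keeps the default fail-fast behavior while a metrics sidecar is allowed to exit without affecting the job; the service names and module names are placeholders.

```yaml
services:
  trainer:
    image: python:3.11-slim
    command: python -m train
    x-slurm:
      failure_policy:
        mode: fail_job   # the default, written out here for clarity

  metrics:
    image: python:3.11-slim
    command: python -m metrics_exporter
    x-slurm:
      failure_policy:
        mode: ignore     # a failure of this sidecar does not fail the job
```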

What to do about unsupported features

| Feature | Alternative |
| --- | --- |
| build | Use image + x-enroot.prepare.commands. Mount build context files with x-enroot.prepare.mounts if needed. |
| ports | Not needed. Services share 127.0.0.1 on one node. |
| networks / network_mode | Not needed. Services on the same node share the host network directly. |
| restart | Use services.<name>.x-slurm.failure_policy (fail_job, ignore, restart_on_failure). |
| deploy | Use x-slurm for resources. |
| Service DNS names | Use 127.0.0.1 for same-node helpers, or explicit host metadata such as HPC_COMPOSE_PRIMARY_NODE for distributed runs. |
| Named volumes | Use host-path bind mounts in volumes. |
| .env file | Supported. .env in the compose file directory is loaded automatically. |
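
For the named-volume row, the conversion might look like the sketch below. The results volume and the ./results host directory are assumed names; the host directory presumably needs to exist on storage that the compute node can see.

```yaml
# Docker Compose: named volume managed by Docker
services:
  worker:
    image: python:3.11-slim
    command: python -m worker
    volumes:
      - results:/data

volumes:
  results:
---
# hpc-compose: host-path bind mount instead
services:
  worker:
    image: python:3.11-slim
    command: python -m worker
    volumes:
      - ./results:/data
```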

Migration checklist

  1. Remove build: — Replace with image: pointing to a base image. Move dependency installation to x-enroot.prepare.commands.
  2. Remove ports: — Use host-network semantics instead of container port publishing.
  3. Remove networks: / network_mode: — There is no Docker-style overlay network or service-name DNS layer.
  4. Remove Compose restart: — use services.<name>.x-slurm.failure_policy when you need per-service restart behavior.
  5. Remove deploy: — Use x-slurm for resource allocation.
  6. Replace service hostnames — Change any service-name references (e.g. redis, postgres) to 127.0.0.1 for same-node helpers, or to explicit allocation metadata for distributed runs.
  7. Replace unsupported healthcheck: probes with readiness: using type: tcp, type: log, or type: sleep. Probes in the supported nc, curl, and wget --spider subset can stay as healthcheck:.
  8. Add x-slurm: — Set time, mem, cpus_per_task, and optionally gpus, partition, account.
  9. Set cache_dir — Point x-slurm.cache_dir to shared storage visible from login and compute nodes.
  10. Validate — Run hpc-compose validate -f compose.yaml to check the converted spec.
  11. Inspect — Run hpc-compose inspect --verbose -f compose.yaml to confirm the planner understood your intent.
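
A spec that has gone through the checklist might end up shaped like the sketch below. The partition and account values, the cache path, and the service itself are placeholders; substitute the names your cluster uses.

```yaml
name: my-job

x-slurm:
  job_name: my-job
  time: "01:00:00"
  mem: 8G
  cpus_per_task: 4
  gpus: 1
  partition: gpu            # placeholder partition name
  account: my-project       # placeholder account name
  cache_dir: /shared/$USER/hpc-compose-cache

services:
  app:
    image: python:3.11-slim
    volumes:
      - ./app:/workspace
    working_dir: /workspace
    command: python -m main
    x-enroot:
      prepare:
        commands:
          - pip install --no-cache-dir -r /tmp/requirements.txt
        mounts:
          - ./requirements.txt:/tmp/requirements.txt
```

Running the validate and inspect commands from steps 10 and 11 against a file like this is the quickest way to catch a key the planner does not accept.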