
Docker Compose for AI Development: My Production Setup

Tags: Docker, Docker Compose, DevOps, AI development, containers

Why Docker Compose for AI Projects?

My AI development stack has multiple moving parts: PostgreSQL with pgvector for storage, Redis for caching and queues, Python services for pipeline processing, and Node.js services for APIs. Docker Compose orchestrates all of these with a single docker-compose up command.

I resisted containerization for too long, thinking it was overkill for personal projects. But after spending days debugging environment issues across machines, I made the switch and have not looked back. Here is my production configuration.

The Docker Compose File

services:
  postgres:
    image: pgvector/pgvector:pg16
    restart: always
    environment:
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: ${DB_NAME}
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER}"]
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    restart: always
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru
    volumes:
      - redis_data:/data
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  api:
    build:
      context: ./api
      dockerfile: Dockerfile
    restart: always
    environment:
      DATABASE_URL: postgresql://${DB_USER}:${DB_PASSWORD}@postgres:5432/${DB_NAME}
      REDIS_URL: redis://redis:6379
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    ports:
      - "3000:3000"

volumes:
  postgres_data:
  redis_data:

Let me walk through the key decisions in this configuration.

PostgreSQL with pgvector

I use the pgvector/pgvector:pg16 image instead of the standard PostgreSQL image. This includes the pgvector extension pre-installed, which I need for vector similarity search in my AI applications. The init.sql file runs on first start to create extensions and initial schema:

-- init.sql
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS btree_gin;
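
Once the vector extension exists, similarity search is a plain SQL query. A hypothetical sketch (the table name and embedding dimension are illustrative, not part of my schema):

```sql
-- Hypothetical table storing embeddings alongside text
CREATE TABLE documents (
    id bigserial PRIMARY KEY,
    content text,
    embedding vector(1536)
);

-- Five nearest neighbours by L2 distance, using pgvector's <-> operator
SELECT id, content
FROM documents
ORDER BY embedding <-> '[0.1, 0.2, 0.3]'::vector
LIMIT 5;
```

pgvector also provides `<=>` for cosine distance and `<#>` for negative inner product, depending on how the embeddings were normalized.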

Redis Configuration

The Redis instance serves two purposes: caching LLM responses and serving as a task queue for pipeline jobs. The allkeys-lru eviction policy ensures old cache entries are removed when memory fills up, which is the right behaviour for a cache.

Health Checks and Dependencies

The depends_on with condition: service_healthy ensures the API does not start until PostgreSQL and Redis are actually ready, not just running. Without this, you get connection errors on startup because the database has not finished initializing.
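
Health checks cover startup ordering, but the application should also tolerate a brief outage mid-run (say, a database restart). A minimal retry-with-backoff sketch, assuming `connect` is a stand-in for your real client factory:

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=0.5):
    """Call `connect` until it succeeds, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error
            time.sleep(base_delay * 2 ** attempt)


# Usage sketch: conn = connect_with_retry(lambda: psycopg.connect(dsn))
```

This is deliberately generic; connection pools and ORMs often ship their own reconnect logic, in which case you would lean on that instead.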

The Python Service Dockerfile

For Python AI services, my Dockerfile follows this pattern:

FROM python:3.12-slim

WORKDIR /app

# System dependencies
RUN apt-get update && apt-get install -y \
    ffmpeg \
    libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Python dependencies (cached layer)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application code
COPY . .

CMD ["python", "main.py"]

Key optimizations:

  • Slim base image: python:3.12-slim is much smaller than the full image
  • Layer caching: requirements.txt is copied before the application code. This means pip install only reruns when dependencies change, not on every code change.
  • System dependencies: FFmpeg is installed in the image for video processing pipelines
  • No cache: --no-cache-dir keeps the image size down
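
One thing the broad COPY . . step makes important: a .dockerignore file, so local artifacts never bloat the image or needlessly bust the layer cache. A typical starting point (entries are illustrative):

```
# .dockerignore
__pycache__/
*.pyc
.venv/
.git/
.env
*.log
```

Excluding .env here also prevents secrets from being baked into image layers by accident.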

Environment Management

I use a .env file for local development and Docker secrets or environment variables for production:

# .env (never committed to git)
DB_USER=pipeline
DB_PASSWORD=secure_password_here
DB_NAME=ai_pipeline
OPENROUTER_API_KEY=sk-or-...
TELEGRAM_BOT_TOKEN=123456:ABC...

Docker Compose automatically reads the .env file and substitutes variables in the compose file. For production, I override with actual environment variables set on the host.
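
Since an unset variable substitutes as an empty string, I prefer a fail-fast check at service startup rather than a cryptic connection error later. A small sketch (the variable names match the compose file above; the function is mine):

```python
import os

REQUIRED_VARS = ["DATABASE_URL", "REDIS_URL"]


def check_env(required=REQUIRED_VARS, env=os.environ):
    """Raise at startup with a clear message if any required variable is missing."""
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```

Calling this first thing in main.py turns a misconfigured deployment into an immediate, readable crash instead of a half-working service.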

Development Workflow

My daily workflow with Docker Compose:

# Start all services
docker compose up -d

# View logs for a specific service
docker compose logs -f api

# Rebuild and restart after code changes
# (code is baked into the image by COPY, so a plain restart is not enough)
docker compose up -d --build api

# Run a one-off command
docker compose exec postgres psql -U pipeline -d ai_pipeline

# Tear down everything
docker compose down

# Tear down and remove volumes (full reset)
docker compose down -v

Production Considerations

  • Use named volumes: Anonymous volumes are hard to manage. Named volumes persist data reliably and are easier to back up.
  • Set memory limits: Add mem_limit to prevent a single container from consuming all server memory.
  • Log rotation: Configure Docker's log driver to rotate logs, or they will consume all disk space eventually.
  • Backup strategy: Schedule pg_dump from the PostgreSQL container to an external location.
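
The first three points translate into a few extra lines per service. A sketch for the api service (the limits and rotation sizes are illustrative, tune them to your host):

```yaml
  api:
    # ...existing build/environment config...
    mem_limit: 512m
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
```

With this in place, each container's logs are capped at roughly 30 MB on disk, and a runaway process gets OOM-killed instead of taking the host down with it.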

Docker Compose is not the right tool for large-scale orchestration (that is Kubernetes territory), but for a single-server AI development and production setup, it strikes the perfect balance of simplicity and power.