
How I Manage 9 Production AI Projects on One VPS

Tags: VPS, production, DevOps, resource management, PM2

Why One Server

Running nine production AI projects on a single VPS sounds reckless until you understand the economics. Each project would cost 20 to 50 dollars per month on a cloud platform like AWS or Google Cloud. That is up to 450 dollars monthly for infrastructure alone. My single VPS costs around 40 pounds per month and handles everything comfortably.

The key is that most AI applications are not constantly running. They process requests, wait for responses from LLM APIs, and spend most of their time idle. This bursty usage pattern means multiple projects can share resources effectively.

Project Isolation Strategy

Each project lives in its own directory with its own virtual environment, configuration, and log files. There is no sharing of dependencies between projects, which prevents version conflicts:

/home/steve/projects/
    content-scorer/
        venv/
        .env
        main.py
        requirements.txt
    video-generator/
        venv/
        .env
        main.py
        requirements.txt
    document-analyzer/
        ...

Each project has its own PM2 process entry with isolated environment variables loaded from its own .env file.
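In PM2 terms, this maps onto an ecosystem file with one entry per project. The sketch below is illustrative rather than the author's actual file: the names and memory limits come from this post, the interpreter path assumes PM2 resolves it relative to cwd, and the .env loading is assumed to happen inside each app (e.g. via python-dotenv), since PM2 does not read .env files itself.

```javascript
// ecosystem.config.js — illustrative sketch, not the actual production file.
// Each app runs from its own directory with its own virtualenv interpreter;
// the app loads its own .env on startup (e.g. via python-dotenv).
module.exports = {
  apps: [
    {
      name: "content-scorer",
      cwd: "/home/steve/projects/content-scorer",
      script: "main.py",
      interpreter: "venv/bin/python",   // per-project virtualenv
      max_memory_restart: "500M",       // PM2 restarts the process past this
    },
    {
      name: "video-generator",
      cwd: "/home/steve/projects/video-generator",
      script: "main.py",
      interpreter: "venv/bin/python",
      max_memory_restart: "800M",
    },
    // ...one entry per project
  ],
};
```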

Resource Allocation

With nine projects sharing one server, resource management is critical. Here is how I handle it:

Memory

I set memory limits on each PM2 process. If a process exceeds its allocation, PM2 restarts it automatically. The total memory allocation across all projects stays below 80% of available RAM, leaving headroom for the OS and temporary spikes:

// PM2 memory limits per project
content-scorer:      500MB
video-generator:     800MB
document-analyzer:   600MB
blog-generator:      256MB
email-system:        512MB
scraper-pipeline:    400MB
art-system:          300MB
api-gateway:         256MB
monitoring-service:  128MB
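The 80% headroom claim is easy to sanity-check against these numbers. A throwaway sketch, assuming an 8 GB box (the post does not state the actual RAM size):

```python
# Sanity-check that the per-process PM2 limits leave OS headroom.
limits_mb = {
    "content-scorer": 500, "video-generator": 800, "document-analyzer": 600,
    "blog-generator": 256, "email-system": 512, "scraper-pipeline": 400,
    "art-system": 300, "api-gateway": 256, "monitoring-service": 128,
}
total_mb = sum(limits_mb.values())      # 3752 MB allocated across all projects
budget_mb = 0.8 * 8 * 1024              # 80% of an assumed 8 GB of RAM
print(total_mb, total_mb <= budget_mb)  # 3752 True
```

On an assumed 8 GB machine the limits total well under the 80% budget, which is why occasional per-process spikes do not starve the OS.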

CPU

AI applications spend most of their CPU time waiting for API responses. The actual computation on my server is minimal: parsing responses, running validation logic, and handling HTTP requests. Nine projects rarely compete for CPU simultaneously.

Disk

Log rotation is essential. Without it, logs from nine projects would fill the disk within weeks. I use PM2's log rotation module with aggressive settings: 50MB max file size and 7 days retention.
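With the pm2-logrotate module, those settings come down to a few commands (the 50M and 7 values are the ones quoted above; compression is my addition):

```shell
pm2 install pm2-logrotate
pm2 set pm2-logrotate:max_size 50M   # rotate any log file past 50MB
pm2 set pm2-logrotate:retain 7       # keep 7 rotated files per log
pm2 set pm2-logrotate:compress true  # optional: gzip rotated logs
```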

Port Management

Each project gets a dedicated port range:

8001: content-scorer
8002: video-generator
8003: document-analyzer
8004: blog-generator
8005: email-system
8006: scraper-pipeline
8007: art-system
8008: api-gateway
8009: monitoring-service

Nginx routes external requests to the appropriate port based on the domain or URL path.
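A sketch of one such route, with a placeholder domain standing in for whatever each project is actually served under:

```nginx
# Illustrative nginx server block — the domain is a placeholder
server {
    listen 80;
    server_name scorer.example.com;

    location / {
        proxy_pass http://127.0.0.1:8001;   # content-scorer's port
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```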

Deployment Without Downtime

I deploy updates to individual projects without affecting the others. The process is simple:

  • Pull the latest code for the specific project
  • Install any new dependencies in that project's virtual environment
  • Restart only that project's PM2 process
  • Verify the restart was successful

PM2 handles the restart gracefully, finishing in-flight requests before switching to the new code.
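The four steps compress into a short per-project script. This is a sketch under the directory layout shown earlier; the script name and the grep-based health check are my illustration, not necessarily the author's tooling:

```shell
#!/usr/bin/env bash
# deploy.sh <project-name> — illustrative per-project deploy sketch
set -euo pipefail
PROJECT="$1"
cd "/home/steve/projects/$PROJECT"
git pull                                   # 1. latest code for this project only
venv/bin/pip install -r requirements.txt   # 2. deps into this project's venv
pm2 restart "$PROJECT"                     # 3. restart only this PM2 process
pm2 describe "$PROJECT" | grep -q online   # 4. verify it came back up
```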

Monitoring All Nine Projects

A single pm2 status command gives me a dashboard of all nine projects:

+----+---------------------+--------+------+-------+
| id | name                | status | cpu  | mem   |
+----+---------------------+--------+------+-------+
| 0  | content-scorer      | online | 0.1% | 342MB |
| 1  | video-generator     | online | 0.0% | 567MB |
| 2  | document-analyzer   | online | 0.2% | 423MB |
| 3  | blog-generator      | online | 0.0% | 178MB |
| 4  | email-system        | online | 0.1% | 298MB |
| 5  | scraper-pipeline    | online | 0.3% | 356MB |
| 6  | art-system          | online | 0.0% | 234MB |
| 7  | api-gateway         | online | 0.1% | 187MB |
| 8  | monitoring-service  | online | 0.0% | 98MB  |
+----+---------------------+--------+------+-------+

Each project also sends health metrics to my Telegram monitoring bot, so I get alerted if any project goes down or starts consuming unexpected resources.
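The alerting side can be sketched in a few lines of Python. `pm2 jlist` does emit the process table as JSON (each entry carries `name`, `pm2_env.status`, and `monit.memory` in bytes); the sample data below is hand-written in that shape, and the actual Telegram send is left as a comment since the bot details are not in this post:

```python
def unhealthy(processes, mem_limit_bytes):
    """Return alert lines for processes that are down or over their limit.

    `processes` is the parsed JSON from `pm2 jlist`.
    """
    alerts = []
    for p in processes:
        name, status = p["name"], p["pm2_env"]["status"]
        mem = p["monit"]["memory"]
        if status != "online":
            alerts.append(f"{name} is {status}")
        elif mem > mem_limit_bytes.get(name, float("inf")):
            alerts.append(f"{name} at {mem // 2**20}MB, over its limit")
    return alerts

# Demo with hand-written data shaped like `pm2 jlist` output:
sample = [
    {"name": "content-scorer", "pm2_env": {"status": "online"},
     "monit": {"memory": 342 * 2**20}},
    {"name": "email-system", "pm2_env": {"status": "errored"},
     "monit": {"memory": 0}},
]
print(unhealthy(sample, {"content-scorer": 500 * 2**20}))
# ['email-system is errored']
# A real script would POST each alert to the Telegram Bot API here.
```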

Backup Strategy

All project code is in Git, so code is always recoverable. For data, I run nightly backups of Supabase databases and any local data files. The backup script runs as a cron job and sends a confirmation to Telegram when complete.
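Such a script might look like the sketch below. Supabase is plain Postgres underneath, so pg_dump works against its connection string; the environment variables, paths, and Telegram credentials are all placeholders:

```shell
#!/usr/bin/env bash
# backup.sh — illustrative sketch; run nightly via cron, e.g.:
#   0 3 * * * /home/steve/scripts/backup.sh
set -euo pipefail
STAMP=$(date +%F)

# Supabase exposes a Postgres connection string, so pg_dump works directly
pg_dump "$SUPABASE_DB_URL" | gzip > "/backups/db-$STAMP.sql.gz"

# Local data files (path is a placeholder for wherever projects keep data)
tar czf "/backups/data-$STAMP.tar.gz" /home/steve/projects

# Confirmation via the Telegram Bot API (token and chat id are placeholders)
curl -s "https://api.telegram.org/bot$TOKEN/sendMessage" \
  -d chat_id="$CHAT_ID" -d text="Nightly backup $STAMP complete"
```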

When to Scale Beyond One Server

I have identified the triggers that would make me add a second server:

  • Sustained CPU usage above 70% for more than an hour
  • Memory pressure causing frequent PM2 restarts
  • Request latency increasing due to resource contention
  • Any single project needing dedicated GPU resources

Start with one server and scale when the numbers force you to. Premature infrastructure scaling is one of the most expensive mistakes in software engineering.
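The first trigger is simple to automate. A rough sketch using 1-minute load averages; the 60-sample window and the use of os.getloadavg are my assumptions about how "sustained for more than an hour" would be measured:

```python
def should_scale(load_samples_1min, cores, threshold=0.70):
    """True if every 1-minute load sample in the window exceeds 70% of cores.

    Feed one sample per minute for an hour (60 samples) to match the
    'sustained for more than an hour' trigger above.
    """
    return len(load_samples_1min) >= 60 and all(
        s / cores > threshold for s in load_samples_1min
    )

# On Linux, os.getloadavg()[0] supplies the current 1-minute load average.
print(should_scale([3.2] * 60, cores=4))  # 3.2/4 = 80% for a full hour -> True
```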

The Takeaway

Running multiple AI projects on a single VPS is not just feasible; it is often the smartest approach for solo developers and small teams. The key is disciplined resource management, good isolation between projects, and comprehensive monitoring. Save the complex multi-server architectures for when you actually need them.