3 min read

Deploying AI Applications on a VPS with PM2 and Nginx

deployment · VPS · PM2 · Nginx · DevOps · AI applications

Why a VPS Over Cloud Functions

There is a time and place for serverless, but for AI applications that need persistent state, long-running processes, and predictable costs, a VPS is hard to beat. I run nine production AI projects on a single VPS, and the total cost is a fraction of what I would pay on AWS Lambda or Google Cloud Functions.

The key tools that make this possible are PM2 for process management and Nginx as a reverse proxy. Together, they give you production-grade reliability without the complexity of Kubernetes or Docker orchestration.

Setting Up PM2 for AI Services

PM2 is a Node.js process manager, but it handles Python processes beautifully. Here is how I configure it for a typical FastAPI AI service:

# ecosystem.config.js
module.exports = {
  apps: [
    {
      name: 'content-scorer',
      // run uvicorn via the virtualenv's Python so PM2 never needs
      // uvicorn on its own PATH
      script: '/home/steve/envs/scorer/bin/python',
      args: '-m uvicorn main:app --host 127.0.0.1 --port 8001',
      interpreter: 'none',
      cwd: '/home/steve/projects/content-scorer',
      env: {
        ENVIRONMENT: 'production'
      },
      max_restarts: 10,     // give up after 10 crash-restarts in a row
      restart_delay: 5000,  // wait 5 seconds between restart attempts
      watch: false          // no file-watching in production
    }
  ]
}

Key PM2 Commands I Use Daily

  • pm2 start ecosystem.config.js to launch all services
  • pm2 status for a quick health check of all running processes
  • pm2 logs content-scorer --lines 50 to tail recent logs
  • pm2 restart content-scorer to restart a service after a deploy (note this is a stop-and-start with a brief gap; true zero-downtime pm2 reload only applies to Node apps in cluster mode)
  • pm2 monit for real-time CPU and memory monitoring
  • pm2 save (paired with pm2 startup, which installs a boot script) to persist the process list across server reboots

Nginx as a Reverse Proxy

Each AI service runs on a different local port. Nginx sits in front, routing requests to the right service based on the domain or path. Here is a typical configuration:

server {
    listen 443 ssl http2;
    server_name api.stevecv.com;
    
    ssl_certificate /etc/letsencrypt/live/api.stevecv.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/api.stevecv.com/privkey.pem;
    
    location /scorer/ {
        proxy_pass http://127.0.0.1:8001/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 120s;
    }
    
    location /generator/ {
        proxy_pass http://127.0.0.1:8002/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 300s;
    }
}

Important Nginx Settings for AI Apps

AI applications often have longer response times than typical web requests. A call to Claude or GPT-4o can take 30 seconds or more, so you need to adjust your timeouts:

  • proxy_read_timeout: Set this to at least 120 seconds for LLM-backed endpoints
  • client_max_body_size: Increase this if you accept file uploads for document processing
  • proxy_buffering: Consider disabling this for streaming responses
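Applied to the config above, those three settings might look like this for a hypothetical streaming endpoint (the path, upstream port, and sizes are illustrative):

```nginx
location /generator/stream/ {
    proxy_pass http://127.0.0.1:8002/stream/;
    proxy_read_timeout 300s;    # LLM generations can run for minutes
    client_max_body_size 20M;   # allow document uploads for processing
    proxy_buffering off;        # flush tokens to the client as they arrive
    proxy_set_header Host $host;
}
```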

SSL with Let's Encrypt

Every production API needs HTTPS. Certbot makes this trivial:

sudo certbot --nginx -d api.stevecv.com
sudo certbot renew --dry-run

I have a cron job that attempts renewal twice daily; Certbot only actually replaces a certificate once it is within 30 days of expiry.
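The cron entry itself is a one-liner; the times here are mine, and the reload hook assumes Nginx is managed by systemd:

```shell
# root crontab: attempt renewal at 03:00 and 15:00 every day
0 3,15 * * * certbot renew --quiet --deploy-hook "systemctl reload nginx"
```

The --deploy-hook only fires when a certificate was actually renewed, so Nginx is not reloaded on the no-op runs.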

Deployment Workflow

My deployment process is simple and reliable:

  • Push code to the Git repository
  • SSH into the VPS (or use a webhook trigger)
  • Pull the latest code
  • Install any new dependencies
  • Run database migrations if needed
  • Restart the service with PM2
A small script ties these steps together:

#!/bin/bash
# deploy.sh
set -e  # abort on the first failing step
cd /home/steve/projects/content-scorer
git pull origin main
pip install -r requirements.txt --quiet
# run migrations only when alembic is configured for this project
if command -v alembic >/dev/null; then alembic upgrade head; fi
pm2 restart content-scorer
echo "Deployed at $(date)"
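The webhook-trigger variant mentioned above can be sketched with the standard library alone. The port, secret, and script path are hypothetical, and the signature check follows GitHub's X-Hub-Signature-256 scheme:

```python
import hashlib
import hmac
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

WEBHOOK_SECRET = b"change-me"  # hypothetical; keep it out of the repo
DEPLOY_SCRIPT = "/home/steve/projects/content-scorer/deploy.sh"


def signature_is_valid(secret: bytes, payload: bytes, header: str) -> bool:
    """Validate a GitHub-style 'sha256=<hexdigest>' signature header."""
    expected = "sha256=" + hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, header)


class DeployHook(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        sig = self.headers.get("X-Hub-Signature-256", "")
        if not signature_is_valid(WEBHOOK_SECRET, body, sig):
            self.send_response(403)  # reject unsigned or forged pushes
            self.end_headers()
            return
        subprocess.Popen(["/bin/bash", DEPLOY_SCRIPT])  # fire and forget
        self.send_response(202)
        self.end_headers()


# To run behind an Nginx location block:
#   HTTPServer(("127.0.0.1", 9000), DeployHook).serve_forever()
```

Bind it to localhost only and proxy a single Nginx location to it, so the hook is never exposed directly.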

Monitoring and Logs

PM2 keeps logs for each process automatically. I combine this with my Telegram alerting system (covered in another post) for comprehensive monitoring. The PM2 log rotation module prevents disk space issues:

pm2 install pm2-logrotate
pm2 set pm2-logrotate:max_size 50M
pm2 set pm2-logrotate:retain 7

Resource Management

Running multiple AI services on one VPS requires careful resource management. I use PM2's built-in memory limit feature to prevent any single service from consuming all available RAM:

// inside the apps array of ecosystem.config.js
{
  name: 'content-scorer',
  max_memory_restart: '500M'  // restart the process if it exceeds 500 MB
}

A well-configured VPS with PM2 and Nginx can handle surprisingly heavy AI workloads. Do not let anyone tell you that you need Kubernetes for production AI services.

The Bottom Line

This stack has been running my production AI services for over a year with 99.9% uptime. The total server cost is around 40 pounds per month. If you are a solo developer or small team building AI applications, start here. You can always migrate to more complex infrastructure later if you outgrow it.