Deploying AI Applications on a VPS with PM2 and Nginx
Why a VPS Over Cloud Functions
There is a time and place for serverless, but for AI applications that need persistent state, long-running processes, and predictable costs, a VPS is hard to beat. I run nine production AI projects on a single VPS, and the total cost is a fraction of what I would pay on AWS Lambda or Google Cloud Functions.
The key tools that make this possible are PM2 for process management and Nginx as a reverse proxy. Together, they give you production-grade reliability without the complexity of Kubernetes or Docker orchestration.
Setting Up PM2 for AI Services
PM2 is a Node.js process manager, but it handles Python processes beautifully. Here is how I configure it for a typical FastAPI AI service:
// ecosystem.config.js
module.exports = {
  apps: [
    {
      name: 'content-scorer',
      // Run the virtualenv's uvicorn binary directly; PM2 executes it
      // via its shebang, so the right Python environment is picked up
      script: '/home/steve/envs/scorer/bin/uvicorn',
      args: 'main:app --host 127.0.0.1 --port 8001',
      interpreter: 'none',
      cwd: '/home/steve/projects/content-scorer',
      env: {
        ENVIRONMENT: 'production'
      },
      max_restarts: 10,    // stop retrying after 10 crash loops
      restart_delay: 5000, // wait five seconds between restarts
      watch: false         // no file watching in production
    }
  ]
}
Key PM2 Commands I Use Daily
- pm2 start ecosystem.config.js to launch all services
- pm2 status for a quick health check of all running processes
- pm2 logs content-scorer --lines 50 to tail recent logs
- pm2 restart content-scorer to restart a service (pm2 reload gives zero-downtime reloads, but only for apps running in cluster mode)
- pm2 monit for real-time CPU and memory monitoring
- pm2 save to persist the process list across server reboots
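Persistence across reboots needs one more step beyond pm2 save: a startup hook that resurrects the saved process list at boot. On a systemd-based distro (the username and home directory below are from the config above), the sequence is:

```shell
# Generate and install a systemd unit that brings PM2 back at boot
pm2 startup systemd -u steve --hp /home/steve

# Snapshot the current process list; this is what the startup
# unit restores after a reboot
pm2 save
```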
Nginx as a Reverse Proxy
Each AI service runs on a different local port. Nginx sits in front, routing requests to the right service based on the domain or path. Here is a typical configuration:
server {
    listen 443 ssl http2;
    server_name api.stevecv.com;

    ssl_certificate /etc/letsencrypt/live/api.stevecv.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/api.stevecv.com/privkey.pem;

    location /scorer/ {
        proxy_pass http://127.0.0.1:8001/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 120s;
    }

    location /generator/ {
        proxy_pass http://127.0.0.1:8002/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 300s;
    }
}
Important Nginx Settings for AI Apps
AI applications often have longer response times than typical web servers. A call to Claude or GPT-4o might take 30 seconds or more. You need to adjust your timeouts:
- proxy_read_timeout: Set this to at least 120 seconds for LLM-backed endpoints
- client_max_body_size: Increase this if you accept file uploads for document processing
- proxy_buffering: Consider disabling this for streaming responses
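For endpoints that stream tokens back (server-sent events or chunked responses), buffering defeats the purpose: Nginx would hold the whole response until the upstream finished. A sketch of a streaming-friendly location block, with a hypothetical /stream/ path and port:

```nginx
location /stream/ {
    proxy_pass http://127.0.0.1:8003/;
    proxy_set_header Host $host;

    # Pass chunks through as they arrive instead of buffering
    # the full response; skip caching for streamed output too
    proxy_buffering off;
    proxy_cache off;

    # Token streams can be long-lived
    proxy_read_timeout 300s;

    # For SSE: speak HTTP/1.1 to the upstream and clear the
    # Connection header so the stream stays open
    proxy_http_version 1.1;
    proxy_set_header Connection '';
}
```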
SSL with Let's Encrypt
Every production API needs HTTPS. Certbot makes this trivial:
sudo certbot --nginx -d api.stevecv.com
sudo certbot renew --dry-run
I have a cron job that checks for renewal twice daily. Certificates auto-renew 30 days before expiry.
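If you manage that check yourself rather than relying on the systemd timer certbot's packages usually install, a crontab entry like this works (the times are arbitrary; certbot itself decides whether renewal is due):

```shell
# /etc/cron.d/certbot-renew -- attempt renewal twice daily; certbot only
# renews certificates within 30 days of expiry, and the deploy hook
# reloads Nginx so the new certificate is picked up
17 3,15 * * * root certbot renew --quiet --deploy-hook "systemctl reload nginx"
```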
Deployment Workflow
My deployment process is simple and reliable:
- Push code to the Git repository
- SSH into the VPS (or use a webhook trigger)
- Pull the latest code
- Install any new dependencies
- Run database migrations if needed
- Restart the service with PM2
#!/bin/bash
# deploy.sh -- stop at the first failed step
set -euo pipefail

cd /home/steve/projects/content-scorer
git pull origin main

# Install into the service's virtualenv, not the system Python
/home/steve/envs/scorer/bin/pip install -r requirements.txt --quiet

# Apply any pending database migrations; a failure here aborts the deploy
/home/steve/envs/scorer/bin/alembic upgrade head

pm2 restart content-scorer
echo "Deployed at $(date)"
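The webhook-trigger variant can be a tiny service of its own. A minimal sketch of the signature check and dispatch, assuming a GitHub-style X-Hub-Signature-256 header; the secret and script path are placeholders:

```python
import hashlib
import hmac
import subprocess

# Shared secret configured on the Git host's webhook page (placeholder value)
WEBHOOK_SECRET = b"change-me"

def verify_signature(payload: bytes, signature_header: str) -> bool:
    """Check a GitHub-style 'sha256=<hex>' signature against the raw payload."""
    expected = "sha256=" + hmac.new(WEBHOOK_SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)

def handle_push(payload: bytes, signature_header: str) -> str:
    """Run the deploy script only for requests with a valid signature."""
    if not verify_signature(payload, signature_header):
        return "rejected"
    subprocess.run(["/home/steve/projects/content-scorer/deploy.sh"], check=True)
    return "deployed"
```

Wire this up behind any small framework (the FastAPI already on the box is a natural fit) and point the repository's webhook at it.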
Monitoring and Logs
PM2 keeps logs for each process automatically. I combine this with my Telegram alerting system (covered in another post) for comprehensive monitoring. The PM2 log rotation module prevents disk space issues:
pm2 install pm2-logrotate
pm2 set pm2-logrotate:max_size 50M
pm2 set pm2-logrotate:retain 7
Resource Management
Running multiple AI services on one VPS requires careful resource management. I use PM2's built-in memory limit feature to prevent any single service from consuming all available RAM:
{
  name: 'content-scorer',
  max_memory_restart: '500M'
}
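Before setting limits, it helps to see where memory is actually going. pm2 jlist prints the process table as JSON, and a short script can summarize it per app; this sketch assumes the standard jlist fields (name, monit.memory in bytes):

```python
import json
import subprocess

def memory_report(jlist_json: str) -> dict[str, float]:
    """Map each PM2 app name to its resident memory in MB."""
    processes = json.loads(jlist_json)
    return {
        proc["name"]: proc["monit"]["memory"] / (1024 * 1024)
        for proc in processes
    }

def live_report() -> dict[str, float]:
    """Query the running PM2 daemon (requires pm2 on PATH)."""
    result = subprocess.run(
        ["pm2", "jlist"], capture_output=True, text=True, check=True
    )
    return memory_report(result.stdout)
```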
A well-configured VPS with PM2 and Nginx can handle surprisingly heavy AI workloads. Do not let anyone tell you that you need Kubernetes for production AI services.
The Bottom Line
This stack has been running my production AI services for over a year with 99.9% uptime. The total server cost is around 40 pounds per month. If you are a solo developer or small team building AI applications, start here. You can always migrate to more complex infrastructure later if you outgrow it.