How I Automated YouTube Uploads for Under 10p a Month
The Challenge: Automation Without the Price Tag
When people hear "AI automation," they often assume significant ongoing costs. API calls, cloud compute, storage fees. But one of my favourite projects proves that is not always the case. I run a fully automated YouTube upload pipeline that costs less than 10p per month in additional expenses.
This post breaks down exactly where every penny goes and the architectural decisions that keep costs so low.
What the Pipeline Does
As a quick recap, the pipeline handles the entire YouTube upload process:
- Detects new video files in a watched directory
- Extracts audio and transcribes using Whisper
- Generates optimised titles, descriptions, tags, and chapters using Gemini
- Uploads to YouTube via the Data API
- Archives the source file and logs everything
There is no manual intervention at any step. I drop a video file into a folder, and it appears on YouTube with full metadata within the hour.
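The steps above could be wired together roughly as follows. This is a minimal sketch, not the actual implementation: `VIDEO_EXTENSIONS`, `find_new_videos`, `run_pipeline`, and the directory layout are hypothetical names chosen for illustration, and the transcription, metadata, and upload steps are passed in as callables so the skeleton does not depend on any particular library.

```python
import logging
import shutil
from pathlib import Path
from typing import Callable

# Hypothetical set of extensions treated as video files.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".mkv"}


def find_new_videos(watch_dir: Path) -> list[Path]:
    """Return the video files waiting in the watched directory, sorted by name."""
    return sorted(
        p for p in watch_dir.iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )


def run_pipeline(
    watch_dir: Path,
    archive_dir: Path,
    transcribe: Callable[[Path], str],
    generate_metadata: Callable[[str], dict],
    upload: Callable[[Path, dict], str],
) -> list[str]:
    """Process every waiting video and return the IDs reported by upload()."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    uploaded = []
    for video in find_new_videos(watch_dir):
        transcript = transcribe(video)
        metadata = generate_metadata(transcript)
        video_id = upload(video, metadata)
        # Archiving moves the file out of the watched directory,
        # so it is not picked up again on the next run.
        shutil.move(str(video), str(archive_dir / video.name))
        logging.info("uploaded %s as %s", video.name, video_id)
        uploaded.append(video_id)
    return uploaded
```

Passing the three stages in as functions keeps each one replaceable, for example swapping local Whisper for a hosted API without touching the loop.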
Cost Breakdown: The Numbers
Whisper Transcription: £0.00
I run Whisper locally using the small model. The model weights are downloaded once and run on CPU. There is no API call, no per-minute charge, nothing. It uses some CPU time on my VPS, but that is a fixed cost I am already paying for other projects.
Processing time for a 10-minute video is about 3 minutes on a standard VPS. That is perfectly acceptable for a pipeline that does not need to be real-time.
Gemini Flash Metadata Generation: ~0.03p per video
Gemini 2.0 Flash is extraordinarily cheap. Let me show the maths:
- Input: roughly 2,000 tokens (transcript + prompt)
- Output: roughly 500 tokens (title, description, tags, chapters)
- Gemini Flash pricing: $0.10 per million input tokens, $0.40 per million output tokens
- Cost per video: (2000 * 0.0000001) + (500 * 0.0000004) = $0.0004
- In GBP at current rates: approximately £0.0003, or about 0.03p
Even if I uploaded 100 videos per month, the Gemini cost would be about 3p. At 30 videos per month it is roughly 1p.
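The arithmetic above can be checked with a few lines of Python. The token counts and prices are the ones quoted in the list; the exchange rate `USD_TO_GBP` is an assumed round figure for illustration only, and the function names are hypothetical.

```python
# Prices and token counts as quoted above; the exchange rate is an assumption.
INPUT_PRICE_PER_M = 0.10   # USD per million input tokens
OUTPUT_PRICE_PER_M = 0.40  # USD per million output tokens
USD_TO_GBP = 0.80          # assumed round figure, not a live rate


def gemini_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the per-million-token prices above."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


def usd_to_pence(usd: float, rate: float = USD_TO_GBP) -> float:
    """Convert a USD amount to pence sterling at the given exchange rate."""
    return usd * rate * 100


per_video = gemini_cost_usd(2_000, 500)
print(f"per video:  ${per_video:.4f} = {usd_to_pence(per_video):.3f}p")
print(f"100 videos: {usd_to_pence(per_video * 100):.1f}p")
```

At the assumed rate this works out to about 0.03p per video and roughly 3p for 100 videos. The exact pence figure moves with the exchange rate, but not by enough to change the conclusion.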
YouTube Data API: £0.00
The YouTube Data API is free within quota limits. Each video upload costs 1,600 quota units out of a daily allocation of 10,000. That gives me 6 uploads per day, which is more than I need.
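The quota arithmetic is simple integer division. The sketch below uses the two figures quoted above; the helper names, and the idea of having the pipeline track its own usage before attempting an upload, are assumptions rather than something the API provides.

```python
DAILY_QUOTA_UNITS = 10_000  # daily allocation quoted above
UPLOAD_COST_UNITS = 1_600   # quota cost of one upload quoted above


def uploads_per_day(
    daily_quota: int = DAILY_QUOTA_UNITS,
    cost_per_upload: int = UPLOAD_COST_UNITS,
) -> int:
    """Whole uploads that fit in one day's quota."""
    return daily_quota // cost_per_upload


def can_upload(units_used_today: int) -> bool:
    """True if one more upload fits in what is left of today's quota."""
    return units_used_today + UPLOAD_COST_UNITS <= DAILY_QUOTA_UNITS


print(uploads_per_day())
```

Six uploads use 9,600 units, so a seventh would exceed the allocation. A check like `can_upload` lets a run defer the file to the next day instead of failing mid-upload.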
Server Overhead: ~8p per month
This is the only real cost, and it is an estimate. My VPS costs a fixed monthly fee, and the pipeline shares it with several other projects. Based on CPU usage monitoring, I attribute about 8p of the monthly server cost to the YouTube pipeline. This covers:
- The cron job running every 30 minutes
- Whisper processing time when videos are present
- Storage for the archive of processed files
Total: Under 10p per month
For up to 30 videos per month, the total cost is approximately 8 to 9p. The AI costs are so small they barely register.
Design Decisions That Keep Costs Low
1. Local Whisper Instead of API
The OpenAI Whisper API charges $0.006 per minute. For a 10-minute video, that is $0.06. Not expensive in isolation, but multiply by 30 videos and you are at $1.80/month. Running locally eliminates this entirely.
# Local Whisper: no API cost, runs on the VPS CPU
import whisper

model = whisper.load_model("small")
result = model.transcribe("video.mp4")
transcript = result["text"]  # full transcript as a single string

# vs. the hosted Whisper API at $0.006/min:
# client.audio.transcriptions.create(
#     model="whisper-1",
#     file=open("audio.mp3", "rb"),
# )
2. Gemini Flash for Non-Critical Generation
YouTube metadata generation does not need the most powerful model. Flash handles it perfectly. If I were using Claude Sonnet or GPT-4 for this task, costs would be 10 to 50 times higher for no meaningful improvement in output quality.
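For illustration, here is one way the metadata step might be structured. This is a sketch under stated assumptions: the prompt wording, the JSON shape, and the names `PROMPT_TEMPLATE`, `parse_metadata`, and `generate_metadata` are all hypothetical, since the post does not show the real prompt. The model call is passed in as `call_model` so the parsing logic can be run without an API key.

```python
import json
from typing import Callable

FENCE = "`" * 3  # Markdown code fence that models sometimes wrap JSON in

# Hypothetical prompt; the real pipeline's wording is not shown in this post.
PROMPT_TEMPLATE = (
    "You write YouTube metadata. From the transcript below, return only a JSON "
    'object with the keys "title", "description", "tags" (list of strings) and '
    '"chapters" (list of {{"time": "MM:SS", "label": "..."}}).\n\n'
    "Transcript:\n{transcript}"
)

REQUIRED_KEYS = {"title", "description", "tags", "chapters"}


def parse_metadata(raw: str) -> dict:
    """Parse the model's reply, tolerating a code fence around the JSON."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]         # drop the closing fence line
        text = "\n".join(lines)
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("model reply is not a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"model reply is missing keys: {sorted(missing)}")
    return data


def generate_metadata(transcript: str, call_model: Callable[[str], str]) -> dict:
    """Build the prompt, send it through call_model, and validate the reply."""
    prompt = PROMPT_TEMPLATE.format(transcript=transcript)
    return parse_metadata(call_model(prompt))
```

If the pipeline uses the `google-genai` SDK, `call_model` could be a small wrapper around `client.models.generate_content(model="gemini-2.0-flash", contents=prompt).text`. Validating the keys before upload means a malformed reply fails early instead of producing a video with missing metadata.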
3. Shared Infrastructure
Running on a VPS I already pay for means the marginal cost of adding this pipeline is near zero. If I had spun up a dedicated server or used serverless functions, the baseline cost would be significantly higher.
4. Efficient Polling
The cron job checks for new files every 30 minutes. The check itself takes milliseconds. Only when a new file is found does the expensive processing (transcription, API calls) begin. This event-driven-ish approach means the system is effectively idle 99% of the time.
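The cheap check described above might look something like this. It is a sketch, not the actual script: the crontab path, the `MIN_AGE_SECONDS` threshold, and the function names are assumptions. The age filter is an addition of mine to avoid picking up a file that is still being copied into the folder.

```python
import time
from pathlib import Path
from typing import Optional

# Hypothetical crontab entry for a 30-minute schedule:
#   */30 * * * * /usr/bin/python3 /path/to/pipeline.py

MIN_AGE_SECONDS = 60  # assumed threshold: skip files that may still be copying


def ready_files(
    watch_dir: Path,
    now: Optional[float] = None,
    min_age: float = MIN_AGE_SECONDS,
) -> list[Path]:
    """Files in watch_dir last modified at least min_age seconds ago."""
    now = time.time() if now is None else now
    return sorted(
        p for p in watch_dir.iterdir()
        if p.is_file() and now - p.stat().st_mtime >= min_age
    )


def main(watch_dir: Path) -> int:
    """Return the number of files that would be processed on this run."""
    files = ready_files(watch_dir)
    if not files:
        return 0  # nothing to do: the run ends after one directory listing
    # The expensive steps (transcription, API calls) would start here.
    return len(files)
```

On an empty folder the whole run is a single directory listing, which is why most cron invocations cost next to nothing.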
Comparison with Alternatives
To put this in perspective:
- Zapier or Make.com automation: The free tiers would not cover this workflow. Paid plans start at $20/month
- Using all cloud APIs: Whisper API + GPT-4 + cloud hosting would cost roughly $5 to $10/month
- Manual upload: Free in money, but 20 to 30 minutes per video. At 30 videos/month, that is 10+ hours of time
The Takeaway
AI automation does not have to be expensive. The key decisions are: run models locally when you can, use the cheapest model that does the job well, and leverage infrastructure you already have. This pipeline handles a genuinely useful task, runs completely unattended, and costs less than a packet of crisps per month. That is the kind of ROI that makes AI engineering exciting.