Automated Content Generation: 1000 Articles Across 10 Verticals
The Scale Challenge
A content marketing agency approached me with an ambitious goal: produce 1000 high-quality, SEO-optimized articles across 10 different industry verticals in 30 days. At traditional agency rates, this would cost over $250,000 and take 3 months. They wanted it done with AI in a fraction of the time and cost.
Here is how I designed and executed the pipeline.
Phase 1: Topic Research and Keyword Planning
Before generating a single word, I needed 1000 topics mapped to specific keywords with search volume data. I built a topic research agent that combines SEO data with AI analysis.
```python
# Assumes `pip install openai` and OPENAI_API_KEY set in the environment.
from openai import OpenAI

class TopicResearcher:
    def __init__(self):
        self.llm = OpenAI()

    def generate_topics(self, vertical: str, count: int = 100) -> list[dict]:
        # Step 1: Generate seed topics (2x the target so filtering has headroom)
        seed_prompt = f"""Generate {count * 2} blog post topic ideas for the
{vertical} industry. Focus on:
- Common questions buyers have
- Technical how-to guides
- Industry trend analysis
- Comparison and review topics
Format: one topic per line"""
        seeds = self.llm.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": seed_prompt}]
        ).choices[0].message.content.strip().split("\n")

        # Step 2: Enrich with keyword data and drop low-volume topics.
        # get_keyword_metrics wraps an external SEO API (implementation not shown).
        enriched = []
        for topic in seeds:
            keyword_data = self.get_keyword_metrics(topic)
            if keyword_data['monthly_volume'] >= 100:
                enriched.append({
                    'topic': topic,
                    'primary_keyword': keyword_data['keyword'],
                    'volume': keyword_data['monthly_volume'],
                    'difficulty': keyword_data['difficulty'],
                    'vertical': vertical
                })

        # Step 3: Return the top topics by volume/difficulty ratio
        return sorted(enriched,
                      key=lambda x: x['volume'] / max(x['difficulty'], 1),
                      reverse=True)[:count]
```
Phase 2: Content Brief Generation
Each article needs a detailed brief before generation. The brief ensures the article targets the right keywords, covers the topic comprehensively, and follows a consistent structure.
```python
import json

from openai import OpenAI

class BriefGenerator:
    def __init__(self):
        self.llm = OpenAI()

    def create_brief(self, topic: dict) -> dict:
        prompt = f"""Create a detailed content brief for this article:
Topic: {topic['topic']}
Primary keyword: {topic['primary_keyword']}
Industry: {topic['vertical']}
Include:
1. Suggested title (under 60 chars, keyword-rich)
2. Meta description (under 155 chars)
3. Target word count (800-1500)
4. Outline with H2 and H3 headings
5. Key points to cover under each heading
6. Secondary keywords to include naturally
7. Internal linking opportunities
8. Unique angle or perspective
Output as JSON."""
        # response_format forces valid JSON, so the parse below cannot fail
        response = self.llm.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
            response_format={"type": "json_object"}
        )
        return json.loads(response.choices[0].message.content)
```
Phase 3: Article Generation
The article generator follows the brief strictly, producing well-structured HTML content with proper heading hierarchy and natural keyword integration.
```python
import json

from openai import OpenAI

class ArticleGenerator:
    def __init__(self):
        self.llm = OpenAI()

    def generate(self, brief: dict) -> dict:
        prompt = f"""Write a complete blog article following this brief exactly.
Brief: {json.dumps(brief)}
Requirements:
- Write in a professional but approachable tone
- Use the exact heading structure from the outline
- Include the primary keyword in the first paragraph
- Weave secondary keywords naturally throughout
- Include practical examples and actionable advice
- Output as clean HTML with h2, h3, p, ul, li tags
- Target the specified word count
- Do not use em dashes
Write the complete article now."""
        response = self.llm.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
            max_tokens=4000
        )
        return {
            'title': brief['title'],
            'content': response.choices[0].message.content,
            'brief': brief
        }
```
Phase 4: Quality Assurance Pipeline
Every article passes through automated QA before human review:
```python
import textstat  # pip install textstat

class QualityChecker:
    def check(self, article: dict) -> dict:
        checks = {
            'word_count': self.check_word_count(article),
            'keyword_density': self.check_keyword_density(article),
            'heading_structure': self.check_headings(article),
            'readability': self.check_readability(article),
            'originality': self.check_originality(article),
            'factual_claims': self.check_facts(article)
        }
        passed = all(c['passed'] for c in checks.values())
        return {'passed': passed, 'checks': checks}

    def check_readability(self, article: dict) -> dict:
        # strip_html removes markup so only the visible text is scored
        text = strip_html(article['content'])
        score = textstat.flesch_reading_ease(text)
        return {
            'passed': 40 <= score <= 70,
            'score': score,
            'note': 'Target: 40-70 Flesch reading ease'
        }
```
Phase 5: Human Review Workflow
AI generation is not fire-and-forget. I built a review queue where human editors can approve, request revisions, or reject articles. The key insight is that human reviewers should focus on accuracy and nuance, not grammar and structure, because the AI handles those well.
- Articles scoring 90%+ on QA go to a light review queue (5 minutes per article)
- Articles scoring 70-90% go to a standard review queue (15 minutes per article)
- Articles below 70% get regenerated automatically with adjusted prompts
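The routing logic above reduces to a small dispatch function. This is a sketch of how I would express it; the function name and the regeneration cap are my own additions (the original does not say how many regeneration attempts were allowed before escalating):

```python
def route_article(qa_score: float, regen_count: int, max_regens: int = 2) -> str:
    """Map a QA score (0-100) to a review queue, per the thresholds above.

    Returns 'light_review', 'standard_review', 'regenerate', or
    'manual_review' once the (assumed) regeneration budget is exhausted.
    """
    if qa_score >= 90:
        return 'light_review'
    if qa_score >= 70:
        return 'standard_review'
    if regen_count < max_regens:
        return 'regenerate'
    return 'manual_review'
```

The cap matters: without it, a topic the model handles poorly would loop forever instead of surfacing to a human.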
Results and Metrics
The pipeline produced 1000 articles in 22 days:
- Average generation cost: $0.35 per article
- QA pass rate on first generation: 78%
- QA pass rate after one regeneration: 94%
- Human approval rate: 91%
- Average human review time: 8 minutes per article
- Total cost including human review: roughly $12 per article
What I Would Do Differently
If I were building this again, I would invest more time in the brief generation phase. The articles that required the most human revision were almost always the ones with weak briefs. A 10-minute improvement in brief quality saves 30 minutes in review and revision.
I would also build a feedback loop where human editor corrections automatically update the generation prompts. After 1000 articles, the editors had identified clear patterns in what the AI got wrong, and those patterns should flow back into the system.
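One way to close that loop is to tag each editor correction with a category and promote a category to a standing prompt instruction once it recurs often enough. This is a hypothetical sketch, not something the described system implemented; the class, tags, and rules are all invented for illustration:

```python
from collections import Counter

class PromptFeedback:
    """Fold recurring editor corrections back into the generation prompt.

    Editors tag each correction (e.g. 'unsupported statistic'); once a tag
    recurs past a threshold, its rule is appended to future article prompts.
    """
    RULES = {
        'unsupported statistic':
            'Do not cite specific statistics unless they appear in the brief.',
        'generic intro':
            'Open with a concrete scenario, not a definition of the topic.',
    }

    def __init__(self, threshold: int = 5):
        self.threshold = threshold
        self.tag_counts = Counter()

    def record(self, tags: list[str]) -> None:
        self.tag_counts.update(tags)

    def prompt_addendum(self) -> str:
        # Only rules whose tag has crossed the threshold make it into prompts
        return "\n".join(rule for tag, rule in self.RULES.items()
                         if self.tag_counts[tag] >= self.threshold)
```

The threshold guards against overreacting to a single editor's preference while still capturing genuine patterns.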
Automated content generation at scale is entirely feasible today, but the quality bar must be maintained with systematic QA and human oversight. The AI writes, but humans still edit and approve.