How to Use Suno API for AI Music Generation
AI Music Generation Has Arrived
When I first heard AI-generated music, the results were obvious novelty tracks that no one would actually use. That has changed dramatically. Suno now produces music that is genuinely good enough for background tracks in videos, podcasts, and commercial content. I integrated the Suno API into a media production pipeline, and here is everything I learned.
Getting Started with the Suno API
The Suno API accepts text prompts and returns generated audio tracks. The key is understanding how to structure your prompts for consistent, usable output.
```python
import requests
import time

class SunoClient:
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.base_url = "https://api.suno.ai/v1"
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        }

    def generate(self, prompt: str, style: str = "", duration: int = 60) -> dict:
        payload = {
            "prompt": prompt,
            "style": style,
            "duration": duration,
            "instrumental": True,
        }
        response = requests.post(
            f"{self.base_url}/generate",
            headers=self.headers,
            json=payload,
        )
        response.raise_for_status()
        task_id = response.json()["task_id"]
        return self._poll_result(task_id)

    def _poll_result(self, task_id: str, timeout: int = 300) -> dict:
        # Generation is asynchronous: poll the task endpoint until it
        # completes, fails, or we give up.
        start = time.time()
        while time.time() - start < timeout:
            result = requests.get(
                f"{self.base_url}/tasks/{task_id}",
                headers=self.headers,
            ).json()
            if result["status"] == "completed":
                return result
            elif result["status"] == "failed":
                raise Exception(f"Generation failed: {result.get('error')}")
            time.sleep(5)
        raise TimeoutError("Generation timed out")
```
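The polling loop is the part most worth testing, and you can exercise it without hitting the network by injecting a fake status fetcher. A minimal sketch of that pattern (the `fetch_status` callable and the fake task states are illustrative assumptions, not part of the Suno API):

```python
import time

def poll_until_done(fetch_status, timeout: float = 300, interval: float = 0.01) -> dict:
    # Same shape as SunoClient._poll_result, but with an injectable
    # fetcher so it can be driven by a fake in tests.
    start = time.time()
    while time.time() - start < timeout:
        result = fetch_status()
        if result["status"] == "completed":
            return result
        if result["status"] == "failed":
            raise RuntimeError(f"Generation failed: {result.get('error')}")
        time.sleep(interval)
    raise TimeoutError("Generation timed out")

# Fake task that completes on the third poll
states = iter([
    {"status": "pending"},
    {"status": "pending"},
    {"status": "completed", "audio_url": "https://example.com/track.mp3"},
])
result = poll_until_done(lambda: next(states))
```

Injecting the fetcher also makes it easy to simulate failures and timeouts, which are exactly the paths you do not want to discover in production.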
Prompt Engineering for Music
Music prompts work differently from text or image prompts. The key elements are genre, mood, tempo, instrumentation, and structure.
```python
MUSIC_PROMPT_TEMPLATE = """
Genre: {genre}
Mood: {mood}
Tempo: {tempo} BPM
Instrumentation: {instruments}
Structure: {structure}
Additional notes: {notes}
"""

# Example: Background music for a tech tutorial
prompt = MUSIC_PROMPT_TEMPLATE.format(
    genre="Electronic ambient",
    mood="Focused, calm, slightly upbeat",
    tempo="90",
    instruments="Soft synth pads, light percussion, subtle bass",
    structure="Gentle intro, steady middle section, soft fade out",
    notes="No vocals. Suitable for background listening while coding. Not distracting.",
)
```
Style Descriptors That Work
Through extensive testing, I found that certain style descriptors produce more consistent results:
- For background music: "ambient", "lo-fi", "minimal", "atmospheric"
- For upbeat content: "indie pop", "electronic dance", "funk groove"
- For corporate videos: "inspiring orchestral", "motivational", "cinematic light"
- For podcasts: "jazz lounge", "acoustic chill", "soft instrumental"
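These descriptor groups can be captured as presets so the rest of the pipeline never hand-writes style strings. A small sketch (the use-case keys and the `style_for` helper are my own naming, not part of the Suno API):

```python
# Descriptor groups that tested well, keyed by use case.
STYLE_PRESETS = {
    "background": ["ambient", "lo-fi", "minimal", "atmospheric"],
    "upbeat": ["indie pop", "electronic dance", "funk groove"],
    "corporate": ["inspiring orchestral", "motivational", "cinematic light"],
    "podcast": ["jazz lounge", "acoustic chill", "soft instrumental"],
}

def style_for(use_case: str) -> str:
    # Join the descriptors into a comma-separated string suitable
    # for the "style" field of the generate payload.
    if use_case not in STYLE_PRESETS:
        raise ValueError(f"Unknown use case: {use_case}")
    return ", ".join(STYLE_PRESETS[use_case])
```

Centralizing the presets also means that when a descriptor stops producing good results, you change it in one place and every caller picks up the fix.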
Building a Music Library
For my media pipeline, I generate a library of tracks organized by mood and use case. Each track gets metadata tags for easy retrieval.
```python
from pathlib import Path
import sqlite3
import requests

class MusicLibrary:
    def __init__(self, storage_dir: str, db_path: str):
        self.storage_dir = Path(storage_dir)
        self.storage_dir.mkdir(parents=True, exist_ok=True)
        self.db = sqlite3.connect(db_path)
        self._init_db()

    def _init_db(self):
        self.db.execute("""
            CREATE TABLE IF NOT EXISTS tracks (
                id TEXT PRIMARY KEY,
                title TEXT,
                genre TEXT,
                mood TEXT,
                tempo INTEGER,
                duration INTEGER,
                file_path TEXT,
                prompt TEXT,
                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
            )
        """)

    def add_track(self, generation_result: dict, metadata: dict):
        audio_url = generation_result["audio_url"]
        file_name = f"{generation_result['task_id']}.mp3"
        file_path = self.storage_dir / file_name
        # Download audio
        response = requests.get(audio_url)
        response.raise_for_status()
        file_path.write_bytes(response.content)
        # Store metadata; name the columns explicitly so the
        # created_at default still applies
        self.db.execute(
            "INSERT INTO tracks (id, title, genre, mood, tempo, duration, file_path, prompt) "
            "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            (generation_result["task_id"], metadata["title"],
             metadata["genre"], metadata["mood"], metadata["tempo"],
             metadata["duration"], str(file_path), metadata["prompt"]),
        )
        self.db.commit()

    def find_tracks(self, mood: str | None = None, genre: str | None = None) -> list[dict]:
        query = "SELECT * FROM tracks WHERE 1=1"
        params = []
        if mood:
            query += " AND mood LIKE ?"
            params.append(f"%{mood}%")
        if genre:
            query += " AND genre LIKE ?"
            params.append(f"%{genre}%")
        cursor = self.db.execute(query, params)
        columns = [d[0] for d in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
```
Integration with Video Pipelines
The real value comes from integrating generated music into automated video production. I use FFmpeg to mix the generated audio with video content.
```python
import subprocess

def add_background_music(
    video_path: str,
    music_path: str,
    output_path: str,
    music_volume: float = 0.15,
):
    # Lower the music volume, then mix it under the video's own audio.
    cmd = [
        "ffmpeg", "-i", video_path, "-i", music_path,
        "-filter_complex",
        f"[1:a]volume={music_volume}[bg];[0:a][bg]amix=inputs=2:duration=first[out]",
        "-map", "0:v", "-map", "[out]",
        "-c:v", "copy", "-c:a", "aac",
        "-shortest", output_path,
    ]
    subprocess.run(cmd, check=True)
```
Quality Control
Not every generated track is usable. I apply automated quality checks before adding tracks to the library:
- Duration check: Verify the track matches the requested length within 5 seconds
- Silence detection: Flag tracks with more than 3 seconds of silence
- Volume consistency: Check that the track maintains consistent levels throughout
- Human review: For client-facing content, always have a human listen to the final mix
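The first two checks are easy to express as pure functions once the track has been decoded. A sketch, assuming you have already decoded the audio to a sequence of normalized sample amplitudes (e.g. via ffmpeg to raw PCM; the thresholds mirror the checklist above and the 0.01 amplitude cutoff is an assumption you should tune):

```python
def duration_ok(actual_seconds: float, requested_seconds: float,
                tolerance: float = 5.0) -> bool:
    # Duration check: track must be within `tolerance` seconds of the request.
    return abs(actual_seconds - requested_seconds) <= tolerance

def longest_silence(samples: list[float], sample_rate: int,
                    threshold: float = 0.01) -> float:
    # Silence detection: length in seconds of the longest run of samples
    # whose absolute amplitude stays below `threshold`.
    longest = current = 0
    for s in samples:
        if abs(s) < threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest / sample_rate

# A track passes if duration_ok(...) and longest_silence(...) <= 3.0
```

Volume consistency can be checked the same way by comparing RMS levels over fixed-size windows, but the two checks above catch most of the obviously broken generations.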
Cost Optimization
Suno API credits add up when generating at scale. My optimization strategies:
- Generate tracks at standard quality first, then re-generate winners at high quality
- Build a reusable library rather than generating unique tracks for every piece of content
- Use shorter generation times (30 seconds) for initial prompt testing before committing to full-length tracks
AI music generation is a genuine game changer for content creators who need affordable, unique background music. The technology is not replacing human musicians for foreground music, but for background tracks and ambient audio, it delivers quality that was previously only available through expensive licensing or custom composition.