ACE-Step v1.5 Base
ACE-Step v1.5 Base is an open-source music generation foundation model built on a hybrid LLM planner and Diffusion Transformer architecture. It generates full tracks from text prompts with support for voice cloning, lyric editing, remixing, cover generation, and compositions up to 10 minutes. It supports over 50 languages and runs on consumer hardware with under 4GB VRAM.
API Options
Platform-level options for task execution and delivery.
-
taskType
string required value: audioInference -
Identifier for the type of task being performed
-
taskUUID
string required UUID v4 -
UUID v4 identifier for tracking tasks and matching async responses. Must be unique per task.
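A minimal sketch of producing a fresh UUID v4 for each task with Python's standard library. The helper name `new_task` is hypothetical; only the `taskType` and `taskUUID` field names come from this reference.

```python
import uuid


def new_task(task_type: str = "audioInference") -> dict:
    """Return a task skeleton with a fresh UUID v4 (hypothetical helper)."""
    return {
        "taskType": task_type,
        # uuid4() produces a random UUID, so each call gets its own identifier.
        "taskUUID": str(uuid.uuid4()),
    }


first = new_task()
second = new_task()
```

Generating the identifier at the point where the task is built, rather than reusing a constant, keeps each task's `taskUUID` distinct.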
-
outputType
string default: URL -
Audio output type.
Allowed values 3 values
-
outputFormat
string default: MP3 -
Specifies the file format of the generated output. The available values depend on the task type and the specific model's capabilities.
- `MP3`: Compressed audio, smaller file size.
- `WAV`: Uncompressed, high-quality audio.
- `FLAC`: Lossless compression.
- `OGG`: Open-source compressed audio format (Vorbis codec).
Allowed values 4 values
-
audioSettings
object -
Audio encoding settings for controlling the bitrate, number of channels, and sample rate of the generated audio. Only applicable for lossy output formats (`MP3` and `OGG`). When using lossless formats (`WAV` or `FLAC`), this parameter must not be provided.
The available sample rates and valid bitrate ranges depend on the output format. For `OGG`, bitrate limits also vary by the number of channels.
MP3 bitrate limits
Bitrate limits for MP3 are the same regardless of mono or stereo.
| Sample Rate | Min Bitrate | Max Bitrate |
| --- | --- | --- |
| 8,000 Hz | 8 kbps | 64 kbps |
| 11,025 Hz | 8 kbps | 64 kbps |
| 12,000 Hz | 8 kbps | 64 kbps |
| 16,000 Hz | 8 kbps | 160 kbps |
| 22,050 Hz | 8 kbps | 160 kbps |
| 24,000 Hz | 8 kbps | 160 kbps |
| 32,000 Hz | 32 kbps | 320 kbps |
| 44,100 Hz | 32 kbps | 320 kbps |
| 48,000 Hz | 32 kbps | 320 kbps |
OGG bitrate limits: Mono (1 channel)
| Sample Rate | Min Bitrate | Max Bitrate |
| --- | --- | --- |
| 8,000 Hz | 8 kbps | 40 kbps |
| 12,000 Hz | 16 kbps | 48 kbps |
| 16,000 Hz | 16 kbps | 96 kbps |
| 24,000 Hz | 16 kbps | 80 kbps |
| 48,000 Hz | 32 kbps | 224 kbps |
OGG bitrate limits: Stereo (2 channels)
| Sample Rate | Min Bitrate | Max Bitrate |
| --- | --- | --- |
| 8,000 Hz | 16 kbps | 80 kbps |
| 12,000 Hz | 16 kbps | 96 kbps |
| 16,000 Hz | 24 kbps | 192 kbps |
| 24,000 Hz | 32 kbps | 160 kbps |
| 48,000 Hz | 48 kbps | 256 kbps |
Lossless formats: When `outputFormat` is set to `WAV` or `FLAC`, the `audioSettings` parameter is not available since these formats produce uncompressed or lossless audio with no configurable encoding settings.
Properties 3 properties
-
audioSettings»bitrate
integer min: 8 -
Audio bitrate in kbps.
-
audioSettings»channels
integer default: 2 -
Number of audio channels. 1 for mono, 2 for stereo.
Allowed values 2 values
-
audioSettings»sampleRate
integer -
Audio sample rate in Hz.
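The bitrate limits for `audioSettings` can be checked client-side before a request is sent. The sketch below copies the limits listed for MP3 and OGG into lookup tables. The function name `check_audio_settings` is hypothetical, and treating the minimum and maximum as inclusive bounds is an assumption.

```python
# (min_kbps, max_kbps) per sample rate, copied from the limits listed above.
MP3_LIMITS = {
    8000: (8, 64), 11025: (8, 64), 12000: (8, 64),
    16000: (8, 160), 22050: (8, 160), 24000: (8, 160),
    32000: (32, 320), 44100: (32, 320), 48000: (32, 320),
}
# OGG limits depend on the channel count: 1 = mono, 2 = stereo.
OGG_LIMITS = {
    1: {8000: (8, 40), 12000: (16, 48), 16000: (16, 96),
        24000: (16, 80), 48000: (32, 224)},
    2: {8000: (16, 80), 12000: (16, 96), 16000: (24, 192),
        24000: (32, 160), 48000: (48, 256)},
}


def check_audio_settings(output_format, bitrate, sample_rate, channels=2):
    """Return a list of problems; an empty list means the combination looks valid.

    Hypothetical helper. Bounds are assumed to be inclusive.
    """
    if output_format in ("WAV", "FLAC"):
        return ["audioSettings must not be provided for lossless formats"]
    if channels not in (1, 2):
        return ["channels must be 1 (mono) or 2 (stereo)"]
    if output_format == "MP3":
        table = MP3_LIMITS  # same limits for mono and stereo
    elif output_format == "OGG":
        table = OGG_LIMITS[channels]
    else:
        return [f"unknown output format: {output_format}"]
    if sample_rate not in table:
        return [f"sample rate {sample_rate} Hz is not listed for {output_format}"]
    low, high = table[sample_rate]
    if not low <= bitrate <= high:
        return [f"bitrate {bitrate} kbps is outside {low}-{high} kbps"]
    return []
```

For example, `check_audio_settings("MP3", 192, 16000)` reports a problem because the listed maximum at 16,000 Hz is 160 kbps.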
-
-
webhookURL
string URI -
Specifies a webhook URL where JSON responses will be sent via HTTP POST when generation tasks complete. For batch requests with multiple results, each completed item triggers a separate webhook call as it becomes available.
Learn more 1 resource
- Webhooks PLATFORM
-
deliveryMethod
string default: sync -
Determines how the API delivers task results.
Allowed values 2 values
- `sync`: Returns complete results directly in the API response.
- `async`: Returns an immediate acknowledgment with the task UUID. Poll for results using getResponse.
Learn more 1 resource
- Task Polling PLATFORM
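When results are polled rather than returned directly, the client has to ask for them repeatedly until they are ready. The sketch below shows one way to structure that loop. The transport is injected as a callable named `send` (hypothetical), so no endpoint or authentication scheme is assumed. The shape of the `getResponse` request and the convention that the reply is an empty list while the task is still running are both assumptions, not confirmed behaviour.

```python
import time
from typing import Callable


def poll_for_results(
    send: Callable[[dict], list],
    task_uuid: str,
    interval_s: float = 1.0,
    max_attempts: int = 30,
) -> list:
    """Poll until `send` returns at least one result for `task_uuid`.

    `send` is a hypothetical transport: it takes one task dict and returns
    a list of result dicts (assumed to be empty while the task is pending).
    """
    # Assumed request shape for the polling task.
    request = {"taskType": "getResponse", "taskUUID": task_uuid}
    for _ in range(max_attempts):
        # Keep only results that belong to the task being polled.
        results = [r for r in send(request) if r.get("taskUUID") == task_uuid]
        if results:
            return results
        time.sleep(interval_s)
    raise TimeoutError(f"no result for task {task_uuid} after {max_attempts} attempts")
```

Bounding the loop with `max_attempts` keeps a stalled task from blocking the caller indefinitely.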
-
uploadEndpoint
string URI -
Specifies a URL where the generated content will be automatically uploaded using the HTTP PUT method. The raw binary data of the media file is sent directly as the request body. For secure uploads to cloud storage, use presigned URLs that include temporary authentication credentials.
Common use cases:
- Cloud storage: Upload directly to S3 buckets, Google Cloud Storage, or Azure Blob Storage using presigned URLs.
- CDN integration: Upload to content delivery networks for immediate distribution.
// S3 presigned URL for secure upload
https://your-bucket.s3.amazonaws.com/generated/content.mp4?X-Amz-Signature=abc123&X-Amz-Expires=3600
// Google Cloud Storage presigned URL
https://storage.googleapis.com/your-bucket/content.jpg?X-Goog-Signature=xyz789
// Custom storage endpoint
https://storage.example.com/uploads/generated-image.jpg
The content data will be sent as the request body to the specified URL when generation is complete.
-
ttl
integer min: 60 -
Time-to-live (TTL) in seconds for generated content. Only applies when `outputType` is `URL`.
-
includeCost
boolean default: false -
Include task cost in the response.
-
numberResults
integer min: 1 max: 4 default: 1 -
Number of results to generate. Each result uses a different seed, producing variations of the same parameters.
Inputs
Input resources for the task (images, audio, etc.). These must be nested inside the `inputs` object.
-
inputs»audio
string -
Audio input (UUID or URL).
Generation Parameters
Core parameters for controlling the generated content.
-
model
string required value: runware:ace-step@v1.5-base -
Identifier of the model to use for generation.
-
positivePrompt
string required min: 2 max: 3000 -
Text prompt describing elements to include in the generated output.
-
negativePrompt
string min: 2 max: 3000 -
Prompt to guide what to exclude from generation. Ignored when guidance is disabled (CFGScale ≤ 1).
-
duration
float min: 6 max: 300 step: 0.1 default: 60 -
Length of the generated audio track in seconds.
-
seed
integer min: 0 max: 2147483647 -
Random seed for reproducible generation. When not provided, a random seed is generated in the unsigned 32-bit range.
-
steps
integer min: 1 max: 300 default: 100 -
Total number of denoising steps. Higher values generally produce more detailed results but take longer.
-
CFGScale
float min: 1 max: 30 step: 0.01 default: 10 -
Guidance scale. Higher values follow the prompt more closely at the cost of quality.
-
strength
float min: 0 max: 1 step: 0.01 default: 0.5 -
Fraction of steps using the input source instead of generated output.
Settings
Technical parameters to fine-tune the inference process. These must be nested inside the `settings` object.
-
settings»bpm
integer min: 30 max: 300 -
Beats per minute. If not set, the model decides automatically.
-
settings»coverConditioningScale
float min: 0 max: 1 step: 0.01 default: 1 -
Fraction of steps using source-audio conditioning.
-
settings»guidanceType
string default: apg -
Controls how guidance is applied during generation.
Allowed values 2 values
- Adaptive Projected Guidance.
- Classifier-Free Guidance.
-
settings»keyScale
string -
Musical key and scale in '{Note}{Accidental} {Mode}' format (e.g. 'C major', 'F# minor', 'Bb major').
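The `keyScale` format can be checked with a short regular expression before sending a request. This is a sketch only: restricting the mode to `major` or `minor` is an assumption based on the examples given, and the name `is_valid_key_scale` is hypothetical.

```python
import re

# Note letter A-G, optional accidental (# or b), one space, then the mode.
# Limiting modes to "major"/"minor" is an assumption drawn from the examples.
KEY_SCALE_RE = re.compile(r"[A-G][#b]? (major|minor)")


def is_valid_key_scale(value: str) -> bool:
    """Return True if `value` matches the '{Note}{Accidental} {Mode}' pattern."""
    return KEY_SCALE_RE.fullmatch(value) is not None
```

With this pattern, `'C major'`, `'F# minor'`, and `'Bb major'` are accepted, while `'c major'` and `'H major'` are rejected.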
-
settings»lyrics
string min: 10 max: 3000 default: -
Song lyrics, typically formatted like a lyrics website.
-
settings»repaintingEnd
float min: 0 max: 300 -
End time in seconds for repaint region. Requires input audio. Values beyond audio duration append new audio.
-
settings»repaintingStart
float min: -300 max: 300 -
Start time in seconds for repaint region. Requires input audio. Negative values prepend audio before the start.
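The effect of a repaint region that extends past either end of the source audio can be worked out with simple arithmetic. The sketch below assumes that a negative `repaintingStart` prepends exactly that many seconds and that a `repaintingEnd` beyond the source duration appends the difference. The platform's exact behaviour may differ, and `repaint_plan` is a hypothetical helper.

```python
def repaint_plan(source_duration: float, repainting_start: float,
                 repainting_end: float) -> dict:
    """Estimate how much audio is prepended or appended by a repaint region.

    Assumes (not confirmed by the reference) that the region maps directly
    onto the source timeline, with time 0 at the start of the source audio.
    """
    if repainting_end <= repainting_start:
        raise ValueError("repaintingEnd must be greater than repaintingStart")
    # Seconds of new audio placed before the source begins.
    prepended = float(max(0.0, -repainting_start))
    # Seconds of new audio placed after the source ends.
    appended = float(max(0.0, repainting_end - source_duration))
    return {
        "prepended_s": prepended,
        "appended_s": appended,
        "total_s": prepended + float(source_duration) + appended,
    }
```

Under these assumptions, a 60-second source with `repaintingStart` of -5 and `repaintingEnd` of 70 gains 5 seconds at the front and 10 seconds at the end, for 75 seconds in total.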
-
settings»timeSignature
string -
Beats per measure. Empty string lets the model decide.
Allowed values 4 values
-
settings»vocalLanguage
string default: en -
ISO 639-1 language code for vocals. Use `unknown` for instrumental or auto detection.
Allowed values 51 values
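The parameters above split into top-level fields and fields nested under `settings`. The following sketch assembles a request body with that nesting and checks a few of the documented numeric ranges first. The helper names (`build_audio_task`, `_check`) are hypothetical, and the set of ranges checked is deliberately partial.

```python
import uuid

MODEL = "runware:ace-step@v1.5-base"

# (min, max) ranges copied from the parameter reference above.
TOP_LEVEL_RANGES = {
    "duration": (6, 300), "steps": (1, 300),
    "CFGScale": (1, 30), "strength": (0, 1),
}
SETTINGS_RANGES = {"bpm": (30, 300), "coverConditioningScale": (0, 1)}


def _check(ranges: dict, values: dict) -> None:
    """Raise ValueError if any known numeric parameter is out of range."""
    for name, value in values.items():
        if name in ranges:
            low, high = ranges[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside [{low}, {high}]")


def build_audio_task(positive_prompt: str, settings: dict = None, **params) -> dict:
    """Build an audioInference task dict (hypothetical helper)."""
    if not 2 <= len(positive_prompt) <= 3000:
        raise ValueError("positivePrompt must be 2 to 3000 characters")
    _check(TOP_LEVEL_RANGES, params)
    _check(SETTINGS_RANGES, settings or {})
    task = {
        "taskType": "audioInference",
        "taskUUID": str(uuid.uuid4()),
        "model": MODEL,
        "positivePrompt": positive_prompt,
        **params,  # top-level generation parameters such as duration or steps
    }
    if settings:
        # bpm, keyScale, lyrics, etc. must sit inside the settings object.
        task["settings"] = dict(settings)
    return task
```

A call such as `build_audio_task("warm lo-fi piano", settings={"bpm": 90}, duration=30)` yields a dict with `duration` at the top level and `bpm` under `settings`.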
Midnight Roller Rink Synthwave
{
"taskType": "audioInference",
"taskUUID": "db9ad4da-7608-4fe2-9111-a9b0e6bc94a7",
"model": "runware:ace-step@v1.5-base",
"positivePrompt": "An energetic retro synth-pop track set in a nearly empty roller rink after closing time, glowing floors, humming amplifiers, soft announcer echoes, crisp drum machines, elastic bassline, bright analog synth leads, catchy female vocals, nostalgic yet stylish, polished studio mix, strong chorus, cinematic build, playful late-night momentum",
"negativePrompt": "muddy mix, distorted vocals, aggressive metal guitars, orchestral score, crowd noise, spoken word, lo-fi hiss, chaotic tempo changes",
"duration": 75,
"seed": 25165,
"steps": 120,
"CFGScale": 8.5,
"settings": {
"bpm": 112,
"guidanceType": "apg",
"keyScale": "A minor",
"timeSignature": 4,
"vocalLanguage": "en",
"lyrics": "[Verse 1]\nWheels in circles on a painted line\nNeon streaks and a borrowed shine\nLast song spinning through the cooling air\nI keep dancing like you're still there\n\n[Pre-Chorus]\nEmpty room, electric heart\nFalling back into the start\n\n[Chorus]\nGlide with me through the afterglow\nWhere the silver speakers throb real low\nTurn the silence into a spark\nWe can make a little sun in the dark\n\n[Verse 2]\nDisco dust on a cobalt floor\nOne more lap and I want more\nMirror lights in a gentle blur\nEvery beat makes the night occur\n\n[Chorus]\nGlide with me through the afterglow\nWhere the silver speakers throb real low\nTurn the silence into a spark\nWe can make a little sun in the dark"
}
}
{
"taskType": "audioInference",
"taskUUID": "db9ad4da-7608-4fe2-9111-a9b0e6bc94a7",
"audioUUID": "7a277cbf-5cc6-4a2a-a2b9-06b3befcc0da",
"audioURL": "https://am.runware.ai/audio/os/a19d05/ws/5/ai/7a277cbf-5cc6-4a2a-a2b9-06b3befcc0da.mp3",
"seed": 25165,
"cost": 0.0058
}
Glasshouse Tango Train Heist
{
"taskType": "audioInference",
"taskUUID": "d179e09e-03aa-439a-bdc7-545ece313a41",
"model": "runware:ace-step@v1.5-base",
"positivePrompt": "A dramatic electro-tango heist song set aboard a luxurious cross-country train racing through frost-covered farmland at dawn. Start with nervous bandoneon, brushed snare, muted upright bass, and soft analog synth pulses, then build into a sleek groove with crisp hand percussion, staccato strings, and a confident female lead vocal. The mood is elegant, tense, witty, and fast-moving, like a jewel thief smiling during the getaway. Strong melodic hook, polished modern production, cinematic transitions, a memorable chorus, and a final section with layered harmonies and a triumphant instrumental flourish.",
"negativePrompt": "muddy mix, distorted vocals, chaotic structure, overly aggressive rock guitars, lo-fi noise, crowd sounds, spoken word, comedy effects, childish melody, generic EDM drop",
"duration": 118,
"seed": 60007,
"steps": 160,
"CFGScale": 8.5,
"settings": {
"bpm": 124,
"guidanceType": "apg",
"keyScale": "D minor",
"timeSignature": 4,
"vocalLanguage": "en",
"lyrics": "[Verse 1]\nVelvet ticket, hidden name\nSilver buckle, borrowed fame\nGlass roof shivering with the light\nI count the seconds, hold them tight\n\n[Pre-Chorus]\nKeys in my glove, sparks in my grin\nOne small breach and I slide right in\nEngines hum like a secret choir\nSteel on steel and a pulse like fire\n\n[Chorus]\nRun with the rails, don't look behind\nGold in the dawn and smoke in my mind\nKiss of danger, precise and sweet\nHeart like a drum and dust on my feet\nRun with the rails, let the whole world guess\nGrace in the theft and calm in the mess\nBy the next station I'm already gone\nLeaving my name in the break of dawn\n\n[Verse 2]\nCrystal cases, coded lock\nBlack silk timing to the ticking shock\nPorters whisper down the hall\nNo one notices at all\n\n[Pre-Chorus]\nMap in my head, flame in my chest\nEvery clean gamble demands its best\nWindows blaze with a copper sky\nI take the prize and I don't ask why\n\n[Chorus]\nRun with the rails, don't look behind\nGold in the dawn and smoke in my mind\nKiss of danger, precise and sweet\nHeart like a drum and dust on my feet\nRun with the rails, let the whole world guess\nGrace in the theft and calm in the mess\nBy the next station I'm already gone\nLeaving my name in the break of dawn\n\n[Bridge]\nIf they remember, let them remember style\nThe heel turn, the half-smile, the impossible mile\nNo shattered glass, no wasted breath\nJust a red horizon and a dance with depth\n\n[Final Chorus]\nRun with the rails, don't look behind\nGold in the dawn and smoke in my mind\nKiss of danger, precise and sweet\nHeart like a drum and dust on my feet\nRun with the rails, let the whole world guess\nGrace in the theft and calm in the mess\nBy the next station I'm already gone\nLeaving my name in the break of dawn"
}
}
{
"taskType": "audioInference",
"taskUUID": "d179e09e-03aa-439a-bdc7-545ece313a41",
"audioUUID": "8756bdd4-4848-4d70-b735-8726996e7acd",
"audioURL": "https://am.runware.ai/audio/os/a19d05/ws/5/ai/8756bdd4-4848-4d70-b735-8726996e7acd.mp3",
"seed": 60007,
"cost": 0.0122
}
Brass Comet Derby Anthem
{
"taskType": "audioInference",
"taskUUID": "bba41773-669d-4bcb-b814-c607e381a7a5",
"model": "runware:ace-step@v1.5-base",
"positivePrompt": "An exhilarating retro-futurist race anthem about a sky arena where brass-powered comet carts streak through painted clouds. Big band horns collide with stomping glam rock drums, wiry electric bass, sparkling synth accents, handclaps, and a charismatic lead vocal. Starts with a tense snare roll and muted trumpet motif, blooms into a massive singalong chorus, then a brief breakdown with crowd chants and whistling before a triumphant final refrain. Bright, theatrical, athletic, playful, and grand; polished studio production with memorable hooks and dynamic transitions.",
"negativePrompt": "ambient drift, lo-fi haze, trap hi-hats, harsh distortion, death metal screaming, spoken word only, minimalist drone, chaotic structure, muddy mix, weak chorus",
"duration": 78,
"seed": 99684,
"steps": 120,
"CFGScale": 8.5,
"settings": {
"bpm": 146,
"guidanceType": "apg",
"keyScale": "D minor",
"lyrics": "[Verse 1]\nPaint on the clouds and a spark in the wheel\nCopper-wing engines and a daredevil feel\nHelmet full of thunder, number on my sleeve\nI lean to the horizon like I never plan to leave\n\n[Pre-Chorus]\nHear the trumpets flare\nFire in the air\nEverybody counting down\n\n[Chorus]\nRun, run, comet runner, blaze across the blue\nShake the rails of heaven with the wild things that you do\nHey, hey, hold the banner, let the whole world see\nWe were born for the derby of impossible velocity\n\n[Verse 2]\nRivals at my shoulder with a grin made of chrome\nEvery turn a gamble and the storm feels like home\nKick the pedal harder, let the bright gears sing\nA hundred hearts are beating to the racket that we bring\n\n[Pre-Chorus]\nHear the trumpets flare\nFire in the air\nEverybody counting down\n\n[Chorus]\nRun, run, comet runner, blaze across the blue\nShake the rails of heaven with the wild things that you do\nHey, hey, hold the banner, let the whole world see\nWe were born for the derby of impossible velocity\n\n[Bridge]\nWhoa-oh, whistle to the rafters\nWhoa-oh, chase the afterburn\nIf we fall, we fall like meteors\nIf we win, the stars will turn\n\n[Final Chorus]\nRun, run, comet runner, blaze across the blue\nShake the rails of heaven with the wild things that you do\nHey, hey, raise the chorus, let it ring ferociously\nWe were born for the derby of impossible velocity",
"timeSignature": 4,
"vocalLanguage": "en"
}
}
{
"taskType": "audioInference",
"taskUUID": "bba41773-669d-4bcb-b814-c607e381a7a5",
"audioUUID": "3329a95a-21f2-43b4-a39b-30d52f0233d0",
"audioURL": "https://am.runware.ai/audio/os/a03d21/ws/5/ai/3329a95a-21f2-43b4-a39b-30d52f0233d0.mp3",
"seed": 99684,
"cost": 0.0058
}
Chrome Elevator Funk Capsule
{
"taskType": "audioInference",
"taskUUID": "644e9e2e-caa3-425a-9872-68f5047d8473",
"model": "runware:ace-step@v1.5-base",
"positivePrompt": "A sleek retro-futurist funk track set inside a speeding orbital elevator capsule, with elastic slap bass, crisp handclaps, talkbox-style synth leads, brushed electric piano chords, playful analog drum fills, and confident female vocals. The mood is stylish, kinetic, polished, and slightly mischievous, like a VIP commute through the stratosphere. Build from a tight grooving verse into a catchy singalong chorus, with a short instrumental break featuring synth glides and bass pops. Rich stereo production, memorable hook, clean low end, and a glossy late-70s-meets-future sound.",
"negativePrompt": "heavy metal guitars, orchestral score, cinematic trailer booms, lo-fi tape noise, crowd chants, aggressive rap, distorted screaming, ambient drone, country twang, reggae skank, trap hi-hat rolls, glitch chaos, muddy mix",
"duration": 78,
"seed": 95999,
"steps": 120,
"CFGScale": 8.5,
"settings": {
"bpm": 112,
"guidanceType": "apg",
"keyScale": "E minor",
"timeSignature": 4,
"vocalLanguage": "en",
"lyrics": "[Verse 1]\nSilver rails and velvet doors\nCity lights below like scattered cores\nMagnet shoes and mirrored shades\nWe rise above the weather grids and trade\n\n[Pre-Chorus]\nFeel the cable hum, steady and bright\nStatic on the skin, hearts in flight\n\n[Chorus]\nTake me up, up through the blue\nChrome capsule dancing with a skyline view\nNo traffic, no ground, just a glowing line\nElevator funk on borrowed time\n\n[Verse 2]\nWindow halo, midnight glare\nJetstream writing signatures in air\nPocket stars on tailored sleeves\nWe laugh where gravity forgets to breathe\n\n[Chorus]\nTake me up, up through the blue\nChrome capsule dancing with a skyline view\nNo traffic, no ground, just a glowing line\nElevator funk on borrowed time\n\n[Bridge]\nBassline leaning, motors sing\nOrbit in the making with a velvet swing\n\n[Final Chorus]\nTake me up, up through the blue\nChrome capsule dancing with a skyline view\nNo traffic, no ground, just a glowing line\nElevator funk, your hand in mine"
}
}{
"taskType": "audioInference",
"taskUUID": "644e9e2e-caa3-425a-9872-68f5047d8473",
"audioUUID": "c0e0e3c7-dc2b-4492-91be-033560bf9e43",
"audioURL": "https://am.runware.ai/audio/os/a09d21/ws/5/ai/c0e0e3c7-dc2b-4492-91be-033560bf9e43.mp3",
"seed": 95999,
"cost": 0.0064
}
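In each example above, the response carries the same `taskUUID` as its request, which is what allows results to be matched back to the task that produced them. A minimal sketch of that matching follows. The function name `match_results` is hypothetical, and the sample data is abbreviated rather than a full payload.

```python
def match_results(requests: list, responses: list) -> dict:
    """Group response objects under the request that produced them.

    Returns a dict keyed by taskUUID. Responses whose taskUUID does not
    correspond to any known request are ignored.
    """
    by_uuid = {
        req["taskUUID"]: {"request": req, "results": []} for req in requests
    }
    for res in responses:
        entry = by_uuid.get(res.get("taskUUID"))
        if entry is not None:
            entry["results"].append(res)
    return by_uuid


# Abbreviated sample data using identifiers from the examples above.
requests = [
    {"taskType": "audioInference", "taskUUID": "db9ad4da-7608-4fe2-9111-a9b0e6bc94a7"},
    {"taskType": "audioInference", "taskUUID": "d179e09e-03aa-439a-bdc7-545ece313a41"},
]
responses = [
    {"taskUUID": "d179e09e-03aa-439a-bdc7-545ece313a41", "seed": 60007},
    {"taskUUID": "db9ad4da-7608-4fe2-9111-a9b0e6bc94a7", "seed": 25165},
]
matched = match_results(requests, responses)
```

Because each entry holds a list, the same structure also accommodates tasks that return several results, as when `numberResults` is greater than 1.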