Fish Audio S2.1 Pro
Fish Audio S2.1 Pro is a flagship text-to-speech model built for highly expressive, low-latency speech generation. It supports natural-language bracket cues such as [whispering] or [short pause] for emotion and delivery control, multi-speaker dialogue in a single generation via <|speaker:N|> tags, 80+ languages with automatic language detection, and real-time streaming with a short time to first audio.
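The speaker tags and bracket cues used throughout the examples on this page can be assembled programmatically. The following is a minimal sketch, not an official client: `build_script` and `build_request` are hypothetical helper names, the field names are copied from the example requests on this page, and the mapping of speaker index N to `voices[N]` is an assumption that the examples suggest but do not state.

```python
import json
import uuid


def build_script(turns):
    """Join (speaker_index, cue, line) turns into one tagged script string."""
    parts = []
    for speaker, cue, line in turns:
        # A bracket cue is optional; omit the brackets when no cue is given.
        prefix = f"[{cue}] " if cue else ""
        parts.append(f"<|speaker:{speaker}|>{prefix}{line}")
    # The example requests separate turns with a newline.
    return "\n".join(parts)


def build_request(turns, voices, speed, settings=None):
    """Assemble a request dict shaped like the examples on this page."""
    if not turns:
        raise ValueError("at least one turn is required")
    # Assumption: speaker index N selects voices[N].
    highest = max(speaker for speaker, _, _ in turns)
    if highest >= len(voices):
        raise ValueError(
            f"speaker index {highest} has no entry in voices (len={len(voices)})"
        )
    request = {
        "taskType": "audioInference",
        "taskUUID": str(uuid.uuid4()),  # fresh identifier per task
        "model": "fishaudio:s2.1@pro",
        "speech": {
            "text": build_script(turns),
            "voices": list(voices),
            "speed": speed,
            "volume": 0,  # value used in every example on this page
        },
    }
    if settings:
        request["settings"] = dict(settings)
    return request


if __name__ == "__main__":
    demo = build_request(
        [(0, "calm, precise", "Please extend both arms."),
         (1, "nervous laugh", "That sounds very reassuring.")],
        voices=["bf322df2096a46f18c579d0baa36f41d",
                "933563129e564b19a115bedd57b7406a"],
        speed=0.96,
        settings={"temperature": 0.82, "topP": 0.76, "latency": "balanced"},
    )
    print(json.dumps(demo, indent=2, ensure_ascii=False))
```

Checking the speaker index before sending keeps a mismatched script from reaching the API, though how the service itself treats an out-of-range index is not documented here.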
Underground Seed Vault Mediation
<|speaker:0|>[calm but urgent] Dr. Vale, put the seed case back on the table. We can still solve this without waking the security council. <|speaker:1|>[breathless, defensive] You think I came all this way for theft? No. I came because your vault ignored three villages begging for crop trials. <|speaker:0|>[short pause] Those samples are the last clean line of Ashen Pear barley. One mistake, one bad field, and a century of recovery work disappears. <|speaker:1|>[soft laugh, bitter] A century? My grandmother had four weeks of grain left. She said, “No hay futuro si las semillas duermen.” There is no future if the seeds sleep. <|speaker:0|>[voice lowering, conflicted] I know that sentence. It was carved above the first chamber door. <|speaker:1|>[whispering] Then listen to it. <|speaker:0|>[long exhale] The council will never approve a release under pressure. <|speaker:1|>[gently] Then do not call it a release. Call it a monitored exchange. Call it mercy with paperwork. <|speaker:0|>[a reluctant smile in the voice] You always did weaponize bureaucracy beautifully. <|speaker:1|>[quietly, in Japanese] Onegai. Hitotsu dake. Please. Just one tray. <|speaker:0|>[firm, resolving] One tray, twelve kernels, full chain of custody. You record the soil, the water, every sprout, every failure. <|speaker:1|>[near tears] Every sprout. Every failure. I swear. <|speaker:0|>[short pause, then into recorder] Emergency addendum, Seed Vault Nine. Authorized by Curator Mara Vale. Reason: survival should not require a unanimous vote.
{
"taskType": "audioInference",
"taskUUID": "eea491c9-1eb6-401a-9eb1-fd418f1a7cb5",
"model": "fishaudio:s2.1@pro",
"speech": {
"text": "<|speaker:0|>[calm but urgent] Dr. Vale, put the seed case back on the table. We can still solve this without waking the security council.\n<|speaker:1|>[breathless, defensive] You think I came all this way for theft? No. I came because your vault ignored three villages begging for crop trials.\n<|speaker:0|>[short pause] Those samples are the last clean line of Ashen Pear barley. One mistake, one bad field, and a century of recovery work disappears.\n<|speaker:1|>[soft laugh, bitter] A century? My grandmother had four weeks of grain left. She said, “No hay futuro si las semillas duermen.” There is no future if the seeds sleep.\n<|speaker:0|>[voice lowering, conflicted] I know that sentence. It was carved above the first chamber door.\n<|speaker:1|>[whispering] Then listen to it.\n<|speaker:0|>[long exhale] The council will never approve a release under pressure.\n<|speaker:1|>[gently] Then do not call it a release. Call it a monitored exchange. Call it mercy with paperwork.\n<|speaker:0|>[a reluctant smile in the voice] You always did weaponize bureaucracy beautifully.\n<|speaker:1|>[quietly, in Japanese] Onegai. Hitotsu dake. Please. Just one tray.\n<|speaker:0|>[firm, resolving] One tray, twelve kernels, full chain of custody. You record the soil, the water, every sprout, every failure.\n<|speaker:1|>[near tears] Every sprout. Every failure. I swear.\n<|speaker:0|>[short pause, then into recorder] Emergency addendum, Seed Vault Nine. Authorized by Curator Mara Vale. Reason: survival should not require a unanimous vote.",
"voices": [
"e3cd384158934cc9a01029cd7d278634",
"536d3a5e000945adb7038665781a4aca"
],
"speed": 0.96,
"volume": 0
},
"settings": {
"temperature": 0.82,
"topP": 0.78,
"chunkLength": 260,
"conditionOnPreviousChunks": true,
"latency": "balanced",
"normalize": true,
"normalizeLoudness": true,
"repetitionPenalty": 1.18
}
}

{
"taskType": "audioInference",
"taskUUID": "eea491c9-1eb6-401a-9eb1-fd418f1a7cb5",
"audioUUID": "bb2f5e79-e75f-4860-871f-e8c2bd519119",
"audioURL": "https://am.runware.ai/audio/os/a05d22/ws/5/ai/bb2f5e79-e75f-4860-871f-e8c2bd519119.mp3"
}

Zero Gravity Tailor Consultation
<|speaker:0|>[calm, precise] Ambassador Vale, please extend both arms. The ceremonial suit is reading your pulse through the sleeve seams. <|speaker:1|>[nervous laugh] That sounds very reassuring, Master Ivo. Should it be humming back at me in three different keys? <|speaker:0|>[short pause] Only two keys. The third is your translator badge vibrating against the collar. <|speaker:1|>[relieved sigh] Ah. Gracias a todos los satélites. I thought the fabric had developed opinions. <|speaker:0|>[softly amused] It has excellent opinions. It dislikes panic, sudden turns, and speeches longer than seven minutes. <|speaker:1|>[whispering] Then it may walk out before I do. <|speaker:0|>[encouraging] Breathe in. Good. Breathe out. The shoulder line is settling. Say the opening phrase, please. <|speaker:1|>[formal, measured] Distinguished delegates, on behalf of the outer rings, I offer a promise: no child will inherit a sky filled with weapons. <|speaker:0|>[quiet admiration] Better. The suit warmed on the word promise. <|speaker:1|>[switching gently] 約束します。We promise. Nous restons ensemble. <|speaker:0|>[focused] Excellent automatic language shift. Now, if the gravity dips during applause, place one hand over the silver clasp and count to four. <|speaker:1|>[playful] One, two, three, four. Do I look diplomatic? <|speaker:0|>[warmly] You look like someone who can make a room full of rivals lower their voices. <|speaker:1|>[steady, hopeful] Then seal the collar. I am ready.
{
"taskType": "audioInference",
"taskUUID": "40db4212-2990-4660-8efe-d5f5f584a593",
"model": "fishaudio:s2.1@pro",
"speech": {
"text": "<|speaker:0|>[calm, precise] Ambassador Vale, please extend both arms. The ceremonial suit is reading your pulse through the sleeve seams.\n<|speaker:1|>[nervous laugh] That sounds very reassuring, Master Ivo. Should it be humming back at me in three different keys?\n<|speaker:0|>[short pause] Only two keys. The third is your translator badge vibrating against the collar.\n<|speaker:1|>[relieved sigh] Ah. Gracias a todos los satélites. I thought the fabric had developed opinions.\n<|speaker:0|>[softly amused] It has excellent opinions. It dislikes panic, sudden turns, and speeches longer than seven minutes.\n<|speaker:1|>[whispering] Then it may walk out before I do.\n<|speaker:0|>[encouraging] Breathe in. Good. Breathe out. The shoulder line is settling. Say the opening phrase, please.\n<|speaker:1|>[formal, measured] Distinguished delegates, on behalf of the outer rings, I offer a promise: no child will inherit a sky filled with weapons.\n<|speaker:0|>[quiet admiration] Better. The suit warmed on the word promise.\n<|speaker:1|>[switching gently] 約束します。We promise. Nous restons ensemble.\n<|speaker:0|>[focused] Excellent automatic language shift. Now, if the gravity dips during applause, place one hand over the silver clasp and count to four.\n<|speaker:1|>[playful] One, two, three, four. Do I look diplomatic?\n<|speaker:0|>[warmly] You look like someone who can make a room full of rivals lower their voices.\n<|speaker:1|>[steady, hopeful] Then seal the collar. I am ready.",
"voices": [
"bf322df2096a46f18c579d0baa36f41d",
"933563129e564b19a115bedd57b7406a"
],
"speed": 0.96,
"volume": 0
},
"settings": {
"temperature": 0.82,
"topP": 0.76,
"chunkLength": 220,
"conditionOnPreviousChunks": true,
"latency": "balanced",
"normalize": true,
"normalizeLoudness": true,
"repetitionPenalty": 1.18
}
}

{
"taskType": "audioInference",
"taskUUID": "40db4212-2990-4660-8efe-d5f5f584a593",
"audioUUID": "68ed2ee4-1e07-4413-a09c-1a14da1ea111",
"audioURL": "https://am.runware.ai/audio/os/a04d20/ws/5/ai/68ed2ee4-1e07-4413-a09c-1a14da1ea111.mp3"
}

Polar Courier Distress Call
<|speaker:0|>[breathless, close to microphone] Control, this is Courier Seven. The compass just spun three full circles, and the sled dogs are refusing the ice bridge. [short pause] I can see the blue survey markers, but they are moving. <|speaker:1|>[calm but urgent] Courier Seven, stay with my voice. Switch to beacon channel nine and give me your last fixed bearing. <|speaker:0|>[nervous laugh] Last fixed bearing was three-one-zero. Then the aurora started pulsing like a warning sign. [whispering] There is music under the ice. <|speaker:1|>[softly] Breathe in for four. Out for six. Good. Repeat after me: ég er hér, ég held áfram. <|speaker:0|>[steadier, in Icelandic] Ég er hér, ég held áfram. [short pause] Control, the markers are aligning. I see a maintenance tower. <|speaker:1|>[relieved] Excellent. Walk toward the red strobe. Do not run. <|speaker:0|>[quiet awe, Japanese] 分かりました。赤い光へ向かいます。 [exhale] Thank you, Control. <|speaker:1|>[warm, reassuring] I am right here until you reach the door.
{
"taskType": "audioInference",
"taskUUID": "36bd44cf-e573-42bd-b576-37b6e9cc5bdc",
"model": "fishaudio:s2.1@pro",
"speech": {
"text": "<|speaker:0|>[breathless, close to microphone] Control, this is Courier Seven. The compass just spun three full circles, and the sled dogs are refusing the ice bridge. [short pause] I can see the blue survey markers, but they are moving.\n<|speaker:1|>[calm but urgent] Courier Seven, stay with my voice. Switch to beacon channel nine and give me your last fixed bearing.\n<|speaker:0|>[nervous laugh] Last fixed bearing was three-one-zero. Then the aurora started pulsing like a warning sign. [whispering] There is music under the ice.\n<|speaker:1|>[softly] Breathe in for four. Out for six. Good. Repeat after me: ég er hér, ég held áfram.\n<|speaker:0|>[steadier, in Icelandic] Ég er hér, ég held áfram. [short pause] Control, the markers are aligning. I see a maintenance tower.\n<|speaker:1|>[relieved] Excellent. Walk toward the red strobe. Do not run.\n<|speaker:0|>[quiet awe, Japanese] 分かりました。赤い光へ向かいます。 [exhale] Thank you, Control.\n<|speaker:1|>[warm, reassuring] I am right here until you reach the door.",
"voices": [
"bf322df2096a46f18c579d0baa36f41d",
"933563129e564b19a115bedd57b7406a"
],
"speed": 0.96,
"volume": 0
},
"settings": {
"temperature": 0.82,
"topP": 0.85,
"chunkLength": 220,
"conditionOnPreviousChunks": true,
"latency": "balanced",
"normalize": true,
"normalizeLoudness": true,
"repetitionPenalty": 1.18
}
}

{
"taskType": "audioInference",
"taskUUID": "36bd44cf-e573-42bd-b576-37b6e9cc5bdc",
"audioUUID": "6784640c-3ae4-4669-b201-350c66a87f27",
"audioURL": "https://am.runware.ai/audio/os/a03d21/ws/5/ai/6784640c-3ae4-4669-b201-350c66a87f27.mp3"
}

Freight Elevator Heist Rehearsal
<|speaker:0|>[hushed, amused] You called this a rehearsal, Mina. Why is the freight elevator moving? <|speaker:1|>[whispering, fast] Because the security chief took the stairs, our fake duchess fainted ahead of schedule, and the diamond is currently in a soup tureen. <|speaker:0|>[short pause] That is... a very specific kind of chaos. <|speaker:1|>[soft laugh] Tranquilo, doctor. I planned for three disasters. This is only two and a half. <|speaker:0|>[nervous] The elevator just stopped between floors. <|speaker:1|>[calm, close to the mic] Good. That means the signal jammer works. <|speaker:0|>[sigh] I miss paperwork. <|speaker:1|>[playful whisper] Non, tu adores ça. Admit it. <|speaker:0|>[reluctant chuckle] Fine. I adore not being arrested. <|speaker:1|>[suddenly serious] Footsteps above us. On my count, breathe slowly, sound bored, and pretend we belong here. <|speaker:0|>[deadpan] Mina, no one belongs in a stopped freight elevator with a soup diamond. <|speaker:1|>[whispering, delighted] Exactly. That is why they will never suspect us.
{
"taskType": "audioInference",
"taskUUID": "caea1875-93d0-4727-be4d-1b493fd9d519",
"model": "fishaudio:s2.1@pro",
"speech": {
"text": "<|speaker:0|>[hushed, amused] You called this a rehearsal, Mina. Why is the freight elevator moving?\n<|speaker:1|>[whispering, fast] Because the security chief took the stairs, our fake duchess fainted ahead of schedule, and the diamond is currently in a soup tureen.\n<|speaker:0|>[short pause] That is... a very specific kind of chaos.\n<|speaker:1|>[soft laugh] Tranquilo, doctor. I planned for three disasters. This is only two and a half.\n<|speaker:0|>[nervous] The elevator just stopped between floors.\n<|speaker:1|>[calm, close to the mic] Good. That means the signal jammer works.\n<|speaker:0|>[sigh] I miss paperwork.\n<|speaker:1|>[playful whisper] Non, tu adores ça. Admit it.\n<|speaker:0|>[reluctant chuckle] Fine. I adore not being arrested.\n<|speaker:1|>[suddenly serious] Footsteps above us. On my count, breathe slowly, sound bored, and pretend we belong here.\n<|speaker:0|>[deadpan] Mina, no one belongs in a stopped freight elevator with a soup diamond.\n<|speaker:1|>[whispering, delighted] Exactly. That is why they will never suspect us.",
"voices": [
"bf322df2096a46f18c579d0baa36f41d",
"9a9cf47702da476aa4629e2506d4a857"
],
"speed": 1.02,
"volume": 0
},
"settings": {
"temperature": 0.82,
"topP": 0.78,
"latency": "low",
"chunkLength": 220,
"conditionOnPreviousChunks": true,
"normalize": true,
"normalizeLoudness": true,
"repetitionPenalty": 1.15
}
}

{
"taskType": "audioInference",
"taskUUID": "caea1875-93d0-4727-be4d-1b493fd9d519",
"audioUUID": "110837e7-46c7-41d2-b93c-989d0571d6dc",
"audioURL": "https://am.runware.ai/audio/os/a02d21/ws/5/ai/110837e7-46c7-41d2-b93c-989d0571d6dc.mp3"
}
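
In each example above, the response repeats the request's taskUUID and adds an audioUUID and an audioURL. A rough sketch of pairing responses with pending requests on that basis follows. The helper names `index_requests` and `match_response` are hypothetical, and error responses are not handled because their shape does not appear on this page.

```python
import json


def index_requests(requests):
    """Map each pending request to its taskUUID for quick lookup."""
    return {request["taskUUID"]: request for request in requests}


def match_response(pending, response_text):
    """Parse one response and return (request, audio_url).

    The matched request is removed from `pending` so that a duplicate
    response for the same task is reported instead of silently accepted.
    """
    response = json.loads(response_text)
    task_uuid = response["taskUUID"]
    request = pending.pop(task_uuid, None)
    if request is None:
        raise KeyError(f"no pending request for taskUUID {task_uuid}")
    return request, response["audioURL"]


if __name__ == "__main__":
    # Values copied from the Freight Elevator Heist Rehearsal example.
    pending = index_requests([
        {"taskType": "audioInference",
         "taskUUID": "caea1875-93d0-4727-be4d-1b493fd9d519",
         "model": "fishaudio:s2.1@pro"},
    ])
    raw = json.dumps({
        "taskType": "audioInference",
        "taskUUID": "caea1875-93d0-4727-be4d-1b493fd9d519",
        "audioUUID": "110837e7-46c7-41d2-b93c-989d0571d6dc",
        "audioURL": "https://am.runware.ai/audio/os/a02d21/ws/5/ai/110837e7-46c7-41d2-b93c-989d0571d6dc.mp3",
    })
    request, url = match_response(pending, raw)
    print(request["model"], url)
```

Keying on taskUUID rather than arrival order is useful when several generations are in flight at once.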