Inworld

Access Inworld's TTS-1.5 models, Mini and Max, for expressive text-to-speech through Runware's unified API. Learn about speech parameters, voice selection, audio settings, and provider-specific controls.

Introduction

Inworld's text-to-speech models are available on the Runware platform through our unified API. They provide low-latency, expressive speech synthesis for real-time voice experiences, interactive agents, and conversational applications where responsiveness is critical.

The providerSettings.inworld object exposes Inworld-specific controls such as voice selection, while the rest of the request follows Runware's standard API structure. This page documents the technical specifications, parameter requirements, and provider-specific settings for all Inworld models available through our platform.

Audio models

Inworld TTS-1.5 Mini

Inworld TTS-1.5 Mini is a lightweight text-to-speech model designed for real-time voice experiences with ultra-low latency and efficient performance. It delivers natural, expressive audio suitable for interactive agents, voice assistants, and conversational applications. The Mini variant balances speed and quality, enabling responsive speech output even under constrained compute conditions.

Model AIR ID: inworld:tts@1.5-mini.

Supported workflows: Text-to-audio.

Technical specifications:

  • Speech text: 2–2,000 characters (required).
  • Speech voice: Required. Specifies the voice for synthesis.
  • Speech speed: 0.5–1.5 (multiples of 0.1, default: 1). Controls the playback speed of the generated audio.
  • Temperature: 0.1–2 (default: 1.1). Controls the expressiveness and variability of the generated speech.
  • Audio settings: Supports sampleRate and bitrate configuration.
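
The limits listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name validate_speech and its error messages are hypothetical, and it assumes that Python's len() counts characters the same way the service does.

```python
from decimal import Decimal

# Limits copied from the specification list above.
TEXT_MIN, TEXT_MAX = 2, 2000
SPEED_MIN, SPEED_MAX = Decimal("0.5"), Decimal("1.5")
TEMP_MIN, TEMP_MAX = 0.1, 2.0


def validate_speech(text, voice, speed=1.0, temperature=1.1):
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    if not TEXT_MIN <= len(text) <= TEXT_MAX:
        problems.append(f"text must be {TEXT_MIN}-{TEXT_MAX} characters, got {len(text)}")
    if not voice:
        problems.append("voice is required")
    # Decimal(str(...)) avoids binary floating-point surprises when
    # testing whether the speed is a multiple of 0.1.
    speed_dec = Decimal(str(speed))
    if not SPEED_MIN <= speed_dec <= SPEED_MAX:
        problems.append("speed must be between 0.5 and 1.5")
    elif speed_dec % Decimal("0.1") != 0:
        problems.append("speed must be a multiple of 0.1")
    if not TEMP_MIN <= temperature <= TEMP_MAX:
        problems.append("temperature must be between 0.1 and 2")
    return problems
```

Calling validate_speech("Hello there.", "alloy", speed=0.85) reports that the speed is not a multiple of 0.1, while the defaults (speed 1, temperature 1.1) pass without problems.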

Provider-specific settings:

Parameters supported: voice.

{
  "taskType": "audioInference",
  "taskUUID": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "model": "inworld:tts@1.5-mini",
  "speech": {
    "text": "Welcome to our platform. We're excited to have you here.",
    "voice": "alloy"
  }
}

{
  "taskType": "audioInference",
  "taskUUID": "6ba7b810-9dad-11d1-80b4-00c04fd430c8",
  "model": "inworld:tts@1.5-mini",
  "speech": {
    "text": "This is a slower, more deliberate narration for a documentary-style presentation.",
    "voice": "alloy",
    "speed": 0.8
  }
}

{
  "taskType": "audioInference",
  "taskUUID": "550e8400-e29b-41d4-a716-446655440000",
  "model": "inworld:tts@1.5-mini",
  "speech": {
    "text": "A dramatic reading with expressive intonation and emotional depth.",
    "voice": "alloy",
    "speed": 1.2
  },
  "settings": {
    "temperature": 1.5
  },
  "audioSettings": {
    "sampleRate": 44100,
    "bitrate": 192
  }
}

{
  "taskType": "audioInference",
  "taskUUID": "a770f077-f413-47de-9dac-be0b26a35da7",
  "model": "inworld:tts@1.5-mini",
  "speech": {
    "text": "A conversational response from an AI assistant with natural pacing.",
    "voice": "alloy"
  },
  "providerSettings": {
    "inworld": {
      "voice": "custom-voice-id"
    }
  }
}
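
The request examples above share one shape: a speech object, plus optional settings, audioSettings, and providerSettings blocks. As an illustration only, the sketch below assembles such a task in Python. The helper name build_tts_task and its keyword arguments are hypothetical; the JSON field names are the ones used in the examples.

```python
import json
import uuid


def build_tts_task(model, text, voice, speed=None, temperature=None,
                   sample_rate=None, bitrate=None, inworld_voice=None):
    """Assemble an audioInference task dict, omitting any option left as None."""
    speech = {"text": text, "voice": voice}
    if speed is not None:
        speech["speed"] = speed
    task = {
        "taskType": "audioInference",
        "taskUUID": str(uuid.uuid4()),  # fresh random identifier per task
        "model": model,
        "speech": speech,
    }
    if temperature is not None:
        task["settings"] = {"temperature": temperature}
    audio = {}
    if sample_rate is not None:
        audio["sampleRate"] = sample_rate
    if bitrate is not None:
        audio["bitrate"] = bitrate
    if audio:
        task["audioSettings"] = audio
    if inworld_voice is not None:
        task["providerSettings"] = {"inworld": {"voice": inworld_voice}}
    return task


example = build_tts_task(
    "inworld:tts@1.5-mini",
    "A dramatic reading with expressive intonation and emotional depth.",
    "alloy",
    speed=1.2,
    temperature=1.5,
    sample_rate=44100,
    bitrate=192,
)
print(json.dumps(example, indent=2))
```

Leaving an argument as None keeps the corresponding block out of the payload, so the service falls back to its documented defaults.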

Inworld TTS-1.5 Max

Inworld TTS-1.5 Max is a high-fidelity text-to-speech model engineered for expressive voice synthesis with rich prosody, nuanced emotional range, and broadcast-ready audio quality. It supports a wide set of languages and delivers more natural pronunciation and expressive variation suitable for narration, content creation, and immersive character voices. The Max variant prioritizes audio quality and expressiveness while still supporting responsive generation.

Model AIR ID: inworld:tts@1.5-max.

Supported workflows: Text-to-audio.

Technical specifications:

  • Speech text: 2–2,000 characters (required).
  • Speech voice: Required. Specifies the voice for synthesis.
  • Speech speed: 0.5–1.5 (multiples of 0.1, default: 1). Controls the playback speed of the generated audio.
  • Temperature: 0.1–2 (default: 1.1). Controls the expressiveness and variability of the generated speech.
  • Audio settings: Supports sampleRate and bitrate configuration.
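
Because speech text is capped at 2,000 characters, longer narration has to be spread over several requests. The sketch below shows one possible client-side approach; it is not something the API provides. The name split_for_tts is hypothetical, splitting at sentence boundaries is a design choice rather than a requirement, and the function does not enforce the 2-character minimum on a very short trailing chunk.

```python
import re

MAX_CHARS = 2000  # upper bound for speech text from the list above


def split_for_tts(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    # Break after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-cut into slices.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate  # sentence still fits in the open chunk
        else:
            chunks.append(current)  # close the chunk and start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as the text of its own request, and the resulting audio files concatenated in order.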

Provider-specific settings:

Parameters supported: voice.

{
  "taskType": "audioInference",
  "taskUUID": "24cd5dff-cb81-4db5-8506-b72a9425f9d7",
  "model": "inworld:tts@1.5-max",
  "speech": {
    "text": "In a world where technology and creativity converge, new possibilities emerge every day.",
    "voice": "alloy"
  }
}

{
  "taskType": "audioInference",
  "taskUUID": "b8c4d952-7f27-4a6e-bc9a-83f01d1c6d59",
  "model": "inworld:tts@1.5-max",
  "speech": {
    "text": "The stage lights dimmed, and the audience held its breath as the final act began.",
    "voice": "alloy",
    "speed": 0.9
  },
  "settings": {
    "temperature": 1.8
  }
}

{
  "taskType": "audioInference",
  "taskUUID": "4192bff0-e1e0-43ce-a4db-912808c32493",
  "model": "inworld:tts@1.5-max",
  "speech": {
    "text": "Breaking news: scientists have discovered a new method for sustainable energy production that could revolutionize the industry.",
    "voice": "alloy",
    "speed": 1.1
  },
  "settings": {
    "temperature": 1.1
  },
  "audioSettings": {
    "sampleRate": 48000,
    "bitrate": 256
  }
}

{
  "taskType": "audioInference",
  "taskUUID": "2b30193e-83b3-c392-1192-9cad0e1f2031",
  "model": "inworld:tts@1.5-max",
  "speech": {
    "text": "Welcome back to the show. Today we have an incredible lineup of guests joining us.",
    "voice": "alloy"
  },
  "providerSettings": {
    "inworld": {
      "voice": "custom-voice-id"
    }
  }
}
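
To show how a task object like the ones above might travel over HTTP, here is a hedged sketch that only builds the request and does not send it. Everything about the transport is an assumption that this page does not state and that should be checked against the platform's connection documentation: the endpoint URL, the bearer-token header, and the convention of wrapping tasks in a JSON array. The helper name make_request is hypothetical.

```python
import json
import urllib.request

# Assumption: endpoint URL; verify against the platform's connection docs.
API_URL = "https://api.runware.ai/v1"


def make_request(tasks, api_key, url=API_URL):
    """Build (but do not send) a POST request carrying a JSON array of tasks."""
    body = json.dumps(tasks).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
        },
    )


task = {
    "taskType": "audioInference",
    "taskUUID": "24cd5dff-cb81-4db5-8506-b72a9425f9d7",
    "model": "inworld:tts@1.5-max",
    "speech": {"text": "Welcome back to the show.", "voice": "alloy"},
}
request = make_request([task], api_key="YOUR_API_KEY")
```

Passing the prepared object to urllib.request.urlopen(request) would perform the call; the sketch stops short of that so it can be run without credentials.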