Inworld AI

Inworld TTS-1.5 Mini

Low-latency expressive text-to-speech optimized for real-time apps

Text to Audio

Inworld TTS-1.5 Mini Overview

Inworld TTS-1.5 Mini is a lightweight text-to-speech model designed for real-time voice experiences that need very low latency. It produces natural, expressive audio for interactive agents, voice assistants, and other conversational applications where responsiveness is critical. The Mini variant balances speed and quality, so speech output stays responsive even when compute is constrained.

Pricing: from $0.025 per 1,000 characters
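
As a rough illustration of the listed rate of $0.025 per 1,000 characters, the sketch below estimates the cost of synthesizing a piece of text. The function name `estimate_cost` is hypothetical, and the code assumes billing is strictly proportional to character count, with no minimum charge or rounding to 1,000-character blocks; the listing does not confirm how usage is actually metered.

```python
from decimal import Decimal

# Rate taken from the listing: $0.025 per 1,000 characters.
PRICE_PER_1000_CHARS = Decimal("0.025")


def estimate_cost(text: str) -> Decimal:
    """Estimate the cost in USD of synthesizing `text`.

    Assumptions (not confirmed by the listing):
    - cost scales linearly with character count;
    - Python's len() matches how the provider counts characters;
    - there is no minimum charge and no rounding per request.
    """
    return PRICE_PER_1000_CHARS * len(text) / 1000


if __name__ == "__main__":
    script = "Hello! How can I help you today?"
    print(f"{len(script)} characters, about ${estimate_cost(script)}")
```

Using `Decimal` rather than floats avoids binary rounding noise when many small per-request costs are summed.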

Commercial use

More models from Inworld AI

Inworld TTS-1.5 Max is a high-fidelity text-to-speech model engineered for expressive voice synthesis with rich prosody, nuanced emotional range, and broadcast-ready audio quality. It supports a wide set of languages and delivers more natural pronunciation and expressive variation suitable for narration, content creation, and immersive character voices. The Max variant prioritizes audio quality and expressiveness while still supporting responsive generation.