MiniMax

MiniMax M2.5

State-of-the-art agentic coding and office-work model, optimized for speed and cost

Text to Text

MiniMax M2.5 Overview

MiniMax-M2.5 is MiniMax’s latest frontier model, optimized for fast, low-cost agentic workflows across coding, search/tool use, and high-value office tasks. Trained with large-scale reinforcement learning in complex real-world environments, it delivers strong reasoning, efficient task decomposition, and high-quality outputs for production assistants and enterprise workflows.

From $0.27 / 1M tokens
Input tokens: $0.27 / 1M
Output tokens: $0.95 / 1M
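As a quick sanity check on the listed rates, per-request cost is just a linear combination of input and output token counts. A minimal sketch (the rates are taken from the pricing above; the function name is illustrative, not part of any MiniMax SDK):

```python
# M2.5 list prices, USD per 1M tokens (from the pricing above).
INPUT_PER_M = 0.27
OUTPUT_PER_M = 0.95

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed M2.5 rates."""
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# Example: 100k input tokens and 20k output tokens.
print(round(request_cost(100_000, 20_000), 4))  # → 0.046
```

So a fairly large agentic turn (100k in, 20k out) comes to well under five cents at these rates.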

Commercial use

More models from MiniMax

MiniMax Music 2.6

API Only

MiniMax Music 2.6 is MiniMax’s latest music generation model for full vocal songs and instrumentals from text prompts. It supports natural-language prompts or detailed production-style instructions, follows specified BPM and key with high reliability, and exposes fine-grained song structure control through section tags. The same Music API also supports instrumental generation, lyrics-assisted workflows, and synchronous or streaming delivery.

MiniMax Music Cover

API Only

MiniMax Music Cover is MiniMax’s song-to-song transformation model for reimagining an existing track in a new style. It preserves the original vocal melody while changing voice timbre, instrumentation, genre, and arrangement through a text prompt. It supports one-step generation from reference audio or a two-step workflow with preprocessing and optional lyric editing.

MiniMax M2.7 is a long‑context LLM designed for agentic workflows across software engineering, search and tool use, and high‑value office productivity tasks. It’s built for multi‑step execution, with strong instruction following and dependable task decomposition, making it a solid default for production assistants that write code, call tools, and handle complex document workflows.

MiniMax M2.7‑Highspeed is the performance‑tuned variant of M2.7, built for lower latency and higher throughput while keeping output behavior consistent with the standard model. It’s a strong fit for interactive coding agents, tool‑calling pipelines, and office automation flows where responsiveness matters.

MiniMax Speech 2.8 is an advanced text-to-speech model that turns text into natural, expressive audio in multiple languages. It delivers broadcast-ready speech with rich prosody, emotional control, and a diverse voice library. The model supports long input texts and can be used for voiceovers, narration, accessibility tools, and interactive voice applications.

MiniMax Hailuo 2.3 Fast is the speed tier of the Hailuo 2.3 video family, targeting rapid iteration for social clips, ads, and previews. It produces 6-second 768p or 1080p outputs with smooth motion and stable composition, making it ideal for high-volume, image-driven video workflows.

MiniMax Hailuo 2.3 is a cinematic video model for short-form production. It accepts text prompts or image inputs and outputs 6- or 10-second clips at 768p or 1080p. It focuses on consistent motion, strong physics, and stable scenes for ads, social content, and creative shots.

MiniMax Hailuo 02 is a 1080p AI video model for cinematic, high-motion scenes. It converts text prompts or still images into short, polished clips with strong instruction following and realistic physics. Ideal for commercial spots, trailers, music promos, and social shorts.

MiniMax 01 Live generates short stylized videos from static anime art. It focuses on expressive character motion with consistent details. Use it to turn illustrations or manga panels into dynamic clips suitable for cutscenes, social posts, or prototype shots.

MiniMax 01 Director generates short cinematic video clips from text prompts with director-level control. It supports detailed camera-movement instructions, stable framing, and reduced motion randomness. Ideal for film previz, ads, and story beats inside production tools.

MiniMax 01 is a compact text-to-video model for short clips. It turns simple prompts into 720p videos with smooth motion and cinematic framing. It targets fast iteration and stable output so developers can prototype interactive video features and creative tools with low latency.