Moonshot AI

Kimi K2.6

Open frontier multimodal LLM for coding, long-horizon execution, and tool-rich workflows

Text to Text · Image to Text · Video to Text

Kimi K2.6 Overview

Kimi K2.6 is Moonshot AI's latest flagship open model for coding, reasoning, multimodal understanding, and agentic execution. It is designed for long-horizon software tasks, reliable tool use, autonomous multi-step workflows, coordinated agent swarms, and visual understanding across image and video inputs in addition to text.

How to Use Kimi K2.6

Overview

Kimi K2.6 is an open flagship multimodal language model built for coding, long-horizon execution, and agentic workflows.

It is a strong fit for sustained multi-step problem solving, reliable tool use, software development, research, autonomous execution, and visual understanding across text, images, and video.

Strengths

Long-Horizon Coding

Kimi K2.6 is designed for complex engineering work that unfolds over many steps. It performs strongly on software tasks that require debugging, refactoring, architecture changes, performance optimization, and coordinated changes across large codebases.

Strong Agentic Execution

The model is built for agent-style workflows where planning, tool invocation, recovery, and persistent progress matter. It is positioned for autonomous work that goes beyond single-turn chat into multi-step execution.

Multimodal Understanding

Kimi K2.6 can take image and video inputs in addition to text, which makes it useful for workflows that combine reasoning with screenshots, diagrams, mixed-media documents, UI states, and video content.

Agent Swarm Workflows

K2.6 is designed to coordinate large-scale agent swarms, and public product materials emphasize far higher parallelism and longer execution chains than the prior K2.5 generation.

Tool-Rich Professional Work

The model is a strong fit for tool-using systems that combine reasoning with search, code execution, document work, and structured task completion.

Open Frontier Model

Kimi K2.6 is presented as an open model rather than a hosted-only closed system, which makes it relevant for teams that value open weights, ecosystem adoption, and flexible deployment.

Capabilities

Text-to-Text

Kimi K2.6 handles general language tasks including coding assistance, summarization, planning, reasoning, drafting, transformation, and research-oriented outputs.
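
A minimal text-to-text call can be sketched with an OpenAI-compatible client. The base URL, API key handling, and model id below are placeholders rather than confirmed values; substitute whatever your provider's catalog specifies for K2.6.

from openai import OpenAI

# Assumption: the endpoint is OpenAI-compatible; base URL and model id are placeholders.
client = OpenAI(base_url="https://api.moonshot.ai/v1", api_key="YOUR_API_KEY")

resp = client.chat.completions.create(
    model="kimi-k2.6",  # placeholder id; check your provider's model catalog
    messages=[
        {"role": "system", "content": "You are a concise coding assistant."},
        {"role": "user", "content": "Suggest a plan for splitting a 2,000-line module into packages."},
    ],
)
print(resp.choices[0].message.content)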

Image-to-Text

The model can understand image input and use it as part of broader reasoning and analysis workflows.
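
As a sketch, image input typically uses the OpenAI-style multi-part message format; whether a given K2.6 endpoint accepts this exact shape is an assumption to verify against its documentation.

from openai import OpenAI

client = OpenAI(base_url="https://api.moonshot.ai/v1", api_key="YOUR_API_KEY")  # assumed endpoint

resp = client.chat.completions.create(
    model="kimi-k2.6",  # placeholder id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe the UI state in this screenshot and flag layout bugs."},
            {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}},
        ],
    }],
)
print(resp.choices[0].message.content)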

Video-to-Text

The model can take video input for analysis and description tasks, making it useful for workflows that depend on temporal visual content rather than only still images.
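
Video input conventions vary more across providers than image input. The sketch below assumes a video_url content part analogous to image_url; that part type is an assumption, and some endpoints instead expect sampled frames or a file upload.

from openai import OpenAI

client = OpenAI(base_url="https://api.moonshot.ai/v1", api_key="YOUR_API_KEY")  # assumed endpoint

resp = client.chat.completions.create(
    model="kimi-k2.6",  # placeholder id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Summarize what happens in this screen recording."},
            # Assumption: a video_url part type; consult the endpoint docs for the real shape.
            {"type": "video_url", "video_url": {"url": "https://example.com/session.mp4"}},
        ],
    }],
)
print(resp.choices[0].message.content)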

Tool Calling

The K2 family is designed for strong tool use in API workflows, making K2.6 suitable for function calling, agent orchestration, and external-action pipelines.
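
A function-calling round trip can be sketched with the OpenAI-style tools parameter. The run_tests tool here is hypothetical, and the endpoint and model id are placeholders as above.

import json
from openai import OpenAI

client = OpenAI(base_url="https://api.moonshot.ai/v1", api_key="YOUR_API_KEY")  # assumed endpoint

# Hypothetical tool definition, for illustration only.
tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project test suite and return a pass/fail summary.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string", "description": "Directory of tests to run"}},
            "required": ["path"],
        },
    },
}]

resp = client.chat.completions.create(
    model="kimi-k2.6",  # placeholder id
    messages=[{"role": "user", "content": "Run the tests under tests/api and summarize any failures."}],
    tools=tools,
)

msg = resp.choices[0].message
if msg.tool_calls:
    call = msg.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))

Executing the tool and appending a role="tool" message with the result, then calling the API again, completes the loop; agent frameworks repeat this cycle for multi-step execution.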

Long-Context Work

Kimi K2.6 supports a large context window and is intended for use cases where substantial context must remain in scope across a long task.

Input and Output

  • AIR ID: moonshotai:[email protected]
  • Input: text, images, and video
  • Output: text
  • Context window: 262,144 tokens
  • Tool use: supported
  • JSON mode: supported
  • Partial mode: supported
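
For the JSON mode noted above, an OpenAI-style response_format request is a reasonable sketch; the exact flag is an assumption to verify against the endpoint's documentation, and partial mode (continuing a trailing assistant message rather than restarting) is likewise provider-specific.

from openai import OpenAI

client = OpenAI(base_url="https://api.moonshot.ai/v1", api_key="YOUR_API_KEY")  # assumed endpoint

resp = client.chat.completions.create(
    model="kimi-k2.6",  # placeholder id
    messages=[
        {"role": "system", "content": "Reply in JSON with keys 'summary' and 'risks'."},
        {"role": "user", "content": "Assess the risks of migrating this service to async I/O."},
    ],
    response_format={"type": "json_object"},  # assumption: OpenAI-style JSON mode
)
print(resp.choices[0].message.content)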

Best Fit

  • Coding agents
  • Long-horizon engineering tasks
  • Tool-using assistants
  • Research and analysis workflows
  • Multimodal reasoning over images and video
  • Structured professional automation