OpenAI

GPT-5.5

Frontier reasoning LLM for complex coding, long-context work, and tool-using professional tasks

Text to Text · Image to Text

GPT-5.5 Overview

GPT-5.5 is OpenAI's newest frontier model for complex professional work, with strong performance in coding, reasoning, and tool-using workflows. It supports a 1,050,000 token context window, 128,000 max output tokens, configurable reasoning effort, image input, and a broad tool stack including web search, file search, code interpreter, hosted shell, apply patch, skills, MCP, tool search, and computer use.

How to Use GPT-5.5

Overview

GPT-5.5 is a frontier language model built for complex coding, reasoning, and professional workflows.

It is a strong fit for tasks that need long context, careful multi-step thinking, image understanding, and deep tool use inside agentic systems rather than only fast single-turn text generation.

Strengths

Strong Coding and Professional Reasoning

GPT-5.5 is positioned as OpenAI's newest frontier model for the most complex professional work. It is designed for tasks where high reasoning quality, precise instruction following, and sustained problem solving matter more than the lower cost of a smaller variant.

Very Large Context Window

The model supports a 1,050,000 token context window with up to 128,000 output tokens. This makes it useful for long documents, large codebases, multi-file workflows, and agent setups that need to keep a substantial working set in context.

Configurable Reasoning Effort

GPT-5.5 supports reasoning effort settings from none through xhigh, which makes it flexible across lighter tasks and deeper, compute-heavier problem solving.
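As a minimal sketch of how this could look per request (assuming the Responses API request shape and the model id `gpt-5.5`, which should be checked against the current API reference):

```python
import json

def build_request(prompt: str, effort: str) -> dict:
    """Build a Responses API-style payload with a chosen reasoning effort.

    Uses the effort levels documented above: none, low, medium, high, xhigh.
    The model id and payload shape are assumptions for this sketch.
    """
    allowed = {"none", "low", "medium", "high", "xhigh"}
    if effort not in allowed:
        raise ValueError(f"effort must be one of {sorted(allowed)}")
    return {
        "model": "gpt-5.5",
        "input": prompt,
        "reasoning": {"effort": effort},
    }

# A lighter task can run at low effort; a hard refactor or proof at xhigh.
payload = build_request("Summarize this changelog in three bullets.", "low")
print(json.dumps(payload, indent=2))
```

The idea is to keep effort a per-request knob: the same model serves both quick lookups and compute-heavy problem solving without switching endpoints.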

Image Understanding

The model supports image input alongside text, which makes it useful for workflows that combine reasoning with screenshots, diagrams, UI states, documents, and other visual context.
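A text-plus-image request can be sketched as follows. The content-part types (`input_text`, `input_image`) follow the Responses API convention for multimodal input; the model id and exact field names are assumptions to verify against the API reference:

```python
import json

def build_image_request(question: str, image_url: str) -> dict:
    """Combine a text question with an image in one request payload."""
    return {
        "model": "gpt-5.5",  # model id is an assumption for this sketch
        "input": [
            {
                "role": "user",
                "content": [
                    {"type": "input_text", "text": question},
                    {"type": "input_image", "image_url": image_url},
                ],
            }
        ],
    }

payload = build_image_request(
    "What error does this screenshot show?",
    "https://example.com/screenshot.png",  # hypothetical URL
)
print(json.dumps(payload, indent=2))
```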

Broad Tool Support

In the Responses API, GPT-5.5 supports a broad tool stack including web search, file search, image generation, code interpreter, hosted shell, apply patch, skills, MCP, tool search, and computer use. This makes it well suited to agentic workflows that need to act as well as reason.
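Enabling hosted tools is a matter of listing them on the request. The sketch below assumes the Responses API hosted-tool convention, where each tool is an object with a `type` field; the exact tool type strings (e.g. `"web_search"`) should be confirmed against the current API reference:

```python
import json

def build_agent_request(task: str, tool_types: list[str]) -> dict:
    """Build a request payload that enables a set of hosted tools.

    Tool type names and payload shape are assumptions for this sketch.
    """
    return {
        "model": "gpt-5.5",
        "input": task,
        "tools": [{"type": t} for t in tool_types],
    }

payload = build_agent_request(
    "Find the latest release notes and summarize breaking changes.",
    ["web_search", "file_search"],
)
print(json.dumps(payload, indent=2))
```

In an agent loop, the orchestrator inspects each response for tool calls, runs or forwards them, and feeds results back until the model produces a final answer.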

Built for Long-Horizon Agent Work

The combination of long context, strong reasoning, structured output support, and tool use makes GPT-5.5 a good fit for coding agents, research agents, analysis pipelines, and multi-step business workflows.

Capabilities

Text-to-Text

GPT-5.5 handles general text generation, reasoning, summarization, planning, coding assistance, transformation, and structured response workflows.

Image-to-Text

GPT-5.5 can take image input and reason over visual information as part of a larger text-based task.

Structured and Tool-Using Workflows

The model supports function calling, structured outputs, and a wide tool surface, which makes it useful for production systems that need deterministic orchestration around the model.
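A function tool makes this concrete: the model returns arguments matching a JSON schema, and the surrounding code executes the real function. The sketch below follows the Responses API function-tool convention; the function itself (`get_ticket_status`) is hypothetical, named here only for illustration:

```python
import json

# Hypothetical function tool: the model emits arguments matching this
# JSON schema, and the orchestrating code performs the actual lookup.
get_ticket_status = {
    "type": "function",
    "name": "get_ticket_status",
    "description": "Look up the status of a support ticket by id.",
    "parameters": {
        "type": "object",
        "properties": {"ticket_id": {"type": "string"}},
        "required": ["ticket_id"],
        "additionalProperties": False,
    },
}

payload = {
    "model": "gpt-5.5",  # model id is an assumption for this sketch
    "input": "Is ticket T-1042 resolved yet?",
    "tools": [get_ticket_status],
}
print(json.dumps(payload, indent=2))
```

Because the schema constrains the arguments, the caller can validate and dispatch tool calls deterministically, which is what makes this pattern suitable for production orchestration.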

Input and Output

  • AIR ID: openai:[email protected]
  • Input: text and image input
  • Output: text
  • Context window: 1,050,000 tokens
  • Max output: 128,000 tokens
  • Knowledge cutoff: December 1, 2025
  • Reasoning effort: none, low, medium, high, xhigh

Best Fit

  • Complex coding and debugging workflows
  • Long-context analysis
  • Tool-using agents
  • Image-grounded reasoning tasks
  • Professional research, planning, and execution pipelines

More models from OpenAI

GPT Image 2 is a general-purpose GPT Image family model for text-to-image generation and image editing. Its strengths include strong prompt adherence, readable embedded text, detailed edits, photorealistic rendering, and structured visual outputs such as posters, packaging, product comps, diagrams, and other layout-sensitive images.

GPT-5.4 Nano is the smallest and fastest variant of GPT-5.4, designed for high-throughput, low-latency tasks such as classification, data extraction, ranking, and lightweight automation. It prioritizes speed and cost efficiency for simple, high-volume workloads and is available exclusively via the API.

GPT-5.4 Mini is a compact, efficient variant of GPT-5.4 designed for coding assistants, subagent orchestration, and multimodal applications requiring faster responsiveness. It supports a 400K token context window and retains native computer use and configurable reasoning effort at a lower cost than the flagship model.

GPT-5.4 Pro

Coming Soon

GPT-5.4 Pro is the high-performance variant of GPT-5.4, optimized for enterprise-grade professional tasks. It offers deeper reasoning, enhanced accuracy, and extended compute for complex multi-step workflows including document creation, spreadsheet analysis, and autonomous agent orchestration. It shares the 1 million token context window and native computer use capabilities of the standard GPT-5.4.

GPT-5.4 is OpenAI's flagship large language model, featuring a 1 million token context window, native computer use, and a 33% reduction in factual errors over GPT-5.2. It integrates coding capabilities from GPT-5.3-Codex, is 47% more token-efficient, and supports configurable reasoning effort for complex professional tasks.

GPT Image 1.5 is OpenAI’s newest flagship image model powering the latest ChatGPT Images. It delivers significantly faster image generation with stronger instruction following, more precise edits that preserve original details, more believable transformations, and improved rendering of dense or small text. It is suited for practical creative workflows, detailed design tasks, and production use cases.

Sora 2 is OpenAI’s flagship generative model for video and audio. It accepts text prompts and generates visually rich clips with synchronized dialogue and sound. It improves physical realism and scene control. It also supports editing and extension of existing video inputs.

Sora 2 Pro is the higher quality Sora 2 variant for precision video work. It supports text prompts and image inputs. It outputs synchronized video with sound, higher resolution frames, and stronger temporal consistency. Ideal for production clips and demanding pipelines.

GPT Image 1 Mini is a lighter variant of OpenAI's GPT Image 1 model. It offers faster generation at a lower cost while retaining core capabilities including text-to-image generation, image editing, and text rendering. It is suited for high-volume workflows, rapid prototyping, and cost-sensitive applications where the full GPT Image 1 model may be excessive.

GPT Image 1 is OpenAI’s native GPT-4o image model. It creates detailed visuals from text prompts. It supports diverse styles and precise layouts. It can edit existing images with masks. It renders readable text in scenes. It suits design tools and production workflows.

DALL·E 3 converts natural language prompts into detailed images with strong caption fidelity. It improves handling of complex instructions and visual details. It integrates with ChatGPT and the OpenAI API for programmatic image creation and workflow automation.

DALL·E 2 is OpenAI’s diffusion-based text-to-image model. It generates high quality images from prompts. It supports inpainting for local edits and outpainting for extended canvases. Developers use it through an API for creative tools, design workflows, and content pipelines.

OpenAI CLIP ViT-L/14 is a contrastive vision-language model that embeds images and text into a shared representation space. It enables tasks like zero-shot image classification, semantic search, and similarity scoring by computing aligned feature vectors for images and texts.