Best LLMs

Large language models for general-purpose text generation, reasoning, summarization, and conversation. Capable of understanding complex instructions and producing high-quality, coherent text responses.

#1
Top Pick
GLM-5.1

Coming Soon

Best rated

by Z.ai

GLM-5.1 is Z.ai’s flagship language model for agentic engineering, coding, reasoning, and tool-driven workflows. It supports a 200K token context window with up to 128K output tokens, deep thinking, function calling, structured output, and streaming tool calls, and is designed to stay effective over long multi-step sessions rather than only short-horizon tasks.

Featured Models

Top-performing models in this category, recommended by our community and performance benchmarks.

#2
MiniMax M2.7

by MiniMax

MiniMax M2.7 is a long‑context LLM designed for agentic workflows across software engineering, search and tool use, and high‑value office productivity tasks. It’s built for multi‑step execution, with strong instruction following and dependable task decomposition, making it a solid default for production assistants that write code, call tools, and handle complex document workflows.

#3
MiniMax M2.7‑Highspeed

by MiniMax

MiniMax M2.7‑Highspeed is the performance‑tuned variant of M2.7, built for lower latency and higher throughput while keeping output behavior consistent with the standard model. It’s a strong fit for interactive coding agents, tool‑calling pipelines, and office automation flows where responsiveness matters.

#4
GPT-5.4 Mini

by OpenAI

GPT-5.4 Mini is a compact, efficient variant of GPT-5.4 designed for coding assistants, subagent orchestration, and multimodal applications requiring faster responsiveness. It supports a 400K token context window and retains native computer use and configurable reasoning effort at a lower cost than the flagship model.

#5
GPT-5.4 Nano

by OpenAI

GPT-5.4 Nano is the smallest and fastest variant of GPT-5.4, designed for high-throughput, low-latency tasks such as classification, data extraction, ranking, and lightweight automation. It prioritizes speed and cost efficiency for simple, high-volume workloads and is available exclusively via the API.

#6
GPT-5.4

by OpenAI

GPT-5.4 is OpenAI's flagship large language model, featuring a 1 million token context window, native computer use, and a 33% reduction in factual errors over GPT-5.2. It integrates coding capabilities from GPT-5.3-Codex, is 47% more token-efficient, and supports configurable reasoning effort for complex professional tasks.

#7
GPT-5.4 Pro

Coming Soon

by OpenAI

GPT-5.4 Pro is the high-performance variant of GPT-5.4, optimized for enterprise-grade professional tasks. It offers deeper reasoning, enhanced accuracy, and extended compute for complex multi-step workflows including document creation, spreadsheet analysis, and autonomous agent orchestration. It shares the 1 million token context window and native computer use capabilities of the standard GPT-5.4.

#8
Gemini 3.1 Flash Lite

by Google

Gemini 3.1 Flash Lite is a multimodal language model from Google that processes text alongside images, audio, video, code, and documents. It offers high-performance reasoning, complex instruction following, and deep contextual understanding for a wide range of tasks across language, analysis, and problem solving.

#9
Gemini 3.1 Pro

by Google

Gemini 3.1 Pro is Google’s flagship multimodal language model that processes text alongside images, audio, video, code, and documents. It offers high-performance reasoning, complex instruction following, and deep contextual understanding for a wide range of tasks across language, analysis, and problem solving.

#10
MiniMax-M2.5

by MiniMax

MiniMax-M2.5 is a frontier model from MiniMax, optimized for fast, low-cost agentic workflows across coding, search/tool use, and high-value office tasks. Trained with large-scale reinforcement learning in complex real-world environments, it delivers strong reasoning, efficient task decomposition, and high-quality outputs for production assistants and enterprise workflows.

#11
GLM-4.7

Coming Soon

by Z.ai

GLM-4.7 is a 358-billion-parameter Mixture-of-Experts language model from Z.ai, optimized for agentic coding, complex reasoning, and long-horizon tasks. It features interleaved thinking, preserved thinking for multi-turn consistency, and turn-level thinking control. It supports tool calling and a 200K token context window with up to 128K output tokens, and it achieves 73.8% on SWE-bench Verified.

#12
Gemini 3 Flash

by Google

Gemini 3 Flash is a multimodal language model from Google that processes text alongside images, audio, video, code, and documents. It offers high-performance reasoning, complex instruction following, and deep contextual understanding for a wide range of tasks across language, analysis, and problem solving.
