Introduction

OpenAI announced GPT-5.5, the company's smartest and most intuitive model to date, representing a significant step toward a new paradigm of computer-based work. GPT-5.5 understands user intent faster and can autonomously carry more of the workload, excelling at coding, research, data analysis, document creation, and multi-tool task completion.

Unlike previous models requiring careful step-by-step management, GPT-5.5 can handle messy, multi-part tasks independently—planning, using tools, verifying work, navigating ambiguity, and persisting until completion.


Key Breakthrough: Intelligence Without Speed Compromise

The most remarkable achievement of GPT-5.5 is delivering substantial intelligence gains without sacrificing speed. Larger, more capable models typically suffer from increased latency, yet GPT-5.5 matches GPT-5.4's per-token latency in real-world serving while performing at a significantly higher intelligence level.

Additionally, GPT-5.5 uses substantially fewer tokens to complete the same Codex tasks, making it both more capable and more cost-efficient. On Artificial Analysis’s Coding Index, GPT-5.5 delivers state-of-the-art intelligence at half the cost of competing frontier coding models.

Benchmark Performance

Coding and Agentic Tasks

Benchmark	GPT-5.5	GPT-5.4	Claude Opus 4.7	Gemini 3.1 Pro
Terminal-Bench 2.0	82.7%	75.1%	69.4%	68.5%
SWE-Bench Pro	58.6%	57.7%	64.3%*	-
Expert-SWE (Internal)	73.1%	68.5%	-	-
CyberGym	81.8%	79.0%	73.1%	-

*Anthropic acknowledged memorization on some SWE-Bench Pro problems

GPT-5.5 achieves state-of-the-art accuracy on Terminal-Bench 2.0, which tests complex command-line workflows requiring planning, iteration, and tool coordination. On SWE-Bench Pro, evaluating real-world GitHub issue resolution, GPT-5.5 reaches 58.6%, solving more tasks end-to-end in a single pass than previous models.

Knowledge Work and Computer Use

Benchmark	GPT-5.5	GPT-5.4	Claude Opus 4.7
GDPval (wins or ties)	84.9%	83.0%	80.3%
OSWorld-Verified	78.7%	75.0%	78.0%
Tau2-Bench Telecom	98.0%	92.8%	-

GPT-5.5 demonstrates significant improvements in everyday computer work, from document generation to spreadsheet modeling and operational research.

Advanced Reasoning

Benchmark	GPT-5.5	GPT-5.5 Pro	GPT-5.4 Pro
FrontierMath Tier 1-3	51.7%	52.4%	50.0%
FrontierMath Tier 4	35.4%	39.6%	38.0%
BrowseComp	84.4%	90.1%	89.3%

Real-World Impact: Early Tester Feedback

Engineering Workflows

Dan Shipper, Founder and CEO of Every, described GPT-5.5 as “the first coding model I've used that has serious conceptual clarity.” After spending days debugging a post-launch issue and eventually bringing in a senior engineer to rewrite part of the system, Shipper tested whether GPT-5.5 could produce the same solution from the broken state. GPT-5.4 could not; GPT-5.5 succeeded.

Pietro Schirano, CEO of MagicPath, experienced a similar breakthrough when GPT-5.5 merged a branch with hundreds of frontend and refactor changes into a substantially modified main branch, completing the work in approximately 20 minutes in a single pass.

Industry Adoption

Senior engineers testing GPT-5.5 reported it was noticeably stronger than both GPT-5.4 and Claude Opus 4.7 at reasoning and autonomy—catching issues in advance and predicting testing and review needs without explicit prompting. One engineer at NVIDIA stated: “Losing access to GPT-5.5 feels like I’ve had a limb amputated.”

Michael Truell, Co-founder & CEO at Cursor, noted: “GPT-5.5 is noticeably smarter and more persistent than GPT-5.4, with stronger coding performance and more reliable tool use. It stays on task for significantly longer without stopping early, which matters most for the complex, long-running work our users delegate to Cursor.”

Enterprise Applications at OpenAI

More than 85% of OpenAI employees now use Codex weekly across software engineering, finance, communications, marketing, data science, and product management.

Communications Team

The communications team used GPT-5.5 in Codex to analyze six months of speaking request data, build a scoring and risk framework, and validate an automated Slack agent. Low-risk requests are now handled automatically, while higher-risk requests are routed to human review.
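The routing rule described above (score each request, handle low-risk ones automatically, send the rest to a person) can be sketched as follows. This is only an illustration: the source does not publish the framework, so the `SpeakingRequest` fields, the weights, and `AUTO_THRESHOLD` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SpeakingRequest:
    # Hypothetical features; the real framework's inputs are not public.
    audience_size: int
    is_recorded: bool
    topic_sensitive: bool


# Hypothetical cutoff: scores at or below this are handled automatically.
AUTO_THRESHOLD = 1


def risk_score(req: SpeakingRequest) -> int:
    """Add up made-up weights for each risk signal present on the request."""
    score = 0
    if req.audience_size > 500:
        score += 1
    if req.is_recorded:
        score += 1
    if req.topic_sensitive:
        score += 2
    return score


def route(req: SpeakingRequest) -> str:
    """Return 'auto' for low-risk requests and 'human_review' otherwise."""
    return "auto" if risk_score(req) <= AUTO_THRESHOLD else "human_review"


if __name__ == "__main__":
    print(route(SpeakingRequest(50, False, False)))
    print(route(SpeakingRequest(1000, True, True)))
```

Keeping the score and the threshold separate means the cutoff can be tightened or loosened without touching how requests are scored.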

Finance Team

The finance team used Codex with GPT-5.5 to review 24,771 K-1 tax forms totaling 71,637 pages. The workflow excluded personal information and finished two weeks sooner than the prior year's review.

Go-to-Market Team

The team automated weekly business report generation using GPT-5.5, saving 5-10 hours per week per employee.

Scientific Research Capabilities

GPT-5.5 extends AI acceleration beyond software engineering into scientific research. OpenAI introduced new benchmarks for evaluating scientific capabilities:

GeneBench

Testing multi-stage genetics and quantitative biology data analysis (tasks typically requiring days to weeks of expert work):

· GPT-5.5: 25.0%

· GPT-5.4: 19.0%

· GPT-5.5 Pro: 33.2%

BixBench

Real-world bioinformatics and data analysis benchmark:

· GPT-5.5: 80.5%

· GPT-5.4: 74.0%

Ramsey Number Discovery

An internal version of GPT-5.5 combined with custom tool chains discovered a new proof for Ramsey numbers—a core object in combinatorial mathematics with sparse research results and high technical difficulty. The proof was subsequently formalized and verified in Lean.
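For readers unfamiliar with the object involved: the Ramsey number R(s, t) is the smallest n such that every red/blue coloring of the edges of the complete graph on n vertices contains a red clique of size s or a blue clique of size t. The sketch below is unrelated to the proof mentioned above, whose details the source does not give. It only illustrates the definition by brute-force checking the smallest nontrivial case, the well-known result R(3, 3) = 6. The function names are chosen for this example.

```python
from itertools import combinations, product


def has_mono_triangle(n, coloring):
    """True if some triangle has all three edges the same color.

    `coloring` maps each edge (i, j) with i < j to a color, 0 or 1.
    """
    for a, b, c in combinations(range(n), 3):
        if coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]:
            return True
    return False


def every_coloring_has_mono_triangle(n):
    """Exhaustively try all 2-colorings of the edges of K_n."""
    edges = list(combinations(range(n), 2))
    for colors in product((0, 1), repeat=len(edges)):
        if not has_mono_triangle(n, dict(zip(edges, colors))):
            return False  # found a coloring that avoids monochromatic triangles
    return True


if __name__ == "__main__":
    # K_5 can be colored without a monochromatic triangle; K_6 cannot.
    print(every_coloring_has_mono_triangle(5))  # False
    print(every_coloring_has_mono_triangle(6))  # True
```

Exhaustive search works here because K_6 has only 15 edges (32,768 colorings). The search space doubles with every added edge, which is one reason results on larger Ramsey numbers are sparse.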

Safety and Preparedness

OpenAI released GPT-5.5 with the company's strongest safeguards to date, designed to reduce misuse while preserving access for beneficial work. The model underwent evaluation across OpenAI's full suite of safety and preparedness frameworks, with input from internal and external red teamers and feedback from nearly 200 trusted early-access partners.

Safety Ratings

· Cybersecurity Capability: High

· Biological Capability: High

· Chemical Capability: High

No capabilities reached the Critical threshold.

Biosecurity Vulnerability Bounty

Alongside GPT-5.5, OpenAI launched a Biosecurity Vulnerability Bounty Program:

· Challenge: Find a universal jailbreak prompt that passes all 5 biosecurity questions in Codex Desktop without triggering safeguards

· Reward: $25,000 for the first successful universal jailbreak; smaller rewards for partial breakthroughs

· Application Window: April 23 - June 22, 2026

· Testing Window: April 28 - July 27, 2026

· Requirements: Existing ChatGPT account, NDA signature, AI red team/security/biosecurity experience

Availability and Pricing

ChatGPT

· GPT-5.5 Thinking: Available to Plus, Pro, Business, and Enterprise users

· GPT-5.5 Pro: Available to Pro, Business, and Enterprise users

Codex

· GPT-5.5: Available to Plus, Pro, Business, Enterprise, Edu, and Go plan users

· Context Window: 400K tokens

· Fast Mode: 1.5x token generation speed at 2.5x cost
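The Fast Mode trade-off (1.5x generation speed at 2.5x cost) can be put in numbers with a small helper. This is a sketch assuming the task is bound by token generation time, so tool calls and waiting are ignored. The function name and the sample figures are illustrative, and "cost" is in whatever unit a plan meters.

```python
FAST_SPEEDUP = 1.5          # from the Fast Mode figure above
FAST_COST_MULTIPLIER = 2.5  # from the Fast Mode figure above


def fast_mode_tradeoff(baseline_seconds: float, baseline_cost: float) -> dict:
    """Compare a generation-bound task in standard mode and in Fast Mode."""
    fast_seconds = baseline_seconds / FAST_SPEEDUP
    fast_cost = baseline_cost * FAST_COST_MULTIPLIER
    seconds_saved = baseline_seconds - fast_seconds
    extra_cost = fast_cost - baseline_cost
    return {
        "fast_seconds": fast_seconds,
        "fast_cost": fast_cost,
        "seconds_saved": seconds_saved,
        "extra_cost": extra_cost,
        "extra_cost_per_second_saved": extra_cost / seconds_saved,
    }


if __name__ == "__main__":
    # Hypothetical task: 30 minutes of generation costing 4 units.
    print(fast_mode_tradeoff(1800, 4.0))
```

Under these assumptions Fast Mode saves one third of the generation time while adding 150% to the cost, so it pays off mainly when waiting is the more expensive resource.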

API (Coming Soon)

Model	Input Price	Output Price	Context Window
GPT-5.5	$5/1M tokens	$30/1M tokens	1M tokens
GPT-5.5 Pro	$30/1M tokens	$180/1M tokens	1M tokens

· Batch/Flex Pricing: 50% of standard rates

· Priority Pricing: 2.5x standard rates

While GPT-5.5 pricing is higher than GPT-5.4 (approximately 3x), the improved token efficiency means most users will consume fewer tokens for the same tasks in Codex, resulting in comparable or lower effective costs.
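The pricing above reduces to simple arithmetic, sketched here. The prices and tier multipliers are the ones quoted in this section. The dictionary keys are the table's labels rather than confirmed API model identifiers (the API is not yet available), and the function names and sample token counts are assumptions for illustration.

```python
# USD per 1M tokens (input, output), as quoted in the table above.
PRICES = {
    "GPT-5.5": (5.00, 30.00),
    "GPT-5.5 Pro": (30.00, 180.00),
}

# Multipliers quoted above: Batch/Flex at 50%, Priority at 2.5x.
TIER_MULTIPLIER = {"standard": 1.0, "batch": 0.5, "priority": 2.5}


def request_cost(model: str, input_tokens: int, output_tokens: int,
                 tier: str = "standard") -> float:
    """Cost in USD of one request at the listed rates."""
    in_price, out_price = PRICES[model]
    base = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return base * TIER_MULTIPLIER[tier]


def break_even_token_fraction(price_ratio: float) -> float:
    """Fraction of the old token count at which the total cost is unchanged.

    If the new per-token price is `price_ratio` times the old one, spending
    stays flat only when token usage falls to 1 / price_ratio of the old usage.
    """
    return 1.0 / price_ratio


if __name__ == "__main__":
    # Hypothetical request: 200K input tokens, 20K output tokens.
    for tier in TIER_MULTIPLIER:
        print(tier, request_cost("GPT-5.5", 200_000, 20_000, tier))
    # With the "approximately 3x" price ratio stated above:
    print(break_even_token_fraction(3.0))
```

For the sample request this gives $1.60 at standard rates, $0.80 on Batch/Flex, and $4.00 on Priority. At a price ratio of about 3x, the same task would need to use roughly one third as many tokens for the cost to stay level.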

Technical Improvements

Infrastructure Optimization

Previously, OpenAI used fixed static partitions to balance computational load across GPUs. For GPT-5.5, Codex analyzed weeks of production traffic data and wrote custom heuristic partitioning algorithms. This single improvement increased token generation speed by over 20%—the model helped optimize the infrastructure it runs on.
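The source does not describe the partitioning algorithms themselves. To show why a load-aware heuristic can beat a fixed split, the sketch below compares a static round-robin assignment with longest-processing-time-first (LPT) greedy scheduling. LPT is a standard textbook heuristic used here as a stand-in, not OpenAI's method, and the work-item costs are made up.

```python
import heapq


def static_partition(costs, num_gpus):
    """Fixed split: item i always goes to GPU i % num_gpus, whatever its cost."""
    loads = [0.0] * num_gpus
    for i, cost in enumerate(costs):
        loads[i % num_gpus] += cost
    return loads


def greedy_partition(costs, num_gpus):
    """LPT heuristic: place the largest remaining item on the least-loaded GPU."""
    heap = [(0.0, gpu) for gpu in range(num_gpus)]  # (current load, gpu index)
    heapq.heapify(heap)
    loads = [0.0] * num_gpus
    for cost in sorted(costs, reverse=True):
        load, gpu = heapq.heappop(heap)
        loads[gpu] = load + cost
        heapq.heappush(heap, (loads[gpu], gpu))
    return loads


def makespan(loads):
    """The busiest GPU determines how long the whole batch takes."""
    return max(loads)


if __name__ == "__main__":
    # Hypothetical per-item costs with a skewed pattern.
    costs = [8, 1, 7, 1, 6, 1, 5, 1]
    print(makespan(static_partition(costs, 2)))  # 26.0
    print(makespan(greedy_partition(costs, 2)))  # 15.0
```

On this skewed input the static split leaves one GPU with 26 units of work and the other with 4, while the greedy assignment balances them at 15 each. Real traffic would call for heuristics tuned to observed request patterns, which is the kind of analysis the paragraph above attributes to Codex.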

Context Handling

· Codex: 400K token context window

· API: 1M token context window

· Improved performance on long-context tasks exceeding 256K tokens

Tool Use and Autonomy

GPT-5.5 demonstrates enhanced ability to:

· Hold context across large systems

· Reason through ambiguous failures

· Verify assumptions with tools

· Propagate changes throughout codebases

· Persist on complex, long-horizon tasks without premature termination

Competitive Positioning

Where GPT-5.5 Leads

· Terminal-Bench 2.0 (complex command-line workflows)

· GDPval (knowledge work across 44 professions)

· OSWorld-Verified (autonomous computer operation)

· Tau2-Bench Telecom (customer service workflows)

· CyberGym (cybersecurity tasks)

· Token efficiency and cost-effectiveness

Areas for Improvement

· SWE-Bench Pro: Claude Opus 4.7 reports 64.3% (vs. GPT-5.5 at 58.6%), though Anthropic acknowledged memorization concerns

· MCP Atlas: Claude Opus 4.7 (79.1%) and Gemini 3.1 Pro (78.2%) outperform GPT-5.5 (75.3%)

· Humanity’s Last Exam (with tools): GPT-5.4 Pro (58.7%) slightly exceeds GPT-5.5 Pro (57.2%)

· Very Long Context (256K+): Claude Opus 4.7 maintains advantages on some metrics

Use Cases and Best Practices

Recommended Applications for GPT-5.5

Agentic Coding: Full engineering workflows from implementation to testing
Complex Research: Multi-step information gathering and synthesis
Data Analysis: Spreadsheet modeling and operational research
Document Creation: Reports, presentations, and technical documentation
Computer Automation: Multi-tool workflows requiring iteration and verification
Scientific Research: Biology, genetics, and mathematical problem-solving

Best Practices

· Leverage Long Context: Utilize the 400K-1M token window for large codebases and documents

· Enable Thinking Mode: Use GPT-5.5 Thinking for complex reasoning tasks

· Optimize Prompts: Clear intent communication reduces token consumption

· Tool Integration: GPT-5.5 performs best with custom tool chains for specialized tasks

· Iterative Verification: Allow GPT-5.5 to check its own work before finalizing outputs

The Path Forward

GPT-5.5 represents a milestone in AI development, demonstrating that significant intelligence gains can be achieved without sacrificing speed or efficiency. OpenAI emphasizes this is one step in an ongoing journey, with more iterative versions planned.

The core value proposition of GPT-5.5 lies in its ability to deliver substantial intelligence improvements while maintaining production-ready performance characteristics—enabling large-scale deployment in real-world enterprise environments.

As GPT-5.5 sees broader adoption across coding, knowledge work, and scientific research domains, the model's impact on productivity and capability augmentation will continue to evolve.
