
OpenAI shipped GPT-5.4 two days after GPT-5.3 — not because it was ready on schedule, but because 2.5 million users were leaving over the Pentagon deal and the company needed a reason for them to stay.

GPT-5.4: Is It Free? The Pentagon Story Behind the Rush Release

GPT-5.4 — Fast Facts (March 5, 2026):

  • Released: March 5, 2026 — two days after GPT-5.3 Instant, amid a 295% ChatGPT uninstall spike following OpenAI's Pentagon deal
  • What's new: First general-purpose AI model to beat humans at computer use (75% vs 72.4% human baseline on OSWorld-Verified); 1M token context; Tool Search cuts agent token costs by 47%; mid-response steering; Codex /fast mode at 1.5x token speed
  • Key pricing cliff: Standard API is $2.50/M input and $15.00/M output — but cross 272K input tokens and input doubles to $5.00/M while output rises to $22.50/M, applied to the entire session
  • Free tier: No access to GPT-5.4. Free users stay on GPT-5.3 Instant (non-thinking). GPT-5.4 Thinking requires Plus ($20/mo) minimum
  • API strings: gpt-5.4, gpt-5.4-pro, gpt-5.4-thinking
  • GPT-5.2 retirement: June 5, 2026 — same 90-day window as GPT-5.3's retirement of GPT-5.2 Thinking
  • Safety classification: "High cyber capability" under OpenAI's Preparedness Framework — same classification as GPT-5.3 Codex

On February 28, 2026, Anthropic walked away from a Pentagon contract. The Department of Defense had refused to include language explicitly prohibiting autonomous weapons and mass surveillance of U.S. citizens. Anthropic said no. Hours later, OpenAI said yes — taking the same deal with the same unresolved restrictions, after Sam Altman acknowledged the timing looked "opportunistic and sloppy."

The public response was immediate. App uninstalls spiked 295% over a single weekend. An Instagram account called "quitGPT" gained 10,000 followers overnight. A Reddit post calling for users to cancel and delete ChatGPT reached 30,000 upvotes. An estimated 2.5 million users pledged to switch platforms or already had. Anthropic's Claude hit #1 on the U.S. App Store — the first time any competitor had topped ChatGPT in that ranking.

Five days later, OpenAI shipped GPT-5.4. Two days after GPT-5.3 Instant. This is the fastest back-to-back model release in OpenAI's history — and the timing is not a coincidence. Whether GPT-5.4's capabilities are enough to stem the QuitGPT tide is an open question. The capabilities themselves are real and significant. Here's the complete breakdown: what GPT-5.4 actually does, what the 272K token pricing trap costs you, and whether the QuitGPT movement has a point.

What Is GPT-5.4? (The Model-Line Reset Explained)

GPT-5.4 is OpenAI's new mainline reasoning model, and the first release the company explicitly describes as a "model-line reset" rather than an incremental update. The naming jump from GPT-5.3 Codex to GPT-5.4 reflects what changed: this is the first GPT mainline model to absorb the frontier coding capabilities of GPT-5.3 Codex while also adding native computer use, 1M context, and professional knowledge work improvements, all in a single default model across every surface.

The practical significance: previously, a developer choosing between GPT-5.3 Codex (strong coding, limited general reasoning) and GPT-5.3 Instant (conversational, no deep thinking) was making a tradeoff. GPT-5.4 eliminates that tradeoff. It replaces GPT-5.2 Thinking as the default reasoning model in ChatGPT and replaces GPT-5.3 Codex as the recommended API model for serious professional work.

The 4 Headline Features (And What They Actually Mean)

1. Computer Use: GPT-5.4 Beats Humans at Controlling a Computer

GPT-5.4 is the first general-purpose AI model to surpass human performance on OSWorld-Verified — scoring 75.0% versus a human baseline of 72.4%. For context, GPT-5.2 scored 47.3% on the same benchmark. That 27.7-point jump in a single model generation is the largest improvement on any major benchmark in OpenAI's release history.

What this means in practice: GPT-5.4 can operate a computer through both Playwright code and direct mouse/keyboard commands from screenshots. It doesn't just generate instructions — it can execute them. Log into a portal, navigate a form, pull data from a web app, fill in a spreadsheet, click through a multi-step workflow. Mainstay, which runs agents across approximately 30,000 property tax portals, reported a 95% first-attempt success rate and sessions running three times faster while using 70% fewer tokens versus prior computer-use models.

The safety architecture around this feature is explicit: developers can configure custom confirmation policies that adjust risk tolerance for specific applications. High-stakes actions (form submissions, purchases, deletions) can require human confirmation; low-stakes navigation can run autonomously. OpenAI classifies GPT-5.4 as "High cyber capability" under its Preparedness Framework — the same classification that triggered API access delays for GPT-5.3 Codex. This time, the classification did not delay the launch.
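A confirmation policy of the kind described above can be sketched as a small allow/confirm classifier. This is a minimal illustration, not OpenAI's actual configuration API; the action names, the category sets, and the default-deny rule are all assumptions for the example.

```python
# Hypothetical sketch of a computer-use confirmation policy: high-stakes
# actions require human sign-off, low-stakes navigation runs autonomously.
# Category names and the policy shape are invented for illustration.
HIGH_STAKES = {"submit_form", "purchase", "delete", "send_email"}
LOW_STAKES = {"click", "scroll", "type", "navigate"}

def requires_confirmation(action: str) -> bool:
    """Return True when a human should approve the action first."""
    if action in HIGH_STAKES:
        return True
    if action in LOW_STAKES:
        return False
    return True  # default-deny: unrecognized actions get a human in the loop

for action in ("scroll", "purchase", "wire_funds"):
    print(action, "->", "confirm" if requires_confirmation(action) else "auto")
```

The default-deny branch is the design point worth copying: an agent that encounters an action type outside its known categories should escalate to a human rather than act.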

2. Tool Search: 47% Fewer Tokens for Agent Workflows

Tool Search is the feature with the highest practical impact for developers building MCP server ecosystems and complex agent workflows — and it got buried under the computer use headlines.

The problem it solves: in complex agent systems, you might have 50–200 tool definitions that the model needs access to. Currently, you pass all those definitions in every request, consuming thousands of tokens before the model writes a single word. Tool Search lets GPT-5.4 receive a lightweight tool list instead and look up full definitions on demand. The result: 47% reduction in token usage for complex multi-tool workflows. At $2.50/M input tokens (or $5.00/M above 272K), that reduction is meaningful at any scale above a few hundred daily requests.
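The pattern Tool Search implements can be sketched in plain Python: send the model only tool names and one-line descriptions up front, then resolve a full JSON-schema definition when a tool is actually selected. The registry and helper functions below are illustrative assumptions, not OpenAI's API surface.

```python
# Full tool definitions live in a local registry; only the lightweight
# list is sent with every request. (Illustrative registry, one tool shown.)
FULL_TOOL_DEFS = {
    "get_invoice": {
        "name": "get_invoice",
        "description": "Fetch an invoice by ID",
        "parameters": {
            "type": "object",
            "properties": {"invoice_id": {"type": "string"}},
            "required": ["invoice_id"],
        },
    },
    # ...dozens more in a real agent system...
}

def lightweight_tool_list(defs: dict) -> list[dict]:
    """Names and short descriptions only: a fraction of the token cost."""
    return [{"name": d["name"], "description": d["description"]}
            for d in defs.values()]

def lookup_tool(name: str) -> dict:
    """Resolve the full schema on demand, once the model picks a tool."""
    return FULL_TOOL_DEFS[name]

print(lightweight_tool_list(FULL_TOOL_DEFS))
```

The savings scale with registry size: with 100 tools, every request that previously carried 100 full schemas now carries 100 one-liners plus at most a handful of on-demand lookups.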

3. 1 Million Token Context — With a Hidden Cost Cliff at 272K

GPT-5.4 supports 1,050,000 input tokens and 128,000 output tokens in the API — matching Gemini 3.1 Pro's 1M context and more than doubling GPT-5.2's 400K limit. In Codex specifically, the full 1M context is the default. This means you can feed GPT-5.4 an entire large codebase, a full legal contract library, or a year's worth of financial documents in a single session without chunking or summarizing.

The 272K Pricing Cliff — Read This Before You Build:

GPT-5.4 has dynamic token pricing that activates at a hard threshold of 272,000 input tokens. Below 272K: $2.50/M input, $15.00/M output. Above 272K: input cost doubles to $5.00/M and output rises to $22.50/M — and critically, the surcharge applies to the entire session once you cross the threshold, not just the tokens above it. If you're building applications that regularly process large codebases or legal libraries that run 300–500K tokens per session, model your actual costs at the doubled rate. The "1 million token context" headline is real — the linear pricing assumption is not. The GPT-5.4 Pro API tier ($30/M input, $180/M output) carries additional surcharges for high-context sessions on top of its already 12x price premium over standard GPT-5.4.
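Under the rates described above, the cliff is easy to model before you build. A minimal sketch, with prices taken from this article (verify against OpenAI's current pricing page):

```python
def gpt54_session_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate a GPT-5.4 session cost in USD under the article's rates:
    $2.50/M in, $15.00/M out below 272K input tokens; $5.00/M in and
    $22.50/M out applied to the WHOLE session once input exceeds 272K."""
    if input_tokens > 272_000:
        in_rate, out_rate = 5.00, 22.50  # surcharge covers the entire session
    else:
        in_rate, out_rate = 2.50, 15.00
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Crossing the threshold by 10K input tokens nearly doubles the bill:
print(f"${gpt54_session_cost(270_000, 20_000):.2f}")
print(f"${gpt54_session_cost(280_000, 20_000):.2f}")
```

A 270K-input session and a 280K-input session differ by under 4% in tokens but by roughly 2x in cost, which is exactly the behavior to budget for.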

It's also worth noting: benchmark quality degrades at the far end of the context window. OpenAI's own MRCR and Graphwalks evaluations confirm that extremely large-context retrieval — near the 900K–1M token range — is meaningfully weaker than short- and mid-context performance. The 1M context window is real. Perfect recall at 1M is not.

4. Mid-Response Steering: Redirect the Model While It's Still Thinking

GPT-5.4 Thinking in ChatGPT now shows an upfront plan of its reasoning before it starts generating — and users can redirect the model mid-response rather than starting over when it heads in the wrong direction. This is particularly useful for extended research workflows and complex analysis where a wrong assumption early in the reasoning chain would otherwise waste the entire generation. The model also handles longer reasoning stretches with better context retention, reducing the quality dropoff that GPT-5.2 showed in reasoning chains above 20–30 steps.

GPT-5.4 Benchmarks: The Full Picture

Benchmark results, GPT-5.4 vs GPT-5.2 vs GPT-5.3 Codex:

  • OSWorld-Verified (computer use): GPT-5.4 75.0% (✅ beats the 72.4% human baseline) · GPT-5.2 47.3%. Tests desktop control via screenshots and keyboard/mouse.
  • GDPval (44 occupations): GPT-5.4 83.0% · GPT-5.2 70.9%. Tests professional knowledge work across the top 9 U.S. GDP industries.
  • Investment banking spreadsheet modeling: GPT-5.4 87.3% · GPT-5.2 68.4%. Tests junior analyst-level financial modeling tasks.
  • SWE-Bench Pro (coding): GPT-5.4 57.7% · GPT-5.2 55.6% · GPT-5.3 Codex 56.8%. Tests real GitHub issue resolution; minimal improvement over 5.3 Codex.
  • BrowseComp (web research): GPT-5.4 +17pp vs the GPT-5.2 baseline. Tests deep web research retrieval.
  • Harvey BigLaw Bench (legal): GPT-5.4 91%. Tests BigLaw-grade legal research and analysis.
  • Factual accuracy (claim-level): 33% fewer false claims vs the GPT-5.2 baseline; full responses are 18% less likely to contain any error.

The honest benchmark read: GPT-5.4's biggest gains are in computer use and professional knowledge work (documents, spreadsheets, legal research). For pure coding, the improvement over GPT-5.3 Codex is barely one percentage point on SWE-Bench Pro. If coding is your primary use case and you're already on GPT-5.3 Codex, the upgrade argument is efficiency and context window — not raw coding capability.

GPT-5.4 Pricing: Every Plan and the Hidden Surcharges

Plan, price, and GPT-5.4 access:

  • Free ($0): ❌ no GPT-5.4 access; GPT-5.3 Instant only (non-thinking)
  • Plus ($20/mo): ✅ GPT-5.4 Thinking, the entry point for full model access; 80 messages per 3 hours
  • Team ($25–30/user/mo): ✅ GPT-5.4 Thinking, same as Plus; higher rate limits than Plus
  • Pro ($200/mo): ✅ GPT-5.4 + GPT-5.4 Pro; no message cap on standard GPT-5.4, plus Pro tier access
  • API, Standard ($2.50/M input · $0.25/M cached · $15.00/M output): model string gpt-5.4; ⚠️ input doubles to $5.00/M (output $22.50/M) above 272K input tokens per session
  • API, Pro ($30.00/M input · $180.00/M output): model string gpt-5.4-pro; 12x price premium, with additional surcharges for high-context sessions

GPT-5.4 vs. Claude Opus 4.6 vs. Gemini 3.1 Pro

Head-to-head, GPT-5.4 vs Claude Opus 4.6 vs Gemini 3.1 Pro:

  • Computer use: GPT-5.4 75% on OSWorld (✅ beats humans) · Opus 4.6 65.4% on Terminal-Bench 2.0 · Gemini 3.1 Pro supports it via Gemini Live
  • Context window: GPT-5.4 1.05M tokens (API/Codex) · Opus 4.6 200K · Gemini 3.1 Pro 1M
  • Coding: GPT-5.4 57.7% on SWE-Bench Pro · Opus 4.6 79.6% on SWE-bench Verified · Gemini 3.1 Pro 80.6% on SWE-bench Verified
  • API input price (standard): GPT-5.4 $2.50/M · Opus 4.6 $5.00/M · Gemini 3.1 Pro $2.00/M
  • Pentagon deal / QuitGPT controversy: OpenAI yes, the core reason for the QuitGPT movement · Anthropic ✅ walked away from the DoD contract · Google has existing DoD cloud contracts
  • Professional knowledge work (GDPval): GPT-5.4 83.0% · Opus 4.6 ~77% · Gemini 3.1 Pro ~79%

The honest competitive read for coders specifically:

If coding is your primary use case, Claude Opus 4.6 (79.6% SWE-bench Verified) and Gemini 3.1 Pro (80.6%) still lead GPT-5.4 (57.7% SWE-Bench Pro) on the benchmarks that matter most for software engineering. GPT-5.4's coding improvement over GPT-5.3 Codex is less than one percentage point. The model's advantages are in computer use, professional document/spreadsheet work, and long-context retrieval — not raw coding capability. For teams where coding is 80% of AI usage, this is not the upgrade that changes the calculus.

The QuitGPT Situation: What You Need to Know

GPT-5.4 launched inside the most significant public backlash OpenAI has ever faced. The sequence:

On February 28, Anthropic walked away from a Pentagon contract after the DoD refused language explicitly prohibiting autonomous weapons and mass surveillance of U.S. citizens. Hours later, OpenAI signed a deal with the same DoD, with the same restrictions unresolved. Sam Altman publicly acknowledged the timing looked "opportunistic and sloppy." The contract was amended after public backlash to explicitly bar domestic surveillance — but the original deal and the original decision remained the story.

ChatGPT app uninstalls spiked 295% in the days that followed. An estimated 2.5 million users pledged to cancel or had already canceled. Claude hit #1 on the U.S. App Store. Several European universities began reviewing their institutional ChatGPT licenses citing data governance concerns. The GPT-5.4 launch — two days after GPT-5.3 Instant — was openly described by multiple analysts as a defensive product move to give users a reason to stay.

GPT-5.4 ships with an expanded cyber safety stack covering monitoring systems, trusted access controls, and asynchronous blocking for higher-risk requests on Zero Data Retention surfaces. OpenAI introduced a new open-source evaluation for Chain-of-Thought controllability, finding that GPT-5.4 Thinking has low ability to deliberately obfuscate its reasoning — presented as a safety signal that CoT monitoring remains effective. Whether these safety communications address the actual concern behind QuitGPT — not AI capability safety, but the company's political and contractual choices — is a separate question.

How to Access GPT-5.4

In ChatGPT:

  1. Log in to chatgpt.com — requires Plus ($20/mo), Team, or Pro plan
  2. Select "GPT-5.4 Thinking" from the model dropdown — rolling out March 5–7 to Plus, Team, and Pro users
  3. Enterprise and Edu: enable early access through admin settings
  4. GPT-5.2 Thinking remains available in Legacy Models until June 5, 2026
  5. Free users: no access — GPT-5.3 Instant remains the free tier default

Via API:

  1. Model string: gpt-5.4 (standard) or gpt-5.4-pro (Pro tier)
  2. Also: gpt-5.4-thinking for explicit Thinking mode access
  3. Standard pricing: $2.50/M input, $0.25/M cached input, $15.00/M output
  4. ⚠️ Monitor session token counts carefully — costs double above 272K input tokens for the full session
  5. Tool Search: include "tool_search": true in your API request to enable lightweight tool listing with on-demand definition lookup — recommended for any workflow with 10+ tool definitions
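Put together, a request body under the settings above might look like the sketch below. The model string and the tool_search flag come from this article; the flag's exact name and placement should be verified against OpenAI's current API reference before relying on it.

```python
import json

# Hypothetical GPT-5.4 request body, built from the settings listed above.
request_body = {
    "model": "gpt-5.4",          # or "gpt-5.4-pro" / "gpt-5.4-thinking"
    "input": "Summarize the attached contract.",
    "tool_search": True,         # lightweight tool list with on-demand lookup
}

print(json.dumps(request_body, indent=2))
```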

In GitHub Copilot:

  1. GPT-5.4 is generally available in GitHub Copilot as of March 5, 2026
  2. Select GPT-5.4 from the Copilot model picker in VS Code, JetBrains, or GitHub.com
  3. Codex /fast mode available in Codex CLI: add --fast flag for 1.5x token speed with no intelligence change

Frequently Asked Questions

Is GPT-5.4 Free?

No. GPT-5.4 is not available on ChatGPT's free tier. Free users continue to access GPT-5.3 Instant, which does not support extended reasoning (Thinking mode). GPT-5.4 Thinking starts at ChatGPT Plus ($20/month) with a limit of 80 messages per 3 hours. The API is available to any paying OpenAI API customer at $2.50/M input tokens — with the 272K surcharge threshold applying above that context length.

What Is the QuitGPT Movement?

QuitGPT is a user-led boycott of ChatGPT that emerged after OpenAI signed a contract with the U.S. Department of Defense in late February 2026. Hours earlier, Anthropic had walked away from the same deal after the DoD refused to include language explicitly prohibiting autonomous weapons and mass surveillance of U.S. citizens. OpenAI took the deal. ChatGPT uninstalls spiked 295%, an estimated 2.5 million users pledged to switch, and Anthropic's Claude hit #1 on the U.S. App Store. GPT-5.4 launched five days later.

What Is GPT-5.4 Pro?

GPT-5.4 Pro is a higher-capability API tier (model string: gpt-5.4-pro) priced at $30/M input and $180/M output — a 12x price premium over standard GPT-5.4. It targets extended-horizon reasoning tasks that require the highest reasoning depth and tolerate much higher latency and cost. Available to Pro ($200/mo) and Enterprise plan holders. For most professional workflows, standard GPT-5.4 at $2.50/M input delivers the benchmark results cited in this article.

What Happened to GPT-5.3 Codex?

GPT-5.4 absorbs and replaces GPT-5.3 Codex as OpenAI's recommended model for software engineering. On SWE-Bench Pro, GPT-5.4 scores 57.7% versus GPT-5.3 Codex's 56.8% — a marginal improvement. The practical advantages of the upgrade are the unified model surface (no more choosing between coding and general-purpose models), the Codex /fast mode (1.5x token velocity), and the 1M context window now available in Codex. GPT-5.3 Codex is not officially deprecated yet — no retirement date has been announced as of March 7, 2026.

How Does the 272K Token Pricing Cliff Work?

GPT-5.4 uses dynamic token pricing with a threshold at 272,000 input tokens per session. Below 272K: $2.50/M input, $15.00/M output. Cross that threshold and the rate doubles to $5.00/M input and $22.50/M output — and the surcharge applies to the entire session, not just the tokens above the threshold. For applications that process large documents, long conversations, or full codebases, calculate your expected session token count before building on GPT-5.4. Sessions consistently running above 300K tokens will incur costs approximately double the headline API rate.
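As a worked example under these rates: a session with 350K input and 30K output tokens is billed entirely at the surcharge rate, roughly 1.8x what a naive linear reading of the headline price would suggest. Rates are taken from this article.

```python
# Whole session billed at the surcharge rate once input crosses 272K:
actual = 350_000 / 1e6 * 5.00 + 30_000 / 1e6 * 22.50    # $2.425 per session
# What the headline (below-threshold) rate would have implied:
headline = 350_000 / 1e6 * 2.50 + 30_000 / 1e6 * 15.00  # $1.325 per session
print(round(actual, 3), round(headline, 3))
```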

Does GPT-5.4 Replace GPT-5.2 in ChatGPT?

Yes — GPT-5.4 Thinking is replacing GPT-5.2 Thinking as the default reasoning model for Plus, Team, and Pro users. GPT-5.2 Thinking remains available in the Legacy Models section of ChatGPT until June 5, 2026, when it will be retired — 90 days from the GPT-5.4 launch date.
