The GPT-5.6 family introduces Sol, Terra, and Luna in a phased preview requested by the U.S. government, pairing stronger reasoning capabilities with layered cyber safeguards.
- A Tiered Approach to Intelligence: The GPT-5.6 series introduces three distinct models—Sol, Terra, and Luna—designed to provide clear choices across peak performance, everyday efficiency, and high-volume speed.
- Unprecedented Safety Investments: To counter adaptive misuse, the models are shielded by a layered safeguard stack that includes real-time misuse classifiers and was stress-tested with over 700,000 A100-equivalent GPU hours of automated red-teaming.
- Government-Guided Rollout: At the request of the U.S. government, the launch is beginning as a limited preview for trusted partners, prioritizing security and framework development before wider public availability in the coming weeks.
The artificial intelligence landscape is shifting once again. In a move that balances frontier capability with caution, a limited preview of the GPT-5.6 series has begun. Rather than relying on version numbers alone, the launch introduces a naming system based on durable capability tiers: Sol, Terra, and Luna.
Designed to offer developers and enterprises clearer choices across intelligence, speed, and cost, the GPT-5.6 family is positioned to redefine how AI is integrated into complex, high-stakes workflows. However, the release is currently gated. In an unusual step, the initial rollout is restricted to a small group of trusted partners—a decision made in direct coordination with the U.S. government.

The GPT-5.6 Lineup
The new generation splits the traditional single-model approach into three tiers, covering everything from complex cybersecurity research to affordable, high-volume data processing.
| Model Tier | Target Workload | Input Pricing (per 1M tokens) | Output Pricing (per 1M tokens) |
|---|---|---|---|
| GPT-5.6 Sol | Flagship capabilities, deep reasoning | $5.00 | $30.00 |
| GPT-5.6 Terra | Balanced for efficient, everyday work | $2.50 | $15.00 |
| GPT-5.6 Luna | Fast, affordable, high-volume tasks | $1.00 | $6.00 |
Terra delivers performance competitive with the previous generation (GPT-5.5) at half the cost, while Luna lowers the price floor to $1.00 per million input tokens and $6.00 per million output tokens. But the true star of the preview is Sol.
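To make the per-token prices concrete, the following is a minimal sketch of a per-request cost estimate. The prices come from the table above; the tier keys (`"sol"`, `"terra"`, `"luna"`), the `TierPrice` class, and the `request_cost` helper are illustrative names, not part of any published API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPrice:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Preview list prices as reported in the table above.
# The dictionary keys are hypothetical labels chosen for this example.
PRICES = {
    "sol": TierPrice(5.00, 30.00),
    "terra": TierPrice(2.50, 15.00),
    "luna": TierPrice(1.00, 6.00),
}


def request_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request, ignoring caching."""
    price = PRICES[tier]
    total = (input_tokens * price.input_per_million
             + output_tokens * price.output_per_million)
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload: 20,000 input tokens and 2,000 output tokens.
    for tier in PRICES:
        print(f"{tier}: ${request_cost(tier, 20_000, 2_000):.4f}")
```

For a request with 20,000 input tokens and 2,000 output tokens, this works out to $0.16 on Sol, $0.08 on Terra, and $0.032 on Luna, so the same workload costs five times as much on Sol as on Luna.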
Pushing the Boundaries of Reasoning
GPT-5.6 Sol is described as the most capable model to date, introducing a new max reasoning effort setting that gives the model more time to think through complex problems. An ultra mode goes beyond single-agent operation, coordinating multiple subagents to speed up multi-step workflows.
Early benchmark data suggests Sol is shifting the performance-efficiency frontier across several highly specialized domains:
- Coding: Sets a new state-of-the-art on Terminal-Bench 2.1, a benchmark testing command-line workflows that require planning, tool coordination, and iteration.
- Biology: Outperforms previous models on GeneBench v1 for long-horizon genomics and quantitative-biology analyses while using fewer tokens.
- Cybersecurity: Demonstrates major improvements on ExploitBench and ExploitGym. It achieves results competitive with earlier specialized preview models while using only a third of the output tokens.
The Government’s Hand in a Staggered Release
Despite the leaps in capability, the general public will have to wait. The creators have previewed their plans and model capabilities with the U.S. government, leading to a restricted initial launch. The government requested a limited preview for a select group of trusted partners, whose participation has been shared with federal authorities.
While the creators are complying with this short-term step to help develop a cyber Executive Order framework and a repeatable process for future releases, they have publicly pushed back on the precedent. They noted that this kind of government access process should not become the long-term default, as it keeps vital tools out of the hands of global defenders, developers, and enterprises who need them to secure modern infrastructure.
General availability for ChatGPT, Codex, and API users is promised in the coming weeks once this initial testing and coordination phase concludes.

A Layered Fortress of Safeguards
With greater power comes a dramatically increased risk of misuse. GPT-5.6 launches with the most robust safety stack to date, specifically hardened against real-world attacks, sensitive cyber requests, and repeated jailbreak attempts.
Recognizing that no single safeguard is foolproof, the security architecture utilizes a multi-layered approach:
- Model-Level Training: The model is trained to refuse prohibited cyber assistance, even when users attempt to mask their intent.
- Real-Time Classifiers: As the model generates text, biological and cyber misuse classifiers monitor the output. If a higher-risk violation is detected, generation is paused while a larger reasoning model reviews the context. Disallowed output is withheld before the user ever sees it.
- Account-Level Monitoring: Flagged activity triggers reviews across a user’s broader conversation history to distinguish between a legitimate security researcher and a persistent malicious actor.
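The interaction between the last two layers can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names, the keyword-based stand-ins for the classifiers, and the flag threshold are hypothetical and do not describe the actual production system, whose internals have not been published.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Account:
    flags: int = 0  # count of confirmed violations on this account


def stream_with_safeguards(
    chunks: Iterable[str],
    fast_classifier: Callable[[str], bool],
    reviewer: Callable[[str], bool],
    account: Account,
) -> str:
    """Release chunks one at a time, escalating suspicious text for review."""
    released: List[str] = []
    for chunk in chunks:
        candidate = "".join(released) + chunk
        if fast_classifier(candidate):
            # "Pause": the new chunk is held back until the slower
            # reviewer has examined the full context.
            if reviewer(candidate):
                account.flags += 1  # feeds the account-level layer
                return "".join(released) + "[output withheld]"
        released.append(chunk)
    return "".join(released)


def needs_account_review(account: Account, threshold: int = 3) -> bool:
    """Hypothetical rule: repeated confirmed flags trigger a wider review."""
    return account.flags >= threshold


# Toy stand-ins for the real classifiers (simple keyword checks).
def toy_fast_classifier(text: str) -> bool:
    return "exploit" in text.lower()


def toy_reviewer(text: str) -> bool:
    return "full-chain exploit" in text.lower()
```

In this sketch a defensive request that merely mentions an exploit trips the fast check but passes the reviewer, so the text is released in full. A chunk that the reviewer confirms as disallowed is never appended to the released output, and the confirmed flag is recorded against the account.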
To fortify these defenses, the development team dedicated over 700,000 A100-equivalent GPU hours to automated red-teaming. This massive computational effort utilized specialized AI to hunt for “universal jailbreaks”—attacks that work across multiple contexts rather than isolated prompts. This automated stress-testing was paired with rigorous human expert red-teaming to account for creative, unanticipated misuse.
The Cyber Critical Threshold
A major focus of the preview is evaluating the model’s offensive capabilities. According to the company’s Preparedness Framework, GPT-5.6 Sol does not cross the “Cyber Critical” threshold. During evaluations involving software like Chromium and Firefox, Sol successfully identified bugs and exploitation primitives (the building blocks of an attack) but failed to autonomously produce a functional, full-chain exploit.
The primary goal of the safeguard stack is asymmetric: to make prohibited offensive activity both difficult to carry out and easy to detect, while preserving access for legitimate “dual-use” work like code review, patch development, and defensive security testing.

Economics, Caching, and Extreme Speed
Beyond the baseline per-token pricing, GPT-5.6 introduces more predictable prompt caching to help enterprises control costs. Developers can set explicit cache breakpoints and rely on a 30-minute minimum cache life. Cache writes are billed at a slight premium (1.25x the uncached input rate), while cache reads keep a 90% discount (0.10x the uncached input rate), which rewards prompts that reuse a stable prefix.
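A rough sketch of how these multipliers play out is shown below. It assumes a simplified billing model that the announcement does not spell out: the first request pays only the write rate for the shared prefix, every later request inside the cache lifetime is a cache hit, the per-request suffix is billed at the normal input rate, and output tokens are ignored. The function and parameter names are hypothetical.

```python
CACHE_WRITE_MULTIPLIER = 1.25  # cache writes: 1.25x the uncached input rate
CACHE_READ_MULTIPLIER = 0.10   # cache reads: 90% discount


def input_cost_uncached(rate_per_million: float, prefix_tokens: int,
                        suffix_tokens: int, n_requests: int) -> float:
    """Input cost in USD when every request pays the full rate."""
    tokens = n_requests * (prefix_tokens + suffix_tokens)
    return tokens * rate_per_million / 1_000_000


def input_cost_cached(rate_per_million: float, prefix_tokens: int,
                      suffix_tokens: int, n_requests: int) -> float:
    """Input cost in USD when the shared prefix is cached (assumed model)."""
    if n_requests <= 0:
        return 0.0
    # One cache write, then (n - 1) cache reads of the same prefix.
    prefix_units = prefix_tokens * (
        CACHE_WRITE_MULTIPLIER + (n_requests - 1) * CACHE_READ_MULTIPLIER
    )
    suffix_units = n_requests * suffix_tokens  # never cached
    return (prefix_units + suffix_units) * rate_per_million / 1_000_000


if __name__ == "__main__":
    # Sol input rate, 50,000-token shared prefix, 1,000-token suffix.
    for n in (1, 2, 10):
        plain = input_cost_uncached(5.00, 50_000, 1_000, n)
        cached = input_cost_cached(5.00, 50_000, 1_000, n)
        print(f"n={n}: uncached ${plain:.4f}, cached ${cached:.4f}")
```

Under these assumptions a single request is more expensive with caching because of the write premium, but the second request already recovers it: the cached prefix costs 1.25 + 0.10 = 1.35 units across two requests, against 2.00 units uncached.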
For enterprise customers requiring raw speed, the launch includes a hardware partnership. In July, GPT-5.6 Sol will launch on Cerebras hardware, boasting unprecedented speeds of up to 750 tokens per second. Access to this hyper-fast tier will initially be limited to select customers as hardware capacity scales.
As the limited preview unfolds, the tech industry and government regulators alike will be watching closely. The GPT-5.6 rollout is not just a test of a new language model; it is a live experiment in how humanity safely deploys frontier intelligence in an increasingly complex digital world.
