GPT-4.5 is here, and OpenAI’s newest generative AI model is bigger and more compute-intensive than ever—it’s supposedly also better at understanding what ChatGPT users mean with their prompts. Users who want to be part of the first wave to try GPT-4.5, labeled as a research preview, will be required to pay for OpenAI’s $200-a-month ChatGPT Pro subscription.
Prior to this launch, 2025 has already been filled with new AI model releases. Anthropic recently put out a hybrid reasoning model for its Claude chatbot. Before that, Chinese researchers at DeepSeek rocked Silicon Valley with their release of a powerful model trained on a tiny budget, prompting OpenAI to drop a “mini” version of its reasoning model a month ago.
Alongside these new releases, OpenAI has promised to invest billions in building the AI infrastructure required to fuel ever more massive models. GPT-4.5 is a reaffirmation of the startup's current strategy: Bigger is better.
GPT-4.5 stands in stark contrast to other recent AI innovations, like DeepSeek's R1, that attempted to match the performance of a frontier model with as few resources as possible. OpenAI still sees a strong path forward in scaling its models. According to researchers who worked on GPT-4.5, this maximalist approach to model development has captured more of the nuances of human emotions and interactions.
They also see the model’s size as potentially helping this iteration hallucinate less frequently than past releases. “If you know more things, you don’t need to make things up,” says Mia Glaese, who leads OpenAI’s alignment and human data teams. Exactly how big or compute-intensive GPT-4.5 is remains unclear—OpenAI declined to share specific numbers.
So, what’s it like to use the new model? GPT-4.5 supports the web search and canvas features as well as file and image uploads, though it’s not yet compatible with AI Voice Mode.
In the announcement post for GPT-4.5, OpenAI included academic benchmark results showing the model getting vastly outpaced by the o3-mini model on math, and slightly upstaged on science as well, though GPT-4.5 did score a little higher on language benchmarks. The researchers say these measurements don’t capture the full story. “We would expect the difference in 4.5 to be similar to the experience difference of 4 to 3.5,” says Glaese. For users, prompts related to subjects like writing or programming may yield stronger results, with the back-and-forth interactions feeling more “natural” overall. She hopes the chats from this limited release will help OpenAI better understand what GPT-4.5 excels at, as well as its limitations.
Unlike the models released as part of OpenAI’s “o” series, GPT-4.5 is not considered a reasoning model. The company’s CEO, Sam Altman, posted on social media earlier in February that OpenAI would “ship GPT-4.5, the model we called Orion internally, as our last non-chain-of-thought model.” Nick Ryder, who leads the company’s foundations research team, clarified that this statement pertained to streamlining OpenAI’s product road map, not its research road map. The startup isn’t focused solely on reasoning models, but users can expect a more blended experience in future ChatGPT releases, where they won’t have to pick which model to use.
“Saying this is the last non-reasoning model really means we’re really striving to be in a future where all users are getting routed to the right model,” says Ryder. After the user logs in to ChatGPT, the AI tool should be able to gauge which model to utilize in response to their prompts. While first designed as a way to easily switch between the different available options, the dropdown model menu in ChatGPT has become difficult to parse for users trying to understand when it’s advantageous to choose o3-mini-high rather than GPT-4o or some other pick.
In the face of increasing competitive pressure, OpenAI still wants to be seen as on the cutting edge of the technology, and it is investing in pretraining as part of that strategy. “By increasing the amount of compute we use, by increasing the amount of data we use, and focusing on really efficient training methods,” says Ryder, “we push the frontier of unsupervised learning.”
Given GPT-4.5’s alleged massive size, does it become even harder to parse what’s going on inside the model? Ryder doesn’t think interpretability, the attempt to understand why a model generates specific outputs, will be harder due to scaling. In fact, he sees the same methods used for smaller models applying directly to these more massive endeavors.
As part of WIRED’s ongoing coverage of new software releases, I’ll be testing GPT-4.5 to see firsthand how it compares to the competition and to past releases. It may be difficult to compare it to other versions, since OpenAI’s characterization of GPT-4.5’s potential strengths, like stronger intuition, better emotional intelligence, and aesthetic taste, leans into an almost abstract sense of anthropomorphism. The company has long wanted to build an AI capable of matching the labor output of a remote worker, and now it’s hoping to nail the soft skills as well.