Introducing GPT-4.1 in the API | OpenAI
A new series of GPT models featuring major improvements on coding, instruction following, and long context, plus the first-ever nano model.
- Coding: GPT‑4.1 scores 54.6% on SWE-bench Verified, an absolute improvement of 21.4 percentage points over GPT‑4o and 26.6 percentage points over GPT‑4.5, making it a leading model for coding.
- Instruction following: On Scale’s MultiChallenge benchmark, a measure of instruction-following ability, GPT‑4.1 scores 38.3%, an absolute increase of 10.5 percentage points over GPT‑4o.
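The improvements above are stated in absolute percentage points, so the implied baseline scores can be recovered by subtraction. The following is a minimal sketch of that arithmetic using only the figures quoted above; the helper names `baseline` and `relative_gain` are illustrative, and the derived baselines are inferences from the stated numbers rather than figures reported in this text.

```python
def baseline(new_score: float, abs_gain: float) -> float:
    """Implied earlier score, given the new score and an absolute gain in points."""
    return round(new_score - abs_gain, 1)


def relative_gain(new_score: float, abs_gain: float) -> float:
    """Gain expressed as a percentage of the implied baseline (not absolute points)."""
    old = new_score - abs_gain
    return round(100.0 * abs_gain / old, 1)


# Figures quoted in the text: (GPT-4.1 score, absolute gain over the compared model)
swe_vs_4o = (54.6, 21.4)    # SWE-bench Verified, compared with GPT-4o
swe_vs_45 = (54.6, 26.6)    # SWE-bench Verified, compared with GPT-4.5
multi_vs_4o = (38.3, 10.5)  # MultiChallenge, compared with GPT-4o

for label, (score, gain) in [
    ("SWE-bench Verified, GPT-4o", swe_vs_4o),
    ("SWE-bench Verified, GPT-4.5", swe_vs_45),
    ("MultiChallenge, GPT-4o", multi_vs_4o),
]:
    print(f"{label}: implied baseline {baseline(score, gain)}%, "
          f"relative gain {relative_gain(score, gain)}%")
```

The distinction matters when comparing results: a gain of 21.4 absolute points from an implied baseline of about 33% is a much larger relative change than the raw number suggests.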
by Matthew Berman