Introducing GPT-4.1 in the API | OpenAI
A new series of GPT models featuring major improvements in coding, instruction following, and long context, plus the first-ever nano model.
- Coding: GPT-4.1 scores 54.6% on SWE-bench Verified, an absolute improvement of 21.4 percentage points over GPT-4o and 26.6 percentage points over GPT-4.5, making it a leading model for coding.
- Instruction following: On Scale's MultiChallenge benchmark, a measure of instruction-following ability, GPT-4.1 scores 38.3%, an absolute increase of 10.5 percentage points over GPT-4o.
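
The gains quoted above are absolute (percentage-point) differences, not relative ones, and the two are easy to confuse. As a rough sketch, the Python below works back from the quoted scores and gains to the baseline each one implies, along with the corresponding relative gain. The function names are hypothetical, and the baselines are only what the quoted figures imply, not separately reported values.

```python
def implied_baseline(score: float, abs_gain: float) -> float:
    """Baseline score implied by a new score and an absolute (percentage-point) gain."""
    return round(score - abs_gain, 1)


def relative_gain(score: float, abs_gain: float) -> float:
    """Relative improvement over the implied baseline, in percent."""
    baseline = score - abs_gain
    return round(100 * abs_gain / baseline, 1)


# Figures quoted in the text above: (GPT-4.1 score, absolute gain over the other model).
quoted = {
    "SWE-bench Verified vs GPT-4o": (54.6, 21.4),
    "SWE-bench Verified vs GPT-4.5": (54.6, 26.6),
    "MultiChallenge vs GPT-4o": (38.3, 10.5),
}

for label, (score, gain) in quoted.items():
    print(f"{label}: implied baseline {implied_baseline(score, gain)}%, "
          f"relative gain {relative_gain(score, gain)}%")
```

A relative gain computed this way is always larger than the percentage-point figure when the baseline is below 100%, which is why it matters which of the two a headline number refers to.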
by Matthew Berman
| | GPT-4o | GPT-4.1 |
|---|---|---|
| Strengths | – Built-in support in ChatGPT (no setup) – Rich multimodal features (text, image, voice, image generation) – Strong general knowledge and creative flair | – About 40% faster responses – Better at coding and following detailed instructions – Huge context window (up to 1M tokens) – More up-to-date training (through mid-2024) – Slightly fewer hallucinations and more literal output | 
| Weaknesses | – Slower response times – Knowledge cutoff in late 2023 (unless you enable browsing) – Message limits for free users – Slightly looser adherence to exact instructions | – API-only at launch (no direct ChatGPT integration yet) – Requires clear, well-specified prompts – No built-in image generation in chat | 
 