how to use OpenAI's Batch API to send asynchronous groups of requests with 50% lower costs, a separate pool of significantly higher rate limits, and a clear 24-hour turnaround time. The service is ideal for processing jobs that don't require immediate responses.
- Better cost efficiency: 50% cost discount compared to synchronous APIs
- Higher rate limits: Substantially more headroom compared to the synchronous APIs
- Fast completion times: Each batch completes within 24 hours (and often more quickly)
Eval benchmark | ada v2 | text-embedding-3-small | text-embedding-3-large |
MIRACL average | 31.4 | 44.0 | 54.9 |
MTEB average | 61.0 | 62.3 | 64.6 |
No comments:
Post a Comment