MODEL_RATE_LIMIT

You have exceeded the maximum number of requests that a model provider allows within a given time period and are temporarily blocked. The limit generally resets after a certain amount of time.

Troubleshooting

The following may help resolve this error:

  • Contact your model provider and ask for a rate limit increase.
  • If many of your incoming requests are identical, use model response caching.
  • Spread requests across different providers if your application allows it.
  • Set a higher number of max retries when initializing your model. LangChain retries requests that fail in this way with exponential backoff, so a later retry may land after your limit has reset.
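
To illustrate why a higher retry count helps, here is a minimal sketch of exponential backoff in plain Python. It is not LangChain's internal implementation. `RateLimitError`, `call_with_backoff`, and all parameter names and defaults are hypothetical, chosen only for this example.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate limit error."""


def call_with_backoff(fn, max_retries=6, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff.

    The wait doubles after each failed attempt (1s, 2s, 4s, ...), capped at
    max_delay, with up to 10% random jitter added so that many clients do not
    retry at the same instant.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

With these assumed defaults, six retries wait roughly 1 + 2 + 4 + 8 + 16 + 32 = 63 seconds in total before giving up, which is often enough to outlast a per-minute limit. Each additional retry doubles the last wait until the cap is reached.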
