What is the difference between rate limiting and concurrent API calls?

Aya Tofach

The Rate Limit is the maximum number of calls a user can make per hour and per day. When you exceed one or both limits, your call returns a 429 error. For example, if a user has a daily limit of 1,000 API calls, call #1001 returns a 429 error code with no data.
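The behavior described above can be sketched from the client's point of view. This is a minimal illustration, not our actual implementation: `make_fake_api` is a hypothetical stand-in for the real API that counts calls against a daily limit, and `fetch_or_none` is a hypothetical helper showing one way to treat a 429 response as "no data".

```python
from typing import Callable, Optional, Tuple

RATE_LIMITED = 429  # HTTP status code "Too Many Requests"

Response = Tuple[int, Optional[dict]]


def make_fake_api(daily_limit: int) -> Callable[[], Response]:
    """Hypothetical stand-in for the API: counts calls against a daily limit."""
    calls_made = 0

    def send() -> Response:
        nonlocal calls_made
        calls_made += 1
        if calls_made > daily_limit:
            # Over the limit: error code, no data.
            return RATE_LIMITED, None
        return 200, {"call": calls_made}

    return send


def fetch_or_none(send: Callable[[], Response]) -> Optional[dict]:
    """Make one call; return its data, or None if the rate limit was hit."""
    status, data = send()
    if status == RATE_LIMITED:
        return None
    return data


api = make_fake_api(daily_limit=1000)
results = [fetch_or_none(api) for _ in range(1001)]
print(results[999])   # call #1000 is still within the limit
print(results[1000])  # call #1001 exceeds it and carries no data
```

In a real client you would typically stop sending requests, or wait until the hourly or daily window resets, once a 429 is received.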

The Concurrent Limit is a peak performance limit. Concurrency is the number of API calls we allow one user to make in parallel while still getting the latency of a single API call. Once the user exceeds this limit, the calls beyond the limit go into a queue and are handled once the earlier calls have completed. For example, if a user has a limit of 25 concurrent calls and makes 75 calls, the system processes them as follows: the first 25 calls are returned after 200-400 ms (the system handles 25 API calls concurrently and returns them with the latency of one API call), then calls #26-50 are returned after 400-800 ms, and then calls #51-75 after 600-1200 ms. Overall, it takes the system 3 rounds to handle the full request (75 API calls made in one user's request). Note that, unlike the hourly/daily limit, the concurrency limit does not block the user or return an error code when the limit is reached.
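The arithmetic in this example can be reproduced with a short sketch. It assumes, as the figures above suggest, that each round adds one single-call latency of 200-400 ms; the function names and the `per_round_ms` parameter are hypothetical and used only for illustration.

```python
import math
from typing import Tuple


def rounds_needed(total_calls: int, concurrent_limit: int) -> int:
    """Number of rounds needed to process all calls."""
    return math.ceil(total_calls / concurrent_limit)


def latency_window_ms(
    call_number: int,
    concurrent_limit: int,
    per_round_ms: Tuple[int, int] = (200, 400),  # assumed single-call latency
) -> Tuple[int, int]:
    """Approximate (min, max) latency in ms for the Nth call (1-based)."""
    if call_number < 1:
        raise ValueError("call_number starts at 1")
    # Calls 1-25 fall in round 1, calls 26-50 in round 2, and so on.
    round_index = math.ceil(call_number / concurrent_limit)
    low, high = per_round_ms
    return round_index * low, round_index * high


print(rounds_needed(75, 25))      # 3
print(latency_window_ms(1, 25))   # (200, 400)
print(latency_window_ms(26, 25))  # (400, 800)
print(latency_window_ms(75, 25))  # (600, 1200)
```

Actual latencies depend on the endpoint and on system load, so treat these numbers as an estimate rather than a guarantee.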
