Forum Discussion

hhersch
Qrew Assistant Captain
6 years ago

API Communications - Best Practices

Dialing... busy... retrying... dialing...

Remember the days of dial-up internet, and the ever-so-common scenario where you heard a busy signal? As frustrating as this was, the great part was that your modem was smart enough to retry the connection, which usually resulted in success. It was only considered a failure after a certain number of attempts.

While the days of dial-up might be over, the days of communicating over the internet and running into communication issues aren't, and they probably never will be. Internet communications are blisteringly fast these days, but there are more failures than you might think. The internet is a complex web of connected resources, and sometimes those connections break down. These breakdowns are commonly referred to as transient faults: faults that are temporary and will probably be resolved by the time the request is retried. In fact, this likely happens to you all the time when browsing the web, but the browser is smart enough to handle it without you ever knowing.

Why is this important? We know that many of our builders enhance their Quick Base applications with integrations by leveraging our API. While we are extremely proud of our uptime, virtually no service can guarantee 100% reliability on every request made to the platform. This is especially true since web requests can break down before they even reach Quick Base, for any number of reasons; it could be as simple as a network router hitting a brief bottleneck. To alleviate the impact of these transient faults, it is best practice (for all integrations and web service calls, not only Quick Base) to implement API retries with exponential backoff. In simple terms, this means retrying the request, waiting progressively longer before each subsequent attempt. In most cases, this will lead to a successful outcome. Here is a helpful article from Microsoft that covers this type of mechanism in detail.

There are a few important considerations here:

1. We usually do not want to retry an API call where the outcome is unlikely to change with time. For example, Quick Base might return an HTTP 400 (bad request) when a required field is missing. This scenario should not be retried; it should be exited gracefully and logged. The same can be said for an authentication error.

2. In the uncommon scenario where Quick Base, or any software application, has a legitimate service issue, or if an environment is being throttled, an integration should be careful not to accidentally cause a denial-of-service (DoS) attack. This can happen inadvertently if retries fire too rapidly. That is why the concept of exponential backoff is so crucial. While there are many sophisticated approaches to timing the retries, the below is a very simple example that illustrates how to ensure appropriate time is given between each attempt.

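A rough sketch in JavaScript, using the standard fetch API (the function name, URL, and delay values here are illustrative, not a prescribed implementation):

    // Minimal retry-with-exponential-backoff sketch (illustrative only).
    async function callWithRetries(url, options, maxAttempts = 5) {
      for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        let response;
        try {
          response = await fetch(url, options);
        } catch (networkError) {
          // A network-level failure is the classic transient fault: retry it,
          // unless we have already used up all of our attempts.
          if (attempt === maxAttempts) throw networkError;
          response = null;
        }

        if (response) {
          if (response.ok) return response; // success, we're done
          if (response.status >= 400 && response.status < 500) {
            // Bad request, auth failure, etc. will not improve with time:
            // exit gracefully and let the caller log it instead of retrying.
            throw new Error('Non-retryable error: HTTP ' + response.status);
          }
          if (attempt === maxAttempts) {
            throw new Error('Gave up after ' + maxAttempts + ' attempts');
          }
          // Otherwise (e.g. a 5xx or throttling response) fall through and retry.
        }

        // Exponential backoff: wait 1s, 2s, 4s, 8s ... before the next attempt,
        // so a struggling service is not flooded with rapid-fire retries.
        const delayMs = 1000 * Math.pow(2, attempt - 1);
        await new Promise(resolve => setTimeout(resolve, delayMs));
      }
    }

Production-grade implementations typically also add a small random jitter to each delay, so that many clients recovering from the same outage do not all retry in lockstep.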

One of the great things here is that modern toolkits and platforms do a lot of this heavy lifting for you. For example, Zapier has an Autoreplay feature. This feature intelligently replays tasks that failed against any web service, based on a non-200 HTTP response. Zapier serves many different endpoints - over 1,000 stock connectors as of this writing, plus Webhooks by Zapier, which can connect to almost any web service. Because of this, Zapier doesn't differentiate between transient faults and other business-related failures, as mentioned in #1. That's okay... In fact, it might give someone the opportunity to change something inside the source system so that the connection succeeds on the next attempt. If you are building something yourself in .NET, JavaScript, etc., you can best judge the logic that makes sense for your business.


The general takeaway here is that any solution that connects to a web service over the internet should be mindful of transient faults and be tolerant of them. Putting this work in up-front helps reduce support overhead, avoids chasing down errors that do not need to be chased, and ensures the solution is built for scale.

3 Replies

  • First of all, thanks for posting on a technical topic. Having QuickBase staff share original information like this helps the community better understand features, best practices, and your design rationale.

    However, I am not entirely certain of the context and the specific reason this topic was addressed. I suspect it has something to do with recent questions or incidents related to webhooks, actions, and automations that involve calling the QuickBase API from third-party servers.

    I make the distinction of "calling the QuickBase API from third-party servers" because my experience is that "API retries" are almost never needed when using the QuickBase API from code pages saved within QuickBase.

    My argument is simple: Code pages containing JavaScript (which make API calls) are loaded on top of native QuickBase pages and it would be very unlikely that network conditions allow the successful load of the native page and then immediately deteriorate just prior to the loading of the code page.

    You can, of course, code "API retries" into your JavaScript if you want. Here is one approach:

    Answer: What's the best way to retry an AJAX request on failure using jQuery?
    https://stackoverflow.com/questions/10024469/whats-the-best-way-to-retry-an-ajax-request-on-failure-...
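
    Roughly, that approach keeps a retry counter on the request settings and re-issues the same request from the error handler. A sketch follows (the dbid and query are placeholders, and the code is paraphrased rather than copied from the answer):

    // Retry a jQuery AJAX call a few times before giving up (sketch only).
    $.ajax({
      url: '/db/yourdbid?act=API_DoQuery', // placeholder Quick Base call
      tryCount: 0,
      retryLimit: 3,
      success: function (xml) {
        // normal handling of the response goes here
      },
      error: function (xhr, textStatus) {
        this.tryCount++;
        if (this.tryCount <= this.retryLimit) {
          $.ajax(this); // re-issue the same request with the same settings
          return;
        }
        console.error('Request failed after ' + this.retryLimit + ' retries:', textStatus);
      }
    });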

    If you use the above technique, you wind up writing a lot of extra code for every API call that will rarely, if ever, be invoked. In almost 20 years of using QuickBase I have never had to use "API retries" from JavaScript served from code pages.


  • hhersch
    Qrew Assistant Captain
    Dan - thanks for engaging. It is true that this is mostly geared towards server-side operations. Code running client-side, to your point, probably has less likelihood of breaking down than a complex web of servers, network layers, etc. Whenever we post content like this, we want to make sure we draw the line between sharing best practices and being overly technical.

    In your example of JavaScript, one of the differences is loading JavaScript that co-loads with the QB UI vs having a custom JavaScript interface that sits "on top of" Quick Base. Many customers have their own user interface. In these scenarios, transient faults could occur transparently to the user since they aren't looking at Quick Base.
  • >Code running client-side ... probably has less likelihood of breaking down 

    In my experience, QuickBase servers and broadband connectivity are so reliable today that you never have problems calling the QuickBase API from client-side code, with possibly one exception.

    If you attempt to call the QuickBase API client side in a loop like so:

    for (var i=0; i < n; i++) {
      $.get(... ?act=API_AddRecord ...);
    }

    as n increases, you will eventually get an error to the effect of "waiting for an available socket" and your browser tab will stop responding. The solution in this particular case is to make one call to API_ImportFromCSV instead of n calls to API_AddRecord. In the general case of making a large number of API calls, the solution is to use an async iterator, which allows you to iterate over promises (AJAX calls) without flooding the network with a large number of simultaneous requests.
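
    A minimal sketch of that sequential style, using async/await (the dbid and field ID here are placeholders):

    // Sequential sketch: each call finishes before the next one starts, so the
    // browser never piles up pending sockets.
    async function addRecords(records) {
      for (const record of records) {
        await $.get('/db/yourdbid?act=API_AddRecord&_fid_6=' + encodeURIComponent(record.value));
      }
    }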

    I just wanted to clarify that users should not shy away from using client-side JavaScript because of your mention of "API retries" or other server or network availability issues.