This article covers handling timeouts and retries in Python in detail, with some examples to demonstrate the usage. Python has a very handy library for HTTP requests, the requests library. It is a wrapper around urllib3 that simplifies all the lower-level wiring and default configuration.
Generally, when working against a single endpoint, it is advisable to create a session and reuse it for subsequent requests. With the module-level helpers such as requests.get(), the library sets up a new session, and with it a new connection, for each request, which can be a real burden in terms of performance and scalability.
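As a minimal sketch of session reuse (the function name fetch_statuses and the timeout values are assumptions chosen for illustration, not part of the original article), several requests can share one session like this:

```python
import requests


def fetch_statuses(urls, timeout=(3.05, 10)):
    """GET each URL through one shared session and return the status codes."""
    statuses = []
    # A single Session keeps a connection pool, so requests to the same host
    # can reuse an open connection instead of opening a new one each time.
    with requests.Session() as session:
        for url in urls:
            response = session.get(url, timeout=timeout)
            statuses.append(response.status_code)
    return statuses
```

The session is closed when the with block exits, so it should wrap the whole batch of requests rather than a single call.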
Configuring a session with timeouts:
    import requests

    # timeout is a (connect, read) tuple, in seconds
    with requests.Session() as session:
        response = session.request(method=self.method, url=api_url, data=payload,
                                   headers=headers, timeout=(connect_timeout, read_timeout))
The timeout parameter accepts a tuple. The first value is the connect timeout and the second is the read timeout. There is an important point to make here: a connect timeout is (most generally) triggered when the request has not been accepted for processing at all. Usually the server is down or the URL being used is wrong. Configuring a read timeout properly needs a lot more thought. A read timeout means that the request was accepted and the transfer of data was initiated. It might be that the server did not get the entire request payload, or that it got it and started processing it but did not finish before the timeout expired.
So a read timeout simply abandons the request before a response arrives.
It does not mean that the request was not processed. It simply means that the client does not know whether it was processed or whether it failed on the server.
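The difference between the two timeouts can be sketched in code. The helper below is hypothetical (post_once and its return labels are made up for this example); it relies only on the ConnectTimeout and ReadTimeout exception classes that requests exposes:

```python
import requests


def post_once(session, url, payload, connect_timeout=3.05, read_timeout=10):
    """Send one POST and report what the client can conclude from the outcome."""
    try:
        session.post(url, data=payload, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.ConnectTimeout:
        # The connection was never established, so the server did not
        # receive the request.
        return "not_sent"
    except requests.exceptions.ReadTimeout:
        # The request went out but no response arrived in time. The server
        # may or may not have processed it.
        return "unknown"
    return "answered"
```

Only the "unknown" outcome is ambiguous, which is why the read timeout is the case that needs care before retrying.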
But as a client, our mission is to make sure that the request is processed. For that we use retries.
Configuring Retries –
To configure a retry, all that is needed is to mount an HTTPAdapter on the session. This type accepts a parameter max_retries, which can be either a number or an instance of the Retry class. If it is not specified, or is specified as zero, the HTTPAdapter creates a Retry instance with total=0 and read retries disabled. When it is a non-zero integer, the adapter creates a Retry instance with that value as the total number of retries and with read and connect retries enabled. In general, it is advisable to keep full control of the retry behaviour by passing a Retry instance explicitly.
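The three ways of passing max_retries can be compared by inspecting the Retry object each adapter ends up with. This is only a sketch: summarize is a hypothetical helper, and the values 3 and 0 are arbitrary.

```python
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

default_adapter = HTTPAdapter()                # max_retries not specified
int_adapter = HTTPAdapter(max_retries=3)       # plain integer
explicit_adapter = HTTPAdapter(                # full control via Retry
    max_retries=Retry(total=3, connect=3, read=0)
)


def summarize(adapter):
    """Return the counters of the Retry object held by the adapter."""
    retry = adapter.max_retries
    return {"total": retry.total, "connect": retry.connect, "read": retry.read}


for name, adapter in [("default", default_adapter),
                      ("integer", int_adapter),
                      ("explicit", explicit_adapter)]:
    print(name, summarize(adapter))
```

A counter of None means that scenario has no limit of its own and is bounded only by total, while read=False means read errors are not retried at all.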
Configuring the session with an explicit Retry instance would look like this:
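    import requests
    from requests.adapters import HTTPAdapter
    from urllib3.util.retry import Retry

    with requests.Session() as session:
        retry = Retry(
            total=self.total_retries,
            method_whitelist=<method-list>,  # named allowed_methods in urllib3 1.26 and later
            backoff_factor=<backoff_factor>,
            status_forcelist=<retry_status_force_list>,
            raise_on_status=False,
            read=<http_read_timeout_retries>,
            connect=<http_connect_timeout_retries>,
        )
        # mount() expects an adapter, so the Retry is wrapped in an HTTPAdapter
        session.mount('https://', HTTPAdapter(max_retries=retry))
        response = session.request(method=self.method, url=api_url, data=payload,
                                   headers=headers, timeout=(connect_timeout, read_timeout))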
The read and connect arguments of the Retry instance set the maximum number of retries allowed for each scenario.
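For a version that runs as-is, here is a sketch of a session factory with concrete values filled in. The function name build_retrying_session and every default below (3 retries, a 0.5 backoff factor, the listed status codes) are assumptions for illustration and should be tuned per service. The method filter is left at the urllib3 default so the sketch does not depend on the name of that parameter.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_retrying_session(total=3, connect=3, read=2, backoff_factor=0.5,
                           status_forcelist=(500, 502, 503, 504)):
    """Return a Session whose HTTP and HTTPS requests share one retry policy."""
    retry = Retry(
        total=total,                  # overall cap across all retry kinds
        connect=connect,              # retries for connection errors
        read=read,                    # retries for read errors
        backoff_factor=backoff_factor,
        status_forcelist=status_forcelist,
        raise_on_status=False,        # hand back the last response instead of raising
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session
```

A caller would then use it as `with build_retrying_session() as session:` and still pass timeout=(connect_timeout, read_timeout) on each request, since the retry policy does not set timeouts.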
A read timeout can happen with slow network connectivity when transferring larger amounts of data, or under any similar condition, after the request has been accepted by the server for processing. A retry should be configured only when the side effects of retrying are fully understood and are in line with the expectations of the server. For example, a retry is reasonable if the server API is idempotent or if it is capable of skipping duplicated identical payloads as part of the program logic and, more importantly, when the response received from the server is critical to the success of the operation.
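One way to make that decision explicit is a small policy helper. This is a sketch only: read_retries_for, the server_deduplicates flag and the default of 2 are hypothetical, and the method set lists the HTTP methods that are idempotent by definition.

```python
# HTTP methods that are idempotent by definition.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})


def read_retries_for(method, server_deduplicates=False, max_read_retries=2):
    """Return how many read retries are acceptable for this kind of request."""
    if method.upper() in IDEMPOTENT_METHODS or server_deduplicates:
        # Repeating the request cannot cause a second side effect.
        return max_read_retries
    # The request may already have been processed, so do not repeat it.
    return 0
```

The result would be passed as the read argument of Retry. For a method outside the idempotent set, the method also has to be listed in the allowed methods of the Retry instance before urllib3 will retry it.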
Some important additional notes on the configuration of retry –
- Make a conscious decision about the retry backoff in conjunction with the maximum number of retries. You may end up overwhelming the server or stalling your own system if you get this wrong. Remember that the first retry always happens without any wait time.
- If you need to make multiple requests, the session and the adapter do not need to be recreated each time. Creating them once will suffice for the entire lifespan of the session.
- By default, only responses with status 413, 429 or 503 are retried, and only when they carry a Retry-After header. Additional codes must be specified via status_forcelist.
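- The default of the library is to raise on status. This implies that, once the retries are exhausted, the underlying error is a MaxRetryError from urllib3, not from the requests library. In fact, all the exceptions that the retry module handles are from urllib3. When the call goes through a requests session, requests catches them and raises its own types (for example requests.exceptions.RetryError or ConnectionError) wrapping the urllib3 error.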
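To judge the backoff against the number of retries, it helps to list the waits up front. The sketch below mirrors the formula in the urllib3 documentation as I understand it (backoff_factor * 2 ** (n - 1) for the n-th consecutive retry, no wait before the first one, capped at 120 seconds by default). The function backoff_schedule is a hypothetical helper, not part of either library, and it ignores any Retry-After header sent by the server.

```python
def backoff_schedule(total_retries, backoff_factor, backoff_max=120.0):
    """Return the wait in seconds before each retry, in order."""
    delays = []
    for attempt in range(1, total_retries + 1):
        if attempt == 1:
            # The first retry is sent immediately.
            delays.append(0.0)
        else:
            delay = backoff_factor * 2 ** (attempt - 1)
            delays.append(min(backoff_max, delay))
    return delays


schedule = backoff_schedule(total_retries=4, backoff_factor=0.5)
print(schedule)       # [0.0, 1.0, 2.0, 4.0]
print(sum(schedule))  # 7.0 seconds of waiting in the worst case
```

The worst-case time for one logical request is roughly this total plus the connect and read timeouts of every attempt, which is the figure to compare against what the caller can tolerate.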