AWS throttling and Ansible


Throttling and eventual consistency errors with Ansible handling the AWS provisioning.

While using some of the AWS modules for Ansible, you may have been bitten by one of these two errors:

RequestLimitExceeded: This happens when the region you are in is being saturated by API requests.
^\w+.NotFound (Eventual Consistency Errors). The Amazon API follows an eventual consistency model. This means that the result of an API command you run that affects your Amazon resources might not be immediately visible to all subsequent commands you run.
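To make the eventual consistency case concrete, here is a minimal sketch that simulates it in memory. Nothing here talks to AWS: FakeEventuallyConsistentApi and FakeNotFound are made-up stand-ins, and the number of failed reads is an arbitrary assumption. The only thing borrowed from AWS is the shape of the error code (InvalidVpcID.NotFound).

```python
class FakeNotFound(Exception):
    """Stand-in for an AWS client error that carries an error code."""
    def __init__(self, code):
        super().__init__(code)
        self.code = code


class FakeEventuallyConsistentApi:
    """Hypothetical API: a new resource is only visible after a few reads."""
    def __init__(self, reads_until_visible=2):
        self.reads_until_visible = reads_until_visible
        self._pending = {}

    def create_vpc(self, vpc_id):
        # The write itself succeeds immediately...
        self._pending[vpc_id] = self.reads_until_visible
        return vpc_id

    def describe_vpc(self, vpc_id):
        # ...but reads fail until the change has "propagated".
        if vpc_id not in self._pending:
            raise FakeNotFound('InvalidVpcID.NotFound')
        if self._pending[vpc_id] > 0:
            self._pending[vpc_id] -= 1
            raise FakeNotFound('InvalidVpcID.NotFound')
        return {'VpcId': vpc_id}


api = FakeEventuallyConsistentApi(reads_until_visible=2)
api.create_vpc('vpc-12345')

errors = []
result = None
for attempt in range(5):
    try:
        result = api.describe_vpc('vpc-12345')
        break
    except FakeNotFound as e:
        # A task that reads right after the create would fail here.
        errors.append(e.code)
```

The create call returns right away, yet the first reads still fail, which is exactly the situation a playbook hits when one task depends on a resource the previous task just created.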

If we were to walk through a set of tasks in a role to deploy a VPC, we would see a common set of steps.

vpc
igw
subnets
etc.

Each of the modules above can make one or more calls to AWS, and depending on the number of calls being made, AWS will start to throttle the requests it receives and your playbook will fail.

The current state of the majority of the AWS modules in Ansible.

Since a good portion of the modules for AWS do not implement a backoff decorator, we are forced to do the following.

Add a pause between modules. (We really should not have to do this.)
Add the retry plugin to the module. (This is not a good idea either.)

Here are a few of the AWS modules that have implemented a basic retry functionality.

cloudformation
route53
ec2_elb_lb (This one actually uses a decorator that will retry only on RequestLimitExceeded)

This is where the AWSRetry.backoff decorator comes in and saves the day.

The AWSRetry.backoff decorator will retry on the following errors.

RequestLimitExceeded
Unavailable
ServiceUnavailable
InternalFailure
InternalError
^\w+.NotFound (Eventual Consistency Errors)

If an exception does not match anything in that list, the decorator just raises it as it normally would.
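The matching rule can be sketched in a few lines. This is an illustration of the idea rather than the code from the PR: should_retry, RETRY_ON and NOT_FOUND are hypothetical names, and only the error codes and the pattern come from the list above.

```python
import re

# Error codes taken from the list above.
RETRY_ON = (
    'RequestLimitExceeded',
    'Unavailable',
    'ServiceUnavailable',
    'InternalFailure',
    'InternalError',
)

# Eventual consistency errors, for example InvalidVpcID.NotFound.
NOT_FOUND = re.compile(r'^\w+.NotFound')


def should_retry(error_code):
    """Hypothetical helper: True if this error code is worth retrying."""
    return error_code in RETRY_ON or bool(NOT_FOUND.search(error_code))
```

Anything else, such as a permissions error, returns False and would be raised straight away instead of being retried.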

How AWSRetry.backoff works.

The AWSRetry.backoff decorator retries the failing function using an exponential backoff algorithm. Each time the decorated function or method throws an exception, the decorator waits for an increasing amount of time and calls the function again, until the maximum number of tries is reached. If the decorated function fails on the last try, the exception is raised unhandled.
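A minimal sketch of that loop, under a few assumptions: backoff_sketch is a made-up name, the backoff multiplier and the injectable sleep parameter are my own additions for illustration, and this version retries on every exception, whereas AWSRetry.backoff only retries on the error codes it recognizes.

```python
import functools
import time


def backoff_sketch(tries=10, delay=2, backoff=2, sleep=time.sleep):
    """Retry the wrapped function, multiplying the wait after each failure."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            # Every attempt except the last may fail and be retried.
            for _ in range(tries - 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    sleep(wait)
                    wait *= backoff
            # Last try: any exception propagates to the caller unhandled.
            return func(*args, **kwargs)
        return wrapper
    return decorator


waits = []          # record the delays instead of really sleeping
calls = {'n': 0}


@backoff_sketch(tries=4, delay=1, backoff=2, sleep=waits.append)
def flaky():
    """Fails twice, then succeeds."""
    calls['n'] += 1
    if calls['n'] < 3:
        raise RuntimeError('RequestLimitExceeded')
    return 'ok'


result = flaky()
```

Passing waits.append as the sleep function makes the delays visible without actually waiting; with the default time.sleep the same decorator would pause for real between attempts.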

AWSRetry is derived from the CloudRetry class. This class is meant to be used as a base class for other cloud providers that want to build similar functionality into Ansible cloud modules.
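The base class idea could look roughly like the sketch below: the retry loop lives in the base class, and each provider only supplies two hooks. Every name here (CloudRetrySketch, error_code, found, ExampleCloudRetry, ExampleCloudError, the 'Throttled' and 'Busy' codes) is hypothetical and chosen for illustration; the real CloudRetry class may be structured differently.

```python
import functools
import time


class CloudRetrySketch:
    """Hypothetical base class: retry loop here, provider hooks in subclasses."""

    @staticmethod
    def error_code(exc):
        """Extract a provider-specific error code from an exception."""
        raise NotImplementedError

    @staticmethod
    def found(code):
        """Return True if the error code should be retried."""
        raise NotImplementedError

    @classmethod
    def backoff(cls, tries=10, delay=2, backoff=2, sleep=time.sleep):
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                wait = delay
                for _ in range(tries - 1):
                    try:
                        return func(*args, **kwargs)
                    except Exception as exc:
                        if not cls.found(cls.error_code(exc)):
                            raise  # not retryable: raise as normal
                        sleep(wait)
                        wait *= backoff
                # Last try: failures propagate unhandled.
                return func(*args, **kwargs)
            return wrapper
        return decorator


class ExampleCloudError(Exception):
    """Made-up exception type for a made-up provider."""
    def __init__(self, code):
        super().__init__(code)
        self.code = code


class ExampleCloudRetry(CloudRetrySketch):
    """A provider only has to say how to read and classify its errors."""

    @staticmethod
    def error_code(exc):
        return getattr(exc, 'code', None)

    @staticmethod
    def found(code):
        return code in ('Throttled', 'Busy')


waits = []
attempts = {'n': 0}


@ExampleCloudRetry.backoff(tries=3, delay=1, sleep=waits.append)
def create_thing():
    attempts['n'] += 1
    if attempts['n'] == 1:
        raise ExampleCloudError('Throttled')
    return 'created'


outcome = create_thing()
```

The appeal of this layout is that the waiting and counting logic is written once, while each cloud keeps its own notion of which errors are transient.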

CloudRetry Class (Code)

AWSRetry Class (Code)

Example of how to use the AWSRetry.backoff decorator

import boto3
import botocore.exceptions
from ansible.module_utils.ec2 import AWSRetry

# Default tries is 10 and default delay is 2
@AWSRetry.backoff(tries=2, delay=1.2)
def aws_client(region, service='ec2', profile=None):
    try:
        session = boto3.Session(region_name=region, profile_name=profile)
        return session.client(service)
    except botocore.exceptions.ClientError:
        # Re-raise so the decorator can decide whether to retry
        raise

Where to get AWSRetry.

You can get the AWSRetry decorator on GitHub in the Ansible Core repo (still in a PR) or in my personal repo.

For Ansible Modules

Currently the AWSRetry.backoff decorator is in a PR.

Ansible Modules that are in a PR state and that depend on AWSRetry.

If you do not want to wait, you can just copy module_utils/cloud.py and module_utils/ec2.py into your Ansible installation directory.

For Ansible Filters

If you want to use this decorator with your AWS Filters

Wrap up

If the community is going to seriously consider using Ansible for all of its cloud provisioning needs, then a backoff decorator needs to be implemented.


Allen Sanabria