A few years ago, I wrote a little Python function that retrieves all the available regions for an AWS service. I use it a lot in the Lambda functions I write for cleaning up stray AWS resources.

import boto3

def get_all_regions(service):
    return boto3.Session().get_available_regions(service)

The function served me well until AWS started introducing regions that are not enabled by default. It still works as intended, returning every region where a particular service is available, but any call made against a region that is not enabled in the account fails. The first time this happened, I just manually removed the region from the list:

def get_all_regions(service):
    all_regions = boto3.Session().get_available_regions(service)
    all_regions.remove("ap-east-1")
    return all_regions

As the number of new regions grew, I decided I needed to find a better way. I did not want to add a new “all_regions.remove()” line every time a new region launched, and I did not want to change the code again if we ended up enabling a region.

What I came up with is a quick check against each region: call get_caller_identity on that region's sts endpoint. The first step is to get all the regions available for the service, as above. I then loop through the regions, using an sts client in each one to try to get the account number. If the call succeeds, I add the region to a list called available_regions. If it fails with an InvalidClientTokenId error because the region is not enabled, I skip the region. Any other error is re-raised.

import boto3
from botocore.exceptions import ClientError

def get_available_regions(service):
    available_regions = []
    all_regions = boto3.Session().get_available_regions(service)
    for region in all_regions:
        # Use the regional sts endpoint so the call exercises this region.
        sts = boto3.Session(region_name=region).client('sts')
        try:
            sts.get_caller_identity()
            available_regions.append(region)
        except ClientError as e:
            # A region that is not enabled rejects the credentials; skip it.
            if e.response['Error']['Code'] != "InvalidClientTokenId":
                raise

    return available_regions

Now I don’t have to worry about whether a region is enabled or not. The only downside to this approach is that it takes between 10 and 15 seconds to loop through and test all the endpoints. I’m OK with that for now, but eventually I would like to see if the requests can be threaded so that they run concurrently.
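
One way the threaded version could look is sketched below, using ThreadPoolExecutor from the Python standard library. This is only a sketch: the names region_is_enabled and filter_enabled_regions, the check parameter, and the max_workers default of 16 are assumptions made for illustration, and the speedup has not been measured. Each worker builds its own boto3 Session rather than sharing one across threads.

```python
from concurrent.futures import ThreadPoolExecutor


def region_is_enabled(region):
    """Return True if the sts endpoint in `region` accepts our credentials."""
    # Imported here so the threading helper below can be tried without boto3.
    import boto3
    from botocore.exceptions import ClientError

    # One Session per call, so no Session is shared between threads.
    sts = boto3.Session(region_name=region).client("sts")
    try:
        sts.get_caller_identity()
        return True
    except ClientError as e:
        # Same test as the loop version: this code means "region not enabled".
        if e.response["Error"]["Code"] == "InvalidClientTokenId":
            return False
        raise


def filter_enabled_regions(regions, check=region_is_enabled, max_workers=16):
    """Run `check` on every region concurrently and keep the ones that pass."""
    regions = list(regions)
    if not regions:
        return []
    workers = min(max_workers, len(regions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order and re-raises worker errors.
        results = list(pool.map(check, regions))
    return [region for region, ok in zip(regions, results) if ok]
```

Passing boto3.Session().get_available_regions(service) as the regions argument should give the same list as the loop version, in the same order, because pool.map preserves input order. Any ClientError other than InvalidClientTokenId still propagates to the caller.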