Feb 02, 2024 (3 min read)

Service quotas and throttling on AWS

Overview of service quotas and throttling on AWS

Full-Stack Cloud Engineer

TABLE OF CONTENTS

What are service quotas
What is throttling
What is the importance of service quotas and throttling
What AWS services help manage quotas and throttling
References

Service quotas and throttling are often overlooked aspects of cloud management, yet they play crucial roles in both security and cost control within AWS environments. It's not uncommon for users to interact with AWS services for an extended period without encountering the concepts of quotas and throttling limits.

Typically, these issues come to light as applications scale and demand more from the deployed services, highlighting the importance of understanding and managing these limits to prevent unexpected costs and ensure system stability.

What are service quotas

Service quotas (previously called limits) are the maximum number of resources you're allowed to use within your AWS account. These limits are in place to prevent overconsumption of resources, aiding in cost management and ensuring that AWS can serve a wide customer base without resource exhaustion.

Quotas apply to a variety of resources, such as the number of EC2 instances you can run simultaneously or the number of build projects you can have in CodeBuild. Essentially, quotas cap the volume of resources you can deploy, which directly limits how large and how complex your cloud infrastructure can be.

Most quotas apply per account within a specific Region, but some apply to the account as a whole, across all Regions.

Many quotas are adjustable, but some are fixed. For example, you can raise the Lambda concurrent executions quota from its default of 1,000 (per Region, not per second) to tens of thousands. But you can't raise the maximum memory of a Lambda function, which is fixed at 10,240 MB (10 GB) per function.

To raise an adjustable quota, you submit a request through the Service Quotas console in the AWS Management Console or with the AWS CLI. Some requests are approved automatically, while others are reviewed by AWS Support.
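The same request can also be made programmatically. The sketch below is a minimal illustration, not a production tool: it assumes a boto3 Service Quotas client is passed in, and the helper name `request_increase_if_needed` and its return strings are made up for this example. Error handling is omitted, for instance for a quota that has no applied value yet.

```python
def request_increase_if_needed(client, service_code, quota_code, desired_value):
    """Request a quota increase only when the applied value is too low.

    `client` is expected to behave like boto3.client("service-quotas").
    The helper name and the returned strings are illustrative.
    """
    quota = client.get_service_quota(
        ServiceCode=service_code, QuotaCode=quota_code
    )["Quota"]

    # Nothing to do if the applied quota already covers the target.
    if quota["Value"] >= desired_value:
        return "already-sufficient"

    # Fixed quotas (for example Lambda memory size) cannot be raised.
    if not quota.get("Adjustable", False):
        return "not-adjustable"

    response = client.request_service_quota_increase(
        ServiceCode=service_code,
        QuotaCode=quota_code,
        DesiredValue=float(desired_value),
    )
    # Status is a string such as "PENDING" or "APPROVED".
    return response["RequestedQuota"]["Status"]
```

With real credentials, a call might look like `request_increase_if_needed(boto3.client("service-quotas"), "lambda", "L-B99A9384", 5000)`. The quota code shown is believed to be the one for Lambda concurrent executions, but treat it as an assumption and look up the actual code in the Service Quotas console before using it.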

What is throttling

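Throttling, on the other hand, deals with the rate at which your applications can make requests to or process data through AWS services. This includes limits on API call rates for services like API Gateway, DynamoDB, RDS, and others. When a caller exceeds the allowed rate, the service rejects the excess requests with a throttling error, for example an HTTP 429 Too Many Requests response from API Gateway.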

The primary purpose of throttling is to maintain the stability and reliability of AWS services by preventing any single user or application from consuming disproportionate amounts of bandwidth or processing power. Throttling ensures that services can handle high loads and distribute resources fairly among users.
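To make the idea of a request rate limit concrete, here is a minimal sketch of a token bucket, the model that the API Gateway documentation uses to describe its throttling. It is a toy illustration of the concept, not how any AWS service is implemented. The class name `TokenBucket` and its parameters are chosen for this example only.

```python
import time


class TokenBucket:
    """Toy rate limiter: `rate` tokens are added per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # steady-state requests per second
        self.capacity = capacity  # burst size
        self.clock = clock        # injectable so the bucket can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True if the request may proceed, False if it is throttled."""
        now = self.clock()
        # Refill according to the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A bucket created with `TokenBucket(rate=2, capacity=3)` lets a burst of three requests through immediately, rejects further requests until tokens refill, and then sustains about two requests per second.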

What is the importance of service quotas and throttling

Without service quotas and throttling, users would be exposed to potentially unlimited costs due to errors or inefficient designs in their cloud architecture, such as a never-ending loop in a Lambda function or excessive use of resources by an application component. These mechanisms also protect the integrity and availability of AWS services by preventing resource monopolization.

Understanding and managing service quotas and throttling is critical for designing scalable, cost-effective, and reliable applications on AWS. Monitoring these limits and planning for growth can help avoid service disruptions and unexpected expenses. AWS provides tools, such as Service Quotas and Amazon CloudWatch, to monitor usage and manage these limits effectively.

Moreover, implementing best practices such as exponential backoff in request retries can help mitigate the impacts of throttling, ensuring your applications remain responsive under varying loads.
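As a rough sketch of what exponential backoff looks like in practice, the helper below retries a throttled call with exponentially growing, randomized ("full jitter") delays. `ThrottledError` and `call_with_backoff` are hypothetical names for this example. In real code you would catch the throttling exception of the SDK you use, and the AWS SDKs already ship their own configurable retry logic.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a service's throttling error."""


def call_with_backoff(operation, max_attempts=5, base=0.1, cap=5.0,
                      sleep=time.sleep, rng=random.random):
    """Call `operation`, retrying on ThrottledError with backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Upper bound doubles each attempt (base, 2*base, 4*base, ...)
            # up to `cap`; the actual wait is a random fraction of it.
            sleep(rng() * min(cap, base * 2 ** attempt))
```

The jitter matters when many clients are throttled at once: without it they would all retry at the same moments and hit the limit again together.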

What AWS services help manage quotas and throttling

AWS offers a suite of services and tools to assist in managing service quotas and mitigating throttling, ensuring optimal application performance and capacity management.

Key services include:

AWS Service Quotas

Central hub for viewing and managing quotas across AWS services, offering capabilities to view current usage, request quota increases, and monitor quota utilization.

Amazon CloudWatch

Provides monitoring and observability for resource usage, enabling the creation of alarms to alert users as they approach or exceed quotas. It also helps in monitoring API call rates to prevent throttling.

AWS Trusted Advisor

Offers recommendations and checks for service limit utilization, alerting users to services nearing their quotas and suggesting optimizations or quota increases.

AWS Support

Plays a vital role in processing quota increase requests that require manual intervention and offering strategies to handle throttling issues effectively.
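As an illustration of the CloudWatch item above, the sketch below builds the arguments for an alarm that fires when usage reaches a given percentage of a quota. It relies on the `AWS/Usage` namespace and the `SERVICE_QUOTA()` metric math function. The function name `quota_utilization_alarm`, the default dimension values, and the `Maximum` statistic are assumptions for this example. Not every quota publishes a usage metric, so check the dimensions for your service in the CloudWatch console first.

```python
def quota_utilization_alarm(alarm_name, service, resource, threshold_percent=80.0,
                            usage_type="Resource", usage_class="None",
                            metric_name="ResourceCount", period=300):
    """Build keyword arguments for CloudWatch put_metric_alarm.

    The dimension values are assumptions; verify them for your service.
    """
    return {
        "AlarmName": alarm_name,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "Threshold": threshold_percent,
        "EvaluationPeriods": 1,
        "Metrics": [
            {
                # Raw usage metric, used only as input to the expression.
                "Id": "usage",
                "ReturnData": False,
                "MetricStat": {
                    "Metric": {
                        "Namespace": "AWS/Usage",
                        "MetricName": metric_name,
                        "Dimensions": [
                            {"Name": "Service", "Value": service},
                            {"Name": "Resource", "Value": resource},
                            {"Name": "Type", "Value": usage_type},
                            {"Name": "Class", "Value": usage_class},
                        ],
                    },
                    "Period": period,
                    "Stat": "Maximum",
                },
            },
            {
                # Usage as a percentage of the current quota value.
                "Id": "utilization",
                "Expression": "(usage / SERVICE_QUOTA(usage)) * 100",
                "Label": "Quota utilization (%)",
                "ReturnData": True,
            },
        ],
    }
```

With boto3 installed and credentials configured, the result could be passed on as `boto3.client("cloudwatch").put_metric_alarm(**quota_utilization_alarm("lambda-concurrency-80", "Lambda", "ConcurrentExecutions"))`.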

References

AWS service quotas

Requesting a quota increase - Service Quotas

How do I manage my AWS service quotas?

Understanding quotas - AWS Lambda

Lambda quotas - AWS Documentation - Amazon

REL01-BP01 Aware of service quotas and constraints - Reliability Pillar