Jan 26, 2024 · 3 min read

Horizontal scaling and vertical scaling in AWS

An overview of horizontal and vertical scaling strategies in AWS and the associated services

Full-Stack Cloud Engineer

TABLE OF CONTENTS

Horizontal Scaling (scale out/in)
Vertical Scaling (scale up/down)
What services are used for horizontal scaling in AWS?
References

Horizontal and vertical scaling are distinct methods for resource adjustment in Amazon Web Services, each addressing specific needs for managing workload variations.

In practice, AWS users often combine the two strategies, scaling out to absorb variable traffic and scaling up when individual instances need more capacity.

Horizontal Scaling (scale out/in)

Horizontal scaling handles load by changing the number of instances. With Amazon EC2 Auto Scaling, for example, the number of EC2 instances in an Auto Scaling group grows or shrinks as demand changes.

It's particularly effective for stateless applications, where each instance operates independently. Services like EC2 Auto Scaling and Lambda support this type of scaling, automatically adapting to workload changes.
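To make "adapting to workload changes" concrete, here is a minimal sketch of the arithmetic behind a proportional scaling decision. It is a simplified model, not AWS's actual algorithm: the function name `desired_capacity` and its parameters are hypothetical, and real services also apply cooldowns, warm-up periods, and more conservative scale-in behavior.

```python
import math


def desired_capacity(current_instances: int, metric_value: float,
                     target_value: float, min_size: int, max_size: int) -> int:
    """Simplified proportional model (an assumption, not the AWS algorithm).

    If the average metric is above the target, capacity grows in proportion;
    if it is below, capacity shrinks. The result is clamped to the group's
    minimum and maximum size.
    """
    if current_instances <= 0 or target_value <= 0:
        raise ValueError("current_instances and target_value must be positive")
    raw = current_instances * metric_value / target_value
    return max(min_size, min(max_size, math.ceil(raw)))


# 4 instances averaging 90% CPU against a 60% target: scale out.
print(desired_capacity(4, 90.0, 60.0, min_size=2, max_size=10))
# 6 instances averaging 20% CPU against a 60% target: scale in.
print(desired_capacity(6, 20.0, 60.0, min_size=2, max_size=10))
```

The clamp matters as much as the ratio: the minimum size protects availability during quiet periods, and the maximum size caps cost during spikes.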

Vertical Scaling (scale up/down)

In contrast, vertical scaling adjusts the computing capacity of an existing instance, such as its CPU and memory (storage is resized separately, through its EBS volumes). On EC2 this means changing the instance type, for example upgrading from a t2.micro to a t2.large. It suits stateful applications that are hard to spread across several instances, but an EBS-backed instance must be stopped before its type can be changed, so expect a short period of downtime.
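As a rough sketch of what changing an instance type involves, the function below stops an instance, changes its type, and starts it again. It assumes an EBS-backed instance and a boto3 EC2 client passed in by the caller; the function name `resize_instance` and the instance ID in the usage comment are placeholders, not values from this article.

```python
def resize_instance(ec2, instance_id: str, new_type: str) -> None:
    """Stop an EBS-backed instance, change its type, and start it again.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    The instance is unavailable between the stop and the start.
    """
    # The instance type can only be changed while the instance is stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Apply the new instance type.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    # Bring the instance back and wait until it is running.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


# Usage (requires boto3 and AWS credentials; the ID is a placeholder):
#   import boto3
#   resize_instance(boto3.client("ec2"), "i-0123456789abcdef0", "t2.large")
```

Passing the client in, rather than creating it inside the function, keeps the sketch easy to try against a stub before pointing it at a real account.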

The key difference is what changes: horizontal scaling adds or removes instances, whereas vertical scaling increases or decreases the capacity of an existing instance. Elastic Load Balancing supports horizontal scaling by distributing traffic across the instances, while vertical scaling on EC2 comes down to changing the instance type.

What services are used for horizontal scaling in AWS?

AWS offers a suite of services designed to facilitate auto scaling, allowing for dynamic adjustment of resources to match application demand. These services include:

AWS Auto Scaling

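AWS Auto Scaling manages scaling for multiple resources across several AWS services from a single scaling plan. It supports EC2 Auto Scaling groups, EC2 Spot Fleet requests, ECS services, DynamoDB tables and global secondary indexes, and Aurora replicas. Scaling plans use dynamic scaling (target tracking on a chosen metric) and, for EC2 Auto Scaling groups, predictive scaling. Other resources, such as Lambda provisioned concurrency, are scaled through Application Auto Scaling instead.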

Amazon EC2 Auto Scaling

This service focuses on EC2 instances, keeping enough of them running to maintain availability. It launches or terminates instances based on conditions you define, such as CPU usage or network traffic, and replaces unhealthy instances as needed.
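A condition such as "keep average CPU near 50%" can be expressed as a target tracking policy. The sketch below only builds the parameters for the boto3 `put_scaling_policy` call on the `autoscaling` client. The helper name, group name, and policy naming scheme are hypothetical, and the 50% target is an arbitrary example value.

```python
def cpu_target_tracking_policy(group_name: str, target_cpu_percent: float) -> dict:
    """Build keyword arguments for the EC2 Auto Scaling put_scaling_policy call.

    The policy asks the Auto Scaling group to keep its average CPU
    utilization close to `target_cpu_percent`.
    """
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_cpu_percent),
        },
    }


params = cpu_target_tracking_policy("web-asg", 50)
print(params["PolicyName"])

# Applying it (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("autoscaling").put_scaling_policy(**params)
```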

Amazon DynamoDB Auto Scaling

This feature automatically adjusts throughput capacity for DynamoDB tables and global secondary indexes, aligning it with the application's request traffic to ensure efficiency and performance.
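DynamoDB auto scaling is configured through Application Auto Scaling in two steps: register the table's capacity as a scalable target, then attach a target tracking policy. The sketch below builds the parameters for both boto3 calls. The helper name, table name, capacity bounds, and 70% target utilization are illustrative assumptions, and it applies to tables in provisioned capacity mode.

```python
def dynamodb_read_scaling(table_name: str, min_rcu: int, max_rcu: int,
                          target_utilization: float = 70.0):
    """Build arguments for register_scalable_target and put_scaling_policy.

    Covers read capacity only; write capacity would use the
    dynamodb:table:WriteCapacityUnits dimension and the
    DynamoDBWriteCapacityUtilization metric.
    """
    common = {
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table_name}",
        "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
    }
    target = {**common, "MinCapacity": min_rcu, "MaxCapacity": max_rcu}
    policy = {
        **common,
        "PolicyName": f"{table_name}-read-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_utilization),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBReadCapacityUtilization",
            },
        },
    }
    return target, policy


target, policy = dynamodb_read_scaling("orders", min_rcu=5, max_rcu=100)
print(target["ResourceId"], policy["PolicyName"])

# Applying it (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("application-autoscaling")
#   client.register_scalable_target(**target)
#   client.put_scaling_policy(**policy)
```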

AWS Lambda Auto Scaling

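Lambda scales automatically by running more concurrent executions of a function as incoming event traffic grows. For latency-sensitive functions, provisioned concurrency keeps execution environments initialized in advance, and Application Auto Scaling can adjust it on a schedule or based on utilization.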

Amazon Aurora Auto Scaling

For Aurora database clusters, this feature adjusts the number of Aurora Replicas (read replicas) based on their average CPU utilization or average number of connections, so read capacity follows the workload.

Other services take part in scaling as well. Elastic Load Balancing (ELB) works with Amazon EC2 Auto Scaling to spread traffic across the instances in a group. Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS) scale containerized applications. AWS Fargate runs containers without servers to manage, so scaling means changing the number of tasks or pods rather than resizing a fleet of instances.

References

Application Scaling - AWS Auto Scaling - AWS

Instance Auto Scaling - Amazon EC2 Auto Scaling - AWS

Dynamic scaling for Amazon EC2 Auto Scaling - Amazon EC2 Auto Scaling

What is Application Auto Scaling? - Application Auto Scaling

What is a scaling plan? - AWS Auto Scaling