
Distributed computing concepts supported by AWS global infrastructure and edge services

Explore the essentials of distributed computing, including its definition, architectural patterns, and use cases. Delve into how AWS's global infrastructure and edge services bolster distributed computing systems.

This article focuses on high-level concepts: the meaning of distributed computing, architectural patterns such as client-server, the benefits and use cases of distributed computing, and the AWS services and infrastructure components involved.

 

On BlowStack, there is also a related article on Distributed Design Patterns on AWS that covers lower-level architectural patterns, such as microservice architecture or load balancing. 

 

Along with distributed computing concepts, I will refer to various AWS computing and edge services, as well as the AWS global infrastructure related to the topic. For background information, please consult previous articles. For example, I have detailed AWS compute services and their use cases, highlighting the key features of both primary and supportive services. Additionally, an overview of AWS Global Infrastructure and the applications of edge accelerators in AWS were also discussed.

 

 

What is distributed computing

 

Distributed computing is a flexible term whose meaning has evolved as technology has advanced. Nowadays, it is often related to a pattern where multiple computing resources collaborate to solve a common problem.

 

Don't be misled into thinking that the presence of many computing units always indicates a complex problem. Distributed computing can also be applied to e-commerce stores. It is more about how a problem is solved rather than what is solved.

 

The key concepts of distributed computing:

 

Scalability and Cost Efficiency

 

In theory, distributed computing allows virtually unlimited horizontal scaling: simply add new computing nodes as needed.

 

AWS offers a wide array of compute services that can automatically scale based on demand. Examples include EC2 with Auto Scaling groups, Lambda for serverless computing, ECS/EKS for container orchestration, and API Gateway for managing APIs.
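As a minimal sketch of this kind of on-demand scaling, the boto3 (Python) snippet below creates an Auto Scaling group with a target-tracking policy. The group name, launch template, and subnet IDs are hypothetical placeholders; a real setup would need existing networking and IAM resources.

import boto3

autoscaling = boto3.client("autoscaling")

# Keep capacity between 2 and 10 instances; AWS adds or removes nodes as demand changes.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="distributed-workers",                # hypothetical name
    LaunchTemplate={"LaunchTemplateName": "worker-template"},  # assumed to already exist
    MinSize=2,
    MaxSize=10,
    DesiredCapacity=2,
    VPCZoneIdentifier="subnet-aaa,subnet-bbb",                 # placeholder subnet IDs
)

# Target-tracking policy: scale out/in to hold average CPU utilization near 50%.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="distributed-workers",
    PolicyName="cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 50.0,
    },
)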

 

Additionally, AWS enables payment only for the resources used, thereby providing cost efficiency.

 

 

Fault Tolerance and High Availability

 

The objective is to ensure continued operation even if specific computing units fail, keeping the computing solution highly available. Because a distributed system comprises multiple processing units across different layers, the failure of any single unit should not halt or significantly disrupt the overall system's functioning.

 

On AWS, there are numerous options for achieving high availability. For example, Elastic Load Balancing (ELB) can distribute traffic across different EC2 instances, preventing any single processing unit from becoming overloaded.

 

Auto Scaling plays a crucial role in both high availability (HA) and fault tolerance (FT), as it not only allows for scaling out/in based on demand but also performs health checks and replaces instances with healthy ones when necessary. Route 53 can globally monitor traffic in distributed systems and route traffic accordingly based on a variety of conditions, including health checks.
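To sketch the Route 53 side, the boto3 snippet below creates a health check and a PRIMARY failover record. The domain, hosted zone ID, and IP address are placeholders, and a matching SECONDARY record would be defined the same way.

import boto3

route53 = boto3.client("route53")

# Probe the primary endpoint over HTTPS every 30 seconds.
health = route53.create_health_check(
    CallerReference="primary-endpoint-check-001",   # must be unique per request
    HealthCheckConfig={
        "Type": "HTTPS",
        "FullyQualifiedDomainName": "primary.example.com",
        "Port": 443,
        "ResourcePath": "/health",
        "RequestInterval": 30,
        "FailureThreshold": 3,
    },
)

# PRIMARY failover record: Route 53 serves this answer while the health
# check passes and fails over to the SECONDARY record when it does not.
route53.change_resource_record_sets(
    HostedZoneId="Z123EXAMPLE",                      # placeholder hosted zone
    ChangeBatch={"Changes": [{
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "app.example.com",
            "Type": "A",
            "SetIdentifier": "primary",
            "Failover": "PRIMARY",
            "TTL": 60,
            "ResourceRecords": [{"Value": "203.0.113.10"}],
            "HealthCheckId": health["HealthCheck"]["Id"],
        },
    }]},
)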
 

 

Data Consistency

 

In distributed systems, data is often replicated, and the copies can temporarily diverge; the goal is to detect discrepancies automatically and reconcile the data accordingly.

 

On AWS, numerous services help ensure data consistency. Amazon S3 offers strong read-after-write consistency, ensuring that data is immediately available following write operations. DynamoDB supports both strongly and eventually consistent reads, catering to different use cases based on the need for immediate data accuracy versus performance. RDS provides data consistency through ACID (Atomicity, Consistency, Isolation, Durability) guarantees, including failover support to maintain consistency. Finally, ElastiCache can replicate data across Availability Zones (AZs) to keep cached copies in sync.
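To make the DynamoDB distinction concrete, here is a small boto3 sketch; the Orders table and its key are hypothetical.

import boto3

dynamodb = boto3.client("dynamodb")

# Eventually consistent read (the default): cheaper and faster, but it
# may briefly return stale data after a recent write.
dynamodb.get_item(
    TableName="Orders",                    # hypothetical table
    Key={"OrderId": {"S": "order-123"}},
)

# Strongly consistent read: reflects every write acknowledged before the
# read, at roughly twice the read-capacity cost.
dynamodb.get_item(
    TableName="Orders",
    Key={"OrderId": {"S": "order-123"}},
    ConsistentRead=True,
)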

 

 

Transparency and Heterogeneity

 

Distributed computing systems offer a seamless user experience, allowing interaction with the system as if it were a single computer, irrespective of the underlying diversity in hardware, middleware, software, and operating systems. This architecture ensures smooth functionality by abstracting the complexity of individual machine configurations.

 

 

Concurrency

 

Distributed systems enhance performance and resource efficiency, absorbing varying workloads without failing during volume spikes and without leaving hardware underutilized. By allowing multiple processes to run concurrently across different machines, they significantly improve overall system efficiency and throughput.

 

Technologies such as containers and AWS Lambda enable the straightforward deployment and automatic scaling of concurrent processes across the infrastructure, optimizing resource utilization. Meanwhile, asynchronous, non-blocking message passing among loosely coupled components is facilitated by queues like Amazon SQS.
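A minimal producer/consumer sketch with boto3 shows this decoupling; the queue name and message body are made up for illustration.

import boto3

sqs = boto3.client("sqs")
queue_url = sqs.create_queue(QueueName="task-queue")["QueueUrl"]  # hypothetical queue

# Producer: enqueue work without knowing anything about the consumers.
sqs.send_message(QueueUrl=queue_url, MessageBody='{"task": "resize-image", "id": 42}')

# Consumer: poll independently; long polling (WaitTimeSeconds) reduces
# empty responses. Delete each message after processing so it is not
# redelivered once the visibility timeout expires.
response = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=10)
for msg in response.get("Messages", []):
    print("processing:", msg["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])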

 

 

What are Key AWS Services and Infrastructure Supporting Distributed Computing

 

AWS Service and its role in distributed computing:

Amazon EC2: Offers virtual server instances for scalable computing, allowing applications to run smoothly in the cloud.
AWS Lambda: Supports serverless computing, enabling applications to respond to events and triggers efficiently without managing servers.
Amazon S3: Serves as a scalable, secure, and unified object storage platform, offering centralized data storage, cross-region replication, and seamless integration with other AWS services for robust data architectures and serverless application development.
Amazon SQS: Allows communication between different components of an application in a decoupled, asynchronous, and scalable manner.
Amazon VPC: Enables secure, private communication by offering customizable virtual networking that supports launching AWS resources across Regions and AZs, providing scalable and isolated environments for applications.
Elastic Load Balancing: Improves the availability and scalability of applications by distributing traffic across multiple targets.
Auto Scaling: Automatically scales AWS resources based on demand.
Amazon CloudFront: Provides a global network of edge locations for caching content closer to users, reducing latency and integrating with AWS services for a unified content delivery solution, with security features and scalability for both static and dynamic content.
AWS Global Accelerator: Improves performance and availability for applications running in multiple AWS Regions.
AWS Wavelength: Enables applications that require ultra-low-latency connectivity to mobile devices and end users.
AWS Local Zones: Brings select AWS compute and storage services closer to end users.
AWS Outposts: Enables applications that require low-latency access to on-premises resources or data processing.

 

 

Additional tools, such as Amazon DynamoDB for NoSQL database services, ElastiCache for caching, RDS for relational databases, and ECS/EKS for container orchestration, further extend AWS's capabilities for distributed computing. These services work in tandem to offer a robust environment for deploying, managing, and scaling applications, with advanced networking, storage, and computing solutions that meet the demands of modern distributed systems.

 

 

How to implement Distributed Computing

 

There are four main architectural patterns that can be applied to enable the functioning of distributed computing:

 

Client-Server

 

The client-server architecture is the most common framework for distributed systems, primarily due to its straightforward separation into computing units, with responsibility resting solely on the server side of the model. These units are categorized into two main roles: clients, which request resources, and servers, which process data from clients and provide resources and responses. A common example is how web applications work. Typically, a client—usually an end user—requests information from a server, which then provides a response, often in the form of a web page populated with data.

 

Clients can include desktop computers, smartphones, and even other servers that request resources. 

 

Servers are typically more powerful computers that may also communicate with other servers. A single server can usually respond to requests from many clients.

 

Note that microservices or event-driven architectures can be implemented on the server side within the Client-Server model.
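The request/response cycle can be sketched with nothing but the Python standard library; in a real AWS deployment the server role would typically be played by EC2 instances or Lambda functions behind a load balancer.

import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The server processes the request and returns a resource.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"hello from the server")

server = HTTPServer(("127.0.0.1", 8080), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client requests a resource and receives the server's response.
with urllib.request.urlopen("http://127.0.0.1:8080/") as resp:
    print(resp.read().decode())

server.shutdown()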

 

 

Three-tier architecture

 

Typically, a three-tier architecture consists of the presentation, application, and data tiers. This architecture was further detailed in the article Multi-tier Architectures on AWS.

 

In fact, the three-tier architecture is a specific type of client-server architecture with a clear division of responsibility: server machines handle the application and data tiers, while client machines interact with the servers through the presentation tier, which is usually the user interface. A common example is, again, any web or mobile application.

 

 

N-tier architecture


In more complex distributed systems, the three-tier architecture is often extended with additional layers to meet the requirements of the application and the problems it addresses. Usually, the following layers can be added: a service layer, a cache layer, or an integration layer.

 

 

Peer-to-peer architecture

 

Peer-to-peer architecture differs fundamentally from the three patterns already mentioned, all of which are variations of the client-server model. The peer-to-peer model is based on shared responsibility without distinct separation: every computing unit is equally important, carries the same responsibility, and assumes the same role.

 

 

How does distributed computing work

 

Distributed computing operates through numerous interconnected computing units that exchange messages to ultimately deliver a final response to the client.

The communication method between specific components is crucial and determines how distributed computing functions. There are two main types of component coupling in distributed computing.

 

 

Loose coupling

 

It is the predominant type of coupling in microservices architecture and event-driven architecture. The concept aims to make components independent of each other, allowing them to be developed separately and operate independently. Typically, message queues are the fundamental means of communication for loosely coupled components.
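One common AWS realization of loose coupling is fan-out: an SNS topic delivers each event to several SQS queues, and every consumer owns its queue. The sketch below uses hypothetical names and omits the queue access policies that SNS would need in a real account.

import boto3

sns = boto3.client("sns")
sqs = boto3.client("sqs")

topic_arn = sns.create_topic(Name="order-events")["TopicArn"]          # hypothetical topic
billing_q = sqs.create_queue(QueueName="billing-service")["QueueUrl"]
shipping_q = sqs.create_queue(QueueName="shipping-service")["QueueUrl"]

# Subscribe both queues; each service can now evolve and scale on its own.
for queue_url in (billing_q, shipping_q):
    arn = sqs.get_queue_attributes(QueueUrl=queue_url,
                                   AttributeNames=["QueueArn"])["Attributes"]["QueueArn"]
    sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=arn)

# One published event; both queues receive a copy, and neither consumer
# needs to know the other exists.
sns.publish(TopicArn=topic_arn, Message='{"event": "order-placed", "id": 42}')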

 

 

Tight coupling


This type of coupling is predominant in high-performing systems where components work in clusters performing the same task. The work of each component is managed by central control systems, often referred to as middleware.
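As a toy illustration (a local process pool standing in for cluster middleware), the coordinator below splits one task across identical workers and blocks until all of them report back, which is the hallmark of tight coupling.

from multiprocessing import Pool

def partial_sum(chunk):
    # Every worker performs the same task on its slice of the data.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i::4] for i in range(4)]   # one slice per worker

    with Pool(processes=4) as pool:
        # The central coordinator schedules the work and waits for every
        # worker; a single failed worker stalls the whole job.
        results = pool.map(partial_sum, chunks)

    print(sum(results))  # 499999500000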

 

 

What are use cases of distributed computing

 

 

Use case and description:

Web and Mobile Applications: Distributed computing allows applications to utilize cloud infrastructure for functions such as data storage, authentication, and content delivery, lightening the load on mobile devices. It also enables the processing of user search queries or transactions by distributing tasks across multiple servers.
Cloud Computing: Cloud computing utilizes distributed computing techniques across server pools in various data centers to offer scalable, fault-tolerant, and highly available services, efficiently managing resources like processing, storage, and applications on platforms like AWS.
Big Data Analytics: Big data analytics leverages distributed computing across multiple servers with technologies like Hadoop, Spark, and AWS services such as EMR, Glue, and Redshift, enabling scalable parallel processing of large datasets for improved performance and massive computing power for operations like data mining and machine learning model training.
IoT (Internet of Things): IoT systems, utilizing distributed computing, edge, and cloud techniques, leverage AWS IoT services and machine learning algorithms across billions of devices for real-time analytics and optimization, enabling scalable processing and insights from data collected by sensors and equipment in applications like predictive maintenance and industrial IoT.
Financial Services: Financial institutions leverage distributed computing and AWS services like EMR, EC2 Spot Instances, and S3 to efficiently process and store vast data volumes, enabling scalable, high-powered analysis and simulations for risk modeling, trading, and regulatory compliance, while optimizing resource use and reducing costs.
Gaming: Massively multiplayer online games (MMOs) utilize distributed computing, cloud, and edge computing to scale processing across servers, distribute tasks like physics simulations, and synchronize game data, enabling dynamic scalability for fluctuating player loads, reduced latency, and cost efficiency.

 

 
