Jan 23, 2024 · 8 min.

Data retention and classification in AWS

Overview of data retention and classification features and services in AWS

Full-Stack Cloud Engineer

TABLE OF CONTENTS

Data retention
Data classification
References

Data retention and classification in AWS (Amazon Web Services) involve organizing and managing an organization's data according to specific criteria within the AWS ecosystem.

These practices are essential for determining the protection, access, and retention of data over time, ensuring data security, maintaining compliance with legal and regulatory requirements, and facilitating efficient data management.

Data retention

Data retention in organizations involves defining the duration for storing data, guided by a clear policy that aligns with both regulatory mandates and business requirements.

Retention decisions depend on data classification: identifying and categorizing data within AWS services according to its sensitivity, importance, and usage shows which retention period applies to each data set.

AWS provides a range of services for effective data retention management:

Amazon S3 Lifecycle Policies

Automate the transition of objects from S3 Standard to lower-cost storage classes such as S3 Standard-IA (Infrequent Access) or S3 Glacier, and expire (delete) objects automatically after a set period.
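As a rough illustration, the sketch below builds the kind of dictionary that boto3's put_bucket_lifecycle_configuration accepts as its LifecycleConfiguration argument. The prefix, rule ID, and day counts are hypothetical, and the code only constructs and prints the structure; it does not call AWS.

```python
import json


def build_lifecycle_rule(prefix, ia_after_days, glacier_after_days, expire_after_days):
    """Build one S3 lifecycle rule: two transitions, then expiration."""
    if not (ia_after_days < glacier_after_days < expire_after_days):
        raise ValueError("day thresholds must increase: IA < Glacier < expiration")
    return {
        "ID": f"retain-{prefix.strip('/')}",  # hypothetical naming scheme
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


# Example values only: logs move to Standard-IA after 30 days,
# to Glacier after 90 days, and are deleted after one year.
lifecycle_configuration = {"Rules": [build_lifecycle_rule("logs/", 30, 90, 365)]}
print(json.dumps(lifecycle_configuration, indent=2))
```

Keeping the thresholds in one helper makes it harder to write a rule that tries to expire objects before they have been transitioned.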

Amazon S3 Intelligent-Tiering

Moves objects between access tiers based on usage patterns, optimizing costs by shifting data that has not been accessed recently into lower-cost tiers and returning it to the frequent access tier when it is read again.

Amazon EBS Snapshots and Amazon Data Lifecycle Manager

Manage EBS snapshots with policies for automatic creation and deletion after a defined retention period.

AWS Backup

Centralizes backup policies across AWS services, storing backups in backup vaults and deleting them automatically after a specified retention period.
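A minimal sketch of a backup plan with a retention lifecycle, shaped like the BackupPlan argument of boto3's create_backup_plan. The plan, rule, and vault names and the schedule are hypothetical. The check reflects the AWS Backup constraint that backups moved to cold storage must remain there for at least 90 days before deletion. Nothing here calls AWS.

```python
def build_backup_rule(name, vault, cold_after_days, delete_after_days):
    """Build one AWS Backup rule with a cold-storage and deletion lifecycle."""
    if delete_after_days < cold_after_days + 90:
        raise ValueError("deletion must be at least 90 days after the move to cold storage")
    return {
        "RuleName": name,
        "TargetBackupVaultName": vault,
        "ScheduleExpression": "cron(0 5 ? * * *)",  # daily at 05:00 UTC (example)
        "Lifecycle": {
            "MoveToColdStorageAfterDays": cold_after_days,
            "DeleteAfterDays": delete_after_days,
        },
    }


# Hypothetical plan: daily backups, cold after 30 days, deleted after one year.
backup_plan = {
    "BackupPlanName": "yearly-retention",
    "Rules": [build_backup_rule("daily", "example-vault", 30, 365)],
}
print(backup_plan["Rules"][0]["Lifecycle"])
```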

AWS Glue Crawlers

Schedule crawlers to periodically scan data stores like S3 and DynamoDB and update the AWS Glue Data Catalog, keeping an up-to-date inventory of stored data that aids retention compliance.

Resource Tagging

Facilitate retention policy application and regulatory compliance by classifying data through resource tagging.
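One way to picture tag-driven retention is a lookup from a classification tag to a retention period. The sketch below is plain Python and makes several assumptions: the DataClassification tag key, its values, and the retention periods are all hypothetical and would come from an organization's own policy.

```python
from datetime import date, timedelta

# Hypothetical mapping from a "DataClassification" tag value to retention in days.
RETENTION_DAYS = {"public": 30, "internal": 365, "confidential": 7 * 365}


def retention_deadline(tags, created_on, default_days=365):
    """Return the date after which a resource may be deleted."""
    days = RETENTION_DAYS.get(tags.get("DataClassification"), default_days)
    return created_on + timedelta(days=days)


def is_past_retention(tags, created_on, today):
    """True when the retention period for this resource has elapsed."""
    return today > retention_deadline(tags, created_on)


tags = {"DataClassification": "public", "Owner": "analytics"}
print(retention_deadline(tags, date(2024, 1, 23)))  # 2024-02-22
```

Resources without a recognized tag fall back to default_days, so untagged data is retained rather than silently deleted early.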

Additional Retention Mechanisms:

Data Archiving

Employ Amazon S3 Glacier for cost-effective, long-term archiving of infrequently accessed data.

Monitoring and Audits

Utilize AWS CloudTrail and AWS Config for ongoing auditing and monitoring of data access and lifecycle policies.

Legal Hold and Compliance

Use mechanisms like Amazon S3 Object Lock to prevent deletion of data under legal or compliance requirements.
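For illustration, the two dictionaries below have the shape boto3 expects for put_object_lock_configuration (ObjectLockConfiguration) and put_object_legal_hold (LegalHold). The retention mode and period are example values, and the sketch only builds the structures without calling AWS.

```python
def default_retention_config(mode, days):
    """Build a bucket-level Object Lock configuration with a default retention period."""
    if mode not in ("GOVERNANCE", "COMPLIANCE"):
        raise ValueError("mode must be GOVERNANCE or COMPLIANCE")
    if days <= 0:
        raise ValueError("days must be positive")
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": mode, "Days": days}},
    }


# Example: new objects are protected from deletion for one year.
object_lock_configuration = default_retention_config("COMPLIANCE", 365)

# A legal hold has no expiry date; it stays in place until switched to "OFF".
legal_hold = {"Status": "ON"}

print(object_lock_configuration)
```

A retention period ends on its own, whereas a legal hold has to be removed explicitly, which is why the two are configured separately.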

Disaster Recovery

Align retention policies with disaster recovery and business continuity plans using AWS Backup.

Data classification

Data classification in AWS involves categorizing data stored within its services by sensitivity, importance, and usage. This process is key to understanding the nature of the data, guiding how it should be protected and accessed. It plays a critical role in applying appropriate security controls, such as enhanced access management and encryption, especially for more sensitive data.

Classification is essential for regulatory compliance and data risk management. Amazon Macie helps discover and classify sensitive data stored in Amazon S3, while features of storage services such as Amazon S3 and S3 Glacier help apply security measures that match each classification level.

Resource Tagging

Utilize AWS resource tagging to assign metadata tags for classifying resources by attributes such as confidentiality level.

AWS Identity and Access Management (IAM)

Provides fine-grained access control to manage resource permissions based on their classification.
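As a sketch of classification-based access control, the policy document below allows reading only S3 objects whose DataClassification tag has an approved value, using the s3:ExistingObjectTag condition key. The bucket name, tag key, and tag values are hypothetical; the code just builds the JSON and does not attach it anywhere.

```python
import json


def read_policy_for_levels(bucket, allowed_levels):
    """Build an IAM policy allowing s3:GetObject only for objects tagged with allowed levels."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadByClassification",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    "StringEquals": {
                        # Hypothetical tag key; matches tags already on the object.
                        "s3:ExistingObjectTag/DataClassification": list(allowed_levels)
                    }
                },
            }
        ],
    }


policy = read_policy_for_levels("example-bucket", ["public", "internal"])
print(json.dumps(policy, indent=2))
```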

AWS Organizations

Facilitates central governance of accounts and enforces policies, including mandatory resource tagging for classification.
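A tag policy in AWS Organizations is a JSON document that standardizes a tag key and its allowed values. The sketch below builds one for a hypothetical DataClassification key with example values. It is only an illustration of the document shape and does not create or attach the policy.

```python
import json


def build_tag_policy(tag_key, allowed_values):
    """Build an AWS Organizations tag policy that fixes a key's spelling and allowed values."""
    return {
        "tags": {
            tag_key: {
                "tag_key": {"@@assign": tag_key},
                "tag_value": {"@@assign": list(allowed_values)},
            }
        }
    }


# Hypothetical classification levels.
tag_policy = build_tag_policy("DataClassification", ["public", "internal", "confidential"])
print(json.dumps(tag_policy, indent=2))
```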

Data Labeling Services:

Amazon SageMaker

Supports data labeling for machine learning projects through SageMaker Ground Truth.

AWS Glue DataBrew

Useful for profiling and classifying data in data lakes as part of ETL processes.

Amazon Macie

Automates sensitive data discovery and classification using machine learning and pattern matching, ideal for identifying PII and financial data.
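To give a feel for the pattern-matching half of that description, here is a small standalone sketch that flags email addresses and card-like numbers in text. It is not Macie's detection logic: the regular expressions, the Luhn checksum filter, and the finding names are illustrative assumptions.

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by spaces or hyphens.
CARD = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(text):
    """Return the set of sensitive-data types found in `text`."""
    findings = set()
    if EMAIL.search(text):
        findings.add("EMAIL_ADDRESS")
    if any(luhn_valid(m.group()) for m in CARD.finditer(text)):
        findings.add("CREDIT_CARD_NUMBER")
    return findings


print(classify("Contact jane@example.com, card 4111 1111 1111 1111"))
```

The checksum step shows why pattern matching alone is not enough: many 16-digit strings look like card numbers, and a validity check cuts down false positives.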

Monitoring and Compliance

Continuously monitor data access and security to ensure ongoing compliance with classification standards.

References

Data classification overview - Data Classification

Using AWS Cloud to support data classification - Data Classification

Data classification - security best practices

SUS04-BP01 Implement a data classification policy - AWS Well-Architected Framework (2022-03-31)

Best practice 3.7 – Implement data retention policies for each class of data in the analytics workload - Data Analytics Lens

COST04-BP05 Enforce data retention policies - Cost Optimization Pillar

COST04-BP05 Enforce data retention policies - AWS Well-Architected Framework
