AWS Under the Hood - Day 1


In the AWS console, why does the snapshot size always appear equal to the EBS volume size, even though snapshots are incremental? Since AWS stores EBS snapshots on S3 in the backend, how can we access or view detailed storage information for these snapshots if they are not directly visible in the S3 console?

The way the console displays snapshot sizes can be confusing at first, but there’s a logical explanation. Each snapshot is incremental, meaning only the blocks on the device that have changed since your most recent snapshot are saved in the new snapshot. The Size column in the console, however, reports the size of the source volume in GiB, not the amount of data stored for that snapshot. It tells you how large a volume the snapshot restores to, not how much space the snapshot physically occupies.

So even though each incremental snapshot contains only the changed blocks, the console shows the size of the entire volume, because any snapshot can restore the full volume on its own. The figure does not reflect actual storage consumption, which is typically lower and depends on how much data changed between snapshots.

Let’s try to understand this in simple language.
AWS EBS snapshots are incremental, meaning only the data changed since the last snapshot is saved. However, AWS doesn’t directly show the size of these incremental changes in the AWS Management Console. Here’s how it’s handled:
First Snapshot: The first snapshot of an EBS volume captures all the data on the volume at that point in time.
Subsequent Snapshots: Each subsequent snapshot captures only the blocks that have changed since the last snapshot. If a block has changed, it is stored again; if it hasn’t, the new snapshot simply refers to the copy that is already stored.
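
To make the first-versus-subsequent behaviour concrete, here is a toy model in Python. It is only a sketch of the idea, not how AWS implements snapshots: the `take_snapshot` function and the three-block "volume" are made up for illustration, and a dictionary stands in for the blocks on a disk.

```python
def take_snapshot(volume, previous_view):
    """Toy model of an incremental snapshot (illustrative, not AWS internals).

    volume: dict mapping block index -> current block contents.
    previous_view: what the previous snapshot can restore, or None for the
    first snapshot.
    Returns (stored_blocks, full_view): the blocks that had to be saved for
    this snapshot, and the complete volume image it can restore.
    """
    if previous_view is None:
        # First snapshot: every block on the volume is saved.
        stored = dict(volume)
    else:
        # Later snapshots: save only blocks that differ from the previous one.
        stored = {
            index: data
            for index, data in volume.items()
            if previous_view.get(index) != data
        }
    return stored, dict(volume)


volume = {0: b"boot", 1: b"logs-v1", 2: b"data-v1"}
stored1, view1 = take_snapshot(volume, None)

volume[1] = b"logs-v2"  # only block 1 changes between snapshots
stored2, view2 = take_snapshot(volume, view1)

print(len(stored1), "blocks saved for the first snapshot")
print(len(stored2), "block saved for the second snapshot")
print(len(view2), "blocks restorable from the second snapshot")
```

The second snapshot saves one block but can still restore all three, which is the same gap you see between the size shown in the console and the storage a snapshot really uses.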

To find out the size of the data changed (i.e., the size of the incremental portion of a snapshot), you need to go beyond the console and use the AWS CLI (Command Line Interface) or an SDK such as Boto3 (the AWS SDK for Python). Listing snapshots with describe-snapshots returns their attributes, including the volume size, but not the incremental size. The EBS direct APIs go a level deeper: ListSnapshotBlocks lists the blocks a snapshot contains, and ListChangedBlocks lists the blocks that differ between two snapshots of the same volume. Counting those blocks gives an estimate of the change size at block granularity, although it still won’t match the billed storage exactly.
AWS also provides CloudWatch metrics for volumes, like VolumeWriteOps or VolumeWriteBytes, which can give you an idea of how active the volume is. These do not directly indicate the change size in snapshots, because the same block can be overwritten many times between two snapshots and still counts as only one changed block.
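
As a rough sketch of the block-level approach, the function below counts the blocks that differ between two snapshots using the `list_changed_blocks` call of the EBS direct APIs. The function name `changed_bytes_between` and the snapshot IDs in the usage note are placeholders of mine, and the result is an estimate at block granularity, not the figure AWS bills you for.

```python
def changed_bytes_between(ebs_client, first_snapshot_id, second_snapshot_id):
    """Estimate how many bytes differ between two snapshots of one volume.

    ebs_client is expected to behave like boto3.client("ebs"). The estimate
    is (number of changed blocks) * (block size reported by the API).
    """
    request = {
        "FirstSnapshotId": first_snapshot_id,
        "SecondSnapshotId": second_snapshot_id,
    }
    changed_blocks = 0
    block_size = 0

    while True:
        page = ebs_client.list_changed_blocks(**request)
        block_size = page.get("BlockSize", block_size)
        changed_blocks += len(page.get("ChangedBlocks", []))

        # Results are paginated; keep going until there is no NextToken.
        token = page.get("NextToken")
        if not token:
            break
        request["NextToken"] = token

    return changed_blocks * block_size
```

With real credentials you would call it as `changed_bytes_between(boto3.client("ebs"), "snap-older", "snap-newer")`, replacing the two placeholder IDs with snapshots of the same volume. Keep in mind that the EBS direct APIs are billed per request, so running this against very large volumes is not free.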

As for the second question, whether you can inspect the S3 bucket where a snapshot is stored: EBS snapshots are indeed stored on S3, but this is handled entirely in the backend by AWS. The buckets belong to AWS, not to your account, so the snapshots never appear in the S3 console. You manage and view them through the EC2 Management Console under the Snapshots section, or through the EC2 API and CLI.

Let’s try to understand it again in simple terms:
EBS snapshots are stored in S3 but are not accessible like typical user files or buckets. Here’s what happens:
Storage Handling: When you create a snapshot, AWS manages the data storage automatically. The data is compressed and encrypted (if encryption is enabled), then stored in S3. However, AWS abstracts the details, meaning users do not interact directly with the S3 bucket where snapshots are stored.
Visibility and Access: There is no direct method to access these S3 buckets or see snapshot data via the S3 management console because AWS handles snapshots at a higher abstraction level. They are managed through the EC2 console under the Snapshots section, or through the EC2 API.
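
Since snapshots are reached through EC2 rather than S3, here is a small sketch that lists the snapshots of one volume with the EC2 `describe_snapshots` call. The helper name `summarize_snapshots` and the volume ID in the usage note are placeholders I picked for the example.

```python
def summarize_snapshots(ec2_client, volume_id):
    """Return (snapshot_id, start_time, volume_size_gib) tuples, oldest first.

    ec2_client is expected to behave like boto3.client("ec2").
    """
    paginator = ec2_client.get_paginator("describe_snapshots")
    pages = paginator.paginate(
        OwnerIds=["self"],  # only snapshots owned by this account
        Filters=[{"Name": "volume-id", "Values": [volume_id]}],
    )

    rows = []
    for page in pages:
        for snapshot in page["Snapshots"]:
            rows.append(
                (
                    snapshot["SnapshotId"],
                    snapshot["StartTime"],
                    snapshot["VolumeSize"],  # size of the source volume, in GiB
                )
            )

    rows.sort(key=lambda row: row[1])  # sort by start time
    return rows
```

Calling `summarize_snapshots(boto3.client("ec2"), "vol-0123456789abcdef0")` with one of your own volume IDs should show the same `VolumeSize` for every snapshot of that volume, which is the number the console displays.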

What happens under the hood?
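Under the hood, when a snapshot is taken, AWS identifies which volume blocks have been altered since your last snapshot. These changed blocks are then compressed and transferred to S3. If certain blocks change frequently, they may be stored again in several snapshots, one version per snapshot. Each snapshot keeps references to every block needed to restore the full volume, including unchanged blocks saved by earlier snapshots. That is why deleting an old snapshot removes only the data unique to it, and the remaining snapshots can still restore the whole volume.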
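
The deletion behaviour is easier to see with another toy model. Here each snapshot is a table mapping block index to a stored object; the table layout, the `obj-*` names, and the `unique_blocks` helper are all invented for illustration and are not the real internal format.

```python
def unique_blocks(snapshots, name):
    """Return the stored objects referenced only by snapshot `name`.

    snapshots: dict mapping snapshot name -> {block index: stored object id}.
    In this toy model, these are the only objects that deleting `name`
    would free.
    """
    referenced_elsewhere = set()
    for other_name, table in snapshots.items():
        if other_name != name:
            referenced_elsewhere.update(table.values())
    return set(snapshots[name].values()) - referenced_elsewhere


snapshots = {
    "snap-1": {0: "obj-a", 1: "obj-b", 2: "obj-c"},  # first, full snapshot
    "snap-2": {0: "obj-a", 1: "obj-d", 2: "obj-c"},  # block 1 was rewritten
    "snap-3": {0: "obj-a", 1: "obj-e", 2: "obj-c"},  # block 1 rewritten again
}

freed = unique_blocks(snapshots, "snap-1")
del snapshots["snap-1"]

print("freed by deleting snap-1:", sorted(freed))
print("snap-2 still restores blocks:", sorted(snapshots["snap-2"]))
```

Deleting the first snapshot frees only the old version of block 1. Blocks 0 and 2 stay in storage because the later snapshots still point at them.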

🏁 For users or developers who need to manage snapshots or understand their billing implications, it’s important to use the management features provided by AWS, like Cost Explorer, to understand the costs associated with snapshot storage.
📚 If you’re interested in a more in-depth explanation of these topics, please check out my new book “Cracking the DevOps Interview”
https://pratimuniyal.gumroad.com/l/cracking-the-devops-interview

📚 To learn more about AWS, check out my book “AWS for System Administrators”
https://www.amazon.com/AWS-System-Administrators-automate-infrastructure/dp/1800201532
