AWS S3 Lifecycle Rule

In this post, we will explore the significant economic and legal advantages of using lifecycle rules, demonstrating how well-defined rules can streamline data management, reduce storage costs, and protect data throughout its lifecycle.

AWS S3. Source: https://aws.amazon.com/es/s3/

As companies migrate to the cloud, automated data management becomes crucial. Therefore, before your data project expands, prioritize setting up your S3 buckets:

  1. Enable versioning: This safeguards critical data and allows fast rollbacks if needed.
  2. Define lifecycle rules: Determine deletion or archiving timelines based on data type, compliance requirements, and cost optimization goals.
  3. Fine-tune later: Flexibility lets you adjust rules as your needs evolve.

What is versioning?

Before we go further on lifecycle rules, I would like to briefly discuss versioning, as it is crucial for understanding the impact of lifecycle rules.

As its name suggests, versioning is a feature that allows you to keep multiple variants of an object within the same bucket. When enabled, every object stored in the bucket is assigned a unique version ID. This means that every time you modify an object, a new version of it is created.

Example of an S3 bucket containing a file with 4 versions

Something similar happens when we delete a file. While seemingly deleted, files with versioning enabled actually receive a “delete marker” instead of being permanently deleted: we won’t be able to see them or use them (unless we specifically check for the versions), but they still incur storage costs.

Example of a “deleted” file with versioning enabled
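As a minimal sketch, these mechanics look like this with boto3, the AWS SDK for Python (the bucket and key names would be your own; the import sits inside the functions so the sketch can be loaded without AWS access):

```python
def soft_delete(bucket, key):
    """On a versioned bucket, a plain delete only adds a delete marker;
    every previous version is kept and still billed."""
    import boto3  # AWS SDK for Python
    boto3.client('s3').delete_object(Bucket=bucket, Key=key)

def list_versions(bucket, key):
    """Return the stored versions and delete markers for one key."""
    import boto3
    resp = boto3.client('s3').list_object_versions(Bucket=bucket, Prefix=key)
    return resp.get('Versions', []), resp.get('DeleteMarkers', [])
```

Calling `soft_delete` and then `list_versions` on the same key would show the old versions still present alongside a new delete marker.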

Versioning is crucial as it simplifies recovery from accidental deletions. However, you need to understand how it changes the meaning of “deletion” for optimal storage management and compliance:

Imagine you need to write 100 GB of temporary files once a day (some shuffle using S3 — Glue) and “delete” them afterwards. Doing some math, by the end of a month you can expect to hold 3 TB of “deleted” data and still pay for it: 3,000 GB * $0.0095/GB/month ≈ $28.50/month.
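The arithmetic, spelled out from the stated figures (100 GB soft-deleted per day at an illustrative $0.0095/GB/month):

```python
# Cost of "deleted" versions piling up on a versioned bucket:
# 100 GB written (and soft-deleted) every day for a month.
daily_write_gb = 100
days = 30
price_per_gb_month = 0.0095  # illustrative storage price

retained_gb = daily_write_gb * days               # noncurrent versions kept
monthly_cost = retained_gb * price_per_gb_month   # dollars per month

print(f"{retained_gb} GB retained -> ${monthly_cost:.2f}/month")
# 3000 GB retained -> $28.50/month
```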

In this post, however, I will present you some ways to keep your environment clean and costs under control.

What are lifecycle rules?

S3 lifecycle rules are a feature of Amazon Simple Storage Service (S3) that, by defining a set of actions, allows you to automatically manage the storage of your objects over time. These rules are highly customizable so they can be tailored to your specific needs: you just need to specify the conditions that an object must meet to trigger a particular action.

Types of S3 lifecycle rules

There are two types of S3 lifecycle rules:

  • Transition action rules: These rules move objects to a different storage class at pre-defined time intervals. For example, you could create a rule that moves objects to S3 Glacier Deep Archive storage after 1 year.
  • Expiration action rules: These rules permanently delete objects after a specified period of time. For example, you could create a rule that deletes objects that have not been modified in 3 years, or a rule to keep only the last three versions of a file.
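As a sketch, the two rule types can be expressed as a lifecycle configuration document and applied with boto3; the bucket name and rule IDs here are hypothetical:

```python
# The two rule types as a lifecycle configuration document.
lifecycle_config = {
    'Rules': [
        {   # Transition action: archive objects after 1 year
            'ID': 'archive-after-1-year',
            'Filter': {'Prefix': ''},   # empty prefix = whole bucket
            'Status': 'Enabled',
            'Transitions': [{'Days': 365, 'StorageClass': 'DEEP_ARCHIVE'}],
        },
        {   # Expiration action: permanently delete objects after 3 years
            'ID': 'expire-after-3-years',
            'Filter': {'Prefix': ''},
            'Status': 'Enabled',
            'Expiration': {'Days': 1095},
        },
    ]
}

# To apply it (requires AWS credentials):
# import boto3
# boto3.client('s3').put_bucket_lifecycle_configuration(
#     Bucket='test-bucket', LifecycleConfiguration=lifecycle_config)
```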

Benefits of using S3 lifecycle rules:

  • Simplify data management: You can automate the management of your data lifecycle, which can save you time and effort.
  • Compliance: GDPR rules demand responsible data handling, including defined retention periods. If this is the case for you, you can implement automated deletion for outdated data to simplify compliance and safeguard your users.
    Word of Caution: Data lakes in S3 require special attention. Their dynamic nature allows continuous modification, blurring the lines of “data age”.
  • Reduce storage costs: You can move objects to lower-cost storage classes as they become less frequently accessed. For example, you can move objects to S3 Glacier Deep Archive storage after 3 years, which can save you up to 95% of the storage cost.

Changes in Billing

There may be a short delay between the moment a Lifecycle configuration rule is satisfied and the action it triggers. However, billing changes take effect immediately. After the object expiration time, you no longer pay for storage, even if the object is not deleted right away.

Transitions follow different rules. For example, when an object transitions to S3 Glacier, you start paying Glacier Flexible Retrieval rates as soon as the condition is met. If you access the object during this time, charges apply even if the object has not yet physically moved to Glacier storage. For details about other transitions, please check [Source].

How to configure lifecycle rules

Defining policies in S3 console
  1. Go to the S3 service inside your account
  2. Go to the bucket you want to configure. In this case, “test-bucket”
  3. Make sure that bucket versioning is enabled
  4. Go to the “management” tab
  5. Create lifecycle rule

6. Give the rule a name and a scope: apply it to all the objects in the bucket, or to objects that match a certain prefix or tag.

Let’s see this in more detail. Here we have four different folders inside our bucket.

If we define a prefix like temp_folder, it will match any object key that starts with that string, so it matches all four folders inside our bucket. Therefore, if we want to delete only “temp_folder”, the prefix must be defined as temp_folder/.

Please remember to always end the prefix with a trailing slash (/) to define the scope of your rules precisely.
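Prefix scoping is plain “starts with” matching on the object key, not a folder lookup. A small Python illustration with hypothetical folder names:

```python
# Hypothetical object keys for four folders in the bucket
keys = [
    'temp_folder/file.parquet',
    'temp_folder_2/file.parquet',
    'temp_folder_backup/file.parquet',
    'temp_folder_old/file.parquet',
]

# Without the trailing slash, the prefix matches all four folders
matches_loose = [k for k in keys if k.startswith('temp_folder')]

# With the trailing slash, only temp_folder/ itself is in scope
matches_strict = [k for k in keys if k.startswith('temp_folder/')]

print(matches_strict)  # ['temp_folder/file.parquet']
```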

7. Specify the action

Although 5 different options are presented, we will focus on the last 3, as the others are more straightforward.

  • Expire current versions of objects: delete the files inside the folder after a given number of days.
  • Permanently delete noncurrent versions of objects: once a file has been deleted (or overwritten), its previous version becomes noncurrent. With this action, we decide that after x days those noncurrent versions will be permanently deleted. We can also choose how many of them to keep.
  • Delete expired object delete markers or incomplete uploads: remove delete markers left without any versions, and abort incomplete multipart uploads.
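The console steps above can also be sketched as a single lifecycle configuration applied with boto3; the prefix and day counts are hypothetical examples:

```python
# One rule combining the actions described above for a "temp" folder.
cleanup_config = {
    'Rules': [{
        'ID': 'clean-temp-folder',
        'Filter': {'Prefix': 'temp_folder/'},   # note the trailing slash
        'Status': 'Enabled',
        # Expire current versions 7 days after creation
        'Expiration': {'Days': 7},
        # Purge noncurrent versions 7 days after they become noncurrent,
        # keeping the 3 most recent ones
        'NoncurrentVersionExpiration': {
            'NoncurrentDays': 7,
            'NewerNoncurrentVersions': 3,
        },
        # Abort multipart uploads left unfinished for 7 days
        'AbortIncompleteMultipartUpload': {'DaysAfterInitiation': 7},
    }]
}

# To apply it (requires AWS credentials):
# import boto3
# boto3.client('s3').put_bucket_lifecycle_configuration(
#     Bucket='test-bucket', LifecycleConfiguration=cleanup_config)
```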

And this is it. With lifecycle rules in place, you will no longer need to worry about hidden costs from old versions, and you can be sure that what you delete is actually gone after the number of days you specified.

Take into account that lifecycle rules run once a day. After the first time Amazon S3 runs the rules, all objects that are eligible for expiration are marked for deletion.

Expiring Objects in Amazon S3

Amazon S3 expires object versions and removes delete markers asynchronously. The cleanup can take a few days before the bucket becomes empty. However, once the rule triggers, you stop paying for objects marked for deletion. To learn more about asynchronous object removal, see Expiring objects.

It’s important to carefully plan and test your lifecycle rules to ensure they align with your data management requirements and budget constraints.

Others
  • One alternative to the previous option is to use the AWS CLI to permanently delete objects under a given bucket and prefix. Check Using the AWS CLI.
  • Also, if you are using infrastructure as code, you can configure these policies right away when deploying a bucket with Terraform or Pulumi.

BONUS: Permanently delete files via python script

In case you need to write a temporary Parquet file in a folder and don’t want to set up a specific lifecycle rule, you can still permanently delete it yourself. For this, you will need boto3, the AWS SDK for Python.

Using this code you can permanently delete a folder, including every version and delete marker it contains:

import boto3

# Initialize the Boto3 S3 client (credentials are read from your
# environment or AWS config; avoid hard-coding keys in source code)
s3_client = boto3.client('s3')

# Specify the bucket name and the prefix ("folder") to delete
bucket_name = 'test-bucket-policy'
prefix = 'temporary-folder/'

# On a versioned bucket, delete_object without a VersionId only adds a
# delete marker. To delete permanently, list every version and delete
# marker under the prefix and remove each one by its VersionId.
paginator = s3_client.get_paginator('list_object_versions')
for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
    for item in page.get('Versions', []) + page.get('DeleteMarkers', []):
        s3_client.delete_object(Bucket=bucket_name,
                                Key=item['Key'],
                                VersionId=item['VersionId'])

print("Folder deleted permanently.")

Conclusion

Amazon S3 Lifecycle rules give you a powerful way to cut storage costs and automate data management as it ages. With policies, you can move objects between storage classes, archive them for long-term use, or delete them when they are no longer required. This automation saves manual effort and keeps data management aligned with your compliance and cost-control goals.

Billing works differently from the background actions. Lifecycle rules may take time to complete, but billing changes start immediately once the rule is met. After an object expires, you stop paying for its storage even if deletion takes extra time. When objects move to colder storage like Amazon S3 Glacier, charges start as soon as the condition triggers, not when the transition finishes.

Plan your rules carefully to balance performance, availability, and cost. Begin with simple policies such as deleting old logs or moving rarely accessed data, and expand to advanced lifecycle strategies as your workloads grow. Over time, Lifecycle rules reduce costs while supporting governance, security, and compliance across large-scale environments.