Python – AWS S3 keep N latest artifacts

This script lists all top-level directories (prefixes) in the bucket and, within each directory, deletes every object except the N most recently modified ones. To run the script, pass two arguments:

  • Bucket name
  • Number of most recent objects to keep in each directory
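
The selection rule described above can be sketched without touching S3 at all. The helper below is illustrative only: the name keys_to_delete is hypothetical and not part of the script, and it assumes each object is a dict with 'Key' and 'LastModified' fields, as in a boto3 listing response. The keys and dates are made up.

```python
from datetime import datetime, timezone


def keys_to_delete(objects, keep_last):
    """Return the keys of all objects except the keep_last newest ones."""
    if keep_last < 0:
        raise ValueError("keep_last must be non-negative")
    # Oldest first, newest last
    ordered = sorted(objects, key=lambda obj: obj["LastModified"])
    # Everything before the cutoff is old enough to delete
    cutoff = max(len(ordered) - keep_last, 0)
    return [obj["Key"] for obj in ordered[:cutoff]]


# Example listing with made-up keys and timestamps
listing = [
    {"Key": "app/build-3.zip", "LastModified": datetime(2023, 1, 3, tzinfo=timezone.utc)},
    {"Key": "app/build-1.zip", "LastModified": datetime(2023, 1, 1, tzinfo=timezone.utc)},
    {"Key": "app/build-2.zip", "LastModified": datetime(2023, 1, 2, tzinfo=timezone.utc)},
]
print(keys_to_delete(listing, 2))  # ['app/build-1.zip']
```

Note that in this sketch a keep_last of 0 selects every key for deletion, whereas a slice of the form [:-keep_last] selects nothing when keep_last is 0.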

How to use it:

python3 {SCRIPT_NAME} artifacts-artem-services 3
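
The script reads its two arguments directly from sys.argv. As a possible alternative, the same two positional arguments could be parsed with argparse from the standard library, which adds type checking and a generated help message. This is only a sketch; the function name parse_args and the help texts are assumptions, not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Parse the bucket name and the number of objects to keep."""
    parser = argparse.ArgumentParser(
        description="Clean up old artifacts in an S3 bucket"
    )
    parser.add_argument("bucket", help="name of the S3 bucket")
    parser.add_argument(
        "keep_last",
        type=int,
        help="number of most recent objects to keep in each directory",
    )
    # argv=None makes argparse fall back to sys.argv[1:]
    return parser.parse_args(argv)


args = parse_args(["artifacts-artem-services", "3"])
print(args.bucket, args.keep_last)  # artifacts-artem-services 3
```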

import sys
import boto3

def cleanup():
    # Sort key: the object's LastModified datetime (oldest first)
    get_last_modified = lambda obj: obj['LastModified']

    s3 = boto3.client('s3')
    # List the top-level "directories" (common prefixes) in the bucket
    result = s3.list_objects_v2(Bucket=bucket, Delimiter='/')
    for prefix in result.get('CommonPrefixes', []):
        print('Directory: ' + str(prefix['Prefix']))
        # Note: a single call returns at most 1000 keys per prefix
        artifacts_listing = s3.list_objects_v2(Bucket=bucket, Prefix=prefix['Prefix']).get('Contents', [])
        artifacts_sorted = [obj['Key'] for obj in sorted(artifacts_listing, key=get_last_modified)]
        # The slice drops the newest keep_last keys; with keep_last = 0 it is empty and nothing is deleted
        for artifact in artifacts_sorted[:-keep_last]:
            print('Deleting artifact: ' + str(artifact))
            s3.delete_object(Bucket=bucket, Key=artifact)

# Run the cleanup only when both arguments are given; otherwise print usage
if len(sys.argv) > 2:
    bucket = sys.argv[1]
    keep_last = int(sys.argv[2])
    cleanup()
else:
    print("This script cleans up old artifacts in an S3 bucket")
    print("Usage  : python3 " + sys.argv[0] + " {BUCKET_NAME} {NUMBER_OF_LATEST_ARTIFACTS_TO_KEEP}")
    print("Example: python3 " + sys.argv[0] + " artifacts-artem-services 3")
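
A single list_objects_v2 call returns at most 1000 keys, so directories with more artifacts than that would only be partially processed. One way to handle this, sketched below, is to use a boto3 paginator for listing and delete_objects for removing up to 1000 keys per request. The helper names list_all_objects and delete_in_batches are hypothetical, and the s3 argument is assumed to be a client created with boto3.client('s3').

```python
def list_all_objects(s3, bucket, prefix):
    """Collect every object under prefix, following pagination."""
    paginator = s3.get_paginator("list_objects_v2")
    objects = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty prefix have no 'Contents' key
        objects.extend(page.get("Contents", []))
    return objects


def delete_in_batches(s3, bucket, keys, batch_size=1000):
    """Delete keys with delete_objects, at most batch_size keys per request."""
    for start in range(0, len(keys), batch_size):
        batch = keys[start:start + batch_size]
        s3.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": key} for key in batch]},
        )


# Possible usage (requires boto3 and AWS credentials):
#   s3 = boto3.client("s3")
#   objects = list_all_objects(s3, "artifacts-artem-services", "some-directory/")
```

Passing the client in as an argument keeps both helpers easy to exercise with a stub object instead of a real bucket.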
