Amazon Web Services - restore S3 Glacier objects with serverless Lambda
First, why use Lambda?
Advantages:
Cost - with Lambda you only pay for what you use. There are all kinds of startups out there that talk about running an entire service for hundreds of thousands of users for about $100/month.
The first 1 million Lambda requests per month are free; $0.20 per 1 million requests thereafter.
Serverless - there is plenty you no longer have to worry about. All the apps we run on Lambda get multi-AZ fault tolerance out of the box for free. There are no load balancers or Auto Scaling groups to design and set up.
No patching - no more patching strategy for the instances that host your scripts. No configuration management tools, and no one on the team has to keep track of OS versions. The operations model is much cleaner with serverless.
________________________________________________________________
Now, back to the steps for restoring all S3 Glacier objects in a single S3 bucket:
When files transition to the Glacier storage class, restoring them can be a bit of a hassle. The following steps walk through restoring S3 objects whose storage class is GLACIER.
First, go to the Lambda service and create a new function. For this I wrote the following Python 3.7 code:
*Remember to fill in your bucket name and the number of days the restored objects should remain available, e.g. RestoreRequest={'Days': 10}
# This Python script runs in Lambda (Python 3.7) with a 15-minute execution time
# This script submits a Glacier restore request for every object in one S3 bucket
import boto3

def lambda_handler(event, context):
    s3 = boto3.resource('s3')
    bucket = s3.Bucket('your-bucket')
    for obj_sum in bucket.objects.all():
        obj = s3.Object(obj_sum.bucket_name, obj_sum.key)
        if obj.storage_class == 'GLACIER':
            # Try to restore the object if the storage class is GLACIER and
            # the object does not have a completed or ongoing restoration
            # request.
            if obj.restore is None:
                print('Submitting restoration request: %s' % obj.key)
                obj.restore_object(RestoreRequest={'Days': 10})
            # Print out objects whose restoration is ongoing
            elif 'ongoing-request="true"' in obj.restore:
                print('Restoration in progress: %s' % obj.key)
            # Print out objects whose restoration is complete
            elif 'ongoing-request="false"' in obj.restore:
                print('Restoration complete: %s' % obj.key)
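One caveat with the script above: listing a large bucket may not finish within Lambda's 15-minute limit. Because the script only submits a restore when obj.restore is None, it is safe to re-run; a small guard can stop it cleanly before the hard timeout. This is a sketch under assumptions: STOP_BUFFER_MS is an arbitrary safety margin I chose, while context.get_remaining_time_in_millis() is the standard Lambda context method.

```python
STOP_BUFFER_MS = 30_000  # assumed safety margin: stop ~30 s before the timeout

def out_of_time(remaining_ms, buffer_ms=STOP_BUFFER_MS):
    """True when the handler should stop submitting restore requests."""
    return remaining_ms <= buffer_ms

# Inside the loop above:
#   if out_of_time(context.get_remaining_time_in_millis()):
#       print('Stopping early; re-run the function to continue')
#       break
```

Since objects that already have a restore request are skipped on the next run, repeatedly invoking the function eventually covers the whole bucket.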
"For all but the largest archives (250MB+), data accessed using Expedited retrievals are typically made available within 1 – 5 minutes. Standard retrievals complete within 3 – 5 hours. Bulk retrievals complete within 5 – 12 hours."
Once the S3 Glacier objects are available for download, copy all the object files over to a new S3 bucket for long-term access.
You can create another Lambda function to handle this copy process:
# This Python script runs in Lambda (Python 3.7) with a 15-minute execution time
# This script copies all objects from one bucket to another
import boto3

s3 = boto3.resource('s3')

def lambda_handler(event, context):
    # source bucket
    bucket = s3.Bucket('source-bucket')
    # destination bucket
    dest_bucket = s3.Bucket('destination-bucket')
    print(bucket)
    print(dest_bucket)
    for obj in bucket.objects.all():
        dest_key = obj.key
        print(dest_key)
        s3.Object(dest_bucket.name, dest_key).copy_from(
            CopySource={'Bucket': obj.bucket_name, 'Key': obj.key})
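Note that the copy loop above attempts every object in the source bucket, including any Glacier objects whose restore has not completed yet, and those copies will fail. A small check can skip them, reusing the same header strings as the restore script; this is a sketch, and the simple substring matching is an assumption rather than an official parsing API:

```python
def is_ready_to_copy(storage_class, restore_header):
    """True if the object's data can currently be read (and so copied).

    storage_class and restore_header correspond to obj.storage_class
    and obj.restore on a boto3 s3.Object.
    """
    if storage_class != 'GLACIER':
        return True   # regular objects are always readable
    if restore_header is None:
        return False  # no restore requested yet
    return 'ongoing-request="false"' in restore_header

# Inside the copy loop above:
#   src = s3.Object(obj.bucket_name, obj.key)
#   if not is_ready_to_copy(src.storage_class, src.restore):
#       continue
```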
And once you trigger the Lambda function above, you are done!
That's all, folks. Have a great week ahead!