Using Amazon Glacier as a cheap backup solution from the command line


Set up the bucket and lifecycle rules:

  1. Create an AWS account
  2. Create an S3 bucket inside that account
  3. Create a Lifecycle rule that targets the whole bucket
  4. Check the "Archive to the Glacier Storage Class" checkbox and set the number of days to 0, so that objects are archived as soon as possible after upload
  5. Set previous versions to be permanently deleted, e.g. 30 days after becoming a previous version
  6. Enable "End and Clean up Incomplete Multipart Uploads", e.g. 7 days after the upload initiation date
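
The same lifecycle rule can probably be set up from the command line instead of the console. The following is a minimal sketch, not a tested recipe: the rule ID "archive-to-glacier", the file name lifecycle.json and the BUCKET environment variable are names made up for this example, and it assumes the bucket already exists (for instance, created with "aws s3 mb s3://mybucket") and that the AWS CLI has already been installed and configured.

```shell
#!/bin/sh
# Write a lifecycle configuration mirroring steps 3-6 above:
# archive after 0 days, expire previous versions after 30 days,
# abort incomplete multipart uploads after 7 days.
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "archive-to-glacier",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "Transitions": [{"Days": 0, "StorageClass": "GLACIER"}],
      "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
      "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}
    }
  ]
}
EOF

# Apply the rule only when BUCKET is set (hypothetical variable for this sketch),
# e.g.: BUCKET=mybucket sh setup-lifecycle.sh
if [ -n "${BUCKET:-}" ]; then
  aws s3api put-bucket-lifecycle-configuration \
    --bucket "$BUCKET" \
    --lifecycle-configuration file://lifecycle.json
else
  echo "BUCKET not set; wrote lifecycle.json only"
fi
```

The "previous versions" setting only has an effect if versioning is enabled on the bucket.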

Install the AWS Command Line Interface:
pip install awscli

Configure it with an access/secret key pair:
aws configure

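Configure it to avoid multipart uploads for files up to 5 GB, the largest object S3 accepts in a single request (without multipart, the ETag is a simple MD5 hash of the file, as long as the object is not encrypted with KMS):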
aws configure set default.s3.multipart_threshold 5GB
aws configure set default.s3.multipart_chunksize 5GB
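
With multipart disabled, the ETag can be used to check that an upload arrived intact. Here is a rough sketch of that check. The helper names local_md5 and etag_matches are invented for this example, and it assumes either md5sum (Linux) or md5 (macOS) is available.

```shell
#!/bin/sh
# Print the MD5 hex digest of a local file (md5sum on Linux, md5 -q on macOS).
local_md5() {
  if command -v md5sum >/dev/null 2>&1; then
    md5sum "$1" | cut -d ' ' -f 1
  else
    md5 -q "$1"
  fi
}

# Succeed if the ETag matches the file's MD5.
# S3 usually returns the ETag wrapped in double quotes, so strip them first.
etag_matches() {
  file=$1
  etag=$(printf '%s' "$2" | tr -d '"')
  [ "$(local_md5 "$file")" = "$etag" ]
}

# Example use after an upload (bucket and key are placeholders):
#   etag=$(aws s3api head-object --bucket mybucket --key somepath/file.zip \
#            --query ETag --output text)
#   etag_matches file.zip "$etag" && echo "upload verified"
```

This comparison only makes sense for objects uploaded in a single part. A multipart upload gets an ETag with a "-N" suffix that is not a plain MD5 of the file.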

Copy a single file:
aws s3 cp file.zip s3://mybucket/somepath/

Copy a whole directory of files:
aws s3 sync . s3://mybucket/somepath/
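
Before syncing a large directory it can be worth previewing what would be uploaded. The sketch below wraps the sync command in a small function so that extra options such as --dryrun or --exclude can be passed through. The function name backup_dir and the AWS_CMD variable are made up for this example; AWS_CMD exists only so the command can be swapped out when testing.

```shell
#!/bin/sh
# Command used to talk to AWS; override it (e.g. AWS_CMD=echo) to test the wrapper.
AWS_CMD="${AWS_CMD:-aws}"

# Sync a local directory to an S3 destination, passing any extra options through.
backup_dir() {
  src=$1
  dest=$2
  shift 2
  "$AWS_CMD" s3 sync "$src" "$dest" "$@"
}

# Example use (bucket and path are placeholders):
#   backup_dir . s3://mybucket/somepath/ --dryrun              # preview only
#   backup_dir . s3://mybucket/somepath/ --exclude "*.tmp"     # real upload
```

With --dryrun the CLI lists the operations it would perform without transferring anything.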

Avoid using --storage-class STANDARD_IA. That storage class has a 30-day minimum storage charge, so this will incur an EarlyDelete penalty when the keys are moved into Glacier.

Subscribe to All Posts - Wesley Tanaka