s3invsync is a Rust program for creating and syncing backups of an AWS S3 bucket (including old versions of objects) by making use of the bucket's Amazon S3 Inventory files.
Warning: This is an in-development program. There may be bugs, and some planned features have not been implemented yet.
- Clone this repository and cd into it.
- Run cargo build --release to build the binary. The intermediate build artifacts will be cached in target/ in order to speed up subsequent builds.
- Run with cargo run --release -- <arguments ...>.
- If necessary, the actual binary can be found in target/release/s3invsync. It should run on any system with the same OS and architecture as it was built on.
cargo run --release -- [<options>] <inventory-base> <outdir>
s3invsync downloads the contents of an S3 bucket, including old versions of objects, to the directory <outdir> using S3 Inventory files located at <inventory-base>.
<inventory-base> must be of the form s3://{bucket}/{prefix}/, where {bucket} is the destination bucket in which the inventory files are stored and {prefix}/ is the key prefix under which the inventory manifest files are located in the bucket (i.e., appending a string of the form YYYY-MM-DDTHH-MMZ/manifest.json to {prefix}/ should yield a key for a manifest file).
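The shape of <inventory-base> described above can be illustrated with a minimal Rust sketch. This is not s3invsync's actual code: the InventoryBase type, parse_inventory_base, and manifest_key are hypothetical names introduced here, and rejecting an empty prefix is an assumption of the sketch rather than documented behavior.

```rust
/// Hypothetical helper type for illustration; not part of s3invsync's API.
#[derive(Debug, PartialEq, Eq)]
struct InventoryBase {
    /// The bucket in which the inventory files are stored.
    bucket: String,
    /// The key prefix for manifest files; always ends in '/'.
    prefix: String,
}

/// Parse a string of the form `s3://{bucket}/{prefix}/`.
/// Returns `None` if the scheme, bucket, or trailing slash is missing.
/// (Assumption of this sketch: an empty prefix is rejected.)
fn parse_inventory_base(s: &str) -> Option<InventoryBase> {
    let rest = s.strip_prefix("s3://")?;
    let (bucket, prefix) = rest.split_once('/')?;
    if bucket.is_empty() || prefix.is_empty() || !prefix.ends_with('/') {
        return None;
    }
    Some(InventoryBase {
        bucket: bucket.to_owned(),
        prefix: prefix.to_owned(),
    })
}

impl InventoryBase {
    /// Build the key of the manifest for an inventory timestamp of the
    /// form `YYYY-MM-DDTHH-MMZ` by appending `{timestamp}/manifest.json`
    /// to the prefix.
    fn manifest_key(&self, timestamp: &str) -> String {
        format!("{}{}/manifest.json", self.prefix, timestamp)
    }
}
```

For example, with the made-up base s3://example-bucket/inventory/data/ and the timestamp 2024-11-05T01-00Z, manifest_key yields inventory/data/2024-11-05T01-00Z/manifest.json.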
s3invsync honors AWS credentials stored in the standard locations (e.g., the environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_REGION or the default credentials files ~/.aws/config and ~/.aws/credentials). For public buckets, no credentials need to be provided.
When downloading a given key from S3, the latest version (if not deleted) is stored at {outdir}/{key}, and the versionIds and etags of all latest object versions in a given directory are stored in .s3invsync.versions.json in that directory. Each non-latest, non-deleted version of a given key is stored at {outdir}/{key}.old.{versionId}.{etag}.
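The on-disk layout described above can be sketched as a few path-building functions. This is only an illustration of the naming scheme, not the program's implementation: the function names are hypothetical, and the sketch assumes keys are relative (no leading '/') and performs no sanitization of key contents.

```rust
use std::path::{Path, PathBuf};

/// Where the latest (non-deleted) version of `key` is stored:
/// `{outdir}/{key}`.
fn latest_path(outdir: &Path, key: &str) -> PathBuf {
    outdir.join(key)
}

/// Where a non-latest, non-deleted version of `key` is stored:
/// `{outdir}/{key}.old.{versionId}.{etag}`.
fn old_version_path(outdir: &Path, key: &str, version_id: &str, etag: &str) -> PathBuf {
    outdir.join(format!("{}.old.{}.{}", key, version_id, etag))
}

/// The `.s3invsync.versions.json` file that sits in the same directory
/// as the latest version of `key`.
fn versions_file_path(outdir: &Path, key: &str) -> PathBuf {
    // Everything before the last '/' in the key is the directory part.
    let dir = match key.rsplit_once('/') {
        Some((dir, _file)) => outdir.join(dir),
        None => outdir.to_path_buf(),
    };
    dir.join(".s3invsync.versions.json")
}
```

With a made-up output directory backup and key data/report.csv, these give backup/data/report.csv for the latest version, backup/data/report.csv.old.{versionId}.{etag} for an older version, and backup/data/.s3invsync.versions.json for the per-directory version record.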
- -d <DATE>, --date <DATE>: Download objects from the inventory created at the given date. By default, the most recent inventory is downloaded. The date must be in the format YYYY-MM-DD (in which case the latest inventory for the given date is used) or in the format YYYY-MM-DDTHH-MMZ (to specify a specific inventory).
- -I <INT>, --inventory-jobs <INT>: Specify the maximum number of inventory list files to download and process at once [default: 20]
- -l <level>, --log-level <level>: Set the log level to the given value. Possible values are "ERROR", "WARN", "INFO", "DEBUG", and "TRACE" (all case-insensitive). [default: DEBUG]
- -O <INT>, --object-jobs <INT>: Specify the maximum number of inventory entries to download and process at once [default: 20]
- --path-filter <REGEX>: Only download objects whose keys match the given regular expression
- --trace-progress: Emit download progress information at the TRACE level. This is off by default because it can make for some very noisy logs.
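The two formats accepted by --date can be told apart with a small parser. The following is a rough sketch using only the standard library; DateSpec, parse_date_spec, and the helper functions are hypothetical names, and the range checks are calendar-naive (for instance, February 31 is accepted), which may differ from how s3invsync validates dates.

```rust
/// Hypothetical classification of a `--date` argument.
#[derive(Debug, PartialEq, Eq)]
enum DateSpec {
    /// `YYYY-MM-DD`: use the latest inventory for that date.
    Date { year: u16, month: u8, day: u8 },
    /// `YYYY-MM-DDTHH-MMZ`: use exactly this inventory.
    Timestamp { year: u16, month: u8, day: u8, hour: u8, minute: u8 },
}

/// Parse a field consisting of exactly `len` ASCII digits.
fn digits(s: &str, len: usize) -> Option<u16> {
    if s.len() == len && s.bytes().all(|b| b.is_ascii_digit()) {
        s.parse().ok()
    } else {
        None
    }
}

/// Parse `YYYY-MM-DD` with simple range checks on month and day.
fn parse_date(s: &str) -> Option<(u16, u8, u8)> {
    let mut parts = s.split('-');
    let year = digits(parts.next()?, 4)?;
    let month = digits(parts.next()?, 2)? as u8;
    let day = digits(parts.next()?, 2)? as u8;
    if parts.next().is_some() || !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return None;
    }
    Some((year, month, day))
}

/// Classify a `--date` argument as a bare date or a full timestamp.
fn parse_date_spec(s: &str) -> Option<DateSpec> {
    match s.split_once('T') {
        None => {
            let (year, month, day) = parse_date(s)?;
            Some(DateSpec::Date { year, month, day })
        }
        Some((date, time)) => {
            let (year, month, day) = parse_date(date)?;
            // The time part is `HH-MMZ`: note the '-' separator and 'Z' suffix.
            let (hh, mm) = time.strip_suffix('Z')?.split_once('-')?;
            let hour = digits(hh, 2)? as u8;
            let minute = digits(mm, 2)? as u8;
            if hour > 23 || minute > 59 {
                return None;
            }
            Some(DateSpec::Timestamp { year, month, day, hour, minute })
        }
    }
}
```

Under this sketch, 2024-11-05 parses as a bare date and 2024-11-05T01-00Z as a specific inventory timestamp, while 2024-11-05T01:00Z is rejected because the time uses ':' instead of '-'.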