How can I find the total size of my AWS S3 bucket or folder?

Does Amazon provide an easy way to see how much storage my S3 bucket or folder is using, so that I can calculate my costs, etc.?


As of 28 July 2015, you can get this information via CloudWatch.

aws cloudwatch get-metric-statistics --namespace AWS/S3 --start-time 2015-07-15T10:00:00 \
  --end-time 2015-07-31T01:00:00 --period 86400 --statistics Average --region us-east-1 \
  --metric-name BucketSizeBytes --dimensions Name=BucketName,Value=myBucketNameGoesHere \
  Name=StorageType,Value=StandardStorage




Important: You must specify both StorageType and BucketName in the dimensions argument, otherwise you will get no results.
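
If you prefer to run the same query programmatically, here is a minimal boto3 sketch of the equivalent CloudWatch call (the bucket name, region and time window are placeholders):

import datetime

import boto3

cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')
response = cloudwatch.get_metric_statistics(
    Namespace='AWS/S3',
    MetricName='BucketSizeBytes',
    Dimensions=[
        # Both dimensions must be present, or no datapoints are returned.
        {'Name': 'BucketName', 'Value': 'myBucketNameGoesHere'},
        {'Name': 'StorageType', 'Value': 'StandardStorage'},
    ],
    StartTime=datetime.datetime.utcnow() - datetime.timedelta(days=2),
    EndTime=datetime.datetime.utcnow(),
    Period=86400,
    Statistics=['Average'],
)
for point in response['Datapoints']:
    print(point['Timestamp'], point['Average'])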

As an alternative, you can try s3cmd, which has a du command like Unix.

I use s3cmd du s3://BUCKET/ --human-readable to view the size of folders in S3. It gives quite detailed information about the total number of objects in the bucket and their size in a very readable form.

There are two ways:

Using aws cli

aws s3 ls --summarize --human-readable --recursive s3://bucket/folder/*

If you omit the trailing /, it will match every folder whose name starts with your folder name and report the combined total size of all of them.

aws s3 ls --summarize --human-readable --recursive s3://bucket/folder

Using boto3 api

import boto3


def get_folder_size(bucket, prefix):
    total_size = 0
    for obj in boto3.resource('s3').Bucket(bucket).objects.filter(Prefix=prefix):
        total_size += obj.size
    return total_size
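
For example (with a hypothetical bucket and prefix):

print(get_folder_size('my-bucket', 'my/folder/'))  # total size in bytes
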
s3cmd du --human-readable --recursive s3://Bucket_Name/

Amazon has changed the Web interface so now you have the "Get Size" under the "More" menu.

Found here

aws s3api list-objects --bucket cyclops-images --output json --query "[sum(Contents[].Size), length(Contents[])]" | awk 'NR!=2 {print $0;next} NR==2 {print $0/1024/1024/1024" GB"}'

Using the AWS Web Console and CloudWatch:

  1. Go to CloudWatch
  2. Click Metrics from the left side of the screen
  3. Click S3
  4. Click Storage
  5. You will see a list of all buckets. Note there are two possible points of confusion here:

    a. You will only see buckets that have at least one object in the bucket.
    b. You may not see buckets created in a different region, and you might need to switch regions using the pull-down at the top right to see the additional buckets.

  6. Search for the word "StandardStorage" in the area stating "Search for any metric, dimension or resource id"

  7. Select the buckets (or all buckets with the checkbox at the left below the word "All") you would like to calculate total size for
  8. Select at least 3d (3 days) or longer from the time bar towards the top right of the screen

You will now see a graph displaying the daily (or other period) size of all selected buckets over the selected time range.

If you don't need an exact byte count or if the bucket is really large (in the TBs or millions of objects), using CloudWatch metrics is the fastest way as it doesn't require iterating through all the objects, which can take significant CPU and can end in a timeout or network error if using a CLI command.

Based on some examples from others on SO for running the aws cloudwatch get-metric-statistics command, I've wrapped it up in a useful Bash function that allows you to optionally specify a profile for the aws command:

# print S3 bucket size and count
# usage: bsize <bucket> [profile]
function bsize() (
  bucket=$1 profile=${2-default}

  if [[ -z "$bucket" ]]; then
    echo >&2 "bsize <bucket> [profile]"
    return 1
  fi

  # ensure aws/jq/numfmt are installed
  for bin in aws jq numfmt; do
    if ! hash $bin 2> /dev/null; then
      echo >&2 "Please install \"$_\" first!"
      return 1
    fi
  done

  # get bucket region
  region=$(aws --profile $profile s3api get-bucket-location --bucket $bucket 2> /dev/null | jq -r '.LocationConstraint // "us-east-1"')
  if [[ -z "$region" ]]; then
    echo >&2 "Invalid bucket/profile name!"
    return 1
  fi

  # get storage class (assumes
  # all objects in same class)
  sclass=$(aws --profile $profile s3api list-objects --bucket $bucket --max-items=1 2> /dev/null | jq -r '.Contents[].StorageClass // "STANDARD"')
  case $sclass in
    REDUCED_REDUNDANCY) sclass="ReducedRedundancyStorage" ;;
    GLACIER)            sclass="GlacierStorage" ;;
    DEEP_ARCHIVE)       sclass="DeepArchiveStorage" ;;
    *)                  sclass="StandardStorage" ;;
  esac

  # _bsize <metric> <stype>
  _bsize() {
    metric=$1 stype=$2
    utnow=$(date +%s)
    aws --profile $profile cloudwatch get-metric-statistics --namespace AWS/S3 --start-time "$(echo "$utnow - 604800" | bc)" --end-time "$utnow" --period 604800 --statistics Average --region $region --metric-name $metric --dimensions Name=BucketName,Value="$bucket" Name=StorageType,Value="$stype" 2> /dev/null | jq -r '.Datapoints[].Average'
  }

  # _print <number> <units> <format> [suffix]
  _print() {
    number=$1 units=$2 format=$3 suffix=$4
    if [[ -n "$number" ]]; then
      numfmt --to="$units" --suffix="$suffix" --format="$format" $number | sed -En 's/([^0-9]+)$/ \1/p'
    fi
  }

  _print "$(_bsize BucketSizeBytes $sclass)" iec-i "%10.2f" B
  _print "$(_bsize NumberOfObjects AllStorageTypes)" si "%8.2f"
)

A few caveats:

  • For simplicity, the function assumes that all objects in the bucket are in the same storage class!
  • On macOS, use gnumfmt instead of numfmt.
  • If numfmt complains about invalid --format option, upgrade GNU coreutils for floating-point precision support.

Answer adjusted for 2020: Go into your bucket, select all folders and files, and click "Actions" -> "Get Total Size".

Answer updated for 2021 :)

In your AWS console, under S3 buckets, find bucket, or folder inside it, and click Calculate total size.


The most recent and easiest way is to go to the "Metrics" tab. It gives a clear view of the bucket size and the number of objects inside it.


In case someone needs byte precision:

aws s3 ls --summarize --recursive s3://path | tail -1 | awk '{print $3}'

You can visit this URL to see the size of your bucket on the "Metrics" tab in S3: https://s3.console.aws.amazon.com/s3/buckets/{YOUR_BUCKET_NAME}?region={YOUR_REGION}&tab=metrics

The data's actually in CloudWatch so you can just go straight there instead and then save the buckets you're interested in to a dashboard.

In Node.js

const getAllFileList = (s3bucket, prefix = null, token = null, files = []) => {
  var opts = { Bucket: s3bucket, Prefix: prefix };
  let s3 = awshelper.getS3Instance();
  if (token) opts.ContinuationToken = token;
  return new Promise(function (resolve, reject) {
    s3.listObjectsV2(opts, async (err, data) => {
      if (err) return reject(err);
      files = files.concat(data.Contents);
      if (data.IsTruncated) {
        // keep paginating until all objects have been listed
        resolve(
          await getAllFileList(
            s3bucket,
            prefix,
            data.NextContinuationToken,
            files
          )
        );
      } else {
        resolve(files);
      }
    });
  });
};

const calculateSize = async (bucket, prefix) => {
  let fileList = await getAllFileList(bucket, prefix);
  let size = 0;
  for (let i = 0; i < fileList.length; i++) {
    size += fileList[i].Size;
  }
  return size;
};

Now just call calculateSize("YOUR_BUCKET_NAME", "YOUR_FOLDER_NAME").

There are many ways to calculate the total size of folders in a bucket:

Using AWS Console

S3 Buckets > #Bucket > #folder > Actions > Calculate total size

Using AWS CLI

aws s3 ls s3://YOUR_BUCKET/YOUR_FOLDER/ --recursive --human-readable --summarize

The command's output shows:

  1. The date the objects were created
  2. The individual file size of each object
  3. The path of each object
  4. The total number of objects in the S3 bucket
  5. The total size of the objects in the bucket

Using Bash script

#!/bin/bash
while IFS= read -r line; do
  echo $line
  aws s3 ls --summarize --human-readable --recursive s3://#bucket/$line --region #region | tail -n 2 | awk '{print $1 $2 $3 $4}'
  echo "----------"
done < folder-name.txt

Sample Output:

test1/
TotalObjects:10
TotalSize:2.1KiB
----------
s3folder1/
TotalObjects:2
TotalSize:18.2KiB
----------
testfolder/
TotalObjects:1
TotalSize:112 MiB
----------

AWS pricing for List operations:

S3 List operations cost about $0.005 per 1,000 requests (the exact price depends on the region), and each request returns a maximum of 1,000 objects.

For example:

if your folder contains 1,000,000 objects, you would make 1,000 requests and the List operation would cost you $0.005.
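
A minimal Python sketch of that arithmetic (the object count and the $0.005-per-1,000-requests figure are just the example numbers from above; actual pricing varies by region):

import math

objects = 1_000_000
objects_per_request = 1000        # each List request returns at most 1,000 keys
price_per_1000_requests = 0.005   # USD, example figure quoted above

requests = math.ceil(objects / objects_per_request)
cost = requests / 1000 * price_per_1000_requests
print(f"{requests} List requests -> ${cost:.3f}")  # 1000 List requests -> $0.005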