Check file size on S3 without downloading?

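I have customer files uploaded to Amazon S3, and I would like to add a feature that calculates the total size of those files for each customer. Is there a way to "peek" at the file size without downloading the files? I know you can view it in the Amazon control panel, but I need to do it programmatically.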


Send an HTTP HEAD request to the object. A HEAD request will retrieve the same HTTP headers as a GET request, but it will not retrieve the body of the object (saving you bandwidth). You can then parse out the Content-Length header value from the HTTP response headers.
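
As a minimal sketch of the HEAD approach in Python, using only the standard library: the function name `content_length` and its `timeout` parameter are illustrative, not part of any AWS API. It assumes the object is publicly readable, or that the URL was presigned for the HEAD method.

```python
import urllib.request


def content_length(url: str, timeout: float = 10.0) -> int:
    """Return the size in bytes reported by a HEAD request to `url`."""
    # method="HEAD" asks the server for headers only; no body is transferred.
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        value = response.headers.get("Content-Length")
    if value is None:
        raise ValueError("response carried no Content-Length header")
    return int(value)
```

A call would look like `content_length("http://s3.amazonaws.com/bucketname/filename.jpg")`, with the bucket and file name replaced by real ones.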

You can also do a listing of the contents of the bucket. The metadata in the listing contains the file sizes of all of the objects. This is how it's implemented in the AWS SDK for PHP.
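
To connect the listing approach to the per-customer totals asked about in the question, here is a rough Python sketch. It assumes, hypothetically, that every key starts with the customer name followed by a slash (for example `alice/report.pdf`), and that each listing entry is a dict with `Key` and `Size` fields, as in the S3 list responses. The function name `size_per_customer` is made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def size_per_customer(objects: Iterable[Mapping]) -> Dict[str, int]:
    """Sum object sizes (bytes) grouped by the first path component of the key."""
    totals = defaultdict(int)
    for obj in objects:
        # Assumption: the text before the first "/" identifies the customer.
        # A key without "/" is counted under the whole key.
        customer, _, _ = obj["Key"].partition("/")
        totals[customer] += obj["Size"]
    return dict(totals)


listing = [
    {"Key": "alice/report.pdf", "Size": 2048},
    {"Key": "alice/img/logo.png", "Size": 512},
    {"Key": "bob/data.csv", "Size": 100},
]
print(size_per_customer(listing))  # {'alice': 2560, 'bob': 100}
```

If the customers are mapped to keys some other way (a database table, object tags), only the grouping line needs to change.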

Using Michael's advice, my successful code looked like this:

require 'net/http'
require 'uri'

file_url = MyObject.first.file.url

url = URI.parse(file_url)
req = Net::HTTP::Head.new url.path
# use_ssl is needed when the URL is https
res = Net::HTTP.start(url.host, url.port, use_ssl: url.scheme == 'https') { |http|
  http.request(req)
}

# Header values are strings
file_length = res["content-length"]

PHP code to check an S3 object's size (or any other object header). Note the use of stream_context_set_default to make sure only a HEAD request is sent:

stream_context_set_default(
    array(
        'http' => array(
            'method' => 'HEAD'
        )
    )
);


$headers = get_headers('http://s3.amazonaws.com/bucketname/filename.jpg', 1);
$headers = array_change_key_case($headers);


$size = trim($headers['content-length'],'"');

Android Solution

Integrate the AWS SDK and you get a fairly straightforward solution:

// ... put this in a background thread
List<S3ObjectSummary> s3ObjectSummaries;
s3ObjectSummaries = s3.listObjects(registeredBucket).getObjectSummaries();
for (int i = 0; i < s3ObjectSummaries.size(); i++) {
    S3ObjectSummary s3ObjectSummary = s3ObjectSummaries.get(i);
    Log.d(TAG, "doInBackground: size " + s3ObjectSummary.getSize());
}
  • Here is a link to the official documentation.
  • It is very important to run this code in an AsyncTask or otherwise on a background thread; making network calls on the UI thread throws an exception.

There is a better solution:

$info = $s3->getObjectInfo($yourbucketName, $yourfilename);
print $info['size'];

.NET AWS SDK, using ListObjectsRequest, ListObjectsResponse and S3Object:

AmazonS3Client s3 = new AmazonS3Client();
SpaceUsed(s3, "putBucketNameHere");

static void SpaceUsed(AmazonS3Client s3Client, string bucketName)
{
    ListObjectsRequest request = new ListObjectsRequest();
    request.BucketName = bucketName;
    // Note: a single ListObjects call returns at most 1,000 objects;
    // larger buckets need pagination to get the true total.
    ListObjectsResponse response = s3Client.ListObjects(request);
    long totalSize = 0;
    foreach (S3Object o in response.S3Objects)
    {
        totalSize += o.Size;
    }
    Console.WriteLine("Total Size of bucket " + bucketName + " is " +
        Math.Round(totalSize / 1024.0 / 1024.0, 2) + " MB");
}

Node.js example:

const AWS = require('aws-sdk');
const s3 = new AWS.S3();


function sizeOf(key, bucket) {
  return s3.headObject({ Key: key, Bucket: bucket })
    .promise()
    .then(res => res.ContentLength);
}

// A test
sizeOf('ahihi.mp4', 'output').then(size => console.log(size));

Doc is here.

I do something like this in Python to get the cumulative size of all files under a given prefix:

import boto3


bucket = 'your-bucket-name'
prefix = 'some/s3/prefix/'


s3 = boto3.client('s3')


size = 0

result = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
# 'Contents' is missing when no objects match the prefix
size += sum(x['Size'] for x in result.get('Contents', []))

while result['IsTruncated']:
    result = s3.list_objects_v2(
        Bucket=bucket, Prefix=prefix,
        ContinuationToken=result['NextContinuationToken'])
    size += sum(x['Size'] for x in result.get('Contents', []))

print('Total size in MB: ' + str(size / (1000**2)))
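
A paginator can take over the continuation-token bookkeeping from the loop above. The sketch below takes the client as a parameter so it can be exercised without network access. The function name `prefix_size` is illustrative, while `get_paginator("list_objects_v2")` and `paginate(...)` are the boto3 calls it relies on.

```python
def prefix_size(client, bucket: str, prefix: str = "") -> int:
    """Sum the sizes (bytes) of all objects under `prefix`, page by page."""
    paginator = client.get_paginator("list_objects_v2")
    total = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page that lists no objects.
        total += sum(obj["Size"] for obj in page.get("Contents", []))
    return total


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   print(prefix_size(boto3.client("s3"), "your-bucket-name", "some/s3/prefix/"))
```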

The following Python code prints the size of each of the first 1,000 objects under a prefix in S3:

import boto3


bucket = 'bucket_name'
prefix = 'prefix'


s3 = boto3.client('s3')
# .get() avoids a KeyError when no objects match the prefix
contents = s3.list_objects_v2(Bucket=bucket, MaxKeys=1000, Prefix=prefix).get('Contents', [])

for c in contents:
    print('Size (KB):', float(c['Size'])/1000)

This is a solution for whoever is using Java and the S3 java library provided by Amazon. If you are using com.amazonaws.services.s3.AmazonS3 you can use a GetObjectMetadataRequest request which allows you to query the object length.

The libraries you have to use are:

<!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-s3 -->
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-s3</artifactId>
<version>1.11.511</version>
</dependency>

Imports:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.*;

And the code you need to get the content length:

GetObjectMetadataRequest metadataRequest = new GetObjectMetadataRequest(bucketName, fileName);
final ObjectMetadata objectMetadata = s3Client.getObjectMetadata(metadataRequest);
long contentLength = objectMetadata.getContentLength();

Before you can execute the code above, you will need to build the S3 client. Here is some example code for that:

AWSCredentials credentials = new BasicAWSCredentials(
        accessKey,
        secretKey
);
s3Client = AmazonS3ClientBuilder.standard()
        .withRegion(clientRegion)
        .withCredentials(new AWSStaticCredentialsProvider(credentials))
        .build();

You can simply use the s3 ls command:

aws s3 ls s3://mybucket --recursive --human-readable --summarize

Outputs

2013-09-02 21:37:53   10 Bytes a.txt
2013-09-02 21:37:53  2.9 MiB foo.zip
2013-09-02 21:32:57   23 Bytes foo/bar/.baz/a
2013-09-02 21:32:58   41 Bytes foo/bar/.baz/b
2013-09-02 21:32:57  281 Bytes foo/bar/.baz/c
2013-09-02 21:32:57   73 Bytes foo/bar/.baz/d
2013-09-02 21:32:57  452 Bytes foo/bar/.baz/e
2013-09-02 21:32:57  896 Bytes foo/bar/.baz/hooks/bar
2013-09-02 21:32:57  189 Bytes foo/bar/.baz/hooks/foo
2013-09-02 21:32:57  398 Bytes z.txt


Total Objects: 10
Total Size: 2.9 MiB

Reference: https://docs.aws.amazon.com/cli/latest/reference/s3/ls.html

Golang example, same principle: run a HEAD request against the object in question (svc is an already-created S3 client):

func returnKeySizeInMB(bucketName string, key string) int {
	output, err := svc.HeadObject(
		&s3.HeadObjectInput{
			Bucket: aws.String(bucketName),
			Key:    aws.String(key),
		})
	if err != nil {
		log.Fatalf("Unable to send head request to item %q, %v", key, err)
	}

	// Integer division: objects smaller than 1 MiB are reported as 0.
	return int(*output.ContentLength / 1024 / 1024)
}

Here, the parameter key means the path to the file.

For example, if the URI of the file is s3://my-personal-bucket/folder1/subfolder1/myfile.pdf, then the call would look like:

output, err := svc.HeadObject(
	&s3.HeadObjectInput{
		Bucket: aws.String("my-personal-bucket"),
		Key:    aws.String("folder1/subfolder1/myfile.pdf"),
	})

AWS C++ SDK solution to get the file size:

//! Step 1: create s3 client
Aws::S3::S3Client s3Client(cred, config); //!Used cred & config,You can use other options.


//! Step 2: Head Object request
Aws::S3::Model::HeadObjectRequest headObj;
headObj.SetBucket(bucket);
headObj.SetKey(key);


//! Step 3: read size from object header metadata
auto object = s3Client.HeadObject(headObj);
if (object.IsSuccess())
{
    fileSize = object.GetResultWithOwnership().GetContentLength();
}
else
{
    std::cout << "Head Object error: "
              << object.GetError().GetExceptionName() << " - "
              << object.GetError().GetMessage() << std::endl;
}

Note: Do not use GetObject to get the size; it downloads the file contents in order to return that information.

Ruby solution with head_object:

require 'aws-sdk-s3'


s3 = Aws::S3::Client.new(
  region:            'us-east-1',     # or any other region
  access_key_id:     AWS_ACCESS_KEY_ID,
  secret_access_key: AWS_SECRET_ACCESS_KEY
)


res = s3.head_object(bucket: bucket_name, key: object_key)
file_size = res[:content_length]

If the file is private, we can get the headers through the SDK.

PHP example:

$head = $client->headObject(
    [
        'Bucket' => $bucket,
        'Key' => $key,
    ]
);
$result = (int) ($head->get('ContentLength') ?? 0);

These days you could also use Amazon S3 Inventory which gives you:

Size – The object size in bytes.

If you are looking to do this with a single file, you can use aws s3api head-object to get the metadata only without downloading the file itself:

$ aws s3api head-object --bucket mybucket --key path/to/myfile.csv --query "ContentLength"

Explanation

  • s3api head-object retrieves the object metadata in json format
  • --query "ContentLength" filters the json response to get the size of the body in bytes
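
The same head-object call is available from Python through boto3's `head_object`, whose response includes a `ContentLength` field in bytes. This is only a sketch: the function name `object_size` is illustrative, and the client is passed in so the function can be tried against a stub.

```python
def object_size(client, bucket: str, key: str) -> int:
    """Return the size in bytes of one object without downloading its body."""
    # head_object issues an HTTP HEAD; the response holds metadata only.
    response = client.head_object(Bucket=bucket, Key=key)
    return response["ContentLength"]


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   print(object_size(boto3.client("s3"), "mybucket", "path/to/myfile.csv"))
```

Because this goes through the SDK's signed requests, it also works for private objects, unlike a plain unsigned HEAD against the object URL.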

This is how I did it with the AWS SDK for Java v2.x:

Region region = Region.EU_CENTRAL_1;
S3Client s3client = S3Client.builder().region(region).build();


String bucket = "s3-demo";


HeadObjectRequest headObjectRequest = HeadObjectRequest.builder()
        .bucket(bucket)
        .key(fileName)
        .build();
HeadObjectResponse headObjectResponse = s3client.headObject(headObjectRequest);
fileSize = headObjectResponse.contentLength();