How do I get the response from S3 getObject in Node.js?

In a Node.js project I am attempting to get data back from S3.

When I use getSignedURL, everything works:

aws.getSignedUrl('getObject', params, function(err, url){
    console.log(url);
});

My params are:

var params = {
    Bucket: "test-aws-imagery",
    Key: "TILES/Level4/A3_B3_C2/A5_B67_C59_Tiles.par"
};

If I take the URL output to the console and paste it into a web browser, it downloads the file I need.

However, if I try to use getObject I get all sorts of odd behavior. I believe I'm just using it incorrectly. This is what I've tried:

aws.getObject(params, function(err, data){
    console.log(data);
    console.log(err);
});

Output:

{
    AcceptRanges: 'bytes',
    LastModified: 'Wed, 06 Apr 2016 20:04:02 GMT',
    ContentLength: '1602862',
    ETag: '9826l1e5725fbd52l88ge3f5v0c123a4"',
    ContentType: 'application/octet-stream',
    Metadata: {},
    Body: <Buffer 01 00 00 00  ... > }


null

So it appears that this is working properly. However, when I put a breakpoint on one of the console.logs, my IDE (NetBeans) throws an error and refuses to show the value of data. While this could just be the IDE, I decided to try other ways to use getObject.

aws.getObject(params).on('httpData', function(chunk){
    console.log(chunk);
}).on('httpDone', function(data){
    console.log(data);
});

This does not output anything. Putting a breakpoint in shows that the code never reaches either of the console.logs. I also tried:

aws.getObject(params).on('success', function(data){
    console.log(data);
});

However, this also doesn't output anything, and putting a breakpoint in shows that the console.log is never reached.

What am I doing wrong?


At first glance it doesn't look like you're doing anything wrong, but you aren't showing all of your code. The following worked for me when I was first checking out S3 and Node:

var AWS = require('aws-sdk');

if (typeof process.env.API_KEY == 'undefined') {
    var config = require('./config.json');
    for (var key in config) {
        if (config.hasOwnProperty(key)) process.env[key] = config[key];
    }
}

var s3 = new AWS.S3({accessKeyId: process.env.AWS_ID, secretAccessKey: process.env.AWS_KEY});
var objectPath = process.env.AWS_S3_FOLDER + '/test.xml';
s3.putObject({
    Bucket: process.env.AWS_S3_BUCKET,
    Key: objectPath,
    Body: "<rss><data>hello Fred</data></rss>",
    ACL: 'public-read'
}, function(err, data){
    if (err) console.log(err, err.stack); // an error occurred
    else {
        console.log(data);               // successful response
        s3.getObject({
            Bucket: process.env.AWS_S3_BUCKET,
            Key: objectPath
        }, function(err, data){
            console.log(data.Body.toString());
        });
    }
});

@aws-sdk/client-s3 (2022 Update)

Since I wrote this answer in 2016, Amazon has released a new JavaScript SDK, @aws-sdk/client-s3. This newer version improves on the original getObject() by always returning a promise instead of opting in via .promise() chained onto getObject(). In addition to that, response.Body is no longer a Buffer but one of Readable | ReadableStream | Blob, which changes the handling of the response.Data slightly. This should be more performant, since we can stream the returned data instead of holding all of the contents in memory, with the trade-off being that it is a bit more verbose to implement.

In the example below the response.Body data will be streamed into an array and then returned as a string. This is the equivalent of my original answer. Alternatively, the response.Body could be piped with stream.Readable.pipe() to an HTTP Response, a File or any other type of stream.Writeable for further usage; this would be the more performant way when getting large objects.
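
A minimal sketch of that piping alternative (before getting to the string example below) might look like the following. It assumes Node.js 15+ so that stream/promises is available; the function name downloadObjectToFile and the localPath argument are made up for illustration:

const { GetObjectCommand, S3Client } = require('@aws-sdk/client-s3')
const { createWriteStream } = require('fs')
const { pipeline } = require('stream/promises')

const client = new S3Client() // Pass in opts to S3 if necessary

// Stream the object body straight to disk instead of buffering it in memory
async function downloadObjectToFile (Bucket, Key, localPath) {
    const response = await client.send(new GetObjectCommand({ Bucket, Key }))
    // In Node.js response.Body is a Readable stream, so it can be piped directly
    await pipeline(response.Body, createWriteStream(localPath))
}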

If you want to use a Buffer, like the original getObject() response, this can be done by wrapping responseDataChunks in Buffer.concat() instead of using Array#join(); this is useful when interacting with binary data (see the sketch after the example below). To note, since Array#join() returns a string, each Buffer instance in responseDataChunks will implicitly have toString() called on it, which defaults to utf8 as its encoding.

const { GetObjectCommand, S3Client } = require('@aws-sdk/client-s3')
const client = new S3Client() // Pass in opts to S3 if necessary

function getObject (Bucket, Key) {
    return new Promise(async (resolve, reject) => {
        const getObjectCommand = new GetObjectCommand({ Bucket, Key })

        try {
            const response = await client.send(getObjectCommand)

            // Store all of data chunks returned from the response data stream
            // into an array then use Array#join() to use the returned contents as a String
            let responseDataChunks = []

            // Handle an error while streaming the response body
            response.Body.once('error', err => reject(err))

            // Attach a 'data' listener to add the chunks of data to our array
            // Each chunk is a Buffer instance
            response.Body.on('data', chunk => responseDataChunks.push(chunk))

            // Once the stream has no more data, join the chunks into a string and return the string
            response.Body.once('end', () => resolve(responseDataChunks.join('')))
        } catch (err) {
            // Handle the error or throw
            return reject(err)
        }
    })
}
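
For the Buffer variant mentioned above, a minimal sketch could look like this. It mirrors getObject() but resolves with Buffer.concat() instead of joining the chunks into a string; the name getObjectBuffer is made up for this sketch:

const { GetObjectCommand, S3Client } = require('@aws-sdk/client-s3')
const client = new S3Client() // Pass in opts to S3 if necessary

function getObjectBuffer (Bucket, Key) {
    return new Promise(async (resolve, reject) => {
        try {
            const response = await client.send(new GetObjectCommand({ Bucket, Key }))
            const responseDataChunks = []

            // Each chunk is a Buffer instance; reject if the stream errors out
            response.Body.once('error', err => reject(err))
            response.Body.on('data', chunk => responseDataChunks.push(chunk))

            // Concatenate the Buffers instead of implicitly calling toString() on each one
            response.Body.once('end', () => resolve(Buffer.concat(responseDataChunks)))
        } catch (err) {
            return reject(err)
        }
    })
}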

Comments on using Readable.toArray()

Using Readable.toArray() instead of working with the stream events directly might be more convenient, but its performance is worse. It works by reading all of the response data chunks into memory before moving on. Since this removes all of the benefits of streaming, the Node.js documentation discourages this approach; a short sketch of it follows the quote below.

Because this method reads the entire stream into memory, it negates the benefits of streams. It's intended for interoperability and convenience, not as the primary way to consume streams. Documentation Link

@aws-sdk/client-s3 documentation link
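
For completeness, a minimal sketch of that toArray() convenience approach might look like this. It assumes a Node.js version where Readable.toArray() exists (16.15+); the name getObjectAsString is made up for this sketch:

const { GetObjectCommand, S3Client } = require('@aws-sdk/client-s3')
const client = new S3Client() // Pass in opts to S3 if necessary

// Convenient, but buffers the entire object in memory before returning
async function getObjectAsString (Bucket, Key) {
    const response = await client.send(new GetObjectCommand({ Bucket, Key }))
    const chunks = await response.Body.toArray() // Readable.toArray() is experimental
    return Buffer.concat(chunks).toString('utf-8')
}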

aws-sdk (Original Answer)

When doing a getObject() from the S3 API, per the docs the contents of your file are located in the Body property, which you can see in your sample output. You should have code that looks something like the following:

const aws = require('aws-sdk');
const s3 = new aws.S3(); // Pass in opts to S3 if necessary

var getParams = {
    Bucket: 'abc', // your bucket name,
    Key: 'abc.txt' // path to the object you're looking for
}

s3.getObject(getParams, function(err, data) {
    // Handle any error and exit
    if (err)
        return err;

    // No error happened
    // Convert Body from a Buffer to a String
    let objectData = data.Body.toString('utf-8'); // Use the encoding necessary
});

You may not need to create a new Buffer from the data.Body object, but if you need to, you can use the sample above to achieve that.
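
For illustration, a minimal sketch of that (inside the same getObject callback; in the Node.js aws-sdk v2, data.Body is typically already a Buffer, so an explicit copy is rarely required) might be:

// Make an explicit Buffer copy of data.Body if you need one (usually unnecessary)
let objectBuffer = Buffer.from(data.Body);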

Alternatively, you could use the Minio-js client library get-object.js:

var Minio = require('minio')

var s3Client = new Minio({
    endPoint: 's3.amazonaws.com',
    accessKey: 'YOUR-ACCESSKEYID',
    secretKey: 'YOUR-SECRETACCESSKEY'
})

var size = 0
// Get a full object.
s3Client.getObject('my-bucketname', 'my-objectname', function(e, dataStream) {
    if (e) {
        return console.log(e)
    }
    dataStream.on('data', function(chunk) {
        size += chunk.length
    })
    dataStream.on('end', function() {
        console.log("End. Total size = " + size)
    })
    dataStream.on('error', function(e) {
        console.log(e)
    })
})

Disclaimer: I work for Minio. It's open-source, S3-compatible object storage written in golang, with client libraries available in Java, Python, JS and golang.

Based on @peteb's answer, but using Promises and Async/Await:

const AWS = require('aws-sdk');

const s3 = new AWS.S3();

async function getObject (bucket, objectKey) {
    try {
        const params = {
            Bucket: bucket,
            Key: objectKey
        }

        const data = await s3.getObject(params).promise();

        return data.Body.toString('utf-8');
    } catch (e) {
        throw new Error(`Could not retrieve file from S3: ${e.message}`)
    }
}

// To retrieve you need to use `await getObject()` or `getObject().then()`
const myObject = await getObject('my-bucket', 'path/to/the/object.txt');

For those looking for a NEST JS TYPESCRIPT version of the above:

/**
 * to fetch a signed URL of a file
 * @param key key of the file to be fetched
 * @param bucket name of the bucket containing the file
 */
public getFileUrl(key: string, bucket?: string): Promise<string> {
    var scopeBucket: string = bucket ? bucket : this.defaultBucket;
    var params: any = {
        Bucket: scopeBucket,
        Key: key,
        Expires: signatureTimeout  // const value: 30
    };
    return this.account.getSignedUrlPromise(getSignedUrlObject, params);
}

/**
 * to get the downloadable file buffer of the file
 * @param key key of the file to be fetched
 * @param bucket name of the bucket containing the file
 */
public async getFileBuffer(key: string, bucket?: string): Promise<Buffer> {
    var scopeBucket: string = bucket ? bucket : this.defaultBucket;
    var params: GetObjectRequest = {
        Bucket: scopeBucket,
        Key: key
    };
    var fileObject: GetObjectOutput = await this.account.getObject(params).promise();
    return Buffer.from(fileObject.Body.toString());
}

/**
 * to upload a file stream onto AWS S3
 * @param stream file buffer to be uploaded
 * @param key key of the file to be uploaded
 * @param bucket name of the bucket
 */
public async saveFile(file: Buffer, key: string, bucket?: string): Promise<any> {
    var scopeBucket: string = bucket ? bucket : this.defaultBucket;
    var params: any = {
        Body: file,
        Bucket: scopeBucket,
        Key: key,
        ACL: 'private'
    };
    var uploaded: any = await this.account.upload(params).promise();
    if (uploaded && uploaded.Location && uploaded.Bucket === scopeBucket && uploaded.Key === key)
        return uploaded;
    else {
        throw new HttpException("Error occurred while uploading a file stream", HttpStatus.BAD_REQUEST);
    }
}

This is the async / await version

var getObjectAsync = async function(bucket, key) {
    try {
        const data = await s3
            .getObject({ Bucket: bucket, Key: key })
            .promise();
        var contents = data.Body.toString('utf-8');
        return contents;
    } catch (err) {
        console.log(err);
    }
}

var getObject = async function(bucket, key) {
    const contents = await getObjectAsync(bucket, key);
    console.log(contents.length);
    return contents;
}

getObject(bucket, key);

Extremely similar to @ArianAcosta's answer above, except that I'm using import (for Node 12.x and up), adding the AWS config, sniffing for an image payload, and applying base64 processing to the return.

// using v2.x of aws-sdk
import aws from 'aws-sdk'

aws.config.update({
    accessKeyId: process.env.YOUR_AWS_ACCESS_KEY_ID,
    secretAccessKey: process.env.YOUR_AWS_SECRET_ACCESS_KEY,
    region: "us-east-1" // or whatever
})

const s3 = new aws.S3();

/**
 * getS3Object()
 *
 * @param { string } bucket - the name of your bucket
 * @param { string } objectKey - object you are trying to retrieve
 * @returns { string } - data, formatted
 */
export async function getS3Object (bucket, objectKey) {
    try {
        const params = {
            Bucket: bucket,
            Key: objectKey
        }

        const data = await s3.getObject(params).promise();

        // Check for image payload and formats appropriately
        if( data.ContentType === 'image/jpeg' ) {
            return data.Body.toString('base64');
        } else {
            return data.Body.toString('utf-8');
        }

    } catch (e) {
        throw new Error(`Could not retrieve file from S3: ${e.message}`)
    }
}

Converting GetObjectOutput.Body to a Promise<string> using node-fetch

In aws-sdk-js-v3 (@aws-sdk/client-s3), GetObjectOutput.Body is a subclass of Readable in nodejs (specifically an instance of http.IncomingMessage) instead of a Buffer as it was in aws-sdk v2, so resp.Body.toString('utf-8') will give you the wrong result "[object Object]". Instead, the easiest way to convert a GetObjectOutput.Body into a Promise<string> is to construct a node-fetch Response, which accepts a Readable subclass (or a Buffer instance, among other types) and has the conversion methods .json(), .text(), .arrayBuffer() and .blob().

This should also work with the other variants of aws-sdk and platforms (@aws-sdk v3 node Readable, v3 browser Uint8Array subclass, v2 node Readable, v2 browser ReadableStream | Blob).

npm install node-fetch
import { Response } from 'node-fetch';
import * as s3 from '@aws-sdk/client-s3';


const client = new s3.S3Client({})
const s3Response = await client.send(new s3.GetObjectCommand({Bucket: '…', Key: '…'}));
const response = new Response(s3Response.Body);


const obj = await response.json();
// or
const text = await response.text();
// or
const buffer = Buffer.from(await response.arrayBuffer());
// or
const blob = await response.blob();


References: GetObjectOutput.Body documentation, node-fetch Response documentation, node-fetch Body constructor source, minipass-fetch Body constructor source

Thanks to kennu's comment on the GetObjectCommand usability issue.

Update (2022)

If this API is available in your Node version, the code can be quite short:

const buffer = Buffer.concat(
    await (
        await s3Client
            .send(new GetObjectCommand({
                Key: '<key>',
                Bucket: '<bucket>',
            }))
    ).Body.toArray()
)

If you are using TypeScript, you can safely cast the .Body part to Readable (the other types, ReadableStream and Blob, are only returned in browser environments; and in the browser, Blob is only used by the legacy fetch API when response.body is not supported):

(response.Body as Readable).toArray()

Please note: Readable.toArray() is an experimental (but convenient) feature, so use it with care.


=============

Original Answer

If you are using aws-sdk v3, the v3 SDK returns a nodejs Readable (to be precise, an IncomingMessage, which extends Readable) instead of a Buffer.

Here is a TypeScript version. Note that this is for Node only; if you send the request from a browser, check the longer answer in the blog post mentioned below.

import {GetObjectCommand, S3Client} from '@aws-sdk/client-s3'
import type {Readable} from 'stream'

const s3Client = new S3Client({
    apiVersion: '2006-03-01',
    region: 'us-west-2',
    credentials: {
        accessKeyId: '<access key>',
        secretAccessKey: '<access secret>',
    }
})

const response = await s3Client
    .send(new GetObjectCommand({
        Key: '<key>',
        Bucket: '<bucket>',
    }))

const stream = response.Body as Readable

return new Promise<Buffer>((resolve, reject) => {
    const chunks: Buffer[] = []
    stream.on('data', chunk => chunks.push(chunk))
    stream.once('end', () => resolve(Buffer.concat(chunks)))
    stream.once('error', reject)
})
// if readable.toArray() is supported
// return Buffer.concat(await stream.toArray())

Why do we have to cast response.Body as Readable? The answer is too long to include here; interested readers can find more information in my blog post.

The toString() method no longer works with the latest version of the S3 API:

const { S3Client, GetObjectCommand } = require("@aws-sdk/client-s3");

const streamToString = (stream) =>
    new Promise((resolve, reject) => {
        const chunks = [];
        stream.on("data", (chunk) => chunks.push(chunk));
        stream.on("error", reject);
        stream.on("end", () => resolve(Buffer.concat(chunks).toString("utf8")));
    });

(async () => {
    const region = "us-west-2";
    const client = new S3Client({ region });

    const command = new GetObjectCommand({
        Bucket: "test-aws-sdk-js-1877",
        Key: "readme.txt",
    });

    const { Body } = await client.send(command);
    const bodyContents = await streamToString(Body);
    console.log(bodyContents);
})();

Copy-pasted from here: https://github.com/aws/aws-sdk-js-v3/issues/1877#issuecomment-755387549

Not sure why this solution hasn't already been added as I think it is cleaner than the top answer.

Using Express and AWS SDK v3:

public downloadFeedFile = (req: IFeedUrlRequest, res: Response) => {
    const downloadParams: GetObjectCommandInput = parseS3Url(req.s3FileUrl.replace(/\s/g, ''));
    logger.info("requesting S3 file " + JSON.stringify(downloadParams));
    const run = async () => {
        try {
            const fileStream = await this.s3Client.send(new GetObjectCommand(downloadParams));
            if (fileStream.Body instanceof Readable) {
                fileStream.Body.once('error', err => {
                    console.error("Error downloading s3 file")
                    console.error(err);
                });

                fileStream.Body.pipe(res);
            }
        } catch (err) {
            logger.error("Error", err);
        }
    };

    run();
};