How to get the size of a single document in MongoDB?

I ran into some strange MongoDB behavior and would like to understand it.
My request is simple: I would like to get the size of a single document in a collection. I found two possible solutions:

  • Object.bsonsize - a JavaScript method that should return the size in bytes
  • db.collection.stats() - its output has an 'avgObjSize' field that gives an aggregated (average) view of the data: the average size of a single document.

When I create a test collection with only one document, the two functions return different values. How is that possible?
Is there some other method to get the size of a MongoDB document?

Here is the code I tested with:

  1. I created a new database 'test' and inserted a simple document with only one attribute: type:"auto"

    db.test.insert({type:"auto"})
    
  2. output from stats() function call: db.test.stats():

    {
        "ns" : "test.test",
        "count" : 1,
        "size" : 40,
        "avgObjSize" : 40,
        "storageSize" : 4096,
        "numExtents" : 1,
        "nindexes" : 1,
        "lastExtentSize" : 4096,
        "paddingFactor" : 1,
        "systemFlags" : 1,
        "userFlags" : 0,
        "totalIndexSize" : 8176,
        "indexSizes" : {
            "_id_" : 8176
        },
        "ok" : 1
    }

  3. output from bsonsize function call: Object.bsonsize(db.test.find({test:"auto"}))

    481
    

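Because of the record padding mechanism, the effective amount of space a document occupies in the collection is greater than the size of the document itself.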

This is why there is a difference between the outputs of the db.test.stats() and Object.bsonsize(..).

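To get the exact size (in bytes) of the document, stick to the Object.bsonsize() function.

In the earlier Object.bsonsize() call, MongoDB returned the size of the cursor, not of the document, because find() returns a cursor rather than a document.

The correct way is to use this command: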

Object.bsonsize(db.test.findOne())

With findOne(), you can define a query for a specific document:

Object.bsonsize(db.test.findOne({type:"auto"}))

This returns the correct size (in bytes) of that particular document.
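
To see where those bytes come from, here is a minimal sketch in plain JavaScript (runnable in Node, no database needed) that adds up the BSON layout by hand. It is not how the shell implements Object.bsonsize(). The function name bsonSizeSketch and the { $oid: "..." } stand-in for an ObjectId are assumptions made for illustration, and only strings, booleans, ObjectId stand-ins and nested plain objects are covered.

```javascript
// Sketch only: sums the BSON byte layout for a few value types.
// bsonSizeSketch and the { $oid: "..." } ObjectId stand-in are hypothetical names.
function bsonSizeSketch(doc) {
  let size = 4 + 1; // int32 total-length prefix + trailing 0x00 terminator
  for (const [key, value] of Object.entries(doc)) {
    size += 1;                                  // element type byte
    size += Buffer.byteLength(key, "utf8") + 1; // field name as a NUL-terminated cstring
    if (typeof value === "string") {
      size += 4 + Buffer.byteLength(value, "utf8") + 1; // int32 length + bytes + NUL
    } else if (typeof value === "boolean") {
      size += 1;                                // one byte, 0x00 or 0x01
    } else if (value !== null && typeof value === "object" && typeof value.$oid === "string") {
      size += 12;                               // an ObjectId is always 12 bytes
    } else if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      size += bsonSizeSketch(value);            // embedded document
    } else {
      throw new Error("type not covered by this sketch: " + key);
    }
  }
  return size;
}

// The document from the question, including the _id that insert() adds automatically.
const sample = { _id: { $oid: "000000000000000000000000" }, type: "auto" };
console.log(bsonSizeSketch(sample)); // 37
```

The 37 bytes break down as 4 (length prefix) + 17 (_id element) + 15 (type element) + 1 (terminator), which is a little below the 40 that stats() reported in the question.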

The maximum size of a document is 16 MiB (source)
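
As a rough illustration of that limit, the sketch below compares a size in bytes with 16 MiB. describeSize is a hypothetical helper name, not a MongoDB API, and the sketch assumes a document of exactly 16 MiB still fits. In the shell you would pass it the result of Object.bsonsize(db.test.findOne()).

```javascript
// Sketch: compare a BSON size in bytes with the 16 MiB document limit.
// describeSize is a hypothetical helper, not part of MongoDB.
const MAX_BSON_BYTES = 16 * 1024 * 1024; // 16 MiB = 16777216 bytes

function describeSize(bytes) {
  return {
    bytes: bytes,
    KiB: bytes / 1024,
    MiB: bytes / (1024 * 1024),
    percentOfLimit: (bytes / MAX_BSON_BYTES) * 100,
    fits: bytes <= MAX_BSON_BYTES // assumption: the limit itself is allowed
  };
}

console.log(describeSize(37));
console.log(describeSize(MAX_BSON_BYTES + 1).fits); // false
```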


If you have version >=4.4 ($bsonSize source)

db.users.aggregate([
  {
    "$project": {
      "size_bytes": { "$bsonSize": "$$ROOT" },
      "size_KB": { "$divide": [{"$bsonSize": "$$ROOT"}, 1000] },
      "size_MB": { "$divide": [{"$bsonSize": "$$ROOT"}, 1000000] }
    }
  }
])

If you have version <4.4 (Object.bsonsize() source)

You can use this script to get the real size:

db.users.find().forEach(function(obj)
{
  var size = Object.bsonsize(obj);
  print('_id: '+obj._id+' || Size: '+size+'B -> '+Math.round(size/(1000))+'KB -> '+Math.round(size/(1000*1000))+'MB (max 16MB)');
});

Note: if your IDs are 64-bit integers, the script above will truncate the ID value when printing! If that is the case, you can use this instead:

db.users.find().forEach(function(obj)
{
  var size = Object.bsonsize(obj);
  var stats =
  {
    '_id': obj._id,
    'bytes': size,
    'KB': Math.round(size/(1000)),
    'MB': Math.round(size/(1000*1000))
  };
  // printjson() prints the object as JSON; print() may show only [object Object]
  printjson(stats);
});

This also has the advantage of returning JSON, so a GUI like RoboMongo can tabulate it!


Edit: thanks to @zAlbee for the suggestion that completed this answer.

Object.bsonsize(db.test.findOne({type:"auto"})) gives the size in bytes.

With MongoDB 4.4 and later, you can use the $bsonSize operator to get the document size.

db.test.aggregate([
  {
    "$project": {
      "name": 1,
      "object_size": { "$bsonSize": "$$ROOT" }
    }
  }
])
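
The same operator can also help find the largest documents in a collection. A minimal sketch, assuming MongoDB 4.4+ and the test collection from the question; the variable name largestDocsPipeline and the limit of 3 are illustrative choices. The pipeline is built as a plain array so it can be inspected before it is passed to aggregate().

```javascript
// Sketch: pipeline that lists the three largest documents in a collection.
// Requires MongoDB 4.4+ for $bsonSize; the variable name is illustrative.
const largestDocsPipeline = [
  { $project: { object_size: { $bsonSize: "$$ROOT" } } }, // _id is kept by default
  { $sort: { object_size: -1 } },                          // largest first
  { $limit: 3 }                                            // keep the top three
];

// In the shell: db.test.aggregate(largestDocsPipeline)
console.log(JSON.stringify(largestDocsPipeline, null, 2));
```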

The method Object.bsonsize() is available only in the legacy mongo shell. In the new mongosh you have to use the bson package. Snippets:

const BSON = require("bson");


BSON.calculateObjectSize({field: "value"})


BSON.calculateObjectSize(db.test.findOne())