MongoDB 中的字符串字段值长度

字段的数据类型是 String。我想获取字段名字符长度大于40的数据。

我尝试了这些查询,但返回错误。 1.

db.usercollection.find(
{$where: "(this.name.length > 40)"}
).limit(2);


output :error: {
"$err" : "TypeError: Cannot read property 'length' of undefined near '40)' ",
"code" : 16722
}

这是在2.4.9中工作的,但我的版本是2.6

161581 次浏览

For MongoDB 3.6 and newer:

The $expr operator allows the use of aggregation expressions within the query language, thus you can leverage the use of $strLenCP operator to check the length of the string as follows:

db.usercollection.find({
name: { $exists: true },
$expr: { $gt: [{ $strLenCP: '$name' }, 40] }
})

For MongoDB 3.4 and newer:

You can also use the aggregation framework with the $redact pipeline operator that allows you to proccess the logical condition with the $cond operator and uses the special operations $$KEEP to "keep" the document where the logical condition is true or $$PRUNE to "remove" the document where the condition was false.

This operation is similar to having a $project pipeline that selects the fields in the collection and creates a new field that holds the result from the logical condition query and then a subsequent $match, except that $redact uses a single pipeline stage which is more efficient.

As for the logical condition, there are String Aggregation Operators that you can use $strLenCP operator to check the length of the string. If the length is $gt a specified value, then this is a true match and the document is "kept". Otherwise it is "pruned" and discarded.


Consider running the following aggregate operation which demonstrates the above concept:

db.usercollection.aggregate([
{ $match: { name: { $exists: true } } },
{ $redact: {
$cond: [
{ $gt: [ { $strLenCP: "$name" }, 40] },
"$$KEEP",
"$$PRUNE"
]
} },
{ $limit: 2 }
])

If using $where, try your query without the enclosing brackets:

db.usercollection.find({ $where: "this.name.length > 40" }).limit(2);

A better query would be to to check for the field's existence and then check the length:

db.usercollection.find({ name: { $type: 2 }, $where: "this.name.length > 40" }).limit(2);

or:

db.usercollection.find({ name: { $exists: true }, $where: "this.name.length >
40" }).limit(2);

MongoDB evaluates non-$where query operations before $where expressions and non-$where query statements may use an index. A much better performance is to store the length of the string as another field and then you can index or search on it; applying $where will be much slower compared to that. It's recommended to use JavaScript expressions and the $where operator as a last resort when you can't structure the data in any other way, or when you are dealing with a small subset of data.


A different and faster approach that avoids the use of the $where operator is the $regex operator. Consider the following pattern which searches for

db.usercollection.find({"name": {"$type": 2, "$regex": /^.{41,}$/}}).limit(2);

Note - From the docs:

If an index exists for the field, then MongoDB matches the regular expression against the values in the index, which can be faster than a collection scan. Further optimization can occur if the regular expression is a “prefix expression”, which means that all potential matches start with the same string. This allows MongoDB to construct a “range” from that prefix and only match against those values from the index that fall within that range.

A regular expression is a “prefix expression” if it starts with a caret (^) or a left anchor (\A), followed by a string of simple symbols. For example, the regex /^abc.*/ will be optimized by matching only against the values from the index that start with abc.

Additionally, while /^a/, /^a.*/, and /^a.*$/ match equivalent strings, they have different performance characteristics. All of these expressions use an index if an appropriate index exists; however, /^a.*/, and /^a.*$/ are slower. /^a/ can stop scanning after matching the prefix.

I had a similar kind of scenario, but in my case string is not a 1st level attribute. It is inside an object. In here I couldn't find a suitable answer for it. So I thought to share my solution with you all(Hope this will help anyone with a similar kind of problem).

Parent Collection


{
"Child":
{
"name":"Random Name",
"Age:"09"
}
}

Ex: If we need to get only collections that having child's name's length is higher than 10 characters.

 db.getCollection('Parent').find({$where: function() {
for (var field in this.Child.name) {
if (this.Child.name.length > 10)
return true;


}
}})

Here is one of the way in mongodb you can achieve this.

db.usercollection.find({ $where: 'this.name.length < 4' })

Queries with $where and $expr are slow if there are too many documents.

Using $regex is much faster than $where, $expr.

db.usercollection.find({
"name": /^[\s\S]{40,}$/, // name.length >= 40
})


or


db.usercollection.find({
"name": { "$regex": "^[\s\S]{40,}$" }, // name.length >= 40
})

This query is the same meaning with

db.usercollection.find({
"$where": "this.name && this.name.length >= 40",
})


or


db.usercollection.find({
"name": { "$exists": true },
"$expr": { "$gte": [ { "$strLenCP": "$name" }, 40 ] }
})

I tested each queries for my collection.

# find
$where: 10529.359ms
$expr: 5305.801ms
$regex: 2516.124ms


# count
$where: 10872.006ms
$expr: 2630.155ms
$regex: 158.066ms

This query will give both field value and length:

db.usercollection.aggregate([
{
$project: {
"name": 1,
"length": { $strLenCP: "$name" }
}} ])