Get the String length in characters in Rust

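According to the Rust documentation, the String::len method returns the number of bytes that make up the string, which may not match its length in characters.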

For example, for the following Japanese string, len() returns 30, which is the number of bytes, rather than 10, the number of characters:

let s = String::from("ラウトは難しいです!");
s.len() // returns 30.

The only way I have found to get the number of characters is the following:

s.chars().count()

This returns 10, which is the correct number of characters.

Is there any method on String that returns the character count, aside from the one I am using above?

Is there any method on String that returns the characters count, aside from the one I am using above?

No. Using s.chars().count() is correct. Note that this is an O(N) operation (UTF-8 is a variable-width encoding, so the whole string has to be scanned), while getting the number of bytes is an O(1) operation.
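To make the difference concrete, here is a minimal sketch using only the standard library. The helper names byte_len, char_count and char_widths are made up for illustration and are not part of String or str; they only wrap len(), chars().count() and char::len_utf8.

```rust
/// Number of bytes in the UTF-8 encoding of `s`.
/// O(1): a string slice carries its byte length with it.
fn byte_len(s: &str) -> usize {
    s.len()
}

/// Number of `char`s (Unicode scalar values) in `s`.
/// O(N): the iterator has to walk the whole string to find
/// where each variable-width character starts.
fn char_count(s: &str) -> usize {
    s.chars().count()
}

/// Encoded width in bytes of every `char` in `s`.
/// Each width is between 1 and 4, which is why the byte length
/// alone cannot tell you the character count.
fn char_widths(s: &str) -> Vec<usize> {
    s.chars().map(char::len_utf8).collect()
}
```

For the input "a\u{e9}\u{96e3}" (that is, 'a', 'é' and '難'), byte_len gives 6, char_count gives 3, and char_widths gives [1, 2, 3].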

You can see all the methods on str for yourself.

As pointed out in the comments, a char is a specific concept:

It's important to remember that char represents a Unicode Scalar Value, and may not match your idea of what a 'character' is. Iteration over grapheme clusters may be what you actually want.

One such example is with precomposed characters:

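fn main() {
    // Decomposed form: 'e' followed by U+0301 COMBINING ACUTE ACCENT.
    println!("{}", "e\u{301}".chars().count()); // 2
    // Precomposed form: the single scalar value U+00E9.
    println!("{}", "\u{e9}".chars().count()); // 1
}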

You may prefer to use graphemes from the unicode-segmentation crate instead:

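use unicode_segmentation::UnicodeSegmentation; // 1.6.0

fn main() {
    // Decomposed form: 'e' followed by U+0301 COMBINING ACUTE ACCENT.
    println!("{}", "e\u{301}".graphemes(true).count()); // 1
    // Precomposed form: the single scalar value U+00E9.
    println!("{}", "\u{e9}".graphemes(true).count()); // 1
}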