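When to use []byte or string in Go?

Frequently when writing Go applications, I find myself with the choice of using a []byte or a string. Apart from the obvious mutability of []byte, how do I decide which one to use?

I have several use cases:

  1. A function returns a new []byte. Since the slice capacity is fixed, what reason is there not to return a string?
  2. A []byte is not printed as nicely as a string by default, so I often find myself converting to string for logging purposes. Should it have been a string all along?
  3. When prepending to a []byte, a new underlying array is always created. If the data to prepend is constant, why should this not be a string?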
  1. One difference is that the returned []byte can potentially be reused to hold new data (without a new memory allocation), while a string cannot. Another is that, in the gc implementation at least, a string is one word smaller than a []byte, which can save some memory when many such values are live.

  2. Casting a []byte to string for logging is not necessary. Typical 'text' verbs, like %s, %q work for string and []byte expressions equally. In the other direction the same holds for e.g. %x or % 02x.

  3. It depends on why the concatenation is performed and whether the result will later be combined with something else. If so, []byte may perform better.
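The three points above can be checked with a small program. This is only a sketch: the helper names `appendGreeting` and `describe` are made up for illustration, and the sizes in the comment assume a typical 64-bit platform with the gc toolchain.

```go
package main

import (
	"fmt"
	"unsafe"
)

// appendGreeting (hypothetical helper) rebuilds a message inside buf.
// It keeps buf's capacity, so no new array is allocated while the
// message fits in the existing backing array.
func appendGreeting(buf []byte, name string) []byte {
	buf = buf[:0] // drop old contents, keep the backing array
	buf = append(buf, "hello, "...)
	return append(buf, name...)
}

// describe (hypothetical helper) formats v with the text and hex verbs.
// It is meant to be called with either a string or a []byte.
func describe(v interface{}) string {
	return fmt.Sprintf("%s|%q|% x", v, v, v)
}

func main() {
	// Point 1: a string header is two words (pointer, length); a slice
	// header is three (pointer, length, capacity).
	var s string
	var b []byte
	fmt.Println(unsafe.Sizeof(s), unsafe.Sizeof(b)) // 16 24 on typical 64-bit platforms

	// Point 1: the same buffer can be reused for new data.
	buf := make([]byte, 0, 64)
	buf = appendGreeting(buf, "gopher")
	first := &buf[0]
	buf = appendGreeting(buf, "world")
	fmt.Println(string(buf), first == &buf[0])

	// Point 2: the text and hex verbs treat string and []byte alike.
	fmt.Println(describe("abc"))
	fmt.Println(describe([]byte("abc")))
}
```

Because the verbs behave the same for both types, a log call such as `log.Printf("%s", data)` needs no conversion when `data` is a []byte.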

I've gotten the sense that in Go, more than in any other non-ML style language, the type is used to convey meaning and intended use. So, the best way to figure out which type to use is to ask yourself what the data is.

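A string represents text. Just text. Conceptually, the encoding is not something you have to worry about, and operations work on a character by character basis, regardless of what a 'character' actually is. (In Go specifically, ranging over a string yields one rune at a time, while indexing and len work on bytes.)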

An array represents either binary data or a specific encoding of that data. []byte means that the data is either just a byte stream or a stream of single byte characters. []int16 represents an integer stream or a stream of two byte characters.
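To make the text versus bytes distinction concrete, here is a minimal sketch of how the same value looks as a string, a []byte, and a []rune. The helper name `counts` is hypothetical, and the example assumes the string holds valid UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts (hypothetical helper) reports the size of s in bytes and in
// runes (Unicode code points).
func counts(s string) (nbytes, nrunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "h\u00e9llo" // "héllo"; the é takes two bytes in UTF-8

	nb, nr := counts(s)
	fmt.Println(nb, nr) // 6 5

	// Ranging over a string decodes one rune per iteration; the index
	// is the byte offset where that rune starts.
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println()

	// Converting to []byte copies the data, so the copy is mutable
	// and the original string is left unchanged.
	b := []byte(s)
	b[0] = 'H'
	fmt.Println(string(b), s)

	// Converting to []rune breaks the string into code points.
	fmt.Println(len([]rune(s))) // 5
}
```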

Given the fact that pretty much everything that deals with bytes also has functions to deal with strings and vice versa, I would suggest that instead of asking what you need to do with the data, you ask what that data represents. Then optimize once you have identified the bottlenecks.
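As a rough illustration of that symmetry, the standard `strings` and `bytes` packages expose many functions with the same names. The two wrapper functions below are hypothetical and exist only to show the parallel calls.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// isErrorLine (hypothetical) checks a log line held as a string.
func isErrorLine(line string) bool {
	return strings.HasPrefix(line, "ERROR")
}

// isErrorLineBytes (hypothetical) does the same for a []byte line.
func isErrorLineBytes(line []byte) bool {
	return bytes.HasPrefix(line, []byte("ERROR"))
}

func main() {
	text := "ERROR: disk full"
	raw := []byte(text)

	fmt.Println(isErrorLine(text), isErrorLineBytes(raw))

	// The same operation, spelled the same way, in both packages.
	fmt.Println(strings.Contains(text, "disk"), bytes.Contains(raw, []byte("disk")))
	fmt.Println(strings.ToUpper("go"), string(bytes.ToUpper([]byte("go"))))
	fmt.Println(strings.Index(text, ":"), bytes.IndexByte(raw, ':'))
}
```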

EDIT: This post is where I got the rationale for using type conversion to break up the string.

My advice would be to use string by default when you're working with text. But use []byte instead if one of the following conditions applies:

  • The mutability of a []byte will significantly reduce the number of allocations needed.

  • You are dealing with an API that uses []byte, and avoiding a conversion to string will simplify your code.
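Both conditions can be sketched briefly. The functions `maskDigits` and `countMatching` are hypothetical examples, not part of any library: the first mutates a []byte in place instead of building a new string, and the second stays with the []byte that `bufio.Scanner.Bytes` returns rather than converting each line.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// maskDigits (hypothetical) overwrites ASCII digits in place, so no new
// memory is allocated. A string would need a fresh copy.
func maskDigits(b []byte) {
	for i, c := range b {
		if c >= '0' && c <= '9' {
			b[i] = '#'
		}
	}
}

// countMatching (hypothetical) counts the lines that contain needle.
// Scanner.Bytes returns a []byte, so using the bytes package avoids a
// string conversion for every line.
func countMatching(input string, needle []byte) int {
	sc := bufio.NewScanner(strings.NewReader(input))
	n := 0
	for sc.Scan() {
		if bytes.Contains(sc.Bytes(), needle) {
			n++
		}
	}
	return n
}

func main() {
	card := []byte("card 1234-5678")
	maskDigits(card)
	fmt.Printf("%s\n", card) // card ####-####

	logs := "ok\nerror: a\nok\nerror: b\n"
	fmt.Println(countMatching(logs, []byte("error"))) // 2
}
```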