TypeError: 'str' does not support the buffer interface


You can't serialize a Python 3 'string' to bytes without explicitly converting it to some encoding.

outfile.write(plaintext.encode('utf-8'))

is probably what you want. This also works for both Python 2.x and 3.x.
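As a minimal sketch of that one-liner in context, the snippet below writes encoded text to a gzip file opened in binary mode and reads it back. The helper names `write_compressed` and `read_compressed`, and the temporary file path, are assumptions made for illustration and do not come from the original question.

```python
import gzip
import os
import tempfile


def write_compressed(path, plaintext):
    """Encode the str as UTF-8 bytes and write it to a gzip file in binary mode."""
    with gzip.open(path, "wb") as outfile:
        outfile.write(plaintext.encode("utf-8"))


def read_compressed(path):
    """Read the raw bytes back and decode them into a str."""
    with gzip.open(path, "rb") as infile:
        return infile.read().decode("utf-8")


# Hypothetical scratch location, used only for this demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt.gz")
write_compressed(demo_path, "attack at dawn")
round_trip = read_compressed(demo_path)
print(round_trip)
```

The same encoding has to be used on both sides; decoding with a different codec than the one used for writing may fail or give garbled text.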


For Python 3.x you can convert your text to raw bytes with:

bytes("my data", "encoding")

For example:

bytes("attack at dawn", "utf-8")

The returned object will work with outfile.write.
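A short check, assuming nothing beyond the standard library, suggests that `bytes(text, encoding)` and `text.encode(encoding)` produce the same object, and that calling `bytes()` on a str without an encoding is rejected. The variable names are illustrative only.

```python
text = "attack at dawn"

# Two spellings of the same conversion from str to bytes.
via_constructor = bytes(text, "utf-8")
via_method = text.encode("utf-8")
same_result = via_constructor == via_method

# bytes() refuses a str when no encoding is given.
try:
    bytes(text)
    missing_encoding_error = None
except TypeError as exc:
    missing_encoding_error = exc

print(same_result, type(missing_encoding_error).__name__)
```

Which spelling to use is mostly a matter of taste; `str.encode` defaults to UTF-8 when called with no arguments, while the `bytes` constructor always needs the encoding spelled out.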


Best answer

If you use Python 3.x, then string is not the same type as in Python 2.x; you must convert it to bytes (encode it).

import gzip

plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
with gzip.open(filename + ".gz", "wb") as outfile:
    # The file is opened in binary mode, so encode the str to bytes first.
    outfile.write(bytes(plaintext, 'UTF-8'))

Also, do not use variable names like string or file, since those are names of modules or functions.

EDIT @Tom

Yes, non-ASCII text is also compressed/decompressed. I use Polish letters with UTF-8 encoding:

import gzip

plaintext = 'Polish text: ąćęłńóśźżĄĆĘŁŃÓŚŹŻ'
filename = 'foo.gz'
with gzip.open(filename, 'wb') as outfile:
    outfile.write(bytes(plaintext, 'UTF-8'))
with gzip.open(filename, 'r') as infile:
    outfile_content = infile.read().decode('UTF-8')
print(outfile_content)


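There is an easier solution to this problem.

You just need to add a t to the mode so it becomes wt. This causes Python to open the file as a text file rather than a binary file. Then everything will just work.

The complete program becomes this: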

import gzip

plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
with gzip.open(filename + ".gz", "wt") as outfile:
    outfile.write(plaintext)
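One caveat with text mode is that the encoding falls back to a platform-dependent default unless you pass one. The sketch below passes `encoding="utf-8"` explicitly on both the write and the read; the file name and sample text are made up for the example.

```python
import gzip
import os
import tempfile

plaintext = "Polish text: ąćęłńóśźż"
# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "text_mode_demo.gz")

# "wt" makes gzip.open accept str; the encoding is applied on write.
with gzip.open(path, "wt", encoding="utf-8") as outfile:
    outfile.write(plaintext)

# "rt" decodes with the same encoding and returns str.
with gzip.open(path, "rt", encoding="utf-8") as infile:
    restored = infile.read()

print(restored)
```

Being explicit here should keep the file readable on machines whose default encoding is not UTF-8.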


>>> s = bytes("s","utf-8")
>>> print(s)
b's'
>>> s = s.decode("utf-8")
>>> print(s)
s

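Well, if it's useful to you, this removes the annoying 'b' character. If anyone has a better idea, please suggest it or feel free to edit my answer here. I'm just a newbie.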


For Django in django.test.TestCase unit tests, I changed my Python 2 syntax:

def test_view(self):
    response = self.client.get(reverse('myview'))
    self.assertIn(str(self.obj.id), response.content)
    ...

to use the Python 3 .decode('utf8') syntax:

def test_view(self):
    response = self.client.get(reverse('myview'))
    self.assertIn(str(self.obj.id), response.content.decode('utf8'))
    ...
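The reason for the change can be reproduced without Django. In the sketch below, `content` is a made-up bytes value standing in for `response.content` and `obj_id` stands in for `self.obj.id`; both are assumptions for illustration. It shows the str-in-bytes membership test failing and two ways around it.

```python
content = b"<li>object 42</li>"  # stand-in for response.content (bytes)
obj_id = 42                      # stand-in for self.obj.id

# Python 3 will not search for a str inside bytes.
try:
    str(obj_id) in content
    mixed_types_error = None
except TypeError as exc:
    mixed_types_error = exc

# Option 1: decode the bytes and compare str with str.
found_via_decode = str(obj_id) in content.decode("utf8")

# Option 2: encode the needle and compare bytes with bytes.
found_via_encode = str(obj_id).encode("utf8") in content

print(type(mixed_types_error).__name__, found_via_decode, found_via_encode)
```

Either option works as long as both operands end up being the same type.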


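This problem commonly occurs when switching from py2 to py3. In py2, plaintext is both a string and a byte array type; it is type-flexible and can swing both ways. In py3, plaintext is now only a string, which is more definite, and the method outfile.write() actually takes a byte array when outfile is opened in binary mode, so an exception is raised. Change the input to plaintext.encode('utf-8') to fix the problem. Read on if this bothers you.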

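In py2, the declaration of file.write made it look as if you passed in a string: file.write(str). Actually you were passing in a byte array, and you should have read the declaration like this: file.write(bytes). Read that way, the problem is simple: file.write(bytes) needs a bytes type, and in py3, to get bytes out of a str, you convert it: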

py3>> outfile.write(plaintext.encode('utf-8'))

Why did the py2 docs declare that file.write took a string? In py2 the distinction in the declaration did not matter because:

py2>> str==bytes         #str and bytes aliased a single hybrid class in py2
True

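The str-bytes class of py2 has methods/constructors that make it behave like a string class in some ways and like a byte array class in others. Convenient for file.write, isn't it?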

py2>> plaintext='my string literal'
py2>> type(plaintext)
str                              #is it a string or is it a byte array? it's both!


py2>> outfile.write(plaintext)   #can use plaintext as a byte array

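Why did py3 break this nice system? Because in py2 basic string functions did not work for the rest of the world. Want to measure the length of a word with a non-ASCII character?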

py2>> len('¡no')        #length of string=3, length of UTF-8 byte array=4, since with variable-length encoding each non-ASCII char takes 2-4 bytes
4                       #always gives bytes.len not str.len

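All this time you thought you were asking for the len of a string in py2, you were getting the length of the byte array from the encoding. That ambiguity is the fundamental problem with double-duty classes. Which version of any method call do you implement?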

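The good news is that py3 fixes this problem. It disentangles the str and bytes classes. The str class has string-like methods, and the separate bytes class has byte array methods: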

py3>> len('¡ok')       #string
3
py3>> len('¡ok'.encode('utf-8'))     #bytes
4
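The split can also be observed on a binary stream. The following sketch uses `io.BytesIO` as a stand-in for a file opened in binary mode (an assumption made so the example needs no file on disk): writing a str is rejected, writing the encoded bytes succeeds. Note that newer Python 3 releases appear to word the error differently from the message in the title, so only the exception type is checked here.

```python
import io

buffer = io.BytesIO()  # behaves like a file opened with mode "wb"
plaintext = "¡ok"

# Passing a str to a binary stream raises TypeError in Python 3.
try:
    buffer.write(plaintext)
    write_error = None
except TypeError as exc:
    write_error = exc

# Encoding first gives the stream the bytes it expects.
bytes_written = buffer.write(plaintext.encode("utf-8"))

print(type(write_error).__name__, len(plaintext), bytes_written)
```

The count returned by `write` is the number of bytes, which is one more than the number of characters here because '¡' takes two bytes in UTF-8.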

Hopefully knowing this helps demystify the issue and makes the migration pain a little easier to bear.