Converting int to bytes in Python 3


That's the way it was designed, and it makes sense because you would usually call bytes on an iterable rather than on a single integer:

>>> bytes([3])
b'\x03'

The docs state this, as does the docstring for bytes:

>>> help(bytes)
...
bytes(int) -> bytes object of size given by the parameter initialized with null bytes
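
To make the difference concrete, here is a minimal sketch contrasting the two call forms described above. The variable names are illustrative only.

```python
# bytes(n) with an int allocates n zero bytes;
# bytes(iterable) uses each item as one byte value.
zero_filled = bytes(3)
single_byte = bytes([3])

print(zero_filled)   # b'\x00\x00\x00'
print(single_byte)   # b'\x03'
```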


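From the bytes docs:

Accordingly, constructor arguments are interpreted as for bytearray().

Then, from the bytearray docs:

The optional source parameter can be used to initialize the array in a few different ways:

If it is an integer, the array will have that size and will be initialized with null bytes.

Note that this differs from the 2.x (where x >= 6) behavior, where bytes is simply str: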

>>> bytes is str
True

PEP 3112:

The 2.6 str differs from 3.0's bytes type in various ways; most notably, the constructor is completely different.


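The behavior comes from the fact that before version 3, bytes was just an alias for str. In Python 3.x, bytes is an immutable version of bytearray: a completely new type that is not backwards compatible.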
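
As a rough illustration of the immutable/mutable split mentioned above, the sketch below (names are made up for the example) modifies a bytearray in place and shows that the same assignment on bytes is rejected.

```python
frozen = bytes([1, 2, 3])
buffer = bytearray(frozen)   # mutable copy of the same byte values

buffer[0] = 255              # bytearray supports item assignment

error = None
try:
    frozen[0] = 255          # bytes does not support item assignment
except TypeError as exc:
    error = exc

print(buffer)                # bytearray(b'\xff\x02\x03')
print(type(error).__name__)  # TypeError
```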


The documentation says:

bytes(int) -> bytes object of size given by the parameter
initialized with null bytes

The sequence:

b'3\r\n'

It is the character '3' (decimal 51), the character '\r' (13) and the character '\n' (10).

So one way to build it is from those values, for example:

>>> bytes([51, 13, 10])
b'3\r\n'


>>> bytes('3', 'utf8') + b'\r\n'
b'3\r\n'


>>> n = 3
>>> bytes(str(n), 'ascii') + b'\r\n'
b'3\r\n'

Tested on IPython 1.1.0 & Python 3.2.3
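
The last variant can be wrapped in a small helper. This is only a sketch; the function name int_to_ascii_line is hypothetical and not part of the answer above.

```python
def int_to_ascii_line(n: int) -> bytes:
    """Return the decimal digits of n as ASCII bytes, followed by CRLF."""
    return str(n).encode('ascii') + b'\r\n'


print(int_to_ascii_line(3))    # b'3\r\n'
print(list(int_to_ascii_line(3)))  # [51, 13, 10]
```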


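You can use the struct module's pack:

In [10]: import struct

In [11]: struct.pack(">I", 1)
Out[11]: '\x00\x00\x00\x01'

The ">" is the byte order (big-endian) and the "I" is the format character. So you can be specific if you want to do something else: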

In [12]: struct.pack("<H", 1)
Out[12]: '\x01\x00'


In [13]: struct.pack("B", 1)
Out[13]: '\x01'

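This works the same way on both Python 2 and Python 3; on Python 3 the results are bytes objects and are displayed with a b prefix, for example b'\x01'.

Note: the inverse operation (bytes to int) can be done with unpack.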
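
A short sketch of the round trip with unpack, under the assumption that the same format string is used in both directions. It also shows that pack rejects values that do not fit the chosen format.

```python
import struct

packed = struct.pack('>I', 1)            # 4-byte big-endian unsigned int
(value,) = struct.unpack('>I', packed)   # unpack always returns a tuple

print(packed)   # b'\x00\x00\x00\x01'
print(value)    # 1

# 'B' is a single unsigned byte, so 256 is out of range.
try:
    struct.pack('B', 256)
    out_of_range_rejected = False
except struct.error:
    out_of_range_rejected = True
```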


From Python 3.2 onward, you can use to_bytes:

>>> (1024).to_bytes(2, byteorder='big')
b'\x04\x00'

def int_to_bytes(x: int) -> bytes:
    return x.to_bytes((x.bit_length() + 7) // 8, 'big')


def int_from_bytes(xbytes: bytes) -> int:
    return int.from_bytes(xbytes, 'big')

Accordingly, x == int_from_bytes(int_to_bytes(x)). Note that the above encoding works only for unsigned (non-negative) integers.

For signed integers, the bit length is a bit trickier to calculate:

def int_to_bytes(number: int) -> bytes:
    return number.to_bytes(length=(8 + (number + (number < 0)).bit_length()) // 8, byteorder='big', signed=True)


def int_from_bytes(binary_data: bytes) -> int:
    return int.from_bytes(binary_data, byteorder='big', signed=True)
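
The signed variant can be checked around the byte boundaries, where an off-by-one in the length formula would show up. The sketch below restates the two functions so it runs on its own; the list of boundary values is just a convenient choice for the example.

```python
def int_to_bytes(number: int) -> bytes:
    # One extra bit is reserved for the sign; negative numbers are shifted
    # by one so that values like -128 still fit in a single byte.
    length = (8 + (number + (number < 0)).bit_length()) // 8
    return number.to_bytes(length, byteorder='big', signed=True)


def int_from_bytes(binary_data: bytes) -> int:
    return int.from_bytes(binary_data, byteorder='big', signed=True)


boundaries = (-32769, -32768, -129, -128, -1, 0, 1, 127, 128, 32767, 32768)
for n in boundaries:
    assert int_from_bytes(int_to_bytes(n)) == n

print(int_to_bytes(-128))  # b'\x80'
print(int_to_bytes(128))   # b'\x00\x80'
```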


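The ASCIIfication of 3 is "\x33", not "\x03"!

That is what Python does for str(3), but it would be totally wrong for bytes, because they should be treated as arrays of binary data and not be abused as strings.

The easiest way to achieve what you want is bytes((3,)), which is better than bytes([3]) because initializing a list is much more expensive, so never use lists when you can use tuples. You can convert bigger integers with int.to_bytes, for example int.to_bytes(3, 1, "little") (value, length in bytes, byte order).

Initializing bytes with a given length makes sense and is the most useful behavior, because bytes objects are often used to create some kind of buffer for which memory of a given size has to be allocated. I often use this when initializing arrays or when expanding a file by writing zeros to it.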


Best answer

Python 3.5+ introduces %-interpolation (printf-style formatting) for bytes:

>>> b'%d\r\n' % 3
b'3\r\n'

See PEP 0461: Adding % formatting to bytes and bytearray.

On earlier versions, you could use str and call .encode('ascii') on the result:

>>> s = '%d\r\n' % 3
>>> s.encode('ascii')
b'3\r\n'

Note: this is different from what int.to_bytes produces:

>>> n = 3
>>> n.to_bytes((n.bit_length() + 7) // 8, 'big') or b'\0'
b'\x03'
>>> b'3' == b'\x33' != b'\x03'
True


An int (including Python 2's long) can be converted to bytes using the following function:

import codecs


def int2bytes(i):
    hex_value = '{0:x}'.format(i)
    # make length of hex_value a multiple of two
    hex_value = '0' * (len(hex_value) % 2) + hex_value
    return codecs.decode(hex_value, 'hex_codec')

The reverse conversion can be done by another function:

import codecs
import six  # should be installed via 'pip install six'


long = six.integer_types[-1]


def bytes2int(b):
    return long(codecs.encode(b, 'hex_codec'), 16)

Both functions work on both Python 2 and Python 3.


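I was curious about the performance of the various methods for a single int in the range [0, 255], so I decided to do some timing tests.

Based on the timings below, and on the general trend I observed from trying many different values and configurations, struct.pack seems to be the fastest, followed by int.to_bytes and bytes, with str.encode (unsurprisingly) being the slowest. Note that the results vary more than these numbers suggest, and int.to_bytes and bytes sometimes switched speed ranking during testing, but struct.pack is clearly the fastest.

Results in CPython 3.7 on Windows: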

Testing with 63:
bytes_: 100000 loops, best of 5: 3.3 usec per loop
to_bytes: 100000 loops, best of 5: 2.72 usec per loop
struct_pack: 100000 loops, best of 5: 2.32 usec per loop
chr_encode: 50000 loops, best of 5: 3.66 usec per loop

Test module (named int_to_byte.py):

"""Functions for converting a single int to a bytes object with that int's value."""


import random
import shlex
import struct
import timeit


def bytes_(i):
    """From Tim Pietzcker's answer:
    https://stackoverflow.com/a/21017834/8117067
    """
    return bytes([i])


def to_bytes(i):
    """From brunsgaard's answer:
    https://stackoverflow.com/a/30375198/8117067
    """
    return i.to_bytes(1, byteorder='big')


def struct_pack(i):
    """From Andy Hayden's answer:
    https://stackoverflow.com/a/26920966/8117067
    """
    return struct.pack('B', i)


# Originally, jfs's answer was considered for testing,
# but the result is not identical to the other methods
# https://stackoverflow.com/a/31761722/8117067


def chr_encode(i):
    """Another method, from Quuxplusone's answer here:
    https://codereview.stackexchange.com/a/210789/140921

    Similar to g10guang's answer:
    https://stackoverflow.com/a/51558790/8117067
    """
    return chr(i).encode('latin1')


converters = [bytes_, to_bytes, struct_pack, chr_encode]


def one_byte_equality_test():
    """Test that results are identical for ints in the range [0, 255]."""
    for i in range(256):
        results = [c(i) for c in converters]
        # Test that all results are equal
        start = results[0]
        if any(start != b for b in results):
            raise ValueError(results)


def timing_tests(value=None):
    """Test each of the functions with a random int."""
    if value is None:
        # random.randint takes more time than int to byte conversion
        # so it can't be a part of the timeit call
        value = random.randint(0, 255)
    print(f'Testing with {value}:')
    for c in converters:
        print(f'{c.__name__}: ', end='')
        # Uses technique borrowed from https://stackoverflow.com/q/19062202/8117067
        timeit.main(args=shlex.split(
            f"-s 'from int_to_byte import {c.__name__}; value = {value}' " +
            f"'{c.__name__}(value)'"
        ))
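
For a quick check that does not depend on the module being importable as int_to_byte, a similar comparison can be sketched in a single script with timeit.repeat. This is not the harness used for the numbers above: the lambdas add call overhead, the loop counts are arbitrary, and the resulting ranking may differ between machines and Python versions.

```python
import struct
import timeit

# The same four conversions as in the test module, as lambdas.
candidates = {
    'bytes_': lambda i: bytes([i]),
    'to_bytes': lambda i: i.to_bytes(1, byteorder='big'),
    'struct_pack': lambda i: struct.pack('B', i),
    'chr_encode': lambda i: chr(i).encode('latin1'),
}

# All candidates must agree on every value in [0, 255].
for i in range(256):
    assert len({func(i) for func in candidates.values()}) == 1

value = 63
loops = 10_000
timings = {}
for name, func in candidates.items():
    # Best of 3 runs, each run making `loops` calls; result is in seconds.
    timings[name] = min(timeit.repeat(lambda: func(value), number=loops, repeat=3))

for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f'{name}: {seconds / loops * 1e6:.3f} usec per call')
```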


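Although the earlier answer by brunsgaard is an efficient encoding, it works only for unsigned integers. This one builds on it to work for both signed and unsigned integers.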

def int_to_bytes(i: int, *, signed: bool = False) -> bytes:
    length = ((i + ((i * signed) < 0)).bit_length() + 7 + signed) // 8
    return i.to_bytes(length, byteorder='big', signed=signed)


def bytes_to_int(b: bytes, *, signed: bool = False) -> int:
    return int.from_bytes(b, byteorder='big', signed=signed)


# Test unsigned:
for i in range(1025):
    assert i == bytes_to_int(int_to_bytes(i))


# Test signed:
for i in range(-1024, 1025):
    assert i == bytes_to_int(int_to_bytes(i, signed=True), signed=True)

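For the encoder, (i + ((i * signed) < 0)).bit_length() is used instead of just i.bit_length() because the latter leads to an inefficient encoding of -128, -32768, etc.

Credit: CervEd for fixing a minor inefficiency.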
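
The effect of that adjustment can be seen by comparing the two length formulas directly. The helper names naive_length and adjusted_length are introduced here only for the comparison.

```python
def naive_length(i: int, signed: bool) -> int:
    # Uses i.bit_length() directly.
    return (i.bit_length() + 7 + signed) // 8


def adjusted_length(i: int, signed: bool) -> int:
    # Shifts negative signed values by one before taking the bit length.
    return ((i + ((i * signed) < 0)).bit_length() + 7 + signed) // 8


for i in (-128, -32768):
    print(i, naive_length(i, True), adjusted_length(i, True))
# -128 2 1
# -32768 3 2

# The shorter length is still sufficient for a signed encoding.
print((-128).to_bytes(1, byteorder='big', signed=True))  # b'\x80'
```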


Some answers don't work with large numbers.

Convert the integer to its hex representation, then convert that to bytes:

def int_to_bytes(number):
    hrepr = hex(number).replace('0x', '')
    if len(hrepr) % 2 == 1:
        hrepr = '0' + hrepr
    return bytes.fromhex(hrepr)

Result:

>>> int_to_bytes(2**256 - 1)
b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
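
As a sanity check, the hex route can be compared with int.to_bytes for the same large value. The sketch below renames the function to int_to_bytes_hex to keep the two apart, and also shows a limitation: negative numbers are not handled, because hex(-1) yields '-0x1'.

```python
def int_to_bytes_hex(number: int) -> bytes:
    hrepr = hex(number).replace('0x', '')
    if len(hrepr) % 2 == 1:
        hrepr = '0' + hrepr
    return bytes.fromhex(hrepr)


big = 2**256 - 1
via_hex = int_to_bytes_hex(big)
via_to_bytes = big.to_bytes((big.bit_length() + 7) // 8, 'big')
print(via_hex == via_to_bytes)  # True

try:
    int_to_bytes_hex(-1)        # '-1' is not a valid hex string
    negative_ok = True
except ValueError:
    negative_ok = False
print(negative_ok)              # False
```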


If the question is how to convert an integer itself (not its string equivalent) into bytes, I think the robust answer is:

>>> i = 5
>>> i.to_bytes(2, 'big')
b'\x00\x05'
>>> int.from_bytes(i.to_bytes(2, 'big'), byteorder='big')
5

More information on these methods:


When you want to deal with the binary representation, you are best off using ctypes.

import ctypes
x = ctypes.c_int(1234)
bytes(x)

You must use a specific integer representation (signed/unsigned and the number of bits: c_uint8, c_int8, c_uint16, ...).
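
A small sketch of how the chosen ctypes type determines the size of the result. Note that the bytes come out in the machine's native byte order, so the example reads them back with sys.byteorder rather than assuming big- or little-endian.

```python
import ctypes
import sys

raw = bytes(ctypes.c_uint16(1234))   # always 2 bytes, native byte order
print(len(raw))                      # 2
print(int.from_bytes(raw, sys.byteorder))  # 1234

signed_raw = bytes(ctypes.c_int8(-1))  # one byte, two's complement
print(signed_raw)                      # b'\xff'
```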


I think you can convert the int to a str first and then convert that to bytes. That should produce the format you want.

bytes(str(your_number),'UTF-8') + b'\r\n'

It worked for me in py3.8.


>>> chr(116).encode()
b't'
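
One caveat with this approach, sketched below: str.encode() defaults to UTF-8, so it yields a single byte only for values below 128. For the full 0 to 255 range, an explicit 'latin-1' encoding keeps one byte per value.

```python
ascii_value = chr(116).encode()            # b't' (one byte)
utf8_high = chr(200).encode()              # UTF-8 needs two bytes here
latin1_high = chr(200).encode('latin-1')   # one byte with value 200

print(ascii_value)   # b't'
print(utf8_high)     # b'\xc3\x88'
print(latin1_high)   # b'\xc8'
```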