How does zip(*[iter(s)]*n) work in Python?

s = [1,2,3,4,5,6,7,8,9]
n = 3


zip(*[iter(s)]*n) # returns [(1,2,3),(4,5,6),(7,8,9)] (on Python 3, wrap it in list() to see this)

How does zip(*[iter(s)]*n) work? What would it look like if it was written with more verbose code?


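iter() returns an iterator over a sequence. [x] * n produces a list containing n copies of x, i.e. a list of length n where each element is x. *arg unpacks a sequence into arguments for a function call. Therefore you're passing the same iterator 3 times to zip(), and it pulls an item from the iterator each time.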

x = iter([1,2,3,4,5,6,7,8,9])
print(list(zip(x, x, x)))  # works on Python 2 and 3: [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

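iter(s) returns an iterator for s.

[iter(s)]*n makes a list containing the same iterator for s, n times.

So, when doing zip(*[iter(s)]*n), it extracts an item from each of the three iterators in the list, in order. Since all the iterators are the same object, it just groups the list into chunks of n.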
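Since the question asks what this looks like written out, here is a minimal sketch of the same grouping done by hand. The name chunks_verbose is made up for illustration, and the n < 1 guard is my own addition so the loop cannot spin forever; treat it as a rough equivalent, not as what zip() literally does internally.

```python
def chunks_verbose(s, n):
    """Group s into n-tuples by pulling from ONE shared iterator (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    it = iter(s)              # a single iterator, like the one repeated in [iter(s)]*n
    result = []
    while True:
        group = []
        for _ in range(n):    # zip asks each of its n arguments for one item...
            try:
                group.append(next(it))   # ...but every argument is this same iterator
            except StopIteration:
                return result            # an incomplete final group is dropped, as with zip
        result.append(tuple(group))


print(chunks_verbose([1, 2, 3, 4, 5, 6, 7, 8, 9], 3))
# [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
```

Note that a trailing partial group is silently discarded, which matches how zip() stops at its shortest argument.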

The other great answers and comments explain well the roles of argument unpacking and zip().

As Ignacio and ujukatzel say, you pass to zip() three references to the same iterator, and zip() makes 3-tuples of the integers, in order, from each reference to the iterator:

1,2,3,4,5,6,7,8,9  1,2,3,4,5,6,7,8,9  1,2,3,4,5,6,7,8,9
^                    ^                    ^
      ^                    ^                    ^
            ^                    ^                    ^

Since you asked for a more verbose code sample:

chunk_size = 3
L = [1,2,3,4,5,6,7,8,9]


# iterate over L in steps of 3
for start in range(0,len(L),chunk_size): # xrange() in 2.x; range() in 3.x
    end = start + chunk_size
    print(L[start:end]) # three-item chunks

Here are the values of start and end:

[0:3) #[1,2,3]
[3:6) #[4,5,6]
[6:9) #[7,8,9]

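FWIW, you can get the same result with map() and an initial argument of None (Python 2 only; in Python 3, map() does not accept None as the function):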

>>> map(None,*[iter(s)]*3)
[(1, 2, 3), (4, 5, 6), (7, 8, 9)]

For more on zip() and map(): http://muffinresearch.co.uk/archives/2007/10/16/python-transposing-lists-with-map-and-zip/

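One caveat about using zip this way: it will truncate your list if its length is not evenly divisible by n. To work around this, you can use itertools.izip_longest (itertools.zip_longest in Python 3) if fill values are acceptable. Or you could use something like this: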

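def n_split(iterable, n):
    # iterable must be a sequence here (it needs len() and slicing)
    num_extra = len(iterable) % n
    # list() so the + below also works on Python 3, where zip() is lazy
    zipped = list(zip(*[iter(iterable)] * n))
    # append the leftover items as one final, shorter chunk
    return zipped if not num_extra else zipped + [iterable[-num_extra:]]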

Usage:

for ints in n_split(range(1,12), 3):
    print(', '.join(str(i) for i in ints))

Prints:

1, 2, 3
4, 5, 6
7, 8, 9
10, 11
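For the fill-value route mentioned in this answer, here is a minimal sketch assuming Python 3, where the function is spelled itertools.zip_longest. The helper name grouper and its fillvalue default are my own choices for illustration.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Group iterable into n-tuples, padding the last tuple with fillvalue."""
    args = [iter(iterable)] * n          # the same iterator, repeated n times
    return zip_longest(*args, fillvalue=fillvalue)


print(list(grouper(range(1, 12), 3)))
# [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, None)]
```

Unlike n_split, this works on any iterable (no len() or slicing needed), at the cost of padding the last group instead of returning a shorter one.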

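I think one thing that is missing from all the answers (probably obvious to those familiar with iterators, but not so obvious to others) is this:

Since we have the same iterator, it gets consumed, and zip works through whatever elements remain. So if we simply used the list rather than an iter, e.g.: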

l = range(9)
list(zip(*([l]*3))) # note: not an iter here, the lists are not emptied as we iterate
# output
[(0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 4, 4), (5, 5, 5), (6, 6, 6), (7, 7, 7), (8, 8, 8)]

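With an iterator, values are popped off as they are read and only the remaining ones stay available, so for zip, once 0 has been consumed, 1 is available, then 2, and so on. A very subtle thing, but quite clever!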

It is probably easier to see what is happening in the Python interpreter or IPython with n = 2:

In [35]: [iter("ABCDEFGH")]*2
Out[35]: [<iterator at 0x6be4128>, <iterator at 0x6be4128>]

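So, we have a list of two entries that both point to the same iterator object. Remember that iter on an object returns an iterator object, and in this scenario it is the same iterator twice because of the *2 list repetition. Iterators also run only once.

Further, zip takes any number of iterables (sequences are iterables) and creates a tuple from the i-th element of each input. Since both iterators are identical in our case, zip advances the same iterator twice for each 2-element tuple of output.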
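To make "zip advances the same iterator twice" concrete, here is a simplified pure-Python sketch of what zip does. The real zip is implemented in C; zip_sketch is a made-up name and this is only an approximation of its behavior. The key detail is that calling iter() on an iterator returns that same iterator.

```python
def zip_sketch(*iterables):
    """Simplified stand-in for zip(): one item from each argument per output tuple."""
    iterators = [iter(it) for it in iterables]   # iter(iterator) is the iterator itself
    while iterators:
        result = []
        for it in iterators:
            try:
                result.append(next(it))          # same object => consecutive items
            except StopIteration:
                return                           # stop as soon as any argument runs dry
        yield tuple(result)


print(list(zip_sketch(*[iter("ABCDEFGH")] * 2)))
# [('A', 'B'), ('C', 'D'), ('E', 'F'), ('G', 'H')]
```

With an odd-length input such as "ABCDEFG", the final 'G' is read and then discarded, because the second next() call raises StopIteration before the tuple is complete.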

In [41]: help(zip)
Help on built-in function zip in module __builtin__:


zip(...)
    zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)]

    Return a list of tuples, where each tuple contains the i-th element
    from each of the argument sequences.  The returned list is truncated
    in length to the length of the shortest argument sequence.

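The unpacking (*) operator passes the two references as separate arguments to zip, which then keeps pulling from the iterator until it is exhausted, in this case until there is not enough input left to create a 2-element tuple.

This extends to any value of n, and zip(*[iter(s)]*n) works as described.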
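A quick way to check that claim for several values of n, assuming Python 3 (hence the list() call). The helper name group is hypothetical and just wraps the idiom.

```python
def group(s, n):
    # Hypothetical wrapper around the idiom; list() forces the lazy zip on Python 3
    return list(zip(*[iter(s)] * n))


s = "ABCDEFGH"
for n in (2, 3, 4):
    print(n, group(s, n))
# 2 [('A', 'B'), ('C', 'D'), ('E', 'F'), ('G', 'H')]
# 3 [('A', 'B', 'C'), ('D', 'E', 'F')]
# 4 [('A', 'B', 'C', 'D'), ('E', 'F', 'G', 'H')]
```

For n = 3 the trailing 'G' and 'H' are dropped, since there is not enough input left for a full 3-tuple.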

I needed to break down each partial step to really internalize how it works:

>>> # refresher on using list multiples to repeat item
>>> lst = list(range(15))
>>> lst
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
>>> # lst id value
>>> id(lst)
139755081359872
>>> [id(x) for x in [lst]*3]
[139755081359872, 139755081359872, 139755081359872]


# replacing lst with an iterator of lst
# It's the same iterator three times
>>> [id(x) for x in [iter(lst)]*3 ]
[139755085005296, 139755085005296, 139755085005296]
# without starred expression zip would only see single n-item list.
>>> print([iter(lst)]*3)
[<list_iterator object at 0x7f1b440837c0>, <list_iterator object at 0x7f1b440837c0>, <list_iterator object at 0x7f1b440837c0>]
# Must use starred expression to expand n arguments
>>> print(*[iter(lst)]*3)
<list_iterator object at 0x7f1b4418b1f0> <list_iterator object at 0x7f1b4418b1f0> <list_iterator object at 0x7f1b4418b1f0>


# by repeating the same iterator, n-times,
# each pass of zip will call the same iterator.__next__() n times
# this is equivalent to manually calling __next__() until complete
>>> iter_lst = iter(lst)
>>> ((iter_lst.__next__(), iter_lst.__next__(), iter_lst.__next__()))
(0, 1, 2)
>>> ((iter_lst.__next__(), iter_lst.__next__(), iter_lst.__next__()))
(3, 4, 5)
>>> ((iter_lst.__next__(), iter_lst.__next__(), iter_lst.__next__()))
(6, 7, 8)
>>> ((iter_lst.__next__(), iter_lst.__next__(), iter_lst.__next__()))
(9, 10, 11)
>>> ((iter_lst.__next__(), iter_lst.__next__(), iter_lst.__next__()))
(12, 13, 14)
>>> ((iter_lst.__next__(), iter_lst.__next__(), iter_lst.__next__()))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration


# all together now!
# continuing with same iterator multiple times in list
>>> print(*[iter(lst)]*3)
<list_iterator object at 0x7f1b4418b1f0> <list_iterator object at 0x7f1b4418b1f0> <list_iterator object at 0x7f1b4418b1f0>
>>> zip(*[iter(lst)]*3)
<zip object at 0x7f1b43f14e00>
>>> list(zip(*[iter(lst)]*3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]


# NOTE: must use list multiples. Explicit listing creates 3 unique iterators
>>> [iter(lst)]*3 == [iter(lst), iter(lst), iter(lst)]
False
>>> list(zip(*[iter(lst), iter(lst), iter(lst)]))
[(0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3), ....