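What is the difference between chain and chain.from_iterable in itertools?

I could not find any valid example on the internet that shows the difference between them and why to choose one over the other.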

The first takes 0 or more arguments, each of which is an iterable; the second takes a single argument, which is expected to produce the iterables:

from itertools import chain


chain(list1, list2, list3)


iterables = [list1, list2, list3]
chain.from_iterable(iterables)
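As a concrete sketch of the two call styles (the list contents here are made up purely for illustration), both forms should yield the same flattened sequence:

```python
from itertools import chain

# Hypothetical sample data; any iterables would do.
list1, list2, list3 = [1, 2], [3, 4], [5]

# Each iterable is passed as its own positional argument.
flat_args = list(chain(list1, list2, list3))

# A single argument that itself yields the iterables.
iterables = [list1, list2, list3]
flat_from = list(chain.from_iterable(iterables))

print(flat_args)  # [1, 2, 3, 4, 5]
print(flat_from)  # [1, 2, 3, 4, 5]
```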

but iterables can be any iterator that yields the iterables:

def gen_iterables():
    for i in range(10):
        yield range(i)


chain.from_iterable(gen_iterables())
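To see what such a generator-fed chain produces, here is a small variant of the example; the parameter n is an assumption added only to keep the output short:

```python
from itertools import chain


def gen_iterables(n):
    # Yields range(0), range(1), ..., range(n - 1), one at a time.
    for i in range(n):
        yield range(i)


result = list(chain.from_iterable(gen_iterables(4)))
print(result)  # [0, 0, 1, 0, 1, 2]
```

range(0) is empty, so it contributes nothing; the remaining ranges are simply emitted back to back.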

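Using the second form is usually a matter of convenience, but because it loops over the input iterables lazily, it is also the only way you can chain an infinite number of finite iterators: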

def gen_iterables():
    while True:
        for i in range(5, 10):
            yield range(i)


chain.from_iterable(gen_iterables())

The above example gives you an iterator that yields a cyclic pattern of numbers that never stops, yet never consumes more memory than a single range() call requires.
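Because the chain never ends, calling list() on it would never return. One way to inspect it, sketched here with itertools.islice, is to take only a bounded prefix:

```python
from itertools import chain, islice


def gen_iterables():
    # Endlessly yields range(5), range(6), ..., range(9), then starts over.
    while True:
        for i in range(5, 10):
            yield range(i)


endless = chain.from_iterable(gen_iterables())

# Take only the first 12 items; the rest is never generated.
first_twelve = list(islice(endless, 12))
print(first_twelve)  # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 0]
```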

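They do very similar things. For a small number of iterables, itertools.chain(*iterables) and itertools.chain.from_iterable(iterables) behave similarly.

The key advantage of from_iterable lies in its ability to handle a large (potentially infinite) number of iterables, since they do not all need to be available at the time of the call.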
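The difference in when the inner iterables are requested can be made visible with a small experiment. This is only a sketch; the produced list is a hypothetical bookkeeping helper that records which inner iterables the generator has handed out so far:

```python
from itertools import chain

produced = []  # records which inner iterables have been generated so far


def gen_iterables():
    for i in range(3):
        produced.append(i)
        yield [i, i]


# from_iterable: nothing is pulled from the generator until iteration starts.
lazy = chain.from_iterable(gen_iterables())
after_lazy_call = list(produced)   # []
first = next(lazy)
after_first_item = list(produced)  # [0]

produced.clear()

# Star-unpacking: the generator is exhausted to build the argument tuple
# before chain() is even called.
eager = chain(*gen_iterables())
after_eager_call = list(produced)  # [0, 1, 2]

print(after_lazy_call, after_first_item, after_eager_call)
```

With an infinite generator, the star-unpacking call would therefore never return, while from_iterable would start yielding immediately.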

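I could not find any valid example ... where I can see the difference between them [chain and chain.from_iterable] and why to choose one over the other

The accepted answer is thorough. For those seeking a quick application, consider flattening several lists: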

import itertools

list(itertools.chain(["a", "b", "c"], ["d", "e"], ["f"]))
# ['a', 'b', 'c', 'd', 'e', 'f']

You may wish to reuse these lists later, so you make an iterable of lists:

iterable = (["a", "b", "c"], ["d", "e"], ["f"])

Attempt

However, passing an iterable of lists to chain gives an unflattened result:

list(itertools.chain(iterable))
# [['a', 'b', 'c'], ['d', 'e'], ['f']]

Why? You passed in one item (a tuple). chain needs each list as a separate argument.


Solutions

When possible, you can unpack the iterable:

list(itertools.chain(*iterable))
# ['a', 'b', 'c', 'd', 'e', 'f']


list(itertools.chain(*iter(iterable)))
# ['a', 'b', 'c', 'd', 'e', 'f']

More generally, use .from_iterable (as it also works with infinite iterators):

list(itertools.chain.from_iterable(iterable))
# ['a', 'b', 'c', 'd', 'e', 'f']


g = itertools.chain.from_iterable(itertools.cycle(iterable))
next(g)
# "a"

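Another way to see it:

chain(iterable1, iterable2, iterable3, ...) is for when you already know which iterables you have, so you can write them out as comma-separated arguments.

chain.from_iterable(iterable) is for when your iterables (like iterable1, iterable2, iterable3) are obtained from another iterable.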

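Another way to see it: use chain.from_iterable when you have an iterable of iterables, such as a nested iterable (or a compound iterable), and use chain for simple iterables.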
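A short sketch of that rule of thumb follows. The data (tags_by_post, header, body) is invented for illustration only:

```python
from itertools import chain

# Hypothetical data: tags grouped by post.
tags_by_post = {"post1": ["python", "itertools"], "post2": ["generators"]}

# The iterables come out of another iterable (dict.values()),
# so from_iterable is the natural fit.
all_tags = list(chain.from_iterable(tags_by_post.values()))

# The iterables are known up front, so plain chain reads naturally.
header, body = ["id", "name"], ["score"]
columns = list(chain(header, body))

print(all_tags)  # ['python', 'itertools', 'generators']
print(columns)   # ['id', 'name', 'score']
```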

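Extending @martijn-pieters' answer

Access to the inner items of the iterables is the same either way, and implementation-wise,

  • itertools_chain_from_iterable (i.e. chain.from_iterable in Python) and
  • chain_new (i.e. chain in Python)

in the CPython implementation are both duck-types of the same internal function.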
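The shared behaviour can be pictured with rough pure-Python equivalents, along the lines of those given in the itertools documentation. The names chain_py and from_iterable_py are hypothetical; the real implementations are written in C:

```python
def chain_py(*iterables):
    # Rough equivalent of itertools.chain: the iterables arrive
    # as a tuple of positional arguments.
    for it in iterables:
        yield from it


def from_iterable_py(iterables):
    # Rough equivalent of itertools.chain.from_iterable: the iterables
    # are pulled one at a time from a single outer iterable.
    for it in iterables:
        yield from it


print(list(chain_py("ABC", "DEF")))            # ['A', 'B', 'C', 'D', 'E', 'F']
print(list(from_iterable_py(["ABC", "DEF"])))  # ['A', 'B', 'C', 'D', 'E', 'F']
```

The loop bodies are identical; the two differ only in how the outer sequence of iterables is supplied.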


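Is there any optimization benefit to using chain.from_iterable(x), where x is an iterable of iterables and the main purpose is to ultimately consume the flattened list of items?

We can try to benchmark it with: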

import random
from itertools import chain
from functools import wraps
from time import time


def timing(f):
    # Decorator that prints the wall-clock time taken by each call to f.
    @wraps(f)
    def wrap(*args, **kw):
        ts = time()
        result = f(*args, **kw)
        te = time()
        print('func:%r args:[%r, %r] took: %2.4f sec' % (f.__name__, args, kw, te-ts))
        return result
    return wrap


def generate_nm(m, n):
    # Yields a single iterator over m lists, each a random permutation of range(n).
    # The lists are created lazily, while the iterator is being consumed.
    yield iter(random.sample(range(n), n) for _ in range(m))


def chain_star(x):
    # Builds the chain by unpacking x into positional arguments.
    chain_x = chain(*x)
    # Consumes every item the chain yields.
    for i in chain_x:
        pass


def chain_from_iterable(x):
    # Builds the chain lazily from the outer iterable x.
    chain_x = chain.from_iterable(x)
    # Consumes every item the chain yields.
    for i in chain_x:
        pass


@timing
def versus(f, n, m):
    # The first number passed to versus becomes generate_nm's m (the number of lists).
    f(generate_nm(n, m))

P.S.: The benchmark is running... waiting for the results.


Results

chain_star, m = 1000, n = 1000

for _ in range(10):
    versus(chain_star, 1000, 1000)

[out]:

func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6494 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6603 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6367 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6350 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6296 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6399 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6341 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6381 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6343 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 1000, 1000), {}] took: 0.6309 sec

chain_from_iterable, m = 1000, n = 1000

for _ in range(10):
    versus(chain_from_iterable, 1000, 1000)

[out]:

func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6416 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6315 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6535 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6334 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6327 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6471 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6426 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6287 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6353 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 1000, 1000), {}] took: 0.6297 sec

chain_star, m = 10000, n = 1000

func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2659 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2966 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2953 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.3141 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2802 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2799 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2848 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.3299 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.2730 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 10000, 1000), {}] took: 6.3052 sec

chain_from_iterable, m = 10000, n = 1000

func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.3129 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.3064 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.3071 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2660 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2837 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2877 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2756 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2939 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2715 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 10000, 1000), {}] took: 6.2877 sec

chain_star, m = 100000, n = 1000

func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 62.7874 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 63.3744 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 62.5584 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 63.3745 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 62.7982 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 63.4054 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 62.6769 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 62.6476 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 63.7397 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 100000, 1000), {}] took: 62.8980 sec

chain_from_iterable, m = 100000, n = 1000

for _ in range(10):
    versus(chain_from_iterable, 100000, 1000)

[out]:

func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.7227 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.7717 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.7159 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.7569 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.7906 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.6211 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.7294 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.8260 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.8356 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 100000, 1000), {}] took: 62.9738 sec

chain_star, m = 500000, n = 1000

for _ in range(3):
    versus(chain_star, 500000, 1000)

[out]:

func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 500000, 1000), {}] took: 314.5671 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 500000, 1000), {}] took: 313.9270 sec
func:'versus' args:[(<function chain_star at 0x7f5c7188ef28>, 500000, 1000), {}] took: 313.8992 sec

chain_from_iterable, m = 500000, n = 1000

for _ in range(3):
    versus(chain_from_iterable, 500000, 1000)

[out]:

func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 500000, 1000), {}] took: 313.8301 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 500000, 1000), {}] took: 313.8104 sec
func:'versus' args:[(<function chain_from_iterable at 0x7f5c7188eb70>, 500000, 1000), {}] took: 313.9440 sec
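In generate_nm, the lists are built by random.sample while the chain is being consumed, so the timings above include the cost of generating the data. Because the function yields a single iterator, both variants also receive just one outer iterable. A sketch that builds the data first and times only the chaining might look like the following; make_lists, consume and the sizes are assumptions chosen for a short run, not part of the original benchmark:

```python
import random
from itertools import chain
from timeit import timeit


def make_lists(m, n):
    # Build m shuffled lists of n integers up front, outside the timed region.
    return [random.sample(range(n), n) for _ in range(m)]


def consume(iterator):
    # Exhaust the iterator without keeping any items.
    for _ in iterator:
        pass


data = make_lists(100, 100)  # small sizes, chosen only to keep the run short

# Each lambda builds a fresh chain over the same pre-built data.
t_star = timeit(lambda: consume(chain(*data)), number=20)
t_from = timeit(lambda: consume(chain.from_iterable(data)), number=20)

print(f"chain(*data):              {t_star:.4f} s")
print(f"chain.from_iterable(data): {t_from:.4f} s")
```

Absolute numbers will vary by machine and Python version, so no expected output is shown here.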