如何以各种可能的方式将列表分成对

我有一个列表(为了简单起见,假设有6个元素)

L = [0, 1, 2, 3, 4, 5]

我想用 全部可能的方法把它们组合成对。我展示了一些配置:

[(0, 1), (2, 3), (4, 5)]
[(0, 1), (2, 4), (3, 5)]
[(0, 1), (2, 5), (3, 4)]

诸如此类。 Here (a, b) = (b, a) and the order of pairs is not important i.e.

[(0, 1), (2, 3), (4, 5)] = [(0, 1), (4, 5), (2, 3)]

这种配置的总数是 1*3*5*...*(N-1),其中 N是我的列表的长度。

如何用 Python 编写一个生成器,为任意 N提供所有可能的配置?

131841 次浏览

看看 itertools.combinations

matt@stanley:~$ python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import itertools
>>> list(itertools.combinations(range(6), 2))
[(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

尝试下面的递归生成器函数:

def pairs_gen(L):
if len(L) == 2:
yield [(L[0], L[1])]
else:
first = L.pop(0)
for i, e in enumerate(L):
second = L.pop(i)
for list_of_pairs in pairs_gen(L):
list_of_pairs.insert(0, (first, second))
yield list_of_pairs
L.insert(i, second)
L.insert(0, first)

示例用法:

>>> for pairs in pairs_gen([0, 1, 2, 3, 4, 5]):
...     print pairs
...
[(0, 1), (2, 3), (4, 5)]
[(0, 1), (2, 4), (3, 5)]
[(0, 1), (2, 5), (3, 4)]
[(0, 2), (1, 3), (4, 5)]
[(0, 2), (1, 4), (3, 5)]
[(0, 2), (1, 5), (3, 4)]
[(0, 3), (1, 2), (4, 5)]
[(0, 3), (1, 4), (2, 5)]
[(0, 3), (1, 5), (2, 4)]
[(0, 4), (1, 2), (3, 5)]
[(0, 4), (1, 3), (2, 5)]
[(0, 4), (1, 5), (2, 3)]
[(0, 5), (1, 2), (3, 4)]
[(0, 5), (1, 3), (2, 4)]
[(0, 5), (1, 4), (2, 3)]

我认为在标准库中没有任何函数能够完全满足您的需要。仅仅使用 itertools.combinations就可以得到所有可能的单独对的列表,但是实际上并不能解决所有有效对组合的问题。

你可以很容易地解决这个问题:

import itertools
def all_pairs(lst):
for p in itertools.permutations(lst):
i = iter(p)
yield zip(i,i)

但是这将得到重复,因为它将(a,b)和(b,a)视为不同的,并且给出所有对的排序。最后,我发现从头开始编写代码比过滤结果更容易,因此下面是正确的函数。

def all_pairs(lst):
if len(lst) < 2:
yield []
return
if len(lst) % 2 == 1:
# Handle odd length list
for i in range(len(lst)):
for result in all_pairs(lst[:i] + lst[i+1:]):
yield result
else:
a = lst[0]
for i in range(1,len(lst)):
pair = (a,lst[i])
for rest in all_pairs(lst[1:i]+lst[i+1:]):
yield [pair] + rest

它是递归的,因此它会遇到堆栈问题,而且列表很长,但是在其他情况下它会满足您的需要。

>>> for x in all_pairs([0,1,2,3,4,5]):
print x


[(0, 1), (2, 3), (4, 5)]
[(0, 1), (2, 4), (3, 5)]
[(0, 1), (2, 5), (3, 4)]
[(0, 2), (1, 3), (4, 5)]
[(0, 2), (1, 4), (3, 5)]
[(0, 2), (1, 5), (3, 4)]
[(0, 3), (1, 2), (4, 5)]
[(0, 3), (1, 4), (2, 5)]
[(0, 3), (1, 5), (2, 4)]
[(0, 4), (1, 2), (3, 5)]
[(0, 4), (1, 3), (2, 5)]
[(0, 4), (1, 5), (2, 3)]
[(0, 5), (1, 2), (3, 4)]
[(0, 5), (1, 3), (2, 4)]
[(0, 5), (1, 4), (2, 3)]
L = [1, 1, 2, 3, 4]
answer = []
for i in range(len(L)):
for j in range(i+1, len(L)):
if (L[i],L[j]) not in answer:
answer.append((L[i],L[j]))


print answer
[(1, 1), (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

Hope this helps

def f(l):
if l == []:
yield []
return
ll = l[1:]
for j in range(len(ll)):
for end in f(ll[:j] + ll[j+1:]):
yield [(l[0], ll[j])] + end

用法:

for x in f([0,1,2,3,4,5]):
print x


>>>
[(0, 1), (2, 3), (4, 5)]
[(0, 1), (2, 4), (3, 5)]
[(0, 1), (2, 5), (3, 4)]
[(0, 2), (1, 3), (4, 5)]
[(0, 2), (1, 4), (3, 5)]
[(0, 2), (1, 5), (3, 4)]
[(0, 3), (1, 2), (4, 5)]
[(0, 3), (1, 4), (2, 5)]
[(0, 3), (1, 5), (2, 4)]
[(0, 4), (1, 2), (3, 5)]
[(0, 4), (1, 3), (2, 5)]
[(0, 4), (1, 5), (2, 3)]
[(0, 5), (1, 2), (3, 4)]
[(0, 5), (1, 3), (2, 4)]
[(0, 5), (1, 4), (2, 3)]

在概念上类似于@shang 的回答,但它并没有假设组的大小是2:

import itertools


def generate_groups(lst, n):
if not lst:
yield []
else:
for group in (((lst[0],) + xs) for xs in itertools.combinations(lst[1:], n-1)):
for groups in generate_groups([x for x in lst if x not in group], n):
yield [group] + groups


pprint(list(generate_groups([0, 1, 2, 3, 4, 5], 2)))

结果是:

[[(0, 1), (2, 3), (4, 5)],
[(0, 1), (2, 4), (3, 5)],
[(0, 1), (2, 5), (3, 4)],
[(0, 2), (1, 3), (4, 5)],
[(0, 2), (1, 4), (3, 5)],
[(0, 2), (1, 5), (3, 4)],
[(0, 3), (1, 2), (4, 5)],
[(0, 3), (1, 4), (2, 5)],
[(0, 3), (1, 5), (2, 4)],
[(0, 4), (1, 2), (3, 5)],
[(0, 4), (1, 3), (2, 5)],
[(0, 4), (1, 5), (2, 3)],
[(0, 5), (1, 2), (3, 4)],
[(0, 5), (1, 3), (2, 4)],
[(0, 5), (1, 4), (2, 3)]]

这样吧:

items = ["me", "you", "him"]
[(items[i],items[j]) for i in range(len(items)) for j in range(i+1, len(items))]


[('me', 'you'), ('me', 'him'), ('you', 'him')]

或者

items = [1, 2, 3, 5, 6]
[(items[i],items[j]) for i in range(len(items)) for j in range(i+1, len(items))]


[(1, 2), (1, 3), (1, 5), (1, 6), (2, 3), (2, 5), (2, 6), (3, 5), (3, 6), (5, 6)]

我的老板可能不会高兴我花了一点时间在这个有趣的问题上,但这里有一个很好的解决方案,不需要递归,并使用 itertools.product。在 docstring 中对此进行了解释:)。结果看起来还不错,但我还没有进行过多的测试。

import itertools




def all_pairs(lst):
"""Generate all sets of unique pairs from a list `lst`.


This is equivalent to all _partitions_ of `lst` (considered as an indexed
set) which have 2 elements in each partition.


Recall how we compute the total number of such partitions. Starting with
a list


[1, 2, 3, 4, 5, 6]


one takes off the first element, and chooses its pair [from any of the
remaining 5].  For example, we might choose our first pair to be (1, 4).
Then, we take off the next element, 2, and choose which element it is
paired to (say, 3). So, there are 5 * 3 * 1 = 15 such partitions.


That sounds like a lot of nested loops (i.e. recursion), because 1 could
pick 2, in which case our next element is 3. But, if one abstracts "what
the next element is", and instead just thinks of what index it is in the
remaining list, our choices are static and can be aided by the
itertools.product() function.
"""
N = len(lst)
choice_indices = itertools.product(*[
xrange(k) for k in reversed(xrange(1, N, 2)) ])


for choice in choice_indices:
# calculate the list corresponding to the choices
tmp = lst[:]
result = []
for index in choice:
result.append( (tmp.pop(0), tmp.pop(index)) )
yield result

干杯!

当列表的长度不是2的倍数时,这段代码可以正常工作; 它使用了一种技巧来使其正常工作。也许有更好的方法可以做到这一点... 它还确保对始终在一个元组中,并且无论输入是一个列表还是元组,它都可以工作。

def all_pairs(lst):
"""Return all combinations of pairs of items of ``lst`` where order
within the pair and order of pairs does not matter.


Examples
========


>>> for i in range(6):
...  list(all_pairs(range(i)))
...
[[()]]
[[(0,)]]
[[(0, 1)]]
[[(0, 1), (2,)], [(0, 2), (1,)], [(0,), (1, 2)]]
[[(0, 1), (2, 3)], [(0, 2), (1, 3)], [(0, 3), (1, 2)]]
[[(0, 1), (2, 3), (4,)], [(0, 1), (2, 4), (3,)], [(0, 1), (2,), (3, 4)], [(0, 2)
, (1, 3), (4,)], [(0, 2), (1, 4), (3,)], [(0, 2), (1,), (3, 4)], [(0, 3), (1, 2)
, (4,)], [(0, 3), (1, 4), (2,)], [(0, 3), (1,), (2, 4)], [(0, 4), (1, 2), (3,)],
[(0, 4), (1, 3), (2,)], [(0, 4), (1,), (2, 3)], [(0,), (1, 2), (3, 4)], [(0,),
(1, 3), (2, 4)], [(0,), (1, 4), (2, 3)]]


Note that when the list has an odd number of items, one of the
pairs will be a singleton.


References
==========


http://stackoverflow.com/questions/5360220/
how-to-split-a-list-into-pairs-in-all-possible-ways


"""
if not lst:
yield [tuple()]
elif len(lst) == 1:
yield [tuple(lst)]
elif len(lst) == 2:
yield [tuple(lst)]
else:
if len(lst) % 2:
for i in (None, True):
if i not in lst:
lst = list(lst) + [i]
PAD = i
break
else:
while chr(i) in lst:
i += 1
PAD = chr(i)
lst = list(lst) + [PAD]
else:
PAD = False
a = lst[0]
for i in range(1, len(lst)):
pair = (a, lst[i])
for rest in all_pairs(lst[1:i] + lst[i+1:]):
rv = [pair] + rest
if PAD is not False:
for i, t in enumerate(rv):
if PAD in t:
rv[i] = (t[0],)
break
yield rv

I made a small test suite for all the compliant solutions. I had to change the functions a bit to get them to work in Python 3. Interestingly, the fastest function in PyPy is the slowest function in Python 2/3 in some cases.

import itertools
import time
from collections import OrderedDict


def tokland_org(lst, n):
if not lst:
yield []
else:
for group in (((lst[0],) + xs) for xs in itertools.combinations(lst[1:], n-1)):
for groups in tokland_org([x for x in lst if x not in group], n):
yield [group] + groups


tokland = lambda x: tokland_org(x, 2)


def gatoatigrado(lst):
N = len(lst)
choice_indices = itertools.product(*[
range(k) for k in reversed(range(1, N, 2)) ])


for choice in choice_indices:
# calculate the list corresponding to the choices
tmp = list(lst)
result = []
for index in choice:
result.append( (tmp.pop(0), tmp.pop(index)) )
yield result


def shang(X):
lst = list(X)
if len(lst) < 2:
yield lst
return
a = lst[0]
for i in range(1,len(lst)):
pair = (a,lst[i])
for rest in shang(lst[1:i]+lst[i+1:]):
yield [pair] + rest


def smichr(X):
lst = list(X)
if not lst:
yield [tuple()]
elif len(lst) == 1:
yield [tuple(lst)]
elif len(lst) == 2:
yield [tuple(lst)]
else:
if len(lst) % 2:
for i in (None, True):
if i not in lst:
lst = lst + [i]
PAD = i
break
else:
while chr(i) in lst:
i += 1
PAD = chr(i)
lst = lst + [PAD]
else:
PAD = False
a = lst[0]
for i in range(1, len(lst)):
pair = (a, lst[i])
for rest in smichr(lst[1:i] + lst[i+1:]):
rv = [pair] + rest
if PAD is not False:
for i, t in enumerate(rv):
if PAD in t:
rv[i] = (t[0],)
break
yield rv


def adeel_zafar(X):
L = list(X)
if len(L) == 2:
yield [(L[0], L[1])]
else:
first = L.pop(0)
for i, e in enumerate(L):
second = L.pop(i)
for list_of_pairs in adeel_zafar(L):
list_of_pairs.insert(0, (first, second))
yield list_of_pairs
L.insert(i, second)
L.insert(0, first)


if __name__ =="__main__":
import timeit
import pprint


candidates = dict(tokland=tokland, gatoatigrado=gatoatigrado, shang=shang, smichr=smichr, adeel_zafar=adeel_zafar)


for i in range(1,7):
results = [ frozenset([frozenset(x) for x in candidate(range(i*2))]) for candidate in candidates.values() ]
assert len(frozenset(results)) == 1


print("Times for getting all permutations of sets of unordered pairs consisting of two draws from a 6-element deck until it is empty")
times = dict([(k, timeit.timeit('list({0}(range(6)))'.format(k), setup="from __main__ import {0}".format(k), number=10000)) for k in candidates.keys()])
pprint.pprint([(k, "{0:.3g}".format(v)) for k,v in OrderedDict(sorted(times.items(), key=lambda t: t[1])).items()])


print("Times for getting the first 2000 permutations of sets of unordered pairs consisting of two draws from a 52-element deck until it is empty")
times = dict([(k, timeit.timeit('list(islice({0}(range(52)), 800))'.format(k), setup="from itertools import islice; from __main__ import {0}".format(k), number=100)) for k in candidates.keys()])
pprint.pprint([(k, "{0:.3g}".format(v)) for k,v in OrderedDict(sorted(times.items(), key=lambda t: t[1])).items()])


"""
print("The 10000th permutations of the previous series:")
gens = dict([(k,v(range(52))) for k,v in candidates.items()])
tenthousands = dict([(k, list(itertools.islice(permutations, 10000))[-1]) for k,permutations in gens.items()])
for pair in tenthousands.items():
print(pair[0])
print(pair[1])
"""

它们似乎都产生完全相同的顺序,所以这些集合是不必要的,但这样它的未来证明。我对 Python 3的转换进行了一些实验,并不总是清楚在哪里构建列表,但我尝试了一些替代方案,并选择了最快的方案。

以下是基准测试结果:

% echo "pypy"; pypy all_pairs.py; echo "python2"; python all_pairs.py; echo "python3"; python3 all_pairs.py
pypy
Times for getting all permutations of sets of unordered pairs consisting of two draws from a 6-element deck until it is empty
[('gatoatigrado', '0.0626'),
('adeel_zafar', '0.125'),
('smichr', '0.149'),
('shang', '0.2'),
('tokland', '0.27')]
Times for getting the first 2000 permutations of sets of unordered pairs consisting of two draws from a 52-element deck until it is empty
[('gatoatigrado', '0.29'),
('adeel_zafar', '0.411'),
('smichr', '0.464'),
('shang', '0.493'),
('tokland', '0.553')]
python2
Times for getting all permutations of sets of unordered pairs consisting of two draws from a 6-element deck until it is empty
[('gatoatigrado', '0.344'),
('adeel_zafar', '0.374'),
('smichr', '0.396'),
('shang', '0.495'),
('tokland', '0.675')]
Times for getting the first 2000 permutations of sets of unordered pairs consisting of two draws from a 52-element deck until it is empty
[('adeel_zafar', '0.773'),
('shang', '0.823'),
('smichr', '0.841'),
('tokland', '0.948'),
('gatoatigrado', '1.38')]
python3
Times for getting all permutations of sets of unordered pairs consisting of two draws from a 6-element deck until it is empty
[('gatoatigrado', '0.385'),
('adeel_zafar', '0.419'),
('smichr', '0.433'),
('shang', '0.562'),
('tokland', '0.837')]
Times for getting the first 2000 permutations of sets of unordered pairs consisting of two draws from a 52-element deck until it is empty
[('smichr', '0.783'),
('shang', '0.81'),
('adeel_zafar', '0.835'),
('tokland', '0.969'),
('gatoatigrado', '1.3')]
% pypy --version
Python 2.7.12 (5.6.0+dfsg-0~ppa2~ubuntu16.04, Nov 11 2016, 16:31:26)
[PyPy 5.6.0 with GCC 5.4.0 20160609]
% python3 --version
Python 3.5.2

所以我说,按照加托蒂格拉多的解决方案。

一个非递归函数,用于查找顺序不重要的所有可能的对,即(a,b) = (b,a)

def combinantorial(lst):
count = 0
index = 1
pairs = []
for element1 in lst:
for element2 in lst[index:]:
pairs.append((element1, element2))
index += 1


return pairs

Since it is non-recursive you won't experience memory issues with long lists.

用法示例:

my_list = [1, 2, 3, 4, 5]
print(combinantorial(my_list))
>>>
[(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

希望这能有所帮助:

L = [0,1,2,3,4,5]

[(i,j) for i in L for j in L ]

产出:

[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (5, 0), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5)]

不是最有效或最快的,但可能是最容易的。最后一行是在 python 中推断列表的简单方法。在这种情况下,输出中包含(0,1)和(1,0)这样的对。不知道你会不会考虑那些复制品。

l = [0, 1, 2, 3, 4, 5]
pairs = []
for x in l:
for y in l:
pairs.append((x,y))
pairs = list(set(pairs))
print(pairs)

产出:

[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (5, 0), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5)]

我加入了我自己的贡献,它建立在@shang 和@tokland 提供的伟大解决方案的基础上。我的问题是,在一个12人的小组中,我想看到所有可能的配对,当你的配对大小不能与小组大小完全分开时。例如,对于输入列表大小为12的情况,我希望看到具有5个元素的所有可能的对。

这段代码和小小的修改应该能解决这个问题:

import itertools


def generate_groups(lst, n):
if not lst:
yield []
else:
        

if len(lst) % n == 0:
        

        

for group in (((lst[0],) + xs) for xs in itertools.combinations(lst[1:], n-1)):
for groups in generate_groups([x for x in lst if x not in group], n):
yield [group] + groups
            

else:
            

for group in (((lst[0],) + xs) for xs in itertools.combinations(lst[1:], n-1)):
group2 = [x for x in lst if x not in group]
for grp in (((group2[0],) + xs2) for xs2 in itertools.combinations(group2[1:], n-1)):
yield [group] + [grp]

因此,在这种情况下,如果运行下面的代码片段,就会得到下面的结果。最后一段代码是一个健全性检查,我没有重叠的元素。

i = 0
for x in generate_groups([1,2,3,4,5,6,7,8,9,10,11,12], 5):
print(x)
for elem in x[0]:
if elem in x[1]:
print('break')
break
>>>
[(1, 2, 3, 4, 5), (6, 7, 8, 9, 10)]
[(1, 2, 3, 4, 5), (6, 7, 8, 9, 11)]
[(1, 2, 3, 4, 5), (6, 7, 8, 9, 12)]
[(1, 2, 3, 4, 5), (6, 7, 8, 10, 11)]
[(1, 2, 3, 4, 5), (6, 7, 8, 10, 12)]
[(1, 2, 3, 4, 5), (6, 7, 8, 11, 12)]
[(1, 2, 3, 4, 5), (6, 7, 9, 10, 11)]
...