Find intersection of two nested lists?

I know how to get the intersection of two flat lists:

b1 = [1,2,3,4,5,9,11,15]
b2 = [4,5,6,7,8]
b3 = [val for val in b1 if val in b2]

or

def intersect(a, b):
    return list(set(a) & set(b))

print(intersect(b1, b2))

But when I have to find the intersection for nested lists, my problems start:

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

In the end I would like to receive:

c3 = [[13,32],[7,13,28],[1,6]]

Can you guys give me a hand with this?


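Do you consider [1,2] to intersect with [1, [2]]? That is, is it only the numbers you care about, or the list structure as well?

If it is only the numbers, investigate how to "flatten" the lists, then use the set() method.

You should flatten using this code (taken from http://kogs-www.informatik.uni-hamburg.de/~meine/python_tricks). The code is untested, but I'm pretty sure it works: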


def flatten(x):
    """flatten(sequence) -> list

    Returns a single, flat list which contains all elements retrieved
    from the sequence and all recursively contained sub-sequences
    (iterables).

    Examples:
    >>> [1, 2, [3,4], (5,6)]
    [1, 2, [3, 4], (5, 6)]
    >>> flatten([[[1,2,3], (42,None)], [4,5], [6], 7, MyVector(8,9,10)])
    [1, 2, 3, 42, None, 4, 5, 6, 7, 8, 9, 10]"""

    result = []
    for el in x:
        #if isinstance(el, (list, tuple)):
        # basestring exists only in Python 2; on Python 3 test against (str, bytes)
        if hasattr(el, "__iter__") and not isinstance(el, basestring):
            result.extend(flatten(el))
        else:
            result.append(el)
    return result

After you have flattened the lists, you perform the intersection in the usual way:


c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]


def intersect(a, b):
    return list(set(a) & set(b))


print(intersect(flatten(c1), flatten(c2)))
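For readers on Python 3, where basestring no longer exists, the same flatten-then-intersect idea might look roughly like the sketch below. The name flatten_py3 is made up for this illustration, and it assumes str and bytes should be treated as single values rather than iterated.

```python
from collections.abc import Iterable


def flatten_py3(items):
    """Yield the leaves of an arbitrarily nested iterable, depth first."""
    for el in items:
        # Strings are iterable too, so treat them as single values.
        if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
            yield from flatten_py3(el)
        else:
            yield el


c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

# Sets are unordered, so sort to get a stable result.
common = sorted(set(flatten_py3(c1)) & set(flatten_py3(c2)))
print(common)  # [1, 6, 7, 13, 28, 32]
```

Note that this discards the nesting of c2, so it answers the "only the numbers" reading of the question, not the per-sublist one.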


If you want:

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
c3 = [[13, 32], [7, 13, 28], [1,6]]

Here is your solution for Python 2:

c3 = [filter(lambda x: x in c1, sublist) for sublist in c2]

In Python 3, filter returns an iterable instead of a list, so you need to wrap the filter calls with list():

c3 = [list(filter(lambda x: x in c1, sublist)) for sublist in c2]

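Explanation:

The filter part takes each sublist's items and checks whether each one is in the source list c1. The list comprehension is executed for each sublist in c2.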

You don't need to define intersection. It's already a first-class part of set.

>>> b1 = [1,2,3,4,5,9,11,15]
>>> b2 = [4,5,6,7,8]
>>> set(b1).intersection(b2)
set([4, 5])

Pure list comprehension version

>>> c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
>>> c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
>>> c1set = frozenset(c1)

Flat variant:

>>> [n for lst in c2 for n in lst if n in c1set]
[13, 32, 7, 13, 28, 1, 6]

Nested variant:

>>> [[n for n in lst if n in c1set] for lst in c2]
[[13, 32], [7, 13, 28], [1, 6]]

A functional approach:

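from functools import reduce  # built in on Python 2; required on Python 3

input_list = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]]

result = reduce(set.intersection, map(set, input_list))

and it can be applied to the more general case of one or more lists.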
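As a rough sketch of that generalisation on Python 3: the helper name intersect_all is invented here, and the empty-input guard is an added assumption, since reduce without an initializer raises TypeError on an empty sequence.

```python
from functools import reduce


def intersect_all(lists):
    """Return the sorted values common to every list in `lists`."""
    if not lists:
        return []
    # map(set, ...) turns each list into a set; reduce folds them pairwise.
    return sorted(reduce(set.intersection, map(set, lists)))


input_list = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]]
print(intersect_all(input_list))   # [3, 4, 5]
print(intersect_all([[1, 2, 3]]))  # [1, 2, 3]
```

This intersects all lists with each other, which is a different task from intersecting c1 with each sublist of c2 separately.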

For people just looking to find the intersection of two lists, the asker provided two methods:

b1 = [1,2,3,4,5,9,11,15]
b2 = [4,5,6,7,8]
b3 = [val for val in b1 if val in b2]

and

def intersect(a, b):
    return list(set(a) & set(b))


print(intersect(b1, b2))

But there is a hybrid method that is more efficient, because you only have to do one conversion between list and set, as opposed to three:

b1 = [1,2,3,4,5]
b2 = [3,4,5,6]
s2 = set(b2)
b3 = [val for val in b1 if val in s2]

This runs in O(n) on average, whereas the asker's original list comprehension, which scans b2 for every element of b1, runs in O(n²).
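To see the difference for yourself, something like the following sketch could be used. The function names and list sizes are arbitrary choices for illustration, and the printed timings will vary from machine to machine.

```python
import timeit

b1 = list(range(2000))
b2 = list(range(1000, 3000))


def with_list(a, b):
    # `val in b` scans the list, so each lookup is O(len(b)).
    return [val for val in a if val in b]


def with_set(a, b):
    s = set(b)  # one conversion, then O(1) average-time lookups
    return [val for val in a if val in s]


# Both variants keep the order of the first list and agree on the result.
assert with_list(b1, b2) == with_set(b1, b2)

print("list lookup:", timeit.timeit(lambda: with_list(b1, b2), number=3))
print("set lookup: ", timeit.timeit(lambda: with_set(b1, b2), number=3))
```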

Given that intersect is defined, a basic list comprehension is enough:

>>> c3 = [intersect(c1, i) for i in c2]
>>> c3
[[32, 13], [28, 13, 7], [1, 6]]

Thanks to S. Lott's remark and TM.'s associated remark:

>>> c3 = [list(set(c1).intersection(i)) for i in c2]
>>> c3
[[32, 13], [28, 13, 7], [1, 6]]

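I don't know if I am late in answering your question. After reading it I came up with a function intersect() that works on both flat lists and nested lists. I used recursion to define this function, which is very intuitive. Hope it is what you are looking for:

def intersect(a, b):
    result = []
    for i in b:
        if isinstance(i, list):
            result.append(intersect(a, i))
        else:
            if i in a:
                result.append(i)
    return result

Example: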

>>> c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
>>> c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
>>> print(intersect(c1,c2))
[[13, 32], [7, 13, 28], [1, 6]]


>>> b1 = [1,2,3,4,5,9,11,15]
>>> b2 = [4,5,6,7,8]
>>> print(intersect(b1,b2))
[4, 5]
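The same recursion also handles deeper nesting. Below is a possible variant that builds a frozenset once so that each membership test is cheap; the name nested_intersect and the sample data are invented for this sketch, and it assumes every value in the first list is hashable.

```python
def nested_intersect(a, b):
    """Keep the values of `b` that occur in `a`, preserving b's nesting."""
    # Build the lookup set once; recursive calls pass it straight through.
    lookup = a if isinstance(a, (set, frozenset)) else frozenset(a)
    result = []
    for item in b:
        if isinstance(item, list):
            result.append(nested_intersect(lookup, item))
        elif item in lookup:
            result.append(item)
    return result


c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
deep = [[13, [17, 32]], [7, [[28], 11]], 1, 5]
print(nested_intersect(c1, deep))  # [[13, [32]], [7, [[28]]], 1]
```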

Given:

> c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]


> c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

I find the following code works well, and it is perhaps more concise when using set operations:

> c3 = [list(set(f)&set(c1)) for f in c2]

It gives:

> [[32, 13], [28, 13, 7], [1, 6]]

If order is needed:

> c3 = [sorted(list(set(f)&set(c1))) for f in c2]

we get:

> [[13, 32], [7, 13, 28], [1, 6]]

By the way, for a more Pythonic style, this one works too:

> c3 = [[i for i in set(f) if i in c1] for f in c2]

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]


c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]


c3 = [list(set(c2[i]).intersection(set(c1))) for i in range(len(c2))]


c3
->[[32, 13], [28, 13, 7], [1, 6]]

We can use set methods for this:

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]


result = []
for li in c2:
    res = set(li) & set(c1)
    result.append(list(res))


print(result)

I was also looking for a way to do this, and eventually ended up with this:

def compareLists(a,b):
    removed = [x for x in a if x not in b]
    added = [x for x in b if x not in a]
    overlap = [x for x in a if x in b]
    return [removed,added,overlap]
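A quick usage sketch, with the function restated so the snippet runs on its own. The sample lists old and new are made up for illustration; for long inputs one could test membership against set(a) and set(b) instead of the lists.

```python
def compareLists(a, b):
    removed = [x for x in a if x not in b]  # only in a
    added = [x for x in b if x not in a]    # only in b
    overlap = [x for x in a if x in b]      # in both, in a's order
    return [removed, added, overlap]


old = [1, 6, 7, 10, 13]
new = [7, 11, 13, 14, 28]
removed, added, overlap = compareLists(old, new)
print(removed)  # [1, 6, 10]
print(added)    # [11, 14, 28]
print(overlap)  # [7, 13]
```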

The & operator takes the intersection of two sets.

{1, 2, 3} & {2, 3, 4}
Out[1]: {2, 3}

A Pythonic way to get the intersection of two lists is:

[x for x in list1 if x in list2]

To define an intersection that correctly takes the cardinality of the elements into account, use Counter:

from collections import Counter


>>> c1 = [1, 2, 2, 3, 4, 4, 4]
>>> c2 = [1, 2, 4, 4, 4, 4, 5]
>>> list((Counter(c1) & Counter(c2)).elements())
[1, 2, 4, 4, 4]
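Counter's & keeps each element with the smaller of its two counts, which is why 4 appears three times above. A small sketch that makes the counts visible (the variable name common is only illustrative):

```python
from collections import Counter

c1 = [1, 2, 2, 3, 4, 4, 4]
c2 = [1, 2, 4, 4, 4, 4, 5]

# Multiset intersection: min(count in c1, count in c2) for each element.
common = Counter(c1) & Counter(c2)
print(sorted(common.items()))     # [(1, 1), (2, 1), (4, 3)]
print(sorted(common.elements()))  # [1, 2, 4, 4, 4]
```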

# Problem:  Given c1 and c2:
c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
# how do you get c3 to be [[13, 32], [7, 13, 28], [1, 6]] ?

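Here's one way to set c3 that doesn't involve sets:

c3 = []
for sublist in c2:
    c3.append([val for val in c1 if val in sublist])

But if you prefer to use just one line, you can do this:

c3 = [[val for val in c1 if val in sublist] for sublist in c2]

It's a list comprehension inside a list comprehension, which is a little unusual, but I think you shouldn't have too much trouble following it.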

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
c3 = [list(set(i) & set(c1)) for i in c2]
c3
[[32, 13], [28, 13, 7], [1, 6]]

For me this is a very elegant and quick way to do it.

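The list of intersections can also be built with reduce.

All you need is the initializer, the third argument of the reduce function:

reduce(
    lambda result, _list: result.append(
        list(set(_list) & set(c1))
    ) or result,
    c2,
    [])

The code above works on both Python 2 and Python 3, but on Python 3 you need to import reduce with from functools import reduce.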
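If the `append(...) or result` trick in the lambda feels opaque, the same fold can be written with a named helper. This is only a sketch of one possible spelling; add_intersection is a made-up name, and sorting each result is an extra choice to make the output order predictable.

```python
from functools import reduce

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
c1_set = set(c1)  # convert once instead of once per sublist


def add_intersection(result, sublist):
    # The accumulator starts as the initializer [] and grows by one entry per sublist.
    result.append(sorted(c1_set.intersection(sublist)))
    return result


c3 = reduce(add_intersection, c2, [])
print(c3)  # [[13, 32], [7, 13, 28], [1, 6]]
```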

A simple way to find the difference and intersection between iterables.

Use this method if repetition matters:

from collections import Counter


def intersection(a, b):
    """
    Find the intersection of two iterables

    >>> intersection((1,2,3), (2,3,4))
    (2, 3)

    >>> intersection((1,2,3,3), (2,3,3,4))
    (2, 3, 3)

    >>> intersection((1,2,3,3), (2,3,4,4))
    (2, 3)
    """
    # Counter & Counter keeps the minimum count of each element
    return tuple(n for n, count in (Counter(a) & Counter(b)).items() for _ in range(count))


def difference(a, b):
    """
    Find the symmetric difference of two iterables

    >>> difference((1,2,3), (2,3,4))
    (1, 4)

    >>> difference((1,2,3,3), (2,3,4))
    (1, 3, 4)

    >>> difference((1,2,3,3), (2,3,4,4))
    (1, 3, 4, 4)
    """
    diff = lambda x, y: tuple(n for n, count in (Counter(x) - Counter(y)).items() for _ in range(count))
    return diff(a, b) + diff(b, a)

from random import sample


a = sample(range(0, 1000), 100)
b = sample(range(0, 1000), 100)
print(a)
print(b)
print(set(a).intersection(set(b)))