如何合并字典的字典?

我需要合并多个字典,这是我有例如:

dict1 = {1:{"a":{A}}, 2:{"b":{B}}}


dict2 = {2:{"c":{C}}, 3:{"d":{D}}

A B CD是树的叶子,就像{"info1":"value", "info2":"value2"}

字典的级别(深度)未知,可能是{2:{"c":{"z":{"y":{C}}}}}

在我的例子中,它表示一个目录/文件结构,节点是文档,叶子是文件。

我想将它们合并得到:

 dict3 = {1:{"a":{A}}, 2:{"b":{B},"c":{C}}, 3:{"d":{D}}}

我不确定如何用Python轻松做到这一点。

119209 次浏览

这应该有助于将所有项目从dict2合并到dict1:

for item in dict2:
if item in dict1:
for leaf in dict2[item]:
dict1[item][leaf] = dict2[item][leaf]
else:
dict1[item] = dict2[item]

请测试一下,告诉我们这是否是你想要的。

编辑:

上述解决方案只合并了一个级别,但正确地解决了op给出的例子。如果合并多个级别,应该使用递归。

这实际上是相当棘手的-特别是如果你想要一个有用的错误消息时,事情是不一致的,同时正确地接受重复但一致的条目(这是这里没有其他答案做的..)。

假设你没有大量的条目,递归函数是最简单的:

from functools import reduce


def merge(a, b, path=None):
"merges b into a"
if path is None: path = []
for key in b:
if key in a:
if isinstance(a[key], dict) and isinstance(b[key], dict):
merge(a[key], b[key], path + [str(key)])
elif a[key] == b[key]:
pass # same leaf value
else:
raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
else:
a[key] = b[key]
return a


# works
print(merge({1:{"a":"A"},2:{"b":"B"}}, {2:{"c":"C"},3:{"d":"D"}}))
# has conflict
merge({1:{"a":"A"},2:{"b":"B"}}, {1:{"a":"A"},2:{"b":"C"}})

注意,这会改变a——b的内容被添加到a(它也会返回)。如果你想保留a,你可以像merge(dict(a), b)那样调用它。

Agf指出(下面),你可能有两个以上的字典,在这种情况下,你可以使用:

reduce(merge, [dict1, dict2, dict3...])

其中所有内容将被添加到dict1

请注意:我编辑了我的初始答案以改变第一个参数;这使得“reduce”;更容易解释

如果你有一个未知级别的字典,那么我会建议一个递归函数:

def combineDicts(dictionary1, dictionary2):
output = {}
for item, value in dictionary1.iteritems():
if dictionary2.has_key(item):
if isinstance(dictionary2[item], dict):
output[item] = combineDicts(value, dictionary2.pop(item))
else:
output[item] = value
for item, value in dictionary2.iteritems():
output[item] = value
return output

这里有一个使用生成器的简单方法:

def mergedicts(dict1, dict2):
for k in set(dict1.keys()).union(dict2.keys()):
if k in dict1 and k in dict2:
if isinstance(dict1[k], dict) and isinstance(dict2[k], dict):
yield (k, dict(mergedicts(dict1[k], dict2[k])))
else:
# If one of the values is not a dict, you can't continue merging it.
# Value from second dict overrides one in first and we move on.
yield (k, dict2[k])
# Alternatively, replace this with exception raiser to alert you of value conflicts
elif k in dict1:
yield (k, dict1[k])
else:
yield (k, dict2[k])


dict1 = {1:{"a":"A"},2:{"b":"B"}}
dict2 = {2:{"c":"C"},3:{"d":"D"}}


print dict(mergedicts(dict1,dict2))

这个打印:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}

当然,代码将取决于您解决合并冲突的规则。这里有一个版本,它可以接受任意数量的参数,并递归地将它们合并到任意深度,而不使用任何对象突变。它使用以下规则来解决合并冲突:

  • 字典优先于非字典值({"foo": {...}}优先于{"foo": "bar"})
  • 后面的参数优先于前面的参数(如果你将{"a": 1}{"a", 2}{"a": 3}按顺序合并,结果将是{"a": 3})
try:
from collections import Mapping
except ImportError:
Mapping = dict


def merge_dicts(*dicts):
"""
Return a new dictionary that is the result of merging the arguments together.
In case of conflicts, later arguments take precedence over earlier arguments.
"""
updated = {}
# grab all keys
keys = set()
for d in dicts:
keys = keys.union(set(d))


for key in keys:
values = [d[key] for d in dicts if key in d]
# which ones are mapping types? (aka dict)
maps = [value for value in values if isinstance(value, Mapping)]
if maps:
# if we have any mapping types, call recursively to merge them
updated[key] = merge_dicts(*maps)
else:
# otherwise, just grab the last value we have, since later arguments
# take precedence over earlier arguments
updated[key] = values[-1]
return updated

我一直在测试你的解决方案,并决定在我的项目中使用这个:

def mergedicts(dict1, dict2, conflict, no_conflict):
for k in set(dict1.keys()).union(dict2.keys()):
if k in dict1 and k in dict2:
yield (k, conflict(dict1[k], dict2[k]))
elif k in dict1:
yield (k, no_conflict(dict1[k]))
else:
yield (k, no_conflict(dict2[k]))


dict1 = {1:{"a":"A"}, 2:{"b":"B"}}
dict2 = {2:{"c":"C"}, 3:{"d":"D"}}


#this helper function allows for recursion and the use of reduce
def f2(x, y):
return dict(mergedicts(x, y, f2, lambda x: x))


print dict(mergedicts(dict1, dict2, f2, lambda x: x))
print dict(reduce(f2, [dict1, dict2]))

将函数作为参数传递是将jterrace解决方案扩展为所有其他递归解决方案的关键。

这个问题的一个问题是字典的值可以是任意复杂的数据块。基于这些和其他答案,我得出了以下代码:

class YamlReaderError(Exception):
pass


def data_merge(a, b):
"""merges b into a and return merged result


NOTE: tuples and arbitrary objects are not handled as it is totally ambiguous what should happen"""
key = None
# ## debug output
# sys.stderr.write("DEBUG: %s to %s\n" %(b,a))
try:
if a is None or isinstance(a, str) or isinstance(a, unicode) or isinstance(a, int) or isinstance(a, long) or isinstance(a, float):
# border case for first run or if a is a primitive
a = b
elif isinstance(a, list):
# lists can be only appended
if isinstance(b, list):
# merge lists
a.extend(b)
else:
# append to list
a.append(b)
elif isinstance(a, dict):
# dicts must be merged
if isinstance(b, dict):
for key in b:
if key in a:
a[key] = data_merge(a[key], b[key])
else:
a[key] = b[key]
else:
raise YamlReaderError('Cannot merge non-dict "%s" into dict "%s"' % (b, a))
else:
raise YamlReaderError('NOT IMPLEMENTED "%s" into "%s"' % (b, a))
except TypeError, e:
raise YamlReaderError('TypeError "%s" in key "%s" when merging "%s" into "%s"' % (e, key, b, a))
return a

我的用例是合并YAML文件,其中我只需要处理可能的数据类型的子集。因此我可以忽略元组和其他对象。对我来说,合理的合并逻辑意味着

  • 取代标量
  • 添加列表
  • 通过添加缺失键和更新现有键来合并字典

其他任何事情和不可预见的事情都会导致错误。

这个版本的函数将处理N个字典,并且只处理字典——不能传递不恰当的参数,否则将引发TypeError。合并本身解释了键冲突,它不是覆盖来自合并链下的字典的数据,而是创建一组值并追加到该值;没有数据丢失。

它可能不是页面上最有效的,但它是最彻底的,当你合并2到N字典时,你不会丢失任何信息。

def merge_dicts(*dicts):
if not reduce(lambda x, y: isinstance(y, dict) and x, dicts, True):
raise TypeError, "Object in *dicts not of type dict"
if len(dicts) < 2:
raise ValueError, "Requires 2 or more dict objects"




def merge(a, b):
for d in set(a.keys()).union(b.keys()):
if d in a and d in b:
if type(a[d]) == type(b[d]):
if not isinstance(a[d], dict):
ret = list({a[d], b[d]})
if len(ret) == 1: ret = ret[0]
yield (d, sorted(ret))
else:
yield (d, dict(merge(a[d], b[d])))
else:
raise TypeError, "Conflicting key:value type assignment"
elif d in a:
yield (d, a[d])
elif d in b:
yield (d, b[d])
else:
raise KeyError


return reduce(lambda x, y: dict(merge(x, y)), dicts[1:], dicts[0])


print merge_dicts({1:1,2:{1:2}},{1:2,2:{3:1}},{4:4})

输出:{1:[1,2],2:{1:2,3:1},4:4}

我能想到的最简单的方法是:

#!/usr/bin/python


from copy import deepcopy
def dict_merge(a, b):
if not isinstance(b, dict):
return b
result = deepcopy(a)
for k, v in b.iteritems():
if k in result and isinstance(result[k], dict):
result[k] = dict_merge(result[k], v)
else:
result[k] = deepcopy(v)
return result


a = {1:{"a":'A'}, 2:{"b":'B'}}
b = {2:{"c":'C'}, 3:{"d":'D'}}


print dict_merge(a,b)

输出:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}

字典的字典合并

由于这是一个规范的问题(尽管存在某些非泛化性),所以我提供了规范的python方法来解决这个问题。

最简单的例子:“leaves是嵌套的dicts,以空dicts结尾”:

d1 = {'a': {1: {'foo': {}}, 2: {}}}
d2 = {'a': {1: {}, 2: {'bar': {}}}}
d3 = {'b': {3: {'baz': {}}}}
d4 = {'a': {1: {'quux': {}}}}

这是递归最简单的情况,我推荐两种简单的方法:

def rec_merge1(d1, d2):
'''return new merged dict of dicts'''
for k, v in d1.items(): # in Python 2, use .iteritems()!
if k in d2:
d2[k] = rec_merge1(v, d2[k])
d3 = d1.copy()
d3.update(d2)
return d3


def rec_merge2(d1, d2):
'''update first dict with second recursively'''
for k, v in d1.items(): # in Python 2, use .iteritems()!
if k in d2:
d2[k] = rec_merge2(v, d2[k])
d1.update(d2)
return d1

我相信我更喜欢第二个,而不是第一个,但请记住,第一个的原始状态必须从它的起源重建。用法如下:

>>> from functools import reduce # only required for Python 3.
>>> reduce(rec_merge1, (d1, d2, d3, d4))
{'a': {1: {'quux': {}, 'foo': {}}, 2: {'bar': {}}}, 'b': {3: {'baz': {}}}}
>>> reduce(rec_merge2, (d1, d2, d3, d4))
{'a': {1: {'quux': {}, 'foo': {}}, 2: {'bar': {}}}, 'b': {3: {'baz': {}}}}

复杂情况:“叶子是任何其他类型的”;

所以如果它们以字典结尾,这是一个简单的合并结尾空字典的例子。如果不是,也不是那么微不足道。如果是字符串,怎么合并?集合也可以类似地更新,所以我们可以这样处理,但我们失去了它们合并的顺序。那么顺序重要吗?

因此,代替更多信息,最简单的方法是给它们一个标准的更新处理,如果两个值都不是dict:即第二个dict的值将覆盖第一个dict,即使第二个dict的值是None,而第一个dict的值是一个包含大量信息的dict。

d1 = {'a': {1: 'foo', 2: None}}
d2 = {'a': {1: None, 2: 'bar'}}
d3 = {'b': {3: 'baz'}}
d4 = {'a': {1: 'quux'}}


from collections.abc import MutableMapping


def rec_merge(d1, d2):
'''
Update two dicts of dicts recursively,
if either mapping has leaves that are non-dicts,
the second's leaf overwrites the first's.
'''
for k, v in d1.items():
if k in d2:
# this next check is the only difference!
if all(isinstance(e, MutableMapping) for e in (v, d2[k])):
d2[k] = rec_merge(v, d2[k])
# we could further check types and merge as appropriate here.
d3 = d1.copy()
d3.update(d2)
return d3

现在

from functools import reduce
reduce(rec_merge, (d1, d2, d3, d4))

返回

{'a': {1: 'quux', 2: 'bar'}, 'b': {3: 'baz'}}

适用于原问题:

我不得不删除字母周围的花括号,并将它们放在单引号中,以使其成为合法的Python(否则它们将在Python 2.7+中设置字面量),并附加一个缺少的大括号:

dict1 = {1:{"a":'A'}, 2:{"b":'B'}}
dict2 = {2:{"c":'C'}, 3:{"d":'D'}}

rec_merge(dict1, dict2)现在返回:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}

匹配原始问题的期望结果(在改变后,例如{A}'A'。)

这个简单的递归过程将一个字典合并到另一个字典,同时覆盖冲突的键:

#!/usr/bin/env python2.7


def merge_dicts(dict1, dict2):
""" Recursively merges dict2 into dict1 """
if not isinstance(dict1, dict) or not isinstance(dict2, dict):
return dict2
for k in dict2:
if k in dict1:
dict1[k] = merge_dicts(dict1[k], dict2[k])
else:
dict1[k] = dict2[k]
return dict1


print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {2:{"c":"C"}, 3:{"d":"D"}}))
print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {1:{"a":"A"}, 2:{"b":"C"}}))

输出:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}
{1: {'a': 'A'}, 2: {'b': 'C'}}

基于@andrew cooke。这个版本处理字典的嵌套列表,还允许选项更新值

def merge(a, b, path=None, update=True):
"http://stackoverflow.com/questions/7204805/python-dictionaries-of-dictionaries-merge"
"merges b into a"
if path is None: path = []
for key in b:
if key in a:
if isinstance(a[key], dict) and isinstance(b[key], dict):
merge(a[key], b[key], path + [str(key)])
elif a[key] == b[key]:
pass # same leaf value
elif isinstance(a[key], list) and isinstance(b[key], list):
for idx, val in enumerate(b[key]):
a[key][idx] = merge(a[key][idx], b[key][idx], path + [str(key), str(idx)], update=update)
elif update:
a[key] = b[key]
else:
raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
else:
a[key] = b[key]
return a

这里我有另一个稍微不同的解决方案:

def deepMerge(d1, d2, inconflict = lambda v1,v2 : v2) :
''' merge d2 into d1. using inconflict function to resolve the leaf conflicts '''
for k in d2:
if k in d1 :
if isinstance(d1[k], dict) and isinstance(d2[k], dict) :
deepMerge(d1[k], d2[k], inconflict)
elif d1[k] != d2[k] :
d1[k] = inconflict(d1[k], d2[k])
else :
d1[k] = d2[k]
return d1

默认情况下,它解决冲突,支持来自第二个字典的值,但您可以很容易地覆盖这一点,使用一些巫术,您甚至可以抛出异常。:)。

andrew cookes的回答有一个小问题:在某些情况下,当你修改返回的dict时,它会修改第二个参数b。具体来说是因为这句话:

if key in a:
...
else:
a[key] = b[key]

如果b[key]是一个dict,它将被简单地分配给a,这意味着对dict的任何后续修改将同时影响ab

a={}
b={'1':{'2':'b'}}
c={'1':{'3':'c'}}
merge(merge(a,b), c) # {'1': {'3': 'c', '2': 'b'}}
a # {'1': {'3': 'c', '2': 'b'}} (as expected)
b # {'1': {'3': 'c', '2': 'b'}} <----
c # {'1': {'3': 'c'}} (unmodified)

为了解决这个问题,这一行必须用这个替换:

if isinstance(b[key], dict):
a[key] = clone_dict(b[key])
else:
a[key] = b[key]

其中clone_dict为:

def clone_dict(obj):
clone = {}
for key, value in obj.iteritems():
if isinstance(value, dict):
clone[key] = clone_dict(value)
else:
clone[key] = value
return

不动。这显然没有解释listset和其他东西,但我希望它说明了合并dicts时的陷阱。

为了完整起见,这里是我的版本,你可以传递多个dicts:

def merge_dicts(*args):
def clone_dict(obj):
clone = {}
for key, value in obj.iteritems():
if isinstance(value, dict):
clone[key] = clone_dict(value)
else:
clone[key] = value
return


def merge(a, b, path=[]):
for key in b:
if key in a:
if isinstance(a[key], dict) and isinstance(b[key], dict):
merge(a[key], b[key], path + [str(key)])
elif a[key] == b[key]:
pass
else:
raise Exception('Conflict at `{path}\''.format(path='.'.join(path + [str(key)])))
else:
if isinstance(b[key], dict):
a[key] = clone_dict(b[key])
else:
a[key] = b[key]
return a
return reduce(merge, args, {})

由于dictviews支持集合操作,我能够极大地简化jterrace的答案。

def merge(dict1, dict2):
for k in dict1.keys() - dict2.keys():
yield (k, dict1[k])


for k in dict2.keys() - dict1.keys():
yield (k, dict2[k])


for k in dict1.keys() & dict2.keys():
yield (k, dict(merge(dict1[k], dict2[k])))

任何将dict与非dict(技术上讲,一个带有'keys'方法的对象和一个没有'keys'方法的对象)组合的尝试都会引发AttributeError。这包括对函数的初始调用和递归调用。这正是我想要的,所以我离开了。您可以很容易地捕获递归调用抛出的AttributeErrors,然后生成您想要的任何值。

我有两个字典(ab),每个字典都可以包含任意数量的嵌套字典。我想递归地合并它们,b优先于a

将嵌套字典视为树,我想要的是:

  • 更新a,使b中每个叶的每条路径都表示在a
  • 覆盖a的子树,如果在b的相应路径中找到叶子
    • 保持所有b叶子节点仍然是叶子的不变式。
    • 李< / ul > < / >

    现有的答案对我来说有点复杂,有些细节被束之高阁。我将以下内容整合在一起,它们通过了我的数据集的单元测试。

      def merge_map(a, b):
    if not isinstance(a, dict) or not isinstance(b, dict):
    return b
    
    
    for key in b.keys():
    a[key] = merge_map(a[key], b[key]) if key in a else b[key]
    return a
    

    示例(为清晰起见,已格式化):

     a = {
    1 : {'a': 'red',
    'b': {'blue': 'fish', 'yellow': 'bear' },
    'c': { 'orange': 'dog'},
    },
    2 : {'d': 'green'},
    3: 'e'
    }
    
    
    b = {
    1 : {'b': 'white'},
    2 : {'d': 'black'},
    3: 'e'
    }
    
    
    
    
    >>> merge_map(a, b)
    {1: {'a': 'red',
    'b': 'white',
    'c': {'orange': 'dog'},},
    2: {'d': 'black'},
    3: 'e'}
    

    b中需要维护的路径是:

    • 1 -> 'b' -> 'white'
    • 2 -> 'd' -> 'black'
    • 3 -> 'e'

    a具有独特且不冲突的路径:

    • 1 -> 'a' -> 'red'
    • 1 -> 'c' -> 'orange' -> 'dog'

    所以它们仍然在合并后的映射中表示。

class Utils(object):


"""


>>> a = { 'first' : { 'all_rows' : { 'pass' : 'dog', 'number' : '1' } } }
>>> b = { 'first' : { 'all_rows' : { 'fail' : 'cat', 'number' : '5' } } }
>>> Utils.merge_dict(b, a) == { 'first' : { 'all_rows' : { 'pass' : 'dog', 'fail' : 'cat', 'number' : '5' } } }
True


>>> main = {'a': {'b': {'test': 'bug'}, 'c': 'C'}}
>>> suply = {'a': {'b': 2, 'd': 'D', 'c': {'test': 'bug2'}}}
>>> Utils.merge_dict(main, suply) == {'a': {'b': {'test': 'bug'}, 'c': 'C', 'd': 'D'}}
True


"""


@staticmethod
def merge_dict(main, suply):
"""
获取融合的字典,以main为主,suply补充,冲突时以main为准
:return:
"""
for key, value in suply.items():
if key in main:
if isinstance(main[key], dict):
if isinstance(value, dict):
Utils.merge_dict(main[key], value)
else:
pass
else:
pass
else:
main[key] = value
return main


if __name__ == '__main__':
import doctest
doctest.testmod()

概述

下面的方法将字典的深度合并问题细分为:

  1. 一个参数化的浅归并函数merge(f)(a,b) 函数f合并两个字典ab

  2. merge一起使用的递归归并函数f


实现

合并两个(非嵌套的)字典的函数可以用很多种方式编写。我个人喜欢

def merge(f):
def merge(a,b):
keys = a.keys() | b.keys()
return {key:f(a.get(key), b.get(key)) for key in keys}
return merge

定义适当的递归归并函数f的一个好方法是使用multipledispatch,它允许定义函数根据参数的类型沿不同的路径求值。

from multipledispatch import dispatch


#for anything that is not a dict return
@dispatch(object, object)
def f(a, b):
return b if b is not None else a


#for dicts recurse
@dispatch(dict, dict)
def f(a,b):
return merge(f)(a,b)

例子

要合并两个嵌套字典,只需使用merge(f),例如:

dict1 = {1:{"a":"A"},2:{"b":"B"}}
dict2 = {2:{"c":"C"},3:{"d":"D"}}
merge(f)(dict1, dict2)
#returns {1: {'a': 'A'}, 2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D'}}

注:

这种方法的优点是:

    该函数由较小的函数构建而成,每个函数只做一件事 这使得代码更容易推理和测试

  • 行为不是硬编码的,但可以根据需要进行更改和扩展,从而提高代码重用(参见下面的示例)。


定制

一些答案还考虑了包含列表的字典,例如其他(可能嵌套的)字典。在这种情况下,可能需要映射列表并根据位置合并它们。这可以通过在归并函数f中添加另一个定义来实现:

import itertools
@dispatch(list, list)
def f(a,b):
return [merge(f)(*arg) for arg in itertools.zip_longest(a, b)]

基于@andrew cooke的回答。 它以更好的方式处理嵌套列表

def deep_merge_lists(original, incoming):
"""
Deep merge two lists. Modifies original.
Recursively call deep merge on each correlated element of list.
If item type in both elements are
a. dict: Call deep_merge_dicts on both values.
b. list: Recursively call deep_merge_lists on both values.
c. any other type: Value is overridden.
d. conflicting types: Value is overridden.


If length of incoming list is more that of original then extra values are appended.
"""
common_length = min(len(original), len(incoming))
for idx in range(common_length):
if isinstance(original[idx], dict) and isinstance(incoming[idx], dict):
deep_merge_dicts(original[idx], incoming[idx])


elif isinstance(original[idx], list) and isinstance(incoming[idx], list):
deep_merge_lists(original[idx], incoming[idx])


else:
original[idx] = incoming[idx]


for idx in range(common_length, len(incoming)):
original.append(incoming[idx])




def deep_merge_dicts(original, incoming):
"""
Deep merge two dictionaries. Modifies original.
For key conflicts if both values are:
a. dict: Recursively call deep_merge_dicts on both values.
b. list: Call deep_merge_lists on both values.
c. any other type: Value is overridden.
d. conflicting types: Value is overridden.


"""
for key in incoming:
if key in original:
if isinstance(original[key], dict) and isinstance(incoming[key], dict):
deep_merge_dicts(original[key], incoming[key])


elif isinstance(original[key], list) and isinstance(incoming[key], list):
deep_merge_lists(original[key], incoming[key])


else:
original[key] = incoming[key]
else:
original[key] = incoming[key]

Short-n-sweet:

from collections.abc import MutableMapping as Map


def nested_update(d, v):
"""
Nested update of dict-like 'd' with dict-like 'v'.
"""


for key in v:
if key in d and isinstance(d[key], Map) and isinstance(v[key], Map):
nested_update(d[key], v[key])
else:
d[key] = v[key]

这类似于Python的dict.update方法(并且构建在此基础上)。它返回None(如果你愿意,你总是可以添加return d),因为它在原地更新字典dv中的键将覆盖d中任何现有的键(它不会尝试解释字典的内容)。

它也适用于其他(“类字典”)映射。

嘿,我也有同样的问题,但我想出了一个解决方案,我会把它贴在这里,以防它对其他人也有用,基本上合并嵌套字典和添加值,对我来说,我需要计算一些概率,所以这一个工作得很好:

#used to copy a nested dict to a nested dict
def deepupdate(target, src):
for k, v in src.items():
if k in target:
for k2, v2 in src[k].items():
if k2 in target[k]:
target[k][k2]+=v2
else:
target[k][k2] = v2
else:
target[k] = copy.deepcopy(v)

通过使用上述方法,我们可以合并:

目标={6 6:{“63”:1},“63,4:{4 4:1},4,4:{“4 3”:1},“63”:{63,4:1}}

src ={5 4:{4 4: 1}, 5、5:{“5、4”:1},4,4:{“4 3”:1}}

,这将变成: {', 5 ':{“5、4”:1},“5、4”:{4 4:1},“6 6”:{“63”:1},“63,4:{4 4:1},4,4:{“4 3”:2},“63”:{63,4:1}}< / p >

还要注意这里的变化:

目标={6 6:{“63”:1},“63”:{63,4:1},'4,4': {'4,3': 1},“63,4:{4 4:1}}

src ={5 4:{4 4: 1},“4 3”:{“3、4”:1},'4,4': {'4,9': 1}, 3、4:{4 4:1},5、5:{“5、4”:1}}

合并={5 4:{4 4:1},“4 3”:{“3、4”:1},“63”:{63,4:1},5、5:{“5、4”:1},“6 6”:{“63”:1},3、4:{4 4:1},“63,4:{4 4:1},'4,4': {'4,3': 1, '4,9': 1}}

别忘了还添加导入copy:

import copy

如果有人想要另一个方法来解决这个问题,这是我的解决方案。

美德:短小,声明性,函数式风格(递归,不做突变)。

潜在的缺点:这可能不是你正在寻找的合并。查阅文档字符串以了解语义。

def deep_merge(a, b):
"""
Merge two values, with `b` taking precedence over `a`.


Semantics:
- If either `a` or `b` is not a dictionary, `a` will be returned only if
`b` is `None`. Otherwise `b` will be returned.
- If both values are dictionaries, they are merged as follows:
* Each key that is found only in `a` or only in `b` will be included in
the output collection with its value intact.
* For any key in common between `a` and `b`, the corresponding values
will be merged with the same semantics.
"""
if not isinstance(a, dict) or not isinstance(b, dict):
return a if b is None else b
else:
# If we're here, both a and b must be dictionaries or subtypes thereof.


# Compute set of all keys in both dictionaries.
keys = set(a.keys()) | set(b.keys())


# Build output dictionary, merging recursively values with common keys,
# where `None` is used to mean the absence of a value.
return {
key: deep_merge(a.get(key), b.get(key))
for key in keys
}
from collections import defaultdict
from itertools import chain


class DictHelper:


@staticmethod
def merge_dictionaries(*dictionaries, override=True):
merged_dict = defaultdict(set)
all_unique_keys = set(chain(*[list(dictionary.keys()) for dictionary in dictionaries]))  # Build a set using all dict keys
for key in all_unique_keys:
keys_value_type = list(set(filter(lambda obj_type: obj_type != type(None), [type(dictionary.get(key, None)) for dictionary in dictionaries])))
# Establish the object type for each key, return None if key is not present in dict and remove None from final result
if len(keys_value_type) != 1:
raise Exception("Different objects type for same key: {keys_value_type}".format(keys_value_type=keys_value_type))


if keys_value_type[0] == list:
values = list(chain(*[dictionary.get(key, []) for dictionary in dictionaries]))  # Extract the value for each key
merged_dict[key].update(values)


elif keys_value_type[0] == dict:
# Extract all dictionaries by key and enter in recursion
dicts_to_merge = list(filter(lambda obj: obj != None, [dictionary.get(key, None) for dictionary in dictionaries]))
merged_dict[key] = DictHelper.merge_dictionaries(*dicts_to_merge)


else:
# if override => get value from last dictionary else make a list of all values
values = list(filter(lambda obj: obj != None, [dictionary.get(key, None) for dictionary in dictionaries]))
merged_dict[key] = values[-1] if override else values


return dict(merged_dict)






if __name__ == '__main__':
d1 = {'aaaaaaaaa': ['to short', 'to long'], 'bbbbb': ['to short', 'to long'], "cccccc": ["the is a test"]}
d2 = {'aaaaaaaaa': ['field is not a bool'], 'bbbbb': ['field is not a bool']}
d3 = {'aaaaaaaaa': ['filed is not a string', "to short"], 'bbbbb': ['field is not an integer']}
print(DictHelper.merge_dictionaries(d1, d2, d3))


d4 = {"a": {"x": 1, "y": 2, "z": 3, "d": {"x1": 10}}}
d5 = {"a": {"x": 10, "y": 20, "d": {"x2": 20}}}
print(DictHelper.merge_dictionaries(d4, d5))

输出:

{'bbbbb': {'to long', 'field is not an integer', 'to short', 'field is not a bool'},
'aaaaaaaaa': {'to long', 'to short', 'filed is not a string', 'field is not a bool'},
'cccccc': {'the is a test'}}


{'a': {'y': 20, 'd': {'x1': 10, 'x2': 20}, 'z': 3, 'x': 10}}

你可以试试< >强mergedeep < / >强


安装

$ pip3 install mergedeep

使用

from mergedeep import merge


a = {"keyA": 1}
b = {"keyB": {"sub1": 10}}
c = {"keyB": {"sub2": 20}}


merge(a, b, c)


print(a)
# {"keyA": 1, "keyB": {"sub1": 10, "sub2": 20}}

要获得完整的选项列表,请查看文档!

你可以使用toolz包中的merge函数,例如:

>>> import toolz
>>> dict1 = {1: {'a': 'A'}, 2: {'b': 'B'}}
>>> dict2 = {2: {'c': 'C'}, 3: {'d': 'D'}}
>>> toolz.merge_with(toolz.merge, dict1, dict2)
{1: {'a': 'A'}, 2: {'c': 'C'}, 3: {'d': 'D'}}

下面的函数将b合并为a。

def mergedicts(a, b):
for key in b:
if isinstance(a.get(key), dict) or isinstance(b.get(key), dict):
mergedicts(a[key], b[key])
else:
a[key] = b[key]
return a

还有一个轻微的变化:

下面是一个纯粹的基于python3集的深度更新函数。它通过一次循环遍历一层来更新嵌套字典,并调用自己来更新下一层的字典值:

def deep_update(dict_original, dict_update):
if isinstance(dict_original, dict) and isinstance(dict_update, dict):
output=dict(dict_original)
keys_original=set(dict_original.keys())
keys_update=set(dict_update.keys())
similar_keys=keys_original.intersection(keys_update)
similar_dict={key:deep_update(dict_original[key], dict_update[key]) for key in similar_keys}
new_keys=keys_update.difference(keys_original)
new_dict={key:dict_update[key] for key in new_keys}
output.update(similar_dict)
output.update(new_dict)
return output
else:
return dict_update

举个简单的例子:

x={'a':{'b':{'c':1, 'd':1}}}
y={'a':{'b':{'d':2, 'e':2}}, 'f':2}


print(deep_update(x, y))
>>> {'a': {'b': {'c': 1, 'd': 2, 'e': 2}}, 'f': 2}

换个答案怎么样?!?这也避免了突变/副作用:

def merge(dict1, dict2):
output = {}


# adds keys from `dict1` if they do not exist in `dict2` and vice-versa
intersection = {**dict2, **dict1}


for k_intersect, v_intersect in intersection.items():
if k_intersect not in dict1:
v_dict2 = dict2[k_intersect]
output[k_intersect] = v_dict2


elif k_intersect not in dict2:
output[k_intersect] = v_intersect


elif isinstance(v_intersect, dict):
v_dict2 = dict2[k_intersect]
output[k_intersect] = merge(v_intersect, v_dict2)


else:
output[k_intersect] = v_intersect


return output


dict1 = {1:{"a":{"A"}}, 2:{"b":{"B"}}}
dict2 = {2:{"c":{"C"}}, 3:{"d":{"D"}}}
dict3 = {1:{"a":{"A"}}, 2:{"b":{"B"},"c":{"C"}}, 3:{"d":{"D"}}}


assert dict3 == merge(dict1, dict2)

我有一个迭代的解决方案-工作得更好与大字典&很多(例如jsons等):

import collections




def merge_dict_with_subdicts(dict1: dict, dict2: dict) -> dict:
"""
similar behaviour to builtin dict.update - but knows how to handle nested dicts
"""
q = collections.deque([(dict1, dict2)])
while len(q) > 0:
d1, d2 = q.pop()
for k, v in d2.items():
if k in d1 and isinstance(d1[k], dict) and isinstance(v, dict):
q.append((d1[k], v))
else:
d1[k] = v


return dict1

注意,这将使用d2中的值来覆盖d1,以防它们都不是字典。(与python的dict.update()相同)

一些测试:

def test_deep_update():
d = dict()
merge_dict_with_subdicts(d, {"a": 4})
assert d == {"a": 4}


new_dict = {
"b": {
"c": {
"d": 6
}
}
}
merge_dict_with_subdicts(d, new_dict)
assert d == {
"a": 4,
"b": {
"c": {
"d": 6
}
}
}


new_dict = {
"a": 3,
"b": {
"f": 7
}
}
merge_dict_with_subdicts(d, new_dict)
assert d == {
"a": 3,
"b": {
"c": {
"d": 6
},
"f": 7
}
}


# test a case where one of the dicts has dict as value and the other has something else
new_dict = {
'a': {
'b': 4
}
}
merge_dict_with_subdicts(d, new_dict)
assert d['a']['b'] == 4


我已经测试了大约1200个字典——这种方法花了0.4秒,而递归的解决方案花了2.5秒。

我还没有对此进行广泛的测试,因此,鼓励您的反馈。

from collections import defaultdict


dict1 = defaultdict(list)


dict2= defaultdict(list)


dict3= defaultdict(list)




dict1= dict(zip(Keys[ ],values[ ]))


dict2 = dict(zip(Keys[ ],values[ ]))




def mergeDict(dict1, dict2):


dict3 = {**dict1, **dict2}


for key, value in dict3.items():


if key in dict1 and key in dict2:


dict3[key] = [value , dict1[key]]


return dict3


dict3 = mergeDict(dict1, dict2)


#sort keys alphabetically.


dict3.keys()

合并两个字典并添加公共键的值

在不影响输入字典的情况下返回一个合并。

def _merge_dicts(dictA: Dict = {}, dictB: Dict = {}) -> Dict:
# it suffices to pass as an argument a clone of `dictA`
return _merge_dicts_aux(dictA, dictB, copy(dictA))




def _merge_dicts_aux(dictA: Dict = {}, dictB: Dict = {}, result: Dict = {}, path: List[str] = None) -> Dict:


# conflict path, None if none
if path is None:
path = []


for key in dictB:


# if the key doesn't exist in A, add the B element to A
if key not in dictA:
result[key] = dictB[key]


else:
# if the key value is a dict, both in A and in B, merge the dicts
if isinstance(dictA[key], dict) and isinstance(dictB[key], dict):
_merge_dicts_aux(dictA[key], dictB[key], result[key], path + [str(key)])


# if the key value is the same in A and in B, ignore
elif dictA[key] == dictB[key]:
pass


# if the key value differs in A and in B, raise error
else:
err: str = f"Conflict at {'.'.join(path + [str(key)])}"
raise Exception(err)


return result

灵感来自@andrew库克的解决方案

这是我做的递归合并字典到无限深度的解决方案。传递给函数的第一个字典是主字典——其中的值将覆盖第二个字典中相同键的值。

def merge(dict1: dict, dict2: dict) -> dict:
merged = dict1


for key in dict2:
if type(dict2[key]) == dict:
merged[key] = merge(dict1[key] if key in dict1 else {}, dict2[key])
else:
if key not in dict1.keys():
merged[key] = dict2[key]


return merged


正如在许多其他答案中提到的,递归算法在这里最有意义。一般来说,在使用递归时,最好创建新值,而不是试图修改任何输入数据结构。

我们需要定义在每个合并步骤中发生的事情。如果两个输入都是字典,这很简单:我们从每一边复制唯一键,然后递归合并重复键的值。导致问题的是基本情况。如果我们拿出一个单独的函数,逻辑会更容易理解。作为占位符,我们可以将这两个值包装在一个元组中:

def merge_leaves(x, y):
return (x, y)

现在我们的逻辑核心是这样的:

def merge(x, y):
if not(isinstance(x, dict) and isinstance(y, dict)):
return merge_leaves(x, y)
x_keys, y_keys = x.keys(), y.keys()
result = { k: merge(x[k], y[k]) for k in x_keys & y_keys }
result.update({k: x[k] for k in x_keys - y_keys})
result.update({k: y[k] for k in y_keys - x_keys})
return result

让我们来测试一下:

>>> x = {'a': {'b': 'c', 'd': 'e'}, 'f': 1, 'g': {'h', 'i'}, 'j': None}
>>> y = {'a': {'d': 'e', 'h': 'i'}, 'f': {'b': 'c'}, 'g': 1, 'k': None}
>>> merge(x, y)
{'f': (1, {'b': 'c'}), 'g': ({'h', 'i'}, 1), 'a': {'d': ('e', 'e'), 'b': 'c', 'h': 'i'}, 'j': None, 'k': None}
>>> x # The originals are unmodified.
{'a': {'b': 'c', 'd': 'e'}, 'f': 1, 'g': {'h', 'i'}, 'j': None}
>>> y
{'a': {'d': 'e', 'h': 'i'}, 'f': {'b': 'c'}, 'g': 1, 'k': None}

我们可以很容易地修改叶子归并规则,例如:

def merge_leaves(x, y):
try:
return x + y
except TypeError:
return Ellipsis

并观察效果:

>>> merge(x, y)
{'f': Ellipsis, 'g': Ellipsis, 'a': {'d': 'ee', 'b': 'c', 'h': 'i'}, 'j': None, 'k': None}

我们还可以通过使用第三方库来根据输入的类型进行分派来潜在地清理这个问题。例如,使用multipledispatch,我们可以这样做:

@dispatch(dict, dict)
def merge(x, y):
x_keys, y_keys = x.keys(), y.keys()
result = { k: merge(x[k], y[k]) for k in x_keys & y_keys }
result.update({k: x[k] for k in x_keys - y_keys})
result.update({k: y[k] for k in y_keys - x_keys})
return result


@dispatch(str, str)
def merge(x, y):
return x + y


@dispatch(tuple, tuple)
def merge(x, y):
return x + y


@dispatch(list, list)
def merge(x, y):
return x + y


@dispatch(int, int):
def merge(x, y):
raise ValueError("integer value conflict")


@dispatch(object, object):
return (x, y)

这允许我们在不编写自己的类型检查的情况下处理叶类型特殊情况的各种组合,并在主递归函数中替换类型检查。