我需要合并多个字典,这是我有例如:
dict1 = {1:{"a":{A}}, 2:{"b":{B}}}
dict2 = {2:{"c":{C}}, 3:{"d":{D}}}
A、B、C和D是树的叶子,比如{"info1":"value", "info2":"value2"}
字典的级别(深度)未知,可能是{2:{"c":{"z":{"y":{c}}}}}
在我的例子中,它表示一个目录/文件结构,节点是文档,叶子是文件。
我想将它们合并得到:
dict3 = {1:{"a":{A}}, 2:{"b":{B},"c":{C}}, 3:{"d":{D}}}
我不确定如何用Python轻松做到这一点。
还有一个轻微的变化:
下面是一个纯粹的基于python3集的深度更新函数。它通过一次循环遍历一层来更新嵌套字典,并调用自己来更新下一层的字典值:
def deep_update(dict_original, dict_update):
if isinstance(dict_original, dict) and isinstance(dict_update, dict):
output=dict(dict_original)
keys_original=set(dict_original.keys())
keys_update=set(dict_update.keys())
similar_keys=keys_original.intersection(keys_update)
similar_dict={key:deep_update(dict_original[key], dict_update[key]) for key in similar_keys}
new_keys=keys_update.difference(keys_original)
new_dict={key:dict_update[key] for key in new_keys}
output.update(similar_dict)
output.update(new_dict)
return output
else:
return dict_update
举个简单的例子:
x={'a':{'b':{'c':1, 'd':1}}}
y={'a':{'b':{'d':2, 'e':2}}, 'f':2}
print(deep_update(x, y))
>>> {'a': {'b': {'c': 1, 'd': 2, 'e': 2}}, 'f': 2}
如果有人想要另一种方法来解决这个问题,这是我的解决方案。
优点:简洁、声明性和函数式风格(递归,没有突变)。
潜在缺点:这可能不是你想要的合并。查阅文档字符串以了解语义。
def deep_merge(a, b):
"""
Merge two values, with `b` taking precedence over `a`.
Semantics:
- If either `a` or `b` is not a dictionary, `a` will be returned only if
`b` is `None`. Otherwise `b` will be returned.
- If both values are dictionaries, they are merged as follows:
* Each key that is found only in `a` or only in `b` will be included in
the output collection with its value intact.
* For any key in common between `a` and `b`, the corresponding values
will be merged with the same semantics.
"""
if not isinstance(a, dict) or not isinstance(b, dict):
return a if b is None else b
else:
# If we're here, both a and b must be dictionaries or subtypes thereof.
# Compute set of all keys in both dictionaries.
keys = set(a.keys()) | set(b.keys())
# Build output dictionary, merging recursively values with common keys,
# where `None` is used to mean the absence of a value.
return {
key: deep_merge(a.get(key), b.get(key))
for key in keys
}
我一直在测试你的解决方案,并决定在我的项目中使用这个:
def mergedicts(dict1, dict2, conflict, no_conflict):
for k in set(dict1.keys()).union(dict2.keys()):
if k in dict1 and k in dict2:
yield (k, conflict(dict1[k], dict2[k]))
elif k in dict1:
yield (k, no_conflict(dict1[k]))
else:
yield (k, no_conflict(dict2[k]))
dict1 = {1:{"a":"A"}, 2:{"b":"B"}}
dict2 = {2:{"c":"C"}, 3:{"d":"D"}}
#this helper function allows for recursion and the use of reduce
def f2(x, y):
return dict(mergedicts(x, y, f2, lambda x: x))
print dict(mergedicts(dict1, dict2, f2, lambda x: x))
print dict(reduce(f2, [dict1, dict2]))
将函数作为参数传递是将jterrace解决方案扩展为所有其他递归解决方案的关键。
当然,代码将取决于您解决合并冲突的规则。这里有一个版本,它可以接受任意数量的参数,并递归地将它们合并到任意深度,而不使用任何对象突变。它使用以下规则来解决合并冲突:
字典优先于非字典值({"foo":{…}}优先于{"foo": "bar"})
后面的参数优先于前面的参数(如果按顺序合并{"a": 1}, {"a", 2}和{"a": 3},结果将是{"a": 3})
try:
from collections import Mapping
except ImportError:
Mapping = dict
def merge_dicts(*dicts):
"""
Return a new dictionary that is the result of merging the arguments together.
In case of conflicts, later arguments take precedence over earlier arguments.
"""
updated = {}
# grab all keys
keys = set()
for d in dicts:
keys = keys.union(set(d))
for key in keys:
values = [d[key] for d in dicts if key in d]
# which ones are mapping types? (aka dict)
maps = [value for value in values if isinstance(value, Mapping)]
if maps:
# if we have any mapping types, call recursively to merge them
updated[key] = merge_dicts(*maps)
else:
# otherwise, just grab the last value we have, since later arguments
# take precedence over earlier arguments
updated[key] = values[-1]
return updated
这个简单的递归过程将一个字典合并到另一个字典,同时覆盖冲突的键:
#!/usr/bin/env python2.7
def merge_dicts(dict1, dict2):
""" Recursively merges dict2 into dict1 """
if not isinstance(dict1, dict) or not isinstance(dict2, dict):
return dict2
for k in dict2:
if k in dict1:
dict1[k] = merge_dicts(dict1[k], dict2[k])
else:
dict1[k] = dict2[k]
return dict1
print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {2:{"c":"C"}, 3:{"d":"D"}}))
print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {1:{"a":"A"}, 2:{"b":"C"}}))
输出:
{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}
{1: {'a': 'A'}, 2: {'b': 'C'}}