Access nested dictionary items via a list of keys

I have a complex dictionary structure which I would like to access via a list of keys to address the correct item.

dataDict = {
    "a": {
        "r": 1,
        "s": 2,
        "t": 3
    },
    "b": {
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
        },
        "w": 3
    }
}


maplist = ["a", "r"]

maplist = ["b", "v", "y"]

I wrote the following code, which works, but I'm sure there is a better and more efficient way to do this, if anyone has an idea.

# Get a given data from a dictionary with position provided as a list
def getFromDict(dataDict, mapList):
    for k in mapList: dataDict = dataDict[k]
    return dataDict


# Set a given data in a dictionary with position provided as a list
def setInDict(dataDict, mapList, value):
    for k in mapList[:-1]: dataDict = dataDict[k]
    dataDict[mapList[-1]] = value

Use reduce() to traverse the dictionary:

from functools import reduce  # forward compatibility for Python 3
import operator


def getFromDict(dataDict, mapList):
    return reduce(operator.getitem, mapList, dataDict)

and reuse getFromDict to find the location where setInDict() should store the value:

def setInDict(dataDict, mapList, value):
    getFromDict(dataDict, mapList[:-1])[mapList[-1]] = value

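All but the last element in mapList is needed to find the "parent" dictionary to add the value to; the last element is then used to set the value under the right key.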

Demo:

>>> getFromDict(dataDict, ["a", "r"])
1
>>> getFromDict(dataDict, ["b", "v", "y"])
2
>>> setInDict(dataDict, ["b", "v", "w"], 4)
>>> import pprint
>>> pprint.pprint(dataDict)
{'a': {'r': 1, 's': 2, 't': 3},
'b': {'u': 1, 'v': {'w': 4, 'x': 1, 'y': 2, 'z': 3}, 'w': 3}}

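Note that the Python PEP 8 style guide prescribes snake_case names for functions. The above works equally well for lists or a mix of dictionaries and lists, so the names should really be get_by_path() and set_by_path():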

from functools import reduce  # forward compatibility for Python 3
import operator


def get_by_path(root, items):
    """Access a nested object in root by item sequence."""
    return reduce(operator.getitem, items, root)


def set_by_path(root, items, value):
    """Set a value in a nested object in root by item sequence."""
    get_by_path(root, items[:-1])[items[-1]] = value

And for completeness' sake, a function to delete a key:

def del_by_path(root, items):
    """Delete a key-value in a nested object in root by item sequence."""
    del get_by_path(root, items[:-1])[items[-1]]
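
Since these helpers only rely on item access, a path may mix dictionary keys and list indexes. The following is a minimal sketch of that; the `config` data is a made-up example and not part of the original question.

```python
from functools import reduce
import operator


def get_by_path(root, items):
    """Access a nested object in root by item sequence."""
    return reduce(operator.getitem, items, root)


def set_by_path(root, items, value):
    """Set a value in a nested object in root by item sequence."""
    get_by_path(root, items[:-1])[items[-1]] = value


def del_by_path(root, items):
    """Delete a key-value in a nested object in root by item sequence."""
    del get_by_path(root, items[:-1])[items[-1]]


# Hypothetical sample data mixing dicts and lists.
config = {"servers": [{"name": "alpha", "ports": [80, 443]},
                      {"name": "beta", "ports": [8080]}]}

# String keys select dict entries, integers select list positions.
first_port = get_by_path(config, ["servers", 0, "ports", 1])
set_by_path(config, ["servers", 1, "ports", 0], 9090)
del_by_path(config, ["servers", 0, "name"])
```

Note that the parent path must already exist for the setter and the deleter; a missing parent key raises KeyError (or IndexError for a list).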

This library may be helpful: https://github.com/akesterson/dpath-python

A Python library for accessing and searching dictionaries via /slash/paths, à la XPath.

Basically, it lets you glob over a dictionary as if it were a filesystem.

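Using reduce is clever, but the OP's set method may have issues if the parent keys do not already exist in the nested dictionary. Since this is the first post I saw on this subject in my Google search, I would like to make it slightly better.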

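The set method in "Setting a value in a nested Python dictionary given a list of indices and value" seems more robust to missing parent keys. To copy it over: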

def nested_set(dic, keys, value):
    for key in keys[:-1]:
        dic = dic.setdefault(key, {})
    dic[keys[-1]] = value

It can also be convenient to have a method that traverses the key tree and returns all the absolute key paths, for which I have created:

def keysInDict(dataDict, parent=[]):
    if not isinstance(dataDict, dict):
        return [tuple(parent)]
    else:
        return reduce(list.__add__,
            [keysInDict(v,parent+[k]) for k,v in dataDict.items()], [])

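One use of it is to convert the nested tree to a pandas DataFrame, using the following code (assuming that all leaves in the nested dictionary have the same depth).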

from functools import reduce

import numpy as np
import pandas as pd


def dict_to_df(dataDict):
    ret = []
    for k in keysInDict(dataDict):
        v = np.array( getFromDict(dataDict, k), )
        v = pd.DataFrame(v)
        v.columns = pd.MultiIndex.from_product(list(k) + [v.columns])
        ret.append(v)
    return reduce(pd.DataFrame.join, ret)

It seems more Pythonic to use a for loop. See the quote from What's New In Python 3.0:

Removed reduce(). Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable.

def nested_get(dic, keys):
    for key in keys:
        dic = dic[key]
    return dic

Note that the accepted solution does not set non-existing nested keys (it raises KeyError). Using the approach below creates the missing nodes instead:

def nested_set(dic, keys, value):
    for key in keys[:-1]:
        dic = dic.setdefault(key, {})
    dic[keys[-1]] = value

The code works in both Python 2 and Python 3.
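
To make the difference concrete, here is a small side-by-side sketch. `set_in_dict` is an illustrative name for a reduce-based setter like the one in the accepted answer, and the sample data is made up.

```python
from functools import reduce
import operator


def set_in_dict(data, keys, value):
    """reduce-based setter: every parent key must already exist."""
    reduce(operator.getitem, keys[:-1], data)[keys[-1]] = value


def nested_set(dic, keys, value):
    """setdefault-based setter: missing parents are created as dicts."""
    for key in keys[:-1]:
        dic = dic.setdefault(key, {})
    dic[keys[-1]] = value


data = {"a": {"r": 1}}

try:
    set_in_dict(data, ["c", "x"], 10)
    raised = False
except KeyError:
    raised = True  # "c" does not exist, so the parent lookup fails

nested_set(data, ["c", "x"], 10)  # creates data["c"] on the way down
```

Which behavior is preferable depends on the use case: silently creating parents is convenient for building structures, while the KeyError can catch typos in a path.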

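Instead of taking a performance hit each time you want to look up a value, how about flattening the dictionary once and then simply looking up the key, like b:v:y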

def flatten(mydict, sep=':'):
    new_dict = {}
    for key, value in mydict.items():
        if isinstance(value, dict):
            # pass sep down so a custom separator is used at every level
            _dict = {sep.join([key, _key]): _value for _key, _value in flatten(value, sep).items()}
            new_dict.update(_dict)
        else:
            new_dict[key] = value
    return new_dict


dataDict = {
    "a": {
        "r": 1,
        "s": 2,
        "t": 3
    },
    "b": {
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
        },
        "w": 3
    }
}


flat_dict = flatten(dataDict)
print(flat_dict)
{'a:r': 1, 'a:s': 2, 'a:t': 3, 'b:u': 1, 'b:v:x': 1, 'b:v:y': 2, 'b:v:z': 3, 'b:w': 3}

This way you can simply look up items using flat_dict['b:v:y'], which gives you 2.

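Instead of traversing the dictionary on each lookup, you may be able to speed things up by flattening the dictionary and saving the output, so that a lookup from a cold start means loading the flattened dictionary and simply performing a key/value lookup with no traversal.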
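
One caveat with a string separator is that a key which itself contains the separator becomes ambiguous. A possible variation, sketched here under that assumption, is to use tuples of keys as the flat keys; `flatten_paths` and the sample data are illustrative names, not part of the answer above.

```python
def flatten_paths(nested, prefix=()):
    """Map each leaf value to the tuple of keys leading to it."""
    flat = {}
    for key, value in nested.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            flat.update(flatten_paths(value, path))
        else:
            flat[path] = value
    return flat


# Hypothetical data; note the key "w:x" that contains a colon.
data = {"a": {"r": 1}, "b": {"v": {"y": 2}, "w:x": 3}}
flat = flatten_paths(data)

# A key list converts directly into a lookup key.
value = flat[tuple(["b", "v", "y"])]
```

As with the string version, empty nested dictionaries leave no trace in the flattened result, and the flat copy has to be rebuilt whenever the nested data changes.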

How about using recursive functions?

To get a value:

def getFromDict(dataDict, maplist):
    first, rest = maplist[0], maplist[1:]

    if rest:
        # if `rest` is not empty, run the function recursively
        return getFromDict(dataDict[first], rest)
    else:
        return dataDict[first]

And to set a value:

def setInDict(dataDict, maplist, value):
    first, rest = maplist[0], maplist[1:]

    if rest:
        try:
            if not isinstance(dataDict[first], dict):
                # if the key is not a dict, then make it a dict
                dataDict[first] = {}
        except KeyError:
            # if key doesn't exist, create one
            dataDict[first] = {}

        setInDict(dataDict[first], rest, value)
    else:
        dataDict[first] = value

Pure Python style, without any imports:

def nested_set(element, value, *keys):
    if type(element) is not dict:
        raise AttributeError('nested_set() expects dict as first argument.')
    if len(keys) < 2:
        raise AttributeError('nested_set() expects at least three arguments, not enough given.')

    _keys = keys[:-1]
    _element = element
    for key in _keys:
        _element = _element[key]
    _element[keys[-1]] = value


example = {"foo": { "bar": { "baz": "ok" } } }
keys = ['foo', 'bar']
nested_set(example, "yay", *keys)
print(example)

Output

{'foo': {'bar': 'yay'}}

Solved this with recursion:

def get(d,l):
    if len(l)==1: return d[l[0]]
    return get(d[l[0]],l[1:])

Using your example:

dataDict = {
    "a": {
        "r": 1,
        "s": 2,
        "t": 3
    },
    "b": {
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
        },
        "w": 3
    }
}
maplist1 = ["a", "r"]
maplist2 = ["b", "v", "y"]
print(get(dataDict, maplist1)) # 1
print(get(dataDict, maplist2)) # 2

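If you also want to be able to work with arbitrary JSON, including nested lists and dicts, and to handle invalid lookup paths gracefully, here is my solution: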

from functools import reduce


def get_furthest(s, path):
    '''
    Gets the furthest value along a given key path in a subscriptable structure.

    subscriptable, list -> any
    :param s: the subscriptable structure to examine
    :param path: the lookup path to follow
    :return: a tuple of the value at the furthest valid key, and whether the full path is valid
    '''

    def step_key(acc, key):
        s = acc[0]
        if isinstance(s, str):
            return (s, False)
        try:
            return (s[key], acc[1])
        except LookupError:
            return (s, False)

    return reduce(step_key, path, (s, True))


def get_val(s, path):
    val, successful = get_furthest(s, path)
    if successful:
        return val
    else:
        raise LookupError('Invalid lookup path: {}'.format(path))


def set_val(s, path, value):
    get_val(s, path[:-1])[path[-1]] = value

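If you do not want an error raised when one of the keys is missing (so that your main code can keep running without interruption), you can use an alternative approach: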

def get_value(your_dict, *keys):
    curr_dict = your_dict
    v = None  # returned as-is when no keys are given
    for k in keys:
        v = curr_dict.get(k, None)
        if v is None:
            break
        if isinstance(v, dict):
            curr_dict = v
    return v

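In this case, if any of the input keys is not present, None is returned, which your main code can check for in order to perform an alternative task.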
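
Returning None for a missing key cannot be told apart from a key whose stored value really is None. One way around that, sketched here with illustrative names (`get_or_default`, `_MISSING`) that are not part of the answer above, is a private sentinel plus a caller-supplied default:

```python
_MISSING = object()  # sentinel: distinguishes "not found" from a stored None


def get_or_default(data, keys, default=None):
    """Follow keys through nested dicts; return default if the path breaks."""
    current = data
    for key in keys:
        if not isinstance(current, dict):
            return default  # path goes deeper than the data does
        current = current.get(key, _MISSING)
        if current is _MISSING:
            return default  # key is absent at this level
    return current


# Hypothetical sample data with an explicit None value.
data = {"a": {"r": None}, "b": {"v": {"y": 2}}}

found = get_or_default(data, ["b", "v", "y"])
stored_none = get_or_default(data, ["a", "r"], default="fallback")
missing = get_or_default(data, ["a", "zzz"], default="fallback")
too_deep = get_or_default(data, ["b", "v", "y", "q"], default="fallback")
```

Here a stored None is returned as None, while a missing key or an over-long path yields the default.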

How about checking and then setting a dict element without processing all the indexes twice?

Solution:

def nested_yield(nested, keys_list):
    """
    Get current nested data by send(None) method. Allows change it to Value by calling send(Value) next time
    :param nested: list or dict of lists or dicts
    :param keys_list: list of indexes/keys
    """
    if not len(keys_list):  # assign to 1st level list
        if isinstance(nested, list):
            while True:
                nested[:] = yield nested
        else:
            raise IndexError('Only lists can take element without key')

    last_key = keys_list.pop()
    for key in keys_list:
        nested = nested[key]

    while True:
        try:
            nested[last_key] = yield nested[last_key]
        except IndexError as e:
            print('no index {} in {}'.format(last_key, nested))
            yield None

Example workflow:

ny = nested_yield(nested_dict, nested_address)
data_element = ny.send(None)
if data_element:
    # process element
    ...
else:
    # extend/update nested data
    ny.send(new_data_element)
    ...
ny.close()

Test

>>> cfg= {'Options': [[1,[0]],[2,[4,[8,16]]],[3,[9]]]}
>>> ny = nested_yield(cfg, ['Options',1,1,1])
>>> ny.send(None)
[8, 16]
>>> ny.send('Hello!')
'Hello!'
>>> cfg
{'Options': [[1, [0]], [2, [4, 'Hello!']], [3, [9]]]}
>>> ny.close()

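It's satisfying to see these answers offering two static methods for setting and getting nested attributes. These solutions are much better than using nested trees.

Here's my implementation.

Usage

To set a nested attribute, call sattr(my_dict, 1, 2, 3, 5), which is equivalent to my_dict[1][2][3] = 5.

To get a nested attribute, call gattr with the dictionary followed by the keys.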

def gattr(d, *attrs):
    """
    This method receives a dict and list of attributes to return the innermost value of the given dict
    """
    try:
        for at in attrs:
            d = d[at]
        return d
    except(KeyError, TypeError):
        return None


def sattr(d, *attrs):
    """
    Adds "val" to dict in the hierarchy mentioned via *attrs
    For ex:
    sattr(animals, "cat", "leg","fingers", 4) is equivalent to animals["cat"]["leg"]["fingers"]=4
    This method creates necessary objects until it reaches the final depth
    This behaviour is also known as autovivification and plenty of implementations are around
    This implementation addresses the corner case of replacing existing primitives
    https://gist.github.com/hrldcpr/2012250#gistcomment-1779319
    """
    for attr in attrs[:-2]:
        if type(d.get(attr)) is not dict:
            d[attr] = {}
        d = d[attr]
    d[attrs[-2]] = attrs[-1]

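Very late to the party, but posting in case this helps someone in the future. For my use case, the following function worked best. It can pull any data type out of a dictionary.

dict is the dictionary containing our value

list is a list of "steps" towards our value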

def getnestedvalue(dict, list):

    length = len(list)
    try:
        for depth, key in enumerate(list):
            if depth == length - 1:
                output = dict[key]
                return output
            dict = dict[key]
    except (KeyError, TypeError):
        return None

    return None

A method that concatenates strings:

def get_sub_object_from_path(dict_name, map_list):
    for i in map_list:
        _string = "['%s']" % i
        dict_name += _string
    value = eval(dict_name)
    return value
#Sample:
_dict = {'new': 'person', 'time': {'for': 'one'}}
map_list = ['time', 'for']
print(get_sub_object_from_path("_dict",map_list))
#Output:
#one

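Extending the approach of @DomTomcat and others, this functional setter and mapper (functional in that they return modified data via deepcopy without affecting the input) work for nested dict and list.

Setter: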

from copy import deepcopy


def set_at_path(data0, keys, value):
    data = deepcopy(data0)
    if len(keys)>1:
        if isinstance(data,dict):
            # recurse into set_at_path itself so every level returns a new object
            return {k:(set_at_path(v,keys[1:],value) if k==keys[0] else v) for k,v in data.items()}
        if isinstance(data,list):
            return [set_at_path(x[1],keys[1:],value) if x[0]==keys[0] else x[1] for x in enumerate(data)]
    else:
        data[keys[-1]]=value
        return data

Mapper:

def map_at_path(data0, keys, f):
    data = deepcopy(data0)
    if len(keys)>1:
        if isinstance(data,dict):
            return {k:(map_at_path(v,keys[1:],f) if k==keys[0] else v) for k,v in data.items()}
        if isinstance(data,list):
            return [map_at_path(x[1],keys[1:],f) if x[0]==keys[0] else x[1] for x in enumerate(data)]
    else:
        data[keys[-1]]=f(data[keys[-1]])
        return data
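
To illustrate what "without affecting the input" means in practice, here is a compact sketch of a non-mutating setter. It copies once up front and then walks the copy iteratively, which is a different structure from the recursive versions above; `set_copy` and the sample data are illustrative names only.

```python
from copy import deepcopy


def set_copy(data, keys, value):
    """Return a deep copy of data with the item at keys replaced by value."""
    result = deepcopy(data)
    target = result
    for key in keys[:-1]:
        target = target[key]  # works for dict keys and list indexes alike
    target[keys[-1]] = value
    return result


# Hypothetical nested dict containing a list.
original = {"b": {"v": [1, {"y": 2}]}}
updated = set_copy(original, ["b", "v", 1, "y"], 99)
```

After the call, `original` still holds 2 at that path while `updated` holds 99. The price is a full deep copy per call, which may matter for large structures.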

You can use the eval function in Python.

def nested_parse(nest, map_list):
    nestq = "nest['" + "']['".join(map_list) + "']"
    return eval(nestq, {'__builtins__':None}, {'nest':nest})

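Explanation

For the example query: maplist = ["b", "v", "y"]

nestq will be "nest['b']['v']['y']", where nest is the nested dictionary.

The eval built-in function evaluates the given string. However, be careful about the vulnerabilities that can arise from using eval. Discussion can be found here: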

  1. https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html.
  2. https://www.journaldev.com/22504/python-eval-function.

In the nested_parse() function, I have made sure that no __builtins__ globals are available and that the only local variable available is the nest dictionary.
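
Apart from the security concerns, building source text from keys is fragile when a key contains a quote character. The sketch below shows one such failure and compares it with a plain item-access lookup; `get_by_keys` is an illustrative helper and the sample data is made up.

```python
from functools import reduce
import operator


def nested_parse(nest, map_list):
    """eval-based lookup, as in the answer above."""
    nestq = "nest['" + "']['".join(map_list) + "']"
    return eval(nestq, {'__builtins__': None}, {'nest': nest})


def get_by_keys(nest, map_list):
    """Same lookup without building or evaluating source text."""
    return reduce(operator.getitem, map_list, nest)


# Hypothetical data whose first key contains a single quote.
nest = {"it's": {"ok": 1}}

try:
    nested_parse(nest, ["it's", "ok"])
    eval_failed = False
except SyntaxError:
    eval_failed = True  # the quote in the key breaks the generated expression

direct = get_by_keys(nest, ["it's", "ok"])
```

The string-joining step also assumes that every key is a str, so integer list indexes would need separate handling.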

You can use pydash:

import pydash as _


_.get(dataDict, ["b", "v", "y"], default='Default')

https://pydash.readthedocs.io/en/latest/api.html.

I use this:

import logging


def get_dictionary_value(dictionary_temp, variable_dictionary_keys):
    try:
        if(len(variable_dictionary_keys) == 0):
            return str(dictionary_temp)

        variable_dictionary_key = variable_dictionary_keys[0]
        variable_dictionary_keys.remove(variable_dictionary_key)

        return get_dictionary_value(dictionary_temp[variable_dictionary_key] , variable_dictionary_keys)

    except Exception as variable_exception:
        logging.error(variable_exception)

        return ''


Check out NestedDict; it does exactly what you ask for. First install ndicts

pip install ndicts

Then

from ndicts.ndicts import NestedDict


data_dict = {
    "a": {
        "r": 1,
        "s": 2,
        "t": 3
    },
    "b": {
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
        },
        "w": 3
    }
}


nd = NestedDict(data_dict)

You can now access keys using comma-separated values

>>> nd["a", "r"]
1
>>> nd["b", "v"]
{"x": 1, "y": 2, "z": 3}

I would rather use a simple recursive function:

def get_value_by_path(data, maplist):
    if not maplist:
        return data
    for key in maplist:
        if key in data:
            return get_value_by_path(data[key], maplist[1:])