How can I parse/read a YAML file into a Python object?

For example, this YAML:

Person:
  name: XYZ

To this Python class:

class Person(yaml.YAMLObject):
    yaml_tag = 'Person'

    def __init__(self, name):
        self.name = name

By the way, I am using PyYAML.

If your YAML file looks like this:

# tree format
treeroot:
    branch1:
        name: Node 1
        branch1-1:
            name: Node 1-1
    branch2:
        name: Node 2
        branch2-1:
            name: Node 2-1

And you have installed PyYAML like this:

pip install PyYAML

And the Python code looks like this:

import yaml
with open('tree.yaml') as f:
    # use safe_load instead of load
    dataMap = yaml.safe_load(f)

The variable dataMap now contains a dictionary with the tree data. If you pretty-print dataMap (for example with pprint), you will get something like:

{
    'treeroot': {
        'branch1': {
            'branch1-1': {
                'name': 'Node 1-1'
            },
            'name': 'Node 1'
        },
        'branch2': {
            'branch2-1': {
                'name': 'Node 2-1'
            },
            'name': 'Node 2'
        }
    }
}
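
Once loaded, the result is an ordinary nested dict, so normal indexing and recursion apply. Here is a minimal sketch, using a literal dict as a stand-in for what yaml.safe_load returns for the tree file; the helper name iter_names is made up for illustration:

```python
# Stand-in for the dict returned by yaml.safe_load on the tree file above.
data_map = {
    'treeroot': {
        'branch1': {'name': 'Node 1', 'branch1-1': {'name': 'Node 1-1'}},
        'branch2': {'name': 'Node 2', 'branch2-1': {'name': 'Node 2-1'}},
    }
}


def iter_names(node):
    """Yield every 'name' value found in a nested dict, depth first."""
    for key, value in node.items():
        if isinstance(value, dict):
            yield from iter_names(value)
        elif key == 'name':
            yield value


# Direct access by key path.
print(data_map['treeroot']['branch1']['name'])

# Collect all node names in the tree.
print(list(iter_names(data_map)))
```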

So, now we have seen how to get data into our Python program. Saving data is just as easy:

with open('newtree.yaml', "w") as f:
    yaml.dump(dataMap, f)

You now have a dictionary, and you have to convert it into a Python object:

class Struct:
    def __init__(self, **entries):
        self.__dict__.update(entries)

Then you can use:

>>> args = your YAML dictionary
>>> s = Struct(**args)
>>> s
<__main__.Struct instance at 0x01D6A738>
>>> s...

This follows "Convert Python dict to object".

For more information, see pyyaml.org and this.
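
Note that the Struct class above converts only the top level: nested dictionaries stay plain dicts. One possible way to get attribute access at every level is a small recursive helper built on types.SimpleNamespace from the standard library. This is a sketch, not part of PyYAML; the helper name to_namespace and the sample data are assumptions for illustration, and it assumes all keys are strings:

```python
from types import SimpleNamespace


def to_namespace(value):
    """Recursively convert dicts (and dicts inside lists) to SimpleNamespace."""
    if isinstance(value, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in value.items()})
    if isinstance(value, list):
        return [to_namespace(item) for item in value]
    return value


# Stand-in for a dict returned by yaml.safe_load (sample data is made up).
args = {'Person': {'name': 'XYZ', 'tags': [{'label': 'a'}]}}

s = to_namespace(args)
print(s.Person.name)
print(s.Person.tags[0].label)
```

Keys that are not valid Python identifiers (for example 'branch1-1') are still stored, but can only be read with getattr.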

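From http://pyyaml.org/wiki/PyYAMLDocumentation:

add_path_resolver(tag, path, kind) adds a path-based implicit tag resolver. A path is a list of keys that form a path to a node in the representation graph. Path elements can be string values, integers, or None. The kind of a node can be str, list, dict, or None.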

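#!/usr/bin/env python
import yaml


class Person(yaml.YAMLObject):
    yaml_tag = '!person'

    def __init__(self, name):
        self.name = name


# Mapping nodes found under the top-level key 'Person' get the tag '!person'.
yaml.add_path_resolver('!person', ['Person'], dict)

# PyYAML 6 requires an explicit Loader. yaml.Loader can construct
# arbitrary Python objects, so use it only on trusted input.
data = yaml.load("""
Person:
  name: XYZ
""", Loader=yaml.Loader)

print(data)
# {'Person': <__main__.Person object at 0x7f2b251ceb10>}

print(data['Person'].name)
# XYZ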
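
If subclassing yaml.YAMLObject is not required, a simpler alternative (not what the answer above does) is to load plain data and build the object yourself. The sketch below uses a literal dict as a stand-in for the result of yaml.safe_load on the question's YAML; the __repr__ method is added only to make the output readable:

```python
class Person:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Person(name={self.name!r})"


# Stand-in for yaml.safe_load("Person:\n  name: XYZ\n").
data = {'Person': {'name': 'XYZ'}}

# The mapping's keys become keyword arguments of __init__.
person = Person(**data['Person'])
print(person)
```

With this approach the YAML keys must match the parameter names of __init__; an unexpected key raises TypeError.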

Here is one way to test which YAML implementation the user has available in the virtualenv (or on the system), and then define load_yaml_file accordingly:

load_yaml_file = None

if not load_yaml_file:
    try:
        import yaml

        def load_yaml_file(fn):
            # safe_load avoids constructing arbitrary Python objects
            with open(fn) as f:
                return yaml.safe_load(f)
    except ImportError:
        pass

if not load_yaml_file:
    import json
    import subprocess  # replaces the Python 2 "commands" module
    if subprocess.getstatusoutput('ruby --version')[0] == 0:
        def load_yaml_file(fn):
            ruby = "puts YAML.load_file('%s').to_json" % fn
            j = subprocess.getstatusoutput('ruby -ryaml -rjson -e "%s"' % ruby)
            return json.loads(j[1])

if not load_yaml_file:
    import sys
    print("""
ERROR: %s requires ruby or python-yaml to be installed.

apt-get install ruby

OR

apt-get install python-yaml

OR

Demonstrate your mastery of Python by using pip.
Please research the latest pip-based install steps for python-yaml.
Usually something like this works:
apt-get install epel-release
apt-get install python-pip
apt-get install libyaml-cpp-dev
python2.7 /usr/bin/pip install pyyaml
Notes:
Non-base library (yaml) should never be installed outside a virtualenv.
"pip install" is permanent:
https://stackoverflow.com/questions/1550226/python-setup-py-uninstall
Beware when using pip within an aptitude or RPM script.
Pip might not play by all the rules.
Your installation may be permanent.
Ruby is 7X faster at loading large YAML files.
pip could ruin your life.
https://stackoverflow.com/questions/46326059/
https://stackoverflow.com/questions/36410756/
https://stackoverflow.com/questions/8022240/
Never use PyYaml in numerical applications.
https://stackoverflow.com/questions/30458977/
If you are working for a Fortune 500 company, your choices are
1. Ask for either the "ruby" package or the "python-yaml"
package. Asking for Ruby is more likely to get a fast answer.
2. Work in a VM. I highly recommend Vagrant for setting it up.

""" % sys.argv[0])
    sys.exit(4)


# test
import sys
print(load_yaml_file(sys.argv[1]))
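
One caveat with the Ruby fallback above: the file name is interpolated into a shell command string, so a name containing quotes or spaces can break the command. A possible variant, sketched here with the standard subprocess module, passes the file name as a separate argument that Ruby reads from ARGV. The function names build_ruby_command and load_yaml_via_ruby are made up for this example, and the sketch assumes a ruby executable is on the PATH:

```python
import json
import subprocess


def build_ruby_command(fn):
    """Return an argument list; the file name is passed as ARGV[0], not interpolated."""
    script = "puts YAML.load_file(ARGV[0]).to_json"
    return ["ruby", "-ryaml", "-rjson", "-e", script, fn]


def load_yaml_via_ruby(fn):
    """Run Ruby without a shell and parse the JSON it prints."""
    result = subprocess.run(
        build_ruby_command(fn), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


# Only the command is shown here; calling load_yaml_via_ruby needs Ruby installed.
print(build_ruby_command("my tree.yaml"))
```

Because no shell is involved, the file name reaches Ruby unchanged whatever characters it contains.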

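I wrote an implementation using namedtuple, which I think is quite neat because it is fairly readable. It also handles the case where your dictionary is nested. The parser code is as follows: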

from collections import namedtuple


class Dict2ObjParser:
    def __init__(self, nested_dict):
        self.nested_dict = nested_dict

    def parse(self):
        nested_dict = self.nested_dict
        if (obj_type := type(nested_dict)) is not dict:
            raise TypeError(f"Expected 'dict' but found '{obj_type}'")
        return self._transform_to_named_tuples("root", nested_dict)

    def _transform_to_named_tuples(self, tuple_name, possibly_nested_obj):
        if type(possibly_nested_obj) is dict:
            named_tuple_def = namedtuple(tuple_name, possibly_nested_obj.keys())
            transformed_value = named_tuple_def(
                *[
                    self._transform_to_named_tuples(key, value)
                    for key, value in possibly_nested_obj.items()
                ]
            )
        elif type(possibly_nested_obj) is list:
            transformed_value = [
                self._transform_to_named_tuples(f"{tuple_name}_{i}", possibly_nested_obj[i])
                for i in range(len(possibly_nested_obj))
            ]
        else:
            transformed_value = possibly_nested_obj

        return transformed_value

I tested the basic cases with the following code:

x = Dict2ObjParser({
    "a": {
        "b": 123,
        "c": "Hello, World!"
    },
    "d": [
        1,
        2,
        3
    ],
    "e": [
        {
            "f": "",
            "g": None
        },
        {
            "f": "Foo",
            "g": "Bar"
        },
        {
            "h": "Hi!",
            "i": None
        }
    ],
    "j": 456,
    "k": None
}).parse()

print(x)

It gives the following output: root(a=a(b=123, c='Hello, World!'), d=[1, 2, 3], e=[e_0(f='', g=None), e_1(f='Foo', g='Bar'), e_2(h='Hi!', i=None)], j=456, k=None)

When formatted, it looks something like this:

root(
    a=a(
        b=123,
        c='Hello, World!'
    ),
    d=[1, 2, 3],
    e=[
        e_0(
            f='',
            g=None
        ),
        e_1(
            f='Foo',
            g='Bar'
        ),
        e_2(
            h='Hi!',
            i=None
        )
    ],
    j=456,
    k=None
)

I can access the nested fields just as I would on any other object:

print(x.a.b)  # Prints: 123
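
One limitation worth knowing: namedtuple field names must be valid Python identifiers, so YAML keys such as branch1-1 raise a ValueError. A possible workaround, sketched below, is to sanitize keys before building the tuple. The helper to_field_name is hypothetical and covers only the simple case; keys that start with a digit or an underscore, or that are Python keywords, would still be rejected (namedtuple's own rename=True option is another way to handle those):

```python
import re
from collections import namedtuple


def to_field_name(key):
    """Replace characters that are not valid in an identifier with '_'."""
    return re.sub(r'\W', '_', str(key))


branch = {'name': 'Node 1', 'branch1-1': {'name': 'Node 1-1'}}

# The raw keys are rejected because 'branch1-1' is not an identifier.
try:
    namedtuple('branch1', branch.keys())
except ValueError as exc:
    print('plain keys fail:', exc)

# With sanitized keys the tuple can be built and accessed by attribute.
Branch = namedtuple('branch1', [to_field_name(k) for k in branch])
node = Branch(*branch.values())
print(node.branch1_1['name'])
```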

In your case, the code would end up looking like this:

import yaml


with open(file_path, "r") as stream:
    nested_dict = yaml.safe_load(stream)
    nested_objt = Dict2ObjParser(nested_dict).parse()

Hope this helps!