Can PyYAML dump dict items in non-alphabetical order?

I'm using yaml.dump to output a dict. It prints each item in alphabetical order by key.

>>> d = {"z":0,"y":0,"x":0}
>>> yaml.dump( d, default_flow_style=False )
'x: 0\ny: 0\nz: 0\n'

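Is there a way to control the order of the key/value pairs?

In my particular use case, printing in reverse would (coincidentally) be good enough. For completeness, though, I'm looking for an answer that shows how to control the order more precisely.

I've looked at using collections.OrderedDict, but PyYAML doesn't seem to support it. I've also looked at subclassing yaml.Dumper, but I haven't been able to figure out whether it can change the item order.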


There's probably a better workaround, but I couldn't find anything in the documentation or the source.


Python 2 (see comments)

I subclassed OrderedDict and made it return a list of unsortable items:

import yaml
from collections import OrderedDict


class UnsortableList(list):
    # PyYAML calls .sort() on the list of items; make that a no-op.
    def sort(self, *args, **kwargs):
        pass


class UnsortableOrderedDict(OrderedDict):
    def items(self, *args, **kwargs):
        return UnsortableList(OrderedDict.items(self, *args, **kwargs))


yaml.add_representer(UnsortableOrderedDict, yaml.representer.SafeRepresenter.represent_dict)

And it seems to work:

>>> d = UnsortableOrderedDict([
...     ('z', 0),
...     ('y', 0),
...     ('x', 0)
... ])
>>> yaml.dump(d, default_flow_style=False)
'z: 0\ny: 0\nx: 0\n'

Python 3 or 2 (see comments)

You can also write a custom representer, but I don't know if you'll run into problems later on, as I stripped out some style checking code from it:

import yaml
from collections import OrderedDict


def represent_ordereddict(dumper, data):
    value = []

    for item_key, item_value in data.items():
        node_key = dumper.represent_data(item_key)
        node_value = dumper.represent_data(item_value)

        value.append((node_key, node_value))

    return yaml.nodes.MappingNode(u'tag:yaml.org,2002:map', value)


yaml.add_representer(OrderedDict, represent_ordereddict)

But with that, you can use the native OrderedDict class.

I was also looking for an answer to the question "how to dump mappings with the order preserved?" I couldn't follow the solution given above, as I am new to PyYAML and Python. After spending some time on the PyYAML documentation and other forums, I found this.

You can use the tag

!!omap

to dump the mappings while preserving the order. If you want to play with the order, I think you have to go for keys:values.

The links below can help for better understanding.

https://bitbucket.org/xi/pyyaml/issue/13/loading-and-then-dumping-an-omap-is-broken

http://yaml.org/type/omap.html
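As a rough illustration of the !!omap idea, here is a minimal sketch, assuming PyYAML on Python 3. The names OmapDumper and represent_omap are hypothetical and not part of PyYAML; the sketch writes an OrderedDict as a sequence of single-key mappings tagged !!omap. Note that PyYAML's safe loader reads such a document back as a list of (key, value) tuples rather than as a dict.

```python
from collections import OrderedDict

import yaml


class OmapDumper(yaml.SafeDumper):
    """Hypothetical dumper subclass, so the global yaml state is left alone."""


def represent_omap(dumper, data):
    # An omap is a sequence of single-pair mappings carrying the !!omap tag.
    return dumper.represent_sequence(
        "tag:yaml.org,2002:omap", [{key: value} for key, value in data.items()]
    )


OmapDumper.add_representer(OrderedDict, represent_omap)

d = OrderedDict([("z", 0), ("y", 0), ("x", 0)])
text = yaml.dump(d, Dumper=OmapDumper, default_flow_style=False)
print(text)

# Loading the tagged document gives back the pairs in their original order.
pairs = yaml.safe_load(text)
print(pairs)
```

The trade-off is that the output is no longer a plain mapping, so consumers of the file need to understand the !!omap tag.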

There are two things you need to do to get this as you want:

  • you need to use something other than a dict, because it doesn't keep the items ordered
  • you need to dump that alternative in the appropriate way.¹

import sys
import ruamel.yaml
from ruamel.yaml.comments import CommentedMap


d = CommentedMap()
d['z'] = 0
d['y'] = 0
d['x'] = 0


ruamel.yaml.round_trip_dump(d, sys.stdout)

output:

z: 0
y: 0
x: 0

¹ This was done using ruamel.yaml, a YAML 1.2 parser, of which I am the author.

This is really just an addendum to @Blender's answer. If you look in the PyYAML source, at the representer.py module, you'll find this method:

def represent_mapping(self, tag, mapping, flow_style=None):
    value = []
    node = MappingNode(tag, value, flow_style=flow_style)
    if self.alias_key is not None:
        self.represented_objects[self.alias_key] = node
    best_style = True
    if hasattr(mapping, 'items'):
        mapping = mapping.items()
        mapping.sort()
    for item_key, item_value in mapping:
        node_key = self.represent_data(item_key)
        node_value = self.represent_data(item_value)
        if not (isinstance(node_key, ScalarNode) and not node_key.style):
            best_style = False
        if not (isinstance(node_value, ScalarNode) and not node_value.style):
            best_style = False
        value.append((node_key, node_value))
    if flow_style is None:
        if self.default_flow_style is not None:
            node.flow_style = self.default_flow_style
        else:
            node.flow_style = best_style
    return node

If you simply remove the mapping.sort() line, then it maintains the order of items in the OrderedDict.
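Rather than editing the installed source, the same effect can probably be had by overriding the method in a subclass. The following is only a sketch, assuming Python 3.7+ (where plain dicts keep insertion order); OrderPreservingDumper is a hypothetical name. It converts the mapping to a list of pairs before delegating, so the parent's hasattr(mapping, 'items') branch, and with it the sort, is skipped.

```python
import yaml


class OrderPreservingDumper(yaml.SafeDumper):
    """Hypothetical dumper that bypasses the key sorting in represent_mapping."""

    def represent_mapping(self, tag, mapping, flow_style=None):
        # A list of (key, value) pairs has no .items(), so the parent
        # implementation iterates over it as-is without sorting.
        if hasattr(mapping, "items"):
            mapping = list(mapping.items())
        return super().represent_mapping(tag, mapping, flow_style=flow_style)


text = yaml.dump(
    {"z": 0, "y": 0, "x": 0},
    Dumper=OrderPreservingDumper,
    default_flow_style=False,
)
print(text)
```

Because the override lives on a subclass, other callers of yaml.dump keep the default sorted behaviour.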

Another solution is given in this post. It's similar to @Blender's, but works for safe_dump. The common element is the converting of the dict to a list of tuples, so the if hasattr(mapping, 'items') check evaluates to false.

Update:

I just noticed that The Fedora Project's EPEL repo has a package called python2-yamlordereddictloader, and there's one for Python 3 as well. The upstream project for that package is likely cross-platform.

For Python 3.7+, dicts preserve insertion order. Since PyYAML 5.1, you can disable the sorting of keys (#254). Unfortunately, sort_keys still defaults to True.

>>> import yaml
>>> yaml.dump({"b":1, "a": 2})
'a: 2\nb: 1\n'
>>> yaml.dump({"b":1, "a": 2}, sort_keys=False)
'b: 1\na: 2\n'
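To get the more precise control over ordering that the question asks about, one option is to build a dict in the desired order first and then dump it with sort_keys=False. This is a minimal sketch, assuming Python 3.7+ and PyYAML 5.1 or later; the preferred key list is a made-up example.

```python
import yaml

d = {"z": 0, "y": 0, "x": 0}

# Reverse alphabetical order: sort the items, then rebuild the dict.
reverse_sorted = dict(sorted(d.items(), reverse=True))

# Arbitrary explicit order: `preferred` is a hypothetical list of keys.
preferred = ["y", "z", "x"]
custom = {key: d[key] for key in preferred}

reverse_text = yaml.dump(reverse_sorted, sort_keys=False, default_flow_style=False)
custom_text = yaml.dump(custom, sort_keys=False, default_flow_style=False)

print(reverse_text)
print(custom_text)
```

Since the ordering is decided before the dump, the same approach works with yaml.safe_dump, which accepts the same keyword arguments.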

My project oyaml is a monkeypatch/drop-in replacement for PyYAML. It will preserve dict order by default in all Python versions and PyYAML versions.

>>> import oyaml as yaml  # pip install oyaml
>>> yaml.dump({"b":1, "a": 2})
'b: 1\na: 2\n'

Additionally, it will dump the collections.OrderedDict subclass as normal mappings, rather than Python objects.

>>> from collections import OrderedDict
>>> d = OrderedDict([("b", 1), ("a", 2)])
>>> import yaml
>>> yaml.dump(d)
'!!python/object/apply:collections.OrderedDict\n- - - b\n    - 1\n  - - a\n    - 2\n'
>>> yaml.safe_dump(d)
RepresenterError: ('cannot represent an object', OrderedDict([('b', 1), ('a', 2)]))
>>> import oyaml as yaml
>>> yaml.dump(d)
'b: 1\na: 2\n'
>>> yaml.safe_dump(d)
'b: 1\na: 2\n'

One-liner to rule them all:

yaml.add_representer(dict, lambda self, data: yaml.representer.SafeRepresenter.represent_dict(self, data.items()))

That's it. Finally. After all those years and hours, the mighty represent_dict has been defeated by giving it dict.items() instead of just the dict.

Here is how it works:

This is the relevant PyYAML source code:

    if hasattr(mapping, 'items'):
        mapping = list(mapping.items())
        try:
            mapping = sorted(mapping)
        except TypeError:
            pass
    for item_key, item_value in mapping:

To prevent the sorting we just need some Iterable[Pair] object that does not have .items().

dict_items is a perfect candidate for this.

Here is how to do this without affecting the global state of the yaml module:

import yaml


# Use a custom Dumper class to avoid changing the global state of the yaml module.
class CustomDumper(yaml.Dumper):
    # Neat hack to preserve the mapping key order. See https://stackoverflow.com/a/52621703/1497385
    def represent_dict_preserve_order(self, data):
        return self.represent_dict(data.items())


CustomDumper.add_representer(dict, CustomDumper.represent_dict_preserve_order)

# component_dict is whatever dict you want to dump.
yaml.dump(component_dict, Dumper=CustomDumper)

If you upgrade PyYAML to version 5.1 or later, it supports dumping without sorting the keys, like this:

yaml.dump(data, sort_keys=False)

As shown in help(yaml.Dumper), sort_keys defaults to True:

Dumper(stream, default_style=None, default_flow_style=False,
       canonical=None, indent=None, width=None, allow_unicode=None,
       line_break=None, encoding=None, explicit_start=None, explicit_end=None,
       version=None, tags=None, sort_keys=True)

(These are passed as kwargs to yaml.dump)

If safe_dump (i.e. dump with Dumper=SafeDumper) is used, then calling yaml.add_representer has no effect. In that case it is necessary to call the add_representer method explicitly on the SafeRepresenter class:

yaml.representer.SafeRepresenter.add_representer(
    OrderedDict, ordered_dict_representer
)
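The snippet above does not show ordered_dict_representer itself. One possible definition, offered only as a sketch and not necessarily what the answer's author used, reuses the dict.items() trick from earlier in the thread so that the pairs are emitted in insertion order:

```python
from collections import OrderedDict

import yaml


def ordered_dict_representer(dumper, data):
    # Assumed implementation: pass the items view, which has no .items(),
    # so PyYAML writes the pairs in the order they were inserted.
    return dumper.represent_dict(data.items())


# Register on SafeRepresenter so that yaml.safe_dump picks it up.
# Note that this changes global state for every safe_dump call.
yaml.representer.SafeRepresenter.add_representer(
    OrderedDict, ordered_dict_representer
)

d = OrderedDict([("b", 1), ("a", 2)])
text = yaml.safe_dump(d, default_flow_style=False)
print(text)
```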

The following setting makes sure the content is not sorted in the output:

yaml.sort_base_mapping_type_on_output = False