How do I reference a YAML "setting" from elsewhere in the same YAML file?

I have the following YAML:

paths:
  patha: /path/to/root/a
  pathb: /path/to/root/b
  pathc: /path/to/root/c

How can I "normalise" this by removing /path/to/root/ from the three paths and keeping it as its own setting, something like:

paths:
  root: /path/to/root/
  patha: *root* + a
  pathb: *root* + b
  pathc: *root* + c

Obviously that is invalid; I just made it up. What is the real syntax? Can it be done?


I don't think it is possible. You can reuse a whole node, but not part of one.

bill-to: &id001
    given  : Chris
    family : Dumars
ship-to: *id001

This is perfectly valid YAML and fields given and family are reused in ship-to block. You can reuse a scalar node the same way but there's no way you can change what's inside and add that last part of a path to it from inside YAML.

If the repetition bothers you that much, I suggest making your application aware of the root property and having it prepend root to every path that looks relative rather than absolute.
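A minimal sketch of that application-side approach, assuming the YAML has already been loaded into a plain dict and that the base directory lives under a key named root (the function name resolve_paths is made up for illustration):

```python
import posixpath


def resolve_paths(paths, root_key="root"):
    """Prefix every relative path in `paths` with the value stored under `root_key`."""
    root = paths[root_key]
    resolved = {}
    for key, value in paths.items():
        if key == root_key or posixpath.isabs(value):
            # the root itself and absolute paths are left untouched
            resolved[key] = value
        else:
            resolved[key] = posixpath.join(root, value)
    return resolved


# stands in for the result of loading the YAML file
config = {"paths": {"root": "/path/to/root/", "patha": "a", "pathb": "b", "pathc": "/elsewhere/c"}}
resolved = resolve_paths(config["paths"])
```

Here patha and pathb pick up the root prefix, while pathc stays as written because it is already absolute.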

Yes, using custom tags. Example in Python, making the !join tag join strings in an array:

import yaml


## define custom tag handler
def join(loader, node):
    seq = loader.construct_sequence(node)
    return ''.join([str(i) for i in seq])


## register the tag handler
yaml.add_constructor('!join', join)


## using your sample data
## (recent PyYAML versions require an explicit Loader argument)
yaml.load("""
paths:
    root: &BASE /path/to/root/
    patha: !join [*BASE, a]
    pathb: !join [*BASE, b]
    pathc: !join [*BASE, c]
""", Loader=yaml.Loader)

Which results in:

{
    'paths': {
        'patha': '/path/to/root/a',
        'pathb': '/path/to/root/b',
        'pathc': '/path/to/root/c',
        'root': '/path/to/root/'
    }
}

The array of arguments to !join can have any number of elements of any data type, as long as they can be converted to string, so !join [*a, "/", *b, "/", *c] does what you would expect.

Another way to look at this is to simply use another field.

paths:
  root_path: &root
    val: /path/to/root/
  patha: &a
    root_path: *root
    rel_path: a
  pathb: &b
    root_path: *root
    rel_path: b
  pathc: &c
    root_path: *root
    rel_path: c
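With this layout the application still has to join the two fields itself. A small sketch of that step, assuming the YAML above has been loaded into nested dicts (the dict literal below only imitates that result, and full_path is a hypothetical helper name):

```python
def full_path(entry):
    """Join the shared root with the entry's relative part."""
    return entry["root_path"]["val"] + entry["rel_path"]


# imitates the loaded document; the &root anchor is modelled as one shared dict
root = {"val": "/path/to/root/"}
paths = {
    "root_path": root,
    "patha": {"root_path": root, "rel_path": "a"},
    "pathb": {"root_path": root, "rel_path": "b"},
    "pathc": {"root_path": root, "rel_path": "c"},
}

expanded = {name: full_path(paths[name]) for name in ("patha", "pathb", "pathc")}
```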

That your example is invalid is only because you chose a reserved character to start your scalars with. If you replace the * with some other non-reserved character (I tend to use non-ASCII characters for that as they are seldom used as part of some specification), you end up with perfectly legal YAML:

paths:
  root: /path/to/root/
  patha: ♦root♦ + a
  pathb: ♦root♦ + b
  pathc: ♦root♦ + c

This will load into the standard representation for mappings in the language your parser uses and does not magically expand anything.
To get the expansion, tag the mapping with a local tag and load it into your own object type, as in the following Python program:

# coding: utf-8

from __future__ import print_function

import ruamel.yaml as yaml


class Paths:
    def __init__(self):
        self.d = {}

    def __repr__(self):
        return repr(self.d).replace('ordereddict', 'Paths')

    @staticmethod
    def __yaml_in__(loader, data):
        result = Paths()
        loader.construct_mapping(data, result.d)
        return result

    @staticmethod
    def __yaml_out__(dumper, self):
        return dumper.represent_mapping('!Paths', self.d)

    def __getitem__(self, key):
        res = self.d[key]
        return self.expand(res)

    def expand(self, res):
        try:
            before, rest = res.split(u'♦', 1)
            kw, rest = rest.split(u'♦ +', 1)
            rest = rest.lstrip()  # strip any spaces after "+"
            # the lookup will throw the correct keyerror if kw is not found
            # recursive call expand() on the tail if there are multiple
            # parts to replace
            return before + self.d[kw] + self.expand(rest)
        except ValueError:
            return res


yaml_str = """\
paths: !Paths
  root: /path/to/root/
  patha: ♦root♦ + a
  pathb: ♦root♦ + b
  pathc: ♦root♦ + c
"""

loader = yaml.RoundTripLoader
loader.add_constructor('!Paths', Paths.__yaml_in__)

paths = yaml.load(yaml_str, Loader=yaml.RoundTripLoader)['paths']

for k in ['root', 'pathc']:
    print(u'{} -> {}'.format(k, paths[k]))

which will print:

root -> /path/to/root/
pathc -> /path/to/root/c

The expanding is done on the fly and handles nested definitions, but you have to be careful about not invoking infinite recursion.
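To illustrate the infinite-recursion caveat, here is a separate sketch (not the Paths class above) that expands ♦key♦ references over a plain dict and tracks the keys currently being resolved, so that a circular definition raises an error instead of recursing forever. The function name and the error message are made up for this example:

```python
MARK = u"♦"


def expand(mapping, value, _seen=()):
    """Expand '♦key♦ + tail' references, raising ValueError on circular definitions."""
    parts = value.split(MARK, 2)
    if len(parts) < 3:
        return value  # no complete ♦key♦ reference left
    before, key, rest = parts
    if key in _seen:
        raise ValueError("circular reference: " + " -> ".join(_seen + (key,)))
    # resolve the referenced value first; it may contain references itself
    resolved = expand(mapping, mapping[key], _seen + (key,))
    # drop the "+" joiner that follows the reference, then expand the tail
    tail = rest.lstrip()
    if tail.startswith("+"):
        tail = tail[1:].lstrip()
    return before + resolved + expand(mapping, tail, _seen)


paths = {
    "root": "/path/to/root/",
    "patha": u"♦root♦ + a",
    "loop": u"♦loop♦ + x",  # deliberately refers to itself
}
expanded_a = expand(paths, paths["patha"])
try:
    expand(paths, paths["loop"])
    loop_detected = False
except ValueError:
    loop_detected = True
```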

By specifying the dumper, you can dump the original YAML from the data loaded in, because of the on-the-fly expansion:

dumper = yaml.RoundTripDumper
dumper.add_representer(Paths, Paths.__yaml_out__)
print(yaml.dump(paths, Dumper=dumper, allow_unicode=True))

Note that this will change the mapping key ordering. If that is a problem, you have to make self.d a CommentedMap (imported from ruamel.yaml.comments).

I've created a library, available on Packagist, that performs this function: https://packagist.org/packages/grasmash/yaml-expander

Example YAML file:

type: book
book:
  title: Dune
  author: Frank Herbert
  copyright: ${book.author} 1965
  protaganist: ${characters.0.name}
  media:
    - hardcover
characters:
  - name: Paul Atreides
    occupation: Kwisatz Haderach
    aliases:
      - Usul
      - Muad'Dib
      - The Preacher
  - name: Duncan Idaho
    occupation: Swordmaster
summary: ${book.title} by ${book.author}
product-name: ${${type}.title}

Example logic:

// Parse a yaml string directly, expanding internal property references.
$yaml_string = file_get_contents("dune.yml");
$expanded = \Grasmash\YamlExpander\Expander::parse($yaml_string);
print_r($expanded);

Resultant array:

array (
  'type' => 'book',
  'book' =>
  array (
    'title' => 'Dune',
    'author' => 'Frank Herbert',
    'copyright' => 'Frank Herbert 1965',
    'protaganist' => 'Paul Atreides',
    'media' =>
    array (
      0 => 'hardcover',
    ),
  ),
  'characters' =>
  array (
    0 =>
    array (
      'name' => 'Paul Atreides',
      'occupation' => 'Kwisatz Haderach',
      'aliases' =>
      array (
        0 => 'Usul',
        1 => 'Muad\'Dib',
        2 => 'The Preacher',
      ),
    ),
    1 =>
    array (
      'name' => 'Duncan Idaho',
      'occupation' => 'Swordmaster',
    ),
  ),
  'summary' => 'Dune by Frank Herbert',
);

YML definition:

dir:
  default: /home/data/in/
  proj1: ${dir.default}p1
  proj2: ${dir.default}p2
  proj3: ${dir.default}p3

Somewhere in a Thymeleaf template:

<p th:utext='${@environment.getProperty("dir.default")}' />
<p th:utext='${@environment.getProperty("dir.proj1")}' />

Output: /home/data/in/ /home/data/in/p1

In some languages you can use an alternative library. For example, tampax is a YAML implementation that handles variables:

const tampax = require('tampax');


const yamlString = `
dude:
  name: Arthur
weapon:
  favorite: Excalibur
  useless: knife
sentence: "{{dude.name}} use {{weapon.favorite}}. The goal is {{goal}}."`;


const r = tampax.yamlParseString(yamlString, { goal: 'to kill Mordred' });
console.log(r.sentence);


// output : "Arthur use Excalibur. The goal is to kill Mordred."

Editor's Note: poster is also the author of this package.

I have written my own library in Python to expand variables loaded from directories with a hierarchy like:

/root
 |
 +- /proj1
     |
     +- config.yaml
     |
     +- /proj2
         |
         +- config.yaml
         |
        ... and so on ...

The key difference here is that the expansion must be applied only after all the config.yaml files are loaded, because variables from a later file can override variables from an earlier one. The pseudocode looks like this:

env = YamlEnv()
env.load('/root/proj1/config.yaml')
env.load('/root/proj1/proj2/config.yaml')
...
env.expand()
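Why the order matters can be seen in a toy version of that flow. This is only a sketch of the load-then-expand idea, not the library's code: LayeredEnv is a hypothetical class, and plain dicts stand in for the parsed config.yaml files.

```python
import re


class LayeredEnv:
    """Hypothetical stand-in for the pseudocode: merge all layers first, expand last."""

    _REF = re.compile(r"\$\{(\w+)\}")

    def __init__(self):
        self.raw = {}
        self.expanded = {}

    def load(self, variables):
        # later layers override earlier ones; nothing is expanded yet
        self.raw.update(variables)

    def expand(self):
        # single pass, no recursion: unknown names are left as written
        def substitute(match):
            name = match.group(1)
            return str(self.raw.get(name, match.group(0)))

        self.expanded = {
            key: self._REF.sub(substitute, str(value))
            for key, value in self.raw.items()
        }
        return self.expanded


env = LayeredEnv()
env.load({"ROOT": "/root/proj1", "OUT": "${ROOT}/out"})  # outer config
env.load({"ROOT": "/root/proj1/proj2"})                  # inner config overrides ROOT
result = env.expand()
```

Because OUT is expanded only after the second load, it picks up the overridden ROOT; expanding after each load would have frozen the outer value.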

As an additional option the xonsh script can export the resulting variables into environment variables (see the yaml_update_global_vars function).

The scripts:

https://sourceforge.net/p/tacklelib/tacklelib/HEAD/tree/trunk/python/cmdoplib/cmdoplib.yaml.xsh
https://sourceforge.net/p/tacklelib/tacklelib/HEAD/tree/trunk/python/tacklelib/tacklelib.yaml.py

Pros:

  • simple, does not support recursion and nested variables
  • can replace an undefined variable with a placeholder (${MYUNDEFINEDVAR} -> *$/{MYUNDEFINEDVAR})
  • can expand a reference to an environment variable (${env:MYVAR})
  • can replace every \\ with / in a path variable (${env:MYVAR:path})

Cons:

  • does not support nested variables, so it cannot expand values in nested dictionaries (something like ${MYSCOPE.MYVAR} is not implemented)
  • does not detect expansion recursion, including recursion that appears after a placeholder has been substituted

With Yglu, you can write your example as:

paths:
  root: /path/to/root/
  patha: !? .paths.root + a
  pathb: !? .paths.root + b
  pathc: !? .paths.root + c

Disclaimer: I am the author of Yglu.

Using OmegaConf

OmegaConf is a YAML-based hierarchical configuration system that has support for this under the functionality Variable interpolation. Using OmegaConf v2.2.2:

Create a YAML file paths.yaml as follows:

paths:
  root: /path/to/root/
  patha: ${.root}a
  pathb: ${.root}b
  pathc: ${.root}c

Then we can load the file and read the interpolated paths:

from omegaconf import OmegaConf
conf = OmegaConf.load("paths.yaml")


>>> conf.paths.root
'/path/to/root/'


>>> conf.paths.patha
'/path/to/root/a'
>>> conf.paths.pathb
'/path/to/root/b'
>>> conf.paths.pathc
'/path/to/root/c'

Deep and Cross-References

It is possible to define more complex (nested) structures with a relative depth of your variable in reference to other variables:

Create another file nested_paths.yaml:

data:
  base: data
  sub_dir_A:
    name: a
    # here we note that `base` is two levels above this variable
    # hence we will use `..base` two dots but the `name` variable is
    # at the same level hence a single dot `.name`
    nested_dir: ${..base}/sub_dir/${.name}/last_dir
  sub_dir_B:
    # add another level of depth
    - name: b
      # due to another level of depth, we have to use three dots
      # to access `base` variable as `...base`
      nested_file: ${...base}/sub_dir/${.name}/dirs.txt
    - name: c
      # we can also make cross-references to other variables
      cross_ref_dir: ${...sub_dir_A.nested_dir}/${.name}

again we can check:

conf = OmegaConf.load("nested_paths.yaml")


# 1-level of depth reference
>>> conf.data.sub_dir_A.nested_dir
'data/sub_dir/a/last_dir'


# 2-levels of depth reference
>>> conf.data.sub_dir_B[0].nested_file
'data/sub_dir/b/dirs.txt'


# cross-reference example
>>> conf.data.sub_dir_B[1].cross_ref_dir
'data/sub_dir/a/last_dir/c'
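To make the dot-counting rule concrete, here is a hand-rolled sketch in plain Python that mimics it on nested dicts and lists. It is not how OmegaConf is implemented and covers only this relative ${.name} form; resolve and _walk are made-up helper names.

```python
import re

_REF = re.compile(r"\$\{(\.+)([^}]*)\}")


def _walk(node, keys):
    """Follow a sequence of keys through nested dicts and lists."""
    for key in keys:
        node = node[int(key)] if isinstance(node, list) else node[key]
    return node


def resolve(root, path):
    """Return the value at `path` (a tuple of keys), expanding relative references."""
    value = _walk(root, path)
    if not isinstance(value, str):
        return value

    def substitute(match):
        dots, rest = match.groups()
        # one dot = the container holding this value; each extra dot = one level up
        base = path[: len(path) - len(dots)]
        target = base + tuple(rest.split(".")) if rest else base
        return str(resolve(root, target))

    return _REF.sub(substitute, value)


# same structure as nested_paths.yaml, written out as Python literals
conf = {
    "data": {
        "base": "data",
        "sub_dir_A": {"name": "a", "nested_dir": "${..base}/sub_dir/${.name}/last_dir"},
        "sub_dir_B": [
            {"name": "b", "nested_file": "${...base}/sub_dir/${.name}/dirs.txt"},
            {"name": "c", "cross_ref_dir": "${...sub_dir_A.nested_dir}/${.name}"},
        ],
    }
}

nested_dir = resolve(conf, ("data", "sub_dir_A", "nested_dir"))
cross_ref_dir = resolve(conf, ("data", "sub_dir_B", 1, "cross_ref_dir"))
```

The number of leading dots is simply how many trailing path components get dropped before the rest of the reference is appended.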

In case of invalid references (such as a wrong depth or a wrong variable name), OmegaConf raises omegaconf.errors.InterpolationResolutionError. OmegaConf is also used in Hydra for configuring complex applications.