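Partial string formatting

Is it possible to do partial string formatting with the advanced string formatting methods, similar to the string template safe_substitute() function?

For example: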

s = '{foo} {bar}'
s.format(foo='FOO') #Problem: raises KeyError 'bar'

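If you define your own Formatter that overrides the get_value method, you can use it to map undefined field names to whatever you want:

http://docs.python.org/library/string.html#string

For instance, you could map bar to "{bar}" if bar isn't in the kwargs.

However, that requires using the format() method of your Formatter object, not the string's format() method.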
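A minimal sketch of the get_value approach described above, assuming only keyword fields need to be preserved. The class name KeepMissingFormatter is made up for illustration and is not part of the standard library.

```python
import string


class KeepMissingFormatter(string.Formatter):
    """Hypothetical helper: leaves unknown keyword fields in place."""

    def get_value(self, key, args, kwargs):
        # Keyword fields arrive as str, positional fields as int.
        if isinstance(key, str) and key not in kwargs:
            return "{" + key + "}"
        return super().get_value(key, args, kwargs)


result = KeepMissingFormatter().format("{foo} {bar}", foo="FOO")
print(result)  # FOO {bar}
```

Any format spec or conversion attached to a missing field is applied to the placeholder text rather than preserved, so this only covers plain fields.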

You can trick it into partial formatting by overwriting the mapping:

import string


class FormatDict(dict):
    def __missing__(self, key):
        return "{" + key + "}"


s = '{foo} {bar}'
formatter = string.Formatter()
mapping = FormatDict(foo='FOO')
print(formatter.vformat(s, (), mapping))

This prints

FOO {bar}

Of course, this basic implementation only works correctly for simple cases.
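To make the "simple cases only" caveat concrete, here is a small sketch of two inputs where the __missing__ trick falls short. It reuses the FormatDict class from the answer; the variable names are only illustrative.

```python
import string


class FormatDict(dict):
    def __missing__(self, key):
        return "{" + key + "}"


formatter = string.Formatter()
mapping = FormatDict(foo="FOO")

# The format spec is applied to the placeholder text and then lost.
padded = formatter.vformat("{foo} {bar:>10}", (), mapping)
print(repr(padded))

# Attribute access on a missing field fails, because the placeholder is a str.
try:
    formatter.vformat("{foo} {bar.name}", (), mapping)
    attribute_lookup_failed = False
except AttributeError:
    attribute_lookup_failed = True
print(attribute_lookup_failed)  # True
```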

You could wrap it in a function that takes default arguments:

def print_foo_bar(foo='', bar=''):
    s = '{foo} {bar}'
    return s.format(foo=foo, bar=bar)


print_foo_bar(bar='BAR')  # ' BAR'

Thanks to Amber's comment, I came up with this:

import string


try:
    # Python 3
    from _string import formatter_field_name_split
except ImportError:
    formatter_field_name_split = str._formatter_field_name_split


class PartialFormatter(string.Formatter):
    def get_field(self, field_name, args, kwargs):
        try:
            val = super(PartialFormatter, self).get_field(field_name, args, kwargs)
        except (IndexError, KeyError, AttributeError):
            first, _ = formatter_field_name_split(field_name)
            val = '{' + field_name + '}', first
        return val
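The answer does not show the formatter in use, so here is a short usage sketch. It restates the class for Python 3 only and relies on the private CPython helper _string.formatter_field_name_split, exactly as the answer does; that helper is not a public API.

```python
import string
from _string import formatter_field_name_split  # private CPython 3 helper


class PartialFormatter(string.Formatter):
    def get_field(self, field_name, args, kwargs):
        try:
            return super().get_field(field_name, args, kwargs)
        except (IndexError, KeyError, AttributeError):
            # Keep the whole field name, including any .attr or [index] part.
            first, _ = formatter_field_name_split(field_name)
            return "{" + field_name + "}", first


fmt = PartialFormatter()
simple = fmt.format("{foo} {bar}", foo="FOO")
dotted = fmt.format("{foo} {bar.name}", foo="FOO")
print(simple)  # FOO {bar}
print(dotted)  # FOO {bar.name}
```

Because get_field sees the full field name, attribute and index lookups on missing keys survive, which the plain __missing__ approach cannot do.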

If you know in what order you're formatting things:

s = '{foo} {{bar}}'

Use it like this:

ss = s.format(foo='FOO')
print(ss)
# FOO {bar}

print(ss.format(bar='BAR'))
# FOO BAR

You can't specify foo and bar at the same time; you have to do it sequentially.
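As a sketch of how this scales, each call to format() strips one layer of doubled braces, so a field that should survive two passes needs four braces on each side. The template below is an invented example.

```python
# One brace layer is consumed per format() call.
template = "{foo} {{bar}} {{{{baz}}}}"

step1 = template.format(foo="FOO")
step2 = step1.format(bar="BAR")
step3 = step2.format(baz="BAZ")

print(step1)  # FOO {bar} {{baz}}
print(step2)  # FOO BAR {baz}
print(step3)  # FOO BAR BAZ
```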

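This limitation of .format(), the inability to do partial substitutions, has been bugging me.

After evaluating writing a custom Formatter class as described in many answers here, and even considering third-party packages such as lazy_format, I discovered a much simpler built-in solution: Template strings.

They provide similar functionality, but also partial substitution through the safe_substitute() method. Template strings need a $ prefix (which feels a bit weird, but I think the overall solution is better).

import string

template = string.Template('${x} ${y}')
try:
    template.substitute({'x': 1})  # raises KeyError
except KeyError:
    pass

# but the following raises no error
partial_str = template.safe_substitute({'x': 1})  # no error

# partial_str now contains a string with partial substitution
partial_template = string.Template(partial_str)
substituted_str = partial_template.safe_substitute({'y': 2})  # no error
print(substituted_str)  # prints '1 2'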

I built a convenience wrapper based on this:

class StringTemplate(object):
    def __init__(self, template):
        self.template = string.Template(template)
        self.partial_substituted_str = None

    def __repr__(self):
        return self.template.safe_substitute()

    def format(self, *args, **kws):
        self.partial_substituted_str = self.template.safe_substitute(*args, **kws)
        self.template = string.Template(self.partial_substituted_str)
        return self.__repr__()


>>> s = StringTemplate('${x}${y}')
>>> s
${x}${y}
>>> s.format(x=1)
'1${y}'
>>> s.format({'y': 2})
'12'
>>> print(s)
12

Similarly, a wrapper based on Sven's answer, which uses the default string formatting:

class StringTemplate(object):
    class FormatDict(dict):
        def __missing__(self, key):
            return "{" + key + "}"

    def __init__(self, template):
        self.substituted_str = template
        self.formatter = string.Formatter()

    def __repr__(self):
        return self.substituted_str

    def format(self, *args, **kwargs):
        mapping = StringTemplate.FormatDict(*args, **kwargs)
        self.substituted_str = self.formatter.vformat(self.substituted_str, (), mapping)

Not sure if this is OK as a quick workaround, but how about

s = '{foo} {bar}'
s.format(foo='FOO', bar='{bar}')

? :)

>>> 'fd:{uid}:{{topic_id}}'.format(uid=123)
'fd:123:{topic_id}'

Try this out.

For me this was good enough:

>>> ss = 'dfassf {} dfasfae efaef {} fds'
>>> nn = ss.format('f1', '{}')
>>> nn
'dfassf f1 dfasfae efaef {} fds'
>>> n2 = nn.format('whoa')
>>> n2
'dfassf f1 dfasfae efaef whoa fds'

There is one more way to achieve this: use format for some variables and % for the others. For example:

>>> s = '{foo} %(bar)s'
>>> s = s.format(foo='my_foo')
>>> s
'my_foo %(bar)s'
>>> s % {'bar': 'my_bar'}
'my_foo my_bar'

You could use the partial function from functools, which is short, readable, and also describes the coder's intention:

from functools import partial


s = partial("{foo} {bar}".format, foo="FOO")
print(s(bar="BAR"))
# FOO BAR
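One point worth spelling out: with functools.partial nothing is formatted until the final call, so the intermediate object is a callable, not a string. The sketch below, with an invented three-field template, shows chaining and what happens when a field is still missing.

```python
from functools import partial

# Bind the fields one at a time; no string exists until the last call.
fmt = partial("{foo} {bar} {baz}".format, foo="FOO")
fmt = partial(fmt, bar="BAR")
result = fmt(baz="BAZ")
print(result)  # FOO BAR BAZ

# Calling too early still raises KeyError for the unbound field.
incomplete = partial("{foo} {bar}".format, foo="FOO")
try:
    incomplete()
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```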

Assuming you won't use the string until it is completely filled in, you could do something like this class:

class IncrementalFormatting:
    def __init__(self, string):
        self._args = []
        self._kwargs = {}
        self._string = string

    def add(self, *args, **kwargs):
        self._args.extend(args)
        self._kwargs.update(kwargs)

    def get(self):
        return self._string.format(*self._args, **self._kwargs)

Example:

template = '#{a}:{}/{}?{c}'
message = IncrementalFormatting(template)
message.add('abc')
message.add('xyz', a=24)
message.add(c='lmno')
assert message.get() == '#24:abc/xyz?lmno'

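A very ugly, but for me the simplest, solution is to just do:

tmpl = '{foo}, {bar}'
tmpl.replace('{bar}', 'BAR')
Out[3]: '{foo}, BAR'

This way you can still use tmpl as a regular template and perform partial formatting only when needed. I find this problem too trivial to warrant an overkill solution like Mohan Raj's.

My suggestion would be the following (tested with Python 3.6):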

class Lazymap(object):
    def __init__(self, **kwargs):
        self.dict = kwargs

    def __getitem__(self, key):
        return self.dict.get(key, "".join(["{", key, "}"]))


s = '{foo} {bar}'

s.format_map(Lazymap(bar="FOO"))
# >>> '{foo} FOO'

s.format_map(Lazymap(bar="BAR"))
# >>> '{foo} BAR'

s.format_map(Lazymap(bar="BAR", foo="FOO", baz="BAZ"))
# >>> 'FOO BAR'

Update: An even more elegant way (subclassing dict and overloading __missing__(self, key)) is shown here: https://stackoverflow.com/a/17215533/333403
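For reference, a minimal sketch of the dict-subclass variant mentioned in the update, used directly with str.format_map(). The class name KeepMissing is illustrative only.

```python
class KeepMissing(dict):
    """Illustrative name: returns the placeholder itself for unknown keys."""

    def __missing__(self, key):
        return "{" + key + "}"


s = "{foo} {bar}"
first = s.format_map(KeepMissing(bar="BAR"))
second = first.format_map(KeepMissing(foo="FOO"))
print(first)   # {foo} BAR
print(second)  # FOO BAR
```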

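After testing the most promising solutions from here and there, I realized that none of them really met the following requirements:

  1. strictly adhere to the syntax recognized by str.format_map() for the template;
  2. be able to retain complex formatting, i.e. fully support the Format Mini-Language.

So I wrote my own solution, which satisfies the above requirements. (EDIT: the version by @SvenMarach, as reported in this answer, now seems to handle the corner cases I needed.)

Basically, I ended up parsing the template string, finding matching nested {.*?} groups (using a find_all() helper function), and building the formatted string progressively and directly with str.format_map() while catching any potential KeyError.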

def find_all(
        text,
        pattern,
        overlap=False):
    """
    Find all occurrences of the pattern in the text.

    Args:
        text (str|bytes|bytearray): The input text.
        pattern (str|bytes|bytearray): The pattern to find.
        overlap (bool): Detect overlapping patterns.

    Yields:
        position (int): The position of the next finding.
    """
    len_text = len(text)
    offset = 1 if overlap else (len(pattern) or 1)
    i = 0
    while i < len_text:
        i = text.find(pattern, i)
        if i >= 0:
            yield i
            i += offset
        else:
            break


def matching_delimiters(
        text,
        l_delim,
        r_delim,
        including=True):
    """
    Find matching delimiters in a sequence.

    The delimiters are matched according to nesting level.

    Args:
        text (str|bytes|bytearray): The input text.
        l_delim (str|bytes|bytearray): The left delimiter.
        r_delim (str|bytes|bytearray): The right delimiter.
        including (bool): Include delimiters.

    Yields:
        result (tuple[int]): The matching delimiters.
    """
    l_offset = len(l_delim) if including else 0
    r_offset = len(r_delim) if including else 0
    stack = []

    l_tokens = set(find_all(text, l_delim))
    r_tokens = set(find_all(text, r_delim))
    positions = l_tokens.union(r_tokens)
    for pos in sorted(positions):
        if pos in l_tokens:
            stack.append(pos + 1)
        elif pos in r_tokens:
            if len(stack) > 0:
                prev = stack.pop()
                yield (prev - l_offset, pos + r_offset, len(stack))
            else:
                raise ValueError(
                    'Found `{}` unmatched right token(s) `{}` (position: {}).'
                    .format(len(r_tokens) - len(l_tokens), r_delim, pos))
    if len(stack) > 0:
        raise ValueError(
            'Found `{}` unmatched left token(s) `{}` (position: {}).'
            .format(
                len(l_tokens) - len(r_tokens), l_delim, stack.pop() - 1))


def safe_format_map(
        text,
        source):
    """
    Perform safe string formatting from a mapping source.

    If a value is missing from source, this is simply ignored, and no
    `KeyError` is raised.

    Args:
        text (str): Text to format.
        source (Mapping|None): The mapping to use as source.
            If None, uses caller's `vars()`.

    Returns:
        result (str): The formatted text.
    """
    stack = []
    for i, j, depth in matching_delimiters(text, '{', '}'):
        if depth == 0:
            try:
                replacing = text[i:j].format_map(source)
            except KeyError:
                pass
            else:
                stack.append((i, j, replacing))
    result = ''
    i, j = len(text), 0
    while len(stack) > 0:
        last_i = i
        i, j, replacing = stack.pop()
        result = replacing + text[j:last_i] + result
    if i > 0:
        result = text[0:i] + result
    return result

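(This code is also available in FlyingCircus. DISCLAIMER: I am its main author.)


The usage for this code would be:

print(safe_format_map('{a} {b} {c}', dict(a='-A-')))
# -A- {b} {c}

Let's compare this to my favourite solution (by @SvenMarach, who kindly shared his code here and there):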

import string


class FormatPlaceholder:
    def __init__(self, key):
        self.key = key

    def __format__(self, spec):
        result = self.key
        if spec:
            result += ":" + spec
        return "{" + result + "}"

    def __getitem__(self, index):
        self.key = "{}[{}]".format(self.key, index)
        return self

    def __getattr__(self, attr):
        self.key = "{}.{}".format(self.key, attr)
        return self


class FormatDict(dict):
    def __missing__(self, key):
        return FormatPlaceholder(key)


def safe_format_alt(text, source):
    formatter = string.Formatter()
    return formatter.vformat(text, (), FormatDict(source))

Here are a couple of tests:

test_texts = (
'{b} {f}',  # simple nothing useful in source
'{a} {b}',  # simple
'{a} {b} {c:5d}',  # formatting
'{a} {b} {c!s}',  # coercion
'{a} {b} {c!s:>{a}s}',  # formatting and coercion
'{a} {b} {c:0{a}d}',  # nesting
'{a} {b} {d[x]}',  # dicts (existing in source)
'{a} {b} {e.index}',  # class (existing in source)
'{a} {b} {f[g]}',  # dict (not existing in source)
'{a} {b} {f.values}',  # class (not existing in source)


)
source = dict(a=4, c=101, d=dict(x='FOO'), e=[])

and the code to run them:

funcs = safe_format_map, safe_format_alt

n = 18
for text in test_texts:
    full_source = {**dict(b='---', f=dict(g='Oh yes!')), **source}
    print('{:>{n}s} :   OK   : '.format('str.format_map', n=n) + text.format_map(full_source))
    for func in funcs:
        try:
            print(f'{func.__name__:>{n}s} :   OK   : ' + func(text, source))
        except:
            print(f'{func.__name__:>{n}s} : FAILED : {text}')

resulting in:

    str.format_map :   OK   : --- {'g': 'Oh yes!'}
   safe_format_map :   OK   : {b} {f}
   safe_format_alt :   OK   : {b} {f}
    str.format_map :   OK   : 4 ---
   safe_format_map :   OK   : 4 {b}
   safe_format_alt :   OK   : 4 {b}
    str.format_map :   OK   : 4 ---   101
   safe_format_map :   OK   : 4 {b}   101
   safe_format_alt :   OK   : 4 {b}   101
    str.format_map :   OK   : 4 --- 101
   safe_format_map :   OK   : 4 {b} 101
   safe_format_alt :   OK   : 4 {b} 101
    str.format_map :   OK   : 4 ---  101
   safe_format_map :   OK   : 4 {b}  101
   safe_format_alt :   OK   : 4 {b}  101
    str.format_map :   OK   : 4 --- 0101
   safe_format_map :   OK   : 4 {b} 0101
   safe_format_alt :   OK   : 4 {b} 0101
    str.format_map :   OK   : 4 --- FOO
   safe_format_map :   OK   : 4 {b} FOO
   safe_format_alt :   OK   : 4 {b} FOO
    str.format_map :   OK   : 4 --- <built-in method index of list object at 0x7f7a485666c8>
   safe_format_map :   OK   : 4 {b} <built-in method index of list object at 0x7f7a485666c8>
   safe_format_alt :   OK   : 4 {b} <built-in method index of list object at 0x7f7a485666c8>
    str.format_map :   OK   : 4 --- Oh yes!
   safe_format_map :   OK   : 4 {b} {f[g]}
   safe_format_alt :   OK   : 4 {b} {f[g]}
    str.format_map :   OK   : 4 --- <built-in method values of dict object at 0x7f7a485da090>
   safe_format_map :   OK   : 4 {b} {f.values}
   safe_format_alt :   OK   : 4 {b} {f.values}

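As you can see, the updated version now seems to handle the corner cases where the earlier version used to fail.


Timewise, the two are within roughly 50% of each other, depending on the actual text to format (and probably on the actual source), but safe_format_map() seems to have an edge in most of the tests I performed (whatever they mean, of course):

for text in test_texts:
    print(f'  {text}')
    %timeit safe_format_map(text * 1000, source)
    %timeit safe_format_alt(text * 1000, source)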
  {b} {f}
3.93 ms ± 153 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
6.35 ms ± 51.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
{a} {b}
4.37 ms ± 57.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
5.2 ms ± 159 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
{a} {b} {c:5d}
7.15 ms ± 91.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
7.76 ms ± 69.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
{a} {b} {c!s}
7.04 ms ± 138 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
7.56 ms ± 161 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
{a} {b} {c!s:>{a}s}
8.91 ms ± 113 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
10.5 ms ± 181 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
{a} {b} {c:0{a}d}
8.84 ms ± 147 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
10.2 ms ± 202 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
{a} {b} {d[x]}
7.01 ms ± 197 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
7.35 ms ± 106 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
{a} {b} {e.index}
11 ms ± 68.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
8.78 ms ± 405 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
{a} {b} {f[g]}
6.55 ms ± 88.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
9.12 ms ± 159 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
{a} {b} {f.values}
6.61 ms ± 55.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
9.92 ms ± 98.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

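If you'd like to unpack a dictionary to pass arguments to format, as in this related question, you could use the following method.

First, assume the string s is the same as in this question:

s = '{foo} {bar}'

and the values are given by the following dictionary:

replacements = {'foo': 'FOO'}

Clearly this won't work: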

s.format(**replacements)
#---------------------------------------------------------------------------
#KeyError                                  Traceback (most recent call last)
#<ipython-input-29-ef5e51de79bf> in <module>()
#----> 1 s.format(**replacements)
#
#KeyError: 'bar'

However, you could first get a set of all the named arguments from s and create a dictionary that maps each argument to itself wrapped in curly braces:

from string import Formatter
args = {x[1]:'{'+x[1]+'}' for x in Formatter().parse(s)}
print(args)
#{'foo': '{foo}', 'bar': '{bar}'}

Now use the args dictionary to fill in the keys missing from replacements. For Python 3.5+, you can do this in a single expression:

new_s = s.format(**{**args, **replacements})
print(new_s)
#'FOO {bar}'

For older versions of Python, you could call update:

args.update(replacements)
print(s.format(**args))
#'FOO {bar}'
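The steps above can be folded into one helper. This is a sketch under the assumption that the template only uses plain named fields; the function name partial_format is invented here. It also skips the None field name that Formatter().parse() yields for trailing literal text, which the dictionary comprehension above would choke on.

```python
from string import Formatter


def partial_format(template, **replacements):
    """Hypothetical helper: fill known fields, keep unknown plain fields."""
    placeholders = {
        name: "{" + name + "}"
        for _, name, _, _ in Formatter().parse(template)
        # parse() yields None for trailing literal text and '' for a bare '{}'
        if name
    }
    return template.format(**{**placeholders, **replacements})


result = partial_format("{foo} {bar}!", foo="FOO")
print(result)  # FOO {bar}!
```

Format specs and attribute or index lookups on missing fields are not preserved by this approach.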

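I like Sven Marnach's answer. My answer is simply an extended version of it. It allows non-keyword formatting and ignores extra keys. Here are some usage examples (the function name is a reference to Python 3.6 f-string formatting):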

# partial string substitution by keyword
>>> f('{foo} {bar}', foo="FOO")
'FOO {bar}'


# partial string substitution by argument
>>> f('{} {bar}', 1)
'1 {bar}'


>>> f('{foo} {}', 1)
'{foo} 1'


# partial string substitution with arguments and keyword mixed
>>> f('{foo} {} {bar} {}', '|', bar='BAR')
'{foo} | BAR {}'


# partial string substitution with extra keyword
>>> f('{foo} {bar}', foo="FOO", bro="BRO")
'FOO {bar}'


# you can simply 'pour out' your dictionary to format function
>>> kwargs = {'foo': 'FOO', 'bro': 'BRO'}
>>> f('{foo} {bar}', **kwargs)
'FOO {bar}'

And here is my code:

from string import Formatter


class FormatTuple(tuple):
    def __getitem__(self, key):
        if key + 1 > len(self):
            return "{}"
        return tuple.__getitem__(self, key)


class FormatDict(dict):
    def __missing__(self, key):
        return "{" + key + "}"


def f(string, *args, **kwargs):
    """
    String safe substitute format method.
    If you pass extra keys they will be ignored.
    If you pass incomplete substitute map, missing keys will be left unchanged.
    :param string:
    :param kwargs:
    :return:

    >>> f('{foo} {bar}', foo="FOO")
    'FOO {bar}'
    >>> f('{} {bar}', 1)
    '1 {bar}'
    >>> f('{foo} {}', 1)
    '{foo} 1'
    >>> f('{foo} {} {bar} {}', '|', bar='BAR')
    '{foo} | BAR {}'
    >>> f('{foo} {bar}', foo="FOO", bro="BRO")
    'FOO {bar}'
    """
    formatter = Formatter()
    args_mapping = FormatTuple(args)
    mapping = FormatDict(kwargs)
    return formatter.vformat(string, args_mapping, mapping)

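If you're doing a lot of templating and find Python's built-in string templating functionality insufficient or clunky, look at Jinja2.

From the docs:

Jinja is a modern and designer-friendly templating language for Python, modelled after Django's templates.

All the solutions I've found seemed to have issues with more advanced spec or conversion options. @SvenMarach's FormatPlaceholder is wonderfully clever, but it doesn't work properly with coercion (e.g. {a!s:>2s}) because it calls the __str__ method (in this example) instead of __format__, and you lose any additional formatting.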
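The coercion problem can be reproduced in a few lines. This sketch uses a trimmed copy of the FormatPlaceholder idea (only __format__, no index or attribute support) to compare a field with and without a !s conversion.

```python
import string


class FormatPlaceholder:
    """Trimmed copy of the placeholder idea, for illustration only."""

    def __init__(self, key):
        self.key = key

    def __format__(self, spec):
        result = self.key
        if spec:
            result += ":" + spec
        return "{" + result + "}"


class FormatDict(dict):
    def __missing__(self, key):
        return FormatPlaceholder(key)


fmt = string.Formatter()

# Without a conversion, __format__ runs and the field is preserved.
kept = fmt.vformat("{a:>2s}", (), FormatDict())
print(kept)  # {a:>2s}

# With !s, str() is applied first, so __format__ never sees the spec.
lost = fmt.vformat("{a!s:>2s}", (), FormatDict())
print(lost)
```

The second result is the default object representation of the placeholder rather than the original field text.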

Here's what I ended up with, along with some of its key features:

sformat('The {} is {}', 'answer')
'The answer is {}'


sformat('The answer to {question!r} is {answer:0.2f}', answer=42)
'The answer to {question!r} is 42.00'


sformat('The {} to {} is {:0.{p}f}', 'answer', 'everything', p=4)
'The answer to everything is {:0.4f}'
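
  • provides a similar interface to str.format (not just a mapping)
  • supports more complex formatting options:
    • coercion
    • nesting {k:>{size}}
    • getattr {k.foo}
    • getitem {k[0]}
    • coercion + formatting {k!s:>{size}}
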
import string


class SparseFormatter(string.Formatter):
    """
    A modified string formatter that handles a sparse set of format
    args/kwargs.
    """

    # re-implemented this method for python2/3 compatibility
    def vformat(self, format_string, args, kwargs):
        used_args = set()
        result, _ = self._vformat(format_string, args, kwargs, used_args, 2)
        self.check_unused_args(used_args, args, kwargs)
        return result

    def _vformat(self, format_string, args, kwargs, used_args, recursion_depth,
                 auto_arg_index=0):
        if recursion_depth < 0:
            raise ValueError('Max string recursion exceeded')
        result = []
        for literal_text, field_name, format_spec, conversion in \
                self.parse(format_string):

            orig_field_name = field_name

            # output the literal text
            if literal_text:
                result.append(literal_text)

            # if there's a field, output it
            if field_name is not None:
                # this is some markup, find the object and do
                #  the formatting

                # handle arg indexing when empty field_names are given.
                if field_name == '':
                    if auto_arg_index is False:
                        raise ValueError('cannot switch from manual field '
                                         'specification to automatic field '
                                         'numbering')
                    field_name = str(auto_arg_index)
                    auto_arg_index += 1
                elif field_name.isdigit():
                    if auto_arg_index:
                        raise ValueError('cannot switch from manual field '
                                         'specification to automatic field '
                                         'numbering')
                    # disable auto arg incrementing, if it gets
                    # used later on, then an exception will be raised
                    auto_arg_index = False

                # given the field_name, find the object it references
                #  and the argument it came from
                try:
                    obj, arg_used = self.get_field(field_name, args, kwargs)
                except (IndexError, KeyError):
                    # catch issues with both arg indexing and kwarg key errors
                    obj = orig_field_name
                    if conversion:
                        obj += '!{}'.format(conversion)
                    if format_spec:
                        format_spec, auto_arg_index = self._vformat(
                            format_spec, args, kwargs, used_args,
                            recursion_depth, auto_arg_index=auto_arg_index)
                        obj += ':{}'.format(format_spec)
                    result.append('{' + obj + '}')
                else:
                    used_args.add(arg_used)

                    # do any conversion on the resulting object
                    obj = self.convert_field(obj, conversion)

                    # expand the format spec, if needed
                    format_spec, auto_arg_index = self._vformat(
                        format_spec, args, kwargs,
                        used_args, recursion_depth-1,
                        auto_arg_index=auto_arg_index)

                    # format the object and append to the result
                    result.append(self.format_field(obj, format_spec))

        return ''.join(result), auto_arg_index


def sformat(s, *args, **kwargs):
    # type: (str, *Any, **Any) -> str
    """
    Sparse format a string.

    Parameters
    ----------
    s : str
    args : *Any
    kwargs : **Any

    Examples
    --------
    >>> sformat('The {} is {}', 'answer')
    'The answer is {}'

    >>> sformat('The answer to {question!r} is {answer:0.2f}', answer=42)
    'The answer to {question!r} is 42.00'

    >>> sformat('The {} to {} is {:0.{p}f}', 'answer', 'everything', p=4)
    'The answer to everything is {:0.4f}'

    Returns
    -------
    str
    """
    return SparseFormatter().format(s, *args, **kwargs)

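I discovered the issues with the various implementations after writing some tests for how I wanted this method to behave. They're below in case anyone finds them insightful.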

import pytest


def test_auto_indexing():
    # test basic arg auto-indexing
    assert sformat('{}{}', 4, 2) == '42'
    assert sformat('{}{} {}', 4, 2) == '42 {}'


def test_manual_indexing():
    # test basic arg indexing
    assert sformat('{0}{1} is not {1} or {0}', 4, 2) == '42 is not 2 or 4'
    assert sformat('{0}{1} is {3} {1} or {0}', 4, 2) == '42 is {3} 2 or 4'


def test_mixing_manualauto_fails():
    # test mixing manual and auto args raises
    with pytest.raises(ValueError):
        assert sformat('{!r} is {0}{1}', 4, 2)


def test_kwargs():
    # test basic kwarg
    assert sformat('{base}{n}', base=4, n=2) == '42'
    assert sformat('{base}{n}', base=4, n=2, extra='foo') == '42'
    assert sformat('{base}{n} {key}', base=4, n=2) == '42 {key}'


def test_args_and_kwargs():
    # test mixing args/kwargs with leftovers
    assert sformat('{}{k} {v}', 4, k=2) == '42 {v}'

    # test mixing with leftovers
    r = sformat('{}{} is the {k} to {!r}', 4, 2, k='answer')
    assert r == '42 is the answer to {!r}'


def test_coercion():
    # test coercion is preserved for skipped elements
    assert sformat('{!r} {k!r}', '42') == "'42' {k!r}"


def test_nesting():
    # test nesting works with or with out parent keys
    assert sformat('{k:>{size}}', k=42, size=3) == ' 42'
    assert sformat('{k:>{size}}', size=3) == '{k:>3}'


@pytest.mark.parametrize(
    ('s', 'expected'),
    [
        ('{a} {b}', '1 2.0'),
        ('{z} {y}', '{z} {y}'),
        ('{a} {a:2d} {a:04d} {y:2d} {z:04d}', '1  1 0001 {y:2d} {z:04d}'),
        ('{a!s} {z!s} {d!r}', '1 {z!s} {\'k\': \'v\'}'),
        ('{a!s:>2s} {z!s:>2s}', ' 1 {z!s:>2s}'),
        ('{a!s:>{a}s} {z!s:>{z}s}', '1 {z!s:>{z}s}'),
        ('{a.imag} {z.y}', '0 {z.y}'),
        ('{e[0]:03d} {z[0]:03d}', '042 {z[0]:03d}'),
    ],
    ids=[
        'normal',
        'none',
        'formatting',
        'coercion',
        'formatting+coercion',
        'nesting',
        'getattr',
        'getitem',
    ]
)
def test_sformat(s, expected):
    # test a bunch of random stuff
    data = dict(
        a=1,
        b=2.0,
        c='3',
        d={'k': 'v'},
        e=[42],
    )
    assert expected == sformat(s, **data)

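After reading @Sam Bourne's comment, I modified @SvenMarach's code (https://ideone.com/DZJO1I) to work properly with coercion (like {a!s:>2s}) without writing a custom parser. The basic idea is not to convert to a string, but to concatenate the missing key with the coercion tag.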

import string


class MissingKey(object):
    def __init__(self, key):
        self.key = key

    def __str__(self):  # Supports {key!s}
        return MissingKeyStr("".join([self.key, "!s"]))

    def __repr__(self):  # Supports {key!r}
        return MissingKeyStr("".join([self.key, "!r"]))

    def __format__(self, spec):  # Supports {key:spec}
        if spec:
            return "".join(["{", self.key, ":", spec, "}"])
        return "".join(["{", self.key, "}"])

    def __getitem__(self, i):  # Supports {key[i]}
        return MissingKey("".join([self.key, "[", str(i), "]"]))

    def __getattr__(self, name):  # Supports {key.name}
        return MissingKey("".join([self.key, ".", name]))


class MissingKeyStr(MissingKey, str):
    def __init__(self, key):
        if isinstance(key, MissingKey):
            self.key = "".join([key.key, "!s"])
        else:
            self.key = key


class SafeFormatter(string.Formatter):
    def __init__(self, default=lambda k: MissingKey(k)):
        self.default = default

    def get_value(self, key, args, kwds):
        if isinstance(key, str):
            return kwds.get(key, self.default(key))
        else:
            return super().get_value(key, args, kwds)

Use it (for example) like this:

SafeFormatter().format("{a:<5} {b:<10}", a=10)

The following tests (inspired by the tests from @norok2) compare the output of the traditional format_map and of a safe_format_map based on the class above in two cases: with the correct keywords provided, and without them.

def safe_format_map(text, source):
    return SafeFormatter().format(text, **source)


test_texts = (
    '{a} ',         # simple nothing useful in source
    '{a:5d}',       # formatting
    '{a!s}',        # coercion
    '{a!s:>{a}s}',  # formatting and coercion
    '{a:0{a}d}',    # nesting
    '{d[x]}',       # indexing
    '{d.values}',   # member
)

source = dict(a=10, d=dict(x='FOO'))
funcs = [safe_format_map,
         str.format_map
         # safe_format_alt  # Version based on parsing (See @norok2)
         ]
n = 18
for text in test_texts:
    # full_source = {**dict(b='---', f=dict(g='Oh yes!')), **source}
    # print('{:>{n}s} :   OK   : '.format('str.format_map', n=n) + text.format_map(full_source))
    print("Testing:", text)
    for func in funcs:
        try:
            print(f'{func.__name__:>{n}s} : OK\t\t\t: ' + func(text, dict()))
        except:
            print(f'{func.__name__:>{n}s} : FAILED')

        try:
            print(f'{func.__name__:>{n}s} : OK\t\t\t: ' + func(text, source))
        except:
            print(f'{func.__name__:>{n}s} : FAILED')

Which outputs

Testing: {a}
safe_format_map : OK         : {a}
safe_format_map : OK         : 10
format_map : FAILED
format_map : OK         : 10
Testing: {a:5d}
safe_format_map : OK         : {a:5d}
safe_format_map : OK         :    10
format_map : FAILED
format_map : OK         :    10
Testing: {a!s}
safe_format_map : OK         : {a!s}
safe_format_map : OK         : 10
format_map : FAILED
format_map : OK         : 10
Testing: {a!s:>{a}s}
safe_format_map : OK         : {a!s:>{a}s}
safe_format_map : OK         :         10
format_map : FAILED
format_map : OK         :         10
Testing: {a:0{a}d}
safe_format_map : OK         : {a:0{a}d}
safe_format_map : OK         : 0000000010
format_map : FAILED
format_map : OK         : 0000000010
Testing: {d[x]}
safe_format_map : OK         : {d[x]}
safe_format_map : OK         : FOO
format_map : FAILED
format_map : OK         : FOO
Testing: {d.values}
safe_format_map : OK         : {d.values}
safe_format_map : OK         : <built-in method values of dict object at 0x7fe61e230af8>
format_map : FAILED
format_map : OK         : <built-in method values of dict object at 0x7fe61e230af8>

Here's a regex-based solution. Note that this will NOT work with nested format specifiers like {foo:{width}}, but it does fix some of the problems the other answers have.

import re


def partial_format(s, **kwargs):
    parts = re.split(r'(\{[^}]*\})', s)
    for k, v in kwargs.items():
        for idx, part in enumerate(parts):
            if re.match(rf'\{{{k}[!:}}]', part):  # Placeholder keys must always be followed by '!', ':', or the closing '}'
                parts[idx] = parts[idx].format_map({k: v})
    return ''.join(parts)


# >>> partial_format('{foo} {bar:1.3f}', foo='FOO')
# 'FOO {bar:1.3f}'
# >>> partial_format('{foo} {bar:1.3f}', bar=1)
# '{foo} 1.000'

TL;DR: Problem: defaultdict fails for {foobar[a]} if foobar is not set:

from collections import defaultdict


text = "{bar}, {foo}, {foobar[a]}" # {bar} is set, {foo} is "", {foobar[a]} fails
text.format_map(defaultdict(str, bar="A")) # TypeError: string indices must be integers

Solution: copy the DefaultWrapper class from the EDIT section, then:

text = "{bar}, {foo}, {foobar[a]}"
text.format_map(DefaultWrapper(bar="A")) # "A, , " (missing replaced with empty str)


# Even this works:
foobar = {"c": "C"}
text = "{foobar[a]}, {foobar[c]}"
text.format_map(DefaultWrapper(foobar=foobar)) # ", C" missing indices are also replaced

Note that index and attribute access do not work in one of the posted solutions. The following code raises TypeError: string indices must be integers

from collections import defaultdict


text = "{foo} '{bar[index]}'"
text.format_map(defaultdict(str, foo="FOO")) # raises a TypeError

To fix this, you can combine the collections.defaultdict solution with a custom default object that supports indexing. The DefaultWrapper object returns itself on index and attribute access, which allows indexing or attribute access any number of times without errors.

Note that this can be extended to allow containers that hold only part of the requested values. Take a look at the EDIT below.

class DefaultWrapper:
    def __repr__(self):
        return "Empty default value"

    def __str__(self):
        return ""

    def __format__(self, format_spec):
        return ""

    def __getattr__(self, name):
        return self

    def __getitem__(self, name):
        return self

    def __contains__(self, name):
        return True


text = "'{foo}', '{bar[index][i]}'"
print(text.format_map(defaultdict(DefaultWrapper, foo="FOO")))
# 'FOO', ''

EDIT: partially filled containers

The class above can be extended to support partially filled containers:

text = "'{foo[a]}', '{foo[b]}'"
foo = {"a": "A"}


print(text.format_map(defaultdict(DefaultWrapper, foo=foo)))
# KeyError: 'b'

The idea is to replace the defaultdict with the DefaultWrapper entirely. The DefaultWrapper object wraps around the container and returns either the requested value from the container (itself wrapped in a DefaultWrapper object) or the container as a string. This mimics a mapping of unlimited depth while still returning every value that is present.

kwargs is added only for convenience, so that it looks more like the defaultdict solution.

class DefaultWrapper:
    """A wrapper around the `container` to allow accessing with a default value."""
    ignore_str_format_errors = True

    def __init__(self, container="", **kwargs):
        self.container = container
        self.kwargs = kwargs

    def __repr__(self):
        return "DefaultWrapper around '{}'".format(repr(self.container))

    def __str__(self):
        return str(self.container)

    def __format__(self, format_spec):
        try:
            return self.container.__format__(format_spec)
        except TypeError as e:
            if DefaultWrapper.ignore_str_format_errors or self.container == "":
                return str(self)
            else:
                raise e

    def __getattr__(self, name):
        try:
            return DefaultWrapper(getattr(self.container, name))
        except AttributeError:
            return DefaultWrapper()

    def __getitem__(self, name):
        try:
            return DefaultWrapper(self.container[name])
        except (TypeError, LookupError):
            try:
                return DefaultWrapper(self.kwargs[name])
            except (TypeError, LookupError):
                return DefaultWrapper()

    def __contains__(self, name):
        return True

Now all the examples work without errors:

text = "'{foo[a]}', '{foo[b]}'"
foo = {"a": "A"}
print(text.format_map(DefaultWrapper(foo=foo)))
# 'A', ''


text = "'{foo}', '{bar[index][i]}', '{foobar[a]}', '{foobar[b]}'"
print(text.format_map(DefaultWrapper(foo="Foo", foobar={"a": "A"})))
# 'Foo', '', 'A', ''


# the old way still works the same as before
from collections import defaultdict
text = "'{foo}', '{bar[index][i]}'"
print(text.format_map(defaultdict(DefaultWrapper, foo="FOO")))
# 'FOO', ''

This is how we did it:

import traceback




def grab_key_from_exc(exc):
    last_line_idx = exc[:-1].rfind('\n')
    last_line = exc[last_line_idx:]

    quote_start = last_line.find("'")
    quote_end = last_line.rfind("'")

    key_name = last_line[quote_start+1:quote_end]
    return key_name


def partial_format(input_string, **kwargs):
    while True:
        try:
            return input_string.format(**kwargs)
        except:
            exc = traceback.format_exc()
            key = grab_key_from_exc(exc)
            kwargs[key] = '{'+key+'}'
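A possible variation on the same retry idea, sketched here as an alternative rather than as the answer's own method: catch KeyError specifically and read the missing name from the exception's args instead of parsing the traceback text. The guard against a repeated key is an addition, meant to avoid looping forever when the KeyError comes from an index lookup such as {d[x]} on a dict that is present.

```python
def partial_format(input_string, **kwargs):
    """Sketch: retry format(), adding a placeholder for each missing key."""
    values = dict(kwargs)  # do not mutate the caller's mapping
    while True:
        try:
            return input_string.format(**values)
        except KeyError as exc:
            missing = exc.args[0]
            if missing in values:
                # The error is not a missing top-level field; give up.
                raise
            values[missing] = "{" + missing + "}"


result = partial_format("{foo} {bar}", foo="FOO")
print(result)  # FOO {bar}
```

Other exception types propagate unchanged here, whereas the bare except above would try to turn them into keys.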