What is the purpose of collections.ChainMap?

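In Python 3.3, a ChainMap class was added to the collections module:

A ChainMap class is provided for quickly linking a number of mappings so they can be treated as a single unit. It is often much faster than creating a new dictionary and running multiple update() calls.

For example: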

>>> from collections import ChainMap
>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 10, 'c': 11}
>>> z = ChainMap(y, x)
>>> for k, v in z.items():
...     print(k, v)
...
a 1
c 11
b 10

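It was motivated by this issue and made public by this one (no PEP was created).

As far as I understand, it is an alternative to keeping an extra dictionary and maintaining it with update() calls.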

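The questions are:

  • What use cases does ChainMap cover?
  • Are there any real-world examples of ChainMap?
  • Is it used in third-party libraries that have switched to Python 3?

Bonus question: is there a way to use it on Python 2.x?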


I heard about it in Raymond Hettinger's PyCon talk Transforming Code into Beautiful, Idiomatic Python, and I'd like to add it to my toolbox, but I don't understand when I should use it.


I'll take a crack at this:

ChainMap looks like a very just-so kind of abstraction. It's a good solution for a very specialized kind of problem. I propose this use case.

If you have:

  1. multiple mappings (e.g., dicts)
  2. some duplication of keys in those mappings (the same key can appear in multiple mappings, but it is not the case that all keys appear in all mappings)
  3. a consuming application that wants the value of a key from the "highest priority" mapping, where the mappings are totally ordered for any given key. (Mappings may share a priority only if they are known to have no keys in common. In the Python application, packages can live in the same directory (same priority) but must have different names, so, by definition, the symbol names in that directory cannot be duplicates.)
  4. the consuming application does not need to change the value of a key
  5. while at the same time the mappings must maintain their independent identity and can be changed asynchronously by an external force
  6. and the mappings are big enough, expensive enough to access, or change often enough between application accesses, that the cost of computing the projection (3) each time your app needs it is a significant performance concern for your application...

Then, you might consider using a ChainMap to create a view over the collection of mappings.
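As a rough illustration of points 3 and 5, here is a minimal sketch of a ChainMap acting as a live view. The dict names and paths are made up for the example; the point is only that the ChainMap holds references to the underlying mappings, so changes made to them elsewhere show up on the next lookup.

```python
from collections import ChainMap

# Hypothetical mappings, listed highest priority first.
local_pkgs = {"requests": "/home/me/lib/requests"}
system_pkgs = {"requests": "/usr/lib/requests", "numpy": "/usr/lib/numpy"}

view = ChainMap(local_pkgs, system_pkgs)

first = view["requests"]      # found in local_pkgs, the first mapping searched
fallback = view["numpy"]      # only present in system_pkgs

# The underlying dicts keep their own identity; changing them directly
# (as an external party might) is visible through the view.
system_pkgs["scipy"] = "/usr/lib/scipy"
del local_pkgs["requests"]

after_change = view["requests"]   # now resolved from system_pkgs
sees_new_key = "scipy" in view
```

No copy is made when the ChainMap is built, which is why it stays in sync without any extra update() calls.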

But this is all after-the-fact justification. The Python guys had a problem, came up with a good solution in the context of their code, then did some extra work to abstract their solution so we could use it if we choose. More power to them. But whether it's appropriate for your problem is up to you to decide.

I could see using ChainMap for a configuration object where you have multiple scopes of configuration like command line options, a user configuration file, and a system configuration file. Since lookups are ordered by the order in the constructor argument, you can override settings at lower scopes. I've not personally used or seen ChainMap used, but that's not surprising since it is a fairly recent addition to the standard library.
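A small sketch of that configuration idea, assuming three plain dicts stand in for the real sources (the setting names and values are invented for illustration; a real program would fill them from something like argparse and config files):

```python
from collections import ChainMap

# Hypothetical settings from three scopes.
command_line = {"color": "never"}
user_config = {"color": "auto", "editor": "vim"}
system_config = {"color": "always", "editor": "nano", "pager": "less"}

# Earlier arguments are searched first, so they take precedence.
config = ChainMap(command_line, user_config, system_config)

resolved = {key: config[key] for key in ("color", "editor", "pager")}
```

Here "color" comes from the command line, "editor" from the user file, and "pager" falls through to the system file.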

It might also be useful for emulating stack frames where you push and pop variable bindings if you were trying to implement a lexical scope yourself.
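For the scope idea, ChainMap has new_child() and parents, which map fairly directly onto push and pop. A minimal sketch (the variable names are only for illustration):

```python
from collections import ChainMap

scope = ChainMap({"x": 1, "y": 2})   # outermost (global) bindings

scope = scope.new_child()            # push a new, empty frame
scope["x"] = 10                      # writes only touch the newest frame
inner_x, inner_y = scope["x"], scope["y"]

scope = scope.parents                # pop the frame
outer_x = scope["x"]                 # the outer binding was never modified
```

Lookups fall through to enclosing frames, while assignments shadow rather than overwrite, which is roughly how nested scopes behave.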

The standard library docs for ChainMap give several examples and links to similar implementations in third-party libraries. Specifically, it names Django’s Context class and Enthought's MultiContext class.

I like @b4hand's examples, and indeed I have in the past used ChainMap-like structures (but not ChainMap itself) for the two purposes he mentions: multi-layered configuration overrides, and variable stack/scope emulation.

I'd like to point out two other motivations/advantages/differences of ChainMap, compared to using a dict-update loop and thus only storing the "final" version:

  1. More information: since a ChainMap structure is "layered", it supports answering questions like: Am I getting the "default" value, or an overridden one? What is the original ("default") value? At what level did the value get overridden (borrowing @b4hand's config example: user-config or command-line-overrides)? Using a simple dict, the information needed for answering these questions is already lost.

  2. Speed tradeoff: suppose you have N layers and at most M keys in each. Constructing a ChainMap takes O(N) and each lookup is O(N) worst-case[*], while constructing a dict with an update loop takes O(NM) and each lookup is O(1). This means that if you construct often and only perform a few lookups each time, or if M is big, ChainMap's lazy-construction approach works in your favor.

[*] The analysis in (2) assumes dict-access is O(1), when in fact it is O(1) on average, and O(M) worst case. See more details here.
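To make point (1) concrete, here is a sketch that walks the maps attribute to report which layer supplies a value. The helper source_of and the layer names are hypothetical, not part of ChainMap; the merged dict at the end is the update-loop alternative from point (2), shown for comparison.

```python
from collections import ChainMap

# Hypothetical configuration layers.
defaults = {"color": "always", "pager": "less"}
user_config = {"color": "auto"}
command_line = {}

config = ChainMap(command_line, user_config, defaults)
layer_names = ["command line", "user config", "defaults"]  # same order as config.maps

def source_of(key):
    """Return the name of the first layer that defines key, or None."""
    for name, mapping in zip(layer_names, config.maps):
        if key in mapping:
            return name
    return None

color_source = source_of("color")
pager_source = source_of("pager")
original_color = defaults["color"]   # the default is still recoverable

# The update-loop alternative: lowest priority first, so later layers win.
merged = {}
for mapping in reversed(config.maps):
    merged.update(mapping)
```

The merged dict gives the same answers for lookups, but nothing in it records that "color" was overridden or what the default was.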

To imperfectly answer your:

Bonus question: is there a way to use it on Python2.x?

from ConfigParser import _Chainmap as ChainMap

However, keep in mind that this isn't a real ChainMap; it inherits from DictMixin and only defines:

__init__(self, *maps)
__getitem__(self, key)
keys(self)


# And from DictMixin:
__iter__(self)
has_key(self, key)
__contains__(self, key)
iteritems(self)
iterkeys(self)
itervalues(self)
values(self)
items(self)
clear(self)
setdefault(self, key, default=None)
pop(self, key, *args)
popitem(self)
update(self, other=None, **kwargs)
get(self, key, default=None)
__repr__(self)
__cmp__(self, other)
__len__(self)

Its implementation also doesn't seem particularly efficient.