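“if key in dict” vs. “try/except”: which is the more readable idiom?

I have a question about idioms and readability, and there seems to be a clash of Python philosophies for this particular case:

I want to build dictionary A from dictionary B. If a specific key does not exist in B, then do nothing and continue on.

Which way is better?

try:
    A["blah"] = B["blah"]
except KeyError:
    pass

or

if "blah" in B:
    A["blah"] = B["blah"]

“Ask for forgiveness” vs. “simple and explicit”.

Which is better and why?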


From what I understand, you want to update dict A with key,value pairs from dict B

update is a better choice.

A.update(B)

Example:

>>> A = {'a':1, 'b': 2, 'c':3}
>>> B = {'d': 2, 'b':5, 'c': 4}
>>> A.update(B)
>>> A
{'a': 1, 'c': 4, 'b': 5, 'd': 2}
>>>

Personally, I lean towards the second method (but using has_key):

if B.has_key("blah"):
    A["blah"] = B["blah"]

That way, each assignment operation is only two lines (instead of 4 with try/except), and any exceptions that get thrown will be real errors or things you've missed (instead of just trying to access keys that aren't there).

As it turns out (see the comments on your question), has_key is deprecated (and was removed entirely in Python 3), so I guess it's better written as

if "blah" in B:
    A["blah"] = B["blah"]

I think the general rule here is: will B["blah"] normally exist? If so, try/except is good; if not, use if "blah" in B.

I think "try" is cheap in time but "except" is more expensive.

The rule in other languages is to reserve exceptions for exceptional conditions, i.e. errors that don't occur in regular use. Don't know how that rule applies to Python, as StopIteration shouldn't exist by that rule.
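To illustrate the StopIteration point: a for loop in Python ends when the iterator raises StopIteration, so the language itself uses an exception for a perfectly ordinary condition. The loop below is only a rough sketch of what a for loop does internally; the variable names are made up for the example.

```python
numbers = [1, 2, 3]
it = iter(numbers)
seen = []

# Roughly what "for n in numbers: seen.append(n)" does under the hood.
while True:
    try:
        seen.append(next(it))
    except StopIteration:
        # Normal end of iteration, not an error condition.
        break

print(seen)  # [1, 2, 3]
```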

I think the second example is what you should go for unless this code makes sense:

try:
    A["foo"] = B["foo"]
    A["bar"] = B["bar"]
    A["baz"] = B["baz"]
except KeyError:
    pass

Keep in mind that code will abort as soon as there is a key that isn't in B. If this code makes sense, then you should use the exception method, otherwise use the test method. In my opinion, because it's shorter and clearly expresses the intent, it's a lot easier to read than the exception method.
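A small sketch of that abort behaviour, assuming a hypothetical B that has "foo" and "baz" but no "bar": the grouped try block stops at the first missing key, while a per-key test copies everything that is present.

```python
# Hypothetical data: B is missing "bar".
B = {"foo": 1, "baz": 3}

A = {}
try:
    A["foo"] = B["foo"]
    A["bar"] = B["bar"]  # raises KeyError, so the next line never runs
    A["baz"] = B["baz"]
except KeyError:
    pass

print(A)  # {'foo': 1}

# Testing each key separately copies every key that exists in B.
C = {}
for key in ("foo", "bar", "baz"):
    if key in B:
        C[key] = B[key]

print(C)  # {'foo': 1, 'baz': 3}
```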

Of course, the people telling you to use update are correct. If you are using a version of Python that supports dictionary comprehensions, I would strongly prefer this code:

updateset = {'foo', 'bar', 'baz'}
A.update({k: B[k] for k in updateset if k in B})

Exceptions are not conditionals.

The conditional version is clearer. That's natural: this is straightforward flow control, which is what conditionals are designed for, not exceptions.

The exception version is primarily used as an optimization when doing these lookups in a loop: for some algorithms it allows eliminating tests from inner loops. It doesn't have that benefit here. It has the small advantage that it avoids having to say "blah" twice, but if you're doing a lot of these you should probably have a helper move_key function anyway.

In general, I'd strongly recommend sticking with the conditional version by default unless you have a specific reason not to. Conditionals are the obvious way to do this, which is usually a strong recommendation to prefer one solution over another.
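For what it's worth, the move_key helper mentioned above could look something like this. It is only a sketch: the signature and argument order are my assumptions, and despite the name it copies the value rather than removing it from the source.

```python
def move_key(src, dst, key):
    """Copy src[key] into dst if the key exists; otherwise do nothing.

    Hypothetical helper: name and signature are illustrative only.
    """
    if key in src:
        dst[key] = src[key]


A = {}
B = {"foo": 1, "blah": 2}
for key in ("foo", "bar", "blah"):
    move_key(B, A, key)

print(A)  # {'foo': 1, 'blah': 2}
```

This keeps the conditional in one place, so each call site names the key only once.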

There is also a third way that avoids both exceptions and double-lookup, which can be important if the lookup is expensive:

value = B.get("blah", None)
if value is not None:
    A["blah"] = value

In case you expect the dictionary to contain None values, you can use some more esoteric constants like NotImplemented, Ellipsis or make a new one:

MyConst = object()
def update_key(A, B, key):
    value = B.get(key, MyConst)
    if value is not MyConst:
        A[key] = value

Anyway, using update() is the most readable option for me:

a.update((k, b[k]) for k in ("foo", "bar", "blah") if k in b)

Direct quote from Python performance wiki:

Except for the first time, each time a word is seen the if statement's test fails. If you are counting a large number of words, many will probably occur multiple times. In a situation where the initialization of a value is only going to occur once and the augmentation of that value will occur many times it is cheaper to use a try statement.

So it seems that both options are viable depending on the situation. For more details you might like to check this link out: Try-except-performance
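The quote refers to a word-counting loop. As a rough reconstruction (the sample text and variable names are my own, not taken from the wiki), the two variants would look something like this:

```python
words = "the quick brown fox jumps over the lazy dog the end".split()

# Test first: the membership check runs for every word,
# even though it only succeeds the first time a word is seen.
counts_if = {}
for word in words:
    if word not in counts_if:
        counts_if[word] = 0
    counts_if[word] += 1

# try/except: the exception is raised only the first time a word is seen;
# every later occurrence takes the cheap path through the try block.
counts_try = {}
for word in words:
    try:
        counts_try[word] += 1
    except KeyError:
        counts_try[word] = 1

print(counts_if == counts_try)  # True
print(counts_try["the"])        # 3
```

Both loops produce the same counts; they differ only in which path is cheap, which is why the ratio of hits to misses matters.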

Starting with Python 3.8 and the introduction of assignment expressions (PEP 572, the := operator), we can capture the result of dictB.get('hello', None) in a variable value, check that it is not None (dict.get('hello', None) returns either the associated value or None), and then use it within the body of the condition:

# dictB = {'hello': 5, 'world': 42}
# dictA = {}
if (value := dictB.get('hello', None)) is not None:
    dictA["hello"] = value
# dictA is now {'hello': 5}
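One caveat when combining := with dict.get: a bare truthiness test also skips falsy values such as 0 or an empty string, so comparing against None explicitly is the safer form. A minimal sketch with a hypothetical dictB that stores 0:

```python
# Requires Python 3.8+ for the := operator.
dictB = {"hello": 0, "world": 42}

truthy_copy = {}
if value := dictB.get("hello"):
    truthy_copy["hello"] = value  # skipped, because 0 is falsy

explicit_copy = {}
if (value := dictB.get("hello")) is not None:
    explicit_copy["hello"] = value  # runs, because 0 is not None

print(truthy_copy)    # {}
print(explicit_copy)  # {'hello': 0}
```

Neither form can tell a missing key apart from a key whose stored value is None.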

Though the accepted answer's emphasis on the "look before you leap" principle might apply to most languages, the first approach might be more Pythonic, based on Python's principles. Not to mention that it is a legitimate coding style in Python. The important thing is to use the try/except block in the right context and to follow best practices: for example, avoid doing too many things in a try block, catching a very broad exception, or, worse, using a bare except clause.
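As a sketch of what those practices can look like for the question's example (one possible layout, not the only one): keep just the lookup inside try, catch only KeyError, and put the follow-up work in else.

```python
A = {}
B = {"blah": 1}

try:
    value = B["blah"]   # only the lookup that may fail sits in the try block
except KeyError:        # catch the specific exception, never a bare except
    pass
else:
    A["blah"] = value   # runs only when the lookup succeeded

print(A)  # {'blah': 1}
```

With the assignment in else, a KeyError raised by anything other than the lookup in B would not be silently swallowed.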

Easier to ask for forgiveness than permission. (EAFP)

See the python docs reference here.

Also, this blog from Brett, one of the core devs, touches most of this in brief.

See another SO discussion here:

In addition to discussing readability, I think performance also matters in some scenarios. A quick timeit benchmark indicates that a test (i.e. “asking permission”) is actually slightly faster than handling the exception (i.e. “asking forgiveness”).

Here’s the code to set up the benchmark, generating a largeish dictionary of random key-value pairs:

setup = """
import random, string
d = {"".join(random.choices(string.ascii_letters, k=3)): "".join(random.choices(string.ascii_letters, k=3)) for _ in range(10000)}
"""

Then the if test:

stmt1 = """
key = "".join(random.choices(string.ascii_letters, k=3))
if key in d:
    _ = d[key]
"""

gives us:

>>> timeit.timeit(stmt=stmt1, setup=setup, number=1000000)
1.6444563979999884

whereas the approach utilizing the exception

stmt2 = """
key = "".join(random.choices(string.ascii_letters, k=3))
try:
    _ = d[key]
except KeyError:
    pass
"""

gives us:

>>> timeit.timeit(stmt=stmt2, setup=setup, number=1000000)
1.8868465850000575

Interestingly, hoisting the key generation out of the benchmarked statement and into the setup, so that the same key is looked up over and over, delivers vastly different numbers:

>>> timeit.timeit(stmt=stmt1, setup=setup, number=100000000)
2.3290171539999847
>>> timeit.timeit(stmt=stmt2, setup=setup, number=100000000)
26.412447488999987

I don’t want to speculate whether this emphasizes the benefits of a test vs. exception handling, or if the dictionary buffers the result of the previous lookup and thus biases the benchmark results towards testing… 🤔
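One way to probe this further would be to time the hit case and the miss case separately, since the two approaches only differ in cost when the key is absent. The following is a sketch under my own assumptions (a small dictionary with string keys, the hypothetical names hit and miss, and a reduced iteration count); I'm not claiming any particular numbers, as they depend on the interpreter version and the machine.

```python
import timeit

setup = """
d = {str(i): i for i in range(10000)}
hit = "5000"      # a key that exists in d
miss = "missing"  # a key that does not exist in d
"""

# {key} is filled in with the variable name "hit" or "miss" below.
test_stmt = """
if {key} in d:
    _ = d[{key}]
"""

try_stmt = """
try:
    _ = d[{key}]
except KeyError:
    pass
"""

results = {}
for name in ("hit", "miss"):
    results[name, "if"] = timeit.timeit(
        test_stmt.format(key=name), setup=setup, number=100000)
    results[name, "try"] = timeit.timeit(
        try_stmt.format(key=name), setup=setup, number=100000)

for (name, style), seconds in results.items():
    print(f"{name:4} {style:3} {seconds:.4f}s")
```

If the try/except variant is only slow in the miss rows, that would point to the cost of raising and catching the exception rather than to any caching in the dictionary.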