Slicing a dictionary

I have a dictionary and would like to pass a part of it to a function, that part being given by a list (or tuple) of keys. Like so:

# the dictionary
d = {1:2, 3:4, 5:6, 7:8}


# the subset of keys I'm interested in
l = (1,5)

Now, ideally I'd like to be able to do this:

>>> d[l]
{1:2, 5:6}

... but that doesn't work, since it looks for a key matching the tuple (1,5), the same as d[1,5].

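d{1,5} isn't even valid Python (as far as I can tell ...), though it might be handy: the curly braces suggest an unordered set or a dictionary, so returning a dictionary containing the specified keys would look very plausible to me.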

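d[{1,5}] would also make sense ("here's a set of keys, give me the matching items"), and {1, 5} is an unhashable set, so there can't be a key that matches it. But of course it raises an error, too.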
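To make the two failure modes concrete, here is a small sketch that records which exception each attempt raises. The errors dictionary is only an illustrative helper, not part of the original question.

```python
d = {1: 2, 3: 4, 5: 6, 7: 8}
l = (1, 5)

errors = {}

# d[l] is the same as d[1, 5]: the tuple (1, 5) is looked up as one key.
try:
    d[l]
except KeyError as exc:
    errors["tuple"] = type(exc).__name__

# A set is unhashable, so it cannot even be used for a lookup.
try:
    d[{1, 5}]
except TypeError as exc:
    errors["set"] = type(exc).__name__

print(errors)
```

The tuple lookup fails because no such key exists, while the set lookup fails before any comparison happens.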

I know I can do this:

>>> dict([(key, value) for key,value in d.iteritems() if key in l])
{1: 2, 5: 6}

or this:

>>> dict([(key, d[key]) for key in l])

This is more compact, but I feel there must be a "better" way. Am I missing a more elegant solution?

(I'm using Python 2.7)


Use a set to intersect on the dict.viewkeys() dictionary view:

l = {1, 5}
{key: d[key] for key in d.viewkeys() & l}

This is Python 2 syntax; in Python 3, use d.keys() instead of d.viewkeys().

This still uses a loop, but at least the dictionary comprehension is a lot more readable. Using set intersections is very efficient, even if d or l is large.

Demo:

>>> d = {1:2, 3:4, 5:6, 7:8}
>>> l = {1, 5}
>>> {key: d[key] for key in d.viewkeys() & l}
{1: 2, 5: 6}
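For readers on Python 3, the same idea might look like the sketch below. The helper name slice_by_keys is hypothetical, and the key 9 is added only to show that keys absent from d are dropped by the intersection.

```python
def slice_by_keys(d, keys):
    """Return a new dict holding only the entries of d whose key is in keys."""
    # In Python 3, d.keys() is a set-like view, so it supports & directly.
    return {key: d[key] for key in d.keys() & set(keys)}


d = {1: 2, 3: 4, 5: 6, 7: 8}
result = slice_by_keys(d, (1, 5, 9))  # 9 is not in d and is ignored
print(result)
```

Because the comprehension iterates over a set, the order of the resulting items should not be relied on.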

You should iterate over the tuple and check whether each key is in the dict, not the other way around. If you don't check that a key exists and it is not in the dict, you will get a KeyError:

print({k:d[k] for k in l if k in d})
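A short sketch of the difference, assuming a key tuple that deliberately contains one key (9) missing from d. The names unchecked, checked and missing_key are illustrative only.

```python
d = {1: 2, 3: 4, 5: 6, 7: 8}
l = (1, 5, 9)  # 9 is deliberately not a key of d

# Without the membership test, the first missing key raises KeyError.
try:
    unchecked = {k: d[k] for k in l}
except KeyError as exc:
    unchecked = None
    missing_key = exc.args[0]

# With the membership test, missing keys are skipped.
checked = {k: d[k] for k in l if k in d}
print(checked)
```

Iterating over l also keeps the items in the order the keys were requested.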

Some timings:

 {k:d[k] for k in set(d).intersection(l)}


In [22]: %%timeit
l = xrange(100000)
{k:d[k] for k in l}
....:
100 loops, best of 3: 11.5 ms per loop


In [23]: %%timeit
l = xrange(100000)
{k:d[k] for k in set(d).intersection(l)}
....:
10 loops, best of 3: 20.4 ms per loop


In [24]: %%timeit
l = xrange(100000)
l = set(l)
{key: d[key] for key in d.viewkeys() & l}
....:
10 loops, best of 3: 24.7 ms per loop


In [25]: %%timeit
l = xrange(100000)
{k:d[k] for k in l if k in d}
....:
100 loops, best of 3: 17.9 ms per loop

I don't see how {k:d[k] for k in l} is not readable or elegant, and if all elements are in d, it is also pretty efficient.
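The timings above come from IPython on Python 2. A rough way to repeat the comparison on Python 3 with the standard timeit module is sketched below. The function name compare, the dictionary size and the repeat count are all arbitrary assumptions, and the absolute numbers will differ from machine to machine.

```python
import timeit


def compare(d, keys, number=200):
    """Time three ways of slicing d by keys; return (results, timings)."""
    candidates = {
        "plain": lambda: {k: d[k] for k in keys},
        "checked": lambda: {k: d[k] for k in keys if k in d},
        "intersection": lambda: {k: d[k] for k in d.keys() & set(keys)},
    }
    results = {name: fn() for name, fn in candidates.items()}
    timings = {name: timeit.timeit(fn, number=number)
               for name, fn in candidates.items()}
    return results, timings


# Every requested key is present, so all three approaches must agree.
d = {i: 2 * i for i in range(1000)}
results, timings = compare(d, range(500))

for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print("{:<13}{:.4f} s".format(name, seconds))
```

The plain variant is only safe when every requested key is known to exist, which is the case in this setup.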

Write a dict subclass that accepts a list of keys as an "item" and returns a "slice" of the dictionary:

class SliceableDict(dict):
    default = None

    def __getitem__(self, key):
        if isinstance(key, list):   # keep exactly one of the return statements below
            # uses default value if a key does not exist
            return {k: self.get(k, self.default) for k in key}
            # raises KeyError if a key does not exist
            # return {k: self[k] for k in key}
            # omits key if it does not exist
            # return {k: self[k] for k in key if k in self}
        # plain keys behave as in a normal dict (KeyError if missing)
        return dict.__getitem__(self, key)

Usage:

d = SliceableDict({1:2, 3:4, 5:6, 7:8})
d[[1, 5]]   # {1: 2, 5: 6}

Or if you want to use a separate method for this type of access, you can use * to accept any number of arguments:

class SliceableDict(dict):
    def slice(self, *keys):
        return {k: self[k] for k in keys}
        # or one of the others from the first example


d = SliceableDict({1:2, 3:4, 5:6, 7:8})
d.slice(1, 5)     # {1: 2, 5: 6}
keys = 1, 5
d.slice(*keys)    # same
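The three missing-key policies mentioned in the first example could also be exposed as separate methods, so the choice is explicit at the call site. This is only a sketch for Python 3: the method names slice_strict, slice_existing and slice_default are hypothetical and not part of the answer above.

```python
class SliceableDict(dict):
    """Sketch: one slicing method per missing-key policy."""

    def slice_strict(self, *keys):
        # Raises KeyError if any requested key is missing.
        return {k: self[k] for k in keys}

    def slice_existing(self, *keys):
        # Silently omits keys that are not present.
        return {k: self[k] for k in keys if k in self}

    def slice_default(self, *keys, default=None):
        # Fills in a default value for missing keys.
        return {k: self.get(k, default) for k in keys}


d = SliceableDict({1: 2, 3: 4, 5: 6, 7: 8})

existing = d.slice_existing(1, 5, 9)
padded = d.slice_default(1, 5, 9, default=0)

try:
    d.slice_strict(1, 5, 9)
    strict_failed = False
except KeyError:
    strict_failed = True
```

Here 9 is deliberately missing, so existing omits it, padded maps it to 0, and the strict variant raises.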

Set intersection and a dict comprehension can be used here:

# the dictionary
d = {1:2, 3:4, 5:6, 7:8}


# the subset of keys I'm interested in
l = (1,5)


>>> {key:d[key] for key in set(l) & set(d)}
{1: 2, 5: 6}

On Python 3 you can use itertools.islice to slice the dict.items() view:

import itertools


d = {1: 2, 3: 4, 5: 6}


dict(itertools.islice(d.items(), 2))


{1: 2, 3: 4}

Note: this solution does not take into account specific keys. It slices by internal ordering of d, which in Python 3.7+ is guaranteed to be insertion-ordered.
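islice also accepts start, stop and step arguments, so positional slices other than "the first n" are possible. A small sketch, assuming Python 3.7+ so that insertion order is guaranteed; the variable names are illustrative.

```python
from itertools import islice

d = {1: 2, 3: 4, 5: 6, 7: 8}

# islice(iterable, stop) and islice(iterable, start, stop, step)
first_two = dict(islice(d.items(), 2))             # positions 0 and 1
middle = dict(islice(d.items(), 1, 3))             # positions 1 and 2
every_other = dict(islice(d.items(), 0, None, 2))  # positions 0 and 2

print(first_two, middle, every_other)
```

Unlike list slicing, islice does not support negative indices, but it avoids building an intermediate list.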

# the dictionary
d = {1:2, 3:4, 5:6, 7:8}

# the subset of keys I'm interested in
l = (1,5)

# answer
{key: d[key] for key in l}

To slice a dictionary by position, convert it to a list of tuples using d.items(), slice the list, and create a dictionary out of the result.

For example:

d = {1:2, 3:4, 5:6, 7:8}

To get the first 2 items:

first_two = dict(list(d.items())[:2])

first_two

{1: 2, 3: 4}
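Because the items are materialised as a list, ordinary list slicing applies, including negative indices. A brief sketch, assuming Python 3.7+ for guaranteed insertion order; the variable names are illustrative.

```python
d = {1: 2, 3: 4, 5: 6, 7: 8}

items = list(d.items())  # [(1, 2), (3, 4), (5, 6), (7, 8)]

last_two = dict(items[-2:])     # the two most recently inserted items
without_ends = dict(items[1:-1])  # drop the first and last item

print(last_two, without_ends)
```

The cost is a full copy of the items, which may matter for very large dictionaries.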

Another option is to convert the dictionary into a pandas Series object and then select the specified index labels with .loc:

>>> d = {1:2, 3:4, 5:6, 7:8}
>>> l = [1,5]


>>> import pandas as pd
>>> pd.Series(d).loc[l].to_dict()
{1: 2, 5: 6}

My case is probably relatively uncommon, but, I'm posting it here nonetheless in case it helps someone (though not OP directly).

I came across this question while searching for how to slice a dictionary that held item counts. Basically, I had a dictionary where the keys were letters and the values were the number of times each letter appeared (e.g. abababc --> {'a': 3, 'b': 3, 'c': 1}). I wanted to 'slice' the dictionary so that I could return the n most common keys.

It turns out that this is exactly what a collections.Counter object is for: instead of needing to 'slice' my dictionary, I could simply convert it to a collections.Counter and then call most_common(n): https://docs.python.org/3/library/collections.html#collections.Counter.most_common
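A minimal sketch of that approach, using the abababc example from above. The variable names are illustrative; Counter and most_common are part of the standard library.

```python
from collections import Counter

# Count letters directly from the string...
counts = Counter("abababc")

# ...or wrap an existing dictionary of counts.
same_counts = Counter({'a': 3, 'b': 3, 'c': 1})

# most_common(n) returns a list of (key, count) pairs, highest count first.
top_two_pairs = counts.most_common(2)
top_two = dict(top_two_pairs)

print(top_two_pairs)
```

Passing the pairs to dict() turns the result back into a dictionary holding only the n most common keys.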