If you also need to retrieve the values for the keys you are removing, this would be a pretty good way to do it:
values_removed = [d.pop(k, None) for k in entities_to_remove]
You could of course still do this just for the removal of the keys from d, but you would be unnecessarily creating the list of values with the list comprehension. It is also a little unclear to use a list comprehension just for the function's side effect.
I have no problem with any of the existing answers, but I was surprised to not find this solution:
keys_to_remove = ['a', 'b', 'c']
my_dict = {k: v for k, v in zip("a b c d e f g".split(' '), [0, 1, 2, 3, 4, 5, 6])}
for k in keys_to_remove:
try:
del my_dict[k]
except KeyError:
pass
assert my_dict == {'d': 3, 'e': 4, 'f': 5, 'g': 6}
Note: I stumbled across this question coming from here. And my answer is related to this answer.
It would be nice to have full support for set methods for dictionaries (and not the unholy mess we're getting with Python 3.9) so that you could simply "remove" a set of keys. However, as long as that's not the case, and you have a large dictionary with potentially a large number of keys to remove, you might want to know about the performance. So, I've created some code that creates something large enough for meaningful comparisons: a 100,000 x 1000 matrix, so 10,000,00 items in total.
from itertools import product
from time import perf_counter
# make a complete worksheet 100000 * 1000
start = perf_counter()
prod = product(range(1, 100000), range(1, 1000))
cells = {(x,y):x for x,y in prod}
print(len(cells))
print(f"Create time {perf_counter()-start:.2f}s")
clock = perf_counter()
# remove everything above row 50,000
keys = product(range(50000, 100000), range(1, 100))
# for x,y in keys:
# del cells[x, y]
for n in map(cells.pop, keys):
pass
print(len(cells))
stop = perf_counter()
print(f"Removal time {stop-clock:.2f}s")
10 million items or more is not unusual in some settings. Comparing the two methods on my local machine I see a slight improvement when using map and pop, presumably because of fewer function calls, but both take around 2.5s on my machine. But this pales in comparison to the time required to create the dictionary in the first place (55s), or including checks within the loop. If this is likely then its best to create a set that is a intersection of the dictionary keys and your filter:
keys = cells.keys() & keys
In summary: del is already heavily optimised, so don't worry about using it.
Some timing tests for cpython 3 shows that a simple for loop is the fastest way, and it's quite readable. Adding in a function doesn't cause much overhead either:
timeit results (10k iterations):
all(x.pop(v) for v in r) # 0.85
all(map(x.pop, r)) # 0.60
list(map(x.pop, r)) # 0.70
all(map(x.__delitem__, r)) # 0.44
del_all(x, r) # 0.40
<inline for loop>(x, r) # 0.35
def del_all(mapping, to_remove):
"""Remove list of elements from mapping."""
for key in to_remove:
del mapping[key]
For small iterations, doing that 'inline' was a bit faster, because of the overhead of the function call. But del_all is lint-safe, reusable, and faster than all the python comprehension and mapping constructs.
# Method 1: `del`
for key in remove_keys:
if key in d:
del d[key]
# Method 2: `pop()`
for key in remove_keys:
d.pop(key, None)
# Method 3: comprehension
{key: v for key, v in d.items() if key not in remove_keys}
Here are the results of 1M iterations:
del: 2.03s 2.0 ns/iter (100%)
pop(): 2.38s 2.4 ns/iter (117%)
comprehension: 4.11s 4.1 ns/iter (202%)
So both del and pop() are the fastest. Comprehensions are 2x slower.
But anyway, we speak nanoseconds here :) Dicts in Python are ridiculously fast.
Another map() way to remove list of keys from dictionary
and avoid raising KeyError exception
dic = {
'key1': 1,
'key2': 2,
'key3': 3,
'key4': 4,
'key5': 5,
}
keys_to_remove = ['key_not_exist', 'key1', 'key2', 'key3']
k = list(map(dic.pop, keys_to_remove, keys_to_remove))
print('k=', k)
print('dic after = \n', dic)
**this will produce output**
k= ['key_not_exist', 1, 2, 3]
dic after = {'key4': 4, 'key5': 5}
Duplicate keys_to_remove is artificial, it needs to supply defaults values for dict.pop() function.
You can add here any array with len_ = len(key_to_remove)