So, you essentially want to delete multiple elements in one pass? In that case, the position of the next element to delete will be offset by however many were deleted previously.
Our goal is to delete all the vowels, which are precomputed to be indices 1, 4, and 7. Note that its important the to_delete indices are in ascending order, otherwise it won't work.
to_delete = [1, 4, 7]
target = list("hello world")
for offset, index in enumerate(to_delete):
index -= offset
del target[index]
It'd be a more complicated if you wanted to delete the elements in any order. IMO, sorting to_delete might be easier than figuring out when you should or shouldn't subtract from index.
here is another method which removes the elements in place. also if your list is really long, it is faster.
>>> a = range(10)
>>> remove = [0,4,5]
>>> from collections import deque
>>> deque((list.pop(a, i) for i in sorted(remove, reverse=True)), maxlen=0)
>>> timeit.timeit('[i for j, i in enumerate(a) if j not in remove]', setup='import random;remove=[random.randrange(100000) for i in range(100)]; a = range(100000)', number=1)
0.1704120635986328
>>> timeit.timeit('deque((list.pop(a, i) for i in sorted(remove, reverse=True)), maxlen=0)', setup='from collections import deque;import random;remove=[random.randrange(100000) for i in range(100)]; a = range(100000)', number=1)
0.004853963851928711
I'm a total beginner in Python, and my programming at the moment is crude and dirty to say the least, but my solution was to use a combination of the basic commands I learnt in early tutorials:
some_list = [1,2,3,4,5,6,7,8,10]
rem = [0,5,7]
for i in rem:
some_list[i] = '!' # mark for deletion
for i in range(0, some_list.count('!')):
some_list.remove('!') # remove
print some_list
Obviously, because of having to choose a "mark-for-deletion" character, this has its limitations.
As for the performance as the size of the list scales, I'm sure that my solution is sub-optimal. However, it's straightforward, which I hope appeals to other beginners, and will work in simple cases where some_list is of a well-known format, e.g., always numeric...
Here is an alternative, that does not use enumerate() to create tuples (as in SilentGhost's original answer).
This seems more readable to me. (Maybe I'd feel differently if I was in the habit of using enumerate.) CAVEAT: I have not tested performance of the two approaches.
# Returns a new list. "lst" is not modified.
def delete_by_indices(lst, indices):
indices_as_set = set(indices)
return [ lst[i] for i in xrange(len(lst)) if i not in indices_as_set ]
NOTE: Python 2.7 syntax. For Python 3, xrange => range.
Usage:
lst = [ 11*x for x in xrange(10) ]
somelist = delete_by_indices( lst, [0, 4, 5])
somelist:
[11, 22, 33, 66, 77, 88, 99]
--- BONUS ---
Delete multiple values from a list. That is, we have the values we want to delete:
# Returns a new list. "lst" is not modified.
def delete__by_values(lst, values):
values_as_set = set(values)
return [ x for x in lst if x not in values_as_set ]
Usage:
somelist = delete__by_values( lst, [0, 44, 55] )
somelist:
[11, 22, 33, 66, 77, 88, 99]
This is the same answer as before, but this time we supplied the VALUES to be deleted [0, 44, 55].
For some reason I don't like any of the answers here.
Yes, they work, but strictly speaking most of them aren't deleting elements in a list, are they? (But making a copy and then replacing the original one with the edited copy).
Why not just delete the higher index first?
Is there a reason for this?
I would just do:
for i in sorted(indices, reverse=True):
del somelist[i]
If you really don't want to delete items backwards, then I guess you should just deincrement the indices values which are greater than the last deleted index (can't really use the same index since you're having a different list) or use a copy of the list (which wouldn't be 'deleting' but replacing the original with an edited copy).
Am I missing something here, any reason to NOT delete in the reverse order?
technically, the answer is NO it is not possible to delete two objects AT THE SAME TIME. However, it IS possible to delete two objects in one line of beautiful python.
del (foo['bar'],foo['baz'])
will recusrively delete foo['bar'], then foo['baz']
import numpy as np
a = ['a', 'l', 3.14, 42, 'u']
I = [0, 2]
np.delete(a, I).tolist()
# Returns: ['l', '42', 'u']
If you don't mind ending up with a numpy array at the end, you can leave out the .tolist(). You should see some pretty major speed improvements, too, making this a more scalable solution. I haven't benchmarked it, but numpy operations are compiled code written in either C or Fortran.
To generalize the comment from @sth. Item deletion in any class, that implements abc.MutableSequence, and in list in particular, is done via __delitem__ magic method. This method works similar to __getitem__, meaning it can accept either an integer or a slice. Here is an example:
class MyList(list):
def __delitem__(self, item):
if isinstance(item, slice):
for i in range(*item.indices(len(self))):
self[i] = 'null'
else:
self[item] = 'null'
l = MyList(range(10))
print(l)
del l[5:8]
print(l)
I wanted to a way to compare the different solutions that made it easy to turn the knobs.
First I generated my data:
import random
N = 16 * 1024
x = range(N)
random.shuffle(x)
y = random.sample(range(N), N / 10)
Then I defined my functions:
def list_set(value_list, index_list):
index_list = set(index_list)
result = [value for index, value in enumerate(value_list) if index not in index_list]
return result
def list_del(value_list, index_list):
for index in sorted(index_list, reverse=True):
del(value_list[index])
def list_pop(value_list, index_list):
for index in sorted(index_list, reverse=True):
value_list.pop(index)
Then I used timeit to compare the solutions:
import timeit
from collections import OrderedDict
M = 1000
setup = 'from __main__ import x, y, list_set, list_del, list_pop'
statement_dict = OrderedDict([
('overhead', 'a = x[:]'),
('set', 'a = x[:]; list_set(a, y)'),
('del', 'a = x[:]; list_del(a, y)'),
('pop', 'a = x[:]; list_pop(a, y)'),
])
overhead = None
result_dict = OrderedDict()
for name, statement in statement_dict.iteritems():
result = timeit.timeit(statement, number=M, setup=setup)
if overhead is None:
overhead = result
else:
result = result - overhead
result_dict[name] = result
for name, result in result_dict.iteritems():
print "%s = %7.3f" % (name, result)
Output
set = 1.711
del = 3.450
pop = 3.618
So the generator with the indices in a set was the winner. And del is slightly faster then pop.
l = ['a','b','a','c','a','d']
to_remove = [1, 3]
[l[i] for i in range(0, len(l)) if i not in to_remove])
It's basically the same as the top voted answer, just a different way of writing it. Note that using l.index() is not a good idea, because it can't handle duplicated elements in a list.
None of the answers offered so far performs the deletion in place in O(n) on the length of the list for an arbitrary number of indices to delete, so here's my version:
def multi_delete(the_list, indices):
assert type(indices) in {set, frozenset}, "indices must be a set or frozenset"
offset = 0
for i in range(len(the_list)):
if i in indices:
offset += 1
elif offset:
the_list[i - offset] = the_list[i]
if offset:
del the_list[-offset:]
# Example:
a = [0, 1, 2, 3, 4, 5, 6, 7]
multi_delete(a, {1, 2, 4, 6, 7})
print(a) # prints [0, 3, 5]
delete_from_somelist = []
for i in [int(0), int(2)]:
delete_from_somelist.append(somelist[i])
for j in delete_from_somelist:
newlist = somelist.remove(j)
I put it all together into a list_diff function that simply takes two lists as inputs and returns their difference, while preserving the original order of the first list.
def list_diff(list_a, list_b, verbose=False):
# returns a difference of list_a and list_b,
# preserving the original order, unlike set-based solutions
# get indices of elements to be excluded from list_a
excl_ind = [i for i, x in enumerate(list_a) if x in list_b]
if verbose:
print(excl_ind)
# filter out the excluded indices, producing a new list
new_list = [i for i in list_a if list_a.index(i) not in excl_ind]
if verbose:
print(new_list)
return(new_list)
I tested the suggested solutions with perfplot and found that NumPy's
np.delete(lst, remove_ids)
is the fastest solution if the list is longer than about 100 entries. Before that, all solutions are around 10^-5 seconds. The list comprehension seems simple enough then:
out = [item for i, item in enumerate(lst) if i not in remove_ids]
Code to reproduce the plot:
import perfplot
import random
import numpy as np
import copy
def setup(n):
lst = list(range(n))
random.shuffle(lst)
# //10 = 10%
remove_ids = random.sample(range(n), n // 10)
return lst, remove_ids
def if_comprehension(lst, remove_ids):
return [item for i, item in enumerate(lst) if i not in remove_ids]
def del_list_inplace(lst, remove_ids):
out = copy.deepcopy(lst)
for i in sorted(remove_ids, reverse=True):
del out[i]
return out
def del_list_numpy(lst, remove_ids):
return np.delete(lst, remove_ids)
b = perfplot.bench(
setup=setup,
kernels=[if_comprehension, del_list_numpy, del_list_inplace],
n_range=[2**k for k in range(20)],
)
b.save("out.png")
b.show()