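I'm looking for a way to quickly remove items from a C# List<T>. The documentation states that the List.Remove() and List.RemoveAt() operations are both O(n), which is severely impacting my application.
I wrote a few different removal methods and tested them all on a List<String> with 500,000 items. The test cases are shown below.
Overview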
I wrote a method that would generate a list of strings that simply contains string representations of each number ("1", "2", "3", ...). I then attempted to remove
every 5th item in the list. Here is the method used to generate the list:
private List<String> GetList(int size)
{
    List<String> myList = new List<String>();
    for (int i = 0; i < size; i++)
        myList.Add(i.ToString());
    return myList;
}
Test 1: RemoveAt()
Here is the test I used to exercise the RemoveAt() method.
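private void RemoveTest1(ref List<String> list)
{
    // Each RemoveAt() shifts the later items down by one, so after the
    // first removal the index i no longer matches the original position.
    for (int i = 0; i < list.Count; i++)
        if (i % 5 == 0)
            list.RemoveAt(i);
}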
Test 2: Remove()
Here is the test I used to exercise the Remove() method.
private void RemoveTest2(ref List<String> list)
{
    List<int> itemsToRemove = new List<int>(); // Declared but never used
    for (int i = 0; i < list.Count; i++)
        if (i % 5 == 0)
            list.Remove(list[i]); // Linear search for the value, then removes its first occurrence
}
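Test 3: Set to null, sort, and then RemoveRange
In this test, I looped through the list once and set the to-be-removed items to null. Then I sorted the list (so the nulls end up at the top) and removed all of the null items from the top.
NOTE: This reorders my list, so I may have to put it back in the correct order.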
private void RemoveTest3(ref List<String> list)
{
    int numToRemove = 0;
    for (int i = 0; i < list.Count; i++)
    {
        if (i % 5 == 0)
        {
            list[i] = null;
            numToRemove++;
        }
    }
    list.Sort();
    list.RemoveRange(0, numToRemove);
    // Now they're out of order...
}
Test 4: Create a new list, and add all of the "good" values to the new list
In this test, I created a new list and added all of the items I want to keep to it. Then I put all of those items back into the original list.
private void RemoveTest4(ref List<String> list)
{
    List<String> newList = new List<String>();
    for (int i = 0; i < list.Count; i++)
    {
        if (i % 5 == 0)
            continue;
        else
            newList.Add(list[i]);
    }
    list.RemoveRange(0, list.Count);
    list.AddRange(newList);
}
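For comparison, the same "copy the keepers into a new list" idea can also be written with LINQ. This is only a sketch: the names `FilterSketch` and `KeepUnlessIndex` are made up for illustration, and this variant was not part of the timings in this post, so nothing is claimed about how it performs against Test 4.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterSketch
{
    // Hypothetical helper: returns a NEW list holding every item whose
    // index does not satisfy shouldRemove. The source list is not modified.
    public static List<T> KeepUnlessIndex<T>(List<T> source, Func<int, bool> shouldRemove)
    {
        // Enumerable.Where has an overload that passes the element index
        // as the second lambda argument.
        return source.Where((item, index) => !shouldRemove(index)).ToList();
    }
}
```

Calling `FilterSketch.KeepUnlessIndex(list, i => i % 5 == 0)` would give the same survivors as Test 4, but as a separate list that the caller has to assign back.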
Test 5: Set to null, and then FindAll()
In this test, I set all of the to-be-removed items to null, then used the FindAll() method to find all of the items that are not null.
private void RemoveTest5(ref List<String> list)
{
    for (int i = 0; i < list.Count; i++)
        if (i % 5 == 0)
            list[i] = null;
    list = list.FindAll(x => x != null);
}
Test 6: Set to null, and then RemoveAll()
In this test, I set all of the to-be-removed items to null, then used the RemoveAll() method to remove all of the items that are null.
private void RemoveTest6(ref List<String> list)
{
    for (int i = 0; i < list.Count; i++)
        if (i % 5 == 0)
            list[i] = null;
    list.RemoveAll(x => x == null);
}
Client application and output
int numItems = 500000;
Stopwatch watch = new Stopwatch();
// List 1...
watch.Start();
List<String> list1 = GetList(numItems);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
watch.Reset(); watch.Start();
RemoveTest1(ref list1);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
Console.WriteLine();
// List 2...
watch.Start(); // No Reset() before this Start(), so this reading also includes the RemoveAt() time above
List<String> list2 = GetList(numItems);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
watch.Reset(); watch.Start();
RemoveTest2(ref list2);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
Console.WriteLine();
// List 3...
watch.Reset(); watch.Start();
List<String> list3 = GetList(numItems);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
watch.Reset(); watch.Start();
RemoveTest3(ref list3);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
Console.WriteLine();
// List 4...
watch.Reset(); watch.Start();
List<String> list4 = GetList(numItems);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
watch.Reset(); watch.Start();
RemoveTest4(ref list4);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
Console.WriteLine();
// List 5...
watch.Reset(); watch.Start();
List<String> list5 = GetList(numItems);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
watch.Reset(); watch.Start();
RemoveTest5(ref list5);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
Console.WriteLine();
// List 6...
watch.Reset(); watch.Start();
List<String> list6 = GetList(numItems);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
watch.Reset(); watch.Start();
RemoveTest6(ref list6);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
Console.WriteLine();
Results
00:00:00.1433089 // Create list
00:00:32.8031420 // RemoveAt()
00:00:32.9612512 // Forgot to reset stopwatch :(
00:04:40.3633045 // Remove()
00:00:00.2405003 // Create list
00:00:01.1054731 // Null, Sort(), RemoveRange()
00:00:00.1796988 // Create list
00:00:00.0166984 // Add good values to new list
00:00:00.2115022 // Create list
00:00:00.0194616 // FindAll()
00:00:00.3064646 // Create list
00:00:00.0167236 // RemoveAll()
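The stray third reading above comes from a missing Reset() in the client code. One way to make that mistake harder is to wrap each measurement in a small helper that always starts from a fresh stopwatch. The following is a minimal sketch; `TimingSketch` and `Time` are hypothetical names, not anything used to produce the numbers above.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Hypothetical helper: runs the action once and returns how long it took.
    // Stopwatch.StartNew() creates a zeroed stopwatch and starts it, so no
    // earlier measurement can leak into this one.
    public static TimeSpan Time(Action action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed;
    }
}
```

With this in place, a measurement could be written as `Console.WriteLine(TimingSketch.Time(() => RemoveTest1(ref list1)));`. A single run like this is still a rough measurement and includes JIT warm-up.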
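Notes and comments
The first two tests do not actually remove every 5th item from the list, because the remaining items shift down after each removal. In fact, out of 500,000 items, only 83,334 were removed (it should have been 100,000). I am okay with that; clearly the Remove()/RemoveAt() methods are not a good idea anyway.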
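To illustrate the index shift only: walking the list from the end means each RemoveAt() moves just the items that have already been visited, so every index still refers to its original position. The sketch below uses hypothetical names (`BackwardRemovalSketch`, `RemoveEveryNthBackward`) and is not offered as a fix for the speed problem, since each RemoveAt() call is still O(n).

```csharp
using System.Collections.Generic;

public static class BackwardRemovalSketch
{
    // Hypothetical helper: removes the items at original indices 0, n, 2n, ...
    // and returns how many were removed. Iterating backward keeps the
    // not-yet-visited indices stable.
    public static int RemoveEveryNthBackward<T>(List<T> list, int n)
    {
        int removed = 0;
        for (int i = list.Count - 1; i >= 0; i--)
        {
            if (i % n == 0)
            {
                list.RemoveAt(i); // Shifts only items after i, all already visited
                removed++;
            }
        }
        return removed;
    }
}
```

Under this ordering, a 500,000-item list and n = 5 would lose the full 100,000 items rather than 83,334.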
Although I tried to remove every 5th item from the list, in reality there will be no such pattern. The entries to be removed will be random.
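If the items to remove are known by value rather than by a pattern, one possible shape is to collect them in a HashSet<T> and let RemoveAll() make a single pass. This is a sketch under assumptions not stated in the post: the names `SetRemovalSketch` and `RemoveItems` are hypothetical, T is assumed to have sensible Equals()/GetHashCode(), and every list entry equal to a listed item is removed, duplicates included.

```csharp
using System.Collections.Generic;

public static class SetRemovalSketch
{
    // Hypothetical helper: removes every list entry that is equal to one of
    // itemsToRemove and returns the number of entries removed.
    public static int RemoveItems<T>(List<T> list, IEnumerable<T> itemsToRemove)
    {
        // Hash lookups keep the per-item check cheap on average.
        var doomed = new HashSet<T>(itemsToRemove);
        return list.RemoveAll(item => doomed.Contains(item));
    }
}
```

No sentinel value is written into the list here, so a null entry is only removed if null was explicitly placed in the set of items to remove.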
Although I used a List<String> in this example, that will not always be the case. It could be a List<Anything>.
Not putting the items in the list to begin with is not an option.
The other methods (3 - 6) all performed much better, relatively speaking. However, I am a little concerned: in 3, 5, and 6 I was forced to set a value to null and then remove all the items according to this sentinel. I don't like that approach because I can envision a scenario where one of the items in the list might legitimately be null and it would get removed unintentionally.
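One way to sidestep the sentinel concern is to decide by index and compact the list in place, so no item is ever overwritten with a marker value. The following is a minimal sketch, not a measured recommendation: `CompactionSketch` and `RemoveWhereIndex` are hypothetical names, and this variant was not timed alongside the tests above.

```csharp
using System;
using System.Collections.Generic;

public static class CompactionSketch
{
    // Hypothetical helper: removes every item whose ORIGINAL index satisfies
    // shouldRemove, preserving the order of the survivors. Returns the
    // number of items removed.
    public static int RemoveWhereIndex<T>(List<T> list, Func<int, bool> shouldRemove)
    {
        int write = 0; // Next slot for a surviving item
        for (int read = 0; read < list.Count; read++)
        {
            if (shouldRemove(read))
                continue;
            list[write++] = list[read]; // write <= read, so nothing unread is lost
        }
        int removed = list.Count - write;
        list.RemoveRange(write, removed); // Trim the leftover tail in one call
        return removed;
    }
}
```

For the pattern used in the tests this would be `CompactionSketch.RemoveWhereIndex(list, i => i % 5 == 0)`; for random removals the predicate could consult a HashSet<int> of indices instead. Existing null entries are kept unless their index is selected.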
My question is: What is the best way to quickly remove many items from a List<T>? Most of the approaches I've tried look really ugly, and potentially dangerous, to me. Is a List the wrong data structure?
Right now, I am leaning towards creating a new list and adding the good items to the new list, but it seems like there should be a better way.