Removing objects with duplicate properties from a List

I have a list of objects in C#, and every object has an ID property. Several of the objects share the same ID value.

How can I trim the List (or create a new List) so that only one object remains per ID value?

[Any extra copies are removed from the list.]


If you want to avoid using a third-party library, you could do something like:

var bar = fooArray.GroupBy(x => x.Id).Select(x => x.First()).ToList();

That will group the list by the Id property, then select the first entry from each group.

MoreLINQ's DistinctBy() will do the job; it allows using an object property for distinctness. Unfortunately, the built-in LINQ Distinct() is not flexible enough.

var uniqueItems = allItems.DistinctBy(i => i.Id);

DistinctBy()

Returns all distinct elements of the given source, where "distinctness" is determined via a projection and the default equality comparer for the projected type.

PS: Credits to Jon Skeet for sharing this library with the community.

var list = GetListFromSomeWhere();
var list2 = GetListFromSomeWhere();
list.AddRange(list2);


// ...
var distinctedList = list.DistinctBy(x => x.ID).ToList();

MoreLINQ on GitHub

Or, if you don't want to use external DLLs for some reason, you can use this Distinct overload:

public static IEnumerable<TSource> Distinct<TSource>(
    this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)

Usage:

public class FooComparer : IEqualityComparer<Foo>
{
    // Two Foo objects are considered equal if their IDs are equal.
    public bool Equals(Foo x, Foo y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;

        // Check whether either of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        return x.ID == y.ID;
    }

    // Required by IEqualityComparer<T>; must be consistent with Equals.
    public int GetHashCode(Foo obj)
    {
        return obj == null ? 0 : obj.ID.GetHashCode();
    }
}

var distinctList = list.Distinct(new FooComparer()).ToList();

Not sure if anyone is still looking for any additional ways to do this. But I've used this code to remove duplicates from a list of User objects based on matching ID numbers.

private ArrayList RemoveSearchDuplicates(ArrayList SearchResults)
{
    ArrayList TempList = new ArrayList();

    foreach (User u1 in SearchResults)
    {
        bool duplicatefound = false;
        foreach (User u2 in TempList)
            if (u1.ID == u2.ID)
                duplicatefound = true;

        if (!duplicatefound)
            TempList.Add(u1);
    }
    return TempList;
}

Call: SearchResults = RemoveSearchDuplicates(SearchResults);

Starting from .NET 6, a new DistinctBy LINQ operator is available:

public static IEnumerable<TSource> DistinctBy<TSource,TKey> (
    this IEnumerable<TSource> source,
    Func<TSource,TKey> keySelector);

Returns distinct elements from a sequence according to a specified key selector function.

Usage example:

List<Item> distinctList = listWithDuplicates
.DistinctBy(i => i.Id)
.ToList();

There is also an overload that has an IEqualityComparer<TKey> parameter.
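As a quick sketch of that overload (assuming a hypothetical string Name property on Item for illustration), the key comparer can make the deduplication case-insensitive:

```csharp
// Hypothetical example: deduplicate by Name, ignoring case.
// StringComparer.OrdinalIgnoreCase implements IEqualityComparer<string>.
List<Item> distinctByName = listWithDuplicates
    .DistinctBy(i => i.Name, StringComparer.OrdinalIgnoreCase)
    .ToList();
```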


Alternative: In case creating a new List<T> is not desirable, here is a RemoveDuplicates extension method for the List<T> class:

/// <summary>
/// Removes all the elements that are duplicates of previous elements,
/// according to a specified key selector function.
/// </summary>
/// <returns>
/// The number of elements removed.
/// </returns>
public static int RemoveDuplicates<TSource, TKey>(
    this List<TSource> source,
    Func<TSource, TKey> keySelector,
    IEqualityComparer<TKey> keyComparer = null)
{
    var hashSet = new HashSet<TKey>(keyComparer);
    return source.RemoveAll(item => !hashSet.Add(keySelector(item)));
}

This method is efficient (O(n)) but a bit dangerous, because it has the potential to corrupt the contents of the List<T> in case the keySelector lambda fails for some item. The same problem exists with the built-in RemoveAll method¹. So in case the keySelector lambda is not fail-proof, the RemoveDuplicates method should be invoked in a try block that has a catch block where the potentially corrupted list is discarded.
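As a sketch of that guarded invocation (the list variable and key selector are placeholders), the call could look like this:

```csharp
try
{
    list.RemoveDuplicates(x => x.Id);
}
catch
{
    // The keySelector threw mid-removal, so the list's contents
    // may be corrupted; discard it rather than keep using it.
    list = null;
    throw;
}
```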

¹ The List&lt;T&gt; class is backed by an internal _items array. The RemoveAll method invokes the match predicate for each item in the list, moving the values stored in _items along the way (source code). In case of an exception, RemoveAll just exits immediately, leaving _items in a corrupted state. I've posted an issue on GitHub regarding the corruptive behavior of this method, and the feedback that I've got was that neither the implementation should be fixed, nor the behavior should be documented.