Remove duplicates from a list using LINQ

I have a class Items with properties (Id, Name, Code, Price).

The List of Items is populated with duplicated items.

For example:

1         Item1       IT00001        $100
2         Item2       IT00002        $200
3         Item3       IT00003        $150
1         Item1       IT00001        $100
3         Item3       IT00003        $150

How do I remove the duplicates from the list using LINQ?

var distinctItems = items.Distinct();

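To match on only some of the properties, create a custom equality comparer and compare just the properties that matter in Equals and GetHashCode (this example uses all four), e.g.: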

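class DistinctItemComparer : IEqualityComparer<Item> {

    public bool Equals(Item x, Item y) {
        // Same reference (or both null) counts as equal; a single null does not.
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id &&
            x.Name == y.Name &&
            x.Code == y.Code &&
            x.Price == y.Price;
    }

    public int GetHashCode(Item obj) {
        // XOR the per-property hashes; null strings contribute 0.
        return obj.Id.GetHashCode() ^
            (obj.Name?.GetHashCode() ?? 0) ^
            (obj.Code?.GetHashCode() ?? 0) ^
            obj.Price.GetHashCode();
    }
}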

Then use it like this:

var distinctItems = items.Distinct(new DistinctItemComparer());

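Use Distinct(), but keep in mind that it uses the default equality comparer to compare values, so if you want anything beyond that you need to implement your own comparer.

See http://msdn.microsoft.com/en-us/library/bb348436.aspx for an example.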
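To illustrate the point about the default comparer: for a plain class that does not override Equals and GetHashCode, the default comparer falls back to reference equality, so Distinct() keeps objects that merely look alike. A minimal sketch, assuming a hypothetical PlainItem class that is not from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class for illustration: no Equals/GetHashCode override.
public class PlainItem
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class DefaultComparerDemo
{
    public static int CountAfterDistinct()
    {
        var items = new List<PlainItem>
        {
            new PlainItem { Id = 1, Name = "Item1" },
            new PlainItem { Id = 1, Name = "Item1" }, // same values, different object
        };
        // The default comparer compares references here, so both objects survive.
        return items.Distinct().Count();
    }

    public static int CountAfterDistinctOnIds()
    {
        var ids = new List<int> { 1, 2, 1, 3, 3 };
        // int has value equality, so the repeated values are removed.
        return ids.Distinct().Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountAfterDistinct());      // 2
        Console.WriteLine(CountAfterDistinctOnIds()); // 3
    }
}
```

This is why a custom comparer, an Equals/GetHashCode override, or a key-based approach is needed for a class like the one in the question.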

If there is something that is throwing off your Distinct query, you might want to look at MoreLinq and use the DistinctBy operator to select distinct objects by id.

var distinct = items.DistinctBy( i => i.Id );
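On newer runtimes MoreLinq may not be needed just for this: since .NET 6, System.Linq has its own Enumerable.DistinctBy. A minimal sketch, assuming .NET 6 or later and a hypothetical Product class standing in for the item type:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the item class in the question.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class BuiltInDistinctByDemo
{
    public static List<Product> Run()
    {
        var items = new List<Product>
        {
            new Product { Id = 1, Name = "Item1" },
            new Product { Id = 2, Name = "Item2" },
            new Product { Id = 3, Name = "Item3" },
            new Product { Id = 1, Name = "Item1" },
            new Product { Id = 3, Name = "Item3" },
        };
        // Enumerable.DistinctBy (.NET 6+) returns one element per distinct key.
        return items.DistinctBy(i => i.Id).ToList();
    }

    public static void Main()
    {
        foreach (var p in Run().OrderBy(p => p.Id))
            Console.WriteLine($"{p.Id} {p.Name}");
    }
}
```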
var distinctItems = items.GroupBy(x => x.Id).Select(y => y.First());

This is how I was able to group by with LINQ. Hope it helps.

var query = collection.GroupBy(x => x.title).Select(y => y.FirstOrDefault());
List<Employee> employees = new List<Employee>()
{
    new Employee{Id =1,Name="AAAAA"}
    , new Employee{Id =2,Name="BBBBB"}
    , new Employee{Id =3,Name="AAAAA"}
    , new Employee{Id =4,Name="CCCCC"}
    , new Employee{Id =5,Name="AAAAA"}
};

// Everything except the first employee of each Name group, i.e. the duplicates (Ids 3 and 5)
List<Employee> duplicateEmployees = employees.Except(employees.GroupBy(i => i.Name)
    .Select(ss => ss.FirstOrDefault()))
    .ToList();
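The Except-based query relies on Employee being compared by reference. A sketch of an alternative that does not depend on how Employee compares: skip the first element of each group and keep the rest. The Employee class and the FindDuplicatesByName helper here are minimal stand-ins for illustration, not part of the original answer:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the Employee type used above.
public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class DuplicateFinder
{
    // Hypothetical helper: returns every employee after the first one per Name.
    public static List<Employee> FindDuplicatesByName(IEnumerable<Employee> employees)
    {
        return employees
            .GroupBy(e => e.Name)
            .SelectMany(g => g.Skip(1)) // group elements keep their source order
            .ToList();
    }

    public static List<Employee> Sample()
    {
        return new List<Employee>
        {
            new Employee { Id = 1, Name = "AAAAA" },
            new Employee { Id = 2, Name = "BBBBB" },
            new Employee { Id = 3, Name = "AAAAA" },
            new Employee { Id = 4, Name = "CCCCC" },
            new Employee { Id = 5, Name = "AAAAA" },
        };
    }

    public static void Main()
    {
        foreach (var e in FindDuplicatesByName(Sample()))
            Console.WriteLine(e.Id); // prints 3, then 5
    }
}
```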

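You have three options for removing duplicate items from your list:

  1. Use a custom equality comparer and then call Distinct(new DistinctItemComparer()), as @Christian Hayter mentioned.
  2. Use GroupBy, but note that you should group by all of the columns, because grouping by Id alone doesn't always remove only the duplicates. For example, consider the following: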

    List<Item> a = new List<Item>
    {
        new Item {Id = 1, Name = "Item1", Code = "IT00001", Price = 100},
        new Item {Id = 2, Name = "Item2", Code = "IT00002", Price = 200},
        new Item {Id = 3, Name = "Item3", Code = "IT00003", Price = 150},
        new Item {Id = 1, Name = "Item1", Code = "IT00001", Price = 100},
        new Item {Id = 3, Name = "Item3", Code = "IT00003", Price = 150},
        new Item {Id = 3, Name = "Item3", Code = "IT00004", Price = 250}
    };
    var distinctItems = a.GroupBy(x => x.Id).Select(y => y.First());
    

    The result of this grouping will be:

    {Id = 1, Name = "Item1", Code = "IT00001", Price = 100}
    {Id = 2, Name = "Item2", Code = "IT00002", Price = 200}
    {Id = 3, Name = "Item3", Code = "IT00003", Price = 150}
    

    This is incorrect, because it treats {Id = 3, Name = "Item3", Code = "IT00004", Price = 250} as a duplicate. So the correct query would be:

    var distinctItems = a.GroupBy(c => new { c.Id , c.Name , c.Code , c.Price})
    .Select(c => c.First()).ToList();
    

  3. Override Equals and GetHashCode in the Item class:

    public class Item
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Code { get; set; }
        public int Price { get; set; }

        public override bool Equals(object obj)
        {
            if (!(obj is Item))
                return false;
            Item p = (Item)obj;
            return (p.Id == Id && p.Name == Name && p.Code == Code && p.Price == Price);
        }

        public override int GetHashCode()
        {
            return String.Format("{0}|{1}|{2}|{3}", Id, Name, Code, Price).GetHashCode();
        }
    }
    

    Then you can use it like this:

    var distinctItems = a.Distinct();
    

Try this extension method. Hope it helps.

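public static class DistinctHelper
{
    public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        // Created inside the iterator so that every enumeration starts with an
        // empty set; a set shared across enumerations would make a second pass
        // over the same query return nothing.
        var identifiedKeys = new HashSet<TKey>();
        foreach (var element in source)
        {
            if (identifiedKeys.Add(keySelector(element)))
                yield return element;
        }
    }
}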

Usage:

var outputList = sourceList.DistinctBy(x => x.TargetProperty);
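If the key needs non-default comparison (say, case-insensitive codes), the same HashSet idea can take a key comparer. This is only a sketch: the DistinctByKey name and the comparer parameter are assumptions added for illustration, and the different name also avoids clashing with other DistinctBy extensions that may be in scope.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctByKeyExtensions
{
    // Hypothetical variant of the helper above; the name and the comparer
    // parameter are illustrative and not part of the original answer.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> comparer = null)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
        return Iterate();

        IEnumerable<TSource> Iterate()
        {
            // A null comparer makes HashSet fall back to the default comparer.
            var seen = new HashSet<TKey>(comparer);
            foreach (var element in source)
            {
                // Add returns false when the key was already seen.
                if (seen.Add(keySelector(element)))
                    yield return element;
            }
        }
    }
}

public static class DistinctByKeyDemo
{
    public static int CountDistinctCodes()
    {
        var codes = new[] { "IT00001", "it00001", "IT00002" };
        return codes.DistinctByKey(c => c, StringComparer.OrdinalIgnoreCase).Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinctCodes()); // 2
    }
}
```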

When you don't want to write an IEqualityComparer, you can try something like the following.

class Program
{
    private static void Main(string[] args)
    {
        var items = new List<Item>();
        items.Add(new Item {Id = 1, Name = "Item1"});
        items.Add(new Item {Id = 2, Name = "Item2"});
        items.Add(new Item {Id = 3, Name = "Item3"});
        items.Add(new Item {Id = 4, Name = "Item4"});

        // Duplicate items
        items.Add(new Item {Id = 2, Name = "Item2"});
        items.Add(new Item {Id = 3, Name = "Item3"});

        // Anonymous types compare by value, so Distinct() removes the duplicates
        var res = items.Select(i => new {i.Id, i.Name})
            .Distinct().Select(x => new Item {Id = x.Id, Name = x.Name}).ToList();

        // now res contains distinct records
    }
}

public class Item
{
    public int Id { get; set; }

    public string Name { get; set; }
}

A generic extension method:

public static class EnumerableExtensions
{
    public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> enumerable, Func<T, TKey> keySelector)
    {
        return enumerable.GroupBy(keySelector).Select(grp => grp.First());
    }
}

Usage example:

var lstDst = lst.DistinctBy(item => item.Key);

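Another workaround, not beautiful but workable.

I have an XML file with an element named "MEMDES" that has two attributes, "GRADE" and "SPD", to record the RAM module information. There are a lot of duplicate values in SPD.

Here is the code I use to remove the duplicates: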

IEnumerable<XElement> MList =
    from RAMList in PREF.Descendants("MEMDES")
    where (string)RAMList.Attribute("GRADE") == "DDR4"
    select RAMList;

List<string> sellist = new List<string>();

foreach (var MEMList in MList)
{
    sellist.Add(MEMList.Attribute("SPD").Value);
}

foreach (string slist in sellist.Distinct())
{
    comboBox1.Items.Add(slist);
}
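The intermediate list is not strictly needed; the filter, the projection to SPD, and Distinct() can be chained in a single query. A minimal sketch, assuming a hypothetical SpdQuery helper and made-up sample XML (the real file layout is not shown in the post):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SpdQuery
{
    // Hypothetical helper: distinct SPD values of MEMDES elements with the given GRADE.
    public static List<string> DistinctSpd(XContainer root, string grade)
    {
        return root.Descendants("MEMDES")
            .Where(e => (string)e.Attribute("GRADE") == grade)
            .Select(e => (string)e.Attribute("SPD")) // null when the attribute is missing
            .Where(spd => spd != null)
            .Distinct()
            .ToList();
    }

    // Made-up sample data for illustration only.
    public static XDocument SampleDocument()
    {
        return XDocument.Parse(
            "<PREF>" +
            "<MEMDES GRADE='DDR4' SPD='2400' />" +
            "<MEMDES GRADE='DDR4' SPD='2666' />" +
            "<MEMDES GRADE='DDR4' SPD='2400' />" +
            "<MEMDES GRADE='DDR3' SPD='1600' />" +
            "</PREF>");
    }

    public static void Main()
    {
        foreach (string spd in DistinctSpd(SampleDocument(), "DDR4"))
            Console.WriteLine(spd);
    }
}
```

The resulting list can then be added to the combo box with the same foreach loop as in the code above.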