Detecting duplicates in an ArrayList?

小开

Best answer

Simplest: dump the whole collection into a Set (using the Set(Collection) constructor or Set.addAll), then check whether the Set has the same size as the ArrayList.

    List<Integer> list = ...;
    Set<Integer> set = new HashSet<Integer>(list);

    if (set.size() < list.size()) {
        /* There are duplicates */
    }

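Update: If I'm understanding your question correctly, you have a 2D array of Block, as in

    Block table[][];

and you want to detect whether any row of it has duplicates?

In that case, assuming that Block implements "equals" and "hashCode" correctly, I could do the following: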

    for (Block[] row : table) {
        Set<Block> set = new HashSet<Block>();
        for (Block cell : row) {
            set.add(cell);
        }
        if (set.size() < row.length) { // has duplicate
        }
    }

I'm not 100% sure of that for syntax, so it might be safer to write it as

    for (int i = 0; i < 6; i++) {
        Set<Block> set = new HashSet<Block>();
        for (int j = 0; j < 6; j++)
            set.add(table[i][j]);
        ...

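Set.add returns a boolean false if the item being added is already in the set, so if all you want to know is whether there are any duplicates, you can even short-circuit and bail out on the first add that returns false.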
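
The short-circuit described above could look like the following sketch. The class and method names (`RowDuplicates`, `anyRowHasDuplicate`) and the generic element type are assumptions made for illustration; with the table from the question, `T` would be `Block`, which must implement `equals` and `hashCode` consistently.

```java
import java.util.HashSet;
import java.util.Set;

class RowDuplicates {
    // Hypothetical helper: returns true as soon as any row contains two equal cells.
    static <T> boolean anyRowHasDuplicate(T[][] table) {
        for (T[] row : table) {
            Set<T> seen = new HashSet<>();
            for (T cell : row) {
                // add returns false when an equal element is already in the set
                if (!seen.add(cell)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Integer[][] table = { {1, 2, 3}, {4, 5, 4} };
        System.out.println(anyRowHasDuplicate(table)); // true: the second row repeats 4
    }
}
```

Because the method returns on the first failed add, rows after the first duplicate are never inspected.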


If you want to avoid having duplicates at all, then you should cut out the middle step of detecting duplicates and use a Set.
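
As a small illustration of that idea, the sketch below keeps the elements in a Set from the start, so duplicates are rejected at the moment they are added. The class name `UniqueNames` and the sample values are made up for the example; `LinkedHashSet` is used only because it preserves insertion order, and a plain `HashSet` would do if order does not matter.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class UniqueNames {
    // Hypothetical container: stores each name at most once, in insertion order.
    private final Set<String> names = new LinkedHashSet<>();

    // Returns false (and stores nothing new) if the name is already present.
    boolean add(String name) {
        return names.add(name);
    }

    int size() {
        return names.size();
    }

    public static void main(String[] args) {
        UniqueNames unique = new UniqueNames();
        System.out.println(unique.add("A")); // true
        System.out.println(unique.add("A")); // false, already stored
        System.out.println(unique.size());   // 1
    }
}
```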


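If your elements are somehow Comparable (whether the order has any real meaning is irrelevant; it just needs to be consistent with your definition of equality), a fast duplicate removal solution is to sort the list (O(n log(n))) and then do a single pass looking for repeated elements, that is, equal elements that follow each other (this is O(n)).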

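The overall complexity is O(n log(n)). Asymptotically a hash-based Set does better (n insertions at expected O(1) each), but the sort/dedup approach can come out ahead on constants: its cost comes from comparing elements, whereas the cost of the set most likely comes from a hash computation plus one (possibly several) equality comparisons per insertion. That is for a hash-based Set implementation; a tree-based one is also O(n log(n)) and tends to be slower in practice.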

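However, from what I understand, you do not need to remove duplicates, only to test for their existence. So you could hand-code a merge sort or heap sort on your array that simply exits returning true (i.e. "there is a dup") as soon as your comparator returns 0, and that otherwise completes the sort, after which you traverse the sorted array testing for repeats. In a merge or heap sort, by the time the sort completes you will have compared every duplicate pair, unless both elements were already in their final positions (which is unlikely). A sort tweaked this way should therefore yield a large performance improvement (I would have to prove this, but my guess is that the tweaked algorithm is in O(log(n)) on uniformly random data).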


In short: 1) make sure all items are comparable 2) sort the array 3) iterate over the array and look for duplicates
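
The three steps above can be sketched as follows. This is a minimal illustration rather than a tuned implementation: the names `SortedDuplicateCheck` and `hasDuplicateSorted` are assumptions, the list is copied so the caller's order is left untouched, the elements are assumed to be non-null, and their `compareTo` is assumed to be consistent with `equals`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortedDuplicateCheck {
    // Step 1 is expressed in the type bound: elements must be Comparable.
    static <T extends Comparable<? super T>> boolean hasDuplicateSorted(List<T> items) {
        List<T> sorted = new ArrayList<>(items); // work on a copy
        Collections.sort(sorted);                // step 2: O(n log n)
        // Step 3: equal elements are now adjacent, so one linear pass is enough.
        for (int i = 1; i < sorted.size(); i++) {
            if (sorted.get(i - 1).compareTo(sorted.get(i)) == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicateSorted(Arrays.asList(3, 1, 2, 3))); // true
        System.out.println(hasDuplicateSorted(Arrays.asList(3, 1, 2)));    // false
    }
}
```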


Improved code, using return value of Set#add instead of comparing the size of list and set.

    public static <T> boolean hasDuplicate(Iterable<T> all) {
        Set<T> set = new HashSet<T>();
        // Set#add returns false if the set does not change, which
        // indicates that a duplicate element has been added.
        for (T each : all) {
            if (!set.add(each)) {
                return true;
            }
        }
        return false;
    }


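Improved code that returns the duplicate elements

Can find duplicates in a Collection
Returns the list of duplicates
Unique elements can be obtained from the Set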

    public static <T> List<T> getDuplicate(Collection<T> list) {
        final List<T> duplicatedObjects = new ArrayList<T>();
        // HashSet subclass that records every element added a second time
        Set<T> set = new HashSet<T>() {
            @Override
            public boolean add(T e) {
                if (contains(e)) {
                    duplicatedObjects.add(e);
                }
                return super.add(e);
            }
        };
        for (T t : list) {
            set.add(t);
        }
        return duplicatedObjects;
    }




    public static <T> boolean hasDuplicate(Collection<T> list) {
        return !getDuplicate(list).isEmpty();
    }


To find the duplicates in a list, you can use the following code: it gives you a set containing the duplicates.

    public Set<?> findDuplicatesInList(List<?> beanList) {
        System.out.println("findDuplicatesInList::" + beanList);
        Set<Object> duplicateRowSet = new LinkedHashSet<Object>();
        for (int i = 0; i < beanList.size(); i++) {
            Object superString = beanList.get(i);
            System.out.println("findDuplicatesInList::superString::" + superString);
            for (int j = 0; j < beanList.size(); j++) {
                if (i != j) {
                    Object subString = beanList.get(j);
                    System.out.println("findDuplicatesInList::subString::" + subString);
                    if (superString.equals(subString)) {
                        duplicateRowSet.add(beanList.get(j));
                    }
                }
            }
        }
        System.out.println("findDuplicatesInList::duplicationSet::" + duplicateRowSet);
        return duplicateRowSet;
    }


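    // l is the List<String> to de-duplicate in place
    for (int i = 0; i < l.size(); i++) {
        String current = l.get(i); // take the i-th element
        // Scan the tail backwards so removals never shift an index we still have to visit.
        for (int j = l.size() - 1; j > i; j--) {
            if (java.util.Objects.equals(current, l.get(j))) {
                l.remove(j); // drop later copies, keeping the first occurrence
            }
        }
    }

Note: this is quadratic and takes a real performance hit on an ArrayList, because every removal shifts the elements behind it. Scanning the tail in reverse, as the inner loop does, keeps the shifting and the index bookkeeping to a minimum; using a LinkedList instead of an ArrayList is the other option, provided you walk it with an iterator rather than by index. This version exists because interview questions tend to ask, rather artificially, for duplicates to be removed from a List without using any other collection. In the real world, if I had to do this, I would simply put the elements of the List into a Set.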


    /**
     * Method to detect presence of duplicates in a generic list.
     * Depends on the equals method of the concrete type. Make sure to override it as required.
     */
    public static <T> boolean hasDuplicates(List<T> list) {
        int count = list.size();
        T t1, t2;

        for (int i = 0; i < count; i++) {
            t1 = list.get(i);
            for (int j = i + 1; j < count; j++) {
                t2 = list.get(j);
                if (t2.equals(t1)) {
                    return true;
                }
            }
        }
        return false;
    }

An example of a concrete class that overrides equals():

    public class Reminder {
        private long id;
        private int hour;
        private int minute;

        public Reminder(long id, int hour, int minute) {
            this.id = id;
            this.hour = hour;
            this.minute = minute;
        }

        // Two reminders are equal when they share hour and minute; id is ignored.
        @Override
        public boolean equals(Object other) {
            if (other == null) return false;
            if (this.getClass() != other.getClass()) return false;
            Reminder otherReminder = (Reminder) other;
            if (this.hour != otherReminder.hour) return false;
            if (this.minute != otherReminder.minute) return false;

            return true;
        }

        // Kept consistent with equals so that hash-based collections also work.
        @Override
        public int hashCode() {
            return 31 * hour + minute;
        }
    }


If you want the set of duplicate values:

    import java.util.ArrayList;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class FindDuplicateInArrayList {

        public static void main(String[] args) {
            Set<String> uniqueSet = new HashSet<String>();
            List<String> dupesList = new ArrayList<String>();
            for (String a : args) {
                if (uniqueSet.contains(a))
                    dupesList.add(a);
                else
                    uniqueSet.add(a);
            }
            System.out.println(uniqueSet.size() + " distinct words: " + uniqueSet);
            System.out.println(dupesList.size() + " dupesList words: " + dupesList);
        }
    }

You may also want to consider trimming the values or lowercasing them, depending on your case.
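
One way to apply that kind of normalization is to map each value to a canonical form before it goes into the set. In this sketch the choice of `trim()` plus `toLowerCase(Locale.ROOT)` and the names `NormalizedDuplicates` and `findDuplicatesIgnoringCaseAndSpace` are assumptions; what counts as "the same" value depends on your data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class NormalizedDuplicates {
    // Hypothetical helper: treats values as equal if they match after trimming and lowercasing.
    static List<String> findDuplicatesIgnoringCaseAndSpace(List<String> values) {
        Set<String> seen = new HashSet<>();
        List<String> dupes = new ArrayList<>();
        for (String value : values) {
            String key = value.trim().toLowerCase(Locale.ROOT);
            if (!seen.add(key)) {
                dupes.add(value); // report the original, un-normalized text
            }
        }
        return dupes;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Foo", " foo ", "bar", "BAR", "baz");
        System.out.println(findDuplicatesIgnoringCaseAndSpace(words));
    }
}
```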


I needed to do a similar operation for a Stream but couldn't find a good example. Here's what I came up with.

    public static <T> boolean areUnique(final Stream<T> stream) {
        final Set<T> seen = new HashSet<>();
        return stream.allMatch(seen::add);
    }

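This has the advantage of short-circuiting when duplicates are found early, rather than having to process the whole stream, and it isn't much more complicated than putting everything into a Set and checking the size. Usage would look roughly like this: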

    List<T> list = ...
    boolean allDistinct = areUnique(list.stream());
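
To see the short-circuiting in action, the sketch below runs the same idea on an infinite stream that starts repeating after five elements: `allMatch` stops at the first `add` that returns false, so the call returns instead of running forever. The class name `StreamUnique` is an assumption, and because the predicate mutates a plain `HashSet`, this is only meant for sequential streams.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Stream;

class StreamUnique {
    static <T> boolean areUnique(Stream<T> stream) {
        Set<T> seen = new HashSet<>();
        // allMatch stops consuming the stream at the first element add() rejects
        return stream.allMatch(seen::add);
    }

    public static void main(String[] args) {
        // 0, 1, 2, 3, 4, 0, 1, ... never ends, but the second 0 ends the check
        Stream<Integer> cycling = Stream.iterate(0, i -> (i + 1) % 5);
        System.out.println(areUnique(cycling));                  // false
        System.out.println(areUnique(Stream.of("a", "b", "c"))); // true
    }
}
```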


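The simplest way to drop the duplicates is to use a HashSet (note that it does not preserve the list's order; a LinkedHashSet does):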

    ArrayList<String> listGroupCode = new ArrayList<>();
    listGroupCode.add("A");
    listGroupCode.add("A");
    listGroupCode.add("B");
    listGroupCode.add("C");
    HashSet<String> set = new HashSet<>(listGroupCode);
    ArrayList<String> result = new ArrayList<>(set);

Just print the result ArrayList and you will see that it contains no duplicates :)


    ArrayList<String> withDuplicates = new ArrayList<>();
    withDuplicates.add("1");
    withDuplicates.add("2");
    withDuplicates.add("1");
    withDuplicates.add("3");

    // A LinkedHashSet keeps one copy of each value, in first-seen order.
    Set<String> seen = new LinkedHashSet<>();
    ArrayList<String> duplicates = new ArrayList<>();

    for (String word : withDuplicates) {
        // add returns false when the value has been seen before
        if (!seen.add(word)) {
            duplicates.add(word);
        }
    }
    ArrayList<String> withoutDuplicates = new ArrayList<>(seen);

    System.out.println(duplicates);        // [1]
    System.out.println(withoutDuplicates); // [1, 2, 3]


With Java 8+ you can use the Stream API:

    boolean areAllDistinct(List<Block> blocksList) {
        // distinct() drops repeated numbers, so a smaller count means a duplicate exists
        return blocksList.stream().map(Block::getNum).distinct().count() == blocksList.size();
    }


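This answer is written in Kotlin, but it can easily be translated to Java.

If the size of your ArrayList stays within a fixed, small range, this is a perfectly good solution.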

    var duplicateDetected = false
    if (arrList.size > 1) {
        for (i in 0 until arrList.size) {
            for (j in 0 until arrList.size) {
                if (i != j && arrList.get(i) == arrList.get(j)) {
                    duplicateDetected = true
                }
            }
        }
    }


    private boolean isDuplicate() {
        for (int i = 0; i < arrayList.size(); i++) {
            for (int j = i + 1; j < arrayList.size(); j++) {
                if (arrayList.get(i).getName().trim().equalsIgnoreCase(arrayList.get(j).getName().trim())) {
                    return true;
                }
            }
        }

        return false;
    }