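Is there a better way to combine two string sets in Java?

I need to combine two string sets while filtering out redundant entries. This is the solution I came up with. Is there a better way that anyone can suggest? Perhaps I overlooked something? I couldn't find anything on Google.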

Set<String> oldStringSet = getOldStringSet();
Set<String> newStringSet = getNewStringSet();


for (String currentString : oldStringSet)
{
    if (!newStringSet.contains(currentString))
    {
        newStringSet.add(currentString);
    }
}

Since a Set does not contain duplicate entries, you can combine the two with:

newStringSet.addAll(oldStringSet);

It does not matter if you add an element twice; the set will contain it only once. In other words, there is no need to check with the contains method first.

http://docs.oracle.com/javase/7/docs/api/java/util/Set.html#addAll(java.util.Collection)
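
Note that addAll modifies the set it is called on. A minimal sketch, assuming you would rather leave both inputs untouched: copy one set into a new HashSet and call addAll on the copy. The class name, method name, and sample values are illustrative, not part of the question.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class AddAllUnionDemo {

    // Returns a new set holding every element of both inputs; neither input is modified.
    static Set<String> union(Set<String> first, Set<String> second) {
        Set<String> result = new HashSet<>(first); // copy so 'first' stays untouched
        result.addAll(second);                     // elements already present are skipped
        return result;
    }

    public static void main(String[] args) {
        Set<String> oldStringSet = new HashSet<>(Arrays.asList("a", "b"));
        Set<String> newStringSet = new HashSet<>(Arrays.asList("b", "c"));
        Set<String> combined = union(oldStringSet, newStringSet);
        System.out.println(combined.size()); // 3
    }
}
```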

Since sets can't have duplicates, just adding all the elements of one to the other generates the correct union of the two.

Just use newStringSet.addAll(oldStringSet). No need to check for duplicates as the Set implementation does this already.

Set.addAll()

Adds all of the elements in the specified collection to this set if they're not already present (optional operation). If the specified collection is also a set, the addAll operation effectively modifies this set so that its value is the union of the two sets.

newStringSet.addAll(oldStringSet);

This produces the union of newStringSet and oldStringSet.

By definition, a Set contains only unique elements.

Set<String> distinct = new HashSet<String>();
distinct.addAll(oldStringSet);
distinct.addAll(newStringSet);

To improve on this, you can create a generic method:

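@SafeVarargs // the varargs array is only read, so the generic varargs use is safe
public static <T> Set<T> distinct(Collection<T>... lists) {
    Set<T> distinct = new HashSet<T>();
    // Each addAll skips elements that are already present.
    for (Collection<T> list : lists) {
        distinct.addAll(list);
    }
    return distinct;
}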
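
As a usage sketch, such a helper accepts any mix of collection types, so a List and a Set can be merged in one call. The class name and sample values below are made up for illustration, and the helper is repeated only so that the example compiles on its own.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DistinctDemo {

    // Pours every given collection into one HashSet, which drops duplicates.
    @SafeVarargs
    static <T> Set<T> distinct(Collection<T>... lists) {
        Set<T> distinct = new HashSet<>();
        for (Collection<T> list : lists) {
            distinct.addAll(list);
        }
        return distinct;
    }

    public static void main(String[] args) {
        List<String> fromList = Arrays.asList("a", "b", "b");
        Set<String> fromSet = new HashSet<>(Arrays.asList("b", "c"));
        Set<String> merged = distinct(fromList, fromSet); // a List and a Set in one call
        System.out.println(merged.size()); // 3
    }
}
```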

You can do it with this one-liner:

Set<String> combined = Stream.concat(newStringSet.stream(), oldStringSet.stream())
        .collect(Collectors.toSet());

With static imports of Stream.concat and Collectors.toSet it looks even nicer:

Set<String> combined = concat(newStringSet.stream(), oldStringSet.stream())
        .collect(toSet());

Another way is to use the flatMap method:

Set<String> combined = Stream.of(newStringSet, oldStringSet)
        .flatMap(Set::stream)
        .collect(toSet());

Any collection can also easily be combined with a single element:

Set<String> combined = concat(newStringSet.stream(), Stream.of(singleValue))
        .collect(toSet());
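
For reference, here is a self-contained sketch of the stream approach with the imports it needs (Java 8 or later). Collectors.toSet() makes no guarantee about the type or iteration order of the set it returns, so the sketch also shows Collectors.toCollection(LinkedHashSet::new) for cases where the encounter order should be kept. The class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamUnionDemo {

    // Union via Stream.concat; the concrete Set type returned by toSet() is unspecified.
    static Set<String> union(Set<String> first, Set<String> second) {
        return Stream.concat(first.stream(), second.stream())
                .collect(Collectors.toSet());
    }

    // Variant that keeps encounter order: elements of 'first', then the new ones from 'second'.
    static Set<String> orderedUnion(Set<String> first, Set<String> second) {
        return Stream.of(first, second)
                .flatMap(Set::stream)
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        Set<String> newStringSet = new LinkedHashSet<>(Arrays.asList("x", "y"));
        Set<String> oldStringSet = new LinkedHashSet<>(Arrays.asList("y", "z"));
        System.out.println(orderedUnion(newStringSet, oldStringSet)); // [x, y, z]
    }
}
```

Both methods build a new set and leave the two inputs unchanged.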

The same with Guava (Sets.union returns an unmodifiable view backed by the two sets):

Set<String> combinedSet = Sets.union(oldStringSet, newStringSet);

If you care about performance, you don't need to keep the two sets separate, and one of them may be huge, I would suggest checking which set is larger and adding the elements of the smaller one to it.

Set<String> newStringSet = getNewStringSet();
Set<String> oldStringSet = getOldStringSet();

Set<String> myResult;
if (oldStringSet.size() > newStringSet.size()) {
    oldStringSet.addAll(newStringSet);
    myResult = oldStringSet;
} else {
    newStringSet.addAll(oldStringSet);
    myResult = newStringSet;
}

This way, if your new set has 10 elements and your old set has 100,000, you perform only 10 add operations instead of 100,000.
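
The size check can be wrapped in a small helper. This is only a sketch, assuming both sets are mutable and the caller no longer needs them as separate sets; the name mergeIntoLarger is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MergeIntoLargerDemo {

    // Adds the smaller set into the larger one and returns the larger (now merged) set.
    // Note: this mutates one of the arguments instead of allocating a new set.
    static <T> Set<T> mergeIntoLarger(Set<T> a, Set<T> b) {
        Set<T> larger = a.size() >= b.size() ? a : b;
        Set<T> smaller = (larger == a) ? b : a;
        larger.addAll(smaller); // one add per element of the smaller set
        return larger;
    }

    public static void main(String[] args) {
        Set<Integer> big = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5));
        Set<Integer> small = new HashSet<>(Arrays.asList(5, 6));
        Set<Integer> merged = mergeIntoLarger(small, big);
        System.out.println(merged == big);  // true: the larger set was reused
        System.out.println(merged.size());  // 6
    }
}
```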

If you are using Guava you can also use a builder to get more flexibility:

ImmutableSet.<String>builder()
        .addAll(someSet)
        .addAll(anotherSet)
        .add("A single string")
        .build();

If you are using Apache Commons Collections, use the SetUtils class (org.apache.commons.collections4.SetUtils):

SetUtils.union(setA, setB);