在使用 map.get()时使用 java Map.contsKey()是多余的

一段时间以来,我一直想知道在最佳实践中是否允许不在 java.util.Map上使用 containsKey()方法,而是对来自 get()的结果执行 null 检查。

我的理由是,对值进行两次查找似乎是多余的——首先查找 containsKey(),然后再查找 get()

另一方面,大多数 Map的标准实现可能会缓存最后一次查找,或者编译器可以通过其他方式消除冗余,为了代码的可读性,最好维护 containsKey()部分。

我非常感谢你的评论。

70062 次浏览

Some Map implementations are allowed to have null values, eg HashMap, in this case if get(key) returns null it does not guarantee that there is no entry in the map associated with this key.

So if you want to know if a map contains a key use Map.containsKey. If you simply need a value mapped to a key use Map.get(key). If this map permits null values, then a return value of null does not necessarily indicate that the map contains no mapping for the key; In such case Map.containsKey is useless and will affect performance. Moreover, in case of concurrent access to a map (eg ConcurrentHashMap), after you tested Map.containsKey(key) there is a chance that the entry will be removed by another thread before you call Map.get(key).

I think it is fairly standard to write:

Object value = map.get(key);
if (value != null) {
//do something with value
}

instead of

if (map.containsKey(key)) {
Object value = map.get(key);
//do something with value
}

It is not less readable and slightly more efficient so I don't see any reasons not to do it. Obviously if your map can contain null, the two options don't have the same semantics.

As assylias indicated, this is a semantic question. Generally, Map.get(x) == null is what you want, but there are cases where it is important to use containsKey.

One such case is a cache. I once worked on a performance issue in a web app that was querying its database frequently looking for entities that didn't exist. When I studied the caching code for that component, I realized it was querying the database if cache.get(key) == null. If the database returned null (entity not found), we would cache that key -> null mapping.

Switching to containsKey solved the problem because a mapping to a null value actually meant something. The key mapping to null had a different semantic meaning than the key not existing.

We can make @assylias answer more readable with Java8 Optional,

Optional.ofNullable(map.get(key)).ifPresent(value -> {
//do something with value
};)

In Java if you check the implementation

public boolean containsKey(Object key) {
return getNode(hash(key), key) != null;
}


public V get(Object key) {
Node<K,V> e;
return (e = getNode(hash(key), key)) == null ? null : e.value;
}

both use getNode to retrieve the matching, where the main work gets done.

redundancy is contextual for example of if you have a dictionary stored in a hash map. When you want to retrieve the meaning of a word

doing...

if(dictionary.containsKey(word)) {
return dictionary.get(word);
}

is redundant.

but if you want to check a word is valid or not based on the dictionary. doing...

 return dictionary.get(word) != null;

over...

 return dictionary.containsKey(word);

is redundant.

If you check HashSet implementation, which uses HashMap internally, use 'containsKey' in 'contains' method.

    public boolean contains(Object o) {
return map.containsKey(o);
}
  • containsKey followed by a get is redundant only if we know apriori that null values will never be permitted. If null values aren't valid, the invocation of containsKey has a non-trivial performance penalty and is just overhead as shown in the benchmark below.

  • The Java 8 Optional idioms - Optional.ofNullable(map.get(key)).ifPresent or Optional.ofNullable(map.get(key)).ifPresent - incur a non-trivial overhead in comparison with just vanilla null checks.

  • A HashMap uses a O(1) constant table lookup whereas a TreeMap uses a O(log(n)) lookup. The containsKey followed by a get idiom is much slower when invoked on a TreeMap.

Benchmarks

See https://github.com/vkarun/enum-reverse-lookup-table-jmh

// t1
static Type lookupTreeMapNotContainsKeyThrowGet(int t) {
if (!lookupT.containsKey(t))
throw new IllegalStateException("Unknown Multihash type: " + t);
return lookupT.get(t);
}
// t2
static Type lookupTreeMapGetThrowIfNull(int t) {
Type type = lookupT.get(t);
if (type == null)
throw new IllegalStateException("Unknown Multihash type: " + t);
return type;
}
// t3
static Type lookupTreeMapGetOptionalOrElseThrow(int t) {
return Optional.ofNullable(lookupT.get(t)).orElseThrow(() -> new
IllegalStateException("Unknown Multihash type: " + t));
}
// h1
static Type lookupHashMapNotContainsKeyThrowGet(int t) {
if (!lookupH.containsKey(t))
throw new IllegalStateException("Unknown Multihash type: " + t);
return lookupH.get(t);
}
// h2
static Type lookupHashMapGetThrowIfNull(int t) {
Type type = lookupH.get(t);
if (type == null)
throw new IllegalStateException("Unknown Multihash type: " + t);
return type;
}
// h3
static Type lookupHashMapGetOptionalOrElseThrow(int t) {
return Optional.ofNullable(lookupH.get(t)).orElseThrow(() -> new
IllegalStateException("Unknown Multihash type: " + t));
}
Benchmark                                (iterations)  (lookupApproach)  Mode  Cnt   Score   Error  Units


MultihashTypeLookupBenchmark.testLookup          1000                t1  avgt    9  33.438 ± 4.514  us/op
MultihashTypeLookupBenchmark.testLookup          1000                t2  avgt    9  26.986 ± 0.405  us/op
MultihashTypeLookupBenchmark.testLookup          1000                t3  avgt    9  39.259 ± 1.306  us/op
MultihashTypeLookupBenchmark.testLookup          1000                h1  avgt    9  18.954 ± 0.414  us/op
MultihashTypeLookupBenchmark.testLookup          1000                h2  avgt    9  15.486 ± 0.395  us/op
MultihashTypeLookupBenchmark.testLookup          1000                h3  avgt    9  16.780 ± 0.719  us/op

TreeMap source reference

https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/java/util/TreeMap.java

HashMap source reference

https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/java/util/HashMap.java