为什么 Python 集不是散列的?

我偶然发现了一篇博客文章,详细介绍了如何在 Python 中实现 powerset 函数。所以我尝试了自己的方法,发现 Python 显然不能有一组集合,因为集合不是散列的。这是令人厌烦的,因为 Powerset 的定义是它是一组集合,我想使用实际的集合操作来实现它。

>>> set([ set() ])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'

Python 集不是散列的,有什么好的理由吗?

48020 次浏览

Because they're mutable.

If they were hashable, a hash could silently become "invalid", and that would pretty much make hashing pointless.

Generally, only immutable objects are hashable in Python. The immutable variant of set() -- frozenset() -- is hashable.

From the Python docs:

hashable
An object is hashable if it has a hash value which never changes during its lifetime (it needs a hash() method), and can be compared to other objects (it needs an eq() or cmp() method). Hashable objects which compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

All of Python’s immutable built-in objects are hashable, while no mutable containers (such as lists or dictionaries) are. Objects which are instances of user-defined classes are hashable by default; they all compare unequal, and their hash value is their id().

In case this helps... if you really need to convert unhashable things into hashable equivalents for some reason you might do something like this:

from collections import Hashable, MutableSet, MutableSequence, MutableMapping


def make_hashdict(value):
"""
Inspired by https://stackoverflow.com/questions/1151658/python-hashable-dicts
- with the added bonus that it inherits from the dict type of value
so OrderedDict's maintain their order and other subclasses of dict() maintain their attributes
"""
map_type = type(value)


class HashableDict(map_type):
def __init__(self, *args, **kwargs):
super(HashableDict, self).__init__(*args, **kwargs)
def __hash__(self):
return hash(tuple(sorted(self.items())))


hashDict = HashableDict(value)


return hashDict




def make_hashable(value):
if not isinstance(value, Hashable):
if isinstance(value, MutableSet):
value = frozenset(value)
elif isinstance(value, MutableSequence):
value = tuple(value)
elif isinstance(value, MutableMapping):
value = make_hashdict(value)


return value


my_set = set()
my_set.add(make_hashable(['a', 'list']))
my_set.add(make_hashable({'a': 1, 'dict': 2}))
my_set.add(make_hashable({'a', 'new', 'set'}))


print my_set

My HashableDict implementation is the simplest and least rigorous example from here. If you need a more advanced HashableDict that supports pickling and other things, check the many other implementations. In my version above I wanted to preserve the original dict class, thus preserving the order of OrderedDicts. I also use AttrDict from here for attribute-like access.

My example above is not in any way authoritative, just my solution to a similar problem where I needed to store some things in a set and needed to "hashify" them first.