Mastering Object-oriented Python

The __hash__() method

The built-in hash() function invokes the __hash__() method of a given object. This hash is a calculation which reduces a (potentially complex) value to a small integer value. Ideally, a hash reflects all the bits of the source value. Other hash calculations—often used for cryptographic purposes—can produce very large values.

Python includes two hash libraries. The cryptographic-quality hash functions are in hashlib. The zlib module has two high-speed hash functions: adler32() and crc32(). For relatively simple values, we don't use either of these. For large, complex values, these algorithms can be of help.
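As a rough illustration of the two libraries, the following sketch computes a cryptographic digest and both zlib checksums for the same bytes. The sample data and the variable names are chosen here purely for illustration.

```python
import hashlib
import zlib

# Sample input; any bytes object would do.
data = b"Mastering Object-oriented Python"

# Cryptographic-quality hash: a 256-bit digest, shown as 64 hex digits.
sha = hashlib.sha256(data).hexdigest()

# High-speed checksums from zlib: each is an unsigned 32-bit integer.
adler = zlib.adler32(data)
crc = zlib.crc32(data)

print(len(sha), adler < 2**32, crc < 2**32)
```

The digest is far too large to serve as a small integer key, while the zlib results fit in 32 bits.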

The hash() function (and the associated __hash__() method) is used to create a small integer key that is used to work with collections such as set, frozenset, and dict. These collections use the hash value of an immutable object to rapidly locate the object in the collection.

Immutability is important here; we'll mention it many times. Immutable objects don't change their state. The number 3, for example, doesn't change state. It's always 3. More complex objects, similarly, can have an immutable state. Python strings are immutable so that they can be used as keys to mappings and sets.
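A small sketch of how this plays out with built-in types: immutable strings and tuples work as set members and dictionary keys, while a mutable list is rejected. The names used here are illustrative only.

```python
# Immutable objects (strings, tuples of strings) are hashable,
# so they can be set members and dict keys.
seen = {"ace", "king"}
counts = {("ace", "spades"): 1}

# Equal immutable values produce equal hash values.
same_hash = hash(("ace", "spades")) == hash(("ace", "spades"))

# A mutable list defines no hash, so using it as a key raises TypeError.
try:
    counts[["ace", "spades"]] = 2
    list_rejected = False
except TypeError:
    list_rejected = True

print(same_hash, list_rejected)
```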

The default __hash__() implementation inherited from object returns a value based on the object's internal ID value. This value can be seen with the id() function as follows:

>>> x = object()
>>> hash(x)
269741571
>>> id(x)
4315865136
>>> id(x) // 16
269741571

From this, we can see that on the author's particular system, the hash value is the object's ID divided by 16, that is, id(x) // 16. This detail might vary from platform to platform. CPython, for example, uses portable C libraries, whereas Jython relies on the JVM.

What's essential is that there is a strong correlation between the internal ID and the default __hash__() method. This means that the default behavior is for each object to be hashable as well as utterly distinct, even if it appears to have the same value.
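The default behavior can be seen with a minimal sketch. The Card class below is a hypothetical stand-in that defines neither __eq__() nor __hash__(), so it inherits both from object.

```python
class Card:
    """Hypothetical class that keeps the default __eq__() and __hash__()."""

    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit


c1 = Card("A", "spades")
c2 = Card("A", "spades")

# Default equality is identity, so the two objects are not equal
# even though their attribute values match.
distinct = c1 != c2

# Because they are unequal, a set keeps both objects.
kept = len({c1, c2})

print(distinct, kept)
```

Both objects are hashable, yet the set treats them as two separate members.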

We'll need to modify this if we want to coalesce different objects with the same value into a single hashable object. We'll look at an example in the next section, where we would like two instances of the Card class with the same value to be treated as if they were the same object.
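One possible way to get that behavior, shown only as a minimal sketch, is to define __eq__() and __hash__() together so that both depend on the same attributes. The Card class and its rank and suit attributes are assumptions made for illustration; the sketch also assumes the attributes are never changed after construction.

```python
class Card:
    """Hypothetical card whose equality and hash depend on rank and suit."""

    def __init__(self, rank, suit):
        # Assumed to be treated as read-only after construction.
        self.rank = rank
        self.suit = suit

    def __eq__(self, other):
        if not isinstance(other, Card):
            return NotImplemented
        return (self.rank, self.suit) == (other.rank, other.suit)

    def __hash__(self):
        # Hash the same tuple that __eq__() compares, so equal
        # cards always have equal hash values.
        return hash((self.rank, self.suit))


c1 = Card("A", "spades")
c2 = Card("A", "spades")
unique = {c1, c2}
print(c1 == c2, len(unique))
```

With these two methods in place, a set or dict treats the two objects as a single key.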