Understanding the hash function and the Merkle tree
Cryptography is the art and science of hiding a message. It is more of an art than a science. Science here just acts as a mere tool to transform an artistic imagination into a mathematical algorithm.
In this section, we will concentrate only on two specific concepts from this immensely vast subject, which will help us to understand the true essence of a blockchain. These are the hash function and Merkle tree. Please note that these are used for verifying integrity, rather than hiding anything.
Let's now understand what a hash function is and how it works. Figure 1.3 illustrates a hash function. As we can see in this figure, a hash function is a one-way mapping:
That means that it can take an input (the input is usually a large sequence of bits; it can be a movie, a song, an e-book, a picture, or any digital data) and produce a fixed-size value as output, often much smaller than the input size. However, if I change only one bit in this input, the output will be completely different. This is property number one of the hash function and makes the hash function very much collision resistant. The second property is that there is no way to figure out the input if I only have the output. So, if I have the input to a hash function I can always get an output, no matter how big the input is. However, if I only have an output of a hash function, there is no way to reconstruct the input from it, because hash functions are unidirectional. Also note that the output hash is a fixed-length random bit sequence but, when it gets displayed on the terminals, it gets converted to a hexadecimal format and looks alphanumeric.
You might wonder what happens if I take the output of a hash function and feed it back into its input. Nice try! We still get a completely different sequence of fixed-length bits. And guess what, now we cannot even reconstruct the original output. This new output is not just a hash. It is a hash of a hash. It is called a hash chain and is denoted by h(h(x)) or h2(x), where x is the original string of bits.
The next thing we need to understand is what a Merkle tree is. A Merkle tree is a data structure where each layer is a combination of hashes from its child layer. Generally, a Merkle tree is represented using a binary hash tree, where each node has at most two children. The branches are the hash of the combined hashes of the two children. This process of re-hashing the concatenation of the child nodes to create the parent node is performed until the top of the tree is reached, called the root hash or the Merkle root. Let me show you how it works with Figure 1.4:
Let's say we have two hash values (2ad5 and 3ce9). Please note that 0x represents a hexadecimal notation, but I will omit the prefix for simplicity. To get the next value in the tree, I combine them to a single hash and get the value 23a2. As I told you before, this hash is a one-way function. So, by combining 2ad5 and 3ce9 we can always get the hash value 23a2, but if we only have 23a2 there is no way to figure out the original values. I now do the same hashing with 0ab4 and 1bd4 and combine them to get the value 01e3 and then hashing 01e3 and 23a2 to get the root value 0123. So, this root value will be a representation of this data structure, but there would be no way for me to go back and figure out all the individual values in the tree, let alone the original data blocks D0, D1, D2, and D3. That would be very difficult.
Let's say I am the owner of the data block D2 in the preceding diagram. I also acquire, from the distributed consensus, the root hash, which in our situation is 0123. I ask the distributed system to prove to me that my record D2 is in the tree. What the server returns to me are the hashes 3ce9, 01e3, and 0123. Using this information, the proof is somewhat as follows (please note that, here, the + sign denotes a combiner, and not addition, and that the proof is basically a verification process):
- Hash of D2, which we compute to get 2ad5
- Hash of [2ad5 + 3ce9] from which we compute 23a2
- Hash of [01e3 + 23a2] from which we compute 0123
Since we know the root hash from our distributed consensus, the proof validates that 2ad5 exists in the tree. So does our record D2. Furthermore, the system from which you have obtained the proof is proving to you that it is an authority because it is able to provide valid hashes so that you can get from D2 to your known root hash 0123. Any system pretending to validate your request would not be able to provide you with the intermediate hashes. Since you're not giving the system the root hash, you're just telling it to give you the proof, the distributed consensus just can't invent the proof because it doesn't know your root hash; only you know that. Moreover, in order to verify the proof, very little information about the tree is revealed to you. Furthermore, the data packet that is needed for this proof is very small, making it efficient to send over a network and to make the proof computation. Now, if I claimed that I had D4 in place of D2, I could not have proved it with the data 3ce9, 01e3, and 0123 because the hash of the hash of D4 combining 3ce9 would have been a completely different number, that is, not 23a2 as we got in Step 2. We would have eventually got a different root hash, which would contradict the consensus root hash value of 0123.
Now that we understand what a hash, a hash chain, and a Merkle tree are and how they work, we can proceed to understand a blockchain which will be discussed in the next section.