Encryption

March 31, 2022

Sometimes, you want to go somewhere where everybody knows your name, but not who you are.

It should surprise no one that cryptography is a vital component of blockchain technology. Cryptographic hash functions and asymmetric cryptography are the foundation of blockchain. They are the *crypto* in cryptocurrency. Let's examine them one at a time, starting with cryptographic hashes.

A hash function ingests an arbitrary-length input (i.e., the message) and spits out a fixed-length output (i.e., the digest or hash). A digest may look like enciphered text, but hashing is not encryption: there is no key, and the digest is not meant to be turned back into the message. The applications of hash functions are varied (e.g., checksumming and hash tables). A good hash function has four main properties. First, it is deterministic. The same hash function should always produce the same digest for a given message. In other words, the digest should only vary based on the input. Second, it uses the whole message to generate the digest. So, a change to any part of the message should result in a different digest. That leads to the third property, the avalanche effect. The avalanche effect is where minor changes to the input result in a radically different digest. The fourth and final property is that the hash function “uniformly” distributes the possible messages across the space of possible digests. That keeps situations where different inputs yield identical digests (i.e., hash collisions) as rare as possible.
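
To make the first three properties concrete, here is a minimal sketch using SHA-256 from Python's standard `hashlib` module. The helper names `digest` and `differing_bits` are made up for this illustration, and SHA-256 is just one convenient choice of hash function.

```python
import hashlib


def digest(message: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as a hex string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions at which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


first = digest("hello world")
second = digest("hello world")
changed = digest("hello worle")  # one character altered

# Deterministic: the same message always gives the same digest.
print(first == second)

# Fixed length: 64 hex characters (256 bits) regardless of input size.
print(len(digest("a")), len(digest("a" * 10_000)))

# Avalanche effect: a one-character change flips roughly half of the 256 bits.
print(differing_bits(first, changed))
```

The last number will not be exactly 128, but it should land somewhere near it, which is what "radically different" looks like at the bit level.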

Now, why might these properties be valuable? Let's start with the first property. Hash functions need to be deterministic because they need to return the same digest given the same message. For example, when using a hash function to checksum a file, the algorithm should output the same digest if the file is unaltered. That applies to hash tables too. When you give a hash table a key, it hashes the key to locate the stored value, so the same key must always lead to the same place. For the second property, the hash function needs to use the entirety of a message. That way, it can detect an alteration anywhere in the input. This property is closely related to the third one. Going back to our example of checksums, wildly different digests make it easier to see when a file has changed. As for the fourth property, a uniform distribution keeps hash collisions rare, as noted earlier.
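
The checksum example can be sketched in a few lines. This is an illustration only: it hashes in-memory bytes rather than a real file, and `checksum` and `is_unaltered` are hypothetical helper names.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, expected: str) -> bool:
    """Check data against a previously published checksum."""
    return checksum(data) == expected


original = b"quarterly report, final version"
published = checksum(original)  # what the author would publish alongside the file

print(is_unaltered(original, published))         # True: same bytes, same digest
print(is_unaltered(original + b" ", published))  # False: one extra space is detected
```

Because the digest depends on every byte, even a trailing space is enough to make the check fail.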

Cryptographic hash functions are a class of hash functions with three additional properties: pre-image resistance, second pre-image resistance, and collision resistance. Pre-image resistance means that given the digest *h*, it is hard to find a message *m* so that *hash(m) = h*. In other words, the hashing function is one-way. You can't feasibly get the message from the digest. Second pre-image resistance is also called *weak collision resistance*. It means that given a message *m₁*, it will be difficult to find another message (*m₂*) so that *hash(m₁) = hash(m₂)*. More simply, it is hard to find another input with the same digest as a given message. Collision resistance, also called *strong collision resistance*, means that it is difficult to find any two distinct inputs (*m₁* and *m₂*) so that *hash(m₁) = hash(m₂)*. In other words, it is hard to find two messages that produce the same digest. Collision resistance differs from second pre-image resistance in that the adversary is free to choose both messages, not just the second one. That makes it the stronger guarantee: collision resistance implies second pre-image resistance. Ideally, a cryptographic hash function would guarantee that hash collisions can’t occur. Of course, this isn’t the case in the real world since the function needs to take in any input and output a fixed-length digest. The input space is larger than the output space. So, there will, inevitably, be hash collisions. The best we can expect is that a cryptographic hash function will make it infeasible for an adversary to find one.
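
To see why collisions are inevitable once the output space is small, here is a rough sketch that deliberately weakens SHA-256 by keeping only the first two bytes of its digest. `tiny_hash` and `find_collision` are invented for this illustration. With only 65,536 possible digests, a brute-force search finds a collision almost immediately; against the full 256-bit digest, the same search would be hopeless.

```python
import hashlib
from itertools import count


def tiny_hash(message: bytes, n_bytes: int = 2) -> bytes:
    """A deliberately weak hash: only the first n_bytes of the SHA-256 digest."""
    return hashlib.sha256(message).digest()[:n_bytes]


def find_collision(n_bytes: int = 2) -> tuple[bytes, bytes]:
    """Try the messages b"0", b"1", b"2", ... until two share a truncated digest."""
    seen: dict[bytes, bytes] = {}  # truncated digest -> first message that produced it
    for i in count():
        message = str(i).encode()
        h = tiny_hash(message, n_bytes)
        if h in seen:
            return seen[h], message
        seen[h] = message


m1, m2 = find_collision()
print(m1, m2, tiny_hash(m1).hex())
```

The two messages differ, and so do their full SHA-256 digests, yet their two-byte digests are identical. The pigeonhole principle guarantees this happens within 65,537 attempts, and in practice it takes only a few hundred.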

It is also worth noting that a cryptographic hash function can lose its cryptographic hash function status. That can be due to technological changes, better mathematical understanding, and more. When an algorithm is no longer suitable for applications requiring security, it can still serve as a non-cryptographic hash function (e.g., MD5 and SHA-1).

So how are cryptographic hash functions used in blockchain? One, they are used to secure the data and headers of the blocks. In other words, they make tampering with ledger entries detectable, which is what lets us trust them. Two, they are used to generate unique identifiers like the transaction hash (also called the hash ID). The transaction hash identifies the different transactions on a blockchain. It is for tracking and tracing transactions. A wallet address works similarly: it is derived by hashing the owner's public key, with the exact steps varying from one blockchain to another. This address identifies the user within transactions (e.g., the to and from fields).
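
Both uses can be sketched with a toy ledger. Everything here is an assumption made for illustration: the transaction fields, the JSON serialization, and the `transaction_hash` and `block_hash` helpers are hypothetical, and real blockchains define their own serialization and header formats.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def transaction_hash(tx: dict) -> str:
    """Identify a transaction by the hash of its contents (toy serialization)."""
    # Sorted keys, so the same transaction always serializes to the same bytes.
    return sha256_hex(json.dumps(tx, sort_keys=True).encode("utf-8"))


def block_hash(previous_hash: str, tx_hashes: list[str]) -> str:
    """Hash a toy block header: the previous block's hash plus the transaction hashes."""
    header = previous_hash + "".join(tx_hashes)
    return sha256_hex(header.encode("utf-8"))


tx1 = {"from": "alice", "to": "bob", "amount": 5}
tx2 = {"from": "bob", "to": "carol", "amount": 2}

genesis = block_hash("0" * 64, [transaction_hash(tx1)])
second = block_hash(genesis, [transaction_hash(tx2)])

# Tamper with the first transaction and recompute the chain.
tampered_tx1 = {**tx1, "amount": 500}
tampered_genesis = block_hash("0" * 64, [transaction_hash(tampered_tx1)])
tampered_second = block_hash(tampered_genesis, [transaction_hash(tx2)])

print(transaction_hash(tx1) != transaction_hash(tampered_tx1))  # True: the ID changes
print(second != tampered_second)  # True: the change ripples into the later block
```

Because each block's hash covers the previous block's hash, editing an old transaction changes every hash that comes after it, which is what makes tampering easy to spot.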

Different blockchain implementations are free to choose different cryptographic hash functions. Next time, let's examine an often-used one: SHA-256.

Banner image credit to rolaks