Learn Ethereum in 2024. #12. Hash in ethers.js.

João Paulo Morais
8 min read · Apr 6, 2024

In the previous two articles, we delved into the theoretical aspects of hash functions and asymmetric encryption. Now, it’s time to put theory into practice. These cryptographic concepts are actively implemented in code libraries across various programming languages. Due to their significance in blockchain technology, they are prominently featured in the main libraries associated with Ethereum. In this article, we’ll explore hash functions using a JavaScript library called ethers.js.

Ethers.js

As described in its documentation, ethers.js aims to be a complete and compact library for interacting with the Ethereum Blockchain and its ecosystem. It is currently the most widely used JavaScript library for interacting with Ethereum, although it is not the only option: alternatives include web3.js and Viem. Initially, web3.js was the primary library for interacting with the Ethereum network from web browsers, but a significant portion of the community eventually moved to ethers.js. There is now some movement towards Viem, but it is too early to tell whether it will surpass ethers.js in popularity.

Regardless of the chosen library, once you grasp the fundamentals of Ethereum interaction, any of them will serve its purpose. For now, let’s confine our focus to the cryptographic functionality these libraries offer, so we can put our theoretical knowledge into practice. I’ll be using ethers.js within a Node environment for demonstration purposes. From this point onward, I’ll assume you know enough about Node to install and use it.

In this article, I’ll be employing the most recent version of the library available at the moment, which is version 6, specifically version 6.11.1. When seeking additional tutorials on ethers online, it’s important to exercise caution, as many articles were written using version 5. It’s worth noting that there are several disparities between version 5 and version 6. It’s always advisable to learn and utilize the latest version for optimal compatibility and functionality.

Installing ethers

To utilize ethers.js, you need to install the following library:

npm install ethers 

We can import all library objects using a named import called “ethers”.

import { ethers } from "ethers";

Keccak256

Let’s commence our study with the hash function keccak256, which is implemented by ethers. It’s essential to recall that keccak256 is a hash function that takes an arbitrary input and produces a 256-bit identifier, equivalent to 32 bytes.

Hash functions can be applied to texts and documents, but when applied to texts, the hash function operates on the encoding of the text (typically UTF-8). In other words, hash functions are always applied to a set of bytes. To generate the keccak256 hash of a set of bytes using ethers, we utilize the function as follows:

const hash = ethers.keccak256("0x");
console.log(hash);

The keccak256 function accepts a string that represents a sequence of bytes. A byte is a number ranging from 0 to 255 and can be represented in various forms, with the most common in computing being hexadecimal notation. Thus, a byte is represented by a number between 00 and FF in hexadecimal. The string “0x” denotes a hexadecimal representation of a sequence of bytes, with all subsequent characters after “0x” representing the bytes themselves. Therefore, “0x” alone signifies an empty sequence or no bytes. However, it’s still possible to generate the hash of an empty sequence.

Let’s consider other examples. For instance, the hexadecimal equivalent of the number 200 is C8, so a sequence with only one byte, C8, is written as “0xC8”. Similarly, a sequence of five bytes — 0x48, 0x65, 0x6c, 0x6c, 0x6f — can be represented as “0x48656c6c6f”. Any sequence of bytes can be represented in this manner, and we can generate the hash accordingly.

const hash = ethers.keccak256("0x48656c6c6f");
console.log(hash);

The resulting hash is 0x06b3dfaec148fb1bb2b066f10ec285e7c9bf402ab32aa78a5d38e34566810cd2, which indeed comprises a 32-byte sequence, as evident from its length. There are 64 hexadecimal digits following the “0x” prefix.
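To make the hexadecimal notation more concrete, here is a minimal sketch that converts between “0x”-prefixed strings and raw bytes using only Node’s built-in Buffer. The helper names hexToBytes and bytesToHex are hypothetical, chosen for this illustration, and they perform no input validation.

```javascript
// Hypothetical helpers for illustration only; they are not part of ethers.
function hexToBytes(hex) {
  // Drop the "0x" prefix, then read two hexadecimal digits per byte.
  return Uint8Array.from(Buffer.from(hex.slice(2), "hex"));
}

function bytesToHex(bytes) {
  // Write each byte as two lowercase hexadecimal digits after "0x".
  return "0x" + Buffer.from(bytes).toString("hex");
}

// The number 200 as a one-byte sequence.
const single = bytesToHex([200]);

// The five bytes of "Hello".
const hello = hexToBytes("0x48656c6c6f");

// The keccak256 hash quoted above, decoded back into bytes.
const digest = hexToBytes(
  "0x06b3dfaec148fb1bb2b066f10ec285e7c9bf402ab32aa78a5d38e34566810cd2"
);

console.log(single);            // 0xc8
console.log(Array.from(hello)); // the byte values 72, 101, 108, 108, 111
console.log(digest.length);     // 32
```

Counting the decoded bytes confirms that 64 hexadecimal digits correspond to 32 bytes.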

Hashing online

To further solidify your understanding, you can also utilize a website that generates the hash of any sequence of bytes, such as https://emn178.github.io/online-tools/keccak_256.html. In the example below, I’m generating the hash of the word “Hello”. As you’ll notice, the resulting hash matches the hash of the hexadecimal sequence 0x48656c6c6f. This isn’t a coincidence; rather, it’s because 0x48656c6c6f represents the hexadecimal encoding of the UTF-8 encoding of the word “Hello”.

Let’s clarify this point further, as it can sometimes cause confusion. Hash functions always take a sequence of bytes as input. If you intend to generate a hash of a word, you must first convert that word into a sequence of bytes. This conversion process is known as encoding. It’s important to note that there isn’t just one type of encoding; rather, there are several encoding schemes available. You can observe this in the figure below, where you can select from various encoding options.

If you select a different encoding, such as UTF-16LE, the resulting hash will indeed differ, as demonstrated in the figure below.
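The effect of the encoding can be checked locally as well. The sketch below, which uses only Node’s built-in Buffer, prints the bytes of “Hello” under UTF-8 and UTF-16LE. Since the two byte sequences differ, any hash computed from them will differ too.

```javascript
const word = "Hello";

// UTF-8: one byte per character for plain ASCII text.
const utf8Hex = "0x" + Buffer.from(word, "utf8").toString("hex");

// UTF-16LE: two bytes per character here, low byte first.
const utf16leHex = "0x" + Buffer.from(word, "utf16le").toString("hex");

console.log(utf8Hex);    // 0x48656c6c6f
console.log(utf16leHex); // 0x480065006c006c006f00
```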

Another common source of confusion arises when attempting to generate the hash of a string as if it were a sequence of bytes. For example, in the figure below, it might appear that you’re hashing an empty string. However, in reality, you’re hashing the string “0x”, which, when encoded in UTF-8, corresponds to the sequence 0x3078. This emphasizes the importance of exercising caution when using hashing tools without a clear understanding of the input being provided.
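A small sketch, again using only Node’s built-in Buffer, shows the difference between treating “0x” as text and treating it as hexadecimal notation for bytes.

```javascript
// Interpreted as text: the two characters "0" and "x", encoded in UTF-8.
const asText = Buffer.from("0x", "utf8");

// Interpreted as hexadecimal notation: the digits after the prefix, of which there are none.
const asHexData = Buffer.from("0x".slice(2), "hex");

console.log(asText.toString("hex")); // 3078
console.log(asText.length);          // 2
console.log(asHexData.length);       // 0
```

The first value is a two-byte input and the second is an empty input, so their hashes are unrelated.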

The keccak256 function in ethers expects a string that represents a sequence of bytes, and it does not accept a string that lacks such a correspondence. For instance, the following code will produce an error because the function always requires a hexadecimal representation of bytes.

const hash = ethers.keccak256("Hello World"); // INVALID: "Hello World" is not a hexadecimal string
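One way to avoid this error is to convert the text into a hexadecimal string before hashing it. The sketch below uses only Node built-ins. The names looksLikeDataHexString and textToDataHex are hypothetical helpers written for this example, and the regular expression is my own approximation of what counts as a valid input, not the check ethers performs internally.

```javascript
// Hypothetical check: "0x" followed by an even number of hexadecimal digits.
function looksLikeDataHexString(value) {
  return /^0x([0-9a-fA-F]{2})*$/.test(value);
}

// Hypothetical helper: encode text as UTF-8 and write the bytes in hexadecimal.
function textToDataHex(text) {
  return "0x" + Buffer.from(text, "utf8").toString("hex");
}

const input = "Hello World";
const isValidBefore = looksLikeDataHexString(input);
const converted = textToDataHex(input);
const isValidAfter = looksLikeDataHexString(converted);

console.log(isValidBefore); // false
console.log(converted);     // 0x48656c6c6f20576f726c64
console.log(isValidAfter);  // true
```

The converted string is the kind of value that can be passed to ethers.keccak256. The library also exports a toUtf8Bytes function that produces the bytes of a text directly.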

Keccak256 in ethers.js

For completeness, let’s provide a more technically precise definition of the keccak256 function in ethers.

keccak256(data: BytesLike) => DataHexString

DataHexString is a string representing a hexadecimal sequence, precisely as we’ve been using, such as “0x48656c6c6f”. It is worth mentioning that this isn’t a standard JavaScript type but rather a construct specific to the ethers library. Similarly, BytesLike is a type defined within the library as

BytesLike => DataHexString | Uint8Array

The definition above states that the keccak256 function can also accept a value of type Uint8Array, which represents an array of bytes. Therefore, the following two lines produce the same keccak hash:

const hash1 = ethers.keccak256("0x48656c6c6f");
const hash2 = ethers.keccak256(new Uint8Array([0x48, 0x65, 0x6c, 0x6c, 0x6f]));
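In practice, you rarely type such an array by hand. As a minimal sketch, the standard TextEncoder (available globally in Node and in browsers) always encodes to UTF-8 and returns a Uint8Array, so its output fits the BytesLike definition above.

```javascript
// TextEncoder always produces UTF-8 and returns a Uint8Array.
const fromEncoder = new TextEncoder().encode("Hello");

// The same bytes written out by hand, as in the example above.
const literal = new Uint8Array([0x48, 0x65, 0x6c, 0x6c, 0x6f]);

// Compare the two arrays element by element.
const sameBytes =
  fromEncoder.length === literal.length &&
  fromEncoder.every((byte, i) => byte === literal[i]);

console.log(fromEncoder instanceof Uint8Array); // true
console.log(sameBytes);                         // true
```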

Keccak256 is indeed the most commonly used hash function on Ethereum. However, there are other hash functions that can be valuable in blockchain programming, notably sha256 and RIPEMD160. These functions are extensively used in Bitcoin. Consequently, ethers provides implementations of these hash functions, along with sha512.

Let’s demonstrate with an example. The RIPEMD160 hash of the word “Hello” can be calculated as follows:

const hash = ethers.ripemd160("0x48656c6c6f");
console.log(hash);

The resulting hash will be the string “0xd44426aca8ae0a69cdbc4021c64fa5ad68ca32fe”, which consists of 42 characters (including “0x” and 40 hexadecimal digits). This length is due to the fact that the output of RIPEMD160 is only 20 bytes (160 bits) in length.
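The relationship between output size and string length is simple arithmetic: each byte takes two hexadecimal digits, plus two characters for the “0x” prefix. The function below is a hypothetical helper that only illustrates this calculation.

```javascript
// Hypothetical helper: length of the "0x"-prefixed string for a digest of the given bit size.
function hexStringLength(bits) {
  const bytes = bits / 8;   // 8 bits per byte
  return 2 + bytes * 2;     // "0x" plus two hexadecimal digits per byte
}

const lengths = {
  keccak256: hexStringLength(256),
  sha256: hexStringLength(256),
  ripemd160: hexStringLength(160),
  sha512: hexStringLength(512),
};

console.log(lengths.keccak256); // 66
console.log(lengths.ripemd160); // 42
console.log(lengths.sha512);    // 130
```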

hashMessage in ethers

Now that we’ve explored how to utilize the keccak256 function in ethers, I’d like to discuss another function within the library called hashMessage. This function also generates the hash of a string, but with a prefix predefined by Ethereum. Let’s demonstrate this. First, we’ll execute the hashMessage function and observe that its result will differ from the keccak256 function, even for the same input value.

const hash = ethers.keccak256("0x48656c6c6f");
console.log(hash);
const message = ethers.hashMessage("Hello");
console.log(message);

In the example above, the string “Hello” is encoded in UTF-8 as 0x48656c6c6f, but the resulting hash is different. Furthermore, it’s important to highlight that the hashMessage function accepts a text string, which distinguishes it from Keccak256. The purpose of hashMessage is to prefix a message before hashing it, allowing for customization of a message tailored to a specific network, Ethereum.

Let’s delve into why this is crucial. Suppose you intended to sign the message “destroy my account” and send it to Ethereum. Even though Ethereum doesn’t inherently understand this command, let’s assume it does for the sake of this example, and your account is duly destroyed. Since you signed and transmitted this message, you evidently desired the destruction of your account.

However, the issue arises when there’s nothing preventing someone from taking this signed message and replicating it across other blockchains. If the message lacks personalization for a specific network, it remains entirely generic in nature.

To prevent a signed message from being reused on other networks, Ethereum, like other blockchains and following a practice that originated in Bitcoin, adopted a signature scheme that adds a prefix to each message. This scheme is specified in EIP-191 (EIP stands for Ethereum Improvement Proposal) and has been implemented. We’ll revisit this topic shortly. For now, let’s focus on demonstrating how the hash of messages conforming to the EIP-191 standard is generated.

EIP-191

To generate the hash consistent with EIP-191, you must include the following prefix before computing the hash: “\x19Ethereum Signed Message:\n” + len(message), where len(message) represents the size of the message. However, the details of this process can be confusing. Therefore, let me clarify further how to include this prefix.

In the first part of the prefix, “\x19Ethereum Signed Message:\n”, the values \x19 and \n are not literal text but escape sequences, each standing for a single byte. Specifically, \x19 corresponds to the byte 0x19, and \n corresponds to the byte 0x0a, which represents a new line. Thus, the complete prefix “\x19Ethereum Signed Message:\n” is represented in hexadecimal as 0x19457468657265756d205369676e6564204d6573736167653a0a.

The next step is to include the message length, which can also be confusing. Since it is a number, one might expect it to be written as the binary value of that number, but this is not the case: it is written as the UTF-8 encoding of its decimal digits. For example, suppose the message really is “destroy my account”. This message has 18 characters (and, since they are all ASCII, 18 bytes), so the length is 18 and we include the UTF-8 encoding of the text “18”, which is 0x3138. Finally, we append the message itself, also encoded in UTF-8, which is 0x64657374726f79206d79206163636f756e74.

In summary, we need to hash the following sequence of bytes:

0x19457468657265756d205369676e6564204d6573736167653a0a313864657374726f79206d79206163636f756e74

Breaking it down into parts:

Prefix: 19457468657265756d205369676e6564204d6573736167653a0a
Message size: 3138
Message: 64657374726f79206d79206163636f756e74

As shown in the code below, this is exactly how the hashMessage function works when we pass text to it.

const hash = ethers.hashMessage("destroy my account");
console.log(hash);
const sameHash = ethers.keccak256(
"0x19457468657265756d205369676e6564204d6573736167653a0a313864657374726f79206d79206163636f756e74"
);
console.log(sameHash); // SAME AS hash
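The byte sequence passed to keccak256 above does not have to be written by hand. The sketch below assembles it from the three parts in the breakdown using only Node’s built-in Buffer. The function name eip191Preimage is hypothetical, and the code mirrors the steps described in this article rather than reproducing the ethers source. It assumes the length is counted in bytes of the UTF-8 encoded message, which coincides with the character count for ASCII text.

```javascript
// Hypothetical helper: builds the bytes that get hashed for an EIP-191 personal message.
function eip191Preimage(message) {
  // The message itself, encoded in UTF-8.
  const body = Buffer.from(message, "utf8");

  // "\x19" and "\n" are escape sequences for the single bytes 0x19 and 0x0a.
  const prefix = Buffer.from("\x19Ethereum Signed Message:\n", "utf8");

  // The length is written as decimal digits in text form, not as a binary number.
  const size = Buffer.from(String(body.length), "utf8");

  return "0x" + Buffer.concat([prefix, size, body]).toString("hex");
}

const preimage = eip191Preimage("destroy my account");
console.log(preimage);

// A non-ASCII message: 3 characters, but 4 bytes in UTF-8.
const accented = eip191Preimage("olá");
console.log(accented);
```

For “destroy my account”, the printed string should be identical to the hexadecimal sequence used in the example above, so passing it to ethers.keccak256 is expected to give the same result as hashMessage.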

I hope by now you are somewhat familiar with hash functions. On Ethereum, they are omnipresent and serve various purposes.

Written by João Paulo Morais

Astrophysicist, full-stack developer, blockchain enthusiast. Technical Writer @RareSkills.
