Learn Ethereum in 2024. #13. Digital signature in Ethers.
In the last article, we explored how to generate the hash of a message of arbitrary size using the keccak256 function in ethers, as well as how to generate a message hash suitable for use on the Ethereum network, adhering to the EIP-191 standard. In this article, we will delve deeper into public key cryptography using the ethers library. Specifically, we will examine the relationship between private and public keys in more detail, explore the concept of message signatures, and understand how signatures can be utilized to ensure the integrity and authenticity of messages.
Elliptic curves
There are several digital signature schemes, and Ethereum utilizes the Elliptic Curve Digital Signature Algorithm (ECDSA). Despite the complex mathematics involved in this scheme, it is possible to gain a relatively clear understanding of how it works. Let’s begin by exploring elliptic curves. An elliptic curve is a two-dimensional curve that satisfies the equation y² = x³ + ax + b. Below are two examples of elliptic curves. On the curve, we define an operation, corresponding to the addition of two points. In other words, given two points A and B on the curve, we define an operation such that A + B results in a third point on the curve.
This operation satisfies all the properties of a group, so the points of an elliptic curve, together with point addition, form a group. A group is an algebraic structure with applications as diverse as crystallography and quantum mechanics. However, understanding the formal definition of a group is not essential to grasp the idea behind this construction.
Perhaps the most challenging aspect to grasp is that the curve used in cryptography is not continuous, because it is defined over a finite field: the coordinates x and y are integers, and the curve equation is evaluated modulo a prime number. Essentially, we discretize the curve, so its points appear scattered across a two-dimensional graph, as depicted in the figure below. However, it’s essential to understand that the addition of two points still results in a third point on our discretized curve, even though it no longer resembles a smooth curve.
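To make the discretized curve concrete, here is a minimal sketch that lists every point of a toy curve. It uses the equation y² = x³ + 7, the same shape as the curve Ethereum uses, but over the integers modulo 17. The prime 17 and the names isOnCurve and points are our own choices for illustration, and a field this small offers no security at all.

```javascript
// Toy curve y^2 = x^3 + ax + b with a = 0, b = 7, over the integers modulo 17.
// The prime 17 is an illustrative assumption, far too small for real use.
const p = 17n;
const a = 0n;
const b = 7n;

// True when (x, y) satisfies the curve equation modulo p.
function isOnCurve(x, y) {
  return (y * y - (x * x * x + a * x + b)) % p === 0n;
}

// Brute-force every (x, y) pair in the field and keep those on the curve.
const points = [];
for (let x = 0n; x < p; x++) {
  for (let y = 0n; y < p; y++) {
    if (isOnCurve(x, y)) points.push([x, y]);
  }
}

console.log(points.length); // 17 points (plus the point at infinity, not listed)
console.log(points.map(([x, y]) => `(${x}, ${y})`).join(" "));
```

Plotting these pairs gives a scatter of dots rather than a smooth line, yet each one satisfies the curve equation modulo 17.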
Now that we have the necessary ingredients, let’s select one point among all those present on the curve and designate it as the generator G. Since adding two points on the curve results in a third point, we find that G + G = C, where C is an easily calculable point. Similarly, G + G + G = D yields another easily calculable point, D. In the same way, we can add G to itself N times and obtain a point P. We write this operation as [N] * G = P, where the symbol * denotes the repeated addition of G, N times.
The point G must be predefined and known by all parties, along with the curve parameters. The essence of this scheme lies in the fact that, given N, anyone can compute P, since it’s merely a sum. Even when N is very large, on the order of 2²⁵⁶, efficient algorithms such as double-and-add exist to carry out this computation. However, it is computationally infeasible to deduce the value of N from P and G alone. This is the elliptic curve discrete logarithm problem.
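The easy direction, computing [N] * G, can be sketched with plain BigInt arithmetic and the double-and-add technique, which needs about 256 doublings instead of N additions. The sketch below assumes the publicly known secp256k1 parameters (the curve y² = x³ + 7); the helper names mod, modPow, inverse, add and multiply are ours, and the code is neither optimized nor hardened, so treat it as an illustration rather than as what any library does internally.

```javascript
// secp256k1 parameters: field prime, order of G, and the generator G.
const FIELD_PRIME = 2n ** 256n - 2n ** 32n - 977n;
const ORDER = 0xffffffff_ffffffff_ffffffff_fffffffe_baaedce6_af48a03b_bfd25e8c_d0364141n;
const G = {
  x: 0x79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798n,
  y: 0x483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8n,
};

// Reduce v into the range [0, m).
const mod = (v, m = FIELD_PRIME) => ((v % m) + m) % m;

// Modular exponentiation by repeated squaring.
function modPow(base, exp, m) {
  let result = 1n;
  for (base = mod(base, m); exp > 0n; exp >>= 1n) {
    if (exp & 1n) result = (result * base) % m;
    base = (base * base) % m;
  }
  return result;
}

// Modular inverse via Fermat's little theorem (m must be prime).
const inverse = (v, m = FIELD_PRIME) => modPow(v, m - 2n, m);

// Point addition; null stands for the point at infinity (the identity).
function add(A, B) {
  if (A === null) return B;
  if (B === null) return A;
  if (A.x === B.x && mod(A.y + B.y) === 0n) return null; // A + (-A)
  const slope = A.x === B.x
    ? mod(3n * A.x * A.x * inverse(2n * A.y)) // tangent line (doubling)
    : mod((B.y - A.y) * inverse(B.x - A.x)); // chord through A and B
  const x = mod(slope * slope - A.x - B.x);
  const y = mod(slope * (A.x - x) - A.y);
  return { x, y };
}

// Double-and-add: scan the bits of the scalar from least significant.
function multiply(scalar, point) {
  let result = null;
  let addend = point;
  for (; scalar > 0n; scalar >>= 1n) {
    if (scalar & 1n) result = add(result, addend);
    addend = add(addend, addend);
  }
  return result;
}

const P1 = multiply(1n, G); // private key 1
const P2 = multiply(2n, G); // private key 2
const GG = add(G, G);
console.log(P1.x === G.x && P1.y === G.y); // true: [1] * G = G
console.log(P2.x === GG.x && P2.y === GG.y); // true: [2] * G = G + G
console.log(multiply(ORDER, G)); // null: adding G "order" times returns to the identity
```

Going the other way, from a point back to the scalar, has no comparable shortcut, which is exactly the asymmetry the scheme relies on.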
Private and public keys
In this scheme, the number N represents the private key, while the point P represents the public key. It’s important to note that N is simply a scalar, indicating the number of times we add G to itself, while P is a point on the curve. Technically, N can be any integer from 1 up to the order of G minus 1, but for security reasons, it’s crucial to choose it at random from that whole range, which for the curve used by Ethereum is just under 2²⁵⁶, or 256 bits.
Ethereum, much like Bitcoin, utilizes the curve known as secp256k1. In this curve, the generator is defined by the point G = (Gₓ, Gᵧ), where Gₓ is represented by the hexadecimal value 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798, and Gᵧ by 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8.
For educational purposes, since the public key is obtained by multiplying the generator point G by the private key N, if we set the private key to the value 1, our public key must match the generator point G itself. Let’s verify this with Ethers.
SigningKey on ethers
Ethers provides a class called SigningKey, which serves as an abstraction for a pair of keys. This class accepts the private key as an argument, which can be passed as a string representing 32 bytes in hexadecimal format or as an array of type Uint8Array, as demonstrated in the previous article.
In the code snippet below, we begin by instantiating a SigningKey object with the private key set to 1. Subsequently, we utilize the privateKey and publicKey properties to retrieve the private and public keys, respectively.
import { SigningKey } from "ethers";
const keyPair = new SigningKey(
"0x0000000000000000000000000000000000000000000000000000000000000001"
);
console.log(keyPair.privateKey);
console.log(keyPair.publicKey);
The public key is represented as 0x0479be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8. Let’s break it down. The first byte, 0x04, indicates that the point is uncompressed. Following that, we have 64 bytes, which correspond to the x and y coordinates of the point, 32 bytes each. Therefore, our public key corresponding to the private key with the value 1 is as follows:
Pₓ=79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798
Pᵧ=483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8
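The breakdown above can be reproduced with ordinary string slicing. The sketch below splits the uncompressed key into its prefix and coordinates, and also builds the compressed form, in which the y-coordinate is replaced by a prefix byte (0x02 for an even y, 0x03 for an odd y). The variable names are ours, and the compressed form is shown only to explain what the "uncompressed" marker is contrasted with.

```javascript
// Uncompressed public key for private key 1, as printed by the snippet above.
const publicKey =
  "0x0479be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798" +
  "483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8";

const hex = publicKey.slice(2); // drop the "0x" marker
const prefix = hex.slice(0, 2); // 1 byte: "04" means uncompressed
const x = hex.slice(2, 66); // next 32 bytes (64 hex characters)
const y = hex.slice(66, 130); // last 32 bytes (64 hex characters)

// Compressed form: keep x and encode only the parity of y in the prefix.
const yIsEven = (BigInt("0x" + y) & 1n) === 0n;
const compressed = "0x" + (yIsEven ? "02" : "03") + x;

console.log(prefix); // 04
console.log(x);
console.log(y);
console.log(compressed); // 0x02 followed by the x-coordinate, since y is even here
```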
This corresponds exactly to the generator point G, as expected, since [1] * G = G. If we had chosen 2 as the private key, the public key would be the point G+G. In practice, it’s crucial never to manually choose your private key but instead rely on algorithms that generate large random numbers. Ethers provides functions for generating random numbers, which we will explore further when studying wallets. For now, let’s continue using our terribly insecure private key with the value 1, just for demonstration purposes.
Signing messages
Now that we have a key pair, we can proceed to sign messages. The signing scheme is slightly more intricate than the key creation process, so we won’t delve into the details here. The signature is generated based on the hash of the message, not the message itself. We have the option to sign either the pure message hash or the message hash prefixed via EIP-191, as discussed in the previous article. Here, we will sign the hash of the pure message, but the process remains the same.
In the code snippet below, we begin by hashing the text message “Hello”, passed as the hexadecimal encoding of its bytes, to obtain its hash value. Then, we utilize the sign method of the SigningKey instance created earlier to generate a signature for the hashed message.
import { SigningKey, keccak256 } from "ethers";
const keyPair = new SigningKey(
"0x0000000000000000000000000000000000000000000000000000000000000001"
);
const messageDigest = keccak256("0x48656c6c6f");
const signature = keyPair.sign(messageDigest);
console.log(signature);
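The argument 0x48656c6c6f passed to keccak256 above is simply the text Hello written out as bytes in hexadecimal. A minimal sketch of that conversion, using only the standard TextEncoder rather than ethers (the variable names are ours):

```javascript
// Encode the text as UTF-8 bytes, then print each byte as two hex digits.
const bytes = new TextEncoder().encode("Hello");
const hex = "0x" + Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");

console.log(hex); // 0x48656c6c6f
```

Ethers also ships a toUtf8Bytes helper that performs the same text-to-bytes step, which is usually more convenient than writing the hexadecimal string by hand.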
In ECDSA, a signature comprises a pair of values known as r and s. Together with the message hash and the signer’s public key, these values allow anyone to verify the signature. The value r is derived from the x-coordinate of a temporary curve point created during signing, but the x-coordinate alone is insufficient to uniquely determine that point: because of the symmetry of the elliptic curve, two points share each x-coordinate, with y-coordinates of opposite parity. To settle which one was used, additional information is required, referred to here as yParity, which can take values of 0 or 1. With the parameters r, s, and yParity, along with the hash of the original message, it becomes possible to uniquely recover the public key responsible for generating the signature.
Below is the signature represented as an object generated by Ethers:
Signature { r: "0xe5756b82ed97430f41b8b62adc432da4164c1255b523ea942002cd245a2fc91f", s: "0x720ac446dcc8775a768e40c24851f46a1bdf4406e9ca16963b0192779c3b9b82", yParity: 0, networkV: null }
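Outside of Ethers, the same three fields are often packed into a single 65-byte string: r, then s, then a final byte v, which by long-standing Ethereum convention is 27 plus yParity. The sketch below rebuilds that string by hand from the values printed above. The variable names are ours, and in ethers v6 the Signature object should expose the same encoding through its serialized property, so this is only meant to show what the bytes contain.

```javascript
// Field values copied from the Signature object printed above.
const signature = {
  r: "0xe5756b82ed97430f41b8b62adc432da4164c1255b523ea942002cd245a2fc91f",
  s: "0x720ac446dcc8775a768e40c24851f46a1bdf4406e9ca16963b0192779c3b9b82",
  yParity: 0,
};

// Legacy recovery byte: 27 for yParity 0, 28 for yParity 1.
const v = 27 + signature.yParity;

// 32 bytes of r, 32 bytes of s, 1 byte of v.
const serialized = "0x" + signature.r.slice(2) + signature.s.slice(2) + v.toString(16);

console.log(v); // 27
console.log(serialized.length); // 132: "0x" plus 130 hex characters (65 bytes)
console.log(serialized);
```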
Now, let’s proceed to retrieve the public key from the message and signature.
const publicKey = SigningKey.recoverPublicKey(messageDigest, signature);
console.log(publicKey);
By employing the static recoverPublicKey method, we can extract the public key that signed the message from the hash of the message and the signature. Here, the recovered key should match keyPair.publicKey. This gives us integrity: even a minor alteration of the message changes its hash, so the signature no longer recovers the expected public key. It also gives us authenticity, since only the holder of the corresponding private key could have produced the signature. Together, these two properties fulfill the objective of the digital signature.
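For readers curious about what happens behind sign and recoverPublicKey, here is a toy version of ECDSA signing and public key recovery in plain BigInt arithmetic. It is a sketch under several assumptions, not a description of Ethers’ implementation: the digest and the nonce are made-up values, the nonce is fixed (a real signer must derive a fresh secret nonce for every signature, otherwise the private key leaks), the low-s normalization that Ethereum libraries typically apply is skipped, and the rare case where the temporary point’s x-coordinate exceeds the group order is ignored. All helper names are ours.

```javascript
// secp256k1 parameters: field prime, order of G, and the generator G.
const FIELD_PRIME = 2n ** 256n - 2n ** 32n - 977n;
const ORDER = 0xffffffff_ffffffff_ffffffff_fffffffe_baaedce6_af48a03b_bfd25e8c_d0364141n;
const G = {
  x: 0x79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798n,
  y: 0x483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8n,
};

const mod = (v, m = FIELD_PRIME) => ((v % m) + m) % m;

function modPow(base, exp, m) {
  let result = 1n;
  for (base = mod(base, m); exp > 0n; exp >>= 1n) {
    if (exp & 1n) result = (result * base) % m;
    base = (base * base) % m;
  }
  return result;
}

// Inverse via Fermat's little theorem (both moduli used here are prime).
const inverse = (v, m = FIELD_PRIME) => modPow(v, m - 2n, m);

// Point addition; null is the point at infinity.
function add(A, B) {
  if (A === null) return B;
  if (B === null) return A;
  if (A.x === B.x && mod(A.y + B.y) === 0n) return null;
  const slope = A.x === B.x
    ? mod(3n * A.x * A.x * inverse(2n * A.y))
    : mod((B.y - A.y) * inverse(B.x - A.x));
  const x = mod(slope * slope - A.x - B.x);
  return { x, y: mod(slope * (A.x - x) - A.y) };
}

// Double-and-add scalar multiplication.
function multiply(scalar, point) {
  let result = null;
  for (let addend = point; scalar > 0n; scalar >>= 1n) {
    if (scalar & 1n) result = add(result, addend);
    addend = add(addend, addend);
  }
  return result;
}

// Sign digest z with private key d and nonce k (toy: k is supplied by the caller).
function sign(z, d, k) {
  const R = multiply(k, G); // temporary point
  const r = mod(R.x, ORDER);
  const s = mod(inverse(k, ORDER) * (z + r * d), ORDER);
  return { r, s, yParity: Number(R.y & 1n) };
}

// Rebuild R from r and yParity, then solve s*R = z*G + r*Q for the public key Q.
function recover(z, { r, s, yParity }) {
  // Square root modulo the field prime; valid because the prime is 3 mod 4.
  let y = modPow(r ** 3n + 7n, (FIELD_PRIME + 1n) / 4n, FIELD_PRIME);
  if (Number(y & 1n) !== yParity) y = FIELD_PRIME - y;
  const sR = multiply(s, { x: r, y });
  const minusZG = multiply(mod(-z, ORDER), G);
  return multiply(inverse(r, ORDER), add(sR, minusZG));
}

const privateKey = 1n; // the article's deliberately insecure key
const digest = 0x1f2e3d4c5b6a79881f2e3d4c5b6a79881f2e3d4c5b6a79881f2e3d4c5b6a7988n; // hypothetical digest
const nonce = 0x123456789abcdef123456789abcdef123456789abcdefn; // hypothetical, insecure fixed nonce

const signature = sign(digest, privateKey, nonce);
const recovered = recover(digest, signature);
const tampered = recover(digest + 1n, signature);

console.log(recovered.x === G.x && recovered.y === G.y); // true: key 1 recovers G
console.log(tampered.x === G.x && tampered.y === G.y); // false: altered digest, different key
```

Changing the digest by a single unit makes the recovery land on an unrelated point, which is the tamper evidence described above: the signature no longer leads back to the signer’s public key.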
In Ethereum, a transaction that modifies the state of the network is itself a message that must be signed. There’s no need to explicitly specify a sender, as the sender can be determined from the transaction and signature, as demonstrated previously. To gain a deeper understanding of this process, we’ll explore the concept of accounts in Ethereum in the next article.