Learn Ethereum in 2024. #9. Types of Ethereum nodes.

João Paulo Morais
6 min read · Mar 23, 2024

Participants in the Ethereum network are commonly referred to as nodes. Operating a node requires two pieces of software: an execution client and a consensus client. If the node is also a validator, it runs a third component, a validator client, connected to the consensus client. The story doesn’t end there, however. Ethereum has three distinct types of nodes: full nodes, archive nodes, and light nodes. In this article, we look at the differences between them.

Client Diversity

Ethereum doesn’t rely on a single client implementation, but on several. Both execution clients and consensus clients are developed by different teams using various programming languages. This diversity matters for security. When multiple teams independently implement the same protocol and reach the same results, we gain confidence that the implementations are correct. Moreover, if one client runs into a bug, the other clients can keep the network running.

Client diversity is also needed to keep any single implementation from dominating the others. When one implementation sees significantly more usage than the rest, it can effectively control the network. This becomes particularly critical during network forks, where the choices made by the dominant client are carried forward. If the dominant client implements the protocol differently from the others, most nodes will follow its choices even when the other clients are the ones that are correct. Regrettably, client diversity remains more of an aspiration than a reality, with a single execution client and a single consensus client holding the largest share of usage.

Even within the same implementation, users can typically choose to run one of three types of nodes: full nodes, archive nodes, and light nodes. Now that we understand the distinction between the ledger and the state, as well as the concepts of forks and block reorganizations, we are ready to see what each type of node is for.

Full Nodes

Full nodes are the most prevalent type on the network. They download and maintain the entire ledger, from the genesis block to the head of the chain. As previously discussed, the ledger makes it possible to recreate the current state by executing all transactions from all blocks, in order. Full nodes can therefore reconstruct the current state on their own. When a new node joins the network, the most secure approach is to do exactly that: download every block and re-execute every transaction since genesis.

Re-executing all transactions since the genesis block is the safest way to synchronize, but not the fastest, so full nodes also offer alternative synchronization methods. A node can instead start from the state of a recent checkpoint that is already finalized and execute block transactions from that point onward until it reaches the head of the chain. This lets new participants reach the current state much more quickly.
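To make the two synchronization paths concrete, here is a minimal sketch in Python. It is a toy model and not how any real client works: the ledger is a list of blocks containing simple transfers, the state is a dictionary of balances, and the names `Tx`, `apply_block`, and `sync` are made up for illustration. It only shows that replaying from genesis and replaying from a trusted checkpoint arrive at the same state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    """A toy transfer; real Ethereum transactions carry far more fields."""
    sender: str
    recipient: str
    amount: int


def apply_block(state, txs):
    """Return the new state after executing a block's transfers in order."""
    new_state = dict(state)
    for tx in txs:
        if new_state.get(tx.sender, 0) < tx.amount:
            raise ValueError(f"insufficient balance for {tx.sender}")
        new_state[tx.sender] = new_state.get(tx.sender, 0) - tx.amount
        new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.amount
    return new_state


def sync(ledger, start_state, start_height=0):
    """Execute the blocks ledger[start_height:] on top of start_state."""
    state = dict(start_state)
    for txs in ledger[start_height:]:
        state = apply_block(state, txs)
    return state


GENESIS = {"alice": 100, "bob": 0, "carol": 0}
LEDGER = [
    [Tx("alice", "bob", 30)],
    [Tx("bob", "carol", 10)],
    [Tx("alice", "carol", 5), Tx("carol", "bob", 2)],
]

# Path 1: replay every block since genesis.
full = sync(LEDGER, GENESIS)

# Path 2: start from a checkpoint state (here, the state after block 2,
# which in practice would come from a trusted, finalized source) and
# execute only the blocks after it.
checkpoint_state = sync(LEDGER[:2], GENESIS)
fast = sync(LEDGER, checkpoint_state, start_height=2)

print(full)
print(fast == full)
```

The trade-off is visible in the second path: it skips work, but only because it accepts the checkpoint state as given instead of deriving it.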

Regardless of the synchronization method, full nodes must keep the entire ledger in their database, which is why they are called “full” nodes. They can serve blocks and their transactions to peers upon request, and they can re-execute transactions to regenerate the network state at any point in history. Note, however, that full nodes do not store the historical state of the network; they only retain the most recent states. Let me clarify.

Each transaction generates a new global state. For practical purposes, however, it’s common to consider the global state per block. In other words, we have a distinct global state for each block: the global state at block 1, the global state at block 2, and so forth. The header of each block contains a commitment to the global state at that block, namely the root hash of its state tree (a Merkle Patricia trie). Starting from the state of one block, a node can therefore execute the transactions of the next block, compute the resulting state root, and compare it with the one in that block’s header. If the two match, the global state was updated correctly.
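The check described above can be sketched in a few lines of Python. This is a simplification under stated assumptions: the “state root” here is just a SHA-256 hash of the sorted balances, standing in for the root of Ethereum’s real state tree (which uses Keccak-256), and `state_root`, `execute`, and `validate_block` are hypothetical helper names.

```python
import hashlib
import json


def state_root(state):
    """Toy commitment to the state: a hash of its canonical JSON encoding.
    Ethereum uses the root of a Merkle Patricia trie instead."""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def execute(state, txs):
    """Apply (sender, recipient, amount) transfers and return the new state."""
    new_state = dict(state)
    for sender, recipient, amount in txs:
        new_state[sender] = new_state.get(sender, 0) - amount
        new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


def validate_block(parent_state, block):
    """Re-execute the block and compare the result with the header's root."""
    candidate = execute(parent_state, block["transactions"])
    if state_root(candidate) != block["header"]["state_root"]:
        raise ValueError("state root mismatch: block rejected")
    return candidate


parent = {"alice": 100, "bob": 0}
txs = [("alice", "bob", 40)]

honest = {
    "header": {"state_root": state_root({"alice": 60, "bob": 40})},
    "transactions": txs,
}
forged = {
    "header": {"state_root": state_root({"alice": 0, "bob": 100})},
    "transactions": txs,
}

print(validate_block(parent, honest))
try:
    validate_block(parent, forged)
except ValueError as err:
    print(err)
```

The node never has to trust the root in the header; it recomputes it and rejects the block if the numbers disagree.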

The state is dynamic: a new state emerges at every block. Full nodes do not keep a history of it. If you want the state of a specific smart contract at a particular past block, a full node does not have that information readily available in its database. It can recalculate it by executing all transactions from the genesis block up to the block in question, but it cannot simply look it up. The only states available for immediate consultation are those of the most recent blocks.

Full nodes keep the states of several recent blocks, not just the single most recent state, because of block reorganizations. If a network fork invalidates recent blocks, a node holding only the latest state would have to reconstruct the entire state from genesis. Because full nodes retain the last few states, they can instead start from the last block unaffected by the reorganization and recalculate the state along the new chain of blocks. Full nodes therefore store both the ledger and a limited state history corresponding to recent blocks. Presently, this combined information occupies slightly over 1 TB of disk space.
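As an illustration of why a window of recent states is enough to handle a reorganization, consider the following toy sketch. The class `RecentStates`, its `keep` parameter, and the balance-only state are assumptions made for this example; real clients organize their databases very differently.

```python
from collections import OrderedDict


class RecentStates:
    """Keeps the states of only the last `keep` block heights."""

    def __init__(self, genesis_state, keep=4):
        self.keep = keep
        self.states = OrderedDict({0: dict(genesis_state)})
        self.head = 0

    def import_block(self, txs):
        """Execute a block on top of the head state and prune old states."""
        state = dict(self.states[self.head])
        for sender, recipient, amount in txs:
            state[sender] = state.get(sender, 0) - amount
            state[recipient] = state.get(recipient, 0) + amount
        self.head += 1
        self.states[self.head] = state
        while len(self.states) > self.keep:
            self.states.popitem(last=False)  # drop the oldest retained state

    def reorg(self, common_ancestor, new_blocks):
        """Discard blocks above the common ancestor and apply the new branch."""
        if common_ancestor not in self.states:
            raise LookupError("ancestor state was pruned; a longer replay is needed")
        for height in [h for h in self.states if h > common_ancestor]:
            del self.states[height]
        self.head = common_ancestor
        for txs in new_blocks:
            self.import_block(txs)


node = RecentStates({"alice": 100, "bob": 0}, keep=4)
for _ in range(5):
    node.import_block([("alice", "bob", 10)])

# Blocks 4 and 5 are replaced by a different branch that forks after block 3.
node.reorg(common_ancestor=3, new_blocks=[[("alice", "bob", 1)], [("alice", "bob", 1)]])
print(node.head, node.states[node.head])
```

Only the two blocks of the new branch are re-executed. A reorganization reaching further back than the retained window would hit the `LookupError` branch, which corresponds to the expensive replay the text describes.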

Archive Nodes

Archive nodes go beyond full nodes: they store not only the ledger but also the entire historical state of the Ethereum network. With this comprehensive database, they can immediately answer queries about the state of any account at any block. For instance, if you want to know how many tokens of contract A Alice held at block 1,345,567, an archive node can answer right away. Archive nodes need significantly more disk space than full nodes. Most client implementations are not optimized to function as archive nodes and require over 10 TB of storage to do so.
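For reference, a historical query like the one about Alice’s tokens is usually expressed as an `eth_call` JSON-RPC request with an explicit block number. The sketch below only builds the request body and does not send it anywhere. It assumes that contract A exposes an ERC-20 style `balanceOf(address)` function, which the example in the text does not specify, and both addresses are placeholders invented for illustration.

```python
import json

# First four bytes of keccak256("balanceOf(address)"), the ERC-20 selector.
BALANCE_OF_SELECTOR = "70a08231"


def balance_of_request(token, holder, block_number, request_id=1):
    """Build an eth_call request for token.balanceOf(holder) at a given block."""
    holder_hex = holder[2:] if holder.startswith("0x") else holder
    if len(holder_hex) != 40:
        raise ValueError("expected a 20-byte hex address")
    # ABI encoding: selector followed by the address left-padded to 32 bytes.
    calldata = "0x" + BALANCE_OF_SELECTOR + holder_hex.lower().rjust(64, "0")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_call",
        "params": [{"to": token, "data": calldata}, hex(block_number)],
    }


CONTRACT_A = "0x" + "aa" * 20  # placeholder address, not a real contract
ALICE = "0x" + "bb" * 20       # placeholder address

request = balance_of_request(CONTRACT_A, ALICE, 1_345_567)
print(json.dumps(request, indent=2))
```

Sent to an archive node, a request of this shape can be answered from stored state. A full node that no longer holds the state for that block would typically return an error instead.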

Erigon, an Ethereum client written in Go and originally derived from Geth, is specifically optimized to run as an archive node and requires “only” 3.5 TB of storage. Archive nodes are not required for participating in the network; they are mainly used by node-as-a-service providers and for Ethereum data analysis. In short, archive nodes are for cases that need immediate access to the entire historical state of the Ethereum network.

Light Nodes

A more intriguing type of node is the so-called light node. Unlike full nodes, light nodes do not hold complete blocks of the ledger; they store only block headers, without the transactions. As a result, light nodes neither keep the state nor have the means to calculate it, since they lack the transaction data. When a light node needs information about a transaction, it must request it from a full node.

You may wonder what light nodes are good for if they do not hold the entire ledger. Their value lies in their ability to validate the information they receive. Suppose we are a light node requesting information about a transaction from a full node or a node-as-a-service. Block headers contain the Merkle root of the block’s transactions and the Merkle root of the state at that block, so a light node can check the received information, together with the accompanying Merkle proof, against the header it already holds. It can detect when it has been given distorted or inaccurate information about a transaction. This capability aligns with the fundamental motto of blockchain:

Don’t trust, verify.
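The verification step a light node performs can be illustrated with a small Merkle proof example. This is a deliberately simplified sketch: it uses a plain binary Merkle tree with SHA-256, whereas Ethereum uses Merkle Patricia tries with Keccak-256, and the function names are made up for the example. The principle is the same: with only the root from a header, a node can check that a piece of data belongs to the block.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Run by the full node: return the root and a proof for leaves[index].
    The proof is a list of (sibling_hash, sibling_is_left) pairs."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf, proof, root):
    """Run by the light node: recompute the path from the leaf up to the root."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


transactions = [b"tx-a", b"tx-b", b"tx-c"]

# The light node holds only `root` (from the header); the full node
# supplies the transaction and its proof.
root, proof = merkle_root_and_proof(transactions, 2)

print(verify(b"tx-c", proof, root))      # genuine transaction
print(verify(b"tx-forged", proof, root))  # tampered transaction
```

The proof contains one hash per tree level, so its size grows logarithmically with the number of transactions, which is what makes this feasible on small devices.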

Light nodes can run on devices with limited computing power and storage, such as smartphones, while still validating the information they receive from third parties. The concept is not unique to Ethereum; many Bitcoin wallets rely on it as well. Presently, light nodes are not fully operational on Ethereum under proof of stake. They are an area of active research, however, and are essential for broader adoption of Ethereum in the future.


Written by João Paulo Morais

Astrophysicist, full-stack developer, blockchain enthusiast. Technical Writer @RareSkills.
