A bit on Public-Key criptography.

João Paulo Morais
10 min readMar 5, 2023
Generated by Dall-e-2.

I have always wanted to learn cryptography. When I started studying blockchain, my desire became a necessity. The security of cryptocurrencies is maintained by cryptography, which is one of the pillars of a blockchain. So, at least the basics I needed to know; But I always strive for the basics. If I want to learn about something, I need to dive deep into the technical and historical fundamentals. This post attempts to tell you a little about what I learned about public cryptography. I know well that, at first, the path seems obscure, so it’s good to have a guide for the journey.

Public-key cryptography, also known as asymmetric cryptography, is a relatively recent invention from the early 1970s. It is difficult to establish its precise origin because it was discovered independently by (at least) two groups, one in the United States and the other in England. This is because most research into cryptography has taken place (and still takes place) in secret government agencies. Thus, to this day, it is impossible to say whether public cryptography was not invented even earlier than we believe.

What we do know is that the first public appearance of asymmetric cryptography was through the work of Whitfield Diffie and Martin Hellman. Diffie is a visionary mathematician and cryptographer whose main goal has always been to revolutionize cryptography. He has maintained that vision for several years, between temporary jobs and adventures nationwide. On the other hand, Hellman has an academic profile and a not-so-adventurous life. They were an unusual duo for a daring purpose, which many considered impossible until that moment.

Whitfield Diffie, Martin Hellman and Ralph Merkle.

Diffie wanted to create public-key cryptography for two reasons. As early as the 1970s, he knew that computers would soon be ubiquitous in homes and offices, and such computers would need to send messages to each other. To ensure the privacy of messages, they needed to be encrypted. However, until that moment, there was only one type of cryptography: symmetric. In this type of encryption, a shared key has to be agreed upon beforehand between the parties involved in the process. Such kind of cryptography would never be conducive to a computer network because the parties involved are unknown to each other. Thus, it was necessary to solve this problem, known as the key exchange problem.

But the key exchange was not the only purpose of asymmetric cryptography as envisioned by Diffie. In a networked environment where previously unknown people would connect, ensuring that such people were who they claimed to be would be necessary. That is, it would be required to create a type of digital signature based on cryptography that guarantees the sender’s identity of a specific message.

Along with Ralph Merkle, a character well known to any blockchain student due to his Merkle tree, Diffie and Hellman began to study ways to solve these problems. All succeeded, but the way found by Merkle is not so safe. Diffie and Hellman devised what is now known as the Diffie-Hellman key exchange, a secure and efficient way for two people, say, Alice and Bob, to agree on a key over an insecure channel; that is, deciding on a key just by exchanging messages that can, in principle, be intercepted.

Diffie and Hellman could not solve the second proposed problem, a way to create a digital signature on documents. This would be further developed by a group of mathematicians at MIT, Ron Rivest, Adi Shamir, and Leonard Adleman, who developed a model of asymmetric cryptography known as RSA, capable of both encrypting data and generating digital signatures. We’ll talk about that later.

About asymmetric cryptography

Asymmetric cryptography uses two rather than a single combined key, as in symmetric cryptography. It is also mathematically much more elegant than traditional cryptography. And much simpler. Diffie and Hellman’s technique for key exchange is based on the discrete logarithm problem. It is also possible to encode and decode a message based on the same problem or to digitally sign a document. Diffie and Hellman came close to creating a complete asymmetric cryptography scheme. Still, the expansion of the discrete logarithm problem to these cases was carried out a few years later by cryptographer Taher Elgamal.

RSA encryption is based on factoring large prime numbers. What prime number factorization and the discrete logarithm problem have in common is the idea of trapdoor functions, which are functions relatively easy to calculate, but whose inverse, which does exist, is very difficult to calculate. We will soon see how such functions work when we study the Diffie-Hellman technique.

Symmetric encryption is basically based on scrambling words or bits. Since its inception, thousands of years before Christ, until the present moment, the idea has not changed much: to exchange and replace characters of a message through an algorithm and a key so that the recovery of the initial message can only be performed by who has the key. What changes from old schemes to new schemes is the algorithm used. This leads us to a question: what guarantees the security of an algorithm? The time and effort to crack it.

We consider an algorithm secure if cryptanalysts have tried to break it without success for a long time. A great example is the Vigenère cipher, an encryption method used for several centuries with great success. This method seemed so secure that it came to be known as “le chiffrage indéchiffrable”, or the indecipherable cipher. This lasted until Charles Babbage and Friedrich Kasiski independently discovered a method of breaking it.

There is no elegant mathematical structure behind symmetric cryptography. I’ll show you how the Diffie-Hellman key exchange works in a few lines, but explaining a modern symmetric encryption algorithm like DES would take many pages and diagrams. And in the end, you would ask yourself: right, but why was it made like that? At first, the choices on how to replace and swap letters or bits seem arbitrary.

Long story short, symmetric encryption is ugly. However, it’s fast. This is because, despite the algorithms’ complexity, it basically swaps and replaces bits, relying heavily on sums or XOR operations. On the other hand, asymmetric encryption is elegant but slow. This is because the mathematical operation used is the exponentiation of large numbers, which requires a lot of computational power.

Let’s look at how the Diffie-Hellman key exchange works to show symmetric cryptography’s elegance.

The Diffie-Hellman key exchange

The idea of the Diffie-Hellman key exchange is that two people, Alice and Bob, must agree on a common key, but they can only communicate over an insecure channel. To do this, Alice and Bob first agree on 2 numbers: alpha and n. There is a criterion for choosing α and n, but we will not worry about that. Let’s say that α and n are known to Alice, Bob, and any malicious actor, such as Eve, who intends to intercept the communication.

The next step is for Alice to pick a random number, say x, and compute αˣ mod n. Let’s call this result Kₐ. Similarly, Bob picks a random number, say y, and calculates αʸ mod n, which we’ll call Kᵦ. Now Alice sends Kₐ to Bob, while Bob sends Kᵦ to Alice. Since the channel is insecure, we can assume that Eve will be able to discover Kₐ and Kᵦ.

Now Alice and Bob can compute the shared key. It is enough for Alice to exponentiate Kᵦ by x, that is, Kᵦₐ = Kᵦˣ mod n = (αʸ)ˣ mod n. Bob does something similar. When it receives Kₐ, it exponents by y, that is, Kₐᵦ = (αˣ)ʸ mod n. And what happens is that, by the exponentiation property, Kₐᵦ = Kᵦₐ. Quite simply, Alice and Bob agreed on a key!

And what about Eve, who wants to discover the key. She knows Kₐ and Kᵦ, but to figure out Kₐᵦ, she needs to know either x or y. But neither x nor y were sent! Eve knows that Kₐ = αˣ mod n, and she knows Kₐ, α, and n. Technically, all Eve needs to do is calculate x by taking the logarithm. But this is the discrete logarithm problem, and there is no simple way to solve it without trying every possible option. Exponentiating is easy, but taking the logarithm is nearly impossible. This is an example of a trapdoor function.

By choosing large enough numbers, on the order of 1000 bits, or 1Kb, Eve can’t calculate x or y. The Diffie-Hellman scheme is simple and elegant. If only symmetric encryption were this elegant…

Hybrid schemas

The Diffie-Hellman method does not allow you to encrypt or decrypt messages; ElGamal’s schema was developed just a decade later. In the meantime, we have the appearance of the algorithm that today is the most used in the world for asymmetric cryptography: RSA.

Ron Rivest, Adi Shamir and Leonard Adler.

As already mentioned, the problem with asymmetric encryption is that it is slow. Remember that we have to exponentiate numbers in the order of 1Kb; that is, huge numbers. For this reason, most encryption systems today use a hybrid model: RSA or Diffie-Hellman to agree on a key, and then AES or 3DES or some other symmetric encryption schema to encrypt the messages using the combined key.

Most website security today is still maintained by RSA, whose patent was registered by its inventors but has already expired. RSA is also part of the most widely used encryption software, PGP.

The British

In chronological terms, Diffie, Hellman, Merkle, Rivest, Shamir, or Adleman were not the first to discover asymmetric cryptography or digital signatures. That was left to British cryptographers working for the British government in a security agency called the Government Communications Headquarters, GCHQ for short, the British equivalent of the national security agency, NSA. This was only revealed in the late 1990s, having remained classified as confidential material for almost 30 years.

James Ellis proposed what he called non-secret cryptography in 1970, similar to what would be envisioned by Diffie 4 years later. Ellis, however, failed to make his idea a reality, just as Diffie was unable to solve the problem of asymmetric cryptography. The equivalent of RSA for the British was developed 3 years later, in 1973, by the mathematician Clifford Cocks, also 4 years before the appearance of the scheme proposed by Rivest, Shamir, and Adleman.

James Ellis and Clifford Cocks.

But comparing cryptography developed by academics with cryptography developed within intelligence agencies is unfair. First, most funding for cryptographic research is allocated to secret government agencies, such as the RSA in the US. They have access to everything published in the academic world, while the academic world does not have access to what is developed in such agencies. It’s an asymmetrical model, for sure.

Another factor to be highlighted is that in the 1970s, there was tremendous pressure to end private cryptography research. To the government, cryptography was their business, not to be developed by outsiders. Sharing newfound knowledge in this area could be seen as an act of treason and could lead to the arrest of those responsible. For example, mere participation in a cryptography conference could put an academic behind bars. You had to be a rebel to work with cryptography outside of government agencies in those days.

Asymmetric cryptography in the blockchain environment

In a blockchain, the most used cryptographic method for digital signatures is not RSA or ElGamal but cryptography based on elliptic curves. This relatively modern encryption method only gained traction at the beginning of the current century. Its math is more intricate, but the main idea is still based on trapdoor functions.

In elliptic curve cryptography, the principal operation is not exponentiation but the sum of points on an elliptic curve. Imagine there is a way to add two points on a curve and calculate a third point from them. It is also possible to add the same point repeatedly, which will result in another point on the curve. Let’s say the starting point is represented by the letter K, and we will add K + K + K + … + K, n times. I’m going to write this new point as P = nK.

Sum of points on an elliptic curve on the continuum. In practice, the points are discrete. Thx Wikicommons.

It is not difficult to calculate nK, but having only P and K, finding n is practically impossible. We have a function that is almost impossible to invert; that is, a trapdoor function.

The most significant advantage of using elliptic curves is the size of the keys. The size of the keys is proportional to the size of the numbers we need to use, which means that keys in RSA and Elgamal have a size in the Kb range, something around 4,096 bits. To achieve similar security on elliptic curves, keys 256 bits long are sufficient. As public keys are recorded in the blockchain ledger, storing 256 bits instead of 4096 bits is much more advantageous.

If one wants to understand blockchain cryptography, it is essential to understand at least the basics of elliptic curve cryptography. You will learn, for example, what it means to add two points on an elliptic curve. This requires a little study in discrete math, which can be a lot of fun.

The history of cryptography is fascinating, and you can read more about it in books like The Code Book by Simon Singh and Crypto: How the Code Rebels Beat the Government Saving Privacy in the Digital Age, by Steven Levy.

Comments and suggestions about this article are welcome.

Any contribution is welcome. www.buymeacoffee.com/jpmorais.

--

--

João Paulo Morais

Astrophysicist, full-stack developer, blockchain enthusiast. Unraveling cosmos mysteries by day, crafting the next Latin American novel by night.