RSA cryptosystem

Among all asymmetric algorithms, RSA is the most used and also perhaps the easiest to understand and implement. A peculiarity of this algorithm is that its two keys serve both to encrypt and to authenticate. It owes its name to its three inventors: Ronald Rivest, Adi Shamir and Leonard Adleman, who first published the RSA method in 1977. He has been under the RSA Laboratories patent until September 20, 2000, so its commercial use It was restricted until that date.

RSA, as already indicated, is based on the difficulty of large number factorization. Public and private keys are calculated from a number that is obtained as a product of two large cousins. An attacker who wants to recover a clear text from the cryptogram and the public key, has to face this factorization problem.


The problem of large number factorization

We already know a possible way to break down a number n into its factors: try to divide it by all positive integers between 2 and the root of n. But when we talk about a 1024 bit size number, this method is computationally impractical. Of course, over time mathematicians have invented other more efficient factoring methods, but none have achieved an algorithm with an order of complexity that allows factoring in a reasonable time numbers of sizes like those used in RSA today, even with the computing power available today.

In fact, the integer factorization problem is considered to be an NP class problem, that is, a problem for which one or more algorithms exist that solve it, but none of the known algorithms are executed in a polynomial time (which can be expressed polynomially depending on the size of the input data), and therefore are inefficient or intractable for very large input data.

RSA algorithm

(1) Key pair generation
To generate a pair of keys (KP; Kp), first, two large prime numbers, p and q (of about 200 digits each, for example) are chosen randomly. Then the product n = p.q is calculated

We will now choose a relative number and cousin with (p-1) and with (q-1). This pair of numbers (e, n) can be known by anyone, and constitute the so-called public key

e must therefore have an inverse module (p-1) (q-1), which we call d. Of course it is true that ed ≡ 1 mod ((p-1) (q-1)), which is the same as saying that ed = 1 + k (p-1) (q-1) for some integer k. The private key will be the pair (d, n). This number d must be kept secret and will only be known by the owner of the key pair.

(2) Encryption of the message with the public key
It should be noted that with this algorithm the messages that are encrypted and decrypted are whole numbers smaller than n, not single letters as in the case of César or Vigènere ciphers.

To obtain the encrypted message C from the clear message M, the following operation is performed:
C = Me (mod n)

(3) Decryption of the message with the private key
To recover the original message after encryption, the following operation is performed:
M = Cd (mod n)


Justification of the method

Cd (mod n) = (Me) d (mod n) = M1 + k (p-1) (q-1) (mod n) = (M (p-1) (q-1)) kM (mod n ) [i]

If we remember, the function of Euler φ (n) = (p-1) (q-1), and that in general, except improbable chance, we will have to gcf (M, p) = mcd (M, q) = mcd (M, n) = 1. And therefore according to the Euler-Fermat theorem, Mφ (n) ≡ 1 (mod n) ⇒ (M (p-1) (q-1)) k ≡ 1 (mod n) [ii]

From [i] and [ii] it is obtained that Cd (mod n) = 1.M (mod n) = M, for 0 <M <n

Commutativity of encryption and decryption in RSA
Due to the properties of modular exponentiation, encryption and decryption are commutative:

M = (Me mod n) d mod n = Md.e mod n = (Md mod n) e mod n = M

This means that if we encrypt M with the public key e and then decrypt the result with the private d we get M again, we can also encrypt M with the private key d and decrypt the result with the public key e, again obtaining M.

This property is important because it allows us to use RSA not only to encrypt a message, but also to authenticate the message, as we will see in the next topic.


RSA cryptanalysis

To break an RSA encryption, we can try several ways. Apart from factoring n, which we already know is a computationally intractable problem in a reasonable time, we could try to calculate φ (n) directly, or try a brute force attack trying to find the private key d, systematically testing with each of the possible numbers of the key space. Both attacks are, for large n, even more computationally expensive than the factorization of n.


The use of RSA in information security

The RSA cryptosystem not only guarantees the confidentiality of communication between two parties, originally encrypting the message to be transmitted over an insecure channel and decrypting it at reception, but also provides other information security services or functions, such as they are the authentication of origin, integrity or non-repudiation (through the digital signature). Let’s see how RSA would be used to guarantee these services.

In a communication between two parts A and B, each of them will generate before starting its own key pair (public, private). Thus A will have the pair (KPA, kpA) and B its pair (KPB, kpB), where KP are the public keys that are known by the two parties, and kp are the private ones, which each party keeps its secret and will not be known by the other party. Remember that KPA = (eA, nA) and kpA = (dA, nA). The same for the key pair of B.




We assume that A wants to send an M message confidentially to B through an insecure transmission medium. These are the steps you have to follow:

Get the public key of recipient B, (eB, nB)
Represents the clear text that you want to convey as a positive integer M <n
Compute the encrypted message: C = (M) eB mod nB
Finally, it transmits crypto message C through the insecure channel


When B receives the encrypted message C, it does the following:

Use your private key (dB, nB) to compute M = (C) dB mod nB
Retrieve the original text from its entire representative M



Suppose A wants to send a message to B. This message may be encrypted or not, but A is interested in “signing” the message so that B can be sure that the message that arrives has originated from A and not by any other entity The steps that A will follow are:

Create a digest (digest) of the message you want to send, using a hash function
Represent this summary as an integer M between 0 and n-1
Use your own private key (dA, nA) to compute the signature S = (M) dA mod nA
Send that signature S to receiver B together with the original message (which can be encrypted or not, as desired). Obviously, the signature S cannot be manipulated by anyone once generated, because if a single bit of the signature is changed, verification of the signature at destination will fail.


When B receives the signature S and the message from A, follow these steps:

Use the public key of sender A to compute V = (S) eA mod nA
From the integer V it obtains the summary r of the message as it was computed by A
In parallel, compute a summary r ‘of the message that has arrived using the corresponding hash function
If both summaries r and r ‘coincide, then the signature is verified. Then you can make sure that the message could only be originated by A and the message has also arrived in its entirety (without seeing its content altered during the transmission through the insecure channel) to B.

The digital signature can therefore guarantee the authentication of the origin, the integrity of the message and also the non-repudiation at origin, since if B has been able to retrieve the summary r of the message using the public key of A, the signature received together with the message can only have been generated with the secret private key of A.


Hash or summary functions

For long message authentication, the so-called hash or summary functions are used. These functions allow to obtain summaries (digests) of fixed size from different very long messages, with the property that each message will give rise to a different summary from the other messages.

In this way, instead of for example signing by means of the RSA algorithm a message that can occupy several pages, what is signed (encrypting it with the sender’s private key) is a summary number of the message, which is equivalent to the complete message.

A summary function r (m) to apply it to a message m, must meet the following properties:

r (m) is of fixed length, regardless of the length of m, and usually smaller than the length of the message.
Given m, it is easy to calculate r (m).
Given r (m), it is computationally intractable to recover m. That is why hash functions are also known as unidirectional functions. Its operation is not reversible.
Given m, it is computationally intractable to obtain another message m0 such that r (m) = r (m0). In this way, we know that for a value r (m), only one m that generated it can correspond.
In general, summary functions are based on functions analogous to information compression algorithms, which result in blocks of length n from blocks of length greater m.

Examples of summary functions used in cryptography are the MD5 algorithms, and the most secure SHA-1 used both today in digital signatures.