Substitution Cipher

Monoalphabetic substitution encryption

A monoalphabetic substitution such as the Caesar cipher can be expressed as a linear congruence, known in cryptography as an affine transformation. The Caesar cipher is written E(M) = (M + 3) mod N, where M is the numerical value of a plaintext character and N is the length (cardinality) of the original alphabet.

The affine transformation extends this to the more general linear congruence:

E_(a,b)(M) = (aM + b) mod N
where M is the numerical value of a character of the original alphabet, and a and b are two integers smaller than the cardinality N of the alphabet. In addition, a and N must be coprime, that is, gcd(a, N) = 1; otherwise, different letters of the original alphabet would map to the same letter of the cipher alphabet and decryption would be ambiguous. The encryption key k is then the pair (a, b).
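The coprimality condition can be checked numerically. The following is a minimal sketch, assuming a 27-letter alphabet (Spanish with Ñ) and letters numbered from 0; the function names are illustrative only.

```python
from math import gcd

def affine_image(a, b, n):
    """Return E(M) = (a*M + b) mod n for every M in 0..n-1."""
    return [(a * m + b) % n for m in range(n)]

def is_bijective(a, b, n):
    """True when no two plaintext values share the same ciphertext value."""
    return len(set(affine_image(a, b, n))) == n

n = 27  # assumption: Spanish alphabet including Ñ
for a in (2, 3):
    # a = 2 is coprime with 27; a = 3 is not, so some letters collide
    print(a, gcd(a, n), is_bijective(a, 5, n))
```

With N = 27, every multiple of 3 is ruled out as a value of a, which leaves 18 usable coefficients.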

a is a constant that determines the separation between two letters of the cipher alphabet whose originals are consecutive in the original alphabet. This constant is called the coefficient or decimation factor. b is a constant that determines the offset (shift) between the letters of the plaintext and the corresponding letters of the ciphertext.
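To make the roles of a and b concrete, the sketch below prints the cipher alphabet produced by a given key. It assumes the 27-letter Spanish alphabet with A numbered 0; cipher_alphabet is a hypothetical helper, not part of any library.

```python
ALPHABET = "ABCDEFGHIJKLMNÑOPQRSTUVWXYZ"  # assumption: 27-letter Spanish alphabet
N = len(ALPHABET)

def cipher_alphabet(a, b):
    """Cipher letter assigned to each plaintext letter, in alphabet order."""
    return "".join(ALPHABET[(a * m + b) % N] for m in range(N))

print(ALPHABET)
print(cipher_alphabet(1, 3))  # pure offset: every letter shifted by 3
print(cipher_alphabet(2, 0))  # pure decimation: every second letter is taken
```

With a = 1 only the offset acts, and with b = 0 only the decimation acts; a general key combines both.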

The Caesar cipher is therefore an affine transformation with key k = (1, 3).
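Encryption and decryption under a key (a, b) can be sketched as follows. This is only an illustration under stated assumptions: the 27-letter Spanish alphabet, letters numbered from 0, characters outside the alphabet copied unchanged, and Python 3.8 or later for the modular inverse via pow.

```python
from math import gcd

ALPHABET = "ABCDEFGHIJKLMNÑOPQRSTUVWXYZ"  # assumption: 27-letter Spanish alphabet
N = len(ALPHABET)

def affine_encrypt(text, a, b):
    """Apply E(M) = (a*M + b) mod N to every letter of text."""
    if gcd(a, N) != 1:
        raise ValueError("a must be coprime with N")
    return "".join(
        ALPHABET[(a * ALPHABET.index(c) + b) % N] if c in ALPHABET else c
        for c in text.upper()
    )

def affine_decrypt(text, a, b):
    """Invert the cipher: M = a^-1 * (C - b) mod N."""
    a_inv = pow(a, -1, N)  # modular inverse of a; raises ValueError if none exists
    return "".join(
        ALPHABET[(a_inv * (ALPHABET.index(c) - b)) % N] if c in ALPHABET else c
        for c in text.upper()
    )

caesar = affine_encrypt("HOLA", 1, 3)   # Caesar cipher as the key (1, 3)
print(caesar)                            # KRÑD
print(affine_decrypt(caesar, 1, 3))      # HOLA
```

Decryption needs the inverse of a modulo N, which exists exactly when gcd(a, N) = 1; this is the same condition that keeps the encryption unambiguous.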


Cryptanalysis of Monoalphabetic Encryption Methods

Monoalphabetic ciphers are the easiest family of cryptographic methods to cryptanalyze, because the statistical properties of the plaintext are preserved in the cryptogram. For example, the most frequent letter in Spanish is E, so the most frequent letter in the ciphertext most likely corresponds to E. By matching the relative frequency of each symbol in the encrypted message against the frequency histogram of the language the plaintext is presumed to be written in, we can recover the key easily, provided the ciphertext is long enough for the statistics to be reliable.
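For an affine key, two letter correspondences are enough to solve for a and b. The sketch below assumes that the two most frequent ciphertext letters stand for E and A (the two most frequent letters in the table that follows), and that the alphabet is the 27-letter Spanish one; guess_affine_key is a hypothetical name.

```python
from collections import Counter

ALPHABET = "ABCDEFGHIJKLMNÑOPQRSTUVWXYZ"  # assumption: 27-letter Spanish alphabet
N = len(ALPHABET)

def affine_encrypt(text, a, b):
    """E(M) = (a*M + b) mod N; used here only to build a test cryptogram."""
    return "".join(
        ALPHABET[(a * ALPHABET.index(c) + b) % N] if c in ALPHABET else c
        for c in text.upper()
    )

def guess_affine_key(ciphertext, first="E", second="A"):
    """Guess (a, b) from the two most frequent ciphertext letters."""
    counts = Counter(c for c in ciphertext.upper() if c in ALPHABET)
    (c1, _), (c2, _) = counts.most_common(2)
    m1, m2 = ALPHABET.index(first), ALPHABET.index(second)
    y1, y2 = ALPHABET.index(c1), ALPHABET.index(c2)
    # Subtracting the two congruences gives a*(m1 - m2) = y1 - y2 (mod N)
    a = (y1 - y2) * pow((m1 - m2) % N, -1, N) % N
    b = (y1 - a * m1) % N
    return a, b

sample = "EEEEEEAAAAOOLSN"  # toy plaintext in which E and A clearly dominate
print(guess_affine_key(affine_encrypt(sample, 5, 8)))  # (5, 8)
```

On real text the guess may be wrong, for instance when the sample is short or the two top letters are swapped, so the candidate key should be confirmed by decrypting and checking that the result reads as plaintext.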

Distribution of letter frequencies for a literary text in Spanish

E – 16.78% R – 4.94% Y – 1.54% J – 0.30%
A – 11.96% U – 4.80% Q – 1.53% Ñ – 0.29%
O – 8.69% I – 4.15% B – 0.92% Z – 0.15%
L – 8.37% T – 3.31% H – 0.89% X – 0.06%
S – 7.88% C – 2.92% G – 0.73% K – 0.00%
N – 7.01% P – 2.77% F – 0.52% W – 0.00%
D – 6.87% M – 2.12% V – 0.39%
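The frequencies in the table can also drive an exhaustive search over the small affine key space. The sketch below is one possible approach, not necessarily the one used by the original application: it scores every candidate decryption with a chi-squared statistic against the table and keeps the best key. The 27-letter alphabet, the 0.01% floor for K and W, and all function names are assumptions.

```python
from math import gcd

ALPHABET = "ABCDEFGHIJKLMNÑOPQRSTUVWXYZ"  # assumption: 27-letter Spanish alphabet
N = len(ALPHABET)

# Percentages copied from the table above.
SPANISH_FREQ = {
    "E": 16.78, "A": 11.96, "O": 8.69, "L": 8.37, "S": 7.88, "N": 7.01, "D": 6.87,
    "R": 4.94, "U": 4.80, "I": 4.15, "T": 3.31, "C": 2.92, "P": 2.77, "M": 2.12,
    "Y": 1.54, "Q": 1.53, "B": 0.92, "H": 0.89, "G": 0.73, "F": 0.52, "V": 0.39,
    "J": 0.30, "Ñ": 0.29, "Z": 0.15, "X": 0.06, "K": 0.00, "W": 0.00,
}

def affine_decrypt(text, a, b):
    """M = a^-1 * (C - b) mod N for every letter of text."""
    a_inv = pow(a, -1, N)
    return "".join(
        ALPHABET[(a_inv * (ALPHABET.index(c) - b)) % N] if c in ALPHABET else c
        for c in text.upper()
    )

def chi_squared(text):
    """Distance between the letter counts of text and the table (lower is closer)."""
    letters = [c for c in text.upper() if c in ALPHABET]
    total = len(letters)
    if total == 0:
        return float("inf")
    score = 0.0
    for c in ALPHABET:
        # Floor the expected share so K and W (0.00%) do not divide by zero
        expected = max(SPANISH_FREQ[c], 0.01) * total / 100
        score += (letters.count(c) - expected) ** 2 / expected
    return score

def best_key(ciphertext):
    """Try every valid (a, b) and return the key whose decryption scores lowest."""
    keys = [(a, b) for a in range(1, N) if gcd(a, N) == 1 for b in range(N)]
    return min(keys, key=lambda k: chi_squared(affine_decrypt(ciphertext, *k)))
```

With N = 27 there are only 18 × 27 = 486 keys to try, so the search is immediate; its reliability depends on how closely the plaintext follows the table, which improves with message length.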