A Caesar cipher, also known as Caesar's cipher, the shift
cipher, Caesar's code or Caesar shift, is one of the simplest and widely known
encryption techniques. In this type of substitution cipher each letter in the
plaintext is replaced by a letter some fixed number of positions down the
alphabet. For example, with a left shift of 3, D would be replaced by A, E
would become B, and so on. The method is named after Julius Caesar, who used it
in his private correspondence.
As with all single alphabet substitution ciphers, the Caesar
cipher is easily broken and in modern practice offers essentially no
communication security.
Example
The transformation can be represented by aligning two
alphabets; the cipher alphabet is the plain alphabet rotated left or right by
some number of positions. For instance, here is a Caesar cipher using a left
rotation of three places, equivalent to a right shift of 23 (the shift
parameter is used as the key):
Plain: ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher: XYZABCDEFGHIJKLMNOPQRSTUVW
When encrypting, a person looks up each letter of the message in the
"plain" line and writes down the corresponding letter in the
"cipher" line. Deciphering is done in reverse, with a right shift of
3.
Ciphertext: QEB NRFZH YOLTK CLU GRJMP LSBO QEB IXWV ALD
Plaintext: the quick brown fox jumps over the lazy dog
Breaking the Cipher
The Caesar cipher can be easily broken even in a cipher text-only
scenario. Two situations can be considered:
- An attacker knows (or guesses) that some sort of simple substitution cipher has
been used, but not specifically that it is a Caesar scheme;
- An attacker knows that a Caesar cipher is in use, but does not know the shift
value.
In the first case, the cipher can be broken using the same
techniques as for a general simple substitution cipher, such as frequency
analysis or pattern words. While solving, it is likely that an attacker will
quickly notice the regularity in the solution and deduce that a Caesar cipher
is the specific algorithm employed.
In the second instance, breaking the scheme is even more
straightforward. Since there are only a limited number of possible shifts (26
in English), they can each be tested in turn in a brute force attack. One way
to do this is to write out a snippet of the cipher text in a table of all
possible shifts — a technique sometimes known as "completing the plain
component".
Decryption
shift
|
Candidate
plaintext
|
0
|
exxegoexsrgi
|
1
|
dwwdfndwrqfh
|
2
|
cvvcemcvqpeg
|
3
|
buubdlbupodf
|
4
|
attackatonce
|
5
|
zsszbjzsnmbd
|
6
|
yrryaiyrmlac
|
...
|
23
|
haahjrhavujl
|
24
|
gzzgiqgzutik
|
25
|
fyyfhpfytshj
|
The example given is for the cipher text
"EXXEGOEXSRGI"; the plaintext is instantly recognizable by eye at a
shift of four. Another way of viewing this method is that, under each letter of
the cipher text, the entire alphabet is written out in reverse starting at that
letter. This attack can be accelerated using a set of strips prepared with the
alphabet written down them in reverse order. The strips are then aligned to
form the cipher text along one row, and the plaintext should appear in one of
the other rows.
Another brute force approach is to match up the frequency
distribution of the letters. By graphing the frequencies of letters in the cipher
text, and by knowing the expected distribution of those letters in the original
language of the plaintext, a human can easily spot the value of the shift by
looking at the displacement of particular features of the graph. This is known
as frequency analysis.
Enhancing the Cipher
Based on the explanation provided we see that main problem
with Caesar cipher is predictability. Two main issues that cause this
predictability are:
- Fixed index for shifting characters
- Knowledge about character at a given index.
We can enhance the strength of the cipher by overcoming
these 2 issues. Let us look into these 2 issues and how we can overcome them:
Fixed index for shifting characters
As you read earlier in Caesar cipher each letter in the
plaintext is replaced by a letter some fixed number of positions down the
alphabet. Thus if attacker manages to figure out shift index he can easily
break the cipher text. We can overcome this making the shift index variable.
Consider the example shown in the table above. Instead of using a fixed index
of 4, the index could be length of the string I.e 12. Using such variable index
will make the cipher text difficult to predict. In addition to this advantage,
two similar word/phrase with different lengths would have different cipher
texts. For example attackatonce and attackonce
will have no pattern in common even though the word “attack” and “once”
are being repeated. This happens because the string length is different thus
the shift index will be different.
Knowledge about character at a given index
In our previous example we also observed while shifting the
alphabets, the alphabet at new position occurs as per alphabetical order of
A-Z. Thus if attacker manages to figure out the shift index he can easily find out
all the characters in the given cipher text. Besides making the shift index
dynamic we can also shuffle the alphabetical order by which the alphabets won’t
be substituted as per the default alphabetic order. For example on a shift
index of 4 if we were suppose to substitute character “a”, by alphabetical
order the new character would be “d”. However if we do not follow alphabetical
order and replace the characters basis of shuffled set of characters it will make
it difficult for attacker to derive the plain text. For this we need to have a
shuffled set of alphabets.
Implementation
Enhanced Caesar cipher can be implemented in following
manner:
- Determine the shift index on the basis of string
length.
- Determine the position of character to be
replaced in the shuffled string.
- Add the shift index to the position to determine
the position of new character.
- Determine the character at new position. If the
new position is more than the length of the shuffled set then loop again after
the loop index reaches the length of shuffled set.
For example
The plain text to be ciphered is “america”. The shuffled
subset is “adwxyzefijklmnoghpqrstbcuv”. The cipher for this will be “frnagyf”. In
this example “a” is substituted with “e”. For this substitution we 1st
determine the position of “a” in shuffled subset. This is 1. Now we need to
determine the shift index, which is length of string i.e. 7. Thus with a right
shift of 7,”a” gives “f” as per the shuffled set. Similar substitution process
will be applied to each character in the given word.
The Outcome
An enhanced cipher technique derived from a simple
substitution cipher is much effective compared to its original version.