Frequency Analysis with Excel

Aim

You are going to use Excel VBA to develop frequency analysis tools.

Text Entry Box.

This workbook uses textboxes and command buttons to count the frequency distribution of the letters in the ciphertext. This is one of the first jobs that you should perform on a new cipher text as it often yields vital clues to the type of cipher that you are dealing with.

The workbook will shift the message into uppercase automatically and eliminate any punctuation.

This webpage assumes that you have already completed:

Open a new Excel workbook and get the worksheet into Design Mode by clicking on the Design Mode toggle.

Add a large text box and a command button to your workbook.
Copy and paste the frequency analysis results for standard English into your workbook.

LetterFrequency %
A 8.2
B 1.5
C 2.8
D 4.3
E 12.7
F 2.2
G 2.0
H 6.1
I 7.0
J 0.2
K 0.8
L 4.0
M 2.4
N 6.7
O 7.5
P 1.9
Q 0.1
R 6.0
S 6.3
T 9.1
U 2.8
V 1.0
W 2.4
X 0.2
Y 2.0
Z 0.1

The first block of VBA code is taken from the clean and case button of the Simple Shift Cipher. Part 2 project.
The only alteration needed is to change textbox2 in the final line to textbox1 as this project only needs one textbox.

Test your code to check that it strips out punctuation etc. and converts lower to uppercase.

Private Sub CommandButton1_Click()
newStr = ""
newStr1 = ""
sourceStr = TextBox1.Value
newStr = Format(sourceStr, ">")
For i = 1 To Len(sourceStr)
If (Asc(Mid(newStr, i, 1)) >= 65) Then
If (Asc(Mid(newStr, i, 1)) <= 90) Then
newStr1 = newStr1 + Mid(newStr, i, 1)
End If
End If
If (Asc(Mid(newStr, i, 1)) = 32) Then
newStr1 = newStr1 + Mid(newStr, i, 1)
End If
Next i
TextBox1.Value = newStr1
End Sub

Now we need to add the code to work through the ciphertext and count up occurences of each letter.
This code is added to the end of the last section of code, just before the closing End Sub line.

sourceStr = TextBox1.Value
For i = 0 To 25
Worksheets("Sheet1").Cells(6 + i, 28).Value = 0
Next i
For i = 1 To Len(sourceStr)
If ((Asc(Mid(sourceStr, i, 1)) >= 65) And (Asc(Mid(sourceStr, i, 1)) <= 90)) Then
letterCount = Worksheets("Sheet1").Cells(Asc(Mid(sourceStr, i, 1)) - 59, 28).Value + 1
Worksheets("Sheet1").Cells(Asc(Mid(sourceStr, i, 1)) - 59, 28).Value = letterCount
End If
Next i
End Sub

Note


The statistics are going to be written into the cells of column M starting at cell M6. To do this you use the Cells command. This takes worksheet references in the form (row,column) as numerical indices counting A as 1. M6 is therefore (6,13).
The first for next loop writes 0's into all the cells in column M.
The second for next loop then reads through the ciphertext. It finds the ASCII code of each letter and then subtracts 59 (65 = ASCII A - 6 for were the data table starts.) to use it as an index in the data array. The current data value is read and incremented before being written back into the cell.

Use Excel in the normal way to draw frequency graphs for normal English and the ciphertext.
I have added the code =SUM(M6:M31) to cell M32 and then used this value to generate the %'s for my graph.

Use this code =100*M6/M$32 in cell N6 to calculate the percentage of A's in the text. Copy it down the column for the other letters.

last updated 29th June 2007