
Reverse the Caesar Cipher in Excel 2010

Status
Not open for further replies.

luxvivens

Technical User
May 20, 2010
27
CA
Using a formula (see below) that I found in the Crystal Reports section of this site, I apply a cipher to disguise the surgeon name when I upload data to an outside vendor.

Later on, I am able to download all the data sent to the vendor and save it as a CSV for subsequent review. However, now I wish to know the actual name of the surgeon in this file. So my question is: is it possible to reverse the cipher for the surgeon name in Excel 2010 so that it appears as the correct name? If so, any ideas on how to do this?

The cipher adds 5 characters to each letter. For example “DOE” becomes “ITJ”. So, if I have ITJ in cell A1, how does one turn it back to DOE? Here is the formula as used in Crystal Reports 10, if that is helpful:

//The Caesar cipher
//The input string to encrypt
Local StringVar inString :=split({pcmProcedure.mdName} , ",")[1];
Local NumberVar shift := 5;
Local StringVar outString := "";
Local NumberVar i;
For i := 1 To Length(inString) Do
(
Local StringVar inC := inString[i];
Local StringVar outC;
Local BooleanVar isChar :=
LowerCase(inC) In "a" To "z";
Local BooleanVar isUCaseChar :=
isChar And (UpperCase (inC) = inC);
inC := LCase(inC);
If isChar Then
(
Local NumberVar offset :=
(Asc(inC) + shift - Asc("a")) Mod
(Asc("z") - Asc("a") + 1);
outC := Chr(offset + Asc("a"));
If isUCaseChar Then outC := UpperCase(outC)
)
Else
outC := inC;
outString := outString + outC
);
outString
 



Hi,

Suppose you explain your cipher in plain English, rather than a bunch of obscure code?

Skip,
Just traded in my old subtlety...
for a NUANCE!
 


>The cipher adds 5 characters to each letter.
Oh, you mean it adds 5 to the ASCII value.

So what happens with V, W, X, Y or Z?

Skip,
 
Couldn't you just build a cipher table of all your surgeons and do a lookup?
 
I haven't tested this, but I would try changing this:
Code:
Local NumberVar shift := 5;
to this:
Code:
Local NumberVar shift := -5;
and run the subroutine on the encrypted text.
 
Or try:
Code:
Local NumberVar shift := 21;
and run it on the encrypted text to bring the cipher back full circle (assuming a 26 letter alphabet).
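For what it's worth, here is a minimal Python sketch of the same shift logic. It is a stand-in for the Crystal formula, not a tested translation of it, and the function name `caesar_shift` is made up for illustration. It shows that a shift of 21 undoes a shift of 5 on a 26-letter alphabet.

```python
def caesar_shift(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` places, wrapping within the alphabet.

    Case is preserved and any non-letter character is passed through unchanged.
    """
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)
    return "".join(result)


encoded = caesar_shift("DOE", 5)       # "ITJ"
decoded = caesar_shift(encoded, 21)    # "DOE" again, since 5 + 21 = 26
```

Here `caesar_shift("DOE", 5)` gives `"ITJ"` and `caesar_shift("ITJ", 21)` gives `"DOE"` again.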
 
>Oh, you mean it adds <x> to the ASCII value

Indeed. As the OP said, it's just a simple Caesar cipher.

>-5

Won't quite work without modification, because the slightly convoluted way the shifted character is calculated assumes that the character shift is always positive.

But essentially that is exactly what we want to do - subtract 5 from the ASCII value of each character (rolling back round to the end of the alphabet if that causes us to drop below 'a').
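A rough Python sketch of that subtraction with wraparound follows. The helper `truncated_mod` imitates a remainder that keeps the sign of the dividend, as VBA's `Mod` does; whether Crystal's `Mod` behaves the same way is an assumption here, and all the names are illustrative. Adding 26 before taking the remainder keeps the offset non-negative either way.

```python
import math

ALPHABET_SIZE = 26


def truncated_mod(a: int, n: int) -> int:
    """Remainder that keeps the sign of `a` (VBA-style; assumed for Crystal)."""
    return int(math.fmod(a, n))


def decode_char(ch: str, shift: int = 5) -> str:
    """Undo a Caesar shift of `shift` (0..26) on one character."""
    if not (ch.isascii() and ch.isalpha()):
        return ch
    base = ord("A") if ch.isupper() else ord("a")
    offset = ord(ch) - base - shift  # negative for the first few letters
    # Add one full alphabet before the remainder so the result is 0..25.
    offset = truncated_mod(offset + ALPHABET_SIZE, ALPHABET_SIZE)
    return chr(base + offset)


def decode_text(text: str, shift: int = 5) -> str:
    return "".join(decode_char(ch, shift) for ch in text)


naive = truncated_mod(-5, ALPHABET_SIZE)  # stays negative: the problem case
name = decode_text("ITJ")                 # back to the original letters
```

With a sign-preserving remainder, `truncated_mod(-5, 26)` stays at -5, which is why plugging -5 straight into the original formula would step below 'a'.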
 
... the sad thing is that it will probably take you longer to type the code to decode your Caesar cipher than it would take an intelligent 12-year-old to decode it manually, armed with nothing more than a pencil.

There are much better low-hassle ways to encode things. A very, very simple approach that is already much more secure is to use any reputable random number generator starting at a known seed value (possibly derived from a password) and exclusive-or successive bytes with successively generated numbers. This, of course, has the side-benefit that encoding and decoding can be done with the same procedure.
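As a rough illustration of that idea, here is a Python sketch that XORs each byte with the next value from a seeded generator. It uses Python's `random.Random` purely for convenience; that generator is not cryptographically secure, and the function name and seed are made up for the example.

```python
import random


def xor_with_prng(data: bytes, seed: int) -> bytes:
    """XOR each byte of `data` with the next byte from a seeded generator.

    Running the function twice with the same seed returns the original data,
    so one routine serves for both encoding and decoding.
    """
    rng = random.Random(seed)  # same seed gives the same byte sequence
    return bytes(b ^ rng.randrange(256) for b in data)


seed = 123456  # hypothetical seed; could be derived from a password
scrambled = xor_with_prng(b"DOE", seed)
restored = xor_with_prng(scrambled, seed)
```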
 
Thanks everyone for the helpful feedback. Several ways to skin a cat. As a first solution, I am going to build a cipher table and do a lookup as suggested by GRUUUU. Thank you Skip, JGES and StrongM for your solutions - they seem like they would work as well.
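A cipher table of this kind might look roughly like the Python sketch below, assuming the list of surgeon names is known locally. The names are hypothetical; in Excel the same table could sit on a worksheet and be queried with VLOOKUP.

```python
import string

SHIFT = 5
lower = string.ascii_lowercase
upper = string.ascii_uppercase

# Translation table for the shift-by-5 cipher, covering both cases.
ENCODE = str.maketrans(
    lower + upper,
    lower[SHIFT:] + lower[:SHIFT] + upper[SHIFT:] + upper[:SHIFT],
)

surgeons = ["DOE", "SMITH", "NGUYEN"]  # hypothetical local list of names

# Map each enciphered name back to the real one.
lookup = {name.translate(ENCODE): name for name in surgeons}

real_name = lookup["ITJ"]  # the value found in the vendor's CSV
```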
 
Another idea (possibly overkill for this application) might be to use genuine, industrial-strength encryption (substitution codes using a random number generator with the same seed are really not much more difficult to crack than a Caesar cipher; a 'known text attack' will do the trick). Windows comes with built-in cryptography routines, such as AES, and they are not hard to use. Here's an example:
Code:
    ' requires reference to CAPICOM library
    Dim Crypto As New EncryptedData
    Dim result As String
    
    Crypto.Algorithm = CAPICOM_ENCRYPTION_ALGORITHM_AES
    Crypto.SetSecret "mypassword"
    Crypto.Content = "Consultant's Name"
    result = Crypto.Encrypt
    MsgBox result ' encrypted name
    
    Crypto.Content = " " ' clear original content so we can see decrypting works ...
    Crypto.Decrypt (result)
    result = Crypto.Content
    MsgBox result ' and now we've decrypted it back
 
Strongm, entirely endorsed! I personally feel that weak security is worse than none, because it misleads the uninformed into thinking they are secure.

As an example, a decade ago I encountered a piece of lab equipment where each user could set up their own, password-protected methods. Naturally most users chose to use the same password as they were already using for their login to their PC/e-mail etc.; there's a limit to how many passwords a human can remember. Unfortunately the software stored all its passwords in a plain text file (but it was OK, because the file wasn't called passwords.txt. It was called apwpuight.4rt or something, so no one would realise, would they?). Net result, entire site security was totally compromised because anybody who had used that instrument could know the passwords being used by a wide range of individuals, including those with admin rights over all sorts of things. (It's actually worse than that: people tend not to change the way they make up passwords radically. If a person uses his daughter's name, then when he is told to change it, he'll probably use his son's. If you've once known someone's password, you are in a much stronger position to guess it in the future.)

So if you must encrypt, try to do it well. Strongm is right.
 
... by the way, strongm, can I just defend the concept of random-number generator xor encoding versus Caesar ciphers? I agree they're far from super-strong, but "really not much more difficult to crack" does them a bit of an injustice.

In order to crack them using a 'known text attack' you've got to pick a bit of known text you expect to find, and encode it with every key that is possible, and pick keys that give a result containing the desired text (if I'm understanding the method correctly). The arithmetic of looking for a name like "smith" in the face of a 64-bit random number generator isn't pleasant for the casual snooper, though I don't suppose the FBI would bat an eyelid before they got it.
 
> a 64-bit random number generator

Think really hard about this in terms of a plain text ASCII/ANSI substitution system.


 
eh, been thinking, not coming up with much though. Our random number generator is absolutely not limited by the size of the alphabet we're using. The point of using a 64 bit generator is that there are 2^64 possible keys, which means 2^64 possible sequences of bytes derived from the generator and used in encoding. The idea isn't to create a random substitution table that stays the same for the entire encoding process. The idea is to make sure that each successive substitution is different, so we're not looking at aaa -> bbb, we're looking at aaa -> abc.

The benefit of this arithmetically is that in the case of hunting for "smith" in a known-text attack, the chance of a random match is 1 in 26^5, while the number of keys to search is 2^64, and fortunately (2^64)/(26^5) is a biggish number (about 1.5*10^12 I think), so hopefully most of the hits that the "smith" search returns will actually be wrong keys that produced "smith" by accident.
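The arithmetic is easy to check. The few lines of Python below only restate the estimate; they assume, as the post does, that a wrong key produces any given five-letter string with probability 1 in 26^5.

```python
seeds = 2 ** 64             # possible starting points for a 64-bit generator
five_letter_strings = 26 ** 5   # possible 5-letter strings over a 26-letter alphabet

# Expected number of seeds that turn the ciphertext into "smith" by accident.
expected_false_hits = seeds / five_letter_strings

print(f"{five_letter_strings:,} five-letter strings")
print(f"about {expected_false_hits:.2e} accidental hits expected")
```

That comes out at roughly 1.55*10^12, in line with the figure quoted.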
 
I think that the lot of you have gone way overboard here.

The situation does not call for insane security. It could easily be solved by assigning random numbers to surgeons and storing those in a table locally.

 
> way overboard here

As I said - overkill for this application. However these are discussion forums rather than purely tech support forums, and a potentially interesting discussion was arising.
 
Let me not discourage anyone from contemplating the merits of technology by any means.

I just find it an all too common occurrence for tech professionals to use a sledge hammer on a thumb tack. (no offense intended)

It's been my burgeoning understanding that the IT field was concerned with the most cost-efficient application of the best solution.

 
>The point of using a 64 bit generator is that there are 2^64 possible keys, which means 2^64 possible sequences of bytes

This is not correct. Yes, a 64-bit random number generator is capable of generating up to 2^64 different numbers. This has nothing directly to do with keys/sequences. (Yes, you can build a sequence from your random numbers, which we can call a key, but its length has nothing to do with the generator.) The maximum key size that you can build from the generator is dependent on how many numbers are generated before the generator repeats itself.

>not limited by the size of the alphabet

Again, not correct. Working with plain text you only have 96 characters to work with - but let's be generous and allow the entire ANSI range - so 256 characters. Given you are doing a simple substitution, there are only 256 substitutions you can do for each character. So you may well be able to generate 2^64 numbers, but our alphabet size effectively collapses this to 256 values (using your XOR suggestion, for example, means that 274877906865, 26132 and 177 would all generate the same encrypted character)
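To make the collapse concrete, here is a small Python sketch that assumes the generator's output is reduced to one byte by keeping its low 8 bits (one possible reduction; the posts don't specify one). Under that assumption, 274877906865 and 177 both reduce to 177 and so encrypt a character identically, while 26132 reduces to 20 and would only collide with them under some other reduction.

```python
def key_byte(n: int) -> int:
    """Reduce a generator output to a single byte by keeping the low 8 bits."""
    return n & 0xFF


def encrypt_char(ch: str, n: int) -> str:
    """XOR one single-byte character with the key byte derived from `n`."""
    return chr(ord(ch) ^ key_byte(n))


for n in (274877906865, 26132, 177):
    print(n, "->", key_byte(n))

same = encrypt_char("A", 274877906865) == encrypt_char("A", 177)
```

However large the generator's output, only 256 distinct key bytes can ever be applied to a given character.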
 
Everybody: yes, you're right, all this app needs is something simple, but hey, life's fun! Why not design a sledgehammer while we're at it?!

strongm, I'm sorry, I don't think I'm expressing myself well, and I think I'm not getting the point across.
(1) For a 64-bit random number generator based on the usual sort of pseudo-random sequence generator, the idea is to pick a generating function that makes it cycle through all 2^64 possible values. What I meant by "key" was actually "random number seed" (sorry). I meant, your generator provides a cyclic list of 64-bit numbers that is 2^64 values long, and the seed is any starting point on that cyclic list. Obviously there are loads of generators out there that have 64 bits but badly-chosen generating functions with the result that they repeat in less than 2^64, but I assume you pick a good one!

(2) From each 64-bit number in our list, we can generate an 8-bit value. The idea is that from any given 8-bit number, it's really hard to predict the next 8-bit number in the sequence (because even if you know the generating function, you still lack 56 bits of information; given a single byte, there are 2^56 places in the cyclic list where you could currently be!). In effect, what I use the generator to do is produce a random string of bytes "aopwiehjkgrwepaaaliwujh" as long as the text to be encoded, and exclusive or each successive byte of the text to be encoded with the corresponding byte from my string.

(3) It's this 64-bit to 8-bit conversion that is the reason why the alphabet size doesn't matter. Actually the alphabet size should be very much smaller than the generator, otherwise we're stuffed.

(4) Be clear: there is no "key" in the Caesar cipher sense. There is no single translation table of 26 letters to 26 other letters. Each successive letter in the original text is exclusive-ORed with something unique and hard to predict from the sequence so far.

(5) In conventional cipher terms, this is equivalent to the cipher where you encode text using text taken from another book (classically a random letter sequence supplied as a one-time-use booklet to a spy, the corresponding booklet being kept by his contact - but you can use a well-known novel if you prefer!) It's hard to decode even if you have the book, because you need to know what page-number to start on. If you don't have the book, it's actually technically absolutely impossible.

(6) Obviously if you have 8 bytes from the sequence I generate, you know exactly where in the cyclic list you are. But you don't know the bytes I'm using to encode unless you know the text with which they've been XORed. This means that a 64 bit generator is vulnerable to a known-text attack if you have a known 8-byte text, and can be bothered to check 2^63 times, on average. It's a fairly tall order for a desk-top PC and a casual snooper, and of course I can happily switch to a 128 bit generator.
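The known-text search in (6) can be sketched at toy scale in Python. Everything here is an assumption made for illustration: a small 32-bit linear congruential generator stands in for the 64-bit one, the seed is limited to 16 bits so the search finishes quickly, and the known text is assumed to sit at the start of the message.

```python
MASK32 = 0xFFFFFFFF


def keystream(seed: int, length: int) -> bytes:
    """Produce `length` key bytes from a toy 32-bit LCG (top 8 bits of each state)."""
    state = seed & MASK32
    out = bytearray()
    for _ in range(length):
        state = (1664525 * state + 1013904223) & MASK32
        out.append(state >> 24)
    return bytes(out)


def xor_stream(data: bytes, seed: int) -> bytes:
    """Encode or decode by XORing `data` with the keystream for `seed`."""
    return bytes(d ^ k for d, k in zip(data, keystream(seed, len(data))))


def find_seeds(ciphertext: bytes, known: bytes, seed_bits: int = 16) -> list:
    """Try every seed; keep those whose decryption starts with `known`."""
    prefix = ciphertext[: len(known)]
    return [s for s in range(1 << seed_bits) if xor_stream(prefix, s) == known]


SECRET_SEED = 40321  # hypothetical seed, within the 16-bit toy range
ciphertext = xor_stream(b"smith, john", SECRET_SEED)
candidates = find_seeds(ciphertext, b"smith")
```

With only 2^16 seeds the search is instant; the argument in (6) is that the same loop over 2^64 seeds is out of reach for a casual snooper, not that the method differs.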
 