Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Can you give me an example of code optimization?

Optimization

Can you give me an example of code optimization?

by  sophisticate  Posted    (Edited  )
A while back, I was heading off on a long plane trip and needed something to keep myself occupied. I was on a bit of a cryptography kick (actually, I still am), so I figured that trying to make sense of some crypted messages would keep me busy. I wrote the following script to create vignere-encrypted texts about a half hour before I left. Since it was rushed, I programmed it poorly (as can be seen)... so, later, I decided to see what optimizations I could make. Here's what I did. But first, an explanation of the cipher.

The vignere cipher uses a technique known as polyalphabetic substitution. Basically, you have a block of alphabets that looks like this:
Code:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
BCDEFGHIJKLMNOPQRSTUVWXYZA
CDEFGHIJKLMNOPQRSTUVWXYZAB
DEFGHIJKLMNOPQRSTUVWXYZABC
EFGHIJKLMNOPQRSTUVWXYZABCD
FGHIJKLMNOPQRSTUVWXYZABCDE
etc...
...a plaintext, and a key. The letters of the key are written over the plaintext, repeated as necessary. For instance, if your key was "black" and the plaintext was "four score and seven years ago", you would notate it as:
Code:
blackblackblackblackblack
fourscoreandsevenyearsago
Then you take each column of letters and find the intersection on the grid -- that is your encrypted text. So the first letter 'f' would be encrypted with 'b' to result in 'g', the second letter 'o' would be encrypted with 'l' to result in 'z' and so on.

Now on to my first version of the code. We assume the subroutine vignere() will do the work, and that it will be passed the keyword as the first argument, and the plaintext as the second.
Code:
sub vignere($$) {
    my @ciphered;
    my($keyword,$text) = @_;
    $keyword =~ s/[^A-Za-z]//g;
    $text =~ s/[^A-Za-z]//g;
    my @keyChars =  split //, $keyword;
    my @plaintextChars = split //, $text;
    my $j = 0;
    my $max = @keyChars;
    foreach my $p(@plaintextChars) {
        my $k = $keyChars[$j];
        unless(++$j < $max) { $j = 0; }
        my $horizontal = indexOf($k);
        my $vertical = indexOf($p);
        my @alphabet = rotate($horizontal);
        my $cipherText = $alphabet[$vertical];
        push(@ciphered,$cipherText);
    }
    @ciphered;
}
sub rotate($) {
    my $pos = shift;
    my @charArray = ('A'..'Z');
    for($i = 0; $i < $pos; $i++) {
        push(@charArray,shift(@charArray));
    }
    @charArray;
}
sub indexOf($) {
    my $char = shift;
    my @charArray = ('A'..'Z');
    my $index;
    for ($i=0; $i < @charArray; $i++) {
        if ($charArray[$i] eq uc $char) {
            $index = $i;
            last;
        }
    }
    $index;
}
Our benchmark tells us this:
vignere: 12 wallclock secs (11.28 usr + 0.00 sys = 11.28 CPU)
(NOTE: The benchmarks in this FAQ are based on 1000 iterations of the subroutine)

Well, the first thing I realized was that the regexes on the plaintext and key were the same, so I could combine them. I also saw that it was ridiculous of me to have an indexOf() subroutine when the ord() function would work perfectly for getting the alphabetic index of a character. ord() returns a number based on the ASCII table wherein the capital letters begin at 65, so, to find the current index (distance from 'A'), we just need to subtract 65 from ord(letter).

New code:
Code:
sub vignere($$) {
    my @ciphered;
    my($keyword,$text) = @_;
    $_ =~ s/[^A-Za-z]//g for ($text,$keyword);
    my @keyChars =  split //, $keyword;
    my @plaintextChars = split //, $text;
    my $j = 0;
    my $max = @keyChars;
    foreach $p(@plaintextChars) {
        my $k = $keyChars[$j];
        unless(++$j < $max) { $j = 0; }
        my $horizontal = ord(uc $k) - 65;
        my $vertical = ord(uc $p) - 65;
        my @alphabet = rotate($horizontal);
        my $cipherText .= $alphabet[$vertical];
        push(@ciphered,$cipherText);
    }
    @ciphered;
}
sub rotate($) {
    my $pos = shift;
    my @charArray = ('A'..'Z');
    for($i = 0; $i < $pos; $i++) {
        push(@charArray,shift(@charArray));
    }
    @charArray;
}
Our benchmark tells us this:
vignere: 4 wallclock secs ( 4.78 usr + 0.00 sys = 4.78 CPU)
Nice improvement!

Next, I figured I might be able to drop the entire rotate() subroutine with a couple mathematical twists. Let's take an example from the coding diagram above. If my key character is 'E' (the 5th letter; ASCII 69) and my plaintext character is 'J' (the 10th letter; ASCII 74), the intersection on the diagram is 'N' (the 14th letter; ASCII 78). If I add the ASCII values of the key character and the plaintext character to obtain 143, the modulo 26 (number of letters in the alphabet) of which is 13, which when taking into account the zero, places us at the 14th character in the alphabet, 'N'.

New code:
Code:
sub vignere($$) {
    my @ciphered;
    my($keyword,$text) = @_;
    $_ =~ s/[^A-Za-z]//g for ($text,$keyword);
    my @keyChars = split //, $keyword;
    my @plaintextChars = split //, $text;
    my $j = 0;
    my $max = @keyChars;
    foreach my $p(@plaintextChars) {
        my $k = $keyChars[$j];
        unless(++$j < $max) { $j = 0; }
        my $cipherText = chr((ord(uc $k) + ord(uc $p))%26 + 65);
        push(@ciphered,$cipherText);
    }
    @ciphered;
}
Our benchmark tells us this:
vignere: 0 wallclock secs ( 0.44 usr + 0.00 sys = 0.44 CPU)
Getting there!

Now we've got the math out of the way (I can't think of a better way to do that), let's look at the control structures... Starting with that unless(++$j..) stuff. What's happening, at present, is that as each character in the plaintext is evaluated, a pointer advances to the next character in the keytext or resets to the first character if none are left. What if we didn't have to worry about knowing the position of the keytext, but rather let the position in the plaintext determine that on the fly? We can do that with the modulus operator -- the remainder of our current position in the plaintext divided by the length of the keytext will be our position in the keytext. That solved, let's move to the foreach section. It's looking like we're moving toward using numeric indices rather than actual characters, so we should probably just drop the foreach. We could use a for loop, but there's something better: map(). map() is a somewhat obscure Perl function, but probably one of the most useful. It applies a function to each member of an array and returns an array containing the results of the function, which is exactly what the foreach loop was doing. Now we'll consolidate that loop into the first argument of map, and use the values 0 to the length of the plaintext - 1 as the array we're traversing. Our last modification will be to pull the uc() commands out of the map. It's much more efficient to upper case the entire text rather than one character at a time, so we'll do this when the keyword and text variables are being initialized. We'll leave the regex the way it is because, believe it or not, the Perl interpreter is optimized in such a way that it is faster to search for upper case and lower case letters than it is to search for either by itself.

New code:
Code:
sub vignere($$) {
    my($keyword,$text) = (uc shift,uc shift);
    $_ =~ s/[^A-Za-z]//g for ($text,$keyword);
    my @keyChars = split //,$keyword;
    my @plaintextChars = split //,$text;
    map{ chr((ord($plaintextChars[$_])+ord($keyChars[$_%@keyChars]))%26+65) } 0..@plaintextChars-1;
}
Our benchmark tells us this:
vignere: 0 wallclock secs ( 0.33 usr + 0.00 sys = 0.33 CPU)
That's a big jump from what we started with.

Hope this gives people some ideas.

brendanc@icehouse.net
Register to rate this FAQ  : BAD 1 2 3 4 5 6 7 8 9 10 GOOD
Please Note: 1 is Bad, 10 is Good :-)

Part and Inventory Search

Back
Top