silverhairb
IS-IT--Management
Part 1
Background
Passwords are often masked and stored in databases to keep them from being compromised through casual observation. Using MD5 as a masking agent provides a one-way method of creating this masking effect. However, creating the simple MD5 hash of a password does not fully secure it or prevent its compromise. The following is a discussion of the methods employed to mask passwords using hash algorithms, vulnerability of passwords stored in this manner and possible attack scenarios, and a possible solution focused on the Cisco environment.
Data Digest / Hashing
Hashing is a non-reversible cryptographic technique that compresses input data to a fixed length output. The algorithms are designed such that the input to the algorithm cannot be created from the output, but the output can be recreated by using the same input. Even if the length of the input data is different than the length of the output, the output is such that the length of the input data cannot be determined solely by examining the output.
Since the algorithms for hashing data are public, they are easily accessible. Every PC and server use these algorithms routinely for data verification. A MAC is one way of verifying data and is generated using a hash algorithm. Hashes are also used as input for purposes of creating digital signatures over data or as a method of verification such as used in digital certificates. For a given data stream, a hashing algorithm will generate a single fixed value. It seems intuitive that for data input greater than the length of the hash value, some data streams will generate the same hash value, or “collisions”, as others. Good hashing algorithms defeat the ability to predict what circumstances will produce these collisions.
There are several commonly used hash algorithms. These include MD5, SHA-1 and various later generations of SHA. For purposes of this discussion, we will focus on MD5 since it is used in the masking of passwords in Cisco routers.
MD5 (Message Digest number 5) was developed by Ron Rivest of MIT as a hash routine used in the generation of digital signatures using public key cryptography. MD5 generates a 16-byte binary stream most often depicted using hexadecimal or special characters. Several years ago, MD5 was proven mathematically to have a degree of predictability to generating collisions making it undesirable for use in digital signatures. Recent work has shown that using MD5 to generate the hash used in SSL has resulted in undermining the security framework on which SSL relies.
Attack Scenarios
(Recall that for a given data stream, a single repeatable hash value will be produced. Passwords are data streams, typically less than 16 characters long consisting of a mixture of alpha-numeric and special characters.)
It is possible for an attacker to begin to produce a large sample of hash values by running sample cleartext passwords through MD5 and recording the results. This is essentially a “dictionary” of hash values. As the dictionary size increases, passwords generated using MD5 become more susceptible. This dictionary attack is the reason why cryptographic keys used to encrypt PINs in ATMs and point of sale (POS) PIN entry devices are changed regularly. Otherwise, an attacker could begin capturing transactions (with encrypted PINs and debit card mag stripe images) from a device, enter a number of PINs into that device capturing his own transactions and then wait for transactions with encrypted PIN blocks of the same value as the PINs he entered. He would then have a debit transaction with the card’s mag stripe image and the PIN that goes with that card.
Another attack is a simple brute force attack against the password itself as the input value of the hash. This attack is similar to the dictionary attack but is a trial and error attack trying different possible passwords until there is a match to the hash.
The question to be asked about these attacks has to be the effort involved. Hashing uses fairly simple linear math making it relatively streamlined. This makes it efficient in use, like the data encryption algorithm (DEA) commonly known as DES. In 2003 it was estimated that a successful brute force attack against 56-bit single DES would take one hour or less using about $20,000 worth of commonly available PC equipment. Since that time, attack times have diminished with improvements in technology (Moore’s law) and lower prices.
56-bit DES has 2 raised to the power of 56 possible combinations or about 7 quadrillion keys. Given most passwords are less than 16 characters and that there are about 40 characters that are available, the number of possible passwords is about 1000 times that of DES keys. Most passwords use fewer than 10, their values would fit into a dictionary on a PC with a couple of large hard drives. The dictionary could be developed in a matter of hours. This would be used to attack most password hashes.
The more valuable a password, the more money can be spent on the tools needed to successfully discover its value from a known hash. Passwords longer than 16 characters could possibly yield a hash of the same value (collision) as a shorter password diminishing the value of longer passwords and reducing the attack time. If passwords are case-sensitive, attack times would be increased dramatically, but still allow a successful dictionary attack of shorter passwords, and collisions have to be considered a vulnerability to attack.
An important point is that it’s not necessary to actually discover the real password as much as it is necessary to find a combination of data that would generate the same hash value.
Background
Passwords are often masked and stored in databases to keep them from being compromised through casual observation. Using MD5 as a masking agent provides a one-way method of creating this masking effect. However, creating the simple MD5 hash of a password does not fully secure it or prevent its compromise. The following is a discussion of the methods employed to mask passwords using hash algorithms, vulnerability of passwords stored in this manner and possible attack scenarios, and a possible solution focused on the Cisco environment.
Data Digest / Hashing
Hashing is a non-reversible cryptographic technique that compresses input data to a fixed length output. The algorithms are designed such that the input to the algorithm cannot be created from the output, but the output can be recreated by using the same input. Even if the length of the input data is different than the length of the output, the output is such that the length of the input data cannot be determined solely by examining the output.
Since the algorithms for hashing data are public, they are easily accessible. Every PC and server use these algorithms routinely for data verification. A MAC is one way of verifying data and is generated using a hash algorithm. Hashes are also used as input for purposes of creating digital signatures over data or as a method of verification such as used in digital certificates. For a given data stream, a hashing algorithm will generate a single fixed value. It seems intuitive that for data input greater than the length of the hash value, some data streams will generate the same hash value, or “collisions”, as others. Good hashing algorithms defeat the ability to predict what circumstances will produce these collisions.
There are several commonly used hash algorithms. These include MD5, SHA-1 and various later generations of SHA. For purposes of this discussion, we will focus on MD5 since it is used in the masking of passwords in Cisco routers.
MD5 (Message Digest number 5) was developed by Ron Rivest of MIT as a hash routine used in the generation of digital signatures using public key cryptography. MD5 generates a 16-byte binary stream most often depicted using hexadecimal or special characters. Several years ago, MD5 was proven mathematically to have a degree of predictability to generating collisions making it undesirable for use in digital signatures. Recent work has shown that using MD5 to generate the hash used in SSL has resulted in undermining the security framework on which SSL relies.
Attack Scenarios
(Recall that for a given data stream, a single repeatable hash value will be produced. Passwords are data streams, typically less than 16 characters long consisting of a mixture of alpha-numeric and special characters.)
It is possible for an attacker to begin to produce a large sample of hash values by running sample cleartext passwords through MD5 and recording the results. This is essentially a “dictionary” of hash values. As the dictionary size increases, passwords generated using MD5 become more susceptible. This dictionary attack is the reason why cryptographic keys used to encrypt PINs in ATMs and point of sale (POS) PIN entry devices are changed regularly. Otherwise, an attacker could begin capturing transactions (with encrypted PINs and debit card mag stripe images) from a device, enter a number of PINs into that device capturing his own transactions and then wait for transactions with encrypted PIN blocks of the same value as the PINs he entered. He would then have a debit transaction with the card’s mag stripe image and the PIN that goes with that card.
Another attack is a simple brute force attack against the password itself as the input value of the hash. This attack is similar to the dictionary attack but is a trial and error attack trying different possible passwords until there is a match to the hash.
The question to be asked about these attacks has to be the effort involved. Hashing uses fairly simple linear math making it relatively streamlined. This makes it efficient in use, like the data encryption algorithm (DEA) commonly known as DES. In 2003 it was estimated that a successful brute force attack against 56-bit single DES would take one hour or less using about $20,000 worth of commonly available PC equipment. Since that time, attack times have diminished with improvements in technology (Moore’s law) and lower prices.
56-bit DES has 2 raised to the power of 56 possible combinations or about 7 quadrillion keys. Given most passwords are less than 16 characters and that there are about 40 characters that are available, the number of possible passwords is about 1000 times that of DES keys. Most passwords use fewer than 10, their values would fit into a dictionary on a PC with a couple of large hard drives. The dictionary could be developed in a matter of hours. This would be used to attack most password hashes.
The more valuable a password, the more money can be spent on the tools needed to successfully discover its value from a known hash. Passwords longer than 16 characters could possibly yield a hash of the same value (collision) as a shorter password diminishing the value of longer passwords and reducing the attack time. If passwords are case-sensitive, attack times would be increased dramatically, but still allow a successful dictionary attack of shorter passwords, and collisions have to be considered a vulnerability to attack.
An important point is that it’s not necessary to actually discover the real password as much as it is necessary to find a combination of data that would generate the same hash value.