Updated March 14, 2023
Introduction to MD5 Algorithm
MD5 (Message-Digest Algorithm 5) is the fifth in the series of message-digest algorithms developed by Ron Rivest, and it produces a 128-bit message digest. It is fast to compute, although RFC 1321 notes that it is slightly slower than its predecessor, MD4. MD5 processes the input in 512-bit blocks, each divided into sixteen 32-bit words, and produces a 128-bit digest made up of four 32-bit words. The digest is computed in five steps: appending padding bits, appending the length, initialising the four chaining variables (the MD buffer), processing the message in 512-bit blocks through four rounds that use a different constant in each operation, and emitting the output.
Use of MD5 Algorithm
It was developed with security as the main goal: it takes an input of any size and produces a 128-bit hash value as output. To be considered cryptographically secure, a hash function such as MD5 should meet two requirements:
- It is computationally infeasible to find two different inputs that produce the same hash value.
- It is computationally infeasible to find a message that produces a given hash value.
MD5 has been used to store one-way hashes of passwords, and some file servers provide a pre-computed MD5 checksum of a file so that users can compare it with the checksum of the file they downloaded. Most Unix-based operating systems include MD5 checksum utilities in their distribution packages.
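The checksum comparison described above can be sketched in Python with the standard hashlib module. This is a minimal illustration, not a reference tool: the helper names md5_of_file and matches_published_checksum and the 64 KiB chunk size are assumptions made for this example.

```python
import hashlib
import hmac


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Compare the file's digest with a published one, ignoring case and
    surrounding whitespace in the published value."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(md5_of_file(path), expected)
```

A matching checksum only shows that the download was not corrupted in transit; because of the collision weakness discussed later in this article, it does not prove that the file is authentic.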
How Does the MD5 Algorithm Work?
As noted above, MD5 produces a 128-bit hash value. Hashing an input of any size into this value takes five steps, each with a predefined task.
Step 1: Append Padding Bits
- Padding means adding extra bits to the original message. In MD5, the original message is padded so that its length in bits is congruent to 448 modulo 512, that is, 64 bits short of a multiple of 512 bits.
- Padding is done even if the length of the original message is already congruent to 448 modulo 512, so between 1 and 512 bits are always added. Only the first padding bit is 1; the rest are 0.
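The padding rule can be checked with a short calculation. The Python sketch below is illustrative only; the function names padding_bit_count and padding_bits are made up for this example and are not part of any library.

```python
def padding_bit_count(message_bits):
    """Number of padding bits (1 to 512) that brings the length to 448 mod 512."""
    count = (448 - message_bits) % 512
    # Padding is always applied, so a length that is already
    # congruent to 448 mod 512 receives a full 512 bits.
    return count if count != 0 else 512


def padding_bits(message_bits):
    """The padding as a bit string: a single 1 followed by zeros."""
    return "1" + "0" * (padding_bit_count(message_bits) - 1)
```

For example, a 24-bit message such as the three ASCII characters "abc" receives 424 padding bits, which brings its length to 448 bits.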
Step 2: Append Length
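After padding, a 64-bit representation of the original message length in bits is appended, low-order word first. If the length exceeds 2^64 - 1 bits, only its low-order 64 bits are used, i.e. the length is taken modulo 2^64. At this point, the length of the resulting message is an exact multiple of 512 bits.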
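Steps 1 and 2 together can be sketched at byte level as follows. This sketch assumes the message is a whole number of bytes, which is the usual case in practice; the function name md5_pad is hypothetical.

```python
import struct


def md5_pad(message: bytes) -> bytes:
    """Pad a byte-aligned message as in Steps 1 and 2."""
    bit_length = len(message) * 8
    # 0x80 is a single 1 bit followed by seven 0 bits.
    padded = message + b"\x80"
    # Add zero bytes until the length is 56 mod 64 bytes (448 mod 512 bits).
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    # Append the original length in bits, modulo 2^64, low-order byte first.
    padded += struct.pack("<Q", bit_length % 2**64)
    return padded
```

An empty message pads to a single 64-byte block, while a 56-byte message (already 448 bits long) pads to two blocks, because padding is never skipped.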
Step 3: Initialize MD Buffer
A four-word buffer (A, B, C, D) is used to compute the message digest. A, B, C and D are 32-bit registers, initialized to the following hexadecimal values (low-order byte first):
Word A | 01 | 23 | 45 | 67 |
Word B | 89 | ab | cd | ef |
Word C | fe | dc | ba | 98 |
Word D | 76 | 54 | 32 | 10 |
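Because the table lists each register with its low-order byte first, the corresponding 32-bit integers look reversed when written as ordinary hexadecimal numbers. The sketch below shows the conversion; the names INITIAL_BYTES and initial_registers are assumptions made for this illustration.

```python
import struct

# Register bytes in the order given in the table, low-order byte first.
INITIAL_BYTES = {
    "A": bytes.fromhex("01234567"),
    "B": bytes.fromhex("89abcdef"),
    "C": bytes.fromhex("fedcba98"),
    "D": bytes.fromhex("76543210"),
}


def initial_registers():
    """Interpret each 4-byte sequence as a little-endian 32-bit integer."""
    return {name: struct.unpack("<I", raw)[0] for name, raw in INITIAL_BYTES.items()}
```

Read this way, the registers hold A = 0x67452301, B = 0xefcdab89, C = 0x98badcfe and D = 0x10325476.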
Step 4: Process Message in 16-Word Blocks
MD5 uses four auxiliary functions, each of which takes three 32-bit words as input and produces one 32-bit word as output. These functions are built from the bitwise operators AND, OR, XOR and NOT.
F(X, Y, Z) | (X and Y) or (not(X) and Z) |
G(X, Y, Z) | (X and Z) or (Y and not(Z)) |
H(X, Y, Z) | X xor Y xor Z |
I(X, Y, Z) | Y xor (X or not(Z)) |
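The four auxiliary functions translate directly into bitwise operations. The following Python sketch is one possible rendering; the constant name MASK32 is an assumption of this example, needed because Python integers are not limited to 32 bits and the bitwise NOT would otherwise yield a negative number.

```python
MASK32 = 0xFFFFFFFF  # keeps every result within 32 bits


def F(x, y, z):
    # Bit by bit: if x then y else z.
    return ((x & y) | (~x & z)) & MASK32


def G(x, y, z):
    # Bit by bit: if z then x else y.
    return ((x & z) | (y & ~z)) & MASK32


def H(x, y, z):
    # Bitwise parity of the three inputs.
    return (x ^ y ^ z) & MASK32


def I(x, y, z):
    return (y ^ (x | ~z)) & MASK32
```

F and G act as bitwise selectors, which is why F returns y wherever x has a 1 bit and z wherever x has a 0 bit.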
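The contents of the four registers are mixed with the input words using these auxiliary functions. Each 512-bit block is processed in four rounds, one per auxiliary function, and each round performs 16 basic operations, for 64 operations in total.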
Step 5: Output
After all rounds have been performed, the registers A, B, C and D contain the MD5 output, starting with the low-order byte of A and ending with the high-order byte of D.
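Putting the five steps together, a compact Python sketch of the whole algorithm might look like the following. It is meant for study, not production use, and it assumes byte-aligned input; the shift amounts, word-selection formulas and sine-derived constants follow RFC 1321, while the names md5_hex, rotate_left, SHIFTS and CONSTANTS are choices made for this example.

```python
import math
import struct

MASK32 = 0xFFFFFFFF
# Left-rotation amounts: one pattern of four per round, 16 operations per round.
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
# 64 additive constants: the integer part of 2^32 * abs(sin(i + 1)).
CONSTANTS = [int(abs(math.sin(i + 1)) * 2**32) & MASK32 for i in range(64)]


def rotate_left(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def md5_hex(message: bytes) -> str:
    # Steps 1 and 2: pad to 448 mod 512 bits, then append the 64-bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack("<Q", (len(message) * 8) % 2**64)

    # Step 3: initialise the four registers.
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476

    # Step 4: process each 512-bit block as sixteen little-endian 32-bit words.
    for offset in range(0, len(padded), 64):
        words = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:      # round 1 uses F
                mixed, index = (b & c) | (~b & d), i
            elif i < 32:    # round 2 uses G
                mixed, index = (b & d) | (c & ~d), (5 * i + 1) % 16
            elif i < 48:    # round 3 uses H
                mixed, index = b ^ c ^ d, (3 * i + 5) % 16
            else:           # round 4 uses I
                mixed, index = c ^ (b | ~d), (7 * i) % 16
            total = (a + mixed + CONSTANTS[i] + words[index]) & MASK32
            # Rotate the register roles, then update b from its old value.
            a, d, c = d, c, b
            b = (b + rotate_left(total, SHIFTS[i])) & MASK32
        # Add this block's result to the running state.
        a0 = (a0 + a) & MASK32
        b0 = (b0 + b) & MASK32
        c0 = (c0 + c) & MASK32
        d0 = (d0 + d) & MASK32

    # Step 5: output A, B, C, D, each with its low-order byte first.
    return struct.pack("<4I", a0, b0, c0, d0).hex()
```

In real code, a maintained implementation such as Python's hashlib.md5 is the better choice; the sketch only makes the structure of the four rounds visible.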
Example:
Input: The quick brown fox jumps over the lazy dog. |
Output: e4d909c290d0fb1ca068ffaddf22cbd0 |
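A digest for any input string can be computed with Python's standard hashlib module, as in the sketch below. The helper name md5_hexdigest and the UTF-8 default encoding are assumptions of this example; the digest depends on the exact bytes hashed, so a different encoding or a single extra character yields a completely different value.

```python
import hashlib


def md5_hexdigest(text, encoding="utf-8"):
    """Return the MD5 digest of `text` as 32 lowercase hexadecimal digits."""
    return hashlib.md5(text.encode(encoding)).hexdigest()
```

Since the digest is always 128 bits long, its hexadecimal form always has exactly 32 characters, whatever the length of the input.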
Advantages and Disadvantages of MD5 Algorithm
The advantages and disadvantages are explained below:
- MD5 hashes are useful because it is easier to compare and store these small, fixed-size values than large, variable-length texts. MD5 is a widely used one-way hash, which allows a value to be verified without revealing the original. Some Unix systems have used MD5-based schemes to store user passwords as hashes rather than in plain text. MD5 is also widely used to check the integrity of files.
- Moreover, it is easy to generate a message digest of the original message with this algorithm. The algorithm is defined for a message of any number of bits; it is not limited to multiples of 8 bits, unlike the md5sum utility, which operates on octets.
- However, MD5 has long been known to be prone to hash collisions, i.e. it is possible to create two different inputs that produce the same hash value. MD5 provides no security against these collision attacks. SHA-1 (Secure Hash Algorithm 1, which produces a 160-bit message digest and was designed by the NSA for use with the Digital Signature Algorithm) was long accepted in the cryptographic field as a replacement, because SHA-1 collisions are much harder to produce. A practical SHA-1 collision was nevertheless demonstrated publicly in 2017, so SHA-1 is no longer recommended where collision resistance matters.
- Moreover, MD5's speed is not a decisive advantage, since optimized SHA implementations, particularly on processors with dedicated SHA instructions, can reach comparable or higher throughput. Newer hashing algorithms designed with stronger security in mind, such as SHA-256 (which generates a 256-bit digest of a text), are now widely used instead.
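The digest lengths mentioned in this comparison can be confirmed with Python's hashlib module. The helper name digest_bits and the sample input are assumptions made for this sketch.

```python
import hashlib


def digest_bits(algorithm, data=b"example"):
    """Return the digest length in bits for a hashlib algorithm name."""
    return len(hashlib.new(algorithm, data).digest()) * 8
```

Calling digest_bits with "md5", "sha1" and "sha256" gives 128, 160 and 256 bits respectively. A longer digest raises the cost of a brute-force collision search, although it does not by itself rule out structural weaknesses such as those found in MD5.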
Conclusion
Nowadays, with so much data stored in the cloud and on the internet, keeping that data secure is a top priority. The most secure algorithm available should be adopted to protect private data. Because MD5 is vulnerable to collision attacks, algorithms from the SHA family should be preferred over it, and newer algorithms such as SHA-256 are designed to be considerably less vulnerable to such attacks.