There are many different types of hash algorithms such as RipeMD, Tiger, xxhash and more, but the most common type of hashing used for file integrity checks are MD5, SHA-2 and CRC32.
MD5 - An MD5 hash function encodes a string of information and encodes it into a 128-bit fingerprint. MD5 is often used as a checksum to verify data integrity. However, due to its age, MD5 is also known to suffer from extensive hash collision vulnerabilities, but it’s still one of the most widely used algorithms in the world.
SHA-2 – SHA-2, developed by the National Security Agency (NSA), is a cryptographic hash function. SHA-2 includes significant changes from its predecessor, SHA-1. The SHA-2 family consists of six hash functions with digests (hash values) that are 224, 256, 384 or 512 bits: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, SHA-512/256 .
Hashing algorithms are used in all sorts of ways – they are used for storing passwords, in computer vison, in databases, etc. There are hundreds of hashing algorithms out there and they all have specific purposes – some are optimized for certain types of data, others are for speed, security, etc. . its name gives away its purpose – it’s for cryptographic security. . Now, let’s get into SHA and the difference between the different versions.
2.1 Secure Hash Algorithms, Also known as SHA, are a family of cryptographic functions designed to keep data secured. It works by transforming the data using a hash function: an algorithm that consists of bitwise operations, modular additions, and compression functions. The hash function then produces a fixed-size string that looks nothing like the original. These algorithms are designed to be one-way functions, meaning that once they’re transformed into their respective hash values, it’s virtually impossible to transform them back into the original data. A few algorithms of interest are SHA-1, SHA-2, and SHA-3, each of which was successively designed with increasingly stronger encryption in response to hacker attacks. SHA-0, for instance, is now obsolete due to the widely exposed vulnerabilities.
SHA-2 basically consists of two hash algorithms: SHA-256 and SHA-512. SHA-224 is a variant of SHA-256 with different starting values and truncated output. SHA-384 and the lesser-known SHA-512/224 and SHA-512/256 are all variants of SHA-512. SHA-512 is more secure than SHA-256 and is commonly faster than SHA-256 on 64-bit machines such as AMD64.
The output size in bits is given by the extension to the "SHA" name, so SHA-224 has an output size of 224 bits (28 bytes),
SHA-256 produces 32 bytes, SHA-384 produces 48 bytes and finally, SHA-512 produces 64 bytes.
For the sake of this tutorial, all we care about are the SHA-256 algorithms.
2.2 Basic operation
The SHA-256 compression function operates on a 512-bit message block and a 256-
bit intermediate hash value. It is essentially a 256-bit block cipher algorithm which
encrypts the intermediate hash value using the message block as key. Hence there
are two main components to describe:
(1) the SHA-256 compression function, and
(2) the SHA-256 message schedule
We will use the following notation
If you fail to comprehend this series of steps,
then move to the next chapter where We will implement an example of SHA256 step by step using "projectfpga.com" as an Example.
Basic operations
• Boolean operations AND, XOR and OR, denoted by ∧, ⊕ and ∨, respectively.
• Bitwise complement, denoted by ¯.
• Integer addition modulo 2^32, denoted by A + B.
Each of them operates on 32-bit words. For the last operation, binary words are interpreted as integers written in base 2.
• RotR(A, n) denotes the circular right shift of n bits of the binary word A.
• ShR(A, n) denotes the right shift of n bits of the binary word A.
• A||B denotes the concatenation of the binary words A and B.
Functions and constants
The algorithm uses the functions:
Ch(X, Y, Z) = (X ∧ Y ) ⊕ (X ∧ Z),
M aj(X, Y, Z) = (X ∧ Y ) ⊕ (X ∧ Z) ⊕ (Y ∧ Z),
Σ0(X) = RotR(X, 2) ⊕ RotR(X, 13) ⊕ RotR(X, 22),
Σ1(X) = RotR(X, 6) ⊕ RotR(X, 11) ⊕ RotR(X, 25),
σ0(X) = RotR(X, 7) ⊕ RotR(X, 18) ⊕ ShR(X, 3),
σ1(X) = RotR(X, 17) ⊕ RotR(X, 19) ⊕ ShR(X, 10),
and the 64 binary words Ki given by the 32 first bits of the fractional parts of the cube roots of the first
64 prime numbers:
0x428a2f98 0x71374491 0xb5c0fbcf 0xe9b5dba5 0x3956c25b 0x59f111f1 0x923f82a4 0xab1c5ed5 0xd807aa98 0x12835b01 0x243185be 0x550c7dc3 0x72be5d74 0x80deb1fe 0x9bdc06a7 0xc19bf174 0xe49b69c1 0xefbe4786 0x0fc19dc6 0x240ca1cc 0x2de92c6f 0x4a7484aa 0x5cb0a9dc 0x76f988da 0x983e5152 0xa831c66d 0xb00327c8 0xbf597fc7 0xc6e00bf3 0xd5a79147 0x06ca6351 0x14292967 0x27b70a85 0x2e1b2138 0x4d2c6dfc 0x53380d13 0x650a7354 0x766a0abb 0x81c2c92e 0x92722c85 0xa2bfe8a1 0xa81a664b 0xc24b8b70 0xc76c51a3 0xd192e819 0xd6990624 0xf40e3585 0x106aa070 0x19a4c116 0x1e376c08 0x2748774c 0x34b0bcb5 0x391c0cb3 0x4ed8aa4a 0x5b9cca4f 0x682e6ff3 0x748f82ee 0x78a5636f 0x84c87814 0x8cc70208 0x90befffa 0xa4506ceb 0xbef9a3f7 0xc67178f2
Padding
The purpose of this padding is to ensure that the padded message is a multiple of 512 or 1024 bits, depending on the algorithm. Padding can be inserted before hash computation begins on a message, or at any other time during the hash computation prior to processing the block(s) that will contain the padding.
• first, a bit 1 is appended,
• next, k bits 0 are appended, with k being the smallest positive integer such that l(bit length) + 1 + k ≡ 448 mod 512, where l is the length in bits of the initial message,
• finally, the length l < 264 of the initial message is represented with exactly 64 bits, and these bits are added at the end of the message.
The message shall always be padded, even if the initial length is already a multiple of 512.
Block decomposition
For each block M ∈ {0, 1} 512, 64 words of 32 bits each are constructed as follows:
• the first 16 are obtained by splitting M in 32-bit blocks
M = W1||W2|| · · · ||W15||W16
• the remaining 48 are obtained with the formula:
Wi = σ1(Wi−2) + Wi−7 + σ0(Wi−15) + Wi−16, 17 ≤ i ≤ 64.
Hash computation
• First, eight variables are set to their initial values, given by the first 32 bits of the fractional part of the square roots of the first 8 prime numbers:
H1(0) = 0x6a09e667 H2(0) = 0xbb67ae85 H3(0)= 0x3c6ef372 H4(0) = 0xa54ff53a
H5(0) = 0x510e527f H6(0) = 0x9b05688c H7(0) = 0x1f83d9ab H8(0) = 0x5be0cd19
• Next, the blocks M
For t = 1 to N
– construct the 64 blocks Wi from M(t) , as explained above
– set
(a, b, c, d, e, f, g, h) = (H 1 (t−1) , H 2 (t−1) , H 3 (t−1) , H 4 (t−1) , H 5 (t−1) , H 6 (t−1) , H 7 (t−1) , H 8 (t−1) )
– do 64 rounds consisting of:
T1 = h + Σ1(e) + Ch(e, f, g) + Ki + Wi
T2 = Σ0(a) + M aj(a, b, c)
h = g
g = f
f = e
e = d + T1
d = c
c = b
b = a
a = T1 + T2
We assume that the length of the message can be represented by a 64-bit integer.
– compute the new value of Hj(t)
H1(t) = H1(t-1) + a
H2(t) = H2(t-1) + b
H3(t) = H3(t-1) + c
H4(t) = H4(t-1) + d
H5(t) = H5(t-1) + e
H6(t) = H6(t-1) + f
H7(t) = H7(t-1) + g
H8(t) = H8(t-1) + h
End for
• The hash of the message is the concatenation of the variables Hi
H = H1(N) || H2(N) || H3(N) || H4(N) || H5(N) || H6(N) || H7(N) || H8(N) .
Implementation: signatures
Implement the cryptographic hash function just described. Define the class sha256 with the method: public static BigInteger hash(byte[] M)
input: M is a chain of bytes of arbitrary length;
output: a positive integer in the interval [0, 2256), the value of the hash of M.
If you fail to comprehend this series of steps,
then move to the next chapter where We will implement an example of SHA256 step by step using "projectfpga.com" as a message to be hashed.