MD5 vs SHA1 for data integrity checking!

Hope you are well, folks. 🙂

In this article, I am going to discuss hash functions, their properties, attacks on them, which algorithm we are using to validate the integrity of file content, and which algorithms are actively being developed next.

Before I move further, it would be great if I tell you what hash functions are and why on earth they are needed!

What is a hash function?

A hash function is an algorithm that converts an input of any size into a fixed-size value that is, for practical purposes, unique to that input and cannot be reverse engineered.

A more formal definition can be found here.

Now let's try to express this concept mathematically.

Suppose the input we give to the hash function is “Let me know what is my hash”, which we represent with the constant “M”, and the hash for input “M” is “9F5746A67A7FE4F15333381F00250431”. Then we can write something like:

Hash(M) = 9F5746A67A7FE4F15333381F00250431, if M = Let me know what is my hash

If we are more abstract and represent both the input and the generated hash with constants, then we can write:

Hash(M) = H, where M is the input and H is the generated hash
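To make the notation a bit more concrete, here is a minimal sketch of Hash(M) = H in Java using the JDK's MessageDigest class. The class name HashNotationDemo and the helper hashHex are illustrative names of my own, not part of the sample further down. It only demonstrates two things: the same M always gives the same H, and the length of H is fixed by the algorithm (32 hex characters for MD5, 40 for SHA1) no matter how long M is.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch only: HashNotationDemo and hashHex are hypothetical names.
final class HashNotationDemo {

    // Hash(M) = H: returns H as an uppercase hex string for the message M.
    static String hashHex(String algorithm, String message) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm)
                    .digest(message.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02X", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(algorithm + " is not available", e);
        }
    }

    public static void main(String[] args) {
        String m = "Let me know what is my hash";
        String h1 = hashHex("MD5", m);
        String h2 = hashHex("MD5", m);
        System.out.println("Hash(M)         = " + h1);
        System.out.println("Deterministic   = " + h1.equals(h2));
        System.out.println("Hex length      = " + h1.length());
        // A one-character change in M produces a completely different H.
        System.out.println("Hash(M + \"!\")   = " + hashHex("MD5", m + "!"));
    }
}
```

The exact hex value depends on how the input string is encoded to bytes; the sketch assumes UTF-8.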

Basically, there are 3 properties which a cryptographic hash function should satisfy. Only if a function or algorithm satisfies these properties can we call it a cryptographic hash function.

1) Preimage resistance: Given a hash value h, it should be hard to find any message m such that h = hash(m).

2) Second preimage resistance: Given a message m1, it should be hard to find a different message m2 such that hash(m1) = hash(m2).

3) Collision resistance: It should be “hard” to find any two different messages m1 and m2 such that hash(m1) = hash(m2).
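To get a feel for what a collision is, the following sketch looks for one on a deliberately weakened hash: only the first 16 bits of MD5. This is not an attack on MD5 itself; the truncation, the class name TruncatedCollisionDemo and the "msg-N" message format are all assumptions made for illustration. With an n-bit output a brute-force collision is expected after roughly 2^(n/2) attempts, so 16 bits fall after a few hundred tries, while the full 128 bits of MD5 are far out of reach for this naive approach.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: collides a 16-bit truncation of MD5, not MD5 itself.
final class TruncatedCollisionDemo {

    // Returns the first 16 bits of MD5(message) as an int in [0, 65535].
    static int truncatedHash(String message) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(message.getBytes(StandardCharsets.UTF_8));
            return ((d[0] & 0xFF) << 8) | (d[1] & 0xFF);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Tries "msg-0", "msg-1", ... until two distinct messages share a truncated hash.
    // Guaranteed to stop within 65537 attempts because only 65536 values exist.
    static String[] findCollision() {
        Map<Integer, String> seen = new HashMap<>();
        for (int i = 0; ; i++) {
            String candidate = "msg-" + i;
            String previous = seen.put(truncatedHash(candidate), candidate);
            if (previous != null) {
                return new String[] {previous, candidate};
            }
        }
    }

    public static void main(String[] args) {
        String[] pair = findCollision();
        System.out.println(pair[0] + " and " + pair[1]
                + " share the truncated hash " + truncatedHash(pair[0]));
    }
}
```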

MD5 and SHA1 were designed to satisfy these properties, although practical collision attacks are now known for both.

Our use case was to find an algorithm that can generate a hash for a file, and we were okay with a few collisions. But we could not compromise on speed.
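For the file use case, a sketch along these lines is one way to do it, assuming the file should be streamed rather than loaded into memory at once. The names FileHashDemo, hashFile and isUnchanged are hypothetical, and the 8 KB buffer size is an arbitrary choice.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch only: FileHashDemo, hashFile and isUnchanged are hypothetical names.
final class FileHashDemo {

    // Streams the file through the digest in 8 KB chunks so large files are never fully in memory.
    static byte[] hashFile(Path file, String algorithm) {
        try (InputStream in = Files.newInputStream(file)) {
            MessageDigest digest = MessageDigest.getInstance(algorithm);
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
            return digest.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(algorithm + " is not available", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Integrity check: true when the file's current hash equals the previously stored hash.
    static boolean isUnchanged(Path file, byte[] storedHash, String algorithm) {
        return MessageDigest.isEqual(storedHash, hashFile(file, algorithm));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("integrity-demo", ".txt");
        try {
            Files.write(file, "original content".getBytes(StandardCharsets.UTF_8));
            byte[] stored = hashFile(file, "MD5");
            System.out.println("Unchanged before edit: " + isUnchanged(file, stored, "MD5"));

            Files.write(file, "modified content".getBytes(StandardCharsets.UTF_8));
            System.out.println("Unchanged after edit:  " + isUnchanged(file, stored, "MD5"));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Storing the hash next to the file and comparing it later is enough to notice accidental changes; it is not a defence against someone who can deliberately craft colliding content.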

MD5 is about 40% faster than SHA1, but it is known to be broken and collisions can be produced. (source) On the other hand, Google has demonstrated that SHA1 can be broken too. Since neither algorithm is collision resistant any more and speed mattered most to us, our obvious choice became MD5.

If you want to know which hashing algorithms are being developed next, you can visit these pages for more information.
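If you want to check the speed difference on your own machine, a rough timing sketch like the one below can help. It is not a rigorous benchmark (no JMH, only a single warm-up pass), the names DigestSpeedDemo and timeNanos are hypothetical, and the numbers will vary with the JVM and hardware, so treat the output as an indication only.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Random;

// Illustrative sketch only: a rough timing comparison, not a proper benchmark.
final class DigestSpeedDemo {

    // Hashes the same buffer `rounds` times and returns the elapsed time in nanoseconds.
    static long timeNanos(String algorithm, byte[] data, int rounds) {
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithm);
            long start = System.nanoTime();
            for (int i = 0; i < rounds; i++) {
                digest.update(data);
                digest.digest(); // digest() also resets the instance for the next round
            }
            return System.nanoTime() - start;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(algorithm + " is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[1 << 20]; // 1 MiB of pseudo-random bytes
        new Random(42).nextBytes(data);

        // Warm-up pass so JIT compilation does not dominate the measurement.
        timeNanos("MD5", data, 50);
        timeNanos("SHA-1", data, 50);

        long md5 = timeNanos("MD5", data, 200);
        long sha1 = timeNanos("SHA-1", data, 200);
        System.out.printf("MD5:   %d ms%n", md5 / 1_000_000);
        System.out.printf("SHA-1: %d ms%n", sha1 / 1_000_000);
    }
}
```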



Here, I am also including sample code to calculate an MD5 (or SHA1) hash in Java. Do check it out!

Cheers. 🙂

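package com.causecode.sample;

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Scanner;

/**
 * Demo Program to show how to calculate the hash code for the given input
 */
public class Main {

    /**
     * Enum representing Algorithms, each mapped to its standard JCA algorithm name
     */
    private enum Algorithm {
        MD5("MD5"),
        SHA1("SHA-1");

        private final String jcaName;

        Algorithm(String jcaName) {
            this.jcaName = jcaName;
        }
    }

    /**
     * Main Entry Point for Java JVM
     * @param args command-line arguments (unused)
     * @throws NoSuchAlgorithmException if the chosen algorithm is not available
     */
    public static void main(String[] args) throws NoSuchAlgorithmException {
        Scanner scanner = new Scanner(System.in);
        printToConsole("Enter the input to calculate the hash ? ");
        String input = scanner.nextLine();

        printToConsole("Choose Algorithm..");
        printToConsole("1) MD5");
        printToConsole("2) SHA1");
        int choice = scanner.nextInt();

        byte[] hash;
        if (choice == 1) {
            hash = calculateHash(input, Algorithm.MD5);
        } else {
            hash = calculateHash(input, Algorithm.SHA1);
        }

        printToConsole(String.format("Calculated Hash for Given Input is:- \" %s \"", convertHashToHexString(hash)));
    }

    /**
     * Convert Given Hash to Hex Representation
     * (done by hand because javax.xml.bind is no longer bundled with the JDK since Java 11)
     * @param hash raw digest bytes
     * @return String uppercase hex string, two characters per byte
     */
    private static String convertHashToHexString(byte[] hash) {
        StringBuilder hex = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            hex.append(String.format("%02X", b));
        }
        return hex.toString();
    }

    /**
     * Calculates the hash for the given input using supplied Algorithm
     * @param message text to hash (encoded as UTF-8)
     * @param algorithm algorithm to use
     * @return raw digest bytes
     * @throws NoSuchAlgorithmException if the algorithm is not available
     */
    static byte[] calculateHash(String message, Algorithm algorithm)
                   throws NoSuchAlgorithmException {
        // Return the raw bytes: String.valueOf(byte[]) would only give the array's identity string.
        return MessageDigest.getInstance(algorithm.jcaName).digest(message.getBytes(StandardCharsets.UTF_8));
    }

    /**
     * Utility Method to print line to the console
     * @param message text to print
     */
    static void printToConsole(String message) {
        System.out.println(message);
    }
}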