Extract from Mike Hamilton’s article “MD5 Hashing: The Foundation of a Defensible E-Discovery Process”
Commonly referred to as a “digital fingerprint,” a hash value is a short code computed from the contents of a computer file. It gives each file an identifier that corresponds to its contents. If the contents change, the file’s hash value changes as well, indicating that the file is no longer the same as it was before. In e-discovery, one can compare a file’s hash values from before and after collection to verify that the file was not altered in the process.
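To understand how MD5 hashing relates to e-discovery, one must first know what a computer hash is. A hash algorithm is a one-way mathematical function (not encryption, since the output cannot be turned back into the file) that takes the bits of a file and outputs a fixed-length text string that is, for practical purposes, unique to those contents.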
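As a rough illustration of that idea, the sketch below uses Python's standard `hashlib` module with MD5, the algorithm named in the article's title. The helper name `md5_hex` and the sample text are made up for this example and are not from the article.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of `data` as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


# Two inputs that differ by a single trailing period (hypothetical sample text).
original = md5_hex(b"Quarterly report, final version")
altered = md5_hex(b"Quarterly report, final version.")

print(len(original))        # 32: the output length does not depend on the input size
print(original == altered)  # False: a one-character change yields a different value
```

The output is always the same length, however large the input, and there is no function that turns the digest back into the original contents.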
Many hash algorithms have been created over the years, but the one most commonly used in e-discovery today is MD5 (“MD” being short for message-digest). An MD5 hash value is typically written as a string of 32 hexadecimal characters (the digits 0-9 and the letters a-f).
While such a sequence might look like a random assortment of letters and numbers, it is in fact a revealing digital code: a unique alphanumeric value representing the contents of a single computer file.
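To show how such a value might be used to check a file before and after collection, here is a minimal sketch in Python. The function names `file_md5` and `is_unchanged`, the chunk size, and the idea of passing in a previously recorded hash as a string are all assumptions made for illustration; the article does not describe a particular tool or workflow.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file's contents in chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read fixed-size chunks until read() returns empty bytes at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(hash_before: str, collected_copy: Path) -> bool:
    """Compare a hash recorded before collection with the collected copy's hash."""
    # Hex digests are case-insensitive, so normalize the recorded value first.
    return hash_before.lower() == file_md5(collected_copy)
```

In this sketch, a reviewer would record `file_md5(...)` for the source file at collection time and later call `is_unchanged(...)` on the collected copy. A result of `False` means the contents differ; it does not say how or where they differ.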