Defining Hashing Algorithms and Hash Codes                                
•        The .NET framework provides numerous ways for developers to generate ‘hash codes’.
•        Hash codes also go by the term ‘message digest’ or ‘message fingerprint’.
•        A hash code is a numerical value generated from a specific input value and a specific hash code
algorithm.
•        Understand that the same hash code value will be generated when using the same input value / algorithm
combination.
•        Hash codes have no obvious trace of the original message data. Rather, a hash code is a fixed-size
output determined entirely by the input value and the algorithm.
•        For example, hashing the string Hello twice generates the same hash code, while Hello and HELLO
yield different hash codes.
•        In other words, identical input strings always produce identical hash values, while a string that differs
in any way (even only in casing) produces a different hash code.
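The Hello / HELLO comparison can be reproduced with a few lines of C#. This is a minimal sketch that assumes SHA-256 as the algorithm and UTF-8 as the string encoding (the text does not say which were used); the HashDemo class and HashToHex helper are illustrative names only.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashDemo
{
    // Hypothetical helper: hashes a string with SHA-256 (an assumed choice)
    // and returns the hash bytes as a dash-separated hex string.
    public static string HashToHex(string input)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(input));
            return BitConverter.ToString(hash);
        }
    }

    public static void Main()
    {
        // The first two lines print the same value; the third differs.
        Console.WriteLine("Hello -> {0}", HashToHex("Hello"));
        Console.WriteLine("Hello -> {0}", HashToHex("Hello"));
        Console.WriteLine("HELLO -> {0}", HashToHex("HELLO"));
    }
}
```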


•        Hash codes are useful whenever you wish to ensure the integrity of a message:
•        Alice generates a hash code value based on a specific message and a specific hash code algorithm.
•        Bob receives the message and this hash code value, and generates his own hash code value using the
same algorithm.
•        If these hash code numbers are the same, Bob can assume that the message has not been altered by Eve.
•        If the hash codes are different, Bob can assume that somehow, the message data has been altered by Eve
and cannot be trusted.
•        Understand that hash codes themselves can be open to security risks!  
•        Imagine Eve intercepted the message and hash value, and replaced them both. Bob would have little way
of knowing this occurred.
•        One way to fix this issue is to make use of a keyed hash algorithm.
•        You’ll see how to do so later in this chapter.
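As a rough preview of the keyed approach, the sketch below uses the HMACSHA256 class from System.Security.Cryptography. It assumes Alice and Bob already share a secret key that Eve does not know; the key and messages are placeholders, and ComputeTag and Verify are hypothetical helper names, not part of the framework.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class IntegrityDemo
{
    // Alice's side: compute a keyed hash (HMAC-SHA256) over the message.
    public static byte[] ComputeTag(byte[] sharedKey, string message)
    {
        using (HMACSHA256 hmac = new HMACSHA256(sharedKey))
        {
            return hmac.ComputeHash(Encoding.UTF8.GetBytes(message));
        }
    }

    // Bob's side: recompute the keyed hash and compare it with the one received.
    public static bool Verify(byte[] sharedKey, string message, byte[] receivedTag)
    {
        byte[] expected = ComputeTag(sharedKey, message);
        if (expected.Length != receivedTag.Length) return false;

        // Compare every byte so the loop does not exit early on a mismatch.
        int diff = 0;
        for (int i = 0; i < expected.Length; i++)
        {
            diff |= expected[i] ^ receivedTag[i];
        }
        return diff == 0;
    }

    public static void Main()
    {
        // Placeholder key for illustration only; a real key should be random.
        byte[] key = Encoding.UTF8.GetBytes("demo-key-known-to-alice-and-bob");
        byte[] tag = ComputeTag(key, "Pay Bob $10");

        Console.WriteLine(Verify(key, "Pay Bob $10", tag)); // prints True
        Console.WriteLine(Verify(key, "Pay Eve $10", tag)); // prints False
    }
}
```

Because the key is an input to the hash, Eve cannot produce a matching value for a replaced message without knowing the key.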

•        Hash codes are often described as a form of ‘one-way encryption’:
•        Once a hash code has been generated, it is computationally infeasible to obtain the original message data.
•        For example, you can generate a hash code using a .NET assembly as input.
•        The hash code itself is represented as a byte array.
•        However, what chance would you have of recreating an entire assembly (such as mscorlib.dll) based on a
byte array? No chance. None.
•        Given that hash codes leave no statistically significant trace of the original message data, you may
wonder where hash codes are actually useful.
•        Hash code values can be very helpful when you want to prove somebody knows a secret (such as a
password) without actually storing the secret.
•        As mentioned, hash codes are also used to ensure data has not changed during transport (e.g., message
integrity).
•        Do not confuse System.Object.GetHashCode() with the act of generating hash codes for cryptographic
services!
•        The virtual GetHashCode() method of System.Object can be overridden to yield a numerical value for an
object based on its current state.
•        This hash value is used by the Hashtable type to locate objects within the container.
•        However, calling GetHashCode() has nothing to do with securing the object!
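To make the distinction concrete, the following sketch prints both values for the same string. It assumes SHA-256 for the cryptographic hash; GetHashCodeDemo and CryptoHash are illustrative names. Note that GetHashCode() returns a 32-bit int intended for container lookups, and on newer runtimes the value for a string may differ from one process run to the next.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class GetHashCodeDemo
{
    // Hypothetical helper: a cryptographic hash of the string (SHA-256 assumed).
    public static byte[] CryptoHash(string s)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return sha.ComputeHash(Encoding.UTF8.GetBytes(s));
        }
    }

    public static void Main()
    {
        string message = "Hello";

        // A 32-bit value meant for hash-based containers, not for security.
        int tableHash = message.GetHashCode();

        // A 256-bit message digest.
        byte[] digest = CryptoHash(message);

        Console.WriteLine("GetHashCode(): {0} ({1} bits)", tableHash, sizeof(int) * 8);
        Console.WriteLine("SHA-256:       {0} ({1} bits)",
            BitConverter.ToString(digest), digest.Length * 8);
    }
}
```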
A High Level Examination of Hash Code Theory                                  
•        Although numerous hash code algorithms ship with the .NET base class libraries, they take the same
basic approach:
•        The message data is broken up into input blocks.
•        The first block of message data is placed into the hash code algorithm along with a fixed seed value
(often called the initialization vector).
•        The resulting value is passed into the algorithm again along with the second block.
•        This process repeats until all message data has been consumed.
•        The final value is the hash code of the message.
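The block-by-block process above can be sketched with a deliberately simple toy. The code below is not one of the .NET algorithms and is not cryptographically secure: the 8-byte block size is an arbitrary choice for illustration, and the mixing step and seed are borrowed from the non-cryptographic FNV-1a hash. Real algorithms also pad the final block, which this sketch skips.

```csharp
using System;
using System.Text;

public static class ToyChainedHash
{
    // Arbitrary block size in bytes, chosen only for illustration.
    private const int BlockSize = 8;

    // Fixed starting value (the FNV-1a 32-bit offset basis).
    private const uint Seed = 0x811C9DC5;

    // Toy mixing step (FNV-1a style): folds one block into the running state.
    private static uint Mix(uint state, byte[] data, int offset, int count)
    {
        unchecked
        {
            for (int i = offset; i < offset + count; i++)
            {
                state ^= data[i];
                state *= 16777619u; // FNV 32-bit prime
            }
        }
        return state;
    }

    public static uint Compute(byte[] message)
    {
        // Start from the seed, then feed each block together with
        // the value produced by the previous block.
        uint state = Seed;
        for (int offset = 0; offset < message.Length; offset += BlockSize)
        {
            int count = Math.Min(BlockSize, message.Length - offset);
            state = Mix(state, message, offset, count);
        }

        // The final state is the (toy) hash code of the message.
        return state;
    }

    public static void Main()
    {
        byte[] message = Encoding.UTF8.GetBytes("A message longer than one block");
        Console.WriteLine("Toy hash: {0:X8}", Compute(message));
    }
}
```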

•        Hash codes represent a ‘digital fingerprint’:
•        Once a message has been hashed, the value is, for practical purposes, unique to that specific message.
•        Changing even one bit or one piece of character data will result in a completely different hash code.
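The effect of a one-bit change can be observed directly. The sketch below assumes SHA-256 and flips the lowest bit of the last input byte, then counts how many of the 256 output bits differ; the count is typically somewhere around half. AvalancheDemo and its methods are illustrative names.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class AvalancheDemo
{
    // Counts the bit positions in which two equal-length byte arrays differ.
    public static int CountDifferingBits(byte[] a, byte[] b)
    {
        int count = 0;
        for (int i = 0; i < a.Length; i++)
        {
            int x = a[i] ^ b[i];
            while (x != 0)
            {
                count += x & 1;
                x >>= 1;
            }
        }
        return count;
    }

    // Hashes the text, then hashes a copy with one input bit flipped,
    // and returns the number of output bits that changed.
    public static int BitsChangedByOneBitFlip(string text)
    {
        byte[] original = Encoding.UTF8.GetBytes(text);
        byte[] flipped = (byte[])original.Clone();
        flipped[flipped.Length - 1] ^= 0x01;

        using (SHA256 sha = SHA256.Create())
        {
            return CountDifferingBits(sha.ComputeHash(original), sha.ComputeHash(flipped));
        }
    }

    public static void Main()
    {
        Console.WriteLine("Output bits changed (of 256): {0}", BitsChangedByOneBitFlip("Hello"));
    }
}
```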
The .NET Hash Code Algorithms                                                        
•        The .NET platform ships with various hash code algorithms:
•        Each algorithm differs by the length of the input blocks, message size limit and size of the resulting hash
code.
•        Here are some of the more common hash algorithms.
.NET Hash Algorithm                    Input Block Size (in bits)    Message Limit (in bits)    Hash Code Size (in bits)
MD5 (MD = Message Digest)              512                           2^64                       128
SHA-1 (SHA = Secure Hash Algorithm)    512                           2^64                       160
SHA-256                                512                           2^64                       256
SHA-384                                1024                          2^128                      384
SHA-512                                1024                          2^128                      512
•        As you would expect, each of these hash code algorithms is represented by a specific class type in the
.NET base class libraries.
•        Each of the concrete types derives from a common abstract class named
System.Security.Cryptography.HashAlgorithm.
•        The System.Security.Cryptography namespace is defined within mscorlib.dll.
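The hash code sizes in the table can be checked against the classes themselves. This sketch creates each algorithm through its static Create() method and reads the HashSize property (reported in bits); the printed type names depend on the runtime in use, and HashSizeDemo is an illustrative name.

```csharp
using System;
using System.Security.Cryptography;

public static class HashSizeDemo
{
    // Returns the size of the algorithm's hash code in bits.
    public static int HashSizeInBits(HashAlgorithm algorithm)
    {
        return algorithm.HashSize;
    }

    public static void Main()
    {
        HashAlgorithm[] algorithms =
        {
            MD5.Create(), SHA1.Create(), SHA256.Create(), SHA384.Create(), SHA512.Create()
        };

        foreach (HashAlgorithm algorithm in algorithms)
        {
            using (algorithm)
            {
                // Hash an empty input just to show the length of the output.
                byte[] hash = algorithm.ComputeHash(new byte[0]);
                Console.WriteLine("{0}: HashSize = {1} bits, output = {2} bytes",
                    algorithm.GetType().Name, HashSizeInBits(algorithm), hash.Length);
            }
        }
    }
}
```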

The HashAlgorithm Base Class Functionality                                        
•        Here is a snapshot of some of the core members of HashAlgorithm:
•        Consult the .NET Framework SDK documentation for full details.
public abstract class System.Security.Cryptography.HashAlgorithm :
    object, ICryptoTransform, IDisposable
{
    // Allow you to get basic information on the hash algorithm type.
    public virtual bool CanReuseTransform { get; }
    public virtual bool CanTransformMultipleBlocks { get; }
    public virtual byte[] Hash { get; }
    public virtual int HashSize { get; }
    public virtual int InputBlockSize { get; }
    public virtual int OutputBlockSize { get; }

    // Hash code computation functions.
    public byte[] ComputeHash(byte[] buffer);
    public byte[] ComputeHash(byte[] buffer, int offset, int count);
    public byte[] ComputeHash(System.IO.Stream inputStream);

    // Hash code creation functions.
    public static HashAlgorithm Create();
    public static HashAlgorithm Create(string hashName);
    ...
}
•        Like any base class, HashAlgorithm provides a set of common members which provide a polymorphic
interface to derived types.
•        The two most immediately useful would be the following methods.
Member of HashAlgorithm    Meaning in Life
Create()                   Returns a specific hash code class by name.
ComputeHash()              Creates a hash code from an array of bytes, a portion of a byte array or a data stream.
                           The return value is (another) array of bytes representing the hash code value itself.
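The three ComputeHash() overloads can be compared side by side. In this sketch the hasher is obtained from SHA256.Create() (an assumed choice of algorithm) and held in a HashAlgorithm variable to show the polymorphic interface; on the .NET Framework, HashAlgorithm.Create("SHA256") is the by-name equivalent, although recent .NET releases flag the name-based factory as obsolete. ComputeHashDemo and OverloadsAgree are illustrative names.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class ComputeHashDemo
{
    // Hashes the same data through all three ComputeHash() overloads
    // and reports whether the results are identical.
    public static bool OverloadsAgree(string text)
    {
        byte[] data = Encoding.UTF8.GetBytes(text);

        using (HashAlgorithm hasher = SHA256.Create())
        using (MemoryStream stream = new MemoryStream(data))
        {
            byte[] fromArray = hasher.ComputeHash(data);                 // whole array
            byte[] fromRange = hasher.ComputeHash(data, 0, data.Length); // portion of an array
            byte[] fromStream = hasher.ComputeHash(stream);              // data stream

            string expected = BitConverter.ToString(fromArray);
            return expected == BitConverter.ToString(fromRange)
                && expected == BitConverter.ToString(fromStream);
        }
    }

    public static void Main()
    {
        Console.WriteLine(OverloadsAgree("Hello")); // prints True
    }
}
```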
HashAlgorithm Derived Types                                                                
•        The System.Security.Cryptography.HashAlgorithm type is the abstract base class for all hashing
algorithms.
•        Each specific hash algorithm gains additional functionality from an associated abstract base class.
•        The concrete types derive from these ‘hash algorithm-centric’ abstract base classes.
•        This architecture allows for the possibility of multiple implementations of the same hash algorithm in the
future.
•        In the class hierarchy, these ‘intermediate’ abstract classes (SHA1 and SHA256, for example) sit between
HashAlgorithm and the concrete types.
•        Concrete types with the ‘-Managed’ suffix are written entirely in managed code, and do not require
any Win32 API calls.
•        Concrete types with the ‘-CryptoServiceProvider’ suffix rely on the Win32 Crypto API libraries.
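The layering can be inspected with a little reflection. The sketch below uses SHA256 as the example; the name of the concrete type returned by SHA256.Create() depends on the runtime, so only the relationships are checked. HierarchyDemo and its methods are illustrative names.

```csharp
using System;
using System.Security.Cryptography;

public static class HierarchyDemo
{
    // True if SHA256 is an abstract 'intermediate' class deriving from HashAlgorithm.
    public static bool IsIntermediateAbstractClass()
    {
        return typeof(SHA256).IsAbstract
            && typeof(HashAlgorithm).IsAssignableFrom(typeof(SHA256));
    }

    // True if the object handed back by SHA256.Create() is a concrete SHA256 subtype.
    public static bool CreateReturnsConcreteSubtype()
    {
        using (SHA256 sha = SHA256.Create())
        {
            Type concrete = sha.GetType();
            return !concrete.IsAbstract && typeof(SHA256).IsAssignableFrom(concrete);
        }
    }

    public static void Main()
    {
        using (SHA256 sha = SHA256.Create())
        {
            Console.WriteLine("Concrete type: {0}", sha.GetType().Name);
        }
        Console.WriteLine("SHA256 is an intermediate abstract class: {0}", IsIntermediateAbstractClass());
        Console.WriteLine("Create() returns a concrete subtype: {0}", CreateReturnsConcreteSubtype());
    }
}
```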
Copyright (c) 2008.  Intertech, Inc. All Rights Reserved.  This information is to be used exclusively as an
online learning aid.  Any attempts to copy, reproduce, or use for training is strictly prohibited.