Go Back

Hash Functions

Topics

Def. Hash Function

A hash function is a mapping H such that:

  1. H maps inputs of arbitrary lengths to outputs of length n, where n is fixed.
    Mathematically: H: {0,1}* → {0,1}n
    • In general, H: S → T where |S| > |T|.
  2. H(x) can be efficiently computed for all x ∈ {0,1}*.

Security Requirements

Notes: Preimage resistant hash function is called one-way hash function (OWHF), and preimage/2nd preimage/collision resistant hash function is called cryptographic hash function.

Applications of Hash Functions

  1. Password protection on a multi-user computer system
    • Server stores (uid, H(passwd)) in a password file.
    • If an attacker gets the file, she does not learn any passwords.
    • Requires preimage resistance.
  2. Modification Detection Codes (MDC)
    • To ensure that a message m is not modified by unauthorized means, one computes H(m) and protects H(m) from unauthorized modification. (e.g. virus protection).
  3. Message digests for digital signature schemes
    • For reasons of efficiency, instead of signing a (long) message, the (much shorter) message digest is signed.
    • Requires cryptographic hash function.
    • Why collision resistance?
      • Suppose Alice find x1, x2 with H(x1) = H(x2) and x1 != x2.
      • Then Alice signs x1 and later claim to have signed x2.
  4. Message Authentication Codes (MAC)
    • Provides data integrity and data origin authentication.
  5. Pseudorandom bit generation
    • Distilling random bits s from several 'random' sources x1, x2, ..., xt
    • Outputs s = H(x1, x2, ..., xt)
    • Used in OpenSSL and /dev/random
  6. Key derivation function (KDF)
    • Deriving a cryptographic key from a shared secret key.
Remarks.

General Attacks

For finding preimages

For finding collision

Pollard's rho algorithm for finding collision

Image Source: https://link.springer.com/chapter/10.1007/978-981-99-1588-0_4/figures/1
Problem: naive implementation needs to store all xi values to know when it loops.

Floyd's Cycling-Finding algorithm for finding collision

VW Parallel Collision Search

Image Source: https://people.scs.carleton.ca/~paulv/papers/JoC97.pdf

Hash Function Construction

1. The Davies-Meyer Construction

Idea: Build a hash function from block cipher.

2. The Merkle-Damgard Construction

MDx and SHAx families of Hash Functions

MDx-Family

SHA-1 and SHA-2

General Structure of MDx and SHAx (except for SHA-3) hash functions

possibly add SHA-1 algorithm here

SHA-3

High Level Description of SHA-3

Remarks. NIST's Policy on Hash Functions (Dec, 2022)

Attacks on MDx, SHA-1 and SHA-2

Generic Attacks

Non-generic Attacks

Remarks.

Wang's Collision Finding Attack on SHA-1

Remark. Similarly, Wang's collision finding attack on MD5 work where now she finds two (different) 2-block messages x=(x1,x2) and y=(y1,y2) such that F(I,x)=F(I,y). It takes about 239 steps and still essentially meaningless due to limited control on x and y.

Exploit a Single Hash Function

Extending a Collision

Exploiting Wang's Collision