Linux random.c: The Real Meaning of “Kernel Randomness”

Modern random.c treats the irregular timing and unpredictable scheduling/IO jitter reflected in kernel-observed values (cycle counters, timestamps, interrupt and device event data) as entropy signals.

The design goal is not “perfect theoretical randomness per source.” It is to build an internal state that an attacker cannot feasibly predict, then expand that state into high-volume random bytes using cryptography.

1. The 3-Stage Pipeline

The RNG flow can be understood as three layers:

  • Stage 1 — Entropy Collection: capture unpredictable physical/system timing noise (interrupts, input, disk, boot, VM, CPU RNG).
  • Stage 2 — Entropy Mixing: accumulate and diffuse inputs in a cryptographic pool (BLAKE2s-based input_pool).
  • Stage 3 — Expansion: derive fresh keys and generate output bytes using ChaCha20 CRNG, with fast key erasure.

2. What the Kernel’s “Random Values” Actually Are

These values appear repeatedly across random.c. They are not random in a pure mathematical sense, but are treated as entropy sources because they encode real-world jitter.

2.1 Timing / cycle sources (deterministic counters, unpredictable timing)

  • random_get_entropy(): commonly derived from jiffies, high-res timers, and CPU cycle counters (TSC).
  • Deterministic in theory, but practically hard to replay exactly due to interrupts, contention, scheduling, power states, cache effects, etc.

2.2 Hardware RNG (near-TRNG or high-quality DRBG output)

  • arch_get_random_seed_longs(), arch_get_random_longs(): RDSEED/RDRAND (Intel/AMD), ARM RNG variants.
  • Often based on physical noise sampling + internal conditioning/DRBG.
  • Never the sole trust anchor: whether its output is credited is controlled via random.trust_cpu, and it is still mixed with other sources.

2.3 Interrupt context signals

  • add_interrupt_randomness(int irq) typically combines:
    • random_get_entropy() (timing)
    • instruction pointer (instruction_pointer(regs) or _RET_IP_)
    • irq number
  • These are pushed through fast_mix into a per-CPU “fast pool” (SipHash-like state).

2.4 Timer / input / disk event timing

  • add_timer_randomness(): uses first-, second-, and third-order timing differences (Δ, Δ², Δ³, not powers of Δ) to estimate entropy conservatively.
  • add_input_randomness(): keyboard/mouse event timing + values → timer randomness logic.
  • add_disk_randomness(): disk IO timing + device identifiers.

2.5 Boot + VM + power events

  • init_utsname(), kernel command line strings, bootloader seed (add_bootloader_randomness), VM unique ID (add_vmfork_randomness).
  • suspend/resume timestamps: ktime_get, ktime_get_boottime, ktime_get_real, etc.

3. Init State Management: EMPTY → EARLY → READY

The kernel tracks whether the CRNG is safe for cryptographic use. The exact names vary by version, but conceptually:

  • CRNG_EMPTY: essentially not seeded.
  • CRNG_EARLY: some seed material exists (boot phase minimum).
  • CRNG_READY: enough entropy has been credited for strong security guarantees.

The key idea: other kernel code can check readiness via macros/functions like crng_is_ready() / rng_is_initialized(), and can block via wait_for_random_bytes() when required.

Key functions

  • wait_for_random_bytes() — blocks callers until READY (wakes on crng_init_wait queue).
  • try_to_generate_entropy() — may be used to encourage additional collection while waiting.
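
The readiness ladder can be sketched as a small state machine. This is an illustration under stated assumptions, not the kernel's code: the thresholds (128 credited bits for EARLY, 256 for READY) and the names rng_sketch, credit_bits_sketch, and rng_is_ready_sketch are chosen for the example.

```c
/* Illustrative thresholds (assumptions): a 256-bit pool, EARLY at half. */
enum { POOL_READY_BITS = 256, POOL_EARLY_BITS = POOL_READY_BITS / 2 };

enum crng_state { CRNG_EMPTY = 0, CRNG_EARLY = 1, CRNG_READY = 2 };

struct rng_sketch {
    unsigned int init_bits;   /* credited entropy, saturates at READY */
    enum crng_state state;    /* only ever moves forward */
};

/*
 * Credit `bits` of estimated entropy and advance the state when a
 * threshold is crossed. Returns the state after crediting.
 */
static enum crng_state credit_bits_sketch(struct rng_sketch *r,
                                          unsigned int bits)
{
    unsigned int total;

    if (bits > POOL_READY_BITS)       /* clamp first to avoid overflow */
        bits = POOL_READY_BITS;
    total = r->init_bits + bits;
    if (total > POOL_READY_BITS)
        total = POOL_READY_BITS;
    r->init_bits = total;

    if (r->state < CRNG_READY && total >= POOL_READY_BITS)
        r->state = CRNG_READY;
    else if (r->state < CRNG_EARLY && total >= POOL_EARLY_BITS)
        r->state = CRNG_EARLY;
    return r->state;
}

/* Counterpart of a readiness check such as crng_is_ready(). */
static int rng_is_ready_sketch(const struct rng_sketch *r)
{
    return r->state == CRNG_READY;
}
```

A caller that needs strong guarantees would check rng_is_ready_sketch() and wait otherwise, which is the role wait_for_random_bytes() plays in the kernel.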

4. Entropy Pool: BLAKE2s-based input_pool

The modern kernel treats the entropy pool as a cryptographic mixing state: a BLAKE2s context that absorbs inputs over time.

Core structure

struct entropy_pool {
  struct blake2s_ctx hash;   // mixing state
  spinlock_t lock;           // concurrency guard
  unsigned int init_bits;    // conservative credited entropy estimate
};

Mixing APIs

  • _mix_pool_bytes() — update BLAKE2s state without locking (internal helper).
  • mix_pool_bytes() — lock + call helper (safe for concurrent sources).

5. Collection Routines: How Each Source Enters the System

5.1 Boot phase

  • random_init_early() — mixes in early-available sources (often CPU RNG if present) plus system uniqueness.
  • random_init() — later boot phase: adds real time, more timing entropy, and sets up reseed behavior.

5.2 Device / bootloader / VM hooks

  • add_device_randomness(buf, len) — mixes device identity-ish inputs (usually no entropy credit).
  • add_bootloader_randomness(buf, len) — mixes bootloader-provided seed; credit depends on trust options.
  • add_vmfork_randomness(unique_vm_id, len) — ensures cloned VMs don’t share identical RNG state; may force reseed if already READY.

5.3 Interrupt path: fast pool (performance-aware)

Interrupt context is extremely hot. The kernel avoids heavy hashing on every IRQ by using a fast pool.

  • add_interrupt_randomness(int irq) — collects timing/IP/irq signals and feeds fast_mix.
  • fast_mix() — SipHash-style fast diffusion into fast_pool.pool[].
  • mix_interrupt_randomness() — timer/worker flush: moves accumulated fast-pool state into input_pool via mix_pool_bytes(), then credits bits conservatively.
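
The interrupt-path mixing can be sketched with a SipHash-style round over four 64-bit words. The round below is the standard SipHash round; how the two inputs are injected, how the irq number is folded in, and the names sip_round, fast_mix_sketch, irq_sample, and add_interrupt_sketch are assumptions made for this example.

```c
#include <stdint.h>

static inline uint64_t rol64(uint64_t x, unsigned int r)
{
    return (x << r) | (x >> (64 - r));
}

/* One standard SipHash round over four 64-bit words. */
static void sip_round(uint64_t s[4])
{
    s[0] += s[1]; s[1] = rol64(s[1], 13); s[1] ^= s[0]; s[0] = rol64(s[0], 32);
    s[2] += s[3]; s[3] = rol64(s[3], 16); s[3] ^= s[2];
    s[0] += s[3]; s[3] = rol64(s[3], 21); s[3] ^= s[0];
    s[2] += s[1]; s[1] = rol64(s[1], 17); s[1] ^= s[2]; s[2] = rol64(s[2], 32);
}

/* Inject two words, one round each, SipHash-compression style. */
static void fast_mix_sketch(uint64_t s[4], uint64_t v1, uint64_t v2)
{
    s[3] ^= v1; sip_round(s); s[0] ^= v1;
    s[3] ^= v2; sip_round(s); s[0] ^= v2;
}

/* What one interrupt contributes in this sketch. */
struct irq_sample {
    uint64_t cycles;   /* timing read, e.g. a cycle counter */
    uint64_t ip;       /* interrupted instruction pointer */
    int irq;           /* irq number */
};

/* Cheap per-interrupt step: no hashing, just a couple of rounds. */
static void add_interrupt_sketch(uint64_t pool[4], const struct irq_sample *e)
{
    /* Folding irq into the ip word is an assumption of this sketch. */
    fast_mix_sketch(pool, e->cycles,
                    e->ip ^ ((uint64_t)(unsigned int)e->irq << 32));
}
```

The point of the design is cost: each interrupt pays for two cheap rounds, and the expensive BLAKE2s update happens later, off the hot path, when the accumulated words are flushed into the input pool.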

5.4 Timer / input / disk

  • add_timer_randomness(state, num) — computes first-, second-, and third-order timing differences (Δ, Δ², Δ³), takes the smallest magnitude, and derives a capped, conservative entropy estimate from it.
  • add_input_randomness(type, code, value) — calls timer randomness on meaningful input changes.
  • add_disk_randomness(disk) — uses disk IO timing + device identifiers (SSD may contribute less true jitter).
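
The delta-based estimate can be sketched as follows. Δ, Δ², and Δ³ are successive differences of event timestamps, and the credit is taken from the smallest magnitude. The 11-bit cap, the one-bit shift before counting bits, and the names timer_state_sketch, fls_sketch, and timer_credit_sketch are assumptions of this example rather than a statement of what any particular kernel version does.

```c
#include <stdlib.h>   /* labs */

struct timer_state_sketch {
    long last_time;     /* previous event timestamp */
    long last_delta;    /* previous first-order difference */
    long last_delta2;   /* previous second-order difference */
};

/* Position of the highest set bit, 1-based; 0 for x == 0. */
static unsigned int fls_sketch(unsigned long x)
{
    unsigned int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Returns the number of entropy bits to credit for an event at `now`. */
static unsigned int timer_credit_sketch(struct timer_state_sketch *st, long now)
{
    long delta  = now - st->last_time;
    long delta2 = delta - st->last_delta;
    long delta3 = delta2 - st->last_delta2;
    unsigned int bits;

    /* Remember the signed differences for the next event. */
    st->last_time = now;
    st->last_delta = delta;
    st->last_delta2 = delta2;

    /* Use the smallest magnitude: the most pessimistic view. */
    delta = labs(delta);
    delta2 = labs(delta2);
    delta3 = labs(delta3);
    if (delta > delta2)
        delta = delta2;
    if (delta > delta3)
        delta = delta3;

    /* Drop one bit, count the rest, and cap the credit (assumed cap). */
    bits = fls_sketch((unsigned long)delta >> 1);
    return bits > 11 ? 11 : bits;
}
```

A perfectly periodic source shows why the higher-order differences matter: after the first event its second-order difference is zero, so it earns no credit even though each individual timestamp looks "different".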

6. extract_entropy(): HKDF-like Derivation + Pool Forward Security

extract_entropy() is the bridge from “mixed pool state” to “fresh CRNG keys.” The flow is HKDF-like in spirit: derive seed, then derive next_key, then re-key the pool state so it moves forward.

Flow:

  1. Gather supplemental bits (CPU RNG outputs + random_get_entropy()) into a local block.
  2. Lock input_pool and finalize current BLAKE2s state to obtain seed.
  3. Run BLAKE2s over the supplemental block, keyed with seed, to derive next_key.
  4. Reinitialize input_pool.hash keyed with next_key (pool state “advances”).
  5. Expand output: increment a counter in the block and hash(seed, block) to generate as many bytes as needed.
  6. Wipe temporary buffers (memzero_explicit style) to reduce key leakage risk.

The security intuition: the pool is re-keyed forward on every extraction, so an attacker who later learns the pool state cannot work backward to earlier seeds, and every output is derived through a cryptographic transform rather than copied out of the pool.
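
The numbered flow above can be sketched in a few lines of C. To stay self-contained, the example swaps keyed BLAKE2s for a toy keyed FNV-1a hash, which is NOT cryptographic and only preserves the shape of the derivation. The 64-bit pool state and the names toy_keyed_hash, pool_sketch, extract_block, wipe_sketch, and extract_entropy_sketch are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for keyed BLAKE2s: FNV-1a seeded with the key. NOT secure. */
static uint64_t toy_keyed_hash(uint64_t key, const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t h = 0xcbf29ce484222325ULL ^ key;

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Wipe through a volatile pointer so the stores are not optimized away. */
static void wipe_sketch(void *p, size_t n)
{
    volatile unsigned char *v = p;

    while (n--)
        *v++ = 0;
}

struct pool_sketch { uint64_t state; };     /* stands in for input_pool */

struct extract_block {
    uint64_t supplemental[2];   /* e.g. CPU RNG output, a timing read */
    uint64_t counter;           /* 0 for next_key, 1.. for output */
};

static void extract_entropy_sketch(struct pool_sketch *pool,
                                   const uint64_t supplemental[2],
                                   uint64_t *out, size_t nwords)
{
    struct extract_block block;
    uint64_t seed, next_key;

    /* 1. Gather supplemental bits into a local block. */
    block.supplemental[0] = supplemental[0];
    block.supplemental[1] = supplemental[1];
    block.counter = 0;

    /* 2. "Finalize" the pool to obtain the seed (here: read the state). */
    seed = pool->state;

    /* 3. Derive next_key from the block, keyed with the seed. */
    next_key = toy_keyed_hash(seed, &block, sizeof(block));

    /* 4. Re-key the pool so its state moves forward. */
    pool->state = next_key;

    /* 5. Expand: bump the counter and hash again for each output word. */
    for (size_t i = 0; i < nwords; i++) {
        block.counter++;
        out[i] = toy_keyed_hash(seed, &block, sizeof(block));
    }

    /* 6. Wipe temporaries. */
    wipe_sketch(&seed, sizeof(seed));
    wipe_sketch(&next_key, sizeof(next_key));
    wipe_sketch(&block, sizeof(block));
}
```

Because counter value 0 is reserved for next_key and outputs start at 1, the new pool key and the emitted bytes come from distinct hash inputs under the same seed.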

7. ChaCha20 CRNG: per-CPU Keys + Fast Key Erasure

The output generator is ChaCha20. Modern kernels typically pair a global base_crng with per-CPU CRNG instances for performance and scalability.

Key components

  • base_crng.key — global key material updated on reseed.
  • generation — increments on reseed so per-CPU instances can sync when needed.
  • per-CPU crngs — avoid global lock contention for frequent randomness calls.

crng_reseed()

  • Runs on a schedule (often via workqueue).
  • Calls extract_entropy(), updates base_crng.key, increments generation.
  • Transitions to READY when thresholds/credits are satisfied.
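
The generation counter is what lets per-CPU instances notice a reseed without taking a global lock on every request. A minimal sketch, with hypothetical names (base_crng_sketch, cpu_crng_sketch, reseed_sketch, sync_cpu_crng_sketch) and a plain key copy standing in for whatever derivation the kernel performs when it refreshes a per-CPU key:

```c
#include <stdint.h>
#include <string.h>

struct base_crng_sketch {
    uint8_t key[32];
    unsigned long generation;   /* bumped on every reseed */
};

struct cpu_crng_sketch {
    uint8_t key[32];
    unsigned long generation;   /* generation this key was taken from */
};

/* Install fresh key material and advertise it via the generation. */
static void reseed_sketch(struct base_crng_sketch *base,
                          const uint8_t new_key[32])
{
    memcpy(base->key, new_key, sizeof(base->key));
    base->generation++;
}

/*
 * Called before a per-CPU instance produces output. Returns 1 if the
 * per-CPU key was refreshed, 0 if it was already current.
 * Assumption: a plain copy; real code would also need locking.
 */
static int sync_cpu_crng_sketch(struct cpu_crng_sketch *cpu,
                                const struct base_crng_sketch *base)
{
    if (cpu->generation == base->generation)
        return 0;                           /* fast path: nothing to do */
    memcpy(cpu->key, base->key, sizeof(cpu->key));
    cpu->generation = base->generation;
    return 1;
}
```

The common case is a single integer comparison, which is why frequent randomness calls do not contend on the global state.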

crng_fast_key_erasure() (the “key shredder”)

Fast key erasure means: generate one ChaCha block, then immediately overwrite the key with part of that block. The key that produced earlier output no longer exists, so a later memory compromise cannot be used to reconstruct that output.

Old key  ──► ChaCha block() ──► [ new_key || output_bytes ]
             ^ overwrite key immediately (erase old key)
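
The diagram can be made concrete with a small user-space sketch. The block function below is standard ChaCha20 (20 rounds, little-endian serialization). The split of the 64-byte block into a 32-byte new key followed by up to 32 output bytes, the zeroed counter and nonce words, and the names chacha20_block_sketch and fast_key_erasure_sketch are assumptions of this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ROTL32(v, n) (((v) << (n)) | ((v) >> (32 - (n))))
#define QR(a, b, c, d) do {                   \
        a += b; d ^= a; d = ROTL32(d, 16);    \
        c += d; b ^= c; b = ROTL32(b, 12);    \
        a += b; d ^= a; d = ROTL32(d, 8);     \
        c += d; b ^= c; b = ROTL32(b, 7);     \
    } while (0)

static uint32_t load32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static void store32_le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}

/* Standard ChaCha20 block: 10 double rounds, then add the input state. */
static void chacha20_block_sketch(const uint32_t in[16], uint8_t out[64])
{
    uint32_t x[16];

    memcpy(x, in, sizeof(x));
    for (int i = 0; i < 10; i++) {
        QR(x[0], x[4], x[8],  x[12]);  QR(x[1], x[5], x[9],  x[13]);
        QR(x[2], x[6], x[10], x[14]);  QR(x[3], x[7], x[11], x[15]);
        QR(x[0], x[5], x[10], x[15]);  QR(x[1], x[6], x[11], x[12]);
        QR(x[2], x[7], x[8],  x[13]);  QR(x[3], x[4], x[9],  x[14]);
    }
    for (int i = 0; i < 16; i++)
        store32_le(out + 4 * i, x[i] + in[i]);
}

/*
 * Generate one block from `key`, replace `key` with the first 32 bytes,
 * and hand out at most the remaining 32 bytes as random data.
 */
static void fast_key_erasure_sketch(uint8_t key[32], uint8_t *out,
                                    size_t out_len)
{
    /* "expand 32-byte k"; words 12..15 (counter, nonce) stay zero. */
    uint32_t state[16] = { 0x61707865u, 0x3320646eu,
                           0x79622d32u, 0x6b206574u };
    uint8_t block[64];

    if (out_len > 32)
        out_len = 32;
    for (int i = 0; i < 8; i++)
        state[4 + i] = load32_le(key + 4 * i);

    chacha20_block_sketch(state, block);
    memcpy(key, block, 32);              /* old key is gone from here on */
    memcpy(out, block + 32, out_len);

    /* Kernel code would use memzero_explicit(); memset may be elided. */
    memset(state, 0, sizeof(state));
    memset(block, 0, sizeof(block));
}
```

Calling fast_key_erasure_sketch() twice on the same key buffer yields different output each time, because the first call has already replaced the key.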

Output APIs

  • crng_make_state() — builds a ChaCha state; syncs per-CPU key if generation changed.
  • _get_random_bytes(buf, len) — fills buffer by generating ChaCha blocks; zeroizes state.
  • get_random_bytes() — wrapper + warning logic when used too early.
  • DEFINE_BATCHED_ENTROPY / get_random_u32() — per-CPU batching for small random requests.
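
Batching for small requests can be sketched as a buffer that is refilled in bulk and handed out one word at a time. The batch size of 16 words, the multiplicative stand-in generator (NOT random, used only so the example is deterministic), and the names batch_sketch, refill_sketch, and get_u32_sketch are assumptions of this example.

```c
#include <stddef.h>
#include <stdint.h>

#define BATCH_WORDS 16   /* assumed batch size for the sketch */

struct batch_sketch {
    uint32_t words[BATCH_WORDS];
    size_t position;        /* next unused word; BATCH_WORDS means empty */
    unsigned int refills;   /* how many bulk refills have happened */
    uint32_t counter;       /* state of the stand-in generator */
};

/* Bulk refill. A real implementation would run the CRNG here. */
static void refill_sketch(struct batch_sketch *b)
{
    for (size_t i = 0; i < BATCH_WORDS; i++)
        b->words[i] = b->counter++ * 2654435761u;   /* NOT random */
    b->position = 0;
    b->refills++;
}

/* Hand out one word, refilling only when the batch is exhausted. */
static uint32_t get_u32_sketch(struct batch_sketch *b)
{
    uint32_t v;

    if (b->position >= BATCH_WORDS)
        refill_sketch(b);
    v = b->words[b->position];
    b->words[b->position++] = 0;    /* erase the value once handed out */
    return v;
}
```

The expensive generator runs once per BATCH_WORDS requests, so the amortized cost of a small request is close to a buffer read.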

8. Interfaces: /dev/random, /dev/urandom, getrandom()

A modern way to think about it: both devices draw from the same CRNG machinery, but differ in policy (blocking/readiness semantics).

  • /dev/urandom: fast stream output once the CRNG is operating.
  • /dev/random: historically stricter, blocking on entropy accounting; in recent kernels it draws from the same CRNG and differs mainly in waiting for readiness.
  • getrandom(): syscall interface that can enforce “wait until ready” behavior depending on flags/policy.
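
From user space, the getrandom() route looks roughly like this. The sketch assumes Linux with a C library that provides <sys/random.h>; the wrapper name fill_random is hypothetical. With flags set to 0 the call waits until the CRNG is ready, whereas GRND_NONBLOCK would make it fail with EAGAIN instead.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Fill buf with len bytes from the kernel CRNG.
 * Returns 0 on success, -1 on error with errno set by getrandom().
 */
static int fill_random(void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        /* flags == 0: block until the CRNG has been initialized. */
        ssize_t n = getrandom(p, len, 0);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                 /* short reads are possible: loop */
        len -= (size_t)n;
    }
    return 0;
}
```

A typical use is filling a 32-byte key buffer with fill_random(key, sizeof key) and treating a nonzero return as a hard failure.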

9. Why This Ends Up Secure (Even if Individual Sources Aren’t “Perfect Random”)

  • Multiple independent sources: IRQ timing + input + disk + boot + VM + CPU RNG.
  • Cryptographic mixing (BLAKE2s): strong diffusion/avalanche, not a fragile chaos-map property.
  • HKDF-like extraction: pool state advances (seed → next_key) rather than staying static.
  • ChaCha20 expansion: outputs are as unpredictable as the key/state.
  • Fast key erasure: limits the blast radius of a memory disclosure, because the keys that produced earlier output no longer exist in memory.
  • Conservative crediting: timing deltas and caps prevent overestimating entropy.

In short: the kernel relies on the system as a whole producing enough unpredictable jitter across many pathways, and then uses cryptography to transform that into robust random outputs.