Suffix Structures & Rolling Hash

Lesson, slides, and applied problem sets.

View Slides

Lesson

Suffix Structures & Rolling Hash

Why this module exists

Tries handle prefixes. Suffix structures handle all substrings efficiently. They power fast substring search, repeated substring detection, and pattern matching at scale.


Rolling hash (prefix hashing)

Compute a hash for every prefix so substring hashes are O(1).

For base B and mod M:

  • pref[i+1] = pref[i] * B + value(s[i])
  • hash(l..r) = pref[r+1] - pref[l] * B^(r-l+1)

Use two moduli to reduce collision risk.


Suffix array (sorted suffixes)

A suffix array is the list of suffix indices in sorted order. With LCP (longest common prefix), you can:

  • Find longest repeated substring
  • Count distinct substrings
  • Do pattern search in O(m log n)

Suffix automaton

A compact DFA that represents all substrings. Supports:

  • Distinct substring count
  • Longest repeated substring
  • Online construction in O(n)

What you will build

  • Substring equality queries with rolling hash.

Module Items