Strings & Automata

  • KMP: prefix table to skip work
  • Regex: DP, backtracking, and NFA engines
  • POSIX: leftmost-longest semantics
  • FSM: validate structured input
1 / 8

KMP intuition

  • Never backtrack in text
  • LPS table encodes pattern self-overlap
  • O(n+m) worst-case
2 / 8

Regex DP (lite)

  • dp[i][j] = match s[:i] with p[:j]
  • Handles '.' and '*'
  • Correct and predictable
3 / 8

Backtracking engines

  • Depth-first search over regex choices
  • Simple but can be exponential
  • Catastrophic example: (a+)+b
4 / 8

Thompson NFA

  • Compile regex to NFA with epsilon edges
  • Track all active states at once
  • Avoids exponential blowups
5 / 8

POSIX leftmost-longest

  • Earliest start wins
  • Among those, longest match wins
  • Alternation order does not matter
  • Example: "a|aa" on "aa" -> "aa"
6 / 8

Capstone: POSIX regex

  • Literals, ., *, +, ?, {m,n}, |, (), []
  • Anchors ^ and $
  • Return (start, end) of leftmost-longest match
7 / 8

FSM parsing

  • States, transitions, accept states
  • Ideal for numbers, dates, protocols
8 / 8
Use arrow keys or click edges to navigate. Press H to toggle help, F for fullscreen.