| Algorithm Deep Dives |
|---|
| 1. Z Algorithm |
| 2. Manacher’s Algorithm |
| 3. Gosper’s Hack Algorithm |
This is the first post in the Algorithm Deep Dives series.
I will study various algorithms and explain them one by one. Whether the series continues or quietly fades away remains to be seen, but I will do my best.
The first algorithm is the Z Algorithm.
What is the Z Algorithm?
Z Array
The Z Algorithm is a pattern search algorithm that finds a given pattern in a string. It runs in linear time with complexity O(M + N).
It works with an auxiliary array called the Z array, which stores the length of the longest prefix substring starting at each index—i.e., at String[i].
Prefix substring: a substring starting at the current index that matches the beginning of the string
Index 0 1 2 3 4 5 6
Text a a b c a a b
Z values N 1 0 0 3 1 0
For example, in the string “aabcaab”, the longest prefix substring starting at the fifth a is “aab”, because aabcaab matches the prefix.
So if Z[i] = k, then String[i .. i + k] = String[0 .. k]. The Z Algorithm uses this property to search for the pattern in linear time.
In short, think of concatenating the pattern and the text into a single string "P$T", where P is the pattern, T is the text, and $ is a character that does not appear in the prefix.
If at some index the Z array value equals the prefix length, then by the property above, a prefix substring exists at that position.
With the concept in mind, let us look at the implementation.
Implementation
fun getZArray(s: String): IntArray {
val n = s.length
// Use a Z array of length n.
val zArray = IntArray(n)
// Use two pointers left and right to represent the current prefix substring interval for linear-time construction.
var left = 0
var right = 0
// Index 0 is the whole string (the prefix), so start from 1.
for (i in 1 until n) {
// i is outside the current interval; we have no prefix substring overlapping i, so no info to reuse.
if (i > right) {
left = i
right = i
// Expand the interval while characters match (i.e. while we still have a prefix substring).
while (right < n && s[right - left] == s[right]) {
right++
}
// Set Z[i] to the length of the prefix substring (the interval length).
zArray[i] = right-- - left
} else {
// i is inside or at the right bound; we have two cases.
val k = i - left
if (zArray[k] < right - i + 1) {
// The prefix substring at k is shorter than the remaining interval, so the one at i cannot extend further; set Z[i] = Z[k].
zArray[i] = zArray[k]
} else {
// We can extend the current interval; expand it and set Z[i].
left = i
while (right < n && s[right - left] == s[right]) {
right++
}
zArray[i] = right-- - left
}
}
}
return zArray
}
You can find the code sample on GitHub.
Practice Problems
LeetCode 28. Find the Index of the First Occurrence in a String
Find the index of the first occurrence of a pattern in the text. A good introductory problem for practicing the Z Algorithm.
LeetCode 2223. Sum of Scores of Built-in Strings
Compute the sum of the lengths of all common prefixes. It is rated Hard, but with an understanding of the Z Algorithm you can solve it quite easily.