Z Algorithm
Algorithm Deep Dives
1. Z Algorithm
2. Manacher’s Algorithm
3. Gosper’s Hack Algorithm


This is the first post in the Algorithm Deep Dives series.

I will study various algorithms and explain them one by one. Whether the series continues or quietly fades away remains to be seen, but I will do my best.

The first algorithm is the Z Algorithm.


What is the Z Algorithm?

Z Algorithm Explain Image Z Array

The Z Algorithm is a pattern search algorithm that finds a given pattern in a string. It runs in linear time with complexity O(M + N).

It works with an auxiliary array called the Z array, which stores the length of the longest prefix substring starting at each index—i.e., at String[i].

Prefix substring: a substring starting at the current index that matches the beginning of the string

Index            0   1   2   3   4   5   6 
Text             a   a   b   c   a   a   b
Z values         N   1   0   0   3   1   0

For example, in the string “aabcaab”, the longest prefix substring starting at the fifth a is “aab”, because aabcaab matches the prefix.

So if Z[i] = k, then String[i .. i + k] = String[0 .. k]. The Z Algorithm uses this property to search for the pattern in linear time.

In short, think of concatenating the pattern and the text into a single string "P$T", where P is the pattern, T is the text, and $ is a character that does not appear in the prefix.

If at some index the Z array value equals the prefix length, then by the property above, a prefix substring exists at that position.

With the concept in mind, let us look at the implementation.


Implementation

fun getZArray(s: String): IntArray {
    val n = s.length

    // Use a Z array of length n.
    val zArray = IntArray(n)

    // Use two pointers left and right to represent the current prefix substring interval for linear-time construction.
    var left = 0
    var right = 0

    // Index 0 is the whole string (the prefix), so start from 1.
    for (i in 1 until n) {
        
        // i is outside the current interval; we have no prefix substring overlapping i, so no info to reuse.
        if (i > right) {
            left = i
            right = i

            // Expand the interval while characters match (i.e. while we still have a prefix substring).
            while (right < n && s[right - left] == s[right]) {
                right++
            }

            // Set Z[i] to the length of the prefix substring (the interval length).
            zArray[i] = right-- - left
        } else {
            // i is inside or at the right bound; we have two cases.
            val k = i - left

            if (zArray[k] < right - i + 1) {
                // The prefix substring at k is shorter than the remaining interval, so the one at i cannot extend further; set Z[i] = Z[k].
                zArray[i] = zArray[k]
            } else {
                // We can extend the current interval; expand it and set Z[i].
                left = i

                while (right < n && s[right - left] == s[right]) {
                    right++
                }

                zArray[i] = right-- - left
            }
        }
    }

    return zArray
}

You can find the code sample on GitHub.


Practice Problems

LeetCode 28. Find the Index of the First Occurrence in a String

Find the index of the first occurrence of a pattern in the text. A good introductory problem for practicing the Z Algorithm.

LeetCode 2223. Sum of Scores of Built-in Strings

Compute the sum of the lengths of all common prefixes. It is rated Hard, but with an understanding of the Z Algorithm you can solve it quite easily.