Understand the working of KMP Algorithm
Introduction
Ever needed to find a word while using a text editor? This situation is probably common, and you have probably experienced it frequently. Of course, you don't automatically scan the entire article; instead, you use the find function on your computer to have it do it for you.
Algorithms for "string matching" or "pattern matching" are applied in this situation. The KMP algorithm falls into the category of “pattern matching” algorithms.
We need to determine whether string a matches all or part of string b when we have two strings, a representing the pattern and b representing the string in which we want to find the pattern. In essence, we must determine whether string an is a component of string b.
From left to right, this algorithm compares each character individually. However, it uses a preprocessed table called the "Prefix Table" to avoid character comparison during matching whenever a mismatch occurs.
Working of KMP
String matching algorithms are another name for the pattern search algorithms. When searching a string within another string, these algorithms come in very handy. Write a programme with the function PatternSearch(char pat[, char str[]) that prints all instances of pat[] in str[ given a text string str[0..n-1] and a pattern pat[0..m-1]. Considering n > m
Examples:
Input: str[] = "BOARD INFINITY
pat[] = "INF"
Output: Pattern found at index 6.
Components of the KMP algorithm:
a. Prefix.
b. Suffix.
c. LPS table : Used to detect whether the Longest Proper Prefix is also a suffix.
Steps
LPS ← array [size = pattern length] |
Code
void KMP(char* pat, char* str) { |
Complexity Analysis
Since we have an outer and an inner loop suggests that time complexity would be constant time O(mn). However, if we count it in a different way, we can see that it is always less than that. The idea is that we perform one comparison T[i+j]==P[j] for each iteration of the inner loop. By keeping track of how many comparisons we make, we can determine how long the algorithm took overall.
Time complexity of the complete algorithm is O(m + n).