Longest common subsequence (LCS) is a classic problem of dynamic programming. The problem is to find out the longest subsequence which presents in the two given sequences. A subsequence is a sequence that appears in the same relative order, but not necessarily contiguous.
The solution of LCS is delicate, and the idea can be applied to many other problems. I will walk through these problems in this post.
I. Solution of LCS
The key is to break down the problem into a collection of simpler subproblems.1
2
3
4
5
6
7
8Let mark[i][j] represent the length of the LCS of s1(0..i-1) and s2(0..j-1).
When i = 0 && j = 0, mark[i][j] = 0, because the LCS of "" and "" is "";
When i = 0 && j > 0, mark[i][j] = 0, because the LCS of "" and s2(0..j-1) is "";
When i > 0 && j = 0, mark[i][j] = 0, because the LCS of s1(0..i-1) and "" is "";
When i > 0 && j > 0,
if s1[i-1] = s2[j-1], mark[i][j] = mark[i-1][j-1] + 1;
otherwise, mark[i][j] = max{mark[i-1][j], mark[i][j-1]};
If we need to return the subsequence, we can use a matrix to record the path we take in the if/else above and then recover the subsequence according to it.1
2
3
4When i > 0 && j > 0,
if s1[i-1] = s2[j-1], mark[i][j] = mark[i-1][j-1] + 1, trace[i][j] = 0;
if mark[i-1][j] >= mark[i][j-1], mark[i][j] = mark[i-1][j], trace[i][j] = 1;
otherwise, mark[i][j] = mark[i][j-1], trace[i][j] = 2;
1 | int i = s1.length(); |
II. Applying the idea of LCS
i. Longest Common Substring
It is a bit different from Longest Common Subsequence. Substring requires to be contiguous. But the solution is similar. Note that this is not a problem of dp, the answer is not mark[s1.length()][s2.length()], but the max value in the matrix.
1 | Let mark[i][j] represent the length of the LCSubstring of s1(0..i-1) and s2(0..j-1). |
A similar problem on leetcode: 718. Maximum Length of Repeated Subarray.
ii. Interleaving String
97. Interleaving String: given s1, s2, s3, find whether s3 is formed by the interleaving of s1 and s2.
We can directly apply the idea of LCS to this problem. Just be aware of the index.
1 | Let mark[i][j] represent if the interleaving of s1(0..i-1) and s2(0..j-1) can form s3(0..i+j-1). |
III. Matching
i. Regular expression matching
10. Regular Expression Matching: decide if the pattern p matches the string s. ‘.’ matches any single character; ‘*’ matches zero or more of the preceding element.
1 | Let mark[i][j] be true if p(0..j-1) matches s(0..i-1); otherwise it is false. |
ii. Wilcard matching
44. Wildcard Matching: decide if the pattern p matches the string s. ‘?’ matches any single character; ‘*’ matches any sequence of characters (including the empty sequance).
1 | Let mark[i][j] be true if p(0..j-1) matches s(0..i-1); otherwise it is false. |
IV. Edit distance
i. Delete distance
583. Delete Operation for Two Strings: given two words word1 and word2, find the minimum number of steps required to make word1 and word2 the same, where in each step you can delete one character in either string.
The idea is to find the LCS and remove the rest characters. Or we can directly apply the idea of LCS to this problem.
1 | Let mark[i][j] represent the number of characters to be deleted in order to make s1(0..i-1) and s2(0..j-1) the same. |
ii. Edit distance
72. Edit Distance: given two words word1 and word2, find the minimum number of steps required to make word1 and word2 the same, where in each step you can insert/delete/edit one character in either string.
The solution is to apply the idea of LCS to this problem. As the insert operation is the same as the delete operation, we can just use either one.
1 | Let mark[i][j] represent the number of characters to be deleted/edited in order to make s1(0..i-1) and s2(0..j-1) the same. |
V. Palindrome
i. Longest Palindromic Subsequence
516. Longest Palindromic Subsequence: find the longest palindromic subsequence’s length in the given string s.
One solution is to reverse string s and find the LCS of s and the reversed string. Time complexity is O(s.length() * s.length()).
Another solution is to apply the idea of LCS.1
2
3
4
5
6
7
8Let mark[i][j] represents the length of the longest palindromic subsequence in s(i..j).
When i = j, mark[i][j] = 1;
When i != j && s[i] = s[j]
if i+1 = j, mark[i][j] = 2;
otherwise, mark[i][j] = mark[i+1][j-1] + 2;
When i != j && s[i] != s[j]
mark[i][j] = max{mark[i+1][j], mark[i][j-1]};
Time complexity is: O(s.length() * s.length()).
ii. Longest Palindromic Substring
5. Longest Palindromic Substring: find the longest palindromic substring in the given string s.
A straight-forward solution is to traverse the array and check if an element is the center of a palindrome.1
2
3
4
5
6
7
8
9
10
11
12for (int i = 0; i < s.length(); i++) {
int left = i - 1, right = i + 1;
while (left >= 0 && right < s.length() && s.charAt(left) == s.charAt(right)) {
left--; right++;
}
//...
int left = i, right = i + 1;
while (left >= 0 && right < s.length() && s.charAt(left) == s.charAt(right)) {
left--; right++;
}
//...
}
Time complexity: O(s.length() * s.length()).
Another solution is to apply the idea of LCS.1
2
3
4
5
6
7Let mark[i][j] represents the length of the longest palindromic substring in s(i..j).
When i = j, mark[i][j] = 1;
When i != j && s[i] = s[j]
if i+1 = j, mark[i][j] = 2;
if mark[i+1][j-1] = j-i-1, mark[i][j] = mark[i+1][j-1] + 2;
Otherwise, mark[i][j] = 0;
Time complexity is: O(s.length() * s.length()).
Note that the trick that reverses the string and then finds the longest common substring does not work here. This is a counterexample: “abcdkecba”.