10.3. Longest Common Subsequence
This section discusses recursive and dynamic programming ways of finding longest common subsequence.
A subsequence is a sequence that can be derived from another sequence by deleting some or no elements without changing the order of the remaining elements.
Given a string "ABCDE"
Substring: {"A", "BC", "CDE", ...}
Subsequence: {all substrings, "AC", "ACE", ...}
Not subsequence: {"BA", "DAB", ...}
Longest common subsequence
The longest subsequence commonly shared by multiple strings.
e.g., “baal” is a LCS of “bilabial” and “balaclava”.
Can there be more than one longest common subsequence?
Application
Find the longest common subsequence in DNA (e.g.,
GAATGTCCTTTCTCTAAGTCCTAAG
).
Abstraction
Let us create the abstract class LCS
:
Recursion
Let us create the LCSRecursive
inheriting LCS
:
L4
: no character is left to compare for either string.L5
: when two characters match, move ontoc[:i-1]
andd[:j-1]
.L7
: gets the longest common subsequence betweenc[:i-1]
andd[:j]
.L8
: gets the longest common subsequence betweenc[:i]
andd[:j-1]
.L9
: returns the longest subsequence.
The followings demonstrate a recursive way of finding a LCS between two strings:
Dynamic Programming
Let us create the LCSDynamic
class inheriting LCS
.
L4
: creates a dynamic table and passes it to the solver.L12
: the dynamic table is pre-populated before any recursive calls.
The following show the dynamic table populated by the previous example:
Let us define the solve()
method:
L2
: no character is left to compare for either string.L3
: when two characters match, move ontoc[:i-1]
andd[:j-1]
.L5
: gets the longest common subsequence betweenc[:i]
andd[:j-1]
.L6
: gets the longest common subsequence betweenc[:i-1]
andd[:j]
.L9
: returns the longest subsequence by looking up the values in the dynamic table.
The followings demonstrate how to find a LCS using the dynamic table:
Last updated