This section discusses recursive and dynamic programming ways of finding longest common subsequence.
A subsequence is a sequence that can be derived from another sequence by deleting some or no elements without changing the order of the remaining elements.
Given a string "ABCDE"
Substring: {"A", "BC", "CDE", ...}
Subsequence: {all substrings, "AC", "ACE", ...}
Not subsequence: {"BA", "DAB", ...}
Longest common subsequence
The longest subsequence commonly shared by multiple strings.
e.g., “baal” is a LCS of “bilabial” and “balaclava”.
Can there be more than one longest common subsequence?
Application
Find the longest common subsequence in DNA (e.g., GAATGTCCTTTCTCTAAGTCCTAAG).
publicabstractclassLCS { /** * @param a the first string. * @param b the second string. * @return a longest common sequence of the two specific strings. */publicStringsolve(String a,String b) {returnsolve(a.toCharArray(),b.toCharArray(),a.length() -1,b.length() -1); } /** * @param c the first array of characters. * @param d the second array of characters. * @param i the index of the character in {@code c} to be compared. * @param j the index of the character in {@code d} to be compared. * @return a longest common sequence of the two specific strings. */protectedabstractStringsolve(char[] c,char[] d,int i,int j);}
L4: no character is left to compare for either string.
L5: when two characters match, move onto c[:i-1] and d[:j-1].
L7: gets the longest common subsequence between c[:i-1] and d[:j].
L8: gets the longest common subsequence between c[:i] and d[:j-1].
L9: returns the longest subsequence.
The followings demonstrate a recursive way of finding a LCS between two strings:
Dynamic Programming
Let us create the LCSDynamic class inheriting LCS.
publicclassLCSDynamicextendsLCS { @OverrideprotectedStringsolve(char[] c,char[] d,int i,int j) {returnsolve(c, d, i, j, createTable(c, d)); } /** * @param c the first string. * @param d the second string. * @return the dynamic table populated by estimating the # of LCSs in the grid of the two specific strings. */protectedint[][] createTable(char[] c,char[] d) {finalint N =c.length, M =d.length;int[][] table =newint[N][M];for (int i =1; i < N; i++)for (int j =1; j < M; j++) table[i][j] = (c[i] == d[j]) ? table[i -1][j -1] +1:Math.max(table[i -1][j], table[i][j -1]);return table; }}
L4: creates a dynamic table and passes it to the solver.
L12: the dynamic table is pre-populated before any recursive calls.
The following show the dynamic table populated by the previous example:
Let us define the solve() method:
protectedStringsolve(char[] c,char[] d,int i,int j,int[][] table) {if (i <0|| j <0) return"";if (c[i] == d[j]) returnsolve(c, d, i -1, j -1, table)+ c[i];if (i ==0) returnsolve(c, d, i, j -1, table);if (j ==0) returnsolve(c, d, i -1, j, table);return (table[i -1][j] > table[i][j -1]) ?solve(c, d, i -1, j, table):solve(c, d, i, j -1, table);}
L2: no character is left to compare for either string.
L3: when two characters match, move onto c[:i-1] and d[:j-1].
L5: gets the longest common subsequence between c[:i] and d[:j-1].
L6: gets the longest common subsequence between c[:i-1] and d[:j].
L9: returns the longest subsequence by looking up the values in the dynamic table.
The followings demonstrate how to find a LCS using the dynamic table: