10.3. Longest Common Subsequence

This section discusses recursive and dynamic programming ways of finding longest common subsequence.

A subsequence is a sequence that can be derived from another sequence by deleting some or no elements without changing the order of the remaining elements.

Given a string "ABCDE"

  • Substring: {"A", "BC", "CDE", ...}

  • Subsequence: {all substrings, "AC", "ACE", ...}

  • Not subsequence: {"BA", "DAB", ...}

Longest common subsequence

  • The longest subsequence commonly shared by multiple strings.

  • e.g., “baal” is a LCS of “bilabial” and “balaclava”.

Can there be more than one longest common subsequence?

Application

  • Find the longest common subsequence in DNA (e.g., GAATGTCCTTTCTCTAAGTCCTAAG).

Abstraction

Let us create the abstract class LCS:

public abstract class LCS {
    /**
     * @param a the first string.
     * @param b the second string.
     * @return a longest common sequence of the two specific strings.
     */
    public String solve(String a, String b) {
        return solve(a.toCharArray(), b.toCharArray(), a.length() - 1, b.length() - 1);
    }

    /**
     * @param c the first array of characters.
     * @param d the second array of characters.
     * @param i the index of the character in {@code c} to be compared.
     * @param j the index of the character in {@code d} to be compared.
     * @return a longest common sequence of the two specific strings.
     */
    protected abstract String solve(char[] c, char[] d, int i, int j);
}

Recursion

Let us create the LCSRecursive inheriting LCS:

  • L4: no character is left to compare for either string.

  • L5: when two characters match, move onto c[:i-1] and d[:j-1].

  • L7: gets the longest common subsequence between c[:i-1] and d[:j].

  • L8: gets the longest common subsequence between c[:i] and d[:j-1].

  • L9: returns the longest subsequence.

The followings demonstrate a recursive way of finding a LCS between two strings:

Dynamic Programming

Let us create the LCSDynamic class inheriting LCS.

  • L4: creates a dynamic table and passes it to the solver.

  • L12: the dynamic table is pre-populated before any recursive calls.

The following show the dynamic table populated by the previous example:

Let us define the solve() method:

  • L2: no character is left to compare for either string.

  • L3: when two characters match, move onto c[:i-1] and d[:j-1].

  • L5: gets the longest common subsequence between c[:i] and d[:j-1].

  • L6: gets the longest common subsequence between c[:i-1] and d[:j].

  • L9: returns the longest subsequence by looking up the values in the dynamic table.

The followings demonstrate how to find a LCS using the dynamic table:

Last updated

Was this helpful?