Data Structures and Algorithms in Java


3.2. Comparison-based Sort

Selection sort, heap sort, insertion sort, shell sort.


Selection-based Sort

Selection-based sorting algorithms take the following steps:

  • For each key $A_i$ where $|A| = n$ and $i \in [n, 0)$:

    • Search the maximum key $A_m$ where $m \in [1, i)$.

    • Swap $A_i$ and $A_m$.

The complexities differ depending on the search algorithm:

| Algorithm | Search | Compare | Swap |
| --- | --- | --- | --- |
| Selection Sort | $O(n)$ | $O(n^2)$ | $O(n)$ |
| Heap Sort | $O(\log n)$ | $O(n \log n)$ | $O(n \log n)$ |

Selection Sort uses linear search to find the maximum (or minimum) key, whereas Heap Sort turns the input array into a heap, so the search complexity becomes $O(\log n)$ instead of $O(n)$.

Can the search be faster than $O(\log n)$?

Selection Sort

Let us create the SelectionSort class inheriting AbstractSort:

import java.util.Comparator;

public class SelectionSort<T extends Comparable<T>> extends AbstractSort<T> {
    public SelectionSort() {
        this(Comparator.naturalOrder());
    }

    public SelectionSort(Comparator<T> comparator) {
        super(comparator);
    }
}

Let us then override the sort() method:

@Override
public void sort(T[] array, final int beginIndex, final int endIndex) {
    for (int i = endIndex; i > beginIndex; i--) {
        int max = beginIndex;

        for (int j = beginIndex + 1; j < i; j++) {
            if (compareTo(array, j, max) > 0)
                max = j;
        }

        swap(array, max, i - 1);
    }
}
  • L3-12: $O(n^2)$

    • L3: iterates all keys within the range → $O(n)$.

    • L4-9: finds the index of the maximum key within the range → $O(n)$.

    • L11: swaps the maximum key with the last key in the range → $O(1)$.

How does the sort() method work with Comparator.reverseOrder()?
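
For intuition, here is a minimal usage sketch (the demo class name and sample values below are illustrative, and it assumes the SelectionSort class above with compareTo() delegating to the supplied comparator). With Comparator.reverseOrder(), the key chosen as the "maximum" in each pass is the smallest key by natural order, so the array comes out in descending order:

import java.util.Arrays;
import java.util.Comparator;

public class SelectionSortReverseDemo {
    public static void main(String[] args) {
        Integer[] keys = {3, 1, 4, 1, 5, 9, 2, 6};

        // reverseOrder() flips every comparison, so each pass moves the naturally
        // smallest remaining key to the end of the shrinking range.
        new SelectionSort<Integer>(Comparator.reverseOrder()).sort(keys, 0, keys.length);

        System.out.println(Arrays.toString(keys));   // [9, 6, 5, 4, 3, 2, 1, 1]
    }
}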

Heap Sort

Let us create the HeapSort class inheriting AbstractSort:

import java.util.Comparator;

public class HeapSort<T extends Comparable<T>> extends AbstractSort<T> {
    public HeapSort() {
        this(Comparator.naturalOrder());
    }

    public HeapSort(Comparator<T> comparator) {
        super(comparator);
    }
}

Before we override the sort() method, let us define the following helper methods:

private void sink(T[] array, int k, int beginIndex, int endIndex) {
    for (int i = getLeftChildIndex(beginIndex, k); i < endIndex; k = i, i = getLeftChildIndex(beginIndex, k)) {
        if (i + 1 < endIndex && compareTo(array, i, i + 1) < 0) i++;
        if (compareTo(array, k, i) >= 0) break;
        swap(array, k, i);
    }
}

private int getParentIndex(int beginIndex, int k) {
    return beginIndex + (k - beginIndex - 1) / 2;
}

private int getLeftChildIndex(int beginIndex, int k) {
    return beginIndex + 2 * (k - beginIndex) + 1;
}
  • L1-7: finds the right position of the k'th key by using the sink operation.

  • L9-11: finds the parent index of the k'th key given the beginning index.

  • L13-15: finds the left child index of the k'th key given the beginning index.

How is this sink() method different from the one in the BinaryHeap class?

Finally, we override the sort() method:

@Override
public void sort(T[] array, int beginIndex, int endIndex) {
    // heapify all elements
    for (int k = getParentIndex(beginIndex, endIndex - 1); k >= beginIndex; k--)
        sink(array, k, beginIndex, endIndex);

    // swap the endIndex element with the root element and sink it
    while (endIndex > beginIndex + 1) {
        swap(array, beginIndex, --endIndex);
        sink(array, beginIndex, beginIndex, endIndex);
    }
}
  • L4-5: turns the input array into a heap → $O(n \log n)$:

    • L4: iterates from the parent of the key in the ending index → $O(n)$.

    • L5: sinks the key → $O(\log n)$.

  • L8-11: selection sort → $O(n \log n)$:

    • L8: iterates all keys within the range → $O(n)$.

    • L9: swaps the maximum key with the beginning key in the range → $O(1)$.

    • L10: sinks to heapify → $O(\log n)$.

What is the worst-case scenario for Selection Sort and Heap Sort?
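
Separately from the question above, note that sort() takes an explicit range, so it can sort part of an array in place. The sketch below (demo class name and values are illustrative; it assumes the HeapSort class above) sorts only indices 2 through 6 and leaves the rest untouched:

import java.util.Arrays;

public class HeapSortRangeDemo {
    public static void main(String[] args) {
        Integer[] keys = {9, 9, 5, 2, 8, 1, 7, 0, 0};

        // Sort only the window [2, 7); getParentIndex() and getLeftChildIndex()
        // offset every computation by beginIndex, so the heap lives inside the window.
        new HeapSort<Integer>().sort(keys, 2, 7);

        System.out.println(Arrays.toString(keys));   // [9, 9, 1, 2, 5, 7, 8, 0, 0]
    }
}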

Insertion-based Sort

Insertion-based sorting algorithms take the following steps:

  • For each key $A_i$ where $|A| = n$ and $i \in [1, n)$:

    • Keep swapping $A_j$ and $A_i$ until $A_j \leq A_i$.

The complexities differ depending on the gap sequence:

| Algorithm | Sequence | Compare | Swap |
| --- | --- | --- | --- |
| Insertion Sort | Adjacent | $O(n^2)$ | $O(n^2)$ |
| Shell Sort | Knuth | $O(n^{1.5})$ | $O(n^{1.5})$ |

Insertion Sort

Let us create the InsertionSort class inheriting AbstractSort:

import java.util.Comparator;

public class InsertionSort<T extends Comparable<T>> extends AbstractSort<T> {
    public InsertionSort() {
        this(Comparator.naturalOrder());
    }

    public InsertionSort(Comparator<T> comparator) {
        super(comparator);
    }
}

Let us then define an auxiliary method, sort():

protected void sort(T[] array, int beginIndex, int endIndex, final int h) {
    int begin_h = beginIndex + h;

    for (int i = begin_h; i < endIndex; i++)
        for (int j = i; begin_h <= j && compareTo(array, j, j - h) < 0; j -= h)
            swap(array, j, j - h);
}
  • L4: iterates keys in the input array → $O(n)$.

  • L5: compares keys in the sublist of the input array → $O(\frac{n}{h})$.

  • L6: swaps the two keys.

Given the auxiliary method, the sort() method can be defined as follows where h = 1:

@Override
public void sort(T[] array, int beginIndex, int endIndex) {
    sort(array, beginIndex, endIndex, 1);
}

How many swaps does Insertion Sort make for the following array? [7, 1, 2, 3, 4, 5, 6, 14, 8, 9, 10, 11, 12, 13, 0]
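
One way to check the answer is to count swaps directly. The standalone sketch below (it does not use the course classes; the names and structure are illustrative) mirrors the auxiliary sort() with h = 1 and counts every swap, which equals the number of inversions in the input:

public class InsertionSwapCounter {
    // Mirrors the auxiliary sort() above with h = 1 and returns the number of swaps.
    static int countSwaps(int[] a) {
        int swaps = 0;
        for (int i = 1; i < a.length; i++)
            for (int j = i; j > 0 && a[j] < a[j - 1]; j--) {
                int t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;   // swap adjacent keys
                swaps++;
            }
        return swaps;
    }

    public static void main(String[] args) {
        int[] keys = {7, 1, 2, 3, 4, 5, 6, 14, 8, 9, 10, 11, 12, 13, 0};
        System.out.println(countSwaps(keys));
    }
}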

Shell Sort

Gap Sequence

Shell Sort groups keys whose gap is defined by a particular gap sequence:

  • Knuth: $(3^k - 1) / 2 \Rightarrow \{1, 4, 13, 40, 121, \ldots\}$

  • Hibbard: $2^k - 1 \Rightarrow \{1, 3, 7, 15, 31, 63, \ldots\}$

  • Pratt: $2^p \cdot 3^q \Rightarrow \{1, 2, 3, 4, 6, 8, 9, 12, \ldots\}$

  • Shell: $n / 2^k \Rightarrow \{500, 250, 125, \ldots\}$, where $n = 1000$

For the above example, by using the Hibbard sequence, it first groups keys whose gap is 7: [7, 14, 0], [1, 8], [2, 9], [3, 10], [4, 11], [5, 12], [6, 13].

It then sorts each group using Insertion Sort, which results in the following array: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

The above procedure is repeated for gaps 3 and 1; however, the array is already sorted, so no more swapping is necessary.
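
To see this concretely, the standalone sketch below (illustrative only; it re-implements one h-gapped pass rather than calling the protected auxiliary sort()) applies a single gap-7 insertion pass to the example array and prints the result:

import java.util.Arrays;

public class Gap7PassDemo {
    // One h-gapped insertion pass, equivalent to the auxiliary sort() with h = 7.
    static void gapPass(int[] a, int h) {
        for (int i = h; i < a.length; i++)
            for (int j = i; j >= h && a[j] < a[j - h]; j -= h) {
                int t = a[j]; a[j] = a[j - h]; a[j - h] = t;
            }
    }

    public static void main(String[] args) {
        int[] keys = {7, 1, 2, 3, 4, 5, 6, 14, 8, 9, 10, 11, 12, 13, 0};

        gapPass(keys, 7);   // sorts the groups {7, 14, 0}, {1, 8}, {2, 9}, ...
        System.out.println(Arrays.toString(keys));
        // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
    }
}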

Implementation

Let us create an abstract class, ShellSort, inheriting InsertionSort:

public abstract class ShellSort<T extends Comparable<T>> extends InsertionSort<T> {
    protected List<Integer> sequence;

    public ShellSort(Comparator<T> comparator) {
        super(comparator);
        sequence = new ArrayList<>();
        populateSequence(10000);
    }
}
  • L2: stores a particular gap sequence.

  • L7: pre-populates the gap sequence that can handle input sizes up to 10000.

Then, let us define two abstract methods, populateSequence() and getSequenceStartIndex():

/**
 * Populates the gap sequence with respect to the size of the list.
 * @param n the size of the list to be sorted.
 */
protected abstract void populateSequence(int n);

/**
 * @param n the size of the list to be sorted.
 * @return the starting index of the sequence with respect to the size of the list.
 */
protected abstract int getSequenceStartIndex(int n);
  • L5: populates a particular sequence for the input size n.

  • L11: returns the index of the first gap to be used given the input size n.

Let us then override the sort() method:

@Override
public void sort(T[] array, int beginIndex, int endIndex) {
    int n = endIndex - beginIndex;
    populateSequence(n);

    for (int i = getSequenceStartIndex(n); i >= 0; i--)
        sort(array, beginIndex, endIndex, sequence.get(i));
}
  • L4: should not re-populate the sequence unless it has to.

  • L6: iterates the sequence → $O(s)$, where $s$ is the number of gaps in the sequence.

  • L7: sorts the gap group by using the auxiliary method.

Knuth Sequence

Let us create the ShellSortKnuth class that populates the Knuth Sequence for ShellSort:

import java.util.Collections;
import java.util.Comparator;

public class ShellSortKnuth<T extends Comparable<T>> extends ShellSort<T> {
    public ShellSortKnuth() {
        this(Comparator.naturalOrder());
    }

    public ShellSortKnuth(Comparator<T> comparator) {
        super(comparator);
    }
}

The two abstract methods can be overridden as follows:

@Override
protected void populateSequence(int n) {
    n /= 3;

    for (int t = sequence.size() + 1; ; t++) {
        int h = (int) ((Math.pow(3, t) - 1) / 2);
        if (h <= n) sequence.add(h);
        else break;
    }
}

@Override
protected int getSequenceStartIndex(int n) {
    int index = Collections.binarySearch(sequence, n / 3);
    if (index < 0) index = -(index + 1);
    if (index == sequence.size()) index--;
    return index;
}
  • L2: populates the Knuth sequence up to the gap $\leq \frac{n}{3}$.

  • L13: returns the index of the first key $\leq \frac{n}{3}$.

Why should we use $\frac{n}{3}$ as the largest gap in the sequence?
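
For intuition, the standalone sketch below (illustrative only) lists the Knuth gaps that populateSequence() keeps for the default input size of 10000, i.e., every gap no larger than n / 3:

public class KnuthGapsDemo {
    public static void main(String[] args) {
        int n = 10000;

        // Keep every Knuth gap (3^t - 1) / 2 that does not exceed n / 3.
        for (int t = 1; ; t++) {
            int h = (int) ((Math.pow(3, t) - 1) / 2);
            if (h > n / 3) break;
            System.out.print(h + " ");   // 1 4 13 40 121 364 1093 3280
        }
        System.out.println();
    }
}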

Demonstration

https://www.slideshare.net/jchoi7s/dsajava-heap-sort

https://www.slideshare.net/jchoi7s/dsajava-shell-sort

Unit Tests & Benchmarks

The code for unit testing and benchmarking is available in SortTest.

Benchmark slides: https://www.slideshare.net/jchoi7s/dsajava-comparisonbased-sort-benchmarks
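
For a quick local comparison, here is a minimal benchmark sketch (illustrative only; it assumes the four classes above and that AbstractSort declares the sort(array, beginIndex, endIndex) method shown in the overrides):

import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class SortBenchmarkSketch {
    public static void main(String[] args) {
        final int N = 20_000;
        Integer[] original = new Random(1).ints(N, 0, N).boxed().toArray(Integer[]::new);

        List<AbstractSort<Integer>> sorters = List.of(
                new SelectionSort<>(), new HeapSort<>(),
                new InsertionSort<>(), new ShellSortKnuth<>());

        for (AbstractSort<Integer> sorter : sorters) {
            Integer[] copy = Arrays.copyOf(original, original.length);   // same input for every run
            long start = System.nanoTime();
            sorter.sort(copy, 0, copy.length);
            System.out.printf("%-16s %8.1f ms%n",
                    sorter.getClass().getSimpleName(), (System.nanoTime() - start) / 1e6);
        }
    }
}

Timing copies of the same random input keeps the comparison fair; for rigorous numbers, rely on the SortTest benchmarks above.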
