3.2. Comparison-based Sort

Selection sort, heap sort, insertion sort, shell sort.

Selection-based Sort

Selection-based sorting algorithms take the following steps:

  • For each key AiA_i where A=n|A| = n and i[n,0)i \in [n, 0):

    • Search the maximum key AmA_m where m[1,i)m \in [1, i).

    • Swap AiA_i and AmA_m

The complexities differ by different search algorithms:

Algorithm

Search

Compare

Swap

Selection Sort

O(n)O(n)

O(n2)O(n^2)

O(n)O(n)

Heap Sort

O(logn)O(\log n)

O(nlogn)O(n\log n)

O(nlogn)O(n \log n)

Selection Sort uses linear search to find the minimum (or maximum) key, whereas Heap Sort turns the input array into a heap, so the search complexity becomes O(logn)O(\log n) instead of O(n)O(n) .

Can the search be faster than O(logn)O(\log n)?

Selection Sort

Let us create the SelectionSort class inheriting AbstractSort:

public class SelectionSort<T extends Comparable<T>> extends AbstractSort<T> {
    public SelectionSort() {
        this(Comparator.naturalOrder());
    }

    public SelectionSort(Comparator<T> comparator) {
        super(comparator);
    }
}

Let us then override the sort() method:

  • L3-12: O(n2)O(n^2)

    • L3: iterates all keys within the range O(n)\rightarrow O(n).

    • L4-9: finds the index of the maximum key within the range O(n)\rightarrow O(n) .

    • L11: swaps the maximum key with the last key in the range O(1)\rightarrow O(1) .

How does the sort() method work with Comparator.reverseOrder()?

Heap Sort

Let us create the HeapSort class inheriting AbstractSort:

Before we override the sort() method, let us define the following helper methods:

  • L1-7: finds the right position of the k'th key by using the sink operation.

  • L9-11: finds the parent index of the k'th key given the beginning index.

  • L13-15: finds the left child index of the k'th key given the beginning index.

How is this sink() method different from the one in the BinaryHeap class?

Finally, we override the sort() method:

  • L4-5: turns the input array into a heap O(nlogn)\rightarrow O(n \log n):

    • L4: iterates from the parent of the key in the ending index O(n)\rightarrow O(n) .

    • L5: sinks the key O(logn)\rightarrow O(\log n).

  • L8-11: selection sort O(nlogn)\rightarrow O(n \log n):

    • L8: iterates all keys within the range O(n)\rightarrow O(n).

    • L9: swaps the maximum key with the beginning key in the range O(1)\rightarrow O(1).

    • L10: sinks to heapify O(logn)\rightarrow O(\log n).

https://www.slideshare.net/jchoi7s/dsajava-heap-sort

What is the worst-case scenario for Selection Sort and Heap Sort?

Insertion-based Sort

Insertion-based sorting algorithms take the following steps:

  • For each key AiA_i where A=n|A| = n and i[1,n)i \in [1, n) :

    • Keep swapping AjA_j and AiA_i until AjAiA_j \leq A_i.

The complexities differ by different sequences:

Algorithm

Sequence

Compare

Swap

Insertion Sort

Adjacent

O(n2)O(n^2)

O(n2)O(n^2)

Shell Sort

Knuth

O(n1.5)O(n^{1.5})

O(n1.5)O(n^{1.5})

Insertion Sort

Let us create the InsertionSort class inheriting AbstractSort:

Let us then define an auxiliary method, sort():

  • L4: iterates keys in the input array O(n)\rightarrow O(n).

  • L5: compares keys in the sublist of the input array O(nh)\rightarrow O(\frac{n}{h}).

  • L6: swaps the two keys.

Given the auxiliary method, the sort() method can be defined as follows where h = 1:

How many swaps does Insertion Sort make for the following array? [7, 1, 2, 3, 4, 5, 6, 14, 8, 9, 10, 11, 12, 13, 0]

Shell Sort

Gap Sequence

Shell Sort groups keys whose gap is defined by a particular gap sequence:

  • Knuth: (3k1)/2{1,4,13,40,121,}(3^k - 1) / 2 \Rightarrow \{1, 4, 13, 40, 121, \ldots\}

  • Hibbard: 2k1{1,3,7,15,31,63,}2^k - 1 \Rightarrow \{1, 3, 7, 15, 31, 63, \ldots\}

  • Pratt: 2p3q{1,2,3,4,6,8,9,12,}2^p \cdot 3^q \Rightarrow \{1, 2, 3, 4, 6, 8, 9, 12, \ldots\}

  • Shell: n/2k{500,250,125,}n / 2^k \Rightarrow \{500, 250, 125, \ldots\}, where n=1000n = 1000

For the above example, by using the Hibbard sequence, it first groups keys whose gap is 7:[7, 14, 0], [1, 8], [2, 9], [3, 10], [4, 11], [5, 12], [6, 13]

It then sorts each group using Insertion Sort, which results in the following array: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

The above procedure is repeated for gaps 3 and 1; however, the array is already sorted, so no more swapping is necessary.

Implementation

Let us create an abstract class, ShellSort, inheriting AbstractSort:

  • L2: stores a particular gap sequence.

  • L7: pre-populate the gap sequence that can handle the input size up to 10000.

Then, let us define two abstract methods, populateSequence() and getSequenceStartIndex():

  • L5: populates a particular sequence for the input size n.

  • L11: returns the index of the first gap to be used given the input size n.

Let us then override the sort() method:

  • L4: should not re-populate the sequence unless it has to.

  • L6: iterates the sequence O(s)\rightarrow O(s) where ss is the number of gaps in the sequence.

  • L7: sorts the gap group by using the auxiliary method.

Knuth Sequence

Let us create the ShellSortKnuth class that populates the Knuth Sequence for ShellSort:

The two abstract methods can be overridden as follows:

  • L2: populates the Knuth sequence up to the gap n3\leq \frac{n}{3}.

  • L13: returns the index of the first key n3\leq \frac{n}{3}.

Why should we use n3\frac{n}{3} as the largest gap in the sequence?

Demonstration

https://www.slideshare.net/jchoi7s/dsajava-shell-sort

Unit Tests & Benchmarks

The codes for unit testing and benchmarking are available in SortTest:

https://www.slideshare.net/jchoi7s/dsajava-comparisonbased-sort-benchmarks

Last updated

Was this helpful?