3.2. Comparison-based Sort

Selection sort, heap sort, insertion sort, shell sort.

Selection-based Sort

Selection-based sorting algorithms take the following steps:

  • For each key AiA_i where A=n|A| = n and i[n,0)i \in [n, 0):

    • Search the maximum key AmA_m where m[1,i)m \in [1, i).

    • Swap AiA_i and AmA_m

The complexities differ by different search algorithms:

Algorithm

Search

Compare

Swap

Selection Sort

O(n)O(n)

O(n2)O(n^2)

O(n)O(n)

Heap Sort

O(logn)O(\log n)

O(nlogn)O(n\log n)

O(nlogn)O(n \log n)

Selection Sort uses linear search to find the minimum (or maximum) key, whereas Heap Sort turns the input array into a heap, so the search complexity becomes O(logn)O(\log n) instead of O(n)O(n) .

Can the search be faster than O(logn)O(\log n)?

Selection Sort

Let us create the SelectionSortarrow-up-right class inheriting AbstractSort:

public class SelectionSort<T extends Comparable<T>> extends AbstractSort<T> {
    public SelectionSort() {
        this(Comparator.naturalOrder());
    }

    public SelectionSort(Comparator<T> comparator) {
        super(comparator);
    }
}

Let us then override the sort() method:

  • L3-12: O(n2)O(n^2)

    • L3: iterates all keys within the range O(n)\rightarrow O(n).

    • L4-9: finds the index of the maximum key within the range O(n)\rightarrow O(n) .

    • L11: swaps the maximum key with the last key in the range O(1)\rightarrow O(1) .

How does the sort() method work with Comparator.reverseOrder()?

Heap Sort

Let us create the HeapSortarrow-up-right class inheriting AbstractSort:

Before we override the sort() method, let us define the following helper methods:

  • L1-7: finds the right position of the k'th key by using the sink operation.

  • L9-11: finds the parent index of the k'th key given the beginning index.

  • L13-15: finds the left child index of the k'th key given the beginning index.

How is this sink() method different from the one in the BinaryHeaparrow-up-right class?

Finally, we override the sort() method:

  • L4-5: turns the input array into a heap O(nlogn)\rightarrow O(n \log n):

    • L4: iterates from the parent of the key in the ending index O(n)\rightarrow O(n) .

    • L5: sinks the key O(logn)\rightarrow O(\log n).

  • L8-11: selection sort O(nlogn)\rightarrow O(n \log n):

    • L8: iterates all keys within the range O(n)\rightarrow O(n).

    • L9: swaps the maximum key with the beginning key in the range O(1)\rightarrow O(1).

    • L10: sinks to heapify O(logn)\rightarrow O(\log n).

https://www.slideshare.net/jchoi7s/dsajava-heap-sortarrow-up-right

What is the worst-case scenario for Selection Sort and Heap Sort?

Insertion-based Sort

Insertion-based sorting algorithms take the following steps:

  • For each key AiA_i where A=n|A| = n and i[1,n)i \in [1, n) :

    • Keep swapping AjA_j and AiA_i until AjAiA_j \leq A_i.

The complexities differ by different sequences:

Algorithm

Sequence

Compare

Swap

Insertion Sort

Adjacent

O(n2)O(n^2)

O(n2)O(n^2)

Shell Sort

Knuth

O(n1.5)O(n^{1.5})

O(n1.5)O(n^{1.5})

Insertion Sort

Let us create the InsertionSortarrow-up-right class inheriting AbstractSort:

Let us then define an auxiliary method, sort():

  • L4: iterates keys in the input array O(n)\rightarrow O(n).

  • L5: compares keys in the sublist of the input array O(nh)\rightarrow O(\frac{n}{h}).

  • L6: swaps the two keys.

Given the auxiliary method, the sort() method can be defined as follows where h = 1:

How many swaps does Insertion Sort make for the following array? [7, 1, 2, 3, 4, 5, 6, 14, 8, 9, 10, 11, 12, 13, 0]

Shell Sort

Gap Sequence

Shell Sort groups keys whose gap is defined by a particular gap sequencearrow-up-right:

  • Knuth: (3k1)/2{1,4,13,40,121,}(3^k - 1) / 2 \Rightarrow \{1, 4, 13, 40, 121, \ldots\}

  • Hibbard: 2k1{1,3,7,15,31,63,}2^k - 1 \Rightarrow \{1, 3, 7, 15, 31, 63, \ldots\}

  • Pratt: 2p3q{1,2,3,4,6,8,9,12,}2^p \cdot 3^q \Rightarrow \{1, 2, 3, 4, 6, 8, 9, 12, \ldots\}

  • Shell: n/2k{500,250,125,}n / 2^k \Rightarrow \{500, 250, 125, \ldots\}, where n=1000n = 1000

For the above example, by using the Hibbard sequence, it first groups keys whose gap is 7:[7, 14, 0], [1, 8], [2, 9], [3, 10], [4, 11], [5, 12], [6, 13]

It then sorts each group using Insertion Sort, which results in the following array: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

The above procedure is repeated for gaps 3 and 1; however, the array is already sorted, so no more swapping is necessary.

Implementation

Let us create an abstract class, ShellSortarrow-up-right, inheriting AbstractSort:

  • L2: stores a particular gap sequence.

  • L7: pre-populate the gap sequence that can handle the input size up to 10000.

Then, let us define two abstract methods, populateSequence() and getSequenceStartIndex():

  • L5: populates a particular sequence for the input size n.

  • L11: returns the index of the first gap to be used given the input size n.

Let us then override the sort() method:

  • L4: should not re-populate the sequence unless it has to.

  • L6: iterates the sequence O(s)\rightarrow O(s) where ss is the number of gaps in the sequence.

  • L7: sorts the gap group by using the auxiliary method.

Knuth Sequence

Let us create the ShellSortKnutharrow-up-right class that populates the Knuth Sequence for ShellSort:

The two abstract methods can be overridden as follows:

  • L2: populates the Knuth sequence up to the gap n3\leq \frac{n}{3}.

  • L13: returns the index of the first key n3\leq \frac{n}{3}.

Why should we use n3\frac{n}{3} as the largest gap in the sequence?

Demonstration

https://www.slideshare.net/jchoi7s/dsajava-shell-sortarrow-up-right

Unit Tests & Benchmarks

The codes for unit testing and benchmarking are available in SortTestarrow-up-right:

https://www.slideshare.net/jchoi7s/dsajava-comparisonbased-sort-benchmarksarrow-up-right

Last updated

Was this helpful?