Selection-based sorting algorithms take the following steps:
For each key Ai where ∣A∣=n and i∈[n,0):
Search the maximum key Amwhere m∈[1,i).
Swap Ai and Am
The complexities differ by different search algorithms:
Algorithm
Search
Compare
Swap
Selection Sort
Heap Sort
Selection Sort uses linear search to find the minimum (or maximum) key, whereas Heap Sort turns the input array into a heap, so the search complexity becomes O(logn) instead of O(n) .
Can the search be faster than O(logn)?
Selection Sort
Let us create the class inheriting AbstractSort:
public class SelectionSort<T extends Comparable<T>> extends AbstractSort<T> {
public SelectionSort() {
this(Comparator.naturalOrder());
}
public SelectionSort(Comparator<T> comparator) {
super(comparator);
}
}
Let us then override the sort() method:
@Override
public void sort(T[] array, final int beginIndex, final int endIndex) {
for (int i = endIndex; i > beginIndex; i--) {
int max = beginIndex;
for (int j = beginIndex + 1; j < i; j++) {
if (compareTo(array, j, max) > 0)
max = j;
}
swap(array, max, i - 1);
}
}
L3-12: O(n2)
L3: iterates all keys within the range →O(n).
L4-9: finds the index of the maximum key within the range →O(n) .
L11: swaps the maximum key with the last key in the range →O(1) .
How does the sort() method work with Comparator.reverseOrder()?
Heap Sort
public class HeapSort<T extends Comparable<T>> extends AbstractSort<T> {
public HeapSort() {
this(Comparator.naturalOrder());
}
public HeapSort(Comparator<T> comparator) {
super(comparator);
}
}
Before we override the sort() method, let us define the following helper methods:
private void sink(T[] array, int k, int beginIndex, int endIndex) {
for (int i = getLeftChildIndex(beginIndex, k); i < endIndex; k = i, i = getLeftChildIndex(beginIndex, k)) {
if (i + 1 < endIndex && compareTo(array, i, i + 1) < 0) i++;
if (compareTo(array, k, i) >= 0) break;
swap(array, k, i);
}
}
private int getParentIndex(int beginIndex, int k) {
return beginIndex + (k - beginIndex - 1) / 2;
}
private int getLeftChildIndex(int beginIndex, int k) {
return beginIndex + 2 * (k - beginIndex) + 1;
}
L1-7: finds the right position of the k'th key by using the sink operation.
L9-11: finds the parent index of the k'th key given the beginning index.
L13-15: finds the left child index of the k'th key given the beginning index.
Finally, we override the sort() method:
@Override
public void sort(T[] array, int beginIndex, int endIndex) {
// heapify all elements
for (int k = getParentIndex(beginIndex, endIndex - 1); k >= beginIndex; k--)
sink(array, k, beginIndex, endIndex);
// swap the endIndex element with the root element and sink it
while (endIndex > beginIndex + 1) {
swap(array, beginIndex, --endIndex);
sink(array, beginIndex, beginIndex, endIndex);
}
}
L4-5: turns the input array into a heap →O(nlogn):
L4: iterates from the parent of the key in the ending index →O(n) .
L5: sinks the key →O(logn).
L8-11: selection sort →O(nlogn):
L8: iterates all keys within the range →O(n).
L9: swaps the maximum key with the beginning key in the range →O(1).
L10: sinks to heapify →O(logn).
What is the worst-case scenario for Selection Sort and Heap Sort?
Insertion-based Sort
Insertion-based sorting algorithms take the following steps:
For each key Ai where ∣A∣=n and i∈[1,n) :
Keep swapping Aj and Ai until Aj≤Ai.
The complexities differ by different sequences:
Algorithm
Sequence
Compare
Swap
Insertion Sort
Adjacent
Shell Sort
Knuth
Insertion Sort
public class InsertionSort<T extends Comparable<T>> extends AbstractSort<T> {
public InsertionSort() {
this(Comparator.naturalOrder());
}
public InsertionSort(Comparator<T> comparator) {
super(comparator);
}
}
Let us then define an auxiliary method, sort():
protected void sort(T[] array, int beginIndex, int endIndex, final int h) {
int begin_h = beginIndex + h;
for (int i = begin_h; i < endIndex; i++)
for (int j = i; begin_h <= j && compareTo(array, j, j - h) < 0; j -= h)
swap(array, j, j - h);
}
L4: iterates keys in the input array →O(n).
L5: compares keys in the sublist of the input array →O(hn).
L6: swaps the two keys.
Given the auxiliary method, the sort() method can be defined as follows where h = 1:
@Override
public void sort(T[] array, int beginIndex, int endIndex) {
sort(array, beginIndex, endIndex, 1);
}
How many swaps does Insertion Sort make for the following array?
[7, 1, 2, 3, 4, 5, 6, 14, 8, 9, 10, 11, 12, 13, 0]
Shell Sort
Gap Sequence
Knuth: (3k−1)/2⇒{1,4,13,40,121,…}
Hibbard: 2k−1⇒{1,3,7,15,31,63,…}
Pratt: 2p⋅3q⇒{1,2,3,4,6,8,9,12,…}
Shell: n/2k⇒{500,250,125,…}, where n=1000
For the above example, by using the Hibbard sequence, it first groups keys whose gap is 7:[7, 14, 0], [1, 8], [2, 9], [3, 10], [4, 11], [5, 12], [6, 13]
It then sorts each group using Insertion Sort, which results in the following array:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
The above procedure is repeated for gaps 3 and 1; however, the array is already sorted, so no more swapping is necessary.
Implementation
public abstract class ShellSort<T extends Comparable<T>> extends InsertionSort<T> {
protected List<Integer> sequence;
public ShellSort(Comparator<T> comparator) {
super(comparator);
sequence = new ArrayList<>();
populateSequence(10000);
}
}
L2: stores a particular gap sequence.
L7: pre-populate the gap sequence that can handle the input size up to 10000.
Then, let us define two abstract methods, populateSequence() and getSequenceStartIndex():
/**
* Populates the gap sequence with respect to the size of the list.
* @param n the size of the list to be sorted.
*/
protected abstract void populateSequence(int n);
/**
* @param n the size of the list to be sorted.
* @return the starting index of the sequence with respect to the size of the list.
*/
protected abstract int getSequenceStartIndex(int n);
L5: populates a particular sequence for the input size n.
L11: returns the index of the first gap to be used given the input size n.
Let us then override the sort() method:
@Override
public void sort(T[] array, int beginIndex, int endIndex) {
int n = endIndex - beginIndex;
populateSequence(n);
for (int i = getSequenceStartIndex(n); i >= 0; i--)
sort(array, beginIndex, endIndex, sequence.get(i));
}
L4: should not re-populate the sequence unless it has to.
L6: iterates the sequence →O(s) where s is the number of gaps in the sequence.
L7: sorts the gap group by using the auxiliary method.
Knuth Sequence
public class ShellSortKnuth<T extends Comparable<T>> extends ShellSort<T> {
public ShellSortKnuth() {
this(Comparator.naturalOrder());
}
public ShellSortKnuth(Comparator<T> comparator) {
super(comparator);
}
}
The two abstract methods can be overridden as follows:
@Override
protected void populateSequence(int n) {
n /= 3;
for (int t = sequence.size() + 1; ; t++) {
int h = (int) ((Math.pow(3, t) - 1) / 2);
if (h <= n) sequence.add(h);
else break;
}
}
@Override
protected int getSequenceStartIndex(int n) {
int index = Collections.binarySearch(sequence, n / 3);
if (index < 0) index = -(index + 1);
if (index == sequence.size()) index--;
return index;
}
L2: populates the Knuth sequence up to the gap ≤3n.
L13: returns the index of the first key ≤3n.
Why should we use 3n as the largest gap in the sequence?
Demonstration
Unit Tests & Benchmarks
Let us create the class inheriting AbstractSort:
How is this sink() method different from the one in the class?
Let us create the class inheriting AbstractSort:
Shell Sort groups keys whose gap is defined by a particular :
Let us create an abstract class, , inheriting AbstractSort:
Let us create the class that populates the Knuth Sequence for ShellSort:
The codes for unit testing and benchmarking are available in :