3.2. Comparison-based Sort
Selection sort, heap sort, insertion sort, shell sort.
Selection-based Sort
Selection-based sorting algorithms take the following steps:
For each key where and :
Search the maximum key where .
Swap and
The complexities differ by different search algorithms:
Algorithm
Search
Compare
Swap
Selection Sort
Heap Sort
Selection Sort uses linear search to find the minimum (or maximum) key, whereas Heap Sort turns the input array into a heap, so the search complexity becomes instead of .
Can the search be faster than ?
Selection Sort
Let us create the SelectionSort
class inheriting AbstractSort
:
public class SelectionSort<T extends Comparable<T>> extends AbstractSort<T> {
public SelectionSort() {
this(Comparator.naturalOrder());
}
public SelectionSort(Comparator<T> comparator) {
super(comparator);
}
}
Let us then override the sort()
method:
@Override
public void sort(T[] array, final int beginIndex, final int endIndex) {
for (int i = endIndex; i > beginIndex; i--) {
int max = beginIndex;
for (int j = beginIndex + 1; j < i; j++) {
if (compareTo(array, j, max) > 0)
max = j;
}
swap(array, max, i - 1);
}
}
L3-12
:L3
: iterates all keys within the range .L4-9
: finds the index of the maximum key within the range .L11
: swaps the maximum key with the last key in the range .
How does the
sort()
method work withComparator.reverseOrder()
?
Heap Sort
Let us create the HeapSort
class inheriting AbstractSort
:
public class HeapSort<T extends Comparable<T>> extends AbstractSort<T> {
public HeapSort() {
this(Comparator.naturalOrder());
}
public HeapSort(Comparator<T> comparator) {
super(comparator);
}
}
Before we override the sort()
method, let us define the following helper methods:
private void sink(T[] array, int k, int beginIndex, int endIndex) {
for (int i = getLeftChildIndex(beginIndex, k); i < endIndex; k = i, i = getLeftChildIndex(beginIndex, k)) {
if (i + 1 < endIndex && compareTo(array, i, i + 1) < 0) i++;
if (compareTo(array, k, i) >= 0) break;
swap(array, k, i);
}
}
private int getParentIndex(int beginIndex, int k) {
return beginIndex + (k - beginIndex - 1) / 2;
}
private int getLeftChildIndex(int beginIndex, int k) {
return beginIndex + 2 * (k - beginIndex) + 1;
}
L1-7
: finds the right position of thek
'th key by using the sink operation.L9-11
: finds the parent index of thek
'th key given the beginning index.L13-15
: finds the left child index of thek
'th key given the beginning index.
How is this
sink()
method different from the one in theBinaryHeap
class?
Finally, we override the sort()
method:
@Override
public void sort(T[] array, int beginIndex, int endIndex) {
// heapify all elements
for (int k = getParentIndex(beginIndex, endIndex - 1); k >= beginIndex; k--)
sink(array, k, beginIndex, endIndex);
// swap the endIndex element with the root element and sink it
while (endIndex > beginIndex + 1) {
swap(array, beginIndex, --endIndex);
sink(array, beginIndex, beginIndex, endIndex);
}
}
L4-5
: turns the input array into a heap :L4
: iterates from the parent of the key in the ending index .L5
: sinks the key .
L8-11
: selection sort :L8
: iterates all keys within the range .L9
: swaps the maximum key with the beginning key in the range .L10
: sinks to heapify .
What is the worst-case scenario for Selection Sort and Heap Sort?
Insertion-based Sort
Insertion-based sorting algorithms take the following steps:
For each key where and :
Keep swapping and until .
The complexities differ by different sequences:
Algorithm
Sequence
Compare
Swap
Insertion Sort
Adjacent
Shell Sort
Knuth
Insertion Sort
Let us create the InsertionSort
class inheriting AbstractSort
:
public class InsertionSort<T extends Comparable<T>> extends AbstractSort<T> {
public InsertionSort() {
this(Comparator.naturalOrder());
}
public InsertionSort(Comparator<T> comparator) {
super(comparator);
}
}
Let us then define an auxiliary method, sort()
:
protected void sort(T[] array, int beginIndex, int endIndex, final int h) {
int begin_h = beginIndex + h;
for (int i = begin_h; i < endIndex; i++)
for (int j = i; begin_h <= j && compareTo(array, j, j - h) < 0; j -= h)
swap(array, j, j - h);
}
L4
: iterates keys in the input array .L5
: compares keys in the sublist of the input array .L6
: swaps the two keys.
Given the auxiliary method, the sort()
method can be defined as follows where h = 1
:
@Override
public void sort(T[] array, int beginIndex, int endIndex) {
sort(array, beginIndex, endIndex, 1);
}
How many swaps does Insertion Sort make for the following array?
[7, 1, 2, 3, 4, 5, 6, 14, 8, 9, 10, 11, 12, 13, 0]
Shell Sort
Gap Sequence
Shell Sort groups keys whose gap is defined by a particular gap sequence:
Knuth:
Hibbard:
Pratt:
Shell: , where
For the above example, by using the Hibbard sequence, it first groups keys whose gap is 7:[7, 14, 0], [1, 8], [2, 9], [3, 10], [4, 11], [5, 12], [6, 13]
It then sorts each group using Insertion Sort, which results in the following array:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
The above procedure is repeated for gaps 3 and 1; however, the array is already sorted, so no more swapping is necessary.
Implementation
Let us create an abstract class, ShellSort
, inheriting AbstractSort
:
public abstract class ShellSort<T extends Comparable<T>> extends InsertionSort<T> {
protected List<Integer> sequence;
public ShellSort(Comparator<T> comparator) {
super(comparator);
sequence = new ArrayList<>();
populateSequence(10000);
}
}
L2
: stores a particular gap sequence.L7
: pre-populate the gap sequence that can handle the input size up to10000
.
Then, let us define two abstract methods, populateSequence()
and getSequenceStartIndex()
:
/**
* Populates the gap sequence with respect to the size of the list.
* @param n the size of the list to be sorted.
*/
protected abstract void populateSequence(int n);
/**
* @param n the size of the list to be sorted.
* @return the starting index of the sequence with respect to the size of the list.
*/
protected abstract int getSequenceStartIndex(int n);
L5
: populates a particular sequence for the input sizen
.L11
: returns the index of the first gap to be used given the input sizen
.
Let us then override the sort()
method:
@Override
public void sort(T[] array, int beginIndex, int endIndex) {
int n = endIndex - beginIndex;
populateSequence(n);
for (int i = getSequenceStartIndex(n); i >= 0; i--)
sort(array, beginIndex, endIndex, sequence.get(i));
}
L4
: should not re-populate the sequence unless it has to.L6
: iterates the sequence where is the number of gaps in the sequence.L7
: sorts the gap group by using the auxiliary method.
Knuth Sequence
Let us create the ShellSortKnuth
class that populates the Knuth Sequence for ShellSort
:
public class ShellSortKnuth<T extends Comparable<T>> extends ShellSort<T> {
public ShellSortKnuth() {
this(Comparator.naturalOrder());
}
public ShellSortKnuth(Comparator<T> comparator) {
super(comparator);
}
}
The two abstract methods can be overridden as follows:
@Override
protected void populateSequence(int n) {
n /= 3;
for (int t = sequence.size() + 1; ; t++) {
int h = (int) ((Math.pow(3, t) - 1) / 2);
if (h <= n) sequence.add(h);
else break;
}
}
@Override
protected int getSequenceStartIndex(int n) {
int index = Collections.binarySearch(sequence, n / 3);
if (index < 0) index = -(index + 1);
if (index == sequence.size()) index--;
return index;
}
L2
: populates the Knuth sequence up to the gap .L13
: returns the index of the first key .
Why should we use as the largest gap in the sequence?
Demonstration
Unit Tests & Benchmarks
The codes for unit testing and benchmarking are available in SortTest
:
Last updated
Was this helpful?