3.4. Distribution-based Sort
Bucket sort, radix sort.
Unlike comparison-based algorithms in Sections 3.2 and 3.3, distribution-based sorting algorithms do not make comparisons between keys in the input array; instead, they distribute keys into ordered buckets and merge keys in the buckets.
Bucket Sort
Let us create an abstract class, AbstractBucketSort
inheriting AbstractSort
:
public abstract class BucketSort<T extends Comparable<T>> extends AbstractSort<T> {
protected List<Deque<T>> buckets;
/** @param numBuckets the total number of buckets. */
public BucketSort(int numBuckets) {
super(null);
buckets = Stream.generate(ArrayDeque<T>::new).limit(numBuckets).collect(Collectors.toList());
}
}
L2
: declares a list of buckets where each bucket is represented by aDeque
.L6
: does not pass any comparator toAbstractSort
since no comparison is needed.L7
: creates a pre-defined number of buckets.
What kind of input keys can work with distribution-based sorting algorithms?
We then define a helper method sort()
:
/**
* @param array the input array.
* @param beginIndex the index of the first key to be sorted (inclusive).
* @param endIndex the index of the last key to be sorted (exclusive).
* @param bucketIndex takes a key and returns the index of the bucket that the key to be added.
*/
protected void sort(T[] array, int beginIndex, int endIndex, Function<T, Integer> bucketIndex) {
// add each element in the input array to the corresponding bucket
for (int i = beginIndex; i < endIndex; i++)
buckets.get(bucketIndex.apply(array[i])).add(array[i]);
// merge elements in all buckets to the input array
for (Deque<T> bucket : buckets) {
while (!bucket.isEmpty())
array[beginIndex++] = bucket.remove();
}
}
L7
:bucketIndex
is a function that takes a comparable key and returns the appropriate bucket index of the key.L9-10
: adds the keys within the range in the input array to the buckets.L13-16
: puts all keys in the buckets back to the input array while keeping the order.
How many assignments are made by the
sort()
method inBucketSort
?
Integer Bucket Sort
Let us create the IntegerBucketSort
class inheriting BucketSort
that sorts integers within a specific range in ascending order:
public class IntegerBucketSort extends BucketSort<Integer> {
private final int MIN;
/**
* @param min the minimum integer (inclusive).
* @param max the maximum integer (exclusive).
*/
public IntegerBucketSort(int min, int max) {
super(max - min);
MIN = min;
}
}
L2
: takes the smallest integer in the range.L8
: passes the number of buckets,max - min
, to be initialized to the super constructor.\
We then override the sort()
method:
@Override
public void sort(Integer[] array, int beginIndex, int endIndex) {
sort(array, beginIndex, endIndex, key -> key - MIN);
}
L3
: passes a lambda expression that takeskey
and returnskey - MIN
indicating the index of the bucket thatkey
should be assigned to.
Is it possible to implement
DoubleBucketSort
that can handle double instead of integer keys?
Radix Sort
Let us create an abstract class, RadixSort
, inheriting BucketSort<Integer>
:
public abstract class RadixSort extends BucketSort<Integer> {
public RadixSort() {
super(10);
}
protected int getMaxBit(Integer[] array, int beginIndex, int endIndex) {
Integer max = Arrays.stream(array, beginIndex, endIndex).reduce(Integer::max).orElse(null);
return (max != null && max > 0) ? (int)Math.log10(max) + 1 : 0;
}
}
L3
: creates10
buckets to sort any range of integers.L6
: returns the order of the most significant digit in the input array.
LSD Radix Sort
Let us create the LSDRadixSort
class inheriting RadixSort
that sorts the integer keys from the least significant digit (LSD) to the most significant digit (MSD):
public class LSDRadixSort extends RadixSort {
@Override
public void sort(Integer[] array, int beginIndex, int endIndex) {
int maxBit = getMaxBit(array, beginIndex, endIndex);
for (int bit = 0; bit < maxBit; bit++) {
int div = (int)Math.pow(10, bit);
sort(array, beginIndex, endIndex, key -> (key / div) % 10);
}
}
}
L5
: iterates from LSD to MSD.L7
: the lambda expression returns the bucket index, the value in the 'th significant digit.
Benchmarks
The followings compare runtime speeds between advanced sorting algorithms where the input arrays consist of random integer keys:
Last updated
Was this helpful?