3.4. Distribution-based Sort

Bucket sort, radix sort.

Unlike comparison-based algorithms in Sections 3.2 and 3.3, distribution-based sorting algorithms do not make comparisons between keys in the input array; instead, they distribute keys into ordered buckets and merge keys in the buckets.

Bucket Sort

Let us create an abstract class, AbstractBucketSort inheriting AbstractSort:

public abstract class BucketSort<T extends Comparable<T>> extends AbstractSort<T> {
    protected List<Deque<T>> buckets;

    /** @param numBuckets the total number of buckets. */
    public BucketSort(int numBuckets) {
        super(null);
        buckets = Stream.generate(ArrayDeque<T>::new).limit(numBuckets).collect(Collectors.toList());
    }
}
  • L2: declares a list of buckets where each bucket is represented by a Deque.

  • L6: does not pass any comparator to AbstractSort since no comparison is needed.

  • L7: creates a pre-defined number of buckets.

What kind of input keys can work with distribution-based sorting algorithms?

We then define a helper method sort():

/**
 * @param array       the input array.
 * @param beginIndex  the index of the first key to be sorted (inclusive).
 * @param endIndex    the index of the last key to be sorted (exclusive).
 * @param bucketIndex takes a key and returns the index of the bucket that the key to be added.
 */
protected void sort(T[] array, int beginIndex, int endIndex, Function<T, Integer> bucketIndex) {
    // add each element in the input array to the corresponding bucket
    for (int i = beginIndex; i < endIndex; i++)
        buckets.get(bucketIndex.apply(array[i])).add(array[i]);

    // merge elements in all buckets to the input array
    for (Deque<T> bucket : buckets) {
        while (!bucket.isEmpty())
            array[beginIndex++] = bucket.remove();
    }
}
  • L7: bucketIndex is a function that takes a comparable key and returns the appropriate bucket index of the key.

  • L9-10: adds the keys within the range in the input array to the buckets.

  • L13-16: puts all keys in the buckets back to the input array while keeping the order.

How many assignments are made by the sort() method in BucketSort?

Integer Bucket Sort

Let us create the IntegerBucketSort class inheriting BucketSort that sorts integers within a specific range in ascending order:

public class IntegerBucketSort extends BucketSort<Integer> {
    private final int MIN;

    /**
     * @param min the minimum integer (inclusive).
     * @param max the maximum integer (exclusive).
     */
    public IntegerBucketSort(int min, int max) {
        super(max - min);
        MIN = min;
    }
}
  • L2: takes the smallest integer in the range.

  • L8: passes the number of buckets, max - min, to be initialized to the super constructor.\

We then override the sort() method:

@Override
public void sort(Integer[] array, int beginIndex, int endIndex) {
    sort(array, beginIndex, endIndex, key -> key - MIN);
}
  • L3: passes a lambda expression that takes key and returns key - MIN indicating the index of the bucket that key should be assigned to.

Is it possible to implement DoubleBucketSort that can handle double instead of integer keys?

Radix Sort

Let us create an abstract class, RadixSort, inheriting BucketSort<Integer>:

public abstract class RadixSort extends BucketSort<Integer> {
    public RadixSort() {
        super(10);
    }
    
    protected int getMaxBit(Integer[] array, int beginIndex, int endIndex) {
        Integer max = Arrays.stream(array, beginIndex, endIndex).reduce(Integer::max).orElse(null);
        return (max != null && max > 0) ? (int)Math.log10(max) + 1 : 0;
    }
}
  • L3: creates 10 buckets to sort any range of integers.

  • L6: returns the order of the most significant digit in the input array.

LSD Radix Sort

Let us create the LSDRadixSort class inheriting RadixSort that sorts the integer keys from the least significant digit (LSD) to the most significant digit (MSD):

public class LSDRadixSort extends RadixSort {
    @Override
    public void sort(Integer[] array, int beginIndex, int endIndex) {
        int maxBit = getMaxBit(array, beginIndex, endIndex);
        for (int bit = 0; bit < maxBit; bit++) {
            int div = (int)Math.pow(10, bit);
            sort(array, beginIndex, endIndex, key -> (key / div) % 10);
        }
    }
}
  • L5: iterates from LSD to MSD.

  • L7: the lambda expression returns the bucket index, the value in the nn'th significant digit.

Benchmarks

The followings compare runtime speeds between advanced sorting algorithms where the input arrays consist of random integer keys:

Last updated

©2023 Emory University - All rights reserved