# 3.4. Distribution-based Sort

Unlike comparison-based algorithms in Sections [3.2](https://emory.gitbook.io/dsa-java/sorting-algorithms/3.2.-comparison-based-sort) and [3.3](https://emory.gitbook.io/dsa-java/sorting-algorithms/divide-and-conquer-sort), distribution-based sorting algorithms do not make comparisons between keys in the input array; instead, they distribute keys into ordered buckets and merge keys in the buckets.

## Bucket Sort

Let us create an abstract class, [`AbstractBucketSort`](https://github.com/emory-courses/dsa-java/blob/master/src/main/java/edu/emory/cs/sort/distribution/BucketSort.java) inheriting `AbstractSort`:

{% code lineNumbers="true" %}

```java
public abstract class BucketSort<T extends Comparable<T>> extends AbstractSort<T> {
    protected List<Deque<T>> buckets;

    /** @param numBuckets the total number of buckets. */
    public BucketSort(int numBuckets) {
        super(null);
        buckets = Stream.generate(ArrayDeque<T>::new).limit(numBuckets).collect(Collectors.toList());
    }
}
```

{% endcode %}

* `L2`: declares a list of buckets where each bucket is represented by a [`Deque`](https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/util/Deque.html).
* `L6`: does not pass any comparator to `AbstractSort` since no comparison is needed.
* `L7`: creates a pre-defined number of buckets.

> What kind of input keys can work with distribution-based sorting algorithms?

We then define a helper method `sort()`:

{% code lineNumbers="true" %}

```java
/**
 * @param array       the input array.
 * @param beginIndex  the index of the first key to be sorted (inclusive).
 * @param endIndex    the index of the last key to be sorted (exclusive).
 * @param bucketIndex takes a key and returns the index of the bucket that the key to be added.
 */
protected void sort(T[] array, int beginIndex, int endIndex, Function<T, Integer> bucketIndex) {
    // add each element in the input array to the corresponding bucket
    for (int i = beginIndex; i < endIndex; i++)
        buckets.get(bucketIndex.apply(array[i])).add(array[i]);

    // merge elements in all buckets to the input array
    for (Deque<T> bucket : buckets) {
        while (!bucket.isEmpty())
            array[beginIndex++] = bucket.remove();
    }
}
```

{% endcode %}

* `L7`: `bucketIndex` is a function that takes a comparable key and returns the appropriate bucket index of the key.
* `L9-10`: adds the keys within the range in the input array to the buckets.
* `L13-16`: puts all keys in the buckets back to the input array while keeping the order.

> How many assignments are made by the `sort()` method in `BucketSort`?

## Integer Bucket Sort

Let us create the `IntegerBucketSort` class inheriting `BucketSort` that sorts integers within a specific range in ascending order:

{% code lineNumbers="true" %}

```java
public class IntegerBucketSort extends BucketSort<Integer> {
    private final int MIN;

    /**
     * @param min the minimum integer (inclusive).
     * @param max the maximum integer (exclusive).
     */
    public IntegerBucketSort(int min, int max) {
        super(max - min);
        MIN = min;
    }
}
```

{% endcode %}

* `L2`: takes the smallest integer in the range.
* `L8`: passes the number of buckets, `max - min`, to be initialized to the super constructor.\\

We then override the `sort()` method:

{% code lineNumbers="true" %}

```java
@Override
public void sort(Integer[] array, int beginIndex, int endIndex) {
    sort(array, beginIndex, endIndex, key -> key - MIN);
}
```

{% endcode %}

* `L3`: passes a lambda expression that takes `key` and returns `key - MIN` indicating the index of the bucket that `key` should be assigned to.

{% embed url="<https://speakerdeck.com/emory/dsa-java-integer-bucket-sort>" %}
<https://speakerdeck.com/emory/dsa-java-integer-bucket-sort>
{% endembed %}

> Is it possible to implement `DoubleBucketSort` that can handle double instead of integer keys?

## Radix Sort

Let us create an abstract class, `RadixSort`, inheriting `BucketSort<Integer>`:

{% code lineNumbers="true" %}

```java
public abstract class RadixSort extends BucketSort<Integer> {
    public RadixSort() {
        super(10);
    }
    
    protected int getMaxBit(Integer[] array, int beginIndex, int endIndex) {
        Integer max = Arrays.stream(array, beginIndex, endIndex).reduce(Integer::max).orElse(null);
        return (max != null && max > 0) ? (int)Math.log10(max) + 1 : 0;
    }
}
```

{% endcode %}

* `L3`: creates `10` buckets to sort any range of integers.
* `L6`: returns the order of the most significant digit in the input array.

## LSD Radix Sort

Let us create the `LSDRadixSort` class inheriting `RadixSort` that sorts the integer keys from the least significant digit (LSD) to the most significant digit (MSD):

{% code lineNumbers="true" %}

```java
public class LSDRadixSort extends RadixSort {
    @Override
    public void sort(Integer[] array, int beginIndex, int endIndex) {
        int maxBit = getMaxBit(array, beginIndex, endIndex);
        for (int bit = 0; bit < maxBit; bit++) {
            int div = (int)Math.pow(10, bit);
            sort(array, beginIndex, endIndex, key -> (key / div) % 10);
        }
    }
}
```

{% endcode %}

* `L5`: iterates from LSD to MSD.
* `L7`: the lambda expression returns the bucket index, the value in the $$n$$'th significant digit.

{% embed url="<https://speakerdeck.com/emory/dsa-java-lsd-radix-sort>" %}
<https://speakerdeck.com/emory/dsa-java-lsd-radix-sort>
{% endembed %}

## Benchmarks

The followings compare runtime speeds between advanced sorting algorithms where the input arrays consist of random integer keys:

{% embed url="<https://speakerdeck.com/emory/dsa-java-distribution-based-sort-benchmarks>" %}
<https://speakerdeck.com/emory/dsa-java-distribution-based-sort-benchmarks>
{% endembed %}
