# 3.4. Distribution-based Sort

Unlike comparison-based algorithms in Sections [3.2](/dsa-java/sorting-algorithms/3.2.-comparison-based-sort.md) and [3.3](/dsa-java/sorting-algorithms/divide-and-conquer-sort.md), distribution-based sorting algorithms do not make comparisons between keys in the input array; instead, they distribute keys into ordered buckets and merge keys in the buckets.

## Bucket Sort

Let us create an abstract class, [`AbstractBucketSort`](https://github.com/emory-courses/dsa-java/blob/master/src/main/java/edu/emory/cs/sort/distribution/BucketSort.java) inheriting `AbstractSort`:

{% code lineNumbers="true" %}

```java
public abstract class BucketSort<T extends Comparable<T>> extends AbstractSort<T> {
    protected List<Deque<T>> buckets;

    /** @param numBuckets the total number of buckets. */
    public BucketSort(int numBuckets) {
        super(null);
        buckets = Stream.generate(ArrayDeque<T>::new).limit(numBuckets).collect(Collectors.toList());
    }
}
```

{% endcode %}

* `L2`: declares a list of buckets where each bucket is represented by a [`Deque`](https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/util/Deque.html).
* `L6`: does not pass any comparator to `AbstractSort` since no comparison is needed.
* `L7`: creates a pre-defined number of buckets.

> What kind of input keys can work with distribution-based sorting algorithms?

We then define a helper method `sort()`:

{% code lineNumbers="true" %}

```java
/**
 * @param array       the input array.
 * @param beginIndex  the index of the first key to be sorted (inclusive).
 * @param endIndex    the index of the last key to be sorted (exclusive).
 * @param bucketIndex takes a key and returns the index of the bucket that the key to be added.
 */
protected void sort(T[] array, int beginIndex, int endIndex, Function<T, Integer> bucketIndex) {
    // add each element in the input array to the corresponding bucket
    for (int i = beginIndex; i < endIndex; i++)
        buckets.get(bucketIndex.apply(array[i])).add(array[i]);

    // merge elements in all buckets to the input array
    for (Deque<T> bucket : buckets) {
        while (!bucket.isEmpty())
            array[beginIndex++] = bucket.remove();
    }
}
```

{% endcode %}

* `L7`: `bucketIndex` is a function that takes a comparable key and returns the appropriate bucket index of the key.
* `L9-10`: adds the keys within the range in the input array to the buckets.
* `L13-16`: puts all keys in the buckets back to the input array while keeping the order.

> How many assignments are made by the `sort()` method in `BucketSort`?

## Integer Bucket Sort

Let us create the `IntegerBucketSort` class inheriting `BucketSort` that sorts integers within a specific range in ascending order:

{% code lineNumbers="true" %}

```java
public class IntegerBucketSort extends BucketSort<Integer> {
    private final int MIN;

    /**
     * @param min the minimum integer (inclusive).
     * @param max the maximum integer (exclusive).
     */
    public IntegerBucketSort(int min, int max) {
        super(max - min);
        MIN = min;
    }
}
```

{% endcode %}

* `L2`: takes the smallest integer in the range.
* `L8`: passes the number of buckets, `max - min`, to be initialized to the super constructor.\\

We then override the `sort()` method:

{% code lineNumbers="true" %}

```java
@Override
public void sort(Integer[] array, int beginIndex, int endIndex) {
    sort(array, beginIndex, endIndex, key -> key - MIN);
}
```

{% endcode %}

* `L3`: passes a lambda expression that takes `key` and returns `key - MIN` indicating the index of the bucket that `key` should be assigned to.

{% embed url="<https://speakerdeck.com/emory/dsa-java-integer-bucket-sort>" %}
<https://speakerdeck.com/emory/dsa-java-integer-bucket-sort>
{% endembed %}

> Is it possible to implement `DoubleBucketSort` that can handle double instead of integer keys?

## Radix Sort

Let us create an abstract class, `RadixSort`, inheriting `BucketSort<Integer>`:

{% code lineNumbers="true" %}

```java
public abstract class RadixSort extends BucketSort<Integer> {
    public RadixSort() {
        super(10);
    }
    
    protected int getMaxBit(Integer[] array, int beginIndex, int endIndex) {
        Integer max = Arrays.stream(array, beginIndex, endIndex).reduce(Integer::max).orElse(null);
        return (max != null && max > 0) ? (int)Math.log10(max) + 1 : 0;
    }
}
```

{% endcode %}

* `L3`: creates `10` buckets to sort any range of integers.
* `L6`: returns the order of the most significant digit in the input array.

## LSD Radix Sort

Let us create the `LSDRadixSort` class inheriting `RadixSort` that sorts the integer keys from the least significant digit (LSD) to the most significant digit (MSD):

{% code lineNumbers="true" %}

```java
public class LSDRadixSort extends RadixSort {
    @Override
    public void sort(Integer[] array, int beginIndex, int endIndex) {
        int maxBit = getMaxBit(array, beginIndex, endIndex);
        for (int bit = 0; bit < maxBit; bit++) {
            int div = (int)Math.pow(10, bit);
            sort(array, beginIndex, endIndex, key -> (key / div) % 10);
        }
    }
}
```

{% endcode %}

* `L5`: iterates from LSD to MSD.
* `L7`: the lambda expression returns the bucket index, the value in the $$n$$'th significant digit.

{% embed url="<https://speakerdeck.com/emory/dsa-java-lsd-radix-sort>" %}
<https://speakerdeck.com/emory/dsa-java-lsd-radix-sort>
{% endembed %}

## Benchmarks

The followings compare runtime speeds between advanced sorting algorithms where the input arrays consist of random integer keys:

{% embed url="<https://speakerdeck.com/emory/dsa-java-distribution-based-sort-benchmarks>" %}
<https://speakerdeck.com/emory/dsa-java-distribution-based-sort-benchmarks>
{% endembed %}


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://emory.gitbook.io/dsa-java/sorting-algorithms/distribution-based-sort.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
