
Data Structures and Algorithms in Java

Preface

Jinho D. Choi

This is an advanced programming course in Computer Science that teaches how to design efficient structures and algorithms to process big data and methods to benchmark their performance for large-scale computing. Topics cover data structures such as priority queues, binary trees, tries, and graphs and their applications in constructing algorithms such as sorting, searching, balancing, traversing, and spanning. Advanced topics such as network flow and dynamic programming are also discussed. Throughout this course, students are expected to

Have a deep conceptual understanding of various data structures and algorithms.
Implement their conceptual understanding in a programming language.
Explore the most effective structures and algorithms for given tasks.
Properly assess the quality of their implementations.

There are topical quizzes and homework assignments that require sufficient skills in Java programming, Git version control, Gradle software project management, and scientific writing. Intermediate-level Java programming is a prerequisite of this course.

Syllabus

Spring 2023

General

Time: MW 11:30AM - 12:45PM
Location: Atwood 240

Instructors

Grading

1 + 10 topical quizzes: 70%
3 homework assignments: 30%

Notes

For every topic, one quiz will be assigned to check if you keep up with the materials.
Homework assignments assess conceptual understanding, programming ability, and analytical writing skills relevant to this course.
All quizzes and assignments must be submitted individually. Discussions are allowed; however, your work must be original.
Late submissions within a week will be accepted with a grading penalty of 15%; however, no submission will be accepted once the solutions are discussed in class.

Schedule

Spring 2023

Date | Topic | Assignment
01/11 | |
01/16 | MLK Holiday |
01/18 | |
01/23 | (Continue) |
01/25 | (Continue) |
01/30 | |
02/01 | (Continue) |
02/06 | |
02/08 | (Continue) |
02/13 | (Continue) |
02/15 | |
02/20 | (Continue) |
02/22 | (Continue) |
02/27 | |
03/01 | (Continue) |
03/06 | Spring Break |
03/08 | Spring Break |
03/13 | |
03/15 | |
03/20 | (Continue) |
03/22 | |
03/27 | (Continue) |
03/29 | (Continue) |
04/03 | |
04/05 | (Continue) |
04/10 | (Continue) |
04/12 | |
04/17 | (Continue) |
04/19 | (Continue) |
04/24 | Review |

0. Getting Started

This chapter helps you set up the development environment for this course.

Resources

Many Java developers adopt the environment described in this section. Thus, it is essential to familiarize yourself with this setup.

0.1. Environment Setup

Development kit, version control system, integrated development environment, and project management for Java programming.

Development Kit

The required version: 17.x.x (or higher)

Although Java 17 is not the most recent version, it is the latest long-term support (LTS) release, which is preferred.

Version Control

Install Git using any of the following instructions:

Run the following commands on a terminal by replacing user.email and user.name with your email address and name:

Integrated Development Environment

The recommended version: 2022.3.x (Ultimate Edition)

Project Management

Name: dsa-java
Location: local_path/dsa-java
Check "Create Git repository"
Language: Java
Build system: Gradle
JDK: 17
Gradle DSL: Groovy
Uncheck "Add sample code"
Advanced Settings:
- GroupId: edu.emory.cs
- ArtifactId: dsa-java

For JDK, you should be able to see version 17 if it is properly installed. If you cannot find the version, click [Add JDK] and select the following directory.

Windows: C:\Program Files\Java\jdk-17.x.x
Mac: /Library/Java/JavaVirtualMachines/jdk-17.x.x.jdk

Click [Settings - Build, Execution, Deployment] on the menu:

Click [Build Tools - Gradle] and set Gradle JVM to 17.
Click [Compiler - Java Compiler] and set Project bytecode version to 17.

Click [File - Project Structure] and select [Project Settings]:

Click [Project Settings - Project] and set SDK to 17 and Project language level to SDK default.
Click [Project Settings - Modules - Dependencies] and set Module SDK to 17.
Go to [Platform Settings - SDKs] and select 17.

Lastly, check mavenCentral() is configured as a repository in your build.gradle:

GitHub Integration

Choose [Version Control - GitHub] on the left pane.
Click [+] and log in with your GitHub ID and password.

Click [Git - GitHub - Share Project on GitHub] and create a repository:

Make sure to check private.
Repository name: dsa-java
Remote: origin
Description: Data Structures and Algorithms in Java

Add all files and make the initial commit. Check if the repository is created under your GitHub account: https://github.com/your_id/dsa-java.

We recommend you create a GitHub account with your school email address, allowing you to add unlimited collaborators to the repository.

0.2. Quiz

Quiz 0: Getting Started

Coding

Add Utils.java to Git.
Add the following methods to the Utils class:

public static int getMiddleIndex(int beginIndex, int endIndex) {
    // Avoids int overflow that (beginIndex + endIndex) / 2 may cause for large non-negative indices.
    return beginIndex + (endIndex - beginIndex) / 2;
}

public static void main(String[] args) {
    System.out.println(getMiddleIndex(0, 10));
}

Run the program by clicking [Run -> Run]. If you see 5 on the output pane, your program runs successfully.
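As a side note, the form beginIndex + (endIndex - beginIndex) / 2 is commonly preferred over (beginIndex + endIndex) / 2 because the sum in the latter can exceed the range of int. The following is a minimal sketch of the difference; the class MiddleIndexDemo and the method naiveMiddleIndex() are hypothetical names used only for illustration:

```java
final class MiddleIndexDemo {
    /** Same formula as in Utils: safe when 0 <= beginIndex <= endIndex. */
    static int getMiddleIndex(int beginIndex, int endIndex) {
        return beginIndex + (endIndex - beginIndex) / 2;
    }

    /** Naive variant (hypothetical): the sum may wrap around to a negative value. */
    static int naiveMiddleIndex(int beginIndex, int endIndex) {
        return (beginIndex + endIndex) / 2;
    }
}
```

For indices near Integer.MAX_VALUE, the naive variant returns a negative number, whereas the first variant still returns a valid index.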

Testing

test {
    useJUnitPlatform()
}

dependencies {
    testImplementation 'org.junit.jupiter:junit-jupiter-api:5.8.2'
    testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.8.2'
}

Add UtilsTest.java to Git.
Add the following method to the UtilsTest class. Make sure to include all imports:

package edu.emory.cs.utils;

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;

public class UtilsTest {
    @Test
    public void getMiddleIndexTest() {
        // The middle index between 0 and 10 is expected to be 5.
        assertEquals(5, Utils.getMiddleIndex(0, 10));
    }
}

Run the test by clicking [Run -> Run]. If you see the test passed, your unit test runs successfully.

Submission

1. Add the instructors as collaborators in your GitHub repository:

Jinho Choi: jdchoi77
Peilin Wu: qualidea1217
Jeongrok Yu: jeongrok
Zinc Zhao: ZincZhao

2. Commit and push the following to your GitHub repository:

3. Submit the URL of your GitHub repository to Canvas.

1. Java Essentials

This chapter explains essential object-oriented programming features in Java to implement data structures and algorithms.

Resources

References

Please follow every example described in this section. Programming is an act of writing, not reading. By the end of this chapter, you should be able to reproduce the entire codebase yourself from scratch without consulting those examples.

1.1. Abstraction

Different types of objects and inheritances in Java.

Class

What is the relationship between a class and an object?

package edu.emory.cs.algebraic;

public class Numeral {
}

L1: package indicates the name of the package that this class belongs to in a hierarchy.

What are the acceptable access-level modifiers to declare a top-level class?

Let us declare a method, add(), that is an operation expected by all numeral types:

public class Numeral {
    /**
     * Adds `n` to this numeral.
     * @param n the numeral to be added.
     */
    public void add(Numeral n) { /* cannot be implemented */ }
}

The issue is that we cannot define the methods unless we know what specific numeral type this class should implement; in other words, it is too abstract to define those methods. Thus, we need to declare Numeral as a type of abstract class.

What are the advantages of having Numeral as a super class of all numeral types?

There are two types of abstract classes in Java, abstract class and interface.

Can an object be instantiated by an abstract class or an interface?

Interface

public interface Numeral {
    void add(Numeral n);
}

L2: abstract method
- Methods in an interface are public by default, so the modifier does not need to be written explicitly.
- Abstract methods in an interface are declared without their bodies.

Who defines the bodies of the abstract methods?

Let us create a new interface called SignedNumeral that inherits Numeral and adds two methods, flipSign() and subtract():

Can an interface inherit either an abstract class or a regular class?

public interface SignedNumeral extends Numeral {
    /** Flips the sign of this numeral. */
    void flipSign();

    /**
     * Subtracts `n` from this numeral.
     * @param n the numeral to be subtracted.
     */
    default void subtract(Numeral n) {
        n.flipSign();
        add(n);
        n.flipSign();
    }
}

L1: extends inherits a class or an interface. A class can extend only one class, whereas an interface can extend one or more interfaces.

Can we call add() that is an abstract method without a body in the default method subtract()?

Although the logic of subtract() seems to be correct, n.flipSign() gives a compile error because n is a type of Numeral that does not include flipSign(), which is defined in SignedNumeral that is a subclass of Numeral.

What kind of a compile error does n.flipSign() cause?

Casting

The first way is to downcast the type of n to SignedNumeral, which forces the compiler to think that n can invoke the flipSign() method:

default void subtract(Numeral n) {
    ((SignedNumeral)n).flipSign();
    add(n);
    ((SignedNumeral)n).flipSign();
}

Why is a runtime error worse than a compile error?

How can downcasting cause a runtime error in the above case?
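The downcast compiles because the compiler only verifies that the cast is possible; whether n really is a SignedNumeral is checked when the code runs. The following is a minimal sketch of how this can fail. The classes NaturalNumber and SmallInteger are hypothetical and exist only for this illustration:

```java
interface Numeral {
    void add(Numeral n);
}

interface SignedNumeral extends Numeral {
    void flipSign();

    /** Compiles thanks to the casts, but trusts the caller to pass a SignedNumeral. */
    default void subtract(Numeral n) {
        ((SignedNumeral) n).flipSign();
        add(n);
        ((SignedNumeral) n).flipSign();
    }
}

/** Hypothetical numeral without a sign; it implements Numeral only. */
final class NaturalNumber implements Numeral {
    int value;
    NaturalNumber(int value) { this.value = value; }
    @Override public void add(Numeral n) { value += ((NaturalNumber) n).value; }
}

/** Hypothetical signed numeral backed by an int. */
final class SmallInteger implements SignedNumeral {
    int value;
    SmallInteger(int value) { this.value = value; }
    @Override public void flipSign() { value = -value; }
    @Override public void add(Numeral n) { value += ((SmallInteger) n).value; }
}
```

Calling subtract() with a SmallInteger works as expected, whereas passing a NaturalNumber compiles but throws a ClassCastException at runtime.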

Polymorphism

The second way is to change the type of n to SignedNumeral in the parameter setting:

default void subtract(SignedNumeral n) {
    n.flipSign();
    add(n);
    n.flipSign();
}

This seems to solve the issue. Then, what about add() defined in Numeral? Should we change its parameter type to SignedNumeral as well?

It is often the case that you do not have access to change the code in a super class unless you are the author of it. Even if you are the author, changing the code in a super class is not recommended.

Why is it not recommended to change the code in a super class?

public interface SignedNumeral extends Numeral {
    @Override
    void add(SignedNumeral n);
    ...

The annotation @Override gives an error in this case because it is not considered an overriding.

What are the criteria to override a method?

When @Override is discarded, the error goes away and everything seems to be fine:

public interface SignedNumeral extends Numeral {
    void add(SignedNumeral n);
    ...

What are good use cases of method overriding and overloading?

Generics

public interface Numeral<T extends Numeral<T>> {
    void add(T n);
}

L1: T is a generic type that is a subtype of Numeral.
- A generic type can be recursively defined as T extends Numeral<T>.
L2: T is considered a viable type in this interface such that it can be used to declare add().

Can we define more than one generic type per interface or class?

The generic type T can be specified in a subclass of Numeral:

public interface SignedNumeral extends Numeral<SignedNumeral> {
    void flipSign();

    default void subtract(SignedNumeral n) {
        n.flipSign();
        add(n);
        n.flipSign();
    }
}

L1: T is specified as SignedNumeral.

This would implicitly assign the parameter type of add() as follows:

void add(SignedNumeral n);

The issue is that the implementation of add() may require specific features defined in the subclass that are not available in SignedNumeral. Consider the following subclass inheriting SignedNumeral:

public class LongInteger implements SignedNumeral {
    @Override
    public void flipSign() { /* to be implemented */ }

    @Override
    public void add(SignedNumeral n) { /* to be implemented */ }
}

Would the type of n being SignedNumeral be an issue for the subtract() method as well?

Thus, SignedNumeral needs to define its own generic type and pass it onto Numeral:

public interface SignedNumeral<T extends SignedNumeral<T>> extends Numeral<T> {
    void flipSign();

    default void subtract(T n) {
        n.flipSign();
        add(n);
        n.flipSign();
    }
}

L1: T is a generic type inheriting SignedNumeral, which implies that T can be any subclass of SignedNumeral.

T can be safely passed onto Numeral because if it is a subclass of SignedNumeral, it must be a subclass of Numeral, which is how T is defined in the Numeral class.

Generics are used everywhere in Java, so it is important to understand the core concept of generics and be able to adapt it in your code to make it more modular.
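To see how the recursive generic type plays out, the following is a minimal sketch that combines the two interfaces above with a toy implementation. The class SmallInteger is hypothetical (it is not the LongInteger developed in this chapter), and the interfaces are declared without public only to keep the sketch in a single file:

```java
interface Numeral<T extends Numeral<T>> {
    void add(T n);
}

interface SignedNumeral<T extends SignedNumeral<T>> extends Numeral<T> {
    void flipSign();

    default void subtract(T n) {
        n.flipSign();
        add(n);
        n.flipSign();
    }
}

/** Hypothetical implementation backed by an int. */
final class SmallInteger implements SignedNumeral<SmallInteger> {
    private int value;

    SmallInteger(int value) { this.value = value; }

    int value() { return value; }

    @Override
    public void flipSign() { value = -value; }

    // n is typed as SmallInteger, so its field is accessible without any cast.
    @Override
    public void add(SmallInteger n) { value += n.value; }
}
```

Because T is bound to SmallInteger, add() and subtract() accept only SmallInteger arguments, so passing any other numeral type is rejected at compile time rather than at runtime.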

Enum

public enum Sign {
    POSITIVE,
    NEGATIVE;
}

Items must be delimited by , and the list of items ends with ;.

The items in the enum can be assigned with specific values to make them more indicative (e.g., +, -):

public enum Sign {
    POSITIVE('+'),
    NEGATIVE('-');

    private final char value;

    Sign(char value) {
        this.value = value;
    }

    /** @return the value of the corresponding item. */
    public char value() {
        return value;
    }
}

Why should the member field value be private in the above example?

Note that value in L8 indicates the local parameter declared in the constructor whereas value in L13 indicates the member field declared in L5.
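The following is a minimal usage sketch of such an enum. The helper SignDemo.of() is a hypothetical addition, and the enum is declared without public only to keep the sketch in a single file:

```java
enum Sign {
    POSITIVE('+'),
    NEGATIVE('-');

    private final char value;

    Sign(char value) {
        this.value = value;
    }

    char value() {
        return value;
    }
}

final class SignDemo {
    /** Hypothetical helper: finds the item whose value matches the character. */
    static Sign of(char c) {
        for (Sign s : Sign.values())
            if (s.value() == c) return s;
        throw new IllegalArgumentException("Unknown sign: " + c);
    }
}
```

The built-in methods values() and valueOf() are generated for every enum, so Sign.valueOf("NEGATIVE") returns Sign.NEGATIVE, while SignDemo.of('-') looks up the same item by its assigned value.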

Limit of Interface

In SignedNumeral, it would be convenient to have a member field that indicates the sign of the numeral:

public interface SignedNumeral<T extends SignedNumeral<T>> extends Numeral<T> {
    Sign sign = Sign.POSITIVE;
    ...

L2: All member fields of an interface are implicitly public, static, and final.

Can you declare a member field in an interface without assigning a value?

Given the sign field, it may seem intuitive to define flipSign() as a default method:

/** Flips the sign of this numeral. */
default void flipSign() {
    sign = (sign == Sign.POSITIVE) ? Sign.NEGATIVE : Sign.POSITIVE;
}

Is there any advantage of using a ternary operator instead of using a regular if statement?

Unfortunately, this gives a compile error because sign is a constant whose value cannot be reassigned. An interface is not meant to define so many default methods, which were not even allowed before Java 8. For such explicit implementations, it is better to declare SignedNumeral as an abstract class instead.
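Since a compile error cannot be shown in running code, the following sketch uses reflection to inspect the modifiers that Java implicitly adds to an interface field. The names HasSign and InterfaceFieldDemo are hypothetical:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

enum Sign { POSITIVE, NEGATIVE }

interface HasSign {
    Sign sign = Sign.POSITIVE;   // no modifier is written here
}

final class InterfaceFieldDemo {
    /** Returns the modifiers of HasSign.sign as text. */
    static String signModifiers() {
        try {
            Field field = HasSign.class.getField("sign");
            return Modifier.toString(field.getModifiers());
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The returned modifiers include static and final, which is why the field is shared by all implementations and cannot be reassigned in a default method.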

Abstract Class

public abstract class SignedNumeral<T extends SignedNumeral<T>> implements Numeral<T> {
    /** The sign of this numeral. */
    protected Sign sign;

    /**
     * Create a signed numeral.
     * the default sign is {@link Sign#POSITIVE}.
     */
    public SignedNumeral() {
        this(Sign.POSITIVE);
    }

    /**
     * Create a signed numeral.
     * @param sign the sign of this numeral.
     */
    public SignedNumeral(Sign sign) {
        this.sign = sign;
    }
    ...

L9: the default constructor with no parameter.
L17: another constructor with the sign parameter.
L10: this() calls the constructor in L17.

Why call this(Sign.POSITIVE) in L10 instead of stating this.sign = Sign.POSITIVE?

    ...
    /** @return true if this numeral is positive; otherwise, false. */
    public boolean isPositive() {
        return sign == Sign.POSITIVE;
    }

    /** @return true if this numeral is negative; otherwise, false. */
    public boolean isNegative() {
        return sign == Sign.NEGATIVE;
    }

    /** Flips the sign of this numeral. */
    public void flipSign() {
        sign = isPositive() ? Sign.NEGATIVE : Sign.POSITIVE;
    }
    
    /**
     * Subtracts `n` from this numeral.
     * @param n the numeral to be subtracted.
     */
    public void subtract(T n) {
        n.flipSign(); add(n); n.flipSign();
    }

    /**
     * Multiplies `n` to this numeral.
     * @param n the numeral to be multiplied.
     */
    public abstract void multiply(T n);
}

L29: abstract indicates that this is an abstract method.

Member fields and methods in an abstract class can be decorated by any modifiers, which need to be explicitly coded.

Is there anything that is not allowed in an abstract class but allowed in a regular class?

In summary, SignedNumeral includes 2 abstract methods, add() inherited from Numeral, and multiply() declared in this class.

Can you define an abstract class or an interface without declaring an abstract method?

1.2. Implementation

LongInteger: implementation.

What is so special about primitive data types in Java?

Field

Let us declare the member field digits that is an array of bytes holding the values of this integer:

L1: LongInteger is passed to specify the generic type T in SignedNumeral.

The i'th dimension of digits is the i'th least significant digit in the integer such that the integer 12345 would be stored as digits = {5, 4, 3, 2, 1}, which makes it convenient to implement the arithmetic methods, add() and multiply().
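The reversed layout can be sketched as follows. This is only an illustration under the assumption that the input contains decimal digits without a sign; DigitsDemo and its methods are hypothetical names, not part of LongInteger:

```java
final class DigitsDemo {
    /** Converts a string of decimal digits into reversed digits: "12345" -> {5, 4, 3, 2, 1}. */
    static byte[] toDigits(String s) {
        byte[] digits = new byte[s.length()];
        for (int i = 0, j = s.length() - 1; i < s.length(); i++, j--)
            digits[i] = (byte) (s.charAt(j) - '0');
        return digits;
    }

    /** Rebuilds the value for small inputs: digits[i] contributes digits[i] * 10^i. */
    static long toLong(byte[] digits) {
        long value = 0, place = 1;
        for (byte d : digits) {
            value += d * place;
            place *= 10;
        }
        return value;
    }
}
```

With this layout, the index of a digit equals its power of 10, so digits at the same index in two integers can be added or multiplied directly.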

Is the array of bytes the most efficient way of storing a long integer?

Constructors

Let us define the following three constructors:

L2: the default constructor that initializes this integer with 0 by calling the constructor in L20.
L10: a copy constructor that initializes this integer with n.
- super(): calls the corresponding constructor in the super class, SignedNumeral.
L20: a constructor that initializes this integer with n by passing it to the set() method.

Do the constructors in L10 and L20 call any constructor in the super class?

Can you call non-static methods or fields in the body of a static method?

The static keyword must not be abused to quickly fix compile errors unless it is intended.

Method: `set()`

Let us define the set() method that takes a string and sets the sign and the value of this integer:

L1-7: javadoc comments.
L10-11: throws the NullPointerException.
- yield: returns the value of the switch expression for the matched case (introduced in Java 14).
L21-30: sets the value of n to this.digits .
- L23: for-loop can handle multiple variables such as i and j.
- L25-28: throws the InvalidParameterException if v is not a digit.
- L29: stores the value in the reverse order.
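The steps above can be sketched roughly as follows. This is not the course's set() method: the class ParsedInteger is a hypothetical holder that returns the parsed sign and digits instead of storing them in member fields, and the error messages are made up for illustration:

```java
import java.security.InvalidParameterException;

/** Hypothetical holder of a parsed integer; not the LongInteger class of this chapter. */
final class ParsedInteger {
    final boolean negative;
    final byte[] digits;   // least significant digit first

    private ParsedInteger(boolean negative, byte[] digits) {
        this.negative = negative;
        this.digits = digits;
    }

    /** Parses strings such as "123", "+123", or "-123". */
    static ParsedInteger parse(String n) {
        if (n == null)
            throw new NullPointerException("n must not be null");
        if (n.isEmpty())
            throw new InvalidParameterException("n must not be empty");

        // The switch expression yields true only for an explicit minus sign.
        boolean negative = switch (n.charAt(0)) {
            case '-' -> { n = n.substring(1); yield true; }
            case '+' -> { n = n.substring(1); yield false; }
            default -> false;
        };
        if (n.isEmpty())
            throw new InvalidParameterException("no digit follows the sign");

        // i walks the string from the left; j fills the array from the right.
        byte[] digits = new byte[n.length()];
        for (int i = 0, j = n.length() - 1; i < n.length(); i++, j--) {
            char c = n.charAt(i);
            if (c < '0' || c > '9')
                throw new InvalidParameterException(n + " contains a non-digit character");
            digits[j] = (byte) (c - '0');
        }
        return new ParsedInteger(negative, digits);
    }
}
```

For example, ParsedInteger.parse("-12345") produces a negative sign and the digits {5, 4, 3, 2, 1}.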

Method: `add()`

Let us override the add() method that calls two helper methods:

L3-4: adds n to this integer that has the same sign by calling addSameSign().
L5-6: adds n to this integer that has a different sign by calling addDifferentSign().

The following shows an implementation of addSameSign() based on the simple arithmetic:

L7-9: creates the byte array result by copying values in this integer.
- L8: the dimension of result can be 1 more than m after the addition.
L12-19: adds n to result (if exists) from the least significant digit.
- L15-18: pass a carry to the next digit.
L22: trims the most significant digit if it is 0.
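The digit-wise addition described above can be sketched on plain byte arrays as follows. This is a simplified stand-in, not the course's addSameSign(): it takes two reversed digit arrays as parameters (the class name AddSketch is hypothetical) and returns a new array instead of updating a member field:

```java
import java.util.Arrays;

final class AddSketch {
    /** Adds two non-negative integers stored as reversed digit arrays. */
    static byte[] addSameSign(byte[] a, byte[] b) {
        int m = Math.max(a.length, b.length);
        // One extra slot in case the last carry creates a new most significant digit.
        byte[] result = new byte[m + 1];
        System.arraycopy(a, 0, result, 0, a.length);

        for (int i = 0; i < m; i++) {
            if (i < b.length) result[i] += b[i];
            if (result[i] >= 10) {      // pass a carry to the next digit
                result[i] -= 10;
                result[i + 1] += 1;
            }
        }

        // Trim the most significant digit if it is 0.
        return result[m] == 0 ? Arrays.copyOf(result, m) : result;
    }
}
```

For example, adding {9, 9, 9} (999) and {1} (1) gives {0, 0, 0, 1} (1000), which uses the extra slot, whereas adding {3, 2, 1} and {5, 4} gives {8, 6, 1} after trimming.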

What are tradeoffs to make the size of result to be m instead of m+1 and vice versa?

In practice, addSameSign() and addDifferentSign() should be private. We made them protected for exercise purposes.

Method: `multiply()`

Let us override the multiply() method:

L4: sets the sign after the multiplication.
L7-15: multiplies n to this integer:
- L7: the max-dimension of results is digits.length + n.digits.length.
- L12-13: pass a carry to the next digit.
L18-20: trims the most significant digit iteratively if it is 0.
- L20: ++m increments m before the comparison.
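The multiplication of the absolute values can be sketched with the schoolbook method as follows. This is an illustration only: MultiplySketch and its helper methods are hypothetical, the sign handling is omitted, and the loops may differ from the course's implementation:

```java
import java.util.Arrays;

final class MultiplySketch {
    /** Multiplies two non-negative integers stored as reversed digit arrays. */
    static byte[] multiply(byte[] a, byte[] b) {
        // The product has at most a.length + b.length digits.
        byte[] result = new byte[a.length + b.length];

        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < b.length; j++) {
                int k = i + j;
                int v = result[k] + a[i] * b[j];
                result[k] = (byte) (v % 10);
                result[k + 1] += (byte) (v / 10);   // pass a carry to the next digit
            }
        }

        // Trim the most significant digits iteratively while they are 0, keeping one digit.
        int m = result.length;
        while (m > 1 && result[m - 1] == 0) m--;
        return Arrays.copyOf(result, m);
    }

    /** Helper: "123" -> {3, 2, 1}; assumes decimal digits only. */
    static byte[] toDigits(String s) {
        byte[] digits = new byte[s.length()];
        for (int i = 0, j = s.length() - 1; i < s.length(); i++, j--)
            digits[i] = (byte) (s.charAt(j) - '0');
        return digits;
    }

    /** Helper: {3, 2, 1} -> "123". */
    static String toString(byte[] digits) {
        StringBuilder build = new StringBuilder();
        for (int i = digits.length - 1; i >= 0; i--) build.append(digits[i]);
        return build.toString();
    }
}
```

Every digit of one integer is multiplied by every digit of the other, so the two nested loops run digits.length * n.digits.length times in this sketch.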

What is the worst-case complexity of the multiply() method?

Method: `main()`

Can we define the main method in LongInteger instead without creating LongIntegerRun?

L2: the parameter args is passed from the command line.

Why does the main method need to be static?

This prints something like [Ljava.lang.String;@d716361, which consists of the following parts:

[: one-dimensional array.
L: the element of this array is an object.
java.lang.String: the type of object.
d716361: the hash code of this array in hexadecimal.
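Arrays inherit toString() from Object, which explains this representation. The following sketch contrasts it with Arrays.toString(), which prints the elements instead; the class ArrayPrintDemo is a hypothetical name:

```java
import java.util.Arrays;

final class ArrayPrintDemo {
    /** Default representation inherited from Object: type descriptor, '@', and the hash code in hexadecimal. */
    static String defaultString(String[] args) {
        return args.toString();
    }

    /** Element-wise representation such as [123, -456]. */
    static String readableString(String[] args) {
        return Arrays.toString(args);
    }
}
```

The hexadecimal part of the default representation typically differs between runs, whereas the element-wise representation depends only on the contents of the array.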

What is the hash code of an object?

How is the Arrays.toString() method implemented?

Since no argument is passed to the main method at the moment, this prints an empty array:

If you set the arguments to 123 -456 using the [Run - Edit Configurations - Program arguments] setting, it prints the following array:

Given those two arguments, we can create two integers:

This prints something like the following, which are returned by a.toString():

How is the toString() method implemented in the Object class?

Method: `toString()`

To print a more readable representation, we need to override the toString() method in LongInteger:

What are the advantages of using StringBuilder instead of concatenating values with the + operator as follows:

Given the overridden method, the above main method now prints the following:

What are the advantages of overriding toString() instead of creating a new method with the same code, and calling the new method to get the string representation of LongInteger?
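As an illustration of overriding toString() with a StringBuilder, consider the following minimal sketch. The class ReadableInteger is hypothetical and stores its sign as a boolean for brevity, which is an assumption rather than how LongInteger is defined:

```java
final class ReadableInteger {
    private final boolean negative;
    private final byte[] digits;   // least significant digit first

    ReadableInteger(boolean negative, byte[] digits) {
        this.negative = negative;
        this.digits = digits.clone();
    }

    @Override
    public String toString() {
        StringBuilder build = new StringBuilder();
        if (negative) build.append('-');
        // Append from the most significant digit, which is stored last.
        for (int i = digits.length - 1; i >= 0; i--)
            build.append(digits[i]);
        return build.toString();
    }
}
```

Since toString() is overridden, string concatenation, String.valueOf(), and System.out.println() all pick up this representation without any additional method call.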

Method: `compareTo()`

L2: LongInteger is passed to Comparable as a generic type.

Is extends always used to inherit a class whereas implements is used to inherit an interface?

The compareAbs() method compares the absolute values of this and n:

L7: if digits has more dimensions, its absolute value is greater.
L10-13: compares the significant digits iteratively.
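The comparison of absolute values can be sketched on reversed digit arrays as follows. The class CompareSketch is hypothetical, and the sketch assumes that neither array stores leading zeros:

```java
final class CompareSketch {
    /** @return a negative, zero, or positive value if |a| is less than, equal to, or greater than |b|. */
    static int compareAbs(byte[] a, byte[] b) {
        // Without leading zeros, more digits means a greater absolute value.
        int diff = a.length - b.length;
        if (diff != 0) return diff;

        // Same length: compare from the most significant digit, which is stored last.
        for (int i = a.length - 1; i >= 0; i--) {
            diff = a[i] - b[i];
            if (diff != 0) return diff;
        }
        return 0;
    }
}
```

Sharing the index i between both arrays is safe in this sketch because the loop is reached only when the two arrays have the same length.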

Is it safe to use the same variable i to iterate both digits and n.digits?

Once LongInteger properly inherits Comparable by overriding compareTo(), objects instantiated by this class can be compared using many built-in methods.

- <>: the diamond operator that infers the generic type from its declaration.

What is the advantage of declaring list as List instead of ArrayList? What kind of sorting algorithm does Collections.sort() use?
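To illustrate how a Comparable type works with built-in methods, here is a minimal sketch. The class Amount is a hypothetical stand-in that wraps a long; LongInteger would compare signs and digits instead:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical comparable type wrapping a long. */
final class Amount implements Comparable<Amount> {
    private final long value;

    Amount(long value) { this.value = value; }

    long value() { return value; }

    @Override
    public int compareTo(Amount other) {
        return Long.compare(value, other.value);
    }
}

final class SortDemo {
    /** Returns the given values as a list sorted in ascending order. */
    static List<Amount> sorted(long... values) {
        List<Amount> list = new ArrayList<>();   // <> infers Amount from the declaration
        for (long v : values) list.add(new Amount(v));
        Collections.sort(list);                  // relies on compareTo()
        return list;
    }
}
```

Other built-in methods such as Collections.max() and Collections.min() rely on the same compareTo() method.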

The above code prints the following sorted lists:

What would be the case that needs to distinguish -0 from 0?

1.3. Unit Testing

LongInteger: unit tests.

Test: `LongInteger()`

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

public class LongIntegerTest {
    @Test
    public void testConstructors() {
        // default constructor
        assertEquals("0", new LongInteger().toString());

        // constructor with a string parameter
        assertEquals("12", new LongInteger("12").toString());
        assertEquals("34", new LongInteger("+34").toString());
        assertEquals("-56", new LongInteger("-56").toString());
        assertEquals("-0", new LongInteger("-0").toString());

        // copy constructor
        assertEquals("12", new LongInteger(new LongInteger("12")).toString());
        assertEquals("-34", new LongInteger(new LongInteger("-34")).toString());
    }
}

L7: methods used for unit testing must be public.
L8: tests the default constructor
L12-15: tests the constructor with a string parameter.
L18-19: tests the copy constructor.

When should we use import static instead of import?

When you run this test, you see a prompt similar to the following:

Tests passed: 1 of 1 test

Test: `multiply()`

Let us define another method for testing the multiply() method:

@Test
public void testMultiply() {
    LongInteger a = new LongInteger("123456789");

    a.multiply(new LongInteger("1"));
    assertEquals("123456789", a.toString());

    a.multiply(new LongInteger("-1"));
    assertEquals("-123456789", a.toString());

    a.multiply(new LongInteger("-1234567890123456789"));
    assertEquals("152415787517146788750190521", a.toString());

    a.multiply(new LongInteger("0"));
    assertEquals("0", a.toString());

    a.multiply(new LongInteger("-0"));
    assertEquals("-0", a.toString());
}

Test: `compareTo()`

Let us define a method for testing the compareTo() method:

@Test
public void testCompareTo() {
    assertTrue(0 < new LongInteger("0").compareTo(new LongInteger("-0")));
    assertTrue(0 > new LongInteger("-0").compareTo(new LongInteger("0")));

    assertTrue(0 < new LongInteger("12").compareTo(new LongInteger("-34")));
    assertTrue(0 > new LongInteger("-12").compareTo(new LongInteger("34")));

    assertTrue(0 > new LongInteger("-34").compareTo(new LongInteger("12")));
    assertTrue(0 < new LongInteger("34").compareTo(new LongInteger("-12")));
}

1.4. Quiz

Quiz 1: Java Essentials

Coding

Testing

Test the correctness of your LongIntegerQuiz using the unit tests.
Add more tests for a more thorough assessment if necessary.

Quizzes

What is the advantage of using Generics?
How do you make the class you define comparable?
What is the advantage of overriding member methods in the Object class?
What kind of methods should be defined as static?

Submission

1. Commit and push everything under the following packages to your GitHub repository:

2. Priority Queues

This chapter discusses different types of priority queues and benchmarks their performance in terms of the worst-case complexities.

Resources

References

3.1. Abstraction

The abstract class for all sorting algorithms.

Abstract Sort

L2-4: member fields:
- comparator: specifies the precedence of comparable keys.
- comparisons: counts the number of comparisons performed during sorting.
- assignments: counts the number of assignments performed during sorting.

The two member fields, comparisons and assignments, are used to micro-benchmark sorting algorithms inheriting AbstractSort.

Let us define three helper methods, compareTo(), assign(), and swap():

L7: compares two keys in the array and increments the count.
L18: assigns the value to the specific position in the array and increments the count.
L29: swaps the values of the specific positions in the array by calling the assign() method.

These helper methods allow us to analyze the exact counts of those operations performed by different sorting algorithms.

How is benchmarking runtime speed different from counting these two operations?

Finally, let us define the default sort() that calls an overloaded abstract method, sort():

L5: calls the abstract method sort() that overloads it.
L15: sorts the array in the range of (beginIndex, endIndex) as specified in comparator.
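The pieces described in this section can be put together roughly as follows. This is a sketch rather than the course's AbstractSort: the class names CountingSort and SelectionSortSketch are hypothetical, the range is assumed to include beginIndex and exclude endIndex, and selection sort is chosen only because it is short:

```java
import java.util.Comparator;

abstract class CountingSort<T extends Comparable<T>> {
    private final Comparator<T> comparator;
    protected long comparisons;
    protected long assignments;

    CountingSort(Comparator<T> comparator) {
        this.comparator = comparator;
    }

    /** Compares two keys in the array and increments the count. */
    protected int compareTo(T[] array, int i, int j) {
        comparisons++;
        return comparator.compare(array[i], array[j]);
    }

    /** Assigns the value to the position in the array and increments the count. */
    protected void assign(T[] array, int index, T value) {
        assignments++;
        array[index] = value;
    }

    /** Swaps two positions by calling assign() twice. */
    protected void swap(T[] array, int i, int j) {
        T t = array[i];
        assign(array, i, array[j]);
        assign(array, j, t);
    }

    /** Sorts the entire array by calling the overloaded abstract method. */
    public void sort(T[] array) {
        sort(array, 0, array.length);
    }

    public abstract void sort(T[] array, int beginIndex, int endIndex);
}

/** Hypothetical subclass: selection sort in ascending order. */
final class SelectionSortSketch<T extends Comparable<T>> extends CountingSort<T> {
    SelectionSortSketch() {
        super(Comparator.naturalOrder());
    }

    @Override
    public void sort(T[] array, int beginIndex, int endIndex) {
        for (int i = beginIndex; i < endIndex - 1; i++) {
            int min = i;
            for (int j = i + 1; j < endIndex; j++)
                if (compareTo(array, j, min) < 0) min = j;
            if (min != i) swap(array, i, min);
        }
    }
}
```

Because every comparison and assignment goes through the helper methods, the two counters reflect the operations performed by the algorithm regardless of the machine it runs on.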

When would it be useful to sort a specific range in the input array?

1.1. Abstraction

Different types of objects and inheritances in Java.

Class

A class is a template to instantiate an object.

What is the relationship between a class and an object?

Let us create a class called Numeral that we want to be a super class of all numeral types:

package edu.emory.cs.algebraic;

public class Numeral {
}

L1: package indicates the name of the package that this class belongs to in a hierarchy.
L3: public is an access-level modifier.

What are the acceptable access-level modifiers to declare a top-level class?

Let us declare a method, add(), that is an operation expected by all numeral types:

public class Numeral {
    /**
     * Adds `n` to this numeral.
     * @param n the numeral to be added.
     */
    public void add(Numeral n) { /* cannot be implemented */ }
}

L4: @param adds a comment about the parameter.

What are the advantages of having Numeral as a super class of all numeral types?

There are two types of abstract classes in Java, abstract class and interface.

Can an object be instantiated by an abstract class or an interface?

Interface

Let us define Numeral as an interface:

public interface Numeral {
    void add(Numeral n);
}

L2: abstract method
- Methods in an interface are public by default, so the modifier does not need to be written explicitly.
- Abstract methods in an interface are declared without their bodies.

Who defines the bodies of the abstract methods?

Let us create a new interface called SignedNumeral that inherits Numeral and adds two methods, flipSign() and subtract():

Can an interface inherit either an abstract class or a regular class?

public interface SignedNumeral extends Numeral {
    /** Flips the sign of this numeral. */
    void flipSign();

    /**
     * Subtracts `n` from this numeral.
     * @param n the numeral to be subtracted.
     */
    default void subtract(Numeral n) {
        n.flipSign();
        add(n);
        n.flipSign();
    }
}

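L1: extends inherits a class or an interface. A class can extend only one class, whereas an interface can extend one or more interfaces.
L9: default allows an interface to define a default method, that is, a method with a body (introduced in Java 8).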

Can we call add() that is an abstract method without a body in the default method subtract()?

What kind of a compile error does n.flipSign() cause?

There are three ways of handling this error: casting, polymorphism, and generics.

Casting

The first way is to downcast the type of n to SignedNumeral, which forces the compiler to think that n can invoke the flipSign() method:

default void subtract(Numeral n) {
    ((SignedNumeral)n).flipSign();
    add(n);
    ((SignedNumeral)n).flipSign();
}

This removes the compile error; however, it will likely cause a worse kind, a runtime error.

Why is a runtime error worse than a compile error?

Downcasting, although allowed in Java, is generally not recommended unless there is no other way of accomplishing the job without using it.

How can downcasting cause a runtime error in the above case?

Polymorphism

The second way is to change the type of n to SignedNumeral in the parameter setting:

default void subtract(SignedNumeral n) {
    n.flipSign();
    add(n);
    n.flipSign();
}

This seems to solve the issue. Then, what about add() defined in Numeral? Should we change its parameter type to SignedNumeral as well?

Why is it not recommended to change the code in a super class?

How about we override the add() method as follows?

public interface SignedNumeral extends Numeral {
    @Override
    void add(SignedNumeral n);
    ...

L2: @Override is a predefined annotation type to indicate the method is overridden.

The annotation @Override gives an error in this case because it is not considered an overriding.

What are the criteria to override a method?

When @Override is discarded, the error goes away and everything seems to be fine:

public interface SignedNumeral extends Numeral {
    void add(SignedNumeral n);
    ...

However, this is considered an overloading, which defines two separate methods for add(), one taking n as Numeral and the other taking it as SignedNumeral. Unfortunately, this would decrease the level of abstraction that we originally desired.
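The existence of two separate add() methods can be observed with the following minimal sketch. As an assumption made only for this illustration, add() returns a String describing which method was invoked (the methods in this section return void), and the class TracingNumeral is hypothetical:

```java
interface Numeral {
    String add(Numeral n);
}

interface SignedNumeral extends Numeral {
    // Not an override: the parameter type differs, so this is a second add().
    String add(SignedNumeral n);
}

/** Hypothetical implementation that reports which add() was invoked. */
final class TracingNumeral implements SignedNumeral {
    @Override
    public String add(Numeral n) { return "add(Numeral)"; }

    @Override
    public String add(SignedNumeral n) { return "add(SignedNumeral)"; }
}
```

Which method is invoked depends on the declared type of the argument at compile time, not on the actual type of the object, so the same object can reach either method depending on how the variable holding it is declared.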

What are good use cases of method overriding and overloading?

Generics

The third way is to use generics, introduced in Java 5:

public interface Numeral<T extends Numeral<T>> {
    void add(T n);
}

L1: T is a generic type that is a subtype of Numeral.
- A generic type can be recursively defined as T extends Numeral<T>.
L2: T is considered a viable type in this interface such that it can be used to declare add().

Can we define more than one generic type per interface or class?

The generic type T can be specified in a subclass of Numeral:

public interface SignedNumeral extends Numeral<SignedNumeral> {
    void flipSign();

    default void subtract(SignedNumeral n) {
        n.flipSign();
        add(n);
        n.flipSign();
    }
}

L1: T is specified as SignedNumeral.

This would implicitly assign the parameter type of add() as follows:

void add(SignedNumeral n);

public class LongInteger implements SignedNumeral {
    @Override
    public void flipSign() { /* to be implemented */ }

    @Override
    public void add(SignedNumeral n) { /* to be implemented */ }
}

L1: implements inherits an interface.
L2-6: LongInteger is a regular class, so all abstract methods declared in the super classes must be defined in this class.

Since n is typed as SignedNumeral in L6, it cannot call any method defined in LongInteger, which leads to the same issue addressed in the Casting section.

Would the type of n being SignedNumeral be an issue for the subtract() method as well?

Thus, SignedNumeral needs to define its own generic type and pass it onto Numeral:

public interface SignedNumeral<T extends SignedNumeral<T>> extends Numeral<T> {
    void flipSign();

    default void subtract(T n) {
        n.flipSign();
        add(n);
        n.flipSign();
    }
}

L1: T is a generic type inheriting SignedNumeral, which implies that T can be any subclass of SignedNumeral.

T can be safely passed onto Numeral because if it is a subclass of SignedNumeral, it must be a subclass of Numeral, which is how T is defined in the Numeral class.

Generics are used everywhere in Java, so it is important to understand the core concept of generics and be able to adapt it in your code to make it more modular.

Enum

Let us create an enum class called Sign to represent the "sign" of the numeral:

public enum Sign {
    POSITIVE,
    NEGATIVE;
}

All items in an enum have the scope of static and the access-level of public.
Items must be delimited by , and the list of items ends with ;.

The items in the enum can be assigned with specific values to make them more indicative (e.g., +, -):

public enum Sign {
    POSITIVE('+'),
    NEGATIVE('-');

    private final char value;

    Sign(char value) {
        this.value = value;
    }

    /** @return the value of the corresponding item. */
    public char value() {
        return value;
    }
}

L5: final makes this field a constant, not a variable, such that the value cannot be updated later.
L8: this points to the object created by this constructor.
L11: @return adds a javadoc comment about the return value of this method.

Why should the member field value be private in the above example?

Note that value in L8 indicates the local parameter declared in the constructor whereas value in L13 indicates the member field declared in L5.
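As a small usage sketch, the items and their values can be accessed as follows. The enum is repeated to keep the example self-contained (with package-private access), and fromChar() is a hypothetical helper that is not part of the course code:

```java
enum Sign {
    POSITIVE('+'),
    NEGATIVE('-');

    private final char value;

    Sign(char value) {
        this.value = value;
    }

    char value() {
        return value;
    }
}

class SignDemo {
    /** Hypothetical helper: finds the item whose value matches the character. */
    static Sign fromChar(char c) {
        for (Sign s : Sign.values())   // values() lists all items in declaration order
            if (s.value() == c) return s;
        throw new IllegalArgumentException("Unknown sign: " + c);
    }

    public static void main(String[] args) {
        for (Sign s : Sign.values())
            System.out.println(s + " -> " + s.value());
    }
}
```

Since value is private and final, the only way to read it is through value(), and no code can modify it after the item is created.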

Limit of Interface

In SignedNumeral, it would be convenient to have a member field that indicates the sign of the numeral:

public interface SignedNumeral<T extends SignedNumeral<T>> extends Numeral<T> {
    Sign sign = Sign.POSITIVE;
    ...

L2: All member fields of an interface are implicitly public, static, and final.

Can you declare a member field in an interface without assigning a value?

Given the sign field, it may seem intuitive to define flipSign() as a default method:

/** Flips the sign of this numeral. */
default void flipSign() {
    sign = (sign == Sign.POSITIVE) ? Sign.NEGATIVE : Sign.POSITIVE;
}

L3: condition ? A : B is a ternary operator that returns A if the condition is true; otherwise, it returns B.
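The modifiers that Java implicitly attaches to an interface field can be checked with reflection. The following is only a sketch: WithSign and InterfaceFieldDemo are hypothetical names, and the enum is reduced to its two items:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

enum Sign { POSITIVE, NEGATIVE }

/** Hypothetical interface that declares a field without any explicit modifier. */
interface WithSign {
    Sign sign = Sign.POSITIVE;
}

class InterfaceFieldDemo {
    /** @return the modifiers of WithSign.sign as a string. */
    static String signModifiers() {
        try {
            Field field = WithSign.class.getField("sign");
            return Modifier.toString(field.getModifiers());
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(signModifiers());
    }
}
```

Because the field is final, an assignment such as the one inside flipSign() above is rejected by the compiler.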

Is there any advantage of using a ternary operator instead of using a regular if statement?

Abstract Class

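The default method above does not compile because sign, like every field of an interface, is implicitly final and cannot be reassigned. Let us turn SignedNumeral into an abstract class instead: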

public abstract class SignedNumeral<T extends SignedNumeral<T>> implements Numeral<T> {
    /** The sign of this numeral. */
    protected Sign sign;

    /**
     * Create a signed numeral.
     * the default sign is {@link Sign#POSITIVE}.
     */
    public SignedNumeral() {
        this(Sign.POSITIVE);
    }

    /**
     * Create a signed numeral.
     * @param sign the sign of this numeral.
     */
    public SignedNumeral(Sign sign) {
        this.sign = sign;
    }
    ...

L9: the default constructor with no parameter.
L17: another constructor with the sign parameter.
L10: this() calls the constructor in L17.

Why call this(Sign.POSITIVE) in L10 instead of stating this.sign = Sign.POSITIVE?

    ...
    /** @return true if this numeral is positive; otherwise, false. */
    public boolean isPositive() {
        return sign == Sign.POSITIVE;
    }

    /** @return true if this numeral is negative; otherwise, false. */
    public boolean isNegative() {
        return sign == Sign.NEGATIVE;
    }

    /** Flips the sign of this numeral. */
    public void flipSign() {
        sign = isPositive() ? Sign.NEGATIVE : Sign.POSITIVE;
    }
    
    /**
     * Subtracts `n` from this numeral.
     * @param n the numeral to be subtracted.
     */
    public void subtract(T n) {
        n.flipSign(); add(n); n.flipSign();
    }

    /**
     * Multiplies `n` to this numeral.
     * @param n the numeral to be multiplied.
     */
    public abstract void multiply(T n);
}

L29: abstract indicates that this is an abstract method.

Member fields and methods in an abstract class can be decorated by any modifiers, which need to be explicitly coded.

Is there anything that is not allowed in an abstract class but allowed in a regular class?

In summary, SignedNumeral includes 2 abstract methods, add() inherited from Numeral, and multiply() declared in this class.

Can you define an abstract class or an interface without declaring an abstract method?

1.2. Implementation

LongInteger: implementation.

We are going to create a class called LongInteger inheriting SignedNumeral that can store an integer value of arbitrary size, beyond the primitive data types such as int and long.

What is so special about primitive data types in Java?

Java SE provides a similar class called BigInteger, although the implementations of LongInteger and BigInteger are completely independent.

Field

Let us declare the member field digits that is an array of bytes holding the values of this integer:

L1: LongInteger is passed to specify the generic type T in SignedNumeral.

Is the array of bytes the most efficient way of storing a long integer?

Constructors

Let us define the following three constructors:

L2: the default constructor that initializes this integer with 0 by calling the constructor in L20.
L10: a copy constructor that initializes this integer with n.
- super(): calls the corresponding constructor in the super class, SignedNumeral.
- Arrays.copyOf(): creates a new array by copying n.digits.
L20: a constructor that initializes this integer with n by passing it to the set() method.

Do the constructors in L10 and L20 call any constructor in the super class?

Arrays.copyOf() is a static method referenced by the class type Arrays, not an object. Java provides many classes with static methods that are commonly used (e.g., Arrays, Collections).

Can you call non-static methods or fields in the body of a static method?

The static keyword must not be abused to quickly fix compile errors unless it is intended.

Method: `set()`

Let us define the set() method that takes a string and sets the sign and the value of this integer:

L1-7: javadoc comments.
- L4: this method throws NullPointerException if n is null.
- L5: this method throws InvalidParameterException if the format of n is invalid.
L10-11: throws the NullPointerException.
L14-18: checks the first character of n and sets this.sign using the switch expression.
- String member methods used here include charAt().
- yield: returns the value of this switch expression for the condition (introduced in Java 14).
L21-30: sets the value of n to this.digits.
- L23: for-loop can handle multiple variables such as i and j.
- L24: gets the value of n.charAt(i).
- L25-28: throws the InvalidParameterException if v is not a digit.
- L29: stores the value in the reverse order.
- L27: is a static method in String.
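The parsing steps described above can be sketched as a standalone helper. This is an illustration under assumptions rather than the course implementation: DigitParser and its two static methods are hypothetical, and a plain switch statement is used instead of the switch expression with yield so that the sketch also compiles on older Java versions:

```java
import java.security.InvalidParameterException;

class DigitParser {
    /** @return true if n starts with '-'; throws NullPointerException if n is null. */
    static boolean isNegative(String n) {
        if (n == null) throw new NullPointerException();
        return !n.isEmpty() && n.charAt(0) == '-';
    }

    /** @return the digits of n, stored from the least significant digit. */
    static byte[] toDigits(String n) {
        if (n == null) throw new NullPointerException();

        // skip an optional leading sign
        int start = 0;
        if (!n.isEmpty()) {
            switch (n.charAt(0)) {
                case '+': case '-': start = 1; break;
                default: break;
            }
        }

        int length = n.length() - start;
        if (length == 0)
            throw new InvalidParameterException("No digit is found: " + n);
        byte[] digits = new byte[length];

        // i walks the string from the right; j fills the array from the left
        for (int i = n.length() - 1, j = 0; i >= start; i--, j++) {
            int v = n.charAt(i) - '0';
            if (v < 0 || v > 9)
                throw new InvalidParameterException(String.format("%s is not a valid integer", n));
            digits[j] = (byte) v;
        }
        return digits;
    }
}
```

For example, toDigits("-456") yields the array {6, 5, 4}, so the digit at index 0 is always the least significant one.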

When should we use throw statements over try..catch blocks for error handling and vice versa?

This type of method is called a setter. Java encourages making member fields private and creating getters and setters to access the fields for encapsulation, which is not necessarily encouraged by other languages.

Method: `add()`

Let us override the add() method that calls two helper methods:

L3-4: adds n to this integer that has the same sign by calling addSameSign().
L5-6: adds n to this integer that has a different sign by calling addDifferentSign().

The following shows an implementation of addSameSign() based on the simple arithmetic:

L7-9: creates the byte array result by copying values in this integer.
- L8: the dimension of result can be 1 more than m after the addition.
- Static methods: , .
L12-19: adds n to result (if the digit exists) from the least significant digit.
- L15-18: pass a carry to the next digit.
L22: trims the most significant digit if it is 0.
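The steps above can be sketched on plain byte arrays. DigitAdder.add() is a hypothetical standalone helper, not the actual addSameSign() method; it assumes that both arrays hold the digits 0-9 from the least significant digit:

```java
import java.util.Arrays;

class DigitAdder {
    /** Adds two magnitudes whose digits are stored from the least significant digit. */
    static byte[] add(byte[] a, byte[] b) {
        int m = Math.max(a.length, b.length);
        // one extra slot in case the last carry creates a new digit
        byte[] result = new byte[m + 1];
        System.arraycopy(a, 0, result, 0, a.length);

        for (int i = 0; i < m; i++) {
            if (i < b.length) result[i] += b[i];
            if (result[i] >= 10) {   // pass a carry to the next digit
                result[i] -= 10;
                result[i + 1] += 1;
            }
        }

        // trim the most significant digit if it is 0
        return result[m] == 0 ? Arrays.copyOf(result, m) : result;
    }
}
```

For instance, adding {9, 9, 9} and {1} (that is, 999 + 1) propagates the carry through every digit and gives {0, 0, 0, 1}.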

What are the tradeoffs of making the size of result m instead of m+1 and vice versa?

The following shows addDifferentSign() that throws an exception:

The implementation of addDifferentSign() is quite similar to addSameSign() although it involves a bit more logic. We will leave this as an exercise.

In practice, addSameSign() and addDifferentSign() should be private. We made them protected for exercise purposes.

Method: `multiply()`

Let us override the multiply() method:

L4: sets the sign after the multiplication.
L7-15: multiplies n to this integer:
- L7: the max-dimension of results is digits.length + n.digits.length.
- L12-13: pass a carry to the next digit.
L18-20: trims the most significant digit iteratively if it is 0.
- L20: ++m increments m before the comparison.
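The multiplication described above can be sketched on plain byte arrays as follows. DigitMultiplier.multiply() is a hypothetical standalone helper that ignores the sign; it uses an explicit carry variable per row, which may differ from how the course implementation passes the carry:

```java
import java.util.Arrays;

class DigitMultiplier {
    /** Multiplies two magnitudes whose digits are stored from the least significant digit. */
    static byte[] multiply(byte[] a, byte[] b) {
        // the product has at most a.length + b.length digits
        byte[] result = new byte[a.length + b.length];

        for (int i = 0; i < a.length; i++) {
            int carry = 0;
            for (int j = 0; j < b.length; j++) {
                int v = result[i + j] + a[i] * b[j] + carry;   // at most 9 + 81 + 9 = 99
                result[i + j] = (byte) (v % 10);
                carry = v / 10;                                // pass a carry to the next digit
            }
            result[i + b.length] = (byte) carry;               // this slot has not been used yet
        }

        // trim the most significant digits iteratively while they are 0
        int m = result.length;
        while (m > 1 && result[m - 1] == 0) m--;
        return m < result.length ? Arrays.copyOf(result, m) : result;
    }
}
```

The two nested loops visit every pair of digits once, which is a useful starting point for the complexity question below.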

What is the worst-case complexity of the multiply() method?

Method: `main()`

Let us create a runnable class called LongIntegerRun that contains the main method:

Can we define the main method in LongInteger instead without creating LongIntegerRun?

L2: the parameter args is passed from the command line.

Why does the main method need to be static?

This prints something like the following:

[Ljava.lang.String;@d716361

[: one-dimensional array.
L: the element of this array is an object.
java.lang.String: the type of object.
d716361: the hash code of this array in hexadecimal.

What is the hash code of an object?

Every object implicitly inherits Object that defines a few member methods including toString(), which gets called automatically by the println() method to retrieve the string representation of this object. We can use the helper method Arrays.toString() that gives a more readable representation:

How is the Arrays.toString() method implemented?

Since no argument is passed to the main method at the moment, this prints an empty array:

[]

If you set the arguments to 123 -456 using the [Run - Edit Configurations - Program arguments] setting, it prints the following array:

[123, -456]

Given those two arguments, we can create two integers:

This prints something like the following, which are returned by a.toString():

How is the toString() method implemented in the Object class?

Method: `toString()`

To print a more readable representation, we need to override the toString() method in LongInteger:

L5: StringBuilder provides an efficient way of concatenating different data types into one string.
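A minimal sketch of this idea is shown below. DigitFormatter.format() is a hypothetical static helper (not the overridden toString() itself) that assumes the digits are stored from the least significant digit:

```java
class DigitFormatter {
    /** @return the string representation of the sign followed by the digits. */
    static String format(boolean negative, byte[] digits) {
        StringBuilder build = new StringBuilder();
        if (negative) build.append('-');

        // digits are stored in reverse order, so iterate from the end
        for (int i = digits.length - 1; i >= 0; i--)
            build.append(digits[i]);   // a byte is appended as its decimal value

        return build.toString();
    }
}
```

Only one String object is created at the end, no matter how many digits are appended.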

What are the advantages of using StringBuilder instead of concatenating values with the + operator as follows:

Given the overridden method, the above main method now prints the following:

What are the advantages of overriding toString() instead of creating a new method with the same code, and calling the new method to get the string representation of LongInteger?

Method: `compareTo()`

Java does not allow operator overloading, so it is not possible to use relational operators (e.g., <, >) to compare the two integers above, a and b:

In fact, any object that is comparable must inherit the Comparable interface as follows:

L2: LongInteger is passed to Comparable as a generic type.

Is extends always used to inherit a class whereas implements is used to inherit an interface?

The Comparable interface contains one abstract method called compareTo() that returns a negative value if this object is smaller than n, a positive value if this object is greater than n, and zero if this object is equal to n. The compareTo() method must be overridden by the LongInteger class:

The compareAbs() method compares the absolute values of this and n:

L7: if digits has more dimensions, its absolute value is greater.
L10-13: compares the significant digits iteratively.
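The comparison logic can be sketched on plain byte arrays. DigitComparator is a hypothetical helper, and it assumes that the digits are stored from the least significant digit without leading zeros; under this sketch, -0 would compare as smaller than 0:

```java
class DigitComparator {
    /** Compares the absolute values of two magnitudes. */
    static int compareAbs(byte[] a, byte[] b) {
        // the magnitude with more digits has the greater absolute value
        int diff = a.length - b.length;
        if (diff != 0) return diff;

        // otherwise, compare from the most significant digit
        for (int i = a.length - 1; i >= 0; i--) {
            diff = a[i] - b[i];
            if (diff != 0) return diff;
        }
        return 0;
    }

    /** Compares two signed values given their signs and magnitudes. */
    static int compare(boolean aNegative, byte[] a, boolean bNegative, byte[] b) {
        if (aNegative != bNegative) return aNegative ? -1 : 1;
        int abs = compareAbs(a, b);
        return aNegative ? -abs : abs;   // for two negatives, the order is reversed
    }
}
```

When both values are negative, the one with the greater absolute value is the smaller one, which is why the result of compareAbs() is negated.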

Is it safe to use the same variable i to iterate both digits and n.digits?

Once LongInteger properly inherits Comparable by overriding compareTo(), objects instantiated by this class can be compared using many built-in methods.

L2: ArrayList is a specific implementation of the interface List.
- All collections in Java inheriting the Collection interface use generics.
- <>: the diamond operator that infers the generic type from its declaration.
L11: sorts the list in ascending order using Collections.sort() and the natural ordering of the keys.
L14: sorts the list in descending order using Collections.sort() and the reverse ordering of the keys.
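The same pattern works for any class that implements Comparable. The following sketch uses a hypothetical Version class instead of LongInteger to stay self-contained, and it uses Comparator.reverseOrder() for the descending order, which is one possible choice:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical comparable type standing in for LongInteger. */
class Version implements Comparable<Version> {
    final int major, minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    @Override
    public int compareTo(Version o) {
        int diff = Integer.compare(major, o.major);
        return diff != 0 ? diff : Integer.compare(minor, o.minor);
    }

    @Override
    public String toString() {
        return major + "." + minor;
    }
}

class SortDemo {
    static List<Version> sample() {
        List<Version> list = new ArrayList<>();
        list.add(new Version(1, 10));
        list.add(new Version(0, 9));
        list.add(new Version(1, 2));
        return list;
    }

    public static void main(String[] args) {
        List<Version> list = sample();
        Collections.sort(list);                              // ascending order
        System.out.println(list);
        Collections.sort(list, Comparator.reverseOrder());   // descending order
        System.out.println(list);
    }
}
```

Neither sort() call knows anything about Version other than its compareTo() method.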

What is the advantage of declaring list as List instead of ArrayList? What kind of sorting algorithm does Collections.sort() use?

The above code prints the following sorted lists:

What would be the case that needs to distinguish -0 from 0?

2.1. Simple Priority Queues

Lazy and eager priority queues.

A priority queue (PQ) is a data structure that supports the following two operations:

add(): adds a comparable key to the PQ.
remove(): removes the key with the highest (or lowest) priority in the PQ.

A PQ that removes the key with the highest priority is called a maximum PQ (max-PQ), and with the lowest priority is called a minimum PQ (min-PQ).

Does a priority queue need to be sorted at all times to support those two operations? What are the use cases of priority queues?

Abstract Priority Queue

public abstract class AbstractPriorityQueue<T extends Comparable<T>> {
    protected final Comparator<T> priority;

    /**
     * Initializes this PQ as either a maximum or minimum PQ.
     * @param priority if {@link Comparator#naturalOrder()}, this is a max PQ;
     *                 if {@link Comparator#reverseOrder()}, this is a min PQ.
     */
    public AbstractPriorityQueue(Comparator<T> priority) {
        this.priority = priority;
    }
}

L2: priority is a Comparator that can compare keys of the generic type T.
- final: must be initialized in every constructor.
L6: the javadoc {@link} hyperlinks to the specified methods.

What are comparable data types in Java? Can you define your own comparator?

Let us define three abstract methods, add(), remove(), and size() in AbstractPriorityQueue:

/**
 * Adds a comparable key to this PQ.
 * @param key the key to be added.
 */
abstract public void add(T key);

/**
 * Removes the key with the highest/lowest priority if exists.
 * @return the key with the highest/lowest priority if exists; otherwise, null.
 */
abstract public T remove();

/** @return the size of this PQ. */
abstract public int size();

Given the abstract methods, we can define the regular method isEmpty():

/** @return true if this PQ is empty; otherwise, false. */
public boolean isEmpty() {
    return size() == 0;
}

Lazy Priority Queue

add(): takes $O(1)$ to add a key to the PQ.
remove(): takes $O(n)$ to remove the key with the highest/lowest priority from the PQ.

In other words, all the hard work is done at the last minute when it needs to remove the key.

public class LazyPriorityQueue<T extends Comparable<T>> extends AbstractPriorityQueue<T> {
    private final List<T> keys;

    /** Initializes this as a maximum PQ. */
    public LazyPriorityQueue() {
        this(Comparator.naturalOrder());
    }

    /** @see AbstractPriorityQueue#AbstractPriorityQueue(Comparator). */
    public LazyPriorityQueue(Comparator<T> priority) {
        super(priority);
        keys = new ArrayList<>();
    }
    
    
    @Override
    public int size() {
        return keys.size();
    }
}

L1: declares T and passes it to its super class, AbstractPriorityQueue.
L2: defines a list to store input keys.
L17-19: overrides the size() method.

Can you add keys to the member field keys when it is declared as final (a constant)? Why do all constructors in LazyPriorityQueue need to call the super constructor?

We then override the core methods, add() and remove():

/**
 * Appends a key to {@link #keys}.
 * @param key the key to be added.
 */
@Override
public void add(T key) {
    keys.add(key);
}

/**
 * Finds the key with the highest/lowest priority, and removes it from {@link #keys}.
 * @return the key with the highest/lowest priority if exists; otherwise, null.
 */
@Override
public T remove() {
    if (isEmpty()) return null;
    T max = Collections.max(keys, priority);
    keys.remove(max);
    return max;
}

L6-8: appends a key to the list in $O(1)$ .
L15-20: removes a key from the list in $O(n)$ .
- L16: edge case handling.
- L18: removes a key from the list in $O(n+n) = O(n)$ .
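The role of the comparator in remove() can be seen in isolation with a small sketch. LazyRemoveDemo.remove() is a hypothetical static version of the method above that takes the list and the comparator as parameters:

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class LazyRemoveDemo {
    /** Removes and returns the key with the highest priority; null if the list is empty. */
    static <T extends Comparable<T>> T remove(List<T> keys, Comparator<T> priority) {
        if (keys.isEmpty()) return null;
        T top = Collections.max(keys, priority);   // linear scan to find the key
        keys.remove(top);                          // linear scan to locate and remove it
        return top;
    }
}
```

With Comparator.naturalOrder(), the largest key is removed first (max-PQ); with Comparator.reverseOrder(), the same code removes the smallest key first (min-PQ).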

Is ArrayList the best implementation of List for LazyPriorityQueue? Why does remove() in L18 cost $O(n+n)$ ?

Eager Priority Queues

add(): takes $O(n)$ to add a key to the PQ.
remove(): takes $O(1)$ to remove the key with the highest/lowest priority from the PQ.

In other words, all the hard work is done as soon as a key is added.

In what situations is LazyPQ preferred over EagerPQ and vice versa?

public class EagerPriorityQueue<T extends Comparable<T>> extends AbstractPriorityQueue<T> {
    private final List<T> keys;

    public EagerPriorityQueue() {
        this(Comparator.naturalOrder());
    }

    public EagerPriorityQueue(Comparator<T> priority) {
        super(priority);
        keys = new ArrayList<>();
    }
    
    @Override
    public int size() {
        return keys.size();
    }
}

The implementations of the two constructors and the size() method are identical to the ones in LazyPriorityQueue.

Should we create an abstract class that implements the above code and make it a super class of LazyPQ and EagerPQ? What level of abstraction is appropriate in object-oriented programming?

We then override the core methods, add() and remove():

/**
 * Adds a key to {@link #keys} by the priority.
 * @param key the key to be added.
 */
@Override
public void add(T key) {
    // binary search (if not found, index < 0)
    int index = Collections.binarySearch(keys, key, priority);
    // if not found, the appropriate index is {@code -(index +1)}
    if (index < 0) index = -(index + 1);
    keys.add(index, key);
}

/**
 * Remove the last key in the list.
 * @return the key with the highest priority if exists; otherwise, {@code null}.
 */
@Override
public T remove() {
    return isEmpty() ? null : keys.remove(keys.size() - 1);
}

L6-12: inserts a key to the list in $O(n)$ .
- L8: finds the index of the key to be inserted in the list using binary search in $O(\log n)$ .
- L11: inserts the key at the index position in $O(n)$ .
L19-21: removes a key from the list in $O(1)$ .
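The index arithmetic around Collections.binarySearch() can be tried out with a small sketch. EagerAddDemo.add() is a hypothetical static version of the method above that also returns the index where the key is inserted:

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class EagerAddDemo {
    /** Inserts the key while keeping the list sorted by the priority; returns the index used. */
    static <T extends Comparable<T>> int add(List<T> keys, T key, Comparator<T> priority) {
        int index = Collections.binarySearch(keys, key, priority);
        // if not found, binarySearch() returns -(insertion point) - 1
        if (index < 0) index = -(index + 1);
        keys.add(index, key);
        return index;
    }
}
```

Since the list stays sorted by the priority after every add(), the key with the highest priority is always the last one, which is what allows remove() to run in constant time.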

What are the worst-case complexities of add() and remove() in LazyPQ and EagerPQ in terms of assignments and comparisons?

3.3. Divide & Conquer Sort

Merge sort, quick sort, intro sort

Divide & Conquer

Divide the problem into sub-problems (recursively).
Conquer sub-problems, which effectively solve the super problem.

Complexity

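| | MergeSort | TimSort | QuickSort | IntroSort |
|---|---|---|---|---|
| Best | $O(n \log n)$ | $O(n)$ | $O(n \log n)$ | $O(n \log n)$ |
| Worst | $O(n \log n)$ | $O(n \log n)$ | $O(n^2)$ | $O(n \log n)$ |
| Average | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ |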

Why do people ever want to use QuickSort?

Merge Sort

Divide the input array into two sub-arrays.
Sort each of the sub-arrays and merge them back into the input array.

public class MergeSort<T extends Comparable<T>> extends AbstractSort<T> {
    private T[] temp;

    public MergeSort() {
        this(Comparator.naturalOrder());
    }

    public MergeSort(Comparator<T> comparator) {
        super(comparator);
    }
}

L2: holds the copy of the input array.

Let us then override the sort() method that calls the helper method:

@Override
@SuppressWarnings("unchecked")
public void sort(T[] array, int beginIndex, int endIndex) {
    if (temp == null || temp.length < array.length)
        temp = (T[])Array.newInstance(array[0].getClass(), array.length);
    sort(array, temp, beginIndex, endIndex);
}

L4-5: increases the size of the temp array if necessary.
- L5: unchecked type casting from Object to T[].

What is the advantage of declaring the member field temp?

The helper method can be defined as follows:

/**
 * @param input the input array.
 * @param copy the array to hold the copy of the input array.
 * @param beginIndex the beginning index of the 1st half (inclusive).
 * @param endIndex the ending index of the 2nd half (exclusive).
 */
protected void sort(T[] input, T[] copy, int beginIndex, int endIndex) {
    if (beginIndex + 1 >= endIndex) return;
    int middleIndex = Utils.getMiddleIndex(beginIndex, endIndex);

    sort(input, copy, beginIndex, middleIndex);
    sort(input, copy, middleIndex, endIndex);
    merge(input, copy, beginIndex, middleIndex, endIndex);
}

L11: sorts the left sub-array.
L12: sorts the right sub-array.
L13: merges the left and right sub-arrays.

Finally, the merge() method can be defined as follows:

/**
 * @param input the input array.
 * @param copy the array to hold the copy of the input array.
 * @param beginIndex  the beginning index of the 1st half (inclusive).
 * @param middleIndex the ending index of the 1st half (exclusive).
 * @param endIndex    the ending index of the 2nd half (exclusive).
 */
protected void merge(T[] input, T[] copy, int beginIndex, int middleIndex, int endIndex) {
    int fst = beginIndex, snd = middleIndex, n = endIndex - beginIndex;
    System.arraycopy(input, beginIndex, copy, beginIndex, n);
    assignments += n;

    for (int k = beginIndex; k < endIndex; k++) {
        if (fst >= middleIndex)
            assign(input, k, copy[snd++]);
        else if (snd >= endIndex)
            assign(input, k, copy[fst++]);
        else if (compareTo(copy, fst, snd) < 0)
            assign(input, k, copy[fst++]);
        else
            assign(input, k, copy[snd++]);
    }
}

L10-11: copies the input array to the temporary array and counts the assignments.
L14-15: no key left in the 1st half.
L16-17: no key left in the 2nd half.
L18-19: the 2nd key is greater than the 1st key.
L20-21: the 1st key is greater than or equal to the 2nd key.

How many assignments are made for the $n$ number of keys by the above version of MergeSort?

Quick Sort

Pick a pivot key in the input array.
Exchange keys between the left and right partitions such that all keys in the left and right partitions are smaller or bigger than the pivot key, respectively.
Repeat this procedure in each partition, recursively.

public class QuickSort<T extends Comparable<T>> extends AbstractSort<T> {
    public QuickSort() {
        this(Comparator.naturalOrder());
    }

    public QuickSort(Comparator<T> comparator) {
        super(comparator);
    }
}

Let us then override the sort() method:

@Override
public void sort(T[] array, int beginIndex, int endIndex) {
    if (beginIndex >= endIndex) return;

    int pivotIndex = partition(array, beginIndex, endIndex);
    sort(array, beginIndex, pivotIndex);
    sort(array, pivotIndex + 1, endIndex);
}

L3: stops when the pointers are crossed.
L6: sorts the left partition.
L7: sorts the right partition.

The partition() method can be defined as follows:

protected int partition(T[] array, int beginIndex, int endIndex) {
    int fst = beginIndex, snd = endIndex;

    while (true) {
        while (++fst < endIndex && compareTo(array, beginIndex, fst) >= 0);
        while (--snd > beginIndex && compareTo(array, beginIndex, snd) <= 0);
        if (fst >= snd) break;
        swap(array, fst, snd);
    }

    swap(array, beginIndex, snd);
    return snd;
}

L5: finds the left pointer where endIndex > fst > pivot.
L6: finds the right pointer where beginIndex < snd < pivot.
L7: the left and right pointers are crossed.
L8: swaps between keys in the left and right partitions.
L11: swaps the keys in the beginIndex and pivot.

The programming design in L5 and L6 is not ideal since the while loops do not include any body, which can confuse other programmers.
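One way to make the loops more readable is to move the pointer updates into explicit loop bodies. The following sketch does so for an int array; PartitionDemo is a hypothetical class, and the pivot is still the key at beginIndex as in the method above:

```java
class PartitionDemo {
    static int partition(int[] array, int beginIndex, int endIndex) {
        int pivot = array[beginIndex];
        int fst = beginIndex, snd = endIndex;

        while (true) {
            // move the left pointer to the next key greater than the pivot
            fst++;
            while (fst < endIndex && array[fst] <= pivot) fst++;

            // move the right pointer to the next key smaller than the pivot
            snd--;
            while (snd > beginIndex && array[snd] >= pivot) snd--;

            if (fst >= snd) break;   // the pointers are crossed
            swap(array, fst, snd);
        }

        swap(array, beginIndex, snd);   // place the pivot between the partitions
        return snd;
    }

    static void sort(int[] array, int beginIndex, int endIndex) {
        if (beginIndex >= endIndex) return;
        int pivotIndex = partition(array, beginIndex, endIndex);
        sort(array, beginIndex, pivotIndex);
        sort(array, pivotIndex + 1, endIndex);
    }

    private static void swap(int[] array, int i, int j) {
        int t = array[i];
        array[i] = array[j];
        array[j] = t;
    }
}
```

The behavior is intended to be the same as the original loops; only the pointer increments are separated from the conditions.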

Intro Sort

Although Quick Sort is the fastest on average, its worst-case complexity is $O(n^2)$ .
There exist sorting algorithms that guarantee faster worst-case complexities than Quick Sort, so Intro Sort uses Quick Sort for random cases and switches to a different algorithm for the worst case.

How can we determine if Quick Sort is meeting the worst case?

Let us define the IntroSort class inheriting QuickSort:

public class IntroSort<T extends Comparable<T>> extends QuickSort<T> {
    private final AbstractSort<T> engine;

    public IntroSort(AbstractSort<T> engine) {
        this(engine, Comparator.naturalOrder());
    }

    public IntroSort(AbstractSort<T> engine, Comparator<T> comparator) {
        super(comparator);
        this.engine = engine;
    }

    @Override
    public void resetCounts() {
        super.resetCounts();
        if (engine != null) engine.resetCounts();
    }

}

L2: declares a sorting algorithm to handle the worst cases.
L14: resets the counters for both the main and auxiliary sorting algorithms.

We then override the sort() method by passing the maximum depth to the auxiliary method:

@Override
public void sort(T[] array, int beginIndex, int endIndex) {
    final int maxdepth = getMaxDepth(beginIndex, endIndex);
    sort(array, beginIndex, endIndex, maxdepth);
    comparisons += engine.getComparisonCount();
    assignments += engine.getAssignmentCount();
}

protected int getMaxDepth(int beginIndex, int endIndex) {
    return 2 * (int)Utils.log2(endIndex - beginIndex);
}

L9: returns $2 \log n$ as the maximum depth.

Finally, the auxiliary method sort() can be defined as follows:

private void sort(T[] array, int beginIndex, int endIndex, int maxdepth) {
    if (beginIndex >= endIndex) return;

    if (maxdepth == 0)    // encounter the worst case
        engine.sort(array, beginIndex, endIndex);
    else {
        int pivotIndex = partition(array, beginIndex, endIndex);
        sort(array, beginIndex, pivotIndex, maxdepth - 1);
        sort(array, pivotIndex + 1, endIndex, maxdepth - 1);
    }
}

L4-5: switches to the other sorting algorithm if the depth of the partitioning exceeds the max depth.
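The depth-limit idea can be sketched on an int array as follows. This is not the class above: IntroSortDemo is a hypothetical standalone class, Arrays.sort() is swapped in as the fallback algorithm in place of the engine, and the fallbacks counter exists only to make the switch observable:

```java
import java.util.Arrays;

class IntroSortDemo {
    /** Counts how often the fallback algorithm is used (for illustration only). */
    static int fallbacks = 0;

    static void sort(int[] array) {
        int n = array.length;
        // 2 * floor(log2(n)), computed with integer arithmetic
        int maxDepth = (n < 2) ? 0 : 2 * (31 - Integer.numberOfLeadingZeros(n));
        sort(array, 0, n, maxDepth);
    }

    private static void sort(int[] array, int beginIndex, int endIndex, int maxDepth) {
        if (beginIndex >= endIndex) return;

        if (maxDepth == 0) {   // encounter the worst case
            fallbacks++;
            Arrays.sort(array, beginIndex, endIndex);
        } else {
            int pivotIndex = partition(array, beginIndex, endIndex);
            sort(array, beginIndex, pivotIndex, maxDepth - 1);
            sort(array, pivotIndex + 1, endIndex, maxDepth - 1);
        }
    }

    private static int partition(int[] array, int beginIndex, int endIndex) {
        int pivot = array[beginIndex];
        int fst = beginIndex, snd = endIndex;

        while (true) {
            fst++;
            while (fst < endIndex && array[fst] <= pivot) fst++;
            snd--;
            while (snd > beginIndex && array[snd] >= pivot) snd--;
            if (fst >= snd) break;
            int t = array[fst]; array[fst] = array[snd]; array[snd] = t;
        }

        int t = array[beginIndex]; array[beginIndex] = array[snd]; array[snd] = t;
        return snd;
    }
}
```

An already sorted input makes every partition maximally unbalanced when the first key is the pivot, so the recursion reaches the depth limit and hands the remaining range to the fallback algorithm.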

Does the max-depth need to be set to $2 \log n$ ?

Benchmarks

The following shows runtime speeds, assignment costs, and comparison costs between several sorting algorithms for the random, best, and worst cases.

3.2. Comparison-based Sort

Selection sort, heap sort, insertion sort, shell sort.

Selection-based Sort

Selection-based sorting algorithms take the following steps:

For each $i$ from $n$ down to $1$ , where $|A| = n$ :
- Search the maximum key $A_m$ where $m \in [0, i)$ .
- Swap $A_{i-1}$ and $A_m$ .

The complexities differ by different search algorithms:

| Algorithm | Compare | Swap |
|---|---|---|
| Selection Sort | $O(n^2)$ | $O(n)$ |
| Heap Sort | $O(n \log n)$ | $O(n \log n)$ |

Selection Sort uses linear search to find the minimum (or maximum) key, whereas Heap Sort turns the input array into a heap, so the search complexity becomes $O(\log n)$ instead of $O(n)$ .

Can the search be faster than $O(\log n)$ ?

Selection Sort

public class SelectionSort<T extends Comparable<T>> extends AbstractSort<T> {
    public SelectionSort() {
        this(Comparator.naturalOrder());
    }

    public SelectionSort(Comparator<T> comparator) {
        super(comparator);
    }
}

Let us then override the sort() method:

@Override
public void sort(T[] array, final int beginIndex, final int endIndex) {
    for (int i = endIndex; i > beginIndex; i--) {
        int max = beginIndex;

        for (int j = beginIndex + 1; j < i; j++) {
            if (compareTo(array, j, max) > 0)
                max = j;
        }

        swap(array, max, i - 1);
    }
}

L3-12: $O(n^2)$
- L3: iterates all keys within the range $\rightarrow O(n)$ .
- L4-9: finds the index of the maximum key within the range $\rightarrow O(n)$ .
- L11: swaps the maximum key with the last key in the range $\rightarrow O(1)$ .

How does the sort() method work with Comparator.reverseOrder()?

Heap Sort

public class HeapSort<T extends Comparable<T>> extends AbstractSort<T> {
    public HeapSort() {
        this(Comparator.naturalOrder());
    }

    public HeapSort(Comparator<T> comparator) {
        super(comparator);
    }
}

Before we override the sort() method, let us define the following helper methods:

private void sink(T[] array, int k, int beginIndex, int endIndex) {
    for (int i = getLeftChildIndex(beginIndex, k); i < endIndex; k = i, i = getLeftChildIndex(beginIndex, k)) {
        if (i + 1 < endIndex && compareTo(array, i, i + 1) < 0) i++;
        if (compareTo(array, k, i) >= 0) break;
        swap(array, k, i);
    }
}

private int getParentIndex(int beginIndex, int k) {
    return beginIndex + (k - beginIndex - 1) / 2;
}

private int getLeftChildIndex(int beginIndex, int k) {
    return beginIndex + 2 * (k - beginIndex) + 1;
}

L1-7: finds the right position of the k'th key by using the sink operation.
L9-11: finds the parent index of the k'th key given the beginning index.
L13-15: finds the left child index of the k'th key given the beginning index.

Finally, we override the sort() method:

@Override
public void sort(T[] array, int beginIndex, int endIndex) {
    // heapify all elements
    for (int k = getParentIndex(beginIndex, endIndex - 1); k >= beginIndex; k--)
        sink(array, k, beginIndex, endIndex);

    // swap the endIndex element with the root element and sink it
    while (endIndex > beginIndex + 1) {
        swap(array, beginIndex, --endIndex);
        sink(array, beginIndex, beginIndex, endIndex);
    }
}

L4-5: turns the input array into a heap $\rightarrow O(n \log n)$ :
- L4: iterates from the parent of the key in the ending index $\rightarrow O(n)$ .
- L5: sinks the key $\rightarrow O(\log n)$ .
L8-11: selection sort $\rightarrow O(n \log n)$ :
- L8: iterates all keys within the range $\rightarrow O(n)$ .
- L9: swaps the maximum key at the beginning with the last key in the range $\rightarrow O(1)$ .
- L10: sinks to heapify $\rightarrow O(\log n)$ .
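Since all index computations are offset by beginIndex, the algorithm can sort a sub-range without touching the keys outside of it. The following sketch shows this on an int array; HeapSortDemo is a hypothetical class that mirrors the methods above without the comparator and the counters:

```java
class HeapSortDemo {
    static void sort(int[] array, int beginIndex, int endIndex) {
        // heapify all keys in the range
        for (int k = getParentIndex(beginIndex, endIndex - 1); k >= beginIndex; k--)
            sink(array, k, beginIndex, endIndex);

        // move the root (maximum) to the end of the range and sink the new root
        while (endIndex > beginIndex + 1) {
            swap(array, beginIndex, --endIndex);
            sink(array, beginIndex, beginIndex, endIndex);
        }
    }

    private static void sink(int[] array, int k, int beginIndex, int endIndex) {
        for (int i = getLeftChildIndex(beginIndex, k); i < endIndex; k = i, i = getLeftChildIndex(beginIndex, k)) {
            if (i + 1 < endIndex && array[i] < array[i + 1]) i++;   // pick the larger child
            if (array[k] >= array[i]) break;
            swap(array, k, i);
        }
    }

    private static int getParentIndex(int beginIndex, int k) {
        return beginIndex + (k - beginIndex - 1) / 2;
    }

    private static int getLeftChildIndex(int beginIndex, int k) {
        return beginIndex + 2 * (k - beginIndex) + 1;
    }

    private static void swap(int[] array, int i, int j) {
        int t = array[i];
        array[i] = array[j];
        array[j] = t;
    }
}
```

For example, sorting the range [2, 6) of {9, 8, 5, 1, 4, 3, 0, 7} rearranges only the four keys 5, 1, 4, 3 and leaves the rest in place.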

What is the worst-case scenario for Selection Sort and Heap Sort?

Insertion-based Sort

Insertion-based sorting algorithms take the following steps:

For each key $A_i$ where $|A| = n$ and $i \in [1, n)$ :
- Keep swapping $A_i$ with the preceding key $A_j$ in its sequence until $A_j \leq A_i$ .

The complexities differ by different sequences:

| Algorithm | Sequence | Compare | Swap |
|---|---|---|---|
| Insertion Sort | Adjacent | $O(n^2)$ | $O(n^2)$ |
| Shell Sort | Knuth | $O(n^{1.5})$ | $O(n^{1.5})$ |

Insertion Sort

public class InsertionSort<T extends Comparable<T>> extends AbstractSort<T> {
    public InsertionSort() {
        this(Comparator.naturalOrder());
    }

    public InsertionSort(Comparator<T> comparator) {
        super(comparator);
    }
}

Let us then define an auxiliary method, sort():

protected void sort(T[] array, int beginIndex, int endIndex, final int h) {
    int begin_h = beginIndex + h;

    for (int i = begin_h; i < endIndex; i++)
        for (int j = i; begin_h <= j && compareTo(array, j, j - h) < 0; j -= h)
            swap(array, j, j - h);
}

L4: iterates keys in the input array $\rightarrow O(n)$ .
L5: compares keys in the sublist of the input array $\rightarrow O(\frac{n}{h})$ .
L6: swaps the two keys.

Given the auxiliary method, the sort() method can be defined as follows where h = 1:

@Override
public void sort(T[] array, int beginIndex, int endIndex) {
    sort(array, beginIndex, endIndex, 1);
}

How many swaps does Insertion Sort make for the following array? [7, 1, 2, 3, 4, 5, 6, 14, 8, 9, 10, 11, 12, 13, 0]

Shell Sort

Gap Sequence

Knuth: $(3^k - 1) / 2 \Rightarrow \{1, 4, 13, 40, 121, \ldots\}$
Hibbard: $2^k - 1 \Rightarrow \{1, 3, 7, 15, 31, 63, \ldots\}$
Pratt: $2^p \cdot 3^q \Rightarrow \{1, 2, 3, 4, 6, 8, 9, 12, \ldots\}$
Shell: $n / 2^k \Rightarrow \{500, 250, 125, \ldots\}$ , where $n = 1000$

For the above example, by using the Hibbard sequence, it first groups keys whose gap is 7: [7, 14, 0], [1, 8], [2, 9], [3, 10], [4, 11], [5, 12], [6, 13]

It then sorts each group using Insertion Sort, which results in the following array: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

The above procedure is repeated for gaps 3 and 1; however, the array is already sorted, so no more swapping is necessary.
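The passes above can be reproduced with a short sketch. GapSortDemo.hSort() is a hypothetical helper that runs one insertion-sort pass with the gap h over an int array and returns the number of swaps it makes:

```java
import java.util.Arrays;

class GapSortDemo {
    /** Sorts every group of keys whose gap is h; returns the number of swaps. */
    static int hSort(int[] array, int h) {
        int swaps = 0;

        for (int i = h; i < array.length; i++) {
            for (int j = i; j >= h && array[j] < array[j - h]; j -= h) {
                int t = array[j];
                array[j] = array[j - h];
                array[j - h] = t;
                swaps++;
            }
        }
        return swaps;
    }

    public static void main(String[] args) {
        int[] array = {7, 1, 2, 3, 4, 5, 6, 14, 8, 9, 10, 11, 12, 13, 0};

        for (int h : new int[]{7, 3, 1}) {   // Hibbard gaps for this input size
            int swaps = hSort(array, h);
            System.out.println("h = " + h + ", swaps = " + swaps + ": " + Arrays.toString(array));
        }
    }
}
```

Running main() shows how much work each gap performs, which can be compared with the number of swaps that Insertion Sort alone needs for the same input.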

Implementation

public abstract class ShellSort<T extends Comparable<T>> extends InsertionSort<T> {
    protected List<Integer> sequence;

    public ShellSort(Comparator<T> comparator) {
        super(comparator);
        sequence = new ArrayList<>();
        populateSequence(10000);
    }
}

L2: stores a particular gap sequence.
L7: pre-populates the gap sequence that can handle the input size up to 10000.

Then, let us define two abstract methods, populateSequence() and getSequenceStartIndex():

/**
 * Populates the gap sequence with respect to the size of the list.
 * @param n the size of the list to be sorted.
 */
protected abstract void populateSequence(int n);

/**
 * @param n the size of the list to be sorted.
 * @return the starting index of the sequence with respect to the size of the list.
 */
protected abstract int getSequenceStartIndex(int n);

L5: populates a particular sequence for the input size n.
L11: returns the index of the first gap to be used given the input size n.

Let us then override the sort() method:

@Override
public void sort(T[] array, int beginIndex, int endIndex) {
    int n = endIndex - beginIndex;
    populateSequence(n);

    for (int i = getSequenceStartIndex(n); i >= 0; i--)
        sort(array, beginIndex, endIndex, sequence.get(i));
}

L4: should not re-populate the sequence unless it has to.
L6: iterates the sequence $\rightarrow O(s)$ where $s$ is the number of gaps in the sequence.
L7: sorts the gap group by using the auxiliary method.

Knuth Sequence

public class ShellSortKnuth<T extends Comparable<T>> extends ShellSort<T> {
    public ShellSortKnuth() {
        this(Comparator.naturalOrder());
    }

    public ShellSortKnuth(Comparator<T> comparator) {
        super(comparator);
    }
}

The two abstract methods can be overridden as follows:

@Override
protected void populateSequence(int n) {
    n /= 3;

    for (int t = sequence.size() + 1; ; t++) {
        int h = (int) ((Math.pow(3, t) - 1) / 2);
        if (h <= n) sequence.add(h);
        else break;
    }
}

@Override
protected int getSequenceStartIndex(int n) {
    int index = Collections.binarySearch(sequence, n / 3);
    if (index < 0) index = -(index + 1);
    if (index == sequence.size()) index--;
    return index;
}

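L2: populates the Knuth sequence with every gap $\leq \frac{n}{3}$ that is not already stored.
L13: returns the index of the gap equal to $\frac{n}{3}$ if it exists; otherwise, the index of the first gap $> \frac{n}{3}$, or the last index if no such gap is stored.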
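To see which gaps these two methods work with, the sketch below recomputes the Knuth sequence using the integer recurrence $h_{t+1} = 3 h_t + 1$ (equivalent to $\frac{3^t - 1}{2}$) instead of Math.pow(), and mirrors the binary-search lookup on a pre-populated list. The class KnuthGapsSketch and its method names are hypothetical and only serve this illustration:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical, self-contained sketch of the Knuth gap sequence (3^t - 1) / 2. */
class KnuthGapsSketch {
    /** @return all Knuth gaps h with h <= n / 3, in increasing order. */
    static List<Integer> gapsUpTo(int n) {
        List<Integer> gaps = new ArrayList<>();

        // (3^t - 1) / 2 satisfies h_{t+1} = 3 * h_t + 1 with h_1 = 1
        for (long h = 1; h <= n / 3; h = 3 * h + 1)
            gaps.add((int) h);

        return gaps;
    }

    /** Mirrors getSequenceStartIndex() above on a pre-populated sequence. */
    static int startIndex(List<Integer> sequence, int n) {
        int index = Collections.binarySearch(sequence, n / 3);
        if (index < 0) index = -(index + 1);      // insertion point
        if (index == sequence.size()) index--;    // clamp to the last gap
        return index;
    }

    public static void main(String[] args) {
        List<Integer> sequence = gapsUpTo(10000);
        System.out.println(sequence);  // [1, 4, 13, 40, 121, 364, 1093, 3280]
        System.out.println(sequence.get(startIndex(sequence, 100)));  // 40
    }
}
```

For n = 100, the lookup value is 33, which is not a Knuth gap, so the insertion point selects 40. The sort is still correct because the sequence always ends with the gap 1.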

Why should we use $\frac{n}{3}$ as the largest gap in the sequence?

Demonstration

Unit Tests & Benchmarks

4.1. Binary Search Trees

This section discusses abstraction of binary search trees.

This section assumes that you have already learned the core concepts of binary search trees in the prerequisite course. Thus, it focuses on the abstraction that can be applied to the other types of binary search trees introduced in the following sections.

Abstract Binary Node

public abstract class AbstractBinaryNode<T extends Comparable<T>, N extends AbstractBinaryNode<T, N>> {
    protected T key;
    protected N parent;
    protected N left_child;
    protected N right_child;

    public AbstractBinaryNode(T key) {
        setKey(key);
    }
}

L1: defines two generic types, T for the type of the key and N for the type of the binary node.
L8: calls the setKey() method to assign the value of key in L2.

Let us define boolean methods for the member fields:

public boolean hasParent() { return parent != null; }

public boolean hasLeftChild() { return left_child != null; }

public boolean hasRightChild() { return right_child != null; }

public boolean hasBothChildren() {
    return hasLeftChild() && hasRightChild();
}

/** @return true if the specific node is the left child of this node. */
public boolean isLeftChild(N node) {
    return left_child == node;
}

/** @return true if the specific node is the right child of this node. */
public boolean isRightChild(N node) {
    return right_child == node;
}

What happens if the input parameter node is null for the isLeftChild() and isRightChild() methods?

Let us then define getters to access the member fields:

public T getKey() { return key; }

public N getParent() { return parent; }

public N getLeftChild() { return left_child; }

public N getRightChild() { return right_child; }

We can also define helper methods built on the getters:

public N getGrandParent() {
    return hasParent() ? parent.getParent() : null;
}

@SuppressWarnings("unchecked")
public N getSibling() {
    if (hasParent()) {
        N parent = getParent();
        return parent.isLeftChild((N)this) ? parent.getRightChild() : parent.getLeftChild();
    }

    return null;
}

public N getUncle() {
    return hasParent() ? parent.getSibling() : null;
}

L9: this needs to be cast to N since the input parameter of isLeftChild() is N.

Is it safe to downcast this to N?

Finally, let us define setters and their helper methods:

public void setKey(T key) { this.key = key; }

public void setParent(N node) { parent = node; }

public void setLeftChild(N node) {
    replaceParent(node);
    left_child = node;
}

public void setRightChild(N node) {
    replaceParent(node);
    right_child = node;
}

/**
 * Replaces the parent of the specific node to be this node.
 * @param node the node whose parent is to be replaced.
 */
@SuppressWarnings("unchecked")
protected void replaceParent(N node) {
    if (node != null) {
        if (node.hasParent()) node.getParent().replaceChild(node, null);
        node.setParent((N)this);
    }
}

/**
 * Replaces the old child with the new child if exists.
 * @param oldChild the old child of this node to be replaced.
 * @param newChild the new child to be added to this node.
 */
public void replaceChild(N oldChild, N newChild) {
    if (isLeftChild(oldChild)) setLeftChild(newChild);
    else if (isRightChild(oldChild)) setRightChild(newChild);
}

L5,10: sets node to be a child of this in two steps:
- L6,11: replaces the parent of node with this.
- L7,12: sets node to be the child.
L20: replaces the parent of node with this in two steps:
- L22: node gets abandoned by its current parent.
- L23: this becomes the new parent of node.
L32: replaces oldChild of this node with newChild.
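The effect of these setters is easiest to see when a node that already has a parent is attached to a different parent. The sketch below is a simplified, hypothetical re-implementation (the class ReparentSketch, int keys, and no generics are assumptions made for brevity) that follows the same two-step logic:

```java
/** Hypothetical, simplified node (int keys, no generics) that mirrors the setters above. */
class ReparentSketch {
    static class Node {
        final int key;
        Node parent, left, right;

        Node(int key) { this.key = key; }

        void setLeftChild(Node node)  { replaceParent(node); left = node; }
        void setRightChild(Node node) { replaceParent(node); right = node; }

        /** Detaches node from its current parent, if any, and adopts it. */
        void replaceParent(Node node) {
            if (node != null) {
                if (node.parent != null) node.parent.replaceChild(node, null);
                node.parent = this;
            }
        }

        void replaceChild(Node oldChild, Node newChild) {
            if (left == oldChild) setLeftChild(newChild);
            else if (right == oldChild) setRightChild(newChild);
        }
    }

    public static void main(String[] args) {
        Node a = new Node(1), b = new Node(2), c = new Node(3);

        a.setLeftChild(c);   // c is the left child of a
        b.setRightChild(c);  // c moves under b; a must no longer point to c

        System.out.println(a.left == null);  // true
        System.out.println(b.right == c);    // true
        System.out.println(c.parent == b);   // true
    }
}
```

Without the call to replaceChild(node, null) inside replaceParent(), a would keep a stale reference to c, and the same node would appear in two places.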

Binary Node

public class BinaryNode<T extends Comparable<T>> extends AbstractBinaryNode<T, BinaryNode<T>> {
    public BinaryNode(T key) {
        super(key);
    }
}

L1: defines only one generic type T for the comparable key and passes itself as the generic type N to the AbstractBinaryNode class.
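The benefit of passing the subclass itself as N is that inherited methods return the concrete node type, so callers can reach subclass-specific members without casting. A minimal sketch with hypothetical classes (AbstractNode and ColorNode are illustrative names, not the course classes):

```java
/** Hypothetical sketch of the self-referential generic bound used by AbstractBinaryNode. */
class SelfTypeSketch {
    static abstract class AbstractNode<T extends Comparable<T>, N extends AbstractNode<T, N>> {
        protected T key;
        protected N parent;

        AbstractNode(T key) { this.key = key; }
        N getParent() { return parent; }
        void setParent(N node) { parent = node; }
    }

    /** Passes itself as N, so getParent() returns a ColorNode rather than an AbstractNode. */
    static class ColorNode<T extends Comparable<T>> extends AbstractNode<T, ColorNode<T>> {
        boolean red = true;

        ColorNode(T key) { super(key); }
    }

    public static void main(String[] args) {
        ColorNode<Integer> parent = new ColorNode<>(5);
        ColorNode<Integer> child = new ColorNode<>(3);
        child.setParent(parent);

        // no cast is needed to reach the subclass-specific field
        System.out.println(child.getParent().red);  // true
    }
}
```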

Is there any abstract method from AbstractBinaryNode to be defined in BinaryNode?

Abstract Binary Search Tree

public abstract class AbstractBinarySearchTree<T extends Comparable<T>, N extends AbstractBinaryNode<T, N>> {
    protected N root;

    public AbstractBinarySearchTree() {
        setRoot(null);
    }

    /** @return a new node with the specific key. */
    abstract public N createNode(T key);
    
    public boolean isRoot(N node) { return root == node; }

    public N getRoot() { return root; }

    public void setRoot(N node) {
        if (node != null) node.setParent(null);
        root = node;
    }
}

L1: defines two generic types, T for the type of the key and N for the type of the binary node.
L4: initializes the member field root to null.
L9: creates a binary node typed N, required for the add() method.

Why does Java not allow a generic type to be instantiated (e.g., node = new N())?

Let us define searching methods:

/** @return the node with the specific key if exists; otherwise, {@code null}. */
public N get(T key) {
    return findNode(root, key);
}

/** @return the node with the specific key if exists; otherwise, {@code null}. */
protected N findNode(N node, T key) {
    if (node == null) return null;
    int diff = key.compareTo(node.getKey());

    if (diff < 0)
        return findNode(node.getLeftChild(), key);
    else if (diff > 0)
        return findNode(node.getRightChild(), key);
    else
        return node;
}

/** @return the node with the minimum key under the subtree of {@code node}. */
protected N findMinNode(N node) {
    return node.hasLeftChild() ? findMinNode(node.getLeftChild()) : node;
}

/** @return the node with the maximum key under the subtree of {@code node}. */
protected N findMaxNode(N node) {
    return node.hasRightChild() ? findMaxNode(node.getRightChild()) : node;
}
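The recursive searches above follow a single path from the given node downward, so they can also be written as loops. The following is a minimal sketch of iterative counterparts, assuming int keys and a bare node without parent links (the class IterativeSearchSketch is a hypothetical name):

```java
/** Hypothetical sketch: iterative counterparts of findNode() and findMinNode(), using int keys. */
class IterativeSearchSketch {
    static class Node {
        final int key;
        Node left, right;

        Node(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    /** @return the node with the key under the subtree of node if exists; otherwise, null. */
    static Node findNode(Node node, int key) {
        while (node != null && node.key != key)
            node = key < node.key ? node.left : node.right;

        return node;
    }

    /** @return the leftmost node under the subtree of node, which must not be null. */
    static Node findMinNode(Node node) {
        while (node.left != null) node = node.left;
        return node;
    }

    public static void main(String[] args) {
        //       5
        //      / \
        //     3   8
        //    / \
        //   1   4
        Node root = new Node(5,
                new Node(3, new Node(1, null, null), new Node(4, null, null)),
                new Node(8, null, null));

        System.out.println(findNode(root, 4).key);      // 4
        System.out.println(findNode(root, 7) == null);  // true
        System.out.println(findMinNode(root).key);      // 1
    }
}
```

Both versions visit one node per level; the loop simply avoids one stack frame per level.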

What are the worst-case complexities of findNode(), findMinNode(), and findMaxNode()?

Let us define the add() method:

public N add(T key) {
    N node = null;

    if (root == null)
        setRoot(node = createNode(key));
    else
        node = addAux(root, key);

    return node;
}

private N addAux(N node, T key) {
    int diff = key.compareTo(node.getKey());
    N child, newNode = null;

    if (diff < 0) {
        if ((child = node.getLeftChild()) == null)
            node.setLeftChild(newNode = createNode(key));
        else
            newNode = addAux(child, key);
    }
    else if (diff > 0) {
        if ((child = node.getRightChild()) == null)
            node.setRightChild(newNode = createNode(key));
        else
            newNode = addAux(child, key);
    }

    return newNode;
}

L5: creates a node with the key to be the root if this tree does not include any node.
L7: finds the appropriate location for the key and creates the node.
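The shape of the tree built by add() depends on the order in which the keys arrive. The sketch below is a simplified, hypothetical tree (class AddSketch, int keys, no parent links) that follows the same recursive descent and compares the height obtained from a mixed insertion order with the height obtained from sorted keys:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified tree (int keys, no parent links) mirroring add() and addAux(). */
class AddSketch {
    static class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    Node root;

    /** @return the newly created node, following the recursive descent of addAux(). */
    Node add(int key) {
        if (root == null) return root = new Node(key);
        return addAux(root, key);
    }

    private Node addAux(Node node, int key) {
        if (key < node.key)
            return node.left == null ? (node.left = new Node(key)) : addAux(node.left, key);
        if (key > node.key)
            return node.right == null ? (node.right = new Node(key)) : addAux(node.right, key);
        return null;  // diff == 0 falls through, as in addAux() above
    }

    /** Collects the keys in ascending order by an in-order traversal. */
    List<Integer> inOrder() {
        List<Integer> keys = new ArrayList<>();
        inOrder(root, keys);
        return keys;
    }

    private void inOrder(Node node, List<Integer> keys) {
        if (node == null) return;
        inOrder(node.left, keys);
        keys.add(node.key);
        inOrder(node.right, keys);
    }

    /** @return the number of nodes on the longest path from node down to a leaf. */
    static int height(Node node) {
        return node == null ? 0 : 1 + Math.max(height(node.left), height(node.right));
    }

    public static void main(String[] args) {
        AddSketch balanced = new AddSketch();
        for (int key : new int[]{4, 2, 6, 1, 3, 5, 7}) balanced.add(key);

        AddSketch skewed = new AddSketch();
        for (int key = 1; key <= 7; key++) skewed.add(key);

        System.out.println(balanced.inOrder());     // [1, 2, 3, 4, 5, 6, 7]
        System.out.println(height(balanced.root));  // 3
        System.out.println(height(skewed.root));    // 7
    }
}
```

Both trees hold the same keys, but sorted input degenerates into a chain whose height equals the number of keys, which is what motivates the balanced trees in the later sections.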

What does the add() method above do when the input key already exists in the tree?

Let us define the remove() method:

/** @return the removed node with the specific key if exists; otherwise, {@code null}. */
public N remove(T key) {
    N node = findNode(root, key);

    if (node != null) {
        if (node.hasBothChildren()) removeHibbard(node);
        else removeSelf(node);
    }

    return node;
}

L6: removes a node with two children using the Hibbard algorithm.
L7: removes a node with no child or one child.

The removeSelf() method makes the node's only child a child of the node's parent, which removes the node from the tree:

/** @return the lowest node whose subtree has been updated. */
protected N removeSelf(N node) {
    N parent = node.getParent();
    N child = null;

    if (node.hasLeftChild()) child = node.getLeftChild();
    else if (node.hasRightChild()) child = node.getRightChild();
    replaceChild(node, child);

    return parent;
}

private void replaceChild(N oldNode, N newNode) {
    if (isRoot(oldNode))
        setRoot(newNode);
    else
        oldNode.getParent().replaceChild(oldNode, newNode);
}

L6-7: finds the child of node.
L8: replaces node with its child.
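The splice performed by removeSelf() can be traced on a three-node chain. This is a minimal sketch with a simplified, hypothetical node (class RemoveSelfSketch, int keys, setters that only update the parent link), not the course implementation:

```java
/** Hypothetical, simplified sketch (int keys) of removing a node with at most one child. */
class RemoveSelfSketch {
    static class Node {
        final int key;
        Node parent, left, right;

        Node(int key) { this.key = key; }

        void setLeft(Node node)  { left = node;  if (node != null) node.parent = this; }
        void setRight(Node node) { right = node; if (node != null) node.parent = this; }
    }

    Node root;

    /** Splices node out by handing its only child (or null) to its parent. */
    Node removeSelf(Node node) {
        Node parent = node.parent;
        Node child = node.left != null ? node.left : node.right;

        if (parent == null) {                        // node is the root
            root = child;
            if (child != null) child.parent = null;
        }
        else if (parent.left == node) parent.setLeft(child);
        else parent.setRight(child);

        return parent;
    }

    public static void main(String[] args) {
        RemoveSelfSketch tree = new RemoveSelfSketch();
        Node n5 = new Node(5), n3 = new Node(3), n1 = new Node(1);

        tree.root = n5;      // 5 -> 3 -> 1 forms a left-leaning chain
        n5.setLeft(n3);
        n3.setLeft(n1);

        Node updated = tree.removeSelf(n3);
        System.out.println(updated.key);         // 5
        System.out.println(tree.root.left.key);  // 1
        System.out.println(n1.parent == n5);     // true
    }
}
```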

The removeHibbard() method finds a node that can become the parent of both the left and the right children of node, and makes it a child of node's parent:

Which nodes are guaranteed to be valid parents of those left and right children?

/** @return the lowest node whose subtree has been updated. */
protected N removeHibbard(N node) {
    N successor = node.getRightChild();
    N min = findMinNode(successor);
    N parent = min.getParent();

    min.setLeftChild(node.getLeftChild());

    if (min != successor) {
        parent.setLeftChild(min.getRightChild());
        min.setRightChild(successor);
    }

    replaceChild(node, min);
    return parent;
}

The following demonstrates how the above removeHibbard() method works:
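The pointer updates can be traced on a small tree. The sketch below is a simplified, hypothetical re-implementation (class HibbardSketch, int keys, setters that only update the parent link, and explicit handling of the removed node's parent) rather than the course code. It removes the root 5 from a six-node tree in which the minimum of the right subtree is 7:

```java
/** Hypothetical, simplified sketch (int keys) of Hibbard deletion for a node with two children. */
class HibbardSketch {
    static class Node {
        final int key;
        Node parent, left, right;

        Node(int key) { this.key = key; }

        void setLeft(Node node)  { left = node;  if (node != null) node.parent = this; }
        void setRight(Node node) { right = node; if (node != null) node.parent = this; }
    }

    Node root;

    static Node findMin(Node node) {
        while (node.left != null) node = node.left;
        return node;
    }

    /** Replaces node, which must have two children, with the minimum node of its right subtree. */
    void removeHibbard(Node node) {
        Node successor = node.right;
        Node min = findMin(successor);
        Node minParent = min.parent;
        Node nodeParent = node.parent;

        min.setLeft(node.left);               // min adopts the left subtree of node

        if (min != successor) {
            minParent.setLeft(min.right);     // the right subtree of min fills the slot of min
            min.setRight(successor);          // min adopts the right subtree of node
        }

        // min takes the place of node under the parent of node (or becomes the root)
        if (nodeParent == null) { root = min; min.parent = null; }
        else if (nodeParent.left == node) nodeParent.setLeft(min);
        else nodeParent.setRight(min);
    }

    static String inOrder(Node node) {
        return node == null ? "" : inOrder(node.left) + node.key + " " + inOrder(node.right);
    }

    public static void main(String[] args) {
        HibbardSketch tree = new HibbardSketch();
        Node n5 = new Node(5), n3 = new Node(3), n9 = new Node(9);
        Node n7 = new Node(7), n8 = new Node(8), n10 = new Node(10);

        //     5                7
        //    / \              / \
        //   3   9     ->     3   9
        //      / \              / \
        //     7   10           8   10
        //      \
        //       8
        tree.root = n5;
        n5.setLeft(n3); n5.setRight(n9);
        n9.setLeft(n7); n9.setRight(n10);
        n7.setRight(n8);

        tree.removeHibbard(n5);
        System.out.println(tree.root.key);              // 7
        System.out.println(inOrder(tree.root).trim());  // 3 7 8 9 10
    }
}
```

The in-order sequence stays sorted after the removal, which confirms that the binary search tree property is preserved.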

What is the worst-case complexity of the removeHibbard() method?

Binary Search Tree

Let us define the BinarySearchTree inheriting AbstractBinarySearchTree:

public class BinarySearchTree<T extends Comparable<T>> extends AbstractBinarySearchTree<T, BinaryNode<T>> {
    /**
     * @param key the key of this node.
     * @return a binary node with the specific key.
     */
    @Override
    public BinaryNode<T> createNode(T key) {
        return new BinaryNode<T>(key);
    }
}
