Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
This chapter helps you set up the development environment for this course.
Many Java developers commonly adapt to the environment described in this section. Thus, it is essential to familiarize yourself with this setup.
Spring 2023
Time: MW 11:30AM - 12:45PM
Location: Atwood 240
Jinho Choi Associate Professor of Computer Science Office Hours → MW 4PM - 5:30PM, MSC W302F
Peilin Wu BS in Computer Science; BA in Physics and Astronomy Office Hours → TuTh 10:30AM - 12PM, MSC E308
Jeongrok Yu BS in Computer Science and Economics Office Hours -> MW 2:30PM - 4PM, MSC E308
Zinc Zhao BS in Computer Science; BA in Music Office Hours → TuTh 1PM - 2:30PM, MSC E308
1 + 10 topical quizzes: 70%
3 homework assignments: 30%
Your work is governed by the Emory Honor Code. Honor code violations (e.g., copies from any source, including colleagues and internet sites) will be referred to the Emory Honor Council.
Excuses for exam absence/rescheduling and other serious personal events (health, family, personal related, etc.) that affect course performance must be accompanied by a letter from the Office of Undergraduate Education.
For every topic, one quiz will be assigned to check if you keep up with the materials.
Homework assignments assess conceptual understanding, programming ability, and analytical writing skills relevant to this course.
All quizzes and assignments must be submitted individually. Discussions are allowed; however, your work must be original.
Late submissions within a week will be accepted with a grading penalty of 15% and will not be accepted once the solutions are discussed in class.
LongInteger: implementation.
We are going to create a class called LongInteger
inheriting SignedNumeral
that can store an indefinite size of an integer value beyond the primitive types such as int
and long
.
What is so special about primitive data types in Java?
Java SE provides a similar class called BigInteger
although the implementations of LongInteger
andBigInteger
are completely independent.
Let us declare the member field digits
that is an array of bytes holding the values of this integer:
L1
: LongInteger
is passed to specify the generic type T
in SignedNumeral
.
The i
'th dimension of digits
is the i
'th least significant digit in the integer such that the integer 12345
would be stored as digits = {5, 4, 3, 2, 1}
, which makes it convenient to implement the arithmetic methods, add()
and multiply()
.
Is the array of bytes the most efficient way of storing a long integer?
Let us define the following three constructors:
L2
: the default constructor that initializes this integer with 0
by calling the constructor in L20
.
L10
: a copy constructor that initializes this integer with n
.
super()
: calls the corresponding constructor in the super class, SignedNumeral
.
Arrays.copyOf()
: creates a new array by copying n.digits
.
L20
: a constructor that initializes this integer with n
by passing it to the set()
method.
Do the constructors in
L10
andL20
call any constructor in the super class?
Arrays.copyOf()
is a static method referenced by the class type Arrays
, not an object. Java provides many classes with static methods that are commonly used (e.g., Arrays
, Collections
).
Can you call non-static methods or fields in the body of a static method?
The static
keyword must not be abused to quickly fix compile errors unless it is intended.
set()
Let us define the set()
method that takes a string and sets the sign and the value of this integer:
L1-7
: javadoc comments.
L4
: this method throws NullPointerException
if n
is null.
L5
: this method throws InvalidParameterException
if the format of n
is invalid.
L10-11
: throws the NullPointerException
.
L14-18
: checks the first character of n
and sets this.sign
using the switch expression.
String
member methods: charAt()
, substring()
.
yield
: returns the value of this switch statement for the condition (introduced in Java 14).
L21-30
: sets the value of n
to this.digits
.
L23
: for-loop can handle multiple variables such as i
and j
.
L24
: gets the ASCII value of n.charAt(i)
.
L25-28
: throws the InvalidParameterException
if v
is not a digit.
L29
: stores the value in the reverse order.
L27
: String.format()
is a static method in String
.
When should we use
throw
statements overtry..catch
blocks for error handling and vice versa?
This type of method is called a setter. Java encourages making member fields private and creating getters and setters to access the fields for encapsulation, which is not necessarily encouraged by other languages.
add()
Let us override the add()
method that calls two helper methods:
L3-4
: adds n
to this integer that has the same sign by calling addSameSign()
.
L5-6
: adds n
to this integer that has a different sign by calling addDifferentSign()
.
The following shows an implementation of addSameSign()
based on the simple arithmetic:
L7-9
: creates the byte array result
by copying values in this integer.
L8
: the dimension of result
can be 1 more than m
after the addition.
Static methods: Math.max()
, System.arraycopy()
.
L12-19
: adds n
to results
(if exists) from the least significant digit.
L15-18
: pass a carry to the next digit.
L22
: trims the most significant digit if it is 0
.
What are tradeoffs to make the size of
result
to bem
instead ofm+1
and vice versa?
The following shows addDifferentSign()
that throws UnsupportedOperationException
:
The implementation of addDifferentSign()
is quite similar to addSameSign()
although it involves a few more logics. We will leave this as an exercise.
In practice, addSameSign()
and addDifferentSign()
should be private. We made them protected for exercise purposes.
multiply()
Let us override the multiply()
method:
L4
: sets the sign after the multiplication.
L7-15
: multiplies n
to this integer:
L7
: the max-dimension of results
is digits.length + n.digits.length
.
L12-13
: pass a carry to the next digit.
L18-20
: trims the most significant digit iteratively if it is 0
.
L20
: ++m
increments m
before the comparison.
What is the worst-case complexity of the
multiply()
method?
main()
Let us create a runnable class called LongIntegerRun
that contains the main method:
Can we define the main method in
LongInteger
instead without creatingLongIntegerRun
?
L2
: the parameter args
is passed from the command line.
Why does the main method need to be static?
This prints something like the following:
[
: one-dimensional array.
L
: the element of this array is an object.
java.lang.String
: the type of object.
d716361
: the hash code of this array in hexadecimal.
What is the hash code of an object?
Every object implicitly inherits Object
that defines a few member methods including toString()
, which gets called automatically by the println()
method to retrieve the string representation of this object. We can use the helper method Arrays.toString()
that gives a more readable representation:
How is the
Arrays.toString()
method implemented?
Since no argument is passed to the main method at the moment, this prints an empty array:
If you set the arguments to 123 -456
using the [Run - Edit Configurations - Program arguments]
setting, it prints the following array:
Given those two arguments, we can create two integers:
This prints something like the following, which are returned by a.toString()
:
How is the
toString()
method implemented in theObject
class?
toString()
To print a more readable representation, we need to override the toString()
method in LongInteger
:
L5
: StringBuilder
provides an efficient way of concatenating different data types into one string.
What are the advantages of using
StringBuilder
instead of concatenating values with the+
operator as follows:
Given the overridden method, the above main method now prints the following:
What are the advantages of overriding
toString()
instead of creating a new method with the same code, and calling the new method to get the string representation ofLongInteger
?
compareTo()
Java does not allow operator overloading, so it is not possible to use logical operators to compare the two integers above, a
and b
:
In fact, any object that is comparable must inherit the interface Comparable
as follows:
L2
: LongInteger
is passed to Comparable
as a generic type.
Is
extends
always used to inherit a class whereasimplements
is used to inherit an interface?
The Comparable
interface contains one abstract method called compareTo()
that returns a negative value if this object is smaller than n
, a positive value if this object is greater than n
, and zero if this object equals to n
. The compareTo()
method must be overridden by the LongInteger
class:
The compareAbs()
method compares the absolute values of this
and n
:
L7
: if digits
has more dimensions, its absolute value is greater.
L10-13
: compares the significant digits iteratively.
Is it safe to use the same variable
i
to iterate bothdigits
andn.digits
?
Once LongInteger
properly inherits Comparable
by overriding compareTo()
, objects instantiated by this class can be compared using many built-in methods.
L2
: ArrayList
is a specific implementation of the interface List
.
All collections in Java inheriting AbstractCollection
uses generics.
<>
: the diamond operator that infers the generic type from its declaration.
L11
: sorts the list in ascending order using sort()
and Comparator.naturalOrder()
.
L14
: sorts the list in descending order using sort()
and Comparator.reverseOrder()
.
What is the advantage of declaring
list
asList
instead ofArrayList
? What kind of sorting algorithm doesCollections.sort()
use?
The above code prints the following sorted lists:
What would be the case that needs to distinguish
-0
from0
?
Jinho D. Choi
This is an advanced programming course in Computer Science that teaches how to design efficient structures and algorithms to process big data and methods to benchmark their performance for large-scale computing. Topics cover data structures such as priority queues, binary trees, tries, and graphs and their applications in constructing algorithms such as sorting, searching, balancing, traversing, and spanning. Advanced topics such as network flow and dynamic programming are also discussed. Throughout this course, students are expected to
Have a deep conceptual understanding of various data structures and algorithms.
Implement their conceptual understanding in a programming language.
Explore the most effective structures and algorithms for given tasks.
Properly assess the quality of their implementations.
There are topical quizzes and homework assignments that require sufficient skills in Java programming, Git version control, Gradle software project management, and scientific writing. Intermediate-level Java programming is a prerequisite of this course.
Different types of objects and inheritances in Java.
A class is a template to instantiate an object.
What is the relationship between a class and an object?
Let us create a class called Numeral
that we want to be a super class of all numeral types:
L1
: package
indicates the name of the package that this class belongs to in a hierarchy.
L3
: public
is an access-level modifier.
What are the acceptable access-level modifiers to declare a top-level class?
Let us declare a method, add()
, that is an operation expected by all numeral types:
L4
: @param
adds a javadoc comment about the parameter.
The issue is that we cannot define the methods unless we know what specific numeral type this class should implement; in other words, it is too abstract to define those methods. Thus, we need to declare Numeral
as a type of abstract class.
What are the advantages of having
Numeral
as a super class of all numeral types?
There are two types of abstract classes in Java, abstract class
and interface
.
Can an object be instantiated by an abstract class or an interface?
Let us define Numeral
as an interface:
L2
: abstract method
All methods in an interface are public
that does not need to be explicitly coded.
Abstract methods in an interface are declared without their bodies.
Who defines the bodies of the abstract methods?
Let us create a new interface called SignedNumeral
that inherits Numeral
and adds two methods, flipSign()
and subtract()
:
Can an interface inherit either an abstract class or a regular class?
L1
: extends
inherits exactly one class or interface.
L9
: default
allows an interface to define a method with its body (introduced in Java 8).
Can we call
add()
that is an abstract method without a body in the default methodsubtract()
?
Although the logic of subtract()
seems to be correct, n.flipSign()
gives a compile error because n
is a type of Numeral
that does not include flipSign()
, which is defined in SignedNumeral
that is a subclass of Numeral
.
What kind of a compile error does
n.flipSign()
cause?
There are three ways of handling this error: casting, polymorphism, and generics.
The first way is to downcast the type of n
to SignedNumeral
, which forces the compiler to think that n
can invoke the flipSign()
method:
This removes the compile error; however, it will likely cause a worse kind, a runtime error.
Why is a runtime error worse than a compile error?
Downcasting, although allowed in Java, is generally not recommended unless there is no other way of accomplishing the job without using it.
How can downcasting cause a runtime error in the above case?
The second way is to change the type of n
to SignedNumeral
in the parameter setting:
This seems to solve the issue. Then, what about add()
defined in Numeral
? Should we change its parameter type to SignedNumeral
as well?
It is often the case that you do not have access to change the code in a super class unless you are the author of it. Even if you are the author, changing the code in a super class is not recommended.
Why is it not recommended to change the code in a super class?
How about we override the add()
method as follows?
L2
: @Override
is a predefined annotation type to indicate the method is overridden.
The annotation @Override
gives an error in this case because it is not considered an overriding.
What are the criteria to override a method?
When @Override
is discarded, the error goes away and everything seems to be fine:
However, this is considered an overloading, which defines two separate methods for add()
, one taking n
as Numeral
and the other taking it as SignedNumeral
. Unfortunately, this would decrease the level of abstraction that we originally desired.
What are good use cases of method overriding and overloading?
The third way is to use generics, introduced in Java 5:
L1
: T
is a generic type that is a subtype of Numeral
.
A generic type can be recursively defined as T
extends Numeral<T>
.
L2
: T
is considered a viable type in this interface such that it can be used to declare add()
.
Can we define more than one generic type per interface or class?
The generic type T
can be specified in a subclass of Numeral
:
L1
: T
is specified as SignedNumeral
.
This would implicitly assign the parameter type of add()
as follows:
The issue is that the implementation of add()
may require specific features defined in the subclass that is not available in SignedNumeral
. Consider the following subclass inheriting SignedNumeral
:
L1
: implements
inherits multiple interfaces.
L2-6
: LongInteger
is a regular class, so all abstract methods declared in the super classes must be defined in this class.
Since the n
is typed to SignedNumeral
in L6
, it cannot call any method defined in LongInteger
, which leads to the same issue addressed in the casting section.
Would the type of
n
beingSignedNumeral
an issue for thesubtract()
method as well?
Thus, SignedNumeral
needs to define its own generic type and pass it onto Numeral
:
L1
: T
is a generic type inheriting SignedNumeral
, that implies all subclasses of SignedNumeral
.
T
can be safely passed onto Numeral
because if it is a subclass of SignedNumeral
, it must be a subclass of Numeral
, which is how T
is defined in the Numeral
class.
Generics are used everywhere in Java, so it is important to understand the core concept of generics and be able to adapt it in your code to make it more modular.
Let us create an enum class called Sign
to represent the "sign" of the numeral:
All items in an enum have the scope of static
and the access-level of public
.
Items must be delimited by ,
and ends with ;
.
The items in the enum can be assigned with specific values to make them more indicative (e.g., +
, -
):
L5
: final
makes this field a constant, not a variable, such that the value cannot be updated later.
L8
: this
points to the object created by this constructor.
L11
: @return
adds a javadoc comment about the return value of this method.
Why should the member field
value
beprivate
in the above example?
Note that value
in L8
indicates the local parameter declared in the constructor whereas value
in L13
indicates the member field declared in L5
.
In SignedNumeral
, it would be convenient to have a member field that indicates the sign of the numeral:
L2
: All member fields of an interface are static
and public
.
Can you declare a member field in an interface without assigning a value?
Given the sign
field, it may seem intuitive to define flipSign()
as a default method:
L3
: condition ?
A :
B is a ternary expression that returns A if the condition is true; otherwise, it returns B.
Is there any advantage of using a ternary operator instead of using a regular if statement?
Unfortunately, this gives a compile error because sign
is a constant whose value cannot be reassigned. An interface is not meant to define so many default methods, which were not even allowed before Java 8. For such explicit implementations, it is better to declare SignedNumeral
as an abstract class instead.
Let us turn SignedNumeral
into an abstract class:
L9
: the default constructor with no parameter.
L17
: another constructor with the sign
parameter.
L10
: this()
calls the constructor in L17
.
Why calling
this(Sign.POSITIVE)
inL10
instead of statingthis.sign = Sign.POSITIVE
?
L29
: abstract
indicates that this is an abstract method.
Member fields and methods in an abstract class can be decorated by any modifiers, which need to be explicitly coded.
Is there anything that is not allowed in an abstract class but allowed in a regular class?
In summary, SignedNumeral
includes 2 abstract methods, add()
inherited from Numeral
, and multiply()
declared in this class.
Can you define an abstract class or an interface without declaring an abstract method?
This chapter explains essential object-oriented programming features in Java to implement data structures and algorithms.
Please follow every example described in this section. Programming is an act of writing, not reading. By the end of this chapter, you should be able to reproduce the entire codebase yourself from scratch without consulting those examples.
Quiz 0: Getting Started
Right-click on the src/main/java
directory under the project and create the package edu.emory.cs.utils
. Right-click on the utils
package and create the Java class Utils
:
Add Utils.java
to git.
Add the following methods to the Utils
class:
Run the program by clicking [Run -> Run]
. If you see 5
on the output pane, your program runs successfully.
Open build.gradle
and add the following configurations (if not already), which would allow you to perform JUnit Testing:
Right-click on the src/test/java
directory under the project and create the package edu.emory.cs.utils
. Right-click on the utils
package and create the Java class UtilsTest
:
Add UtilsTest.java
to Git.
Add the following method to the UtilsTest
class. Make sure to include all imports:
Run the test by clicking [Run -> Run]
. If you see the test passed, your unit test runs successfully.
Add the instructors as collaborators in your GitHub repository:
Jinho Choi: jdchoi77
Peilin Wu: qualidea1217
Jeongrok Yu: jeongrok
Zinc Zhao: ZincZhao
2. Commit and push the following to your GitHub repository:
3. Submit the URL of your GitHub repository to Canvas.
Development kit, version control system, integrated development environment, and project management for Java programming.
Install the latest version of the Java SE Development Kit (JDK) on your local machine:
The required version: 17.x.x
(or higher)
Although Java 17 is not the most recent version, it is the latest long-term support (LTS) release, which is preferred.
Install Git using any of the following instructions:
Run the following commands on a terminal by replacing user.email
and user.name
with your email address and name:
Install the latest version of IntelliJ IDEA on your local machine:
The recommended version: 2022.3.x
(Ultimate Edition)
Apply for the academic license with your school email address to use the ultimate version.
Even if you already have an IDE that you are familiar with for Java programming, we strongly recommend you use IntelliJ because JetBrains provides IDEs for many popular programming languages with similar interfaces, which makes it easier for you to adapt.
Launch IntelliJ and create a Gradle project by clicking the [New Project]
button at the top:
Name: dsa-java
Location: local_path/dsa-java
Check "Create Git repository"
Language: Java
Build system: Gradle
JDK: 17
Gradle DSL: Groovy
Uncheck "Add sample code"
Advanced Settings:
GroupId: edu.emory.cs
ArtifactId: dsa-java
For JDK, you should be able to see version 17 if it is properly installed. If you cannot find the version, click [Add JDK]
and select the following directory.
Windows: C:\Program Files\Java\jdk-17.x.x
Mac: /Library/Java/JavaVirtualMachines/jdk-17.x.x.jdk
Open gradle/wrapper/gradle-wrapper.properties
and make sure distributionUrl
indicates the latest version of Gradle:
Click [Settings - Build, Execution, Deployment]
on the menu:
Click [Build Tools - Gradle]
and set Gradle JVM
to 17.
Click [Compiler - Java Compiler]
and set Project bytecode version
to 17.
Click [File - Project Structure]
and select [Project Settings]
:
Click [Project Settings - Project]
and set SDK
to 17 and Project language level
to SDK default.
Click [Project Settings - Modules - Dependencies]
and set Module SDK
to 17.
Go to [Platform Settings - SDKs]
and select 17.
Open build.gradle
and make sure sourceCompatibility
and targetCompatibility
are set to java version 17 (add the following configurations if they do not exist already):
Lastly, check mavenCentral()
is configured as a repository in your build.gradle
:
There is another popular build tool called Maven. However, we will use Gradle for this course because it is faster and simpler to build a Java-based project.
To integrate the project with your GitHub repository, click [Settings]
:
Choose [Version Control - Github]
on the left pane.
Click [+]
and login with your GitHub ID and password.
If you use two-factor authentication, log in with your personal access token.
Create .gitignore
under the project and add the following contents:
Click [Git - GitHub - Share Project on Github]
and create a repository:
Make sure to check private.
Repository name: dsa-java
Remote: origin
Description: Data Structures and Algorithms in Java
Add all files and make the initial commit. Check if the repository is created under your GitHub account: https://github.com/your_id/dsa-java
.
We recommend you create a GitHub account with your school email address, allowing you to add unlimited collaborators to the repository.
LongInteger: unit tests.
LongInteger()
Let us create a testing class called LongIntegerTest
and make a unit test for the constructors:
L6
: the @Test
annotation indicates that this method is used for unit testing.
L7
: methods used for unit testing must be public
.
L8
: tests the default constructor
assertEquals()
: compares two parameters where the left parameter is the expected value and the right parameter is the actual value.
L12-15
: tests the constructor with a string parameter.
L18-19
: tests the copy constructor.
When should we use
import static
instead ofimport
?
When you run this test, you see a prompt similar to the followings:
multiply()
Let us define another method for testing the multiply()
method:
compareTo()
Let us define a method for testing the compareTo()
method:
assertTrue()
: passes if the parameter returns true.
Unit testing provides an effective way of ensuring the correctness of your program. Making a unit test for every method is standard practice, although this standard is often dismissed due to time pressure.
Spring 2023
01/11
01/16
MLK Holiday
01/18
01/23
(Continue)
01/25
(Continue)
01/30
02/01
(Continue)
02/06
02/08
(Continue)
02/13
(Continue)
02/15
02/20
(Continue)
02/22
(Continue)
02/27
03/01
(Continue)
03/06
Spring Break
03/08
Spring Break
03/13
03/15
03/20
(Continue)
03/22
03/27
(Continue)
03/29
(Continue)
04/03
04/05
(Continue)
04/10
(Continue)
04/12
04/17
(Continue)
04/19
(Continue)
04/24
Review
Speed vs. complexity.
Let us define a static nested class Time
under that stores runtimes for the add()
and remove()
methods in any PQ:
What is the advantage of defining static nested class over non-static nested class?
Let us define addRuntime()
that estimates the runtime for add()
and remove()
for a particular PQ:
L5-8
: estimates the runtime for add()
:
L8
: accumulates the runtime to times.add
.
L11-14
: estimates the runtime for remove()
.
Is the approach using
currentTimeMillis()
a proper way of benchmarking the speed?
We then define benchmark()
that estimates the speeds of multiple PQs using the same lists of keys by calling addRuntime()
:
L1
: method declaration:
AbstractPriorityQueue[]
: It is unusual to declare an array of objects involving generics in Java (you will see the unique use case of it in the testSpeedAux()
method).
L2
: generates an array of Time
using several stream operations. The dimension of times[]
is the same as the dimension of queues
such that times[i]
accumulates the speeds taken by the queue[i]
, respectively.
L4
: benchmarks the PQs multiple times as specified by iter
.
L5
: generates the list of keys specified by the supplier sup
.
L6-7
: estimate the runtimes by using the same list of keys for each PQ.
Why do we need to benchmark the priority queues multiple times?
Let us define testSpeedAux()
that takes multiple PQs and prints out the benchmarking results:
L2
: declares a final method that cannot be overridden by its subclasses.
It can take a 0-to-many number of abstract PQs.
The type of queues
is AbstractPriorityQueue<Integer>[]
such that it can be directly passed to the benchmark()
method.
L8
: benchmarks different sizes of PQs.
L10
: warms up the Java Virtual Machine (JVM) before the actual benchmarking.
rand::nextnt
that is equivalent to the lambda expression of () -> rand.nextInt()
.
L12
: benchmarks add()
and remove()
in the PQs.
L14-18
: prints the benchmark results.
L16
: exercise.
Why is it recommended to declare the method
final
if it takesvararg
parameters? Why does JVM need to be warmed-up before the actual benchmarking? Why should you useStringJoiner
instead of concatenating strings with the+
operator?
The followings show runtime comparisons between LazyPQ
, EagerPQ
, and BinaryHeap
:
Lazy and eager priority queues.
A priority queue (PQ) is a data structure that supports the following two operations:
add()
: adds a comparable key to the PQ.
remove()
: removes the key with the highest (or lowest) priority in the PQ.
A PQ that removes the key with the highest priority is called a maximum PQ (max-PQ), and with the lowest priority is called a minimum PQ (min-PQ).
Does a priority queue need to be sorted at all time to support those two operations? What are the use cases of priority queues?
Let us define that is an abstract class to be inherited by all priority queues:
L2
: is a comparator that can compare keys of the generic type T
.
final
: must be initialized in every constructor.
L6
: the javadoc {@link}
hyperlinks to the specified methods.
What are comparable data types in Java? Can you define your own comparator?
Let us define three abstract methods, add()
, remove()
, and size()
in AbstractPriorityQueue
:
Given the abstract methods, we can define the regular method isEmpty()
:
In other words, all the hard work is done at the last minute when it needs to remove the key.
L1
: declares T
and passes it to its super class, AbstractPriorityQueue
.
L2
: defines a list to store input keys.
L17-19
: overrides the size()
method.
Can you add keys to the member field
keys
when it is declared as final (a constant)? Why does all constructors inLazyPriorityQueue
need to call the super constructor?
We then override the core methods, add()
and remove()
:
L16
: edge case handling.
In other words, all the hard work is done as soon as a key is added.
What are the situations that
LazyPQ
is preferred overEagerPQ
and vice versa?
The implementations of the two constructors and the size()
method are identical to the ones in LazyPriorityQueue
.
Should we create an abstract class that implements the above code and make it as a super class of
LazyPQ
andEagerPQ
? What level of abstraction is appropriate in object-oriented programming?
We then override the core methods, add()
and remove()
:
What are the worst-case complexities of
add()
andremove()
inLazyPQ
andEagerPQ
in terms of assignments and comparison?
Selection sort, heap sort, insertion sort, shell sort.
Selection-based sorting algorithms take the following steps:
For each key where and :
Search the maximum key where .
Swap and
The complexities differ by different search algorithms:
Selection Sort uses linear search to find the minimum (or maximum) key, whereas Heap Sort turns the input array into a heap, so the search complexity becomes instead of .
Can the search be faster than ?
Let us create the class inheriting AbstractSort
:
Let us then override the sort()
method:
How does the
sort()
method work withComparator.reverseOrder()
?
Before we override the sort()
method, let us define the following helper methods:
L1-7
: finds the right position of the k
'th key by using the sink operation.
L9-11
: finds the parent index of the k
'th key given the beginning index.
L13-15
: finds the left child index of the k
'th key given the beginning index.
Finally, we override the sort()
method:
What is the worst-case scenario for Selection Sort and Heap Sort?
Insertion-based sorting algorithms take the following steps:
The complexities differ by different sequences:
Let us then define an auxiliary method, sort()
:
L6
: swaps the two keys.
Given the auxiliary method, the sort()
method can be defined as follows where h = 1
:
How many swaps does Insertion Sort make for the following array?
[7, 1, 2, 3, 4, 5, 6, 14, 8, 9, 10, 11, 12, 13, 0]
For the above example, by using the Hibbard sequence, it first groups keys whose gap is 7:[7, 14, 0], [1, 8], [2, 9], [3, 10], [4, 11], [5, 12], [6, 13]
It then sorts each group using Insertion Sort, which results in the following array:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
The above procedure is repeated for gaps 3 and 1; however, the array is already sorted, so no more swapping is necessary.
L2
: stores a particular gap sequence.
L7
: pre-populate the gap sequence that can handle the input size up to 10000
.
Then, let us define two abstract methods, populateSequence()
and getSequenceStartIndex()
:
L5
: populates a particular sequence for the input size n
.
L11
: returns the index of the first gap to be used given the input size n
.
Let us then override the sort()
method:
L4
: should not re-populate the sequence unless it has to.
L7
: sorts the gap group by using the auxiliary method.
The two abstract methods can be overridden as follows:
The abstract class for all sorting algorithms.
Let us create an abstract class :
L2-4
: member fields:
comparator
: specifies the precedence of comparable keys.
comparisons
: counts the number of comparisons performed during sorting.
assignments
: counts the number of assignments performed during sorting.
The two-member fields, comparisons
and assignments
, are used to micro-benchmark sorting algorithms inheriting AbstractSort
.
Let us define three helper methods, compareTo()
, assign()
, and swap()
:
L7
: compares two keys in the array and increments the count.
L18
: assigns the value to the specific position in the array and increments the count.
L29
: swaps the values of the specific positions in the array by calling the assign()
method.
These helper methods allow us to analyze the exact counts of those operations performed by different sorting algorithms.
How is it different to benchmark runtime speeds from counting these two operations?
Finally, let us define the default sort()
that calls an overwritten abstract method, sort()
:
L5
: calls the abstract method sort()
that overwrites it.
L15
: sorts the array
in the range of (beginIndex
, endIndex
) as specified in comparator
.
When would it be useful to sort a specific range in the input array?
Bucket sort, radix sort.
Unlike comparison-based algorithms in Sections and , distribution-based sorting algorithms do not make comparisons between keys in the input array; instead, they distribute keys into ordered buckets and merge keys in the buckets.
Let us create an abstract class, inheriting AbstractSort
:
L2
: declares a list of buckets where each bucket is represented by a .
L6
: does not pass any comparator to AbstractSort
since no comparison is needed.
L7
: creates a pre-defined number of buckets.
What kind of input keys can work with distribution-based sorting algorithms?
We then define a helper method sort()
:
L7
: bucketIndex
is a function that takes a comparable key and returns the appropriate bucket index of the key.
L9-10
: adds the keys within the range in the input array to the buckets.
L13-16
: puts all keys in the buckets back to the input array while keeping the order.
How many assignments are made by the
sort()
method inBucketSort
?
Let us create the IntegerBucketSort
class inheriting BucketSort
that sorts integers within a specific range in ascending order:
L2
: takes the smallest integer in the range.
L8
: passes the number of buckets, max - min
, to be initialized to the super constructor.\
We then override the sort()
method:
L3
: passes a lambda expression that takes key
and returns key - MIN
indicating the index of the bucket that key
should be assigned to.
Is it possible to implement
DoubleBucketSort
that can handle double instead of integer keys?
Let us create an abstract class, RadixSort
, inheriting BucketSort<Integer>
:
L3
: creates 10
buckets to sort any range of integers.
L6
: returns the order of the most significant digit in the input array.
Let us create the LSDRadixSort
class inheriting RadixSort
that sorts the integer keys from the least significant digit (LSD) to the most significant digit (MSD):
L5
: iterates from LSD to MSD.
The followings compare runtime speeds between advanced sorting algorithms where the input arrays consist of random integer keys:
Quiz 2: Priority Queues
Create a class called under the main package that extends the abstract class .
Each node in TernaryHeapQuiz
takes up to 3 children, so it becomes a ternary instead of a binary tree.
Override the required abstract methods, , , and , such that both add()
and remove()
give .
Feel free to use the code in .
Create the class under the test package.
Test the correctness of your TernaryHeapQuiz
using the method.
Add more tests for a more thorough assessment if necessary.
Compare runtime speeds between BinaryHeap
and TernaryHeapQuiz
for add()
and remove()
using the method.
Create a PDF file quiz2.pdf and write a report that includes the following:
A table and a chart to compare speeds between the two priority queues for those two methods, add()
and remove()
, with respect to different input sizes.
A brief explanation of why a certain PQ is faster than the other PQ with respect to different input sizes.
1. Commit and push everything under the following packages to your GitHub repository:
2. Submit quiz2.pdf to Canvas.
Operations: swim and sink.
A binary heap is a PQ that takesfor both add()
and remove()
, where keys are stored in a balanced binary tree in which every node has a higher priority than its children (when used as a max-PQ).
When should we use a binary heap over a lazy PQ or an eager PQ?
A balanced binary tree can be implemented by a list such that given the 'th key in the list:
The index of its parent:
The index of its left child: .
The index of its right child: .
e.g., given the list [1, 2, 3, 4, 5]
, 1
is the root, 2
and 3
are the left and right children of 1
, and 4
and 5
are the left and right children of 2
, respectively.
Is the tree represented by the list
[1, 2, 3, 4, 5]
balanced?
Let us create the class inheriting AbstractPriorityQueue
:
L11
: adding null
as the first item makes it easier to calculate the indices of parents and children.
L16
: keys
has the extra item of null
from L11
that is not an input key.
How does adding
null
make the calculation of the indices easier?
Additionally, let us define the helper method compare()
that compares two keys in the list:
L8
: calls the compare()
method in the member field priority
declared in the super class.
Add()
with SwimThe add()
method in a heap uses the operation called swim as follows:
L3
: appends the new key to the list.
L7-10
: keeps swimming until the list becomes a heap:
L8
: compares the new key to its parent if exists.
How many keys are compared at most for the `add operation?
Remove()
with SinkThe remove()
method in a heap uses the operation called sink as follows:
L3
: checks if the heap is empty.
L4
: swaps the root and the last key in the list.
L3
: removes the (swapped) last key with the highest priority in this heap.
L10-16
: keeps sinking until the list becomes a heap:
L12
: finds the child with a higher priority.
L13
: breaks if the parent has a higher priority than the child.
L14
: swaps if the child has a higher priority than the parent.
How many keys are compared at most for the
remove
operation?
Robustness tests.
Let us create and define testRobustnessAux()
that checks the robustness of any PQ inheriting AbstractPriorityQueue
:
L6
: the generic type T
is defined specifically for this method such that the priority queue pq
, the list keys
, and the comparator comp
take the same type T
that is comparable.
L7
: any collection that inherits the interface has the member method that takes a and applies each key in the collection to the consumer.
L8
: any collection has the member method that returns the of the collection.
: creates a stream of the collection whose keys are sorted
: creates a collection specified by the .
: returns a collector that transforms the stream into a list.
L9
: iterates each of the sorted keys and compares it to the returned value of pq.remove()
.
What are the advantages of defining a method-level generic type?
The following code shows a traditional way of iterating over a list (that is equivalent to L7
above):
Java 5 introduced an enhanced for loop that simplified the traditional for loop:
The syntax of the above code can be simplified as follows (as shown in L7
):
What are the main differences between object-oriented programming and functional programming?
Let us define the testRobustness()
method that calls the auxiliary method testRobustnessAux()
:
L7-9
: tests different types of max-PQs. The keys need to be sorted in reverse order for it to test the remove()
method.
L11-13
: tests different types of min-PQs. The keys need to be sorted in the natural order for it to test the remove()
method.
The generic types of the PQs in
L4-5
are explicitly coded as<Integer>
, whereas they are simplified to<>
inL7-13
. When do generic types need to be explicitly coded?
The testRobustness()
method illustrates the benefit of defining AbstractPriorityQueue
such that any type of PQ can be passed to testRobustnessAux()
for unit-testing.
Merge sort, quick sort, intro sort
Divide the problem into sub-problems (recursively).
Conquer sub-problems, which effectively solve the super problem.
Why do people ever want to use QuickSort?
Divide the input array into two sub-arrays.
Sort each of the sub-arrays and merge them into the back.
Let us create the class inheriting AbstractSort
:
L2
: holds the copy of the input array.
Let us then override the sort()
method that calls the helper method:
L4-5
: increases the size of the temp
array if necessary.
L5
: unchecked type Object
to T[]
.
What is the advantage of declaring the member field
temp
?
The helper method can be defined as follows:
L11
: sorts the left sub-array.
L12
: sorts the right sub-array.
L13
: merges the left and right sub-arrays.
Finally, the merge()
method can be defined as follows:
L10-11
: copies the input array to the temporary array and counts the assignments.
L14-15
: no key left in the 1st half.
L16-17
: no key left in the 2nd half.
L18-19
: the 2nd key is greater than the 1st key.
L20-21
: the 1st key is greater than or equal to the 2nd key.
Pick a pivot key in the input array.
Exchange keys between the left and right partitions such that all keys in the left and right partitions are smaller or bigger than the pivot key, respectively.
Repeat this procedure in each partition, recursively.
Let us then override the sort()
method:
L3
: stops when the pointers are crossed.
L6
: sorts the left partition.
L7
: sorts the right partition.
The partition()
method can be defined as follows:
L5
: finds the left pointer where endIndex
> fst
> pivot
.
L6
: finds the right pointer where beginIndex
< snd
< pivot
.
L7
: the left and right pointers are crossed.
L8
: swaps between keys in the left and right partitions.
L11
: swaps the keys in the beginIndex
and pivot
.
The programming design in L5
and L6
are not ideal since the while loops do not include anybody that can confuse other programmers.
How can we determine if Quick Sort is meeting the worse-case?
Let us define the IntroSoft
class inheriting QuickSort
:
L2
: declares a sorting algorithm to handle the worst cases.
L14
: resets the counters for both the main and auxiliary sorting algorithms.
We then override the sort() method by passing the maximum depth to the auxiliary method:
Finally, the auxiliary method sort()
can be defined as follows:
L4-5
: switches to the other sorting algorithm if the depth of the partitioning exceeds the max depth.
The following shows runtime speeds, assignment costs, and comparison costs between several sorting algorithms for the random, best, and worst cases.
Quiz 3: Sorting Algorithms
Create the class under the package that inherits .
Override the populateSequence()
and getSequenceStartIndex()
methods such that it performs Shell Sort by using the Hibbard sequence: .
Feel free to use the code in .
Create the class under the package that inherits .
Override the sort()
method such that it performs Radix Sort from the most significant digit (MSD) to the least significant digit (LSD).
Feel free to use the code in .
Create the class under the test package.
Test the correctness of your TernaryHeapQuiz
using the method.
Add more tests for a more thorough assessment if necessary.
Compare runtime speeds between LSDRadixSort
and RadixSortQuiz
for random cases.
Create a PDF file quiz3.pdf and write a report that includes charts and explanations to compare runtime speeds between:
ShellSortKnuth
and ShellSortQuiz
.
LSDRadixSort
and RadixSortQuiz
.
Push everything under the sort package to your GitHub repository:
Submit quiz3.pdf to Canvas.
This section explains the homework assignment regarding Hybrid Sort.
Your task is to write a program that sorts a 2D-array of comparable keys into an 1D-array where all the keys are sorted in ascending order. Here is a sample input to your program:
Here is the expected output given the sample input:
Each row in the input array has one of the following properties:
Sorted in ascending order (e.g., the 1st row).
Sorted in descending order (e.g., the 2nd row).
Mostly sorted in ascending order (e.g., the 3rd row).
Mostly sorted in descending order (e.g., the 4th row).
Randomly distributed (e.g., the 5th row).
The easiest way is to merge all rows in the input array into an 1D-array then sort it using . The implementation of this strategy is provided to establish the baseline: . Your goal is to design a sorting mechanism that gives a noticeable speed-up over this baseline.
Try different input arrays; you are responsible for the robustness of your program.
Try different configurations for speed comparison (e.g., row size, column size, shuffle ratio).
Write the report hw1.pdf describing the logic behind your mechanism and the speed comparison against the baseline.
Submit hw1.pdf to Canvas.
You are not allowed to:
Use any implementation that is not written by you nor provided by this course.
Call any sorting methods built-in Java or 3rd-party libraries.
Merge all rows first and apply a fancy sorting algorithm to the 1D array as the baseline does.
The input matrix can:
Contain an arbitrary number of rows, where the size of each row is also arbitrary.
Baseline: 20697
Cannot run: 7
Extremely slow: 6
Code not found: 1
This section discusses abstraction of binary search trees.
This section assumes that you have already learned core concepts about binary search tress from the prerequisite. Thus, it focuses on the abstraction that can be applied to other types of binary search trees introduced in the following sections.
Let us create an abstract class, :
L1
: defines two generic types, T
for the type of the key and N
is for the type of the binary node.
L8
: calls the setKey()
method to assign the value of key
in L2
.
Let us define boolean methods for the member fields:
What is the input parameter
node
isnull
for theisLeftChild()
andisRightChild()
methods?
Let us then define getters to access the member fields:
We can also define helper methods inferred by the getters:
L9
: this
needs to be casted to N
since the input parameter of isLeftChild()
is N
.
Is it safe to downcast
this
toN
?
Finally, let us define setters and their helper methods:
L5,10
: sets node
to be a child of this
in two steps:
L6,11
: replaces the parent of node
with this
.
L7,12
: sets node
to be the child.
L20
: replaces the parent of node
with this
in two steps:
L22
: node
gets abandoned by its current parent.
L23
: this
becomes the new parent of node
.
L32
: replaces oldChild
of this node with newChild
.
L1
: defines only 1 generic type T
for the comparable key and passes itself for the generic type N
to theAbstractBinaryNode
class.
Is there any abstract method from
AbstractBinaryNode
to be defined inBinaryNode
?
L1
: defines two generic types, T
for the type of the key and N
is for the type of the binary node.
L4
: initializes the member field root
to null
.
L9
: creates a binary node typed N
, required for the add()
method.
Why does Java not allow a generic type to be instantiated (e.g.,
node = new N()
)?
Let us define searching methods:
What are the worst-case complexities of
findNode()
,findMinNode()
, andfindMaxNode()
?
Let us define the add()
method:
L5
: creates a node with the key to be the root if this tree does not include any node.
L7
: finds the appropriate location for the key and creates the node.
What does the
add()
method above do when the input key already exists in the tree?
Let us define the remove()
method:
L6
: removes a node with two children using the Hibbard algorithm.
L7
: removes a node with no or one child
The removeSelf()
method makes the node's only child as the child of its parent and removes it:
L6
: finds the child of node
.
L7
: replaces node
with its child.
The removeHibbard()
method finds a node that can be the parent of the left- and the right-children of node
and makes it a child of its parent:
Which nodes are guaranteed to be the parent of those left- and right- children?
The following demonstrates how the above removeHibbard()
method works:
What is the worst-case complexity of the
removeHibbard()
method?
Let us define the BinarySearchTree
inheriting AbstractBinarySearchTree
:
This section gives an overview of tries.
A trie is a tree where each node has:
0-to-many children,
A key whose type is character,
A value that can be any type of object, and
An end-state that indicates if the node and its ancestors together represent the end of a certain word.
Let us consider the following list of strings:
Given the strings as keys, a binary search tree can be constructed as follows using a balancing algorithm:
How many character comparisons does it need to make to search a string in a balanced BST?
With the same list of strings, a trie can be constructed as follows:
1st cell: key (a character).
2nd cell: value (the index of the string that the node represents).
3rd cell: end-state (T
for true, F
for false).
What is the worst-case complexity of searching a string in a trie?
This section provides exercises for better understanding in balanced binary search trees.
Explain how the remove()
method works in the class:
Explain what gets assigned to lowest
in .
Explain how the balance()
methods in and keep the trees balanced (or fails to keep them balanced) after a removal.
Write a report including your explanation and save it as quiz4.pdf
.
Create the class under the package that inherits class.
Override the balance()
method that first checks the following conditions:
node
is the only child &
node
’s parent is the right child of node
's grand parent &
node
’s uncle has only one child.
If all of the above conditions are satisfied, balance()
makes multiple rotations such that the left subtree is always full before any node gets added to the right subtree.
For the example below, after adding the keys (in red) to the trees 1
, 2
, 3
, and 4
, they all need to be transformed into the tree 5
by the balance()
method.
The transform must be performed only by rotations; other setter methods such as setLeftChild()
or setRightChild()
are not allowed in this assignment.
This section discusses abstraction of binary search trees.
A binary search tree gives the worst-case complexity of for searching a key when it is not balanced, where is the number of nodes in the tree:
To ensure the complexity, it needs to be balanced at all time (or at most time).
There are 4 cases of unbalanced trees to be considered:
Left linear [3, 2, 1]
Right linear [1, 2, 3]
Left zig-zag [3, 1, 2]
Right zig-zag [1, 3, 2]
These cases can be balanced by performing rotations as follows:
What should happen to the subtrees (red/orange/green/blue triangles) after the rotations?
Where does
node
end up being after the rotations?
The followings demonstrate how the rotateLeft()
method works:
Let us override the add()
and remove()
methods that call the abstract method balance()
:
L3
: calls the add()
method in the super class, AbstractBinarySearchTree
.
L4
: performs balancing on node
that is just added.
L14
: performs balancing on node
that is the lowest node in the tree affected by this removal.
L24
: defines a balancing algorithm for a specific type of balanced binary search trees.
What are the worst-case complexities of the
add()
andremove()
methods above?
This section describes how to implement a trie.
Let us define a class, :
L1
: the generic type T
represents the type of value
in L6
.
L2
: is used to store the children.
L4
: true
if this node represents the last character of a certain word; otherwise, false
.
L9
: gives the complexity for searching a key.
What other types of maps are implemented in Java?
Let us define getter methods:
L10
: returns the map consisting of children's characters as keys and their nodes as values.
Is there any risk of returning
this.children
inL11
?
Let us then define setter methods:
L7
: returns the previously assigned value of this node.
L13
: returns the child with the specific key if exists; otherwise, a new child with the key.
What does the
removeChild()
method return ifkey
does not exist in this node?
Finally, let us define checker methods:
L1
: returns true
if this node represents the last character of a certain word; false
.
L1
: defines a generic type T
that can be any type.
L5
: initializes the root with the null
character.
Let us define the find()
method that takes a specific key in string and returns the node corresponding the last character of the key:
L5
: iterates through every character in the string.
L6
: finds the node whose key is the current character.
L7
: the child does not exist, implying that key
is not introduced in this trie.
Given the find()
method, let us define the get()
method that takes a string key and returns the value of the node corresponding to the key in this trie:
L3
: checks if both the key exists and the key is a word.
We then define the put() method that inserts a key-value pair to the trie:
L4-5
: iterates through all characters in the key and adds children as necessary.
L7
: sets the end state of the node corresponding the last character to true
.
L8
: returns the value of the previously key if exists; otherwise, null
.
What does
node.addChild(c)
return if the child with the keyc
already exists?
The following demonstrates how the put()
method works:
Finally, let us define the remove()
method that removes all appropriate nodes corresponding to the key:
L2
: retrieves the node corresponding to the last character in the key.
L4
: returns false
the key does not exist or the key is not a word.
L7-10
: if the retrieved node has children, sets its end-state to false
and returns true
.
L12-23
: removes ancestors who have descendants only related to this key.
The following demonstrates how the remove()
method works:
This section describes disjoint sets
A disjoint set enables to be join sets efficiently. There are two methods implemented in this structure, inSameSet()
and union()
.
In the following example, each key in {0, 1, 2, 3, 4}
is initially in its own set:
inSameSet(1, 3)
checks if the keys 1
and 3
are in the same set:
The union(1, 3)
method joins two sets including 1
and 3
as one set:
If we check if the keys 1
and 3
are in the same set, it should return true
although for the keys 1
and 4
, it should return false
.
The union(3, 4)
method joins the two sets including 3
and 4
as one:
If we check if 1
, 3
and 4
are in the same set, it should return true
:
This section provides exercises for better understanding in tries.
Let L
be a list of country names in String
where spaces are allowed (e.g., "United States"
, "South Korea"
). Consider a trie where all country names in L
and their unique IDs are put as key-value pairs as follows:
Write a report that briefly describe your approach and explains the worst-case complexity of your approach, and save it as quiz5.pdf
.
Substring matching of country names is not required. If your dictionary has "United States"
as a country name and the input string contains "United Nation"
, the "United"
should not be matched as a country name.
Substring matching within input words are expected. If your dictionary has "America"
as a country name and the input contains "American"
, the first 7 characters, "America"
should be recognized as a country name.
The worst-case complexity needs to be explained in terms of the number of characters n
in the input string.
This sections discusses the implementation of disjoint sets.
Let us create the class:
L2
: the size of subset (-1
implies 1
ID, -2
implies 2
IDs, and so on).
L5-6
: every set is initialized with the size of 1
.
L9
: a copy constructor.
Let us define the find()
method:
L2
: returns the ID of the subset where the specific key belongs to.
What is the worst-case complexity of the
find()
method?
We then define the inSameSet()
method:
L2
: returns true if the two specific keys are in the same set.
Finally, let us define the union()
method:
L2-4
: return the subset ID where both key1
and key2
belongs to.
L6-10
: merges the set containing key2
to the set containing key1
.
L11-15
: merges the set containing key1
to the set containing key2
.
This section provides exercises for better understanding in disjoint sets.
Create the class under the package.
Assume that the method in the DisjointSet
class uses the baseline approach:
A disjoint set can be represented by a tree. Update the method in the DisjointSetQuiz
class that would result the following tree:
Write a report quiz6.pdf
that includes the followings:
This section provides exercises for better understanding in disjoint sets.
Create the class under the package.
Update the numberOfCycles()
method that returns the number of all cycles in the graph.
Make sure to find all atomic cycles; do not count cycles that can be created by simply combining other cycles.
Write a report quiz6.pdf
that includes the followings:
Explain the worst-case complexity of the algorithm for your numberOfCycle()
method.
This section describes an algorithm to detect a cycle in a graph.
A cycle in a graph can be detected by traversing through the graph:
Let us define the containsCycle()
method under the Graph
class that returns true if the graph contains any cycle; otherwise, false:
L2
: initially all vertices are not visited.
L4
: iterates until all vertices are visitied:
L5-6
: if the recursive call finds a cycle, returns true
.
What is the worst-case complexity of the
containsCycle()
method?
L2-3
: marks the target vertex visited.
L5
: for each incoming edge of the target vertex:
L6-7
: returns true if the source vertex of this edge has seen before.
L9-10
: returns true if the recursive call finds a cycle.
Why do we need to pass the new
HashSet
for every call inL5
?
This section discusses spanning trees in a graph.
A spanning tree in a graph is a tree that contains all vertices in the graphs as its nodes. A minimum spanning tree is a spanning tree whose sum of all edge weights is the minimum among all the other spanning trees in the graph.
Can a graph have more than one minimum spanning tree?
L2
: contains all edges in this spanning tree.
L3
: contains the total weight of this spanning tree.
We then define getters and setters:
L3
: the size of the spanning tree is determined by the number of edges.
Finally, we override the compareTo()
method that makes the comparison to another spanning tree by the total weight:
L2
: an abstract method that takes a graph and returns a minimum spanning tree.
This section describes how to implement a graph structure.
There are several types of graphs:
Can we represent undirected graphs using an implementation of directed graphs?
L2-3
: the source and target vertices are represented by unique IDs in int
.
L10
: constructor overwriting.
L14
: copy constructor.
Let us the define the init()
method, getters, and setters:
Let us override the compareTo()
method that compares this edge to another edge by their weights:
L4
: returns 1
if the weight of this edge is greater.
L5
: returns -1
if the weight of the other edge is greater.
L6
: returns 0
is the weights of both edges are the same.
L2
: a list of edge lists where each dimension of the outer list indicates a target vertex and the inner list corresponds to the list of incoming edges to that target vertex.
L5
: creates an empty edge list for each target vertex.
L9
: copies the list of edge lists from g
to this graph.
L17
: the size of the graph is the number of vertices in the graph.
For example, a graph with 3 vertices 0
, 1
, and 2
and 3 edges 0 -> 1
, 2 -> 1
, and 0 -> 2
, the incoming_edges
is as follows:
Let us define setters for edges:
L2-4
: retrieves the incoming edge list of target
, and adds the new edge to the edge list.
L9-10
: add edges to both directions.
Let us then define getters:
What is the worst-case complexity of the
getVerticesWithNoIncomingEdges()
method?
Let us define the getOutgoingEdges()
method that returns the list of outgoing edges:
L1
: returns a list of edge deque, where each dimension in the outer list represents the deque of outgoing edges for the corresponding source vertex.
L2
: generates the list of empty edge deques.
L4-7
: visits every target vertex to retrieve the incoming edge lists, and assigns edges to appropriate source vertices.
What is the worst-case complexity of the
getOutgoingEdges()
method?
This section discusses two algorithms for topological sorting.
The task of topological sorting is to sort vertices in a linear order such that for each vertex, all target vertices associated with its incoming edges must appear prior to it.
The followings demonstrate how to perform a topological sort by incoming scores, where the incoming score of a vertex is define as the sum of all source vertices:
The followings demonstrate how to perform a topological sort by depth-first traversing:
Let us define the topological_sort()
method under the Graph
class that performs topological sort by either incoming scores or depth-first search:
L2
: starts with vertices with no incoming edges.
L3
: retrieves outgoing edges of each source vertex.
L4
: consists of the list of vertices in topological order.
L6
: iterates until no more vertex to visit:
L8-9
: adds a source vertex to order
.
L12
: iterates through all the outgoing edges of the source vertex:
L17
: removes the outgoing edge from the graph.
L20-21
: adds the target vertex if it has no incoming edge left to local
.
L25-27
: adds all vertices in local
to global
.
What is the worst-case complexity of the
topological_sort()
method?
This section discusses Kruskal's Algorithm that finds a minimum spanning tree in an undirected graph.
Kruskal's algorithm finds a minimum spanning tree by finding the minimum edge among subtrees.
Let us create the class inheriting the MST
interface:
L4
: adds all edges to the priority queue.
L5
: creates a forest where each vertex is considered a separate tree.
L8
: iterates through all edges:
L11
: if the target and source vertices in the edge are not in the same tree.
L13
: if the tree contains v-1
number of edges, a spanning tree is found.
L14
: merges the target and source vertices into the same tree.
This section describes the homework assignment to develop an algorithm to find all minimum spanning tress given an undirected graph.
Your task is to write a program that finds all minimum spanning trees given an undirected graph.
Create a class called that inherits the interface.
Override the getMinimumSpanningTrees()
method.
Feel free to use any classes in the package without modifying.
Write a report that explains the logic and worst-case complexity of your program and save it as hw3.pdf
.
You may want to start with the implementation of either or algorithm and update the code to find all minimum spanning trees.
Create an interesting graph and submit both the source code (as in ) and the diagram representing your graph (up to 1 point).
Test your code with graphs consisting of zero to many spanning trees.
Your output must include only minimum spanning trees.
Your output should not include redundant trees. For example, if there are three vertices, 0, 1, and 2, {0 -> 1
, 0 -> 2
} is considered the same as {0 -> 1
, 0 <- 2
} or {0 <- 1
, 0 -> 2
} or {0 <-1
, 0 <- 2
}.
L11-12
: a
can hold the integer 152415787517146788750190521
(), which is much larger than the maximum value of long
that is .
L5
: gets the snapshot of the starting time using .
L7
: gets the snapshot of the ending time using .
: passes a function as a parameter that returns a key of the generic type T
.
: creates a stream specified by the supplier.
: creates a stream with a specific size.
: returns an array of the specific type.
L1
: annotates indicating that this method takes vararg
parameters.
L5
: creates a generator.
L14
: uses to concatenate strings with the specific delimiter.
L1
: declares the type T
that is the type of input keys to be stored in this PQ.
T
must be by its priority.
Comparators: , .
Let us define whose core methods satisfy the following conditions:
add()
: takesto add a key to the PQ.
remove()
: takes to remove the key with the highest/lowest priority from the PQ.
L6-8
: appends a key to the list in .
L15-20
: removes a key to the list in .
L17
: finds the max-key in the list using in .
L18
: removes a key from the list in .
Is ArrayList
the best implementation of List
for LazyPriorityQueue
?
Why does remove()
in L18
cost ?
Let us define whose core methods satisfy the following conditions:
add()
: takesto add a key to the PQ.
remove()
: takes to remove the key with the highest/lowest priority from the PQ.
L6-12
: inserts a key to the list in .
L8
: finds the index of the key to be inserted in the list using binary search in .
L10
: reverse engineers the return value of .
L11
: inserts the key at the index
position in .
L19-21
: removes a key from the list in .
L3-12
:
L3
: iterates all keys within the range .
L4-9
: finds the index of the maximum key within the range .
L11
: swaps the maximum key with the last key in the range .
Let us create the class inheriting AbstractSort
:
How is this sink()
method different from the one in the class?
L4-5
: turns the input array into a heap :
L4
: iterates from the parent of the key in the ending index .
L5
: sinks the key .
L8-11
: selection sort :
L8
: iterates all keys within the range .
L9
: swaps the maximum key with the beginning key in the range .
L10
: sinks to heapify .
For each key where and :
Keep swapping and until .
Let us create the class inheriting AbstractSort
:
L4
: iterates keys in the input array .
L5
: compares keys in the sublist of the input array .
Shell Sort groups keys whose gap is defined by a particular :
Knuth:
Hibbard:
Pratt:
Shell: , where
Let us create an abstract class, , inheriting AbstractSort
:
L6
: iterates the sequence where is the number of gaps in the sequence.
Let us create the class that populates the Knuth Sequence for ShellSort
:
L2
: populates the Knuth sequence up to the gap .
L13
: returns the index of the first key .
Why should we use as the largest gap in the sequence?
The codes for unit testing and benchmarking are available in :
L7
: the lambda expression returns the bucket index, the value in the 'th significant digit.
Main:
Test:
L9
: if the new key has a higher priority than its parent, them.
Java 8 introduced that enabled functional programming in Java. The following code takes each key
(as a variable) in keys
and applies it to pq.add()
:
Since Java 8, higher-order methods can be defined by parameterizing types of interfaces in the package.
How many assignments are made for the number of keys by the above version of MergeSort?
Let us create the class inheriting AbstractSort
:
Although Quick Sort is the fastest on average, the worst-case complexity is .
sorting algorithms that guarantee faster worst-case complexities than Quick Sort: Quick Sort for random cases and a different algorithm for the worst case.
L9
: returns as the maximum depth.
Does the max-depth need to be set to ?
Compare runtime speeds between ShellSortKnuth
and ShellSortQuiz
for random, ascending, and descending cases using the method.
Main:
Test:
Create a class inheriting under the package .
Override the sort()
method and evaluate your program for robustness and runtime speeds using :
Commit and push everything under the package.
Include any type of the keys that are .
Let us define the class inheriting AbstractBinaryNode
:
Let us define an abstract class, :
Let us create an abstract class that inherits the abstract class AbstractBinarySearchTree
:
Let us create the class, :
Create the class under the package that inherits by passing Integer
as the generic type.
Update the getEntities()
method that takes an input string and returns the list of , where each entity consists of the begin index (inclusive) and end index (exclusive) of the first and the last characters of the corresponding country name respectively as well as the ID of the country name.
See for an example of how to unit-test your approach.
Describe how the values in the array changes after each call in the main()
method.
Describe how the values in the array would change after calling find(0)
once all keys are added as above, assuming that the find()
method in DisjointSet
class uses the efficient approach:
For the method in the Graph
class, explain why the condition for the exception indicates that the graph includes a cycle.
Let us define the class under the package:
L1
: inherits the interface.
Let us create the interface that is inherited by all minimum spanning tree algorithms:
Let us create the class representing a directed edge under the package:
L1
: inherits the interface.
Let us create the class under the package:
L13
: returns true if all incoming edge lists are empty using the method.
L6
: uses the method that merges the list of edge lists into one list of edges.
L10
: uses the and methods to find vertices with no incoming edges.
Algorithm
Sequence
Compare
Swap
Insertion Sort
Adjacent
Shell Sort
Knuth
Algorithm
Search
Compare
Swap
Selection Sort
Heap Sort
Complexity
MergeSort
TimSort
QuickSort
IntroSort
Best
Worst
Average
This chapter discusses dynamic programming algorithms in comparison to recursions.
This section provides exercises for better understanding of minimum spanning trees.
Explain the worst-case complexities of the following algorithms in terms of V
and E
according to our implementations:
Prim's algorithm
Kruskal's algorithm
Chu-Liu-Edmonds' algorithm
Provide an example of a graph where Prim's and Kruskal's algorithms find different minimum spanning trees from the same graph and explain how these trees are found. If you cannot find an example, explain why these algorithms always find the same minimum spanning tree given the same graph.
Different MSTs can be found from the same graph only if there are multiple edges with the minimum weight for any of which can be polled from the priority queue. Consider cases where the compareTo()
method is overridden differently from the original implementation.
Write a report quiz7.pdf
that includes the complexity and comparison analyses.
This section explains the homework assignment regarding Autocomplete.
Most virtual keyboards provide the option of auto-complete, which gives a list of candidate words that you wish to type given a prefix you enter. For instance, when you type "sh", it gives a list of candidate words such as "she", "shell", "ship", etc.
If you select a word from the candidates, it should learn your selection as the top candidate for that prefix. For instance, if you select "ship" from the example above, the next time you enter the same prefix "sh", it should give a list of candidates where the first item is "ship" instead of "she".
Your task is to write a program that gives a list of candidate words for any prefix and learns the selected candidates.
Create a class called AutocompleteHW
under the autocomplete
package:
This class extends the abstract class Autocomplete
, which extends Trie
.
The value type of the generic T
can be a collection of strings (e.g., List<String>
).
You must create a constructor that takes two parameters, dict_file
and max
, and calls its super constructor, which reads all words in the dictionary file (e.g., dict.txt
).
Override the getCandidates()
method that takes a prefix and returns a list of candidate words matching the prefix:
The max-number of candidates in the list must be the return value of the getMax()
method.
The most recently selected candidate should appear as the 1st item, the 2nd most recently selected candidate should appear as the 2nd item, and so on.
The rest of the candidate list should be filled with the shortest words matching the prefix.
If there are more than one candidate with the same lengths, they should be sorted alphabetically.
Make sure the same candidate does not appear more than once.
Override the pickCandidate()
method that takes a prefix and a selected candidate, and saves the information:
This is the most recently selected candidate for that particular prefix. It must appear as the first item in the candidate list when the same prefix is entered next time.
Submit AutocompleteHW.java
to Canvas.
Create a class called AutocompleteHWExtra
under the autocomplete
package.
Feel free to change the generic type, representing the value type of each TrieNode
, to anything other than List<String>
(List<something else>
).
The getCandidates()
method:
Must show the most frequently picked candidate first, the 2nd-most frequently picked candidate next, and so on.
If there are more than one candidate with the same frequency, sort them by recency.
The rest of candidates should be filled as in the original task.
You must submit both AutocompleteHW
and AutocompleteHWExtra
to earn the extra credit.
Do not change the dictionary file. If you find anything peculiar about the dictionary file, please let me know so everyone works on the same copy of the dictionary file.
Please test your program yourself. We will evaluate your program using our unit tests and measure the performance (both speed and accuracy).
Take a look at Map if you are not familiar with methods in the standard library.
If you are having trouble with implementing getCandidates()
, think about how to traverse the trie given any node.
If you are having trouble with implementing pickCandidate()
, take a look at Trie#put
.
You are not allowed to use any type of Map to store candidates for this homework.
Your program should be able to handle prefixes or candidates that do not exist in the dictionary.
All picked candidates should be introduced as real words to the trie.
Whitespaces are not allowed as input; thus, you should trim all input strings.
This section provides exercises for better understanding in dynamic programming.
Write a report quiz9.pdf
that includes answers to the followings.
As n
increases from 1
to 10
, how many times does the auxiliary solve()
method get called recursively in HanoiRecursive
and HanoiDynamic
?
Is there clear patterns between n
and the number of the method calls made by these classes? Explain the patterns if exist.
Include answers to the followings in quiz9.pdf
:
Explain what the values of the dynamic table mean in the LCSDynamic
class.
LCSDynamic
pre-populates the dynamic table before making any recursive calls. Is it possible to find a LCS with dynamic programming by populating the dynamic table while making recursive class.
You may need a different type of a dynamic table to populate it while making recursive calls.
Create the LCSQuiz
class under the dynamic.lcs
package.
Update the solveAll()
that returns all longest common subsequences between two strings.
Search
Insert
Delete
Unbalanced
Balanced
This section discusses recursive and dynamic programming ways of finding longest common subsequence.
A subsequence is a sequence that can be derived from another sequence by deleting some or no elements without changing the order of the remaining elements.
Given a string "ABCDE"
Substring: {"A", "BC", "CDE", ...}
Subsequence: {all substrings, "AC", "ACE", ...}
Not subsequence: {"BA", "DAB", ...}
Longest common subsequence
The longest subsequence commonly shared by multiple strings.
e.g., “baal” is a LCS of “bilabial” and “balaclava”.
Can there be more than one longest common subsequence?
Application
Find the longest common subsequence in DNA (e.g., GAATGTCCTTTCTCTAAGTCCTAAG
).
Let us create the abstract class LCS
:
Let us create the LCSRecursive
inheriting LCS
:
L4
: no character is left to compare for either string.
L5
: when two characters match, move onto c[:i-1]
and d[:j-1]
.
L7
: gets the longest common subsequence between c[:i-1]
and d[:j]
.
L8
: gets the longest common subsequence between c[:i]
and d[:j-1]
.
L9
: returns the longest subsequence.
The followings demonstrate a recursive way of finding a LCS between two strings:
Let us create the LCSDynamic
class inheriting LCS
.
L4
: creates a dynamic table and passes it to the solver.
L12
: the dynamic table is pre-populated before any recursive calls.
The following show the dynamic table populated by the previous example:
Let us define the solve()
method:
L2
: no character is left to compare for either string.
L3
: when two characters match, move onto c[:i-1]
and d[:j-1]
.
L5
: gets the longest common subsequence between c[:i]
and d[:j-1]
.
L6
: gets the longest common subsequence between c[:i-1]
and d[:j]
.
L9
: returns the longest subsequence by looking up the values in the dynamic table.
The followings demonstrate how to find a LCS using the dynamic table:
This section provides exercises for better understanding in network flow
An augmenting path is a directed path between the source and target vertices; no need to worry about residuals for this quiz.
Create the NetworkFlowQuiz
class under the graph.flow
package.
Update the getAugmentingPaths()
method that returns all augmenting paths between the specific source and target vertices using depth-first traverse.
Create the NetworkFlowQuizExtra
class and update the getAugmentingPaths()
method such that it uses breadth-first traverse instead of depth-first traverse.
Given the following graph where there are two source vertices, and , and two target vertices, and , define the objective functions and their constraints to find the (1) max-flow and (2) min-cut using the simplex algorithm:
Write quiz8
that includes:
The worst-case complexity of your getAugmentingPaths()
method.
An explanation of when the depth-first traverse is preferred over the breadth-first traverse and vice versa to find augmenting paths.
An objective function and is constraints to find max-flow.
An objective function and is constraints to find min-cut.
This section discusses Prim's Algorithm that finds a minimum spanning tree in an undirected graph.
Prim's algorithm finds a minimum spanning tree by finding the minimum edge among all incoming edges of visited vertices.
Let us define the MSTPrim
class inheriting the MST
interface:
L9
: adds the vertex 0
to the visited set and its incoming edges to the queue.
L11
: iterates through all incoming edges:
L14
: if the visited set contains the source vertex, adding the edge would create a cycle.
L16
: if the tree contains v-1
number of edges, a spanning tree is found.
L17
: repeats the process by adding the source of the edge.
The add()
method can be defined as follows:
L4-7
: adds all incoming edges that have not been visited to the queue.
What is the worst-case complexity of Prim's Algorithm?
This section discusses recursive, iterative, and dynamic programming ways of generating the Fibonacci sequence.
The Fibonacci sequence is a sequence of numbers generated as follows:
where
Let us create the Fibonacci
interface that contains one abstract method:
L2
: returns the 'th Fibonacci number.
The followings demonstrate how to generate the Fibonacci sequence using recursion:
Let us create the FibonacciRecursive
class inheriting Fibonacci
:
What is the worst-case complexity of this recursive algorithm?
The followings demonstrate how to generate the Fibonacci sequence using dynamic programming:
Let us create the FibonacciDynamic
class inheriting Fibonacci
:
L7
: creates a dynamic table represented by an array where the 'th dimension is to be filled with the 'th Fibonacci number.
L16
: returns the value in the dynamic table if exists.
L17
: fills in the dynamic table using recursion.
What is the worst-case complexity of this dynamic programming algorithm?
For this particular task, the efficiency can be further enhanced using iteration. Let us create the class FibonacciIterative
inheriting Fibonacci
:
The followings show performance comparisons:
This section discusses recursive and dynamic programming ways of solving the Tower of Hanoi problem.
The followings demonstrate the Tower of Hanoi problem:
Let us create the abstract class Hanoi
:
L9
: returns a list of movements to solve the given problem.
n
: the total number of rings.
source
: the ID of the source tower.
intermediate
: the ID of the intermediate tower.
destination
: the ID of the destination tower.
L11
: returns the string representation of a movement.
Let us create the HanoiRecursive
class inheriting Hanoi
:
Let us create the HanoiDynamic
class inheriting Hanoi
:
The followings show performance comparison:
Quiz 1: Java Essentials
Create a class called under the main package that extends the class.
Override the method so that the add()
method in can handle integers with different signs.
Create the class under the test package.
Test the correctness of your LongIntegerQuiz
using the unit tests.
Add more tests for a more thorough assessment if necessary.
What is the advantage of using Generics?
How do you make the class you define comparable?
What is the advantage of overriding member methods in the Object class?
What kind of methods should be defined as static?
1. Commit and push everything under the following packages to your GitHub repository:
This section discusses a type of balanced binary search trees called Red-Black Trees.
Red-Black trees preserve the balance at most time by satisfying the following conditions:
Every node is either red or black.
The root and all leaves (null
) are black.
Every red node must have two black children.
Every path from a given node to any of its leaves goes through the same number of black nodes.
L2
: if is_red
is true
, the color of this node is red; otherwise, it is black.
L6
: the color of the node is black by default upon instantiation.
Finally, let us override balance()
method that handles 3 cases:
L3
: sets the color to black if node
is the root.
L5
: if the color of node
's parent is red:
L8
: checks if the color of node
's uncle is also red.
L10
: the color of node
's uncle is black.
What about the case when the color of
node
's parent is black?
The following shows a method that handles when both the parent and the uncle are red:
Are there any structural changes after this method is called?
The following shows a method that handles when the parent is red but the uncle is black:
L7
: the case of left zig-zag.
L11
: the case of right zig-zag.
L19
: the case of left linear.
L21
: the case of right linear.
The followings demonstrate how the balance()
method works in Red-Black trees:
How does a Red-Black tree know when it is unbalanced?
Main:
Test:
2. Submit answers to the above to Canvas.
Let us create the class inheriting AbstractBinaryNode
:
Let us the class inheriting AbstractBalancedBinarySearchTree
:
This section discusses a type of balanced binary search trees called AVL Trees.
AVL (Adelson-Velsky and Landis) trees preserve the balance at all time by keeping track of the height of every node through a balance factor.
Let us create the AVLNode
class inheriting AbstractBinaryNode
:
L2
: keeps track of the height of this node.
L6
: a node with no children has the height of 1
.
We then override the setLeftChild()
and setRigthChild()
methods such that the height of this node gets adjusted accordingly with the new child by calling the resetHeights()
method:
L15
: adjusts the height of this node and its ancestors, recursively.
L20-22
: retrieve the height of this node.
L24-27
: resets the height of this node if different and makes the recursive call to its parent.
What are the heights of
3
,5
, and7
in the figure below when the height of4
changes from 3 to 4?
Finally, we define the getBalanceFactor()
method that returns the height difference between the left-subtree and the right-subtree of this node:
How can we tell if the node is unbalanced by using the balance factor?
Let us create the AVLTree
class that inherits AbstractBalancedBinarySearchTree
and override the createNote()
, rotateLeft()
, and rotateRight()
methods:
L10,16
: resets the height of node
after rotation.
node.resetHeights()
does not reset heights of nodes in its subtree. Would this cause an issue?
We then override the balance()
method:
L6
: the left-subtree is longer.
L9-10
: the case of left zig-zag.
L12
: the case of left linear.
L14
: the right-subtree is longer.
L17
: the case of right zig-zag.
L20
: the case of right linear.
The followings demonstrate how the balance()
method works in AVL Trees:
What is the worst-case complexity of the
balance()
method inAVLTree
?
This section describes how to implement a flow network.
A flow network is a directed graph that has the following properties:
Let us define the MaxFlow
class that keeps track of the amount of flows in every edge used to find the maximum flow:
L2
: consists of (edge, flow) as the key-value pair.
L3
: stores the total amount of flows to the target vertices.
L13-14
: initializes all flows to 0
.
Let us define getter methods:
L5
: returns the remaining residual that can be used to push more flows.
L6
: the edge weight represents the total capacity.
Finally, let us define setter methods:
L1
: takes the path from a source and a target.
L2
: updates the flow of every edge in the path with the specific flow.
L3
: updates the overall flow.
L6
: adds the specific flow to the specific edge.
This section discusses Chu-Liu-Edmonds' Algorithm that finds a MST in a directed graph.
Chu-Liu-Edmonds' Algorithm takes the following steps to find a MST in a directed graph:
Initially, every vertex is considered a subtree.
For each subtree, keep 1 incoming edge with the minimum weight.
If there is no cycle, go to #5.
If there is a cycle,
Merge subtrees with the cycle into one and update scores for all incoming edges to this merged subtree, and goto #2.
For each vertex in the subtree, add the weight of its outgoing edge chain to its incoming edges not in the subtree.
Break all cycles by removing edges that cause multiple parents.
The following demonstrates how Chu-Liu-Edmonds' Algorithm find a minimum spanning tree:
What is the logic behind updating the edge weights related to the cycles?
The following explains the weight updates in details:
This section describes the implementation of Ford-Fulkerson Algorithm
Let us define the class that consists of a subset of vertices and edges from the original graph:
Let us define helper methods:
Ford-Fulkerson algorithm finds the maximum flow from a flow network as follows:
L2
: indicates one source and one target vertices.
L6
: iterates as long as it can find an augmenting path
L7
: finds the edge with the minimum capacity in the augmenting path.
L8
: updates the edges in the path with the flow.
Let us define the getMin()
method:
Finally, let us define the getAugmentingPath()
method:
L2
: once the source reaches the target, it found an augmenting path.
L6
: adding the source vertex would cause a cycle.
L7
: cannot push the flow when there is no residual left.
L10
: recursively finds the augmenting path by switching the target.
Let us consider the following graph:
As shown, our implementation of Ford-Fulkerson Algorithm does not always guarantee to find the maximum flow correctly. To fix this issue, we need to implement backward pushing:
The backward pushing can be performed after the applying the flow to all edges as in the implementation above (see the code in the "With Backward Pushing" tab).
Finally, the updateBackward() method can be implemented as follows:
Let us create the class: