CS|QTM|LING-329: Computational Linguistics (Spring 2025)
Time: TBA
Location: TBA
Jinho Choi : Associate Professor of Computer Science : Office Hours → TBA
Catherine Baker : MS Student in Computer Science : Office Hours → TuTh 11 AM - 12:30 PM, Zoom (ID: 964 3750 1501, PW: posted in Canvas)
Zelin Zhang : Ph.D. student in Computer Science and Informatics : Office Hours → MW 11:20 AM - 12:50 PM, Zoom (ID: 975 2341 9724, PW: posted in Canvas)
Homework: 70%
Team Formation: 3%
Project Proposal: 12%
Live Demonstration: 15%
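As a rough illustration of how the four weights above combine, the sketch below computes a weighted total. It assumes each component is scored as a percentage from 0 to 100; that scale, the dictionary keys, and the function name `weighted_total` are illustrative assumptions, not part of the official grading procedure.

```python
# Weights (in percent) taken from the grading breakdown above.
WEIGHTS = {
    "homework": 70,
    "team_formation": 3,
    "project_proposal": 12,
    "live_demonstration": 15,
}


def weighted_total(scores):
    """Combine per-component percentages (assumed 0-100) into one total."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    # Hypothetical scores, used only to show the arithmetic.
    example = {
        "homework": 90,
        "team_formation": 100,
        "project_proposal": 80,
        "live_demonstration": 70,
    }
    print(weighted_total(example))
```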
Your work is governed by the Emory Honor Code. Honor code violations (e.g., copying from any source, including colleagues and internet sites) will be referred to the Emory Honor Council.
Requests for absence/rescheduling due to severe personal events (such as health, family, or personal reasons) impacting course performance must be supported by a letter from the Office of Undergraduate Education.
Each topic will include homework that combines quizzes and programming assignments to assess your understanding of the subject matter.
Assignments must be submitted individually. While discussions are allowed, your work must be original.
Late submissions within a week will be accepted with a grading penalty of 15% but will not be accepted after the solutions are discussed in class.
Each section includes questions that explore the content in more depth; their answers will be discussed in class.
While certain questions may have multiple valid answers, the grading will be based on the responses discussed in class, and alternative answers will be disregarded. This approach allows us to distinguish between answers discussed in class and those generated by AI tools like ChatGPT.
You are encouraged to use any code examples provided in this book.
You can invoke any APIs provided in the course packages (under the src/ directory).
Feel free to create additional functions and variables in the assigned Python file. For each homework, ensure that all your implementations are included in the respective Python file located under the src/homework/ directory.
Usage of packages not covered in the corresponding chapter is prohibited. Ensure that your code does not rely on the installation of additional packages, as we will not be able to execute your program for evaluation if external dependencies are needed.
You are expected to:
Form a team of 3-4 members.
Give a pitch presentation to showcase your idea for the project.
Provide a live demonstration to illustrate the details and potential of your project.
Everyone in the same group will receive the same grade for the project, except for the individual portion.
Your project will undergo evaluation based on various criteria, including originality, feasibility, and potential impact.
Your project will also undergo peer assessment, which will factor into your project grade.
Participation in project presentations and live demonstrations is compulsory. Failure to attend any of these events will result in a zero grade for the respective activity. In the event of unavoidable absence due to severe personal circumstances, a formal letter from the Office of Undergraduate Education must accompany any excuses.
You can earn up to 3 extra credits by helping us improve this online book. If you wish to contribute, please submit an issue to our GitHub repository using the "Online Book" template. Upon verification, you will receive credits based on the following criteria:
Content enhancements (e.g., additional explanations, test codes): 0.3 points
Code bug fixes: 0.2 points
Identification and correction of typos (and other obvious mistakes): 0.1 points
Prior to submission, please check for existing issues to avoid duplication. If multiple submissions of the same (or very similar) issues occur, only the first one will be credited.
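The point values and the 3-credit cap above can be tallied as in the following sketch. It assumes that each verified issue earns the listed points once and that the cap applies to the sum; the function name `extra_credit` and its parameters are hypothetical, and how the credits enter the final grade is not specified here.

```python
# Point values from the criteria above, stored in tenths of a point
# so the sum stays exact before converting back to points.
TENTHS_PER_CONTENT = 3   # content enhancement: 0.3 points
TENTHS_PER_BUG_FIX = 2   # code bug fix: 0.2 points
TENTHS_PER_TYPO = 1      # typo correction: 0.1 points
CAP_TENTHS = 30          # at most 3 extra credits


def extra_credit(content_enhancements, bug_fixes, typos):
    """Return the extra credit earned, capped at 3 (assumed per-issue credit)."""
    tenths = (
        TENTHS_PER_CONTENT * content_enhancements
        + TENTHS_PER_BUG_FIX * bug_fixes
        + TENTHS_PER_TYPO * typos
    )
    return min(tenths, CAP_TENTHS) / 10


if __name__ == "__main__":
    print(extra_credit(content_enhancements=2, bug_fixes=1, typos=3))
```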
HW0: Getting Started
Install Python version 3.10 or higher. Earlier versions are not compatible with this course.
You are encouraged to install the latest version of Python. Please review the new features introduced in each version.
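One way to confirm that the installed interpreter meets the 3.10 minimum is a short check like the sketch below. The helper name `meets_minimum` is an illustrative assumption; only `sys.version_info` from the standard library is used.

```python
import sys

MIN_VERSION = (3, 10)  # minimum version stated for this course


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} meets the 3.10 minimum.")
    else:
        print(f"Python {current} is too old; install 3.10 or higher.")
```

Running the file with the interpreter you plan to use for the course reports which case applies.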
Log in to GitHub (create an account if you do not have one).
Create a new repository called nlp-essentials and set it to private.
From the [Settings] menu, add the following as a collaborator to this repository: .
Install PyCharm on your local machine:
The following instructions assume that you have "PyCharm 2023.3.x Professional Edition".
You can get the professional version by applying for an .
Configure your GitHub account:
Go to [Settings] - [Version Control] - [GitHub].
Press [+], select Log in via GitHub, and follow the procedure.
Create a new project:
Press the [Get from VCS] button on the Welcome prompt.
Choose [GitHub] on the left menu, select the nlp-essentials repository, and press [Clone] (make sure the directory name is nlp-essentials).
Set up an interpreter:
Go to [Settings] - [Project: nlp-essentials] - [Project Interpreter].
Click Add Interpreter and select Add Local Interpreter.
In the prompted window, choose [Virtualenv Environment] on the left menu, configure as follows, then press [OK]:
Environment: New
Location: SOME_LOCAL_PATH/nlp-essentials/venv
Base interpreter: Python 3.11 (or the Python version you installed)
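Once the interpreter is configured, a best-effort way to confirm that code runs inside the new virtual environment is to compare the interpreter prefixes, as sketched below. The function name `in_virtual_environment` is an illustrative assumption; the check relies on `sys.prefix`, `sys.base_prefix`, and the `sys.real_prefix` attribute that older virtualenv releases set.

```python
import sys


def in_virtual_environment():
    """Best-effort check for whether this interpreter runs in a virtual environment."""
    # Older virtualenv releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make sys.prefix differ from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtual_environment())
```

If the setup succeeded, the printed interpreter path should point inside the venv directory chosen as the environment location.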
Open a terminal by clicking [Terminal] at the bottom (or go to [View] - [Terminal]).
If the terminal displays "Successfully installed ...", the packages are installed on your machine.
PyCharm may automatically create the __init__.py file under src, which is required for Python to recognize the directory as a package, so leave the file as it is.
If PyCharm prompts you to add getting_started.py to git, press [Add].
4. Run the program by clicking [Run] - [Run 'getting_started']. Alternatively, click the green triangle (L20) and select Run 'getting_started'.
5. If you see the following output, your program runs successfully.
2. Add the following files to git by right-clicking on them and selecting [Git] - [Add] (if not already added):
getting_started.py
.gitignore
Once the files are added to git, they should turn green. If not, restart PyCharm and try to add them again.
3. Commit and push your changes to GitHub:
Right-click on nlp-essentials.
Select [Git] - [Commit Directory].
Enter a commit message (e.g., Submit Quiz 0).
Press the [Commit and Push] button.
Make sure you both commit and push, not just commit.
4. Check if the above files are properly pushed to your GitHub repository.
Submit the URL of your GitHub repository to Canvas.
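Before submitting, it may help to confirm locally that the files named in this assignment exist where expected. The sketch below checks for the src package, the homework package, getting_started.py, and .gitignore relative to the nlp-essentials directory; the function name `missing_paths` is an illustrative assumption, and the script does not check whether the files were pushed to GitHub.

```python
from pathlib import Path

# Paths named in this assignment, relative to the nlp-essentials directory.
EXPECTED_PATHS = (
    "src",
    "src/homework",
    "src/homework/getting_started.py",
    ".gitignore",
)


def missing_paths(project_root):
    """Return the expected paths that do not exist under `project_root`."""
    root = Path(project_root)
    return [path for path in EXPECTED_PATHS if not (root / path).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the nlp-essentials directory.
    missing = missing_paths(Path.cwd())
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected files and directories are present.")
```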
Upgrade (if necessary) by entering the following command into the terminal:
Install (if necessary) using the following command:
Install the with the following command:
1. Create a package called src under the nlp-essentials directory.
2. Create a homework package under the src package.
3. Create a Python file called getting_started.py under homework and copy the code:
1. Create a .gitignore file under the nlp-essentials directory and copy the content:
By Jinho D. Choi (2023 Edition)
Natural Language Processing (NLP) is a vibrant field in Artificial Intelligence that seeks to create computational models to understand, interpret, and generate human language. NLP technology has become deeply ingrained in our daily lives through various applications, evolving at an unprecedented pace. Understanding how NLP works helps you get more out of these applications in your daily life.
This course focuses on establishing a solid foundation in the core principles essential for modern NLP techniques. Starting with the basics of text processing, you will learn how to manipulate text to enhance data quality for developing NLP models. Next, we will delve into language modeling that enables computational systems to understand and generate human language and explore vector space models that convert human language into machine-readable vector representations.
Moving forward, we will cover distributional semantics, a technique for creating word embeddings based on their global contextual usage, and adapt them for sequence modeling to tackle NLP tasks that are inherently structured around sequences of words. We will also delve into contextual representations that capture the subtleties and nuances of language by considering local context. Finally, we will explore cutting-edge topics, including large language models and their effects on NLP tasks and applications.
Throughout the course, several quizzes and Python programming assignments will further deepen your understanding of the concepts and the practice of NLP. By the end of the term, you can expect to possess the knowledge and skills necessary to navigate the swiftly evolving landscape of NLP.
Introduction to Python Programming
Introduction to Machine Learning
Each section has its own set of references. We highly recommend you read the ones marked with asterisks (*), as they provide an in-depth understanding of those subjects.
CS|QTM|LING-329: Computational Linguistics (Spring 2024)
01/17
01/22
01/24: (continue)
01/29: (continue)
01/31: (continue)
02/05
02/07: (continue)
02/12: (continue)
02/14: (continue)
02/19
02/21: (continue)
02/26: (continue)
02/28: (continue)
03/04
03/06: (continue)
03/11: Spring Break
03/13: Spring Break
03/18
03/20: (continue)
03/25: (continue)
03/27: (continue)
04/01: (continue), HW4
04/03: Progress Report
04/08: (continue)
04/10
04/15
04/17: HW5
04/22
04/24: (continue)
04/29: HW6