Homework

HW0: Getting Started

Environment Setup

Python

  • Install Python 3.11.x.

  • Earlier versions of Python may not be compatible with this course.
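To confirm which interpreter is active, a quick check along the following lines can be run. This is only a minimal sketch: the helper name `meets_course_requirement` is hypothetical, and the 3.11 minimum is taken from the requirement stated above.

```python
import sys

# Minimum version assumed from the requirement above (Python 3.11.x).
REQUIRED = (3, 11)


def meets_course_requirement(version_info=sys.version_info) -> bool:
    """Return True if the given (major, minor, ...) version is at least REQUIRED."""
    return tuple(version_info[:2]) >= REQUIRED


if __name__ == "__main__":
    # Show the running interpreter's version and whether it is recent enough.
    print("Running Python", sys.version.split()[0])
    print("Meets requirement:", meets_course_requirement())
```

If the second line reports False, the interpreter on your PATH is older than the version this course assumes.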

GitHub

  1. Log in to GitHub (create an account if you do not have one).

  2. Create a new repository called nlp-essentials and make it private.

  3. From the [Settings] menu, add the instructors as collaborators of this repository.

PyCharm

Install PyCharm on your local machine:

  • The following instructions assume that you have "PyCharm 2023.3.x Professional Edition".

  • You can get the professional version by applying for an academic license.

Configure your GitHub account:

  1. Go to [Settings] - [Version Control] - [GitHub].

  2. Press [+], select Log in via GitHub, and follow the procedure.

Create a new project:

  1. Press the [Get from VCS] button on the Welcome screen.

  2. Choose [GitHub] on the left menu, select the nlp-essentials repository, and press [Clone] (make sure the directory name is nlp-essentials).

Set up an interpreter:

  1. Go to [Settings] - [Project: nlp-essentials] - [Python Interpreter].

  2. Click Add Interpreter and select Add Local Interpreter.

  3. In the dialog that opens, choose [Virtualenv Environment] on the left menu, configure as follows, then press [OK]:

    • Environment: New

    • Location: SOME_LOCAL_PATH/nlp-essentials/venv

    • Base interpreter: Python 3.11 (or the Python version you installed)
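Once the interpreter is configured, one way to check that the project really runs inside the new virtual environment is to compare `sys.prefix` with `sys.base_prefix`. The snippet below is a small sketch of that idea; the function name `in_virtualenv` is chosen here for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtualenv())
```

When run from the PyCharm terminal of this project, the interpreter path is expected to lie under the venv location chosen above.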

Tasks

Install Package

  1. Open a terminal by clicking [Terminal] at the bottom (or go to [View] - [Terminal]).

  2. Upgrade pip (if necessary) by entering the following command into the terminal:

    python -m pip install --upgrade pip

  3. Install setuptools (if necessary) using the following command:

    pip install setuptools

  4. Install the ELIT Tokenizer with the following command:

    pip install elit_tokenizer

  5. If the terminal prints "Successfully installed ...", the packages have been installed.
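If the terminal output has already scrolled away, the installation can also be checked from Python. The following is a hedged sketch using the standard `importlib.metadata` module; the helper `installed_version` is a hypothetical name, and the distribution names are assumed to match those used in the pip commands above.

```python
from importlib import metadata


def installed_version(distribution: str):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names taken from the pip commands above.
    for name in ("pip", "setuptools", "elit_tokenizer"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

A line reading "not installed" suggests that the corresponding pip command was run with a different interpreter than the one in the project's virtual environment.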

Run Program

1. Create a package called src under the nlp-essentials directory.

PyCharm may automatically create the __init__.py file under src, which is required for Python to recognize the directory as a package, so leave the file as it is.

2. Create a homework package under the src package.

3. Create a Python file called getting_started.py under homework and copy the following code into it:

from elit_tokenizer import EnglishTokenizer

__author__ = 'Jinho D. Choi'

if __name__ == '__main__':
    text = 'Emory NLP is a research lab in Atlanta, GA. It is founded by Jinho D. Choi in 2014. Dr. Choi is a professor at Emory University.'
    tokenizer = EnglishTokenizer()
    # decode() tokenizes the text and records where each token occurs in it.
    sentence = tokenizer.decode(text)
    print(sentence.tokens)   # list of token strings
    print(sentence.offsets)  # list of (begin, end) character offsets, one per token

If PyCharm prompts you to add getting_started.py to git, press [Add].

4. Run the program by clicking [Run] - [Run 'getting_started']. Alternatively, click the green triangle next to the line if __name__ == '__main__': and select Run 'getting_started'.

5. If you see the following output, your program ran successfully:

['Emory', 'NLP', 'is', 'a', 'research', 'lab', 'in', 'Atlanta', ',', 'GA', '.', 'It', 'is', 'founded', 'by', 'Jinho', 'D.', 'Choi', 'in', '2014', '.', 'Dr.', 'Choi', 'is', 'a', 'professor', 'at', 'Emory', 'University', '.']
[(0, 5), (6, 9), (10, 12), (13, 14), (15, 23), (24, 27), (28, 30), (31, 38), (38, 39), (40, 42), (42, 43), (44, 46), (47, 49), (50, 57), (58, 60), (61, 66), (67, 69), (70, 74), (75, 77), (78, 82), (82, 83), (84, 87), (88, 92), (93, 95), (96, 97), (98, 107), (108, 110), (111, 116), (117, 127), (127, 128)]
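The second list holds one (begin, end) pair per token, where begin is inclusive and end is exclusive, so slicing the original text with a pair gives back the corresponding token. The sketch below illustrates this with the first sentence only, using values copied from the output above; it does not require the tokenizer to be installed, and the helper name `spans` is hypothetical.

```python
text = 'Emory NLP is a research lab in Atlanta, GA.'

# Tokens and offsets for the first sentence, copied from the output above.
tokens = ['Emory', 'NLP', 'is', 'a', 'research', 'lab', 'in', 'Atlanta', ',', 'GA', '.']
offsets = [(0, 5), (6, 9), (10, 12), (13, 14), (15, 23), (24, 27),
           (28, 30), (31, 38), (38, 39), (40, 42), (42, 43)]


def spans(source: str, pairs):
    """Slice the source text with each (begin, end) pair; end is exclusive."""
    return [source[begin:end] for begin, end in pairs]


recovered = spans(text, offsets)
print(recovered == tokens)  # True: each offset pair recovers its token
```

Note that punctuation such as ',' shares a boundary with the preceding token (38 is both the end of 'Atlanta' and the start of ','), which reflects that no space separates them in the text.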

Commit & Push

1. Create a .gitignore file under the nlp-essentials directory and copy the following content into it:

.idea/
venv/

2. Add the following files to git by right-clicking on them and selecting [Git] - [Add] (if they have not been added already):

  • getting_started.py

  • .gitignore

Once the files are added to git, they should turn green. If not, restart PyCharm and try to add them again.

3. Commit and push your changes to GitHub:

  • Right-click on nlp-essentials.

  • Select [Git] - [Commit Directory].

  • Enter a commit message (e.g., Submit HW0).

  • Press the [Commit and Push] button.

Make sure you both commit and push, not just commit.

4. Verify that the above files have been pushed to your GitHub repository.
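Besides looking at the repository page on GitHub, the push can be double-checked locally. The sketch below is one possible approach, not part of the course tooling: it parses the branch header printed by `git status --short --branch`, which includes a marker such as `[ahead 1]` when local commits have not been pushed. The function names are hypothetical, and the check assumes that git is on your PATH and that the branch tracks a remote branch (without an upstream, no "ahead" marker is printed and the result is 0).

```python
import re
import subprocess


def commits_ahead(branch_line: str) -> int:
    """Parse the '## ...' header line of `git status --short --branch`.

    Returns the number of local commits not yet pushed to the upstream
    branch, or 0 when the header reports none.
    """
    match = re.search(r"\[ahead (\d+)", branch_line)
    return int(match.group(1)) if match else 0


def unpushed_commits(repo_dir: str = ".") -> int:
    """Run git in repo_dir and report the number of unpushed commits."""
    result = subprocess.run(
        ["git", "status", "--short", "--branch"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    lines = result.stdout.splitlines()
    # The first line is the branch header, e.g. '## main...origin/main [ahead 1]'.
    return commits_ahead(lines[0]) if lines else 0
```

Calling `unpushed_commits()` from the nlp-essentials directory should return 0 after a successful Commit and Push; a positive number indicates commits that exist only on your machine.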

Submission

Submit the URL of your GitHub repository to Canvas.


Copyright © 2023 All rights reserved