N-gram Language Models, Jurafsky and Martin, Chapter 3 in Speech and Language Processing (3rd ed.), 2023.
Efficient Estimation of Word Representations in Vector Space, Mikolov et al., ICLR, 2013. <- Word2Vec
GloVe: Global Vectors for Word Representation, Pennington et al., EMNLP, 2014.
Deep Contextualized Word Representations, Peters et al., NAACL, 2018. <- ELMo
Attention is All You Need, Vaswani et al., NIPS, 2017. <- Transformer
Generating Wikipedia by Summarizing Long Sequences, Liu et al., ICLR, 2018.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin et al., NAACL, 2019.
Neural Machine Translation of Rare Words with Subword Units, Sennrich et al., ACL, 2016. <- Byte-Pair Encoding (BPE)
Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, Wu et al., arXiv, 2016. <- WordPiece
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing, Kudo and Richardson, EMNLP, 2018.
Improving Language Understanding by Generative Pre-Training, Radford et al., OpenAI, 2018. <- GPT-1
Language Models are Unsupervised Multitask Learners, Radford et al., OpenAI, 2019. <- GPT-2
Language Models are Few-Shot Learners, Brown et al., NeurIPS, 2020. <- GPT-3
Consider that you want to extract someone's call name(s) during a dialogue in real time:
Design a prompt that extracts all call names provided by the user.
In "My friends call me Pete, my students call me Dr. Parker, and my parents call me Peter.", how does the speaker want to be called? Respond in the following JSON format: {"call_names": ["Mike", "Michael"]}
Let us write a function that takes the user input and returns the GPT output in the JSON format:
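The original listing is not reproduced here; the `#N` notes below refer to its line numbers, counted from the `def` line. A minimal sketch along those lines, assuming the pre-1.0 `openai` package (installed later on this page, with the API key already set) and the `gpt-3.5-turbo` model:

```python
import re
import openai

def gpt_completion(input: str, regex: re.Pattern = None) -> str:
    response = openai.ChatCompletion.create(           # lines 2-6: query the model and
        model='gpt-3.5-turbo',                         # take the message content from
        messages=[{'role': 'user', 'content': input}]  # the first choice
    )
    output = response['choices'][0]['message']['content'].strip()

    if regex is not None:                              # lines 8-10: if a pattern is given,
        m = regex.search(output)                       # keep only the matching span
        output = m.group().strip() if m else None

    return output
```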
#2-6: uses the model to retrieve the GPT output.
#8-10: uses the regular expression (if provided) to extract the output in the specific format.
Let us create a macro called MacroGPTJSON:
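A sketch consistent with the notes below (lines counted from the `class` line); the `Macro` base class comes from `emora_stdm`, and the loose JSON pattern on line 6 stands in for whatever check the original builds from the example output:

```python
import json, re
from typing import Any, Callable, Dict
from emora_stdm import Macro

class MacroGPTJSON(Macro):
    def __init__(self, request, full_ex, empty_ex=None, set_vars=None):
        self.request = request                                    # the task to request
        self.full_ex = json.dumps(full_ex)                        # example with all values filled
        self.empty_ex = json.dumps(empty_ex) if empty_ex else ''  # example with empty collections
        self.check = re.compile(r'\{.*\}', re.S)                  # loose pattern to locate a JSON object
        self.set_vars = set_vars                                  # custom variable setter (optional)
```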
#3: the task to be requested regarding the user input (e.g., How does the speaker want to be called?).
#4: the example output where all values are filled (e.g., {"call_names": ["Mike", "Michael"]}).
#5: the example output where all collections are empty (e.g., {"call_names": []}).
#7: a function that takes the STDM variable dictionary and the JSON output dictionary and sets the necessary variables.
Override the run method in MacroGPTJSON:
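A sketch of the method (lines counted from the `def` line), reusing `gpt_completion` and the fields set above; the exact prompt wording and `ngrams.text()` for the raw user input are assumptions:

```python
def run(self, ngrams, vars, args):
    examples = f'{self.full_ex} or {self.empty_ex} if unavailable' if self.empty_ex else self.full_ex
    prompt = f'{self.request} Respond in the JSON format: {examples}\nInput: {ngrams.text().strip()}'
    response = gpt_completion(prompt, self.check)
    if not response: return False

    try:
        d = json.loads(response)
    except json.JSONDecodeError:
        print(f'Invalid JSON: {response}')
        return False

    if self.set_vars:
        self.set_vars(vars, d)
    else:
        vars.update(d)
    return True
```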
#2-3: creates an input prompt for the GPT API.
#4-5: retrieves the GPT output using the prompt.
#7-11: checks if the output is in a proper JSON format.
#13-14: updates the variable table using the custom function.
#15-16: updates the variable table using the same keys as in the JSON output.
Let us create another macro called MacroNLG:
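A sketch (lines counted from the `class` line); the `run` override simply delegates to the stored function:

```python
from typing import Any, Callable, Dict
from emora_stdm import Macro

class MacroNLG(Macro):
    def __init__(self, generate: Callable[[Dict[str, Any]], str]):
        self.generate = generate

    def run(self, ngrams, vars, args):
        return self.generate(vars)
```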
#3: a function that takes a variable table and returns a string output.
Finally, we use the macros in a dialogue flow:
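A minimal sketch of such a flow; the states, prompts, and macro names are placeholders, and `get_call_name` is defined with the helper methods below:

```python
from emora_stdm import DialogueFlow

transitions = {
    'state': 'start',
    '`Hello. How should I call you?`': {
        '#SET_CALL_NAMES': {
            '`Nice to meet you,` #GET_CALL_NAME `!`': 'end'
        },
        'error': {
            '`Sorry, I did not catch your name.`': 'end'
        }
    }
}

macros = {
    'SET_CALL_NAMES': MacroGPTJSON(
        'How does the speaker want to be called?',
        {'call_names': ['Mike', 'Michael']},
        {'call_names': []}),
    'GET_CALL_NAME': MacroNLG(get_call_name)
}

df = DialogueFlow('start', end_state='end', macros=macros)
df.load_transitions(transitions)
df.run()
```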
The helper methods can be as follows:
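A sketch of hypothetical helpers; with this layout, line 6 (counted from the first `def`) holds the availability check referenced below:

```python
from typing import Any, Dict

def get_call_name(vars: Dict[str, Any]) -> str:
    ls = vars.get('call_names')
    return ls[0] if ls else None

def has_call_names(vars: Dict[str, Any]) -> bool:
    return bool(vars.get('call_names'))  # the condition to check the information
```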
#6: the condition to check whether the information is available.
Revisit your Quiz 2 and improve its language understanding capability using a large language model such as GPT.
Use ChatGPT to figure out the right prompts.
Use your trial credits from OpenAI to test the APIs.
Update the code to design a dialogue flow for the assigned dialogue system.
Create a PDF file quiz5.pdf that describes the approach (e.g., prompt engineering) and how the large language model improved over the limitations you described in Quiz 2.
Answer the following questions in quiz5.py:
What are the limitations of the Bag-of-Words representation?
Describe the Chain Rule and Markov Assumption and how they are used to estimate the probability of a word sequence.
Explain how the Word2Vec approach uses feed-forward neural networks to generate word embeddings. What are the advantages of the Word2Vec representation over the Bag-of-Words representation?
Explain what patterns are learned by the multi-head attention of a Transformer. What are the advantages of the Transformer embeddings over the Word2Vec embeddings?
Create an account for OpenAI and log in to your account.
Click your icon on the top-right corner and select "View API keys":
Click the "+ Create new secret key" button and copy the API key:
Make sure to save this key in a local file. If you close the dialog without saving, you cannot retrieve the key again and will have to create a new one.
Create a file openai_api.txt under the resources directory and paste the API key into the file so that it contains only one line showing the key.
Add openai_api.txt to the .gitignore file:
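For example, a single entry is enough (assuming the file sits under resources as described above):

```
resources/openai_api.txt
```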
Do not share this key with anyone or push it to any remote repository (including your private GitHub repository).
Open the terminal in PyCharm and install the OpenAI package:
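For example (the ChatCompletion examples below assume a pre-1.0 version of the package):

```bash
pip install "openai<1.0"
```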
Create a function called api_key() as follows:
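A minimal sketch, with line 4 (counted from the `import` line) holding the path; the path matches the resources/openai_api.txt file created above:

```python
import openai

def api_key():
    path = 'resources/openai_api.txt'  # the file containing the OpenAI API key
    with open(path) as fin:
        openai.api_key = fin.readline().strip()
```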
#4: specifies the path of the file containing the OpenAI API key.
Retrieve a response by creating a ChatCompletion request:
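A minimal sketch, assuming api_key() has been called; the model name and content are placeholders, and the lines match the notes below:

```python
model = 'gpt-3.5-turbo'
content = 'Say hello to the class.'
response = openai.ChatCompletion.create(
    model=model,
    messages=[{'role': 'user', 'content': content}]
)
```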
#1: the GPT model to use.
#2: the content to be sent to the GPT model.
#3: creates the chat completion request and retrieves the response.
#5: messages are stored in a list of dictionaries where each dictionary contains content from either the user or the system.
Print the type of the response and the response itself, which is in the JSON format:
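For example:

```python
print(type(response))  # 1: the response type
print(response)        # 2: the response in the JSON format
```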
#1: the response type.
#2: the response in the JSON format.
Print only the content from the output:
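With the pre-1.0 package, the content can be accessed by indexing into the first choice:

```python
print(response['choices'][0]['message']['content'])
```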