5.3. Information Extraction

Suppose you want to extract a person's call name(s) in real time during a dialogue:

S: Hi, how should I call you?
U: My friends call me Jin, but you can call me Jinho. Some students call me Dr. Choi as well.

Design a prompt that extracts all call names provided by the user.

Let us write a function that takes the user input and returns the GPT output in the JSON format:

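from re import Pattern
from typing import Optional

import openai


def gpt_completion(input: str, regex: Optional[Pattern] = None) -> Optional[str]:
    # Send the input to the model as a single user message.
    response = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=[{'role': 'user', 'content': input}]
    )
    output = response['choices'][0]['message']['content'].strip()

    # Keep only the part of the output that matches the regex; None if nothing matches.
    if regex is not None:
        m = regex.search(output)
        output = m.group().strip() if m else None

    return output

  • #8-13: uses the ChatCompletion API (the interface of openai versions before 1.0) to retrieve the GPT output.

  • #15-18: uses the regular expression (if provided) to extract the output in the specific format; the function returns None when nothing matches.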
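
The regular-expression step can be tried on its own, without calling the API. The sketch below is only an illustration: the pattern `JSON_OBJECT`, the helper `extract`, and the sample reply are assumptions and are not part of the original function.

```python
import json
import re
from typing import Optional

# Hypothetical pattern: matches from the first '{' to the last '}' in the text.
JSON_OBJECT = re.compile(r'\{.*\}', re.DOTALL)


def extract(output: str, regex: Optional[re.Pattern] = None) -> Optional[str]:
    """Mirrors the post-processing step of gpt_completion, minus the API call."""
    output = output.strip()
    if regex is not None:
        m = regex.search(output)
        output = m.group().strip() if m else None
    return output


# Made-up model reply in which the JSON is wrapped in extra text.
reply = 'Sure! Here is the result: {"call_names": ["Jin", "Jinho", "Dr. Choi"]} Hope this helps.'
extracted = extract(reply, JSON_OBJECT)
names = json.loads(extracted)['call_names']
```

Passing a pattern like this is one way to tolerate replies that surround the JSON with extra words; a reply with no braces at all yields None, which the caller has to handle.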

Let us create a macro called MacroGPTJSON:

  • #3: the task to be requested regarding the user input (e.g., How does the speaker want to be called?).

  • #4: the example output where all values are filled (e.g., {"call_names": ["Mike", "Michael"]}).

  • #5: the example output where all collections are empty (e.g., {"call_names": []}).

  • #6: the regular expression to check the information.

  • #7: a function that takes the STDM variable dictionary and the JSON output dictionary and sets the necessary variables.
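
As a rough illustration of how the five items above could be stored, here is a minimal sketch. The class name `JSONExtractionSpec` is hypothetical and it is a plain Python class, not the STDM macro itself; the regular expression is an assumed simple check for a JSON object, since the exact pattern is not stated here.

```python
import json
import re
from typing import Any, Callable, Dict, Optional


class JSONExtractionSpec:
    """Hypothetical stand-in for the macro's constructor arguments."""

    def __init__(self,
                 request: str,
                 full_ex: Dict[str, Any],
                 empty_ex: Optional[Dict[str, Any]] = None,
                 set_variables: Optional[Callable[[Dict[str, Any], Dict[str, Any]], None]] = None):
        self.request = request                      # the task description
        self.full_ex = json.dumps(full_ex)          # example with all values filled
        self.empty_ex = '' if empty_ex is None else json.dumps(empty_ex)
        # Assumed check: the output must contain something shaped like a JSON object.
        self.check = re.compile(r'\{.*\}', re.DOTALL)
        self.set_variables = set_variables          # optional custom updater


spec = JSONExtractionSpec(
    request='How does the speaker want to be called?',
    full_ex={'call_names': ['Mike', 'Michael']},
    empty_ex={'call_names': []},
)
```

Serializing the examples once in the constructor keeps the prompt construction later on a matter of plain string formatting.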

Override the run method in MacroGPTJSON:

  • #2-3: creates an input prompt for the GPT API.

  • #4-5: retrieves the GPT output using the prompt.

  • #7-11: checks if the output is in a proper JSON format.

  • #13-14: updates the variable table using the custom function.

  • #15-16: updates the variable table using the same keys as in the JSON output.
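
The steps listed above can be sketched as a standalone function. This is an approximation under stated assumptions, not the macro's actual run method: `run_extraction`, `fake_complete`, and the prompt wording are all hypothetical, and `complete` stands in for gpt_completion so that the example runs without an API key.

```python
import json
from typing import Any, Callable, Dict, Optional


def run_extraction(user_text: str,
                   variables: Dict[str, Any],
                   complete: Callable[[str], Optional[str]],
                   request: str,
                   full_ex: str,
                   empty_ex: str = '',
                   set_variables: Optional[Callable[[Dict[str, Any], Dict[str, Any]], None]] = None) -> bool:
    # Build the prompt from the request, the example outputs, and the user input.
    examples = f'{full_ex} or {empty_ex} if unavailable' if empty_ex else full_ex
    prompt = f'{request} Respond in the JSON schema such as {examples}: {user_text.strip()}'

    # Retrieve the model output.
    output = complete(prompt)
    if not output:
        return False

    # Reject output that is not a valid JSON object.
    try:
        d = json.loads(output)
    except json.JSONDecodeError:
        return False
    if not isinstance(d, dict):
        return False

    # Update the variable table, via the custom function if one is given.
    if set_variables:
        set_variables(variables, d)
    else:
        variables.update(d)
    return True


def fake_complete(prompt: str) -> str:
    """Canned reply (made up) so the sketch runs offline."""
    return '{"call_names": ["Jin", "Jinho", "Dr. Choi"]}'


table: Dict[str, Any] = {}
ok = run_extraction(
    'My friends call me Jin, but you can call me Jinho.',
    table,
    fake_complete,
    request='How does the speaker want to be called?',
    full_ex='{"call_names": ["Mike", "Michael"]}',
    empty_ex='{"call_names": []}',
)
```

Returning False on an empty or malformed reply lets the caller fall back to another transition instead of storing unusable data.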

Let us create another macro called MacroNLG:

  • #3: a function that takes a variable table and returns a string output.
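
The idea of wrapping a text-generating function can be shown with a small class. `SimpleNLG` is a hypothetical stand-in that does not depend on the STDM library, and the greeting text is invented for illustration.

```python
from typing import Any, Callable, Dict


class SimpleNLG:
    """Hypothetical stand-in for MacroNLG: wraps a function from variables to text."""

    def __init__(self, generate: Callable[[Dict[str, Any]], str]):
        self.generate = generate

    def run(self, variables: Dict[str, Any]) -> str:
        # Delegate to the wrapped function, passing the current variable table.
        return self.generate(variables)


greet = SimpleNLG(lambda v: f"Nice to meet you, {v['call_names'][0]}.")
message = greet.run({'call_names': ['Jinho', 'Jin']})
```

Because the generator is an ordinary function, the same wrapper can serve any response that depends on previously extracted variables.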

Finally, we use the macros in a dialogue flow:

The helper methods can be as follows:
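
A minimal sketch of what such helpers might look like, assuming the extracted names are stored under the key `call_names`. The function names, the random choice among names, and the fallback word `there` are assumptions made for illustration.

```python
import random
from typing import Any, Dict


def set_call_names(variables: Dict[str, Any], user: Dict[str, Any]) -> None:
    """Copy the call names from the JSON output into the variable table."""
    variables['call_names'] = user.get('call_names', [])


def get_call_name(variables: Dict[str, Any]) -> str:
    """Pick one stored call name at random; fall back to a neutral word if none exist."""
    names = variables.get('call_names') or ['there']
    return random.choice(names)


table: Dict[str, Any] = {}
set_call_names(table, {'call_names': ['Jin', 'Jinho', 'Dr. Choi']})
chosen = get_call_name(table)
```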
