3. Contextual Understanding

This chapter explains how to improve contextual understanding using Natex.

Content

  1. Natex

  2. Ontology

  3. Regular Expression

  4. Macro

  5. Quiz

Resource

  • Source code: contextual_understanding.py

3.5. Quiz

Quiz 3: Contextual Understanding

Overview

Your goal is to create a chatbot that talks about movies. Here is a sample dialogue:

S: Hello, may I have your name?
U: Call me Jinho.
S: It's nice to meet you, Jinho. What was the latest movie you watched?
U: That'd be Spider Man No Way Home.
S: I love Marvel movies. Who is your favorite avenger?
U: Captain America.
S: Captain is my man. So I guess you are a big fan of Sci-Fi movies?
U: Yes, I am.
S: Excellent. Why do you like them so much?
U: They make me feel like living in the future.
S: I'm with you. Thanks for sharing.

Your chatbot aims to collect user information by asking the following:

  • The latest movie that the user watched (#3-4).

  • A contextualized question regarding the latest movie (#5-6).

  • A question regarding the genre of the latest movie (#7-10).

Your chatbot should give an appropriate response to every user response. For this assignment, you must use all of the following:

  • An ontology covering common movie genres and a branch of movies that you target,

  • At least one macro,

  • At least one regular expression (can be used inside a macro).

Task 1

  1. Create a Python file quiz3.py under the quiz package.

  2. Create a JSON file ontology_quiz3.json under the resources directory.

  3. Update them to design a dialogue flow for the chatbot.

Task 2

Create a PDF file quiz3.pdf that describes the following:

  • Sample dialogues that your chatbot can conduct.

  • Explanations of how the ontology, macro(s), and regular expression(s) are used for contextual understanding in your chatbot.

Submission

  1. Commit and push quiz3.py to your GitHub repository.

  2. Submit quiz3.pdf to Canvas.

3.2. Ontology

How to use ontologies for matching in Natex.

Ontology

Let us create a dialogue flow to talk about animals:

For each type of animal, however, the list can be indefinitely long (e.g., there are over 5,400 mammal species). In this case, it is better to use an ontology (e.g., WordNet, FrameNet).

Let us create a JSON file, ontology_animal.json, containing an ontology of animals:

  • #2: the key ontology is paired with a dictionary as a value.

  • #3: the key animal represents the category, and its subcategories are indicated in the list.

Given the ontology, the above transitions can be rewritten as follows:

  • #4: matches the key "mammal" as well as its subcategories: "dog", "ape", and "rat".

  • #7: matches the key "reptile" as well as its subcategories: "snake" and "lizard".

  • #10: matches the key "amphibian" as well as its subcategories: "frog" and "salamander".

Unlike set matching, ontology matching handles plurals (e.g., "frogs").

Although there is no condition specified for the category dog that includes "golden retriever", there is a condition for its supercategory mammal (#4), to which it backs off.
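The back-off behavior described above can be approximated in plain Python. This is only a minimal sketch and not how STDM implements #ONT: the names `descendants` and `ont_match` are hypothetical, and the optional trailing "s" is a crude stand-in for plural handling.

```python
import re
from typing import Dict, List, Set

# Same hierarchy as the "ontology" dictionary in ontology_animal.json.
ONTOLOGY: Dict[str, List[str]] = {
    "animal": ["mammal", "fish", "bird", "reptile", "amphibian"],
    "mammal": ["dog", "ape", "rat"],
    "reptile": ["snake", "lizard"],
    "amphibian": ["frog", "salamander"],
    "dog": ["golden retriever", "poodle"],
}


def descendants(ontology: Dict[str, List[str]], category: str) -> Set[str]:
    """Return the category itself plus every term reachable below it."""
    found = {category}
    stack = [category]
    while stack:
        for child in ontology.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def ont_match(ontology: Dict[str, List[str]], category: str, text: str) -> bool:
    """True if the text mentions the category or any of its descendants."""
    lowered = text.lower()
    for term in descendants(ontology, category):
        # Word boundaries avoid matching "rat" inside "separate".
        if re.search(r'\b' + re.escape(term) + r's?\b', lowered):
            return True
    return False


print(ont_match(ONTOLOGY, 'mammal', 'I love my golden retriever'))
print(ont_match(ONTOLOGY, 'mammal', 'Cat'))
```

Because "golden retriever" is reachable from "mammal" through "dog", the first call succeeds even though no condition names "dog" directly.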


Currently, ontology matching does not handle plurals for compound nouns (e.g., "golden retrievers"), which will be fixed in the following version.

Expression

It is possible that a category is mentioned in a non-canonical way; the above conditions do not match "puppy" because it is not introduced as a category in the ontology. In this case, we can specify the aliases as "expressions":

  • #10: the key expressions is paired with a dictionary as a value.

  • #11: allows matching "canine" and "puppy" for the dog category.

Once you load the updated JSON file, the system understands "puppy" as an expression of "dog":


It is possible to match "puppy" by adding the term as a category of "dog" (#7). However, it would not be a good practice as "puppy" should not be considered a subcategory of "dog".
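To illustrate why aliases are kept apart from the hierarchy, here is a small sketch that resolves expressions at matching time without turning them into subcategories. The function names `surface_forms` and `mentions` are hypothetical, and the lookup is a simplification rather than STDM's actual logic.

```python
import re
from typing import Dict, List

ONTOLOGY: Dict[str, List[str]] = {
    "mammal": ["dog", "ape", "rat"],
    "dog": ["golden retriever", "poodle"],
}

# Aliases live in their own dictionary, mirroring the "expressions" key.
EXPRESSIONS: Dict[str, List[str]] = {
    "dog": ["canine", "puppy"],
}


def surface_forms(category: str) -> List[str]:
    """Collect the category, its aliases, and the same for every subcategory."""
    forms = [category] + EXPRESSIONS.get(category, [])
    for child in ONTOLOGY.get(category, []):
        forms.extend(surface_forms(child))
    return forms


def mentions(category: str, text: str) -> bool:
    """True if any surface form of the category appears as a whole word."""
    lowered = text.lower()
    return any(re.search(r'\b' + re.escape(form) + r'\b', lowered)
               for form in surface_forms(category))


print(surface_forms('dog'))
print(mentions('mammal', 'I cannot live without my puppy!'))
```

"puppy" is reachable when matching "mammal", yet it never appears as a key or a child in the hierarchy itself.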

Variable

Values matched by the ontology can also be stored in variables:

  • #4,7,10: the matched term gets stored in the variable FAVORITE_ANIMAL.

  • #5,8,11: the system uses the value of FAVORITE_ANIMAL to generate the response.

Loading

The custom ontology must be loaded into the knowledge base of the dialogue flow before it runs:

  • #2: loads the ontology in ontology_animal.json into the knowledge base of df.
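Since a malformed JSON file only surfaces as an error at load time, it can help to sanity-check the file beforehand. The sketch below uses only the standard library; `check_ontology_file` is a hypothetical helper, and the rule that every value must be a list of strings is an assumption drawn from the examples in this section.

```python
import json
import os
import tempfile


def check_ontology_file(path: str) -> dict:
    """Parse the file and verify the shape used in this section."""
    with open(path, encoding='utf-8') as fin:
        data = json.load(fin)  # raises an error on invalid JSON, e.g., trailing commas
    if 'ontology' not in data:
        raise ValueError('missing the "ontology" key')
    for section in ('ontology', 'expressions'):
        for key, values in data.get(section, {}).items():
            if not isinstance(values, list) or not all(isinstance(v, str) for v in values):
                raise ValueError(f'"{section}" entry "{key}" must be a list of strings')
    return data


# Write a small ontology to a temporary file and check it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'ontology_animal.json')
    with open(path, 'w', encoding='utf-8') as fout:
        json.dump({
            "ontology": {"mammal": ["dog", "ape", "rat"], "dog": ["golden retriever", "poodle"]},
            "expressions": {"dog": ["canine", "puppy"]},
        }, fout)
    loaded = check_ontology_file(path)

print(sorted(loaded))
```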

Code Snippet

transitions = {
    'state': 'start',
    '`What is your favorite animal?`': {
        '[{dog, ape, rat}]': {
            '`I love mammals!`': 'end'
        },
        '[{snake, lizard}]': {
            '`Reptiles are slick, haha`': 'end'
        },
        '[{frog, salamander}]': {
            '`Amphibians can be cute :)`': 'end'
        },
        'error': {
            '`I\'ve never heard of that animal.`': 'end'
        }
    }
}
S: What is your favorite animal?
U: I love frog
S: Amphibians can be cute :)
S: What is your favorite animal?
U: Cat
S: I've never heard of that animal.
S: What is your favorite animal?
U: Dogs
S: I've never heard of that animal.
  • #4-6: each subcategory, mammal, reptile, and amphibian, has its own subcategories.

  • #7: the ontology hierarchy: animal -> mammal -> dog.
    {
        "ontology": {
            "animal": ["mammal", "fish", "bird", "reptile", "amphibian"],
            "mammal": ["dog", "ape", "rat"],
            "reptile": ["snake", "lizard"],
            "amphibian": ["frog", "salamander"],
            "dog": ["golden retriever", "poodle"]
        }
    }
    transitions = {
        'state': 'start',
        '`What is your favorite animal?`': {
            '[#ONT(mammal)]': {
                '`I love mammals!`': 'end'
            },
            '[#ONT(reptile)]': {
                '`Reptiles are slick, haha`': 'end'
            },
            '[#ONT(amphibian)]': {
                '`Amphibians can be cute :)`': 'end'
            },
            'error': {
                '`I\'ve never heard of that animal.`': 'end'
            }
        }
    }
    S: What is your favorite animal?
    U: I love frogs
    S: Amphibians can be cute :)
    S: What is your favorite animal?
    U: I love my golden retriever
    S: I love mammals!
    S: What is your favorite animal?
    U: I cannot live without my puppy!
    S: I've never heard of that animal.
    {
        "ontology": {
            "animal": ["mammal", "fish", "bird", "reptile", "amphibian"],
            "mammal": ["dog", "ape", "rat"],
            "reptile": ["snake", "lizard"],
            "amphibian": ["frog", "salamander"],
            "dog": ["golden retriever", "poodle"]
        },
    
        "expressions": {
            "dog": ["canine", "puppy"]
        }
    }
    S: What is your favorite animal?
    U: I cannot live without my puppy!
    S: I love mammals!
    transitions = {
        'state': 'start',
        '`What is your favorite animal?`': {
            '[$FAVORITE_ANIMAL=#ONT(mammal)]': {
                '`I love` $FAVORITE_ANIMAL `!`': 'end'
            },
            '[$FAVORITE_ANIMAL=#ONT(reptile)]': {
                '$FAVORITE_ANIMAL `are slick, haha`': 'end'
            },
            '[$FAVORITE_ANIMAL=#ONT(amphibian)]': {
                '$FAVORITE_ANIMAL `can be cute :)`': 'end'
            },
            'error': {
                '`I\'ve never heard of that animal.`': 'end'
            }
        }
    }
    S: What is your favorite animal?
    U: I love frogs
    S: frogs can be cute :)
    S: What is your favorite animal?
    U: I can't live without my puppy!
    S: I love puppy !
    df = DialogueFlow('start', end_state='end')
    df.knowledge_base().load_json_file('resources/ontology_animal.json')
    df.load_transitions(transitions)
    def natex_ontology() -> DialogueFlow:
        transitions = {
            'state': 'start',
            '`What is your favorite animal?`': {
                '[$FAVORITE_ANIMAL=#ONT(mammal)]': {
                    '`I love` $FAVORITE_ANIMAL `!`': 'end'
                },
                '[$FAVORITE_ANIMAL=#ONT(reptile)]': {
                    '$FAVORITE_ANIMAL `are slick, haha`': 'end'
                },
                '[$FAVORITE_ANIMAL=#ONT(amphibian)]': {
                    '$FAVORITE_ANIMAL `can be cute :)`': 'end'
                },
                'error': {
                    '`I\'ve never heard of that animal.`': 'end'
                }
            }
        }
    
        df = DialogueFlow('start', end_state='end')
        df.knowledge_base().load_json_file('resources/ontology_animal.json')
        df.load_transitions(transitions)
        return df
        
    if __name__ == '__main__':
        natex_ontology().run()

    3.4. Regular Expression

    How to use regular expressions for matching in Natex.

    Regular expressions provide powerful ways to match strings and beyond:

    • Regular Expressions, Chapter 2.1, Speech and Language Processing (3rd ed.), Jurafsky and Martin.

    • Regular Expression HOWTO, Python Documentation

    • Regular Expressions 101

    Syntax

    Grouping

    Syntax    Description
    [ ]       A set of characters
    ( )       A capturing group
    (?: )     A non-capturing group
    |         or

    Repetitions

    Syntax    Description                       Non-greedy
    .         Any character except a newline
    *         0 or more repetitions             *?
    +         1 or more repetitions             +?
    ?         0 or 1 repetitions                ??
    {m}       Exactly m repetitions
    {m,n}     From m to n repetitions           {m,n}?

    Special Characters

    Syntax    Description
    ^         The start of the string
    $         The end of the string
    \num      The contents of the group of the same number
    \d        Any decimal digit
    \D        Any non-decimal-digit character
    \s        Any whitespace character
    \S        Any non-whitespace character
    \w        Any alphanumeric character and the underscore
    \W        Any non-alphanumeric character

    Functions

    Several functions are provided in Python to match regular expressions.

    match()

    Let us create a regular expression that matches "Mr." and "Ms.":

    • #1: imports the regular expression library.

    • #3: compiles the regular expression into the regex object RE_MR.

    • #4: matches the string "Dr. Wayne" with RE_MR and saves the match object to m.

    A regular expression is represented by r'expression', where the expression is a string literal preceded by the prefix r (a raw string, in which backslashes are not treated as escape characters).

    The above code prints None, indicating that the value of m is None, because the regular expression does not match the string.

    • #1: since RE_MR matches the string, m is a match object.

    • #3: true since m is a match object.

    • #4: prints the matched substring, and the start (inclusive) and end (exclusive) indices of the substring with respect to the original string in #1.

    Currently, no groups are specified in RE_MR:

    • #1: returns an empty tuple ().

    What are the differences between a list and a tuple in Python?

    It is possible to group specific patterns using parentheses:

    • #1: there are two groups in this regular expression, (M[rs]) and (\.).

    • #3: returns a tuple of matched substrings ('Ms', '.') for the two groups in #1.

    • #4,5: return the entire match "Ms.".

    • #6: returns "Ms" matched by the first group (M[rs]).

    • #7: returns "." matched by the second group (\.).

    The above RE_MR matches "Mr." and "Ms." but not "Mrs." Modify it to match all of them (Hint: use a non-capturing group and |).

    The non-capturing group (?:[rs]|rs) matches "r", "s", or "rs" such that the first group matches "Mr", "Ms", and "Mrs", respectively.

    Since we use the non-capturing group, the following code still prints a tuple of two strings:

    What if we use a capturing group instead?

    Now, the nested group ([rs]|rs) is considered the second group such that the match returns a tuple of three strings as follows:

    search()

    Let us match the following strings with RE_MR:

    • #4: matches "Mr." but not "Ms."

    • #5: matches neither "Mr." nor "Mrs."

    For s1, only "Mr." is matched because match() stops matching after finding the first pattern. For s2 on the other hand, even "Mr." is not matched because match() requires the pattern to be at the beginning of the string.

    To match a pattern anywhere in the string, we need to search for the pattern instead:

    • search() returns a match object as match() does.

    findall()

    search() still does not return the second substrings, "Ms." and "Mrs.". The following shows how to find all substrings that match the pattern:

    • findall() returns a list of tuples where each tuple holds the substrings captured by the groups for one match.

    finditer()

    Since findall() returns a list of tuples instead of match objects, there is no definite way of locating the matched results in the original string. To return match objects instead, we need to iteratively find the pattern:

    • #1: finditer() returns an iterator that keeps matching the pattern until it no longer finds a match.

    You can use a list comprehension to store the match objects as a list:

    • #1: returns a list of all m (in order) matched by finditer().

    How is the code above different from the one below?

    What are the advantages of using a list comprehension over a for-loop other than it makes the code shorter?

    Write regular expressions to match the following cases:

    • Abbreviation: Dr., U.S.A.

    • Apostrophe: '80, '90s, 'cause

    • Concatenation: don't, gonna, cannot

    • Hyperlink: https://github.com/emory-courses/cs329/

    • Number: 1/2, 123-456-7890, 1,000,000

    • Unit: $10, #20, 5kg

    Natex Integration

    The nesting example in Section 3.1 has a condition as follows (#4):

    Write a regular expression that matches the above condition.

    It is possible to use regular expressions for matching in Natex. A regular expression is represented by forward slashes (/../):

    • #4: true if the entire input matches the regular expression.

    You can put the expression in a sequence to allow a partial match:

    • #4: the regular expression is put in a sequence [].

    When used in Natex, all literals in the regular expression (e.g., "so", "good" in #4) must be lowercase because Natex matches everything in lowercase. The design choice is made because users tend not to follow typical capitalization in a chat interface, whether it is text- or audio-based.

    Variable

    It is possible to store the matched results of a regular expression to variables. A variable in a regular expression is represented by angle brackets (<..>) inside a capturing group ((?..)).

    The following transitions take the user name and respond with the stored first and last name:

    • #4: matches the first name and the last name in order and stores them in the variables FIRSTNAME and LASTNAME.

    • #5: uses FIRSTNAME and LASTNAME in the response.

    RE_TOK = re.compile(r'([",.]|n\'t|\s+)')
    RE_ABBR = re.compile(r'((?:Mr|Mrs|Ms|Dr)\.)|((?:[A-Z]\.){2,})')
    RE_APOS = re.compile(r'\'(\d\ds?|cause)')
    RE_CONC = re.compile(r'([A-Za-z]+)(n\'t)|(gon)(na)|(can)(not)')
    RE_HYPE = re.compile(r'(https?://\S+)')
    RE_NUMB = re.compile(r'(\d+/\d+)|(\d{3}-\d{3}-\d{4})|(\d(?:,\d{3})+)')
    RE_UNIT = re.compile(r'([$#])?(\d+)([km]g)?')
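One way to try these patterns is to run each of them against the strings listed in the exercise. The harness below is a sketch: the `CASES` dictionary and the `covers` helper are hypothetical names, and `fullmatch()` is used so that a pattern only counts if it consumes the whole example.

```python
import re

RE_ABBR = re.compile(r'((?:Mr|Mrs|Ms|Dr)\.)|((?:[A-Z]\.){2,})')
RE_APOS = re.compile(r'\'(\d\ds?|cause)')
RE_CONC = re.compile(r'([A-Za-z]+)(n\'t)|(gon)(na)|(can)(not)')
RE_HYPE = re.compile(r'(https?://\S+)')
RE_NUMB = re.compile(r'(\d+/\d+)|(\d{3}-\d{3}-\d{4})|(\d(?:,\d{3})+)')
RE_UNIT = re.compile(r'([$#])?(\d+)([km]g)?')

# Each pattern is paired with the example strings from the exercise.
CASES = {
    'abbreviation': (RE_ABBR, ['Dr.', 'U.S.A.']),
    'apostrophe': (RE_APOS, ["'80", "'90s", "'cause"]),
    'concatenation': (RE_CONC, ["don't", 'gonna', 'cannot']),
    'hyperlink': (RE_HYPE, ['https://github.com/emory-courses/cs329/']),
    'number': (RE_NUMB, ['1/2', '123-456-7890', '1,000,000']),
    'unit': (RE_UNIT, ['$10', '#20', '5kg']),
}


def covers(pattern: re.Pattern, examples: list) -> bool:
    """True if the pattern matches every example in its entirety."""
    return all(pattern.fullmatch(example) is not None for example in examples)


results = {name: covers(pattern, examples) for name, (pattern, examples) in CASES.items()}
print(results)
```

Passing these examples does not show that a pattern is free of false positives on running text; that would need negative examples as well.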
    import re
    
    RE_MR = re.compile(r'M[rs]\.')
    m = RE_MR.match('Dr. Wayne')
    print(m)
    m = RE_MR.match('Mr. Wayne')
    print(m)
    if m:
        print(m.group(), m.start(), m.end())
    <re.Match object; span=(0, 3), match='Mr.'>
    Mr. 0 3
    print(m.groups())
    RE_MR = re.compile(r'(M[rs])(\.)')
    m = RE_MR.match('Ms. Wayne')
    print(m.groups())
    print(m.group())
    print(m.group(0))
    print(m.group(1))
    print(m.group(2))
    ('Ms', '.')
    Ms.
    Ms.
    Ms
    .
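Groups can also be located individually. As a small illustration, the `span()` method of a match object accepts a group number, so the position of each group can be recovered in the same way as the position of the whole match:

```python
import re

RE_MR = re.compile(r'(M[rs])(\.)')
m = RE_MR.match('Ms. Wayne')

whole = m.span()     # span of the entire match "Ms."
title = m.span(1)    # span of the first group "Ms"
period = m.span(2)   # span of the second group "."

print(whole, title, period)
```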
    RE_MR = re.compile(r'(M(?:[rs]|rs))(\.)')
    print(RE_MR.match('Mrs. Wayne').groups())
    --> ('Mrs', '.')
    s1 = 'Mr. and Ms. Wayne are here'
    s2 = 'Here are Mr. and Mrs. Wayne'
    
    print(RE_MR.match(s1))
    print(RE_MR.match(s2))
    <re.Match object; span=(0, 3), match='Mr.'>
    None
    print(RE_MR.search(s1))
    print(RE_MR.search(s2))
    <re.Match object; span=(0, 3), match='Mr.'>
    <re.Match object; span=(9, 12), match='Mr.'>
    print(RE_MR.findall(s1))
    print(RE_MR.findall(s2))
    [('Mr', '.'), ('Ms', '.')]
    [('Mr', '.'), ('Mrs', '.')]
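The shape of the list returned by findall() depends on how many capturing groups the pattern has. The following sketch runs three variants of the same pattern on one string to show the difference:

```python
import re

s = 'Mr. and Ms. Wayne are here'

# No group: a list of whole matches.
no_group = re.findall(r'M[rs]\.', s)

# One group: a list of the substrings captured by that group.
one_group = re.findall(r'(M[rs])\.', s)

# Two groups: a list of tuples, one tuple per match.
two_groups = re.findall(r'(M[rs])(\.)', s)

print(no_group)
print(one_group)
print(two_groups)
```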
    for m in RE_MR.finditer(s1):
        print(m)
    <re.Match object; span=(0, 3), match='Mr.'>
    <re.Match object; span=(8, 11), match='Ms.'>
    for m in RE_MR.finditer(s2):
        print(m)
    <re.Match object; span=(9, 12), match='Mr.'>
    <re.Match object; span=(17, 21), match='Mrs.'>
    ms = [m for m in RE_MR.finditer(s1)]
    print(ms)
    [<re.Match object; span=(0, 3), match='Mr.'>, <re.Match object; span=(8, 11), match='Ms.'>]
    ms = []
    for m in RE_MR.finditer(s1):
        ms.append(m)
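As a quick check, the two versions can be run side by side. The sketch below compares the spans they collect and then uses the match objects to slice the original string, which is what findall() alone cannot support:

```python
import re

RE_MR = re.compile(r'(M(?:[rs]|rs))(\.)')
s1 = 'Mr. and Ms. Wayne are here'

# List comprehension version.
ms_comprehension = [m for m in RE_MR.finditer(s1)]

# For-loop version.
ms_loop = []
for m in RE_MR.finditer(s1):
    ms_loop.append(m)

# Both collect matches with the same spans, in the same order.
same_spans = [m.span() for m in ms_comprehension] == [m.span() for m in ms_loop]

# Match objects keep the positions, so the original string can be sliced.
located = [(m.group(), m.start(), m.end()) for m in ms_comprehension]
sliced = [s1[m.start():m.end()] for m in ms_comprehension]

print(same_spans, located, sliced)
```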
    '{[{so, very} good], fantastic}'
    r'((?:so|very) good|fantastic)'
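The difference between matching the entire input and matching part of it can be tried in plain Python with fullmatch() and search(). This is only an approximation: the `normalize` function is a hypothetical stand-in for whatever preprocessing Natex applies (lowercasing and dropping punctuation are assumptions here).

```python
import re

RE_GOOD = re.compile(r'((?:so|very) good|fantastic)')


def normalize(text: str) -> str:
    """Lowercase and drop punctuation other than apostrophes (assumed preprocessing)."""
    return re.sub(r"[^a-z0-9' ]+", '', text.lower()).strip()


def whole_input(text: str) -> bool:
    """Analogous to using the regular expression on its own."""
    return RE_GOOD.fullmatch(normalize(text)) is not None


def anywhere(text: str) -> bool:
    """Analogous to putting the regular expression inside a sequence."""
    return RE_GOOD.search(normalize(text)) is not None


for utterance in ['So good!!!', 'Fantastic :)', "It's fantastic", "I'm so good, thank you!"]:
    print(utterance, whole_input(utterance), anywhere(utterance))
```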
    transitions = {
        'state': 'start',
        '`Hello. How are you?`': {
            '/((?:so|very) good|fantastic)/': {
                '`Things are just getting better for you!`': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    S: Hello. How are you?
    U: So good!!!
    S: Things are just getting better for you!
    S: Hello. How are you?
    U: Fantastic :)
    S: Things are just getting better for you!
    S: Hello. How are you?
    U: It's fantastic
    S: Sorry, I didn't understand you.
    transitions = {
        'state': 'start',
        '`Hello. How are you?`': {
            '[/((?:so|very) good|fantastic)/]': {
                '`Things are just getting better for you!`': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    S: Hello. How are you?
    U: It's fantastic!!
    S: Things are just getting better for you!
    S: Hello. How are you?
    U: I'm so good, thank you!
    S: Things are just getting better for you!
    transitions = {
        'state': 'start',
        '`Hello. What should I call you?`': {
            '[/(?<FIRSTNAME>[a-z]+) (?<LASTNAME>[a-z]+)/]': {
                '`It\'s nice to meet you,` $FIRSTNAME `. I know several people with the last name,` $LASTNAME': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    S: Hello. What should I call you?
    U: Jinho Choi
    S: It's nice to meet you, jinho . I know several people with the last name, choi
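For comparison, Python's own re module writes named groups as (?P<name>...) rather than the (?<NAME>...) form shown above. The sketch below reproduces the same capture outside Natex; the `greet` function and its fallback message are hypothetical.

```python
import re

# Python's syntax for a named group is (?P<name>...).
RE_NAME = re.compile(r'(?P<FIRSTNAME>[a-z]+) (?P<LASTNAME>[a-z]+)')


def greet(user_input: str) -> str:
    """Capture the first and last name and fill them into the response."""
    m = RE_NAME.search(user_input.lower())  # lowercase, as Natex does
    if m is None:
        return "Sorry, I didn't understand you."
    variables = m.groupdict()
    return ("It's nice to meet you, {FIRSTNAME}. "
            "I know several people with the last name, {LASTNAME}").format(**variables)


print(greet('Jinho Choi'))
```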
    RE_MR = re.compile(r'(M([rs]|rs))(\.)')
    print(RE_MR.match('Mrs. Wayne').groups())
    --> ('Mrs', 'rs', '.')

    3.1. Natex

    Several matching strategies built into Natex.

    Emora STDM supports several ways of interpreting the contexts of user inputs through Natex (Natural Language Expression), some of which you already experienced in Matching Strategy.

    Literal

    A literal is what you intend the system to say. A literal is represented by reversed primes (`..`):

    • #3: the system prompts the literal and ends the dialogue.

    Matching

    Natex supports several ways of matching the input with key terms.

    Term

    The condition is true if the input exactly matches the term. A term is represented as a string and can have more than one token:

    • #4: matches the input with 'could be better'.

    • #7: error is a reserved term indicating the default condition of this conditional branching, similar to the wildcard condition (_) in a match statement.

    Set

    The condition is true if the input exactly matches any term in the set. A set is represented by curly brackets ({}):

    • #7: matches the input with either 'good' or 'not bad'.

    Unordered List

    The condition is true if some terms in the input match all terms in the unordered list, regardless of the order. An unordered list is represented by angle brackets (<>):

    • #10: matches the input with both 'very' and 'good' in any order.

    Ordered List

    The condition is true if some terms in the input match all terms in the ordered list, a.k.a. sequence, in the same order. An ordered list is represented by square brackets ([]):

    • #13: matches the input with both 'so' and 'good' in that order.

    Currently, it matches the input "could be better" with the condition in #4, but does not match "it could be better" or "could be better for sure", where there are terms other than the ones indicated in the condition.

    1. Update the condition such that it matches all three inputs.

    2. How about matching inputs such as "could be much better" or "could be really better"?
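The differences among term, unordered list, and ordered list matching can be approximated with a few token-based functions. This is a rough sketch of the semantics described above, not STDM's implementation; `tokenize`, `find_phrase`, and the three `match_*` functions are hypothetical names.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic tokens (apostrophes included)."""
    return re.findall(r"[a-z']+", text.lower())


def find_phrase(tokens: List[str], phrase: List[str], start: int = 0) -> int:
    """Return the index where the phrase begins at or after start, or -1."""
    for i in range(start, len(tokens) - len(phrase) + 1):
        if tokens[i:i + len(phrase)] == phrase:
            return i
    return -1


def match_term(text: str, term: str) -> bool:
    """Term: the whole input must equal the term."""
    return tokenize(text) == tokenize(term)


def match_unordered(text: str, terms: List[str]) -> bool:
    """Unordered list: every term appears somewhere, in any order."""
    tokens = tokenize(text)
    return all(find_phrase(tokens, tokenize(term)) >= 0 for term in terms)


def match_ordered(text: str, terms: List[str]) -> bool:
    """Ordered list: every term appears, each after the previous one."""
    tokens = tokenize(text)
    start = 0
    for term in terms:
        phrase = tokenize(term)
        position = find_phrase(tokens, phrase, start)
        if position < 0:
            return False
        start = position + len(phrase)
    return True


print(match_term('It could be better', 'could be better'))
print(match_ordered('It could be better', ['could be better']))
print(match_ordered("It's good so far", ['so', 'good']))
```

Wrapping the term in an ordered list is what lets extra words appear before or after it, which is the idea behind the exercise above.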

    Rigid Sequence

    The condition is true if all terms in the input exactly match all terms in the rigid sequence in the same order. A rigid sequence is represented by square brackets ([ ]), where the left bracket is followed by an exclamation mark (!):

    • #16: matches the input with both 'hello' and 'world' in that order.

    There is no difference between matching a term (e.g., 'hello world') and matching a rigid sequence (e.g., '[!hello, world]'). The rigid sequence is designed specifically for negation, which will be deprecated in the next version.

    Negation

    The condition is true if all terms in the input exactly match all terms in the rigid sequence except for ones that are negated. A negation is represented by a hyphen (-):

    • #19: matches the input with 'awful' and zero to many terms prior to it that are not 'not'.

    Nesting

    It is possible to nest conditions for more advanced matching. Let us create a condition that matches both "so good" and "very good" using a nested set:

    • #4: uses a set inside a term.

    Does this condition match "good"?

    No, because the outer condition uses term matching that requires the whole input to be the same as the condition.

    However, it does not match when other terms are included in the input (e.g., "It's so good to be here"). To broaden the matching scope, you can put the condition inside a sequence:

    • #4: the term condition is inside the sequence.

    What if we want the condition to match the above inputs as well as "fantastic"? You can put the condition under a set and add fantastic as another term:

    • #4: the sequence condition and the new term fantastic are inside the set.

    The above transitions match "Fantastic" but not "It's fantastic". Update the condition such that it can match both inputs.

    Put fantastic in a sequence so that the condition becomes '{[{so, very} good], [fantastic]}'.

    Variable

    Saving user content can be useful in many ways. Let us consider the following transitions:

    Users may feel more engaged if the system says "I like dogs too" instead of "I like them too". Natex allows you to create a variable to store the matched term. A variable is represented by a string preceded (without spaces) by a dollar sign $:

    • #4: creates a variable FAVORITE_ANIMAL storing the matched term from the user content.

    • #5: uses the value of the variable to generate the follow-up system utterance.


    In #5, two literals, `I like` and `too!` surround the variable $FAVORITE_ANIMAL. If a variable were indicated inside a literal, STDM would throw an error.
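The idea of storing a matched term and reusing it in the response can be sketched without STDM. In the example below, `variables` is a plain dictionary playing the role of the variable table, and `respond` is a hypothetical function; none of this reflects how STDM stores variables internally.

```python
import re
from typing import Dict

ANIMALS = ['dogs', 'cats', 'hamsters']

# One alternation over the set, bounded so that only whole words match.
RE_ANIMAL = re.compile(r'\b(' + '|'.join(re.escape(a) for a in ANIMALS) + r')\b')


def respond(user_input: str, variables: Dict[str, str]) -> str:
    """Store the matched animal and reuse it in the response."""
    m = RE_ANIMAL.search(user_input.lower())
    if m is None:
        return "I've never heard of that animal."
    variables['FAVORITE_ANIMAL'] = m.group(1)
    # The variable sits between two literal pieces, as in the transitions above.
    return 'I like ' + variables['FAVORITE_ANIMAL'] + ' too!'


state: Dict[str, str] = {}
print(respond('Hamsters are my favorite!', state))
print(state)
```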

    transitions = {
        'state': 'start',
        '`Hello. How are you?`': 'end'  # literal
    }
    S: Hello. How are you?

    Answers to the exercise on the ordered list:

    1. '[could be better]'

    2. '[could be, better]'
    transitions = {
        'state': 'start',
        '`Hello. How are you?`': {         # literal
            'could be better': {           # term
                '`I hope your day gets better soon :(`': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    S: Hello. How are you?
    U: Could be better..
    S: I hope your day gets better soon :(
    S: Hello. How are you?
    U: It could be better
    S: Sorry, I didn't understand you.
    transitions = {
        'state': 'start',
        '`Hello. How are you?`': {         # literal
            'could be better': {           # term
                '`I hope your day gets better soon :(`': 'end'
            },
            '{good, not bad}': {           # set
                '`Glad to hear that you are doing well :)`': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    S: Hello. How are you?
    U: Good!!
    S: Glad to hear that you are doing well :)
    S: Hello. How are you?
    U: Not bad..
    S: Glad to hear that you are doing well :)
    S: Hello. How are you?
    U: I'm good
    S: Sorry, I didn't understand you.
    S: Hello. How are you?
    U: Not so bad
    S: Sorry, I didn't understand you.
    transitions = {
        'state': 'start',
        '`Hello. How are you?`': {         # literal
            'could be better': {           # term
                '`I hope your day gets better soon :(`': 'end'
            },
            '{good, not bad}': {           # set
                '`Glad to hear that you are doing well :)`': 'end'
            },
            '<very, good>': {              # unordered list
                '`So glad that you are having a great day!`': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    S: Hello. How are you?
    U: Very good!
    S: So glad that you are having a great day!
    S: Hello. How are you?
    U: I'm very well and good
    S: So glad that you are having a great day!
    S: Hello. How are you?
    U: Good, things are going very well!
    S: So glad that you are having a great day!
    S: Hello. How are you?
    U: Good
    S: Glad to hear that you are doing well :)
    transitions = {
        'state': 'start',
        '`Hello. How are you?`': {         # literal
            'could be better': {           # term
                '`I hope your day gets better soon :(`': 'end'
            },
            '{good, not bad}': {           # set
                '`Glad to hear that you are doing well :)`': 'end'
            },
            '<very, good>': {              # unordered list
                '`So glad that you are having a great day!`': 'end'
            },
            '[so, good]': {                # ordered list (sequence)
                '`Things are just getting better for you!`': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    S: Hello. How are you?
    U: So good!
    S: Things are just getting better for you!
    S: Hello. How are you?
    U: It's so wonderfully good!
    S: Things are just getting better for you!
    S: Hello. How are you?
    U: It's good
    S: Sorry, I didn't understand you.
    S: Hello. How are you?
    U: It's good so far
    S: Sorry, I didn't understand you.
    transitions = {
        'state': 'start',
        '`Hello. How are you?`': {         # literal
            'could be better': {           # term
                '`I hope your day gets better soon :(`': 'end'
            },
            '{good, not bad}': {           # set
                '`Glad to hear that you are doing well :)`': 'end'
            },
            '<very, good>': {              # unordered list
                '`So glad that you are having a great day!`': 'end'
            },
            '[so, good]': {                # ordered list (sequence)
                '`Things are just getting better for you!`': 'end'
            },
            '[!hello, world]': {           # rigid sequence
                '`You\'re a programmer!`': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    S: Hello. How are you?
    U: Hello World
    S: You're a programmer!
    S: Hello. How are you?
    U: hello world to you
    S: Sorry, I didn't understand you.
    S: Hello. How are you?
    U: It's hello world
    S: Sorry, I didn't understand you.
    transitions = {
        'state': 'start',
        '`Hello. How are you?`': {         # literal
            'could be better': {           # term
                '`I hope your day gets better soon :(`': 'end'
            },
            '{good, not bad}': {           # set
                '`Glad to hear that you are doing well :)`': 'end'
            },
            '<very, good>': {              # unordered list
                '`So glad that you are having a great day!`': 'end'
            },
            '[so, good]': {                # ordered list (sequence)
                '`Things are just getting better for you!`': 'end'
            },
            '[!hello, world]': {           # rigid sequence
                '`You\'re a programmer!`': 'end'
            },
            '[!-not, awful]': {            # negation
                '`Sorry to hear that :(`': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    S: Hello. How are you?
    U: Awful!
    S: Sorry to hear that :(
    S: Hello. How are you?
    U: It's so awful..
    S: Sorry to hear that :(
    S: Hello. How are you?
    U: Not awful
    S: Sorry, I didn't understand you.
    S: Hello. How are you?
    U: Not so awful
    S: Sorry to hear that :(
    S: Hello. How are you?
    U: Awful and terrible
    S: Sorry, I didn't understand you.
    transitions = {
        'state': 'start',
        '`Hello. How are you?`': {
            '{so, very} good': {
                '`Things are just getting better for you!`': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    transitions = {
        'state': 'start',
        '`Hello. How are you?`': {
            '[{so, very} good]': {
                '`Things are just getting better for you!`': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    transitions = {
        'state': 'start',
        '`Hello. How are you?`': {
            '{[{so, very} good], fantastic}': {
                '`Things are just getting better for you!`': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    S: Hello. How are you?
    U: I'm very good, thank you!
    S: Things are just getting better for you!
    S: Hello. How are you?
    U: It's so good to be here :)
    S: Things are just getting better for you!
    S: Hello. How are you?
    U: Fantastic!!!
    S: Things are just getting better for you!
    S: Hello. How are you?
    U: Good
    S: Sorry, I didn't understand you.
    S: Hello. How are you?
    U: It's fantastic
    S: Sorry, I didn't understand you.
    transitions = {
        'state': 'start',
        '`What is your favorite animal?`': {
            '[{dogs, cats, hamsters}]': {
                '`I like them too!`': 'end'
            },
            'error': {
                '`I\'ve never heard of that animal.`': 'end'
            }
        }
    }
    S: What is your favorite animal?
    U: I like dogs
    S: I like them too!
    transitions = {
        'state': 'start',
        '`What is your favorite animal?`': {
            '[$FAVORITE_ANIMAL={dogs, cats, hamsters}]': {
                '`I like` $FAVORITE_ANIMAL `too!`': 'end'
            },
            'error': {
                '`I\'ve never heard of that animal.`': 'end'
            }
        }
    }
    S: What is your favorite animal?
    U: I like dogs!!
    S: I like dogs too!
    S: What is your favorite animal?
    U: Hamsters are my favorite!
    S: I like hamsters too!

    3.4. Macro

    How to use macro functions for matching in Natex.

    The most powerful aspect of Natex is its ability to integrate pattern matching with arbitrary code. This allows you to integrate regular expressions, NLP models, or custom algorithms into Natex.

    Creation

    A macro can be defined by creating a class that inherits the abstract class Macro in STDM and overrides the run method:

    • #1: imports Macro from STDM.

    • #2: imports type hints from the typing package in Python.

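    • #4: creates the MacroGetName class inheriting Macro.

    • #5: overrides the run method declared in Macro.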

    Currently, the run method returns True no matter what the input is.

    Integration

    Let us create transitions using this macro. A macro is represented by an alias preceded by the pound sign (#):

    • #4: calls the macro #GET_NAME that is an alias of MacroGetName.

    • #13: creates a dictionary defining aliases for macros.

    • #14: creates an object of MacroGetName and saves it to the alias GET_NAME.

    To call the macro, we need to add the alias dictionary macros to the dialogue flow:

    • #3: adds all macros defined in macros to the dialogue flow df.
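The alias mechanism described above can be pictured with a small stand-alone sketch. This is a toy illustration only and not how STDM is implemented: `ToyMacro`, `AlwaysMatch`, and `call_macro` are hypothetical names, and the macro receives a plain string instead of an `Ngrams` object so that the example runs without STDM installed.

```python
from typing import Any, Dict, List


class ToyMacro:
    """Hypothetical stand-in for a macro: run() decides whether the input matches."""

    def run(self, text: str, vars: Dict[str, Any], args: List[Any]) -> bool:
        raise NotImplementedError


class AlwaysMatch(ToyMacro):
    """Mirrors the first version of MacroGetName: it matches any input."""

    def run(self, text: str, vars: Dict[str, Any], args: List[Any]) -> bool:
        return True


def call_macro(call: str, macros: Dict[str, ToyMacro], text: str, vars: Dict[str, Any]) -> bool:
    """Resolve a '#ALIAS' call through the alias dictionary and run the macro."""
    if not call.startswith('#'):
        raise ValueError('a macro call must start with the pound sign (#)')
    macro = macros[call[1:]]  # raises KeyError if the alias was never registered
    return macro.run(text, vars, [])


# The alias dictionary maps the name used after '#' to a macro object.
macros = {'GET_NAME': AlwaysMatch()}
matched = call_macro('#GET_NAME', macros, 'dr jinho choi', {})
print(matched)
```

The point of the sketch is that a call such as `#GET_NAME` only works if its alias has been registered, which is why the alias dictionary must be added to the dialogue flow before running it.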

    Parameters

    The run method has three parameters:

    • ngrams: is a set of strings representing every n-gram of the input matched by the Natex.

    • vars: is the variable dictionary, maintained by a DialogueFlow object, where the keys and values are variable names and objects corresponding to their values.

    • args: is a list of strings representing arguments specified in the macro call.

    Let us modify the run method to see what ngrams and vars give:

    • #2: prints the original string of the matched input span before preprocessing.

    • #3: prints the input span, preprocessed by STDM and matched by the Natex.

    • #4: prints a set of n-grams.

    When you interact with the dialogue flow by running it (df.run()), it prints the following:

    The raw_text method returns the original input:

    The text method returns the preprocessed input used to match the Natex:

    The ngrams gives a set of all possible n-grams in text():

    Finally, the vars gives a dictionary consisting of both system-level and user-custom variables (no user-custom variables are saved at the moment):
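To make the contents of ngrams concrete, here is a minimal sketch that builds every contiguous word n-gram of a preprocessed string. It is not STDM's own code: `all_ngrams` is a hypothetical helper, and it assumes the input is already lowercased and stripped of punctuation, as in the output of text().

```python
from typing import Set


def all_ngrams(text: str) -> Set[str]:
    """Return every contiguous word n-gram of a whitespace-tokenized string."""
    tokens = text.split()
    return {
        ' '.join(tokens[i:j])
        for i in range(len(tokens))             # start index of the n-gram
        for j in range(i + 1, len(tokens) + 1)  # end index (exclusive)
    }


grams = all_ngrams('dr jinho choi')
print(grams)
```

For the three-token input 'dr jinho choi', this yields three unigrams, two bigrams, and one trigram, the same six strings that the macro prints for ngrams.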

    Implementation

    Let us update the run method so that it matches the title, first name, and last name in the input and saves them to the variables $TITLE, $FIRSTNAME, and $LASTNAME, respectively:

    • #2: creates a regular expression to match the title, first name, and last name (the re module must be imported).

    • #3: searches for the span to match.

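    • #4: returns False if no match is found.

    • #6-18: exercise.

    • #20-22: saves the recognized title, first name, and last name to the corresponding variables.

    • #24: returns True as the regular expression matches the input span.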

    Given the updated macro, the above transitions can be modified as follows:

    • #5: uses the variables $FIRSTNAME and $LASTNAME retrieved by the macro to generate the output.

    The following shows the outputs:

    Although the last name is not recognized, and thus leaves a blank in the output, the input is still considered "matched" because run() returns True in this case. Such output can be handled better by using the language generation capability in Natex.
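The matching logic can also be tried outside of STDM. The sketch below copies the regular expression and branching from the run method into a stand-alone function; `extract_name` and `NAME_PATTERN` are hypothetical names introduced for illustration, and a plain dictionary stands in for the variable dictionary. It assumes the input is already preprocessed (lowercased, punctuation removed).

```python
import re
from typing import Any, Dict

# Same pattern as in the run method: optional title, first token, optional second token.
NAME_PATTERN = re.compile(r"(mr|mrs|ms|dr)?(?:^|\s)([a-z']+)(?:\s([a-z']+))?")


def extract_name(text: str, vars: Dict[str, Any]) -> bool:
    """Fill TITLE, FIRSTNAME, and LASTNAME in vars; return False if nothing matches."""
    m = NAME_PATTERN.search(text)
    if m is None:
        return False

    title, firstname, lastname = None, None, None
    if m.group(1):
        title = m.group(1)
        if m.group(3):
            firstname, lastname = m.group(2), m.group(3)
        else:
            # Title plus one name: the whole match serves as the first name.
            firstname, lastname = m.group(), m.group(2)
    else:
        firstname, lastname = m.group(2), m.group(3)

    vars['TITLE'] = title
    vars['FIRSTNAME'] = firstname
    vars['LASTNAME'] = lastname
    return True


for utterance in ['dr jinho choi', 'jinho choi', 'dr choi', 'jinho']:
    variables: Dict[str, Any] = {}
    print(utterance, extract_name(utterance, variables), variables)
```

For the input 'jinho', the function still returns True while LASTNAME stays None, which is the source of the blank in the last sample output.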


    Can macros be mixed with other Natex expressions?

    from emora_stdm import Macro, Ngrams
    from typing import Dict, Any, List
    
    class MacroGetName(Macro):
        def run(self, ngrams: Ngrams, vars: Dict[str, Any], args: List[Any]):
            return True
    transitions = {
        'state': 'start',
        '`Hello. What should I call you?`': {
            '#GET_NAME': {
                '`It\'s nice to meet you.`': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    
    macros = {
        'GET_NAME': MacroGetName()
    }
    df = DialogueFlow('start', end_state='end')
    df.load_transitions(transitions)
    df.add_macros(macros)
    def run(self, ngrams: Ngrams, vars: Dict[str, Any], args: List[Any]):
        print(ngrams.raw_text())
        print(ngrams.text())
        print(ngrams)
        print(vars)
    S: Hello. What should I call you?
    U: Dr. Jinho Choi
    S: It's nice to meet you.
    Dr. Jinho Choi
    dr jinho choi
    {
        'dr',
        'jinho',
        'choi',
        'dr jinho',
        'jinho choi',
        'dr jinho choi'
    }
    {
        '__state__': '0',
        '__system_state__': 'start',
        '__stack__': [],
        '__user_utterance__': 'dr jinho choi',
        '__goal_return_state__': 'None',
        '__selected_response__': 'Hello. What should I call you?',
        '__raw_user_utterance__': 'Dr. Jinho Choi',
        '__converged__': 'True'
    }
    def run(self, ngrams: Ngrams, vars: Dict[str, Any], args: List[Any]):
        r = re.compile(r"(mr|mrs|ms|dr)?(?:^|\s)([a-z']+)(?:\s([a-z']+))?")
        m = r.search(ngrams.text())
        if m is None: return False
    
        title, firstname, lastname = None, None, None
        
        if m.group(1):
            title = m.group(1)
            if m.group(3):
                firstname = m.group(2)
                lastname = m.group(3)
            else:
                firstname = m.group()
                lastname = m.group(2)
        else:
            firstname = m.group(2)
            lastname = m.group(3)
    
        vars['TITLE'] = title
        vars['FIRSTNAME'] = firstname
        vars['LASTNAME'] = lastname
    
        return True
    transitions = {
        'state': 'start',
        '`Hello. What should I call you?`': {
            '#GET_NAME': {
                '`It\'s nice to meet you,` $FIRSTNAME `.` $LASTNAME `is my favorite name.`': 'end'
            },
            'error': {
                '`Sorry, I didn\'t understand you.`': 'end'
            }
        }
    }
    S: Hello. What should I call you?
    U: Dr. Jinho Choi
    S: It's nice to meet you, jinho . choi is my favorite name.
    S: Hello. What should I call you?
    U: Jinho Choi
    S: It's nice to meet you, jinho . choi is my favorite name.
    S: Hello. What should I call you?
    U: Dr. Choi
    S: It's nice to meet you, dr choi . choi is my favorite name.
    S: Hello. What should I call you?
    U: Jinho
    S: It's nice to meet you, jinho .  is my favorite name.