3.5. Macro
How to use macro functions for matching in Natex.
The most powerful aspect of Natex is its ability to combine pattern matching with arbitrary code. This allows you to plug regular expressions, NLP models, or custom algorithms into a Natex.
Creation
A macro is defined by creating a class that inherits the abstract class Macro in STDM and overrides its run method:
from emora_stdm import Macro, Ngrams
from typing import Dict, Any, List

class MacroGetName(Macro):
    def run(self, ngrams: Ngrams, vars: Dict[str, Any], args: List[Any]):
        return True
#1: imports Macro from STDM.
#2: imports type hints from the typing package in Python.
#4: creates the MacroGetName class inheriting Macro.
#5: overrides the run method declared in Macro.
Currently, the run method returns True no matter what the input is.
Integration
Let us create transitions using this macro. A macro is represented by an alias preceded by the pound sign (#):
transitions = {
    'state': 'start',
    '`Hello. What should I call you?`': {
        '#GET_NAME': {
            '`It\'s nice to meet you.`': 'end'
        },
        'error': {
            '`Sorry, I didn\'t understand you.`': 'end'
        }
    }
}

macros = {
    'GET_NAME': MacroGetName()
}
#4: calls the macro #GET_NAME, which is an alias of MacroGetName.
#13: creates a dictionary defining aliases for macros.
#14: creates an object of MacroGetName and saves it to the alias GET_NAME.
To call the macro, we need to add the alias dictionary macros to the dialogue flow (DialogueFlow is imported from emora_stdm):
df = DialogueFlow('start', end_state='end')
df.load_transitions(transitions)
df.add_macros(macros)
#3: adds all macros defined in macros to the dialogue flow df.
Parameters
The run method has three parameters:
ngrams: a set of strings representing every n-gram of the input matched by the Natex.
vars: the variable dictionary, maintained by a DialogueFlow object, where the keys and values are variable names and objects corresponding to their values.
args: a list of strings representing arguments specified in the macro call.
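To make the three parameters concrete, here is a minimal sketch of a run method that reads args, checks them against ngrams, and writes to vars. The class name MacroMentions and the variable name MENTIONED are hypothetical and do not come from STDM. The class is deliberately a plain class, and a plain set stands in for the Ngrams object, so the snippet runs without STDM installed; in a real dialogue flow the class would inherit Macro as shown earlier.

```python
from typing import Any, Dict, List, Set


class MacroMentions:
    """Hypothetical macro body: matches if the input contains any argument.

    In a real dialogue flow this class would inherit emora_stdm.Macro;
    it is a plain class here so the sketch runs without STDM installed.
    """

    def run(self, ngrams: Set[str], vars: Dict[str, Any], args: List[Any]) -> bool:
        # args holds the arguments given in the macro call; ngrams holds every
        # n-gram of the matched input, so multi-word arguments can match too.
        hits = [str(a) for a in args if str(a) in ngrams]
        if not hits:
            return False
        # Save the first hit under a (hypothetical) variable name for later use.
        vars['MENTIONED'] = hits[0]
        return True


# Direct call with a plain set standing in for the Ngrams object.
variables: Dict[str, Any] = {}
ngram_set = {'dr', 'jinho', 'choi', 'dr jinho', 'jinho choi', 'dr jinho choi'}
matched = MacroMentions().run(ngram_set, variables, ['professor', 'jinho choi'])
print(matched, variables)
```

Returning False makes the Natex fail to match, while returning True after writing to vars lets later responses refer to the stored value.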
Let us modify the run method to see what ngrams and vars give:
def run(self, ngrams: Ngrams, vars: Dict[str, Any], args: List[Any]):
    print(ngrams.raw_text())
    print(ngrams.text())
    print(ngrams)
    print(vars)
    return True  # keep matching every input, as before
#2: prints the original string of the matched input span before preprocessing.
#3: prints the input span, preprocessed by STDM and matched by the Natex.
#4: prints a set of n-grams.
When you interact with the dialogue flow by running it (df.run()), it prints the following:
S: Hello. What should I call you?
U: Dr. Jinho Choi
S: It's nice to meet you.
The raw_text method returns the original input:
Dr. Jinho Choi
The text method returns the preprocessed input used to match the Natex:
dr jinho choi
The ngrams object itself is a set of all possible n-grams in text():
{
'dr',
'jinho',
'choi',
'dr jinho',
'jinho choi',
'dr jinho choi'
}
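The relationship between raw_text(), text(), and the n-gram set can be reproduced in a few lines of plain Python. This is only an approximation for illustration: the functions normalize and all_ngrams are hypothetical helpers, and the preprocessing shown (lowercasing and dropping punctuation) is an assumption that happens to reproduce the example above, not STDM's actual implementation.

```python
import re
from typing import List, Set


def normalize(raw: str) -> str:
    """Approximate the preprocessing: lowercase and drop punctuation (assumption)."""
    lowered = raw.lower()
    cleaned = re.sub(r"[^a-z0-9' ]+", " ", lowered)
    # Collapse the runs of spaces left behind by removed punctuation.
    return " ".join(cleaned.split())


def all_ngrams(text: str) -> Set[str]:
    """Return every contiguous word sequence (n-gram) in the text."""
    tokens: List[str] = text.split()
    return {
        " ".join(tokens[i:j])
        for i in range(len(tokens))
        for j in range(i + 1, len(tokens) + 1)
    }


text = normalize("Dr. Jinho Choi")
grams = all_ngrams(text)
print(text)
print(sorted(grams))
```

For an input of n words this yields n(n+1)/2 n-grams, which is why the three-word example produces six entries.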
Finally, vars is a dictionary consisting of both system-level and user-defined variables (no user-defined variables have been saved at this point):
{
'__state__': '0',
'__system_state__': 'start',
'__stack__': [],
'__user_utterance__': 'dr jinho choi',
'__goal_return_state__': 'None',
'__selected_response__': 'Hello. What should I call you?',
'__raw_user_utterance__': 'Dr. Jinho Choi',
'__converged__': 'True'
}
Implementation
Let us update the run method so that it matches the title, first name, and last name in the input and saves them to the variables $TITLE, $FIRSTNAME, and $LASTNAME, respectively (the code uses the re module, so import re must be added at the top of the file):
def run(self, ngrams: Ngrams, vars: Dict[str, Any], args: List[Any]):
    r = re.compile(r"(mr|mrs|ms|dr)?(?:^|\s)([a-z']+)(?:\s([a-z']+))?")
    m = r.search(ngrams.text())
    if m is None: return False

    title, firstname, lastname = None, None, None

    if m.group(1):
        title = m.group(1)
        if m.group(3):
            firstname = m.group(2)
            lastname = m.group(3)
        else:
            firstname = m.group()  # whole match, e.g. "dr choi"
            lastname = m.group(2)
    else:
        firstname = m.group(2)
        lastname = m.group(3)

    vars['TITLE'] = title
    vars['FIRSTNAME'] = firstname
    vars['LASTNAME'] = lastname

    return True
#2: creates a regular expression to match the title, first name and last name.
#3: searches for the span to match.
#4: returns False if no match is found.
#6-18 -> exercise.
#20-22: saves the recognized title, first name, and last name to the corresponding variables.
#24: returns True as the regular expression matches the input span.
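As a starting point for the exercise, it helps to look at what the three capture groups contain for different inputs. The sketch below uses only Python's re module and the same pattern as the macro; the helper name name_groups is hypothetical.

```python
import re

# Same pattern as in the macro: optional title, first word, optional second word.
NAME_PATTERN = re.compile(r"(mr|mrs|ms|dr)?(?:^|\s)([a-z']+)(?:\s([a-z']+))?")


def name_groups(text):
    """Return the (title, word1, word2) capture groups, or None if nothing matches."""
    m = NAME_PATTERN.search(text)
    return None if m is None else m.groups()


for sample in ["dr jinho choi", "jinho choi", "dr choi", "jinho", "call me jinho", "123"]:
    print(repr(sample), "->", name_groups(sample))
```

Note that for an input such as "call me jinho" the pattern captures "call" and "me", since it simply takes the first one or two words; inputs like this need either a stricter pattern or additional Natex around the macro.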
Given the updated macro, the above transitions can be modified as follows:
transitions = {
    'state': 'start',
    '`Hello. What should I call you?`': {
        '#GET_NAME': {
            '`It\'s nice to meet you,` $FIRSTNAME `.` $LASTNAME `is my favorite name.`': 'end'
        },
        'error': {
            '`Sorry, I didn\'t understand you.`': 'end'
        }
    }
}
#5: uses the variables $FIRSTNAME and $LASTNAME retrieved by the macro to generate the output.
The dialogue now produces the following output:
S: Hello. What should I call you?
U: Dr. Jinho Choi
S: It's nice to meet you, jinho . choi is my favorite name.
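The names appear in lowercase because the macro reads ngrams.text(), which is the preprocessed input, and there is a space before the period because the literal `.` is a separate piece of the response. The sketch below imitates this behavior in plain Python to make the output easier to predict. The render function is a hypothetical stand-in written for this illustration and is not how STDM generates responses internally.

```python
import re
from typing import Any, Dict


def render(template: str, variables: Dict[str, Any]) -> str:
    """Join backtick literals and $VARIABLE values with single spaces (illustration only)."""
    parts = []
    # Each match is either a backtick-quoted literal or a $VARIABLE reference.
    for literal, name in re.findall(r"`([^`]*)`|\$([A-Za-z_]+)", template):
        parts.append(str(variables[name]) if name else literal)
    return " ".join(parts)


# Values as the macro would store them for the input "Dr. Jinho Choi".
variables = {"TITLE": "dr", "FIRSTNAME": "jinho", "LASTNAME": "choi"}
template = "`It's nice to meet you,` $FIRSTNAME `.` $LASTNAME `is my favorite name.`"
response = render(template, variables)
print(response)
```

If capitalized names are preferred in the response, one option is to store firstname.capitalize() and lastname.capitalize() in the macro instead of the raw matches.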
Can macros be mixed with other Natex expressions?