Robust Parse Part 1: A Different Way To Handle User Input

Robust Parse: A different way to handle user input

Do you want your text-based user interaction to be better? Want to have parsing that beats the pants off of every single SCI Sierra game (yes, really!)? Well, here's how to do it.

I'm writing this to address some of the shortcomings with parsing the user input. This solution was inspired by how the parsing takes place in Inform. I didn't dream this up completely on my own; I'm just applying similar techniques in SCI.

Existing SCI error handling constructs:
wordFail: 'I don't understand the word %s': Very good. This is specific and gives the user useful feedback: either they made a typo or, at worst, the game simply doesn't know their word choice. If this message comes up excessively, the game designer needs to consider putting some synonyms into the vocab. However, this is a direct reflection of the designer's shortcomings, not of the parsing implementation.

syntaxFail: 'That doesn't appear to be a proper sentence': For example: 'up jump' (presumably instead of 'jump up'). This happens occasionally & I'm not sure if anything can be done here, except changing the message returned to the user to use a well-known standard sentence structure in the form of 'verb <noun> <prep> <noun>'.

semanticFail: 'That sentence doesn't make sense': I unfortunately do not have an example. Again, not much we can do here, short of asking the user to restate their request in a standard way, similar to syntaxFail.

pragmaFail: 'You've left me responseless': This is the worst possible thing a user can see. We've done everything we can to parse their input, but we've come up with nothing & this is the best we can do - provide a 'canned' response to the user. It's horrific.

Overview of Solution

  1. Parse the verb
  2. Parse the noun & 'second' noun
  3. Check to make sure the verbs are utilized properly, otherwise throw error (this is the replacement for the stock 'pragmaFail' method)
  4. Check to make sure that the nouns specified are 'here', otherwise throw error
  5. Call an 'action' script to handle the specified verb
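
The five steps above run in order and stop at the first failure. As a rough illustration only, here is a small Python model of that control flow (Python rather than SCI script, so the logic can be checked outside the interpreter). The name robust_parse and the stub callables are hypothetical stand-ins for the SCI procedures described in the parts that follow.

```python
from typing import Callable, List


def robust_parse(parse_steps: List[Callable[[], None]],
                 pragma_fail: Callable[[], bool],
                 validate_nouns: Callable[[], bool],
                 call_action: Callable[[], None]) -> str:
    """Run the five steps in order and report which stage ended the parse."""
    for step in parse_steps:      # steps 1 and 2: set Verb, noun, second and the flags
        step()
    if pragma_fail():             # step 3: True means an error message was shown
        return "pragma_fail"
    if not validate_nouns():      # step 4: False means a noun is not 'here'
        return "not_here"
    call_action()                 # step 5: verb-specific handler
    return "action"


# Hypothetical stubs that just record the order in which they were called.
log: List[str] = []
result = robust_parse(
    parse_steps=[lambda: log.append("verb"), lambda: log.append("nouns")],
    pragma_fail=lambda: False,
    validate_nouns=lambda: True,
    call_action=lambda: log.append("action"),
)
```

With these stubs every check passes, so the action handler is reached. If pragma_fail returned True instead, neither the noun check nor the action would run.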

Part 1: Parse the Verb
In this step we parse the verb. We also flag missing or superfluous nouns; these flags will be used to build the error message we display to the user in Step 3.

Each verb will typically be represented with three different Said() string if statements. This is so we can determine whether a verb, verb/noun or verb/noun/noun combination was entered. Note the 'no-claim' operator ('>') at the end of each Said() string - this allows us to continue evaluating Said() statements for the nouns and 'second' nouns in the following steps.

Code:
	= Verb UNSET		// initialize it to 'nothing' before each parse

	(if(Said('dig/*/*>'))      // ex: dig in ground with shovel
	   = Verb VERB_DIG	
	)

	(if(Said('dig/*[/!*]>'))  // ex: dig in ground
	   = Verb VERB_DIG
	   = needSecond TRUE	// what do you want to dig in the ground with?
	)
	
	(if(Said('dig[/!*]>'))    // ex: dig
	   = Verb VERB_DIG
	   = needNoun TRUE    // what do you want to dig in?
	)

In the code above we have three different Said() statements, each of which evaluates to true for a different shape of input. Together they cover all the possibilities for the verb 'dig' (remember, each sentence that a user inputs is always split into a maximum of 3 parts, so we've got it covered). They match a 3-part, 2-part and 1-part user input starting with the verb 'dig', respectively.

Within each 'if' statement, the first thing we do is set our global 'Verb' variable to the verb 'dig', which we've defined in our game.sh as a constant - in this case an integer. A verb constant must be set up for every verb in our game, and the constants are ideally numbered from zero to n.

You also see references to the global variables 'needNoun' and 'needSecond'. Not shown are two other globals, 'unneededNoun' and 'unneededSecond'. These 4 variables are initialized to FALSE before each parse, and are set to TRUE when the inputted sentence is missing a noun the verb needs or supplies one the verb doesn't use.

Let's look at the first 'if' statement. It will capture a user input of something like 'dig in ground with shovel'. We set the verb global variable to VERB_DIG; we'll get into why we do this later. This if statement happens to be evaluating a 'good parse' in the game.

Next, the second 'if' statement in the example. It applies when the user enters 'dig in ground'. Again, we set our verb global. Then, since the sentence isn't specific enough for us, we set our 'needSecond' variable to true. This indicates that we are expecting a second noun to be entered and it wasn't supplied. We'll use this flag later to tell the user that they goofed & need to be more specific.

The third 'if' is just like the second, except the user has entered something even less specific: simply 'dig'. This time we set our 'needNoun' variable to TRUE, indicating they didn't enter the first noun. We could additionally set the 'needSecond' variable to TRUE, but we wouldn't do anything with it, as we stop reporting errors at the first missing noun.
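
To make the three cases concrete, here is a hedged Python model of the same decision (a sketch of the logic only, not SCI code and not a substitute for the Said() specs). It assumes the input has already been split into at most three parts; parse_dig and the dictionary keys are hypothetical names that mirror the globals used above.

```python
UNSET = -1
VERB_DIG = 3  # same value as the game.sh constant defined later in this article


def parse_dig(parts):
    """parts: the user input already split into [verb, noun, second] (1 to 3 items)."""
    state = {"Verb": UNSET, "needNoun": False, "needSecond": False}
    if not parts or parts[0] != "dig":
        return state                   # not our verb; leave everything untouched
    state["Verb"] = VERB_DIG           # all three cases set the verb
    if len(parts) == 1:                # 'dig'
        state["needNoun"] = True
    elif len(parts) == 2:              # 'dig in ground'
        state["needSecond"] = True
    return state                       # 3 parts: a 'good parse', no flags set
```

For example, parse_dig(["dig", "ground"]) sets Verb to VERB_DIG and needSecond to True, while parse_dig(["dig", "ground", "shovel"]) sets only the verb.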

Part 2: Parsing the noun & 'second' noun
Next, we parse the remaining two portions of the user input (noun & second). This is simpler than parsing the verbs as we are just looking for a match on the nouns in the relative positions in the Said() string. Note that we also are setting the 'noun' and 'second' global variables with constants defined in a similar fashion as our verb constants.

Code:
	= noun UNSET		// initialize it to 'nothing' before each parse
	(if(Said('*/ground>'))
	   = noun GROUND_OBJ	
	)
	
	(if(Said('*/shovel>'))
	   = noun SHOVEL_OBJ	
	)

and:

Code:
	= second UNSET	 // initialize it to 'nothing' before each parse
	(if(Said('*/*/ground>'))
	   = second GROUND_OBJ	
	)
	
	(if(Said('*/*/shovel>'))
	   = second SHOVEL_OBJ	
	)

Part 3: A robust 'pragmaFail' method
In order to prepare for this part, let's check a few things. You've got all your verbs & nouns defined as constants, right? Great. Now we need to go one step further & create two text resources. One will be for verb names and one for noun names. This is where we store the text of the verbs and nouns so we can regurgitate them back to the user in our pragmaFail method. The index into each text resource corresponds to the value of the constant, as seen below:

Code:
/***************** game.sh *****************/
(define UNSET		-1)

// Verbs
(define VERB_CUT        0)
(define VERB_CLIMBUP    1)
(define VERB_CLIMBDOWN  2)
(define VERB_DIG        3)
(define VERB_LISTEN     4)
(define VERB_LOOK       5)
(define VERB_RUN        6)

// Nouns
(define TREE_OBJ        0)
(define SAW_OBJ         1)
(define GROUND_OBJ      2)
(define SHOVEL_OBJ      3)
(define BIRD_OBJ        4)
(define HOUSE_OBJ       5)
(define DOOR_OBJ        6)
(define KEY_OBJ         7)

// Text resource numbers
(define VERB_NAMES      0)
(define NOUN_NAMES      1)

Code:
/***************** Text.000 (verb names) *****************/
Num	Text
------	----- 
0	cut
1	climb up
2	climb down
3	dig
4	listen to
5	look at
6	run

Code:
/***************** Text.001 (noun names) *****************/
Num	Text
------	-----
0	tree
1	saw
2	ground
3	shovel
4	bird
5	house
6	door
7	key

Now that we've laid the groundwork, let's proceed on to the code for Step 3:

Code:
(procedure public (RobustPragmaFail) 
  (if(== needNoun TRUE and == noun UNSET)
  	 FormatPrint("What do you want to %s?" VERB_NAMES Verb)
  	 return(TRUE)
  )
 
  (if(== needSecond TRUE and == second UNSET)
  	 FormatPrint("What do you want to %s the %s with?" VERB_NAMES Verb NOUN_NAMES noun)
  	 return(TRUE)
  )
  
  (if(== unneededNoun TRUE and <> noun UNSET)
  	 FormatPrint("I understood only as far as you wanting to %s." VERB_NAMES Verb)
  	 return(TRUE)
  )
  
  (if(== unneededSecond TRUE and <> second UNSET)
  	 FormatPrint("I understood only as far as you wanting to %s the %s." VERB_NAMES Verb NOUN_NAMES noun)
  	 return(TRUE)
  )
  
  return(FALSE)
)

Now you can see all of the pieces from our previous two steps come together. Each if statement handles a different combination of needed (or unneeded) nouns. In our previous 'dig' example, here's the output from two different bad user inputs:

"dig" results in "What do you want to dig?"
"dig in ground" results in "What do you want to dig the ground with?"

The last two if statements work like the first two, but for the reverse scenario, where the user has provided too many nouns (instead of too few). The FormatPrint statements look up the verb name in the VERB_NAMES text resource (in this case text.000) and the noun name in the NOUN_NAMES text resource (text.001) to give the user a warm-fuzzy about their bad input.
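
The 'too many nouns' branches are not exercised by the 'dig' example, so here is a hedged Python model of the same four checks (a sketch for reasoning about the logic, not SCI code). The two lists mirror Text.000 and Text.001; the function name and its keyword arguments are hypothetical stand-ins for the globals.

```python
UNSET = -1
VERB_NAMES = ["cut", "climb up", "climb down", "dig", "listen to", "look at", "run"]
NOUN_NAMES = ["tree", "saw", "ground", "shovel", "bird", "house", "door", "key"]


def robust_pragma_fail(verb, noun=UNSET, second=UNSET,
                       need_noun=False, need_second=False,
                       unneeded_noun=False, unneeded_second=False):
    """Return the error message, or None when the sentence shape is acceptable."""
    if need_noun and noun == UNSET:
        return "What do you want to %s?" % VERB_NAMES[verb]
    if need_second and second == UNSET:
        return "What do you want to %s the %s with?" % (VERB_NAMES[verb], NOUN_NAMES[noun])
    if unneeded_noun and noun != UNSET:
        return "I understood only as far as you wanting to %s." % VERB_NAMES[verb]
    if unneeded_second and second != UNSET:
        return "I understood only as far as you wanting to %s the %s." % (
            VERB_NAMES[verb], NOUN_NAMES[noun])
    return None


# Assumption for illustration only: 'run' takes no noun, so 'run tree' sets unneeded_noun.
message = robust_pragma_fail(verb=6, noun=0, unneeded_noun=True)
```

Here message is "I understood only as far as you wanting to run.", and a call with no flags set returns None, which corresponds to the procedure returning FALSE.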

Part 4: Are the nouns here?
Good, so we've now implemented a much better solution for reporting bad user input back to the user. We are ready to continue our error checking. Let's suppose our user input is 'dig in ground with shovel'. Completely valid, but what if we are inside a building? Or if there is no shovel here? This just doesn't make sense in this context. We need to make sure that the nouns being referenced are actually in the current room or in the inventory.

In this example, I've implemented a global array of noun locations, and like the text resources, their indexes correspond to the noun constants. Before we launch into more code, I know that this is a non-standard way of tracking where objects are, but for this example I believe it helps with clarity. It should be easy to modify the code to use the 'ownedBy' and 'put' methods.

Code:
/*************** Main.sc *************/
/* global variables */
	Verb
	noun
	second	
	needNoun
	needSecond
	unneededNoun
	unneededSecond

/* The array that will track where our objects are */	
	objectLocations[10]  // one for each object in our game

/* in the init() method of Main.sc */
	= objectLocations[TREE_OBJ] TREE_ROOM
	= objectLocations[SAW_OBJ] NOWHERE
	= objectLocations[GROUND_OBJ] EVERYWHERE
	= objectLocations[SHOVEL_OBJ] HOUSE_ROOM
	= objectLocations[BIRD_OBJ] TREE_ROOM
	= objectLocations[HOUSE_OBJ] HOUSE_ROOM
	= objectLocations[DOOR_OBJ] HOUSE_ROOM
	= objectLocations[KEY_OBJ] INVENTORY

You'll notice that we've introduced more constants here: room constants. These are defined in our game.sh. Note that the positive numbers should correspond to your room script numbers. INVENTORY, EVERYWHERE and NOWHERE are special values.

Code:
// Rooms
(define INVENTORY      -2)
(define EVERYWHERE     -1)
(define NOWHERE         0)
// From here, the number should match the room script numbers
(define HOUSE_ROOM      1)
(define TREE_ROOM       2)

Here's the procedure that does the checking for whether the nouns are 'in-scope':

Code:
(procedure public (ValidateNouns) 
	(if(<> noun UNSET)
	    (if (     <> objectLocations[noun] gRoomNumber
	          and <> objectLocations[noun] INVENTORY
	          and <> objectLocations[noun] EVERYWHERE)
	             FormatPrint("I don't see any %s here." NOUN_NAMES noun)
	             return(FALSE)
		)         	
	)

	(if(<> second UNSET)
	    (if (     <> objectLocations[second] gRoomNumber
	          and <> objectLocations[second] INVENTORY
	          and <> objectLocations[second] EVERYWHERE)
	             FormatPrint("I don't see any %s here." NOUN_NAMES second)
	             return(FALSE)
		)         	
	)
	
	// All the nouns are here
	return(TRUE)
)

Let's go through it. For the first noun, we check to make sure that it's actually been specified, then check to see if the noun is either in the current room, in the inventory or everywhere (everywhere being specified for something like 'ground' or 'sky' if your game takes place completely outside). We do the same check for the second noun. If either noun fails this check, we print a message and return FALSE (indicating failure); otherwise we return TRUE (all good to continue parsing).
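
As a way to desk-check the scope rules against the object table above, here is a hedged Python model (a sketch only, not SCI code). The list mirrors the objectLocations initialisation for the eight nouns; in_scope and validate_nouns are hypothetical names, and the current room is passed in as a parameter instead of being read from gRoomNumber.

```python
UNSET = -1
INVENTORY, EVERYWHERE, NOWHERE = -2, -1, 0
HOUSE_ROOM, TREE_ROOM = 1, 2

NOUN_NAMES = ["tree", "saw", "ground", "shovel", "bird", "house", "door", "key"]
# Indexed by noun constant: TREE, SAW, GROUND, SHOVEL, BIRD, HOUSE, DOOR, KEY
OBJECT_LOCATIONS = [TREE_ROOM, NOWHERE, EVERYWHERE, HOUSE_ROOM,
                    TREE_ROOM, HOUSE_ROOM, HOUSE_ROOM, INVENTORY]


def in_scope(obj, room):
    """True when the object is in the current room, carried, or everywhere."""
    return OBJECT_LOCATIONS[obj] in (room, INVENTORY, EVERYWHERE)


def validate_nouns(noun, second, room):
    """Return (ok, message); message is None when every supplied noun is in scope."""
    for obj in (noun, second):
        if obj != UNSET and not in_scope(obj, room):
            return False, "I don't see any %s here." % NOUN_NAMES[obj]
    return True, None
```

With this table, 'dig in ground with shovel' passes in the house room but fails in the tree room, because the shovel starts in HOUSE_ROOM and is neither carried nor everywhere.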

Part 5: Call the action script
At this point, we've narrowed our input down to being as legitimate as possible without knowing if the combination of verbs and nouns makes sense. We now create a procedure for each verb that we've specified, which either performs the action that the user asked for or tells them that their verb and nouns don't make sense together.

Below is the 'CallAction' procedure, whose only job is to call another procedure based on the verb input:

Code:
(procedure public (CallAction) 
    (if(== Verb VERB_CUT)       CutAction() return)
    (if(== Verb VERB_CLIMBUP)   ClimbUpAction() return)
    (if(== Verb VERB_CLIMBDOWN) ClimbDownAction() return)
    (if(== Verb VERB_DIG)       DigAction() return)
    (if(== Verb VERB_LISTEN)    ListenAction() return)
    (if(== Verb VERB_LOOK)      LookAction() return)
    (if(== Verb VERB_RUN)       RunAction() return)    
)

Let's take a peek at the DigAction procedure:

Code:
(procedure public (DigAction)
	// Could be "dig ground with shovel"  or "dig with shovel in ground"
	(if (<> second SHOVEL_OBJ and <> noun SHOVEL_OBJ)
	    Print("That doesn't have a blade adequate for digging.")
	    return
	)

	(if (<> second GROUND_OBJ and <> noun GROUND_OBJ)
	    Print("It would be best if you dug in something sensible!")
	    return
	)

	Print("Nice hole. What next?")
)

In the case of digging, we've programmed it as a bit of a "red herring". You really don't accomplish anything, but it illustrates the point. In the first if, we are making sure that the user is trying to dig with the shovel. Next we make sure that what the user is trying to dig in is the ground. Finally, the 'good parse' result represents success; we've dug a hole.
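
Because the shovel and the ground may arrive in either noun position, the two checks are order-independent. A hedged Python model of DigAction (a sketch only, not SCI code) makes that easy to verify; the dispatch table ACTIONS and the function call_action are hypothetical, shown as one possible alternative shape for the chain of if statements in CallAction.

```python
VERB_DIG = 3
GROUND_OBJ, SHOVEL_OBJ = 2, 3


def dig_action(noun, second):
    """Return the text the game would print for a 'dig' command."""
    if SHOVEL_OBJ not in (noun, second):      # not digging with the shovel
        return "That doesn't have a blade adequate for digging."
    if GROUND_OBJ not in (noun, second):      # not digging in the ground
        return "It would be best if you dug in something sensible!"
    return "Nice hole. What next?"


# Hypothetical dispatch table: one entry per verb constant.
ACTIONS = {VERB_DIG: dig_action}


def call_action(verb, noun, second):
    """Look up the handler for the verb; None when no handler is registered."""
    handler = ACTIONS.get(verb)
    return handler(noun, second) if handler else None
```

Both call_action(VERB_DIG, GROUND_OBJ, SHOVEL_OBJ) and call_action(VERB_DIG, SHOVEL_OBJ, GROUND_OBJ) return "Nice hole. What next?", matching the comment in the SCI code about the two accepted word orders.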

Testing
Here are some sample inputs to test 'digging' (note that these will all work in the Robust Parse Demo game that I've created):

dig
dig in ground
dig in ground with shovel
dig in house
dig in house with bird
dig in house with key
etc... (try every noun if you wish!)

And of course try these commands in both rooms (there are only 2 in the demo) to see the varying results.

Conclusion:
You can see that getting a very robust parse takes a lot of parts & a lot of error checking along the way. However, we've gained the flexibility to apply any verb to any noun in any room and get a sensible response from the game, which results in a very satisfying and rich experience for the user.

Miscellaneous Notes:
- Memory utilization: Following this template can use a lot of heap. Though not necessary in the demo game, I still put in DisposeScript() calls everywhere possible to mitigate this. If memory usage becomes excessive, the problem will most likely be in the ParseVerb, ParseNoun and ParseSecond scripts. To address this, simply split the scripts into two or more pieces (e.g. ParseVerbAthruM, ParseVerbNthruZ) and change the calls to the scripts accordingly in the RobustParse routine.

- Do not have words in your vocabulary that you have not provided a successful parse for. Doing so only enables the user to use these words and make us look stupid because it will fall through to the pragmaFail method. This means dump the template game's vocabulary. Yes, dump it. Use the (nearly) empty vocab provided in the Robust Parse Demo game.

- The use of synonyms is discouraged. Instead of synonyms, use the words in the parse itself with the grouping operator (parens). It will make troubleshooting easier. Once you have bound two words together via a synonym, it is impossible to separate them, which can cause unintended results at parse time.

- Noun disambiguation. This is where adjectives come into play - this is a whole other topic.