In late January 2006, my interactive fiction Whom the Telling Changed was showcased at the Slamdance Guerrilla Gamemaker Competition in Park City, Utah, along with eleven other finalists. Since I was only able to create this game by using tools such as Inform and the assistance of the online IF community, I felt it was important to try to get something worthwhile for the community at large out of the experience. There has often been discussion in the IF newsgroups about the paucity of real-world data concerning how new players react to IF. I thought providing a complete list of transcripts and analysis for Telling might be of interest to the community.
Whom the Telling Changed is an experimental piece of IF designed as an exercise in exploring a conversation or story space rather than a physical space. The player takes the role of a villager thousands of years ago whose people have gathered to hear their storyteller tell a story. As the player traverses the mostly linear game, he or she accumulates a history based on his or her decisions that ultimately impacts the outcome and significance of game events. Part of this process is realized via a system of highlighted keywords in the interior story: typing each keyword will raise certain issues, which have the power to persuade the crowd and other characters towards one point of view or another. (Read more about Whom the Telling Changed)
Telling was written in Inform 6.3, with a slightly modified Library 6/11. The version that competed at Slamdance was a slightly modified version of Release 2. The interpreter used to display the game was Gargoyle (20051002 release), modified to automatically make transcripts and to display in full screen.
To make sense of the transcripts, it's recommended that the reader experience the game firsthand. You may download Whom the Telling Changed or play it online on my website.
It's important to note that these results do not claim to be representative of new IF players in general, for two major reasons.
First, Whom the Telling Changed is very different from an average IF. It has only four rooms and no puzzles to speak of, and it uses a keyword-based conversation system and a single-word shorthand for examining items.
Second, the Slamdance venue was not conducive to an IF experience. The venue was a fairly small conference room (maybe 40' x 25') with each of the twelve games displayed on its own computer (Pictures of the Slamdance venue). Foot traffic consisted mostly of people killing time between screenings, as the venue was just down the hall from where most of the Slamdance films were shown. Most of these people had little or no past experience with interactive fiction. Music in the background, lots of chatter, and the surrounding graphical games all served as distractions from the serious tone and text-based interface of Telling.
Another notable point is that the instructions, both in-game and physically near the computer, were modified after the first day. After observing the first few players who had trouble with the game, I decided the instructions needed to be simplified. Both sets of instructions are included below. Transcripts up to and including #20 were generated using the earlier instructions, while all subsequent transcripts were generated using the second set.
Despite these differences from an idealized sample set, I still feel the results are at least somewhat relevant for designers of new IF and IF systems. Telling still uses standard IF conventions for many of its actions, and on a broader level is still using a text-based parser that expects imperatives. The venue, crowded and filled with distractions, is perhaps a good representation of the struggle any text-based entertainment has to find an audience amid the thousands of flashier attractions available. Finally, the majority of the visitors were not gamers and most had not tried interactive fiction before. While a certain percentage were other gamemakers or people otherwise familiar with IF, the majority of the transcripts represent players coming to IF for the first time.
Nearly all players were observed to follow the instructions to type "RESTART" to begin a new game. Each of these transcripts represents an individual player traversing the game.
Whom the Telling Changed begins with a quote, then a title screen and the message "Would you like instructions?". Surprisingly, only about half of players answered "yes" to this question: 53% (39 players) versus 47% (34 players) who said no.
A certain percentage of the "no" respondents may have been experienced IF players or those giving the game a second try.
A related statistic: 78% of players typed out "yes" or "no" in full, while the rest abbreviated to "y" or "n". About 16% of players pressed enter multiple times before answering, perhaps in an attempt to skip over the question entirely. A small minority (less than 5%) typed other words like "nope" or "please".
On average, a player sitting down at Telling entered input 36 times before leaving the game. Several people quit immediately once they realized it was an all-text game; however, several played it for a much longer time, including to completion. Factoring out those who completed the game, the average play time was only 28 moves. Since the average time to finish the game was 131 moves, this means people who did not finish saw only about a fifth of the game.
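The proportion in the last sentence can be checked with a line of arithmetic. This is only a sketch: it treats move counts as a stand-in for how much of the game a player saw, which is an assumption, and the variable names are invented for illustration.

```python
# Figures quoted in the paragraph above.
avg_moves_non_finishers = 28   # average moves for players who did not finish
avg_moves_to_finish = 131      # average moves for players who did finish

# Assumption: moves are a rough proxy for how much of the game was seen.
share_seen = avg_moves_non_finishers / avg_moves_to_finish
print(f"{share_seen:.0%}")  # prints 21%
```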
My transcripts did not include any timestamping, so I cannot translate this into physical time. Based on observation, I would guess the average person who typed at least one command into the game but did not play all or most of the way through spent maybe 3 to 5 minutes with Telling before moving on.
Only four players (5%) who interacted with Whom the Telling Changed at Slamdance played the game to its completion. This number has so many mitigating factors it's hard to extract much meaning. On the one hand, the Slamdance venue encouraged players to spend just a few minutes with each game before moving on to something else. On the other hand, Telling has no real puzzles, and almost anyone who takes the time to learn how to interact with it can get to the ending in 60 to 90 minutes. The tone of the piece is also probably a factor in this—Telling is very somber and probably not what many people were looking for.
The results of any command entered into a work of IF can be broken down into three sets.
The first is when the player successfully communicates a desire to the game via the parser, and the desired action takes place. E.g., the game mentions that oranges are present, the player types EXAMINE ORANGES and receives a description of the oranges.
The second is when the player's action is understood by the parser, but does not take place due to a game mechanic or other error that is sensible to the player. E.g., the player types EXAMINE ORANGES when no oranges have been mentioned, and the message "You can't see any such thing" is printed.
The third is when the parser or game does not understand or misinterprets the player's intent, and produces a message that seems incongruous to the player. If oranges had been described, and the player typed EXAMINE ORANGES and received the message "You can't see any such thing," the parser or game author has failed to present a consistent reality to the player.
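For anyone wanting to tag their own transcripts the same way, the three sets could be modelled roughly as below. This is a hypothetical sketch, not the tooling used for this analysis: the names `Outcome` and `classify` are invented, and both flags are judgments a human reader assigns to each command.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "understood, and the action took place"
    SENSIBLE_REFUSAL = "understood, but refused for a reason the player can follow"
    MISUNDERSTOOD = "intent not understood, or the response seems incongruous"


def classify(intent_understood: bool, action_took_place: bool) -> Outcome:
    """Label one command; both flags are hand-assigned by whoever reads the transcript."""
    if not intent_understood:
        return Outcome.MISUNDERSTOOD
    if action_took_place:
        return Outcome.SUCCESS
    return Outcome.SENSIBLE_REFUSAL


# The three EXAMINE ORANGES cases from the text:
print(classify(True, True).name)    # oranges present and described
print(classify(True, False).name)   # no oranges were ever mentioned
print(classify(False, False).name)  # oranges mentioned, yet "You can't see any such thing"
```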
Of the approximately 2,500 commands entered in the 73 transcripts, 247 produced error messages of the third type, meaning about 10% of the time, the parser did not understand the player. At first glance this actually seems extraordinarily high. It should be pointed out, however, that much of a player's interaction with Whom the Telling Changed is entering highlighted keywords, a mechanic which is hard to screw up. If these inputs are factored out, the error rate rises to something nearer 20%. What were the causes of these errors? They are legion.
19% (47) of the errors were produced by trivial syntax errors, where the player typed something reasonable in the format given them by the instructions that did not happen to match a grammar line in the Inform parser. Examples included LOOK SYMBOL (the parser expects LOOK AT SYMBOL), LEAVE TENT (arguably an implementation problem, but not something the standard library understands), GO TO DOOR or GO TO LOVER (the parser does not recognize attempts to move within a room), and ASK ABOUT ENEMIES (the parser expects ASK CHARACTER ABOUT TOPIC).
A further 14% (35) of the errors were caused by major syntax errors, where the player typed input in a format alien to the parser. Examples include WHO IS LOVER (only imperatives are supported), PUT THE HERBS ON MY HEAD (the parser does not recognize references to body parts), PLEASE TALK (verbs must be first to be understood) and other similar commands.
Another large group of misunderstood input was unsupported synonyms for otherwise valid actions, which comprised 19% (46) of the misunderstood input. Words like GRAB and BRING were used for TAKE; SEE and VIEW were substituted for LOOK; and other verbs like STEAL, SMACK, INSPECT, and SHOUT were also tried.
The remainder of the misunderstood input was less interesting: nonsense text like keyboard mashing made up 4% (10), chatty responses like complete sentences or talkative answers to characters were 5% (13), errors caused by a bug in Telling were 4% (10) and errors caused by implementation problems in Telling were 8% (20). The largest producer of error messages was simple typos (such as LOK instead of LOOK) at 27% (66).
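The breakdown above can be gathered into a single table. The counts are the ones quoted in the text; the short category labels and the layout are my own, so treat this as a convenience sketch rather than the original tally.

```python
# Counts of misunderstood commands by cause, as quoted in the text.
error_counts = {
    "typos": 66,
    "trivial syntax errors": 47,
    "unsupported synonyms": 46,
    "major syntax errors": 35,
    "implementation problems": 20,
    "chatty responses": 13,
    "nonsense text": 10,
    "bugs in Telling": 10,
}

total_errors = sum(error_counts.values())

# Print each category with its count and its share of all misunderstood input.
for cause, count in error_counts.items():
    print(f"{cause:<26}{count:>4}{count / total_errors:>7.0%}")
print(f"{'total':<26}{total_errors:>4}")
```

The categories sum to the 247 misunderstood commands reported earlier, and each rounded share matches the percentage given in the prose.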
How long does the average player play until they get an error message? In my sample set, and factoring out players who did not type anything other than enter, HELP, INFO, QUIT or RESTART, it was 7.9 moves. This is somewhat encouraging, but looking at the raw numbers shows that this average is caused by a small number of experienced players who averaged over 20 moves till they hit an error message, and a much larger number of players who got one on the sixth, fifth, fourth, or even first move.
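A rough sketch of how such a figure could be computed from tagged transcripts follows. The data format (a list of command and error-flag pairs), the function names, and the sample data are all invented for illustration; dropping players who never hit an error is also my assumption, since the text does not say how they were handled. Printing the median next to the mean is one way to see the skew described above.

```python
from statistics import mean, median

# Inputs that do not count as real play, per the text (empty string = bare enter).
META_COMMANDS = {"", "HELP", "INFO", "QUIT", "RESTART"}


def first_error_move(transcript):
    """Return the 1-based move number of the first misunderstood command, or None."""
    for move_number, (_command, misunderstood) in enumerate(transcript, start=1):
        if misunderstood:
            return move_number
    return None


def is_substantive(transcript):
    """True if the player typed at least one command that is not a meta-command."""
    return any(command.strip().upper() not in META_COMMANDS for command, _ in transcript)


def first_error_moves(transcripts):
    """Collect time-to-first-error for every substantive transcript that had an error."""
    moves = [first_error_move(t) for t in transcripts if is_substantive(t)]
    return [m for m in moves if m is not None]


# Made-up sample data, NOT taken from the Slamdance transcripts.
sample = [
    [("LOK", True)],                                    # error on move 1
    [("LOOK", False), ("GO TO DOOR", True)],            # error on move 2
    [("HELP", False), ("QUIT", False)],                 # meta-commands only: excluded
    [("LOOK", False)] * 8 + [("WHO IS LOVER", True)],   # error on move 9
]

moves = first_error_moves(sample)
print(moves, mean(moves), median(moves))
```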
Two observations about these numbers are worth noting. First, the time-till-first-error does not seem to correspond directly to the total time spent playing. This means that the factors pulling people away from the game in this case were more complex than just seeing an error message. Second, those who played the game for a nontrivial amount of time divided into two groups: a smaller group who were able to play for sometimes dozens of moves before receiving an error, and the majority, who got an error fairly early. Again, the smaller group may be at least partially comprised of people who had some degree of past experience with interactive fiction.
At least two players did not seem to realize that you could type nouns after verbs in cases where this was not made explicit by the directions. Unfortunately, in many cases the parser defaults to automatically selecting a likely item. If a disambiguation question always occurred in these cases, it would help teach new players the expected form of input.
Another frequent assumption was that pressing enter without typing anything would cause time to pass in the game world or otherwise make something happen. In a large number of cases, players would press enter if they weren't sure what to do or what options were available. The default parser error message "I beg your pardon?" was replaced by the slightly more useful "Nothing entered" in Telling, but better still would have been to map a blank input line to the two-line help text reached by typing HELP.
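The blank-line idea amounts to one small step before the parser sees the input. The sketch below is written in Python purely for illustration; it is not Inform code, and the function name `preprocess` is hypothetical.

```python
def preprocess(raw_line: str) -> str:
    """Turn an empty or whitespace-only line into the HELP command.

    Anything else is passed through unchanged for the parser to handle.
    """
    return raw_line if raw_line.strip() else "HELP"


print(preprocess(""))            # a bare press of enter becomes HELP
print(preprocess("look symbol"))  # ordinary input is left alone
```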
Obviously, Whom the Telling Changed and these sample transcripts are special cases in a number of ways, and therefore drawing far-reaching conclusions from this data would be a mistake.
For me, however, the data shows that the IF experience could be drastically improved for new players with some very minor changes to existing parser behavior. Simply removing unrecognized words from the input stream, for example, would allow a wider range of commands to be recognized. A looser grammar system would allow input like LOOK BAG to pass. Why not just pass anything that contains a VERB and NOUN in that order, disregarding all other words unless they are relevant? For example, pass LOOK UNDER SOFA unchanged but change THROW BALL UNDER HAND to THROW BALL.
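One way to picture this looser matching is a pre-pass that normalizes the player's line before the regular grammar sees it. The sketch below is in Python rather than Inform, and everything in it is hypothetical: the tiny vocabulary, the per-verb list of relevant prepositions, and the function name `loosen`. The synonym pairs come from the substitutions observed in the transcripts. It is meant to show the shape of the idea, not a working parser.

```python
# Hypothetical vocabulary, far smaller than any real game's.
SYNONYMS = {"GRAB": "TAKE", "BRING": "TAKE", "SEE": "LOOK", "VIEW": "LOOK"}
VERBS = {"LOOK", "TAKE", "THROW"}
NOUNS = {"BAG", "BALL", "SOFA", "SYMBOL", "HAND"}
PREPOSITIONS = {"UNDER", "IN", "AT", "ON", "TO"}
# Which prepositions carry meaning for which verb (an assumption for this sketch).
RELEVANT_PREPOSITIONS = {"LOOK": {"UNDER", "IN", "AT"}, "THROW": {"AT"}}


def loosen(line):
    """Reduce a line to VERB [PREPOSITION] NOUN form, or return None if no verb is found."""
    words = [SYNONYMS.get(word, word) for word in line.upper().split()]

    # Skip leading filler such as PLEASE: start from the first known verb.
    verb_index = next((i for i, word in enumerate(words) if word in VERBS), None)
    if verb_index is None:
        return None

    verb = words[verb_index]
    relevant = RELEVANT_PREPOSITIONS.get(verb, set())
    kept = [verb]
    for word in words[verb_index + 1:]:
        if word in NOUNS or word in relevant:
            kept.append(word)
        elif word in PREPOSITIONS:
            break  # a preposition that means nothing to this verb: drop the rest
        # any other unrecognized word is silently skipped
    return " ".join(kept)


for example in ("LOOK BAG", "LOOK UNDER SOFA", "THROW BALL UNDER HAND", "PLEASE GRAB THE BAG"):
    print(example, "->", loosen(example))
```

Stopping at the first irrelevant preposition is what turns THROW BALL UNDER HAND into THROW BALL while leaving LOOK UNDER SOFA intact. A real implementation would need far more care with word order and ambiguity.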
Speech-to-text recognition software did not become commercially viable until the recognition error rate dropped to 5% or lower, and this seems like a reasonable goal that IF should aim for in comprehension. Through a combination of good implementation on the game-designer's side and a beefed-up parser, I think a statistic of 95% understood player input, even for newbies, is very possible.
Fiction writers are told that they only have a certain amount of time to grab the reader's attention before the average reader will get bored and move on to something else. It may be useful to use the "time to first error" statistic as a similar measuring stick for interactive fiction. If the player has become engaged in the first few moves, they are probably more likely to overlook the occasional error or take the time to learn how the game wants to receive input.
Transcripts 1-20 featured this in-game instruction text.
Transcripts 21-72 featured different in-game instruction text and brief instructions taped to the monitor.