HOWTO use the Illuminac Lighting System


Installing the Speech Interface Locally

If you don't have the right components installed, navigating to the URL of the new Illuminac Lighting System will probably show only Ana's Flash-based GUI for the lights, with no way to use speech to do anything.

If you want to use your voice to control the lights from your local machine, you're going to need a special setup.

If you don't want to bother setting this up on your local machine, simply use the kitchen computer - it's ready to go with the (smaller) microphone there.

To enjoy speech-controlled lighting from your machine, you will need:

  • Opera web browser, version 9.27 (the latest version of Opera may not have stable voice support)
  • IBM voice/multimodal plugin (In Opera, Tools > Preferences > Advanced tab > Voice > Enable voice-controlled browsing)

After these are installed, simply use Opera to browse to http://169.229.63.31:82/illuminacvoice/ and you should hear a greeting. Push and hold the Scroll Lock key (customize this in the preferences) to talk with the system.

How to Use

If you want to try the system right away, please come by the BiD lab and take a look at the computer in the kitchen (near the sink).

Getting Started

I've tried to make the system require as little explanation as possible, but there really isn't much help available if you just walk up to the machine in the middle of the day. Here are a few things you can try to get started:

  • If the computer is on and has the Illuminac lights GUI page loaded in the Opera web browser, try refreshing the page and listen for an initial prompt. If you don't hear one, you're probably on the wrong page or the speakers are off. Make sure you go here: http://169.229.63.31:82/illuminacvoice/ (it's in the bookmarks, Start Bar, and Speed Dial)
  • If the computer isn't on, turn it on and it should eventually get to the state where Opera is loaded and the page is ready to go. If Opera isn't running, just find a shortcut to Opera and launch it.

  • The system is built in a push-to-talk style - try *holding* down the Scroll Lock key on the keyboard and saying 'help' into the microphone on the table (the smaller one). [Quasi-modal FTW]
  • If you accidentally start triggering lighting changes that you don't want, you have about three seconds to click the "Cancel" button that appears on the screen after you release Scroll Lock. You can also hit the Escape key to cancel a command.
  • Keep the mouse focus on the HTML part of the webpage. If you start messing with the Flash GUI, you're giving focus to Flash and the Scroll Lock key will not do anything. Click in the white (empty) area of the page to give focus back to the webpage.
  • If in doubt, refresh the page and check that the microphone/sound is ON.

What can I say?

Currently, you can issue two categories of commands that relate to lights. You can turn all the lights in the lab off at once, or you can affect the lighting of certain 'areas' of the lab in various ways.

(Note that you can't issue a command to turn all the lights in the lab on; that would just be wasteful, wouldn't it?)

Which 'areas' of the lab are recognized, you ask? Well, these are specified in an XML configuration file (more about expanding the set of recognized commands later) and are heavily based on a sampling of the commands that Ana Ramirez Chang collected through her research. So if you were involved with Ana's research, most of the commands you used then will work now. (In fact, over 70% of the 120+ commands collected by Ana currently work in this system as well.)

I apologize for not 'designing for the odd users', but if I make the grammar too large, recognition accuracy will drop.

I'm actually reluctant to provide some concrete examples of commands that will work because I have this hypothesis (a Research Question?) that your imagination of the space of possible commands will be different depending on the types of 'example commands' I show you. In a completely different line of work, I may be investigating this hypothesis; hence, I'd rather not give away commands now in case I need them for a study on the members of the lab. (=P)

If you are *really* curious about the space of possible commands, you can:

  • Come see me
  • Take a peek at the complete grammar being used: http://169.229.63.31:82/illuminacvoice/illuminac.abnf
  • Just sit at the interface for a while pressing and releasing Scroll Lock. The system spits out a random example command every time you do this, and at the beginning of the interaction dialogue.

Caveats

I'd like to address some issues that may come up during the use of this new voice interface.

Some basic instructions on how to operate it can be found on Post-It notes at the kitchen computer at the BiD lab. (This computer was recently reformatted and now automatically logs onto a Guest account, then opens the BiD alarm and Illuminac interfaces.)

In a nutshell, you browse to the appropriate web page and push-and-hold the Scroll Lock key to issue voice commands. This web page is currently:

http://169.229.63.31:82/illuminacVoice/

This will probably change shortly, as I figure out how to move the service off of port 82 so that it will be limited to the EECS subnet.

Note that http://bid.berkeley.edu/bidlights does *not* currently point to this new URL; it will be updated soon, but for now do not assume that it will direct you straight to this new voice-enabled version.

Some important caveats you should keep in mind when using this interface:

  • You can use the Flash GUI interface from any browser. However, you can only make use of the (new, augmented) voice interface from a computer running Windows, Opera 9.27, and the IBM Multimodal/voice plugin. (See "Installing the Speech Interface Locally" above for installation instructions.)
  • You will know that the system is ready and listening to you if you hear the initial prompt (i.e. "Say a command. For example, ...") and if you hear 'beeps' as you push and release the Scroll Lock key. If you don't hear the beeps or any prompts, try refreshing the page.
  • The interface will only respond to your spoken commands if the current FOCUS is ON THE HTML PART OF THE PAGE. That is, if focus is given to the Flash GUI, pressing Scroll Lock will have no effect (and you will not hear the 'beeps', so you'll know it's not listening). If you give focus to the Flash part of the page to interact with the Flash GUI, please remember to click on the white (empty) region of the webpage before leaving so that the system will be ready to take spoken commands from the next user.
  • If you switch tabs (e.g. to use the BiD alarm), the system will stop listening for spoken commands on all open pages. *Even if you switch back*, the system will not start listening again. You'll have to refresh the page to kick off the spoken interaction dialogue once again.
  • Something about the sound card or microphone at the kitchen computer is causing audio to be captured at a fairly low volume. If you're having trouble being recognized, try speaking louder (and enunciate difficult terms like 'off' vs. 'on').
  • The same plugin that is being used to do the voice recognition can be used to control the browser itself. If you say things that sound like "Opera new page", a new tab may open up. This typically happens when you're trying to issue commands when the focus is on something other than the web page itself. If in doubt about whether the system is listening correctly (i.e. limiting interaction to just the lights interface), refresh the page and listen for the prompt.

In summary, if you're having an unusual amount of trouble getting the system to recognize your commands, try restarting the browser or refreshing the page. Also ensure that the microphone is on and that the sound is audible.

Customizing

I will describe how to alter the speech grammar or define new areas for the lighting interface to recognize.

It is important to note that this new system is based on a 'strict grammar' - it will only recognize commands in its limited, predefined vocabulary and there is no way to 'train' it to recognize more based on spoken samples. If you would like to train custom, personalized commands with the lighting system (e.g. 'david sun superstar' to do something completely random), I refer you to Ana's original system which is still operational at the original URL (http://169.229.63.31:82/IlluminacGUI/IlluminacGUI.html).

Adding to the Grammar

Although you cannot train this speech system in the traditional sense, you *can* add to its grammar. I will walk through how to do this using Reza as an example.

Currently, you can say commands such as: "Anuj - turn my lights on" (in the style of Ana's system) or "Turn Anuj's lights on". These phrases are covered in the existing grammar. However, "Reza - turn my lights on" or "Turn Reza's lights on" will not work because I haven't added 'Reza' as a recognized name. Let's add it now.

The grammar is defined in a set of .abnf files (Augmented BNF grammar, in a W3C specification format) in the same directory as the root of the application. You can take a peek at them:

http://169.229.63.31:82/illuminacVoice/illuminac.abnf
http://169.229.63.31:82/illuminacVoice/bid.abnf

I tried to separate the BiD-lab-specific parts of the commands in their own file, so the main grammar would not need to be modified even if the lab moves or changes radically. That is, all the stuff about the lights in general stays in illuminac.abnf, while lab-specific parts of phrases are found in bid.abnf. Hence, it is best to only modify bid.abnf, to keep changes to the grammar to a minimum. Keep in mind that small alterations to the grammar can have large impacts on the accuracy of recognition (i.e. they can greatly increase or decrease the set of recognized phrases).

In bid.abnf, you will find three main public 'rules'. Each rule represents a *part* of a whole, recognized phrase. They are created much like variable declarations, and are given the modifier 'public' so that they'll be accessible from other grammars (i.e. rules in illuminac.abnf make use of these rules in bid.abnf).

The three rules are:

  • $area_identifiers: these are terms that map to the various areas in the lab which I deemed worthy of recognition - adding or modifying these areas is slightly more complex, which I will cover later
  • $name: these are the people in the lab whose names are recognized as part of larger phrases
  • $name_possessive: these are the possessive forms of the names above

It may be worthwhile to learn about the syntax for ABNF grammars (see: http://www.w3.org/TR/speech-grammar/), but the selling point of the format is that it is easy to read, understand, and (hopefully) modify.

Some things I should point out right away for those diving in, though:

  • The '|' character represents 'or'. All of these rules use '|' and a newline to separate the options for terms that will satisfy the rule. For example, in the $name rule, each person's name is separated with '|' and a newline (see the character at the end of each line).
  • There is no '|' character following the last entry in each rule because that would mean 'or NULL' which is not what we want.
  • Everything within the brackets {} is executable Javascript code. All it does is set the values of Javascript variables for interpretation later. This is how the 'mapping' is really done - whenever part of a rule matches, the Javascript code immediately following it is executed and some variables are set. If the content within the brackets {} is not included, nothing will happen.
  • The actual text to be recognized by the rules should always be in lower case with no punctuation. If a term is more than one word, it should be grouped together with parentheses, like so: (john canny)
  • Square brackets [] represent optional words as part of larger terms. A rule like ((tool|work) [shop]) will match: "tool", "work", "tool shop", "work shop"
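
To see how '|' and [] combine, the short Python sketch below enumerates the phrases that a rule like ((tool|work) [shop]) accepts. It only illustrates the notation and is not part of the system; the expand helper and its list-of-slots representation are made up for this example.

```python
from itertools import product

def expand(slots):
    """Enumerate every phrase matched by a sequence of slots.

    Each slot is a list of alternatives ('|' in the grammar). An
    optional word ([] in the grammar) is modelled by adding an empty
    string to its slot.
    """
    phrases = []
    for combo in product(*slots):
        # Drop the empty strings left behind by skipped optional words.
        phrases.append(" ".join(word for word in combo if word))
    return phrases

# ((tool|work) [shop]): one slot with two alternatives, one optional slot
tool_shop = [["tool", "work"], ["", "shop"]]
print(expand(tool_shop))
# ['tool', 'tool shop', 'work', 'work shop']
```

Every extra alternative or optional word multiplies the number of accepted phrases, which is one way to see why small grammar changes can have a large effect on recognition.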

To add Reza as a recognized name, we simply need to append to the chain of options in the appropriate locations. In the $name rule, we add a '|' character after the last option (past the brackets {}), enter a newline, then type 'reza' followed by the mapping of what we want 'reza' to mean. In this case, it's pretty much the same as what we want 'anuj' to mean, so we can copy the content within the brackets from Anuj's line:

{out.name = 'anuj'; out.location = 'CUBE 6';}

... and modify it just for Reza:

{out.name = 'reza'; out.location = 'CUBE 6';}

out.name is actually just what is displayed in the GUI when this command is recognized, and out.location is the actual ID string of the area that will be affected. The set of possible location ID strings is in a different place, which we will go into shortly.

We also must do something similar in the $name_possessive rule, adding a line for 'rezas' to represent a possessive form of 'reza'.

There is one slight problem with all this, though - it has to do with how 'reza' is pronounced. If you imagine the speech engine interpreting 'reza', you can guess that it would think it's pronounced 'reeeza'. To ensure that a particular pronunciation is the one that is recognized, you may need to modify the spelling of the term for the grammar. In our case, I decided to go with 'rehza' (and testing of the interface afterwards confirms this works, while 'reza' doesn't). The final modifications:

public $name =
  ...
  anooj {out.name = 'anuj'; out.location = 'CUBE 6';} |
  rehza {out.name = 'reza'; out.location = 'CUBE 6';}
;
public $name_possessive =
  ...
  anoojs {out.name = 'anuj'; out.location = 'CUBE 6';} |
  rehzas {out.name = 'reza'; out.location = 'CUBE 6';}
;

... and we're done! You should now be able to say, "Turn on Reza's lights" and have CUBE 6 lit.
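
If you expect to add several people, the two lines can be generated rather than typed by hand. The Python sketch below is only a convenience under the conventions described above (lower-case spoken form, possessive written without an apostrophe, parentheses around multi-word terms); the function names are hypothetical and nothing like this ships with the system.

```python
def grammar_entry(spoken, display_name, location):
    """Build one alternative for the $name or $name_possessive rule.

    spoken        lower-case spelling the recognizer should hear, e.g. 'rehza'
    display_name  value for out.name (what the GUI displays)
    location      value for out.location (an area ID such as 'CUBE 6')
    """
    # Multi-word terms must be grouped with parentheses, e.g. (john canny).
    term = "(%s)" % spoken if " " in spoken else spoken
    return "%s {out.name = '%s'; out.location = '%s';}" % (term, display_name, location)

def name_and_possessive(spoken, display_name, location):
    """Return the $name line and the $name_possessive line for one person."""
    return (
        grammar_entry(spoken, display_name, location),
        grammar_entry(spoken + "s", display_name, location),
    )

name_line, possessive_line = name_and_possessive("rehza", "reza", "CUBE 6")
print(name_line)        # rehza {out.name = 'reza'; out.location = 'CUBE 6';}
print(possessive_line)  # rehzas {out.name = 'reza'; out.location = 'CUBE 6';}
```

Remember that you still have to add the '|' after the previous last entry yourself, and leave it off the new last entry.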

(BONUS: Want to give Anuj a nickname? Remember you can turn one term into a multi-word term by grouping the words with parentheses:

...
(anooj | (da nooj) ) {out.name = 'anuj'; out.location = 'CUBE 6';} |
...

This is essentially how you create synonyms for terms.)

Adding New Locations

So where does 'CUBE 6' come from? Where's the set of possible ID strings for the out.location variable? Well, the answer to that lies in an XML configuration file, which controls the definitions for areas in the lab. Take a look here:

http://169.229.63.31:82/illuminacVoice/lights_application_config.xml

This XML file is really just mapping ID strings (e.g. "COUCH" or "CUBE 4") to sets of lights identified by a string of binary digits. It's not very easy to read, I admit, but it was done this way for convenience.

Let's say Reza is not actually satisfied with having his name mapped to the same lighting configuration as Anuj. He wants to 'name his own area' and have it recognized as something outside of the limited scope of cubicles. First, he has to think up the name of the area - say it will be called the 'back shop'.

To add an entry for the 'back shop', you just have to add a new <property> entry, right alongside all the others. The 'id' attribute will be the unique ID string you want to identify this area with (let's say "BACKSHOP"). The 'value' attribute is a little messy - it is a string of 1s and 0s that identifies exactly which lights in the array you're talking about when you refer to the "BACKSHOP" set. Here is how the string is derived:

  • '1' represents a light that is part of the set, '0' represents a light that is not
  • The lab is organized into a grid of 7 columns by 12 rows (7x12), exactly as the Flash GUI shows the lights - the empty squares along the bottom of the GUI map count as part of this grid, too
  • Starting at the *top-right* cell, going down and then left, each cell is one digit in the string. It reads much like East Asian writing, but I'm sure that's just a coincidence.

So, here is something Reza might have come up with as his desired set of lights:

000000001110 000000001010 000000001111 000000000000000000000000000000000000000000000000
[1st column] [2nd column] [3rd column] [the rest of the lights in the lab]

And his full <property> addition:

<property id="BACKSHOP" value="000000001110000000001010000000001111000000000000000000000000000000000000000000000000" />
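
Writing 84 digits by hand is error-prone, so here is a minimal Python sketch that builds the string from a list of grid cells. It assumes the grid is 7 columns of 12 cells each (which matches the 12-digit column groups above), with column 0 being the right-most column and row 0 the top row; the function and variable names are hypothetical.

```python
COLUMNS = 7   # assumption: 7 columns ...
ROWS = 12     # ... of 12 cells each, matching the 12-digit groups above

def lights_to_value(selected):
    """Return the 'value' string for a set of (column, row) cells.

    Column 0 is the right-most column of the GUI grid and row 0 is the
    top row, so the string reads top to bottom, then right to left.
    """
    for col, row in selected:
        if not (0 <= col < COLUMNS and 0 <= row < ROWS):
            raise ValueError("cell outside the grid: (%d, %d)" % (col, row))
    return "".join(
        "1" if (col, row) in selected else "0"
        for col in range(COLUMNS)
        for row in range(ROWS)
    )

# Reza's 'back shop' from the example above, as (column, row) pairs
backshop = {
    (0, 8), (0, 9), (0, 10),
    (1, 8), (1, 10),
    (2, 8), (2, 9), (2, 10), (2, 11),
}
value = lights_to_value(backshop)
print('<property id="BACKSHOP" value="%s" />' % value)
```

The first 36 digits of the result are the three columns shown above; the remaining 48 are the four untouched columns.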

Now to add it to the grammar so you can actually control the 'back shop' with voice. Going back to bid.abnf, we make the following modification to the $area_identifiers rule:

...
      (door | exit)           {out.location = 'DOORS';} |
      (rehzas back shop)                      {out.location = 'BACKSHOP';}

Why didn't we just call it 'backshop'? Well, trying that results in 'back shop' getting mixed up with a lot of the 'tool shop' commands, making it almost impossible to have it be recognized. Reza's name was added in order to make this new location more distinctive. (Think about it: there are far fewer commands that sound like "Reza's back shop" as opposed to "back shop". We have to do this in order to reduce ambiguity and errors. This is a major weakness of an untrainable system.)

Now, adding personalized locations is not generally recommended as it balloons the grammar and affects reliability of recognition. Rather, it would be better to reduce the number of locations and synonyms of locations, settling on a set of terms that are easily distinguishable in speech. Now that you know how to make the modifications, I only hope you do not abuse the system (!!)
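
Because every out.location in bid.abnf has to match an 'id' in the XML configuration file, a quick consistency check can catch typos before you start testing by voice. The Python sketch below is one way to do that with the standard library; the helper names are hypothetical, and the <config> root element in the sample is a guess, since only the <property> entries are described on this page.

```python
import re
import xml.etree.ElementTree as ET

def grammar_locations(abnf_text):
    """Collect every ID assigned to out.location in a grammar file."""
    return set(re.findall(r"out\.location\s*=\s*'([^']*)'", abnf_text))

def config_locations(xml_text):
    """Collect the id attribute of every <property> element."""
    root = ET.fromstring(xml_text)
    return {prop.get("id") for prop in root.iter("property")}

def missing_locations(abnf_text, xml_text):
    """IDs used by the grammar that the XML config does not define."""
    return sorted(grammar_locations(abnf_text) - config_locations(xml_text))

# Sample inputs: grammar lines from this page plus a cut-down config that
# only contains the BACKSHOP entry (the <config> root is an assumption).
sample_abnf = """
  (door | exit)      {out.location = 'DOORS';} |
  (rehzas back shop) {out.location = 'BACKSHOP';}
"""
backshop_value = "000000001110" "000000001010" "000000001111" + "0" * 48
sample_xml = (
    '<config><property id="BACKSHOP" value="%s" /></config>' % backshop_value
)

# DOORS is reported only because the cut-down sample config omits it.
print(missing_locations(sample_abnf, sample_xml))
```

Run against the real bid.abnf and lights_application_config.xml, an empty list would mean every location named in the grammar has a matching entry.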

Getting Access

There remains the problem of how you might get access to all these files for modification in the first place. Well, I could tell you that, but it might be a security hazard. Better to contact me or the system administrators of the lab (bid-lab-admin@lists.berkeley.edu) directly for further instructions. You're free, though, to take a look at the grammar and configuration files and suggest changes at any time.

Future Work

I admit that this system for configuration is far from perfect. In a more ideal world:

  • There would be a web app you could log into and use to manage your own custom grammars, which would be loaded dynamically if you decide to log in before using the speech interface
  • You could create a custom configuration directly from the graphical interface, instead of having to edit the XML file by hand and generating the binary number string yourself
  • You could configure more than just the sets of lights you want to control - perhaps you could customize which level each light would be in your set and tie that to a specific command (e.g. "presentation mode" instead of "turn down the lights over the screen, turn up the guest area lights, turn off the couch lights, turn off the kitchen lights, etc..."). This would lead us closer to something like Ana's system, where you could Simultaneously Name And Configure.

Contact