WATA Bulletin - Fall 1999
Contents:
Update on Voice Recognition: Will it work for you?
November 1999
Kurt L. Johnson, Ph.D.
Associate Professor and Head
Division of Rehabilitation Counseling
Department of Rehabilitation Medicine and
Director, U.W. Assistive Technology Resource Center
Sharmon Morris, M.S.
Rehabilitation Counselor
U.W. Medical Center and
Assistive Technology Clinic
Several years ago we published a brief article on voice recognition for computers and posted the text on our WWW site. This article has received more attention than any other article we have published. A great deal has changed since then, in both voice recognition software and computers, so we have revised the article and are republishing it here.
So - you want to use a computer for work, school, or play, but you cannot use a keyboard efficiently. Perhaps you have lost full use of your hands due to spinal cord injury, orthopedic trauma, carpal tunnel, muscle weakness, or some other reason. You have heard that programs exist that allow you to "just talk to the computer." Will one of these programs be of use to you? This is one of the most frequent questions we are asked at conferences and on our information and referral line. We will review these issues in this article.
The general issue here is, "how can I gain access to my computer - access that is suited to my skills, abilities, preferences, work environment, and work tasks (or school or play)?" The most conventional method of computer access is to issue commands to the computer from a keyboard, and by pointing and clicking a mouse. When an individual is not able to use these access routes (or prefers not to), then we look for alternatives. Computer technology changes rapidly so the observations we make here may not be true tomorrow.
Many people, regardless of whether they have a disability, would like to just talk to the computer and have it respond. A number of applications permit voice input to some degree. With either specialized or off-the-shelf software, many functions of a computer can be controlled by voice; for example, one might say "open file" or "close file" to initiate an action. But realistically, most of what we use computers for in work and education is writing and, to a lesser extent, data entry. To write, we want the computer to recognize our spoken language as input just as if we were keyboarding. This is called voice recognition. Essentially, we want to be able to dictate our thoughts and have them transcribed by software and appear as text.
In the past, voice recognition systems required "discrete" speech - that is to say, you had to .. allow .. a .. brief .. period .. of .. silence .. between .. each .. word .. you .. speak .. in .. order .. for .. the .. computer .. to .. recognize .. your .. speech. Current systems have improved sufficiently that one can very nearly speak in a normal fashion, but there are tradeoffs that we will describe in a minute.
Although vendors often demonstrate speech recognition systems with claims of speeds in the 80 words per minute range (about the speed of a good office typist), we have not seen users exceed about 50 wpm in real-life settings. The exception is routine language, where one can use pre-set groups of text (called macros) and insert only the new material, almost like filling in a form. Also, most voice recognition systems achieve about a 90 - 93% accuracy rate. That's great, but if you are trying to compose and every 10th to 14th word is wrong and must be corrected, the corrections make a greater demand on your "cognitive horsepower" as you try to compose your message.
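To see what those accuracy figures mean in practice, here is a minimal sketch in Python. The function names are ours and purely illustrative; the only numbers taken from the text are the 90 - 93% accuracy range and the 50 wpm real-life speed. It assumes errors are spread evenly through the dictation, which is a simplification.

```python
def words_between_errors(accuracy: float) -> float:
    """Average number of words dictated per misrecognized word.

    `accuracy` is the fraction of words recognized correctly (0.90 = 90%).
    Assumes errors are evenly spread, which real dictation will not be.
    """
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be at least 0 and less than 1")
    return 1.0 / (1.0 - accuracy)


def errors_per_minute(wpm: float, accuracy: float) -> float:
    """Expected number of words needing correction per minute of dictation."""
    return wpm * (1.0 - accuracy)


if __name__ == "__main__":
    # Accuracy range and speed quoted in the article.
    for accuracy in (0.90, 0.93):
        gap = words_between_errors(accuracy)
        fixes = errors_per_minute(50, accuracy)
        print(f"{accuracy:.0%} accurate: about one error every {gap:.0f} words, "
              f"or {fixes:.1f} corrections per minute at 50 wpm")
```

Even at the upper end of the range, a user dictating at 50 wpm would be stopping to correct a word several times each minute, which is why the correction process matters so much.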
Correcting errors is another issue. When you notice that the word on the screen is not the word you said and begin the error correction process, a list of alternatives will pop up. You might say "choose five" to select the fifth alternative, and you would then be on your way to continue dictating. If the correct word is not on the list, however, you must begin spelling the word until the software "guesses" the correct word. The newest programs allow you to spell using the natural alphabet (versus the military alphabet "alpha" "bravo" "charlie" used by older discrete speech versions).
With all voice recognition systems, mixing keyboard strokes, mouse pointing and clicking with dictation dramatically increases the efficiency of the program. For example, if you have some hand function and can use the keyboard to make corrections, or can click on correct selections, or can open files, you will speed up your use of the program a great deal. Remember to always consider alternatives to voice recognition. For example, someone who is really good at using Morse code by sip-and-puff input can achieve up to 40 wpm and use a slower computer, and "blowing" the Morse code does not compete cognitively in the same way as dictation for many people.
OK - let's look at the major systems that are available and summarize the advantages and disadvantages of each. Remember to add in the price of the required computer system when thinking about the economics.
Naturally Speaking from Dragon Systems
PROS: This system allows you to speak "naturally" without using discrete speech. The vendors state that it is able to achieve up to 95% accuracy after an initial 20 minutes of training. They have eliminated the use of the military alphabet and allow correction spelling to be done with the regular alphabet. The newest version is integrated with Microsoft Word 97 and Corel WordPerfect; however, use of these word processing programs does slow overall performance. This version has also expanded the macro options, allowing more voice macros for common tasks. Voice recording and text to speech are available should one find these features helpful. This version can also be used with the add-on Naturally Speaking Mobile, which allows a user to dictate into a recorder and download the dictated text into a software program.
CONS: Training is difficult for a poor reader. The user must be able to read out loud (or remember and repeat) long sentences of complex words in order to program his or her voice. Generally speaking, a person must have about an 8th grade reading level to be able to use the "pre-programmed" training modules. A person can create a customized training; however, this can be very complicated and requires assistance from someone familiar with the process. The user is also required to program his or her voice all in one sitting without turning the computer off. This can create problems for people with disabilities or reading difficulties. The command and control features have improved; however, the system is still not accessible for someone in need of total hands-free use.
Requirements: 200 MHz Pentium; 64 MB RAM minimum; 240 MB hard drive space
Price: $199
DRAGON Version 3.0
PROS: Can be used to operate off-the-shelf word processors, such as Word for Windows/Office 97, and operating systems, such as Windows 95. It is also compatible with Naturally Speaking (e.g., can open and manipulate files created in the proprietary word processor required in Naturally Speaking). In addition to providing "speech to text" input, it allows "text to speech" output ("reads" the screen with voice output), has a mouse grid permitting voice input mouse function, and has a large capacity for macros. This system can "learn" to understand many kinds of dysarthric speech, or the speech of people for whom English is a second language. Allows for total hands-free control of the computer, which is currently lacking in the continuous speech systems.
CONS: Is currently being phased out and is no longer bundled with Naturally Speaking Version 3.52. One may still obtain it through authorized resellers by special request; however, this is likely to change in the future. Requires discrete speech and requires the military alphabet for correction.
Requirements: 486/66 and 16 MB of hard disk space. Can be run on Win 3.1, 95 or NT, 43 MB hard drive space, 20 MB of RAM.
Price: $149
IBM VIAVOICE98 EXECUTIVE
This is a continuous speech system from IBM. Recent improvements in this product have made it more accessible for users with disabilities. Allows for command and control of some Windows applications. Allows direct dictation into Word 97 as well as its own proprietary word processor. Allows the user to create macros to automate tasks. The training is simpler than that for Naturally Speaking but is still difficult for a poor reader. Has increased its command and control features but is not totally "hands free."
Requirements: Pentium 200, Win95, 98 or NT, 64 MB RAM, 250 MB hard disk space.
LERNOUT & HAUSPIE - VOICE EXPRESS PROFESSIONAL:
This is a continuous speech recognition system which allows a user to dictate directly into and control all of MS Office and other Windows applications. Command and control features are very "natural," using natural language commands. Allows the user to create macros to automate tasks. Again, training is difficult for a poor reader.
Requirements: Pentium 166 with MMX, 40 MB RAM, Win95, 98 or NT, 130 MB hard disk space, 16 bit sound card.
Conclusion
Voice recognition is not for everyone. It imposes a heavy cognitive load and makes other demands on the user, including memorization of multiple commands; the ability to differentiate when to use each command; the ability to track multiple sequences of events; the ability to complete multiple-step commands; the ability to generate spontaneous text; and strong respiratory support.
The older discrete speech versions can be adapted to work with people who use ventilators by programming out the ventilator sounds. Other consistent noises can be programmed out as well (e.g., a door shutting, a phone ringing, a dog barking). This is not true of the newer, continuous speech versions, and the older versions either have been or are being phased out.
People should have a trial of a voice recognition system before purchasing. This can help dispel the idea that you just put on a headset and speak to the computer. A trial and an appropriate AT evaluation can help to identify whether speech recognition is, in fact, the most appropriate access method for an individual. One should also be aware that even the older discrete speech software requires at minimum a 486/66 with 16 MB RAM, and that the continuous speech systems listed above require considerably more (a Pentium 166 to 200 with 40 to 64 MB RAM). Also, not all sound cards are compatible with the various software programs. Special microphones must also be purchased for use.
Useful Websites:
Dragon Naturally Speaking: www.dragonsys.com
Lernout & Hauspie Voice Express: www.lhs.com
IBM ViaVoice: www.software.ibm.com/speech/
Legislative & Policy Update
By Frances E. Pennell
Olympia: The Medical Assistance Administration (MAA) has been hard at work finalizing Washington's new Children's Health Insurance Program (CHIP). The state plan was approved by HCFA on September 8, 1999. The program will serve children under age 19 in families with incomes between 200% and 250% of the Federal poverty level (approximately $2,784 and $3,480 per month for a family of four). CHIP will be administered by MAA and will provide the same benefits including durable medical equipment, prosthetics and orthotics, occupational therapy, physical therapy and speech therapy. Parents will pay a monthly premium ($10 per child up to $30 per family) and some co-pays. Applications and enrollment begin in January 2000. For more information, visit the MAA website (http://maa.dshs.wa.gov/CHIP/Index.html) or contact your local community service office.
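For readers who want to check the figures, the premium rule and the family-of-four income range described above can be sketched in a few lines of Python. This is an illustration only, not MAA's eligibility logic: the function names are hypothetical, the income limits are the approximate monthly figures quoted above for a family of four, and we assume both ends of the range are included.

```python
# Approximate monthly income limits for a family of four, as quoted above
# (200% and 250% of the Federal poverty level).
FAMILY_OF_FOUR_MIN = 2784
FAMILY_OF_FOUR_MAX = 3480


def chip_monthly_premium(num_children: int) -> int:
    """Monthly premium in dollars: $10 per child, capped at $30 per family."""
    if num_children < 0:
        raise ValueError("num_children cannot be negative")
    return min(10 * num_children, 30)


def in_income_range_family_of_four(monthly_income: float) -> bool:
    """True if the income falls in the quoted range for a family of four.

    Assumption: both endpoints count as inside the range.
    """
    return FAMILY_OF_FOUR_MIN <= monthly_income <= FAMILY_OF_FOUR_MAX


if __name__ == "__main__":
    for children in (1, 2, 3, 4):
        print(f"{children} child(ren): ${chip_monthly_premium(children)} per month")
    print(in_income_range_family_of_four(3000))
```

Co-pays are not modeled here because the amounts are not given above.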
OSPI completed draft rules implementing the 1997 amendments to IDEA on September 1, 1999. Final regulations will be available soon. The draft regulations incorporate the federal requirement that all IEP teams specifically consider whether a student needs AT devices and services. If the team determines that AT devices or services are needed, it must include a statement to that effect in the IEP (Proposed WAC 392-172-161). The regulations also address home use, stating that "On a case-by-case basis, the use of school-purchased assistive technology devices in a student's home or in other settings is required if the student's IEP team determines that the student needs access to those devices in order to receive FAPE" (Free Appropriate Public Education) (Proposed WAC 392-172-075). Other provisions spell out district responsibility for services and devices for special education students enrolled in private schools (Proposed WAC 392-172-232 to 246), and the circumstances under which a district may ask parents to tap Medicaid or CHIP (Proposed WAC 392-172-50300) and/or their private insurance (WAC 392-172-50305) to pay for special education services. Copies can be obtained from the OSPI website at http://inform.ospi.wednet.edu/sped/speced.html.
The MAA is proposing changes in its Durable Medical Equipment regulations and is considering revisions in many other rules. Contact the DSHS Rules Coordinator at (360) 664-6094 or wallpg@dshs.wa.gov if you want to be notified of proposed revisions, or visit the DSHS rules website at http://www.wa.gov/dshs/dockets to find out what rules are going to hearing. The Insurance Commissioner has proposed new rules which would require health insurers to review consumer appeals of denials of treatment within two weeks, or within three days in the case of an emergency. The review must be conducted by someone with the training, experience and expertise to render a competent decision! The Commissioner also finalized rules requiring prompt payment of claims and giving medical providers other rights vis-a-vis insurers. Copies can be obtained from the OIC website (www.insurance.wa.gov/tableofcontents.htm).
Washington D.C.: Congress is in the final throes of developing its FY 2000 budgets and considering many bills of interest to AT users. To obtain current information about pending Federal legislation, visit WATA's website (wata.org) and click on "Policy."
Calendar of Events
Technology, Reading & Learning Difficulties - January 27 - 29, 2000, San Francisco, CA
TRLD 2000 offers a comprehensive selection of 126 sessions and workshops. Conference sessions will be focusing on actual classroom and administrative applications. There also will be exhibitor presentations at which you can learn about a variety of software programs and/or other products. For more information call: Diane Frost, President/CEO, Educational Computer Conferences Inc., 19 Calvert Court, Piedmont, CA 94611-3435, (888) 594-1249 (Pacific Time, 8:00 am - 5:00 pm).
Technology and Persons With Disabilities - "Where Assistive Technology Meets The Information Age"(tm), March 20 - 25, 2000, Los Angeles, CA
CSUN's 15th Annual International Conference is a comprehensive international conference where all technologies across all ages, disabilities, levels of education and training, employment, and independent living are addressed. Keynote speaker will be Tom Whittaker, the first person with a disability to climb Mt. Everest. Contact: Center On Disabilities, California State University, Northridge, 18111 Nordhoff Street, Northridge, CA 91330-8340, (818) 677-2578, or on the Web at http://www.csun.edu/cod/.
Introduction To Assistive Technology - Summer Institute: June 19 - 23, 2000, Seattle, WA
The summer institute will provide participants with a comprehensive introduction to the field of assistive technology (AT). Through lecture, demonstrations of AT devices, case studies, and hands-on experiences with AT we will focus on an interdisciplinary approach to the selection, implementation and use of technology to meet the educational, vocational, transitional and independent living needs of individuals (adults and children) with disabilities. Participants will increase their understanding of the advantages and limitations of technology; improve their assessment, intervention and advocacy skills; learn to formulate and follow functional assessment strategies; and increase their knowledge about electronic information technology for accessing resources and networking.
RESNA 2000 Annual Conference - June 28 - July 2, 2000, Orlando, FL
RESNA 2000 will bring together people who use, develop, manufacture, and deliver these technologies. It will provide an informative, thought-provoking educational forum with a diversity of presentations, including: instructional courses, scientific and interactive poster papers, concurrent sessions, computer tech and environmental control systems labs and more. For more information call: RESNA, 1700 North Moore Street, Suite 1540, Arlington, VA 22209-1903, (703) 524-6686, email: info@resna.org or on the Web at http://www.resna.org/.
New Millennium: Research to Practice (Call for Papers)
The 11th World Congress of the International Association for the Scientific Study of Intellectual Disabilities (IASSID) will be held August 1-6, 2000, at the WA State Convention and Trade Center. The abstract deadline is November 15, 1999. For more information, visit their website at http://www.waisman.wisc.edu/iassid/.