Review: $99 Voice Recognition Software (Windows)
A while back, Kyle wrote a tip about text-to-speech technology that you can use to turn written text into an audio file. In other words, you have the computer “read the text out loud,” but send the audio signal straight to an MP3 file for later listening on an iPod. This is an article about going the other way, i.e. voice recognition. Specifically, this is a review of the dictation-taking software called Dragon NaturallySpeaking version 9 (for Windows). I’m using it right now to dictate this article. I tried using an older version years ago, but was unimpressed and sent it back. This latest version, however, absolutely blows me away.
Out of the box, with hardly any training at all, I was able to dictate casual e-mails as well as formal documentation. The first thing I wrote was an e-mail to a friend telling her about the latest DVD that I had seen. It was one that starred Gary Sinise and Julianne Moore, and I was floored when I saw that the software not only recognized those two names as names (and thereby capitalized them), but spelled them correctly to boot. Now, when I said “hardly any training at all”, I meant it in both senses of the term. I did not have to read much of the manual in order to get started, and the software did not have to learn anything about me either. (Later, I did run a training session in which the software had me read it an excerpt from a Scott Adams Dilbert book, which was loaded with terminology commonly used in an office, and that boosted the accuracy even higher. That took about 10 minutes, if I recall correctly. Also, whenever I come across a special word or phrase that I use all the time, such as a company name or a website name, I can train the software specifically to recognize that word or phrase, and that takes less than a minute.)
What’s in the box: There are two consumer editions of this product: a “standard” version for $99, and a “preferred” version for $199. I bought the latter because it has features that I thought I needed, but so far I haven’t taken advantage of any of them. The hardware requirements are pretty high, but my two-year-old laptop copes just fine. In the box was the CD-ROM, an actual printed manual that’s halfway decent, and a basic, noise-canceling headset. I already owned a good headset (Plantronics), so I didn’t need the one that came with the software, but I tried it out just to see. It worked fine, but it was a tight fit for my big head and made my temple sore after a while.
Productivity boost: For someone like me who is neither a fast typist nor an accurate one, this software is a vast improvement. Dictating seems to let me “type” between 2 and 4 times faster than typing the old-fashioned way (not counting the times that I stop to think). As I said, the learning curve is pretty gentle. So, overall, I’m quite happy with this software. I’m convinced that it has already paid for itself.
Types of writing: My impetus for buying this software was that I now spend half my days writing English in one form or another (articles, blogs, e-mails, and documentation), whereas before I used to spend about 90% of my time programming. I was convinced that voice-recognition software couldn’t possibly help with writing program code, and it doesn’t. But something surprised me once I got used to dictating things: I figured out that this is a great way to take notes while I’m programming. I now keep a Notepad window open the whole time I’m working, and every time I think of something that I want to remember I’ll switch over to that window and dictate a note to myself: code that ought to be refactored later, notes for the QA testers, theories and hypotheses about how some legacy code works that I’m trying to fix, ideas for test data, etc. This goes really fast, because they’re only notes to myself, so I don’t worry about getting them perfect. And, because it does go so fast, I tend to make longer notes with more detail than I normally would, which means that when it’s time to go back over the notes and address those action items, I can immediately get right back up to speed, even if it’s weeks later.
Nitty-Gritty Details: The following are selected notes about Dragon NaturallySpeaking from a cheat sheet that I made for myself while I was figuring out how to best take advantage of it. I offer them here as a glimpse into the nitty-gritty details of exactly what the software does and how it works, but this is by no means a complete tutorial…
Some setup and troubleshooting notes:
- There is an option to require the word “click” before any command so that you do not activate commands accidentally.
- To program special vocabulary words, use Words | View/Edit. This is the preferred way to program anything that’ll appear in-line. For example, you could program it to recognize “my e-mail address” but type instead “me(at)example.com”.
- For multi-line macros, you cannot use a vocabulary word. Instead, you need to define a “command”. For example, you could program it to recognize “my sig file” but type instead all of your contact information, spanning multiple lines. The only difference between invoking a (custom) vocabulary word and invoking a (custom) command is that you have to pause for half a second before saying a command phrase.
- If the voice-recognition starts acting as if it can barely hear you, then make sure the microphone is actually plugged in (you could be working off the laptop’s microphone).
Shortcut keys:
- The plus key on the numeric keypad toggles the microphone on and off. (This one is especially important to know, because turning the microphone on is one of the few things you cannot do with a voice command. You have to use the mouse or this shortcut key.)
- The minus key on the numeric keypad opens correction mode.
- These numeric keypad keys work for me, because I don’t use the number pad much. You might need to change them, though.
Dictating punctuation and special characters:
- You can use the international radiotelephony spelling alphabet, often called the NATO phonetic alphabet, to spell out letters (“Alpha Bravo Charlie” for “abc”), if you happen to know it. It recognizes three different versions.
- The default mode is to automatically insert commas and periods, but you might prefer to turn that off and deliberately say “comma” and “period” yourself.
- To capitalize a Word, say “cap” before the word.
- To uppercase a WORD, say “all caps” before the word.
- For CamelCase words, say “no space” between the words.
- For parentheses and quotation marks, you have to say “open quote” and “close quote”, not “left…”/“right…” (unless you specifically program that vocabulary, of course).
- For a double-hyphen, you can say “dash”.
- To force entry of a single digit, rather than the word for that digit, say “numeral” before the digit.
- To force entry of multiple digits, say “numbers mode on”, the digits, then “numbers mode off” (although it seems to figure it out most of the time anyway).
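For readers who think in code, here is a toy Python sketch of how a few of the spoken modifiers above map onto typed text. It is only an illustration of the rules as I described them, and certainly not how the software works internally. The function name `render` and the `DIGITS` table are made up for this example, and it only handles “cap”, “all caps”, “no space”, “numeral”, and “dash”.

```python
# Toy model of a few dictation modifiers. Everything here is hypothetical
# and for illustration; it is not based on the product's internals.

DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def render(spoken: str) -> str:
    """Turn a lowercase string of spoken words into typed text."""
    tokens = spoken.split()
    out = []            # words typed so far
    join_next = False   # True right after the speaker says "no space"
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None

        if tok == "cap" and nxt:
            # "cap hello" -> "Hello"
            word, i = nxt.capitalize(), i + 2
        elif tok == "all" and nxt == "caps" and i + 2 < len(tokens):
            # "all caps urgent" -> "URGENT"
            word, i = tokens[i + 2].upper(), i + 3
        elif tok == "numeral" and nxt in DIGITS:
            # "numeral nine" -> "9"
            word, i = DIGITS[nxt], i + 2
        elif tok == "no" and nxt == "space":
            # Glue the next word onto the previous one.
            join_next, i = True, i + 2
            continue
        elif tok == "dash":
            # "dash" -> double hyphen
            word, i = "--", i + 1
        else:
            word, i = tok, i + 1

        if join_next and out:
            out[-1] += word
        else:
            out.append(word)
        join_next = False

    return " ".join(out)


print(render("cap camel no space cap case"))  # CamelCase
print(render("numeral nine lives"))           # 9 lives
```

The real product obviously does far more (it has to decide whether “cap” was meant as a command or as a word, for one thing), but the sketch shows why saying the modifier before the word is what matters.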