By Kyle Baxter
Last month, I discussed how I use speech recognition in OS X to make my Mac a little easier and more enjoyable to use. I use speech recognition to do relatively simple tasks, such as opening new email, replying to emails, sending new messages, opening chats with people in my Address Book, creating new iCal events, switching between applications, moving up and down pages in Safari, moving back and forward in Safari, and even opening websites.
I am going to focus on using speech recognition in a few key applications that I think you will enjoy the most. I will explain how to use speech recognition in Address Book, Safari, and Coda.
Enable Speech Recognition
I will assume you are using OS X Leopard. By default, speech recognition is turned off in OS X. To turn it on, open System Preferences, and then open the “Speech” preference pane. Next to “Speakable Items,” click the “on” radio button. Speech Recognition is now on, and you should see a round floating window in the lower right-hand corner of your screen. This window gives you immediate access to speech recognition no matter what application is at the front. If, however, you do not like it to be open, or you have a small screen, you can minimize it into the dock. Just double-click to minimize it.
Now that speech recognition is enabled, we need to calibrate the microphone’s volume. Click the “Calibrate” button next to the Microphone drop-down box, and follow the instructions. All you need to do is read off the list of phrases beginning with “What time is it?”, and adjust the slider until the meter stays in the green. This is also great practice for using speech recognition — when you say a command, and OS X recognizes it, the phrase will flash. Read through the list at least once.
Note, however, that you are not calibrating OS X to your voice — you are simply setting the microphone correctly. OS X does not need to “learn” each individual’s voice, as it can recognize a multitude of people’s voices.
With speech recognition enabled, and the microphone properly calibrated, it is time for you to make a decision. OS X gives you two choices for prompting speech recognition to start listening for commands. You can set it to listen only when you press a key (the escape key by default) or listen for commands continuously.
You can also set it to listen continuously but only respond to commands when a keyword of your choice — “Computer” by default — is said. I prefer to allow it to listen continuously, but depending on your surroundings, you may want to set it to respond only when you press the escape key. You set this under “Listening Method.”
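To make the difference between the listening methods concrete, here is a minimal sketch in Python. It is only a model of the behavior described above, not how OS X implements it; the names `ListeningConfig` and `command_to_run` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ListeningConfig:
    # "key": listen only while the listening key (escape by default) is held.
    # "continuous": listen all the time.
    method: str = "key"
    keyword: str = "Computer"      # the default keyword mentioned above
    keyword_required: bool = True  # only matters in continuous mode


def command_to_run(config, utterance, key_held=False):
    """Return the command text to act on, or None to ignore the utterance."""
    text = utterance.strip()
    if not text:
        return None
    if config.method == "key":
        return text if key_held else None
    if not config.keyword_required:
        return text
    keyword = config.keyword.lower()
    if not text.lower().startswith(keyword):
        return None
    rest = text[len(keyword):]
    if rest and rest[0].isalnum():
        return None  # "Computers ..." is a different word, not the keyword
    rest = rest.lstrip(" ,")
    return rest or None
```

In this model, `command_to_run(ListeningConfig(method="continuous"), "Computer, what time is it?")` passes the command through, while the same sentence without the keyword is ignored.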
A Little Background
With speech recognition running on your Mac, it is time to take it for a little test run. Hold the escape key if you set it that way, or otherwise just say, “what time is it?” If you enunciated well and there is not much background noise, your Mac should reply with the time.
Now try this: ask “How late is it?” Your Mac should reply with the time. You are probably assuming that both commands are programmed in. Well, now ask “What is the date?” As you would expect, your Mac will respond with the date. Now ask, “What day is it?” Hmm, same answer.
OS X, it turns out, is smart enough to know that the two questions in each pair mean the same thing. Its speech recognition can figure these out, and even cope with minor mispronunciations.
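One way to picture this behavior is a table in which several phrasings point to the same action, combined with approximate matching to absorb small slips. The Python sketch below is only an illustration that works on text with the standard `difflib` module. It is not how OS X recognizes speech, and the table `PHRASES` and the action names are assumptions.

```python
import difflib

# Hypothetical table: several spoken phrasings map to one action name.
PHRASES = {
    "what time is it": "tell_time",
    "how late is it": "tell_time",
    "what is the date": "tell_date",
    "what day is it": "tell_date",
}


def normalize(utterance):
    """Lowercase and drop punctuation so 'What time is it?' matches the table."""
    kept = "".join(ch for ch in utterance.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def recognize(utterance, cutoff=0.8):
    """Return the action for the closest known phrase, or None if nothing is close."""
    matches = difflib.get_close_matches(
        normalize(utterance), list(PHRASES), n=1, cutoff=cutoff
    )
    return PHRASES[matches[0]] if matches else None
```

With this table, "What time is it?" and "How late is it?" resolve to the same action, and a slightly mangled "Wat time is it" still lands on it.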
I am sure you are wondering where you can see a list of all of the available commands. To do this, on the speech recognition window (or bubble, as I call it), click on the down-arrow, then click on “Open Speech Commands Window.” This brings up a master list of all available speech commands for Address Book, the Front Window (whatever application is on top), the menu bar, and any global speakable items. When we are finished with this tutorial, play around with the different commands available and remember any that you really like. The great thing is, commands are plain English, so you should have little difficulty remembering them.
Let’s move on to Address Book and see what we can do with our voice.
Address Book
Address Book may not be open, so let’s open it. Rather than click on its dock icon or browse for it through the Finder, we are going to use our voice. Say, “Open Address Book,” and that is all you need to do. You can open any application this way.
Now that Address Book is open, we can do some great things. First, let’s say you are calling a friend to see if they want to have dinner at your favorite sushi restaurant, but you forgot their phone number. No problem: just say, “Phone for [your friend’s name].” If your friend is in your Address Book, a large, smoky overlay will pop up with your friend’s phone number. Your Mac will even read the number aloud if you say, “Speak it.” Whenever you are ready to dismiss the overlay, just say “thank you,” and it will disappear.
You have called your friend, but they already have plans for tonight. Instead, you decide to have breakfast together on Friday, and you would like to add it to your iCal calendar so you do not forget. Say “Breakfast with [your friend’s name] on Friday,” and as expected, your Mac will create a breakfast appointment on Friday.
Unfortunately, you must get back to work, and you need to send an email. Rather than open Mail, click “New Message,” and type in your contact’s email address or name, just say, “Mail to [contact’s name].” This will open a new email message with your contact’s email in the To: field. If you have a group set up in Address Book, you can even send emails to the entire group this way. Say “Mail to [group’s name],” and it will do exactly what you expect. I have my company colleagues set up in a group so I can start a quick email to all of them with just my voice.
Now that you have typed your message, say “Send,” and… well, you get the picture.
You can do even more with Address Book, but I will leave it to you to find out what else you can do.
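The Address Book commands above share a shape: a fixed phrase followed by the name of a contact or group. As a rough illustration, the Python sketch below matches typed text against those templates and checks the name against a list of known entries. The action names and the `parse_command` helper are hypothetical, and this is not a description of how OS X handles these commands internally.

```python
import re

# Command templates taken from the examples above; the action names are
# made up for this sketch.
TEMPLATES = [
    (re.compile(r"^phone for (?P<name>.+)$", re.IGNORECASE), "show_phone"),
    (re.compile(r"^mail to (?P<name>.+)$", re.IGNORECASE), "new_mail"),
    (re.compile(r"^breakfast with (?P<name>.+) on (?P<day>\w+)$", re.IGNORECASE),
     "new_event"),
]


def parse_command(utterance, known_names):
    """Return (action, fields) when the text fits a template and names a known entry."""
    text = utterance.strip().rstrip(".?!")
    lookup = {name.lower(): name for name in known_names}
    for pattern, action in TEMPLATES:
        match = pattern.match(text)
        if not match:
            continue
        fields = match.groupdict()
        name = lookup.get(fields["name"].strip().lower())
        if name is None:
            return None  # fits the template, but nobody by that name is known
        fields["name"] = name
        return action, fields
    return None
```

Checking the name against the address list mirrors the condition in the text that the friend has to be in your Address Book for the command to work.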
Safari
Let’s switch to Safari. You can either say, “Switch to Safari,” or “Open Safari.” Both will work.
It is the morning, and you would like to check Yahoo’s home page to see the top news headlines. You have Yahoo in your bookmarks bar, so you say, “Yahoo,” and Safari opens it for you. This will work for any bookmark in your bookmarks bar.
Now you would like to check Daring Fireball, which is in your bookmarks menu. Because you can use voice to access the menu bar in any application, you say, “Bookmarks menu,” then “Daring Fireball,” and Safari will open it for you.
Gruber posted a long article today, so you need to move down the page to read it. Rather than scroll or press the space bar, you can say, “Move page down,” and Safari will do just that.
This article also has an interesting link that you clicked on, but you are finished with it and want to move back to Gruber’s article. Say “Back,” and Safari will move back to the article. “Forward” works the same way.
You look at Daring Fireball quite a lot, though, so you want quicker access than going through the bookmarks menu, and you do not really want to add it to your bookmarks bar. Wouldn’t it be great if you could just say “Daring Fireball” and Safari would open it?
You can do just that with any web site you want. With Daring Fireball open, say “Make this page speakable,” and a little box will pop up, asking you what you want to say to make this page open. Type in Daring Fireball, or anything else you want, and click “OK”. Now try it — Safari should open your page.
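Conceptually, making a page speakable just ties a phrase of your choosing to an address. The short Python sketch below models that idea with a dictionary. The function names are invented for illustration, and it says nothing about where or how OS X actually stores speakable pages.

```python
def make_speakable(pages, phrase, url):
    """Register a spoken phrase for a page, replacing any earlier entry."""
    key = " ".join(phrase.lower().split())
    if not key:
        raise ValueError("phrase must not be empty")
    pages[key] = url
    return key


def page_for(pages, utterance):
    """Return the address registered for an utterance, or None."""
    return pages.get(" ".join(utterance.lower().split()))


pages = {}
make_speakable(pages, "Daring Fireball", "https://daringfireball.net/")
```

After this, `page_for(pages, "daring fireball")` returns the stored address, which a script could hand to Python's standard `webbrowser.open` to load the page.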
Speech recognition allows you to do quite a lot in Safari. It is a great way to check sites you frequent and move between pages.
Coda
What I want to illustrate here is that even in non-Apple applications, the toolbar works with speech recognition. If you have Coda, let’s open it up, and open a site to edit.
When editing an HTML file, normally you would click “Preview” to — surprise surprise — preview the page.
These different modes — Sites, Edit, Preview, et cetera — can be used by voice, however. Just say them. If you want to move to Edit, say edit; if you want to move to Preview, say preview.
This is not exactly revolutionary, but it changes up your workflow, which helps with creativity.
Not Perfect
Voice recognition in OS X is not perfect. It will not recognize everything you say, but it works surprisingly well. Voice recognition allows you to do small and menial tasks without even touching your computer, and makes using it more enjoyable.
I am convinced that using a computer for extended periods of time dulls creativity because there is very little interaction — there is only staring at a screen, typing, and moving a mouse. Controlling your Mac through voice, however, uses a part of your brain that is not used often when doing work, and I think ultimately this makes using your Mac a more enjoyable experience, and may even boost creativity.
