OS X dictation alternatives

I dictate to my computer a lot. It helps me write faster and saves my hands for other pursuits.

In the last few years, dictation, both of the local and network-hosted variety, has improved to the point that this choice is no longer an infuriating time sink. Coding, versus writing prose, via dictation is still in its infancy and I continue to anxiously await Tavis Rudd’s release of the dictation system he demoed at several conferences last year (warning: videos may be NSFW thanks to some synthesized expletives).

In a conversation on Twitter earlier this week I noted that, despite considerable enhancements in the past few years, dictation on OS X doesn’t get discussed much — hence this post.

If you’re going to be playing with dictation, make sure you have a decent headset, properly positioned. Wired, noise-canceling USB headsets are not expensive, and even though Apple’s been adding microphones and improving noise canceling on their Macs recently, you still do better with a headset. If the dictation system you’re using doesn’t include an audio setup step, just record and play back some of your own speech to make sure it’s audible and relatively free of background noise.

On OS X, you have 4 choices for dictation:

Networked dictation

Networked dictation was introduced in OS X 10.8 Mountain Lion.  It’s similar to the dictation service on iOS, and benefits from its use by Siri. I appreciate its well executed incorporation into the OS: you can dictate effectively into nearly every text field everywhere; you can easily start and stop dictation from the keyboard; dictation alternatives (blue dotted underline) are part of the Cocoa text system, and dictated text nicely integrates its capitalization and sentence structure with the surrounding material. The software on the other end of the network has a huge vocabulary, including medical terms.

Usability disadvantages with this method of dictation as currently implemented include:

  1. no trainability (though, given it’s designed to be a speaker-independent system, this is less of an issue)
  2. no real-time feedback: dictation happens in 1-minute batches
  3. no editing by voice
  4. no error handling whatsoever. If the server fails to respond or recognize your words, up to a minute of spoken text is lost. This is somewhat understandable on iOS, but given the essentially infinite resources of OS X in comparison, it’s not defensible there. Ideally, I’d expect audio to be saved as a text attachment for deferred recognition, much like the Newton did with ink text.

There are also privacy issues, of course. I’m careful not to use this service to dictate anything sensitive, regardless of the promised or actual handling of my data.

“Enhanced Dictation”

OS X Mavericks (10.9) introduces “Enhanced Dictation”, a locally hosted version of Nuance’s recognizer. It’s neither installed nor turned on by default; you can enable it in System Preferences. Like OS X’s networked dictation, Enhanced Dictation is not trainable and doesn’t let you edit by voice, but it does let you mix keyboard/mouse editing and dictation. While it does provide the real-time feedback expected of a local recognizer and does away with the one-minute dictation limit, it’s the only one of these options I find unusable in practice.

Enhanced Dictation’s omissions of training and editing likely protect sales of the Dragon Mac products (discussed below). The bigger issue is that this seems fundamentally a speaker-dependent system without a method of training, resulting in frequent dictation errors you can’t fix. The vocabulary seems smaller than the networked alternative, though because of its frustratingly high error rate, I haven’t done a lot of testing. It also uses a lot of memory.

Dragon products

Nuance offers Dragon Dictate for Mac, MacSpeech Scribe and Dragon Dictate Medical. The Mac-specific components of these products and their predecessors have always been buggy and flaky. My experience with the support and sales surrounding them has ranged from incompetence to sleaziness. I have purchased several versions and upgrades of these products going back to the original pre-OS X, Philips recognizer-based versions, but I’m not going to keep supporting software that is this poorly developed, sold and supported.

Windows in a virtual machine

Nuance’s Windows dictation products (Dragon NaturallySpeaking and Medical/Legal) are better than their Mac equivalents, though that’s not saying a lot. The UI is a scattered, slowly-evolving mess; true interaction between keyboard/mouse and voice editing is limited to individual versions of specific applications, and the medical product is expensive (upgrades are $500 on sale).

The main reason I dictate into Windows is the ecosystem surrounding the Dragon products there. There are quite a few abandoned research projects and other near-abandonware to contend with, but it’s possible with some effort to construct a productive system. What I’ve done thus far is nowhere near what Tavis Rudd did, but it works for me. Natlink is a Python framework for building recognition systems, with several macro languages/frameworks built on top including Unimacro, Vocola and Dragonfly (the basis of Tavis’s system).

Microsoft also bundles speech recognition with Windows these days; I’ve used it very little, but it does work with Dragonfly.

My choices

I use OS X’s networked dictation for brief passages, and a Windows 7 VM for anything longer, like this post. I recently upgraded my Windows environment to the current Dragon Medical 2 (equivalent to NaturallySpeaking 12) and Word 2013. More on that setup is coming in my next post.

Apple II to Mac: Copying physical disks

As you may have noted if you follow me on Twitter or Flickr, I’ve recently been trying to preserve my and my family’s Apple II life from binders and boxes full of 5.25ʺ and 3.5ʺ disks.

There’s a remarkably rich ecosystem of Apple II emulation and file transfer software for the Mac. This and the next few posts, while by no means comprehensive, will discuss the hardware and software which are helping me to save this data. If you still have an Apple II legacy to save, hopefully they’ll help you as well.

Copying 5.25ʺ disks

I use a CFFA3000 in an Apple //e (my first computer). The CFFA software images a floppy (DOS, ProDOS, Apple Pascal, etc.) from a Disk II drive to a file on a CF card in a few seconds. It logs and aggressively retries on read errors.

Copying 800K 3.5ʺ disks

You could also use a CFFA3000, but my IIgs and Apple 3.5ʺ Drive are long gone.  Instead, I use a PowerBook G3 (PDQ), Apple’s last computer to support a SuperDrive, with Mac OS 9. Apple’s Disk Copy works for imaging and MacSFTP handles file transfer, as I wasn’t able to coax Mac OS 9 into connecting to the AFP server on current OS X versions.  Once on the Mac, I use Hazel to automatically convert the Disk Copy-generated NDIF (.img) images to data-fork-only UDIF (.dmg), which Sweet16 has no trouble with:

[Image: Hazel convert to UDIF]
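
Outside of Hazel, the same kind of conversion can be sketched with OS X's hdiutil command-line tool. This is a minimal Python sketch, not a transcription of the Hazel rule: the function names are hypothetical, the UDRW (uncompressed read/write UDIF) format is an assumption about what suits an emulator, and it only works where hdiutil can still read NDIF images.

```python
import subprocess
from pathlib import Path


def udif_convert_command(ndif_image, image_format="UDRW"):
    """Build the hdiutil command that converts one image to UDIF.

    UDRW is an assumed default; pick whichever UDIF variant your
    emulator accepts.
    """
    source = Path(ndif_image)
    target = source.with_suffix(".dmg")  # same name, .dmg extension
    return [
        "hdiutil", "convert", str(source),
        "-format", image_format,
        "-o", str(target),
    ]


def convert_to_udif(ndif_image):
    """Run the conversion; raises CalledProcessError if hdiutil fails."""
    subprocess.run(udif_convert_command(ndif_image), check=True)
```

Calling convert_to_udif("AppleWorks.img") would write AppleWorks.dmg next to the original and leave the .img file in place.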

Unfortunately, Disk Copy gives up quickly on read errors, but Mac OS 9 will mount ProDOS disks directly in the Finder, so I have been able to rescue a few individual files when the disk can’t be imaged as a whole.

My original expansion bay floppy drive stopped reading reliably after 10–15 disks. I could probably have cleaned it, but both the bare floppy drive mechanism (the same Mitsubishi one was used in PowerBook SuperDrives from 1994–1998) and complete floppy expansion bay modules are currently available on eBay for $10–15. I bought one of each; one appears to have been damaged in shipment, and I could always scavenge the mechanism out of my PowerBook 540 if I were desperate.

If you used 1.4 MB MFM disks on your Apple II, I imagine you may be able to get away with an external USB floppy drive, but I don’t have any such disks to test with.

ERA by Jawbone (2014)

Last May I posted a comparison between the Jawbone ERA and the BlueAnt Q3, with an emphasis on my primary use case of podcast listening. The final outcome of this was a bit anticlimactic: I managed to lose the Q3 during a trip and my ERA stopped misbehaving so I went back to it.

[Image: ERA Amazon packaging]

A few weeks ago, Jawbone introduced a new ERA, confusingly named “ERA by Jawbone”. Once again, iLounge has a review of the new device, which is worth reading as background before you dive into the following.

Externally the new ERA looks much the same as the old one, just smaller, lighter and thinner. I didn’t buy a version with the charging case (partially because Amazon doesn’t stock the black case) so I can’t opine on it, but I’ve used other-brand headsets with charging cases and found them useful. The device now plugs in to charge straight rather than at an angle, which might look a little less elegant (the headset curve no longer matches the curve on the USB cable) but is a lot more discoverable. The bundled charging cable is slightly shorter, but this may just be manufacturing variation. There’s no AC adapter (the old one wasn’t particularly compact, in any case) or carrying pouch included. The packaging, at least as ordered from Amazon, is refreshingly simple and easy to remove.

The new ERA has shorter battery life but no apparent change to the headset’s excellent range. I am able to listen glitch-free two rooms away from my iPhone, a feat unequalled by the other headsets I’ve used recently (the BlueAnt Q3 and JF3 Freedom).  The range depends just as much on the device on the other end: if I transmit from my Mac mini rather than my iPhone, I can’t even make it one room away without audible dropouts. So if you’re seeing poor performance from your computer’s internal Bluetooth, you might consider a USB Bluetooth adapter.

Protocol-wise, the new ERA supports Bluetooth Smart (4.0) with HFP 1.6, A2DP 1.2, AVRCP 1.4, HSP 1.2 and SPP, according to the Web site.  The previous ERA, introduced about 3 years ago in January 2011, implemented Bluetooth 3.0 (at least HFP, HSP and A2DP).

On to my complaints about the original ERA, with updates inline:

There’s no audible feedback for the first few seconds when turning on the headset. Startup is slow — total time from power on to pairing with an iPhone 4S is about 5 seconds.

I must have made a mistake in my previous post. As I just tested it, the old ERA takes ~4 seconds from power on to any audio feedback, and 9 seconds from power on to pairing with an iPhone 5s. The new ERA takes ~3 seconds to audio feedback (which doesn’t sound like a ripoff Mac startup sound any more) and another second to pair. Regardless, this is a substantial improvement — slightly slower than the Q3 but acceptable, given it takes a second or two just to attach the headset to my ear.

Volume control requires you to hold down the button while audio is playing (otherwise, it triggers Siri, the way I have it configured), and inevitably causes me to overcorrect and/or trigger Siri. It’s hard to figure out whether the volume is increasing or decreasing, as it alternates after it “bounces” off the extrema.

Volume control on the headset itself still works the same way, but is now synchronized with the host device’s volume, so for example, you’ll see the volume change on the iPhone screen while you adjust it on the headset, or you can use the volume controls on the iPhone itself. You’ll sometimes see the volume bezel pop up on an iOS device when the device connects. The Q3 still does this better.

Related to the above: triggering Siri requires holding down the button, hearing 2 beeps from the headset, then waiting a while before hearing Siri’s acknowledgement.

No change, but I’m pretty used to it by now. iOS 7 seems less willing to activate Siri at all when the phone is locked, so unless I’m actively using the phone I end up having to pull it out of my pocket anyway.

You’re supposed to be able to trigger various actions by tapping or shaking the headset. These never worked reliably for me.

These features (and, I imagine, the associated accelerometer) have been removed entirely from the new ERA.

I have to carry around two earbuds: one is comfortable to wear for hours when listening to audio, but it puts the headset so far away from my mouth that the phone call quality — or, more relevantly for me, Siri — is essentially unusable. The other earbud I use for phone calls, very infrequently. There’s also an ear hook, but it does not attach securely and has been awkward to the point of uselessness in my experience.

The new ERA includes only one style of earbuds, in several sizes, and no ear hook (loop). There is an alignment bulge on the headset which prevents the earbud from getting out of alignment. The good news is that for ease of placement, comfort, and generally staying attached to my ear and adjacent to my cheek while I move around, the new design is the best I’ve ever used on any headset. Siri works reliably and I used it for FaceTime earlier today without a hitch. The bad news is that there’s no in-ear-canal earbud option at the moment, and the alignment bulge precludes compatibility with the original ERA’s earbuds. More about this at the end.

My other complaints were intermittent issues, which I can’t evaluate yet as I’ve had the new headset for less than a day.

New features on the 2014 ERA

er…

As with the Q3, the new ERA can be simultaneously connected to two devices. This is opportunistic for both devices — typically it connects to the second device a few seconds after the first. The implementation is seamless. With both my iPhone and iPad connected, when I tap the play button on either device, it starts playing on that device and pauses the other device instantly. However, one time I tried this, it caused the audio to stutter on the newly-playing device until I tapped pause and play again. After restarting the headset, it was fine.

Bluetooth configuration and control may now be performed from an iOS (or Android) app, in addition to the previously-supplied Web app (which uses a companion Mac/Windows app to interface with the device). This includes:

  • Numerical display of battery percentage (though not time remaining as on the device itself, oddly enough).
  • Mapping a button press-and-hold to Siri or a “speed dial” for a contact of your choice. On a call, you can choose between cycling the volume or muting the microphone.
  • Selecting from one of several sets of voice samples, though the iOS app’s previews have yet to be completely updated (e.g., they still include the prior ERA’s startup sound).
  • Triggering a “Find” function. These headsets are small, to the point they often live in the coin pocket of my jeans, and the new one is quite a bit smaller. I seem to lose my headset at home every week or two while it’s turned on, so this is a feature I’d use frequently. It is thoughtfully implemented, first warning you via voice that you’ll be deafened in case you have the headset in your ear, then playing a tone with increasing volume.
  • Enabling or disabling simultaneous connections. You can also view connected devices here and (slowly) trigger a connection attempt to the other device.
  • Remote controlled playback (via AVRCP): tap the button 3 times to play and once to pause. You can’t change tracks, but particularly given the steps back in lock-screen audio control in iOS 7, this is appreciated.

Some configuration is still limited to the Web/native app:

  • Renaming the headset
  • Updating firmware (there was an update available today provided without release notes, which at least fixed a small volume adjustment issue I noted; all my tests were with the upgraded firmware)
  • Removing device pairings
  • Configuring caller ID by name for up to 20 contacts, either by manually entering names and numbers or importing from Google Contacts; unlike the Q3, the new ERA doesn’t support PBAP for direct transfer of names from a phone’s address book

What about SPP? My guess is that it’s used for configuration: if I connect to the serial port from the Mac, I can get it to spit out “unknown command” but that’s about it.

A delightful surprise, which I couldn’t find documented anywhere, is a “charged” notification delivered over Bluetooth Smart via the Jawbone iOS app. I can be in another room and the ERA will let me know on my phone that it’s finished charging. This seems to be a bit desynchronized with the visual indication (white LED ring) however: the headset is still pulsing red when the notification is delivered.

As is often the case with device manufacturers, neither the Web nor iOS software for managing the device is terribly reliable or easy to use. Just in the first day, I ran into layout issues with the Web app where the entire UI was invisible (white on white), a hang on quit in the Mac app, device name truncation and a repeated failure to connect from the iOS app. Switching between devices in the iOS app is also pretty confusing, as there are two ways to select devices: the AirPlay icon (which only shows devices that are connected), and the “J”-in-a-circle button, which also shows devices that are available, regardless of whether they’re connected. If you tap on a disconnected device here, you are given “Connect” and “Find” buttons, but tap on an already-connected device and nothing happens.

The earbud situation

The earbud situation is unfortunately a show-stopper for me at the moment. None of the supplied earbuds fit in your ear canal. With the supplied earbuds, the new ERA is fine if you’re in a quiet room, but inaudible if you’re walking on the sidewalk next to light traffic, or as I was this evening, doing laundry. A couple of washing machines running nearby are enough to make me strain to hear, in comparison with the old ERA with an in-canal earbud. If I take off the supplied earbud, the plastic inner piece does fit in my ear canal, but it’s none too comfortable.

The headset is obviously brand new at this point and even replacements for the existing earbuds are not yet available, despite references in the documentation. Jawbone offers three distinctly different earbud styles for the ERA (“type C”, “round” — the in-canal type — and “spout”) as well as for their previous headsets, so I am relatively hopeful that there will be similar options for the new ERA in the next few weeks to months.

SMART monitoring on USB/FireWire enclosures

As a followup to my prior post on external enclosures, a helpful commenter mentioned the OS X SAT SMART Driver, a third-party driver which provides SMART monitoring via USB/FireWire on OS X.  Other operating systems provide SCSI command pass-through support, meaning a separate driver is not required.

The installation instructions for the driver are confusing and contradictory in places, but you should be fine if you download a snapshot of the repository, install version 0.6 from the included disk image (SATSMARTDriver-0.6.dmg) and restart, then look in Disk Utility to see if SMART status appears for your drive (which may take a minute or two).  I had no problems on 10.6.8 or 10.8.4.

However, that just means the SMART driver itself is functional; the USB or FireWire enclosure/SATA bridge you’re using needs to support sending SMART commands to the drive as well. As is customary with “fringe” features of commodity hardware, reliable information about specific hardware support is hard to come by. The best I found was this list of USB devices supported by the smartmontools package (which also includes some notes on FireWire), but it doesn’t include most of the devices I have.

I originally used a Linux live CD with smartmontools preinstalled to test several of my USB 2/3/FireWire 800 enclosures. This provides more useful error information than OS X when SMART doesn’t work, but the set of devices supported by the OS X driver turned out to be identical.  (This is not necessarily the case.)
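
A check like this can be scripted when there are several enclosures to try. The following is a minimal Python sketch under stated assumptions: smartctl (from smartmontools) is on the PATH, the bridge is addressed with the SAT pass-through device type (-d sat), and the function names and the /dev/disk2 device path are hypothetical.

```python
import re
import subprocess

# smartctl -H reports ATA drive health on a line of this form.
HEALTH_LINE = re.compile(
    r"SMART overall-health self-assessment test result:\s*(\S+)"
)


def smartctl_command(device, device_type="sat"):
    """Build a smartctl health query for a drive behind a bridge chip."""
    return ["smartctl", "-H", "-d", device_type, device]


def parse_health(output):
    """Return the reported health string, or None if no result line exists.

    None usually means the enclosure did not pass the SMART command through.
    """
    match = HEALTH_LINE.search(output)
    return match.group(1) if match else None


def check_health(device):
    """Run smartctl against one device and return its parsed health."""
    result = subprocess.run(
        smartctl_command(device), capture_output=True, text=True
    )
    return parse_health(result.stdout)
```

For example, check_health("/dev/disk2") would return a string such as PASSED for a drive whose enclosure supports pass-through, and None otherwise. Depending on the platform, smartctl may need to be run with elevated privileges.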

The results weren’t great — links point to smartmontools output:

Assuming the above is representative, if you’re using a USB 2 or 3-only enclosure, chances are good that you’ll be able to get SMART info. If you’re using an older FireWire 800 enclosure, you’re probably out of luck, even over USB. If you’re using a newish FireWire 800 enclosure, it might work!

The MB080USEB-1SB above is a nice toolless swappable FireWire 800/USB 2.0/eSATA enclosure with a large (quiet) fan up front. It is available until the end of August for as little as $38 after a $40 mail-in rebate, which is a pretty great deal if you still need FireWire 800, and it was the only enclosure I tested that does support SMART over FireWire.

Speaking of fans, OWC now sells replacement fans for the miniStack — mine seems to have fixed itself as I discussed in my prior post, but it’s good to see the option.

Growl and Pester repeating alarms

Back in October, a Pester user sent me this email:

I work in media and I like to set recurring Growl reminders to save my work. Why Growl? Because I don’t want to actually be interrupted to dismiss the Pester alarm window, only see a reminder so I can hit Command+S and continue on my way.

If I set a recurring alarm that repeats and notifies with Growl (screenshot 1), the alarm will be listed in the Alarms List (screenshot 2).  However, it will only display the alert once and then not repeat and the recurring alarm will not show up again in the Alarms List (screenshot 3).  If I quit Pester and then reopen the Alarms List, the alarm will be listed again (screenshot 4) but after it displays that one time, it will not show up again.

I have tried upgrading Pester’s Growl framework to 1.3.1 using the Growl Version Detective, but that did not work.

At the time I responded that I could not reproduce the problem and promised to get back to it. I received another similar email today and finally had a chance to investigate.

Unfortunately, the problem is a bug in Growl 2.0.x and 2.1 that affects repeating Pester alarms for which you have Notify with Growl selected but not Display message and time.

Pester’s repeating (periodic) alarms work somewhat uniquely: if you’ve got an alarm set to trigger every 5 minutes, the 5 minutes don’t start until the previous repetition of the alarm finishes expiring. An alarm can finish expiring in one of several ways:

  1. If Display message and time is selected, the user clicks Snooze or Dismiss.
  2. If Display message and time is not selected, the user chooses Stop Alerts from Pester’s Alarm menu or presses ⌘. (Command-period) (and #1, if necessary)
  3. If Display message and time is not selected, all of the alarm’s selected alerts (voice, media, speech and/or Growl) complete or are dismissed by the user. For images or movies, the user can close the display window.

Growl promises to inform clients, such as Pester, when the user clicks on a notification, and when a notification times out (e.g., disappears from the screen). But Growl 2.0.x and 2.1 don’t do this, so Pester waits forever for the notification to time out or be clicked, the Growl alert never completes, and the alarm never finishes expiring. Alarms which are in the process of expiring aren’t listed in the Alarms window, which explains the behavior described above in which alarms disappear from the list.
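
The scheduling rule, and why a missing callback stalls it, can be modelled in a few lines. This is an illustrative Python sketch rather than Pester's actual code: the function name is hypothetical, times are plain seconds, and None stands in for a completion notice that never arrives.

```python
def fire_times(first_fire, interval, expiry_durations):
    """Return the times at which a periodic alarm fires.

    expiry_durations[i] is how long repetition i takes to finish
    expiring (for example, until a notification times out). None means
    the completion notice never arrives, so the alarm stays stuck.
    """
    times = []
    next_fire = first_fire
    for duration in expiry_durations:
        times.append(next_fire)
        if duration is None:
            break  # still expiring: the next interval never starts
        # The interval is measured from the end of expiry, not the start.
        next_fire = next_fire + duration + interval
    return times


# A 5-minute alarm whose notifications each take 10 seconds to time out:
print(fire_times(0, 300, [10, 10, 10]))    # [0, 310, 620]
# The second notification never reports completion, so nothing follows it:
print(fire_times(0, 300, [10, None, 10]))  # [0, 310]
```

In the second call the alarm fires twice and then stops, which matches the reported symptom of an alarm that goes off and then vanishes from the list.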

My apologies if you’ve experienced this problem.  If you have “stuck” repeating alarms which you can’t delete, you can use Stop Alerts (⌘.) to force an alarm that is waiting forever on Growl to reappear in the Alarms list.

Update: Pester 1.1b16 supports OS X notifications in 10.8 Mountain Lion and later, in addition to Growl notifications.  Unlike Growl, OS X notifications don’t even promise to provide any information about when they disappear from the screen, so Pester won’t wait for them. Pester 1.1b16 works around the problem by making Growl notifications behave the same way as OS X notifications when you are using Growl 2.0 or later — they won’t wait for you to dismiss them.  When a fixed version of Growl is available, I will revisit this issue.
