
Workaround for Dragon Medical Practice Edition on high-DPI displays

In my last blog post, I pointed to a Nuance support article which indicated that there was no support for high-DPI displays in DMPE. This remains the case in 2.3: out of the box, text is scaled and blurry, and icons and windows are placed incorrectly relative to the insertion point. The article provides no useful workarounds.

I was working on something else today, when I remembered “hey, doesn’t Windows have compatibility options for this?” Of course it does! Find natspeak.exe and check the following box in Compatibility Properties:

[Screenshot: Fixing high DPI display.png]

With this set, DMPE looks better and no longer has window placement issues. DMPE complains once during launch that it doesn’t want to be in compatibility mode, but it doesn’t stop you.

Dragon Medical updates

Dragon Medical Practice Edition (DMPE) 2.3 was released as a free update at the end of June — which I didn’t notice until last week, since I had the automatic updater disabled because it was so bad, and had assumed that there were going to be no further updates before another paid upgrade.

The reality turned out to be good in the short term but potentially concerning in the long term.

Dragon Medical Practice Edition 2.3

DMPE 2.3 (the version in the About box is actually 12.53.350.033) does add partial compatibility with Windows 10, Internet Explorer 11 and supposedly Office 2016 — I am still using 2013 on the Windows side. It does not support current versions of any modern browser (Firefox, Chrome or Edge). It also fixes a very irritating and long-standing issue described in the release notes as “Too many internal messages overwhelm the system and cause Dragon to appear frozen.” These freezes would last for many seconds and happen seemingly at random, but often manifested when trying to turn dictation on and off. Now toggling dictation is a reliably sub-second operation — I have not noticed a single such freeze since upgrading to 2.3, which is incredibly gratifying.

My dictation buffer setup

I continue to iterate on my dictation buffer setup. Since my most recent post on the subject about a year ago, I have fixed some pesky bugs and worked on further virtual machine automation. This has involved such diverse things as performance profiling that helped me partially work around the above freezes, and editing the Windows registry to prevent the Dragon Word addin from getting automatically disabled. I am currently hard-coding my profile path, which is necessary to fully automate dictation startup with a roaming profile; you’ll need to change the path if you want to try it yourself.

Macro performance

I had been automating some basic formatting tasks with the built-in Dragon “Advanced Scripting”, as it’ll work on the hospital’s Dragon 360 environment as well as my home/laptop DMPE setup. Unfortunately, Advanced Scripting simulates keystrokes extremely slowly (and apparently slower still on Windows 10). I recently discovered that the older Dragon NaturallySpeaking macro language is still supported; you just need to import an old command, which you can then duplicate. The old language appears to be at least an order of magnitude faster even on Windows 7, so I’ll be using it in future.

Windows 10 and high-DPI support

In addition to my virtual machine Windows 7 environments, I have recently installed DMPE on Windows 10 in Boot Camp on my 13ʺ Retina MacBook Pro, to see if I can do some of my work without a dictation buffer at all. This has been frustrated by DMPE’s lack of support for high-DPI displays (which persists in 2.3). So instead of using high-DPI, I set the display magnification to 100%, try to use font size adjustments where available to make text readable, and squint or use Magnifier where such adjustments are not available. (I found a better workaround later.) Despite several years and Windows versions, high DPI support in Windows is still inconsistent and buggy. OS X looks amazing by comparison.

Another issue with Windows 10 in Boot Camp, unrelated to DMPE, is Boot Camp’s keyboard and trackpad drivers. Unlike on OS X, remapping Caps Lock to Control does not eliminate the accidental-activation delay, and while a kind soul has reverse engineered the appropriate HID commands to remove the delay, I will need to port the code to Windows before I can make use of it. The trackpad is even worse with no workaround of which I’m aware — simple trackpad activities such as clicking and right clicking are completely reliable when booted into OS X but inconsistent on Windows.

The future

The future contains many potential pitfalls.

If Apple switches to ARM-based Macs, running Windows in virtualization will become untenable and I will likely need to start using a non-Mac as my primary machine, or attempt to make do with a Mac version of Dragon Medical.

EMC, in their great wisdom, laid off the entire team developing VMware Fusion for OS X. There has been a single patch release of VMware Fusion since then, which didn’t seem to break anything too horribly, but I don’t have high hopes for the future of the software. The only other option for high-performance virtualization on the Mac, Parallels, is well known for its shady business practices, would be more expensive, would require more frequent upgrades and, from what I can read, has poorer-quality audio driver support.

Currently, DMPE is still based on Dragon 12, which is two major versions behind the current non-medical version. Nuance has publicly stated that there will be no further major releases of DMPE, and that its future is Internet-connected, subscription software which would end up costing about 5-10× as much ($135/month) in my usage model. In addition, automation choices appear to be substantially reduced or eliminated in this future version, which would likely mean I would have to rewrite or completely abandon my dictation buffer software.

The good news is that with Windows 10 and Office 2016 support, and a relatively new laptop, I’ve got a few years to persevere with my current setup before I am forced to make any changes.

I continue to hope that speech recognition’s entry into the mainstream will eventually penetrate the medical market, finally dragging medical speech recognition out of its expensive, flaky, buggy backwater into the 21st century. In the meantime, I will be thankful for small victories, such as that I didn’t experience any freezing while I dictated this blog post.

The Dash: 2.0 updates

This is turning into The Dash blog recently. Hopefully after my current 1-2 month crunch I’ll be able to get back to posting about other things!

In any case, The Dash firmware 2.0 (now “Bragi OS 2.0”), which I wrote about in my most recent post, is now released. Full release notes are here. The major version number increment is definitely deserved, as it adds a lot of features, not all of which I’ve had a chance to evaluate. The timing of these changes does seem to correlate with more general availability for the earphones.

The pairing procedures for both the Bluetooth and Bluetooth Low Energy sides have been overhauled. Both now require pairing codes which theoretically prevent spoofing. More usefully day-to-day, you no longer need to explicitly pair the Bluetooth Low Energy side every time you want to use it. Instead, just inserting it in your ear is enough for it to show up in the app. Both audio and visual prompts during pairing are very well done. Given the design of the Bragi app, I’m pretty sure this pairing procedure was intended all along.

While not commented on in the release notes, the touch sensor seems less sensitive to hair brushing against it.

Microphone (hence Siri) quality is once again somewhat improved but it still sounds like you’re talking inside a tin can. The iPhone’s internal microphone is a lot better if you’re able to use it instead.

My trick to give the left and right Dash different names still works, and unfortunately you still need to rename the device twice to get it to stick.

I’ll be trying out cadence detection for cycling tomorrow; I’m not much of a swimmer.

There’s still a lot more to come, including, eventually, an SDK…

The Dash: disconnecting from the Dash side

Back in April, I noted “You can't initiate a disconnection or pairing from the right Dash once it's connected.” This is still technically true in the current firmware (1.5.1) — but I just discovered a convenient workaround, at least on iOS. Typically, I find this an issue when the Dash is connected to a device across the room (or inside my bag when I'm on my bike) and I want to pair it with something else closer to me.

Here’s how to do it:

  • Tap and hold on the right Dash until you hear the tone.
  • Wait another second or so until you hear the “Siri is listening” tone.
  • Say “turn off Bluetooth.” The iOS device does just this, severing its connection to your Dash.
  • You can then connect/pair the right Dash to another device.

The Dash firmware 2.0 is now in private beta testing. Unfortunately I didn't respond quickly enough to the call for testers on Facebook to get in the pool. The advertised list of upcoming features is pretty enticing:

  • Enhancements to activity tracking, especially for swimming and cycling
  • Changes to the feedback of metrics during activities; metrics are also logged in the Bragi App
  • Calibration of The Dash sensors to improve accuracy
  • Major enhancement to the speech quality during phone calls
  • Changes to audio playback to improve clarity and quality, as well as significantly boosting the maximum volume level
  • Improvement to the Bluetooth & BLE connectivity with other devices and apps, as well as implementing security during BT pairing and bonding to ensure data privacy
  • Implement more remote data channels with the Bragi app

The Dash is getting a lot of competitors in the cord-free Bluetooth headset market. I hope Bragi is able to keep up and realize more of their vision, while fixing practical issues such as those related to pairing and Bluetooth range.

Dragon NaturallySpeaking roaming user profiles with Apache

Some editions of Dragon NaturallySpeaking (including Medical) support a Roaming User Profile feature. With this, you can store your voice profile on a server and download it to/upload it from computers on which you dictate. Like most aspects of Dragon NaturallySpeaking, it’s unnecessarily complex and flaky, but I got it to work in my distinctly non-enterprise environment a few weeks ago. For anyone else in a similar situation who wants their training, custom dictionaries and commands to follow them, I hope the following is helpful.

I assume here you have an existing local user profile to migrate. Dragon NaturallySpeaking’s WebDAV client is inefficient and includes many configuration options of dubious utility, but does (eventually) work. For WebDAV on IIS (or SMB), the instructions in the administration manual appear relatively complete. The manual mentions Apache compatibility but includes no setup information, nor could I find any elsewhere on the Internet. So, my server examples use WebDAV with the Apache HTTP server 2.4.x.

Setting up a WebDAV server

It's 2016 and you should be using SSL/TLS by now. Mozilla has a nice SSL configuration generator; this is the configuration I'm using. The newest protocol Dragon NaturallySpeaking 12 claims it supports is TLSv1, so the "modern" configuration likely won't work.

My configuration follows. Set up authentication however you like; I use digest auth behind SSL/TLS. Obviously, replace my file paths as appropriate. The Dragon NaturallySpeaking WebDAV client configuration includes options to follow redirects, but they don't work properly and aren't compatible with connection keep-alive. Thankfully, Apache has a workaround for such brokenness (redirect-carefully). The client expects infinite-depth requests to work, hence DavDepthInfinity on.

DavLockDB /var/www/sabi.net/webdav/dav_lock.db
<Directory /var/www/sabi.net/public/dragon>
        Dav On
        DavDepthInfinity on
        AuthType Digest
        AuthName dragon
        AuthUserFile /var/www/sabi.net/etc/digest.passwd
        Require valid-user
        SSLRequireSSL
        # Redirects don't work. At all.
        BrowserMatch "Nuance component" redirect-carefully
        RewriteEngine off
</Directory>
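
Apache's htdigest tool is the usual way to populate the AuthUserFile above. As a rough illustration of what ends up in that file, each line holds the username, the realm (the AuthName value, dragon here) and the MD5 hex digest of user:realm:password. The username and password in this sketch are hypothetical placeholders.

```python
import hashlib


def htdigest_line(user: str, realm: str, password: str) -> str:
    """Build one line in the user:realm:hash layout used by digest password files."""
    # The stored hash is the MD5 hex digest of "user:realm:password".
    ha1 = hashlib.md5(f"{user}:{realm}:{password}".encode("utf-8")).hexdigest()
    return f"{user}:{realm}:{ha1}"


if __name__ == "__main__":
    # "example-user" and "example-password" are placeholders for illustration;
    # the realm has to match AuthName in the Apache configuration.
    print(htdigest_line("example-user", "dragon", "example-password"))
```

If the realm in the file doesn't match AuthName, authentication fails even with the right password, so that is worth checking first.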

Make sure the directory is writable by the Web server user; mine looks like this:

drwxrwsr-x 4 nriley www-nriley 4.0K Apr 23 11:49 /var/www/sabi.net/public/dragon/
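
To double-check those permissions from a script, something like the following could work. It's a minimal sketch that only inspects the mode bits: it reports whether a directory is group-writable and has the setgid bit (the s in drwxrwsr-x). It assumes the Web server user belongs to the directory's group, which it does not verify.

```python
import os
import stat
import sys


def describe_mode(mode: int) -> dict:
    """Summarize the permission bits that matter for a shared WebDAV directory."""
    return {
        "listing": stat.filemode(mode),               # ls-style string
        "is_dir": stat.S_ISDIR(mode),
        "group_writable": bool(mode & stat.S_IWGRP),  # group members can write
        "setgid": bool(mode & stat.S_ISGID),          # new entries inherit the group
    }


def check_directory(path: str) -> dict:
    """Read the mode of an existing path and summarize it."""
    return describe_mode(os.stat(path).st_mode)


if __name__ == "__main__":
    # Pass the WebDAV directory as the first argument; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(check_directory(target))
```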

Setting up the WebDAV client

Documentation is here. Follow the instructions under Enable the Roaming User Profile feature and Set location of Master Roaming User Profiles.

In HTTP Settings, specify your username, password and an Authentication Type as appropriate. Under Connection, click Never for Follow Redirects and check the Keep Connection Alive box. I didn't change the Timeouts from the defaults.

My SSL Settings are as follows:

[Screenshot: SSL settings.png]

I haven't actually tested if my server certificate is verified, but I do know enough not to check Using OpenSSL in an application that hasn't been updated in years.

Click Test Connection. If it fails, check your Apache logs; client-side feedback ranges from unhelpful to misleading. You'll notice that every single request is initially tried unauthenticated — I couldn't figure out a way to stop this from happening. Once I was confident that authentication was working, I filtered out these duplicate requests. Here’s the whole test:

% tail -fn 0 /var/www/sabi.net/logs/ssl.*~*.gz | grep nriley
nriley [23/Apr/2016:19:18:15 +0000] "PROPFIND /dragon HTTP/1.1" 207 1210 "-" "Nuance component"
nriley [23/Apr/2016:19:18:15 +0000] "DELETE /dragon/tst.tmp HTTP/1.1" 404 522 "-" "Nuance component"
nriley [23/Apr/2016:19:18:15 +0000] "PUT /dragon/tst.tmp HTTP/1.1" 201 442 "-" "Nuance component"
nriley [23/Apr/2016:19:18:15 +0000] "DELETE /dragon/TempDir HTTP/1.1" 404 522 "-" "Nuance component"
nriley [23/Apr/2016:19:18:16 +0000] "MKCOL /dragon/TempDir HTTP/1.1" 201 442 "-" "Nuance component"
nriley [23/Apr/2016:19:18:16 +0000] "DELETE /dragon/TempDir/tst1.tmp HTTP/1.1" 404 522 "-" "Nuance component"
nriley [23/Apr/2016:19:18:16 +0000] "PROPFIND /dragon HTTP/1.1" 207 6554 "-" "Nuance component"
nriley [23/Apr/2016:19:18:16 +0000] "PUT /dragon/TempDir/tst1.tmp HTTP/1.1" 201 458 "-" "Nuance component"
nriley [23/Apr/2016:19:18:16 +0000] "DELETE /dragon/TempDir/tst2.tmp HTTP/1.1" 404 522 "-" "Nuance component"
nriley [23/Apr/2016:19:18:16 +0000] "PROPFIND /dragon HTTP/1.1" 207 6554 "-" "Nuance component"
nriley [23/Apr/2016:19:18:16 +0000] "PUT /dragon/TempDir/tst2.tmp HTTP/1.1" 201 458 "-" "Nuance component"
nriley [23/Apr/2016:19:18:17 +0000] "PROPFIND /dragon/TempDir HTTP/1.1" 207 2858 "-" "Nuance component"
nriley [23/Apr/2016:19:18:17 +0000] "GET /dragon/TempDir/tst1.tmp HTTP/1.1" 200 341 "-" "Nuance component"
nriley [23/Apr/2016:19:18:17 +0000] "PROPFIND /dragon/TempDir/ HTTP/1.1" 207 1162 "-" "Nuance component"
nriley [23/Apr/2016:19:18:17 +0000] "MOVE /dragon/TempDir/tst1.tmp HTTP/1.1" 201 458 "-" "Nuance component"
nriley [23/Apr/2016:19:18:17 +0000] "MOVE /dragon/TempDir/ HTTP/1.1" 201 442 "-" "Nuance component"
nriley [23/Apr/2016:19:18:17 +0000] "COPY /dragon/newTempDir HTTP/1.1" 201 442 "-" "Nuance component"
nriley [23/Apr/2016:19:18:18 +0000] "DELETE /dragon/tst.tmp HTTP/1.1" 204 261 "-" "Nuance component"
nriley [23/Apr/2016:19:18:18 +0000] "DELETE /dragon/newTempDir HTTP/1.1" 204 293 "-" "Nuance component"
nriley [23/Apr/2016:19:18:18 +0000] "DELETE /dragon/newTempDir2 HTTP/1.1" 204 293 "-" "Nuance component"
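
If you'd rather summarize a run like this than read it line by line, here is a small sketch. It assumes log lines shaped like the excerpt above (user, timestamp, quoted request, status, size) and simply skips anything logged under a different user, which is how the unauthenticated first attempts drop out. The 401 line in the sample is made up for illustration; the other two are copied from the log above.

```python
import re
from collections import Counter

# Matches: user [timestamp] "METHOD path protocol" status size
LOG_RE = re.compile(
    r'(?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def authenticated_requests(lines, user):
    """Yield (method, path, status) for requests logged under the given user."""
    for line in lines:
        m = LOG_RE.search(line)
        if m and m.group("user") == user:
            yield m.group("method"), m.group("path"), int(m.group("status"))


def summarize(lines, user):
    """Count requests by (method, status)."""
    return Counter(
        (method, status)
        for method, _path, status in authenticated_requests(lines, user)
    )


if __name__ == "__main__":
    sample = [
        # Hypothetical unauthenticated attempt, not taken from the real log.
        '- [23/Apr/2016:19:18:15 +0000] "PROPFIND /dragon HTTP/1.1" 401 381 "-" "Nuance component"',
        'nriley [23/Apr/2016:19:18:15 +0000] "PROPFIND /dragon HTTP/1.1" 207 1210 "-" "Nuance component"',
        'nriley [23/Apr/2016:19:18:15 +0000] "DELETE /dragon/tst.tmp HTTP/1.1" 404 522 "-" "Nuance component"',
    ]
    for (method, status), count in sorted(summarize(sample, "nriley").items()):
        print(method, status, count)
```

The 404s on DELETE in the real log look like the client clearing out test files that don't exist yet, so they are probably nothing to worry about.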

Roaming options

Nuance documentation is here and does a reasonably good job of explaining the options; I recommend you read it prior to my comments below. Here's how I have the roaming Administrative Settings configured:

[Screenshot: Administrative Settings.png]

If you’re going to be the only user, check Display Classic Open User Profiles dialog. This displays a flat rather than a hierarchical list of users and dictation sources. Every time you click on anything in this dialog, be prepared for a long synchronous wait for server access. By disabling the hierarchy, you eliminate the wait while expanding your user. (If you only have one user and dictation source, you may not see this dialog at all.)

Allow non-Roaming User Profiles to be opened will need to be checked while you are migrating your user profile to a roaming profile, but can be unchecked afterward.

Merge contents of vocdelta.dat into network User Profile when file is full involves a 500K file; in a WAN environment with reasonably fast links, latency is likely to outweigh any time savings, so I kept this checked.

I unchecked Access network at User Profile open/close only because I keep my profiles open for days at a time and have an Internet connection available at all times. If your usage pattern is different, you may select otherwise.

Despite documentation suggesting that Ask before breaking locks on network User Profiles does not apply to profiles accessed through HTTP, I was asked to break a lock nearly every time I opened my profile until I unchecked it. There might be some server configuration that will let this be checked, but I’m unaware of it.

Always copy acoustic information to network and Conserve archive size on network are somewhat related. How you decide to limit/copy acoustic information really depends on your network performance, patience and desired strategy for propagating corrections and optimizing your profile.

Converting your profile

Again, there's official documentation which I won't repeat. There's no progress bar, just an unresponsive interface during migration; watch the server logs or your favorite network monitoring utility if you get nervous.

If you’ve been using Dragon NaturallySpeaking for some time, you may think of your profile as a large, unwieldy multi-gigabyte entity. Much of this is backups and audio data that aren’t strictly necessary — and you’ll notice that the server profile is much smaller because it omits them. My local profiles (compressed!) on two machines prior to migration were 1.4 and 1.1 GB; corresponding sizes on the server are 437 and 430 MB. ~320 MB of each is (primarily) audio in the voice_container subdirectory.
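
To put those numbers in proportion, here is the arithmetic as a quick sketch. It uses only the figures quoted above and assumes 1 GB = 1024 MB, which may not match how the sizes were originally measured.

```python
MB_PER_GB = 1024  # assumption: binary units

# (local size in GB, server size in MB), from the figures quoted above
PROFILES = [
    (1.4, 437),
    (1.1, 430),
]
AUDIO_MB = 320  # approximate size of voice_container on the server


def server_fraction(local_gb: float, server_mb: float) -> float:
    """Server profile size as a fraction of the local profile size."""
    return server_mb / (local_gb * MB_PER_GB)


if __name__ == "__main__":
    for local_gb, server_mb in PROFILES:
        pct = 100 * server_fraction(local_gb, server_mb)
        print(
            f"{local_gb} GB local -> {server_mb} MB on server "
            f"({pct:.0f}% of local, ~{server_mb - AUDIO_MB} MB excluding audio)"
        )
```

In other words, the server copies are roughly a third of the local size, and most of what remains is audio.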

Once you're comfortable your roaming profile works, don't forget to delete your local profile(s).

Pitfalls

Much of the information here is out of date, but an important and still-relevant sentence is "When using a roaming user profile, backup files cannot be generated in any location". The downside of backups not being written to the roaming profile is that if your profile becomes corrupted, you’ll have to rely on your server backups. This just happened to me today: I set up Dragon Medical Practice Edition on a new Windows 10 installation, and DMPE subsequently crashed every time I opened the profile from my Windows 7 VMs. If you don’t have server backups, go fix that.

The Language and Acoustic Optimizers don't run on a roaming profile; the idea is that you run them server-side. I plan on seeing how well they work on a fast network by remotely mounting the WebDAV share, but haven't had a chance to do this yet.

Dragon NaturallySpeaking startup and shutdown obviously take longer when the network is involved. You can automate opening a profile with a command-line argument to natspeak.exe, but you can't specify a dictation source (if you have more than one) without relying on AutoHotkey or similar. Thanks to various VMware Fusion and/or OS X bugs I already have to babysit dictation startup, so one more click to select a profile hasn't been a great additional hardship.

For more

My other dictation-related blog posts are in the Dictation category, if you're interested. Right now all my dictation effort is targeted at prose, but at some point I plan to investigate VoiceCode — which is currently in the process of being rewritten.
