Devin Prater's blog

Today, I began trying to make the most of Emacs. Mainly, this just means activating the packages that I’ve already installed. I’ve noticed that, even when using the Emacs package manager with MELPA added, packages don’t always get “enabled” and configured to work. Some of them require that you put (require 'package-name) in your .emacs.el file. So, I went through the list of packages and, one by one, read their GitHub pages to see how to configure them. It’s slightly annoying, yes, but I’ve gotten a lot out of it.

First, I found out a lot more about the extra Org-mode blocks and links added with a package. I don’t remember the name now. And then, I found a few packages that I didn’t need upon further inspection, so I got rid of those. And then, I started hitting some big gold mines.

LSP Mode

LSP (the Language Server Protocol) is basically an IDE-like bundle of code checkers, refactoring mechanisms, and documentation tools that brings IDE features to your text editor. Or something. All I really care about is that it brings VS Code-like functionality to Emacs. And there’s a Grammarly extension! The only problem is that after I load LSP mode, Emacspeak reads a little extra info each time I switch buffers, like there’s still another frame there or something.

So, I plan on using that mainly with Markdown files, although Grammarly doesn’t seem to like filled paragraphs, and I hate unfilled paragraphs, although I can deal with it when working with Gemini. Ah well, maybe I’ll just turn on visual-line-mode everywhere. I don’t know. At least it’s not like VS Code, where the screen reader cannot read visual lines and only reads logical lines. Emacspeak handles visual lines by playing a sound when the next logical line is reached, but speaks each visual line as I arrow or use C-n or C-p.

Helm

Helm is a completion package. It’s really great how Emacspeak works with it: I can just type to narrow things down, instead of typing, then tabbing, then typing again. And, unlike the minibuffer, I can perform many actions on the item at point, not just press Enter to activate it. It’s really great, and I’ll definitely incorporate it into my workflow.

I tried to write this on Mastodon, but 1,000 characters just isn’t enough. Since I am blind and do not have any other visible disability, I don’t know what it’s like to not have the use of my legs. Therefore, if I’ve misrepresented anything in the following section, let me know.


You’ve always hated your legs. They flop uselessly at the end of your body; there, but just to show off that you’re different. That you can’t actually use them. Like a blind person’s eyes, just rolling around in the head, without use. You particularly hate your legs today, as you sit in front of a set of stairs with a helpful “accessibility controller.” At the top of the stairs. You could pull the lever on the controller box, and a ramp is lowered to the ground. If only your legs worked.

You remember when these things were invented. It started as a bill, made after a video spread of a person in a wheelchair suffering severe brain trauma after falling down stairs while attempting to get medical help. The media ran the video nonstop until the people boiled with anger, and so the government did as little as possible, as usual. So now these things exist. After another video was made of a person falling down stairs trying to activate one, stairs leading to public buildings were altered so that, if a wheelchair is pushed up them backwards at a certain angle, the rider can reach the top, and the lever. Hopefully.

So, you take a deep breath, turn the wheelchair around, and prepare to try to reach your appointment.


The moral of the story: accessibility switches are bad. The UI of software, or anything really, should be accessible from the beginning, and if a user has to go in and manually put accessibility enablement statements in .xinitrc and .profile, your crap is broken.

When I have to go into the Mate desktop’s menu, then “System,” then “Personal,” then “Assistive Technology,” and “enable” the use of assistive technologies, that tells me that if I didn’t, Linux would be far, far less accessible. And what if a user doesn’t know about this “trick” to enable the use of their system? Well, they’d think Linux was far less accessible than it is. And even with full accessibility settings on, I can barely use Zoom, which is a pretty important program these days. Google Docs is another thing I struggle with, in Firefox and in Chromium. And yes, Google Docs is another piece of junk that requires an accessibility switch. Even with all this in my .xinitrc and .profile:

export
export
export
export GTK_MODULES=gail:atk-bridge
export GNOME_ACCESSIBILITY=1
export QT_ACCESSIBILITY=1
export QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1

exec mate-session

stuff is still hard to use, like Zoom and Google Docs. And just how much of this is even still needed? Do we still need “export QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1” when we have “export QT_ACCESSIBILITY=1”? Am I missing yet another flag that has to be enabled?
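Since I keep wondering which of these flags actually made it into my session, here’s a quick way to check from a terminal. The variable names are the ones from my dotfiles above; `check_a11y_env` is just a throwaway helper name:

```shell
# Print each accessibility-related variable and whether it is set
# in the current session. POSIX sh compatible; printenv exits
# nonzero when a variable is unset, so we fall back to "<unset>".
check_a11y_env() {
  for var in GTK_MODULES GNOME_ACCESSIBILITY QT_ACCESSIBILITY QT_LINUX_ACCESSIBILITY_ALWAYS_ON; do
    printf '%s=%s\n' "$var" "$(printenv "$var" || echo '<unset>')"
  done
}
check_a11y_env
```

Running that in a GDM session versus a startx session makes it obvious which login path actually sourced the dotfiles.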

Meanwhile, Mac and Windows are accessible by default. No need to turn on any flags or check a box that tells the system, and probably all apps, that hey, this guy is blind. Funny how privacy goes out the window when you’re freaking disabled, huh? Funny how closed-source, proprietary systems are more accessible, and more privacy-friendly in that regard, than a system made by the people, for the 95%. But that’s what I get for being a nerd.

So, I’ve been looking for more Gemini clients. Not that Elpher is bad, but because I’m not always on my computer, as much as I’d love to just be able to sit at the computer, or more specifically, on my bed with my USB keyboard in front of me, plugged into my laptop, twenty-four seven. Unfortunately, there are times when I need to just suck it up and use my phone. For example, when I’m outside sitting on the porch during a warm day or evening, or when I’m on the way to or from work, or when I’m in my rocking chair.

So, I looked through the list of clients on Gemini’s circumlunar site, and found Elaho, a client for iOS. I liked it. It was simple, and displayed things fine. After a slightly long discussion on the Gemini mailing list, however, it got even better!

Today, I got an update that basically puts preformatted blocks into an image item type, with the alt text as the image name. Something like that. And VoiceOver works amazingly well with that! So now, I don’t even have to deal with most ASCII art! I can just relax and read gemlogs with my braille display, and everything is simple, luscious, plain text! Well, plain as in readable, with headings and links and such.

So, I just got this all figured out, so I thought I’d write it down here before I forget. This is about me tweaking my Arch Linux setup to be a bit more productive a little faster.

The non-problem

So, I use Arch (BTW), so I naturally have full control over what my computer does. Well, besides the firmware, the loose ports, and all the software I have no idea how to work or use (yet). But I’m getting there. I still need to figure out how to make this Ladspa sink in Pulse follow whatever card is plugged in, not just the Intel built-in speakers/headphone-jack thing.

Anyways, I had ESpeakup (the Speakup screen reader speaking through ESpeak) starting at boot, giving me a good indication that the system was ready. I could then type in my username and password, and then race to type `startx` before the network manager connected to Wi-Fi, because it was kinda fun getting that notification that I’m connected to Wi-Fi.

Then, I needed to type my password again, because some login keyring isn’t authenticated by a mere shell login. Ah well. But that wasn’t very productive. For one thing, I almost never used the console for anything, so why log in with it? I just used it as a jumping-off point for startx and mate-session.

So, I tried a few display managers. My first choice was LightDM, as I wasn’t sure GDM would let me start Mate, or whether it was tied to Gnome. Well, LightDM didn’t seem to have Orca support. Or, if it did, it was more work than I was willing to expend to get it working. So, I went back to no DM and just using startx.

So, then, I tried GDM, the Gnome display manager. This worked well, and I was able to start Orca within it. The settings were just the default Orca settings, with slow speech and such, but I could deal with that. I just needed to hit Enter, type in my password, and hit Enter again. But then, I started Emacs. The DTK_PROGRAM environment variable was no longer set to “outloud,” so Emacspeak used ESpeak, which it doesn’t support well. I tried other programs: some QT apps weren’t accessible, and neither was Chromium. So, my environment variables weren’t being loaded, and I went back to no DM and just using startx.

So, today, I can’t remember why I wanted to try this again. Ah yes, it was .bash_profile versus .bashrc. (Also, I need to find a new Aspell dictionary with more computer/Linux terms and such.) Anyway, I wanted to see if .bashrc worked to get environment variables loaded when using GDM. So, I enabled GDM, but found that Emacs (with Emacspeak) still loaded ESpeak. That was kind of disappointing.

So, after a few restarts, I determined that it wasn’t me, that the .bash_profile was made right, and that it simply wasn’t being taken into account when loading GDM. So, I looked it up, and found that most modern Linux distros load environment variables from .profile, not .bashrc or .bash_profile. Well, that makes sense.
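For what it’s worth, a common way to keep console logins and the display manager in agreement is to have .bash_profile hand off to .profile, since bash login shells read .bash_profile first and, once it exists, never fall back to .profile. A minimal sketch, assuming everything lives in .profile:

```shell
# ~/.bash_profile -- minimal sketch.
# Bash login shells stop at ~/.bash_profile and skip ~/.profile,
# so source .profile explicitly to share one set of variables
# between console logins and GDM sessions.
if [ -f "$HOME/.profile" ]; then
    . "$HOME/.profile"
fi
```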

So, I found that, yes, I do have a .profile, and that it was practically empty. I filled in everything from my .xinitrc, .bashrc, and .bash_profile that I’ve added over the months I’ve used Linux, and restarted. And it works! Emacs loads with Outloud, Chromium is accessible, and all is better, needing one login, not basically two with the keyring authentication. So, here is my .profile:

export
export
export
export DTK_PROGRAM=outloud
export LADSPA_PATH=/usr/lib/ladspa
export ACCESSIBILITY_ENABLED=1 
export PATH="$HOME/.gem/ruby/2.7.0/bin:$PATH"
export SAL_USE_VCLPLUGIN=gtk3 GTK_MODULES="gail:atk-bridge"
export GTK_MODULES=gail:atk-bridge
export GNOME_ACCESSIBILITY=1
export QT_ACCESSIBILITY=1
export QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1
export EDITOR="emacsclient"

alias git=git-smb

Yeah, it could use a little cleaning, but the extra stuff about GTK3 was for LibreOffice, and I ain’t messing with that.

This log will detail my search throughout Linux for accessible games, besides the audio game manager, and my reaching out to developers, and their responses. Hopefully, this will motivate me to keep going in the face of, undoubtedly, much failure.

Why?

Because I’m weird. I can’t just start with any old app category, oh no. ToDo managers? Pomodoro timers? Text editors? No, I choose to bang my head against games. And because I want new blind Linux users to have some games to play outside the Windows audio games. Because it’s like… a sighted person coming to Linux and finding out that all there is to play is Windows games. And yeah, there are a good many games made for Linux. So why not? Hopefully I can get at least one game made accessible, or find one that already is. If I can do at least that, then that’s one more success story of the open source community actually giving a crap.

Testing the games

I test each game using the Orca screen reader, version 3.38.2. I run the Mate desktop (version 1.24.1) on Arch Linux. My PC has an Intel Core i7-6500U CPU at 2.50GHz and 8 GB of RAM and a Skylake GT2 [HD Graphics 520] graphics card. At least, I think that’s the graphics card. 😊

Game list

I am getting the list of games from the Arch Linux Wiki. It’s separated into game genre headings, so that’s great. At a fellow Mastodon user’s suggestion, I’m going to go with casual games first. Arch Wiki List of Games

So, from here, I’ll have the game category, then the games, their accessibility, and contact with the developer.

Casual Games

Aisleriot (version 3.22.12)

Upon starting the game, I hear “Klondike, drawing area.” The “Drawing area” is what the Screen Savers use as a “frame” to show the picture. But in this case, I assume a game has started, so this should be filled with cards. Whenever I press Tab, I hear “new game, button, start a new game”, and when pressing it, the drawing area stays the same, so that’s why I assume a game has already started.

When pressing Tab after the “new game” button, I’m placed back onto the drawing area. If I use the Right Arrow while on the “new game” button, I find the other buttons on what I assume is a toolbar: “Select game,” “Deal” and “hint”. If I press Enter on “select game,” I am able to choose another game type to play. Even so, the Drawing Area is still there. If I press the “Hint” button, I am given an accessible dialog with a hint on what to do next, like “Move the ten of hearts onto the ten of clubs.” I can dismiss the hint with the Enter key. If I press the “Deal” button, back in Klondike mode, nothing is reported, but two new buttons, “undo move” and “restart” appear.

When I press F10, to open the menu bar, that part is accessible. Pressing “New game,” “restart,” entering the “recent games” menu, and closing work, in that I can perform those functions. The statistics screen was much more accessible than I expected, with textual labels for each field, along with the number associated with them. There is also a button to reset the statistics, and one to close the window. None of the items in the “view” menu affect accessibility, although the removal of the “tool bar” hides the buttons “above” the drawing area. Nothing within the Controls menu affects accessibility, neither does anything in the Klondike menu. In the help menu, there are keyboard shortcuts, but none regarding accessibility.

In short, everything is accessible except the cards and the ways of moving and controlling them. I don’t know much about Solitaire, but I do know there are supposed to be cards, and, from the hints, that they can be moved. [[https://gitlab.gnome.org/GNOME/aisleriot/-/issues/54][Gitlab Issue]]

During the month of… November? December? Something like that… I found myself being called by Linux again. I just can’t stay away. I go to Windows for a while, and then something happens. VS Code became sluggish and unreliable, and I just… just couldn’t deal with crap anymore. Sure everything else worked well enough. I could play my audio games and Slay the Spire (using Say the Spire), but gosh darn it I missed freedom.

So, I thought about it. People on the Linux-a11y IRC use Arch. Because they’re all pretty much advanced users. Other blind people use Ubuntu Mate, or Debian. I tried Fedora, and found that I couldn’t even run Orca, the Linux graphical screen reader, within the Fedora installer. I tried Debian in a virtual machine, but the resulting system, after installation, didn’t speak. I tried Slint Linux, a Slackware based distribution, but there were sound issues, and they weren’t something I could deal with.

The Need for Speed

So, I thought about different Linux distros: their priorities, their values, and whether they keep packages up-to-date. I like distros that keep packages up-to-date. Not doing so, to me, feels like a slap in the face of developers, the distro maintainer saying: “We don’t trust that you can write good enough software, so we’re going to leave your software at this version for six months or more. And then, when we release a new version of our distro, we’ll go into your code and ‘backport’ things into your old version.”

Another issue is that older software isn’t necessarily better. It definitely isn’t necessarily more accessible, and that is my main concern, and is, I suspect, why most “power user” blind Linux folks go with Arch. Arch already has GTK4 in its repositories. Can Ubuntu, or even worse, Debian, say the same?

Now, I know that there are Flatpak, Snap, and probably a lot of lesser-known packaging formats. But I see them as add-on package managers, supplementing the system package manager. Also, they wouldn’t be necessary if Ubuntu and Debian packaged up-to-date software. Snap and Flatpak are solving a problem that Ubuntu themselves created. Isn’t that nice?

Choosing Arch

So, I looked around. Ubuntu, Debian, all the main distros were fixed releases, all stale, and I like to explore. I use my computer for more than just simple stuff, and I can’t have old, outdated packages. It’s so sad that Youtube-dl, and even Mycroft, have to explain to users how to install from Pip, or from a git repo, just to keep the package up-to-date. But enough about that. A person on the IRC channel suggested Anarchy, an “easy installer ISO” of Arch. So, I took a look. => https://anarchyinstaller.org/ Anarchy Installer (HTTP)

Since late last year, the base Arch Linux distro has come prebuilt with accessibility stuff. Just press Down Arrow once while booting, then Enter, and the Arch Linux ISO will come up talking. So, maybe Anarchy would do the same.

I got the ISO, flashed it to a flash drive, and booted it, doing the steps to boot in accessible mode. And it worked. The command-line installer was pretty easy to use. It left me with a system that was inaccessible (there were no settings in the installer to configure that), but I was able to chroot in from the ISO’s command line and set things up.

Setting up my New System

First, I enabled espeakup.service. This runs the Speakup screen reader with the ESpeak synthesizer, and that was enough to give me speech at the console. Then I installed Yay, the AUR package manager thing (I later switched to Paru). Then, I installed the Mate desktop, as it’s currently the only desktop with accessibility good enough for easy use. Hopefully Gnome gets back into the game with Gnome 40, but I’m not holding my breath.
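In rough command form, those steps looked something like this. The exact package names are from memory, so treat them as approximate:

```shell
# Speech at the console: Speakup driven by the ESpeak synthesizer.
systemctl enable --now espeakup.service

# The Mate desktop and the Orca screen reader.
# (Yay, and later Paru, came from the AUR, so they're built separately.)
pacman -S --needed mate mate-extra orca
```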

Then, I added these lines to my .xinitrc:

export
export
export
export GTK_MODULES=gail:atk-bridge
export GNOME_ACCESSIBILITY=1
export QT_ACCESSIBILITY=1
export QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1

exec mate-session

And so then I could get going. I started the X-session (startx), and ran Orca from the run dialog (Alt + F2). But still, some programs weren’t accessible. So, I went to the System menu, down to Preferences, then Personal, then “Assistive Technology”, and checked that box, and things were pretty smooth after that.

My Experiences so far

I don’t think I’ll be going back to Windows any time soon. There are problems: Alsa sends ESpeakup through my speakers even when headphones are plugged in; I need to learn more about Pulse so that I can add more than one Ladspa effect at a time, and add them to whatever sound card I’m using instead of making a new virtual one; and I do miss the sound packs created for MUDs that only run on Mush-client. But there are things about Linux that I do love:

  • Emacspeak: The more I use it, the more I love it.
  • GPodder: A Podcasting client that not only is accessible, it even allows me to get Youtube channels as podcasts! I mean, that’s amazing!
  • Mutt: I’m really starting to like this simple Email client. Sure, the hotkey list bar at the top is a little annoying and I wish I could just make that go away and just reference keyboard commands when I need them, but overall I love it and wish I could use it with more accounts.
  • Audio Game Manager: I probably wouldn’t be on Linux for this long without this tool. It brings audio games from Windows to Linux with Wine and preconfigured settings.
  • Retroarch: Now that it’s accessible, I love playing Dissidia Final Fantasy on it. Although, trying to “record a movie” on it really slows things down. I wonder if streaming would do the same.
  • BRLTTY: This has saved my butt on multiple occasions when Alsa couldn’t find any audio devices or something and I had to fiddle with Pulseaudio to fix it. I don’t know much about audio on Linux really, I just revert any change I made on behalf of something like Mycroft or whatever. Oh, BRLTTY is basically a screen reader for braille displays, meaning I don’t need audio to use it.
  • Emacs: What can I say? Most of my work is done inside Emacs. Most of my play is done inside Emacs. Nearly all of my writing and reading is done inside Emacs. I’m considering having my window manager inside Emacs. One day, my brain will be inside Emacs. No Microsoft text editor can compare with Emacs and Emacspeak’s ability to give as much information as possible, even syntax highlighting, bold, italics, just everything!
  • The command line: Sure, we have this in Windows, but it’s more of an afterthought, a bolted-on feature at this point. In Linux, it’s a first-class citizen. I’m not a power user by any stretch of the imagination, but I can navigate the file system, run commands with arguments, all the basic stuff. I can do this in audio and braille. I can use nano a bit to edit files, and I know the general layout of config files and am not as scared of them as I used to be, although I need to learn to read the manual before I dive into them.

Also, in my experience, Linux breeds creativity. You could use it as a regular desktop user, but if you dig just a tad, you see the building blocks. And it makes you want to learn about them, to play with them, to maybe break them a bit but then try to fix them. And some things you can’t make work: like the fact that my laptop, having just a USB C port, can’t display video over Thunderbolt. (I have a Thunderbolt dock at work connected through Display port to a monitor.) But some things you can do. You can script things using Python, then put them in your bin folder to run from anywhere. You can make your own programs! You can turn your Linux machine into a Bluetooth speaker to listen to books from your iPhone on your laptop! There is just so much possible with Linux, and even more possible with coding knowledge.

Flaws in the Utopia

This isn’t to say that Linux is perfect. It is made by people, and mostly hobbyists at that. This isn’t to say their code is sloppy, or that they don’t care. It does mean that they aren’t held to any kind of company standard, especially regarding accessibility. Linux is more of a community effort, so users will need (me included) to interact with the community to get things fixed, or even just to remind them that blind users actually do exist. We do have our own IRC server, a little corner of the Internet, but we won’t get anywhere by just staying in that corner.

  • The graphical interface can be tricky to use, like remembering that you have to press Control + Tab to reach some parts of a program, and there are still unlabeled buttons in official Gnome apps, like Fractal. Fractal’s interface is getting a complete rewrite, though, so hopefully accessibility is considered in the process.
  • There are fewer games, and far fewer accessible games, on Linux. I’ll begin to reach out to game developers to see if anything can be done about this. In the meantime, there is the Audio Game Manager for playing accessible Windows games.
  • You’ll have to Google things, a lot: There aren’t many blind people who use Linux. That number grows by one or two per year, and the Audio Games forum has a few members who use Linux, but there aren’t many outside that.
  • Sound isn’t as convenient as on Windows, where you have enhancements, bit rate and format control, all in one place. And the PulseEffects package makes things very laggy, whereas loading a Ladspa module directly produces no lag.
  • Sound can be slightly rough when first booting up the computer.

Looking Forward

I’ll probably stick with Linux, as long as this laptop survives. I’ve had it for about five years, and it’s still pretty well up to the task. It performs well and has a good enough keyboard with a number pad, but a few ports, especially the headphone jack, are becoming loose. I’ll have to see about getting a USB sound card or something, unless ports can be tightened. And a new battery would be good too. I ordered one, so we’ll see if it can be replaced.

I’ll still reach out to developers, to see if the accessibility of apps can be improved. Hopefully, indie game developers will be receptive as well. Eventually, I’d love to have more blind people come to Linux, and not just jump into the blindness servers to moan and groan, but continue to push for greater accessibility: on Matrix, on IRC, on the forums of desktop environments and graphical toolkits like GTK. Linux makes me feel passionate about technology, about open source, about what’s possible, whereas Windows just felt contrived, the accessibility team preaching and preaching on their Twitter account, saying all the right things. Saying all the right words. But when it comes time to deliver, well, they fall short. Windows 10’s Mail app is still a pain to use with screen readers: when I put keyboard focus on a thread, it automatically expands, and I have to close it just to move down to another message or thread, and when I press Control + R to reply, nothing is spoken to let me know the command succeeded. Not even Thunderbird, even though it locks up every few minutes, has those kinds of problems. And the only other good email client is Mutt.

So, Linux feels more “real” to me. It doesn’t try to hide its accessibility issues behind warm words and “we hear you!” tweets. It could do better. Earlier today I suggested to people involved with the Pine Phone that accessibility could be a greater focus, and essentially got back “maybe you can focus on Linux desktop accessibility first.” I guess I’ll have to. I’m not a developer, but if that’s what people want, sure. Why not. Maybe I’ll even learn to enjoy it. But I’m more of a writer, for now, not a programmer. I’ve made one script that’s used in “production,” and I find programming easier to learn and enjoy now, but it’ll be some time, a lot of time, before I’m able to deal with low-level stuff in Linux.

But, until then, I’ll keep exploring, learning, and trying my best to get the word out, to keep people cognizant that accessibility is an issue, and that they don’t have to be an expert to help.

It’s been a while since I’ve written a blog post. But, my entry into Gemini space prompts me to finally write about what’s been going on with me. The simplicity of writing in Gemini, and the “cool new thing” feel is quite inviting. And, because the people at tilde.pink have given me a space to serve this, I have direct access to the files, processes that go into how things look, everything.

Static Site Generators and my disillusionment with them

I like Hugo, I really do. But a theme problem got in the way, leaving me unable to actually build the site. So, I looked for another generator and found Nikola. It worked well, but I couldn’t customize it much. It had a great plugin that took the text of each article and turned the whole blog into a podcast, using ESpeak to speak the articles, but I had no idea how to customize the theme, put in my usual “reading time” functionality, or any of that.

So, I just left the blog as it is, a basic Nikola site on Github Pages. I didn’t want to mess with it anymore. I didn’t want to have to deal with config files, running scripts, all that. Besides that, I’ve been very busy with work-related stuff.

Python for lunch!

For a while now, I’ve wanted to write a script that grabs the lunch menu from my job’s Moodle page, gets the menu for today, and shows it, or speaks it, to the user. A few weeks ago, I completed it. What I’ve learned:

  • Python is easier for me when I have a project to work on. I’ll start using the Automate the Boring Stuff book more for this.
  • I learned about the “try” and “except” functionality easily, lending credence to my idea that I learn best with projects.
  • Emacs’ Python mode is pretty great, and voice-lock-mode of Emacspeak has gotten me out of a few situations I wouldn’t have found easily otherwise.

Entry into Gemini space

So, Gemini is this cool new thing that is like the web, but with simple “Gemini files” instead of HTML, JavaScript, and CSS. There are only headings, lists, links, paragraphs, and preformatted blocks in Gemini, and no CSS and JavaScript. It’s basically just the information of the web; no web apps, no need to control looks and reactions, just sweet, simple, plain text.
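To make that concrete, here’s a tiny made-up page in the Gemini text format (“gemtext”); each line’s type is determined by its first characters:

```
# A heading
Just a plain line of text is a paragraph.
=> gemini://example.com/log.gmi A link: URL first, then a readable label
* A list item
> A quoted line
```

Preformatted blocks are fenced with three backticks, and the text after the opening fence is the alt text, which is exactly what clients can hand to a screen reader in place of ASCII art.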

At first, I was afraid that there would be lots of ASCII graphics. These are never understandable to screen readers. There are some, but not as many as I’d feared. Then I found a Gemini browser for Emacs, called Elpher, which is pretty good. It isn’t optimized for Emacspeak use, and it doesn’t show the alt text of preformatted blocks, but it’s good enough for my use.

So, I jumped at the chance to host my blog in Gemini space. No need for a static site generator, since everything is plain, human-readable text, no JavaScript or CSS required. Everything is in directories, and the index file is plain, with links to whatever you want to show. As for drafts, I’ll just work on them, and when they’re ready, link to them from the gemlog index. I think, finally, that I’ve found my home.

Switching back to Emacs

A while back, I wrote an article about “Switching Tools”, where I talked about switching from Mac and Emacspeak to Windows and VS Code. Well, turns out that VS Code being a memory-hogging Electron app, and it really just being another edit field, made that kinda fall through. Now, I’m on Linux (I’ll write about that, I promise), and using Emacspeak again. Reasons include:

  • VS Code on my laptop was quite unresponsive. Emacspeak on my (now Linux) laptop is snappy.
  • It looks like VS Code won’t be using sounds for events, like reaching a line with an error, any time soon.

So, since I had nothing more to lose, and because Linux was calling my name, I switched, and I’m pretty happy with it now, actually. I don’t know if it’s my continuing maturation, or Linux accessibility improvements, but I’m finding that I’m mostly able to do anything from Linux, and even more, since there is an actually good podcast client for Linux, GPodder.

Introduction

At the launch of the iPhone 3GS, Apple unveiled VoiceOver on the iPhone. Blind users and accessibility experts had been used to screen readers on computers, and even rudimentary screen readers for smartphones that used a keyboard, trackball, or quadrants of a touch screen for navigation. But here was a screen reader that not only came prepackaged on a modern, off-the-shelf device that was relatively inexpensive compared to the competition, but also let the user use the touch screen as what it is: a touch device.

This year, VoiceOver added a feature called “VoiceOver recognition.” This feature allows VoiceOver to utilize the machine learning coprocessor in newer iPhone models to describe images with near-human quality, make apps more accessible using ML models, and read the text in images.

This article will explore these new features, go into their benefits, compare VoiceOver Recognition to other options, and discuss the history of these features, and what’s next.

VoiceOver Recognition, the features

VoiceOver Recognition, as discussed before, contains three separate features: Image Recognition, Screen Recognition, and Text recognition. All three work together to bring the best experience. In accessible apps and sites, though, Image and Text recognition do the job fine. All three features must be downloaded and turned on in VoiceOver settings. Image recognition acts upon images automatically, employing Text recognition when text is found in an image.

Screen recognition makes inaccessible apps as good as currently possible with the ML (Machine Learning) model. It is still great, though. It allows me to play Final Fantasy Record Keeper quite easily. It is not perfect, but it is only the beginning!

Benefits of VoiceOver Recognition

Imagine, if you are sighted, that you have never seen a picture before, or if you have, that you’ve never seen a picture you’ve taken yourself. Imagine that all the pictures you have viewed on social media have been blurry and vague. Sure, you can see some movies, but they are few and far between. And apps? You can only access a few, relative to the total number of apps. And games are laughably simple and forgettable.

That is how digital life is for blind people. Now, however, we have a tool that helps with that immensely. VoiceOver Recognition gives amazing descriptions for photos. Not perfect, and sometimes when playing a game, I just get “A photo of a video game” as a description, but again, this is the first version. And photos in news articles, on websites, and in apps are amazingly accurate. If I didn’t know better, I would think someone at Apple was busy describing all the images I come across. While Screen Recognition can fail spectacularly sometimes, especially with apps that do not look native to iOS, it has allowed me to get out of sticky situations in some apps and has allowed me to press the occasional button that VoiceOver can’t press due to poor app coding and such. And I can play a few text-heavy games with it, like Game of Thrones: A Tale of Crows.

Even my ability to take pictures is greatly enhanced with Image Recognition. With this feature, I can open the Camera app, put VoiceOver focus on the viewfinder, and it will describe what is in the camera view! When the view changes, I must move focus away and back to the viewfinder, but that’s a small price to pay for a “talking camera” that is actually accurate.

Comparing VO Recognition to Other Options

Blind people may then say, “Okay, what about Narrator on Windows? It does the same thing, right?” No. First, Narrator sends the photo to a server owned by Microsoft. On iOS, the photo is captioned using the ML coprocessor. What Microsoft needs an Internet connection and a remote server to do, Apple does far better with the chip on your device!

You may then say, “Well, how does it give better results?” First, it’s automatic. Land on an image, and it works! Second, it is not shy about what it thinks it sees. If it is confident in its description, it will simply describe the image. Narrator and Seeing AI always say “Image may contain:” before giving a guess. And with more complex images, Narrator fails, and so does Seeing AI. I have read that this is set to improve, but I’ve not seen the improvements yet. Only when VoiceOver Recognition isn’t confident in what it sees does it say “Photo contains,” followed by a list of objects it is surer of. This happens far less frequently than with Narrator and Seeing AI, though.

You may also say, “Okay, so how is this better than NVDA’s OCR? You can use it to click on items in an app.” Yes, and that is great, it really is, and I thank the NVDA developers every time I use VMware with Linux, because there always seems to be something going on with it. But with VoiceOver Recognition, you get an actual, natively “accessible” app. You don’t have to click on anything, and you know what VoiceOver thinks the item type of something is: a button, a text field, etc., and can interact with the item accordingly. With NVDA, you have a sort of mouse. With VoiceOver Recognition, you have an entire app experience.

The history of these features

Using AI to bolster the accessibility of user interfaces is not a new idea. It has been floating around the blind community for a while now. I remember discussing it on an APH (American Printing House for the Blind) mailing list around a decade ago. Back then, however, it was just a toy idea. No one thought it could be done with the Android 2.3-era hardware or software of the time. It continued to be brought up by blind people who dreamed bigger than I did, but it never really went anywhere.

Starting with the iPhone XR, Apple began shipping a machine learning coprocessor in their iPhones. Then, in iOS 13, VoiceOver gained the ability to describe images. This did not use the ML chip, however, since older phones without it could still take advantage of the feature. I thought Apple might improve this, but I had no idea they would do as great a job as they are doing with iOS 14.

What’s Next?

As I’ve said a few times now, this is only version one. I suspect Apple will continue building on their huge success this year, fleshing out Screen Recognition, perhaps having VoiceOver automatically speak what’s in the camera view when preparing to take a picture, and perhaps adding even more that I cannot imagine now. I suspect, however, that this is leading to an even larger reveal for accessibility in the next few years: augmented and virtual reality. Apple Glasses, after all, would be very useful if they could describe what’s around a blind person.

This is basically a test post. I’ve switched from Emacs to VS Code, and I’ll detail why below. The gist is that Emacs is unhelpful and only easy to set up on Mac and Linux, Emacs packages are not standardized, and Emacspeak, the speech extension for Emacs, just can’t keep up with extensions like LanguageTool, and probably won’t, because coding is Emacs’ main use case, not writing.

Why I used Emacs

Emacs has been my work tool for about a year now. I went along with its strange commands, and even got to liking them. I memorized strange terminology in order to get the most out of the editor. Don’t get me wrong, Emacs is a wonderful tool, and Emacspeak allows me to use it with confidence and even enjoyment.

Before the end, I was writing blog posts, both here and on a WordPress blog, using Git and GitHub, and even reading ebooks. I also adore Org-mode, which I still find superior to anything else for note taking, compiling quick reports, and just about anything writing-related. Seriously, being able to export just one part of a file, instead of the whole large file containing every bit of work-related notes, is huge, and I’ll now have to use folders, subfolders, and folders under those to come close to that level of productivity. And no, the Org-mode extension for VS Code doesn’t have a third of the ability of the native Emacs Org-mode.

But, Emacs was founded on the do-it-yourself mentality, and it’ll stay that way. If you don’t know what to look for, Emacs will just sit there, without any guidance for you. I’ll get more into that as I compare it with VS Code.

Making Good out of Bad

One day, my MacBook, which is what I run Emacs on, ran very low on battery. It was still morning, and I also have a Windows computer, so I decided to see if I could get things done on it. I’d tried writing on it before, using Markdown in Word, or even VS Code. But my screen reader, NVDA, wouldn’t read indentation like Emacspeak did, or pause between reading formatting symbols in Markdown like Emacspeak did, or play sounds as quick alerts of actions like Emacspeak did, or even have a settings interface like Emacs did, and it definitely didn’t have a voice like Alex on the Mac. Those were my thoughts when I’d tried it before. I’ll tackle them all, now that I’ve used VS Code for almost a week.

So, I managed to get Markdown support close to how I used it in Emacs, minus the quick jumping between headings with a single keyboard command. I still miss that. The LanguageTool extension works perfectly, although I had to learn that, to access the corrections it gives, I have to press Control + . (period). Every extension I’ve installed so far has worked with NVDA. I cannot say that for Emacs with Emacspeak. Since the web is so standardized, there isn’t too much an extension can do to be inaccessible. Sometimes I wish the suggestions didn’t pop up all the time in some language modes, but I’ll take that any day over inaccessibility.

So, on with debunking the problems I had at first. Hopefully this will help newcomers to VS Code, or those who are skeptical that what is basically a web app can do what they need:

NVDA doesn’t read indentation!

Actually, it can. It can either speak the indentation or beep, starting at, I believe, a low C for the baseline and moving up in tones. Sometimes I have to pay a bit of attention to notice the difference between no space and one space, but that’s what having it speak the indentation is for.

NVDA doesn’t pause between formatting symbols!

This is true, and unavoidable for now. But unlike Emacspeak, NVDA can use a braille display, which makes reading, digesting information, and learning a lot easier for those whose minds, like mine, are more like a train than a race car. In the future, NVDA’s speech refactoring may make pausing, or changing pitch for syntax highlighting, a reality.

VS Code doesn’t play sounds!

This is true too, and I’ve not found a setting or extension to make this happen. Maybe one day…

VS Code doesn’t even have a settings interface!

Before, I thought one had to edit the JSON settings file to change settings. It turns out that if you press Control + , (comma), you get a simple, easy Windows interface. It is a bit rough around the edges, because you have to Tab twice to get from one setting to the next, and you can wander from one section of settings into another without realizing it, but it’s easier than Emacs.
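For anyone who still prefers the JSON route, the same settings can be edited directly in settings.json (via the command palette: “Preferences: Open Settings (JSON)”). The keys below are real VS Code settings; which ones you actually want is up to you, and this is just a small example:

```json
{
  // Wrap long lines visually instead of scrolling horizontally.
  "editor.wordWrap": "on",
  // Force screen reader support on instead of relying on auto-detection.
  "editor.accessibilitySupport": "on",
  // Save files automatically after a short delay.
  "files.autoSave": "afterDelay"
}
```

VS Code’s settings file is JSON with comments allowed, so annotating choices like this is fine.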

But what about the awful Windows voices!

Yes, Windows voices are still dry and boring, or sound fuzzy, but NVDA has many options for speech now. I’ve settled on one that I can live with. No, it doesn’t have Alex’s seeming contextual awareness of paragraphs, but it’s Windows. I can’t expect too much.

Bonus points for VS Code

Git

I’m only now starting to get Git. It’s a program that lets you keep multiple versions of things, so you can roll back your work, or even work on separate parts of a project in separate branches. Emacs just… sits there as usual, assuming you have any idea of what you’re doing. VS Code, though, actively tries to help. If you have Git, it offers an extension for it. If you open a Git repository, it asks if you’d like it to fetch changes every once in a while, to make sure things are up to date when you commit your changes. I was able to submit a pull request in VS Code easily and with minimal fuss. In Emacs, I didn’t even know where to begin. And any program that takes guessing and meaningless work off my shoulders is a program I’ll keep.
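Underneath its Source Control panel, VS Code is driving ordinary Git commands. A minimal sketch of that commit workflow from a terminal, using a throwaway repository (the path, file name, and identity here are made up for illustration):

```shell
# Create a throwaway repository at a hypothetical path.
mkdir -p /tmp/blog-demo && cd /tmp/blog-demo
git init -q

# Stage and commit a file, as VS Code's Source Control panel does.
echo "# Draft post" > post.md
git add post.md
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "Add draft post"

# Show the history; VS Code's Timeline and log views read the same data.
git log --oneline
```

The periodic refresh VS Code offers when a remote is configured is essentially `git fetch` run on a timer, so your local view is current before you commit or pull.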

Suggestions while typing

VS Code is pretty good at this. If I’m writing code, it offers suggestions as I type. Sometimes they’re helpful, sometimes they aren’t. In text modes, this doesn’t happen; it appears to be limited to programming modes. Emacs would just let you type and type and type, and then, browsing Reddit, you’d find out about snippet packages that may or may not work with Emacspeak.

Standardized

As mentioned before, VS Code is basically a web app. Emacs is a program written mostly in Emacs Lisp, with a bit written in C. Extensions in VS Code are written in JavaScript, whereas extensions in Emacs are written in its Lisp dialect. Since Emacs is completely text based, any kind of fancy interface must be made manually, which usually means that Emacspeak will not work with it unless the author, or a community member, massages the data enough to make it work. This is a constant battle, and it won’t get easier for anyone involved.

VS Code is a graphical tool with plenty of keyboard commands and screen reader support. Its completion, correction, and terminal systems have already been created, so all extensions have to do is hook into them. This means that a lot of extensions are accessible without their authors even knowing it.

So, any downsides to VS Code?

VS Code is not perfect by any stretch. When screen reader support is enabled, a few features are actually disabled, because Microsoft doesn’t know how to convey them to the user without using sound. Code folding is disabled, which would make navigating Markdown a lot simpler. Word wrapping is disabled, meaning that a paragraph sits on one very long line; I’ve found Rewrap, a third-party extension that handles this, so that’s fixed. There are no sounds, so the only way I know there are problems is by going to the next problem or opening the issues panel.

Overall, though, VS Code has impressed me, and I continuously find wonderful, time-saving, mind-clearing moments where I breathe a sigh of relief. To create a list in Markdown, I can just select lines of text and choose “toggle list” from the commands panel, whereas in Emacs I had to mark the lines and remember some strange command like “string-insert-rectangle” and type “*” to make all of those list items. These kinds of time-savers make me more productive, slightly offsetting the lack of features akin to those in Org-mode.
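That prefixing trick is easy to see outside either editor, too. As a sketch, a standard sed one-liner performs the same “turn these lines into list items” transformation (the fruit lines are just example input):

```shell
# Prefix each line with "* " to turn plain lines into Markdown list items,
# roughly what "toggle list" in VS Code or the rectangle trick in Emacs does.
printf 'apples\noranges\npears\n' | sed 's/^/* /'
# → * apples
# → * oranges
# → * pears
```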

Conclusion

I didn’t expect this post to be so long, but it will be a good test of whether VS Code’s Hugo support is enough to replace Easy-Hugo on Emacs. While VS Code doesn’t have a book reader (at least, not one I think I’d like), or a media player with TuneIn Radio support made for the blind, and many other packages, it is a great editor, and it does have tools like Hugo extensions that make it slightly more than an editor. I should branch out more and see what tools Windows now has for these functions anyway. I already use Foobar2000 for media; I just have to find a good book reader that doesn’t get rid of formatting info.

So, I hope you all have enjoyed reading this long test of VS Code, and an update on what I’ve been doing lately when not playing video games and other things.

In other news, I’ve been using the iOS 14 and macOS 11 public betas. I’ll report on my findings on those when the systems are released this fall.

So, I’m writing this from a Windows computer, using Notepad, with WinSCP providing SFTP access to the server. This won’t come as a surprise for those who follow me on Mastodon and such, but I want to put this in the blog, so everything is complete.

About half a year ago, I installed Linux. Sometimes I get curious as to whether anything has changed in Linux, or if it’s any better than it once was. And I want to know if I can tackle it, or if it’s even worth it. Half a year ago, I installed Arch using the Anarchy installer, got the accessibility switches turned on, and got to work trying to use it.

Throughout my journey with Linux, I found myself having to forego things that Windows users took for granted: instant access to all audio games for computers; regular video games which, even when accessible, used only Windows screen readers for speech; and all the tools that made life a little easier for blind people, like built-in OCR for all screen readers on the platform, different choices in email clients and web browsers, and even RSS and podcatcher clients made by blind people themselves, not to mention Twitter clients. Now, there is OCR Desktop, but it doesn’t come with Orca, and you must set up a keyboard command for it yourself.

But I had Emacs, GPodder for podcasts, Firefox, Chromium when I wanted to deal with that, and Thunderbird for lagging my system every time it checked for email. It was usable, and a few blind people do use it as their daily driver. But I just couldn’t. I need something that’s easy to set up and use; otherwise my stress levels just keep going up as I have to fight not only with config files and all that, but with accessibility issues as well.

The breaking point

A few days ago, I wanted to get my Android phone talking to my Linux computer, so that I could text, get notifications, and make calls. KDE Connect wasn’t accessible, so I tried Device Connect. I couldn’t get anything out of that, so I tried GSConnect. In order to use that Gnome extension, I needed to start Gnome. I have Gnome 40, since I’m on Arch, so I logged in using that session and got started. Except Gnome had become much less accessible since the last time I’d tried it. The Dash was barely usable, the top panels trapped me until I opened a dialog from them, and I was soon too frustrated to go much further. And then I finally opened the Gnome Extensions app, only to find that it isn’t accessible at all.

There’s only so much I can take until I just give up and go back to Windows, and that was it. It doesn’t matter how powerful a thing is if one cannot use it, and while Linux is good for simple, everyday tasks, when you really start digging in, when you really start trying to make Linux your ecosystem, you start finding barriers all over the place.

Now, I’m using Windows, with Steam and a few accessible video games installed, Google Chrome, and NVDA with plenty of add-ons, and the “Your Phone” app on Windows and Android works great, except for calls. It still works much better than any Linux integration I could manage. Also, with Windows and Android, I can open the Android phone’s screen in Windows and, with NVDA or other screen readers, control the phone from the keyboard using TalkBack keyboard commands. That’s definitely not something Linux developers would have thought of.