That does make a lot more sense. But, NVDA already has read word/sentence/line capabilities, I'm sure you could call those functions when the app needs to read the corresponding pieces of text. You'd not even need to write any new functionality, just tie to the existing function calls when the functions are requested. That should save you considerable time.
On 6/22/2020 8:35 PM, Christian Comaschi wrote:
Ok, here are the details of what I'm trying to achieve. I have almost made my decision, so if you don't have the time to read this lengthy message, just ignore it and consider my problem solved.
The application I'm working on is Plover, a sort of translator that maps steno key combinations to common key presses.
Obviously, all sorts of keyboard hooking are handled by the application and need no enhancement.
Once the program is started, the customer I'm working with, who is a stenotyper, can open a word processor, type with his special keyboard and hear what he's typing with any screen reader.
But here come the problems:
1) the steno software is constantly guessing what the user is typing comparing it to internal dictionaries, and in some cases words are deleted and re-inserted (simulating backspace and keypresses to type the word again). In all these cases, the screen reader's output becomes too verbose and slows down his typing dramatically;
2) when a screen reader sends each word to the speech engine, a pause is inserted, which can be slightly shorter or longer depending on the engine, but it's always too much for someone who is typing with a steno machine, who finds himself constantly waiting for the screen reader to finish reading the words. Increasing the voice speed doesn't help, because the output becomes messier and pauses are still inserted.
A plugin called plover-jaws was developed to solve point 1, with the goal of sending the typed words to the screen reader. So when input came from the steno machine, the screen reader was no longer in charge of reading what was typed on its own; instead it received the words from the Plover plugin, and all those fake deletions were no longer read.
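The idea behind point 1 can be sketched as follows: rather than speaking every simulated backspace and retyped keypress, collect the edits and speak only the words that remain. This is just an illustration, not plover-jaws' actual code; the event tuples and the `net_words` name are assumptions made up for the example.

```python
def net_words(events):
    """Collapse a stream of steno edits into the words that remain.

    `events` is a hypothetical list of tuples:
      ("add", word)   - the translator typed a word
      ("undo", count) - the translator deleted the last `count` words
    """
    pending = []
    for kind, value in events:
        if kind == "add":
            pending.append(value)
        elif kind == "undo":
            # Drop words that were typed and then taken back,
            # so they are never sent to the speech engine.
            if value > 0:
                del pending[-value:]
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return pending


events = [("add", "cat"), ("undo", 1), ("add", "category"), ("add", "theory")]
print(" ".join(net_words(events)))
```

In this sketch the word "cat" is typed and then withdrawn, so only the corrected words would be passed on for speaking.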
Point 2 was still unsolved.
The problem with point 2, as I wrote before, is that pauses between words are placed by speech engines; though there are commands in modern engines to minimize pauses, they are effective when a full sentence is sent to the engine, but not when you ask the engine to say every word in real time. So a new layer had to be put between the speech engine and the audio output in order to control the timing of spoken words.
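A rough sketch of what such a layer might do: take the audio the engine produced for each word, cut off the silence the engine added at both ends, and join the words with a gap chosen by the plugin instead of by the engine. This is only an assumption about one possible approach, not the actual plugin code; it assumes each word arrives as a list of signed integer PCM samples, and the function names and the threshold value are invented for the example.

```python
def trim_silence(samples, threshold=500):
    """Strip leading and trailing samples quieter than `threshold`."""
    start = 0
    end = len(samples)
    while start < end and abs(samples[start]) < threshold:
        start += 1
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]


def join_words(word_buffers, gap_samples, threshold=500):
    """Concatenate trimmed word buffers with a fixed, caller-chosen gap."""
    out = []
    for buf in word_buffers:
        trimmed = trim_silence(buf, threshold)
        if not trimmed:
            continue  # the buffer contained only silence
        if out:
            # Insert exactly `gap_samples` of silence between words.
            out.extend([0] * gap_samples)
        out.extend(trimmed)
    return out
```

For example, `join_words([[0, 900, 0], [0, -700, 0]], gap_samples=2)` yields `[900, 0, 0, -700]`: the pause between the two words is now exactly two samples, whatever the engine had added.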
So it seemed a good idea to develop a new plugin, called Plover-speech (still unpublished), which works like Plover-jaws, but sends the spoken words directly to the soundcard, without any screen reader intervention.
Another (better?) approach could be customizing NVDA to control the timing between words when requested, but I would have to make big changes to the synth drivers, main core and configuration, and I'm not an NVDA developer yet; instead of spending 20 hours to write this new plugin, I would spend 100 or 150 hours just to figure out what had to be changed in NVDA to make it work.
The situation now is that we have a stand-alone plugin that lets the screen reader do its normal job, but when input comes from the steno machine, the screen reader shuts up and the plugin sends the typed words directly to the speech output.
The customer asked me: "with the previous plugin, Plover-jaws, I could read the current word, sentence or line with a steno key combination; I want this feature back".
So now I'm trying to figure out if I can do it with a small effort or if it's better to use a screen reader just for these functionalities because they are too complex.
Today the customer told me that he is ok with bringing the screen reader back just for the "say word", "say line" and "say sentence" functions, so, unless reading text from edit controls is really easy, I'll take this route.
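To show where the difficulty lies: once the text of the edit control and the caret offset are known, picking out the current word, line or sentence is the easy part, as in the sketch below. Getting the text and caret from an arbitrary application is the hard part and is not shown here. The function names are invented for the example, and the sentence splitting is deliberately naive (it only looks for ".", "!" and "?").

```python
import re


def line_at(text, caret):
    """Return the line that contains the caret offset."""
    start = text.rfind("\n", 0, caret) + 1
    end = text.find("\n", caret)
    if end == -1:
        end = len(text)
    return text[start:end]


def word_at(text, caret):
    """Return the whitespace-delimited word at (or just before) the caret."""
    for match in re.finditer(r"\S+", text):
        if match.start() <= caret <= match.end():
            return match.group()
    return ""


def sentence_at(text, caret):
    """Return the sentence around the caret, using naive punctuation rules."""
    for match in re.finditer(r"[^.!?]+[.!?]*", text):
        if match.start() <= caret < match.end():
            return match.group().strip()
    return ""
```

A steno key combination for "say line" would then only need to call `line_at` with whatever text and caret position the accessibility API returned, and send the result to the speech output.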
I am sorry for the long message and hope everything's clear now
Of course, nothing could be done with JAWS;
On 22/06/2020 19:11, Travis Siegel wrote:
Not sure why you need to read anything if you're just translating keys; simply hook the keyboard functions, then do a simple replace on the required keystrokes, no screen reading necessary. By doing that, the new key combinations would automatically be placed in the field they were targeted for in the first place, and your apps don't even need to know anything changed. That would be the simplest method. Now, how you get Windows to send you all the key information could certainly benefit from studying screen reader functionality, but I see no need to bother with reading screens at all in your particular case.
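The "simple replace" step described here could look roughly like the sketch below. It leaves out the keyboard hook itself and only shows the lookup; the chord table and the `translate` name are hypothetical, and a real steno dictionary is far larger.

```python
# Hypothetical chord-to-text table, for illustration only.
CHORD_MAP = {
    "KAT": "cat",
    "TKOG": "dog",
}


def translate(chords, chord_map=CHORD_MAP):
    """Replace each known chord with its text; pass unknown chords through."""
    return [chord_map.get(chord, chord) for chord in chords]


print(translate(["KAT", "XYZ", "TKOG"]))
```

Unknown chords are passed through unchanged here so that nothing typed is silently dropped.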
On 6/21/2020 4:32 PM, Christian Comaschi wrote:
Thanks for the explanation, but unfortunately I’m working on a steno application that can be thought more as a driver than as a normal application. In fact, its main goal is to translate steno keyboards key combinations to normal key presses and send them to other apps, e.g. text editing apps.
So I don’t have to make its windows accessible, they already are, but I have to do what I wrote in my previous mail, read the text and caret position of the most common applications in a screen-reader like manner.
I could explain more in detail why I have to do that but it would take pages!
On 21 Jun 2020, at 20:47, Travis Siegel <tsiegel@...> wrote:
In general, if you have the source code for an application (and with open source, you do), there's no need to fiddle with screen reader built-in functions at all; just rewrite the actual application to use standard Windows API calls (instead of custom functionality) for GUI elements, buttons, and the like. This will automatically translate to better functionality in screen readers, because they're already built to watch the regular APIs for information.
For example, if the app is written in Java, instead of drawing your text onto a canvas, as so many apps do, simply use a standard text control. It may take more work to make it look the way you want (which is why some folks use the graphical canvas), but it will automatically become more accessible without you having to do anything at all with regard to the screen readers. Other languages have similar functionality issues. In general, using a non-graphical method to get the text to the screen, properly labeling graphical elements, and using standard Windows controls instead of creating your own from scratch will make the application completely accessible, with very few tweaks being necessary to resolve any accessibility issues that may remain.
In general, the more custom gui elements you use, the less accessible your application becomes. Obviously, there's ways to get around this, but few (If any) developers know enough about accessibility out of the box to make those required modifications to custom elements so they work with screen readers. I don't know specifically what you're trying to fix, I've never heard of the application you're trying to fix, neither do I know what language it's written in, but most of the time, making an application more accessible doesn't require writing scripts or screen reader modules, simply make the application use standard windows controls at the source level, and most of those things will solve themselves.
On Sun, 21 Jun 2020, Christian Comaschi wrote:
I'm asking a question that might be a little off topic because I'm not planning to develop anything for NVDA at the moment; but I'm working on an accessibility project and I'd like to know more about screen readers internals, and I think that someone here can help me find the info I need.
I'm writing custom code to improve the accessibility of an open source application (Plover), because common screen reader scripts and app modules alone don't allow me to bring it to the needed accessibility requirements.
The problem is that at some point I need to read text from editable controls of any application in a "screen reader"-like manner, so I would like to know how screen readers can get the caret position and read the text of an editable control, and how the approaches of JAWS and NVDA differ.
I'm asking you for the details of this functionality because I am trying to figure out if it could be a viable solution to read text from the screen in a "screen reader"-like manner with an approach that is valid for almost every application, or if it's too complex because it would require re-inventing a screen driver from scratch or re-inventing scripts for common applications. In this latter case, I would consider a less stand-alone approach and make the application work in tandem with JAWS or NVDA.
After some analysis I have come to a conclusion and I would like to know if it's right:
- NVDA has no generic way to "read" the text given a screen position, but there are scripts for the most common applications that provide this information to the main module using the most proper technique for the single application (Win32 API, MSAA, UIA or other means);
- JAWS seems to have generic functions such as "SayLine", "SayRow" or "SaySentence" that work for most of the applications because of its video intercept driver.
As a first try, I wrote some small scripts to use just UIA to read the text and caret position inside Notepad or Winword, but it didn't work; I also tried to use the inspect tool from Microsoft, meant to analyze the windows of any application to get accessibility info, but even that tool wasn't able to get the caret position inside the edit windows.
Am I missing something or is it really that complex?
Thanks in advance