Re: Add-on issue - Getting text from a PDF

Brian's Mail list account

The big issue with PDF files is how the file was made and whether it is protected.
I'm not well versed in the coding myself, but I have been displaying PDFs in the PDF reader from the WebbIE site. It is extremely simple, copes very well with unprotected, text-based, tagged PDFs, and lets you cut and paste the text into any editor you like.
However, if the PDF is a picture of text then this won't work, and neither will the Adobe product. You need to OCR it, and even then you can end up with a reading order that is simply left to right, top to bottom. If the reading order changes inside the document, the OCR will never know, so garbage will be the result.
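
To make the reading-order point concrete, here is a minimal Python sketch. It is only an illustration: the word positions, the function names (naive_order, column_order) and the column boundary (split_x) are all made up, and none of it comes from a real OCR tool or from NVDA. It shows how a plain left to right, top to bottom sort interleaves two columns, while a sort that is told where the columns split keeps each column together.

```python
from typing import List, Tuple

# Hypothetical OCR output: (text, x, y), with y increasing down the page.
Word = Tuple[str, float, float]


def naive_order(words: List[Word]) -> List[str]:
    """Read strictly top to bottom, then left to right within a row."""
    return [w[0] for w in sorted(words, key=lambda w: (w[2], w[1]))]


def column_order(words: List[Word], split_x: float) -> List[str]:
    """Read the left column fully, then the right column.

    split_x is an assumed, known column boundary; a real tool would
    have to detect it, which is exactly where things tend to go wrong.
    """
    left = [w for w in words if w[1] < split_x]
    right = [w for w in words if w[1] >= split_x]
    return naive_order(left) + naive_order(right)


# A tiny two-column page with two rows of text.
words: List[Word] = [
    ("The", 10, 0), ("cat", 40, 0), ("A", 200, 0), ("dog", 220, 0),
    ("sat", 10, 20), ("down.", 40, 20), ("barked", 200, 20), ("loudly.", 250, 20),
]

print(" ".join(naive_order(words)))
# The cat A dog sat down. barked loudly.
print(" ".join(column_order(words, split_x=100)))
# The cat sat down. A dog barked loudly.
```

The first line printed is the kind of garbage you get when the layout changes inside a document and the software does not notice.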


Sent via blueyonder.
Please address personal E-mail to:-
briang1@..., putting 'Brian Gaff'
in the display name field.
Newsgroup monitored: alt.comp.blind-users

----- Original Message -----
From: "Stefano Bringhenti" <>
To: <>
Sent: Tuesday, November 05, 2019 11:55 AM
Subject: [nvda-devel] Add-on issue - Getting text from a PDF


I am writing to get some insight into getting text from a PDF file. In particular, I would like the add-on to get and change the spoken text of a PDF file while navigating it with the arrow keys. The add-on works perfectly in any text editor: it suppresses the default speech, then uses "event_caret" and a function that gets the text at the caret position, modifies it and speaks it again. The same approach does not work for a PDF file, since the caret does not seem to move at all while using the arrow keys. I think the problem could be solved by using something other than the caret for PDF files, but I do not know how to change the add-on so that it also works for PDF files. If anyone knows how to solve the problem, or knows of another add-on with a similar aim, please write back.

Thanks in advance,
Stefano Bringhenti
