TARDY response -- [Re: Needed tool for vision-impaired - was [Re: PDF Editor for Debian]]

Richard Owlett Wed, 07 Aug 2024 05:46:20 -0700

On 06/24/2024 12:22 PM, Karen Lewellen wrote:

Good afternoon.
I am providing another option that might help here.
robobraille,


www.robobraille.org

Provides services, free of charge, that will convert pdf files to a number of different formats, including .html

They provide audio, mobi, and convert epub files too... but I digress.
As a test, consider sending your file to
convert at robobraille.org
  correctly of course.
in the subject line put html
leaving the body blank, and attach the file.
See if the .html file returned meets your needs.
Best,
Karen
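
The e-mail request Karen describes (format name in the subject line, blank body, file attached) could be put together roughly as below. This is only a sketch using Python's standard library. The function name and its parameters are made up for illustration, the recipient address is supplied by the caller, and nothing here is taken from RoboBraille's own documentation. It only builds the message; sending it is left to whatever mail setup is at hand.

```python
from email.message import EmailMessage
from pathlib import Path
import mimetypes


def build_conversion_request(sender, recipient, file_path, target_format="html"):
    """Build a mail with the target format as subject and the file attached.

    Hypothetical helper: the names and defaults are assumptions for this sketch.
    """
    path = Path(file_path)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = target_format  # e.g. "html", as suggested in the post

    # Guess the MIME type from the file name; fall back to a generic binary type.
    ctype, _ = mimetypes.guess_type(path.name)
    maintype, subtype = (ctype or "application/octet-stream").split("/", 1)

    # No text part is added, so the body stays blank; only the attachment goes in.
    msg.add_attachment(
        path.read_bytes(),
        maintype=maintype,
        subtype=subtype,
        filename=path.name,
    )
    return msg
```

The returned message can be inspected with print(msg["Subject"]) before it is handed to a mail client or to smtplib.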


I went to the site shortly after you posted.
*MY* browser (SeaMonkey 2.49.4  {32 bit Linux}) choked on it.
I didn't get a chance to visit local library to try another browser.
Forgot I had a copy of Firefox 68.10.0esr on my machine.
It ran fine.

I converted "https://fns-prod.azureedge.us/sites/default/files/resource-files/TFP2021.pdf" to both text and HTML.

The text version seems perfect.

The HTML version has a problem of missing titles to several tables near the end of the file. There are 15 tables one after another. All table *contents* came thru OK. Only the last one had its associated title.
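
A quick way to count tables and titles in a returned HTML file might look like the sketch below. It ASSUMES a table title would show up as a <caption> element inside its <table>. I have not confirmed how the converter actually marks titles, so if they are emitted as ordinary paragraphs before each table this check would need changing. The class and function names are invented for the example.

```python
from html.parser import HTMLParser


class TableCaptionCounter(HTMLParser):
    """Count <table> elements and how many of them contain a <caption>."""

    def __init__(self):
        super().__init__()
        self.tables = 0
        self.captioned = 0
        self._open = []  # one flag per open table: caption already seen?

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables += 1
            self._open.append(False)
        elif tag == "caption" and self._open and not self._open[-1]:
            # Assumption: a title is represented as <caption> inside the table.
            self._open[-1] = True
            self.captioned += 1

    def handle_endtag(self, tag):
        if tag == "table" and self._open:
            self._open.pop()


def count_table_captions(html_text):
    """Return (number of tables, number of tables that have a caption)."""
    parser = TableCaptionCounter()
    parser.feed(html_text)
    parser.close()
    return parser.tables, parser.captioned
```

Feeding it the contents of the converted file gives two numbers to compare; a difference between them would point at tables without a title under the assumption above.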


I'll give www.robobraille.org a heads-up about it.

As I've a peculiar local configuration of SeaMonkey, could another SM user run a quick check so that I can report if their site has a problem with SeaMonkey?

TIA

On Mon, 24 Jun 2024, Richard Owlett wrote:
On 06/24/2024 12:35 AM, Richard wrote:
 Hello,
 this very much depends on what you are expecting it to do. In general,
 PDFs
 are only meant to be viewed - and printed - they were never meant for
 anything else. ...
Second sentence should read:
 ... only meant to be viewed by those with *NORMAL* vision ...
I'm attempting to read a USDA document.[1]
The printed version of this document is marginally readable.

Tools such as "Atril Document Viewer" provide selected magnification.
For this particular document and monitor, 150% is comfortable. Requires re-positioning the viewpoint 500 to 600 times to read document.
For _this_ document, Atril can select all the text on a page in a manner that can be pasted in a "reasonable" manner to a Pluma document.
It will:
   a. ignore actual graphics.
   b. put title/headings/??? on a separate line.
   c. all text between full page-width title/headings/??? will be
     treated as a logical unit.
It will not:
   1. put a blank line between paragraphs.
   2. put a blank line above/below lines containing title/headings/???.
   3. identify superscripts in some manner.
All this suggests that it should be able to extract text from a PDF and create a HTML document likely using only <p>, <br>, <sup>, and <li> in its <body>.
[1] https://fns-prod.azureedge.us/sites/default/files/resource-files/TFP2021.pdf
    _Thrifty Food Plan, 2021_
    Food and Nutrition Service
    August 2021
    FNS-916
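
To make the idea of a minimal HTML document concrete, here is a rough sketch that turns already-extracted plain text into HTML using only <p>, <br>, and <sup>. It rests on two assumptions that are NOT met by the Atril paste described above: paragraphs are separated by blank lines, and superscripts are flagged with a made-up marker of the form ^(1). List handling with <li> is left out. The function name and the marker are hypothetical.

```python
import html
import re

# Hypothetical superscript marker, e.g. "^(1)"; not something any PDF tool emits.
SUP_MARK = re.compile(r"\^\((.+?)\)")


def text_to_html(text):
    """Convert plain text to a bare HTML document using <p>, <br>, <sup>."""
    # Assumption: one or more blank lines separate paragraphs.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    parts = []
    for para in paragraphs:
        # Escape each non-empty line so stray <, >, & cannot break the markup.
        lines = [html.escape(line.strip()) for line in para.splitlines() if line.strip()]
        if not lines:
            continue
        body = "<br>\n".join(lines)
        body = SUP_MARK.sub(r"<sup>\1</sup>", body)
        parts.append(f"<p>{body}</p>")
    return "<html><body>\n" + "\n".join(parts) + "\n</body></html>"
```

A title on its own line followed by a blank line ends up in its own <p>, which matches item b. in the first list above.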
