On 25/08/11 20:01, Jude DaShiell wrote:
> pdf has accessibility issues for screen reader users

Some pdfs have issues.
Some of the pdf issues are accessibility. :-)
Some html files also have accessibility issues...

> and riverwind and me are both screen reader users.

And you are not alone.

> The best we can attempt is a text extraction from pdf files if we're 
> going to read what's in them.

Then you have been sadly misinformed.
I have no problems reading the pdf I linked with Okular (using kttsd) -
I prefer the html version, but I wouldn't want it as a single file.

I'd recommend careful preparation (food, drink, sleeping bag etc) before
attempting to screen read a single-page document made from 544 pages -
or you'll spend the next few hours trying to find speech-dispatch's PID
(without the benefit of a reader) so you can kill it! ;-D

> If what was left in the file was a scanned image, maybe that can be
> scanned on Windows; I don't know that a parallel capability exists
> with Linux yet.

Usually it's the other way around. E.g. one day Windoof will have
screenreading built-in to the core and people will stop forking out big
dollars thinking JAWS is "assistive technology".

Tesseract does an excellent job of OCRing pdfs that are just images,
and there are GUI front ends for it.
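
For the command-line route, here is a rough sketch of how that can be
scripted. It assumes pdftoppm (from poppler-utils) and tesseract are both
installed and on the PATH; tesseract wants images rather than a pdf, so
the pages are rendered to PNG first. The function names, the 300 dpi
default and the "eng" language are my own choices for illustration.

```python
import subprocess
import tempfile
from pathlib import Path


def rasterize_cmd(pdf, prefix, dpi=300):
    # pdftoppm writes one PNG per page, named <prefix>-<page number>.png
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf), str(prefix)]


def ocr_cmd(image, lang="eng"):
    # tesseract <image> <output base> writes its text to <output base>.txt
    image = Path(image)
    return ["tesseract", str(image), str(image.with_suffix("")), "-l", lang]


def page_number(image):
    # "page-07.png" -> 7, so pages sort numerically whatever the padding
    return int(Path(image).stem.rsplit("-", 1)[1])


def ocr_pdf(pdf, lang="eng", dpi=300):
    """Return the OCR text of an image-only pdf, pages separated by form feeds."""
    with tempfile.TemporaryDirectory() as tmp:
        subprocess.run(rasterize_cmd(pdf, Path(tmp) / "page", dpi), check=True)
        pages = []
        for image in sorted(Path(tmp).glob("page-*.png"), key=page_number):
            subprocess.run(ocr_cmd(image, lang), check=True, capture_output=True)
            pages.append(image.with_suffix(".txt").read_text(encoding="utf-8"))
        return "\f".join(pages)
```

Something like print(ocr_pdf("scan.pdf")) then gives plain text that can
be piped to a reader. Expect recognition errors on poor scans; a higher
dpi value sometimes helps.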

> Also, whenever text extraction gets done on pdf files with command 
> line tools with Linux there are spelling mistakes in the output.

I'm assuming you use Orca (or whatever GNOME calls its reader) - surely
that works with the GNOME PDF viewer?

> The pdf format is just something those of us that can't see the
> screen would be really happy if either Adobe had never come into
> existence or had never invented that format.

If Microsoft ceased to exist I'd agree - but they do, and the best I can
do with some "users" is get them to send me a pdf *instead* of a
"rent-a-view" Office document or some other proprietary method of making
information asymmetrically accessible.... It's a less than perfect world
so I accept less than perfect solutions.

> Also, knowledgeable sighted
> technical people I talk with hate Adobe and pdf with a passion and
> they can't all be wrong.

Originally Adobe *was* pdf. This is no longer the case - it was made an
open standard three years ago (ISO 32000-1:2008).

Plain text is good, RTF is nice, HTML is better.
Sadly, many people have problems with cross-platform text files, and
HTML is often made ugly and unusable. PDFs can be ugly too - but most
people have no problems viewing or printing them. So pdfs are often
the "least worst" format for styled text and image documents. PDF is
also a handy format for saving reference webpages.

> 
> On Thu, 25 Aug 2011, Curt wrote:
> 
<snipped>
> 
> 

Cheers

-- 
"If the FBI's motivating factor for busting down the Koresh compound was
child abuse, how come we never see Bradley tanks smashing into Catholic
churches?"
~ Bill Hicks


-- 
Archive: http://lists.debian.org/4e5636ad.3070...@gmail.com
