On Sun, 17 Mar 2019 at 19:57, Ross Moore <ross.mo...@mq.edu.au> wrote:
>
> Hi Andrew,
>
> On 18/03/2019, at 0:18, "Andrew Cunningham" <lang.supp...@gmail.com> wrote:
>
> Ross,
>
> It is also dependent on the fonts themselves and the scripts the language is 
> written in.
>
>
> Absolutely.
>
> Depending on the language and script the only way to ensure accessibility is 
> to include the ActualText attributes for each relevant tag.
>
>
> Indeed, provided you have supplied tagging at all, as of course should be 
> done.
>
> Considering how complex OpenType fonts can become for some scripts, the 
> simplistic ToUnicode mappings in a PDF can be insufficient.
>
>
> Yes, but it is better for the CMaps to at least be appropriate, rather than 
> inaccurate or missing altogether, as can be the case. Different software 
> tools get information from different places, so ideally one needs to provide 
> the best values for all those possible places.
>
No, CMaps help for simple scripts only. Imagine a personal name
written বৌমিক in the Bengali script and transliterated as Bowmik. OW
is a two-part matra (dependent vowel) that looks like an e-matra
preceding the consonant and an o-matra following it. The i-matra
always precedes the consonant, so with a CMap alone the word would
become eboimak, with two spelling errors. An editor will complain
about an e-matra at the beginning of a word and an i-matra following
an o-matra, and it will indicate missing consonants. Similarly, the
Hindi word स्थापित (sthaapit) would be extracted as sthaaipat, which is
wrong because an i-matra must not follow an aa-matra. If I had time, I
could give you several thousand examples where CMaps fail. In the past
I did many tests with Devanagari, and without ActualText the problem
cannot be solved. This is the very reason why
\XeTeXgenerateactualtext was implemented. It is not just a problem of
saving as text/rtf/doc; search does not work either.
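
To make the Hindi example concrete, here is a minimal Python sketch of
what a per-glyph lookup does. It is only a toy model: the glyph names
and the lookup table are hypothetical and stand in for a font's glyphs
and a ToUnicode CMap; no real PDF library is involved. The glyphs are
listed in the order they are drawn on the page, with the i-matra glyph
placed before the consonant it belongs to.

```python
# Logical (Unicode) order of the word: sa, virama, tha, aa, pa, i, ta.
LOGICAL = "\u0938\u094d\u0925\u093e\u092a\u093f\u0924"  # sthaapit

# Toy stand-in for a ToUnicode CMap: one entry per glyph.
# The glyph names are invented for this illustration.
TO_UNICODE = {
    "sa_half": "\u0938\u094d",   # half form of sa = sa + virama
    "tha": "\u0925",
    "aa_matra": "\u093e",
    "i_matra": "\u093f",
    "pa": "\u092a",
    "ta": "\u0924",
}

# Visual order on the page: the i-matra glyph is drawn BEFORE pa.
GLYPHS_ON_PAGE = ["sa_half", "tha", "aa_matra", "i_matra", "pa", "ta"]


def extract_with_cmap(glyphs, table):
    """Map each glyph to Unicode in drawing order, as a CMap-only reader would."""
    return "".join(table[g] for g in glyphs)


def extract_with_actualtext(glyphs, table, actual_text=None):
    """Prefer the replacement text attached to the span, if there is one."""
    if actual_text is not None:
        return actual_text
    return extract_with_cmap(glyphs, table)


extracted = extract_with_cmap(GLYPHS_ON_PAGE, TO_UNICODE)
print(extracted == LOGICAL)   # the i-matra now follows the aa-matra
print(LOGICAL in extracted)   # a search for the correct word
print(extract_with_actualtext(GLYPHS_ON_PAGE, TO_UNICODE, LOGICAL) == LOGICAL)
```

The per-glyph result contains exactly the same code points as the
correct word, only in the wrong order, so no improvement of the table
itself can repair it. Only text attached to the whole span, which is
what ActualText provides, restores the logical order.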

> And text in a PDF may by WCAG definition be non-textual content.
>
>
> Presumably you mean, adding descriptive text to graphics that convey 
> meaningful information; e.g. a company logo, and most illustrations.
> Of course this should be done too. But this can only be useful if the 
> alternate descriptive text can be found via the structure tagging; hence the 
> need for fully tagged PDF, navigable via that tagging.
>
> And Zdenek's comment emphasises how what might work well in one language 
> setting can be quite insufficient for others. We need to be able to 
> accommodate all things that are helpful.
> That is surely what the U (for Universal) means in PDF/UA.
>
>
> Cheers,
>
>       Ross
>

Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz
>
>
>
> On Sunday, 17 March 2019, Ross Moore <ross.mo...@mq.edu.au> wrote:
>>
>> Hi Karljürgen,
>>
>> On 17/03/2019, at 1:42, "Karljürgen Feuerherm" <kfeuerh...@kfeuerherm.ca> 
>> wrote:
>>
>> > Ross,
>> >
>> > Your reply caught my eye, and I am now looking at the pdfx package 
>> > documentation.
>> >
>> > May I ask, if accessibility is a concern, why a-2b/-2u rather than -ua-1, 
>> > which seems directly targeted at this?
>>
>> PDF/UA and PDF/A-1a, -2a, -3a require a fully tagged PDF.
>> This is a highly non-trivial task, which requires adding much extra to the 
>> document, done almost entirely through \special commands. The pdfx package 
>> does not provide this, but is useful for meeting the Metadata and other 
>> requirements of these formats.
>>
>> Abstractly, accessibility is about having sufficient information stored in 
>> the PDF for software tools to be able to build and present a description of 
>> the content and structure, other than the visual one. The same can be said 
>> of software for converting into a different format.
>>
>> A significant part of this is being able to correctly identify each 
>> character in the fonts used within the TeX-produced PDF. Even this is a 
>> non-trivial problem, due to TeX's non-standard font encodings and its 
>> virtual font technique.
>>
>> >
>> > Many thanks,
>> >
>> > K
>> >
>> >> You should use the  pdfx  package and prepare for  PDF/A-2b or -2u.
>> >> This fixes many of these things that affect conversions, as well as 
>> >> Accessibility and Archivability.
>> >>
>> >> It's not fully tagged PDF, but handles many other technical issues.
>> >>
>>
>>
>> Hope this helps.
>>
>> Ross
>>
>
>
> --
> Andrew Cunningham
> lang.supp...@gmail.com
>
>
>
