Hi Jonathan,
I am using XeTeX/XeLaTeX to typeset Devanagari texts,
e.g. http://sanskritdocuments.org/doc_devii/gangAShTakamkAlidAsa.pdf
http://sanskritdocuments.org/doc_devii/gangAShTakamkAlidAsa.html?lang=sa
(HTML TEXT version of the same)
However, when the Devanagari text is copied from t
Hi all,
the problem is caused just by a few characters, especially the short
i-matra. It might be more difficult in other Indic scripts containing
two-part vowels. The reason is that visually they appear in a different
order than in their Unicode representation. It can be solved
by us
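
To make the ordering problem concrete, here is a minimal Python sketch. It assumes a text extractor that returns one code point per glyph in visual (left-to-right) order, so the short i-matra (U+093F) comes out before the consonant it follows in Unicode. The function name `visual_to_logical` and the simplified consonant test are illustrative assumptions, not part of XeTeX or any PDF tool, and the sketch ignores nukta forms, reph and two-part vowels.

```python
I_MATRA = "\u093F"  # DEVANAGARI VOWEL SIGN I (drawn to the left of its consonant)
VIRAMA = "\u094D"   # DEVANAGARI SIGN VIRAMA (joins consonants into a conjunct)


def is_consonant(ch):
    # Simplification (assumption): only the basic consonants KA..HA.
    return "\u0915" <= ch <= "\u0939"


def visual_to_logical(text):
    """Move each i-matra from before a consonant cluster to after it."""
    out = []
    i = 0
    n = len(text)
    while i < n:
        if text[i] == I_MATRA and i + 1 < n and is_consonant(text[i + 1]):
            # Consume the cluster: consonant (virama consonant)*
            j = i + 2
            while j + 1 < n and text[j] == VIRAMA and is_consonant(text[j + 1]):
                j += 2
            out.append(text[i + 1:j])  # the cluster first ...
            out.append(I_MATRA)        # ... then the vowel sign
            i = j
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


# "किताब" as it would come out in visual order: i-matra, ka, ta, aa-matra, ba
visual = "\u093F\u0915\u0924\u093E\u092C"
logical = visual_to_logical(visual)
```

After reordering, `logical` compares equal to the Unicode string किताब, which is what a search or paste operation would need.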
It would probably more than double. I was under the impression that
ActualText was a tag attribute, so extensive tagging would be needed, with
the actual text added to the tags.
But the question is how to practically make use of ActualText if there is a
visible text layer.
PDF/UA for instance leaves t
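
For readers unfamiliar with how ActualText is attached, the following Python sketch builds a marked-content sequence of the kind the PDF specification allows: a `/Span` with an `/ActualText` entry wrapped around the glyph-showing operator. The helper name `actual_text_span` and the glyph hex string `002A0011` are hypothetical; real glyph codes depend on the font, and whether a given viewer honours ActualText is exactly the open question discussed here.

```python
def actual_text_span(text, glyph_hex):
    """Return a content-stream fragment that tags glyphs with ActualText.

    text      -- the Unicode string the glyphs represent
    glyph_hex -- hex-encoded glyph codes for the Tj operator (hypothetical values)
    """
    # PDF text strings in UTF-16BE start with the byte order mark FEFF.
    payload = "FEFF" + text.encode("utf-16-be").hex().upper()
    return (
        f"/Span << /ActualText <{payload}> >> BDC\n"
        f"<{glyph_hex}> Tj\n"
        "EMC"
    )


# Glyphs in visual order (i-matra, ka) tagged with the logical text "कि".
fragment = actual_text_span("\u0915\u093F", "002A0011")
print(fragment)
```

Tagging every cluster this way is what would drive up the file size mentioned above.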
On 23/02/2016 13:54, Andrew Cunningham wrote:
> PDF/UA for instance leaves the question deliberately ambiguous.
> ActualText is the way to make the content accessible, but developers
> creating tools for PDF do not actually have to process the ActualText.
Yeah. (Sorry to keep banging the drum but)
>> the problem is caused just by a few characters, especially the short
>> i-matra. It might be more difficult in other Indic scripts containing
>> two-part vowels.
It is more extensive and applies to all/most glyphs used for conjuncts in
addition to the short i-matra. It also applies to other Indic scr
PDF text is essentially a sequence of glyphs, and uses the ToUnicode
mappings to resolve to
For OpenType fonts, it will apply to any glyphs that are not default glyphs
assigned specific codepoints, true ligatures or variation selectors, so in
theory for complex scripts it could include many if mos
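
As a rough illustration of the glyph-to-text step described above, the Python sketch below models a ToUnicode mapping as a plain dictionary from glyph ID to a Unicode string. All glyph IDs are made up for the example. A glyph may map to several code points (as a conjunct would), and a glyph with no entry cannot be resolved, which is shown here with U+FFFD.

```python
# Hypothetical glyph IDs; real values depend entirely on the font.
TO_UNICODE = {
    17: "\u0915",               # ka
    42: "\u093F",               # short i-matra
    58: "\u0924",               # ta
    301: "\u0915\u094D\u0937",  # conjunct glyph mapped to ka + virama + ssa
}


def extract_text(glyph_ids, to_unicode):
    """Resolve a glyph sequence to text, one mapping lookup per glyph."""
    return "".join(to_unicode.get(g, "\uFFFD") for g in glyph_ids)


# Glyphs are stored in visual order, so the i-matra glyph precedes ka.
as_extracted = extract_text([42, 17], TO_UNICODE)

# A conjunct glyph with a mapping resolves; glyph 999 has none.
with_gap = extract_text([301, 999], TO_UNICODE)
```

Here `as_extracted` is U+093F U+0915 rather than the logical U+0915 U+093F, and `with_gap` ends in a replacement character, so both searching and copying would fail on such text even though every glyph that has a mapping resolves correctly.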
On 23 February 2016 at 14:58, ShreeDevi Kumar wrote:
>> It is not only the problem of copy&paste, you will not be able to use
> the search dialog in Acrobat. For instance, you will not be able to find
> किताब.
>
> Yes, you are right. Search does not work for Unicode fonts for complex
> scripts in
Simon,
On 23 February 2016 at 14:12, Simon Cozens wrote:
> On 23/02/2016 13:54, Andrew Cunningham wrote:
> > PDF/UA for instance leaves the question deliberately ambiguous.
> > ActualText is the way to make the content accessible, but developers
> > creating tools for PDF do not actually have to