On 4/26/2010 12:41 PM, Ross Moore wrote:
Hi Herb,
Just curious... what happens when you try to search within, or copy from, a PDF that has such combined characters?

PDF has the /ActualText(...) replacement tagging feature. This allows you to capture a sequence of content characters and declare the whole collection to be equivalent to a single Unicode code point (or a sequence of them).
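
As a rough illustration of the tagging described above, here is a minimal Python sketch that builds the text of a marked-content sequence carrying /ActualText. The helper name actual_text_span and the "(e) Tj" operator string are made up for the example; a real content stream would contain whatever glyph-showing operators the typesetter emitted, and this sketch only prints the fragment rather than writing a PDF.

```python
def actual_text_span(replacement: str, content_ops: str) -> str:
    """Wrap content-stream operators in a /Span marked-content
    sequence whose /ActualText entry holds `replacement`.

    Hypothetical helper for illustration only.
    """
    # PDF text strings may be encoded as UTF-16BE with a leading
    # byte-order mark (FE FF); writing it as a hex string avoids
    # having to escape parentheses or backslashes.
    hex_text = ("\ufeff" + replacement).encode("utf-16-be").hex().upper()
    return f"/Span << /ActualText <{hex_text}> >> BDC\n{content_ops}\nEMC"


# Declare that whatever is drawn between BDC and EMC stands for U+00E9.
# "(e) Tj" is a placeholder for the real glyph-drawing operators.
print(actual_text_span("\u00e9", "(e) Tj"))
```

A reader that honours /ActualText should then treat the enclosed glyphs as the declared text when searching or copying, regardless of how they were drawn.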

But that only works if you add an /ActualText entry. As far as I can tell, using a compound glyph as discussed here will not be a problem in a search, *provided* that the software you're using implements the Unicode collation algorithm correctly; in that case it shouldn't need /ActualText for searching to work on this kind of text.

That said, I have no idea how many PDF readers other than Adobe's Acrobat actually use a correct and complete implementation of the Unicode collation algorithm.
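
To make the point about searching concrete, here is a small Python sketch of the underlying idea: a precomposed character and the equivalent base-plus-combining-mark sequence differ as raw strings but match once both sides are normalized. This uses canonical normalization (NFD) from the standard library as a stand-in; it is not a full implementation of the collation algorithm, and it says nothing about what any particular PDF reader does. The function names are made up for the example.

```python
import unicodedata


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def find_normalized(haystack: str, needle: str) -> bool:
    """Substring search that ignores composed/decomposed differences."""
    h = unicodedata.normalize("NFD", haystack)
    n = unicodedata.normalize("NFD", needle)
    return n in h


precomposed = "caf\u00e9"    # 'e with acute' as one code point
combining = "cafe\u0301"     # 'e' followed by a combining acute accent

print(precomposed == combining)                   # raw comparison
print(canonically_equal(precomposed, combining))  # normalized comparison
print(find_normalized("un " + combining + " noir", precomposed))
```

Software that compares text this way can find a compound glyph without any /ActualText help, as long as the extracted characters are canonically equivalent to what the user typed.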

- Mike "Pomax" Kamermans
nihongoresources.com


--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex
