Hello! I’m having a problem with the way the advanced font features of XeTeX interact with PDF reader programs. I’m not sure exactly where the culprit lies, so I apologize if this is not the right place to ask for help; (re-)directions are welcome if that is the case.
I’ve been writing my CV (I think the more usual US term is résumé) in LaTeX, using xelatex to compile it to PDF. I managed to get it to look pretty much exactly as I wanted. (I’m not a typography expert, but I’m quite pleased with the result, if I may say so.) The document uses a nice font with many OpenType features, such as small and titling capitals, lining and old-style numerals, and superscripts. (Those are the ones I use; there are others.)

Therein lies the problem: as far as I can tell, “variant” characters, like small-caps or superscript letters, are represented as additional (private) code points within the font, rather than as separate fonts. For display and printing, this is not a problem: the font is embedded in the PDF, and everywhere I tried, it looks as it should. However, copying and pasting the contents into another program is a big failure. Everything that isn’t displayed in the “normal” variant is copied to the clipboard as a set of (what I believe to be) private code points rather than the “semantic” Unicode code points it represents.

This is a big problem for this document, as I expect a potential employer might try to copy and paste parts of it (e.g., the address) and unexpectedly get gibberish. I’ve tried searching for solutions or workarounds, with little success. If (as I assume) this is a well-known problem, don’t hesitate to just point me towards a document that explains it.

I’ve seen PDF documents that seemed to have a kind of “text overlay”: these were all scanned documents with (I assume) some kind of OCR processing. For display and printing purposes, only the scanned image was used (i.e., the OCRed text was invisible). However, when selecting (and copying and pasting), a text layer was used. I have no idea what PDF feature this used, or whether it’s accessible via LaTeX.
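In case it helps to reproduce the issue, here is a minimal sketch of the kind of document I mean. It is not my actual preamble: the font name is a placeholder (substitute any OpenType font that has small caps, old-style figures and superiors), and the feature options are the fontspec ones as I understand them. Whether the pasted text comes out garbled will presumably depend on the particular font.

```latex
% Compile with: xelatex example.tex
\documentclass{article}
\usepackage{fontspec}

% "Some OpenType Font" is a placeholder name, not the font used in my CV.
\setmainfont[Numbers=OldStyle]{Some OpenType Font}

\begin{document}

% Plain lowercase text: this copies and pastes as expected.
plain lowercase text

% Small caps selected through the OpenType feature:
% in my case this is what pastes as private code points.
\textsc{Small Caps Heading}

% Old-style figures (enabled globally above) and a superior ordinal.
Phone: 0123 456 789, available from the
1{\addfontfeature{VerticalPosition=Superior}st} of the month

\end{document}
```

Opening the resulting PDF, selecting everything and pasting it into a text editor should show which of the three lines survive the round trip.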
I was hoping there was a way to add “replacement” text for the affected areas (I searched the hyperref documentation for it, without success), such that on copy and paste the replacement is used rather than the private characters. Since it’s a one-page document, it wouldn’t be a lot of work to add the replacements.

The only alternative I could think of was to take FontForge and manually split the font into pieces (e.g., one for small caps, one for superscripts, etc.), such that each variant glyph is encoded in its “semantic” position. But it’s a big and complex font, so that would take a lot more work than just “hinting” the document with replacement text. I also worry that messing around with it in FontForge will cause me to lose hinting and other features that I (or FontForge) may not be aware of.

I welcome all ideas, and thank you in advance.

--Bogdan Butnaru

PS. What I’m using identifies itself as “XeTeX 3.1415926-2.2-0.9995.2 (TeX Live 2009/Debian)” on Ubuntu. Fontspec reports itself as “2008/08/09 v1.18”. The problem manifests itself in every PDF viewer I tried (about one each for Linux, Windows and Mac OS X, and also Google Docs’ viewer).

--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
http://tug.org/mailman/listinfo/xetex