Samuel Thibault wrote:
> When pasting ligatures, they are expanded, which is fine for non-UTF-8
> environments. But when a UTF-8 transfer is possible, maybe they should rather
> be transmitted as such? For instance,
> http://dept-info.labri.fr/~thibault/tmp/ligature.dvi
[...]
> Locale: [EMAIL PROTECTED], [EMAIL PROTECTED] (charmap=ISO-8859-15)

When do you think a UTF-8 transfer would be possible? You are not working
in a UTF-8 locale.

However, I am actually surprised that you get something reasonable at all.
After all, in place of the 'ffi' ligature the DVI file contains only the
byte 0x0E for OT1 encoding (as in your example) or 0x1E for T1 encoding.
In both cases copy & paste gives the correct sequence of characters.

BTW, I don't know how xdvi works internally here, but generally the same
text extraction functions are used for both searching and copying. For text
searching, however, it is important to also find the version with the
decomposed ligature.

In addition, AFAIK the only reason a very small set of ligatures is part of
Unicode is that they were defined in legacy encodings like AdobeExpert.
Normally Unicode wants to encode characters, not glyphs, and the 'ffi'
ligature is just a special glyph for the character sequence 'ffi'. So from
my understanding of the intention of Unicode, xdvi's behaviour of
decomposing ligatures is correct even in a locale using Unicode. See
<URL:http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf> for more
details.

cheerio
ralf
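To make the decomposition argument concrete, here is a minimal Python sketch. It is not how xdvi is implemented; the function names are hypothetical, and only the 'ffi' slots (0x0E for OT1, 0x1E for T1) come from the discussion above. The other ligature slots are filled in from the usual layout of those encodings as an assumption, and all remaining codes are simply treated as ASCII, which is a simplification. The second function shows that Unicode itself treats U+FB03 as a compatibility form of 'ffi': NFKC normalization folds it back to the three letters, so a search for the plain word still matches.

```python
import unicodedata

# Ligature slots per font encoding. Only the 'ffi' entries are taken from
# the message above; the remaining entries are assumed for illustration.
OT1_LIGATURES = {0x0B: "ff", 0x0C: "fi", 0x0D: "fl", 0x0E: "ffi", 0x0F: "ffl"}
T1_LIGATURES = {0x1B: "ff", 0x1C: "fi", 0x1D: "fl", 0x1E: "ffi", 0x1F: "ffl"}


def extract_text(codes, ligatures):
    """Turn font character codes into text, decomposing ligature slots.

    Simplification: every code that is not a ligature slot is passed
    through as if it were ASCII, which is not true for a full encoding.
    """
    return "".join(ligatures.get(code, chr(code)) for code in codes)


def fold_ligatures(text):
    """Replace Unicode compatibility characters such as U+FB03 by their
    decomposition, so that searching for the plain letters succeeds."""
    return unicodedata.normalize("NFKC", text)


# The word "office" as typeset with an 'ffi' ligature in each encoding.
ot1_codes = [ord("o"), 0x0E, ord("c"), ord("e")]
t1_codes = [ord("o"), 0x1E, ord("c"), ord("e")]

print(extract_text(ot1_codes, OT1_LIGATURES))
print(extract_text(t1_codes, T1_LIGATURES))
print(fold_ligatures("o\ufb03ce"))
```

With decomposition on extraction, pasted text and search strings agree regardless of the locale, which is the behaviour argued for above.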

