On Thu, Oct 05, 2023 at 10:08:50AM +0200, Walter Alejandro Iglesias wrote:
> On Wed, Oct 04, 2023 at 10:20:57PM +0000, Bjarni Ingi Gislason wrote:
> > Latin1 iacute has the utf8 code 'Ã'
> > and the hexadecimal code is C3AD which is "LATIN CAPITAL LETTER A WITH
> > TILDE" and "SOFT HYPHEN"
> >
> > "groff" turns "soft hyphen" into "HYPHEN-MINUS" (0x2D)
> >
> > More is in the attachment.
>
> > file list.tr
> >
> > .hw a-hÃ-
> > .hw a-ño
> > .hw ár-bol
> > .hw cu-brÃ--a
> > .hw e-té-re-o
> > .hw ca-mión
> > .hw ú-te-ro
> > .hw pin-güi-no
> >
> > Output from "preconv -e utf8 list.tr
> >
> > .lf 1 list.tr
> > .hw a-h\[uFFFD]-
> > .hw a-\[u00F1]o
> > .hw \[u00E1]r-bol
> > .hw cu-br\[uFFFD]--a
> > .hw e-t\[u00E9]-re-o
> > .hw ca-mi\[u00F3]n
> > .hw \[u00FA]-te-ro
> > .hw pin-g\[u00FC]i-no
> >
> > Translate "list.tr" to latin1
> >
> > iconv -f utf8 -f latin1 list.tr
> >
> > .hw a-h
> >
> > iconv: illegal input sequence at position 7
>
> The list.tr you're using in this example is already latin1.
> Repeat what you did but this time with a UTF-8 list.tr. :-)

Now I realize you used a latin1 list.tr on purpose, right?  If that's
the case, sorry!

If I feed preconv a file that is already in latin1 (using a UTF-8
locale here) ...

  $ preconv -e utf8 list_in_latin1.tr

... *all* non-ASCII characters in the output are replaced by \[uFFFD].

It seems that when I call the file as a macro (using .mso list.tr),
that particular utf8 character (iacute) is read wrongly by groff,
presumably because of what you noticed.

> >
> >
> > \[uFFFD] is called "replacement character".
> >
> > -.-
> >
> > Latin1 iacute has the utf8 code 'Ã'
> > and the hexadecimal code is C3AD which is "LATIN CAPITAL LETTER A WITH
> > TILDE" and "SOFT HYPHEN"
> >
> > "groff" turns "soft hyphen" into "HYPHEN-MINUS" (0x2D)
>

-- 
Walter
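
P.S. The byte-level behaviour discussed above can be reproduced outside groff. The following is only a rough sketch in Python, not what preconv or groff do internally; the sample word "cu-brí-a" and all variable names are my own assumptions for illustration. It shows that the UTF-8 encoding of iacute is C3 AD, that those two bytes read as latin1 give "Ã" plus a soft hyphen, and that a latin1 byte read as UTF-8 is invalid and ends up as U+FFFD.

```python
# Illustrative sketch only; the sample word and names are assumptions.
iacute = "\u00ed"  # LATIN SMALL LETTER I WITH ACUTE

# The same character in both encodings.
utf8_bytes = iacute.encode("utf-8")      # two bytes: C3 AD
latin1_bytes = iacute.encode("latin-1")  # one byte: ED

# UTF-8 bytes misread as latin1: each byte becomes its own character.
misread = utf8_bytes.decode("latin-1")
codepoints = [f"U+{ord(c):04X}" for c in misread]
print(utf8_bytes.hex().upper(), codepoints)

# A latin1-encoded word misread as UTF-8: the lone ED byte is not a
# valid UTF-8 sequence, so a lenient decoder substitutes U+FFFD.
word_latin1 = "cu-br\u00ed-a".encode("latin-1")
replaced = word_latin1.decode("utf-8", errors="replace")
print(replaced.encode("unicode_escape").decode("ascii"))

# A strict decoder refuses instead, much like iconv's
# "illegal input sequence" error.
try:
    word_latin1.decode("utf-8")
    strict_error_offset = None
except UnicodeDecodeError as err:
    strict_error_offset = err.start  # byte offset of the bad byte
print(strict_error_offset)
```

Plain ASCII bytes are identical in both encodings, which is why only the non-ASCII characters are affected.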