On 2015-09-04, Georg Baum wrote:
> Kornel Benko wrote:
>> On Thursday, 3 September 2015 at 17:20:30, Scott Kostyshak
>> <skost...@lyx.org> wrote:
>>> These tests are failing on master:
>>> 16:tex2lyx/roundtrip/test-insets-basic.tex
>>> 17:tex2lyx/cmplyx/test-insets-basic.tex
>>> 18:tex2lyx/roundtrip/test-insets.tex
>>> 19:tex2lyx/cmplyx/test-insets.tex

>> They all show only difference in displaying underlined 'e'. The underline
>> is now thicker. (See attached)
...
>> I'd say, we should use the new version.

> First we need to know why the output changed. I investigated the issue, and
> found out:

> 1) The tests fail since change 268bd00:
> http://www.lyx.org/trac/changeset/268bd0075e57e7ddde6ae63ac60faec1f4c31330/lyxgit

I added the mapping for the "macron below" after research into combining
diacritical characters in connection with the comma below fix.

> Looking further at change 268bd00, I see that both 0x0320 and 0x0331 do now
> map to \b in lib/unicodesymbols. Which one is correct?

* \b for "combining macron below" is definitely not wrong:

  - both produce an underline that does not connect with neighbours
    (as opposed to "line below")
    https://en.wikipedia.org/wiki/Macron_below
    The German version explicitly mentions the TeX representation \b
    https://de.wikipedia.org/wiki/Makron_%28Unterzeichen%29

  - \b is described as "underbar accent" (http://www.ams.org/membership/texcodes)
    or "macron below (line below)" (http://vjimc.osu.cz/TeXform.html)

  - Unicode has a number of precomposed characters with "macron below", e.g.

      1E06 LATIN CAPITAL LETTER B WITH LINE BELOW : 0042 0331

    (note the canonical decomposition despite the different name!)

    These characters can be reliably expressed as LICRs with \b.
    Compare the output of "Ḇḇ, Ḏḏ" vs. "\b{B}\b{b}, \b{D}\b{d}" in a XeTeX
    document (see below).

* \b for combining minus sign below may be correct (but I did not add it!):

  - the minus below is used in phonetic transcription for "retracted"
    https://www.internationalphoneticassociation.org/sites/default/files/phonsymbol.pdf
    https://en.wikipedia.org/wiki/Relative_articulation#Advanced_and_retracted

  - The output is identical in the XeTeX example below (this may differ when
    a different font is selected).

  - The Unicode standard says:

      COMBINING MINUS SIGN BELOW
      • IPA: retracted or backed articulation
      • glyph may have small end-serifs

      -- http://unicode.org/charts/PDF/U0300.pdf

    while \b does not produce small end-serifs.

  - The tipa manual says:

      Subscript bar
      Usage: retracted
      Input1: \textsubbar{e}
      Input2: \=*e
      Sources: IPA ’49–’96

    i.e. the equivalent to "combining minus sign below" according to tipa
    would be the "\textsubbar" accent macro.

There are many cases where Unicode has code points that map to the same
LICR. Sometimes Unicode itself "merged" the code points (e.g.
` 1FEF GREEK VARIA == ` 0060 GRAVE ACCENT); sometimes different Unicode
points have the same LaTeX counterpart (\~ is both accent tilde and accent
perispomeni, but Unicode keeps the difference).

> We should not keep both, since the screenshots sent by Kornel show that
> 0x0320 and 0x0331 produce slightly different output,

Whether a "combining minus sign below" looks identical to the "macron below"
depends on the chosen font (see e.g. http://unicode-table.com/en/0320/)

> and lib/unicodesymbols contains this line:
> # Do only add commands that give correct output, no hacks that look
> "similar".

Maybe in this case we have correct output that does not look similar :-)

Günter


\documentclass[]{article}
\usepackage{fontspec}
\setmainfont[Mapping=tex-text]{Linux Libertine O}
\begin{document}

ḆḇḎḏẖḴḵḺḻṈṉṞṟṮṯ: Unicode characters ... LETTER ... WITH LINE BELOW

\b{B}\b{b}\b{D}\b{d}\b{h}\b{K}\b{k}\b{L}\b{l}\b{N}\b{n}\b{R}\b{r}\b{T}\b{t}: base characters with accent macro \verb|\b|

Ḇ̱ḇḎḏẖḴḵḺḻṈ̱ṉṞṟṮṯ: Unicode characters + 0331 COMBINING MACRON BELOW

B̠b̠D̠d̠h̠...: Unicode characters + 0320 COMBINING MINUS SIGN BELOW

a̱e̱f̱g̱: 0331 COMBINING MACRON BELOW without composite rules

̠a̠e̠f̠g̠: 0320 COMBINING MINUS SIGN BELOW ̱

\end{document}
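
The point about the canonical decomposition can also be checked outside of
TeX. The following is only a minimal sketch, assuming a Python 3 interpreter
with its bundled Unicode database; the helper name describe() is made up for
illustration and has nothing to do with LyX or lib/unicodesymbols. It looks up
the names and decompositions of the code points discussed above and shows that
only the macron below takes part in canonical composition.

```python
import unicodedata


def describe(cp):
    """Return (character name, canonical decomposition) for a code point.

    Hypothetical helper for this illustration only.
    """
    ch = chr(cp)
    return unicodedata.name(ch), unicodedata.decomposition(ch)


# The precomposed letter and the two combining characters in question.
for cp in (0x1E06, 0x0331, 0x0320):
    name, decomp = describe(cp)
    print(f"{cp:04X} {name} : {decomp or '(no decomposition)'}")

# B + COMBINING MACRON BELOW composes to the "LINE BELOW" letter under NFC ...
composed_macron = unicodedata.normalize("NFC", "B\u0331")
# ... while B + COMBINING MINUS SIGN BELOW has no precomposed form and stays as is.
composed_minus = unicodedata.normalize("NFC", "B\u0320")

print(composed_macron == "\u1E06")
print(composed_minus == "B\u0320")
```

This says nothing about how a given font draws the two marks; it only
confirms that Unicode treats the "LINE BELOW" letters as base letter plus
0331, and has no such relation for 0320.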