Update of bug #54213 (project groff):

                  Status:                    None => Invalid
             Assigned to:                    None => gbranden
             Open/Closed:                    Open => Closed
                 Summary:  grodvi: broken ^ and ~ => [grodvi] Basic Latin ^ and ~ on input map to surprising Unicode code points

    _______________________________________________________

Follow-up Comment #1:

[comment #0 original submission:]
> grodvi replaces ascii ^ (U+005E) by ˆ (U+02C6) and ~ (U+007E) by ˜ (U+02DC).

Yes.

> Test case:
>
> $ cat test.man
> .TH test 1
> .BI "perl example: " "$str =~ m/^[a-z]$/;"
>
> $ man -Tdvi ./test.man | dvipdfmx > test.pdf
>
> $ pdftotext test.pdf -
>
> And check that there is correct output: $str =~ m/^[a-z]$/;
> Currently there is ˆ and ˜.
>
> Werner LEMBERG wrote on mailing list that there are already macros for
> textual representation form: \(ha and \(ti. So they should be used for ^
> and ~ by default.

\(ha and \(ti are not macros; they are special character escape sequences
for accessing spacing forms of the circumflex accent and tilde, respectively.

^ and ~ do _not_ map to spacing forms, but rather to modifier letters.  This
is due to *roff heritage going back to the early 1970s, when Western Electric
Model 37 Teletypes were used as Unix terminals, for document composition among
other purposes.  The ASCII ^ and ~ characters were small and high above the
baseline so that they could be used as accent marks on a base character.

The example should more properly read as follows.

.TH test 1
.BI "perl example: " "$str =\(ti m/\(ha[a-z]$/;"

(I also would not set a literal example in italics in a man page, but that's a
separate issue.)

It is probably a good idea to consult the groff_char(7) man page.  The page
has been heavily revised since groff 1.22.4.  You can get a preview of it at
Michael Kerrisk's Linux man-pages project site.

https://man7.org/linux/man-pages/man7/groff_char.7.html

The groff_man(7) page in groff 1.22.4 also has some advice in this area.

       \(ha   ASCII circumflex accent.  Use for syntax elements of
              programming languages because some output devices might
              replace unescaped circumflex accents with non‐ASCII glyphs
              like the Unicode U+02C6 modifier letter circumflex.

       \(ti   ASCII tilde.  Use for syntax elements of programming
              languages because some output devices might replace
              unescaped tildes with non‐ASCII glyphs like the Unicode
              U+02DC small tilde.

In groff 1.23.0, it is expected that the foregoing material will move to a new
groff_man_style(7) page (and Kerrisk's site already reflects this move), since
it is not specific to the man(7) package, but it is hard to get man page
writers to read general *roff documentation.

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?54213>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/
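
For anyone repeating the test case from comment #0, the following is a minimal
sketch of a check for the two code points discussed above in text extracted by
pdftotext.  It is not part of groff; the names SURPRISING, find_surprising, and
describe are hypothetical, and the sample string merely imitates the output
described in the report.

```python
import unicodedata

# Mapping taken from the report: the modifier glyphs seen in the output,
# keyed to the Basic Latin characters that were typed in the man page source.
SURPRISING = {"\u02c6": "^", "\u02dc": "~"}


def find_surprising(text):
    """Return (offset, found, expected) for each modifier glyph in text."""
    return [
        (offset, ch, SURPRISING[ch])
        for offset, ch in enumerate(text)
        if ch in SURPRISING
    ]


def describe(ch):
    """Format a character as its code point plus its Unicode name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    # Imitation of the extracted text from the report, not real pdftotext output.
    sample = "$str =\u02dc m/\u02c6[a-z]$/;"
    for offset, found, expected in find_surprising(sample):
        print(f"offset {offset}: got {describe(found)}, "
              f"typed {describe(expected)}")
```

An empty result for a page whose source uses \(ha and \(ti would suggest the
escapes came through as the Basic Latin characters.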