Hi Branden,

On Wed Nov 13, 2024 at 4:36 PM CET, G. Branden Robinson wrote:
> [...]
> > Latin1 characters continue working even when loading Latin2 as long as
> > they are specified as the respective UTF-8 codes.
>
> And they _should_ continue to hyphenate at appropriate locations because
> a set of hyphenation codes is associated with the hyphenation
> _language code_ ("en", "cs", "fr", etc.), which can change from
> environment to environment.
> [...]

Ah yeah, I forgot about the functionality of .hla.

I wonder if this works even when each language uses a different
encoding, though, given that .tr and .trin are specified like so in
groff(7):

  .tr abcd...
      Translate ordinary or special characters a to b, c to d, and so
      on prior to output.
  .trin abcd...
      As .tr, except that .asciify ignores the translation when a
      diversion is interpolated.

That is, translation should happen on output, not on input, meaning
that using .hla might not be sufficient to switch between cs and fr,
because that doesn't switch the encoding used.

Those are just my thoughts based on the documentation, though. I don't
have the time to verify this.

> > My conclusion is that, given the intricacies of all this, loading the
> > appropriate localization file is THE way to setup hyphenation
> > correctly.
>
> Yes! Our documentation does actually try to get this idea across. If
> there are spots where you feel it is failing to do so, please bring them
> to my attention (but also base your recommendations on groff Git--I
> revise documentation all the time).

groff(7) does mention it, but it's among the last things mentioned in
the Hyphenation section. The texinfo manual doesn't mention it at all
in its section 5.1.3 about Hyphenation where I would expect it. (At
least the online version -- I haven't found any git source for it, just
tarballs.)

> > I feel like splitting the hyphenation part of localization files off
> > (into hycs.tmac etc.) would be beneficial in that one could load the
> > hyphenation settings for a given language without all the localization
> > strings.
>
> This, I'm less sure about. The localization strings are namespaced, so
> the only real advantage to separating them is a minuscule reduction in
> formatter startup time, about which I have never read any complaints.
> [...]
>
> > Groff's documentation of hyphenation could then be updated
> > with a simple mention of
> >   .mso hycs.tmac
> > before specifying the technical details (.hy, .hla, .hpf, ...) which
> > ordinary users won't need to deal with.
>
> The existing recommendation for localization is to specify loading of
> the groff locale via the command-line `-m` option, _after_ loading any
> full-service package.
> [...]

The reason I was suggesting this is the fact that once one disables
hyphenation through .nh or .hy 0, the only way to re-enable it, as far
as I am aware, is to issue .hy with the proper hyphenation mode, which
depends on the language and might not be known by the user. Separating
the hyphenation portion into its own macro file would allow one to
re-enable hyphenation by issuing
  .mso hyLANG.tmac
instead of having to research the appropriate mode for the given
language.

A simple macro could then be constructed which would offer a friendlier
interface to hyphenation. It could work like this:

  .HY
      Return to previous hyphenation settings (if set).
  .HY 0
      Disable hyphenation.
  .HY LANG
      Set hyphenation parameters appropriate for language LANG.

This would allow usage like so:

  .HY cs
  Příliš žluťoučký kůň úpěl ďábelské ódy...
  .br
  .HY 0
  .na
  https://\:www.gnu.org/\:software/\:groff/\:manual/\:groff.html
  .br
  .ad
  .HY

Of course, this wouldn't be necessary if .hy worked like .ad, but
(unless I am mistaken again :) it doesn't and cannot due to desired
compatibility with AT&T troff.

~ onf
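
P.S. To make the .HY idea a bit more concrete, here is a rough,
untested sketch of how such a macro might look. It is only an
illustration under a few assumptions of mine: the register name
HY*saved is made up, and since the hyLANG.tmac files don't exist, the
sketch loads the full LANG.tmac localization file (e.g. cs.tmac)
instead. It relies on the read-only register \n[.hy] to remember the
hyphenation mode before switching it off.

```roff
.\" Hypothetical sketch of the .HY interface described above; untested.
.\" HY*saved is an illustrative register name, not part of groff.
.\" Assumption: LANG.tmac (e.g. cs.tmac) stands in for hyLANG.tmac.
.\"
.\" Remember the hyphenation mode in effect when this file is read.
.nr HY*saved \n[.hy]
.\"
.\" .HY       restore the hyphenation mode saved by the last ".HY 0"
.\" .HY 0     save the current mode (if nonzero), then disable
.\" .HY LANG  load LANG.tmac to set up hyphenation for that language
.de HY
.  ie \\n[.$] \{\
.    ie '\\$1'0' \{\
.      if \\n[.hy] .nr HY*saved \\n[.hy]
.      nh
.    \}
.    el .mso \\$1.tmac
.  \}
.  el .hy \\n[HY*saved]
..
```

Note that a plain .HY in this sketch restores only the hyphenation
mode, not the language or patterns, so it would not undo an
intervening .HY LANG.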