[bug #66919] [troff] .hcode no longer accepts a special character as a first argument

G. Branden Robinson Mon, 17 Mar 2025 01:26:26 -0700

Update of bug #66919 (group groff):

                  Status:                    None => Need Info
             Assigned to:                    None => barx
                 Summary: .hcode no longer accepts a special character as a
first argument => [troff] .hcode no longer accepts a special character as a
first argument


    _______________________________________________________

Follow-up Comment #1:

Hi Dave,

[comment #0 original submission:]
> The .hcode entry in the manual says:
> 
> -- Request: .hcode dst1 src1 [dst2 src2] ...
> DST1 must be an ordinary character (other than a numeral) or a special
> character
> 
> And a special character did previously work here.  But it no longer does.

I don't think you diagnosed this problem correctly.


> $ file hcode_test
> hcode_test: ISO-8859 text
> $ iconv -f iso-8859-1 hcode_test
> .hcode \[~o] õ
> .pchar \[~o]
> .hcode õ õ
> .pchar \[~o]
> $ groff-latest hcode_test 2>&1 | fgrep 'hyphenation code:'
> hyphenation code: 0
> hyphenation code: 245


I have alternative facts.  Er, I mean, an alternative input file.


$ file ATTIC/66919alt.groff 
ATTIC/66919alt.groff: ISO-8859 text
$ iconv -f iso-8859-1 ATTIC/66919alt.groff
.hcode \[~o] á
.pchar \[~o]
.hcode õ á
.pchar \[~o]
$ ./build/test-groff ATTIC/66919alt.groff 2>&1 | grep -F 'hyphenation code'
  hyphenation code: 225
  hyphenation code: 225


> ("õ" is given no .hcode value in either latin1.tmac or en.tmac.

...and therefore its hyphenation code is 0, the default.

> Its absence from latin1.tmac seems like a bug, but at present
> it's a useful one, so I hope it sticks around for a bit.)

There are multiple opinions about whether hyphenation codes should bind to the
character encoding or to the hyphenation language, and right now my stance is
the latter.

The comment in "en.tmac" explains why.


.\" Map hcodes of Latin-1 characters with diacritical marks that are
.\" used in English words to their unadorned ASCII counterparts.
.\" See http://savannah.gnu.org/bugs/?66112 for rationale.


In part I am influenced by the exclusively UTF-8 future I can foresee, where
every language uses the same character encoding for input to GNU _troff_, and
we must rely upon localization files to set up all of the hyphenation codes
anyway, since the mappings will sometimes differ (hello, Turkish).

I went with your first suggestion in bug #66112, rather than your second.

I propose that there is no bug here, or at a minimum, that GNU _troff_ is in
fact accepting a special character as the first argument to the `hcode`
request.  What do you think?


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?66919>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
