[bug #66919] [troff] behavior change in some .hcode calls when a special character is the first argument

G. Branden Robinson Thu, 20 Mar 2025 18:26:39 -0700

Follow-up Comment #16, bug #66919 (group groff):

At 2025-03-18T15:49:28-0400, Dave wrote:
> Follow-up Comment #14, bug #66919 (group groff):
>
> [comment #13:]
>> In this case, the change is deliberate:
>>
>> "Support clearing a character's hyphenation code by copying
>> that of a character that lacks one."
>
> Ah!
>
> But it seems that this mechanism to clear a code needs to carve out an
> exception for what I'm terming "reflexive hcode," right?

I think it has one.

https://git.savannah.gnu.org/cgit/groff.git/tree/src/roff/troff/input.cpp?id=6297957a8679fb9657c544cd13c35650a8731ff7#n8058

> Because the purpose of that is to generate a new hyphenation code for
> a character that may never have had one.

...or reset one to its virgin state after meddling. A theme of my
changes to groff is that we should reduce the number of one-way paths in
the formatter's configuration space. Anything you can create, you
should be able to delete (and create again). Any parameter you can put
some English on, giving it some wacky value, you can unspin back to
normality.
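
To make that round trip concrete, here is a minimal sketch of how one
might watch a hyphenation code get meddled with, cleared, and reset.
It assumes a groff recent enough to have the `pchar` request (as in
the transcripts in this message) and the `hcode` semantics discussed
in this ticket.  The choice of `\[bu]` as "a character that lacks a
hyphenation code" is my assumption, and the names `round_trip` and
`GROFF` are hypothetical.  I list no expected output; the thing to
compare is the "hyphenation code:" line of each report.

```shell
#!/bin/sh
# Sketch only: assumes a groff with the "pchar" request and the hcode
# semantics discussed in this ticket.  Override GROFF to point at a
# particular build, e.g. GROFF=~/groff-HEAD/bin/groff.
GROFF=${GROFF:-groff}

# Emit a roff document that reports the state of "a" after each step.
round_trip() {
    printf '.pchar a\n'                   # baseline
    printf '.hcode a b\n.pchar a\n'       # meddle: give "a" the code of "b"
    printf '.hcode a \\[bu]\n.pchar a\n'  # clear: \[bu] is ASSUMED to lack a code
    printf '.hcode a a\n.pchar a\n'       # reflexive hcode: reset
}

# Run the formatter only if it is installed.
if command -v "$GROFF" >/dev/null 2>&1; then
    round_trip | "$GROFF" -a
fi
```

If the first and last reports agree, the path is two-way in the sense
I mean.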

A couple of accepted exceptions exist: deletion of requests, and
deletion of registers with special semantics (like `.i` or `.ps`). If
you kill those, there's no going back. Doug McIlroy had to talk me into
the latter.

> It's also curious that "reflexive hcode" works differently depending
> on how the character is spelled: ".hcode \[~o] õ" now has a different
> effect from ".hcode õ õ".

I don't have many useful thoughts on this yet. I understand that a set
of normalization processes is internally applied to special character
names. For instance:

$ printf "hell\\[o aa]\n" | groff -a
<beginning of page>
hell<'o>

(same output in groff 1.22.3 through master HEAD).
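
For anyone who wants to compare the two spellings of the reflexive
call directly, here is a minimal sketch.  It assumes a groff with the
`pchar` request, and it writes õ as the single Latin-1 byte \365 so
that the input does not depend on the terminal's encoding.  The names
`reflexive_probe` and `GROFF` are hypothetical, chosen only for
illustration.

```shell
#!/bin/sh
# Sketch only: assumes a groff with the "pchar" request.
GROFF=${GROFF:-groff}

# Emit a document that runs ".hcode <spelling> õ" and then reports on
# both spellings of the character.  \365 is õ as one Latin-1 byte.
reflexive_probe() {
    printf '.hcode %s \365\n' "$1"
    printf '.pchar \\[~o]\n'
    printf '.pchar \365\n'
}

if command -v "$GROFF" >/dev/null 2>&1; then
    # One formatter run per spelling, so neither run affects the other.
    reflexive_probe '\[~o]' | "$GROFF" -a
    reflexive_probe "$(printf '\365')" | "$GROFF" -a
fi
```

Differences between the "hyphenation code:" lines of the two runs
would show where the spellings diverge.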

Also, you can (try to) remove a special character and recreate it as a
user-defined character. Its hyphenation code will remain unchanged
throughout these processes.

$ printf ".pchar \\[o aa]\n.rchar \\[o aa]\n.pchar \\[o aa]\n.char \\[o aa] world\n.pchar \\[o aa]\n" \
    | ~/groff-HEAD/bin/groff -a
special character "'o"
is not translated
does not have a macro
special translation: 0
hyphenation code: 111
flags: 0
ASCII code: 0
asciify code: 243
is found
is transparently translatable
is translatable as input
mode: normal
special character "'o"
is not translated
does not have a macro
special translation: 0
hyphenation code: 111
flags: 0
ASCII code: 0
asciify code: 243
is found
is transparently translatable
is translatable as input
mode: normal
special character "'o"
is not translated
has a macro: "world"
special translation: 0
hyphenation code: 111
flags: 0
ASCII code: 0
asciify code: 243
is found
is transparently translatable
is translatable as input
mode: normal

Is the foregoing correct/desirable? I don't know. Since a character's
definition and its hyphenation code have been separately configurable
for either ~23 years (`char`) or 34+ (`hcode`), I guess so. Maybe?

I am inclined to punt this question out into the Great Post-1.24
Hyphenation Reform Chinwag that Peter proposed. There seems to be a
cluster of related nettles to grasp in this area.

The question I have for you right now is whether _groff_ master is
working as you expect and desire specifically for Latin-1 characters
whose hyphenation codes we configure in "en.tmac", like "ó". By listing
them there, we bless them as honorary letters of the English alphabet.
I also want to know whether spelling them as 8-bit ordinary characters
or special characters works as you expect.
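
One way to check, sketched under assumptions: report on "ó" in both
spellings without touching its hyphenation code, so that the output
reflects whatever the startup files configured.  This assumes a groff
with the `pchar` request and that the default startup files load
"en.tmac"; the names `spelling_report` and `GROFF` are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes a groff with the "pchar" request and startup
# files that load en.tmac.
GROFF=${GROFF:-groff}

# Report on "ó" in both spellings; no hcode request is issued here.
spelling_report() {
    printf '.pchar \363\n'          # 8-bit ordinary character (Latin-1 ó)
    printf '.pchar %s\n' "\\['o]"   # special character spelling
}

if command -v "$GROFF" >/dev/null 2>&1; then
    spelling_report | "$GROFF" -a
fi
```

If the two reports show different hyphenation codes, that is the sort
of discrepancy I would like to hear about.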

Once those questions are settled, I think it is time to decide
whether it's NEWS-worthy to announce changed hyphenation support, in
English, of letters that don't properly occur in English words. I lean
against it, because people can and will install fancy fonts with support
for Thai and Cherokee scripts and their hyphenation codes are all going
to be stone zeroes as well.

_______________________________________________________

Reply to this item at:

<https://savannah.gnu.org/bugs/?66919>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
