Hi folks,

A little over a year ago I put an implemented change back on the shelf
due to objections from Dave and Peter, who weren't opposed to the change
per se, but felt that it needed some preliminaries mooted on this list.

Background:

https://lists.gnu.org/archive/html/groff/2025-02/msg00000.html
https://savannah.gnu.org/bugs/?66392
https://savannah.gnu.org/bugs/?66387

The quotes driving my initiation of this thread are as follows.

"Roll back the change and have a good chinwag about environment
switching after 1.24." -- Peter

"When the change is re-implemented for 1.25, it can be done in tandem
with the larger discussion that has sprung up here about how best to
populate new environments. And that can be done early in the release
cycle, to give any issues more time to shake out before a release."
-- Dave

Well, it's now after 1.24 and early in the 1.25 release cycle.

I've got a patch that applies cleanly to the master branch, resolving
bug #66387; for now it's living in my Git stash, but I'll attach it in
case anyone's curious.

I remain convinced that the hyphenation language should be a property of
the environment, rather than global, because the hyphenation "mode" that
determines the acceptable locations within a word for hyphenation
breaking is already environmental _and_ the hyphenation mode is only
intelligible when the hyphenation language is known. Please find in a
footnote an illustration of my claim that the hyphenation mode is a
property of the environment.[1]

Dave and Peter's objections seemed to center around changing the
formatter's default behavior in this respect without ensuring that
full-service macro packages were prepared for the change first--a
reasonable demand. In particular, macro packages need to know what they
can expect when they create a new environment.

Here is my proposal:

(1) Extend the syntax of the `ev` request.

Here is the pitch from comment #23 of Savannah #66392.

.ev environment [source-environment]

Since environment names are identifiers, they can't contain spaces, so
they are well-behaved as request arguments.

In the foregoing, the second argument would be honored only if the
first argument doesn't already exist as an environment.
[2026 remark: if the first argument _does_ already exist, what you want
is to switch to it and use the existing `evc` GNU troff extension
request. AT&T troff has no facilities at all for copying environments.]

And this would be backward compatible with the status quo; the default
environment has no name, but that's okay.

If environment "foo" doesn't already exist:

.ev foo

...would work the same before and after the change.

(2) Adapt existing packages to the foregoing change.

Frequently this will mean changing request sequences like this (from our
"s.tmac"):

.ev k
.evc 0
.ev

...to this.

.ev k 0
.ev

Recall that the `ev` request doesn't merely create an environment (if it
doesn't already exist), but switches to it. (In groff ms, environment
`k` is for "keeps".)

(3) Commit the patch for Savannah #66387.

Thoughts? Let the chins be waggy!

Regards,
Branden

[1] I'll illustrate with DWB troff, Plan 9 troff, and groff.

$ cat ATTIC/environmental-hyphenation-mode.roff
.de ww
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback babushka
.do if \n(.g groff .hy=\\n[.hy]
..
.hy 1
This is environment 0
.br
.ww
.br
Switching to environment 1
.br
.ev 1
.hy 4
.ww
.br
Switching to environment 2
.br
.ev 2
.ww
.br
Setting hyphenation mode in environment 2 to 4
.br
.hy 4
.ww
.br
Switching to environment 0
.br
.ev 0
.ww
$ dwb troff -a ATTIC/environmental-hyphenation-mode.roff| cat -s
This is environment 0
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback babush-
ka
Switching to environment 1
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback ba-
bushka
Switching to environment 2
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback babush-
ka
Setting hyphenation mode in environment 2 to 4
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback ba-
bushka
Switching to environment 0
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback babush-
ka
$ 9 troff -a ATTIC/environmental-hyphenation-mode.roff| cat -s
This is environment 0
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback babush-
ka
Switching to environment 1
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback ba-
bushka
Switching to environment 2
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback babush-
ka
Setting hyphenation mode in environment 2 to 4
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback ba-
bushka
Switching to environment 0
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback babush-
ka
$ groff -a ATTIC/environmental-hyphenation-mode.roff| cat -s
<beginning of page>
This is environment 0
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback babush<hy>
ka groff .hy=1
Switching to environment 1
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback
babushka groff .hy=4
Switching to environment 2
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback babush<hy>
ka groff .hy=1
Setting hyphenation mode in environment 2 to 4
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback
babushka groff .hy=4
Switching to environment 0
aardvark abaci aback abacuses abaft abalone abandonment babbling baboon babushka aardvark abaci aback babush<hy>
ka groff .hy=1
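
The `ev` extension proposed in (1) can be modeled outside of troff. What
follows is only a minimal C++ sketch, not groff code: `ToyEnvironment`,
`ToyEnvironmentTable`, and all of their members are hypothetical names,
only two environment properties are modeled, and falling back to
defaults when the named source environment doesn't exist is an
assumption of the sketch (the proposal doesn't say what should happen
in that case).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a formatter environment; only two of the
// many real properties are modeled here.
struct ToyEnvironment {
  unsigned hyphenation_mode = 1;
  std::string language_code;  // empty means "no hyphenation language"
};

// Hypothetical model of the environment dictionary and stack.
class ToyEnvironmentTable {
public:
  ToyEnvironmentTable() : current_("0") { envs_[current_] = ToyEnvironment(); }

  // Models the proposed `.ev name [source]`: `source` is honored only
  // when `name` does not exist yet; either way we switch to `name`.
  void ev(const std::string &name, const std::string &source = "") {
    if (envs_.find(name) == envs_.end()) {
      ToyEnvironment fresh;  // defaults, as with today's `.ev name`
      auto src = envs_.find(source);
      // Assumption: an unknown source silently yields defaults.
      if (!source.empty() && src != envs_.end())
        fresh = src->second;
      envs_[name] = fresh;
    }
    stack_.push_back(current_);
    current_ = name;
  }

  // Models `.ev` without arguments: return to the previous environment.
  void ev_pop() {
    if (!stack_.empty()) {
      current_ = stack_.back();
      stack_.pop_back();
    }
  }

  // Models the existing `.evc source`: copy into the current
  // environment (an unknown source is ignored in this sketch).
  void evc(const std::string &source) {
    auto src = envs_.find(source);
    if (src != envs_.end())
      envs_[current_] = src->second;
  }

  ToyEnvironment &current() { return envs_[current_]; }
  const std::string &current_name() const { return current_; }

private:
  std::map<std::string, ToyEnvironment> envs_;
  std::vector<std::string> stack_;
  std::string current_;
};
```

In this model, `ev("k", "0")` followed by `ev_pop()` plays the role of
the two-request sequence in (2), and calling `ev("k", "0")` a second
time leaves whatever was already stored in `k` untouched.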
diff --git a/ChangeLog b/ChangeLog
index c55103f68..b7280d21e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,46 @@
+2026-03-15  G. Branden Robinson <[email protected]>
+
+	[troff]: Couple the current hyphenation language more tightly
+	with the environment. This is to ease maintenance of
+	multilingual documents, and address a curious situation where an
+	environment could bear an automatic hyphenation code that had
+	nothing to do with the selected hyphenation language, because
+	the `hla` request has to date never altered the environment.
+
+	* src/roff/troff/env.h (class environment): Add private member
+	variable `language_code` of type `symbol`. Declare public
+	member functions `get_language_code()` and
+	`set_language_code()`. Declare `environment_switch()` as a
+	friend function.
+	* src/roff/troff/env.cpp (environment::environment): Plain
+	constructor initializes `language_code` as empty string.
+	(environment::environment): Copy constructor copies the language
+	code from the source object.
+	(environment::copy): Member function backing `evc` request
+	copies the language code from the source environment.
+	(environment::print_env): Report the hyphenation language code
+	in use by the environment. Clarify when the automatic
+	hyphenation mode is ignored because no hyphenation language is
+	configured in the environment.
+	(select_hyphenation_language): When the `hla` request is called
+	without an argument, set the current environment's hyphenation
+	language code to the empty string. With an argument, update the
+	environment's hyphenation to the value of the argument. Add
+	assertions prior to function return to enforce invariants: (1)
+	the current environment's hyphenation language code must not be
+	null and (2) the `current_language` global variable and the
+	current environment's hyphenation language code must agree.
+	(environment_copy, environment_switch): Set the
+	`current_language` global variable to current environment's
+	hyphenation language code.
+	(environment::get_language_code): Implement accessor.
+	(environment::set_language_code): Implement mutator.
+
+	* src/roff/groff/groff.am (groff_XFAIL_TESTS): Drop now-passing
+	"current-language-and-environment-in-sync.sh" test.
+
+	Fixes <https://savannah.gnu.org/bugs/?66387>.
+
 2026-03-15  G. Branden Robinson <[email protected]>
 
 	* bootstrap.conf (gnulib_modules): Drop `putenv`.
diff --git a/src/roff/groff/groff.am b/src/roff/groff/groff.am
index d821dc73d..f1965f3df 100644
--- a/src/roff/groff/groff.am
+++ b/src/roff/groff/groff.am
@@ -144,7 +144,6 @@ EXTRA_DIST += \
   src/roff/groff/tests/artifacts/throughput-file
 
 groff_XFAIL_TESTS = \
-  src/roff/groff/tests/current-language-and-environment-in-sync.sh \
   src/roff/groff/tests/html-device-works-with-grn-and-eqn.sh \
   src/roff/groff/tests/stringup-request-transforms-non-basic-latin.sh
 XFAIL_TESTS += $(groff_XFAIL_TESTS)
diff --git a/src/roff/groff/tests/current-language-and-environment-in-sync.sh b/src/roff/groff/tests/current-language-and-environment-in-sync.sh
index 4777207a9..c223372f3 100755
--- a/src/roff/groff/tests/current-language-and-environment-in-sync.sh
+++ b/src/roff/groff/tests/current-language-and-environment-in-sync.sh
@@ -30,7 +30,8 @@ wail () {
 # Unit-test synchronization between the formatter's "current language"
 # (global) and the hyphenation language code in the current environment.
 #
-# See Savannah #66387 and #66392.
+# See Savannah #66387 and #66392, and comment prior to
+# `environment_copy()` definition in "src/roff/troff/env.cpp".
 
 input='.
 .tm 1 en=\n[.hla]
diff --git a/src/roff/troff/env.cpp b/src/roff/troff/env.cpp
index 173b6f0ba..62b273ce9 100644
--- a/src/roff/troff/env.cpp
+++ b/src/roff/troff/env.cpp
@@ -818,6 +818,7 @@ environment::environment(symbol nm)
   line_number_indent(0),
   line_number_multiple(1),
   no_number_count(0),
+  language_code(""),
   hyphenation_mode(1),
   hyphenation_mode_default(1),
   hyphen_line_count(0),
@@ -912,6 +913,7 @@ environment::environment(const environment *e)
   line_number_indent(e->line_number_indent),
   line_number_multiple(e->line_number_multiple),
   no_number_count(e->no_number_count),
+  language_code(e->language_code),
   hyphenation_mode(e->hyphenation_mode),
   hyphenation_mode_default(e->hyphenation_mode_default),
   hyphen_line_count(0),
@@ -999,6 +1001,7 @@ void environment::copy(const environment *e)
   no_number_count = e->no_number_count;
   tab_char = e->tab_char;
   leader_char = e->leader_char;
+  set_language_code(e->language_code.contents());
   hyphenation_mode = e->hyphenation_mode;
   hyphenation_mode_default = e->hyphenation_mode_default;
   fontno = e->fontno;
@@ -3680,6 +3683,10 @@ void environment::dump()
     errprint(" lines remaining for which to suppress numbering: %1\n",
              no_number_count);
   }
+  const char *hl = language_code.contents();
+  bool is_hyphenation_impossible = language_code.is_empty();
+  errprint(" hyphenation language code: %1\n",
+           is_hyphenation_impossible ? "(none)" : hl);
   string hf = hyphenation_mode ? "on" : "off";
   if (hyphenation_mode & HYPHEN_NOT_LAST_LINE)
     hf += ", not on line before vertical position trap";
@@ -3692,8 +3699,10 @@ void environment::dump()
   if (hyphenation_mode & HYPHEN_NOT_FIRST_CHARS)
     hf += ", not allowed within first two characters";
   hf += '\0';
-  errprint(" hyphenation mode: %1 (%2)\n", hyphenation_mode,
-           hf.contents());
+  errprint(" hyphenation mode: %1 (%2)%3\n", hyphenation_mode,
+           hf.contents(),
+           is_hyphenation_impossible ? " [no hyphenation language]"
+                                     : "");
   errprint(" hyphenation mode default: %1\n",
            hyphenation_mode_default);
   errprint(" count of consecutive hyphenated lines: %1\n",
@@ -3788,6 +3797,7 @@ static void select_hyphenation_language()
 {
   if (!has_arg()) {
     current_language = 0 /* nullptr */;
+    curenv->set_language_code("");
     skip_line();
     return;
   }
@@ -3800,10 +3810,22 @@ static void select_hyphenation_language()
       (void) language_dictionary.lookup(nm,
         static_cast<hyphenation_language *>(current_language));
     }
+    curenv->set_language_code(nm.contents());
   }
+  assert(!(curenv->get_language_code().is_null()));
+  if (current_language != 0 /* nullptr */)
+    assert(strcmp(current_language->name.contents(),
+                  curenv->get_language_code().contents()) == 0);
   skip_line();
 }
 
+// The `environment` class has no visibility into which hyphenation
+// languages are defined; it has only a code that is either empty
+// or must reference a valid one (and since the code uniquely
+// identifies a language, a `const char *` is more ergonomic than a
+// pointer to a type that's not visible in the class's scope). So
+// updating that code is a two-step process.
+
 void environment_copy()
 {
   if (!has_arg()) {
@@ -3817,14 +3839,27 @@ void environment_copy()
   symbol nm = read_long_identifier();
   assert(nm != 0 /* nullptr */);
   e = static_cast<environment *>(env_dictionary.lookup(nm));
-  if (e != 0 /* nullptr */)
+  if (e != 0 /* nullptr */) {
     curenv->copy(e);
+    current_language = static_cast<hyphenation_language *>
+      (language_dictionary.lookup(e->get_language_code()));
+  }
   else
     error("cannot copy from nonexistent environment '%1'",
           nm.contents());
   skip_line();
 }
 
+symbol environment::get_language_code()
+{
+  return language_code;
+}
+
+void environment::set_language_code(const char *lang)
+{
+  language_code = lang;
+}
+
 void environment_switch()
 {
   if (curenv->is_dummy()) {
@@ -3840,6 +3875,8 @@ void environment_switch()
       bool seen_eol = curenv->seen_eol;
       bool suppress_next_eol = curenv->suppress_next_eol;
       curenv = env_stack->env;
+      current_language = static_cast<hyphenation_language *>
+        (language_dictionary.lookup(curenv->language_code));
       curenv->seen_space = seen_space;
       curenv->seen_eol = seen_eol;
       curenv->suppress_next_eol = suppress_next_eol;
@@ -3857,6 +3894,8 @@ void environment_switch()
       }
       env_stack = new env_list_node(curenv, env_stack);
       curenv = e;
+      current_language = static_cast<hyphenation_language *>
+        (language_dictionary.lookup(curenv->language_code));
     }
   }
   skip_line();
diff --git a/src/roff/troff/env.h b/src/roff/troff/env.h
index c2e7935f6..c8552e953 100644
--- a/src/roff/troff/env.h
+++ b/src/roff/troff/env.h
@@ -219,6 +219,7 @@ class environment {
   int line_number_indent; // in digit spaces
   int line_number_multiple;
   int no_number_count;
+  symbol language_code;
   unsigned int hyphenation_mode;
   unsigned int hyphenation_mode_default;
   int hyphen_line_count;
@@ -327,6 +328,8 @@ public:
   hunits get_input_line_position();
   const char *get_tabs();
   int is_using_line_tabs();
+  symbol get_language_code();
+  void set_language_code(const char *);
   unsigned get_hyphenation_mode();
   unsigned get_hyphenation_mode_default();
   int get_hyphen_line_max();
@@ -415,6 +418,7 @@ public:
   friend void number_lines();
   friend void leader_character_request();
   friend void tab_character_request();
+  friend void environment_switch();
   friend void hyphenate_request();
   friend void set_hyphenation_mode_default();
   friend void no_hyphenate();
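
The "two-step" update described in the comment before
`environment_copy()`, and the invariant asserted at the end of
`select_hyphenation_language()`, can be modeled in isolation. This is
only a minimal C++ sketch under stated assumptions, not the patch
itself: `ToyLanguage`, `ToyFormatter`, and all member names are
hypothetical, a language is registered the first time its code is
selected, and `in_sync()` is slightly stricter than the patch's
assertions because it also requires an empty code whenever the global
pointer is null.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a hyphenation language object.
struct ToyLanguage {
  std::string name;
};

// Hypothetical model: each environment stores only a language *code*;
// the formatter keeps one global pointer that is derived from it.
class ToyFormatter {
public:
  ToyFormatter() { env_codes_[curenv_] = ""; }

  // Models `.hla [code]`: no argument clears the language.
  void hla(const std::string &code = "") {
    if (!code.empty())
      languages_.emplace(code, ToyLanguage{code});  // no-op if known
    env_codes_[curenv_] = code;  // step 1: update the environment
    resync();                    // step 2: update the global pointer
  }

  // Models an environment switch; a new environment starts with an
  // empty language code, as in the patch's plain constructor.
  void switch_to(const std::string &env) {
    env_codes_.emplace(env, "");
    curenv_ = env;
    resync();
  }

  const ToyLanguage *current_language() const { return current_language_; }
  const std::string &environment_code() const {
    return env_codes_.at(curenv_);
  }

  // The invariant: global pointer and environment code must agree.
  bool in_sync() const {
    const std::string &code = env_codes_.at(curenv_);
    if (current_language_ == nullptr)
      return code.empty();
    return current_language_->name == code;
  }

private:
  // Look the current environment's code up in the language dictionary.
  void resync() {
    auto it = languages_.find(env_codes_[curenv_]);
    current_language_ = (it == languages_.end()) ? nullptr : &it->second;
  }

  std::map<std::string, ToyLanguage> languages_;
  std::map<std::string, std::string> env_codes_;
  std::string curenv_ = "0";
  const ToyLanguage *current_language_ = nullptr;
};
```

Because every path that changes either the current environment or its
code ends in `resync()`, `in_sync()` holds after each call; that is the
property the formerly XFAIL test is meant to check in the real
formatter.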
