Follow-up Comment #21, bug #66392 (group groff):

Hi Peter & Dave,

At 2025-02-01T12:56:28-0500, Peter Schaffter wrote:
> Why is \n[.hla] not global regardless of ev?  It seems an eminently
> reasonable expectation that a document's hyphenation language will
> apply throughout the whole document.  I can only think of edge cases
> where one might want to switch hla's, e.g. a document in French with a
> formatted blockquote in Italian.

That is the sort of scenario I was thinking of.  But what motivated the
change is the fact that the hyphenation mode itself is not global, but a
property of the environment.  I _think_ this is true all the way back to
Ossanna troff but it's tedious to verify that fact, as the value of the
hyphenation mode is not introspectable except via a GNU extension.
(You'd have to infer it by formatting text and seeing if the placement
of the hyphenation breaks changed from one environment to another, given
comparable inputs.)
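
For reference, the GNU extension mentioned above is the read-only register `\n[.hy]`.  A minimal sketch of using it to compare two environments follows.  The environment name `probe` and the mode value `2` are arbitrary choices for illustration, not anything taken from this bug report.

```roff
.\" Choose a hyphenation mode in environment 0.
.hy 2
.tm env 0: hyphenation mode is \n[.hy]
.\" Push environment 0 and switch to a new environment named "probe".
.ev probe
.tm env probe: hyphenation mode is \n[.hy]
.\" Pop back to the previous environment.
.ev
.tm env 0 again: hyphenation mode is \n[.hy]
```

Formatting this with something like `groff -z` should emit three diagnostic lines on the standard error stream.  If the mode is a property of the environment, the second line need not agree with the first.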

> Having to explicitly instantiate .hla for every .ev that doesn't call
> .evc 0 makes no sense.

I disagree--here's why.  I think a lot of people assume that when they
create a new environment, it's a copy of environment 0 already.  But it
isn't.  It's a copy of the _formatter_'s default environment, meaning,
in practice, it has the attributes that correspond to the way its C++
object was constructed when the formatter started up.  This is an
implementation detail in capital letters.

At a more practical level, that "formatter's default" environment is not
affected by anything that happens in the "troffrc" and "troffrc-end"
files.
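
A minimal sketch of how one might observe this, assuming GNU troff and using the line length as the tell-tale.  The environment name `fresh` and the 4i length are illustrative assumptions only.

```roff
.\" Change an environmental parameter in environment 0.
.ll 4i
.tm env 0: line length is \n[.l]u
.\" Create and switch to a new environment.
.ev fresh
.tm env fresh: line length is \n[.l]u
.\" Copy environment 0's settings into the current environment.
.evc 0
.tm env fresh after evc 0: line length is \n[.l]u
.\" Pop back to environment 0.
.ev
```

If new environments start from the formatter's default rather than from environment 0, as described above, the second message should report a different length from the first, and the third should match the first again.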

To be fair, most of what the _stock_ startup files (and each of the
several files they macro-source) do alters only global state.[1]
"troffrc" itself sets a register, defines (and then removes) some
strings, and sets up blank-line and leading-space traps (for diagnostic
purposes, which "troffrc-end" later removes).

What about the macro files "troffrc" loads?

"composite.tmac" sets up a handful of composite character mappings.
These are global.

"fallbacks.tmac" creates user-defined characters.  Also global.

An output-driver-specific macro file is loaded.  These generally do
things like define more characters.  Some assign hyphenation codes
(global, but one should feel a twitch here[2]).  Some define color names
(global).  "pdf.tmac" defines boatloads of macros (global).

A localization macro file is loaded.  By default, it's the one for
English, but we do encourage sites to alter this if they wish.

The localization macro file itself loads an encoding macro file, which
sets up input character translations (`trin` requests) and more
hyphenation codes (twitch).

The localization macro file then goes on to configure the inter-sentence
spacing amount (environmental), set up a default hyphenation mode
(environmental), and select that mode (environmental).  It sets the
hyphenation language (formerly global, now environmental), and loads
hyphenation pattern files (global, but a separate dictionary for each
hyphenation language code is maintained--so until/unless we support
maintenance of multiple sets of hyphenation patterns for a given
language code,[3] I figure this looks as good as environmental to the
user).

Finally, for convenience, and depending on the output device,
"pdfpic.tmac" or "pspic.tmac" might get loaded.  These do only global
stuff, mainly defining namesake (albeit fully capitalized) macros.

The rug may not be pulled yet, but the dog is tugging at a corner of it.

Here's the rug pull.

Because we advocate site-local customization of "troffrc" and
"troffrc-end", there's simply no way for us to know of or prevent the
user from putting all kinds of environment-altering stuff in them.  They
might choose an adjustment mode.  They might override the line length.
Change the page offset.  Alter the type size.  Here's the output of the
`pev` request from bleeding-edge GNU troff.


Current Environment:
  previous type size: 10p (10000s)
  type size: 10p (10000s)
  previous requested type size: 10000s
  requested type size: 10000s
  valid type size list for selected font: 1000s-10000000s
  previous default family: 'T'
  default family: 'T'
  previous font selection: 1 ('TR')
  font selection: 1 ('TR')
  space size: 12/12 of font space width
  sentence space size: 12/12 of font space width
  previous line length: 468000u
  line length: 468000u
  previous title line length: 468000u
  title line length: 468000u
  previous line interrupted/continued: no
  filling: on
  alignment/adjustment: both
  previous vertical spacing: 12000u
  vertical spacing: 12000u
  previous post-vertical spacing: 0u
  post-vertical spacing: 0u
  previous line spacing: 1
  line spacing: 1
  previous indentation: 0u
  indentation: 0u
  temporary indentation: 0u
  temporary indentation pending: no
  total indentation: 0u
  previous text length: 0u
  target text length: 0u
  input line start: 0u
  computing tab stops from: input line start
  forcing adjustment: no
  hyphenation language code: en
  hyphenation mode: 4 (on, not allowed within last two characters)
  hyphenation mode default: 4
  count of consecutive hyphenated lines: 0
  consecutive hyphenated line count limit: -1 (unlimited)
  hyphenation space: 0u
  hyphenation margin: 0u
Environment 0:
  current


And it's stuff they _won't get_ automatically when creating a new
environment.  Our documentation should probably urge the user more
strongly to, as a rule, `evc 0` when creating an environment.
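
A minimal sketch of that idiom, applied to the blockquote-in-another-language scenario from earlier in this message.  `QS` and `QE` are hypothetical macro names, `quote` is an arbitrary environment name, and the sketch assumes hyphenation patterns for the requested language code have already been loaded by some other means.

```roff
.\" QS and QE are hypothetical macro names for this sketch.
.de QS
.  br
.\" Push the current environment and switch to one named "quote".
.  ev quote
.\" Start from environment 0's settings, not the formatter's defaults.
.  evc 0
.\" Override only what should differ inside the quotation.
.  hla \\$1
.  in +4n
..
.de QE
.  br
.\" Pop back to whichever environment was active before QS.
.  ev
..
```

A document would then bracket the quotation with `.QS it` and `.QE`.  Placing `evc 0` immediately after `ev` means any site-local changes made to environment 0 carry over, and only the deliberate overrides that follow distinguish the new environment.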

All of that said, we _could_ change `ev` to, when creating a new
environment, copy from environment `0` automatically.  (I'm not sure how
we would represent a desire to copy the formatter's default environment
though.  I hope not with yet another new request.)  But that seemed like
a more disruptive and less backward-compatible change.

I think that if people have been creating environments and _not_ using
`evc 0` on them immediately afterward, they've been relying on luck.

At 2025-02-01T14:13:16-0500, Dave wrote:
> Follow-up Comment #19, bug #66392 (group groff):
>
> [comment #18 comment #18:]
>> Why is \n[.hla] not global regardless of ev?
>
> By my reading of bug #66387, the salient sentence is, "Pretty weird to
> pop the environment stack and have the hyphenation mode, but not the
> hyphenation _language_, change."
>
> But perhaps this is something that warrants wider discussion.

That could be; I don't mind.  But the status quo ante did not look to me
like a situation anyone would expect or desire.  Hmm, I do see that I
missed an opportunity to post one of my "trivia challenges" about it to
the list.  ;-)

Regards,
Branden

[1] The stock "troffrc" performs one character translation involving the
    non-breaking space.  Character translations are presently global but
    I have a notion to make those environmental as well, to avoid a
    problem seen in the real world where a set of translations
    temporarily set up happens to be in force when a page break happens,
    corrupting header and/or footer text.  I don't have this work
    scheduled.  Like the item in the next footnote, it will demand major
    surgery to reorganize data structures.

[2] It hadn't occurred to me before now that we might need to house the
    hyphenation code assignments in the environment instead.  Doing so
    will require some significant refactoring, as presently a
    character's hyphenation code is stored in its `charinfo` object, the
    dictionary of which is global.  I have no appetite to add this to my
    groff 1.24 plate.

[3] And I don't know why we would; if we ever need to distinguish
    "en_GB" from "en_US", for example (which really do hyphenate a few
    words differently, I gather), those strings will _be_ the
    hyphenation language codes, and they obviously differ.  Similarly,
    if a user wants to set up their own multiple distinct hyphenation
    configurations for a given language, they'd pick distinct
    identifiers (language codes) for them.



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?66392>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
