Hi Alex,

At 2025-09-11T09:29:23-0500, G. Branden Robinson wrote:
> At 2025-09-11T10:59:19+0200, Alejandro Colomar wrote:
> > I'm trying hard to have reproducible builds, so that I can verify
> > that my build system produces the same exact thing as long as the
> > tools used with it are the same (or a reasonably similar version).
[...]
> > Is groff(1) just random in some sense?  Would it be possible to
> > remove that randomness from groff(1)?
> 
> I'll need to look at what data structure is being used to house the
> list of file names that get dumped into GNU troff(1)'s output for HTML
> devices.  I suspect that feature was put in as a grohtml(1) debugging
> aid, as there's nothing about it that necessarily couples it to the
> HTML format.  Output for any target device could dump into its grout
> the list of input files that were read during formatting.

I can't account for this behavior.  No container-style data structure is
used; file names are written out to the grout stream synchronously as a
`file_iterator` opens them.  I cannot think of a mechanism by which the
formatter could be opening files for reading in a nondeterministic order
given a consistent input.
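
To make that concrete, here is a minimal sketch of the pattern as I
understand it.  Every name in it (`sketch_output`,
`sketch_file_iterator`, `format_inputs`) is hypothetical, and the
one-name-per-line record format is a placeholder; this is not groff's
code, only an illustration of "write the name when the file is opened,
store nothing for later".

```cpp
// Hypothetical sketch; none of these names exist in groff.
#include <sstream>
#include <string>
#include <vector>

// Stand-in for the output stream that receives file names.
struct sketch_output {
  std::ostringstream text;
  // Emit the name immediately; no container holds it for later.
  void put_filename(const std::string &name) { text << name << '\n'; }
};

// Stand-in for an input iterator whose constructor "opens" a file.
class sketch_file_iterator {
public:
  sketch_file_iterator(const std::string &name, sketch_output &out)
  {
    // The name is written synchronously, as part of construction.
    out.put_filename(name);
  }
};

// "Open" each input in the order given and return what was written.
std::string format_inputs(const std::vector<std::string> &names)
{
  sketch_output out;
  for (const auto &name : names) {
    sketch_file_iterator it(name, out);
  }
  return out.text.str();
}
```

With this shape, the order of names in the output is exactly the order
in which the iterators were constructed, so the same sequence of opens
can only ever produce the same output.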

I'd provide links to groff source via cgit.git.savannah.gnu.org, but
once again the site is under AI DDoS attack and it's nonresponsive or
unbearably slow.

If you have a checkout, the functions you want are the aforementioned
`file_iterator`'s constructor in "src/roff/troff/input.cpp", and
`output_file::really_put_filename()` in "src/roff/troff/node.cpp".

While constructors could be called in a nondeterministic order to
populate global objects of their type at application startup, that
doesn't happen here.  The `file_iterator` type is private to
"input.cpp", and there are no globals of that type.

The only call sites of `file_iterator`'s constructor are:

input_stack::next_file() // called by `next_file()`, .nx handler
do_source() // .so and .soquiet backend
pipe_source_request() // .pso handler
process_macro_package_argument() // `-m` command-line option handler
process_startup_file() // called by main() on file name literals
do_macro_source() // .mso and .msoquiet backend
process_input_file() // called by main() on argv[] elements

> It's possible that the data structure is effectively an unordered map,
> and so is subject to the host system's stochastic and history-dependent
> dynamic memory allocations.  However, I'm not strongly confident about
> that because the output doesn't seem quite random _enough_.
> 
> Anyway, one shouldn't theorize ahead of facts, so I'll check out the
> data structure and see what there is to see.

Yeah, I got this totally wrong.

Worse still, I'm stumped.

Regards,
Branden
