Hi Alex,

At 2025-09-11T09:29:23-0500, G. Branden Robinson wrote:
> At 2025-09-11T10:59:19+0200, Alejandro Colomar wrote:
> > I'm trying hard to have reproducible builds, so that I can verify
> > that my build system produces the same exact thing as long as the
> > tools used with it are the same (or a reasonably similar version).
[...]
> > Is groff(1) just random in some sense? Would it be possible to
> > remove that randomness from groff(1)?
>
> I'll need to look at what data structure is being used to house the
> list of file names that get dumped into GNU troff(1)'s output for HTML
> devices. I suspect that feature was put in as a grohtml(1) debugging
> aid, as there's nothing about it that necessarily couples it to the
> HTML format. Output for any target device could dump into its grout
> the list of input files that were read during formatting.

I can't account for this behavior. No container-style data structure is
used; file names are written out to the grout stream synchronously as a
`file_iterator` opens them. I cannot think of a mechanism by which the
formatter could be opening files for reading in a nondeterministic order
given a consistent input.

(I'd provide links to groff source via cgit.git.savannah.gnu.org, but
once again the site is under AI DDoS attack and it's nonresponsive or
unbearably slow. If you have a checkout, the functions you want are the
aforementioned `file_iterator`'s constructor in
"src/roff/troff/input.cpp", and `output_file::really_put_filename()` in
"src/roff/troff/node.cpp".)

While constructors could be called in a nondeterministic order to
populate global objects of their type at application startup, that
doesn't happen here. The `file_iterator` type is private to
"input.cpp", and there are no globals of that type.

The only call sites of `file_iterator`'s constructor are:

  input_stack::next_file() // called by `next_file()`, .nx handler
  do_source() // .so and .soquiet backend
  pipe_source_request() // .pso handler
  process_macro_package_argument() // `-m` command-line option handler
  process_startup_file() // called by main() on file name literals
  do_macro_source() // .mso and .msoquiet backend
  process_input_file() // called by main() on argv[] elements

> It's possible that the data structure is effectively an unordered map,
> and so is subject to the host system's stochastic and history-dependent
> dynamic memory allocations. However, I'm not strongly confident about
> that because the output doesn't seem quite random _enough_.
>
> Anyway, one shouldn't theorize ahead of facts, so I'll check out the
> data structure and see what there is to see.

Yeah, I got this totally wrong. Worse still, I'm stumped.

Regards,
Branden