URL: <https://savannah.gnu.org/bugs/?64612>
Summary: consider an environment variable for general resources/inclusions Group: GNU roff Submitter: gbranden Submitted: Wed 30 Aug 2023 02:25:04 PM UTC Category: General Severity: 3 - Normal Item Group: Feature change Status: None Privacy: Public Assigned to: None Open/Closed: Open Discussion Lock: Any Planned Release: None _______________________________________________________ Follow-up Comments: ------------------------------------------------------- Date: Wed 30 Aug 2023 02:25:04 PM UTC By: G. Branden Robinson <gbranden> Spawned off of bug #64577. In that ticket, the submitter observed that a change in _groff_ 1.23.0 to make font description file opening logic reject file names with slashes in them (to avoid directory traversal for security [confidentiality] purposes in the event of untrusted input), broke a user's workflow. In my opinion the root cause of the problem is the _grops_ driver's use of `font::open_file()` to open files that aren't font descriptions. You might call it a...leaky abstraction (take a drink). The problem is work-aroundable by performing some path-fu with the `-F` option or `GROFF_FONT_PATH` environment variable. The downside of those is that they push the cognitive dissonance noted above from the API level (where it's gone unnoticed for decades) up to the user. Not ideal. Let's review the call sites at issue, and how they look in our master branch in Git. _src/devices/grops/ps.cpp_: 785 void ps_printer::define_encoding(const char *encoding, int encoding_index) 786 { 787 char *vec[256]; 788 int i; 789 for (i = 0; i < 256; i++) 790 vec[i] = 0; 791 char *path; 792 FILE *fp = font::open_file(encoding, &path); 793 if (0 /* nullptr */ == fp) { 794 // If errno not valid, assume file rejected due to '/'. 795 if (errno <= 0) 796 fatal("refusing to traverse directories to open PostScript" 797 " encoding file '%1'"); 798 fatal("can't open encoding file '%1'", encoding); 799 } ... What's an encoding file? _grops_(1): A font description file may also contain a directive encoding enc‐file which says that the PostScript font should be reencoded using the encoding described in enc‐file; this file should consist of a sequence of lines of the form pschar code where pschar is the PostScript name of the character, and code is its position in the encoding expressed as a decimal integer; valid values are in the range 0 to 255. ... The encoding file is therefore tightly coupled to a _groff_ font description file, and moreover is of no inherent utility as a resource to be inlined into device-specific output (in this case, PostScript), and I therefore propose no change here. (Incidentally, I wonder if this is a model we might follow when pursuing an old idea of Werner's to decouple terminal output devices from their character set encodings. Dave can maybe help locate the ticket we have about that.) What are the other cases? _src/devices/grops/psrm.cpp_: 306 void resource_manager::output_prolog(ps_output &out) 307 { 308 FILE *outfp = out.get_file(); 309 out.end_line(); 310 char *path; 311 if (!getenv("GROPS_PROLOGUE")) { 312 string e = "GROPS_PROLOGUE"; 313 e += '='; 314 e += GROPS_PROLOGUE; 315 e += '\0'; 316 if (putenv(strsave(e.contents()))) 317 fatal("putenv failed"); 318 } 319 char *prologue = getenv("GROPS_PROLOGUE"); 320 FILE *fp = font::open_file(prologue, &path); 321 if (0 /* nullptr */ == fp) { 322 // If errno not valid, assume file rejected due to '/'. 323 if (errno <= 0) 324 fatal("refusing to traverse directories to open PostScript" 325 " prologue file '%1'"); 326 fatal("failed to open PostScript prologue file '%1': %2", prologue, 327 strerror(errno)); 328 } ... Salient facts here appear to be: 1. A prologue file is _required_, not merely some optional resource. 2. We already ship one. 3. We already have an environment variable for the specific purpose of identifying a user-supplied alternate. So this doesn't seem in urgent need of change either, even if it is a poor fit with other font description-related stuff. 346 void resource_manager::supply_resource(resource *r, int rank, 347 FILE *outfp, int is_document) 348 { 349 if (r->flags & resource::BUSY) { 350 r->name += '\0'; 351 fatal("loop detected in dependency graph for %1 '%2'", 352 resource_table[r->type], 353 r->name.contents()); 354 } 355 r->flags |= resource::BUSY; 356 if (rank > r->rank) 357 r->rank = rank; 358 char *path = 0 /* nullptr */; 359 FILE *fp = 0 /* nullptr */; 360 if (r->filename != 0 /* nullptr */) { 361 if (r->type == RESOURCE_FONT) { 362 fp = font::open_file(r->filename, &path); 363 if (0 /* nullptr */ == fp) { 364 // If errno not valid, assume file rejected due to '/'. 365 if (errno <= 0) 366 error("refusing to traverse directories to open PostScript" 367 " resource file '%1'"); 368 else 369 error("failed to open PostScript resource file '%1': %2", 370 r->filename, strerror(errno)); 371 delete[] r->filename; 372 r->filename = 0 /* nullptr */; 373 } 374 } 375 else { 376 fp = include_search_path.open_file_cautious(r->filename); 377 if (0 /* nullptr */ == fp) { 378 error("can't open '%1': %2", r->filename, strerror(errno)); 379 delete[] r->filename; 380 r->filename = 0 /* nullptr */; 381 } 382 else 383 path = r->filename; 384 } 385 } ... I wonder if we shouldn't just collapse the if-else branches on `(r->type == RESOURCE_FONT)` into one, using the latter, the `include_search_path`. 1090 void resource_manager::read_download_file() 1091 { 1092 char *path = 0 /* nullptr */; 1093 FILE *fp = font::open_file("download", &path); 1094 if (0 /* nullptr */ == fp) 1095 fatal("failed to open 'download' file: %1", strerror(errno)); 1096 char buf[512]; 1097 int lineno = 0; 1098 while (fgets(buf, sizeof(buf), fp)) { 1099 lineno++; 1100 char *p = strtok(buf, " \t\r\n"); 1101 if (p == 0 /* nullptr */ || *p == '#') 1102 continue; 1103 char *q = strtok(0 /* nullptr */, " \t\r\n"); 1104 if (!q) 1105 fatal_with_file_and_line(path, lineno, "file name missing for" 1106 " font '%1'", p); 1107 lookup_font(p)->filename = strsave(q); 1108 } 1109 free(path); 1110 fclose(fp); 1111 } Ah, it's our old bedeviling friend the _download_ file. I think a good case can be made for treating this like the Type 1 font files themselves--in other words, from _grops's_ perspective as an inclusion rather than a font description. (The significance of this distinction is why I have been such a stickler for the terminological distinction between a "font file" and a "font *description* file) in _groff_ 1.23.0 documentation. 1. It needs to be updated to refer to those fonts when any are added/removed from the system. 2. It can live in a place on the file system that has nothing to do per se with groff. Phil Chadwick's system and those of countless Debian users keep Type 1 fonts elsewhere. 3. In principle, it doesn't even need to be _groff_'s job to update such _download_ files when relevant fonts appear on and disappear from the system. One may perceive that we're dovetailing into the _install-font.sh_ problem, a long-standing bugaboo for _groff_ users. But that's just step one: reassigning some files from being resolved via `-F` paths to `-I` paths. Step two is to decide if we want/need an environment variable analog for `-I`, as we do for `-F` in `GROFF_FONT_PATH`. Bike shed time: I'm uncertain whether to call this `GROFF_INCLUDE_PATH` or `GROPS_INCLUDE_PATH`. The case for `GROFF_INCLUDE_PATH` is that we don't need different names for different _groff_ programs; the semantics of `-I` are similar or identical everywhere we support it at all. Also, if _gropdf_ is to behave the same as _grops_ here--and I can't think of any reason it should not, a "generic" variable name is less confusing. On the downside, this means more work to either (1) implement support for this environment variable in places we currently don't or (2) document the discrepancy. On the gripping hand, if `include_search_path.open_file_cautious()` works the way I hope it does, and is diligently used everywhere it should be, there's only one place we'd need to make the change: there. The case for `GROPS_INCLUDE_PATH` is simpler to state. Quicker, easier, directly attacks the problem reported and only that. The downside is that it becomes another piece of driver-specific weirdness, a wart of non-orthogonality that makes _groff_ as a system harder to understand for developers and users alike. I welcome feedback on this disquisition. _______________________________________________________ Reply to this item at: <https://savannah.gnu.org/bugs/?64612> _______________________________________________ Message sent via Savannah https://savannah.gnu.org/