On 2025-02-19 Paul Eggert wrote:
> So I installed the attached patch instead into Savannah master gzip.
> It means gzip will pay attention to -1..-9, --rsyncable, and
> --synchronous

This sounds reasonable. It unambiguously prevents glitches in scripts.
For example, GZIP=-v may seem innocent, but it also affects --list
output, which some scripts parse. Also, -v undoes -q, and -q affects exit
status. There are safe use cases for GZIP=-v, but losing that feature
shouldn't be a major loss (it would hurt more if -v provided a
real-time progress indicator).

I wonder if -n (--no-name) should be allowed. GZIP=-9n was perhaps the
most common use case I had. It was with pipes, and nowadays -n is the
default for pipes anyway, so it doesn't matter for me.

Hypothetically, someone somewhere might rely on GZIP=-n when
compressing regular files. If -n is silently ignored, a script might
silently start producing .gz files with timestamps and filename
metadata. Anyone who is worried about that has probably already seen
the deprecation warning and modified their script, so the risk should
be low. I don't have any examples of such use cases; I'm just pondering
what kind of tiny risks there could be.
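
For reference, the metadata in question is the MTIME field and the
optional FNAME field of the gzip member header (RFC 1952). Below is a
minimal sketch that uses Python's gzip module as a stand-in for the gzip
tool. It assumes that an empty filename plus mtime=0 approximates what
-n produces, and the helper names are made up for illustration:

```python
import gzip
import io
import struct


def compress(payload, filename, mtime):
    """Hypothetical helper: gzip payload in memory with the given header metadata."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=filename, mode="wb", fileobj=buf,
                       mtime=mtime) as f:
        f.write(payload)
    return buf.getvalue()


def gzip_header_metadata(data):
    """Return (mtime, name) read from a gzip member header (RFC 1952)."""
    flags = data[3]
    mtime = struct.unpack("<I", data[4:8])[0]  # MTIME, little-endian
    name = None
    if flags & 0x08:  # FNAME flag set: zero-terminated name follows
        # Simplification: assumes no FEXTRA field precedes the name.
        end = data.index(b"\0", 10)
        name = data[10:end].decode("latin-1")
    return mtime, name


# Metadata stored, roughly what happens without -n on a regular file.
with_meta = compress(b"hello\n", "report.txt", 1700000000)
# Metadata omitted, roughly what -n gives.
without_meta = compress(b"hello\n", "", 0)

print(gzip_header_metadata(with_meta))
print(gzip_header_metadata(without_meta))
```

A script that compares or hashes .gz output would notice the difference
between the two variants even though the compressed payload is the same.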

Some script might expect the metadata to be stored, and GZIP=-n could
break it. I suspect such use cases are rare nowadays. Preserving the
filename was more important when OSes with short file names were still
in common use. Thus, I think the probability of negative surprises is
low if GZIP=-n is allowed. But I also understand that -n is an option
that affects more than just compression ratio, and that is the reason
why the commit dropped -n from GZIP. I don't have a clear opinion here.

> but will silently ignore everything else in GZIP. The idea is that a
> garbage GZIP value shouldn't disrupt normal operations.

I have mixed feelings about this. I understand that this way garbage in
GZIP cannot break scripts by making gzip fail. Also, if someone has
GZIP=-9n for pipe use, a silently-ignored -n will keep it working like
before, being compatible with both very old and current gzip versions.

On the other hand, it can be confusing that some options that used to
work no longer work, for no apparent reason. One has to look at the
manual to figure it out. Typos won't be caught either.

How about printing a message to stderr if something is ignored in GZIP?
zstd and lz4 do that with their env vars. Even a vague message like "one
or more arguments in the GZIP environment variable were ignored" would
do if a more specific message is cumbersome to implement.
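
A rough sketch of the idea, in Python rather than gzip's C. This is not
how the committed patch works; the function name, the message wording,
and the simplistic handling of bundled short options are all assumptions
made for illustration (option arguments are not handled at all):

```python
import sys

# Options the patch keeps honoring, per the description quoted above:
# -1..-9, --rsyncable, and --synchronous.
ALLOWED_LONG = {"--rsyncable", "--synchronous"}
LEVEL_CHARS = "123456789"


def filter_gzip_env(value, err=None):
    """Hypothetical helper: keep allowed options from a GZIP value and
    warn once if anything was dropped."""
    if err is None:
        err = sys.stderr
    kept, ignored = [], []
    for arg in value.split():
        if arg in ALLOWED_LONG:
            kept.append(arg)
        elif arg.startswith("-") and not arg.startswith("--") and len(arg) > 1:
            # Bundled short options such as -9n: judge each letter alone.
            for ch in arg[1:]:
                (kept if ch in LEVEL_CHARS else ignored).append("-" + ch)
        else:
            ignored.append(arg)
    if ignored:
        print("gzip: warning: ignored in GZIP environment variable: "
              + " ".join(ignored), file=err)
    return kept


if __name__ == "__main__":
    print(filter_gzip_env("-9n --rsyncable -v"))
```

With GZIP="-9n --rsyncable -v" this keeps -9 and --rsyncable and names
-n and -v in a single warning, so a typo would at least be visible.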

Such a message can break some special uses that check if something
was written to stderr, but those should be rare enough (they should
unset GZIP). After all, the current deprecation message about GZIP
usage was considered acceptable to add.

> These options shouldn't cause the bigger glitches (and perhaps even
> security hazards) that behavior-oriented options can cause.

I understand well that if everything was allowed in GZIP and someone
actually used that freedom in a wrong way, bad things could happen. What
I don't understand well is how big the risks are in practice. gzip and a
few other tools have supported unrestricted options from env vars for a
long time, and some still do. I haven't seen any security alerts about
them, but that doesn't mean that they don't or won't exist. I wouldn't
add an unrestricted env var to a new tool, but modifying existing tools
always has backward compatibility concerns.

I'm happy that the important functionality of GZIP is preserved. Thanks!

-- 
Lasse Collin


