Hi,

On Thu, 2025-02-06 at 10:20 -0800, Paul Eggert wrote:
> On 2025-02-05 07:46, Eduard Stefes wrote:
>
> > We think that the only
> > possibility for a problem here, is if faulty hardware updates
> > param->cf incorrect.
>
> I'd rather not worry about hardware going bad, unless the hardware
> failure is a known real-world problem. So how about if we make the
> following further patch?
>
> diff --git a/dfltcc.c b/dfltcc.c
> index 9f86581..a360ce2 100644
> --- a/dfltcc.c
> +++ b/dfltcc.c
> @@ -372,7 +372,7 @@ dfltcc_deflate (int pack_level)
>            /* Read the input data.  */
>            if (inptr == insize)
>              {
> -              if (fill_inbuf (1) == EOF && !param->cf)
> +              if (fill_inbuf (1) == EOF)
>                  break;
>                inptr = 0;
>              }
>

We really need to check the cf (continuation flag) before we can
return. Here is a quote [1] from the spec:

> ... when one, indicates the operation is partially complete ...

If we do not wait for the hardware to lower the flag, we risk
corrupting and overwriting parts of the output buffer.

> > I read rfc1952 and there is no explicit statement saying that
> > incomplete gz files *are corrupt or invalid*.
>
> My reading is that the RFC does not specify any behavior for
> incomplete gz files. However, gzip goes beyond what the RFC requires,
> and is supposed to complain "unexpected end of file" for incomplete
> files, versus "invalid compressed data--format violated" for files
> that are not prefixes of valid compressed files. We should maintain
> that behavior as it is valuable information to give to the user.
>
> I installed the patch to dfltcc.c that you suggested in
> <https://debbugs.gnu.org/cgi/bugreport.cgi?bug=75924#5>, superseding
> the incorrect patch I installed a couple of days ago. However, I'm
> worried about the patch to test/hufts that you also suggested, as it
> would mean gzip's diagnostics would differ on s390. As I understand
> it, test/hufts checks for invalid data not unexpected EOF, so for
> these tests gzip should say "invalid compressed data" rather than
> "unexpected end of file", which means that there's still a bug in the
> s390 port that relates to which diagnostic to give in this case.
>
> Am I understanding this correctly?

This is a tricky one. The hardware will not start emitting proper OESCs
(Operation-Ending Supplemental Codes) until a complete Huffman tree has
been built. Until then we have to rely on the CC (Condition Code).
However, DFLTCC_CC_OP2_TOO_SHORT and DFLTCC_CC_OP2_CORRUPT are
overloaded [1], so the CC alone cannot tell the two cases apart.

I came up with an ugly hack: on EOF we could check whether
bytes_out == 0 and, if so, return 2 (invalid data). If we have already
written some bytes, we continue with the EOF error code.
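
To make that concrete, here is a minimal sketch of the check. It is
not the actual patch: the function name, the enum, and the eof_result
parameter are made up for illustration, and only bytes_out and the
return value 2 come from the description above.

```c
/* Sketch only: these names are hypothetical and not taken from dfltcc.c.  */
enum { SKETCH_INVALID_DATA = 2 };  /* "return 2 (invalid data)" */

/* Pick the result to report when the input hits EOF.
   bytes_out  - number of bytes already written to the output.
   eof_result - whatever code the existing EOF path would return
                (passed in, because its value is not assumed here).  */
static int
sketch_result_on_eof (unsigned long long bytes_out, int eof_result)
{
  /* Nothing written yet: per the reasoning in this mail, parsing of
     the initial Huffman tree failed, so report invalid data.  */
  if (bytes_out == 0)
    return SKETCH_INVALID_DATA;

  /* Some output already exists: keep reporting the EOF error.  */
  return eof_result;
}
```

For example, with an assumed EOF code of 1, sketch_result_on_eof (0, 1)
yields 2 and sketch_result_on_eof (4096, 1) yields 1.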

This new *if-branch* would only be hit in this special case, where
parsing of the initial Huffman tree failed on the hardware. It would
*not* affect the situation where we have corrupt data *after* the
initial Huffman tree was parsed by the hardware, because that is
covered here [2].

This makes hufts pass and does not affect any other test case. I will
do more exhaustive tests just to be sure that nothing else is affected.

[1] dfltcc.c:42
[2] dfltcc.c:457-464

-- 
Eduard Stefes <eduard.ste...@ibm.com>