Hello, I would agree: fail fast and be strict by default. That should also help us get fewer (fuzzing-discovered) DoS security reports, and it would prevent file type confusion, which is a very real attack, especially for archives.
Bernd
--
http://bernd.eckenfels.net
________________________________
From: Gary Gregory <garydgreg...@gmail.com>
Sent: Friday, June 4, 2021 10:32:22 PM
To: Commons Developers List <dev@commons.apache.org>
Subject: Re: [compress] 7z and Recovering Corrupt Archives

In general, I think fail fast is ok with a clear exception message.

Gary

On Fri, Jun 4, 2021, 15:44 Stefan Bodewig <bode...@apache.org> wrote:

> Hi all
>
> 7z archives provide CRCs for the metadata section so you can quickly
> identify a wide range of broken archives - which is far better than what
> you get for ZIP for example.
>
> It is possible to recover from a certain type of broken archive. A case
> where the archive has been written almost completely and just the CRC
> and the locator of metadata are missing. The docs talk about
> disks/drives being removed prematurely.
>
> The basic idea is to search backwards from the end of the file for the
> metadata and try to parse it. This is what SevenZFile does and has
> always done. This is the root cause of
> https://issues.apache.org/jira/browse/COMPRESS-542 - the file ends with
> something that looks like metadata of an archive with lots and lots of
> files in it and the allocation of arrays leads to an OOM.
>
> Current master will detect corrupt archives more quickly - in particular
> without excessive allocations - but still it may take quite some time to
> reject thousands of candidates of "this could be the first byte of
> proper meta data". We are scanning the last megabyte of the file and
> there is ample chance this last megabyte may contain random noise that
> looks promising.
>
> Personally I believe that almost nobody actually needs this mode of
> recovery.
>
> Therefore I've thought we might want to introduce an option that enables
> the recovery mode. If it was disabled and we found the CRC was missing
> we'd throw a new specific exception that says "you may want to try with
> recovery enabled instead".
>
> Making this new option default to disabling recovery would break
> backwards compatibility but it is tempting to think this could be
> fine. I'm a bit torn here. What do you think?
>
> Stefan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
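
For reference, the opt-in behaviour discussed in this thread could look roughly like the Java sketch below. It is only an illustration under stated assumptions, not SevenZFile's actual code: the class RecoverySketch, the exception RecoveryDisabledException, the method locateMetadata, and its tryToRecover flag are all hypothetical names, and the marker bytes 0x01 and 0x17 are assumed stand-ins for "this could be the first byte of proper meta data". A negative declaredOffset stands in for "the CRC and the locator are missing".

```java
import java.io.IOException;

final class RecoverySketch {

    /** Hypothetical exception: the locator is missing and recovery was not requested. */
    static final class RecoveryDisabledException extends IOException {
        RecoveryDisabledException(String message) {
            super(message);
        }
    }

    /** Only the tail of the file is scanned; the thread mentions the last megabyte. */
    static final int SCAN_WINDOW = 1024 * 1024;

    /** Assumption: a metadata block starts with one of these marker bytes. */
    static boolean looksLikeMetadataStart(byte b) {
        return b == 0x01 || b == 0x17;
    }

    /**
     * Returns the offset of the metadata. A negative declaredOffset means the
     * locator was never written, which is the broken-archive case from the thread.
     */
    static int locateMetadata(byte[] archive, int declaredOffset, boolean tryToRecover)
            throws IOException {
        if (declaredOffset >= 0) {
            return declaredOffset; // intact archive, nothing to recover
        }
        if (!tryToRecover) {
            // Fail fast with a message that points the caller at the option.
            throw new RecoveryDisabledException(
                "Metadata locator is missing; you may want to try with recovery enabled instead");
        }
        // Recovery mode: search backwards from the end of the file.
        int minPos = Math.max(0, archive.length - SCAN_WINDOW);
        for (int pos = archive.length - 1; pos >= minPos; pos--) {
            if (looksLikeMetadataStart(archive[pos])) {
                // A real reader would try to parse here and keep scanning on failure.
                return pos;
            }
        }
        throw new IOException("No metadata candidate found in the last " + SCAN_WINDOW + " bytes");
    }

    public static void main(String[] args) throws IOException {
        byte[] truncated = {0x00, 0x17, 0x00, 0x01, 0x00};
        System.out.println(locateMetadata(truncated, -1, true));
    }
}
```

With tryToRecover set to false, the same call fails immediately with the specific exception instead of scanning, which is the strict default being argued for above.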