Hi,

I’m finding that commons-compress-1.26.1 recognises a UTF-16 text file as a 
tar archive, unlike the previous version.

The code below, from ArchiveStreamFactory.detect(final InputStream in), 
changed in that release and is where the detection now differs:

if (signatureLength >= TAR_HEADER_SIZE) {
    try (TarArchiveInputStream inputStream = new TarArchiveInputStream(new ByteArrayInputStream(tarHeader))) {
        // COMPRESS-191 - verify the header checksum
        // COMPRESS-644 - do not allow zero byte file entries
        TarArchiveEntry entry = inputStream.getNextEntry();
        // try to find the first non-directory entry within the first 10 entries.
        int count = 0;
        while (entry != null && entry.isDirectory() && count++ < TAR_TEST_ENTRY_COUNT) {
            entry = inputStream.getNextEntry();
        }
        if (entry != null && entry.isCheckSumOK() && !entry.isDirectory() && entry.getSize() > 0 || count > 0) {
            return TAR;
        }
    } catch (final Exception e) { // NOPMD NOSONAR
        // can generate IllegalArgumentException as well as IOException
        // auto-detection, simply not a TAR
        // ignored
    }
}

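I feel this is too lenient.  For our test file, entry is null and count is 1 
when the last “if” statement is reached.  Because && binds more tightly than 
||, the “|| count > 0” alternative is then enough on its own to return TAR, 
with no checksum check at all.  The comment says the code is looking for the 
first non-directory entry, but in our case it never found one.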
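
To make the precedence point concrete, here is a minimal, self-contained 
sketch that models the final condition with plain values instead of real 
TarArchiveEntry objects.  The class TarDetectCondition and its methods are 
hypothetical names of mine, not part of Commons Compress, and "stricter" is 
only one possible tightening for comparison, not a proposed patch.

```java
// Hypothetical model of the final "if" in detect(); not Commons Compress code.
final class TarDetectCondition {

    // Mirrors the shape of the 1.26.1 condition. && binds tighter than ||, so this is
    // (entryFound && checksumOk && !directory && size > 0) || (count > 0).
    static boolean current(boolean entryFound, boolean checksumOk, boolean directory,
                           long size, int count) {
        return entryFound && checksumOk && !directory && size > 0 || count > 0;
    }

    // One possible tightening (an assumption, for illustration only):
    // always require a checksum-verified, non-empty, non-directory entry.
    static boolean stricter(boolean entryFound, boolean checksumOk, boolean directory,
                            long size) {
        return entryFound && checksumOk && !directory && size > 0;
    }

    public static void main(String[] args) {
        // Values observed for the UTF-16 test file: entry == null, count == 1.
        System.out.println(current(false, false, false, 0L, 1)); // true
        System.out.println(stricter(false, false, false, 0L));   // false
    }
}
```

Note that the stricter variant would also reject input whose first entries 
are all directories, so it only shows the difference in behaviour.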

The earlier code at least checked that the checksum was OK for the one entry 
it examined (it isn’t for our test file).
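
For reference, this is roughly what a header checksum check involves.  The 
sketch below follows the usual tar convention as I understand it: the 
checksum field sits at offset 148, is 8 bytes wide, holds an octal number, 
and is counted as spaces when summing the 512 header bytes.  
TarChecksumSketch and all of its methods are hypothetical; this is not the 
Commons Compress implementation, and it only handles the unsigned sum.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; not the Commons Compress implementation.
final class TarChecksumSketch {
    static final int HEADER_SIZE = 512;   // size of one tar header block
    static final int CHKSUM_OFFSET = 148; // offset of the checksum field
    static final int CHKSUM_LENGTH = 8;   // width of the checksum field

    // Unsigned sum of the header bytes, counting the checksum field as spaces.
    static long computeChecksum(byte[] header) {
        long sum = 0;
        for (int i = 0; i < HEADER_SIZE; i++) {
            boolean inField = i >= CHKSUM_OFFSET && i < CHKSUM_OFFSET + CHKSUM_LENGTH;
            sum += inField ? ' ' : (header[i] & 0xFF);
        }
        return sum;
    }

    // Parses the stored octal checksum; returns -1 if the field holds no valid number.
    static long storedChecksum(byte[] header) {
        long value = 0;
        boolean seenDigit = false;
        for (int i = CHKSUM_OFFSET; i < CHKSUM_OFFSET + CHKSUM_LENGTH; i++) {
            byte b = header[i];
            if (b >= '0' && b <= '7') {
                value = value * 8 + (b - '0');
                seenDigit = true;
            } else if (seenDigit) {
                break;     // any non-digit after the digits ends the field
            } else if (b != ' ' && b != 0) {
                return -1; // something other than padding before any digit
            }
        }
        return seenDigit ? value : -1;
    }

    static boolean isChecksumOk(byte[] header) {
        return header.length >= HEADER_SIZE && storedChecksum(header) == computeChecksum(header);
    }

    // Builds a header block containing only a name and a matching checksum.
    static byte[] minimalHeader(String name) {
        byte[] header = new byte[HEADER_SIZE];
        byte[] nameBytes = name.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(nameBytes, 0, header, 0, nameBytes.length);
        byte[] digits = String.format("%06o", computeChecksum(header)).getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(digits, 0, header, CHKSUM_OFFSET, digits.length);
        header[CHKSUM_OFFSET + 6] = 0;   // NUL terminator
        header[CHKSUM_OFFSET + 7] = ' '; // trailing space
        return header;
    }

    // First 512 bytes of some UTF-16LE encoded text, standing in for the test file.
    static byte[] utf16TextBlock() {
        StringBuilder text = new StringBuilder();
        while (text.length() < HEADER_SIZE) {
            text.append("Just some plain text, nothing else. ");
        }
        return Arrays.copyOf(text.toString().getBytes(StandardCharsets.UTF_16LE), HEADER_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(isChecksumOk(utf16TextBlock()));       // false
        System.out.println(isChecksumOk(minimalHeader("a.txt"))); // true
    }
}
```

The sample text here is made up and is not our actual test file; the point 
is only that arbitrary UTF-16 text does not pass a check of this kind.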

Regards,
Gren

Gren Elliot
Senior Software Engineer


