I would like to propose a strengthening of the security posture
of the ZIP file implementation.

The java.util.zip implementation is, according to the package docs,
based on the Info-ZIP specification [1], which itself claims to be based
on PKWARE's appnote.txt [2]. The latter is probably considered the
"official" specification of the ZIP file format by most. [3]

Over the years these have drifted apart in a crucial area:
the naming of files within the ZIP.


Info-ZIP:
      The name of the file, with optional relative path.
      The path stored should not contain a drive or
      device letter, or a leading slash.  All slashes
      should be forward slashes '/' as opposed to
      backwards slashes '\' for compatibility with Amiga
      and Unix file systems etc.


PKWARE:
      The name of the file, with optional relative path.
      The path stored MUST NOT contain a drive or
      device letter, or a leading slash.  All slashes
      MUST be forward slashes '/' as opposed to
      backwards slashes '\' for compatibility with Amiga
      and UNIX file systems etc.


The difference is that PKWARE now says "MUST" and "MUST NOT" where
Info-ZIP says "should" and "should not".
I have verified that the PKWARE 'appnote' did in the past use
the same text as the Info-ZIP specification. I don't know exactly
when it changed.

The JDK implementation allows backslashes and even allows
a leading slash. In fact, I don't think it has any restrictions at all.
Interestingly, if you create a ZIP using the JDK with entry
names with backslashes in them and then use the Info-ZIP 'unzip'
utility to list or unzip it, the utility will produce a warning but
will still accept the file and be able to extract it. So yes, even
Info-ZIP believes this to be a violation of sorts, at least important
enough to flag.
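
To make the above concrete, here is a minimal sketch of what I mean.
The class name LenientZipNames and the helper roundTrip are made up for
this mail; only the java.util.zip calls are real. It writes a few
questionable entry names with ZipOutputStream and reads them back with
ZipInputStream. As far as I can tell neither side objects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of the JDK.
public class LenientZipNames {

    // Writes one tiny entry per name into an in-memory ZIP, then returns
    // the entry names as reported when the same bytes are read back.
    static List<String> roundTrip(String... names) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ZipOutputStream zos = new ZipOutputStream(buf)) {
                for (String name : names) {
                    zos.putNextEntry(new ZipEntry(name)); // no name validation here
                    zos.write('x');
                    zos.closeEntry();
                }
            }
            List<String> read = new ArrayList<>();
            try (ZipInputStream zis = new ZipInputStream(
                    new ByteArrayInputStream(buf.toByteArray()))) {
                for (ZipEntry e; (e = zis.getNextEntry()) != null; ) {
                    read.add(e.getName()); // name is handed back as stored
                }
            }
            return read;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Backslashes, a leading slash and a parent-directory prefix.
        System.out.println(roundTrip("dir\\file.txt", "/abs.txt", "../up.txt"));
    }
}
```

I would expect the names to come back exactly as they were written,
which is the "accept anything" behaviour I am describing.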

I've gone through the javadocs and I cannot find any mention
that the implementation is "forgiving" to the extent
it is. I understand the (historical?) need to be able to READ zip files
which are not fully compliant. However, in retrospect it would have
been nice if the JDK had been a lot less forgiving when CREATING zip
files. But alas, it is too late to change that now. It just seems like
a massive footgun to hand to the users of the JDK.

There is also a security angle: spoofing file names in ZIP files
is a common technique. Some implementations take precautionary
steps against this. For example, the Windows Explorer ZIP reader
simply will not show entries which start with ".." or ".".
Well done, I would say. It is of course unfair to compare a library
(the JDK) to an end-user tool like Windows Explorer, as the
objectives are different. However, can we fault a user of the JDK
for expecting the entry names returned from the ZipFile class
(i.e. when READING) to be compliant ZIP file names?

Bottom line: the subtle point that the JDK's
implementation is based off a very old spec from Info-ZIP is likely
to be lost on most users. Now that the "official" spec (PKWARE's) has
become unambiguous on file naming (although I wish they had also
mentioned that starting the file name with "./" or "../" is illegal),
I believe the JDK's javadocs should at least have something to say on
the topic of ZIP entry naming and the architectural choices made
in the implementation (accept anything).

So that is my suggestion: a "strengthening" of the Javadoc. I'll be happy
to propose the text. A more thorough approach would be to create new
methods that validate entry names and possibly deprecate the existing
ones. I just thought I would propose the easiest solution first: javadoc.
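
For the more thorough approach, the following is only a rough sketch of
what such a check could look like. ZipEntryNames and violation() are
hypothetical names and not an existing or proposed JDK API. The rules
are my reading of the PKWARE text quoted above, plus the "." and ".."
check which is my own wish and not in the spec. The drive letter test
is a simple heuristic.

```java
import java.util.Optional;

// Hypothetical helper, not part of the JDK.
public class ZipEntryNames {

    // Returns a short description of the first problem found in the
    // entry name, or an empty Optional if no rule below is violated.
    static Optional<String> violation(String name) {
        if (name.isEmpty()) {
            return Optional.of("empty name");
        }
        if (name.indexOf('\\') >= 0) {
            return Optional.of("contains a backslash");
        }
        if (name.startsWith("/")) {
            return Optional.of("leading slash");
        }
        // Heuristic: a single letter followed by ':' looks like "C:".
        if (name.length() >= 2
                && Character.isLetter(name.charAt(0))
                && name.charAt(1) == ':') {
            return Optional.of("drive letter");
        }
        // Not required by the PKWARE text; this is the extra rule I
        // would like to see.
        for (String segment : name.split("/")) {
            if (segment.equals(".") || segment.equals("..")) {
                return Optional.of("'.' or '..' path segment");
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        for (String n : new String[] {
                "docs/readme.txt", "dir\\file.txt", "/abs.txt", "../up.txt"}) {
            System.out.println(n + " -> " + violation(n).orElse("ok"));
        }
    }
}
```

A check like this could be applied both when creating entries and when
reading names from an archive that came from somewhere else.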

Thoughts?

/Lars


[1]: http://www.info-zip.org/doc/appnote-19970311-iz.zip
[2]: https://pkwaredownloads.blob.core.windows.net/pem/APPNOTE.txt
[3]: The only attempt that I know of by a standardization body
at formalizing the ZIP spec is ISO/IEC 21320-1. That spec is based off
PKWARE's Zip File Format Specification version 6.3.3.
[4]:
https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/zip/ZipEntry.html#%3Cinit%3E(java.lang.String)
