Hi Steffen,

> > The closest is
> > http://www.iana.org/assignments/media-types/application/gzip which
> > leaves the tar for the user to fathom out if left to MIME types
> > alone.
>
> [http://svn.apache.org/viewvc/tika/trunk/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml]
> lists the extension tgz regulary thereunder
>
>   <mime-type type="application/gzip">
>     <_comment>Gzip Compressed Archive</_comment>
>     <alias type="application/x-gzip"/>
>     <alias type="application/x-gunzip"/>
>     <alias type="application/gzip-compressed"/>
>     <alias type="application/gzipped"/>
>     <alias type="application/gzip-compressed"/>
>     <alias type="application/x-gzip-compressed"/>
>     <alias type="gzip/document"/>
>     <magic priority="45">
>       <match value="\037\213" type="string" offset="0" />
>       <match value="\x1f\x8b" type="string" offset="0" />
>     </magic>
>     <glob pattern="*.tgz" />
>     <glob pattern="*.gz" />
>     <glob pattern="*-gz" />
>     <glob pattern="*.emz" />
>   </mime-type>
Interesting, but the comment seems wrong, since foo.gz need not be an
archive, i.e. a collection of other things, and it's lumping foo.gz and
foo.tgz together, so there's no indication that one of them is a tar
file.  MIME types just don't seem to allow layering.

That file also has a couple of tar types.

    <mime-type type="application/x-tar">
      <magic priority="40">
        <!-- POSIX tar archive -->
        <match value="ustar\0" type="string" offset="257" />
      </magic>
      <glob pattern="*.tar"/>
    </mime-type>

    <mime-type type="application/x-gtar">
      <_comment>GNU tar Compressed File Archive (GNU Tape Archive)</_comment>
      <magic priority="50">
        <!-- GNU tar archive -->
        <match value="ustar \0" type="string" offset="257" />
      </magic>
      <glob pattern="*.gtar"/>
      <sub-class-of type="application/x-tar"/>
    </mime-type>

The second's comment suggests it's compressed, but then it wants `ustar'
at offset 257, which can only match uncompressed data.  It's a mess.

HTTP can ship an application/x-tar with a Content-Encoding of gzip, but
that's not the same thing: the client is then expected to ungzip it and
deliver the tar, which is not what's wanted.

This seems to be getting off-groff topic.  :-)

Cheers, Ralph.
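
For illustration, the layering that a single MIME type can't express can
be checked by hand with the same magic values quoted above.  This is only
a rough Python sketch, assuming the whole file is already in memory;
`sniff_layers' is a hypothetical helper, not anything from Tika or a MIME
library, and it matches just the common `ustar' prefix rather than
telling POSIX and GNU tar apart.

```python
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"   # same bytes as "\037\213" in the XML above
USTAR_MAGIC = b"ustar"     # prefix shared by the POSIX and GNU tar magics
USTAR_OFFSET = 257         # offset of the magic in the first tar header


def sniff_layers(data):
    """Return the layers found in data, outermost first."""
    layers = []
    if data[:2] == GZIP_MAGIC:
        layers.append("gzip")
        # The tar magic is only visible after ungzipping; the first
        # 512-byte header block is enough for the inner check.
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as gz:
            data = gz.read(512)
    if data[USTAR_OFFSET:USTAR_OFFSET + len(USTAR_MAGIC)] == USTAR_MAGIC:
        layers.append("tar")
    return layers
```

A foo.tgz would come back as both layers and a plain foo.gz as just the
outer one, which is the distinction application/gzip alone loses.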