cinap_len...@felloff.net wrote:
 |found it. the server sends Content-Encoding header which causes hget
 |to add a decompression filter, so you get as output a tarball.
 |
 |<- Content-Type: application/x-gzip
 |<- Content-Encoding: gzip
 |
 |this is clearly silly, as the file is already compressed, and decompressing it
 |will not yield the indicated content-type: application/x-gzip, but a tarball.
 |
 |maybe the w3c is wrong, or is ignored in practice or we need to handle gzip
 |specially. the problem is that some webservers compress the

The problem is that IANA doesn't support a tar-gz MIME type, so
that mime.types(5) (tika [1] for Apache) will return "silly"
values, as in

  application/gzip              tgz gz emz
  application/x-bzip2           bz2 tbz2 boz
  # EXTENSION .tbz
  application/x-xz              xz tbz
  application/x-tar             tar

  [1] http://svn.apache.org/viewvc/tika/trunk/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml

 |data, like you request
 |a html file and it gives you gzip back, thats why hget uncompresses.

Having mime.types(5) (re-)evaluate the expanded content seems to be
what IANA has in mind with its decision (it would be all too simple
if it just worked (tm)).
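
A rough sketch of what such a re-evaluation could look like, in
Python. This is only an illustration under my own assumptions, not
what hget does: sniff() and decode_body() are made-up names, and the
magic-byte checks cover just gzip and tar.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def sniff(data: bytes) -> str:
    """Guess a media type from the bytes themselves (hypothetical helper)."""
    if data[:2] == GZIP_MAGIC:
        return "application/gzip"
    # ustar-style tar headers carry the string "ustar" at offset 257
    if data[257:262] == b"ustar":
        return "application/x-tar"
    return "application/octet-stream"


def decode_body(headers: dict, body: bytes):
    """Undo Content-Encoding: gzip if present, then re-evaluate the type.

    The declared Content-Type is deliberately ignored here; the type is
    taken from the expanded content instead.
    """
    encoding = headers.get("Content-Encoding", "").lower()
    if encoding == "gzip" and body[:2] == GZIP_MAGIC:
        body = gzip.decompress(body)
    return sniff(body), body
```

With the two headers quoted above and a .tar.gz body, decode_body()
would hand back the tarball and report it as application/x-tar rather
than the advertised application/x-gzip. Without the Content-Encoding
header the body is left alone and sniffs as application/gzip.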

--steffen
