The feature of zip we want is the index, which lets us seek to a position in the bundle and start unpacking, given just the filename.

How hard would it actually be to create a data structure serving the same purpose for a tar.xz or similar? I don't really know anything about the decompression algorithms, so I don't know whether we could do something like:
- seek to position N in the bundle
- set the decompressor state to X, if applicable
- decompress, skipping M bytes
-- you get your file contents, L bytes long

Or so. Yes, it'd be a new file format, I guess, at least as far as I can tell? Maybe it's worth it.
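
For what it's worth, the .xz container format already keeps an index of its
blocks, so if the stream is cut into multiple blocks you can at least seek to
a block boundary and start decompressing there. A rough way to poke at this,
with made-up file names:

$ tar -cf omni.tar omni/            # hypothetical unpacked omni.ja contents
$ xz --block-size=1MiB omni.tar     # force multiple blocks so the index is useful
$ xz --list -vv omni.tar.xz         # the per-block listing shows compressed and
                                    # uncompressed offsets for each block
# Per-file random access would still need a filename -> block/offset table on
# top of that, which is presumably where a new format would come in.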

Axel

On 2/27/14, 1:30 AM, Andreas Gal wrote:
Could we compress major parts of omni.ja en bloc? We could, for example, stick 
all the JS we load at startup into a zip with zero compression and then compress 
that into an outer zip. I think we already support nested containers like that. 
Assuming your math is correct, even without adding LZMA2, just sticking with zip 
we should get better compression and likely better load times. Wdyt?
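
A quick way to sketch that with plain zip tools (file names made up):

$ zip -0 -r startup.zip js/       # inner zip: everything stored, no compression
$ zip -9 outer.zip startup.zip    # outer zip: deflate the inner archive as one blob
$ unzip -lv outer.zip             # compare against the per-file deflate in omni.ja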

Andreas

On Feb 27, 2014, at 12:25 AM, Mike Hommey <m...@glandium.org> wrote:

On Wed, Feb 26, 2014 at 08:56:37PM +0100, Andreas Gal wrote:
This randomly reminds me that it might be time to review zip as our
compression format for omni.ja.

ls -l omni.ja

7862939

ls -l omni.tar.xz (tar and then xz -z)

4814416

LZMA2 is available as a public domain implementation. It uses a bit
more memory than zip, but it's still within reason (the default level 6
is around 1MB to decode, I believe). A fairly easy way to use it would
be to add support for a custom compression format in our version of
libjar.
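
As a sanity check on those memory numbers, xz itself can report the
decompressor memory a given .xz file needs (the test file name here is
arbitrary):

$ xz -6 --keep somefile           # compress at the default preset
$ xz --list -vv somefile.xz       # the "Memory needed" line shows decoder memory
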
IIRC, it's also slower both to compress and decompress. Note you're
comparing apples with oranges, too: jars use per-file compression,
while tar.xz is per-archive compression.
This is what I get:

$ stat -c %s ../omni.ja
8609399

$ unzip -q ../omni.ja
$ find . -type f -not -name '*.xz' | while read f; do \
    a=$(stat -c %s "$f"); xz --keep -z "$f"; b=$(stat -c %s "$f.xz"); \
    if [ "$a" -lt "$b" ]; then rm "$f.xz"; else rm "$f"; fi; \
  done
# The above compresses each file individually and keeps either the
# uncompressed file or the compressed file, whichever is smaller,
# which is essentially what we do when creating omni.ja

$ find . -type f | while read f; do stat -c %s "$f"; done | awk '{t+=$1} END {print t}'
# Sum all the file sizes, excluding the directory entries that du would count.
7535827

That is, obviously, without jar headers.
$ unzip -lv ../omni.ja 2>/dev/null | tail -1
27696753          8260243  70%                            2068 files
$ echo $((8609399 - 8260243))
349156
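
Adding that same header overhead to the per-file xz total from above:

$ echo $((7535827 + 349156))
7884983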

Thus, that same omni.ja that is 8609399 bytes would be 7884983 bytes with xz
compression. Not much of a win, and I doubt it's worth it considering the
runtime implications.

However, there is probably room for improvement on the installer side.

Mike

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
