The feature of zip we want is the index, which lets us seek to a
position in the bundle and start unpacking, given just the filename.
How hard would it actually be to create a data structure for the same
purpose for a tar.xz or so? I don't really know anything about the
decompression algorithms, so I can't tell whether we could do something
like
- seek to position N in bundle
- set state to X, if applicable
- uncompress, skip M bytes
-- you get your file contents, L bytes long
Or so (rough sketch below). Yes, it'd be a new file format, I guess, at
least as far as I can tell. Maybe it's worth it.
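To make that concrete, here's a rough sketch (in Python; the names, the
record layout, and the assumption that each member starts its own xz
stream are all hypothetical) of the kind of per-file index lookup I
mean:

# Each entry records where the member's stream starts in the bundle,
# how many decompressed bytes to skip, and how long the file is.
# Because every member is assumed to start a fresh stream, there is no
# decompressor state to restore.
from collections import namedtuple
import lzma

IndexEntry = namedtuple("IndexEntry", "offset skip length")

def read_member(bundle_path, index, name):
    entry = index[name]              # filename -> IndexEntry
    with open(bundle_path, "rb") as f:
        f.seek(entry.offset)         # seek to position N in the bundle
        # Reading to EOF is wasteful -- a real format would know the
        # block size -- but the decompressor stops at the end of the
        # first stream it sees.
        data = lzma.LZMADecompressor().decompress(f.read())
    return data[entry.skip:entry.skip + entry.length]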
Axel
On 2/27/14, 1:30 AM, Andreas Gal wrote:
Could we compress major parts of omni.ja en bloc? We could, for example,
stick all the JS we load at startup into a zip with zero compression and
then compress that into an outer zip. I think we already support nested
containers like that. Assuming your math is correct, even without adding
LZMA2, just sticking with zip we should get better compression and
likely better load times. Wdyt?
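Roughly this, as a Python sketch (file names made up; the real
packaging would of course happen in the build system):

import zipfile

# Inner zip: ZIP_STORED, i.e. zero compression, so the startup JS sits
# in one contiguous blob.
with zipfile.ZipFile("startup.zip", "w", zipfile.ZIP_STORED) as inner:
    for js in ["a.js", "b.js"]:      # whatever JS we load at startup
        inner.write(js)

# Outer container: the inner zip becomes a single deflated entry, so
# the whole blob compresses (and decompresses) in one go.
with zipfile.ZipFile("outer.ja", "w", zipfile.ZIP_DEFLATED) as outer:
    outer.write("startup.zip")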
Andreas
On Feb 27, 2014, at 12:25 AM, Mike Hommey <m...@glandium.org> wrote:
On Wed, Feb 26, 2014 at 08:56:37PM +0100, Andreas Gal wrote:
This randomly reminds me that it might be time to review zip as our
compression format for omni.ja.
ls -l omni.ja
7862939
ls -l omni.tar.xz (tar and then xz -z)
4814416
LZMA2 is available as a public domain implementation. It uses a bit
more memory than zip, but it's still within reason (the default level 6
is around 1MB to decode, I believe). A fairly easy way to use it would
be to add support for a custom compression format in our version of
libjar.
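Just to illustrate the shape of it (Python's lzma bindings here, purely
as a stand-in for whatever libjar would actually call into):

import lzma

# Per-entry LZMA2 (xz) compression, the way a custom libjar compression
# format could do it.  Preset 6 is the default level mentioned above.
def compress_entry(data):
    return lzma.compress(data, format=lzma.FORMAT_XZ, preset=6)

def decompress_entry(blob):
    return lzma.decompress(blob, format=lzma.FORMAT_XZ)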
IIRC, it's also slower both to compress and to decompress. Note you're
comparing apples with oranges, too: jars use per-file compression, while
tar.xz is per-archive compression.
This is what I get:
$ stat -c %s ../omni.ja
8609399
$ unzip -q ../omni.ja
$ find -type f -not -name '*.xz' | while read f; do
>   a=$(stat -c %s "$f"); xz --keep -z "$f"; b=$(stat -c %s "$f.xz")
>   if [ "$a" -lt "$b" ]; then rm "$f.xz"; else rm "$f"; fi
> done
# The above compresses each file individually and keeps either the
# decompressed file or the compressed file, depending on which is
# smaller, which is essentially what we do when creating omni.ja.
$ find -type f | while read f; do stat -c %s "$f"; done | awk '{t+=$1} END {print t}'
# Sum all file sizes, excluding directories that du would add.
7535827
That is, obviously, without jar headers.
$ unzip -lv ../omni.ja 2>/dev/null | tail -1
27696753 8260243 70% 2068 files
$ echo $((8609399 - 8260243))
349156
Thus, that same omni.ja that is 8609399 bytes would, with per-file xz
compression, be 7535827 + 349156 = 7884983 bytes. Not much of a win, and
I doubt it's worth it considering the runtime implications.
However, there is probably room for improvement on the installer side.
Mike
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform