On Sun, 2002-12-22 at 04:13, Martin Vermeer wrote:
> On Sat, Dec 21, 2002 at 10:51:42PM +1030, Darren Freeman spake thusly:
>  
> ...
>  
> > But as for eliminating symbols altogether by more intelligent use of
> > defaults, I'd say: go for it! That will shave off some data.
> >
> > But before making the statement definitive, I had another go. I removed
> > what I thought looked like redundant info, from the already smaller
> > file. I only touched the tables. I shaved off a further 2.3 kB. But upon
> > compressing it, I actually added 26 bytes. I have no definitive
> > explanation for why, but gzip obviously couldn't take advantage of
> > patterns that were present before.
>  
> I find this a bit surprising. Especially if it would be typical, which
> I have difficulty believing.

Well, somebody needs to try it again on a real document. And it needs to
be a real document dominated by tables and other XML-intensive stuff.
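
For whoever tries it, here is a rough sketch of the kind of measurement I
mean. It is only a sketch: the table is synthetic, the element names
(table, row, cell) are made up for illustration, and the attribute names
are just the ones mentioned in this thread, not a real document. It
compares how much is saved by dropping redundant attributes before and
after gzip.

```python
import gzip

def sizes(text):
    """Return (uncompressed bytes, gzip-compressed bytes) for a document."""
    raw = text.encode("utf-8")
    return len(raw), len(gzip.compress(raw, compresslevel=9))

def make_table(rows, cols, verbose):
    """Build a synthetic table (hypothetical markup, for illustration only).

    With verbose=True every cell repeats the same attributes; with
    verbose=False the cells carry no attributes at all.
    """
    attrs = (
        ' align="center" valign="top" topline="true" bottomline="false"'
        ' leftline="true" rightline="false" usebox="none" width="0pt"'
    ) if verbose else ""
    lines = ["<table>"]
    for r in range(rows):
        lines.append("<row>")
        for c in range(cols):
            lines.append(f"<cell{attrs}>r{r}c{c}</cell>")
        lines.append("</row>")
    lines.append("</table>")
    return "\n".join(lines)

verbose_raw, verbose_gz = sizes(make_table(20, 5, verbose=True))
terse_raw, terse_gz = sizes(make_table(20, 5, verbose=False))

print("uncompressed saving:", verbose_raw - terse_raw, "bytes")
print("compressed saving:  ", verbose_gz - terse_gz, "bytes")
```

A synthetic table like this is far more repetitive than a real document,
so the numbers only show the direction of the effect. To test a real file,
read it in and pass its text to sizes() before and after stripping.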

> > So in this very simple example, there was hardly any gain in the
> > compressed file despite putting in work to shave off up to 5 kB from the
> > original. It would be very handy if the person coding file I/O could
> > come up with some real tests on publicly available documents to show
> > how big a saving is to be had by reducing the bloat, if it's going to be
> > compressed anyway.
>  
> > Have fun,
> > Darren
> 
> As I see it, the default values should be the column values for align,
> leftline and rightline, and the row values for valign, topline and
> bottomline (the latter is apparently not done, or at least no
> advantage taken of it IIUC). It would require some coding I suppose,
> but that alone would eliminate most of these attributes from the cells.

Yeah, I would expect that clever defaults would be a good thing, at least
for readability if not for the compressed file.
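
To make the idea concrete, here is a minimal sketch of the scheme as I
read it: a cell inherits align, leftline and rightline from its column,
and valign, topline and bottomline from its row, so a cell only needs to
store what differs. The function names and the dict representation are my
own assumptions for illustration, not anything in the actual file I/O
code.

```python
# Attributes a cell would inherit from its column and from its row,
# following the split proposed in the quoted message.
COLUMN_KEYS = ("align", "leftline", "rightline")
ROW_KEYS = ("valign", "topline", "bottomline")

def resolve(cell, row, column):
    """Return the effective attributes of a cell after applying defaults."""
    effective = {k: column[k] for k in COLUMN_KEYS if k in column}
    effective.update({k: row[k] for k in ROW_KEYS if k in row})
    effective.update(cell)  # explicit cell attributes always win
    return effective

def strip_redundant(cell, row, column):
    """Drop cell attributes that merely repeat the inherited value."""
    inherited = resolve({}, row, column)
    return {k: v for k, v in cell.items() if inherited.get(k) != v}

column = {"align": "center", "leftline": "true", "rightline": "false"}
row = {"valign": "top", "topline": "true", "bottomline": "false"}
cell = {"align": "center", "valign": "top", "topline": "false"}

slim = strip_redundant(cell, row, column)
print(slim)
```

Reading the file back would call resolve() on the slimmed cell, which
should give the same effective attributes as the original cell.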

> For width and usebox, the defaults should clearly be 0pt and none. And
> I agree that abbreviating the attribute names and values is not a good
> idea after all.
> 
> And the metric we should use for success is what it does to the
> *uncompressed* file. IMHO.

I say that the metric should be a combination of readability of the
uncompressed file, and size of the compressed file. I don't have a
suggestion of how to quantify that =)

But clearly the compressed file is pretty insensitive to the sorts of
changes we are talking about, so that leaves readability as the key
point. Unless people want to keep uncompressed copies lying around =(

> Martin

Have fun,
Darren
