On Sun, 2002-12-22 at 00:27, John Levon wrote:
> On Sat, Dec 21, 2002 at 10:51:42PM +1030, Darren Freeman wrote:
> 
> > I agree in principle, but would like to see tests to show which activity
> > is actually more productive. Remember that shrinking the symbol names
> 
> Why would we do that ?

Because I did very basic tests to show that implementing compression as
a standard will have a much, much larger impact on file size than
stuffing around with the symbols in the XML source. In fact, removing
some symbols increased the file size slightly, for reasons unknown.

Since my test was way too simple to prove that this is the case on a real
document, somebody could jump in and do it for that case.
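
For anyone who wants to repeat the comparison, a minimal sketch in Python
might look like the following. It is not the test described above: the
attribute list in DEFAULT_ATTRS and the helper names make_table and sizes
are made up for illustration and are not the real LyX tabular syntax. It
only shows how to put raw and zlib-compressed sizes side by side for a
"verbose" and a "trimmed" table.

```python
import zlib

# Hypothetical default attributes, for illustration only.
# This is NOT the actual LyX tabular syntax.
DEFAULT_ATTRS = ('multicolumn="0" alignment="center" valignment="top" '
                 'topline="false" bottomline="false" leftline="false" '
                 'rightline="false" rotate="false" usebox="none"')


def make_table(n_cells, with_defaults):
    """Build a synthetic table body, with or without default attributes."""
    tag = f"<cell {DEFAULT_ATTRS}>" if with_defaults else "<cell>"
    parts = []
    for i in range(n_cells):
        parts.append(
            f"{tag}\n\\begin_inset Text\n\ncell {i}\n\\end_inset\n</cell>\n"
        )
    return "".join(parts).encode("utf-8")


def sizes(data):
    """Return (raw size, zlib-compressed size) in bytes."""
    return len(data), len(zlib.compress(data, 9))


if __name__ == "__main__":
    for label, flag in (("verbose", True), ("trimmed", False)):
        raw, packed = sizes(make_table(500, flag))
        print(f"{label}: raw={raw} zlib={packed}")
```

A synthetic table like this is far more repetitive than a real document,
so the numbers it prints say little about real files. To settle the
question, feed the bytes of an actual .lyx file to sizes() instead.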

Well, either that or just argue until we all get sick of the idea and
achieve nothing =)

> > shouldn't have much of an effect on the gzip file size.. You'd have to
> > actually start trimming away symbols to get a real benefit.
> 
> Exactly. And with the advent of lyx2lyx the final argument against not
> outputting default values for tabulars is gone.

Like I said before, if it's really easy then go right ahead. But there
is an even bigger benefit to using zlib. And since that sounds pretty
easy to somebody with the familiarity to do it, I would like to see
compression used first.

> Just removing the default params from a large table can easily halve the
> file size.

Did you try it *with* compression? Because I did, but on a small dummy
document. And it didn't help.

> Then consider that there's no reason :
> 
> \end_inset
> </cell>
> <cell>
> \begin_inset Text
> 
> can't be :
> 
> </cell><cell>

Sure. That would be a clever thing to do, right after enabling
compression =)

> > So I would say this: apart from obvious shortening of bloated symbols,
> > leave them readable (and compatible!). As long as gzipping becomes the
> > standard, that's a good thing since it's a tiny penalty for a large
> > gain.
> 
> Frankly, I consider it a hack.

Why? Are we concerned with the size of a gzipped file or the original??

What is the goal?

The representation of the user's data is the goal, as I see it.

We can come up with some ultra-optimised code that only LyX can read, in
which the information rate is nearly optimal. Or we can make it more
readable like a normal text file, but let the information rate go much
lower, like 5 %.

Fortunately this thing called source coding allows us to do both: create
an easy to read text file, then store it with an efficiency close to 100
%. It's not a hack at all, it's called information theory. I see no
reason not to use it here. It's just like creating a water-tight file
format which can be converted to human-readable form, only done the
other way around.

Create a human readable form and then compress it into a water tight
format. This format can then be manipulated using standard tools, like
gunzip and a text editor. What could be more elegant?
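
As a rough sketch of that round trip, assuming Python's standard gzip
module stands in for whatever zlib calls LyX would actually use (the
function names pack and unpack are hypothetical):

```python
import gzip


def pack(text):
    """Compress readable text into gzip-format bytes."""
    return gzip.compress(text.encode("utf-8"))


def unpack(blob):
    """Recover the original readable text from gzip-format bytes."""
    return gzip.decompress(blob).decode("utf-8")


if __name__ == "__main__":
    readable = "<cell>\n\\begin_inset Text\n\nhello\n\\end_inset\n</cell>\n" * 100
    blob = pack(readable)
    # The round trip is lossless: the text comes back unchanged.
    assert unpack(blob) == readable
    print(f"readable: {len(readable.encode('utf-8'))} bytes, packed: {len(blob)} bytes")
```

Bytes produced this way are in the ordinary gzip format, so once written
to a file they can be unpacked with gunzip and opened in any text editor.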

> john

Have fun, and merry xmas,
Darren
