> In fact, a group of my friends coined the term "Microsoft Compression"
> to refer to the amazing phenomenon of saving the same Powerpoint
> document over and over again until the file size is as small as
> possible. We have observed that it incrementally grows with each save,
> until about 3 times the smallest size, then jumping back to the smallest
> size. So if your file won't fit on floppy, do that until it does =)

This is not due to compression at all. M$ Office doesn't really compress its data. In fact, it inflates it! Here's how:

All non-XML M$ Office documents are saved as OLE compound files. "OLE compound file" is just another name for a supposedly lightweight but quite s***ty filesystem maintained inside a single file (much like mounting a filesystem image through the loop device on Linux). Office documents are small filesystems with all their drawbacks (and benefits where applicable ;-): filesystems usually have a significant amount of "slack" -- unused free space. What M$ Office does is defragment and downsize that "filesystem" when the amount of slack exceeds some threshold (either fixed or a multiple of the used space).

> What shits me about MC is that when you exceed your disc quota at uni,
> Powerpoint silently fux up your document with no hints. So you could
> lose your work (happened to me just before a presentation once).

That's because, like everyday filesystems, OLE compound documents (which M$ Office treats as filesystems) may be kept mounted. While they are mounted, they can also grow dynamically (when you add data to them). That's the reason why Office apps:

1. lock the files while using them (an otherwise completely unnecessary thing)
2. may grow the files up to a certain limit -- the free space reclamation algorithm in the M$ OLE compound file implementation is pretty trivial and doesn't do a good job at all
3. will fux the files if they cannot grow them -- because they keep the "filesystem" mounted and treat it as a real filesystem of unlimited size, they don't try to verify whether it can be grown, and they add/change data in the file even when you're not saving it.

This is somewhat simplified, but it's sad and true.

Cheers,
Kuba Ober
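
P.S. To make the grow-then-shrink pattern concrete, here is a toy Python sketch. It is not the real OLE compound file code and reads no actual Office files. The class name `ToyCompoundFile`, the 512-byte sector size, the rule that each save appends a fresh copy, and the 2x slack threshold are all assumptions, picked only so that the file peaks at about 3 times its smallest size as in the quoted observation.

```python
SECTOR = 512  # assumed sector size in bytes (illustrative only)


class ToyCompoundFile:
    """Toy model of a file-backed 'filesystem' that compacts only past a slack threshold."""

    def __init__(self, slack_ratio=2.0):
        self.live = 0    # sectors holding the current copy of the document
        self.slack = 0   # sectors holding stale, not yet reclaimed data
        self.slack_ratio = slack_ratio  # hypothetical threshold: slack vs. used space

    def size(self):
        """File size on disk in bytes: live data plus slack."""
        return (self.live + self.slack) * SECTOR

    def save(self, doc_sectors):
        # Assumption: the old copy is not overwritten; it becomes slack
        # and the new copy is appended, so the file grows.
        self.slack += self.live
        self.live = doc_sectors
        # Compact ("defragment and downsize") only once slack exceeds
        # a multiple of the used space.
        if self.slack > self.slack_ratio * self.live:
            self.slack = 0


def simulate(saves, doc_sectors=100):
    """Return the file size in sectors after each of `saves` consecutive saves."""
    f = ToyCompoundFile()
    sizes = []
    for _ in range(saves):
        f.save(doc_sectors)
        sizes.append(f.size() // SECTOR)
    return sizes
```

With these assumed numbers, `simulate(7)` returns `[100, 200, 300, 100, 200, 300, 100]`: the file grows with each save, tops out at 3 times the smallest size, and then drops back once the slack crosses the threshold.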