Vincent Lefevre wrote:
> On 2013-03-05 13:30:28 +0000, Julian Foad wrote:
>> Vincent Lefevre wrote:
>> > On 2013-03-01 14:58:10 +0000, Philip Martin wrote:
>> >> A server-side solution is difficult.  Suppose the client has some
>> >> uncompressed content U which it compresses to C and sends to the server.
>> >> The server can uncompress C to get U but unless the compression scheme
>> >> has a canonical compressed form, with no other forms allowed, the server
>> >> cannot avoid storing C because there is no guarantee that C can be
>> >> reconstructed from U.
>> >
>> > This is not specific to server side. Even on the client side, the
>> > reconstruction may not be always possible, e.g. if the system is
>> > upgraded or if NFS is used. And the compression level may need to
>> > be detected or provided in some way.
>>
>> Hi Vincent.  I'm not sure you understood Philip's point.
>
> This should be more clear about what I meant below. What I'm saying is
> that whether this is done entirely on the server side (a bad solution,
> IMHO) or on the client side (see below why), the problems are similar.

The point Philip made is *not* a problem if done client-side; some of
the *other* problems are similar no matter on which side we would do
the expansion/compression.

>> His point is (correct me if I'm wrong) that Subversion's design
>> requires that during a checkout or update, the server must
>> reconstruct a file containing exactly the same bit pattern that the
>> client sent when committing the file.  Compression schemes in
>> general don't guarantee that expanding and then compressing will
>> produce the same compressed bit pattern, even if you take care to
>> use the same "compression level".  Therefore, the server cannot
>> simply expand the data before storing it and then re-compress it
>> during checkout or update, because, although the resulting
>> compressed file would be a valid representation of the user's data,
>> it would not satisfy Subversion's own requirement that the bit
>> pattern be identical to what was sent by the client during the
>> commit.
>
> You say that the server expands the data before storing it. This is
> for a server-side only solution, I assume.

Yes, I'm talking about the server-side-only solution, which is one of
the hypothetical solutions that we are discussing and comparing.

> But even if there would
> be no problems with the construction/reconstruction, it would be a
> bad solution, IMHO. Indeed, for a commit, it is the client that is
> supposed to expand the data before sending the diff to the server,

What do you mean "the client [...] is supposed to expand the data"?  I
don't understand why you think the client is "supposed" to do such a
thing.

> and for an update, it is the client that is supposed to recompress
> the data before storing it to the WC. Actually, the server doesn't
> need to know how the file was compressed, it just needs to record
> information about the compression (but doesn't need to know what
> this means exactly).
>
>> That point _is_ specific to a server-side solution.  With a
>> client-side solution, the user's word processor may not mind if a
>> versioning operation such as a commit (through a decompressing
>> plug-in) followed by checkout (through a re-compressing plug-in)
>> changes the bit pattern of the compressed file, so long as the
>> uncompressed content that it represents is unchanged.
>
> I disagree.

It's not clear what you disagree with.

> The word processor may not mind (in theory, because
> in practice, one may have bugs that depend on the bit pattern,
> and it would be bad to expose the user to such kind of bugs and
> non-deterministic behavior), but for the user this may be important.
> For instance, a different bit pattern will break a possible signature
> on the compressed file.

I agree that it *may* be important for the user, but the users have
control so they can use this client-side scheme in scenarios where it
works for them and not use it in other scenarios.

- Julian
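
P.S. To make Philip's point about the lack of a canonical compressed
form concrete, here is a minimal Python sketch. It assumes zlib as the
compression scheme and uses made-up content; neither comes from this
thread, and the names U, C_fast and C_best are purely illustrative.

```python
import zlib

# Hypothetical stand-in for a user's uncompressed content U.
U = b"Subversion must return exactly the bytes the client committed. " * 50

# Two valid compressed forms of the same content. The levels 1 and 9
# are arbitrary choices for this illustration.
C_fast = zlib.compress(U, 1)
C_best = zlib.compress(U, 9)

# Both forms expand to the same U ...
same_content = zlib.decompress(C_fast) == U and zlib.decompress(C_best) == U

# ... but the compressed bit patterns differ, so C cannot be rebuilt
# from U alone without knowing exactly how it was produced.
same_bits = C_fast == C_best

print("same uncompressed content:", same_content)
print("same compressed bytes:    ", same_bits)
```

Matching the compression level is necessary but may still not be
sufficient in general, since a different implementation or library
version is free to choose a different valid encoding of the same data.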