On Mon, Apr 13, 2015 at 11:20 PM, Daniel Shahaf <d...@daniel.shahaf.name> wrote:
> rhuij...@apache.org wrote on Mon, Apr 13, 2015 at 14:28:39 -0000: > > - *has_props = (noderev->prop_rep->expanded_size > 4); > > + *has_props = (noderev->prop_rep->expanded_size > 4 > > + || (noderev->prop_rep->expanded_size == 0 > > + && noderev->prop_rep->size > 4)); > > Having to remember, on every use of EXPANDED_SIZE, to check if it's zero > is madness. The pain has been increasing and is now at a level where we should do something about it. > Is there a good reason not to make the deserializer > function handle this? If needed, by inventing a getter or a new struct > member that doesn't have this quirk? > The main problem is that we don't know whether SIZE is correct, either, without looking at the actual representation: It could be a self-delta of an empty fulltext. So, most of these tests are contextual, we e.g. know that these are property reps. Or we can allow for false positives or false negatives etc. One way to get around this problem is to basically use the approach of svn_fs_fs__file_length(), where we compare the contents checksum with with that of an empty rep. We might also restrict that check to "meaningful" values of "SIZE", i.e. those txdelta streams that produce an empty output. Assuming we find a truly reliable test (e.g. say txdelta is always 4 bytes and we can enumerate all MD5s), we could add it to the parser function in low_level.c. Thoughts? -- Stefan^2.