On 26 June 2014 19:08, Stefan Fuhrmann <stefan.fuhrm...@wandisco.com> wrote: > > > On Thu, Jun 26, 2014 at 11:52 AM, Ivan Zhakov <i...@visualsvn.com> wrote: >> >> On 25 June 2014 22:11, Daniel Shahaf <d...@daniel.shahaf.name> wrote: >> > Ivan Zhakov wrote on Wed, Jun 25, 2014 at 20:03:11 +0400: >> >> On 25 June 2014 19:54, Stefan Fuhrmann <stefan.fuhrm...@wandisco.com> >> >> wrote: >> >> > On Wed, Jun 25, 2014 at 5:46 PM, Ivan Zhakov <i...@visualsvn.com> >> >> > wrote: >> >> >> >> >> >> On 25 June 2014 19:24, Stefan Fuhrmann >> >> >> <stefan.fuhrm...@wandisco.com> >> >> >> wrote: >> >> >> > On Wed, Jun 25, 2014 at 4:57 PM, Ivan Zhakov <i...@visualsvn.com> >> >> >> > wrote: >> >> >> >> >> >> >> >> On 21 June 2014 17:15, <stef...@apache.org> wrote: >> >> >> >> > Author: stefan2 >> >> >> >> > Date: Sat Jun 21 15:15:30 2014 >> >> >> >> > New Revision: 1604421 >> >> >> >> > >> >> >> >> > URL: http://svn.apache.org/r1604421 >> >> >> >> > Log: >> >> >> >> > Append index data to the rev data file instead of using >> >> >> >> > separate >> >> >> >> > files. > > >> >> >> >> >> > > >> >> >> >> > > > >============================================================================== >> >> >> >> >> >--- subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c (original) >> >> >> >> >+++ subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c Sat Jun 21 >> >> >> >> > 15:15:30 >> >> >> >> > 2014 >> >> >> >> >@@ -1449,15 +1449,35 @@ write_revision_zero(svn_fs_t *fs) >> >> >> >> >+ SVN_ERR(svn_io_file_create_binary(path_revision_zero, >> >> >> >> >+ "PLAIN\nEND\nENDREP\n" >> >> >> >> >+ "id: 0.0.r0/2\n" >> >> >> >> >+ "type: dir\n" >> >> >> >> >+ "count: 0\n" >> >> >> >> >+ "text: 0 3 4 4 " >> >> >> >> >+ "2d2977d1c96f487abe4a1e202dd03b4e\n" >> >> >> >> >+ "cpath: /\n" >> >> >> >> >+ "\n\n" >> >> >> >> >+ >> >> >> >> >+ /* L2P index */ >> >> >> >> >+ "\0\x80\x40" /* rev 0, 8k entries >> >> >> >> > per page >> >> >> >> > */ >> >> >> >> >+ "\1\1\1" /* 1 rev, 1 page, 1 >> >> >> >> > page in >> >> >> >> > 1st >> >> >> >> > rev */ >> >> >> >> >+ "\6\4" /* page size: bytes, >> >> >> >> > count */ >> >> >> >> >+ "\0\xd6\1\xb1\1\x21" /* phys offsets + 1 */ >> >> >> >> >+ >> >> >> >> >+ /* P2L index */ >> >> >> >> >+ "\0\x6b" /* start rev, rev file >> >> >> >> > size >> >> >> >> > */ >> >> >> >> >+ "\x80\x80\4\1\x1D" /* 64k pages, 1 page >> >> >> >> > using 29 >> >> >> >> > bytes */ >> >> >> >> >+ "\0" /* offset entry 0 page >> >> >> >> > 1 */ >> >> >> >> >+ /* len, item & type, >> >> >> >> > rev, >> >> >> >> > checksum */ >> >> >> >> >+ "\x11\x34\0\xf5\xd6\x8c\x81\x06" >> >> >> >> >+ "\x59\x09\0\xc8\xfc\xf6\x81\x04" >> >> >> >> >+ "\1\x0d\0\x9d\x9e\xa9\x94\x0f" >> >> >> >> >+ "\x95\xff\3\x1b\0\0" /* last entry fills up >> >> >> >> > 64k >> >> >> >> > page >> >> >> >> > */ >> >> >> >> >+ >> >> >> >> >+ /* Footer */ >> >> >> >> >+ "107 121\7", >> >> >> >> >+ 107 + 14 + 38 + 7 + 1, fs->pool)); >> >> >> >> This constant makes code unreadable, unmaintainable and very >> >> >> >> error >> >> >> >> prone. >> > ... >> >> I saw it, but it doesn't make code easier to maintain. >> > >> > Ivan, that's the third time in as many emails that you say "the new code >> > is hard to maintain". Could you please explain *how* you find it hard >> > to maintain? >> > >> > Anyway, that's just me guessing. Could you please clarify what exactly >> > makes the code unmaintainable in your opinion? >> > >> In my opinion 'code maintainability' is how easy or hard to make >> change and review the code. This magic constant with a lot 7b encoded >> numbers are very hard to review and modify if needed in future. Even >> Stefan2 mentioned that he made slight mistakes when changing that >> constant that was caught by test suite. But I don't assume that code >> that passes all test suite doesn't have bugs, while Stefan2 seems to >> assume. I bet that nobody reviewed "\x11\x34\0\xf5\xd6\x8c\x81\x06" >> constant in the code above. Even I didn't, because I think it's a >> waste of time and proper code should be committed or change reverted. > > > Hi Ivan, > > I see three alternative ways to code that function > > 1. As hard coded string / byte sequence (current implementation). > Cons: > * Hard to write, hard to review by just looking at it (applies to time > until initial release only). > Pros: > * Explicitly coded as constant, deterring people from changing it. > * Independent of other code, i.e. unintended changes to the format / > encoding generated by the normal code usually become apparent > when running the test suite. > > 2. Use our txn code to write r0. This should be simple and might > at most require some special ID handling. > Cons: > * Generating incompatible r0s is likely not caught by our test suite > (assuming that reader and writer functions are in sync). Basically > all the risk that comes with dynamically generating a "constant". > * Test cases must make sure our backward compat repo creation > options create repos that can actually be used by old releases. > (we might want explicit test for that anyway, though) > Pros: > * No or very little special code for r0 (not sure, yet). > * Format / encoding changes don't require new r0 templates. > > 3. Write code to "stitch" r0 together, e.g. string_add(md5("END\n")). > Cons: > * No more robust than 1. but requires much more code. > * May _look_ easy to understand but an actual offline review is still hard. > Pros: > * Widely independent of other code, although not as much as 1. > > Do you generally agree with the range of options and their assessment? Yes, I generally agree with range of options. The only other I have is do not implement log addressing in first place.
> Which one would you pick and why? > It's hard to pick option without looking to code, but I would start with leaving string constant for revision body and then appending indexes data using API. I.e. somewhat modified (2). -- Ivan Zhakov