Hi, obliterate fans!

For FS-BDB, the core "obliterate" operation follows this sequence, described in terms of obliterating one or more nodes within existing revision r10:
1. "begin": start a new txn that is a mutable copy of r10's txn.

2. modify the new txn as desired - e.g. delete '/d/foo/c'

3. "commit": replace r10's txn with the new txn.

4. remove the old txn to reclaim space

(I'm not going to talk in this message about fixing up references from later revs; let's assume that we have already obliterated all references before running this sequence.)

So far, I have implemented the core parts [*1] of "begin" and "commit". The effect is to replace one revision's worth of history with a copy of itself that is identical except for having a new txn-id; the old txn is not yet removed from the database.

To watch this in action, my "obliterate_tests.py" dumps the state of the repository into a text file before and after the operation:

[[[
$ svn-py-test obliterate --no-cleanup --verbose --fs-type=bdb
## ignore the "parsed wrongly" messages [*2]
]]]

(svn-py-test is a script of mine that runs the specified test.)

In another window I diff the two dumps to see what has changed. The result looks like this (shown using GNU diff; really I use vimdiff for a side-by-side view):

[[[
$ (cd obj-dir/subversion/tests/cmdline/svn-test-work/working_copies/obliterate_tests-1/ && \
   diff -U2 -p {before,after}.dump/all.bdb)
--- before.dump/all.bdb 2009-12-11 14:04:52.873840736 +0000
+++ after.dump/all.bdb 2009-12-11 14:04:53.177814628 +0000
@@ -10,5 +10,5 @@ revisions:
   (revision 8)
   (revision 9)
-  (revision a)
+  (revision c)
   (revision b)

@@ -38,6 +38,8 @@ transactions:
   b
   (committed 0.0.b 11 (svn:date 2009-12-11T14:04:52.650534Z svn:author jrandom svn:log '') ())
-  next-key
   c
+  (committed 0.0.c 10 (svn:date 2009-12-11T14:04:52.372379Z svn:author jrandom svn:log Rev to be obliterated) ())
+  next-key
+  d

 changes:
@@ -180,4 +182,6 @@ nodes:
   0.0.b
   ((dir / 0.0.a 11 0 0) 7 v)
+  0.0.c
+  ((dir / 0.0.9 10 0 0) 7 16)
   1.0.1
   ((dir /f-add '' 0 0 0) '' '')
@@ -186,4 +190,6 @@ nodes:
   1.0.b
   ((dir /f-add 1.0.a 2 0 0) '' w)
+  1.0.c
+  ((dir /f-add 1.0.1 1 0 0) '' 1c)
   2.0.1
   ((dir /f-del '' 0 0 0) '' '')
@@ -192,4 +198,6 @@ nodes:
   2.0.a
   ((dir /f-del 2.0.9 2 0 0) '' n)
+  2.0.c
+  ((dir /f-del 2.0.9 2 0 0) '' n)
   3.0.1
   ((dir /f-mod '' 0 0 0) '' '')
@@ -200,4 +208,6 @@ nodes:
   3.0.b
   ((dir /f-mod 3.0.a 3 0 0) '' 10)
+  3.0.c
+  ((dir /f-mod 3.0.9 2 0 0) '' 18)
   4.0.1
   ((dir /f-mov '' 0 0 0) '' '')
@@ -208,4 +218,6 @@ nodes:
   4.0.b
   ((dir /f-mov 4.0.a 3 0 0) '' 12)
+  4.0.c
+  ((dir /f-mov 4.0.9 2 0 0) '' 19)
   5.0.1
   ((dir /f-rpl '' 0 0 0) '' '')
@@ -216,4 +228,6 @@ nodes:
   5.0.b
   ((dir /f-rpl 5.0.a 3 0 0) '' y)
+  5.0.c
+  ((dir /f-rpl 5.0.9 2 0 0) '' 15)
   6.0.9
   ((file /f-del/F '' 0 0 0) a k)
@@ -224,4 +238,6 @@ nodes:
   7.0.b
   ((file /f-mod/F 7.0.a 2 0 0) c 11)
+  7.0.c
+  ((file /f-mod/F 7.0.9 1 0 0) c 17)
   8.0.9
   ((file /f-mov/E '' 0 0 0) e j)
@@ -230,4 +246,6 @@ nodes:
   8.1.b
   ((file /f-mov/F 8.1.a 2 0 0) e 13)
+  8.1.c
+  ((file /f-mov/F 8.0.9 1 0 0) e 1a)
   9.0.9
   ((file /f-rpl/F '' 0 0 0) g h)
@@ -236,8 +254,12 @@ nodes:
   a.0.b
   ((file /f-add/F a.0.a 1 0 0) g x)
+  a.0.c
+  ((file /f-add/F '' 0 0 0) g 1b)
   b.0.a
   ((file /f-rpl/F '' 0 0 0) g r)
   b.0.b
   ((file /f-rpl/F b.0.a 1 0 0) g z)
+  b.0.c
+  ((file /f-rpl/F '' 0 0 0) g 14)
   next-key
   c
@@ -280,4 +302,22 @@ representations:
   13
   ((fulltext b (md5 h$\a8\17\ac2\a4n\18]bX\b1t\db@) (sha1 \1bO\e1\bc\df\1da\fe\f4\b3\80PC\1baS\a2\f3\b33)) 19)
+  14
+  ((fulltext c (md5 _:oz\cdWT\03\d7\a7\16\a3\cf\d4\12\03) (sha1 \0f\09u\c6H\c4X~X*Bs\cd\9b\fd\02C\b3/\c8)) 1e)
+  15
+  ((fulltext c (md5 \d0\c3\19hP.\a1\c9\b68\f4W\19\ec\9cQ) (sha1 \17\b0\c2?pl\ef\...@\a2x\1dkn=\02\00z\be)) 1f)
+  16
+  ((fulltext c (md5 \e98\f3\da\b5\0b!gb\82=\e1\93A\0b\d1) (sha1 \ac\e1\cf\ca\be\ae:\a9|\07\81\fb\e6<\b0!\b6\15G3)) 1g)
+  17
+  ((fulltext c (md5 _:oz\cdWT\03\d7\a7\16\a3\cf\d4\12\03) (sha1 \0f\09u\c6H\c4X~X*Bs\cd\9b\fd\02C\b3/\c8)) 1h)
+  18
+  ((fulltext c (md5 \ed<pC\7f\f2^\fe\f5\b3wP\b9vF[) (sha1 .G\04.\9c?\dcQxa%G\ee\d8\af\de}&2\d3)) 1i)
+  19
+  ((fulltext c (md5 \18\c6\9c\bd\97K\a5\83\04\03\b8Z\d60f\80) (sha1 \a1\e9\cd\9b\c3U\ec\92\f5uxw\01\f3V\94\cd\e6B\d3)) 1j)
+  1a
+  ((fulltext c (md5 _:oz\cdWT\03\d7\a7\16\a3\cf\d4\12\03) (sha1 \0f\09u\c6H\c4X~X*Bs\cd\9b\fd\02C\b3/\c8)) 1k)
+  1b
+  ((fulltext c (md5 _:oz\cdWT\03\d7\a7\16\a3\cf\d4\12\03) (sha1 \0f\09u\c6H\c4X~X*Bs\cd\9b\fd\02C\b3/\c8)) 1l)
+  1c
+  ((fulltext c (md5 N\e3\d4+)\a5\c2Pth\1d\\.7M\a5) (sha1 B!\99\82\a0\16\023\00\ec\f6V{*\9b\d3k\db\c0\ab)) 1m)
   2
   ((fulltext 3 (md5 GU\02\81\f7\11\05E\0a\f9\b0_q\a7\f8e) (sha1 \ba\97\fd \0a\c2\c5J\c9\b3\d7\f2\e4\dd\97R \f8\b7\18)) 2)
@@ -325,5 +365,5 @@ representations:
   ((fulltext a (md5 \bc\d8\b0\c2\eb\1f\ceqN\abl\ef\0dw\1a\cc) (sha1 \f9\06_\a78\97P\e1o\e0\0d{\a3gH\f6\1d>\0d\f6)) r)
   next-key
-  14
+  1d
   o
   ((fulltext a (md5 \c3q\b3v\ad\b5\c0q\bd\fda\01\f4\f4\b7\0a) (sha1 |y\89\95\df"\bee\a81O\ee\d3:\eb\0c,\a6\b8\16)) s)
@@ -422,4 +462,40 @@ strings:
   19
   Orange\0a
+  1e
+
+  1e
+  Apple\0a
+  1f
+
+  1f
+  ((F b.0.c))
+  1g
+
+  1g
+  ((f-add 1.0.c) (f-del 2.0.c) (f-mov 4.0.c) (f-mod 3.0.c) (f-rpl 5.0.c))
+  1h
+
+  1h
+  Apple\0a
+  1i
+
+  1i
+  ((F 7.0.c))
+  1j
+
+  1j
+  ((F 8.1.c) (E 8.0.9))
+  1k
+
+  1k
+  Apple\0a
+  1l
+
+  1l
+  Apple\0a
+  1m
+
+  1m
+  ((F a.0.c))
   2

@@ -495,5 +571,5 @@ strings:
   Pear\0a
   next-key
-  1e
+  1n
   o

]]]

I compare the table changes that the code made with the changes I'm expecting, looking at the diagrams linked as "before" and "after" from <http://svn.apache.org/repos/asf/subversion/trunk/notes/obliterate/design-repos.html#refs>; the direct link to the "after" is <http://svn.apache.org/repos/asf/subversion/trunk/notes/obliterate/schema-bdb-dd1-after.svg>. (I'm working on an intermediate diagram.)

Next steps are (in no particular order):

- Modify the new txn (delete a specified node in it) before putting it back; this will be the real "obliterate".

- [*1] Complete the "commit" stage: update the "changes", "copies" and "node-origins" tables. These are currently not being updated, which means that e.g. "svn log" after the obliteration shows the revision as an empty revision, because its changes are not listed in the "changes" table.
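
For anyone who wants to play with the begin / modify / commit / remove sequence from the top of this message without a BDB repository, here is a toy in-memory sketch in Python. It is only an illustration: the dict layout and the function names (next_key, begin_obliterate, commit_obliterate, purge_txn) are made up for this example and are not the FS-BDB API. It assumes keys are allocated as base-36 strings, as the dump suggests ("next-key" going from c to d).

```python
# Toy model only: table layout and helper names are hypothetical,
# not the real FS-BDB code.
import copy

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def next_key(key):
    """Return the base-36 successor of key, e.g. 'c' -> 'd', 'z' -> '10'."""
    n = int(key, 36) + 1
    out = ""
    while n:
        n, r = divmod(n, 36)
        out = DIGITS[r] + out
    return out


def begin_obliterate(fs, rev):
    """Step 1 ("begin"): copy rev's txn under a freshly allocated txn-id."""
    old_id = fs["revisions"][rev]
    new_id = fs["next-key"]
    fs["next-key"] = next_key(new_id)
    fs["transactions"][new_id] = copy.deepcopy(fs["transactions"][old_id])
    return new_id


def commit_obliterate(fs, rev, new_id):
    """Step 3 ("commit"): point rev at the new txn; return the old txn-id."""
    old_id = fs["revisions"][rev]
    fs["revisions"][rev] = new_id
    return old_id


def purge_txn(fs, txn_id):
    """Step 4: drop the old txn to reclaim space."""
    del fs["transactions"][txn_id]


# A tiny made-up repository: r10 is txn 'a', r11 is txn 'b'.
fs = {
    "revisions": {10: "a", 11: "b"},
    "transactions": {
        "a": {"paths": {"/d/foo/c": "secret", "/d/foo/keep": "ok"}},
        "b": {"paths": {"/d/bar": "later"}},
    },
    "next-key": "c",
}

new_id = begin_obliterate(fs, 10)                   # step 1
del fs["transactions"][new_id]["paths"]["/d/foo/c"]  # step 2
old_id = commit_obliterate(fs, 10, new_id)          # step 3
purge_txn(fs, old_id)                               # step 4
```

The real thing also has to copy node-revisions, representations and strings and fix up the other tables; the sketch deliberately ignores all of that and only shows the order of the four steps.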
- Julian

[*2] I use a crude function that prettifies the BDB dump for human reading. It removes the numeric length prefixes, so that "5 hello 3 bye" is shown as just "hello bye". It parses arbitrary binary data wrongly, hence the warning messages. For my purposes in comparing the before and after output, the wrong parsing doesn't matter. The function is subversion/tests/cmdline/svntest/objects.py:crude_bdb_parse().
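
To illustrate what "removing the length prefixes" means, here is a minimal Python sketch. It is not the code of crude_bdb_parse(); the function name strip_length_prefixes is made up, and it assumes every item is written as a decimal length, one space, and then exactly that many characters.

```python
# Hypothetical illustration, not the real crude_bdb_parse().
def strip_length_prefixes(text):
    """Turn '5 hello 3 bye' into 'hello bye'."""
    items = []
    i = 0
    while i < len(text):
        # Read the decimal length.
        j = i
        while j < len(text) and text[j] in "0123456789":
            j += 1
        if j == i or j >= len(text) or text[j] != " ":
            raise ValueError("expected '<length> ' at offset %d" % i)
        length = int(text[i:j])
        # Take exactly 'length' characters after the single space.
        start = j + 1
        item = text[start:start + length]
        if len(item) != length:
            raise ValueError("item at offset %d is truncated" % start)
        items.append(item)
        # Skip the separator(s) before the next length prefix.
        i = start + length
        while i < len(text) and text[i] == " ":
            i += 1
    return " ".join(items)


result = strip_length_prefixes("5 hello 3 bye")
```

Because the length decides where an item ends, an item may itself contain spaces or digits ("11 hello world" gives "hello world"). A parser that guesses at boundaries instead is easily confused by binary data, which is the kind of wrong parsing mentioned in the footnote.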