C. Michael Pilato wrote on Wed, Sep 21, 2011 at 11:50:40 -0400:
> On 09/21/2011 11:03 AM, Daniel Shahaf wrote:
> >> But before we press on here, I'd like to understand your bigger-picture
> >> view.
> >
> > The branch operates on the assumption that an efficiently-queryable
> > successors store should be managed by the FS. In this thread I'm
> > further assuming that creating successors would be expensive and
> > therefore 'svnadmin upgrade' should create a 'miscellaneous' table
> > record and bump the format number.
> >
> > There is a concurrent thread by Stefan2 that challenges both of these
> > assumptions. I don't know that we have consensus yet whether the design
> > in that thread or the design currently on the branch is better. (And,
> > yes, figuring that out is the second thing at the top of my list, next to
> > figuring out how to implement 'upgrade' on the branch.)
>
> Yeah, I'm not following Stefan2's thread very closely. But regardless of
> what he thinks Subversion *should* have, I don't know of any reasons why it
> should *not* have this successor-id mapping.
>

On a high level, I recall Stefan2 was suggesting a design that focuses
not on node-rev successors but on high-level copy operations, and that is
not FS-backend-specific.

> >> Why are you choosing to do this by-revision in fs_base rather than using
> >> a lower-level, largely-Subversion-ignorant approach? Is it
> >> specifically so you can have an interruptible/restartable process? Is it
> >> so you can hook into some pre-existing per-revision subsystem
> >> (notification, perhaps)?
> >
> > I was simply trying to outline an algorithm for populating the
> > successors store from scratch in a live FS. (And yes, both
> > restartability and notification are nice properties to have.)
>
> Okay. I'm not sure that I would take the same course in a live FS versus an
> offline one, and you've been referring to 'upgrade' which shouldn't be run
> on a live FS -- that is, it should make the FS effectively "not live" for
> the duration of the upgrade. So, I'm a touch confused about what
> specifically you are aiming at.
>

What I'm thinking is as follows:

base_upgrade():
- create 'miscellaneous' table entry
- set the stored format number to 5

add_successors_to_f5_fs():
- backfill successors and remove the 'miscellaneous' table entry

base_history_next():
- assert format >= 5
- assert no 'miscellaneous' table entry
- do whatever it does today

This makes base_upgrade() a cheap operation. I was trying to make
add_successors_to_f5_fs() not block concurrent writers more than
necessary.

For add_successors_to_f5_fs(), I assumed operating by-revision would
result in smaller transactions and thus better behaviour for concurrent
readers/writers. It's also what I imagined the algorithm for FSFS would
be.

> But here's the extent of my assumptions: you want to backfill successors as
> quickly, efficiently, and painlessly as possible, ideally without
> interrupting live operation of the repository. Is that fair?
> :-)
>

Yes :-)

> > It's not clear to me exactly what the alternatives your question refers
> > to are. Could you elaborate on them, please?
>
> Well, BDB being a real database, we can do this sort of backfill operation
> without attending to any higher-level Subversion concepts such as revisions
> at all. A cursor walk through the `nodes' table should suffice:
>
>    for key, value in nodes_table.rows()
>       successor_id = key
>       node_rev = parse_node_revision_skel(value)
>       successors_table.add_row(node_rev.predecessor_id, successor_id)
>

I see what you mean now, thanks. (See above for why I went for
a by-revision algorithm.)

> --
> C. Michael Pilato <cmpil...@collab.net>
> CollabNet <> www.collab.net <> Distributed Development On Demand
>
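
To make the comparison concrete, here is a rough Python sketch of the
three-step outline above next to the single cursor walk. It is only an
illustration under stated assumptions: the tables are in-memory dicts
rather than BDB tables, the 'successors-pending' key name and the
(revision, predecessor) node layout are made up for the example, and
base_history_next() returns a stand-in lookup rather than what fs_base
actually does.

```python
from collections import defaultdict

PENDING_KEY = "successors-pending"  # hypothetical 'miscellaneous' key name


def make_fs(nodes):
    """Toy format-4 FS; `nodes` maps node-rev id -> (revision, predecessor id or None)."""
    return {
        "format": 4,
        "miscellaneous": {},
        "nodes": dict(nodes),
        "successors": defaultdict(set),
    }


def base_upgrade(fs):
    """Cheap step: note that the backfill is pending and bump the format."""
    fs["miscellaneous"][PENDING_KEY] = "0"  # next revision to backfill
    fs["format"] = 5


def add_successors_to_f5_fs(fs, notify=None):
    """By-revision backfill: one small 'transaction' per revision."""
    if PENDING_KEY not in fs["miscellaneous"]:
        return  # already backfilled
    start = int(fs["miscellaneous"][PENDING_KEY])
    youngest = max((rev for rev, _ in fs["nodes"].values()), default=-1)
    for rev in range(start, youngest + 1):
        for node_id, (node_rev, pred_id) in fs["nodes"].items():
            if node_rev == rev and pred_id is not None:
                fs["successors"][pred_id].add(node_id)
        # Record progress so an interrupted run resumes at the next revision.
        fs["miscellaneous"][PENDING_KEY] = str(rev + 1)
        if notify is not None:
            notify(rev)
    del fs["miscellaneous"][PENDING_KEY]


def cursor_walk_backfill(fs):
    """Alternative: a single pass over the nodes table, ignoring revisions."""
    for node_id, (_rev, pred_id) in fs["nodes"].items():
        if pred_id is not None:
            fs["successors"][pred_id].add(node_id)
    fs["miscellaneous"].pop(PENDING_KEY, None)


def base_history_next(fs, node_id):
    """Guards from the outline, then a stand-in successors lookup."""
    assert fs["format"] >= 5
    assert PENDING_KEY not in fs["miscellaneous"]
    return sorted(fs["successors"].get(node_id, ()))
```

Both backfills fill the same successors table. The by-revision one
trades a single long pass for many short steps, each of which records
its progress and can report it to a notification callback.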