Re: BDB: implementing 'upgrade' for fs-successor-ids

Daniel Shahaf Wed, 21 Sep 2011 09:58:25 -0700

C. Michael Pilato wrote on Wed, Sep 21, 2011 at 11:50:40 -0400:
> On 09/21/2011 11:03 AM, Daniel Shahaf wrote:
> >> But before we press on here, I'd like to understand your bigger-picture
> >> view.
> > 
> > The branch operates on the assumption that an efficiently-queryable
> > successors store should be managed by the FS.  In this thread I'm
> > further assuming that creating successors would be expensive and
> > therefore 'svnadmin upgrade' should create a 'miscellaneous' table
> > record and bump the format number.
> > 
> > There is a concurrent thread by Stefan2 that challenges both of these
> > assumptions.  I don't know that we have consensus yet whether the design
> > in that thread or the design currently on the branch is better.  (And,
> > yes, figuring that out is the second thing at the top of my list, next to
> > figuring out how to implement 'upgrade' on the branch.)
> 
> Yeah, I'm not following Stefan2's thread very closely.  But regardless of
> what he thinks Subversion *should* have, I don't know of any reasons why it
> should *not* have this successor-id mapping.
>


At a high level, I recall Stefan2 was suggesting a design that focuses
not on node-rev successors but on high-level copy operations, and that is
not FS-backend-specific.

> >> Why are you choosing to do this by-revision in fs_base rather than using
> >> a lower-level, largely-Subversion-ignorant approach?  Is it
> >> specifically so you can have an interruptible/restartable process?  Is it
> >> so you can hook into some pre-existing per-revision subsystem
> >> (notification, perhaps)?
> > 
> > I was simply trying to outline an algorithm for populating the
> > successors store from scratch in a live FS.  (And yes, both
> > restartability and notification are nice properties to have.)
> 
> Okay.  I'm not sure that I would take the same course in a live FS versus an
> offline one, and you've been referring to 'upgrade' which shouldn't be run
> on a live FS -- that is, it should make the FS effectively "not live" for
> the duration of the upgrade.  So, I'm a touch confused about what
> specifically you are aiming at.
> 

What I'm thinking is as follows:

  base_upgrade():
    - create 'miscellaneous' table entry
    - set the stored format number to 5

  add_successors_to_f5_fs():
    - backfill successors and remove 'miscellaneous' table entry

  base_history_next():
    - assert format >= 5
    - assert no 'miscellaneous' table entry
    - do whatever it does today

This makes base_upgrade() a cheap operation.  I was trying to make
add_successors_to_f5_fs() not block concurrent writers more than
necessary.

For add_successors_to_f5_fs(), I assumed operating by-revision would
result in smaller transactions and thus better behaviour for concurrent
readers/writers.  It's also what I imagined the algorithm for FSFS would
be.
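
To make the three steps above concrete, here is a minimal in-memory sketch
of how they could fit together.  It is not the branch's code: ToyFS, the
MISC_KEY name, and the idea of storing a "next revision to backfill" marker
in the 'miscellaneous' record are all assumptions made for illustration, as
is the assert_successors_ready() helper standing in for the checks at the
top of base_history_next().

```python
# Hypothetical key name; the real 'miscellaneous' key is not given in the thread.
MISC_KEY = "successors-backfill-pending"


class ToyFS:
    """Toy stand-in for an fs_base filesystem (assumption, not the real API)."""

    def __init__(self, revisions):
        # revisions[r] is a list of (node_rev_id, predecessor_id or None)
        # pairs for the node-revisions created in revision r.
        self.revisions = revisions
        self.format = 4
        self.miscellaneous = {}
        self.successors = {}


def base_upgrade(fs):
    """Cheap upgrade: record that a backfill is pending and bump the format."""
    fs.miscellaneous[MISC_KEY] = "0"  # next revision to backfill
    fs.format = 5


def add_successors_to_f5_fs(fs, max_revs=None):
    """Backfill one revision per 'transaction'; safe to stop and resume."""
    assert fs.format >= 5
    done = 0
    while MISC_KEY in fs.miscellaneous:
        rev = int(fs.miscellaneous[MISC_KEY])
        if rev >= len(fs.revisions):
            # Every revision has been processed: drop the pending marker.
            del fs.miscellaneous[MISC_KEY]
            break
        if max_revs is not None and done >= max_revs:
            break  # interrupted; the marker remembers where to resume
        for node_rev_id, predecessor_id in fs.revisions[rev]:
            if predecessor_id is not None:
                fs.successors.setdefault(predecessor_id, []).append(node_rev_id)
        fs.miscellaneous[MISC_KEY] = str(rev + 1)
        done += 1


def assert_successors_ready(fs):
    """The two checks base_history_next() would make before doing its work."""
    assert fs.format >= 5
    assert MISC_KEY not in fs.miscellaneous
```

Keeping each revision in its own small unit of work is what would let
concurrent readers and writers get in between steps, and the stored marker
is one way to get the restartability mentioned earlier in the thread.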

> But here's the extent of my assumptions:  you want to backfill successors as
> quickly, efficiently, and painlessly as possible, ideally without
> interrupting live operation of the repository.  Is that fair?  :-)
> 

Yes :-)

> > It's not clear to me exactly what the alternatives your question refers
> > to are.  Could you elaborate on them, please?
> 
> Well, BDB being a real database, we can do this sort of backfill operation
> without attending to any higher-level Subversion concepts such as revisions
> at all.  A cursor walk through the `nodes' table should suffice:
> 
>    for key, value in nodes_table.rows():
>       successor_id = key
>       node_rev = parse_node_revision_skel(value)
>       successors_table.add_row(node_rev.predecessor_id, successor_id)
> 

I see what you mean now, thanks.  (See above for why I went for
a by-revision algorithm.)
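
For comparison, the cursor-walk approach quoted above could be sketched
roughly as follows.  This is a toy model only: a plain dict stands in for
the `nodes' table, each value is assumed to be already parsed into a dict
with a "predecessor_id" field, and the node-revision IDs in the usage
example are made up.  Skipping rows that have no predecessor is also my
assumption.

```python
def backfill_by_cursor_walk(nodes_table):
    """Build a predecessor -> [successors] map in one pass over all rows."""
    successors_table = {}
    # sorted() mimics a cursor visiting keys in order; no notion of
    # revisions is needed anywhere in the walk.
    for successor_id, node_rev in sorted(nodes_table.items()):
        predecessor_id = node_rev.get("predecessor_id")
        if predecessor_id is None:
            continue  # a node-revision with no predecessor adds no row
        successors_table.setdefault(predecessor_id, []).append(successor_id)
    return successors_table


# Usage with made-up IDs: one node-revision with two successors.
example_nodes = {
    "0.0.1": {"predecessor_id": None},
    "0.0.2": {"predecessor_id": "0.0.1"},
    "0.1.3": {"predecessor_id": "0.0.1"},
}
example_successors = backfill_by_cursor_walk(example_nodes)
```

The walk is simpler than the by-revision version, but as a single pass it
has no natural per-revision points at which to pause or notify.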

> -- 
> C. Michael Pilato <cmpil...@collab.net>
> CollabNet   <>   www.collab.net   <>   Distributed Development On Demand
>
