Re: NODE_DATA (2nd iteration)

Julian Foad Tue, 03 Aug 2010 05:25:33 -0700

On Tue, 2010-08-03 at 10:12 +0100, Julian Foad wrote:
> On Mon, 2010-08-02, I (Julian Foad) wrote:
> > Hi Erik.
> > 
> > Would you or anybody volunteer to draw a diagram of how these table rows
> > look in various simple-ish WC states?
> 
> Maybe I can help by drawing my best interpretation of it and getting
> your feedback.  I'll have a go.


Take a look at my attempt.  I've started with textual tables rather than
a visual diagram.

<https://docs.google.com/document/edit?id=1IhLTs37OszES0dQ4f08RF9dl4VqvBoCrFHBaH6cFGbk#>.

You're welcome to edit it directly if you email/IRC me and ask.

- Julian



> > I feel stupid saying this, but I haven't yet got much of an idea at all
> > about how a set of database rows will represent a particular collection
> > of repository nodes and local changes in the new scheme.  I know roughly
> > what the aim is (to be able to represent nested tree changes more
> > flexibly), and I can read what elements of data will be stored in each
> > table, but I am missing the part that says how those are connected.
> > 
> > At this point we might as well assume it's a single DB - I think that
> > will be clearest.
> > 
> > Thanks.
> > 
> > - Julian
> > 
> > 
> > On Mon, 2010-07-12 at 23:23 +0200, Erik Huelsmann wrote:
> > > After lots of discussion regarding the way NODE_DATA/4th tree should
> > > be working, I'm now ready to post a summary of the progress. In my
> > > last e-mail (http://svn.haxx.se/dev/archive-2010-07/0262.shtml) I
> > > stated why we need this; this post is about the conclusion of what
> > > needs to happen. Also included are the first steps there.
> > > 
> > > 
> > > With the advent of NODE_DATA, we distinguish three kinds of node
> > > values: those specifically related to BASE nodes, those specifically
> > > related to "current" WORKING nodes, and those which are to be
> > > maintained for multiple levels of WORKING nodes (not only the
> > > "current" view). The last category is most often also shared with BASE.
> > > 
> > > The respective tables will hold the columns shown below.
> > > 
> > > 
> > > -------------------------
> > > CREATE TABLE WORKING_NODE (
> > >   wc_id  INTEGER NOT NULL REFERENCES WCROOT (id),
> > >   local_relpath  TEXT NOT NULL,
> > >   parent_relpath  TEXT,
> > >   moved_here  INTEGER,
> > >   moved_to  TEXT,
> > >   original_repos_id  INTEGER REFERENCES REPOSITORY (id),
> > >   original_repos_path  TEXT,
> > >   original_revnum  INTEGER,
> > >   translated_size  INTEGER,
> > >   last_mod_time  INTEGER,  /* an APR date/time (usec since 1970) */
> > >   keep_local  INTEGER,
> > > 
> > >   PRIMARY KEY (wc_id, local_relpath)
> > >   );
> > > 
> > > CREATE INDEX I_WORKING_PARENT ON WORKING_NODE (wc_id, parent_relpath);
> > > --------------------------------
> > > 
> > > The moved_* and original_* columns are typical examples of "WORKING
> > > fields only maintained for the visible WORKING nodes": the original_*
> > > and moved_* fields are inherited from the operation root by all
> > > children part of the operation. The operation root will be the visible
> > > change on its own level, meaning it'll have rows both in the
> > > WORKING_NODE and NODE_DATA tables. The fact that these columns are not
> > > in the NODE_DATA table means that tree changes are not preserved
> > > across overlapping changes. This is fully compatible with what we do
> > > today: changes to higher levels destroy changes to lower levels.
> > > 
> > > The translated_size and last_mod_time columns exist in WORKING_NODE
> > > and BASE_NODE; they explicitly don't exist in NODE_DATA. The fact that
> > > they exist in BASE_NODE is a bit of a hack: it's to prevent creation
> > > of WORKING_NODE data for every file which has keyword expansion or eol
> > > translation properties set: these columns serve only to optimize
> > > working copy scanning for changes and as such only relate to the
> > > visible WORKING_NODEs.
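
[A minimal sketch of the scanning optimization described above, not taken
from Erik's mail. It assumes the recorded translated_size and last_mod_time
are compared against the on-disk file's size and mtime, and that a match
means the file can be skipped without a full content comparison. The
function name probably_unmodified is hypothetical.]

```python
import os


def probably_unmodified(path, translated_size, last_mod_time):
    """Cheap pre-check: True if size and mtime match the recorded values.

    last_mod_time is assumed to be an APR-style timestamp
    (microseconds since 1970), as in the column comment.
    """
    st = os.stat(path)
    mtime_usec = st.st_mtime_ns // 1000  # nanoseconds -> microseconds
    return st.st_size == translated_size and mtime_usec == last_mod_time
```

A False result only means the file needs a closer look; it does not by
itself prove the contents changed.
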
> > > 
> > > 
> > > CREATE TABLE BASE_NODE (
> > >   wc_id  INTEGER NOT NULL REFERENCES WCROOT (id),
> > >   local_relpath  TEXT NOT NULL,
> > >   repos_id  INTEGER REFERENCES REPOSITORY (id),
> > >   repos_relpath  TEXT,
> > >   parent_relpath  TEXT,
> > >   translated_size  INTEGER,
> > >   last_mod_time  INTEGER,  /* an APR date/time (usec since 1970) */
> > >   dav_cache  BLOB,
> > >   incomplete_children  INTEGER,
> > >   file_external  TEXT,
> > > 
> > >   PRIMARY KEY (wc_id, local_relpath)
> > >   );
> > > 
> > > 
> > > CREATE TABLE NODE_DATA (
> > >   wc_id  INTEGER NOT NULL REFERENCES WCROOT (id),
> > >   local_relpath  TEXT NOT NULL,
> > >   op_depth  INTEGER NOT NULL,
> > >   presence  TEXT NOT NULL,
> > >   kind  TEXT NOT NULL,
> > >   checksum  TEXT,
> > >   changed_rev  INTEGER,
> > >   changed_date  INTEGER,  /* an APR date/time (usec since 1970) */
> > >   changed_author  TEXT,
> > >   depth  TEXT,
> > >   symlink_target  TEXT,
> > >   properties  BLOB,
> > > 
> > >   PRIMARY KEY (wc_id, local_relpath, op_depth)
> > >   );
> > > 
> > > CREATE INDEX I_NODE_WC_RELPATH ON NODE_DATA (wc_id, local_relpath);
> > > 
> > > 
> > > Which leaves the NODE_DATA structure above. The op_depth column
> > > contains the depth of the node - relative to the wc root - on which
> > > the operation was run which caused the creation of the given NODE_DATA
> > > node.  In the final scheme (based on single-db), the value will be 0
> > > for base and a positive integer for WORKING related data.
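
[A rough sketch of how I read the op_depth description, not something from
Erik's mail. It uses a cut-down NODE_DATA table with only a few of the
columns, and a made-up WC state: BASE contains A and A/f, A/f has been
locally replaced, and A/B (with child A/B/g) has been copied in. It also
assumes that the "visible" row for a path is the one with the highest
op_depth; that selection rule is my assumption. The helper names are
hypothetical.]

```python
import sqlite3


def relpath_depth(relpath):
    """Number of path components; the wc root ("") has depth 0."""
    return 0 if relpath == "" else relpath.count("/") + 1


def make_db():
    """Build an in-memory NODE_DATA table for the made-up WC state."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE NODE_DATA (
             wc_id INTEGER NOT NULL,
             local_relpath TEXT NOT NULL,
             op_depth INTEGER NOT NULL,
             presence TEXT NOT NULL,
             kind TEXT NOT NULL,
             PRIMARY KEY (wc_id, local_relpath, op_depth))"""
    )
    copy_root = relpath_depth("A/B")     # operation root of the copy
    replace_root = relpath_depth("A/f")  # operation root of the replace
    rows = [
        (1, "A", 0, "normal", "dir"),                # BASE
        (1, "A/f", 0, "normal", "file"),             # BASE
        (1, "A/f", replace_root, "normal", "file"),  # WORKING, replaces BASE
        (1, "A/B", copy_root, "normal", "dir"),      # WORKING, op root
        (1, "A/B/g", copy_root, "normal", "file"),   # WORKING, child of op root
    ]
    conn.executemany("INSERT INTO NODE_DATA VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def visible_rows(conn, wc_id):
    """Return (local_relpath, op_depth) of the highest layer for each path."""
    return conn.execute(
        """SELECT n.local_relpath, n.op_depth FROM NODE_DATA AS n
           WHERE n.wc_id = ?
             AND n.op_depth = (SELECT MAX(m.op_depth) FROM NODE_DATA AS m
                               WHERE m.wc_id = n.wc_id
                                 AND m.local_relpath = n.local_relpath)
           ORDER BY n.local_relpath""",
        (wc_id,),
    ).fetchall()
```

Note that A/B/g carries the op_depth of its operation root A/B (2), not its
own path depth (3), which is how I understand "the depth of the node on
which the operation was run".
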
> > > 
> > > In order to be able to implement NODE_DATA even without having a fully
> > > functional SINGLE_DB yet, a transitional node numbering scheme needs
> > > to be devised. The following numbers will apply: BASE == 0,
> > > WORKING-this-dir == 1, WORKING-any-immediate-child == 2.
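
[The transitional numbering above, written out as a small sketch. The
function name and its two boolean parameters are hypothetical; only the
three values 0, 1 and 2 come from Erik's mail.]

```python
def transitional_op_depth(is_working, is_this_dir):
    """Map a node in a per-directory DB to its transitional op_depth.

    BASE == 0, WORKING-this-dir == 1, WORKING-any-immediate-child == 2.
    """
    if not is_working:
        return 0
    return 1 if is_this_dir else 2
```
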
> > > 
> > > 
> > > Other transitioning related remarks:
> > > 
> > >  * Conditionally protected experimental sections, just like with
> > > SINGLE_DB
> > >  * Initial implementation will simply replace the current
> > > functionality of the 2 tables; from there we can work our way through
> > > whatever needs doing.
> > >  * Am I forgetting any others?
> > > 
> > > Bye,
> > > 
> > > Erik.
> > 
> > 
> 
>
