On Thu, 2010-08-12, Greg Stein wrote: > Looks good! Committed. Now I'm trying to understand and document BASE_NODE (first) in more detail. Then I/we can tackle the other tables, and especially it should give me (and anyone else who may be struggling to follow) a starting point for understanding the NODE_DATA change.
I want to help write a top-down description that new developers can understand, and at the same time I'm aiming for something we can use as a *definitive* specification when it comes to deciding what's a bug, writing unit tests, etc., so it must not gloss over any important details. My starting point is trying to describe what I think it means now. I haven't tested each statement against current behaviour. If we find any unnecessary quirks and exceptions, I want to first document them as such, and then we can decide whether and when to eliminate them. Are any parts of this inaccurate? [[[ Index: subversion/libsvn_wc/wc-metadata.sql =================================================================== /home/julianfoad/bin/svn-external-diff: warning: files are different types: text/x-c; charset=us-ascii text/x-lisp; charset=us-ascii --- subversion/libsvn_wc/wc-metadata.sql (revision 985836) +++ subversion/libsvn_wc/wc-metadata.sql (working copy) @@ -73,21 +73,111 @@ CREATE TABLE WCROOT ( CREATE UNIQUE INDEX I_LOCAL_ABSPATH ON WCROOT (local_abspath); /* ------------------------------------------------------------------------- */ +/* +The BASE_NODE table: + + BASE is what we get from the server. It is the *absolute* pristine copy. + You need to use checkout, update, switch, or commit to alter your view of + the repository. + + In the BASE tree, each node corresponds to a particular node-rev in the + repository. It can be a mixed-revision tree. Each node holds either a + copy of the node-rev as it exists in the repository (if presence == + 'normal'), or a place-holder (if presence == 'absent' or 'excluded' or + 'not-present'). + [Quoted from wc_db.h] + +Overview of BASE_NODE columns: + + Indexing columns: (wc_id, local_relpath, parent_relpath) + + (presence) + - One of the following values: + + 'presence' Meaning Node-Rev? Content? Last-Change? + ---------- ----------- ----------- -------- ------------ + normal => Present Existing Yes Yes + absent => Unauthz Existing No No + excluded => Unwanted Existing No No + not-pres => Nonexistent Nonexistent No No + incomplete => ### undefined + + Node-Rev columns: (repos_id, repos_relpath, revnum) + - Always points to the corresponding repository node-rev. + - Points to an existing node-rev, unless presence==not-present in which + case it points to a nonexistent node-rev. + - ### A comment on 'repos_id' and 'repos_relpath' says they may be null; + is this true and wanted? + - ### A comment on 'revnum' says, "this could be NULL for non-present + nodes -- no info"; is this true and wanted? + + Content columns: (kind, properties, depth, target, checksum) + - One of: ---- ---------- ----- ------ -------- + 'dir' Yes Yes null null + 'symlink' Yes null Yes null + 'file' Yes null null Yes + 'unknown' null null null null + - Content is present iff presence==normal, otherwise kind=unknown and + the other columns are null. + - If kind==dir: the children are represented by the existence of other + BASE_NODE rows. For each immediate child of 'repos_relpath'@'revnum' + that is included by 'depth', a BASE_NODE row exists with its + 'local_relpath' being this node's 'local_relpath' plus the child's + basename. (Rows may also exist for additional children which are + outside the scope of 'depth' or do not exist as children of this + node-rev in the repository, including 'externals' and paths updated to + a revision in which they do exist.) There is no distinction between + depth=immediates and depth=infinity here. + - If kind==symlink: the target path is contained in 'target'. + - If kind==file: the content is contained in the Pristine Store, + referenced by its SHA-1 checksum 'checksum'. + + Last-Change columns: (changed_rev, changed_date, changed_author) + - Last-Change info is present iff presence==normal, otherwise null. + - Specifies the revision in which the content was last changed before + Node-Rev, following copies and not counting the copy operation itself + as a change. + - Does not specify the revision in which this node first appeared at + the repository path 'repos_relpath', which could be more recent than + the last change of this node's content. + - Includes a copy of the corresponding date and author rev-props. + + Working file status: (translated_size, last_mod_time) + - Present iff kind==file and node has no WORKING_NODE row, otherwise + null. (If kind==file and node has a WORKING_NODE row, the info is + recorded in that row). ### True? + - Records the status of the working file on disk, for the purpose of + detecting quickly whether that file has been modified. + - Logically belongs to the ACTUAL_NODE table but is recorded in the + BASE_NODE and WORKING_NODE tables instead to avoid the overhead of + storing an ACTUAL_NODE row for each unmodified file. + - Records the actual size and mod-time of the disk file at the time when + its content was last determined to be logically unmodified relative to + its base, taking account of keywords and EOL style. + + (dav_cache) + + (incomplete_children) + - Obsolete, unused. + + (file_external) +*/ + CREATE TABLE BASE_NODE ( /* specifies the location of this node in the local filesystem. wc_id implies an absolute path, and local_relpath is relative to that location (meaning it will be "" for the wcroot). */ wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_relpath TEXT NOT NULL, - /* the repository this node is part of, and the relative path [to its - root] within revision "revnum" of that repository. These may be NULL, + /* The repository this node is part of, and the relative path (from its + root) within revision "revnum" of that repository. These may be NULL, implying they should be derived from the parent and local_relpath. Non-NULL typically indicates a switched node. Note: they must both be NULL, or both non-NULL. */ repos_id INTEGER REFERENCES REPOSITORY (id), repos_relpath TEXT, @@ -109,20 +199,18 @@ CREATE TABLE BASE_NODE ( this could be NULL for non-present nodes -- no info. */ revnum INTEGER, /* If this node is a file, then the SHA-1 checksum of the pristine text. */ checksum TEXT, - /* The size in bytes of the working file when it had no local text - modifications. This means the size of the text when translated from - repository-normal format to working copy format with EOL style - translated and keywords expanded according to the properties in the - "properties" column of this row. + /* The size in bytes of the working file when it was last determined to be + logically unmodified relative to its base, taking account of keywords + and EOL style. - NULL if this node is not a file or if the size has not (yet) been - computed. */ + NULL if this node is not a file or if this info has not yet been + determined. */ translated_size INTEGER, /* Information about the last change to this node. changed_rev must be not-null if this node has presence=="normal". changed_date and changed_author may be null if the corresponding revprops are missing. @@ -134,18 +222,25 @@ CREATE TABLE BASE_NODE ( /* NULL depth means "default" (typically svn_depth_infinity) */ depth TEXT, /* for kind==symlink, this specifies the target. */ symlink_target TEXT, + /* The mod-time of the working file when it was last determined to be + logically unmodified relative to its base, taking account of keywords + and EOL style. + + NULL if this node is not a file or if this info has not yet been + determined. + */ /* ### Do we need this? We've currently got various mod time APIs ### internal to libsvn_wc, but those might be used in answering some ### question which is better answered some other way. */ last_mod_time INTEGER, /* an APR date/time (usec since 1970) */ - /* serialized skel of this node's properties. could be NULL if we + /* serialized skel of this node's properties. NULL if we have no information about the properties (a non-present node). */ properties BLOB, /* serialized skel of this node's dav-cache. could be NULL if the node does not have any dav-cache. */ dav_cache BLOB, ]]] wc-metadata.sql looks like a good home for this documentation for the time being. Perhaps it will eventually live separately. - Julian