On Fri, Sep 2, 2011 at 9:30 AM, Philip Martin <philip.mar...@wandisco.com> wrote: > The query STMT_SELECT_NODE_CHILDREN_WALKER_INFO as used in 1.7 > > SELECT local_relpath, op_depth, presence, kind > FROM nodes > WHERE wc_id = ?1 AND parent_relpath = ?2 > GROUP BY local_relpath > ORDER BY op_depth DESC > > performs poorly and doesn't scale well with working copy size. This > causes recursive functions that use svn_wc__internal_walk_children to be > slow, things like "svn info --depth infinity". I fixed this on trunk by > changing the query and some C code, see r1164426. > > On IRC Bert pointed out that we can fix the problem by introducing a new > index: > > NODES(wc_id, parent_relpath, local_relpath, op_depth) > > This improves things dramatically. If we add this new index we can > revert r1164426, the index provides a slightly larger performance gain > than the query/C code change. > > Bert also suggests changing our other indices by adding wc_id and/or > local_relpath thus allowing them to be UNIQUE. Can anyone confirm that > UNIQUE indices are better? > > I think that the I_ROOT and I_LOCAL_ABSPATH indices are unnecessary > given that columns they index are defined as UNIQUE. Can anyone confirm > that we don't need indices on UNIQUE columns?
I believe that UNIQUE columns are auto indexed. For example: sqlite> PRAGMA INDEX_LIST('wcroot'); 0|I_LOCAL_ABSPATH|1 1|sqlite_autoindex_WCROOT_1|1 and sqlite> select * from sqlite_master where type = 'index' and tbl_name = 'WCROOT'; index|sqlite_autoindex_WCROOT_1|WCROOT|8| index|I_LOCAL_ABSPATH|WCROOT|9|CREATE UNIQUE INDEX I_LOCAL_ABSPATH ON WCROOT (local_abspath) would both indicate there are two indices on the WCROOT table, though we only define one. I believe one of these indices is due to the UNIQUEness of the local_abspath column. Also, from http://www.sqlite.org/faq.html#q7 (in describing the sqlite_master table): For automatically created indices (used to implement the PRIMARY KEY or UNIQUE constraints) the sql field is NULL. >From this documentation, it would seem that our indices are redundant. > It's possible that we don't need I_EXTERNALS_PARENT as none of the > queries look like they will use it. Perhaps we should drop it? > > So how should we fix the 1.7 performance problem? > > - Use r1164426, my non-schema change fix. > > - Create a new WC format 30 with the new index. > > - Create a new WC format 30 with all the schema changes in the patch > below. > > Changing the WC format would involve auto-upgrading format 29 working > copies. We need to decide whether we want the minimal format 30 change > in 1.7 before we develop this feature on trunk. > > > Patch to change indices below. This is not a complete patch, it doesn't > bump the format or auto-upgrade. It does pass the regression tests and > improve the performance of recursive info and propset. > > Index: subversion/libsvn_wc/wc-metadata.sql > =================================================================== > --- subversion/libsvn_wc/wc-metadata.sql (revision 1164459) > +++ subversion/libsvn_wc/wc-metadata.sql (working copy) > @@ -58,7 +58,6 @@ > /* Note: a repository (identified by its UUID) may appear at multiple URLs. > For example, http://example.com/repos/ and https://example.com/repos/. */ > CREATE INDEX I_UUID ON REPOSITORY (uuid); > -CREATE INDEX I_ROOT ON REPOSITORY (root); > > > /* ------------------------------------------------------------------------- > */ > @@ -71,7 +70,6 @@ > local_abspath TEXT UNIQUE > ); > > -CREATE UNIQUE INDEX I_LOCAL_ABSPATH ON WCROOT (local_abspath); > > > /* ------------------------------------------------------------------------- > */ > @@ -178,8 +176,10 @@ > PRIMARY KEY (wc_id, local_relpath) > ); > > -CREATE INDEX I_ACTUAL_PARENT ON ACTUAL_NODE (wc_id, parent_relpath); > -CREATE INDEX I_ACTUAL_CHANGELIST ON ACTUAL_NODE (changelist); > +CREATE UNIQUE INDEX I_ACTUAL_PARENT ON ACTUAL_NODE (wc_id, parent_relpath, > + local_relpath); > +CREATE UNIQUE INDEX I_ACTUAL_CHANGELIST ON ACTUAL_NODE (wc_id, changelist, > + local_relpath); > > > /* ------------------------------------------------------------------------- > */ > @@ -478,7 +478,10 @@ > > ); > > -CREATE INDEX I_NODES_PARENT ON NODES (wc_id, parent_relpath, op_depth); > +CREATE UNIQUE INDEX I_NODES_PARENT1 ON NODES (wc_id, parent_relpath, > + op_depth, local_relpath); > +CREATE UNIQUE INDEX I_NODES_PARENT2 ON NODES (wc_id, parent_relpath, > + local_relpath, op_depth); > > /* Many queries have to filter the nodes table to pick only that version > of each node with the highest (most "current") op_depth. This view > @@ -567,7 +570,8 @@ > PRIMARY KEY (wc_id, local_relpath) > ); > > -CREATE INDEX I_EXTERNALS_PARENT ON EXTERNALS (wc_id, parent_relpath); > +CREATE UNIQUE INDEX I_EXTERNALS_PARENT ON EXTERNALS (wc_id, parent_relpath, > + local_relpath); > CREATE UNIQUE INDEX I_EXTERNALS_DEFINED ON EXTERNALS (wc_id, > def_local_relpath, > local_relpath); > @@ -804,8 +808,10 @@ > PRIMARY KEY (wc_id, local_relpath) > ); > > -CREATE INDEX I_ACTUAL_PARENT ON ACTUAL_NODE (wc_id, parent_relpath); > -CREATE INDEX I_ACTUAL_CHANGELIST ON ACTUAL_NODE (changelist); > +CREATE UNIQUE INDEX I_ACTUAL_PARENT ON ACTUAL_NODE (wc_id, parent_relpath, > + local_relpath); > +CREATE UNIQUE INDEX I_ACTUAL_CHANGELIST ON ACTUAL_NODE (wc_id, changelist, > + local_relpath); > > INSERT INTO ACTUAL_NODE SELECT > wc_id, local_relpath, parent_relpath, properties, conflict_old, > > -- > uberSVN: Apache Subversion Made Easy > http://www.uberSVN.com > -- uberSVN: Apache Subversion Made Easy http://www.uberSVN.com/