The complete patch (with log message) is attached.  Any review would be
appreciated.

> I have tried some timing comparisons and can't even see a difference.

My test WC consisted of 10 identical directories, each containing 1000
different text files, each consisting of a single line "Hello NNN" where
NNN is a number from 000 to 999.

Timings for the patched version (with the ref-count verification code
disabled):

CMD: svn add --quiet wc1/d000/ wc1/d001/ wc1/d002/ wc1/d003/ wc1/d004/
wc1/d005/ wc1/d006/ wc1/d007/ wc1/d008/ wc1/d009/
3.22 real, 1.52 user, 1.71 sys
CMD: svn commit --quiet -m  wc1
86.05 real, 62.47 user, 19.80 sys
CMD: svn st --quiet wc1
0.30 real, 0.16 user, 0.10 sys
CMD: svn co --quiet file:///home/julianfoad/tmp/svn/ref1/repo wc2
28.67 real, 15.09 user, 10.30 sys
CMD: svn up --quiet -r0 wc2
5.98 real, 2.78 user, 2.31 sys

Timings for the unpatched trunk code (r1057552):

CMD: svn add --quiet wc1/d000/ wc1/d001/ wc1/d002/ wc1/d003/ wc1/d004/
wc1/d005/ wc1/d006/ wc1/d007/ wc1/d008/ wc1/d009/
3.30 real, 1.33 user, 1.98 sys
CMD: svn commit --quiet -m  wc1
117.22 real, 63.88 user, 36.82 sys
CMD: svn st --quiet wc1
0.32 real, 0.14 user, 0.11 sys
CMD: svn co --quiet file:///home/julianfoad/tmp/svn/ref0/repo wc2
29.66 real, 15.08 user, 10.49 sys
CMD: svn up --quiet -r0 wc2
5.99 real, 2.56 user, 2.32 sys

Some of these timings - especially the commit - varied widely across
several runs, both patched and unpatched.

- Julian

Implement pristine text management by reference counting, deleting
unreferenced texts every time a WC-root object is closed.  Bump the WC
format and upgrade old development WCs by setting their reference counts.

This log message is arranged in several functional parts.

Turn on recursive triggers in SQLite so that triggers will fire on both
parts of an "INSERT OR REPLACE" statement.  This requires SQLite >= 3.6.18.

* subversion/libsvn_subr/sqlite.c
  (svn_sqlite__open): Add "PRAGMA recursive_triggers=ON".

Implement reference counting for pristine texts, using SQLite triggers.

* subversion/libsvn_wc/wc-metadata.sql
  (PRISTINE): Update the documentation of the 'refcount' column.
  (nodes_insert_trigger, nodes_delete_trigger, nodes_update_checksum_trigger):
    New triggers that cause the appropriate reference count (or counts) to
    be incremented or decremented whenever a NODES table row containing a
    reference is inserted or deleted or modified.

* subversion/libsvn_wc/wc-queries.sql
  (STMT_INSERT_PRISTINE): Initialize the ref count to 0 instead of 1.

* subversion/libsvn_wc/wc_db.c
  (create_db): After creating the schema, create the new triggers.

* subversion/tests/libsvn_wc/db-test.c
  (create_fake_wc): Same.

* subversion/tests/libsvn_wc/entries-compat.c
  (create_fake_wc): Same.

When closing a "wcroot" object, remove all unreferenced pristine texts.

* subversion/libsvn_wc/wc_db_pdh.c
  (close_wcroot): Clean up unreferenced pristines by calling
    svn_wc__db_pristine_cleanup_wcroot().

* subversion/libsvn_wc/wc_db_private.h
  (svn_wc__db_pristine_cleanup_wcroot): New function.

* subversion/libsvn_wc/wc_db.c
  (count_pristine_references): New function.
  (pristine_cleanup_wcroot): Rename to svn_wc__db_pristine_cleanup_wcroot()
    and make public. Optionally (currently #ifdef'd out), verify that the
    ref count column of each pristine text matches the actual number of
    references; this can be slow. (### That should be in a separate function
    or perhaps not committed.) Show debug messages when deleting unused
    pristine texts.  (### That should not be committed.)
  (svn_wc__db_pristine_cleanup): Adjust for the renaming.

* subversion/libsvn_wc/wc-queries.sql
  (STMT_SELECT_ALL_PRISTINES, STMT_SELECT_ALL_PRISTINE_REFERENCES): New
    queries, used only for verification.

* subversion/libsvn_wc/adm_ops.c
  (svn_wc__internal_remove_from_revision_control): Don't remove unused
    pristine texts here, because the clean-up-on-close will take care of it.

Bump the WC format and upgrade old development WCs by initializing the
ref-count.

* subversion/libsvn_wc/wc.h
  (SVN_WC__VERSION): Bump to 24.

* subversion/libsvn_wc/wc-queries.sql
  (STMT_UPGRADE_TO_24): New statement.

* subversion/libsvn_wc/upgrade.c
  (bump_to_24): New function.
  (svn_wc__upgrade_sdb): Call it.

--This line, and those below, will be ignored--

Index: subversion/libsvn_subr/sqlite.c
===================================================================
--- subversion/libsvn_subr/sqlite.c	(revision 1057551)
+++ subversion/libsvn_subr/sqlite.c	(working copy)
@@ -932,7 +932,11 @@ svn_sqlite__open(svn_sqlite__db_t **db, 
 
                  ### Maybe switch to NORMAL(1) when we use larger transaction
                      scopes */
-              "PRAGMA synchronous=OFF;"));
+              "PRAGMA synchronous=OFF;"
+              /* Enable recursive triggers so that a user trigger will fire
+               * in the deletion phase of an INSERT OR REPLACE statement.
+               * Requires SQLite >= 3.6.18 */
+              "PRAGMA recursive_triggers=ON;"));
 
 #if SQLITE_VERSION_AT_LEAST(3,6,19) && defined(SVN_DEBUG)
   /* When running in debug mode, enable the checking of foreign key
Index: subversion/libsvn_wc/adm_ops.c
===================================================================
--- subversion/libsvn_wc/adm_ops.c	(revision 1057551)
+++ subversion/libsvn_wc/adm_ops.c	(working copy)
@@ -1819,21 +1819,8 @@ svn_wc__internal_remove_from_revision_co
                                               scratch_pool));
 
       /* Having removed the checksums that reference the pristine texts,
-         remove the pristine texts (if now totally unreferenced) from the
-         pristine store.  Don't try to remove the same pristine text twice.
-         The two checksums might be the same, either because the copied base
-         was exactly the same as the replaced base, or just because the
-         ..._read_info() code above sets WORKING_SHA1_CHECKSUM to the base
-         checksum if there is no WORKING_NODE row. */
-      if (base_sha1_checksum)
-        SVN_ERR(svn_wc__db_pristine_remove(db, local_abspath,
-                                           base_sha1_checksum,
-                                           scratch_pool));
-      if (working_sha1_checksum
-          && ! svn_checksum_match(base_sha1_checksum, working_sha1_checksum))
-        SVN_ERR(svn_wc__db_pristine_remove(db, local_abspath,
-                                           working_sha1_checksum,
-                                           scratch_pool));
+         at this point we could remove the pristine texts from the
+         pristine store if they are now totally unreferenced. */
 
       /* If we were asked to destroy the working file, do so unless
          it has local mods. */
Index: subversion/libsvn_wc/upgrade.c
===================================================================
--- subversion/libsvn_wc/upgrade.c	(revision 1057551)
+++ subversion/libsvn_wc/upgrade.c	(working copy)
@@ -1129,6 +1129,14 @@ bump_to_23(void *baton, svn_sqlite__db_t
   return SVN_NO_ERROR;
 }
 
+static svn_error_t *
+bump_to_24(void *baton, svn_sqlite__db_t *sdb, apr_pool_t *scratch_pool)
+{
+  SVN_ERR(svn_sqlite__exec_statements(sdb, STMT_UPGRADE_TO_24));
+  SVN_ERR(svn_sqlite__exec_statements(sdb, STMT_CREATE_NODES_TRIGGERS));
+  return SVN_NO_ERROR;
+}
+
 
 struct upgrade_data_t {
   svn_sqlite__db_t *sdb;
@@ -1390,6 +1398,12 @@ svn_wc__upgrade_sdb(int *result_format,
         *result_format = 23;
         /* FALLTHROUGH  */
 
+      case 23:
+        SVN_ERR(svn_sqlite__with_transaction(sdb, bump_to_24, &bb,
+                                             scratch_pool));
+        *result_format = 24;
+        /* FALLTHROUGH  */
+
       /* ### future bumps go here.  */
 #if 0
       case XXX-1:
Index: subversion/libsvn_wc/wc_db.c
===================================================================
--- subversion/libsvn_wc/wc_db.c	(revision 1057551)
+++ subversion/libsvn_wc/wc_db.c	(working copy)
@@ -1273,9 +1273,8 @@ create_db(svn_sqlite__db_t **sdb,
 
   /* Create the database's schema.  */
   SVN_ERR(svn_sqlite__exec_statements(*sdb, STMT_CREATE_SCHEMA));
-
-  /* Create the NODES table for the experimental schema */
   SVN_ERR(svn_sqlite__exec_statements(*sdb, STMT_CREATE_NODES));
+  SVN_ERR(svn_sqlite__exec_statements(*sdb, STMT_CREATE_NODES_TRIGGERS));
 
   /* Insert the repository. */
   SVN_ERR(create_repos_id(repos_id, repos_root_url, repos_uuid, *sdb,
@@ -2585,11 +2584,99 @@ svn_wc__db_pristine_remove(svn_wc__db_t 
 }
 
 
+#ifdef SVN_DEBUG
+/* Set *ACTUAL_REFS to the number of references to SHA1_CHECKSUM. */
 static svn_error_t *
-pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot,
-                        apr_pool_t *scratch_pool)
+count_pristine_references(int *actual_refs,
+                          svn_sqlite__db_t *sdb,
+                          const svn_checksum_t *sha1_checksum,
+                          apr_pool_t *scratch_pool)
+{
+  svn_sqlite__stmt_t *stmt;
+
+  *actual_refs = 0;
+
+  SVN_ERR(svn_sqlite__get_statement(&stmt, sdb,
+                                    STMT_SELECT_ALL_PRISTINE_REFERENCES));
+  SVN_ERR(svn_sqlite__bind_checksum(stmt, 1, sha1_checksum, scratch_pool));
+  while (1)
+    {
+      svn_boolean_t have_row;
+
+      SVN_ERR(svn_sqlite__step(&have_row, stmt));
+      if (! have_row)
+        break;
+
+      (*actual_refs)++;
+    }
+  SVN_ERR(svn_sqlite__reset(stmt));
+  return SVN_NO_ERROR;
+}
+#endif
+
+svn_error_t *
+svn_wc__db_pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot,
+                                   apr_pool_t *scratch_pool)
 {
   svn_sqlite__stmt_t *stmt;
+  int removed = 0;
+
+#ifdef SVN_DEBUG2
+  /* For each pristine text: check its refcount matches its actual references. */
+  SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb,
+                                    STMT_SELECT_ALL_PRISTINES));
+  while (1)
+    {
+      svn_boolean_t have_row;
+      const svn_checksum_t *sha1_checksum;
+      int refcount, actual_refs;
+      svn_sqlite__stmt_t *stmt2;
+
+      SVN_ERR(svn_sqlite__step(&have_row, stmt));
+      if (! have_row)
+        break;
+
+      SVN_ERR(svn_sqlite__column_checksum(&sha1_checksum, stmt, 0,
+                                          scratch_pool));
+      refcount = svn_sqlite__column_int(stmt, 1);
+
+      /* Count the actual references to this checksum. */
+      SVN_ERR(count_pristine_references(&actual_refs, wcroot->sdb,
+                                        sha1_checksum, scratch_pool));
+
+      if (refcount == actual_refs)
+        continue;
+
+      SVN_DBG(("%s: refcount %d != actual_refs %d\n",
+               svn_checksum_to_cstring_display(sha1_checksum, scratch_pool),
+               refcount, actual_refs));
+
+      /* Show the actual references to this checksum. */
+      SVN_ERR(svn_sqlite__get_statement(&stmt2, wcroot->sdb,
+                                        STMT_SELECT_ALL_PRISTINE_REFERENCES));
+      SVN_ERR(svn_sqlite__bind_checksum(stmt2, 1, sha1_checksum, scratch_pool));
+      while (1)
+        {
+          svn_boolean_t have_row2;
+          const char *local_relpath;
+          int op_depth;
+
+          SVN_ERR(svn_sqlite__step(&have_row2, stmt2));
+          if (! have_row2)
+            break;
+
+          local_relpath = svn_sqlite__column_text(stmt2, 1, scratch_pool);
+          op_depth = svn_sqlite__column_int(stmt2, 2);
+          SVN_DBG(("pristine ref: %s, op_depth %d, '%s'\n",
+                   svn_checksum_to_cstring_display(sha1_checksum, scratch_pool),
+                   op_depth, local_relpath));
+        }
+      SVN_ERR(svn_sqlite__reset(stmt2));
+
+      SVN_ERR_ASSERT(!"pristine refcount mismatch");
+    }
+  SVN_ERR(svn_sqlite__reset(stmt));
+#endif
 
   /* Find each unreferenced pristine in the DB and remove it. */
   SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb,
@@ -2605,8 +2692,13 @@ pristine_cleanup_wcroot(svn_wc__db_wcroo
 
       SVN_ERR(svn_sqlite__column_checksum(&sha1_checksum, stmt, 0,
                                           scratch_pool));
+      SVN_DBG(("removing unused pristine %s\n",
+              svn_checksum_to_cstring_display(sha1_checksum, scratch_pool)));
+      removed++;
       SVN_ERR(pristine_remove(wcroot, sha1_checksum, scratch_pool));
     }
+  if (removed > 0)
+    SVN_DBG(("removed %d unused pristines in '%s'\n", removed, wcroot->abspath));
   SVN_ERR(svn_sqlite__reset(stmt));
 
   return SVN_NO_ERROR;
@@ -2628,7 +2720,7 @@ svn_wc__db_pristine_cleanup(svn_wc__db_t
                               scratch_pool, scratch_pool));
   VERIFY_USABLE_PDH(pdh);
 
-  SVN_ERR(pristine_cleanup_wcroot(pdh->wcroot, scratch_pool));
+  SVN_ERR(svn_wc__db_pristine_cleanup_wcroot(pdh->wcroot, scratch_pool));
 
   return SVN_NO_ERROR;
 }
Index: subversion/libsvn_wc/wc_db_pdh.c
===================================================================
--- subversion/libsvn_wc/wc_db_pdh.c	(revision 1057551)
+++ subversion/libsvn_wc/wc_db_pdh.c	(working copy)
@@ -26,6 +26,7 @@
 #include <assert.h>
 
 #include "svn_dirent_uri.h"
+#include "svn_pools.h"
 
 #include "wc.h"
 #include "adm_files.h"
@@ -114,6 +115,13 @@ close_wcroot(void *data)
 
   SVN_ERR_ASSERT_NO_RETURN(wcroot->sdb != NULL);
 
+  /* Remove all unreferenced pristine texts. */
+  {
+    apr_pool_t *scratch_pool = svn_pool_create(NULL);
+    svn_error_clear(svn_wc__db_pristine_cleanup_wcroot(wcroot, scratch_pool));
+    svn_pool_destroy(scratch_pool);
+  }
+
   err = svn_sqlite__close(wcroot->sdb);
   wcroot->sdb = NULL;
   if (err)
Index: subversion/libsvn_wc/wc_db_private.h
===================================================================
--- subversion/libsvn_wc/wc_db_private.h	(revision 1057551)
+++ subversion/libsvn_wc/wc_db_private.h	(working copy)
@@ -194,5 +194,10 @@ svn_wc__db_util_open_db(svn_sqlite__db_t
                         apr_pool_t *result_pool,
                         apr_pool_t *scratch_pool);
 
+/* Remove all unreferenced pristines in WCROOT. */
+svn_error_t *
+svn_wc__db_pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot,
+                                   apr_pool_t *scratch_pool);
+
 
 #endif /* WC_DB_PDH_H */
Index: subversion/libsvn_wc/wc.h
===================================================================
--- subversion/libsvn_wc/wc.h	(revision 1057551)
+++ subversion/libsvn_wc/wc.h	(working copy)
@@ -133,12 +133,15 @@ extern "C" {
  * The change from 22 to 23 introduced multi-layer op_depth processing for
  * NODES.
  *
+ * The change from 23 to 24 started using the 'refcount' column of the
+ * 'pristine' table correctly, instead of always setting it to '1'.
+ *
  * == 1.7.x shipped with format ???
  *
  * Please document any further format changes here.
  */
 
-#define SVN_WC__VERSION 23
+#define SVN_WC__VERSION 24
 
 
 /* Formats <= this have no concept of "revert text-base/props".  */
Index: subversion/libsvn_wc/wc-metadata.sql
===================================================================
--- subversion/libsvn_wc/wc-metadata.sql	(revision 1057551)
+++ subversion/libsvn_wc/wc-metadata.sql	(working copy)
@@ -95,8 +95,9 @@ CREATE TABLE PRISTINE (
      Used to verify the pristine file is "proper". */
   size  INTEGER NOT NULL,
 
-  /* ### this will probably go away, in favor of counting references
-     ### that exist in NODES. Not yet used; always set to 1. */
+  /* The number of rows in the NODES table that have a 'checksum' column
+     value that refers to this row.  (References in other places, such as
+     in the ACTUAL_NODE table, are not counted.) */
   refcount  INTEGER NOT NULL,
 
   /* Alternative MD5 checksum used for communicating with older
@@ -480,6 +481,35 @@ CREATE TABLE NODES (
 CREATE INDEX I_NODES_PARENT ON NODES (wc_id, parent_relpath, op_depth);
 
 
+-- STMT_CREATE_NODES_TRIGGERS
+
+CREATE TRIGGER nodes_insert_trigger
+AFTER INSERT ON nodes
+/* WHEN NEW.checksum IS NOT NULL */
+BEGIN
+  UPDATE pristine SET refcount = refcount + 1
+  WHERE checksum = NEW.checksum;
+END;
+
+CREATE TRIGGER nodes_delete_trigger
+AFTER DELETE ON nodes
+/* WHEN OLD.checksum IS NOT NULL */
+BEGIN
+  UPDATE pristine SET refcount = refcount - 1
+  WHERE checksum = OLD.checksum;
+END;
+
+CREATE TRIGGER nodes_update_checksum_trigger
+AFTER UPDATE OF checksum ON nodes
+/* WHEN NEW.checksum IS NOT NULL OR OLD.checksum IS NOT NULL */
+BEGIN
+  UPDATE pristine SET refcount = refcount + 1
+  WHERE checksum = NEW.checksum;
+  UPDATE pristine SET refcount = refcount - 1
+  WHERE checksum = OLD.checksum;
+END;
+
+
 
 /* Format 20 introduces NODES and removes BASE_NODE and WORKING_NODE */
 
@@ -552,6 +582,19 @@ PRAGMA user_version = 23;
 
 /* ------------------------------------------------------------------------- */
 
+/* Format 24 involves no schema changes; it starts using the pristine
+   table's refcount column correctly. */
+
+-- STMT_UPGRADE_TO_24
+UPDATE pristine SET refcount =
+  (SELECT COUNT(*) FROM nodes
+   WHERE checksum = pristine.checksum /*OR checksum = pristine.md5_checksum*/);
+
+PRAGMA user_version = 24;
+
+
+/* ------------------------------------------------------------------------- */
+
 /* Format YYY introduces new handling for conflict information.  */
 -- format: YYY
 
Index: subversion/libsvn_wc/wc-queries.sql
===================================================================
--- subversion/libsvn_wc/wc-queries.sql	(revision 1057551)
+++ subversion/libsvn_wc/wc-queries.sql	(working copy)
@@ -411,7 +411,15 @@ DELETE FROM work_queue WHERE id = ?1;
 
 -- STMT_INSERT_PRISTINE
 INSERT OR IGNORE INTO pristine (checksum, md5_checksum, size, refcount)
-VALUES (?1, ?2, ?3, 1);
+VALUES (?1, ?2, ?3, 0);
+
+-- STMT_SELECT_ALL_PRISTINES
+SELECT checksum, refcount
+FROM pristine;
+
+-- STMT_SELECT_ALL_PRISTINE_REFERENCES
+SELECT checksum, local_relpath, op_depth FROM nodes
+  WHERE checksum = ?1 /*OR checksum = ?2*/
 
 -- STMT_SELECT_PRISTINE_MD5_CHECKSUM
 SELECT md5_checksum
Index: subversion/tests/libsvn_wc/db-test.c
===================================================================
--- subversion/tests/libsvn_wc/db-test.c	(revision 1057551)
+++ subversion/tests/libsvn_wc/db-test.c	(working copy)
@@ -329,6 +329,7 @@ create_fake_wc(const char *subdir, int f
   const char * const my_statements[] = {
     statements[STMT_CREATE_SCHEMA],
     statements[STMT_CREATE_NODES],
+    statements[STMT_CREATE_NODES_TRIGGERS],
     TESTING_DATA,
     NULL
   };
Index: subversion/tests/libsvn_wc/entries-compat.c
===================================================================
--- subversion/tests/libsvn_wc/entries-compat.c	(revision 1057551)
+++ subversion/tests/libsvn_wc/entries-compat.c	(working copy)
@@ -336,12 +336,14 @@ create_fake_wc(const char *subdir, int f
   const char * const my_statements[] = {
     statements[STMT_CREATE_SCHEMA],
     statements[STMT_CREATE_NODES],
+    statements[STMT_CREATE_NODES_TRIGGERS],
     TESTING_DATA,
     NULL
   };
   const char * const M_statements[] = {
     statements[STMT_CREATE_SCHEMA],
     statements[STMT_CREATE_NODES],
+    statements[STMT_CREATE_NODES_TRIGGERS],
     M_TESTING_DATA,
     NULL
   };

Reply via email to