I'm implementing ref-counting for pristine texts, with the aim of
deleting any text whose ref-count reaches zero.  The current situation
without this work is that many pristine texts are not deleted when they
become unreferenced, and they accumulate in the pristine store until the
user runs "svn cleanup".  I think that is not good enough even for an
initial release.

Having discarded other approaches, my plan now is to:

  * Maintain the ref count at the granularity of SQL statements: each
time we add or remove a reference in the DB (which is usually in the
NODES table), update the corresponding ref count in the PRISTINE table.
This could be changed later to once per SQLite transaction if we find
cases where that would make a significant saving.

  * Act on the ref count (deleting texts whose count is zero) at some
higher level: when closing a "wcroot" object or after running the Work
Queue or just before returning from a libsvn_wc public API.  For the
time being I'm doing it when closing a "wcroot" object.


The best way I have found to maintain the ref counts is using SQL
triggers to update the count when a reference is added or deleted.

CREATE TRIGGER nodes_insert_trigger
AFTER INSERT ON nodes
BEGIN
  UPDATE pristine SET refcount = refcount + 1
  WHERE checksum = NEW.checksum;
END;

CREATE TRIGGER nodes_delete_trigger
AFTER DELETE ON nodes
BEGIN
  UPDATE pristine SET refcount = refcount - 1
  WHERE checksum = OLD.checksum;
END;

CREATE TRIGGER nodes_update_checksum_trigger
AFTER UPDATE OF checksum ON nodes
BEGIN
  UPDATE pristine SET refcount = refcount + 1
  WHERE checksum = NEW.checksum;
  UPDATE pristine SET refcount = refcount - 1
  WHERE checksum = OLD.checksum;
END;

(A detail: This correctly handles NULL in the checksum column: such a
value won't match any row in the 'pristine' table.)

The only case that SQLite doesn't handle automatically is the
replacement part of "INSERT OR REPLACE INTO ...": it doesn't fire the
"delete" trigger in that case.  To overcome this limitation, we must
instead use an explicit "DELETE FROM ..." statement followed by an
unconditional "INSERT INTO ..." statement.  I'm puzzled about why some
of our INSERT statements should ever need to overwrite an existing row,
but they currently do, and I am marking such places with '###' comments.

In order to check whether this is all working, I have inserted in
close_wcroot() a call to svn_wc__db_pristine_cleanup_wcroot() and made
the latter function count (in debug mode only) the number of actual
references versus the recorded refcount of each pristine, and complain
about any difference.  Using this a few tests in the test suite fail,
mostly copy/revert/upgrade tests, so I can see where I have a few more
cases to complete.  I expect all of these to be cases where I need to
put an explicit "DELETE" statement instead of "INSERT OR REPLACE".

This checking code prints a report such as the following when there is a
mismatch:

  DBG: wc_db.c:2692: unused/miscounted pristines in 
'/home/julianfoad/src/subversion-n-gamma/subversion/tests/cmdline/svn-test-work/working_copies/copy_tests-30'
  DBG: wc_db.c:2697: 56388a031dffbf9df7c32e1f299b1d5d7ef60881: refcount 2 != 
actual_refs 1
  DBG: wc_db.c:2717: pristine ref: 56388a031dffbf9df7c32e1f299b1d5d7ef60881, 
op_depth 0, 'A/D/G/rho'


Attached are two patches, including log messages.  The one called
"pristine-refcount-triggers-1.patch" implements the trigger approach, as
described above.  The one called "pristine-refcount-manual-1.patch" is
an earlier patch, in which I started adding explicit SQL queries to
increment and decrement the ref count whenever we update the NODES
table.  Both of them include the same checking/debugging/cleanup code.

Any thoughts on the direction of the whole thing, or the details?

- Julian

### This patch is experimental and incomplete.

Implement reference counting for pristine texts.  This part of the patch
only causes the reference count to be incremented and decremented.

* subversion/libsvn_wc/wc_db.c
  (pristine_refcount_increment, pristine_refcount_decrement,
   pristine_refcount_decrement_by_nodes_row): New functions.
### pristine_refcount_decrement() is not used.

  (insert_base_node, insert_incomplete_children, insert_working_node,
   commit_node, set_new_dir_to_incomplete_txn): Decrement the ref count
    associated with any pristine text in a row that is about to be
    overwritten, and increment the ref count associated with the new row
    afterwards.

* subversion/libsvn_wc/wc-queries.sql
  (STMT_PRISTINE_REFCOUNT_DECREMENT_BY_NODES_ROW,
   STMT_PRISTINE_REFCOUNT_INCREMENT): New queries.
  (STMT_INSERT_PRISTINE): Initialize the ref count to 0 instead of 1.


When closing a "wcroot" object, this patch also checks the ref counts of all
pristine texts, and removes all unreferenced pristine texts.  This part of
the patch is included only for testing purposes.

* subversion/libsvn_wc/wc_db.c
  (count_pristine_references): New function.
  (pristine_cleanup_wcroot): Rename to svn_wc__db_pristine_cleanup_wcroot().
    Check that the ref count column of each pristine text matches the actual
    number of references. Show debug messages when deleting unused pristine
    texts.
  (svn_wc__db_pristine_cleanup): Adjust for the renaming.

* subversion/libsvn_wc/wc_db_pdh.c
  (close_wcroot): Clean up unreferenced pristines.

* subversion/libsvn_wc/wc_db_private.h
  (svn_wc__db_pristine_cleanup_wcroot): New function.

* subversion/libsvn_wc/wc-queries.sql
  (STMT_SELECT_ALL_PRISTINES, STMT_SELECT_ALL_PRISTINE_REFERENCES): New
    queries.
--This line, and those below, will be ignored--

Index: subversion/libsvn_wc/wc_db.c
===================================================================
--- subversion/libsvn_wc/wc_db.c	(revision 1056311)
+++ subversion/libsvn_wc/wc_db.c	(working copy)
@@ -816,6 +816,48 @@ retract_parent_delete(svn_sqlite__db_t *
 }
 
 
+/* Increment the ref-count of the pristine text identified by SHA1_CHECKSUM. */
+static svn_error_t *
+pristine_refcount_increment(svn_sqlite__db_t *sdb,
+                            const svn_checksum_t *sha1_checksum,
+                            apr_pool_t *scratch_pool)
+{
+  svn_sqlite__stmt_t *stmt;
+
+  SVN_ERR(svn_sqlite__get_statement(&stmt, sdb,
+                                    STMT_PRISTINE_REFCOUNT_INCREMENT));
+  SVN_ERR(svn_sqlite__bind_checksum(stmt, 1, sha1_checksum, scratch_pool));
+  SVN_ERR(svn_sqlite__update(NULL, stmt));
+
+  return SVN_NO_ERROR;
+}
+
+/* Decrement the ref-count of the pristine text identified by SHA1_CHECKSUM. */
+static svn_error_t *
+pristine_refcount_decrement(const svn_checksum_t *sha1_checksum,
+                            apr_pool_t *scratch_pool)
+{
+  return SVN_NO_ERROR;
+}
+
+/* Decrement the ref-count of the pristine text whose checksum is given in
+ * the NODES table row indexed by (WC_ID, LOCAL_RELPATH, OP_DEPTH).  If
+ * there is no such row in NODES, or its 'checksum' is null, do nothing. */
+static svn_error_t *
+pristine_refcount_decrement_by_nodes_row(svn_sqlite__db_t *sdb,
+                                         apr_int64_t wc_id,
+                                         const char *local_relpath,
+                                         apr_int64_t op_depth)
+{
+  svn_sqlite__stmt_t *stmt;
+
+  SVN_ERR(svn_sqlite__get_statement(&stmt, sdb,
+                                    STMT_PRISTINE_REFCOUNT_DECREMENT_BY_NODES_ROW));
+  SVN_ERR(svn_sqlite__bindf(stmt, "isi", wc_id, local_relpath, op_depth));
+  SVN_ERR(svn_sqlite__update(NULL, stmt));
+
+  return SVN_NO_ERROR;
+}
 
 /* */
 static svn_error_t *
@@ -835,6 +899,10 @@ insert_base_node(void *baton, svn_sqlite
   /* ### we can't handle this right now  */
   SVN_ERR_ASSERT(pibb->conflict == NULL);
 
+  SVN_ERR(pristine_refcount_decrement_by_nodes_row(
+            sdb, pibb->wc_id, pibb->local_relpath,
+            (apr_int64_t)0 /* op_depth for base */));
+
   SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, STMT_INSERT_NODE));
   SVN_ERR(svn_sqlite__bindf(stmt, "isisisr"
                             "tstr"               /* 8 - 11 */
@@ -856,11 +930,18 @@ insert_base_node(void *baton, svn_sqlite
                             (pibb->kind == svn_wc__db_kind_symlink) ?
                                 pibb->target : NULL)); /* 19 */
 
-  if (pibb->kind == svn_wc__db_kind_file) {
-    SVN_ERR(svn_sqlite__bind_checksum(stmt, 14, pibb->checksum, scratch_pool));
-    if (pibb->translated_size != SVN_INVALID_FILESIZE)
-      SVN_ERR(svn_sqlite__bind_int64(stmt, 16, pibb->translated_size));
-  }
+  if (pibb->kind == svn_wc__db_kind_file)
+    {
+      if (pibb->checksum)
+        {
+          SVN_ERR(svn_sqlite__bind_checksum(stmt, 14, pibb->checksum,
+                                            scratch_pool));
+          SVN_ERR(pristine_refcount_increment(sdb, pibb->checksum,
+                                              scratch_pool));
+        }
+      if (pibb->translated_size != SVN_INVALID_FILESIZE)
+        SVN_ERR(svn_sqlite__bind_int64(stmt, 16, pibb->translated_size));
+    }
 
   SVN_ERR(svn_sqlite__bind_properties(stmt, 15, pibb->props,
                                       scratch_pool));
@@ -973,6 +1064,9 @@ insert_working_node(void *baton,
   SVN_ERR_ASSERT(*piwb->local_relpath != '\0');
   parent_relpath = svn_relpath_dirname(piwb->local_relpath, scratch_pool);
 
+  SVN_ERR(pristine_refcount_decrement_by_nodes_row(
+            sdb, piwb->wc_id, piwb->local_relpath, piwb->op_depth));
+
   SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, STMT_INSERT_NODE));
   SVN_ERR(svn_sqlite__bindf(stmt, "isisnnntstrisn"
                 "nnnn" /* properties translated_size last_mod_time dav_cache */
@@ -993,8 +1093,13 @@ insert_working_node(void *baton,
 
   if (piwb->kind == svn_wc__db_kind_file)
     {
-      SVN_ERR(svn_sqlite__bind_checksum(stmt, 14, piwb->checksum,
-                                        scratch_pool));
+      if (piwb->checksum)
+        {
+          SVN_ERR(svn_sqlite__bind_checksum(stmt, 14, piwb->checksum,
+                                            scratch_pool));
+          SVN_ERR(pristine_refcount_increment(sdb, piwb->checksum,
+                                              scratch_pool));
+        }
     }
   else if (piwb->kind == svn_wc__db_kind_symlink)
     {
@@ -2584,12 +2689,105 @@ svn_wc__db_pristine_remove(svn_wc__db_t 
 }
 
 
+#ifdef SVN_DEBUG
+/* Set *ACTUAL_REFS to the number of references to SHA1_CHECKSUM. */
 static svn_error_t *
-pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot,
-                        apr_pool_t *scratch_pool)
+count_pristine_references(int *actual_refs,
+                          svn_sqlite__db_t *sdb,
+                          const svn_checksum_t *sha1_checksum,
+                          apr_pool_t *scratch_pool)
 {
   svn_sqlite__stmt_t *stmt;
 
+  *actual_refs = 0;
+
+  SVN_ERR(svn_sqlite__get_statement(&stmt, sdb,
+                                    STMT_SELECT_ALL_PRISTINE_REFERENCES));
+  SVN_ERR(svn_sqlite__bind_checksum(stmt, 1, sha1_checksum, scratch_pool));
+  while (1)
+    {
+      svn_boolean_t have_row;
+
+      SVN_ERR(svn_sqlite__step(&have_row, stmt));
+      if (! have_row)
+        break;
+
+      (*actual_refs)++;
+    }
+  SVN_ERR(svn_sqlite__reset(stmt));
+  return SVN_NO_ERROR;
+}
+#endif
+
+svn_error_t *
+svn_wc__db_pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot,
+                                   apr_pool_t *scratch_pool)
+{
+  svn_sqlite__stmt_t *stmt;
+  svn_boolean_t done_one = FALSE;
+
+#ifdef SVN_DEBUG
+  /* For each pristine text: check its refcount matches its actual references. */
+  SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb,
+                                    STMT_SELECT_ALL_PRISTINES));
+  while (1)
+    {
+      svn_boolean_t have_row;
+      const svn_checksum_t *sha1_checksum;
+      int refcount, actual_refs;
+      svn_sqlite__stmt_t *stmt2;
+
+      SVN_ERR(svn_sqlite__step(&have_row, stmt));
+      if (! have_row)
+        break;
+
+      SVN_ERR(svn_sqlite__column_checksum(&sha1_checksum, stmt, 0,
+                                          scratch_pool));
+      refcount = svn_sqlite__column_int(stmt, 1);
+
+      /* Count the actual references to this checksum. */
+      SVN_ERR(count_pristine_references(&actual_refs, wcroot->sdb,
+                                        sha1_checksum, scratch_pool));
+
+      if (refcount == actual_refs)
+        continue;
+
+      if (! done_one)
+        {
+          SVN_DBG(("unused/miscounted pristines in '%s'\n", wcroot->abspath));
+          done_one = TRUE;
+        }
+      SVN_DBG(("%s: refcount %d != actual_refs %d\n",
+               svn_checksum_to_cstring_display(sha1_checksum, scratch_pool),
+               refcount, actual_refs));
+
+      /* Show the actual references to this checksum. */
+      SVN_ERR(svn_sqlite__get_statement(&stmt2, wcroot->sdb,
+                                        STMT_SELECT_ALL_PRISTINE_REFERENCES));
+      SVN_ERR(svn_sqlite__bind_checksum(stmt2, 1, sha1_checksum, scratch_pool));
+      while (1)
+        {
+          svn_boolean_t have_row2;
+          const char *local_relpath;
+          int op_depth;
+
+          SVN_ERR(svn_sqlite__step(&have_row2, stmt2));
+          if (! have_row2)
+            break;
+
+          local_relpath = svn_sqlite__column_text(stmt2, 1, scratch_pool);
+          op_depth = svn_sqlite__column_int(stmt2, 2);
+          SVN_DBG(("pristine ref: %s, op_depth %d, '%s'\n",
+                   svn_checksum_to_cstring_display(sha1_checksum, scratch_pool),
+                   op_depth, local_relpath));
+        }
+      SVN_ERR(svn_sqlite__reset(stmt2));
+
+      SVN_ERR_ASSERT(!"pristine refcount mismatch");
+    }
+  SVN_ERR(svn_sqlite__reset(stmt));
+#endif
+
   /* Find each unreferenced pristine in the DB and remove it. */
   SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb,
                                     STMT_SELECT_UNREFERENCED_PRISTINES));
@@ -2604,6 +2802,13 @@ pristine_cleanup_wcroot(svn_wc__db_wcroo
 
       SVN_ERR(svn_sqlite__column_checksum(&sha1_checksum, stmt, 0,
                                           scratch_pool));
+      if (! done_one)
+        {
+          SVN_DBG(("removing unused pristines in '%s'\n", wcroot->abspath));
+          done_one = TRUE;
+        }
+      SVN_DBG(("removing unused pristine %s\n",
+              svn_checksum_to_cstring_display(sha1_checksum, scratch_pool)));
       SVN_ERR(pristine_remove(wcroot, sha1_checksum, scratch_pool));
     }
   SVN_ERR(svn_sqlite__reset(stmt));
@@ -2627,7 +2832,7 @@ svn_wc__db_pristine_cleanup(svn_wc__db_t
                               scratch_pool, scratch_pool));
   VERIFY_USABLE_PDH(pdh);
 
-  SVN_ERR(pristine_cleanup_wcroot(pdh->wcroot, scratch_pool));
+  SVN_ERR(svn_wc__db_pristine_cleanup_wcroot(pdh->wcroot, scratch_pool));
 
   return SVN_NO_ERROR;
 }
@@ -6007,6 +6212,10 @@ commit_node(void *baton, svn_sqlite__db_
   /* ### other presences? or reserve that for separate functions?  */
   new_presence = svn_wc__db_status_normal;
 
+  SVN_ERR(pristine_refcount_decrement_by_nodes_row(
+            cb->pdh->wcroot->sdb, cb->pdh->wcroot->wc_id, cb->local_relpath,
+            (apr_int64_t)0 /* op_depth for base */));
+
   SVN_ERR(svn_sqlite__get_statement(&stmt, cb->pdh->wcroot->sdb,
                                     STMT_APPLY_CHANGES_TO_BASE_NODE));
   /* symlink_target not yet used */
@@ -6026,6 +6243,9 @@ commit_node(void *baton, svn_sqlite__db_
 
   SVN_ERR(svn_sqlite__bind_checksum(stmt, 13, cb->new_checksum,
                                     scratch_pool));
+  if (cb->new_checksum)
+    SVN_ERR(pristine_refcount_increment(cb->pdh->wcroot->sdb, cb->new_checksum,
+                                        scratch_pool));
   SVN_ERR(svn_sqlite__bind_properties(stmt, 15, cb->new_dav_cache,
                                       scratch_pool));
 
Index: subversion/libsvn_wc/wc_db_pdh.c
===================================================================
--- subversion/libsvn_wc/wc_db_pdh.c	(revision 1056311)
+++ subversion/libsvn_wc/wc_db_pdh.c	(working copy)
@@ -26,6 +26,7 @@
 #include <assert.h>
 
 #include "svn_dirent_uri.h"
+#include "svn_pools.h"
 
 #include "wc.h"
 #include "adm_files.h"
@@ -114,6 +115,14 @@ close_wcroot(void *data)
 
   SVN_ERR_ASSERT_NO_RETURN(wcroot->sdb != NULL);
 
+#ifdef SVN_WC__PRISTINE_CLEANUP
+  {
+    apr_pool_t *scratch_pool = svn_pool_create(NULL);
+    svn_error_clear(svn_wc__db_pristine_cleanup_wcroot(wcroot, scratch_pool));
+    svn_pool_destroy(scratch_pool);
+  }
+#endif
+
   err = svn_sqlite__close(wcroot->sdb);
   wcroot->sdb = NULL;
   if (err)
Index: subversion/libsvn_wc/wc_db_private.h
===================================================================
--- subversion/libsvn_wc/wc_db_private.h	(revision 1056311)
+++ subversion/libsvn_wc/wc_db_private.h	(working copy)
@@ -194,5 +194,10 @@ svn_wc__db_util_open_db(svn_sqlite__db_t
                         apr_pool_t *result_pool,
                         apr_pool_t *scratch_pool);
 
+/* Remove all unreferenced pristines in WCROOT. */
+svn_error_t *
+svn_wc__db_pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot,
+                                   apr_pool_t *scratch_pool);
+
 
 #endif /* WC_DB_PDH_H */
Index: subversion/libsvn_wc/wc-queries.sql
===================================================================
--- subversion/libsvn_wc/wc-queries.sql	(revision 1056311)
+++ subversion/libsvn_wc/wc-queries.sql	(working copy)
@@ -132,4 +132,16 @@ SELECT id FROM repository WHERE root = ?
 -- STMT_INSERT_REPOSITORY
 INSERT INTO repository (root, uuid) VALUES (?1, ?2);
 
+-- STMT_PRISTINE_REFCOUNT_DECREMENT_BY_NODES_ROW
+/* Decrement the ref-count of the pristine text whose checksum is given in
+ * the NODES table row indexed by (wc_id, local_relpath, op_depth).  If
+ * there is no such row in NODES, or its 'checksum' is null, do nothing. */
+UPDATE pristine SET refcount = refcount - 1
+WHERE checksum = (SELECT checksum FROM nodes
+                  WHERE wc_id = ?1 AND local_relpath = ?2 AND op_depth = ?3);
+
+-- STMT_PRISTINE_REFCOUNT_INCREMENT
+UPDATE pristine SET refcount = refcount + 1
+WHERE checksum = ?1;
+
 -- STMT_INSERT_NODE
@@ -411,7 +427,24 @@ DELETE FROM work_queue WHERE id = ?1;
 
 -- STMT_INSERT_PRISTINE
 INSERT OR IGNORE INTO pristine (checksum, md5_checksum, size, refcount)
-VALUES (?1, ?2, ?3, 1);
+VALUES (?1, ?2, ?3, 0);
+
+-- STMT_SELECT_ALL_PRISTINES
+SELECT checksum, refcount
+FROM pristine;
+
+-- STMT_SELECT_ALL_PRISTINE_REFERENCES
+SELECT checksum, local_relpath, op_depth FROM nodes
+  WHERE checksum = ?1 /*OR checksum = ?2*/
+UNION ALL
+SELECT older_checksum, local_relpath, -1 FROM actual_node
+  WHERE older_checksum = ?1 /*OR older_checksum = ?2*/
+UNION ALL
+SELECT left_checksum, local_relpath, -2 FROM actual_node
+  WHERE left_checksum  = ?1 /*OR left_checksum  = ?2*/
+UNION ALL
+SELECT right_checksum, local_relpath, -3 FROM actual_node
+  WHERE right_checksum = ?1 /*OR right_checksum = ?2*/
 
 -- STMT_SELECT_PRISTINE_MD5_CHECKSUM
 SELECT md5_checksum
### This patch is experimental and incomplete.

Implement reference counting for pristine texts, using SQLite triggers.
This part of the patch only causes the reference count to be incremented and
decremented.

* subversion/libsvn_wc/wc_db.c
  (delete_nodes_row): New function.

  (pristine_refcount_increment, pristine_refcount_decrement,
   pristine_refcount_decrement_by_nodes_row): New functions, inactive. For
    experimenting with non-automatic reference counting, as an alternative
    to the SQLite triggers method that is currently active in this patch.

  (insert_base_node, insert_incomplete_children, insert_working_node,
   commit_node, set_new_dir_to_incomplete_txn): Delete any existing row,
    thus decrementing the pristine refcount if applicable, before inserting
    the new one.

* subversion/libsvn_wc/wc-metadata.sql
  (nodes_insert_trigger, nodes_delete_trigger, nodes_update_checksum_trigger):
    New triggers.

* subversion/libsvn_wc/wc-queries.sql
  (STMT_INSERT_PRISTINE): Initialize the ref count to 0 instead of 1.
  (STMT_DELETE_NODES_ROW): New query.
  (STMT_INSERT_NODE, STMT_APPLY_CHANGES_TO_BASE_NODE): Use just "INSERT"
    instead of "INSERT OR REPLACE".


When closing a "wcroot" object, this patch also checks the ref counts of all
pristine texts, and removes all unreferenced pristine texts.  This part of
the patch is included only for testing purposes.

* subversion/libsvn_wc/wc_db.c
  (count_pristine_references): New function.
  (pristine_cleanup_wcroot): Rename to svn_wc__db_pristine_cleanup_wcroot().
    Check that the ref count column of each pristine text matches the actual
    number of references. Show debug messages when deleting unused pristine
    texts.
  (svn_wc__db_pristine_cleanup): Adjust for the renaming.

* subversion/libsvn_wc/wc_db_pdh.c
  (close_wcroot): Clean up unreferenced pristines.

* subversion/libsvn_wc/wc_db_private.h
  (svn_wc__db_pristine_cleanup_wcroot): New function.

* subversion/libsvn_wc/wc-queries.sql
  (STMT_SELECT_ALL_PRISTINES, STMT_SELECT_ALL_PRISTINE_REFERENCES): New
    queries.
--This line, and those below, will be ignored--

Index: subversion/libsvn_wc/wc_db.c
===================================================================
--- subversion/libsvn_wc/wc_db.c	(revision 1056311)
+++ subversion/libsvn_wc/wc_db.c	(working copy)
@@ -816,6 +816,23 @@ retract_parent_delete(svn_sqlite__db_t *
 }
 
 
+/* Delete the row indexed by (WC_ID, LOCAL_RELPATH, OP_DEPTH) from the NODES
+ * table in SDB.  If there is no such row in NODES, do nothing. */
+static svn_error_t *
+delete_nodes_row(svn_sqlite__db_t *sdb,
+                 apr_int64_t wc_id,
+                 const char *local_relpath,
+                 apr_int64_t op_depth)
+{
+  svn_sqlite__stmt_t *stmt;
+
+  SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, STMT_DELETE_NODES_ROW));
+  SVN_ERR(svn_sqlite__bindf(stmt, "isi", wc_id, local_relpath, op_depth));
+  SVN_ERR(svn_sqlite__step_done(stmt));
+
+  return SVN_NO_ERROR;
+}
+
 
 /* */
 static svn_error_t *
@@ -835,6 +852,12 @@ insert_base_node(void *baton, svn_sqlite
   /* ### we can't handle this right now  */
   SVN_ERR_ASSERT(pibb->conflict == NULL);
 
+  /* First, delete any existing row.  We do this explicitly because if we
+   * were to use an "INSERT OR REPLACE" statement SQLite would not fire the
+   * triggers for the deletion part of the replacement.  */
+  SVN_ERR(delete_nodes_row(sdb, pibb->wc_id, pibb->local_relpath,
+                           (apr_int64_t)0 /* op_depth for base */));
+
   SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, STMT_INSERT_NODE));
   SVN_ERR(svn_sqlite__bindf(stmt, "isisisr"
                             "tstr"               /* 8 - 11 */
@@ -856,11 +879,12 @@ insert_base_node(void *baton, svn_sqlite
                             (pibb->kind == svn_wc__db_kind_symlink) ?
                                 pibb->target : NULL)); /* 19 */
 
-  if (pibb->kind == svn_wc__db_kind_file) {
-    SVN_ERR(svn_sqlite__bind_checksum(stmt, 14, pibb->checksum, scratch_pool));
-    if (pibb->translated_size != SVN_INVALID_FILESIZE)
-      SVN_ERR(svn_sqlite__bind_int64(stmt, 16, pibb->translated_size));
-  }
+  if (pibb->kind == svn_wc__db_kind_file)
+    {
+      SVN_ERR(svn_sqlite__bind_checksum(stmt, 14, pibb->checksum, scratch_pool));
+      if (pibb->translated_size != SVN_INVALID_FILESIZE)
+        SVN_ERR(svn_sqlite__bind_int64(stmt, 16, pibb->translated_size));
+    }
 
   SVN_ERR(svn_sqlite__bind_properties(stmt, 15, pibb->props,
                                       scratch_pool));
@@ -933,6 +957,16 @@ insert_incomplete_children(svn_sqlite__d
     {
       const char *name = APR_ARRAY_IDX(children, i, const char *);
 
+      /* First, delete any existing row.  We do this explicitly because if we
+       * were to use an "INSERT OR REPLACE" statement SQLite would not fire the
+       * triggers for the deletion part of the replacement.  */
+      /* ### Why should we ever need to overwrite an existing row here?
+       * This happens in op-depth-test 14,15 for example. */
+      SVN_ERR(delete_nodes_row(sdb, wc_id,
+                               svn_relpath_join(local_relpath, name,
+                                                scratch_pool),
+                               op_depth));
+
       SVN_ERR(svn_sqlite__bindf(stmt, "isisnnrsns",
                                 wc_id,
                                 svn_relpath_join(local_relpath, name,
@@ -973,6 +1007,12 @@ insert_working_node(void *baton,
   SVN_ERR_ASSERT(*piwb->local_relpath != '\0');
   parent_relpath = svn_relpath_dirname(piwb->local_relpath, scratch_pool);
 
+  /* First, delete any existing row.  We do this explicitly because if we
+   * were to use an "INSERT OR REPLACE" statement SQLite would not fire the
+   * triggers for the deletion part of the replacement.  */
+  SVN_ERR(delete_nodes_row(sdb, piwb->wc_id, piwb->local_relpath,
+                           piwb->op_depth));
+
   SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, STMT_INSERT_NODE));
   SVN_ERR(svn_sqlite__bindf(stmt, "isisnnntstrisn"
                 "nnnn" /* properties translated_size last_mod_time dav_cache */
@@ -2584,11 +2624,104 @@ svn_wc__db_pristine_remove(svn_wc__db_t 
 }
 
 
+#ifdef SVN_DEBUG
+/* Set *ACTUAL_REFS to the number of references to SHA1_CHECKSUM. */
 static svn_error_t *
-pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot,
-                        apr_pool_t *scratch_pool)
+count_pristine_references(int *actual_refs,
+                          svn_sqlite__db_t *sdb,
+                          const svn_checksum_t *sha1_checksum,
+                          apr_pool_t *scratch_pool)
+{
+  svn_sqlite__stmt_t *stmt;
+
+  *actual_refs = 0;
+
+  SVN_ERR(svn_sqlite__get_statement(&stmt, sdb,
+                                    STMT_SELECT_ALL_PRISTINE_REFERENCES));
+  SVN_ERR(svn_sqlite__bind_checksum(stmt, 1, sha1_checksum, scratch_pool));
+  while (1)
+    {
+      svn_boolean_t have_row;
+
+      SVN_ERR(svn_sqlite__step(&have_row, stmt));
+      if (! have_row)
+        break;
+
+      (*actual_refs)++;
+    }
+  SVN_ERR(svn_sqlite__reset(stmt));
+  return SVN_NO_ERROR;
+}
+#endif
+
+svn_error_t *
+svn_wc__db_pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot,
+                                   apr_pool_t *scratch_pool)
 {
   svn_sqlite__stmt_t *stmt;
+  svn_boolean_t done_one = FALSE;
+
+#ifdef SVN_DEBUG
+  /* For each pristine text: check its refcount matches its actual references. */
+  SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb,
+                                    STMT_SELECT_ALL_PRISTINES));
+  while (1)
+    {
+      svn_boolean_t have_row;
+      const svn_checksum_t *sha1_checksum;
+      int refcount, actual_refs;
+      svn_sqlite__stmt_t *stmt2;
+
+      SVN_ERR(svn_sqlite__step(&have_row, stmt));
+      if (! have_row)
+        break;
+
+      SVN_ERR(svn_sqlite__column_checksum(&sha1_checksum, stmt, 0,
+                                          scratch_pool));
+      refcount = svn_sqlite__column_int(stmt, 1);
+
+      /* Count the actual references to this checksum. */
+      SVN_ERR(count_pristine_references(&actual_refs, wcroot->sdb,
+                                        sha1_checksum, scratch_pool));
+
+      if (refcount == actual_refs)
+        continue;
+
+      if (! done_one)
+        {
+          SVN_DBG(("unused/miscounted pristines in '%s'\n", wcroot->abspath));
+          done_one = TRUE;
+        }
+      SVN_DBG(("%s: refcount %d != actual_refs %d\n",
+               svn_checksum_to_cstring_display(sha1_checksum, scratch_pool),
+               refcount, actual_refs));
+
+      /* Show the actual references to this checksum. */
+      SVN_ERR(svn_sqlite__get_statement(&stmt2, wcroot->sdb,
+                                        STMT_SELECT_ALL_PRISTINE_REFERENCES));
+      SVN_ERR(svn_sqlite__bind_checksum(stmt2, 1, sha1_checksum, scratch_pool));
+      while (1)
+        {
+          svn_boolean_t have_row2;
+          const char *local_relpath;
+          int op_depth;
+
+          SVN_ERR(svn_sqlite__step(&have_row2, stmt2));
+          if (! have_row2)
+            break;
+
+          local_relpath = svn_sqlite__column_text(stmt2, 1, scratch_pool);
+          op_depth = svn_sqlite__column_int(stmt2, 2);
+          SVN_DBG(("pristine ref: %s, op_depth %d, '%s'\n",
+                   svn_checksum_to_cstring_display(sha1_checksum, scratch_pool),
+                   op_depth, local_relpath));
+        }
+      SVN_ERR(svn_sqlite__reset(stmt2));
+
+      SVN_ERR_ASSERT(!"pristine refcount mismatch");
+    }
+  SVN_ERR(svn_sqlite__reset(stmt));
+#endif
 
   /* Find each unreferenced pristine in the DB and remove it. */
   SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb,
@@ -2604,6 +2737,13 @@ pristine_cleanup_wcroot(svn_wc__db_wcroo
 
       SVN_ERR(svn_sqlite__column_checksum(&sha1_checksum, stmt, 0,
                                           scratch_pool));
+      if (! done_one)
+        {
+          SVN_DBG(("removing unused pristines in '%s'\n", wcroot->abspath));
+          done_one = TRUE;
+        }
+      SVN_DBG(("removing unused pristine %s\n",
+              svn_checksum_to_cstring_display(sha1_checksum, scratch_pool)));
       SVN_ERR(pristine_remove(wcroot, sha1_checksum, scratch_pool));
     }
   SVN_ERR(svn_sqlite__reset(stmt));
@@ -2627,7 +2767,7 @@ svn_wc__db_pristine_cleanup(svn_wc__db_t
                               scratch_pool, scratch_pool));
   VERIFY_USABLE_PDH(pdh);
 
-  SVN_ERR(pristine_cleanup_wcroot(pdh->wcroot, scratch_pool));
+  SVN_ERR(svn_wc__db_pristine_cleanup_wcroot(pdh->wcroot, scratch_pool));
 
   return SVN_NO_ERROR;
 }
@@ -6007,6 +6147,14 @@ commit_node(void *baton, svn_sqlite__db_
   /* ### other presences? or reserve that for separate functions?  */
   new_presence = svn_wc__db_status_normal;
 
+  /* First, delete any existing row.  We do this explicitly because if we
+   * were to use an "INSERT OR REPLACE" statement SQLite would not fire the
+   * triggers for the deletion part of the replacement.  */
+  SVN_ERR(delete_nodes_row(cb->pdh->wcroot->sdb,
+                           cb->pdh->wcroot->wc_id, cb->local_relpath,
+                           (apr_int64_t)0 /* op_depth for BASE */));
+
+  /* ### Change stmt to fail if row exists? On failure: delete row, dec refcount, insert new row. */
   SVN_ERR(svn_sqlite__get_statement(&stmt, cb->pdh->wcroot->sdb,
                                     STMT_APPLY_CHANGES_TO_BASE_NODE));
   /* symlink_target not yet used */
@@ -9067,6 +9215,12 @@ set_new_dir_to_incomplete_txn(void *bato
   SVN_ERR(create_repos_id(&repos_id, dtb->repos_root_url, dtb->repos_uuid,
                           sdb, scratch_pool));
 
+  /* ### Why does this function need to overwrite an existing base row?  It
+   * should be putting a base row underneath a locally-added row. */
+  SVN_ERR(delete_nodes_row(dtb->pdh->wcroot->sdb,
+                           dtb->pdh->wcroot->wc_id, dtb->local_relpath,
+                           (apr_int64_t)0 /* op_depth for BASE */));
+
   SVN_ERR(svn_sqlite__get_statement(&stmt, dtb->pdh->wcroot->sdb,
                                     STMT_INSERT_NODE));
 
Index: subversion/libsvn_wc/wc_db_pdh.c
===================================================================
--- subversion/libsvn_wc/wc_db_pdh.c	(revision 1056311)
+++ subversion/libsvn_wc/wc_db_pdh.c	(working copy)
@@ -26,6 +26,7 @@
 #include <assert.h>
 
 #include "svn_dirent_uri.h"
+#include "svn_pools.h"
 
 #include "wc.h"
 #include "adm_files.h"
@@ -114,6 +115,14 @@ close_wcroot(void *data)
 
   SVN_ERR_ASSERT_NO_RETURN(wcroot->sdb != NULL);
 
+#ifdef SVN_WC__PRISTINE_CLEANUP
+  {
+    apr_pool_t *scratch_pool = svn_pool_create(NULL);
+    svn_error_clear(svn_wc__db_pristine_cleanup_wcroot(wcroot, scratch_pool));
+    svn_pool_destroy(scratch_pool);
+  }
+#endif
+
   err = svn_sqlite__close(wcroot->sdb);
   wcroot->sdb = NULL;
   if (err)
Index: subversion/libsvn_wc/wc_db_private.h
===================================================================
--- subversion/libsvn_wc/wc_db_private.h	(revision 1056311)
+++ subversion/libsvn_wc/wc_db_private.h	(working copy)
@@ -194,5 +194,10 @@ svn_wc__db_util_open_db(svn_sqlite__db_t
                         apr_pool_t *result_pool,
                         apr_pool_t *scratch_pool);
 
+/* Remove all unreferenced pristines in WCROOT. */
+svn_error_t *
+svn_wc__db_pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot,
+                                   apr_pool_t *scratch_pool);
+
 
 #endif /* WC_DB_PDH_H */
Index: subversion/libsvn_wc/wc-metadata.sql
===================================================================
--- subversion/libsvn_wc/wc-metadata.sql	(revision 1056311)
+++ subversion/libsvn_wc/wc-metadata.sql	(working copy)
@@ -479,6 +479,32 @@ CREATE TABLE NODES (
 
 CREATE INDEX I_NODES_PARENT ON NODES (wc_id, parent_relpath, op_depth);
 
+CREATE TRIGGER nodes_insert_trigger
+AFTER INSERT ON nodes
+/* WHEN NEW.checksum IS NOT NULL */
+BEGIN
+  UPDATE pristine SET refcount = refcount + 1
+  WHERE checksum = NEW.checksum;
+END;
+
+CREATE TRIGGER nodes_delete_trigger
+AFTER DELETE ON nodes
+/* WHEN OLD.checksum IS NOT NULL */
+BEGIN
+  UPDATE pristine SET refcount = refcount - 1
+  WHERE checksum = OLD.checksum;
+END;
+
+CREATE TRIGGER nodes_update_checksum_trigger
+AFTER UPDATE OF checksum ON nodes
+/* WHEN NEW.checksum IS NOT NULL OR OLD.checksum IS NOT NULL */
+BEGIN
+  UPDATE pristine SET refcount = refcount + 1
+  WHERE checksum = NEW.checksum;
+  UPDATE pristine SET refcount = refcount - 1
+  WHERE checksum = OLD.checksum;
+END;
+
 
 
 /* Format 20 introduces NODES and removes BASE_NODE and WORKING_NODE */
Index: subversion/libsvn_wc/wc-queries.sql
===================================================================
--- subversion/libsvn_wc/wc-queries.sql	(revision 1056311)
+++ subversion/libsvn_wc/wc-queries.sql	(working copy)
@@ -133,7 +133,7 @@ SELECT id FROM repository WHERE root = ?
 INSERT INTO repository (root, uuid) VALUES (?1, ?2);
 
 -- STMT_INSERT_NODE
-INSERT OR REPLACE INTO nodes (
+INSERT INTO nodes (
   wc_id, local_relpath, op_depth, parent_relpath, repos_id, repos_path,
   revision, presence, depth, kind, changed_revision, changed_date,
   changed_author, checksum, properties, translated_size, last_mod_time,
@@ -296,6 +296,10 @@ WHERE wc_id = ?1 AND local_relpath = ?2
   AND right_checksum IS NULL
   AND left_checksum IS NULL;
 
+-- STMT_DELETE_NODES_ROW
+DELETE FROM nodes
+WHERE wc_id = ?1 AND local_relpath = ?2 AND op_depth = ?3;
+
 -- STMT_DELETE_BASE_NODE
 DELETE FROM nodes
 WHERE wc_id = ?1 AND local_relpath = ?2 AND op_depth = 0;
@@ -411,7 +415,24 @@ DELETE FROM work_queue WHERE id = ?1;
 
 -- STMT_INSERT_PRISTINE
 INSERT OR IGNORE INTO pristine (checksum, md5_checksum, size, refcount)
-VALUES (?1, ?2, ?3, 1);
+VALUES (?1, ?2, ?3, 0);
+
+-- STMT_SELECT_ALL_PRISTINES
+SELECT checksum, refcount
+FROM pristine;
+
+-- STMT_SELECT_ALL_PRISTINE_REFERENCES
+SELECT checksum, local_relpath, op_depth FROM nodes
+  WHERE checksum = ?1 /*OR checksum = ?2*/
+UNION ALL
+SELECT older_checksum, local_relpath, -1 FROM actual_node
+  WHERE older_checksum = ?1 /*OR older_checksum = ?2*/
+UNION ALL
+SELECT left_checksum, local_relpath, -2 FROM actual_node
+  WHERE left_checksum  = ?1 /*OR left_checksum  = ?2*/
+UNION ALL
+SELECT right_checksum, local_relpath, -3 FROM actual_node
+  WHERE right_checksum = ?1 /*OR right_checksum = ?2*/
 
 -- STMT_SELECT_PRISTINE_MD5_CHECKSUM
 SELECT md5_checksum
@@ -504,7 +525,7 @@ WHERE wc_id = ?1 AND local_dir_relpath L
 /* translated_size and last_mod_time are not mentioned here because they will
    be tweaked after the working-file is installed.
    ### what to do about file_external?  */
-INSERT OR REPLACE INTO nodes (
+INSERT INTO nodes (
   wc_id, local_relpath, op_depth, parent_relpath, repos_id, repos_path,
   revision, presence, depth, kind, changed_revision, changed_date,
   changed_author, checksum, properties, dav_cache, symlink_target )

Reply via email to