On Tue, Jun 14, 2011 at 11:45 AM, Paul Burba <ptbu...@gmail.com> wrote:
> On Tue, Jun 14, 2011 at 3:41 AM, Philip Martin
> <philip.mar...@wandisco.com> wrote:
>> Paul Burba <ptbu...@gmail.com> writes:
>>
>>> No surprise that your checkouts are faster than mine given you are
>>> using a local mirror.  What's puzzling me is why my upgrades are so
>>> much slower than yours.
>>>
>>> Running an upgrade of a trunk WC on my machine under xperf takes
>>> 00:03:46.351 elapsed and 11.44s CPU time using my primary drive (320
>>> GB, SATA-II, 7200 rpm, 16 MB Cache, NTFS).  Subversion spends 50s
>>> total disk service time (46.8s of that is read service time).
>>
>> Does "total disk service time" represent all the time waiting for disk
>> IO?  11s CPU plus 50s disk still leaves 165s unaccounted.
>
> I've only just started using xperf (it's part of the Windows
> Performance Toolkit) and can't find an exact definition of their
> terminology.  I can say that xperf adds lot of overhead since upgrades
> run under it reliably take about twice as long.
>
> The 165s of unaccounted time is largely made up of 143s of disk
> service time by the system process...but now that I dig a little
> further I find that the system process' disk utilization is touching
> every directory in the working copy as well as writing the new
> .svn\pristine\ stuffs.  Previously I was only looking at the disk
> utilization by the svn.exe process itself.
>
> ~~~~~
>
> Not sure what to make of this: I tried an upgrade of a trunk WC using
> trunk@1135642 on my HDD and then my SSD (on the same box)
>
> Elapsed time for the HDD: 00:01:12.450
> For the SSD: 00:00:09.267
>
> Not sure if this points to anything other than SSDs are faster (duh!),
> but the degree of improvement was a unexpected.
>
>> Upgrade
>> prints a notification line for each directory, is there a significant
>> delay before the first line, or after the last line, or is the delay
>> spread among the lines?
>
> In all my testing I used the --quiet option.  I just now tried it
> without and the delay is definitely spread among the output.
>
>> If it is a Subversion problem rather than your machine there are two
>> areas that may be worth investigating (but it's hard to say when most of
>> the time is unaccounted).
>>
>> The property migration currently invokes multiple transactions
>> per-directory.  Moving to a single transaction per-directory will
>> probably help.
>>
>> Upgrade currently copies all the text-bases, I did experiment with
>> creating workqueue items to move them instead but it wasn't any faster
>> on my Linux box.  However it may help on Windows.
>
> A bit of cut-and-paste from IRC this morning for the benefit of others
> reading this thread:
>
> <philipm> Bert, pburba: I suppose we could do the whole upgrade as a
> single transaction.
> <philipm> The database will never be accessed by more than one
> thread/process during upgrade.
> <philipm> And "partial" upgrades are already thrown away.
>
> <Bert> philipm: Agree
>
> <hwright> philipm: we have nested txns in sqlite, so you should be
> able to just wrap the entire upgrade process in a txn
>
> <Bert> philipm: Last week I tried to reduce the number of stats in the
> property load code by just fetching all the entries in the property
> dirs, but the perf difference was not measurable. I think making it a
> single transaction would have the most impact.
>
> <pburba> philipm, Bert, hwright : Have we done anything similar to
> that already (wrapping the whole upgrade in a single transaction)? I'd
> like to tackle it, but looking at similar work to provide some
> understanding/traction would be very helpful
>
> <hwright> pburba: we don't expose txns outside of wc_db
>
> <philipm> pburba: What you want is upgrade_working_copy as a callback
> from svn_wc__db_with_txn
>
> <hwright> I'm not sure how much of the work is done in wc_db (and how
> much without), but that's the first thing to consider
>
> <philipm> We would have to move lots of code, or expose the txn to
> upgrade.c in some form.
>
> <hwright> upgrade is such a special case, it makes sense to
> potentially expose it in a limited fashion
>
> <Bert> svn_sqlite__with_lock()
>
> <philipm> upgrade.c already access the sqlite db directly
>
> <hwright> philipm: in that case, just do the txn in upgrade.c, and
> wc_db.c should be fine due to txn nesting, yes?
>
> <philipm> Yes.  Putting svn_sqlite__with_lock around
> upgrade_working_copy in upgrade.c will probably be OK
>
> <hwright> (if it isn't, you'll know pretty quick. :) )
>
> <philipm> Upgrade uses a txn to write entries, per-dir.
> <philipm> But if the whole upgrade is a txn that could be removed,
> it's not used for anything other than upgrade.
> <philipm> With nested txns that will not be necessary, but avoiding
> nested txns may have a performance benefit?
> <philipm> I don't know.  For a first attempt just put
> upgrade_working_copy in svn_sqlite__with_lock.
>
> <pburba> philipm: Ok, I'll give that a try then.

Here's a stab at it.  In my ad-hoc testing, the attached patch
improved the elapsed upgrade time of a ^/subversion/trunk WC from
00:02:00 to 00:01:34 (3 runs).  It improved an upgrade of a
^/subversion/tags/ebcdic WC from 00:11:44 to 00:06:52 (single run).

[[[
Improve 'svn upgrade' performance by moving more of the upgrade process into
a single sqlite transaction.

* subversion/libsvn_wc/entries.c

  (entries_write_baton,
   entries_write_new_cb): Remove.

  (svn_wc__write_upgraded_entries): Don't run this part of the upgrade
   operation in a dedicated sqlite transaction, but rather embed it in a
   larger transaction.  Also use the existing iterpool when creating
   tmp_entry_abspath argument to write_entry.

* subversion/libsvn_wc/upgrade.c

  (upgrade_to_wcng): Don't create the new DB here, move that to svn_wc_upgrade.

  (upgrade_working_copy_baton_t,
   upgrade_working_copy_txn): New baton and transaction wrapper for
   upgrade_working_copy.

  (svn_wc_upgrade): Create the new wcng DB here and wrap the call to
   upgrade_working_copy() in a sqlite transaction.
]]]

Paul
Index: subversion/libsvn_wc/entries.c
===================================================================
--- subversion/libsvn_wc/entries.c      (revision 1135795)
+++ subversion/libsvn_wc/entries.c      (working copy)
@@ -2199,38 +2199,29 @@
   return SVN_NO_ERROR;
 }
 
-struct entries_write_baton
+svn_error_t *
+svn_wc__write_upgraded_entries(void **dir_baton,
+                               void *parent_baton,
+                               svn_wc__db_t *db,
+                               svn_sqlite__db_t *sdb,
+                               apr_int64_t repos_id,
+                               apr_int64_t wc_id,
+                               const char *dir_abspath,
+                               const char *new_root_abspath,
+                               apr_hash_t *entries,
+                               apr_hash_t *text_bases_info,
+                               apr_pool_t *result_pool,
+                               apr_pool_t *scratch_pool)
 {
-  svn_wc__db_t *db;
-  apr_int64_t repos_id;
-  apr_int64_t wc_id;
-  const char *dir_abspath;
-  const char *new_root_abspath;
-  apr_hash_t *entries;
-  apr_hash_t *text_bases_info;
-  struct write_baton *parent_node;
-  struct write_baton *dir_node;
-  apr_pool_t *result_pool;
-};
-
-/* Writes entries inside a sqlite transaction
-   Implements svn_sqlite__transaction_callback_t. */
-static svn_error_t *
-entries_write_new_cb(void *baton,
-                     svn_sqlite__db_t *sdb,
-                     apr_pool_t *scratch_pool)
-{
-  struct entries_write_baton *ewb = baton;
-  svn_wc__db_t *db = ewb->db;
-  const char *dir_abspath = ewb->dir_abspath;
-  const char *new_root_abspath = ewb->new_root_abspath;
   const svn_wc_entry_t *this_dir;
   apr_hash_index_t *hi;
   apr_pool_t *iterpool = svn_pool_create(scratch_pool);
   const char *old_root_abspath, *dir_relpath;
+  struct write_baton *parent_node = parent_baton;
+  struct write_baton *dir_node;
 
   /* Get a copy of the "this dir" entry for comparison purposes. */
-  this_dir = apr_hash_get(ewb->entries, SVN_WC_ENTRY_THIS_DIR,
+  this_dir = apr_hash_get(entries, SVN_WC_ENTRY_THIS_DIR,
                           APR_HASH_KEY_STRING);
 
   /* If there is no "this dir" entry, something is wrong. */
@@ -2248,21 +2239,21 @@
   dir_relpath = svn_dirent_skip_ancestor(old_root_abspath, dir_abspath);
 
   /* Write out "this dir" */
-  SVN_ERR(write_entry(&ewb->dir_node, ewb->parent_node, db, sdb,
-                      ewb->wc_id, ewb->repos_id, this_dir, NULL, dir_relpath,
+  SVN_ERR(write_entry(&dir_node, parent_node, db, sdb,
+                      wc_id, repos_id, this_dir, NULL, dir_relpath,
                       svn_dirent_join(new_root_abspath, dir_relpath,
-                                      scratch_pool),
+                                      iterpool),
                       old_root_abspath,
-                      this_dir, FALSE, ewb->result_pool, iterpool));
+                      this_dir, FALSE, result_pool, iterpool));
 
-  for (hi = apr_hash_first(scratch_pool, ewb->entries); hi;
+  for (hi = apr_hash_first(scratch_pool, entries); hi;
        hi = apr_hash_next(hi))
     {
       const char *name = svn__apr_hash_index_key(hi);
       const svn_wc_entry_t *this_entry = svn__apr_hash_index_val(hi);
       const char *child_abspath, *child_relpath;
       svn_wc__text_base_info_t *text_base_info
-        = apr_hash_get(ewb->text_bases_info, name, APR_HASH_KEY_STRING);
+        = apr_hash_get(text_bases_info, name, APR_HASH_KEY_STRING);
 
       svn_pool_clear(iterpool);
 
@@ -2274,61 +2265,25 @@
          use this function for upgrading old working copies. */
       child_abspath = svn_dirent_join(dir_abspath, name, iterpool);
       child_relpath = svn_dirent_skip_ancestor(old_root_abspath, 
child_abspath);
-      SVN_ERR(write_entry(NULL, ewb->dir_node, db, sdb,
-                          ewb->wc_id, ewb->repos_id,
+      SVN_ERR(write_entry(NULL, dir_node, db, sdb,
+                          wc_id, repos_id,
                           this_entry, text_base_info, child_relpath,
                           svn_dirent_join(new_root_abspath, child_relpath,
-                                          scratch_pool),
+                                          iterpool),
                           old_root_abspath,
                           this_dir, TRUE, iterpool, iterpool));
     }
 
-  if (ewb->dir_node->tree_conflicts)
-    SVN_ERR(write_actual_only_entries(ewb->dir_node->tree_conflicts, sdb,
-                                      ewb->wc_id, dir_relpath, iterpool));
+  if (dir_node->tree_conflicts)
+    SVN_ERR(write_actual_only_entries(dir_node->tree_conflicts, sdb,
+                                      wc_id, dir_relpath, iterpool));
 
+  *dir_baton = dir_node;
   svn_pool_destroy(iterpool);
   return SVN_NO_ERROR;
 }
 
 
-svn_error_t *
-svn_wc__write_upgraded_entries(void **dir_baton,
-                               void *parent_baton,
-                               svn_wc__db_t *db,
-                               svn_sqlite__db_t *sdb,
-                               apr_int64_t repos_id,
-                               apr_int64_t wc_id,
-                               const char *dir_abspath,
-                               const char *new_root_abspath,
-                               apr_hash_t *entries,
-                               apr_hash_t *text_bases_info,
-                               apr_pool_t *result_pool,
-                               apr_pool_t *scratch_pool)
-{
-  struct entries_write_baton ewb;
-
-  ewb.db = db;
-  ewb.repos_id = repos_id;
-  ewb.wc_id = wc_id;
-  ewb.dir_abspath = dir_abspath;
-  ewb.new_root_abspath = new_root_abspath;
-  ewb.entries = entries;
-  ewb.text_bases_info = text_bases_info;
-  ewb.parent_node = parent_baton;
-  ewb.result_pool = result_pool;
-
-  /* Run this operation in a transaction to speed up SQLite.
-     See http://www.sqlite.org/faq.html#q19 for more details */
-  SVN_ERR(svn_sqlite__with_transaction(sdb, entries_write_new_cb, &ewb,
-                                       scratch_pool));
-
-  *dir_baton = ewb.dir_node;
-
-  return SVN_NO_ERROR;
-}
-
-
 svn_wc_entry_t *
 svn_wc_entry_dup(const svn_wc_entry_t *entry, apr_pool_t *pool)
 {
Index: subversion/libsvn_wc/upgrade.c
===================================================================
--- subversion/libsvn_wc/upgrade.c      (revision 1135795)
+++ subversion/libsvn_wc/upgrade.c      (working copy)
@@ -1366,9 +1366,7 @@
    REPOS_CACHE if it doesn't have a cached entry for this
    repository.
 
-   DATA->SDB will be null if this is the root directory, in which case
-   the db must be created and *DATA filled in, otherwise *DATA refer
-   to the single root db.
+   *DATA refers to the single root db.
 
    Uses SCRATCH_POOL for all temporary allocation.  */
 static svn_error_t *
@@ -1452,39 +1450,6 @@
                    apr_pstrdup(hash_pool, this_dir->uuid));
     }
 
-  /* Create the new DB if it hasn't already been created. */
-  if (!data->sdb)
-    {
-      const char *root_adm_abspath;
-
-      /* In root wc construst path to temporary root wc/.svn/tmp/wcng/.svn */
-
-      data->root_abspath = svn_dirent_join(svn_wc__adm_child(dir_abspath, 
"tmp",
-                                                             scratch_pool),
-                                           "wcng", result_pool);
-      root_adm_abspath = svn_wc__adm_child(data->root_abspath, "",
-                                           scratch_pool);
-      SVN_ERR(svn_io_remove_dir2(root_adm_abspath, TRUE, NULL, NULL,
-                                 scratch_pool));
-      SVN_ERR(svn_wc__ensure_directory(root_adm_abspath, scratch_pool));
-
-      /* Create an empty sqlite database for this directory. */
-      SVN_ERR(svn_wc__db_upgrade_begin(&data->sdb,
-                                       &data->repos_id, &data->wc_id,
-                                       data->root_abspath,
-                                       this_dir->repos, this_dir->uuid,
-                                       result_pool, scratch_pool));
-
-      /* Migrate the entries over to the new database.
-         ### We need to think about atomicity here.
-
-         entries_write_new() writes in current format rather than
-         f12. Thus, this function bumps a working copy all the way to
-         current.  */
-      SVN_ERR(svn_wc__db_wclock_obtain(db, data->root_abspath, 0, FALSE,
-                                       scratch_pool));
-    }
-
   old_wcroot_abspath = svn_dirent_get_longest_ancestor(dir_abspath,
                                                        data->root_abspath,
                                                        scratch_pool);
@@ -1833,6 +1798,40 @@
     svn_dirent_local_style(parent_abspath, scratch_pool));
 }
 
+/* Data for upgrade_working_copy_txn(). */
+typedef struct upgrade_working_copy_baton_t
+{
+  svn_wc__db_t *db;
+  const char *dir_abspath;
+  svn_wc_upgrade_get_repos_info_t repos_info_func;
+  void *repos_info_baton;
+  apr_hash_t *repos_cache;
+  struct upgrade_data_t *data;
+  svn_cancel_func_t cancel_func;
+  void *cancel_baton;
+  svn_wc_notify_func2_t notify_func;
+  void *notify_baton;
+  apr_pool_t *result_pool;
+} upgrade_working_copy_baton_t;
+
+
+/* Helper for svn_wc_upgrade. Implements svn_sqlite__transaction_callback_t */
+static svn_error_t *
+upgrade_working_copy_txn(void *baton,
+                         svn_sqlite__db_t *sdb,
+                         apr_pool_t *scratch_pool)
+{
+  upgrade_working_copy_baton_t *b = baton;
+
+  /* Upgrade the pre-wcng into a wcng in a temporary location. */
+  return(upgrade_working_copy(NULL, b->db, b->dir_abspath,
+                              b->repos_info_func, b->repos_info_baton,
+                              b->repos_cache, b->data,
+                              b->cancel_func, b->cancel_baton,
+                              b->notify_func, b->notify_baton,
+                              b->result_pool, scratch_pool));
+}
+
 svn_error_t *
 svn_wc_upgrade(svn_wc_context_t *wc_ctx,
                const char *local_abspath,
@@ -1848,6 +1847,11 @@
   struct upgrade_data_t data = { NULL };
   svn_skel_t *work_item, *work_items = NULL;
   const char *pristine_from, *pristine_to, *db_from, *db_to;
+  apr_hash_t *repos_cache = apr_hash_make(scratch_pool);
+  svn_wc_entry_t *this_dir;
+  apr_hash_t *entries;
+  const char *root_adm_abspath;
+  upgrade_working_copy_baton_t cb_baton;
 
   SVN_ERR(is_old_wcroot(local_abspath, scratch_pool));
 
@@ -1864,14 +1868,65 @@
                           NULL /* ### config */, FALSE, FALSE,
                           scratch_pool, scratch_pool));
 
-  /* Upgrade the pre-wcng into a wcng in a temporary location. */
-  SVN_ERR(upgrade_working_copy(NULL, db, local_abspath,
-                               repos_info_func, repos_info_baton,
-                               apr_hash_make(scratch_pool), &data,
-                               cancel_func, cancel_baton,
-                               notify_func, notify_baton,
-                               scratch_pool, scratch_pool));
+  SVN_ERR(svn_wc__read_entries_old(&entries, local_abspath,
+                                   scratch_pool, scratch_pool));
 
+  this_dir = apr_hash_get(entries, SVN_WC_ENTRY_THIS_DIR,
+                          APR_HASH_KEY_STRING);
+  SVN_ERR(ensure_repos_info(this_dir, local_abspath, repos_info_func,
+                            repos_info_baton, repos_cache,
+                            scratch_pool, scratch_pool));
+
+  /* Cache repos UUID pairs for when a subdir doesn't have this information */
+  if (!apr_hash_get(repos_cache, this_dir->repos, APR_HASH_KEY_STRING))
+    apr_hash_set(repos_cache,
+                 apr_pstrdup(scratch_pool, this_dir->repos),
+                 APR_HASH_KEY_STRING,
+                 apr_pstrdup(scratch_pool, this_dir->uuid));
+
+  /* Create the new DB in the temporary root wc/.svn/tmp/wcng/.svn */
+  data.root_abspath = svn_dirent_join(svn_wc__adm_child(local_abspath, "tmp",
+                                                        scratch_pool),
+                                       "wcng", scratch_pool);
+  root_adm_abspath = svn_wc__adm_child(data.root_abspath, "",
+                                       scratch_pool);
+  SVN_ERR(svn_io_remove_dir2(root_adm_abspath, TRUE, NULL, NULL,
+                             scratch_pool));
+  SVN_ERR(svn_wc__ensure_directory(root_adm_abspath, scratch_pool));
+
+  /* Create an empty sqlite database for this directory. */
+  SVN_ERR(svn_wc__db_upgrade_begin(&data.sdb,
+                                   &data.repos_id, &data.wc_id,
+                                   data.root_abspath,
+                                   this_dir->repos, this_dir->uuid,
+                                   scratch_pool, scratch_pool));
+
+  /* Migrate the entries over to the new database.
+   ### We need to think about atomicity here.
+
+   entries_write_new() writes in current format rather than
+   f12. Thus, this function bumps a working copy all the way to
+   current.  */
+  SVN_ERR(svn_wc__db_wclock_obtain(db, data.root_abspath, 0, FALSE,
+                                   scratch_pool));
+
+  cb_baton.db = db;
+  cb_baton.dir_abspath = local_abspath;
+  cb_baton.repos_info_func = repos_info_func;
+  cb_baton.repos_info_baton = repos_info_baton;
+  cb_baton.repos_cache = repos_cache;
+  cb_baton.data = &data;
+  cb_baton.cancel_func = cancel_func;
+  cb_baton.cancel_baton = cancel_baton;
+  cb_baton.notify_func = notify_func;
+  cb_baton.notify_baton = notify_baton;
+  cb_baton.result_pool = scratch_pool;
+  
+  SVN_ERR(svn_sqlite__with_lock(data.sdb,
+                                upgrade_working_copy_txn,
+                                &cb_baton,
+                                scratch_pool));
+
   /* A workqueue item to move the pristine dir into place */
   pristine_from = svn_wc__adm_child(data.root_abspath, 
PRISTINE_STORAGE_RELPATH,
                                     scratch_pool);

Reply via email to