On Mon, Feb 4, 2013 at 7:59 PM, Julian Foad <julianf...@btopenworld.com>wrote:
> Hi Stefan. > > > URL: http://svn.apache.org/viewvc?rev=1442088&view=rev > > Log: > > Improve DAG node / path lookup performance in FSFS. Since this is a hot > > path in many fs_repo-based functions (like fs_verify), 20 .. 30% time > > saved here can easily become significant. > > > > The idea here is that get_dag() already has knowledge about the PATH > > parameter and does not actually use all the result data returned by > > open_path(). So, define a number of flags that allow a caller to tell > > open_path() where it may take short-cuts based on the context. > > > > * subversion/libsvn_fs_fs/tree.c > > (dag_node_cache_set): add comment to document a rationale > > (enum open_path_flags_t): declare further flags > > (open_path): support the new flags to short-circuit portions of code > > (get_dag, > > fs_closest_copy, > > get_mergeinfo_for_path_internal): provide flags to open_path() > > > > + /* When this flag is set, don't bother to lookup the DAG node in > > + our caches because we already tried this. Ignoring this flag > > + has no functional impact. */ > > + open_path_uncached = 2, > > + > > + /* Assume that the path parameter is already in canonical form. */ > > + open_path_is_canonical = 4, > > I don't like this 'is_canonical' flag. In the rest of Subversion, nearly > all APIs assume paths are canonical already, and for a good reason: > re-canonicalizing at a lower level tends to be expensive because it happens > more times and with less knowledge of whether it's needed. I think it > would be better (including faster) to simply require that the path is > canonical, and make sure the callers honour that. > > I see that many of the functions in this file currently appear to be > accepting and dealing with non-canonical paths, but I think the way to go > is to change that. > Turned out not to be too much work to fix all callers. > + /* The caller does not care about the parent node chain but only > > + the final DAG node. */ > > + open_path_node_only = 8 > > } open_path_flags_t; > > > > > > @@ -871,6 +885,10 @@ typedef enum open_path_flags_t { > > callers that create new nodes --- we find the parent directory for > > them, and tell them whether the entry exists already. > > > > + The remaining bits in FLAGS are hints that allow this function > > + to take shortcuts based on knowledge that the caller provides, > > + such as the fact that PATH is already in canonical form. > > Maybe give a different example. > Done. > > NOTE: Public interfaces which only *read* from the filesystem > > should not call this function directly, but should instead use > > get_dag(). > > @@ -884,23 +902,47 @@ open_path(parent_path_t **parent_path_p, > > apr_pool_t *pool) > > { > > svn_fs_t *fs = root->fs; > > - dag_node_t *here; /* The directory we're currently looking at. */ > > - parent_path_t *parent_path; /* The path from HERE up to the root. */ > > + dag_node_t *here = NULL; /* The directory we're currently looking > at. */ > > + parent_path_t *parent_path; /* The path from HERE up to the root. */ > > const char *rest; /* The portion of PATH we haven't traversed yet. */ > > - const char *canon_path = svn_fs__is_canonical_abspath(path) > > + > > + /* ensure a canonical path representation */ > > + const char *canon_path = ( (flags & open_path_is_canonical) > > + || svn_fs__is_canonical_abspath(path)) > > ? path > > : svn_fs__canonicalize_abspath(path, pool); > > Here you seem also to have decided that when canonicalizing, it's more > efficient to test with svn_fs__is_canonical_abspath() before calling > svn_fs__canonicalize_abspath(). Are you sure that is true? If it is true > here (and I see you already have the same pattern in one of the callers), > then perhaps it is true in general, so why not do it inside > svn_fs__canonicalize_abspath() so all callers benefit? > It's now all inside svn_fs__canonicalize_abspath, which has now subtly changed semantics. But all existing callers should be fine with that. > @@ -1142,7 +1195,9 @@ get_dag(dag_node_t **dag_node_p, > > { > > /* Call open_path with no flags, as we want this to return an > > * error if the node for which we are searching doesn't > exist. */ > > The comment was falsified by your change and needs updating. (There are > similar comments on open_path() calls elsewhere in the file, which probably > should be updated in the same way even though those calls are still passing > no flags.) > > > - SVN_ERR(open_path(&parent_path, root, path, 0, NULL, pool)); > > + SVN_ERR(open_path(&parent_path, root, path, > > + open_path_uncached | open_path_is_canonical > > + | open_path_node_only, NULL, pool)); > > > (This is the only significant place where the 'is_canonical' flag is used.) > > > node = parent_path->node; > > > > /* No need to cache our find -- open_path() will do that for > us. */ > > @@ -3162,7 +3217,7 @@ static svn_error_t *fs_closest_copy(svn_ > > if (kind == svn_node_none) > > return SVN_NO_ERROR; > > SVN_ERR(open_path(©_dst_parent_path, copy_dst_root, path, > > - 0, NULL, pool)); > > + open_path_node_only, NULL, pool)); > > copy_dst_node = copy_dst_parent_path->node; > > if (! > svn_fs_fs__id_check_related(svn_fs_fs__dag_get_id(copy_dst_node), > > > > svn_fs_fs__dag_get_id(parent_path->node))) > > @@ -3775,7 +3830,8 @@ get_mergeinfo_for_path_internal(svn_merg > > > > path = svn_fs__canonicalize_abspath(path, scratch_pool); > > > > - SVN_ERR(open_path(&parent_path, rev_root, path, 0, NULL, > scratch_pool)); > > + SVN_ERR(open_path(&parent_path, rev_root, path, > open_path_is_canonical, > > + NULL, scratch_pool)); > > > Here you've left a plain 'canonicalize' in the caller, even though you > appear to think that the version inside open_path() is more efficient. > (The result of this 'canonicalize' here is *only* passed to open_path().) > > > > if (inherit == svn_mergeinfo_nearest_ancestor && ! > > parent_path->parent) > > return SVN_NO_ERROR; > > > - Julian > > Changes committed in r1448810 with a follow-up in r1448820. Thanks for the review! -- Stefan^2. -- Certified & Supported Apache Subversion Downloads: * http://www.wandisco.com/subversion/download *