I have been studying when this implementation fetches pristines. Two concerns about performance in the current implementation:
  1. scanning the whole subtree, calling 'stat' on every file
  2. premature hydrating

Scanning with 'stat'

I'm concerned about the implementation scanning the whole subtree, calling 'stat' on every file to determine whether the file is "changed" (locally modified). This is done in svn_wc__textbase_sync() with its textbase_walk_cb(). It does this scan on every sync, which is twice on every syncing operation such as diff.

Don't we already have an optimised scan for local modifications implemented in the "status" code? Could we re-use this?

Premature Hydrating

The present implementation "hydrates" (fetches missing pristines) every file within the whole subtree the operation targets. This is done by every major client operation calling svn_client__textbase_sync() before and afterwards.

That is pessimistic: the operation may not actually touch all these files if limited in any way such as by

  - depth filtering
  - other filtering (changelist, properties-only, ...)
  - terminating early (e.g. output piped to 'head')

That introduces all the fetching overhead for the given subtree as a latency before the operation shows its results, which for something small at the root of the tree such as "svn diff --depth=empty --properties-only ./" may make a significant usability impact.

Presumably we could add the depth and some other kinds of filtering to the tree walk. But that will always leave terminating early, and possibly other cases, sub-optimal.

I would prefer a solution that defers the hydrating until closer to the moment of demand.

Evgeny, have you looked into these possibilities at all? What are your thoughts about these?

- Julian
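
P.S. To make the "defer until demand" idea concrete, here is a toy sketch. It is purely illustrative Python, not Subversion code: PristineStore, hydrate(), diff_eager() and diff_lazy() are made-up names, and a "fetch" is only a counter increment. It compares hydrating the whole subtree up front against hydrating each file just before it is used, when the consumer stops after three results (roughly what piping to 'head' does).

```python
from itertools import islice


class PristineStore:
    """Toy stand-in for a pristine store; it only counts simulated fetches."""

    def __init__(self, paths):
        self.missing = set(paths)  # pristines not yet present locally
        self.fetches = 0

    def hydrate(self, path):
        # Simulate fetching one missing pristine; already-present ones are free.
        if path in self.missing:
            self.missing.remove(path)
            self.fetches += 1


def diff_eager(store, paths):
    # Up-front approach: hydrate every file in the subtree before any output.
    for p in paths:
        store.hydrate(p)
    for p in paths:
        yield f"diff {p}"


def diff_lazy(store, paths):
    # Deferred approach: hydrate each file only when it is about to be used.
    for p in paths:
        store.hydrate(p)
        yield f"diff {p}"


paths = [f"dir/file{i}" for i in range(1000)]

eager = PristineStore(paths)
first_eager = list(islice(diff_eager(eager, paths), 3))  # consumer stops early

lazy = PristineStore(paths)
first_lazy = list(islice(diff_lazy(lazy, paths), 3))

print(eager.fetches, lazy.fetches)
```

Both variants produce the same first three results, but the eager one pays for every fetch in the subtree before showing anything, while the deferred one pays only for what was actually consumed. How to get an equivalent effect inside the real client operations is of course the hard part.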