Johan Corveleyn wrote on Fri, 18 Oct 2024 14:33 +00:00:
> Haven't looked into the code yet, but this might be a nice
> "bite-sized" issue to investigate, if someone has time and inclination
> to do so:
>
> Seen on a FSFS f8 repository (on Linux) with a 200 MB revision:
>
> Apparently 'svnlook changed -r $REV' gets slow if $REV is large (as in
> "lots of MB", not in amount of changed files -- it can be seen on a
> commit with a single large file). OTOH, 'svnlook author' or 'svnlook
> filesize' of that same revision+file will be almost instant.

svnlook.c ends up calling svn_repos_replay2(send_deltas=TRUE).

Try changing TRUE to FALSE for the 'changed' and 'dirs-changed' callers.

> Interestingly 'svn log -v -r $REV $URL' is much faster (not dependent
> on the revision size), and shows the same information as 'svnlook
> changed -r $REV'.

It should end up calling svn_fs_paths_changed3(), which, in turn, should
simply read out the list of changed paths in the revision file (without
actually recursing down any tree or DAG).

>                   This to me is a clear indication that 'svnlook
> changed' is doing something sub-optimal, making it perform O(revsize)
> instead of O(1).

Exactly so.  (Though it'll be O(#paths), not O(1).)
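
To confirm the difference on a given repository, a rough timing sketch
along these lines might help.  It only wraps the two commands quoted
above; the helper names (time_command, build_commands, compare) are made
up for illustration, and the repository path, URL and revision are
whatever applies locally.

```python
import subprocess
import time


def time_command(argv):
    """Run argv, discard its stdout, and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


def build_commands(repos_path, repos_url, rev):
    """Return the two command lines being compared in this thread."""
    return {
        "svnlook changed": ["svnlook", "changed", "-r", str(rev), repos_path],
        "svn log -v": ["svn", "log", "-v", "-r", str(rev), repos_url],
    }


def compare(repos_path, repos_url, rev):
    """Time both commands once each and return {label: seconds}."""
    commands = build_commands(repos_path, repos_url, rev)
    return {label: time_command(argv) for label, argv in commands.items()}
```

Calling compare() with the on-disk repository path, its URL and the
large revision's number should show whether 'svnlook changed' scales
with the revision size while 'svn log -v' does not.  A single run is
noisy (disk cache, network for non-file:// URLs), so repeat it a few
times before drawing conclusions.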

Beware of the svn_fs_contents_changed() v. svn_fs_contents_different()
distinction; it might matter here.  («svn cat foo@HEAD | svnmucc -U … -r … -- put - foo»
is probably the easiest way to create such commits.)

> Note: 'svnlook changed' is often used in hook scripts. In a pre-commit
> hook one would run 'svnlook changed -t $TXN' on the transaction if one
> would like to validate some things -- at that point 'svn log -v'
> cannot be used as an alternative because there is no revision yet. I
> would hope that any possible improvement would also apply to
> transactions as well as to revisions.

I _think_ improvements will indeed apply to both cases.
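
For reference, the hook-script usage in question usually looks something
like the sketch below.  It is only an illustration: parse_changed and
changed_in_txn are hypothetical helper names, and the parser assumes the
usual layout of 'svnlook changed' output, in which the status flags
occupy the first four columns and the path follows.

```python
import subprocess


def parse_changed(output):
    """Parse 'svnlook changed' output into (status, path) tuples.

    Assumes the status flags sit in the first four columns of each line
    and the path starts at the fifth.
    """
    changes = []
    for line in output.splitlines():
        if not line.strip():
            continue
        changes.append((line[:4].strip(), line[4:]))
    return changes


def changed_in_txn(repos_path, txn):
    """Return the changes of a pending transaction, e.g. in pre-commit."""
    result = subprocess.run(
        ["svnlook", "changed", "-t", txn, repos_path],
        stdout=subprocess.PIPE,
        check=True,
        text=True,
    )
    return parse_changed(result.stdout)
```

A pre-commit hook would call changed_in_txn() with the repository path
and transaction name it receives as arguments, then reject the commit
based on the returned paths.  Any speedup to 'svnlook changed' would
directly shorten the time such a hook adds to every commit.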

HTH,

Daniel
