On 31.07.2013 23:21, Stefan Fuhrmann wrote: > On Tue, Jul 30, 2013 at 9:56 PM, Greg Stein <gst...@gmail.com > <mailto:gst...@gmail.com>> wrote: > > It really doesn't make that much of a difference. APR_HASH_KEY_STRING > simply tells the hash function to watch for NUL rather than to > traverse N characters. Because N doesn't have to be tracked, you might > even argue the NUL-test is faster. > > IOW, passing APR_HASH_KEY_STRING does NOT perform a strlen(). So yes: > a lot of this work isn't saving much at all. > > Just look at apr/tables/apr_hash.c:hashfunc_default() > > Cheers, > -g > > (tho we'll leave it to Stefan and Daniel to decide whether to blow > more of their time on this "issue" or just back it all out) > > > Hi all, > > There are a number of different issues to be addressed here. > > * svn_hash_gets / svn_hash_sets have been introduced to > simplify coding. Being used in 600+ places, this surely is > a useful API. We should have kept it private, though. > > * Definitions in svn_private_config.h determine various properties > of SVN's internal code. We must ensure that those take effect > consistently. I see two options: > > - #include that header immediately after the system headers. > r1508225 follows that idea. > - no headers whatsoever, including lib-internal ones, may > depend on anything inside svn_private_config.h. Ever. > > * How relevant is hash performance for SVN performance? > As it turns out, using the faster hash function behind > svn_hash__make gives a 50+% overall speedup in loggy > operations. The using strlen() in the getter is much less > of an issue (~1%). See details below. Tons of tuning on the > caller side changed the usage pattern that it works well with > SVN's own hash function, including its use of strlen(). > > Suggestion: > > * revert svn_hash_gets to simply use APR_HASH_KEY_STRING > * where it is being used, make svn_private_config.h the first #include > > -- Stefan^2. > > > Performance measurements of loggy operations on an FSX > mirror of the FreeBSD repository using svnserve > (-Td --cache-revprops yes -c 0 -M 6000) at localhost. > Optimized build on my old 2.4GHz laptop. > Best total execution times of runs 2 to 4. > > A / B / C > (1) 12.359s / 7.988s / 7.940s > (2) 1.105s / 0.783s / 0.775s > (3) 7.406s / 4.397s / 4.288s
Thanks, this the kind of measurement I wanted to see. > A ... apr_hash_get(ht, key, APR_HASH_KEY_STRING) and > #define svn_hash__make apr_hash_make > B ... apr_hash_get(ht, key, APR_HASH_KEY_STRING) > C ... apr_hash_get(ht, key, strlen(key)) Interestingly enough -- depending on the compiler, strlen of a constant string will already be optimized. So that makes me wonder ... > 1 ... svn-bench null-log svn://localhost/bsd-x/head -v -g -r250000:200000 > 176,481 revisions, 141,842 merged in 607 merges > 936,522 msg lines, 708,473 in merged revisions > 959,510 changes, 703,949 in merged revisions > 2 ... svn-bench null-log svn://localhost/bsd-x > 251,278 revisions > 1,335,928 msg lines > 0 changes > 3 ... svn-bench null-log svn://localhost/bsd-x -v > 251,278 revisions > 1,335,928 msg lines > 2,684,416 changes ... how much the differences between the three measurements is the result of different command-line options ... -- Brane -- Branko Čibej | Director of Subversion WANdisco // Non-Stop Data e. br...@wandisco.com