On 21.05.2012 12:38, Julian Foad wrote:
Stefan Fuhrmann wrote:
Julian Foad wrote:
URL: http://svn.apache.org/viewvc?rev=1333326&view=rev
Introduce private API functions that wrap apr_hash_make_custom
and return hash tables that are 2 to 4 times faster than the
APR default.
Would it be sensible to propose these (the hash-functions) for
inclusion in APR itself?
Certainly. The question would be whether Apache is
meant to run on CPUs without a decent MUL.
I don't understand why that question is relevant.
APR's implementation uses 33 as the multiplier, which
can conveniently be implemented as shift & add.
My code uses factors up to 33^4, where that
optimization / workaround would no longer be
useful. A non-pipelined MUL operation may take
as much as 40 ticks (i386) instead of 2 .. 6 ticks for
shift & add.
I don't know of any popular CPUs that have this
problem but, OTOH, I don't know all the exotic platforms /
embedded devices that Apache is being run on.
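
To illustrate the shift & add point above, here is a minimal C sketch.
It is not the actual APR or Subversion code: the function names are
hypothetical, and the start value of 0 is an assumption made for the
example. The first function multiplies by 33 using only a shift and an
add. The second processes four bytes per iteration and therefore needs
the constants 33^2, 33^3 and 33^4, which a compiler cannot reduce to a
single shift & add, so it relies on a real multiplier.

```c
#include <stddef.h>
#include <string.h>

/* "Times 33" hash, one byte per step.
 * hash * 33 is written as (hash << 5) + hash, i.e. shift & add only. */
static unsigned int
hash_times33(const char *key, size_t len)
{
  const unsigned char *p = (const unsigned char *)key;
  unsigned int hash = 0;   /* assumed start value for this sketch */
  size_t i;

  for (i = 0; i < len; i++)
    hash = (hash << 5) + hash + p[i];

  return hash;
}

/* Same result, but four bytes per iteration.  The factors grow up to
 * 33^4 = 1185921, so each step needs genuine multiplications. */
static unsigned int
hash_times33_unrolled(const char *key, size_t len)
{
  const unsigned char *p = (const unsigned char *)key;
  unsigned int hash = 0;   /* same assumed start value as above */

  for (; len >= 4; len -= 4, p += 4)
    hash = hash * (33u * 33u * 33u * 33u)
         + p[0] * (33u * 33u * 33u)
         + p[1] * (33u * 33u)
         + p[2] * 33u
         + p[3];

  /* Handle the remaining 0 to 3 bytes one at a time. */
  for (; len > 0; len--, p++)
    hash = hash * 33u + *p;

  return hash;
}

/* Returns non-zero if both variants agree on the NUL-terminated string S. */
static int
hashes_agree(const char *s)
{
  size_t len = strlen(s);
  return hash_times33(s, len) == hash_times33_unrolled(s, len);
}
```

Both variants compute the same value because unsigned arithmetic wraps
modulo 2^N, so expanding four "times 33" steps into one expression does
not change the result. The unrolled form has four independent
multiplications per iteration, so it only pays off on CPUs where
multiplication throughput is high, which is the assumption under
discussion here.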
Modified: subversion/trunk/subversion/libsvn_subr/hash.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_subr/hash.c?rev=1333326&r1=1333325&r2=1333326&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_subr/hash.c (original)
+++ subversion/trunk/subversion/libsvn_subr/hash.c Thu May 3 07:16:11 2012
+/*** Optimized hash functions ***/
+
+/* Optimized version of apr_hashfunc_default. It assumes that the CPU has
+ * 32-bit multiplications with high throughput of at least 1 operation
+ * every 3 cycles. Latency is not an issue. Another optimization is a
+ * mildly unrolled main loop.
Such specific details should at least refer to a specific version
of apr_hashfunc_default(). Perhaps also (for the "1 op per 3 cycles"
part, in particular) a specific system architecture or compiler.
r1340601 explains why that is a reasonable assumption.
What's missing is a statement that this is an optimized version of
"apr_hashfunc_default in APR 1.4.5".
Added the version info in r1341271.
-- Stefan^2.