APR hash order

Philip Martin Tue, 21 Feb 2012 04:33:19 -0800

A recent APR change affects hashes: the default hash function now has a
random component.  This change was made to avoid possible
denial-of-service (DoS) attacks where the user controls input to the
hash.  If the user selects inputs that all collide, then subsequent hash
access becomes O(N) rather than O(1).  The random component makes this
attack harder, although it may not prevent it.


In a number of places Subversion iterates over hashes in hash order, and
so the order is now different.  This affects things like the order of
properties in proplist, the order of children in an update, etc.  Not
only is the order now different from previous releases, but the order
can also vary from run to run.  Most of the time this doesn't matter,
since one hash order is as good as another, but it has caused a few
failures in our regression testsuite where a test depended on a
particular hash order.
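
To make the effect concrete, here is a toy sketch in Python.  It is not
APR's code: the multiply-by-33 hash, the bucket count and the seed
handling are all assumptions chosen for illustration.  It only shows
that walking the buckets of a seeded table yields the same keys in an
order that depends on the seed.

```python
def seeded_hash(key, seed, buckets=16):
    """Toy string hash (illustrative only); the seed is the start value."""
    h = seed
    for byte in key.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h % buckets


def iteration_order(keys, seed, buckets=16):
    """Return the keys in the order a bucket-by-bucket walk visits them."""
    table = [[] for _ in range(buckets)]
    for key in keys:
        table[seeded_hash(key, seed, buckets)].append(key)
    return [key for bucket in table for key in bucket]


if __name__ == "__main__":
    props = ["svn:eol-style", "svn:keywords", "svn:mime-type", "svn:ignore"]
    for seed in (0, 1, 2):
        print(seed, iteration_order(props, seed))
```

A test that compares such output line by line passes only for whichever
seed happened to produce the expected order.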

For things like 'proplist', where the output lines are in an arbitrary
order, using UnorderedOutput is the right fix.
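
The idea behind an order-insensitive comparison can be sketched in a few
lines of Python.  This is a standalone illustration, not the
testsuite's UnorderedOutput implementation; the function name and the
sample lines are hypothetical.

```python
def same_lines_any_order(expected, actual):
    """Compare two lists of output lines, ignoring their order.

    Sorting (rather than converting to sets) keeps duplicate lines
    significant, so a line that appears twice must appear twice in both.
    """
    return sorted(expected) == sorted(actual)


if __name__ == "__main__":
    expected = ["  svn:keywords\n", "  svn:eol-style\n"]
    actual = ["  svn:eol-style\n", "  svn:keywords\n"]
    print(same_lines_any_order(expected, actual))
```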

Here's a typical FAIL in diff_tests.py 14:

--- EXPECTED  
+++ ACTUAL  
@@ -1,9 +1,3 @@
-Index: svn-test-work/working_copies/diff_tests-14/iota
-===================================================================
-@@ -1,3 +1,2 @@
- This is the file 'iota'.
- some rev2 iota text.
--an iota local mod.
 Index: svn-test-work/working_copies/diff_tests-14/A/mu
 ===================================================================
 @@ -0,0 +1 @@
@@ -12,3 +6,9 @@
 ===================================================================
 @@ -1 +0,0 @@
 -Contents of newfile
+Index: svn-test-work/working_copies/diff_tests-14/iota
+===================================================================
+@@ -1,3 +1,2 @@
+ This is the file 'iota'.
+ some rev2 iota text.
+-an iota local mod.

I've used UnorderedOutput here as well.  I suppose a more rigorous fix
would be to write some sort of UnorderedDiff code, but I'm not sure
where or how.
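
One possible shape for such an UnorderedDiff, as a rough Python sketch:
split the diff output into per-file sections at each "Index: " line,
then compare the sections without regard to their order while keeping
the line order inside each section strict.  The function names are
hypothetical and nothing like this exists in the testsuite as far as
this sketch assumes.

```python
def split_per_file(diff_lines):
    """Group diff output into sections, one per 'Index: ' header."""
    sections = []
    for line in diff_lines:
        # Start a new section at each header (and for any leading lines).
        if line.startswith("Index: ") or not sections:
            sections.append([])
        sections[-1].append(line)
    return sections


def same_diff_any_file_order(expected, actual):
    """True if both diffs hold the same per-file sections in any order."""
    def normalise(lines):
        return sorted(tuple(section) for section in split_per_file(lines))
    return normalise(expected) == normalise(actual)
```

Unlike a plain unordered line comparison, this still catches lines that
have moved within a single file's diff.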

The remaining FAILs are mostly of two kinds: differences in XML status
and differences in dumpfile order.  The first can be fixed by writing
run_and_verify_status_xml and parsing the XML as already done in
run_and_verify_log_xml.  It's mostly a matter of deciding how much of
the log XML parsing code can be reused.
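
As a rough sketch of the parsing approach: read the status XML into a
dictionary keyed by path and compare dictionaries, so that entry order
no longer matters.  The function name is hypothetical, and the element
and attribute names used here (entry, path, wc-status) are assumptions
about the 'svn status --xml' layout rather than something taken from
the testsuite.

```python
import xml.etree.ElementTree as ET


def parse_status_xml(xml_text):
    """Map each entry's path to the attributes of its wc-status element."""
    root = ET.fromstring(xml_text)
    result = {}
    for entry in root.iter("entry"):
        wc_status = entry.find("wc-status")
        attrs = dict(wc_status.attrib) if wc_status is not None else {}
        result[entry.get("path")] = attrs
    return result


def same_status(expected_xml, actual_xml):
    """Compare two status documents, ignoring the order of entries."""
    return parse_status_xml(expected_xml) == parse_status_xml(actual_xml)
```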

The dumpfile order is more interesting.  Although we don't specify the
dumpfile order, until now it has been repeatable, at least when using
the same executable/libraries.  I can see that this repeatability is useful
to an administrator.  Rather than fixing the testsuite to ignore
dumpfile order changes perhaps we should remove the random behaviour and
continue to provide repeatable dumpfiles?  This would involve using
apr_hash_make_custom rather than apr_hash_make.  I don't know whether
it's possible to do that just for dump, or whether it would affect other
operations.
