On 10.12.2015 18:23, Ivan Zhakov wrote:
On 10 December 2015 at 20:20, Julian Foad <julianf...@apache.org> wrote:
  Ivan Zhakov wrote:
On 10 December 2015 at 19:14, Julian Foad <julianf...@apache.org> wrote:
APR devs, Subversion devs:

On Subversion's Mac OS buildbots it appears that apr_hash_overlay()
sometimes returns a hash containing duplicate keys, which (as I
understand it) should be impossible.

We had an issue where some 'svnmover' tests were failing only on Mac
OS buildbots. I added some debugging in Subversion commits r1719056,
r1719067, r1719072, r1719074.

Buildbot result:
     https://ci.apache.org/builders/svn-x64-macosx-bdb/builds/485/steps/Test%20ra_svn%2Bbdb
     --> debug output in 'faillog' shows duplicate keys in hash:
        "union_children={A, iota, foo, boozle, boozle, iota}"

I replaced apr_hash_overlay() with my own simple re-implementation:

     http://svn.apache.org/r1719089 -- re-implement hash overlay
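
Roughly along these lines (a sketch of the approach only, not necessarily
the exact code in r1719089): build a fresh hash with the default hash
function and re-insert every key, so nothing depends on the two inputs
sharing a hash function.

    #include <apr_hash.h>
    #include <apr_pools.h>

    /* Overlay 'overlay' onto 'base' without assuming that both hashes
     * were created with the same hash function: build a fresh hash
     * (default hash function) and re-insert every key.  Entries from
     * 'overlay' win on duplicate keys because they are inserted last. */
    static apr_hash_t *
    my_hash_overlay(apr_pool_t *pool,
                    apr_hash_t *overlay,
                    apr_hash_t *base)
    {
      apr_hash_t *result = apr_hash_make(pool);
      apr_hash_index_t *hi;

      for (hi = apr_hash_first(pool, base); hi; hi = apr_hash_next(hi))
        {
          const void *key;
          apr_ssize_t klen;
          void *val;

          apr_hash_this(hi, &key, &klen, &val);
          apr_hash_set(result, key, klen, val);
        }

      for (hi = apr_hash_first(pool, overlay); hi; hi = apr_hash_next(hi))
        {
          const void *key;
          apr_ssize_t klen;
          void *val;

          apr_hash_this(hi, &key, &klen, &val);
          apr_hash_set(result, key, klen, val);
        }

      return result;
    }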

Hi Julian,

That can happen if the two hashes use different hash functions. This
is possible if you're using svn_hash__make() directly or indirectly:
for example, the RA get_dirent API for the svn:// protocol returns a
hash with a non-standard hash function.
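
To illustrate the failure mode with a made-up example (my_hashfunc is
invented here; it merely stands in for the non-default function that
svn_hash__make() installs via apr_hash_make_custom()):

    #include <string.h>
    #include <apr_hash.h>
    #include <apr_pools.h>

    /* Hypothetical non-default hash function for the illustration. */
    static unsigned int
    my_hashfunc(const char *key, apr_ssize_t *klen)
    {
      unsigned int hash = 0;
      apr_ssize_t i;

      if (*klen == APR_HASH_KEY_STRING)
        *klen = (apr_ssize_t)strlen(key);
      for (i = 0; i < *klen; i++)
        hash = hash * 33 + (unsigned char)key[i];
      return hash;
    }

    static apr_hash_t *
    mixed_overlay(apr_pool_t *pool)
    {
      apr_hash_t *base    = apr_hash_make(pool);                    /* default hash func */
      apr_hash_t *overlay = apr_hash_make_custom(pool, my_hashfunc); /* custom hash func */

      apr_hash_set(base,    "iota", APR_HASH_KEY_STRING, "old");
      apr_hash_set(overlay, "iota", APR_HASH_KEY_STRING, "new");

      /* apr_hash_overlay() (in the APR versions discussed in this
       * thread) reuses the hash values cached in 'overlay', which were
       * computed by my_hashfunc, while filling buckets laid out for the
       * default hash function, so the key "iota" can end up stored
       * twice in the result. */
      return apr_hash_overlay(pool, overlay, base);
    }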

Ugh. Yes, that is probably the cause. Thanks.

I can see now that the doc string of apr_hash_overlay() does say "Both
hash tables must use the same hash function", but that was not obvious,
and I had totally forgotten that our Subversion code uses more than
one hash function.

That API restriction has been removed in APR 1.4.6.

When APR introduced "randomized" hash functions for
each hash instance, they could no longer simply reuse
the hash values across instances.

I will use my own hash overlay function from now on.

You could add a version check and call the APR
library function for newer APR releases.

I don't think this is the proper fix for this problem. I think
returning a hash with a non-standard hash function from a public API
is a bug (and an API regression).

We always return a standard hash. What is different
is that we don't instantiate it with the *default*
hash function.

Other API users may run into the same situation, so the proper fix
would be to revert these optimizations from the public API, IMO.

It is up to the APR API user to use that API correctly.
For instance, there is no guarantee anywhere that any
two apr_hash_t instances use the same hash function.

Furthermore, it is up to the APR project to keep its API documentation
in sync with the code, so that users don't have to jump through hoops
needlessly.

-- Stefan^2.
