I noticed that ip_find() calculates the hash bucket for the incoming
fragment using ipfrag_hash_rnd outside the ipfrag_lock.  So it can race
with ipfrag_secret_rebuild() and end up putting a frag in the previous
bucket instead of the new bucket that ipfrag_secret_rebuild() has put
the previous frags in.  The window between dropping the lock and
acquiring the write lock in the allocating case could also hit this.
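
To make the interleaving concrete, here is a minimal single-threaded
userspace sketch that replays it step by step.  It is not kernel code:
every race_* name, the 64-bucket table with one entry per bucket, and the
XOR "hash" are made up for illustration, and race_rnd only stands in for
ipfrag_hash_rnd.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RACE_NBUCKETS 64u                    /* hypothetical table size */

static uint32_t race_rnd = 0x10u;            /* stands in for ipfrag_hash_rnd */
static uint32_t race_table[RACE_NBUCKETS];   /* one frag id per bucket, 0 = empty */

/* Toy hash, NOT the kernel's: just mixes the id with the secret. */
static unsigned int race_hash(uint32_t id, uint32_t rnd)
{
    return (id ^ rnd) & (RACE_NBUCKETS - 1);
}

/* Models the rebuild: pick a new secret, move every entry to its new bucket. */
static void race_rebuild(uint32_t new_rnd)
{
    uint32_t old[RACE_NBUCKETS];

    memcpy(old, race_table, sizeof(old));
    memset(race_table, 0, sizeof(race_table));
    race_rnd = new_rnd;
    for (unsigned int i = 0; i < RACE_NBUCKETS; i++)
        if (old[i])
            race_table[race_hash(old[i], race_rnd)] = old[i];
}

/*
 * Replays the bad ordering.  Returns true if a later fragment of the
 * same datagram, hashing with the current secret, misses the entry.
 */
static bool race_replay(uint32_t id, uint32_t new_rnd)
{
    unsigned int stale = race_hash(id, race_rnd);  /* hashed outside the lock */

    race_rebuild(new_rnd);                         /* rebuild slips in here */
    race_table[stale] = id;                        /* insert uses the stale bucket */
    return race_table[race_hash(id, race_rnd)] != id;
}
```

With these toy values, race_replay(5, 0x20) hashes to bucket 21 under the
old secret while the next lookup goes to bucket 37, so the lookup misses.
That is the lost-frag case that ends in a timeout.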

I verified this by spraying a box with 16k udp writes and turning the
secret interval way down to hz/100.  It didn't take long before it hit.
It resulted in reasm failures due to frag timeouts, as expected.

Before worrying about fixing this I thought I'd sample the crowd.

a) Who cares?  The interval is huge and it'd be a few packets at most.

b) Just calculate the hashes under the lock, we're already doing lots of
work there anyway.

c) Sample the random value when calculating the hash outside the lock
and only recalculate the hash inside the lock if it changes.  (Maybe
with some memory barrier help).

d) Go insane redoing the serialization so that we get rid of this and
other suboptimal behaviour.  (please no :)).

I vote for c) for its compromise between minimal invasiveness and
minimal impact on the unracey case.
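
For reference, option c) might look roughly like the userspace sketch
below.  Everything here is an assumption made for illustration: frag_lock
is a toy spinlock standing in for ipfrag_lock, hash_rnd stands in for
ipfrag_hash_rnd, toy_hash() is not the kernel's hash, and the C11 atomics
only approximate whatever barriers the real code would need.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NBUCKETS 64u                                /* hypothetical, power of two */

static atomic_flag frag_lock = ATOMIC_FLAG_INIT;    /* stands in for ipfrag_lock */
static _Atomic uint32_t hash_rnd = 0x12345678u;     /* stands in for ipfrag_hash_rnd */

static void frag_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&frag_lock, memory_order_acquire))
        ;                                           /* spin until the lock is free */
}

static void frag_lock_release(void)
{
    atomic_flag_clear_explicit(&frag_lock, memory_order_release);
}

/* Toy mixing function, NOT the kernel's hash. */
static unsigned int toy_hash(uint32_t id, uint32_t saddr, uint32_t daddr,
                             uint32_t rnd)
{
    uint32_t h = id ^ saddr ^ (daddr * 0x9e3779b1u) ^ rnd;

    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    return h & (NBUCKETS - 1);
}

/*
 * Hash outside the lock with a sampled secret, then rehash under the
 * lock only if the secret changed.  Returns with frag_lock held.
 */
static unsigned int find_bucket_locked(uint32_t id, uint32_t saddr,
                                       uint32_t daddr)
{
    uint32_t seen = atomic_load_explicit(&hash_rnd, memory_order_acquire);
    unsigned int bucket = toy_hash(id, saddr, daddr, seen);

    frag_lock_acquire();
    uint32_t now = atomic_load_explicit(&hash_rnd, memory_order_relaxed);
    if (now != seen)                                /* rare: a rebuild raced us */
        bucket = toy_hash(id, saddr, daddr, now);
    return bucket;
}

/* Models the rebuild: the secret only ever changes under the lock. */
static void secret_rebuild(uint32_t new_rnd)
{
    frag_lock_acquire();
    atomic_store_explicit(&hash_rnd, new_rnd, memory_order_release);
    /* a real rebuild would also move every queued frag to its new bucket */
    frag_lock_release();
}
```

The common case pays one extra load and compare under the lock; the hash
is recomputed only when a rebuild actually slipped in between.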

- z