I noticed that ip_find() calculates the hash bucket for the incoming fragment using ipfrag_hash_rnd without holding ipfrag_lock. So it can race with ipfrag_secret_rebuild() and end up putting a frag in the bucket derived from the old secret instead of the new bucket that ipfrag_secret_rebuild() has moved the existing frags to. The unlock and write lock acquisition in the allocating case could also hit this.
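
To make the interleaving concrete, here is a rough single-threaded replay of it in userspace C. This is only a sketch: toy_hashfn(), NBUCKETS and race_misplaces() are names made up for illustration, and the mixing function is a stand-in, not the kernel's hash.

```c
#include <stdint.h>

#define NBUCKETS 64u  /* hypothetical table size, must be a power of two */

/* Toy stand-in for the real hash function: mixes the key with the secret. */
static unsigned int toy_hashfn(uint32_t key, uint32_t rnd)
{
    uint32_t h = key ^ rnd;

    h ^= h >> 16;
    return h & (NBUCKETS - 1);
}

/*
 * Replays the bad ordering:
 *   1. the lookup path hashes the key with the secret it sees (no lock held)
 *   2. the rebuild swaps the secret and rehashes every existing queue
 *   3. the lookup path takes the lock and uses the bucket from step 1
 * Returns 1 if the bucket from step 1 is not where the rebuild put the
 * queues for this key, 0 otherwise.
 */
static int race_misplaces(uint32_t key, uint32_t old_rnd, uint32_t new_rnd)
{
    unsigned int stale = toy_hashfn(key, old_rnd); /* step 1 */
    unsigned int fresh = toy_hashfn(key, new_rnd); /* result of step 2 */

    return stale != fresh;                         /* step 3 looks in 'stale' */
}
```

When the two buckets differ, the incoming frag starts a new queue in the stale bucket, never meets its siblings, and both queues eventually time out.
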
I verified this by spraying a box with 16k UDP writes and turning the secret interval way down to hz/100. It didn't take long before it hit. It resulted in reassembly failures due to frag timeouts, as expected.

Before worrying about fixing this I thought I'd sample the crowd.

a) Who cares? The interval is huge and it'd be a few packets at most.

b) Just calculate the hashes under the lock, we're already doing lots of work there anyway.

c) Sample the random value when calculating the hash outside the lock and only recalculate the hash inside the lock if it has changed. (Maybe with some memory barrier help.)

d) Go insane redoing the serialization so that we get rid of this and other suboptimal behaviour. (please no :))

I vote for c) for its compromise of minimal invasiveness and minimal impact on the unracey case.

- z