On Mon, 5 Sep 2016, Felipe Balbi wrote:
> okay, I'm thinking that's a plausible explanation, please correct me if
> I'm wrong. What I'm speculating now is that inside a locked region,
> reordering can happen (can it?)
Yes, it can.
> which means that our:
>
> spin_lock(&common->lock);
> bh->outreq_busy = 0;
> bh->state = BUF_STATE_FULL;
> wakeup_thread(common);
> spin_unlock(&common->lock);
>
> ends up being reordered as:
>
> spin_lock(&common->lock);
> bh->outreq_busy = 0;
> wakeup_thread(common);
> bh->state = BUF_STATE_FULL;
> spin_unlock(&common->lock);
>
> which triggers the fault. Is that possible?
That's why wakeup_thread needs to include a memory barrier -- to
prevent exactly this sort of thing from happening.
> Looking at the history.git tree [0] I found a little more of
> file_storage gadget's history. Here are some of the details I could find
> out:
>
> When you first wrote the file_storage gadget, wakeup_thread() was called
> outside the lock from bulk_{in,out}_complete() (see [1]). A follow-up
> patch ended up moving wakeup_thread() inside the locked region (while
> also removing a wait_queue()) which, I'm speculating, introduced the
> race condition (see [2])
>
> Another piece of speculation (which I can test, if necessary) is that
> larger mass storage buffers (e.g. 64k or 128k) would hide this race
> because it would take longer for UDC to complete the request. This might
> be why we never saw this before with HS-only UDCs.
>
> Are we really dealing with a regression introduced back in 2.6.15?
Quite possibly. Or maybe it was there from the very beginning, but it
doesn't show up unless you're using a fast processor.
Alan Stern
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html