On Wednesday April 25, [EMAIL PROTECTED] wrote:
> 
> That's pretty close to where I think the problem is (the front merging
> and cfq_reposition_rq_rb()). The issue with that is that you'd only get
> aliases for O_DIRECT and/or raw IO, and that doesn't seem to be the case
> here. Given that front merges are equally not very likely, I'd be
> surprised if something like that has ever happened.

Well it certainly doesn't happen very often.... And I can imagine a
filesystem genuinely wanting to read the same block twice - maybe a
block that contained packed tails of two different files.

> 
> BUT... That may explain why we are only seeing it on md. Would md
> ever be issuing such requests that trigger this condition?

Can someone remind me which raid level(s) was/were involved?

I think one was raid0 - that just passes requests on from the
filesystem, so md would only issue requests like that if the
filesystem did.

I guess it could happen with raid4/5/6. A read request that is
properly aligned (and we do encourage proper alignment) will be passed
directly to the underlying device. A concurrent write elsewhere could
require the same block to be read into the stripe-cache to enable a
parity calculation. So you could get two reads at the same block
address. Getting a front-merge would probably require two
stripe-heads to be processed in reverse-sector order, and raid5 tends
to preserve the order of incoming requests (though that isn't firmly
enforced).

raid1 is much like raid0 (with totally different code) in that the
request pattern seen by the underlying device matches the request
pattern generated by the filesystem.

If I read the debugging output correctly, the request which I
hypothesise was the subject of a front-merge is a 'sync' request.
raid5 does not generate sync requests to fill the stripe cache (maybe
it should?) so I really think it must have come directly from the
filesystem.

(just checked previous email for more detail of when it hits)

The fact that it hits degraded arrays more easily is interesting.
Maybe we try to read a block on the missing device and so schedule
reads for the other devices. Then we try to read a block on a good
device and issue a request for exactly the same block that raid5 asked
for. That still doesn't explain the 'sync' and the 'front merge'.
But that is quite possible, just not common maybe.

It doesn't help us understand the raid0 example though. Maybe it is
just a 'can happen, but only rarely' thing.

NeilBrown
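
To make the raid5 scenario above a bit more concrete, here is a toy
model of how a front-merge could turn a queued request into an alias of
another one. This is only a sketch: the Request class, the dict keyed
by start sector, and the sector numbers are all made up for
illustration and are not the kernel's data structures or the cfq code.

```python
from dataclasses import dataclass


@dataclass
class Request:
    sector: int      # first sector covered by the request
    nr_sectors: int  # length of the request in sectors

    @property
    def end(self) -> int:
        return self.sector + self.nr_sectors


def can_front_merge(queued: Request, new: Request) -> bool:
    """True if `new` ends exactly where `queued` begins."""
    return new.end == queued.sector


def front_merge(queued: Request, new: Request) -> Request:
    """Grow `queued` backwards so that it also covers `new`."""
    return Request(new.sector, new.nr_sectors + queued.nr_sectors)


def find_alias(queue: dict, rq: Request):
    """Return a queued request that already starts at rq.sector, if any."""
    return queue.get(rq.sector)


# Hypothetical queue keyed by start sector (stands in for a sorted tree).
queue = {}
stripe_fill = Request(sector=8, nr_sectors=8)  # read to fill the stripe cache
fs_read = Request(sector=16, nr_sectors=8)     # later filesystem read
queue[stripe_fill.sector] = stripe_fill
queue[fs_read.sector] = fs_read

# An aligned read for the same block as stripe_fill is passed straight
# through, and it happens to end where fs_read begins.
incoming = Request(sector=8, nr_sectors=8)
assert can_front_merge(fs_read, incoming)

# The merge moves the start of the queued request, so it has to be
# re-keyed, and the new key collides with stripe_fill.
del queue[fs_read.sector]
merged = front_merge(fs_read, incoming)
alias = find_alias(queue, merged)
```

In this model the merged request now starts at sector 8, the same
sector as the stripe-cache read, so repositioning it finds an alias.
Whether the real code path ever sees this ordering is exactly the open
question.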