On Wed, 2019-01-16 at 19:54 -0500, Douglas Gilbert wrote: > On 2019-01-16 6:56 p.m., Bart Van Assche wrote: > > On Wed, 2019-01-16 at 10:57 -0500, Douglas Gilbert wrote: > > > The block layer assumes scsi_request:sense is always a valid > > > pointer. This is set up once in scsi_mq_init_request() and the > > > containing scsi_cmnd object is used often, being re-initialized > > > by scsi_init_command(). That works unless some code re-purposes > > > part of the scsi_cmnd object for something else. And that is > > > what bidi handling does in scsi_mq_prep_fn(). The result is an > > > oops at some later time when the partly overwritten object is > > > re-used. The overwrite is from d285203cf647d but 'git blame' > > > does not show removed code, so that commit may not be the > > > culprit. > > > > > > Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> > > > --- > > > > > > This was found while injecting errors (thus generating sense data) > > > into a sequence of bidi commands. At some later time the block > > > layer blew up with a scsi_request::sense NULL dereference in > > > sg_rq_end_io(). Without testing I'm confident the bsg driver, > > > the osd ULD and exofs are exposed to this bug. > > > > > > drivers/scsi/scsi_lib.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > > > index b13cc9288ba0..71259bd4040a 100644 > > > --- a/drivers/scsi/scsi_lib.c > > > +++ b/drivers/scsi/scsi_lib.c > > > @@ -1175,6 +1175,7 @@ void scsi_init_command(struct scsi_device *dev, > > > struct scsi_cmnd *cmd) > > > > > > cmd->device = dev; > > > cmd->sense_buffer = buf; > > > + cmd->req.sense = buf; > > > cmd->prot_sdb = prot; > > > cmd->flags = flags; > > > INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler); > > > > Hi Doug, > > > > The description of this patch does not look correct to me. > > scsi_init_command() > > does not overwrite the sense pointer. From the body of that function: > > > > /* zero out the cmd, except for the embedded scsi_request */ > > memset((char *)cmd + sizeof(cmd->req), 0, > > sizeof(*cmd) - sizeof(cmd->req) + dev->host->hostt->cmd_size); > > > > It is not clear to me which code overwrites the sense pointer. I think that > > needs to be figured out before discussion of this patch can continue. > > Bart, > Please re-read the explanation. The two sentences in the middle should tell > you that you are looking at the wrong memset(). > > And I'm waiting for the person who wrote the questionable code to comment. > > > If you don't believe me, try sending a device reset to a scsi_debug device. > Then use sg_raw *** to send a XDWRITREAD(10) bidi command to the same > scsi_debug > device, you should get a Unit Attention concerning that device reset. If > you do, keep sending that bidi command. Make sure you have no important > files open because that machine will lock solid. Basically what happens > when a rq_end_io() callback de-references a NULL pointer. It looks like > it has been there since 2014 and took me 2 days to find. Sorry that I can't > explain it in one simple sentence.
Hi Doug, Is this perhaps the memset you are referring to? memset(bidi_sdb, 0, sizeof(struct scsi_data_buffer)); Bart.