On Sun, 2013-12-15 at 16:32 -0500, Alan Stern wrote:
> On Sat, 14 Dec 2013, James Bottomley wrote:
> 
> > > The refcount test and state change race with scsi_alloc_target().  
> > > Maybe the race won't occur in practice, but to be safe you should hold
> > > shost->host_lock throughout that time interval, as the original code
> > > here does.
> > 
> > You mean the fact that using a state model to indicate whether we should
> > destroy a target without bothering to refcount isn't robust against two
> > threads of execution running through a scan on the same target?
> 
> I meant that the patch you posted suffers from a race when one thread
> is adding a device to a target and another thread is removing an
> existing device below that target at the same time.  Suppose the
> target's reap_ref count is initially equal to 1:
> 
>       Thread 0                        Thread 1
>       --------                        --------
>       In scsi_alloc_target():         In scsi_target_reap():
>       lock host_lock                  scsi_target_reap_ref_put():
>       find existing starget           starget->reap_ref drops to 0
>       incr starget->reap_ref          In scsi_target_reap_ref_release():
>       unlock host_lock                device_del(&starget->dev);
>       starget->state == STARGET_DEL?
>       No => okay to use starget       set starget->state = STARGET_DEL
> 
> Result: We end up using starget _and_ removing it.  The only way to
> avoid this race would be to guarantee that we never add and remove
> devices below the same target at the same time.  In theory this is 
> feasible, but I don't know if you want to do it.
> 
> This doesn't seem to be what you are talking about above.  In any case, 
> it is a bug.

No, I was thinking of the two thread scan bug (i.e. two scan threads)
not one scan and one remove, which is a bug in the old code.  This is a
race between put and get when the kref is incremented from zero (an
illegal operation which triggers a warn on).

The way to mediate this is to check for the kref already being zero
condition, like below.

James

---
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 327c0e92..303d471 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -473,7 +473,13 @@ static struct scsi_target *scsi_alloc_target(struct device 
*parent,
        return starget;
 
  found:
-       kref_get(&found_target->reap_ref);
+       if (!kref_get_unless_zero(&found_target->reap_ref))
+               /*
+                * release routine already fired.  Target is dead, but
+                * STARGET_DEL may not yet be set (set in the release
+                * routine), so set here as well, just in case
+                */
+               found_target->state = STARGET_DEL;
        spin_unlock_irqrestore(shost->host_lock, flags);
        if (found_target->state != STARGET_DEL) {
                put_device(dev);



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to