On Tue, 25 Feb 2014, Dan Williams wrote: > > Look, we guarantee that the peer relation is dropped when either of the > > devices is unregistered. Therefore there's no need to take a reference > > to the peer device. > > Ugh, yes. Reference counts alone don't save us from this scenario: > > CPU1 CPU2 > mutex_lock(peer_lock) > unlink_peers(portA, portC) > mutex_unlock(peer_lock) > mutex_lock(peer_lock) > device_unregister(portA) > link_peers(portB, portA) > mutex_unlock(peer_lock) > > > This will cause a problem only if something else goes wrong. That's > > why I suggested, in patch 2, that when a peering error is detected, you > > print out the erroneous pointer rather than assuming it points to a > > valid port structure. > > ...but even with the invalid port flag portA can theoretically still be > a stale pointer.
I don't see how. > > This is also why I suggested that you close the race in > > usb_hub_remove_port_device: to make sure that the guarantee really is > > enforced. > > You proposed a flag on the port_dev once it goes invalid. It seems to > me the flag to check if a port is valid is parent_hub->disconnected. Good suggestion. > However, in the case above, I believe find_port_by_location() and > find_default_peer() need to prevent the "link_peers(portB, portA)" > event. When these routines walk down the tree they should be taking > usb_lock_device() to make sure they are not racing with disconnect > operations. > > Locking while traversing the list and otherwise checking for > hub->disconnected closes the race. We can't solve the race at the > port_dev level alone. Agreed? No. We already have the port_lock mutex; nothing else is needed. Simply make link_peers fail if either of the hub->disconnected flags is set. Then the scenario shown above can't happen, because link_peers running on CPU2 will see that portA is stale. However, it might be a good idea for usb_hub_create_port_device to hold the port_lock while doing hub->ports[port1 - 1] = port_dev; And perhaps this line should be moved down to just before the device_register call. In fact, maybe that call should occur inside the scope of the mutex. There may be other places where we have to be more careful about assuming that hub->ports[port1 - 1] is non-NULL. Even though checking the disconnected flag will prevent races with port destruction, there could be races with port creation. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html