On Thu, 28 Oct 2021, 00:43 Julien Grall, <julien.grall....@gmail.com> wrote:
> On Thu, 28 Oct 2021, 00:14 Stefano Stabellini, <sstabell...@kernel.org>
> wrote:
>
>> On Wed, 27 Oct 2021, Julien Grall wrote:
>> > > > > > +        return ret;
>> > > > > >  }
>> > > > > >
>> > > > > >  static int register_smmu_master(struct arm_smmu_device *smmu,
>> > > > > > @@ -2056,7 +2066,10 @@ static int arm_smmu_add_device(struct device *dev)
>> > > > > >      } else {
>> > > > > >          struct arm_smmu_master *master;
>> > > > > > +        spin_lock(&arm_smmu_devices_lock);
>> > > > > >          master = find_smmu_master(smmu, dev->of_node);
>> > > > > > +        spin_unlock(&arm_smmu_devices_lock);
>> > > > >
>> > > > > At the moment, unlocking here is fine because we don't remove the
>> > > > > device. However, there is a series to support removing a device
>> > > > > (see [1]). So I think it would be preferable to unlock after the
>> > > > > last use of 'cfg'.
>> > >
>> > > Ok. I will move the unlocking to the end of this else {} block. I was
>> > > not aware of the patch you are referring to.
>> >
>> > I think the end of the else is still too early. This needs to at least
>> > be past iommu_group_set_iommudata(), because we store cfg.
>> >
>> > Potentially, the lock wants to also encompass
>> > arm_smmu_master_alloc_smes(). So I am wondering whether it would be
>> > simpler to hold the lock for the whole duration of arm_smmu_add_device()
>> > (I can see value in that when we will want to interlock with the remove
>> > code).
>>
>> The patch was to protect the smmu master list. From that point of view,
>> the unlock right after find_smmu_master would be sufficient, right?
>
> Yes. However, this is not fixing all the problems (see below).
>
>> We only need to protect cfg if we are worried that the same device is
>> added in two different ways at the same time. Did I get it right? If so,
>> I would say that that case should not be possible? Am I missing another
>> potential conflict?
>
> It should not be possible to add the same device twice. The problem is
> more when we are going to remove the device. In that case, "master" may
> disappear at any point.
>
> The support for removing a device is not yet implemented in the tree, but
> there is already a patch on the ML. So I think it would be shortsighted to
> move the lock just to solve concurrent access to the list.
>
>> I am pointing this out for two reasons:
>>
>> Protecting the list is different from protecting each element from
>> concurrent modification of the element itself. If the latter is a
>> potential problem, I wonder if arm_smmu_add_device is the only function
>> affected?
>
> I had a brief look at the code and couldn't find any other places where
> this may be an issue.
>
>> The second reason is that extending the lock past
>> arm_smmu_master_alloc_smes is a bit worrying because it causes
>> &arm_smmu_devices_lock and smmu->stream_map_lock to nest, which wasn't
>> the case before.
>
> Nested locks are common. I don't believe there would be a problem here
> with this one.
>
>> I am not saying that it is a bad idea to extend the lock past
>> arm_smmu_master_alloc_smes -- it might be the right thing to do.
>
> I don't usually suggest locking changes blindly ;).
>
>> But I am merely saying that it might be best to think twice about it,
>> and/or do that after 4.16.
>
> To be honest, this patch is not useful the callback to register a device
> in the IOMMU subsystem. So if you are concerned with

The sentence makes no sense, sorry. I meant: the add callback doesn't
support PCI devices, so the locking is a latent issue at the moment.

> my suggested locking then, I am afraid the current patch is a no-go on
> my side for 4.16.
>
> That said, we can work towards a new locking approach for 4.17. However,
> I would want to have a proposal from your side, or at least some details
> on why the suggested locking is not suitable.
>
> Cheers,
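
To make the shape of the suggested locking concrete, below is a rough
sketch of holding arm_smmu_devices_lock for the whole of
arm_smmu_add_device(), as discussed above. It is an illustration only,
not the in-tree code: the arm_smmu_add_device_locked() wrapper and its
error paths are invented here, and the real function does more (group
allocation, storing cfg via iommu_group_set_iommudata(), and so on).
Only the locking shape matters.

/*
 * Sketch only -- not the driver code. It shows the "hold the lock for
 * the whole of arm_smmu_add_device()" idea from the thread; the wrapper
 * name and error paths are hypothetical.
 */
static DEFINE_SPINLOCK(arm_smmu_devices_lock);

static int arm_smmu_add_device_locked(struct arm_smmu_device *smmu,
                                      struct device *dev)
{
    struct arm_smmu_master *master;
    int ret = 0;

    spin_lock(&arm_smmu_devices_lock);

    /* The list walk the original patch set out to protect. */
    master = find_smmu_master(smmu, dev->of_node);
    if ( !master )
    {
        ret = -ENODEV;
        goto out;
    }

    /*
     * arm_smmu_master_alloc_smes() takes smmu->stream_map_lock
     * internally, so arm_smmu_devices_lock now nests outside it --
     * the new ordering discussed above. Any other path that takes
     * both locks must then take them in the same order.
     */
    ret = arm_smmu_master_alloc_smes(dev);
    if ( ret )
        goto out;

    /*
     * 'master->cfg' would also be stored (via
     * iommu_group_set_iommudata()) before dropping the lock, so a
     * concurrent remove cannot free 'master' while it is in use.
     */

 out:
    spin_unlock(&arm_smmu_devices_lock);
    return ret;
}

The point of the wider critical section is the interlock with the
future remove path: remove would take arm_smmu_devices_lock before
unlinking and freeing the master, so 'master' and 'cfg' cannot
disappear under add. The cost is the new lock ordering
(arm_smmu_devices_lock outside stream_map_lock) flagged above.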