On Thu, Apr 10, 2025 at 03:33:40PM +0200, Alexander Lobakin wrote:
> From: Leon Romanovsky <l...@kernel.org>
> Date: Thu, 10 Apr 2025 16:27:06 +0300
> 
> > On Thu, Apr 10, 2025 at 02:58:28PM +0200, Larysa Zaremba wrote:
> >> On Thu, Apr 10, 2025 at 02:23:49PM +0300, Leon Romanovsky wrote:
> >>> On Thu, Apr 10, 2025 at 12:44:33PM +0200, Larysa Zaremba wrote:
> >>>> On Thu, Apr 10, 2025 at 11:21:37AM +0300, Leon Romanovsky wrote:
> >>>>> On Tue, Apr 08, 2025 at 02:47:51PM +0200, Larysa Zaremba wrote:
> >>>>>> From: Phani R Burra <phani.r.bu...@intel.com>
> >>>>>>
> >>>>>> Libeth will now support control queue setup and configuration APIs.
> >>>>>> These are mainly used for mailbox communication between drivers and
> >>>>>> control plane.
> >>>>>>
> >>>>>> Make use of the page pool support for managing controlq buffers.
> >>>>>
> >>>>> <...>
> >>>>>
> >>>>>>  libeth-y                      := rx.o
> >>>>>>  
> >>>>>> +obj-$(CONFIG_LIBETH_CP)               += libeth_cp.o
> >>>>>> +
> >>>>>> +libeth_cp-y                   := controlq.o
> >>>>>
> >>>>> So why did you create separate module for it?
> >>>>> Now you have pci -> libeth -> libeth_cp -> ixd, with the potential 
> >>>>> races between ixd and libeth, am I right?
> >>>>>
> >>>>
> >>>> I am not sure what kind of races do you mean, all libeth modules 
> >>>> themselves are 
> >>>> stateless and will stay this way [0], all used data is owned by drivers.
> >>>
> >>> Somehow such separation doesn't truly work. There are multiple syzkaller
> >>> reports per-cycle where module A tries to access module C, which already
> >>> doesn't exist because it was proxied through module B.
> >>
> >> Are there similar reports for libeth and libie modules when iavf is 
> >> enabled?
> > 
> > To get such report, syzkaller should run on physical iavf, it looks like it 
> > doesn't.
> > Did I miss it here?
> > https://syzkaller.appspot.com/upstream/s/net
> > 
> >> It is basically the same hierarchy. (iavf uses both libeth and libie, 
> >> libie 
> >> depends on libeth).
> >>
> >> I am just trying to understand, is this a regular situation or did I just 
> >> mess 
> >> smth up?
> > 
> > My review comment was general one. It is almost impossible to review
> > this newly proposed architecture split for correctness.
> > 
> >>
> >>>
> >>>>
> >>>> As for the module separation, I think there is no harm in keeping it 
> >>>> modular. 
> >>>
> >>> Syzkaller reports disagree with you. 
> >>>
> >>
> >> Could you please share them?
> > 
> > It is not an easy question to answer, because all these reports are 
> > complaining
> > about some wrong locking order or NULL-pointer access. You will never know 
> > if
> > it is because of programming or design error.
> > 
> > As an approximate example, see commits a27c6f46dcec ("RDMA/bnxt_re: Fix an 
> > issue in bnxt_re_async_notifier")
> > and f0df225d12fc ("RDMA/bnxt_re: Add sanity checks on rdev validity").
> > At the first glance, they look unrelated to our discussion, however
> > they can serve as an example or races between deinit/disable paths in
> > parent module vs. child.
> 
> Unrelated. At first, you were talking about module dependencies, now
> you're talking about struct device etc dependencies, which is a
> completely different thing.
> 
> As already said, libeth is stateless, so the latter one can't happen.
> The former one is impossible at all. As long as at least 1 child module
> is loaded, you can't unload the parent. And load/unload is serialized,
> see module core code.

It is not only module load/unload. It is bind/unbind, devlink operations
e.t.c, everything that can cause to reload driver in module C.

Thanks

Reply via email to