On 2/25/25 15:35, Jiri Pirko wrote:
Tue, Feb 25, 2025 at 12:30:49PM +0100, przemyslaw.kits...@intel.com wrote:

Thanks to Wojciech Drewek for very nice naming of the devlink instance:
PF0:            pci/0000:00:18.0
whole-dev:      pci/0000:00:18
But I made this a param for now (driver is free to pass just "whole-dev").

$ devlink dev # (Interesting part of output only)
pci/0000:af:00:
   nested_devlink:
     pci/0000:af:00.0
     pci/0000:af:00.1
     pci/0000:af:00.2
     pci/0000:af:00.3
     pci/0000:af:00.4
     pci/0000:af:00.5
     pci/0000:af:00.6
     pci/0000:af:00.7


In general, I like this approach. In fact, I have quite similar
patch/set in my sandbox git.

The problem I didn't figure out how to handle, was a backing entity
for the parent devlink.

You use part of PCI BDF, which is obviously wrong:
1) bus_name/dev_name the user expects to be the backing device bus and
     address on it (pci/usb/i2c). With using part of BDF, you break this
     assumption.
2) 2 PFs can have totally different BDF (in VM for example). Then your
     approach is broken.

To make the hard part of it easy, I like to have the name to be provided
by what the PF/driver has available (whichever will be the first of
given device PFs), as of now, we resolve this issue (and provide ~what
your devlink_shared does) via ice_adapter.

I don't understand. Can you provide some examples please?

Right now we have one object of struct ice_adapter per device/card,
it is refcounted and freed after last PF put()s their copy.
In the struct one could have a mutex or spinlock to guard shared stuff,
existing example is ptp_gltsyn_time_lock of ice driver.




Making it a devlink instance gives user an easy way to see the whole
picture of all resources handled as "shared per device", my current

This part is what is missing in current devlink impl and likely would
still be after your series. I would still like to have it :)
(And the rest is sugar coating for me)

output, for all PFs and VFs on given device:

pci/0000:af:00:
  name rss size 8 unit entry size_min 0 size_max 24 size_gran 1
    resources:
      name lut_512 size 0 unit entry size_min 0 size_max 16 size_gran 1
      name lut_2048 size 8 unit entry size_min 0 size_max 8 size_gran 1

What is contributing to the hardness, this is not just one for all ice
PFs, but one per device, which we distinguish via pci BDF.

How?

code is in ice_adapter_index()
Now I get what DSN is, looks like it could be used equally well instead
pci BDF.

Still we need more instances, each card has their own PTP clock, their
own "global RSS LUT" pool, etc.





I was thinking about having an auxiliary device created for the parent,
but auxiliary assumes it is child. The is upside-down.

I was thinking about having some sort of made-up per-driver bus, like
"ice" of "mlx5" with some thing like DSN that would act as a "dev_name".
I have a patch that introduces:

struct devlink_shared_inst;

struct devlink *devlink_shared_alloc(const struct devlink_ops *ops,
                                       size_t priv_size, struct net *net,
                                       struct module *module, u64 per_module_id,
                                       void *inst_priv,
                                       struct devlink_shared_inst **p_inst);
void devlink_shared_free(struct devlink *devlink,
                          struct devlink_shared_inst *inst);

I took a stab at it here:
https://github.com/jpirko/linux_mlxsw/commits/wip_dl_pfs_parent/
The work is not finished.


Also, I was thinking about having some made-up bus, like "pci_ids",
where instead of BDFs as addresses, there would be DSN for example.

None of these 3 is nice.

how one would invent/infer/allocate the DSN?

Driver knows DSN, it can obtain from pci layer.

Aaach, I got the abbreviation wrong, pci_get_dsn() does the thing, thank
you. BTW, again, by Jake :D




faux_bus mentioned by Jake would be about the same level of "fakeness"
as simply allocating a new instance of devlink by the first PF, IMO :)

Hmm, briefly looking at faux, this looks like fills the gap I missed in
auxdev. Will try to use it in my patchset.

Thanks!



Reply via email to