> -----Original Message----- > From: Adrien Mazarguil <adrien.mazarg...@6wind.com> > Sent: Wednesday, June 27, 2018 9:32 PM > To: Shahaf Shuler <shah...@mellanox.com> > Cc: Xueming(Steven) Li <xuemi...@mellanox.com>; dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 6/7] net/mlx5: probe all port representors > > On Sun, Jun 17, 2018 at 10:15:07AM +0000, Shahaf Shuler wrote: > > Hi Adrien, > > > > Saturday, June 16, 2018 11:58 AM, Xueming(Steven) Li: > > > Subject: RE: [dpdk-dev] [PATCH v2 6/7] net/mlx5: probe all port > > > representors > > > > > > > -----Original Message----- > > > > From: dev <dev-boun...@dpdk.org> On Behalf Of Adrien Mazarguil > > > > Sent: Thursday, June 14, 2018 4:35 PM > > > > To: Shahaf Shuler <shah...@mellanox.com> > > > > Cc: dev@dpdk.org > > > > Subject: [dpdk-dev] [PATCH v2 6/7] net/mlx5: probe all port > > > > representors > > > > > > > > Probe existing port representors in addition to their master > > > > device and > > > associate them automatically. > > > > > > > > To avoid name collision between Ethernet devices, their names use > > > > the same convention as ixgbe and i40e PMDs, that is, instead of > > > > only a PCI > > > address in DBDF notation: > > > > > > > > - "net_{DBDF}_0" for master/switch devices. > > > > This is breaking compatibility for application using the device names in > > order to attach them to the > application (e.g. OVS-DPDK). > > Before this patch the naming scheme for non-representor port is "{DBDF}". > > > > Can we preserve the compatibility and add appropriate suffix for the > > representor case? > > There's one issue if representors are hot-plugged. The name of the master > device, which happens to be > that of the switch domain, cannot be updated. The form "net_{DBDF}_0" seems > expected for PMDs that > support representors (see ixgbe and i40e). > > Now since representor hot-plugging is not supported yet, I guess we could > postpone this problem by > keeping the old format in the meantime, however ideally, these applications > should not rely on it. The > only safe assumption they can make is the uniqueness of any given name among > ethdevs. > > PCI bus addresses, if needed, should be retrieved by looking at the > underlying bus object. > > By the way, while thinking again about a past comment from Xueming [1], maybe > it's finally time to > remove support for multiple Verbs ports on mlx5 after all. This should drop > another unnecessary loop > and the need for the unused "port %u" suffix at all while naming the device. > > So how about the following plan for v3: > > - Adding a patch that drops support for multiple Verbs ports (note for > Xueming, yes I changed my mind *again* :) > > - If you really think this will break OVS (please confirm), then when no > "representor" parameter is provided (regardless of the presence of any > representors), name format will use the usual "{DBDF}" notation as you > suggested. > > - Otherwise as soon as a "representor" is found on the command line, the new > format will be used, again regardless of the presence of any representors.
The port creation sequence of upcoming hot plug looks like this: 0000:81:00.1 0000:81:00.1,representor=0 0000:81:00.1,representor=1 So the PF attaching comes always w/o "representor" parameter. > > - In both cases, representors if any, will be named according to the format > specified in this patch. > > [1] > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmails.dpdk.org%2Farchives%2Fdev%2F2 > 018- > June%2F104015.html&data=02%7C01%7Cxuemingl%40mellanox.com%7Cad9a1b32e5e241e375d208d5dc32778b%7Ca652971 > c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636657031666639542&sdata=w4oNeWXwKXS0%2BNSZsYQaneW%2BkFxvWIHZFHLoM > fLOxkg%3D&reserved=0 > > <snip> > > > > for (i = 0; i < attr.orig_attr.phys_port_cnt; ++i) { > > > > - eth_list[i] = mlx5_dev_spawn_one(dpdk_dev, ibv_dev, vf, > > > > - &attr, i + 1); > > > > - if (eth_list[i]) > > > > - continue; > > > > - /* Save rte_errno and roll back in case of failure. */ > > > > - ret = rte_errno; > > > > - while (i--) { > > > > - mlx5_dev_close(eth_list[i]); > > > > - if (rte_eal_process_type() == RTE_PROC_PRIMARY) > > > > - > > > > rte_free(eth_list[i]->data->dev_private); > > > > - > > > > claim_zero(rte_eth_dev_release_port(eth_list[i])); > > > > - } > > > > - free(eth_list); > > > > - rte_errno = ret; > > > > - return NULL; > > > > + eth_list[n] = mlx5_dev_spawn_one(dpdk_dev, ibv_dev[j], > > > vf, > > > > + &attr, i + 1, > > > > + j ? eth_list[0] : NULL, > > > > + j - 1); > > > > The representor id is according to the sort made by qsort (based on device > > names). > > A better way may be to set it according to the sysfs information, like you > > do in the mlx5_get_ifname > function. > > What do you think? > > I agree that the current approach sucks, hence the big fat warnings I left > around (see discussion with > Xueming [2]). Problem is that the needed information is not yet known at this > stage; there is no > private structure to rely on to use mlx5_get_ifname() directly. > > I'd also rather see these assumptions go in any case. I'll attempt to improve > things for v3 in > preparation of allowing representors to be probed on their own anytime, > possibly even before the > master device. > > [2] > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmails.dpdk.org%2Farchives%2Fdev%2F2 > 018- > June%2F104059.html&data=02%7C01%7Cxuemingl%40mellanox.com%7Cad9a1b32e5e241e375d208d5dc32778b%7Ca652971 > c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636657031666639542&sdata=5eOWb69duEB%2BkIW1ZGkv%2FLxkZfwErQOd%2FV7 > nDpN2jOg%3D&reserved=0 > > <snip> > > > > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h > > > > index > > > > 997b04a33..0fe467140 100644 > > > > --- a/drivers/net/mlx5/mlx5.h > > > > +++ b/drivers/net/mlx5/mlx5.h > > > > @@ -161,6 +161,10 @@ struct priv { > > > > uint16_t mtu; /* Configured MTU. */ > > > > uint8_t port; /* Physical port number. */ > > > > unsigned int isolated:1; /* Whether isolated mode is enabled. */ > > > > + unsigned int representor:1; /* Device is a port representor. */ > > > > Why we need above flag? Why can't we use RTE_ETH_DEV_REPRESENTOR from > > eth_dev->data->dev_flags. > > Problem is that this flag can only be set once the ethdev is fully > instantiated and can't be relied on > internally where needed (e.g. during clean up in error handling code). It's > reported to applications > but not used internally. > > As a device property, it's actually pretty similar to the VF bit or offloaded > capabilities where > checking exposed information would be needlessly complex. > > Now maybe it could be part of struct mlx5_dev_config as well. I initially > assumed this object was only > for user-provided parameters but looks like it's not the case. I intend to > move it there for v3. > > -- > Adrien Mazarguil > 6WIND