On Thu, Jun 28, 2018 at 05:57:03AM +0000, Shahaf Shuler wrote:
> Wednesday, June 27, 2018 4:33 PM, Adrien Mazarguil:
> > Subject: Re: [dpdk-dev] [PATCH v2 6/7] net/mlx5: probe all port representors
> > 
> > On Sun, Jun 24, 2018 at 01:33:31PM +0000, Shahaf Shuler wrote:
> > > One more input,
> > >
> > > Sunday, June 17, 2018 1:15 PM, Shahaf Shuler:
> > > > Subject: RE: [dpdk-dev] [PATCH v2 6/7] net/mlx5: probe all port
> > > > representors
> > >
> > > [...]
> > >
> > > > > > +   eth_list = tmp;
> > > > > >     for (i = 0; i < attr.orig_attr.phys_port_cnt; ++i) {
> > > > > > -           eth_list[i] = mlx5_dev_spawn_one(dpdk_dev,
> > ibv_dev, vf,
> > > > > > -                                            &attr, i + 1);
> > > > > > -           if (eth_list[i])
> > > > > > -                   continue;
> > > > > > -           /* Save rte_errno and roll back in case of failure. */
> > > > > > -           ret = rte_errno;
> > > > > > -           while (i--) {
> > > > > > -                   mlx5_dev_close(eth_list[i]);
> > > > > > -                   if (rte_eal_process_type() ==
> > RTE_PROC_PRIMARY)
> > > > > > -                           rte_free(eth_list[i]->data-
> > >dev_private);
> > > > > > -
> >     claim_zero(rte_eth_dev_release_port(eth_list[i]));
> > > > > > -           }
> > > > > > -           free(eth_list);
> > > > > > -           rte_errno = ret;
> > > > > > -           return NULL;
> > > > > > +           eth_list[n] = mlx5_dev_spawn_one(dpdk_dev,
> > ibv_dev[j],
> > > > > vf,
> > > > > > +                                            &attr, i + 1,
> > > > > > +                                            j ? eth_list[0] : NULL,
> > > > > > +                                            j - 1);
> > > >
> > > > The representor id is according to the sort made by qsort (based on
> > > > device names).
> > > > A better way may be to set it according to the sysfs information,
> > > > like you do in the mlx5_get_ifname function.
> > > > What do you think?
> > >
> > > In fact relaying on linear increasing port numbers is dangerous. In may
> > break on special scenarios like BlueField.
> > > In BlueField there are representors between the x86 and the ARM cores.
> > Those are not VF representors. The phys_port_name of those is -1 and each
> > of them belongs to different phys_switch_id.
> > >
> > > We can argue whether it is correct/not to assign them w/ -1 value, but I
> > think the suggested approach above can detect the right "vf_id" for those
> > and not break the current behavior on x86.
> > > Let me know if you have other suggestions.
> > 
> > I didn't know that. Assuming that with these, there is exactly only one
> > representor per device, I think we can manage, the main issue being that "-
> > 1" will be difficult to parse as a valid "representor" argument which uses 
> > "-"
> > for ranges.
> 
> The -1 value is not for the representor id, It is for the id of the entity 
> which exists on the other size of the representor. 
> The repesentor index is still 0, meaning the command line -w 
> <pci_bdf>,representor=0 is correct on this case.
> 
> The problems comes from the assumption you do in your code about the 
> representor id.
> What you do currently is to receive the representors and qsort them by device 
> name. then you assign the priv->rep_id based on the qsort output.
> Later on when querying the if_name (mlx5_get_ifname) you assume that the 
> phys_port_name of representor (which include the enumeration of what exists 
> on its other side) is the same.
> 
> For x86 it probably works. On BlueField it breaks, as from some reason the 
> phys_port_name is -1. 
> 
> My suggestion is to set the priv->rep_id based on the phys_port_name instead 
> of qsort output. 

Yes, understood. The only drawback using this approach is that mlx5 devices
won't be usable at all if no netdevice can be associated with them (e.g. in
case it was moved to another netns). Currently all matching IB devs are
probed regardless, except they are handled as normal devices when the PMD
can't determine whether it's dealing with a representor.

> > Anyway, I suggest to deal with Bluefield specifics in a subsequent series.
> > This one focuses on and is validated with VF representors only.
> 
> It is related to VF representors only. BlueField is just an example. 

By "BlueField specifics", I mean the translation of -1 to 0 which so far is
specific to BlueField. Another patch is needed for that.

For devices where representors are properly numbered starting from 0, we must
rely on the uninterpreted phys_port_name value directly, which must be a
positive integer instead of a qsort() interpretation in order to properly
handle holes in the sequence due to missing devices (netns).

I intend to modify this patch as described.

-- 
Adrien Mazarguil
6WIND

Reply via email to