Den tis 31 juli 2018 kl 04:49 skrev Jakub Kicinski
<jakub.kicin...@netronome.com>:
>
> On Mon, 30 Jul 2018 14:49:32 +0200, Björn Töpel wrote:
> > Den tors 26 juli 2018 kl 23:44 skrev Jakub Kicinski:
> > >
> > > Hi!
> >
> > Thanks for spending your time on this, Jakub. I'm (temporarily) back
> > for a week, so you can expect faster replies now...
> >
> > > This set tries to make the core take care of error checking for the
> > > drivers.  In particular making sure that the AF_XDP UMEM is not installed
> > > on queues which don't exist (or are disabled) and that changing queue
> > > (AKA ethtool channel) count cannot disable queues with active AF_XDF
> > > zero-copy sockets.
> > >
> > > I'm sending as an RFC because I'm not entirely sure what the desired
> > > behaviour is here.  Is it Okay to install AF_XDP on queues which don't
> > > exist?  I presume not?
> >
> > Your presumption is correct. The idea with the
> > real_num_rx_queues/real_num_tx_queues check in xsk_bind, is to bail
> > out if the queue doesn't exist at bind call. Note that we *didn't* add
> > any code to avoid the bound queue from being removed via set channel
> > (your patch 6). Our idea was that if you remove a queue, the ingress
> > frames would simply stop flowing, and the queue config change checking
> > was "out-of-band".
> >
> > I think I prefer your approach, i.e. not allowing the channels/queues
> > to change if they're bound to an AF_XDP socket. However, your
> > xdp_umem_query used in ethtool only works for ZC enabled drivers, not
> > for the existing non-ZC/copy case. If we'd like to go the route of
> > disabling ethtool_set_channels for an AF_XDP enabled queue this
> > functionality needs to move the query into netdev core, so we have a
> > consistent behavior.
>
> Agreed.  There seems to be no notification for changing the number of
> queues and therefore no very clean way to solve this today.

I'm probably lacking some history here; Has there been any past
efforts in making channels/queues a "kernel object"? Would it make
sense to add notifications for queue changes analogous to netdev
changes?

> The last
> two patches are more of a courtesy to the drivers, to simplify the
> data structure for holding the UMEMs.
>
> I could argue that driver and stack are not really apples to apples.
> Much like Generic XDP, the skb-based AF_XDP is basically a development
> tool and last-resort fallback.

Hmm... I partially agree. Let me think about it a bit more.

> For TX driver will most likely allocate
> separate queues, while skb-based will use the stack's queues.  These
> are actually different queues.  Stack will also fallback to other queue
> in __netdev_pick_tx() if number of queues changes.
>

Yup, ideally the driver will use a dedicated queue. We had some
thoughts on hijacking the skb Tx queue, and route the stack egress
packets elsewhere, but it ended up way too messy.

> But yes, preferably skb-based and ZC should behave the same..
>
> > > Are the AF_XDP queue_ids referring to TX queues
> > > as well as RX queues in case of the driver?  I presume not?
> >
> > We've had a lot of discussions about this internally. Ideally, we'd
> > like to give driver implementors the most freedom, and not enforcing a
> > certain queue scheme for Tx.
>
> You say freedom I hear diverging implementations and per-driver
> checks in user space ;-)
>

Yeah. :-) Well, for the i40e ZC implementation, the Tx queue id was
not equal to Rx queue id, so to answer your question: "Correct, the
queue id refer to Rx."

> Practically speaking unless you take the xmit lock there is little
> chance of reusing stack's TX queues, so you'd have to allocate a
> separate queue one way or the other..  At which point the number of
> stack's TX queues has no bearing on AF_XDP ZC.
>

Yup, you're right.

> > OTOH it makes it weird for the userland
> > application *not* to have the same id, e.g. if a userland application
> > would like to get stats or configure the AF_XDP bound Tx queue --
> > which id is it? Should the Tx queue id  for an xsk be exposed in
> > sysfs?
>
> I'd not go there.
>

Honestly, me neither. I need to think more about to expose the Tx
queue pulls/knobs for a control plane.

> > If the id is *not* the same, would it be OK to change the number of
> > channels and Tx would continue to operate correctly? A related
> > question; An xsk with *only* Tx, should it be constrained by the
> > number of (enabled) Rx queues?
>
> Good question, are drivers even supposed to care about tx-only/rx-only?
> From driver's perspective rx-only socket will simply never have
> anything to transmit, and tx-only socket will never successfully
> xdp_do_redirect().  So IMHO - yes, tx-only XSK still only cares about
> RX queues, driver doesn't know no RX will ever happen.  The fact that
> user has to populate the FILL queue on a tx-only socket may be counter
> intuitive to many...
>

Not allowing tx-only/rx-only definitely makes it easier. Many NICs,
however, have non-symmetrical # of Tx/Rx queues. Scenarios where one
would have #txq >> #rxq would then be hard to support. Again this
could be worked around with virtual netdevs, so maybe it makes sense
to keep the socket layer simple.

> I was considering proposing a change to drop the *x-only option in the
> net tree, I don't really see much use for it :S  And it potentially
> creates weird corner cases.
>
> > I'd be happy to hear some more opinions/thoughts on this...
>
> +1
>
> > > Should we try to prevent disabling queues which have non zero-copy
> > > sockets installed as well? :S
> > >
> >
> > Yes, the ZC/non-ZC case must be consistent IMO. See comment above.
> >
> > > Anyway, if any of those patches seem useful and reasonable, please
> > > let me know I will repost as non-RFC.
> > >
> >
> > I definitely think patch 2 and 3 (and probably 1) should go as
> > non-RFC!
>
> Thanks Björn, I will submit the first three!
>

Thanks! I've already taken them for a spin and acked them!


Björn

> > Thanks for spotting that we're not holding the rtnl lock when checking
> > the # queues (patch 4)!
> >
> > Björn

Reply via email to