Den tis 31 juli 2018 kl 04:49 skrev Jakub Kicinski <jakub.kicin...@netronome.com>: > > On Mon, 30 Jul 2018 14:49:32 +0200, Björn Töpel wrote: > > Den tors 26 juli 2018 kl 23:44 skrev Jakub Kicinski: > > > > > > Hi! > > > > Thanks for spending your time on this, Jakub. I'm (temporarily) back > > for a week, so you can expect faster replies now... > > > > > This set tries to make the core take care of error checking for the > > > drivers. In particular making sure that the AF_XDP UMEM is not installed > > > on queues which don't exist (or are disabled) and that changing queue > > > (AKA ethtool channel) count cannot disable queues with active AF_XDF > > > zero-copy sockets. > > > > > > I'm sending as an RFC because I'm not entirely sure what the desired > > > behaviour is here. Is it Okay to install AF_XDP on queues which don't > > > exist? I presume not? > > > > Your presumption is correct. The idea with the > > real_num_rx_queues/real_num_tx_queues check in xsk_bind, is to bail > > out if the queue doesn't exist at bind call. Note that we *didn't* add > > any code to avoid the bound queue from being removed via set channel > > (your patch 6). Our idea was that if you remove a queue, the ingress > > frames would simply stop flowing, and the queue config change checking > > was "out-of-band". > > > > I think I prefer your approach, i.e. not allowing the channels/queues > > to change if they're bound to an AF_XDP socket. However, your > > xdp_umem_query used in ethtool only works for ZC enabled drivers, not > > for the existing non-ZC/copy case. If we'd like to go the route of > > disabling ethtool_set_channels for an AF_XDP enabled queue this > > functionality needs to move the query into netdev core, so we have a > > consistent behavior. > > Agreed. There seems to be no notification for changing the number of > queues and therefore no very clean way to solve this today.
I'm probably lacking some history here; Has there been any past efforts in making channels/queues a "kernel object"? Would it make sense to add notifications for queue changes analogous to netdev changes? > The last > two patches are more of a courtesy to the drivers, to simplify the > data structure for holding the UMEMs. > > I could argue that driver and stack are not really apples to apples. > Much like Generic XDP, the skb-based AF_XDP is basically a development > tool and last-resort fallback. Hmm... I partially agree. Let me think about it a bit more. > For TX driver will most likely allocate > separate queues, while skb-based will use the stack's queues. These > are actually different queues. Stack will also fallback to other queue > in __netdev_pick_tx() if number of queues changes. > Yup, ideally the driver will use a dedicated queue. We had some thoughts on hijacking the skb Tx queue, and route the stack egress packets elsewhere, but it ended up way too messy. > But yes, preferably skb-based and ZC should behave the same.. > > > > Are the AF_XDP queue_ids referring to TX queues > > > as well as RX queues in case of the driver? I presume not? > > > > We've had a lot of discussions about this internally. Ideally, we'd > > like to give driver implementors the most freedom, and not enforcing a > > certain queue scheme for Tx. > > You say freedom I hear diverging implementations and per-driver > checks in user space ;-) > Yeah. :-) Well, for the i40e ZC implementation, the Tx queue id was not equal to Rx queue id, so to answer your question: "Correct, the queue id refer to Rx." > Practically speaking unless you take the xmit lock there is little > chance of reusing stack's TX queues, so you'd have to allocate a > separate queue one way or the other.. At which point the number of > stack's TX queues has no bearing on AF_XDP ZC. > Yup, you're right. > > OTOH it makes it weird for the userland > > application *not* to have the same id, e.g. if a userland application > > would like to get stats or configure the AF_XDP bound Tx queue -- > > which id is it? Should the Tx queue id for an xsk be exposed in > > sysfs? > > I'd not go there. > Honestly, me neither. I need to think more about to expose the Tx queue pulls/knobs for a control plane. > > If the id is *not* the same, would it be OK to change the number of > > channels and Tx would continue to operate correctly? A related > > question; An xsk with *only* Tx, should it be constrained by the > > number of (enabled) Rx queues? > > Good question, are drivers even supposed to care about tx-only/rx-only? > From driver's perspective rx-only socket will simply never have > anything to transmit, and tx-only socket will never successfully > xdp_do_redirect(). So IMHO - yes, tx-only XSK still only cares about > RX queues, driver doesn't know no RX will ever happen. The fact that > user has to populate the FILL queue on a tx-only socket may be counter > intuitive to many... > Not allowing tx-only/rx-only definitely makes it easier. Many NICs, however, have non-symmetrical # of Tx/Rx queues. Scenarios where one would have #txq >> #rxq would then be hard to support. Again this could be worked around with virtual netdevs, so maybe it makes sense to keep the socket layer simple. > I was considering proposing a change to drop the *x-only option in the > net tree, I don't really see much use for it :S And it potentially > creates weird corner cases. > > > I'd be happy to hear some more opinions/thoughts on this... > > +1 > > > > Should we try to prevent disabling queues which have non zero-copy > > > sockets installed as well? :S > > > > > > > Yes, the ZC/non-ZC case must be consistent IMO. See comment above. > > > > > Anyway, if any of those patches seem useful and reasonable, please > > > let me know I will repost as non-RFC. > > > > > > > I definitely think patch 2 and 3 (and probably 1) should go as > > non-RFC! > > Thanks Björn, I will submit the first three! > Thanks! I've already taken them for a spin and acked them! Björn > > Thanks for spotting that we're not holding the rtnl lock when checking > > the # queues (patch 4)! > > > > Björn