16/03/2022 13:25, Ilya Maximets: > On 3/16/22 10:41, Thomas Monjalon wrote: > > 15/03/2022 23:12, Ilya Maximets: > >> Hi, everyone. > >> > >> After implementing support for tunnel offloading in OVS we faced a > >> significant performance issue caused by the requirement to call > >> rte_flow_get_restore_info() function on a per-packet basis. > >> > >> The main problem is that once the tunnel offloading is configured, > >> there is no way for the application to tell if a particular packet > >> was partially processed in the hardware and the tunnel info has to > >> be restored. What we have to do right now is to call the > >> rte_flow_get_restore_info() unconditionally for every packet. The > >> result of that call says if we have the tunnel info or not. > >> > >> rte_flow_get_restore_info() call itself is very heavy. It is at > >> least a couple of indirect function calls and the device lock > >> on the application side (not-really-thread-safety of the rte_flow > >> API is a separate topic). Internal info lookup inside the driver > >> also costs a lot (depends on a driver). > >> > >> It has been measured that having this call on a per-packet basis can > >> reduce application performance (OVS) by a factor of 3 in some cases: > >> https://mail.openvswitch.org/pipermail/ovs-dev/2021-November/389265.html > >> > >> https://github.com/openvswitch/ovs/commit/6e50c1651869de0335eb4b7fd0960059c5505f5c > >> (Above patch avoid the problem in a hacky way for devices that doesn't > >> support tunnel offloading, but it's not applicable to situation > >> where device actually supports it, since the API has to be called.) > >> > >> Another tricky part is that we have to call rte_flow_get_restore_info() > >> before checking other parts of the mbuf, because mlx5 driver, for > >> example, re-uses the mbuf.hash.fdir value for both tunnel info > >> restoration and classification offloading, so the application has > >> no way to tell which one is used right now and has to call the > >> restoration API first in order to find out. > >> > >> > >> What we need: > >> > >> A generic and fast (couple of CPU cycles) API that will clearly say > >> if the heavy rte_flow_get_restore_info() has to be called for a > >> particular packet or not. Ideally, that should be a static mbuf > >> flag that can be easily checked by the application. > > > > A dynamic mbuf flag, defined in the API, looks to be a good idea. > > Makes sense. OTOH, I'm not sure what is the profit of having it > dynamically allocated if it will need to be always defined.
True. We need to discuss whether we can have situations where it is not registered at all. We recently introduced a function for initial config of rte_flow, it could be the trigger to register such flag or field. > But, well, it doesn't really matter, I guess. That's a detail but it should be discussed. > >> Calls inside the device driver are way too slow for that purpose, > >> especially if they are not fully thread-safe, or require complex > >> lookups or calculations. > >> > >> I'm assuming here that packets that really need the tunnel info > >> restoration should be fairly rare. > >> > >> > >> Current state: > >> > >> Currently, the get_restore_info() API is implemented only for mlx5 > >> and sfc drivers, AFAICT. > >> SFC driver is already using mbuf flag, but > >> it's dynamic and not exposed to the application. > > > > What is this flag? > > SFC driver defines a dynamic field 'rte_net_sfc_dynfield_ft_id' > and the corresponding flag 'rte_net_sfc_dynflag_ft_id_valid' to > check if the field contains a valid data. OK, so we could deprecate this flag. > >> MLX5 driver re-uses mbuf.hash.fdir value > >> and performs a heavy lookup inside the driver. > > > > We should avoid re-using a field. > > +1 from me, but I'm not familiar with the mlx5 driver enough > to tell how to change it. > > > > >> For now OVS doesn't support tunnel offload with DPDK formally, the > >> code in OVS is under the experimental API ifdef and not compiled-in > >> by default. > >> > >> //Let me know if there is more formal way to submit such requests. > > > > That's a well written request, thanks. > > If you are looking for something more formal, > > it could be a patch in the API. > > I'm not looking for now. :) > I think we need an agreement from driver maintainers first. And > I don't think we can introduce such API change without changing > drivers first, because otherwise we'll end up with the 'has_vlan' > situation and the broken offloading support. OK