Hi Ori,

A few questions on top of Ferruh's, inlined below, to better understand the concept:

> -----Original Message-----
> From: Ori Kam <or...@nvidia.com>
> Sent: Thursday, December 14, 2023 2:17 PM
> To: Ferruh Yigit <ferruh.yi...@amd.com>; Dariusz Sosnowski
> <dsosnow...@nvidia.com>; Dumitrescu, Cristian
> <cristian.dumitre...@intel.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
> <tho...@monjalon.net>; Andrew Rybchenko
> <andrew.rybche...@oktetlabs.ru>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasl...@nvidia.com>
> Subject: RE: [RFC] ethdev: introduce entropy calculation
>
> Hi Ferruh,
>
> > -----Original Message-----
> > From: Ferruh Yigit <ferruh.yi...@amd.com>
> > Sent: Thursday, December 14, 2023 1:35 PM
> >
> > On 12/10/2023 8:30 AM, Ori Kam wrote:
> > > When offloading rules with the encap action, the HW may calculate entropy based on the encap protocol.
> > > Each HW can implement a different algorithm.
> > >
> >
> > Hi Ori,
> >
> > Can you please provide more details what this 'entropy' is used for,
> > what is the usecase?
>
> Sure, in some tunnel protocols, for example, VXLAN, NVGE it is possible to add entropy value in one of the
> fields of the tunnel. In VXLAN for example it is in the source port,
> From the VXLAN protocol:
> Source Port: It is recommended that the UDP source port number
> be calculated using a hash of fields from the inner packet --
> one example being a hash of the inner Ethernet frame's headers.
> This is to enable a level of entropy for the ECMP/load-balancing
> of the VM-to-VM traffic across the VXLAN overlay.
> When calculating the UDP source port number in this manner, it
> is RECOMMENDED that the value be in the dynamic/private port
> range 49152-65535 [RFC6335].
>
> Since encap groups number of different 5 tuples together, if HW doesn’t know how to RSS
> based on the inner application will not be able to get any distribution of packets.
>
> This value is used to reflect the inner packet on the outer header, so distribution
> will be possible.
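
To make the VXLAN recommendation quoted above concrete, here is a minimal sketch of deriving a UDP source port from a hash of the inner headers and mapping it into the 49152-65535 range. The hash (FNV-1a) is only a stand-in chosen for illustration: as the RFC text says, each HW can implement a different algorithm, so this is not what any particular device computes. The function names are hypothetical and not part of DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * FNV-1a 32-bit hash. ASSUMPTION: used purely as a placeholder for the
 * device-specific entropy algorithm, which the RFC leaves to the HW.
 */
static uint32_t fnv1a32(const uint8_t *data, size_t len)
{
	uint32_t h = 2166136261u; /* FNV offset basis */

	assert(data != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		h ^= data[i];
		h *= 16777619u; /* FNV prime */
	}
	return h;
}

/*
 * Hypothetical helper: map a hash of the inner headers into the
 * dynamic/private port range 49152-65535. The range holds 16384 (2^14)
 * ports, so the low 14 bits of the folded hash select the offset.
 */
static uint16_t vxlan_src_port_from_inner(const uint8_t *inner_hdr, size_t len)
{
	uint32_t h = fnv1a32(inner_hdr, len);

	h ^= h >> 16; /* fold the upper bits into the lower ones */
	return (uint16_t)(49152u + (h & 0x3FFFu));
}
```

All packets of one inner flow hash to the same source port, while different inner flows tend to land on different ports, which is what lets ECMP or RSS on the outer header spread the traffic.
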
>
> The main use case is, if application does full offload and implements the encap on the RX.
> For example:
> Ingress/FDB match on 5 tuple encap send to hairpin / different port in case of switch.
>

Smart idea! So basically the user is able to get an idea of how good the RSS distribution is, correct?

Can you elaborate a bit on how the entropy is measured: is it a number, what is the range of values, does a higher value mean better, etc.?

> The issue starts when there is a miss on the 5 tuple table for example, due to syn packet.
> A packet arrives at the application, and then the application offloads the rule.
> So the application must encap the packet and set the same entropy as the HW will do for all the rest
> of the packets.
>

How can the app set the entropy?

>
> > >
> > > When the application receives packets that should have been
> > > encaped by the HW, but didn't reach this stage yet (for example TCP SYN packets),
> > > then when encap is done in SW, application must apply
> > > the same entropy calculation algorithm.
> > >
> > > Using the new API application can request the PMD to calculate the
> > > value as if the packet passed in the HW.
> > >
> >
> > So is this new API a datapath API? Is the intention that application
> > call this API per packet that is missing 'entropy' information?
>
> The application will call this API when it gets a packet that it knows, that the rest of the
> packets from this connection will be offloaded and encaped by the HW.
> (see above explanation)
>
> > >
> > > Signed-off-by: Ori Kam <or...@nvidia.com>
> > > ---
> > >  lib/ethdev/rte_flow.h | 49 +++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 49 insertions(+)
> > >
> > > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> > > index affdc8121b..3989b089dd 100644
> > > --- a/lib/ethdev/rte_flow.h
> > > +++ b/lib/ethdev/rte_flow.h
> > > @@ -6753,6 +6753,55 @@ rte_flow_calc_table_hash(uint16_t port_id, const struct rte_flow_template_table
> > >  			 const struct rte_flow_item pattern[], uint8_t pattern_template_index,
> > >  			 uint32_t *hash, struct rte_flow_error *error);
> > >
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Destination field type for the entropy calculation.
> > > + *
> > > + * @see function rte_flow_calc_encap_entropy
> > > + */
> > > +enum rte_flow_entropy_dest {
> > > +	/* Calculate entropy placed in UDP source port field. */
> > > +	RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT,
> > > +	/* Calculate entropy placed in NVGRE flow ID field. */
> > > +	RTE_FLOW_ENTROPY_DEST_NVGRE_FLOW_ID,
> > > +};
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Calculate the entropy generated by the HW for a given pattern,
> > > + * when encapsulation flow action is executed.
> > > + *
> > > + * @param[in] port_id
> > > + *   Port identifier of Ethernet device.
> > > + * @param[in] pattern
> > > + *   The values to be used in the entropy calculation.
> > > + * @param[in] dest_field
> > > + *   Type of destination field for entropy calculation.
> > > + * @param[out] entropy
> > > + *   Used to return the calculated entropy. It will be written in network order,
> > > + *   so entropy[0] is the MSB.
> > > + *   The number of bytes is based on the destination field type.
> > >
> >
> >
> > Is the size same as field size in the 'dest_field'?
> > Like for 'RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT' is it two bytes?
>
> Yes,
>
> > >
> > > + * @param[out] error
> > > + *   Perform verbose error reporting if not NULL.
> > > + *   PMDs initialize this structure in case of error only.
> > > + *
> > > + * @return
> > > + *   - (0) if success.
> > > + *   - (-ENODEV) if *port_id* invalid.
> > > + *   - (-ENOTSUP) if underlying device does not support this functionality.
> > > + *   - (-EINVAL) if *pattern* doesn't hold enough information to calculate the entropy
> > > + *               or the dest is not supported.
> > > + */
> > > +__rte_experimental
> > > +int
> > > +rte_flow_calc_encap_entropy(uint16_t port_id, const struct rte_flow_item pattern[],
> > > +			    enum rte_flow_entropy_dest dest_field, uint8_t *entropy,
> > > +			    struct rte_flow_error *error);
> > > +
> > >  #ifdef __cplusplus
> > >  }
> > >  #endif

Thanks,
Cristian
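
For reference, the byte-order contract in the quoted doc comment (entropy is written in network order, entropy[0] is the MSB, two bytes for the UDP source port case) could be consumed roughly as in the sketch below. It assumes the two bytes have already been produced by the proposed API; the struct and helper names are hypothetical and not part of DPDK.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local UDP header layout; all fields in network byte order. */
struct udp_hdr_sketch {
	uint16_t src_port;
	uint16_t dst_port;
	uint16_t len;
	uint16_t cksum;
};

/*
 * The buffer is already in network order, so it can be copied into the
 * header as-is, with no byte swap on either little- or big-endian hosts.
 */
static void set_udp_src_port_from_entropy(struct udp_hdr_sketch *udp,
					  const uint8_t entropy[2])
{
	assert(udp != NULL && entropy != NULL);
	memcpy(&udp->src_port, entropy, 2);
}

/* Host-order view of the same value, e.g. for logging: entropy[0] is the MSB. */
static uint16_t entropy_to_host_u16(const uint8_t entropy[2])
{
	return (uint16_t)((entropy[0] << 8) | entropy[1]);
}
```

With entropy bytes {0xC0, 0x01}, the first two bytes of the header on the wire are 0xC0 0x01, and the host-order value is 0xC001.
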