Hi Konstantin,

On Sat, May 27, 2017 at 07:12:16PM +0800, Ananyev, Konstantin wrote:
>
>
> > -----Original Message-----
> > From: Hu, Jiayu
> > Sent: Saturday, May 27, 2017 4:42 AM
> > To: Ananyev, Konstantin <konstantin.anan...@intel.com>
> > Cc: dev@dpdk.org; Wiles, Keith <keith.wi...@intel.com>; yuanhan....@linux.intel.com
> > Subject: Re: [PATCH v3 1/3] lib: add Generic Receive Offload API framework
> >
> > On Sat, May 27, 2017 at 07:10:21AM +0800, Ananyev, Konstantin wrote:
> > > Hi Jiayu,
> > >
> > > > -----Original Message-----
> > > > From: Hu, Jiayu
> > > > Sent: Friday, May 26, 2017 8:26 AM
> > > > To: Ananyev, Konstantin <konstantin.anan...@intel.com>
> > > > Cc: dev@dpdk.org; Wiles, Keith <keith.wi...@intel.com>; yuanhan....@linux.intel.com
> > > > Subject: Re: [PATCH v3 1/3] lib: add Generic Receive Offload API framework
> > > >
> > > > Hi Konstantin,
> > > >
> > > > On Wed, May 24, 2017 at 08:38:25PM +0800, Ananyev, Konstantin wrote:
> > > > >
> > > > > Hi Jiayu,
> > > > >
> > > > > >
> > > > > > Hi Konstantin,
> > > > > >
> > > > > > Thanks for your comments. My replies/questions are below.
> > > > > >
> > > > > > BRs,
> > > > > > Jiayu
> > > > > >
> > > > > > On Mon, May 22, 2017 at 05:19:19PM +0800, Ananyev, Konstantin wrote:
> > > > > > > Hi Jiayu,
> > > > > > > My comments/questions below.
> > > > > > > Konstantin
> > > > > > >
> > > > > > >
> > > > > > > > For applications, DPDK GRO provides three external functions to
> > > > > > > > enable/disable GRO:
> > > > > > > > - rte_gro_init: initialize GRO environment;
> > > > > > > > - rte_gro_enable: enable GRO for a given port;
> > > > > > > > - rte_gro_disable: disable GRO for a given port.
> > > > > > > > Before using GRO, applications should explicitly call rte_gro_init to
> > > > > > > > initialize the GRO environment. After that, applications can call
> > > > > > > > rte_gro_enable to enable GRO and call rte_gro_disable to disable GRO
> > > > > > > > for specific ports.
> > > > > > >
> > > > > > > I think this is too restrictive and wouldn't meet various users' needs.
> > > > > > > User might want to:
> > > > > > > - enable/disable GRO for a particular RX queue
> > > > > > > - or even set up different GRO types for different RX queues,
> > > > > > >   i.e., GRO over IPV4/TCP for queue 0, and GRO over IPV6/TCP for queue 1, etc.
> > > > > >
> > > > > > The reason for enabling/disabling GRO per-port instead of per-queue is
> > > > > > that LINUX controls GRO per-port. Controlling GRO per-queue can indeed
> > > > > > provide more flexibility to applications. But are there any scenarios
> > > > > > where different queues of a port may require different GRO control
> > > > > > (i.e. GRO types and enable/disable GRO)?
> > > > >
> > > > > I think yes.
> > > > > >
> > > > > > >
> > > > > > > - User might prefer not to use RX callbacks for various reasons,
> > > > > > >   but invoke gro() manually at some point in his code.
> > > > > >
> > > > > > A GRO library that applications call directly can give them more
> > > > > > flexibility. Besides, when GRO is performed in the ethdev layer or inside
> > > > > > PMD drivers, it is an issue whether rte_eth_rx_burst returns the actually
> > > > > > received packet number or the GROed packet number. And the same issue
> > > > > > happens in GSO, and even more seriously.
> > > > > > This is because applications, like VPP, always rely on the return value
> > > > > > of rte_eth_tx_burst to decide further operations. If applications can
> > > > > > directly call GRO/GSO libraries, this issue won't exist. And DPDK is a
> > > > > > library, which is not a holistic system like LINUX. We don't need to do
> > > > > > the same as LINUX. Therefore, maybe it's a better idea to directly
> > > > > > provide SW segmentation/reassembling libraries to applications.
> > > > > > >
> > > > > > > - Many users would like to control size (number of flows/items per flow),
> > > > > > >   max allowed packet size, max timeout, etc., for different GRO tables.
> > > > > > > - User would need a way to flush all or only timed-out packets from
> > > > > > >   particular GRO tables.
> > > > > > >
> > > > > > > So I think that API needs to be extended to become much more fine-grained.
> > > > > > > Something like that:
> > > > > > >
> > > > > > > struct rte_gro_tbl_param {
> > > > > > >    int32_t socket_id;
> > > > > > >    size_t max_flows;
> > > > > > >    size_t max_items_per_flow;
> > > > > > >    size_t max_pkt_size;
> > > > > > >    uint64_t packet_timeout_cycles;
> > > > > > >    <desired GRO types (IPV4_TCP | IPV6_TCP, ...)>
> > > > > > >    <probably type specific params>
> > > > > > >    ...
> > > > > > > };
> > > > > > >
> > > > > > > struct rte_gro_tbl;
> > > > > > > struct rte_gro_tbl *rte_gro_tbl_create(const struct rte_gro_tbl_param *param);
> > > > > > >
> > > > > > > void rte_gro_tbl_destroy(struct rte_gro_tbl *tbl);
> > > > > >
> > > > > > Yes, I agree with you. It's necessary to provide more fine-grained control
> > > > > > APIs to applications. But what's 'packet_timeout_cycles' used for? Is it
> > > > > > for TCP packets?
> > > > >
> > > > > For any packet that sits in the gro_table for too long.
> > > > >
> > > > > > >
> > > > > > > /*
> > > > > > >  * process packets, might store some packets inside the GRO table,
> > > > > > >  * returns number of filled entries in pkt[]
> > > > > > >  */
> > > > > > > uint32_t rte_gro_tbl_process(struct rte_gro_tbl *tbl, struct rte_mbuf *pkt[], uint32_t num);
> > > > > > >
> > > > > > > /*
> > > > > > >  * retrieves up to num timed-out packets from the table.
> > > > > > >  */
> > > > > > > uint32_t rte_gro_tbl_timeout(struct rte_gro_tbl *tbl, uint64_t tmt, struct rte_mbuf *pkt[], uint32_t num);
> > > > > >
> > > > > > Currently, we implement GRO as an RX callback, whose processing logic is
> > > > > > simple: receive burst packets -> perform GRO -> return to application. GRO
> > > > > > stops after finishing processing the received packets. If we provide
> > > > > > rte_gro_tbl_timeout, when and who will call it?
> > > > >
> > > > > I mean the following scenario:
> > > > > We receive a packet, find it is eligible for GRO and put it into the
> > > > > gro_table in the expectation that there will be more packets for the same
> > > > > flow. But it could happen that we never (or only after a long time) receive
> > > > > any new packets for that flow. So the first packet would never be delivered
> > > > > to the upper layer, or delivered too late.
> > > > > So we need a mechanism to extract such not-merged packets and push them to
> > > > > the upper layer. My thought is it would be the application's responsibility
> > > > > to call it from time to time.
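
(Just to make the flush idea concrete for myself, I sketched what the
application side would roughly have to look like. rte_gro_tbl_process() and
rte_gro_tbl_timeout() here are the hypothetical APIs from your proposal above,
not functions in the current patch.)

#include <rte_ethdev.h>
#include <rte_cycles.h>
#include <rte_mbuf.h>

/* hypothetical table type and APIs, as sketched in the proposal above */
struct rte_gro_tbl;
uint32_t rte_gro_tbl_process(struct rte_gro_tbl *tbl, struct rte_mbuf *pkt[], uint32_t num);
uint32_t rte_gro_tbl_timeout(struct rte_gro_tbl *tbl, uint64_t tmt, struct rte_mbuf *pkt[], uint32_t num);

#define BURST 32

static void
rx_loop(uint8_t port, uint16_t queue, struct rte_gro_tbl *tbl)
{
        struct rte_mbuf *pkts[BURST];
        /* flush packets that stayed in the table longer than ~10ms */
        uint64_t tmt = rte_get_tsc_hz() / 100;

        for (;;) {
                uint16_t nb_rx = rte_eth_rx_burst(port, queue, pkts, BURST);
                /* some packets may be kept inside tbl, waiting to be merged */
                uint32_t nb_out = rte_gro_tbl_process(tbl, pkts, nb_rx);
                /* ... deliver nb_out merged packets to the upper layer ... */

                /* periodically push out packets that were never merged */
                nb_out = rte_gro_tbl_timeout(tbl, tmt, pkts, BURST);
                /* ... deliver these packets as they are ... */
        }
}
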
> > > > > That's actually another reason why I think we shouldn't force the
> > > > > application to always use RX callbacks for GRO.
> > > >
> > > > Currently, we only provide one reassembly function, rte_gro_reassemble_burst,
> > > > which merges N input packets at a time. After finishing processing these
> > > > packets, it returns all of them and resets the hashing tables. Therefore,
> > > > there are no packets left in the hashing tables after rte_gro_reassemble_burst
> > > > returns.
> > >
> > > Ok, sorry I missed that part with rte_hash_reset().
> > > So gro_reassemble_burst() performs only inline processing on the current input
> > > packets and doesn't try to save packets for future merging, correct?
> >
> > Yes.
> >
> > > Such an approach indeed is much more lightweight and doesn't require any
> > > extra timeouts and flush().
> > > So my opinion is let's keep it like that - nice and simple.
> > > BTW, I think in that case we don't need any hash tables (or any other
> > > persistent structures) at all.
> > > What we need is just the set of GROs (tcp4, tcp6, etc.) we want to perform
> > > on a given array of packets.
> >
> > Besides the GRO types to perform, maybe it also needs max_pkt_size and some
> > GRO type specific information?
>
> Yes, but we don't need the actual hash tables, etc. inside.
> Passing something like struct gro_param seems enough.
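
(For reference, I read "struct gro_param" as something roughly like the below -
only a sketch to make the discussion concrete, the field names are not final.)

#include <stdint.h>

/* bit-mask of the GRO types the caller wants to perform */
#define GRO_TCP_IPV4    (1ULL << 0)
#define GRO_TCP_IPV6    (1ULL << 1)

struct gro_param {
        uint64_t desired_gro_types;     /* e.g. GRO_TCP_IPV4 | GRO_TCP_IPV6 */
        uint32_t max_pkt_size;          /* don't merge packets beyond this size */
        uint16_t max_flow_num;          /* sizing hints for the reassembly tables */
        uint16_t max_item_per_flow;
};
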
Yes, we can just pass gro_param and allocate the hashing tables inside
rte_gro_reassemble_burst. If so, the hashing tables of the desired GRO types are
created and freed in each invocation of rte_gro_reassemble_burst. In the GRO
library, hashing tables are created by the GRO-type-specific gro_tbl_create_fn.
These gro_tbl_create_fn may allocate hashing table space via malloc (or
rte_malloc). Therefore, we may frequently call malloc/free when using
rte_gro_reassemble_burst, which, in my opinion, will degrade GRO performance
greatly. But if we ask applications to pass in the hashing tables, all we need
to do is reset them after we finish using them in rte_gro_reassemble_burst,
rather than malloc and free them on each call. Therefore, I think this way is
more efficient. What do you think? (I have put a rough sketch of what I mean at
the end of this mail.)

> > > >
> > > > If we provide rte_gro_tbl_timeout, we also need to provide another
> > > > reassembly function, like rte_gro_reassemble, which processes one given
> > > > packet at a time and doesn't reset the hashing tables. Applications decide
> > > > when to flush the packets in the hashing tables. And rte_gro_tbl_timeout is
> > > > one of the ways that can be used to flush the packets. Do you mean that?
> > >
> > > Yes, that's what I meant, but as I said above - I think your approach is
> > > probably preferable - it is much simpler and more lightweight.
> > > Konstantin
> > >
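
Here is the rough sketch of the caller-provided-table variant I mentioned above
(the names are only illustrative, not a final API proposal):

#include <stdint.h>
#include <rte_mbuf.h>

struct gro_param;       /* GRO types, max_pkt_size, ... as discussed above */

/* opaque TCP/IPv4 reassembly table, created once by the application */
struct gro_tcp4_tbl;

/* created at initialization time, e.g. one table per polling lcore */
struct gro_tcp4_tbl *gro_tcp4_tbl_create(uint16_t max_flow_num,
                uint16_t max_item_per_flow, int socket_id);
void gro_tcp4_tbl_destroy(struct gro_tcp4_tbl *tbl);

/*
 * Merge the nb_pkts packets in pkts[] in one shot, using the table(s) the
 * caller passes in. On return the tables are only reset, so no malloc/free
 * happens per burst. Returns the number of packets left in pkts[] after
 * merging.
 */
uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts, uint16_t nb_pkts,
                const struct gro_param *param, struct gro_tcp4_tbl *tcp4_tbl);

With this, an application keeps one set of tables per polling lcore and reuses
them across bursts, so the per-burst cost is only the table reset.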