Hi Jerin:

BR
Rongwei

> -----Original Message-----
> From: Jerin Jacob <jerinjac...@gmail.com>
> Sent: Wednesday, December 21, 2022 20:45
> To: Rongwei Liu <rongw...@nvidia.com>
> Cc: Matan Azrad <ma...@nvidia.com>; Slava Ovsiienko
> <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>; Ferruh Yigit
> <ferruh.yi...@amd.com>; Andrew Rybchenko
> <andrew.rybche...@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
> <rasl...@nvidia.com>
> Subject: Re: [RFC v3 2/2] ethdev: add API to set process to active or standby
> 
> On Wed, Dec 21, 2022 at 5:35 PM Rongwei Liu <rongw...@nvidia.com> wrote:
> >
> > Hi Jerin:
> >
> > BR
> > Rongwei
> >
> > > -----Original Message-----
> > > From: Jerin Jacob <jerinjac...@gmail.com>
> > > Sent: Wednesday, December 21, 2022 19:00
> > > To: Rongwei Liu <rongw...@nvidia.com>
> > > Cc: Matan Azrad <ma...@nvidia.com>; Slava Ovsiienko
> > > <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; NBU-Contact-
> > > Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>; Ferruh Yigit
> > > <ferruh.yi...@amd.com>; Andrew Rybchenko
> > > <andrew.rybche...@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
> > > <rasl...@nvidia.com>
> > > Subject: Re: [RFC v3 2/2] ethdev: add API to set process to active
> > > or standby
> > >
> > > On Wed, Dec 21, 2022 at 3:02 PM Rongwei Liu <rongw...@nvidia.com> wrote:
> > > >
> > > > Hi Jerin:
> > > >
> > >
> > > Hi Rongwei
> > >
> > > > BR
> > > > Rongwei
> > > >
> > > > > -----Original Message-----
> > > > > From: Jerin Jacob <jerinjac...@gmail.com>
> > > > > Sent: Wednesday, December 21, 2022 17:13
> > > > > To: Rongwei Liu <rongw...@nvidia.com>
> > > > > Cc: Matan Azrad <ma...@nvidia.com>; Slava Ovsiienko
> > > > > <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>;
> > > > > NBU-Contact- Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>;
> > > > > Ferruh Yigit <ferruh.yi...@amd.com>; Andrew Rybchenko
> > > > > <andrew.rybche...@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
> > > > > <rasl...@nvidia.com>
> > > > > Subject: Re: [RFC v3 2/2] ethdev: add API to set process to
> > > > > active or standby
> > > > >
> > > > > On Wed, Dec 21, 2022 at 2:31 PM Rongwei Liu <rongw...@nvidia.com> wrote:
> > > > > >
> > > > > > Users may want to change the DPDK process to different
> > > > > > versions
> > > > >
> > > > > Different version of DPDK? If there is any ABI change, how is this
> > > > > supported?
> > > > >
> > > > There is a new member introduced into rte_eth_dev_info, but it
> > > > shouldn't be ABI breaking since it uses reserved fields.
> > >
> > > That is just for rte_eth_dev_info. What about ABI changes in other
> > > ethdev structures and rte_flow structures across different DPDK ABI
> > > versions?
> > >
> > Besides this, there is no dependency on any other ABI change.
> >
> > Assume a DPDK process A is running version v21.11 and we plan to
> > upgrade to version v22.11. Let's call the v22.11 instance process B.
> 
> OK. That's a relief. I understand the use case now.
> 
> Why not simply use the standard DPDK multiprocess model then?
> The primary process acts as a server for the slow path API. Secondary
> processes can come and go (i.e. can be updated at runtime) and act as
> clients that update rules via the primary-secondary communication mechanism.
> 
Just imagine that process A and process B have an ABI breakage, such as
rte_flow_item_*** and rte_flow_action_*** structures with different sizes and members.
How can we quickly make primary/secondary ABI compatible across
different versions?
It would be a huge effort and difficult to implement, at least in my opinion.
What do you think?
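
As a rough illustration of the size/member problem, here is a minimal sketch in C. The two structs are made up for this example (they are not real rte_flow definitions), and the helper only models an older process writing an item into shared memory that a newer process then reads with its own layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical item layout as compiled into process A (older release). */
struct demo_item_v1 {
	uint16_t type;
	uint16_t port;
};

/* Hypothetical layout of the same item in process B (newer release). */
struct demo_item_v2 {
	uint16_t type;
	uint32_t flags;	/* member added between releases */
	uint16_t port;	/* same name, but now at a different offset */
};

/* The two builds disagree on the object size, so they disagree on the wire format. */
static_assert(sizeof(struct demo_item_v1) != sizeof(struct demo_item_v2),
	      "layouts differ between the two builds");

/*
 * Model a newer process reading an item that an older process wrote:
 * copy the v1 bytes into a zeroed buffer, then reinterpret the buffer
 * with the v2 layout and return the port the newer process would see.
 */
uint16_t
demo_port_seen_by_v2(uint16_t port_written_by_v1)
{
	struct demo_item_v1 sent = { .type = 1, .port = port_written_by_v1 };
	unsigned char shm[sizeof(struct demo_item_v2)] = { 0 };
	struct demo_item_v2 seen;

	memcpy(shm, &sent, sizeof(sent));
	memcpy(&seen, shm, sizeof(seen));
	return seen.port;
}
```

Calling demo_port_seen_by_v2() with any port value does not hand that value back, because the newer layout looks for the port at an offset the older writer never filled. A primary/secondary channel between two releases would need a translation layer for every such structure.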
> 
> >
> > Now, process A has been running for a long time and has a lot of rules
> > configured. It has the "active" role per this API definition.
> > Process B starts, calls this API, and sets itself to the "standby" role.
> > The user can then program flow rules as they want; different NIC vendors
> > may have different recommendations. Nvidia suggests programming only
> > group 0 rules in process B for now.
> >
> > The user should sync all desired configurations from process A to
> > process B. Process A then starts to yield traffic, for example by
> > deleting all group 0 rules (for Nvidia's NICs) or by quitting.
> > After that, process B calls this API and sets itself to the "active" role.
> > Now the hot upgrade is finished.
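
The sequence above can be sketched as a toy model in C. This is only an illustration under stated assumptions: every name (demo_role, demo_proc, demo_traffic_owner, demo_hot_upgrade) is hypothetical and is not the API proposed in this RFC, and the arbitration rule (an active rule wins, a standby rule takes effect only when no active rule exists) is one reading of the commit message.

```c
#include <stdbool.h>

/* Hypothetical roles, mirroring the "active"/"standby" wording of the RFC. */
enum demo_role { DEMO_ROLE_ACTIVE, DEMO_ROLE_STANDBY };

struct demo_proc {
	enum demo_role role;
	bool has_group0_rule;	/* does this process own a group 0 rule? */
};

/*
 * Return which process receives traffic: 0 for A, 1 for B, -1 for none.
 * Assumed arbitration: a rule from an active process wins; a rule from a
 * standby process is effective only when no active rule is present.
 */
int
demo_traffic_owner(const struct demo_proc p[2])
{
	for (int i = 0; i < 2; i++)
		if (p[i].has_group0_rule && p[i].role == DEMO_ROLE_ACTIVE)
			return i;
	for (int i = 0; i < 2; i++)
		if (p[i].has_group0_rule)
			return i;
	return -1;
}

/* Walk through the hot-upgrade steps and record the traffic owner after each. */
void
demo_hot_upgrade(int owner[4])
{
	struct demo_proc p[2] = {
		{ DEMO_ROLE_ACTIVE, true },	/* A: long-running, rules installed */
		{ DEMO_ROLE_STANDBY, false },	/* B: just started, set to standby */
	};

	owner[0] = demo_traffic_owner(p);	/* B has programmed nothing yet */
	p[1].has_group0_rule = true;		/* B programs its group 0 rules */
	owner[1] = demo_traffic_owner(p);
	p[0].has_group0_rule = false;		/* A deletes its group 0 rules or quits */
	owner[2] = demo_traffic_owner(p);
	p[1].role = DEMO_ROLE_ACTIVE;		/* B promotes itself to active */
	owner[3] = demo_traffic_owner(p);
}
```

In this model the owner goes from A to B at the moment A removes its rules, and no step leaves the traffic without an owner, which is the downtime property the hot upgrade is after.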
> >
> > > > > > such as hot upgrade.
> > > > > > There is a strong requirement to simplify the logic and
> > > > > > shorten the traffic downtime as much as possible.
> > > > > >
> > > > > > This update introduces new rte_eth process role definitions:
> > > > > > active or standby.
> > > > > >
> > > > > > The active role means rules are programmed to HW immediately,
> > > > > > and no
> > > > >
> > > > > Why does it have to be specific to rte_flow rules only? If it is
> > > > > specific to rte_flow, why is it in the rte_eth_process_ namespace?
> > > > For now, this design focuses on flow rule offloading and traffic
> > > > redirection.
> > > > When switching process versions, it's important to make sure which
> > > > application receives and handles the traffic.
> > >
> > > Changing the DPDK version at runtime goes well beyond the rte_flow driver.
> >
> > It's not about changing the DPDK version but about upgrading DPDK from
> > one PMD version to another.
> > Does the preceding example answer your question?
> > >
> > > > The change should take effect across all probed eth devices; that's
> > > > why it was put under the rte_eth_process_ namespace (for all rte_eth_dev).
> > > > >
> > > > > Also, if we are moving to standby, what about rules whose
> > > > > ABI changed between versions?
> > > >
> > > > As the comments mention: "Before role transition, all the rules set
> > > > by the active process should be flushed first."
> > >
> > > What happens to the existing rte_flow flow handles that were
> > > created with version X?
> > > Also, what if the new version Y has an ABI change in the
> > > rte_flow_pattern and rte_flow_action structures?
> > >
> > > For me, if a DPDK version change is needed, simply reload the
> > > application. This API will soon bloat, and it will be a mess if we
> > > start handling different DPDK versions that are not ABI compatible at all.
> > >
> > Yes, you are right. Reloading the application is the easiest way, but
> > it may leave a long time window in which traffic is lost: no traffic
> > arrives at process A or process B.
> > We are trying to simplify the reloading logic and minimize the traffic
> > downtime as much as possible.
> > The approach may differ hugely between NIC vendors,
> > so I think it would be better if DPDK provided an abstract API.
> >
> > If process A and process B have different ABIs, it doesn't matter.
> > 1. Calling this API from process A uses the older ABI.
> > 2. Calling this API from process B uses the newer ABI.
> > The API has a per-process concept and working scope.
> >
> > >
> > >
> > >
> > > > > > behavior changed. This is the default state.
> > > > > > The standby role means rules are queued in the HW. If no
> > > > > > active role is alive, or when the process goes back to
> > > > > > active, the rules take effect immediately.
> > > > > >
> > > > > > Signed-off-by: Rongwei Liu <rongw...@nvidia.com>
