On Tue, Jan 31, 2023 at 2:31 PM Rongwei Liu <rongw...@nvidia.com> wrote:
>
> Hi Jerin:
>
> BR
> Rongwei
>
> > -----Original Message-----
> > From: Jerin Jacob <jerinjac...@gmail.com>
> > Sent: Tuesday, January 31, 2023 16:46
> > To: Rongwei Liu <rongw...@nvidia.com>
> > Cc: dev@dpdk.org; Matan Azrad <ma...@nvidia.com>; Slava Ovsiienko
> > <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; NBU-Contact-
> > Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>;
> > step...@networkplumber.org; Raslan Darawsheh <rasl...@nvidia.com>;
> > Ferruh Yigit <ferruh.yi...@amd.com>; Andrew Rybchenko
> > <andrew.rybche...@oktetlabs.ru>
> > Subject: Re: [PATCH v4 3/3] ethdev: add standby flags for live migration
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On Tue, Jan 31, 2023 at 8:23 AM Rongwei Liu <rongw...@nvidia.com> wrote:
> > >
> > > Hi Jerin:
> > >
> > > BR
> > > Rongwei
> > >
> > > > -----Original Message-----
> > > > From: Jerin Jacob <jerinjac...@gmail.com>
> > > > Sent: Tuesday, January 31, 2023 01:10
> > > >
> > > > On Mon, Jan 30, 2023 at 8:17 AM Rongwei Liu <rongw...@nvidia.com> wrote:
> > > > >
> > > > > Hi Jerin
> > > > >
> > > > > BR
> > > > > Rongwei
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Jerin Jacob <jerinjac...@gmail.com>
> > > > > > Sent: Monday, January 23, 2023 21:20
> > > > > >
> > > > > > On Wed, Jan 18, 2023 at 9:15 PM Rongwei Liu <rongw...@nvidia.com> wrote:
> > > > > > >
> > > > > > > Some flags are added to the process state API for live
> > > > > > > migration in order to change the behavior of the flow rules
> > > > > > > in a standby process.
> > > > > > >
> > > > > > > Signed-off-by: Rongwei Liu <rongw...@nvidia.com>
> > > > > > > ---
> > > > > > >  lib/ethdev/rte_ethdev.h | 21 +++++++++++++++++++++
> > > > > > >  1 file changed, 21 insertions(+)
> > > > > > >
> > > > > > > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > > > > > > index 1505396ced..9ae4f426a7 100644
> > > > > > > --- a/lib/ethdev/rte_ethdev.h
> > > > > > > +++ b/lib/ethdev/rte_ethdev.h
> > > > > > > @@ -2260,6 +2260,27 @@ int rte_eth_dev_owner_get(const uint16_t port_id,
> > > > > > >  __rte_experimental
> > > > > > >  int rte_eth_process_set_role(bool standby, uint32_t flags);
> > > > > > >
> > > > > > > +/**@{@name Process role flags
> > > > > > > + * used when migrating from an application to another one.
> > > > > > > + * @see rte_eth_process_set_active */
> > > > > > > +/**
> > > > > > > + * When set on a standby process, ingress flow rules will be effective
> > > > > > > + * in active and standby processes, so the ingress traffic may be duplicated.
> > > > > > > + */
> > > > > > > +#define RTE_ETH_PROCESS_FLAG_STANDBY_DUP_FLOW_INGRESS RTE_BIT32(0)
> > > > > >
> > > > > >
> > > > > > How to duplicate if an action has stateful items? For example:
> > > > > > rte_flow_action_security::security_session -> it stores a live pointer
> > > > > > rte_flow_action_meter::mtr_id -> MTR object ID created with
> > > > > > rte_mtr_create()
> > > > > I agree with you, not all actions can be supported in the
> > > > > active/standby model.
> > > >
> > > > IMO, wherever rules are not standalone (as QUEUE, RSS etc. are), it
> > > > is architecturally not possible to migrate with pointers.
> > > > That's where I have a concern about generalizing this feature for ethdev.
> > > >
> > > Not sure I understand your concern correctly. What's the pointer
> > > concept here?
> >
> > I meant that any HW resource the driver handles through "pointers" or
> > a "fixed ID" cannot get the same value in the new application. That's
> > why I believe this whole concept works only for very standalone
> > rte_flow patterns and actions.
> >
> >
> > > Queue/RSS actions can be migrated per my local test. The active and
> > > standby applications each have their own rxq/txq.
> >
> > Yes. It is because it is standalone.
> >
> > > They are totally separate processes, like two members in a pipeline.
> > > The 2nd member can't be fed while the 1st member is alive and handling
> > > the traffic.
> > >
> > > > Also, I don't believe there is any real HW support needed for this.
> > > > IMO, standard DPDK multiprocess can do this: the secondary
> > > > application can migrate, while all the SW logic stays in the
> > > > primary process and the application does the housekeeping. On the
> > > > plus side, it works with pointers too.
> >
> > > IMO, in the multi-process model, the primary process usually owns the
> > > hardware resources via mmap/iomap/pci_map etc.
> > > The secondary process is not able to run if the primary quits, whether
> > > gracefully or by crashing.
> > > This patch wants to introduce a "backup to alive" model.
> > > Assume the user wants to upgrade from DPDK version 22.03 to 23.03:
> > > 22.03 is running in the active role while 23.03 comes up in standby.
> > > Both DPDK processes have their own resources and don't rely on each other.
> > > The user can migrate the application following the steps in the commit
> > > message with minimum traffic downtime.
> > > SW logic like flow rules can be handled following the
> > > iptables-save/iptables-restore approach.
> > > >
> > > > I am not sure how much housekeeping is offloaded to _HW_ in your
> > > > case. In my view, it should be generic utility functions that track
> > > > the flows and install the rules using rte_flow APIs, keeping the
> > > > scope to rte_flow only.
> > > For the rules part, I totally agree with you. The issue is that there
> > > may be millions of flow rules in the field, and each rule may take
> > > different steps to re-install depending on the vendor's implementation.
> >
> > I understand the desire for migrating millions of flows, which makes
> > sense. IMO, it may be easier to make this feature just for the rte_flow
> > namespace. Just have APIs to export() the existing rules for the given
> > port and import() the exported rules, rather than going to the ethdev
> > space and calling it "live migration".
> >
> Do you mean the API naming should be "rte_flow_process_set_role()" instead of
> "rte_eth_process_set_role()"?
> Also move it to the rte_flow.c/.h files? Are we good to keep the PMD callback
> in the eth_dev layer?

Yes, something with the rte_flow_ prefix, though I am not sure about a
_set_role() kind of scheme.

> Simple export()/import() may not work. Imagine some flow rules are exclusive
> and can't be issued from both applications.
> We need to stop the old application. I am afraid this will introduce a big
> time window in which traffic stops.

Yes, I think the sequence is:
  rte_flow_rules_export() on app 1
  stop app 1
  rte_flow_rules_import() of app 1's rules by app 2.

> The application won't like this behavior.
> With this callback, each PMD can handle each rule as it sees fit: queue it,
> or use a lower priority if it is exclusive, or return an error.
>
> > > This series wants to propose a unified interface that is easy for the
> > > upper-layer application to use.
> > > >
> > > > That's just my view. I leave the rest of the review and the decision
> > > > on this series to the ethdev maintainers.
> > > >
> > > > > That's why we have return value checking and rollback.
> > > > > In the Nvidia driver doc, we suggest users start from
> > > > > 'rss/queue/jump' actions.
> > > > > Meter is possible, at least in my view.
> > > > > Assume: "meter g_action queue 0 / y_action drop / r_action drop"
> > > > > Old application: create meter_id 'A' with a pre-defined limit.
> > > > > New application: create meter_id 'B' with the same parameters as 'A'.
> > > > > 1. 1st possible approach:
> > > > >    Hardware duplicates the traffic; the old application uses meter
> > > > >    'A' and the new application uses meter 'B' to control traffic
> > > > >    throughput.
> > > > >    Since the traffic is duplicated, it can go to different meters.
> > > > > 2. 2nd possible approach:
> > > > >    Meter 'A' and 'B' point to the same hardware resource; traffic
> > > > >    reaches this part first and, if green, duplication happens.
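
For readers following the export / stop / import sequence discussed in this reply, here is a minimal sketch in plain C. It does not use DPDK at all: `struct flow_rule`, `struct app`, `flow_rules_export()`, `flow_rules_import()`, `app_stop()` and `migrate()` are hypothetical names invented for illustration (the thread only floats `rte_flow_rules_export()`/`rte_flow_rules_import()` as ideas, not as existing APIs). The rules are deliberately limited to plain values, which is the "standalone" case; rules that carry live pointers or driver-assigned IDs would not survive this kind of copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_RULES 8

/* Hypothetical, simplified flow rule: plain values only, no pointers. */
struct flow_rule {
    uint32_t priority;
    uint16_t dst_port; /* match field */
    uint16_t queue;    /* target of a QUEUE-style action */
};

/* Hypothetical stand-in for one application and the rules on its port. */
struct app {
    int running;
    size_t nb_rules;
    struct flow_rule rules[MAX_RULES];
};

/* Copy the application's rules into buf. Returns the rule count, or -1
 * if buf is too small. */
static int flow_rules_export(const struct app *a, struct flow_rule *buf,
                             size_t cap)
{
    assert(a != NULL && buf != NULL);
    if (cap < a->nb_rules)
        return -1;
    memcpy(buf, a->rules, a->nb_rules * sizeof(buf[0]));
    return (int)a->nb_rules;
}

/* Stopping the application releases all of its rules. */
static void app_stop(struct app *a)
{
    a->running = 0;
    a->nb_rules = 0;
}

/* Install n exported rules. All-or-nothing: nothing is installed if the
 * table cannot hold every rule. */
static int flow_rules_import(struct app *a, const struct flow_rule *buf,
                             size_t n)
{
    assert(a != NULL && buf != NULL);
    if (n > MAX_RULES - a->nb_rules)
        return -1;
    memcpy(&a->rules[a->nb_rules], buf, n * sizeof(buf[0]));
    a->nb_rules += n;
    return 0;
}

/* The sequence from the reply: export on app 1, stop app 1, import by
 * app 2. */
static int migrate(struct app *app1, struct app *app2)
{
    struct flow_rule buf[MAX_RULES];
    int n = flow_rules_export(app1, buf, MAX_RULES);

    if (n < 0)
        return -1;
    app_stop(app1); /* from here until import returns, no rule is active */
    return flow_rules_import(app2, buf, (size_t)n);
}
```

In this model, no rule is installed anywhere between `app_stop()` and the return of `flow_rules_import()`. That gap is the traffic-stop window raised in the quoted objection, and with millions of rules it would grow with the import time.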