On Tue, 6 Dec 2022 03:47:42 +0000 Rongwei Liu <rongw...@nvidia.com> wrote:
> Hi
>
> BR
> Rongwei
>
> > -----Original Message-----
> > From: Stephen Hemminger <step...@networkplumber.org>
> > Sent: Tuesday, December 6, 2022 00:08
> > To: Rongwei Liu <rongw...@nvidia.com>
> > Cc: Matan Azrad <ma...@nvidia.com>; Slava Ovsiienko
> > <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>;
> > NBU-Contact-Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>;
> > Ferruh Yigit <ferruh.yi...@amd.com>; Andrew Rybchenko
> > <andrew.rybche...@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
> > <rasl...@nvidia.com>
> > Subject: Re: [RFC 2/2] ethdev: add API to set process to primary or
> > secondary
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On Fri, 2 Dec 2022 03:27:38 +0000
> > Rongwei Liu <rongw...@nvidia.com> wrote:
> >
> > > >
> > > > The state of the devices and the system is really unstable if
> > > > this fails. There is no rollback here.
> > > >
> > > Assume application is calling rte_eth_process_set_primary(false);
> > > Once failed, call all preceding successful ports as
> > > rte_eth_process_set_primary(true);
> > > What do you think?
> > > > I think this should have a PMD capability flag so that
> > > > application can check that device supports doing this. And it
> > > > would have to be opt-in so that existing devices would always
> > > > fail.
> > > If device doesn't support it, it can set the ethdev callback to
> > > NULL or return failure for all devices.
> > > Then the devices' state will be consistent.
> > >
> > > Assume there are two DPDK ports.
> > If the application tries to change roles and one of the devices does
> > not support the change over, then that error is fatal. The first
> > device has changed state already, and the second doesn't allow it.
> >
> If my understanding is correct, you are saying one application to probe
> two PMD vendors. This is difficult to handle.
> > This needs to be a capability flag for the device, and would need an
> > additional flag in the device documentation as well.
> >
> For multiple vendors simultaneously probing. Capability flag is a must.
> But no need for single vendor, right?
> > I bet many devices do regular malloc or mmap in the primary process
> > and that is not going to work with this change.
> Sorry, looks like I misled you. The words "Primary/Secondary" in this
> update have no relationship with current PRIMARY/SECONDARY definition.
> Doesn't focus on the resource ownership.
> In this update: Primary application's offload rules are effective
> immediately. Secondary's rules are in queue which will be effective if
> primary application exits or primary application doesn't insert any
> rule.
> Maybe we can call it as "active/standby" or "main/standby" or
> "active/backup"? Do you have naming suggestion?

DPDK supports any combination of PCI and virtual devices. Any patch that
restricts that is a bad idea.

There already is a capability mechanism in the ethdev API. A well-written
application would look at those flags for all ethdevs before attempting
the transition. Layered PMDs like bonding, failsafe, and netvsc would also
need to handle the nesting.

The problem is hard. What you did so far is a start, but there are many
more issues that need to be considered.
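
For illustration only, here is a minimal sketch of the "check every port
first, then transition, and undo on failure" pattern discussed in this
thread. It does not call DPDK. The struct, the capability field, and both
functions (demo_port, can_switch_role, demo_port_set_role,
demo_set_role_all) are hypothetical stand-ins, not the RFC's API and not
existing ethdev flags. It also assumes that undoing a switch on a port
that just switched successfully cannot fail, which a real driver may not
guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an ethdev port; not a DPDK structure. */
struct demo_port {
    bool can_switch_role; /* hypothetical capability flag */
    bool fail_on_switch;  /* simulates a driver error during the switch */
    bool is_active;       /* current role: true = active, false = standby */
    bool prev_active;     /* role saved before a switch, used for rollback */
};

/* Hypothetical per-port role switch: 0 on success, -1 on failure. */
static int demo_port_set_role(struct demo_port *p, bool active)
{
    if (!p->can_switch_role || p->fail_on_switch)
        return -1;
    p->is_active = active;
    return 0;
}

/*
 * Switch all ports to the requested role, or leave every port as it was.
 * Returns  0 if all ports switched,
 *         -1 if some port lacks the capability (nothing was changed),
 *         -2 if a switch failed midway (earlier ports were rolled back).
 */
static int demo_set_role_all(struct demo_port *ports, size_t n, bool active)
{
    assert(ports != NULL || n == 0);

    /* Pass 1: inspect the capability of every port before touching any. */
    for (size_t i = 0; i < n; i++) {
        if (!ports[i].can_switch_role)
            return -1;
    }

    /* Pass 2: switch one port at a time, saving the previous role. */
    for (size_t i = 0; i < n; i++) {
        ports[i].prev_active = ports[i].is_active;
        if (demo_port_set_role(&ports[i], active) != 0) {
            /* Restore the ports that had already switched. */
            while (i-- > 0)
                (void)demo_port_set_role(&ports[i], ports[i].prev_active);
            return -2;
        }
    }
    return 0;
}
```

The up-front capability pass covers the mixed-device case raised above
(one device supports the change and another does not). The rollback loop
only helps with runtime failures, and layered PMDs such as bonding would
still need their own handling for nested ports.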