> -----Original Message----- > From: Burakov, Anatoly > Sent: Thursday, June 21, 2018 4:37 PM > To: Zhang, Qi Z <qi.z.zh...@intel.com>; tho...@monjalon.net > Cc: Ananyev, Konstantin <konstantin.anan...@intel.com>; dev@dpdk.org; > Richardson, Bruce <bruce.richard...@intel.com>; Yigit, Ferruh > <ferruh.yi...@intel.com>; Shelton, Benjamin H > <benjamin.h.shel...@intel.com>; Vangati, Narender > <narender.vang...@intel.com> > Subject: Re: [PATCH v2 04/22] ethdev: enable hotplug on multi-process > > On 21-Jun-18 3:00 AM, Qi Zhang wrote: > > We are going to introduce the solution to handle different hotplug > > cases in multi-process situation, it include below scenario: > > > > 1. Attach a share device from primary > > 2. Detach a share device from primary > > 3. Attach a share device from secondary 4. Detach a share device from > > secondary 5. Attach a private device from secondary 6. Detach a > > private device from secondary 7. Detach a share device from secondary > > privately 8. Attach a share device from secondary privately > > > > In primary-secondary process model, we assume device is shared by default. > > that means attach or detach a device on any process will broadcast to > > all other processes through mp channel then device information will be > > synchronized on all processes. > > > > Any failure during attaching process will cause inconsistent status > > between processes, so proper rollback action should be considered. > > Also it is not safe to detach a share device when other process still > > use it, so a handshake mechanism is introduced. > > > > This patch covers the implementation of case 1,2,5,6,7,8. > > Case 3,4 will be implemented on separate patch as well as handshake > > mechanism. > > > > Scenario for Case 1, 2: > > > > attach device > > a) primary attach the new device if failed goto h). > > b) primary send attach sync request to all secondary. > > c) secondary receive request and attach device and send reply. > > d) primary check the reply if all success go to i). > > e) primary send attach rollback sync request to all secondary. > > f) secondary receive the request and detach device and send reply. > > g) primary receive the reply and detach device as rollback action. > > h) attach fail > > i) attach success > > > > detach device > > a) primary perform pre-detach check, if device is locked, goto i). > > b) primary send pre-detach sync request to all secondary. > > c) secondary perform pre-detach check and send reply. > > d) primary check the reply if any fail goto i). > > e) primary send detach sync request to all secondary > > f) secondary detach the device and send reply (assume no fail) > > g) primary detach the device. > > h) detach success > > i) detach failed > > > > Case 5, 6: > > Secondary process can attach private device which only visible to > > itself, in this case no IPC is involved, primary process is not > > allowed to have private device so far. > > > > Case 7, 8: > > Secondary process can also temporally to detach a share device "privately" > > then attach it back later, this action also not impact other processes. > > > > APIs changes: > > > > rte_eth_dev_attach and rte_eth_dev_attach are extended to support > > share device attach/detach in primary-secondary process model, it will > > be called in case 1,2,3,4. > > > > New API rte_eth_dev_attach_private and rte_eth_dev_detach_private are > > introduced to cover case 5,6,7,8, this API can only be invoked in > > secondary process. > > > > Signed-off-by: Qi Zhang <qi.z.zh...@intel.com> > > --- > > <snip> > > > + memset(&da, 0, sizeof(da)); > > + > > + if (rte_devargs_parse(&da, "%s", devargs)) { > > + ethdev_log(ERR, "failed to parse devargs %s\n", devargs); > > + return -EINVAL; > > + } > > + > > + ret = rte_eal_hotplug_add(da.bus->name, da.name, ""); > > + if (ret) { > > + ethdev_log(ERR, "failed to hotplug bus:%s, device:%s\n", > > + da.bus->name, da.name); > > + free(da.args); > > + return ret; > > + } > > + > > + if (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED) { > > + ethdev_log(ERR, "failed to attach to port %d, this is a pmd > > issue\n", > > + port_id); > > + return -ENODEV; > > ^^^ Leaking da.args here? > > > + } > > + free(da.args); > > + return 0; > > +} > > + > > +static int handle_secondary_request(const struct rte_mp_msg *msg, > > +const void *peer) { > > + RTE_SET_USED(msg); > > + RTE_SET_USED(peer); > > + return -ENOTSUP; > > +} > > + > > +static int handle_primary_response(const struct rte_mp_msg *msg, > > +const void *peer) { > > <snip> > > > + ret = rte_mp_request_sync(&mp_req, &mp_reply, &ts); > > + if (ret) { > > + ethdev_log(ERR, "rte_mp_request_sync failed\n"); > > + return ret; > > + } > > + > > + req->result = 0; > > + for (i = 0; i < mp_reply.nb_received; i++) { > > + struct eth_dev_mp_req *resp = > > + (struct eth_dev_mp_req *)mp_reply.msgs[i].param; > > + if (resp->result) { > > + req->result = resp->result; > > + break; > > + } > > + } > > Do we care if nb_sent != nb_received? > > > + > > + return 0; > > +} > > + > > +int rte_eth_dev_mp_init(void) > > +{ > > + > > + if (rte_eal_process_type() == RTE_PROC_PRIMARY) { > > + if (rte_mp_action_register(ETH_DEV_MP_ACTION_REQUEST, > > <snip> > > > +/** > > + * this is a synchronous wrapper for secondary process send > > + * request to primary process, this is invoked when an attach > > + * or detach request issued from primary. > > + */ > > +int rte_eth_dev_request_to_primary(struct eth_dev_mp_req *req); > > + > > +/** > > + * this is a synchronous wrapper for primary process send > > + * request to secondary process, this is invoked when an attach > > + * or detach request issued from secondary process. > > + */ > > +int rte_eth_dev_request_to_secondary(struct eth_dev_mp_req *req); > > Nitpicking, but the two above functions aren't used outside ethdev library. > You can probably drop the rte_ prefix. > > > + > > +/* Register mp channel callback functions of ethdev layer.*/ int > > +rte_eth_dev_mp_init(void); > > I don't quite understand what you're doing here. (Or rather, i understand the > intention, but i don't understand the implementation :) ) > > This function is meant to be called from EAL at startup. First of all, why is > it > declared twice (once in eal_private, once in ethdev_private)?
Ah, I forgot this, this is something in mess. > > Second of all, ethdev is a library, but this function is called from EAL. > Which > means it cannot be in a private header (nor should it be declared in EAL), and > you cannot even call it from EAL because that would introduce a circular > dependency between EAL and ethdev. > > So, this needs to be redone the other way around - have ethdev register itself > with EAL, and get called at some point, in a generic way (e.g. > see how bus probe works for example). I don't know what this would look like > - maybe some kind of generic multiprocess init? Yes, properly like this. I will re-work it. Thanks Qi > > -- > Thanks, > Anatoly