> -----Original Message----- > From: Burakov, Anatoly > Sent: Monday, June 18, 2018 4:18 PM > To: Zhang, Qi Z <qi.z.zh...@intel.com>; tho...@monjalon.net > Cc: Ananyev, Konstantin <konstantin.anan...@intel.com>; dev@dpdk.org; > Richardson, Bruce <bruce.richard...@intel.com>; Yigit, Ferruh > <ferruh.yi...@intel.com>; Shelton, Benjamin H > <benjamin.h.shel...@intel.com>; Vangati, Narender > <narender.vang...@intel.com> > Subject: Re: [PATCH 04/22] ethdev: enable hotplug on multi-process > > On 07-Jun-18 1:38 PM, Qi Zhang wrote: > > The patch introduce the solution to handle different hotplug cases in > > multi-process situation, it include below scenario: > > > > 1. Attach a share device from primary > > 2. Detach a share device from primary > > 3. Attach a share device from secondary 4. Detach a share device from > > secondary 5. Attach a private device from secondary 6. Detach a > > private device from secondary 7. Detach a share device from secondary > > privately 8. Attach a share device from secondary privately > > > > In primary-secondary process model, we assume device is shared by > default. > > that means attach or detach a device on any process will broadcast to > > all other processes through mp channel then device information will be > > synchronized on all processes. > > > > Any failure during attaching process will cause inconsistent status > > between processes, so proper rollback action should be considered. > > Also it is not safe to detach a share device when other process still > > use it, so a handshake mechanism is introduced, it will be implemented > > in following separate patch. > > > > Scenario for Case 1, 2: > > > > attach device > > a) primary attach the new device if failed goto h). > > b) primary send attach sync request to all secondary. > > c) secondary receive request and attach device and send reply. > > d) primary check the reply if all success go to i). > > e) primary send attach rollback sync request to all secondary. > > f) secondary receive the request and detach device and send reply. > > g) primary receive the reply and detach device as rollback action. > > h) attach fail > > i) attach success > > > > detach device > > a) primary perform pre-detach check, if device is locked, goto i). > > b) primary send pre-detach sync request to all secondary. > > c) secondary perform pre-detach check and send reply. > > d) primary check the reply if any fail goto i). > > e) primary send detach sync request to all secondary > > f) secondary detach the device and send reply (assume no fail) > > g) primary detach the device. > > h) detach success > > i) detach failed > > > > Case 3, 4: > > This will be implemented in following patch. > > If these will be implemented in following patch, why spend half the commit > message talking about it? :)
Sorry, I didn't get your point about "see half commit to talk about it" :) This patch covered an overview, and also the implementation of case 1,2,5,6,7,8 For case 3, 4, just below 4 lines to describe it 3. Attach a share device from secondary. 4. Detach a share device from secondary. Case 3, 4: This will be implemented in following patch. > is commit doesn't implement secondary > process functionality at all, so the commit message should probably be > reworded to only include primary process logic, no? OK, I will reword it to highlight the patch's scope as description at above. > > > > > Case 5, 6: > > Secondary process can attach private device which only visible to itself, > > in this case no IPC is involved, primary process is not allowed to have > > private device so far. > > > > Case 7, 8: > > Secondary process can also temporally to detach a share device "privately" > > then attach it back later, this action also not impact other processes. > > > > APIs chenages: > > Multiple typos - "chenages", "temporally", "allowd", etc. Thanks > > > > > rte_eth_dev_attach and rte_eth_dev_attach are extended to support > > share device attach/detach in primary-secondary process model, it will > > be called in case 1,2,3,4. > > > > New API rte_eth_dev_attach_private and rte_eth_dev_detach_private are > > introduced to cover case 5,6,7,8, this API can only be invoked in secondary > > process. > > > Signed-off-by: Qi Zhang <qi.z.zh...@intel.com> > > --- > > <snip> > > > rte_eal_mcfg_complete(); > > > > + if (rte_eth_dev_mp_init()) { > > + rte_eal_init_alert("rte_eth_dev_mp_init() failed\n"); > > + rte_errno = ENOEXEC; > > + return -1; > > + } > > + > > Why is this done after the end of init? rte_eal_mcfg_complete() makes it > so that secondaries can initialize, at that point all initialization > should have been finished. I would expect this to be called after > (before?) bus probe, since this is device-related. OK will move ahead. > > > return fctret; > > } > > > > diff --git a/lib/librte_ethdev/Makefile b/lib/librte_ethdev/Makefile > > index c2f2f7d82..04e93f337 100644 > > --- a/lib/librte_ethdev/Makefile > > +++ b/lib/librte_ethdev/Makefile > > @@ -19,6 +19,7 @@ EXPORT_MAP := rte_ethdev_version.map > > LIBABIVER := 9 > > > > <snip> > > > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { > > + > > + /** > > + * If secondary process, we just send request to primray > > + * to start the process. > > + */ > > + req.t = REQ_TYPE_ATTACH; > > + strlcpy(req.devargs, devargs, MAX_DEV_ARGS_LEN); > > + > > + ret = rte_eth_dev_request_to_primary(&req); > > + if (ret) { > > + ethdev_log(ERR, "Failed to send device attach request to > primary\n"); > > The log message is a little misleading. It can be that secondary has > failed to send request. It can also be that it succeeded, but the attach > itself has failed. I think a better message would be "attach request has > failed" or something to that effect. The return value of rte_eth_dev_request_to_primary only means communication fail, (message not able to send, or not get reply in time). but not the fail on attach/detach itself. (which comes from req->result) > > > + return ret; > > + } > > + > > + *port_id = req.port_id; > > + return req.result; > > + } > > + > > + ret = do_eth_dev_attach(devargs, port_id); > > + if (ret) > > + return ret; > > + > > + /* send attach request to seoncary */ > > + req.t = REQ_TYPE_ATTACH; > > + strlcpy(req.devargs, devargs, MAX_DEV_ARGS_LEN); > > + req.port_id = *port_id; > > + ret = rte_eth_dev_request_to_secondary(&req); > > + if (ret) { > > + ethdev_log(ERR, "Failed to send device attach request to > secondary\n"); > > Same as above - log message can/might be misleading. There are a few > other places where similar log message is present, those should be > corrected too. Same as above > > > + goto rollback; > > + } > > + > > + if (req.result) > > + goto rollback; > > + > > + return 0; > > <snip> > > > +{ > > + uint32_t dev_flags; > > + > > + if (rte_eal_process_type() == RTE_PROC_PRIMARY) > > + return -ENOTSUP; > > + > > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL); > > + > > + dev_flags = rte_eth_devices[port_id].data->dev_flags; > > + if (dev_flags & RTE_ETH_DEV_BONDED_SLAVE) { > > + ethdev_log(ERR, > > + "Port %" PRIu16 " is bonded, cannot detach", port_id); > > + return -ENOTSUP; > > + } > > Do we have to do a similar check for failsafe devices? Just keep it same logic as before, it could be a separate patch to fix I guess. > > > + > > + return do_eth_dev_detach(port_id); > > +} > > + > > static int > > rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t > nb_queues) > > { > > diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h > > index 36e3984ea..bb03d613b 100644 > > --- a/lib/librte_ethdev/rte_ethdev.h > > +++ b/lib/librte_ethdev/rte_ethdev.h > > <snip> > > > /** > > + * Attach a private Ethernet device specified by arguments. > > + * A private device is invisible to other process. > > + * Can only be invoked in secondary process. > > + * > > + * @param devargs > > + * A pointer to a strings array describing the new device > > + * to be attached. The strings should be a pci address like > > + * '0000:01:00.0' or virtual device name like 'net_pcap0'. > > + * @param port_id > > + * A pointer to a port identifier actually attached. > > + * @return > > + * 0 on success and port_id is filled, negative on error > > + */ > > +int rte_eth_dev_attach_private(const char *devargs, uint16_t *port_id); > > New API's should be marked as __rte_experimental. OK > > > + > > +/** > > * Detach a Ethernet device specified by port identifier. > > * This function must be called when the device is in the > > * closed state. > > + * In multi-process mode, it will sync with other process > > + * to detach the device. > > * > > * @param port_id > > * The port identifier of the device to detach. > > @@ -1490,6 +1511,22 @@ int rte_eth_dev_attach(const char *devargs, > uint16_t *port_id); > > <snip> > > > + * Detach a Ethernet device in current process. > > + * > > + * @param port_id > > + * The port identifier of the device to detach. > > + * @param devname > > + * A pointer to a buffer that will be filled with the device name. > > + * This buffer must be at least RTE_DEV_NAME_MAX_LEN long. > > + * @return > > + * 0 on success and devname is filled, negative on error > > + */ > > +int do_eth_dev_detach(uint16_t port_id); > > + > > Why is this made part of an external API? You should have a separate, > private header file for these. OK, will add to ethdev_private.h in v2. > > > #ifdef __cplusplus > > } > > #endif > > diff --git a/lib/librte_ethdev/rte_ethdev_mp.c > b/lib/librte_ethdev/rte_ethdev_mp.c > > new file mode 100644 > > index 000000000..8ede8151d > > --- /dev/null > > +++ b/lib/librte_ethdev/rte_ethdev_mp.c > > @@ -0,0 +1,195 @@ > > +/* SPDX-License-Identifier: BSD-3-Clause > > + * Copyright(c) 2010-2018 Intel Corporation > > + */ > > + > > +#include "rte_ethdev_driver.h" > > +#include "rte_ethdev_mp.h" > > + > > +static int detach_on_secondary(uint16_t port_id) > > <snip> > > > + free(da.args); > > + return 0; > > +} > > + > > +static int handle_secondary_request(const struct rte_mp_msg *msg, const > void *peer) > > +{ > > + (void)msg; > > + (void)(peer); > > + return -ENOTSUP; > > Please either mark arguments as __rte_unused, or use RTE_SET_USED(blah) > macro. Same in other similar places. OK. > > > +} > > + > > +static int handle_primary_response(const struct rte_mp_msg *msg, const > void *peer) > > +{ > > + (void)msg; > > + (void)(peer); > > + return -ENOTSUP; > > +} > > + > > +static int handle_primary_request(const struct rte_mp_msg *msg, const > void *peer) > > +{ > > + const struct eth_dev_mp_req *req = > > + (const struct eth_dev_mp_req *)msg->param; > > <snip> > > > + case REQ_TYPE_DETACH: > > + case REQ_TYPE_ATTACH_ROLLBACK: > > + ret = detach_on_secondary(req->port_id); > > + break; > > + default: > > + ret = -EINVAL; > > + } > > + > > + strcpy(mp_resp.name, ETH_DEV_MP_ACTION_REQUEST); > > Here and in other places: rte_strlcpy? OK Thanks! Qi > > -- > Thanks, > Anatoly