On 07-Jun-18 1:38 PM, Qi Zhang wrote:
The patch introduce the solution to handle different hotplug cases in
multi-process situation, it include below scenario:

1. Attach a share device from primary
2. Detach a share device from primary
3. Attach a share device from secondary
4. Detach a share device from secondary
5. Attach a private device from secondary
6. Detach a private device from secondary
7. Detach a share device from secondary privately
8. Attach a share device from secondary privately

In primary-secondary process model, we assume device is shared by default.
that means attach or detach a device on any process will broadcast to
all other processes through mp channel then device information will be
synchronized on all processes.

Any failure during attaching process will cause inconsistent status
between processes, so proper rollback action should be considered.
Also it is not safe to detach a share device when other process still use
it, so a handshake mechanism is introduced, it will be implemented in
following separate patch.

Scenario for Case 1, 2:

attach device
a) primary attach the new device if failed goto h).
b) primary send attach sync request to all secondary.
c) secondary receive request and attach device and send reply.
d) primary check the reply if all success go to i).
e) primary send attach rollback sync request to all secondary.
f) secondary receive the request and detach device and send reply.
g) primary receive the reply and detach device as rollback action.
h) attach fail
i) attach success

detach device
a) primary perform pre-detach check, if device is locked, goto i).
b) primary send pre-detach sync request to all secondary.
c) secondary perform pre-detach check and send reply.
d) primary check the reply if any fail goto i).
e) primary send detach sync request to all secondary
f) secondary detach the device and send reply (assume no fail)
g) primary detach the device.
h) detach success
i) detach failed

Case 3, 4:
This will be implemented in following patch.

If these will be implemented in following patch, why spend half the commit message talking about it? :) This commit doesn't implement secondary process functionality at all, so the commit message should probably be reworded to only include primary process logic, no?


Case 5, 6:
Secondary process can attach private device which only visible to itself,
in this case no IPC is involved, primary process is not allowd to have
private device so far.

Case 7, 8:
Secondary process can also temporally to detach a share device "privately"
then attach it back later, this action also not impact other processes.

APIs chenages:

Multiple typos - "chenages", "temporally", "allowd", etc.


rte_eth_dev_attach and rte_eth_dev_attach are extended to support
share device attach/detach in primary-secondary process model, it will
be called in case 1,2,3,4.

New API rte_eth_dev_attach_private and rte_eth_dev_detach_private are
introduced to cover case 5,6,7,8, this API can only be invoked in secondary
process.
 > Signed-off-by: Qi Zhang <qi.z.zh...@intel.com>
---

<snip>

        rte_eal_mcfg_complete();
+ if (rte_eth_dev_mp_init()) {
+               rte_eal_init_alert("rte_eth_dev_mp_init() failed\n");
+               rte_errno = ENOEXEC;
+               return -1;
+       }
+

Why is this done after the end of init? rte_eal_mcfg_complete() makes it so that secondaries can initialize, at that point all initialization should have been finished. I would expect this to be called after (before?) bus probe, since this is device-related.

        return fctret;
  }
diff --git a/lib/librte_ethdev/Makefile b/lib/librte_ethdev/Makefile
index c2f2f7d82..04e93f337 100644
--- a/lib/librte_ethdev/Makefile
+++ b/lib/librte_ethdev/Makefile
@@ -19,6 +19,7 @@ EXPORT_MAP := rte_ethdev_version.map
  LIBABIVER := 9

<snip>

+       if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+
+               /**
+                * If secondary process, we just send request to primray
+                * to start the process.
+                */
+               req.t = REQ_TYPE_ATTACH;
+               strlcpy(req.devargs, devargs, MAX_DEV_ARGS_LEN);
+
+               ret = rte_eth_dev_request_to_primary(&req);
+               if (ret) {
+                       ethdev_log(ERR, "Failed to send device attach request to 
primary\n");

The log message is a little misleading. It can be that secondary has failed to send request. It can also be that it succeeded, but the attach itself has failed. I think a better message would be "attach request has failed" or something to that effect.

+                       return ret;
+               }
+
+               *port_id = req.port_id;
+               return req.result;
+       }
+
+       ret = do_eth_dev_attach(devargs, port_id);
+       if (ret)
+               return ret;
+
+       /* send attach request to seoncary */
+       req.t = REQ_TYPE_ATTACH;
+       strlcpy(req.devargs, devargs, MAX_DEV_ARGS_LEN);
+       req.port_id = *port_id;
+       ret = rte_eth_dev_request_to_secondary(&req);
+       if (ret) {
+               ethdev_log(ERR, "Failed to send device attach request to 
secondary\n");

Same as above - log message can/might be misleading. There are a few other places where similar log message is present, those should be corrected too.

+               goto rollback;
+       }
+
+       if (req.result)
+               goto rollback;
+
+       return 0;

<snip>

+{
+       uint32_t dev_flags;
+
+       if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+               return -ENOTSUP;
+
+       RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
+
+       dev_flags = rte_eth_devices[port_id].data->dev_flags;
+       if (dev_flags & RTE_ETH_DEV_BONDED_SLAVE) {
+               ethdev_log(ERR,
+                       "Port %" PRIu16 " is bonded, cannot detach", port_id);
+               return -ENOTSUP;
+       }

Do we have to do a similar check for failsafe devices?

+
+       return do_eth_dev_detach(port_id);
+}
+
  static int
  rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
  {
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 36e3984ea..bb03d613b 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h

<snip>

  /**
+ * Attach a private Ethernet device specified by arguments.
+ * A private device is invisible to other process.
+ * Can only be invoked in secondary process.
+ *
+ * @param devargs
+ *  A pointer to a strings array describing the new device
+ *  to be attached. The strings should be a pci address like
+ *  '0000:01:00.0' or virtual device name like 'net_pcap0'.
+ * @param port_id
+ *  A pointer to a port identifier actually attached.
+ * @return
+ *  0 on success and port_id is filled, negative on error
+ */
+int rte_eth_dev_attach_private(const char *devargs, uint16_t *port_id);

New API's should be marked as __rte_experimental.

+
+/**
   * Detach a Ethernet device specified by port identifier.
   * This function must be called when the device is in the
   * closed state.
+ * In multi-process mode, it will sync with other process
+ * to detach the device.
   *
   * @param port_id
   *   The port identifier of the device to detach.
@@ -1490,6 +1511,22 @@ int rte_eth_dev_attach(const char *devargs, uint16_t 
*port_id);

<snip>

+ * Detach a Ethernet device in current process.
+ *
+ * @param port_id
+ *   The port identifier of the device to detach.
+ * @param devname
+ *   A pointer to a buffer that will be filled with the device name.
+ *   This buffer must be at least RTE_DEV_NAME_MAX_LEN long.
+ * @return
+ *  0 on success and devname is filled, negative on error
+ */
+int do_eth_dev_detach(uint16_t port_id);
+

Why is this made part of an external API? You should have a separate, private header file for these.

  #ifdef __cplusplus
  }
  #endif
diff --git a/lib/librte_ethdev/rte_ethdev_mp.c 
b/lib/librte_ethdev/rte_ethdev_mp.c
new file mode 100644
index 000000000..8ede8151d
--- /dev/null
+++ b/lib/librte_ethdev/rte_ethdev_mp.c
@@ -0,0 +1,195 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2018 Intel Corporation
+ */
+
+#include "rte_ethdev_driver.h"
+#include "rte_ethdev_mp.h"
+
+static int detach_on_secondary(uint16_t port_id)

<snip>

+       free(da.args);
+       return 0;
+}
+
+static int handle_secondary_request(const struct rte_mp_msg *msg, const void 
*peer)
+{
+       (void)msg;
+       (void)(peer);
+       return -ENOTSUP;

Please either mark arguments as __rte_unused, or use RTE_SET_USED(blah) macro. Same in other similar places.

+}
+
+static int handle_primary_response(const struct rte_mp_msg *msg, const void 
*peer)
+{
+       (void)msg;
+       (void)(peer);
+       return -ENOTSUP;
+}
+
+static int handle_primary_request(const struct rte_mp_msg *msg, const void 
*peer)
+{
+       const struct eth_dev_mp_req *req =
+               (const struct eth_dev_mp_req *)msg->param;

<snip>

+       case REQ_TYPE_DETACH:
+       case REQ_TYPE_ATTACH_ROLLBACK:
+               ret = detach_on_secondary(req->port_id);
+               break;
+       default:
+               ret = -EINVAL;
+       }
+
+       strcpy(mp_resp.name, ETH_DEV_MP_ACTION_REQUEST);

Here and in other places: rte_strlcpy?

--
Thanks,
Anatoly

Reply via email to