On 07-Jun-18 1:38 PM, Qi Zhang wrote:
This patch cover the multi-process hotplug case when a share device
attach/detach request be issued from secondary process, the implementation
references malloc_mp.c.

device attach on secondary:
a) seconary send asycn request to primary and wait on a condition
    which will be released by matched response from primary.
b) primary receive the request and attach the new device if failed
    goto i).
c) primary forward attach request to all secondary as async request
    (because this in mp thread context, use sync request will deadlock)
d) secondary receive request and attach device and send reply.
e) primary check the reply if all success go to j).
f) primary send attach rollback async request to all secondary.
g) secondary receive the request and detach device and send reply.
h) primary receive the reply and detach device as rollback action.
i) send fail response to secondary, goto k).
j) send success response to secondary.
k) secondary process receive response and return.

device detach on secondary:
a) secondary send async request to primary and wait on a condition
    which will be released by matched response from primary.
b) primary receive the request and  perform pre-detach check, if device
    is locked, goto j).
c) primary send pre-detach async request to all secondary.
d) secondary perform pre-detach check and send reply.
e) primary check the reply if any fail goto j).
f) primary send detach async request to all secondary
g) secondary detach the device and send reply
h) primary detach the device.
i) send success response to secondary, goto k).
j) send fail response to secondary.
k) secondary process receive response and return.

Signed-off-by: Qi Zhang <qi.z.zh...@intel.com>
---

<snip>

+TAILQ_HEAD(mp_request_list, mp_request);
+static struct {
+       struct mp_request_list list;
+       pthread_mutex_t lock;
+} mp_request_list = {
+       .list = TAILQ_HEAD_INITIALIZER(mp_request_list.list),
+       .lock = PTHREAD_MUTEX_INITIALIZER
+};
+
+#define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */

Patch number 4 should've used this #define to set up its timeout.

+
+static struct mp_request *
+find_request_by_id(uint64_t id)
+{
+       struct mp_request *req;
+
+       TAILQ_FOREACH(req, &mp_request_list.list, next) {
+               if (req->user_req.id == id)
+                       break;
+       }
+       return req;
+}
+

<snip>

+               if (resp->result)
+                       return resp->result;
+       }
+
+       return 0;
+}
+
+static int
+send_response_to_secondary(const struct eth_dev_mp_req *req, int result)
+{
+       struct rte_mp_msg resp_msg = {0};

I've been bitten by this in the past - some compilers (*cough* clang *cough*) don't like this kind of zero-initialization depending on which type of parameter comes first in the structure, so i would refrain from using it and used memset(0) instead.

+       struct eth_dev_mp_req *resp =
+               (struct eth_dev_mp_req *)resp_msg.param;
+       int ret = 0;
+
+       resp_msg.len_param = sizeof(*resp);
+       strcpy(resp_msg.name, ETH_DEV_MP_ACTION_RESPONSE);

here and in other places - strlcpy()?

+       memcpy(resp, req, sizeof(*req));
+       resp->result = result;
+
+       ret = rte_mp_sendmsg(&resp_msg);
+       if (ret)
+               ethdev_log(ERR, "failed to send response to secondary\n");
+
+       return ret;
+}
+
+static int
+handle_async_attach_response(const struct rte_mp_msg *request,
+                            const struct rte_mp_reply *reply)
+{

<snip>

+       else
+               return -1;
+       do {
+               ret = rte_mp_request_async(&mp_req, &ts, clb);
+       } while (ret != 0 && rte_errno == EEXIST);
+
+       if (ret)
+               ethdev_log(ERR, "couldn't send async request\n");
+       entry = find_request_by_id(req->id > +    (void)entry;

Why did you look up entry and then marked it as used without checking the return value? Leftover? Some code missing?

+       return ret;
+}
+
+static int handle_secondary_request(const struct rte_mp_msg *msg,
+                                   const void *peer __rte_unused)
+{
+       const struct eth_dev_mp_req *req =
+               (const struct eth_dev_mp_req *)msg->param;
+       struct eth_dev_mp_req tmp_req;

<snip>

@@ -124,10 +490,101 @@ static int handle_primary_request(const struct 
rte_mp_msg *msg, const void *peer
        return 0;
  }
+/**
+ * secondary to primary request.
+ *
+ * device attach:
+ * a) seconary send request to primary.
+ * b) primary attach the new device if failed goto i).
+ * c) primary forward attach request to all secondary.
+ * d) secondary receive request and attach device and send reply.
+ * e) primary check the reply if all success go to j).
+ * f) primary send attach rollback request to all secondary.
+ * g) secondary receive the request and detach device and send reply.
+ * h) primary receive the reply and detach device as rollback action.
+ * i) send fail response to secondary, goto k).
+ * j) send success response to secondary.
+ * k) end.
+
+ * device detach:
+ * a) secondary send request to primary.
+ * b) primary perform pre-detach check, if device is locked, got j).
+ * c) primary send pre-detach check request to all secondary.
+ * d) secondary perform pre-detach check and send reply.
+ * e) primary check the reply if any fail goto j).
+ * f) primary send detach request to all secondary
+ * g) secondary detach the device and send reply
+ * h) primary detach the device.
+ * i) send success response to secondary, goto k).
+ * j) send fail response to secondary.
+ * k) end.
+ */

I think this comment should be at the top of this file.

--
Thanks,
Anatoly

Reply via email to