>Hi Mark,
>
>Replies inline prefixed with [PL].
>
>Thanks,
>Przemek
>
>-----Original Message-----
>From: Kavanagh, Mark B
>Sent: Thursday, May 5, 2016 2:19 PM
>To: Lal, PrzemyslawX <przemyslawx....@intel.com>; dev@openvswitch.org
>Subject: RE: [PATCH] netdev-dpdk: add sflow support for vhost-user ports
>
>Hi Przemek,
>
>Some additional comments/queries inline.
>
>Thanks again,
>Mark
>
>>
>>This patch adds sFlow support for DPDK vHost-user interfaces by
>>assigning ifindex values. Ifindex values for vHost-user interfaces
>>start at 2000 to avoid overlapping with kernel datapath interfaces.
>>
>>The patch also fixes an issue with the 'dpdk0' interface being ignored
>>by the sFlow agent because of its ifindex of 0. Ifindex values for
>>physical DPDK interfaces start at 1000 to avoid overlapping with
>>kernel datapath interfaces.
>>
>>Signed-off-by: Przemyslaw Lal <przemyslawx....@intel.com>
>>---
>> lib/netdev-dpdk.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 69 insertions(+), 1 deletion(-)
>>
>>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>index 208c5f5..1b60c5a 100644
>>--- a/lib/netdev-dpdk.c
>>+++ b/lib/netdev-dpdk.c
>>@@ -115,6 +115,18 @@ static char *vhost_sock_dir = NULL; /* Location of vhost-user sockets */
>> */
>> #define VHOST_ENQ_RETRY_USECS 100
>>
>>+/* For DPDK ETH interfaces use ifindex values starting with 1000
>>+ * to avoid overlapping with kernel-space interfaces.
>>+ * Also starting with 0 would cause sFlow to ignore 'dpdk0' interface.
>>+ */
>>+#define DPDK_PORT_ID_TO_IFINDEX(port_id) ((port_id) + 1000)
>>+
>>+/* For DPDK vhost-user interfaces use ifindexes starting with 2000.
>>+ */
>>+#define VHOST_ID_TO_IFINDEX(port_id) ((port_id) + 2000)
>>+
>>+#define VHOST_IDS_MAX_LEN 1024
>>+
>> static const struct rte_eth_conf port_conf = {
>>     .rxmode = {
>>         .mq_mode = ETH_MQ_RX_RSS,
>>@@ -149,6 +161,7 @@ enum dpdk_dev_type {
>> static int rte_eal_init_ret = ENODEV;
>>
>> static struct ovs_mutex dpdk_mutex = OVS_MUTEX_INITIALIZER;
>>+static struct ovs_mutex vhost_mutex = OVS_MUTEX_INITIALIZER;
>>
>> /* Quality of Service */
>>
>>@@ -790,6 +803,50 @@ netdev_dpdk_vhost_cuse_construct(struct netdev *netdev)
>>     return err;
>> }
>>
>>+/* Counter for VHOST interfaces as DPDK library doesn't provide mechanism
>>+ * similar to rte_eth_dev_count for vhost-user sockets.
>>+ */
>>+static int vhost_counter OVS_GUARDED_BY(vhost_mutex) = 0;
>>+
>>+/* Array storing vhost_ids, so their ifindex can be reused after socket
>>+ * recreation.
>>+ */
>>+static char vhost_ids[VHOST_IDS_MAX_LEN][PATH_MAX] OVS_GUARDED_BY(vhost_mutex);
>>+
>>+/* Simple lookup in vhost_ids array.
>>+ * On success returns id of vhost_id stored in the array,
>>+ * otherwise returns -1.
>>+ */
>>+static int
>>+netdev_dpdk_lookup_vhost_id(struct netdev_dpdk *dev) OVS_REQUIRES(vhost_mutex)
>>+{
>>+    for (int i = 0; i < vhost_counter; i++) {
>>+        if (strncmp(vhost_ids[i], dev->vhost_id, strlen(dev->vhost_id)) == 0) {
>>+            return i;
>>+        }
>>+    }
>>+    return -1;
>>+}
>>+
>>+/* Inserts vhost_id at the first free position in the vhost_ids array.
>>+ */
>>+static void
>>+netdev_dpdk_push_vhost_id(struct netdev_dpdk *dev)
>>+{
>>+    ovs_mutex_lock(&vhost_mutex);
>>+    if (netdev_dpdk_lookup_vhost_id(dev) < 0) {
>>+        if (vhost_counter < VHOST_IDS_MAX_LEN) {
>>+            strncpy(vhost_ids[vhost_counter++], dev->vhost_id,
>>+                    strlen(dev->vhost_id));
>>+        } else {
>>+            VLOG_WARN("Could not assign ifindex to \"%s\" port. "
" >>+ "List of vhost IDs list is full.", >>+ dev->vhost_id); >>+ } >>+ } >>+ ovs_mutex_unlock(&vhost_mutex); >>+} >>+ >> static int >> netdev_dpdk_vhost_user_construct(struct netdev *netdev) { @@ -825,6 >>+882,8 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev) >> err = vhost_construct_helper(netdev); >> } >> >>+ netdev_dpdk_push_vhost_id(dev); >>+ >> ovs_mutex_unlock(&dpdk_mutex); >> return err; >> } >>@@ -1773,9 +1832,18 @@ netdev_dpdk_get_ifindex(const struct netdev >>*netdev) { >> struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); >> int ifindex; >>+ int ret; >> >> ovs_mutex_lock(&dev->mutex); >>- ifindex = dev->port_id; >>+ if (dev->type == DPDK_DEV_ETH) { >>+ ifindex = DPDK_PORT_ID_TO_IFINDEX(dev->port_id); >>+ } else { > > >This 'else' statement is executed in both the case of vhost user devices and >also vhost cuse >devices. >In the latter case, do the port numbers returned by VHOST_ID_IFINDEX make >sense? (i.e. can >sFlow 'talk' to vhost-cuse ports? If not, then this code has a bug). > >[PL] At this point for vhost cuse there's always '-1' returned from lookup >function, so in >the nested if statement you'll always hit 'else' and 0 will be returned as >ifindex for vhost >cuse devices.
Agreed. But then all vhost-cuse devices share an ifindex of 0 - this is surely
problematic?

>Anyway setting ifindexes for vhost-cuse devices should be a trivial task and
>if vhost-cuse is supported by sFlow I can handle it - but then it will
>require changing the patch subject and message.

Have you tested this patch against vhost-cuse? I'd be interested as to how it
behaves.

>
>>+        if ((ret = netdev_dpdk_lookup_vhost_id(dev)) >= 0) {
>>+            ifindex = VHOST_ID_TO_IFINDEX(ret);
>>+        } else {
>>+            ifindex = 0;
>
>I don't think that 0 is an appropriate value to return here, since it
>overlaps with a kernel interface's index.
>
>[PL] IMHO it's ok to use 0 here - it will be assigned to ifindex only when
>the lookup for the vhost device failed.

Have you seen Ben's comment on this?
http://openvswitch.org/pipermail/dev/2016-May/070626.html

>
>>+        }
>>+    }
>>     ovs_mutex_unlock(&dev->mutex);
>>
>>     return ifindex;
>>--
>>2.1.0

>[MK] There's a gap in this code - when the vhost-user device is destroyed,
>its entry should be removed from 'vhost_ids'.
>
>[PL] Actually the whole idea behind "caching" these vhost_ids is to have them
>stored to survive destroy and recreation of vhost devices, for instance when
>a VM is hard-rebooted, to keep the same ifindex and not assign a new one,
>which could be confusing (for example one could have a vhu1 port with ifindex
>2001 visible in sFlow stats, but then hard-reboot this VM and vhu1 gets
>ifindex 2002 - it's the same vhost-user socket, the same VM, but somehow it
>gets a new ifindex and a new entry in sFlow). But if you think this can be
>improved, don't hesitate to share your ideas.

I completely understand what your intention is here. There is still an
inherent issue with the design though, in that once vhost ports have been
added to 'vhost_ids', they are never removed.
Consider the following scenario:
- multiple vhost ports have been added to the bridge, such that
  vhost_counter = VHOST_IDS_MAX_LEN (I'm not sure what the use case for
  adding 1024 ports to a bridge would be, but I digress)
- one or more of the previously-added ports are no longer needed, so we bring
  down the VMs that they are attached to
- we want to add another vhost port, described by a vhost_id that is not
  already present in 'vhost_ids', and connect it to a VM

=> Now, however, we cannot add the additional port, because vhost_counter
was never decremented and is still equal to VHOST_IDS_MAX_LEN: the bounds
check in netdev_dpdk_push_vhost_id fails, and the user is warned that their
port cannot be added.

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
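[Editor's note] One way to reconcile the two goals discussed above (keep a
stable ifindex across VM hard-reboots, but free table slots when a port is
deliberately deleted) is to clear a slot on destruct rather than compact the
array, and have the push path reuse cleared slots. The sketch below is a
standalone illustration under stated assumptions: the function names
(`push_vhost_id`, `remove_vhost_id`) are hypothetical, plain C strings stand
in for the netdev structures, and locking is omitted; it also uses `strcmp`
rather than the patch's `strncmp`, which would treat "vhu1" as a match for
"vhu10".

```c
#include <stdio.h>
#include <string.h>

#define VHOST_IDS_MAX_LEN 1024
#define VHOST_ID_LEN 256            /* stand-in for PATH_MAX */

/* An empty string marks a free slot. A slot's index is its stable
 * ifindex offset, so surviving entries never shift when a neighbouring
 * slot is freed. */
static char vhost_ids[VHOST_IDS_MAX_LEN][VHOST_ID_LEN];

/* Returns the slot holding vhost_id, or -1 if not present. */
static int
lookup_vhost_id(const char *vhost_id)
{
    for (int i = 0; i < VHOST_IDS_MAX_LEN; i++) {
        if (strcmp(vhost_ids[i], vhost_id) == 0) {
            return i;
        }
    }
    return -1;
}

/* Returns the slot index assigned to vhost_id, or -1 if the table is
 * full. A previously-seen id keeps its old slot (VM reboot case);
 * otherwise the first free slot is reused. */
static int
push_vhost_id(const char *vhost_id)
{
    int i = lookup_vhost_id(vhost_id);
    if (i >= 0) {
        return i;
    }
    for (i = 0; i < VHOST_IDS_MAX_LEN; i++) {
        if (vhost_ids[i][0] == '\0') {
            snprintf(vhost_ids[i], VHOST_ID_LEN, "%s", vhost_id);
            return i;
        }
    }
    return -1;
}

/* Called only on deliberate port deletion (not on VM reboot): frees the
 * slot so the table cannot fill up permanently. */
static void
remove_vhost_id(const char *vhost_id)
{
    int i = lookup_vhost_id(vhost_id);
    if (i >= 0) {
        vhost_ids[i][0] = '\0';
    }
}
```

The trade-off versus the patch as posted: an ifindex is now only stable for
as long as the port is not explicitly deleted, which seems to match the
hard-reboot use case Przemek describes while avoiding the permanent-leak
scenario above.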