> >Hi Mark,
> >
> >The RFC 2863 standard describes in its "Interface Numbering" chapter
> >(https://tools.ietf.org/html/rfc2863#page-10) that if it is possible to
> >reuse an ifIndex after reinitialization, then it should be reused.
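(Aside, for illustration: the reuse rule that RFC 2863 describes amounts to a persistent name-to-ifindex map - an interface that disappears and comes back gets its old index rather than a fresh one. A minimal standalone C sketch follows; the function name, table size, and base index are hypothetical, not OVS or patch code.)

```c
/* Standalone sketch of the RFC 2863 policy: an interface that is
 * reinitialized should get the same ifIndex it had before.  All names
 * and sizes here are hypothetical; this is not OVS code. */
#include <assert.h>
#include <string.h>

#define MAX_NAMES 8
#define BASE_IFINDEX 1          /* ifIndex values are >= 1 per RFC 2863. */

static char seen_names[MAX_NAMES][64];
static int n_seen;

/* Returns a stable ifindex for 'name', assigning a new one on first use
 * and reusing the previous one on any later (re)initialization.
 * Returns 0 (an invalid ifIndex) if the table is full. */
static int
ifindex_for_name(const char *name)
{
    for (int i = 0; i < n_seen; i++) {
        if (!strcmp(seen_names[i], name)) {
            return BASE_IFINDEX + i;    /* Reuse: same name, same index. */
        }
    }
    if (n_seen < MAX_NAMES && strlen(name) < sizeof seen_names[0]) {
        strcpy(seen_names[n_seen], name);
        return BASE_IFINDEX + n_seen++;
    }
    return 0;
}
```

In this model a hard-rebooted VM whose socket is recreated under the same name keeps its original ifindex, which is the behaviour the RFC asks for.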
[MK] Very interesting - thanks for the link. And in our case it's definitely
possible.

>I also agree with you that the limitation to 1024 vhost-user interfaces is a
>major problem here and I should avoid such arbitrary values. I think the
>solution might be to use a hashmap or the OVS list implementation, or maybe
>even dynamic allocation of new vhost ifindex entries.

[MK] Do you think that the issues relating to ifIndex/ifNumber are only
applicable to DPDK-based vhost and ring ports? i.e. does OVS handle
kernel-based ifIndexes gracefully currently, or is this a larger issue that
requires a more general solution?

>I have also analyzed the current vhost-cuse initialization; adding support
>for ifindex assignment to this type of port should be a trivial task, and it
>could easily be introduced in V2 of this patch.

[MK] Okay, great - I look forward to it.

>
>Please share your thoughts.
>
>Thanks,
>Przemek
>
>-----Original Message-----
>From: Kavanagh, Mark B
>Sent: Wednesday, May 11, 2016 11:56 AM
>To: Lal, PrzemyslawX <przemyslawx....@intel.com>; dev@openvswitch.org
>Subject: RE: [PATCH] netdev-dpdk: add sflow support for vhost-user ports
>
>>
>>Hi Mark,
>>
>>Replies inline prefixed with [PL].
>>
>>Thanks,
>>Przemek
>>
>>-----Original Message-----
>>From: Kavanagh, Mark B
>>Sent: Thursday, May 5, 2016 2:19 PM
>>To: Lal, PrzemyslawX <przemyslawx....@intel.com>; dev@openvswitch.org
>>Subject: RE: [PATCH] netdev-dpdk: add sflow support for vhost-user ports
>>
>>Hi Przemek,
>>
>>Some additional comments/queries inline.
>>
>>Thanks again,
>>Mark
>>
>>>
>>>This patch adds sFlow support for DPDK vHost-user interfaces by
>>>assigning ifindex values. ifindex values for vHost-user interfaces
>>>start with 2000 to avoid overlapping with kernel datapath interfaces.
>>>
>>>The patch also fixes an issue with the 'dpdk0' interface being ignored
>>>by the sFlow agent because of its ifindex of 0.
>>>ifindex values for physical DPDK
>>>interfaces start with 1000 to avoid overlapping with kernel datapath
>>>interfaces.
>>>
>>>Signed-off-by: Przemyslaw Lal <przemyslawx....@intel.com>
>>>---
>>> lib/netdev-dpdk.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>> 1 file changed, 69 insertions(+), 1 deletion(-)
>>>
>>>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>>index 208c5f5..1b60c5a 100644
>>>--- a/lib/netdev-dpdk.c
>>>+++ b/lib/netdev-dpdk.c
>>>@@ -115,6 +115,18 @@ static char *vhost_sock_dir = NULL; /* Location of vhost-user sockets
>>>  */
>>> #define VHOST_ENQ_RETRY_USECS 100
>>>
>>>+/* For DPDK ETH interfaces use ifindex values starting with 1000
>>>+ * to avoid overlapping with kernel-space interfaces.
>>>+ * Also starting with 0 would cause sFlow to ignore 'dpdk0' interface.
>>>+ */
>>>+#define DPDK_PORT_ID_TO_IFINDEX(port_id) ((port_id) + 1000)
>>>+
>>>+/* For DPDK vhost-user interfaces use ifindexes starting with 2000.
>>>+ */
>>>+#define VHOST_ID_TO_IFINDEX(port_id) ((port_id) + 2000)
>>>+
>>>+#define VHOST_IDS_MAX_LEN 1024
>>>+
>>> static const struct rte_eth_conf port_conf = {
>>>     .rxmode = {
>>>         .mq_mode = ETH_MQ_RX_RSS,
>>>@@ -149,6 +161,7 @@ enum dpdk_dev_type {
>>> static int rte_eal_init_ret = ENODEV;
>>>
>>> static struct ovs_mutex dpdk_mutex = OVS_MUTEX_INITIALIZER;
>>>+static struct ovs_mutex vhost_mutex = OVS_MUTEX_INITIALIZER;
>>>
>>> /* Quality of Service */
>>>
>>>@@ -790,6 +803,50 @@ netdev_dpdk_vhost_cuse_construct(struct netdev *netdev)
>>>     return err;
>>> }
>>>
>>>+/* Counter for VHOST interfaces as DPDK library doesn't provide mechanism
>>>+ * similar to rte_eth_dev_count for vhost-user sockets.
>>>+ */
>>>+static int vhost_counter OVS_GUARDED_BY(vhost_mutex) = 0;
>>>+
>>>+/* Array storing vhost_ids, so their ifindex can be reused after socket
>>>+ * recreation.
>>>+ */
>>>+static char vhost_ids[VHOST_IDS_MAX_LEN][PATH_MAX] OVS_GUARDED_BY(vhost_mutex);
>>>+
>>>+/* Simple lookup in vhost_ids array.
>>>+ * On success returns id of vhost_id stored in the array,
>>>+ * otherwise returns -1.
>>>+ */
>>>+static int
>>>+netdev_dpdk_lookup_vhost_id(struct netdev_dpdk *dev) OVS_REQUIRES(vhost_mutex)
>>>+{
>>>+    for (int i = 0; i < vhost_counter; i++) {
>>>+        if (strncmp(vhost_ids[i], dev->vhost_id, strlen(dev->vhost_id)) == 0) {
>>>+            return i;
>>>+        }
>>>+    }
>>>+    return -1;
>>>+}
>>>+
>>>+/* Inserts vhost_id at the first free position in the vhost_ids array. */
>>>+static void
>>>+netdev_dpdk_push_vhost_id(struct netdev_dpdk *dev)
>>>+{
>>>+    ovs_mutex_lock(&vhost_mutex);
>>>+    if (netdev_dpdk_lookup_vhost_id(dev) < 0) {
>>>+        if (vhost_counter < VHOST_IDS_MAX_LEN) {
>>>+            strncpy(vhost_ids[vhost_counter++], dev->vhost_id,
>>>+                    strlen(dev->vhost_id));
>>>+        } else {
>>>+            VLOG_WARN("Could not assign ifindex to \"%s\" port. "
>>>+                      "List of vhost IDs is full.",
>>>+                      dev->vhost_id);
>>>+        }
>>>+    }
>>>+    ovs_mutex_unlock(&vhost_mutex);
>>>+}
>>>+
>>> static int
>>> netdev_dpdk_vhost_user_construct(struct netdev *netdev)
>>> {
>>>@@ -825,6 +882,8 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev)
>>>         err = vhost_construct_helper(netdev);
>>>     }
>>>
>>>+    netdev_dpdk_push_vhost_id(dev);
>>>+
>>>     ovs_mutex_unlock(&dpdk_mutex);
>>>     return err;
>>> }
>>>@@ -1773,9 +1832,18 @@ netdev_dpdk_get_ifindex(const struct netdev *netdev)
>>> {
>>>     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>     int ifindex;
>>>+    int ret;
>>>
>>>     ovs_mutex_lock(&dev->mutex);
>>>-    ifindex = dev->port_id;
>>>+    if (dev->type == DPDK_DEV_ETH) {
>>>+        ifindex = DPDK_PORT_ID_TO_IFINDEX(dev->port_id);
>>>+    } else {
>>
>>This 'else' statement is executed both in the case of vhost-user devices
>>and in that of vhost-cuse devices.
>>In the latter case, do the port numbers returned by VHOST_ID_TO_IFINDEX
>>make sense? (i.e. can sFlow 'talk' to vhost-cuse ports?
>>If not, then this code has a bug).
>>
>>[PL] At this point for vhost-cuse there's always -1 returned from the
>>lookup function, so in the nested if statement you'll always hit 'else'
>>and 0 will be returned as the ifindex for vhost-cuse devices.
>
>Agreed. But then all vhost-cuse devices share an ifindex of 0 - this is
>surely problematic?
>
>>Anyway, setting ifindexes for vhost-cuse devices should be a trivial task,
>>and if vhost-cuse is supported by sFlow I can handle it - but then it will
>>require changing the patch subject and message.
>
>Have you tested this patch against vhost-cuse? I'd be interested as to how
>it behaves.
>
>>
>>>+        if ((ret = netdev_dpdk_lookup_vhost_id(dev)) >= 0) {
>>>+            ifindex = VHOST_ID_TO_IFINDEX(ret);
>>>+        } else {
>>>+            ifindex = 0;
>>
>>I don't think that 0 is an appropriate value to return here, since it
>>overlaps with a kernel interface's index.
>>
>>[PL] IMHO it's OK to use 0 here - it will be assigned to ifindex only
>>when the lookup for the vhost device has failed.
>
>Have you seen Ben's comment on this?
>http://openvswitch.org/pipermail/dev/2016-May/070626.html
>
>>
>>>+        }
>>>+    }
>>>     ovs_mutex_unlock(&dev->mutex);
>>>
>>>     return ifindex;
>>>--
>>>2.1.0
>>
>>[MK] There's a gap in this code - when the vhost-user device is destroyed,
>>its entry should be removed from 'vhost_ids'.
>>
>>[PL] Actually the whole idea behind "caching" these vhost_ids is to keep
>>them stored so that they survive destruction and recreation of vhost
>>devices - for instance when a VM is hard-rebooted - and the same ifindex
>>is kept rather than a new one assigned, which could be confusing (for
>>example, one could have a vhu1 port with ifindex 2001 visible in sFlow
>>stats, but after a hard reboot of that VM, vhu1 gets ifindex 2002 - it's
>>the same vhost-user socket and the same VM, but somehow it gets a new
>>ifindex and a new entry in sFlow). But if you think I can improve this,
>>don't hesitate to share your ideas.
>
>I completely understand what your intention is here.
>There is still an inherent issue with the design though, in that once vhost
>ports have been added to 'vhost_ids', they are never removed.
>
>Consider the following scenario:
>- multiple vhost ports have been added to the bridge, such that
>  vhost_counter = VHOST_IDS_MAX_LEN (I'm not sure what the use case for
>  adding 1024 ports to a bridge would be, but I digress)
>- one or more of the previously-added ports are no longer needed, so we
>  bring down the VMs that they are attached to
>- we want to add another vhost port, described by a vhost_id that is not
>  already present in 'vhost_ids', and connect it to a VM
>=> Now, however, we cannot add the additional port, because vhost_counter
>was never decremented and is still equal to VHOST_IDS_MAX_LEN: the bounds
>check in netdev_dpdk_push_vhost_id fails, and the user is warned that
>their port cannot be added.

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
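The exhaustion scenario described above can be condensed into a runnable toy model. The cap is shrunk from 1024 to 2 so the failure is easy to see; `push_id`, `ids`, and `counter` are hypothetical stand-ins for the patch's code, not the patch itself.

```c
/* Toy model of the fixed-size vhost_ids table: entries are never removed,
 * so once 'counter' reaches the cap, no new vhost_id can ever be stored.
 * The cap is shrunk from 1024 to 2 for demonstration; all names here are
 * hypothetical stand-ins, not the patch's actual code. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define IDS_MAX 2               /* Stand-in for VHOST_IDS_MAX_LEN. */

static char ids[IDS_MAX][64];
static int counter;

/* Mimics the push logic: returns true if 'id' was stored or is already
 * present, false if the table is full. */
static bool
push_id(const char *id)
{
    for (int i = 0; i < counter; i++) {
        if (!strcmp(ids[i], id)) {
            return true;        /* Known id: its slot (ifindex) is reused. */
        }
    }
    if (counter < IDS_MAX && strlen(id) < sizeof ids[0]) {
        strcpy(ids[counter++], id);
        return true;
    }
    return false;               /* 'counter' is never decremented, so this
                                 * state is permanent - even if every port
                                 * that filled the table has been removed. */
}
```

Bringing the VMs down changes nothing in this model, which is exactly the point above: a third, brand-new vhost_id is rejected even though no port is using the first two slots any more.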