Reusing an ifIndex from time to time is not a problem from the sFlow standard's point of view either. In fact, we used a hash approach in the host-sflow project to assign index numbers to containers (by open-hashing the container UUID into a range of possible index numbers): https://github.com/sflow/host-sflow/blob/master/src/Linux/hsflowd.c#L373
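Roughly, the scheme looks like this -- a simplified, untested sketch of the idea rather than the actual host-sflow code (the base/range constants and the 'slot_taken' callback are illustrative, not real host-sflow names):

    #include <stdbool.h>
    #include <stdint.h>

    #define IFINDEX_BASE  2000    /* illustrative: first index in the range */
    #define IFINDEX_RANGE 4096    /* illustrative: size of the range */

    /* djb2a-style string hash. */
    static uint32_t
    hash_string_djb2(const char *s)
    {
        uint32_t h = 5381;
        while (*s) {
            h = (h * 33) ^ (uint32_t) (unsigned char) *s++;
        }
        return h;
    }

    /* Map a UUID string to an ifIndex in [IFINDEX_BASE,
     * IFINDEX_BASE + IFINDEX_RANGE), resolving collisions by linear
     * probing from the UUID's home slot.  'slot_taken' is a
     * caller-supplied predicate saying whether an index is already in
     * use by a different UUID. */
    static int
    uuid_to_ifindex(const char *uuid, bool (*slot_taken)(uint32_t ifindex))
    {
        uint32_t start = hash_string_djb2(uuid) % IFINDEX_RANGE;
        uint32_t probe;

        for (probe = 0; probe < IFINDEX_RANGE; probe++) {
            uint32_t ix = IFINDEX_BASE + (start + probe) % IFINDEX_RANGE;
            if (!slot_taken(ix)) {
                return (int) ix;
            }
        }
        return -1;    /* every slot in the range is taken */
    }

The point is that the same UUID hashes to the same starting slot every time, so an interface that comes and goes tends to get its old number back, while collisions simply probe forward within the range.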
------
Neil McKee
InMon Corp.
http://www.inmon.com

On Thu, Jun 16, 2016 at 4:16 AM, Lal, PrzemyslawX <przemyslawx....@intel.com> wrote:

> Hi Mark,
>
> The RFC 2863 standard describes, in its "Interface Numbering" section
> (https://tools.ietf.org/html/rfc2863#page-10), that if it is possible to
> reuse an ifindex after reinitialization then it should be reused -- and
> in our case it is definitely possible. I also agree with you that the
> limit of 1024 vhost-user interfaces is a major problem here and that I
> should avoid such arbitrary values. I think the solution might be to use
> a hash map, the OVS list implementation, or perhaps dynamic allocation
> of new vhost ifindex entries.
>
> I have also analyzed the current vhost-cuse initialization; adding
> support for ifindex assignment to this port type should be a trivial
> task and could easily be introduced in v2 of this patch.
>
> Please share your thoughts.
>
> Thanks,
> Przemek
>
> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Wednesday, May 11, 2016 11:56 AM
> To: Lal, PrzemyslawX <przemyslawx....@intel.com>; dev@openvswitch.org
> Subject: RE: [PATCH] netdev-dpdk: add sflow support for vhost-user ports
>
> >Hi Mark,
> >
> >Replies inline prefixed with [PL].
> >
> >Thanks,
> >Przemek
> >
> >-----Original Message-----
> >From: Kavanagh, Mark B
> >Sent: Thursday, May 5, 2016 2:19 PM
> >To: Lal, PrzemyslawX <przemyslawx....@intel.com>; dev@openvswitch.org
> >Subject: RE: [PATCH] netdev-dpdk: add sflow support for vhost-user ports
> >
> >Hi Przemek,
> >
> >Some additional comments/queries inline.
> >
> >Thanks again,
> >Mark
> >
> >>
> >>This patch adds sFlow support for DPDK vhost-user interfaces by
> >>assigning an ifindex value. Ifindex values for vhost-user interfaces
> >>start at 2000 to avoid overlapping with kernel datapath interfaces.
> >>
> >>The patch also fixes an issue with the 'dpdk0' interface being ignored
> >>by the sFlow agent because of its ifindex of 0. Ifindex values for
> >>physical DPDK interfaces start at 1000 to avoid overlapping with
> >>kernel datapath interfaces.
> >>
> >>Signed-off-by: Przemyslaw Lal <przemyslawx....@intel.com>
> >>---
> >> lib/netdev-dpdk.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> >> 1 file changed, 69 insertions(+), 1 deletion(-)
> >>
> >>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> >>index 208c5f5..1b60c5a 100644
> >>--- a/lib/netdev-dpdk.c
> >>+++ b/lib/netdev-dpdk.c
> >>@@ -115,6 +115,18 @@ static char *vhost_sock_dir = NULL; /* Location of vhost-user sockets */
> >> */
> >> #define VHOST_ENQ_RETRY_USECS 100
> >>
> >>+/* For DPDK ETH interfaces use ifindex values starting with 1000
> >>+ * to avoid overlapping with kernel-space interfaces.
> >>+ * Also starting with 0 would cause sFlow to ignore 'dpdk0' interface.
> >>+ */
> >>+#define DPDK_PORT_ID_TO_IFINDEX(port_id) ((port_id) + 1000)
> >>+
> >>+/* For DPDK vhost-user interfaces use ifindexes starting with 2000.
> >>+ */
> >>+#define VHOST_ID_TO_IFINDEX(port_id) ((port_id) + 2000)
> >>+
> >>+#define VHOST_IDS_MAX_LEN 1024
> >>+
> >> static const struct rte_eth_conf port_conf = {
> >>     .rxmode = {
> >>         .mq_mode = ETH_MQ_RX_RSS,
> >>@@ -149,6 +161,7 @@ enum dpdk_dev_type {
> >> static int rte_eal_init_ret = ENODEV;
> >>
> >> static struct ovs_mutex dpdk_mutex = OVS_MUTEX_INITIALIZER;
> >>+static struct ovs_mutex vhost_mutex = OVS_MUTEX_INITIALIZER;
> >>
> >> /* Quality of Service */
> >>
> >>@@ -790,6 +803,50 @@ netdev_dpdk_vhost_cuse_construct(struct netdev *netdev)
> >>     return err;
> >> }
> >>
> >>+/* Counter for VHOST interfaces as DPDK library doesn't provide
> >>+ * mechanism similar to rte_eth_dev_count for vhost-user sockets.
> >>+ */
> >>+static int vhost_counter OVS_GUARDED_BY(vhost_mutex) = 0;
> >>+
> >>+/* Array storing vhost_ids, so their ifindex can be reused after
> >>+ * socket recreation.
> >>+ */
> >>+static char vhost_ids[VHOST_IDS_MAX_LEN][PATH_MAX]
> >>+    OVS_GUARDED_BY(vhost_mutex);
> >>+
> >>+/* Simple lookup in vhost_ids array.
> >>+ * On success returns index of vhost_id stored in the array,
> >>+ * otherwise returns -1.
> >>+ */
> >>+static int
> >>+netdev_dpdk_lookup_vhost_id(struct netdev_dpdk *dev)
> >>+    OVS_REQUIRES(vhost_mutex)
> >>+{
> >>+    for (int i = 0; i < vhost_counter; i++) {
> >>+        if (strncmp(vhost_ids[i], dev->vhost_id, strlen(dev->vhost_id)) == 0) {
> >>+            return i;
> >>+        }
> >>+    }
> >>+    return -1;
> >>+}
> >>+
> >>+/* Inserts vhost_id at the first free position in the vhost_ids array. */
> >>+static void
> >>+netdev_dpdk_push_vhost_id(struct netdev_dpdk *dev)
> >>+{
> >>+    ovs_mutex_lock(&vhost_mutex);
> >>+    if (netdev_dpdk_lookup_vhost_id(dev) < 0) {
> >>+        if (vhost_counter < VHOST_IDS_MAX_LEN) {
> >>+            strncpy(vhost_ids[vhost_counter++], dev->vhost_id,
> >>+                    strlen(dev->vhost_id));
> >>+        } else {
> >>+            VLOG_WARN("Could not assign ifindex to \"%s\" port. "
> >>+                      "List of vhost IDs is full.",
> >>+                      dev->vhost_id);
> >>+        }
> >>+    }
> >>+    ovs_mutex_unlock(&vhost_mutex);
> >>+}
> >>+
> >> static int
> >> netdev_dpdk_vhost_user_construct(struct netdev *netdev)
> >> {
> >>@@ -825,6 +882,8 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev)
> >>         err = vhost_construct_helper(netdev);
> >>     }
> >>
> >>+    netdev_dpdk_push_vhost_id(dev);
> >>+
> >>     ovs_mutex_unlock(&dpdk_mutex);
> >>     return err;
> >> }
> >>@@ -1773,9 +1832,18 @@ netdev_dpdk_get_ifindex(const struct netdev *netdev)
> >> {
> >>     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> >>     int ifindex;
> >>+    int ret;
> >>
> >>     ovs_mutex_lock(&dev->mutex);
> >>-    ifindex = dev->port_id;
> >>+    if (dev->type == DPDK_DEV_ETH) {
> >>+        ifindex = DPDK_PORT_ID_TO_IFINDEX(dev->port_id);
> >>+    } else {
> >
> >This 'else' statement is executed both in the case of vhost-user
> >devices and in the case of vhost-cuse devices.
> >In the latter case, do the port numbers returned by VHOST_ID_TO_IFINDEX
> >make sense? (i.e. can sFlow 'talk' to vhost-cuse ports? If not, then
> >this code has a bug.)
> >
> >[PL] At this point, for vhost-cuse there's always '-1' returned from
> >the lookup function, so in the nested if statement you'll always hit
> >'else' and 0 will be returned as the ifindex for vhost-cuse devices.
>
> Agreed. But then all vhost-cuse devices share an index of 0 - this is
> surely problematic?
>
> >Anyway, setting ifindexes for vhost-cuse devices should be a trivial
> >task, and if vhost-cuse is supported by sFlow I can handle it - but
> >then it will require changing the patch subject and message.
> Have you tested this patch against vhost-cuse? I'd be interested in how
> it behaves.
>
> >>+        if ((ret = netdev_dpdk_lookup_vhost_id(dev)) >= 0) {
> >>+            ifindex = VHOST_ID_TO_IFINDEX(ret);
> >>+        } else {
> >>+            ifindex = 0;
> >
> >I don't think that 0 is an appropriate value to return here, since it
> >overlaps with a kernel interface's index.
> >
> >[PL] IMHO it's OK to use 0 here - it will be assigned to ifindex only
> >when the lookup for the vhost device has failed.
>
> Have you seen Ben's comment on this?
> http://openvswitch.org/pipermail/dev/2016-May/070626.html
>
> >>+        }
> >>+    }
> >>     ovs_mutex_unlock(&dev->mutex);
> >>
> >>     return ifindex;
> >>--
> >>2.1.0
> >
> >[MK] There's a gap in this code - when the vhost-user device is
> >destroyed, its entry should be removed from 'vhost_ids'.
> >
> >[PL] Actually, the whole idea behind "caching" these vhost_ids is to
> >have them stored so that they survive destruction and recreation of
> >vhost devices - for instance, when a VM is hard-rebooted - keeping the
> >same ifindex rather than assigning a new one, which could be confusing.
> >(For example, one could have a vhu1 port with ifindex 2001 visible in
> >sFlow stats, but after a hard reboot of that VM, vhu1 gets ifindex
> >2002: it's the same vhost-user socket and the same VM, yet it gets a
> >new ifindex and a new entry in sFlow.) But if you think this can be
> >improved, don't hesitate to share your ideas.
>
> I completely understand what your intention is here. There is still an
> inherent issue with the design, though, in that once vhost ports have
> been added to 'vhost_ids', they are never removed.
>
> Consider the following scenario:
> - Multiple vhost ports have been added to the bridge, such that
>   vhost_counter = VHOST_IDS_MAX_LEN (I'm not sure what the use case for
>   adding 1024 ports to a bridge would be, but I digress).
> - One or more of the previously-added ports are no longer needed, so we
>   bring down the VMs that they are attached to.
> - We want to add another vhost port, described by a vhost_id that is
>   not already present in 'vhost_ids', and connect it to a VM.
> => Now, however, we cannot add the additional port, because
> vhost_counter was never decremented and is still equal to
> VHOST_IDS_MAX_LEN: the bounds check in netdev_dpdk_push_vhost_id fails,
> and the user is warned that their port cannot be added.
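To make the hash-map suggestion in the quoted discussion concrete: using OVS's own 'hmap', the vhost_id-to-ifindex bookkeeping might look roughly like the sketch below. This is untested and purely illustrative -- the names (vhost_ifindex_node, vhost_ifindex_get, and so on) are mine, not from the patch -- but it removes the fixed VHOST_IDS_MAX_LEN cap while still letting a recreated socket find its old ifindex:

    #include <string.h>

    #include "hash.h"
    #include "hmap.h"
    #include "util.h"

    /* One entry per vhost-user socket ever seen; entries survive port
     * destruction so that a recreated socket keeps its ifindex. */
    struct vhost_ifindex_node {
        struct hmap_node hmap_node;   /* In 'vhost_ifindex_map'. */
        char *vhost_id;               /* Socket path (hash key). */
        int ifindex;                  /* ifindex assigned to this socket. */
    };

    static struct hmap vhost_ifindex_map OVS_GUARDED_BY(vhost_mutex)
        = HMAP_INITIALIZER(&vhost_ifindex_map);
    static int vhost_ifindex_next OVS_GUARDED_BY(vhost_mutex) = 2000;

    static struct vhost_ifindex_node *
    vhost_ifindex_lookup(const char *vhost_id)
        OVS_REQUIRES(vhost_mutex)
    {
        struct vhost_ifindex_node *node;

        HMAP_FOR_EACH_WITH_HASH (node, hmap_node, hash_string(vhost_id, 0),
                                 &vhost_ifindex_map) {
            /* Full-string compare, so one socket path cannot match
             * another that merely shares a prefix. */
            if (!strcmp(node->vhost_id, vhost_id)) {
                return node;
            }
        }
        return NULL;
    }

    /* Returns the ifindex previously assigned to 'vhost_id', allocating
     * a fresh one the first time the socket is seen.  No fixed cap and
     * no linear scan over all entries. */
    static int
    vhost_ifindex_get(const char *vhost_id)
        OVS_REQUIRES(vhost_mutex)
    {
        struct vhost_ifindex_node *node = vhost_ifindex_lookup(vhost_id);

        if (!node) {
            node = xmalloc(sizeof *node);
            node->vhost_id = xstrdup(vhost_id);
            node->ifindex = vhost_ifindex_next++;
            hmap_insert(&vhost_ifindex_map, &node->hmap_node,
                        hash_string(vhost_id, 0));
        }
        return node->ifindex;
    }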
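Separately, Mark's scenario above (the counter that is never decremented) goes away once there is a matching teardown path. If the persist-across-reboot behaviour were ever dropped, or a cap reintroduced, the vhost-user destruct hook could simply remove the entry -- again an untested sketch using the same assumed names:

    /* Sketch of the complementary teardown path: called from the
     * vhost-user destruct code so entries do not accumulate forever.
     * Note this deliberately trades away ifindex persistence across
     * socket recreation. */
    static void
    vhost_ifindex_remove(const char *vhost_id)
        OVS_REQUIRES(vhost_mutex)
    {
        struct vhost_ifindex_node *node = vhost_ifindex_lookup(vhost_id);

        if (node) {
            hmap_remove(&vhost_ifindex_map, &node->hmap_node);
            free(node->vhost_id);
            free(node);
        }
    }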