Hi Pankaj,
On 9/12/2016 6:55 PM, Pankaj Chauhan wrote: > On 9/9/2016 2:26 PM, Tan, Jianfeng wrote: >> Hi Pankaj, >> >> Thanks for the massive and great work. > > Hi Jianfeng, > > Thanks for the review. >> >> On 9/5/2016 6:54 PM, Pankaj Chauhan wrote: >>> Introduce support for a generic framework for handling of switching >>> between >>> physical and vhost devices. The vswitch framework introduces the >>> following >>> concept: >>> >>> 1. vswitch_dev: Vswitch device is a logical switch which can have >>> physical and >>> virtio devices. The devices are operated/used using standard rte_eth >>> API for >>> physical devices and lib_vhost API for vhost devices, PMD API is not >>> used >>> for vhost devices. >>> >>> 2. vswitch_port: Any physical or virtio device that is added to >>> vswitch. The >>> port can have its own tx/rx functions for doing data transfer, which >>> are exposed >>> to the framework using generic function pointers >>> (vs_port->do_tx/do_rx). This way >>> the generic code can do tx/rx without understanding the type of device >>> (Physical or >>> virtio). Similarly each port has its own functions to select tx/rx >>> queues. The framework >>> provides default tx/rx queue selection functions to the port when port >>> is added >>> (for both physical and vhost devices). But the framework allows the >>> switch implementation >>> to override the queue selection functions (vs_port->get_txq/rxq) if >>> required. >>> >>> 3. vswitch_ops: The ops is set of function pointers which are used to >>> do operations >>> like learning, unlearning, add/delete port, lookup_and_forward. The >>> user of vswitch >>> framework (vhost/main.[c,h])uses these function pointers to perform >>> above mentioned >>> operations, thus it remains agnostic of the underlying implementation. >>> >>> Different switching logics can implement their vswitch_device and >>> vswitch_ops, and >>> register with the framework. This framework makes vhost-switch >>> application scalable >>> in terms of: >>> >>> 1. Different switching logics (one of them is vmdq, vhost/vmdq.[c,h] >>> 2. Number of ports. >>> 3. Policies of selecting ports for rx and tx. >>> >>> Signed-off-by: Pankaj Chauhan <pankaj.chauhan at nxp.com> >> >> After this patch set, how's mapping relationship between cores and >> vswitch_dev? Old vhost example does not have the concept of switch, so >> each core is binded with some VDEVs. Now, we still keep original logic? >> >> (a) If yes, provided that phys device could has no direct relationship >> with vdevs in other switching logics, we may need to bind those physical >> devices to cores too? In that case, switch_worker() will receiving pkts >> from all devices (phys or virtual) and dispatch, which looks like: >> >> switch_worker() { >> FOR each port binding to this core { >> rx(port, pkts[], count); >> vs_lookup_n_fwd( information_needed ); >> } >> } > > Since we support only one switch device running at one time. We bind > the ports of the switchdev to the core. But The switch might have it's > own logic to bind the port to the core. For example > VMDQ only supports one Physical port, other switch can support more > than one Physical port. In order to take care of that i have added two > ops in swithdev_ops: > > 1. vs_sched_rx_port: > > struct vswitch_port *vs_sched_rx_port(struct vswitch_dev *vs_dev, enum > vswitch_port_type ptype, > uint16_t core_id) > > 2. vs_sched_tx_port: > > struct vswitch_port *vs_sched_tx_port(struct vswitch_dev *vs_dev, enum > vswitch_port_type ptype, uint16_t > core_id) > > The idea of providing these functions is that vhost/main requests the > underlying switch implementation to schedule a port for rx or Tx, the > current running core_id is also passed as parameter. So the switch can > take a decision which port to do rx or tx based on core id, and may be > some other custom policy. > > For example VMDQ always returns the one single Physical port in > response to these functions called from any core. The other switch > can return the ports bound to the cores. > > Similarly there are two port operations (vs_port->get_rxq and > vs_port->get_txq), here also we pass core_id as parameter so that > the underlying switch implementation can schedule the rx or tx queue > based on the core_id. > > The above mentioned ops are used in drain_eth_rx() and > do_drain_mbuf_table() (main.c) currently, and they leave binding of > physical port and the queues to the underlying implementation. This > way we can accommodate VMDQ which uses only one physical port and > rxqueues based on VMDQ, OR any other switch which uses multiple > physical port and rx/tx queue scheduling based on some other logic. > > Please suggest if this way of scheduling ports and tx/rx queues is > fine or not? According to above explanation, in VMDQ switch, we cannot schedule two queues (belongs to the same port) on the same core, right? >> >> (b) If no, we may bind each core with n switches? If so, switch_worker() >> also need to be reworked to keep receiving pkts from all ports of the >> switch, and dispatch. >> >> switch_worker() { >> FOR each switch binding to this core { >> FOR each port of switch { >> rx(port, pkts[], count); >> vs_lookup_n_fwd( information_needed ); >> } >> } >> } > Since we currently support only one switch_dev in a running instance of > vhost_switch() we can use binding of ports to core as you suggested in > #(a). OK. > >> >> In all, (1) we'd better not use vdev to find phys dev in switch_worker >> any more; > > I agree with you. I have tried to do following in drain_eth_rx (): > > > core_id = rte_lcore_id(); > rx_port = vs_sched_rx_port(vswitch_dev_g, VSWITCH_PTYPE_PHYS, > core_id); > if (unlikely(!rx_port)) > goto out; > > rxq = rx_port->get_rxq(rx_port, vdev, core_id); > rx_count = rx_port->do_rx(rx_port, rxq, NULL, pkts, > MAX_PKT_BURST); > > Here we don't assume any relation between vdev and the physical device > or rxq. But pass vdev to the underlying switch so that it can choose > the rxq for this vdev. In case of VMDQ it will return VMDQ queue > number for this vdev. In case of any other switch implementation it > can have their own logic to decide rxq. Thanks for explanation. > > (2) we'd better not differentiate phys device and virtual >> device in generic framework (it's just an attribute of vswitch_port. >> >> What do you think? > > I agree with your thought that given the current API in this patchset we > should aim for making switch_worker agnostic of the port type. Ideally > it should look something like this: > > switch_worker() { > > rx_port mask = VSWITCH_PTYPE_PHYS | VSWITCH_PTYPE_PHYS; > > rx_port = vs_sched_rx_port(vswit_dev_g, rx_port_mask, core_id) > rx_q = rx_port->get_rxq(vs_port, vdev, code_id); > rx_port->do_rx(rx_port, rxq, NULL, pktss, MAX_PKT_BURST); Can we hide queues inside struct vswitch_port? I mean: For VMDQ switch, treat (port_id, queue_id) as a vswitch_port, so far you've already stored "struct vhost_dev *" into vswitch_port.priv when it's a virtual port, how about store queue_id into vswitch_port.priv when it's a physical port. For arp_learning switch, make (port_id, all_enabled_queues) as a vswitch_port. Summarize above two: we treat (port_id, all_enabled_queues[]) as a vswitch_port. How about it? > > vs_lookup_n_fwd(rx_port, pkts, rx_count, rxq); > > } > > Here i have two problems in achieving switch_worker as mentioned above: > > 1. while doing Rx from physical port, we need the associated vdev to > know the VMDQ rx_q. That was the reason i kept current logic of > keeping vdevs bound to the core in the switch_worker. If we don't > keep list of vdevs assigned to the core, how we' will get the rxq on > phsyical device in case of VMDQ switch implementation. I think my proposal above can solve this problem. > > 2. drain_mbuf_table() currently assumes that there is only one > physical device on which we have to transmit. We may have to rethink > where to keep the tx queues (&lcore_tx_queue[core_id]), i case we have > multiple physical ports we might have to move tx queues to each port. This function, drain_mbuf_table(), and its related data structures, are used for cache on tx direction. But it's not only useful for physical port, but also virtual port (vhost_dev). I suggest to make such cache a field of per vswitch_port. And each again, all vswitch_ports are connected to a core, each core will periodically clean those vswitch_ports. Thanks, Jianfeng > > Since i didn't have good solution currently for #1 and #2 mentioned > above i kept assumption of one single physical port as it is and tried > to integrate VMDQ in the framework with leaving these assumptions (thus > the main structure of switch_worker) as it it. > > Please suggest if you have some solution to above mentioned points in > your mind. Ideally i agree we should have switch_worker() as i > mentioned in pseudo-code i gave above. >> >> Thanks, >> Jianfeng >> > >