On 07.07.2016 03:27, Daniele Di Proietto wrote:
> Hi Ilya,
>
> apologies for the delay
>
> On 20/06/2016 07:22, "Ilya Maximets" <i.maxim...@samsung.com> wrote:
>
>> On 11.06.2016 02:53, Daniele Di Proietto wrote:
>>> On 02/06/2016 06:55, "Ilya Maximets" <i.maxim...@samsung.com> wrote:
>>>
>>>> Hi, Daniele.
>>>> Thanks for the review.
>>>>
>>>> On 02.06.2016 04:33, Daniele Di Proietto wrote:
>>>>> Hi Ilya,
>>>>>
>>>>> apologies for the delay.
>>>>>
>>>>> I didn't take an extremely detailed look at this series, but I have
>>>>> a few high-level comments.
>>>>>
>>>>> Thanks for adding a command to configure the rxq affinity. Have
>>>>> you thought about using the database instead? I think it will
>>>>> be easier to use because it survives restarts, and one can batch
>>>>> the affinity assignment for multiple ports without explicitly
>>>>> calling pmd-reconfigure. I'm not sure what the best interface
>>>>> would look like. Perhaps a string in Interface:other_config that
>>>>> maps rxqs to core ids?
>>>>>
>>>>> I'd prefer to avoid exporting an explicit command like
>>>>> dpif-netdev/pmd-reconfigure. If we use the database we don't have to,
>>>>> right?
>>>>
>>>> I thought about a solution based on the database. Actually, I can't
>>>> see a big difference between the database and appctl in this case.
>>>> For automatic usage both commands may be scripted, but for manual
>>>> pinning these approaches are equally uncomfortable.
>>>> IMHO, if it is the database, it shouldn't be a per-'Interface'
>>>> string with a mapping, because one mapping influences other ports
>>>> (core isolation). There is also an issue with synchronization with
>>>> 'pmd-cpu-mask' that has to be performed manually anyway.
>>>> The appctl command may be changed to receive a string with all the
>>>> mappings and trigger reconfiguration. In that case there will be no
>>>> need for an explicit 'dpif-netdev/pmd-reconfigure'.
>>>
>>> Do we really need to implement core isolation? I'd prefer an
>>> interface where, if an interface has an affinity, we enforce that (as
>>> far as we can with the current pmd-cpu-mask), and for other
>>> interfaces we keep the current model.
>>> Probably there are some limitations I'm not seeing with this model.
>>
>> Generally, core isolation prevents polling of other ports on a PMD
>> thread. This is useful to keep a constant polling rate on some
>> performance-critical port while other ports are added or deleted.
>> Without isolation we would need to pin every single port to achieve
>> the desired level of performance.
>>
>>> I'd prefer to keep the mapping in the database because it's more in
>>> line with the rest of the OVS configuration. The database survives
>>> crashes, restarts and reboots.
>>
>> Ok. How about something like this:
>>
>> * Per-port database entry for available core ids:
>>
>>   # ovs-vsctl set interface <iface> \
>>                   other_config:pmd-rxq-affinity=<rxq-affinity-list>
>>
>>   where:
>>     <rxq-affinity-list> ::= NULL | <non-empty-list>
>>     <non-empty-list>    ::= <affinity-pair> |
>>                             <affinity-pair> , <non-empty-list>
>>     <affinity-pair>     ::= <queue-id> : <core-id>
>>
>>   Example:
>>
>>   # ./bin/ovs-vsctl set interface dpdk0 options:n_rxq=4 \
>>                     other_config:pmd-rxq-affinity="0:3,1:7,3:8"
>>     Queue #0 pinned to core 3;
>>     Queue #1 pinned to core 7;
>>     Queue #2 not pinned.
>>     Queue #3 pinned to core 8;
>
> Unless someone has better ideas, this looks good.
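
Just to make the accepted format concrete, below is a minimal, self-contained
sketch of how such an <rxq-affinity-list> string could be parsed into a
per-queue core map. It is only an illustration: parse_rxq_affinity(),
MAX_RXQS and CORE_UNPINNED are invented names for this example and are not
taken from the actual patch.

/*
 * Minimal sketch (not the actual patch code) of parsing an
 * <rxq-affinity-list> such as "0:3,1:7,3:8".
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RXQS 64
#define CORE_UNPINNED (-1)

/* Fills core_ids[queue] for every "queue:core" pair in 'list'.
 * Returns 0 on success, -1 on malformed input. */
static int
parse_rxq_affinity(const char *list, int core_ids[MAX_RXQS])
{
    char *copy, *pair, *saveptr = NULL;
    int i;

    for (i = 0; i < MAX_RXQS; i++) {
        core_ids[i] = CORE_UNPINNED;   /* Unpinned by default. */
    }
    if (!list || !list[0]) {
        return 0;                      /* Empty list: nothing pinned. */
    }

    copy = strdup(list);
    for (pair = strtok_r(copy, ",", &saveptr); pair;
         pair = strtok_r(NULL, ",", &saveptr)) {
        int queue, core;

        if (sscanf(pair, "%d:%d", &queue, &core) != 2
            || queue < 0 || queue >= MAX_RXQS || core < 0) {
            free(copy);
            return -1;                 /* Malformed <affinity-pair>. */
        }
        core_ids[queue] = core;
    }
    free(copy);
    return 0;
}

int
main(void)
{
    int core_ids[MAX_RXQS];

    if (!parse_rxq_affinity("0:3,1:7,3:8", core_ids)) {
        for (int q = 0; q < 4; q++) {
            if (core_ids[q] == CORE_UNPINNED) {
                printf("Queue #%d not pinned.\n", q);
            } else {
                printf("Queue #%d pinned to core %d;\n", q, core_ids[q]);
            }
        }
    }
    return 0;
}

For the example string "0:3,1:7,3:8" this prints exactly the pinning summary
shown above, with queue #2 left unpinned.
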
Ok. I'll implement this in the third (last) part of this patch set.

>>
>> * Configurable mask of isolated PMD threads:
>>
>>   # ./bin/ovs-vsctl set Open_vSwitch . \
>>                     other_config:pmd-isol-cpus=<core-list>
>>   Empty means "none".
>
> I still think this looks kind of complicated. These are the options:
>
> 1) Do not deal with isolation. If some "isolation" is required, the user
>    has to assign the rxqs of every port.
>
> 2) Automatically isolate cores that have rxqs explicitly assigned to them.
>
> 3) Add a pmd-isol-cpus parameter.
>
> 4) Add a rxq-default-affinity parameter (the opposite of pmd-isol-cpus).
>
> 1) and 2) only add a single configuration parameter, but
> 1) puts the burden on the controller (as you point out), and 2) is
> not too flexible.
>
> 3) and 4) are equivalent; they allow more flexibility, but add two
> configuration parameters.
>
> If you think more flexibility is required, we can go for 3) or 4).
>
> Otherwise we can probably try implementing 2).
>
> Are there better ideas?

I think we can implement 2) for now. In the future we may abandon core
isolation after implementing some kind of automatic load balancer.

>> * A pinned RX queue can be polled only by the PMD thread on the core
>>   to which it is pinned.
>>
>> * Only pinned RX queues can be polled by an isolated PMD thread.
>>>>
>>>>> I'm not sure what's the best way to introduce XPS in OVS. First of all,
>>>>> for physical NICs I'd want to keep the current model where possible
>>>>> (one queue per core, no locking). Maybe we can introduce some mapping
>>>>> just for vhost-user ports in netdev-dpdk to implement packet steering?
>>>>
>>>> We can just set the default values for 'n_txq' more accurately:
>>>> 'n_cores() + 1' for phy ports and '1' for virtual ones. To avoid
>>>> locking on TX to phy ports we may just check that
>>>> 'n_txq' >= 'n_cores() + 1'. We can do that because reconfiguration is
>>>> required to change 'n_txq' and, in the current implementation of XPS,
>>>> one queue will not be used twice if we have unused queues.
>>>
>>> Ok, if it can be done without any overhead for phy ports, I'm fine with
>>> that.
>>>
>>>>
>>>>> Also, have you considered packet reordering issues? If a 5-tuple is
>>>>> processed by one core and we often change the tx queue, we end up
>>>>> with the same 5-tuple on two different tx queues.
>>>>
>>>> To avoid reordering issues there is a timeout mechanism inside XPS.
>>>> The current tx queue_id may be changed only if there were no packets
>>>> sent to this queue for a significant amount of time (XPS_CYCLES).
>>>
>>> Ok, no problem then.
>>>
>>>>
>>>>> Lastly, I'm not 100% sure we need a "n-txq" parameter. I think that
>>>>> for vhost-user ports, we know the value in new_device() (we should
>>>>> do that for rxqs too now that we have netdev_reconfigure). Physical
>>>>> NICs' txqs should match the cpu cores and avoid locking, where
>>>>> possible.
>>>>
>>>> Actually, we don't need this if the default values are set as
>>>> described above and a reconfiguration request is used inside
>>>> 'new_device()' (I guess this should be implemented separately).
>>>> But it's harmless and, actually, almost free to implement. Maybe
>>>> it can be used for some specific cases or devices.
>>>
>>> In general, we try to limit the knobs that a user can play with,
>>> unless they provide some value.
>>>
>>> In this case, if we need something for netdev-dummy, we can implement
>>> it explicitly for it and not expose it otherwise.
>>
>> Ok. I agree. We can introduce something like 'options:dummy_n_txq' for
>> dummy interfaces only.
>>
>> What do you think?
>
> Agreed
>
> Thanks,
>
> Daniele
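
P.S. To illustrate the XPS timeout mechanism mentioned above (the tx queue
used for a port is re-evaluated only after it has been idle for more than
XPS_CYCLES, so a single 5-tuple does not bounce between two tx queues), here
is a rough sketch. The structure, the function and the constant value are
invented for this example and do not correspond to the actual patch.

/*
 * Sketch of the XPS timeout idea: keep the current tx queue while it
 * is in use, and allow switching only after it has been idle long
 * enough to avoid packet reordering within a flow.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define XPS_CYCLES 1000000ULL          /* "Significant amount of time". */

struct tx_port {
    int qid;                           /* Currently used tx queue, or -1. */
    uint64_t last_used_cycles;         /* Last time we sent on 'qid'. */
};

/* Returns the tx queue to use for 'port' at time 'now_cycles',
 * switching to 'preferred_qid' only if the old queue went idle. */
static int
xps_get_tx_qid(struct tx_port *port, int preferred_qid, uint64_t now_cycles)
{
    bool expired = now_cycles - port->last_used_cycles > XPS_CYCLES;

    if (port->qid < 0 || expired) {
        port->qid = preferred_qid;     /* Safe to move: queue was idle. */
    }
    port->last_used_cycles = now_cycles;
    return port->qid;
}

int
main(void)
{
    struct tx_port port = { .qid = -1, .last_used_cycles = 0 };

    /* First send: no queue assigned yet, take the preferred one (2). */
    printf("qid = %d\n", xps_get_tx_qid(&port, 2, 100));
    /* Shortly after, a different preference is ignored (queue busy). */
    printf("qid = %d\n", xps_get_tx_qid(&port, 5, 200));
    /* After the queue has been idle for > XPS_CYCLES, it may change. */
    printf("qid = %d\n", xps_get_tx_qid(&port, 5, 200 + XPS_CYCLES + 1));
    return 0;
}

The three calls in main() show the intended behaviour: the queue is kept
while it is busy and may only change after it has been idle long enough.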