> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Tuesday, November 13, 2018 10:44 AM
> To: Zhang, Qi Z <qi.z.zh...@intel.com>; Thomas Monjalon <tho...@monjalon.net>
> Cc: dev@dpdk.org; Lin, Xueqin <xueqin....@intel.com>
> Subject: Re: [PATCH v2] net/pcap: enable data path on secondary
>
> On 11/13/2018 6:27 PM, Zhang, Qi Z wrote:
> > First, apologies for doing this in a rush; I was under some pressure
> > to make pdump work in 18.11. I agree there are a lot of things to
> > improve, but the strategy here is to make it work quietly and not
> > break anything else :) Some comments inline.
> >
> >> -----Original Message-----
> >> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> >> Sent: Tuesday, November 13, 2018 9:15 AM
> >> To: Yigit, Ferruh <ferruh.yi...@intel.com>; Zhang, Qi Z
> >> <qi.z.zh...@intel.com>
> >> Cc: dev@dpdk.org; Lin, Xueqin <xueqin....@intel.com>
> >> Subject: Re: [PATCH v2] net/pcap: enable data path on secondary
> >>
> >> Just a quick comment:
> >> There are probably some ideas to take from what was done for tap.
> >>
> >> 13/11/2018 17:56, Ferruh Yigit:
> >>> On 11/12/2018 4:51 PM, Qi Zhang wrote:
> >>>> A private vdev on secondary is not supported by the new shared
> >>>> device mode, but pdump still relies on a private pcap PMD on
> >>>> secondary. The patch enables the pcap PMD's data path on
> >>>> secondary so that pdump can work as usual.
> >>>
> >>> It would be great if you described the problem a little more.
> >>>
> >>> Private vdev was the way previously, when pdump was developed; now,
> >>> with shared device mode on virtual devices, the pcap data path in
> >>> secondary is not working.
> >>>
> >>> What exactly is not working (please correct me if I am wrong):
> >>> When secondary adds a virtual device, the related data is
> >>> transferred to primary, and primary creates the device and shares
> >>> it back with secondary. When the pcap device is created in primary,
> >>> the pcap handlers (pointers) are process-local and are not valid
> >>> for the secondary process. This breaks secondary.
> >>>
> >>> So we can't directly share the pcap handlers, but need to create a
> >>> new set of handlers for secondary; that is what you are doing in
> >>> this patch, although I have some comments, please check below.
> >>>
> >>> Since there is a single storage for pcap handlers, shared by
> >>> primary and secondary, and they can't share the handlers, you can't
> >>> make the data path work in both primary and secondary. Freeing the
> >>> handlers is another concern. What is needed is
> >>> `rte_eth_dev->process_private`, which has been added in this
> >>> release.
> >
> > You are right, we should prevent the handlers opened in primary from
> > being corrupted during probe on secondary.
> > Now I see this problem in pcap; as an example:
> > internals->tx_queue[i].dumper/pcap is shared but will be overwritten
> > on secondary. We should fix them by using process_private.
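To make sure we are on the same page, below is a rough sketch of how I
understand the process_private approach (untested; the struct, field
names and queue bound are illustrative, not from the current driver):

#include <errno.h>
#include <stdlib.h>
#include <pcap.h>
#include <rte_ethdev.h>

#define SKETCH_MAX_QUEUES 16 /* illustrative bound only */

/*
 * Per-process pcap handles: this struct lives in process-local memory,
 * so primary and secondary each keep their own open handles, while the
 * shared 'internals' keeps only what is valid across processes (queue
 * counts, names, stats).
 */
struct pmd_process_private {
	pcap_t *rx_pcap[SKETCH_MAX_QUEUES];
	pcap_t *tx_pcap[SKETCH_MAX_QUEUES];
	pcap_dumper_t *tx_dumper[SKETCH_MAX_QUEUES];
};

/*
 * calloc(), not rte_malloc(): the allocation must not land in shared
 * hugepage memory, otherwise secondary would overwrite primary again.
 */
static int
pcap_alloc_process_private(struct rte_eth_dev *eth_dev)
{
	struct pmd_process_private *pp = calloc(1, sizeof(*pp));

	if (pp == NULL)
		return -ENOMEM;
	eth_dev->process_private = pp;
	return 0;
}

Each process would then open its own pcap/dumper handles into its
private copy, and the Rx/Tx burst paths would take the handles from
eth_dev->process_private instead of from the shared internals.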
> >
> >>>
> >>>> Signed-off-by: Qi Zhang <qi.z.zh...@intel.com>
> >>>> Tested-by: Yufeng Mo <yufengx...@intel.com>
> >>>
> >>> <...>
> >>>
> >>>> @@ -934,6 +935,10 @@ pmd_init_internals(struct rte_vdev_device *vdev,
> >>>>  	 */
> >>>>  	(*eth_dev)->dev_ops = &ops;
> >>>>
> >>>> +	/* store a copy of devargs for secondary process */
> >>>> +	strlcpy(internals->devargs, rte_vdev_device_args(vdev),
> >>>> +		ETH_PCAP_ARG_MAXLEN);
> >>>
> >>> Why do we need to cover this at the PMD level?
> >>>
> >>> Why isn't the secondary probe getting devargs? Can't we fix this at
> >>> the eal level? It can be OK to work around it in the PMD given
> >>> where we are in the release, but for the long term I think this
> >>> should be fixed in eal.
> >
> > Yes, this is the workaround for a quick fix.
> > Ideally the secondary process should not care about devargs, it
> > should just attach. And it is better to parse devargs in only one
> > process (the primary); the parsed result could be stored as an
> > intermediate result in shared memory (for example,
> > internals->nb_rx_queue_required), so the secondary process does not
> > need to parse it again.
> >>>
> >>> <...>
> >>>
> >>>> @@ -1122,23 +1126,37 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
> >>>>  	start_cycles = rte_get_timer_cycles();
> >>>>  	hz = rte_get_timer_hz();
> >>>>
> >>>> -	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
> >>>> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> >>>> +		kvlist = rte_kvargs_parse(rte_vdev_device_args(dev),
> >>>> +					  valid_arguments);
> >>>> +		if (kvlist == NULL)
> >>>> +			return -EINVAL;
> >>>> +		if (rte_kvargs_count(kvlist, ETH_PCAP_IFACE_ARG) == 1)
> >>>> +			nb_rx_queue = 1;
> >>>> +		else
> >>>> +			nb_rx_queue =
> >>>> +				rte_kvargs_count(kvlist,
> >>>> +					ETH_PCAP_RX_PCAP_ARG) ? 1 : 0;
> >>>> +		nb_tx_queue = 1;
> >>>
> >>> This part is wrong. The pcap PMD supports multiple queues, you
> >>> can't hardcode the number of queues. Also for Tx, why does it
> >>> ignore the `rx_iface` argument? This is just hacking the driver for
> >>> a specific use case, breaking others.
> >
> > Previously nb_tx_queue and nb_rx_queue were decided by
> > pcaps.num_of_queue and dumpers.num_of_queue.
> > I just can't figure out a way that we can have more than 1 queue
> > during probe; look at the code below.
> >
> > if ETH_PCAP_IFACE_ARG:
> >
> > 	pcaps.num_of_queue = 1;
> > 	dumpers.num_of_queue = 1;
> >
> > else:
> > 	is_rx_pcap = rte_kvargs_count(kvlist, ETH_PCAP_RX_PCAP_ARG) ? 1 : 0;
> > 	pcaps.num_of_queue = 0;
> > 	if (is_rx_pcap) {
> > 		ret = rte_kvargs_process(kvlist, ETH_PCAP_RX_PCAP_ARG,
> > 					 &open_rx_pcap, &pcaps);
> > 		// pcaps.num_of_queue = 1;
> > 	} else {
> > 		ret = rte_kvargs_process(kvlist, NULL,
> > 					 &rx_iface_args_process, &pcaps);
> > 		// pcaps.num_of_queue = 0;
> > 	}
> >
> > 	is_tx_pcap = rte_kvargs_count(kvlist, ETH_PCAP_TX_PCAP_ARG) ? 1 : 0;
> > 	dumpers.num_of_queue = 0;
> >
> > 	if (is_tx_pcap)
> > 		ret = rte_kvargs_process(kvlist, ETH_PCAP_TX_PCAP_ARG,
> > 					 &open_tx_pcap, &dumpers);
> > 		// dumpers.num_of_queue = 1
> > 	else
> > 		ret = rte_kvargs_process(kvlist, ETH_PCAP_TX_IFACE_ARG,
> > 					 &open_tx_iface, &dumpers);
> > 		// dumpers.num_of_queue = 1
> >
> > That's the same logic I applied. Did I miss something? Would you
> > explain this more?
>
> ETH_PCAP_IFACE_ARG is the "iface=xxx" usage, where both Rx and Tx use
> the same interface; because of an implementation limitation it only
> supports 1 queue.
>
> rx_pcap, rx_iface and rx_iface_in support multiple queues, by
> providing the argument multiple times. For example
> "rx_pcap=q1.pcap,rx_pcap=q2.pcap,rx_pcap=q3.pcap" will create 3 Rx
> queues, each having its own .pcap file. The same is valid for Tx.
>
> rte_kvargs_process() calls the callback function once per argument
> provided, so if an argument is provided multiple times, it will call
> the same callback multiple times; that is why 'num_of_queue' is
> increased in the callback functions.

Ok, this is what I missed: I didn't notice that pcap devargs take
advantage of using the same key multiple times.
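For my own notes, a minimal sketch of that pattern (callback and struct
names are illustrative, not the driver's actual ones):

#include <rte_common.h>
#include <rte_kvargs.h>

#define SKETCH_MAX_QUEUES 16 /* illustrative bound only */

struct queue_args {
	unsigned int num_of_queue;
	const char *name[SKETCH_MAX_QUEUES];
};

/*
 * Invoked once per "rx_pcap=..." occurrence in the devargs string, so
 * "rx_pcap=q1.pcap,rx_pcap=q2.pcap" results in two calls and therefore
 * two Rx queues.
 */
static int
add_rx_queue(const char *key __rte_unused, const char *value,
	     void *extra_args)
{
	struct queue_args *qargs = extra_args;

	if (qargs->num_of_queue >= SKETCH_MAX_QUEUES)
		return -1;
	qargs->name[qargs->num_of_queue++] = value;
	return 0;
}

/* usage: rte_kvargs_process(kvlist, "rx_pcap", add_rx_queue, &qargs); */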
BTW, I also noticed that it may not be necessary to estimate the queue
number and call pmd_init_internals before opening any pcap file or
iface, as you mentioned. The reason I did this is that I previously
worried a pcap file could be locked by one process, preventing another
process from accessing it, so I thought about adding an "owner" to the
devargs to make sure only one process would access the files. But now
that I have figured out this is not necessary, the queue estimation
part should be removed and rolled back to the simple solution as you
suggested.

Thanks for the help!
Qi

> In high-level, pmd_pcap_probe() first parses the arguments and
> creates pcap handlers based on the arguments, and later, as the last
> thing, creates the ethdev using that information. I am for keeping
> this logic; doing something different for secondary can cause issues
> in edge cases not obvious at first look.
> >
> > Thanks
> > Qi
> >
> >>>
> >>>> +		ret = pmd_init_internals(dev, nb_rx_queue,
> >>>> +					 nb_tx_queue, &eth_dev);
> >>>
> >>> I think it is not required to move pmd_init_internals() here.
> >>> This can be done simpler; I will send a draft patch as a reply to
> >>> this mail as a possible solution.
> >>> But again that can't be the final solution, we need to use
> >>> `process_private`.
> >>>
> >>> <...>
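PS: for the simple solution, this is roughly the secondary path I have
in mind (rough sketch, untested; it reuses struct pmd_process_private
from the sketch above, and ops/eth_pcap_rx/eth_pcap_tx stand for the
driver's ops table and burst functions):

#include <errno.h>
#include <stdlib.h>
#include <rte_bus_vdev.h>
#include <rte_ethdev_driver.h>

/*
 * In secondary, do not re-create the device: attach to the ethdev the
 * primary created, give this process its own pcap handles via
 * process_private, and wire up the burst functions locally.
 */
static int
pmd_pcap_probe_secondary(struct rte_vdev_device *dev)
{
	const char *name = rte_vdev_device_name(dev);
	struct rte_eth_dev *eth_dev = rte_eth_dev_attach_secondary(name);
	struct pmd_process_private *pp;

	if (eth_dev == NULL)
		return -ENODEV;

	pp = calloc(1, sizeof(*pp)); /* process-local, as in the sketch above */
	if (pp == NULL)
		return -ENOMEM;
	eth_dev->process_private = pp;

	/* re-open the pcap files/ifaces into pp here, then: */
	eth_dev->dev_ops = &ops;
	eth_dev->rx_pkt_burst = eth_pcap_rx;
	eth_dev->tx_pkt_burst = eth_pcap_tx;

	rte_eth_dev_probing_finish(eth_dev);
	return 0;
}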