14/09/2022 03:23, Chaoyong He: > > On 9/13/2022 7:51 AM, Chaoyong He wrote: > > > > > >> On 9/9/2022 3:36 AM, Chaoyong He wrote: > > >>>> On 9/8/2022 9:44 AM, Chaoyong He wrote: > > >>>>> Adds the vNIC initialization logic for the flower PF vNIC. The > > >>>>> flower firmware exposes this vNIC for the purposes of fallback > > >>>>> traffic in the switchdev use-case. > > >>>>> > > >>>>> Adds minimal dev_ops for this PF device. Because the device is > > >>>>> being exposed externally to DPDK it should also be configured > > >>>>> using DPDK helpers like rte_eth_configure(). For these helpers to > > >>>>> work the flower logic needs to implements a minimal set of dev_ops. > > >>>>> > > >>>>> Signed-off-by: Chaoyong He <chaoyong...@corigine.com> > > >>>>> Reviewed-by: Niklas Söderlund <niklas.soderl...@corigine.com> > > >> > > >> <...> > > >> > > >>>>> +static int > > >>>>> +nfp_flower_init_pf_vnic(struct nfp_net_hw *hw) { int ret; > > >>>>> +uint16_t i; uint16_t n_txq; uint16_t n_rxq; uint16_t port_id; > > >>>>> +unsigned int numa_node; struct rte_mempool *mp; struct > > >>>>> +nfp_pf_dev *pf_dev; struct rte_eth_dev *eth_dev; struct > > >>>>> +nfp_app_fw_flower *app_fw_flower; > > >>>>> + > > >>>>> + static const struct rte_eth_conf port_conf = { > > >>>>> + .rxmode = { > > >>>>> + .mq_mode = RTE_ETH_MQ_RX_RSS, > > >>>>> + .offloads = RTE_ETH_RX_OFFLOAD_CHECKSUM, > > >>>>> + }, > > >>>>> + .txmode = { > > >>>>> + .mq_mode = RTE_ETH_MQ_TX_NONE, > > >>>>> + }, > > >>>>> + }; > > >>>>> + > > >>>>> + /* Set up some pointers here for ease of use */ pf_dev = > > >>>>> + hw->pf_dev; app_fw_flower = > > NFP_PRIV_TO_APP_FW_FLOWER(pf_dev- > > >>>>> app_fw_priv); > > >>>>> + > > >>>>> + /* > > >>>>> + * Perform the "common" part of setting up a flower vNIC. > > >>>>> + * Mostly reading configuration from hardware. > > >>>>> + */ > > >>>>> + ret = nfp_flower_init_vnic_common(hw, "pf_vnic"); if (ret != 0) > > >>>>> + goto done; > > >>>>> + > > >>>>> + hw->eth_dev = rte_eth_dev_allocate("nfp_pf_vnic"); > > >>>>> + if (hw->eth_dev == NULL) { > > >>>>> + ret = -ENOMEM; > > >>>>> + goto done; > > >>>>> + } > > >>>>> + > > >>>>> + /* Grab the pointer to the newly created rte_eth_dev here */ > > >>>>> + eth_dev = hw->eth_dev; > > >>>>> + > > >>>>> + numa_node = rte_socket_id(); > > >>>>> + > > >>>>> + /* Fill in some of the eth_dev fields */ eth_dev->device = > > >>>>> + &pf_dev->pci_dev->device; eth_dev->data->dev_private = hw; > > >>>>> + > > >>>>> + /* Create a mbuf pool for the PF */ > > >>>>> + app_fw_flower->pf_pktmbuf_pool = nfp_flower_pf_mp_create(); > > if > > >>>>> + (app_fw_flower->pf_pktmbuf_pool == NULL) { > > >>>>> + ret = -ENOMEM; > > >>>>> + goto port_release; > > >>>>> + } > > >>>>> + > > >>>>> + mp = app_fw_flower->pf_pktmbuf_pool; > > >>>>> + > > >>>>> + /* Add Rx/Tx functions */ > > >>>>> + eth_dev->dev_ops = &nfp_flower_pf_vnic_ops; > > >>>>> + > > >>>>> + /* PF vNIC gets a random MAC */ > > >>>>> + eth_dev->data->mac_addrs = rte_zmalloc("mac_addr", > > >>>> RTE_ETHER_ADDR_LEN, 0); > > >>>>> + if (eth_dev->data->mac_addrs == NULL) { > > >>>>> + ret = -ENOMEM; > > >>>>> + goto mempool_cleanup; > > >>>>> + } > > >>>>> + > > >>>>> + rte_eth_random_addr(eth_dev->data->mac_addrs->addr_bytes); > > >>>>> + rte_eth_dev_probing_finish(eth_dev); > > >>>>> + > > >>>>> + /* Configure the PF device now */ n_rxq = hw->max_rx_queues; > > >>>>> + n_txq = hw->max_tx_queues; port_id = hw->eth_dev->data- > > >port_id; > > >>>>> + > > >>>>> + ret = rte_eth_dev_configure(port_id, n_rxq, n_txq, &port_conf); > > >>>> > > >>>> Still not sure about PMD calling 'rte_eth_dev_configure()', can you > > >>>> please give more details on what specific configuration is expected > > >>>> with > > >> that call? > > >>> > > >>> The main configuration we need is the number of rx/tx queue. > > >>> So we should use the internal api `eth_dev_rx/tx_queue_config` to > > instead? > > >>> > > >> > > >> nb_rx_q/nb_tx_q are parameters provided by user (via > > >> rte_eth_dev_configure()), won't is wrong for PMD to set a value on its > > own? > > >> > > >> Why nb_rx_q/nb_tx_q are required in the probe() stage? Probe stage is > > >> not to configure the device. > > > > > > Our nfp card use `control message` to exchange message between PMD > > and firmware when we use flower firmware. > > > The control message is in the form of pkt and we use a `ctrl vNIC` ehtdev > > > as > > the agent to send and receive these pkts. > > > e.g., if we want to create representor port, the PMD must send the > > corresponding control message to firmware. > > > > > > This `ctrl vNIC` is totally user app Invisible, to make it able to send > > > and > > receive pkt, we must do some configure steps to this ethdev ourselves > > firstly. > > > We can don't use 'rte_eth_dev_configure()', but we still should do the > > needed configure steps to make sure the device can work. > > > > > > > How ethdev instance becomes app invisible, unless owner set properly I > > think apps can able to see it. > > And if needs to be internal, do you really need to create an ethdev? Why not > > communicate with HW via internal functions? > > > > Cc'ed Jerin, their driver also communicate with FW before probing. > > @Jerin, can you please review this patch and help on the design? > > > > @Thomas, @Andrew, can you please check the design too?
I agree with Ferruh. ethdev ports are for upper levels. > There still main problem for now: > 1. Driver shouldn't need to update the bus related data struct. I don't see how it's related. Just do your communication as device-specific object. The EAL device is to manage HW resource, this is what you should extend. The ethdev port is to expose Rx/Tx to the application. > 2. Probe stage should not configure the device. When probing, you need to initialize communication with HW. > And the first problem won't exist if we solve the second problem. > > So, how about we move all the logics relates to configuration into service? > According to the logic in rte_eal_init(): > ``` > ... > if (rte_bus_probe()) { > ... > ret = rte_service_start_with_defaults(); > if (ret < 0 && ret != -ENOTSUP) { > ... > ``` > Then this can guarantee that these logics run after the probe stage. No, you can manage all in your driver with touching the rest of DPDK. > I'll send a new version patch series based on this, and we can continue > discussing about the best choice. Please try what is described above.