Hi Maxime, > -----Original Message----- > From: Maxime Coquelin <maxime.coque...@redhat.com> > Sent: Wednesday, September 27, 2023 6:40 AM > To: Chautru, Nicolas <nicolas.chau...@intel.com>; dev@dpdk.org > Cc: hemant.agra...@nxp.com; david.march...@redhat.com; Vargas, Hernan > <hernan.var...@intel.com> > Subject: Re: [PATCH v2 6/7] baseband/acc: introduce the new VRB2 variant > > > > On 9/21/23 22:43, Nicolas Chautru wrote: > > This extends the unified driver to support both the > > VRB1 and VRB2 implementations of Intel vRAN Boost. > > > > Signed-off-by: Nicolas Chautru <nicolas.chau...@intel.com> > > --- > > doc/guides/bbdevs/index.rst | 1 + > > doc/guides/bbdevs/vrb2.rst | 269 +++++ > > doc/guides/rel_notes/release_23_11.rst | 3 + > > drivers/baseband/acc/acc_common.h | 84 +- > > drivers/baseband/acc/rte_acc100_pmd.c | 4 +- > > drivers/baseband/acc/rte_vrb_pmd.c | 1442 +++++++++++++++++++++--- > > drivers/baseband/acc/vrb1_pf_enum.h | 17 +- > > drivers/baseband/acc/vrb2_pf_enum.h | 124 ++ > > drivers/baseband/acc/vrb2_vf_enum.h | 121 ++ > > drivers/baseband/acc/vrb_pmd.h | 161 ++- > > 10 files changed, 2062 insertions(+), 164 deletions(-) > > create mode 100644 doc/guides/bbdevs/vrb2.rst > > create mode 100644 drivers/baseband/acc/vrb2_pf_enum.h > > create mode 100644 drivers/baseband/acc/vrb2_vf_enum.h > > > As a general comment, the review process would be easier if the patch was > split by features introduced: > - Base init, no capabilities exposed > - Add LDPC capabilities and related code > - ADD FFT capabilities and related code > - MLD-TS ... > > We had this very discussion last week at the techboard for a Net PMD, and it > was a shared opinion across the board.
Yes, I realize there is more code than I first thought. Will do in v3. > > > diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst > > index 77d4c54664..269157d77f 100644 > > --- a/doc/guides/bbdevs/index.rst > > +++ b/doc/guides/bbdevs/index.rst > > @@ -15,4 +15,5 @@ Baseband Device Drivers > > fpga_5gnr_fec > > acc100 > > vrb1 > > + vrb2 > > la12xx > > diff --git a/doc/guides/bbdevs/vrb2.rst b/doc/guides/bbdevs/vrb2.rst > > new file mode 100644 index 0000000000..8d8e094660 > > --- /dev/null > > +++ b/doc/guides/bbdevs/vrb2.rst > > @@ -0,0 +1,269 @@ > > +.. SPDX-License-Identifier: BSD-3-Clause > > + Copyright(c) 2023 Intel Corporation > > + > > +.. include:: <isonum.txt> > > + > > +Intel\ |reg| vRAN Boost v2 Poll Mode Driver (PMD) > > +================================================= > > + > > +The Intel\ |reg| vRAN Boost integrated accelerator enables > > +cost-effective 4G and 5G next-generation virtualized Radio Access > > +Network (vRAN) solutions. > > +The Intel vRAN Boost v2.0 (VRB2 in the code) is specifically > > +integrated on the Intel\ |reg| Xeon\ |reg| Granite Rapids-D Process (GNR- D). > > + > > +Features > > +-------- > > + > > +Intel vRAN Boost v2.0 includes a 5G Low Density Parity Check (LDPC) > > +encoder/decoder, rate match/dematch, Hybrid Automatic Repeat Request > > +(HARQ) with access to DDR memory for buffer management, a 4G Turbo > > +encoder/decoder, a Fast Fourier Transform (FFT) block providing > > +DFT/iDFT processing offload for the 5G Sounding Reference Signal > > +(SRS), a MLD-TS accelerator, a Queue Manager (QMGR), and a DMA > subsystem. > > +There is no dedicated on-card memory for HARQ, the coherent memory on > the CPU side is being used. > > + > > +These hardware blocks provide the following features exposed by the PMD: > > + > > +- LDPC Encode in the Downlink (5GNR) > > +- LDPC Decode in the Uplink (5GNR) > > +- Turbo Encode in the Downlink (4G) > > +- Turbo Decode in the Uplink (4G) > > +- FFT processing > > +- MLD-TS processing > > +- Single Root I/O Virtualization (SR-IOV) with 16 Virtual Functions > > +(VFs) per Physical Function (PF) > > +- Maximum of 2048 queues per VF > > +- Message Signaled Interrupts (MSIs) > > + > > +The Intel vRAN Boost v2.0 PMD supports the following bbdev capabilities: > > + > > +* For the LDPC encode operation: > > + - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH``: set to attach CRC24B to CB(s). > > + - ``RTE_BBDEV_LDPC_RATE_MATCH``: if set then do not do Rate Match > bypass. > > + - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS``: if set then bypass > interleaver. > > + - ``RTE_BBDEV_LDPC_ENC_SCATTER_GATHER``: supports scatter-gather > for input/output data. > > + - ``RTE_BBDEV_LDPC_ENC_CONCATENATION``: concatenate code blocks > with bit granularity. > > + > > +* For the LDPC decode operation: > > + - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK``: check CRC24B from CB(s). > > + - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP``: drops CRC24B bits > appended while decoding. > > + - ``RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK``: check CRC24A from CB(s). > > + - ``RTE_BBDEV_LDPC_CRC_TYPE_16_CHECK``: check CRC16 from CB(s). > > + - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE``: provides an input for > HARQ combining. > > + - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE``: provides an input > for HARQ combining. > > + - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE``: disable early > termination. > > + - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER``: supports scatter-gather > for input/output data.
> > + - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION``: supports > compression of the HARQ input/output. > > + - ``RTE_BBDEV_LDPC_LLR_COMPRESSION``: supports LLR input > compression. > > + - ``RTE_BBDEV_LDPC_HARQ_4BIT_COMPRESSION``: supports > compression of the HARQ input/output. > > + - ``RTE_BBDEV_LDPC_SOFT_OUT_ENABLE``: set the APP LLR soft output. > > + - ``RTE_BBDEV_LDPC_SOFT_OUT_RM_BYPASS``: set the APP LLR soft > output after rate-matching. > > + - ``RTE_BBDEV_LDPC_SOFT_OUT_DEINTERLEAVER_BYPASS``: disables the > de-interleaver. > > + > > +* For the turbo encode operation: > > + - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH``: set to attach CRC24B to > CB(s). > > + - ``RTE_BBDEV_TURBO_RATE_MATCH``: if set then do not do Rate Match > bypass. > > + - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS``: set for encoder dequeue > interrupts. > > + - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS``: set to bypass RV index. > > + - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER``: supports scatter- > gather for input/output data. > > + > > +* For the turbo decode operation: > > + - ``RTE_BBDEV_TURBO_CRC_TYPE_24B``: check CRC24B from CB(s). > > + - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE``: perform subblock > de-interleave. > > + - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS``: set for decoder dequeue > interrupts. > > + - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN``: set if negative LLR input is > supported. > > + - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP``: keep CRC24B bits > appended while decoding. > > + - ``RTE_BBDEV_TURBO_DEC_CRC_24B_DROP``: option to drop the code > block CRC after decoding. > > + - ``RTE_BBDEV_TURBO_EARLY_TERMINATION``: set early termination > feature. > > + - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER``: supports scatter- > gather for input/output data. > > + - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN``: set half iteration > granularity. > > + - ``RTE_BBDEV_TURBO_SOFT_OUTPUT``: set the APP LLR soft output. > > + - ``RTE_BBDEV_TURBO_EQUALIZER``: set the turbo equalizer feature. > > + - ``RTE_BBDEV_TURBO_SOFT_OUT_SATURATE``: set the soft output > saturation. > > + - ``RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH``: set to run an extra > odd iteration after CRC match. > > + - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT``: set if negative APP > LLR output supported. > > + - ``RTE_BBDEV_TURBO_MAP_DEC``: supports flexible parallel MAP > engine decoding. > > + > > +* For the FFT operation: > > + - ``RTE_BBDEV_FFT_WINDOWING``: flexible windowing capability. > > + - ``RTE_BBDEV_FFT_CS_ADJUSTMENT``: flexible adjustment of Cyclic Shift > time offset. > > + - ``RTE_BBDEV_FFT_DFT_BYPASS``: set for bypass the DFT and get > directly into iDFT input. > > + - ``RTE_BBDEV_FFT_IDFT_BYPASS``: set for bypass the IDFT and get > directly the DFT output. > > + - ``RTE_BBDEV_FFT_WINDOWING_BYPASS``: set for bypass time domain > windowing. > > + > > +* For the MLD-TS operation: > > + - ``RTE_BBDEV_MLDTS_REP``: set to repeat and reuse channel across > operations. > > + > > +Installation > > +------------ > > + > > +Section 3 of the DPDK manual provides instructions on installing and > compiling DPDK. > > + > > +DPDK requires hugepages to be configured as detailed in section 2 of the > DPDK manual. > > +The bbdev test application has been tested with a configuration 40 x 1GB > hugepages. > > +The hugepage configuration of a server may be examined using: > > + > > +.. 
code-block:: console > > + > > + grep Huge* /proc/meminfo > > + > > + > > +Initialization > > +-------------- > > + > > +When the device first powers up, its PCI Physical Functions (PF) can > > +be listed through these commands for Intel vRAN Boost v2: > > + > > +.. code-block:: console > > + > > + sudo lspci -vd8086:57c2 > > + > > +The physical and virtual functions are compatible with Linux UIO drivers: > > +``vfio`` and ``igb_uio``. > > +However, in order to work the 5G/4G FEC device first needs to be > > +bound to one of these Linux drivers through DPDK. > > + > > + > > +Bind PF UIO driver(s) > > +~~~~~~~~~~~~~~~~~~~~~ > > > > Please document with VFIO instead of IGB_UIO, which is being deprecated. There are still some deployments with igb_uio, so keeping it for a bit. Then we can remove it from all documentation. > > > +Install the DPDK igb_uio driver, bind it with the PF PCI device ID > > +and use ``lspci`` to confirm the PF device is under use by ``igb_uio`` DPDK > UIO driver. > > + > > +The igb_uio driver may be bound to the PF PCI device using one of two > > +methods for Intel vRAN Boost v2: > > + > > +#. PCI functions (physical or virtual, depending on the use case) can > > +be bound to the UIO driver by repeating this command for every function. > > + > > +.. code-block:: console > > + > > + cd <dpdk-top-level-directory> > > + insmod build/kmod/igb_uio.ko > > + echo "8086 57c2" > /sys/bus/pci/drivers/igb_uio/new_id > > + lspci -vd8086:57c2 > > + > > +#. Another way to bind PF with DPDK UIO driver is by using the > > +``dpdk-devbind.py`` tool > > + > > +.. code-block:: console > > + > > + cd <dpdk-top-level-directory> > > + usertools/dpdk-devbind.py -b igb_uio 0000:f7:00.0 > > + > > +where the PCI device ID (example: 0000:f7:00.0) is obtained using ``lspci - > vd8086:57c2``. > > + > > +In a similar way the PF may be bound with vfio-pci as any PCIe device. > > + > > + > > +Enable Virtual Functions > > +~~~~~~~~~~~~~~~~~~~~~~~~ > > + > > +Now, it should be visible in the printouts that PCI PF is under > > +igb_uio control "``Kernel driver in use: igb_uio``" > > + > > +To show the number of available VFs on the device, read ``sriov_totalvfs`` > file. > > + > > +.. code-block:: console > > + > > + cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs > > + > > +where ``0000\:<b>\:<d>.<f>`` is the PCI device ID > > + > > +To enable VFs via igb_uio, echo the number of virtual functions > > +intended to enable to ``max_vfs`` file. > > + > > +.. code-block:: console > > + > > + echo <num-of-vfs> > > > + /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs > > + > > +Afterwards, all VFs must be bound to appropriate UIO drivers as > > +required, same way it was done with the physical function previously. > > + > > +Enabling SR-IOV via VFIO driver is pretty much the same, except that > > +the file name is different: > > + > > +.. code-block:: console > > + > > + echo <num-of-vfs> > > > + /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs > > + > > + > > +Configure the VFs through PF > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > + > > +The PCI virtual functions must be configured before working or > > +getting assigned to VMs/Containers. > > +The configuration involves allocating the number of hardware queues, > > +priorities, load balance, bandwidth and other settings necessary for > > +the device to perform FEC functions. 
> > + > > +This configuration needs to be executed at least once after reboot or > > +PCI FLR and can be achieved by using the functions > > +``rte_acc_configure()``, which sets up the parameters defined in the > compatible ``rte_acc_conf`` structure. > > + > > + > > +Test Application > > +---------------- > > + > > +The bbdev class is provided with a test application, > > +``test-bbdev.py`` and range of test data for testing the > > +functionality of the device, depending on the device's capabilities. > > +The test application is located under app/test-bbdev folder and has > > +the following options: > > + > > +.. code-block:: console > > + > > + "-p", "--testapp-path": specifies path to the bbdev test app. > > + "-e", "--eal-params": EAL arguments which are passed to the test app. > > + "-t", "--timeout": Timeout in seconds (default=300). > > + "-c", "--test-cases": Defines test cases to run. Run all if not > > specified. > > + "-v", "--test-vector": Test vector path. > > + "-n", "--num-ops": Number of operations to process on device > (default=32). > > + "-b", "--burst-size": Operations enqueue/dequeue burst size > > (default=32). > > + "-s", "--snr": SNR in dB used when generating LLRs for bler tests. > > + "-s", "--iter_max": Number of iterations for LDPC decoder. > > + "-l", "--num-lcores": Number of lcores to run (default=16). > > + "-i", "--init-device": Initialise PF device with default values. > > + > > + > > +To execute the test application tool using simple decode or encode > > +data, type one of the following: > > + > > +.. code-block:: console > > + > > + ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data > > + ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data > > + > > + > > +The test application ``test-bbdev.py``, supports the ability to > > +configure the PF device with a default set of values, if the "-i" or > > +"- -init-device" option is included. The default values are defined in > test_bbdev_perf.c. > > + > > + > > +Test Vectors > > +~~~~~~~~~~~~ > > + > > +In addition to the simple LDPC decoder and LDPC encoder tests, bbdev > > +also provides a range of additional tests under the test_vectors > > +folder, which may be useful. > > +The results of these tests will depend on the device capabilities > > +which may cause some test cases to be skipped, but no failure should be > reported. > > + > > + > > +Alternate Baseband Device configuration tool > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > + > > +On top of the embedded configuration feature supported in test-bbdev > > +using > > +"- -init-device" option mentioned above, there is also a tool > > +available to perform that device configuration using a companion > application. > > +The ``pf_bb_config`` application notably enables then to run > > +bbdev-test from the VF and not only limited to the PF as captured above. > > + > > +See for more details: https://github.com/intel/pf-bb-config > > + > > +Specifically for the bbdev Intel vRAN Boost v2 PMD, the command below > > +can be used (note that ACC200 was used previously to refer to VRB2): > > + > > +.. 
code-block:: console > > + > > + pf_bb_config VRB2 -c ./vrb2/vrb2_config_vf_5g.cfg > > + test-bbdev.py -e="-c 0xff0 -a${VF_PCI_ADDR}" -c validation -n 64 > > + -b 64 -l 1 -v ./ldpc_dec_default.data > > diff --git a/doc/guides/rel_notes/release_23_11.rst > > b/doc/guides/rel_notes/release_23_11.rst > > index 333e1d95a2..668dd58ee3 100644 > > --- a/doc/guides/rel_notes/release_23_11.rst > > +++ b/doc/guides/rel_notes/release_23_11.rst > > @@ -78,6 +78,9 @@ New Features > > * build: Optional libraries can now be selected with the new > > ``enable_libs`` > > build option similarly to the existing ``enable_drivers`` build option. > > > > +* **Updated Intel vRAN Boost bbdev PMD.** > > + > > + Added support for the new Intel vRAN Boost v2 device variant (GNR-D) > within the unified driver. > > > > Removed Items > > ------------- > > diff --git a/drivers/baseband/acc/acc_common.h > > b/drivers/baseband/acc/acc_common.h > > index 5de58dbe36..b71292af94 100644 > > --- a/drivers/baseband/acc/acc_common.h > > +++ b/drivers/baseband/acc/acc_common.h > > @@ -18,6 +18,7 @@ > > #define ACC_DMA_BLKID_OUT_HARQ 3 > > #define ACC_DMA_BLKID_IN_HARQ 3 > > #define ACC_DMA_BLKID_IN_MLD_R 3 > > +#define ACC_DMA_BLKID_DEWIN_IN 3 > > > > /* Values used in filling in decode FCWs */ > > #define ACC_FCW_TD_VER 1 > > @@ -103,6 +104,9 @@ > > #define ACC_MAX_NUM_QGRPS 32 > > #define ACC_RING_SIZE_GRANULARITY 64 > > #define ACC_MAX_FCW_SIZE 128 > > +#define ACC_IQ_SIZE 4 > > + > > +#define ACC_FCW_FFT_BLEN_3 28 > > > > /* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */ > > #define ACC_N_ZC_1 66 /* N = 66 Zc for BG 1 */ @@ -132,6 +136,11 @@ > > #define ACC_LIM_21 14 /* 0.21 */ > > #define ACC_LIM_31 20 /* 0.31 */ > > #define ACC_MAX_E (128 * 1024 - 2) > > +#define ACC_MAX_CS 12 > > + > > +#define ACC100_VARIANT 0 > > +#define VRB1_VARIANT 2 > > +#define VRB2_VARIANT 3 > > > > /* Helper macro for logging */ > > #define rte_acc_log(level, fmt, ...) \ @@ -332,6 +341,37 @@ struct > > __rte_packed acc_fcw_fft { > > res:19; > > }; > > > > +/* FFT Frame Control Word. 
*/ > > +struct __rte_packed acc_fcw_fft_3 { > > + uint32_t in_frame_size:16, > > + leading_pad_size:16; > > + uint32_t out_frame_size:16, > > + leading_depad_size:16; > > + uint32_t cs_window_sel; > > + uint32_t cs_window_sel2:16, > > + cs_enable_bmap:16; > > + uint32_t num_antennas:8, > > + idft_size:8, > > + dft_size:8, > > + cs_offset:8; > > + uint32_t idft_shift:8, > > + dft_shift:8, > > + cs_multiplier:16; > > + uint32_t bypass:2, > > + fp16_in:1, > > + fp16_out:1, > > + exp_adj:4, > > + power_shift:4, > > + power_en:1, > > + enable_dewin:1, > > + freq_resample_mode:2, > > + depad_ouput_size:16; > > + uint16_t cs_theta_0[ACC_MAX_CS]; > > + uint32_t cs_theta_d[ACC_MAX_CS]; > > + int8_t cs_time_offset[ACC_MAX_CS]; > > +}; > > + > > + > > /* MLD-TS Frame Control Word */ > > struct __rte_packed acc_fcw_mldts { > > uint32_t fcw_version:4, > > @@ -473,14 +513,14 @@ union acc_info_ring_data { > > uint16_t valid: 1; > > }; > > struct { > > - uint32_t aq_id_3: 6; > > - uint32_t qg_id_3: 5; > > - uint32_t vf_id_3: 6; > > - uint32_t int_nb_3: 6; > > - uint32_t msi_0_3: 1; > > - uint32_t vf2pf_3: 6; > > - uint32_t loop_3: 1; > > - uint32_t valid_3: 1; > > + uint32_t aq_id_vrb2: 6; > > + uint32_t qg_id_vrb2: 5; > > + uint32_t vf_id_vrb2: 6; > > + uint32_t int_nb_vrb2: 6; > > + uint32_t msi_0_vrb2: 1; > > + uint32_t vf2pf_vrb2: 6; > > + uint32_t loop_vrb2: 1; > > + uint32_t valid_vrb2: 1; > > }; > > } __rte_packed; > > > > @@ -765,16 +805,20 @@ alloc_sw_rings_min_mem(struct rte_bbdev *dev, > struct acc_device *d, > > */ > > static inline uint16_t > > get_queue_id_from_ring_info(struct rte_bbdev_data *data, > > - const union acc_info_ring_data ring_data) > > + const union acc_info_ring_data ring_data, uint16_t > device_variant) > > The device variant ID should be derived from the data, which stores device > private pointer. No need to add an extra parameter. The device variant comes directly from dev->, not from dev->data. > > > { > > uint16_t queue_id; > > + struct acc_queue *acc_q; > > + uint16_t vf_id = (device_variant == VRB2_VARIANT) ? > ring_data.vf_id_vrb2 : ring_data.vf_id; > > + uint16_t aq_id = (device_variant == VRB2_VARIANT) ? > ring_data.aq_id_vrb2 : ring_data.aq_id; > > + uint16_t qg_id = (device_variant == VRB2_VARIANT) ? > > +ring_data.qg_id_vrb2 : ring_data.qg_id; > > Wouldn't it be better to have an intermediate representation, and each variant > implements a callback to fill it? > > It will become messy otherwise when other variants will be added. > > Something like the existing queue_offset callback would be great. OK, fair enough.
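For reference, a rough sketch of what that intermediate representation plus per-variant fill callback could look like (illustrative and untested; the struct and function names below are hypothetical, only the union acc_info_ring_data fields come from this patch):

struct acc_ring_event {
	uint16_t aq_id;
	uint16_t qg_id;
	uint16_t vf_id;
	uint16_t int_nb;
};

static inline void
vrb1_ring_event_fill(const union acc_info_ring_data *rd, struct acc_ring_event *ev)
{
	/* VRB1/ACC100 info ring layout. */
	ev->aq_id = rd->aq_id;
	ev->qg_id = rd->qg_id;
	ev->vf_id = rd->vf_id;
	ev->int_nb = rd->int_nb;
}

static inline void
vrb2_ring_event_fill(const union acc_info_ring_data *rd, struct acc_ring_event *ev)
{
	/* VRB2 info ring layout. */
	ev->aq_id = rd->aq_id_vrb2;
	ev->qg_id = rd->qg_id_vrb2;
	ev->vf_id = rd->vf_id_vrb2;
	ev->int_nb = rd->int_nb_vrb2;
}

The device struct would then hold such a fill callback set once at probe time (same pattern as queue_offset), so that get_queue_id_from_ring_info() and the interrupt handlers only consume the decoded fields and never test the variant directly.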
> > > > for (queue_id = 0; queue_id < data->num_queues; ++queue_id) { > > - struct acc_queue *acc_q = > > - data->queues[queue_id].queue_private; > > - if (acc_q != NULL && acc_q->aq_id == ring_data.aq_id && > > - acc_q->qgrp_id == ring_data.qg_id && > > - acc_q->vf_id == ring_data.vf_id) > > + acc_q = data->queues[queue_id].queue_private; > > + > > + if (acc_q != NULL && acc_q->aq_id == aq_id && > > + acc_q->qgrp_id == qg_id && > > + acc_q->vf_id == vf_id) > > return queue_id; > > } > > > > @@ -1436,4 +1480,16 @@ get_num_cbs_in_tb_ldpc_enc(struct > rte_bbdev_op_ldpc_enc *ldpc_enc) > > return cbs_in_tb; > > } > > > > +#ifdef RTE_LIBRTE_BBDEV_DEBUG > > +static inline void > > +acc_memdump(const char *string, void *buf, uint16_t bytes) { > > + printf("%s\n", string); > > + uint32_t *data = buf; > > + uint16_t i; > > + for (i = 0; i < bytes / 4; i++) > > + printf("0x%08X\n", data[i]); > > +} > > For consistency and to avoid code dumplication, please use rte_memdump() > directly as you did in ACC100. I can do that for now; it is just that the format is quite poor. Will change. > > > +#endif > > + > > #endif /* _ACC_COMMON_H_ */ > > diff --git a/drivers/baseband/acc/rte_acc100_pmd.c > > b/drivers/baseband/acc/rte_acc100_pmd.c > > index 5362d39c30..7f8d05b5a9 100644 > > --- a/drivers/baseband/acc/rte_acc100_pmd.c > > +++ b/drivers/baseband/acc/rte_acc100_pmd.c > > @@ -294,7 +294,7 @@ acc100_pf_interrupt_handler(struct rte_bbdev *dev) > > case ACC100_PF_INT_DMA_UL5G_DESC_IRQ: > > case ACC100_PF_INT_DMA_DL5G_DESC_IRQ: > > deq_intr_det.queue_id = > get_queue_id_from_ring_info( > > - dev->data, *ring_data); > > + dev->data, *ring_data, acc100_dev- >device_variant); > > if (deq_intr_det.queue_id == UINT16_MAX) { > > rte_bbdev_log(ERR, > > "Couldn't find queue: aq_id: > %u, qg_id: %u, vf_id: %u", @@ > > -348,7 +348,7 @@ acc100_vf_interrupt_handler(struct rte_bbdev *dev) > > */ > > ring_data->vf_id = 0; > > deq_intr_det.queue_id = > get_queue_id_from_ring_info( > > - dev->data, *ring_data); > > + dev->data, *ring_data, acc100_dev- >device_variant); > > if (deq_intr_det.queue_id == UINT16_MAX) { > > rte_bbdev_log(ERR, > > "Couldn't find queue: aq_id: > %u, qg_id: %u", diff --git > > a/drivers/baseband/acc/rte_vrb_pmd.c > > b/drivers/baseband/acc/rte_vrb_pmd.c > > index e82ed55ca7..a787592ec9 100644 > > --- a/drivers/baseband/acc/rte_vrb_pmd.c > > +++ b/drivers/baseband/acc/rte_vrb_pmd.c > > @@ -37,6 +37,15 @@ vrb1_queue_offset(bool pf_device, uint8_t vf_id, > uint8_t qgrp_id, uint16_t aq_id > > return ((qgrp_id << 7) + (aq_id << 3) + > VRB1_VfQmgrIngressAq); > > } > > > > +static inline uint32_t > > +vrb2_queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, > > +uint16_t aq_id) { > > + if (pf_device) > > + return ((vf_id << 14) + (qgrp_id << 9) + (aq_id << 3) + > VRB2_PfQmgrIngressAq); > > + else > > + return ((qgrp_id << 9) + (aq_id << 3) + > VRB2_VfQmgrIngressAq); } > > + > > enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, FFT, MLD, NUM_ACC}; > > > > /* Return the accelerator enum for a Queue Group Index. 
*/ @@ -197,7 > > +206,7 @@ fetch_acc_config(struct rte_bbdev *dev) > > struct acc_device *d = dev->data->dev_private; > > struct rte_acc_conf *acc_conf = &d->acc_conf; > > uint8_t acc, qg; > > - uint32_t reg_aq, reg_len0, reg_len1, reg0, reg1; > > + uint32_t reg_aq, reg_len0, reg_len1, reg_len2, reg_len3, reg0, reg1, > > +reg2, reg3; > > uint32_t reg_mode, idx; > > struct rte_acc_queue_topology *q_top = NULL; > > int qman_func_id[VRB_NUM_ACCS] = {ACC_ACCMAP_0, > ACC_ACCMAP_1, @@ > > -219,32 +228,81 @@ fetch_acc_config(struct rte_bbdev *dev) > > acc_conf->num_vf_bundles = 1; > > initQTop(acc_conf); > > > > - reg0 = acc_reg_read(d, d->reg_addr->qman_group_func); > > - reg1 = acc_reg_read(d, d->reg_addr->qman_group_func + 4); > > - for (qg = 0; qg < d->num_qgroups; qg++) { > > - reg_aq = acc_reg_read(d, d->queue_offset(d->pf_device, 0, > qg, 0)); > > - if (reg_aq & ACC_QUEUE_ENABLE) { > > - if (qg < ACC_NUM_QGRPS_PER_WORD) > > - idx = (reg0 >> (qg * 4)) & 0x7; > > + if (d->device_variant == VRB1_VARIANT) { > > + reg0 = acc_reg_read(d, d->reg_addr->qman_group_func); > > + reg1 = acc_reg_read(d, d->reg_addr->qman_group_func + 4); > > + for (qg = 0; qg < d->num_qgroups; qg++) { > > + reg_aq = acc_reg_read(d, d->queue_offset(d- > >pf_device, 0, qg, 0)); > > + if (reg_aq & ACC_QUEUE_ENABLE) { > > + if (qg < ACC_NUM_QGRPS_PER_WORD) > > + idx = (reg0 >> (qg * 4)) & 0x7; > > + else > > + idx = (reg1 >> ((qg - > ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7; > > + if (idx < VRB1_NUM_ACCS) { > > + acc = qman_func_id[idx]; > > + updateQtop(acc, qg, acc_conf, d); > > + } > > + } > > + } > > + > > + /* Check the depth of the AQs. */ > > + reg_len0 = acc_reg_read(d, d->reg_addr->depth_log0_offset); > > + reg_len1 = acc_reg_read(d, d->reg_addr->depth_log1_offset); > > + for (acc = 0; acc < NUM_ACC; acc++) { > > + qtopFromAcc(&q_top, acc, acc_conf); > > + if (q_top->first_qgroup_index < > ACC_NUM_QGRPS_PER_WORD) > > + q_top->aq_depth_log2 = > > + (reg_len0 >> (q_top- > >first_qgroup_index * 4)) & 0xF; > > else > > - idx = (reg1 >> ((qg - > ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7; > > - if (idx < VRB1_NUM_ACCS) { > > - acc = qman_func_id[idx]; > > - updateQtop(acc, qg, acc_conf, d); > > + q_top->aq_depth_log2 = (reg_len1 >> ((q_top- > >first_qgroup_index - > > + > ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF; > > + } > > + } else { > > + reg0 = acc_reg_read(d, d->reg_addr->qman_group_func); > > + reg1 = acc_reg_read(d, d->reg_addr->qman_group_func + 4); > > + reg2 = acc_reg_read(d, d->reg_addr->qman_group_func + 8); > > + reg3 = acc_reg_read(d, d->reg_addr->qman_group_func + 12); > > + /* printf("Debug Function %08x %08x %08x %08x\n", reg0, > reg1, reg2, reg3);*/ > > + for (qg = 0; qg < VRB2_NUM_QGRPS; qg++) { > > + reg_aq = acc_reg_read(d, vrb2_queue_offset(d- > >pf_device, 0, qg, 0)); > > + if (reg_aq & ACC_QUEUE_ENABLE) { > > + /* printf("Qg enabled %d %x\n", qg, > reg_aq);*/ > > + if (qg / ACC_NUM_QGRPS_PER_WORD == 0) > > + idx = (reg0 >> ((qg % > ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7; > > + else if (qg / ACC_NUM_QGRPS_PER_WORD == > 1) > > + idx = (reg1 >> ((qg % > ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7; > > + else if (qg / ACC_NUM_QGRPS_PER_WORD == > 2) > > + idx = (reg2 >> ((qg % > ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7; > > + else > > + idx = (reg3 >> ((qg % > ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7; > > + if (idx < VRB_NUM_ACCS) { > > + acc = qman_func_id[idx]; > > + updateQtop(acc, qg, acc_conf, d); > > + } > > } > > } > > - } > > > > - /* Check the depth of the AQs. 
*/ > > - reg_len0 = acc_reg_read(d, d->reg_addr->depth_log0_offset); > > - reg_len1 = acc_reg_read(d, d->reg_addr->depth_log1_offset); > > - for (acc = 0; acc < NUM_ACC; acc++) { > > - qtopFromAcc(&q_top, acc, acc_conf); > > - if (q_top->first_qgroup_index < > ACC_NUM_QGRPS_PER_WORD) > > - q_top->aq_depth_log2 = (reg_len0 >> (q_top- > >first_qgroup_index * 4)) & 0xF; > > - else > > - q_top->aq_depth_log2 = (reg_len1 >> ((q_top- > >first_qgroup_index - > > - ACC_NUM_QGRPS_PER_WORD) * 4)) > & 0xF; > > + /* Check the depth of the AQs. */ > > + reg_len0 = acc_reg_read(d, d->reg_addr->depth_log0_offset); > > + reg_len1 = acc_reg_read(d, d->reg_addr->depth_log0_offset + > 4); > > + reg_len2 = acc_reg_read(d, d->reg_addr->depth_log0_offset + > 8); > > + reg_len3 = acc_reg_read(d, d->reg_addr->depth_log0_offset + > 12); > > + > > + for (acc = 0; acc < NUM_ACC; acc++) { > > + qtopFromAcc(&q_top, acc, acc_conf); > > + if (q_top->first_qgroup_index / > ACC_NUM_QGRPS_PER_WORD == 0) > > + q_top->aq_depth_log2 = (reg_len0 >> ((q_top- > >first_qgroup_index % > > + > ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF; > > + else if (q_top->first_qgroup_index / > ACC_NUM_QGRPS_PER_WORD == 1) > > + q_top->aq_depth_log2 = (reg_len1 >> ((q_top- > >first_qgroup_index % > > + > ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF; > > + else if (q_top->first_qgroup_index / > ACC_NUM_QGRPS_PER_WORD == 2) > > + q_top->aq_depth_log2 = (reg_len2 >> ((q_top- > >first_qgroup_index % > > + > ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF; > > + else > > + q_top->aq_depth_log2 = (reg_len3 >> ((q_top- > >first_qgroup_index % > > + > ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF; > > + } > > } > > > > /* Read PF mode. */ > > @@ -341,18 +399,29 @@ vrb_check_ir(struct acc_device *acc_dev) > > ring_data = acc_dev->info_ring + (acc_dev->info_ring_head & > > ACC_INFO_RING_MASK); > > > > while (ring_data->valid) { > > - if ((ring_data->int_nb < ACC_PF_INT_DMA_DL_DESC_IRQ) || ( > > - ring_data->int_nb > > ACC_PF_INT_DMA_MLD_DESC_IRQ)) { > > - rte_bbdev_log(WARNING, "InfoRing: ITR:%d > Info:0x%x", > > - ring_data->int_nb, ring_data- > >detailed_info); > > - /* Initialize Info Ring entry and move forward. */ > > - ring_data->val = 0; > > + if (acc_dev->device_variant == VRB1_VARIANT) { > > + if ((ring_data->int_nb < > ACC_PF_INT_DMA_DL_DESC_IRQ) || ( > > + ring_data->int_nb > > ACC_PF_INT_DMA_MLD_DESC_IRQ)) { > > + rte_bbdev_log(WARNING, "InfoRing: ITR:%d > Info:0x%x", > > + ring_data->int_nb, ring_data- > >detailed_info); > > + /* Initialize Info Ring entry and move forward. > */ > > + ring_data->val = 0; > > + } > > + } else { /* VRB2_VARIANT */ > > + if ((ring_data->int_nb_vrb2 < > ACC_PF_INT_DMA_DL_DESC_IRQ) || ( > > + ring_data->int_nb_vrb2 > > ACC_PF_INT_DMA_MLD_DESC_IRQ)) { > > + rte_bbdev_log(WARNING, "InfoRing: ITR:%d > Info:0x%x", > > + ring_data->int_nb_vrb2, > ring_data->val); > > + /* Initialize Info Ring entry and move forward. > */ > > + ring_data->val = 0; > > + } > > Same as get_queue_id_from_ring_info(). > Having a callback to get the different parameters (int_nb, vf_id, ...) > implemented by each variant would avoid duplication like above. > > It would make it more easier to add new variants, i.e. just implement the > callback without needing to patch every function making use of these info. ok, fair enough. > > > } > > info_ring_head++; > > ring_data = acc_dev->info_ring + (info_ring_head & > ACC_INFO_RING_MASK); > > } > > } > > > > + > > Remove new line. ok > > > /* Interrupt handler triggered by dev for handling specific interrupt. 
*/ > > static void > > vrb_dev_interrupt_handler(void *cb_arg) @@ -361,16 +430,22 @@ > > vrb_dev_interrupt_handler(void *cb_arg) > > struct acc_device *acc_dev = dev->data->dev_private; > > volatile union acc_info_ring_data *ring_data; > > struct acc_deq_intr_details deq_intr_det; > > + uint16_t vf_id, aq_id, qg_id, int_nb; > > + bool isVrb1 = (acc_dev->device_variant == VRB1_VARIANT); > > > > ring_data = acc_dev->info_ring + (acc_dev->info_ring_head & > > ACC_INFO_RING_MASK); > > > > while (ring_data->valid) { > > + vf_id = isVrb1 ? ring_data->vf_id : ring_data->vf_id_vrb2; > > + aq_id = isVrb1 ? ring_data->aq_id : ring_data->aq_id_vrb2; > > + qg_id = isVrb1 ? ring_data->qg_id : ring_data->qg_id_vrb2; > + > int_nb = isVrb1 ? ring_data->int_nb : ring_data->int_nb_vrb2; > > Same comment as above. ok > > > if (acc_dev->pf_device) { > > rte_bbdev_log_debug( > > - "VRB1 PF Interrupt received, Info Ring > data: 0x%x -> %d", > > - ring_data->val, ring_data->int_nb); > > + "PF Interrupt received, Info Ring data: > 0x%x -> %d", > > + ring_data->val, int_nb); > > > > - switch (ring_data->int_nb) { > > + switch (int_nb) { > > case ACC_PF_INT_DMA_DL_DESC_IRQ: > > case ACC_PF_INT_DMA_UL_DESC_IRQ: > > case ACC_PF_INT_DMA_FFT_DESC_IRQ: > > @@ -378,13 +453,11 @@ vrb_dev_interrupt_handler(void *cb_arg) > > case ACC_PF_INT_DMA_DL5G_DESC_IRQ: > > case ACC_PF_INT_DMA_MLD_DESC_IRQ: > > deq_intr_det.queue_id = > get_queue_id_from_ring_info( > > - dev->data, *ring_data); > > + dev->data, *ring_data, > acc_dev->device_variant); > > if (deq_intr_det.queue_id == UINT16_MAX) { > > rte_bbdev_log(ERR, > > "Couldn't find queue: > aq_id: %u, qg_id: %u, vf_id: %u", > > - ring_data->aq_id, > > - ring_data->qg_id, > > - ring_data->vf_id); > > + aq_id, qg_id, vf_id); > > return; > > } > > rte_bbdev_pmd_callback_process(dev, > > @@ -396,9 +469,9 @@ vrb_dev_interrupt_handler(void *cb_arg) > > } > > } else { > > rte_bbdev_log_debug( > > - "VRB1 VF Interrupt received, Info Ring > data: 0x%x\n", > > + "VRB VF Interrupt received, Info Ring > data: 0x%x\n", > > ring_data->val); > > - switch (ring_data->int_nb) { > > + switch (int_nb) { > > case ACC_VF_INT_DMA_DL_DESC_IRQ: > > case ACC_VF_INT_DMA_UL_DESC_IRQ: > > case ACC_VF_INT_DMA_FFT_DESC_IRQ: > > @@ -406,14 +479,16 @@ vrb_dev_interrupt_handler(void *cb_arg) > > case ACC_VF_INT_DMA_DL5G_DESC_IRQ: > > case ACC_VF_INT_DMA_MLD_DESC_IRQ: > > /* VFs are not aware of their vf_id - it's set > > to > 0. */ > > - ring_data->vf_id = 0; > > + if (acc_dev->device_variant == > VRB1_VARIANT) > > + ring_data->vf_id = 0; > > + else > > + ring_data->vf_id_vrb2 = 0; > > Ditto, you also need a way to set these fields in a generic way. ok > > > deq_intr_det.queue_id = > get_queue_id_from_ring_info( > > - dev->data, *ring_data); > > + dev->data, *ring_data, > acc_dev->device_variant); > > if (deq_intr_det.queue_id == UINT16_MAX) { > > rte_bbdev_log(ERR, > > "Couldn't find queue: > aq_id: %u, qg_id: %u", > > - ring_data->aq_id, > > - ring_data->qg_id); > > + aq_id, qg_id); > > return; > > } > > rte_bbdev_pmd_callback_process(dev, > > @@ -428,8 +503,7 @@ vrb_dev_interrupt_handler(void *cb_arg) > > /* Initialize Info Ring entry and move forward. 
*/ > > ring_data->val = 0; > > ++acc_dev->info_ring_head; > > - ring_data = acc_dev->info_ring + > > - (acc_dev->info_ring_head & > ACC_INFO_RING_MASK); > > + ring_data = acc_dev->info_ring + (acc_dev->info_ring_head & > > +ACC_INFO_RING_MASK); > > } > > } > > > > @@ -461,7 +535,10 @@ allocate_info_ring(struct rte_bbdev *dev) > > phys_low = (uint32_t)(info_ring_iova); > > acc_reg_write(d, d->reg_addr->info_ring_hi, phys_high); > > acc_reg_write(d, d->reg_addr->info_ring_lo, phys_low); > > - acc_reg_write(d, d->reg_addr->info_ring_en, > VRB1_REG_IRQ_EN_ALL); > > + if (d->device_variant == VRB1_VARIANT) > > + acc_reg_write(d, d->reg_addr->info_ring_en, > VRB1_REG_IRQ_EN_ALL); > > + else > > + acc_reg_write(d, d->reg_addr->info_ring_en, > VRB2_REG_IRQ_EN_ALL); > > My guess is the mask value difference between the two is the VRB2 added > MLD-TS related interrupts, is it? > > If so, it would be more future proof to configure this based on the device > capabilities instead of its variant? There are some differences in these implementations, best to keep them separate. > > > d->info_ring_head = (acc_reg_read(d, d->reg_addr->info_ring_ptr) & > > 0xFFF) / sizeof(union acc_info_ring_data); > > return 0; > > @@ -516,6 +593,7 @@ vrb_setup_queues(struct rte_bbdev *dev, uint16_t > num_queues, int socket_id) > > phys_high = (uint32_t)(d->sw_rings_iova >> 32); > > phys_low = (uint32_t)(d->sw_rings_iova & ~(ACC_SIZE_64MBYTE-1)); > > > > + > > Remove new line. ok > > > /* Read the populated cfg from device registers. */ > > fetch_acc_config(dev); > > > > @@ -540,6 +618,10 @@ vrb_setup_queues(struct rte_bbdev *dev, uint16_t > num_queues, int socket_id) > > acc_reg_write(d, d->reg_addr->dma_ring_dl4g_lo, phys_low); > > acc_reg_write(d, d->reg_addr->dma_ring_fft_hi, phys_high); > > acc_reg_write(d, d->reg_addr->dma_ring_fft_lo, phys_low); > > + if (d->device_variant == VRB2_VARIANT) { > > + acc_reg_write(d, d->reg_addr->dma_ring_mld_hi, phys_high); > > + acc_reg_write(d, d->reg_addr->dma_ring_mld_lo, phys_low); > > Same, wouldn't it be more future-proof to set this based on the capabilities > of > the device and not its variant? Checking the capability on the fly would be cumbersome I think. I don't see a problem related to next-gen extension for this. Thanks > > > + } > > /* > > * Configure Ring Size to the max queue ring size > > * (used for wrapping purpose). > > @@ -549,8 +631,7 @@ vrb_setup_queues(struct rte_bbdev *dev, uint16_t > > num_queues, int socket_id) > > > > /* Configure tail pointer for use when SDONE enabled. */ > > if (d->tail_ptrs == NULL) > > - d->tail_ptrs = rte_zmalloc_socket( > > - dev->device->driver->name, > > + d->tail_ptrs = rte_zmalloc_socket(dev->device->driver->name, > > VRB_MAX_QGRPS * VRB_MAX_AQS * > sizeof(uint32_t), > > RTE_CACHE_LINE_SIZE, socket_id); > > if (d->tail_ptrs == NULL) { > > @@ -574,6 +655,10 @@ vrb_setup_queues(struct rte_bbdev *dev, uint16_t > num_queues, int socket_id) > > acc_reg_write(d, d->reg_addr->tail_ptrs_dl4g_lo, phys_low); > > acc_reg_write(d, d->reg_addr->tail_ptrs_fft_hi, phys_high); > > acc_reg_write(d, d->reg_addr->tail_ptrs_fft_lo, phys_low); > > + if (d->device_variant == VRB2_VARIANT) { > > + acc_reg_write(d, d->reg_addr->tail_ptrs_mld_hi, phys_high); > > + acc_reg_write(d, d->reg_addr->tail_ptrs_mld_lo, phys_low); > > + } > > Ditto. 
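For illustration only (hypothetical helper, not part of the patch; assumes the driver's existing acc_device and acc_reg_write definitions): the variant test could also be wrapped in one small predicate so the MLD-specific register writes stay in a single place if another variant is added later.

static inline bool
vrb_has_mld(const struct acc_device *d)
{
	/* Today only the VRB2 variant has the MLD-TS engine. */
	return d->device_variant == VRB2_VARIANT;
}

	/* Same behaviour as the patch, e.g. in vrb_setup_queues(): */
	if (vrb_has_mld(d)) {
		acc_reg_write(d, d->reg_addr->tail_ptrs_mld_hi, phys_high);
		acc_reg_write(d, d->reg_addr->tail_ptrs_mld_lo, phys_low);
	}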
> > > > > ret = allocate_info_ring(dev); > > if (ret < 0) { > > @@ -671,10 +756,17 @@ vrb_intr_enable(struct rte_bbdev *dev) > > return ret; > > } > > > > - if (acc_dev->pf_device) > > - max_queues = VRB1_MAX_PF_MSIX; > > - else > > - max_queues = VRB1_MAX_VF_MSIX; > > + if (d->device_variant == VRB1_VARIANT) { > > + if (acc_dev->pf_device) > > + max_queues = VRB1_MAX_PF_MSIX; > > + else > > + max_queues = VRB1_MAX_VF_MSIX; > > + } else { > > + if (acc_dev->pf_device) > > + max_queues = VRB2_MAX_PF_MSIX; > > + else > > + max_queues = VRB2_MAX_VF_MSIX; > > + } > > > > if (rte_intr_efd_enable(dev->intr_handle, max_queues)) { > > rte_bbdev_log(ERR, "Failed to create fds for %u > queues", @@ > > -776,7 +868,10 @@ vrb_find_free_queue_idx(struct rte_bbdev *dev, > > /* Mark the Queue as assigned. */ > > d->q_assigned_bit_map[group_idx] |= (1ULL << > aq_idx); > > /* Report the AQ Index. */ > > - return (group_idx << VRB1_GRP_ID_SHIFT) + aq_idx; > > + if (d->device_variant == VRB1_VARIANT) > > + return (group_idx << VRB1_GRP_ID_SHIFT) + > aq_idx; > > + else > > + return (group_idx << VRB2_GRP_ID_SHIFT) + > aq_idx; > > This is the same bitmap as in acc_info_ring_data, isn't it? > We could maybe share some code to to the decoding as suggested ealier in > this patch. you mean just to add an inline function for the grp_id_shift(variant) right? fair enough > > > } > > } > > rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u", > > @@ -819,6 +914,9 @@ vrb_queue_setup(struct rte_bbdev *dev, uint16_t > queue_id, > > ACC_FCW_LD_BLEN : (conf->op_type == > RTE_BBDEV_OP_FFT ? > > ACC_FCW_FFT_BLEN : ACC_FCW_MLDTS_BLEN)))); > > > > + if ((q->d->device_variant == VRB2_VARIANT) && (conf->op_type == > RTE_BBDEV_OP_FFT)) > > + fcw_len = ACC_FCW_FFT_BLEN_3; > > + > > for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) { > > desc = q->ring_addr + desc_idx; > > desc->req.word0 = ACC_DMA_DESC_TYPE; @@ -915,9 > +1013,16 @@ > > vrb_queue_setup(struct rte_bbdev *dev, uint16_t queue_id, > > } > > } > > > > - q->qgrp_id = (q_idx >> VRB1_GRP_ID_SHIFT) & 0xF; > > - q->vf_id = (q_idx >> VRB1_VF_ID_SHIFT) & 0x3F; > > - q->aq_id = q_idx & 0xF; > > + if (d->device_variant == VRB1_VARIANT) { > > + q->qgrp_id = (q_idx >> VRB1_GRP_ID_SHIFT) & 0xF; > > + q->vf_id = (q_idx >> VRB1_VF_ID_SHIFT) & 0x3F; > > + q->aq_id = q_idx & 0xF; > > + } else { > > + q->qgrp_id = (q_idx >> VRB2_GRP_ID_SHIFT) & 0x1F; > > + q->vf_id = (q_idx >> VRB2_VF_ID_SHIFT) & 0x3F; > > + q->aq_id = q_idx & 0x3F; > > + } > > + > > Same remark as before. 
yes, still valid > > > q->aq_depth = 0; > > if (conf->op_type == RTE_BBDEV_OP_TURBO_DEC) > > q->aq_depth = (1 << d->acc_conf.q_ul_4g.aq_depth_log2); > > @@ -1149,6 +1254,127 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct > rte_bbdev_driver_info *dev_info) > > RTE_BBDEV_END_OF_CAPABILITIES_LIST() > > }; > > > > + static const struct rte_bbdev_op_cap vrb2_bbdev_capabilities[] = { > > + { > > + .type = RTE_BBDEV_OP_TURBO_DEC, > > + .cap.turbo_dec = { > > + .capability_flags = > > + > RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE | > > + RTE_BBDEV_TURBO_CRC_TYPE_24B | > > + > RTE_BBDEV_TURBO_DEC_CRC_24B_DROP | > > + RTE_BBDEV_TURBO_EQUALIZER | > > + > RTE_BBDEV_TURBO_SOFT_OUT_SATURATE | > > + > RTE_BBDEV_TURBO_HALF_ITERATION_EVEN | > > + > RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH | > > + RTE_BBDEV_TURBO_SOFT_OUTPUT | > > + > RTE_BBDEV_TURBO_EARLY_TERMINATION | > > + > RTE_BBDEV_TURBO_DEC_INTERRUPTS | > > + > RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN | > > + > RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT | > > + RTE_BBDEV_TURBO_MAP_DEC | > > + > RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP | > > + > RTE_BBDEV_TURBO_DEC_SCATTER_GATHER, > > + .max_llr_modulus = INT8_MAX, > > + .num_buffers_src = > > + > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, > > + .num_buffers_hard_out = > > + > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, > > + .num_buffers_soft_out = > > + > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, > > + } > > + }, > > + { > > + .type = RTE_BBDEV_OP_TURBO_ENC, > > + .cap.turbo_enc = { > > + .capability_flags = > > + > RTE_BBDEV_TURBO_CRC_24B_ATTACH | > > + > RTE_BBDEV_TURBO_RV_INDEX_BYPASS | > > + RTE_BBDEV_TURBO_RATE_MATCH | > > + > RTE_BBDEV_TURBO_ENC_INTERRUPTS | > > + > RTE_BBDEV_TURBO_ENC_SCATTER_GATHER, > > + .num_buffers_src = > > + > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, > > + .num_buffers_dst = > > + > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, > > + } > > + }, > > + { > > + .type = RTE_BBDEV_OP_LDPC_ENC, > > + .cap.ldpc_enc = { > > + .capability_flags = > > + RTE_BBDEV_LDPC_RATE_MATCH | > > + RTE_BBDEV_LDPC_CRC_24B_ATTACH > | > > + > RTE_BBDEV_LDPC_INTERLEAVER_BYPASS | > > + RTE_BBDEV_LDPC_ENC_INTERRUPTS > | > > + > RTE_BBDEV_LDPC_ENC_SCATTER_GATHER | > > + > RTE_BBDEV_LDPC_ENC_CONCATENATION, > > + .num_buffers_src = > > + > RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, > > + .num_buffers_dst = > > + > RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, > > + } > > + }, > > + { > > + .type = RTE_BBDEV_OP_LDPC_DEC, > > + .cap.ldpc_dec = { > > + .capability_flags = > > + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK | > > + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP | > > + RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK | > > + RTE_BBDEV_LDPC_CRC_TYPE_16_CHECK | > > + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE > | > > + > RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE | > > + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE > | > > + RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS | > > + RTE_BBDEV_LDPC_DEC_SCATTER_GATHER | > > + > RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION | > > + > RTE_BBDEV_LDPC_HARQ_4BIT_COMPRESSION | > > + RTE_BBDEV_LDPC_LLR_COMPRESSION | > > + RTE_BBDEV_LDPC_SOFT_OUT_ENABLE | > > + RTE_BBDEV_LDPC_SOFT_OUT_RM_BYPASS | > > + > RTE_BBDEV_LDPC_SOFT_OUT_DEINTERLEAVER_BYPASS | > > + RTE_BBDEV_LDPC_DEC_INTERRUPTS, > > + .llr_size = 8, > > + .llr_decimals = 2, > > + .num_buffers_src = > > + > RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, > > + .num_buffers_hard_out = > > + > RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, > > + .num_buffers_soft_out = 0, > > + } > > + }, > > + { > > + .type = RTE_BBDEV_OP_FFT, > > + .cap.fft = { > > + .capability_flags = > > + > RTE_BBDEV_FFT_WINDOWING | > > + > RTE_BBDEV_FFT_CS_ADJUSTMENT | > > + > RTE_BBDEV_FFT_DFT_BYPASS | > > + > 
RTE_BBDEV_FFT_IDFT_BYPASS | > > + RTE_BBDEV_FFT_FP16_INPUT > | > > + > RTE_BBDEV_FFT_FP16_OUTPUT | > > + > RTE_BBDEV_FFT_POWER_MEAS | > > + > RTE_BBDEV_FFT_WINDOWING_BYPASS, > > + .num_buffers_src = > > + > RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, > > + .num_buffers_dst = > > + > RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, > > Is that a copy/pasta error or is it expected to have LDPC defines in FFT > capabilities? I need to change that; this is not quite correct actually, even previously. Probably through another series though. No one uses this currently. > > > + } > > + }, > > + { > > + .type = RTE_BBDEV_OP_MLDTS, > > + .cap.fft = { > > + .capability_flags = > > + RTE_BBDEV_MLDTS_REP, > > + .num_buffers_src = > > + > RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, > > + .num_buffers_dst = > > + > RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, > > Same question here, that looks suspicious but maybe I'm wrong. > > > + } > > + }, > > + RTE_BBDEV_END_OF_CAPABILITIES_LIST() > > + }; > > + > > static struct rte_bbdev_queue_conf default_queue_conf; > > default_queue_conf.socket = dev->data->socket_id; > > default_queue_conf.queue_size = ACC_MAX_QUEUE_DEPTH; @@ - 1193,7 > > +1419,10 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct > rte_bbdev_driver_info *dev_info) > > dev_info->default_queue_conf = default_queue_conf; > > dev_info->cpu_flag_reqs = NULL; > > dev_info->min_alignment = 1; > > - dev_info->capabilities = vrb1_bbdev_capabilities; > > + if (d->device_variant == VRB1_VARIANT) > > + dev_info->capabilities = vrb1_bbdev_capabilities; > > + else > > + dev_info->capabilities = vrb2_bbdev_capabilities; > > dev_info->harq_buffer_size = 0; > > > > vrb_check_ir(d); > > @@ -1242,6 +1471,9 @@ static struct rte_pci_id pci_id_vrb_pf_map[] = { > > { > > RTE_PCI_DEVICE(RTE_VRB1_VENDOR_ID, > RTE_VRB1_PF_DEVICE_ID) > > }, > > + { > > + RTE_PCI_DEVICE(RTE_VRB2_VENDOR_ID, > RTE_VRB2_PF_DEVICE_ID) > > + }, > > {.device_id = 0}, > > }; > > > > @@ -1250,6 +1482,9 @@ static struct rte_pci_id pci_id_vrb_vf_map[] = { > > { > > RTE_PCI_DEVICE(RTE_VRB1_VENDOR_ID, > RTE_VRB1_VF_DEVICE_ID) > > }, > > + { > > + RTE_PCI_DEVICE(RTE_VRB2_VENDOR_ID, > RTE_VRB2_VF_DEVICE_ID) > > + }, > > {.device_id = 0}, > > }; > > > > I'm stopping the review here for now. Thanks, expecting a v3 this week based on your suggestions. > > Thanks, > Maxime