On 2021/4/20 21:18, Ananyev, Konstantin wrote:
>
>
>> -----Original Message-----
>> From: Chengchang Tang <tangchengch...@huawei.com>
>> Sent: Tuesday, April 20, 2021 1:44 PM
>> To: Ananyev, Konstantin <konstantin.anan...@intel.com>; Yigit, Ferruh
>> <ferruh.yi...@intel.com>; dev@dpdk.org
>> Cc: linux...@huawei.com; ch...@att.com; humi...@huawei.com
>> Subject: Re: [dpdk-dev] [RFC 0/2] add Tx prepare support for bonding device
>>
>> Hi
>> On 2021/4/20 16:33, Ananyev, Konstantin wrote:
>>> Hi everyone,
>>>
>>>>
>>>> On 2021/4/20 9:26, Ferruh Yigit wrote:
>>>>> On 4/16/2021 12:04 PM, Chengchang Tang wrote:
>>>>>> This patch adds Tx prepare support for the bonding device.
>>>>>>
>>>>>> Currently, the bonding driver does not implement the callback behind
>>>>>> rte_eth_tx_prepare, so the Tx prepare functions of the slave devices
>>>>>> are never invoked. When hardware offloads such as CKSUM and TSO are
>>>>>> enabled, some drivers need tx_prepare to adjust packets (for example,
>>>>>> to set correct pseudo-header checksums). Otherwise, the offload fails
>>>>>> and packets may even be sent incorrectly. Due to this limitation, the
>>>>>> bonded device cannot use these HW offloads in the Tx direction.
>>>>>>
>>>>>> Because the packet transmit algorithms in the bond PMD are numerous
>>>>>> and complex, it is hard to design a tx_prepare callback for it. This
>>>>>> patch therefore does not implement the bonding PMD's tx_prepare
>>>>>> callback. Instead, rte_eth_tx_prepare is called from the tx_burst
>>>>>> callback, and a global variable is introduced to control whether the
>>>>>> bonded device needs to call rte_eth_tx_prepare. If upper-layer users
>>>>>> need Tx offloads that depend on tx_prepare, they should enable this
>>>>>> preparation function; the bonded device will then call
>>>>>> rte_eth_tx_prepare for fast-path packets in the tx_burst callback.
>>>
>>> I admit that I didn't look at the implementation yet, but it sounds like
>>> overcomplication to me. Can't we just have a new TX function for bonding PMD
>>> when TX offloads are enabled? And inside that function we will do:
>>> tx_prepare(); tx_burst(); for selected device.
>>
>> The solution you mentioned is workable and may perform better. However,
>> the current solution is also simple and has a limited impact on
>> performance. It is essentially:
>>
>>     if (tx_prepare_enable)
>>         tx_prepare();
>>     tx_burst();
>>
>> Overall, it adds only a single branch in the case where the related Tx
>> offloads are not enabled.
>>
>>> We can select this function at the setup stage by analysing the TX
>>> offloads requested by the user.
>>>
>>
>> In PMDs, it is common practice to select different Tx/Rx functions during
>> the setup phase. But for a 'vdev' device like bonding, we may need to
>> think more about it. The reasons are explained below.
>>>
>>>>>>
>>>>>
>>>>> What do you think about adding a devarg to the bonding PMD to control
>>>>> the tx_prepare? It won't be as dynamic as an API, since with an API it
>>>>> is possible to change the behavior after the application has started,
>>>>> but do we really need that?
>>>>
>>>> If an API is not added, unnecessary constraints may be introduced. If
>>>> the bonding device is created through the rte_eth_bond_create interface
>>>> instead of the "vdev" devargs, this function cannot be used because
>>>> devargs do not take effect in that case. But from an ease-of-use
>>>> perspective, adding a devarg is a good idea. I will add the related
>>>> implementation in the later official patches.
>>>
>>> I am also against introducing a new devarg to control the tx_prepare()
>>> invocation. I think at the dev_configure/queue_setup phase the PMD will
>>> have enough information to decide.
>>>
>> Currently, the community does not specify which Tx offloads need to
>> invoke tx_prepare.
>
> I think inside the bond PMD we can safely assume that any TX offload does
> need tx_prepare(). If that's not the case, then the slave dev tx_prepare
> pointer will be NULL and rte_eth_tx_prepare() will be just a NOOP.
Got it. I agree that these decisions should be offloaded directly into the
PMDs. In the formal patch, the API used to control the enable state will be
removed.
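
To make that concrete, here is a rough sketch of the per-slave fast path I
have in mind (bond_slave_prepare_and_burst and the slave_* names are
illustrative only, not the actual patch):

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    /* Sketch: prepare, then burst, on whatever slave port/queue the
     * bonding distribution algorithm picked for this batch of mbufs.
     */
    static uint16_t
    bond_slave_prepare_and_burst(uint16_t slave_port_id,
                                 uint16_t slave_queue_id,
                                 struct rte_mbuf **bufs,
                                 uint16_t nb_bufs)
    {
            uint16_t nb_prep;

            /* NOOP for slaves whose PMD has no tx_pkt_prepare callback. */
            nb_prep = rte_eth_tx_prepare(slave_port_id, slave_queue_id,
                                         bufs, nb_bufs);
            /* Mbufs rejected by prepare are not transmitted; the
             * application sees that through the return count.
             */
            return rte_eth_tx_burst(slave_port_id, slave_queue_id,
                                    bufs, nb_prep);
    }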
>
>> For vdev devices such as bond, all NIC devices need to be considered.
>> Generally, tx_prepare is used for CKSUM and TSO. It is possible that for
>> some NICs even CKSUM and TSO do not need tx_prepare, while for others
>> there are additional Tx offloads that do need it. From this perspective,
>> leaving the choice to the user seems the better option.
>
> Wonder how the user will know when to enable/disable it?
> As you said, it depends on the underlying HW/PMD and can change from
> system to system.
Generally, that decision would have to be made based on debugging results,
which is not good.
> I think it is the PMD that needs to take this decision, and I think the
> safest bet might be to enable it when any TX offload is enabled by the
> user.
>
I agree that these decisions should be made by the PMDs. Indeed, I think
tx_prepare() should always be called in bonding; its impact on performance
should be controlled directly by the PMDs.
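
For reference, rte_eth_tx_prepare() already degenerates to almost nothing
when a slave PMD provides no callback; simplified (not verbatim) from the
inline in rte_ethdev.h:

    /* A slave without a tx_pkt_prepare callback costs only this
     * NULL check; otherwise the PMD-specific prepare routine runs.
     */
    struct rte_eth_dev *dev = &rte_eth_devices[port_id];

    if (dev->tx_pkt_prepare == NULL)
            return nb_pkts;

    return (*dev->tx_pkt_prepare)(dev->data->tx_queues[queue_id],
                                  tx_pkts, nb_pkts);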
>>>>
>>>> If I understand correctly, the community currently does not want to
>>>> introduce more private APIs for PMDs. However, the absence of an API on
>>>> this issue would introduce some unnecessary constraints; from that
>>>> point of view, I think adding an API seems necessary.
>>>>>
>>>>>> Chengchang Tang (2):
>>>>>> net/bonding: add Tx prepare for bonding
>>>>>> app/testpmd: add cmd for bonding Tx prepare
>>>>>>
>>>>>>  app/test-pmd/cmdline.c                      | 66 +++++++++++++++++++++++++++++
>>>>>>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  9 ++++
>>>>>>  drivers/net/bonding/eth_bond_private.h      |  1 +
>>>>>>  drivers/net/bonding/rte_eth_bond.h          | 29 +++++++++++++
>>>>>>  drivers/net/bonding/rte_eth_bond_api.c      | 28 ++++++++++++
>>>>>>  drivers/net/bonding/rte_eth_bond_pmd.c      | 33 +++++++++++++--
>>>>>>  drivers/net/bonding/version.map             |  5 +++
>>>>>>  7 files changed, 167 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> --
>>>>>> 2.7.4
>>>>>>
>>>>>
>>>>>
>>>
>