Dear Nitin,

It doesn't work that way.

Regards,

Damjan

> On 1 Jun 2018, at 19:41, Saxena, Nitin <nitin.sax...@cavium.com> wrote:
> 
> Hi Damjan,
> 
> Now that you are aware that Cavium is working on optimisations for ARM, can
> I request that you check with us on the implications for ARM (at least
> Cavium) before making changes to dpdk-input?
> 
> Regards,
> Nitin
> 
> On 01-Jun-2018, at 21:39, Damjan Marion <dmar...@me.com> wrote:
> 
>> 
>> Dear Nitin,
>> 
>> I really don't have anything else to add. It's your call how you want to
>> proceed....
>> 
>> Regards,
>> 
>> Damjan
>> 
>>> On 1 Jun 2018, at 18:02, Nitin Saxena <nitin.sax...@cavium.com> wrote:
>>> 
>>> Hi Damjan,
>>> 
>>> Answers Inline.
>>> 
>>> Thanks,
>>> Nitin
>>> 
>>> On Friday 01 June 2018 08:49 PM, Damjan Marion wrote:
>>>> Hi Nitin,
>>>> inline...
>>>>> On 1 Jun 2018, at 15:23, Nitin Saxena <nitin.sax...@cavium.com> wrote:
>>>>> 
>>>>> Hi Damjan,
>>>>> 
>>>>>> It was hard to know that you have a subset of patches hidden somewhere.
>>>>> I wouldn't say the patches are hidden. We are trying to fine-tune
>>>>> dpdk-input on our end first, and later we will seek your expertise
>>>>> while upstreaming.
>>>> For me they were hidden.
>>>>>> Typically it makes sense to discuss this kind of change with the person
>>>>>> who "maintains" the code before starting to write the code.
>>>>> Agreed. However, we prefer to do an internal analysis/POC before
>>>>> reaching out to the MAINTAINERS. That way we can better understand code
>>>>> review comments.
>>>> Perfectly fine, but then don't put blame on us for not knowing that you 
>>>> are doing something internally...
>>> The intention was not to blame anybody but to understand the modular
>>> approach in vpp for accommodating multiple architectures.
>>>>> 
>>>>>> Maybe, but it sounds to me like we are still in the guessing phase.
>>>>> I wouldn't do any guesswork with the MAINTAINERS.
>>>>> 
>>>>>> Maybe we even need a different function for each ARM CPU core, as they
>>>>>> may have different memory subsystems and pipelines....
>>>>> This is what I am looking for. Is it ok to detect our hardware natively
>>>>> from autoconf and append a target-specific macro to CFLAGS, and then
>>>>> have a separate function for our target in dpdk/device/node.c? Sorry, my
>>>>> multi-arch select example was incorrect and that's not what I am looking
>>>>> at.
>>>> Here I will be able to help once I get a reasonable understanding of what
>>>> the "big" plan is.
>>> The "big" plan is to optimize each vpp node for AArch64. For now the focus
>>> is dpdk-input.
>>>> I don't want us to end up in 6 months with Cavium patches, NXP patches,
>>>> Marvell patches, and so on.
>>> Is it a problem? If yes, then I am not able to see it, as the same
>>> problem would exist for any architecture and not just for AArch64.
>>>>> 
>>>>>> Is there an agreement between ARM vendors on which core you want to
>>>>>> have code tuned for, or are you simply tuning for whatever core Cavium
>>>>>> uses?
>>>>> I am trying to optimize for Cavium's SoC. My question is in that regard
>>>>> only. However, efforts to optimize for Cortex cores are also under way
>>>>> in the ARM community.
>>>> What about agreeing on a plan for optimising on all ARM cores, and then
>>>> starting the optimisation?
>>> This is a cross-company question, so it is hard to answer, but Cavium has
>>> the "big" plan described above.
>>>>> 
>>>>> Thanks,
>>>>> Nitin
>>>>> 
>>>>> On Friday 01 June 2018 01:55 AM, Damjan Marion wrote:
>>>>>> inline...
>>>>>> -- 
>>>>>> Damjan
>>>>>>> On 31 May 2018, at 21:10, Saxena, Nitin <nitin.sax...@cavium.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>> Hi Damjan,
>>>>>>> 
>>>>>>> Answers inline.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Nitin
>>>>>>> 
>>>>>>>> On 01-Jun-2018, at 12:15 AM, Damjan Marion <dmarion.li...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Dear Nitin,
>>>>>>>> 
>>>>>>>> See inline….
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 31 May 2018, at 19:59, Nitin Saxena <nitin.sax...@cavium.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>> I am working on optimising the dpdk-input node (based on vpp v1804)
>>>>>>>>> for our target. I am able to get performance improvements on our
>>>>>>>>> target, but the problems I am finding now are:
>>>>>>>>> 
>>>>>>>>> 1) The dpdk-input code has completely changed on the master branch
>>>>>>>>> since v1804.
>>>>>>>> 
>>>>>>>> Why is this a problem? It was done for a reason and for tangible
>>>>>>>> benefit.
>>>>>>> This is a problem for me as I cannot apply my v1804 changes directly
>>>>>>> to the master branch. I have to redo the work on the master branch, and
>>>>>>> that’s why I am not able to move to master or v1807 in the future.
>>>>>> It was hard to know that you have a subset of patches hidden somewhere.
>>>>>> Typically it makes sense to discuss this kind of change with the person
>>>>>> who "maintains" the code before starting to write the code.
>>>>>>>> 
>>>>>>>>> Not to mention that the dpdk-input code on the master branch does
>>>>>>>>> not give better numbers on our target compared to v1804.
>>>>>>>> 
>>>>>>>> Sad to hear that, good thing is, it gives better numbers on x86.
>>>>>>> As I understand it, one dpdk_device_input function cannot be the same
>>>>>>> for all architectures, because if the underlying micro-architecture is
>>>>>>> different, the hot spots change.
>>>>>> Maybe, but it sounds to me like we are still in the guessing phase.
>>>>>> Maybe we even need a different function for each ARM CPU core, as they
>>>>>> may have different memory subsystems and pipelines....
>>>>>> Is there an agreement between ARM vendors on which core you want to
>>>>>> have code tuned for, or are you simply tuning for whatever core Cavium
>>>>>> uses?
>>>>>>> I have seen the dpdk-input changes on the master branch and, on a
>>>>>>> positive note, those changes make sense. However, some of the code is
>>>>>>> tuned for x86, especially Skylake. I was looking for some way to have
>>>>>>> a multi-arch select function for the Rx path, like the way it’s done
>>>>>>> for the tx path.
>>>>>> Not sure why you need that, unless you are going to have code
>>>>>> optimised for different CPU variants (e.g. Cortex-A53 and Cortex-A72) in
>>>>>> the same binary.
>>>>>>>> 
>>>>>>>>> 2) I don’t know the modular approach I should follow to merge my
>>>>>>>>> changes, as I have completely changed the quad loop handling and the
>>>>>>>>> prefetch order in dpdk-input.
>>>>>>>> 
>>>>>>>> I carefully tuned that code. It was a multi-day exercise, and losing a
>>>>>>>> single clock/packet on x86 to additional modifications is not
>>>>>>>> acceptable. Still, I’m open to discussing how to address this problem.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Note: I am far away from upstreaming the code currently as my 
>>>>>>>>> optimisation is still in progress. It will be better if I know the 
>>>>>>>>> proper way of doing it.
>>>>>>>> 
>>>>>>>> I suggest that you don’t even start working on upstreaming before we
>>>>>>>> have a deep understanding of what needs to be done and why, and we are
>>>>>>>> all in agreement.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Nitin
>>>>> 
>>>>> 
>>> 
>>> 
>> 

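For readers following the thread: the "multi-arch select" idea raised above (one binary carrying several CPU-tuned variants of the same function, with one picked at startup) could look roughly like the C sketch below. It is only an illustration under stated assumptions. It is not VPP's actual mechanism, and every name in it (rx_fn_t, rx_variant_t, rx_select, the "core-a" and "core-b" labels) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Signature shared by every variant of the (hypothetical) rx function. */
typedef uint32_t (*rx_fn_t) (uint32_t n_packets);

/* Generic variant: assumed to be safe on every CPU. */
static uint32_t
rx_generic (uint32_t n_packets)
{
  return n_packets;
}

/* Stand-ins for CPU-tuned variants. Real ones would differ in loop
   unrolling and prefetch order, not in the value they return. */
static uint32_t
rx_core_a (uint32_t n_packets)
{
  return n_packets;
}

static uint32_t
rx_core_b (uint32_t n_packets)
{
  return n_packets;
}

typedef struct
{
  const char *cpu_name; /* NULL marks the generic fallback entry */
  rx_fn_t fn;
} rx_variant_t;

/* Table of variants; the last entry is the generic fallback. */
static const rx_variant_t rx_variants[] = {
  { "core-a", rx_core_a },
  { "core-b", rx_core_b },
  { NULL, rx_generic },
};

/* Return the variant registered for the detected CPU name, or the
   generic fallback when the name is NULL or not in the table. */
static rx_fn_t
rx_select (const char *detected_cpu)
{
  size_t i;

  for (i = 0; rx_variants[i].cpu_name != NULL; i++)
    if (detected_cpu != NULL
        && strcmp (rx_variants[i].cpu_name, detected_cpu) == 0)
      return rx_variants[i].fn;

  return rx_variants[i].fn; /* the loop stopped at the fallback entry */
}
```

A caller would resolve the pointer once at startup, for example `rx_fn_t rx = rx_select (cpu_name);`, and then call `rx (n)` in the hot path. How `cpu_name` is detected is left open here; the alternative discussed in the thread, a target-specific macro set at build time, would instead pick one variant with the preprocessor and ship only that one.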