on 2023/12/6 02:01, Ajit Agarwal wrote:
> Hello Kewen:
>
>
> On 05/12/23 7:13 pm, Ajit Agarwal wrote:
>> Hello Kewen:
>>
>> On 04/12/23 7:31 am, Kewen.Lin wrote:
>>> Hi Ajit,
>>>
>>> on 2023/12/1 17:10, Ajit Agarwal wrote:
>>>> Hello Kewen:
>>>>
>>>> On 24/11/23 3:01 pm, Kewen.Lin wrote:
>>>>> Hi Ajit,
>>>>>
>>>>> Don't forget to CC David (CC-ed) :), some comments are inlined below.
>>>>>
>>>>> on 2023/10/8 03:04, Ajit Agarwal wrote:
>>>>>> Hello All:
>>>>>>
>>>>>> This patch add new pass to replace contiguous addresses vector load lxv
>>>>>> with mma instruction lxvp.
>>>>>
>>>>> IMHO the current binding lxvp (and lxvpx, stxvp{x,}) to MMA looks wrong,
>>>>> it's only Power10 and VSX required, these instructions should perform
>>>>> well without MMA support.
>>>>> So one patch to separate their support from MMA seems to go first.
>>>>>
>>>>
>>>> I will make the changes for Power10 and VSX.
>>>>
>>>>>> This patch addresses one regressions failure in ARM architecture.
>>>>>
>>>>> Could you explain this? I don't see any test case for this.
>>>>
>>>> I have submitted v1 of the patch and there were regressions failure for
>>>> Linaro.
>>>> I have fixed in version V2.
>>>
>>> OK, thanks for clarifying. So some unexpected changes on generic code in v1
>>> caused the failure exposed on arm.
>>>
>>>>
>>>>
>>>>> Besides, it seems a bad idea to put this pass after reload? as register
>>>>> allocation finishes, this pairing has to be restricted by the reg No.
>>>>> (I didn't see any checking on the reg No. relationship for paring btw.)
>>>>>
>>>>
>>>> Adding before reload pass deletes one of the lxv and replaced with lxvp.
>>>> This fails in reload pass while freeing reg_eqivs as ira populates them
>>>> and then
>>>
>>> I can't find reg_eqivs, I guessed you meant reg_equivs and moved this pass
>>> right before pass_reload (between pass_ira and pass_reload)? IMHO it's
>>> unexpected as those two passes are closely correlated.
>>> I was expecting to put it somewhere before ira.
>>
>> Yes they are tied together and moving before reload will not work.
>>
>>>
>>>> vecload pass deletes some of insns and while freeing in reload pass as insn
>>>> is already deleted in vecload pass reload pass segfaults.
>>>>
>>>> Moving vecload pass before ira will not make register pairs with lxvp and
>>>> in ira and that will be a problem.
>>>
>>> Could you elaborate the obstacle for moving such pass before pass_ira?
>>>
>>> Basing on the status quo, the lxvp is bundled with OOmode, then I'd expect
>>> we can generate OOmode move (load) and use the components with unspec (or
>>> subreg with Peter's patch) to replace all the previous use places, it looks
>>> doable to me.
>>
>> Moving before ira passes, we delete the offset lxv and generate lxvp and
>> replace all the uses, that I am doing. But the offset lxvp register
>> generated by ira are not register pair and generate random register and
>> hence we cannot generate lxvp.
>>
>> For example one lxv is generated with register 32 and other pair is generated
>> with register 45 by ira if we move it before ira passes.
>
> It generates the following.
> lxvp %vs32,0(%r4)
> xvf32ger 0,%vs34,%vs32
> xvf32gerpp 0,%vs34,%vs45

What does the RTL for these insns look like?

I'd expect you to use UNSPEC_MMA_EXTRACT to extract V16QI from the result of
lxvp; the current define_insn_and_split "*vsx_disassemble_pair" should then
be able to take care of the rest (e.g. reg and regoff).

BR,
Kewen

> xxmfacc 0
> stxvp %vs2,0(%r3)
> stxvp %vs0,32(%r3)
> blr
>
>
> Instead of vs33 ira generates vs45 if we move before pass_ira.
>
> Thanks & Regards
> Ajit
>
>
>> Thanks & Regards
>> Ajit
>>>
>>
>>>>
>>>> Making after reload pass is the only solution I see as ira and reload pass
>>>> makes register pairs and vecload pass will be easier with generation of
>>>> lxvp.
>>>>
>>>> Please suggest.
>>>>
>>>>> Looking forward to the comments from Segher/David/Peter/Mike etc.
>>>
>>> Still looking forward. :)
>>>
>>> BR,
>>> Kewen
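
For reference, the register-number restriction discussed in this thread (the second load landing in vs45 where vs33 was expected) can be sketched as a small predicate. This is only an illustration, not GCC code: the function name vsx_regs_form_pair is hypothetical, and it assumes the usual reading of the Power ISA that lxvp writes an even-numbered VSX register and its odd successor, with VSX registers numbered 0..63.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of GCC.  Returns true when VSX registers
   `first` and `second` (numbered 0..63) could be the two halves of a single
   lxvp destination: an even register followed by its odd neighbour.
   The even/odd requirement is an assumption taken from the Power ISA
   description of lxvp, not something stated in the thread.  */
bool
vsx_regs_form_pair (int first, int second)
{
  /* Reject numbers outside the VSX register file.  */
  if (first < 0 || first > 62 || second < 0 || second > 63)
    return false;

  /* The pair must start on an even register and be adjacent.  */
  return (first % 2 == 0) && (second == first + 1);
}
```

With the numbers quoted above, vsx_regs_form_pair (32, 33) holds, while vsx_regs_form_pair (32, 45) does not, which is the case reported when the pass runs before pass_ira.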