Hi Richi,

on 2020/3/10 7:12 PM, Richard Biener wrote:
> On Tue, Mar 10, 2020 at 7:52 AM Kewen.Lin <li...@linux.ibm.com> wrote:
>>
>> Hi all,
>>
>> But how to teach it to be aware of this? Currently the processing starts
>> from bottom to up (from stores), can we do some analysis on the SLP
>> instance, detect some pattern and update the whole instance?
>
> In theory yes (Tamar had something like that for AARCH64 complex
> rotations IIRC).  And yes, the issue boils down to how we handle
> SLP discovery.  I'd like to improve SLP discovery but it's on my list
> only after I managed to get rid of the non-SLP code paths.  I have
> played with some ideas (even produced hackish patches) to find
> "seeds" to form SLP groups from using multi-level hashing of stmts.
> My plan is to rewrite SLP discovery completely, starting from a
> SLP graph that 1:1 reflects the SSA use-def graph (no groups
> formed yet) and then form groups from seeds and insert
> "connection" nodes (merging two subgraphs with less lanes
> into one with more lanes, splitting lanes, permuting lanes, etc.).
>

Nice!  Thanks for the information!  This improvement sounds big and
promising!  If we can discover SLP opportunities originating from loads,
this case won't be a big deal then.

> Currently I'm working on doing exactly this but only for SLP loads
> (because that's where it's most difficult...).

Could this case serve as an input for your work?  The isomorphic
computations are easy to detect starting from the loads.

>> A-way requires some additional vector permutations.  However, I thought
>> if the existing scheme can't get any SLP chances, it looks reasonable to
>> extend it to consider this A-way grouping.  Does it make sense?
>>
>> Another question is that even if we can go with A-way grouping, it can
>> only handle packing one byte (four iteration -> 4), what place is
>> reasonable to extend it to pack more?  How about scanning all leaf
>> nodes and consider/transform them together?  too hacky?
>
> Well, not sure - in the end it will all be heuristics since otherwise
> the exploration space is too big.  But surely concentrating on
> load/store groups is good.
>

Totally agreed.  This hacky idea originated from the existing code; if SLP
discovery improves, I think it becomes unnecessary.

> The SLP discovery code is already quite complicated btw., I'd
> hate to add "unstructured" hacks ontop of it right now without
> future design goals in mind.

OK.  Looking forward to its landing!

BR,
Kewen
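
For readers following the thread, the idea of finding "seeds" by hashing
statements at several levels can be illustrated with a small sketch.  This
is not GCC code and does not reflect the actual patches mentioned above;
the Stmt class, shape_key and find_seeds are hypothetical names, and the
toy IR is an assumption made purely for illustration.  Statements whose
expression trees have the same shape down to a chosen depth land in the
same bucket, and each bucket with at least two members is a candidate
group of isomorphic lanes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """Toy SSA statement (hypothetical): result = opcode(operands...)."""
    result: str
    opcode: str
    operands: tuple = ()


def shape_key(stmt, defs, depth):
    """Key describing the expression tree rooted at stmt, `depth` levels deep."""
    if depth == 0:
        return (stmt.opcode,)
    children = tuple(
        # Operands without a known definition are treated as opaque leaves.
        shape_key(defs[op], defs, depth - 1) if op in defs else ("leaf",)
        for op in stmt.operands
    )
    return (stmt.opcode, children)


def find_seeds(stmts, depth=1, min_lanes=2):
    """Bucket statements by shape; buckets with enough members are seeds."""
    defs = {s.result: s for s in stmts}
    buckets = defaultdict(list)
    for s in stmts:
        buckets[shape_key(s, defs, depth)].append(s.result)
    return [group for group in buckets.values() if len(group) >= min_lanes]


stmts = [
    Stmt("a0", "load"), Stmt("b0", "load"),
    Stmt("a1", "load"), Stmt("b1", "load"),
    Stmt("s0", "add", ("a0", "b0")),
    Stmt("s1", "add", ("a1", "b1")),
    Stmt("d0", "sub", ("a0", "b0")),
]
seeds = find_seeds(stmts, depth=1)
print(seeds)
```

Here the four loads form one candidate group and the two adds fed by loads
form another, while the single sub has no partner and is dropped.  Raising
depth makes the key stricter, which is one way to read "multi-level"; a
real implementation would also have to consider types, memory adjacency
and permutations, none of which this sketch models.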