On Mon, Jul 3, 2023 at 5:39 AM Kewen.Lin <li...@linux.ibm.com> wrote:
>
> Hi Richi,
>
> Thanks for your review comments on this and some others!
>
> on 2023/6/30 19:37, Richard Biener wrote:
> > On Tue, Jun 13, 2023 at 4:07 AM Kewen Lin <li...@linux.ibm.com> wrote:
> >>
> >> This patch series follows Richi's suggestion at the link [1], which
> >> suggests structuring vectorizable_load to put the costing next to the
> >> transform, in order to make it easier to keep costing and the transform
> >> in sync. For now, it's a known issue that what we cost can be
> >> inconsistent with what we transform, as the case in PR82255 and some
> >> other associated test cases in the patches of this series show.
> >>
> >> Basically this patch series makes costing no longer call the function
> >> vect_model_load_cost. To make review and bisection easy, I organized
> >> the changes according to the memory access types of a vector load. For
> >> each memory access type, it first follows the handling in
> >> vect_model_load_cost to avoid missing anything, then refines it further
> >> by referring to the transform code; I also checked the result with some
> >> typical test cases. Hope the subjects of the patches are clear enough.
> >>
> >> The whole series can be bootstrapped and regtested incrementally on:
> >>   - x86_64-redhat-linux
> >>   - aarch64-linux-gnu
> >>   - powerpc64-linux-gnu P7, P8 and P9
> >>   - powerpc64le-linux-gnu P8, P9 and P10
> >>
> >> Considering that the current vector test buckets are mainly tested
> >> without the cost model, I also verified that the whole patch series was
> >> neutral for SPEC2017 int/fp on Power9 at O2, O3 and Ofast separately.
> >
> > I went through the series now and I like it overall (well, I suggested
> > the change). Looking at the changes I think we want some follow-up to
> > reduce the mess in the final loop nest. We already have some VMAT_*
> > cases handled separately, maybe we can split out some more cases. Maybe
> > we should
>
> At first glance, the simple parts look to be the handling for
> VMAT_LOAD_STORE_LANES and VMAT_GATHER_SCATTER (with ifn and emulated).
> It seems fairly straightforward if it's fine to duplicate the nested loop,
> but we may need to take care to remove some then-useless code.
>
> > bite the bullet and duplicate that loop nest for the different VMAT_*
> > cases. Maybe we can merge some of the if (!costing_p) checks by clever
> > re-ordering.
>
> I've tried a bit to merge them where possible, like the place that checks
> VMAT_CONTIGUOUS, VMAT_CONTIGUOUS_REVERSE and VMAT_CONTIGUOUS_PERMUTE,
> but I will keep this in mind for the following updates.
>
> > So what this series doesn't improve is the overall readability of the
> > code (indent and our 80 char line limit).
>
> Sorry about that.
>
> > The change also makes it more difficult(?) to separate analysis and
> > transform, though in the end I hope that analysis will actually "code
> > generate" to a (SLP) data structure so the target will have a chance to
> > see the actual flow of insns.
> >
> > That said, I'd like to hear from Richard whether he thinks this is a
> > step in the right direction.
> >
> > Are you willing to follow up with doing the same re-structuring to
> > vectorizable_store?
>
> Yes, vectorizable_store was also pointed out in your original suggestion
> [1]; I plan to update it once this series meets your expectations and
> gets landed.
>
> > OK from my side with the few comments addressed. The patch likely needs
> > a refresh after the RVV changes in this area?
>
> Thanks! Yes, I've updated 2/9 and 3/9 according to your comments, and
> updated 5/9 and 9/9 as they had some conflicts when rebasing. Re-testing
> is ongoing; do the updated versions look good to you? Is this series OK
> for trunk if all the test runs go well again as before?
Yes.

Thanks,
Richard.

> BR,
> Kewen
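
As an aside for readers new to this area, below is a minimal, self-contained
sketch of the "costing next to the transform" shape discussed above: one loop
nest that branches on the memory-access kind, with the costing path sitting
right beside the code-generation path under a single costing_p flag, so the
two cannot silently drift apart. The enum values, cost numbers and helper
names are invented for illustration and are not GCC's actual
vectorizable_load interfaces.

#include <stdio.h>
#include <stdbool.h>

/* Illustrative stand-ins; not GCC's real memory-access-type enum.  */
enum access_kind { ACC_CONTIGUOUS, ACC_STRIDED, ACC_GATHER };

/* Keep the costing decision next to the code generation for each access
   kind: when COSTING_P is true only costs are accumulated, otherwise the
   corresponding code is "emitted".  */
static int
process_load (enum access_kind kind, int ncopies, bool costing_p)
{
  int cost = 0;
  for (int j = 0; j < ncopies; j++)
    switch (kind)
      {
      case ACC_CONTIGUOUS:
        if (costing_p)
          cost += 1;                  /* one vector load per copy */
        else
          printf ("  emit contiguous vector load %d\n", j);
        break;
      case ACC_STRIDED:
        if (costing_p)
          cost += 4 + 1;              /* scalar loads + vector construction */
        else
          printf ("  emit 4 scalar loads + build vector %d\n", j);
        break;
      case ACC_GATHER:
        if (costing_p)
          cost += 2;                  /* gather ifn or emulation */
        else
          printf ("  emit gather load %d\n", j);
        break;
      }
  return cost;
}

int
main (void)
{
  int cost = process_load (ACC_STRIDED, 2, true);  /* analysis: cost only */
  printf ("estimated cost: %d\n", cost);
  process_load (ACC_STRIDED, 2, false);            /* transform: emit only */
  return 0;
}

Duplicating the loop nest per VMAT_* case, as suggested in the thread, would
essentially hoist the switch out of the copy loop so that each access kind
gets its own, simpler loop, at the cost of some repeated boilerplate.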