On Thu, Aug 19, 2021 at 10:14 PM Thomas Schwinge <tho...@codesourcery.com> wrote: > > Hi! > > Richard, maybe you have an opinion here, in particular about my > "SLP vectorizer" comment below? Please see > <87r1f2puss.fsf@euler.schwinge.homeip.net">http://mid.mail-archive.com/87r1f2puss.fsf@euler.schwinge.homeip.net> > for the full context. > > On 2021-08-16T10:21:04+0200, Jakub Jelinek <ja...@redhat.com> wrote: > > On Mon, Aug 16, 2021 at 10:08:42AM +0200, Thomas Schwinge wrote: > >> /* Build COMPONENT_REF and set TREE_THIS_VOLATILE and TREE_READONLY on it > >> as appropriate. */ > >> > >> tree > >> omp_build_component_ref (tree obj, tree field) > >> { > >> + tree field_type = TREE_TYPE (field); > >> + tree obj_type = TREE_TYPE (obj); > >> + if (!ADDR_SPACE_GENERIC_P (TYPE_ADDR_SPACE (obj_type))) > >> + field_type > >> + = build_qualified_type (field_type, > >> + KEEP_QUAL_ADDR_SPACE (TYPE_QUALS (obj_type))); > > (For later reference: "Kwok's new code" here is to propagate to > 'field_type' any non-generic address space of 'obj_type'.) > > |> Concerning the current 'gcc/omp-low.c:omp_build_component_ref', for the > |> current set of offloading testcases, we never see a > |> '!ADDR_SPACE_GENERIC_P' there, so the address space handling doesn't seem > |> to be necessary there (but also won't do any harm: no-op). > > > > Are you sure this can't trigger? > > Say > > extern int __seg_fs a; > > > > void > > foo (void) > > { > > #pragma omp parallel private (a) > > a = 2; > > } > > That test case doesn't run into 'omp_build_component_ref' at all, > but I'm attaching an altered and extended variant that does, > "Add 'libgomp.c/address-space-1.c'". OK to push to master branch? > > In this case, 'omp_build_component_ref' called via host compilation > 'pass_lower_omp', it's the 'field_type' that has 'address-space-1', not > 'obj_type', so indeed Kwok's new code is a no-op: > > (gdb) call debug_tree(field_type) > <pointer_type 0x7ffff7686b28 > type <integer_type 0x7ffff7686498 int address-space-1 SI > size <integer_cst 0x7ffff7540f30 constant 32> > unit-size <integer_cst 0x7ffff7540f48 constant 4> > align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type > 0x7ffff7686498 precision:32 min <integer_cst 0x7ffff7540ee8 -2147483648> max > <integer_cst 0x7ffff7540f00 2147483647> > pointer_to_this <pointer_type 0x7ffff7686b28>> > unsigned DI > size <integer_cst 0x7ffff7540cf0 type <integer_type 0x7ffff75590a8 > bitsizetype> constant 64> > unit-size <integer_cst 0x7ffff7540d08 type <integer_type > 0x7ffff7559000 sizetype> constant 8> > align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type > 0x7ffff7686b28> > > (gdb) call debug_tree(obj_type) > <record_type 0x7ffff7686bd0 .omp_data_t.0 readonly DI > size <integer_cst 0x7ffff7540cf0 type <integer_type 0x7ffff75590a8 > bitsizetype> constant 64> > unit-size <integer_cst 0x7ffff7540d08 type <integer_type > 0x7ffff7559000 sizetype> constant 8> > align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type > 0x7ffff7686bd0 > fields <field_decl 0x7ffff7568428 a > type <pointer_type 0x7ffff7686b28 type <integer_type > 0x7ffff7686498 int address-space-1> > unsigned DI size <integer_cst 0x7ffff7540cf0 64> unit-size > <integer_cst 0x7ffff7540d08 8> > align:64 warn_if_not_align:0 symtab:0 alias-set -1 > canonical-type 0x7ffff7686b28> > unsigned DI /home/thomas/shared/gcc/omp/as.c:4:14 size > <integer_cst 0x7ffff7540cf0 64> unit-size <integer_cst 0x7ffff7540d08 8> > align:64 warn_if_not_align:0 offset_align 128 > offset <integer_cst 0x7ffff7540d20 constant 0> > bit-offset <integer_cst 0x7ffff7540d68 constant 0> context > <record_type 0x7ffff7686540 .omp_data_t.0>> reference_to_this <reference_type > 0x7ffff7686c78>> > > The case that Kwok's new code handles, however, is when 'obj_type' has a > non-generic address space, and then propagates that one to 'field_type'. > > For a similar OpenACC example, 'omp_build_component_ref' called via GCN > offloading compilation 'pass_omp_oacc_neuter_broadcast', we've got > without Kwok's new code: > > (gdb) call debug_tree(field_type) > <boolean_type 0x7ffff7550b28 bool public unsigned QI > size <integer_cst 0x7ffff754fa80 type <integer_type 0x7ffff75500a8 > bitsizetype> constant 8> > unit-size <integer_cst 0x7ffff754fa98 type <integer_type > 0x7ffff7550000 sizetype> constant 1> > align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type > 0x7ffff7550b28 precision:1 min <integer_cst 0x7ffff754fcd8 0> max > <integer_cst 0x7ffff754fd08 1>> > > (gdb) call debug_tree(obj_type) > <record_type 0x7ffff7631000 .oacc_ws_data_s.0 address-space-4 QI > size <integer_cst 0x7ffff754fa80 type <integer_type 0x7ffff75500a8 > bitsizetype> constant 8> > unit-size <integer_cst 0x7ffff754fa98 type <integer_type > 0x7ffff7550000 sizetype> constant 1> > align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type > 0x7ffff7631000 > fields <field_decl 0x7ffff762e260 _52 > type <boolean_type 0x7ffff7550b28 bool public unsigned QI size > <integer_cst 0x7ffff754fa80 8> unit-size <integer_cst 0x7ffff754fa98 1> > align:8 warn_if_not_align:0 symtab:0 alias-set -1 > canonical-type 0x7ffff7550b28 precision:1 min <integer_cst 0x7ffff754fcd8 0> > max <integer_cst 0x7ffff754fd08 1>> > unsigned QI <built-in>:0:0 size <integer_cst 0x7ffff754fa80 8> > unit-size <integer_cst 0x7ffff754fa98 1> > align:8 warn_if_not_align:0 offset_align 64 > offset <integer_cst 0x7ffff754f9c0 constant 0> > bit-offset <integer_cst 0x7ffff754fa08 constant 0> context > <record_type 0x7ffff7631000 .oacc_ws_data_s.0 address-space-4>> > pointer_to_this <pointer_type 0x7ffff7631498>> > > ..., and with Kwok's new code the 'address-space-4' of 'obj_type' is > propagated to 'field_type': > > (gdb) call debug_tree(field_type) > <boolean_type 0x7ffff7631540 bool address-space-4 unsigned QI > size <integer_cst 0x7ffff754fa80 type <integer_type 0x7ffff75500a8 > bitsizetype> constant 8> > unit-size <integer_cst 0x7ffff754fa98 type <integer_type > 0x7ffff7550000 sizetype> constant 1> > align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type > 0x7ffff7631540 precision:1 min <integer_cst 0x7ffff754fcd8 0> max > <integer_cst 0x7ffff754fd08 1>> > > I'm not familiar enough with these bits to tell whether Kwok's new code > is the right solution to this problem -- or if, for example, the problem > is rather in the SLP vectorizer, where the ICE seems to ultimately > emerge? > > Without (ICEs later) vs. with (works) Kwok's new code, we see the > 'a.xamdgcn-amdhsa.mkoffload.175t.slp1' dump change as follows (word-diff, > only additional '<address-space-4>', occasionally): > > [...] > {+<address-space-4>+} vector(2) long int * vectp.58; > {+<address-space-4>+} vector(2) long int * vectp_.oacc_worker_o.57; > {+<address-space-4>+} vector(2) int * vectp.56; > {+<address-space-4>+} vector(2) int * vectp_.oacc_worker_o.55; > [...] > {+<address-space-4>+} long int * _104; > [...] > {+<address-space-4>+} long int * _108; > [...] > <address-space-4> void * _350; > [...] > _350 = __builtin_gcn_single_copy_start (&.oacc_worker_o.6); > [...] > MEM <{+<address-space-4>+} vector(2) long int> [(long int > *)&.oacc_worker_o.6] = _101; > _108 = &.oacc_worker_o.6._22 + 16; > MEM <{+<address-space-4>+} vector(2) long int> [(long int *)_108] = > _100; > _104 = &.oacc_worker_o.6._22 + 32; > [...] > > For example, with Kwok's new code, '_108' ('<address-space-4> long int *') > is cast into '(long int *)' -- presumably synthesized in the SLP > vectorizer? Is that correct or shouldn't that cast also include > '<address-space-4>'? > > I see a similar issue has been fixed a while ago: r245772 (Git commit > c7d97b2846c5647a81548caa3264d77c0a595010) for PR79723 > "Another case of dropped gs: prefix", changing > 'gcc/tree-vect-stmts.c:get_vectype_for_scalar_type_and_size' as follows: > > + /* Re-attach the address-space qualifier if we canonicalized the scalar > + type. */ > + if (TYPE_ADDR_SPACE (orig_scalar_type) != TYPE_ADDR_SPACE (vectype)) > + return build_qualified_type > + (vectype, KEEP_QUAL_ADDR_SPACE (TYPE_QUALS > (orig_scalar_type))); > + > return vectype; > > (It looks a bit like the address space handling is quite fragile in GCC's > 'tree' types/interfaces? Do we have ideas about how to make that more > robust, less "bolt-on"?)
If in doubt always look at what RTL expansion does - it looks like set_mem_attributes expects the address-space qualifier to be present on the type or in case it is passed an object, on the type of the base, or in case of a dereference, on the pointed-to type of the pointer (and yes, that does look somewhat fragile). So it looks like the patch you refer to shouldn't fix anything and > + /* Re-attach the address-space qualifier if we canonicalized the scalar > + type. */ > + if (TYPE_ADDR_SPACE (orig_scalar_type) != TYPE_ADDR_SPACE (vectype)) > + return build_qualified_type > + (vectype, KEEP_QUAL_ADDR_SPACE (TYPE_QUALS > (orig_scalar_type))); looks incomplete. What you'd need to look for is MEM_REFs built by the vectorizer and the address-space information on the pointers, like generated from vect_create_data_ref_ptr. It might also be that data-ref analysis / SCEV looks through address-space qualifier changing casts and thus we pick up the wrong address-space in the end. What's the testcase that ICEs on trunk? > I did add a few 'assert's for non-generic address space to > 'gcc/tree-vect*', but have not yet located where things may be going > wrong. > > > > I think keeping the qual addr space here is the wrong thing to do, > > it should keep the other quals and clear the address space instead, > > the whole struct is going to be in generic addres space, isn't it? > > Correct for 'omp_build_component_ref' called via host compilation > 'pass_lower_omp', but in the case of 'omp_build_component_ref' called via > GCN offloading compilation 'pass_omp_oacc_neuter_broadcast', 'obj_type' > has a non-generic address space. > > However, regarding the former comment -- shouldn't we force generic > address space for all 'tree' types read in via LTO streaming for > offloading compilation? I assume that (in the general case) address > spaces are never compatible between host and offloading compilation? > For the attached "Add 'libgomp.c/address-space-1.c'", propagating the > '__seg_fs' address space across the offloading boundary (assuming I did > interpret the dumps correctly) doesn't seem to cause any problems, but > maybe it's problematic for other cases? (This is, however, a separate > issue from what I'm discussing here.) > > > >> + tree ret = build3 (COMPONENT_REF, field_type, obj, field, NULL); > >> + if (TREE_THIS_VOLATILE (field)) > >> + TREE_THIS_VOLATILE (ret) |= 1; > >> + if (TREE_READONLY (field)) > >> + TREE_READONLY (ret) |= 1; > > > > When touching these two, shouldn't it be better written as > > = 1; instead of |= 1; ? For a bitfield... > > Yes, that was just copied from the original > 'gcc/omp-general.c:omp_build_component_ref' -- but happy to simplify > that, of course. > > > Grüße > Thomas > > > ----------------- > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 > München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas > Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht > München, HRB 106955