Hi! On Thu, 5 Sep 2013 18:11:05 +0200, Jakub Jelinek <ja...@redhat.com> wrote: > 3) I figured out we need to tell the runtime library not just > address, size and kind, but also alignment (we won't need that for > the #pragma omp declare target global vars though), so that the > runtime library can properly align it. As TYPE_ALIGN/DECL_ALIGN > is in bits and is 32 bit wide, when that is in bytes and we only care > about power of twos, I've decided to encode it in the upper 5 bits > of the kind (lower 3 bits are used for OMP_CLAUSE_MAP_* kind).
Unfortunately, this scheme breaks down with OpenACC: we need an additional bit to codify a flag for present_or_* map clauses (meaning: only map the data (allocate/to/from/tofrom, as for OpenMP) if not already present on the device). With five bits available for the OpenMP case, we can describe alignments up to 2 GiB, and I've empirically found on my development system that the largest possible alignment is MAX_OFILE_ALIGNMENT, 256 MiB for ELF systems, so that's fine. But with only four bits available, we get to describe alignments up to 1 << ((1 << 4) - 1) = 32 KiB, which is too small -- even though it'd be fine for "normal" usage of __attribute__ ((aligned (x))). So it seems our options are to use a bigger datatype for the kinds array, to split off from the kinds array a new alignments array, or to generally switch to using an array of a struct containing hostaddr, size, alignment, kind. The latter would require additional changes in the child_fn. As it's an ABI change no matter what, would you like to see this limited to OpenACC? Changing it also for OpenMP's GOMP_target would have the advantage to have them not diverge (especially at the generating side in omp-low.c's lowering functions), but I'm not sure whether such an ABI change would easily be possible now, with the OpenMP 4 support merged into trunk -- though, it is not yet part of a regular GCC release? > --- gcc/omp-low.c.jj 2013-09-05 09:19:03.000000000 +0200 > +++ gcc/omp-low.c 2013-09-05 17:11:14.693638660 +0200 > @@ -9342,6 +9349,11 @@ lower_omp_target (gimple_stmt_iterator * | unsigned char tkind = 0; | switch (OMP_CLAUSE_CODE (c)) | { | case OMP_CLAUSE_MAP: | tkind = OMP_CLAUSE_MAP_KIND (c); | break; | case OMP_CLAUSE_TO: | tkind = OMP_CLAUSE_MAP_TO; | break; | case OMP_CLAUSE_FROM: | tkind = OMP_CLAUSE_MAP_FROM; | break; > default: > gcc_unreachable (); > } > + unsigned int talign = TYPE_ALIGN_UNIT (TREE_TYPE (ovar)); > + if (DECL_P (ovar) && DECL_ALIGN_UNIT (ovar) > talign) > + talign = DECL_ALIGN_UNIT (ovar); > + talign = ceil_log2 (talign); > + tkind |= talign << 3; > CONSTRUCTOR_APPEND_ELT (vkind, purpose, > build_int_cst (unsigned_char_type_node, > tkind)); The use of OMP_CLAUSE_MAP_* on the generating and integer numerals on the receiving (libgomp) side is a bit unesthetic, likewise for the hard-coded 3 in the bit shift. What would be the standard GCC way of sharing a description of the tkind layout between gcc/omp-low.c and libgomp/target.c? Are we allowed to #include (a new header file) libgomp/target.h from gcc/omp-low.c? To avoid silent breakage should alignments bigger than 2 GiB be allowed in a distant future, would a check like the following be appropriate? --- gcc/omp-low.c +++ gcc/omp-low.c @@ -10378,6 +10383,11 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) unsigned int talign = TYPE_ALIGN_UNIT (TREE_TYPE (ovar)); if (DECL_P (ovar) && DECL_ALIGN_UNIT (ovar) > talign) talign = DECL_ALIGN_UNIT (ovar); + const unsigned int talign_max + = 1 << ((1 << (BITS_PER_UNIT - 3)) - 1); + if (talign > talign_max) + sorry ("can't encode alignment of %u bytes, which is bigger than " + "%u bytes", talign, talign_max); talign = ceil_log2 (talign); tkind |= talign << 3; CONSTRUCTOR_APPEND_ELT (vkind, purpose, Grüße, Thomas
pgp_bUwzkKWWX.pgp
Description: PGP signature