On Fri, Aug 28, 2020 at 10:52 AM Jakub Jelinek <[email protected]> wrote:
>
> On Fri, Aug 28, 2020 at 08:47:06AM +0200, Richard Biener via Gcc-patches
> wrote:
> > IIRC elsewhere it was discussed to use ld to perform merging by
> > emitting separate rodata sections for constant sizes (4, 8, 16, 32, 64
> > byte sizes).
>
> ld does that already, and gcc too.
>
> > ld could always direct 8 byte constant refs to the larger pools (sub-)entry.
>
> But there is no way to express in ELF that something like that would be
> acceptable.
Hmm, I see.
> I meant something like the following, which on e.g. a dumb:
>
> typedef float V __attribute__((vector_size (4 * sizeof (float))));
>
> void
> foo (V *p, float *q)
> {
> p[0] += (V) { 1.0f, 2.0f, 3.0f, 4.0f };
> q[0] += 4.0f;
> q[1] -= 3.0f;
> q[17] -= 2.0f;
> q[31] += 1.0f;
> }
>
> testcase merges all the 4 scalar constant pool entries into the CONST_VECTOR
> one.
>
> I'm punting for section anchors and not doing it in the per-function (i.e.
> non-shared) constant pools simply because I don't know them well enough,
> don't know whether backends use the offsets for something etc.
> For section anchors, I guess it would need to be done before (re)computing the
> offsets and arrange for the desc->mark < 0 entries not to be considered as
> objects in the object block, for non-shared pools, perhaps it would be
> enough to call the new function from output_constant_pool before calling
> recompute_pool_offsets and adjust recompute_pool_offsets to ignore
> desc->mark < 0.
Guess this would work indeed. It's probably quite common to have
both vector and non-vector constants because of vectorization
and scalar epilogues. But note that elsewhere we're using
the largest component mode to emit vector constant pool entries
to share { -1, -1, -1, -1 } and {-1l, -1l } for example so while the
below code works for FP modes it likely will break down for
integer modes?
Richard.
> 2020-08-28 Jakub Jelinek <[email protected]>
>
> * varasm.c (output_constant_pool_contents): Emit desc->mark < 0
> entries as aliases.
> (optimize_constant_pool): New function.
> (output_shared_constant_pool): Call it if TARGET_SUPPORTS_ALIASES.
>
> --- gcc/varasm.c.jj 2020-07-28 15:39:10.091755086 +0200
> +++ gcc/varasm.c 2020-08-28 10:38:10.207636849 +0200
> @@ -4198,7 +4198,27 @@ output_constant_pool_contents (struct rt
> class constant_descriptor_rtx *desc;
>
> for (desc = pool->first; desc ; desc = desc->next)
> - if (desc->mark)
> + if (desc->mark < 0)
> + {
> +#ifdef ASM_OUTPUT_DEF
> + const char *name = targetm.strip_name_encoding (XSTR (desc->sym, 0));
> + char label[256];
> + char buffer[256 + 32];
> + const char *p;
> +
> + ASM_GENERATE_INTERNAL_LABEL (label, "LC", ~desc->mark);
> + p = targetm.strip_name_encoding (label);
> + if (desc->offset)
> + {
> + sprintf (buffer, "%s+%ld", p, (long) (desc->offset));
> + p = buffer;
> + }
> + ASM_OUTPUT_DEF (asm_out_file, name, p);
> +#else
> + gcc_unreachable ();
> +#endif
> + }
> + else if (desc->mark)
> {
> /* If the constant is part of an object_block, make sure that
> the constant has been positioned within its block, but do not
> @@ -4216,6 +4236,52 @@ output_constant_pool_contents (struct rt
> }
> }
>
> +/* Attempt to optimize constant pool POOL. If it contains both CONST_VECTOR
> + constants and scalar constants with the values of CONST_VECTOR elements,
> + try to alias the scalar constants with the CONST_VECTOR elements. */
> +
> +static void
> +optimize_constant_pool (struct rtx_constant_pool *pool)
> +{
> + for (constant_descriptor_rtx *desc = pool->first; desc; desc = desc->next)
> + if (desc->mark > 0
> + && GET_CODE (desc->constant) == CONST_VECTOR
> + && VECTOR_MODE_P (desc->mode)
> + && GET_MODE_CLASS (desc->mode) != MODE_VECTOR_BOOL
> + && !(SYMBOL_REF_HAS_BLOCK_INFO_P (desc->sym)
> + && SYMBOL_REF_BLOCK (desc->sym))
> + && desc->labelno >= 0)
> + {
> + scalar_mode submode = GET_MODE_INNER (desc->mode);
> + unsigned int subalign = MIN (desc->align, GET_MODE_BITSIZE (submode));
> + int units = GET_MODE_NUNITS (desc->mode);
> +
> + for (int i = 0; i < units; i++)
> + {
> + if (i != 0
> + && rtx_equal_p (CONST_VECTOR_ELT (desc->constant, i),
> + CONST_VECTOR_ELT (desc->constant, i - 1)))
> + continue;
> +
> + constant_descriptor_rtx tmp;
> + tmp.constant = CONST_VECTOR_ELT (desc->constant, i);
> + tmp.mode = submode;
> + hashval_t hash = const_rtx_hash (tmp.constant);
> + constant_descriptor_rtx *eldesc
> + = pool->const_rtx_htab->find_with_hash (&tmp, hash);
> + if (eldesc
> + && eldesc->mark > 0
> + && eldesc->align <= subalign
> + && !(SYMBOL_REF_HAS_BLOCK_INFO_P (eldesc->sym)
> + && SYMBOL_REF_BLOCK (eldesc->sym)))
> + {
> + eldesc->mark = ~desc->labelno;
> + eldesc->offset = i * GET_MODE_SIZE (submode);
> + }
> + }
> + }
> +}
> +
> /* Mark all constants that are used in the current function, then write
> out the function's private constant pool. */
>
> @@ -4251,6 +4317,9 @@ output_constant_pool (const char *fnname
> void
> output_shared_constant_pool (void)
> {
> + if (TARGET_SUPPORTS_ALIASES)
> + optimize_constant_pool (shared_constant_pool);
> +
> output_constant_pool_contents (shared_constant_pool);
> }
>
>
>
> Jakub
>