Re: -fprofile-update=atomic vs. 32-bit architectures

Richard Biener via Gcc Tue, 08 Nov 2022 02:27:04 -0800

On Tue, Nov 8, 2022 at 7:22 AM Sebastian Huber
<[email protected]> wrote:
>
> On 05.11.22 12:18, Richard Biener wrote:
> > On Fri, Nov 4, 2022 at 9:28 AM Sebastian Huber
> > <[email protected]>  wrote:
> >> Hello,
> >>
> >> even recent 32-bit architectures such as RISC-V do not support 64-bit
> >> atomic operations.  Using -fprofile-update=atomic for the 32-bit RISC-V
> >> RV32GC ISA yields:
> >>
> >> warning: target does not support atomic profile update, single mode is
> >> selected
> >>
> >> For multi-threaded applications it is quite important to use atomic
> >> counter increments to get valid coverage data. I think this fall back is
> >> not really good. Maybe we should consider using this approach from Jakub
> >> Jelinek for 32-bit architectures lacking 64-bit atomic operations:
> >>
> >>     if (__atomic_add_fetch_4 ((unsigned int *) &val, 1, __ATOMIC_RELAXED)
> >> == 0)
> >>       __atomic_fetch_add_4 (((unsigned int *) &val) + 1, 1,
> >> __ATOMIC_RELAXED);
> >>
> >> https://patchwork.ozlabs.org/project/gcc/patch/[email protected]/#1447334
> >>
> >> Last year I added the TARGET_GCOV_TYPE_SIZE target hook to optionally
> >> reduce the gcov type size to 32 bits. I am not really sure if this was a
> >> good idea. Longer running executables may observe counter overflows
> >> leading to invalid coverage data. If someone wants atomic updates, then
> >> the updates should be atomic even if this means to use a library
> >> implementation (libatomic).
> >>
> >> What about the following approach if -fprofile-update=atomic is given:
> >>
> >> 1. Use 64-bit atomics if available.
> >>
> >> 2. Use
> >>
> >>     if (__atomic_add_fetch_4 ((unsigned int *) &val, 1, __ATOMIC_RELAXED)
> >> == 0)
> >>       __atomic_fetch_add_4 (((unsigned int *) &val) + 1, 1,
> >> __ATOMIC_RELAXED);
> >>
> >> if 32-bit atomics are available.
> >>
> >> 3. Else use a library call (libatomic).
> > sounds good, though a library call would really be prohibitly expensive?
>
> I someone wants to profile a multi-threaded application and selects
> -fprofile-update=atomic, then probably a correct result is preferred.
> You still have the option to select -fprofile-update=prefer-atomic.
>
> For 2. I have to modify:
>
> void
> gimple_gen_edge_profiler (int edgeno, edge e)
> {
>    tree one;
>
>    one = build_int_cst (gcov_type_node, 1);
>
>    if (flag_profile_update == PROFILE_UPDATE_ATOMIC)
>      {
>        /* __atomic_fetch_add (&counter, 1, MEMMODEL_RELAXED); */
>        tree addr = tree_coverage_counter_addr (GCOV_COUNTER_ARCS, edgeno);
>        tree f = builtin_decl_explicit (TYPE_PRECISION (gcov_type_node) > 32
>                                       ? BUILT_IN_ATOMIC_FETCH_ADD_8:
>                                       BUILT_IN_ATOMIC_FETCH_ADD_4);
>        gcall *stmt = gimple_build_call (f, 3, addr, one,
>                                        build_int_cst (integer_type_node,
>                                                       MEMMODEL_RELAXED));
>        gsi_insert_on_edge (e, stmt);
>      }
>
> Is
>
> if (WORDS_BIG_ENDIAN)
>
> the right way to check for big/little endian?


Yes.

> How do I get ((unsigned int *) &val) + 1 from tree addr?
>
> It would be great to have a code example for the construction of the "if
> (f()) f();".

I think for the function above we need to emit __atomic_fetch_add_8,
not the emulated form because we cannot insert the required control
flow (if (f()) f()) on an edge.  The __atomic_fetch_add_8 should then be
lowered after the instrumentation took place.

There's currently no helper to create a diamond so the canonical
way is to create a GIMPLE_COND, split the block after this stmt,
split the outgoing edge and then redirect edges to form a half-diamond.
move_sese_in_condition has most of that CFG manipulation (but
it performs sth different)

Richard.

> --
> embedded brains GmbH
> Herr Sebastian HUBER
> Dornierstr. 4
> 82178 Puchheim
> Germany
> email: [email protected]
> phone: +49-89-18 94 741 - 16
> fax:   +49-89-18 94 741 - 08
>
> Registergericht: Amtsgericht München
> Registernummer: HRB 157899
> Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
> Unsere Datenschutzerklärung finden Sie hier:
> https://embedded-brains.de/datenschutzerklaerung/

Re: -fprofile-update=atomic vs. 32-bit architectures

Reply via email to