On Tue, Nov 8, 2022 at 7:22 AM Sebastian Huber <sebastian.hu...@embedded-brains.de> wrote: > > On 05.11.22 12:18, Richard Biener wrote: > > On Fri, Nov 4, 2022 at 9:28 AM Sebastian Huber > > <sebastian.hu...@embedded-brains.de> wrote: > >> Hello, > >> > >> even recent 32-bit architectures such as RISC-V do not support 64-bit > >> atomic operations. Using -fprofile-update=atomic for the 32-bit RISC-V > >> RV32GC ISA yields: > >> > >> warning: target does not support atomic profile update, single mode is > >> selected > >> > >> For multi-threaded applications it is quite important to use atomic > >> counter increments to get valid coverage data. I think this fall back is > >> not really good. Maybe we should consider using this approach from Jakub > >> Jelinek for 32-bit architectures lacking 64-bit atomic operations: > >> > >> if (__atomic_add_fetch_4 ((unsigned int *) &val, 1, __ATOMIC_RELAXED) > >> == 0) > >> __atomic_fetch_add_4 (((unsigned int *) &val) + 1, 1, > >> __ATOMIC_RELAXED); > >> > >> https://patchwork.ozlabs.org/project/gcc/patch/19c4a81d-6ecd-8c6e-b641-e257c1959...@suse.cz/#1447334 > >> > >> Last year I added the TARGET_GCOV_TYPE_SIZE target hook to optionally > >> reduce the gcov type size to 32 bits. I am not really sure if this was a > >> good idea. Longer running executables may observe counter overflows > >> leading to invalid coverage data. If someone wants atomic updates, then > >> the updates should be atomic even if this means to use a library > >> implementation (libatomic). > >> > >> What about the following approach if -fprofile-update=atomic is given: > >> > >> 1. Use 64-bit atomics if available. > >> > >> 2. Use > >> > >> if (__atomic_add_fetch_4 ((unsigned int *) &val, 1, __ATOMIC_RELAXED) > >> == 0) > >> __atomic_fetch_add_4 (((unsigned int *) &val) + 1, 1, > >> __ATOMIC_RELAXED); > >> > >> if 32-bit atomics are available. > >> > >> 3. Else use a library call (libatomic). > > sounds good, though a library call would really be prohibitly expensive? > > I someone wants to profile a multi-threaded application and selects > -fprofile-update=atomic, then probably a correct result is preferred. > You still have the option to select -fprofile-update=prefer-atomic. > > For 2. I have to modify: > > void > gimple_gen_edge_profiler (int edgeno, edge e) > { > tree one; > > one = build_int_cst (gcov_type_node, 1); > > if (flag_profile_update == PROFILE_UPDATE_ATOMIC) > { > /* __atomic_fetch_add (&counter, 1, MEMMODEL_RELAXED); */ > tree addr = tree_coverage_counter_addr (GCOV_COUNTER_ARCS, edgeno); > tree f = builtin_decl_explicit (TYPE_PRECISION (gcov_type_node) > 32 > ? BUILT_IN_ATOMIC_FETCH_ADD_8: > BUILT_IN_ATOMIC_FETCH_ADD_4); > gcall *stmt = gimple_build_call (f, 3, addr, one, > build_int_cst (integer_type_node, > MEMMODEL_RELAXED)); > gsi_insert_on_edge (e, stmt); > } > > Is > > if (WORDS_BIG_ENDIAN) > > the right way to check for big/little endian?
Yes. > How do I get ((unsigned int *) &val) + 1 from tree addr? > > It would be great to have a code example for the construction of the "if > (f()) f();". I think for the function above we need to emit __atomic_fetch_add_8, not the emulated form because we cannot insert the required control flow (if (f()) f()) on an edge. The __atomic_fetch_add_8 should then be lowered after the instrumentation took place. There's currently no helper to create a diamond so the canonical way is to create a GIMPLE_COND, split the block after this stmt, split the outgoing edge and then redirect edges to form a half-diamond. move_sese_in_condition has most of that CFG manipulation (but it performs sth different) Richard. > -- > embedded brains GmbH > Herr Sebastian HUBER > Dornierstr. 4 > 82178 Puchheim > Germany > email: sebastian.hu...@embedded-brains.de > phone: +49-89-18 94 741 - 16 > fax: +49-89-18 94 741 - 08 > > Registergericht: Amtsgericht München > Registernummer: HRB 157899 > Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler > Unsere Datenschutzerklärung finden Sie hier: > https://embedded-brains.de/datenschutzerklaerung/