On Thu, 23 Jun 2016, Jakub Jelinek wrote:

> Hi!
>
> This PR is about 2 issues with the __atomic_compare_exchange_* APIs, which
> didn't exist with __sync_*_compare_and_swap:
> 1) the APIs make the expected argument addressable, although it is very
>    common that it is an automatic variable that is addressable only
>    because of these APIs
> 2) for fear that expected might be a pointer to memory accessed by
>    multiple threads, the store of the oldval to that location is only
>    conditional (done if the compare and swap failed) - while again, for
>    the common case when it is a local, otherwise non-addressable automatic
>    var, it can be stored unconditionally.
>
> To resolve this, we effectively need a call (or some other stmt) that
> returns two values.  We need that also for the __builtin_*_overflow*
> builtins and have solved it there by returning a complex int value from an
> internal-fn call, where the REALPART_EXPR of it is one result and the
> IMAGPART_EXPR the other (bool-ish) result.
>
> The following patch handles it the same way, by folding
> __atomic_compare_exchange_N early into an internal call (with a conditional
> store in the IL), and then later on, if the expected var becomes
> non-addressable and is rewritten into SSA, optimizing the conditional store
> into an unconditional one (that is the gimple-fold.c part).
>
> Thinking about this again, there could be another option - keep
> __atomic_compare_exchange_N in the IL, but under certain conditions (similar
> to what the patch uses in fold_builtin_atomic_compare_exchange) ignore the
> &var on the second argument of these builtins, and if we actually turn var
> into non-addressable, convert the builtin call similarly to what
> fold_builtin_atomic_compare_exchange does in the patch (except the store
> would be unconditional then; the gimple-fold.c part wouldn't be needed
> then).
>
> Any preferences?
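For concreteness, a minimal sketch of the common pattern described above -
the function name and body below are made up for illustration rather than
taken from the attached testcases, though the mentioned f1/f2 presumably
look similar.  The expected variable is addressable only because
__atomic_compare_exchange_n needs its address, and the old value is stored
back into it only when the compare-and-swap fails:

  /* Hypothetical illustration (not from the attached testcases): 'expected'
     is an automatic variable that is addressable only because
     __atomic_compare_exchange_n requires a pointer to it, and the builtin
     stores the value seen in *p back into it only on failure.  With the
     proposed folding this becomes an IFN_ATOMIC_COMPARE_EXCHANGE call
     returning a _Complex integer whose IMAGPART_EXPR is the success flag
     and whose REALPART_EXPR is the value loaded from *p, so once 'expected'
     is rewritten into SSA the conditional store back can be replaced by an
     unconditional use of the REALPART_EXPR.  */
  int
  f (int *p, int desired)
  {
    int expected = 0;
    if (__atomic_compare_exchange_n (p, &expected, desired, 0 /* weak */,
                                     __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
      return -1;
    return expected;  /* On failure, the value that was seen in *p.  */
  }

As quoted further down, on x86_64-linux the corresponding improvement in
f1/f2 at -O2 is the removal of the dead movl $0, -4(%rsp) stack store.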
I wonder if always expanding from the internal-fn would eventually generate
worse code when the value is addressable anyway.  If that is the case, then
doing it conditionally on the arg becoming non-addressable (in
update-address-taken, I guess) would be preferred.  Otherwise you can't use
immediate uses from gimple-fold; you have to put the transform into
tree-ssa-forwprop.c instead (gimple-fold can only rely on up-to-date use-def
chains, stmt operands are _not_ reliably up-to-date).

Thanks,
Richard.

> This version has been bootstrapped/regtested on
> x86_64-linux and i686-linux.  Attached are various testcases I've been using
> to see if the generated code improved (tried x86_64, powerpc64le, s390x and
> aarch64).  E.g. on x86_64-linux, in the first testcase at -O2 the
> improvement in f1/f2 is removal of dead
>         movl    $0, -4(%rsp)
> in f4
> -       movl    $0, -4(%rsp)
>         lock; cmpxchgl  %edx, (%rdi)
> -       je      .L7
> -       movl    %eax, -4(%rsp)
> -.L7:
> -       movl    -4(%rsp), %eax
> etc.
>
> 2016-06-23  Jakub Jelinek  <ja...@redhat.com>
>
>         PR middle-end/66867
>         * builtins.c: Include gimplify.h.
>         (expand_ifn_atomic_compare_exchange_into_call,
>         expand_ifn_atomic_compare_exchange,
>         fold_builtin_atomic_compare_exchange): New functions.
>         (fold_builtin_varargs): Handle BUILT_IN_ATOMIC_COMPARE_EXCHANGE_*.
>         * internal-fn.c (expand_ATOMIC_COMPARE_EXCHANGE): New function.
>         * tree.h (build_call_expr_internal_loc): Rename to ...
>         (build_call_expr_internal_loc_array): ... this.  Fix up type of
>         last argument.
>         * internal-fn.def (ATOMIC_COMPARE_EXCHANGE): New internal fn.
>         * predict.c (expr_expected_value_1): Handle IMAGPART_EXPR of
>         ATOMIC_COMPARE_EXCHANGE result.
>         * builtins.h (expand_ifn_atomic_compare_exchange): New prototype.
>         * gimple-fold.c (fold_ifn_atomic_compare_exchange): New function.
>         (gimple_fold_call): Handle IFN_ATOMIC_COMPARE_EXCHANGE.
>
>         * gfortran.dg/coarray_atomic_4.f90: Add -O0 to dg-options.
>
> --- gcc/builtins.c.jj   2016-06-08 21:01:25.000000000 +0200
> +++ gcc/builtins.c      2016-06-23 09:17:51.053713986 +0200
> @@ -65,6 +65,7 @@ along with GCC; see the file COPYING3.
>  #include "internal-fn.h"
>  #include "case-cfn-macros.h"
>  #include "gimple-fold.h"
> +#include "gimplify.h"
>
>
>  struct target_builtins default_target_builtins;
> @@ -5158,6 +5159,123 @@ expand_builtin_atomic_compare_exchange (
>    return target;
>  }
>
> +/* Helper function for expand_ifn_atomic_compare_exchange - expand
> +   internal ATOMIC_COMPARE_EXCHANGE call into __atomic_compare_exchange_N
> +   call.  The weak parameter must be dropped to match the expected parameter
> +   list and the expected argument changed from value to pointer to memory
> +   slot.  */
> +
> +static void
> +expand_ifn_atomic_compare_exchange_into_call (gcall *call, machine_mode mode)
> +{
> +  unsigned int z;
> +  vec<tree, va_gc> *vec;
> +
> +  vec_alloc (vec, 5);
> +  vec->quick_push (gimple_call_arg (call, 0));
> +  tree expected = gimple_call_arg (call, 1);
> +  rtx x = assign_stack_temp_for_type (mode, GET_MODE_SIZE (mode),
> +                                      TREE_TYPE (expected));
> +  rtx expd = expand_expr (expected, x, mode, EXPAND_NORMAL);
> +  if (expd != x)
> +    emit_move_insn (x, expd);
> +  tree v = make_tree (TREE_TYPE (expected), x);
> +  vec->quick_push (build1 (ADDR_EXPR,
> +                           build_pointer_type (TREE_TYPE (expected)), v));
> +  vec->quick_push (gimple_call_arg (call, 2));
> +  /* Skip the boolean weak parameter.
*/ > + for (z = 4; z < 6; z++) > + vec->quick_push (gimple_call_arg (call, z)); > + built_in_function fncode > + = (built_in_function) ((int) BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1 > + + exact_log2 (GET_MODE_SIZE (mode))); > + tree fndecl = builtin_decl_explicit (fncode); > + tree fn = build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (fndecl)), > + fndecl); > + tree exp = build_call_vec (boolean_type_node, fn, vec); > + tree lhs = gimple_call_lhs (call); > + rtx boolret = expand_call (exp, NULL_RTX, lhs == NULL_TREE); > + if (lhs) > + { > + rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); > + if (GET_MODE (boolret) != mode) > + boolret = convert_modes (mode, GET_MODE (boolret), boolret, 1); > + x = force_reg (mode, x); > + write_complex_part (target, boolret, true); > + write_complex_part (target, x, false); > + } > +} > + > +/* Expand IFN_ATOMIC_COMPARE_EXCHANGE internal function. */ > + > +void > +expand_ifn_atomic_compare_exchange (gcall *call) > +{ > + int size = tree_to_shwi (gimple_call_arg (call, 3)) & 255; > + gcc_assert (size == 1 || size == 2 || size == 4 || size == 8 || size == > 16); > + machine_mode mode = mode_for_size (BITS_PER_UNIT * size, MODE_INT, 0); > + rtx expect, desired, mem, oldval, boolret; > + enum memmodel success, failure; > + tree lhs; > + bool is_weak; > + source_location loc > + = expansion_point_location_if_in_system_header (gimple_location (call)); > + > + success = get_memmodel (gimple_call_arg (call, 4)); > + failure = get_memmodel (gimple_call_arg (call, 5)); > + > + if (failure > success) > + { > + warning_at (loc, OPT_Winvalid_memory_model, > + "failure memory model cannot be stronger than success " > + "memory model for %<__atomic_compare_exchange%>"); > + success = MEMMODEL_SEQ_CST; > + } > + > + if (is_mm_release (failure) || is_mm_acq_rel (failure)) > + { > + warning_at (loc, OPT_Winvalid_memory_model, > + "invalid failure memory model for " > + "%<__atomic_compare_exchange%>"); > + failure = MEMMODEL_SEQ_CST; > + success = MEMMODEL_SEQ_CST; > + } > + > + if (!flag_inline_atomics) > + { > + expand_ifn_atomic_compare_exchange_into_call (call, mode); > + return; > + } > + > + /* Expand the operands. */ > + mem = get_builtin_sync_mem (gimple_call_arg (call, 0), mode); > + > + expect = expand_expr_force_mode (gimple_call_arg (call, 1), mode); > + desired = expand_expr_force_mode (gimple_call_arg (call, 2), mode); > + > + is_weak = (tree_to_shwi (gimple_call_arg (call, 3)) & 256) != 0; > + > + boolret = NULL; > + oldval = NULL; > + > + if (!expand_atomic_compare_and_swap (&boolret, &oldval, mem, expect, > desired, > + is_weak, success, failure)) > + { > + expand_ifn_atomic_compare_exchange_into_call (call, mode); > + return; > + } > + > + lhs = gimple_call_lhs (call); > + if (lhs) > + { > + rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); > + if (GET_MODE (boolret) != mode) > + boolret = convert_modes (mode, GET_MODE (boolret), boolret, 1); > + write_complex_part (target, boolret, true); > + write_complex_part (target, oldval, false); > + } > +} > + > /* Expand the __atomic_load intrinsic: > TYPE __atomic_load (TYPE *object, enum memmodel) > EXP is the CALL_EXPR. 
> @@ -9515,6 +9633,63 @@ fold_builtin_object_size (tree ptr, tree > return NULL_TREE; > } > > +/* Fold > + r = __atomic_compare_exchange_N (p, &e, d, w, s, f); > + into > + _Complex uintN_t t = ATOMIC_COMPARE_EXCHANGE (p, e, d, w * 256 + N, s, > f); > + i = IMAGPART_EXPR <t>; > + r = (_Bool) i; > + if (!b) > + e = REALPART_EXPR <t>; */ > + > +static tree > +fold_builtin_atomic_compare_exchange (location_t loc, tree fndecl, > + tree *args, int nargs) > +{ > + if (nargs != 6 > + || !flag_inline_atomics > + || !optimize > + || (flag_sanitize & (SANITIZE_THREAD | SANITIZE_ADDRESS)) != 0 > + || (!integer_zerop (args[3]) && !integer_onep (args[3]))) > + return NULL_TREE; > + > + tree argsc[6]; > + tree parmt = TYPE_ARG_TYPES (TREE_TYPE (fndecl)); > + tree itype = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (parmt))); > + machine_mode mode = TYPE_MODE (itype); > + > + if (direct_optab_handler (atomic_compare_and_swap_optab, mode) > + == CODE_FOR_nothing > + && optab_handler (sync_compare_and_swap_optab, mode) == > CODE_FOR_nothing) > + return NULL_TREE; > + > + tree ctype = build_complex_type (itype); > + tree alias_type = build_pointer_type_for_mode (itype, ptr_mode, true); > + tree alias_off = build_int_cst (alias_type, 0); > + tree expected = fold_build2_loc (loc, MEM_REF, itype, args[1], alias_off); > + memcpy (argsc, args, sizeof (argsc)); > + argsc[1] = expected; > + argsc[3] = build_int_cst (integer_type_node, > + (integer_onep (args[3]) ? 256 : 0) > + + int_size_in_bytes (itype)); > + tree var = create_tmp_var_raw (ctype); > + DECL_CONTEXT (var) = current_function_decl; > + tree call > + = build_call_expr_internal_loc_array (loc, IFN_ATOMIC_COMPARE_EXCHANGE, > + ctype, 6, argsc); > + var = build4 (TARGET_EXPR, ctype, var, call, NULL, NULL); > + tree ret > + = fold_convert_loc (loc, boolean_type_node, > + build1 (IMAGPART_EXPR, itype, var)); > + tree condstore > + = build3_loc (loc, COND_EXPR, void_type_node, ret, > + void_node, build2_loc (loc, MODIFY_EXPR, void_type_node, > + unshare_expr (expected), > + build1 (REALPART_EXPR, itype, var))); > + return build2_loc (loc, COMPOUND_EXPR, boolean_type_node, condstore, > + unshare_expr (ret)); > +} > + > /* Builtins with folding operations that operate on "..." arguments > need special handling; we need to store the arguments in a convenient > data structure before attempting any folding. Fortunately there are > @@ -9533,6 +9708,13 @@ fold_builtin_varargs (location_t loc, tr > ret = fold_builtin_fpclassify (loc, args, nargs); > break; > > + case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1: > + case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_2: > + case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4: > + case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_8: > + case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16: > + return fold_builtin_atomic_compare_exchange (loc, fndecl, args, nargs); > + > default: > break; > } > --- gcc/internal-fn.c.jj 2016-06-15 19:09:09.000000000 +0200 > +++ gcc/internal-fn.c 2016-06-22 15:30:06.838951934 +0200 > @@ -2143,6 +2143,14 @@ expand_ATOMIC_BIT_TEST_AND_RESET (intern > expand_ifn_atomic_bit_test_and (call); > } > > +/* Expand atomic bit test and set. */ > + > +static void > +expand_ATOMIC_COMPARE_EXCHANGE (internal_fn, gcall *call) > +{ > + expand_ifn_atomic_compare_exchange (call); > +} > + > /* Expand a call to FN using the operands in STMT. FN has a single > output operand and NARGS input operands. 
*/ > > --- gcc/tree.h.jj 2016-06-20 21:16:07.000000000 +0200 > +++ gcc/tree.h 2016-06-21 17:35:19.806362408 +0200 > @@ -3985,8 +3985,8 @@ extern tree build_call_expr_loc (locatio > extern tree build_call_expr (tree, int, ...); > extern tree build_call_expr_internal_loc (location_t, enum internal_fn, > tree, int, ...); > -extern tree build_call_expr_internal_loc (location_t, enum internal_fn, > - tree, int, tree *); > +extern tree build_call_expr_internal_loc_array (location_t, enum internal_fn, > + tree, int, const tree *); > extern tree maybe_build_call_expr_loc (location_t, combined_fn, tree, > int, ...); > extern tree build_string_literal (int, const char *); > --- gcc/internal-fn.def.jj 2016-05-03 13:36:50.000000000 +0200 > +++ gcc/internal-fn.def 2016-06-21 17:10:23.516879436 +0200 > @@ -193,6 +193,7 @@ DEF_INTERNAL_FN (SET_EDOM, ECF_LEAF | EC > DEF_INTERNAL_FN (ATOMIC_BIT_TEST_AND_SET, ECF_LEAF | ECF_NOTHROW, NULL) > DEF_INTERNAL_FN (ATOMIC_BIT_TEST_AND_COMPLEMENT, ECF_LEAF | ECF_NOTHROW, > NULL) > DEF_INTERNAL_FN (ATOMIC_BIT_TEST_AND_RESET, ECF_LEAF | ECF_NOTHROW, NULL) > +DEF_INTERNAL_FN (ATOMIC_COMPARE_EXCHANGE, ECF_LEAF | ECF_NOTHROW, NULL) > > #undef DEF_INTERNAL_INT_FN > #undef DEF_INTERNAL_FLT_FN > --- gcc/predict.c.jj 2016-06-22 11:17:44.763444374 +0200 > +++ gcc/predict.c 2016-06-22 14:26:08.894724088 +0200 > @@ -1978,6 +1978,25 @@ expr_expected_value_1 (tree type, tree o > if (TREE_CONSTANT (op0)) > return op0; > > + if (code == IMAGPART_EXPR) > + { > + if (TREE_CODE (TREE_OPERAND (op0, 0)) == SSA_NAME) > + { > + def = SSA_NAME_DEF_STMT (TREE_OPERAND (op0, 0)); > + if (is_gimple_call (def) > + && gimple_call_internal_p (def) > + && (gimple_call_internal_fn (def) > + == IFN_ATOMIC_COMPARE_EXCHANGE)) > + { > + /* Assume that any given atomic operation has low contention, > + and thus the compare-and-swap operation succeeds. */ > + if (predictor) > + *predictor = PRED_COMPARE_AND_SWAP; > + return build_one_cst (TREE_TYPE (op0)); > + } > + } > + } > + > if (code != SSA_NAME) > return NULL_TREE; > > --- gcc/builtins.h.jj 2016-05-03 13:36:50.000000000 +0200 > +++ gcc/builtins.h 2016-06-21 18:05:11.678635858 +0200 > @@ -72,6 +72,7 @@ extern tree std_canonical_va_list_type ( > extern void std_expand_builtin_va_start (tree, rtx); > extern void expand_builtin_trap (void); > extern void expand_ifn_atomic_bit_test_and (gcall *); > +extern void expand_ifn_atomic_compare_exchange (gcall *); > extern rtx expand_builtin (tree, rtx, rtx, machine_mode, int); > extern rtx expand_builtin_with_bounds (tree, rtx, rtx, machine_mode, int); > extern enum built_in_function builtin_mathfn_code (const_tree); > --- gcc/gimple-fold.c.jj 2016-06-16 21:00:08.000000000 +0200 > +++ gcc/gimple-fold.c 2016-06-23 11:48:48.081789706 +0200 > @@ -2980,6 +2980,236 @@ arith_overflowed_p (enum tree_code code, > return wi::min_precision (wres, sign) > TYPE_PRECISION (type); > } > > +/* Recognize: > + _Complex uintN_t t = ATOMIC_COMPARE_EXCHANGE (p, _1, d, w, s, f); > + r = IMAGPART_EXPR <t>; > + b = (_Bool) r; > + if (!b) > + _2 = REALPART_EXPR <t>; > + _3 = PHI<_1, _2>; > + and, because REALPART_EXPR <t> for !!b is necessarily equal to > + _1, use REALPART_EXPR <t> unconditionally. This happens when the > + expected argument of __atomic_compare_exchange* is addressable > + only because its address had to be passed to __atomic_compare_exchange*, > + but otherwise is a local variable. We don't need to worry about any > + race conditions in that case. 
*/ > + > +static bool > +fold_ifn_atomic_compare_exchange (gcall *call) > +{ > + tree lhs = gimple_call_lhs (call); > + imm_use_iterator imm_iter; > + use_operand_p use_p; > + > + if (cfun->cfg == NULL > + || lhs == NULL_TREE > + || TREE_CODE (lhs) != SSA_NAME > + || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) > + return false; > + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) > + { > + gimple *use_stmt = USE_STMT (use_p); > + if (is_gimple_debug (use_stmt)) > + continue; > + if (!is_gimple_assign (use_stmt)) > + break; > + if (gimple_assign_rhs_code (use_stmt) == REALPART_EXPR) > + continue; > + if (gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR) > + break; > + > + tree lhs2 = gimple_assign_lhs (use_stmt); > + if (TREE_CODE (lhs2) != SSA_NAME) > + continue; > + > + use_operand_p use2_p; > + gimple *use2_stmt; > + if (!single_imm_use (lhs2, &use2_p, &use2_stmt) > + || !gimple_assign_cast_p (use2_stmt)) > + continue; > + > + tree lhs3 = gimple_assign_lhs (use2_stmt); > + if (TREE_CODE (lhs3) != SSA_NAME) > + continue; > + > + imm_use_iterator imm3_iter; > + use_operand_p use3_p; > + FOR_EACH_IMM_USE_FAST (use3_p, imm3_iter, lhs3) > + { > + gimple *use3_stmt = USE_STMT (use3_p); > + if (gimple_code (use3_stmt) != GIMPLE_COND > + || gimple_cond_lhs (use3_stmt) != lhs3 > + || !integer_zerop (gimple_cond_rhs (use3_stmt)) > + || (gimple_cond_code (use3_stmt) != EQ_EXPR > + && gimple_cond_code (use3_stmt) != NE_EXPR)) > + continue; > + > + basic_block bb = gimple_bb (use3_stmt); > + basic_block bb1, bb2; > + edge e1, e2; > + > + e1 = EDGE_SUCC (bb, 0); > + bb1 = e1->dest; > + e2 = EDGE_SUCC (bb, 1); > + bb2 = e2->dest; > + > + /* We cannot do the optimization on abnormal edges. */ > + if ((e1->flags & EDGE_ABNORMAL) != 0 > + || (e2->flags & EDGE_ABNORMAL) != 0 > + || bb1 == NULL > + || bb2 == NULL) > + continue; > + > + /* Find the bb which is the fall through to the other. */ > + if (single_succ_p (bb1) && single_succ (bb1) == bb2) > + ; > + else if (single_succ_p (bb2) && single_succ (bb2) == bb1) > + { > + std::swap (bb1, bb2); > + std::swap (e1, e2); > + } > + else > + continue; > + > + e1 = single_succ_edge (bb1); > + > + /* Make sure that bb1 is just a fall through. */ > + if ((e1->flags & EDGE_FALLTHRU) == 0) > + continue; > + > + /* Make sure bb1 is executed if b is the atomic operation > + failed. */ > + if ((gimple_cond_code (use3_stmt) == NE_EXPR) > + ^ ((e2->flags & EDGE_TRUE_VALUE) != 0)) > + continue; > + > + /* Also make sure that bb1 only have one predecessor and that it > + is bb. 
*/ > + if (!single_pred_p (bb1) || single_pred (bb1) != bb) > + continue; > + > + gimple_stmt_iterator gsi = gsi_start_nondebug_after_labels_bb (bb1); > + if (gsi_end_p (gsi)) > + continue; > + > + gimple *rp_stmt = gsi_stmt (gsi); > + if (!is_gimple_assign (rp_stmt) > + || gimple_assign_rhs_code (rp_stmt) != REALPART_EXPR > + || TREE_OPERAND (gimple_assign_rhs1 (rp_stmt), 0) != lhs) > + continue; > + > + gsi_next_nondebug (&gsi); > + > + tree lhs4 = gimple_assign_lhs (rp_stmt); > + if (TREE_CODE (lhs4) != SSA_NAME) > + continue; > + > + use_operand_p use4_p; > + gimple *use4_stmt; > + if (!single_imm_use (lhs4, &use4_p, &use4_stmt)) > + continue; > + > + tree_code cvt = ERROR_MARK; > + > + /* See if there is extra cast, like: > + _1 = VIEW_CONVERT_EXPR<uintN_t>(_4); > + _Complex uintN_t t = ATOMIC_COMPARE_EXCHANGE (p, _1, d, w, s, f); > + r = IMAGPART_EXPR <t>; > + b = (_Bool) r; > + if (!b) { > + _2 = REALPART_EXPR <t>; > + _5 = (intN_t) _2; > + } > + _3 = PHI<_4, _5>; */ > + if (gimple_assign_cast_p (use4_stmt) > + && gimple_bb (use4_stmt) == bb1 > + && use4_stmt == gsi_stmt (gsi)) > + { > + tree rhstype = TREE_TYPE (lhs4); > + lhs4 = gimple_assign_lhs (use4_stmt); > + cvt = gimple_assign_rhs_code (use4_stmt); > + if (cvt != VIEW_CONVERT_EXPR > + && (!CONVERT_EXPR_CODE_P (cvt) > + || !INTEGRAL_TYPE_P (TREE_TYPE (lhs4)) > + || (TYPE_PRECISION (TREE_TYPE (lhs4)) > + != TYPE_PRECISION (rhstype)))) > + continue; > + if (!single_imm_use (lhs4, &use4_p, &use4_stmt)) > + continue; > + gsi_next_nondebug (&gsi); > + } > + > + if (gimple_code (use4_stmt) != GIMPLE_PHI > + || gimple_bb (use4_stmt) != bb2) > + continue; > + > + if (!gsi_end_p (gsi)) > + continue; > + > + use_operand_p val_p = PHI_ARG_DEF_PTR_FROM_EDGE (use4_stmt, e2); > + tree val = USE_FROM_PTR (val_p); > + tree arge = gimple_call_arg (call, 1); > + if (!operand_equal_p (val, arge, 0)) > + { > + > + if (cvt == ERROR_MARK) > + continue; > + else if (TREE_CODE (val) == SSA_NAME) > + { > + if (TREE_CODE (arge) != SSA_NAME) > + continue; > + gimple *def = SSA_NAME_DEF_STMT (arge); > + if (!gimple_assign_cast_p (def)) > + continue; > + tree arg = gimple_assign_rhs1 (def);; > + switch (gimple_assign_rhs_code (def)) > + { > + case VIEW_CONVERT_EXPR: > + arg = TREE_OPERAND (arg, 0); > + break; > + CASE_CONVERT: > + if (!INTEGRAL_TYPE_P (TREE_TYPE (arge)) > + || (TYPE_PRECISION (TREE_TYPE (arge)) > + != TYPE_PRECISION (TREE_TYPE (arg)))) > + continue; > + break; > + default: > + continue; > + } > + if (!operand_equal_p (val, arg, 0)) > + continue; > + } > + else if (TREE_CODE (arge) == SSA_NAME > + || !operand_equal_p (val, fold_build1 (cvt, > + TREE_TYPE (lhs4), > + arge), 0)) > + continue; > + } > + > + gsi = gsi_for_stmt (use3_stmt); > + tree type = TREE_TYPE (TREE_TYPE (lhs)); > + gimple *g = gimple_build_assign (make_ssa_name (type), > + build1 (REALPART_EXPR, type, lhs)); > + gsi_insert_before (&gsi, g, GSI_SAME_STMT); > + if (cvt != ERROR_MARK) > + { > + tree arg = gimple_assign_lhs (g); > + if (cvt == VIEW_CONVERT_EXPR) > + arg = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs4), arg); > + g = gimple_build_assign (make_ssa_name (TREE_TYPE (lhs4)), > + cvt, arg); > + gsi_insert_before (&gsi, g, GSI_SAME_STMT); > + } > + SET_USE (val_p, gimple_assign_lhs (g)); > + val_p = PHI_ARG_DEF_PTR_FROM_EDGE (use4_stmt, e1); > + SET_USE (val_p, gimple_assign_lhs (g)); > + update_stmt (use4_stmt); > + return true; > + } > + } > + return false; > +} > + > /* Attempt to fold a call statement referenced by the statement iterator GSI. 
>     The statement may be replaced by another statement, e.g., if the call
>     simplifies to a constant value.  Return true if any changes were made.
> @@ -3166,6 +3396,10 @@ gimple_fold_call (gimple_stmt_iterator *
>               return true;
>             }
>           break;
> +       case IFN_ATOMIC_COMPARE_EXCHANGE:
> +         if (fold_ifn_atomic_compare_exchange (stmt))
> +           changed = true;
> +         break;
>        case IFN_GOACC_DIM_SIZE:
>        case IFN_GOACC_DIM_POS:
>          result = fold_internal_goacc_dim (stmt);
> --- gcc/testsuite/gfortran.dg/coarray_atomic_4.f90.jj   2015-05-29 15:03:08.000000000 +0200
> +++ gcc/testsuite/gfortran.dg/coarray_atomic_4.f90      2016-06-23 12:11:55.507093867 +0200
> @@ -1,5 +1,5 @@
>  ! { dg-do compile }
> -! { dg-options "-fcoarray=single -fdump-tree-original" }
> +! { dg-options "-fcoarray=single -fdump-tree-original -O0" }
>  !
>  use iso_fortran_env, only: atomic_int_kind, atomic_logical_kind
>  implicit none
>
>         Jakub

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nuernberg)