On 07/20/2017 03:04 PM, Tom de Vries wrote:
On 07/13/2017 06:53 PM, Cesar Philippidis wrote:
Similarly, for nvptx vector reductions, when it comes time to initialize
the reduction variable, the nvptx BE constructs a branch so that only
vector lanes 1 to vector_length-1 are initialized to the default value
for a given reduction type, while vector lane 0 retains the original
value of the reduction variable. For reasons similar to those for the gang
and worker reductions, I set the probability of the new edge introduced
for the vector reduction to even.


Hi,

The problem that you describe in abstract terms looks like this concretely:
...
(gdb) call debug_bb_n (4)
;; basic block 4, loop depth 0, freq 662, maybe hot
;;  prev block 3, next block 16, flags: (VISITED)
;;  pred:       3 [always (guessed)]  (FALLTHRU,EXECUTABLE)
# VUSE <.MEM_61>
# PT = nonlocal unit-escaped null
_18 = MEM[(const struct .omp_data_t.33D.1518 &).omp_data_i_9(D)
           clique 1 base 1].s2D.1519;
# VUSE <.MEMD.1540>
# USE = anything
_72 = GOACC_DIM_POS (2);
if (_72 != 0)
   goto <bb 16>; [100.00%] [count: INV]
else
   goto <bb 17>; [INV] [count: INV]
;;  succ:       16 [always]  (TRUE_VALUE)
;;              17 (FALSE_VALUE)
...

The edge to bb16 has probability 100%. The edge to bb17 has no probability set.


Hi,


I.

The patch below fixes the probabilities on the outgoing edges, setting them to even:
...
(gdb) call debug_bb_n (4)
;; basic block 4, loop depth 0, freq 662, maybe hot
;;  prev block 3, next block 16, flags: (VISITED)
;;  pred:       3 [always (guessed)]  (FALLTHRU,EXECUTABLE)
# VUSE <.MEM_61>
# PT = nonlocal unit-escaped null
_18 = MEM[(const struct .omp_data_t.33D.1518 &).omp_data_i_9(D) clique 1 base 1].s2D.1519;
# VUSE <.MEMD.1540>
# USE = anything
_72 = GOACC_DIM_POS (2);
if (_72 != 0)
  goto <bb 16>; [50.00%] [count: INV]
else
  goto <bb 17>; [50.00%] [count: INV]
;;  succ:       16 [50.0% (adjusted)]  (TRUE_VALUE)
;;              17 [50.0% (adjusted)]  (FALSE_VALUE)
...


II.

The quality is 'adjusted'. [ even () first calls always (), which has quality 'precise', and then applies scale (), which downgrades the quality from 'precise' to 'adjusted'. ]

The reason for that is explained in this comment AFAIU:
...
   Named probabilities except for never/always are assumed to be
   statically guessed and thus not necessarily accurate.
...
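
As far as I understand it, the downgrade works roughly like the toy model below. This is a simplified sketch, not the actual profile_probability class: toy_probability, toy_max and the two-value quality enum are names made up for illustration, and only the relationship between always, scale and even is modeled.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only (hypothetical names); not GCC's profile_probability.
enum toy_quality
{
  toy_adjusted = 2,  // was accurate, but has been modified since
  toy_precise = 3    // feedback or accurate static method
};

// Fixed-point representation of 100% (arbitrary choice for this sketch).
constexpr uint32_t toy_max = 1u << 30;

struct toy_probability
{
  uint32_t value;
  toy_quality quality;

  // 100% is known exactly, so it starts out 'precise'.
  static toy_probability always ()
  {
    return { toy_max, toy_precise };
  }

  // Scaling changes the value, so a 'precise' input is demoted
  // to 'adjusted'.
  toy_probability scale (uint32_t num, uint32_t den) const
  {
    assert (den != 0);
    toy_probability r = *this;
    r.value = (uint32_t) ((uint64_t) value * num / den);
    if (r.quality > toy_adjusted)
      r.quality = toy_adjusted;
    return r;
  }

  // 50% is built from 100% by scaling, which is where the
  // 'adjusted' quality comes from.
  static toy_probability even ()
  {
    return always ().scale (1, 2);
  }
};
```

In this model, even () yields half of toy_max with quality toy_adjusted, while always () stays toy_precise.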

When I look at the definitions of 'adjusted' and 'precise':
...
  /* Profile was originally based on feedback but it was adjusted
     by code duplicating optimization.  It may not precisely reflect the
     particular code path.  */
  profile_adjusted = 2,
  /* Profile was read from profile feedback or determined by accurate
     static method.  */
  profile_precise = 3
...

I wonder: there seem to be two situations in which 'precise' is possible:
- Profile was read from profile feedback
- Profile was determined by accurate static method
But there is only one situation where 'adjusted' is possible:
- Profile was originally based on feedback but it was adjusted by code
  duplicating optimization.
I can imagine as well that we originally have a static method giving precise information, and that this information is downgraded by a code duplication optimization.

So, should this be instead:
...
  /* Profile was originally based on feedback or accurate static method,
     but it was adjusted by code duplicating optimization.  It may not
     precisely reflect the particular code path.  */
  profile_adjusted = 2,
...
?


III.

In this particular case, we insert a conditional jump based on a special machine register, which gives us the knowledge that the true and false edges are equally likely (in fact, both are executed, each with a different warp-enabling mask).

So I think the most accurate representation would be 50/50 'precise', but that does not fit with the assumption above.
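
To illustrate the "both are executed" point, here is a small host-side sketch of how one warp splits on the condition. It is a simplified SIMT model under assumptions of my own: branch_masks and split_warp are hypothetical names, the warp is taken to have at most 32 lanes, and `lane != 0` stands in for the `GOACC_DIM_POS (2) != 0` test in the dump.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: one bit per lane of a warp.
struct branch_masks
{
  uint32_t taken;      // lanes following the TRUE edge (init block)
  uint32_t not_taken;  // lanes following the FALSE edge
};

// Compute which lanes follow which edge of 'if (lane != 0)'.
inline branch_masks
split_warp (unsigned vector_length)
{
  assert (vector_length >= 1 && vector_length <= 32);
  branch_masks m = { 0, 0 };
  for (unsigned lane = 0; lane < vector_length; lane++)
    {
      if (lane != 0)
        m.taken |= 1u << lane;      // lanes 1 .. vector_length-1
      else
        m.not_taken |= 1u << lane;  // lane 0 keeps the original value
    }
  return m;
}
```

For a vector length of 32, both masks are non-empty, so the warp walks both edges once, each with a different set of lanes enabled.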

I'll commit the patch below for now.

But I'm curious if you have any comments.

Thanks,
- Tom

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 6314653..2c427fa 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -5149,6 +5149,7 @@ nvptx_goacc_reduction_init (gcall *call)

       /* Fixup flags from call_bb to init_bb.  */
       init_edge->flags ^= EDGE_FALLTHRU | EDGE_TRUE_VALUE;
+      init_edge->probability = profile_probability::even ();

       /* Set the initialization stmts.  */
       gimple_seq init_seq = NULL;
@@ -5164,6 +5165,7 @@ nvptx_goacc_reduction_init (gcall *call)

       /* Create false edge from call_bb to dst_bb.  */
       edge nop_edge = make_edge (call_bb, dst_bb, EDGE_FALSE_VALUE);
+      nop_edge->probability = profile_probability::even ();

       /* Create phi node in dst block.  */
       gphi *phi = create_phi_node (lhs, dst_bb);
