[I've just posted this query to gcc-help, but it occurred to me that this list 
might be more appropriate.  I am sorry for the duplication for people who 
subscribe to both lists.]

Hi!  I'm reaching the point of exhaustion in trying to understand GCC code, so 
I need help.  I want to change the code that GCC emits when the source code has 
an OpenMP reduction clause.  


WHAT GCC DOES NOW

Suppose your source code looks like this, a minimal example:

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
       omp_set_num_threads(4);
       int x = 42;
#pragma omp parallel reduction(+:x)
       {
               x++;
       }       
       printf("x = %d\n", x);
       return EXIT_SUCCESS;
}       

GCC creates a separate function, "main.omp_fn.0", for the body of the OpenMP 
parallel block.  To implement the reduction clause, main.omp_fn.0 uses a 
temporary stack variable (let's call it x_prime), initialized to 0, in place 
of the original x.  Near the end of main.omp_fn.0, GCC adds the value of 
x_prime to the original x with an atomic instruction, such as LOCK ADD on 
x86.  Here's the assembly code for x86_64/Ubuntu Linux, with labels and some 
dot-directives removed:

main.omp_fn.0:
       pushq   %rbp
       movq    %rsp, %rbp
       movq    %rdi, -24(%rbp)
       movl    $0, -4(%rbp)
       addl    $1, -4(%rbp)
       movq    -24(%rbp), %rax
       movl    -4(%rbp), %edx
       lock addl       %edx, (%rax)
       leave
       ret
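
Read back into C, the assembly above does roughly the following.  This is a 
hand-written approximation, not GCC output: the names outlined_body and 
simulate_team are made up for illustration, the argument passed in %rdi is 
assumed to point directly at the shared x, and the four threads are simulated 
by four sequential calls.  __atomic_fetch_add is the GCC builtin that stands in 
for the LOCK ADD.

```c
#include <assert.h>
#include <stddef.h>

/* Approximation of main.omp_fn.0; shared_x plays the role of the
   pointer received in %rdi (an assumption about the data layout). */
static void outlined_body(int *shared_x)
{
    assert(shared_x != NULL);

    int x_prime = 0;   /* movl $0, -4(%rbp): private copy starts at 0 */
    x_prime += 1;      /* addl $1, -4(%rbp): the x++ from the source  */

    /* lock addl %edx, (%rax): merge the private copy into the shared x */
    __atomic_fetch_add(shared_x, x_prime, __ATOMIC_SEQ_CST);
}

/* Hypothetical helper: runs the body once per "thread", sequentially. */
static void simulate_team(int *shared_x, int nthreads)
{
    for (int i = 0; i < nthreads; i++)
        outlined_body(shared_x);
}
```

Calling simulate_team(&x, 4) with x initialized to 42 leaves x at 46, which is 
what the original program prints when it runs with four threads.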


HOW I WOULD LIKE TO CHANGE GCC'S BEHAVIOR

I want to replace the LOCK ADD instruction with a call to my own function 
(let's say "omp_reduction").   I will need to pass to omp_reduction the 
following parameters:
-- An enumerator value dependent on the operator originally used in the 
reduction--here, say, "OP_PLUS" for the original + operator.
-- The address of (original) x
-- The address of x_prime
-- An enumerator value for the type of x and x_prime

So the signature of omp_reduction would be

void omp_reduction(enum op_type op, void * var, void * tmp, enum operand_type 
type);

If x were, say, a 32-bit integer, the call written in C would look like this:

omp_reduction(OP_PLUS, &x, &x_prime, INT32);
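
To make the intended interface concrete, here is a minimal sketch of what such 
a function could look like.  Everything beyond the signature, OP_PLUS, and 
INT32 is an assumption for illustration: the enumerators OP_TIMES and INT64, 
the helper reduce_int32, and the choice to merge with a compare-and-swap loop 
built on GCC's __atomic builtins.  The real omp_reduction might do something 
quite different; this only shows the dispatch on operator and operand type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only OP_PLUS and INT32 come from the message; the rest is hypothetical. */
enum op_type      { OP_PLUS, OP_TIMES };
enum operand_type { INT32, INT64 };

/* Atomically combine tmp into *var using a compare-and-swap loop. */
static void reduce_int32(enum op_type op, int32_t *var, int32_t tmp)
{
    int32_t old = __atomic_load_n(var, __ATOMIC_SEQ_CST);
    int32_t desired;
    do {
        desired = (op == OP_PLUS) ? old + tmp : old * tmp;
        /* On failure, old is refreshed with the current value of *var. */
    } while (!__atomic_compare_exchange_n(var, &old, desired, 0,
                                          __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST));
}

void omp_reduction(enum op_type op, void *var, void *tmp,
                   enum operand_type type)
{
    assert(var != NULL && tmp != NULL);

    switch (type) {
    case INT32:
        reduce_int32(op, (int32_t *)var, *(int32_t *)tmp);
        break;
    case INT64:
        /* This sketch handles only addition for 64-bit operands. */
        assert(op == OP_PLUS);
        __atomic_fetch_add((int64_t *)var, *(int64_t *)tmp, __ATOMIC_SEQ_CST);
        break;
    }
}
```

With this sketch, the call omp_reduction(OP_PLUS, &x, &x_prime, INT32) has the 
same effect on x as the LOCK ADD it would replace.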


WHERE I AM NOW (LOST)

I think the atomic instruction at the end (e.g., LOCK ADD) is represented by 
the gimple_reduction_merge field of type gimple_seq in the tree_omp_clause 
structure defined in tree.h:

struct GTY(()) tree_omp_clause {
 /* (.. Other fields ...) */
 /* The gimplification of OMP_CLAUSE_REDUCTION_{INIT,MERGE} for omp-low's
    usage.  */
 gimple_seq gimple_reduction_init;
 gimple_seq gimple_reduction_merge;

 tree GTY ((length ("omp_clause_num_ops[OMP_CLAUSE_CODE ((tree)&%h)]"))) ops[1];
};

But I do not understand how GCC assigns or uses this field, or how I can 
change GCC's behavior with respect to it.  I cannot seem to find the relevant 
source code in gcc/gcc.

I'd really appreciate help or guidance.  Thanks!

Amittai Aviram
PhD Student in Computer Science
Yale University
646 483 2639
amittai.avi...@yale.edu
http://www.amittai.com