On Thu, Dec 03, 2015 at 12:09:04PM +0100, Tom de Vries wrote:
> The flag is set here in expand_omp_target:
> ...
> 12682         /* Prevent IPA from removing child_fn as unreachable,
>                  since there are no
> 12683            refs from the parent function to child_fn in offload
>                  LTO mode.  */
> 12684         if (ENABLE_OFFLOADING)
> 12685           cgraph_node::get (child_fn)->mark_force_output ();
> ...
> 
> I guess setting forced_by_abi instead would also mean child_fn is not
> removed as unreachable, while still allowing optimizations:
> ...
>   /* Like FORCE_OUTPUT, but in the case it is ABI requiring the symbol
>      to be exported.  Unlike FORCE_OUTPUT this flag gets cleared to
>      symbols promoted to static and it does not inhibit
>      optimization.  */
>   unsigned forced_by_abi : 1;
> ...
> 
> But I suspect that other optimizations (than ipa-pta) might break things.
> 
> Essentially we have two situations:
> - in the host compiler, there is no need for the forced_output flag,
>   and it inhibits optimization
> - in the accelerator compiler, it (or some equivalent) is needed
> 
> I wonder if setting the force_output flag only when streaming the bytecode
> for offloading would work. That way, it wouldn't be set in the host
> compiler, while being set in the accelerator compiler.

I believe that the host and offload func (and var) tables need to be in
sync, so there needs to be something both in the host and accel compilers
that prevents the functions and variables that have their accel or host
counterpart in the tables from being optimized away, or say replaced by
a clone with different arguments etc.

        Jakub

Reply via email to