On 7/1/24 6:32 AM, Richard Sandiford wrote:
Thomas pointed out that we sometimes failed to eliminate some dead code
(specifically clobbers of otherwise unused registers) on nvptx when
late-combine is enabled.  This happens because:

- combine is able to optimise the function in a way that exposes dead code.
   This leaves the df information in a "dirty" state.

- late_combine calls df_analyze without DF_LR_RUN_DCE set.
   This updates the df information and clears the "dirty" state.

- late_combine doesn't find any extra optimisations, and so leaves
   the df information up-to-date.

- if_after_combine (ce2) calls df_analyze with DF_LR_RUN_DCE set.
   Because the df information is already up-to-date, fast DCE is
   not run.

The upshot is that running late-combine has the effect of suppressing
a DCE opportunity that would have been noticed without late-combine.

I think this shows that we should track the state of the DCE separately
from the LR problem.  Every pass updates the latter, but not all passes
update the former.
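
As a rough illustration of the sequence above, here is a toy model of the
two schemes. It is only a sketch: ToyDf, mark_dirty, analyze and simulate are
hypothetical names invented for this example and do not correspond to GCC's
real df data structures or to the code in the patch. It assumes that an
analysis call always brings LR up to date, and that fast DCE only runs when
it is requested and the relevant dirty flag is still set.

```cpp
// Toy model only: hypothetical types, not GCC's df implementation.
struct ToyDf {
    bool lr_dirty = false;           // LR solution is out of date
    bool dce_dirty = false;          // dead code may exist (separate tracking)
    bool separate_dce_state = false; // false: one shared flag; true: two flags
    int fast_dce_runs = 0;           // how many times fast DCE actually ran
};

// A pass changed the code, so both LR and DCE results are stale.
void mark_dirty(ToyDf &df) {
    df.lr_dirty = true;
    df.dce_dirty = true;
}

// Stand-in for an analysis call; run_dce models a pass asking for fast DCE.
void analyze(ToyDf &df, bool run_dce) {
    // Decide whether DCE has work to do before LR's flag is cleared.
    bool dce_pending = df.separate_dce_state ? df.dce_dirty : df.lr_dirty;
    df.lr_dirty = false;             // every analysis updates LR
    if (run_dce && dce_pending) {
        ++df.fast_dce_runs;
        df.dce_dirty = false;        // only an actual DCE run clears this
    }
}

// Returns the number of fast DCE runs for the combine -> (late_combine) -> ce2
// sequence described above.
int simulate(bool separate_tracking, bool with_late_combine) {
    ToyDf df;
    df.separate_dce_state = separate_tracking;
    mark_dirty(df);                  // combine exposes dead code
    if (with_late_combine)
        analyze(df, false);          // late_combine: DCE not requested
    analyze(df, true);               // ce2: DCE requested
    return df.fast_dce_runs;
}
```

In this model, simulate(false, true) reports no DCE run, because the
late_combine step has already cleared the only dirty flag, whereas
simulate(true, true) still reports one run at ce2.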

Bootstrapped & regression-tested on aarch64-linux-gnu.  Thomas also
confirms that it fixes the nvptx problem.  OK to install?

Richard


gcc/
        * df.h (DF_LR_DCE): New df_problem_id.
        (df_lr_dce): New macro.
        * df-core.cc (rest_of_handle_df_finish): Check for a null free_fun.
        * df-problems.cc (df_lr_finalize): Split out fast DCE handling to...
        (df_lr_dce_finalize): ...this new function.
        (problem_LR_DCE): New df_problem.
        (df_lr_add_problem): Register LR_DCE rather than LR itself.
        * dce.cc (fast_dce): Clear df_lr_dce->solutions_dirty.
OK
jeff
