Hi Kewen,
I'll leave the continued review of the back-end parts of this to Segher,
but I do have one long-term comment. The rs6000_builtin_info[code].mask
field that you're relying on is going away as part of the built-in
function rewrite. There will be a "bifattrs" field that replaces this
in the table-driven data structures. I'm including those below. Please
think about whether what you're building will easily translate to using
these data structures in the near future. I don't think there are any
show-stoppers here, but if the new table needs adjustment to make your
changes easier, let's discuss.
Thanks,
Bill
#define PPC_MAXRESTROPNDS 3
struct GTY((user)) bifdata
{
const char *bifname;
bif_enable enable;
tree fntype;
insn_code icode;
int nargs;
int bifattrs;
int restr_opnd[PPC_MAXRESTROPNDS];
restriction restr[PPC_MAXRESTROPNDS];
int restr_val1[PPC_MAXRESTROPNDS];
int restr_val2[PPC_MAXRESTROPNDS];
const char *attr_string;
rs6000_gen_builtins assoc_bif;
};
#define bif_init_bit (0x00000001)
#define bif_set_bit (0x00000002)
#define bif_extract_bit (0x00000004)
#define bif_nosoft_bit (0x00000008)
#define bif_ldvec_bit (0x00000010)
#define bif_stvec_bit (0x00000020)
#define bif_reve_bit (0x00000040)
#define bif_pred_bit (0x00000080)
#define bif_htm_bit (0x00000100)
#define bif_htmspr_bit (0x00000200)
#define bif_htmcr_bit (0x00000400)
#define bif_mma_bit (0x00000800)
#define bif_quad_bit (0x00001000)
#define bif_pair_bit (0x00002000)
#define bif_mmaint_bit (0x00004000)
#define bif_no32bit_bit (0x00008000)
#define bif_32bit_bit (0x00010000)
#define bif_cpu_bit (0x00020000)
#define bif_ldstmask_bit (0x00040000)
#define bif_lxvrse_bit (0x00080000)
#define bif_lxvrze_bit (0x00100000)
#define bif_endian_bit (0x00200000)
#define bif_is_init(x) ((x).bifattrs & bif_init_bit)
#define bif_is_set(x) ((x).bifattrs & bif_set_bit)
#define bif_is_extract(x) ((x).bifattrs & bif_extract_bit)
#define bif_is_nosoft(x) ((x).bifattrs & bif_nosoft_bit)
#define bif_is_ldvec(x) ((x).bifattrs & bif_ldvec_bit)
#define bif_is_stvec(x) ((x).bifattrs & bif_stvec_bit)
#define bif_is_reve(x) ((x).bifattrs & bif_reve_bit)
#define bif_is_predicate(x) ((x).bifattrs & bif_pred_bit)
#define bif_is_htm(x) ((x).bifattrs & bif_htm_bit)
#define bif_is_htmspr(x) ((x).bifattrs & bif_htmspr_bit)
#define bif_is_htmcr(x) ((x).bifattrs & bif_htmcr_bit)
#define bif_is_mma(x) ((x).bifattrs & bif_mma_bit)
#define bif_is_quad(x) ((x).bifattrs & bif_quad_bit)
#define bif_is_pair(x) ((x).bifattrs & bif_pair_bit)
#define bif_is_mmaint(x) ((x).bifattrs & bif_mmaint_bit)
#define bif_is_no32bit(x) ((x).bifattrs & bif_no32bit_bit)
#define bif_is_32bit(x) ((x).bifattrs & bif_32bit_bit)
#define bif_is_cpu(x) ((x).bifattrs & bif_cpu_bit)
#define bif_is_ldstmask(x) ((x).bifattrs & bif_ldstmask_bit)
#define bif_is_lxvrse(x) ((x).bifattrs & bif_lxvrse_bit)
#define bif_is_lxvrze(x) ((x).bifattrs & bif_lxvrze_bit)
#define bif_is_endian(x) ((x).bifattrs & bif_endian_bit)
extern bifdata rs6000_builtin_info_x[RS6000_BIF_MAX];
On 9/8/21 2:43 AM, Kewen.Lin wrote:
Hi!
Power ISA 2.07 (Power8) introduces transactional memory feature
but ISA3.1 (Power10) removes it. It exposes one troublesome
issue as PR102059 shows. Users define some function with
target pragma cpu=power10 then it calls one function with
attribute always_inline which inherits command line option
cpu=power8 which enables HTM implicitly. The current isa_flags
check doesn't allow this inlining due to "target specific
option mismatch" and error mesasge is emitted.
Normally, the callee function isn't intended to exploit HTM
feature, but the default flag setting make it look it has.
As Richi raised in the PR, we have fp_expressions flag in
function summary, and allow us to check the function actually
contains any floating point expressions to avoid overkill.
So this patch follows the similar idea but is more target
specific, for this rs6000 port specific requirement on HTM
feature check, we would like to check rs6000 specific HTM
built-in functions and inline assembly, it allows targets
to do their own customized checks and updates.
It introduces two target hooks need_ipa_fn_target_info and
update_ipa_fn_target_info. The former allows target to do
some previous check and decides to collect target specific
information for this function or not. For some special case,
it can predict the analysis result and push it early without
any scannings. The latter allows the analyze_function_body
to pass gimple stmts down just like fp_expressions handlings,
target can do its own tricks. I put them as one hook initially
with one boolean to indicates whether it's initial time, but
the code looks a bit ugly, to separate them seems to have
better readability.
To make it simple, this patch uses HOST_WIDE_INT to record the
flags just like what we use for isa_flags. For rs6000's HTM
need, one HOST_WIDE_INT variable is quite enough, but it seems
good to have one auto_vec for scalability as I noticed some
targets have more than one HOST_WIDE_INT flag.
Against v1 [1], this v2 addressed Richi's and Segher's review
comments, mainly consists of:
- Extend it to cover non always_inline.
- Exclude the case for offload streaming.
- Some function naming and formatting issues.
- Adjust rs6000_can_inline_p.
- Add new cases.
The patch has been bootstrapped and regress-tested on
powerpc64le-linux-gnu Power9.
Any comments are highly appreciated!
[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578555.html
BR,
Kewen
-----
gcc/ChangeLog:
PR ipa/102059
* config/rs6000/rs6000-call.c (rs6000_fn_has_any_of_these_mask_bits):
New function.
* config/rs6000/rs6000-internal.h
(rs6000_fn_has_any_of_these_mask_bits): New declare.
* config/rs6000/rs6000.c (TARGET_NEED_IPA_FN_TARGET_INFO): New macro.
(TARGET_UPDATE_IPA_FN_TARGET_INFO): Likewise.
(rs6000_need_ipa_fn_target_info): New function.
(rs6000_update_ipa_fn_target_info): Likewise.
(rs6000_can_inline_p): Adjust for ipa function summary target info.
* ipa-fnsummary.c (ipa_dump_fn_summary): Adjust for ipa function
summary target info.
(analyze_function_body): Adjust for ipa function summary target
info and call hook rs6000_need_ipa_fn_target_info and
rs6000_update_ipa_fn_target_info.
(ipa_merge_fn_summary_after_inlining): Adjust for ipa function
summary target info.
(inline_read_section): Likewise.
(ipa_fn_summary_write): Likewise.
* ipa-fnsummary.h (ipa_fn_summary::target_info): New member.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_UPDATE_IPA_FN_TARGET_INFO): Document new
hook.
(TARGET_NEED_IPA_FN_TARGET_INFO): Likewise.
* target.def (update_ipa_fn_target_info): New hook.
(need_ipa_fn_target_info): Likewise.
* targhooks.c (default_need_ipa_fn_target_info): New function.
(default_update_ipa_fn_target_info): Likewise.
* targhooks.h (default_update_ipa_fn_target_info): New declare.
(default_need_ipa_fn_target_info): Likewise.
gcc/testsuite/ChangeLog:
PR ipa/102059
* gcc.dg/lto/pr102059-1_0.c: New test.
* gcc.dg/lto/pr102059-1_1.c: New test.
* gcc.dg/lto/pr102059-1_2.c: New test.
* gcc.dg/lto/pr102059-2_0.c: New test.
* gcc.dg/lto/pr102059-2_1.c: New test.
* gcc.dg/lto/pr102059-2_2.c: New test.
* gcc.target/powerpc/pr102059-5.c: New test.
* gcc.target/powerpc/pr102059-6.c: New test.
* gcc.target/powerpc/pr102059-7.c: New test.