On Fri, Sep 1, 2023 at 5:38 PM Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang <wwwhhhyyy...@gmail.com> wrote: > > > > Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年8月31日周四 18:01写道: > > > > > > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches > > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches > > > > wrote: > > > > > From: Kong Lingling <lingling.k...@intel.com> > > > > > > > > > > In inline asm, we do not know if the insn can use EGPR, so disable > > > > > EGPR > > > > > usage by default from mapping the common reg/mem constraint to > > > > > non-EGPR > > > > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage > > > > > for inline asm. > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > * config/i386/i386.cc (INCLUDE_STRING): Add include for > > > > > ix86_md_asm_adjust. > > > > > (ix86_md_asm_adjust): When APX EGPR enabled without specifying > > > > > the > > > > > target option, map reg/mem constraints to non-EGPR constraints. > > > > > * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > * gcc.target/i386/apx-inline-gpr-norex2.c: New test. > > > > > --- > > > > > gcc/config/i386/i386.cc | 44 +++++++ > > > > > gcc/config/i386/i386.opt | 5 + > > > > > .../gcc.target/i386/apx-inline-gpr-norex2.c | 107 > > > > > ++++++++++++++++++ > > > > > 3 files changed, 156 insertions(+) > > > > > create mode 100644 > > > > > gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c > > > > > > > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > > > > > index d26d9ab0d9d..9460ebbfda4 100644 > > > > > --- a/gcc/config/i386/i386.cc > > > > > +++ b/gcc/config/i386/i386.cc > > > > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General > > > > > Public License > > > > > along with GCC; see the file COPYING3. If not see > > > > > <http://www.gnu.org/licenses/>. */ > > > > > > > > > > +#define INCLUDE_STRING > > > > > #define IN_TARGET_CODE 1 > > > > > > > > > > #include "config.h" > > > > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, > > > > > vec<rtx> & /*inputs*/, > > > > > bool saw_asm_flag = false; > > > > > > > > > > start_sequence (); > > > > > + /* TODO: Here we just mapped the general r/m constraints to > > > > > non-EGPR > > > > > + constraints, will eventually map all the usable constraints in > > > > > the future. */ > > > > > > > > I think there should be some constraint which explicitly has all the 32 > > > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm > > > > wants. > > > > > > > > Also, what about the "g" constraint? Shouldn't there be another for "g" > > > > without r16..r31? What about the various other memory > > > > constraints ("<", "o", ...)? > > > > > > I think we should leave all existing constraints as they are, so "r" > > > covers only GPR16, "m" and "o" to only use GPR16. We can then > > > introduce "h" to instructions that have the ability to handle EGPR. > > > This would be somehow similar to the SSE -> AVX512F transition, where > > > we still have "x" for SSE16 and "v" was introduced as a separate > > > register class for EVEX SSE registers. This way, asm will be > > > compatible, when "r", "m", "o" and "g" are used. The new memory > > > constraint "Bt", should allow new registers, and should be added to > > > the constraint string as a separate constraint, and conditionally > > > enabled by relevant "isa" (AKA "enabled") attribute. > > > > The extended constraint can work for registers, but for memory it is more > > complicated. > > Yes, unfortunately. The compiler assumes that an unchangeable register > class is used for BASE/INDEX registers. I have hit this limitation > when trying to implement memory support for instructions involving > 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX > registers, also inside memory operand. (You can see the "hack" in e.g. > *extzvqi_mem_rex64" and corresponding peephole2 with the original > *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX > register class is the major limitation in the compiler, so perhaps the > strategy on how to override this limitation should be discussed with > the register allocator author first. Perhaps adding an insn attribute > to insn RTX pattern to specify different BASE/INDEX register sets can > be a better solution than passing insn RTX to the register allocator. > > The above idea still does not solve the asm problem on how to select > correct BASE/INDEX register set for memory operands. The current approach disables gpr32 for memory operand in asm_operand by default. but can be turned on by options ix86_apx_inline_asm_use_gpr32(users need to guarantee the instruction supports gpr32). Only ~ 5% of total instructions don't support gpr32, reversed approach only gonna get more complicated.
> > Uros. > > > > If we want to use new mem constraints that allow gpr32, then BASE/INDEX > > reg class still requires per-insn verification, so it means changes > > on all patterns with vm, and those SSE patterns on opcode map0/1. Also, > > several legacy insns that are promoted to EVEX encoding space need to be > > changed. The overall implementation could be 10 times larger than current, > > which would be quite hard for maintenance. > > > > > > > > Uros. > > > > > > > > + if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32) > > > > > + { > > > > > + /* Map "r" constraint in inline asm to "h" that disallows > > > > > r16-r31 > > > > > + and replace only r, exclude Br and Yr. */ > > > > > + for (unsigned i = 0; i < constraints.length (); i++) > > > > > + { > > > > > + std::string *s = new std::string (constraints[i]); > > > > > > > > Doesn't this leak memory (all the time)? > > > > I must say I don't really understand why you need to use std::string > > > > here, > > > > but certainly it shouldn't leak. > > > > > > > > > + size_t pos = s->find ('r'); > > > > > + while (pos != std::string::npos) > > > > > + { > > > > > + if (pos > 0 > > > > > + && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B')) > > > > > + pos = s->find ('r', pos + 1); > > > > > + else > > > > > + { > > > > > + s->replace (pos, 1, "h"); > > > > > + constraints[i] = (const char*) s->c_str (); > > > > > > > > Formatting (space before *). The usual way for constraints is > > > > ggc_strdup on > > > > some string in a buffer. Also, one could have several copies or r (or > > > > m, memory (doesn't > > > > that appear just in clobbers? And that doesn't look like something that > > > > should be replaced), Bm, e.g. in various alternatives. So, you > > > > need to change them all, not just the first hit. "r,r,r,m" and the > > > > like. > > > > Normally, one would simply walk the constraint string, parsing the > > > > special > > > > letters (+, =, & etc.) and single letter constraints and 2 letter > > > > constraints using CONSTRAINT_LEN macro (tons of examples in GCC > > > > sources). > > > > Either do it in 2 passes, first one counts how long constraint string > > > > one > > > > will need after the adjustments (and whether to adjust something at > > > > all), > > > > then if needed XALLOCAVEC it and adjust in there, or say use a > > > > auto_vec<char, 32> for > > > > it. > > > > > > > > > + break; > > > > > + } > > > > > + } > > > > > + } > > > > > + /* Also map "m/memory/Bm" constraint that may use GPR32, > > > > > replace them with > > > > > + "Bt/Bt/BT". */ > > > > > + for (unsigned i = 0; i < constraints.length (); i++) > > > > > + { > > > > > + std::string *s = new std::string (constraints[i]); > > > > > + size_t pos = s->find ("m"); > > > > > + size_t pos2 = s->find ("memory"); > > > > > + if (pos != std::string::npos) > > > > > + { > > > > > + if (pos > 0 && (s->at (pos - 1) == 'B')) > > > > > + s->replace (pos - 1, 2, "BT"); > > > > > + else if (pos2 != std::string::npos) > > > > > + s->replace (pos, 6, "Bt"); > > > > > + else > > > > > + s->replace (pos, 1, "Bt"); > > > > > > > > Formatting, the s->replace calls are indented too much. > > > > > > > > Jakub > > > > -- BR, Hongtao