On Fri, Oct 25, 2024 at 12:19 AM Antoni Boucher <boua...@zoho.com> wrote:
>
> Thanks.
> Did you review the new patch?
> Can I push it to master?
Ok.
>
> On 2024-10-20 at 22:01, Hongtao Liu wrote:
> > On Sat, Oct 19, 2024 at 2:06 AM Antoni Boucher <boua...@zoho.com> wrote:
> >>
> >> Thanks for the review.
> >> Here's the updated patch.
> >>
> >> On 2024-10-17 at 21:50, Hongtao Liu wrote:
> >>> On Fri, Oct 18, 2024 at 9:08 AM Antoni Boucher <boua...@zoho.com> wrote:
> >>>>
> >>>> Hi.
> >>>> This is a patch for the bug 116725.
> >>>> I'm not sure if it is a good fix, but it seems to do the job.
> >>>> If you can suggest better comments than mine to explain what's
> >>>> happening, I'm open to them.
> >>>
> >>>> @@ -7548,7 +7548,8 @@ (define_insn "avx512fp16_vcvtph2<sseintconvertsignprefix><sseintconvert>_<mode><
> >>>>       [(match_operand:<ssePHmode> 1 "<round_nimm_predicate>" "<round_constraint>")]
> >>>>       UNSPEC_US_FIX_NOTRUNC))]
> >>>>     "TARGET_AVX512FP16 && <round_mode_condition>"
> >>>> -  "vcvtph2<sseintconvertsignprefix><sseintconvert>\t{<round_mask_op2>%1, %0<mask_operand2>|%0<mask_operand2>, %1<round_mask_op2>}"
> >>>> +;; %X1 so that we don't emit any *WORD PTR for -masm=intel.
> >>>> +  "vcvtph2<sseintconvertsignprefix><sseintconvert>\t{<round_mask_op2>%1, %0<mask_operand2>|%0<mask_operand2>, %X1<round_mask_op2>}"
> >>> Could you define something like
> >>>
> >>>    ;; Pointer size override for 16-bit upper-convert modes (Intel asm dialect)
> >>>    (define_mode_attr iptrh
> >>>     [(V32HI "") (V16SI "") (V8DI "")
> >>>      (V16HI "") (V8SI "") (V4DI "q")
> >>>      (V8HI "") (V4SI "q") (V2DI "k")])
> >>
> >> For my own understanding, was my usage of %X equivalent to a mode_attr
> >> with an empty string for all cases?
> >> How did you know which one needed an empty string?
> >
> > It's in ix86_print_operand
> > else if (MEM_P (x))
> >   {
> >     rtx addr = XEXP (x, 0);
> >
> >     /* No `byte ptr' prefix for call instructions ... */
> >     if (ASSEMBLER_DIALECT == ASM_INTEL && code != 'X' && code != 'P')
> >       {
> >         machine_mode mode = GET_MODE (x);
> >         const char *size;
> >
> >         /* Check for explicit size override codes.  */
> >         if (code == 'b')
> >           size = "BYTE";
> >         else if (code == 'w')
> >           size = "WORD";
> >         else if (code == 'k')
> >           size = "DWORD";
> >         else if (code == 'q')
> >           size = "QWORD";
> >         else if (code == 'x')
> >           size = "XMMWORD";
> >         else if (code == 't')
> >           size = "YMMWORD";
> >         else if (code == 'g')
> >           size = "ZMMWORD";
> >         else if (mode == BLKmode)
> >           /* ... or BLKmode operands, when not overridden.  */
> >           size = NULL;
> >         else
> >           switch (GET_MODE_SIZE (mode))
> >             {
> >             case 1: size = "BYTE"; break;
> >
> >>
> >>>
> >>> And use
> >>> +  "vcvtph2<sseintconvertsignprefix><sseintconvert>\t{<round_mask_op2>%1, %0<mask_operand2>|%0<mask_operand2>, %<iptrh>1<round_mask_op2>}"
> >>>
> >>>>     [(set_attr "type" "ssecvt")
> >>>>      (set_attr "prefix" "evex")
> >>>>      (set_attr "mode" "<sseinsnmode>")])
> >>>> @@ -29854,7 +29855,8 @@ (define_insn "avx512dq_vmfpclass<mode><mask_scalar_merge_name>"
> >>>>        UNSPEC_FPCLASS)
> >>>>      (const_int 1)))]
> >>>>      "TARGET_AVX512DQ || VALID_AVX512FP16_REG_MODE(<MODE>mode)"
> >>>> -   "vfpclass<ssescalarmodesuffix>\t{%2, %1, %0<mask_scalar_merge_operand3>|%0<mask_scalar_merge_operand3>, %1, %2}";
> >>>> +;; %X1 so that we don't emit any *WORD PTR for -masm=intel.
> >>>> +   "vfpclass<ssescalarmodesuffix>\t{%2, %1, %0<mask_scalar_merge_operand3>|%0<mask_scalar_merge_operand3>, %X1, %2}";
> >>>
> >>> For scalar memory operand rewrites, we usually use <iptr>, so
> >>>      "vfpclass<ssescalarmodesuffix>\t{%2, %1, %0<mask_scalar_merge_operand3>|%0<mask_scalar_merge_operand3>, %<iptr>1, %2}";
> >>>
> >>>
> >>>
> >>>
> >
> >
> >
>
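
For readers following the %X versus <iptrh> question, the selection logic
in the ix86_print_operand excerpt quoted above can be modelled as a small
standalone C function. This is only a sketch: intel_size_prefix is a
hypothetical name, not a GCC function, and it takes a plain byte count
instead of a machine_mode. The quoted excerpt is cut off after `case 1',
so the rest of the size-based fallback below is an assumption that simply
mirrors the explicit override codes; the BLKmode case is left out.

```c
#include <stddef.h>

/* Hypothetical model of the Intel-dialect size keyword selection.
   Returns the keyword printed before "PTR", or NULL when none is printed.
   CODE is the operand modifier letter (0 when there is none) and
   MODE_SIZE is the size in bytes of the memory operand's mode.  */
static const char *
intel_size_prefix (char code, size_t mode_size)
{
  /* 'X' and 'P' suppress the size prefix entirely.  */
  if (code == 'X' || code == 'P')
    return NULL;

  /* Explicit size override codes, as listed in the quoted excerpt.  */
  switch (code)
    {
    case 'b': return "BYTE";
    case 'w': return "WORD";
    case 'k': return "DWORD";
    case 'q': return "QWORD";
    case 'x': return "XMMWORD";
    case 't': return "YMMWORD";
    case 'g': return "ZMMWORD";
    default: break;
    }

  /* No override: fall back to the operand's own size.  Only `case 1'
     appears in the excerpt; the remaining cases are assumed.  */
  switch (mode_size)
    {
    case 1: return "BYTE";
    case 2: return "WORD";
    case 4: return "DWORD";
    case 8: return "QWORD";
    case 16: return "XMMWORD";
    case 32: return "YMMWORD";
    case 64: return "ZMMWORD";
    default: return NULL;
    }
}
```

Under this model, a 16-byte memory operand printed with %X gets no size
keyword at all, while the same operand printed with %q or %k gets QWORD
or DWORD. An empty <iptrh> entry leaves the modifier off, so the keyword
is derived from the operand's own mode.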


-- 
BR,
Hongtao
