On 7/29/21 1:14 AM, Peter Maydell wrote:
> +            r = FN(n[H##ESIZE(e)], m[H##ESIZE(e)], d[H##ESIZE(e)],      \
> +                   0, fpst);                                            \
> +            mergemask(&d[H##ESIZE(e)], r, mask);                        \
> +        }                                                               \
> +        mve_advance_vpt(env);                                           \
> +    }
> +
> +#define DO_VFMS16(N, M, D, F, S) float16_muladd(float16_chs(N), M, D, F, S)
> +#define DO_VFMS32(N, M, D, F, S) float32_muladd(float32_chs(N), M, D, F, S)
> +
> +DO_VFMA(vfmah, 2, uint16_t, float16_muladd)
> +DO_VFMA(vfmas, 4, uint32_t, float32_muladd)
> +DO_VFMA(vfmsh, 2, uint16_t, DO_VFMS16)
> +DO_VFMA(vfmss, 4, uint32_t, DO_VFMS32)

Here's where I think passing float16/float32 as the type will pay off, with

    r = n[H##SIZE(e)];
    if (CHS) {
        r = TYPE##_chs(r);
    }
    r = TYPE##_muladd(r, m[...], d[...], 0, fpst);

r~
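
For illustration only, here is a minimal standalone sketch of the token-pasting idea above. It is not the QEMU code. The float32 typedef and the float32_chs/float32_muladd stubs are simplified stand-ins for the softfloat versions (no flags or float_status argument, and the multiply-add is not truly fused), the demo_ function names are hypothetical, and the H##ESIZE indexing, mergemask, and mve_advance_vpt steps are left out. Only the float32 case is instantiated because it is easy to emulate with a host float.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for softfloat's float32: raw IEEE-754 bits (assumption). */
typedef uint32_t float32;

static_assert(sizeof(float) == sizeof(float32), "host float must be 32 bits");

static float f32_to_host(float32 a)
{
    float f;
    memcpy(&f, &a, sizeof(f));
    return f;
}

static float32 f32_from_host(float f)
{
    float32 a;
    memcpy(&a, &f, sizeof(a));
    return a;
}

/* Simplified stub: negate by flipping the sign bit. */
static float32 float32_chs(float32 a)
{
    return a ^ 0x80000000u;
}

/*
 * Simplified stub: a * b + c computed in double and rounded to float.
 * This is NOT a true fused multiply-add and takes no flags/status.
 */
static float32 float32_muladd(float32 a, float32 b, float32 c)
{
    double r = (double)f32_to_host(a) * f32_to_host(b) + f32_to_host(c);
    return f32_from_host((float)r);
}

/*
 * One macro covers both the add and subtract forms: TYPE selects the
 * helpers by token pasting, and the compile-time constant CHS decides
 * whether the first operand is negated.
 */
#define DO_VFMA(OP, TYPE, CHS)                                      \
    void demo_##OP(TYPE *d, const TYPE *n, const TYPE *m,           \
                   unsigned elems)                                  \
    {                                                               \
        for (unsigned e = 0; e < elems; e++) {                      \
            TYPE r = n[e];                                          \
            if (CHS) {                                              \
                r = TYPE##_chs(r);                                  \
            }                                                       \
            d[e] = TYPE##_muladd(r, m[e], d[e]);                    \
        }                                                           \
    }

DO_VFMA(vfmas, float32, false)   /* d = n * m + d    */
DO_VFMA(vfmss, float32, true)    /* d = (-n) * m + d */
```

With this shape the separate DO_VFMS16/DO_VFMS32 wrapper macros are not needed: the subtract variants differ from the add variants only in the CHS argument, and since CHS is a constant the compiler can drop the untaken branch.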