Re: [RFC PATCH 3/3] hw/openrisc: Add the OpenRISC virtual machine
On Sun, Jun 05, 2022 at 10:58:14AM +0900, Stafford Horne wrote:
> On Fri, Jun 03, 2022 at 09:05:09AM +0200, Geert Uytterhoeven wrote:
> > Hi Stafford,
> >
> > On Thu, Jun 2, 2022 at 9:59 PM Stafford Horne wrote:
> > > On Thu, Jun 02, 2022 at 09:08:52PM +0200, Geert Uytterhoeven wrote:
> > > > On Thu, Jun 2, 2022 at 1:42 PM Joel Stanley wrote:
> > > > > On Fri, 27 May 2022 at 17:27, Stafford Horne wrote:
> > > > > > This patch adds the OpenRISC virtual machine 'virt' for OpenRISC.
> > > > > > This platform allows for a convenient CI platform for toolchain,
> > > > > > software ports and the OpenRISC linux kernel port.
> > > > > >
> > > > > > Much of this has been sourced from the m68k and riscv virt
> > > > > > platforms.
> > > > >
> > > > > I enabled the options:
> > > > >
> > > > > CONFIG_RTC_CLASS=y
> > > > > # CONFIG_RTC_SYSTOHC is not set
> > > > > # CONFIG_RTC_NVMEM is not set
> > > > > CONFIG_RTC_DRV_GOLDFISH=y
> > > > >
> > > > > But it didn't work. It seems the goldfish rtc model doesn't handle a
> > > > > big endian guest running on my little endian host.
> > > > >
> > > > > Doing this fixes it:
> > > > >
> > > > > -.endianness = DEVICE_NATIVE_ENDIAN,
> > > > > +.endianness = DEVICE_HOST_ENDIAN,
> > > > >
> > > > > [0.19] goldfish_rtc 96005000.rtc: registered as rtc0
> > > > > [0.19] goldfish_rtc 96005000.rtc: setting system clock to
> > > > > 2022-06-02T11:16:04 UTC (1654168564)
> > > > >
> > > > > But literally no other model in the tree does this, so I suspect it's
> > > > > not the right fix.
> > > >
> > > > Goldfish devices are supposed to be little endian.
> > > > Unfortunately m68k got this wrong, cfr.
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2e2ac4a3327479f7e2744cdd88a5c823f2057bad
> > > > Please don't duplicate this bad behavior for new architectures
> > >
> > > Thanks for the pointer, I just wired in the goldfish RTC because I
> > > wanted to play with it. I was not attached to it. I can either remove
> > > it or find another RTC.
> >
> > Sorry for being too unclear: the mistake was not to use the Goldfish
> > RTC, but to make its register accesses big-endian.
> > Using Goldfish devices as little-endian devices should be fine.
>
> OK, then I would think this patch would be needed on Goldfish. I tested this
> out and it seems to work:

Sorry, it seems maybe I misunderstood this again. In Arnd's mail [1] he
mentions, at the end:

    It might be a good idea to revisit the qemu implementation and make
    sure that the extra byteswap is only inserted on m68k and not on
    other targets, but hopefully there are no new targets based on goldfish
    anymore and we don't need to care.

So, it seems that in addition to my patch we would need something in m68k to
switch it back to 'native' (big) endian?

Looking at the m68k kernel/qemu interface I see:

  Pre 5.19:

    (data) <-- kernel(readl / little)      <-- m68k qemu (native / big) - RTC/PIC
    (data) <-- kernel(__raw_readl / big)   <-- m68k qemu (native / big) - TTY

  5.19:

    (data) <-- kernel(gf_ioread32 / big)   <-- m68k qemu (native / big) - all

The new fixes to add gf_ioread32/gf_iowrite32 fix this for goldfish and m68k.
This wouldn't have been an issue for little-endian platforms where
readl/writel were originally used.

Why can't m68k switch to little-endian in qemu and the kernel? The m68k virt
platform is not that old, 1 year? Are there a lot of users that this would be
a big problem?

[1] https://lore.kernel.org/lkml/CAK8P3a1oN8NrUjkh2X8jHQbyz42Xo6GSa=5n0gd6vqcxrjm...@mail.gmail.com/

-Stafford

> Patch:
>
> diff --git a/hw/rtc/goldfish_rtc.c b/hw/rtc/goldfish_rtc.c
> index 35e493be31..f1dc5af297 100644
> --- a/hw/rtc/goldfish_rtc.c
> +++ b/hw/rtc/goldfish_rtc.c
> @@ -219,7 +219,7 @@ static int goldfish_rtc_post_load(void *opaque, int version_id)
>  static const MemoryRegionOps goldfish_rtc_ops = {
>      .read = goldfish_rtc_read,
>      .write = goldfish_rtc_write,
> -    .endianness = DEVICE_NATIVE_ENDIAN,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
>      .valid = {
>          .min_access_size = 4,
>          .max_access_size = 4
>
> Boot Log:
>
>  io scheduler mq-deadline registered
>  io scheduler kyber registered
>  Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
>  9000.serial: ttyS0 at MMIO 0x9000 (irq = 2, base_baud = 125) is a 16550A
>  printk: console [ttyS0] enabled
>  loop: module loaded
>  virtio_blk virtio1: [vda] 32768 512-byte logical blocks (16.8 MB/16.0 MiB)
>  Freeing initrd memory: 1696K
> *goldfish_rtc 96005000.rtc: registered as rtc0
> *goldfish_rtc 96005000.rtc: setting system clock to 2022-06-05T01:49:57 UTC (1654393797)
>  NET: Registered PF_PACKET protocol family
>  random: fast init done
>
> -Stafford
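To make the byteswap discussion above concrete, here is a minimal standalone
sketch (plain C, not QEMU or kernel code) of how the same four register bytes
decode under each convention. The helper names reg_read_le32 and
reg_read_be32 are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a 4-byte register image as little-endian. */
static uint32_t reg_read_le32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: decode the same 4 bytes as big-endian. */
static uint32_t reg_read_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

With the bytes 78 56 34 12, the little-endian decode gives 0x12345678 and the
big-endian decode gives 0x78563412. If the device model and the guest driver
disagree on the convention, the guest reads the byteswapped value, which is
the kind of mismatch discussed in this thread.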
Re: [RFC PATCH 3/3] hw/openrisc: Add the OpenRISC virtual machine
Hi folks,

On Sun, Jun 05, 2022 at 04:32:13PM +0900, Stafford Horne wrote:
> Why can't m68k switch to little-endian in qemu and the kernel? The m68k virt
> platform is not that old, 1 year? Are there a lot of users that this would be
> a big problem?

I also share this perspective. AFAICT, m68k virt platform *just*
shipped. Fix this stuff instead of creating more compatibility bloat for
a platform with no new silicon. The risk of making life difficult for
15 minutes for all seven and a half users of that code that only now has
become operational is vastly dwarfed by the good sense to just fix the
mistake. Treat the endian thing as a *bug* rather than a sacred ABI.
Bugs only become sacred if you let them sit for years and large numbers
of people grow to rely on spacebar heating. Otherwise they're just bugs.
This can be fixed.

Jason
[PATCH 0/2] Fixes for ui/gtk-gl-area
The first patch fixes a GL context leak.

The second patch fixes a black guest screen on Wayland with OpenGL
accelerated QEMU graphics devices. This bug doesn't seem to be related
to issues #910, #865, #671 or #298.

Volker Rümelin (2):
  ui/gtk-gl-area: implement GL context destruction
  ui/gtk-gl-area: create the requested GL context version

 ui/gtk-gl-area.c | 39 +--
 ui/trace-events  |  2 ++
 2 files changed, 39 insertions(+), 2 deletions(-)

--
2.35.3
[PATCH 2/2] ui/gtk-gl-area: create the requested GL context version
Since about 2018 virglrenderer (commit fa835b0f88 "vrend: don't
hardcode context version") tries to open the highest available GL
context version. This is done by creating the known GL context
versions from the highest to the lowest until (*create_gl_context)
returns a context != NULL.

This does not work properly with
the current QEMU gd_gl_area_create_context() function, because
gdk_gl_context_realize() on Wayland creates a version 3.0 legacy
context if the requested GL context version can't be created.

In order for virglrenderer to find the highest available GL
context version, return NULL if the created context version is
lower than the requested version.

This fixes the following error:
QEMU started with -device virtio-vga-gl -display gtk,gl=on.
Under Wayland, the guest window remains black and the following
information can be seen on the host.

gl_version 30 - compat profile
(qemu:5978): Gdk-WARNING **: 16:19:01.533:
  gdk_gl_context_set_required_version
  - GL context versions less than 3.2 are not supported.

(qemu:5978): Gdk-WARNING **: 16:19:01.537:
  gdk_gl_context_set_required_version -
  GL context versions less than 3.2 are not supported.

(qemu:5978): Gdk-WARNING **: 16:19:01.554:
  gdk_gl_context_set_required_version -
  GL context versions less than 3.2 are not supported.
vrend_renderer_fill_caps: Entering with stale GL error: 1282

To reproduce this error, an OpenGL driver is required on the host
that doesn't have the latest OpenGL extensions fully implemented.
An example for this is the Intel i965 driver on a Haswell processor.

Signed-off-by: Volker Rümelin
---
 ui/gtk-gl-area.c | 31 ++-
 ui/trace-events  |  1 +
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/ui/gtk-gl-area.c b/ui/gtk-gl-area.c
index 0e20ea031d..2e0129c28c 100644
--- a/ui/gtk-gl-area.c
+++ b/ui/gtk-gl-area.c
@@ -170,6 +170,23 @@ void gd_gl_area_switch(DisplayChangeListener *dcl,
     }
 }
 
+static int gd_cmp_gl_context_version(int major, int minor, QEMUGLParams *params)
+{
+    if (major > params->major_ver) {
+        return 1;
+    }
+    if (major < params->major_ver) {
+        return -1;
+    }
+    if (minor > params->minor_ver) {
+        return 1;
+    }
+    if (minor < params->minor_ver) {
+        return -1;
+    }
+    return 0;
+}
+
 QEMUGLContext gd_gl_area_create_context(DisplayGLCtx *dgc,
                                         QEMUGLParams *params)
 {
@@ -177,8 +194,8 @@ QEMUGLContext gd_gl_area_create_context(DisplayGLCtx *dgc,
     GdkWindow *window;
     GdkGLContext *ctx;
     GError *err = NULL;
+    int major, minor;
 
-    gtk_gl_area_make_current(GTK_GL_AREA(vc->gfx.drawing_area));
     window = gtk_widget_get_window(vc->gfx.drawing_area);
     ctx = gdk_window_create_gl_context(window, &err);
     if (err) {
@@ -196,6 +213,18 @@ QEMUGLContext gd_gl_area_create_context(DisplayGLCtx *dgc,
         g_clear_object(&ctx);
         return NULL;
     }
+
+    gdk_gl_context_make_current(ctx);
+    gdk_gl_context_get_version(ctx, &major, &minor);
+    gdk_gl_context_clear_current();
+    gtk_gl_area_make_current(GTK_GL_AREA(vc->gfx.drawing_area));
+
+    if (gd_cmp_gl_context_version(major, minor, params) == -1) {
+        /* created ctx version < requested version */
+        g_clear_object(&ctx);
+    }
+
+    trace_gd_gl_area_create_context(ctx, params->major_ver, params->minor_ver);
     return ctx;
 }
 
diff --git a/ui/trace-events b/ui/trace-events
index 1040ba0f88..a922f00e10 100644
--- a/ui/trace-events
+++ b/ui/trace-events
@@ -26,6 +26,7 @@ gd_key_event(const char *tab, int gdk_keycode, int qkeycode, const char *action)
 gd_grab(const char *tab, const char *device, const char *reason) "tab=%s, dev=%s, reason=%s"
 gd_ungrab(const char *tab, const char *device) "tab=%s, dev=%s"
 gd_keymap_windowing(const char *name) "backend=%s"
+gd_gl_area_create_context(void *ctx, int major, int minor) "ctx=%p, major=%d, minor=%d"
 gd_gl_area_destroy_context(void *ctx, void *current_ctx) "ctx=%p, current_ctx=%p"
 
 # vnc-auth-sasl.c
--
2.35.3
[PATCH 1/2] ui/gtk-gl-area: implement GL context destruction
The counterpart function for gd_gl_area_create_context() is
currently empty. Implement the gd_gl_area_destroy_context()
function to avoid GL context leaks.

Signed-off-by: Volker Rümelin
---
 ui/gtk-gl-area.c | 8 +++-
 ui/trace-events  | 1 +
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/ui/gtk-gl-area.c b/ui/gtk-gl-area.c
index fc5a082eb8..0e20ea031d 100644
--- a/ui/gtk-gl-area.c
+++ b/ui/gtk-gl-area.c
@@ -201,7 +201,13 @@ QEMUGLContext gd_gl_area_create_context(DisplayGLCtx *dgc,
 
 void gd_gl_area_destroy_context(DisplayGLCtx *dgc, QEMUGLContext ctx)
 {
-    /* FIXME */
+    GdkGLContext *current_ctx = gdk_gl_context_get_current();
+
+    trace_gd_gl_area_destroy_context(ctx, current_ctx);
+    if (ctx == current_ctx) {
+        gdk_gl_context_clear_current();
+    }
+    g_clear_object(&ctx);
 }
 
 void gd_gl_area_scanout_texture(DisplayChangeListener *dcl,
diff --git a/ui/trace-events b/ui/trace-events
index f78b5e6606..1040ba0f88 100644
--- a/ui/trace-events
+++ b/ui/trace-events
@@ -26,6 +26,7 @@ gd_key_event(const char *tab, int gdk_keycode, int qkeycode, const char *action)
 gd_grab(const char *tab, const char *device, const char *reason) "tab=%s, dev=%s, reason=%s"
 gd_ungrab(const char *tab, const char *device) "tab=%s, dev=%s"
 gd_keymap_windowing(const char *name) "backend=%s"
+gd_gl_area_destroy_context(void *ctx, void *current_ctx) "ctx=%p, current_ctx=%p"
 
 # vnc-auth-sasl.c
 # vnc-auth-vencrypt.c
--
2.35.3
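For readers following this series, the highest-to-lowest probing that patch
2/2 relies on can be sketched in isolation. This is only a mock under stated
assumptions: struct gl_version, mock_create_context() and probe_highest() are
hypothetical names, not QEMU, GDK or virglrenderer APIs, and the "driver" is
simulated by a maximum supported version plus the silent 3.0 fallback that
the commit message describes for Wayland.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in; not a QEMU or virglrenderer type. */
struct gl_version { int major; int minor; };

/* Same ordering as gd_cmp_gl_context_version(): -1, 0 or 1. */
static int cmp_gl_version(struct gl_version got, struct gl_version want)
{
    if (got.major != want.major) {
        return got.major > want.major ? 1 : -1;
    }
    if (got.minor != want.minor) {
        return got.minor > want.minor ? 1 : -1;
    }
    return 0;
}

/*
 * Mock context creation: if the request exceeds what the simulated driver
 * supports, a 3.0 context is produced instead. Returns 1 only when the
 * resulting version is not lower than the requested one, mirroring the
 * "return NULL if lower than requested" rule of the patch.
 */
static int mock_create_context(struct gl_version want, struct gl_version max)
{
    struct gl_version fallback = { 3, 0 };
    struct gl_version got = cmp_gl_version(want, max) <= 0 ? want : fallback;

    return cmp_gl_version(got, want) >= 0;
}

/* Try versions in the given (descending) order; index of first hit or -1. */
static int probe_highest(const struct gl_version *list, size_t n,
                         struct gl_version max)
{
    assert(list != NULL);
    for (size_t i = 0; i < n; i++) {
        if (mock_create_context(list[i], max)) {
            return (int)i;
        }
    }
    return -1;
}
```

Without the version check, the first attempt would already report success
with a 3.0 context, and the probe loop would stop at the wrong entry.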
Re: [RFC PATCH 1/3] target/openrisc: Add basic support for semihosting
On 6/4/22 17:57, Stafford Horne wrote:
> I am kind of leaning towards dropping the semi-hosting patches and only
> moving forward with the virt patches. The reason being that 1. we would
> not need to expand the architecture spec to support the qemu virt
> platform, and we would need to document the NOP's formally, and
> 2. OpenRISC doesn't really support the full "semihosting" facilities for
> file open/close/write etc.

I agree that "virt" would do more for openrisc devel than these nops.

> Also, if we have virt I can't imagine anyone using the semihosting much.

IMO, semihosting is most valuable for writing regression tests and not much
more. (You have no control over the exit status of qemu with normal shutdown,
as compared with semihosting exit.)

r~
dbus-display-test is flakey
Hi Marc-André,

dbus-display-test seems flaky. I'm occasionally seeing:

▶ 692/746 ERROR:../tests/qtest/dbus-display-test.c:68:test_dbus_display_vm: assertion failed (qemu_dbus_display1_vm_get_name(QEMU_DBUS_DISPLAY1_VM(vm)) == "dbus-test"): (NULL == "dbus-test") ERROR

Examples:

fedora rawhide x86_64:
https://kojipkgs.fedoraproject.org//work/tasks/4945/87834945/build.log
fedora rawhide aarch64:
https://kojipkgs.fedoraproject.org//work/tasks/4946/87834946/build.log
fedora 35 x86_64:
https://download.copr.fedorainfracloud.org/results/@virtmaint-sig/virt-preview/fedora-35-x86_64/04491978-qemu/builder-live.log.gz

This is qemu v7.0.0 with some unrelated patches on top.

/usr/bin/make -O -j5 V=1 VERBOSE=1 check

Side question: I know I can patch meson.build to skip the test, or
similar patch changes, but is there a non-patch way to skip specific tests?

Thanks,
Cole
[PATCH] qemu-iotests: Discard stderr when probing devices
./configure --enable-modules --enable-smartcard \
    --target-list=x86_64-softmmu,s390x-softmmu
make
cd build
QEMU_PROG=`pwd`/s390x-softmmu/qemu-system-s390x \
    ../tests/check-block.sh qcow2
...
--- /home/crobinso/src/qemu/tests/qemu-iotests/127.out
+++ /home/crobinso/src/qemu/build/tests/qemu-iotests/scratch/127.out.bad
@@ -1,4 +1,18 @@
 QA output created by 127
+Failed to open module: /home/crobinso/src/qemu/build/hw-usb-smartcard.so: undefined symbol: ccid_card_ccid_attach
...
--- /home/crobinso/src/qemu/tests/qemu-iotests/267.out
+++ /home/crobinso/src/qemu/build/tests/qemu-iotests/scratch/267.out.bad
@@ -1,4 +1,11 @@
 QA output created by 267
+Failed to open module: /home/crobinso/src/qemu/build/hw-usb-smartcard.so: undefined symbol: ccid_card_ccid_attach

The stderr spew is its own known issue, but seems like iotests should
be discarding stderr in this case.

Signed-off-by: Cole Robinson
---
 tests/qemu-iotests/common.rc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 165b54a61e..db757025cb 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -982,7 +982,7 @@ _require_large_file()
 #
 _require_devices()
 {
-    available=$($QEMU -M none -device help | \
+    available=$($QEMU -M none -device help 2> /dev/null | \
                 grep ^name | sed -e 's/^name "//' -e 's/".*$//')
     for device
     do
@@ -994,7 +994,7 @@ _require_devices()
 
 _require_one_device_of()
 {
-    available=$($QEMU -M none -device help | \
+    available=$($QEMU -M none -device help 2> /dev/null | \
                 grep ^name | sed -e 's/^name "//' -e 's/".*$//')
     for device
     do
--
2.36.1
[PATCH] hw/mips/boston: Initialize g_autofree pointers
Fixes compilation due to false positives with -Werror:

  In file included from /usr/include/glib-2.0/glib.h:114,
                   from qemu/src/include/glib-compat.h:32,
                   from qemu/src/include/qemu/osdep.h:144,
                   from ../src/hw/mips/boston.c:20:
  In function ‘g_autoptr_cleanup_generic_gfree’,
      inlined from ‘boston_mach_init’ at ../src/hw/mips/boston.c:790:52:
  /usr/include/glib-2.0/glib/glib-autocleanups.h:28:3: error: ‘dtb_load_data’ may be used uninitialized [-Werror=maybe-uninitialized]
     28 |   g_free (*pp);
        |   ^~~~
  ../src/hw/mips/boston.c: In function ‘boston_mach_init’:
  ../src/hw/mips/boston.c:790:52: note: ‘dtb_load_data’ was declared here
    790 | g_autofree const void *dtb_file_data, *dtb_load_data;
        |^
  In function ‘g_autoptr_cleanup_generic_gfree’,
      inlined from ‘boston_mach_init’ at ../src/hw/mips/boston.c:790:36:
  /usr/include/glib-2.0/glib/glib-autocleanups.h:28:3: error: ‘dtb_file_data’ may be used uninitialized [-Werror=maybe-uninitialized]
     28 |   g_free (*pp);
        |   ^~~~
  ../src/hw/mips/boston.c: In function ‘boston_mach_init’:
  ../src/hw/mips/boston.c:790:36: note: ‘dtb_file_data’ was declared here
    790 | g_autofree const void *dtb_file_data, *dtb_load_data;
        |^
  cc1: all warnings being treated as errors

Signed-off-by: Bernhard Beschow
---
 hw/mips/boston.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/mips/boston.c b/hw/mips/boston.c
index 59ca08b93a..1debca18ec 100644
--- a/hw/mips/boston.c
+++ b/hw/mips/boston.c
@@ -787,7 +787,8 @@ static void boston_mach_init(MachineState *machine)
 
     if (kernel_size > 0) {
         int dt_size;
-        g_autofree const void *dtb_file_data, *dtb_load_data;
+        g_autofree const void *dtb_file_data = NULL;
+        g_autofree const void *dtb_load_data = NULL;
         hwaddr dtb_paddr = QEMU_ALIGN_UP(kernel_high, 64 * KiB);
         hwaddr dtb_vaddr = cpu_mips_phys_to_kseg0(NULL, dtb_paddr);
 
--
2.36.1
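As background on why NULL initialization is the usual idiom here: a cleanup
handler runs on every exit from the variable's scope, including exits taken
before the variable was ever assigned. The sketch below shows this with the
GCC/Clang cleanup attribute directly instead of glib. counting_free(),
AUTOFREE and load() are hypothetical names for this illustration and are not
part of glib or QEMU.

```c
#include <assert.h>
#include <stdlib.h>

static int free_calls;   /* counts handler invocations, for illustration */

/* Hypothetical handler; receives a pointer to the annotated variable. */
static void counting_free(void *pp)
{
    void *p = *(void **)pp;

    free_calls++;
    free(p);             /* free(NULL) is a no-op, so NULL init is safe */
}

#define AUTOFREE __attribute__((cleanup(counting_free)))

/* Returns 0 on the early-exit path, 1 when both allocations succeeded. */
static int load(int fail_early)
{
    AUTOFREE void *a = NULL;
    AUTOFREE void *b = NULL;

    assert(fail_early == 0 || fail_early == 1);
    if (fail_early) {
        return 0;        /* both handlers still run; a and b are NULL */
    }
    a = malloc(16);
    b = malloc(16);
    return a != NULL && b != NULL;
}
```

If a and b were left uninitialized, the early return would hand
indeterminate pointers to free(). Whether a given compiler can prove that no
such path exists is a separate question, which is where warnings like the
one quoted above come from.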
[PATCH 1/2] target/arm: SCR_EL3 bits 4,5 are always res0
These bits do not depend on whether or not el1 supports aa32.

Signed-off-by: Richard Henderson
---
 target/arm/helper.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 40da63913c..c262b00c3c 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -1752,11 +1752,8 @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
     ARMCPU *cpu = env_archcpu(env);
 
     if (ri->state == ARM_CP_STATE_AA64) {
-        if (arm_feature(env, ARM_FEATURE_AARCH64) &&
-            !cpu_isar_feature(aa64_aa32_el1, cpu)) {
-                value |= SCR_FW | SCR_AW; /* these two bits are RES1. */
-        }
-        valid_mask &= ~SCR_NET;
+        value |= SCR_FW | SCR_AW;      /* RES1 */
+        valid_mask &= ~SCR_NET;        /* RES0 */
 
         if (cpu_isar_feature(aa64_ras, cpu)) {
             valid_mask |= SCR_TERR;
--
2.34.1
[PATCH 0/2] target/arm: SCR_EL3 RES0, RAO/WI tweaks
Adjust RW, fixing #1062, and adjusting bits [4:2].

r~

Richard Henderson (2):
  target/arm: SCR_EL3 bits 4,5 are always res0
  target/arm: SCR_EL3.RW is RAO/WI without AArch32 EL[12]

 target/arm/cpu.h    |  5 +
 target/arm/helper.c | 11 ++-
 2 files changed, 11 insertions(+), 5 deletions(-)

--
2.34.1
[PATCH 2/2] target/arm: SCR_EL3.RW is RAO/WI without AArch32 EL[12]
Since DDI0487F.a, the RW bit is RAO/WI. When specifically
targeting such a cpu, e.g. cortex-a76, it is legitimate to
ignore the bit within the secure monitor.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1062
Signed-off-by: Richard Henderson
---
 target/arm/cpu.h    | 5 +
 target/arm/helper.c | 4 
 2 files changed, 9 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index c1865ad5da..a7c45d0d66 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3947,6 +3947,11 @@ static inline bool isar_feature_aa64_aa32_el1(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, EL1) >= 2;
 }
 
+static inline bool isar_feature_aa64_aa32_el2(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, EL2) >= 2;
+}
+
 static inline bool isar_feature_aa64_ras(const ARMISARegisters *id)
 {
     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, RAS) != 0;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index c262b00c3c..84232a6437 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -1755,6 +1755,10 @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
         value |= SCR_FW | SCR_AW;      /* RES1 */
         valid_mask &= ~SCR_NET;        /* RES0 */
 
+        if (!cpu_isar_feature(aa64_aa32_el1, cpu) &&
+            !cpu_isar_feature(aa64_aa32_el2, cpu)) {
+            value |= SCR_RW;           /* RAO/WI*/
+        }
         if (cpu_isar_feature(aa64_ras, cpu)) {
             valid_mask |= SCR_TERR;
         }
--
2.34.1
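The combined effect of the two patches on a guest write can be modelled
outside QEMU. This is a rough sketch only: scr_el3_sanitize() is a
hypothetical helper, the bit positions are assumptions stated in the code,
and the real scr_write() handles many more fields than shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions assumed for this illustration. */
#define SCR_FW   (UINT64_C(1) << 4)
#define SCR_AW   (UINT64_C(1) << 5)
#define SCR_NET  (UINT64_C(1) << 6)
#define SCR_RW   (UINT64_C(1) << 10)

/*
 * Hypothetical model of the AArch64 write path: RES1 bits are forced on,
 * RES0 bits are masked off, and RW reads as one (writes ignored) when
 * neither EL1 nor EL2 supports AArch32.
 */
static uint64_t scr_el3_sanitize(uint64_t value, uint64_t valid_mask,
                                 int has_aa32_el1, int has_aa32_el2)
{
    /* The sketch assumes the caller's mask keeps the RES1 bits. */
    assert((valid_mask & (SCR_FW | SCR_AW)) == (SCR_FW | SCR_AW));

    value |= SCR_FW | SCR_AW;          /* RES1 */
    valid_mask &= ~SCR_NET;            /* RES0 */
    if (!has_aa32_el1 && !has_aa32_el2) {
        value |= SCR_RW;               /* RAO/WI */
    }
    return value & valid_mask;
}
```

With a mask of 0x3fff, a write of 0 on a CPU without AArch32 at EL1 and EL2
yields 0x430 (FW, AW and RW set), while a CPU with AArch32 EL1 keeps RW
under guest control and yields 0x30.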
Re: [PATCH] hw/mips/boston: Initialize g_autofree pointers
On 6/5/22 08:19, Bernhard Beschow wrote:
> Fixes compilation due to false positives with -Werror:
>
>   In file included from /usr/include/glib-2.0/glib.h:114,
>                    from qemu/src/include/glib-compat.h:32,
>                    from qemu/src/include/qemu/osdep.h:144,
>                    from ../src/hw/mips/boston.c:20:
>   In function ‘g_autoptr_cleanup_generic_gfree’,
>       inlined from ‘boston_mach_init’ at ../src/hw/mips/boston.c:790:52:
>   /usr/include/glib-2.0/glib/glib-autocleanups.h:28:3: error: ‘dtb_load_data’ may be used uninitialized [-Werror=maybe-uninitialized]
>      28 |   g_free (*pp);
>         |   ^~~~
>   ../src/hw/mips/boston.c: In function ‘boston_mach_init’:
>   ../src/hw/mips/boston.c:790:52: note: ‘dtb_load_data’ was declared here
>     790 | g_autofree const void *dtb_file_data, *dtb_load_data;
>         |^
>   In function ‘g_autoptr_cleanup_generic_gfree’,
>       inlined from ‘boston_mach_init’ at ../src/hw/mips/boston.c:790:36:
>   /usr/include/glib-2.0/glib/glib-autocleanups.h:28:3: error: ‘dtb_file_data’ may be used uninitialized [-Werror=maybe-uninitialized]
>      28 |   g_free (*pp);
>         |   ^~~~
>   ../src/hw/mips/boston.c: In function ‘boston_mach_init’:
>   ../src/hw/mips/boston.c:790:36: note: ‘dtb_file_data’ was declared here
>     790 | g_autofree const void *dtb_file_data, *dtb_load_data;
>         |^
>   cc1: all warnings being treated as errors
>
> Signed-off-by: Bernhard Beschow
> ---
>  hw/mips/boston.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Reviewed-by: Richard Henderson

r~
Re: [PATCH v5 2/3] target/riscv: Add stimecmp support
On Thu, Jun 2, 2022 at 12:02 AM Alistair Francis wrote:
>
> On Wed, Jun 1, 2022 at 4:16 AM Atish Patra wrote:
> >
> > stimecmp allows the supervisor mode to update stimecmp CSR directly
> > to program the next timer interrupt. This CSR is part of the Sstc
> > extension which was ratified recently.
> >
> > Signed-off-by: Atish Patra
> > ---
> >  target/riscv/cpu.c         |  8 
> >  target/riscv/cpu.h         |  5 ++
> >  target/riscv/cpu_bits.h    |  4 ++
> >  target/riscv/csr.c         | 81 +++
> >  target/riscv/machine.c     |  1 +
> >  target/riscv/meson.build   |  3 +-
> >  target/riscv/time_helper.c | 98 ++
> >  target/riscv/time_helper.h | 30 
> >  8 files changed, 229 insertions(+), 1 deletion(-)
> >  create mode 100644 target/riscv/time_helper.c
> >  create mode 100644 target/riscv/time_helper.h
> >
> > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> > index 19f4e8294042..d58dd2f857a7 100644
> > --- a/target/riscv/cpu.c
> > +++ b/target/riscv/cpu.c
> > @@ -23,6 +23,7 @@
> >  #include "qemu/log.h"
> >  #include "cpu.h"
> >  #include "internals.h"
> > +#include "time_helper.h"
> >  #include "exec/exec-all.h"
> >  #include "qapi/error.h"
> >  #include "qemu/error-report.h"
> > @@ -779,7 +780,12 @@ static void riscv_cpu_init(Object *obj)
> >  #ifndef CONFIG_USER_ONLY
> >      qdev_init_gpio_in(DEVICE(cpu), riscv_cpu_set_irq,
> >                        IRQ_LOCAL_MAX + IRQ_LOCAL_GUEST_MAX);
> > +
> > +    if (cpu->cfg.ext_sstc) {
> > +        riscv_timer_init(cpu);
> > +    }
> >  #endif /* CONFIG_USER_ONLY */
> > +
> >  }
> >
> >  static Property riscv_cpu_properties[] = {
> > @@ -806,6 +812,7 @@ static Property riscv_cpu_properties[] = {
> >      DEFINE_PROP_BOOL("mmu", RISCVCPU, cfg.mmu, true),
> >      DEFINE_PROP_BOOL("pmp", RISCVCPU, cfg.pmp, true),
> >      DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true),
> > +    DEFINE_PROP_BOOL("sstc", RISCVCPU, cfg.ext_sstc, true),
>
> Do we want this enabled by default?
>

sstc extension will result in performance improvements as it avoids
the SBI calls & interrupt forwarding path. That's why I think it
should be enabled by default.

> >
> >      DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec),
> >      DEFINE_PROP_STRING("vext_spec", RISCVCPU, cfg.vext_spec),
> > @@ -965,6 +972,7 @@ static void riscv_isa_string_ext(RISCVCPU *cpu, char **isa_str, int max_str_len)
> >          ISA_EDATA_ENTRY(zbs, ext_zbs),
> >          ISA_EDATA_ENTRY(zve32f, ext_zve32f),
> >          ISA_EDATA_ENTRY(zve64f, ext_zve64f),
> > +        ISA_EDATA_ENTRY(sstc, ext_sstc),
> >          ISA_EDATA_ENTRY(svinval, ext_svinval),
> >          ISA_EDATA_ENTRY(svnapot, ext_svnapot),
> >          ISA_EDATA_ENTRY(svpbmt, ext_svpbmt),
> > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> > index 1119d5201066..9a5e02f217ba 100644
> > --- a/target/riscv/cpu.h
> > +++ b/target/riscv/cpu.h
> > @@ -276,6 +276,9 @@ struct CPUArchState {
> >      uint64_t mfromhost;
> >      uint64_t mtohost;
> >
> > +    /* Sstc CSRs */
> > +    uint64_t stimecmp;
> > +
> >      /* physical memory protection */
> >      pmp_table_t pmp_state;
> >      target_ulong mseccfg;
> > @@ -329,6 +332,7 @@ struct CPUArchState {
> >      float_status fp_status;
> >
> >      /* Fields from here on are preserved across CPU reset. */
> > +    QEMUTimer *stimer; /* Internal timer for S-mode interrupt */
> >
> >      hwaddr kernel_addr;
> >      hwaddr fdt_addr;
> > @@ -379,6 +383,7 @@ struct RISCVCPUConfig {
> >      bool ext_counters;
> >      bool ext_ifencei;
> >      bool ext_icsr;
> > +    bool ext_sstc;
> >      bool ext_svinval;
> >      bool ext_svnapot;
> >      bool ext_svpbmt;
> > diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
> > index 4e5b630f5965..29d0e4a1be01 100644
> > --- a/target/riscv/cpu_bits.h
> > +++ b/target/riscv/cpu_bits.h
> > @@ -215,6 +215,10 @@
> >  #define CSR_STVAL           0x143
> >  #define CSR_SIP             0x144
> >
> > +/* Sstc supervisor CSRs */
> > +#define CSR_STIMECMP        0x14D
> > +#define CSR_STIMECMPH       0x15D
> > +
> >  /* Supervisor Protection and Translation */
> >  #define CSR_SPTBR           0x180
> >  #define CSR_SATP            0x180
> > diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> > index 245f007e66e1..48d07911ae14 100644
> > --- a/target/riscv/csr.c
> > +++ b/target/riscv/csr.c
> > @@ -21,6 +21,7 @@
> >  #include "qemu/log.h"
> >  #include "qemu/timer.h"
> >  #include "cpu.h"
> > +#include "time_helper.h"
> >  #include "qemu/main-loop.h"
> >  #include "exec/exec-all.h"
> >  #include "sysemu/cpu-timers.h"
> > @@ -537,6 +538,76 @@ static RISCVException read_timeh(CPURISCVState *env, int csrno,
> >      return RISCV_EXCP_NONE;
> >  }
> >
> > +static RISCVException sstc(CPURISCVState *env, int csrno)
> > +{
> > +    CPUState *cs = env_cpu(env);
> > +    RISCVCPU *cpu = RISCV_CPU(cs);
> > +
> > +    if (!cpu->cfg.ext_sstc || !env->rdt
Re: [PATCH qemu v18 00/16] Add tail agnostic behavior for rvv instructions
On Fri, May 13, 2022 at 9:55 PM ~eopxd wrote:
>
> According to v-spec, tail agnostic behavior can be either kept as
> undisturbed or set elements' bits to all 1s. To distinguish the
> difference of tail policies, QEMU should be able to simulate the tail
> agnostic behavior as "set tail elements' bits to all 1s". An option
> 'rvv_ta_all_1s' is added to enable the behavior, it is default as
> disabled.
>
> There are multiple possibilities for agnostic elements according to
> v-spec. The main intent of this patch-set is to add an option that
> can distinguish between tail policies. Setting agnostic elements to
> all 1s makes things simple and allows QEMU to express this.
>
> We may explore other possibilities of agnostic behavior by adding
> other options in the future. Please understand that this patch-set
> is limited.
>
> v2 updates:
> - Addressed comments from Weiwei Li
> - Added commit tail agnostic on load / store instructions (which
>   I forgot to include into the patch-set)
>
> v3 updates:
> - Missed the very 1st commit, adding it back
>
> v4 updates:
> - Renamed vlmax to total_elems
> - Deal with tail element when vl_eq_vlmax == true
>
> v5 updates:
> - Let `vext_get_total_elems` take `desc` and `esz`
> - Utilize `simd_maxsz(desc)` to get `vlenb`
> - Fix alignments to code
>
> v6 updates:
> - Fix `vext_get_total_elems`
>
> v7 updates:
> - Reuse `max_elems` for vector load / store helper functions. The
>   translation sets desc's `lmul` to `min(1, lmul)`, making
>   `vext_max_elems` equivalent to `vext_get_total_elems`.
>
> v8 updates:
> - Simplify `vext_set_elems_1s`, don't need `vext_set_elems_1s_fns`
> - Fix `vext_get_total_elems`, it should derive upon EMUL instead
>   of LMUL
>
> v9 updates:
> - Let instructions that are tail agnostic regardless of vta respect the
>   option and not the vta.
>
> v10 updates:
> - Correct range to set element to 1s for load instructions
>
> v11 updates:
> - Separate addition of option 'rvv_ta_all_1s' as a new (last) commit
> - Add description to show intent of the option in first commit for the
>   optional tail agnostic behavior
> - Tag WeiWei as Reviewed-by for all commits
> - Tag Alistair as Reviewed-by for commit 01, 02
> - Tag Alistair as Acked-by for commit 03
>
> v12 updates:
> - Add missing space in WeiWei's "Reviewed-by" tag
>
> v13 updates:
> - Fix tail agnostic for vext_ldst_us. The function operates on input
>   parameter 'evl' rather than 'env->vl'.
> - Fix tail elements for vector segment load / store instructions
>   A vector segment load / store instruction may contain fractional
>   lmul with nf * lmul > 1. The rest of the elements in the last
>   register should be treated as tail elements.
> - Fix tail agnostic length for instructions with mask destination
>   register. Instructions with mask destination register should have
>   'vlen - vl' tail elements.
>
> v14 updates:
> - Pass lmul information into vector helper function.
>   `vext_get_total_elems` needs it.
>
> v15 updates:
> - Rebase to latest `master`
> - Tag Alistair as Acked by for commit 04 ~ 14
> - Tag Alistair as Acked by for commit 15
>
> v16 updates:
> - Fix bug, when encountering situation when lmul < 0 and vl_eq_vlmax,
>   the original version will override on `vd` but the computation will
>   override again, meaning the tail elements will not be set correctly.
>   Now, we don't use TCG functions if we are trying to simulate all 1s
>   for agnostic and use vector helpers instead.
>
> v17 updates:
> - Add "Prune access_type parameter" commit to cleanup vector load/
>   store functions. Then add parameter `is_load` in vector helper
>   functions to enable vta behavior in the commit for adding vta on
>   vector load/store functions.
>
> v18 updates:
> - Don't use `is_load` parameter in vector helper. Don't let vta pass
>   through in `trans_rvv.inc`
>
> eopXD (16):
>   target/riscv: rvv: Prune redundant ESZ, DSZ parameter passed
>   target/riscv: rvv: Prune redundant access_type parameter passed
>   target/riscv: rvv: Rename ambiguous esz
>   target/riscv: rvv: Early exit when vstart >= vl
>   target/riscv: rvv: Add tail agnostic for vv instructions
>   target/riscv: rvv: Add tail agnostic for vector load / store
>     instructions
>   target/riscv: rvv: Add tail agnostic for vx, vvm, vxm instructions
>   target/riscv: rvv: Add tail agnostic for vector integer shift
>     instructions
>   target/riscv: rvv: Add tail agnostic for vector integer comparison
>     instructions
>   target/riscv: rvv: Add tail agnostic for vector integer merge and move
>     instructions
>   target/riscv: rvv: Add tail agnostic for vector fix-point arithmetic
>     instructions
>   target/riscv: rvv: Add tail agnostic for vector floating-point
>     instructions
>   target/riscv: rvv: Add tail agnostic for vector reduction instructions
>   target/riscv: rvv: Add tail agnostic for vector mask instructions
>   target/riscv: rvv: Add tail agnostic for vector permutation
>     instructions
>   target/riscv: rv
Re: [PATCH v3 3/4] target/riscv: Update [m|h]tinst CSR in riscv_cpu_do_interrupt()
On Thu, May 26, 2022 at 8:12 PM Anup Patel wrote:
>
> We should write transformed instruction encoding of the trapped
> instruction in [m|h]tinst CSR at time of taking trap as defined
> by the RISC-V privileged specification v1.12.
>
> Signed-off-by: Anup Patel

Reviewed-by: Alistair Francis

Alistair

> ---
>  target/riscv/cpu_helper.c | 210 +-
>  target/riscv/instmap.h    |  43 
>  2 files changed, 249 insertions(+), 4 deletions(-)
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index d99fac9d2d..2a2b6776fb 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -22,6 +22,7 @@
>  #include "qemu/main-loop.h"
>  #include "cpu.h"
>  #include "exec/exec-all.h"
> +#include "instmap.h"
>  #include "tcg/tcg-op.h"
>  #include "trace.h"
>  #include "semihosting/common-semi.h"
> @@ -1316,6 +1317,200 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>
>      return true;
>  }
> +
> +static target_ulong riscv_transformed_insn(CPURISCVState *env,
> +                                           target_ulong insn)
> +{
> +    bool xinsn_has_addr_offset = false;
> +    target_ulong xinsn = 0;
> +
> +    /*
> +     * Only Quadrant 0 and Quadrant 2 of RVC instruction space need to
> +     * be uncompressed. The Quadrant 1 of RVC instruction space need
> +     * not be transformed because these instructions won't generate
> +     * any load/store trap.
> +     */
> +
> +    if ((insn & 0x3) != 0x3) {
> +        /* Transform 16bit instruction into 32bit instruction */
> +        switch (GET_C_OP(insn)) {
> +        case OPC_RISC_C_OP_QUAD0: /* Quadrant 0 */
> +            switch (GET_C_FUNC(insn)) {
> +            case OPC_RISC_C_FUNC_FLD_LQ:
> +                if (riscv_cpu_xlen(env) != 128) { /* C.FLD (RV32/64) */
> +                    xinsn = OPC_RISC_FLD;
> +                    xinsn = SET_RD(xinsn, GET_C_RS2S(insn));
> +                    xinsn = SET_RS1(xinsn, GET_C_RS1S(insn));
> +                    xinsn = SET_I_IMM(xinsn, GET_C_LD_IMM(insn));
> +                    xinsn_has_addr_offset = true;
> +                }
> +                break;
> +            case OPC_RISC_C_FUNC_LW: /* C.LW */
> +                xinsn = OPC_RISC_LW;
> +                xinsn = SET_RD(xinsn, GET_C_RS2S(insn));
> +                xinsn = SET_RS1(xinsn, GET_C_RS1S(insn));
> +                xinsn = SET_I_IMM(xinsn, GET_C_LW_IMM(insn));
> +                xinsn_has_addr_offset = true;
> +                break;
> +            case OPC_RISC_C_FUNC_FLW_LD:
> +                if (riscv_cpu_xlen(env) == 32) { /* C.FLW (RV32) */
> +                    xinsn = OPC_RISC_FLW;
> +                    xinsn = SET_RD(xinsn, GET_C_RS2S(insn));
> +                    xinsn = SET_RS1(xinsn, GET_C_RS1S(insn));
> +                    xinsn = SET_I_IMM(xinsn, GET_C_LW_IMM(insn));
> +                    xinsn_has_addr_offset = true;
> +                } else { /* C.LD (RV64/RV128) */
> +                    xinsn = OPC_RISC_LD;
> +                    xinsn = SET_RD(xinsn, GET_C_RS2S(insn));
> +                    xinsn = SET_RS1(xinsn, GET_C_RS1S(insn));
> +                    xinsn = SET_I_IMM(xinsn, GET_C_LD_IMM(insn));
> +                    xinsn_has_addr_offset = true;
> +                }
> +                break;
> +            case OPC_RISC_C_FUNC_FSD_SQ:
> +                if (riscv_cpu_xlen(env) != 128) { /* C.FSD (RV32/64) */
> +                    xinsn = OPC_RISC_FSD;
> +                    xinsn = SET_RS2(xinsn, GET_C_RS2S(insn));
> +                    xinsn = SET_RS1(xinsn, GET_C_RS1S(insn));
> +                    xinsn = SET_S_IMM(xinsn, GET_C_SD_IMM(insn));
> +                    xinsn_has_addr_offset = true;
> +                }
> +                break;
> +            case OPC_RISC_C_FUNC_SW: /* C.SW */
> +                xinsn = OPC_RISC_SW;
> +                xinsn = SET_RS2(xinsn, GET_C_RS2S(insn));
> +                xinsn = SET_RS1(xinsn, GET_C_RS1S(insn));
> +                xinsn = SET_S_IMM(xinsn, GET_C_SW_IMM(insn));
> +                xinsn_has_addr_offset = true;
> +                break;
> +            case OPC_RISC_C_FUNC_FSW_SD:
> +                if (riscv_cpu_xlen(env) == 32) { /* C.FSW (RV32) */
> +                    xinsn = OPC_RISC_FSW;
> +                    xinsn = SET_RS2(xinsn, GET_C_RS2S(insn));
> +                    xinsn = SET_RS1(xinsn, GET_C_RS1S(insn));
> +                    xinsn = SET_S_IMM(xinsn, GET_C_SW_IMM(insn));
> +                    xinsn_has_addr_offset = true;
> +                } else { /* C.SD (RV64/RV128) */
> +                    xinsn = OPC_RISC_SD;
> +                    xinsn = SET_RS2(xinsn, GET_C_RS2S(insn));
> +                    xinsn = SET_RS1(xinsn, GET_C_RS1S(insn));
> +                    xinsn = SET_S_IMM(xinsn, GET_C_SD_IMM(insn));
> +                    xinsn_has_addr_offset = true;
> +                }
> +                break;
> +            default:
> +
Re: [PATCH v3 3/4] target/riscv: Update [m|h]tinst CSR in riscv_cpu_do_interrupt()
On Mon, Jun 6, 2022 at 11:48 AM Alistair Francis wrote:
>
> On Thu, May 26, 2022 at 8:12 PM Anup Patel wrote:
> >
> > We should write transformed instruction encoding of the trapped
> > instruction in [m|h]tinst CSR at time of taking trap as defined
> > by the RISC-V privileged specification v1.12.
> >
> > Signed-off-by: Anup Patel
>
> Reviewed-by: Alistair Francis

Whoops, I thought there was another patch.

This doesn't seem to implement the guest-page fault pseudoinstructions
which can be generated while doing VS-stage translation.

Alistair

>
> Alistair
>
> > ---
> >  target/riscv/cpu_helper.c | 210 +-
> >  target/riscv/instmap.h    | 43
> >  2 files changed, 249 insertions(+), 4 deletions(-)
> >
> > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > index d99fac9d2d..2a2b6776fb 100644
> > --- a/target/riscv/cpu_helper.c
> > +++ b/target/riscv/cpu_helper.c
> > @@ -22,6 +22,7 @@
> >  #include "qemu/main-loop.h"
> >  #include "cpu.h"
> >  #include "exec/exec-all.h"
> > +#include "instmap.h"
> >  #include "tcg/tcg-op.h"
> >  #include "trace.h"
> >  #include "semihosting/common-semi.h"
> > @@ -1316,6 +1317,200 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> >
> >      return true;
> >  }
> > +
> > +static target_ulong riscv_transformed_insn(CPURISCVState *env,
> > +                                           target_ulong insn)
> > +{
> > +    bool xinsn_has_addr_offset = false;
> > +    target_ulong xinsn = 0;
> > +
> > +    /*
> > +     * Only Quadrant 0 and Quadrant 2 of RVC instruction space need to
> > +     * be uncompressed. The Quadrant 1 of RVC instruction space need
> > +     * not be transformed because these instructions won't generate
> > +     * any load/store trap.
> > + */ > > + > > +if ((insn & 0x3) != 0x3) { > > +/* Transform 16bit instruction into 32bit instruction */ > > +switch (GET_C_OP(insn)) { > > +case OPC_RISC_C_OP_QUAD0: /* Quadrant 0 */ > > +switch (GET_C_FUNC(insn)) { > > +case OPC_RISC_C_FUNC_FLD_LQ: > > +if (riscv_cpu_xlen(env) != 128) { /* C.FLD (RV32/64) */ > > +xinsn = OPC_RISC_FLD; > > +xinsn = SET_RD(xinsn, GET_C_RS2S(insn)); > > +xinsn = SET_RS1(xinsn, GET_C_RS1S(insn)); > > +xinsn = SET_I_IMM(xinsn, GET_C_LD_IMM(insn)); > > +xinsn_has_addr_offset = true; > > +} > > +break; > > +case OPC_RISC_C_FUNC_LW: /* C.LW */ > > +xinsn = OPC_RISC_LW; > > +xinsn = SET_RD(xinsn, GET_C_RS2S(insn)); > > +xinsn = SET_RS1(xinsn, GET_C_RS1S(insn)); > > +xinsn = SET_I_IMM(xinsn, GET_C_LW_IMM(insn)); > > +xinsn_has_addr_offset = true; > > +break; > > +case OPC_RISC_C_FUNC_FLW_LD: > > +if (riscv_cpu_xlen(env) == 32) { /* C.FLW (RV32) */ > > +xinsn = OPC_RISC_FLW; > > +xinsn = SET_RD(xinsn, GET_C_RS2S(insn)); > > +xinsn = SET_RS1(xinsn, GET_C_RS1S(insn)); > > +xinsn = SET_I_IMM(xinsn, GET_C_LW_IMM(insn)); > > +xinsn_has_addr_offset = true; > > +} else { /* C.LD (RV64/RV128) */ > > +xinsn = OPC_RISC_LD; > > +xinsn = SET_RD(xinsn, GET_C_RS2S(insn)); > > +xinsn = SET_RS1(xinsn, GET_C_RS1S(insn)); > > +xinsn = SET_I_IMM(xinsn, GET_C_LD_IMM(insn)); > > +xinsn_has_addr_offset = true; > > +} > > +break; > > +case OPC_RISC_C_FUNC_FSD_SQ: > > +if (riscv_cpu_xlen(env) != 128) { /* C.FSD (RV32/64) */ > > +xinsn = OPC_RISC_FSD; > > +xinsn = SET_RS2(xinsn, GET_C_RS2S(insn)); > > +xinsn = SET_RS1(xinsn, GET_C_RS1S(insn)); > > +xinsn = SET_S_IMM(xinsn, GET_C_SD_IMM(insn)); > > +xinsn_has_addr_offset = true; > > +} > > +break; > > +case OPC_RISC_C_FUNC_SW: /* C.SW */ > > +xinsn = OPC_RISC_SW; > > +xinsn = SET_RS2(xinsn, GET_C_RS2S(insn)); > > +xinsn = SET_RS1(xinsn, GET_C_RS1S(insn)); > > +xinsn = SET_S_IMM(xinsn, GET_C_SW_IMM(insn)); > > +xinsn_has_addr_offset = true; > > +break; > > +case OPC_RISC_C_FUNC_FSW_SD: > > +if 
(riscv_cpu_xlen(env) == 32) { /* C.FSW (RV32) */ > > +xinsn = OPC_RISC_FSW; > > +xinsn = SET_RS2(xinsn, GET_C_RS2S(insn)); > > +xinsn = SET_RS1(xinsn, GET_C_RS1S(insn)); > > +xinsn = SET_S_IMM(xinsn, GET_C_SW_IMM(insn)); > > +xinsn_
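For anyone who wants to try the C.LW case from the quoted patch outside QEMU, here is a minimal standalone sketch. It is not the patch's code: `sketch_c_lw_uimm` and `sketch_expand_c_lw` are hypothetical names, the bit positions are my reading of the RVC and base ISA encodings, and it only covers the 16-bit to 32-bit expansion step (none of the later tinst-specific adjustments).

```c
#include <stdint.h>

/* Hypothetical helper: rebuild the zero-extended, word-scaled C.LW offset. */
uint32_t sketch_c_lw_uimm(uint16_t insn)
{
    uint32_t imm = 0;

    imm |= ((insn >> 10) & 0x7u) << 3;  /* insn[12:10] -> uimm[5:3] */
    imm |= ((insn >> 6) & 0x1u) << 2;   /* insn[6]     -> uimm[2]   */
    imm |= ((insn >> 5) & 0x1u) << 6;   /* insn[5]     -> uimm[6]   */
    return imm;
}

/* Hypothetical helper: expand a C.LW halfword into the equivalent LW word. */
uint32_t sketch_expand_c_lw(uint16_t insn)
{
    /* Compressed register fields are 3 bits wide and name x8..x15. */
    uint32_t rd  = ((insn >> 2) & 0x7u) + 8;
    uint32_t rs1 = ((insn >> 7) & 0x7u) + 8;
    uint32_t imm = sketch_c_lw_uimm(insn);

    /* I-type layout: imm[11:0] | rs1 | funct3 = 010 | rd | opcode = 0000011 */
    return (imm << 20) | (rs1 << 15) | (0x2u << 12) | (rd << 7) | 0x03u;
}
```

Feeding it 0x41c8 (c.lw a0, 4(a1)) should give 0x0045a503 (lw a0, 4(a1)), which is a handy cross-check against a disassembler.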
Re: [PATCH v3 4/4] target/riscv: Force disable extensions if priv spec version does not match
On Thu, May 26, 2022 at 8:09 PM Anup Patel wrote: > > We should disable extensions in riscv_cpu_realize() if minimum required > priv spec version is not satisfied. This also ensures that machines with > priv spec v1.11 (or lower) cannot enable H, V, and various multi-letter > extensions. > > Fixes: a775398be2e ("target/riscv: Add isa extenstion strings to the > device tree") > Signed-off-by: Anup Patel > --- > target/riscv/cpu.c | 56 +- > 1 file changed, 51 insertions(+), 5 deletions(-) > > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > index b086eb25da..e6e878ceb3 100644 > --- a/target/riscv/cpu.c > +++ b/target/riscv/cpu.c > @@ -43,9 +43,13 @@ static const char riscv_single_letter_exts[] = > "IEMAFDQCPVH"; > > struct isa_ext_data { > const char *name; > -bool enabled; > +int min_version; > +bool *enabled; > }; > > +#define ISA_EDATA_ENTRY(name, prop) {#name, PRIV_VERSION_1_10_0, > &cpu->cfg.prop} > +#define ISA_EDATA_ENTRY2(name, min_ver, prop) {#name, min_ver, > &cpu->cfg.prop} > + > const char * const riscv_int_regnames[] = { >"x0/zero", "x1/ra", "x2/sp", "x3/gp", "x4/tp", "x5/t0", "x6/t1", >"x7/t2", "x8/s0", "x9/s1", "x10/a0", "x11/a1", "x12/a2", "x13/a3", > @@ -513,8 +517,42 @@ static void riscv_cpu_realize(DeviceState *dev, Error > **errp) > CPURISCVState *env = &cpu->env; > RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(dev); > CPUClass *cc = CPU_CLASS(mcc); > -int priv_version = -1; > +int i, priv_version = -1; > Error *local_err = NULL; > +struct isa_ext_data isa_edata_arr[] = { > +ISA_EDATA_ENTRY2(h, PRIV_VERSION_1_12_0, ext_h), > +ISA_EDATA_ENTRY2(v, PRIV_VERSION_1_12_0, ext_v), > +ISA_EDATA_ENTRY2(zicsr, PRIV_VERSION_1_10_0, ext_icsr), > +ISA_EDATA_ENTRY2(zifencei, PRIV_VERSION_1_10_0, ext_ifencei), > +ISA_EDATA_ENTRY2(zfh, PRIV_VERSION_1_12_0, ext_zfh), > +ISA_EDATA_ENTRY2(zfhmin, PRIV_VERSION_1_12_0, ext_zfhmin), > +ISA_EDATA_ENTRY2(zfinx, PRIV_VERSION_1_12_0, ext_zfinx), > +ISA_EDATA_ENTRY2(zdinx, PRIV_VERSION_1_12_0, ext_zdinx), > 
+ISA_EDATA_ENTRY2(zba, PRIV_VERSION_1_12_0, ext_zba), > +ISA_EDATA_ENTRY2(zbb, PRIV_VERSION_1_12_0, ext_zbb), > +ISA_EDATA_ENTRY2(zbc, PRIV_VERSION_1_12_0, ext_zbc), > +ISA_EDATA_ENTRY2(zbkb, PRIV_VERSION_1_12_0, ext_zbkb), > +ISA_EDATA_ENTRY2(zbkc, PRIV_VERSION_1_12_0, ext_zbkc), > +ISA_EDATA_ENTRY2(zbkx, PRIV_VERSION_1_12_0, ext_zbkx), > +ISA_EDATA_ENTRY2(zbs, PRIV_VERSION_1_12_0, ext_zbs), > +ISA_EDATA_ENTRY2(zk, PRIV_VERSION_1_12_0, ext_zk), > +ISA_EDATA_ENTRY2(zkn, PRIV_VERSION_1_12_0, ext_zkn), > +ISA_EDATA_ENTRY2(zknd, PRIV_VERSION_1_12_0, ext_zknd), > +ISA_EDATA_ENTRY2(zkne, PRIV_VERSION_1_12_0, ext_zkne), > +ISA_EDATA_ENTRY2(zknh, PRIV_VERSION_1_12_0, ext_zknh), > +ISA_EDATA_ENTRY2(zkr, PRIV_VERSION_1_12_0, ext_zkr), > +ISA_EDATA_ENTRY2(zks, PRIV_VERSION_1_12_0, ext_zks), > +ISA_EDATA_ENTRY2(zksed, PRIV_VERSION_1_12_0, ext_zksed), > +ISA_EDATA_ENTRY2(zksh, PRIV_VERSION_1_12_0, ext_zksh), > +ISA_EDATA_ENTRY2(zkt, PRIV_VERSION_1_12_0, ext_zkt), > +ISA_EDATA_ENTRY2(zve32f, PRIV_VERSION_1_12_0, ext_zve32f), > +ISA_EDATA_ENTRY2(zve64f, PRIV_VERSION_1_12_0, ext_zve64f), > +ISA_EDATA_ENTRY2(zhinx, PRIV_VERSION_1_12_0, ext_zhinx), > +ISA_EDATA_ENTRY2(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin), > +ISA_EDATA_ENTRY2(svinval, PRIV_VERSION_1_12_0, ext_svinval), > +ISA_EDATA_ENTRY2(svnapot, PRIV_VERSION_1_12_0, ext_svnapot), > +ISA_EDATA_ENTRY2(svpbmt, PRIV_VERSION_1_12_0, ext_svpbmt), > +}; > > cpu_exec_realizefn(cs, &local_err); > if (local_err != NULL) { > @@ -541,6 +579,16 @@ static void riscv_cpu_realize(DeviceState *dev, Error > **errp) > set_priv_version(env, priv_version); > } > > +/* Force disable extensions if priv spec version does not match */ > +for (i = 0; i < ARRAY_SIZE(isa_edata_arr); i++) { > +if (*isa_edata_arr[i].enabled && > +(env->priv_ver < isa_edata_arr[i].min_version)) { > +*isa_edata_arr[i].enabled = false; > +warn_report("privilege spec version does not match for %s > extension", > +isa_edata_arr[i].name); This should indicate to the user 
that we are disabling the extension because of this Alistair > +} > +} > + > if (cpu->cfg.mmu) { > riscv_set_feature(env, RISCV_FEATURE_MMU); > } > @@ -1005,8 +1053,6 @@ static void riscv_cpu_class_init(ObjectClass *c, void > *data) > device_class_set_props(dc, riscv_cpu_properties); > } > > -#define ISA_EDATA_ENTRY(name, prop) {#name, cpu->cfg.prop} > - > static void riscv_isa_string_ext(RISCVCPU *cpu, char **isa_str, int > max_str_len) > { > char *old = *isa_str; > @@ -1064,7 +1
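The gating loop discussed above can be sketched in isolation as follows. This is an illustration under assumptions, not the QEMU code: `sketch_ext`, `sketch_disable_unsupported` and the `SKETCH_PRIV_*` values are hypothetical stand-ins, and the warning text is just one possible wording that tells the user the extension is being turned off, as requested in the review comment.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical ordering of privileged spec versions (older < newer). */
enum sketch_priv_ver {
    SKETCH_PRIV_1_10,
    SKETCH_PRIV_1_11,
    SKETCH_PRIV_1_12
};

/* Mirrors the shape of the patch's table entry: name, minimum version, flag. */
struct sketch_ext {
    const char *name;
    int min_version;
    bool *enabled;
};

/*
 * Clear every enabled extension whose minimum privileged spec version is
 * newer than priv_ver. Returns how many extensions were disabled.
 */
size_t sketch_disable_unsupported(struct sketch_ext *tbl, size_t n,
                                  int priv_ver)
{
    size_t disabled = 0;

    for (size_t i = 0; i < n; i++) {
        if (*tbl[i].enabled && priv_ver < tbl[i].min_version) {
            *tbl[i].enabled = false;
            fprintf(stderr,
                    "disabling %s extension: it requires a newer "
                    "privileged spec version\n", tbl[i].name);
            disabled++;
        }
    }
    return disabled;
}
```

Storing a pointer to the flag rather than a copy is what lets the loop actually switch the extension off in the CPU configuration.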
Re: [PATCH] target/riscv: Wake on VS-level external interrupts
On Wed, Jun 1, 2022 at 7:07 AM Andrew Bresticker wrote:
>
> Whether or not VSEIP is pending isn't reflected in env->mip and must
> instead be determined from hstatus.vgein and hgeip. As a result a
> CPU in WFI won't wake on a VSEIP, which violates the WFI behavior as
> specified in the privileged ISA. Just use riscv_cpu_all_pending()
> instead, which already accounts for VSEIP.
>
> Signed-off-by: Andrew Bresticker

Reviewed-by: Alistair Francis

Alistair

> ---
>  target/riscv/cpu.c        | 2 +-
>  target/riscv/cpu.h        | 1 +
>  target/riscv/cpu_helper.c | 2 +-
>  3 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index a91253d4bd..c6cc08c355 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -391,7 +391,7 @@ static bool riscv_cpu_has_work(CPUState *cs)
>       * Definition of the WFI instruction requires it to ignore the privilege
>       * mode and delegation registers, but respect individual enables
>       */
> -    return (env->mip & env->mie) != 0;
> +    return riscv_cpu_all_pending(env) != 0;
>  #else
>      return true;
>  #endif
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index f08c3e8813..758ab6c90b 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -488,6 +488,7 @@ int riscv_cpu_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
>  int riscv_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
>  int riscv_cpu_hviprio_index2irq(int index, int *out_irq, int *out_rdzero);
>  uint8_t riscv_cpu_default_priority(int irq);
> +uint64_t riscv_cpu_all_pending(CPURISCVState *env);
>  int riscv_cpu_mirq_pending(CPURISCVState *env);
>  int riscv_cpu_sirq_pending(CPURISCVState *env);
>  int riscv_cpu_vsirq_pending(CPURISCVState *env);
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index d99fac9d2d..16c6045459 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -340,7 +340,7 @@ static int riscv_cpu_pending_to_irq(CPURISCVState *env,
>      return best_irq;
>  }
>
> -static uint64_t riscv_cpu_all_pending(CPURISCVState *env)
> +uint64_t riscv_cpu_all_pending(CPURISCVState *env)
>  {
>      uint32_t gein = get_field(env->hstatus, HSTATUS_VGEIN);
>      uint64_t vsgein = (env->hgeip & (1ULL << gein)) ? MIP_VSEIP : 0;
> --
> 2.25.1
>
>
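To make the WFI problem concrete, here is a small standalone sketch comparing the old `mip & mie` check with one that folds in VSEIP. It is an illustration only: `sketch_env` and the `sketch_*` functions are hypothetical, the bit positions (VSEIP at bit 10, hstatus.VGEIN at bits 17:12) are my reading of the privileged spec, and the final `return` is an assumption, since the quoted diff only shows the first two lines of riscv_cpu_all_pending().

```c
#include <stdint.h>

#define SKETCH_MIP_VSEIP            (1ULL << 10) /* VS-level external irq */
#define SKETCH_HSTATUS_VGEIN_SHIFT  12
#define SKETCH_HSTATUS_VGEIN_MASK   0x3FULL

/* Hypothetical, cut-down stand-in for the relevant CPU state. */
struct sketch_env {
    uint64_t mip;
    uint64_t mie;
    uint64_t hstatus;
    uint64_t hgeip;
};

/* The check before the patch: only looks at mip. */
uint64_t sketch_old_pending(const struct sketch_env *env)
{
    return env->mip & env->mie;
}

/* Derive VSEIP from hgeip[hstatus.VGEIN] and merge it in before masking. */
uint64_t sketch_all_pending(const struct sketch_env *env)
{
    uint32_t gein = (uint32_t)((env->hstatus >> SKETCH_HSTATUS_VGEIN_SHIFT) &
                               SKETCH_HSTATUS_VGEIN_MASK);
    uint64_t vsgein = (env->hgeip & (1ULL << gein)) ? SKETCH_MIP_VSEIP : 0;

    return (env->mip | vsgein) & env->mie;
}
```

With mip clear, VSEIP enabled in mie, VGEIN = 3 and bit 3 set in hgeip, the old check reports nothing pending while the second one reports VSEIP, which is the case where a hart in WFI would fail to wake.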
Re: [PATCH] microvm: turn off io reservations for pcie root ports
On Fri, Jun 03, 2022 at 10:59:20AM +0200, Gerd Hoffmann wrote:
> The pcie host bridge has no io window on microvm,
> so io reservations will not work.
>
> Signed-off-by: Gerd Hoffmann
> ---
>  hw/i386/microvm.c | 6 ++
>  1 file changed, 6 insertions(+)

Reviewed-by: Sergio Lopez
Re: [PATCH] target/riscv: Wake on VS-level external interrupts
On Wed, Jun 1, 2022 at 7:07 AM Andrew Bresticker wrote:
>
> Whether or not VSEIP is pending isn't reflected in env->mip and must
> instead be determined from hstatus.vgein and hgeip. As a result a
> CPU in WFI won't wake on a VSEIP, which violates the WFI behavior as
> specified in the privileged ISA. Just use riscv_cpu_all_pending()
> instead, which already accounts for VSEIP.
>
> Signed-off-by: Andrew Bresticker

Thanks!

Applied to riscv-to-apply.next

Alistair

> ---
>  target/riscv/cpu.c        | 2 +-
>  target/riscv/cpu.h        | 1 +
>  target/riscv/cpu_helper.c | 2 +-
>  3 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index a91253d4bd..c6cc08c355 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -391,7 +391,7 @@ static bool riscv_cpu_has_work(CPUState *cs)
>       * Definition of the WFI instruction requires it to ignore the privilege
>       * mode and delegation registers, but respect individual enables
>       */
> -    return (env->mip & env->mie) != 0;
> +    return riscv_cpu_all_pending(env) != 0;
>  #else
>      return true;
>  #endif
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index f08c3e8813..758ab6c90b 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -488,6 +488,7 @@ int riscv_cpu_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
>  int riscv_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
>  int riscv_cpu_hviprio_index2irq(int index, int *out_irq, int *out_rdzero);
>  uint8_t riscv_cpu_default_priority(int irq);
> +uint64_t riscv_cpu_all_pending(CPURISCVState *env);
>  int riscv_cpu_mirq_pending(CPURISCVState *env);
>  int riscv_cpu_sirq_pending(CPURISCVState *env);
>  int riscv_cpu_vsirq_pending(CPURISCVState *env);
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index d99fac9d2d..16c6045459 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -340,7 +340,7 @@ static int riscv_cpu_pending_to_irq(CPURISCVState *env,
>      return best_irq;
>  }
>
> -static uint64_t riscv_cpu_all_pending(CPURISCVState *env)
> +uint64_t riscv_cpu_all_pending(CPURISCVState *env)
>  {
>      uint32_t gein = get_field(env->hstatus, HSTATUS_VGEIN);
>      uint64_t vsgein = (env->hgeip & (1ULL << gein)) ? MIP_VSEIP : 0;
> --
> 2.25.1
>
>
Re: [PATCH v3 1/3] target/riscv: Reorganize riscv_cpu_properties
On Fri, Jun 3, 2022 at 9:36 PM Tsukasa OI wrote: > > Because many developers introduced new properties in various ways, the > entire riscv_cpu_properties block is getting too complex. > > This commit reorganizes riscv_cpu_properties for clarity on future. > > Signed-off-by: Tsukasa OI Reviewed-by: Alistair Francis Alistair > --- > target/riscv/cpu.c | 64 +++--- > 1 file changed, 37 insertions(+), 27 deletions(-) > > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > index a91253d4bd..3f21563f2d 100644 > --- a/target/riscv/cpu.c > +++ b/target/riscv/cpu.c > @@ -840,7 +840,7 @@ static void riscv_cpu_init(Object *obj) > } > > static Property riscv_cpu_properties[] = { > -/* Defaults for standard extensions */ > +/* Base ISA and single-letter standard extensions */ > DEFINE_PROP_BOOL("i", RISCVCPU, cfg.ext_i, true), > DEFINE_PROP_BOOL("e", RISCVCPU, cfg.ext_e, false), > DEFINE_PROP_BOOL("g", RISCVCPU, cfg.ext_g, false), > @@ -853,29 +853,17 @@ static Property riscv_cpu_properties[] = { > DEFINE_PROP_BOOL("u", RISCVCPU, cfg.ext_u, true), > DEFINE_PROP_BOOL("v", RISCVCPU, cfg.ext_v, false), > DEFINE_PROP_BOOL("h", RISCVCPU, cfg.ext_h, true), > -DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true), > -DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true), > + > +/* Standard unprivileged extensions */ > DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true), > +DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true), > + > DEFINE_PROP_BOOL("Zfh", RISCVCPU, cfg.ext_zfh, false), > DEFINE_PROP_BOOL("Zfhmin", RISCVCPU, cfg.ext_zfhmin, false), > -DEFINE_PROP_BOOL("Zve32f", RISCVCPU, cfg.ext_zve32f, false), > -DEFINE_PROP_BOOL("Zve64f", RISCVCPU, cfg.ext_zve64f, false), > -DEFINE_PROP_BOOL("mmu", RISCVCPU, cfg.mmu, true), > -DEFINE_PROP_BOOL("pmp", RISCVCPU, cfg.pmp, true), > -DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true), > - > -DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec), > -DEFINE_PROP_STRING("vext_spec", RISCVCPU, 
cfg.vext_spec), > -DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128), > -DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64), > - > -DEFINE_PROP_UINT32("mvendorid", RISCVCPU, cfg.mvendorid, 0), > -DEFINE_PROP_UINT64("marchid", RISCVCPU, cfg.marchid, RISCV_CPU_MARCHID), > -DEFINE_PROP_UINT64("mimpid", RISCVCPU, cfg.mimpid, RISCV_CPU_MIMPID), > - > -DEFINE_PROP_BOOL("svinval", RISCVCPU, cfg.ext_svinval, false), > -DEFINE_PROP_BOOL("svnapot", RISCVCPU, cfg.ext_svnapot, false), > -DEFINE_PROP_BOOL("svpbmt", RISCVCPU, cfg.ext_svpbmt, false), > +DEFINE_PROP_BOOL("zfinx", RISCVCPU, cfg.ext_zfinx, false), > +DEFINE_PROP_BOOL("zdinx", RISCVCPU, cfg.ext_zdinx, false), > +DEFINE_PROP_BOOL("zhinx", RISCVCPU, cfg.ext_zhinx, false), > +DEFINE_PROP_BOOL("zhinxmin", RISCVCPU, cfg.ext_zhinxmin, false), > > DEFINE_PROP_BOOL("zba", RISCVCPU, cfg.ext_zba, true), > DEFINE_PROP_BOOL("zbb", RISCVCPU, cfg.ext_zbb, true), > @@ -884,6 +872,7 @@ static Property riscv_cpu_properties[] = { > DEFINE_PROP_BOOL("zbkc", RISCVCPU, cfg.ext_zbkc, false), > DEFINE_PROP_BOOL("zbkx", RISCVCPU, cfg.ext_zbkx, false), > DEFINE_PROP_BOOL("zbs", RISCVCPU, cfg.ext_zbs, true), > + > DEFINE_PROP_BOOL("zk", RISCVCPU, cfg.ext_zk, false), > DEFINE_PROP_BOOL("zkn", RISCVCPU, cfg.ext_zkn, false), > DEFINE_PROP_BOOL("zknd", RISCVCPU, cfg.ext_zknd, false), > @@ -895,10 +884,31 @@ static Property riscv_cpu_properties[] = { > DEFINE_PROP_BOOL("zksh", RISCVCPU, cfg.ext_zksh, false), > DEFINE_PROP_BOOL("zkt", RISCVCPU, cfg.ext_zkt, false), > > -DEFINE_PROP_BOOL("zdinx", RISCVCPU, cfg.ext_zdinx, false), > -DEFINE_PROP_BOOL("zfinx", RISCVCPU, cfg.ext_zfinx, false), > -DEFINE_PROP_BOOL("zhinx", RISCVCPU, cfg.ext_zhinx, false), > -DEFINE_PROP_BOOL("zhinxmin", RISCVCPU, cfg.ext_zhinxmin, false), > +DEFINE_PROP_BOOL("Zve32f", RISCVCPU, cfg.ext_zve32f, false), > +DEFINE_PROP_BOOL("Zve64f", RISCVCPU, cfg.ext_zve64f, false), > + > +/* Standard supervisor-level extensions */ > +DEFINE_PROP_BOOL("svinval", RISCVCPU, 
cfg.ext_svinval, false), > +DEFINE_PROP_BOOL("svnapot", RISCVCPU, cfg.ext_svnapot, false), > +DEFINE_PROP_BOOL("svpbmt", RISCVCPU, cfg.ext_svpbmt, false), > + > +/* Base features */ > +DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true), > +DEFINE_PROP_BOOL("mmu", RISCVCPU, cfg.mmu, true), > +DEFINE_PROP_BOOL("pmp", RISCVCPU, cfg.pmp, true), > +DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true), > + > +/* ISA specification / extension versions */ > +DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec), > +DEFINE_PROP_STRING("vext_spec", RISCVCPU, cfg.vext_spec), > + > +/* CPU parameters */ > +DEFINE_PROP_UINT32("mvendorid", RISCVCPU, cfg.mvendorid, 0), > +DEFINE_PROP_UINT64
Re: [PATCH v3 2/3] target/riscv: Make CPU property names lowercase
On Fri, Jun 3, 2022 at 9:37 PM Tsukasa OI wrote: > > Many CPU properties for RISC-V are in lowercase except those with > "capitalized" (or CamelCase) names: > > - Counters > - Zifencei > - Zicsr > - Zfh > - Zfhmin > - Zve32f > - Zve64f > > This commit makes lowercase names primary but keeps capitalized names > as aliases (for backward compatibility, but with deprecated status). > > Signed-off-by: Tsukasa OI Reviewed-by: Alistair Francis Alistair > --- > target/riscv/cpu.c | 27 --- > 1 file changed, 20 insertions(+), 7 deletions(-) > > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > index 3f21563f2d..83262586e4 100644 > --- a/target/riscv/cpu.c > +++ b/target/riscv/cpu.c > @@ -840,6 +840,10 @@ static void riscv_cpu_init(Object *obj) > } > > static Property riscv_cpu_properties[] = { > +/* > + * Names for ISA extensions and features should be in lowercase. > + */ > + > /* Base ISA and single-letter standard extensions */ > DEFINE_PROP_BOOL("i", RISCVCPU, cfg.ext_i, true), > DEFINE_PROP_BOOL("e", RISCVCPU, cfg.ext_e, false), > @@ -855,11 +859,11 @@ static Property riscv_cpu_properties[] = { > DEFINE_PROP_BOOL("h", RISCVCPU, cfg.ext_h, true), > > /* Standard unprivileged extensions */ > -DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true), > -DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true), > +DEFINE_PROP_BOOL("zicsr", RISCVCPU, cfg.ext_icsr, true), > +DEFINE_PROP_BOOL("zifencei", RISCVCPU, cfg.ext_ifencei, true), > > -DEFINE_PROP_BOOL("Zfh", RISCVCPU, cfg.ext_zfh, false), > -DEFINE_PROP_BOOL("Zfhmin", RISCVCPU, cfg.ext_zfhmin, false), > +DEFINE_PROP_BOOL("zfh", RISCVCPU, cfg.ext_zfh, false), > +DEFINE_PROP_BOOL("zfhmin", RISCVCPU, cfg.ext_zfhmin, false), > DEFINE_PROP_BOOL("zfinx", RISCVCPU, cfg.ext_zfinx, false), > DEFINE_PROP_BOOL("zdinx", RISCVCPU, cfg.ext_zdinx, false), > DEFINE_PROP_BOOL("zhinx", RISCVCPU, cfg.ext_zhinx, false), > @@ -884,8 +888,8 @@ static Property riscv_cpu_properties[] = { > DEFINE_PROP_BOOL("zksh", RISCVCPU, 
cfg.ext_zksh, false), > DEFINE_PROP_BOOL("zkt", RISCVCPU, cfg.ext_zkt, false), > > -DEFINE_PROP_BOOL("Zve32f", RISCVCPU, cfg.ext_zve32f, false), > -DEFINE_PROP_BOOL("Zve64f", RISCVCPU, cfg.ext_zve64f, false), > +DEFINE_PROP_BOOL("zve32f", RISCVCPU, cfg.ext_zve32f, false), > +DEFINE_PROP_BOOL("zve64f", RISCVCPU, cfg.ext_zve64f, false), > > /* Standard supervisor-level extensions */ > DEFINE_PROP_BOOL("svinval", RISCVCPU, cfg.ext_svinval, false), > @@ -893,7 +897,7 @@ static Property riscv_cpu_properties[] = { > DEFINE_PROP_BOOL("svpbmt", RISCVCPU, cfg.ext_svpbmt, false), > > /* Base features */ > -DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true), > +DEFINE_PROP_BOOL("counters", RISCVCPU, cfg.ext_counters, true), > DEFINE_PROP_BOOL("mmu", RISCVCPU, cfg.mmu, true), > DEFINE_PROP_BOOL("pmp", RISCVCPU, cfg.pmp, true), > DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true), > @@ -922,6 +926,15 @@ static Property riscv_cpu_properties[] = { > /* Other options */ > DEFINE_PROP_BOOL("short-isa-string", RISCVCPU, cfg.short_isa_string, > false), > > +/* Capitalized aliases (deprecated and will be removed) */ > +DEFINE_PROP("Counters", RISCVCPU, cfg.ext_counters, qdev_prop_bool, > bool), > +DEFINE_PROP("Zifencei", RISCVCPU, cfg.ext_ifencei, qdev_prop_bool, bool), > +DEFINE_PROP("Zicsr", RISCVCPU, cfg.ext_icsr, qdev_prop_bool, bool), > +DEFINE_PROP("Zfh", RISCVCPU, cfg.ext_zfh, qdev_prop_bool, bool), > +DEFINE_PROP("Zfhmin", RISCVCPU, cfg.ext_zfhmin, qdev_prop_bool, bool), > +DEFINE_PROP("Zve32f", RISCVCPU, cfg.ext_zve32f, qdev_prop_bool, bool), > +DEFINE_PROP("Zve64f", RISCVCPU, cfg.ext_zve64f, qdev_prop_bool, bool), > + > DEFINE_PROP_END_OF_LIST(), > }; > > -- > 2.34.1 >
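The alias idea in this patch (two property names backed by one flag) can be modelled without any QEMU machinery. The following is only a sketch under assumptions: `sketch_prop` and `sketch_find_prop` are hypothetical, and the explicit `deprecated` field is my addition for illustration; the quoted patch marks the capitalized names as deprecated in a comment only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical property record: several names may share one flag. */
struct sketch_prop {
    const char *name;
    bool *flag;
    bool deprecated;   /* assumption: not a field in the quoted patch */
};

/* Case-sensitive lookup; returns NULL when the name is unknown. */
const struct sketch_prop *sketch_find_prop(const struct sketch_prop *tbl,
                                           size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tbl[i].name, name) == 0) {
            return &tbl[i];
        }
    }
    return NULL;
}
```

Because both entries point at the same flag, setting either "zfh" or "Zfh" has the same effect, while the lookup result still tells the caller whether a deprecated spelling was used.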
Re: [PATCH] target/riscv/debug.c: keep experimental rv128 support working
On Thu, Jun 2, 2022 at 11:55 PM Frédéric Pétrot wrote:
>
> Add an MXL_RV128 case in two switches so that no error is triggered when
> using the -cpu x-rv128 option.
>
> Signed-off-by: Frédéric Pétrot
> ---
>  target/riscv/debug.c | 2 ++
>  1 file changed, 2 insertions(+)
>

Reviewed-by: Bin Meng
Re: [PATCH] target/riscv/debug.c: keep experimental rv128 support working
On Fri, Jun 3, 2022 at 1:55 AM Frédéric Pétrot wrote:
>
> Add an MXL_RV128 case in two switches so that no error is triggered when
> using the -cpu x-rv128 option.
>
> Signed-off-by: Frédéric Pétrot

Acked-by: Alistair Francis

Alistair

> ---
>  target/riscv/debug.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/target/riscv/debug.c b/target/riscv/debug.c
> index 2f2a51c732..fc6e13222f 100644
> --- a/target/riscv/debug.c
> +++ b/target/riscv/debug.c
> @@ -77,6 +77,7 @@ static inline target_ulong trigger_type(CPURISCVState *env,
>          tdata1 = RV32_TYPE(type);
>          break;
>      case MXL_RV64:
> +    case MXL_RV128:
>          tdata1 = RV64_TYPE(type);
>          break;
>      default:
> @@ -123,6 +124,7 @@ static target_ulong tdata1_validate(CPURISCVState *env, target_ulong val,
>          tdata1 = RV32_TYPE(t);
>          break;
>      case MXL_RV64:
> +    case MXL_RV128:
>          type = extract64(val, 60, 4);
>          dmode = extract64(val, 59, 1);
>          tdata1 = RV64_TYPE(t);
> --
> 2.36.1
>
>
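The effect of the added case label can be shown with a reduced, standalone version of the switch. This is a sketch under assumptions rather than the QEMU function: `sketch_mxl` and `sketch_trigger_type_field` are hypothetical names, and the shift amounts (28 for RV32, 60 for RV64) are taken from my reading of the debug spec's tdata1 layout and the `extract64(val, 60, 4)` call in the quoted diff.

```c
#include <stdint.h>

/* Hypothetical MXL values, following the misa.MXL encoding (1, 2, 3). */
enum sketch_mxl {
    SKETCH_MXL_RV32 = 1,
    SKETCH_MXL_RV64 = 2,
    SKETCH_MXL_RV128 = 3
};

/*
 * Place a 4-bit trigger type in the top nibble of tdata1.
 * Returns 0 on success and -1 for an MXL value the switch does not know.
 */
int sketch_trigger_type_field(enum sketch_mxl mxl, uint32_t type,
                              uint64_t *tdata1)
{
    switch (mxl) {
    case SKETCH_MXL_RV32:
        *tdata1 = (uint64_t)(type & 0xFu) << 28;
        return 0;
    case SKETCH_MXL_RV64:
    case SKETCH_MXL_RV128:   /* reuse the 64-bit layout, as in the patch */
        *tdata1 = (uint64_t)(type & 0xFu) << 60;
        return 0;
    default:
        return -1;           /* where the unpatched code hit its error path */
    }
}
```

Without the extra label an RV128 CPU would fall into the default branch, which matches the error described in the commit message.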
[PATCH qemu v19 01/16] target/riscv: rvv: Prune redundant ESZ, DSZ parameter passed
From: eopXD No functional change intended in this commit. Signed-off-by: eop Chen Reviewed-by: Frank Chang Reviewed-by: Weiwei Li Reviewed-by: Alistair Francis --- target/riscv/vector_helper.c | 1132 +- 1 file changed, 565 insertions(+), 567 deletions(-) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 576b14e5a3..85dd611cd9 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -710,7 +710,6 @@ RVVCALL(OPIVV2, vsub_vv_d, OP_SSS_D, H8, H8, H8, DO_SUB) static void do_vext_vv(void *vd, void *v0, void *vs1, void *vs2, CPURISCVState *env, uint32_t desc, - uint32_t esz, uint32_t dsz, opivv2_fn *fn) { uint32_t vm = vext_vm(desc); @@ -727,23 +726,23 @@ static void do_vext_vv(void *vd, void *v0, void *vs1, void *vs2, } /* generate the helpers for OPIVV */ -#define GEN_VEXT_VV(NAME, ESZ, DSZ) \ +#define GEN_VEXT_VV(NAME) \ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ void *vs2, CPURISCVState *env, \ uint32_t desc) \ { \ -do_vext_vv(vd, v0, vs1, vs2, env, desc, ESZ, DSZ, \ +do_vext_vv(vd, v0, vs1, vs2, env, desc, \ do_##NAME);\ } -GEN_VEXT_VV(vadd_vv_b, 1, 1) -GEN_VEXT_VV(vadd_vv_h, 2, 2) -GEN_VEXT_VV(vadd_vv_w, 4, 4) -GEN_VEXT_VV(vadd_vv_d, 8, 8) -GEN_VEXT_VV(vsub_vv_b, 1, 1) -GEN_VEXT_VV(vsub_vv_h, 2, 2) -GEN_VEXT_VV(vsub_vv_w, 4, 4) -GEN_VEXT_VV(vsub_vv_d, 8, 8) +GEN_VEXT_VV(vadd_vv_b) +GEN_VEXT_VV(vadd_vv_h) +GEN_VEXT_VV(vadd_vv_w) +GEN_VEXT_VV(vadd_vv_d) +GEN_VEXT_VV(vsub_vv_b) +GEN_VEXT_VV(vsub_vv_h) +GEN_VEXT_VV(vsub_vv_w) +GEN_VEXT_VV(vsub_vv_d) typedef void opivx2_fn(void *vd, target_long s1, void *vs2, int i); @@ -773,7 +772,6 @@ RVVCALL(OPIVX2, vrsub_vx_d, OP_SSS_D, H8, H8, DO_RSUB) static void do_vext_vx(void *vd, void *v0, target_long s1, void *vs2, CPURISCVState *env, uint32_t desc, - uint32_t esz, uint32_t dsz, opivx2_fn fn) { uint32_t vm = vext_vm(desc); @@ -790,27 +788,27 @@ static void do_vext_vx(void *vd, void *v0, target_long s1, void *vs2, } /* generate the helpers for OPIVX */ -#define 
GEN_VEXT_VX(NAME, ESZ, DSZ) \ +#define GEN_VEXT_VX(NAME) \ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,\ void *vs2, CPURISCVState *env, \ uint32_t desc) \ { \ -do_vext_vx(vd, v0, s1, vs2, env, desc, ESZ, DSZ, \ +do_vext_vx(vd, v0, s1, vs2, env, desc,\ do_##NAME);\ } -GEN_VEXT_VX(vadd_vx_b, 1, 1) -GEN_VEXT_VX(vadd_vx_h, 2, 2) -GEN_VEXT_VX(vadd_vx_w, 4, 4) -GEN_VEXT_VX(vadd_vx_d, 8, 8) -GEN_VEXT_VX(vsub_vx_b, 1, 1) -GEN_VEXT_VX(vsub_vx_h, 2, 2) -GEN_VEXT_VX(vsub_vx_w, 4, 4) -GEN_VEXT_VX(vsub_vx_d, 8, 8) -GEN_VEXT_VX(vrsub_vx_b, 1, 1) -GEN_VEXT_VX(vrsub_vx_h, 2, 2) -GEN_VEXT_VX(vrsub_vx_w, 4, 4) -GEN_VEXT_VX(vrsub_vx_d, 8, 8) +GEN_VEXT_VX(vadd_vx_b) +GEN_VEXT_VX(vadd_vx_h) +GEN_VEXT_VX(vadd_vx_w) +GEN_VEXT_VX(vadd_vx_d) +GEN_VEXT_VX(vsub_vx_b) +GEN_VEXT_VX(vsub_vx_h) +GEN_VEXT_VX(vsub_vx_w) +GEN_VEXT_VX(vsub_vx_d) +GEN_VEXT_VX(vrsub_vx_b) +GEN_VEXT_VX(vrsub_vx_h) +GEN_VEXT_VX(vrsub_vx_w) +GEN_VEXT_VX(vrsub_vx_d) void HELPER(vec_rsubs8)(void *d, void *a, uint64_t b, uint32_t desc) { @@ -889,30 +887,30 @@ RVVCALL(OPIVV2, vwadd_wv_w, WOP_WSSS_W, H8, H4, H4, DO_ADD) RVVCALL(OPIVV2, vwsub_wv_b, WOP_WSSS_B, H2, H1, H1, DO_SUB) RVVCALL(OPIVV2, vwsub_wv_h, WOP_WSSS_H, H4, H2, H2, DO_SUB) RVVCALL(OPIVV2, vwsub_wv_w, WOP_WSSS_W, H8, H4, H4, DO_SUB) -GEN_VEXT_VV(vwaddu_vv_b, 1, 2) -GEN_VEXT_VV(vwaddu_vv_h, 2, 4) -GEN_VEXT_VV(vwaddu_vv_w, 4, 8) -GEN_VEXT_VV(vwsubu_vv_b, 1, 2) -GEN_VEXT_VV(vwsubu_vv_h, 2, 4) -GEN_VEXT_VV(vwsubu_vv_w, 4, 8) -GEN_VEXT_VV(vwadd_vv_b, 1, 2) -GEN_VEXT_VV(vwadd_vv_h, 2, 4) -GEN_VEXT_VV(vwadd_vv_w, 4, 8) -GEN_VEXT_VV(vwsub_vv_b, 1, 2) -GEN_VEXT_VV(vwsub_vv_h, 2, 4) -GEN_VEXT_VV(vwsub_vv_w, 4, 8) -GEN_VEXT_VV(vwaddu_wv_b, 1, 2) -GEN_VEXT_VV(vwaddu_wv_h, 2, 4) -GEN_VEXT_VV(vwaddu_wv_w, 4, 8) -GEN_VEXT_VV(vwsubu_wv_b, 1, 2) -GEN_VEXT_VV(vwsubu_wv_h, 2, 4) -GEN_VEXT_VV(vwsubu_wv_w, 4, 8) -GEN_VEXT_VV(vwadd_wv_b, 1, 2) -GEN_VEXT_VV(vwadd_wv_h, 2, 4) -GEN_VEXT_VV(vwadd_wv_w, 4, 8) -GEN_VEXT_VV(vwsub_wv_b, 1, 2) -GEN_VEXT_VV(vwsub_wv_h, 2, 4) 
-GEN_VEXT_VV(vwsub_wv_w, 4, 8) +GEN_VEXT_VV(vwaddu_vv_b) +GEN_VEXT_VV(vwaddu_vv_h) +GEN_VEXT_VV(vwaddu_vv_w) +GEN_VEXT_VV(vwsubu_vv_b) +GEN_VEXT_VV(vwsubu_vv_h) +GEN_VEXT_VV(vwsubu_vv_w) +GEN_VEXT_VV(vwadd_vv_b) +GEN_VEXT_VV(vwadd_vv_h) +GEN_VEXT_VV(vwadd
[PATCH qemu v19 02/16] target/riscv: rvv: Prune redundant access_type parameter passed
From: eopXD No functional change intended in this commit. Signed-off-by: eop Chen Reviewed-by: Alistair Francis --- target/riscv/vector_helper.c | 35 --- 1 file changed, 16 insertions(+), 19 deletions(-) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 85dd611cd9..60840325c4 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -231,7 +231,7 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base, target_ulong stride, CPURISCVState *env, uint32_t desc, uint32_t vm, vext_ldst_elem_fn *ldst_elem, - uint32_t esz, uintptr_t ra, MMUAccessType access_type) + uint32_t esz, uintptr_t ra) { uint32_t i, k; uint32_t nf = vext_nf(desc); @@ -259,7 +259,7 @@ void HELPER(NAME)(void *vd, void * v0, target_ulong base, \ { \ uint32_t vm = vext_vm(desc);\ vext_ldst_stride(vd, v0, base, stride, env, desc, vm, LOAD_FN, \ - ctzl(sizeof(ETYPE)), GETPC(), MMU_DATA_LOAD); \ + ctzl(sizeof(ETYPE)), GETPC()); \ } GEN_VEXT_LD_STRIDE(vlse8_v, int8_t, lde_b) @@ -274,7 +274,7 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong base, \ { \ uint32_t vm = vext_vm(desc);\ vext_ldst_stride(vd, v0, base, stride, env, desc, vm, STORE_FN, \ - ctzl(sizeof(ETYPE)), GETPC(), MMU_DATA_STORE); \ + ctzl(sizeof(ETYPE)), GETPC()); \ } GEN_VEXT_ST_STRIDE(vsse8_v, int8_t, ste_b) @@ -290,7 +290,7 @@ GEN_VEXT_ST_STRIDE(vsse64_v, int64_t, ste_d) static void vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc, vext_ldst_elem_fn *ldst_elem, uint32_t esz, uint32_t evl, - uintptr_t ra, MMUAccessType access_type) + uintptr_t ra) { uint32_t i, k; uint32_t nf = vext_nf(desc); @@ -319,14 +319,14 @@ void HELPER(NAME##_mask)(void *vd, void *v0, target_ulong base, \ { \ uint32_t stride = vext_nf(desc) << ctzl(sizeof(ETYPE)); \ vext_ldst_stride(vd, v0, base, stride, env, desc, false, LOAD_FN, \ - ctzl(sizeof(ETYPE)), GETPC(), MMU_DATA_LOAD); \ + ctzl(sizeof(ETYPE)), GETPC()); \ } \ \ void HELPER(NAME)(void *vd, void *v0, target_ulong base,\ 
CPURISCVState *env, uint32_t desc)\ { \ vext_ldst_us(vd, base, env, desc, LOAD_FN, \ - ctzl(sizeof(ETYPE)), env->vl, GETPC(), MMU_DATA_LOAD); \ + ctzl(sizeof(ETYPE)), env->vl, GETPC());\ } GEN_VEXT_LD_US(vle8_v, int8_t, lde_b) @@ -340,14 +340,14 @@ void HELPER(NAME##_mask)(void *vd, void *v0, target_ulong base, \ {\ uint32_t stride = vext_nf(desc) << ctzl(sizeof(ETYPE)); \ vext_ldst_stride(vd, v0, base, stride, env, desc, false, STORE_FN, \ - ctzl(sizeof(ETYPE)), GETPC(), MMU_DATA_STORE); \ + ctzl(sizeof(ETYPE)), GETPC()); \ }\ \ void HELPER(NAME)(void *vd, void *v0, target_ulong base, \ CPURISCVState *env, uint32_t desc) \ {\ vext_ldst_us(vd, base, env, desc, STORE_FN, \ - ctzl(sizeof(ETYPE)), env->vl, GETPC(), MMU_DATA_STORE); \ + ctzl(sizeof(ETYPE)), env->vl, GETPC()); \ } GEN_VEXT_ST_US(vse8_v, int8_t, ste_b) @@ -364,7 +364,7 @@ void HELPER(vlm_v)(void *vd, void *v0, target_ulong base, /* evl = ceil(vl/8) */ uint8_t evl = (env->vl + 7) >> 3; vext_ldst_us(vd, base, env, desc, lde_b, - 0, evl, GETPC(), MMU_DATA_LOAD); + 0, evl, GETPC()); } void HELPER(vsm_v)(void *vd, void *v0, target_ulong base, @@ -373,7 +373,7 @@ void HELPER(vsm_v)(void *vd, void *v0, target_ulong base, /* evl = ceil(vl/8) */ uint8_t evl = (env->vl + 7) >> 3; vext_ldst_us(vd, ba
[PATCH qemu v19 03/16] target/riscv: rvv: Rename ambiguous esz
From: eopXD No functional change intended in this commit. Signed-off-by: eop Chen Reviewed-by: Frank Chang Reviewed-by: Weiwei Li Reviewed-by: Alistair Francis --- target/riscv/vector_helper.c | 76 ++-- 1 file changed, 38 insertions(+), 38 deletions(-) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 60840325c4..3b79b9cbc2 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -125,9 +125,9 @@ static inline int32_t vext_lmul(uint32_t desc) /* * Get the maximum number of elements can be operated. * - * esz: log2 of element size in bytes. + * log2_esz: log2 of element size in bytes. */ -static inline uint32_t vext_max_elems(uint32_t desc, uint32_t esz) +static inline uint32_t vext_max_elems(uint32_t desc, uint32_t log2_esz) { /* * As simd_desc support at most 2048 bytes, the max vlen is 1024 bits. @@ -136,7 +136,7 @@ static inline uint32_t vext_max_elems(uint32_t desc, uint32_t esz) uint32_t vlenb = simd_maxsz(desc); /* Return VLMAX */ -int scale = vext_lmul(desc) - esz; +int scale = vext_lmul(desc) - log2_esz; return scale < 0 ? 
vlenb >> -scale : vlenb << scale; } @@ -231,11 +231,11 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base, target_ulong stride, CPURISCVState *env, uint32_t desc, uint32_t vm, vext_ldst_elem_fn *ldst_elem, - uint32_t esz, uintptr_t ra) + uint32_t log2_esz, uintptr_t ra) { uint32_t i, k; uint32_t nf = vext_nf(desc); -uint32_t max_elems = vext_max_elems(desc, esz); +uint32_t max_elems = vext_max_elems(desc, log2_esz); for (i = env->vstart; i < env->vl; i++, env->vstart++) { if (!vm && !vext_elem_mask(v0, i)) { @@ -244,7 +244,7 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base, k = 0; while (k < nf) { -target_ulong addr = base + stride * i + (k << esz); +target_ulong addr = base + stride * i + (k << log2_esz); ldst_elem(env, adjust_addr(env, addr), i + k * max_elems, vd, ra); k++; } @@ -289,18 +289,18 @@ GEN_VEXT_ST_STRIDE(vsse64_v, int64_t, ste_d) /* unmasked unit-stride load and store operation*/ static void vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc, - vext_ldst_elem_fn *ldst_elem, uint32_t esz, uint32_t evl, + vext_ldst_elem_fn *ldst_elem, uint32_t log2_esz, uint32_t evl, uintptr_t ra) { uint32_t i, k; uint32_t nf = vext_nf(desc); -uint32_t max_elems = vext_max_elems(desc, esz); +uint32_t max_elems = vext_max_elems(desc, log2_esz); /* load bytes from guest memory */ for (i = env->vstart; i < evl; i++, env->vstart++) { k = 0; while (k < nf) { -target_ulong addr = base + ((i * nf + k) << esz); +target_ulong addr = base + ((i * nf + k) << log2_esz); ldst_elem(env, adjust_addr(env, addr), i + k * max_elems, vd, ra); k++; } @@ -399,12 +399,12 @@ vext_ldst_index(void *vd, void *v0, target_ulong base, void *vs2, CPURISCVState *env, uint32_t desc, vext_get_index_addr get_index_addr, vext_ldst_elem_fn *ldst_elem, -uint32_t esz, uintptr_t ra) +uint32_t log2_esz, uintptr_t ra) { uint32_t i, k; uint32_t nf = vext_nf(desc); uint32_t vm = vext_vm(desc); -uint32_t max_elems = vext_max_elems(desc, esz); +uint32_t max_elems = 
vext_max_elems(desc, log2_esz); /* load bytes from guest memory */ for (i = env->vstart; i < env->vl; i++, env->vstart++) { @@ -414,7 +414,7 @@ vext_ldst_index(void *vd, void *v0, target_ulong base, k = 0; while (k < nf) { -abi_ptr addr = get_index_addr(base, i, vs2) + (k << esz); +abi_ptr addr = get_index_addr(base, i, vs2) + (k << log2_esz); ldst_elem(env, adjust_addr(env, addr), i + k * max_elems, vd, ra); k++; } @@ -480,13 +480,13 @@ static inline void vext_ldff(void *vd, void *v0, target_ulong base, CPURISCVState *env, uint32_t desc, vext_ldst_elem_fn *ldst_elem, - uint32_t esz, uintptr_t ra) + uint32_t log2_esz, uintptr_t ra) { void *host; uint32_t i, k, vl = 0; uint32_t nf = vext_nf(desc); uint32_t vm = vext_vm(desc); -uint32_t max_elems = vext_max_elems(desc, esz); +uint32_t max_elems = vext_max_elems(desc, log2_esz); target_ulong addr, offset, remain; /* probe every access*/ @@ -494,12 +494,12 @@ vext_ldff(void *vd, void *v0, target_ulong base, if (!vm && !vext_elem_mask(v0, i)) { continue; } -addr = adjust_addr(env, base + i * (nf << esz)); +addr = adjust_addr(env, base + i * (nf << log2_esz)); if (i == 0)
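The point of the rename is that the parameter is a shift amount, not a byte count. A minimal sketch of the two places where that matters, the VLMAX computation and the unit-stride address computation, is given below. The function names `sketch_max_elems` and `sketch_us_addr` are hypothetical, and `vlenb` and the signed log2(LMUL) are passed in directly rather than decoded from a descriptor as the real helpers do.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only: follows the arithmetic shown in vext_max_elems(),
 * but takes vlenb and log2(LMUL) as plain arguments (an assumption made
 * to keep the sketch self-contained).
 */
static uint32_t sketch_max_elems(uint32_t vlenb, int log2_lmul,
                                 uint32_t log2_esz)
{
    assert(log2_esz <= 3);  /* element sizes of 1, 2, 4 or 8 bytes */
    int scale = log2_lmul - (int)log2_esz;
    return scale < 0 ? vlenb >> -scale : vlenb << scale;
}

/* Byte address of field k of segment i for a unit-stride access. */
static uint64_t sketch_us_addr(uint64_t base, uint32_t i, uint32_t k,
                               uint32_t nf, uint32_t log2_esz)
{
    assert(k < nf);
    /* The shift multiplies the element index by the element size. */
    return base + ((uint64_t)(i * nf + k) << log2_esz);
}
```

With vlenb = 16 and LMUL = 1, 4-byte elements (log2_esz = 2) give 4 elements per register group; passing the byte size 4 instead of 2 would silently compute 1.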
[PATCH qemu v19 00/16] Add tail agnostic behavior for rvv instructions
According to the v-spec, tail-agnostic elements can either be left undisturbed or have all of their bits set to 1. To make the difference between tail policies observable, QEMU should be able to simulate the tail-agnostic behavior as "set tail elements' bits to all 1s". An option 'rvv_ta_all_1s' is added to enable this behavior; it is disabled by default.

The v-spec allows multiple possible values for agnostic elements. The main intent of this patch set is to add an option that can distinguish between tail policies. Setting agnostic elements to all 1s keeps things simple and allows QEMU to express this. We may explore other possible agnostic behaviors by adding further options in the future. Please understand that this patch set is limited in scope.

v2 updates:
- Addressed comments from Weiwei Li
- Added commit tail agnostic on load / store instructions (which I forgot to include in the patch set)

v3 updates:
- Missed the very 1st commit, adding it back

v4 updates:
- Renamed vlmax to total_elems
- Deal with tail element when vl_eq_vlmax == true

v5 updates:
- Let `vext_get_total_elems` take `desc` and `esz`
- Utilize `simd_maxsz(desc)` to get `vlenb`
- Fix code alignment

v6 updates:
- Fix `vext_get_total_elems`

v7 updates:
- Reuse `max_elems` for vector load / store helper functions. The translation sets desc's `lmul` to `min(1, lmul)`, making `vext_max_elems` equivalent to `vext_get_total_elems`.

v8 updates:
- Simplify `vext_set_elems_1s`, don't need `vext_set_elems_1s_fns`
- Fix `vext_get_total_elems`, it should be derived from EMUL instead of LMUL

v9 updates:
- Let instructions that are tail agnostic regardless of vta respect the option and not the vta.
v10 updates:
- Correct range to set element to 1s for load instructions

v11 updates:
- Separate addition of option 'rvv_ta_all_1s' as a new (last) commit
- Add description to show intent of the option in first commit for the optional tail agnostic behavior
- Tag WeiWei as Reviewed-by for all commits
- Tag Alistair as Reviewed-by for commit 01, 02
- Tag Alistair as Acked-by for commit 03

v12 updates:
- Add missing space in WeiWei's Reviewed-by tag

v13 updates:
- Fix tail agnostic for vext_ldst_us. The function operates on input parameter 'evl' rather than 'env->vl'.
- Fix tail elements for vector segment load / store instructions. A vector segment load / store instruction may contain fractional lmul with nf * lmul > 1. The rest of the elements in the last register should be treated as tail elements.
- Fix tail agnostic length for instructions with mask destination register. Instructions with mask destination register should have 'vlen - vl' tail elements.

v14 updates:
- Pass lmul information into vector helper function. `vext_get_total_elems` needs it.

v15 updates:
- Rebase to latest `master`
- Tag Alistair as Acked-by for commit 04 ~ 14
- Tag Alistair as Acked-by for commit 15

v16 updates:
- Fix bug: when lmul < 0 and vl_eq_vlmax, the original version will override on `vd` but the computation will override again, meaning the tail elements will not be set correctly. Now, we don't use TCG functions if we are trying to simulate all 1s for agnostic and use vector helpers instead.

v17 updates:
- Add "Prune access_type parameter" commit to clean up vector load / store functions. Then add parameter `is_load` in vector helper functions to enable vta behavior in the commit for adding vta on vector load / store functions.

v18 updates:
- Don't use `is_load` parameter in vector helper.
  Don't let vta pass through in `trans_rvv.inc`

v19 updates:
- Tag Alistair as Reviewed-by for commit 02
- Rebase to alistair23/qemu/riscv-to-apply.next

eopXD (16):
  target/riscv: rvv: Prune redundant ESZ, DSZ parameter passed
  target/riscv: rvv: Prune redundant access_type parameter passed
  target/riscv: rvv: Rename ambiguous esz
  target/riscv: rvv: Early exit when vstart >= vl
  target/riscv: rvv: Add tail agnostic for vv instructions
  target/riscv: rvv: Add tail agnostic for vector load / store instructions
  target/riscv: rvv: Add tail agnostic for vx, vvm, vxm instructions
  target/riscv: rvv: Add tail agnostic for vector integer shift instructions
  target/riscv: rvv: Add tail agnostic for vector integer comparison instructions
  target/riscv: rvv: Add tail agnostic for vector integer merge and move instructions
  target/riscv: rvv: Add tail agnostic for vector fix-point arithmetic instructions
  target/riscv: rvv: Add tail agnostic for vector floating-point instructions
  target/riscv: rvv: Add tail agnostic for vector reduction instructions
  target/riscv: rvv: Add tail agnostic for vector mask instructions
  target/riscv: rvv: Add tail agnostic for vector permutation instructions
  target/riscv: rvv: Add option 'rvv_ta_all_1s' to enable optional tail agnostic behavior

 target/riscv/cpu.c |2 +
 target/riscv/cpu.h |2
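To make the "set tail elements' bits to all 1s" policy concrete, here is a minimal sketch of an element-wise add followed by a tail fill. It assumes byte offsets in the same style as the `vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz)` calls in the series. The names `sketch_set_tail_1s` and `sketch_vadd_w` are hypothetical stand-ins, not the series' helpers, and masking and vstart are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for a tail-fill helper: when the agnostic policy
 * is active, set the bytes in [start, end) of the destination to 0xff.
 */
static void sketch_set_tail_1s(void *vd, uint32_t vta,
                               uint32_t start, uint32_t end)
{
    assert(start <= end);
    if (!vta || start == end) {
        return;  /* undisturbed policy, or no tail at all */
    }
    memset((uint8_t *)vd + start, 0xff, end - start);
}

/* vd[i] = vs1[i] + vs2[i] for 32-bit elements, then handle the tail. */
static void sketch_vadd_w(uint32_t *vd, const uint32_t *vs1,
                          const uint32_t *vs2, uint32_t vl,
                          uint32_t total_elems, uint32_t vta)
{
    uint32_t esz = sizeof(uint32_t);
    uint32_t i;

    assert(vl <= total_elems);
    for (i = 0; i < vl; i++) {
        vd[i] = vs1[i] + vs2[i];
    }
    /* Body ends at vl * esz bytes; the tail runs to total_elems * esz. */
    sketch_set_tail_1s(vd, vta, vl * esz, total_elems * esz);
}
```

With vl = 2 and four elements in the register group, elements 2 and 3 read back as 0xffffffff when the policy is enabled and keep their old values otherwise, which is what lets a test distinguish the two tail policies.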
[PATCH qemu v19 13/16] target/riscv: rvv: Add tail agnostic for vector reduction instructions
From: eopXD Signed-off-by: eop Chen Reviewed-by: Frank Chang Reviewed-by: Weiwei Li Acked-by: Alistair Francis --- target/riscv/vector_helper.c | 20 1 file changed, 20 insertions(+) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 8ac7fabb05..2ab4308ef0 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -4534,6 +4534,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ { \ uint32_t vm = vext_vm(desc); \ uint32_t vl = env->vl;\ +uint32_t esz = sizeof(TD);\ +uint32_t vlenb = simd_maxsz(desc);\ +uint32_t vta = vext_vta(desc);\ uint32_t i; \ TD s1 = *((TD *)vs1 + HD(0));\ \ @@ -4546,6 +4549,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ } \ *((TD *)vd + HD(0)) = s1; \ env->vstart = 0; \ +/* set tail elements to 1s */ \ +vext_set_elems_1s(vd, vta, esz, vlenb); \ } /* vd[0] = sum(vs1[0], vs2[*]) */ @@ -4615,6 +4620,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ { \ uint32_t vm = vext_vm(desc); \ uint32_t vl = env->vl; \ +uint32_t esz = sizeof(TD); \ +uint32_t vlenb = simd_maxsz(desc); \ +uint32_t vta = vext_vta(desc); \ uint32_t i;\ TD s1 = *((TD *)vs1 + HD(0)); \ \ @@ -4627,6 +4635,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ } \ *((TD *)vd + HD(0)) = s1; \ env->vstart = 0; \ +/* set tail elements to 1s */ \ +vext_set_elems_1s(vd, vta, esz, vlenb);\ } /* Unordered sum */ @@ -4651,6 +4661,9 @@ void HELPER(vfwredsum_vs_h)(void *vd, void *v0, void *vs1, { uint32_t vm = vext_vm(desc); uint32_t vl = env->vl; +uint32_t esz = sizeof(uint32_t); +uint32_t vlenb = simd_maxsz(desc); +uint32_t vta = vext_vta(desc); uint32_t i; uint32_t s1 = *((uint32_t *)vs1 + H4(0)); @@ -4664,6 +4677,8 @@ void HELPER(vfwredsum_vs_h)(void *vd, void *v0, void *vs1, } *((uint32_t *)vd + H4(0)) = s1; env->vstart = 0; +/* set tail elements to 1s */ +vext_set_elems_1s(vd, vta, esz, vlenb); } void HELPER(vfwredsum_vs_w)(void *vd, void *v0, void *vs1, @@ -4671,6 +4686,9 @@ void HELPER(vfwredsum_vs_w)(void *vd, void 
*v0, void *vs1, { uint32_t vm = vext_vm(desc); uint32_t vl = env->vl; +uint32_t esz = sizeof(uint64_t); +uint32_t vlenb = simd_maxsz(desc); +uint32_t vta = vext_vta(desc); uint32_t i; uint64_t s1 = *((uint64_t *)vs1); @@ -4684,6 +4702,8 @@ void HELPER(vfwredsum_vs_w)(void *vd, void *v0, void *vs1, } *((uint64_t *)vd) = s1; env->vstart = 0; +/* set tail elements to 1s */ +vext_set_elems_1s(vd, vta, esz, vlenb); } /* -- 2.34.2
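Reductions differ from element-wise operations in where the tail starts: only element 0 of the destination is written, so the calls above fill from `esz` up to `vlenb` rather than from `vl * esz`. A minimal sketch under that reading follows. `sketch_redsum_w` is a hypothetical name, masking and vstart are ignored, and a plain `memset` stands in for `vext_set_elems_1s`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* vd[0] = vs1[0] + sum(vs2[0..vl-1]) for 32-bit elements. */
static void sketch_redsum_w(uint32_t *vd, const uint32_t *vs1,
                            const uint32_t *vs2, uint32_t vl,
                            uint32_t vlenb, uint32_t vta)
{
    uint32_t esz = sizeof(uint32_t);
    uint32_t s1 = vs1[0];
    uint32_t i;

    assert(vlenb >= esz);
    for (i = 0; i < vl; i++) {
        s1 += vs2[i];
    }
    vd[0] = s1;
    /* Everything after the single result element counts as tail. */
    if (vta) {
        memset((uint8_t *)vd + esz, 0xff, vlenb - esz);
    }
}
```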
[PATCH qemu v19 05/16] target/riscv: rvv: Add tail agnostic for vv instructions
From: eopXD According to the v-spec, tail-agnostic elements can either be left undisturbed or have all of their bits set to 1. To make the difference between tail policies observable, QEMU should be able to simulate the tail-agnostic behavior as "set tail elements' bits to all 1s". The v-spec allows multiple possible values for agnostic elements. The main intent of this patch set is to add an option that can distinguish between tail policies. Setting agnostic elements to all 1s allows QEMU to express this. This is the first commit regarding the optional tail-agnostic behavior. Follow-up commits will add this optional behavior for all rvv instructions. Signed-off-by: eop Chen Reviewed-by: Frank Chang Reviewed-by: Weiwei Li Acked-by: Alistair Francis --- target/riscv/cpu.h | 2 + target/riscv/cpu_helper.c | 2 + target/riscv/insn_trans/trans_rvv.c.inc | 3 +- target/riscv/internals.h| 5 +- target/riscv/translate.c| 2 + target/riscv/vector_helper.c| 295 +--- 6 files changed, 177 insertions(+), 132 deletions(-) diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index 890d33cebb..c38f2035c5 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -412,6 +412,7 @@ struct RISCVCPUConfig { bool ext_zve32f; bool ext_zve64f; bool ext_zmmul; +bool rvv_ta_all_1s; uint32_t mvendorid; uint64_t marchid; @@ -566,6 +567,7 @@ FIELD(TB_FLAGS, XL, 20, 2) /* If PointerMasking should be applied */ FIELD(TB_FLAGS, PM_MASK_ENABLED, 22, 1) FIELD(TB_FLAGS, PM_BASE_ENABLED, 23, 1) +FIELD(TB_FLAGS, VTA, 24, 1) #ifdef TARGET_RISCV32 #define riscv_cpu_mxl(env) ((void)(env), MXL_RV32) diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c index d99fac9d2d..2aa6e463a0 100644 --- a/target/riscv/cpu_helper.c +++ b/target/riscv/cpu_helper.c @@ -65,6 +65,8 @@ void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc, flags = FIELD_DP32(flags, TB_FLAGS, LMUL, FIELD_EX64(env->vtype, VTYPE, VLMUL)); flags = FIELD_DP32(flags, TB_FLAGS, VL_EQ_VLMAX, vl_eq_vlmax); +flags = FIELD_DP32(flags, 
TB_FLAGS, VTA, +FIELD_EX64(env->vtype, VTYPE, VTA)); } else { flags = FIELD_DP32(flags, TB_FLAGS, VILL, 1); } diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc index 6750f5d04a..bf9886a93d 100644 --- a/target/riscv/insn_trans/trans_rvv.c.inc +++ b/target/riscv/insn_trans/trans_rvv.c.inc @@ -1231,7 +1231,7 @@ do_opivv_gvec(DisasContext *s, arg_rmrr *a, GVecGen3Fn *gvec_fn, tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); -if (a->vm && s->vl_eq_vlmax) { +if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) { gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), vreg_ofs(s, a->rs1), MAXSZ(s), MAXSZ(s)); @@ -1240,6 +1240,7 @@ do_opivv_gvec(DisasContext *s, arg_rmrr *a, GVecGen3Fn *gvec_fn, data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, s->lmul); +data = FIELD_DP32(data, VDATA, VTA, s->vta); tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), vreg_ofs(s, a->rs1), vreg_ofs(s, a->rs2), cpu_env, s->cfg_ptr->vlen / 8, diff --git a/target/riscv/internals.h b/target/riscv/internals.h index dbb322bfa7..512c6c30cf 100644 --- a/target/riscv/internals.h +++ b/target/riscv/internals.h @@ -24,8 +24,9 @@ /* share data between vector helpers and decode code */ FIELD(VDATA, VM, 0, 1) FIELD(VDATA, LMUL, 1, 3) -FIELD(VDATA, NF, 4, 4) -FIELD(VDATA, WD, 4, 1) +FIELD(VDATA, VTA, 4, 1) +FIELD(VDATA, NF, 5, 4) +FIELD(VDATA, WD, 5, 1) /* float point classify helpers */ target_ulong fclass_h(uint64_t frs1); diff --git a/target/riscv/translate.c b/target/riscv/translate.c index 55a4713af2..59f0ee9a50 100644 --- a/target/riscv/translate.c +++ b/target/riscv/translate.c @@ -94,6 +94,7 @@ typedef struct DisasContext { */ int8_t lmul; uint8_t sew; +uint8_t vta; target_ulong vstart; bool vl_eq_vlmax; uint8_t ntemp; @@ -1099,6 +1100,7 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs) ctx->vill = FIELD_EX32(tb_flags, 
TB_FLAGS, VILL); ctx->sew = FIELD_EX32(tb_flags, TB_FLAGS, SEW); ctx->lmul = sextract32(FIELD_EX32(tb_flags, TB_FLAGS, LMUL), 0, 3); +ctx->vta = FIELD_EX32(tb_flags, TB_FLAGS, VTA) && cpu->cfg.rvv_ta_all_1s; ctx->vstart = env->vstart; ctx->vl_eq_vlmax = FIELD_EX32(tb_flags, TB_FLAGS, VL_EQ_VLMAX); ctx->misa_mxl_max = env->misa_mxl_max; diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 3b79b9cbc2..2248f0cbee 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@
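The `internals.h` hunk above inserts VTA at bit 4 and moves NF (and WD) up to start at bit 5. The sketch below packs and unpacks a word with that layout to show why the translator and the helpers must agree on it. The `sketch_*` helpers are hypothetical and written out by hand rather than using QEMU's FIELD macros. The real descriptor also carries size information from `simd_desc`, which is not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout after the patch: VM[0], LMUL[3:1], VTA[4], NF[8:5]. */
enum {
    SK_VM_SHIFT = 0,   SK_VM_LEN = 1,
    SK_LMUL_SHIFT = 1, SK_LMUL_LEN = 3,
    SK_VTA_SHIFT = 4,  SK_VTA_LEN = 1,
    SK_NF_SHIFT = 5,   SK_NF_LEN = 4,
};

static uint32_t sketch_deposit(uint32_t word, unsigned shift, unsigned len,
                               uint32_t val)
{
    assert(len > 0 && len < 32 && shift + len <= 32);
    uint32_t mask = ((1u << len) - 1) << shift;
    return (word & ~mask) | ((val << shift) & mask);
}

static uint32_t sketch_extract(uint32_t word, unsigned shift, unsigned len)
{
    assert(len > 0 && len < 32 && shift + len <= 32);
    return (word >> shift) & ((1u << len) - 1);
}

static uint32_t sketch_pack_vdata(uint32_t vm, int lmul, uint32_t vta,
                                  uint32_t nf)
{
    uint32_t data = 0;
    data = sketch_deposit(data, SK_VM_SHIFT, SK_VM_LEN, vm);
    /* lmul is a signed log2 value stored in 3 bits, two's complement. */
    data = sketch_deposit(data, SK_LMUL_SHIFT, SK_LMUL_LEN, (uint32_t)lmul);
    data = sketch_deposit(data, SK_VTA_SHIFT, SK_VTA_LEN, vta);
    data = sketch_deposit(data, SK_NF_SHIFT, SK_NF_LEN, nf);
    return data;
}

/* Sign-extend the 3-bit LMUL field back to an int. */
static int sketch_unpack_lmul(uint32_t data)
{
    int v = (int)sketch_extract(data, SK_LMUL_SHIFT, SK_LMUL_LEN);
    return (v ^ 4) - 4;
}
```

A helper still reading NF at bit 4 after this change would pick up the VTA bit as the low bit of NF, which is why the shift and the new field land in the same commit.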
[PATCH qemu v19 08/16] target/riscv: rvv: Add tail agnostic for vector integer shift instructions
From: eopXD Signed-off-by: eop Chen Reviewed-by: Frank Chang Reviewed-by: Weiwei Li Acked-by: Alistair Francis --- target/riscv/insn_trans/trans_rvv.c.inc | 3 ++- target/riscv/vector_helper.c| 11 +++ 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc index 22edf6228d..dbe687fb73 100644 --- a/target/riscv/insn_trans/trans_rvv.c.inc +++ b/target/riscv/insn_trans/trans_rvv.c.inc @@ -1831,7 +1831,7 @@ do_opivx_gvec_shift(DisasContext *s, arg_rmrr *a, GVecGen2sFn32 *gvec_fn, return false; } -if (a->vm && s->vl_eq_vlmax) { +if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) { TCGv_i32 src1 = tcg_temp_new_i32(); tcg_gen_trunc_tl_i32(src1, get_gpr(s, a->rs1, EXT_NONE)); @@ -1890,6 +1890,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ +data = FIELD_DP32(data, VDATA, VTA, s->vta); \ tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), \ vreg_ofs(s, a->rs1),\ vreg_ofs(s, a->rs2), cpu_env, \ diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 3bcde37e28..9738c50222 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -1270,6 +1270,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ { \ uint32_t vm = vext_vm(desc); \ uint32_t vl = env->vl;\ +uint32_t esz = sizeof(TS1); \ +uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ +uint32_t vta = vext_vta(desc);\ uint32_t i; \ \ for (i = env->vstart; i < vl; i++) { \ @@ -1281,6 +1284,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ *((TS1 *)vd + HS1(i)) = OP(s2, s1 & MASK);\ } \ env->vstart = 0; \ +/* set tail elements to 1s */ \ +vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ } GEN_VEXT_SHIFT_VV(vsll_vv_b, uint8_t, uint8_t, H1, H1, DO_SLL, 0x7) @@ -1305,6 +1310,10 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ { \ uint32_t vm = 
vext_vm(desc);\ uint32_t vl = env->vl; \ +uint32_t esz = sizeof(TD); \ +uint32_t total_elems = \ +vext_get_total_elems(env, desc, esz); \ +uint32_t vta = vext_vta(desc); \ uint32_t i; \ \ for (i = env->vstart; i < vl; i++) {\ @@ -1315,6 +1324,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ *((TD *)vd + HD(i)) = OP(s2, s1 & MASK);\ } \ env->vstart = 0;\ +/* set tail elements to 1s */ \ +vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);\ } GEN_VEXT_SHIFT_VX(vsll_vx_b, uint8_t, int8_t, H1, H1, DO_SLL, 0x7) -- 2.34.2
[PATCH qemu v19 06/16] target/riscv: rvv: Add tail agnostic for vector load / store instructions
From: eopXD The destination registers of unit-stride mask load and store instructions are always written with a tail-agnostic policy. A vector segment load / store instruction may have a fractional lmul with nf * lmul > 1. The remaining elements in the last register should be treated as tail elements. Signed-off-by: eop Chen Reviewed-by: Frank Chang Reviewed-by: Weiwei Li Acked-by: Alistair Francis --- target/riscv/insn_trans/trans_rvv.c.inc | 6 +++ target/riscv/translate.c| 2 + target/riscv/vector_helper.c| 60 + 3 files changed, 68 insertions(+) diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc index bf9886a93d..cd73fd6119 100644 --- a/target/riscv/insn_trans/trans_rvv.c.inc +++ b/target/riscv/insn_trans/trans_rvv.c.inc @@ -711,6 +711,7 @@ static bool ld_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t eew) data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, emul); data = FIELD_DP32(data, VDATA, NF, a->nf); +data = FIELD_DP32(data, VDATA, VTA, s->vta); return ldst_us_trans(a->rd, a->rs1, data, fn, s, false); } @@ -774,6 +775,8 @@ static bool ld_us_mask_op(DisasContext *s, arg_vlm_v *a, uint8_t eew) /* EMUL = 1, NFIELDS = 1 */ data = FIELD_DP32(data, VDATA, LMUL, 0); data = FIELD_DP32(data, VDATA, NF, 1); +/* Mask destination register are always tail-agnostic */ +data = FIELD_DP32(data, VDATA, VTA, s->cfg_vta_all_1s); return ldst_us_trans(a->rd, a->rs1, data, fn, s, false); } @@ -862,6 +865,7 @@ static bool ld_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t eew) data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, emul); data = FIELD_DP32(data, VDATA, NF, a->nf); +data = FIELD_DP32(data, VDATA, VTA, s->vta); return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s, false); } @@ -991,6 +995,7 @@ static bool ld_index_op(DisasContext *s, arg_rnfvm *a, uint8_t eew) data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, emul); data = 
FIELD_DP32(data, VDATA, NF, a->nf); +data = FIELD_DP32(data, VDATA, VTA, s->vta); return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s, false); } @@ -1108,6 +1113,7 @@ static bool ldff_op(DisasContext *s, arg_r2nfvm *a, uint8_t eew) data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, emul); data = FIELD_DP32(data, VDATA, NF, a->nf); +data = FIELD_DP32(data, VDATA, VTA, s->vta); return ldff_trans(a->rd, a->rs1, data, fn, s); } diff --git a/target/riscv/translate.c b/target/riscv/translate.c index 59f0ee9a50..b151c20674 100644 --- a/target/riscv/translate.c +++ b/target/riscv/translate.c @@ -95,6 +95,7 @@ typedef struct DisasContext { int8_t lmul; uint8_t sew; uint8_t vta; +bool cfg_vta_all_1s; target_ulong vstart; bool vl_eq_vlmax; uint8_t ntemp; @@ -1101,6 +1102,7 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs) ctx->sew = FIELD_EX32(tb_flags, TB_FLAGS, SEW); ctx->lmul = sextract32(FIELD_EX32(tb_flags, TB_FLAGS, LMUL), 0, 3); ctx->vta = FIELD_EX32(tb_flags, TB_FLAGS, VTA) && cpu->cfg.rvv_ta_all_1s; +ctx->cfg_vta_all_1s = cpu->cfg.rvv_ta_all_1s; ctx->vstart = env->vstart; ctx->vl_eq_vlmax = FIELD_EX32(tb_flags, TB_FLAGS, VL_EQ_VLMAX); ctx->misa_mxl_max = env->misa_mxl_max; diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 2248f0cbee..cb14c321ea 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -269,6 +269,9 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base, uint32_t i, k; uint32_t nf = vext_nf(desc); uint32_t max_elems = vext_max_elems(desc, log2_esz); +uint32_t esz = 1 << log2_esz; +uint32_t total_elems = vext_get_total_elems(env, desc, esz); +uint32_t vta = vext_vta(desc); for (i = env->vstart; i < env->vl; i++, env->vstart++) { if (!vm && !vext_elem_mask(v0, i)) { @@ -283,6 +286,18 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base, } } env->vstart = 0; +/* set tail elements to 1s */ +for (k = 0; k < nf; ++k) { 
+vext_set_elems_1s(vd, vta, (k * max_elems + env->vl) * esz, + (k * max_elems + max_elems) * esz); +} +if (nf * max_elems % total_elems != 0) { +uint32_t vlenb = env_archcpu(env)->cfg.vlen >> 3; +uint32_t registers_used = +((nf * max_elems) * esz + (vlenb - 1)) / vlenb; +vext_set_elems_1s(vd, vta, (nf * max_elems) * esz, + registers_used * vlenb); +} } #define GEN_VEXT_LD_STRIDE(NAME, ETYPE, LOAD_FN)\ @@ -328,6 +343,9 @@ vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc, uint32_t i, k; uint32_t nf =
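The segment tail handling in the `vext_ldst_stride` hunk has two parts: a per-field tail inside each of the `nf` fields, and, for fractional LMUL, the unused remainder of the last register. The sketch below reproduces that range arithmetic on a plain byte buffer. `sketch_segment_tail_1s` is a hypothetical name, `total_elems` and `vlenb` are passed in as arguments rather than derived from `env` and `desc`, and `memset` stands in for `vext_set_elems_1s`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static void sketch_segment_tail_1s(uint8_t *vd, uint32_t vta, uint32_t nf,
                                   uint32_t vl, uint32_t max_elems,
                                   uint32_t total_elems, uint32_t esz,
                                   uint32_t vlenb)
{
    uint32_t k;

    assert(vl <= max_elems && total_elems > 0 && vlenb > 0);
    if (!vta) {
        return;
    }
    /* Tail of each field: elements vl .. max_elems - 1 of field k. */
    for (k = 0; k < nf; k++) {
        uint32_t start = (k * max_elems + vl) * esz;
        uint32_t end = (k * max_elems + max_elems) * esz;
        memset(vd + start, 0xff, end - start);
    }
    /* Fractional LMUL: the fields may end partway through a register. */
    if (nf * max_elems % total_elems != 0) {
        uint32_t used_bytes = nf * max_elems * esz;
        uint32_t registers_used = (used_bytes + (vlenb - 1)) / vlenb;
        memset(vd + used_bytes, 0xff, registers_used * vlenb - used_bytes);
    }
}
```

As a worked case, take vlenb = 16, esz = 4, LMUL = 1/2 (so max_elems = 2 and total_elems = 4), nf = 3 and vl = 1. The per-field tails are bytes [4, 8), [12, 16) and [20, 24). The three fields occupy 24 bytes, which rounds up to two registers, so bytes [24, 32) are filled as well.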
[PATCH qemu v19 14/16] target/riscv: rvv: Add tail agnostic for vector mask instructions
From: eopXD The tail elements in the destination mask register are updated under a tail-agnostic policy. Signed-off-by: eop Chen Reviewed-by: Frank Chang Reviewed-by: Weiwei Li Acked-by: Alistair Francis --- target/riscv/insn_trans/trans_rvv.c.inc | 6 + target/riscv/vector_helper.c| 30 + 2 files changed, 36 insertions(+) diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc index 1add4cb655..a94e634a6b 100644 --- a/target/riscv/insn_trans/trans_rvv.c.inc +++ b/target/riscv/insn_trans/trans_rvv.c.inc @@ -3135,6 +3135,8 @@ static bool trans_##NAME(DisasContext *s, arg_r *a) \ tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ +data = \ +FIELD_DP32(data, VDATA, VTA_ALL_1S, s->cfg_vta_all_1s);\ tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), \ vreg_ofs(s, a->rs1),\ vreg_ofs(s, a->rs2), cpu_env, \ @@ -3239,6 +3241,8 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ +data = \ +FIELD_DP32(data, VDATA, VTA_ALL_1S, s->cfg_vta_all_1s);\ tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), \ vreg_ofs(s, 0), vreg_ofs(s, a->rs2),\ cpu_env, s->cfg_ptr->vlen / 8, \ @@ -3276,6 +3280,7 @@ static bool trans_viota_m(DisasContext *s, arg_viota_m *a) data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, s->lmul); +data = FIELD_DP32(data, VDATA, VTA, s->vta); static gen_helper_gvec_3_ptr * const fns[4] = { gen_helper_viota_m_b, gen_helper_viota_m_h, gen_helper_viota_m_w, gen_helper_viota_m_d, @@ -3305,6 +3310,7 @@ static bool trans_vid_v(DisasContext *s, arg_vid_v *a) data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, s->lmul); +data = FIELD_DP32(data, VDATA, VTA, s->vta); static gen_helper_gvec_2_ptr * const fns[4] = { gen_helper_vid_v_b, gen_helper_vid_v_h, gen_helper_vid_v_w, gen_helper_vid_v_d, diff --git 
a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 2ab4308ef0..5c2d1c02f4 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -4716,6 +4716,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ uint32_t desc) \ { \ uint32_t vl = env->vl;\ +uint32_t total_elems = env_archcpu(env)->cfg.vlen;\ +uint32_t vta_all_1s = vext_vta_all_1s(desc); \ uint32_t i; \ int a, b; \ \ @@ -4725,6 +4727,15 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ vext_set_elem_mask(vd, i, OP(b, a)); \ } \ env->vstart = 0; \ +/* mask destination register are always tail- \ + * agnostic \ + */ \ +/* set tail elements to 1s */ \ +if (vta_all_1s) { \ +for (; i < total_elems; i++) {\ +vext_set_elem_mask(vd, i, 1); \ +} \ +} \ } #define DO_NAND(N, M) (!(N & M)) @@ -4792,6 +4803,8 @@ static void vmsetm(void *vd, void *v0, void *vs2, CPURISCVState *env, { uint32_t vm = vext_vm(desc); uint32_t vl = env->vl; +uint32_t total_elems = env_archcpu(env)->cfg.vlen; +uint32_t vta_all_1s = vext_vta_all_1s(desc); int i; bool first_mask_bit = false; @@ -4820,6 +4833,13 @@ static void vmsetm(void *vd, void *v0, void *vs2, CPURISCVState *env, } } env->vstart = 0; +/* mask destination register are always tail-agnostic */ +/* set tail elements to 1s */ +if (vta_all_1s) { +for (; i < total_elems; i++) { +vext_set_elem_mask(vd, i, 1); +} +} }
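For mask-producing instructions the tail is counted in bits, not bytes: everything from bit `vl` up to bit `vlen - 1` of the destination mask register. A minimal sketch of a mask-logical AND with that tail fill is shown below. The `sketch_*` names are hypothetical, and the mask register is modelled as an array of 64-bit words, which is an assumption of this sketch rather than something the patch states.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t sketch_get_mask_bit(const uint64_t *v, uint32_t idx)
{
    return (v[idx / 64] >> (idx % 64)) & 1;
}

static void sketch_set_mask_bit(uint64_t *v, uint32_t idx, uint64_t value)
{
    uint32_t pos = idx % 64;
    v[idx / 64] = (v[idx / 64] & ~(1ull << pos)) | ((value & 1) << pos);
}

/* vd.mask[i] = vs1.mask[i] & vs2.mask[i]; tail bits optionally set to 1. */
static void sketch_mask_and(uint64_t *vd, const uint64_t *vs1,
                            const uint64_t *vs2, uint32_t vl,
                            uint32_t vlen, uint32_t vta_all_1s)
{
    uint32_t i;

    assert(vl <= vlen);
    for (i = 0; i < vl; i++) {
        uint64_t a = sketch_get_mask_bit(vs1, i);
        uint64_t b = sketch_get_mask_bit(vs2, i);
        sketch_set_mask_bit(vd, i, a & b);
    }
    /*
     * Mask destinations are always tail-agnostic, so only the config
     * option decides whether the remaining vlen - vl bits become 1.
     */
    if (vta_all_1s) {
        for (; i < vlen; i++) {
            sketch_set_mask_bit(vd, i, 1);
        }
    }
}
```

Note that the guard is the option itself and not the `vta` bit from `vtype`, matching the `VTA_ALL_1S` / `cfg_vta_all_1s` plumbing in the translation hunks above.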
[PATCH qemu v19 04/16] target/riscv: rvv: Early exit when vstart >= vl
From: eopXD According to v-spec (section 5.4): When vstart ≥ vl, there are no body elements, and no elements are updated in any destination vector register group, including that no tail elements are updated with agnostic values. vmsbf.m, vmsif.m, vmsof.m, viota.m, vcompress instructions themselves require vstart to be zero. So they don't need the early exit. Signed-off-by: eop Chen Reviewed-by: Frank Chang Reviewed-by: Weiwei Li Acked-by: Alistair Francis --- target/riscv/insn_trans/trans_rvv.c.inc | 27 + 1 file changed, 27 insertions(+) diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc index 391c61fe93..6750f5d04a 100644 --- a/target/riscv/insn_trans/trans_rvv.c.inc +++ b/target/riscv/insn_trans/trans_rvv.c.inc @@ -652,6 +652,7 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, uint32_t data, TCGLabel *over = gen_new_label(); tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); +tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); dest = tcg_temp_new_ptr(); mask = tcg_temp_new_ptr(); @@ -818,6 +819,7 @@ static bool ldst_stride_trans(uint32_t vd, uint32_t rs1, uint32_t rs2, TCGLabel *over = gen_new_label(); tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); +tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); dest = tcg_temp_new_ptr(); mask = tcg_temp_new_ptr(); @@ -925,6 +927,7 @@ static bool ldst_index_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, TCGLabel *over = gen_new_label(); tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); +tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); dest = tcg_temp_new_ptr(); mask = tcg_temp_new_ptr(); @@ -1067,6 +1070,7 @@ static bool ldff_trans(uint32_t vd, uint32_t rs1, uint32_t data, TCGLabel *over = gen_new_label(); tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); +tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); dest = tcg_temp_new_ptr(); mask = tcg_temp_new_ptr(); @@ -1225,6 +1229,7 @@ do_opivv_gvec(DisasContext *s, arg_rmrr *a, GVecGen3Fn 
*gvec_fn, } tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); +tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); if (a->vm && s->vl_eq_vlmax) { gvec_fn(s->sew, vreg_ofs(s, a->rd), @@ -1272,6 +1277,7 @@ static bool opivx_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, uint32_t vm, TCGLabel *over = gen_new_label(); tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); +tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); dest = tcg_temp_new_ptr(); mask = tcg_temp_new_ptr(); @@ -1436,6 +1442,7 @@ static bool opivi_trans(uint32_t vd, uint32_t imm, uint32_t vs2, uint32_t vm, TCGLabel *over = gen_new_label(); tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); +tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); dest = tcg_temp_new_ptr(); mask = tcg_temp_new_ptr(); @@ -1522,6 +1529,7 @@ static bool do_opivv_widen(DisasContext *s, arg_rmrr *a, uint32_t data = 0; TCGLabel *over = gen_new_label(); tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); +tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, s->lmul); @@ -1602,6 +1610,7 @@ static bool do_opiwv_widen(DisasContext *s, arg_rmrr *a, uint32_t data = 0; TCGLabel *over = gen_new_label(); tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); +tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, s->lmul); @@ -1679,6 +1688,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ }; \ TCGLabel *over = gen_new_label(); \ tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ +tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ @@ -1860,6 +1870,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ }; \ TCGLabel *over = gen_new_label(); \ tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ +tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, 
cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ @@ -2070,6 +2081,7 @@ static bool trans_v
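The effect of the added `TCG_COND_GEU` branch can be sketched at the C level: when vstart >= vl the instruction has no body elements, so nothing in the destination changes, including the tail. The function below is a hypothetical model of a byte-wise add, not code from the series. It returns whether the body ran, and it leaves `vstart` untouched on the early-exit path because this excerpt does not show what the real translation does with vstart there.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the body executed, 0 if the instruction was skipped. */
static int sketch_vadd_b(uint8_t *vd, const uint8_t *vs1, const uint8_t *vs2,
                         uint32_t *vstart, uint32_t vl,
                         uint32_t total_elems, uint32_t vta)
{
    uint32_t i;

    assert(vl <= total_elems);
    /* Mirrors the two branches to 'over': vl == 0, then vstart >= vl. */
    if (vl == 0 || *vstart >= vl) {
        return 0;  /* no body elements, and no agnostic tail writes */
    }
    for (i = *vstart; i < vl; i++) {
        vd[i] = (uint8_t)(vs1[i] + vs2[i]);
    }
    *vstart = 0;
    if (vta) {
        memset(vd + vl, 0xff, total_elems - vl);
    }
    return 1;
}
```

Without the early exit, a later tail-fill step would overwrite tail elements even though the quoted v-spec text says no elements are updated in that case, which is presumably why this commit precedes the tail-agnostic ones in the series.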
[PATCH qemu v19 11/16] target/riscv: rvv: Add tail agnostic for vector fix-point arithmetic instructions
From: eopXD

Signed-off-by: eop Chen
Reviewed-by: Frank Chang
Reviewed-by: Weiwei Li
Acked-by: Alistair Francis
---
 target/riscv/vector_helper.c | 220 ++-
 1 file changed, 114 insertions(+), 106 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 2bf670920a..db221797c6 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -2102,10 +2102,12 @@ static inline void
 vext_vv_rm_2(void *vd, void *v0, void *vs1, void *vs2,
              CPURISCVState *env,
              uint32_t desc,
-             opivv2_rm_fn *fn)
+             opivv2_rm_fn *fn, uint32_t esz)
 {
     uint32_t vm = vext_vm(desc);
     uint32_t vl = env->vl;
+    uint32_t total_elems = vext_get_total_elems(env, desc, esz);
+    uint32_t vta = vext_vta(desc);
 
     switch (env->vxrm) {
     case 0: /* rnu */
@@ -2125,15 +2127,17 @@ vext_vv_rm_2(void *vd, void *v0, void *vs1, void *vs2,
                      env, vl, vm, 3, fn);
         break;
     }
+    /* set tail elements to 1s */
+    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);
 }
 
 /* generate helpers for fixed point instructions with OPIVV format */
-#define GEN_VEXT_VV_RM(NAME)                                    \
+#define GEN_VEXT_VV_RM(NAME, ESZ)                               \
 void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,     \
                   CPURISCVState *env, uint32_t desc)            \
 {                                                               \
     vext_vv_rm_2(vd, v0, vs1, vs2, env, desc,                   \
-                 do_##NAME);                                    \
+                 do_##NAME, ESZ);                               \
 }
 
 static inline uint8_t saddu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b)
@@ -2183,10 +2187,10 @@ RVVCALL(OPIVV2_RM, vsaddu_vv_b, OP_UUU_B, H1, H1, H1, saddu8)
 RVVCALL(OPIVV2_RM, vsaddu_vv_h, OP_UUU_H, H2, H2, H2, saddu16)
 RVVCALL(OPIVV2_RM, vsaddu_vv_w, OP_UUU_W, H4, H4, H4, saddu32)
 RVVCALL(OPIVV2_RM, vsaddu_vv_d, OP_UUU_D, H8, H8, H8, saddu64)
-GEN_VEXT_VV_RM(vsaddu_vv_b)
-GEN_VEXT_VV_RM(vsaddu_vv_h)
-GEN_VEXT_VV_RM(vsaddu_vv_w)
-GEN_VEXT_VV_RM(vsaddu_vv_d)
+GEN_VEXT_VV_RM(vsaddu_vv_b, 1)
+GEN_VEXT_VV_RM(vsaddu_vv_h, 2)
+GEN_VEXT_VV_RM(vsaddu_vv_w, 4)
+GEN_VEXT_VV_RM(vsaddu_vv_d, 8)
 
 typedef void opivx2_rm_fn(void *vd, target_long s1, void *vs2, int i,
                           CPURISCVState *env, int vxrm);
@@ -2219,10 +2223,12 @@ static inline void
 vext_vx_rm_2(void *vd, void *v0, target_long s1, void *vs2,
              CPURISCVState *env,
              uint32_t desc,
-             opivx2_rm_fn *fn)
+             opivx2_rm_fn *fn, uint32_t esz)
 {
     uint32_t vm = vext_vm(desc);
     uint32_t vl = env->vl;
+    uint32_t total_elems = vext_get_total_elems(env, desc, esz);
+    uint32_t vta = vext_vta(desc);
 
     switch (env->vxrm) {
     case 0: /* rnu */
@@ -2242,25 +2248,27 @@ vext_vx_rm_2(void *vd, void *v0, target_long s1, void *vs2,
                      env, vl, vm, 3, fn);
         break;
     }
+    /* set tail elements to 1s */
+    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);
 }
 
 /* generate helpers for fixed point instructions with OPIVX format */
-#define GEN_VEXT_VX_RM(NAME)                              \
+#define GEN_VEXT_VX_RM(NAME, ESZ)                         \
 void HELPER(NAME)(void *vd, void *v0, target_ulong s1,    \
                   void *vs2, CPURISCVState *env, uint32_t desc) \
 {                                                         \
     vext_vx_rm_2(vd, v0, s1, vs2, env, desc,              \
-                 do_##NAME);                              \
+                 do_##NAME, ESZ);                         \
 }
 
 RVVCALL(OPIVX2_RM, vsaddu_vx_b, OP_UUU_B, H1, H1, saddu8)
 RVVCALL(OPIVX2_RM, vsaddu_vx_h, OP_UUU_H, H2, H2, saddu16)
 RVVCALL(OPIVX2_RM, vsaddu_vx_w, OP_UUU_W, H4, H4, saddu32)
 RVVCALL(OPIVX2_RM, vsaddu_vx_d, OP_UUU_D, H8, H8, saddu64)
-GEN_VEXT_VX_RM(vsaddu_vx_b)
-GEN_VEXT_VX_RM(vsaddu_vx_h)
-GEN_VEXT_VX_RM(vsaddu_vx_w)
-GEN_VEXT_VX_RM(vsaddu_vx_d)
+GEN_VEXT_VX_RM(vsaddu_vx_b, 1)
+GEN_VEXT_VX_RM(vsaddu_vx_h, 2)
+GEN_VEXT_VX_RM(vsaddu_vx_w, 4)
+GEN_VEXT_VX_RM(vsaddu_vx_d, 8)
 
 static inline int8_t sadd8(CPURISCVState *env, int vxrm, int8_t a, int8_t b)
 {
@@ -2306,19 +2314,19 @@ RVVCALL(OPIVV2_RM, vsadd_vv_b, OP_SSS_B, H1, H1, H1, sadd8)
 RVVCALL(OPIVV2_RM, vsadd_vv_h, OP_SSS_H, H2, H2, H2, sadd16)
 RVVCALL(OPIVV2_RM, vsadd_vv_w, OP_SSS_W, H4, H4, H4, sadd32)
 RVVCALL(OPIVV2_RM, vsadd_vv_d, OP_SSS_D, H8, H8, H8, sadd64)
-GEN_VEXT_VV_RM(vsadd_vv_b)
-GEN_VEXT_VV_RM(vsadd_vv_h)
-GEN_VEXT_VV_RM(vsadd_vv_w)
-GEN_VEXT_VV_RM(vsadd_vv_d)
+GEN_VEXT_VV_RM(vsadd_vv_b, 1)
+GEN_VEXT_VV_RM(vsadd_vv_h, 2)
+GEN_VEXT_VV_RM(vsadd_vv_w, 4)
+GEN_VEXT_VV_RM(vsadd_vv_d, 8)
 
 RVVCALL(OPIVX2_RM, vsadd_vx_b, OP_SSS_B, H1, H1, sadd8)
 RVVCALL(OPIVX2_RM, vsadd_vx_h, OP_SSS_H, H2, H2, sadd16)
 RVVCALL(OPIVX2_RM, vsadd_vx_w, OP_SSS_W, H4, H4,
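The helpers in this patch hand `vext_set_elems_1s()` a byte range from `vl * esz` to `total_elems * esz`. The following is a minimal standalone sketch of how such a range could be derived. It is not QEMU's `vext_get_total_elems()`: it assumes the element width equals SEW and that a fractional LMUL still occupies one whole register. The names `tail_range`, `tail_bytes`, `vlenb` and `lmul_log2` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the tail byte range of a destination group. */
struct tail_range {
    uint32_t start; /* first tail byte: vl * esz */
    uint32_t end;   /* one past the last tail byte: total_elems * esz */
};

/*
 * Assumption: a register group holds (vlenb << lmul_log2) bytes, and a
 * fractional LMUL (lmul_log2 < 0) is rounded up to one full register.
 */
static struct tail_range tail_bytes(uint32_t vlenb, int lmul_log2,
                                    uint32_t vl, uint32_t esz)
{
    uint32_t group_shift = lmul_log2 < 0 ? 0 : (uint32_t)lmul_log2;
    uint32_t total_elems;
    struct tail_range r;

    assert(esz != 0);
    total_elems = (vlenb << group_shift) / esz;
    assert(vl <= total_elems);

    r.start = vl * esz;
    r.end = total_elems * esz;
    return r;
}
```

With a 16-byte register, LMUL = 1, 4-byte elements and vl = 3, the tail under these assumptions is the single element occupying bytes 12 to 15.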
[PATCH qemu v19 09/16] target/riscv: rvv: Add tail agnostic for vector integer comparison instructions
From: eopXD

Compares write mask registers, and so always operate under a
tail-agnostic policy.

Signed-off-by: eop Chen
Reviewed-by: Frank Chang
Reviewed-by: Weiwei Li
Acked-by: Alistair Francis
---
 target/riscv/vector_helper.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 9738c50222..b964b01a15 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -1370,6 +1370,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \
 {                                                             \
     uint32_t vm = vext_vm(desc);                              \
     uint32_t vl = env->vl;                                    \
+    uint32_t total_elems = env_archcpu(env)->cfg.vlen;        \
+    uint32_t vta_all_1s = vext_vta_all_1s(desc);              \
     uint32_t i;                                               \
                                                               \
     for (i = env->vstart; i < vl; i++) {                      \
@@ -1381,6 +1383,13 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \
         vext_set_elem_mask(vd, i, DO_OP(s2, s1));             \
     }                                                         \
     env->vstart = 0;                                          \
+    /* mask destination register are always tail-agnostic */  \
+    /* set tail elements to 1s */                             \
+    if (vta_all_1s) {                                         \
+        for (; i < total_elems; i++) {                        \
+            vext_set_elem_mask(vd, i, 1);                     \
+        }                                                     \
+    }                                                         \
 }
 
 GEN_VEXT_CMP_VV(vmseq_vv_b, uint8_t, H1, DO_MSEQ)
@@ -1419,6 +1428,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,   \
 {                                                                   \
     uint32_t vm = vext_vm(desc);                                    \
     uint32_t vl = env->vl;                                          \
+    uint32_t total_elems = env_archcpu(env)->cfg.vlen;              \
+    uint32_t vta_all_1s = vext_vta_all_1s(desc);                    \
     uint32_t i;                                                     \
                                                                     \
     for (i = env->vstart; i < vl; i++) {                            \
@@ -1430,6 +1441,13 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,   \
                 DO_OP(s2, (ETYPE)(target_long)s1));                 \
     }                                                               \
     env->vstart = 0;                                                \
+    /* mask destination register are always tail-agnostic */        \
+    /* set tail elements to 1s */                                   \
+    if (vta_all_1s) {                                               \
+        for (; i < total_elems; i++) {                              \
+            vext_set_elem_mask(vd, i, 1);                           \
+        }                                                           \
+    }                                                               \
 }
 
 GEN_VEXT_CMP_VX(vmseq_vx_b, uint8_t, H1, DO_MSEQ)
-- 
2.34.2
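The tail loop added by this patch runs over mask bits, not over SEW-sized elements: every bit from `vl` up to VLEN is set when the all-1s option is on. A minimal standalone sketch of that behavior follows. `set_mask_bit`, `get_mask_bit` and `fill_mask_tail` are hypothetical stand-ins rather than QEMU helpers, and the mask register is modeled as a plain little-endian byte array.

```c
#include <assert.h>
#include <stdint.h>

/* Set or clear mask bit `index` in a byte-addressed mask register. */
static void set_mask_bit(uint8_t *mask, uint32_t index, int value)
{
    uint8_t bit = (uint8_t)(1u << (index % 8));

    if (value) {
        mask[index / 8] |= bit;
    } else {
        mask[index / 8] &= (uint8_t)~bit;
    }
}

/* Read mask bit `index`. */
static int get_mask_bit(const uint8_t *mask, uint32_t index)
{
    return (mask[index / 8] >> (index % 8)) & 1;
}

/*
 * Fill mask bits [vl, total_bits) with 1s when the all-1s policy is
 * enabled; otherwise leave them undisturbed.
 */
static void fill_mask_tail(uint8_t *mask, uint32_t vl, uint32_t total_bits,
                           int all_1s)
{
    uint32_t i;

    assert(vl <= total_bits);
    if (!all_1s) {
        return;
    }
    for (i = vl; i < total_bits; i++) {
        set_mask_bit(mask, i, 1);
    }
}
```

Body bits below `vl` are never touched by the tail fill, so compare results already written for active elements are preserved.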
[PATCH qemu v19 15/16] target/riscv: rvv: Add tail agnostic for vector permutation instructions
From: eopXD

Signed-off-by: eop Chen
Reviewed-by: Frank Chang
Reviewed-by: Weiwei Li
Acked-by: Alistair Francis
---
 target/riscv/insn_trans/trans_rvv.c.inc | 7 +++--
 target/riscv/vector_helper.c            | 40 +
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index a94e634a6b..4f84d4878a 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -3669,7 +3669,7 @@ static bool trans_vrgather_vx(DisasContext *s, arg_rmrr *a)
         return false;
     }
 
-    if (a->vm && s->vl_eq_vlmax) {
+    if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) {
         int scale = s->lmul - (s->sew + 3);
         int vlmax = s->cfg_ptr->vlen >> -scale;
         TCGv_i64 dest = tcg_temp_new_i64();
@@ -3701,7 +3701,7 @@ static bool trans_vrgather_vi(DisasContext *s, arg_rmrr *a)
         return false;
     }
 
-    if (a->vm && s->vl_eq_vlmax) {
+    if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) {
         int scale = s->lmul - (s->sew + 3);
         int vlmax = s->cfg_ptr->vlen >> -scale;
         if (a->rs1 >= vlmax) {
@@ -3753,6 +3753,7 @@ static bool trans_vcompress_vm(DisasContext *s, arg_r *a)
         tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
 
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+        data = FIELD_DP32(data, VDATA, VTA, s->vta);
         tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),
                            vreg_ofs(s, a->rs1), vreg_ofs(s, a->rs2),
                            cpu_env, s->cfg_ptr->vlen / 8,
@@ -3853,6 +3854,8 @@ static bool int_ext_op(DisasContext *s, arg_rmr *a, uint8_t seq)
     }
 
     data = FIELD_DP32(data, VDATA, VM, a->vm);
+    data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+    data = FIELD_DP32(data, VDATA, VTA, s->vta);
 
     tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),
                        vreg_ofs(s, a->rs2), cpu_env,
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 5c2d1c02f4..2afbac6e37 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4930,6 +4930,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,   \
 {                                                                     \
     uint32_t vm = vext_vm(desc);                                      \
     uint32_t vl = env->vl;                                            \
+    uint32_t esz = sizeof(ETYPE);                                     \
+    uint32_t total_elems = vext_get_total_elems(env, desc, esz);      \
+    uint32_t vta = vext_vta(desc);                                    \
     target_ulong offset = s1, i_min, i;                               \
                                                                       \
     i_min = MAX(env->vstart, offset);                                 \
@@ -4939,6 +4942,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,   \
         }                                                             \
         *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i - offset));      \
     }                                                                 \
+    /* set tail elements to 1s */                                     \
+    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);          \
 }
 
 /* vslideup.vx vd, vs2, rs1, vm # vd[i+rs1] = vs2[i] */
@@ -4954,6 +4959,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,   \
     uint32_t vlmax = vext_max_elems(desc, ctzl(sizeof(ETYPE)));       \
     uint32_t vm = vext_vm(desc);                                      \
     uint32_t vl = env->vl;                                            \
+    uint32_t esz = sizeof(ETYPE);                                     \
+    uint32_t total_elems = vext_get_total_elems(env, desc, esz);      \
+    uint32_t vta = vext_vta(desc);                                    \
     target_ulong i_max, i;                                            \
                                                                       \
     i_max = MAX(MIN(s1 < vlmax ? vlmax - s1 : 0, vl), env->vstart);   \
@@ -4970,6 +4978,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,   \
     }                                                                 \
                                                                       \
     env->vstart = 0;                                                  \
+    /* set tail elements to 1s */                                     \
+    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);          \
 }
 
 /* vslidedown.vx vd, vs2, rs1, vm # vd[i] = vs2[i+rs1] */
@@ -4985,6 +4995,9 @@ static void vslide1up_##BITWIDTH(void *vd, void *v0, target_ulong s1,   \
 typedef uint#
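To make the slide-up hunk easier to follow, here is a minimal standalone model of an unmasked `vslideup.vx` on 16-bit elements with the tail fill applied afterwards. It is a sketch under stated assumptions: masking and the host-endian `H()` index mapping are left out, and `slideup_u16` with its parameters is a hypothetical name, not a QEMU function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Unmasked slide-up: vd[i] = vs2[i - offset] for
 * max(vstart, offset) <= i < vl. Elements below the offset keep their
 * old value. With vta set, elements [vl, total_elems) become all 1s.
 */
static void slideup_u16(uint16_t *vd, const uint16_t *vs2, uint32_t offset,
                        uint32_t vstart, uint32_t vl, uint32_t total_elems,
                        int vta)
{
    uint32_t i_min = vstart > offset ? vstart : offset;
    uint32_t i;

    assert(vl <= total_elems);
    for (i = i_min; i < vl; i++) {
        vd[i] = vs2[i - offset];
    }
    if (vta && vl < total_elems) {
        /* Tail fill works on bytes, so scale by the element size. */
        memset(vd + vl, 0xFF, (total_elems - vl) * sizeof(uint16_t));
    }
}
```

The tail fill comes last, after the body loop, which mirrors where the patch places its `vext_set_elems_1s()` call.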
[PATCH qemu v19 16/16] target/riscv: rvv: Add option 'rvv_ta_all_1s' to enable optional tail agnostic behavior
From: eopXD

According to the v-spec, tail-agnostic behavior can either keep the tail
elements undisturbed or set all of their bits to 1s. To distinguish
between tail policies, QEMU should be able to simulate the tail-agnostic
behavior as "set tail elements' bits to all 1s".

The v-spec allows multiple possibilities for agnostic elements. The main
intent of this patch set is to add an option that makes the tail policies
distinguishable. Setting agnostic elements to all 1s allows QEMU to
express this.

This commit adds the option 'rvv_ta_all_1s' to enable the behavior; it is
disabled by default.

Signed-off-by: eop Chen
Reviewed-by: Frank Chang
Reviewed-by: Weiwei Li
Reviewed-by: Alistair Francis
---
 target/riscv/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index bcbba3fbd5..99e7832d8a 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -918,6 +918,8 @@ static Property riscv_cpu_properties[] = {
     DEFINE_PROP_UINT64("resetvec", RISCVCPU, cfg.resetvec, DEFAULT_RSTVEC),
 
     DEFINE_PROP_BOOL("short-isa-string", RISCVCPU, cfg.short_isa_string, false),
+
+    DEFINE_PROP_BOOL("rvv_ta_all_1s", RISCVCPU, cfg.rvv_ta_all_1s, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.34.2
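The point of the option is that both outcomes are legal for an agnostic tail, so emulation only overwrites the tail when asked to. A minimal sketch of that decision follows. It assumes the option and the `vta` bit are simply ANDed together, which is an illustration and not necessarily how the series wires them. `apply_tail_policy` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * reg         : backing bytes of the destination register group
 * vl_bytes    : bytes covered by the body (vl * esz)
 * total_bytes : bytes in the whole register group
 * vta         : 1 if vtype selects tail-agnostic, 0 for undisturbed
 * ta_all_1s   : 1 if the (assumed) 'rvv_ta_all_1s' option is enabled
 */
static void apply_tail_policy(uint8_t *reg, uint32_t vl_bytes,
                              uint32_t total_bytes, int vta, int ta_all_1s)
{
    assert(vl_bytes <= total_bytes);
    if (vta && ta_all_1s) {
        /* Agnostic tail, option on: overwrite the tail with all 1s. */
        memset(reg + vl_bytes, 0xFF, total_bytes - vl_bytes);
    }
    /* Otherwise the tail keeps its old value, which is also legal. */
}
```

Guest code that wrongly relies on tail values surviving a tail-agnostic operation will see 0xFF bytes once the option is on, which is what makes the two policies distinguishable in testing.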
[PATCH qemu v19 07/16] target/riscv: rvv: Add tail agnostic for vx, vvm, vxm instructions
From: eopXD

`vmadc` and `vmsbc` produce a mask value, so they always operate with a
tail-agnostic policy.

Signed-off-by: eop Chen
Reviewed-by: Frank Chang
Reviewed-by: Weiwei Li
Acked-by: Alistair Francis
---
 target/riscv/insn_trans/trans_rvv.c.inc | 13 +-
 target/riscv/internals.h                | 5 +-
 target/riscv/vector_helper.c            | 314 +---
 3 files changed, 190 insertions(+), 142 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index cd73fd6119..22edf6228d 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -1293,6 +1293,8 @@ static bool opivx_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, uint32_t vm,
 
     data = FIELD_DP32(data, VDATA, VM, vm);
     data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+    data = FIELD_DP32(data, VDATA, VTA, s->vta);
+    data = FIELD_DP32(data, VDATA, VTA_ALL_1S, s->cfg_vta_all_1s);
     desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlen / 8,
                                       s->cfg_ptr->vlen / 8, data));
 
@@ -1328,7 +1330,7 @@ do_opivx_gvec(DisasContext *s, arg_rmrr *a, GVecGen2sFn *gvec_fn,
         return false;
     }
 
-    if (a->vm && s->vl_eq_vlmax) {
+    if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) {
         TCGv_i64 src1 = tcg_temp_new_i64();
 
         tcg_gen_ext_tl_i64(src1, get_gpr(s, a->rs1, EXT_SIGN));
@@ -1458,6 +1460,8 @@ static bool opivi_trans(uint32_t vd, uint32_t imm, uint32_t vs2, uint32_t vm,
 
     data = FIELD_DP32(data, VDATA, VM, vm);
     data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+    data = FIELD_DP32(data, VDATA, VTA, s->vta);
+    data = FIELD_DP32(data, VDATA, VTA_ALL_1S, s->cfg_vta_all_1s);
     desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlen / 8,
                                       s->cfg_ptr->vlen / 8, data));
 
@@ -1486,7 +1490,7 @@ do_opivi_gvec(DisasContext *s, arg_rmrr *a, GVecGen2iFn *gvec_fn,
         return false;
     }
 
-    if (a->vm && s->vl_eq_vlmax) {
+    if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) {
         gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2),
                 extract_imm(s, a->rs1, imm_mode), MAXSZ(s), MAXSZ(s));
         mark_vs_dirty(s);
@@ -1540,6 +1544,7 @@ static bool do_opivv_widen(DisasContext *s, arg_rmrr *a,
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+        data = FIELD_DP32(data, VDATA, VTA, s->vta);
         tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),
                            vreg_ofs(s, a->rs1),
                            vreg_ofs(s, a->rs2),
@@ -1621,6 +1626,7 @@ static bool do_opiwv_widen(DisasContext *s, arg_rmrr *a,
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+        data = FIELD_DP32(data, VDATA, VTA, s->vta);
         tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),
                            vreg_ofs(s, a->rs1),
                            vreg_ofs(s, a->rs2),
@@ -1699,6 +1705,9 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a)             \
                                                                    \
         data = FIELD_DP32(data, VDATA, VM, a->vm);                 \
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);             \
+        data = FIELD_DP32(data, VDATA, VTA, s->vta);               \
+        data =                                                     \
+            FIELD_DP32(data, VDATA, VTA_ALL_1S, s->cfg_vta_all_1s);\
         tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),     \
                            vreg_ofs(s, a->rs1),                    \
                            vreg_ofs(s, a->rs2), cpu_env,           \
diff --git a/target/riscv/internals.h b/target/riscv/internals.h
index 512c6c30cf..193ce57a6d 100644
--- a/target/riscv/internals.h
+++ b/target/riscv/internals.h
@@ -25,8 +25,9 @@
 FIELD(VDATA, VM, 0, 1)
 FIELD(VDATA, LMUL, 1, 3)
 FIELD(VDATA, VTA, 4, 1)
-FIELD(VDATA, NF, 5, 4)
-FIELD(VDATA, WD, 5, 1)
+FIELD(VDATA, VTA_ALL_1S, 5, 1)
+FIELD(VDATA, NF, 6, 4)
+FIELD(VDATA, WD, 6, 1)
 
 /* float point classify helpers */
 target_ulong fclass_h(uint64_t frs1);
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index cb14c321ea..3bcde37e28 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -127,6 +127,11 @@ static inline uint32_t vext_vta(uint32_t desc)
     return FIELD_EX32(simd_data(desc), VDATA, VTA);
 }
 
+static inline uint32_t vext_vta_all_1s(uint32_t desc)
+{
+    return FIELD_EX32(simd_data(desc), VDATA, VTA_ALL_1S);
+}
+
 /*
  * Get the maximum number of elements can be operated.
  *
@@ -866,10 +871,12 @@ RVVCALL(OPIVX2, vrsub_vx_d, OP_SSS_D, H8, H8, DO_RSUB)
 static void do_vext_vx(void *vd, void *v0, target_long s1, void *vs2,
                        CPURISCVState *env, uint32_t desc,
-
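The `internals.h` hunk inserts `VTA_ALL_1S` at bit 5 and moves `NF` and `WD` up to bit 6. A minimal standalone sketch of that layout is given below, using hand-written deposit and extract helpers in place of QEMU's `FIELD_DP32`/`FIELD_EX32` macros. The names `field_deposit`, `field_extract` and `pack_vdata` are hypothetical; only the shift and length values are taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions and widths as they appear in the patched internals.h. */
enum {
    VM_SHIFT = 0,         VM_LEN = 1,
    LMUL_SHIFT = 1,       LMUL_LEN = 3,
    VTA_SHIFT = 4,        VTA_LEN = 1,
    VTA_ALL_1S_SHIFT = 5, VTA_ALL_1S_LEN = 1,
    NF_SHIFT = 6,         NF_LEN = 4
};

/* Store `value` in the `len`-bit field that starts at bit `shift`. */
static uint32_t field_deposit(uint32_t word, unsigned shift, unsigned len,
                              uint32_t value)
{
    uint32_t mask;

    assert(len > 0 && len < 32 && shift + len <= 32);
    mask = ((1u << len) - 1u) << shift;
    return (word & ~mask) | ((value << shift) & mask);
}

/* Read the `len`-bit field that starts at bit `shift`. */
static uint32_t field_extract(uint32_t word, unsigned shift, unsigned len)
{
    assert(len > 0 && len < 32 && shift + len <= 32);
    return (word >> shift) & ((1u << len) - 1u);
}

/* Pack the fields the translator passes to a helper through `desc`. */
static uint32_t pack_vdata(uint32_t vm, uint32_t lmul, uint32_t vta,
                           uint32_t vta_all_1s)
{
    uint32_t data = 0;

    data = field_deposit(data, VM_SHIFT, VM_LEN, vm);
    data = field_deposit(data, LMUL_SHIFT, LMUL_LEN, lmul);
    data = field_deposit(data, VTA_SHIFT, VTA_LEN, vta);
    data = field_deposit(data, VTA_ALL_1S_SHIFT, VTA_ALL_1S_LEN, vta_all_1s);
    return data;
}
```

Because `NF` and `WD` share a starting bit, moving one without the other would make them overlap a neighbor, which is why the diff shifts both together.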
RE: [QEMU PATCH v2 0/6] Support ACPI NVDIMM Label Methods
Ping...

Best Regards,
Robert Hoo

> -----Original Message-----
> From: Robert Hoo
> Sent: Monday, May 30, 2022 11:41
> To: imamm...@redhat.com; m...@redhat.com;
> xiaoguangrong.e...@gmail.com; a...@anisinha.ca; Williams, Dan J;
> Liu, Jingqi
> Cc: qemu-devel@nongnu.org; Hu, Robert
> Subject: [QEMU PATCH v2 0/6] Support ACPI NVDIMM Label Methods
>
> (v1 Subject was "acpi/nvdimm: support NVDIMM _LS{I,R,W} methods")
>
> Originally, the NVDIMM Label methods were defined in the Intel PMEM _DSM
> Interface Spec [1], as function indexes 4, 5 and 6.
> The recent ACPI spec [2] has deprecated those _DSM methods in favor of the
> ACPI NVDIMM Label Methods _LS{I,R,W}. The essence of these functions is
> unchanged.
>
> This patch set updates the QEMU emulation accordingly, updates the
> bios-tables-test binaries, and substitutes trace events for
> nvdimm_debug().
>
> Patches 1 and 5 are the opening and closing bracket patches for changes
> affecting ACPI tables. For details, see tests/qtest/bios-tables-test.c.
> Patch 2 is a trivial fix to aml_or()/aml_and() usage.
> Patch 3 allows NVDIMM _DSM revision 2 to get in.
> Patch 4 is the main body. It implements the virtual _LS{I,R,W} methods
> and also generalizes the QEMU <--> ACPI NVDIMM method interface, which
> paves the way for implementing further methods in the future, not only
> _DSM. The resulting SSDT table changes in ASL can be found in Patch 5's
> commit message.
> Patch 6 defines trace events for acpi/nvdimm and replaces nvdimm_debug().
>
> Test
> Tested a Linux guest with a recent kernel, 5.18.0-rc4: creating and
> destroying namespaces, initializing labels, etc. work as before.
> Tested a Windows 10 (1607) guest and Windows Server 2019, but it seems
> vNVDIMM in a Windows guest has never been supported. Before and after
> this patch set, there is no difference in guest boot-up and other
> functions.
>
> [1] Intel PMEM _DSM Interface Spec v2.0, 3.10 Deprecated Functions
> https://pmem.io/documents/IntelOptanePMem_DSM_Interface-V2.0.pdf
> [2] ACPI Spec v6.4, 6.5.10 NVDIMM Label Methods
> https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf
>
> ---
> Change Log:
> v2:
> Almost rewritten
> Separate Patch 2
> Dance with tests/qtest/bios-table-tests
> Add trace events
>
> Robert Hoo (6):
>   tests/acpi: allow SSDT changes
>   acpi/ssdt: Fix aml_or() and aml_and() in if clause
>   acpi/nvdimm: NVDIMM _DSM Spec supports revision 2
>   nvdimm: Implement ACPI NVDIMM Label Methods
>   test/acpi/bios-tables-test: SSDT: update standard AML binaries
>   acpi/nvdimm: Define trace events for NVDIMM and substitute
>     nvdimm_debug()
>
>  hw/acpi/nvdimm.c                 | 434 +++
>  hw/acpi/trace-events             | 14 +
>  include/hw/mem/nvdimm.h          | 12 +-
>  tests/data/acpi/pc/SSDT.dimmpxm  | Bin 734 -> 1829 bytes
>  tests/data/acpi/q35/SSDT.dimmpxm | Bin 734 -> 1829 bytes
>  5 files changed, 344 insertions(+), 116 deletions(-)
>
>
> base-commit: 58b53669e87fed0d70903e05cd42079fbbdbc195
> --
> 2.31.1
[PATCH qemu v19 10/16] target/riscv: rvv: Add tail agnostic for vector integer merge and move instructions
From: eopXD

Signed-off-by: eop Chen
Reviewed-by: Frank Chang
Reviewed-by: Weiwei Li
Acked-by: Alistair Francis
---
 target/riscv/insn_trans/trans_rvv.c.inc | 12
 target/riscv/vector_helper.c            | 20
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index dbe687fb73..e75a2fd196 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -2086,12 +2086,13 @@ static bool trans_vmv_v_v(DisasContext *s, arg_vmv_v_v *a)
         vext_check_isa_ill(s) &&
         /* vmv.v.v has rs2 = 0 and vm = 1 */
         vext_check_sss(s, a->rd, a->rs1, 0, 1)) {
-        if (s->vl_eq_vlmax) {
+        if (s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) {
             tcg_gen_gvec_mov(s->sew, vreg_ofs(s, a->rd),
                              vreg_ofs(s, a->rs1),
                              MAXSZ(s), MAXSZ(s));
         } else {
             uint32_t data = FIELD_DP32(0, VDATA, LMUL, s->lmul);
+            data = FIELD_DP32(data, VDATA, VTA, s->vta);
             static gen_helper_gvec_2_ptr * const fns[4] = {
                 gen_helper_vmv_v_v_b, gen_helper_vmv_v_v_h,
                 gen_helper_vmv_v_v_w, gen_helper_vmv_v_v_d,
@@ -2126,7 +2127,7 @@ static bool trans_vmv_v_x(DisasContext *s, arg_vmv_v_x *a)
 
         s1 = get_gpr(s, a->rs1, EXT_SIGN);
 
-        if (s->vl_eq_vlmax) {
+        if (s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) {
             tcg_gen_gvec_dup_tl(s->sew, vreg_ofs(s, a->rd),
                                 MAXSZ(s), MAXSZ(s), s1);
         } else {
@@ -2134,6 +2135,7 @@ static bool trans_vmv_v_x(DisasContext *s, arg_vmv_v_x *a)
             TCGv_i64 s1_i64 = tcg_temp_new_i64();
             TCGv_ptr dest = tcg_temp_new_ptr();
             uint32_t data = FIELD_DP32(0, VDATA, LMUL, s->lmul);
+            data = FIELD_DP32(data, VDATA, VTA, s->vta);
             static gen_helper_vmv_vx * const fns[4] = {
                 gen_helper_vmv_v_x_b, gen_helper_vmv_v_x_h,
                 gen_helper_vmv_v_x_w, gen_helper_vmv_v_x_d,
@@ -2163,7 +2165,7 @@ static bool trans_vmv_v_i(DisasContext *s, arg_vmv_v_i *a)
         /* vmv.v.i has rs2 = 0 and vm = 1 */
         vext_check_ss(s, a->rd, 0, 1)) {
         int64_t simm = sextract64(a->rs1, 0, 5);
-        if (s->vl_eq_vlmax) {
+        if (s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) {
             tcg_gen_gvec_dup_imm(s->sew, vreg_ofs(s, a->rd),
                                  MAXSZ(s), MAXSZ(s), simm);
             mark_vs_dirty(s);
@@ -2172,6 +2174,7 @@ static bool trans_vmv_v_i(DisasContext *s, arg_vmv_v_i *a)
             TCGv_i64 s1;
             TCGv_ptr dest;
             uint32_t data = FIELD_DP32(0, VDATA, LMUL, s->lmul);
+            data = FIELD_DP32(data, VDATA, VTA, s->vta);
             static gen_helper_vmv_vx * const fns[4] = {
                 gen_helper_vmv_v_x_b, gen_helper_vmv_v_x_h,
                 gen_helper_vmv_v_x_w, gen_helper_vmv_v_x_d,
@@ -2743,7 +2746,7 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a)
 
         TCGv_i64 t1;
 
-        if (s->vl_eq_vlmax) {
+        if (s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) {
             t1 = tcg_temp_new_i64();
             /* NaN-box f[rs1] */
             do_nanbox(s, t1, cpu_fpr[a->rs1]);
@@ -2755,6 +2758,7 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a)
             TCGv_ptr dest;
             TCGv_i32 desc;
             uint32_t data = FIELD_DP32(0, VDATA, LMUL, s->lmul);
+            data = FIELD_DP32(data, VDATA, VTA, s->vta);
             static gen_helper_vmv_vx * const fns[3] = {
                 gen_helper_vmv_v_x_h,
                 gen_helper_vmv_v_x_w,
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index b964b01a15..2bf670920a 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -1968,6 +1968,9 @@ void HELPER(NAME)(void *vd, void *vs1, CPURISCVState *env,           \
                   uint32_t desc)                                     \
 {                                                                    \
     uint32_t vl = env->vl;                                           \
+    uint32_t esz = sizeof(ETYPE);                                    \
+    uint32_t total_elems = vext_get_total_elems(env, desc, esz);     \
+    uint32_t vta = vext_vta(desc);                                   \
     uint32_t i;                                                      \
                                                                      \
     for (i = env->vstart; i < vl; i++) {                             \
@@ -1975,6 +1978,8 @@ void HELPER(NAME)(void *vd, void *vs1, CPURISCVState *env,           \
         *((ETYPE *)vd + H(i)) = s1;                                  \
     }                                                                \
     env->vstart = 0;
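Every fast path in this patch gains the same extra condition, `!(s->vta && s->lmul < 0)`. The sketch below spells out the reasoning as I read it: with a fractional LMUL, vlmax elements cover only part of a register, so a tail exists inside the register even when vl equals vlmax, and the inline gvec expansion would not write it. `can_use_gvec_fast_path` is a hypothetical name, and `lmul_log2` stands for the signed log2 of LMUL that `s->lmul` appears to hold.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether the inline gvec expansion may replace the out-of-line
 * helper. The helper is needed whenever a tail must be written:
 *   - vl != vlmax: the tail starts at vl
 *   - vta set and LMUL fractional: vlmax elements fill only part of the
 *     register, so the rest of it is tail
 */
static bool can_use_gvec_fast_path(bool vl_eq_vlmax, bool vta, int lmul_log2)
{
    bool tail_inside_register;

    /* LMUL ranges from 1/8 to 8, that is, log2 from -3 to 3. */
    assert(lmul_log2 >= -3 && lmul_log2 <= 3);

    tail_inside_register = vta && lmul_log2 < 0;
    return vl_eq_vlmax && !tail_inside_register;
}
```

With tail-undisturbed (vta clear) the fast path remains valid for fractional LMUL, since nothing beyond vlmax has to change.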