date:20230714

Re: [PATCH v2 07/11] hw/char/pl011: Extract pl011_read_rxdata() from pl011_read()

2023-07-14 Thread Richard Henderson


On 7/10/23 18:50, Philippe Mathieu-Daudé wrote:

+if (s->read_count == s->read_trigger - 1)
+s->int_level &= ~ INT_RX;


Fix the braces.  Otherwise,
Reviewed-by: Richard Henderson 

r~

Re: [PATCH v2 08/11] hw/char/pl011: Warn when using disabled transmitter

2023-07-14 Thread Richard Henderson


On 7/10/23 18:50, Philippe Mathieu-Daudé wrote:

We shouldn't transmit characters when the full UART or its
transmitter is disabled. However we don't want to break the
possibly incomplete "my first bare metal assembly program"s,
so we choose to simply display a warning when this occurs.
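
The policy described above (warn, but still transmit) could be sketched roughly as follows. This is not the code from the patch: the struct and function names are hypothetical, the messages go to stderr rather than QEMU's guest-error log, and the bit positions are taken from the PL011 control register layout (UARTEN is bit 0, TXE is bit 8).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* UARTCR bits as laid out in the PL011 control register. */
#define CR_UARTEN (1u << 0) /* UART enable */
#define CR_TXE    (1u << 8) /* transmit enable */

/* Hypothetical, minimal stand-in for the device state. */
typedef struct {
    uint32_t cr; /* control register */
} Pl011Sketch;

/*
 * Warn about a data-register write while the UART or its transmitter
 * is disabled. Returns the number of warnings printed; the caller
 * transmits the character regardless, so old guests keep working.
 */
static int warn_if_tx_disabled(const Pl011Sketch *s)
{
    int warnings = 0;

    assert(s != NULL);
    if (!(s->cr & CR_UARTEN)) {
        fprintf(stderr, "PL011: data written while UART is disabled\n");
        warnings++;
    }
    if (!(s->cr & CR_TXE)) {
        fprintf(stderr, "PL011: data written while transmitter is disabled\n");
        warnings++;
    }
    return warnings;
}
```

Keeping the check separate from the transmit path makes it easy to turn the warning into a hard drop later, should that ever be wanted.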

Signed-off-by: Philippe Mathieu-Daudé
Reviewed-by: Alex Bennée
---
  hw/char/pl011.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v2 09/11] hw/char/pl011: Check if receiver is enabled

2023-07-14 Thread Richard Henderson


On 7/10/23 18:51, Philippe Mathieu-Daudé wrote:

Do not receive characters when UART or receiver are disabled.
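
A minimal sketch of such a receive gate, assuming a 16-entry RX FIFO and the PL011 control register bits UARTEN (bit 0) and RXE (bit 9). The struct and function names are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CR_UARTEN (1u << 0) /* UART enable */
#define CR_RXE    (1u << 9) /* receive enable */

#define RX_FIFO_DEPTH 16u   /* assumed FIFO depth for this sketch */

/* Hypothetical, minimal stand-in for the device state. */
typedef struct {
    uint32_t cr;         /* control register */
    unsigned read_count; /* characters currently queued in the RX FIFO */
} Pl011RxSketch;

/* How many characters may the character backend push right now? */
static unsigned rx_can_receive(const Pl011RxSketch *s)
{
    assert(s != NULL && s->read_count <= RX_FIFO_DEPTH);
    if (!(s->cr & CR_UARTEN) || !(s->cr & CR_RXE)) {
        return 0; /* UART or receiver disabled: accept nothing */
    }
    return RX_FIFO_DEPTH - s->read_count;
}
```

Returning 0 leaves the characters with the backend instead of discarding them, so nothing is lost if the guest enables the receiver later.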

Signed-off-by: Philippe Mathieu-Daudé
---
  hw/char/pl011.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)


I guess this doesn't fall under "my first assembly program" because it isn't part of 
"Hello, World"?


Anyway, for real stuffz:

Reviewed-by: Richard Henderson 


r~

Re: [PATCH v2 10/11] hw/char/pl011: Rename RX FIFO methods

2023-07-14 Thread Richard Henderson


On 7/10/23 18:51, Philippe Mathieu-Daudé wrote:

In preparation of having a TX FIFO, rename the RX FIFO methods.

Signed-off-by: Philippe Mathieu-Daudé
Reviewed-by: Alex Bennée
---
  hw/char/pl011.c  | 10 +-
  hw/char/trace-events |  4 ++--
  2 files changed, 7 insertions(+), 7 deletions(-)


Reviewed-by: Richard Henderson 

r~

[PATCH v2 00/11] tpm: introduce TPM CRB SysBus device

2023-07-14 Thread Joelle van Dyne

The impetus for this patch set is to get TPM 2.0 working on Windows 11 ARM64.
Windows' tpm.sys does not seem to work on a TPM TIS device (as verified with
VMWare's implementation). However, the current TPM CRB device uses a fixed
system bus address that is reserved for RAM in ARM64 Virt machines.

In the process of adding the TPM CRB SysBus device, we also went ahead and
cleaned up some of the existing TPM hardware code and fixed some bugs. We used
the TPM TIS devices as a template for the TPM CRB devices and refactored out
common code. We moved the ACPI DSDT generation to the device in order to handle
dynamic base address requirements as well as reduce redundant code in different
machine ACPI generation. We also changed the tpm_crb device to use the ISA bus
instead of depending on the default system bus as the device only was built for
the PC configuration.

Another change is that the TPM CRB registers are now mapped in the same way that
the pflash ROM devices are mapped. It is a memory region whose writes are
trapped as MMIO accesses. This was needed because Apple Silicon does not decode
page faults caused by LDP (AArch64 load pair of registers). @agraf suggested that
we do this to avoid having to do AARCH64 decoding in the HVF backend's fault
handler.

Unfortunately, it seems like the LDP fault still happens on HVF but the issue
seems to be in the HVF backend which needs to be fixed in a separate patch.

One last thing that's needed to get Windows 11 to recognize the TPM 2.0 device
is for the OVMF firmware to set up the TPM device. Currently, OVMF for ARM64 Virt
only recognizes the TPM TIS device through a FDT entry. A workaround is to
falsely identify the TPM CRB device as a TPM TIS device in the FDT node but this
causes issues for Linux. A proper fix would involve adding an ACPI device driver
in OVMF.

v2:
- Fixed an issue where VMstate restore from an older version failed due to name
  collision of the memory block.
- In the ACPI table generation for CRB devices, the check for TPM 2.0 backend is
  moved to the device realize as CRB does not support TPM 1.0. It will error in
  that case.
- Dropped the patch to fix crash when PPI is enabled on TIS SysBus device since
  a separate patch submitted by Stefan Berger disables such an option.
- Fixed an issue where we default tpmEstablished=0 when it should be 1.
- In TPM CRB SysBus's ACPI entry, we accidentally changed _UID from 0 to 1. This
  shouldn't be an issue but we changed it back just in case.
- Added a patch to migrate saved VMstate from an older version with the regs
  saved separately instead of as a RAM block.

Joelle van Dyne (11):
  tpm_crb: refactor common code
  tpm_crb: CTRL_RSP_ADDR is 64-bits wide
  tpm_ppi: refactor memory space initialization
  tpm_crb: use a single read-as-mem/write-as-mmio mapping
  tpm_crb: use the ISA bus
  tpm_crb: move ACPI table building to device interface
  hw/arm/virt: add plug handler for TPM on SysBus
  hw/loongarch/virt: add plug handler for TPM on SysBus
  tpm_tis_sysbus: move DSDT AML generation to device
  tpm_crb_sysbus: introduce TPM CRB SysBus device
  tpm_crb: support restoring older vmstate

 docs/specs/tpm.rst  |   2 +
 hw/tpm/tpm_crb.h|  75 +
 hw/tpm/tpm_ppi.h|  10 +-
 include/hw/acpi/aml-build.h |   1 +
 include/hw/acpi/tpm.h   |   3 +-
 include/sysemu/tpm.h|   3 +
 hw/acpi/aml-build.c |   7 +-
 hw/arm/virt-acpi-build.c|  38 +
 hw/arm/virt.c   |  38 +
 hw/core/sysbus-fdt.c|   1 +
 hw/i386/acpi-build.c|  23 ---
 hw/loongarch/acpi-build.c   |  38 +
 hw/loongarch/virt.c |  38 +
 hw/riscv/virt.c |   1 +
 hw/tpm/tpm_crb.c| 314 +---
 hw/tpm/tpm_crb_common.c | 233 ++
 hw/tpm/tpm_crb_sysbus.c | 170 +++
 hw/tpm/tpm_ppi.c|   5 +-
 hw/tpm/tpm_tis_isa.c|   5 +-
 hw/tpm/tpm_tis_sysbus.c |  35 
 tests/qtest/tpm-crb-test.c  |   2 +-
 tests/qtest/tpm-util.c  |   2 +-
 hw/arm/Kconfig  |   1 +
 hw/riscv/Kconfig|   1 +
 hw/tpm/Kconfig  |   7 +-
 hw/tpm/meson.build  |   3 +
 hw/tpm/trace-events |   2 +-
 27 files changed, 708 insertions(+), 350 deletions(-)
 create mode 100644 hw/tpm/tpm_crb.h
 create mode 100644 hw/tpm/tpm_crb_common.c
 create mode 100644 hw/tpm/tpm_crb_sysbus.c

-- 
2.39.2 (Apple Git-143)

[PATCH v2 01/11] tpm_crb: refactor common code

2023-07-14 Thread Joelle van Dyne

In preparation for the SysBus variant, we move common code styled
after the TPM TIS devices.

To maintain compatibility, we do not rename the existing tpm-crb
device.
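
The shape of this refactoring, with bus-independent state embedded in thin bus-specific wrappers, can be sketched as below. All names are hypothetical stand-ins (`CommonState` plays the role of `TPMCRBState`), and the reset body is a placeholder rather than the real register setup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bus-independent state shared by all variants. */
typedef struct {
    unsigned resets; /* placeholder for the real register state */
} CommonState;

/* Bus-specific wrappers embed the shared state by value. */
typedef struct {
    int isa_placeholder;
    CommonState state;
} IsaVariant;

typedef struct {
    uint64_t baseaddr;
    CommonState state;
} SysBusVariant;

/* Shared logic only ever sees CommonState. */
static void common_reset(CommonState *s)
{
    assert(s != NULL);
    s->resets++;
}

/* Each variant forwards to the shared code with its embedded state. */
static void isa_reset(IsaVariant *d)
{
    common_reset(&d->state);
}

static void sysbus_reset(SysBusVariant *d)
{
    common_reset(&d->state);
}
```

Because the wrappers only forward, a new bus variant needs no changes to the shared file.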

Signed-off-by: Joelle van Dyne 
Reviewed-by: Stefan Berger 
---
 docs/specs/tpm.rst  |   1 +
 hw/tpm/tpm_crb.h|  76 +++
 hw/tpm/tpm_crb.c| 270 ++--
 hw/tpm/tpm_crb_common.c | 218 
 hw/tpm/meson.build  |   1 +
 hw/tpm/trace-events |   2 +-
 6 files changed, 333 insertions(+), 235 deletions(-)
 create mode 100644 hw/tpm/tpm_crb.h
 create mode 100644 hw/tpm/tpm_crb_common.c

diff --git a/docs/specs/tpm.rst b/docs/specs/tpm.rst
index efe124a148..2bc29c9804 100644
--- a/docs/specs/tpm.rst
+++ b/docs/specs/tpm.rst
@@ -45,6 +45,7 @@ operating system.
 
 QEMU files related to TPM CRB interface:
  - ``hw/tpm/tpm_crb.c``
+ - ``hw/tpm/tpm_crb_common.c``
 
 SPAPR interface
 ---
diff --git a/hw/tpm/tpm_crb.h b/hw/tpm/tpm_crb.h
new file mode 100644
index 00..da3a0cf256
--- /dev/null
+++ b/hw/tpm/tpm_crb.h
@@ -0,0 +1,76 @@
+/*
+ * tpm_crb.h - QEMU's TPM CRB interface emulator
+ *
+ * Copyright (c) 2018 Red Hat, Inc.
+ *
+ * Authors:
+ *   Marc-André Lureau 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ * tpm_crb is a device for TPM 2.0 Command Response Buffer (CRB) Interface
+ * as defined in TCG PC Client Platform TPM Profile (PTP) Specification
+ * Family “2.0” Level 00 Revision 01.03 v22
+ */
+#ifndef TPM_TPM_CRB_H
+#define TPM_TPM_CRB_H
+
+#include "exec/memory.h"
+#include "hw/acpi/tpm.h"
+#include "sysemu/tpm_backend.h"
+#include "tpm_ppi.h"
+
+#define CRB_CTRL_CMD_SIZE (TPM_CRB_ADDR_SIZE - A_CRB_DATA_BUFFER)
+
+typedef struct TPMCRBState {
+TPMBackend *tpmbe;
+TPMBackendCmd cmd;
+uint32_t regs[TPM_CRB_R_MAX];
+MemoryRegion mmio;
+MemoryRegion cmdmem;
+
+size_t be_buffer_size;
+
+bool ppi_enabled;
+TPMPPI ppi;
+} TPMCRBState;
+
+#define CRB_INTF_TYPE_CRB_ACTIVE 0b1
+#define CRB_INTF_VERSION_CRB 0b1
+#define CRB_INTF_CAP_LOCALITY_0_ONLY 0b0
+#define CRB_INTF_CAP_IDLE_FAST 0b0
+#define CRB_INTF_CAP_XFER_SIZE_64 0b11
+#define CRB_INTF_CAP_FIFO_NOT_SUPPORTED 0b0
+#define CRB_INTF_CAP_CRB_SUPPORTED 0b1
+#define CRB_INTF_IF_SELECTOR_CRB 0b1
+
+enum crb_loc_ctrl {
+CRB_LOC_CTRL_REQUEST_ACCESS = BIT(0),
+CRB_LOC_CTRL_RELINQUISH = BIT(1),
+CRB_LOC_CTRL_SEIZE = BIT(2),
+CRB_LOC_CTRL_RESET_ESTABLISHMENT_BIT = BIT(3),
+};
+
+enum crb_ctrl_req {
+CRB_CTRL_REQ_CMD_READY = BIT(0),
+CRB_CTRL_REQ_GO_IDLE = BIT(1),
+};
+
+enum crb_start {
+CRB_START_INVOKE = BIT(0),
+};
+
+enum crb_cancel {
+CRB_CANCEL_INVOKE = BIT(0),
+};
+
+#define TPM_CRB_NO_LOCALITY 0xff
+
+void tpm_crb_request_completed(TPMCRBState *s, int ret);
+enum TPMVersion tpm_crb_get_version(TPMCRBState *s);
+int tpm_crb_pre_save(TPMCRBState *s);
+void tpm_crb_reset(TPMCRBState *s, uint64_t baseaddr);
+void tpm_crb_init_memory(Object *obj, TPMCRBState *s, Error **errp);
+
+#endif /* TPM_TPM_CRB_H */
diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index ea930da545..3ef4977fb5 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -31,257 +31,62 @@
 #include "tpm_ppi.h"
 #include "trace.h"
 #include "qom/object.h"
+#include "tpm_crb.h"
 
 struct CRBState {
 DeviceState parent_obj;
 
-TPMBackend *tpmbe;
-TPMBackendCmd cmd;
-uint32_t regs[TPM_CRB_R_MAX];
-MemoryRegion mmio;
-MemoryRegion cmdmem;
-
-size_t be_buffer_size;
-
-bool ppi_enabled;
-TPMPPI ppi;
+TPMCRBState state;
 };
 typedef struct CRBState CRBState;
 
 DECLARE_INSTANCE_CHECKER(CRBState, CRB,
  TYPE_TPM_CRB)
 
-#define CRB_INTF_TYPE_CRB_ACTIVE 0b1
-#define CRB_INTF_VERSION_CRB 0b1
-#define CRB_INTF_CAP_LOCALITY_0_ONLY 0b0
-#define CRB_INTF_CAP_IDLE_FAST 0b0
-#define CRB_INTF_CAP_XFER_SIZE_64 0b11
-#define CRB_INTF_CAP_FIFO_NOT_SUPPORTED 0b0
-#define CRB_INTF_CAP_CRB_SUPPORTED 0b1
-#define CRB_INTF_IF_SELECTOR_CRB 0b1
-
-#define CRB_CTRL_CMD_SIZE (TPM_CRB_ADDR_SIZE - A_CRB_DATA_BUFFER)
-
-enum crb_loc_ctrl {
-CRB_LOC_CTRL_REQUEST_ACCESS = BIT(0),
-CRB_LOC_CTRL_RELINQUISH = BIT(1),
-CRB_LOC_CTRL_SEIZE = BIT(2),
-CRB_LOC_CTRL_RESET_ESTABLISHMENT_BIT = BIT(3),
-};
-
-enum crb_ctrl_req {
-CRB_CTRL_REQ_CMD_READY = BIT(0),
-CRB_CTRL_REQ_GO_IDLE = BIT(1),
-};
-
-enum crb_start {
-CRB_START_INVOKE = BIT(0),
-};
-
-enum crb_cancel {
-CRB_CANCEL_INVOKE = BIT(0),
-};
-
-#define TPM_CRB_NO_LOCALITY 0xff
-
-static uint64_t tpm_crb_mmio_read(void *opaque, hwaddr addr,
-  unsigned size)
-{
-CRBState *s = CRB(opaque);
-void *regs = (void *)&s->regs + (addr & ~3);
-unsigned offset = addr & 3;
-uint32_t val = *(uint32_t *)regs >> (8 * offset);
-
-switch (addr) {
-case A_CRB_LOC_STATE:
-val |= !tpm_backend_get_tpm_established_flag(

[PATCH v2 06/11] tpm_crb: move ACPI table building to device interface

2023-07-14 Thread Joelle van Dyne

This logic is similar to TPM TIS ISA device. Since TPM CRB can only
support TPM 2.0 backends, we check for this in realize.

Signed-off-by: Joelle van Dyne 
---
 hw/i386/acpi-build.c | 23 ---
 hw/tpm/tpm_crb.c | 29 +
 2 files changed, 29 insertions(+), 23 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 9c74fa17ad..b767df39df 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1441,9 +1441,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 uint32_t nr_mem = machine->ram_slots;
 int root_bus_limit = 0xFF;
 PCIBus *bus = NULL;
-#ifdef CONFIG_TPM
-TPMIf *tpm = tpm_find();
-#endif
 bool cxl_present = false;
 int i;
 VMBusBridge *vmbus_bridge = vmbus_bridge_find();
@@ -1793,26 +1790,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 }
 }
 
-#ifdef CONFIG_TPM
-if (TPM_IS_CRB(tpm)) {
-dev = aml_device("TPM");
-aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101")));
-aml_append(dev, aml_name_decl("_STR",
-  aml_string("TPM 2.0 Device")));
-crs = aml_resource_template();
-aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE,
-   TPM_CRB_ADDR_SIZE, AML_READ_WRITE));
-aml_append(dev, aml_name_decl("_CRS", crs));
-
-aml_append(dev, aml_name_decl("_STA", aml_int(0xf)));
-aml_append(dev, aml_name_decl("_UID", aml_int(1)));
-
-tpm_build_ppi_acpi(tpm, dev);
-
-aml_append(sb_scope, dev);
-}
-#endif
-
 if (pcms->sgx_epc.size != 0) {
 uint64_t epc_base = pcms->sgx_epc.base;
 uint64_t epc_size = pcms->sgx_epc.size;
diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index 6144081d30..594696ffb8 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -19,6 +19,8 @@
 #include "qemu/module.h"
 #include "qapi/error.h"
 #include "exec/address-spaces.h"
+#include "hw/acpi/acpi_aml_interface.h"
+#include "hw/acpi/tpm.h"
 #include "hw/qdev-properties.h"
 #include "hw/pci/pci_ids.h"
 #include "hw/acpi/tpm.h"
@@ -99,6 +101,11 @@ static void tpm_crb_isa_realize(DeviceState *dev, Error **errp)
 return;
 }
 
+if (tpm_crb_isa_get_version(TPM_IF(s)) != TPM_VERSION_2_0) {
+error_setg(errp, "TPM CRB only supports TPM 2.0 backends");
+return;
+}
+
 tpm_crb_init_memory(OBJECT(s), &s->state, errp);
 
 memory_region_add_subregion(isa_address_space(ISA_DEVICE(dev)),
@@ -116,10 +123,30 @@ static void tpm_crb_isa_realize(DeviceState *dev, Error **errp)
 }
 }
 
+static void build_tpm_crb_isa_aml(AcpiDevAmlIf *adev, Aml *scope)
+{
+Aml *dev, *crs;
+CRBState *s = CRB(adev);
+TPMIf *ti = TPM_IF(s);
+
+dev = aml_device("TPM");
+aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101")));
+aml_append(dev, aml_name_decl("_STR", aml_string("TPM 2.0 Device")));
+aml_append(dev, aml_name_decl("_UID", aml_int(1)));
+aml_append(dev, aml_name_decl("_STA", aml_int(0xF)));
+crs = aml_resource_template();
+aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE, TPM_CRB_ADDR_SIZE,
+  AML_READ_WRITE));
+aml_append(dev, aml_name_decl("_CRS", crs));
+tpm_build_ppi_acpi(ti, dev);
+aml_append(scope, dev);
+}
+
 static void tpm_crb_isa_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 TPMIfClass *tc = TPM_IF_CLASS(klass);
+AcpiDevAmlIfClass *adevc = ACPI_DEV_AML_IF_CLASS(klass);
 
 dc->realize = tpm_crb_isa_realize;
 device_class_set_props(dc, tpm_crb_isa_properties);
@@ -128,6 +155,7 @@ static void tpm_crb_isa_class_init(ObjectClass *klass, void *data)
 tc->model = TPM_MODEL_TPM_CRB;
 tc->get_version = tpm_crb_isa_get_version;
 tc->request_completed = tpm_crb_isa_request_completed;
+adevc->build_dev_aml = build_tpm_crb_isa_aml;
 
 set_bit(DEVICE_CATEGORY_MISC, dc->categories);
 }
@@ -139,6 +167,7 @@ static const TypeInfo tpm_crb_isa_info = {
 .class_init  = tpm_crb_isa_class_init,
 .interfaces = (InterfaceInfo[]) {
 { TYPE_TPM_IF },
+{ TYPE_ACPI_DEV_AML_IF },
 { }
 }
 };
-- 
2.39.2 (Apple Git-143)

[PATCH v2 02/11] tpm_crb: CTRL_RSP_ADDR is 64-bits wide

2023-07-14 Thread Joelle van Dyne

The register is actually 64 bits wide, but to make this clearer than
the specification, we define two 32-bit registers:
CTRL_RSP_LADDR and CTRL_RSP_HADDR to match the CTRL_CMD_* naming. This
deviates from the specs but is much clearer.

Previously, the only CRB device used a fixed system address so this
was not an issue. However, once we support SysBus CRB device, the
address can be anywhere in 64-bit space.
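
To illustrate why the upper half matters, here is a small sketch of splitting a 64-bit base address into low and high 32-bit register values and joining them again. The `AddrPair` type and both helpers are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair mirroring the *_LADDR / *_HADDR register naming. */
typedef struct {
    uint32_t laddr; /* bits 31..0 */
    uint32_t haddr; /* bits 63..32 */
} AddrPair;

/* Reassemble the 64-bit address from its two halves. */
static uint64_t join_addr(AddrPair p)
{
    return ((uint64_t)p.haddr << 32) | p.laddr;
}

/* Split a 64-bit address the same way the reset code fills the registers. */
static AddrPair split_addr(uint64_t baseaddr)
{
    AddrPair p = {
        .laddr = (uint32_t)baseaddr,
        .haddr = (uint32_t)(baseaddr >> 32),
    };

    /* The two halves must reassemble into the original address. */
    assert(join_addr(p) == baseaddr);
    return p;
}
```

With only the low register, any base address at or above 4 GiB would be silently truncated, which is harmless for a fixed low address but wrong for a device that can be placed anywhere.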

Signed-off-by: Joelle van Dyne 
Reviewed-by: Stefan Berger 
---
 include/hw/acpi/tpm.h  | 3 ++-
 hw/tpm/tpm_crb_common.c| 3 ++-
 tests/qtest/tpm-crb-test.c | 2 +-
 tests/qtest/tpm-util.c | 2 +-
 4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/hw/acpi/tpm.h b/include/hw/acpi/tpm.h
index 579c45f5ba..f60bfe2789 100644
--- a/include/hw/acpi/tpm.h
+++ b/include/hw/acpi/tpm.h
@@ -174,7 +174,8 @@ REG32(CRB_CTRL_CMD_SIZE, 0x58)
 REG32(CRB_CTRL_CMD_LADDR, 0x5C)
 REG32(CRB_CTRL_CMD_HADDR, 0x60)
 REG32(CRB_CTRL_RSP_SIZE, 0x64)
-REG32(CRB_CTRL_RSP_ADDR, 0x68)
+REG32(CRB_CTRL_RSP_LADDR, 0x68)
+REG32(CRB_CTRL_RSP_HADDR, 0x6C)
 REG32(CRB_DATA_BUFFER, 0x80)
 
 #define TPM_CRB_ADDR_BASE   0xFED4
diff --git a/hw/tpm/tpm_crb_common.c b/hw/tpm/tpm_crb_common.c
index 4c173affb6..228e2d0faf 100644
--- a/hw/tpm/tpm_crb_common.c
+++ b/hw/tpm/tpm_crb_common.c
@@ -199,7 +199,8 @@ void tpm_crb_reset(TPMCRBState *s, uint64_t baseaddr)
 s->regs[R_CRB_CTRL_CMD_LADDR] = (uint32_t)baseaddr;
 s->regs[R_CRB_CTRL_CMD_HADDR] = (uint32_t)(baseaddr >> 32);
 s->regs[R_CRB_CTRL_RSP_SIZE] = CRB_CTRL_CMD_SIZE;
-s->regs[R_CRB_CTRL_RSP_ADDR] = (uint32_t)baseaddr;
+s->regs[R_CRB_CTRL_RSP_LADDR] = (uint32_t)baseaddr;
+s->regs[R_CRB_CTRL_RSP_HADDR] = (uint32_t)(baseaddr >> 32);
 
 s->be_buffer_size = MIN(tpm_backend_get_buffer_size(s->tpmbe),
 CRB_CTRL_CMD_SIZE);
diff --git a/tests/qtest/tpm-crb-test.c b/tests/qtest/tpm-crb-test.c
index 396ae3f91c..9d30fe8293 100644
--- a/tests/qtest/tpm-crb-test.c
+++ b/tests/qtest/tpm-crb-test.c
@@ -28,7 +28,7 @@ static void tpm_crb_test(const void *data)
 uint32_t csize = readl(TPM_CRB_ADDR_BASE + A_CRB_CTRL_CMD_SIZE);
 uint64_t caddr = readq(TPM_CRB_ADDR_BASE + A_CRB_CTRL_CMD_LADDR);
 uint32_t rsize = readl(TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_SIZE);
-uint64_t raddr = readq(TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_ADDR);
+uint64_t raddr = readq(TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_LADDR);
 uint8_t locstate = readb(TPM_CRB_ADDR_BASE + A_CRB_LOC_STATE);
 uint32_t locctrl = readl(TPM_CRB_ADDR_BASE + A_CRB_LOC_CTRL);
 uint32_t locsts = readl(TPM_CRB_ADDR_BASE + A_CRB_LOC_STS);
diff --git a/tests/qtest/tpm-util.c b/tests/qtest/tpm-util.c
index 1c0319e6e7..dd02057fc0 100644
--- a/tests/qtest/tpm-util.c
+++ b/tests/qtest/tpm-util.c
@@ -25,7 +25,7 @@ void tpm_util_crb_transfer(QTestState *s,
unsigned char *rsp, size_t rsp_size)
 {
 uint64_t caddr = qtest_readq(s, TPM_CRB_ADDR_BASE + A_CRB_CTRL_CMD_LADDR);
-uint64_t raddr = qtest_readq(s, TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_ADDR);
+uint64_t raddr = qtest_readq(s, TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_LADDR);
 
 qtest_writeb(s, TPM_CRB_ADDR_BASE + A_CRB_LOC_CTRL, 1);
 
-- 
2.39.2 (Apple Git-143)

[PATCH v2 10/11] tpm_crb_sysbus: introduce TPM CRB SysBus device

2023-07-14 Thread Joelle van Dyne

This SysBus variant of the CRB interface supports dynamically locating
the MMIO interface so that Virt machines can use it. This interface
is currently the only one supported by QEMU that works on Windows 11
ARM64. We largely follow the TPM TIS SysBus device as a template.
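
Since the base address is no longer a compile-time constant, anything derived from it has to be computed at run time. A minimal sketch for the control area start address follows; the 0x40 offset of the control request register is an assumption based on the CRB register layout, and the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed offset of the control request register within the CRB block. */
#define CRB_CTRL_REQ_OFFSET 0x40u

/* Control area start for a device mapped at a dynamic base address. */
static uint64_t control_area_start(uint64_t baseaddr)
{
    /* Guard against wrap-around at the top of the address space. */
    assert(baseaddr <= UINT64_MAX - CRB_CTRL_REQ_OFFSET);
    return baseaddr + CRB_CTRL_REQ_OFFSET;
}
```

The example base address used in the check below, 0x0C000000, is arbitrary and not tied to any machine's memory map.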

To try out this device with Windows 11 before OVMF is updated, you
will need to modify `sysbus-fdt.c` and change the added line from:

```c
TYPE_BINDING(TYPE_TPM_CRB_SYSBUS, no_fdt_node),
```

to

```c
TYPE_BINDING(TYPE_TPM_CRB_SYSBUS, add_tpm_tis_fdt_node),
```

This change was not included because it can confuse Linux (although
from testing, it seems like Linux is able to properly ignore the
device from the TPM TIS driver and recognize it from the ACPI device
in the TPM CRB driver). A proper fix would require OVMF to recognize
the ACPI device and not depend on the FDT node for recognizing TPM.

The command line to try out this device with SWTPM is:

```
$ qemu-system-aarch64 \
-chardev socket,id=chrtpm0,path=tpm.sock \
-tpmdev emulator,id=tpm0,chardev=chrtpm0 \
-device tpm-crb-device,tpmdev=tpm0
```

along with SWTPM:

```
$ swtpm \
--ctrl type=unixio,path=tpm.sock,terminate \
--tpmstate backend-uri=file://tpm.data \
--tpm2
```

Signed-off-by: Joelle van Dyne 
---
 docs/specs/tpm.rst  |   1 +
 include/hw/acpi/aml-build.h |   1 +
 include/sysemu/tpm.h|   3 +
 hw/acpi/aml-build.c |   7 +-
 hw/arm/virt.c   |   1 +
 hw/core/sysbus-fdt.c|   1 +
 hw/loongarch/virt.c |   1 +
 hw/riscv/virt.c |   1 +
 hw/tpm/tpm_crb_sysbus.c | 170 
 hw/arm/Kconfig  |   1 +
 hw/riscv/Kconfig|   1 +
 hw/tpm/Kconfig  |   5 ++
 hw/tpm/meson.build  |   2 +
 13 files changed, 194 insertions(+), 1 deletion(-)
 create mode 100644 hw/tpm/tpm_crb_sysbus.c

diff --git a/docs/specs/tpm.rst b/docs/specs/tpm.rst
index 2bc29c9804..95aeb49220 100644
--- a/docs/specs/tpm.rst
+++ b/docs/specs/tpm.rst
@@ -46,6 +46,7 @@ operating system.
 QEMU files related to TPM CRB interface:
  - ``hw/tpm/tpm_crb.c``
  - ``hw/tpm/tpm_crb_common.c``
+ - ``hw/tpm/tpm_crb_sysbus.c``
 
 SPAPR interface
 ---
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index d1fb08514b..9660e16148 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -3,6 +3,7 @@
 
 #include "hw/acpi/acpi-defs.h"
 #include "hw/acpi/bios-linker-loader.h"
+#include "exec/hwaddr.h"
 
 #define ACPI_BUILD_APPNAME6 "BOCHS "
 #define ACPI_BUILD_APPNAME8 "BXPC"
diff --git a/include/sysemu/tpm.h b/include/sysemu/tpm.h
index 66e3b45f30..f79c8f3575 100644
--- a/include/sysemu/tpm.h
+++ b/include/sysemu/tpm.h
@@ -47,6 +47,7 @@ struct TPMIfClass {
 #define TYPE_TPM_TIS_ISA"tpm-tis"
 #define TYPE_TPM_TIS_SYSBUS "tpm-tis-device"
 #define TYPE_TPM_CRB"tpm-crb"
+#define TYPE_TPM_CRB_SYSBUS "tpm-crb-device"
 #define TYPE_TPM_SPAPR  "tpm-spapr"
 #define TYPE_TPM_TIS_I2C"tpm-tis-i2c"
 
@@ -56,6 +57,8 @@ struct TPMIfClass {
 object_dynamic_cast(OBJECT(chr), TYPE_TPM_TIS_SYSBUS)
 #define TPM_IS_CRB(chr) \
 object_dynamic_cast(OBJECT(chr), TYPE_TPM_CRB)
+#define TPM_IS_CRB_SYSBUS(chr)  \
+object_dynamic_cast(OBJECT(chr), TYPE_TPM_CRB_SYSBUS)
 #define TPM_IS_SPAPR(chr)   \
 object_dynamic_cast(OBJECT(chr), TYPE_TPM_SPAPR)
 #define TPM_IS_TIS_I2C(chr)  \
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index ea331a20d1..f809137fc9 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -31,6 +31,7 @@
 #include "hw/pci/pci_bus.h"
 #include "hw/pci/pci_bridge.h"
 #include "qemu/cutils.h"
+#include "qom/object.h"
 
 static GArray *build_alloc_array(void)
 {
@@ -2218,7 +2219,7 @@ void build_tpm2(GArray *table_data, BIOSLinker *linker, GArray *tcpalog,
 {
 uint8_t start_method_params[12] = {};
 unsigned log_addr_offset;
-uint64_t control_area_start_address;
+uint64_t baseaddr, control_area_start_address;
 TPMIf *tpmif = tpm_find();
 uint32_t start_method;
 AcpiTable table = { .sig = "TPM2", .rev = 4,
@@ -2236,6 +2237,10 @@ void build_tpm2(GArray *table_data, BIOSLinker *linker, GArray *tcpalog,
 } else if (TPM_IS_CRB(tpmif)) {
 control_area_start_address = TPM_CRB_ADDR_CTRL;
 start_method = TPM2_START_METHOD_CRB;
+} else if (TPM_IS_CRB_SYSBUS(tpmif)) {
+baseaddr = object_property_get_uint(OBJECT(tpmif), "baseaddr", NULL);
+control_area_start_address = baseaddr + A_CRB_CTRL_REQ;
+start_method = TPM2_START_METHOD_CRB;
 } else {
 g_assert_not_reached();
 }
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 432148ef47..88e8b16103 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2977,6 +2977,7 @@ static void virt_machine_class_init(ObjectClass *oc

[PATCH v2 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping

2023-07-14 Thread Joelle van Dyne

On Apple Silicon, when Windows performs an LDP on the CRB MMIO space,
the exception is not decoded by hardware and we cannot trap the MMIO
read. This led to the idea from @agraf to use the same mapping type as
ROM devices: namely that reads should be seen as memory type and
writes should trap as MMIO.

Once that was done, the second memory mapping of the command buffer
region was redundant and was removed.
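
As a rough standalone model of the read-as-memory, write-as-MMIO idea: reads hit a backing buffer directly without any device callback, while writes are trapped and the device decides what lands in that buffer. This does not use QEMU's memory API; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define REGION_SIZE 0x1000u

/* Hypothetical model of a region whose reads bypass the device. */
typedef struct {
    uint8_t  ram[REGION_SIZE]; /* backing store the guest reads directly */
    unsigned write_traps;      /* number of writes the device observed */
} RomdSketch;

/* Read path: plain memory access, no device callback involved. */
static uint32_t guest_read32(const RomdSketch *r, uint32_t addr)
{
    uint32_t val;

    assert(addr <= REGION_SIZE - sizeof(val));
    memcpy(&val, &r->ram[addr], sizeof(val));
    return val;
}

/* Write path: trapped, so the device updates the backing store itself. */
static void guest_write32(RomdSketch *r, uint32_t addr, uint32_t val)
{
    assert(addr <= REGION_SIZE - sizeof(val));
    r->write_traps++;
    memcpy(&r->ram[addr], &val, sizeof(val));
}
```

Because reads never fault into the emulator in this model, a load the hardware cannot decode (such as LDP) does not need to be emulated. The cost is that read-side effects, such as the old `CRB_LOC_STATE` read trap, are no longer possible.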

A note about the removal of the read trap for `CRB_LOC_STATE`:
The only usage was to return the most up-to-date value for
`tpmEstablished`. However, `tpmEstablished` is only cleared when a
TPM2_HashStart operation is called which only exists for locality 4.
We do not handle locality 4. Indeed, the comment for the write handler
of `CRB_LOC_CTRL` makes the same argument for why it is not calling
the backend to reset the `tpmEstablished` bit (to 1).
As this bit is unused, we do not need to worry about updating it for
reads.

Signed-off-by: Joelle van Dyne 
---
 hw/tpm/tpm_crb.h|   2 -
 hw/tpm/tpm_crb.c|   3 -
 hw/tpm/tpm_crb_common.c | 126 +---
 3 files changed, 65 insertions(+), 66 deletions(-)

diff --git a/hw/tpm/tpm_crb.h b/hw/tpm/tpm_crb.h
index da3a0cf256..7cdd37335f 100644
--- a/hw/tpm/tpm_crb.h
+++ b/hw/tpm/tpm_crb.h
@@ -26,9 +26,7 @@
 typedef struct TPMCRBState {
 TPMBackend *tpmbe;
 TPMBackendCmd cmd;
-uint32_t regs[TPM_CRB_R_MAX];
 MemoryRegion mmio;
-MemoryRegion cmdmem;
 
 size_t be_buffer_size;
 
diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index 598c3e0161..07c6868d8d 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -68,7 +68,6 @@ static const VMStateDescription vmstate_tpm_crb_none = {
 .name = "tpm-crb",
 .pre_save = tpm_crb_none_pre_save,
 .fields = (VMStateField[]) {
-VMSTATE_UINT32_ARRAY(state.regs, CRBState, TPM_CRB_R_MAX),
 VMSTATE_END_OF_LIST(),
 }
 };
@@ -103,8 +102,6 @@ static void tpm_crb_none_realize(DeviceState *dev, Error **errp)
 
 memory_region_add_subregion(get_system_memory(),
 TPM_CRB_ADDR_BASE, &s->state.mmio);
-memory_region_add_subregion(get_system_memory(),
-TPM_CRB_ADDR_BASE + sizeof(s->state.regs), &s->state.cmdmem);
 
 if (s->state.ppi_enabled) {
 memory_region_add_subregion(get_system_memory(),
diff --git a/hw/tpm/tpm_crb_common.c b/hw/tpm/tpm_crb_common.c
index e56e910670..4ecf064c98 100644
--- a/hw/tpm/tpm_crb_common.c
+++ b/hw/tpm/tpm_crb_common.c
@@ -33,31 +33,12 @@
 #include "qom/object.h"
 #include "tpm_crb.h"
 
-static uint64_t tpm_crb_mmio_read(void *opaque, hwaddr addr,
-  unsigned size)
+static uint8_t tpm_crb_get_active_locty(TPMCRBState *s, uint32_t *regs)
 {
-TPMCRBState *s = opaque;
-void *regs = (void *)&s->regs + (addr & ~3);
-unsigned offset = addr & 3;
-uint32_t val = *(uint32_t *)regs >> (8 * offset);
-
-switch (addr) {
-case A_CRB_LOC_STATE:
-val |= !tpm_backend_get_tpm_established_flag(s->tpmbe);
-break;
-}
-
-trace_tpm_crb_mmio_read(addr, size, val);
-
-return val;
-}
-
-static uint8_t tpm_crb_get_active_locty(TPMCRBState *s)
-{
-if (!ARRAY_FIELD_EX32(s->regs, CRB_LOC_STATE, locAssigned)) {
+if (!ARRAY_FIELD_EX32(regs, CRB_LOC_STATE, locAssigned)) {
 return TPM_CRB_NO_LOCALITY;
 }
-return ARRAY_FIELD_EX32(s->regs, CRB_LOC_STATE, activeLocality);
+return ARRAY_FIELD_EX32(regs, CRB_LOC_STATE, activeLocality);
 }
 
 static void tpm_crb_mmio_write(void *opaque, hwaddr addr,
@@ -65,35 +46,47 @@ static void tpm_crb_mmio_write(void *opaque, hwaddr addr,
 {
 TPMCRBState *s = opaque;
 uint8_t locty =  addr >> 12;
+uint32_t *regs;
+void *mem;
 
 trace_tpm_crb_mmio_write(addr, size, val);
+regs = memory_region_get_ram_ptr(&s->mmio);
+mem = &regs[R_CRB_DATA_BUFFER];
+assert(regs);
+
+if (addr >= A_CRB_DATA_BUFFER) {
+assert(addr + size <= TPM_CRB_ADDR_SIZE);
+assert(size <= sizeof(val));
+memcpy(mem + addr - A_CRB_DATA_BUFFER, &val, size);
+memory_region_set_dirty(&s->mmio, addr, size);
+return;
+}
 
 switch (addr) {
 case A_CRB_CTRL_REQ:
 switch (val) {
 case CRB_CTRL_REQ_CMD_READY:
-ARRAY_FIELD_DP32(s->regs, CRB_CTRL_STS,
+ARRAY_FIELD_DP32(regs, CRB_CTRL_STS,
  tpmIdle, 0);
 break;
 case CRB_CTRL_REQ_GO_IDLE:
-ARRAY_FIELD_DP32(s->regs, CRB_CTRL_STS,
+ARRAY_FIELD_DP32(regs, CRB_CTRL_STS,
  tpmIdle, 1);
 break;
 }
 break;
 case A_CRB_CTRL_CANCEL:
 if (val == CRB_CANCEL_INVOKE &&
-s->regs[R_CRB_CTRL_START] & CRB_START_INVOKE) {
+regs[R_CRB_CTRL_START] & CRB_START_INVOKE) {
 tpm_backend_cancel_cmd(s->tpmbe);
 }
 break;
 case A_CRB_CTRL_START:
 if (val == CRB_START_INVOKE &

[PATCH v2 05/11] tpm_crb: use the ISA bus

2023-07-14 Thread Joelle van Dyne

Since this device is gated to only build for targets with the PC
configuration, we should use the ISA bus like with TPM TIS.

Signed-off-by: Joelle van Dyne 
---
 hw/tpm/tpm_crb.c | 52 
 hw/tpm/Kconfig   |  2 +-
 2 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index 07c6868d8d..6144081d30 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -22,6 +22,7 @@
 #include "hw/qdev-properties.h"
 #include "hw/pci/pci_ids.h"
 #include "hw/acpi/tpm.h"
+#include "hw/isa/isa.h"
 #include "migration/vmstate.h"
 #include "sysemu/tpm_backend.h"
 #include "sysemu/tpm_util.h"
@@ -34,7 +35,7 @@
 #include "tpm_crb.h"
 
 struct CRBState {
-DeviceState parent_obj;
+ISADevice parent_obj;
 
 TPMCRBState state;
 };
@@ -43,49 +44,49 @@ typedef struct CRBState CRBState;
 DECLARE_INSTANCE_CHECKER(CRBState, CRB,
  TYPE_TPM_CRB)
 
-static void tpm_crb_none_request_completed(TPMIf *ti, int ret)
+static void tpm_crb_isa_request_completed(TPMIf *ti, int ret)
 {
 CRBState *s = CRB(ti);
 
 tpm_crb_request_completed(&s->state, ret);
 }
 
-static enum TPMVersion tpm_crb_none_get_version(TPMIf *ti)
+static enum TPMVersion tpm_crb_isa_get_version(TPMIf *ti)
 {
 CRBState *s = CRB(ti);
 
 return tpm_crb_get_version(&s->state);
 }
 
-static int tpm_crb_none_pre_save(void *opaque)
+static int tpm_crb_isa_pre_save(void *opaque)
 {
 CRBState *s = opaque;
 
 return tpm_crb_pre_save(&s->state);
 }
 
-static const VMStateDescription vmstate_tpm_crb_none = {
+static const VMStateDescription vmstate_tpm_crb_isa = {
 .name = "tpm-crb",
-.pre_save = tpm_crb_none_pre_save,
+.pre_save = tpm_crb_isa_pre_save,
 .fields = (VMStateField[]) {
 VMSTATE_END_OF_LIST(),
 }
 };
 
-static Property tpm_crb_none_properties[] = {
+static Property tpm_crb_isa_properties[] = {
 DEFINE_PROP_TPMBE("tpmdev", CRBState, state.tpmbe),
 DEFINE_PROP_BOOL("ppi", CRBState, state.ppi_enabled, true),
 DEFINE_PROP_END_OF_LIST(),
 };
 
-static void tpm_crb_none_reset(void *dev)
+static void tpm_crb_isa_reset(void *dev)
 {
 CRBState *s = CRB(dev);
 
 return tpm_crb_reset(&s->state, TPM_CRB_ADDR_BASE);
 }
 
-static void tpm_crb_none_realize(DeviceState *dev, Error **errp)
+static void tpm_crb_isa_realize(DeviceState *dev, Error **errp)
 {
 CRBState *s = CRB(dev);
 
@@ -100,52 +101,51 @@ static void tpm_crb_none_realize(DeviceState *dev, Error **errp)
 
 tpm_crb_init_memory(OBJECT(s), &s->state, errp);
 
-memory_region_add_subregion(get_system_memory(),
+memory_region_add_subregion(isa_address_space(ISA_DEVICE(dev)),
 TPM_CRB_ADDR_BASE, &s->state.mmio);
 
 if (s->state.ppi_enabled) {
-memory_region_add_subregion(get_system_memory(),
+memory_region_add_subregion(isa_address_space(ISA_DEVICE(dev)),
 TPM_PPI_ADDR_BASE, &s->state.ppi.ram);
 }
 
 if (xen_enabled()) {
-tpm_crb_none_reset(dev);
+tpm_crb_isa_reset(dev);
 } else {
-qemu_register_reset(tpm_crb_none_reset, dev);
+qemu_register_reset(tpm_crb_isa_reset, dev);
 }
 }
 
-static void tpm_crb_none_class_init(ObjectClass *klass, void *data)
+static void tpm_crb_isa_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 TPMIfClass *tc = TPM_IF_CLASS(klass);
 
-dc->realize = tpm_crb_none_realize;
-device_class_set_props(dc, tpm_crb_none_properties);
-dc->vmsd  = &vmstate_tpm_crb_none;
+dc->realize = tpm_crb_isa_realize;
+device_class_set_props(dc, tpm_crb_isa_properties);
+dc->vmsd  = &vmstate_tpm_crb_isa;
 dc->user_creatable = true;
 tc->model = TPM_MODEL_TPM_CRB;
-tc->get_version = tpm_crb_none_get_version;
-tc->request_completed = tpm_crb_none_request_completed;
+tc->get_version = tpm_crb_isa_get_version;
+tc->request_completed = tpm_crb_isa_request_completed;
 
 set_bit(DEVICE_CATEGORY_MISC, dc->categories);
 }
 
-static const TypeInfo tpm_crb_none_info = {
+static const TypeInfo tpm_crb_isa_info = {
 .name = TYPE_TPM_CRB,
-/* could be TYPE_SYS_BUS_DEVICE (or LPC etc) */
-.parent = TYPE_DEVICE,
+.parent = TYPE_ISA_DEVICE,
 .instance_size = sizeof(CRBState),
-.class_init  = tpm_crb_none_class_init,
+.class_init  = tpm_crb_isa_class_init,
 .interfaces = (InterfaceInfo[]) {
 { TYPE_TPM_IF },
 { }
 }
 };
 
-static void tpm_crb_none_register(void)
+static void tpm_crb_isa_register(void)
 {
-type_register_static(&tpm_crb_none_info);
+type_register_static(&tpm_crb_isa_info);
 }
 
-type_init(tpm_crb_none_register)
+type_init(tpm_crb_isa_register)
diff --git a/hw/tpm/Kconfig b/hw/tpm/Kconfig
index a46663288c..1fd73fe617 100644
--- a/hw/tpm/Kconfig
+++ b/hw/tpm/Kconfig
@@ -22,7 +22,7 @@ config TPM_TIS
 
 config TPM_CRB
 bool
-depends on TPM && PC
+depends on TPM && ISA_BUS

[PATCH v2 03/11] tpm_ppi: refactor memory space initialization

2023-07-14 Thread Joelle van Dyne

Instead of calling `memory_region_add_subregion` directly, we defer to
the caller to do it. This allows us to re-use the code for a SysBus
device.

Signed-off-by: Joelle van Dyne 
Reviewed-by: Stefan Berger 
---
 hw/tpm/tpm_ppi.h| 10 +++---
 hw/tpm/tpm_crb.c|  4 ++--
 hw/tpm/tpm_crb_common.c |  3 +++
 hw/tpm/tpm_ppi.c|  5 +
 hw/tpm/tpm_tis_isa.c|  5 +++--
 5 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/hw/tpm/tpm_ppi.h b/hw/tpm/tpm_ppi.h
index bf5d4a300f..30863c6438 100644
--- a/hw/tpm/tpm_ppi.h
+++ b/hw/tpm/tpm_ppi.h
@@ -20,17 +20,13 @@ typedef struct TPMPPI {
 } TPMPPI;
 
 /**
- * tpm_ppi_init:
+ * tpm_ppi_init_memory:
  * @tpmppi: a TPMPPI
- * @m: the address-space / MemoryRegion to use
- * @addr: the address of the PPI region
  * @obj: the owner object
  *
- * Register the TPM PPI memory region at @addr on the given address
- * space for the object @obj.
+ * Creates the TPM PPI memory region.
  **/
-void tpm_ppi_init(TPMPPI *tpmppi, MemoryRegion *m,
-  hwaddr addr, Object *obj);
+void tpm_ppi_init_memory(TPMPPI *tpmppi, Object *obj);
 
 /**
  * tpm_ppi_reset:
diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index 3ef4977fb5..598c3e0161 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -107,8 +107,8 @@ static void tpm_crb_none_realize(DeviceState *dev, Error **errp)
 TPM_CRB_ADDR_BASE + sizeof(s->state.regs), &s->state.cmdmem);
 
 if (s->state.ppi_enabled) {
-tpm_ppi_init(&s->state.ppi, get_system_memory(),
- TPM_PPI_ADDR_BASE, OBJECT(s));
+memory_region_add_subregion(get_system_memory(),
+TPM_PPI_ADDR_BASE, &s->state.ppi.ram);
 }
 
 if (xen_enabled()) {
diff --git a/hw/tpm/tpm_crb_common.c b/hw/tpm/tpm_crb_common.c
index 228e2d0faf..e56e910670 100644
--- a/hw/tpm/tpm_crb_common.c
+++ b/hw/tpm/tpm_crb_common.c
@@ -216,4 +216,7 @@ void tpm_crb_init_memory(Object *obj, TPMCRBState *s, Error **errp)
 "tpm-crb-mmio", sizeof(s->regs));
 memory_region_init_ram(&s->cmdmem, obj,
 "tpm-crb-cmd", CRB_CTRL_CMD_SIZE, errp);
+if (s->ppi_enabled) {
+tpm_ppi_init_memory(&s->ppi, obj);
+}
 }
diff --git a/hw/tpm/tpm_ppi.c b/hw/tpm/tpm_ppi.c
index 7f74e26ec6..40cab59afa 100644
--- a/hw/tpm/tpm_ppi.c
+++ b/hw/tpm/tpm_ppi.c
@@ -44,14 +44,11 @@ void tpm_ppi_reset(TPMPPI *tpmppi)
 }
 }
 
-void tpm_ppi_init(TPMPPI *tpmppi, MemoryRegion *m,
-  hwaddr addr, Object *obj)
+void tpm_ppi_init_memory(TPMPPI *tpmppi, Object *obj)
 {
 tpmppi->buf = qemu_memalign(qemu_real_host_page_size(),
 HOST_PAGE_ALIGN(TPM_PPI_ADDR_SIZE));
 memory_region_init_ram_device_ptr(&tpmppi->ram, obj, "tpm-ppi",
   TPM_PPI_ADDR_SIZE, tpmppi->buf);
 vmstate_register_ram(&tpmppi->ram, DEVICE(obj));
-
-memory_region_add_subregion(m, addr, &tpmppi->ram);
 }
diff --git a/hw/tpm/tpm_tis_isa.c b/hw/tpm/tpm_tis_isa.c
index 91e3792248..7cd7415f30 100644
--- a/hw/tpm/tpm_tis_isa.c
+++ b/hw/tpm/tpm_tis_isa.c
@@ -134,8 +134,9 @@ static void tpm_tis_isa_realizefn(DeviceState *dev, Error **errp)
 TPM_TIS_ADDR_BASE, &s->mmio);
 
 if (s->ppi_enabled) {
-tpm_ppi_init(&s->ppi, isa_address_space(ISA_DEVICE(dev)),
- TPM_PPI_ADDR_BASE, OBJECT(dev));
+tpm_ppi_init_memory(&s->ppi, OBJECT(dev));
+memory_region_add_subregion(isa_address_space(ISA_DEVICE(dev)),
+TPM_PPI_ADDR_BASE, &s->ppi.ram);
 }
 }
 
-- 
2.39.2 (Apple Git-143)
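
For readers skimming the diff, the split this patch makes (create the region in one place, let the bus-specific caller map it) can be sketched outside QEMU. This is a minimal illustration only: ToyRegion, toy_ppi_init_memory() and toy_add_subregion() are hypothetical stand-ins, not the QEMU memory API, and the address used in any caller is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory region; not QEMU's MemoryRegion. */
typedef struct ToyRegion {
    const char *name;
    uint64_t size;
    uint64_t mapped_at;   /* meaningful only when mapped is non-zero */
    int mapped;
} ToyRegion;

/* Phase 1: create the region. No address is chosen here. */
static void toy_ppi_init_memory(ToyRegion *r, uint64_t size)
{
    assert(r != NULL);
    r->name = "tpm-ppi";
    r->size = size;
    r->mapped_at = 0;
    r->mapped = 0;
}

/* Phase 2: the caller (ISA, SysBus, ...) decides where the region lives. */
static void toy_add_subregion(ToyRegion *r, uint64_t addr)
{
    assert(r != NULL);
    r->mapped_at = addr;
    r->mapped = 1;
}
```

Keeping the mapping step in the caller is what lets a SysBus device reuse the same init code and leave placement to the board.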

[PATCH v2 11/11] tpm_crb: support restoring older vmstate

2023-07-14 Thread Joelle van Dyne

When we moved to a single mapping and modified TPM CRB's VMState, it
broke restoring of VMs that were saved on an older version. This
change allows those VMs to gracefully migrate to the new memory
mapping.

Signed-off-by: Joelle van Dyne 
---
 hw/tpm/tpm_crb.h|  1 +
 hw/tpm/tpm_crb.c| 14 ++
 hw/tpm/tpm_crb_common.c |  7 +++
 3 files changed, 22 insertions(+)

diff --git a/hw/tpm/tpm_crb.h b/hw/tpm/tpm_crb.h
index 7cdd37335f..7d8f643e98 100644
--- a/hw/tpm/tpm_crb.h
+++ b/hw/tpm/tpm_crb.h
@@ -70,5 +70,6 @@ enum TPMVersion tpm_crb_get_version(TPMCRBState *s);
 int tpm_crb_pre_save(TPMCRBState *s);
 void tpm_crb_reset(TPMCRBState *s, uint64_t baseaddr);
 void tpm_crb_init_memory(Object *obj, TPMCRBState *s, Error **errp);
+void tpm_crb_restore_regs(TPMCRBState *s, uint32_t *saved_regs);
 
 #endif /* TPM_TPM_CRB_H */
diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index 594696ffb8..be29ca8c28 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -40,6 +40,7 @@ struct CRBState {
 ISADevice parent_obj;
 
 TPMCRBState state;
+uint32_t legacy_regs[TPM_CRB_R_MAX];
 };
 typedef struct CRBState CRBState;
 
@@ -67,10 +68,23 @@ static int tpm_crb_isa_pre_save(void *opaque)
 return tpm_crb_pre_save(&s->state);
 }
 
+static int tpm_crb_isa_post_load(void *opaque, int version_id)
+{
+CRBState *s = opaque;
+
+if (version_id == 0) {
+tpm_crb_restore_regs(&s->state, s->legacy_regs);
+}
+return 0;
+}
+
 static const VMStateDescription vmstate_tpm_crb_isa = {
 .name = "tpm-crb",
+.version_id = 1,
 .pre_save = tpm_crb_isa_pre_save,
+.post_load = tpm_crb_isa_post_load,
 .fields = (VMStateField[]) {
+VMSTATE_UINT32_ARRAY(legacy_regs, CRBState, TPM_CRB_R_MAX),
 VMSTATE_END_OF_LIST(),
 }
 };
diff --git a/hw/tpm/tpm_crb_common.c b/hw/tpm/tpm_crb_common.c
index 4ecf064c98..5714ac7fc4 100644
--- a/hw/tpm/tpm_crb_common.c
+++ b/hw/tpm/tpm_crb_common.c
@@ -224,3 +224,10 @@ void tpm_crb_init_memory(Object *obj, TPMCRBState *s, Error **errp)
 tpm_ppi_init_memory(&s->ppi, obj);
 }
 }
+
+void tpm_crb_restore_regs(TPMCRBState *s, uint32_t *saved_regs)
+{
+uint32_t *regs = memory_region_get_ram_ptr(&s->mmio);
+
+memcpy(regs, saved_regs, TPM_CRB_R_MAX * sizeof(*saved_regs));
+}
-- 
2.39.2 (Apple Git-143)
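
One detail worth double-checking in tpm_crb_restore_regs(): memcpy() takes a length in bytes, while TPM_CRB_R_MAX is used above as an element count for a uint32_t array. A standalone sketch of copying a register array, assuming a hypothetical TOY_R_MAX and helper name (neither is from the patch):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_R_MAX 8   /* hypothetical register count, not TPM_CRB_R_MAX */

/* Copy nregs 32-bit registers; the memcpy() length must be in bytes. */
static void toy_restore_regs(uint32_t *dst, const uint32_t *src, size_t nregs)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, nregs * sizeof(*src));
}
```

Passing only the element count as the length would restore just the first quarter of a uint32_t array.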

[PATCH v2 07/11] hw/arm/virt: add plug handler for TPM on SysBus

2023-07-14 Thread Joelle van Dyne

TPM needs to know its own base address in order to generate its DSDT
device entry.

Signed-off-by: Joelle van Dyne 
---
 hw/arm/virt.c | 37 +
 1 file changed, 37 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 7d9dbc2663..432148ef47 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2732,6 +2732,37 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
  dev, &error_abort);
 }
 
+#ifdef CONFIG_TPM
+static void virt_tpm_plug(VirtMachineState *vms, TPMIf *tpmif)
+{
+PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
+hwaddr pbus_base = vms->memmap[VIRT_PLATFORM_BUS].base;
+SysBusDevice *sbdev = SYS_BUS_DEVICE(tpmif);
+MemoryRegion *sbdev_mr;
+hwaddr tpm_base;
+uint64_t tpm_size;
+
+if (!sbdev || !object_dynamic_cast(OBJECT(sbdev), TYPE_SYS_BUS_DEVICE)) {
+return;
+}
+
+tpm_base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
+assert(tpm_base != -1);
+
+tpm_base += pbus_base;
+
+sbdev_mr = sysbus_mmio_get_region(sbdev, 0);
+tpm_size = memory_region_size(sbdev_mr);
+
+if (object_property_find(OBJECT(sbdev), "baseaddr")) {
+object_property_set_uint(OBJECT(sbdev), "baseaddr", tpm_base, NULL);
+}
+if (object_property_find(OBJECT(sbdev), "size")) {
+object_property_set_uint(OBJECT(sbdev), "size", tpm_size, NULL);
+}
+}
+#endif
+
 static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
 DeviceState *dev, Error **errp)
 {
@@ -2803,6 +2834,12 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
 vms->virtio_iommu_bdf = pci_get_bdf(pdev);
 create_virtio_iommu_dt_bindings(vms);
 }
+
+#ifdef CONFIG_TPM
+if (object_dynamic_cast(OBJECT(dev), TYPE_TPM_IF)) {
+virt_tpm_plug(vms, TPM_IF(dev));
+}
+#endif
 }
 
 static void virt_dimm_unplug_request(HotplugHandler *hotplug_dev,
-- 
2.39.2 (Apple Git-143)
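
The address computation in virt_tpm_plug() is: take the device's offset on the platform bus, treat -1 as "not mapped", and add the platform bus base. A minimal sketch of just that arithmetic, with toy_hwaddr and the helper name as hypothetical stand-ins (this does not call any QEMU function):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_hwaddr;                 /* stand-in for hwaddr */
#define TOY_ADDR_INVALID ((toy_hwaddr)-1)    /* "not mapped" sentinel */

/* Turn a bus-relative offset into an absolute guest physical address. */
static toy_hwaddr toy_absolute_base(toy_hwaddr bus_base, toy_hwaddr offset)
{
    if (offset == TOY_ADDR_INVALID) {
        return TOY_ADDR_INVALID;
    }
    /* Guard against wrap-around before adding. */
    assert(offset <= UINT64_MAX - bus_base);
    return bus_base + offset;
}
```

The patch asserts on the sentinel instead of propagating it; the sketch returns it so the caller can decide.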

[PATCH v2 08/11] hw/loongarch/virt: add plug handler for TPM on SysBus

2023-07-14 Thread Joelle van Dyne

TPM needs to know its own base address in order to generate its DSDT
device entry.

Signed-off-by: Joelle van Dyne 
---
 hw/loongarch/virt.c | 37 +
 1 file changed, 37 insertions(+)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index e19b042ce8..9c536c52bc 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -1040,6 +1040,37 @@ static void virt_mem_plug(HotplugHandler *hotplug_dev,
  dev, &error_abort);
 }
 
+#ifdef CONFIG_TPM
+static void virt_tpm_plug(LoongArchMachineState *lams, TPMIf *tpmif)
+{
+PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(lams->platform_bus_dev);
+hwaddr pbus_base = VIRT_PLATFORM_BUS_BASEADDRESS;
+SysBusDevice *sbdev = SYS_BUS_DEVICE(tpmif);
+MemoryRegion *sbdev_mr;
+hwaddr tpm_base;
+uint64_t tpm_size;
+
+if (!sbdev || !object_dynamic_cast(OBJECT(sbdev), TYPE_SYS_BUS_DEVICE)) {
+return;
+}
+
+tpm_base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
+assert(tpm_base != -1);
+
+tpm_base += pbus_base;
+
+sbdev_mr = sysbus_mmio_get_region(sbdev, 0);
+tpm_size = memory_region_size(sbdev_mr);
+
+if (object_property_find(OBJECT(sbdev), "baseaddr")) {
+object_property_set_uint(OBJECT(sbdev), "baseaddr", tpm_base, NULL);
+}
+if (object_property_find(OBJECT(sbdev), "size")) {
+object_property_set_uint(OBJECT(sbdev), "size", tpm_size, NULL);
+}
+}
+#endif
+
 static void loongarch_machine_device_plug_cb(HotplugHandler *hotplug_dev,
 DeviceState *dev, Error **errp)
 {
@@ -1054,6 +1085,12 @@ static void loongarch_machine_device_plug_cb(HotplugHandler *hotplug_dev,
 } else if (memhp_type_supported(dev)) {
 virt_mem_plug(hotplug_dev, dev, errp);
 }
+
+#ifdef CONFIG_TPM
+if (object_dynamic_cast(OBJECT(dev), TYPE_TPM_IF)) {
+virt_tpm_plug(lams, TPM_IF(dev));
+}
+#endif
 }
 
 static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
-- 
2.39.2 (Apple Git-143)

[PATCH v2 09/11] tpm_tis_sysbus: move DSDT AML generation to device

2023-07-14 Thread Joelle van Dyne

This reduces redundant code in different machine types with ACPI table
generation. Additionally, this will allow us to support multiple TPM
interfaces. Finally, this matches up with the TPM TIS ISA
implementation.

Ideally, we would be able to call `qbus_build_aml` and avoid any TPM
specific code in the ACPI table generation. However, currently we
still have to call `build_tpm2` anyways and it does not look like
most other ACPI devices support the `ACPI_DEV_AML_IF` interface.

Signed-off-by: Joelle van Dyne 
---
 hw/arm/virt-acpi-build.c  | 38 ++
 hw/loongarch/acpi-build.c | 38 ++
 hw/tpm/tpm_tis_sysbus.c   | 35 +++
 3 files changed, 39 insertions(+), 72 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 6b674231c2..49b2f19440 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -35,6 +35,7 @@
 #include "target/arm/cpu.h"
 #include "hw/acpi/acpi-defs.h"
 #include "hw/acpi/acpi.h"
+#include "hw/acpi/acpi_aml_interface.h"
 #include "hw/nvram/fw_cfg.h"
 #include "hw/acpi/bios-linker-loader.h"
 #include "hw/acpi/aml-build.h"
@@ -208,41 +209,6 @@ static void acpi_dsdt_add_gpio(Aml *scope, const MemMapEntry *gpio_memmap,
 aml_append(scope, dev);
 }
 
-#ifdef CONFIG_TPM
-static void acpi_dsdt_add_tpm(Aml *scope, VirtMachineState *vms)
-{
-PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
-hwaddr pbus_base = vms->memmap[VIRT_PLATFORM_BUS].base;
-SysBusDevice *sbdev = SYS_BUS_DEVICE(tpm_find());
-MemoryRegion *sbdev_mr;
-hwaddr tpm_base;
-
-if (!sbdev) {
-return;
-}
-
-tpm_base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
-assert(tpm_base != -1);
-
-tpm_base += pbus_base;
-
-sbdev_mr = sysbus_mmio_get_region(sbdev, 0);
-
-Aml *dev = aml_device("TPM0");
-aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101")));
-aml_append(dev, aml_name_decl("_STR", aml_string("TPM 2.0 Device")));
-aml_append(dev, aml_name_decl("_UID", aml_int(0)));
-
-Aml *crs = aml_resource_template();
-aml_append(crs,
-   aml_memory32_fixed(tpm_base,
-  (uint32_t)memory_region_size(sbdev_mr),
-  AML_READ_WRITE));
-aml_append(dev, aml_name_decl("_CRS", crs));
-aml_append(scope, dev);
-}
-#endif
-
 #define ID_MAPPING_ENTRY_SIZE 20
 #define SMMU_V3_ENTRY_SIZE 68
 #define ROOT_COMPLEX_ENTRY_SIZE 36
@@ -891,7 +857,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 
 acpi_dsdt_add_power_button(scope);
 #ifdef CONFIG_TPM
-acpi_dsdt_add_tpm(scope, vms);
+call_dev_aml_func(DEVICE(tpm_find()), scope);
 #endif
 
 aml_append(dsdt, scope);
diff --git a/hw/loongarch/acpi-build.c b/hw/loongarch/acpi-build.c
index 0b62c3a2f7..4291e670c8 100644
--- a/hw/loongarch/acpi-build.c
+++ b/hw/loongarch/acpi-build.c
@@ -14,6 +14,7 @@
 #include "target/loongarch/cpu.h"
 #include "hw/acpi/acpi-defs.h"
 #include "hw/acpi/acpi.h"
+#include "hw/acpi/acpi_aml_interface.h"
 #include "hw/nvram/fw_cfg.h"
 #include "hw/acpi/bios-linker-loader.h"
 #include "migration/vmstate.h"
@@ -328,41 +329,6 @@ static void build_flash_aml(Aml *scope, LoongArchMachineState *lams)
 aml_append(scope, dev);
 }
 
-#ifdef CONFIG_TPM
-static void acpi_dsdt_add_tpm(Aml *scope, LoongArchMachineState *vms)
-{
-PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
-hwaddr pbus_base = VIRT_PLATFORM_BUS_BASEADDRESS;
-SysBusDevice *sbdev = SYS_BUS_DEVICE(tpm_find());
-MemoryRegion *sbdev_mr;
-hwaddr tpm_base;
-
-if (!sbdev) {
-return;
-}
-
-tpm_base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
-assert(tpm_base != -1);
-
-tpm_base += pbus_base;
-
-sbdev_mr = sysbus_mmio_get_region(sbdev, 0);
-
-Aml *dev = aml_device("TPM0");
-aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101")));
-aml_append(dev, aml_name_decl("_STR", aml_string("TPM 2.0 Device")));
-aml_append(dev, aml_name_decl("_UID", aml_int(0)));
-
-Aml *crs = aml_resource_template();
-aml_append(crs,
-   aml_memory32_fixed(tpm_base,
-  (uint32_t)memory_region_size(sbdev_mr),
-  AML_READ_WRITE));
-aml_append(dev, aml_name_decl("_CRS", crs));
-aml_append(scope, dev);
-}
-#endif
-
 /* build DSDT */
 static void
 build_dsdt(GArray *table_data, BIOSLinker *linker, MachineState *machine)
@@ -379,7 +345,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, MachineState *machine)
 build_la_ged_aml(dsdt, machine);
 build_flash_aml(dsdt, lams);
 #ifdef CONFIG_TPM
-acpi_dsdt_add_tpm(dsdt, lams);
+call_dev_aml_func(DEVICE(tpm_find()), dsdt);
 #endif
 /* System State Package */
 scope = aml_scope("\\");
diff --git a/hw/tpm/tpm_tis_sysbus.c b/hw/tpm/tpm_tis_sysbus.c

Re: [PATCH v4 4/6] ebpf: Added declaration/initialization routines.

2023-07-14 Thread Markus Armbruster

Andrew Melnychenko  writes:

> Now, the binary objects may be retrieved by id.
> It would require for future qmp commands that may require specific
> eBPF blob.
>
> Signed-off-by: Andrew Melnychenko 
> ---
>  ebpf/ebpf.c  | 70 
>  ebpf/ebpf.h  | 31 +
>  ebpf/ebpf_rss.c  |  6 +
>  ebpf/meson.build |  2 +-
>  4 files changed, 108 insertions(+), 1 deletion(-)
>  create mode 100644 ebpf/ebpf.c
>  create mode 100644 ebpf/ebpf.h
>
> diff --git a/ebpf/ebpf.c b/ebpf/ebpf.c
> new file mode 100644
> index 00..ea97c0403e
> --- /dev/null
> +++ b/ebpf/ebpf.c
> @@ -0,0 +1,70 @@
> +/*
> + * QEMU eBPF binary declaration routine.
> + *
> + * Developed by Daynix Computing LTD (http://www.daynix.com)
> + *
> + * Authors:
> + *  Andrew Melnychenko 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/queue.h"
> +#include "qapi/error.h"
> +#include "qapi/qapi-commands-ebpf.h"

Does not compile:

../ebpf/ebpf.c:16:10: fatal error: qapi/qapi-commands-ebpf.h: No such file or directory

The header doesn't exist until you add qapi/ebpf.json in the next
commit.

> +#include "ebpf/ebpf.h"
> +
> +struct ElfBinaryDataEntry {
> +int id;
> +const void *data;
> +size_t datalen;
> +
> +QSLIST_ENTRY(ElfBinaryDataEntry) node;
> +};
> +
> +static QSLIST_HEAD(, ElfBinaryDataEntry) ebpf_elf_obj_list =
> +QSLIST_HEAD_INITIALIZER();
> +
> +void ebpf_register_binary_data(int id, const void *data, size_t datalen)
> +{
> +struct ElfBinaryDataEntry *dataentry = NULL;
> +
> +dataentry = g_new0(struct ElfBinaryDataEntry, 1);
> +dataentry->data = data;
> +dataentry->datalen = datalen;
> +dataentry->id = id;
> +
> +QSLIST_INSERT_HEAD(&ebpf_elf_obj_list, dataentry, node);
> +}
> +
> +const void *ebpf_find_binary_by_id(int id, size_t *sz, Error **errp)
> +{
> +struct ElfBinaryDataEntry *it = NULL;
> +QSLIST_FOREACH(it, &ebpf_elf_obj_list, node) {
> +if (id == it->id) {
> +*sz = it->datalen;
> +return it->data;
> +}
> +}
> +
> +error_setg(errp, "can't find eBPF object with id: %d", id);
> +
> +return NULL;
> +}
> +
> +EbpfObject *qmp_request_ebpf(EbpfProgramID id, Error **errp)
> +{
> +EbpfObject *ret = NULL;
> +size_t size = 0;
> +const void *data = ebpf_find_binary_by_id(id, &size, errp);
> +if (!data) {
> +return NULL;
> +}
> +
> +ret = g_new0(EbpfObject, 1);
> +ret->object = g_base64_encode(data, size);
> +
> +return ret;
> +}
> diff --git a/ebpf/ebpf.h b/ebpf/ebpf.h
> new file mode 100644
> index 00..b6266b28b8
> --- /dev/null
> +++ b/ebpf/ebpf.h
> @@ -0,0 +1,31 @@
> +/*
> + * QEMU eBPF binary declaration routine.
> + *
> + * Developed by Daynix Computing LTD (http://www.daynix.com)
> + *
> + * Authors:
> + *  Andrew Melnychenko 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef EBPF_H
> +#define EBPF_H
> +
> +struct Error;
> +
> +void ebpf_register_binary_data(int id, const void *data,
> +   size_t datalen);
> +const void *ebpf_find_binary_by_id(int id, size_t *sz,
> +   struct Error **errp);
> +
> +#define ebpf_binary_init(id, fn)   \
> +static void __attribute__((constructor)) ebpf_binary_init_ ## fn(void) \
> +{  \
> +size_t datalen = 0;\
> +const void *data = fn(&datalen);   \
> +ebpf_register_binary_data(id, data, datalen);  \
> +}
> +
> +#endif /* EBPF_H */
> diff --git a/ebpf/ebpf_rss.c b/ebpf/ebpf_rss.c
> index 24bc6cc409..8679dc452d 100644
> --- a/ebpf/ebpf_rss.c
> +++ b/ebpf/ebpf_rss.c
> @@ -13,6 +13,8 @@
>  
>  #include "qemu/osdep.h"
>  #include "qemu/error-report.h"
> +#include "qapi/qapi-types-misc.h"
> +#include "qapi/qapi-commands-ebpf.h"

Likewise.

>  
>  #include 
>  #include 
> @@ -21,6 +23,8 @@
>  
>  #include "ebpf/ebpf_rss.h"
>  #include "ebpf/rss.bpf.skeleton.h"
> +#include "ebpf/ebpf.h"
> +
>  #include "trace.h"
>  
>  void ebpf_rss_init(struct EBPFRSSContext *ctx)
> @@ -261,3 +265,5 @@ void ebpf_rss_unload(struct EBPFRSSContext *ctx)
>  ctx->map_toeplitz_key = -1;
>  ctx->map_indirections_table = -1;
>  }
> +
> +ebpf_binary_init(EBPF_PROGRAMID_RSS, rss_bpf__elf_bytes)
> diff --git a/ebpf/meson.build b/ebpf/meson.build
> index 2f627d6c7d..c9bbaa7c90 100644
> --- a/ebpf/meson.build
> +++ b/ebpf/meson.build
> @@ -1 +1 @@
> -system_ss.add(when: libbpf, if_true: files('ebpf_rss.c'), if_false: 
>

Re: [PATCH v4 4/6] ebpf: Added declaration/initialization routines.

2023-07-14 Thread Markus Armbruster

Andrew Melnychenko  writes:

> Now, the binary objects may be retrieved by id.
> It would require for future qmp commands that may require specific
> eBPF blob.
>
> Signed-off-by: Andrew Melnychenko 
> ---

[...]

> diff --git a/ebpf/meson.build b/ebpf/meson.build
> index 2f627d6c7d..c9bbaa7c90 100644
> --- a/ebpf/meson.build
> +++ b/ebpf/meson.build
> @@ -1 +1 @@
> -system_ss.add(when: libbpf, if_true: files('ebpf_rss.c'), if_false: files('ebpf_rss-stub.c'))
> +common_ss.add(when: libbpf, if_true: files('ebpf.c', 'ebpf_rss.c'), if_false: files('ebpf_rss-stub.c'))
> \ No newline at end of file

Add a newline, please.
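
For reference, the register-then-look-up-by-id pattern that ebpf.c in this patch builds on QSLIST can be sketched in plain C. This is an illustration under assumptions: the toy_* names are hypothetical, a hand-rolled list replaces QSLIST, and a NULL return replaces the Error ** reporting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One registered blob; the list does not own the data it points to. */
struct toy_blob {
    int id;
    const void *data;
    size_t len;
    struct toy_blob *next;
};

static struct toy_blob *toy_blobs;   /* head of the singly linked list */

/* Register a blob under an id. Returns 0 on success, -1 on OOM. */
static int toy_register_blob(int id, const void *data, size_t len)
{
    struct toy_blob *b = calloc(1, sizeof(*b));

    if (!b) {
        return -1;
    }
    b->id = id;
    b->data = data;
    b->len = len;
    b->next = toy_blobs;   /* insert at head */
    toy_blobs = b;
    return 0;
}

/* Look a blob up by id. Returns NULL (and leaves *len alone) if absent. */
static const void *toy_find_blob(int id, size_t *len)
{
    assert(len != NULL);
    for (const struct toy_blob *b = toy_blobs; b; b = b->next) {
        if (b->id == id) {
            *len = b->len;
            return b->data;
        }
    }
    return NULL;
}
```

In the patch, registration happens from a constructor function, so every blob is in the list before any lookup can run.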

Re: [RFC PATCH v2 11/11] hw/char/pl011: Implement TX FIFO

2023-07-14 Thread Richard Henderson


On 7/10/23 18:51, Philippe Mathieu-Daudé wrote:

+static gboolean pl011_xmit(void *do_not_use, GIOCondition cond, void *opaque)
+{
+PL011State *s = opaque;
+int ret;
+const uint8_t *buf;
+uint32_t buflen;
+uint32_t count;
+bool tx_enabled;
+
+if (!qemu_chr_fe_backend_connected(&s->chr)) {
+/* Instant drain the fifo when there's no back-end */
+return pl011_drain_tx(s);
+}
+
+tx_enabled = s->cr & CR_UARTEN;


What happened to "Hello, World"?  We ought to be consistent.
For actual modeling, I think you need TXE too.

Where does UARTFR get updated after successfully transmitting data?


  static void pl011_write_txdata(PL011State *s, const uint8_t *buf, int length)
@@ -162,12 +218,32 @@ static void pl011_write_txdata(PL011State *s, const uint8_t *buf, int length)
  if (!(s->cr & CR_TXE)) {
  qemu_log_mask(LOG_GUEST_ERROR, "PL011 write data but TX disabled\n");
  }
+if (!fifo8_is_empty(&s->xmit_fifo)) {
+/*
+ * If the UART is disabled in the middle of transmission
+ * or reception, it completes the current character before
+ * stopping.
+ */
+pl011_xmit(NULL, G_IO_OUT, s);
+return;
+}


Why is this in write_txdata?  I would expect to find this with a write to 
UARTCR.
You appear to *not* be queuing data unless the fifo is empty.


+if (length > fifo8_num_free(&s->xmit_fifo)) {
+/*
+ * The FIFO contents remain valid because no more data is
+ * written when the FIFO is full, only the contents of the
+ * shift register are overwritten. The CPU must now read
+ * the data, to empty the FIFO.
+ */
+trace_pl011_fifo_tx_overrun();
+s->rsr |= RSR_OE;
+return;
+}
+
+trace_pl011_fifo_tx_put(length);
+fifo8_push_all(&s->xmit_fifo, buf, length);


Since length will always be 1, probably we should just remove it.


+static bool pl011_xmit_fifo_state_needed(void *opaque, int version_id)
+{
+PL011State* s = opaque;
+
+return pl011_is_fifo_enabled(s) && !fifo8_is_empty(&s->xmit_fifo);
+}


Ok.


  static int pl011_post_load(void *opaque, int version_id)
  {
  PL011State* s = opaque;
@@ -455,6 +538,11 @@ static int pl011_post_load(void *opaque, int version_id)
  s->read_pos = 0;
  }
  
+if (pl011_xmit_fifo_state_needed(s, version_id)) {

+/* Reschedule another transmission */
+qemu_chr_fe_add_watch(&s->chr, G_IO_OUT | G_IO_HUP, pl011_xmit, s);
+}


Ok.


@@ -473,6 +561,7 @@ static const VMStateDescription vmstate_pl011 = {
  VMSTATE_UINT32(int_enabled, PL011State),
  VMSTATE_UINT32(int_level, PL011State),
  VMSTATE_UINT32_ARRAY(read_fifo, PL011State, PL011_FIFO_DEPTH),
+VMSTATE_FIFO8_TEST(xmit_fifo, PL011State, pl011_xmit_fifo_state_needed),


Not ok.

The new data should go in its own VMStateDescription, like vmstate_pl011_clock.


r~
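
To make the queuing comment above concrete: a write path that always queues while there is room, flags overrun only when the FIFO is full, and leaves draining to a separate step could look roughly like the sketch below. It is a standalone toy, not pl011 code; ToyUart, TOY_FIFO_DEPTH and both helpers are hypothetical, and it ignores the enable bits and flag register entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FIFO_DEPTH 16   /* hypothetical depth */

typedef struct ToyUart {
    uint8_t fifo[TOY_FIFO_DEPTH];
    unsigned head;      /* index of the oldest queued byte */
    unsigned count;     /* number of bytes currently queued */
    bool overrun;       /* models an overrun-error flag */
} ToyUart;

/* Queue one byte if there is room; otherwise flag overrun and drop it. */
static bool toy_uart_write(ToyUart *u, uint8_t ch)
{
    assert(u != NULL);
    if (u->count == TOY_FIFO_DEPTH) {
        u->overrun = true;
        return false;
    }
    u->fifo[(u->head + u->count) % TOY_FIFO_DEPTH] = ch;
    u->count++;
    return true;
}

/* Drain up to max bytes in FIFO order, as a "backend writable" step might. */
static unsigned toy_uart_drain(ToyUart *u, uint8_t *out, unsigned max)
{
    unsigned n = 0;

    assert(u != NULL && out != NULL);
    while (n < max && u->count > 0) {
        out[n++] = u->fifo[u->head];
        u->head = (u->head + 1) % TOY_FIFO_DEPTH;
        u->count--;
    }
    return n;
}
```

Since the write path only ever sees one character at a time, a single-byte interface like this also matches the suggestion to drop the length argument.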

Re: [PATCH v4 5/6] qmp: Added new command to retrieve eBPF blob.

2023-07-14 Thread Markus Armbruster

Andrew Melnychenko  writes:

> Added command "request-ebpf". This command returns
> eBPF program encoded base64. The program taken from the
> skeleton and essentially is an ELF object that can be
> loaded in the future with libbpf.
>
> The reason to use the command to provide the eBPF object
> instead of a separate artifact was to avoid issues related
> to finding the eBPF itself. As the eBPF maps/program should
> correspond to QEMU, the eBPF cant be used from different

can't

> QEMU build.

Blank line between paragraphs.

> The first solution was a helper that comes with QEMU
> and loads appropriate eBPF objects. And the issue is
> to find a proper helper if the system has several
> different QEMUs installed and/or built from the source,
> which helpers may not be compatible.

Blank line between paragraphs.

> Another issue is QEMU updating while there is a running
> QEMU instance. With an updated helper, it may not be
> possible to hotplug virtio-net device to the already
> running QEMU. Overall, requesting the eBPF object from
> QEMU itself solves possible failures with very little effort.

I respectfully disagree with "very little".  But it's your commit
message, not mine.  "Acceptable effort"?

> Links:
> [PATCH 3/5] qmp: Added the helper stamp check.
> https://lore.kernel.org/all/20230219162100.174318-4-and...@daynix.com/
>
> Signed-off-by: Andrew Melnychenko 
> ---
>  qapi/ebpf.json| 58 +++
>  qapi/meson.build  |  1 +
>  qapi/qapi-schema.json |  1 +
>  3 files changed, 60 insertions(+)
>  create mode 100644 qapi/ebpf.json
>
> diff --git a/qapi/ebpf.json b/qapi/ebpf.json
> new file mode 100644
> index 00..3237da69a7
> --- /dev/null
> +++ b/qapi/ebpf.json
> @@ -0,0 +1,58 @@
> +# -*- Mode: Python -*-
> +# vim: filetype=python
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or later.
> +# See the COPYING file in the top-level directory.
> +
> +##
> +# = eBPF Objects
> +##
> +
> +{ 'include': 'common.json' }
> +
> +##
> +# @EbpfObject:
> +#
> +# Structure that holds eBPF ELF object encoded in base64.
> +#
> +# Since: 8.3
> +#
> +##
> +{ 'struct': 'EbpfObject',
> +  'data': {'object': 'str'},
> +  'if': 'CONFIG_EBPF' }
> +
> +##
> +# @EbpfProgramID:
> +#
> +# The eBPF programs that can be gotten with request-ebpf.
> +#
> +# @rss: Receive side scaling, technology that allows steering traffic
> +# between queues by calculation hash. Users may set up indirection table
> +# and hash/packet types configurations. Used with virtio-net.
> +#
> +# Since: 8.3
> +##
> +{ 'enum': 'EbpfProgramID',
> +  'if': 'CONFIG_EBPF',
> +  'data': [ { 'name': 'rss' } ] }
> +
> +##
> +# @request-ebpf:
> +#
> +# Returns eBPF object that can be loaded with libbpf.
> +# Management applications (g.e. libvirt) may load it and pass file
> +# descriptors to QEMU. Which allows running QEMU without BPF capabilities.
> +# It's crucial that eBPF program/map is compatible with QEMU, so it's
> +# provided through QMP.
> +#
> +# Returns: RSS eBPF object encoded in base64.
> +#
> +# Since: 8.3
> +#
> +##
> +{ 'command': 'request-ebpf',
> +  'data': { 'id': 'EbpfProgramID' },
> +  'returns': 'EbpfObject',
> +  'if': 'CONFIG_EBPF' }
> +

Trim the trailing blank line.

Terminology: you use "eBPF program" and "eBPF object".  What's the
difference?  If there's none, use only one term, please.  To me,
"program" feels more clear.

> diff --git a/qapi/meson.build b/qapi/meson.build
> index 60a668b343..90047dae1c 100644
> --- a/qapi/meson.build
> +++ b/qapi/meson.build
> @@ -33,6 +33,7 @@ qapi_all_modules = [
>'crypto',
>'cxl',
>'dump',
> +  'ebpf',
>'error',
>'introspect',
>'job',
> diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
> index 6594afba31..2c82a49bae 100644
> --- a/qapi/qapi-schema.json
> +++ b/qapi/qapi-schema.json
> @@ -53,6 +53,7 @@
>  { 'include': 'char.json' }
>  { 'include': 'dump.json' }
>  { 'include': 'net.json' }
> +{ 'include': 'ebpf.json' }
>  { 'include': 'rdma.json' }
>  { 'include': 'rocker.json' }
>  { 'include': 'tpm.json' }
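
Since the command returns the ELF blob as a Base64 string, here is a small standalone encoder showing what that encoding does to raw bytes. It is only an illustration: toy_base64_encode() is a hypothetical helper written for this note, using the standard Base64 alphabet with '=' padding; it is not the GLib routine the series calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns a NUL-terminated, malloc()ed Base64 string, or NULL on OOM. */
static char *toy_base64_encode(const uint8_t *in, size_t len)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t out_len = 4 * ((len + 2) / 3);
    char *out = malloc(out_len + 1);
    size_t i, o = 0;

    assert(in != NULL || len == 0);
    if (!out) {
        return NULL;
    }
    /* Full 3-byte groups become 4 output characters. */
    for (i = 0; i + 2 < len; i += 3) {
        uint32_t v = (uint32_t)in[i] << 16 | (uint32_t)in[i + 1] << 8
                     | in[i + 2];
        out[o++] = tbl[v >> 18];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = tbl[(v >> 6) & 63];
        out[o++] = tbl[v & 63];
    }
    /* A trailing group of 1 or 2 bytes is padded with '='. */
    if (i < len) {
        uint32_t v = (uint32_t)in[i] << 16;

        if (i + 1 < len) {
            v |= (uint32_t)in[i + 1] << 8;
        }
        out[o++] = tbl[v >> 18];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = (i + 1 < len) ? tbl[(v >> 6) & 63] : '=';
        out[o++] = '=';
    }
    out[o] = '\0';
    return out;
}
```

A client would reverse this step to recover the ELF bytes before handing them to libbpf.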

Re: [PATCH v3] migration: hold the BQL during setup

2023-07-14 Thread Fiona Ebner

Ping

Am 30.06.23 um 16:18 schrieb Fiona Ebner:
> This is intended to be a semantic revert of commit 9b09503752
> ("migration: run setup callbacks out of big lock"). There have been so
> many changes since that commit (e.g. a new setup callback
> dirty_bitmap_save_setup() that also needs to be adapted now), it's
> easier to do the revert manually.
> 
> For snapshots, the bdrv_writev_vmstate() function is used during setup
> (in QIOChannelBlock backing the QEMUFile), but not holding the BQL
> while calling it could lead to an assertion failure. To understand
> how, first note the following:
> 
> 1. Generated coroutine wrappers for block layer functions spawn the
> coroutine and use AIO_WAIT_WHILE()/aio_poll() to wait for it.
> 2. If the host OS switches threads at an inconvenient time, it can
> happen that a bottom half scheduled for the main thread's AioContext
> is executed as part of a vCPU thread's aio_poll().
> 
> An example leading to the assertion failure is as follows:
> 
> main thread:
> 1. A snapshot-save QMP command gets issued.
> 2. snapshot_save_job_bh() is scheduled.
> 
> vCPU thread:
> 3. aio_poll() for the main thread's AioContext is called (e.g. when
> the guest writes to a pflash device, as part of blk_pwrite which is a
> generated coroutine wrapper).
> 4. snapshot_save_job_bh() is executed as part of aio_poll().
> 5. qemu_savevm_state() is called.
> 6. qemu_mutex_unlock_iothread() is called. Now
> qemu_get_current_aio_context() returns 0x0.
> 7. bdrv_writev_vmstate() is executed during the usual savevm setup
> via qemu_fflush(). But this function is a generated coroutine wrapper,
> so it uses AIO_WAIT_WHILE. There, the assertion
> assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> will fail.
> 
> To fix it, ensure that the BQL is held during setup. While it would
> only be needed for snapshots, adapting migration too avoids additional
> logic for conditional locking/unlocking in the setup callbacks.
> Writing the header could (in theory) also trigger qemu_fflush() and
> thus bdrv_writev_vmstate(), so the locked section also covers the
> qemu_savevm_state_header() call, even for migration, for consistency.
> 
> The section around multifd_send_sync_main() needs to be unlocked to
> avoid a deadlock. In particular, the function calls
> socket_send_channel_create() using multifd_new_send_channel_async() as
> a callback and then waits for the callback to signal via the
> channels_ready semaphore. The connection happens via
> qio_task_run_in_thread(), but the callback is only executed via
> qio_task_thread_result() which is scheduled for the main event loop.
> Without unlocking the section, the main thread would never get to
> process the task result and the callback meaning there would be no
> signal via the channels_ready semaphore.
> 
> The comment in ram_init_bitmaps() was introduced by 4987783400
> ("migration: fix incorrect memory_global_dirty_log_start outside BQL")
> and is removed, because it referred to the qemu_mutex_lock_iothread()
> call.
> 
> Signed-off-by: Fiona Ebner

[PATCH v6 0/5] Add RISC-V KVM AIA Support

2023-07-14 Thread Yong-Xuan Wang

This series adds support for KVM AIA in RISC-V architecture.

In order to test these patches, we require Linux with KVM AIA support which can
be found in the riscv_kvm_aia_hwaccel_v1 branch at
https://github.com/avpatel/linux.git

---
v6:
- fix alignment
- add hart index to the error message of IMSIC address setting in PATCH3

v5:
- remove the linux-header update patch since the riscv-to-apply.next QEMU has
synced up to Linux 6.5-rc1 headers.
- create the APLIC and IMSIC FDT helper functions in PATCH1
- add the irqfd support in PATCH3
- fix the comments and refine the code

v4:
- update the linux header by the scripts/update-linux-headers.sh in PATCH1
- remove the checking for "aplic_m" before creating S-mode APLIC device in PATCH2
- add more setting when we initialize the KVM AIA chip in PATCH4
- fix msi message delivery and the APLIC devices emulation in PATCH5
- fix the AIA devices mapping with NUMA enabled in PATCH6
- add "kvm-aia" parameter to specify the KVM AIA mode in PATCH6

v3:
- fix typo
- tag the linux-header patch as placeholder

v2:
- rebase to riscv-to-apply.next
- update the linux header by the scripts/update-linux-headers.sh

Yong-Xuan Wang (5):
  target/riscv: support the AIA device emulation with KVM enabled
  target/riscv: check the in-kernel irqchip support
  target/riscv: Create an KVM AIA irqchip
  target/riscv: update APLIC and IMSIC to support KVM AIA
  target/riscv: select KVM AIA in riscv virt machine

 hw/intc/riscv_aplic.c|  56 --
 hw/intc/riscv_imsic.c|  25 ++-
 hw/riscv/virt.c  | 410 ++-
 include/hw/riscv/virt.h  |   1 +
 target/riscv/kvm.c   | 170 +++-
 target/riscv/kvm_riscv.h |   6 +
 6 files changed, 469 insertions(+), 199 deletions(-)

-- 
2.17.1

[PATCH v6 3/5] target/riscv: Create an KVM AIA irqchip

2023-07-14 Thread Yong-Xuan Wang

We create a vAIA chip by using the KVM_DEV_TYPE_RISCV_AIA and then set up
the chip with the KVM_DEV_RISCV_AIA_GRP_* APIs.

Signed-off-by: Yong-Xuan Wang 
Reviewed-by: Jim Shu 
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
---
 target/riscv/kvm.c   | 160 +++
 target/riscv/kvm_riscv.h |   6 ++
 2 files changed, 166 insertions(+)

diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
index 005e054604..9bc92cedff 100644
--- a/target/riscv/kvm.c
+++ b/target/riscv/kvm.c
@@ -36,6 +36,7 @@
 #include "exec/address-spaces.h"
 #include "hw/boards.h"
 #include "hw/irq.h"
+#include "hw/intc/riscv_imsic.h"
 #include "qemu/log.h"
 #include "hw/loader.h"
 #include "kvm_riscv.h"
@@ -43,6 +44,7 @@
 #include "chardev/char-fe.h"
 #include "migration/migration.h"
 #include "sysemu/runstate.h"
+#include "hw/riscv/numa.h"
 
 static uint64_t kvm_riscv_reg_id(CPURISCVState *env, uint64_t type,
  uint64_t idx)
@@ -1026,3 +1028,161 @@ bool kvm_arch_cpu_check_are_resettable(void)
 void kvm_arch_accel_class_init(ObjectClass *oc)
 {
 }
+
+char *kvm_aia_mode_str(uint64_t aia_mode)
+{
+const char *val;
+
+switch (aia_mode) {
+case KVM_DEV_RISCV_AIA_MODE_EMUL:
+return "emul";
+case KVM_DEV_RISCV_AIA_MODE_HWACCEL:
+return "hwaccel";
+case KVM_DEV_RISCV_AIA_MODE_AUTO:
+default:
+return "auto";
+};
+}
+
+void kvm_riscv_aia_create(MachineState *machine,
+  uint64_t aia_mode, uint64_t group_shift,
+  uint64_t aia_irq_num, uint64_t aia_msi_num,
+  uint64_t aplic_base, uint64_t imsic_base,
+  uint64_t guest_num)
+{
+int ret, i;
+int aia_fd = -1;
+uint64_t default_aia_mode;
+uint64_t socket_count = riscv_socket_count(machine);
+uint64_t max_hart_per_socket = 0;
+uint64_t socket, base_hart, hart_count, socket_imsic_base, imsic_addr;
+uint64_t socket_bits, hart_bits, guest_bits;
+
+aia_fd = kvm_create_device(kvm_state, KVM_DEV_TYPE_RISCV_AIA, false);
+
+if (aia_fd < 0) {
+error_report("Unable to create in-kernel irqchip");
+exit(1);
+}
+
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_MODE,
+&default_aia_mode, false, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to get current KVM AIA mode");
+exit(1);
+}
+qemu_log("KVM AIA: default mode is %s\n",
+ kvm_aia_mode_str(default_aia_mode));
+
+if (default_aia_mode != aia_mode) {
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_MODE,
+&aia_mode, true, NULL);
+if (ret < 0)
+warn_report("KVM AIA: failed to set KVM AIA mode");
+else
+qemu_log("KVM AIA: set current mode to %s\n",
+ kvm_aia_mode_str(aia_mode));
+}
+
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_SRCS,
+&aia_irq_num, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set number of input irq lines");
+exit(1);
+}
+
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_IDS,
+&aia_msi_num, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set number of msi");
+exit(1);
+}
+
+socket_bits = find_last_bit(&socket_count, BITS_PER_LONG) + 1;
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_GROUP_BITS,
+&socket_bits, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set group_bits");
+exit(1);
+}
+
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_GROUP_SHIFT,
+&group_shift, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set group_shift");
+exit(1);
+}
+
+guest_bits = guest_num == 0 ? 0 :
+ find_last_bit(&guest_num, BITS_PER_LONG) + 1;
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_GUEST_BITS,
+&guest_bits, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set guest_bits");
+exit(1);
+}
+
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_ADDR,
+KVM_DEV_RISCV_AIA_ADDR_APLIC,
+&aplic_base, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set the base address

[PATCH v6 2/5] target/riscv: check the in-kernel irqchip support

2023-07-14 Thread Yong-Xuan Wang

We check the in-kernel irqchip support when using KVM acceleration.

Signed-off-by: Yong-Xuan Wang 
Reviewed-by: Jim Shu 
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
---
 target/riscv/kvm.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
index 9d8a8982f9..005e054604 100644
--- a/target/riscv/kvm.c
+++ b/target/riscv/kvm.c
@@ -914,7 +914,15 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 
 int kvm_arch_irqchip_create(KVMState *s)
 {
-return 0;
+if (kvm_kernel_irqchip_split()) {
+error_report("-machine kernel_irqchip=split is not supported on RISC-V.");
+exit(1);
+}
+
+/*
+ * We can create the VAIA using the newer device control API.
+ */
+return kvm_check_extension(s, KVM_CAP_DEVICE_CTRL);
 }
 
 int kvm_arch_process_async_events(CPUState *cs)
-- 
2.17.1

[PATCH v6 5/5] target/riscv: select KVM AIA in riscv virt machine

2023-07-14 Thread Yong-Xuan Wang

Select KVM AIA when the host kernel has in-kernel AIA chip support.
Since KVM AIA only has one APLIC instance, we map the QEMU APLIC
devices to the KVM APLIC.
We also extend the virt machine so that the KVM AIA mode can be specified.
The "kvm-aia" parameter is passed along with the machine name on the QEMU
command line:
1) "kvm-aia=emul": the IMSIC is emulated by the hypervisor
2) "kvm-aia=hwaccel": use the hardware guest IMSIC
3) "kvm-aia=auto": use the hardware guest IMSICs whenever available,
   otherwise fall back to software emulation.

Signed-off-by: Yong-Xuan Wang 
Reviewed-by: Jim Shu 
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
---
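Note (not part of the patch): the three "kvm-aia" values from the commit
message can be pictured as a small string-to-mode mapping. The sketch below
is standalone C for illustration only. The enum and both function names are
hypothetical stand-ins; the real series uses the KVM_DEV_RISCV_AIA_MODE_*
constants and its own property handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the KVM_DEV_RISCV_AIA_MODE_* constants. */
enum aia_mode {
    AIA_MODE_EMUL,      /* IMSIC emulated by the hypervisor */
    AIA_MODE_HWACCEL,   /* hardware guest IMSIC */
    AIA_MODE_AUTO,      /* hardware if available, else emulation */
    AIA_MODE_INVALID,   /* unrecognised option value */
};

/* Map a "kvm-aia=" option value to a mode. */
static enum aia_mode aia_mode_parse(const char *val)
{
    assert(val != NULL);

    if (strcmp(val, "emul") == 0) {
        return AIA_MODE_EMUL;
    }
    if (strcmp(val, "hwaccel") == 0) {
        return AIA_MODE_HWACCEL;
    }
    if (strcmp(val, "auto") == 0) {
        return AIA_MODE_AUTO;
    }
    return AIA_MODE_INVALID;
}

/* Reverse mapping, in the spirit of kvm_aia_mode_str() from patch 3/5. */
static const char *aia_mode_name(enum aia_mode mode)
{
    switch (mode) {
    case AIA_MODE_EMUL:
        return "emul";
    case AIA_MODE_HWACCEL:
        return "hwaccel";
    case AIA_MODE_AUTO:
        return "auto";
    default:
        return "invalid";
    }
}
```

Returning a distinct "invalid" value is a choice made for this sketch; how
the series itself reports a bad value is not shown in this excerpt.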
 hw/riscv/virt.c | 132 ++--
 include/hw/riscv/virt.h |   1 +
 2 files changed, 102 insertions(+), 31 deletions(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index f595380be1..6367597dfa 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -35,6 +35,7 @@
 #include "hw/riscv/virt.h"
 #include "hw/riscv/boot.h"
 #include "hw/riscv/numa.h"
+#include "kvm_riscv.h"
 #include "hw/intc/riscv_aclint.h"
 #include "hw/intc/riscv_aplic.h"
 #include "hw/intc/riscv_imsic.h"
@@ -75,6 +76,12 @@
 #error "Can't accomodate all IMSIC groups in address space"
 #endif
 
+/* KVM AIA only supports APLIC MSI. APLIC Wired is always emulated by QEMU. */
+static bool virt_use_kvm_aia(RISCVVirtState *s)
+{
+return kvm_irqchip_in_kernel() && s->aia_type == VIRT_AIA_TYPE_APLIC_IMSIC;
+}
+
 static const MemMapEntry virt_memmap[] = {
 [VIRT_DEBUG] ={0x0, 0x100 },
 [VIRT_MROM] = { 0x1000,0xf000 },
@@ -609,16 +616,16 @@ static void create_fdt_one_aplic(RISCVVirtState *s, int socket,
  uint32_t *intc_phandles,
  uint32_t aplic_phandle,
  uint32_t aplic_child_phandle,
- bool m_mode)
+ bool m_mode, int num_harts)
 {
 int cpu;
 char *aplic_name;
 uint32_t *aplic_cells;
 MachineState *ms = MACHINE(s);
 
-aplic_cells = g_new0(uint32_t, s->soc[socket].num_harts * 2);
+aplic_cells = g_new0(uint32_t, num_harts * 2);
 
-for (cpu = 0; cpu < s->soc[socket].num_harts; cpu++) {
+for (cpu = 0; cpu < num_harts; cpu++) {
 aplic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
 aplic_cells[cpu * 2 + 1] = cpu_to_be32(m_mode ? IRQ_M_EXT : IRQ_S_EXT);
 }
@@ -632,8 +639,7 @@ static void create_fdt_one_aplic(RISCVVirtState *s, int socket,
 
 if (s->aia_type == VIRT_AIA_TYPE_APLIC) {
 qemu_fdt_setprop(ms->fdt, aplic_name, "interrupts-extended",
- aplic_cells,
- s->soc[socket].num_harts * sizeof(uint32_t) * 2);
+ aplic_cells, num_harts * sizeof(uint32_t) * 2);
 } else {
 qemu_fdt_setprop_cell(ms->fdt, aplic_name, "msi-parent", msi_phandle);
 }
@@ -664,7 +670,8 @@ static void create_fdt_socket_aplic(RISCVVirtState *s,
 uint32_t msi_s_phandle,
 uint32_t *phandle,
 uint32_t *intc_phandles,
-uint32_t *aplic_phandles)
+uint32_t *aplic_phandles,
+int num_harts)
 {
 char *aplic_name;
 unsigned long aplic_addr;
@@ -681,7 +688,7 @@ static void create_fdt_socket_aplic(RISCVVirtState *s,
 create_fdt_one_aplic(s, socket, aplic_addr, memmap[VIRT_APLIC_M].size,
  msi_m_phandle, intc_phandles,
  aplic_m_phandle, aplic_s_phandle,
- true);
+ true, num_harts);
 }
 
 /* S-level APLIC node */
@@ -690,7 +697,7 @@ static void create_fdt_socket_aplic(RISCVVirtState *s,
 create_fdt_one_aplic(s, socket, aplic_addr, memmap[VIRT_APLIC_S].size,
  msi_s_phandle, intc_phandles,
  aplic_s_phandle, 0,
- false);
+ false, num_harts);
 
 aplic_name = g_strdup_printf("/soc/aplic@%lx", aplic_addr);
 
@@ -774,34 +781,51 @@ static void create_fdt_sockets(RISCVVirtState *s, const MemMapEntry *memmap,
 *msi_pcie_phandle = msi_s_phandle;
 }
 
-phandle_pos = ms->smp.cpus;
-for (socket = (socket_count - 1); socket >= 0; socket--) {
-phandle_pos -= s->soc[socket].num_harts;
-
-if (s->aia_type == VIRT_AIA_TYPE_NONE) {
-create_fdt_socket_plic(s, memmap, socket, phandle,
-&intc_phandles[phandle_pos], xplic_phandles);
-} else {
-create_fdt_socket_aplic(s, memmap, socket,
-msi_m_phandle, msi_s_phandle, phandle,
-&intc_phandles[phandle_pos], xplic_phandles);
+/* KVM AIA only has one APLIC instance */

[PATCH v6 1/5] target/riscv: support the AIA device emulation with KVM enabled

2023-07-14 Thread Yong-Xuan Wang

In this patch, we create the APLIC and IMSIC FDT helper functions and
remove M mode AIA devices when using KVM acceleration.

Signed-off-by: Yong-Xuan Wang 
Reviewed-by: Jim Shu 
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
---
 hw/riscv/virt.c | 290 +++-
 1 file changed, 137 insertions(+), 153 deletions(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index d90286dc46..f595380be1 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -516,79 +516,28 @@ static uint32_t imsic_num_bits(uint32_t count)
 return ret;
 }
 
-static void create_fdt_imsic(RISCVVirtState *s, const MemMapEntry *memmap,
- uint32_t *phandle, uint32_t *intc_phandles,
- uint32_t *msi_m_phandle, uint32_t *msi_s_phandle)
+static void create_fdt_one_imsic(RISCVVirtState *s, hwaddr base_addr,
+ uint32_t *intc_phandles, uint32_t msi_phandle,
+ bool m_mode, uint32_t imsic_guest_bits)
 {
 int cpu, socket;
 char *imsic_name;
 MachineState *ms = MACHINE(s);
 int socket_count = riscv_socket_count(ms);
-uint32_t imsic_max_hart_per_socket, imsic_guest_bits;
+uint32_t imsic_max_hart_per_socket;
 uint32_t *imsic_cells, *imsic_regs, imsic_addr, imsic_size;
 
-*msi_m_phandle = (*phandle)++;
-*msi_s_phandle = (*phandle)++;
 imsic_cells = g_new0(uint32_t, ms->smp.cpus * 2);
 imsic_regs = g_new0(uint32_t, socket_count * 4);
 
-/* M-level IMSIC node */
 for (cpu = 0; cpu < ms->smp.cpus; cpu++) {
 imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
-imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_M_EXT);
+imsic_cells[cpu * 2 + 1] = cpu_to_be32(m_mode ? IRQ_M_EXT : IRQ_S_EXT);
 }
-imsic_max_hart_per_socket = 0;
-for (socket = 0; socket < socket_count; socket++) {
-imsic_addr = memmap[VIRT_IMSIC_M].base +
- socket * VIRT_IMSIC_GROUP_MAX_SIZE;
-imsic_size = IMSIC_HART_SIZE(0) * s->soc[socket].num_harts;
-imsic_regs[socket * 4 + 0] = 0;
-imsic_regs[socket * 4 + 1] = cpu_to_be32(imsic_addr);
-imsic_regs[socket * 4 + 2] = 0;
-imsic_regs[socket * 4 + 3] = cpu_to_be32(imsic_size);
-if (imsic_max_hart_per_socket < s->soc[socket].num_harts) {
-imsic_max_hart_per_socket = s->soc[socket].num_harts;
-}
-}
-imsic_name = g_strdup_printf("/soc/imsics@%lx",
-(unsigned long)memmap[VIRT_IMSIC_M].base);
-qemu_fdt_add_subnode(ms->fdt, imsic_name);
-qemu_fdt_setprop_string(ms->fdt, imsic_name, "compatible",
-"riscv,imsics");
-qemu_fdt_setprop_cell(ms->fdt, imsic_name, "#interrupt-cells",
-FDT_IMSIC_INT_CELLS);
-qemu_fdt_setprop(ms->fdt, imsic_name, "interrupt-controller",
-NULL, 0);
-qemu_fdt_setprop(ms->fdt, imsic_name, "msi-controller",
-NULL, 0);
-qemu_fdt_setprop(ms->fdt, imsic_name, "interrupts-extended",
-imsic_cells, ms->smp.cpus * sizeof(uint32_t) * 2);
-qemu_fdt_setprop(ms->fdt, imsic_name, "reg", imsic_regs,
-socket_count * sizeof(uint32_t) * 4);
-qemu_fdt_setprop_cell(ms->fdt, imsic_name, "riscv,num-ids",
-VIRT_IRQCHIP_NUM_MSIS);
-if (socket_count > 1) {
-qemu_fdt_setprop_cell(ms->fdt, imsic_name, "riscv,hart-index-bits",
-imsic_num_bits(imsic_max_hart_per_socket));
-qemu_fdt_setprop_cell(ms->fdt, imsic_name, "riscv,group-index-bits",
-imsic_num_bits(socket_count));
-qemu_fdt_setprop_cell(ms->fdt, imsic_name, "riscv,group-index-shift",
-IMSIC_MMIO_GROUP_MIN_SHIFT);
-}
-qemu_fdt_setprop_cell(ms->fdt, imsic_name, "phandle", *msi_m_phandle);
-
-g_free(imsic_name);
 
-/* S-level IMSIC node */
-for (cpu = 0; cpu < ms->smp.cpus; cpu++) {
-imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
-imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_S_EXT);
-}
-imsic_guest_bits = imsic_num_bits(s->aia_guests + 1);
 imsic_max_hart_per_socket = 0;
 for (socket = 0; socket < socket_count; socket++) {
-imsic_addr = memmap[VIRT_IMSIC_S].base +
- socket * VIRT_IMSIC_GROUP_MAX_SIZE;
+imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
 imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
  s->soc[socket].num_harts;
 imsic_regs[socket * 4 + 0] = 0;
@@ -599,119 +548,151 @@ static void create_fdt_imsic(RISCVVirtState *s, const MemMapEntry *memmap,
 imsic_max_hart_per_socket = s->soc[socket].num_harts;
 }
 }
-imsic_name = g_strdup_printf("/soc/imsics@%lx",
-(unsigned long)memmap[VIRT_IMSIC_S].base);
+
+imsic_name = g_strdup_printf("/soc/imsics@%lx", (unsigned long)base_addr);
 qemu_fdt_add_subnode(ms->fdt, imsic_name);
-qemu_fdt_setprop_string(ms->fdt, imsic_name, "compatible",
-

[PATCH v6 4/5] target/riscv: update APLIC and IMSIC to support KVM AIA

2023-07-14 Thread Yong-Xuan Wang

KVM AIA cannot emulate an APLIC-only configuration. When the "aia=aplic"
parameter is passed, the APLIC devices are emulated by QEMU. For
"aia=aplic-imsic", remove the MMIO operations of the APLIC when using KVM
AIA and send wired interrupt signals via the KVM_IRQ_LINE API.
Once KVM AIA is enabled, MSI messages are delivered via the KVM_SIGNAL_MSI
API when the IMSICs receive MMIO write requests.

Signed-off-by: Yong-Xuan Wang 
Reviewed-by: Jim Shu 
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
---
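Note (not part of the patch): the commit message above boils down to one
predicate, the same one is_kvm_aia() implements in the diff. The standalone
sketch below spells out the four combinations. The enum and function names
are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for who models the APLIC; not QEMU identifiers. */
enum aplic_backend {
    APLIC_BACKEND_QEMU,   /* QEMU emulates APLIC state and MMIO */
    APLIC_BACKEND_KVM,    /* in-kernel AIA; QEMU only forwards wired IRQs */
};

/* Same condition as is_kvm_aia(): in-kernel irqchip AND MSI mode. */
static enum aplic_backend aplic_select_backend(bool irqchip_in_kernel,
                                               bool msimode)
{
    return (irqchip_in_kernel && msimode) ? APLIC_BACKEND_KVM
                                          : APLIC_BACKEND_QEMU;
}

/* Walk the four combinations described in the commit message. */
static void aplic_backend_selftest(void)
{
    /* "aia=aplic" (wired mode) stays in QEMU, with or without KVM. */
    assert(aplic_select_backend(false, false) == APLIC_BACKEND_QEMU);
    assert(aplic_select_backend(true, false) == APLIC_BACKEND_QEMU);
    /* "aia=aplic-imsic" without an in-kernel irqchip stays in QEMU. */
    assert(aplic_select_backend(false, true) == APLIC_BACKEND_QEMU);
    /* "aia=aplic-imsic" with an in-kernel irqchip is handled by KVM. */
    assert(aplic_select_backend(true, true) == APLIC_BACKEND_KVM);
}
```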
 hw/intc/riscv_aplic.c | 56 ++-
 hw/intc/riscv_imsic.c | 25 +++
 2 files changed, 61 insertions(+), 20 deletions(-)

diff --git a/hw/intc/riscv_aplic.c b/hw/intc/riscv_aplic.c
index 4bdc6a5d1a..592c3ce768 100644
--- a/hw/intc/riscv_aplic.c
+++ b/hw/intc/riscv_aplic.c
@@ -31,6 +31,7 @@
 #include "hw/irq.h"
 #include "target/riscv/cpu.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/kvm.h"
 #include "migration/vmstate.h"
 
 #define APLIC_MAX_IDC  (1UL << 14)
@@ -148,6 +149,15 @@
 
 #define APLIC_IDC_CLAIMI   0x1c
 
+/*
+ * KVM AIA only supports APLIC MSI, fallback to QEMU emulation if we want to use
+ * APLIC Wired.
+ */
+static bool is_kvm_aia(bool msimode)
+{
+return kvm_irqchip_in_kernel() && msimode;
+}
+
 static uint32_t riscv_aplic_read_input_word(RISCVAPLICState *aplic,
 uint32_t word)
 {
@@ -471,6 +481,11 @@ static uint32_t riscv_aplic_idc_claimi(RISCVAPLICState *aplic, uint32_t idc)
 return topi;
 }
 
+static void riscv_kvm_aplic_request(void *opaque, int irq, int level)
+{
+kvm_set_irq(kvm_state, irq, !!level);
+}
+
 static void riscv_aplic_request(void *opaque, int irq, int level)
 {
 bool update = false;
@@ -801,29 +816,35 @@ static void riscv_aplic_realize(DeviceState *dev, Error **errp)
 uint32_t i;
 RISCVAPLICState *aplic = RISCV_APLIC(dev);
 
-aplic->bitfield_words = (aplic->num_irqs + 31) >> 5;
-aplic->sourcecfg = g_new0(uint32_t, aplic->num_irqs);
-aplic->state = g_new0(uint32_t, aplic->num_irqs);
-aplic->target = g_new0(uint32_t, aplic->num_irqs);
-if (!aplic->msimode) {
-for (i = 0; i < aplic->num_irqs; i++) {
-aplic->target[i] = 1;
+if (!is_kvm_aia(aplic->msimode)) {
+aplic->bitfield_words = (aplic->num_irqs + 31) >> 5;
+aplic->sourcecfg = g_new0(uint32_t, aplic->num_irqs);
+aplic->state = g_new0(uint32_t, aplic->num_irqs);
+aplic->target = g_new0(uint32_t, aplic->num_irqs);
+if (!aplic->msimode) {
+for (i = 0; i < aplic->num_irqs; i++) {
+aplic->target[i] = 1;
+}
 }
-}
-aplic->idelivery = g_new0(uint32_t, aplic->num_harts);
-aplic->iforce = g_new0(uint32_t, aplic->num_harts);
-aplic->ithreshold = g_new0(uint32_t, aplic->num_harts);
+aplic->idelivery = g_new0(uint32_t, aplic->num_harts);
+aplic->iforce = g_new0(uint32_t, aplic->num_harts);
+aplic->ithreshold = g_new0(uint32_t, aplic->num_harts);
 
-memory_region_init_io(&aplic->mmio, OBJECT(dev), &riscv_aplic_ops, aplic,
-  TYPE_RISCV_APLIC, aplic->aperture_size);
-sysbus_init_mmio(SYS_BUS_DEVICE(dev), &aplic->mmio);
+memory_region_init_io(&aplic->mmio, OBJECT(dev), &riscv_aplic_ops,
+  aplic, TYPE_RISCV_APLIC, aplic->aperture_size);
+sysbus_init_mmio(SYS_BUS_DEVICE(dev), &aplic->mmio);
+}
 
 /*
  * Only root APLICs have hardware IRQ lines. All non-root APLICs
  * have IRQ lines delegated by their parent APLIC.
  */
 if (!aplic->parent) {
-qdev_init_gpio_in(dev, riscv_aplic_request, aplic->num_irqs);
+if (is_kvm_aia(aplic->msimode)) {
+qdev_init_gpio_in(dev, riscv_kvm_aplic_request, aplic->num_irqs);
+} else {
+qdev_init_gpio_in(dev, riscv_aplic_request, aplic->num_irqs);
+}
 }
 
 /* Create output IRQ lines for non-MSI mode */
@@ -958,7 +979,10 @@ DeviceState *riscv_aplic_create(hwaddr addr, hwaddr size,
 qdev_prop_set_bit(dev, "mmode", mmode);
 
 sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
-sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, addr);
+
+if (!is_kvm_aia(msimode)) {
+sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, addr);
+}
 
 if (parent) {
 riscv_aplic_add_child(parent, dev);
diff --git a/hw/intc/riscv_imsic.c b/hw/intc/riscv_imsic.c
index fea3385b51..760dbddcf7 100644
--- a/hw/intc/riscv_imsic.c
+++ b/hw/intc/riscv_imsic.c
@@ -32,6 +32,7 @@
 #include "target/riscv/cpu.h"
 #include "target/riscv/cpu_bits.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/kvm.h"
 #include "migration/vmstate.h"
 
 #define IMSIC_MMIO_PAGE_LE 0x00
@@ -283,6 +284,20 @@ static void riscv_imsic_write(void *opaque, hwaddr addr, uint64_t value,
 goto err;
 }
 
+#if defined(CONFIG_KVM)
+if (kvm_irqchip_in_kernel()) {
+struct kv

[PATCH v3 02/47] target/loongarch: meson.build support build LASX

2023-07-14 Thread Song Gao

Signed-off-by: Song Gao 
Reviewed-by: Richard Henderson 
---
 target/loongarch/insn_trans/trans_lasx.c.inc | 6 ++
 target/loongarch/translate.c | 1 +
 2 files changed, 7 insertions(+)
 create mode 100644 target/loongarch/insn_trans/trans_lasx.c.inc

diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
new file mode 100644
index 00..56a9839255
--- /dev/null
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * LASX translate functions
+ * Copyright (c) 2023 Loongson Technology Corporation Limited
+ */
+
diff --git a/target/loongarch/translate.c b/target/loongarch/translate.c
index 3146a2d4ac..6bf2d726d6 100644
--- a/target/loongarch/translate.c
+++ b/target/loongarch/translate.c
@@ -220,6 +220,7 @@ static void set_fpr(int reg_num, TCGv val)
 #include "insn_trans/trans_branch.c.inc"
 #include "insn_trans/trans_privileged.c.inc"
 #include "insn_trans/trans_lsx.c.inc"
+#include "insn_trans/trans_lasx.c.inc"
 
 static void loongarch_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
 {
-- 
2.39.1

[PATCH v3 07/47] target/loongarch: Implement xvneg

2023-07-14 Thread Song Gao

This patch includes:
- XVNEG.{B/H/W/D}.

Signed-off-by: Song Gao 
Reviewed-by: Richard Henderson 
---
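Note (not part of the patch): the diff threads an operation size (oprsz)
through gvec_vv so that the same translator serves LSX (16 bytes) and LASX
(32 bytes). The following is a plain C reference model of a lane-wise
negate, not TCG code. model_vneg() is a hypothetical name, and the byte
array is just an assumed stand-in for a vector register image.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Reference model: negate every element in the first oprsz bytes of vj
 * and store the result in vd. esz is the element size in bytes (1, 2, 4
 * or 8); oprsz is 16 for the LSX form and 32 for the LASX form.
 */
static void model_vneg(uint8_t *vd, const uint8_t *vj,
                       unsigned oprsz, unsigned esz)
{
    assert(esz == 1 || esz == 2 || esz == 4 || esz == 8);
    assert(oprsz % esz == 0);

    for (unsigned i = 0; i < oprsz; i += esz) {
        uint64_t x = 0;

        memcpy(&x, vj + i, esz);    /* load one element, host byte order */
        x = 0 - x;                  /* two's-complement negate, wraps */
        memcpy(vd + i, &x, esz);    /* store the same esz bytes back */
    }
}
```

Calling it with oprsz = 16 leaves bytes 16..31 of the destination untouched,
which is the only difference between the two forms in this model.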
 target/loongarch/disas.c | 10 ++
 target/loongarch/insn_trans/trans_lasx.c.inc |  5 +
 target/loongarch/insn_trans/trans_lsx.c.inc  | 12 ++--
 target/loongarch/insns.decode|  5 +
 4 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index f59e3cebf0..4e26d49acc 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1713,6 +1713,11 @@ static void output_vv_i_x(DisasContext *ctx, arg_vv_i *a, const char *mnemonic)
 output(ctx, mnemonic, "x%d, x%d, 0x%x", a->vd, a->vj, a->imm);
 }
 
+static void output_vv_x(DisasContext *ctx, arg_vv *a, const char *mnemonic)
+{
+output(ctx, mnemonic, "x%d, x%d", a->vd, a->vj);
+}
+
 static void output_vr_x(DisasContext *ctx, arg_vr *a, const char *mnemonic)
 {
 output(ctx, mnemonic, "x%d, r%d", a->vd, a->rj);
@@ -1738,6 +1743,11 @@ INSN_LASX(xvsubi_hu, vv_i)
 INSN_LASX(xvsubi_wu, vv_i)
 INSN_LASX(xvsubi_du, vv_i)
 
+INSN_LASX(xvneg_b,   vv)
+INSN_LASX(xvneg_h,   vv)
+INSN_LASX(xvneg_w,   vv)
+INSN_LASX(xvneg_d,   vv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 93932593a5..0c7d2bbffd 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -56,6 +56,11 @@ TRANS(xvsubi_hu, gvec_subi, 32, MO_16)
 TRANS(xvsubi_wu, gvec_subi, 32, MO_32)
 TRANS(xvsubi_du, gvec_subi, 32, MO_64)
 
+TRANS(xvneg_b, gvec_vv, 32, MO_8, tcg_gen_gvec_neg)
+TRANS(xvneg_h, gvec_vv, 32, MO_16, tcg_gen_gvec_neg)
+TRANS(xvneg_w, gvec_vv, 32, MO_32, tcg_gen_gvec_neg)
+TRANS(xvneg_d, gvec_vv, 32, MO_64, tcg_gen_gvec_neg)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index b95a2dffda..a1370b90e6 100644
--- a/target/loongarch/insn_trans/trans_lsx.c.inc
+++ b/target/loongarch/insn_trans/trans_lsx.c.inc
@@ -81,7 +81,7 @@ static bool gvec_vvv(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, MemOp mop,
 return true;
 }
 
-static bool gvec_vv(DisasContext *ctx, arg_vv *a, MemOp mop,
+static bool gvec_vv(DisasContext *ctx, arg_vv *a, uint32_t oprsz, MemOp mop,
 void (*func)(unsigned, uint32_t, uint32_t,
  uint32_t, uint32_t))
 {
@@ -92,7 +92,7 @@ static bool gvec_vv(DisasContext *ctx, arg_vv *a, MemOp mop,
 vd_ofs = vec_full_offset(a->vd);
 vj_ofs = vec_full_offset(a->vj);
 
-func(mop, vd_ofs, vj_ofs, 16, ctx->vl/8);
+func(mop, vd_ofs, vj_ofs, oprsz, ctx->vl / 8);
 return true;
 }
 
@@ -173,10 +173,10 @@ TRANS(vsubi_hu, gvec_subi, 16, MO_16)
 TRANS(vsubi_wu, gvec_subi, 16, MO_32)
 TRANS(vsubi_du, gvec_subi, 16, MO_64)
 
-TRANS(vneg_b, gvec_vv, MO_8, tcg_gen_gvec_neg)
-TRANS(vneg_h, gvec_vv, MO_16, tcg_gen_gvec_neg)
-TRANS(vneg_w, gvec_vv, MO_32, tcg_gen_gvec_neg)
-TRANS(vneg_d, gvec_vv, MO_64, tcg_gen_gvec_neg)
+TRANS(vneg_b, gvec_vv, 16, MO_8, tcg_gen_gvec_neg)
+TRANS(vneg_h, gvec_vv, 16, MO_16, tcg_gen_gvec_neg)
+TRANS(vneg_w, gvec_vv, 16, MO_32, tcg_gen_gvec_neg)
+TRANS(vneg_d, gvec_vv, 16, MO_64, tcg_gen_gvec_neg)
 
 TRANS(vsadd_b, gvec_vvv, 16, MO_8, tcg_gen_gvec_ssadd)
 TRANS(vsadd_h, gvec_vvv, 16, MO_16, tcg_gen_gvec_ssadd)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index c48dca70b8..759172628f 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1320,6 +1320,11 @@ xvsubi_hu0111 01101000 11001 . . . @vv_ui5
 xvsubi_wu0111 01101000 11010 . . .@vv_ui5
 xvsubi_du0111 01101000 11011 . . .@vv_ui5
 
+xvneg_b  0111 01101001 11000 01100 . .@vv
+xvneg_h  0111 01101001 11000 01101 . .@vv
+xvneg_w  0111 01101001 11000 01110 . .@vv
+xvneg_d  0111 01101001 11000 0 . .@vv
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
-- 
2.39.1

[PATCH v3 05/47] target/loongarch: Implement xvreplgr2vr

2023-07-14 Thread Song Gao

This patch includes:
- XVREPLGR2VR.{B/H/W/D}.

Signed-off-by: Song Gao 
Reviewed-by: Richard Henderson 
---
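Note (not part of the patch): XVREPLGR2VR copies the low bits of a general
register into every element of the vector. The sketch below is a plain C
reference model, not TCG code. model_vreplgr2vr() is a hypothetical name,
and the little-endian byte layout of each lane is an assumption of this
illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reference model: replicate the low esz bytes of gpr into every element
 * of the first oprsz bytes of vd. esz is the element size in bytes
 * (1, 2, 4 or 8); oprsz is 16 for LSX and 32 for LASX.
 */
static void model_vreplgr2vr(uint8_t *vd, uint64_t gpr,
                             unsigned oprsz, unsigned esz)
{
    assert(esz == 1 || esz == 2 || esz == 4 || esz == 8);
    assert(oprsz % esz == 0);

    for (unsigned i = 0; i < oprsz; i += esz) {
        for (unsigned b = 0; b < esz; b++) {
            /* byte b of the lane is bits [8b+7:8b] of the register */
            vd[i + b] = (uint8_t)(gpr >> (8 * b));
        }
    }
}
```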
 target/loongarch/disas.c | 10 ++
 target/loongarch/insn_trans/trans_lasx.c.inc |  5 +
 target/loongarch/insn_trans/trans_lsx.c.inc  | 13 +++--
 target/loongarch/insns.decode|  5 +
 4 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index d8b62ba532..c47f455ed0 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1708,6 +1708,11 @@ static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemonic)
 output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk);
 }
 
+static void output_vr_x(DisasContext *ctx, arg_vr *a, const char *mnemonic)
+{
+output(ctx, mnemonic, "x%d, r%d", a->vd, a->rj);
+}
+
 INSN_LASX(xvadd_b,   vvv)
 INSN_LASX(xvadd_h,   vvv)
 INSN_LASX(xvadd_w,   vvv)
@@ -1718,3 +1723,8 @@ INSN_LASX(xvsub_h,   vvv)
 INSN_LASX(xvsub_w,   vvv)
 INSN_LASX(xvsub_d,   vvv)
 INSN_LASX(xvsub_q,   vvv)
+
+INSN_LASX(xvreplgr2vr_b, vr)
+INSN_LASX(xvreplgr2vr_h, vr)
+INSN_LASX(xvreplgr2vr_w, vr)
+INSN_LASX(xvreplgr2vr_d, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 86ba296a73..9bbf6c48ec 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -46,3 +46,8 @@ TRANS(xvsub_b, gvec_vvv, 32, MO_8, tcg_gen_gvec_sub)
 TRANS(xvsub_h, gvec_vvv, 32, MO_16, tcg_gen_gvec_sub)
 TRANS(xvsub_w, gvec_vvv, 32, MO_32, tcg_gen_gvec_sub)
 TRANS(xvsub_d, gvec_vvv, 32, MO_64, tcg_gen_gvec_sub)
+
+TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
+TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
+TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
+TRANS(xvreplgr2vr_d, gvec_dup, 32, MO_64)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index 63061bd4a1..4667dba4b4 100644
--- a/target/loongarch/insn_trans/trans_lsx.c.inc
+++ b/target/loongarch/insn_trans/trans_lsx.c.inc
@@ -4058,20 +4058,21 @@ static bool trans_vpickve2gr_du(DisasContext *ctx, arg_rv_i *a)
 return true;
 }
 
-static bool gvec_dup(DisasContext *ctx, arg_vr *a, MemOp mop)
+static bool gvec_dup(DisasContext *ctx, arg_vr *a, uint32_t oprsz, MemOp mop)
 {
 TCGv src = gpr_src(ctx, a->rj, EXT_NONE);
+
 CHECK_VEC;
 
 tcg_gen_gvec_dup_i64(mop, vec_full_offset(a->vd),
- 16, ctx->vl/8, src);
+ oprsz, ctx->vl / 8, src);
 return true;
 }
 
-TRANS(vreplgr2vr_b, gvec_dup, MO_8)
-TRANS(vreplgr2vr_h, gvec_dup, MO_16)
-TRANS(vreplgr2vr_w, gvec_dup, MO_32)
-TRANS(vreplgr2vr_d, gvec_dup, MO_64)
+TRANS(vreplgr2vr_b, gvec_dup, 16, MO_8)
+TRANS(vreplgr2vr_h, gvec_dup, 16, MO_16)
+TRANS(vreplgr2vr_w, gvec_dup, 16, MO_32)
+TRANS(vreplgr2vr_d, gvec_dup, 16, MO_64)
 
 static bool trans_vreplvei_b(DisasContext *ctx, arg_vv_i *a)
 {
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index bcc18fb6c5..04bd238995 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1310,3 +1310,8 @@ xvsub_h  0111 0100 11001 . . . @vvv
 xvsub_w  0111 0100 11010 . . .@vvv
 xvsub_d  0111 0100 11011 . . .@vvv
 xvsub_q  0111 01010010 11011 . . .@vvv
+
+xvreplgr2vr_b0111 01101001 0 0 . .@vr
+xvreplgr2vr_h0111 01101001 0 1 . .@vr
+xvreplgr2vr_w0111 01101001 0 00010 . .@vr
+xvreplgr2vr_d0111 01101001 0 00011 . .@vr
-- 
2.39.1

[PATCH v3 24/47] target/loongarch: Implement xvldi

2023-07-14 Thread Song Gao

This patch includes:
- XVLDI.

Signed-off-by: Song Gao 
Reviewed-by: Richard Henderson 
---
 target/loongarch/disas.c | 7 +++
 target/loongarch/insn_trans/trans_lasx.c.inc | 2 ++
 target/loongarch/insn_trans/trans_lsx.c.inc  | 6 --
 target/loongarch/insns.decode| 2 ++
 4 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 1a11153343..8fa2edf007 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1703,6 +1703,11 @@ static bool trans_##insn(DisasContext *ctx, arg_##type * a) \
 return true;\
 }
 
+static void output_v_i_x(DisasContext *ctx, arg_v_i *a, const char *mnemonic)
+{
+output(ctx, mnemonic, "x%d, 0x%x", a->vd, a->imm);
+}
+
 static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemonic)
 {
 output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk);
@@ -2022,6 +2027,8 @@ INSN_LASX(xvsigncov_h,   vvv)
 INSN_LASX(xvsigncov_w,   vvv)
 INSN_LASX(xvsigncov_d,   vvv)
 
+INSN_LASX(xvldi, v_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 0a68d9ae61..5e130f9c2e 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -347,6 +347,8 @@ TRANS(xvmskltz_d, gen_vv, 32, gen_helper_vmskltz_d)
 TRANS(xvmskgez_b, gen_vv, 32, gen_helper_vmskgez_b)
 TRANS(xvmsknz_b, gen_vv, 32, gen_helper_vmsknz_b)
 
+TRANS(xvldi, do_vldi, 32)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index 08818c08ca..34811720cf 100644
--- a/target/loongarch/insn_trans/trans_lsx.c.inc
+++ b/target/loongarch/insn_trans/trans_lsx.c.inc
@@ -3064,7 +3064,7 @@ static uint64_t vldi_get_value(DisasContext *ctx, uint32_t imm)
 return data;
 }
 
-static bool trans_vldi(DisasContext *ctx, arg_vldi *a)
+static bool do_vldi(DisasContext *ctx, arg_vldi *a, uint32_t oprsz)
 {
 int sel, vece;
 uint64_t value;
@@ -3080,11 +3080,13 @@ static bool trans_vldi(DisasContext *ctx, arg_vldi *a)
 vece = (a->imm >> 10) & 0x3;
 }
 
-tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->vd), 16, ctx->vl/8,
+tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->vd), oprsz, ctx->vl / 8,
  tcg_constant_i64(value));
 return true;
 }
 
+TRANS(vldi, do_vldi, 16)
+
 TRANS(vand_v, gvec_vvv, 16, MO_64, tcg_gen_gvec_and)
 TRANS(vor_v, gvec_vvv, 16, MO_64, tcg_gen_gvec_or)
 TRANS(vxor_v, gvec_vvv, 16, MO_64, tcg_gen_gvec_xor)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 6a161d6d20..edaa756395 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1605,6 +1605,8 @@ xvmskltz_d   0111 01101001 11000 10011 . . @vv
 xvmskgez_b   0111 01101001 11000 10100 . .@vv
 xvmsknz_b0111 01101001 11000 11000 . .@vv
 
+xvldi0111 0110 00 . . @v_i13
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
-- 
2.39.1

[PATCH v3 12/47] target/loongarch: Implement xvavg/xvavgr

2023-07-14 Thread Song Gao

This patch includes:
- XVAVG.{B/H/W/D}[U];
- XVAVGR.{B/H/W/D}[U].

Signed-off-by: Song Gao 
---
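Note (not part of the patch): the DO_VAVG and DO_VAVGR macros added to vec.h
in the diff compute an average without forming a + b, so the intermediate
value cannot overflow the element type. The standalone check below compares
the same expressions against a wider reference for every int8_t pair. The
function names are hypothetical, and the check assumes that >> on a negative
int is an arithmetic shift, which is implementation-defined in C but holds
on the usual compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as DO_VAVG: floor((a + b) / 2). */
static int avg_floor(int a, int b)
{
    return (a >> 1) + (b >> 1) + (a & b & 1);
}

/* Same expression as DO_VAVGR: floor((a + b + 1) / 2). */
static int avg_round(int a, int b)
{
    return (a >> 1) + (b >> 1) + ((a | b) & 1);
}

/* Reference floor(n / 2) that avoids shifting negative values. */
static int floor_half(int n)
{
    return n >= 0 ? n / 2 : -((-n + 1) / 2);
}

/* Exhaustively compare both forms against the reference for int8_t. */
static void check_avg_i8(void)
{
    for (int a = INT8_MIN; a <= INT8_MAX; a++) {
        for (int b = INT8_MIN; b <= INT8_MAX; b++) {
            assert(avg_floor(a, b) == floor_half(a + b));
            assert(avg_round(a, b) == floor_half(a + b + 1));
        }
    }
}
```

The identity follows from writing a = 2*(a >> 1) + (a & 1) and likewise for
b: the two low bits contribute a carry of (a & b & 1) when rounding down and
((a | b) & 1) when rounding up.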
 target/loongarch/disas.c | 17 +
 target/loongarch/insn_trans/trans_lasx.c.inc | 17 +
 target/loongarch/insns.decode| 17 +
 target/loongarch/vec.h   |  3 +++
 target/loongarch/vec_helper.c| 25 ++--
 5 files changed, 66 insertions(+), 13 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 6972e33833..8296aafa98 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1825,6 +1825,23 @@ INSN_LASX(xvaddwod_w_hu_h,   vvv)
 INSN_LASX(xvaddwod_d_wu_w,   vvv)
 INSN_LASX(xvaddwod_q_du_d,   vvv)
 
+INSN_LASX(xvavg_b,   vvv)
+INSN_LASX(xvavg_h,   vvv)
+INSN_LASX(xvavg_w,   vvv)
+INSN_LASX(xvavg_d,   vvv)
+INSN_LASX(xvavg_bu,  vvv)
+INSN_LASX(xvavg_hu,  vvv)
+INSN_LASX(xvavg_wu,  vvv)
+INSN_LASX(xvavg_du,  vvv)
+INSN_LASX(xvavgr_b,  vvv)
+INSN_LASX(xvavgr_h,  vvv)
+INSN_LASX(xvavgr_w,  vvv)
+INSN_LASX(xvavgr_d,  vvv)
+INSN_LASX(xvavgr_bu, vvv)
+INSN_LASX(xvavgr_hu, vvv)
+INSN_LASX(xvavgr_wu, vvv)
+INSN_LASX(xvavgr_du, vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index d8230cba9f..ac4cade845 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -140,6 +140,23 @@ TRANS(xvaddwod_w_hu_h, gvec_vvv, 32, MO_16, do_vaddwod_u_s)
 TRANS(xvaddwod_d_wu_w, gvec_vvv, 32, MO_32, do_vaddwod_u_s)
 TRANS(xvaddwod_q_du_d, gvec_vvv, 32, MO_64, do_vaddwod_u_s)
 
+TRANS(xvavg_b, gvec_vvv, 32, MO_8, do_vavg_s)
+TRANS(xvavg_h, gvec_vvv, 32, MO_16, do_vavg_s)
+TRANS(xvavg_w, gvec_vvv, 32, MO_32, do_vavg_s)
+TRANS(xvavg_d, gvec_vvv, 32, MO_64, do_vavg_s)
+TRANS(xvavg_bu, gvec_vvv, 32, MO_8, do_vavg_u)
+TRANS(xvavg_hu, gvec_vvv, 32, MO_16, do_vavg_u)
+TRANS(xvavg_wu, gvec_vvv, 32, MO_32, do_vavg_u)
+TRANS(xvavg_du, gvec_vvv, 32, MO_64, do_vavg_u)
+TRANS(xvavgr_b, gvec_vvv, 32, MO_8, do_vavgr_s)
+TRANS(xvavgr_h, gvec_vvv, 32, MO_16, do_vavgr_s)
+TRANS(xvavgr_w, gvec_vvv, 32, MO_32, do_vavgr_s)
+TRANS(xvavgr_d, gvec_vvv, 32, MO_64, do_vavgr_s)
+TRANS(xvavgr_bu, gvec_vvv, 32, MO_8, do_vavgr_u)
+TRANS(xvavgr_hu, gvec_vvv, 32, MO_16, do_vavgr_u)
+TRANS(xvavgr_wu, gvec_vvv, 32, MO_32, do_vavgr_u)
+TRANS(xvavgr_du, gvec_vvv, 32, MO_64, do_vavgr_u)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index e1d8b30179..a2cb39750d 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1406,6 +1406,23 @@ xvaddwod_w_hu_h  0111 01000100 1 . . . @vvv
 xvaddwod_d_wu_w  0111 01000100 00010 . . .@vvv
 xvaddwod_q_du_d  0111 01000100 00011 . . .@vvv
 
+xvavg_b  0111 01000110 01000 . . .@vvv
+xvavg_h  0111 01000110 01001 . . .@vvv
+xvavg_w  0111 01000110 01010 . . .@vvv
+xvavg_d  0111 01000110 01011 . . .@vvv
+xvavg_bu 0111 01000110 01100 . . .@vvv
+xvavg_hu 0111 01000110 01101 . . .@vvv
+xvavg_wu 0111 01000110 01110 . . .@vvv
+xvavg_du 0111 01000110 0 . . .@vvv
+xvavgr_b 0111 01000110 1 . . .@vvv
+xvavgr_h 0111 01000110 10001 . . .@vvv
+xvavgr_w 0111 01000110 10010 . . .@vvv
+xvavgr_d 0111 01000110 10011 . . .@vvv
+xvavgr_bu0111 01000110 10100 . . .@vvv
+xvavgr_hu0111 01000110 10101 . . .@vvv
+xvavgr_wu0111 01000110 10110 . . .@vvv
+xvavgr_du0111 01000110 10111 . . .@vvv
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index 5332dff83c..6ac6b22f20 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -50,4 +50,7 @@
 #define DO_ADD(a, b)  (a + b)
 #define DO_SUB(a, b)  (a - b)
 
+#define DO_VAVG(a, b)  ((a >> 1) + (b >> 1) + (a & b & 1))
+#define DO_VAVGR(a, b) ((a >> 1) + (b >> 1) + ((a | b) & 1))
+
 #endif /* LOONGARCH_VEC_H */
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 0127f8ba0b..2fa8b68e72 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -344,19 +344
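
The two averaging macros added to vec.h above avoid widening the operands: each input is halved first, then a correction bit is added back. A minimal sketch of how they behave on 8-bit lanes follows. The `SK_`/`sk_` names are hypothetical and exist only for this illustration, and the signed cases assume that `>>` on a negative value is an arithmetic shift, as it is with GCC and Clang.

```c
#include <stdint.h>

/* Same expressions as DO_VAVG/DO_VAVGR, with the arguments parenthesized. */
#define SK_VAVG(a, b)  (((a) >> 1) + ((b) >> 1) + ((a) & (b) & 1))
#define SK_VAVGR(a, b) (((a) >> 1) + ((b) >> 1) + (((a) | (b)) & 1))

/* Truncating average: floor((a + b) / 2) without overflowing the lane. */
int8_t sk_avg_s8(int8_t a, int8_t b)
{
    return (int8_t)SK_VAVG(a, b);
}

uint8_t sk_avg_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)SK_VAVG(a, b);
}

/* Rounding average: floor((a + b + 1) / 2). */
int8_t sk_avgr_s8(int8_t a, int8_t b)
{
    return (int8_t)SK_VAVGR(a, b);
}

uint8_t sk_avgr_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)SK_VAVGR(a, b);
}
```

The correction term is `a & b & 1` for the truncating form (both halves dropped a bit) and `(a | b) & 1` for the rounding form (at least one half dropped a bit).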

[PATCH v3 01/47] target/loongarch: Add LASX data support

2023-07-14 Thread Song Gao

Signed-off-by: Song Gao 
Reviewed-by: Richard Henderson 
---
 linux-user/loongarch64/signal.c |  1 +
 target/loongarch/cpu.c  |  1 +
 target/loongarch/cpu.h  | 24 --
 target/loongarch/gdbstub.c  |  1 +
 target/loongarch/internals.h| 22 
 target/loongarch/lsx_helper.c   |  1 +
 target/loongarch/machine.c  | 36 -
 target/loongarch/vec.h  | 33 ++
 8 files changed, 85 insertions(+), 34 deletions(-)
 create mode 100644 target/loongarch/vec.h

diff --git a/linux-user/loongarch64/signal.c b/linux-user/loongarch64/signal.c
index bb8efb1172..39572c1190 100644
--- a/linux-user/loongarch64/signal.c
+++ b/linux-user/loongarch64/signal.c
@@ -12,6 +12,7 @@
 #include "linux-user/trace.h"
 
 #include "target/loongarch/internals.h"
+#include "target/loongarch/vec.h"
 
 /* FP context was used */
 #define SC_USED_FP  (1 << 0)
diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
index ad93ecac92..5037cfc02c 100644
--- a/target/loongarch/cpu.c
+++ b/target/loongarch/cpu.c
@@ -18,6 +18,7 @@
 #include "cpu-csr.h"
 #include "sysemu/reset.h"
 #include "tcg/tcg.h"
+#include "vec.h"
 
 const char * const regnames[32] = {
 "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h
index ed04027af1..c39c261bc4 100644
--- a/target/loongarch/cpu.h
+++ b/target/loongarch/cpu.h
@@ -246,18 +246,20 @@ FIELD(TLB_MISC, ASID, 1, 10)
 FIELD(TLB_MISC, VPPN, 13, 35)
 FIELD(TLB_MISC, PS, 48, 6)
 
-#define LSX_LEN   (128)
+#define LSX_LEN    (128)
+#define LASX_LEN   (256)
+
 typedef union VReg {
-int8_t   B[LSX_LEN / 8];
-int16_t  H[LSX_LEN / 16];
-int32_t  W[LSX_LEN / 32];
-int64_t  D[LSX_LEN / 64];
-uint8_t  UB[LSX_LEN / 8];
-uint16_t UH[LSX_LEN / 16];
-uint32_t UW[LSX_LEN / 32];
-uint64_t UD[LSX_LEN / 64];
-Int128   Q[LSX_LEN / 128];
-}VReg;
+int8_t   B[LASX_LEN / 8];
+int16_t  H[LASX_LEN / 16];
+int32_t  W[LASX_LEN / 32];
+int64_t  D[LASX_LEN / 64];
+uint8_t  UB[LASX_LEN / 8];
+uint16_t UH[LASX_LEN / 16];
+uint32_t UW[LASX_LEN / 32];
+uint64_t UD[LASX_LEN / 64];
+Int128   Q[LASX_LEN / 128];
+} VReg;
 
 typedef union fpr_t fpr_t;
 union fpr_t {
diff --git a/target/loongarch/gdbstub.c b/target/loongarch/gdbstub.c
index 0752fff924..94c427f4da 100644
--- a/target/loongarch/gdbstub.c
+++ b/target/loongarch/gdbstub.c
@@ -11,6 +11,7 @@
 #include "internals.h"
 #include "exec/gdbstub.h"
 #include "gdbstub/helpers.h"
+#include "vec.h"
 
 uint64_t read_fcc(CPULoongArchState *env)
 {
diff --git a/target/loongarch/internals.h b/target/loongarch/internals.h
index 7b0f29c942..c492863cc5 100644
--- a/target/loongarch/internals.h
+++ b/target/loongarch/internals.h
@@ -21,28 +21,6 @@
 /* Global bit for huge page */
 #define LOONGARCH_HGLOBAL_SHIFT 12
 
-#if  HOST_BIG_ENDIAN
-#define B(x)  B[15 - (x)]
-#define H(x)  H[7 - (x)]
-#define W(x)  W[3 - (x)]
-#define D(x)  D[1 - (x)]
-#define UB(x) UB[15 - (x)]
-#define UH(x) UH[7 - (x)]
-#define UW(x) UW[3 - (x)]
-#define UD(x) UD[1 -(x)]
-#define Q(x)  Q[x]
-#else
-#define B(x)  B[x]
-#define H(x)  H[x]
-#define W(x)  W[x]
-#define D(x)  D[x]
-#define UB(x) UB[x]
-#define UH(x) UH[x]
-#define UW(x) UW[x]
-#define UD(x) UD[x]
-#define Q(x)  Q[x]
-#endif
-
 void loongarch_translate_init(void);
 
 void loongarch_cpu_dump_state(CPUState *cpu, FILE *f, int flags);
diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c
index 9571f0aef0..b231a2798b 100644
--- a/target/loongarch/lsx_helper.c
+++ b/target/loongarch/lsx_helper.c
@@ -12,6 +12,7 @@
 #include "fpu/softfloat.h"
 #include "internals.h"
 #include "tcg/tcg.h"
+#include "vec.h"
 
 #define DO_ADD(a, b)  (a + b)
 #define DO_SUB(a, b)  (a - b)
diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
index d8ac99c9a4..1c4e01d076 100644
--- a/target/loongarch/machine.c
+++ b/target/loongarch/machine.c
@@ -8,7 +8,7 @@
 #include "qemu/osdep.h"
 #include "cpu.h"
 #include "migration/cpu.h"
-#include "internals.h"
+#include "vec.h"
 
 static const VMStateDescription vmstate_fpu_reg = {
 .name = "fpu_reg",
@@ -76,6 +76,39 @@ static const VMStateDescription vmstate_lsx = {
 },
 };
 
+static const VMStateDescription vmstate_lasxh_reg = {
+.name = "lasxh_reg",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT64(UD(2), VReg),
+VMSTATE_UINT64(UD(3), VReg),
+VMSTATE_END_OF_LIST()
+}
+};
+
+#define VMSTATE_LASXH_REGS(_field, _state, _start)  \
+VMSTATE_STRUCT_SUB_ARRAY(_field, _state, _start, 32, 0, \
+ vmstate_lasxh_reg, fpr_t)
+
+static bool lasx_needed(void *opaque)
+{
+LoongArchCPU *cpu = opaque;
+
+return FIELD_EX64(cpu->env.cpucfg[2], CPUCFG2, LASX);
+}
+
+static const VMStateDescription vmstate_lasx = {
+.name = "cpu/lasx",
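
The patch above widens the VReg union from 128 to 256 bits and migrates only the upper two 64-bit lanes (UD(2) and UD(3)) in the new "lasxh_reg" section. The following is a rough sketch of that layout, not the QEMU definition: the `VRegSketch` type and `sk_save_lasx_high()` helper are hypothetical names, the Int128 member is left out, and host byte order is ignored.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_LSX_LEN   128
#define SK_LASX_LEN  256

/* All views alias the same 32 bytes of storage. */
typedef union VRegSketch {
    int8_t   B[SK_LASX_LEN / 8];
    int16_t  H[SK_LASX_LEN / 16];
    int32_t  W[SK_LASX_LEN / 32];
    int64_t  D[SK_LASX_LEN / 64];
    uint64_t UD[SK_LASX_LEN / 64];
} VRegSketch;

/* Number of 64-bit lanes that belong to the low (LSX) half. */
enum { SK_LSX_D_LANES = SK_LSX_LEN / 64 };

/* Copy only the upper 128 bits, i.e. lanes UD[2] and UD[3]. */
void sk_save_lasx_high(const VRegSketch *r, uint64_t out[2])
{
    out[0] = r->UD[SK_LSX_D_LANES + 0];
    out[1] = r->UD[SK_LSX_D_LANES + 1];
}
```

Splitting the state this way means the existing LSX section can keep describing lanes 0 and 1 unchanged.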

[PATCH v3 16/47] target/loongarch: Implement xvmul/xvmuh/xvmulw{ev/od}

2023-07-14 Thread Song Gao

This patch includes:
- XVMUL.{B/H/W/D};
- XVMUH.{B/H/W/D}[U];
- XVMULW{EV/OD}.{H.B/W.H/D.W/Q.D}[U];
- XVMULW{EV/OD}.{H.BU.B/W.HU.H/D.WU.W/Q.DU.D}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c | 38 +
 target/loongarch/insn_trans/trans_lasx.c.inc | 42 +++
 target/loongarch/insn_trans/trans_lsx.c.inc  | 56 ++-
 target/loongarch/insns.decode| 38 +
 target/loongarch/vec.h   |  2 +
 target/loongarch/vec_helper.c| 57 ++--
 6 files changed, 180 insertions(+), 53 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 63c1dc757f..e5f9a6bcdf 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1890,6 +1890,44 @@ INSN_LASX(xvmini_hu, vv_i)
 INSN_LASX(xvmini_wu, vv_i)
 INSN_LASX(xvmini_du, vv_i)
 
+INSN_LASX(xvmul_b,   vvv)
+INSN_LASX(xvmul_h,   vvv)
+INSN_LASX(xvmul_w,   vvv)
+INSN_LASX(xvmul_d,   vvv)
+INSN_LASX(xvmuh_b,   vvv)
+INSN_LASX(xvmuh_h,   vvv)
+INSN_LASX(xvmuh_w,   vvv)
+INSN_LASX(xvmuh_d,   vvv)
+INSN_LASX(xvmuh_bu,  vvv)
+INSN_LASX(xvmuh_hu,  vvv)
+INSN_LASX(xvmuh_wu,  vvv)
+INSN_LASX(xvmuh_du,  vvv)
+
+INSN_LASX(xvmulwev_h_b,  vvv)
+INSN_LASX(xvmulwev_w_h,  vvv)
+INSN_LASX(xvmulwev_d_w,  vvv)
+INSN_LASX(xvmulwev_q_d,  vvv)
+INSN_LASX(xvmulwod_h_b,  vvv)
+INSN_LASX(xvmulwod_w_h,  vvv)
+INSN_LASX(xvmulwod_d_w,  vvv)
+INSN_LASX(xvmulwod_q_d,  vvv)
+INSN_LASX(xvmulwev_h_bu, vvv)
+INSN_LASX(xvmulwev_w_hu, vvv)
+INSN_LASX(xvmulwev_d_wu, vvv)
+INSN_LASX(xvmulwev_q_du, vvv)
+INSN_LASX(xvmulwod_h_bu, vvv)
+INSN_LASX(xvmulwod_w_hu, vvv)
+INSN_LASX(xvmulwod_d_wu, vvv)
+INSN_LASX(xvmulwod_q_du, vvv)
+INSN_LASX(xvmulwev_h_bu_b,   vvv)
+INSN_LASX(xvmulwev_w_hu_h,   vvv)
+INSN_LASX(xvmulwev_d_wu_w,   vvv)
+INSN_LASX(xvmulwev_q_du_d,   vvv)
+INSN_LASX(xvmulwod_h_bu_b,   vvv)
+INSN_LASX(xvmulwod_w_hu_h,   vvv)
+INSN_LASX(xvmulwod_d_wu_w,   vvv)
+INSN_LASX(xvmulwod_q_du_d,   vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 107c75f1b6..5fffe4e60c 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -207,6 +207,48 @@ TRANS(xvmaxi_hu, gvec_vv_i, 32, MO_16, do_vmaxi_u)
 TRANS(xvmaxi_wu, gvec_vv_i, 32, MO_32, do_vmaxi_u)
 TRANS(xvmaxi_du, gvec_vv_i, 32, MO_64, do_vmaxi_u)
 
+TRANS(xvmul_b, gvec_vvv, 32, MO_8, tcg_gen_gvec_mul)
+TRANS(xvmul_h, gvec_vvv, 32, MO_16, tcg_gen_gvec_mul)
+TRANS(xvmul_w, gvec_vvv, 32, MO_32, tcg_gen_gvec_mul)
+TRANS(xvmul_d, gvec_vvv, 32, MO_64, tcg_gen_gvec_mul)
+TRANS(xvmuh_b, gvec_vvv, 32, MO_8, do_vmuh_s)
+TRANS(xvmuh_h, gvec_vvv, 32, MO_16, do_vmuh_s)
+TRANS(xvmuh_w, gvec_vvv, 32, MO_32, do_vmuh_s)
+TRANS(xvmuh_d, gvec_vvv, 32, MO_64, do_vmuh_s)
+TRANS(xvmuh_bu, gvec_vvv, 32, MO_8, do_vmuh_u)
+TRANS(xvmuh_hu, gvec_vvv, 32, MO_16, do_vmuh_u)
+TRANS(xvmuh_wu, gvec_vvv, 32, MO_32, do_vmuh_u)
+TRANS(xvmuh_du, gvec_vvv, 32, MO_64, do_vmuh_u)
+
+TRANS(xvmulwev_h_b, gvec_vvv, 32, MO_8, do_vmulwev_s)
+TRANS(xvmulwev_w_h, gvec_vvv, 32, MO_16, do_vmulwev_s)
+TRANS(xvmulwev_d_w, gvec_vvv, 32, MO_32, do_vmulwev_s)
+
+TRANS(xvmulwev_q_d, gen_vmul_q, 32, 0, 0, tcg_gen_muls2_i64)
+TRANS(xvmulwod_q_d, gen_vmul_q, 32, 1, 1, tcg_gen_muls2_i64)
+TRANS(xvmulwev_q_du, gen_vmul_q, 32, 0, 0, tcg_gen_mulu2_i64)
+TRANS(xvmulwod_q_du, gen_vmul_q, 32, 1, 1, tcg_gen_mulu2_i64)
+TRANS(xvmulwev_q_du_d, gen_vmul_q, 32, 0, 0, tcg_gen_mulus2_i64)
+TRANS(xvmulwod_q_du_d, gen_vmul_q, 32, 1, 1, tcg_gen_mulus2_i64)
+
+TRANS(xvmulwod_h_b, gvec_vvv, 32, MO_8, do_vmulwod_s)
+TRANS(xvmulwod_w_h, gvec_vvv, 32, MO_16, do_vmulwod_s)
+TRANS(xvmulwod_d_w, gvec_vvv, 32, MO_32, do_vmulwod_s)
+
+TRANS(xvmulwev_h_bu, gvec_vvv, 32, MO_8, do_vmulwev_u)
+TRANS(xvmulwev_w_hu, gvec_vvv, 32, MO_16, do_vmulwev_u)
+TRANS(xvmulwev_d_wu, gvec_vvv, 32, MO_32, do_vmulwev_u)
+TRANS(xvmulwod_h_bu, gvec_vvv, 32, MO_8, do_vmulwod_u)
+TRANS(xvmulwod_w_hu, gvec_vvv, 32, MO_16, do_vmulwod_u)
+TRANS(xvmulwod_d_wu, gvec_vvv, 32, MO_32, do_vmulwod_u)
+
+TRANS(xvmulwev_h_bu_b, gvec_vvv, 32, MO_8, do_vmulwev_u_s)
+TRANS(xvmulwev_w_hu_h, gvec_vvv, 32, MO_16, do_vmulwev_u_s)
+TRANS(xvmulwev_d_wu_w, gvec_vvv, 32, MO_32, do_vmulwev_u_s)
+TRANS(xvmulwod_h_bu_b, gvec_vvv, 32, MO_8, do_vmulwod_u_s)
+TRANS(xvmulwod_w_hu_h, gvec_vvv, 32, MO_16, do_vmulwod_u_s)
+TRANS(xvmulwod_d_wu_w, gvec_vvv, 32, MO_32, do_vmulwod_u_s)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index e689e409ce..82051
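
As a reading aid for the instruction names in this patch, here is a scalar sketch of what the signed byte variants are understood to compute: XVMUH keeps the high half of each product, and XVMULWEV/XVMULWOD multiply the even or odd source elements into double-width results. The `sk_` functions are hypothetical models written for this note, not the QEMU helpers. They index elements in guest order and assume an arithmetic right shift for negative values.

```c
#include <stdint.h>

/* High 8 bits of the 16-bit signed product. */
int8_t sk_muh_b(int8_t a, int8_t b)
{
    return (int8_t)(((int16_t)a * b) >> 8);
}

/* Widening multiply of the even-numbered bytes; oprsz is in bytes. */
void sk_mulwev_h_b(int16_t *d, const int8_t *j, const int8_t *k, int oprsz)
{
    for (int i = 0; i < oprsz / 2; i++) {
        d[i] = (int16_t)((int16_t)j[2 * i] * k[2 * i]);
    }
}

/* Widening multiply of the odd-numbered bytes; oprsz is in bytes. */
void sk_mulwod_h_b(int16_t *d, const int8_t *j, const int8_t *k, int oprsz)
{
    for (int i = 0; i < oprsz / 2; i++) {
        d[i] = (int16_t)((int16_t)j[2 * i + 1] * k[2 * i + 1]);
    }
}
```

With oprsz set to 32 the same loops cover a full 256-bit register, which matches the way the TRANS lines above pass 32 where the LSX versions pass 16.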

[PATCH v3 23/47] target/loongarch: Implement xvmskltz/xvmskgez/xvmsknz

2023-07-14 Thread Song Gao

This patch includes:
- XVMSKLTZ.{B/H/W/D};
- XVMSKGEZ.B;
- XVMSKNZ.B.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c |  7 ++
 target/loongarch/insn_trans/trans_lasx.c.inc |  7 ++
 target/loongarch/insns.decode|  7 ++
 target/loongarch/vec_helper.c| 80 ++--
 4 files changed, 76 insertions(+), 25 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 85e0cb7c8d..1a11153343 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2010,6 +2010,13 @@ INSN_LASX(vext2xv_wu_hu, vv)
 INSN_LASX(vext2xv_du_hu, vv)
 INSN_LASX(vext2xv_du_wu, vv)
 
+INSN_LASX(xvmskltz_b,vv)
+INSN_LASX(xvmskltz_h,vv)
+INSN_LASX(xvmskltz_w,vv)
+INSN_LASX(xvmskltz_d,vv)
+INSN_LASX(xvmskgez_b,vv)
+INSN_LASX(xvmsknz_b, vv)
+
 INSN_LASX(xvsigncov_b,   vvv)
 INSN_LASX(xvsigncov_h,   vvv)
 INSN_LASX(xvsigncov_w,   vvv)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 7aab6528a7..0a68d9ae61 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -340,6 +340,13 @@ TRANS(xvsigncov_h, gvec_vvv, 32, MO_16, do_vsigncov)
 TRANS(xvsigncov_w, gvec_vvv, 32, MO_32, do_vsigncov)
 TRANS(xvsigncov_d, gvec_vvv, 32, MO_64, do_vsigncov)
 
+TRANS(xvmskltz_b, gen_vv, 32, gen_helper_vmskltz_b)
+TRANS(xvmskltz_h, gen_vv, 32, gen_helper_vmskltz_h)
+TRANS(xvmskltz_w, gen_vv, 32, gen_helper_vmskltz_w)
+TRANS(xvmskltz_d, gen_vv, 32, gen_helper_vmskltz_d)
+TRANS(xvmskgez_b, gen_vv, 32, gen_helper_vmskgez_b)
+TRANS(xvmsknz_b, gen_vv, 32, gen_helper_vmsknz_b)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 7bbda1a142..6a161d6d20 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1598,6 +1598,13 @@ xvsigncov_h  0111 01010010 11101 . . .  @vvv
 xvsigncov_w  0111 01010010 0 . . .@vvv
 xvsigncov_d  0111 01010010 1 . . .@vvv
 
+xvmskltz_b   0111 01101001 11000 1 . .@vv
+xvmskltz_h   0111 01101001 11000 10001 . .@vv
+xvmskltz_w   0111 01101001 11000 10010 . .@vv
+xvmskltz_d   0111 01101001 11000 10011 . .@vv
+xvmskgez_b   0111 01101001 11000 10100 . .@vv
+xvmsknz_b0111 01101001 11000 11000 . .@vv
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 126b67eea5..9d13b6544c 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -783,14 +783,19 @@ static uint64_t do_vmskltz_b(int64_t val)
 
 void HELPER(vmskltz_b)(void *vd, void *vj, uint32_t desc)
 {
+int i;
 uint16_t temp = 0;
 VReg *Vd = (VReg *)vd;
 VReg *Vj = (VReg *)vj;
+int oprsz = simd_oprsz(desc);
 
-temp = do_vmskltz_b(Vj->D(0));
-temp |= (do_vmskltz_b(Vj->D(1)) << 8);
-Vd->D(0) = temp;
-Vd->D(1) = 0;
+for (i = 0; i < oprsz / 16; i++) {
+temp = 0;
+temp = do_vmskltz_b(Vj->D(2 * i));
+temp |= (do_vmskltz_b(Vj->D(2 * i  + 1)) << 8);
+Vd->D(2 * i) = temp;
+Vd->D(2 * i + 1) = 0;
+}
 }
 
 static uint64_t do_vmskltz_h(int64_t val)
@@ -804,14 +809,19 @@ static uint64_t do_vmskltz_h(int64_t val)
 
 void HELPER(vmskltz_h)(void *vd, void *vj, uint32_t desc)
 {
+int i;
 uint16_t temp = 0;
 VReg *Vd = (VReg *)vd;
 VReg *Vj = (VReg *)vj;
+int oprsz = simd_oprsz(desc);
 
-temp = do_vmskltz_h(Vj->D(0));
-temp |= (do_vmskltz_h(Vj->D(1)) << 4);
-Vd->D(0) = temp;
-Vd->D(1) = 0;
+for (i = 0; i < oprsz / 16; i++) {
+temp = 0;
+temp = do_vmskltz_h(Vj->D(2 * i));
+temp |= (do_vmskltz_h(Vj->D(2 * i + 1)) << 4);
+Vd->D(2 * i) = temp;
+Vd->D(2 * i + 1) = 0;
+}
 }
 
 static uint64_t do_vmskltz_w(int64_t val)
@@ -824,14 +834,19 @@ static uint64_t do_vmskltz_w(int64_t val)
 
 void HELPER(vmskltz_w)(void *vd, void *vj, uint32_t desc)
 {
+int i;
 uint16_t temp = 0;
 VReg *Vd = (VReg *)vd;
 VReg *Vj = (VReg *)vj;
+int oprsz = simd_oprsz(desc);
 
-temp = do_vmskltz_w(Vj->D(0));
-temp |= (do_vmskltz_w(Vj->D(1)) << 2);
-Vd->D(0) = temp;
-Vd->D(1) = 0;
+for (i = 0; i < oprsz / 16; i++) {
+temp = 0;
+temp = do_vmskltz_w(Vj->D(2 * i));
+temp |= (do_vmskltz_w(Vj->D(2 * i + 1)) << 2);
+Vd->D(2 * i) = temp;
+Vd->D(2 * i + 1) = 0;
+}
 }
 
 static uint64_t do_vmskltz_d(int64_t val)
@@ -840,26 +855,36 @@ stat
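
The reworked helpers above run the original 128-bit logic once per 16-byte lane, so a 256-bit operand yields two independent masks. A simplified sketch of the byte variant is shown below. It uses plain arrays in place of VReg, ignores host endianness, and the `sk_` names are hypothetical.

```c
#include <stdint.h>

/* Gather the sign bit of each of the 16 bytes in one 128-bit lane. */
static uint16_t sk_lane_sign_mask(const int8_t *lane)
{
    uint16_t mask = 0;

    for (int i = 0; i < 16; i++) {
        if (lane[i] < 0) {
            mask |= (uint16_t)(1u << i);
        }
    }
    return mask;
}

/* oprsz is the operand size in bytes: 16 for LSX, 32 for LASX. */
void sk_mskltz_b(uint64_t *d, const int8_t *j, int oprsz)
{
    for (int i = 0; i < oprsz / 16; i++) {
        d[2 * i] = sk_lane_sign_mask(j + 16 * i); /* low 64 bits: the mask */
        d[2 * i + 1] = 0;                         /* high 64 bits: cleared */
    }
}
```

Deriving the lane count from oprsz lets a single helper serve both vector widths.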

[PATCH v3 36/47] target/loongarch: Implement xvfrstp

2023-07-14 Thread Song Gao

This patch includes:
- XVFRSTP[I].{B/H}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c |  5 ++
 target/loongarch/insn_trans/trans_lasx.c.inc |  5 ++
 target/loongarch/insns.decode|  5 ++
 target/loongarch/vec_helper.c| 48 
 4 files changed, 43 insertions(+), 20 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index dad9243fd7..27d6252686 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2235,6 +2235,11 @@ INSN_LASX(xvbitrevi_h,   vv_i)
 INSN_LASX(xvbitrevi_w,   vv_i)
 INSN_LASX(xvbitrevi_d,   vv_i)
 
+INSN_LASX(xvfrstp_b, vvv)
+INSN_LASX(xvfrstp_h, vvv)
+INSN_LASX(xvfrstpi_b,vv_i)
+INSN_LASX(xvfrstpi_h,vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index e52c7551d9..081f692416 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -556,6 +556,11 @@ TRANS(xvbitrevi_h, gvec_vv_i, 32, MO_16, do_vbitrevi)
 TRANS(xvbitrevi_w, gvec_vv_i, 32, MO_32, do_vbitrevi)
 TRANS(xvbitrevi_d, gvec_vv_i, 32, MO_64, do_vbitrevi)
 
+TRANS(xvfrstp_b, gen_vvv, 32, gen_helper_vfrstp_b)
+TRANS(xvfrstp_h, gen_vvv, 32, gen_helper_vfrstp_h)
+TRANS(xvfrstpi_b, gen_vv_i, 32, gen_helper_vfrstpi_b)
+TRANS(xvfrstpi_h, gen_vv_i, 32, gen_helper_vfrstpi_h)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index cb6db8002a..6035fe139c 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1811,6 +1811,11 @@ xvbitrevi_h  0111 01110001 1 1  . .  @vv_ui4
 xvbitrevi_w  0111 01110001 10001 . . .@vv_ui5
 xvbitrevi_d  0111 01110001 1001 .. . .@vv_ui6
 
+xvfrstp_b0111 01010010 10110 . . .@vvv
+xvfrstp_h0111 01010010 10111 . . .@vvv
+xvfrstpi_b   0111 01101001 10100 . . .@vv_ui5
+xvfrstpi_h   0111 01101001 10101 . . .@vv_ui5
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 922eac40fb..24286bcef0 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -2251,37 +2251,45 @@ DO_BITI(vbitrevi_d, 64, UD, DO_BITREV)
 #define VFRSTP(NAME, BIT, MASK, E) \
 void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 {  \
-int i, m;  \
+int i, j, m, ofs;  \
 VReg *Vd = (VReg *)vd; \
 VReg *Vj = (VReg *)vj; \
 VReg *Vk = (VReg *)vk; \
+int oprsz = simd_oprsz(desc);  \
\
-for (i = 0; i < LSX_LEN/BIT; i++) {\
-if (Vj->E(i) < 0) {\
-break; \
+ofs = LSX_LEN / BIT;   \
+for (i = 0; i < oprsz / 16; i++) { \
+m = Vk->E(i * ofs) & MASK; \
+for (j = 0; j < ofs; j++) {\
+if (Vj->E(j + ofs * i) < 0) {  \
+break; \
+}  \
 }  \
+Vd->E(m + i * ofs) = j;\
 }  \
-m = Vk->E(0) & MASK;   \
-Vd->E(m) = i;  \
 }
 
 VFRSTP(vfrstp_b, 8, 0xf, B)
 VFRSTP(vfrstp_h, 16, 0x7, H)
 
-#define VFRSTPI(NAME, BIT, E) \
-void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
-{ \
-int i, m; \
-VReg *Vd = (VReg *)vd;\
-VReg *Vj = (VReg *)vj;\
-  \
-for (i = 0; i
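
The new VFRSTP loop above can be read as: for each 128-bit lane, find the index of the first negative element of Vj, then store that index into the element of Vd selected by the low bits of the first element of Vk in the same lane. A plain-array sketch of the byte variant follows. The `sk_frstp_b` name is hypothetical and element indices are taken in guest order.

```c
#include <stdint.h>

/* oprsz is the operand size in bytes: 16 for LSX, 32 for LASX. */
void sk_frstp_b(int8_t *d, const int8_t *j, const int8_t *k, int oprsz)
{
    const int ofs = 16; /* elements per 128-bit lane, LSX_LEN / 8 */

    for (int i = 0; i < oprsz / 16; i++) {
        int m = k[i * ofs] & 0xf; /* destination slot within this lane */
        int n;

        /* n ends up as the first negative index, or ofs if there is none. */
        for (n = 0; n < ofs; n++) {
            if (j[n + ofs * i] < 0) {
                break;
            }
        }
        d[m + i * ofs] = (int8_t)n;
    }
}
```

Only one element of each destination lane is written; the others keep their previous contents.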

[PATCH v3 04/47] target/loongarch: Implement xvadd/xvsub

2023-07-14 Thread Song Gao

This patch includes:
- XVADD.{B/H/W/D/Q};
- XVSUB.{B/H/W/D/Q}.

Signed-off-by: Song Gao 
Reviewed-by: Richard Henderson 
---
 target/loongarch/disas.c |  23 +
 target/loongarch/insn_trans/trans_lasx.c.inc |  52 +-
 target/loongarch/insn_trans/trans_lsx.c.inc  | 511 +--
 target/loongarch/insns.decode|  14 +
 target/loongarch/translate.c |   5 +
 target/loongarch/vec.h   |  17 +
 6 files changed, 351 insertions(+), 271 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 5c402d944d..d8b62ba532 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1695,3 +1695,26 @@ INSN_LSX(vstelm_d, vr_ii)
 INSN_LSX(vstelm_w, vr_ii)
 INSN_LSX(vstelm_h, vr_ii)
 INSN_LSX(vstelm_b, vr_ii)
+
+#define INSN_LASX(insn, type)   \
+static bool trans_##insn(DisasContext *ctx, arg_##type * a) \
+{   \
+output_##type ## _x(ctx, a, #insn); \
+return true;\
+}
+
+static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemonic)
+{
+output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk);
+}
+
+INSN_LASX(xvadd_b,   vvv)
+INSN_LASX(xvadd_h,   vvv)
+INSN_LASX(xvadd_w,   vvv)
+INSN_LASX(xvadd_d,   vvv)
+INSN_LASX(xvadd_q,   vvv)
+INSN_LASX(xvsub_b,   vvv)
+INSN_LASX(xvsub_h,   vvv)
+INSN_LASX(xvsub_w,   vvv)
+INSN_LASX(xvsub_d,   vvv)
+INSN_LASX(xvsub_q,   vvv)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 75a77f5dce..86ba296a73 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -4,13 +4,45 @@
  * Copyright (c) 2023 Loongson Technology Corporation Limited
  */
 
-#ifndef CONFIG_USER_ONLY
-#define CHECK_ASXE do { \
-if ((ctx->base.tb->flags & HW_FLAGS_EUEN_ASXE) == 0) { \
-generate_exception(ctx, EXCCODE_ASXD); \
-return true; \
-} \
-} while (0)
-#else
-#define CHECK_ASXE
-#endif
+TRANS(xvadd_b, gvec_vvv, 32, MO_8, tcg_gen_gvec_add)
+TRANS(xvadd_h, gvec_vvv, 32, MO_16, tcg_gen_gvec_add)
+TRANS(xvadd_w, gvec_vvv, 32, MO_32, tcg_gen_gvec_add)
+TRANS(xvadd_d, gvec_vvv, 32, MO_64, tcg_gen_gvec_add)
+
+#define XVADDSUB_Q(NAME) \
+static bool trans_xv## NAME ##_q(DisasContext *ctx, arg_vvv * a) \
+{\
+TCGv_i64 rh, rl, ah, al, bh, bl; \
+int i;   \
+ \
+CHECK_VEC;   \
+ \
+rh = tcg_temp_new_i64(); \
+rl = tcg_temp_new_i64(); \
+ah = tcg_temp_new_i64(); \
+al = tcg_temp_new_i64(); \
+bh = tcg_temp_new_i64(); \
+bl = tcg_temp_new_i64(); \
+ \
+for (i = 0; i < 2; i++) {\
+get_vreg64(ah, a->vj, 1 + i * 2);\
+get_vreg64(al, a->vj, 0 + i * 2);\
+get_vreg64(bh, a->vk, 1 + i * 2);\
+get_vreg64(bl, a->vk, 0 + i * 2);\
+ \
+tcg_gen_## NAME ##2_i64(rl, rh, al, ah, bl, bh); \
+ \
+set_vreg64(rh, a->vd, 1 + i * 2);\
+set_vreg64(rl, a->vd, 0 + i * 2);\
+   } \
+ \
+return true; \
+}
+
+XVADDSUB_Q(add)
+XVADDSUB_Q(sub)
+
+TRANS(xvsub_b, gvec_vvv, 32, MO_8, tcg_gen_gvec_sub)
+TRANS(xvsub_h, gvec_vvv, 32, MO_16, tcg_gen_gvec_sub)
+TRANS(xvsub_w, gvec_vvv, 32, MO_32, tcg_gen_gvec_sub)
+TRANS(xvsub_d, gvec_vvv, 32, MO_64, tcg_gen_gvec_sub)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index 68779daff6..63061bd4a1 100644
--- a/target/loongarch/insn_trans/trans_lsx.c.inc
+++ b/target/loongarch/insn_trans/trans_lsx.c.inc
@@ -4,17 +4,6 @@
  * Copyright (c) 2022-2023 Loongson Technology Corporation Limited
  */
 
-#ifndef CONFIG_USER_ONLY
-#de
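
The XVADDSUB_Q macro above builds each 128-bit result from two 64-bit halves and loops twice to cover both 128-bit lanes of the 256-bit register. The host-side arithmetic it relies on can be sketched as follows. `SkU128` and the `sk_` functions are hypothetical stand-ins written for this note, not TCG APIs.

```c
#include <stdint.h>

typedef struct {
    uint64_t lo;
    uint64_t hi;
} SkU128;

SkU128 sk_add128(SkU128 a, SkU128 b)
{
    SkU128 r;

    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo); /* carry out of the low half */
    return r;
}

SkU128 sk_sub128(SkU128 a, SkU128 b)
{
    SkU128 r;

    r.lo = a.lo - b.lo;
    r.hi = a.hi - b.hi - (a.lo < b.lo); /* borrow from the high half */
    return r;
}

/* Two independent 128-bit additions, one per lane of a 256-bit register. */
void sk_xvadd_q(uint64_t d[4], const uint64_t j[4], const uint64_t k[4])
{
    for (int i = 0; i < 2; i++) {
        SkU128 a = { j[2 * i], j[2 * i + 1] };
        SkU128 b = { k[2 * i], k[2 * i + 1] };
        SkU128 r = sk_add128(a, b);

        d[2 * i] = r.lo;
        d[2 * i + 1] = r.hi;
    }
}
```

No carry crosses from the lower 128-bit lane into the upper one, which is why the macro can treat the lanes separately.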

[PATCH v3 06/47] target/loongarch: Implement xvaddi/xvsubi

2023-07-14 Thread Song Gao

This patch includes:
- XVADDI.{B/H/W/D}U;
- XVSUBI.{B/H/W/D}U.

Signed-off-by: Song Gao 
Reviewed-by: Richard Henderson 
---
 target/loongarch/disas.c |  14 ++
 target/loongarch/insn_trans/trans_lasx.c.inc |   9 ++
 target/loongarch/insn_trans/trans_lsx.c.inc  | 136 +--
 target/loongarch/insns.decode|   9 ++
 4 files changed, 100 insertions(+), 68 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index c47f455ed0..f59e3cebf0 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1708,6 +1708,11 @@ static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemonic)
 output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk);
 }
 
+static void output_vv_i_x(DisasContext *ctx, arg_vv_i *a, const char *mnemonic)
+{
+output(ctx, mnemonic, "x%d, x%d, 0x%x", a->vd, a->vj, a->imm);
+}
+
 static void output_vr_x(DisasContext *ctx, arg_vr *a, const char *mnemonic)
 {
 output(ctx, mnemonic, "x%d, r%d", a->vd, a->rj);
@@ -1724,6 +1729,15 @@ INSN_LASX(xvsub_w,   vvv)
 INSN_LASX(xvsub_d,   vvv)
 INSN_LASX(xvsub_q,   vvv)
 
+INSN_LASX(xvaddi_bu, vv_i)
+INSN_LASX(xvaddi_hu, vv_i)
+INSN_LASX(xvaddi_wu, vv_i)
+INSN_LASX(xvaddi_du, vv_i)
+INSN_LASX(xvsubi_bu, vv_i)
+INSN_LASX(xvsubi_hu, vv_i)
+INSN_LASX(xvsubi_wu, vv_i)
+INSN_LASX(xvsubi_du, vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 9bbf6c48ec..93932593a5 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -47,6 +47,15 @@ TRANS(xvsub_h, gvec_vvv, 32, MO_16, tcg_gen_gvec_sub)
 TRANS(xvsub_w, gvec_vvv, 32, MO_32, tcg_gen_gvec_sub)
 TRANS(xvsub_d, gvec_vvv, 32, MO_64, tcg_gen_gvec_sub)
 
+TRANS(xvaddi_bu, gvec_vv_i, 32, MO_8, tcg_gen_gvec_addi)
+TRANS(xvaddi_hu, gvec_vv_i, 32, MO_16, tcg_gen_gvec_addi)
+TRANS(xvaddi_wu, gvec_vv_i, 32, MO_32, tcg_gen_gvec_addi)
+TRANS(xvaddi_du, gvec_vv_i, 32, MO_64, tcg_gen_gvec_addi)
+TRANS(xvsubi_bu, gvec_subi, 32, MO_8)
+TRANS(xvsubi_hu, gvec_subi, 32, MO_16)
+TRANS(xvsubi_wu, gvec_subi, 32, MO_32)
+TRANS(xvsubi_du, gvec_subi, 32, MO_64)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index 4667dba4b4..b95a2dffda 100644
--- a/target/loongarch/insn_trans/trans_lsx.c.inc
+++ b/target/loongarch/insn_trans/trans_lsx.c.inc
@@ -96,7 +96,7 @@ static bool gvec_vv(DisasContext *ctx, arg_vv *a, MemOp mop,
 return true;
 }
 
-static bool gvec_vv_i(DisasContext *ctx, arg_vv_i *a, MemOp mop,
+static bool gvec_vv_i(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz, MemOp mop,
   void (*func)(unsigned, uint32_t, uint32_t,
int64_t, uint32_t, uint32_t))
 {
@@ -107,11 +107,11 @@ static bool gvec_vv_i(DisasContext *ctx, arg_vv_i *a, MemOp mop,
 vd_ofs = vec_full_offset(a->vd);
 vj_ofs = vec_full_offset(a->vj);
 
-func(mop, vd_ofs, vj_ofs, a->imm , 16, ctx->vl/8);
+func(mop, vd_ofs, vj_ofs, a->imm, oprsz, ctx->vl / 8);
 return true;
 }
 
-static bool gvec_subi(DisasContext *ctx, arg_vv_i *a, MemOp mop)
+static bool gvec_subi(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz, MemOp mop)
 {
 uint32_t vd_ofs, vj_ofs;
 
@@ -120,7 +120,7 @@ static bool gvec_subi(DisasContext *ctx, arg_vv_i *a, MemOp mop)
 vd_ofs = vec_full_offset(a->vd);
 vj_ofs = vec_full_offset(a->vj);
 
-tcg_gen_gvec_addi(mop, vd_ofs, vj_ofs, -a->imm, 16, ctx->vl/8);
+tcg_gen_gvec_addi(mop, vd_ofs, vj_ofs, -a->imm, oprsz, ctx->vl / 8);
 return true;
 }
 
@@ -164,14 +164,14 @@ TRANS(vsub_h, gvec_vvv, 16, MO_16, tcg_gen_gvec_sub)
 TRANS(vsub_w, gvec_vvv, 16, MO_32, tcg_gen_gvec_sub)
 TRANS(vsub_d, gvec_vvv, 16, MO_64, tcg_gen_gvec_sub)
 
-TRANS(vaddi_bu, gvec_vv_i, MO_8, tcg_gen_gvec_addi)
-TRANS(vaddi_hu, gvec_vv_i, MO_16, tcg_gen_gvec_addi)
-TRANS(vaddi_wu, gvec_vv_i, MO_32, tcg_gen_gvec_addi)
-TRANS(vaddi_du, gvec_vv_i, MO_64, tcg_gen_gvec_addi)
-TRANS(vsubi_bu, gvec_subi, MO_8)
-TRANS(vsubi_hu, gvec_subi, MO_16)
-TRANS(vsubi_wu, gvec_subi, MO_32)
-TRANS(vsubi_du, gvec_subi, MO_64)
+TRANS(vaddi_bu, gvec_vv_i, 16, MO_8, tcg_gen_gvec_addi)
+TRANS(vaddi_hu, gvec_vv_i, 16, MO_16, tcg_gen_gvec_addi)
+TRANS(vaddi_wu, gvec_vv_i, 16, MO_32, tcg_gen_gvec_addi)
+TRANS(vaddi_du, gvec_vv_i, 16, MO_64, tcg_gen_gvec_addi)
+TRANS(vsubi_bu, gvec_subi, 16, MO_8)
+TRANS(vsubi_hu, gvec_subi, 16, MO_16)
+TRANS(vsubi_wu, gvec_subi, 16, MO_32)
+TRANS(vsubi_du, gvec_subi, 16, MO_64)
 
 TRANS(vneg_b, gvec_vv, MO_8, tcg_gen_gvec_neg)
 TRANS(vneg_h, gvec_vv, MO_16, tcg_gen_gvec_neg)
@@ -1462,14
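
Two points in the hunks above can be illustrated with a scalar model: gvec_subi is implemented as an add of the negated immediate, and the new oprsz argument decides how many bytes of the register are processed (16 for LSX, 32 for LASX). The sketch below uses hypothetical `sk_` functions on plain byte arrays and is not the TCG gvec API.

```c
#include <stdint.h>

/* Add imm to each 8-bit lane, wrapping modulo 256; oprsz is in bytes. */
void sk_addi_b(uint8_t *d, const uint8_t *j, int64_t imm, uint32_t oprsz)
{
    for (uint32_t i = 0; i < oprsz; i++) {
        d[i] = (uint8_t)(j[i] + imm);
    }
}

/* Subtract-immediate expressed as add-immediate with the sign flipped. */
void sk_subi_b(uint8_t *d, const uint8_t *j, int64_t imm, uint32_t oprsz)
{
    sk_addi_b(d, j, -imm, oprsz);
}
```

Calling `sk_subi_b()` with oprsz 16 leaves bytes 16 to 31 of the destination untouched, while oprsz 32 processes the whole 256-bit value.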

[PATCH v3 03/47] target/loongarch: Add CHECK_ASXE macro to check LASX enable

2023-07-14 Thread Song Gao

Reviewed-by: Richard Henderson 
Signed-off-by: Song Gao 
---
 target/loongarch/cpu.c   |  2 ++
 target/loongarch/cpu.h   |  2 ++
 target/loongarch/insn_trans/trans_lasx.c.inc | 10 ++
 3 files changed, 14 insertions(+)

diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
index 5037cfc02c..c9f9cbb19d 100644
--- a/target/loongarch/cpu.c
+++ b/target/loongarch/cpu.c
@@ -54,6 +54,7 @@ static const char * const excp_names[] = {
 [EXCCODE_DBP] = "Debug breakpoint",
 [EXCCODE_BCE] = "Bound Check Exception",
 [EXCCODE_SXD] = "128 bit vector instructions Disable exception",
+[EXCCODE_ASXD] = "256 bit vector instructions Disable exception",
 };
 
 const char *loongarch_exception_name(int32_t exception)
@@ -189,6 +190,7 @@ static void loongarch_cpu_do_interrupt(CPUState *cs)
 case EXCCODE_FPD:
 case EXCCODE_FPE:
 case EXCCODE_SXD:
+case EXCCODE_ASXD:
 env->CSR_BADV = env->pc;
 QEMU_FALLTHROUGH;
 case EXCCODE_BCE:
diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h
index c39c261bc4..1137d6cb58 100644
--- a/target/loongarch/cpu.h
+++ b/target/loongarch/cpu.h
@@ -428,6 +428,7 @@ static inline int cpu_mmu_index(CPULoongArchState *env, bool ifetch)
 #define HW_FLAGS_CRMD_PG    R_CSR_CRMD_PG_MASK   /* 0x10 */
 #define HW_FLAGS_EUEN_FPE   0x04
 #define HW_FLAGS_EUEN_SXE   0x08
+#define HW_FLAGS_EUEN_ASXE  0x10
 
 static inline void cpu_get_tb_cpu_state(CPULoongArchState *env, vaddr *pc,
 uint64_t *cs_base, uint32_t *flags)
@@ -437,6 +438,7 @@ static inline void cpu_get_tb_cpu_state(CPULoongArchState *env, vaddr *pc,
 *flags = env->CSR_CRMD & (R_CSR_CRMD_PLV_MASK | R_CSR_CRMD_PG_MASK);
 *flags |= FIELD_EX64(env->CSR_EUEN, CSR_EUEN, FPE) * HW_FLAGS_EUEN_FPE;
 *flags |= FIELD_EX64(env->CSR_EUEN, CSR_EUEN, SXE) * HW_FLAGS_EUEN_SXE;
+*flags |= FIELD_EX64(env->CSR_EUEN, CSR_EUEN, ASXE) * HW_FLAGS_EUEN_ASXE;
 }
 
 void loongarch_cpu_list(void);
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 56a9839255..75a77f5dce 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -4,3 +4,13 @@
  * Copyright (c) 2023 Loongson Technology Corporation Limited
  */
 
+#ifndef CONFIG_USER_ONLY
+#define CHECK_ASXE do { \
+if ((ctx->base.tb->flags & HW_FLAGS_EUEN_ASXE) == 0) { \
+generate_exception(ctx, EXCCODE_ASXD); \
+return true; \
+} \
+} while (0)
+#else
+#define CHECK_ASXE
+#endif
-- 
2.39.1
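
The mechanism in this patch is that cpu_get_tb_cpu_state() folds the EUEN enable bits into the translation-block flags, and CHECK_ASXE tests the LASX bit at translate time to decide whether to raise EXCCODE_ASXD. The sketch below models that check in isolation. The `SK_`/`sk_` names and the bit values are hypothetical; in particular the ASXE bit here is an arbitrary value chosen so that it does not overlap the CRMD fields that share the same flags word.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag layout for this sketch only. */
#define SK_FLAGS_EUEN_FPE   0x04u
#define SK_FLAGS_EUEN_SXE   0x08u
#define SK_FLAGS_CRMD_PG    0x10u
#define SK_FLAGS_EUEN_ASXE  0x40u

/* Fold the individual enable bits into one flags word. */
uint32_t sk_pack_flags(bool pg, bool fpe, bool sxe, bool asxe)
{
    uint32_t flags = 0;

    flags |= pg   ? SK_FLAGS_CRMD_PG   : 0;
    flags |= fpe  ? SK_FLAGS_EUEN_FPE  : 0;
    flags |= sxe  ? SK_FLAGS_EUEN_SXE  : 0;
    flags |= asxe ? SK_FLAGS_EUEN_ASXE : 0;
    return flags;
}

/* Translate-time test: false means the instruction must raise an exception. */
bool sk_lasx_allowed(uint32_t flags)
{
    return (flags & SK_FLAGS_EUEN_ASXE) != 0;
}
```

Because every field has its own bit, enabling paging alone does not make `sk_lasx_allowed()` return true.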

[PATCH v3 18/47] target/loongarch: Implement xvdiv/xvmod

2023-07-14 Thread Song Gao

This patch includes:
- XVDIV.{B/H/W/D}[U];
- XVMOD.{B/H/W/D}[U].

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c | 17 +
 target/loongarch/insn_trans/trans_lasx.c.inc | 17 +
 target/loongarch/insns.decode| 17 +
 target/loongarch/vec.h   |  7 +++
 target/loongarch/vec_helper.c| 10 ++
 5 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index b115fe8315..72df9f0b08 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1962,6 +1962,23 @@ INSN_LASX(xvmaddwod_w_hu_h,  vvv)
 INSN_LASX(xvmaddwod_d_wu_w,  vvv)
 INSN_LASX(xvmaddwod_q_du_d,  vvv)
 
+INSN_LASX(xvdiv_b,   vvv)
+INSN_LASX(xvdiv_h,   vvv)
+INSN_LASX(xvdiv_w,   vvv)
+INSN_LASX(xvdiv_d,   vvv)
+INSN_LASX(xvdiv_bu,  vvv)
+INSN_LASX(xvdiv_hu,  vvv)
+INSN_LASX(xvdiv_wu,  vvv)
+INSN_LASX(xvdiv_du,  vvv)
+INSN_LASX(xvmod_b,   vvv)
+INSN_LASX(xvmod_h,   vvv)
+INSN_LASX(xvmod_w,   vvv)
+INSN_LASX(xvmod_d,   vvv)
+INSN_LASX(xvmod_bu,  vvv)
+INSN_LASX(xvmod_hu,  vvv)
+INSN_LASX(xvmod_wu,  vvv)
+INSN_LASX(xvmod_du,  vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 1f9574a83b..118635dc1a 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -287,6 +287,23 @@ TRANS(xvmaddwod_h_bu_b, gvec_vvv, 32, MO_8, do_vmaddwod_u_s)
 TRANS(xvmaddwod_w_hu_h, gvec_vvv, 32, MO_16, do_vmaddwod_u_s)
 TRANS(xvmaddwod_d_wu_w, gvec_vvv, 32, MO_32, do_vmaddwod_u_s)
 
+TRANS(xvdiv_b, gen_vvv, 32, gen_helper_vdiv_b)
+TRANS(xvdiv_h, gen_vvv, 32, gen_helper_vdiv_h)
+TRANS(xvdiv_w, gen_vvv, 32, gen_helper_vdiv_w)
+TRANS(xvdiv_d, gen_vvv, 32, gen_helper_vdiv_d)
+TRANS(xvdiv_bu, gen_vvv, 32, gen_helper_vdiv_bu)
+TRANS(xvdiv_hu, gen_vvv, 32, gen_helper_vdiv_hu)
+TRANS(xvdiv_wu, gen_vvv, 32, gen_helper_vdiv_wu)
+TRANS(xvdiv_du, gen_vvv, 32, gen_helper_vdiv_du)
+TRANS(xvmod_b, gen_vvv, 32, gen_helper_vmod_b)
+TRANS(xvmod_h, gen_vvv, 32, gen_helper_vmod_h)
+TRANS(xvmod_w, gen_vvv, 32, gen_helper_vmod_w)
+TRANS(xvmod_d, gen_vvv, 32, gen_helper_vmod_d)
+TRANS(xvmod_bu, gen_vvv, 32, gen_helper_vmod_bu)
+TRANS(xvmod_hu, gen_vvv, 32, gen_helper_vmod_hu)
+TRANS(xvmod_wu, gen_vvv, 32, gen_helper_vmod_wu)
+TRANS(xvmod_du, gen_vvv, 32, gen_helper_vmod_du)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index d6fb51ae64..fa25c876b4 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1545,6 +1545,23 @@ xvmaddwod_w_hu_h 0111 01001011 11101 . . .   @vvv
 xvmaddwod_d_wu_w 0111 01001011 0 . . .@vvv
 xvmaddwod_q_du_d 0111 01001011 1 . . .@vvv
 
+xvdiv_b  0111 01001110 0 . . .@vvv
+xvdiv_h  0111 01001110 1 . . .@vvv
+xvdiv_w  0111 01001110 00010 . . .@vvv
+xvdiv_d  0111 01001110 00011 . . .@vvv
+xvmod_b  0111 01001110 00100 . . .@vvv
+xvmod_h  0111 01001110 00101 . . .@vvv
+xvmod_w  0111 01001110 00110 . . .@vvv
+xvmod_d  0111 01001110 00111 . . .@vvv
+xvdiv_bu 0111 01001110 01000 . . .@vvv
+xvdiv_hu 0111 01001110 01001 . . .@vvv
+xvdiv_wu 0111 01001110 01010 . . .@vvv
+xvdiv_du 0111 01001110 01011 . . .@vvv
+xvmod_bu 0111 01001110 01100 . . .@vvv
+xvmod_hu 0111 01001110 01101 . . .@vvv
+xvmod_wu 0111 01001110 01110 . . .@vvv
+xvmod_du 0111 01001110 0 . . .@vvv
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index 06c8d7e314..ee50d53f4e 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -65,4 +65,11 @@
 #define DO_MADD(a, b, c)  (a + b * c)
 #define DO_MSUB(a, b, c)  (a - b * c)
 
+#define DO_DIVU(N, M) (unlikely(M == 0) ? 0 : N / M)
+#define DO_REMU(N, M) (unlikely(M == 0) ? 0 : N % M)
+#define DO_DIV(N, M)  (unlikely(M == 0) ? 0 :\
+unlikely((N == -N) && (M == (__typeof(N))(-1))) ? N : N / M)
+#define DO_REM(N, M)  (unlikely(M == 0) ? 0 :\
+unlikely((N == -N) && (M == (__typeof(N))(-1)))
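
Note on the division helpers: the DO_DIV and DO_REM macros above guard the two inputs for which C integer division is undefined, a zero divisor and the most negative value divided by -1. Below is a minimal scalar sketch of the same rules, assuming 64-bit signed elements. The names div_s64 and rem_s64 are hypothetical and not part of the patch, and the remainder result for the overflow case follows the complete DO_REM definition that appears as context in patch 22/47.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for DO_DIV on one signed 64-bit element. */
static int64_t div_s64(int64_t n, int64_t m)
{
    if (m == 0) {
        return 0;           /* zero divisor: result is 0, no trap */
    }
    if (n == INT64_MIN && m == -1) {
        return n;           /* quotient does not fit; keep the dividend */
    }
    return n / m;
}

/* Hypothetical scalar stand-in for DO_REM on one signed 64-bit element. */
static int64_t rem_s64(int64_t n, int64_t m)
{
    if (m == 0) {
        return 0;           /* zero divisor: result is 0, no trap */
    }
    if (n == INT64_MIN && m == -1) {
        return 0;           /* remainder of the overflow case is 0 */
    }
    return n % m;
}
```

The sketch tests for INT64_MIN explicitly instead of using the macro's N == -N comparison, so it does not depend on wrapping signed arithmetic.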

[PATCH v3 20/47] target/loongarch: Implement xvexth

2023-07-14 Thread Song Gao

This patch includes:
- XVEXTH.{H.B/W.H/D.W/Q.D};
- XVEXTH.{HU.BU/WU.HU/DU.WU/QU.DU}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c |  9 +
 target/loongarch/insn_trans/trans_lasx.c.inc |  9 +
 target/loongarch/insns.decode|  9 +
 target/loongarch/vec_helper.c| 36 +---
 4 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 09e5981fc3..6ca545956d 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1988,6 +1988,15 @@ INSN_LASX(xvsat_hu,  vv_i)
 INSN_LASX(xvsat_wu,  vv_i)
 INSN_LASX(xvsat_du,  vv_i)
 
+INSN_LASX(xvexth_h_b,vv)
+INSN_LASX(xvexth_w_h,vv)
+INSN_LASX(xvexth_d_w,vv)
+INSN_LASX(xvexth_q_d,vv)
+INSN_LASX(xvexth_hu_bu,  vv)
+INSN_LASX(xvexth_wu_hu,  vv)
+INSN_LASX(xvexth_du_wu,  vv)
+INSN_LASX(xvexth_qu_du,  vv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index cda617413e..1744521a53 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -313,6 +313,15 @@ TRANS(xvsat_hu, gvec_vv_i, 32, MO_16, do_vsat_u)
 TRANS(xvsat_wu, gvec_vv_i, 32, MO_32, do_vsat_u)
 TRANS(xvsat_du, gvec_vv_i, 32, MO_64, do_vsat_u)
 
+TRANS(xvexth_h_b, gen_vv, 32, gen_helper_vexth_h_b)
+TRANS(xvexth_w_h, gen_vv, 32, gen_helper_vexth_w_h)
+TRANS(xvexth_d_w, gen_vv, 32, gen_helper_vexth_d_w)
+TRANS(xvexth_q_d, gen_vv, 32, gen_helper_vexth_q_d)
+TRANS(xvexth_hu_bu, gen_vv, 32, gen_helper_vexth_hu_bu)
+TRANS(xvexth_wu_hu, gen_vv, 32, gen_helper_vexth_wu_hu)
+TRANS(xvexth_du_wu, gen_vv, 32, gen_helper_vexth_du_wu)
+TRANS(xvexth_qu_du, gen_vv, 32, gen_helper_vexth_qu_du)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index e366cf7615..7491f295a5 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1571,6 +1571,15 @@ xvsat_hu 0111 01110010 1 1  . .  @vv_ui4
 xvsat_wu 0111 01110010 10001 . . .@vv_ui5
 xvsat_du 0111 01110010 1001 .. . .@vv_ui6
 
+xvexth_h_b   0111 01101001 11101 11000 . .@vv
+xvexth_w_h   0111 01101001 11101 11001 . .@vv
+xvexth_d_w   0111 01101001 11101 11010 . .@vv
+xvexth_q_d   0111 01101001 11101 11011 . .@vv
+xvexth_hu_bu 0111 01101001 11101 11100 . .@vv
+xvexth_wu_hu 0111 01101001 11101 11101 . .@vv
+xvexth_du_wu 0111 01101001 11101 0 . .@vv
+xvexth_qu_du 0111 01101001 11101 1 . .@vv
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 4df39c007e..3b7fcc7283 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -691,32 +691,44 @@ VSAT_U(vsat_hu, 16, UH)
 VSAT_U(vsat_wu, 32, UW)
 VSAT_U(vsat_du, 64, UD)
 
-#define VEXTH(NAME, BIT, E1, E2) \
-void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \
-{\
-int i;   \
-VReg *Vd = (VReg *)vd;   \
-VReg *Vj = (VReg *)vj;   \
- \
-for (i = 0; i < LSX_LEN/BIT; i++) {  \
-Vd->E1(i) = Vj->E2(i + LSX_LEN/BIT); \
-}\
+#define VEXTH(NAME, BIT, E1, E2) \
+void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \
+{\
+int i, j, ofs;   \
+VReg *Vd = (VReg *)vd;   \
+VReg *Vj = (VReg *)vj;   \
+int oprsz = simd_oprsz(desc);\
+ \
+ofs = LSX_LEN / BIT; \
+for (i = 0; i < oprsz / 16; i++) {   \
+for (j = 0; j < ofs; j++) {  \
+Vd->E1(j + i * ofs) = Vj->E2(j + ofs + ofs * 2 * i); \
+}\
+}\
 }
 
 void HELPER(vexth_q_d)(void
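
Note on the reworked VEXTH loop: the outer loop walks the 128-bit lanes (oprsz / 16 of them) and the inner loop widens the upper half of each lane into that lane of the destination. Below is a minimal sketch of the index arithmetic for XVEXTH.H.B, assuming LSX_LEN is 128 and oprsz is 32. The function name and the plain-array register model are hypothetical and only illustrate the mapping.

```c
#include <stdint.h>

/*
 * Sketch of XVEXTH.H.B on a 256-bit value held as plain arrays.
 * The index expressions mirror the VEXTH macro with BIT = 16.
 */
static void xvexth_h_b_sketch(int16_t dst[16], const int8_t src[32])
{
    const int ofs = 128 / 16;    /* halfword results per 128-bit lane */
    const int lanes = 32 / 16;   /* oprsz / 16 with oprsz = 32 */

    for (int i = 0; i < lanes; i++) {
        for (int j = 0; j < ofs; j++) {
            /* upper 8 bytes of lane i, sign-extended to 16 bits */
            dst[j + i * ofs] = src[j + ofs + ofs * 2 * i];
        }
    }
}
```

With these sizes, destination halfwords 0..7 come from source bytes 8..15 and halfwords 8..15 come from source bytes 24..31.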

[PATCH v3 11/47] target/loongarch: Implement xvaddw/xvsubw

2023-07-14 Thread Song Gao

This patch includes:
- XVADDW{EV/OD}.{H.B/W.H/D.W/Q.D}[U];
- XVSUBW{EV/OD}.{H.B/W.H/D.W/Q.D}[U];
- XVADDW{EV/OD}.{H.BU.B/W.HU.H/D.WU.W/Q.DU.D}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c |  43 +++
 target/loongarch/insn_trans/trans_lasx.c.inc |  45 +++
 target/loongarch/insns.decode|  45 +++
 target/loongarch/vec_helper.c| 121 +--
 4 files changed, 220 insertions(+), 34 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index e188220519..6972e33833 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1782,6 +1782,49 @@ INSN_LASX(xvhsubw_wu_hu, vvv)
 INSN_LASX(xvhsubw_du_wu, vvv)
 INSN_LASX(xvhsubw_qu_du, vvv)
 
+INSN_LASX(xvaddwev_h_b,  vvv)
+INSN_LASX(xvaddwev_w_h,  vvv)
+INSN_LASX(xvaddwev_d_w,  vvv)
+INSN_LASX(xvaddwev_q_d,  vvv)
+INSN_LASX(xvaddwod_h_b,  vvv)
+INSN_LASX(xvaddwod_w_h,  vvv)
+INSN_LASX(xvaddwod_d_w,  vvv)
+INSN_LASX(xvaddwod_q_d,  vvv)
+INSN_LASX(xvsubwev_h_b,  vvv)
+INSN_LASX(xvsubwev_w_h,  vvv)
+INSN_LASX(xvsubwev_d_w,  vvv)
+INSN_LASX(xvsubwev_q_d,  vvv)
+INSN_LASX(xvsubwod_h_b,  vvv)
+INSN_LASX(xvsubwod_w_h,  vvv)
+INSN_LASX(xvsubwod_d_w,  vvv)
+INSN_LASX(xvsubwod_q_d,  vvv)
+
+INSN_LASX(xvaddwev_h_bu, vvv)
+INSN_LASX(xvaddwev_w_hu, vvv)
+INSN_LASX(xvaddwev_d_wu, vvv)
+INSN_LASX(xvaddwev_q_du, vvv)
+INSN_LASX(xvaddwod_h_bu, vvv)
+INSN_LASX(xvaddwod_w_hu, vvv)
+INSN_LASX(xvaddwod_d_wu, vvv)
+INSN_LASX(xvaddwod_q_du, vvv)
+INSN_LASX(xvsubwev_h_bu, vvv)
+INSN_LASX(xvsubwev_w_hu, vvv)
+INSN_LASX(xvsubwev_d_wu, vvv)
+INSN_LASX(xvsubwev_q_du, vvv)
+INSN_LASX(xvsubwod_h_bu, vvv)
+INSN_LASX(xvsubwod_w_hu, vvv)
+INSN_LASX(xvsubwod_d_wu, vvv)
+INSN_LASX(xvsubwod_q_du, vvv)
+
+INSN_LASX(xvaddwev_h_bu_b,   vvv)
+INSN_LASX(xvaddwev_w_hu_h,   vvv)
+INSN_LASX(xvaddwev_d_wu_w,   vvv)
+INSN_LASX(xvaddwev_q_du_d,   vvv)
+INSN_LASX(xvaddwod_h_bu_b,   vvv)
+INSN_LASX(xvaddwod_w_hu_h,   vvv)
+INSN_LASX(xvaddwod_d_wu_w,   vvv)
+INSN_LASX(xvaddwod_q_du_d,   vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 4272bafda2..d8230cba9f 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -95,6 +95,51 @@ TRANS(xvhsubw_wu_hu, gen_vvv, 32, gen_helper_vhsubw_wu_hu)
 TRANS(xvhsubw_du_wu, gen_vvv, 32, gen_helper_vhsubw_du_wu)
 TRANS(xvhsubw_qu_du, gen_vvv, 32, gen_helper_vhsubw_qu_du)
 
+TRANS(xvaddwev_h_b, gvec_vvv, 32, MO_8, do_vaddwev_s)
+TRANS(xvaddwev_w_h, gvec_vvv, 32, MO_16, do_vaddwev_s)
+TRANS(xvaddwev_d_w, gvec_vvv, 32, MO_32, do_vaddwev_s)
+TRANS(xvaddwev_q_d, gvec_vvv, 32, MO_64, do_vaddwev_s)
+TRANS(xvaddwod_h_b, gvec_vvv, 32, MO_8, do_vaddwod_s)
+TRANS(xvaddwod_w_h, gvec_vvv, 32, MO_16, do_vaddwod_s)
+TRANS(xvaddwod_d_w, gvec_vvv, 32, MO_32, do_vaddwod_s)
+TRANS(xvaddwod_q_d, gvec_vvv, 32, MO_64, do_vaddwod_s)
+
+TRANS(xvsubwev_h_b, gvec_vvv, 32, MO_8, do_vsubwev_s)
+TRANS(xvsubwev_w_h, gvec_vvv, 32, MO_16, do_vsubwev_s)
+TRANS(xvsubwev_d_w, gvec_vvv, 32, MO_32, do_vsubwev_s)
+TRANS(xvsubwev_q_d, gvec_vvv, 32, MO_64, do_vsubwev_s)
+TRANS(xvsubwod_h_b, gvec_vvv, 32, MO_8, do_vsubwod_s)
+TRANS(xvsubwod_w_h, gvec_vvv, 32, MO_16, do_vsubwod_s)
+TRANS(xvsubwod_d_w, gvec_vvv, 32, MO_32, do_vsubwod_s)
+TRANS(xvsubwod_q_d, gvec_vvv, 32, MO_64, do_vsubwod_s)
+
+TRANS(xvaddwev_h_bu, gvec_vvv, 32, MO_8, do_vaddwev_u)
+TRANS(xvaddwev_w_hu, gvec_vvv, 32, MO_16, do_vaddwev_u)
+TRANS(xvaddwev_d_wu, gvec_vvv, 32, MO_32, do_vaddwev_u)
+TRANS(xvaddwev_q_du, gvec_vvv, 32, MO_64, do_vaddwev_u)
+TRANS(xvaddwod_h_bu, gvec_vvv, 32, MO_8, do_vaddwod_u)
+TRANS(xvaddwod_w_hu, gvec_vvv, 32, MO_16, do_vaddwod_u)
+TRANS(xvaddwod_d_wu, gvec_vvv, 32, MO_32, do_vaddwod_u)
+TRANS(xvaddwod_q_du, gvec_vvv, 32, MO_64, do_vaddwod_u)
+
+TRANS(xvsubwev_h_bu, gvec_vvv, 32, MO_8, do_vsubwev_u)
+TRANS(xvsubwev_w_hu, gvec_vvv, 32, MO_16, do_vsubwev_u)
+TRANS(xvsubwev_d_wu, gvec_vvv, 32, MO_32, do_vsubwev_u)
+TRANS(xvsubwev_q_du, gvec_vvv, 32, MO_64, do_vsubwev_u)
+TRANS(xvsubwod_h_bu, gvec_vvv, 32, MO_8, do_vsubwod_u)
+TRANS(xvsubwod_w_hu, gvec_vvv, 32, MO_16, do_vsubwod_u)
+TRANS(xvsubwod_d_wu, gvec_vvv, 32, MO_32, do_vsubwod_u)
+TRANS(xvsubwod_q_du, gvec_vvv, 32, MO_64, do_vsubwod_u)
+
+TRANS(xvaddwev_h_bu_b, gvec_vvv, 32, MO_8, do_vaddwev_u_s)
+TRANS(xvaddwev_w_hu_h, gvec_vvv, 32, MO_16, do_vaddwev_u_s)
+TRANS(xvaddwev_d_wu_w, gvec_vvv, 32, MO_32, do_vaddwev_u_s)
+TRANS(xvaddwev_q_du_d, gvec_vvv, 32, MO_64, do_vaddwev_u_s)
+TRANS(xvaddwod_h_bu_b, gvec_vvv, 32, MO_8, do_vaddwod_u_s)
+TRANS(xvaddwod_w_hu_h, gvec_vvv, 32, MO_16, do_vaddwod_u_s)
+TRANS(xvaddwod_d_wu_w, gvec_vvv, 32, MO_32, do_vaddwod_u_s)
+TRANS(xvaddwod_q_du_d, gvec_vvv, 32, MO_6
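
Note on the widening forms: the EV variants combine the even-numbered source elements and the OD variants the odd-numbered ones, and each result is twice as wide as its inputs. The mixed forms such as H.BU.B treat vj as unsigned and vk as signed. Below is a minimal sketch for two of the byte-to-halfword cases, assuming a 256-bit register modelled as plain arrays. The function names are hypothetical and not taken from the patch.

```c
#include <stdint.h>

/* XVADDWEV.H.B sketch: signed even bytes of vj and vk, 16-bit sums. */
static void xvaddwev_h_b_sketch(int16_t dst[16], const int8_t vj[32],
                                const int8_t vk[32])
{
    for (int i = 0; i < 16; i++) {
        dst[i] = (int16_t)(vj[2 * i] + vk[2 * i]);
    }
}

/* XVADDWOD.H.BU.B sketch: unsigned odd bytes of vj plus signed odd bytes of vk. */
static void xvaddwod_h_bu_b_sketch(int16_t dst[16], const uint8_t vj[32],
                                   const int8_t vk[32])
{
    for (int i = 0; i < 16; i++) {
        dst[i] = (int16_t)(vj[2 * i + 1] + vk[2 * i + 1]);
    }
}
```

Because the sums are formed at the wider width, 100 + 100 gives 200 here instead of wrapping as it would in an 8-bit add.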

[PATCH v3 39/47] target/loongarch: Implement xvseq xvsle xvslt

2023-07-14 Thread Song Gao

This patch includes:
- XVSEQ[I].{B/H/W/D};
- XVSLE[I].{B/H/W/D}[U];
- XVSLT[I].{B/H/W/D}[U].

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c |  43 +++
 target/loongarch/insn_trans/trans_lasx.c.inc |  43 +++
 target/loongarch/insn_trans/trans_lsx.c.inc  | 263 ++-
 target/loongarch/insns.decode|  43 +++
 target/loongarch/vec.h   |   4 +
 target/loongarch/vec_helper.c|  27 +-
 6 files changed, 278 insertions(+), 145 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 3fd3dc3591..295ba74f2b 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2342,6 +2342,49 @@ INSN_LASX(xvffintl_d_w,  vv)
 INSN_LASX(xvffinth_d_w,  vv)
 INSN_LASX(xvffint_s_l,   vvv)
 
+INSN_LASX(xvseq_b,   vvv)
+INSN_LASX(xvseq_h,   vvv)
+INSN_LASX(xvseq_w,   vvv)
+INSN_LASX(xvseq_d,   vvv)
+INSN_LASX(xvseqi_b,  vv_i)
+INSN_LASX(xvseqi_h,  vv_i)
+INSN_LASX(xvseqi_w,  vv_i)
+INSN_LASX(xvseqi_d,  vv_i)
+
+INSN_LASX(xvsle_b,   vvv)
+INSN_LASX(xvsle_h,   vvv)
+INSN_LASX(xvsle_w,   vvv)
+INSN_LASX(xvsle_d,   vvv)
+INSN_LASX(xvslei_b,  vv_i)
+INSN_LASX(xvslei_h,  vv_i)
+INSN_LASX(xvslei_w,  vv_i)
+INSN_LASX(xvslei_d,  vv_i)
+INSN_LASX(xvsle_bu,  vvv)
+INSN_LASX(xvsle_hu,  vvv)
+INSN_LASX(xvsle_wu,  vvv)
+INSN_LASX(xvsle_du,  vvv)
+INSN_LASX(xvslei_bu, vv_i)
+INSN_LASX(xvslei_hu, vv_i)
+INSN_LASX(xvslei_wu, vv_i)
+INSN_LASX(xvslei_du, vv_i)
+
+INSN_LASX(xvslt_b,   vvv)
+INSN_LASX(xvslt_h,   vvv)
+INSN_LASX(xvslt_w,   vvv)
+INSN_LASX(xvslt_d,   vvv)
+INSN_LASX(xvslti_b,  vv_i)
+INSN_LASX(xvslti_h,  vv_i)
+INSN_LASX(xvslti_w,  vv_i)
+INSN_LASX(xvslti_d,  vv_i)
+INSN_LASX(xvslt_bu,  vvv)
+INSN_LASX(xvslt_hu,  vvv)
+INSN_LASX(xvslt_wu,  vvv)
+INSN_LASX(xvslt_du,  vvv)
+INSN_LASX(xvslti_bu, vv_i)
+INSN_LASX(xvslti_hu, vv_i)
+INSN_LASX(xvslti_wu, vv_i)
+INSN_LASX(xvslti_du, vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 057aed657e..ad7f787319 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -658,6 +658,49 @@ TRANS(xvffintl_d_w, gen_vv_f, 32, gen_helper_vffintl_d_w)
 TRANS(xvffinth_d_w, gen_vv_f, 32, gen_helper_vffinth_d_w)
 TRANS(xvffint_s_l, gen_vvv_f, 32, gen_helper_vffint_s_l)
 
+TRANS(xvseq_b, do_cmp, 32, MO_8, TCG_COND_EQ)
+TRANS(xvseq_h, do_cmp, 32, MO_16, TCG_COND_EQ)
+TRANS(xvseq_w, do_cmp, 32, MO_32, TCG_COND_EQ)
+TRANS(xvseq_d, do_cmp, 32, MO_64, TCG_COND_EQ)
+TRANS(xvseqi_b, do_vseqi_s, 32, MO_8)
+TRANS(xvseqi_h, do_vseqi_s, 32, MO_16)
+TRANS(xvseqi_w, do_vseqi_s, 32, MO_32)
+TRANS(xvseqi_d, do_vseqi_s, 32, MO_64)
+
+TRANS(xvsle_b, do_cmp, 32, MO_8, TCG_COND_LE)
+TRANS(xvsle_h, do_cmp, 32, MO_16, TCG_COND_LE)
+TRANS(xvsle_w, do_cmp, 32, MO_32, TCG_COND_LE)
+TRANS(xvsle_d, do_cmp, 32, MO_64, TCG_COND_LE)
+TRANS(xvslei_b, do_vslei_s, 32, MO_8)
+TRANS(xvslei_h, do_vslei_s, 32, MO_16)
+TRANS(xvslei_w, do_vslei_s, 32, MO_32)
+TRANS(xvslei_d, do_vslei_s, 32, MO_64)
+TRANS(xvsle_bu, do_cmp, 32, MO_8, TCG_COND_LEU)
+TRANS(xvsle_hu, do_cmp, 32, MO_16, TCG_COND_LEU)
+TRANS(xvsle_wu, do_cmp, 32, MO_32, TCG_COND_LEU)
+TRANS(xvsle_du, do_cmp, 32, MO_64, TCG_COND_LEU)
+TRANS(xvslei_bu, do_vslei_u, 32, MO_8)
+TRANS(xvslei_hu, do_vslei_u, 32, MO_16)
+TRANS(xvslei_wu, do_vslei_u, 32, MO_32)
+TRANS(xvslei_du, do_vslei_u, 32, MO_64)
+
+TRANS(xvslt_b, do_cmp, 32, MO_8, TCG_COND_LT)
+TRANS(xvslt_h, do_cmp, 32, MO_16, TCG_COND_LT)
+TRANS(xvslt_w, do_cmp, 32, MO_32, TCG_COND_LT)
+TRANS(xvslt_d, do_cmp, 32, MO_64, TCG_COND_LT)
+TRANS(xvslti_b, do_vslti_s, 32, MO_8)
+TRANS(xvslti_h, do_vslti_s, 32, MO_16)
+TRANS(xvslti_w, do_vslti_s, 32, MO_32)
+TRANS(xvslti_d, do_vslti_s, 32, MO_64)
+TRANS(xvslt_bu, do_cmp, 32, MO_8, TCG_COND_LTU)
+TRANS(xvslt_hu, do_cmp, 32, MO_16, TCG_COND_LTU)
+TRANS(xvslt_wu, do_cmp, 32, MO_32, TCG_COND_LTU)
+TRANS(xvslt_du, do_cmp, 32, MO_64, TCG_COND_LTU)
+TRANS(xvslti_bu, do_vslti_u, 32, MO_8)
+TRANS(xvslti_hu, do_vslti_u, 32, MO_16)
+TRANS(xvslti_wu, do_vslti_u, 32, MO_32)
+TRANS(xvslti_du, do_vslti_u, 32, MO_64)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index 1e2963446b..61e529d1da 100644
--- a/target/loongarch/insn_trans/trans_lsx.c.inc
+++ b/target/loongarch/insn_trans/trans_lsx.c.inc
@@ -3720,7 +3720,8 @@ TRANS(vffintl_d_w, gen_vv_f, 16, gen_help
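
Note on the compare forms: each element of the result is a mask, all ones when the condition holds and zero otherwise, and the [U] variants differ only in interpreting the same bits as unsigned (TCG_COND_LT versus TCG_COND_LTU in the table above). Below is a minimal per-element sketch for the byte-sized cases. The function names are hypothetical and not from the patch.

```c
#include <stdint.h>

/* XVSEQ.B element: all-ones mask when equal, zero otherwise. */
static int8_t seq_b(int8_t a, int8_t b)
{
    return a == b ? -1 : 0;
}

/* XVSLT.B element: signed less-than. */
static int8_t slt_b(int8_t a, int8_t b)
{
    return a < b ? -1 : 0;
}

/* XVSLT.BU element: same bit patterns, compared as unsigned. */
static int8_t slt_bu(uint8_t a, uint8_t b)
{
    return a < b ? -1 : 0;
}
```

The bit pattern 0xff is -1 when signed, so it compares below 1 in slt_b, but it is 255 when unsigned, so slt_bu reports false for the same inputs.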

[PATCH v3 15/47] target/loongarch: Implement xvmax/xvmin

2023-07-14 Thread Song Gao

This patch includes:
- XVMAX[I].{B/H/W/D}[U];
- XVMIN[I].{B/H/W/D}[U].

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c | 34 ++
 target/loongarch/insn_trans/trans_lasx.c.inc | 36 
 target/loongarch/insns.decode| 36 
 target/loongarch/vec.h   |  3 ++
 target/loongarch/vec_helper.c| 26 +++---
 5 files changed, 121 insertions(+), 14 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index b48822e431..63c1dc757f 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1856,6 +1856,40 @@ INSN_LASX(xvadda_h,  vvv)
 INSN_LASX(xvadda_w,  vvv)
 INSN_LASX(xvadda_d,  vvv)
 
+INSN_LASX(xvmax_b,   vvv)
+INSN_LASX(xvmax_h,   vvv)
+INSN_LASX(xvmax_w,   vvv)
+INSN_LASX(xvmax_d,   vvv)
+INSN_LASX(xvmin_b,   vvv)
+INSN_LASX(xvmin_h,   vvv)
+INSN_LASX(xvmin_w,   vvv)
+INSN_LASX(xvmin_d,   vvv)
+INSN_LASX(xvmax_bu,  vvv)
+INSN_LASX(xvmax_hu,  vvv)
+INSN_LASX(xvmax_wu,  vvv)
+INSN_LASX(xvmax_du,  vvv)
+INSN_LASX(xvmin_bu,  vvv)
+INSN_LASX(xvmin_hu,  vvv)
+INSN_LASX(xvmin_wu,  vvv)
+INSN_LASX(xvmin_du,  vvv)
+
+INSN_LASX(xvmaxi_b,  vv_i)
+INSN_LASX(xvmaxi_h,  vv_i)
+INSN_LASX(xvmaxi_w,  vv_i)
+INSN_LASX(xvmaxi_d,  vv_i)
+INSN_LASX(xvmini_b,  vv_i)
+INSN_LASX(xvmini_h,  vv_i)
+INSN_LASX(xvmini_w,  vv_i)
+INSN_LASX(xvmini_d,  vv_i)
+INSN_LASX(xvmaxi_bu, vv_i)
+INSN_LASX(xvmaxi_hu, vv_i)
+INSN_LASX(xvmaxi_wu, vv_i)
+INSN_LASX(xvmaxi_du, vv_i)
+INSN_LASX(xvmini_bu, vv_i)
+INSN_LASX(xvmini_hu, vv_i)
+INSN_LASX(xvmini_wu, vv_i)
+INSN_LASX(xvmini_du, vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 30cb286cb9..107c75f1b6 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -171,6 +171,42 @@ TRANS(xvadda_h, gvec_vvv, 32, MO_16, do_vadda)
 TRANS(xvadda_w, gvec_vvv, 32, MO_32, do_vadda)
 TRANS(xvadda_d, gvec_vvv, 32, MO_64, do_vadda)
 
+TRANS(xvmax_b, gvec_vvv, 32, MO_8, tcg_gen_gvec_smax)
+TRANS(xvmax_h, gvec_vvv, 32, MO_16, tcg_gen_gvec_smax)
+TRANS(xvmax_w, gvec_vvv, 32, MO_32, tcg_gen_gvec_smax)
+TRANS(xvmax_d, gvec_vvv, 32, MO_64, tcg_gen_gvec_smax)
+TRANS(xvmax_bu, gvec_vvv, 32, MO_8, tcg_gen_gvec_umax)
+TRANS(xvmax_hu, gvec_vvv, 32, MO_16, tcg_gen_gvec_umax)
+TRANS(xvmax_wu, gvec_vvv, 32, MO_32, tcg_gen_gvec_umax)
+TRANS(xvmax_du, gvec_vvv, 32, MO_64, tcg_gen_gvec_umax)
+
+TRANS(xvmin_b, gvec_vvv, 32, MO_8, tcg_gen_gvec_smin)
+TRANS(xvmin_h, gvec_vvv, 32, MO_16, tcg_gen_gvec_smin)
+TRANS(xvmin_w, gvec_vvv, 32, MO_32, tcg_gen_gvec_smin)
+TRANS(xvmin_d, gvec_vvv, 32, MO_64, tcg_gen_gvec_smin)
+TRANS(xvmin_bu, gvec_vvv, 32, MO_8, tcg_gen_gvec_umin)
+TRANS(xvmin_hu, gvec_vvv, 32, MO_16, tcg_gen_gvec_umin)
+TRANS(xvmin_wu, gvec_vvv, 32, MO_32, tcg_gen_gvec_umin)
+TRANS(xvmin_du, gvec_vvv, 32, MO_64, tcg_gen_gvec_umin)
+
+TRANS(xvmini_b, gvec_vv_i, 32, MO_8, do_vmini_s)
+TRANS(xvmini_h, gvec_vv_i, 32, MO_16, do_vmini_s)
+TRANS(xvmini_w, gvec_vv_i, 32, MO_32, do_vmini_s)
+TRANS(xvmini_d, gvec_vv_i, 32, MO_64, do_vmini_s)
+TRANS(xvmini_bu, gvec_vv_i, 32, MO_8, do_vmini_u)
+TRANS(xvmini_hu, gvec_vv_i, 32, MO_16, do_vmini_u)
+TRANS(xvmini_wu, gvec_vv_i, 32, MO_32, do_vmini_u)
+TRANS(xvmini_du, gvec_vv_i, 32, MO_64, do_vmini_u)
+
+TRANS(xvmaxi_b, gvec_vv_i, 32, MO_8, do_vmaxi_s)
+TRANS(xvmaxi_h, gvec_vv_i, 32, MO_16, do_vmaxi_s)
+TRANS(xvmaxi_w, gvec_vv_i, 32, MO_32, do_vmaxi_s)
+TRANS(xvmaxi_d, gvec_vv_i, 32, MO_64, do_vmaxi_s)
+TRANS(xvmaxi_bu, gvec_vv_i, 32, MO_8, do_vmaxi_u)
+TRANS(xvmaxi_hu, gvec_vv_i, 32, MO_16, do_vmaxi_u)
+TRANS(xvmaxi_wu, gvec_vv_i, 32, MO_32, do_vmaxi_u)
+TRANS(xvmaxi_du, gvec_vv_i, 32, MO_64, do_vmaxi_u)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index f3722e3aa7..99aefcb651 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1437,6 +1437,42 @@ xvadda_h 0111 01000101 11001 . . .   @vvv
 xvadda_w 0111 01000101 11010 . . .@vvv
 xvadda_d 0111 01000101 11011 . . .@vvv
 
+xvmax_b  0111 01000111 0 . . .@vvv
+xvmax_h  0111 01000111 1 . . .@vvv
+xvmax_w  0111 01000111 00010 . . .@vvv
+xvmax_d  0111 01000111 00011 . . .@vvv
+xvmax_bu 0111 01000111 01000 . .

[PATCH v3 22/47] target/loongarch: Implement xvsigncov

2023-07-14 Thread Song Gao

This patch includes:
- XVSIGNCOV.{B/H/W/D}.

Signed-off-by: Song Gao 
Reviewed-by: Richard Henderson 
---
 target/loongarch/disas.c | 5 +
 target/loongarch/insn_trans/trans_lasx.c.inc | 5 +
 target/loongarch/insns.decode| 5 +
 target/loongarch/vec.h   | 2 ++
 target/loongarch/vec_helper.c| 2 --
 5 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 975ea018da..85e0cb7c8d 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2010,6 +2010,11 @@ INSN_LASX(vext2xv_wu_hu, vv)
 INSN_LASX(vext2xv_du_hu, vv)
 INSN_LASX(vext2xv_du_wu, vv)
 
+INSN_LASX(xvsigncov_b,   vvv)
+INSN_LASX(xvsigncov_h,   vvv)
+INSN_LASX(xvsigncov_w,   vvv)
+INSN_LASX(xvsigncov_d,   vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 5a99c75858..7aab6528a7 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -335,6 +335,11 @@ TRANS(vext2xv_wu_hu, gen_vv, 32, gen_helper_vext2xv_wu_hu)
 TRANS(vext2xv_du_hu, gen_vv, 32, gen_helper_vext2xv_du_hu)
 TRANS(vext2xv_du_wu, gen_vv, 32, gen_helper_vext2xv_du_wu)
 
+TRANS(xvsigncov_b, gvec_vvv, 32, MO_8, do_vsigncov)
+TRANS(xvsigncov_h, gvec_vvv, 32, MO_16, do_vsigncov)
+TRANS(xvsigncov_w, gvec_vvv, 32, MO_32, do_vsigncov)
+TRANS(xvsigncov_d, gvec_vvv, 32, MO_64, do_vsigncov)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index db1a6689f0..7bbda1a142 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1593,6 +1593,11 @@ vext2xv_wu_hu0111 01101001 0 01101 . .   @vv
 vext2xv_du_hu0111 01101001 0 01110 . .@vv
 vext2xv_du_wu0111 01101001 0 0 . .@vv
 
+xvsigncov_b  0111 01010010 11100 . . .@vvv
+xvsigncov_h  0111 01010010 11101 . . .@vvv
+xvsigncov_w  0111 01010010 0 . . .@vvv
+xvsigncov_d  0111 01010010 1 . . .@vvv
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index ee50d53f4e..681afd842f 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -72,4 +72,6 @@
 #define DO_REM(N, M)  (unlikely(M == 0) ? 0 :\
 unlikely((N == -N) && (M == (__typeof(N))(-1))) ? 0 : N % M)
 
+#define DO_SIGNCOV(a, b)  (a == 0 ? 0 : a < 0 ? -b : b)
+
 #endif /* LOONGARCH_VEC_H */
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 024dda5aca..126b67eea5 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -766,8 +766,6 @@ VEXT2XV(vext2xv_wu_hu, 32, UW, UH)
 VEXT2XV(vext2xv_du_hu, 64, UD, UH)
 VEXT2XV(vext2xv_du_wu, 64, UD, UW)
 
-#define DO_SIGNCOV(a, b)  (a == 0 ? 0 : a < 0 ? -b : b)
-
 DO_3OP(vsigncov_b, 8, B, DO_SIGNCOV)
 DO_3OP(vsigncov_h, 16, H, DO_SIGNCOV)
 DO_3OP(vsigncov_w, 32, W, DO_SIGNCOV)
-- 
2.39.1
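
Note on DO_SIGNCOV: the macro moved into vec.h copies the sign of the first operand onto the second, returning 0 when the first operand is 0. Below is a minimal sketch of the byte-sized case as a function. The name signcov_b is hypothetical, and the narrowing cast assumes the usual two's-complement wraparound.

```c
#include <stdint.h>

/* Per-element sketch of DO_SIGNCOV(a, b) for 8-bit elements. */
static int8_t signcov_b(int8_t a, int8_t b)
{
    if (a == 0) {
        return 0;                       /* zero selector clears the element */
    }
    return (int8_t)(a < 0 ? -b : b);    /* negative selector negates b */
}
```

For example, signcov_b(-3, 5) yields -5, while signcov_b(4, -7) leaves -7 unchanged.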

[PATCH v3 25/47] target/loongarch: Implement LASX logic instructions

2023-07-14 Thread Song Gao

This patch includes:
- XV{AND/OR/XOR/NOR/ANDN/ORN}.V;
- XV{AND/OR/XOR/NOR}I.B.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c | 12 
 target/loongarch/insn_trans/trans_lasx.c.inc | 11 +++
 target/loongarch/insn_trans/trans_lsx.c.inc  |  5 +++--
 target/loongarch/insns.decode| 12 
 target/loongarch/vec_helper.c|  4 ++--
 5 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 8fa2edf007..59fa249bae 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2029,6 +2029,18 @@ INSN_LASX(xvsigncov_d,   vvv)
 
 INSN_LASX(xvldi, v_i)
 
+INSN_LASX(xvand_v,   vvv)
+INSN_LASX(xvor_v,vvv)
+INSN_LASX(xvxor_v,   vvv)
+INSN_LASX(xvnor_v,   vvv)
+INSN_LASX(xvandn_v,  vvv)
+INSN_LASX(xvorn_v,   vvv)
+
+INSN_LASX(xvandi_b,  vv_i)
+INSN_LASX(xvori_b,   vv_i)
+INSN_LASX(xvxori_b,  vv_i)
+INSN_LASX(xvnori_b,  vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 5e130f9c2e..31967b371c 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -349,6 +349,17 @@ TRANS(xvmsknz_b, gen_vv, 32, gen_helper_vmsknz_b)
 
 TRANS(xvldi, do_vldi, 32)
 
+TRANS(xvand_v, gvec_vvv, 32, MO_64, tcg_gen_gvec_and)
+TRANS(xvor_v, gvec_vvv, 32, MO_64, tcg_gen_gvec_or)
+TRANS(xvxor_v, gvec_vvv, 32, MO_64, tcg_gen_gvec_xor)
+TRANS(xvnor_v, gvec_vvv, 32, MO_64, tcg_gen_gvec_nor)
+TRANS(xvandn_v, do_vandn_v, 32)
+TRANS(xvorn_v, gvec_vvv, 32, MO_64, tcg_gen_gvec_orc)
+TRANS(xvandi_b, gvec_vv_i, 32, MO_8, tcg_gen_gvec_andi)
+TRANS(xvori_b, gvec_vv_i, 32, MO_8, tcg_gen_gvec_ori)
+TRANS(xvxori_b, gvec_vv_i, 32, MO_8, tcg_gen_gvec_xori)
+TRANS(xvnori_b, gvec_vv_i, 32, MO_8, do_vnori_b)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index 34811720cf..1e2963446b 100644
--- a/target/loongarch/insn_trans/trans_lsx.c.inc
+++ b/target/loongarch/insn_trans/trans_lsx.c.inc
@@ -3092,7 +3092,7 @@ TRANS(vor_v, gvec_vvv, 16, MO_64, tcg_gen_gvec_or)
 TRANS(vxor_v, gvec_vvv, 16, MO_64, tcg_gen_gvec_xor)
 TRANS(vnor_v, gvec_vvv, 16, MO_64, tcg_gen_gvec_nor)
 
-static bool trans_vandn_v(DisasContext *ctx, arg_vvv *a)
+static bool do_vandn_v(DisasContext *ctx, arg_vvv *a, uint32_t oprsz)
 {
 uint32_t vd_ofs, vj_ofs, vk_ofs;
 
@@ -3102,9 +3102,10 @@ static bool trans_vandn_v(DisasContext *ctx, arg_vvv *a)
 vj_ofs = vec_full_offset(a->vj);
 vk_ofs = vec_full_offset(a->vk);
 
-tcg_gen_gvec_andc(MO_64, vd_ofs, vk_ofs, vj_ofs, 16, ctx->vl/8);
+tcg_gen_gvec_andc(MO_64, vd_ofs, vk_ofs, vj_ofs, oprsz, ctx->vl / 8);
 return true;
 }
+TRANS(vandn_v, do_vandn_v, 16)
 TRANS(vorn_v, gvec_vvv, 16, MO_64, tcg_gen_gvec_orc)
 TRANS(vandi_b, gvec_vv_i, 16, MO_8, tcg_gen_gvec_andi)
 TRANS(vori_b, gvec_vv_i, 16, MO_8, tcg_gen_gvec_ori)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index edaa756395..fb28666577 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1607,6 +1607,18 @@ xvmsknz_b0111 01101001 11000 11000 . .   @vv
 
 xvldi0111 0110 00 . . @v_i13
 
+xvand_v  0111 01010010 01100 . . .@vvv
+xvor_v   0111 01010010 01101 . . .@vvv
+xvxor_v  0111 01010010 01110 . . .@vvv
+xvnor_v  0111 01010010 0 . . .@vvv
+xvandn_v 0111 01010010 1 . . .@vvv
+xvorn_v  0111 01010010 10001 . . .@vvv
+
+xvandi_b 0111 0101 00  . .@vv_ui8
+xvori_b  0111 0101 01  . .@vv_ui8
+xvxori_b 0111 0101 10  . .@vv_ui8
+xvnori_b 0111 0101 11  . .@vv_ui8
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 9d13b6544c..96c9a243e1 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -914,13 +914,13 @@ void HELPER(vmsknz_b)(void *vd, void *vj, uint32_t desc)
 }
 }
 
-void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm, uint32_t v)
+void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm, uint32_t desc)
 {
 int i;
 VReg *Vd = (VReg *)vd;
 VReg *Vj = (VReg *)vj;
 
-for (i = 0; i < LSX_LEN/
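
Note on the operand order in do_vandn_v: tcg_gen_gvec_andc computes a & ~b on its two sources, and the translator passes vk first and vj second, so the instruction yields vk & ~vj. Below is a minimal per-element sketch of that wiring, assuming 64-bit elements as implied by MO_64. The helper names are hypothetical.

```c
#include <stdint.h>

/* The AND-with-complement operation applied per element: a & ~b. */
static uint64_t andc64(uint64_t a, uint64_t b)
{
    return a & ~b;
}

/* [X]VANDN.V element as wired in the patch: andc(vk, vj), i.e. ~vj & vk. */
static uint64_t vandn_elem(uint64_t vj, uint64_t vk)
{
    return andc64(vk, vj);
}
```

Swapping the two sources would silently compute vj & ~vk instead, which is why the order is spelled out here.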

[PATCH v3 46/47] target/loongarch: Implement xvld xvst

2023-07-14 Thread Song Gao

This patch includes:
- XVLD[X], XVST[X];
- XVLDREPL.{B/H/W/D};
- XVSTELM.{B/H/W/D}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c | 24 ++
 target/loongarch/insn_trans/trans_lasx.c.inc | 80 
 target/loongarch/insn_trans/trans_lsx.c.inc  | 54 ++---
 target/loongarch/insns.decode| 18 +
 4 files changed, 149 insertions(+), 27 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index a518c59772..e5fb362d7f 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1753,6 +1753,16 @@ static void output_vvr_x(DisasContext *ctx, arg_vvr *a, const char *mnemonic)
 output(ctx, mnemonic, "x%d, x%d, r%d", a->vd, a->vj, a->rk);
 }
 
+static void output_vrr_x(DisasContext *ctx, arg_vrr *a, const char *mnemonic)
+{
+output(ctx, mnemonic, "x%d, r%d, r%d", a->vd, a->rj, a->rk);
+}
+
+static void output_vr_ii_x(DisasContext *ctx, arg_vr_ii *a, const char *mnemonic)
+{
+output(ctx, mnemonic, "x%d, r%d, 0x%x, 0x%x", a->vd, a->rj, a->imm, a->imm2);
+}
+
 INSN_LASX(xvadd_b,   vvv)
 INSN_LASX(xvadd_h,   vvv)
 INSN_LASX(xvadd_w,   vvv)
@@ -2596,3 +2606,17 @@ INSN_LASX(xvextrins_d,   vv_i)
 INSN_LASX(xvextrins_w,   vv_i)
 INSN_LASX(xvextrins_h,   vv_i)
 INSN_LASX(xvextrins_b,   vv_i)
+
+INSN_LASX(xvld,  vr_i)
+INSN_LASX(xvst,  vr_i)
+INSN_LASX(xvldx, vrr)
+INSN_LASX(xvstx, vrr)
+
+INSN_LASX(xvldrepl_d,vr_i)
+INSN_LASX(xvldrepl_w,vr_i)
+INSN_LASX(xvldrepl_h,vr_i)
+INSN_LASX(xvldrepl_b,vr_i)
+INSN_LASX(xvstelm_d, vr_ii)
+INSN_LASX(xvstelm_w, vr_ii)
+INSN_LASX(xvstelm_h, vr_ii)
+INSN_LASX(xvstelm_b, vr_ii)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index cf53c12543..b8b112d7cc 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -926,3 +926,83 @@ TRANS(xvextrins_b, gen_vv_i, 32, gen_helper_vextrins_b)
 TRANS(xvextrins_h, gen_vv_i, 32, gen_helper_vextrins_h)
 TRANS(xvextrins_w, gen_vv_i, 32, gen_helper_vextrins_w)
 TRANS(xvextrins_d, gen_vv_i, 32, gen_helper_vextrins_d)
+
+static bool gen_lasx_memory(DisasContext *ctx, arg_vr_i *a,
+void (*func)(DisasContext *, int, TCGv))
+{
+TCGv addr = gpr_src(ctx, a->rj, EXT_NONE);
+TCGv temp = NULL;
+
+CHECK_VEC;
+
+if (a->imm) {
+temp = tcg_temp_new();
+tcg_gen_addi_tl(temp, addr, a->imm);
+addr = temp;
+}
+
+func(ctx, a->vd, addr);
+return true;
+}
+
+static void gen_xvld(DisasContext *ctx, int vreg, TCGv addr)
+{
+int i;
+TCGv temp = tcg_temp_new();
+TCGv dest = tcg_temp_new();
+
+tcg_gen_qemu_ld_i64(dest, addr, ctx->mem_idx, MO_TEUQ);
+set_vreg64(dest, vreg, 0);
+
+for (i = 1; i < 4; i++) {
+tcg_gen_addi_tl(temp, addr, 8 * i);
+tcg_gen_qemu_ld_i64(dest, temp, ctx->mem_idx, MO_TEUQ);
+set_vreg64(dest, vreg, i);
+}
+}
+
+static void gen_xvst(DisasContext * ctx, int vreg, TCGv addr)
+{
+int i;
+TCGv temp = tcg_temp_new();
+TCGv dest = tcg_temp_new();
+
+get_vreg64(dest, vreg, 0);
+tcg_gen_qemu_st_i64(dest, addr, ctx->mem_idx, MO_TEUQ);
+
+for (i = 1; i < 4; i++) {
+tcg_gen_addi_tl(temp, addr, 8 * i);
+get_vreg64(dest, vreg, i);
+tcg_gen_qemu_st_i64(dest, temp, ctx->mem_idx, MO_TEUQ);
+}
+}
+
+TRANS(xvld, gen_lasx_memory, gen_xvld)
+TRANS(xvst, gen_lasx_memory, gen_xvst)
+
+static bool gen_lasx_memoryx(DisasContext *ctx, arg_vrr *a,
+ void (*func)(DisasContext*, int, TCGv))
+{
+TCGv src1 = gpr_src(ctx, a->rj, EXT_NONE);
+TCGv src2 = gpr_src(ctx, a->rk, EXT_NONE);
+TCGv addr = tcg_temp_new();
+
+CHECK_VEC;
+
+tcg_gen_add_tl(addr, src1, src2);
+func(ctx, a->vd, addr);
+
+return true;
+}
+
+TRANS(xvldx, gen_lasx_memoryx, gen_xvld)
+TRANS(xvstx, gen_lasx_memoryx, gen_xvst)
+
+TRANS(xvldrepl_b, do_vldrepl, 32, MO_8)
+TRANS(xvldrepl_h, do_vldrepl, 32, MO_16)
+TRANS(xvldrepl_w, do_vldrepl, 32, MO_32)
+TRANS(xvldrepl_d, do_vldrepl, 32, MO_64)
+VSTELM(xvstelm_b, MO_8, B)
+VSTELM(xvstelm_h, MO_16, H)
+VSTELM(xvstelm_w, MO_32, W)
+VSTELM(xvstelm_d, MO_64, D)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index d2ea70d8f0..8fa721eab3 100644
--- a/target/loongarch/insn_trans/trans_lsx.c.inc
+++ b/target/loongarch/insn_trans/trans_lsx.c.inc
@@ -4430,33 +4430,33 @@ static bool trans_vstx(DisasContext *ctx, arg_vrr *a)
 return true;
 }
 
-#define VLDREPL(NAME, MO) \
-static bool trans_## NAME (DisasContext *ctx, arg_vr_i *a)\
-{ \
-TCGv addr, temp;
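
Note on gen_xvld and gen_xvst: both split the 256-bit access into four 64-bit accesses at byte offsets 0, 8, 16 and 24 from the base address, moving one 64-bit slice of the vector register each time. Below is a minimal host-side sketch of that decomposition. The xreg_sketch type and the function names are hypothetical, and memcpy in host byte order stands in for the target-endian MO_TEUQ accesses.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 256-bit register as four 64-bit slices. */
typedef struct {
    uint64_t d[4];
} xreg_sketch;

/* Load 32 bytes as four 8-byte pieces at offsets 8 * i. */
static void xvld_sketch(xreg_sketch *vd, const uint8_t *mem)
{
    for (int i = 0; i < 4; i++) {
        memcpy(&vd->d[i], mem + 8 * i, 8);
    }
}

/* Store the four slices back to memory in the same order. */
static void xvst_sketch(uint8_t *mem, const xreg_sketch *vd)
{
    for (int i = 0; i < 4; i++) {
        memcpy(mem + 8 * i, &vd->d[i], 8);
    }
}
```

A load followed by a store reproduces the original 32 bytes, which is a quick sanity check for the slice ordering.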

[PATCH v3 35/47] target/loongarch: Implement xvbitclr xvbitset xvbitrev

2023-07-14 Thread Song Gao

This patch includes:
- XVBITCLR[I].{B/H/W/D};
- XVBITSET[I].{B/H/W/D};
- XVBITREV[I].{B/H/W/D}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c | 25 ++
 target/loongarch/insn_trans/trans_lasx.c.inc | 27 +++
 target/loongarch/insns.decode| 27 +++
 target/loongarch/vec.h   |  4 ++
 target/loongarch/vec_helper.c| 48 ++--
 5 files changed, 106 insertions(+), 25 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 9e31f9bbbc..dad9243fd7 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2210,6 +2210,31 @@ INSN_LASX(xvpcnt_h,  vv)
 INSN_LASX(xvpcnt_w,  vv)
 INSN_LASX(xvpcnt_d,  vv)
 
+INSN_LASX(xvbitclr_b,vvv)
+INSN_LASX(xvbitclr_h,vvv)
+INSN_LASX(xvbitclr_w,vvv)
+INSN_LASX(xvbitclr_d,vvv)
+INSN_LASX(xvbitclri_b,   vv_i)
+INSN_LASX(xvbitclri_h,   vv_i)
+INSN_LASX(xvbitclri_w,   vv_i)
+INSN_LASX(xvbitclri_d,   vv_i)
+INSN_LASX(xvbitset_b,vvv)
+INSN_LASX(xvbitset_h,vvv)
+INSN_LASX(xvbitset_w,vvv)
+INSN_LASX(xvbitset_d,vvv)
+INSN_LASX(xvbitseti_b,   vv_i)
+INSN_LASX(xvbitseti_h,   vv_i)
+INSN_LASX(xvbitseti_w,   vv_i)
+INSN_LASX(xvbitseti_d,   vv_i)
+INSN_LASX(xvbitrev_b,vvv)
+INSN_LASX(xvbitrev_h,vvv)
+INSN_LASX(xvbitrev_w,vvv)
+INSN_LASX(xvbitrev_d,vvv)
+INSN_LASX(xvbitrevi_b,   vv_i)
+INSN_LASX(xvbitrevi_h,   vv_i)
+INSN_LASX(xvbitrevi_w,   vv_i)
+INSN_LASX(xvbitrevi_d,   vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 94824569a0..e52c7551d9 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -529,6 +529,33 @@ TRANS(xvpcnt_h, gen_vv, 32, gen_helper_vpcnt_h)
 TRANS(xvpcnt_w, gen_vv, 32, gen_helper_vpcnt_w)
 TRANS(xvpcnt_d, gen_vv, 32, gen_helper_vpcnt_d)
 
+TRANS(xvbitclr_b, gvec_vvv, 32, MO_8, do_vbitclr)
+TRANS(xvbitclr_h, gvec_vvv, 32, MO_16, do_vbitclr)
+TRANS(xvbitclr_w, gvec_vvv, 32, MO_32, do_vbitclr)
+TRANS(xvbitclr_d, gvec_vvv, 32, MO_64, do_vbitclr)
+TRANS(xvbitclri_b, gvec_vv_i, 32, MO_8, do_vbitclri)
+TRANS(xvbitclri_h, gvec_vv_i, 32, MO_16, do_vbitclri)
+TRANS(xvbitclri_w, gvec_vv_i, 32, MO_32, do_vbitclri)
+TRANS(xvbitclri_d, gvec_vv_i, 32, MO_64, do_vbitclri)
+
+TRANS(xvbitset_b, gvec_vvv, 32, MO_8, do_vbitset)
+TRANS(xvbitset_h, gvec_vvv, 32, MO_16, do_vbitset)
+TRANS(xvbitset_w, gvec_vvv, 32, MO_32, do_vbitset)
+TRANS(xvbitset_d, gvec_vvv, 32, MO_64, do_vbitset)
+TRANS(xvbitseti_b, gvec_vv_i, 32, MO_8, do_vbitseti)
+TRANS(xvbitseti_h, gvec_vv_i, 32, MO_16, do_vbitseti)
+TRANS(xvbitseti_w, gvec_vv_i, 32, MO_32, do_vbitseti)
+TRANS(xvbitseti_d, gvec_vv_i, 32, MO_64, do_vbitseti)
+
+TRANS(xvbitrev_b, gvec_vvv, 32, MO_8, do_vbitrev)
+TRANS(xvbitrev_h, gvec_vvv, 32, MO_16, do_vbitrev)
+TRANS(xvbitrev_w, gvec_vvv, 32, MO_32, do_vbitrev)
+TRANS(xvbitrev_d, gvec_vvv, 32, MO_64, do_vbitrev)
+TRANS(xvbitrevi_b, gvec_vv_i, 32, MO_8, do_vbitrevi)
+TRANS(xvbitrevi_h, gvec_vv_i, 32, MO_16, do_vbitrevi)
+TRANS(xvbitrevi_w, gvec_vv_i, 32, MO_32, do_vbitrevi)
+TRANS(xvbitrevi_d, gvec_vv_i, 32, MO_64, do_vbitrevi)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index d683c6a6ab..cb6db8002a 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1784,6 +1784,33 @@ xvpcnt_h 0111 01101001 11000 01001 . .@vv
 xvpcnt_w 0111 01101001 11000 01010 . .@vv
 xvpcnt_d 0111 01101001 11000 01011 . .@vv
 
+xvbitclr_b   0111 0101 11000 . . .@vvv
+xvbitclr_h   0111 0101 11001 . . .@vvv
+xvbitclr_w   0111 0101 11010 . . .@vvv
+xvbitclr_d   0111 0101 11011 . . .@vvv
+xvbitclri_b  0111 01110001 0 01 ... . .   @vv_ui3
+xvbitclri_h  0111 01110001 0 1  . .   @vv_ui4
+xvbitclri_w  0111 01110001 1 . . .@vv_ui5
+xvbitclri_d  0111 01110001 0001 .. . .@vv_ui6
+
+xvbitset_b   0111 0101 11100 . . .@vvv
+xvbitset_h   0111 0101 11101 . . .@vvv
+xvbitset_w   0111 0101 0 . . .@vvv
+xvbitset_d   0111 0101 1 . . .@vvv
+xvbitseti_b  0111 01110001 01000 01 ... . .   @vv_ui3
+xvbitseti_h  0111 01110001 01000 1  . .   @vv_ui4
+xvbitseti_w  0111 01110001 01001 . . .@vv_ui5
+xvbitseti_d  0111 01110001 0101 .. . .

[PATCH v3 37/47] target/loongarch: Implement LASX fpu arith instructions

2023-07-14 Thread Song Gao

This patch includes:
- XVF{ADD/SUB/MUL/DIV}.{S/D};
- XVF{MADD/MSUB/NMADD/NMSUB}.{S/D};
- XVF{MAX/MIN}.{S/D};
- XVF{MAXA/MINA}.{S/D};
- XVFLOGB.{S/D};
- XVFCLASS.{S/D};
- XVF{SQRT/RECIP/RSQRT}.{S/D}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c | 46 +++
 target/loongarch/insn_trans/trans_lasx.c.inc | 41 ++
 target/loongarch/insns.decode| 41 ++
 target/loongarch/vec_helper.c| 82 +++-
 4 files changed, 172 insertions(+), 38 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 27d6252686..4af74f1ae9 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1708,6 +1708,11 @@ static void output_v_i_x(DisasContext *ctx, arg_v_i *a, const char *mnemonic)
 output(ctx, mnemonic, "x%d, 0x%x", a->vd, a->imm);
 }
 
+static void output_vvvv_x(DisasContext *ctx, arg_vvvv *a, const char *mnemonic)
+{
+output(ctx, mnemonic, "x%d, x%d, x%d, x%d", a->vd, a->vj, a->vk, a->va);
+}
+
 static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemonic)
 {
 output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk);
@@ -2240,6 +2245,47 @@ INSN_LASX(xvfrstp_h, vvv)
 INSN_LASX(xvfrstpi_b,vv_i)
 INSN_LASX(xvfrstpi_h,vv_i)
 
+INSN_LASX(xvfadd_s,  vvv)
+INSN_LASX(xvfadd_d,  vvv)
+INSN_LASX(xvfsub_s,  vvv)
+INSN_LASX(xvfsub_d,  vvv)
+INSN_LASX(xvfmul_s,  vvv)
+INSN_LASX(xvfmul_d,  vvv)
+INSN_LASX(xvfdiv_s,  vvv)
+INSN_LASX(xvfdiv_d,  vvv)
+
+INSN_LASX(xvfmadd_s, vvvv)
+INSN_LASX(xvfmadd_d, vvvv)
+INSN_LASX(xvfmsub_s, vvvv)
+INSN_LASX(xvfmsub_d, vvvv)
+INSN_LASX(xvfnmadd_s,vvvv)
+INSN_LASX(xvfnmadd_d,vvvv)
+INSN_LASX(xvfnmsub_s,vvvv)
+INSN_LASX(xvfnmsub_d,vvvv)
+
+INSN_LASX(xvfmax_s,  vvv)
+INSN_LASX(xvfmax_d,  vvv)
+INSN_LASX(xvfmin_s,  vvv)
+INSN_LASX(xvfmin_d,  vvv)
+
+INSN_LASX(xvfmaxa_s, vvv)
+INSN_LASX(xvfmaxa_d, vvv)
+INSN_LASX(xvfmina_s, vvv)
+INSN_LASX(xvfmina_d, vvv)
+
+INSN_LASX(xvflogb_s, vv)
+INSN_LASX(xvflogb_d, vv)
+
+INSN_LASX(xvfclass_s,vv)
+INSN_LASX(xvfclass_d,vv)
+
+INSN_LASX(xvfsqrt_s, vv)
+INSN_LASX(xvfsqrt_d, vv)
+INSN_LASX(xvfrecip_s,vv)
+INSN_LASX(xvfrecip_d,vv)
+INSN_LASX(xvfrsqrt_s,vv)
+INSN_LASX(xvfrsqrt_d,vv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 081f692416..912b52cfdc 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -561,6 +561,47 @@ TRANS(xvfrstp_h, gen_vvv, 32, gen_helper_vfrstp_h)
 TRANS(xvfrstpi_b, gen_vv_i, 32, gen_helper_vfrstpi_b)
 TRANS(xvfrstpi_h, gen_vv_i, 32, gen_helper_vfrstpi_h)
 
+TRANS(xvfadd_s, gen_vvv_f, 32, gen_helper_vfadd_s)
+TRANS(xvfadd_d, gen_vvv_f, 32, gen_helper_vfadd_d)
+TRANS(xvfsub_s, gen_vvv_f, 32, gen_helper_vfsub_s)
+TRANS(xvfsub_d, gen_vvv_f, 32, gen_helper_vfsub_d)
+TRANS(xvfmul_s, gen_vvv_f, 32, gen_helper_vfmul_s)
+TRANS(xvfmul_d, gen_vvv_f, 32, gen_helper_vfmul_d)
+TRANS(xvfdiv_s, gen_vvv_f, 32, gen_helper_vfdiv_s)
+TRANS(xvfdiv_d, gen_vvv_f, 32, gen_helper_vfdiv_d)
+
+TRANS(xvfmadd_s, gen_vvvv_f, 32, gen_helper_vfmadd_s)
+TRANS(xvfmadd_d, gen_vvvv_f, 32, gen_helper_vfmadd_d)
+TRANS(xvfmsub_s, gen_vvvv_f, 32, gen_helper_vfmsub_s)
+TRANS(xvfmsub_d, gen_vvvv_f, 32, gen_helper_vfmsub_d)
+TRANS(xvfnmadd_s, gen_vvvv_f, 32, gen_helper_vfnmadd_s)
+TRANS(xvfnmadd_d, gen_vvvv_f, 32, gen_helper_vfnmadd_d)
+TRANS(xvfnmsub_s, gen_vvvv_f, 32, gen_helper_vfnmsub_s)
+TRANS(xvfnmsub_d, gen_vvvv_f, 32, gen_helper_vfnmsub_d)
+
+TRANS(xvfmax_s, gen_vvv_f, 32, gen_helper_vfmax_s)
+TRANS(xvfmax_d, gen_vvv_f, 32, gen_helper_vfmax_d)
+TRANS(xvfmin_s, gen_vvv_f, 32, gen_helper_vfmin_s)
+TRANS(xvfmin_d, gen_vvv_f, 32, gen_helper_vfmin_d)
+
+TRANS(xvfmaxa_s, gen_vvv_f, 32, gen_helper_vfmaxa_s)
+TRANS(xvfmaxa_d, gen_vvv_f, 32, gen_helper_vfmaxa_d)
+TRANS(xvfmina_s, gen_vvv_f, 32, gen_helper_vfmina_s)
+TRANS(xvfmina_d, gen_vvv_f, 32, gen_helper_vfmina_d)
+
+TRANS(xvflogb_s, gen_vv_f, 32, gen_helper_vflogb_s)
+TRANS(xvflogb_d, gen_vv_f, 32, gen_helper_vflogb_d)
+
+TRANS(xvfclass_s, gen_vv_f, 32, gen_helper_vfclass_s)
+TRANS(xvfclass_d, gen_vv_f, 32, gen_helper_vfclass_d)
+
+TRANS(xvfsqrt_s, gen_vv_f, 32, gen_helper_vfsqrt_s)
+TRANS(xvfsqrt_d, gen_vv_f, 32, gen_helper_vfsqrt_d)
+TRANS(xvfrecip_s, gen_vv_f, 32, gen_helper_vfrecip_s)
+TRANS(xvfrecip_d, gen_vv_f, 32, gen_helper_vfrecip_d)
+TRANS(xvfrsqrt_s, gen_vv_f, 32, gen_helper_vfrsqrt_s)
+TRANS(xvfrsqrt_d, gen_vv_f, 32, gen_helper_vfrsqrt_d)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, g

[PATCH v3 47/47] target/loongarch: CPUCFG support LASX

2023-07-14 Thread Song Gao

Signed-off-by: Song Gao 
---
 target/loongarch/cpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
index c9f9cbb19d..aeccbb42e6 100644
--- a/target/loongarch/cpu.c
+++ b/target/loongarch/cpu.c
@@ -392,6 +392,7 @@ static void loongarch_la464_initfn(Object *obj)
 data = FIELD_DP32(data, CPUCFG2, FP_DP, 1);
 data = FIELD_DP32(data, CPUCFG2, FP_VER, 1);
 data = FIELD_DP32(data, CPUCFG2, LSX, 1),
+data = FIELD_DP32(data, CPUCFG2, LASX, 1);
 data = FIELD_DP32(data, CPUCFG2, LLFTP, 1);
 data = FIELD_DP32(data, CPUCFG2, LLFTP_VER, 1);
 data = FIELD_DP32(data, CPUCFG2, LAM, 1);
-- 
2.39.1

[PATCH v3 21/47] target/loongarch: Implement vext2xv

2023-07-14 Thread Song Gao

This patch includes:
- VEXT2XV.{H/W/D}.B, VEXT2XV.{HU/WU/DU}.BU;
- VEXT2XV.{W/D}.H, VEXT2XV.{WU/DU}.HU;
- VEXT2XV.D.W, VEXT2XV.DU.WU.
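
For readers unfamiliar with these instructions, the following is a minimal C sketch
of the byte-to-halfword forms, assuming they widen the low 16 bytes of the source
into the 16 halfwords of the 256-bit destination. The function names and plain-array
types are illustrative only and are not the helpers added by this patch.

```c
#include <stdint.h>

/* Sketch of VEXT2XV.H.B under the stated assumption: sign-extend the
 * low 16 bytes of the source into 16 halfwords. */
static void vext2xv_h_b_sketch(int16_t dst[16], const int8_t src[32])
{
    for (int i = 0; i < 16; i++) {
        dst[i] = (int16_t)src[i];
    }
}

/* Sketch of VEXT2XV.HU.BU: same shape, but zero-extending. */
static void vext2xv_hu_bu_sketch(uint16_t dst[16], const uint8_t src[32])
{
    for (int i = 0; i < 16; i++) {
        dst[i] = (uint16_t)src[i];
    }
}
```

The wider variants follow the same pattern with fewer source elements, for example
the low 4 bytes for a byte-to-doubleword extension.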

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c | 13 +
 target/loongarch/helper.h| 13 +
 target/loongarch/insn_trans/trans_lasx.c.inc | 13 +
 target/loongarch/insns.decode| 13 +
 target/loongarch/vec_helper.c| 28 
 5 files changed, 80 insertions(+)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 6ca545956d..975ea018da 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1997,6 +1997,19 @@ INSN_LASX(xvexth_wu_hu,  vv)
 INSN_LASX(xvexth_du_wu,  vv)
 INSN_LASX(xvexth_qu_du,  vv)
 
+INSN_LASX(vext2xv_h_b,   vv)
+INSN_LASX(vext2xv_w_b,   vv)
+INSN_LASX(vext2xv_d_b,   vv)
+INSN_LASX(vext2xv_w_h,   vv)
+INSN_LASX(vext2xv_d_h,   vv)
+INSN_LASX(vext2xv_d_w,   vv)
+INSN_LASX(vext2xv_hu_bu, vv)
+INSN_LASX(vext2xv_wu_bu, vv)
+INSN_LASX(vext2xv_du_bu, vv)
+INSN_LASX(vext2xv_wu_hu, vv)
+INSN_LASX(vext2xv_du_hu, vv)
+INSN_LASX(vext2xv_du_wu, vv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h
index 3b3b179a2b..a95059a8c2 100644
--- a/target/loongarch/helper.h
+++ b/target/loongarch/helper.h
@@ -339,6 +339,19 @@ DEF_HELPER_FLAGS_3(vexth_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(vexth_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(vexth_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(vext2xv_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(vext2xv_w_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(vext2xv_d_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(vext2xv_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(vext2xv_d_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(vext2xv_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(vext2xv_hu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(vext2xv_wu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(vext2xv_du_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(vext2xv_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(vext2xv_du_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(vext2xv_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_4(vsigncov_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vsigncov_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vsigncov_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 1744521a53..5a99c75858 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -322,6 +322,19 @@ TRANS(xvexth_wu_hu, gen_vv, 32, gen_helper_vexth_wu_hu)
 TRANS(xvexth_du_wu, gen_vv, 32, gen_helper_vexth_du_wu)
 TRANS(xvexth_qu_du, gen_vv, 32, gen_helper_vexth_qu_du)
 
+TRANS(vext2xv_h_b, gen_vv, 32, gen_helper_vext2xv_h_b)
+TRANS(vext2xv_w_b, gen_vv, 32, gen_helper_vext2xv_w_b)
+TRANS(vext2xv_d_b, gen_vv, 32, gen_helper_vext2xv_d_b)
+TRANS(vext2xv_w_h, gen_vv, 32, gen_helper_vext2xv_w_h)
+TRANS(vext2xv_d_h, gen_vv, 32, gen_helper_vext2xv_d_h)
+TRANS(vext2xv_d_w, gen_vv, 32, gen_helper_vext2xv_d_w)
+TRANS(vext2xv_hu_bu, gen_vv, 32, gen_helper_vext2xv_hu_bu)
+TRANS(vext2xv_wu_bu, gen_vv, 32, gen_helper_vext2xv_wu_bu)
+TRANS(vext2xv_du_bu, gen_vv, 32, gen_helper_vext2xv_du_bu)
+TRANS(vext2xv_wu_hu, gen_vv, 32, gen_helper_vext2xv_wu_hu)
+TRANS(vext2xv_du_hu, gen_vv, 32, gen_helper_vext2xv_du_hu)
+TRANS(vext2xv_du_wu, gen_vv, 32, gen_helper_vext2xv_du_wu)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 7491f295a5..db1a6689f0 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1580,6 +1580,19 @@ xvexth_wu_hu 0111 01101001 11101 11101 . .@vv
 xvexth_du_wu 0111 01101001 11101 0 . .@vv
 xvexth_qu_du 0111 01101001 11101 1 . .@vv
 
+vext2xv_h_b  0111 01101001 0 00100 . .@vv
+vext2xv_w_b  0111 01101001 0 00101 . .@vv
+vext2xv_d_b  0111 01101001 0 00110 . .@vv
+vext2xv_w_h  0111 01101001 0 00111 . .@vv
+vext2xv_d_h  0111 01101001 0 01000 . .@vv
+vext2xv_d_w  0111 01101001 0 01001 . .@vv
+vext2xv_hu_bu0111 01101001 0 01010 . .@vv
+vext2xv_wu_bu0111 01101001 0 01011 . .@vv
+vext2xv_du_bu0111 0

[PATCH v3 00/47] Add LoongArch LASX instructions

2023-07-14 Thread Song Gao

Hi,

This series adds LoongArch LASX instructions.

About testing:
We use RISU to test the LoongArch LASX instructions.

QEMU:
https://github.com/loongson/qemu/tree/tcg-old-abi-support-lasx
RISU:
https://github.com/loongson/risu/tree/loongarch-suport-lasx

Please review. Thanks.

Changes for v3:
- Add a new patch 9, rename lsx_helper.c to vec_helper.c,
  and use gen_helper_gvec_* series functions;
- Use i < oprsz / (BIT / 8) as the loop bound;
- Some helper functions now use loops;
- patch 46: use tcg_gen_qemu_ld/st_i64 for xvld/xvst{x};
- R-b.
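
To make the loop-bound change concrete, here is a small C sketch of the
"i < oprsz / (BIT / 8)" pattern. It assumes oprsz is the operation size in bytes
(16 for LSX, 32 for LASX) and BIT is the element width in bits; the function names
are hypothetical and the real helpers operate on VReg rather than plain arrays.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of lanes for an operation of oprsz bytes with bit-wide elements. */
static size_t lane_count(size_t oprsz, unsigned bit)
{
    return oprsz / (bit / 8);
}

/* Hypothetical 16-bit element-wise add; the same body serves 128-bit and
 * 256-bit vectors because only the loop bound depends on oprsz. */
static void vadd_h_sketch(uint16_t *vd, const uint16_t *vj,
                          const uint16_t *vk, size_t oprsz)
{
    for (size_t i = 0; i < lane_count(oprsz, 16); i++) {
        vd[i] = (uint16_t)(vj[i] + vk[i]);
    }
}
```

Calling vadd_h_sketch() with oprsz = 16 touches 8 halfwords, and with oprsz = 32 it
touches 16, which is what lets a single helper be shared by both vector widths.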

Changes for v2:
- Expand the definition of VReg to be 256 bits.
- Use more LSX functions.
- R-b.
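
As a rough picture of the wider register and of how xvld/xvst can move it through
64-bit accesses, here is a self-contained C sketch. The union name, the LASX_LEN
value, and the function are assumptions made for illustration; this is not the
definition in cpu.h, and it ignores guest endianness and MMU handling.

```c
#include <stdint.h>
#include <string.h>

#define LASX_LEN 256   /* assumed register width in bits */

/* Illustrative 256-bit register viewed as lanes of different widths. */
typedef union {
    int8_t  B[LASX_LEN / 8];
    int16_t H[LASX_LEN / 16];
    int32_t W[LASX_LEN / 32];
    int64_t D[LASX_LEN / 64];
} VRegSketch;

/* Fill the register from memory as four 64-bit chunks at byte offsets
 * 8 * i, mirroring the shape of the xvld translation in patch 46. */
static void xvld_sketch(VRegSketch *v, const uint8_t *mem)
{
    for (int i = 0; i < 4; i++) {
        memcpy(&v->D[i], mem + 8 * i, sizeof(v->D[i]));
    }
}
```

A store would run the same loop with source and destination swapped.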

Song Gao (47):
  target/loongarch: Add LASX data support
  target/loongarch: meson.build support build LASX
  target/loongarch: Add CHECK_ASXE macro to check whether LASX is enabled
  target/loongarch: Implement xvadd/xvsub
  target/loongarch: Implement xvreplgr2vr
  target/loongarch: Implement xvaddi/xvsubi
  target/loongarch: Implement xvneg
  target/loongarch: Implement xvsadd/xvssub
  target/loongarch: rename lsx_helper.c to vec_helper.c
  target/loongarch: Implement xvhaddw/xvhsubw
  target/loongarch: Implement xvaddw/xvsubw
  target/loongarch: Implement xavg/xvagr
  target/loongarch: Implement xvabsd
  target/loongarch: Implement xvadda
  target/loongarch: Implement xvmax/xvmin
  target/loongarch: Implement xvmul/xvmuh/xvmulw{ev/od}
  target/loongarch: Implement xvmadd/xvmsub/xvmaddw{ev/od}
  target/loongarch: Implement xvdiv/xvmod
  target/loongarch: Implement xvsat
  target/loongarch: Implement xvexth
  target/loongarch: Implement vext2xv
  target/loongarch: Implement xvsigncov
  target/loongarch: Implement xvmskltz/xvmskgez/xvmsknz
  target/loongarch: Implement xvldi
  target/loongarch: Implement LASX logic instructions
  target/loongarch: Implement xvsll xvsrl xvsra xvrotr
  target/loongarch: Implement xvsllwil xvextl
  target/loongarch: Implement xvsrlr xvsrar
  target/loongarch: Implement xvsrln xvsran
  target/loongarch: Implement xvsrlrn xvsrarn
  target/loongarch: Implement xvssrln xvssran
  target/loongarch: Implement xvssrlrn xvssrarn
  target/loongarch: Implement xvclo xvclz
  target/loongarch: Implement xvpcnt
  target/loongarch: Implement xvbitclr xvbitset xvbitrev
  target/loongarch: Implement xvfrstp
  target/loongarch: Implement LASX fpu arith instructions
  target/loongarch: Implement LASX fpu fcvt instructions
  target/loongarch: Implement xvseq xvsle xvslt
  target/loongarch: Implement xvfcmp
  target/loongarch: Implement xvbitsel xvset
  target/loongarch: Implement xvinsgr2vr xvpickve2gr
  target/loongarch: Implement xvreplve xvinsve0 xvpickve xvb{sll/srl}v
  target/loongarch: Implement xvpack xvpick xvilv{l/h}
  target/loongarch: Implement xvshuf xvperm{i} xvshuf4i xvextrins
  target/loongarch: Implement xvld xvst
  target/loongarch: CPUCFG support LASX

 linux-user/loongarch64/signal.c  |1 +
 target/loongarch/cpu.c   |4 +
 target/loongarch/cpu.h   |   26 +-
 target/loongarch/disas.c |  925 +
 target/loongarch/gdbstub.c   |1 +
 target/loongarch/helper.h|  689 ++--
 target/loongarch/insn_trans/trans_lasx.c.inc | 1008 +
 target/loongarch/insn_trans/trans_lsx.c.inc  | 2047 ++-
 target/loongarch/insns.decode|  782 
 target/loongarch/internals.h |   22 -
 target/loongarch/lsx_helper.c| 3004 ---
 target/loongarch/machine.c   |   36 +-
 target/loongarch/meson.build |2 +-
 target/loongarch/translate.c |6 +
 target/loongarch/vec.h   |   98 +
 target/loongarch/vec_helper.c| 3431 ++
 16 files changed, 7723 insertions(+), 4359 deletions(-)
 create mode 100644 target/loongarch/insn_trans/trans_lasx.c.inc
 delete mode 100644 target/loongarch/lsx_helper.c
 create mode 100644 target/loongarch/vec.h
 create mode 100644 target/loongarch/vec_helper.c

-- 
2.39.1

[PATCH v3 41/47] target/loongarch: Implement xvbitsel xvset

2023-07-14 Thread Song Gao

This patch includes:
- XVBITSEL.V;
- XVBITSELI.B;
- XVSET{EQZ/NEZ}.V;
- XVSETANYEQZ.{B/H/W/D};
- XVSETALLNEZ.{B/H/W/D}.
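
The condition-flag instructions can be summarised with a short C sketch. It assumes
that XVSETEQZ.V/XVSETNEZ.V test the whole 256-bit register and that the ANYEQZ/ALLNEZ
forms test each element; the function names are hypothetical and the result is
returned as a bool rather than written to an fcc register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Whole-vector tests: OR the four 64-bit lanes and compare with zero. */
static bool xvseteqz_v_sketch(const uint64_t d[4])
{
    return (d[0] | d[1] | d[2] | d[3]) == 0;
}

static bool xvsetnez_v_sketch(const uint64_t d[4])
{
    return (d[0] | d[1] | d[2] | d[3]) != 0;
}

/* Element tests for the .B forms: true if any byte lane is zero. */
static bool xvsetanyeqz_b_sketch(const uint8_t b[32])
{
    for (int i = 0; i < 32; i++) {
        if (b[i] == 0) {
            return true;
        }
    }
    return false;
}

/* True only if every byte lane is non-zero. */
static bool xvsetallnez_b_sketch(const uint8_t b[32])
{
    return !xvsetanyeqz_b_sketch(b);
}
```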

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c | 19 +
 target/loongarch/helper.h| 16 
 target/loongarch/insn_trans/trans_lasx.c.inc | 42 
 target/loongarch/insn_trans/trans_lsx.c.inc  | 36 ++---
 target/loongarch/insns.decode| 15 +++
 target/loongarch/vec_helper.c| 40 ---
 6 files changed, 130 insertions(+), 38 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 607774375c..3a06b5cb80 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1703,6 +1703,11 @@ static bool trans_##insn(DisasContext *ctx, arg_##type * a) \
 return true;\
 }
 
+static void output_cv_x(DisasContext *ctx, arg_cv *a, const char *mnemonic)
+{
+output(ctx, mnemonic, "fcc%d, x%d", a->cd, a->vj);
+}
+
 static void output_v_i_x(DisasContext *ctx, arg_v_i *a, const char *mnemonic)
 {
 output(ctx, mnemonic, "x%d, 0x%x", a->vd, a->imm);
@@ -2479,6 +2484,20 @@ static bool trans_xvfcmp_cond_##suffix(DisasContext *ctx, \
 LASX_FCMP_INSN(s)
 LASX_FCMP_INSN(d)
 
+INSN_LASX(xvbitsel_v,vvvv)
+INSN_LASX(xvbitseli_b,   vv_i)
+
+INSN_LASX(xvseteqz_v,cv)
+INSN_LASX(xvsetnez_v,cv)
+INSN_LASX(xvsetanyeqz_b, cv)
+INSN_LASX(xvsetanyeqz_h, cv)
+INSN_LASX(xvsetanyeqz_w, cv)
+INSN_LASX(xvsetanyeqz_d, cv)
+INSN_LASX(xvsetallnez_b, cv)
+INSN_LASX(xvsetallnez_h, cv)
+INSN_LASX(xvsetallnez_w, cv)
+INSN_LASX(xvsetallnez_d, cv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h
index 31b3caaa96..21993c8987 100644
--- a/target/loongarch/helper.h
+++ b/target/loongarch/helper.h
@@ -658,14 +658,14 @@ DEF_HELPER_6(vfcmp_s_d, void, env, i32, i32, i32, i32, i32)
 
 DEF_HELPER_FLAGS_4(vbitseli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 
-DEF_HELPER_3(vsetanyeqz_b, void, env, i32, i32)
-DEF_HELPER_3(vsetanyeqz_h, void, env, i32, i32)
-DEF_HELPER_3(vsetanyeqz_w, void, env, i32, i32)
-DEF_HELPER_3(vsetanyeqz_d, void, env, i32, i32)
-DEF_HELPER_3(vsetallnez_b, void, env, i32, i32)
-DEF_HELPER_3(vsetallnez_h, void, env, i32, i32)
-DEF_HELPER_3(vsetallnez_w, void, env, i32, i32)
-DEF_HELPER_3(vsetallnez_d, void, env, i32, i32)
+DEF_HELPER_4(vsetanyeqz_b, void, env, i32, i32, i32)
+DEF_HELPER_4(vsetanyeqz_h, void, env, i32, i32, i32)
+DEF_HELPER_4(vsetanyeqz_w, void, env, i32, i32, i32)
+DEF_HELPER_4(vsetanyeqz_d, void, env, i32, i32, i32)
+DEF_HELPER_4(vsetallnez_b, void, env, i32, i32, i32)
+DEF_HELPER_4(vsetallnez_h, void, env, i32, i32, i32)
+DEF_HELPER_4(vsetallnez_w, void, env, i32, i32, i32)
+DEF_HELPER_4(vsetallnez_d, void, env, i32, i32, i32)
 
 DEF_HELPER_FLAGS_4(vpackev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vpackev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 57cab4e056..700cbdc622 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -704,6 +704,48 @@ TRANS(xvslti_du, do_vslti_u, 32, MO_64)
 TRANS(xvfcmp_cond_s, do_vfcmp_cond_s, 32)
 TRANS(xvfcmp_cond_d, do_vfcmp_cond_d, 32)
 
+TRANS(xvbitsel_v, do_vbitsel_v, 32)
+TRANS(xvbitseli_b, do_vbitseli_b, 32)
+
+#define XVSET(NAME, COND)                                 \
+static bool trans_## NAME(DisasContext *ctx, arg_cv * a)  \
+{                                                         \
+TCGv_i64 t1, t2, d[4];                                    \
+                                                          \
+d[0] = tcg_temp_new_i64();                                \
+d[1] = tcg_temp_new_i64();                                \
+d[2] = tcg_temp_new_i64();                                \
+d[3] = tcg_temp_new_i64();                                \
+t1 = tcg_temp_new_i64();                                  \
+t2 = tcg_temp_new_i64();                                  \
+                                                          \
+get_vreg64(d[0], a->vj, 0);                               \
+get_vreg64(d[1], a->vj, 1);                               \
+get_vreg64(d[2], a->vj, 2);                               \
+get_vreg64(d[3], a->vj, 3);                               \
+

[PATCH v3 40/47] target/loongarch: Implement xvfcmp

2023-07-14 Thread Song Gao

This patch includes:
- XVFCMP.cond.{S/D}.
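
The disassembler part of this patch maps the fcond field to a mnemonic with a switch
statement. The same mapping can be written as a lookup table, shown below as a
sketch only. The function name is hypothetical; the condition values are the ones
handled by output_xxx_fcond() in the diff, and unlisted values return NULL where the
switch returns false.

```c
#include <stddef.h>
#include <stdint.h>

/* Table-driven variant of the fcond -> condition-name mapping. Gaps
 * (0x12, 0x13, 0x16, 0x17) are left NULL by the designated initialisers. */
static const char *xvfcmp_cond_name(uint32_t fcond)
{
    static const char *const names[0x1A] = {
        [0x00] = "caf",  [0x01] = "saf",  [0x02] = "clt",  [0x03] = "slt",
        [0x04] = "ceq",  [0x05] = "seq",  [0x06] = "cle",  [0x07] = "sle",
        [0x08] = "cun",  [0x09] = "sun",  [0x0A] = "cult", [0x0B] = "sult",
        [0x0C] = "cueq", [0x0D] = "sueq", [0x0E] = "cule", [0x0F] = "sule",
        [0x10] = "cne",  [0x11] = "sne",  [0x14] = "cor",  [0x15] = "sor",
        [0x18] = "cune", [0x19] = "sune",
    };

    return fcond < 0x1A ? names[fcond] : NULL;
}
```

A caller could then format "xvfcmp_<cond>_<suffix>" once instead of repeating the
output call in every case; whether that is preferable to the explicit switch is a
matter of taste.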

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c | 94 
 target/loongarch/helper.h|  8 +-
 target/loongarch/insn_trans/trans_lasx.c.inc |  3 +
 target/loongarch/insn_trans/trans_lsx.c.inc  | 19 ++--
 target/loongarch/insns.decode|  3 +
 target/loongarch/vec_helper.c|  4 +-
 6 files changed, 119 insertions(+), 12 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 295ba74f2b..607774375c 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2385,6 +2385,100 @@ INSN_LASX(xvslti_hu, vv_i)
 INSN_LASX(xvslti_wu, vv_i)
 INSN_LASX(xvslti_du, vv_i)
 
+#define output_xvfcmp(C, PREFIX, SUFFIX)\
+{   \
+(C)->info->fprintf_func((C)->info->stream, "%08x  %s%s\tx%d, x%d, x%d", \
+(C)->insn, PREFIX, SUFFIX, a->vd,   \
+a->vj, a->vk);  \
+}
+
+static bool output_xxx_fcond(DisasContext *ctx, arg_vvv_fcond * a,
+ const char *suffix)
+{
+bool ret = true;
+switch (a->fcond) {
+case 0x0:
+output_xvfcmp(ctx, "xvfcmp_caf_", suffix);
+break;
+case 0x1:
+output_xvfcmp(ctx, "xvfcmp_saf_", suffix);
+break;
+case 0x2:
+output_xvfcmp(ctx, "xvfcmp_clt_", suffix);
+break;
+case 0x3:
+output_xvfcmp(ctx, "xvfcmp_slt_", suffix);
+break;
+case 0x4:
+output_xvfcmp(ctx, "xvfcmp_ceq_", suffix);
+break;
+case 0x5:
+output_xvfcmp(ctx, "xvfcmp_seq_", suffix);
+break;
+case 0x6:
+output_xvfcmp(ctx, "xvfcmp_cle_", suffix);
+break;
+case 0x7:
+output_xvfcmp(ctx, "xvfcmp_sle_", suffix);
+break;
+case 0x8:
+output_xvfcmp(ctx, "xvfcmp_cun_", suffix);
+break;
+case 0x9:
+output_xvfcmp(ctx, "xvfcmp_sun_", suffix);
+break;
+case 0xA:
+output_xvfcmp(ctx, "xvfcmp_cult_", suffix);
+break;
+case 0xB:
+output_xvfcmp(ctx, "xvfcmp_sult_", suffix);
+break;
+case 0xC:
+output_xvfcmp(ctx, "xvfcmp_cueq_", suffix);
+break;
+case 0xD:
+output_xvfcmp(ctx, "xvfcmp_sueq_", suffix);
+break;
+case 0xE:
+output_xvfcmp(ctx, "xvfcmp_cule_", suffix);
+break;
+case 0xF:
+output_xvfcmp(ctx, "xvfcmp_sule_", suffix);
+break;
+case 0x10:
+output_xvfcmp(ctx, "xvfcmp_cne_", suffix);
+break;
+case 0x11:
+output_xvfcmp(ctx, "xvfcmp_sne_", suffix);
+break;
+case 0x14:
+output_xvfcmp(ctx, "xvfcmp_cor_", suffix);
+break;
+case 0x15:
+output_xvfcmp(ctx, "xvfcmp_sor_", suffix);
+break;
+case 0x18:
+output_xvfcmp(ctx, "xvfcmp_cune_", suffix);
+break;
+case 0x19:
+output_xvfcmp(ctx, "xvfcmp_sune_", suffix);
+break;
+default:
+ret = false;
+}
+return ret;
+}
+
+#define LASX_FCMP_INSN(suffix)\
+static bool trans_xvfcmp_cond_##suffix(DisasContext *ctx, \
+   arg_vvv_fcond * a) \
+{ \
+return output_xxx_fcond(ctx, a, #suffix); \
+}
+
+LASX_FCMP_INSN(s)
+LASX_FCMP_INSN(d)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h
index a95059a8c2..31b3caaa96 100644
--- a/target/loongarch/helper.h
+++ b/target/loongarch/helper.h
@@ -651,10 +651,10 @@ DEF_HELPER_FLAGS_4(vslti_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(vslti_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(vslti_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 
-DEF_HELPER_5(vfcmp_c_s, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(vfcmp_s_s, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(vfcmp_c_d, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(vfcmp_s_d, void, env, i32, i32, i32, i32)
+DEF_HELPER_6(vfcmp_c_s, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vfcmp_s_s, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vfcmp_c_d, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vfcmp_s_d, void, env, i32, i32, i32, i32, i32)
 
 DEF_HELPER_FLAGS_4(vbitseli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index ad7f787319..57cab4e056 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -701,6 +701,9 @@ TRANS(xvslti_hu, do_vslti_u, 32, MO_16)
 TRANS(xvslti_wu, do_vslti_u, 32, MO_32)
 TRANS(x

[PATCH v3 45/47] target/loongarch: Implement xvshuf xvperm{i} xvshuf4i xvextrins

2023-07-14 Thread Song Gao

This patch includes:
- XVSHUF.{B/H/W/D};
- XVPERM.W;
- XVSHUF4I.{B/H/W/D};
- XVPERMI.{W/D/Q};
- XVEXTRINS.{B/H/W/D}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c |  21 
 target/loongarch/helper.h|   3 +
 target/loongarch/insn_trans/trans_lasx.c.inc |  21 
 target/loongarch/insns.decode|  21 
 target/loongarch/vec.h   |   2 +
 target/loongarch/vec_helper.c| 112 +++
 6 files changed, 161 insertions(+), 19 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 9b6a07bbb0..a518c59772 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2575,3 +2575,24 @@ INSN_LASX(xvilvh_b,  vvv)
 INSN_LASX(xvilvh_h,  vvv)
 INSN_LASX(xvilvh_w,  vvv)
 INSN_LASX(xvilvh_d,  vvv)
+
+INSN_LASX(xvshuf_b,  vvvv)
+INSN_LASX(xvshuf_h,  vvv)
+INSN_LASX(xvshuf_w,  vvv)
+INSN_LASX(xvshuf_d,  vvv)
+
+INSN_LASX(xvperm_w,  vvv)
+
+INSN_LASX(xvshuf4i_b,vv_i)
+INSN_LASX(xvshuf4i_h,vv_i)
+INSN_LASX(xvshuf4i_w,vv_i)
+INSN_LASX(xvshuf4i_d,vv_i)
+
+INSN_LASX(xvpermi_w, vv_i)
+INSN_LASX(xvpermi_d, vv_i)
+INSN_LASX(xvpermi_q, vv_i)
+
+INSN_LASX(xvextrins_d,   vv_i)
+INSN_LASX(xvextrins_w,   vv_i)
+INSN_LASX(xvextrins_h,   vv_i)
+INSN_LASX(xvextrins_b,   vv_i)
diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h
index dc568d8982..8bef7b7a9a 100644
--- a/target/loongarch/helper.h
+++ b/target/loongarch/helper.h
@@ -708,7 +708,10 @@ DEF_HELPER_FLAGS_4(vshuf4i_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(vshuf4i_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(vshuf4i_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 
+DEF_HELPER_FLAGS_4(vperm_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vpermi_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vpermi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vpermi_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 
 DEF_HELPER_FLAGS_4(vextrins_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(vextrins_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 500e204fb9..cf53c12543 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -905,3 +905,24 @@ TRANS(xvilvh_b, gen_vvv, 32, gen_helper_vilvh_b)
 TRANS(xvilvh_h, gen_vvv, 32, gen_helper_vilvh_h)
 TRANS(xvilvh_w, gen_vvv, 32, gen_helper_vilvh_w)
 TRANS(xvilvh_d, gen_vvv, 32, gen_helper_vilvh_d)
+
+TRANS(xvshuf_b, gen_vvvv, 32, gen_helper_vshuf_b)
+TRANS(xvshuf_h, gen_vvv, 32, gen_helper_vshuf_h)
+TRANS(xvshuf_w, gen_vvv, 32, gen_helper_vshuf_w)
+TRANS(xvshuf_d, gen_vvv, 32, gen_helper_vshuf_d)
+
+TRANS(xvperm_w, gen_vvv, 32,  gen_helper_vperm_w)
+
+TRANS(xvshuf4i_b, gen_vv_i, 32, gen_helper_vshuf4i_b)
+TRANS(xvshuf4i_h, gen_vv_i, 32, gen_helper_vshuf4i_h)
+TRANS(xvshuf4i_w, gen_vv_i, 32, gen_helper_vshuf4i_w)
+TRANS(xvshuf4i_d, gen_vv_i, 32, gen_helper_vshuf4i_d)
+
+TRANS(xvpermi_w, gen_vv_i, 32, gen_helper_vpermi_w)
+TRANS(xvpermi_d, gen_vv_i, 32, gen_helper_vpermi_d)
+TRANS(xvpermi_q, gen_vv_i, 32, gen_helper_vpermi_q)
+
+TRANS(xvextrins_b, gen_vv_i, 32, gen_helper_vextrins_b)
+TRANS(xvextrins_h, gen_vv_i, 32, gen_helper_vextrins_h)
+TRANS(xvextrins_w, gen_vv_i, 32, gen_helper_vextrins_w)
+TRANS(xvextrins_d, gen_vv_i, 32, gen_helper_vextrins_d)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index a325b861c1..64b67ee9ac 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -2039,3 +2039,24 @@ xvilvh_b 0111 01010001 11000 . . .@vvv
 xvilvh_h 0111 01010001 11001 . . .@vvv
 xvilvh_w 0111 01010001 11010 . . .@vvv
 xvilvh_d 0111 01010001 11011 . . .@vvv
+
+xvshuf_b  11010110 . . . .@
+xvshuf_h 0111 01010111 10101 . . .@vvv
+xvshuf_w 0111 01010111 10110 . . .@vvv
+xvshuf_d 0111 01010111 10111 . . .@vvv
+
+xvperm_w 0111 01010111 11010 . . .@vvv
+
+xvshuf4i_b   0111 0001 00  . .@vv_ui8
+xvshuf4i_h   0111 0001 01  . .@vv_ui8
+xvshuf4i_w   0111 0001 10  . .@vv_ui8
+xvshuf4i_d   0111 0001 11  . .@vv_ui8
+
+xvpermi_w0111 0110 01  . .@vv_ui8
+xvpermi_d0111 0110 10  . .@vv_ui8
+xvpermi_q0111 0110 11  . .@vv_ui8
+
+xvextrins_d  0111 0000 00  . .@vv_ui8
+xvextrins_w  0111

[PATCH v3 13/47] target/loongarch: Implement xvabsd

2023-07-14 Thread Song Gao

This patch includes:
- XVABSD.{B/H/W/D}[U].
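
To show how one macro serves both the signed and unsigned forms, here is a small C
sketch built around the DO_VABSD definition that this patch moves into vec.h. The
wrapper functions are hypothetical; they return the result as an unsigned byte so
that the stored bit pattern is visible without relying on signed narrowing.

```c
#include <stdint.h>

/* Same definition as in the patch. */
#define DO_VABSD(a, b)  ((a > b) ? (a - b) : (b - a))

/* Unsigned byte lanes: operands are compared and subtracted as unsigned. */
static uint8_t absd_bu_sketch(uint8_t a, uint8_t b)
{
    return (uint8_t)DO_VABSD(a, b);
}

/* Signed byte lanes: the comparison is signed; the difference can reach
 * 255, which only fits when read back as an unsigned bit pattern. */
static uint8_t absd_b_sketch(int8_t a, int8_t b)
{
    return (uint8_t)DO_VABSD(a, b);
}
```

The element type chosen by the caller decides whether the comparison is signed or
unsigned, which is why the helper file instantiates the macro once per type.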

Signed-off-by: Song Gao 
Reviewed-by: Richard Henderson 
---
 target/loongarch/disas.c | 9 +
 target/loongarch/insn_trans/trans_lasx.c.inc | 9 +
 target/loongarch/insns.decode| 9 +
 target/loongarch/vec.h   | 2 ++
 target/loongarch/vec_helper.c| 2 --
 5 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 8296aafa98..d0b1de39b8 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1842,6 +1842,15 @@ INSN_LASX(xvavgr_hu, vvv)
 INSN_LASX(xvavgr_wu, vvv)
 INSN_LASX(xvavgr_du, vvv)
 
+INSN_LASX(xvabsd_b,  vvv)
+INSN_LASX(xvabsd_h,  vvv)
+INSN_LASX(xvabsd_w,  vvv)
+INSN_LASX(xvabsd_d,  vvv)
+INSN_LASX(xvabsd_bu, vvv)
+INSN_LASX(xvabsd_hu, vvv)
+INSN_LASX(xvabsd_wu, vvv)
+INSN_LASX(xvabsd_du, vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index ac4cade845..bd8ba6c7b6 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -157,6 +157,15 @@ TRANS(xvavgr_hu, gvec_vvv, 32, MO_16, do_vavgr_u)
 TRANS(xvavgr_wu, gvec_vvv, 32, MO_32, do_vavgr_u)
 TRANS(xvavgr_du, gvec_vvv, 32, MO_64, do_vavgr_u)
 
+TRANS(xvabsd_b, gvec_vvv, 32, MO_8, do_vabsd_s)
+TRANS(xvabsd_h, gvec_vvv, 32, MO_16, do_vabsd_s)
+TRANS(xvabsd_w, gvec_vvv, 32, MO_32, do_vabsd_s)
+TRANS(xvabsd_d, gvec_vvv, 32, MO_64, do_vabsd_s)
+TRANS(xvabsd_bu, gvec_vvv, 32, MO_8, do_vabsd_u)
+TRANS(xvabsd_hu, gvec_vvv, 32, MO_16, do_vabsd_u)
+TRANS(xvabsd_wu, gvec_vvv, 32, MO_32, do_vabsd_u)
+TRANS(xvabsd_du, gvec_vvv, 32, MO_64, do_vabsd_u)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index a2cb39750d..c086ee9b22 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1423,6 +1423,15 @@ xvavgr_hu        0111 01000110 10101 ..... ..... .....    @vvv
 xvavgr_wu        0111 01000110 10110 ..... ..... .....    @vvv
 xvavgr_du        0111 01000110 10111 ..... ..... .....    @vvv
 
+xvabsd_b         0111 01000110 00000 ..... ..... .....    @vvv
+xvabsd_h         0111 01000110 00001 ..... ..... .....    @vvv
+xvabsd_w         0111 01000110 00010 ..... ..... .....    @vvv
+xvabsd_d         0111 01000110 00011 ..... ..... .....    @vvv
+xvabsd_bu        0111 01000110 00100 ..... ..... .....    @vvv
+xvabsd_hu        0111 01000110 00101 ..... ..... .....    @vvv
+xvabsd_wu        0111 01000110 00110 ..... ..... .....    @vvv
+xvabsd_du        0111 01000110 00111 ..... ..... .....    @vvv
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index 6ac6b22f20..6767073635 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -53,4 +53,6 @@
 #define DO_VAVG(a, b)  ((a >> 1) + (b >> 1) + (a & b & 1))
 #define DO_VAVGR(a, b) ((a >> 1) + (b >> 1) + ((a | b) & 1))
 
+#define DO_VABSD(a, b)  ((a > b) ? (a - b) : (b - a))
+
 #endif /* LOONGARCH_VEC_H */
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 2fa8b68e72..22d08f36ac 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -375,8 +375,6 @@ DO_3OP(vavgr_hu, 16, UH, DO_VAVGR)
 DO_3OP(vavgr_wu, 32, UW, DO_VAVGR)
 DO_3OP(vavgr_du, 64, UD, DO_VAVGR)
 
-#define DO_VABSD(a, b)  ((a > b) ? (a -b) : (b-a))
-
 DO_3OP(vabsd_b, 8, B, DO_VABSD)
 DO_3OP(vabsd_h, 16, H, DO_VABSD)
 DO_3OP(vabsd_w, 32, W, DO_VABSD)
-- 
2.39.1
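
For readers following the series, the per-element operation behind XVABSD can be sketched in plain C. This is only an illustration under stated assumptions: the DO_VABSD macro is copied from the vec.h hunk above, while the flat lane arrays and the helper names absd_s8/absd_u8 are hypothetical and are not QEMU code (the real helpers are generated by DO_3OP over VReg).

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the vec.h hunk in this patch. */
#define DO_VABSD(a, b)  ((a > b) ? (a - b) : (b - a))

/* A 256-bit LASX register holds 32 byte-sized elements. */
#define LASX_BYTES 32

/* Hypothetical model of xvabsd.b: lanes are compared as signed bytes. */
void absd_s8(int8_t *d, const int8_t *j, const int8_t *k)
{
    assert(d && j && k);
    for (int i = 0; i < LASX_BYTES; i++) {
        d[i] = (int8_t)DO_VABSD(j[i], k[i]);
    }
}

/* Hypothetical model of xvabsd.bu: same macro, unsigned lane type. */
void absd_u8(uint8_t *d, const uint8_t *j, const uint8_t *k)
{
    assert(d && j && k);
    for (int i = 0; i < LASX_BYTES; i++) {
        d[i] = (uint8_t)DO_VABSD(j[i], k[i]);
    }
}
```

The same bit pattern gives different results depending on the lane type: 0xfa against 5 is |-6 - 5| = 11 as a signed byte but 250 - 5 = 245 as an unsigned byte, which is presumably why the patch routes the two groups through do_vabsd_s and do_vabsd_u.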

[PATCH v3 42/47] target/loongarch: Implement xvinsgr2vr xvpickve2gr

2023-07-14 Thread Song Gao

This patch includes:
- XVINSGR2VR.{W/D};
- XVPICKVE2GR.{W/D}[U].

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c | 18 
 target/loongarch/insn_trans/trans_lasx.c.inc | 30 
 target/loongarch/insns.decode|  7 +
 3 files changed, 55 insertions(+)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 3a06b5cb80..0995d9b794 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1738,6 +1738,17 @@ static void output_vr_x(DisasContext *ctx, arg_vr *a, const char *mnemonic)
 output(ctx, mnemonic, "x%d, r%d", a->vd, a->rj);
 }
 
+static void output_vr_i_x(DisasContext *ctx, arg_vr_i *a, const char *mnemonic)
+{
+output(ctx, mnemonic, "x%d, r%d, 0x%x", a->vd, a->rj, a->imm);
+}
+
+static void output_rv_i_x(DisasContext *ctx, arg_rv_i *a, const char *mnemonic)
+{
+output(ctx, mnemonic, "r%d, x%d, 0x%x", a->rd, a->vj, a->imm);
+}
+
+
 INSN_LASX(xvadd_b,   vvv)
 INSN_LASX(xvadd_h,   vvv)
 INSN_LASX(xvadd_w,   vvv)
@@ -2498,6 +2509,13 @@ INSN_LASX(xvsetallnez_h, cv)
 INSN_LASX(xvsetallnez_w, cv)
 INSN_LASX(xvsetallnez_d, cv)
 
+INSN_LASX(xvinsgr2vr_w,  vr_i)
+INSN_LASX(xvinsgr2vr_d,  vr_i)
+INSN_LASX(xvpickve2gr_w, rv_i)
+INSN_LASX(xvpickve2gr_d, rv_i)
+INSN_LASX(xvpickve2gr_wu,rv_i)
+INSN_LASX(xvpickve2gr_du,rv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 700cbdc622..a79f34d280 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -746,6 +746,36 @@ TRANS(xvsetallnez_h, gen_cv, 32, gen_helper_vsetallnez_h)
 TRANS(xvsetallnez_w, gen_cv, 32, gen_helper_vsetallnez_w)
 TRANS(xvsetallnez_d, gen_cv, 32, gen_helper_vsetallnez_d)
 
+static bool trans_xvinsgr2vr_w(DisasContext *ctx, arg_vr_i *a)
+{
+return trans_vinsgr2vr_w(ctx, a);
+}
+
+static bool trans_xvinsgr2vr_d(DisasContext *ctx, arg_vr_i *a)
+{
+return trans_vinsgr2vr_d(ctx, a);
+}
+
+static bool trans_xvpickve2gr_w(DisasContext *ctx, arg_rv_i *a)
+{
+return trans_vpickve2gr_w(ctx, a);
+}
+
+static bool trans_xvpickve2gr_d(DisasContext *ctx, arg_rv_i *a)
+{
+return trans_vpickve2gr_d(ctx, a);
+}
+
+static bool trans_xvpickve2gr_wu(DisasContext *ctx, arg_rv_i *a)
+{
+return trans_vpickve2gr_wu(ctx, a);
+}
+
+static bool trans_xvpickve2gr_du(DisasContext *ctx, arg_rv_i *a)
+{
+return trans_vpickve2gr_du(ctx, a);
+}
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index ad6751fdfb..bb3bb447ae 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1976,6 +1976,13 @@ xvsetallnez_h0111 01101001 11001 01101 . 00 ...   @cv
 xvsetallnez_w0111 01101001 11001 01110 . 00 ...   @cv
 xvsetallnez_d0111 01101001 11001 0 . 00 ...   @cv
 
+xvinsgr2vr_w 0111 01101110 10111 10 ... . .   @vr_ui3
+xvinsgr2vr_d 0111 01101110 10111 110 .. . .   @vr_ui2
+xvpickve2gr_w0111 01101110 1 10 ... . .   @rv_ui3
+xvpickve2gr_d0111 01101110 1 110 .. . .   @rv_ui2
+xvpickve2gr_wu   0111 0110 00111 10 ... . .   @rv_ui3
+xvpickve2gr_du   0111 0110 00111 110 .. . .   @rv_ui2
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
-- 
2.39.1
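
As a reading aid, the data movement of these instructions can be modelled in a few lines of C. This is a sketch under assumptions, not the QEMU implementation: the XReg type and the function names are hypothetical, the 3-bit index range is inferred from the @vr_ui3/@rv_ui3 formats in the decode hunk, and the sign-extension for the plain form versus zero-extension for the [U] form is assumed from the instruction naming.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of a 256-bit register as eight 32-bit words. */
typedef struct {
    uint32_t w[8];
} XReg;

/* Model of xvinsgr2vr.w: store the low 32 bits of a GPR into word idx. */
void ins_w(XReg *xd, uint64_t rj, unsigned idx)
{
    assert(xd && idx < 8);      /* ui3 immediate: 0..7 */
    xd->w[idx] = (uint32_t)rj;
}

/* Model of xvpickve2gr.w: read word idx, sign-extended to 64 bits. */
int64_t pick_w(const XReg *xj, unsigned idx)
{
    assert(xj && idx < 8);
    return (int64_t)(int32_t)xj->w[idx];
}

/* Model of xvpickve2gr.wu: read word idx, zero-extended to 64 bits. */
uint64_t pick_wu(const XReg *xj, unsigned idx)
{
    assert(xj && idx < 8);
    return (uint64_t)xj->w[idx];
}
```

The .d forms would follow the same shape with four 64-bit elements and a 2-bit index, matching @vr_ui2/@rv_ui2 above.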

[PATCH v3 27/47] target/loongarch: Implement xvsllwil xvextl

2023-07-14 Thread Song Gao

This patch includes:
- XVSLLWIL.{H.B/W.H/D.W};
- XVSLLWIL.{HU.BU/WU.HU/DU.WU};
- XVEXTL.Q.D, XVEXTL.QU.DU.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c |  9 
 target/loongarch/insn_trans/trans_lasx.c.inc |  9 
 target/loongarch/insns.decode|  9 
 target/loongarch/vec_helper.c| 44 
 4 files changed, 54 insertions(+), 17 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index e081a11aba..93c205fa32 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2077,6 +2077,15 @@ INSN_LASX(xvrotri_h, vv_i)
 INSN_LASX(xvrotri_w, vv_i)
 INSN_LASX(xvrotri_d, vv_i)
 
+INSN_LASX(xvsllwil_h_b,  vv_i)
+INSN_LASX(xvsllwil_w_h,  vv_i)
+INSN_LASX(xvsllwil_d_w,  vv_i)
+INSN_LASX(xvextl_q_d,vv)
+INSN_LASX(xvsllwil_hu_bu,vv_i)
+INSN_LASX(xvsllwil_wu_hu,vv_i)
+INSN_LASX(xvsllwil_du_wu,vv_i)
+INSN_LASX(xvextl_qu_du,  vv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 5e88f0c530..b51e80dece 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -396,6 +396,15 @@ TRANS(xvrotri_h, gvec_vv_i, 32, MO_16, tcg_gen_gvec_rotri)
 TRANS(xvrotri_w, gvec_vv_i, 32, MO_32, tcg_gen_gvec_rotri)
 TRANS(xvrotri_d, gvec_vv_i, 32, MO_64, tcg_gen_gvec_rotri)
 
+TRANS(xvsllwil_h_b, gen_vv_i, 32, gen_helper_vsllwil_h_b)
+TRANS(xvsllwil_w_h, gen_vv_i, 32, gen_helper_vsllwil_w_h)
+TRANS(xvsllwil_d_w, gen_vv_i, 32, gen_helper_vsllwil_d_w)
+TRANS(xvextl_q_d, gen_vv, 32, gen_helper_vextl_q_d)
+TRANS(xvsllwil_hu_bu, gen_vv_i, 32, gen_helper_vsllwil_hu_bu)
+TRANS(xvsllwil_wu_hu, gen_vv_i, 32, gen_helper_vsllwil_wu_hu)
+TRANS(xvsllwil_du_wu, gen_vv_i, 32, gen_helper_vsllwil_du_wu)
+TRANS(xvextl_qu_du, gen_vv, 32, gen_helper_vextl_qu_du)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index fb7bd9fb34..8a7933eccc 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1652,6 +1652,15 @@ xvrotri_h0111 01101010 0 1  . .   @vv_ui4
 xvrotri_w0111 01101010 1 . . .@vv_ui5
 xvrotri_d0111 01101010 0001 .. . .@vv_ui6
 
+xvsllwil_h_b 0111 0111 1 01 ... . .   @vv_ui3
+xvsllwil_w_h 0111 0111 1 1  . .   @vv_ui4
+xvsllwil_d_w 0111 0111 10001 . . .@vv_ui5
+xvextl_q_d   0111 0111 10010 0 . .@vv
+xvsllwil_hu_bu   0111 0111 11000 01 ... . .   @vv_ui3
+xvsllwil_wu_hu   0111 0111 11000 1  . .   @vv_ui4
+xvsllwil_du_wu   0111 0111 11001 . . .@vv_ui5
+xvextl_qu_du 0111 0111 11010 0 . .@vv
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 96c9a243e1..dcf75d421c 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -925,37 +925,47 @@ void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm, uint32_t desc)
 }
 }
 
-#define VSLLWIL(NAME, BIT, E1, E2) \
-void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
-{  \
-int i; \
-VReg temp; \
-VReg *Vd = (VReg *)vd; \
-VReg *Vj = (VReg *)vj; \
-typedef __typeof(temp.E1(0)) TD;   \
-   \
-temp.D(0) = 0; \
-temp.D(1) = 0; \
-for (i = 0; i < LSX_LEN/BIT; i++) {\
-temp.E1(i) = (TD)Vj->E2(i) << (imm % BIT); \
-}  \
-*Vd = temp;\
+#define VSLLWIL(NAME, BIT, E1, E2) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
+{ \
+    int i, j, ofs; \
+VReg temp = {};
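
The listing above is cut off by the archive, so the new 256-bit loop is not visible. The removed 128-bit macro is complete, though, and its behaviour for VSLLWIL.H.B can be sketched as standalone C. Assumptions: the function name and the flat arrays are hypothetical stand-ins for VReg, and the shift is done on an unsigned value to avoid undefined behaviour when a negative element is shifted left (the original code shifts the signed value directly).

```c
#include <assert.h>
#include <stdint.h>

#define LSX_BYTES 16    /* one 128-bit LSX register */

/*
 * Hypothetical model of the removed VSLLWIL(vsllwil_h_b, 16, H, B) body:
 * widen each of the low eight signed bytes to 16 bits, then shift left
 * by (imm % 16), as in "temp.E1(i) = (TD)Vj->E2(i) << (imm % BIT)".
 */
void sllwil_h_b(int16_t *d, const int8_t *j, unsigned imm)
{
    assert(d && j);
    for (int i = 0; i < LSX_BYTES / 2; i++) {
        uint16_t wide = (uint16_t)(int16_t)j[i];   /* sign-extend first */
        d[i] = (int16_t)(uint16_t)(wide << (imm % 16));
    }
}
```

Only source elements 0..7 contribute in this 128-bit form; the upper eight bytes are ignored.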

[PATCH v3 17/47] target/loongarch: Implement xvmadd/xvmsub/xvmaddw{ev/od}

2023-07-14 Thread Song Gao

This patch includes:
- XVMADD.{B/H/W/D};
- XVMSUB.{B/H/W/D};
- XVMADDW{EV/OD}.{H.B/W.H/D.W/Q.D}[U];
- XVMADDW{EV/OD}.{H.BU.B/W.HU.H/D.WU.W/Q.DU.D}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c |  34 ++
 target/loongarch/insn_trans/trans_lasx.c.inc |  38 +++
 target/loongarch/insn_trans/trans_lsx.c.inc  |  68 +--
 target/loongarch/insns.decode|  34 ++
 target/loongarch/vec.h   |   3 +
 target/loongarch/vec_helper.c| 113 ++-
 6 files changed, 203 insertions(+), 87 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index e5f9a6bcdf..b115fe8315 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1928,6 +1928,40 @@ INSN_LASX(xvmulwod_w_hu_h,   vvv)
 INSN_LASX(xvmulwod_d_wu_w,   vvv)
 INSN_LASX(xvmulwod_q_du_d,   vvv)
 
+INSN_LASX(xvmadd_b,  vvv)
+INSN_LASX(xvmadd_h,  vvv)
+INSN_LASX(xvmadd_w,  vvv)
+INSN_LASX(xvmadd_d,  vvv)
+INSN_LASX(xvmsub_b,  vvv)
+INSN_LASX(xvmsub_h,  vvv)
+INSN_LASX(xvmsub_w,  vvv)
+INSN_LASX(xvmsub_d,  vvv)
+
+INSN_LASX(xvmaddwev_h_b, vvv)
+INSN_LASX(xvmaddwev_w_h, vvv)
+INSN_LASX(xvmaddwev_d_w, vvv)
+INSN_LASX(xvmaddwev_q_d, vvv)
+INSN_LASX(xvmaddwod_h_b, vvv)
+INSN_LASX(xvmaddwod_w_h, vvv)
+INSN_LASX(xvmaddwod_d_w, vvv)
+INSN_LASX(xvmaddwod_q_d, vvv)
+INSN_LASX(xvmaddwev_h_bu,vvv)
+INSN_LASX(xvmaddwev_w_hu,vvv)
+INSN_LASX(xvmaddwev_d_wu,vvv)
+INSN_LASX(xvmaddwev_q_du,vvv)
+INSN_LASX(xvmaddwod_h_bu,vvv)
+INSN_LASX(xvmaddwod_w_hu,vvv)
+INSN_LASX(xvmaddwod_d_wu,vvv)
+INSN_LASX(xvmaddwod_q_du,vvv)
+INSN_LASX(xvmaddwev_h_bu_b,  vvv)
+INSN_LASX(xvmaddwev_w_hu_h,  vvv)
+INSN_LASX(xvmaddwev_d_wu_w,  vvv)
+INSN_LASX(xvmaddwev_q_du_d,  vvv)
+INSN_LASX(xvmaddwod_h_bu_b,  vvv)
+INSN_LASX(xvmaddwod_w_hu_h,  vvv)
+INSN_LASX(xvmaddwod_d_wu_w,  vvv)
+INSN_LASX(xvmaddwod_q_du_d,  vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 5fffe4e60c..1f9574a83b 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -249,6 +249,44 @@ TRANS(xvmulwod_h_bu_b, gvec_vvv, 32, MO_8, do_vmulwod_u_s)
 TRANS(xvmulwod_w_hu_h, gvec_vvv, 32, MO_16, do_vmulwod_u_s)
 TRANS(xvmulwod_d_wu_w, gvec_vvv, 32, MO_32, do_vmulwod_u_s)
 
+TRANS(xvmadd_b, gvec_vvv, 32, MO_8, do_vmadd)
+TRANS(xvmadd_h, gvec_vvv, 32, MO_16, do_vmadd)
+TRANS(xvmadd_w, gvec_vvv, 32, MO_32, do_vmadd)
+TRANS(xvmadd_d, gvec_vvv, 32, MO_64, do_vmadd)
+TRANS(xvmsub_b, gvec_vvv, 32, MO_8, do_vmsub)
+TRANS(xvmsub_h, gvec_vvv, 32, MO_16, do_vmsub)
+TRANS(xvmsub_w, gvec_vvv, 32, MO_32, do_vmsub)
+TRANS(xvmsub_d, gvec_vvv, 32, MO_64, do_vmsub)
+
+TRANS(xvmaddwev_h_b, gvec_vvv, 32, MO_8, do_vmaddwev_s)
+TRANS(xvmaddwev_w_h, gvec_vvv, 32, MO_16, do_vmaddwev_s)
+TRANS(xvmaddwev_d_w, gvec_vvv, 32, MO_32, do_vmaddwev_s)
+
+TRANS(xvmaddwev_q_d, gen_vmadd_q, 32, 0, 0, tcg_gen_muls2_i64)
+TRANS(xvmaddwod_q_d, gen_vmadd_q, 32, 1, 1, tcg_gen_muls2_i64)
+TRANS(xvmaddwev_q_du, gen_vmadd_q, 32, 0, 0, tcg_gen_mulu2_i64)
+TRANS(xvmaddwod_q_du, gen_vmadd_q, 32, 1, 1, tcg_gen_mulu2_i64)
+TRANS(xvmaddwev_q_du_d, gen_vmadd_q, 32, 0, 0, tcg_gen_mulus2_i64)
+TRANS(xvmaddwod_q_du_d, gen_vmadd_q, 32, 1, 1, tcg_gen_mulus2_i64)
+
+TRANS(xvmaddwod_h_b, gvec_vvv, 32, MO_8, do_vmaddwod_s)
+TRANS(xvmaddwod_w_h, gvec_vvv, 32, MO_16, do_vmaddwod_s)
+TRANS(xvmaddwod_d_w, gvec_vvv, 32, MO_32, do_vmaddwod_s)
+
+TRANS(xvmaddwev_h_bu, gvec_vvv, 32, MO_8, do_vmaddwev_u)
+TRANS(xvmaddwev_w_hu, gvec_vvv, 32, MO_16, do_vmaddwev_u)
+TRANS(xvmaddwev_d_wu, gvec_vvv, 32, MO_32, do_vmaddwev_u)
+TRANS(xvmaddwod_h_bu, gvec_vvv, 32, MO_8, do_vmaddwod_u)
+TRANS(xvmaddwod_w_hu, gvec_vvv, 32, MO_16, do_vmaddwod_u)
+TRANS(xvmaddwod_d_wu, gvec_vvv, 32, MO_32, do_vmaddwod_u)
+
+TRANS(xvmaddwev_h_bu_b, gvec_vvv, 32, MO_8, do_vmaddwev_u_s)
+TRANS(xvmaddwev_w_hu_h, gvec_vvv, 32, MO_16, do_vmaddwev_u_s)
+TRANS(xvmaddwev_d_wu_w, gvec_vvv, 32, MO_32, do_vmaddwev_u_s)
+TRANS(xvmaddwod_h_bu_b, gvec_vvv, 32, MO_8, do_vmaddwod_u_s)
+TRANS(xvmaddwod_w_hu_h, gvec_vvv, 32, MO_16, do_vmaddwod_u_s)
+TRANS(xvmaddwod_d_wu_w, gvec_vvv, 32, MO_32, do_vmaddwod_u_s)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index 82051b6a23..08818c08ca 100644
--- a/target/loongarch/insn_trans/trans_lsx.c.inc
+++ b/target/loongarch/insn_trans/trans_lsx.c.inc
@@ -2367,38 +2367,42 @@ TRANS(vmaddwev_h_b, gvec_vvv, 16, MO_8, do_vmaddwev_s)
 TRANS(vmaddwev_w_h, gvec_vvv, 16, MO_16, do_vmaddwev_s)
 TRANS(vmaddwev_d_w, gvec_vvv, 16, MO_32, do_vmaddwev_s)
 
-#define

[PATCH v3 30/47] target/loongarch: Implement xvsrlrn xvsrarn

2023-07-14 Thread Song Gao

This patch includes:
- XVSRLRN.{B.H/H.W/W.D};
- XVSRARN.{B.H/H.W/W.D};
- XVSRLRNI.{B.H/H.W/W.D/D.Q};
- XVSRARNI.{B.H/H.W/W.D/D.Q}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c |  16 ++
 target/loongarch/insn_trans/trans_lasx.c.inc |  16 ++
 target/loongarch/insns.decode|  16 ++
 target/loongarch/vec_helper.c| 198 +++
 4 files changed, 161 insertions(+), 85 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 14b526abd6..04b6ea713d 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2120,6 +2120,22 @@ INSN_LASX(xvsrani_h_w,   vv_i)
 INSN_LASX(xvsrani_w_d,   vv_i)
 INSN_LASX(xvsrani_d_q,   vv_i)
 
+INSN_LASX(xvsrlrn_b_h,   vvv)
+INSN_LASX(xvsrlrn_h_w,   vvv)
+INSN_LASX(xvsrlrn_w_d,   vvv)
+INSN_LASX(xvsrarn_b_h,   vvv)
+INSN_LASX(xvsrarn_h_w,   vvv)
+INSN_LASX(xvsrarn_w_d,   vvv)
+
+INSN_LASX(xvsrlrni_b_h,  vv_i)
+INSN_LASX(xvsrlrni_h_w,  vv_i)
+INSN_LASX(xvsrlrni_w_d,  vv_i)
+INSN_LASX(xvsrlrni_d_q,  vv_i)
+INSN_LASX(xvsrarni_b_h,  vv_i)
+INSN_LASX(xvsrarni_h_w,  vv_i)
+INSN_LASX(xvsrarni_w_d,  vv_i)
+INSN_LASX(xvsrarni_d_q,  vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 43ff9b188a..76cc3d749a 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -439,6 +439,22 @@ TRANS(xvsrani_h_w, gen_vv_i, 32, gen_helper_vsrani_h_w)
 TRANS(xvsrani_w_d, gen_vv_i, 32, gen_helper_vsrani_w_d)
 TRANS(xvsrani_d_q, gen_vv_i, 32, gen_helper_vsrani_d_q)
 
+TRANS(xvsrlrn_b_h, gen_vvv, 32, gen_helper_vsrlrn_b_h)
+TRANS(xvsrlrn_h_w, gen_vvv, 32, gen_helper_vsrlrn_h_w)
+TRANS(xvsrlrn_w_d, gen_vvv, 32, gen_helper_vsrlrn_w_d)
+TRANS(xvsrarn_b_h, gen_vvv, 32, gen_helper_vsrarn_b_h)
+TRANS(xvsrarn_h_w, gen_vvv, 32, gen_helper_vsrarn_h_w)
+TRANS(xvsrarn_w_d, gen_vvv, 32, gen_helper_vsrarn_w_d)
+
+TRANS(xvsrlrni_b_h, gen_vv_i, 32, gen_helper_vsrlrni_b_h)
+TRANS(xvsrlrni_h_w, gen_vv_i, 32, gen_helper_vsrlrni_h_w)
+TRANS(xvsrlrni_w_d, gen_vv_i, 32, gen_helper_vsrlrni_w_d)
+TRANS(xvsrlrni_d_q, gen_vv_i, 32, gen_helper_vsrlrni_d_q)
+TRANS(xvsrarni_b_h, gen_vv_i, 32, gen_helper_vsrarni_b_h)
+TRANS(xvsrarni_h_w, gen_vv_i, 32, gen_helper_vsrarni_h_w)
+TRANS(xvsrarni_w_d, gen_vv_i, 32, gen_helper_vsrarni_w_d)
+TRANS(xvsrarni_d_q, gen_vv_i, 32, gen_helper_vsrarni_d_q)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 204dcfa075..d7c50b14ca 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1694,6 +1694,22 @@ xvsrani_h_w  0111 01110101 10001 . . .    @vv_ui5
 xvsrani_w_d  0111 01110101 1001 .. . .@vv_ui6
 xvsrani_d_q  0111 01110101 101 ... . .@vv_ui7
 
+xvsrlrn_b_h  0111 0100 10001 . . .@vvv
+xvsrlrn_h_w  0111 0100 10010 . . .@vvv
+xvsrlrn_w_d  0111 0100 10011 . . .@vvv
+xvsrarn_b_h  0111 0100 10101 . . .@vvv
+xvsrarn_h_w  0111 0100 10110 . . .@vvv
+xvsrarn_w_d  0111 0100 10111 . . .@vvv
+
+xvsrlrni_b_h 0111 01110100 01000 1  . .   @vv_ui4
+xvsrlrni_h_w 0111 01110100 01001 . . .@vv_ui5
+xvsrlrni_w_d 0111 01110100 0101 .. . .@vv_ui6
+xvsrlrni_d_q 0111 01110100 011 ... . .@vv_ui7
+xvsrarni_b_h 0111 01110101 11000 1  . .   @vv_ui4
+xvsrarni_h_w 0111 01110101 11001 . . .@vv_ui5
+xvsrarni_w_d 0111 01110101 1101 .. . .@vv_ui6
+xvsrarni_d_q 0111 01110101 111 ... . .@vv_ui7
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index dacedc4363..79715c28e0 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -1201,76 +1201,95 @@ VSRANI(vsrani_b_h, 16, B, H)
 VSRANI(vsrani_h_w, 32, H, W)
 VSRANI(vsrani_w_d, 64, W, D)
 
-#define VSRLRN(NAME, BIT, T, E1, E2)\
-void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc)  \
-{   \
-int i;  \
-VReg *Vd = (VReg *)vd;  \
-VReg *Vj = (VReg *)vj;  \
-VReg *Vk = (VReg *)v

[PATCH v3 44/47] target/loongarch: Implement xvpack xvpick xvilv{l/h}

2023-07-14 Thread Song Gao

This patch includes:
- XVPACK{EV/OD}.{B/H/W/D};
- XVPICK{EV/OD}.{B/H/W/D};
- XVILV{L/H}.{B/H/W/D}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c |  27 
 target/loongarch/insn_trans/trans_lasx.c.inc |  27 
 target/loongarch/insns.decode|  27 
 target/loongarch/vec_helper.c| 138 +++
 4 files changed, 159 insertions(+), 60 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index ac7dd3021d..9b6a07bbb0 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2548,3 +2548,30 @@ INSN_LASX(xvpickve_d,vv_i)
 
 INSN_LASX(xvbsll_v,  vv_i)
 INSN_LASX(xvbsrl_v,  vv_i)
+
+INSN_LASX(xvpackev_b,vvv)
+INSN_LASX(xvpackev_h,vvv)
+INSN_LASX(xvpackev_w,vvv)
+INSN_LASX(xvpackev_d,vvv)
+INSN_LASX(xvpackod_b,vvv)
+INSN_LASX(xvpackod_h,vvv)
+INSN_LASX(xvpackod_w,vvv)
+INSN_LASX(xvpackod_d,vvv)
+
+INSN_LASX(xvpickev_b,vvv)
+INSN_LASX(xvpickev_h,vvv)
+INSN_LASX(xvpickev_w,vvv)
+INSN_LASX(xvpickev_d,vvv)
+INSN_LASX(xvpickod_b,vvv)
+INSN_LASX(xvpickod_h,vvv)
+INSN_LASX(xvpickod_w,vvv)
+INSN_LASX(xvpickod_d,vvv)
+
+INSN_LASX(xvilvl_b,  vvv)
+INSN_LASX(xvilvl_h,  vvv)
+INSN_LASX(xvilvl_w,  vvv)
+INSN_LASX(xvilvl_d,  vvv)
+INSN_LASX(xvilvh_b,  vvv)
+INSN_LASX(xvilvh_h,  vvv)
+INSN_LASX(xvilvh_w,  vvv)
+INSN_LASX(xvilvh_d,  vvv)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 250665e3fe..500e204fb9 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -878,3 +878,30 @@ TRANS(xvpickve_d, gen_vv_i, 32, gen_helper_xvpickve_d)
 
 TRANS(xvbsll_v, do_vbsll_v, 32)
 TRANS(xvbsrl_v, do_vbsrl_v, 32)
+
+TRANS(xvpackev_b, gen_vvv, 32, gen_helper_vpackev_b)
+TRANS(xvpackev_h, gen_vvv, 32, gen_helper_vpackev_h)
+TRANS(xvpackev_w, gen_vvv, 32, gen_helper_vpackev_w)
+TRANS(xvpackev_d, gen_vvv, 32, gen_helper_vpackev_d)
+TRANS(xvpackod_b, gen_vvv, 32, gen_helper_vpackod_b)
+TRANS(xvpackod_h, gen_vvv, 32, gen_helper_vpackod_h)
+TRANS(xvpackod_w, gen_vvv, 32, gen_helper_vpackod_w)
+TRANS(xvpackod_d, gen_vvv, 32, gen_helper_vpackod_d)
+
+TRANS(xvpickev_b, gen_vvv, 32, gen_helper_vpickev_b)
+TRANS(xvpickev_h, gen_vvv, 32, gen_helper_vpickev_h)
+TRANS(xvpickev_w, gen_vvv, 32, gen_helper_vpickev_w)
+TRANS(xvpickev_d, gen_vvv, 32, gen_helper_vpickev_d)
+TRANS(xvpickod_b, gen_vvv, 32, gen_helper_vpickod_b)
+TRANS(xvpickod_h, gen_vvv, 32, gen_helper_vpickod_h)
+TRANS(xvpickod_w, gen_vvv, 32, gen_helper_vpickod_w)
+TRANS(xvpickod_d, gen_vvv, 32, gen_helper_vpickod_d)
+
+TRANS(xvilvl_b, gen_vvv, 32, gen_helper_vilvl_b)
+TRANS(xvilvl_h, gen_vvv, 32, gen_helper_vilvl_h)
+TRANS(xvilvl_w, gen_vvv, 32, gen_helper_vilvl_w)
+TRANS(xvilvl_d, gen_vvv, 32, gen_helper_vilvl_d)
+TRANS(xvilvh_b, gen_vvv, 32, gen_helper_vilvh_b)
+TRANS(xvilvh_h, gen_vvv, 32, gen_helper_vilvh_h)
+TRANS(xvilvh_w, gen_vvv, 32, gen_helper_vilvh_w)
+TRANS(xvilvh_d, gen_vvv, 32, gen_helper_vilvh_d)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 74383ba3bc..a325b861c1 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -2012,3 +2012,30 @@ xvpickve_d   0111 0111 00111 110 .. . .   @vv_ui2
 
 xvbsll_v 0111 01101000 11100 . . .@vv_ui5
 xvbsrl_v 0111 01101000 11101 . . .@vv_ui5
+
+xvpackev_b   0111 01010001 01100 . . .@vvv
+xvpackev_h   0111 01010001 01101 . . .@vvv
+xvpackev_w   0111 01010001 01110 . . .@vvv
+xvpackev_d   0111 01010001 0 . . .@vvv
+xvpackod_b   0111 01010001 1 . . .@vvv
+xvpackod_h   0111 01010001 10001 . . .@vvv
+xvpackod_w   0111 01010001 10010 . . .@vvv
+xvpackod_d   0111 01010001 10011 . . .@vvv
+
+xvpickev_b   0111 01010001 11100 . . .@vvv
+xvpickev_h   0111 01010001 11101 . . .@vvv
+xvpickev_w   0111 01010001 0 . . .@vvv
+xvpickev_d   0111 01010001 1 . . .@vvv
+xvpickod_b   0111 01010010 0 . . .@vvv
+xvpickod_h   0111 01010010 1 . . .@vvv
+xvpickod_w   0111 01010010 00010 . . .@vvv
+xvpickod_d   0111 01010010 00011 . . .@vvv
+
+xvilvl_b 0111 01010001 10100 . . .@vvv
+xvilvl_h 0111 01010001 10101 . . .@vvv
+xvilvl_w 0111 01010001 10110 . . .@vvv
+xvilvl_d 0111 01010001 10111 . . .@vvv
+xvilvh_b 0111 01010001 11000 . . .@vvv
+xvilvh_h

[PATCH v3 28/47] target/loongarch: Implement xvsrlr xvsrar

2023-07-14 Thread Song Gao

This patch includes:
- XVSRLR[I].{B/H/W/D};
- XVSRAR[I].{B/H/W/D}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c | 18 ++
 target/loongarch/insn_trans/trans_lasx.c.inc | 18 ++
 target/loongarch/insns.decode| 17 +
 target/loongarch/vec_helper.c| 12 
 4 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 93c205fa32..9109203a05 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2086,6 +2086,24 @@ INSN_LASX(xvsllwil_wu_hu,vv_i)
 INSN_LASX(xvsllwil_du_wu,vv_i)
 INSN_LASX(xvextl_qu_du,  vv)
 
+INSN_LASX(xvsrlr_b,  vvv)
+INSN_LASX(xvsrlr_h,  vvv)
+INSN_LASX(xvsrlr_w,  vvv)
+INSN_LASX(xvsrlr_d,  vvv)
+INSN_LASX(xvsrlri_b, vv_i)
+INSN_LASX(xvsrlri_h, vv_i)
+INSN_LASX(xvsrlri_w, vv_i)
+INSN_LASX(xvsrlri_d, vv_i)
+
+INSN_LASX(xvsrar_b,  vvv)
+INSN_LASX(xvsrar_h,  vvv)
+INSN_LASX(xvsrar_w,  vvv)
+INSN_LASX(xvsrar_d,  vvv)
+INSN_LASX(xvsrari_b, vv_i)
+INSN_LASX(xvsrari_h, vv_i)
+INSN_LASX(xvsrari_w, vv_i)
+INSN_LASX(xvsrari_d, vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index b51e80dece..aebe384220 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -405,6 +405,24 @@ TRANS(xvsllwil_wu_hu, gen_vv_i, 32, gen_helper_vsllwil_wu_hu)
 TRANS(xvsllwil_du_wu, gen_vv_i, 32, gen_helper_vsllwil_du_wu)
 TRANS(xvextl_qu_du, gen_vv, 32, gen_helper_vextl_qu_du)
 
+TRANS(xvsrlr_b, gen_vvv, 32, gen_helper_vsrlr_b)
+TRANS(xvsrlr_h, gen_vvv, 32, gen_helper_vsrlr_h)
+TRANS(xvsrlr_w, gen_vvv, 32, gen_helper_vsrlr_w)
+TRANS(xvsrlr_d, gen_vvv, 32, gen_helper_vsrlr_d)
+TRANS(xvsrlri_b, gen_vv_i, 32, gen_helper_vsrlri_b)
+TRANS(xvsrlri_h, gen_vv_i, 32, gen_helper_vsrlri_h)
+TRANS(xvsrlri_w, gen_vv_i, 32, gen_helper_vsrlri_w)
+TRANS(xvsrlri_d, gen_vv_i, 32, gen_helper_vsrlri_d)
+
+TRANS(xvsrar_b, gen_vvv, 32, gen_helper_vsrar_b)
+TRANS(xvsrar_h, gen_vvv, 32, gen_helper_vsrar_h)
+TRANS(xvsrar_w, gen_vvv, 32, gen_helper_vsrar_w)
+TRANS(xvsrar_d, gen_vvv, 32, gen_helper_vsrar_d)
+TRANS(xvsrari_b, gen_vv_i, 32, gen_helper_vsrari_b)
+TRANS(xvsrari_h, gen_vv_i, 32, gen_helper_vsrari_h)
+TRANS(xvsrari_w, gen_vv_i, 32, gen_helper_vsrari_w)
+TRANS(xvsrari_d, gen_vv_i, 32, gen_helper_vsrari_d)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 8a7933eccc..ca0951e1cc 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1661,6 +1661,23 @@ xvsllwil_wu_hu   0111 0111 11000 1  . .   @vv_ui4
 xvsllwil_du_wu   0111 0111 11001 . . .@vv_ui5
 xvextl_qu_du 0111 0111 11010 0 . .@vv
 
+xvsrlr_b 0111 0100 0 . . .@vvv
+xvsrlr_h 0111 0100 1 . . .@vvv
+xvsrlr_w 0111 0100 00010 . . .@vvv
+xvsrlr_d 0111 0100 00011 . . .@vvv
+xvsrlri_b0111 01101010 01000 01 ... . .   @vv_ui3
+xvsrlri_h0111 01101010 01000 1  . .   @vv_ui4
+xvsrlri_w0111 01101010 01001 . . .@vv_ui5
+xvsrlri_d0111 01101010 0101 .. . .@vv_ui6
+xvsrar_b 0111 0100 00100 . . .@vvv
+xvsrar_h 0111 0100 00101 . . .@vvv
+xvsrar_w 0111 0100 00110 . . .@vvv
+xvsrar_d 0111 0100 00111 . . .@vvv
+xvsrari_b0111 01101010 1 01 ... . .   @vv_ui3
+xvsrari_h0111 01101010 1 1  . .   @vv_ui4
+xvsrari_w0111 01101010 10001 . . .@vv_ui5
+xvsrari_d0111 01101010 1001 .. . .@vv_ui6
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index dcf75d421c..38b55e00ca 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -997,8 +997,9 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc)  \
 VReg *Vd = (VReg *)vd;  \
 VReg *Vj = (VReg *)vj;  \
 VReg *Vk = (VReg *)vk;  \
+int oprsz = simd_oprsz(desc);   \

[PATCH v3 26/47] target/loongarch: Implement xvsll xvsrl xvsra xvrotr

2023-07-14 Thread Song Gao

This patch includes:
- XVSLL[I].{B/H/W/D};
- XVSRL[I].{B/H/W/D};
- XVSRA[I].{B/H/W/D};
- XVROTR[I].{B/H/W/D}.

Signed-off-by: Song Gao 
Reviewed-by: Richard Henderson 
---
 target/loongarch/disas.c | 36 
 target/loongarch/insn_trans/trans_lasx.c.inc | 36 
 target/loongarch/insns.decode| 33 ++
 3 files changed, 105 insertions(+)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 59fa249bae..e081a11aba 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2041,6 +2041,42 @@ INSN_LASX(xvori_b,   vv_i)
 INSN_LASX(xvxori_b,  vv_i)
 INSN_LASX(xvnori_b,  vv_i)
 
+INSN_LASX(xvsll_b,   vvv)
+INSN_LASX(xvsll_h,   vvv)
+INSN_LASX(xvsll_w,   vvv)
+INSN_LASX(xvsll_d,   vvv)
+INSN_LASX(xvslli_b,  vv_i)
+INSN_LASX(xvslli_h,  vv_i)
+INSN_LASX(xvslli_w,  vv_i)
+INSN_LASX(xvslli_d,  vv_i)
+
+INSN_LASX(xvsrl_b,   vvv)
+INSN_LASX(xvsrl_h,   vvv)
+INSN_LASX(xvsrl_w,   vvv)
+INSN_LASX(xvsrl_d,   vvv)
+INSN_LASX(xvsrli_b,  vv_i)
+INSN_LASX(xvsrli_h,  vv_i)
+INSN_LASX(xvsrli_w,  vv_i)
+INSN_LASX(xvsrli_d,  vv_i)
+
+INSN_LASX(xvsra_b,   vvv)
+INSN_LASX(xvsra_h,   vvv)
+INSN_LASX(xvsra_w,   vvv)
+INSN_LASX(xvsra_d,   vvv)
+INSN_LASX(xvsrai_b,  vv_i)
+INSN_LASX(xvsrai_h,  vv_i)
+INSN_LASX(xvsrai_w,  vv_i)
+INSN_LASX(xvsrai_d,  vv_i)
+
+INSN_LASX(xvrotr_b,  vvv)
+INSN_LASX(xvrotr_h,  vvv)
+INSN_LASX(xvrotr_w,  vvv)
+INSN_LASX(xvrotr_d,  vvv)
+INSN_LASX(xvrotri_b, vv_i)
+INSN_LASX(xvrotri_h, vv_i)
+INSN_LASX(xvrotri_w, vv_i)
+INSN_LASX(xvrotri_d, vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 31967b371c..5e88f0c530 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -360,6 +360,42 @@ TRANS(xvori_b, gvec_vv_i, 32, MO_8, tcg_gen_gvec_ori)
 TRANS(xvxori_b, gvec_vv_i, 32, MO_8, tcg_gen_gvec_xori)
 TRANS(xvnori_b, gvec_vv_i, 32, MO_8, do_vnori_b)
 
+TRANS(xvsll_b, gvec_vvv, 32, MO_8, tcg_gen_gvec_shlv)
+TRANS(xvsll_h, gvec_vvv, 32, MO_16, tcg_gen_gvec_shlv)
+TRANS(xvsll_w, gvec_vvv, 32, MO_32, tcg_gen_gvec_shlv)
+TRANS(xvsll_d, gvec_vvv, 32, MO_64, tcg_gen_gvec_shlv)
+TRANS(xvslli_b, gvec_vv_i, 32, MO_8, tcg_gen_gvec_shli)
+TRANS(xvslli_h, gvec_vv_i, 32, MO_16, tcg_gen_gvec_shli)
+TRANS(xvslli_w, gvec_vv_i, 32, MO_32, tcg_gen_gvec_shli)
+TRANS(xvslli_d, gvec_vv_i, 32, MO_64, tcg_gen_gvec_shli)
+
+TRANS(xvsrl_b, gvec_vvv, 32, MO_8, tcg_gen_gvec_shrv)
+TRANS(xvsrl_h, gvec_vvv, 32, MO_16, tcg_gen_gvec_shrv)
+TRANS(xvsrl_w, gvec_vvv, 32, MO_32, tcg_gen_gvec_shrv)
+TRANS(xvsrl_d, gvec_vvv, 32, MO_64, tcg_gen_gvec_shrv)
+TRANS(xvsrli_b, gvec_vv_i, 32, MO_8, tcg_gen_gvec_shri)
+TRANS(xvsrli_h, gvec_vv_i, 32, MO_16, tcg_gen_gvec_shri)
+TRANS(xvsrli_w, gvec_vv_i, 32, MO_32, tcg_gen_gvec_shri)
+TRANS(xvsrli_d, gvec_vv_i, 32, MO_64, tcg_gen_gvec_shri)
+
+TRANS(xvsra_b, gvec_vvv, 32, MO_8, tcg_gen_gvec_sarv)
+TRANS(xvsra_h, gvec_vvv, 32, MO_16, tcg_gen_gvec_sarv)
+TRANS(xvsra_w, gvec_vvv, 32, MO_32, tcg_gen_gvec_sarv)
+TRANS(xvsra_d, gvec_vvv, 32, MO_64, tcg_gen_gvec_sarv)
+TRANS(xvsrai_b, gvec_vv_i, 32, MO_8, tcg_gen_gvec_sari)
+TRANS(xvsrai_h, gvec_vv_i, 32, MO_16, tcg_gen_gvec_sari)
+TRANS(xvsrai_w, gvec_vv_i, 32, MO_32, tcg_gen_gvec_sari)
+TRANS(xvsrai_d, gvec_vv_i, 32, MO_64, tcg_gen_gvec_sari)
+
+TRANS(xvrotr_b, gvec_vvv, 32, MO_8, tcg_gen_gvec_rotrv)
+TRANS(xvrotr_h, gvec_vvv, 32, MO_16, tcg_gen_gvec_rotrv)
+TRANS(xvrotr_w, gvec_vvv, 32, MO_32, tcg_gen_gvec_rotrv)
+TRANS(xvrotr_d, gvec_vvv, 32, MO_64, tcg_gen_gvec_rotrv)
+TRANS(xvrotri_b, gvec_vv_i, 32, MO_8, tcg_gen_gvec_rotri)
+TRANS(xvrotri_h, gvec_vv_i, 32, MO_16, tcg_gen_gvec_rotri)
+TRANS(xvrotri_w, gvec_vv_i, 32, MO_32, tcg_gen_gvec_rotri)
+TRANS(xvrotri_d, gvec_vv_i, 32, MO_64, tcg_gen_gvec_rotri)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index fb28666577..fb7bd9fb34 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1619,6 +1619,39 @@ xvori_b  0111 0101 01  . .    @vv_ui8
 xvxori_b 0111 0101 10  . .@vv_ui8
 xvnori_b 0111 0101 11  . .@vv_ui8
 
+xvsll_b  0111 01001110 1 . . .@vvv
+xvsll_h  0111 01001110 10001 . . .@vvv
+xvsll_w  0111 01001110 10010 . . .@vvv
+xvsll_d  0111 01001110 10011

[PATCH v3 31/47] target/loongarch: Implement xvssrln xvssran

2023-07-14 Thread Song Gao

This patch includes:
- XVSSRLN.{B.H/H.W/W.D};
- XVSSRAN.{B.H/H.W/W.D};
- XVSSRLN.{BU.H/HU.W/WU.D};
- XVSSRAN.{BU.H/HU.W/WU.D};
- XVSSRLNI.{B.H/H.W/W.D/D.Q};
- XVSSRANI.{B.H/H.W/W.D/D.Q};
- XVSSRLNI.{BU.H/HU.W/WU.D/DU.Q};
- XVSSRANI.{BU.H/HU.W/WU.D/DU.Q}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c |  30 ++
 target/loongarch/insn_trans/trans_lasx.c.inc |  30 ++
 target/loongarch/insns.decode|  30 ++
 target/loongarch/vec_helper.c| 451 ++-
 4 files changed, 337 insertions(+), 204 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 04b6ea713d..04e8d42044 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2136,6 +2136,36 @@ INSN_LASX(xvsrarni_h_w,  vv_i)
 INSN_LASX(xvsrarni_w_d,  vv_i)
 INSN_LASX(xvsrarni_d_q,  vv_i)
 
+INSN_LASX(xvssrln_b_h,   vvv)
+INSN_LASX(xvssrln_h_w,   vvv)
+INSN_LASX(xvssrln_w_d,   vvv)
+INSN_LASX(xvssran_b_h,   vvv)
+INSN_LASX(xvssran_h_w,   vvv)
+INSN_LASX(xvssran_w_d,   vvv)
+INSN_LASX(xvssrln_bu_h,  vvv)
+INSN_LASX(xvssrln_hu_w,  vvv)
+INSN_LASX(xvssrln_wu_d,  vvv)
+INSN_LASX(xvssran_bu_h,  vvv)
+INSN_LASX(xvssran_hu_w,  vvv)
+INSN_LASX(xvssran_wu_d,  vvv)
+
+INSN_LASX(xvssrlni_b_h,  vv_i)
+INSN_LASX(xvssrlni_h_w,  vv_i)
+INSN_LASX(xvssrlni_w_d,  vv_i)
+INSN_LASX(xvssrlni_d_q,  vv_i)
+INSN_LASX(xvssrani_b_h,  vv_i)
+INSN_LASX(xvssrani_h_w,  vv_i)
+INSN_LASX(xvssrani_w_d,  vv_i)
+INSN_LASX(xvssrani_d_q,  vv_i)
+INSN_LASX(xvssrlni_bu_h, vv_i)
+INSN_LASX(xvssrlni_hu_w, vv_i)
+INSN_LASX(xvssrlni_wu_d, vv_i)
+INSN_LASX(xvssrlni_du_q, vv_i)
+INSN_LASX(xvssrani_bu_h, vv_i)
+INSN_LASX(xvssrani_hu_w, vv_i)
+INSN_LASX(xvssrani_wu_d, vv_i)
+INSN_LASX(xvssrani_du_q, vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 76cc3d749a..8804d23e3a 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -455,6 +455,36 @@ TRANS(xvsrarni_h_w, gen_vv_i, 32, gen_helper_vsrarni_h_w)
 TRANS(xvsrarni_w_d, gen_vv_i, 32, gen_helper_vsrarni_w_d)
 TRANS(xvsrarni_d_q, gen_vv_i, 32, gen_helper_vsrarni_d_q)
 
+TRANS(xvssrln_b_h, gen_vvv, 32, gen_helper_vssrln_b_h)
+TRANS(xvssrln_h_w, gen_vvv, 32, gen_helper_vssrln_h_w)
+TRANS(xvssrln_w_d, gen_vvv, 32, gen_helper_vssrln_w_d)
+TRANS(xvssran_b_h, gen_vvv, 32, gen_helper_vssran_b_h)
+TRANS(xvssran_h_w, gen_vvv, 32, gen_helper_vssran_h_w)
+TRANS(xvssran_w_d, gen_vvv, 32, gen_helper_vssran_w_d)
+TRANS(xvssrln_bu_h, gen_vvv, 32, gen_helper_vssrln_bu_h)
+TRANS(xvssrln_hu_w, gen_vvv, 32, gen_helper_vssrln_hu_w)
+TRANS(xvssrln_wu_d, gen_vvv, 32, gen_helper_vssrln_wu_d)
+TRANS(xvssran_bu_h, gen_vvv, 32, gen_helper_vssran_bu_h)
+TRANS(xvssran_hu_w, gen_vvv, 32, gen_helper_vssran_hu_w)
+TRANS(xvssran_wu_d, gen_vvv, 32, gen_helper_vssran_wu_d)
+
+TRANS(xvssrlni_b_h, gen_vv_i, 32, gen_helper_vssrlni_b_h)
+TRANS(xvssrlni_h_w, gen_vv_i, 32, gen_helper_vssrlni_h_w)
+TRANS(xvssrlni_w_d, gen_vv_i, 32, gen_helper_vssrlni_w_d)
+TRANS(xvssrlni_d_q, gen_vv_i, 32, gen_helper_vssrlni_d_q)
+TRANS(xvssrani_b_h, gen_vv_i, 32, gen_helper_vssrani_b_h)
+TRANS(xvssrani_h_w, gen_vv_i, 32, gen_helper_vssrani_h_w)
+TRANS(xvssrani_w_d, gen_vv_i, 32, gen_helper_vssrani_w_d)
+TRANS(xvssrani_d_q, gen_vv_i, 32, gen_helper_vssrani_d_q)
+TRANS(xvssrlni_bu_h, gen_vv_i, 32, gen_helper_vssrlni_bu_h)
+TRANS(xvssrlni_hu_w, gen_vv_i, 32, gen_helper_vssrlni_hu_w)
+TRANS(xvssrlni_wu_d, gen_vv_i, 32, gen_helper_vssrlni_wu_d)
+TRANS(xvssrlni_du_q, gen_vv_i, 32, gen_helper_vssrlni_du_q)
+TRANS(xvssrani_bu_h, gen_vv_i, 32, gen_helper_vssrani_bu_h)
+TRANS(xvssrani_hu_w, gen_vv_i, 32, gen_helper_vssrani_hu_w)
+TRANS(xvssrani_wu_d, gen_vv_i, 32, gen_helper_vssrani_wu_d)
+TRANS(xvssrani_du_q, gen_vv_i, 32, gen_helper_vssrani_du_q)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index d7c50b14ca..022dd9bfd1 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1710,6 +1710,36 @@ xvsrarni_h_w 0111 01110101 11001 . . .   @vv_ui5
 xvsrarni_w_d 0111 01110101 1101 .. . .@vv_ui6
 xvsrarni_d_q 0111 01110101 111 ... . .@vv_ui7
 
+xvssrln_b_h  0111 0100 11001 . . .@vvv
+xvssrln_h_w  0111 0100 11010 . . .@vvv
+xvssrln_w_d  0111 0100 11011 . . .@vvv
+xvssran_b_h  0111 0100 11101 . . .@vvv
+xvssran_h_w  0111 0100 0 . . .@vvv
+xvssran_w_d  0111 0100 1 . . .@vvv
+xvssrln_bu_h

[PATCH v3 19/47] target/loongarch: Implement xvsat

2023-07-14 Thread Song Gao

This patch includes:
- XVSAT.{B/H/W/D}[U].

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c                     |  9 
 target/loongarch/insn_trans/trans_lasx.c.inc |  9 
 target/loongarch/insns.decode                |  9 
 target/loongarch/vec_helper.c                | 48 ++--
 4 files changed, 52 insertions(+), 23 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 72df9f0b08..09e5981fc3 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1979,6 +1979,15 @@ INSN_LASX(xvmod_hu,  vvv)
 INSN_LASX(xvmod_wu,  vvv)
 INSN_LASX(xvmod_du,  vvv)
 
+INSN_LASX(xvsat_b,   vv_i)
+INSN_LASX(xvsat_h,   vv_i)
+INSN_LASX(xvsat_w,   vv_i)
+INSN_LASX(xvsat_d,   vv_i)
+INSN_LASX(xvsat_bu,  vv_i)
+INSN_LASX(xvsat_hu,  vv_i)
+INSN_LASX(xvsat_wu,  vv_i)
+INSN_LASX(xvsat_du,  vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 118635dc1a..cda617413e 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -304,6 +304,15 @@ TRANS(xvmod_hu, gen_vvv, 32, gen_helper_vmod_hu)
 TRANS(xvmod_wu, gen_vvv, 32, gen_helper_vmod_wu)
 TRANS(xvmod_du, gen_vvv, 32, gen_helper_vmod_du)
 
+TRANS(xvsat_b, gvec_vv_i, 32, MO_8, do_vsat_s)
+TRANS(xvsat_h, gvec_vv_i, 32, MO_16, do_vsat_s)
+TRANS(xvsat_w, gvec_vv_i, 32, MO_32, do_vsat_s)
+TRANS(xvsat_d, gvec_vv_i, 32, MO_64, do_vsat_s)
+TRANS(xvsat_bu, gvec_vv_i, 32, MO_8, do_vsat_u)
+TRANS(xvsat_hu, gvec_vv_i, 32, MO_16, do_vsat_u)
+TRANS(xvsat_wu, gvec_vv_i, 32, MO_32, do_vsat_u)
+TRANS(xvsat_du, gvec_vv_i, 32, MO_64, do_vsat_u)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index fa25c876b4..e366cf7615 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1562,6 +1562,15 @@ xvmod_hu 0111 01001110 01101 . . .   @vvv
 xvmod_wu 0111 01001110 01110 . . .@vvv
 xvmod_du 0111 01001110 0 . . .@vvv
 
+xvsat_b  0111 01110010 01000 01 ... . .   @vv_ui3
+xvsat_h  0111 01110010 01000 1  . .   @vv_ui4
+xvsat_w  0111 01110010 01001 . . .@vv_ui5
+xvsat_d  0111 01110010 0101 .. . .@vv_ui6
+xvsat_bu 0111 01110010 1 01 ... . .   @vv_ui3
+xvsat_hu 0111 01110010 1 1  . .   @vv_ui4
+xvsat_wu 0111 01110010 10001 . . .@vv_ui5
+xvsat_du 0111 01110010 1001 .. . .@vv_ui6
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 998e561e0f..4df39c007e 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -652,18 +652,19 @@ VDIV(vmod_hu, 16, UH, DO_REMU)
 VDIV(vmod_wu, 32, UW, DO_REMU)
 VDIV(vmod_du, 64, UD, DO_REMU)
 
-#define VSAT_S(NAME, BIT, E)                                    \
-void HELPER(NAME)(void *vd, void *vj, uint64_t max, uint32_t v) \
-{                                                               \
-    int i;                                                      \
-    VReg *Vd = (VReg *)vd;                                      \
-    VReg *Vj = (VReg *)vj;                                      \
-    typedef __typeof(Vd->E(0)) TD;                              \
-                                                                \
-    for (i = 0; i < LSX_LEN/BIT; i++) {                         \
-        Vd->E(i) = Vj->E(i) > (TD)max ? (TD)max :               \
-                   Vj->E(i) < (TD)~max ? (TD)~max: Vj->E(i);    \
-    }                                                           \
+#define VSAT_S(NAME, BIT, E)                                       \
+void HELPER(NAME)(void *vd, void *vj, uint64_t max, uint32_t desc) \
+{                                                                  \
+    int i;                                                         \
+    VReg *Vd = (VReg *)vd;                                         \
+    VReg *Vj = (VReg *)vj;                                         \
+    typedef __typeof(Vd->E(0)) TD;                                 \
+    int oprsz = simd_oprsz(desc);                                  \
+                                                                   \
+    for (i = 0; i < oprsz / (BIT / 8); i++) {                      \
+        Vd->E(i) = Vj->E(i) > (TD)max ? (TD)max :                  \
+
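
(The archived message is cut off here.) For readers following the VSAT_S change, a minimal stand-alone sketch of the signed byte case may help. It is not QEMU code: sat_s8 and its parameters are hypothetical names, and the plain oprsz argument stands in for simd_oprsz(desc). It only illustrates the clamp to [~max, max] and the oprsz / (BIT / 8) loop bound that lets one helper serve both 16-byte (LSX) and 32-byte (LASX) vectors.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the signed 8-bit saturate (not the QEMU helper).
 * 'max' is the upper bound; the lower bound is its bitwise complement,
 * mirroring the (TD)max / (TD)~max casts in the macro.
 */
static void sat_s8(int8_t *dst, const int8_t *src, uint64_t max, size_t oprsz)
{
    const size_t bit = 8;                /* element width in bits */
    const size_t n = oprsz / (bit / 8);  /* 16 elements for LSX, 32 for LASX */
    const int8_t hi = (int8_t)max;
    const int8_t lo = (int8_t)~max;      /* truncating cast, as in the macro */

    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i] > hi ? hi : src[i] < lo ? lo : src[i];
    }
}
```

With max = 7 the bounds are 7 and -8, so inputs of 100 and -100 clamp to 7 and -8 while 5 passes through unchanged.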

[PATCH v3 43/47] target/loongarch: Implement xvreplve xvinsve0 xvpickve xvb{sll/srl}v

2023-07-14 Thread Song Gao

This patch includes:
- XVREPLVE.{B/H/W/D};
- XVREPL128VEI.{B/H/W/D};
- XVREPLVE0.{B/H/W/D/Q};
- XVINSVE0.{W/D};
- XVPICKVE.{W/D};
- XVBSLL.V, XVBSRL.V.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c                     |  28 +
 target/loongarch/helper.h                    |   5 +
 target/loongarch/insn_trans/trans_lasx.c.inc |  98 
 target/loongarch/insn_trans/trans_lsx.c.inc  | 111 +++
 target/loongarch/insns.decode                |  25 +
 target/loongarch/vec_helper.c                |  28 +
 6 files changed, 249 insertions(+), 46 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 0995d9b794..ac7dd3021d 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1748,6 +1748,10 @@ static void output_rv_i_x(DisasContext *ctx, arg_rv_i *a, const char *mnemonic)
     output(ctx, mnemonic, "r%d, x%d, 0x%x", a->rd, a->vj, a->imm);
 }
 
+static void output_vvr_x(DisasContext *ctx, arg_vvr *a, const char *mnemonic)
+{
+    output(ctx, mnemonic, "x%d, x%d, r%d", a->vd, a->vj, a->rk);
+}
 
 INSN_LASX(xvadd_b,   vvv)
 INSN_LASX(xvadd_h,   vvv)
@@ -2520,3 +2524,27 @@ INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
 INSN_LASX(xvreplgr2vr_d, vr)
+
+INSN_LASX(xvreplve_b,     vvr)
+INSN_LASX(xvreplve_h,     vvr)
+INSN_LASX(xvreplve_w,     vvr)
+INSN_LASX(xvreplve_d,     vvr)
+INSN_LASX(xvrepl128vei_b, vv_i)
+INSN_LASX(xvrepl128vei_h, vv_i)
+INSN_LASX(xvrepl128vei_w, vv_i)
+INSN_LASX(xvrepl128vei_d, vv_i)
+
+INSN_LASX(xvreplve0_b,    vv)
+INSN_LASX(xvreplve0_h,    vv)
+INSN_LASX(xvreplve0_w,    vv)
+INSN_LASX(xvreplve0_d,    vv)
+INSN_LASX(xvreplve0_q,    vv)
+
+INSN_LASX(xvinsve0_w,     vv_i)
+INSN_LASX(xvinsve0_d,     vv_i)
+
+INSN_LASX(xvpickve_w,     vv_i)
+INSN_LASX(xvpickve_d,     vv_i)
+
+INSN_LASX(xvbsll_v,       vv_i)
+INSN_LASX(xvbsrl_v,       vv_i)
diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h
index 21993c8987..dc568d8982 100644
--- a/target/loongarch/helper.h
+++ b/target/loongarch/helper.h
@@ -667,6 +667,11 @@ DEF_HELPER_4(vsetallnez_h, void, env, i32, i32, i32)
 DEF_HELPER_4(vsetallnez_w, void, env, i32, i32, i32)
 DEF_HELPER_4(vsetallnez_d, void, env, i32, i32, i32)
 
+DEF_HELPER_FLAGS_4(xvinsve0_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(xvinsve0_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(xvpickve_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(xvpickve_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+
 DEF_HELPER_FLAGS_4(vpackev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vpackev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vpackev_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index a79f34d280..250665e3fe 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -780,3 +780,101 @@ TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
 TRANS(xvreplgr2vr_d, gvec_dup, 32, MO_64)
+
+TRANS(xvreplve_b, gen_vreplve, 32, MO_8, 8, tcg_gen_ld8u_i64)
+TRANS(xvreplve_h, gen_vreplve, 32, MO_16, 16, tcg_gen_ld16u_i64)
+TRANS(xvreplve_w, gen_vreplve, 32, MO_32, 32, tcg_gen_ld32u_i64)
+TRANS(xvreplve_d, gen_vreplve, 32, MO_64, 64, tcg_gen_ld_i64)
+
+static bool trans_xvrepl128vei_b(DisasContext *ctx, arg_vv_i *a)
+{
+    CHECK_VEC;
+
+    tcg_gen_gvec_dup_mem(MO_8,
+                         offsetof(CPULoongArchState, fpr[a->vd].vreg.B(0)),
+                         offsetof(CPULoongArchState,
+                                  fpr[a->vj].vreg.B((a->imm))),
+                         16, 16);
+    tcg_gen_gvec_dup_mem(MO_8,
+                         offsetof(CPULoongArchState, fpr[a->vd].vreg.B(16)),
+                         offsetof(CPULoongArchState,
+                                  fpr[a->vj].vreg.B((a->imm + 16))),
+                         16, 16);
+    return true;
+}
+
+static bool trans_xvrepl128vei_h(DisasContext *ctx, arg_vv_i *a)
+{
+    CHECK_VEC;
+
+    tcg_gen_gvec_dup_mem(MO_16,
+                         offsetof(CPULoongArchState, fpr[a->vd].vreg.H(0)),
+                         offsetof(CPULoongArchState,
+                                  fpr[a->vj].vreg.H((a->imm))),
+                         16, 16);
+    tcg_gen_gvec_dup_mem(MO_16,
+                         offsetof(CPULoongArchState, fpr[a->vd].vreg.H(8)),
+                         offsetof(CPULoongArchState,
+                                  fpr[a->vj].vreg.H((a->imm + 8))),
+                         16, 16);
+    return true;
+}
+
+static bool trans_xvrepl128vei_w(DisasContext *ctx, arg_vv_i *a)
+{
+    CHECK_VEC;
+
+    tcg_gen_gvec_dup_mem(MO_32,
+
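
(The archived message is cut off here.) The two tcg_gen_gvec_dup_mem() calls per function suggest that XVREPL128VEI replicates element imm of each 128-bit half across that same half. A small host-side sketch of the byte variant under that reading follows. repl128vei_b is a hypothetical name, the arrays use logical element order (the B() accessor's host-endian index adjustment is ignored), and the imm & 15 mask is my addition to keep the sketch in bounds.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of XVREPL128VEI.B (not QEMU code): each 16-byte half
 * of the 32-byte vector is filled with its own element number 'imm'.
 */
static void repl128vei_b(uint8_t dst[32], const uint8_t src[32], unsigned imm)
{
    /* Read both source bytes first so that dst may alias src. */
    const uint8_t lo = src[imm & 15];
    const uint8_t hi = src[(imm & 15) + 16];

    memset(dst, lo, 16);       /* low 128-bit half  */
    memset(dst + 16, hi, 16);  /* high 128-bit half */
}
```

For a source holding 0..31 and imm = 3, the low half becomes all 3 and the high half all 19, which matches the B(a->imm) and B(a->imm + 16) source offsets in the patch.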

[PATCH v3 10/47] target/loongarch: Implement xvhaddw/xvhsubw

2023-07-14 Thread Song Gao

This patch includes:
- XVHADDW.{H.B/W.H/D.W/Q.D/HU.BU/WU.HU/DU.WU/QU.DU};
- XVHSUBW.{H.B/W.H/D.W/Q.D/HU.BU/WU.HU/DU.WU/QU.DU}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c                     | 17 +
 target/loongarch/insn_trans/trans_lasx.c.inc | 17 +
 target/loongarch/insns.decode                | 18 ++
 target/loongarch/meson.build                 |  2 +-
 target/loongarch/vec.h                       |  3 ++
 target/loongarch/vec_helper.c                | 36 ++--
 6 files changed, 82 insertions(+), 11 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 0fd88a56c1..e188220519 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1765,6 +1765,23 @@ INSN_LASX(xvssub_hu, vvv)
 INSN_LASX(xvssub_wu, vvv)
 INSN_LASX(xvssub_du, vvv)
 
+INSN_LASX(xvhaddw_h_b,   vvv)
+INSN_LASX(xvhaddw_w_h,   vvv)
+INSN_LASX(xvhaddw_d_w,   vvv)
+INSN_LASX(xvhaddw_q_d,   vvv)
+INSN_LASX(xvhaddw_hu_bu, vvv)
+INSN_LASX(xvhaddw_wu_hu, vvv)
+INSN_LASX(xvhaddw_du_wu, vvv)
+INSN_LASX(xvhaddw_qu_du, vvv)
+INSN_LASX(xvhsubw_h_b,   vvv)
+INSN_LASX(xvhsubw_w_h,   vvv)
+INSN_LASX(xvhsubw_d_w,   vvv)
+INSN_LASX(xvhsubw_q_d,   vvv)
+INSN_LASX(xvhsubw_hu_bu, vvv)
+INSN_LASX(xvhsubw_wu_hu, vvv)
+INSN_LASX(xvhsubw_du_wu, vvv)
+INSN_LASX(xvhsubw_qu_du, vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 275c6172b4..4272bafda2 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -78,6 +78,23 @@ TRANS(xvssub_hu, gvec_vvv, 32, MO_16, tcg_gen_gvec_ussub)
 TRANS(xvssub_wu, gvec_vvv, 32, MO_32, tcg_gen_gvec_ussub)
 TRANS(xvssub_du, gvec_vvv, 32, MO_64, tcg_gen_gvec_ussub)
 
+TRANS(xvhaddw_h_b, gen_vvv, 32, gen_helper_vhaddw_h_b)
+TRANS(xvhaddw_w_h, gen_vvv, 32, gen_helper_vhaddw_w_h)
+TRANS(xvhaddw_d_w, gen_vvv, 32, gen_helper_vhaddw_d_w)
+TRANS(xvhaddw_q_d, gen_vvv, 32, gen_helper_vhaddw_q_d)
+TRANS(xvhaddw_hu_bu, gen_vvv, 32, gen_helper_vhaddw_hu_bu)
+TRANS(xvhaddw_wu_hu, gen_vvv, 32, gen_helper_vhaddw_wu_hu)
+TRANS(xvhaddw_du_wu, gen_vvv, 32, gen_helper_vhaddw_du_wu)
+TRANS(xvhaddw_qu_du, gen_vvv, 32, gen_helper_vhaddw_qu_du)
+TRANS(xvhsubw_h_b, gen_vvv, 32, gen_helper_vhsubw_h_b)
+TRANS(xvhsubw_w_h, gen_vvv, 32, gen_helper_vhsubw_w_h)
+TRANS(xvhsubw_d_w, gen_vvv, 32, gen_helper_vhsubw_d_w)
+TRANS(xvhsubw_q_d, gen_vvv, 32, gen_helper_vhsubw_q_d)
+TRANS(xvhsubw_hu_bu, gen_vvv, 32, gen_helper_vhsubw_hu_bu)
+TRANS(xvhsubw_wu_hu, gen_vvv, 32, gen_helper_vhsubw_wu_hu)
+TRANS(xvhsubw_du_wu, gen_vvv, 32, gen_helper_vhsubw_du_wu)
+TRANS(xvhsubw_qu_du, gen_vvv, 32, gen_helper_vhsubw_qu_du)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 32f857ff7c..ba0b36f4a7 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1343,6 +1343,24 @@ xvssub_hu0111 01000100 11001 . . .   @vvv
 xvssub_wu0111 01000100 11010 . . .@vvv
 xvssub_du0111 01000100 11011 . . .@vvv
 
+xvhaddw_h_b  0111 01000101 01000 . . .@vvv
+xvhaddw_w_h  0111 01000101 01001 . . .@vvv
+xvhaddw_d_w  0111 01000101 01010 . . .@vvv
+xvhaddw_q_d  0111 01000101 01011 . . .@vvv
+xvhaddw_hu_bu0111 01000101 1 . . .@vvv
+xvhaddw_wu_hu0111 01000101 10001 . . .@vvv
+xvhaddw_du_wu0111 01000101 10010 . . .@vvv
+xvhaddw_qu_du0111 01000101 10011 . . .@vvv
+
+xvhsubw_h_b  0111 01000101 01100 . . .@vvv
+xvhsubw_w_h  0111 01000101 01101 . . .@vvv
+xvhsubw_d_w  0111 01000101 01110 . . .@vvv
+xvhsubw_q_d  0111 01000101 0 . . .@vvv
+xvhsubw_hu_bu0111 01000101 10100 . . .@vvv
+xvhsubw_wu_hu0111 01000101 10101 . . .@vvv
+xvhsubw_du_wu0111 01000101 10110 . . .@vvv
+xvhsubw_qu_du0111 01000101 10111 . . .@vvv
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/meson.build b/target/loongarch/meson.build
index b7a27df5a9..7fbf045a5d 100644
--- a/target/loongarch/meson.build
+++ b/target/loongarch/meson.build
@@ -11,7 +11,7 @@ loongarch_tcg_ss.add(files(
   'op_helper.c',
   'translate.c',
   'gdbstub.c',
-  'lsx_helper.c',
+  'vec_helper.c',
 ))
 loongarch_tcg_ss.add(zlib)
 
diff
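
(The archived message is cut off here, before the vec_helper.c hunk.) As a rough illustration of what the XVHADDW/XVHSUBW family computes, here is a sketch of the H.B variants. It assumes the usual widening semantics, where each result element combines the odd-numbered narrow element of vj with the even-numbered narrow element of vk. That pairing is my reading of the instruction names, not something shown in this excerpt, and haddw_h_b / hsubw_h_b are hypothetical names.

```c
#include <stdint.h>

/*
 * Hypothetical model of XVHADDW.H.B / XVHSUBW.H.B on a 256-bit vector:
 * 32 signed bytes in, 16 signed halfwords out. Assumed pairing:
 * odd element of vj with even element of vk.
 */
static void haddw_h_b(int16_t dst[16], const int8_t vj[32], const int8_t vk[32])
{
    for (int i = 0; i < 16; i++) {
        dst[i] = (int16_t)(vj[2 * i + 1] + vk[2 * i]);
    }
}

static void hsubw_h_b(int16_t dst[16], const int8_t vj[32], const int8_t vk[32])
{
    for (int i = 0; i < 16; i++) {
        dst[i] = (int16_t)(vj[2 * i + 1] - vk[2 * i]);
    }
}
```

Because the result is twice as wide as the inputs, 127 + 127 yields 254 rather than wrapping, which is the point of the widening forms.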

[PATCH v3 29/47] target/loongarch: Implement xvsrln xvsran

2023-07-14 Thread Song Gao

This patch includes:
- XVSRLN.{B.H/H.W/W.D};
- XVSRAN.{B.H/H.W/W.D};
- XVSRLNI.{B.H/H.W/W.D/D.Q};
- XVSRANI.{B.H/H.W/W.D/D.Q}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c                     |  16 ++
 target/loongarch/insn_trans/trans_lasx.c.inc |  16 ++
 target/loongarch/insns.decode                |  16 ++
 target/loongarch/vec.h                       |   2 +
 target/loongarch/vec_helper.c                | 168 ++-
 5 files changed, 141 insertions(+), 77 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 9109203a05..14b526abd6 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2104,6 +2104,22 @@ INSN_LASX(xvsrari_h, vv_i)
 INSN_LASX(xvsrari_w, vv_i)
 INSN_LASX(xvsrari_d, vv_i)
 
+INSN_LASX(xvsrln_b_h,    vvv)
+INSN_LASX(xvsrln_h_w,    vvv)
+INSN_LASX(xvsrln_w_d,    vvv)
+INSN_LASX(xvsran_b_h,    vvv)
+INSN_LASX(xvsran_h_w,    vvv)
+INSN_LASX(xvsran_w_d,    vvv)
+
+INSN_LASX(xvsrlni_b_h,   vv_i)
+INSN_LASX(xvsrlni_h_w,   vv_i)
+INSN_LASX(xvsrlni_w_d,   vv_i)
+INSN_LASX(xvsrlni_d_q,   vv_i)
+INSN_LASX(xvsrani_b_h,   vv_i)
+INSN_LASX(xvsrani_h_w,   vv_i)
+INSN_LASX(xvsrani_w_d,   vv_i)
+INSN_LASX(xvsrani_d_q,   vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index aebe384220..43ff9b188a 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -423,6 +423,22 @@ TRANS(xvsrari_h, gen_vv_i, 32, gen_helper_vsrari_h)
 TRANS(xvsrari_w, gen_vv_i, 32, gen_helper_vsrari_w)
 TRANS(xvsrari_d, gen_vv_i, 32, gen_helper_vsrari_d)
 
+TRANS(xvsrln_b_h, gen_vvv, 32, gen_helper_vsrln_b_h)
+TRANS(xvsrln_h_w, gen_vvv, 32, gen_helper_vsrln_h_w)
+TRANS(xvsrln_w_d, gen_vvv, 32, gen_helper_vsrln_w_d)
+TRANS(xvsran_b_h, gen_vvv, 32, gen_helper_vsran_b_h)
+TRANS(xvsran_h_w, gen_vvv, 32, gen_helper_vsran_h_w)
+TRANS(xvsran_w_d, gen_vvv, 32, gen_helper_vsran_w_d)
+
+TRANS(xvsrlni_b_h, gen_vv_i, 32, gen_helper_vsrlni_b_h)
+TRANS(xvsrlni_h_w, gen_vv_i, 32, gen_helper_vsrlni_h_w)
+TRANS(xvsrlni_w_d, gen_vv_i, 32, gen_helper_vsrlni_w_d)
+TRANS(xvsrlni_d_q, gen_vv_i, 32, gen_helper_vsrlni_d_q)
+TRANS(xvsrani_b_h, gen_vv_i, 32, gen_helper_vsrani_b_h)
+TRANS(xvsrani_h_w, gen_vv_i, 32, gen_helper_vsrani_h_w)
+TRANS(xvsrani_w_d, gen_vv_i, 32, gen_helper_vsrani_w_d)
+TRANS(xvsrani_d_q, gen_vv_i, 32, gen_helper_vsrani_d_q)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index ca0951e1cc..204dcfa075 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1678,6 +1678,22 @@ xvsrari_h0111 01101010 1 1  . .  @vv_ui4
 xvsrari_w0111 01101010 10001 . . .@vv_ui5
 xvsrari_d0111 01101010 1001 .. . .@vv_ui6
 
+xvsrln_b_h   0111 0100 01001 . . .@vvv
+xvsrln_h_w   0111 0100 01010 . . .@vvv
+xvsrln_w_d   0111 0100 01011 . . .@vvv
+xvsran_b_h   0111 0100 01101 . . .@vvv
+xvsran_h_w   0111 0100 01110 . . .@vvv
+xvsran_w_d   0111 0100 0 . . .@vvv
+
+xvsrlni_b_h  0111 01110100 0 1  . .   @vv_ui4
+xvsrlni_h_w  0111 01110100 1 . . .@vv_ui5
+xvsrlni_w_d  0111 01110100 0001 .. . .@vv_ui6
+xvsrlni_d_q  0111 01110100 001 ... . .@vv_ui7
+xvsrani_b_h  0111 01110101 1 1  . .   @vv_ui4
+xvsrani_h_w  0111 01110101 10001 . . .@vv_ui5
+xvsrani_w_d  0111 01110101 1001 .. . .@vv_ui6
+xvsrani_d_q  0111 01110101 101 ... . .@vv_ui7
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index 681afd842f..67d829f9da 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -74,4 +74,6 @@
 
 #define DO_SIGNCOV(a, b)  (a == 0 ? 0 : a < 0 ? -b : b)
 
+#define R_SHIFT(a, b) (a >> b)
+
 #endif /* LOONGARCH_VEC_H */
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 38b55e00ca..dacedc4363 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -1079,107 +1079,121 @@ VSRARI(vsrari_h, 16, H)
 VSRARI(vsrari_w, 32, W)
 VSRARI(vsrari_d, 64, D)
 
-#define R_SHIFT(a, b) (a >> b)
-
-#define VSRLN(NAME, BIT, T, E1, E2) \
-void HELPER(NAME)(void *vd, void *v, void *vk, uint32_
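
(The archived message is cut off here.) To make the role of R_SHIFT in the narrowing shifts concrete, the following sketch models one 128-bit lane of XVSRLN.B.H and XVSRAN.B.H. The assumptions are mine: the shift count is the vk element modulo the source width, the narrowed results fill the low 64 bits of the lane, and the high 64 bits are zeroed. The function names are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical one-lane model (not QEMU code): eight 16-bit elements are
 * shifted right and truncated to eight bytes; the upper eight bytes of
 * the 128-bit lane are cleared.
 */
static void srln_b_h_lane(uint8_t dst[16], const uint16_t vj[8],
                          const uint16_t vk[8])
{
    for (int i = 0; i < 8; i++) {
        dst[i] = (uint8_t)(vj[i] >> (vk[i] % 16));  /* logical shift */
        dst[i + 8] = 0;
    }
}

static void sran_b_h_lane(int8_t dst[16], const int16_t vj[8],
                          const uint16_t vk[8])
{
    for (int i = 0; i < 8; i++) {
        /* Relies on >> being arithmetic for negative values, as GCC and
         * Clang implement it; ISO C leaves this implementation-defined. */
        dst[i] = (int8_t)(vj[i] >> (vk[i] % 16));
        dst[i + 8] = 0;
    }
}
```

For example, 0x1234 shifted right by 4 gives 0x0123, of which only the low byte 0x23 survives the narrowing.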

[PATCH v3 33/47] target/loongarch: Implement xvclo xvclz

2023-07-14 Thread Song Gao

This patch includes:
- XVCLO.{B/H/W/D};
- XVCLZ.{B/H/W/D}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c                     |  9 +
 target/loongarch/insn_trans/trans_lasx.c.inc |  9 +
 target/loongarch/insns.decode                |  9 +
 target/loongarch/vec.h                       |  9 +
 target/loongarch/vec_helper.c                | 13 ++---
 5 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index f043a2f9b6..0fc58735b9 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2196,6 +2196,15 @@ INSN_LASX(xvssrarni_hu_w, vv_i)
 INSN_LASX(xvssrarni_wu_d, vv_i)
 INSN_LASX(xvssrarni_du_q, vv_i)
 
+INSN_LASX(xvclo_b,   vv)
+INSN_LASX(xvclo_h,   vv)
+INSN_LASX(xvclo_w,   vv)
+INSN_LASX(xvclo_d,   vv)
+INSN_LASX(xvclz_b,   vv)
+INSN_LASX(xvclz_h,   vv)
+INSN_LASX(xvclz_w,   vv)
+INSN_LASX(xvclz_d,   vv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index d0440dea2a..80a566b948 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -515,6 +515,15 @@ TRANS(xvssrarni_hu_w, gen_vv_i, 32, gen_helper_vssrarni_hu_w)
 TRANS(xvssrarni_wu_d, gen_vv_i, 32, gen_helper_vssrarni_wu_d)
 TRANS(xvssrarni_du_q, gen_vv_i, 32, gen_helper_vssrarni_du_q)
 
+TRANS(xvclo_b, gen_vv, 32, gen_helper_vclo_b)
+TRANS(xvclo_h, gen_vv, 32, gen_helper_vclo_h)
+TRANS(xvclo_w, gen_vv, 32, gen_helper_vclo_w)
+TRANS(xvclo_d, gen_vv, 32, gen_helper_vclo_d)
+TRANS(xvclz_b, gen_vv, 32, gen_helper_vclz_b)
+TRANS(xvclz_h, gen_vv, 32, gen_helper_vclz_h)
+TRANS(xvclz_w, gen_vv, 32, gen_helper_vclz_w)
+TRANS(xvclz_d, gen_vv, 32, gen_helper_vclz_d)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index dc74bae7a5..3175532045 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1770,6 +1770,15 @@ xvssrarni_hu_w   0111 01110110 11001 . . .   @vv_ui5
 xvssrarni_wu_d   0111 01110110 1101 .. . .@vv_ui6
 xvssrarni_du_q   0111 01110110 111 ... . .@vv_ui7
 
+xvclo_b  0111 01101001 11000 0 . .@vv
+xvclo_h  0111 01101001 11000 1 . .@vv
+xvclo_w  0111 01101001 11000 00010 . .@vv
+xvclo_d  0111 01101001 11000 00011 . .@vv
+xvclz_b  0111 01101001 11000 00100 . .@vv
+xvclz_h  0111 01101001 11000 00101 . .@vv
+xvclz_w  0111 01101001 11000 00110 . .@vv
+xvclz_d  0111 01101001 11000 00111 . .@vv
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index 67d829f9da..4497cd4a6d 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -76,4 +76,13 @@
 
 #define R_SHIFT(a, b) (a >> b)
 
+#define DO_CLO_B(N)  (clz32(~N & 0xff) - 24)
+#define DO_CLO_H(N)  (clz32(~N & 0xffff) - 16)
+#define DO_CLO_W(N)  (clz32(~N))
+#define DO_CLO_D(N)  (clz64(~N))
+#define DO_CLZ_B(N)  (clz32(N) - 24)
+#define DO_CLZ_H(N)  (clz32(N) - 16)
+#define DO_CLZ_W(N)  (clz32(N))
+#define DO_CLZ_D(N)  (clz64(N))
+
 #endif /* LOONGARCH_VEC_H */
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 94f3f13456..2706daa1e0 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -2161,22 +2161,13 @@ void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \
     int i;                                           \
     VReg *Vd = (VReg *)vd;                           \
     VReg *Vj = (VReg *)vj;                           \
+    int oprsz = simd_oprsz(desc);                    \
                                                      \
-    for (i = 0; i < LSX_LEN/BIT; i++)                \
-    {                                                \
+    for (i = 0; i < oprsz / (BIT / 8); i++) {        \
         Vd->E(i) = DO_OP(Vj->E(i));                  \
     }                                                \
 }
 
-#define DO_CLO_B(N)  (clz32(~N & 0xff) - 24)
-#define DO_CLO_H(N)  (clz32(~N & 0xffff) - 16)
-#define DO_CLO_W(N)  (clz32(~N))
-#define DO_CLO_D(N)  (clz64(~N))
-#define DO_CLZ_B(N)  (clz32(N) - 24)
-#define DO_CLZ_H(N)  (clz32(N) - 16)
-#define DO_CLZ_W(N)  (clz32(N))
-#define DO_CLZ_D(N)  (clz64(N))
-
 DO_2OP(vclo_b, 8, UB, DO_CLO_B)
 DO_2OP(vclo_h, 16, UH, DO_CLO_H)
 DO_2OP(vclo_w, 32, UW, DO_CLO_W)
-- 
2.39.1
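
As a sanity check on the DO_CLO_* / DO_CLZ_* macros moved into vec.h, the byte and halfword cases can be modelled outside QEMU as below. The clz32() here is a portable stand-in written for this sketch (returning 32 for a zero input, which the macros depend on); the wrapper function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for QEMU's clz32(): count leading zeros, 32 for zero input. */
static int clz32(uint32_t v)
{
    int n = 0;

    if (v == 0) {
        return 32;
    }
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Same expressions as the macros, wrapped in typed functions. */
static int clo_b(uint8_t n)  { return clz32(~n & 0xff) - 24; }
static int clo_h(uint16_t n) { return clz32(~n & 0xffff) - 16; }
static int clz_b(uint8_t n)  { return clz32(n) - 24; }
static int clz_h(uint16_t n) { return clz32(n) - 16; }
```

Counting leading ones is done by inverting the value, masking it to the element width, and counting leading zeros; the subtraction removes the zero bits that come from widening the element to 32 bits.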

[PATCH v3 08/47] target/loongarch: Implement xvsadd/xvssub

2023-07-14 Thread Song Gao

This patch includes:
- XVSADD.{B/H/W/D}[U];
- XVSSUB.{B/H/W/D}[U].

Signed-off-by: Song Gao 
Reviewed-by: Richard Henderson 
---
 target/loongarch/disas.c                     | 17 +
 target/loongarch/insn_trans/trans_lasx.c.inc | 17 +
 target/loongarch/insns.decode                | 18 ++
 3 files changed, 52 insertions(+)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 4e26d49acc..0fd88a56c1 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1748,6 +1748,23 @@ INSN_LASX(xvneg_h,   vv)
 INSN_LASX(xvneg_w,   vv)
 INSN_LASX(xvneg_d,   vv)
 
+INSN_LASX(xvsadd_b,  vvv)
+INSN_LASX(xvsadd_h,  vvv)
+INSN_LASX(xvsadd_w,  vvv)
+INSN_LASX(xvsadd_d,  vvv)
+INSN_LASX(xvsadd_bu, vvv)
+INSN_LASX(xvsadd_hu, vvv)
+INSN_LASX(xvsadd_wu, vvv)
+INSN_LASX(xvsadd_du, vvv)
+INSN_LASX(xvssub_b,  vvv)
+INSN_LASX(xvssub_h,  vvv)
+INSN_LASX(xvssub_w,  vvv)
+INSN_LASX(xvssub_d,  vvv)
+INSN_LASX(xvssub_bu, vvv)
+INSN_LASX(xvssub_hu, vvv)
+INSN_LASX(xvssub_wu, vvv)
+INSN_LASX(xvssub_du, vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 0c7d2bbffd..275c6172b4 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -61,6 +61,23 @@ TRANS(xvneg_h, gvec_vv, 32, MO_16, tcg_gen_gvec_neg)
 TRANS(xvneg_w, gvec_vv, 32, MO_32, tcg_gen_gvec_neg)
 TRANS(xvneg_d, gvec_vv, 32, MO_64, tcg_gen_gvec_neg)
 
+TRANS(xvsadd_b, gvec_vvv, 32, MO_8, tcg_gen_gvec_ssadd)
+TRANS(xvsadd_h, gvec_vvv, 32, MO_16, tcg_gen_gvec_ssadd)
+TRANS(xvsadd_w, gvec_vvv, 32, MO_32, tcg_gen_gvec_ssadd)
+TRANS(xvsadd_d, gvec_vvv, 32, MO_64, tcg_gen_gvec_ssadd)
+TRANS(xvsadd_bu, gvec_vvv, 32, MO_8, tcg_gen_gvec_usadd)
+TRANS(xvsadd_hu, gvec_vvv, 32, MO_16, tcg_gen_gvec_usadd)
+TRANS(xvsadd_wu, gvec_vvv, 32, MO_32, tcg_gen_gvec_usadd)
+TRANS(xvsadd_du, gvec_vvv, 32, MO_64, tcg_gen_gvec_usadd)
+TRANS(xvssub_b, gvec_vvv, 32, MO_8, tcg_gen_gvec_sssub)
+TRANS(xvssub_h, gvec_vvv, 32, MO_16, tcg_gen_gvec_sssub)
+TRANS(xvssub_w, gvec_vvv, 32, MO_32, tcg_gen_gvec_sssub)
+TRANS(xvssub_d, gvec_vvv, 32, MO_64, tcg_gen_gvec_sssub)
+TRANS(xvssub_bu, gvec_vvv, 32, MO_8, tcg_gen_gvec_ussub)
+TRANS(xvssub_hu, gvec_vvv, 32, MO_16, tcg_gen_gvec_ussub)
+TRANS(xvssub_wu, gvec_vvv, 32, MO_32, tcg_gen_gvec_ussub)
+TRANS(xvssub_du, gvec_vvv, 32, MO_64, tcg_gen_gvec_ussub)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 759172628f..32f857ff7c 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1325,6 +1325,24 @@ xvneg_h  0111 01101001 11000 01101 . .   @vv
 xvneg_w  0111 01101001 11000 01110 . .@vv
 xvneg_d  0111 01101001 11000 0 . .@vv
 
+xvsadd_b 0111 01000100 01100 . . .@vvv
+xvsadd_h 0111 01000100 01101 . . .@vvv
+xvsadd_w 0111 01000100 01110 . . .@vvv
+xvsadd_d 0111 01000100 0 . . .@vvv
+xvsadd_bu0111 01000100 10100 . . .@vvv
+xvsadd_hu0111 01000100 10101 . . .@vvv
+xvsadd_wu0111 01000100 10110 . . .@vvv
+xvsadd_du0111 01000100 10111 . . .@vvv
+
+xvssub_b 0111 01000100 1 . . .@vvv
+xvssub_h 0111 01000100 10001 . . .@vvv
+xvssub_w 0111 01000100 10010 . . .@vvv
+xvssub_d 0111 01000100 10011 . . .@vvv
+xvssub_bu0111 01000100 11000 . . .@vvv
+xvssub_hu0111 01000100 11001 . . .@vvv
+xvssub_wu0111 01000100 11010 . . .@vvv
+xvssub_du0111 01000100 11011 . . .@vvv
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
-- 
2.39.1
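
The patch maps XVSADD/XVSSUB onto the generic tcg_gen_gvec_{ss,us}add and tcg_gen_gvec_{ss,us}sub expanders, so no new helper code appears in it. For reference, the per-element arithmetic those names imply (signed and unsigned saturation) can be sketched for 8-bit elements as follows. These scalar functions are hypothetical illustrations, not the TCG implementation.

```c
#include <stdint.h>

/* Signed saturating add: clamp the exact sum to [INT8_MIN, INT8_MAX]. */
static int8_t ssadd8(int8_t a, int8_t b)
{
    int s = a + b;

    if (s > INT8_MAX) {
        return INT8_MAX;
    }
    if (s < INT8_MIN) {
        return INT8_MIN;
    }
    return (int8_t)s;
}

/* Signed saturating subtract. */
static int8_t sssub8(int8_t a, int8_t b)
{
    int d = a - b;

    if (d > INT8_MAX) {
        return INT8_MAX;
    }
    if (d < INT8_MIN) {
        return INT8_MIN;
    }
    return (int8_t)d;
}

/* Unsigned saturating add: clamp at UINT8_MAX. */
static uint8_t usadd8(uint8_t a, uint8_t b)
{
    unsigned s = (unsigned)a + b;

    return s > UINT8_MAX ? UINT8_MAX : (uint8_t)s;
}

/* Unsigned saturating subtract: clamp at zero. */
static uint8_t ussub8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}
```

The 32 passed as the vector size in each TRANS() line means the same element-wise operation runs over all 32 bytes of the LASX register.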

[PATCH v3 32/47] target/loongarch: Implement xvssrlrn xvssrarn

2023-07-14 Thread Song Gao

This patch includes:
- XVSSRLRN.{B.H/H.W/W.D};
- XVSSRARN.{B.H/H.W/W.D};
- XVSSRLRN.{BU.H/HU.W/WU.D};
- XVSSRARN.{BU.H/HU.W/WU.D};
- XVSSRLRNI.{B.H/H.W/W.D/D.Q};
- XVSSRARNI.{B.H/H.W/W.D/D.Q};
- XVSSRLRNI.{BU.H/HU.W/WU.D/DU.Q};
- XVSSRARNI.{BU.H/HU.W/WU.D/DU.Q}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c                     |  30 ++
 target/loongarch/insn_trans/trans_lasx.c.inc |  30 ++
 target/loongarch/insns.decode                |  30 ++
 target/loongarch/vec_helper.c                | 467 ++-
 4 files changed, 348 insertions(+), 209 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 04e8d42044..f043a2f9b6 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2166,6 +2166,36 @@ INSN_LASX(xvssrani_hu_w, vv_i)
 INSN_LASX(xvssrani_wu_d, vv_i)
 INSN_LASX(xvssrani_du_q, vv_i)
 
+INSN_LASX(xvssrlrn_b_h,  vvv)
+INSN_LASX(xvssrlrn_h_w,  vvv)
+INSN_LASX(xvssrlrn_w_d,  vvv)
+INSN_LASX(xvssrarn_b_h,  vvv)
+INSN_LASX(xvssrarn_h_w,  vvv)
+INSN_LASX(xvssrarn_w_d,  vvv)
+INSN_LASX(xvssrlrn_bu_h, vvv)
+INSN_LASX(xvssrlrn_hu_w, vvv)
+INSN_LASX(xvssrlrn_wu_d, vvv)
+INSN_LASX(xvssrarn_bu_h, vvv)
+INSN_LASX(xvssrarn_hu_w, vvv)
+INSN_LASX(xvssrarn_wu_d, vvv)
+
+INSN_LASX(xvssrlrni_b_h, vv_i)
+INSN_LASX(xvssrlrni_h_w, vv_i)
+INSN_LASX(xvssrlrni_w_d, vv_i)
+INSN_LASX(xvssrlrni_d_q, vv_i)
+INSN_LASX(xvssrlrni_bu_h, vv_i)
+INSN_LASX(xvssrlrni_hu_w, vv_i)
+INSN_LASX(xvssrlrni_wu_d, vv_i)
+INSN_LASX(xvssrlrni_du_q, vv_i)
+INSN_LASX(xvssrarni_b_h, vv_i)
+INSN_LASX(xvssrarni_h_w, vv_i)
+INSN_LASX(xvssrarni_w_d, vv_i)
+INSN_LASX(xvssrarni_d_q, vv_i)
+INSN_LASX(xvssrarni_bu_h, vv_i)
+INSN_LASX(xvssrarni_hu_w, vv_i)
+INSN_LASX(xvssrarni_wu_d, vv_i)
+INSN_LASX(xvssrarni_du_q, vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 8804d23e3a..d0440dea2a 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -485,6 +485,36 @@ TRANS(xvssrani_hu_w, gen_vv_i, 32, gen_helper_vssrani_hu_w)
 TRANS(xvssrani_wu_d, gen_vv_i, 32, gen_helper_vssrani_wu_d)
 TRANS(xvssrani_du_q, gen_vv_i, 32, gen_helper_vssrani_du_q)
 
+TRANS(xvssrlrn_b_h, gen_vvv, 32, gen_helper_vssrlrn_b_h)
+TRANS(xvssrlrn_h_w, gen_vvv, 32, gen_helper_vssrlrn_h_w)
+TRANS(xvssrlrn_w_d, gen_vvv, 32, gen_helper_vssrlrn_w_d)
+TRANS(xvssrarn_b_h, gen_vvv, 32, gen_helper_vssrarn_b_h)
+TRANS(xvssrarn_h_w, gen_vvv, 32, gen_helper_vssrarn_h_w)
+TRANS(xvssrarn_w_d, gen_vvv, 32, gen_helper_vssrarn_w_d)
+TRANS(xvssrlrn_bu_h, gen_vvv, 32, gen_helper_vssrlrn_bu_h)
+TRANS(xvssrlrn_hu_w, gen_vvv, 32, gen_helper_vssrlrn_hu_w)
+TRANS(xvssrlrn_wu_d, gen_vvv, 32, gen_helper_vssrlrn_wu_d)
+TRANS(xvssrarn_bu_h, gen_vvv, 32, gen_helper_vssrarn_bu_h)
+TRANS(xvssrarn_hu_w, gen_vvv, 32, gen_helper_vssrarn_hu_w)
+TRANS(xvssrarn_wu_d, gen_vvv, 32, gen_helper_vssrarn_wu_d)
+
+TRANS(xvssrlrni_b_h, gen_vv_i, 32, gen_helper_vssrlrni_b_h)
+TRANS(xvssrlrni_h_w, gen_vv_i, 32, gen_helper_vssrlrni_h_w)
+TRANS(xvssrlrni_w_d, gen_vv_i, 32, gen_helper_vssrlrni_w_d)
+TRANS(xvssrlrni_d_q, gen_vv_i, 32, gen_helper_vssrlrni_d_q)
+TRANS(xvssrarni_b_h, gen_vv_i, 32, gen_helper_vssrarni_b_h)
+TRANS(xvssrarni_h_w, gen_vv_i, 32, gen_helper_vssrarni_h_w)
+TRANS(xvssrarni_w_d, gen_vv_i, 32, gen_helper_vssrarni_w_d)
+TRANS(xvssrarni_d_q, gen_vv_i, 32, gen_helper_vssrarni_d_q)
+TRANS(xvssrlrni_bu_h, gen_vv_i, 32, gen_helper_vssrlrni_bu_h)
+TRANS(xvssrlrni_hu_w, gen_vv_i, 32, gen_helper_vssrlrni_hu_w)
+TRANS(xvssrlrni_wu_d, gen_vv_i, 32, gen_helper_vssrlrni_wu_d)
+TRANS(xvssrlrni_du_q, gen_vv_i, 32, gen_helper_vssrlrni_du_q)
+TRANS(xvssrarni_bu_h, gen_vv_i, 32, gen_helper_vssrarni_bu_h)
+TRANS(xvssrarni_hu_w, gen_vv_i, 32, gen_helper_vssrarni_hu_w)
+TRANS(xvssrarni_wu_d, gen_vv_i, 32, gen_helper_vssrarni_wu_d)
+TRANS(xvssrarni_du_q, gen_vv_i, 32, gen_helper_vssrarni_du_q)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 022dd9bfd1..dc74bae7a5 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1740,6 +1740,36 @@ xvssrani_hu_w0111 01110110 01001 . . .   @vv_ui5
 xvssrani_wu_d0111 01110110 0101 .. . .@vv_ui6
 xvssrani_du_q0111 01110110 011 ... . .@vv_ui7
 
+xvssrlrn_b_h 0111 0101 1 . . .@vvv
+xvssrlrn_h_w 0111 0101 00010 . . .@vvv
+xvssrlrn_w_d 0111 0101 00011 . . .@vvv
+xvssrarn_b_h 0111 0101 00101 . . .@vvv
+xvssrarn_h_w 0111 0101 00110 . . .@vvv
+xvssra

[PATCH v3 38/47] target/loongarch: Implement LASX fpu fcvt instructions

2023-07-14 Thread Song Gao

This patch includes:
- XVFCVT{L/H}.{S.H/D.S};
- XVFCVT.{H.S/S.D};
- XVFRINT[{RNE/RZ/RP/RM}].{S/D};
- XVFTINT[{RNE/RZ/RP/RM}].{W.S/L.D};
- XVFTINT[RZ].{WU.S/LU.D};
- XVFTINT[{RNE/RZ/RP/RM}].W.D;
- XVFTINT[{RNE/RZ/RP/RM}]{L/H}.L.S;
- XVFFINT.{S.W/D.L}[U];
- XVFFINT.S.L, XVFFINT{L/H}.D.W.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c |  56 
 target/loongarch/insn_trans/trans_lasx.c.inc |  56 
 target/loongarch/insns.decode|  58 
 target/loongarch/vec_helper.c| 263 ---
 4 files changed, 335 insertions(+), 98 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 4af74f1ae9..3fd3dc3591 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2286,6 +2286,62 @@ INSN_LASX(xvfrecip_d,vv)
 INSN_LASX(xvfrsqrt_s,vv)
 INSN_LASX(xvfrsqrt_d,vv)
 
+INSN_LASX(xvfcvtl_s_h,   vv)
+INSN_LASX(xvfcvth_s_h,   vv)
+INSN_LASX(xvfcvtl_d_s,   vv)
+INSN_LASX(xvfcvth_d_s,   vv)
+INSN_LASX(xvfcvt_h_s,vvv)
+INSN_LASX(xvfcvt_s_d,vvv)
+
+INSN_LASX(xvfrint_s, vv)
+INSN_LASX(xvfrint_d, vv)
+INSN_LASX(xvfrintrm_s,   vv)
+INSN_LASX(xvfrintrm_d,   vv)
+INSN_LASX(xvfrintrp_s,   vv)
+INSN_LASX(xvfrintrp_d,   vv)
+INSN_LASX(xvfrintrz_s,   vv)
+INSN_LASX(xvfrintrz_d,   vv)
+INSN_LASX(xvfrintrne_s,  vv)
+INSN_LASX(xvfrintrne_d,  vv)
+
+INSN_LASX(xvftint_w_s,   vv)
+INSN_LASX(xvftint_l_d,   vv)
+INSN_LASX(xvftintrm_w_s, vv)
+INSN_LASX(xvftintrm_l_d, vv)
+INSN_LASX(xvftintrp_w_s, vv)
+INSN_LASX(xvftintrp_l_d, vv)
+INSN_LASX(xvftintrz_w_s, vv)
+INSN_LASX(xvftintrz_l_d, vv)
+INSN_LASX(xvftintrne_w_s,vv)
+INSN_LASX(xvftintrne_l_d,vv)
+INSN_LASX(xvftint_wu_s,  vv)
+INSN_LASX(xvftint_lu_d,  vv)
+INSN_LASX(xvftintrz_wu_s,vv)
+INSN_LASX(xvftintrz_lu_d,vv)
+INSN_LASX(xvftint_w_d,   vvv)
+INSN_LASX(xvftintrm_w_d, vvv)
+INSN_LASX(xvftintrp_w_d, vvv)
+INSN_LASX(xvftintrz_w_d, vvv)
+INSN_LASX(xvftintrne_w_d,vvv)
+INSN_LASX(xvftintl_l_s,  vv)
+INSN_LASX(xvftinth_l_s,  vv)
+INSN_LASX(xvftintrml_l_s,vv)
+INSN_LASX(xvftintrmh_l_s,vv)
+INSN_LASX(xvftintrpl_l_s,vv)
+INSN_LASX(xvftintrph_l_s,vv)
+INSN_LASX(xvftintrzl_l_s,vv)
+INSN_LASX(xvftintrzh_l_s,vv)
+INSN_LASX(xvftintrnel_l_s,   vv)
+INSN_LASX(xvftintrneh_l_s,   vv)
+
+INSN_LASX(xvffint_s_w,   vv)
+INSN_LASX(xvffint_s_wu,  vv)
+INSN_LASX(xvffint_d_l,   vv)
+INSN_LASX(xvffint_d_lu,  vv)
+INSN_LASX(xvffintl_d_w,  vv)
+INSN_LASX(xvffinth_d_w,  vv)
+INSN_LASX(xvffint_s_l,   vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 912b52cfdc..057aed657e 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -602,6 +602,62 @@ TRANS(xvfrecip_d, gen_vv_f, 32, gen_helper_vfrecip_d)
 TRANS(xvfrsqrt_s, gen_vv_f, 32, gen_helper_vfrsqrt_s)
 TRANS(xvfrsqrt_d, gen_vv_f, 32, gen_helper_vfrsqrt_d)
 
+TRANS(xvfcvtl_s_h, gen_vv_f, 32, gen_helper_vfcvtl_s_h)
+TRANS(xvfcvth_s_h, gen_vv_f, 32, gen_helper_vfcvth_s_h)
+TRANS(xvfcvtl_d_s, gen_vv_f, 32, gen_helper_vfcvtl_d_s)
+TRANS(xvfcvth_d_s, gen_vv_f, 32, gen_helper_vfcvth_d_s)
+TRANS(xvfcvt_h_s, gen_vvv_f, 32, gen_helper_vfcvt_h_s)
+TRANS(xvfcvt_s_d, gen_vvv_f, 32, gen_helper_vfcvt_s_d)
+
+TRANS(xvfrintrne_s, gen_vv_f, 32, gen_helper_vfrintrne_s)
+TRANS(xvfrintrne_d, gen_vv_f, 32, gen_helper_vfrintrne_d)
+TRANS(xvfrintrz_s, gen_vv_f, 32, gen_helper_vfrintrz_s)
+TRANS(xvfrintrz_d, gen_vv_f, 32, gen_helper_vfrintrz_d)
+TRANS(xvfrintrp_s, gen_vv_f, 32, gen_helper_vfrintrp_s)
+TRANS(xvfrintrp_d, gen_vv_f, 32, gen_helper_vfrintrp_d)
+TRANS(xvfrintrm_s, gen_vv_f, 32, gen_helper_vfrintrm_s)
+TRANS(xvfrintrm_d, gen_vv_f, 32, gen_helper_vfrintrm_d)
+TRANS(xvfrint_s, gen_vv_f, 32, gen_helper_vfrint_s)
+TRANS(xvfrint_d, gen_vv_f, 32, gen_helper_vfrint_d)
+
+TRANS(xvftintrne_w_s, gen_vv_f, 32, gen_helper_vftintrne_w_s)
+TRANS(xvftintrne_l_d, gen_vv_f, 32, gen_helper_vftintrne_l_d)
+TRANS(xvftintrz_w_s, gen_vv_f, 32, gen_helper_vftintrz_w_s)
+TRANS(xvftintrz_l_d, gen_vv_f, 32, gen_helper_vftintrz_l_d)
+TRANS(xvftintrp_w_s, gen_vv_f, 32, gen_helper_vftintrp_w_s)
+TRANS(xvftintrp_l_d, gen_vv_f, 32, gen_helper_vftintrp_l_d)
+TRANS(xvftintrm_w_s, gen_vv_f, 32, gen_helper_vftintrm_w_s)
+TRANS(xvftintrm_l_d, gen_vv_f, 32, gen_helper_vftintrm_l_d)
+TRANS(xvftint_w_s, gen_vv_f, 32, gen_helper_vftint_w_s)
+TRANS(xvftint_l_d, gen_vv_f, 32, gen_helper_vftint_l_d)
+TRANS(xvftintrz_wu_s, gen_vv_f, 32, gen_helper_vftintrz_wu_s)
+TRANS(xvftintrz_lu_d, gen_vv_f, 32, gen_helper_vftintrz_lu_d)
+TRANS(xvftint_wu_s, gen_vv_f, 32, gen_helper_vftint_wu_s)
+TRANS(xvftint_lu_d, gen_vv_f, 32, gen_helper_vftint_lu_d)
+TRANS(xvftintrne_

[PATCH v3 14/47] target/loongarch: Implement xvadda

2023-07-14 Thread Song Gao

This patch includes:
- XVADDA.{B/H/W/D}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c |  5 
 target/loongarch/insn_trans/trans_lasx.c.inc |  5 
 target/loongarch/insns.decode|  5 
 target/loongarch/vec.h   |  2 ++
 target/loongarch/vec_helper.c| 24 ++--
 5 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index d0b1de39b8..b48822e431 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1851,6 +1851,11 @@ INSN_LASX(xvabsd_hu, vvv)
 INSN_LASX(xvabsd_wu, vvv)
 INSN_LASX(xvabsd_du, vvv)
 
+INSN_LASX(xvadda_b,  vvv)
+INSN_LASX(xvadda_h,  vvv)
+INSN_LASX(xvadda_w,  vvv)
+INSN_LASX(xvadda_d,  vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index bd8ba6c7b6..30cb286cb9 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -166,6 +166,11 @@ TRANS(xvabsd_hu, gvec_vvv, 32, MO_16, do_vabsd_u)
 TRANS(xvabsd_wu, gvec_vvv, 32, MO_32, do_vabsd_u)
 TRANS(xvabsd_du, gvec_vvv, 32, MO_64, do_vabsd_u)
 
+TRANS(xvadda_b, gvec_vvv, 32, MO_8, do_vadda)
+TRANS(xvadda_h, gvec_vvv, 32, MO_16, do_vadda)
+TRANS(xvadda_w, gvec_vvv, 32, MO_32, do_vadda)
+TRANS(xvadda_d, gvec_vvv, 32, MO_64, do_vadda)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index c086ee9b22..f3722e3aa7 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1432,6 +1432,11 @@ xvabsd_hu0111 01000110 00101 . . .   @vvv
 xvabsd_wu0111 01000110 00110 . . .@vvv
 xvabsd_du0111 01000110 00111 . . .@vvv
 
+xvadda_b 0111 01000101 11000 . . .@vvv
+xvadda_h 0111 01000101 11001 . . .@vvv
+xvadda_w 0111 01000101 11010 . . .@vvv
+xvadda_d 0111 01000101 11011 . . .@vvv
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index 6767073635..7ccc89c10f 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -55,4 +55,6 @@
 
 #define DO_VABSD(a, b)  ((a > b) ? (a - b) : (b - a))
 
+#define DO_VABS(a)  ((a < 0) ? (-a) : (a))
+
 #endif /* LOONGARCH_VEC_H */
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 22d08f36ac..ff77f714e8 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -384,18 +384,18 @@ DO_3OP(vabsd_hu, 16, UH, DO_VABSD)
 DO_3OP(vabsd_wu, 32, UW, DO_VABSD)
 DO_3OP(vabsd_du, 64, UD, DO_VABSD)
 
-#define DO_VABS(a)  ((a < 0) ? (-a) : (a))
-
-#define DO_VADDA(NAME, BIT, E, DO_OP)   \
-void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \
-{   \
-int i;  \
-VReg *Vd = (VReg *)vd;  \
-VReg *Vj = (VReg *)vj;  \
-VReg *Vk = (VReg *)vk;  \
-for (i = 0; i < LSX_LEN/BIT; i++) { \
-Vd->E(i) = DO_OP(Vj->E(i)) + DO_OP(Vk->E(i));   \
-}   \
+#define DO_VADDA(NAME, BIT, E, DO_OP)  \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
+{  \
+int i; \
+VReg *Vd = (VReg *)vd; \
+VReg *Vj = (VReg *)vj; \
+VReg *Vk = (VReg *)vk; \
+int oprsz = simd_oprsz(desc);  \
+   \
+for (i = 0; i < oprsz / (BIT / 8); i++) {  \
+Vd->E(i) = DO_OP(Vj->E(i)) + DO_OP(Vk->E(i));  \
+}  \
 }
 
 DO_VADDA(vadda_b, 8, B, DO_VABS)
-- 
2.39.1
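
As a rough illustration of the loop-bound change in the helper above (not part of the patch): assuming `oprsz` is the operation size in bytes, 16 for a 128-bit LSX register and 32 for a 256-bit LASX register, the element count follows from the element width. The names `elem_count` and `vadda_b_sketch` are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: how many elements of `bit` bits fit into an
 * operation that is `oprsz` bytes wide. */
int elem_count(int oprsz, int bit)
{
    assert(bit > 0 && bit % 8 == 0);
    return oprsz / (bit / 8);
}

/* Sketch of an "add absolute values" loop over 8-bit lanes. The bound
 * comes from oprsz, so one body serves both 16-byte and 32-byte
 * vectors. The narrowing store is assumed to wrap, as it does on the
 * usual two's-complement targets. */
void vadda_b_sketch(int8_t *d, const int8_t *a, const int8_t *b, int oprsz)
{
    for (int i = 0; i < elem_count(oprsz, 8); i++) {
        d[i] = (int8_t)(abs(a[i]) + abs(b[i]));
    }
}
```

With a fixed `LSX_LEN` bound the loop would only ever cover the low 128 bits; deriving the bound from the descriptor lets the same helper serve both vector widths.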

[PATCH v3 34/47] target/loongarch: Implement xvpcnt

2023-07-14 Thread Song Gao

This patch includes:
- XVPCNT.{B/H/W/D}.

Signed-off-by: Song Gao 
---
 target/loongarch/disas.c | 5 +
 target/loongarch/insn_trans/trans_lasx.c.inc | 5 +
 target/loongarch/insns.decode| 5 +
 target/loongarch/vec_helper.c| 4 ++--
 4 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 0fc58735b9..9e31f9bbbc 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2205,6 +2205,11 @@ INSN_LASX(xvclz_h,   vv)
 INSN_LASX(xvclz_w,   vv)
 INSN_LASX(xvclz_d,   vv)
 
+INSN_LASX(xvpcnt_b,  vv)
+INSN_LASX(xvpcnt_h,  vv)
+INSN_LASX(xvpcnt_w,  vv)
+INSN_LASX(xvpcnt_d,  vv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 80a566b948..94824569a0 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -524,6 +524,11 @@ TRANS(xvclz_h, gen_vv, 32, gen_helper_vclz_h)
 TRANS(xvclz_w, gen_vv, 32, gen_helper_vclz_w)
 TRANS(xvclz_d, gen_vv, 32, gen_helper_vclz_d)
 
+TRANS(xvpcnt_b, gen_vv, 32, gen_helper_vpcnt_b)
+TRANS(xvpcnt_h, gen_vv, 32, gen_helper_vpcnt_h)
+TRANS(xvpcnt_w, gen_vv, 32, gen_helper_vpcnt_w)
+TRANS(xvpcnt_d, gen_vv, 32, gen_helper_vpcnt_d)
+
 TRANS(xvreplgr2vr_b, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 3175532045..d683c6a6ab 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1779,6 +1779,11 @@ xvclz_h  0111 01101001 11000 00101 . .   @vv
 xvclz_w  0111 01101001 11000 00110 . .@vv
 xvclz_d  0111 01101001 11000 00111 . .@vv
 
+xvpcnt_b 0111 01101001 11000 01000 . .@vv
+xvpcnt_h 0111 01101001 11000 01001 . .@vv
+xvpcnt_w 0111 01101001 11000 01010 . .@vv
+xvpcnt_d 0111 01101001 11000 01011 . .@vv
+
 xvreplgr2vr_b0111 01101001 0 0 . .@vr
 xvreplgr2vr_h0111 01101001 0 1 . .@vr
 xvreplgr2vr_w0111 01101001 0 00010 . .@vr
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 2706daa1e0..57e9a9ed65 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -2183,9 +2183,9 @@ void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \
 int i;   \
 VReg *Vd = (VReg *)vd;   \
 VReg *Vj = (VReg *)vj;   \
+int oprsz = simd_oprsz(desc);\
  \
-for (i = 0; i < LSX_LEN/BIT; i++)\
-{\
+for (i = 0; i < oprsz / (BIT / 8); i++) {\
 Vd->E(i) = FN(Vj->E(i)); \
 }\
 }
-- 
2.39.1

[PATCH v2] block: Fix pad_request's request restriction

2023-07-14 Thread Hanna Czenczek

bdrv_pad_request() relies on requests' lengths not to exceed SIZE_MAX,
which bdrv_check_qiov_request() does not guarantee.

bdrv_check_request32() however will guarantee this, and both of
bdrv_pad_request()'s callers (bdrv_co_preadv_part() and
bdrv_co_pwritev_part()) already run it before calling
bdrv_pad_request().  Therefore, bdrv_pad_request() can safely call
bdrv_check_request32() without expecting error, too.

In effect, this patch will not change guest-visible behavior.  It is a
clean-up to tighten a condition to match what is guaranteed by our
callers, and which exists purely to show clearly why the subsequent
assertion (`assert(*bytes <= SIZE_MAX)`) is always true.

Note there is a difference between the interfaces of
bdrv_check_qiov_request() and bdrv_check_request32(): The former takes
an errp, the latter does not, so we can no longer just pass
&error_abort.  Instead, we need to check the returned value.  While we
do expect success (because the callers have already run this function),
an assert(ret == 0) is not much simpler than just to return an error if
it occurs, so let us handle errors by returning them up the stack now.

Reported-by: Peter Maydell 
Fixes: 18743311b829cafc1737a5f20bc3248d5f91ee2a
   ("block: Collapse padded I/O vecs exceeding IOV_MAX")
Signed-off-by: Hanna Czenczek 
---
v2:
- Added paragraph to the commit message to express explicitly that this
  patch will not change guest-visible behavior
- (No code changes)
---
 block/io.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index e8293d6b26..055fcf7438 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1710,7 +1710,11 @@ static int bdrv_pad_request(BlockDriverState *bs,
 int sliced_niov;
 size_t sliced_head, sliced_tail;
 
-bdrv_check_qiov_request(*offset, *bytes, *qiov, *qiov_offset, &error_abort);
+/* Should have been checked by the caller already */
+ret = bdrv_check_request32(*offset, *bytes, *qiov, *qiov_offset);
+if (ret < 0) {
+return ret;
+}
 
 if (!bdrv_init_padding(bs, *offset, *bytes, write, pad)) {
 if (padded) {
@@ -1723,7 +1727,7 @@ static int bdrv_pad_request(BlockDriverState *bs,
   &sliced_head, &sliced_tail,
   &sliced_niov);
 
-/* Guaranteed by bdrv_check_qiov_request() */
+/* Guaranteed by bdrv_check_request32() */
 assert(*bytes <= SIZE_MAX);
 ret = bdrv_create_padded_qiov(bs, pad, sliced_iov, sliced_niov,
   sliced_head, *bytes);
-- 
2.41.0
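
The reasoning in the commit message above can be sketched in isolation. This is a minimal illustration, not the block layer's code: `check_request32_sketch` is a hypothetical stand-in that assumes the 32-bit check bounds the length by INT_MAX, and `pad_request_sketch` shows the "return the error, then assert" pattern the patch adopts.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit request check: rejects negative
 * offsets or lengths and any length above INT_MAX. */
int check_request32_sketch(int64_t offset, int64_t bytes)
{
    if (offset < 0 || bytes < 0) {
        return -EIO;
    }
    if (bytes > INT_MAX) {
        return -EIO;
    }
    return 0;
}

/* Caller pattern: propagate the error up the stack instead of
 * aborting; once the check has passed, the assertion documents a
 * fact rather than guarding against a real possibility. */
int pad_request_sketch(int64_t offset, int64_t bytes)
{
    int ret = check_request32_sketch(offset, bytes);
    if (ret < 0) {
        return ret;
    }
    /* Assumes INT_MAX <= SIZE_MAX, which holds on common hosts. */
    assert((uint64_t)bytes <= SIZE_MAX);
    return 0;
}
```

The point of the tighter check is visible here: a bound of INT_MAX implies the SIZE_MAX bound on such hosts, whereas a looser 64-bit check would not on a 32-bit host.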

Re: [PATCH for-8.1] tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128

2023-07-14 Thread Thomas Huth


On 13/07/2023 22.23, Richard Henderson wrote:

We adjust CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 with
CONFIG_ATOMIC128_OPT in atomic128.h.  It is difficult
to tell when those changes have been applied with the
ifdef we must use with CONFIG_CMPXCHG128.  So instead
use HAVE_CMPXCHG128, which triggers -Werror-undef when
the proper header has not been included.

Improves tcg_gen_atomic_cmpxchg_i128 for s390x host, which
requires CONFIG_ATOMIC128_OPT.  Without this we fall back
to EXCP_ATOMIC to single-step 128-bit atomics, which is
slow enough to cause some tests to time out.

Reported-by: Thomas Huth 
Signed-off-by: Richard Henderson 
---

Thomas, this issue does not quite match the one you bisected, but
other than the cmpxchg, I don't see any qemu_{ld,st}_i128
being used in BootLinuxS390X.test_s390_ccw_virtio_tcg.

As far as I can see, this wasn't broken by the addition of
CONFIG_ATOMIC128_OPT, rather that fix didn't go far enough.

Anyway, test_s390_ccw_virtio_tcg now passes in 159s on our host.


Thanks, I can confirm that this fixes the issue for me, too.

Tested-by: Thomas Huth

Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation

2023-07-14 Thread Gavin Shan


On 7/14/23 10:51, Gavin Shan wrote:

On 7/14/23 02:29, Philippe Mathieu-Daudé wrote:

On 13/7/23 14:34, Gavin Shan wrote:

On 7/13/23 21:52, Marcin Juszkiewicz wrote:

On 13.07.2023 at 13:44, Peter Maydell wrote:


I see this isn't a change in this patch, but given that
what the user specifies is not "cortex-a8-arm-cpu" but
"cortex-a8", why do we include the "-arm-cpu" suffix in
the error messages? It's not valid syntax to say
"-cpu cortex-a8-arm-cpu", so it's a bit misleading...


Internally those cpu names are "max-{TYPE_ARM_CPU}" and similar for other 
architectures.

I like the change but it (IMHO) needs to cut "-{TYPE_*_CPU}" string from names:

13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu cortex-r5
qemu-system-aarch64: Invalid CPU type: cortex-r5-arm-cpu
The valid types are: cortex-a7-arm-cpu, cortex-a15-arm-cpu, cortex-a35-arm-cpu, 
cortex-a55-arm-cpu, cortex-a72-arm-cpu, cortex-a76-arm-cpu, a64fx-arm-cpu, 
neoverse-n1-arm-cpu, neoverse-v1-arm-cpu, cortex-a53-arm-cpu, 
cortex-a57-arm-cpu, host-arm-cpu, max-arm-cpu

13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu cortex-a57-arm-cpu
qemu-system-aarch64: unable to find CPU model 'cortex-a57-arm-cpu'



The suffixes of the CPU types are provided in hw/arm/virt.c::valid_cpu_types in PATCH[2].
In the generic validation, the complete CPU type is used, so the error message also
contains the complete CPU type.


In some places (arm_cpu_list_entry, arm_cpu_add_definition) we use:

   g_strndup(typename, strlen(typename) - strlen("-" TYPE_ARM_CPU))

Maybe extract as a helper? cpu_typename_name()? :)



Yeah, it's definitely a good idea. The helper is needed by all architectures,
not ARM alone. The following CPU types don't have an explicit definition of
_CPU_TYPE_SUFFIX. We need to take "-" TYPE_CPU as the suffix.

     target/microblaze/cpu.c  TYPE_MICROBLAZE_CPU
     target/hppa/cpu.c    TYPE_HPPA_CPU
     target/nios2/cpu.c   TYPE_NIOS2_CPU

     target/microblaze/cpu-qom.h:#define TYPE_MICROBLAZE_CPU "microblaze-cpu"
     target/hppa/cpu-qom.h:  #define TYPE_HPPA_CPU   "hppa-cpu"
     target/nios2/cpu.h: #define TYPE_NIOS2_CPU  "nios2-cpu"

I think the function name can be cpu_model_name() since we already call it
'model' in cpu.c::parse_cpu_option(). Something like below. Please let
me know if you have more comments.

     target//cpu.h
     -

     static inline char *cpu_model_name(const char *typename)
     {
     return g_strndup(typename, strlen(typename) - strlen(TYPE_XXX_CPU_SUFFIX));
     }



I found that the generic CPU type invalidation in hw/core/machine.c can't see
functions from target/xxx/, including cpu_model_name(). In order to call this
function from hw/core/machine.c, we need to go through a wrapper in cpu.c:

include/exec/cpu-common.h
-
char *cpu_get_model_name(const char *name);
void list_cpus(void);

cpu.c

-
char *cpu_get_model_name(const char *name)
{
   return cpu_model_name(name);
}

With the above changes, cpu_get_model_name() can be called in hw/core/machine.c
to extract the CPU model name from the CPU type name.
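
For illustration only, the suffix stripping discussed above can be sketched in plain C without GLib. The names `CPU_TYPE_SUFFIX_SKETCH` and `cpu_model_name_sketch` are hypothetical, the suffix value is simply the ARM one quoted earlier in the thread, and the check that the suffix is actually present is an extra safeguard assumed here, not something the proposed helper does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-target suffix; for ARM this would correspond to
 * "-" TYPE_ARM_CPU, i.e. "-arm-cpu". */
#define CPU_TYPE_SUFFIX_SKETCH "-arm-cpu"

/* Return a newly allocated copy of type_name with the suffix removed,
 * or a plain copy if the suffix is absent. The caller frees the
 * result. Returns NULL on allocation failure. */
char *cpu_model_name_sketch(const char *type_name)
{
    size_t len = strlen(type_name);
    size_t slen = strlen(CPU_TYPE_SUFFIX_SKETCH);
    char *out;

    if (len >= slen &&
        strcmp(type_name + len - slen, CPU_TYPE_SUFFIX_SKETCH) == 0) {
        len -= slen;
    }
    out = malloc(len + 1);
    if (!out) {
        return NULL;
    }
    memcpy(out, type_name, len);
    out[len] = '\0';
    return out;
}
```

An error message built from the result would then list "cortex-a57" rather than "cortex-a57-arm-cpu", matching what the user can actually pass to -cpu.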

Thanks,
Gavin

Re: [RFC PATCH v4 20/24] vfio/iommufd: Implement the iommufd backend

2023-07-14 Thread Joel Granados

On Wed, Jul 12, 2023 at 03:25:24PM +0800, Zhenzhong Duan wrote:
> From: Yi Liu 
> 
> Add the iommufd backend. The IOMMUFD container class is implemented
> based on the new /dev/iommu user API. This backend obviously depends
> on CONFIG_IOMMUFD.
> 
> So far, the iommufd backend doesn't support live migration yet due
> to missing support in the host kernel.
> 
> Co-authored-by: Eric Auger 
> Signed-off-by: Eric Auger 
> Signed-off-by: Yi Liu 
> Signed-off-by: Zhenzhong Duan 
> ---
>  hw/vfio/as.c  |  12 +-
>  hw/vfio/iommufd.c | 511 ++
>  hw/vfio/meson.build   |   3 +
>  hw/vfio/trace-events  |  11 +
>  include/hw/vfio/vfio-common.h |  24 ++
>  include/hw/vfio/vfio-container-base.h |   3 +
>  6 files changed, 562 insertions(+), 2 deletions(-)
>  create mode 100644 hw/vfio/iommufd.c
> 
> diff --git a/hw/vfio/as.c b/hw/vfio/as.c
> index a7179403b7..2e7ecd4e96 100644
> --- a/hw/vfio/as.c
> +++ b/hw/vfio/as.c
> @@ -45,7 +45,7 @@
>  #include "migration/qemu-file.h"
>  #include "sysemu/tpm.h"
>  
> -static QLIST_HEAD(, VFIOAddressSpace) vfio_address_spaces =
> +VFIOAddressSpaceList vfio_address_spaces =
>  QLIST_HEAD_INITIALIZER(vfio_address_spaces);
>  
>  #ifdef CONFIG_KVM
> @@ -1516,8 +1516,16 @@ int vfio_attach_device(char *name, VFIODevice 
> *vbasedev,
>  {
>  const VFIOIOMMUBackendOpsClass *ops;
>  
> -ops = VFIO_IOMMU_BACKEND_OPS_CLASS(
> +#ifdef CONFIG_IOMMUFD
> +if (vbasedev->iommufd) {
> +ops = VFIO_IOMMU_BACKEND_OPS_CLASS(
> +  object_class_by_name(TYPE_VFIO_IOMMU_BACKEND_IOMMUFD_OPS));
> +} else
> +#endif
> +{
> +ops = VFIO_IOMMU_BACKEND_OPS_CLASS(
>object_class_by_name(TYPE_VFIO_IOMMU_BACKEND_LEGACY_OPS));
> +}
>  if (!ops) {
>  error_setg(errp, "VFIO IOMMU Backend not found!");
>  return -ENODEV;
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> new file mode 100644
> index 00..286ad0b766
> --- /dev/null
> +++ b/hw/vfio/iommufd.c
> @@ -0,0 +1,511 @@
> +/*
> + * iommufd container backend
> + *
> + * Copyright (C) 2023 Intel Corporation.
> + * Copyright Red Hat, Inc. 2023
> + *
> + * Authors: Yi Liu 
> + *  Eric Auger 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> +
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> +
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see .
> + */
> +
> +#include "qemu/osdep.h"
> +#include 
> +#include 
> +#include 
> +
> +#include "hw/vfio/vfio-common.h"
> +#include "qemu/error-report.h"
> +#include "trace.h"
> +#include "qapi/error.h"
> +#include "sysemu/iommufd.h"
> +#include "hw/qdev-core.h"
> +#include "sysemu/reset.h"
> +#include "qemu/cutils.h"
> +#include "qemu/char_dev.h"
> +
> +static int iommufd_map(VFIOContainer *bcontainer, hwaddr iova,
> +   ram_addr_t size, void *vaddr, bool readonly)
> +{
> +VFIOIOMMUFDContainer *container =
> +container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);
> +
> +return iommufd_backend_map_dma(container->be,
> +   container->ioas_id,
> +   iova, size, vaddr, readonly);
> +}
> +
> +static int iommufd_unmap(VFIOContainer *bcontainer,
> + hwaddr iova, ram_addr_t size,
> + IOMMUTLBEntry *iotlb)
> +{
> +VFIOIOMMUFDContainer *container =
> +container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);
> +
> +/* TODO: Handle dma_unmap_bitmap with iotlb args (migration) */
> +return iommufd_backend_unmap_dma(container->be,
> + container->ioas_id, iova, size);
> +}
> +
> +static int vfio_get_devicefd(const char *sysfs_path, Error **errp)
> +{
> +long int ret = -ENOTTY;
> +char *path, *vfio_dev_path = NULL, *vfio_path = NULL;
> +DIR *dir = NULL;
> +struct dirent *dent;
> +gchar *contents;
> +struct stat st;
> +gsize length;
> +int major, minor;
> +dev_t vfio_devt;
> +
> +path = g_strdup_printf("%s/vfio-dev", sysfs_path);
> +if (stat(path, &st) < 0) {
> +error_setg_errno(errp, errno, "no such host device");
> +goto out_free_path;
> +}
> +
> +dir = opendir(path);
> +if (!dir) {
> +error_setg_errno(errp, errno, "couldn't open dirrectory %s", path);
> +goto out_free_path;
> +}
> +
> +while ((dent = readd

Re: Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)

2023-07-14 Thread Daniel Henrique Barboza





On 7/14/23 00:12, Alistair Francis wrote:

On Fri, Jul 14, 2023 at 11:14 AM Daniel Henrique Barboza
 wrote:




On 7/13/23 19:47, Conor Dooley wrote:

On Thu, Jul 13, 2023 at 07:35:01PM -0300, Daniel Henrique Barboza wrote:

On 7/13/23 19:12, Conor Dooley wrote:



And a question for you below Daniel.

On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:






qemu-system-riscv64: warning: disabling zca extension for hart 
0x because privilege spec version does not match
qemu-system-riscv64: warning: disabling zca extension for hart 
0x0001 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zcd extension for hart 
0x0001 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zca extension for hart 
0x0002 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zcd extension for hart 
0x0002 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zca extension for hart 
0x0003 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zcd extension for hart 
0x0003 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zca extension for hart 
0x0004 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zcd extension for hart 
0x0004 because privilege spec version does not match


Why am I seeing these warnings? Does the mpfs machine type need to
disable some things? It only supports rv64imafdc per the DT, and
predates things like Zca existing, so emitting warnings does not seem
fair at all to me!


QEMU will disable extensions that are newer than a priv spec version that is set
by the CPU. IIUC the icicle board is running a sifive_u54 CPU by default. That
CPU has a priv spec version 1_10_0. The CPU is also enabling C.

We will enable zca if C is enabled. C and D enabled will also enable zcd. But
then the priv check will disable both because zca and zcd have priv spec 1_12_0.

This is a side effect of a change that I made a few months ago. Back then we
weren't disabling stuff correctly.


Yah, I did check out the blame, hence directing it at you. Thanks for
the explanation.


The warnings are annoying but are benign.


To be honest, benign or not, this kind of thing is only going to
lead to grief. Even though only the direct kernel boot works, we do
actually have some customers that are using the icicle target in QEMU.


And apparently the sifive_u54 CPU has been inconsistent for some time and
we only noticed just now.
Now, if the icicle board is supposed to have zca and zcd then we have a problem.


I don't know, this depends on how you see things in QEMU. I would say
that it supports c, and not Zca/Zcf/Zcd, given it predates the
extensions. I have no interest in retrofitting my devicetree stuff with
them, for example.


We'll need to discuss whether we move sifive_u54 CPU priv spec to 1_12_0 (I'm not
sure how this will affect other boards that use this CPU) or remove this priv spec
disable code altogether from QEMU.


I think you should stop warning for this? From my dumb-user perspective,
the warning only "scares" me into thinking something is wrong, when
there isn't. I can see a use case for the warning where someone tries to
enable Zca & Co. in their QEMU incantation for a CPU that does not
have the correct privilege level to support it, but I didn't try to set
any options at all in that way, so the warnings seem unfair?



That's a fair criticism. We had similar discussions a few months back. It's weird
to send warnings when the user didn't set the extensions manually, but ATM we
can't tell whether an extension was user enabled or not.

So we can either show unfair warning messages or not show warnings and take the risk
of silently disabling extensions that users enabled in the command line. It seems
that the former is more annoying to deal with than the latter.

I guess I can propose a patch to remove the warnings. We can send warnings again
when we have a better solution.


A better solution is to just not enable Zca and friends automatically,
or at least look at the priv spec before we do


Good idea. In fact we should do that for all extensions that we're enabling
automatically.
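
A minimal sketch of the idea discussed above, under stated assumptions: the enum, the flag names and `implied_exts` are hypothetical and do not mirror QEMU's RISC-V code. It only shows checking the CPU's priv spec before auto-enabling implied extensions, so that nothing has to be disabled (and warned about) afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical priv spec ordering, oldest first. */
enum priv_ver { PRIV_1_10, PRIV_1_11, PRIV_1_12 };

/* Hypothetical bit flags for the implied extensions. */
enum { IMPLIED_ZCA = 1 << 0, IMPLIED_ZCD = 1 << 1 };

/* Return the set of extensions to auto-enable. Per the thread, zca
 * and zcd belong to priv spec 1_12_0, so an older CPU gets none. */
unsigned implied_exts(enum priv_ver priv, bool has_c, bool has_d)
{
    unsigned mask = 0;

    if (priv < PRIV_1_12) {
        return 0; /* nothing enabled, so nothing to disable later */
    }
    if (has_c) {
        mask |= IMPLIED_ZCA;
    }
    if (has_c && has_d) {
        mask |= IMPLIED_ZCD;
    }
    return mask;
}
```

With this ordering a CPU at priv spec 1_10_0 that enables C and D would simply never get zca/zcd, and a warning could be reserved for extensions the user requested explicitly.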


I'll work something out. Thanks,


Daniel



Alistair




Daniel




Cheers,
Conor.

RE: [RFC PATCH v4 20/24] vfio/iommufd: Implement the iommufd backend

2023-07-14 Thread Duan, Zhenzhong

>-----Original Message-----
>From: Joel Granados 
>Sent: Friday, July 14, 2023 5:23 PM
>Subject: Re: [RFC PATCH v4 20/24] vfio/iommufd: Implement the iommufd
>backend
>
>On Wed, Jul 12, 2023 at 03:25:24PM +0800, Zhenzhong Duan wrote:
>> From: Yi Liu 
...
>> +static int vfio_get_devicefd(const char *sysfs_path, Error **errp)
>> +{
>> +long int ret = -ENOTTY;
>> +char *path, *vfio_dev_path = NULL, *vfio_path = NULL;
>> +DIR *dir = NULL;
>> +struct dirent *dent;
>> +gchar *contents;
>> +struct stat st;
>> +gsize length;
>> +int major, minor;
>> +dev_t vfio_devt;
>> +
>> +path = g_strdup_printf("%s/vfio-dev", sysfs_path);
>> +if (stat(path, &st) < 0) {
>> +error_setg_errno(errp, errno, "no such host device");
>> +goto out_free_path;
>> +}
>> +
>> +dir = opendir(path);
>> +if (!dir) {
>> +error_setg_errno(errp, errno, "couldn't open dirrectory %s", path);
>> +goto out_free_path;
>> +}
>> +
>> +while ((dent = readdir(dir))) {
>> +if (!strncmp(dent->d_name, "vfio", 4)) {
>> +vfio_dev_path = g_strdup_printf("%s/%s/dev", path, 
>> dent->d_name);
>> +break;
>> +}
>> +}
>> +
>> +if (!vfio_dev_path) {
>> +error_setg(errp, "failed to find vfio-dev/vfioX/dev");
>> +goto out_free_path;
>> +}
>> +
>> +if (!g_file_get_contents(vfio_dev_path, &contents, &length, NULL)) {
>> +error_setg(errp, "failed to load \"%s\"", vfio_dev_path);
>> +goto out_free_dev_path;
>> +}
>> +
>> +if (sscanf(contents, "%d:%d", &major, &minor) != 2) {
>> +error_setg(errp, "failed to get major:mino for \"%s\"", 
>> vfio_dev_path);
>Very small nit: Should be "minor" here.
Good catch, will fix.

Thanks
Zhenzhong

Re: x86 custom apicid assignments [Was: Re: [PATCH v7 0/2] Remove EPYC mode apicid decode and use generic decode]

2023-07-14 Thread Igor Mammedov

On Wed, 5 Jul 2023 10:12:40 +0200
Claudio Fontana  wrote:

> Hi all, partially resurrecting an old thread.
> 
> I've seen how for Epyc something special was done in the past in terms of 
> apicid assignments based on topology, which was then reverted apparently,
> but I wonder if something more general would be useful to all?
> 
> The QEMU apicid assignments first of all do not seem to match what is 
> happening on real hardware.

QEMU typically does generate valid APIC IDs;
however, it doesn't do a good job with an odd number of cores and/or in NUMA-enabled cases.
(That is what Babu attempted to fix, but eventually that was dropped for
the reasons described in the quoted cover letter.)
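
To make the odd-core-count remark concrete, here is a hedged sketch of a generic scheme in which each topology level gets its own power-of-two-wide bit field. This is an assumption for illustration only: it is neither QEMU's implementation nor the EPYC-specific encoding, and `field_width` and `apicid_sketch` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Bits needed to number `count` items, i.e. ceil(log2(count)). */
unsigned field_width(unsigned count)
{
    unsigned w = 0;

    assert(count > 0);
    while ((1u << w) < count) {
        w++;
    }
    return w;
}

/* Pack (pkg, core, thread) into one ID, thread in the lowest bits,
 * then core, then package. `cores` and `threads` are the per-package
 * and per-core counts. */
uint32_t apicid_sketch(unsigned cores, unsigned threads,
                       unsigned pkg, unsigned core, unsigned thread)
{
    unsigned tw = field_width(threads);
    unsigned cw = field_width(cores);

    assert(core < cores && thread < threads);
    return (pkg << (cw + tw)) | (core << tw) | thread;
}
```

With 3 cores per package and 1 thread per core, the core field is 2 bits wide, so package 0 uses IDs 0 to 2 and package 1 starts at 4; ID 3 is never assigned. Gaps like this are one way a non-power-of-two core count makes the ID space sparse.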

> Functionally things are ok, but then when trying to investigate issues, 
> specifically in the guest kernel KVM PV code (arch/x86/kernel/kvm.c),
> in some cases the actual apicid values in relationship to the topology do 
> matter,

Care to point out specific places you are referring to?

KVM is not the only place where it might matter; it affects topo/numa code on
the guest side as well.

> and currently there is no way (I know of), of supplying our own apicid 
> assignment, more closely matching what happens on hardware.
> 
> This has been an issue when debugging guest images in the cloud, where being 
> able to reproduce issues locally would be very beneficial as opposed to using 
> cloud images as the feedback loop,
> but unfortunately QEMU cannot currently create the right apicid values to 
> associate to the cpus.

Indeed, the EPYC APIC encoding mess increases the support case load downstream,
but as long as one has access to similar host hw, one should be able
to reproduce the issue locally.
However, I would expect the end result of such a support case to be advice
to change the topology or use another CPU model.

(What we lack is documentation of what works and what doesn't;
perhaps writing guidelines would be sufficient to steer users
to the usable EPYC configurations.)

> Do I understand the issue correctly, comments, ideas?
> How receptive the project would be for changes aimed at providing a custom 
> assignment of apicids to cpus, regardless of Intel or AMD?

It's not that simple to just set a custom APIC ID in a register and be done with it;
you'll likely break things (off the top of my head: some CPUID leaves might
depend on it, ACPI tables, NUMA mapping, KVM's vcpu_id).

Current topo code aims to work on information based on '-smp'/'-numa',
throughout the QEMU codebase.
If however we were to let the user set the APIC ID (which is somehow
correct), we would need to take reverse steps to decode that
(in a vendor-specific way) and incorporate the resulting topo into other
code that uses topology info.
That makes it quite messy, not to mention it's x86 (AMD) specific and
doesn't fit well with generalizing topo handling.
So I don't really like this route.

(x86 cpus have an apic_id property, so theoretically you can set it
and with some minimal hacking launch a guest, but then
expect the guest to be unhappy when the ACPI ID goes out of sync with
everything else. I would do that only for the sake of an experiment
and wouldn't try to upstream that.)

What I wouldn't mind is taking a 2nd stab at what Babu had tried to
do, provided it manages to encode the APIC ID for EPYC correctly and doesn't
complicate the code much (while still using -smp/-numa as the root source for
topo configuration).

> Thanks,
> 
> Claudio
> 
> 
> 
> On 9/1/20 17:57, Babu Moger wrote:
> > To support some of the complex topology, we introduced EPYC mode apicid 
> > decode.
> > But, EPYC mode decode is running into problems. Also it can become quite a
> > maintenance problem in the future. So, it was decided to remove that code 
> > and
> > use the generic decode which works for majority of the topology. Most of the
> > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > it will create some sub-optimal configuration.
> > 
> > Here is the discussion thread.
> > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > https://lore.kernel.org/qemu-devel/20200826143849.59f69...@redhat.com/
> > 
> > This series removes all the EPYC mode specific apicid changes and use the 
> > generic
> > apicid decode.
> > ---
> > v7:
> >  Eduardo has already queued 1-8 from the v6. Sending rest of the patches.
> >  Fixed CPUID 80ld based on Igor's comment and few text changes.
> >  
> > v6:
> >  
> > https://lore.kernel.org/qemu-devel/159889924378.21294.16494070903874534542.st...@naples-babu.amd.com/
> >  Found out that numa configuration is not mandatory for all the EPYC model 
> > topology.
> >  We can use the generic decode which works pretty well. Also noticed that
> >  cpuid does not changes when the numa nodes change(NPS- Nodes per socket).
> >  Took care of couple comments from Igor and Eduardo.
> >  Thank you Igor, Daniel, David, Eduardo for your feedback.  
> > 
> > v5:
> >  
> > https://lore.kernel.org/qemu-devel/159804762216.39954.15502128500494116468.st...@naples-ba

Re: [PATCH v1 13/15] virtio-mem: Expose device memory via multiple memslots if enabled

2023-07-14 Thread David Hildenbrand


On 13.07.23 21:58, Maciej S. Szmigiero wrote:

On 16.06.2023 11:26, David Hildenbrand wrote:

Having large virtio-mem devices that only expose little memory to a VM
is currently a problem: we map the whole sparse memory region into the
guest using a single memslot, resulting in one gigantic memslot in KVM.
KVM allocates metadata for the whole memslot, which can result in quite
some memory waste.

Assuming we have a 1 TiB virtio-mem device and only expose little (e.g.,
1 GiB) memory, we would create a single 1 TiB memslot and KVM has to
allocate metadata for that 1 TiB memslot: on x86, this implies allocating
a significant amount of memory for metadata:

(1) RMAP: 8 bytes per 4 KiB, 8 bytes per 2 MiB, 8 bytes per 1 GiB
  -> For 1 TiB: 2147483648 + 4194304 + 8192 = ~ 2 GiB (0.2 %)

  With the TDP MMU (cat /sys/module/kvm/parameters/tdp_mmu) this gets
  allocated lazily when required for nested VMs
(2) gfn_track: 2 bytes per 4 KiB
  -> For 1 TiB: 536870912 = ~512 MiB (0.05 %)
(3) lpage_info: 4 bytes per 2 MiB, 4 bytes per 1 GiB
  -> For 1 TiB: 2097152 + 4096 = ~2 MiB (0.0002 %)
(4) 2x dirty bitmaps for tracking: 2x 1 bit per 4 KiB page
  -> For 1 TiB: 536870912 = 64 MiB (0.006 %)

So we primarily care about (1) and (2). The bad thing is, that the
memory consumption *doubles* once SMM is enabled, because we create the
memslot once for !SMM and once for SMM.

Having a 1 TiB memslot without the TDP MMU consumes around:
* With SMM: 5 GiB
* Without SMM: 2.5 GiB
Having a 1 TiB memslot with the TDP MMU consumes around:
* With SMM: 1 GiB
* Without SMM: 512 MiB
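
The figures above can be reproduced with a few lines of arithmetic. The
following is only a sketch built from the per-page costs listed in (1)-(4);
the struct and function names are hypothetical, and treating the RMAP as
absent with the TDP MMU is a simplification (the text above says it is
allocated lazily for nested VMs).

```c
#include <stdint.h>

#define SZ_4K (4ULL << 10)
#define SZ_2M (2ULL << 20)
#define SZ_1G (1ULL << 30)

typedef struct {
    uint64_t rmap;          /* 8 bytes per 4 KiB, 2 MiB and 1 GiB page */
    uint64_t gfn_track;     /* 2 bytes per 4 KiB page */
    uint64_t lpage_info;    /* 4 bytes per 2 MiB and 1 GiB page */
    uint64_t dirty_bitmaps; /* 2 bitmaps, 1 bit per 4 KiB page each */
} MemslotMetadata;

/* Per-structure metadata cost (in bytes) of one memslot of 'slot_bytes'. */
MemslotMetadata memslot_metadata(uint64_t slot_bytes)
{
    const uint64_t n4k = slot_bytes / SZ_4K;
    const uint64_t n2m = slot_bytes / SZ_2M;
    const uint64_t n1g = slot_bytes / SZ_1G;
    MemslotMetadata m;

    m.rmap = 8 * (n4k + n2m + n1g);
    m.gfn_track = 2 * n4k;
    m.lpage_info = 4 * (n2m + n1g);
    m.dirty_bitmaps = 2 * n4k / 8;
    return m;
}

/* Total in bytes; SMM doubles it because the memslot exists twice. */
uint64_t memslot_metadata_total(uint64_t slot_bytes, int tdp_mmu, int smm)
{
    const MemslotMetadata m = memslot_metadata(slot_bytes);
    uint64_t total = m.gfn_track + m.lpage_info + m.dirty_bitmaps;

    if (!tdp_mmu) {
        total += m.rmap;    /* simplification: no RMAP with the TDP MMU */
    }
    return smm ? 2 * total : total;
}
```

For a 1 TiB memslot this gives roughly 2.6 GiB without the TDP MMU and
roughly 0.56 GiB with it (per address space), in line with the rounded
numbers quoted above.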

... and that's really something we want to optimize, to be able to just
start a VM with small boot memory (e.g., 4 GiB) and a virtio-mem device
that can grow very large (e.g., 1 TiB).

Consequently, using multiple memslots and only mapping the memslots we
really need can significantly reduce memory waste and speed up
memslot-related operations. Let's expose the sparse RAM memory region using
multiple memslots, mapping only the memslots we currently need into our
device memory region container.

* With VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we only map the memslots that
actually have memory plugged, and dynamically (un)map when
(un)plugging memory blocks.

* Without VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we always map the memslots
covered by the usable region, and dynamically (un)map when resizing the
usable region.

We'll auto-determine the number of memslots to use based on the suggested
memslot limit provided by the core. We'll use at most 1 memslot per
gigabyte. Note that our global limit of memslots across all memory devices
is currently set to 256: even with multiple large virtio-mem devices, we'd
still have a sane limit on the number of memslots used.
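
As a rough illustration of the "at most 1 memslot per gigabyte, capped by
the suggested limit" rule, the sketch below derives a memslot count and size
from a region size. It is an assumption about how such a layout could be
computed, not the code from this patch; MemslotLayout, memslot_layout() and
memslot_size_at() are hypothetical names.

```c
#include <stdint.h>

#define MEMSLOT_GIB (1ULL << 30)

typedef struct {
    unsigned int nb;    /* number of memslots */
    uint64_t size;      /* size of each memslot except possibly the last */
} MemslotLayout;

/* Pick at most one memslot per GiB, capped by 'limit' (0 means 1). */
MemslotLayout memslot_layout(uint64_t region_size, unsigned int limit)
{
    MemslotLayout l = { 0, 0 };
    uint64_t want, size;

    if (region_size == 0) {
        return l;
    }
    want = (region_size + MEMSLOT_GIB - 1) / MEMSLOT_GIB;
    if (want > limit) {
        want = limit;
    }
    if (want == 0) {
        want = 1;
    }
    /* Round the per-memslot size up to a multiple of 1 GiB. */
    size = (region_size + want - 1) / want;
    l.size = (size + MEMSLOT_GIB - 1) / MEMSLOT_GIB * MEMSLOT_GIB;
    /* Rounding up may leave fewer memslots than initially wanted. */
    l.nb = (unsigned int)((region_size + l.size - 1) / l.size);
    return l;
}

/* Size of memslot 'idx'; the last memslot might be smaller. */
uint64_t memslot_size_at(uint64_t region_size, MemslotLayout l,
                         unsigned int idx)
{
    const uint64_t offset = (uint64_t)idx * l.size;
    const uint64_t left = offset < region_size ? region_size - offset : 0;

    return left < l.size ? left : l.size;
}
```

With a 1 TiB device and a limit of 256 this yields 256 memslots of 4 GiB
each; a 5 GiB device with a limit of 4 ends up with memslots of 2, 2 and
1 GiB.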

The default is a single memslot for now ("multiple-memslots=off"). The
optimization must be enabled manually using "multiple-memslots=on", because
some vhost setups (e.g., hotplug of vhost-user devices) might be
problematic until we support more memslots especially in vhost-user
backends.

Note that "multiple-memslots=on" is just a hint that multiple memslots
*may* be used for internal optimizations, not that multiple memslots
*must* be used. The actual number of memslots that are used is an
internal detail: for example, once memslot metadata is no longer an
issue, we could simply stop optimizing for that. Migration source and
destination can differ on the setting of "multiple-memslots".

Signed-off-by: David Hildenbrand 
---
   hw/virtio/virtio-mem-pci.c |  21 +++
   hw/virtio/virtio-mem.c | 265 -
   include/hw/virtio/virtio-mem.h |  23 ++-
   3 files changed, 304 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/virtio-mem-pci.c b/hw/virtio/virtio-mem-pci.c
index b85c12668d..8b403e7e78 100644
--- a/hw/virtio/virtio-mem-pci.c
+++ b/hw/virtio/virtio-mem-pci.c

(...)

@@ -790,6 +921,43 @@ static void virtio_mem_system_reset(void *opaque)
   virtio_mem_unplug_all(vmem);
   }
   
+static void virtio_mem_prepare_mr(VirtIOMEM *vmem)
+{
+const uint64_t region_size = memory_region_size(&vmem->memdev->mr);
+
+g_assert(!vmem->mr);
+vmem->mr = g_new0(MemoryRegion, 1);
+memory_region_init(vmem->mr, OBJECT(vmem), "virtio-mem",
+   region_size);
+vmem->mr->align = memory_region_get_alignment(&vmem->memdev->mr);
+}
+
+static void virtio_mem_prepare_memslots(VirtIOMEM *vmem)
+{
+const uint64_t region_size = memory_region_size(&vmem->memdev->mr);
+unsigned int idx;
+
+g_assert(!vmem->memslots && vmem->nb_memslots);
+vmem->memslots = g_new0(MemoryRegion, vmem->nb_memslots);
+
+/* Initialize our memslots, but don't map them yet. */
+for (idx = 0; idx < vmem->nb_memslots; idx++) {
+const uint64_t memslot_offset = idx * vmem->memslot_size;
+uint64_t memslot_size = vmem->memslot_size;
+char name[20];
+
+/* The size of the last memslot might be

Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping

2023-07-14 Thread Peter Maydell

On Thu, 13 Jul 2023 at 19:43, Stefan Berger  wrote:
>
>
>
> On 7/13/23 13:18, Peter Maydell wrote:
> > On Thu, 13 Jul 2023 at 18:16, Stefan Berger  wrote:
> >> I guess the first point would be to decide whether to support an i2c bus 
> >> on the virt board and then whether we can use the aspeed bus that we know 
> >> that the tpm_tis_i2c device model works with but we don't know how Windows 
> >> may react to it.
> >>
> >> It seems sysbus is already supported there so ... we may have a 'match'?
> >
> > You can use sysbus devices anywhere -- they're just
>
> 'anywhere' also includes aarch64 virt board I suppose.

Yes. Literally any machine can have memory mapped devices.

> > "this is a memory mapped device". The question is whether
> > we should, or whether an i2c controller is more like
> > what the real world uses (and if so, what i2c controller).
> >
>
> > I don't want to accept changes to the virt board that are
> > hard to live with in future, because changing virt in
> > non-backward compatible ways is painful.
>
> Once we have the CRB sysbus device we would keep it around forever and it 
> seems to
> - not require any changes to the virt board (iiuc) since sysbus is already 
> being used
> - works already with Windows and probably also Linux

"Add a sysbus device to the virt board" is the kind of
change I mean -- once you do that it's hard to take it
out again, and if we decide in 6 months time that actually
i2c would be the better option then we end up with two
different ways to do the same thing and trying to
deprecate the other one is a pain.

-- PMM

Re: Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)

2023-07-14 Thread Conor Dooley

On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:

> > > OpenSBI v1.3
> > >_  _
> > >   / __ \  / |  _ \_   _|
> > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > >  | |__| | |_) |  __/ | | |) | |_) || |_
> > >   \/| .__/ \___|_| |_|_/|___/_|
> > > | |
> > > |_|
> > >
> > > init_coldboot: ipi init failed (error -1009)
> > >
> > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > and compiles only a significantly cut down number of files from it, we
> > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > not tested v1.3, nor do we have any immediate plans to change our
> > > platform firmware to vendor v1.3 either.
> > >
> > > Unless there's something obvious to you, it sounds like I will need to
> > > go and bisect OpenSBI. That's a job for another day though, given the
> > > time.
> > >
> 
> The real issue is some CPU/HART DT nodes marked as disabled in the
> DT passed to OpenSBI 1.3.
> 
> This issue does not exist in any of the DTs generated by QEMU but some
> of the DTs in the kernel (such as microchip and SiFive board DTs) have
> the E-core disabled.
> 
> I had discovered this issue in a totally different context after the OpenSBI 
> 1.3
> release happened. This issue is already fixed in the latest OpenSBI by the
> following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> Fix sbi_hartid_to_scratch() usage in ACLINT drivers").

Great, thanks Anup! I thought I had tested tip-of-tree too, but
obviously not.

> I always assumed that Microchip hss.bin is the preferred BIOS for the
> QEMU microchip-icicle-kit machine but I guess that's not true.

Unfortunately the HSS has not worked in QEMU for a long time, and while
I would love to fix it, I am pretty stretched for spare time to begin
with.
I usually just do direct kernel boots, which use the OpenSBI that comes
with QEMU, as I am sure you already know :)

> At this point, you can either:
> 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine

> 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> microchip-icicle-kit machine with OpenSBI 1.3

Will OpenSBI disable it? If not, I think option 2) needs to be remove
the DT node. I'll just use tip-of-tree myself & up to the 



Re: [ping] [PATCH 1/1] qemu-nbd: fix regression with qemu-nbd --fork run over ssh

2023-07-14 Thread Denis V. Lunev


On 7/6/23 21:15, Denis V. Lunev wrote:

Commit e6df58a5578fee7a50bbf36f4a50a2781cff855d
 Author: Hanna Reitz 
 Date:   Wed May 8 23:18:18 2019 +0200
 qemu-nbd: Do not close stderr
has introduced an interesting regression. Original behavior of
 ssh somehost qemu-nbd /home/den/tmp/file -f raw --fork
was the following:
  * qemu-nbd was started as a daemon
  * the command execution is done and ssh exited with success

The patch has changed this behavior and 'ssh' command now hangs forever.

According to the normal specification of the daemon() call, we should
end up with STDERR pointing to /dev/null. That should be done at the
very end of the successful startup sequence when the pipe to the
bootstrap process (used for diagnostics) is no longer needed.

This could be achieved in the same way as done for 'qemu-nbd -c' case.
STDOUT copying to STDERR does the trick.

This also leads to proper 'ssh' connection closing which fixes my
original problem.
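
The mechanism can be sketched in isolation with plain POSIX calls. This is
a minimal illustration, not the qemu-nbd code: redirect_fd() and
detach_stderr() are hypothetical helpers, and the sketch assumes stdout
already points at /dev/null, as it would after a daemon()-style detach.

```c
#include <unistd.h>

/*
 * Make 'dst_fd' refer to the same open file as 'src_fd'.
 * Whatever 'dst_fd' referred to before is closed by dup2() itself.
 */
int redirect_fd(int src_fd, int dst_fd)
{
    return dup2(src_fd, dst_fd) == dst_fd ? 0 : -1;
}

/*
 * Assumes stdout already points at /dev/null. Copying it over stderr
 * drops this process's last reference to the diagnostics pipe.
 */
int detach_stderr(void)
{
    return redirect_fd(STDOUT_FILENO, STDERR_FILENO);
}
```

Once no descriptor inherited from the ssh session remains open in the
daemon, ssh presumably has nothing left to wait for, which would match the
behaviour described above.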

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Vladimir Sementsov-Ogievskiy 
CC: Hanna Reitz 
---
  qemu-nbd.c | 9 +
  1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 4276163564..e9e118dfdb 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -575,7 +575,6 @@ int main(int argc, char **argv)
  bool writethrough = false; /* Client will flush as needed. */
  bool fork_process = false;
  bool list = false;
-int old_stderr = -1;
  unsigned socket_activation;
  const char *pid_file_name = NULL;
  const char *selinux_label = NULL;
@@ -930,11 +929,6 @@ int main(int argc, char **argv)
  } else if (pid == 0) {
  close(stderr_fd[0]);
  
-/* Remember parent's stderr if we will be restoring it. */
-if (fork_process) {
-old_stderr = dup(STDERR_FILENO);
-}
-
  ret = qemu_daemon(1, 0);
  
  /* Temporarily redirect stderr to the parent's pipe...  */

@@ -1152,8 +1146,7 @@ int main(int argc, char **argv)
  }
  
  if (fork_process) {
-dup2(old_stderr, STDERR_FILENO);
-close(old_stderr);
+dup2(STDOUT_FILENO, STDERR_FILENO);
  }
  
  state = RUNNING;

ping

[PATCH v3] hw/mips: Improve the default USB settings in the loongson3-virt machine

2023-07-14 Thread Thomas Huth

It's possible to compile QEMU without the USB devices (e.g. when using
"--without-default-devices" as option for the "configure" script).
To be still able to run the loongson3-virt machine in default mode with
such a QEMU binary, we have to check here for the availability of the
OHCI controller first before instantiating the USB devices.

Signed-off-by: Thomas Huth 
---
 v3: Back to runtime detection, but more simple this time compared to v1
 (checking for OHCI should be enough, since this implies CONFIG_USB
  which is the switch that usb-kbd and usb-tablet are depending on)

 hw/mips/loongson3_virt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/mips/loongson3_virt.c b/hw/mips/loongson3_virt.c
index 4018b8c1d3..3ad0a223df 100644
--- a/hw/mips/loongson3_virt.c
+++ b/hw/mips/loongson3_virt.c
@@ -447,7 +447,7 @@ static inline void loongson3_virt_devices_init(MachineState *machine,
 
 pci_vga_init(pci_bus);
 
-if (defaults_enabled()) {
+if (defaults_enabled() && object_class_by_name("pci-ohci")) {
 pci_create_simple(pci_bus, -1, "pci-ohci");
 usb_create_simple(usb_bus_find(-1), "usb-kbd");
 usb_create_simple(usb_bus_find(-1), "usb-tablet");
-- 
2.39.3

[PULL 0/5] Patches for QEMU 8.1 hard freeze

2023-07-14 Thread Paolo Bonzini

The following changes since commit 3dd9e54703e6ae4f9ab3767f5cecc99edf08:

  Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2023-07-12 20:46:10 +0100)

are available in the Git repository at:

  https://gitlab.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 2eb5599e8a73e70a9e86a97120818ff95a43a23a:

  scsi: clear unit attention only for REPORT LUNS commands (2023-07-14 11:10:58 +0200)


* SCSI unit attention fix
* add PCIe devices to s390x emulator
* IDE unplug fix for Xen


Cédric Le Goater (1):
  kconfig: Add PCIe devices to s390x machines

Olaf Hering (1):
  hw/ide/piix: properly initialize the BMIBA register

Stefano Garzarella (3):
  scsi: fetch unit attention when creating the request
  scsi: cleanup scsi_clear_unit_attention()
  scsi: clear unit attention only for REPORT LUNS commands

 configs/devices/s390x-softmmu/default.mak |  1 +
 hw/ide/piix.c |  2 +-
 hw/net/Kconfig|  4 +-
 hw/pci/Kconfig|  3 ++
 hw/s390x/Kconfig  |  3 +-
 hw/scsi/scsi-bus.c| 82 ---
 hw/usb/Kconfig|  2 +-
 include/hw/scsi/scsi.h|  1 +
 8 files changed, 53 insertions(+), 45 deletions(-)
-- 
2.41.0

[PULL 1/5] hw/ide/piix: properly initialize the BMIBA register

2023-07-14 Thread Paolo Bonzini

From: Olaf Hering 

According to the 82371FB documentation (82371FB.pdf, 2.3.9. BMIBA-BUS
MASTER INTERFACE BASE ADDRESS REGISTER, April 1997), the register is
32bit wide. To properly reset it to default values, all 32bit need to be
cleared. Bit #0 "Resource Type Indicator (RTE)" needs to be enabled.

The initial change wrote just the lower 8 bit, leaving parts of the "Bus
Master Interface Base Address" address at bit 15:4 unchanged.
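
The difference between the two writes can be shown on a plain byte array
standing in for PCI config space (which is little-endian). This is only a
sketch with hypothetical helper names (cfg_set_byte(), cfg_set_long(),
cfg_get_long()), not the QEMU pci_set_*() helpers themselves.

```c
#include <stdint.h>

/* Write only the lowest byte of a register; upper bytes stay as they were. */
void cfg_set_byte(uint8_t *cfg, uint8_t val)
{
    cfg[0] = val;
}

/* Write all four bytes of a 32-bit little-endian register. */
void cfg_set_long(uint8_t *cfg, uint32_t val)
{
    cfg[0] = (uint8_t)(val & 0xff);
    cfg[1] = (uint8_t)((val >> 8) & 0xff);
    cfg[2] = (uint8_t)((val >> 16) & 0xff);
    cfg[3] = (uint8_t)((val >> 24) & 0xff);
}

/* Read back a 32-bit little-endian register. */
uint32_t cfg_get_long(const uint8_t *cfg)
{
    return (uint32_t)cfg[0] |
           ((uint32_t)cfg[1] << 8) |
           ((uint32_t)cfg[2] << 16) |
           ((uint32_t)cfg[3] << 24);
}
```

If the guest had programmed a base address such as 0xc041 (an arbitrary
example value), a byte-wide reset leaves 0xc001 behind, while the 32-bit
write yields the intended 0x1.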

Fixes: e6a71ae327 ("Add support for 82371FB (Step A1) and Improved support for 82371SB (Function 1)")

Signed-off-by: Olaf Hering 
Reviewed-by: Bernhard Beschow 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20230712074721.14728-1-o...@aepfle.de>
Signed-off-by: Paolo Bonzini 
---
 hw/ide/piix.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index 151f206046e..4e5e12935f5 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -117,7 +117,7 @@ static void piix_ide_reset(DeviceState *dev)
 pci_set_word(pci_conf + PCI_COMMAND, 0x);
 pci_set_word(pci_conf + PCI_STATUS,
  PCI_STATUS_DEVSEL_MEDIUM | PCI_STATUS_FAST_BACK);
-pci_set_byte(pci_conf + 0x20, 0x01);  /* BMIBA: 20-23h */
+pci_set_long(pci_conf + 0x20, 0x1);  /* BMIBA: 20-23h */
 }
 
 static bool pci_piix_init_bus(PCIIDEState *d, unsigned i, Error **errp)
-- 
2.41.0

[PULL 2/5] kconfig: Add PCIe devices to s390x machines

2023-07-14 Thread Paolo Bonzini

From: Cédric Le Goater 

It is useful to extend the number of available PCIe devices to KVM guests
for passthrough scenarios and also to expose these models to a different
(big endian) architecture. Introduce a new config PCIE_DEVICES to select
models, Intel Ethernet adapters and one USB controller. These devices all
support MSI-X which is a requirement on s390x as legacy INTx are not
supported.

Cc: Matthew Rosato 
Cc: Paolo Bonzini 
Cc: Thomas Huth 
Signed-off-by: Cédric Le Goater 
Message-ID: <20230712080146.839113-1-...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 configs/devices/s390x-softmmu/default.mak | 1 +
 hw/net/Kconfig| 4 ++--
 hw/pci/Kconfig| 3 +++
 hw/s390x/Kconfig  | 3 ++-
 hw/usb/Kconfig| 2 +-
 5 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/configs/devices/s390x-softmmu/default.mak b/configs/devices/s390x-softmmu/default.mak
index f2287a133f3..6d87bc8b4b0 100644
--- a/configs/devices/s390x-softmmu/default.mak
+++ b/configs/devices/s390x-softmmu/default.mak
@@ -7,6 +7,7 @@
 #CONFIG_VFIO_CCW=n
 #CONFIG_VIRTIO_PCI=n
 #CONFIG_WDT_DIAG288=n
+#CONFIG_PCIE_DEVICES=n
 
 # Boards:
 #
diff --git a/hw/net/Kconfig b/hw/net/Kconfig
index 98e00be4f93..7fcc0d7faa2 100644
--- a/hw/net/Kconfig
+++ b/hw/net/Kconfig
@@ -41,12 +41,12 @@ config E1000_PCI
 
 config E1000E_PCI_EXPRESS
 bool
-default y if PCI_DEVICES
+default y if PCI_DEVICES || PCIE_DEVICES
 depends on PCI_EXPRESS && MSI_NONBROKEN
 
 config IGB_PCI_EXPRESS
 bool
-default y if PCI_DEVICES
+default y if PCI_DEVICES || PCIE_DEVICES
 depends on PCI_EXPRESS && MSI_NONBROKEN
 
 config RTL8139_PCI
diff --git a/hw/pci/Kconfig b/hw/pci/Kconfig
index 77f8b005ffb..fe70902cd82 100644
--- a/hw/pci/Kconfig
+++ b/hw/pci/Kconfig
@@ -8,6 +8,9 @@ config PCI_EXPRESS
 config PCI_DEVICES
 bool
 
+config PCIE_DEVICES
+bool
+
 config MSI_NONBROKEN
 # selected by interrupt controllers that do not support MSI,
 # or support it and have a good implementation. See commit
diff --git a/hw/s390x/Kconfig b/hw/s390x/Kconfig
index 5e7d8a2bae8..e8d4d68ece0 100644
--- a/hw/s390x/Kconfig
+++ b/hw/s390x/Kconfig
@@ -5,7 +5,8 @@ config S390_CCW_VIRTIO
 imply VFIO_AP
 imply VFIO_CCW
 imply WDT_DIAG288
-select PCI
+imply PCIE_DEVICES
+select PCI_EXPRESS
 select S390_FLIC
 select SCLPCONSOLE
 select VIRTIO_CCW
diff --git a/hw/usb/Kconfig b/hw/usb/Kconfig
index 0ec6def4b8b..0f486764ed6 100644
--- a/hw/usb/Kconfig
+++ b/hw/usb/Kconfig
@@ -36,7 +36,7 @@ config USB_XHCI
 
 config USB_XHCI_PCI
 bool
-default y if PCI_DEVICES
+default y if PCI_DEVICES || PCIE_DEVICES
 depends on PCI
 select USB_XHCI
 
-- 
2.41.0

[PULL 3/5] scsi: fetch unit attention when creating the request

2023-07-14 Thread Paolo Bonzini

From: Stefano Garzarella 

Commit 1880ad4f4e ("virtio-scsi: Batched prepare for cmd reqs") split
calls to scsi_req_new() and scsi_req_enqueue() in the virtio-scsi device.
No ill effects were observed until commit 8cc5583abe ("virtio-scsi: Send
"REPORTED LUNS CHANGED" sense data upon disk hotplug events") added a
unit attention that was easy to trigger with device hotplug and
hot-unplug.

Because the two calls were separated, all requests in the batch were
prepared calling scsi_req_new() to report a sense.  The first one
submitted would report the right sense and reset it to NO_SENSE, while
the others reported CHECK_CONDITION with no sense data.  This caused
SCSI errors in Linux.

To solve this issue, let's fetch the unit attention as early as possible
when we prepare the request, so that only the first request in the batch
will use the unit attention SCSIReqOps and the others will not report
CHECK CONDITION.
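
The batching problem and the fix can be modelled with a toy device that has
one pending unit attention. This is a deliberately simplified sketch, not
the scsi-bus.c code: the types and the req_new_late(), req_submit_late() and
req_new_early() helpers are hypothetical, and the "late" variant folds the
old clearing step into submission.

```c
#include <stdbool.h>

enum { KEY_NO_SENSE = 0x0, KEY_UNIT_ATTENTION = 0x6 };

typedef struct {
    int ua_key;         /* pending unit attention, or KEY_NO_SENSE */
} Device;

typedef struct {
    Device *dev;
    bool reports_ua;    /* request will complete with CHECK CONDITION */
    int sense_key;      /* sense data attached to the request */
} Request;

/* Old flow: creation only picks the handler, sense is read on submit. */
Request req_new_late(Device *d)
{
    Request r = { d, d->ua_key == KEY_UNIT_ATTENTION, KEY_NO_SENSE };
    return r;
}

void req_submit_late(Request *r)
{
    if (r->reports_ua) {
        r->sense_key = r->dev->ua_key;  /* empty for all but the first */
        r->dev->ua_key = KEY_NO_SENSE;
    }
}

/* New flow: fetch and clear the unit attention when creating the request. */
Request req_new_early(Device *d)
{
    Request r = { d, false, KEY_NO_SENSE };

    if (d->ua_key == KEY_UNIT_ATTENTION) {
        r.reports_ua = true;
        r.sense_key = d->ua_key;
        d->ua_key = KEY_NO_SENSE;
    }
    return r;
}
```

Creating two requests before submitting either shows the difference: with
the late variant the second request still reports CHECK CONDITION but has
no sense data, while with the early variant only the first request carries
the unit attention.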

Fixes: 1880ad4f4e ("virtio-scsi: Batched prepare for cmd reqs")
Fixes: 8cc5583abe ("virtio-scsi: Send "REPORTED LUNS CHANGED" sense data upon disk hotplug events")
Reported-by: Thomas Huth 
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2176702
Co-developed-by: Paolo Bonzini 
Signed-off-by: Stefano Garzarella 
Message-ID: <20230712134352.118655-2-sgarz...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 hw/scsi/scsi-bus.c | 40 +++-
 include/hw/scsi/scsi.h |  1 +
 2 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index f80f4cb4fcf..f083373021c 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -412,19 +412,35 @@ static const struct SCSIReqOps reqops_invalid_opcode = {
 
 /* SCSIReqOps implementation for unit attention conditions.  */
 
+static void scsi_fetch_unit_attention_sense(SCSIRequest *req)
+{
+SCSISense *ua = NULL;
+
+if (req->dev->unit_attention.key == UNIT_ATTENTION) {
+ua = &req->dev->unit_attention;
+} else if (req->bus->unit_attention.key == UNIT_ATTENTION) {
+ua = &req->bus->unit_attention;
+}
+
+/*
+ * Fetch the unit attention sense immediately so that another
+ * scsi_req_new does not use reqops_unit_attention.
+ */
+if (ua) {
+scsi_req_build_sense(req, *ua);
+*ua = SENSE_CODE(NO_SENSE);
+}
+}
+
 static int32_t scsi_unit_attention(SCSIRequest *req, uint8_t *buf)
 {
-if (req->dev->unit_attention.key == UNIT_ATTENTION) {
-scsi_req_build_sense(req, req->dev->unit_attention);
-} else if (req->bus->unit_attention.key == UNIT_ATTENTION) {
-scsi_req_build_sense(req, req->bus->unit_attention);
-}
 scsi_req_complete(req, CHECK_CONDITION);
 return 0;
 }
 
 static const struct SCSIReqOps reqops_unit_attention = {
 .size = sizeof(SCSIRequest),
+.init_req = scsi_fetch_unit_attention_sense,
 .send_command = scsi_unit_attention
 };
 
@@ -699,6 +715,11 @@ SCSIRequest *scsi_req_alloc(const SCSIReqOps *reqops, SCSIDevice *d,
 object_ref(OBJECT(d));
 object_ref(OBJECT(qbus->parent));
 notifier_list_init(&req->cancel_notifiers);
+
+if (reqops->init_req) {
+reqops->init_req(req);
+}
+
 trace_scsi_req_alloc(req->dev->id, req->lun, req->tag);
 return req;
 }
@@ -798,6 +819,15 @@ uint8_t *scsi_req_get_buf(SCSIRequest *req)
 static void scsi_clear_unit_attention(SCSIRequest *req)
 {
 SCSISense *ua;
+
+/*
+ * scsi_fetch_unit_attention_sense() already cleaned the unit attention
+ * in this case.
+ */
+if (req->ops == &reqops_unit_attention) {
+return;
+}
+
 if (req->dev->unit_attention.key != UNIT_ATTENTION &&
 req->bus->unit_attention.key != UNIT_ATTENTION) {
 return;
diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h
index e2bb1a2fbfd..3692ca82f31 100644
--- a/include/hw/scsi/scsi.h
+++ b/include/hw/scsi/scsi.h
@@ -108,6 +108,7 @@ int cdrom_read_toc_raw(int nb_sectors, uint8_t *buf, int msf, int session_num);
 /* scsi-bus.c */
 struct SCSIReqOps {
 size_t size;
+void (*init_req)(SCSIRequest *req);
 void (*free_req)(SCSIRequest *req);
 int32_t (*send_command)(SCSIRequest *req, uint8_t *buf);
 void (*read_data)(SCSIRequest *req);
-- 
2.41.0

[PULL 4/5] scsi: cleanup scsi_clear_unit_attention()

2023-07-14 Thread Paolo Bonzini

From: Stefano Garzarella 

The previous commit moved the unit attention clearing to when we create
the request. So now we can clean up scsi_clear_unit_attention() to handle
only the case of the REPORT LUNS command: this is the only case in
which a UNIT ATTENTION is cleared without having been reported.

Suggested-by: Paolo Bonzini 
Signed-off-by: Stefano Garzarella 
Message-ID: <20230712134352.118655-3-sgarz...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 hw/scsi/scsi-bus.c | 28 ++--
 1 file changed, 6 insertions(+), 22 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index f083373021c..f9c95dfb50d 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -828,26 +828,12 @@ static void scsi_clear_unit_attention(SCSIRequest *req)
 return;
 }
 
-if (req->dev->unit_attention.key != UNIT_ATTENTION &&
-req->bus->unit_attention.key != UNIT_ATTENTION) {
-return;
-}
-
-/*
- * If an INQUIRY command enters the enabled command state,
- * the device server shall [not] clear any unit attention condition;
- * See also MMC-6, paragraphs 6.5 and 6.6.2.
- */
-if (req->cmd.buf[0] == INQUIRY ||
-req->cmd.buf[0] == GET_CONFIGURATION ||
-req->cmd.buf[0] == GET_EVENT_STATUS_NOTIFICATION) {
-return;
-}
-
 if (req->dev->unit_attention.key == UNIT_ATTENTION) {
 ua = &req->dev->unit_attention;
-} else {
+} else if (req->bus->unit_attention.key == UNIT_ATTENTION) {
 ua = &req->bus->unit_attention;
+} else {
+return;
 }
 
 /*
@@ -856,12 +842,10 @@ static void scsi_clear_unit_attention(SCSIRequest *req)
  * with an additional sense code of REPORTED LUNS DATA HAS CHANGED.
  */
 if (req->cmd.buf[0] == REPORT_LUNS &&
-!(ua->asc == SENSE_CODE(REPORTED_LUNS_CHANGED).asc &&
-  ua->ascq == SENSE_CODE(REPORTED_LUNS_CHANGED).ascq)) {
-return;
+ua->asc == SENSE_CODE(REPORTED_LUNS_CHANGED).asc &&
+ua->ascq == SENSE_CODE(REPORTED_LUNS_CHANGED).ascq) {
+*ua = SENSE_CODE(NO_SENSE);
 }
-
-*ua = SENSE_CODE(NO_SENSE);
 }
 
 int scsi_req_get_sense(SCSIRequest *req, uint8_t *buf, int len)
-- 
2.41.0

[PULL 5/5] scsi: clear unit attention only for REPORT LUNS commands

2023-07-14 Thread Paolo Bonzini

From: Stefano Garzarella 

scsi_clear_unit_attention() now only handles REPORTED LUNS DATA HAS
CHANGED.

This only happens when we handle REPORT LUNS commands, so let's rename
the function to scsi_clear_reported_luns_changed() and call it only in
scsi_target_emulate_report_luns().
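
The resulting rule is small enough to sketch standalone: pick the pending
unit attention (device first, then bus) and clear it only if it is REPORTED
LUNS DATA HAS CHANGED (ASC 3Fh, ASCQ 0Eh). The Sense type and the
clear_reported_luns_changed() helper below are hypothetical stand-ins for
the QEMU structures, written only to illustrate the condition.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_NO_SENSE        0x00
#define SK_UNIT_ATTENTION  0x06
/* Additional sense code/qualifier: REPORTED LUNS DATA HAS CHANGED */
#define ASC_RLC            0x3f
#define ASCQ_RLC           0x0e

typedef struct {
    uint8_t key, asc, ascq;
} Sense;

/* Returns true if a pending REPORTED LUNS DATA HAS CHANGED was cleared. */
bool clear_reported_luns_changed(Sense *dev_ua, Sense *bus_ua)
{
    Sense *ua;

    if (dev_ua->key == SK_UNIT_ATTENTION) {
        ua = dev_ua;
    } else if (bus_ua->key == SK_UNIT_ATTENTION) {
        ua = bus_ua;
    } else {
        return false;
    }

    if (ua->asc == ASC_RLC && ua->ascq == ASCQ_RLC) {
        *ua = (Sense){ SK_NO_SENSE, 0, 0 };
        return true;
    }
    return false;   /* any other unit attention stays pending */
}
```

A device-level unit attention with a different code (for example a
power-on/reset condition) is left alone and, as in the diff, also shadows a
bus-level one until it has been reported.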

Suggested-by: Paolo Bonzini 
Signed-off-by: Stefano Garzarella 
Message-ID: <20230712134352.118655-4-sgarz...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 hw/scsi/scsi-bus.c | 34 +++---
 1 file changed, 11 insertions(+), 23 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index f9c95dfb50d..fc4b77fdb02 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -22,6 +22,7 @@ static char *scsibus_get_fw_dev_path(DeviceState *dev);
 static void scsi_req_dequeue(SCSIRequest *req);
 static uint8_t *scsi_target_alloc_buf(SCSIRequest *req, size_t len);
 static void scsi_target_free_buf(SCSIRequest *req);
+static void scsi_clear_reported_luns_changed(SCSIRequest *req);
 
 static int next_scsi_bus;
 
@@ -518,6 +519,14 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
 
 /* store the LUN list length */
 stl_be_p(&r->buf[0], len - 8);
+
+/*
+ * If a REPORT LUNS command enters the enabled command state, [...]
+ * the device server shall clear any pending unit attention condition
+ * with an additional sense code of REPORTED LUNS DATA HAS CHANGED.
+ */
+scsi_clear_reported_luns_changed(&r->req);
+
 return true;
 }
 
@@ -816,18 +825,10 @@ uint8_t *scsi_req_get_buf(SCSIRequest *req)
 return req->ops->get_buf(req);
 }
 
-static void scsi_clear_unit_attention(SCSIRequest *req)
+static void scsi_clear_reported_luns_changed(SCSIRequest *req)
 {
 SCSISense *ua;
 
-/*
- * scsi_fetch_unit_attention_sense() already cleaned the unit attention
- * in this case.
- */
-if (req->ops == &reqops_unit_attention) {
-return;
-}
-
 if (req->dev->unit_attention.key == UNIT_ATTENTION) {
 ua = &req->dev->unit_attention;
 } else if (req->bus->unit_attention.key == UNIT_ATTENTION) {
@@ -836,13 +837,7 @@ static void scsi_clear_unit_attention(SCSIRequest *req)
 return;
 }
 
-/*
- * If a REPORT LUNS command enters the enabled command state, [...]
- * the device server shall clear any pending unit attention condition
- * with an additional sense code of REPORTED LUNS DATA HAS CHANGED.
- */
-if (req->cmd.buf[0] == REPORT_LUNS &&
-ua->asc == SENSE_CODE(REPORTED_LUNS_CHANGED).asc &&
+if (ua->asc == SENSE_CODE(REPORTED_LUNS_CHANGED).asc &&
 ua->ascq == SENSE_CODE(REPORTED_LUNS_CHANGED).ascq) {
 *ua = SENSE_CODE(NO_SENSE);
 }
@@ -1528,13 +1523,6 @@ void scsi_req_complete(SCSIRequest *req, int status)
 req->dev->sense_is_ua = false;
 }
 
-/*
- * Unit attention state is now stored in the device's sense buffer
- * if the HBA didn't do autosense.  Clear the pending unit attention
- * flags.
- */
-scsi_clear_unit_attention(req);
-
 scsi_req_ref(req);
 scsi_req_dequeue(req);
 req->bus->info->complete(req, req->residual);
-- 
2.41.0

[PATCH, trivial 01/29] tree-wide spelling fixes in comments and some messages: block

2023-07-14 Thread Michael Tokarev

Signed-off-by: Michael Tokarev 
---
 block.c  | 2 +-
 block/block-copy.c   | 4 ++--
 block/export/vduse-blk.c | 2 +-
 block/export/vhost-user-blk-server.c | 2 +-
 block/export/vhost-user-blk-server.h | 2 +-
 block/file-posix.c   | 8 
 block/graph-lock.c   | 2 +-
 block/io.c   | 2 +-
 block/linux-aio.c| 2 +-
 block/mirror.c   | 2 +-
 block/qcow2-refcount.c   | 2 +-
 block/vhdx.c | 2 +-
 block/vhdx.h | 4 ++--
 13 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/block.c b/block.c
index a307c151a8..90d2dde828 100644
--- a/block.c
+++ b/block.c
@@ -7584,3 +7584,3 @@ int bdrv_try_change_aio_context(BlockDriverState *bs, AioContext *ctx,
  * Take care of checking that all nodes support changing AioContext
- * and drain them, builing a linear list of callbacks to run if everything
+ * and drain them, building a linear list of callbacks to run if everything
  * is successful (the transaction itself).
diff --git a/block/block-copy.c b/block/block-copy.c
index e13d7bc6b6..db1efc3eb9 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -69,3 +69,3 @@ typedef struct BlockCopyCallState {
 /*
- * Fields that report information about return values and erros.
+ * Fields that report information about return values and errors.
  * Protected by lock in BlockCopyState.
@@ -464,3 +464,3 @@ static coroutine_fn int block_copy_task_run(AioTaskPool *pool,
  *
- * No sync here: nor bitmap neighter intersecting requests handling, only copy.
+ * No sync here: nor bitmap neither intersecting requests handling, only copy.
  *
diff --git a/block/export/vduse-blk.c b/block/export/vduse-blk.c
index 83b05548e7..172f73cef4 100644
--- a/block/export/vduse-blk.c
+++ b/block/export/vduse-blk.c
@@ -140,3 +140,3 @@ static void vduse_blk_enable_queue(VduseDev *dev, VduseVirtq *vq)
on_vduse_vq_kick, NULL, NULL, NULL, vq);
-/* Make sure we don't miss any kick afer reconnecting */
+/* Make sure we don't miss any kick after reconnecting */
 eventfd_write(vduse_queue_get_fd(vq), 1);
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
index f7b5073605..fe2cee3a78 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -1,3 +1,3 @@
 /*
- * Sharing QEMU block devices via vhost-user protocal
+ * Sharing QEMU block devices via vhost-user protocol
  *
diff --git a/block/export/vhost-user-blk-server.h b/block/export/vhost-user-blk-server.h
index fcf46fc8a5..77fb5c0131 100644
--- a/block/export/vhost-user-blk-server.h
+++ b/block/export/vhost-user-blk-server.h
@@ -1,3 +1,3 @@
 /*
- * Sharing QEMU block devices via vhost-user protocal
+ * Sharing QEMU block devices via vhost-user protocol
  *
diff --git a/block/file-posix.c b/block/file-posix.c
index 9e8e3d8ca5..f84c35d831 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1161,5 +1161,5 @@ static int raw_reopen_prepare(BDRVReopenState *state,
  * bdrv_reopen_multiple() .bdrv_reopen_prepare() callback called prior to
- * permission update. Happily, permission update is always a part (a seprate
- * stage) of bdrv_reopen_multiple() so we can rely on this fact and
- * reconfigure fd in raw_check_perm().
+ * permission update. Happily, permission update is always a part
+ * (a separate stage) of bdrv_reopen_multiple() so we can rely on this
+ * fact and reconfigure fd in raw_check_perm().
  */
@@ -3378,3 +3378,3 @@ static void raw_account_discard(BDRVRawState *s, uint64_t nbytes, int ret)
  * offset can be any byte within the entire size of the device;
- * nr_zones is the maxium number of sectors the command should operate on.
+ * nr_zones is the maximum number of sectors the command should operate on.
  */
diff --git a/block/graph-lock.c b/block/graph-lock.c
index 5e66f01ae8..f357a2c0b1 100644
--- a/block/graph-lock.c
+++ b/block/graph-lock.c
@@ -97,3 +97,3 @@ static uint32_t reader_count(void)
 
-/* rd can temporarly be negative, but the total will *always* be >= 0 */
+/* rd can temporarily be negative, but the total will *always* be >= 0 */
 rd = orphaned_reader_count;
diff --git a/block/io.c b/block/io.c
index e8293d6b26..2b872f32f1 100644
--- a/block/io.c
+++ b/block/io.c
@@ -344,3 +344,3 @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
 
-/* Reaquire the AioContext of bs if we dropped it */
+/* Reacquire the AioContext of bs if we dropped it */
 if (ctx != co_ctx) {
diff --git a/block/linux-aio.c b/block/linux-aio.c
index 561c71a9ae..1a51503271 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -229,3 +229,3 @@ static void qemu_laio_process_completions(LinuxAioState *s)
  * by setting event_max to zero, upper level will the

[PATCH, trivial 11/29] tree-wide spelling fixes in comments and some messages: ppc

2023-07-14 Thread Michael Tokarev

Signed-off-by: Michael Tokarev 
---
 host/include/ppc/host/cpuinfo.h |  2 +-
 hw/ppc/ppc.c|  2 +-
 hw/ppc/prep_systemio.c  |  2 +-
 hw/ppc/spapr.c  |  8 
 hw/ppc/spapr_hcall.c|  2 +-
 hw/ppc/spapr_nvdimm.c   |  4 ++--
 hw/ppc/spapr_pci_vfio.c |  6 +++---
 include/hw/ppc/openpic.h|  2 +-
 include/hw/ppc/spapr.h  |  2 +-
 target/ppc/cpu-models.h |  4 ++--
 target/ppc/cpu.h|  2 +-
 target/ppc/cpu_init.c   |  4 ++--
 target/ppc/excp_helper.c| 14 +++---
 target/ppc/power8-pmu-regs.c.inc|  4 ++--
 target/ppc/translate/vmx-impl.c.inc |  6 +++---
 15 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/host/include/ppc/host/cpuinfo.h b/host/include/ppc/host/cpuinfo.h
index 29ee7f9ef8..38b8eabe2a 100644
--- a/host/include/ppc/host/cpuinfo.h
+++ b/host/include/ppc/host/cpuinfo.h
@@ -2,3 +2,3 @@
  * SPDX-License-Identifier: GPL-2.0-or-later
- * Host specific cpu indentification for ppc.
+ * Host specific cpu identification for ppc.
  */
diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index 0e0a3d93c3..6c46204428 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -715,3 +715,3 @@ target_ulong cpu_ppc_load_decr(CPUPPCState *env)
 /*
- * If large decrementer is enabled then the decrementer is signed extened
+ * If large decrementer is enabled then the decrementer is signed extended
  * to 64 bits, otherwise it is a 32 bit value.
diff --git a/hw/ppc/prep_systemio.c b/hw/ppc/prep_systemio.c
index 5a56f155f5..c96cefb13d 100644
--- a/hw/ppc/prep_systemio.c
+++ b/hw/ppc/prep_systemio.c
@@ -41,3 +41,3 @@ OBJECT_DECLARE_SIMPLE_TYPE(PrepSystemIoState, PREP_SYSTEMIO)
 
-/* Bit as defined in PowerPC Reference Plaform v1.1, sect. 6.1.5, p. 132 */
+/* Bit as defined in PowerPC Reference Platform v1.1, sect. 6.1.5, p. 132 */
 #define PREP_BIT(n) (1 << (7 - (n)))
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 1c8b8d57a7..298b4cebf0 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2553,3 +2553,3 @@ static void spapr_set_vsmt_mode(SpaprMachineState *spapr, Error **errp)
 
-/* Detemine the VSMT mode to use: */
+/* Determine the VSMT mode to use: */
 if (vsmt_user) {
@@ -3089,3 +3089,3 @@ static int spapr_kvm_type(MachineState *machine, const char *vm_type)
  * The use of g_ascii_strcasecmp() for 'hv' and 'pr' is to
- * accomodate the 'HV' and 'PV' formats that exists in the
+ * accommodate the 'HV' and 'PV' formats that exists in the
  * wild. The 'auto' mode is being introduced already as
@@ -4323,3 +4323,3 @@ spapr_cpu_index_to_props(MachineState *machine, unsigned cpu_index)
 
-/* make sure possible_cpu are intialized */
+/* make sure possible_cpu are initialized */
 mc->possible_cpu_arch_ids(machine);
@@ -5014,3 +5014,3 @@ static void spapr_machine_2_12_class_options(MachineClass *mc)
  * hpt-max-page-size capability. Of course we can't do it here
- * because this is too early and the HW accelerator isn't initialzed
+ * because this is too early and the HW accelerator isn't initialized
  * yet. Postpone this to machine init (see default_caps_with_cpu()).
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 9b1f225d4a..d69867583d 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1560,3 +1560,3 @@ static void hypercall_register_types(void)
 
-/* "debugger" hcalls (also used by SLOF). Note: We do -not- differenciate
+/* "debugger" hcalls (also used by SLOF). Note: We do -not- differentiate
  * here between the "CI" and the "CACHE" variants, they will use whatever
diff --git a/hw/ppc/spapr_nvdimm.c b/hw/ppc/spapr_nvdimm.c
index a8688243a6..4e34545dcf 100644
--- a/hw/ppc/spapr_nvdimm.c
+++ b/hw/ppc/spapr_nvdimm.c
@@ -379,3 +379,3 @@ static target_ulong h_scm_bind_mem(PowerPCCPU *cpu, SpaprMachineState *spapr,
  * Currently continue token should be zero qemu has already bound
- * everything and this hcall doesnt return H_BUSY.
+ * everything and this hcall doesn't return H_BUSY.
  */
@@ -590,3 +590,3 @@ void spapr_nvdimm_finish_flushes(void)
  * finally reaching here. Other code path being guest
- * h_client_architecture_support, thats early boot up.
+ * h_client_architecture_support, that's early boot up.
  */
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index d8aeee0b7e..12e7790cf6 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -41,3 +41,3 @@ void spapr_phb_vfio_reset(DeviceState *qdev)
 /*
- * The PE might be in frozen state. To reenable the EEH
+ * The PE might be in frozen state. To re-enable the EEH
  * functionality on it will clean the frozen state, which
@@ -80,3 +80,3 @@ int spapr_phb_vfio_eeh_set_option(SpaprPhbState *sphb,
  * We have already validated that all the devices under this sphb
- * are from same i

[PATCH, trivial 02/29] tree-wide spelling fixes in comments and some messages: bsd-user

2023-07-14 Thread Michael Tokarev

Signed-off-by: Michael Tokarev 
---
 bsd-user/errno_defs.h| 2 +-
 bsd-user/freebsd/target_os_siginfo.h | 2 +-
 bsd-user/freebsd/target_os_stack.h   | 4 ++--
 bsd-user/freebsd/target_os_user.h| 2 +-
 bsd-user/qemu.h  | 2 +-
 bsd-user/signal-common.h | 4 ++--
 bsd-user/signal.c| 6 +++---
 7 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/bsd-user/errno_defs.h b/bsd-user/errno_defs.h
index f3e8ac3488..abe70119d9 100644
--- a/bsd-user/errno_defs.h
+++ b/bsd-user/errno_defs.h
@@ -151,3 +151,3 @@
 /* Internal errors: */
-#define TARGET_EJUSTRETURN  254 /* Just return without modifing regs */
+#define TARGET_EJUSTRETURN  254 /* Just return without modifying regs */
 #define TARGET_ERESTART 255 /* Restart syscall */
diff --git a/bsd-user/freebsd/target_os_siginfo.h b/bsd-user/freebsd/target_os_siginfo.h
index 4573738752..6c282d8502 100644
--- a/bsd-user/freebsd/target_os_siginfo.h
+++ b/bsd-user/freebsd/target_os_siginfo.h
@@ -74,3 +74,3 @@ typedef struct target_siginfo {
 
-/* SIGPOLL -- Not really genreated in FreeBSD ??? */
+/* SIGPOLL -- Not really generated in FreeBSD ??? */
 struct {
diff --git a/bsd-user/freebsd/target_os_stack.h b/bsd-user/freebsd/target_os_stack.h
index 0590133291..d15fc3263f 100644
--- a/bsd-user/freebsd/target_os_stack.h
+++ b/bsd-user/freebsd/target_os_stack.h
@@ -27,3 +27,3 @@
 /*
- * The inital FreeBSD stack is as follows:
+ * The initial FreeBSD stack is as follows:
  * (see kern/kern_exec.c exec_copyout_strings() )
@@ -61,3 +61,3 @@ static inline int setup_initial_stack(struct bsd_binprm *bprm,
 
-/* Add machine depedent sigcode. */
+/* Add machine dependent sigcode. */
 p -= TARGET_SZSIGCODE;
diff --git a/bsd-user/freebsd/target_os_user.h b/bsd-user/freebsd/target_os_user.h
index f036a32343..1ca7b5ab17 100644
--- a/bsd-user/freebsd/target_os_user.h
+++ b/bsd-user/freebsd/target_os_user.h
@@ -28,3 +28,3 @@ struct target_priority {
 uint8_t pri_level;  /* Normal priority level. */
-uint8_t pri_native; /* Priority before propogation. */
+uint8_t pri_native; /* Priority before propagation. */
 uint8_t pri_user;   /* User priority based on p_cpu and p_nice. */
diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
index 41d84e0b81..79c9b62609 100644
--- a/bsd-user/qemu.h
+++ b/bsd-user/qemu.h
@@ -120,3 +120,3 @@ extern const char *qemu_uname_release;
  * and envelope for the new program. 256k should suffice for a reasonable
- * maxiumum env+arg in 32-bit environments, bump it up to 512k for !ILP32
+ * maximum env+arg in 32-bit environments, bump it up to 512k for !ILP32
  * platforms.
diff --git a/bsd-user/signal-common.h b/bsd-user/signal-common.h
index 6f90345bb2..c044e81165 100644
--- a/bsd-user/signal-common.h
+++ b/bsd-user/signal-common.h
@@ -51,3 +51,3 @@ void target_to_host_sigset(sigset_t *d, const target_sigset_t *s);
  * either within host siginfo_t or in target_siginfo structures which we get
- * from the guest userspace program. Linux kenrels use this internally, but BSD
+ * from the guest userspace program. Linux kernels use this internally, but BSD
  * kernels don't do this, but its a useful abstraction.
@@ -55,3 +55,3 @@ void target_to_host_sigset(sigset_t *d, const target_sigset_t *s);
  * The linux-user version of this uses the top 16 bits, but FreeBSD's SI_USER
- * and other signal indepenent SI_ codes have bit 16 set, so we only use the top
+ * and other signal independent SI_ codes have bit 16 set, so we only use the top
  * byte instead.
diff --git a/bsd-user/signal.c b/bsd-user/signal.c
index f4e078ee1d..6e77dd0b4d 100644
--- a/bsd-user/signal.c
+++ b/bsd-user/signal.c
@@ -46,3 +46,3 @@ static inline int sas_ss_flags(TaskState *ts, unsigned long sp)
 /*
- * The BSD ABIs use the same singal numbers across all the CPU architectures, so
+ * The BSD ABIs use the same signal numbers across all the CPU architectures, so
  * (unlike Linux) these functions are just the identity mapping. This might not
@@ -243,3 +243,3 @@ static inline void host_to_target_siginfo_noswap(target_siginfo_t *tinfo,
  * Unsure that this can actually be generated, and our support for
- * capsicum is somewhere between weak and non-existant, but if we get
+ * capsicum is somewhere between weak and non-existent, but if we get
  * one, then we know what to save.
@@ -321,3 +321,3 @@ int block_signals(void)
  * further guest code before unblocking signals in
- * process_pending_signals(). We depend on the FreeBSD behaivor here where
+ * process_pending_signals(). We depend on the FreeBSD behavior here where
  * this will only affect this thread's signal mask. We don't use
-- 
2.39.2

[PATCH, trivial 04/29] tree-wide spelling fixes in comments and some messages: util

2023-07-14 Thread Michael Tokarev

Signed-off-by: Michael Tokarev 
---
 util/cpuinfo-aarch64.c | 4 ++--
 util/cpuinfo-i386.c| 4 ++--
 util/cpuinfo-ppc.c | 2 +-
 util/main-loop.c   | 2 +-
 util/oslib-posix.c | 2 +-
 util/qdist.c   | 2 +-
 util/qemu-progress.c   | 2 +-
 util/qemu-sockets.c| 2 +-
 util/rcu.c | 2 +-
 9 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/util/cpuinfo-aarch64.c b/util/cpuinfo-aarch64.c
index ababc39550..7d39f47e3b 100644
--- a/util/cpuinfo-aarch64.c
+++ b/util/cpuinfo-aarch64.c
@@ -2,3 +2,3 @@
  * SPDX-License-Identifier: GPL-2.0-or-later
- * Host specific cpu indentification for AArch64.
+ * Host specific cpu identification for AArch64.
  */
@@ -35,3 +35,3 @@ static bool sysctl_for_bool(const char *name)
  * but we're only asking about static properties, all of which should be
- * 'int'.  So we shouln't see ENOMEM (val too small), or any of the other
+ * 'int'.  So we shouldn't see ENOMEM (val too small), or any of the other
  * more exotic errors.
diff --git a/util/cpuinfo-i386.c b/util/cpuinfo-i386.c
index 3a7b7e0ad1..b2ed65bb10 100644
--- a/util/cpuinfo-i386.c
+++ b/util/cpuinfo-i386.c
@@ -2,3 +2,3 @@
  * SPDX-License-Identifier: GPL-2.0-or-later
- * Host specific cpu indentification for x86.
+ * Host specific cpu identification for x86.
  */
@@ -76,3 +76,3 @@ unsigned __attribute__((constructor)) cpuinfo_init(void)
  * AMD has provided an even stronger guarantee that processors
- * with AVX provide 16-byte atomicity for all cachable,
+ * with AVX provide 16-byte atomicity for all cacheable,
  * naturally aligned single loads and stores, e.g. MOVDQU.
diff --git a/util/cpuinfo-ppc.c b/util/cpuinfo-ppc.c
index 7212afa45d..1ea3db0ac8 100644
--- a/util/cpuinfo-ppc.c
+++ b/util/cpuinfo-ppc.c
@@ -2,3 +2,3 @@
  * SPDX-License-Identifier: GPL-2.0-or-later
- * Host specific cpu indentification for ppc.
+ * Host specific cpu identification for ppc.
  */
diff --git a/util/main-loop.c b/util/main-loop.c
index 014c795916..797b640c41 100644
--- a/util/main-loop.c
+++ b/util/main-loop.c
@@ -49,3 +49,3 @@
  * Disable CFI checks.
- * We are going to call a signal hander directly. Such handler may or may not
+ * We are going to call a signal handler directly. Such handler may or may not
  * have been defined in our binary, so there's no guarantee that the pointer
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 760390b31e..4d583da7ce 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -673,3 +673,3 @@ void qemu_free_stack(void *stack, size_t sz)
  * Disable CFI checks.
- * We are going to call a signal hander directly. Such handler may or may not
+ * We are going to call a signal handler directly. Such handler may or may not
  * have been defined in our binary, so there's no guarantee that the pointer
diff --git a/util/qdist.c b/util/qdist.c
index 5f75e24c29..ef3566b03a 100644
--- a/util/qdist.c
+++ b/util/qdist.c
@@ -212,3 +212,3 @@ void qdist_bin__internal(struct qdist *to, const struct qdist *from, size_t n)
  * To avoid double-counting we capture [left, right) ranges, except for
- * the righmost bin, which captures a [left, right] range.
+ * the rightmost bin, which captures a [left, right] range.
  */
diff --git a/util/qemu-progress.c b/util/qemu-progress.c
index aa994668f1..35574487c9 100644
--- a/util/qemu-progress.c
+++ b/util/qemu-progress.c
@@ -97,3 +97,3 @@ static void progress_dummy_init(void)
  * tools that use the progress report SIGUSR1 isn't used in this meaning
- * and instead should print the progress, so reenable it.
+ * and instead should print the progress, so re-enable it.
  */
diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index 892d33f5e6..83e84b1186 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -931,3 +931,3 @@ static int unix_listen_saddr(UnixSocketAddress *saddr,
 /*
- * This dummy fd usage silences the mktemp() unsecure warning.
+ * This dummy fd usage silences the mktemp() insecure warning.
  * Using mkstemp() doesn't make things more secure here
diff --git a/util/rcu.c b/util/rcu.c
index 30a7e22026..e587bcc483 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -357,3 +357,3 @@ void drain_call_rcu(void)
  * we also end up waiting for most of RCU callbacks that were registered
- * on the other threads, but this is a side effect that shoudn't be
+ * on the other threads, but this is a side effect that shouldn't be
  * assumed.
-- 
2.39.2

[PATCH, trivial 09/29] tree-wide spelling fixes in comments and some messages: i386

2023-07-14 Thread Michael Tokarev

Signed-off-by: Michael Tokarev 
---
 host/include/i386/host/cpuinfo.h | 2 +-
 hw/i386/acpi-build.c | 4 ++--
 hw/i386/amd_iommu.c  | 4 ++--
 hw/i386/intel_iommu.c| 4 ++--
 hw/i386/kvm/xen_xenstore.c   | 2 +-
 hw/i386/kvm/xenstore_impl.c  | 2 +-
 hw/i386/pc.c | 4 ++--
 include/hw/i386/topology.h   | 2 +-
 target/i386/cpu.c| 4 ++--
 target/i386/cpu.h| 4 ++--
 target/i386/hax/hax-interface.h  | 4 ++--
 target/i386/hax/hax-windows.c| 2 +-
 target/i386/kvm/kvm.c| 4 ++--
 target/i386/kvm/xen-emu.c| 2 +-
 target/i386/machine.c| 4 ++--
 target/i386/tcg/translate.c  | 8 
 tests/tcg/i386/system/boot.S | 2 +-
 tests/tcg/i386/x86.csv   | 2 +-
 18 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/host/include/i386/host/cpuinfo.h b/host/include/i386/host/cpuinfo.h
index 073d0a426f..6e46939132 100644
--- a/host/include/i386/host/cpuinfo.h
+++ b/host/include/i386/host/cpuinfo.h
@@ -2,3 +2,3 @@
  * SPDX-License-Identifier: GPL-2.0-or-later
- * Host specific cpu indentification for x86.
+ * Host specific cpu identification for x86.
  */
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 9c74fa17ad..acbfff 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -777,3 +777,3 @@ static Aml *initialize_route(Aml *route, const char *link_name,
  * based on device location.
- * The main goal is to equaly distribute the interrupts
+ * The main goal is to equally distribute the interrupts
  * over the 4 existing ACPI links (works only for i440fx).
@@ -2080,3 +2080,3 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
 /*
- * Insert DMAR scope for PCI bridges and endpoint devcie
+ * Insert DMAR scope for PCI bridges and endpoint device
  */
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 9c77304438..c98a3c6e11 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -261,3 +261,3 @@ static void amdvi_log_command_error(AMDVIState *s, hwaddr addr)
 }
-/* log an illegal comand event
+/* log an illegal command event
  *   @addr : address of illegal command
@@ -769,3 +769,3 @@ static void amdvi_mmio_write(void *opaque, hwaddr addr, uint64_t val,
 amdvi_mmio_reg_write(s, size, val, addr);
-/* FIXME - make sure System Software has finished writing incase
+/* FIXME - make sure System Software has finished writing in case
  * it writes in chucks less than 8 bytes in a robust way.As for
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index dcc334060c..09b19a43ee 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -54,3 +54,3 @@
  * PCI bus number (or SID) is not reliable since the device is usaully
- * initalized before guest can configure the PCI bridge
+ * initialized before guest can configure the PCI bridge
  * (SECONDARY_BUS_NUMBER).
@@ -1691,3 +1691,3 @@ static bool vtd_switch_address_space(VTDAddressSpace *as)
  * We enable per as memory region (iommu_ir_fault) for catching
- * the tranlsation for interrupt range through PASID + PT.
+ * the translation for interrupt range through PASID + PT.
  */
diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 133d89e953..660d0b72f9 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -1158,3 +1158,3 @@ static unsigned int copy_to_ring(XenXenstoreState *s, uint8_t *ptr,
  * This matches the barrier in copy_to_ring() (or the guest's
- * equivalent) betweem writing the data to the ring and updating
+ * equivalent) between writing the data to the ring and updating
  * rsp_prod. It protects against the pathological case (which
diff --git a/hw/i386/kvm/xenstore_impl.c b/hw/i386/kvm/xenstore_impl.c
index 305fe75519..36595fdb45 100644
--- a/hw/i386/kvm/xenstore_impl.c
+++ b/hw/i386/kvm/xenstore_impl.c
@@ -1428,3 +1428,3 @@ static void save_node(gpointer key, gpointer value, gpointer opaque)
  * There's no rename/move in XenStore, so all we need to find
- * it is the tx_id of the transation in which it exists. Which
+ * it is the tx_id of the transaction in which it exists. Which
  * may be the root tx.
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 3109d5e0e0..405db3aef9 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -435,3 +435,3 @@ static uint64_t ioport80_read(void *opaque, hwaddr addr, unsigned size)
 
-/* MSDOS compatibility mode FPU exception support */
+/* MS-DOS compatibility mode FPU exception support */
 static void ioportF0_write(void *opaque, hwaddr addr, uint64_t data,
@@ -1754,3 +1754,3 @@ static void pc_machine_set_max_fw_size(Object *obj, Visitor *v,
"User specified max allowed firmware size %" PRIu64 " is "
-   "greater than 16MiB. If combined firwmare size exceeds "
+   "greater than 16MiB. If combined firmware size exceeds "

[PATCH, trivial 14/29] tree-wide spelling fixes in comments and some messages: hexagon

2023-07-14 Thread Michael Tokarev

Signed-off-by: Michael Tokarev 
---
 target/hexagon/README   |  2 +-
 target/hexagon/fma_emu.c|  2 +-
 target/hexagon/idef-parser/README.rst   |  2 +-
 target/hexagon/idef-parser/idef-parser.h|  2 +-
 target/hexagon/idef-parser/parser-helpers.c |  6 +++---
 target/hexagon/imported/alu.idef|  8 
 target/hexagon/imported/macros.def  |  2 +-
 target/hexagon/imported/mmvec/ext.idef  | 10 +-
 tests/tcg/hexagon/fpstuff.c |  2 +-
 tests/tcg/hexagon/test_clobber.S|  2 +-
 10 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/target/hexagon/README b/target/hexagon/README
index 43811178e9..e757bcb64a 100644
--- a/target/hexagon/README
+++ b/target/hexagon/README
@@ -241,3 +241,3 @@ VLIW packet semantics differ from serial semantics in that all input operands
 are read, then the operations are performed, then all the results are written.
-For exmaple, this packet performs a swap of registers r0 and r1
+For example, this packet performs a swap of registers r0 and r1
 { r0 = r1; r1 = r0 }
diff --git a/target/hexagon/fma_emu.c b/target/hexagon/fma_emu.c
index d3b45d494f..05a56d8c10 100644
--- a/target/hexagon/fma_emu.c
+++ b/target/hexagon/fma_emu.c
@@ -417,3 +417,3 @@ static SUFFIX accum_round_##SUFFIX(Accum a, float_status * fp_status) \
  * shifted out lots of bits from B, or if we had no shift / 1 shift sticky \
- * shoudl be 0  \
+ * should be 0  \
  */ \
diff --git a/target/hexagon/idef-parser/README.rst b/target/hexagon/idef-parser/README.rst
index debeddfde5..d0aa34309b 100644
--- a/target/hexagon/idef-parser/README.rst
+++ b/target/hexagon/idef-parser/README.rst
@@ -442,3 +442,3 @@ Run-time errors can be divided between lexing and parsing errors, lexing errors
 are hard to detect, since the ``var`` token will catch everything which is not
-catched by other tokens, but easy to fix, because most of the time a simple
+caught by other tokens, but easy to fix, because most of the time a simple
 regex editing will be enough.
diff --git a/target/hexagon/idef-parser/idef-parser.h b/target/hexagon/idef-parser/idef-parser.h
index d23e71f13b..3faa1deecd 100644
--- a/target/hexagon/idef-parser/idef-parser.h
+++ b/target/hexagon/idef-parser/idef-parser.h
@@ -75,3 +75,3 @@ typedef struct HexTmp {
 /**
- * Enum of the possible immediated, an immediate is a value which is known
+ * Enum of the possible immediate, an immediate is a value which is known
  * at tinycode generation time, e.g. an integer value, not a TCGv
diff --git a/target/hexagon/idef-parser/parser-helpers.c b/target/hexagon/idef-parser/parser-helpers.c
index 7b5ebafec2..ec43343801 100644
--- a/target/hexagon/idef-parser/parser-helpers.c
+++ b/target/hexagon/idef-parser/parser-helpers.c
@@ -461,3 +461,3 @@ static bool try_find_variable(Context *c, YYLTYPE *locp,
 
-/* Calls `try_find_variable` and asserts succcess. */
+/* Calls `try_find_variable` and asserts success. */
 static void find_variable(Context *c, YYLTYPE *locp,
@@ -551,3 +551,3 @@ HexValue gen_bin_cmp(Context *c,
 default:
-fprintf(stderr, "Error in evalutating immediateness!");
+fprintf(stderr, "Error in evaluating immediateness!");
 abort();
@@ -1166,3 +1166,3 @@ void gen_rdeposit_op(Context *c,
  * Otherwise if the width is not known, we fallback on reimplementing
- * desposit in TCG.
+ * deposit in TCG.
  */
diff --git a/target/hexagon/imported/alu.idef b/target/hexagon/imported/alu.idef
index 58477ae40a..12d2aac5d4 100644
--- a/target/hexagon/imported/alu.idef
+++ b/target/hexagon/imported/alu.idef
@@ -294,12 +294,12 @@ Q6INSN(A4_combineii,"Rdd32=combine(#s8,#U6)",ATTRIBS(),"Set two small immediates
 Q6INSN(A2_combine_hh,"Rd32=combine(Rt.H32,Rs.H32)",ATTRIBS(),
-"Combine two halfs into a register", {RdV = (fGETUHALF(1,RtV)<<16) | fGETUHALF(1,RsV);})
+"Combine two halves into a register", {RdV = (fGETUHALF(1,RtV)<<16) | fGETUHALF(1,RsV);})
 
 Q6INSN(A2_combine_hl,"Rd32=combine(Rt.H32,Rs.L32)",ATTRIBS(),
-"Combine two halfs into a register", {RdV = (fGETUHALF(1,RtV)<<16) | fGETUHALF(0,RsV);})
+"Combine two halves into a register", {RdV = (fGETUHALF(1,RtV)<<16) | fGETUHALF(0,RsV);})
 
 Q6INSN(A2_combine_lh,"Rd32=combine(Rt.L32,Rs.H32)",ATTRIBS(),
-"Combine two halfs into a register", {RdV = (fGETUHALF(0,RtV)<<16) | fGETUHALF(1,RsV);})
+"Combine two halves into a register", {RdV = (fGETUHALF(0,RtV)<<16) | fGETUHALF(1,RsV);})
 
 Q6INSN(A2_combine_ll,"Rd32=combine(Rt.L32,Rs.L32)",ATTRIBS(),
-"Combine two halfs into a register", {RdV = (fGETUHALF(0,RtV)<<16) | fGETUHALF(0,RsV);})
+"Combine two halves into a register", {RdV = (fGETUHALF(0,RtV)<<16) | fGETUHALF(0,RsV);})
 
diff --git a/target/hexagon/imported/macros.def b/target/hexagon/imported/macros.def
index e23f91562e..4bbcfdd5e1 100755
--- a/target/hexagon/imported/macros.def
+++ b/target/hexagon/imported/macros.def
@@ -904,3 +904,3 @@ DEF_MA

[PATCH, trivial 03/29] tree-wide spelling fixes in comments and some messages: ui

2023-07-14 Thread Michael Tokarev

Signed-off-by: Michael Tokarev 
---
 ui/cocoa.m| 2 +-
 ui/keymaps.h  | 2 +-
 ui/sdl2-2d.c  | 2 +-
 ui/sdl2.c | 2 +-
 ui/vnc-enc-tight.c| 2 +-
 ui/vnc-enc-zrle.c.inc | 2 +-
 ui/vnc-enc-zywrle.h   | 4 ++--
 7 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/ui/cocoa.m b/ui/cocoa.m
index 0c2153d17c..4d8989c4eb 100644
--- a/ui/cocoa.m
+++ b/ui/cocoa.m
@@ -2047,3 +2047,3 @@ static void cocoa_display_init(DisplayState *ds, DisplayOptions *opts)
  * Create the menu entries which depend on QEMU state (for consoles
- * and removeable devices). These make calls back into QEMU functions,
+ * and removable devices). These make calls back into QEMU functions,
  * which is OK because at this point we know that the second thread
diff --git a/ui/keymaps.h b/ui/keymaps.h
index 6473405485..3d52c0882a 100644
--- a/ui/keymaps.h
+++ b/ui/keymaps.h
@@ -46,3 +46,3 @@ typedef struct {
 
-/* Additional modifiers to use if not catched another way. */
+/* Additional modifiers to use if not caught another way. */
 #define SCANCODE_SHIFT  0x100
diff --git a/ui/sdl2-2d.c b/ui/sdl2-2d.c
index bfebbdeaea..06468cd493 100644
--- a/ui/sdl2-2d.c
+++ b/ui/sdl2-2d.c
@@ -152,3 +152,3 @@ bool sdl2_2d_check_format(DisplayChangeListener *dcl,
  * We let SDL convert for us a few more formats than,
- * the native ones. Thes are the ones I have tested.
+ * the native ones. These are the ones I have tested.
  */
diff --git a/ui/sdl2.c b/ui/sdl2.c
index 0d91b555e3..ea4a92cd36 100644
--- a/ui/sdl2.c
+++ b/ui/sdl2.c
@@ -862,3 +862,3 @@ static void sdl2_display_init(DisplayState *ds, DisplayOptions *o)
 #ifndef CONFIG_WIN32
-/* QEMU uses its own low level keyboard hook procecure on Windows */
+/* QEMU uses its own low level keyboard hook procedure on Windows */
 SDL_SetHint(SDL_HINT_GRAB_KEYBOARD, "1");
diff --git a/ui/vnc-enc-tight.c b/ui/vnc-enc-tight.c
index 09200d71b8..ee853dcfcb 100644
--- a/ui/vnc-enc-tight.c
+++ b/ui/vnc-enc-tight.c
@@ -79,3 +79,3 @@ static int tight_send_framebuffer_update(VncState *vs, int x, int y,
 static const struct {
-double jpeg_freq_min;   /* Don't send JPEG if the freq is bellow */
+double jpeg_freq_min;   /* Don't send JPEG if the freq is below */
 double jpeg_freq_threshold; /* Always send JPEG if the freq is above */
diff --git a/ui/vnc-enc-zrle.c.inc b/ui/vnc-enc-zrle.c.inc
index c107d8affc..a8ca37d05e 100644
--- a/ui/vnc-enc-zrle.c.inc
+++ b/ui/vnc-enc-zrle.c.inc
@@ -112,3 +112,3 @@ static void ZRLE_ENCODE_TILE(VncState *vs, ZRLE_PIXEL *data, int w, int h,
 
-/* Real limit is 127 but we wan't a way to know if there is more than 127 */
+/* Real limit is 127 but we want a way to know if there is more than 127 */
 palette_init(palette, 256, ZRLE_BPP);
diff --git a/ui/vnc-enc-zywrle.h b/ui/vnc-enc-zywrle.h
index e661ec117d..64fbc90ee7 100644
--- a/ui/vnc-enc-zywrle.h
+++ b/ui/vnc-enc-zywrle.h
@@ -487,3 +487,3 @@ static inline void wavelet(int *buf, int width, int height, int level)
   RGB <=> YUV conversion stuffs.
-  YUV coversion is explained as following formula in strict meaning:
+  YUV conversion is explained as following formula in strict meaning:
   Y =  0.299R + 0.587G + 0.114B (   0<=Y<=255)
@@ -541,3 +541,3 @@ static inline void wavelet(int *buf, int width, int height, int level)
  So, we must transfer each sub images individually in strict meaning.
- But at least ZRLE meaning, following one decompositon image is same as
+ But at least ZRLE meaning, following one decomposition image is same as
  avobe individual sub image. I use this format.
-- 
2.39.2

[PATCH, trivial 23/29] tree-wide spelling fixes in comments and some messages: hw/

2023-07-14 Thread Michael Tokarev

Signed-off-by: Michael Tokarev 
---
 hw/acpi/aml-build.c              |  6 +++---
 hw/acpi/hmat.c                   |  2 +-
 hw/acpi/nvdimm.c                 |  2 +-
 hw/block/hd-geometry.c           |  4 ++--
 hw/block/pflash_cfi01.c          |  2 +-
 hw/char/cadence_uart.c           |  2 +-
 hw/char/imx_serial.c             |  2 +-
 hw/char/serial.c                 |  2 +-
 hw/core/generic-loader.c         |  4 ++--
 hw/core/loader.c                 |  4 ++--
 hw/core/machine.c                |  2 +-
 hw/core/qdev-properties-system.c |  2 +-
 hw/cpu/a15mpcore.c               |  2 +-
 hw/cxl/cxl-events.c              |  2 +-
 hw/cxl/cxl-mailbox-utils.c       |  4 ++--
 hw/dma/omap_dma.c                |  4 ++--
 hw/input/hid.c                   |  2 +-
 hw/input/tsc2005.c               | 16
 hw/intc/loongarch_extioi.c       |  2 +-
 hw/intc/loongson_liointc.c       |  2 +-
 hw/intc/omap_intc.c              |  2 +-
 hw/intc/pnv_xive.c               |  2 +-
 hw/intc/spapr_xive.c             |  2 +-
 hw/intc/spapr_xive_kvm.c         |  6 +++---
 hw/intc/xive.c                   |  2 +-
 hw/intc/xive2.c                  |  6 +++---
 hw/ipmi/ipmi_bmc_extern.c        |  2 +-
 hw/mem/cxl_type3.c               |  6 +++---
 hw/misc/imx7_ccm.c               |  2 +-
 hw/misc/mac_via.c                |  2 +-
 hw/misc/stm32f2xx_syscfg.c       |  4 ++--
 hw/misc/trace-events             |  2 +-
 hw/misc/zynq_slcr.c              |  2 +-
 hw/nvme/ctrl.c                   |  6 +++---
 hw/nvram/eeprom_at24c.c          |  2 +-
 hw/nvram/fw_cfg.c                |  2 +-
 hw/rtc/exynos4210_rtc.c          |  2 +-
 hw/rx/rx62n.c                    |  2 +-
 hw/scsi/lsi53c895a.c             |  2 +-
 hw/scsi/mfi.h                    |  2 +-
 hw/sd/sd.c                       |  2 +-
 hw/sd/sdhci.c                    |  2 +-
 hw/sensor/isl_pmbus_vr.c         |  2 +-
 hw/sensor/max34451.c             |  2 +-
 hw/sh4/sh7750_regs.h             | 26 +-
 hw/smbios/smbios.c               |  2 +-
 hw/ssi/xilinx_spips.c            |  6 +++---
 hw/ssi/xlnx-versal-ospi.c        |  2 +-
 hw/timer/etraxfs_timer.c         |  2 +-
 hw/timer/i8254.c                 |  2 +-
 hw/timer/renesas_tmr.c           |  2 +-
 hw/virtio/virtio-crypto.c        |  4 ++--
 hw/virtio/virtio-mem.c           |  2 +-
 hw/virtio/virtio.c               |  2 +-
 54 files changed, 92 insertions(+), 92 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index ea331a20d1..af66bde0f5 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -314,3 +314,3 @@ build_prepend_package_length(GArray *package, unsigned length, bool incl_self)
  * and PkgLength's length itself when used for terms with
- * explitit length.
+ * explicit length.
  */
@@ -682,3 +682,3 @@ Aml *aml_store(Aml *val, Aml *target)
  *
- * Returns: The newly allocated and composed according to patter Aml object.
+ * Returns: The newly allocated and composed according to pattern Aml object.
  */
@@ -2161,3 +2161,3 @@ void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
 } else {
-build_append_int_noprefix(tbl, 0, 3); /* Reserved upto ACPI 5.0 */
+build_append_int_noprefix(tbl, 0, 3); /* Reserved up to ACPI 5.0 */
 }
diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c
index 3a6d51282a..2d5e199ba9 100644
--- a/hw/acpi/hmat.c
+++ b/hw/acpi/hmat.c
@@ -84,3 +84,3 @@ static void build_hmat_lb(GArray *table_data, HMAT_LB_Info *hmat_lb,
 uint32_t lb_length
-= 32 /* Table length upto and including Entry Base Unit */
+= 32 /* Table length up to and including Entry Base Unit */
 + 4 * num_initiator /* Initiator Proximity Domain List */
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index a3b25a92f3..fe03ce87e0 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -1099,3 +1099,3 @@ static void nvdimm_build_common_dsm(Aml *dev,
  * size is 32 bits, otherwise it is 64 bits.
- * Because of this CreateField() canot be used if RLEN < Integer Size.
+ * Because of this CreateField() cannot be used if RLEN < Integer Size.
  *
diff --git a/hw/block/hd-geometry.c b/hw/block/hd-geometry.c
index dae13ab14d..2b0af4430f 100644
--- a/hw/block/hd-geometry.c
+++ b/hw/block/hd-geometry.c
@@ -52,3 +52,3 @@ struct partition {
 
-/* try to guess the disk logical geometry from the MSDOS partition table.
+/* try to guess the disk logical geometry from the MS-DOS partition table.
Return 0 if OK, -1 if could not guess */
@@ -68,3 +68,3 @@ static int guess_disk_lchs(BlockBackend *blk,
 }
-/* test msdos magic */
+/* test MS-DOS magic */
 if (buf[510] != 0x55 || buf[511] != 0xaa) {
diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
index 3c066e3405..62056b1d74 100644
--- a/hw/block/pflash_cfi01.c
+++ b/hw/block/pflash_cfi01.c
@@ -893,3 +893,3 @@ static Property pflash_cfi01_properties[] = {
  * If we're emulating flash devices wired in
