Re: [PATCH] hw/virtio: Document *_should_notify() are called within rcu_read_lock()

2021-05-23 Thread Michael S. Tsirkin
On Thu, May 20, 2021 at 08:49:00AM +0200, Philippe Mathieu-Daudé wrote:
> Such comments make reviewing this file somehow easier.
> 
> Signed-off-by: Philippe Mathieu-Daudé 

ok but

> ---
>  hw/virtio/virtio.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index e02544b2df7..2b4c6c4b875 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -2449,6 +2449,7 @@ static void virtio_set_isr(VirtIODevice *vdev, int 
> value)
>  }
>  }
>  
> +/* Called within rcu_read_lock().  */
>  static bool virtio_split_should_notify(VirtIODevice *vdev, VirtQueue *vq)
>  {
>  uint16_t old, new;
> @@ -2485,6 +2486,7 @@ static bool vring_packed_need_event(VirtQueue *vq, bool 
> wrap,
>  return vring_need_event(off, new, old);
>  }
>  
> +/* Called within rcu_read_lock().  */
>  static bool virtio_packed_should_notify(VirtIODevice *vdev, VirtQueue *vq)
>  {
>  VRingPackedDescEvent e;


one space before */ please, not two.

> -- 
> 2.26.3




Re: [PATCH] hw/virtio: Have virtio_bus_get_vdev_bad_features() return 64-bit value

2021-05-23 Thread Michael S. Tsirkin
On Thu, May 20, 2021 at 12:28:22PM +0200, Philippe Mathieu-Daudé wrote:
> In commit 019a3edbb25 ("virtio: make features 64bit wide") we
> increased the 'features' field to 64-bit, but forgot to update
> the virtio_bus_get_vdev_bad_features() helper. The 'bad features'
> are truncated to 32-bit. The virtio_net_bad_features() handler
> from the virtio-net devices is potentially affected.

I'm fine with increasing it for consistency, but bad features
are all legacy things aren't they? So there isn't a functional
issue ... or did I miss anything?

> 
> Have the virtio_bus_get_vdev_bad_features() helper return the
> full 64-bit value.
> 
> Cc: qemu-sta...@nongnu.org
> Fixes: 019a3edbb25 ("virtio: make features 64bit wide")
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  include/hw/virtio/virtio-bus.h | 2 +-
>  hw/virtio/virtio-bus.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/hw/virtio/virtio-bus.h b/include/hw/virtio/virtio-bus.h
> index ef8abe49c5a..f9955ff577a 100644
> --- a/include/hw/virtio/virtio-bus.h
> +++ b/include/hw/virtio/virtio-bus.h
> @@ -122,7 +122,7 @@ uint16_t virtio_bus_get_vdev_id(VirtioBusState *bus);
>  /* Get the config_len field of the plugged device. */
>  size_t virtio_bus_get_vdev_config_len(VirtioBusState *bus);
>  /* Get bad features of the plugged device. */
> -uint32_t virtio_bus_get_vdev_bad_features(VirtioBusState *bus);
> +uint64_t virtio_bus_get_vdev_bad_features(VirtioBusState *bus);
>  /* Get config of the plugged device. */
>  void virtio_bus_get_vdev_config(VirtioBusState *bus, uint8_t *config);
>  /* Set config of the plugged device. */
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index 859978d2487..25a2b68a234 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -134,7 +134,7 @@ size_t virtio_bus_get_vdev_config_len(VirtioBusState *bus)
>  }
>  
>  /* Get bad features of the plugged device. */
> -uint32_t virtio_bus_get_vdev_bad_features(VirtioBusState *bus)
> +uint64_t virtio_bus_get_vdev_bad_features(VirtioBusState *bus)
>  {
>  VirtIODevice *vdev = virtio_bus_get_device(bus);
>  VirtioDeviceClass *k;
> -- 
> 2.26.3




Re: [RFC PATCH v4 0/7] Use ACPI PCI hot-plug for Q35

2021-05-23 Thread Michael S. Tsirkin
On Thu, May 13, 2021 at 08:26:35AM +0200, Julia Suvorova wrote:
> The patch set consists of two parts:
> patches 1-4: introduce new feature
>  'acpi-pci-hotplug-with-bridge-support' on Q35
> patches 5-7: make the feature default along with changes in ACPI tables
> 
> This way maintainers can decide which way to choose without breaking
> the patch set.
> 
> With the feature disabled Q35 falls back to the native hot-plug.

Overall I think this version has addressed my concerns, good job!
I see Igor has some suggestions on how to structure the code,
they seem easy to address.
Question: what makes this an RFC? Any known bugs/missing functionality?
I don't see anything obvious - what did I miss?


> Pros
> * no racy behavior during boot (see 110c477c2ed)
> * eject is possible - according to PCIe spec, attention button
>   press should lead to power off, and then the adapter should be
>   removed manually. As there is no power down state exists in QEMU,
>   we cannot distinguish between an eject and a power down
>   request.
> * no delay during deleting - after the actual power off software
>   must wait at least 1 second before indicating about it. This case
>   is quite important for users, it even has its own bug:
>   https://bugzilla.redhat.com/show_bug.cgi?id=1594168
> * no timer-based behavior - in addition to the previous example,
>   the attention button has a 5-second waiting period, during which
>   the operation can be canceled with a second press. While this
>   looks fine for manual button control, automation will result in
>   the need to queue or drop events, and the software receiving
>   events in all sort of unspecified combinations of attention/power
>   indicator states, which is racy and uppredictable.
> * fixes or reduces the likelihood of the bugs:
> * https://bugzilla.redhat.com/show_bug.cgi?id=1833187
> * https://bugzilla.redhat.com/show_bug.cgi?id=1657077
> * https://bugzilla.redhat.com/show_bug.cgi?id=1669931
> * https://bugzilla.redhat.com/show_bug.cgi?id=1678290
> 
> Cons:
> * no access to possible features presented in slot capabilities
>   (this is only surprise removal AFAIK)
> 
> v4:
> * regain per-port control over hot-plug
> * rebased over acpi-index changes
> * set property on machine type to
>   make pci code more generic [Igor, Michael]
> 
> v3:
> * drop change of _OSC to allow SHPC on hotplugged bridges
> * use 'acpi-root-pci-hotplug'
> * add migration states [Igor]
> * minor style changes
> 
> v2:
> * new ioport range for acpiphp [Gerd]
> * drop find_pci_host() [Igor]
> * explain magic numbers in _OSC [Igor]
> * drop build_q35_pci_hotplug() wrapper [Igor]
> 
> Julia Suvorova (7):
>   hw/acpi/pcihp: Enhance acpi_pcihp_disable_root_bus() to support Q35
>   hw/i386/acpi-build: Add ACPI PCI hot-plug methods to Q35
>   hw/acpi/ich9: Enable ACPI PCI hot-plug
>   hw/pci/pcie: Do not set HPC flag if acpihp is used
>   bios-tables-test: Allow changes in DSDT ACPI tables
>   hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35
>   bios-tables-test: Update golden binaries
> 
>  hw/i386/acpi-build.h  |   5 +++
>  include/hw/acpi/ich9.h|   5 +++
>  include/hw/acpi/pcihp.h   |   3 +-
>  include/hw/boards.h   |   1 +
>  hw/acpi/ich9.c|  68 ++
>  hw/acpi/pcihp.c   |  22 +++---
>  hw/acpi/piix4.c   |   4 +-
>  hw/core/machine.c |  19 +
>  hw/i386/acpi-build.c  |  32 --
>  hw/i386/pc.c  |   4 +-
>  hw/i386/pc_q35.c  |   8 
>  hw/pci/pcie.c |  11 -
>  tests/data/acpi/q35/DSDT  | Bin 7859 -> 8289 bytes
>  tests/data/acpi/q35/DSDT.acpihmat | Bin 9184 -> 9614 bytes
>  tests/data/acpi/q35/DSDT.bridge   | Bin 7877 -> 11003 bytes
>  tests/data/acpi/q35/DSDT.cphp | Bin 8323 -> 8753 bytes
>  tests/data/acpi/q35/DSDT.dimmpxm  | Bin 9513 -> 9943 bytes
>  tests/data/acpi/q35/DSDT.ipmibt   | Bin 7934 -> 8364 bytes
>  tests/data/acpi/q35/DSDT.memhp| Bin 9218 -> 9648 bytes
>  tests/data/acpi/q35/DSDT.mmio64   | Bin 8990 -> 9419 bytes
>  tests/data/acpi/q35/DSDT.nohpet   | Bin 7717 -> 8147 bytes
>  tests/data/acpi/q35/DSDT.numamem  | Bin 7865 -> 8295 bytes
>  tests/data/acpi/q35/DSDT.tis  | Bin 8465 -> 8894 bytes
>  23 files changed, 161 insertions(+), 21 deletions(-)
> 
> -- 
> 2.30.2




Re:Re:Re: set qemu support serial crtscts

2021-05-23 Thread 付小明
Hi,  Peter and all developers
Do you solve this question? I am looking forward to your reply. Thank you 
very much



At 2021-04-15 13:38:08, "付小明"  wrote:

Hi, Peter and all developers
 Yes, I need to describe it clearly. 
1、I want make the qemu serial support crtscts
2、Problem: the machine's guest is windows, the machine's host is centos.  The 
qemu emulation pci-serial.
A printer (model is BTP-2001cp)use the serial to print. The machine connect the 
printer by serial port.
The data printed by the printer is missing some data.You can see picture 1 is 
lost some data, picture 2  is normal.Because the printer need hardware flow 
control.
I modify /chardev/char-serial.c to support hardware flow control. Then the data 
printed by the printer is normal.
The CRTSCTS is (not in POSIX) Enable RTS/CTS (hardware) flow control.  
We can acquire some information by  "stty -F /dev/ttyS0 -a ". We can set the 
CRTSCTS by "stty -F /dev/ttyS0 crtscts"
3、I have some question
1) Does the qemu init all pci-serial port crtscts or raw? For example, I make 
the qemu configure with 4 pci-serial port.
 -chardev tty,id=charserial0,path=/dev/ttyS0,logfile=/dev/fdset/4,logappend=on 
-device pci-serial,chardev=charserial0,id=serial0,bus=pci.0,addr=0x17 -add-fd 
set=5,fd=28 -chardev 
tty,id=charserial1,path=/dev/ttyS1,logfile=/dev/fdset/5,logappend=on -device 
pci-serial,chardev=charserial1,id=serial1,bus=pci.0,addr=0x18 -add-fd 
set=6,fd=29 -chardev 
tty,id=charserial2,path=/dev/ttyS2,logfile=/dev/fdset/6,logappend=on -device 
pci-serial,chardev=charserial2,id=serial2,bus=pci.0,addr=0x19 -add-fd 
set=7,fd=30 -chardev 
tty,id=charserial3,path=/dev/ttyS3,logfile=/dev/fdset/7,logappend=on -device 
pci-serial,chardev=charserial3,id=serial3,bus=pci.0,addr=0x1a
2) if some printer not support (hardware) flow control, we make the qemu 
support (hardware) flow control, have some bad influence?


.



















At 2021-04-14 21:24:20, "Peter Maydell"  wrote:
>On Wed, 14 Apr 2021 at 14:18, 付小明  wrote:
>>
>> HI, I have find qemu serial not support crtscts. This result some
>> machine not communication, because this machine need crtscts
>
>Could you provide more detail, please? For a bug report it
>is useful to know:
> * what you were trying to do
> * what happened
> * what you expected to happen
> * full details like the QEMU command line
> * what guest software you were running
>
>Ideally, the report should have everything we need to be
>able to reproduce the problem ourselves.
>
>thanks
>-- PMM





 

[PATCH v2] hw/virtio: Document *_should_notify() are called within rcu_read_lock()

2021-05-23 Thread Philippe Mathieu-Daudé
Such comments make reviewing this file somehow easier.

Signed-off-by: Philippe Mathieu-Daudé 
---
v2: only one space before end of comment (mst)
---
 hw/virtio/virtio.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index e02544b2df7..130e3568409 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2449,6 +2449,7 @@ static void virtio_set_isr(VirtIODevice *vdev, int value)
 }
 }
 
+/* Called within rcu_read_lock(). */
 static bool virtio_split_should_notify(VirtIODevice *vdev, VirtQueue *vq)
 {
 uint16_t old, new;
@@ -2485,6 +2486,7 @@ static bool vring_packed_need_event(VirtQueue *vq, bool 
wrap,
 return vring_need_event(off, new, old);
 }
 
+/* Called within rcu_read_lock(). */
 static bool virtio_packed_should_notify(VirtIODevice *vdev, VirtQueue *vq)
 {
 VRingPackedDescEvent e;
-- 
2.26.3




[Bug 1887306] Re: qemu-user deadlocks when forked in a multithreaded process

2021-05-23 Thread Alexey Izbyshev
Still reproduces with QEMU 6.0.0.

** Changed in: qemu
   Status: Incomplete => New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1887306

Title:
  qemu-user deadlocks when forked in a multithreaded process

Status in QEMU:
  New

Bug description:
  The following program (also attached) deadlocks when run under QEMU
  user on Linux.

  #include 
  #include 
  #include 
  #include 
  #include 
  #include 

  #define NUM_THREADS 100
  #define NUM_FORKS 10

  pthread_barrier_t barrier;

  void *t(void *arg) {
  for (int i = 0; i < NUM_FORKS; i++) {
  pid_t pid = fork();
  if (pid < 0)
  abort();
  if (!pid)
  _exit(0);
  if (waitpid(pid, NULL, 0) < 0)
  abort();
  }
  //pthread_barrier_wait(&barrier);
  return NULL;
  }

  int main(void) {
  pthread_barrier_init(&barrier, NULL, NUM_THREADS);
  pthread_t ts[NUM_THREADS];
  for (size_t i = 0; i < NUM_THREADS; i++) {
  if (pthread_create(&ts[i], NULL, t, NULL))
  abort();
  }
  for (size_t i = 0; i < NUM_THREADS; i++) {
  pthread_join(ts[i], NULL);
  }
  printf("Done: %d\n", getpid());
  return 0;
  }

  To reproduce:
  $ gcc test.c -pthread
  $ while qemu-x86_64 ./a.out; do :; done

  (Be careful, Ctrl-C/SIGINT doesn't kill the deadlocked child).

  Larger values of NUM_THREADS/NUM_FORKS lead to more often deadlocks.
  With the values above it often deadlocks on the first try on my
  machine. When it deadlocks, there is a child qemu process with two
  threads which is waited upon by one of the worker threads of the
  parent.

  I tried to avoid the deadlock by serializing fork() with a mutex, but
  it didn't help. However, ensuring that no thread exits until all forks
  are done (by adding a barrier to t()) does seem to help, at least, the
  program above could run for a half an hour until I terminated it.

  Tested on QEMU 5.0.0, 4.2.0 and 2.11.1, with x86_64 and AArch64 linux-
  user targets.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1887306/+subscriptions



Re: [PATCH qemu v20] spapr: Implement Open Firmware client interface

2021-05-23 Thread BALATON Zoltan

On Sun, 23 May 2021, Alexey Kardashevskiy wrote:

On 22/05/2021 23:01, BALATON Zoltan wrote:

On Sat, 22 May 2021, Alexey Kardashevskiy wrote:

On 21/05/2021 19:05, BALATON Zoltan wrote:

On Fri, 21 May 2021, Alexey Kardashevskiy wrote:

On 21/05/2021 07:59, BALATON Zoltan wrote:

On Thu, 20 May 2021, Alexey Kardashevskiy wrote:

The PAPR platform describes an OS environment that's presented by
a combination of a hypervisor and firmware. The features it specifies
require collaboration between the firmware and the hypervisor.

Since the beginning, the runtime component of the firmware (RTAS) has
been implemented as a 20 byte shim which simply forwards it to
a hypercall implemented in qemu. The boot time firmware component is
SLOF - but a build that's specific to qemu, and has always needed to 
be

updated in sync with it. Even though we've managed to limit the amount
of runtime communication we need between qemu and SLOF, there's some,
and it has become increasingly awkward to handle as we've implemented
new features.

This implements a boot time OF client interface (CI) which is
enabled by a new "x-vof" pseries machine option (stands for "Virtual 
Open

Firmware). When enabled, QEMU implements the custom H_OF_CLIENT hcall
which implements Open Firmware Client Interface (OF CI). This allows
using a smaller stateless firmware which does not have to manage
the device tree.

The new "vof.bin" firmware image is included with source code under
pc-bios/. It also includes RTAS blob.

This implements a handful of CI methods just to get -kernel/-initrd
working. In particular, this implements the device tree fetching and
simple memory allocator - "claim" (an OF CI memory allocator) and 
updates

"/memory@0/available" to report the client about available memory.

This implements changing some device tree properties which we know how
to deal with, the rest is ignored. To allow changes, this skips
fdt_pack() when x-vof=on as not packing the blob leaves some room for
appending.

In absence of SLOF, this assigns phandles to device tree nodes to make
device tree traversing work.

When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.

This adds basic instances support which are managed by a hash map
ihandle -> [phandle].

Before the guest started, the used memory is:
0..e60 - the initial firmware
8000..1 - stack
40.. - kernel
3ea.. - initramdisk

This OF CI does not implement "interpret".

Unlike SLOF, this does not format uninitialized nvram. Instead, this
includes a disk image with pre-formatted nvram.

With this basic support, this can only boot into kernel directly.
However this is just enough for the petitboot kernel and initradmdisk 
to
boot from any possible source. Note this requires reasonably recent 
guest

kernel with:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735 
The immediate benefit is much faster booting time which especially

crucial with fully emulated early CPU bring up environments. Also this
may come handy when/if GRUB-in-the-userspace sees light of the day.

This separates VOF and sPAPR in a hope that VOF bits may be reused by
other POWERPC boards which do not support pSeries.

This is coded in assumption that later on we might be adding support 
for

booting from QEMU backends (blockdev is the first candidate) without
devices/drivers in between as OF1275 does not require that and
it is quite easy to so.

Signed-off-by: Alexey Kardashevskiy 
---

The example command line is:

/home/aik/pbuild/qemu-killslof-localhost-ppc64/qemu-system-ppc64 \
-nodefaults \
-chardev stdio,id=STDIO0,signal=off,mux=on \
-device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
-mon id=MON0,chardev=STDIO0,mode=readline \
-nographic \
-vga none \
-enable-kvm \
-m 8G \
-machine 
pseries,x-vof=on,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken,cap-ccf-assist=off 
\

-kernel pbuild/kernel-le-guest/vmlinux \
-initrd pb/rootfs.cpio.xz \
-drive 
id=DRIVE0,if=none,file=./p/qemu-killslof/pc-bios/vof-nvram.bin,format=raw 
\

-global spapr-nvram.drive=DRIVE0 \
-snapshot \
-smp 8,threads=8 \
-L /home/aik/t/qemu-ppc64-bios/ \
-trace events=qemu_trace_events \
-d guest_errors \
-chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.tmux26 \
-mon chardev=SOCKET0,mode=control

---
Changes:
v20:
* compile vof.bin with -mcpu=power4 for better compatibility
* s/std/stw/ in entry.S to make it work on ppc32
* fixed dt_available property to support both 32 and 64bit
* shuffled prom_args handling code
* do not enforce 32bit in MSR (again, to support 32bit platforms)



[...]

diff --git a/default-configs/devices/ppc64-softmmu.mak 
b/default-configs/devices/ppc64-softmmu.mak

index ae0841fa3a18..9fb201dfacfa 100644
--- a/default-configs/devices/ppc64-softmmu.mak
+++ b/default-configs/devices/ppc64-softmmu.mak
@@ -9,3 +9,4 @@ CONFIG_POWERNV=y
 # For pSeries
 CONFIG_PSERIES=y
 CONFIG_NVDIMM=y
+CONFIG_VOF=y
diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
index e51e0e5e5ac6..964510

Re: [PATCH qemu v20] spapr: Implement Open Firmware client interface

2021-05-23 Thread BALATON Zoltan

On Sun, 23 May 2021, Alexey Kardashevskiy wrote:

On 23/05/2021 01:02, BALATON Zoltan wrote:

On Sat, 22 May 2021, BALATON Zoltan wrote:

On Sat, 22 May 2021, Alexey Kardashevskiy wrote:

VOF itself does not prints anything in this patch.


However it seems to be needed for linux as the first thing it does seems 
to be getting /chosen/stdout and calls exit if it returns nothing. So I'll 
need this at least for linux. (I think MorphOS may also query it to print 
a banner or some messages but not sure it needs it, at least it does not 
abort right away if not found.)


but to see Linux output do I need a stdout in VOF or it will just open 
the serial with its own driver and use that?
So I'm not sure what's the stdout parts in the current vof patch does 
and if I need that for anything. I'll try to experiment with it some 
more but fixing the ld and Kconfig seems to be enough to get it work for 
me.


So for the client to print something, /chosen/stdout needs to have a 
valid ihandle.
The only way to get a valid ihandle is having a valid phandle which 
vof_client_open() can open.
A valid phandle is a phandle of any node in the device tree. On spapr we 
pick some spapr-vty, open it and store in /chosen/stdout.


From this point output from the client can be seen via a tracepoint.


I've got it now. Looking at the original firmware device tree dump:

https://osdn.net/projects/qmiga/wiki/SubprojectPegasos2/attach/PegasosII_OFW-Dump.txt 

I see that /chosen/stdout points to "screen" which is an alias to 
/bootconsole. Just adding an empty /bootconsole node in the device tree and 
vof_client_open_store() that as /chosen/stdout works and I get output via 
vof_write traces so this is enough for now to test Linux. Properly 
connecting a serial backend can thus be postponed.


So with this the Linux kernel does not abort on the first device tree 
access but starts to decompress itself then the embedded initrd and crashes 
at calling setprop:


[...]
vof_client_handle: setprop

Thread 4 "qemu-system-ppc" received signal SIGSEGV, Segmentation fault.
(gdb) bt
#0  0x in  ()
#1  0x55a5c2bf in vof_setprop
     (vof=0x748e9420, vallen=4, valaddr=, 
pname=, nodeph=8, fdt=0x7fff8aaff010, ms=0x564f8800)

     at ../hw/ppc/vof.c:308
#2  0x55a5c2bf in vof_client_handle
     (nrets=1, rets=0x748e93f0, nargs=4, args=0x748e93c0, 
service=0x748e9460 "setprop",
  vof=0x748e9420, fdt=0x7fff8aaff010, ms=0x564f8800) at 
../hw/ppc/vof.c:842

#3  0x55a5c2bf in vof_client_call
     (ms=0x564f8800, vof=vof@entry=0x5662a3d0, 
fdt=fdt@entry=0x7fff8aaff010, args_real=args_real@entry=23580472)

     at ../hw/ppc/vof.c:935

loooks like it's trying to set /chosen/linux,initrd-start:


It is not horribly clear why it crashed though.


It crashed becuase I had TYPE_VOF_MACHINE_IF but did not set a setprop 
callback and it tried to call that here. Adding a {return true;} empty 
callback avoids this.



(gdb) up
#1  0x55a5c2bf in vof_setprop (vof=0x748e9420, vallen=4, 
valaddr=, pname=, nodeph=8,

     fdt=0x7fff8aaff010, ms=0x564f8800) at ../hw/ppc/vof.c:308
308    if (!vmc->setprop(ms, nodepath, propname, val, vallen)) {
(gdb) p nodepath
$1 = "/chosen\000\060/rPC,750CXE/", '\000' 
(gdb) p propname
$2 = 
"linux,initrd-start\000linux,initrd-end\000linux,cmdline-timeout\000bootarg" 
(gdb) p val

$3 = 

I think I need the callback for setprop in TYPE_VOF_MACHINE_IF. I can copy 
spapr_vof_setprop() but some explanation on why that's needed might help. 
Ciould I just do fdt_setprop in my callback as vof_setprop() would do 
without a machine callback or is there some special handling needed for 
these properties?


The short answer is yes, you do not need TYPE_VOF_MACHINE_IF.

The long answer is that we build the FDT on spapr twice:
1. at the reset time and
2. after "ibm,client-arhitecture-support" (early in the boot the spapr 
paravirtual client says what it supports - ISA level, MMU features, etc)


Between 1 and 2 the kernel moves initrd and we do not update the QEMU's 
version of its location, the tree at 2) will have the old values.


So for that reason I have TYPE_VOF_MACHINE_IF. You most definitely do not 
need it.


I need TYPE_VOF_MACHINE_IF because that has the quiesce callback that I 
need to shut VOF down when the guest is finished with it otherwise it 
would crash later (more on this in next message). But since I shut down 
VOF here I don't need to remember changes to the FDT so I can just use an 
empty setprop callback. (I wouldn't even need that if VOF would check that 
a callback is non-NULL before calling it.)


Regards,
BALATON Zoltan

Re: [PATCH qemu v20] spapr: Implement Open Firmware client interface

2021-05-23 Thread BALATON Zoltan

On Sun, 23 May 2021, Alexey Kardashevskiy wrote:

On 23/05/2021 02:46, BALATON Zoltan wrote:

On Sat, 22 May 2021, BALATON Zoltan wrote:

On Sat, 22 May 2021, BALATON Zoltan wrote:

On Sat, 22 May 2021, Alexey Kardashevskiy wrote:

VOF itself does not prints anything in this patch.


However it seems to be needed for linux as the first thing it does seems 
to be getting /chosen/stdout and calls exit if it returns nothing. So 
I'll need this at least for linux. (I think MorphOS may also query it to 
print a banner or some messages but not sure it needs it, at least it 
does not abort right away if not found.)


but to see Linux output do I need a stdout in VOF or it will just open 
the serial with its own driver and use that?
So I'm not sure what's the stdout parts in the current vof patch does 
and if I need that for anything. I'll try to experiment with it some 
more but fixing the ld and Kconfig seems to be enough to get it work 
for me.


So for the client to print something, /chosen/stdout needs to have a 
valid ihandle.
The only way to get a valid ihandle is having a valid phandle which 
vof_client_open() can open.
A valid phandle is a phandle of any node in the device tree. On spapr we 
pick some spapr-vty, open it and store in /chosen/stdout.


From this point output from the client can be seen via a tracepoint.


I've got it now. Looking at the original firmware device tree dump:

https://osdn.net/projects/qmiga/wiki/SubprojectPegasos2/attach/PegasosII_OFW-Dump.txt 

I see that /chosen/stdout points to "screen" which is an alias to 
/bootconsole. Just adding an empty /bootconsole node in the device tree 
and vof_client_open_store() that as /chosen/stdout works and I get output 
via vof_write traces so this is enough for now to test Linux. Properly 
connecting a serial backend can thus be postponed.


Using /failsafe instead of /bootconsole is even better because Linux then 
adds console=ttyS0 to the bootargs by default as it knows that's a serial 
port.


When linux boots so far that it can use whatever is passed in "console=" - 
the client interface is done pretty much and the output happens without it.


That's the problem that Linux does not open serial yet when booting with 
VOF but I don't have everyhing in the device tree yet and devices may be 
set up differently when the board firmware haven't run so I'm not sure 
what's missing for Linux to find and open serial. Does anybody happen to 
know?


So with this the Linux kernel does not abort on the first device tree 
access but starts to decompress itself then the embedded initrd and 
crashes at calling setprop:


[...]
vof_client_handle: setprop

Thread 4 "qemu-system-ppc" received signal SIGSEGV, Segmentation fault.
(gdb) bt
#0  0x in  ()
#1  0x55a5c2bf in vof_setprop
   (vof=0x748e9420, vallen=4, valaddr=, 
pname=, nodeph=8, fdt=0x7fff8aaff010, ms=0x564f8800)

   at ../hw/ppc/vof.c:308
#2  0x55a5c2bf in vof_client_handle
   (nrets=1, rets=0x748e93f0, nargs=4, args=0x748e93c0, 
service=0x748e9460 "setprop",
    vof=0x748e9420, fdt=0x7fff8aaff010, ms=0x564f8800) at 
../hw/ppc/vof.c:842

#3  0x55a5c2bf in vof_client_call
   (ms=0x564f8800, vof=vof@entry=0x5662a3d0, 
fdt=fdt@entry=0x7fff8aaff010, args_real=args_real@entry=23580472)

   at ../hw/ppc/vof.c:935

loooks like it's trying to set /chosen/linux,initrd-start:

(gdb) up
#1  0x55a5c2bf in vof_setprop (vof=0x748e9420, vallen=4, 
valaddr=, pname=, nodeph=8,

   fdt=0x7fff8aaff010, ms=0x564f8800) at ../hw/ppc/vof.c:308
308    if (!vmc->setprop(ms, nodepath, propname, val, vallen)) {
(gdb) p nodepath
$1 = "/chosen\000\060/rPC,750CXE/", '\000' 
(gdb) p propname
$2 = 
"linux,initrd-start\000linux,initrd-end\000linux,cmdline-timeout\000bootarg" 
(gdb) p val

$3 = 

I think I need the callback for setprop in TYPE_VOF_MACHINE_IF. I can copy 
spapr_vof_setprop() but some explanation on why that's needed might help. 
Ciould I just do fdt_setprop in my callback as vof_setprop() would do 
without a machine callback or is there some special handling needed for 
these properties?


Just returning true from the setprop callback of the VofMachineIfClass for 
now to see what it would do and then it gets to all the way of calling 
quiesce. Unfortunately it then tries to call prom_printf on Pegasos2 as 
seen here:


https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/powerpc/kernel/prom_init.c?h=v4.14.233#n3261 

which does not work because I have to shut down vhyp at quiesce 


What is vhyp and why do you have to shut it down?


The vhyp is the TYPE_PPC_VIRTUAL_HYPERVISOR interface that I need to get 
hypercalls working as I don't normally have it on pegasos2 so I need to 
install that for VOF but have to tear it down on quiece otherwise it would 
conflict with things later (at least the assert below but guests also use 
syscalls and I'm not sure that would also conflict). It works tho

Re: [PATCH qemu v20] spapr: Implement Open Firmware client interface

2021-05-23 Thread BALATON Zoltan

On Sun, 23 May 2021, Alexey Kardashevskiy wrote:

On 22/05/2021 23:08, BALATON Zoltan wrote:

On Sat, 22 May 2021, Alexey Kardashevskiy wrote:

On 22/05/2021 05:57, BALATON Zoltan wrote:

On Fri, 21 May 2021, BALATON Zoltan wrote:

On Fri, 21 May 2021, Alexey Kardashevskiy wrote:

On 21/05/2021 07:59, BALATON Zoltan wrote:

On Thu, 20 May 2021, Alexey Kardashevskiy wrote:

The PAPR platform describes an OS environment that's presented by
a combination of a hypervisor and firmware. The features it specifies
require collaboration between the firmware and the hypervisor.

Since the beginning, the runtime component of the firmware (RTAS) has
been implemented as a 20 byte shim which simply forwards it to
a hypercall implemented in qemu. The boot time firmware component is
SLOF - but a build that's specific to qemu, and has always needed to 
be
updated in sync with it. Even though we've managed to limit the 
amount

of runtime communication we need between qemu and SLOF, there's some,
and it has become increasingly awkward to handle as we've implemented
new features.

This implements a boot time OF client interface (CI) which is
enabled by a new "x-vof" pseries machine option (stands for "Virtual 
Open

Firmware). When enabled, QEMU implements the custom H_OF_CLIENT hcall
which implements Open Firmware Client Interface (OF CI). This allows
using a smaller stateless firmware which does not have to manage
the device tree.

The new "vof.bin" firmware image is included with source code under
pc-bios/. It also includes RTAS blob.

This implements a handful of CI methods just to get -kernel/-initrd
working. In particular, this implements the device tree fetching and
simple memory allocator - "claim" (an OF CI memory allocator) and 
updates

"/memory@0/available" to report the client about available memory.

This implements changing some device tree properties which we know 
how

to deal with, the rest is ignored. To allow changes, this skips
fdt_pack() when x-vof=on as not packing the blob leaves some room for
appending.

In absence of SLOF, this assigns phandles to device tree nodes to 
make

device tree traversing work.

When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.

This adds basic instances support which are managed by a hash map
ihandle -> [phandle].

Before the guest started, the used memory is:
0..e60 - the initial firmware
8000..1 - stack
40.. - kernel
3ea.. - initramdisk

This OF CI does not implement "interpret".

Unlike SLOF, this does not format uninitialized nvram. Instead, this
includes a disk image with pre-formatted nvram.

With this basic support, this can only boot into kernel directly.
However this is just enough for the petitboot kernel and initradmdisk 
to
boot from any possible source. Note this requires reasonably recent 
guest

kernel with:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735 
The immediate benefit is much faster booting time which especially
crucial with fully emulated early CPU bring up environments. Also 
this

may come handy when/if GRUB-in-the-userspace sees light of the day.

This separates VOF and sPAPR in a hope that VOF bits may be reused by
other POWERPC boards which do not support pSeries.

This is coded in assumption that later on we might be adding support 
for

booting from QEMU backends (blockdev is the first candidate) without
devices/drivers in between as OF1275 does not require that and
it is quite easy to so.

Signed-off-by: Alexey Kardashevskiy 
---

The example command line is:

/home/aik/pbuild/qemu-killslof-localhost-ppc64/qemu-system-ppc64 \
-nodefaults \
-chardev stdio,id=STDIO0,signal=off,mux=on \
-device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
-mon id=MON0,chardev=STDIO0,mode=readline \
-nographic \
-vga none \
-enable-kvm \
-m 8G \
-machine 
pseries,x-vof=on,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken,cap-ccf-assist=off 
\

-kernel pbuild/kernel-le-guest/vmlinux \
-initrd pb/rootfs.cpio.xz \
-drive 
id=DRIVE0,if=none,file=./p/qemu-killslof/pc-bios/vof-nvram.bin,format=raw 
\

-global spapr-nvram.drive=DRIVE0 \
-snapshot \
-smp 8,threads=8 \
-L /home/aik/t/qemu-ppc64-bios/ \
-trace events=qemu_trace_events \
-d guest_errors \
-chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.tmux26 \
-mon chardev=SOCKET0,mode=control

---
Changes:
v20:
* compile vof.bin with -mcpu=power4 for better compatibility
* s/std/stw/ in entry.S to make it work on ppc32
* fixed dt_available property to support both 32 and 64bit
* shuffled prom_args handling code
* do not enforce 32bit in MSR (again, to support 32bit platforms)



[...]

diff --git a/default-configs/devices/ppc64-softmmu.mak 
b/default-configs/devices/ppc64-softmmu.mak

index ae0841fa3a18..9fb201dfacfa 100644
--- a/default-configs/devices/ppc64-softmmu.mak
+++ b/default-configs/devices/ppc64-softmmu.mak
@@ -9,3 +9,4 @@ CONFIG_POWERNV=y
 # For pSeries
 CONFIG_PSERIES=y
 CONFIG_NVDIMM=y
+CONFIG_VOF=y
diff --git a/hw/ppc/K

Re: The latest Qemu release can't bootup VM with latest guest kernel.

2021-05-23 Thread Gal Hammer
Hi Yang,

On Thu, 20 May 2021 at 11:27, Yang Zhong  wrote:

> Hello all,
>
> I found the latest Qemu release can't bootup the VM with latest guest
> kernel(>5.13).
>
> The normal v6.0.0 release is good to bootup the latest guest kernel.
>
> There are two issues were found
> 1. Guest kernel panic.
> 2. kvm disabled by bios
>
> The panic log as below:
> [2.250024] BUG: unable to handle page fault for address:
> ac06c55f
> [2.252226] #PF: supervisor write access in kernel mode
> [2.253892] #PF: error_code(0x0003) - permissions violation
> [2.255671] PGD 5940e067 P4D 5940f067 PUD 59410063 PMD 580001e1
> [2.257567] Oops: 0003 [#1] SMP NOPTI
> [2.258738] CPU: 2 PID: 313 Comm: systemd-udevd Not tainted 5.13.0-rc1+
> #1
> [2.260899] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> 0.0.0 02/06/2015
> [2.263375] RIP: 0010:__send_ipi_mask+0x1bf/0x240
> [2.264855] Code: c0 48 c7 44 24 18 00 00 00 00 e9 48 ff ff ff 48 89 d0
> 4c 09 c8 74 1b 49 63 d7 48 63 74 24 0c b8 0a 00 00 00 4c 89 cb 4c 89 d1
> <0f> 01 d9 48 85 c0 78 4a 48 f7 04 24 00 02 00 00 0f 84 80 fe ff ff
> [2.270643] RSP: 0018:ff591a62c0193ab0 EFLAGS: 00010006
> [2.272277] RAX: 000a RBX: 0009 RCX:
> 
> [2.274482] RDX:  RSI: 00fc RDI:
> ff13a83dc003c830
> [2.276663] RBP: ff591a62c0193b08 R08: 0004 R09:
> 0009
> [2.278866] R10:  R11:  R12:
> 
> [2.281065] R13: ff13a83dc003c830 R14: 00011580 R15:
> 
> [2.283272] FS:  7f23ebd07940() GS:ff13a83e3bd0()
> knlGS:
> [2.285794] CS:  0010 DS:  ES:  CR0: 80050033
> [2.287574] CR2: ac06c55f CR3: 000106ce2003 CR4:
> 00771ee0
> [2.289757] DR0:  DR1:  DR2:
> 
> [2.291972] DR3:  DR6: fffe0ff0 DR7:
> 0400
> [2.294177] PKRU: 5554
> [2.295043] Call Trace:
> [2.295820]  kvm_smp_send_call_func_ipi+0xe/0x60
> [2.297220]  smp_call_function_many_cond+0x25d/0x2a0
> [2.298772]  ? flush_tlb_one_kernel+0x20/0x20
> [2.300145]  on_each_cpu_cond_mask+0x1e/0x20
> [2.301514]  flush_tlb_kernel_range+0x8d/0x90
> [2.302799]  __purge_vmap_area_lazy+0xc1/0x6a0
> [2.304097]  ? cpumask_next+0x1f/0x20
> [2.305160]  ? purge_fragmented_blocks_allcpus+0x3d/0x210
> [2.306686]  _vm_unmap_aliases+0xf1/0x120
> [2.307861]  change_page_attr_set_clr+0x95/0x280
> [2.309203]  set_memory_ro+0x26/0x30
> [2.310259]  ? 0xc00f7000
> [2.311214]  module_enable_ro.part.58+0x62/0xc0
> [2.312417]  do_init_module+0x17a/0x230
> [2.313460]  load_module+0x1a30/0x1b00
> [2.314463]  ? __do_sys_finit_module+0xad/0x110
> [2.315702]  __do_sys_finit_module+0xad/0x110
> [2.316890]  do_syscall_64+0x39/0x80
> [2.317868]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [2.319226] RIP: 0033:0x7f23ea8f32bd
>
>
> I also used the bisect to get the bad commit id:
> f5cc5a5c168674f84bf061cdb307c2d25fba5448
>
> This issue is known issue? or some fixs are ready to fix those issues?
> thanks!
>

What's your qemu command line?

I'm also having a kernel crash (although I think mine is with a different
call stack) when using "-cpu host". The crash doesn't occur when I use
"kvm64" cpu type.

Gal.


>
> Regards,
>
> Yang
>
>
>
>


Re: notdirty_write thrashing in simple for loop

2021-05-23 Thread BALATON Zoltan

Hello,

On Tue, 18 May 2021, Mark Watson wrote:

I'm trying to implement my own machine for amiga emulation using a software
cpu and fpga hardware. For this I have built my own machine which consists
of a large malloced ram block and some fpga hardware mmapped elsewhere into
the memory space.

I'm using qemu to emulate a 68040 on an arm cortex a9 host in system mode.

It is working, though I'm investigating a strange performance issue.

I'm looking for advice on where to look next in debugging this from the
specialist(s) of accel/tcg/cputlb.c please.


I think you need to be more specific about details or even better provide 
a way to reproduce it without your hardware if possible otherwise people 
will not get what you're seeing. From the above it's not clear to me if 
you're emulating an fpga hardware in QEMU or actually run with the fpga 
(supposedly implementing the Amiga chipset) in the virtual machine's 
memory so accesses to some addresses will do something in hardware (in 
which case it may be difficult to reproduce without it and also could be 
the source of problems so hard to tell what might be causing your issue.)


(Is this related to pistorm or something based on that for full Amiga 
emulation without Amiga hardware? Just insterested, unrelated to this 
thread.)



To investigate the performance issue I tried to break it down to the
simplest possible case. I can reproduce it with a simple for loop (compiled
without optimisation).
   for (int i=0;i!=0xff;++i)
{
if ((i&0x)==0)
{
}
}


So you do nothing in the loop just test for the loop variable and this 
sometimes runs slow?



Running it in user mode on the same host it takes ~0.6 seconds. In the
built-in 'virtual' m68k machine running linux it takes 1.3 seconds.
However in my machine under amigaos I'm seeing it typically taking 5 and a
half minutes! Occasionally it seems to run at the correct speed of <2
seconds, though I have yet to identify why. These are the logs of the
captured code before it goes into the main chain loop.
qemu_slow_stuck_fragment.log



The log does not make much sense to me but I'm also not an expert on TCG 
and ARM. Why do you have faults while running a simple empty loop and what 
are those? Is something flushing the TLB for some reason or is this just 
becuase of the debug logging? I think there are some -d oprions for mmu 
debugging that may give more info on TLB usage.



I have verified that this performance change is not due to slow fpga memory
area access, i.e. there are no accesses to that memory region during this.


OK so then it should be possible to reproduce without that hardware? If so 
that would help people to understand the issue and give advice but I 
see that reproducing may need understanding the issue first.



I took a look in gdb while running this loop to see what is going on.
Initially I was surprised that I didn't find the code in 'OUT:', however I
guess it makes sense that it has to call into the framework for memory
access. I noticed that a lot of calls to glib are made and see


I rarely use gdb with QEMU so not sure but normally with TCG in_asm and 
out_asm debug you'll only see these when the TB is first translated not 
when you run it later because then the translated code is run from the TB 
cache. I think you can kind of disable this with -singlestep that makes 
TBs just a single instruction and may change caching. At least with that I 
see all instructions all the time in -d in_asm so this may help debugging 
although it makes things much slower.



g_tree_lookup called a lot. This is caused by notdirty_write being called
'000s of times and each time going into the page_collection_lock and
tb_invalidate_phys_page_fast. I presume this is happening each time that
"i" is incremented on the stack, which clearly has a huge overhead.


There are only a few places notdirty_write is called from so you should be 
able to identify which of those is firing (if all else fails you could add 
debug logs but there may be trace points to enable too). Once we get which 
place it's coming from then maybe people could tell why that could happen. 
Don't know if you already know QEMU debug options, I have some things 
collected here that I've used while implementing machine emulation here:


https://osdn.net/projects/qmiga/wiki/DeveloperTips


Even being able to get a proper stack trace from gdb would be very helpful
to understand this. I tried to configure qemu with '--enable-debug' but
still do not get a proper stack if i attach to it. I'm not sure if this is
the case due to it running dynamically compiled code before calling into
this.


The --enable-debug adds debug symbols to QEMU but if it's called from 
generated code then you'll probably see that as the source of the calls so 
hard to tell what has put that there. Yet it may help if you could show 
some back traces you got in case that makes sense to somebody who knows 
about TCG. Al

Re: notdirty_write thrashing in simple for loop

2021-05-23 Thread Mark Watson
Hi

On Sun, 23 May 2021 at 15:41, BALATON Zoltan  wrote:

> I think you need to be more specific about details or even better provide
> a way to reproduce it without your hardware if possible otherwise people
> will not get what you're seeing. From the above it's not clear to me if
> you're emulating an fpga hardware in QEMU or actually run with the fpga
> (supposedly implementing the Amiga chipset) in the virtual machine's
> memory so accesses to some addresses will do something in hardware (in
> which case it may be difficult to reproduce without it and also could be
> the source of problems so hard to tell what might be causing your issue.)
>

I managed to reproduce now locally without the fpga, on my x86 system.

The issue seems to be the layout of where the Amiga puts code and the
stack. It does not use virtual memory and each program seems to get the
stack just below the code. So whenever the code increments i, it writes to
the page and qemu does a lookup in a map to potentially invalidate the
code.


> (Is this related to pistorm or something based on that for full Amiga
> emulation without Amiga hardware? Just insterested, unrelated to this
> thread.)
>
The minimig is a recreation of the Amiga hardware in the FPGA. In addition
to its own dedicated board, it has been ported to many boards: Turbo
Chameleon, Mist, MiSTer (DE10 Nano with expansion). In the MiSTer an SOC
FPGA chip is used, which has dual arm codes and an fpga on the same
silicon, with high performance bridges beween them.
Pistorm and Buffee are fairly similar, in that they are replacing the 68K
cpu with an emulated cpu, but with intefaces to real hardware. As I
undetstand it, the former uses Musashi and the latter they are writing
their own JIT.


> So you do nothing in the loop just test for the loop variable and this
> sometimes runs slow?
>
Yes in fact even without the test in the loop. Just a loop incrementing i,
where i is on the stack. As I now found out it seems to be an issue if the
code and the variable i are in the same page.

Now I could try to modify the software on the amiga to split stack and
code. I do wonder if some kind of caching layer could be added to qemu so
that repeated invalidates do not take so much cpu time.


> Also verify that these excessive calls to notdirty_write does
> only happen when it's running slow so it's really the source of the
> problems and not something normal otherwise.
>
I have now confirmed this, I enable the trace_event on the notdirty to
confirm.

Many thanks for the qemu and dgb debugging tips, much appreciated. I will
real them.

Mark Watson


Re: [PATCH qemu v20] spapr: Implement Open Firmware client interface

2021-05-23 Thread BALATON Zoltan

On Sun, 23 May 2021, BALATON Zoltan wrote:

On Sun, 23 May 2021, Alexey Kardashevskiy wrote:
One thing to note about PCI is that normally I think the client expects the 
firmware to do PCI probing and SLOF does it. But VOF does not and Linux 
scans PCI bus(es) itself. Might be a problem for you kernel.


I'm not sure what info does MorphOS get from the device tree and what it 
probes itself but I think it may at least need device ids and info about the 
PCI bus to be able to access the config regs, after that it should set the 
devices up hopefully. I could add these from the board code to device tree so 
VOF does not need to do anything about it. However I'm not getting to that 
point yet because it crashes on something that it's missing and couldn't yet 
find out what is that.


I'd like to get Linux working now as that would be enough to test this and 
then if for MorphOS we still need a ROM it's not a problem if at least we can 
boot Linux without the original firmware. But I can't make Linux open a 
serial console and I don't know what it needs for that. Do you happen to 
know? I've looked at the sources in Linux/arch/powerpc but not sure how it 
would find and open a serial port on pegasos2. It seems to work with the 
board firmware and now I can get it to boot with VOF but then it does not 
open serial so it probably needs something in the device tree or expects the 
firmware to set something up that we should add in pegasos2.c when using VOF.


I've now found that Linux uses rtas methods read-pci-config and 
write-pci-config for PCI access on pegasos2 so this means that we'll 
probably need rtas too (I hoped we could get away without it if it were 
only used for shutdown/reboot or so but seems Linux needs it for PCI as 
well and does not scan the bus and won't find some devices without it).


While VOF can do rtas, this causes a problem with the hypercall method 
using sc 1 that goes through vhyp but trips the assert in ppc_store_sdr1() 
so cannot work after guest is past quiesce. So the question is why is that 
assert there and would using sc 1 for hypercalls on pegasos2 cause other 
problems later even if the assert could be removed? Can somebody who knows 
more about it explain this please? If this cannot be resolved then we may 
need a different hypercall method on pegasos2 (I've considered MOL OSI or 
are there other options? I may use some advice from people who know it 
better, especially the possible interaction with KVM later as the long 
term goal with pegasos2 is to be able to run with KVM on PPC hardware 
eventually.) But this also means that if that assert cannot be dropped or 
there may be other problems with sc 1 hypercalls then we maybe cannot have 
the same vof.bin and we'll need a separate version that I would like to 
avoid if possible so if there's a simple way to keep it working or make 
vof.bin use alternate hypercall method without needing a separate binary 
that would be the direction I'd tend to go. Even if we need a seoarate 
version I'd like to keep as much common as possible.


I've tested that the missing rtas is not the reason for getting no output 
via serial though, as even when disabling rtas on pegasos2.rom it boots 
and I still get serial output just some PCI devices are not detected (such 
as USB, the video card and the not emulated ethernet port but these are 
not fatal so it might even work as a first try without rtas, just to boot 
a Linux kernel for testing it would be enough if I can fix the serial 
output). I still don't know why it's not finding serial but I think it may 
be some missing or wrong info in the device tree I generat. I'll try to 
focus on this for now and leave the above rtas question for later.


Regards,
BALATON Zoltan



Re: notdirty_write thrashing in simple for loop

2021-05-23 Thread BALATON Zoltan

On Sun, 23 May 2021, Mark Watson wrote:

Hi

On Sun, 23 May 2021 at 15:41, BALATON Zoltan  wrote:

I think you need to be more specific about details or even better provide
a way to reproduce it without your hardware if possible otherwise people
will not get what you're seeing. From the above it's not clear to me if
you're emulating an fpga hardware in QEMU or actually run with the fpga
(supposedly implementing the Amiga chipset) in the virtual machine's
memory so accesses to some addresses will do something in hardware (in
which case it may be difficult to reproduce without it and also could be
the source of problems so hard to tell what might be causing your issue.)


I managed to reproduce now locally without the fpga, on my x86 system.

The issue seems to be the layout of where the Amiga puts code and the
stack. It does not use virtual memory and each program seems to get the
stack just below the code. So whenever the code increments i, it writes to
the page and qemu does a lookup in a map to potentially invalidate the
code.


That's probably enough for people who can give advice to understand the 
problem. I think I get it but can't help more as I don't know TCG or QEMU 
internals very well.



(Is this related to pistorm or something based on that for full Amiga
emulation without Amiga hardware? Just insterested, unrelated to this
thread.)


The minimig is a recreation of the Amiga hardware in the FPGA. In addition
to its own dedicated board, it has been ported to many boards: Turbo
Chameleon, Mist, MiSTer (DE10 Nano with expansion). In the MiSTer an SOC
FPGA chip is used, which has dual arm codes and an fpga on the same
silicon, with high performance bridges beween them.
Pistorm and Buffee are fairly similar, in that they are replacing the 68K
cpu with an emulated cpu, but with intefaces to real hardware. As I
undetstand it, the former uses Musashi and the latter they are writing
their own JIT.


I see. Interesting idea if you already have such an fpga SoC, then you 
can make good use of the ARM cores that way.



So you do nothing in the loop just test for the loop variable and this
sometimes runs slow?


Yes in fact even without the test in the loop. Just a loop incrementing i,
where i is on the stack. As I now found out it seems to be an issue if the
code and the variable i are in the same page.

Now I could try to modify the software on the amiga to split stack and
code. I do wonder if some kind of caching layer could be added to qemu so
that repeated invalidates do not take so much cpu time.


I don't know but added maintainers of accel/tcg/cputlb.c to cc to get 
their attention. You can get this info from MAINTAINERS file and more 
easily with:


scripts/get_maintainer.pl -f accel/tcg/cputlb.c

For reference and more backfround info here's a link to Mark's original 
message:


https://lists.nongnu.org/archive/html/qemu-devel/2021-05/msg05581.html

Regards,
BALATON Zoltan


Also verify that these excessive calls to notdirty_write does
only happen when it's running slow so it's really the source of the
problems and not something normal otherwise.


I have now confirmed this, I enable the trace_event on the notdirty to
confirm.

Many thanks for the qemu and dgb debugging tips, much appreciated. I will
real them.

Mark Watson





Re: [PATCH] linux-user/syscall: zero-init msghdr in do_sendrecvmsg_locked

2021-05-23 Thread Kenta Iwasaki
Doing a ping for this patch.
https://patchew.org/QEMU/20210516091536.1042693-1-ke...@lithdew.net/

Best regards,
Kenta Iwasaki

On Sun, 16 May 2021 at 21:57, Kenta Iwasaki  wrote:

> Sure,
>
> The bytes of `msghdr` need to be cleared because the `msghdr` struct
> layout specified in QEMU appears to generalize between the definitions of
> `msghdr` across different libc's and kernels. To appropriately generalize
> `msghdr` across libc's and kernels would either:
>
> 1. require specializing code in do_sendrecvmsg_locked() for each
> individual libc and kernel version, or
> 2. zeroing out all bytes of `msghdr`, b/c certain libc or kernel versions
> may misinterpret the undefined padding bytes that come from misalignment in
> the struct as actual syscall params.
>
> The patch I provided would be going for route #2, given that it's a
> simpler fix for the underlying problem for the short term.
>
> What I believe is the background behind why the struct layout has been a
> problem is because, since the beginning, the Linux kernel has always
> specified the layout of `msghdr` differently from POSIX. Given that this
> implies incompatibility between kernels on how `msghdr` is specified,
> different libc projects such as musl and glibc provide different
> workarounds by laying out `msghdr` differently amongst one another.
>
> A few projects running tests/applications through QEMU have been bitten by
> this, and a solution that one of the projects discovered was that patching
> QEMU to zero-initialize the bytes msghdr the same way my patch does allow
> for compatibility between different `msghdr` layouts across glibc, musl,
> and the Linux kernel:
> https://github.com/void-linux/void-packages/issues/23557#issuecomment-718392360
>
> For some additional useful context, here's a link pointing changes musl
> libc made to laying out `msghdr` b/c of Linux incorrectly laying out
> `msghdr` against the POSIX standard:
> http://git.musl-libc.org/cgit/musl/commit/?id=7168790763cdeb794df52be6e3b39fbb021c5a64
>
> Also, here is a link to the `msghdr` struct layout in musl libc's bleeding
> edge branch as of now:
> https://git.musl-libc.org/cgit/musl/tree/include/sys/socket.h#n22
>
> As for my rationale for sending in this patch, it is because I'm currently
> implementing cross-platform networking in the standard library for the Zig
> programming language, and have run into this exact same problem with
> EMSGSIZE being returned by sendmsg() when tests are run through QEMU on
> x86_64-linux-musl.
>
> My discussions with the Zig community about it alongside debug logs
> regarding the exact parameters being fed to the sendmsg() syscall can be
> found here:
> https://github.com/ziglang/zig/pull/8750#issuecomment-841641576
>
> Hope this gives enough context about the problem and patch, but please do
> let me know if there is any more information that I could provide which
> would help.
>
> Best regards,
> Kenta Iwasaki
>
>
> On Sun, 16 May 2021 at 19:53, Laurent Vivier  wrote:
>
>> Le 16/05/2021 à 11:15, Kenta Iwasaki a écrit :
>> > The mixing of libc and kernel versions of the layout of the `msghdr`
>> > struct causes EMSGSIZE to be returned by sendmsg if the `msghdr` struct
>> > is not zero-initialized (such that padding bytes comprise of
>> > uninitialized memory).
>> >
>> > Other parts of the QEMU codebase appear to zero-initialize the `msghdr`
>> > struct to workaround these struct layout issues, except for
>> > do_sendrecvmsg_locked in linux-user/syscall.c.
>> >
>> > This patch zero-initializes the `msghdr` struct in
>> > do_sendrecvmsg_locked.
>> >
>> > Signed-off-by: Kenta Iwasaki 
>> > ---
>> >  linux-user/syscall.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>> > index 95d79ddc43..f60b7e04d5 100644
>> > --- a/linux-user/syscall.c
>> > +++ b/linux-user/syscall.c
>> > @@ -3337,7 +3337,7 @@ static abi_long do_sendrecvmsg_locked(int fd,
>> struct target_msghdr *msgp,
>> >int flags, int send)
>> >  {
>> >  abi_long ret, len;
>> > -struct msghdr msg;
>> > +struct msghdr msg = { 0 };
>> >  abi_ulong count;
>> >  struct iovec *vec;
>> >  abi_ulong target_vec;
>> >
>>
>> It seems do_sendrecvmsg_locked() initializes all the fields of the
>> structure, I don't see why we
>> need to clear it before use.
>>
>> Could you explain more?
>>
>> Thanks,
>> Laurent
>>
>


Re: [PATCH 05/38] target/riscv: 8-bit Addition & Subtraction Instruction

2021-05-23 Thread Palmer Dabbelt

On Mon, 15 Mar 2021 14:22:58 PDT (-0700), alistai...@gmail.com wrote:

On Fri, Feb 12, 2021 at 10:14 AM LIU Zhiwei  wrote:


Signed-off-by: LIU Zhiwei 


Acked-by: Alistair Francis 


I saw some reviews on the other ones, but since others (like this) just 
have acks and haven't had any other traffic I'm going to start here.


It looks like the latest spec is 0.9.4, but the changelog is pretty 
minimal between 0.9.5 and 0.9.2:


[0.9.2 -> 0.9.3]

* Changed Zp64 name to Zpsfoperand.
* Added Zprvsfextra for RV64 only instructions.
* Removed SWAP16 encoding. It is an alias of PKBT16.
* Fixed few typos and enhanced precision descriptions on imtermediate results.

[0.9.3 -> 0.9.4]

* Fixed few typos and enhanced precision descriptions on imtermediate results.
* Fixed/Changed data types for some intrinsic functions.
* Removed "RV32 Only" for Zpsfoperand.

So I'm just going to stick with reviewing based on the latest spec 
 
and try to keep those differences in mind, assuming we're just tracking 
the latest draft here.



Alistair


---
 target/riscv/helper.h   |  9 +++
 target/riscv/insn32.decode  | 11 
 target/riscv/insn_trans/trans_rvp.c.inc | 79 +
 target/riscv/packed_helper.c| 73 +++
 4 files changed, 172 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 6d622c732a..a69a6b4e84 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1175,3 +1175,12 @@ DEF_HELPER_3(rstsa16, tl, env, tl, tl)
 DEF_HELPER_3(urstsa16, tl, env, tl, tl)
 DEF_HELPER_3(kstsa16, tl, env, tl, tl)
 DEF_HELPER_3(ukstsa16, tl, env, tl, tl)
+
+DEF_HELPER_3(radd8, tl, env, tl, tl)
+DEF_HELPER_3(uradd8, tl, env, tl, tl)
+DEF_HELPER_3(kadd8, tl, env, tl, tl)
+DEF_HELPER_3(ukadd8, tl, env, tl, tl)
+DEF_HELPER_3(rsub8, tl, env, tl, tl)
+DEF_HELPER_3(ursub8, tl, env, tl, tl)
+DEF_HELPER_3(ksub8, tl, env, tl, tl)
+DEF_HELPER_3(uksub8, tl, env, tl, tl)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 8815e90476..358dd1fa10 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -624,3 +624,14 @@ rstsa161011011  . . 010 . 111 @r
 urstsa16   1101011  . . 010 . 111 @r
 kstsa161100011  . . 010 . 111 @r
 ukstsa16   1110011  . . 010 . 111 @r
+
+add8   0100100  . . 000 . 111 @r
+radd8  100  . . 000 . 111 @r
+uradd8 0010100  . . 000 . 111 @r
+kadd8  0001100  . . 000 . 111 @r
+ukadd8 0011100  . . 000 . 111 @r
+sub8   0100101  . . 000 . 111 @r
+rsub8  101  . . 000 . 111 @r
+ursub8 0010101  . . 000 . 111 @r
+ksub8  0001101  . . 000 . 111 @r
+uksub8 0011101  . . 000 . 111 @r
diff --git a/target/riscv/insn_trans/trans_rvp.c.inc 
b/target/riscv/insn_trans/trans_rvp.c.inc
index 0885a4fd45..109f560ec9 100644
--- a/target/riscv/insn_trans/trans_rvp.c.inc
+++ b/target/riscv/insn_trans/trans_rvp.c.inc
@@ -159,3 +159,82 @@ GEN_RVP_R_OOL(rstsa16);
 GEN_RVP_R_OOL(urstsa16);
 GEN_RVP_R_OOL(kstsa16);
 GEN_RVP_R_OOL(ukstsa16);
+
+/* 8-bit Addition & Subtraction Instructions */
+/*
+ *  Copied from tcg-op-gvec.c.
+ *
+ *  Perform a vector addition using normal addition and a mask.  The mask
+ *  should be the sign bit of each lane.  This 6-operation form is more
+ *  efficient than separate additions when there are 4 or more lanes in
+ *  the 64-bit operation.
+ */
+
+static void gen_simd_add_mask(TCGv d, TCGv a, TCGv b, TCGv m)
+{
+TCGv t1 = tcg_temp_new();
+TCGv t2 = tcg_temp_new();
+TCGv t3 = tcg_temp_new();
+
+tcg_gen_andc_tl(t1, a, m);
+tcg_gen_andc_tl(t2, b, m);
+tcg_gen_xor_tl(t3, a, b);
+tcg_gen_add_tl(d, t1, t2);
+tcg_gen_and_tl(t3, t3, m);
+tcg_gen_xor_tl(d, d, t3);
+
+tcg_temp_free(t1);
+tcg_temp_free(t2);
+tcg_temp_free(t3);
+}
+
+static void tcg_gen_simd_add8(TCGv d, TCGv a, TCGv b)
+{
+TCGv m = tcg_const_tl((target_ulong)dup_const(MO_8, 0x80));
+gen_simd_add_mask(d, a, b, m);
+tcg_temp_free(m);
+}
+
+GEN_RVP_R_INLINE(add8, add, 0, trans_add);
+
+/*
+ *  Copied from tcg-op-gvec.c.
+ *
+ *  Perform a vector subtraction using normal subtraction and a mask.
+ *  Compare gen_addv_mask above.
+ */
+static void gen_simd_sub_mask(TCGv d, TCGv a, TCGv b, TCGv m)
+{
+TCGv t1 = tcg_temp_new();
+TCGv t2 = tcg_temp_new();
+TCGv t3 = tcg_temp_new();
+
+tcg_gen_or_tl(t1, a, m);
+tcg_gen_andc_tl(t2, b, m);
+tcg_gen_eqv_tl(t3, a, b);
+tcg_gen_sub_tl(d, t1, t2);
+tcg_gen_and_tl(t3, t3, m);
+tcg_gen_xor_tl(d, d, t3);
+
+tcg_temp_free(t1);
+tcg_temp_free(t2);
+tcg_temp_free(t3);
+}
+
+static void tcg_gen_simd_sub8(TCGv d, TCG

[Bug 1891748] Re: qemu-arm-static 5.1 can't run gcc

2021-05-23 Thread Devaev Maxim
Anyone?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1891748

Title:
  qemu-arm-static 5.1 can't run gcc

Status in QEMU:
  Fix Released
Status in Juju Charms Collection:
  New

Bug description:
  Issue discovered while trying to build pikvm (1)

  Long story short: when using qemu-arm-static 5.1, gcc exits whith
  message:

  Allocating guest commpage: Operation not permitted

  
  when using qemu-arm-static v5.0, gcc "works"

  Steps to reproduce will follow

  (1)  https://github.com/pikvm/pikvm/blob/master/pages/building_os.md

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1891748/+subscriptions



Re: About the performance of hyper-v

2021-05-23 Thread Liang Li
> >> > Analyze events for all VMs, all VCPUs:
> >> >  VM-EXITSamples  Samples% Time%Min TimeMax
> >> > Time Avg time
> >> >   EXTERNAL_INTERRUPT 47183159.89%68.58%  0.64us
> >> > 65.42us  2.34us ( +-   0.11% )
> >> >MSR_WRITE 23893230.33%23.07%  0.48us
> >> > 41.05us  1.56us ( +-   0.14% )
> >> >
> >> > Total Samples:787803, Total events handled time:1611193.84us.
> >> >
> >> > I tried turning off hyper-v for the same workload and repeat the test,
> >> > the overall virtualization overhead reduced by about of 50%:
> >> >
> >> > ---
> >> >
> >> > Analyze events for all VMs, all VCPUs:
> >> >
> >> >  VM-EXITSamples  Samples% Time%Min TimeMax
> >> > Time Avg time
> >> >   APIC_WRITE 25515274.43%50.72%  0.49us
> >> > 50.01us  1.42us ( +-   0.14% )
> >> >EPT_MISCONFIG  3996711.66%40.58%  1.55us
> >> > 686.05us  7.27us ( +-   0.43% )
> >> >DR_ACCESS  3500310.21% 4.64%  0.32us
> >> > 40.03us  0.95us ( +-   0.32% )
> >> >   EXTERNAL_INTERRUPT   6622 1.93% 2.08%  0.70us
> >> > 57.38us  2.25us ( +-   1.42% )
> >> >
> >> > Total Samples:342788, Total events handled time:715695.62us.
> >> >
> >> > For this scenario,  hyper-v works really bad.  stimer works better
> >> > than hpet, but on the other hand, it relies on SynIC which has
> >> > negative effects for IPI intensive workloads.
> >> > Do you have any plans for improvement?
> >> >
> >>
> >> Hey,
> >>
> >> the above can be caused by the fact that when 'hv-synic' is enabled, KVM
> >> automatically disables APICv and this can explain the overhead and the
> >> fact that you're seeing more vmexits. KVM disables APICv because SynIC's
> >> 'AutoEOI' feature is incompatible with it. We can, however, tell Windows
> >> to not use AutoEOI ('Recommend deprecating AutoEOI' bit) and only
> >> inhibit APICv if the recommendation was ignored. This is implemented in
> >> the following KVM patch series:
> >> https://lore.kernel.org/kvm/20210518144339.1987982-1-vkuzn...@redhat.com/
> >>
> >> It will, however, require a new 'hv-something' flag to QEMU. For now, it
> >> can be tested with 'hv-passthrough'.
> >>
> >> It would be great if you could give it a spin!
> >>
> >> --
> >> Vitaly
> >
> > It's great to know that you already have a solution for this. :)
> >
> > By the way,  is there any requirement for the version of windows or
> > windows updates for the new feature to work?
>
> AFAIR, 'Recommend deprecating AutoEOI' bit appeared in WS2012 so I'd
> expect WS2008 to ignore it completely (and thus SynIC will always be
> disabling APICv for it).
>

Hi Vitaly,
  I tried your patchset and found it's not helpful to reduce the
virtualization overhead.
here are some perfdata with the same workload

===
Analyze events for all VMs, all VCPUs:
 VM-EXITSamples  Samples% Time%Min TimeMax
Time Avg time
   MSR_WRITE 92404589.96%81.10%  0.42us
68.42us  1.26us ( +-   0.07% )
   DR_ACCESS  44669 4.35% 2.36%  0.32us
50.74us  0.76us ( +-   0.32% )
  EXTERNAL_INTERRUPT  29809 2.90% 6.42%  0.66us
70.75us  3.10us ( +-   0.54% )
  VMCALL  17819 1.73% 5.21%  0.75us
15.64us  4.20us ( +-   0.33%

Total Samples:1027227, Total events handled time:1436343.94us.
===

The result shows the overhead increased.  enable the apicv can help to
reduce the vm-exit
caused by interrupt injection, but on the other side, there are a lot
of vm-exit caused by APIC_EOI.

When turning off the hyper-v and using the kvm apicv, there is no such
overhead. It seems turning
on hyper V related features is not always the best choice for a windows guest.

Thanks!
Liang



[PATCH] linux-user: make process_pending_signals thread-safe

2021-05-23 Thread Hamza Mahfooz
Use pthread_sigmask instead of sigprocmask inside process_pending_signals
to ensure that race conditions aren't possible.

Signed-off-by: Hamza Mahfooz 
---
 linux-user/signal.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 7eecec46c4..81ff753c01 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -1005,9 +1005,8 @@ void process_pending_signals(CPUArchState *cpu_env)
 sigset_t *blocked_set;
 
 while (qatomic_read(&ts->signal_pending)) {
-/* FIXME: This is not threadsafe.  */
 sigfillset(&set);
-sigprocmask(SIG_SETMASK, &set, 0);
+pthread_sigmask(SIG_SETMASK, &set, 0);
 
 restart_scan:
 sig = ts->sync_signal.pending;
-- 
2.31.1




[RFC v4 02/14] hw/s390x: rename tod-qemu.c to tod-tcg.c

2021-05-23 Thread Cho, Yu-Chen
we stop short of renaming the actual qom object though,
so type remains TYPE_QEMU_S390_TOD, ie "s390-tod-qemu".

Signed-off-by: Claudio Fontana 
Reviewed-by: David Hildenbrand 
Reviewed-by: Cornelia Huck 
Signed-off-by: Cho, Yu-Chen 
---
 hw/s390x/meson.build   | 2 +-
 hw/s390x/{tod-qemu.c => tod-tcg.c} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename hw/s390x/{tod-qemu.c => tod-tcg.c} (100%)

diff --git a/hw/s390x/meson.build b/hw/s390x/meson.build
index 327e9c93af..02e81a9467 100644
--- a/hw/s390x/meson.build
+++ b/hw/s390x/meson.build
@@ -16,7 +16,7 @@ s390x_ss.add(files(
   'sclp.c',
   'sclpcpu.c',
   'sclpquiesce.c',
-  'tod-qemu.c',
+  'tod-tcg.c',
   'tod.c',
 ))
 s390x_ss.add(when: 'CONFIG_KVM', if_true: files(
diff --git a/hw/s390x/tod-qemu.c b/hw/s390x/tod-tcg.c
similarity index 100%
rename from hw/s390x/tod-qemu.c
rename to hw/s390x/tod-tcg.c
-- 
2.31.1




[RFC v4 01/14] target/s390x: meson: add target_user_arch

2021-05-23 Thread Cho, Yu-Chen
the lack of target_user_arch makes it hard to fully leverage the
build system in order to separate user code from sysemu code.

Provide it, so that we can avoid the proliferation of #ifdef
in target code.

Signed-off-by: Claudio Fontana 
Signed-off-by: Cho, Yu-Chen 
---
 target/s390x/meson.build | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/s390x/meson.build b/target/s390x/meson.build
index c42eadb7d2..1219f64112 100644
--- a/target/s390x/meson.build
+++ b/target/s390x/meson.build
@@ -58,5 +58,8 @@ if host_machine.cpu_family() == 's390x' and 
cc.has_link_argument('-Wl,--s390-pgs
if_true: declare_dependency(link_args: 
['-Wl,--s390-pgste']))
 endif
 
+s390x_user_ss = ss.source_set()
+
 target_arch += {'s390x': s390x_ss}
 target_softmmu_arch += {'s390x': s390x_softmmu_ss}
+target_user_arch += {'s390x': s390x_user_ss}
-- 
2.31.1




[RFC v4 05/14] target/s390x: remove tcg-stub.c

2021-05-23 Thread Cho, Yu-Chen
now that we protect all calls to the tcg-specific functions
with if (tcg_enabled()), we do not need the TCG stub anymore.

Signed-off-by: Claudio Fontana 
Reviewed-by: David Hildenbrand 
Reviewed-by: Cornelia Huck 
Signed-off-by: Cho, Yu-Chen 
---
 target/s390x/meson.build |  2 +-
 target/s390x/tcg-stub.c  | 30 --
 2 files changed, 1 insertion(+), 31 deletions(-)
 delete mode 100644 target/s390x/tcg-stub.c

diff --git a/target/s390x/meson.build b/target/s390x/meson.build
index 1219f64112..a5e1ded93f 100644
--- a/target/s390x/meson.build
+++ b/target/s390x/meson.build
@@ -21,7 +21,7 @@ s390x_ss.add(when: 'CONFIG_TCG', if_true: files(
   'vec_helper.c',
   'vec_int_helper.c',
   'vec_string_helper.c',
-), if_false: files('tcg-stub.c'))
+))
 
 s390x_ss.add(when: 'CONFIG_KVM', if_true: files('kvm.c'), if_false: 
files('kvm-stub.c'))
 
diff --git a/target/s390x/tcg-stub.c b/target/s390x/tcg-stub.c
deleted file mode 100644
index d22c898802..00
--- a/target/s390x/tcg-stub.c
+++ /dev/null
@@ -1,30 +0,0 @@
-/*
- * QEMU TCG support -- s390x specific function stubs.
- *
- * Copyright (C) 2018 Red Hat Inc
- *
- * Authors:
- *   David Hildenbrand 
- *
- * This work is licensed under the terms of the GNU GPL, version 2 or later.
- * See the COPYING file in the top-level directory.
- */
-
-#include "qemu/osdep.h"
-#include "qemu-common.h"
-#include "cpu.h"
-#include "tcg_s390x.h"
-
-void tcg_s390_tod_updated(CPUState *cs, run_on_cpu_data opaque)
-{
-}
-void QEMU_NORETURN tcg_s390_program_interrupt(CPUS390XState *env,
-  uint32_t code, uintptr_t ra)
-{
-g_assert_not_reached();
-}
-void QEMU_NORETURN tcg_s390_data_exception(CPUS390XState *env, uint32_t dxc,
-   uintptr_t ra)
-{
-g_assert_not_reached();
-}
-- 
2.31.1




[RFC v4 04/14] hw/s390x: tod: make explicit checks for accelerators when initializing

2021-05-23 Thread Cho, Yu-Chen
replace general "else" with specific checks for each possible accelerator.

Handle qtest as a NOP, and error out for an unknown accelerator used in
combination with tod.

Signed-off-by: Claudio Fontana 
Reviewed-by: David Hildenbrand 
Reviewed-by: Cornelia Huck 
Signed-off-by: Cho, Yu-Chen 
---
 hw/s390x/tod.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/tod.c b/hw/s390x/tod.c
index 3c2979175e..fd5a36bf24 100644
--- a/hw/s390x/tod.c
+++ b/hw/s390x/tod.c
@@ -14,6 +14,8 @@
 #include "qemu/error-report.h"
 #include "qemu/module.h"
 #include "sysemu/kvm.h"
+#include "sysemu/tcg.h"
+#include "sysemu/qtest.h"
 #include "migration/qemu-file-types.h"
 #include "migration/register.h"
 
@@ -23,8 +25,13 @@ void s390_init_tod(void)
 
 if (kvm_enabled()) {
 obj = object_new(TYPE_KVM_S390_TOD);
-} else {
+} else if (tcg_enabled()) {
 obj = object_new(TYPE_QEMU_S390_TOD);
+} else if (qtest_enabled()) {
+return;
+} else {
+error_report("current accelerator not handled in s390_init_tod!");
+abort();
 }
 object_property_add_child(qdev_get_machine(), TYPE_S390_TOD, obj);
 object_unref(obj);
-- 
2.31.1




[RFC v4 03/14] hw/s390x: only build tod-tcg from the CONFIG_TCG build

2021-05-23 Thread Cho, Yu-Chen
this will allow in later patches to remove unneeded stubs
in target/s390x.

Signed-off-by: Claudio Fontana 
Reviewed-by: David Hildenbrand 
Reviewed-by: Cornelia Huck 
Signed-off-by: Cho, Yu-Chen 
---
 hw/s390x/meson.build | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/meson.build b/hw/s390x/meson.build
index 02e81a9467..28484256ec 100644
--- a/hw/s390x/meson.build
+++ b/hw/s390x/meson.build
@@ -16,7 +16,6 @@ s390x_ss.add(files(
   'sclp.c',
   'sclpcpu.c',
   'sclpquiesce.c',
-  'tod-tcg.c',
   'tod.c',
 ))
 s390x_ss.add(when: 'CONFIG_KVM', if_true: files(
@@ -25,6 +24,9 @@ s390x_ss.add(when: 'CONFIG_KVM', if_true: files(
   's390-stattrib-kvm.c',
   'pv.c',
 ))
+s390x_ss.add(when: 'CONFIG_TCG', if_true: files(
+  'tod-tcg.c',
+))
 s390x_ss.add(when: 'CONFIG_S390_CCW_VIRTIO', if_true: 
files('s390-virtio-ccw.c'))
 s390x_ss.add(when: 'CONFIG_TERMINAL3270', if_true: files('3270-ccw.c'))
 s390x_ss.add(when: 'CONFIG_VFIO', if_true: files('s390-pci-vfio.c'))
-- 
2.31.1




[RFC v4 08/14] target/s390x: split cpu-dump from helper.c

2021-05-23 Thread Cho, Yu-Chen
Signed-off-by: Claudio Fontana 
Signed-off-by: Cho, Yu-Chen 
---
 target/s390x/cpu-dump.c  | 131 +++
 target/s390x/helper.c| 107 
 target/s390x/meson.build |   1 +
 3 files changed, 132 insertions(+), 107 deletions(-)
 create mode 100644 target/s390x/cpu-dump.c

diff --git a/target/s390x/cpu-dump.c b/target/s390x/cpu-dump.c
new file mode 100644
index 00..4170dec01a
--- /dev/null
+++ b/target/s390x/cpu-dump.c
@@ -0,0 +1,131 @@
+/*
+ * S/390 CPU dump to FILE
+ *
+ *  Copyright (c) 2009 Ulrich Hecht
+ *  Copyright (c) 2011 Alexander Graf
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "s390x-internal.h"
+#include "qemu/qemu-print.h"
+
+void s390_cpu_dump_state(CPUState *cs, FILE *f, int flags)
+{
+S390CPU *cpu = S390_CPU(cs);
+CPUS390XState *env = &cpu->env;
+int i;
+
+if (env->cc_op > 3) {
+qemu_fprintf(f, "PSW=mask %016" PRIx64 " addr %016" PRIx64 " cc 
%15s\n",
+ env->psw.mask, env->psw.addr, cc_name(env->cc_op));
+} else {
+qemu_fprintf(f, "PSW=mask %016" PRIx64 " addr %016" PRIx64 " cc 
%02x\n",
+ env->psw.mask, env->psw.addr, env->cc_op);
+}
+
+for (i = 0; i < 16; i++) {
+qemu_fprintf(f, "R%02d=%016" PRIx64, i, env->regs[i]);
+if ((i % 4) == 3) {
+qemu_fprintf(f, "\n");
+} else {
+qemu_fprintf(f, " ");
+}
+}
+
+if (flags & CPU_DUMP_FPU) {
+if (s390_has_feat(S390_FEAT_VECTOR)) {
+for (i = 0; i < 32; i++) {
+qemu_fprintf(f, "V%02d=%016" PRIx64 "%016" PRIx64 "%c",
+ i, env->vregs[i][0], env->vregs[i][1],
+ i % 2 ? '\n' : ' ');
+}
+} else {
+for (i = 0; i < 16; i++) {
+qemu_fprintf(f, "F%02d=%016" PRIx64 "%c",
+ i, *get_freg(env, i),
+ (i % 4) == 3 ? '\n' : ' ');
+}
+}
+}
+
+#ifndef CONFIG_USER_ONLY
+for (i = 0; i < 16; i++) {
+qemu_fprintf(f, "C%02d=%016" PRIx64, i, env->cregs[i]);
+if ((i % 4) == 3) {
+qemu_fprintf(f, "\n");
+} else {
+qemu_fprintf(f, " ");
+}
+}
+#endif
+
+#ifdef DEBUG_INLINE_BRANCHES
+for (i = 0; i < CC_OP_MAX; i++) {
+qemu_fprintf(f, "  %15s = %10ld\t%10ld\n", cc_name(i),
+ inline_branch_miss[i], inline_branch_hit[i]);
+}
+#endif
+
+qemu_fprintf(f, "\n");
+}
+
+const char *cc_name(enum cc_op cc_op)
+{
+static const char * const cc_names[] = {
+[CC_OP_CONST0]= "CC_OP_CONST0",
+[CC_OP_CONST1]= "CC_OP_CONST1",
+[CC_OP_CONST2]= "CC_OP_CONST2",
+[CC_OP_CONST3]= "CC_OP_CONST3",
+[CC_OP_DYNAMIC]   = "CC_OP_DYNAMIC",
+[CC_OP_STATIC]= "CC_OP_STATIC",
+[CC_OP_NZ]= "CC_OP_NZ",
+[CC_OP_ADDU]  = "CC_OP_ADDU",
+[CC_OP_SUBU]  = "CC_OP_SUBU",
+[CC_OP_LTGT_32]   = "CC_OP_LTGT_32",
+[CC_OP_LTGT_64]   = "CC_OP_LTGT_64",
+[CC_OP_LTUGTU_32] = "CC_OP_LTUGTU_32",
+[CC_OP_LTUGTU_64] = "CC_OP_LTUGTU_64",
+[CC_OP_LTGT0_32]  = "CC_OP_LTGT0_32",
+[CC_OP_LTGT0_64]  = "CC_OP_LTGT0_64",
+[CC_OP_ADD_64]= "CC_OP_ADD_64",
+[CC_OP_SUB_64]= "CC_OP_SUB_64",
+[CC_OP_ABS_64]= "CC_OP_ABS_64",
+[CC_OP_NABS_64]   = "CC_OP_NABS_64",
+[CC_OP_ADD_32]= "CC_OP_ADD_32",
+[CC_OP_SUB_32]= "CC_OP_SUB_32",
+[CC_OP_ABS_32]= "CC_OP_ABS_32",
+[CC_OP_NABS_32]   = "CC_OP_NABS_32",
+[CC_OP_COMP_32]   = "CC_OP_COMP_32",
+[CC_OP_COMP_64]   = "CC_OP_COMP_64",
+[CC_OP_TM_32] = "CC_OP_TM_32",
+[CC_OP_TM_64] = "CC_OP_TM_64",
+[CC_OP_NZ_F32]= "CC_OP_NZ_F32",
+[CC_OP_NZ_F64]= "CC_OP_NZ_F64",
+[CC_OP_NZ_F128]   = "CC_OP_NZ_F128",
+[CC_OP_ICM]   = "CC_OP_ICM",
+[CC_OP_SLA_32]= "CC_OP_SLA_32",
+[CC_OP_SLA_64]= "CC_OP_SLA_64",
+[CC_OP_FLOGR] = "CC_OP_FLOGR",
+[CC_OP_LCBB]  = "CC_OP_LCBB",
+[CC_OP_VC]= "CC_OP_VC",
+[CC_OP_MULS_32] 

[RFC v4 06/14] target/s390x: start moving TCG-only code to tcg/

2021-05-23 Thread Cho, Yu-Chen
move everything related to translate, as well as HELPER code in tcg/

mmu_helper.c stays put for now, as it contains both TCG and KVM code.

The internal.h file is renamed to s390x-internal.h, because of the
risk of collision with other files with the same name.

Signed-off-by: Claudio Fontana 
Acked-by: David Hildenbrand 
Signed-off-by: Cho, Yu-Chen 
---
 hw/s390x/tod-tcg.c|  2 +-
 include/hw/s390x/tod.h|  2 +-
 target/s390x/arch_dump.c  |  2 +-
 target/s390x/cpu.c|  2 +-
 target/s390x/cpu_models.c |  2 +-
 target/s390x/diag.c   |  2 +-
 target/s390x/gdbstub.c|  2 +-
 target/s390x/helper.c |  2 +-
 target/s390x/interrupt.c  |  4 ++--
 target/s390x/ioinst.c |  2 +-
 target/s390x/kvm.c|  2 +-
 target/s390x/machine.c|  4 ++--
 target/s390x/meson.build  | 17 ++---
 target/s390x/mmu_helper.c |  2 +-
 target/s390x/{internal.h => s390x-internal.h} |  6 ++
 target/s390x/sigp.c   |  2 +-
 target/s390x/{ => tcg}/cc_helper.c|  2 +-
 target/s390x/{ => tcg}/crypto_helper.c|  2 +-
 target/s390x/{ => tcg}/excp_helper.c  |  2 +-
 target/s390x/{ => tcg}/fpu_helper.c   |  2 +-
 target/s390x/{ => tcg}/insn-data.def  |  0
 target/s390x/{ => tcg}/insn-format.def|  0
 target/s390x/{ => tcg}/int_helper.c   |  2 +-
 target/s390x/{ => tcg}/mem_helper.c   |  2 +-
 target/s390x/tcg/meson.build  | 14 ++
 target/s390x/{ => tcg}/misc_helper.c  |  2 +-
 target/s390x/{ => tcg}/s390-tod.h |  0
 target/s390x/{ => tcg}/tcg_s390x.h|  0
 target/s390x/{ => tcg}/translate.c|  2 +-
 target/s390x/{ => tcg}/translate_vx.c.inc |  0
 target/s390x/{ => tcg}/vec.h  |  0
 target/s390x/{ => tcg}/vec_fpu_helper.c   |  2 +-
 target/s390x/{ => tcg}/vec_helper.c   |  2 +-
 target/s390x/{ => tcg}/vec_int_helper.c   |  0
 target/s390x/{ => tcg}/vec_string_helper.c|  2 +-
 35 files changed, 49 insertions(+), 42 deletions(-)
 rename target/s390x/{internal.h => s390x-internal.h} (98%)
 rename target/s390x/{ => tcg}/cc_helper.c (99%)
 rename target/s390x/{ => tcg}/crypto_helper.c (98%)
 rename target/s390x/{ => tcg}/excp_helper.c (99%)
 rename target/s390x/{ => tcg}/fpu_helper.c (99%)
 rename target/s390x/{ => tcg}/insn-data.def (100%)
 rename target/s390x/{ => tcg}/insn-format.def (100%)
 rename target/s390x/{ => tcg}/int_helper.c (99%)
 rename target/s390x/{ => tcg}/mem_helper.c (99%)
 create mode 100644 target/s390x/tcg/meson.build
 rename target/s390x/{ => tcg}/misc_helper.c (99%)
 rename target/s390x/{ => tcg}/s390-tod.h (100%)
 rename target/s390x/{ => tcg}/tcg_s390x.h (100%)
 rename target/s390x/{ => tcg}/translate.c (99%)
 rename target/s390x/{ => tcg}/translate_vx.c.inc (100%)
 rename target/s390x/{ => tcg}/vec.h (100%)
 rename target/s390x/{ => tcg}/vec_fpu_helper.c (99%)
 rename target/s390x/{ => tcg}/vec_helper.c (99%)
 rename target/s390x/{ => tcg}/vec_int_helper.c (100%)
 rename target/s390x/{ => tcg}/vec_string_helper.c (99%)

diff --git a/hw/s390x/tod-tcg.c b/hw/s390x/tod-tcg.c
index e91b9590f5..4b3e65050a 100644
--- a/hw/s390x/tod-tcg.c
+++ b/hw/s390x/tod-tcg.c
@@ -16,7 +16,7 @@
 #include "qemu/cutils.h"
 #include "qemu/module.h"
 #include "cpu.h"
-#include "tcg_s390x.h"
+#include "tcg/tcg_s390x.h"
 
 static void qemu_s390_tod_get(const S390TODState *td, S390TOD *tod,
   Error **errp)
diff --git a/include/hw/s390x/tod.h b/include/hw/s390x/tod.h
index ff3195a4bf..0935e85089 100644
--- a/include/hw/s390x/tod.h
+++ b/include/hw/s390x/tod.h
@@ -12,7 +12,7 @@
 #define HW_S390_TOD_H
 
 #include "hw/qdev-core.h"
-#include "target/s390x/s390-tod.h"
+#include "tcg/s390-tod.h"
 #include "qom/object.h"
 
 typedef struct S390TOD {
diff --git a/target/s390x/arch_dump.c b/target/s390x/arch_dump.c
index cc1330876b..08daf93ae1 100644
--- a/target/s390x/arch_dump.c
+++ b/target/s390x/arch_dump.c
@@ -13,7 +13,7 @@
 
 #include "qemu/osdep.h"
 #include "cpu.h"
-#include "internal.h"
+#include "s390x-internal.h"
 #include "elf.h"
 #include "sysemu/dump.h"
 
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index 64455cf309..533b251b7e 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -23,7 +23,7 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "cpu.h"
-#include "internal.h"
+#include "s390x-internal.h"
 #include "kvm_s390x.h"
 #include "sysemu/kvm.h"
 #include "sysemu/reset.h"
diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 050dcf2d42..4ff8cba7e5 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -12,7 +12,7 @@
 
 #include "qem

[RFC v4 07/14] target/s390x: move sysemu-only code out to cpu-sysemu.c

2021-05-23 Thread Cho, Yu-Chen
Signed-off-by: Claudio Fontana 
Signed-off-by: Cho, Yu-Chen 
---
 target/s390x/cpu-sysemu.c | 304 ++
 target/s390x/cpu.c| 282 ++-
 target/s390x/meson.build  |   1 +
 target/s390x/trace-events |   2 +-
 4 files changed, 318 insertions(+), 271 deletions(-)
 create mode 100644 target/s390x/cpu-sysemu.c

diff --git a/target/s390x/cpu-sysemu.c b/target/s390x/cpu-sysemu.c
new file mode 100644
index 00..6081b7ef32
--- /dev/null
+++ b/target/s390x/cpu-sysemu.c
@@ -0,0 +1,304 @@
+/*
+ * QEMU S/390 CPU - System Emulation-only code
+ *
+ * Copyright (c) 2009 Ulrich Hecht
+ * Copyright (c) 2011 Alexander Graf
+ * Copyright (c) 2012 SUSE LINUX Products GmbH
+ * Copyright (c) 2012 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "cpu.h"
+#include "s390x-internal.h"
+#include "kvm_s390x.h"
+#include "sysemu/kvm.h"
+#include "sysemu/reset.h"
+#include "qemu/timer.h"
+#include "trace.h"
+#include "qapi/qapi-visit-run-state.h"
+#include "sysemu/hw_accel.h"
+
+#include "hw/s390x/pv.h"
+#include "hw/boards.h"
+#include "sysemu/arch_init.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/tcg.h"
+
+/* S390CPUClass::load_normal() */
+static void s390_cpu_load_normal(CPUState *s)
+{
+S390CPU *cpu = S390_CPU(s);
+uint64_t spsw;
+
+if (!s390_is_pv()) {
+spsw = ldq_phys(s->as, 0);
+cpu->env.psw.mask = spsw & PSW_MASK_SHORT_CTRL;
+/*
+ * Invert short psw indication, so SIE will report a specification
+ * exception if it was not set.
+ */
+cpu->env.psw.mask ^= PSW_MASK_SHORTPSW;
+cpu->env.psw.addr = spsw & PSW_MASK_SHORT_ADDR;
+} else {
+/*
+ * Firmware requires us to set the load state before we set
+ * the cpu to operating on protected guests.
+ */
+s390_cpu_set_state(S390_CPU_STATE_LOAD, cpu);
+}
+s390_cpu_set_state(S390_CPU_STATE_OPERATING, cpu);
+}
+
+void s390_cpu_machine_reset_cb(void *opaque)
+{
+S390CPU *cpu = opaque;
+
+run_on_cpu(CPU(cpu), s390_do_cpu_full_reset, RUN_ON_CPU_NULL);
+}
+
+static GuestPanicInformation *s390_cpu_get_crash_info(CPUState *cs)
+{
+GuestPanicInformation *panic_info;
+S390CPU *cpu = S390_CPU(cs);
+
+cpu_synchronize_state(cs);
+panic_info = g_malloc0(sizeof(GuestPanicInformation));
+
+panic_info->type = GUEST_PANIC_INFORMATION_TYPE_S390;
+panic_info->u.s390.core = cpu->env.core_id;
+panic_info->u.s390.psw_mask = cpu->env.psw.mask;
+panic_info->u.s390.psw_addr = cpu->env.psw.addr;
+panic_info->u.s390.reason = cpu->env.crash_reason;
+
+return panic_info;
+}
+
+static void s390_cpu_get_crash_info_qom(Object *obj, Visitor *v,
+const char *name, void *opaque,
+Error **errp)
+{
+CPUState *cs = CPU(obj);
+GuestPanicInformation *panic_info;
+
+if (!cs->crash_occurred) {
+error_setg(errp, "No crash occurred");
+return;
+}
+
+panic_info = s390_cpu_get_crash_info(cs);
+
+visit_type_GuestPanicInformation(v, "crash-information", &panic_info,
+ errp);
+qapi_free_GuestPanicInformation(panic_info);
+}
+
+void s390_cpu_init_sysemu(Object *obj)
+{
+CPUState *cs = CPU(obj);
+S390CPU *cpu = S390_CPU(obj);
+
+cs->start_powered_off = true;
+object_property_add(obj, "crash-information", "GuestPanicInformation",
+s390_cpu_get_crash_info_qom, NULL, NULL, NULL);
+cpu->env.tod_timer =
+timer_new_ns(QEMU_CLOCK_VIRTUAL, s390x_tod_timer, cpu);
+cpu->env.cpu_timer =
+timer_new_ns(QEMU_CLOCK_VIRTUAL, s390x_cpu_timer, cpu);
+s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu);
+}
+
+bool s390_cpu_realize_sysemu(DeviceState *dev, Error **errp)
+{
+S390CPU *cpu = S390_CPU(dev);
+MachineState *ms = MACHINE(qdev_get_machine());
+unsigned int max_cpus = ms->smp.max_cpus;
+
+if (cpu->env.core_id >= max_cpus) {
+error_setg(errp, "Unable to add CPU with core-id: %" PRIu32
+   ", maximum core-id: %d", cpu->env.core_id,
+   max_cpus - 1);
+return false;
+}
+
+if (cpu_exists(cpu->env.core_id)) {
+error_setg(errp, "Un

[RFC v4 11/14] target/s390x: remove kvm-stub.c

2021-05-23 Thread Cho, Yu-Chen
all function calls are protected by kvm_enabled(),
so we should not need the stubs.

Signed-off-by: Claudio Fontana 
Signed-off-by: Cho, Yu-Chen 
---
 target/s390x/kvm-stub.c  | 126 ---
 target/s390x/meson.build |   2 +-
 2 files changed, 1 insertion(+), 127 deletions(-)
 delete mode 100644 target/s390x/kvm-stub.c

diff --git a/target/s390x/kvm-stub.c b/target/s390x/kvm-stub.c
deleted file mode 100644
index 9970b5a8c7..00
--- a/target/s390x/kvm-stub.c
+++ /dev/null
@@ -1,126 +0,0 @@
-/*
- * QEMU KVM support -- s390x specific function stubs.
- *
- * Copyright (c) 2009 Ulrich Hecht
- *
- * This work is licensed under the terms of the GNU GPL, version 2 or later.
- * See the COPYING file in the top-level directory.
- */
-
-#include "qemu/osdep.h"
-#include "cpu.h"
-#include "kvm_s390x.h"
-
-void kvm_s390_access_exception(S390CPU *cpu, uint16_t code, uint64_t te_code)
-{
-}
-
-int kvm_s390_mem_op(S390CPU *cpu, vaddr addr, uint8_t ar, void *hostbuf,
-int len, bool is_write)
-{
-return -ENOSYS;
-}
-
-void kvm_s390_program_interrupt(S390CPU *cpu, uint16_t code)
-{
-}
-
-int kvm_s390_set_cpu_state(S390CPU *cpu, uint8_t cpu_state)
-{
-return -ENOSYS;
-}
-
-void kvm_s390_vcpu_interrupt_pre_save(S390CPU *cpu)
-{
-}
-
-int kvm_s390_vcpu_interrupt_post_load(S390CPU *cpu)
-{
-return 0;
-}
-
-int kvm_s390_get_hpage_1m(void)
-{
-return 0;
-}
-
-int kvm_s390_get_ri(void)
-{
-return 0;
-}
-
-int kvm_s390_get_gs(void)
-{
-return 0;
-}
-
-int kvm_s390_get_clock(uint8_t *tod_high, uint64_t *tod_low)
-{
-return -ENOSYS;
-}
-
-int kvm_s390_get_clock_ext(uint8_t *tod_high, uint64_t *tod_low)
-{
-return -ENOSYS;
-}
-
-int kvm_s390_set_clock(uint8_t tod_high, uint64_t tod_low)
-{
-return -ENOSYS;
-}
-
-int kvm_s390_set_clock_ext(uint8_t tod_high, uint64_t tod_low)
-{
-return -ENOSYS;
-}
-
-void kvm_s390_enable_css_support(S390CPU *cpu)
-{
-}
-
-int kvm_s390_assign_subch_ioeventfd(EventNotifier *notifier, uint32_t sch,
-int vq, bool assign)
-{
-return -ENOSYS;
-}
-
-void kvm_s390_cmma_reset(void)
-{
-}
-
-void kvm_s390_reset_vcpu_initial(S390CPU *cpu)
-{
-}
-
-void kvm_s390_reset_vcpu_clear(S390CPU *cpu)
-{
-}
-
-void kvm_s390_reset_vcpu_normal(S390CPU *cpu)
-{
-}
-
-int kvm_s390_set_mem_limit(uint64_t new_limit, uint64_t *hw_limit)
-{
-return 0;
-}
-
-void kvm_s390_set_max_pagesize(uint64_t pagesize, Error **errp)
-{
-}
-
-void kvm_s390_crypto_reset(void)
-{
-}
-
-void kvm_s390_stop_interrupt(S390CPU *cpu)
-{
-}
-
-void kvm_s390_restart_interrupt(S390CPU *cpu)
-{
-}
-
-void kvm_s390_set_diag318(CPUState *cs, uint64_t diag318_info)
-{
-}
diff --git a/target/s390x/meson.build b/target/s390x/meson.build
index bbcaede384..6c8e03b8fb 100644
--- a/target/s390x/meson.build
+++ b/target/s390x/meson.build
@@ -8,7 +8,7 @@ s390x_ss.add(files(
   'cpu-dump.c',
 ))
 
-s390x_ss.add(when: 'CONFIG_KVM', if_true: files('kvm.c'), if_false: 
files('kvm-stub.c'))
+s390x_ss.add(when: 'CONFIG_KVM', if_true: files('kvm.c'))
 
 gen_features = executable('gen-features', 'gen-features.c', native: true,
   build_by_default: false)
-- 
2.31.1




[RFC v4 09/14] target/s390x: make helper.c sysemu-only

2021-05-23 Thread Cho, Yu-Chen
Now that we have moved cpu-dump functionality out of helper.c,
we can make the module sysemu-only.

Signed-off-by: Claudio Fontana 
Signed-off-by: Cho, Yu-Chen 
---
 target/s390x/helper.c| 4 
 target/s390x/meson.build | 2 +-
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/target/s390x/helper.c b/target/s390x/helper.c
index 41ccc83d11..f246bec353 100644
--- a/target/s390x/helper.c
+++ b/target/s390x/helper.c
@@ -27,11 +27,8 @@
 #include "hw/s390x/pv.h"
 #include "sysemu/hw_accel.h"
 #include "sysemu/runstate.h"
-#ifndef CONFIG_USER_ONLY
 #include "sysemu/tcg.h"
-#endif
 
-#ifndef CONFIG_USER_ONLY
 void s390x_tod_timer(void *opaque)
 {
 cpu_inject_clock_comparator((S390CPU *) opaque);
@@ -322,4 +319,3 @@ int s390_store_adtl_status(S390CPU *cpu, hwaddr addr, 
hwaddr len)
 cpu_physical_memory_unmap(sa, len, 1, len);
 return 0;
 }
-#endif /* CONFIG_USER_ONLY */
diff --git a/target/s390x/meson.build b/target/s390x/meson.build
index 6e1aa3b0cd..bbcaede384 100644
--- a/target/s390x/meson.build
+++ b/target/s390x/meson.build
@@ -4,7 +4,6 @@ s390x_ss.add(files(
   'cpu_features.c',
   'cpu_models.c',
   'gdbstub.c',
-  'helper.c',
   'interrupt.c',
   'cpu-dump.c',
 ))
@@ -23,6 +22,7 @@ s390x_ss.add(gen_features_h)
 
 s390x_softmmu_ss = ss.source_set()
 s390x_softmmu_ss.add(files(
+  'helper.c',
   'arch_dump.c',
   'diag.c',
   'ioinst.c',
-- 
2.31.1




[RFC v4 12/14] target/s390x: move kvm files into kvm/

2021-05-23 Thread Cho, Yu-Chen
Signed-off-by: Claudio Fontana 
Signed-off-by: Cho, Yu-Chen 
---
 hw/intc/s390_flic_kvm.c|  2 +-
 hw/s390x/s390-stattrib-kvm.c   |  2 +-
 hw/s390x/tod-kvm.c |  2 +-
 hw/vfio/ap.c   |  2 +-
 meson.build|  1 +
 target/s390x/cpu-sysemu.c  |  2 +-
 target/s390x/cpu.c |  2 +-
 target/s390x/cpu_models.c  |  2 +-
 target/s390x/diag.c|  2 +-
 target/s390x/interrupt.c   |  2 +-
 target/s390x/{ => kvm}/kvm.c   |  2 +-
 target/s390x/{ => kvm}/kvm_s390x.h |  0
 target/s390x/kvm/meson.build   | 17 +
 target/s390x/kvm/trace-events  |  7 +++
 target/s390x/kvm/trace.h   |  1 +
 target/s390x/machine.c |  2 +-
 target/s390x/meson.build   | 16 +---
 target/s390x/mmu_helper.c  |  2 +-
 target/s390x/trace-events  |  6 --
 19 files changed, 39 insertions(+), 33 deletions(-)
 rename target/s390x/{ => kvm}/kvm.c (99%)
 rename target/s390x/{ => kvm}/kvm_s390x.h (100%)
 create mode 100644 target/s390x/kvm/meson.build
 create mode 100644 target/s390x/kvm/trace-events
 create mode 100644 target/s390x/kvm/trace.h

diff --git a/hw/intc/s390_flic_kvm.c b/hw/intc/s390_flic_kvm.c
index 929cfa3a68..efe5054182 100644
--- a/hw/intc/s390_flic_kvm.c
+++ b/hw/intc/s390_flic_kvm.c
@@ -11,7 +11,7 @@
  */
 
 #include "qemu/osdep.h"
-#include "kvm_s390x.h"
+#include "kvm/kvm_s390x.h"
 #include 
 #include "qemu/error-report.h"
 #include "qemu/module.h"
diff --git a/hw/s390x/s390-stattrib-kvm.c b/hw/s390x/s390-stattrib-kvm.c
index f0b11a74e4..24cd01382e 100644
--- a/hw/s390x/s390-stattrib-kvm.c
+++ b/hw/s390x/s390-stattrib-kvm.c
@@ -16,7 +16,7 @@
 #include "qemu/error-report.h"
 #include "sysemu/kvm.h"
 #include "exec/ram_addr.h"
-#include "kvm_s390x.h"
+#include "kvm/kvm_s390x.h"
 
 Object *kvm_s390_stattrib_create(void)
 {
diff --git a/hw/s390x/tod-kvm.c b/hw/s390x/tod-kvm.c
index 0b94477486..ec855811ae 100644
--- a/hw/s390x/tod-kvm.c
+++ b/hw/s390x/tod-kvm.c
@@ -13,7 +13,7 @@
 #include "qemu/module.h"
 #include "sysemu/runstate.h"
 #include "hw/s390x/tod.h"
-#include "kvm_s390x.h"
+#include "kvm/kvm_s390x.h"
 
 static void kvm_s390_get_tod_raw(S390TOD *tod, Error **errp)
 {
diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index 4b32aca1a0..e0dd561e85 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -21,7 +21,7 @@
 #include "qemu/module.h"
 #include "qemu/option.h"
 #include "qemu/config-file.h"
-#include "kvm_s390x.h"
+#include "kvm/kvm_s390x.h"
 #include "migration/vmstate.h"
 #include "hw/qdev-properties.h"
 #include "hw/s390x/ap-bridge.h"
diff --git a/meson.build b/meson.build
index 1559e8d873..37be2e60c3 100644
--- a/meson.build
+++ b/meson.build
@@ -1863,6 +1863,7 @@ if have_system or have_user
 'target/ppc',
 'target/riscv',
 'target/s390x',
+'target/s390x/kvm',
 'target/sparc',
   ]
 endif
diff --git a/target/s390x/cpu-sysemu.c b/target/s390x/cpu-sysemu.c
index 6081b7ef32..f3c1b4845a 100644
--- a/target/s390x/cpu-sysemu.c
+++ b/target/s390x/cpu-sysemu.c
@@ -24,7 +24,7 @@
 #include "qapi/error.h"
 #include "cpu.h"
 #include "s390x-internal.h"
-#include "kvm_s390x.h"
+#include "kvm/kvm_s390x.h"
 #include "sysemu/kvm.h"
 #include "sysemu/reset.h"
 #include "qemu/timer.h"
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index 59efe48bcd..6e82ba73cc 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -24,7 +24,7 @@
 #include "qapi/error.h"
 #include "cpu.h"
 #include "s390x-internal.h"
-#include "kvm_s390x.h"
+#include "kvm/kvm_s390x.h"
 #include "sysemu/kvm.h"
 #include "sysemu/reset.h"
 #include "qemu/module.h"
diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 4ff8cba7e5..0ed1c23774 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -13,7 +13,7 @@
 #include "qemu/osdep.h"
 #include "cpu.h"
 #include "s390x-internal.h"
-#include "kvm_s390x.h"
+#include "kvm/kvm_s390x.h"
 #include "sysemu/kvm.h"
 #include "sysemu/tcg.h"
 #include "qapi/error.h"
diff --git a/target/s390x/diag.c b/target/s390x/diag.c
index 8405f69df0..76b01dcd68 100644
--- a/target/s390x/diag.c
+++ b/target/s390x/diag.c
@@ -21,7 +21,7 @@
 #include "hw/s390x/s390-virtio-ccw.h"
 #include "hw/s390x/pv.h"
 #include "sysemu/kvm.h"
-#include "kvm_s390x.h"
+#include "kvm/kvm_s390x.h"
 
 int handle_diag_288(CPUS390XState *env, uint64_t r1, uint64_t r3)
 {
diff --git a/target/s390x/interrupt.c b/target/s390x/interrupt.c
index 734f0c62de..5195f060ec 100644
--- a/target/s390x/interrupt.c
+++ b/target/s390x/interrupt.c
@@ -9,7 +9,7 @@
 
 #include "qemu/osdep.h"
 #include "cpu.h"
-#include "kvm_s390x.h"
+#include "kvm/kvm_s390x.h"
 #include "s390x-internal.h"
 #include "exec/exec-all.h"
 #include "sysemu/kvm.h"
diff --git a/target/s390x/kvm.c b/target/s390x/kvm/kvm.c
similarity index 99%
rename from target/s390x/kvm.c
rename to target/s390x/kvm/kvm.c
index 2a22cc69f6..4e47563faf 100644
--- a/target/s390x/k

[RFC v4 10/14] target/s390x: use kvm_enabled() to wrap call to kvm_s390_get_hpage_1m

2021-05-23 Thread Cho, Yu-Chen
this will allow to remove the kvm stubs.

Signed-off-by: Claudio Fontana 
Signed-off-by: Cho, Yu-Chen 
---
 target/s390x/diag.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/s390x/diag.c b/target/s390x/diag.c
index c17a2498a7..8405f69df0 100644
--- a/target/s390x/diag.c
+++ b/target/s390x/diag.c
@@ -20,6 +20,7 @@
 #include "hw/s390x/ipl.h"
 #include "hw/s390x/s390-virtio-ccw.h"
 #include "hw/s390x/pv.h"
+#include "sysemu/kvm.h"
 #include "kvm_s390x.h"
 
 int handle_diag_288(CPUS390XState *env, uint64_t r1, uint64_t r3)
@@ -168,7 +169,7 @@ out:
 return;
 }
 
-if (kvm_s390_get_hpage_1m()) {
+if (kvm_enabled() && kvm_s390_get_hpage_1m()) {
 error_report("Protected VMs can currently not be backed with "
  "huge pages");
 env->regs[r1 + 1] = DIAG_308_RC_INVAL_FOR_PV;
-- 
2.31.1




[RFC v4 14/14] MAINTAINERS: update s390x directories

2021-05-23 Thread Cho, Yu-Chen
After the reshuffling, update MAINTAINERS accordingly.
Make use of the new directories:

target/s390x/kvm/
target/s390x/tcg/

Signed-off-by: Claudio Fontana 
Signed-off-by: Cho, Yu-Chen 
---
 MAINTAINERS | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 89741cfc19..8578927961 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -293,7 +293,7 @@ S390 TCG CPUs
 M: Richard Henderson 
 M: David Hildenbrand 
 S: Maintained
-F: target/s390x/
+F: target/s390x/tcg
 F: hw/s390x/
 F: disas/s390.c
 F: tests/tcg/s390x/
@@ -389,14 +389,12 @@ M: Halil Pasic 
 M: Cornelia Huck 
 M: Christian Borntraeger 
 S: Supported
-F: target/s390x/kvm.c
-F: target/s390x/kvm_s390x.h
-F: target/s390x/kvm-stub.c
+F: target/s390x/kvm/
 F: target/s390x/ioinst.[ch]
 F: target/s390x/machine.c
 F: target/s390x/sigp.c
 F: target/s390x/cpu_features*.[ch]
-F: target/s390x/cpu_models.[ch]
+F: target/s390x/cpu_models*.[ch]
 F: hw/s390x/pv.c
 F: include/hw/s390x/pv.h
 F: hw/intc/s390_flic.c
-- 
2.31.1




[RFC v4 13/14] target/s390x: split sysemu part of cpu models

2021-05-23 Thread Cho, Yu-Chen
also create a tiny _user.c with just the (at least for now),
empty implementation of apply_cpu_model.

Signed-off-by: Claudio Fontana 
Signed-off-by: Cho, Yu-Chen 
---
 target/s390x/cpu_models.c| 417 +-
 target/s390x/cpu_models_sysemu.c | 426 +++
 target/s390x/cpu_models_user.c   |  20 ++
 target/s390x/meson.build |   4 +
 target/s390x/s390x-internal.h|   2 +
 5 files changed, 453 insertions(+), 416 deletions(-)
 create mode 100644 target/s390x/cpu_models_sysemu.c
 create mode 100644 target/s390x/cpu_models_user.c

diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 0ed1c23774..30a192590d 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -18,18 +18,11 @@
 #include "sysemu/tcg.h"
 #include "qapi/error.h"
 #include "qapi/visitor.h"
-#include "qemu/error-report.h"
 #include "qemu/module.h"
 #include "qemu/qemu-print.h"
-#include "qapi/qmp/qerror.h"
-#include "qapi/qobject-input-visitor.h"
-#include "qapi/qmp/qdict.h"
 #ifndef CONFIG_USER_ONLY
-#include "sysemu/arch_init.h"
 #include "sysemu/sysemu.h"
-#include "hw/pci/pci.h"
 #endif
-#include "qapi/qapi-commands-machine-target.h"
 #include "hw/s390x/pv.h"
 
 #define CPUDEF_INIT(_type, _gen, _ec_ga, _mha_pow, _hmfai, _name, _desc) \
@@ -414,381 +407,6 @@ void s390_cpu_list(void)
 }
 }
 
-static S390CPUModel *get_max_cpu_model(Error **errp);
-
-#ifndef CONFIG_USER_ONLY
-static void list_add_feat(const char *name, void *opaque);
-
-static void check_unavailable_features(const S390CPUModel *max_model,
-   const S390CPUModel *model,
-   strList **unavailable)
-{
-S390FeatBitmap missing;
-
-/* check general model compatibility */
-if (max_model->def->gen < model->def->gen ||
-(max_model->def->gen == model->def->gen &&
- max_model->def->ec_ga < model->def->ec_ga)) {
-list_add_feat("type", unavailable);
-}
-
-/* detect missing features if any to properly report them */
-bitmap_andnot(missing, model->features, max_model->features,
-  S390_FEAT_MAX);
-if (!bitmap_empty(missing, S390_FEAT_MAX)) {
-s390_feat_bitmap_to_ascii(missing, unavailable, list_add_feat);
-}
-}
-
-struct CpuDefinitionInfoListData {
-CpuDefinitionInfoList *list;
-S390CPUModel *model;
-};
-
-static void create_cpu_model_list(ObjectClass *klass, void *opaque)
-{
-struct CpuDefinitionInfoListData *cpu_list_data = opaque;
-CpuDefinitionInfoList **cpu_list = &cpu_list_data->list;
-CpuDefinitionInfo *info;
-char *name = g_strdup(object_class_get_name(klass));
-S390CPUClass *scc = S390_CPU_CLASS(klass);
-
-/* strip off the -s390x-cpu */
-g_strrstr(name, "-" TYPE_S390_CPU)[0] = 0;
-info = g_new0(CpuDefinitionInfo, 1);
-info->name = name;
-info->has_migration_safe = true;
-info->migration_safe = scc->is_migration_safe;
-info->q_static = scc->is_static;
-info->q_typename = g_strdup(object_class_get_name(klass));
-/* check for unavailable features */
-if (cpu_list_data->model) {
-Object *obj;
-S390CPU *sc;
-obj = object_new_with_class(klass);
-sc = S390_CPU(obj);
-if (sc->model) {
-info->has_unavailable_features = true;
-check_unavailable_features(cpu_list_data->model, sc->model,
-   &info->unavailable_features);
-}
-object_unref(obj);
-}
-
-QAPI_LIST_PREPEND(*cpu_list, info);
-}
-
-CpuDefinitionInfoList *qmp_query_cpu_definitions(Error **errp)
-{
-struct CpuDefinitionInfoListData list_data = {
-.list = NULL,
-};
-
-list_data.model = get_max_cpu_model(NULL);
-
-object_class_foreach(create_cpu_model_list, TYPE_S390_CPU, false,
- &list_data);
-
-return list_data.list;
-}
-
-static void cpu_model_from_info(S390CPUModel *model, const CpuModelInfo *info,
-Error **errp)
-{
-Error *err = NULL;
-const QDict *qdict = NULL;
-const QDictEntry *e;
-Visitor *visitor;
-ObjectClass *oc;
-S390CPU *cpu;
-Object *obj;
-
-if (info->props) {
-qdict = qobject_to(QDict, info->props);
-if (!qdict) {
-error_setg(errp, QERR_INVALID_PARAMETER_TYPE, "props", "dict");
-return;
-}
-}
-
-oc = cpu_class_by_name(TYPE_S390_CPU, info->name);
-if (!oc) {
-error_setg(errp, "The CPU definition \'%s\' is unknown.", info->name);
-return;
-}
-if (S390_CPU_CLASS(oc)->kvm_required && !kvm_enabled()) {
-error_setg(errp, "The CPU definition '%s' requires KVM", info->name);
-return;
-}
-obj = object_new_with_class(oc);
-cpu = S390_CPU(obj);
-
-if (!cpu->model) {
-error_setg(errp, "Details about the host CPU model are not available, "
-

Re: [PATCH v3 5/9] target/ppc: removed unnecessary inclusion of helper-proto.h

2021-05-23 Thread David Gibson
On Fri, May 21, 2021 at 05:17:55PM -0300, Bruno Larsen (billionai) wrote:
> These files included helper-proto.h, but didn't use or declare any
> helpers, so the #include has been removed
> 
> Signed-off-by: Bruno Larsen (billionai) 
> Reviewed-by: Richard Henderson 

Applied to ppc-for-6.1, thanks.

> ---
>  target/ppc/cpu_init.c| 1 -
>  target/ppc/gdbstub.c | 1 -
>  target/ppc/mmu-hash32.c  | 1 -
>  target/ppc/mmu-radix64.c | 1 -
>  4 files changed, 4 deletions(-)
> 
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index 3365135896..b696469d1a 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -43,7 +43,6 @@
>  #include "fpu/softfloat.h"
>  #include "qapi/qapi-commands-machine-target.h"
>  
> -#include "exec/helper-proto.h"
>  #include "helper_regs.h"
>  #include "internal.h"
>  #include "spr_tcg.h"
> diff --git a/target/ppc/gdbstub.c b/target/ppc/gdbstub.c
> index c7d866cfcc..09ff1328d4 100644
> --- a/target/ppc/gdbstub.c
> +++ b/target/ppc/gdbstub.c
> @@ -20,7 +20,6 @@
>  #include "qemu/osdep.h"
>  #include "cpu.h"
>  #include "exec/gdbstub.h"
> -#include "exec/helper-proto.h"
>  #include "internal.h"
>  
>  static int ppc_gdb_register_len_apple(int n)
> diff --git a/target/ppc/mmu-hash32.c b/target/ppc/mmu-hash32.c
> index 32d1f4a954..6a07c345e4 100644
> --- a/target/ppc/mmu-hash32.c
> +++ b/target/ppc/mmu-hash32.c
> @@ -21,7 +21,6 @@
>  #include "qemu/osdep.h"
>  #include "cpu.h"
>  #include "exec/exec-all.h"
> -#include "exec/helper-proto.h"
>  #include "sysemu/kvm.h"
>  #include "kvm_ppc.h"
>  #include "internal.h"
> diff --git a/target/ppc/mmu-radix64.c b/target/ppc/mmu-radix64.c
> index eabfe4e261..cbd404bfa4 100644
> --- a/target/ppc/mmu-radix64.c
> +++ b/target/ppc/mmu-radix64.c
> @@ -20,7 +20,6 @@
>  #include "qemu/osdep.h"
>  #include "cpu.h"
>  #include "exec/exec-all.h"
> -#include "exec/helper-proto.h"
>  #include "qemu/error-report.h"
>  #include "sysemu/kvm.h"
>  #include "kvm_ppc.h"

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH v3 2/9] target/ppc: moved ppc_store_lpcr and ppc_store_msr to cpu.c

2021-05-23 Thread David Gibson
On Fri, May 21, 2021 at 05:17:52PM -0300, Bruno Larsen (billionai) wrote:
> These functions are used in hw/ppc logic, during machine startup, which
> means it must be compiled when --disable-tcg is selected, and so it has
> been moved into a common code file
> 
> Signed-off-by: Bruno Larsen (billionai) 
> Reviewed-by: Richard Henderson 
> Reviewed-by: David Gibson 

Applied to ppc-for-6.1, thanks.

> ---
>  target/ppc/cpu.c | 17 +
>  target/ppc/misc_helper.c | 16 
>  2 files changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/target/ppc/cpu.c b/target/ppc/cpu.c
> index 9cf3288b7a..c8e87e30f1 100644
> --- a/target/ppc/cpu.c
> +++ b/target/ppc/cpu.c
> @@ -24,6 +24,7 @@
>  #include "exec/log.h"
>  #include "fpu/softfloat-helpers.h"
>  #include "mmu-hash64.h"
> +#include "helper_regs.h"
>  
>  target_ulong cpu_read_xer(CPUPPCState *env)
>  {
> @@ -92,3 +93,19 @@ void ppc_store_sdr1(CPUPPCState *env, target_ulong value)
>  env->spr[SPR_SDR1] = value;
>  }
>  #endif /* CONFIG_SOFTMMU */
> +
> +/* GDBstub can read and write MSR... */
> +void ppc_store_msr(CPUPPCState *env, target_ulong value)
> +{
> +hreg_store_msr(env, value, 0);
> +}
> +
> +void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
> +{
> +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> +CPUPPCState *env = &cpu->env;
> +
> +env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
> +/* The gtse bit affects hflags */
> +hreg_compute_hflags(env);
> +}
> diff --git a/target/ppc/misc_helper.c b/target/ppc/misc_helper.c
> index 08a31da289..442b12652c 100644
> --- a/target/ppc/misc_helper.c
> +++ b/target/ppc/misc_helper.c
> @@ -255,22 +255,6 @@ target_ulong helper_clcs(CPUPPCState *env, uint32_t arg)
>  
> /*/
>  /* Special registers manipulation */
>  
> -/* GDBstub can read and write MSR... */
> -void ppc_store_msr(CPUPPCState *env, target_ulong value)
> -{
> -hreg_store_msr(env, value, 0);
> -}
> -
> -void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
> -{
> -PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> -CPUPPCState *env = &cpu->env;
> -
> -env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
> -/* The gtse bit affects hflags */
> -hreg_compute_hflags(env);
> -}
> -
>  /*
>   * This code is lifted from MacOnLinux. It is called whenever THRM1,2
>   * or 3 is read an fixes up the values in such a way that will make

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH v3 7/9] target/ppc: Added options to disable many TCG-only functions

2021-05-23 Thread David Gibson
On Fri, May 21, 2021 at 05:17:57PM -0300, Bruno Larsen (billionai) wrote:
> Wrapped some function calls in cpu_init.c, gdbstub.c, mmu-hash64.c,
> mmu_helper.c and excp_helper.c that were TCG only with ifdef
> CONFIG_TCG,
> to support building without TCG.
> 
> for excp_helper we also moved the function do_rfi higher in the file to
> reduce the ifdef count.
> 
> For cpu_init.c, we will also create stubs for ppc_*_opcodes, to make the
> ifdef hell a little smaller, and have hid part of the spr_registration
> logic into the macro that can make the TCG part disappear.
> 
> Signed-off-by: Bruno Larsen (billionai) 
> ---
>  target/ppc/cpu_init.c| 11 +++---
>  target/ppc/excp_helper.c | 85 +++-
>  target/ppc/mmu-hash64.c  | 11 +-
>  target/ppc/mmu_helper.c  | 16 +++-
>  4 files changed, 78 insertions(+), 45 deletions(-)
> 
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index b696469d1a..f5ae2f150d 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -1205,15 +1205,12 @@ static void register_BookE206_sprs(CPUPPCState *env, 
> uint32_t mas_mask,
>  /* TLB assist registers */
>  /* XXX : not implemented */
>  for (i = 0; i < 8; i++) {
> -void (*uea_write)(DisasContext *ctx, int sprn, int gprn) =
> -&spr_write_generic32;
> -if (i == 2 && (mas_mask & (1 << i)) && (env->insns_flags & PPC_64B)) 
> {
> -uea_write = &spr_write_generic;
> -}
>  if (mas_mask & (1 << i)) {
>  spr_register(env, mas_sprn[i], mas_names[i],
>   SPR_NOACCESS, SPR_NOACCESS,
> - &spr_read_generic, uea_write,
> + &spr_read_generic,
> + (i == 2 && (env->insns_flags & PPC_64B))
> + ? &spr_write_generic : &spr_write_generic32,


I'd prefer to see this change as a separate patch, since it's not just
adding an ifdef.

>   0x);
>  }
>  }
> @@ -9253,7 +9250,9 @@ static void ppc_cpu_class_init(ObjectClass *oc, void 
> *data)
>  cc->class_by_name = ppc_cpu_class_by_name;
>  cc->has_work = ppc_cpu_has_work;
>  cc->dump_state = ppc_cpu_dump_state;
> +#ifdef CONFIG_TCG
>  cc->dump_statistics = ppc_cpu_dump_statistics;
> +#endif
>  cc->set_pc = ppc_cpu_set_pc;
>  cc->gdb_read_register = ppc_cpu_gdb_read_register;
>  cc->gdb_write_register = ppc_cpu_gdb_write_register;
> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> index 80bb6e70e9..e20f38ebe2 100644
> --- a/target/ppc/excp_helper.c
> +++ b/target/ppc/excp_helper.c
> @@ -19,12 +19,15 @@
>  #include "qemu/osdep.h"
>  #include "qemu/main-loop.h"
>  #include "cpu.h"
> -#include "exec/helper-proto.h"
>  #include "exec/exec-all.h"
> -#include "exec/cpu_ldst.h"
>  #include "internal.h"
>  #include "helper_regs.h"
>  
> +#ifdef CONFIG_TCG
> +#include "exec/helper-proto.h"
> +#include "exec/cpu_ldst.h"
> +#endif
> +
>  /* #define DEBUG_OP */
>  /* #define DEBUG_SOFTWARE_TLB */
>  /* #define DEBUG_EXCEPTIONS */
> @@ -1191,6 +1194,7 @@ void raise_exception_ra(CPUPPCState *env, uint32_t 
> exception,
>  raise_exception_err_ra(env, exception, 0, raddr);
>  }
>  
> +#ifdef CONFIG_TCG
>  void helper_raise_exception_err(CPUPPCState *env, uint32_t exception,
>  uint32_t error_code)
>  {
> @@ -1201,8 +1205,43 @@ void helper_raise_exception(CPUPPCState *env, uint32_t 
> exception)
>  {
>  raise_exception_err_ra(env, exception, 0, 0);
>  }
> +#endif
>  
>  #if !defined(CONFIG_USER_ONLY)
> +static inline void do_rfi(CPUPPCState *env, target_ulong nip, target_ulong 
> msr)
> +{
> +CPUState *cs = env_cpu(env);
> +

IIUC this code motion is to reduce the number of ifdef blocks you
need.  I'm actually inclined to leave this chunk in place, even though
it means an extra ifdef block, just to make it clearer what's going on
in the diff.

We could do the code motion to clean up as an additional patch (either
before or after would be fine, as would just not bothering for now).

> +/* MSR:POW cannot be set by any form of rfi */
> +msr &= ~(1ULL << MSR_POW);
> +
> +#if defined(TARGET_PPC64)
> +/* Switching to 32-bit ? Crop the nip */
> +if (!msr_is_64bit(env, msr)) {
> +nip = (uint32_t)nip;
> +}
> +#else
> +nip = (uint32_t)nip;
> +#endif
> +/* XXX: beware: this is false if VLE is supported */
> +env->nip = nip & ~((target_ulong)0x0003);
> +hreg_store_msr(env, msr, 1);
> +#if defined(DEBUG_OP)
> +cpu_dump_rfi(env->nip, env->msr);
> +#endif
> +/*
> + * No need to raise an exception here, as rfi is always the last
> + * insn of a TB
> + */
> +cpu_interrupt_exittb(cs);
> +/* Reset the reservation */
> +env->reserve_addr = -1;
> +
> +/* Context synchronizing: check if TCG TLB needs flush */
> +check_tlb_flush(env, false);
> +}
> +
> +#ifdef CONFIG_T

Re: [PATCH v3 1/9] target/ppc: cleaned error_report from ppc_store_sdr1

2021-05-23 Thread David Gibson
On Fri, May 21, 2021 at 05:17:51PM -0300, Bruno Larsen (billionai) wrote:
> Changed how the function ppc_store_sdr1, from error_report(...) to
> qemu_log_mask(LOG_GUEST_ERROR, ...).
> 
> Signed-off-by: Bruno Larsen (billionai) 
> Suggested-by: Richard Henderson 

Applied to ppc-for-6.1, thanks.

> ---
>  target/ppc/cpu.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/target/ppc/cpu.c b/target/ppc/cpu.c
> index d957d1a687..9cf3288b7a 100644
> --- a/target/ppc/cpu.c
> +++ b/target/ppc/cpu.c
> @@ -77,13 +77,13 @@ void ppc_store_sdr1(CPUPPCState *env, target_ulong value)
>  target_ulong htabsize = value & SDR_64_HTABSIZE;
>  
>  if (value & ~sdr_mask) {
> -error_report("Invalid bits 0x"TARGET_FMT_lx" set in SDR1",
> - value & ~sdr_mask);
> +qemu_log_mask(LOG_GUEST_ERROR, "Invalid bits 0x"TARGET_FMT_lx
> + " set in SDR1", value & ~sdr_mask);
>  value &= sdr_mask;
>  }
>  if (htabsize > 28) {
> -error_report("Invalid HTABSIZE 0x" TARGET_FMT_lx" stored in 
> SDR1",
> - htabsize);
> +qemu_log_mask(LOG_GUEST_ERROR, "Invalid HTABSIZE 0x" 
> TARGET_FMT_lx
> + " stored in SDR1", htabsize);
>  return;
>  }
>  }

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH v3 6/9] target/ppc: moved ppc_cpu_do_interrupt to cpu.c

2021-05-23 Thread David Gibson
On Fri, May 21, 2021 at 05:17:56PM -0300, Bruno Larsen (billionai) wrote:
> Moved the ppc_cpu_do_interrupt function to cpu.c file, where it makes
> more sense, and turned powerpc_excp not static, as it now needs to be
> accessed from outside of excp_helper.c
> 
> Signed-off-by: Bruno Larsen (billionai)
> 

Looks ok to me, but I'd like to get a review from Richard before applying.

> ---
>  target/ppc/cpu.c | 20 
>  target/ppc/cpu.h |  1 +
>  target/ppc/excp_helper.c | 19 +--
>  3 files changed, 22 insertions(+), 18 deletions(-)
> 
> diff --git a/target/ppc/cpu.c b/target/ppc/cpu.c
> index 19d67b5b07..95898f348b 100644
> --- a/target/ppc/cpu.c
> +++ b/target/ppc/cpu.c
> @@ -152,3 +152,23 @@ void ppc_store_fpscr(CPUPPCState *env, target_ulong val)
>  fpscr_set_rounding_mode(env);
>  }
>  }
> +
> +/* Exception processing */
> +#if defined(CONFIG_USER_ONLY)
> +void ppc_cpu_do_interrupt(CPUState *cs)
> +{
> +PowerPCCPU *cpu = POWERPC_CPU(cs);
> +CPUPPCState *env = &cpu->env;
> +
> +cs->exception_index = POWERPC_EXCP_NONE;
> +env->error_code = 0;
> +}
> +#else
> +void ppc_cpu_do_interrupt(CPUState *cs)
> +{
> +PowerPCCPU *cpu = POWERPC_CPU(cs);
> +CPUPPCState *env = &cpu->env;
> +
> +powerpc_excp(cpu, env->excp_model, cs->exception_index);
> +}
> +#endif
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index 203f07e48e..65a08cc424 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -1254,6 +1254,7 @@ DECLARE_OBJ_CHECKERS(PPCVirtualHypervisor, 
> PPCVirtualHypervisorClass,
>  #endif /* CONFIG_USER_ONLY */
>  
>  void ppc_cpu_do_interrupt(CPUState *cpu);
> +void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp);
>  bool ppc_cpu_exec_interrupt(CPUState *cpu, int int_req);
>  void ppc_cpu_dump_state(CPUState *cpu, FILE *f, int flags);
>  void ppc_cpu_dump_statistics(CPUState *cpu, int flags);
> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> index f4f15279eb..80bb6e70e9 100644
> --- a/target/ppc/excp_helper.c
> +++ b/target/ppc/excp_helper.c
> @@ -38,15 +38,6 @@
>  
> /*/
>  /* Exception processing */
>  #if defined(CONFIG_USER_ONLY)
> -void ppc_cpu_do_interrupt(CPUState *cs)
> -{
> -PowerPCCPU *cpu = POWERPC_CPU(cs);
> -CPUPPCState *env = &cpu->env;
> -
> -cs->exception_index = POWERPC_EXCP_NONE;
> -env->error_code = 0;
> -}
> -
>  static void ppc_hw_interrupt(CPUPPCState *env)
>  {
>  CPUState *cs = env_cpu(env);
> @@ -324,7 +315,7 @@ static inline void powerpc_set_excp_state(PowerPCCPU *cpu,
>   * Note that this function should be greatly optimized when called
>   * with a constant excp, from ppc_hw_interrupt
>   */
> -static inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp)
> +inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp)
>  {
>  CPUState *cs = CPU(cpu);
>  CPUPPCState *env = &cpu->env;
> @@ -968,14 +959,6 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
> excp_model, int excp)
>  powerpc_set_excp_state(cpu, vector, new_msr);
>  }
>  
> -void ppc_cpu_do_interrupt(CPUState *cs)
> -{
> -PowerPCCPU *cpu = POWERPC_CPU(cs);
> -CPUPPCState *env = &cpu->env;
> -
> -powerpc_excp(cpu, env->excp_model, cs->exception_index);
> -}
> -
>  static void ppc_hw_interrupt(CPUPPCState *env)
>  {
>  PowerPCCPU *cpu = env_archcpu(env);

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH v3 3/9] target/ppc: reduce usage of fpscr_set_rounding_mode

2021-05-23 Thread David Gibson
On Fri, May 21, 2021 at 05:17:53PM -0300, Bruno Larsen (billionai) wrote:
> It is preferable to store the current rounding mode and retore from that
> than recalculating from fpscr, so we changed the behavior of do_fri and
> VSX_ROUND to do it like that.
> 
> Suggested-by: Richard Henderson 
> Signed-off-by: Bruno Larsen (billionai) 
> Reviewed-by: Richard Henderson 

Applied to ppc-for-6.1, thanks.

> ---
>  target/ppc/fpu_helper.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index 44315fca0b..a4a283df2b 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -822,6 +822,7 @@ static inline uint64_t do_fri(CPUPPCState *env, uint64_t 
> arg,
>int rounding_mode)
>  {
>  CPU_DoubleU farg;
> +FloatRoundMode old_rounding_mode = 
> get_float_rounding_mode(&env->fp_status);
>  
>  farg.ll = arg;
>  
> @@ -834,8 +835,7 @@ static inline uint64_t do_fri(CPUPPCState *env, uint64_t 
> arg,
>float_flag_inexact;
>  set_float_rounding_mode(rounding_mode, &env->fp_status);
>  farg.ll = float64_round_to_int(farg.d, &env->fp_status);
> -/* Restore rounding mode from FPSCR */
> -fpscr_set_rounding_mode(env);
> +set_float_rounding_mode(old_rounding_mode, &env->fp_status);
>  
>  /* fri* does not set FPSCR[XX] */
>  if (!inexact) {
> @@ -3136,8 +3136,10 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, 
> ppc_vsr_t *xb)   \
>  {  \
>  ppc_vsr_t t = *xt; \
>  int i; \
> +FloatRoundMode curr_rounding_mode; \
> \
>  if (rmode != FLOAT_ROUND_CURRENT) {\
> +curr_rounding_mode = get_float_rounding_mode(&env->fp_status); \
>  set_float_rounding_mode(rmode, &env->fp_status);   \
>  }  \
> \
> @@ -3160,7 +3162,7 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, 
> ppc_vsr_t *xb)   \
>   * mode from FPSCR \
>   */\
>  if (rmode != FLOAT_ROUND_CURRENT) {\
> -fpscr_set_rounding_mode(env);  \
> +set_float_rounding_mode(curr_rounding_mode, &env->fp_status);  \
>  env->fp_status.float_exception_flags &= ~float_flag_inexact;   \
>  }  \
> \

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH v3 4/9] target/ppc: overhauled and moved logic of storing fpscr

2021-05-23 Thread David Gibson
On Fri, May 21, 2021 at 05:17:54PM -0300, Bruno Larsen (billionai) wrote:
65;6401;1c> Followed the suggested overhaul to store_fpscr logic, and moved it 
to
> cpu.c where it can be accessed in !TCG builds.
> 
> The overhaul was suggesting because storing a value to fpscr should
> never raise an exception, so we could remove all the mess that happened
> with POWERPC_EXCP_FP.
> 
> We also moved fpscr_set_rounding_mode into cpu.c as it could now be moved
> there, and it is needed when a value for the fpscr is being stored
> directly.
> 
> Suggested-by: Richard Henderson 
> Signed-off-by: Bruno Larsen (billionai) 
> Reviewed-by: Richard Henderson 

Applied to ppc-for-6.1, thanks.

> ---
>  target/ppc/cpu.c|  43 
>  target/ppc/cpu.h|  12 +-
>  target/ppc/fpu_helper.c | 238 +++-
>  target/ppc/gdbstub.c|   6 +-
>  4 files changed, 65 insertions(+), 234 deletions(-)
> 
> diff --git a/target/ppc/cpu.c b/target/ppc/cpu.c
> index c8e87e30f1..19d67b5b07 100644
> --- a/target/ppc/cpu.c
> +++ b/target/ppc/cpu.c
> @@ -25,6 +25,7 @@
>  #include "fpu/softfloat-helpers.h"
>  #include "mmu-hash64.h"
>  #include "helper_regs.h"
> +#include "sysemu/tcg.h"
>  
>  target_ulong cpu_read_xer(CPUPPCState *env)
>  {
> @@ -109,3 +110,45 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
>  /* The gtse bit affects hflags */
>  hreg_compute_hflags(env);
>  }
> +
> +static inline void fpscr_set_rounding_mode(CPUPPCState *env)
> +{
> +int rnd_type;
> +
> +/* Set rounding mode */
> +switch (fpscr_rn) {
> +case 0:
> +/* Best approximation (round to nearest) */
> +rnd_type = float_round_nearest_even;
> +break;
> +case 1:
> +/* Smaller magnitude (round toward zero) */
> +rnd_type = float_round_to_zero;
> +break;
> +case 2:
> +/* Round toward +infinite */
> +rnd_type = float_round_up;
> +break;
> +default:
> +case 3:
> +/* Round toward -infinite */
> +rnd_type = float_round_down;
> +break;
> +}
> +set_float_rounding_mode(rnd_type, &env->fp_status);
> +}
> +
> +void ppc_store_fpscr(CPUPPCState *env, target_ulong val)
> +{
> +val &= ~(FP_VX | FP_FEX);
> +if (val & FPSCR_IX) {
> +val |= FP_VX;
> +}
> +if ((val >> FPSCR_XX) & (val >> FPSCR_XE) & 0x1f) {
> +val |= FP_FEX;
> +}
> +env->fpscr = val;
> +if (tcg_enabled()) {
> +fpscr_set_rounding_mode(env);
> +}
> +}
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index cab33a3680..203f07e48e 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -675,11 +675,11 @@ enum {
>  #define fpscr_ni (((env->fpscr) >> FPSCR_NI) & 0x1)
>  #define fpscr_rn (((env->fpscr) >> FPSCR_RN0)& 0x3)
>  /* Invalid operation exception summary */
> -#define fpscr_ix ((env->fpscr) & ((1 << FPSCR_VXSNAN) | (1 << FPSCR_VXISI)  
> | \
> -  (1 << FPSCR_VXIDI)  | (1 << FPSCR_VXZDZ)  
> | \
> -  (1 << FPSCR_VXIMZ)  | (1 << FPSCR_VXVC)   
> | \
> -  (1 << FPSCR_VXSOFT) | (1 << FPSCR_VXSQRT) 
> | \
> -  (1 << FPSCR_VXCVI)))
> +#define FPSCR_IX ((1 << FPSCR_VXSNAN) | (1 << FPSCR_VXISI)  | \
> +  (1 << FPSCR_VXIDI)  | (1 << FPSCR_VXZDZ)  | \
> +  (1 << FPSCR_VXIMZ)  | (1 << FPSCR_VXVC)   | \
> +  (1 << FPSCR_VXSOFT) | (1 << FPSCR_VXSQRT) | \
> +  (1 << FPSCR_VXCVI))
>  /* exception summary */
>  #define fpscr_ex  (((env->fpscr) >> FPSCR_XX) & 0x1F)
>  /* enabled exception summary */
> @@ -1334,7 +1334,7 @@ void cpu_ppc_set_vhyp(PowerPCCPU *cpu, 
> PPCVirtualHypervisor *vhyp);
>  #endif
>  #endif
>  
> -void store_fpscr(CPUPPCState *env, uint64_t arg, uint32_t mask);
> +void ppc_store_fpscr(CPUPPCState *env, target_ulong val);
>  void helper_hfscr_facility_check(CPUPPCState *env, uint32_t bit,
>   const char *caller, uint32_t cause);
>  
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index a4a283df2b..0f4074fc7e 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -383,247 +383,35 @@ static inline void float_inexact_excp(CPUPPCState *env)
>  }
>  }
>  
> -static inline void fpscr_set_rounding_mode(CPUPPCState *env)
> -{
> -int rnd_type;
> -
> -/* Set rounding mode */
> -switch (fpscr_rn) {
> -case 0:
> -/* Best approximation (round to nearest) */
> -rnd_type = float_round_nearest_even;
> -break;
> -case 1:
> -/* Smaller magnitude (round toward zero) */
> -rnd_type = float_round_to_zero;
> -break;
> -case 2:
> -/* Round toward +infinite */
> -rnd_type = float_round_up;
> -break;
> -default:
> -case 3:
> -/* Round toward -infinite */
> -rnd_

Re: [PATCH v3 8/9] target/ppc: created tcg-stub.c file

2021-05-23 Thread David Gibson
On Fri, May 21, 2021 at 05:17:58PM -0300, Bruno Larsen (billionai) wrote:
> Created a file with stubs needed to compile disabling TCG. *_ppc_opcodes
> were created to make cpu_init.c have a few less ifdefs, since they are
> not needed. coftmmu_resize_hpt_* have to be created because the compiler
> can't automatically know they aren't used, but they should never be
> reached.
> 
> Signed-off-by: Bruno Larsen (billionai) 
> ---
>  target/ppc/meson.build |  4 
>  target/ppc/tcg-stub.c  | 25 +
>  2 files changed, 29 insertions(+)
>  create mode 100644 target/ppc/tcg-stub.c
> 
> diff --git a/target/ppc/meson.build b/target/ppc/meson.build
> index d1aa7d5d39..848e625302 100644
> --- a/target/ppc/meson.build
> +++ b/target/ppc/meson.build
> @@ -28,6 +28,10 @@ ppc_softmmu_ss.add(files(
>'mmu_helper.c',
>'monitor.c',
>  ))
> +ppc_softmmu_ss.add(when: 'CONFIG_TCG', if_false: files(
> +  'tcg-stub.c'
> +))
> +
>  ppc_softmmu_ss.add(when: 'TARGET_PPC64', if_true: files(
>'compat.c',
>'mmu-book3s-v3.c',
> diff --git a/target/ppc/tcg-stub.c b/target/ppc/tcg-stub.c
> new file mode 100644
> index 00..6d99834274
> --- /dev/null
> +++ b/target/ppc/tcg-stub.c

New file needs a copyright banner.

> @@ -0,0 +1,25 @@
> +#include "qemu/osdep.h"
> +#include "cpu.h"
> +#include "internal.h"
> +#include "hw/ppc/spapr.h"
> +
> +void create_ppc_opcodes(PowerPCCPU *cpu, Error **errp)
> +{}

I believe style dictates this be
{
}

rather than
{}


> +
> +void destroy_ppc_opcodes(PowerPCCPU *cpu)
> +{}
> +
> +target_ulong softmmu_resize_hpt_prepare(PowerPCCPU *cpu,
> +SpaprMachineState *spapr,
> +target_ulong shift)
> +{
> +g_assert_not_reached();
> +}
> +
> +target_ulong softmmu_resize_hpt_commit(PowerPCCPU *cpu,
> +   SpaprMachineState *spapr,
> +   target_ulong flags,
> +   target_ulong shift)
> +{
> +g_assert_not_reached();
> +}

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH 09/11] include/exec: added functions to the stubs in exec-all.h

2021-05-23 Thread David Gibson
On Mon, May 17, 2021 at 01:59:35PM -0300, Lucas Mateus Martins Araujo e Castro 
wrote:
> 
> On 17/05/2021 00:58, David Gibson wrote:
> > On Thu, May 13, 2021 at 06:44:01PM -0500, Richard Henderson wrote:
> > 65;6401;1c> On 5/13/21 9:03 AM, Lucas Mateus Martins Araujo e Castro wrote:
> > > > tlb_set_page is called by many ppc_hash64_handle_mmu_fault,
> > > > ppc_radix64_handle_mmu_fault and ppc_hash32_handle_mmu_fault, all of
> > > > which from what I've seen are only used inside #if
> > > > defined(CONFIG_SOFTMMU).
> > > tlb_set_page should only be called from one place: ppc_cpu_tlb_fill.  The
> > > other functions should fill in data, much like get_physical_address.
> > > 
> > > 
> > > > So what is the best way to deal with these tlb_set_page calls? Should
> > > > these part of the _handle_mmu_fault functions never be reached or should
> > > > these functions never be called?
> > > There is some duplication between get_physical_address* and
> > > *handle_mmu_fault that should be fixed.
> > > 
> > > What should be happening is that you have one function (per mmu type) that
> > > takes a virtual address and resolves a physical address. This bit of code
> > > should be written so that it is usable by both
> > > CPUClass.get_phys_page_attrs_debug and TCGCPUOps.tlb_fill.  It appears as 
> > > if
> > > ppc_radix64_xlate is the right interface for this.
> > > 
> > > It appears that real mode handling is duplicated between hash64 and 
> > > radix64,
> > > which could be unified.
> > Any common handling between the hash and radix MMUs should go in
> > mmu-book3s-v3.*  That covers common things across the v3 (POWER9 and
> > later) MMUs which includes both hash and radix mode.
> 
> I'm not completely sure how this should be handled, there's a
> get_physical_address in mmu_helper.c but it's a static function and divided
> by processor families instead of MMU types, so get_physical_address_* should
> be a new function?

Sorry, I wasn't clear.  mmu-v3 is only for things specifically common
to the hash64 model (as implemented in mmu-hash64.c) and the radix
model (as implemented in mmu-radix64.c).  Which basically means things
related to the POWER9 MMU which can switch between those modes.

Things common to *more* MMU models (hash32, 4xx, 8xx, BookE, etc.)
which includes most of what's in mmu-helper.c go elsewhere.

> The new get_physical_address_* function would be a mmu-hash(32|64) that do
> something like ppc_radix64_xlate and add a function to mmu-book3s-v3 that
> call either the radix64 or the hash64 function and also handle real mode
> access.
> 
> Also should the tlb_set_page calls in *_handle_mmu_fault be changed to
> ppc_cpu_tlb_fill or the function should themselves fill it?
> 
> > 
> > > You should only call tlb_set_page from TCGCPUOps.tlb_fill, aka
> > > ppc_cpu_tlb_fill.  TCGCPUOps.tlb_fill is obviously TCG only.
> > > 
> > > The version you are looking at here is system emulation specific (sysemu,
> > > !defined(CONFIG_USER_ONLY)).  There is a second version of this function,
> > > with the same signature, that is used for user emulation in the helpfully
> > > named user_only_helper.c.
> > > 
> > > 
> > > r~
> > > 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH] spapr: Don't hijack current_machine->boot_order

2021-05-23 Thread David Gibson
On Fri, May 21, 2021 at 06:07:35PM +0200, Greg Kurz wrote:
> QEMU 6.0 moved all the -boot variables to the machine. Especially, the
> removal of the boot_order static changed the handling of '-boot once'
> from:
> 
> if (boot_once) {
> qemu_boot_set(boot_once, &error_fatal);
> qemu_register_reset(restore_boot_order, g_strdup(boot_order));
> }
> 
> to
> 
> if (current_machine->boot_once) {
> qemu_boot_set(current_machine->boot_once, &error_fatal);
> qemu_register_reset(restore_boot_order,
> g_strdup(current_machine->boot_order));
> }
> 
> This means that we now register as subsequent boot order a copy
> of current_machine->boot_once that was just set with the previous
> call to qemu_boot_set(), i.e. we never transition away from the
> once boot order.
> 
> It is certainly fragile^Wwrong for the spapr code to hijack a
> field of the base machine type object like that. The boot order
> rework simply turned this software boundary violation into an
> actual bug.
> 
> Have the spapr code to handle that with its own field in
> SpaprMachineState. Also kfree() the initial boot device
> string when "once" was used.
> 
> Fixes: 4b7acd2ac821 ("vl: clean up -boot variables")
> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1960119
> Cc: pbonz...@redhat.com
> Signed-off-by: Greg Kurz 


Applied to ppc-for-6.1, thanks.

> ---
>  include/hw/ppc/spapr.h | 3 +++
>  hw/ppc/spapr.c | 8 +---
>  2 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index bbf817af4647..f05219f75ef6 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -223,6 +223,9 @@ struct SpaprMachineState {
>  int fwnmi_machine_check_interlock;
>  QemuCond fwnmi_machine_check_interlock_cond;
>  
> +/* Set by -boot */
> +char *boot_device;
> +
>  /*< public >*/
>  char *kvm_type;
>  char *host_model;
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index c23bcc449071..4dd90b75cc52 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1005,7 +1005,7 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, 
> void *fdt, bool reset)
>  _FDT(chosen = fdt_add_subnode(fdt, 0, "chosen"));
>  
>  if (reset) {
> -const char *boot_device = machine->boot_order;
> +const char *boot_device = spapr->boot_device;
>  char *stdout_path = spapr_vio_stdout_path(spapr->vio_bus);
>  size_t cb = 0;
>  char *bootlist = get_boot_devices_list(&cb);
> @@ -2376,8 +2376,10 @@ static SaveVMHandlers savevm_htab_handlers = {
>  static void spapr_boot_set(void *opaque, const char *boot_device,
> Error **errp)
>  {
> -MachineState *machine = MACHINE(opaque);
> -machine->boot_order = g_strdup(boot_device);
> +SpaprMachineState *spapr = SPAPR_MACHINE(opaque);
> +
> +g_free(spapr->boot_device);
> +spapr->boot_device = g_strdup(boot_device);
>  }
>  
>  static void spapr_create_lmb_dr_connectors(SpaprMachineState *spapr)

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH 22/24] target/ppc: Split out ppc_jumbo_xlate

2021-05-23 Thread David Gibson
On Wed, May 19, 2021 at 03:40:15PM -0300, Bruno Piazera Larsen wrote:
> 
> On 18/05/2021 17:11, Richard Henderson wrote:
> > Mirror the interface of ppc_radix64_xlate (mostly), putting all
> > of the logic for older mmu translation into a single entry point.
> > For booke, we need to add mmu_idx to the xlate-style interface.
> > 
> > Signed-off-by: Richard Henderson 
> Out of my depth again, but testing seems fine, so

In this case you want "Tested-by" rather than "Acked-by".

> Acked-by: Bruno Larsen (billionai)
> 
> > ---
> >   target/ppc/mmu_helper.c | 181 +---
> >   1 file changed, 97 insertions(+), 84 deletions(-)
> > 
> > diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
> > index 863e556a22..68c2e59238 100644
> > --- a/target/ppc/mmu_helper.c
> > +++ b/target/ppc/mmu_helper.c
> > @@ -1427,48 +1427,6 @@ static int get_physical_address(CPUPPCState *env, 
> > mmu_ctx_t *ctx,
> >   return get_physical_address_wtlb(env, ctx, eaddr, access_type, type, 
> > 0);
> >   }
> > -hwaddr ppc_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
> > -{
> > -PowerPCCPU *cpu = POWERPC_CPU(cs);
> > -CPUPPCState *env = &cpu->env;
> > -mmu_ctx_t ctx;
> > -
> > -switch (env->mmu_model) {
> > -#if defined(TARGET_PPC64)
> > -case POWERPC_MMU_64B:
> > -case POWERPC_MMU_2_03:
> > -case POWERPC_MMU_2_06:
> > -case POWERPC_MMU_2_07:
> > -return ppc_hash64_get_phys_page_debug(cpu, addr);
> > -case POWERPC_MMU_3_00:
> > -return ppc64_v3_get_phys_page_debug(cpu, addr);
> > -#endif
> > -
> > -case POWERPC_MMU_32B:
> > -case POWERPC_MMU_601:
> > -return ppc_hash32_get_phys_page_debug(cpu, addr);
> > -
> > -default:
> > -;
> > -}
> > -
> > -if (unlikely(get_physical_address(env, &ctx, addr, MMU_DATA_LOAD,
> > -  ACCESS_INT) != 0)) {
> > -
> > -/*
> > - * Some MMUs have separate TLBs for code and data. If we only
> > - * try an ACCESS_INT, we may not be able to read instructions
> > - * mapped by code TLBs, so we also try a ACCESS_CODE.
> > - */
> > -if (unlikely(get_physical_address(env, &ctx, addr, MMU_INST_FETCH,
> > -  ACCESS_CODE) != 0)) {
> > -return -1;
> > -}
> > -}
> > -
> > -return ctx.raddr & TARGET_PAGE_MASK;
> > -}
> > -
> >   static void booke206_update_mas_tlb_miss(CPUPPCState *env, target_ulong 
> > address,
> >MMUAccessType access_type, int 
> > mmu_idx)
> >   {
> > @@ -1524,30 +1482,38 @@ static void 
> > booke206_update_mas_tlb_miss(CPUPPCState *env, target_ulong address,
> >   }
> >   /* Perform address translation */
> > -static int cpu_ppc_handle_mmu_fault(CPUPPCState *env, target_ulong address,
> > -MMUAccessType access_type, int mmu_idx)
> > +/* TODO: Split this by mmu_model. */
> > +static bool ppc_jumbo_xlate(PowerPCCPU *cpu, vaddr eaddr,
> > +MMUAccessType access_type,
> > +hwaddr *raddrp, int *psizep, int *protp,
> > +int mmu_idx, bool guest_visible)
> >   {
> > -CPUState *cs = env_cpu(env);
> > -PowerPCCPU *cpu = POWERPC_CPU(cs);
> > +CPUState *cs = CPU(cpu);
> > +CPUPPCState *env = &cpu->env;
> >   mmu_ctx_t ctx;
> >   int type;
> > -int ret = 0;
> > +int ret;
> >   if (access_type == MMU_INST_FETCH) {
> >   /* code access */
> >   type = ACCESS_CODE;
> > -} else {
> > +} else if (guest_visible) {
> >   /* data access */
> >   type = env->access_type;
> > +} else {
> > +type = ACCESS_INT;
> >   }
> > -ret = get_physical_address_wtlb(env, &ctx, address, access_type,
> > +
> > +ret = get_physical_address_wtlb(env, &ctx, eaddr, access_type,
> >   type, mmu_idx);
> >   if (ret == 0) {
> > -tlb_set_page(cs, address & TARGET_PAGE_MASK,
> > - ctx.raddr & TARGET_PAGE_MASK, ctx.prot,
> > - mmu_idx, TARGET_PAGE_SIZE);
> > -ret = 0;
> > -} else if (ret < 0) {
> > +*raddrp = ctx.raddr;
> > +*protp = ctx.prot;
> > +*psizep = TARGET_PAGE_BITS;
> > +return true;
> > +}
> > +
> > +if (guest_visible) {
> >   LOG_MMU_STATE(cs);
> >   if (type == ACCESS_CODE) {
> >   switch (ret) {
> > @@ -1557,7 +1523,7 @@ static int cpu_ppc_handle_mmu_fault(CPUPPCState *env, 
> > target_ulong address,
> >   case POWERPC_MMU_SOFT_6xx:
> >   cs->exception_index = POWERPC_EXCP_IFTLB;
> >   env->error_code = 1 << 18;
> > -env->spr[SPR_IMISS] = address;
> > +env->spr[SPR_IMISS] = eaddr;
> >   env->spr[SPR_ICMP] = 0x8

Re: [PATCH v3] spapr: Fix EEH capability issue on KVM guest for PCI passthru

2021-05-23 Thread David Gibson
On Fri, May 21, 2021 at 01:35:51PM +0530, Mahesh Salgaonkar wrote:
> With upstream kernel, especially after commit 98ba956f6a389
> ("powerpc/pseries/eeh: Rework device EEH PE determination") we see that KVM
> guest isn't able to enable EEH option for PCI pass-through devices anymore.
> 
> [root@atest-guest ~]# dmesg | grep EEH
> [0.032337] EEH: pSeries platform initialized
> [0.298207] EEH: No capable adapters found: recovery disabled.
> [root@atest-guest ~]#
> 
> So far the linux kernel was assuming pe_config_addr equal to device's
> config_addr and using it to enable EEH on the PE through ibm,set-eeh-option
> RTAS call. Which wasn't the correct way as per PAPR. The linux kernel
> commit 98ba956f6a389 fixed this flow. With that fixed, linux now uses PE
> config address returned by ibm,get-config-addr-info2 RTAS call to enable
> EEH option per-PE basis instead of per-device basis. However this has
> uncovered a bug in qemu where ibm,set-eeh-option is treating PE config
> address as per-device config address.
> 
> Hence in qemu guest with recent kernel the ibm,set-eeh-option RTAS call
> fails with -3 return value indicating that there is no PCI device exist for
> the specified PE config address. The rtas_ibm_set_eeh_option call uses
> pci_find_device() to get the PC device that matches specific bus and devfn
> extracted from PE config address passed as argument. Thus it tries to map
> the PE config address to a single specific PCI device 'bus->devices[devfn]'
> which always results into checking device on slot 0 'bus->devices[0]'.
> This succeeds when there is a pass-through device (vfio-pci) present on
> slot 0. But in cases where there is no pass-through device present in slot
> 0, but present in non-zero slots, ibm,set-eeh-option call fails to enable
> the EEH capability.
> 
> hw/ppc/spapr_pci_vfio.c: spapr_phb_vfio_eeh_set_option()
>case RTAS_EEH_ENABLE: {
> PCIHostState *phb;
> PCIDevice *pdev;
> 
> /*
>  * The EEH functionality is enabled on basis of PCI device,
>  * instead of PE. We need check the validity of the PCI
>  * device address.
>  */
> phb = PCI_HOST_BRIDGE(sphb);
> pdev = pci_find_device(phb->bus,
>(addr >> 16) & 0xFF, (addr >> 8) & 0xFF);
> if (!pdev || !object_dynamic_cast(OBJECT(pdev), "vfio-pci")) {
> return RTAS_OUT_PARAM_ERROR;
> }
> 
> hw/pci/pci.c:pci_find_device()
> 
> PCIDevice *pci_find_device(PCIBus *bus, int bus_num, uint8_t devfn)
> {
> bus = pci_find_bus_nr(bus, bus_num);
> 
> if (!bus)
> return NULL;
> 
> return bus->devices[devfn];
> }
> 
> This patch fixes ibm,set-eeh-option to check for presence of any PCI device
> (vfio-pci) under specified bus and enable the EEH if found. The current
> code already makes sure that all the devices on that bus are from same
> iommu group (within same PE) and fail very early if it does not.
> 
> After this fix guest is able to find EEH capable devices and enable EEH
> recovery on it.
> 
> [root@atest-guest ~]# dmesg | grep EEH
> [0.048139] EEH: pSeries platform initialized
> [0.405115] EEH: Capable adapter found: recovery enabled.
> [root@atest-guest ~]#
> 
> Reviewed-by: Daniel Henrique Barboza 
> Signed-off-by: Mahesh Salgaonkar 

Gross, but I don't see a better way of handling this, so, applied to
ppc-for-6.1, thanks.

> ---
> Change in v3:
> - Add a comment about reason for not checking for validity of supplied
>   config_addr as pointed by Oliver in spapr_phb_vfio_eeh_set_option()
>   function.
> Change in v2:
> - Fix ibm,set-eeh-option instead of returning per-device PE config address.
> - Changed patch subject line.
> ---
>  hw/ppc/spapr_pci_vfio.c |   40 +---
>  1 file changed, 33 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
> index e0547b1740..6587c8cb5b 100644
> --- a/hw/ppc/spapr_pci_vfio.c
> +++ b/hw/ppc/spapr_pci_vfio.c
> @@ -47,6 +47,16 @@ void spapr_phb_vfio_reset(DeviceState *qdev)
>  spapr_phb_vfio_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev));
>  }
>  
> +static void spapr_eeh_pci_find_device(PCIBus *bus, PCIDevice *pdev,
> +  void *opaque)
> +{
> +bool *found = opaque;
> +
> +if (object_dynamic_cast(OBJECT(pdev), "vfio-pci")) {
> +*found = true;
> +}
> +}
> +
>  int spapr_phb_vfio_eeh_set_option(SpaprPhbState *sphb,
>unsigned int addr, int option)
>  {
> @@ -59,17 +69,33 @@ int spapr_phb_vfio_eeh_set_option(SpaprPhbState *sphb,
>  break;
>  case RTAS_EEH_ENABLE: {
>  PCIHostState *phb;
> -PCIDevice *pdev;
> +bool found = false;
>  
>  /*
> - * The EEH functionality is enabled on basis of PCI device,
> - * instead of PE. We need check the validity of the PCI
> - * device address.
> + * The EEH funct

Re: [PATCH 00/24] target/ppc: Clean up mmu translation

2021-05-23 Thread David Gibson
On Wed, May 19, 2021 at 05:47:05PM -0500, Richard Henderson wrote:
> On 5/19/21 3:37 PM, Richard Henderson wrote:
> > On 5/18/21 9:52 PM, David Gibson wrote:
> > > I've applied 1..15, still looking at the rest.
> > 
> > Please dequeue.  I want to create a new mmu-internal.h, which affects
> > all the patches from #1.
> 
> Alternately, don't.  I can move the function later, and it may be a while
> before I can get back to this.

Ok, I'll leave them in, since they're good cleanups even without the
rest of the series.

> 
> Two outstanding bugs:
> 
> (1) mmu-radix64.c vs hypervisors.  You'll not see these unless you run kvm
> inside of tcg.
> 
> Basically, all usage of msr_{hv,pr,ir,dr} within ppc_*_xlate is incorrect.
> We should be pulling these from the 3 bits of mmu_idx, as outlined in the
> table in hreg_compute_hflags_value.

Ah, that's probably my fault.  I reworked those substantially from the
original code (closer to mmu_helper.c).  I guess I didn't (and I
suspect I still don't) really understand how the softmmu works.

> When you start propagating that around, you see that the second-level
> translation for loading the pte (2 of the 3 calls to
> ppc_radix64_partition_scoped_xlate) should not be using the mmu_idx related
> to the user-mode guest access, but should be using the mmu_idx of the
> kernel/hypervisor that owns the page table.
> 
> I can't see that mmu-radix64.c has the same problem.  I'm not really sure
> how the second-level translation for hypervisors works there.  Is it by the
> hypervisor altering the hash table as it is loaded?

Uh.. you started by saying mmu-radix64.c then talked about the hash
table, so I'm not sure which model you're talking about.

For hash + hypervisor, then yes, there is no hardware 2-level
translation, it all works by paravirtualizing updates to the hash
table (this is the H_ENTER etc. code in hw/ppc/spapr_softmmu.c).

> (2) The direct-segment accesses for 6xx and hash32 use ACCESS_* to
> conditionally reject an mmu access.  This is flawed, because we only arrive
> into these translation routines on the softtlb slow path.  If we have an
> ACCESS_INT and then an ACCESS_FLOAT, the first will load a tlb entry which
> the second will use to stay on the fast path.
> 
> There are several possible solutions:
> 
>  (A) Give tlb_set_page size == 1 for direct-segment addresses.
>  This will cause cputlb.c to force all references to the page
>  back through the slow path and through tlb_fill.  At which
>  point env->access_type is correct for each access, and we
>  can reject on type.
> 
>  This could be really slow.  But since these direct-segment
>  accesses are also uncached, I would expect the guest OS to
>  only be using them for i/o access.  Which is already slow,
>  so perhaps the extra trip through tlb_fill isn't noticeable.
> 
>  (B) Use additional mmu_idx.  Not ideal, since we wouldn't be
>  sharing softtlb entries for ACCESS_INT and ACCESS_FLOAT
>  and ACCESS_RES for the normal case.  But the relevant
>  mmu_models do not have hypervisor support, so we could
>  still fit them in to 8 mmu_idx.  We'd probably want to
>  use special code for ACCESS_CACHE and ACCESS_EXT here.
> 
>  (C) Ignore this special case entirely, dropping everything
>  related to env->access_type.  The section of code that
>  uses them is all for really old machine types with
>  comments like
> 
> /* Direct-store segment : absolutely *BUGGY* for now */
> 
>  which is not encouraging.  And short of writing our own
>  test cases we're not likely to have any code to exercise
>  this.

Indeed.  Direct store segments are basically ancient history of the
POWER architecture which Linux never used, and I don't think much else
did either.  So I'm inclined to go with

  (D) Just rip out all the direct store segment code and replace with
  some LOG_UNIMPs.  Re-adding it working can be the job of the
  probably non-existent person who has an actual use case using
  them.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH 09/11] include/exec: added functions to the stubs in exec-all.h

2021-05-23 Thread David Gibson
On Mon, May 17, 2021 at 08:07:24AM -0300, Bruno Piazera Larsen wrote:
> 
> On 17/05/2021 00:55, David Gibson wrote:
> > On Thu, May 13, 2021 at 11:03:24AM -0300, Lucas Mateus Martins Araujo e 
> > Castro wrote:
> > > On 12/05/2021 15:34, Richard Henderson wrote:
> > > > On 5/12/21 9:08 AM, Bruno Larsen (billionai) wrote:
> > > > > From: "Lucas Mateus Castro (alqotel)"
> > > > > 
> > > > > Added tlb_set_page and tlb_set_page_with_attrs to the
> > > > > stubbed functions in exec-all.h  as it is needed
> > > > > in some functions when compiling without TCG
> > > > > 
> > > > > Signed-off-by: Lucas Mateus Castro
> > > > > (alqotel)
> > > > > ---
> > > > >    include/exec/exec-all.h | 10 ++
> > > > >    1 file changed, 10 insertions(+)
> > > > No, the caller is tcg-specific already.
> > > > 
> > > > 
> > > > r~
> > > tlb_set_page is called by many ppc_hash64_handle_mmu_fault,
> > > ppc_radix64_handle_mmu_fault and ppc_hash32_handle_mmu_fault, all of which
> > > from what I've seen are only used inside #if defined(CONFIG_SOFTMMU). So
> > > what is the best way to deal with these tlb_set_page calls? Should these
> > > part of the _handle_mmu_fault functions never be reached or should
> > > these
> > The handle_mmu_fault() functions per se shouldn't be included in a
> > !SOFTMMU build.  We might have to extract some of their internal logic
> > for the gdb path, though.
> > 
> > > functions never be called?
> > > 
> > > If it's the latter then should we change the #if defined to #if
> > > defined(CONFIG_SOFTMMU) && (CONFIG_TCG)?
> > That definitely doesn't make sense.  In practice CONFIG_SOFTMMU == 
> > CONFIG_TCG.

Ugh.. sorry.. I was confused again.  In practice CONFIG_SOFTMMU ==
!CONFIG_USER_ONLY.


> We figured it was the case, but from what I can tell, CONFIG_SOFTMMU is set
> when parsing the target list (in the configure script) and CONFIG_TCG is set
> later, when parsing which accelerators were requested. So even though
> SOFTMMU should imply TCG, the way it is coded right now doesn't. We could
> also try and change the configure script, but neither of us is really good
> with bash scripts, so this was the next best solution we came up with.
> > 
> > > 
> > > P.S: There was a miscommunication between me and Bruno, this should've 
> > > been
> > > a RFC.
> > > 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH 16/24] target/ppc: Remove PowerPCCPUClass.handle_mmu_fault

2021-05-23 Thread David Gibson
On Tue, May 18, 2021 at 03:11:38PM -0500, Richard Henderson wrote:
> Instead, use a switch on env->mmu_model.  This avoids some
> replicated information in cpu setup.

I have mixed feelings about this, since I introduced
pcc->handle_mmu_fault specifically to get rid of the nasty
mega-switch, with the hope of eventually getting rid of env->mmu_model
entirely.

But.. it does simplify your patch series, which makes a good change
overall.

> 
> Signed-off-by: Richard Henderson 
> ---
>  target/ppc/cpu-qom.h|  1 -
>  target/ppc/cpu_init.c   | 45 -
>  target/ppc/mmu_helper.c | 24 ++
>  3 files changed, 20 insertions(+), 50 deletions(-)
> 
> diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
> index 06b6571bc9..3b14d2f134 100644
> --- a/target/ppc/cpu-qom.h
> +++ b/target/ppc/cpu-qom.h
> @@ -198,7 +198,6 @@ struct PowerPCCPUClass {
>  int n_host_threads;
>  void (*init_proc)(CPUPPCState *env);
>  int  (*check_pow)(CPUPPCState *env);
> -int (*handle_mmu_fault)(PowerPCCPU *cpu, vaddr eaddr, int rwx, int 
> mmu_idx);
>  bool (*interrupts_big_endian)(PowerPCCPU *cpu);
>  };
>  
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index d0fa219880..d33aded7cf 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -4580,9 +4580,6 @@ POWERPC_FAMILY(601)(ObjectClass *oc, void *data)
>  (1ull << MSR_IR) |
>  (1ull << MSR_DR);
>  pcc->mmu_model = POWERPC_MMU_601;
> -#if defined(CONFIG_SOFTMMU)
> -pcc->handle_mmu_fault = ppc_hash32_handle_mmu_fault;
> -#endif
>  pcc->excp_model = POWERPC_EXCP_601;
>  pcc->bus_model = PPC_FLAGS_INPUT_6xx;
>  pcc->bfd_mach = bfd_mach_ppc_601;
> @@ -4625,9 +4622,6 @@ POWERPC_FAMILY(601v)(ObjectClass *oc, void *data)
>  (1ull << MSR_IR) |
>  (1ull << MSR_DR);
>  pcc->mmu_model = POWERPC_MMU_601;
> -#if defined(CONFIG_SOFTMMU)
> -pcc->handle_mmu_fault = ppc_hash32_handle_mmu_fault;
> -#endif
>  pcc->bus_model = PPC_FLAGS_INPUT_6xx;
>  pcc->bfd_mach = bfd_mach_ppc_601;
>  pcc->flags = POWERPC_FLAG_SE | POWERPC_FLAG_RTC_CLK | 
> POWERPC_FLAG_HID0_LE;
> @@ -4891,9 +4885,6 @@ POWERPC_FAMILY(604)(ObjectClass *oc, void *data)
>  (1ull << MSR_RI) |
>  (1ull << MSR_LE);
>  pcc->mmu_model = POWERPC_MMU_32B;
> -#if defined(CONFIG_SOFTMMU)
> -pcc->handle_mmu_fault = ppc_hash32_handle_mmu_fault;
> -#endif
>  pcc->excp_model = POWERPC_EXCP_604;
>  pcc->bus_model = PPC_FLAGS_INPUT_6xx;
>  pcc->bfd_mach = bfd_mach_ppc_604;
> @@ -4975,9 +4966,6 @@ POWERPC_FAMILY(604E)(ObjectClass *oc, void *data)
>  (1ull << MSR_RI) |
>  (1ull << MSR_LE);
>  pcc->mmu_model = POWERPC_MMU_32B;
> -#if defined(CONFIG_SOFTMMU)
> -pcc->handle_mmu_fault = ppc_hash32_handle_mmu_fault;
> -#endif
>  pcc->excp_model = POWERPC_EXCP_604;
>  pcc->bus_model = PPC_FLAGS_INPUT_6xx;
>  pcc->bfd_mach = bfd_mach_ppc_604;
> @@ -5046,9 +5034,6 @@ POWERPC_FAMILY(740)(ObjectClass *oc, void *data)
>  (1ull << MSR_RI) |
>  (1ull << MSR_LE);
>  pcc->mmu_model = POWERPC_MMU_32B;
> -#if defined(CONFIG_SOFTMMU)
> -pcc->handle_mmu_fault = ppc_hash32_handle_mmu_fault;
> -#endif
>  pcc->excp_model = POWERPC_EXCP_7x0;
>  pcc->bus_model = PPC_FLAGS_INPUT_6xx;
>  pcc->bfd_mach = bfd_mach_ppc_750;
> @@ -5126,9 +5111,6 @@ POWERPC_FAMILY(750)(ObjectClass *oc, void *data)
>  (1ull << MSR_RI) |
>  (1ull << MSR_LE);
>  pcc->mmu_model = POWERPC_MMU_32B;
> -#if defined(CONFIG_SOFTMMU)
> -pcc->handle_mmu_fault = ppc_hash32_handle_mmu_fault;
> -#endif
>  pcc->excp_model = POWERPC_EXCP_7x0;
>  pcc->bus_model = PPC_FLAGS_INPUT_6xx;
>  pcc->bfd_mach = bfd_mach_ppc_750;
> @@ -5329,9 +5311,6 @@ POWERPC_FAMILY(750cl)(ObjectClass *oc, void *data)
>  (1ull << MSR_RI) |
>  (1ull << MSR_LE);
>  pcc->mmu_model = POWERPC_MMU_32B;
> -#if defined(CONFIG_SOFTMMU)
> -pcc->handle_mmu_fault = ppc_hash32_handle_mmu_fault;
> -#endif
>  pcc->excp_model = POWERPC_EXCP_7x0;
>  pcc->bus_model = PPC_FLAGS_INPUT_6xx;
>  pcc->bfd_mach = bfd_mach_ppc_750;
> @@ -5412,9 +5391,6 @@ POWERPC_FAMILY(750cx)(ObjectClass *oc, void *data)
>  (1ull << MSR_RI) |
>  (1ull << MSR_LE);
>  pcc->mmu_model = POWERPC_MMU_32B;
> -#if defined(CONFIG_SOFTMMU)
> -pcc->handle_mmu_fault = ppc_hash32_handle_mmu_fault;
> -#endif
>  pcc->excp_model = POWERPC_EXCP_7x0;
>  pcc->bus_model = PPC_FLAGS_INPUT_6xx;
>  pcc->bfd_mach = bfd_mach_ppc_750;
> @@ -5500,9 +5476,6 @@ POWERPC_FAMILY(750fx)(ObjectClass *oc, void *data)
>  (1ull << MSR_RI) |
>  (1ull << MSR_LE);
>  pcc->mmu_model = POWERPC_MMU_32B;
> 

Re: [PATCH qemu v20] spapr: Implement Open Firmware client interface

2021-05-23 Thread Alexey Kardashevskiy




On 5/23/21 21:24, BALATON Zoltan wrote:

On Sun, 23 May 2021, Alexey Kardashevskiy wrote:

On 23/05/2021 01:02, BALATON Zoltan wrote:

On Sat, 22 May 2021, BALATON Zoltan wrote:

On Sat, 22 May 2021, Alexey Kardashevskiy wrote:

VOF itself does not prints anything in this patch.


However it seems to be needed for linux as the first thing it does 
seems to be getting /chosen/stdout and calls exit if it returns 
nothing. So I'll need this at least for linux. (I think MorphOS may 
also query it to print a banner or some messages but not sure it 
needs it, at least it does not abort right away if not found.)


but to see Linux output do I need a stdout in VOF or it will just 
open the serial with its own driver and use that?
So I'm not sure what's the stdout parts in the current vof patch 
does and if I need that for anything. I'll try to experiment with 
it some more but fixing the ld and Kconfig seems to be enough to 
get it work for me.


So for the client to print something, /chosen/stdout needs to have 
a valid ihandle.
The only way to get a valid ihandle is having a valid phandle which 
vof_client_open() can open.
A valid phandle is a phandle of any node in the device tree. On 
spapr we pick some spapr-vty, open it and store in /chosen/stdout.


From this point output from the client can be seen via a tracepoint.


I've got it now. Looking at the original firmware device tree dump:

https://osdn.net/projects/qmiga/wiki/SubprojectPegasos2/attach/PegasosII_OFW-Dump.txt 

I see that /chosen/stdout points to "screen" which is an alias to 
/bootconsole. Just adding an empty /bootconsole node in the device 
tree and vof_client_open_store() that as /chosen/stdout works and I 
get output via vof_write traces so this is enough for now to test 
Linux. Properly connecting a serial backend can thus be postponed.


So with this the Linux kernel does not abort on the first device tree 
access but starts to decompress itself then the embedded initrd and 
crashes at calling setprop:


[...]
vof_client_handle: setprop

Thread 4 "qemu-system-ppc" received signal SIGSEGV, Segmentation fault.
(gdb) bt
#0  0x in  ()
#1  0x55a5c2bf in vof_setprop
 (vof=0x748e9420, vallen=4, valaddr=, 
pname=, nodeph=8, fdt=0x7fff8aaff010, ms=0x564f8800)

 at ../hw/ppc/vof.c:308
#2  0x55a5c2bf in vof_client_handle
 (nrets=1, rets=0x748e93f0, nargs=4, args=0x748e93c0, 
service=0x748e9460 "setprop",
  vof=0x748e9420, fdt=0x7fff8aaff010, ms=0x564f8800) at 
../hw/ppc/vof.c:842

#3  0x55a5c2bf in vof_client_call
 (ms=0x564f8800, vof=vof@entry=0x5662a3d0, 
fdt=fdt@entry=0x7fff8aaff010, args_real=args_real@entry=23580472)

 at ../hw/ppc/vof.c:935

loooks like it's trying to set /chosen/linux,initrd-start:


It is not horribly clear why it crashed though.


It crashed becuase I had TYPE_VOF_MACHINE_IF but did not set a setprop 
callback and it tried to call that here. Adding a {return true;} empty 
callback avoids this.



Ah ok.




(gdb) up
#1  0x55a5c2bf in vof_setprop (vof=0x748e9420, vallen=4, 
valaddr=, pname=, nodeph=8,

 fdt=0x7fff8aaff010, ms=0x564f8800) at ../hw/ppc/vof.c:308
308    if (!vmc->setprop(ms, nodepath, propname, val, vallen)) {
(gdb) p nodepath
$1 = "/chosen\000\060/rPC,750CXE/", '\000' 
(gdb) p propname
$2 = 
"linux,initrd-start\000linux,initrd-end\000linux,cmdline-timeout\000bootarg" 
(gdb) p val

$3 = 

I think I need the callback for setprop in TYPE_VOF_MACHINE_IF. I can 
copy spapr_vof_setprop() but some explanation on why that's needed 
might help. Ciould I just do fdt_setprop in my callback as 
vof_setprop() would do without a machine callback or is there some 
special handling needed for these properties?


The short answer is yes, you do not need TYPE_VOF_MACHINE_IF.

The long answer is that we build the FDT on spapr twice:
1. at the reset time and
2. after "ibm,client-arhitecture-support" (early in the boot the spapr 
paravirtual client says what it supports - ISA level, MMU features, etc)


Between 1 and 2 the kernel moves initrd and we do not update the 
QEMU's version of its location, the tree at 2) will have the old values.


So for that reason I have TYPE_VOF_MACHINE_IF. You most definitely do 
not need it.


I need TYPE_VOF_MACHINE_IF because that has the quiesce callback that I 
need to shut VOF down when the guest is finished with it otherwise it 
would crash later (more on this in next message).


Nah, quiesce() only means stopping IO in VOF. VOF is shut down when the 
client decides to stop using it (and zero that memory).


But since I shut down 
VOF here I don't need to remember changes to the FDT so I can just use 
an empty setprop callback. (I wouldn't even need that if VOF would check 
that a callback is non-NULL before calling it.)


I'll add the check.

I'll need some time to go though the other mails, closer to the weekend, 
there are too many gaps in my knowledge about t

Re: [PATCH 16/24] target/ppc: Remove PowerPCCPUClass.handle_mmu_fault

2021-05-23 Thread Richard Henderson

On 5/23/21 10:28 PM, David Gibson wrote:

On Tue, May 18, 2021 at 03:11:38PM -0500, Richard Henderson wrote:

Instead, use a switch on env->mmu_model.  This avoids some
replicated information in cpu setup.


I have mixed feelings about this, since I introduced
pcc->handle_mmu_fault specifically to get rid of the nasty
mega-switch, with the hope of eventually getting rid of env->mmu_model
entirely.

But.. it does simplify your patch series, which makes a good change
overall.


Having browsed the mmu code for a while, I would imagine a good change would be 
to have several hooks, and the mmu_model enum, all in the same const struct. 
But the current situation is untennable.



r~



Re: [PATCH 07/38] target/riscv: SIMD 8-bit Shift Instructions

2021-05-23 Thread Palmer Dabbelt

On Fri, 12 Feb 2021 07:02:25 PST (-0800), zhiwei_...@c-sky.com wrote:

Signed-off-by: LIU Zhiwei 


I know it's always kind of akward for this type of patches, but IIUC 
they're all supposed to have some sort of description.



---
 target/riscv/helper.h   |   9 +++
 target/riscv/insn32.decode  |  17 
 target/riscv/insn_trans/trans_rvp.c.inc |  16 
 target/riscv/packed_helper.c| 102 
 4 files changed, 144 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 20bf400ac2..0ecd4d53f9 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1193,3 +1193,12 @@ DEF_HELPER_3(sll16, tl, env, tl, tl)
 DEF_HELPER_3(ksll16, tl, env, tl, tl)
 DEF_HELPER_3(kslra16, tl, env, tl, tl)
 DEF_HELPER_3(kslra16_u, tl, env, tl, tl)
+
+DEF_HELPER_3(sra8, tl, env, tl, tl)
+DEF_HELPER_3(sra8_u, tl, env, tl, tl)
+DEF_HELPER_3(srl8, tl, env, tl, tl)
+DEF_HELPER_3(srl8_u, tl, env, tl, tl)
+DEF_HELPER_3(sll8, tl, env, tl, tl)
+DEF_HELPER_3(ksll8, tl, env, tl, tl)
+DEF_HELPER_3(kslra8, tl, env, tl, tl)
+DEF_HELPER_3(kslra8_u, tl, env, tl, tl)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 6f053bfeb7..cc782fcde5 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -24,6 +24,7 @@

 %sh1020:10
 %sh420:4
+%sh320:3
 %csr20:12
 %rm 12:3
 %nf 29:3 !function=ex_plus_1
@@ -61,6 +62,7 @@

 @sh  ..  .. .  ... . ... &shift  shamt=%sh10  %rs1 
%rd
 @sh4 ..  .. .  ... . ... &shift  shamt=%sh4  %rs1 
%rd
+@sh3 ..  .. .  ... . ... &shift  shamt=%sh3  %rs1 
%rd
 @csr    .  ... . ...   %csr %rs1 
%rd

 @atom_ld . aq:1 rl:1 .  . ... &atomic rs2=0 %rs1 
%rd
@@ -652,3 +654,18 @@ ksll16 0110010  . . 000 . 111 @r
 kslli160111010  1 . 000 . 111 @sh4
 kslra160101011  . . 000 . 111 @r
 kslra16_u  0110011  . . 000 . 111 @r
+
+sra8   0101100  . . 000 . 111 @r
+sra8_u 0110100  . . 000 . 111 @r
+srai8  000  00... . 000 . 111 @sh3
+srai8_u000  01... . 000 . 111 @sh3
+srl8   0101101  . . 000 . 111 @r
+srl8_u 0110101  . . 000 . 111 @r
+srli8  001  00... . 000 . 111 @sh3
+srli8_u001  01... . 000 . 111 @sh3
+sll8   0101110  . . 000 . 111 @r
+slli8  010  00... . 000 . 111 @sh3
+ksll8  0110110  . . 000 . 111 @r
+kslli8 010  01... . 000 . 111 @sh3
+kslra8 010  . . 000 . 111 @r
+kslra8_u   0110111  . . 000 . 111 @r
diff --git a/target/riscv/insn_trans/trans_rvp.c.inc 
b/target/riscv/insn_trans/trans_rvp.c.inc
index 848edab7e5..12a64849eb 100644
--- a/target/riscv/insn_trans/trans_rvp.c.inc
+++ b/target/riscv/insn_trans/trans_rvp.c.inc
@@ -353,3 +353,19 @@ GEN_RVP_SHIFTI(slli16, sll16, tcg_gen_vec_shl16i_i64);
 GEN_RVP_SHIFTI(srai16_u, sra16_u, NULL);
 GEN_RVP_SHIFTI(srli16_u, srl16_u, NULL);
 GEN_RVP_SHIFTI(kslli16, ksll16, NULL);
+
+/* SIMD 8-bit Shift Instructions */
+GEN_RVP_SHIFT(sra8, tcg_gen_gvec_sars, 0);
+GEN_RVP_SHIFT(srl8, tcg_gen_gvec_shrs, 0);
+GEN_RVP_SHIFT(sll8, tcg_gen_gvec_shls, 0);
+GEN_RVP_R_OOL(sra8_u);
+GEN_RVP_R_OOL(srl8_u);
+GEN_RVP_R_OOL(ksll8);
+GEN_RVP_R_OOL(kslra8);
+GEN_RVP_R_OOL(kslra8_u);
+GEN_RVP_SHIFTI(srai8, sra8, tcg_gen_vec_sar8i_i64);
+GEN_RVP_SHIFTI(srli8, srl8, tcg_gen_vec_shr8i_i64);
+GEN_RVP_SHIFTI(slli8, sll8, tcg_gen_vec_shl8i_i64);
+GEN_RVP_SHIFTI(srai8_u, sra8_u, NULL);
+GEN_RVP_SHIFTI(srli8_u, srl8_u, NULL);
+GEN_RVP_SHIFTI(kslli8, ksll8, NULL);
diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c
index 7e31c2fe46..ab9ebc472b 100644
--- a/target/riscv/packed_helper.c
+++ b/target/riscv/packed_helper.c
@@ -529,3 +529,105 @@ static inline void do_kslra16_u(CPURISCVState *env, void 
*vd, void *va,
 }

 RVPR(kslra16_u, 1, 2);
+
+/* SIMD 8-bit Shift Instructions */
+static inline void do_sra8(CPURISCVState *env, void *vd, void *va,
+   void *vb, uint8_t i)
+{
+int8_t *d = vd, *a = va;
+uint8_t shift = *(uint8_t *)vb & 0x7;
+d[i] = a[i] >> shift;
+}
+
+RVPR(sra8, 1, 1);
+
+static inline void do_srl8(CPURISCVState *env, void *vd, void *va,
+   void *vb, uint8_t i)
+{
+uint8_t *d = vd, *a = va;
+uint8_t shift = *(uint8_t *)vb & 0x7;
+d[i] = a[i] >> shift;
+}
+
+RVPR(srl8, 1, 1);
+
+static inline void do_sll8(CPURISCVState *env, void *vd, void *va,
+   void *vb, uint8_t i)
+{
+uint8_t *d = vd, *a = va;
+uint8_t shift = *(uint8_t *)vb & 0x7;
+d[i] = a[i] << shift;
+}
+
+RVPR(sll8, 1, 1);
+
+static inline void do_

Re: [PATCH 00/24] target/ppc: Clean up mmu translation

2021-05-23 Thread Richard Henderson

On 5/23/21 10:26 PM, David Gibson wrote:

On Wed, May 19, 2021 at 05:47:05PM -0500, Richard Henderson wrote:

On 5/19/21 3:37 PM, Richard Henderson wrote:

On 5/18/21 9:52 PM, David Gibson wrote:

I've applied 1..15, still looking at the rest.


Please dequeue.  I want to create a new mmu-internal.h, which affects
all the patches from #1.


Alternately, don't.  I can move the function later, and it may be a while
before I can get back to this.


Ok, I'll leave them in, since they're good cleanups even without the
rest of the series.



Two outstanding bugs:

(1) mmu-radix64.c vs hypervisors.  You'll not see these unless you run kvm
inside of tcg.

Basically, all usage of msr_{hv,pr,ir,dr} within ppc_*_xlate is incorrect.
We should be pulling these from the 3 bits of mmu_idx, as outlined in the
table in hreg_compute_hflags_value.


Ah, that's probably my fault.  I reworked those substantially from the
original code (closer to mmu_helper.c).  I guess I didn't (and I
suspect I still don't) really understand how the softmmu works.


When you start propagating that around, you see that the second-level
translation for loading the pte (2 of the 3 calls to
ppc_radix64_partition_scoped_xlate) should not be using the mmu_idx related
to the user-mode guest access, but should be using the mmu_idx of the
kernel/hypervisor that owns the page table.

I can't see that mmu-radix64.c has the same problem.  I'm not really sure
how the second-level translation for hypervisors works there.  Is it by the
hypervisor altering the hash table as it is loaded?


Uh.. you started by saying mmu-radix64.c then talked about the hash
table, so I'm not sure which model you're talking about.


radix64 definitely has a problem.
Then I wondered if hash64 had the same problem.



For hash + hypervisor, then yes, there is no hardware 2-level
translation, it all works by paravirtualizing updates to the hash
table (this is the H_ENTER etc. code in hw/ppc/spapr_softmmu.c).


Ah, gotcha.  So, no, hash64 is unaffected.


Indeed.  Direct store segments are basically ancient history of the
POWER architecture which Linux never used, and I don't think much else
did either.  So I'm inclined to go with

   (D) Just rip out all the direct store segment code and replace with
   some LOG_UNIMPs.  Re-adding it working can be the job of the
   probably non-existent person who has an actual use case using
   them.


Fair enough.


r~



Re: [PATCH qemu v19] spapr: Implement Open Firmware client interface

2021-05-23 Thread David Gibson
On Mon, May 17, 2021 at 02:17:36PM +0200, BALATON Zoltan wrote:
> On Mon, 17 May 2021, Alexey Kardashevskiy wrote:
> > On 5/16/21 01:04, BALATON Zoltan wrote:
> > > On Thu, 22 Apr 2021, Alexey Kardashevskiy wrote:
> > 
> > [snip]
> > 
> > > > +/* Defined as Big Endian */
> > > > +struct prom_args {
> > > > +    uint32_t service;
> > > > +    uint32_t nargs;
> > > > +    uint32_t nret;
> > > > +    uint32_t args[10];
> > > > +} QEMU_PACKED;
> > > 
> > > This #define and struct definition should probably be moved to
> > > include/hw/ppc/vof.h as I had to copy these when trying to get VOF
> > > running with pegasos2 and these seem to be VOF specific not spapr.
> > 
> > Correct, I'll fix it - there are 2 copies already.
> > 
> > 
> > > 
> > > I was trying to wire up VOF on pegasos2 as proof of concept but I
> > > did not get very far as it crashed at the first move due to using
> > > mtmsrd which does not exist on the 32 bit CPUs (G4 or G3) used by
> > > pegasos2:
> > > 
> > > vof_claim virt=0x0 size=0xc38 align=0x0 => 0x0
> > > vof_claim virt=0x0 size=0x8000 align=0x8000 => 0x8000
> > > vof_claim virt=0xc0 size=0x18fd62 align=0x0 => 0xc0
> > > vof_claimed 0x0..0xc38 size=0xc38
> > > vof_claimed 0x8000..0x1 size=0x8000
> > > vof_claimed 0xc0..0xd8fd62 size=0x18fd62
> > > vof_avail 0xc38..0x8000 size=0x73c8
> > > vof_avail 0x1..0xc0 size=0xbf
> > > vof_avail 0xd8fd62..0x2000 size=0x1f27029e
> > > via_superio_cfg: unimplemented register 0xf2
> > > via_superio_cfg: unimplemented register 0xf4
> > > via_superio_cfg: unimplemented register 0xf6
> > > via_superio_cfg: unimplemented register 0xf7
> > > invalid/unsupported opcode: 1f - 12 - 05 - 00 (7fe00164) fff00108 0
> > > 
> > > IN:
> > > 0xfff00100:  3fe0  lis  r31, 0
> > > 0xfff00104:  63ff  ori  r31, r31, 0
> > > 0xfff00108:  7fe00164  mtmsrd   r31
> > > 
> > > 
> > > IN:
> > > 0xfff00700:  807e8020  lwz  r3, -0x7fe0(r30)
> > > 0xfff00704:  4cc63182  crclr    6
> > > 0xfff00708:  4bfffd1d  bl   0xfff00424
> > > 
> > > Invalid access at addr 0x8020, size 4, region '(null)', reason:
> > > rejected
> > > 
> > > Is this mtmsrd really needed? Do 64-bit Power CPUs start in 64 bit mode?
> > 
> > Yup:
> > https://git.qemu.org/?p=qemu.git;a=blob;f=target/ppc/translate_init.c.inc;h=66e6a4a746f46148e0006081af09391b32c125cd;hb=HEAD#l10085
> > 
> > I cannot find the exact reason for that, probably PAPR or some PPC-OF
> > binding says so.
> > 
> > 
> > > I'd expect them to be in compatibility mode unless 64 bit is enabled
> > > but I did not check the docs. If it's needed maybe a dummy handler
> > > has to be at 0x700 to ignore this exception if it's running on a
> > > 32-bit CPU.
> > 
> > I wanted MSR and the code to be in sync explicitly.
> 
> OK, then can you add a dummy exception handler at 0x700 to ignore this so it
> would also work on a 32-bit CPU? That does not seem to be too difficult.

I don't like that idea.  I really think the idea with VOF should be
that it's explicitly closely married to qemu, so we shouldn't have to
mess around with the firmware image coping with different situations.

So a) I don't think we should go to *any* trouble to have the same VOF
guest image work on multiple machine types (nor multiple qemu
versions).  And b) I think VOF should just assume qemu has started it
in the mode it wants and not try to get it there.

> > > By the way does vof need to be loaded at addr 0 or it could work at
> > > the default ROM address as well? That would simplify usage if it
> > > could run position independent.
> > 
> > What do you call the default ROM address? SLOF loads at 0 and starts at
> > 0x100, VOF does the same.
> 
> On pegasos2 the ROM is at 0xfff0 normally and that's where it starts
> executing at offset 0x100. If I load vof.bin there it starts but hypercalls
> fail as above. I've changed it to load vof at 0 then it works a bit better
> and after adding the property to /chosen it tries to query it but fails
> there for some reason.
> 
> > Making it run position independend is going to make it more complex and
> > I really (really) want it to be tiny.
> 
> It's not needed as I can special case it and load vof at 0 but is it more
> complex than just compiling it with the appropriate flag for PIC (-fPIC or
> what is it).
> 
> > I think what you really want is another vof-pegasos2.bin linked at the
> > address you like and not calling mtmsrd, could be an #ifdef in the
> > existing vof firmware. It is rather expected to have a firmware per a
> > machine type.
> 
> Is that really needed? Can we make a single firmware binary run on different
> CPUs? I think openbios-ppc runs on both 64 and 32 bit PPC while itself
> compiled as 32 bit but also runs with qemu-system-ppc64 -M mac99 which uses
> a G5 CPU by default. It does some magic here:
> 
> https://github.com/openbios/openbios/blob/master/arch/ppc/qemu/start.S
> 
> but that may be too ugly to copy.
> 
> Anoth

Re: [PATCH v5 1/3] spapr: nvdimm: Forward declare and move the definitions

2021-05-23 Thread David Gibson
On Tue, May 18, 2021 at 08:03:17AM -0400, Shivaprasad G Bhat wrote:
> The subsequent patches add definitions which tend to get
> the compilation to cyclic dependency. So, prepare with
> forward declarations, move the definitions and clean up.
> 
> Signed-off-by: Shivaprasad G Bhat 

This is a reasonable cleanup regardless of the rest of the series, so
I've applied this to ppc-for-6.1.

> ---
>  hw/ppc/spapr_nvdimm.c |   12 
>  include/hw/ppc/spapr_nvdimm.h |   14 ++
>  2 files changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/ppc/spapr_nvdimm.c b/hw/ppc/spapr_nvdimm.c
> index 252204e25f..3f57a8b6fa 100644
> --- a/hw/ppc/spapr_nvdimm.c
> +++ b/hw/ppc/spapr_nvdimm.c
> @@ -35,6 +35,18 @@
>  /* SCM device is unable to persist memory contents */
>  #define PAPR_PMEM_UNARMED PPC_BIT(0)
>  
> +/*
> + * The nvdimm size should be aligned to SCM block size.
> + * The SCM block size should be aligned to SPAPR_MEMORY_BLOCK_SIZE
> + * in order to have SCM regions not to overlap with dimm memory regions.
> + * The SCM devices can have variable block sizes. For now, fixing the
> + * block size to the minimum value.
> + */
> +#define SPAPR_MINIMUM_SCM_BLOCK_SIZE SPAPR_MEMORY_BLOCK_SIZE
> +
> +/* Have an explicit check for alignment */
> +QEMU_BUILD_BUG_ON(SPAPR_MINIMUM_SCM_BLOCK_SIZE % SPAPR_MEMORY_BLOCK_SIZE);
> +
>  bool spapr_nvdimm_validate(HotplugHandler *hotplug_dev, NVDIMMDevice *nvdimm,
> uint64_t size, Error **errp)
>  {
> diff --git a/include/hw/ppc/spapr_nvdimm.h b/include/hw/ppc/spapr_nvdimm.h
> index 73be250e2a..764f999f54 100644
> --- a/include/hw/ppc/spapr_nvdimm.h
> +++ b/include/hw/ppc/spapr_nvdimm.h
> @@ -11,19 +11,9 @@
>  #define HW_SPAPR_NVDIMM_H
>  
>  #include "hw/mem/nvdimm.h"
> -#include "hw/ppc/spapr.h"
>  
> -/*
> - * The nvdimm size should be aligned to SCM block size.
> - * The SCM block size should be aligned to SPAPR_MEMORY_BLOCK_SIZE
> - * inorder to have SCM regions not to overlap with dimm memory regions.
> - * The SCM devices can have variable block sizes. For now, fixing the
> - * block size to the minimum value.
> - */
> -#define SPAPR_MINIMUM_SCM_BLOCK_SIZE SPAPR_MEMORY_BLOCK_SIZE
> -
> -/* Have an explicit check for alignment */
> -QEMU_BUILD_BUG_ON(SPAPR_MINIMUM_SCM_BLOCK_SIZE % SPAPR_MEMORY_BLOCK_SIZE);
> +typedef struct SpaprDrc SpaprDrc;
> +typedef struct SpaprMachineState SpaprMachineState;
>  
>  int spapr_pmem_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,
> void *fdt, int *fdt_start_offset, Error **errp);
> 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH] target/ppc: Implement ISA v3.1 wait variants

2021-05-23 Thread David Gibson
On Mon, May 17, 2021 at 05:19:06PM +1000, Nicholas Piggin wrote:
> Excerpts from David Gibson's message of May 17, 2021 3:39 pm:
> > On Mon, May 17, 2021 at 12:46:51PM +1000, Nicholas Piggin wrote:
> >> ISA v3.1 adds new variations of wait, specified by the WC field. These
> >> are not compatible with the wait 0 implementation, because they add
> >> additional conditions that cause the processor to resume, which can
> >> cause software to hang or run very slowly.
> >> 
> >> Add the new wait variants with a trivial no-op implementation, which is
> >> allowed, as explained in comments: software must not depend on any
> >> particular architected WC condition having caused resumption of
> >> execution, therefore a no-op implementation is architecturally correct.
> >> 
> >> Signed-off-by: Nicholas Piggin 
> > 
> > Logic looks fine.  There is no test on the CPU's features or model
> > here, though, so this will change behaviour for pre-3.1 CPUs as well.
> 
> Huh. 2.06-2.07 has very similar WC bits as 3.1, but 3.0 removed them
> and made them reserved. I should have looked back but I'd assumed
> they weren't there either.
> 
> Existing code treats WC != 0 as invalid on pre-3.0 processors AFAIKS,
> so that's not quite right for 2.06-7 (they should look more like 3.1).
> 
> But before that it looks like it was just wait with no WC field.
> 
> > What would invoking these wait variants (presumably reserved) on
> > earlier CPUs do?
> 
> Prior to 2.06, it looks like there is no WC field, and so they should 
> generate a program check. So that just leaves the incorrect program
> checks for 2.06-7, something like this should do it:
> 
> -GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039FF801, PPC_NONE, PPC2_ISA300),
> +GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039FF801, PPC_NONE, PPC2_ISA206),

Ok, can you update with such a change, and put some of this
explanation of the history in a comment.

> 2.06-3.1 should all be fine with this patch, AFAIKS they all have words 
> to the effect that WC != 0 is subject to implementation defined 
> behaviour and may be treated as a no-op or not implemented.

Ok.  Note that we do try to match specific CPU behaviour, not just the
architecture, although the architecture is obviously more important.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


[PATCH 0/5] linux-user changes to run docker

2021-05-23 Thread YAMAMOTO Takashi
These patches, along with a few more hacks [1] I didn't include
in this patchset, allowed me to run arm64 and armv7 version of
dind image on amd64.

[1] https://github.com/yamt/qemu/tree/linux-user-for-docker

You can find my test setup here:
https://github.com/yamt/garbage/tree/master/binfmt-aarch64-install

YAMAMOTO Takashi (5):
  linux-user: handle /proc/self/exe for execve
  linux-uesr: make exec_path realpath
  linux-user: Fix the execfd case of /proc/self/exe open
  linux-user: dup the execfd on start up
  linux-user: Implement pivot_root

 linux-user/main.c| 14 +-
 linux-user/qemu.h|  2 ++
 linux-user/syscall.c | 43 ---
 3 files changed, 55 insertions(+), 4 deletions(-)

-- 
2.21.1 (Apple Git-122.3)




[PATCH 1/5] linux-user: handle /proc/self/exe for execve

2021-05-23 Thread YAMAMOTO Takashi
It seems somehow common to execve /proc/self/exe in docker
or golang community these days.
At least, moby "reexec" and runc "libcontainer" do that.

Signed-off-by: YAMAMOTO Takashi 
---
 linux-user/syscall.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index c9f812091c..a2b03ecb8b 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8470,6 +8470,7 @@ static abi_long do_syscall1(void *cpu_env, int num, 
abi_long arg1,
 #endif
 case TARGET_NR_execve:
 {
+const char *path;
 char **argp, **envp;
 int argc, envc;
 abi_ulong gp;
@@ -8537,7 +8538,11 @@ static abi_long do_syscall1(void *cpu_env, int num, 
abi_long arg1,
  * before the execve completes and makes it the other
  * program's problem.
  */
-ret = get_errno(safe_execve(p, argp, envp));
+path = p;
+if (is_proc_myself(path, "exe")) {
+path = exec_path;
+}
+ret = get_errno(safe_execve(path, argp, envp));
 unlock_user(p, arg1, 0);
 
 goto execve_end;
-- 
2.21.1 (Apple Git-122.3)




[PATCH 4/5] linux-user: dup the execfd on start up

2021-05-23 Thread YAMAMOTO Takashi
So that it can be used for other purposes (e.g. syscall.c)
after the elf loader closed it.

Signed-off-by: YAMAMOTO Takashi 
---
 linux-user/main.c| 8 
 linux-user/qemu.h| 2 ++
 linux-user/syscall.c | 5 ++---
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 1f9f4e3820..86ddba8b62 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -56,6 +56,7 @@
 
 char *exec_path;
 char exec_path_store[PATH_MAX];
+int exec_fd = -1;
 
 int singlestep;
 static const char *argv0;
@@ -833,6 +834,13 @@ int main(int argc, char **argv, char **envp)
 cpu->opaque = ts;
 task_settid(ts);
 
+/*
+ * dup execfd to a global so that it can be used after loader_exec
+ * closes it.
+ */
+
+exec_fd = dup(execfd);
+
 ret = loader_exec(execfd, exec_path, target_argv, target_environ, regs,
 info, &bprm);
 if (ret != 0) {
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 3b0b6b75fe..ee4e9a1779 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -160,6 +160,8 @@ typedef struct TaskState {
 } __attribute__((aligned(16))) TaskState;
 
 extern char *exec_path;
+extern int exec_fd;
+
 void init_task_state(TaskState *ts);
 void task_settid(TaskState *);
 void stop_all_tasks(void);
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 14a63518e2..2947e79dc0 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8117,12 +8117,11 @@ static int do_openat(void *cpu_env, int dirfd, const 
char *pathname, int flags,
 };
 
 if (is_proc_myself(pathname, "exe")) {
-int execfd = qemu_getauxval(AT_EXECFD);
-if (execfd) {
+if (exec_fd != -1) {
 char filename[PATH_MAX];
 int ret;
 
-snprintf(filename, sizeof(filename), "/proc/self/fd/%d", execfd);
+snprintf(filename, sizeof(filename), "/proc/self/fd/%d", exec_fd);
 ret = safe_openat(dirfd, filename, flags, mode);
 if (ret != -1) {
 return ret;
-- 
2.21.1 (Apple Git-122.3)




[PATCH 2/5] linux-uesr: make exec_path realpath

2021-05-23 Thread YAMAMOTO Takashi
Otherwise, it can be easily fooled by the user app using chdir().

Signed-off-by: YAMAMOTO Takashi 
---
 linux-user/main.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 4dfc47ad3b..1f9f4e3820 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -55,6 +55,7 @@
 #endif
 
 char *exec_path;
+char exec_path_store[PATH_MAX];
 
 int singlestep;
 static const char *argv0;
@@ -610,7 +611,10 @@ static int parse_args(int argc, char **argv)
 exit(EXIT_FAILURE);
 }
 
-exec_path = argv[optind];
+exec_path = realpath(argv[optind], exec_path_store);
+if (exec_path == NULL) {
+exec_path = argv[optind];
+}
 
 return optind;
 }
-- 
2.21.1 (Apple Git-122.3)




[PATCH 3/5] linux-user: Fix the execfd case of /proc/self/exe open

2021-05-23 Thread YAMAMOTO Takashi
It's problematic to return AT_EXECFD as it is because the user app
would close it.
This patch opens it via /proc/self/fd instead.

Signed-off-by: YAMAMOTO Takashi 
---
 linux-user/syscall.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index a2b03ecb8b..14a63518e2 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8118,7 +8118,17 @@ static int do_openat(void *cpu_env, int dirfd, const 
char *pathname, int flags,
 
 if (is_proc_myself(pathname, "exe")) {
 int execfd = qemu_getauxval(AT_EXECFD);
-return execfd ? execfd : safe_openat(dirfd, exec_path, flags, mode);
+if (execfd) {
+char filename[PATH_MAX];
+int ret;
+
+snprintf(filename, sizeof(filename), "/proc/self/fd/%d", execfd);
+ret = safe_openat(dirfd, filename, flags, mode);
+if (ret != -1) {
+return ret;
+}
+}
+return safe_openat(dirfd, exec_path, flags, mode);
 }
 
 for (fake_open = fakes; fake_open->filename; fake_open++) {
-- 
2.21.1 (Apple Git-122.3)




[PATCH 5/5] linux-user: Implement pivot_root

2021-05-23 Thread YAMAMOTO Takashi
Used by runc.

Signed-off-by: YAMAMOTO Takashi 
---
 linux-user/syscall.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 2947e79dc0..e739921e86 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -8254,6 +8255,11 @@ static int host_to_target_cpu_mask(const unsigned long 
*host_mask,
 return 0;
 }
 
+static int pivot_root(const char *new_root, const char *put_old)
+{
+return syscall(SYS_pivot_root, new_root, put_old);
+}
+
 /* This is an internal helper for do_syscall so that it is easier
  * to have a single return point, so that actions, such as logging
  * of syscall results, can be performed.
@@ -13222,6 +13228,23 @@ static abi_long do_syscall1(void *cpu_env, int num, 
abi_long arg1,
 return ret;
 #endif
 
+#if defined(TARGET_NR_pivot_root)
+case TARGET_NR_pivot_root:
+{
+void *p2;
+p = lock_user_string(arg1); /* new_root */
+p2 = lock_user_string(arg2); /* put_old */
+if (!p || !p2) {
+ret = -TARGET_EFAULT;
+} else {
+ret = get_errno(pivot_root(p, p2));
+}
+unlock_user(p2, arg2, 0);
+unlock_user(p, arg1, 0);
+}
+return ret;
+#endif
+
 default:
 qemu_log_mask(LOG_UNIMP, "Unsupported syscall: %d\n", num);
 return -TARGET_ENOSYS;
-- 
2.21.1 (Apple Git-122.3)




Re: [PATCH 16/24] target/ppc: Remove PowerPCCPUClass.handle_mmu_fault

2021-05-23 Thread David Gibson
On Sun, May 23, 2021 at 11:36:44PM -0500, Richard Henderson wrote:
65;6401;1c> On 5/23/21 10:28 PM, David Gibson wrote:
> > On Tue, May 18, 2021 at 03:11:38PM -0500, Richard Henderson wrote:
> > > Instead, use a switch on env->mmu_model.  This avoids some
> > > replicated information in cpu setup.
> > 
> > I have mixed feelings about this, since I introduced
> > pcc->handle_mmu_fault specifically to get rid of the nasty
> > mega-switch, with the hope of eventually getting rid of env->mmu_model
> > entirely.
> > 
> > But.. it does simplify your patch series, which makes a good change
> > overall.
> 
> Having browsed the mmu code for a while, I would imagine a good change would
> be to have several hooks, and the mmu_model enum, all in the same const
> struct. But the current situation is untennable.

Yeah, fair.

> 
> 
> r~
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: The latest Qemu release can't bootup VM with latest guest kernel.

2021-05-23 Thread Yang Zhong
On Sun, May 23, 2021 at 03:23:30PM +0300, Gal Hammer wrote:
> Hi Yang,
> 
> On Thu, 20 May 2021 at 11:27, Yang Zhong  wrote:
> 
> > Hello all,
> >
> > I found the latest Qemu release can't bootup the VM with latest guest
> > kernel(>5.13).
> >
> > The normal v6.0.0 release is good to bootup the latest guest kernel.
> >
> > There are two issues were found
> > 1. Guest kernel panic.
> > 2. kvm disabled by bios
> >
> > The panic log as below:
> > [2.250024] BUG: unable to handle page fault for address:
> > ac06c55f
> > [2.252226] #PF: supervisor write access in kernel mode
> > [2.253892] #PF: error_code(0x0003) - permissions violation
> > [2.255671] PGD 5940e067 P4D 5940f067 PUD 59410063 PMD 580001e1
> > [2.257567] Oops: 0003 [#1] SMP NOPTI
> > [2.258738] CPU: 2 PID: 313 Comm: systemd-udevd Not tainted 5.13.0-rc1+
> > #1
> > [2.260899] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> > 0.0.0 02/06/2015
> > [2.263375] RIP: 0010:__send_ipi_mask+0x1bf/0x240
> > [2.264855] Code: c0 48 c7 44 24 18 00 00 00 00 e9 48 ff ff ff 48 89 d0
> > 4c 09 c8 74 1b 49 63 d7 48 63 74 24 0c b8 0a 00 00 00 4c 89 cb 4c 89 d1
> > <0f> 01 d9 48 85 c0 78 4a 48 f7 04 24 00 02 00 00 0f 84 80 fe ff ff
> > [2.270643] RSP: 0018:ff591a62c0193ab0 EFLAGS: 00010006
> > [2.272277] RAX: 000a RBX: 0009 RCX:
> > 
> > [2.274482] RDX:  RSI: 00fc RDI:
> > ff13a83dc003c830
> > [2.276663] RBP: ff591a62c0193b08 R08: 0004 R09:
> > 0009
> > [2.278866] R10:  R11:  R12:
> > 
> > [2.281065] R13: ff13a83dc003c830 R14: 00011580 R15:
> > 
> > [2.283272] FS:  7f23ebd07940() GS:ff13a83e3bd0()
> > knlGS:
> > [2.285794] CS:  0010 DS:  ES:  CR0: 80050033
> > [2.287574] CR2: ac06c55f CR3: 000106ce2003 CR4:
> > 00771ee0
> > [2.289757] DR0:  DR1:  DR2:
> > 
> > [2.291972] DR3:  DR6: fffe0ff0 DR7:
> > 0400
> > [2.294177] PKRU: 5554
> > [2.295043] Call Trace:
> > [2.295820]  kvm_smp_send_call_func_ipi+0xe/0x60
> > [2.297220]  smp_call_function_many_cond+0x25d/0x2a0
> > [2.298772]  ? flush_tlb_one_kernel+0x20/0x20
> > [2.300145]  on_each_cpu_cond_mask+0x1e/0x20
> > [2.301514]  flush_tlb_kernel_range+0x8d/0x90
> > [2.302799]  __purge_vmap_area_lazy+0xc1/0x6a0
> > [2.304097]  ? cpumask_next+0x1f/0x20
> > [2.305160]  ? purge_fragmented_blocks_allcpus+0x3d/0x210
> > [2.306686]  _vm_unmap_aliases+0xf1/0x120
> > [2.307861]  change_page_attr_set_clr+0x95/0x280
> > [2.309203]  set_memory_ro+0x26/0x30
> > [2.310259]  ? 0xc00f7000
> > [2.311214]  module_enable_ro.part.58+0x62/0xc0
> > [2.312417]  do_init_module+0x17a/0x230
> > [2.313460]  load_module+0x1a30/0x1b00
> > [2.314463]  ? __do_sys_finit_module+0xad/0x110
> > [2.315702]  __do_sys_finit_module+0xad/0x110
> > [2.316890]  do_syscall_64+0x39/0x80
> > [2.317868]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [2.319226] RIP: 0033:0x7f23ea8f32bd
> >
> >
> > I also used the bisect to get the bad commit id:
> > f5cc5a5c168674f84bf061cdb307c2d25fba5448
> >
> > This issue is known issue? or some fixs are ready to fix those issues?
> > thanks!
> >
> 
> What's your qemu command line?
> 
> I'm also having a kernel crash (although I think mine is with a different
> call stack) when using "-cpu host". The crash doesn't occur when I use
> "kvm64" cpu type.
>

  Hello Gal,

  Let me list my host and guest environment
  Host: Icelake, Linux5.13.0-rc1+
  Guest: Linux5.13.0-rc1+

  The Qemu command line:
  ./qemu-system-x86_64 \
 -machine q35 \
 -accel kvm \
 -m 4096 \
 -smp 4 \
 -cpu host \
 -bios /home/vmm/project/images/OVMF-upstream.fd \
 -drive 
format=raw,file=/root/project/images/SGX_rhel8_efi.img,index=0,media=disk \
 -netdev user,id=guest0,hostfwd=tcp::10022-:22 \
 -device virtio-net-pci,netdev=guest0 \
 -qmp tcp:127.0.0.1:12345,server,nowait \
 -monitor telnet:127.0.0.1:5,server,nowait \
 -nographic -nodefaults -serial stdio

  I also tried the 'cpu kvm64' in my side, and there is not any issue.

  Regards,

  Yang
 
> Gal.
> 
> 
> >
> > Regards,
> >
> > Yang
> >
> >
> >
> >