Re: [PATCH v5 02/14] hw/core/machine: Introduce CPU cluster topology support

2021-12-29 Thread Philippe Mathieu-Daudé
On 12/29/21 04:48, wangyanan (Y) wrote:
> Hi Philippe,
> Thanks for your review.
> 
> On 2021/12/29 3:17, Philippe Mathieu-Daudé wrote:
>> Hi,
>>
>> On 12/28/21 10:22, Yanan Wang wrote:
>>> The new Cluster-Aware Scheduling support has landed in Linux 5.16,
>>> which has been proven to benefit scheduling performance (e.g.
>>> load balancing and the wake_affine strategy) on both x86_64 and AArch64.
>>>
>>> So now in Linux 5.16 we have four-level arch-neutral CPU topology
>>> definition like below and a new scheduler level for clusters.
>>> struct cpu_topology {
>>>  int thread_id;
>>>  int core_id;
>>>  int cluster_id;
>>>  int package_id;
>>>  int llc_id;
>>>  cpumask_t thread_sibling;
>>>  cpumask_t core_sibling;
>>>  cpumask_t cluster_sibling;
>>>  cpumask_t llc_sibling;
>>> }
>>>
>>> A cluster generally means a group of CPU cores which share L2 cache
>>> or other mid-level resources, and it is the shared resources that
>>> are used to improve the scheduler's behavior. In terms of size, it
>>> sits between the CPU die and the CPU core. For example, on some
>>> ARM64 Kunpeng servers, we have 6 clusters in each NUMA node, and
>>> 4 CPU cores in each cluster. The 4 CPU cores share a separate L2
>>> cache and an L3 cache tag, which brings a cache affinity advantage.
>>>
>>> In virtualization, on the Hosts which have pClusters, if we can
>> Maybe [*] -> reference to pClusters?
> Hm, I'm not sure what kind of reference is appropriate here.

So I guess the confusion comes from a simple typo =)

Is it OK if I replace "pClusters" by "Clusters"?

> The third paragraph in the commit message has explained what
> a cluster generally means. We can also read the description of
> clusters in Linux kernel Kconfig files: [1] and [2].
> 
> [1]arm64: https://github.com/torvalds/linux/blob/master/arch/arm64/Kconfig
> 
> config SCHED_CLUSTER
>    bool "Cluster scheduler support"
>    help
>  Cluster scheduler support improves the CPU scheduler's decision
>  making when dealing with machines that have clusters of CPUs.
>  Cluster usually means a couple of CPUs which are placed closely
>  by sharing mid-level caches, last-level cache tags or internal
>  busses.
> 
> [2]x86: https://github.com/torvalds/linux/blob/master/arch/x86/Kconfig
> 
> config SCHED_CLUSTER
>    bool "Cluster scheduler support"
>    depends on SMP
>    default y
>    help
>  Cluster scheduler support improves the CPU scheduler's decision
>  making when dealing with machines that have clusters of CPUs.
>  Cluster usually means a couple of CPUs which are placed closely
>  by sharing mid-level caches, last-level cache tags or internal
>  busses.
>>> design a vCPU topology with a cluster level for the guest kernel and
>>> use dedicated vCPU pinning, then a Cluster-Aware guest kernel can
>>> also make use of the cache affinity of CPU clusters to gain
>>> similar scheduling performance.
>>>
>>> This patch adds infrastructure for CPU cluster level topology
>>> configuration and parsing, so that the user can specify the cluster
>>> parameter if their machines support it.
>>>
>>> Signed-off-by: Yanan Wang 
>>> ---
>>>   hw/core/machine-smp.c | 26 +++---
>>>   hw/core/machine.c |  3 +++
>>>   include/hw/boards.h   |  6 +-
>>>   qapi/machine.json |  5 -
>>>   qemu-options.hx   |  7 ---
>>>   softmmu/vl.c  |  3 +++
>>>   6 files changed, 38 insertions(+), 12 deletions(-)
>>> diff --git a/qapi/machine.json b/qapi/machine.json
>>> index edeab6084b..ff0ab4ca20 100644
>>> --- a/qapi/machine.json
>>> +++ b/qapi/machine.json
>>> @@ -1404,7 +1404,9 @@
>>>   #
>>>   # @dies: number of dies per socket in the CPU topology
>>>   #
>>> -# @cores: number of cores per die in the CPU topology
>>> +# @clusters: number of clusters per die in the CPU topology
>> Missing:
>>
>>     #    (since 7.0)
> Ah, yes.
>>> +#
>>> +# @cores: number of cores per cluster in the CPU topology
>>>   #
>>>   # @threads: number of threads per core in the CPU topology
>>>   #
>>> @@ -1416,6 +1418,7 @@
>>>    '*cpus': 'int',
>>>    '*sockets': 'int',
>>>    '*dies': 'int',
>>> + '*clusters': 'int',
>>>    '*cores': 'int',
>>>    '*threads': 'int',
>>>    '*maxcpus': 'int' } }
>> If you want I can update the doc when applying.
> Do you mean the missing "since 7.0"?
> If you plan to apply patches 1-7 separately and
> I don't need to respin, please help to update it, thank you! :)

Yes, that is the plan.

> 
> Thanks,
> Yanan
>> Thanks,
>>
>> Phil.
>>
>> .
> 




Re: QEMU CI failure of cross-i386-* targets (meson picks wrong glib for native target)

2021-12-29 Thread Alessandro Di Federico via
On Fri, 24 Dec 2021 22:25:24 +0100
Paolo Bonzini  wrote:

> Is the configure script setting $cross_compile to yes? That will
> decide whether meson gets a --cross-file or a --native-file
> option, and consequently whether it treats the host and build
> machines as equal or different.

From what I can see, cross_compile is set only if --cross-prefix is set,
which seems to be the case for most containers (e.g., s390x)
but not for i386:

tests/docker/dockerfiles/debian-s390x-cross.docker:ENV QEMU_CONFIGURE_OPTS --cross-prefix=s390x-linux-gnu-
tests/docker/dockerfiles/fedora-i386-cross.docker:ENV QEMU_CONFIGURE_OPTS --cpu=i386 --disable-vhost-user

I can try to set `--cross-prefix=x86_64-redhat-linux-`, but I guess this
will prevent certain tests from running (given the cross-compile environment).
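
Concretely, I suppose that would mean something like the following in
fedora-i386-cross.docker (an untested guess on my side):

    ENV QEMU_CONFIGURE_OPTS --cpu=i386 --cross-prefix=x86_64-redhat-linux- --disable-vhost-user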

I'll give it a shot.

-- 
Alessandro Di Federico
rev.ng Labs Srl



Re: [PATCH v5 02/14] hw/core/machine: Introduce CPU cluster topology support

2021-12-29 Thread wangyanan (Y)



On 2021/12/29 18:44, Philippe Mathieu-Daudé wrote:

On 12/29/21 04:48, wangyanan (Y) wrote:

Hi Philippe,
Thanks for your review.

On 2021/12/29 3:17, Philippe Mathieu-Daudé wrote:

Hi,

On 12/28/21 10:22, Yanan Wang wrote:

The new Cluster-Aware Scheduling support has landed in Linux 5.16,
which has been proven to benefit scheduling performance (e.g.
load balancing and the wake_affine strategy) on both x86_64 and AArch64.

So now in Linux 5.16 we have four-level arch-neutral CPU topology
definition like below and a new scheduler level for clusters.
struct cpu_topology {
  int thread_id;
  int core_id;
  int cluster_id;
  int package_id;
  int llc_id;
  cpumask_t thread_sibling;
  cpumask_t core_sibling;
  cpumask_t cluster_sibling;
  cpumask_t llc_sibling;
}

A cluster generally means a group of CPU cores which share L2 cache
or other mid-level resources, and it is the shared resources that
are used to improve the scheduler's behavior. In terms of size, it
sits between the CPU die and the CPU core. For example, on some
ARM64 Kunpeng servers, we have 6 clusters in each NUMA node, and
4 CPU cores in each cluster. The 4 CPU cores share a separate L2
cache and an L3 cache tag, which brings a cache affinity advantage.

In virtualization, on the Hosts which have pClusters, if we can

Maybe [*] -> reference to pClusters?

Hm, I'm not sure what kind of reference is appropriate here.

So I guess the confusion comes from a simple typo =)

By pClusters I meant "physical clusters" on the host, as opposed to
"virtual clusters" on the guest. But obviously it brings confusion.

Is it OK if I replace "pClusters" by "Clusters"?

Sure, it's clearer to just use "clusters", please do that.

The third paragraph in the commit message has explained what
a cluster generally means. We can also read the description of
clusters in Linux kernel Kconfig files: [1] and [2].

[1]arm64: https://github.com/torvalds/linux/blob/master/arch/arm64/Kconfig

config SCHED_CLUSTER
    bool "Cluster scheduler support"
    help
  Cluster scheduler support improves the CPU scheduler's decision
  making when dealing with machines that have clusters of CPUs.
  Cluster usually means a couple of CPUs which are placed closely
  by sharing mid-level caches, last-level cache tags or internal
  busses.

[2]x86: https://github.com/torvalds/linux/blob/master/arch/x86/Kconfig

config SCHED_CLUSTER
    bool "Cluster scheduler support"
    depends on SMP
    default y
    help
  Cluster scheduler support improves the CPU scheduler's decision
  making when dealing with machines that have clusters of CPUs.
  Cluster usually means a couple of CPUs which are placed closely
  by sharing mid-level caches, last-level cache tags or internal
  busses.

design a vCPU topology with a cluster level for the guest kernel and
use dedicated vCPU pinning, then a Cluster-Aware guest kernel can
also make use of the cache affinity of CPU clusters to gain
similar scheduling performance.

This patch adds infrastructure for CPU cluster level topology
configuration and parsing, so that the user can specify the cluster
parameter if their machines support it.
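
As an illustration only (this example is not part of the patch), a
guest with two sockets, two clusters per socket and two cores per
cluster could then be described roughly like this, assuming the
machine type accepts the new parameter:

    -smp 16,sockets=2,clusters=2,cores=2,threads=2

where maxcpus is expected to equal sockets * dies * clusters * cores *
threads (with dies defaulting to 1).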

Signed-off-by: Yanan Wang 
---
   hw/core/machine-smp.c | 26 +++---
   hw/core/machine.c |  3 +++
   include/hw/boards.h   |  6 +-
   qapi/machine.json |  5 -
   qemu-options.hx   |  7 ---
   softmmu/vl.c  |  3 +++
   6 files changed, 38 insertions(+), 12 deletions(-)
diff --git a/qapi/machine.json b/qapi/machine.json
index edeab6084b..ff0ab4ca20 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1404,7 +1404,9 @@
   #
   # @dies: number of dies per socket in the CPU topology
   #
-# @cores: number of cores per die in the CPU topology
+# @clusters: number of clusters per die in the CPU topology

Missing:

     #    (since 7.0)

Ah, yes.

+#
+# @cores: number of cores per cluster in the CPU topology
   #
   # @threads: number of threads per core in the CPU topology
   #
@@ -1416,6 +1418,7 @@
    '*cpus': 'int',
    '*sockets': 'int',
    '*dies': 'int',
+ '*clusters': 'int',
    '*cores': 'int',
    '*threads': 'int',
    '*maxcpus': 'int' } }

If you want I can update the doc when applying.

Do you mean the missing "since 7.0"?
If you plan to apply patches 1-7 separately and
I don't need to respin, please help to update it, thank you! :)

Yes, that is the plan.

Thank you! I will pack the rest for ARM into the next version separately
after you queue the generic part.

Thanks,
Yanan

Thanks,
Yanan

Thanks,

Phil.

.

.





Re: [RFC v2 00/12] target/ppc: powerpc_excp improvements

2021-12-29 Thread Fabiano Rosas
Cédric Le Goater  writes:

> Hello Fabiano,
>
> On 12/20/21 19:18, Fabiano Rosas wrote:
>> This changed a lot since v1, basically what remains is the idea that
>> we want to have some sort of array of interrupts and some sort of
>> separation between processors.
>> 
>> At the end of this series we'll have:
>> 
>> - One file with all interrupt implementations (interrupts.c);
>> 
>> - Separate files for each major group of CPUs (book3s, booke,
>>32bits). Only interrupt code for now, but we could bring pieces of
>>cpu_init into them;
>> 
>> - Four separate interrupt arrays, one for each of the above groups
>>plus KVM.
>> 
>> - powerpc_excp calls into the individual files and from there we
>>dispatch according to what is available in the interrupts array.
>
>
> This is going in the right direction. I think we need more steps for
> the reviewers, for tests and bisectability. The first 4 patches are OK
> and I hope to merge them ASAP.

Ok, I'm sending another series with just these 4 + the bounds check
Richard mentioned.

>
> The powerpc_excp() routine has grown nearly out of control these last
> years and it is becoming difficult to maintain. The goal is to clarify
> what is going on for each CPU or each CPU family. The first step
> basically consists in duplicating the code and moving the exception
> handlers into specific routines.
>
> 1. cleanups should come first as usual.
>
> 2. isolate large chunks, like Nick did with ppc_excp_apply_ail().
> We could easily do the same for:
>
> 2.1 ILE
> 2.2 unimplemented ones doing a cpu_abort:
>  
>   cpu_abort(cs, " "  "is not implemented yet !\n");
> 2.3 6x TLBS
>
> This should considerably reduce powerpc_excp() without changing the
> execution path too much.

Agreed.

>
> 3. Clean up the use of excp_model, like in dcbz_common() and kvm.
> This is not critical but some are shortcuts.

The issue here is that we would probably be switching one arbitrary
identifier for another. I don't think we have a lightweight canonical
way of identifying a CPU or group of CPUs. But maybe having these
conditionals on a specific CPU should be considered a hack to begin
with.

>
> 4. Introduce a new powerpc_excp() handler :
>
> static void powerpc_excp(PowerPCCPU *cpu, int excp)
> {
> switch(env->excp_model) {
> case POWERPC_EXCP_FOO1:
> case POWERPC_EXCP_FOO2:
> powerpc_excp_foo(cpu, excp);
>  break;
> case POWERPC_EXCP_BAR:
> powerpc_excp_legacy(cpu, excp);
>  break;
> default:
> g_assert_not_reached();
> }
> }
>
> and start duplicating code CPU by CPU in specific excp handlers, avoiding
> the use of excp_model in the powerpc_excp_*() routines as much as
> possible.
> That's for the theory.
>
> I suppose these can be grouped in the following way :
>
> * 405 CPU
>  POWERPC_EXCP_40x,
>
> * 6xx CPUs
>  POWERPC_EXCP_601,
>  POWERPC_EXCP_602,
>  POWERPC_EXCP_603,
>  POWERPC_EXCP_G2,
>  POWERPC_EXCP_604,
>   
> * 7xx CPUs
>  POWERPC_EXCP_7x0,
>  POWERPC_EXCP_7x5,
>  POWERPC_EXCP_74xx,
>   
> * BOOKE CPUs
>  POWERPC_EXCP_BOOKE,
>
> * BOOKS CPUs
>  POWERPC_EXCP_970,/* could be special */
>  POWERPC_EXCP_POWER7,
>  POWERPC_EXCP_POWER8,
>  POWERPC_EXCP_POWER9,
>  POWERPC_EXCP_POWER10,
>  
> If not possible, then, we will duplicate more and that's not a problem.
>
> I would keep the routines in the same excp_helper.c file for now; we
> can move the code in different files but I would do it later and with
> other components in mind and not just the exception models. book3s,
> booke, 7xx, 6xx, 405 are the different groups. It fits what you did.
> 
> 5. Once done, get rid of powerpc_excp_legacy()
>
> 6. Start looking at refactoring again.
>
> There might be a common prologue and epilogue. As a consequence we could
> change the args passed to powerpc_excp_*().
>
> There could be common handlers and that's why an array of exception
> handlers looks good. This is what you are trying to address after patch 5,
> but I would prefer to do the above steps first.

Ack all of this. I'm working on it.

Thank you for the inputs.



Re: [PATCH v5 02/14] hw/core/machine: Introduce CPU cluster topology support

2021-12-29 Thread Philippe Mathieu-Daudé
On 12/29/21 14:04, wangyanan (Y) wrote:
> 
> On 2021/12/29 18:44, Philippe Mathieu-Daudé wrote:
>> On 12/29/21 04:48, wangyanan (Y) wrote:
>>> Hi Philippe,
>>> Thanks for your review.
>>>
>>> On 2021/12/29 3:17, Philippe Mathieu-Daudé wrote:
 Hi,

 On 12/28/21 10:22, Yanan Wang wrote:
> The new Cluster-Aware Scheduling support has landed in Linux 5.16,
> which has been proven to benefit scheduling performance (e.g.
> load balancing and the wake_affine strategy) on both x86_64 and AArch64.
>
> So now in Linux 5.16 we have four-level arch-neutral CPU topology
> definition like below and a new scheduler level for clusters.
> struct cpu_topology {
>   int thread_id;
>   int core_id;
>   int cluster_id;
>   int package_id;
>   int llc_id;
>   cpumask_t thread_sibling;
>   cpumask_t core_sibling;
>   cpumask_t cluster_sibling;
>   cpumask_t llc_sibling;
> }
>
> A cluster generally means a group of CPU cores which share L2 cache
> or other mid-level resources, and it is the shared resources that
> are used to improve the scheduler's behavior. In terms of size, it
> sits between the CPU die and the CPU core. For example, on some
> ARM64 Kunpeng servers, we have 6 clusters in each NUMA node, and
> 4 CPU cores in each cluster. The 4 CPU cores share a separate L2
> cache and an L3 cache tag, which brings a cache affinity advantage.
>
> In virtualization, on the Hosts which have pClusters, if we can
 Maybe [*] -> reference to pClusters?
>>> Hm, I'm not sure what kind of reference is appropriate here.
>> So I guess the confusion comes from a simple typo =)
> By pClusters I meant "physical clusters" on the host, as opposed to
> "virtual clusters" on the guest. But obviously it brings confusion.

OK, I got confused because you don't use "vClusters".

>> Is it OK if I replace "pClusters" by "Clusters"?
> Sure, it's clearer to just use "clusters", please do that.

OK.




[PATCH v2 1/5] target/ppc: powerpc_excp: Set alternate SRRs directly

2021-12-29 Thread Fabiano Rosas
There are currently only two interrupts that use alternate SRRs, so
let them write to those registers directly during the setup code.

No functional change intended.

Signed-off-by: Fabiano Rosas 
Reviewed-by: Richard Henderson 
Reviewed-by: Cédric Le Goater 
Reviewed-by: David Gibson 
---
 target/ppc/excp_helper.c | 23 ---
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index f90e616aac..8b9c6bc5a8 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -298,7 +298,7 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 CPUState *cs = CPU(cpu);
 CPUPPCState *env = &cpu->env;
 target_ulong msr, new_msr, vector;
-int srr0, srr1, asrr0, asrr1, lev = -1;
+int srr0, srr1, lev = -1;
 
 qemu_log_mask(CPU_LOG_INT, "Raise exception at " TARGET_FMT_lx
   " => %08x (%02x)\n", env->nip, excp, env->error_code);
@@ -319,8 +319,6 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 /* target registers */
 srr0 = SPR_SRR0;
 srr1 = SPR_SRR1;
-asrr0 = -1;
-asrr1 = -1;
 
 /*
  * check for special resume at 0x100 from doze/nap/sleep/winkle on
@@ -410,8 +408,9 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 /* FIXME: choose one or the other based on CPU type */
 srr0 = SPR_BOOKE_MCSRR0;
 srr1 = SPR_BOOKE_MCSRR1;
-asrr0 = SPR_BOOKE_CSRR0;
-asrr1 = SPR_BOOKE_CSRR1;
+
+env->spr[SPR_BOOKE_CSRR0] = env->nip;
+env->spr[SPR_BOOKE_CSRR1] = msr;
 break;
 default:
 break;
@@ -570,8 +569,10 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 /* FIXME: choose one or the other based on CPU type */
 srr0 = SPR_BOOKE_DSRR0;
 srr1 = SPR_BOOKE_DSRR1;
-asrr0 = SPR_BOOKE_CSRR0;
-asrr1 = SPR_BOOKE_CSRR1;
+
+env->spr[SPR_BOOKE_CSRR0] = env->nip;
+env->spr[SPR_BOOKE_CSRR1] = msr;
+
 /* DBSR already modified by caller */
 } else {
 cpu_abort(cs, "Debug exception triggered on unsupported model\n");
@@ -838,14 +839,6 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 
 vector |= env->excp_prefix;
 
-/* If any alternate SRR register are defined, duplicate saved values */
-if (asrr0 != -1) {
-env->spr[asrr0] = env->nip;
-}
-if (asrr1 != -1) {
-env->spr[asrr1] = msr;
-}
-
 #if defined(TARGET_PPC64)
 if (excp_model == POWERPC_EXCP_BOOKE) {
 if (env->spr[SPR_BOOKE_EPCR] & EPCR_ICM) {
-- 
2.33.1




[PATCH v2 0/5] target/ppc: powerpc_excp improvements (1/n)

2021-12-29 Thread Fabiano Rosas
This series comprises the first 4 patches from the RFC v2 plus an
extra patch addressing review comments.

Patches 1, 3, 4 and 5 have been reviewed.

Patch 2 addresses prior comments from patch 3 and has not been
reviewed.

RFC v1:
https://lists.nongnu.org/archive/html/qemu-ppc/2021-06/msg00026.html

RFC v2:
https://lists.nongnu.org/archive/html/qemu-ppc/2021-12/msg00542.html

Fabiano Rosas (5):
  target/ppc: powerpc_excp: Set alternate SRRs directly
  target/ppc: powerpc_excp: Add excp_vectors bounds check
  target/ppc: powerpc_excp: Set vector earlier
  target/ppc: powerpc_excp: Move system call vectored code together
  target/ppc: powerpc_excp: Stop passing excp_model around

 target/ppc/excp_helper.c | 102 ++-
 1 file changed, 46 insertions(+), 56 deletions(-)

-- 
2.33.1




[PATCH v2 2/5] target/ppc: powerpc_excp: Add excp_vectors bounds check

2021-12-29 Thread Fabiano Rosas
The next patch will start accessing the excp_vectors array earlier in
the function, so add a bounds check as the first thing here.

This converts the empty return on POWERPC_EXCP_NONE to an error. This
exception number never reaches this function, and if it does, it
probably means something else went wrong further up the line.

Signed-off-by: Fabiano Rosas 
---
 target/ppc/excp_helper.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 8b9c6bc5a8..9a03e4b896 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -300,6 +300,10 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 target_ulong msr, new_msr, vector;
 int srr0, srr1, lev = -1;
 
+if (excp <= POWERPC_EXCP_NONE || excp >= POWERPC_EXCP_NB) {
+cpu_abort(cs, "Invalid PowerPC exception %d. Aborting\n", excp);
+}
+
 qemu_log_mask(CPU_LOG_INT, "Raise exception at " TARGET_FMT_lx
   " => %08x (%02x)\n", env->nip, excp, env->error_code);
 
@@ -353,9 +357,6 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 #endif
 
 switch (excp) {
-case POWERPC_EXCP_NONE:
-/* Should never happen */
-return;
 case POWERPC_EXCP_CRITICAL:/* Critical input */
 switch (excp_model) {
 case POWERPC_EXCP_40x:
-- 
2.33.1




[PATCH v2 3/5] target/ppc: powerpc_excp: Set vector earlier

2021-12-29 Thread Fabiano Rosas
None of the interrupt setup code touches 'vector', so we can move it
earlier in the function. This will allow us to later move the System
Call Vectored setup that is currently at the top level into the
POWERPC_EXCP_SYSCALL_VECTORED code block.

This patch also moves the verification for when 'excp' does not have
an address associated with it. We now bail a little earlier when that
is the case. This should not cause any visible effects.

Signed-off-by: Fabiano Rosas 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/excp_helper.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 9a03e4b896..1fe20b4806 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -356,6 +356,14 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 }
 #endif
 
+vector = env->excp_vectors[excp];
+if (vector == (target_ulong)-1ULL) {
+cpu_abort(cs, "Raised an exception without defined vector %d\n",
+  excp);
+}
+
+vector |= env->excp_prefix;
+
 switch (excp) {
 case POWERPC_EXCP_CRITICAL:/* Critical input */
 switch (excp_model) {
@@ -832,14 +840,6 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 }
 #endif
 
-vector = env->excp_vectors[excp];
-if (vector == (target_ulong)-1ULL) {
-cpu_abort(cs, "Raised an exception without defined vector %d\n",
-  excp);
-}
-
-vector |= env->excp_prefix;
-
 #if defined(TARGET_PPC64)
 if (excp_model == POWERPC_EXCP_BOOKE) {
 if (env->spr[SPR_BOOKE_EPCR] & EPCR_ICM) {
-- 
2.33.1




[PATCH v2 4/5] target/ppc: powerpc_excp: Move system call vectored code together

2021-12-29 Thread Fabiano Rosas
Now that 'vector' is known before calling the interrupt-specific setup
code, we can move all of the scv setup into one place.

No functional change intended.

Signed-off-by: Fabiano Rosas 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Richard Henderson 
---
 target/ppc/excp_helper.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 1fe20b4806..811f6be0a0 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -550,6 +550,11 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 env->nip += 4;
 new_msr |= env->msr & ((target_ulong)1 << MSR_EE);
 new_msr |= env->msr & ((target_ulong)1 << MSR_RI);
+
+vector += lev * 0x20;
+
+env->lr = env->nip;
+env->ctr = msr;
 break;
 case POWERPC_EXCP_FPU:   /* Floating-point unavailable exception */
 case POWERPC_EXCP_APU:   /* Auxiliary processor unavailable  */
@@ -863,14 +868,6 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 
 /* Save MSR */
 env->spr[srr1] = msr;
-
-#if defined(TARGET_PPC64)
-} else {
-vector += lev * 0x20;
-
-env->lr = env->nip;
-env->ctr = msr;
-#endif
 }
 
 /* This can update new_msr and vector if AIL applies */
-- 
2.33.1




[PATCH v2 5/5] target/ppc: powerpc_excp: Stop passing excp_model around

2021-12-29 Thread Fabiano Rosas
We can just access it directly in powerpc_excp.

Signed-off-by: Fabiano Rosas 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Richard Henderson 
Reviewed-by: David Gibson 
---
 target/ppc/excp_helper.c | 43 
 1 file changed, 21 insertions(+), 22 deletions(-)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 811f6be0a0..c7e55800af 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -293,10 +293,11 @@ static inline void powerpc_set_excp_state(PowerPCCPU *cpu,
  * Note that this function should be greatly optimized when called
  * with a constant excp, from ppc_hw_interrupt
  */
-static inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp)
+static inline void powerpc_excp(PowerPCCPU *cpu, int excp)
 {
 CPUState *cs = CPU(cpu);
 CPUPPCState *env = &cpu->env;
+int excp_model = env->excp_model;
 target_ulong msr, new_msr, vector;
 int srr0, srr1, lev = -1;
 
@@ -879,9 +880,8 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 void ppc_cpu_do_interrupt(CPUState *cs)
 {
 PowerPCCPU *cpu = POWERPC_CPU(cs);
-CPUPPCState *env = &cpu->env;
 
-powerpc_excp(cpu, env->excp_model, cs->exception_index);
+powerpc_excp(cpu, cs->exception_index);
 }
 
 static void ppc_hw_interrupt(CPUPPCState *env)
@@ -892,20 +892,20 @@ static void ppc_hw_interrupt(CPUPPCState *env)
 /* External reset */
 if (env->pending_interrupts & (1 << PPC_INTERRUPT_RESET)) {
 env->pending_interrupts &= ~(1 << PPC_INTERRUPT_RESET);
-powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_RESET);
+powerpc_excp(cpu, POWERPC_EXCP_RESET);
 return;
 }
 /* Machine check exception */
 if (env->pending_interrupts & (1 << PPC_INTERRUPT_MCK)) {
 env->pending_interrupts &= ~(1 << PPC_INTERRUPT_MCK);
-powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_MCHECK);
+powerpc_excp(cpu, POWERPC_EXCP_MCHECK);
 return;
 }
 #if 0 /* TODO */
 /* External debug exception */
 if (env->pending_interrupts & (1 << PPC_INTERRUPT_DEBUG)) {
 env->pending_interrupts &= ~(1 << PPC_INTERRUPT_DEBUG);
-powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_DEBUG);
+powerpc_excp(cpu, POWERPC_EXCP_DEBUG);
 return;
 }
 #endif
@@ -925,7 +925,7 @@ static void ppc_hw_interrupt(CPUPPCState *env)
 if ((async_deliver || msr_hv == 0) && hdice) {
 /* HDEC clears on delivery */
 env->pending_interrupts &= ~(1 << PPC_INTERRUPT_HDECR);
-powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_HDECR);
+powerpc_excp(cpu, POWERPC_EXCP_HDECR);
 return;
 }
 }
@@ -935,7 +935,7 @@ static void ppc_hw_interrupt(CPUPPCState *env)
 /* LPCR will be clear when not supported so this will work */
 bool hvice = !!(env->spr[SPR_LPCR] & LPCR_HVICE);
 if ((async_deliver || msr_hv == 0) && hvice) {
-powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_HVIRT);
+powerpc_excp(cpu, POWERPC_EXCP_HVIRT);
 return;
 }
 }
@@ -947,14 +947,14 @@ static void ppc_hw_interrupt(CPUPPCState *env)
 /* HEIC blocks delivery to the hypervisor */
 if ((async_deliver && !(heic && msr_hv && !msr_pr)) ||
 (env->has_hv_mode && msr_hv == 0 && !lpes0)) {
-powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_EXTERNAL);
+powerpc_excp(cpu, POWERPC_EXCP_EXTERNAL);
 return;
 }
 }
 if (msr_ce != 0) {
 /* External critical interrupt */
 if (env->pending_interrupts & (1 << PPC_INTERRUPT_CEXT)) {
-powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_CRITICAL);
+powerpc_excp(cpu, POWERPC_EXCP_CRITICAL);
 return;
 }
 }
@@ -962,24 +962,24 @@ static void ppc_hw_interrupt(CPUPPCState *env)
 /* Watchdog timer on embedded PowerPC */
 if (env->pending_interrupts & (1 << PPC_INTERRUPT_WDT)) {
 env->pending_interrupts &= ~(1 << PPC_INTERRUPT_WDT);
-powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_WDT);
+powerpc_excp(cpu, POWERPC_EXCP_WDT);
 return;
 }
 if (env->pending_interrupts & (1 << PPC_INTERRUPT_CDOORBELL)) {
 env->pending_interrupts &= ~(1 << PPC_INTERRUPT_CDOORBELL);
-powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_DOORCI);
+powerpc_excp(cpu, POWERPC_EXCP_DOORCI);
 return;
 }
 /* Fixed interval timer on embedded PowerPC */
 if (env->pending_interrupts & (1 << PPC_INTERRUPT_FIT)) {
 env->pending_interrupts &= ~(1 << PPC_INTERRUPT_FIT);
-powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_FIT);
+powerpc_excp(cpu, POWERPC_EXCP_FIT);
 return;
 }
 /* Programmable interval timer on embedded PowerPC */
 if (env->pending_

[PATCH 1/2] include/qemu: add 32-bit Windows dump structures

2021-12-29 Thread Viktor Prutyanov
These structures are required to produce a 32-bit guest Windows Complete
Memory Dump. Add the 32-bit Windows dump header, CPU context and physical
memory descriptor structures along with the corresponding definitions.

Signed-off-by: Viktor Prutyanov 
---
 contrib/elf2dmp/main.c   |   6 +-
 include/qemu/win_dump_defs.h | 115 +--
 2 files changed, 114 insertions(+), 7 deletions(-)

diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
index 20b477d582..b9fc6d230c 100644
--- a/contrib/elf2dmp/main.c
+++ b/contrib/elf2dmp/main.c
@@ -141,10 +141,10 @@ static KDDEBUGGER_DATA64 *get_kdbg(uint64_t KernBase, 
struct pdb_reader *pdb,
 return kdbg;
 }
 
-static void win_context_init_from_qemu_cpu_state(WinContext *ctx,
+static void win_context_init_from_qemu_cpu_state(WinContext64 *ctx,
 QEMUCPUState *s)
 {
-WinContext win_ctx = (WinContext){
+WinContext64 win_ctx = (WinContext64){
 .ContextFlags = WIN_CTX_X64 | WIN_CTX_INT | WIN_CTX_SEG | WIN_CTX_CTL,
 .MxCsr = INITIAL_MXCSR,
 
@@ -302,7 +302,7 @@ static int fill_context(KDDEBUGGER_DATA64 *kdbg,
 for (i = 0; i < qe->state_nr; i++) {
 uint64_t Prcb;
 uint64_t Context;
-WinContext ctx;
+WinContext64 ctx;
 QEMUCPUState *s = qe->state[i];
 
 if (va_space_rw(vs, kdbg->KiProcessorBlock + sizeof(Prcb) * i,
diff --git a/include/qemu/win_dump_defs.h b/include/qemu/win_dump_defs.h
index 145096e8ee..af85f5c3cf 100644
--- a/include/qemu/win_dump_defs.h
+++ b/include/qemu/win_dump_defs.h
@@ -11,11 +11,22 @@
 #ifndef QEMU_WIN_DUMP_DEFS_H
 #define QEMU_WIN_DUMP_DEFS_H
 
+typedef struct WinDumpPhyMemRun32 {
+uint32_t BasePage;
+uint32_t PageCount;
+} QEMU_PACKED WinDumpPhyMemRun32;
+
 typedef struct WinDumpPhyMemRun64 {
 uint64_t BasePage;
 uint64_t PageCount;
 } QEMU_PACKED WinDumpPhyMemRun64;
 
+typedef struct WinDumpPhyMemDesc32 {
+uint32_t NumberOfRuns;
+uint32_t NumberOfPages;
+WinDumpPhyMemRun32 Run[86];
+} QEMU_PACKED WinDumpPhyMemDesc32;
+
 typedef struct WinDumpPhyMemDesc64 {
 uint32_t NumberOfRuns;
 uint32_t unused;
@@ -33,6 +44,39 @@ typedef struct WinDumpExceptionRecord {
 uint64_t ExceptionInformation[15];
 } QEMU_PACKED WinDumpExceptionRecord;
 
+typedef struct WinDumpHeader32 {
+char Signature[4];
+char ValidDump[4];
+uint32_t MajorVersion;
+uint32_t MinorVersion;
+uint32_t DirectoryTableBase;
+uint32_t PfnDatabase;
+uint32_t PsLoadedModuleList;
+uint32_t PsActiveProcessHead;
+uint32_t MachineImageType;
+uint32_t NumberProcessors;
+union {
+struct {
+uint32_t BugcheckCode;
+uint32_t BugcheckParameter1;
+uint32_t BugcheckParameter2;
+uint32_t BugcheckParameter3;
+uint32_t BugcheckParameter4;
+};
+uint8_t BugcheckData[20];
+};
+uint8_t VersionUser[32];
+uint32_t reserved0;
+uint32_t KdDebuggerDataBlock;
+union {
+WinDumpPhyMemDesc32 PhysicalMemoryBlock;
+uint8_t PhysicalMemoryBlockBuffer[700];
+};
+uint8_t reserved1[3200];
+uint32_t RequiredDumpSpace;
+uint8_t reserved2[92];
+} QEMU_PACKED WinDumpHeader32;
+
 typedef struct WinDumpHeader64 {
 char Signature[4];
 char ValidDump[4];
@@ -81,24 +125,48 @@ typedef struct WinDumpHeader64 {
 uint8_t reserved[4018];
 } QEMU_PACKED WinDumpHeader64;
 
+typedef union WinDumpHeader {
+struct {
+char Signature[4];
+char ValidDump[4];
+};
+WinDumpHeader32 x32;
+WinDumpHeader64 x64;
+} WinDumpHeader;
+
 #define KDBG_OWNER_TAG_OFFSET64 0x10
 #define KDBG_MM_PFN_DATABASE_OFFSET64   0xC0
 #define KDBG_KI_BUGCHECK_DATA_OFFSET64  0x88
 #define KDBG_KI_PROCESSOR_BLOCK_OFFSET640x218
 #define KDBG_OFFSET_PRCB_CONTEXT_OFFSET64   0x338
 
+#define KDBG_OWNER_TAG_OFFSET   KDBG_OWNER_TAG_OFFSET64
+#define KDBG_MM_PFN_DATABASE_OFFSET KDBG_MM_PFN_DATABASE_OFFSET64
+#define KDBG_KI_BUGCHECK_DATA_OFFSETKDBG_KI_BUGCHECK_DATA_OFFSET64
+#define KDBG_KI_PROCESSOR_BLOCK_OFFSET  KDBG_KI_PROCESSOR_BLOCK_OFFSET64
+#define KDBG_OFFSET_PRCB_CONTEXT_OFFSET KDBG_OFFSET_PRCB_CONTEXT_OFFSET64
+
 #define VMCOREINFO_ELF_NOTE_HDR_SIZE24
+#define VMCOREINFO_WIN_DUMP_NOTE_SIZE64 (sizeof(WinDumpHeader64) + \
+ VMCOREINFO_ELF_NOTE_HDR_SIZE)
+#define VMCOREINFO_WIN_DUMP_NOTE_SIZE32 (sizeof(WinDumpHeader32) + \
+ VMCOREINFO_ELF_NOTE_HDR_SIZE)
 
 #define WIN_CTX_X64 0x0010L
+#define WIN_CTX_X86 0x0001L
 
 #define WIN_CTX_CTL 0x0001L
 #define WIN_CTX_INT 0x0002L
 #define WIN_CTX_SEG 0x0004L
 #define WIN_CTX_FP  0x0008L
 #define WIN_CTX_DBG 0x0010L
+#define WIN_CTX_EXT 0x0020L
+
+#define WIN_CTX64_FULL (WIN_CTX_X64 | WIN_CTX_CTL | WIN_CTX_INT | WIN_CTX_FP)
+#define WIN_CTX64_ALL (WIN_CTX64_FULL | WIN_CTX_SEG | WIN_CTX_DBG)
 
-#define WIN_CTX_FU

[PATCH 0/2] dump: add 32-bit guest Windows support

2021-12-29 Thread Viktor Prutyanov
Since 32-bit versions of Windows still exist, there is a need to take
live and crash dumps of such guests along with 64-bit guests. So, add
the ability for 'dump-guest-memory -w' to take dumps from 32-bit guests.
When running the command, QEMU consumes the 32-bit Complete Memory Dump
header passed by the guest driver through the vmcoreinfo device, as was
previously done for 64-bit headers. The 32-bit vmcoreinfo guest driver in
upstream virtio-win can fill in such a header.
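
For example, with such a guest the dump can be requested from the HMP
monitor in the same way as for 64-bit guests (illustrative invocation,
file name arbitrary):

    (qemu) dump-guest-memory -w win32-guest.dmp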

Viktor Prutyanov (2):
  include/qemu: add 32-bit Windows dump structures and definitions
  dump/win_dump: add 32-bit guest Windows support for dump-guest-memory

 contrib/elf2dmp/main.c   |   6 +-
 dump/win_dump.c  | 293 ++-
 hmp-commands.hx  |   2 +-
 include/qemu/win_dump_defs.h | 115 +-
 4 files changed, 300 insertions(+), 116 deletions(-)

-- 
2.31.1




[PATCH 2/2] dump/win_dump: add 32-bit guest Windows support

2021-12-29 Thread Viktor Prutyanov
Before this patch, 'dump-guest-memory -w' accepted only a 64-bit
dump header provided by the guest through vmcoreinfo and thus was unable
to produce a 32-bit guest Windows dump. So, add 32-bit guest Windows
dumping support.

Signed-off-by: Viktor Prutyanov 
---
 dump/win_dump.c | 293 ++--
 hmp-commands.hx |   2 +-
 2 files changed, 186 insertions(+), 109 deletions(-)

diff --git a/dump/win_dump.c b/dump/win_dump.c
index c5eb5a9aac..98d3c82078 100644
--- a/dump/win_dump.c
+++ b/dump/win_dump.c
@@ -24,11 +24,25 @@
 #include "hw/misc/vmcoreinfo.h"
 #include "win_dump.h"
 
-static size_t write_run(WinDumpPhyMemRun64 *run, int fd, Error **errp)
+#define WIN_DUMP_PTR_SIZE (x64 ? sizeof(uint64_t) : sizeof(uint32_t))
+
+#define _WIN_DUMP_FIELD(f) (x64 ? h->x64.f : h->x32.f)
+#define WIN_DUMP_FIELD(field) _WIN_DUMP_FIELD(field)
+
+#define _WIN_DUMP_FIELD_PTR(f) (x64 ? (void *)&h->x64.f : (void *)&h->x32.f)
+#define WIN_DUMP_FIELD_PTR(field) _WIN_DUMP_FIELD_PTR(field)
+
+#define _WIN_DUMP_FIELD_SIZE(f) (x64 ? sizeof(h->x64.f) : sizeof(h->x32.f))
+#define WIN_DUMP_FIELD_SIZE(field) _WIN_DUMP_FIELD_SIZE(field)
+
+#define WIN_DUMP_CTX_SIZE (x64 ? sizeof(WinContext64) : sizeof(WinContext32))
+
+static size_t write_run(uint64_t base_page, uint64_t page_count,
+int fd, Error **errp)
 {
 void *buf;
-uint64_t addr = run->BasePage << TARGET_PAGE_BITS;
-uint64_t size = run->PageCount << TARGET_PAGE_BITS;
+uint64_t addr = base_page << TARGET_PAGE_BITS;
+uint64_t size = page_count << TARGET_PAGE_BITS;
 uint64_t len, l;
 size_t total = 0;
 
@@ -57,15 +71,16 @@ static size_t write_run(WinDumpPhyMemRun64 *run, int fd, 
Error **errp)
 return total;
 }
 
-static void write_runs(DumpState *s, WinDumpHeader64 *h, Error **errp)
+static void write_runs(DumpState *s, WinDumpHeader *h, bool x64, Error **errp)
 {
-WinDumpPhyMemDesc64 *desc = &h->PhysicalMemoryBlock;
-WinDumpPhyMemRun64 *run = desc->Run;
+uint64_t BasePage, PageCount;
 Error *local_err = NULL;
 int i;
 
-for (i = 0; i < desc->NumberOfRuns; i++) {
-s->written_size += write_run(run + i, s->fd, &local_err);
+for (i = 0; i < WIN_DUMP_FIELD(PhysicalMemoryBlock.NumberOfRuns); i++) {
+BasePage = WIN_DUMP_FIELD(PhysicalMemoryBlock.Run[i].BasePage);
+PageCount = WIN_DUMP_FIELD(PhysicalMemoryBlock.Run[i].PageCount);
+s->written_size += write_run(BasePage, PageCount, s->fd, &local_err);
 if (local_err) {
 error_propagate(errp, local_err);
 return;
@@ -73,30 +88,45 @@ static void write_runs(DumpState *s, WinDumpHeader64 *h, 
Error **errp)
 }
 }
 
-static void patch_mm_pfn_database(WinDumpHeader64 *h, Error **errp)
+static int cpu_read_ptr(bool x64, CPUState *cpu, uint64_t addr, uint64_t *ptr)
+{
+int ret;
+uint32_t ptr32;
+uint64_t ptr64;
+
+ret = cpu_memory_rw_debug(cpu, addr, x64 ? (void *)&ptr64 : (void *)&ptr32,
+WIN_DUMP_PTR_SIZE, 0);
+
+*ptr = x64 ? ptr64 : ptr32;
+
+return ret;
+}
+
+static void patch_mm_pfn_database(WinDumpHeader *h, bool x64, Error **errp)
 {
 if (cpu_memory_rw_debug(first_cpu,
-h->KdDebuggerDataBlock + KDBG_MM_PFN_DATABASE_OFFSET64,
-(uint8_t *)&h->PfnDatabase, sizeof(h->PfnDatabase), 0)) {
+WIN_DUMP_FIELD(KdDebuggerDataBlock) + KDBG_MM_PFN_DATABASE_OFFSET,
+WIN_DUMP_FIELD_PTR(PfnDatabase),
+WIN_DUMP_FIELD_SIZE(PfnDatabase), 0)) {
 error_setg(errp, "win-dump: failed to read MmPfnDatabase");
 return;
 }
 }
 
-static void patch_bugcheck_data(WinDumpHeader64 *h, Error **errp)
+static void patch_bugcheck_data(WinDumpHeader *h, bool x64, Error **errp)
 {
 uint64_t KiBugcheckData;
 
-if (cpu_memory_rw_debug(first_cpu,
-h->KdDebuggerDataBlock + KDBG_KI_BUGCHECK_DATA_OFFSET64,
-(uint8_t *)&KiBugcheckData, sizeof(KiBugcheckData), 0)) {
+if (cpu_read_ptr(x64, first_cpu,
+WIN_DUMP_FIELD(KdDebuggerDataBlock) + KDBG_KI_BUGCHECK_DATA_OFFSET,
+&KiBugcheckData)) {
 error_setg(errp, "win-dump: failed to read KiBugcheckData");
 return;
 }
 
-if (cpu_memory_rw_debug(first_cpu,
-KiBugcheckData,
-h->BugcheckData, sizeof(h->BugcheckData), 0)) {
+if (cpu_memory_rw_debug(first_cpu, KiBugcheckData,
+WIN_DUMP_FIELD(BugcheckData),
+WIN_DUMP_FIELD_SIZE(BugcheckData), 0)) {
 error_setg(errp, "win-dump: failed to read bugcheck data");
 return;
 }
@@ -105,38 +135,42 @@ static void patch_bugcheck_data(WinDumpHeader64 *h, Error 
**errp)
  * If BugcheckCode wasn't saved, we consider guest OS as alive.
  */
 
-if (!h->BugcheckCode) {
-h->BugcheckCode = LIVE_SYSTEM_DUMP;
+if (!WIN_DUMP_FIELD(BugcheckCode)) {
+*(uint32_t *)WIN_DUMP_FIELD_PTR(BugcheckCode) = LIVE_SYSTEM_DUMP;
 }
 }
 
 /*
  * This routine tries to cor

[PULL 1/6] job.c: add missing notifier initialization

2021-12-29 Thread Vladimir Sementsov-Ogievskiy
From: Emanuele Giuseppe Esposito 

It seems that the on_idle list is not properly initialized like
the other notifiers.

Fixes: 34dc97b9a0e ("blockjob: Wake up BDS when job becomes idle")
Signed-off-by: Emanuele Giuseppe Esposito 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 job.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/job.c b/job.c
index dbfa67bb0a..54db80df66 100644
--- a/job.c
+++ b/job.c
@@ -352,6 +352,7 @@ void *job_create(const char *job_id, const JobDriver 
*driver, JobTxn *txn,
 notifier_list_init(&job->on_finalize_completed);
 notifier_list_init(&job->on_pending);
 notifier_list_init(&job->on_ready);
+notifier_list_init(&job->on_idle);
 
 job_state_transition(job, JOB_STATUS_CREATED);
 aio_timer_init(qemu_get_aio_context(), &job->sleep_timer,
-- 
2.31.1




[PULL 3/6] test-blockjob-txn: don't abuse job->blk

2021-12-29 Thread Vladimir Sementsov-Ogievskiy
Here we use job->blk to drop our own reference in job cleanup. Let's do
something simpler: drop our reference immediately after job creation.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Nikita Lapshin 
---
 tests/unit/test-blockjob-txn.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/tests/unit/test-blockjob-txn.c b/tests/unit/test-blockjob-txn.c
index 8bd13b9949..c69028b450 100644
--- a/tests/unit/test-blockjob-txn.c
+++ b/tests/unit/test-blockjob-txn.c
@@ -25,14 +25,6 @@ typedef struct {
 int *result;
 } TestBlockJob;
 
-static void test_block_job_clean(Job *job)
-{
-BlockJob *bjob = container_of(job, BlockJob, job);
-BlockDriverState *bs = blk_bs(bjob->blk);
-
-bdrv_unref(bs);
-}
-
 static int coroutine_fn test_block_job_run(Job *job, Error **errp)
 {
 TestBlockJob *s = container_of(job, TestBlockJob, common.job);
@@ -73,7 +65,6 @@ static const BlockJobDriver test_block_job_driver = {
 .free  = block_job_free,
 .user_resume   = block_job_user_resume,
 .run   = test_block_job_run,
-.clean = test_block_job_clean,
 },
 };
 
@@ -105,6 +96,7 @@ static BlockJob *test_block_job_start(unsigned int 
iterations,
 s = block_job_create(job_id, &test_block_job_driver, txn, bs,
  0, BLK_PERM_ALL, 0, JOB_DEFAULT,
  test_block_job_cb, data, &error_abort);
+bdrv_unref(bs); /* referenced by job now */
 s->iterations = iterations;
 s->use_timer = use_timer;
 s->rc = rc;
-- 
2.31.1




[PULL 2/6] blockjob: implement and use block_job_get_aio_context

2021-12-29 Thread Vladimir Sementsov-Ogievskiy
We are going to drop BlockJob.blk. So let's retrieve the block job context
from the underlying job instead of the main node.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Nikita Lapshin 
---
 include/block/blockjob.h | 7 +++
 blockdev.c   | 6 +++---
 blockjob.c   | 5 +
 qemu-img.c   | 2 +-
 4 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index d200f33c10..3b84805140 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -173,4 +173,11 @@ bool block_job_is_internal(BlockJob *job);
  */
 const BlockJobDriver *block_job_driver(BlockJob *job);
 
+/*
+ * block_job_get_aio_context:
+ *
+ * Returns aio context associated with a block job.
+ */
+AioContext *block_job_get_aio_context(BlockJob *job);
+
 #endif
diff --git a/blockdev.c b/blockdev.c
index 0eb2823b1b..b5ff9b854e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3315,7 +3315,7 @@ static BlockJob *find_block_job(const char *id, 
AioContext **aio_context,
 return NULL;
 }
 
-*aio_context = blk_get_aio_context(job->blk);
+*aio_context = block_job_get_aio_context(job);
 aio_context_acquire(*aio_context);
 
 return job;
@@ -3420,7 +3420,7 @@ void qmp_block_job_finalize(const char *id, Error **errp)
  * automatically acquires the new one), so make sure we release the correct
  * one.
  */
-aio_context = blk_get_aio_context(job->blk);
+aio_context = block_job_get_aio_context(job);
 job_unref(&job->job);
 aio_context_release(aio_context);
 }
@@ -3711,7 +3711,7 @@ BlockJobInfoList *qmp_query_block_jobs(Error **errp)
 if (block_job_is_internal(job)) {
 continue;
 }
-aio_context = blk_get_aio_context(job->blk);
+aio_context = block_job_get_aio_context(job);
 aio_context_acquire(aio_context);
 value = block_job_query(job, errp);
 aio_context_release(aio_context);
diff --git a/blockjob.c b/blockjob.c
index 4bad1408cb..70bc3105a6 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -547,3 +547,8 @@ BlockErrorAction block_job_error_action(BlockJob *job, 
BlockdevOnError on_err,
 }
 return action;
 }
+
+AioContext *block_job_get_aio_context(BlockJob *job)
+{
+return job->job.aio_context;
+}
diff --git a/qemu-img.c b/qemu-img.c
index f036a1d428..21ba1e6800 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -902,7 +902,7 @@ static void common_block_job_cb(void *opaque, int ret)
 static void run_block_job(BlockJob *job, Error **errp)
 {
 uint64_t progress_current, progress_total;
-AioContext *aio_context = blk_get_aio_context(job->blk);
+AioContext *aio_context = block_job_get_aio_context(job);
 int ret = 0;
 
 aio_context_acquire(aio_context);
-- 
2.31.1




[PULL 4/6] block/stream: add own blk

2021-12-29 Thread Vladimir Sementsov-Ogievskiy
block-stream is the only block job that reasonably uses BlockJob.blk.
We are going to drop BlockJob.blk soon, so let block-stream have its own
blk.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Nikita Lapshin 
---
 block/stream.c | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/block/stream.c b/block/stream.c
index e45113aed6..7c6b173ddd 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -33,6 +33,7 @@ enum {
 
 typedef struct StreamBlockJob {
 BlockJob common;
+BlockBackend *blk;
 BlockDriverState *base_overlay; /* COW overlay (stream from this) */
 BlockDriverState *above_base;   /* Node directly above the base */
 BlockDriverState *cor_filter_bs;
@@ -88,17 +89,18 @@ static int stream_prepare(Job *job)
 static void stream_clean(Job *job)
 {
 StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
-BlockJob *bjob = &s->common;
 
 if (s->cor_filter_bs) {
 bdrv_cor_filter_drop(s->cor_filter_bs);
 s->cor_filter_bs = NULL;
 }
 
+blk_unref(s->blk);
+s->blk = NULL;
+
 /* Reopen the image back in read-only mode if necessary */
 if (s->bs_read_only) {
 /* Give up write permissions before making it read-only */
-blk_set_perm(bjob->blk, 0, BLK_PERM_ALL, &error_abort);
 bdrv_reopen_set_read_only(s->target_bs, true, NULL);
 }
 
@@ -108,7 +110,6 @@ static void stream_clean(Job *job)
 static int coroutine_fn stream_run(Job *job, Error **errp)
 {
 StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
-BlockBackend *blk = s->common.blk;
 BlockDriverState *unfiltered_bs = bdrv_skip_filters(s->target_bs);
 int64_t len;
 int64_t offset = 0;
@@ -159,7 +160,7 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
 }
 trace_stream_one_iteration(s, offset, n, ret);
 if (copy) {
-ret = stream_populate(blk, offset, n);
+ret = stream_populate(s->blk, offset, n);
 }
 if (ret < 0) {
 BlockErrorAction action =
@@ -294,13 +295,24 @@ void stream_start(const char *job_id, BlockDriverState 
*bs,
 }
 
 s = block_job_create(job_id, &stream_job_driver, NULL, cor_filter_bs,
- BLK_PERM_CONSISTENT_READ,
- basic_flags | BLK_PERM_WRITE,
+ 0, BLK_PERM_ALL,
  speed, creation_flags, NULL, NULL, errp);
 if (!s) {
 goto fail;
 }
 
+s->blk = blk_new_with_bs(cor_filter_bs, BLK_PERM_CONSISTENT_READ,
+ basic_flags | BLK_PERM_WRITE, errp);
+if (!s->blk) {
+goto fail;
+}
+/*
+ * Disable request queuing in the BlockBackend to avoid deadlocks on drain:
+ * The job reports that it's busy until it reaches a pause point.
+ */
+blk_set_disable_request_queuing(s->blk, true);
+blk_set_allow_aio_context_change(s->blk, true);
+
 /*
  * Prevent concurrent jobs trying to modify the graph structure here, we
  * already have our own plans. Also don't allow resize as the image size is
-- 
2.31.1




[PULL 5/6] test-bdrv-drain: don't use BlockJob.blk

2021-12-29 Thread Vladimir Sementsov-Ogievskiy
We are going to drop BlockJob.blk in a further commit. For the tests it's
enough to simply pass the bs pointer.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Nikita Lapshin 
---
 tests/unit/test-bdrv-drain.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
index 2d3c17e566..36be84ae55 100644
--- a/tests/unit/test-bdrv-drain.c
+++ b/tests/unit/test-bdrv-drain.c
@@ -772,6 +772,7 @@ static void test_iothread_drain_subtree(void)
 
 typedef struct TestBlockJob {
 BlockJob common;
+BlockDriverState *bs;
 int run_ret;
 int prepare_ret;
 bool running;
@@ -783,7 +784,7 @@ static int test_job_prepare(Job *job)
 TestBlockJob *s = container_of(job, TestBlockJob, common.job);
 
 /* Provoke an AIO_WAIT_WHILE() call to verify there is no deadlock */
-blk_flush(s->common.blk);
+bdrv_flush(s->bs);
 return s->prepare_ret;
 }
 
@@ -792,7 +793,7 @@ static void test_job_commit(Job *job)
 TestBlockJob *s = container_of(job, TestBlockJob, common.job);
 
 /* Provoke an AIO_WAIT_WHILE() call to verify there is no deadlock */
-blk_flush(s->common.blk);
+bdrv_flush(s->bs);
 }
 
 static void test_job_abort(Job *job)
@@ -800,7 +801,7 @@ static void test_job_abort(Job *job)
 TestBlockJob *s = container_of(job, TestBlockJob, common.job);
 
 /* Provoke an AIO_WAIT_WHILE() call to verify there is no deadlock */
-blk_flush(s->common.blk);
+bdrv_flush(s->bs);
 }
 
 static int coroutine_fn test_job_run(Job *job, Error **errp)
@@ -915,6 +916,7 @@ static void test_blockjob_common_drain_node(enum drain_type 
drain_type,
 tjob = block_job_create("job0", &test_job_driver, NULL, src,
 0, BLK_PERM_ALL,
 0, 0, NULL, NULL, &error_abort);
+tjob->bs = src;
 job = &tjob->common;
 block_job_add_bdrv(job, "target", target, 0, BLK_PERM_ALL, &error_abort);
 
@@ -1538,6 +1540,7 @@ typedef struct TestDropBackingBlockJob {
 bool should_complete;
 bool *did_complete;
 BlockDriverState *detach_also;
+BlockDriverState *bs;
 } TestDropBackingBlockJob;
 
 static int coroutine_fn test_drop_backing_job_run(Job *job, Error **errp)
@@ -1557,7 +1560,7 @@ static void test_drop_backing_job_commit(Job *job)
 TestDropBackingBlockJob *s =
 container_of(job, TestDropBackingBlockJob, common.job);
 
-bdrv_set_backing_hd(blk_bs(s->common.blk), NULL, &error_abort);
+bdrv_set_backing_hd(s->bs, NULL, &error_abort);
 bdrv_set_backing_hd(s->detach_also, NULL, &error_abort);
 
 *s->did_complete = true;
@@ -1657,6 +1660,7 @@ static void test_blockjob_commit_by_drained_end(void)
 job = block_job_create("job", &test_drop_backing_job_driver, NULL,
bs_parents[2], 0, BLK_PERM_ALL, 0, 0, NULL, NULL,
&error_abort);
+job->bs = bs_parents[2];
 
 job->detach_also = bs_parents[0];
 job->did_complete = &job_has_completed;
-- 
2.31.1




[PULL 6/6] blockjob: drop BlockJob.blk field

2021-12-29 Thread Vladimir Sementsov-Ogievskiy
It's unused now (except for permission handling) [*]. The only reasonable
user of it was the block-stream job, recently updated to use its own blk.
Other block jobs prefer to use their own source-node-related objects.

So, the arguments for dropping the field are:

 - block jobs prefer not to use it
 - block jobs usually have more than one node to operate on, and it is
   better to operate symmetrically (for example, having both source and
   target blk's in the specific block-job state structure)

*: BlockJob.blk is used to keep some permissions. We simply move the
permissions to the block-job child created in block_job_create() together
with the blk.

In mirror, we simply should not care anymore about restoring the state of
the blk. Most probably this code could have been dropped long ago, after
dropping the bs->job pointer. Now it finally goes away together with
BlockJob.blk itself.

The iotest 141 output is updated, as the "bdrv_has_blk(bs)" check in
qmp_blockdev_del() no longer fails (we don't have a blk now). Still, the
new error message looks even better.

In iotest 283 we need to add a job id, otherwise "Invalid job ID"
now happens earlier than the permission check (as permissions moved from
the blk to the block-job node).

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Nikita Lapshin 
---
 include/block/blockjob.h   |  3 ---
 block/mirror.c |  7 ---
 blockjob.c | 31 ---
 tests/qemu-iotests/141.out |  2 +-
 tests/qemu-iotests/283 |  3 ++-
 tests/qemu-iotests/283.out |  2 +-
 6 files changed, 16 insertions(+), 32 deletions(-)

diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 3b84805140..87fbb3985f 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -43,9 +43,6 @@ typedef struct BlockJob {
 /** Data belonging to the generic Job infrastructure */
 Job job;
 
-/** The block device on which the job is operating.  */
-BlockBackend *blk;
-
 /** Status that is published by the query-block-jobs QMP API */
 BlockDeviceIoStatus iostatus;
 
diff --git a/block/mirror.c b/block/mirror.c
index efec2c7674..959e3dfbd6 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -771,13 +771,6 @@ static int mirror_exit_common(Job *job)
 block_job_remove_all_bdrv(bjob);
 bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
 
-/* We just changed the BDS the job BB refers to (with either or both of the
- * bdrv_replace_node() calls), so switch the BB back so the cleanup does
- * the right thing. We don't need any permissions any more now. */
-blk_remove_bs(bjob->blk);
-blk_set_perm(bjob->blk, 0, BLK_PERM_ALL, &error_abort);
-blk_insert_bs(bjob->blk, mirror_top_bs, &error_abort);
-
 bs_opaque->job = NULL;
 
 bdrv_drained_end(src);
diff --git a/blockjob.c b/blockjob.c
index 70bc3105a6..10815a89fe 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -86,7 +86,6 @@ void block_job_free(Job *job)
 BlockJob *bjob = container_of(job, BlockJob, job);
 
 block_job_remove_all_bdrv(bjob);
-blk_unref(bjob->blk);
 ratelimit_destroy(&bjob->limit);
 error_free(bjob->blocker);
 }
@@ -433,22 +432,16 @@ void *block_job_create(const char *job_id, const 
BlockJobDriver *driver,
uint64_t shared_perm, int64_t speed, int flags,
BlockCompletionFunc *cb, void *opaque, Error **errp)
 {
-BlockBackend *blk;
 BlockJob *job;
+int ret;
 
 if (job_id == NULL && !(flags & JOB_INTERNAL)) {
 job_id = bdrv_get_device_name(bs);
 }
 
-blk = blk_new_with_bs(bs, perm, shared_perm, errp);
-if (!blk) {
-return NULL;
-}
-
-job = job_create(job_id, &driver->job_driver, txn, 
blk_get_aio_context(blk),
+job = job_create(job_id, &driver->job_driver, txn, 
bdrv_get_aio_context(bs),
  flags, cb, opaque, errp);
 if (job == NULL) {
-blk_unref(blk);
 return NULL;
 }
 
@@ -458,8 +451,6 @@ void *block_job_create(const char *job_id, const 
BlockJobDriver *driver,
 
 ratelimit_init(&job->limit);
 
-job->blk = blk;
-
 job->finalize_cancelled_notifier.notify = block_job_event_cancelled;
 job->finalize_completed_notifier.notify = block_job_event_completed;
 job->pending_notifier.notify = block_job_event_pending;
@@ -476,21 +467,23 @@ void *block_job_create(const char *job_id, const 
BlockJobDriver *driver,
 
 error_setg(&job->blocker, "block device is in use by block job: %s",
job_type_str(&job->job));
-block_job_add_bdrv(job, "main node", bs, 0, BLK_PERM_ALL, &error_abort);
+
+ret = block_job_add_bdrv(job, "main node", bs, perm, shared_perm, errp);
+if (ret < 0) {
+goto fail;
+}
 
 bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
 
-/* Disable request queuing in the BlockBackend to avoid deadlocks on drain:
- * The job reports that it's busy until it reaches a pause point. */
-blk_set_disable_request_queuing(blk, true);
-bl

[PULL 0/6] Block jobs patches

2021-12-29 Thread Vladimir Sementsov-Ogievskiy
The following changes since commit 89f3bfa3265554d1d591ee4d7f1197b6e3397e84:

  Merge tag 'pull-pa-20211223' of https://gitlab.com/rth7680/qemu into staging 
(2021-12-23 17:53:36 -0800)

are available in the Git repository at:

  https://src.openvz.org/scm/~vsementsov/qemu.git tags/pull-jobs-2021-12-29

for you to fetch changes up to 985cac8f200443ad952becc03b07c51ff4f80983:

  blockjob: drop BlockJob.blk field (2021-12-28 15:18:59 +0100)


Jobs patches:
 - small fix of job_create()
 - refactoring: drop BlockJob.blk field



Emanuele Giuseppe Esposito (1):
  job.c: add missing notifier initialization

Vladimir Sementsov-Ogievskiy (5):
  blockjob: implement and use block_job_get_aio_context
  test-blockjob-txn: don't abuse job->blk
  block/stream: add own blk
  test-bdrv-drain: don't use BlockJob.blk
  blockjob: drop BlockJob.blk field

 include/block/blockjob.h   | 10 +++---
 block/mirror.c |  7 ---
 block/stream.c | 24 +--
 blockdev.c |  6 +++---
 blockjob.c | 36 --
 job.c  |  1 +
 qemu-img.c |  2 +-
 tests/unit/test-bdrv-drain.c   | 12 
 tests/unit/test-blockjob-txn.c | 10 +-
 tests/qemu-iotests/141.out |  2 +-
 tests/qemu-iotests/283 |  3 ++-
 tests/qemu-iotests/283.out |  2 +-
 12 files changed, 60 insertions(+), 55 deletions(-)

-- 
2.31.1




New arm alignment issue with 6.2.0 - bisected to single revision

2021-12-29 Thread Mark Watson
Hi,

I'm seeing a repeatable alignment exception running m68k system mode on
armv7l (arm cortex a9) following this commit:
"fa947a667fceab02f9f85fc99f54aebcc9ae6b51 is the first bad commit
commit fa947a667fceab02f9f85fc99f54aebcc9ae6b51
Author: Richard Henderson 
Date: Thu Jul 29 10:45:10 2021 -1000

hw/core: Make do_unaligned_access noreturn

While we may have had some thought of allowing system-mode
to return from this hook, we have no guests that require this.
"
With this included I see this in the kernel dmesg log:
[10621.993234] Alignment trap: not handling instruction f843b004 at
[]
[10622.000479] 8<--- cut here ---
[10622.003609] Unhandled fault: alignment exception (0x811) at 0xb13eed96
[10622.010162] pgd = 45acdb93
[10622.012941] [b13eed96] *pgd=0557a831, *pte=c01ee743, *ppte=c01eec33

As well as bisecting, I've verified it is this revision by checking out
a clean HEAD and then reverting just this revision (+ fixing conflicts).
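
For reference, the bisect itself was the usual procedure, roughly along
these lines (illustrative commands only; the good/bad endpoints below are
placeholders, and each step was a rebuild of m68k-softmmu plus a boot of
the guest):

    git bisect start
    git bisect bad v6.2.0
    git bisect good v6.1.0
    # rebuild and retest, then mark each step: git bisect good / git bisect bad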

The patch itself just seems to be adding QEMU_NORETURN (aka '__attribute__
((__noreturn__))') which I'd expect to be benign, so I'm not really sure
what is going on.

I cross-compiled it on Ubuntu using gcc/g++ (Ubuntu 9.3.0-17ubuntu1~20.04)
9.3.0.

Thanks,

Mark


[PATCH] target/hppa: Fix atomic_store_3 for STBY

2021-12-29 Thread Richard Henderson
The parallel version of STBY did not take host endianness into
account, and also computed the incorrect address for STBY_E.

Bswap twice to handle the merge and store.  Compute mask inside
the function rather than as a parameter.  Force align the address,
rather than subtracting one.

Generalize the function to system mode by using probe_access().
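
As an aside, here is a tiny stand-alone illustration of why the merge has
to bswap twice on a little-endian host (illustration only, not part of the
change itself; plain __builtin_bswap32() stands in for
cpu_to_be32()/be32_to_cpu(), with the values from the stby,b offset-1 case
of the new test):

    #include <stdint.h>
    #include <assert.h>

    int main(void)
    {
        uint32_t val  = 0x11223344;   /* guest register value */
        uint32_t mask = 0x00ffffffu;  /* guest bytes 1..3 of the word */
        uint32_t old  = 0;            /* aligned word, host (LE) order */

        /* swap to guest (BE) order, merge, swap back for the cmpxchg */
        uint32_t new = __builtin_bswap32((__builtin_bswap32(old) & ~mask)
                                         | (val & mask));

        /* the guest reads back 0x00223344, matching
           TEST("stby,b", 1, 0x00223344) below */
        assert(__builtin_bswap32(new) == 0x00223344);
        return 0;
    }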

Reported-by: Helge Deller 
Signed-off-by: Richard Henderson 
---
 target/hppa/op_helper.c| 27 ++-
 tests/tcg/hppa/stby.c  | 87 ++
 tests/tcg/hppa/Makefile.target |  5 ++
 3 files changed, 107 insertions(+), 12 deletions(-)
 create mode 100644 tests/tcg/hppa/stby.c

diff --git a/target/hppa/op_helper.c b/target/hppa/op_helper.c
index 96d9391c39..1b86557d5d 100644
--- a/target/hppa/op_helper.c
+++ b/target/hppa/op_helper.c
@@ -57,26 +57,29 @@ void HELPER(tcond)(CPUHPPAState *env, target_ureg cond)
 }
 }
 
-static void atomic_store_3(CPUHPPAState *env, target_ulong addr, uint32_t val,
-   uint32_t mask, uintptr_t ra)
+static void atomic_store_3(CPUHPPAState *env, target_ulong addr,
+   uint32_t val, uintptr_t ra)
 {
-#ifdef CONFIG_USER_ONLY
-uint32_t old, new, cmp;
+int mmu_idx = cpu_mmu_index(env, 0);
+uint32_t old, new, cmp, mask, *haddr;
+void *vaddr;
+
+vaddr = probe_access(env, addr, 3, MMU_DATA_STORE, mmu_idx, ra);
+if (vaddr == NULL) {
+cpu_loop_exit_atomic(env_cpu(env), ra);
+}
+haddr = (uint32_t *)((uintptr_t)vaddr & -4);
+mask = addr & 1 ? 0x00ffffffu : 0xffffff00u;
 
-uint32_t *haddr = g2h(env_cpu(env), addr - 1);
 old = *haddr;
 while (1) {
-new = (old & ~mask) | (val & mask);
+new = be32_to_cpu((cpu_to_be32(old) & ~mask) | (val & mask));
 cmp = qatomic_cmpxchg(haddr, old, new);
 if (cmp == old) {
 return;
 }
 old = cmp;
 }
-#else
-/* FIXME -- we can do better.  */
-cpu_loop_exit_atomic(env_cpu(env), ra);
-#endif
 }
 
 static void do_stby_b(CPUHPPAState *env, target_ulong addr, target_ureg val,
@@ -92,7 +95,7 @@ static void do_stby_b(CPUHPPAState *env, target_ulong addr, 
target_ureg val,
 case 1:
 /* The 3 byte store must appear atomic.  */
 if (parallel) {
-atomic_store_3(env, addr, val, 0x00ffffffu, ra);
+atomic_store_3(env, addr, val, ra);
 } else {
 cpu_stb_data_ra(env, addr, val >> 16, ra);
 cpu_stw_data_ra(env, addr + 1, val, ra);
@@ -122,7 +125,7 @@ static void do_stby_e(CPUHPPAState *env, target_ulong addr, 
target_ureg val,
 case 3:
 /* The 3 byte store must appear atomic.  */
 if (parallel) {
-atomic_store_3(env, addr - 3, val, 0xffffff00u, ra);
+atomic_store_3(env, addr - 3, val, ra);
 } else {
 cpu_stw_data_ra(env, addr - 3, val >> 16, ra);
 cpu_stb_data_ra(env, addr - 1, val >> 8, ra);
diff --git a/tests/tcg/hppa/stby.c b/tests/tcg/hppa/stby.c
new file mode 100644
index 00..36bd5f723c
--- /dev/null
+++ b/tests/tcg/hppa/stby.c
@@ -0,0 +1,87 @@
+/* Test STBY */
+
+#include 
+#include 
+#include 
+#include 
+
+
+struct S {
+unsigned a;
+unsigned b;
+unsigned c;
+};
+
+static void check(const struct S *s, unsigned e,
+  const char *which, const char *insn, int ofs)
+{
+int err = 0;
+
+if (s->a != 0) {
+fprintf(stderr, "%s %s %d: garbage before word 0x%08x\n",
+which, insn, ofs, s->a);
+err = 1;
+}
+if (s->c != 0) {
+fprintf(stderr, "%s %s %d: garbage after word 0x%08x\n",
+which, insn, ofs, s->c);
+err = 1;
+}
+if (s->b != e) {
+fprintf(stderr, "%s %s %d: 0x%08x != 0x%08x\n",
+which, insn, ofs, s->b, e);
+err = 1;
+}
+
+if (err) {
+exit(1);
+}
+}
+
+#define TEST(INSN, OFS, E) \
+do {   \
+s.b = 0;   \
+asm volatile(INSN " %1, " #OFS "(%0)"  \
+ : : "r"(&s.b), "r" (0x11223344) : "memory");  \
+check(&s, E, which, INSN, OFS);\
+} while (0)
+
+static void test(const char *which)
+{
+struct S s = { };
+
+TEST("stby,b", 0, 0x11223344);
+TEST("stby,b", 1, 0x00223344);
+TEST("stby,b", 2, 0x00003344);
+TEST("stby,b", 3, 0x00000044);
+
+TEST("stby,e", 0, 0x00000000);
+TEST("stby,e", 1, 0x11000000);
+TEST("stby,e", 2, 0x11220000);
+TEST("stby,e", 3, 0x11223300);
+}
+
+static void *child(void *x)
+{
+return NULL;
+}
+
+int main()
+{
+int err;
+pthread_t thr;
+
+/* Run test in serial mode */
+test("serial");
+
+/* Create a dummy thread to start parallel mode. */
+err = pthread_create(&thr, NULL, chil

[PATCH 0/3] hw/sysbus: Document GPIO related functions

2021-12-29 Thread Philippe Mathieu-Daudé
Reduce the scope of a pair of qdev/sysbus functions,
then document the sysbus functions related to creating
and connecting GPIO lines.

Based-on: <20211218130437.1516929-1-f4...@amsat.org>

Philippe Mathieu-Daudé (3):
  hw/qdev: Restrict qdev_get_gpio_out_connector() to qdev-internal.h
  hw/sysbus: Restrict sysbus_get_connected_irq() to sysbus-internal.h
  hw/sysbus: Document GPIO related functions

 hw/core/qdev-internal.h   | 15 +
 hw/core/sysbus-internal.h | 16 +
 include/hw/qdev-core.h| 18 --
 include/hw/sysbus.h   | 69 ---
 hw/core/gpio.c|  1 +
 hw/core/platform-bus.c|  2 +-
 hw/core/sysbus.c  |  2 ++
 7 files changed, 99 insertions(+), 24 deletions(-)
 create mode 100644 hw/core/qdev-internal.h
 create mode 100644 hw/core/sysbus-internal.h

-- 
2.33.1





[PATCH 2/3] hw/sysbus: Restrict sysbus_get_connected_irq() to sysbus-internal.h

2021-12-29 Thread Philippe Mathieu-Daudé
sysbus_get_connected_irq() and sysbus_is_irq_connected() are only
used by platform-bus.c; restrict them to hw/core/ by adding a local
"sysbus-internal.h" header.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/core/sysbus-internal.h | 16 
 include/hw/sysbus.h   |  2 --
 hw/core/platform-bus.c|  2 +-
 hw/core/sysbus.c  |  1 +
 4 files changed, 18 insertions(+), 3 deletions(-)
 create mode 100644 hw/core/sysbus-internal.h

diff --git a/hw/core/sysbus-internal.h b/hw/core/sysbus-internal.h
new file mode 100644
index 000..991b3e3159c
--- /dev/null
+++ b/hw/core/sysbus-internal.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * SysBus internal helpers
+ *
+ * Copyright (c) 2021 QEMU contributors
+ */
+#ifndef HW_CORE_SYSBUS_INTERNAL_H
+#define HW_CORE_SYSBUS_INTERNAL_H
+
+#include "hw/sysbus.h"
+
+/* Following functions are only used by the platform-bus subsystem */
+qemu_irq sysbus_get_connected_irq(SysBusDevice *dev, int n);
+bool sysbus_is_irq_connected(SysBusDevice *dev, int n);
+
+#endif /* HW_CORE_SYSBUS_INTERNAL_H */
diff --git a/include/hw/sysbus.h b/include/hw/sysbus.h
index 3564b7b6a22..24645ee7996 100644
--- a/include/hw/sysbus.h
+++ b/include/hw/sysbus.h
@@ -77,8 +77,6 @@ void sysbus_init_ioports(SysBusDevice *dev, uint32_t ioport, 
uint32_t size);
 bool sysbus_has_irq(SysBusDevice *dev, int n);
 bool sysbus_has_mmio(SysBusDevice *dev, unsigned int n);
 void sysbus_connect_irq(SysBusDevice *dev, int n, qemu_irq irq);
-bool sysbus_is_irq_connected(SysBusDevice *dev, int n);
-qemu_irq sysbus_get_connected_irq(SysBusDevice *dev, int n);
 void sysbus_mmio_map(SysBusDevice *dev, int n, hwaddr addr);
 void sysbus_mmio_map_overlap(SysBusDevice *dev, int n, hwaddr addr,
  int priority);
diff --git a/hw/core/platform-bus.c b/hw/core/platform-bus.c
index b8487b26b67..016fb71eba1 100644
--- a/hw/core/platform-bus.c
+++ b/hw/core/platform-bus.c
@@ -25,7 +25,7 @@
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
-
+#include "sysbus-internal.h"
 
 /*
  * Returns the PlatformBus IRQ number for a SysBusDevice irq number or -1 if
diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
index 0e6773c8df7..dcd7beda184 100644
--- a/hw/core/sysbus.c
+++ b/hw/core/sysbus.c
@@ -24,6 +24,7 @@
 #include "monitor/monitor.h"
 #include "exec/address-spaces.h"
 #include "qdev-internal.h"
+#include "sysbus-internal.h"
 
 static void sysbus_dev_print(Monitor *mon, DeviceState *dev, int indent);
 static char *sysbus_get_fw_dev_path(DeviceState *dev);
-- 
2.33.1




[PATCH 1/3] hw/qdev: Restrict qdev_get_gpio_out_connector() to qdev-internal.h

2021-12-29 Thread Philippe Mathieu-Daudé
qdev_get_gpio_out_connector() is called by sysbus_get_connected_irq()
which is only used by platform-bus.c; restrict it to hw/core/ by
adding a local "qdev-internal.h" header.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/core/qdev-internal.h | 15 +++
 include/hw/qdev-core.h  | 18 --
 hw/core/gpio.c  |  1 +
 hw/core/sysbus.c|  1 +
 4 files changed, 17 insertions(+), 18 deletions(-)
 create mode 100644 hw/core/qdev-internal.h

diff --git a/hw/core/qdev-internal.h b/hw/core/qdev-internal.h
new file mode 100644
index 000..6ec17d0ea70
--- /dev/null
+++ b/hw/core/qdev-internal.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * qdev internal helpers
+ *
+ * Copyright (c) 2009-2021 QEMU contributors
+ */
+#ifndef HW_CORE_QDEV_INTERNAL_H
+#define HW_CORE_QDEV_INTERNAL_H
+
+#include "hw/qdev-core.h"
+
+/* Following functions are only used by the platform-bus subsystem */
+qemu_irq qdev_get_gpio_out_connector(DeviceState *dev, const char *name, int 
n);
+
+#endif /* HW_CORE_QDEV_INTERNAL_H */
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index d19c9417520..655899654bb 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -532,24 +532,6 @@ void qdev_connect_gpio_out(DeviceState *dev, int n, 
qemu_irq pin);
 void qdev_connect_gpio_out_named(DeviceState *dev, const char *name, int n,
  qemu_irq input_pin);
 
-/**
- * qdev_get_gpio_out_connector: Get the qemu_irq connected to an output GPIO
- * @dev: Device whose output GPIO we are interested in
- * @name: Name of the output GPIO array
- * @n: Number of the output GPIO line within that array
- *
- * Returns whatever qemu_irq is currently connected to the specified
- * output GPIO line of @dev. This will be NULL if the output GPIO line
- * has never been wired up to the anything.  Note that the qemu_irq
- * returned does not belong to @dev -- it will be the input GPIO or
- * IRQ of whichever device the board code has connected up to @dev's
- * output GPIO.
- *
- * You probably don't need to use this function -- it is used only
- * by the platform-bus subsystem.
- */
-qemu_irq qdev_get_gpio_out_connector(DeviceState *dev, const char *name, int 
n);
-
 /**
  * qdev_intercept_gpio_out: Intercept an existing GPIO connection
  * @dev: Device to intercept the outbound GPIO line from
diff --git a/hw/core/gpio.c b/hw/core/gpio.c
index 80d07a6ec99..513ccbd1062 100644
--- a/hw/core/gpio.c
+++ b/hw/core/gpio.c
@@ -21,6 +21,7 @@
 #include "hw/qdev-core.h"
 #include "hw/irq.h"
 #include "qapi/error.h"
+#include "qdev-internal.h"
 
 static NamedGPIOList *qdev_get_named_gpio_list(DeviceState *dev,
const char *name)
diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
index 05c1da3d311..0e6773c8df7 100644
--- a/hw/core/sysbus.c
+++ b/hw/core/sysbus.c
@@ -23,6 +23,7 @@
 #include "hw/sysbus.h"
 #include "monitor/monitor.h"
 #include "exec/address-spaces.h"
+#include "qdev-internal.h"
 
 static void sysbus_dev_print(Monitor *mon, DeviceState *dev, int indent);
 static char *sysbus_get_fw_dev_path(DeviceState *dev);
-- 
2.33.1




[PATCH 3/3] hw/sysbus: Document GPIO related functions

2021-12-29 Thread Philippe Mathieu-Daudé
Similarly to cd07d7f9f51 ("qdev: Document GPIO related functions"),
add documentation comments for the various sysbus functions
related to creating and connecting GPIO lines.

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/sysbus.h | 67 +++--
 1 file changed, 64 insertions(+), 3 deletions(-)

diff --git a/include/hw/sysbus.h b/include/hw/sysbus.h
index 24645ee7996..7b2b7c7faaa 100644
--- a/include/hw/sysbus.h
+++ b/include/hw/sysbus.h
@@ -69,14 +69,75 @@ typedef void FindSysbusDeviceFunc(SysBusDevice *sbdev, void 
*opaque);
 
 void sysbus_init_mmio(SysBusDevice *dev, MemoryRegion *memory);
 MemoryRegion *sysbus_mmio_get_region(SysBusDevice *dev, int n);
-void sysbus_init_irq(SysBusDevice *dev, qemu_irq *p);
-void sysbus_pass_irq(SysBusDevice *dev, SysBusDevice *target);
-void sysbus_init_ioports(SysBusDevice *dev, uint32_t ioport, uint32_t size);
 
+/**
+ * sysbus_init_irq: Create an output GPIO line
+ * @dev: Sysbus device to create output GPIO for
+ * @irq: Pointer to qemu_irq for the GPIO lines
+ *
+ * Sysbus devices should use this function in their instance_init
+ * or realize methods to create any output GPIO lines they need.
+ *
+ * The @irq argument should be a pointer to a "qemu_irq" field in
+ * the device's state structure. The device implementation can then raise
+ * and lower the GPIO line by calling qemu_set_irq(). (If anything is
+ * connected to the other end of the GPIO this will cause the handler
+ * function for that input GPIO to be called.)
+ *
+ * See sysbus_connect_irq() for how code that uses such a device can
+ * connect to one of its output GPIO lines.
+ *
+ * There is no need to release the @irq GPIO line because it
+ * will be automatically released when @dev calls its instance_finalize()
+ * handler.
+ */
+void sysbus_init_irq(SysBusDevice *dev, qemu_irq *irq);
+
+/**
+ * sysbus_pass_irq: Create GPIO lines on container which pass through
+ *  to a target device
+ * @dev: Device which needs to expose GPIO lines
+ * @target: Device which has GPIO lines
+ *
+ * This function allows a container device to create GPIO arrays on itself
+ * which simply pass through to a GPIO array of another device. It is
+ * useful when modelling complex devices such as a system-on-chip, where a
+ * sysbus device contains other sysbus devices.
+ *
+ * It is not possible to pass a subset of the GPIO lines with this function.
+ *
+ * To users of the container sysbus device, the GPIO array created on @dev
+ * behaves exactly like any other.
+ */
+void sysbus_pass_irq(SysBusDevice *dev, SysBusDevice *target);
+
+void sysbus_init_ioports(SysBusDevice *dev, uint32_t ioport, uint32_t size);
 
 bool sysbus_has_irq(SysBusDevice *dev, int n);
 bool sysbus_has_mmio(SysBusDevice *dev, unsigned int n);
+
+/**
+ * sysbus_connect_irq: Connect a sysbus device output GPIO line
+ * @dev: sysbus device whose GPIO to connect
+ * @n: Number of the output GPIO line (which must be in range)
+ * @pin: qemu_irq to connect the output line to
+ *
+ * This function connects an output GPIO line on a sysbus device
+ * up to an arbitrary qemu_irq, so that when the device asserts that
+ * output GPIO line, the qemu_irq's callback is invoked.
+ * The index @n of the GPIO line must be valid, otherwise this function
+ * will assert().
+ *
+ * Outbound GPIO lines can be connected to any qemu_irq, but the common
+ * case is connecting them to another device's inbound GPIO line, using
+ * the qemu_irq returned by qdev_get_gpio_in() or qdev_get_gpio_in_named().
+ *
+ * It is not valid to try to connect one outbound GPIO to multiple
+ * qemu_irqs at once, or to connect multiple outbound GPIOs to the
+ * same qemu_irq; see qdev_connect_gpio_out() for details.
+ */
 void sysbus_connect_irq(SysBusDevice *dev, int n, qemu_irq irq);
+
 void sysbus_mmio_map(SysBusDevice *dev, int n, hwaddr addr);
 void sysbus_mmio_map_overlap(SysBusDevice *dev, int n, hwaddr addr,
  int priority);
-- 
2.33.1
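
To make the documented flow concrete, here is a minimal sketch of the
usual pattern (the device and the wiring below are hypothetical, not taken
from this series):

    #include "qemu/osdep.h"
    #include "hw/sysbus.h"
    #include "hw/irq.h"
    #include "hw/qdev-core.h"

    typedef struct DemoTimerState {
        SysBusDevice parent_obj;
        qemu_irq irq;                 /* output line created below */
    } DemoTimerState;

    static void demo_timer_init(Object *obj)
    {
        /* illustrative cast; a real device would use its QOM cast macro */
        DemoTimerState *s = (DemoTimerState *)obj;

        /* create output GPIO line 0 */
        sysbus_init_irq(SYS_BUS_DEVICE(obj), &s->irq);
    }

    /* the device raises/lowers the line with qemu_set_irq(s->irq, level) */

    /* board code: wire the timer's line 0 to input 5 of an 'intc' device */
    void demo_board_wire(SysBusDevice *timer, DeviceState *intc)
    {
        sysbus_connect_irq(timer, 0, qdev_get_gpio_in(intc, 5));
    }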




Re: [PATCH] target/hppa: Fix atomic_store_3 for STBY

2021-12-29 Thread Helge Deller
On 12/29/21 23:29, Richard Henderson wrote:
> The parallel version of STBY did not take host endianness into
> account, and also computed the incorrect address for STBY_E.
>
> Bswap twice to handle the merge and store.  Compute mask inside
> the function rather than as a parameter.  Force align the address,
> rather than subtracting one.
>
> Generalize the function to system mode by using probe_access().
>
> Reported-by: Helge Deller 
> Signed-off-by: Richard Henderson 

Please add:
Tested-by: Helge Deller 

I successfully tested the included stby test case on physical hardware and in 
qemu (on x86).
My original problem where gcc (on hppa) produces wrong assembly output is fixed 
too.
Please backport this patch to qemu v6.1.x and v6.0.x.

Thank you!
Helge


> ---
>  target/hppa/op_helper.c| 27 ++-
>  tests/tcg/hppa/stby.c  | 87 ++
>  tests/tcg/hppa/Makefile.target |  5 ++
>  3 files changed, 107 insertions(+), 12 deletions(-)
>  create mode 100644 tests/tcg/hppa/stby.c
>
> diff --git a/target/hppa/op_helper.c b/target/hppa/op_helper.c
> index 96d9391c39..1b86557d5d 100644
> --- a/target/hppa/op_helper.c
> +++ b/target/hppa/op_helper.c
> @@ -57,26 +57,29 @@ void HELPER(tcond)(CPUHPPAState *env, target_ureg cond)
>  }
>  }
>
> -static void atomic_store_3(CPUHPPAState *env, target_ulong addr, uint32_t 
> val,
> -   uint32_t mask, uintptr_t ra)
> +static void atomic_store_3(CPUHPPAState *env, target_ulong addr,
> +   uint32_t val, uintptr_t ra)
>  {
> -#ifdef CONFIG_USER_ONLY
> -uint32_t old, new, cmp;
> +int mmu_idx = cpu_mmu_index(env, 0);
> +uint32_t old, new, cmp, mask, *haddr;
> +void *vaddr;
> +
> +vaddr = probe_access(env, addr, 3, MMU_DATA_STORE, mmu_idx, ra);
> +if (vaddr == NULL) {
> +cpu_loop_exit_atomic(env_cpu(env), ra);
> +}
> +haddr = (uint32_t *)((uintptr_t)vaddr & -4);
> +mask = addr & 1 ? 0x00ffffffu : 0xffffff00u;
>
> -uint32_t *haddr = g2h(env_cpu(env), addr - 1);
>  old = *haddr;
>  while (1) {
> -new = (old & ~mask) | (val & mask);
> +new = be32_to_cpu((cpu_to_be32(old) & ~mask) | (val & mask));
>  cmp = qatomic_cmpxchg(haddr, old, new);
>  if (cmp == old) {
>  return;
>  }
>  old = cmp;
>  }
> -#else
> -/* FIXME -- we can do better.  */
> -cpu_loop_exit_atomic(env_cpu(env), ra);
> -#endif
>  }
>
>  static void do_stby_b(CPUHPPAState *env, target_ulong addr, target_ureg val,
> @@ -92,7 +95,7 @@ static void do_stby_b(CPUHPPAState *env, target_ulong addr, 
> target_ureg val,
>  case 1:
>  /* The 3 byte store must appear atomic.  */
>  if (parallel) {
> -atomic_store_3(env, addr, val, 0x00ffffffu, ra);
> +atomic_store_3(env, addr, val, ra);
>  } else {
>  cpu_stb_data_ra(env, addr, val >> 16, ra);
>  cpu_stw_data_ra(env, addr + 1, val, ra);
> @@ -122,7 +125,7 @@ static void do_stby_e(CPUHPPAState *env, target_ulong 
> addr, target_ureg val,
>  case 3:
>  /* The 3 byte store must appear atomic.  */
>  if (parallel) {
> -atomic_store_3(env, addr - 3, val, 0xffffff00u, ra);
> +atomic_store_3(env, addr - 3, val, ra);
>  } else {
>  cpu_stw_data_ra(env, addr - 3, val >> 16, ra);
>  cpu_stb_data_ra(env, addr - 1, val >> 8, ra);
> diff --git a/tests/tcg/hppa/stby.c b/tests/tcg/hppa/stby.c
> new file mode 100644
> index 00..36bd5f723c
> --- /dev/null
> +++ b/tests/tcg/hppa/stby.c
> @@ -0,0 +1,87 @@
> +/* Test STBY */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +
> +struct S {
> +unsigned a;
> +unsigned b;
> +unsigned c;
> +};
> +
> +static void check(const struct S *s, unsigned e,
> +  const char *which, const char *insn, int ofs)
> +{
> +int err = 0;
> +
> +if (s->a != 0) {
> +fprintf(stderr, "%s %s %d: garbage before word 0x%08x\n",
> +which, insn, ofs, s->a);
> +err = 1;
> +}
> +if (s->c != 0) {
> +fprintf(stderr, "%s %s %d: garbage after word 0x%08x\n",
> +which, insn, ofs, s->c);
> +err = 1;
> +}
> +if (s->b != e) {
> +fprintf(stderr, "%s %s %d: 0x%08x != 0x%08x\n",
> +which, insn, ofs, s->b, e);
> +err = 1;
> +}
> +
> +if (err) {
> +exit(1);
> +}
> +}
> +
> +#define TEST(INSN, OFS, E) \
> +do {   \
> +s.b = 0;   \
> +asm volatile(INSN " %1, " #OFS "(%0)"  \
> + : : "r"(&s.b), "r" (0x11223344) : "memory");  \
> +check(&s, E, which, INSN, OFS);\
> +} while (0)
> +
> +static void test(c

[PATCH] gitlab-ci: Always upload artifacts by default

2021-12-29 Thread Philippe Mathieu-Daudé
GitLab defaults [1] to upload artifacts only when the job succeeds,
which is not helpful to troubleshoot failing tests. Switch to
always upload artifacts by default for QEMU jobs, by setting the
'artifacts:when' keyword in the global default section [2].

[1] https://docs.gitlab.com/ee/ci/yaml/index.html#artifactswhen
[2] https://docs.gitlab.com/ee/ci/yaml/index.html#default

Reported-by: Richard Henderson 
Signed-off-by: Philippe Mathieu-Daudé 
---
 .gitlab-ci.d/qemu-project.yml | 4 
 1 file changed, 4 insertions(+)

diff --git a/.gitlab-ci.d/qemu-project.yml b/.gitlab-ci.d/qemu-project.yml
index 871262fe0e8..24137c14dc4 100644
--- a/.gitlab-ci.d/qemu-project.yml
+++ b/.gitlab-ci.d/qemu-project.yml
@@ -1,6 +1,10 @@
 # This file contains the set of jobs run by the QEMU project:
 # https://gitlab.com/qemu-project/qemu/-/pipelines
 
+default:
+  artifacts:
+when: always
+
 include:
   - local: '/.gitlab-ci.d/stages.yml'
   - local: '/.gitlab-ci.d/edk2.yml'
-- 
2.33.1




Re: [PATCH 0/2] dump: add 32-bit guest Windows support

2021-12-29 Thread Philippe Mathieu-Daudé
Hi Viktor,

On 12/29/21 18:52, Viktor Prutyanov wrote:
> Since 32-bit versions of Windows still exist, there is a need to take
> live and crash dumps of such guests along with 64-bit guests. So, add
> an ability for 'dump-guest-memory -w' to take dumps from 32-bit guest.
> When running the command QEMU consumes 32-bit Complete Memory Dump
> header passed by guest driver through vmcoreinfo device as it was
> previously done for 64-bit headers. 32-bit vmcoreinfo guest driver in
> upstream virtio-win can fill such a header.
> 
> Viktor Prutyanov (2):
>   include/qemu: add 32-bit Windows dump structures and definitions
>   dump/win_dump: add 32-bit guest Windows support for dump-guest-memory

Looking at your series, the patches are doing a bit too much.

Suggestion to ease review:
- Rename WinContext -> WinContext64 (+ add union, sketched below) in a preliminary patch
- Rename WinDumpHeader -> WinDumpHeader64 in another patch
- Add *WIN_DUMP_FIELD* macros only using the 64-bit
- Add 32-bit structures
- Add 32-bit code
  (better name 'bool x64', maybe 'is_64bit' or 'guest64'?)
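
For the union, something along these lines would do (illustrative only;
the member names are whatever the series picks):

    typedef union WinContext {
        WinContext32 ctx32;
        WinContext64 ctx64;
    } WinContext;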

Otherwise LGTM.

Regards,

Phil.




Re: [PULL 0/6] Block jobs patches

2021-12-29 Thread Richard Henderson

On 12/29/21 11:21 AM, Vladimir Sementsov-Ogievskiy wrote:

The following changes since commit 89f3bfa3265554d1d591ee4d7f1197b6e3397e84:

   Merge tag 'pull-pa-20211223' of https://gitlab.com/rth7680/qemu into staging 
(2021-12-23 17:53:36 -0800)

are available in the Git repository at:

   https://src.openvz.org/scm/~vsementsov/qemu.git tags/pull-jobs-2021-12-29

for you to fetch changes up to 985cac8f200443ad952becc03b07c51ff4f80983:

   blockjob: drop BlockJob.blk field (2021-12-28 15:18:59 +0100)


Jobs patches:
  - small fix of job_create()
  - refactoring: drop BlockJob.blk field



Emanuele Giuseppe Esposito (1):
   job.c: add missing notifier initialization

Vladimir Sementsov-Ogievskiy (5):
   blockjob: implement and use block_job_get_aio_context
   test-blockjob-txn: don't abuse job->blk
   block/stream: add own blk
   test-bdrv-drain: don't use BlockJob.blk
   blockjob: drop BlockJob.blk field

  include/block/blockjob.h   | 10 +++---
  block/mirror.c |  7 ---
  block/stream.c | 24 +--
  blockdev.c |  6 +++---
  blockjob.c | 36 --
  job.c  |  1 +
  qemu-img.c |  2 +-
  tests/unit/test-bdrv-drain.c   | 12 
  tests/unit/test-blockjob-txn.c | 10 +-
  tests/qemu-iotests/141.out |  2 +-
  tests/qemu-iotests/283 |  3 ++-
  tests/qemu-iotests/283.out |  2 +-
  12 files changed, 60 insertions(+), 55 deletions(-)


Applied, thanks.


r~




Re: check-python-tox failures

2021-12-29 Thread Philippe Mathieu-Daudé
On 12/23/21 05:11, Richard Henderson wrote:
> Hi John,
> 
> This test has been failing for quite a while. While it is allowed to
> fail, can we either fix this or disable it, to allow the ci to go proper
> green?

This job not only fails on the mainstream pipelines, but also on
all forks recently rebased. This is very annoying, I am in favor
of disabling the job until someone figure out what is wrong.

Phil.




Re: [PATCH v11 29/31] linux-user: Implement CPU-specific signal handler for loongarch64 hosts

2021-12-29 Thread gaosong

HI,

On 2021/12/21 at 1:41 PM, WANG Xuerui wrote:

+case 0b001110: /* indexed, atomic, bounds-checking memory operations */
+uint32_t sel = (insn >> 15) & 0b11111111111;
+
+switch (sel) {
+case 0b00000100000: /* stx.b */
+case 0b00000101000: /* stx.h */
+case 0b00000110000: /* stx.w */
+case 0b00000111000: /* stx.d */
+case 0b00001110000: /* fstx.s */
+case 0b00001111000: /* fstx.d */
+case 0b00011101100: /* fstgt.s */
+case 0b00011101101: /* fstgt.d */
+case 0b00011101110: /* fstle.s */
+case 0b00011101111: /* fstle.d */
+case 0b00011110000: /* stgt.b */
+case 0b00011110001: /* stgt.h */
+case 0b00011110010: /* stgt.w */
+case 0b00011110011: /* stgt.d */
+case 0b00011110100: /* stle.b */
+case 0b00011110101: /* stle.h */
+case 0b00011110110: /* stle.w */
+case 0b00011110111: /* stle.d */
+case 0b00011000000 ... 0b00011100011: /* am* insns */
+return true;
+}
+break;
+}


We built qemu-x86_64 on a LoongArch machine, but got an error:

../linux-user/host/loongarch64/host-signal.h:57:9: error: a label can only be 
part of a statement and a declaration is not a statement
 uint32_t sel = (insn >> 15) & 0b11111111111;
 ^~~~

I think we should define 'sel' before:

    /* Detect store by reading the instruction at the program counter.  */
    switch ((insn >> 26) & 0b111111) {

or wrap it in a block:

case 0b001110:
    {
        uint32_t sel = ...;
        ...
    }
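
For illustration only (not from the patch): before C23, a label must be
followed by a statement and a declaration is not one, so some compilers
reject a declaration placed right after a case label, while wrapping it in
a braced block (or hoisting the declaration) is accepted everywhere:

    unsigned classify(unsigned insn)
    {
        switch ((insn >> 26) & 0b111111) {
        case 0b001110:
        {   /* the block makes the declaration after the label legal */
            unsigned sel = (insn >> 15) & 0b11111111111;
            return sel;
        }
        default:
            return 0;
        }
    }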

Thanks
Song Gao



Re: [PATCH v11 29/31] linux-user: Implement CPU-specific signal handler for loongarch64 hosts

2021-12-29 Thread WANG Xuerui

Hi,

On 12/30/21 11:11, gaosong wrote:


HI,

On 2021/12/21 at 1:41 PM, WANG Xuerui wrote:

+case 0b001110: /* indexed, atomic, bounds-checking memory operations */
+uint32_t sel = (insn >> 15) & 0b11111111111;
+
+switch (sel) {
+case 0b00000100000: /* stx.b */
+case 0b00000101000: /* stx.h */
+case 0b00000110000: /* stx.w */
+case 0b00000111000: /* stx.d */
+case 0b00001110000: /* fstx.s */
+case 0b00001111000: /* fstx.d */
+case 0b00011101100: /* fstgt.s */
+case 0b00011101101: /* fstgt.d */
+case 0b00011101110: /* fstle.s */
+case 0b00011101111: /* fstle.d */
+case 0b00011110000: /* stgt.b */
+case 0b00011110001: /* stgt.h */
+case 0b00011110010: /* stgt.w */
+case 0b00011110011: /* stgt.d */
+case 0b00011110100: /* stle.b */
+case 0b00011110101: /* stle.h */
+case 0b00011110110: /* stle.w */
+case 0b00011110111: /* stle.d */
+case 0b00011000000 ... 0b00011100011: /* am* insns */
+return true;
+}
+break;
+}

We built qemu-x86_64 on a LoongArch machine, but got an error:
../linux-user/host/loongarch64/host-signal.h:57:9: error: a label can only be 
part of a statement and a declaration is not a statement
  uint32_t sel = (insn >> 15) & 0b11111111111;
  ^~~~
I think we should define 'sel' before:
     /* Detect store by reading the instruction at the program counter.  */
     switch ((insn >> 26) & 0b111111) {
or wrap it in a block:
case 0b001110:
    {
        uint32_t sel = ...;
        ...
    }
I can't reproduce the error on either of my development machines (amd64 and 
loongarch64), so I wonder if the issue is related to your particular 
setup (i.e. compiler versions and such)?

Re: [PATCH v10 1/3] migration/dirtyrate: implement vCPU dirtyrate calculation periodically

2021-12-29 Thread Hyman Huang




On 2021/12/23 19:12, Peter Xu wrote:

Hi, Yong,

On Tue, Dec 14, 2021 at 07:07:32PM +0800, huang...@chinatelecom.cn wrote:

From: Hyman Huang(黄勇) 

Introduce the third dirty tracking method, GLOBAL_DIRTY_LIMIT,
to calculate the dirty rate periodically for dirty restraint.

Implement a thread that calculates the dirty rate periodically,
which will be used for the dirty page limit.

Add dirtylimit.h to introduce the util functions for the dirty
limit implementation.


Sorry to be late on reading it, my apologies.



Signed-off-by: Hyman Huang(黄勇) 
---
  include/exec/memory.h   |   5 +-
  include/sysemu/dirtylimit.h |  51 ++
  migration/dirtyrate.c   | 160 +---
  migration/dirtyrate.h   |   2 +
  4 files changed, 207 insertions(+), 11 deletions(-)
  create mode 100644 include/sysemu/dirtylimit.h

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 20f1b27..606bec8 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -69,7 +69,10 @@ static inline void fuzz_dma_read_cb(size_t addr,
  /* Dirty tracking enabled because measuring dirty rate */
  #define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)
  
-#define GLOBAL_DIRTY_MASK  (0x3)

+/* Dirty tracking enabled because dirty limit */
+#define GLOBAL_DIRTY_LIMIT  (1U << 2)
+
+#define GLOBAL_DIRTY_MASK  (0x7)
  
  extern unsigned int global_dirty_tracking;
  
diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h

new file mode 100644
index 000..34e48f8
--- /dev/null
+++ b/include/sysemu/dirtylimit.h
@@ -0,0 +1,51 @@
+/*
+ * dirty limit helper functions
+ *
+ * Copyright (c) 2021 CHINA TELECOM CO.,LTD.
+ *
+ * Authors:
+ *  Hyman Huang(黄勇) 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#ifndef QEMU_DIRTYRLIMIT_H
+#define QEMU_DIRTYRLIMIT_H
+
+#define DIRTYLIMIT_CALC_TIME_MS 1000    /* 1000ms */
+
+/**
+ * dirtylimit_calc_current
+ *
+ * get current dirty page rate for specified virtual CPU.
+ */
+int64_t dirtylimit_calc_current(int cpu_index);
+
+/**
+ * dirtylimit_calc_start
+ *
+ * start dirty page rate calculation thread.
+ */
+void dirtylimit_calc_start(void);
+
+/**
+ * dirtylimit_calc_quit
+ *
+ * quit dirty page rate calculation thread.
+ */
+void dirtylimit_calc_quit(void);
+
+/**
+ * dirtylimit_calc_state_init
+ *
+ * initialize dirty page rate calculation state.
+ */
+void dirtylimit_calc_state_init(int max_cpus);
+
+/**
+ * dirtylimit_calc_state_finalize
+ *
+ * finalize dirty page rate calculation state.
+ */
+void dirtylimit_calc_state_finalize(void);
+#endif


Since dirtylimit and dirtyrate looks so alike, not sure it's easier to just
reuse dirtyrate.h; after all you reused dirtyrate.c.
I'm working on this, and I find it's fine to just reuse dirtyrate.h, but 
the original dirtyrate.h didn't export any functions to other QEMU modules, 
so we should add a new file include/sysemu/dirtyrate.h to export the 
dirty page rate calculation util functions. What do you think about this?


I'm doing the code review and I find that it is more reasonable to 
implement the dirty limit in a standalone file such as 
softmmu/dirtylimit.c, since the implementation of dirtylimit in v10 has 
nothing to do with the throttle algorithm in softmmu/cpu-throttle.c. If we 
merged the two implementations into one file, it would be weird. Is this OK?



diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index d65e744..e8d4e4a 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -27,6 +27,7 @@
  #include "qapi/qmp/qdict.h"
  #include "sysemu/kvm.h"
  #include "sysemu/runstate.h"
+#include "sysemu/dirtylimit.h"
  #include "exec/memory.h"
  
  /*

@@ -46,6 +47,155 @@ static struct DirtyRateStat DirtyStat;
  static DirtyRateMeasureMode dirtyrate_mode =
  DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
  
+struct {

+DirtyRatesData data;
+bool quit;
+QemuThread thread;
+} *dirtylimit_calc_state;
+
+static void dirtylimit_global_dirty_log_start(void)
+{
+qemu_mutex_lock_iothread();
+memory_global_dirty_log_start(GLOBAL_DIRTY_LIMIT);
+qemu_mutex_unlock_iothread();
+}
+
+static void dirtylimit_global_dirty_log_stop(void)
+{
+qemu_mutex_lock_iothread();
+memory_global_dirty_log_stop(GLOBAL_DIRTY_LIMIT);
+qemu_mutex_unlock_iothread();
+}


This is merely dirtyrate_global_dirty_log_start/stop but with a different flag.

Let's introduce global_dirty_log_change() with BQL?

   global_dirty_log_change(flag, start)
   {
       qemu_mutex_lock_iothread();
       if (start) {
           memory_global_dirty_log_start(flag);
       } else {
           memory_global_dirty_log_stop(flag);
       }
       qemu_mutex_unlock_iothread();
   }

Then we merge 4 functions into one.

We can also have a BQL-version of global_dirty_log_sync() in the same patch if
you think above helpful.
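
For completeness, the BQL-version of global_dirty_log_sync() could be as
small as this (sketch only, reusing the existing
memory_global_dirty_log_sync() API):

    static void global_dirty_log_sync(void)
    {
        qemu_mutex_lock_iothread();
        memory_global_dirty_log_sync();
        qemu_mutex_unlock_iothread();
    }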


+
+static inline void record_dirtypages(DirtyPageRecord *dirty_pages,
+ CPUState *c

Re: [PATCH v10 1/3] migration/dirtyrate: implement vCPU dirtyrate calculation periodically

2021-12-29 Thread Peter Xu
On Thu, Dec 30, 2021 at 01:01:09PM +0800, Hyman Huang wrote:
> > > +/**
> > > + * dirtylimit_calc_state_finalize
> > > + *
> > > + * finalize dirty page rate calculation state.
> > > + */
> > > +void dirtylimit_calc_state_finalize(void);
> > > +#endif
> > 
> > Since dirtylimit and dirtyrate looks so alike, not sure it's easier to just
> > reuse dirtyrate.h; after all you reused dirtyrate.c.
> I'm working on this, and I find it's fine to just reuse dirtyrate.h, but the
> original dirtyrate.h didn't export any functions to other QEMU modules, so we
> should add a new file include/sysemu/dirtyrate.h to export the dirty page
> rate calculation util functions. What do you think about this?

I see what you meant, yeah if it's exported outside migration/ then looks fine.

> 
> I'm doing the code review and I find that it is more reasonable to implement
> the dirty limit in a standalone file such as softmmu/dirtylimit.c, since
> the implementation of dirtylimit in v10 has nothing to do with the throttle
> algorithm in softmmu/cpu-throttle.c. If we merged the two implementations
> into one file, it would be weird. Is this OK?

Sure.

> > At least it can be changed into something like:
> > 
> >    dirtylimit_calc_func(DirtyRateVcpu *stat)
> >    {
> >        // never race against cpu hotplug/unplug
> >        cpu_list_lock();
> > 
> >        // this should include allocate buffers and record initial values
> >        // for all cores
> >        record = vcpu_dirty_stat_alloc();
> >        // do the sleep
> >        duration = vcpu_dirty_stat_wait(records)
> > 
> >        // the "difference"..
> >        global_dirty_log_sync();
> > 
> >        // collect end timestamps, calculates
> >        vcpu_dirty_stat_collect(stat, records);
> >        vcpu_dirty_stat_free(stat);
> > 
> >        cpu_list_unlock();
> > 
> >        return stat;
> >    }
> > 
> > It may miss something but just to show what I meant..
> I think this work is a refactor and I'll do it in a standalone commit before
> the dirtylimit commits. Is this OK?

I think that's more than welcome if you can split the patch into smaller ones;
they'll be even easier to review.

Thanks,

-- 
Peter Xu




Re: [PATCH 0/3] Fix RVV calling incorrect RFV/RVD check functions bug

2021-12-29 Thread Frank Chang
On Wed, Dec 29, 2021 at 10:43 AM Frank Chang  wrote:

> On Wed, Dec 29, 2021 at 10:13 AM,  wrote:
>
>> From: Frank Chang 
>>
>> For vector widening and narrowing floating-point instructions, we should
>> use require_scale_rvf() instead of require_rvf() to check whether the
>> corresponding RVF/RVD extension is enabled if either the source or the
>> destination floating-point operand is double the width of SEW. Otherwise,
>> an illegal instruction exception should be raised.
>>
>> e.g. For SEW=16, if the source/destination floating-point operand is
>> double the width of SEW, RVF needs to be enabled. Otherwise, an illegal
>> instruction exception will be raised. Similarly, for SEW=32, RVD
>> needs to be enabled.
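
For readers following along, the distinction boils down to letting the
double-width (2*SEW) operand type pick the required extension rather than
SEW itself; roughly (a sketch of the idea only, the real helpers live in
target/riscv/insn_trans/trans_rvv.c.inc):

    static bool require_scale_rvf(DisasContext *s)
    {
        if (s->mstatus_fs == 0) {
            return false;
        }
        switch (s->sew) {
        case MO_16:
            return has_ext(s, RVF);   /* 2*SEW = 32-bit, needs F */
        case MO_32:
            return has_ext(s, RVD);   /* 2*SEW = 64-bit, needs D */
        default:
            return false;
        }
    }
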
>>
>> Frank Chang (3):
>>   target/riscv: rvv-1.0: Call the correct RVF/RVD check funtion for
>> widening fp insns
>>   target/riscv: rvv-1.0: Call the correct RVF/RVD check funtion for
>> widening fp/int type-convert insns
>>   target/riscv: rvv-1.0: Call the correct RVF/RVD check funtion for
>> narrowing fp/int type-convert insns
>
>
> Oops, I misspelled the word "function" in the patch titles.
> I will correct it in my next patchset.
>
> Regards,
> Frank Chang
>

I also missed the 'Signed-off-by' in this patch series.
Will also fix it in my next-version patchset.

Regards,
Frank Chang


>
>
>>
>
>
>>  target/riscv/insn_trans/trans_rvv.c.inc | 78 ++---
>>  1 file changed, 57 insertions(+), 21 deletions(-)
>>
>> --
>> 2.31.1
>>
>>
>>