Soon we will call cpu_has_work without the BQL.
Cc: Mark Cave-Ayland
Cc: Artyom Tarasenko
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
target/sparc/cpu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index
Soon we will call cpu_has_work without the BQL.
Cc: Cornelia Huck
Cc: Alexander Graf
Cc: David Hildenbrand
Cc: qemu-s3...@nongnu.org
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
target/s390x/cpu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a
On Fri, Nov 09, 2018 at 10:30:01 -0500, Cleber Rosa wrote:
> To be fully honest, this may not be an OSX (alone) condition, but may
> be a situation that only happens with OSX on Travis-CI, where resources
> are quite limited.
>
> I have personal experience with tests that exercise parallelism or
> d
On Tue, Nov 06, 2018 at 20:20:22 +0800, guangrong.x...@gmail.com wrote:
> From: Xiao Guangrong
>
> This modules implements the lockless and efficient threaded workqueue.
(snip)
> +++ b/util/threaded-workqueue.c
> +struct Threads {
> +/*
> + * in order to avoid contention, the @requests is
On Mon, Nov 12, 2018 at 22:44:46 +0100, Richard Henderson wrote:
> Based on an idea forwarded by Emilio, which suggests a 5-6%
> speed gain is possible. I have not spent too much time
> measuring this, as the code size gains are significant.
Nice!
> I believe that I posted an x86_64-only patch s
On Wed, Nov 14, 2018 at 11:30:19 +, Alex Bennée wrote:
>
> Emilio G. Cota writes:
>
> > This allows us to queue synchronous CPU work without the BQL.
> >
> > Will gain a user soon.
>
> This is also in the cpu-lock series right?
No, in the cpu-lock seri
On Wed, Nov 14, 2018 at 13:03:19 +, Alex Bennée wrote:
>
> Emilio G. Cota writes:
>
> > This will allow us to trace 16B-long memory accesses.
> >
> > While at it, add some defines for the mem_info bits and simplify
> > trace_mem_get_info by making it a wr
On Wed, Nov 14, 2018 at 14:41:53 +, Alex Bennée wrote:
> Emilio G. Cota writes:
(snip)
> > -static GHashTable *helper_table;
> > +static struct qht helper_table;
> > +static bool helper_table_inited;
>
> Having a flag for initialisation seems a little excessive con
On Wed, Nov 14, 2018 at 16:11:35 +, Alex Bennée wrote:
>
> Emilio G. Cota writes:
(snip)
> I needed to do this:
>
> modified tcg/tcg.c
> @@ -884,7 +884,7 @@ static TCGTemp *tcg_global_reg_new_internal(TCGContext
> *s, TCGType type,
>
> static inline uint32_t
On Wed, Nov 14, 2018 at 17:01:13 +, Alex Bennée wrote:
>
> Emilio G. Cota writes:
>
> > In preparation for adding plugin support. One of the clean-up
> > actions when uninstalling plugins will be to flush the code
> > cache. We'll also have to clear the ru
On Wed, Nov 14, 2018 at 16:43:22 +, Alex Bennée wrote:
>
> Emilio G. Cota writes:
>
> > Signed-off-by: Emilio G. Cota
> > ---
>
> >
> > void cpu_interrupt(CPUState *cpu, int mask);
> > diff --git a/cpus.c b/cpus.c
> > index 3efe89354d..a44
On Wed, Nov 14, 2018 at 12:44:00 +0100, Paolo Bonzini wrote:
> This avoids the following deadlock:
>
> 1) a thread calls run_on_cpu for CPU 2 from a timer, and single_tcg_halt_cond
> is signaled
>
> 2) CPU 1 is running and exits. It finds no work item and enters CPU 2
>
> 3) because the I/O thr
On Thu, Nov 15, 2018 at 12:32:00 +0100, Richard Henderson wrote:
> On 11/14/18 2:00 AM, Emilio G. Cota wrote:
> > The following might be related: I'm seeing segfaults with -smp 8
> > and beyond when doing bootup+shutdown of an aarch64 guest on
> > an x86-64 host
On Fri, Nov 16, 2018 at 00:15:53 +0100, Paolo Bonzini wrote:
> On 14/11/2018 20:42, Emilio G. Cota wrote:
> > On Wed, Nov 14, 2018 at 12:44:00 +0100, Paolo Bonzini wrote:
> >> This avoids the following deadlock:
> >>
> >> 1) a thread calls ru
On Thu, Nov 15, 2018 at 23:04:50 +0100, Richard Henderson wrote:
> On 11/15/18 7:48 PM, Emilio G. Cota wrote:
> > - Segfault in code_gen_buffer. This one I don't have a fix for,
> > but it's *much* easier to reproduce when -tb-size is very small,
> > e.g. &quo
On Thu, Nov 15, 2018 at 20:13:38 -0500, Emilio G. Cota wrote:
> I'll generate now some more perf numbers that we could include in the
> commit logs.
SPEC numbers are a net perf decrease, unfortunately:
Softmmu speedup for SPEC06int (test
On Fri, Nov 16, 2018 at 09:07:50 +0100, Richard Henderson wrote:
> On 11/16/18 6:10 AM, Emilio G. Cota wrote:
> > It's possible that newer machines with larger reorder buffers
> > will be able to take better advantage of the higher instruction
> > locality, hiding the la
On Fri, Nov 16, 2018 at 09:10:32 +0100, Richard Henderson wrote:
> On 11/16/18 2:13 AM, Emilio G. Cota wrote:
> > This allows us to discard most TBs; in the example above,
> > we end up *not* discarding only ~70 TBs, that is we end up keeping
> > only 70/2500 = 2.8% of the
On Fri, Nov 16, 2018 at 14:55:01 +0100, Etienne Dublé wrote:
(snip)
> So the idea is: what if we could share the cache of code already translated
> between all those processes?
> There would be sereral ways to achieve this:
> * use a shared memory area for the cache, and locking mechanisms.
> * hav
On Tue, Nov 20, 2018 at 18:25:25 +0800, Xiao Guangrong wrote:
> On 11/14/18 2:38 AM, Emilio G. Cota wrote:
> > On Tue, Nov 06, 2018 at 20:20:22 +0800, guangrong.x...@gmail.com wrote:
> > > From: Xiao Guangrong
(snip)
> > Batching achieves higher performance at high cor
On Sun, Oct 07, 2018 at 19:37:50 +0200, Philippe Mathieu-Daudé wrote:
> On 10/6/18 11:45 PM, Emilio G. Cota wrote:
> > 2. System boot + shutdown, ubuntu 18.04 x86_64:
>
> You can also run the VM tests to build QEMU:
>
> $ make vm-test
Thanks, will give that a look.
>
On Mon, Oct 08, 2018 at 11:28:38 +0100, Alex Bennée wrote:
> Emilio G. Cota writes:
> > Again, for performance you'd avoid the tracepoint (i.e. calling
> > a helper to call another function) and embed directly the
> > callback from TCG. Same thing applies to TB's.
On Sun, Oct 07, 2018 at 21:48:34 -0400, Emilio G. Cota wrote:
> - 70/40% use rate for growing/shrinking the TLB does not
> seem a great choice, if one wants to avoid a pathological
> case that can induce constant resizing. Imagine we got
> exactly 70% use rate, and all TLB
On Mon, Oct 08, 2018 at 14:57:18 +0100, Alex Bennée wrote:
> Emilio G. Cota writes:
> > The readers that do not hold tlb_lock must use atomic reads when
> > reading .addr_write, since this field can be updated by other threads;
> > the conversion to atomic reads is done in th
On Sun, Oct 07, 2018 at 19:09:01 -0700, Richard Henderson wrote:
> On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> > Currently we evict an entry to the victim TLB when it doesn't match
> > the current address. But it could be that there's no match because
> > the current
v3: https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg01087.html
Changes since v3:
- Add R-b's
- Add comment to copy_tlb_helper_locked to note that it can
only be called from the TLB owner thread.
The series is checkpatch-clean. You can fetch it from:
https://github.com/cota/qemu/tre
ntire TLB.
Tested-by: Alex Bennée
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
include/exec/cpu-defs.h | 3 +
accel/tcg/cputlb.c | 155 ++--
2 files changed, 87 insertions(+), 71 deletions(-)
diff --git a/include/exec/cpu-defs.h b/include
Paves the way for the addition of a per-TLB lock.
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
include/exec/exec-all.h | 8
accel/tcg/cputlb.c | 4
exec.c | 1 +
3 files changed, 13 insertions(+)
diff --git a/include/exec/exec-all.h b/include
Reviewed-by: Richard Henderson
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
accel/tcg/cputlb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 502eea2850..f6b388c961 100644
--- a/accel/tcg/cputlb.c
+++ b/accel
Notes:
- tlb-lock-v2 corresponds to an implementation with a mutex.
- tlb-lock-v3 is the current patch series, i.e. with a spinlock
and a single lock acquisition in tlb_set_page_with_attrs.
Signed-off-by: Emilio G. Cota
---
accel/tcg/softmmu_template.h | 16 ++--
include/exec/cpu_
On Sun, Oct 07, 2018 at 18:05:22 -0700, Richard Henderson wrote:
> Isolate the computation of an index from an address into a
> helper before we change that function.
>
> Signed-off-by: Richard Henderson
> ---
>
> Emilio, this should make your dynamic tlb sizing patch 1/6
> significantly smaller
On Mon, Oct 08, 2018 at 12:46:26 -0700, Richard Henderson wrote:
> On 10/8/18 7:42 AM, Emilio G. Cota wrote:
> > On Sun, Oct 07, 2018 at 19:09:01 -0700, Richard Henderson wrote:
> >> On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> >>> Currently we evict an entry to the vi
This paves the way for implementing a dynamically-sized softmmu.
Signed-off-by: Emilio G. Cota
---
include/exec/cpu-defs.h | 5 +
accel/tcg/cputlb.c | 17 ++---
2 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu
This paves the way for implementing dynamic TLB resizing.
XXX: convert other TCG backends
Signed-off-by: Emilio G. Cota
---
include/exec/cpu-defs.h | 10 ++
include/exec/cpu_ldst.h | 14 +-
accel/tcg/cputlb.c| 18 +++---
tcg/i386/tcg-target.inc.c
v1: https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg01146.html
Changes since v1:
- Add tlb_index and tlb_entry helpers from Richard
- Introduce sizeof_tlb() and tlb_n_entries()
- Extract tlb_mask as its own array in CPUArchState, as
suggested by Richard. For the associated helpers (t
s keep track of the TLB's use rate.
Signed-off-by: Emilio G. Cota
---
include/exec/cpu-all.h | 9 +
accel/tcg/cputlb.c | 2 +-
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 117d2fbbca..e21140049b 100644
--- a/in
From: Richard Henderson
Isolate the computation of an index from an address into a
helper before we change that function.
Signed-off-by: Richard Henderson
[ cota: convert tlb_vaddr_to_host; use atomic_read on addr_write ]
Signed-off-by: Emilio G. Cota
---
accel/tcg/softmmu_template.h
[ASCII gnuplot chart elided: per-benchmark SPEC06int softmmu slowdown, 401.bzip2 through 483.xalancbmk plus geomean]
png: https://imgur.com/a/eXkjMCE
After this series, we bring down the average softmmu overhead
from 2.77x to 1.80x, with a maximum slowdown of 2.48x (omnetpp).
On Tue, Oct 09, 2018 at 13:34:40 +0100, Alex Bennée wrote:
>
> Emilio G. Cota writes:
>
> > v1: https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg01146.html
> >
> > Changes since v1:
>
> Hmm I'm seeing some qtest failures, for exampl
On Tue, Oct 09, 2018 at 15:45:36 +0100, Alex Bennée wrote:
>
> Emilio G. Cota writes:
>
> > On Tue, Oct 09, 2018 at 13:34:40 +0100, Alex Bennée wrote:
> >>
> >> Emilio G. Cota writes:
> >>
> >> > v1: https://lists.gnu.org/archive/html/
On Tue, Oct 09, 2018 at 15:54:21 +0100, Alex Bennée wrote:
> Emilio G. Cota writes:
> > +if (new_size == old_size) {
> > +return;
> > +}
> > +
> > +g_free(env->tlb_table[mmu_idx]);
> > +g_free(env->iotlb[mmu_idx]);
> > +
As far as I can tell tlb_flush does not need to be called
this early. tlb_flush is eventually called after the CPU
has been realized.
This change paves the way to the introduction of tlb_init,
which will be called from cpu_exec_realizefn.
Signed-off-by: Emilio G. Cota
---
target/alpha/cpu.c
Reviewed-by: Richard Henderson
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
accel/tcg/cputlb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 502eea2850..f6b388c961 100644
--- a/accel/tcg/cputlb.c
+++ b/accel
Paves the way for the addition of a per-TLB lock.
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
include/exec/exec-all.h | 8
accel/tcg/cputlb.c | 4
exec.c | 1 +
3 files changed, 13 insertions(+)
diff --git a/include/exec/exec-all.h b/include
v4: https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg01421.html
Changes since v4:
- Add two patches to remove early calls to tlb_flush.
You can fetch the series from:
https://github.com/cota/qemu/tree/tlb-lock-v5
Thanks,
Emilio
As far as I can tell tlb_flush does not need to be called
this early. tlb_flush is eventually called after the CPU
has been realized.
This change paves the way to the introduction of tlb_init,
which will be called from cpu_exec_realizefn.
Cc: Guan Xuetao
Signed-off-by: Emilio G. Cota
ntire TLB.
Tested-by: Alex Bennée
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
include/exec/cpu-defs.h | 3 +
accel/tcg/cputlb.c | 155 ++--
2 files changed, 87 insertions(+), 71 deletions(-)
diff --git a/include/exec/cpu-defs.h b/include
s keep track of the TLB's use rate.
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
include/exec/cpu-all.h | 9 +
accel/tcg/cputlb.c | 2 +-
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/include/exec/cpu-all.h b/include/
Notes:
- tlb-lock-v2 corresponds to an implementation with a mutex.
- tlb-lock-v3 is the current patch series, i.e. with a spinlock
and a single lock acquisition in tlb_set_page_with_attrs.
Signed-off-by: Emilio G. Cota
---
accel/tcg/softmmu_template.h | 16 ++--
include/exec/cpu_
From: Richard Henderson
Isolate the computation of an index from an address into a
helper before we change that function.
Reviewed-by: Alex Bennée
Signed-off-by: Richard Henderson
[ cota: convert tlb_vaddr_to_host; use atomic_read on addr_write ]
Signed-off-by: Emilio G. Cota
---
accel/tcg
After this series, we bring down the average softmmu overhead
from 2.77x to 1.80x, with a maximum slowdown of 2.48x (omnetpp).
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
include/exec/cpu-defs.h | 39 +
accel/tcg/cputlb.c| 41
v2: https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg01495.html
Changes since v2:
- Add R-b's
- Apply on top of tlb-lock-v5 series, fixing the alpha
boot segfault due to the early tlb_flush
+ The series now passes `make check-qtest'
- Alloc the iotlb with g_new instead of g_new0
-
This paves the way for implementing a dynamically-sized softmmu.
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
include/exec/cpu-defs.h | 5 +
accel/tcg/cputlb.c | 17 ++---
2 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/include/exec/cpu
This paves the way for implementing dynamic TLB resizing.
XXX: convert other TCG backends
Signed-off-by: Emilio G. Cota
---
include/exec/cpu-defs.h | 10 ++
include/exec/cpu_ldst.h | 14 +-
accel/tcg/cputlb.c| 18 +++---
tcg/i386/tcg-target.inc.c
On Tue, Oct 09, 2018 at 18:55:30 +0100, Peter Maydell wrote:
> On 9 October 2018 at 18:45, Emilio G. Cota wrote:
(snip)
> > @@ -201,7 +201,6 @@ static void alpha_cpu_initfn(Object *obj)
> > CPUAlphaState *env = &cpu->env;
> >
> > cs->
On Wed, Oct 03, 2018 at 14:39:22 -0500, Richard Henderson wrote:
(snip)
> Richard Henderson (9):
> tcg: Split CONFIG_ATOMIC128
> target/i386: Convert to HAVE_CMPXCHG128
> target/arm: Convert to HAVE_CMPXCHG128
> target/arm: Check HAVE_CMPXCHG128 at translate time
> target/ppc: Convert to
** Changed in: qemu
Status: Confirmed => Fix Committed
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1793119
Title:
Wrong floating-point emulation on AArch64 with FPCR set to zero
Status in
The first patch we've seen before -- I'm taking it from the
atomic interrupt_request series.
The other three patches are related to TCG profiling. One
of them is a build fix that I suspect has gone unnoticed
due to its dependence on CONFIG_PROFILER.
The series is checkpatch-clean. You can fetch i
We forgot to initialize n in commit 15fa08f845 ("tcg: Dynamically
allocate TCGOps", 2017-12-29).
Signed-off-by: Emilio G. Cota
---
tcg/tcg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index f27b22bd3c..8f26916b99 100644
--- a/tcg/tcg.c
called "cpu_exec_time", which is more
descriptive than "tcg_time". Add a function to query this value
directly, and for completeness, fill in the field in
tcg_profile_snapshot, even though its callers do not use it.
Signed-off-by: Emilio G. Cota
---
include/qemu/timer
Consistently access u16.high with atomics to avoid
undefined behaviour in MTTCG.
Note that icount_decr.u16.low is only used in icount mode,
so regular accesses to it are OK.
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
accel/tcg/tcg-all.c | 2 +-
accel/tcg/translate
This plugs two 4-byte holes in 64-bit.
Signed-off-by: Emilio G. Cota
---
tcg/tcg.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tcg/tcg.h b/tcg/tcg.h
index f9f12378e9..d80ef2a883 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -633,8 +633,8 @@ typedef struct TCGProfile
Disable for all TCG backends for now.
Signed-off-by: Emilio G. Cota
---
include/exec/cpu-defs.h | 43 +++-
include/exec/cpu_ldst.h | 21 ++
tcg/aarch64/tcg-target.h | 1 +
tcg/arm/tcg-target.h | 1 +
tcg/i386/tcg-target.h| 1 +
tcg/mips/tcg-target.h| 1 +
tcg
s keep track of the TLB's use rate, which
we'll use to implement a policy for dynamic TLB sizing.
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
include/exec/cpu-all.h | 9 +
accel/tcg/cputlb.c | 2 +-
2 files changed, 10 insertion
[ASCII gnuplot chart elided: per-benchmark SPEC06int softmmu slowdown, 401.bzip2 through 483.xalancbmk plus geomean]
png: https://imgur.com/a/eXkjMCE
After this series, we bring down the average softmmu overhead
from 2.77x to 1.80x, with
RFC v3: https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg01753.html
Changes since RFC v3:
- This is now a proper patch series, since it should not (knowingly)
break anything.
- Rebase on top of rth's tcg-next (ffd8994b90f5), which includes
patch 1 from RFC v3.
- Make the feature opt
Reviewed-by: Bastian Koppelmann
Signed-off-by: Emilio G. Cota
---
target/tricore/fpu_helper.c | 9 ++---
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/target/tricore/fpu_helper.c b/target/tricore/fpu_helper.c
index df162902d6..31df462e4a 100644
--- a/target/tricore
Reviewed-by: Bastian Koppelmann
Tested-by: Bastian Koppelmann
Signed-off-by: Emilio G. Cota
---
fpu/softfloat.c | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 46ae206172..0cbb08be32 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -336,8 +33
F v
[...]
- After:
In 6133248 tests, no errors found in f64_mulAdd, rounding near_even, tininess
before rounding.
[...]
Signed-off-by: Emilio G. Cota
---
tests/fp/Makefile | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tests/fp/Makefile b/tests/fp/Makefile
index d649a5a1db..49cdcd1bd2 100644
--
These will gain some users very soon.
Signed-off-by: Emilio G. Cota
---
include/fpu/softfloat.h | 10 ++
1 file changed, 10 insertions(+)
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 9eeccd88a5..38a5e99cf3 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu
This paves the way for upcoming work.
Reviewed-by: Bastian Koppelmann
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
include/fpu/softfloat.h | 20
1 file changed, 20 insertions(+)
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 8fd9f9bbae
v4: https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg02960.html
Changes since v4:
- Rebase on current master (a73549f99).
- Add a patch for fp-test to pick a specialization; this gets rid of
the muladd errors, since our default "no specialization" does not
raise invalid when one of t
G. Cota
---
fpu/softfloat.c | 88 +++--
1 file changed, 86 insertions(+), 2 deletions(-)
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 78837fa9d8..8ef0571c6e 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1678,7 +1678,8 @@ float16
:
mul-single: 73.41 MFlops
mul-double: 76.93 MFlops
3. IBM POWER8E @ 2.1 GHz
- before:
mul-single: 58.40 MFlops
mul-double: 59.33 MFlops
- after:
mul-single: 60.25 MFlops
mul-double: 94.79 MFlops
Signed-off-by: Emilio G. Cota
---
fpu/softfloat.c | 66
ag to disable hardfloat.
In the long run though it would be good to fix the targets so that
at least the inexact flag passed to softfloat is indeed sticky.
Signed-off-by: Emilio G. Cota
---
fpu/softfloat.c | 341
1 file changed, 341 insertions(+)
[ASCII gnuplot chart elided: before vs. after performance comparison]
machine, having 2F64 set to 1 pays off, but it
doesn't for 2F32:
- Intel i7-6700K:
add-single: [1] 285.79 vs [0] 426.70 MFlops
add-double: [1] 302.15 vs [0] 278.82 MFlops
Signed-off-by: Emilio G. Cota
---
fpu/softfloat.c | 106
1 file change
23% slower for single precision,
with it enabled, and 17% slower for double precision.
Signed-off-by: Emilio G. Cota
---
fpu/softfloat.c | 73 +++--
1 file changed, 71 insertions(+), 2 deletions(-)
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index
r-mode).
Signed-off-by: Emilio G. Cota
---
tests/fp/fp-bench.c | 630
tests/fp/.gitignore | 1 +
tests/fp/Makefile | 5 +-
3 files changed, 635 insertions(+), 1 deletion(-)
create mode 100644 tests/fp/fp-bench.c
diff --git a/tests/fp/fp-b
:
fma-single: 66.14 MFlops
fma-double: 63.10 MFlops
3. IBM POWER8E @ 2.1 GHz
- before:
fma-single: 37.26 MFlops
fma-double: 37.29 MFlops
- after:
fma-single: 48.90 MFlops
fma-double: 59.51 MFlops
Here having 3FP64 set to 1 pays off for x86_64:
[1] 170.15 vs [0] 153.12 MFlops
Signed-off-by: Emilio G
On Tue, Oct 16, 2018 at 08:03:03 +0200, Paolo Bonzini wrote:
> On 16/10/2018 04:52, Richard Henderson wrote:
> > On 10/5/18 2:14 PM, Emilio G. Cota wrote:
> >> -target_ulong tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
> >> +target_ulong tlb_addr =
https://imgur.com/a/BHzpPTW
Notes:
- tlb-lock-v2 corresponds to an implementation with a mutex.
- tlb-lock-v3 corresponds to the current implementation, i.e.
a spinlock and a single lock acquisition in tlb_set_page_with_attrs.
Signed-off-by: Emilio G. Cota
---
accel/tcg/softmmu_template.h | 12 +
On Tue, Oct 16, 2018 at 19:10:03 +0800, guangrong.x...@gmail.com wrote:
(snip)
> diff --git a/include/qemu/ptr_ring.h b/include/qemu/ptr_ring.h
> new file mode 100644
> index 00..d8266d45f6
> --- /dev/null
> +++ b/include/qemu/ptr_ring.h
> @@ -0,0 +1,235 @@
(snip)
> +#define SMP_CACHE_BYTES
On Thu, Oct 11, 2018 at 20:25:08 +0100, Dr. David Alan Gilbert (git) wrote:
> From: Thomas Huth
>
> We can re-use the s390-ccw bios code to implement a small firmware
> for a s390x guest which prints out the "A" and "B" characters and
> modifies the memory, as required for the migration test.
>
On Thu, Oct 18, 2018 at 14:38:01 +0200, Thomas Huth wrote:
> On 2018-10-17 21:28, Emilio G. Cota wrote:
> > Can anyone reproduce this? Otherwise, let me know what other info
> > I could provide.
>
> I've finally been able to reproduce it - seems like it only happens her
Cc: Aurelien Jarno
Signed-off-by: Emilio G. Cota
---
target/sh4/op_helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
index 4f825bae5a..57cc363ccc 100644
--- a/target/sh4/op_helper.c
+++ b/target/sh4/op_helper.c
@@ -105,7
The few direct users of &cpu->lock will be converted soon.
Cc: Peter Crosthwaite
Cc: Richard Henderson
Signed-off-by: Emilio G. Cota
---
include/qom/cpu.h | 26
cpus.c | 48 +++--
stubs/cpu-lock.c
It will gain a user once we protect more of CPUState under cpu->lock.
This completes the conversion to cpu_mutex_lock/unlock in the file.
Signed-off-by: Emilio G. Cota
---
include/qom/cpu.h | 9 +
cpus-common.c | 17 +++--
2 files changed, 20 insertions(+), 6 deleti
We don't pass a pointer to qemu_global_mutex anymore.
Cc: Peter Crosthwaite
Cc: Richard Henderson
Signed-off-by: Emilio G. Cota
---
include/qom/cpu.h | 10 --
cpus-common.c | 2 +-
cpus.c| 5 -
3 files changed, 1 insertion(+), 16 deletions(-)
diff --
cpu->halted will soon be protected by cpu->lock.
We will use these helpers to ease the transition,
since right now cpu->halted has many direct callers.
Signed-off-by: Emilio G. Cota
---
include/qom/cpu.h | 24
1 file changed, 24 insertions(+)
diff --git a/in
Cc: Stafford Horne
Signed-off-by: Emilio G. Cota
---
target/openrisc/sys_helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/openrisc/sys_helper.c b/target/openrisc/sys_helper.c
index b66a45c1e0..ab4d8fb520 100644
--- a/target/openrisc/sys_helper.c
+++ b/target
Cc: Michael Walle
Signed-off-by: Emilio G. Cota
---
target/lm32/op_helper.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/lm32/op_helper.c b/target/lm32/op_helper.c
index 234d55e056..392634441b 100644
--- a/target/lm32/op_helper.c
+++ b/target/lm32/op_helper.c
This lock will soon protect more fields of the struct. Give
it a more appropriate name.
Cc: Peter Crosthwaite
Cc: Richard Henderson
Signed-off-by: Emilio G. Cota
---
include/qom/cpu.h | 5 +++--
cpus-common.c | 14 +++---
cpus.c| 4 ++--
qom/cpu.c | 2 +-
4
Cc: Michael Clark
Cc: Palmer Dabbelt
Cc: Sagar Karandikar
Cc: Bastian Koppelmann
Cc: Alistair Francis
Signed-off-by: Emilio G. Cota
---
target/riscv/op_helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
index
This eliminates the need to use the BQL to queue CPU work.
While at it, give the per-cpu field a generic name ("cond") since
it will soon be used for more than just queueing CPU work.
Cc: Peter Crosthwaite
Cc: Richard Henderson
Signed-off-by: Emilio G. Cota
---
include/qom/
Instead of open-coding it.
While at it, make sure that all accesses to the list are
performed while holding the list's lock.
Cc: Peter Crosthwaite
Cc: Richard Henderson
Signed-off-by: Emilio G. Cota
---
include/qom/cpu.h | 6 +++---
cpus-common.c | 25 -
c
Cc: Andrzej Zaborowski
Cc: Peter Maydell
Cc: qemu-...@nongnu.org
Signed-off-by: Emilio G. Cota
---
hw/arm/omap1.c| 4 ++--
hw/arm/pxa2xx_gpio.c | 2 +-
hw/arm/pxa2xx_pic.c | 2 +-
target/arm/arm-powerctl.c | 4 ++--
target/arm/cpu.c | 2 +-
target/arm
Cc: Aurelien Jarno
Cc: Aleksandar Markovic
Cc: James Hogan
Signed-off-by: Emilio G. Cota
---
hw/mips/cps.c | 2 +-
hw/misc/mips_itu.c | 4 ++--
target/mips/kvm.c | 2 +-
target/mips/op_helper.c | 8
target/mips/translate.c | 4 ++--
5 files changed, 10 insertions
Cc: Max Filippov
Signed-off-by: Emilio G. Cota
---
target/xtensa/cpu.c | 2 +-
target/xtensa/helper.c| 2 +-
target/xtensa/op_helper.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
index a54dbe4260..d4ca35e6cc 100644
In ppce500_spin.c, acquire the lock just once to update
both cpu->halted and cpu->stopped.
Cc: David Gibson
Cc: Alexander Graf
Cc: qemu-...@nongnu.org
Signed-off-by: Emilio G. Cota
---
target/ppc/helper_regs.h| 2 +-
hw/ppc/e500.c | 4 ++--
hw/ppc