Re: [Qemu-devel] [RFC 26/38] cpu: protect tb_jmp_cache with seqlock

2015-08-25 Thread Emilio G. Cota
On Sun, Aug 23, 2015 at 18:14:58 -0700, Paolo Bonzini wrote: > On 23/08/2015 17:23, Emilio G. Cota wrote: > > This paves the way for a lockless tb_find_fast. > > > > Signed-off-by: Emilio G. Cota > > --- (snip) > > @@ -1707,12 +1735,14 @@ void tb_flush_jmp_

Re: [Qemu-devel] [RFC 31/38] cpu: protect l1_map with tb_lock in full-system mode

2015-08-25 Thread Emilio G. Cota
On Sun, Aug 23, 2015 at 18:07:04 -0700, Paolo Bonzini wrote: > On 23/08/2015 17:24, Emilio G. Cota wrote: > > Note that user-only uses mmap_lock for this. > > > > Signed-off-by: Emilio G. Cota > > Why is this needed? The RCU-like page_find should work just fine.

Re: [Qemu-devel] [RFC 33/38] cpu: introduce cpu_tcg_sched_work to run work while other CPUs sleep

2015-08-25 Thread Emilio G. Cota
On Sun, Aug 23, 2015 at 18:24:55 -0700, Paolo Bonzini wrote: > On 23/08/2015 17:24, Emilio G. Cota wrote: > > This is similar in intent to the async_safe_work mechanism. The main > > differences are: > > > > - Work is run on a single CPU thread *after*

Re: [Qemu-devel] [RFC 35/38] cputlb: use cpu_tcg_sched_work for tlb_flush_all

2015-08-25 Thread Emilio G. Cota
On Sun, Aug 23, 2015 at 18:29:33 -0700, Paolo Bonzini wrote: > > > On 23/08/2015 17:24, Emilio G. Cota wrote: > > Signed-off-by: Emilio G. Cota > > --- > > cputlb.c | 41 +++-- > > 1 file changed, 11 insertions(+), 30 deletio

Re: [Qemu-devel] [RFC 00/38] MTTCG: i386, user+system mode

2015-08-25 Thread Emilio G. Cota
On Sun, Aug 23, 2015 at 19:01:28 -0700, Paolo Bonzini wrote: > > * tb_flush: do it once all other CPUs have been put to sleep by calling > > rcu_synchronize(). > > We also instrument tb_lock to make sure that only one tb_flush request > > can > > happen at a given time. > > What do

Re: [Qemu-devel] MTTCG next version?

2015-08-26 Thread Emilio G. Cota
On Wed, Aug 26, 2015 at 14:18:24 +0200, Frederic Konrad wrote: > Do that make sense? A few decisions here don't make that much sense to me, but maybe I'm missing context: > I'm trying to do the next version of the MTTCG work: > > I would like to rebase on Alvise atomic instruction branch: > -

Re: [Qemu-devel] [PATCH 6/9] tcg: synchronize cpu->exit_request and cpu->tcg_exit_req accesses

2015-08-27 Thread Emilio G. Cota
On Wed, Aug 26, 2015 at 02:17:42 +0200, Paolo Bonzini wrote: > Signed-off-by: Paolo Bonzini > --- > cpu-exec.c | 6 +- > qom/cpu.c | 2 ++ > 2 files changed, 7 insertions(+), 1 deletion(-) I like this patch. Are we making sure that other writes to tcg_exit_req are preceded by a write barri

Re: [Qemu-devel] [RFC 35/38] cputlb: use cpu_tcg_sched_work for tlb_flush_all

2015-09-01 Thread Emilio G. Cota
On Tue, Sep 01, 2015 at 17:10:30 +0100, Alex Bennée wrote: > > Emilio G. Cota writes: > > > Signed-off-by: Emilio G. Cota > > --- > > cputlb.c | 41 +++-- > > 1 file changed, 11 insertions(+), 30 deletions(-) > > I

[Qemu-devel] linux-user crashes on clone(2) when run on ppc host

2015-06-16 Thread Emilio G. Cota
Hi, I'm having trouble running a simple multithreaded program on a PowerPC host machine. The machine I'm using is a ppc VM--I think it's running under KVM (I'm using OVH's RunAbove Power8 service): admin@adsf:~/qemu$ uname -a Linux adsf 3.13.0-37-generic #64-Ubuntu SMP Mon Sep 22 21:27:09 UT

Re: [Qemu-devel] linux-user crashes on clone(2) when run on ppc host

2015-06-17 Thread Emilio G. Cota
On Wed, Jun 17, 2015 at 09:58:27 +0100, Peter Maydell wrote: > On 17 June 2015 at 01:52, Emilio G. Cota wrote: > > I'm having trouble running a simple multithreaded program on a PowerPC host > > machine. > > > > The machine I'm using is a ppc VM--I th

Re: [Qemu-devel] linux-user crashes on clone(2) when run on ppc host

2015-06-18 Thread Emilio G. Cota
_phys_page_range->tb_gen_code. I'm not using gdb so I guess I cannot trigger this. Am I missing something? > On 17 June 2015 at 22:36, Emilio G. Cota wrote: > > I don't think this is a race because it also breaks when > > run on a single core (with taskset

[Qemu-devel] [Bug 1098729] Re: qemu-user-static for armhf: segfault in threaded code

2015-06-18 Thread Emilio G. Cota
I cannot make dotprod_mutex.c crash with the current master (git 8ffe756d). I've tried both linux-arm and linux-arm-static, the latter running under chroot. I've tried on three different machines, and have tested with different thread counts: 4, 10, 16, 64 (one of the machines has 64 cores). I

Re: [Qemu-devel] linux-user crashes on clone(2) when run on ppc host

2015-06-18 Thread Emilio G. Cota
lity that > thread A calls tb_gen_code, which calls tb_alloc, which > calls tb_flush, which clears the whole code cache, and then > tb_gen_code starts generating code over the top of a TB > that thread B was in the middle of executing from... Agreed, this needs to be fixed. Certainly no

[Qemu-devel] [PATCH 01/10] translate-all: add missing fold of tb_ctx into tcg_ctx

2016-04-04 Thread Emilio G. Cota
Since 5e5f07e08 "TCG: Move translation block variables to new context inside tcg_ctx: tb_ctx" on Feb 1 2013, compilation of usermode + TB_DEBUG_CHECK has been broken. Fix it. Signed-off-by: Emilio G. Cota --- translate-all.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) di

[Qemu-devel] [PATCH 10/10] tb hash: track translated blocks with qht

2016-04-04 Thread Emilio G. Cota
e.g. most bootup code is thrown away after boot); it makes sense to grow the hash table as more code blocks are translated. This also avoids the complication of having to build downsizing hysteresis logic into qht. Signed-off-by: Emilio G. Cota --- cpu-exe

[Qemu-devel] [PATCH 02/10] compiler.h: add QEMU_CACHELINE + QEMU_ALIGN() + QEMU_CACHELINE_ALIGNED

2016-04-04 Thread Emilio G. Cota
I'm assuming windows compilers don't support this attribute. Signed-off-by: Emilio G. Cota --- include/qemu/compiler.h | 10 ++ 1 file changed, 10 insertions(+) diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h index 8f1cc7b..fb946f1 100644 --- a/include/qemu/

[Qemu-devel] [PATCH 09/10] qht: add test program

2016-04-04 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 6 +++- tests/test-qht.c | 100 +++ 3 files changed, 106 insertions(+), 1 deletion(-) create mode 100644 tests/test-qht.c diff --git a/tests/.gitignore b/tests

[Qemu-devel] [PATCH 04/10] seqlock: rename write_lock/unlock to write_begin/end

2016-04-04 Thread Emilio G. Cota
It is a more appropriate name, now that the mutex embedded in the seqlock is gone. Signed-off-by: Emilio G. Cota --- cpus.c | 28 ++-- include/qemu/seqlock.h | 4 ++-- 2 files changed, 16 insertions(+), 16 deletions(-) diff --git a/cpus.c b/cpus.c index
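[Editor's note] For readers unfamiliar with the pattern, the following is a minimal sketch of a seqlock whose writer side is bracketed by begin/end calls and whose writers are serialized by a lock owned by the caller. It is written with C11 atomics under assumed names (`SketchSeqLock`, `sketch_seqlock_*`, `SketchPair`); it is not the code from include/qemu/seqlock.h, and it uses sequentially consistent operations throughout for simplicity rather than the barriers a tuned implementation would pick.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical names; this is an illustration, not QEMU's seqlock.h. */
typedef struct {
    atomic_uint sequence;   /* odd while a write is in progress */
} SketchSeqLock;

/* Example payload; the fields are atomic so racing reads stay well defined. */
typedef struct {
    SketchSeqLock sl;
    atomic_long lo;
    atomic_long hi;
} SketchPair;

static void sketch_seqlock_write_begin(SketchSeqLock *sl)
{
    unsigned prev = atomic_fetch_add(&sl->sequence, 1);
    assert((prev & 1u) == 0);   /* no other write may be in flight */
    (void)prev;
}

static void sketch_seqlock_write_end(SketchSeqLock *sl)
{
    unsigned prev = atomic_fetch_add(&sl->sequence, 1);
    assert((prev & 1u) == 1);   /* must pair with a write_begin */
    (void)prev;
}

static unsigned sketch_seqlock_read_begin(SketchSeqLock *sl)
{
    /* Clearing bit 0 guarantees the retry check fails if a write was active. */
    return atomic_load(&sl->sequence) & ~1u;
}

static int sketch_seqlock_read_retry(SketchSeqLock *sl, unsigned start)
{
    return atomic_load(&sl->sequence) != start;
}

static void sketch_pair_set(SketchPair *p, long lo, long hi)
{
    /* The caller is assumed to hold its own writer-side lock here
     * (for example a mutex); the seqlock itself no longer embeds one. */
    sketch_seqlock_write_begin(&p->sl);
    atomic_store(&p->lo, lo);
    atomic_store(&p->hi, hi);
    sketch_seqlock_write_end(&p->sl);
}

static void sketch_pair_get(SketchPair *p, long *lo, long *hi)
{
    unsigned start;

    do {
        start = sketch_seqlock_read_begin(&p->sl);
        *lo = atomic_load(&p->lo);
        *hi = atomic_load(&p->hi);
    } while (sketch_seqlock_read_retry(&p->sl, start));
}
```

Readers never block writers in this scheme; they only re-read when the sequence number changed underneath them, which is why "begin/end" describes the writer calls better than "lock/unlock" once no mutex is involved.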

[Qemu-devel] [PATCH 07/10] tb hash: hash phys_pc, pc, and flags with xxhash

2016-04-04 Thread Emilio G. Cota
or other workloads that do not translate as many blocks (600K+ for debian-jessie arm bootup). This is dealt with later in this series. Signed-off-by: Emilio G. Cota --- cpu-exec.c | 2 +- include/exec/tb-hash.h | 19 +-- translate-all.c | 6 +++--- 3 file

[Qemu-devel] [PATCH 03/10] seqlock: remove optional mutex

2016-04-04 Thread Emilio G. Cota
This option is unused; besides, it bloats the struct when not needed. Let's just let writers define their own locks elsewhere. Signed-off-by: Emilio G. Cota --- cpus.c | 2 +- include/qemu/seqlock.h | 10 +- 2 files changed, 2 insertions(+), 10 deletions(-) diff

[Qemu-devel] [PATCH 05/10] include: add spinlock wrapper

2016-04-04 Thread Emilio G. Cota
Wrap pthread_spin on POSIX, or QemuMutex on Windows. AFAIK there is no off-the-shelf spinlock implementation for Windows, so we'll just use QemuMutex. Signed-off-by: Emilio G. Cota --- include/qemu/spinlock-posix.h | 60 +++ include/qemu/spi

[Qemu-devel] [PATCH 06/10] include: add xxhash.h

2016-04-04 Thread Emilio G. Cota
as a 64-bit implementation, can be found at: https://github.com/Cyan4973/xxHash Signed-off-by: Emilio G. Cota --- include/qemu/xxhash.h | 106 ++ 1 file changed, 106 insertions(+) create mode 100644 include/qemu/xxhash.h diff --git a/include

[Qemu-devel] [PATCH 08/10] qht: QEMU's fast, resizable and scalable Hash Table

2016-04-04 Thread Emilio G. Cota
we can already benefit from the single-threaded speedup that qht also provides. Signed-off-by: Emilio G. Cota --- include/qemu/qht.h | 150 ++ util/Makefile.objs | 2 +- util/qht.c | 458 + 3 files changed, 609

[Qemu-devel] [PATCH 00/10] tb hash improvements

2016-04-04 Thread Emilio G. Cota
This patchset is derived from my ongoing work on MTTCG, but does not depend on it and brings improvements that we can already benefit from. It applies cleanly on the current master and is checkpatch-clean. The key goal is to make the TB hash table faster, and while at it, scalable. Tested on two d

Re: [Qemu-devel] [PATCH 02/10] compiler.h: add QEMU_CACHELINE + QEMU_ALIGN() + QEMU_CACHELINE_ALIGNED

2016-04-05 Thread Emilio G. Cota
On Tue, Apr 05, 2016 at 08:57:45 +0100, Peter Maydell wrote: > On 5 April 2016 at 06:30, Emilio G. Cota wrote: > > +#define QEMU_CACHELINE (64) > > Why 64? Does anything bad happen if the host's cache line > size turns out to be greater than the value here ? Defining

Re: [Qemu-devel] [PATCH 02/10] compiler.h: add QEMU_CACHELINE + QEMU_ALIGN() + QEMU_CACHELINE_ALIGNED

2016-04-05 Thread Emilio G. Cota
On Tue, Apr 05, 2016 at 19:01:07 +0100, Peter Maydell wrote: > On 5 April 2016 at 18:24, Emilio G. Cota wrote: > > So how about this: > > we add these defaults, and also add an optional --configure > > parameter to override said defaults. > > I think this definitel

Re: [Qemu-devel] [PATCH 07/10] tb hash: hash phys_pc, pc, and flags with xxhash

2016-04-05 Thread Emilio G. Cota
On Tue, Apr 05, 2016 at 09:07:57 -0700, Richard Henderson wrote: > On 04/05/2016 08:48 AM, Paolo Bonzini wrote: > >I think it's fine to use the struct. The exact size of the struct > >varies from 3 to 5 32-bit words, so it's hard to write nice > >size-dependent code for the hash. > > I don't thin

Re: [Qemu-devel] [PATCH 07/10] tb hash: hash phys_pc, pc, and flags with xxhash

2016-04-05 Thread Emilio G. Cota
look? Thanks, E. commit af92a0690f49172621cd8b80759e3ca567d43567 Author: Emilio G. Cota Date: Tue Apr 5 18:06:21 2016 -0400 rth Signed-off-by: Emilio G. Cota diff --git a/include/exec/tb-hash.h b/include/exec/tb-hash.h index 6b97a7c..349a856 100644 --- a/i

Re: [Qemu-devel] [PATCH 07/10] tb hash: hash phys_pc, pc, and flags with xxhash

2016-04-06 Thread Emilio G. Cota
boots OK for me with the patch below. I'm booting it as per the instructions in http://www.bennee.com/~alex/blog/2014/05/09/running-linux-in-qemus-aarch64-system-emulation-mode/ Thanks, Emilio commit e70474788fa37a85df21e1c63101a879103758f5 Author: Emilio G. Cota Date:

Re: [Qemu-devel] [PATCH 07/10] tb hash: hash phys_pc, pc, and flags with xxhash

2016-04-06 Thread Emilio G. Cota
On Wed, Apr 06, 2016 at 13:52:21 +0200, Paolo Bonzini wrote: > > > On 06/04/2016 02:52, Emilio G. Cota wrote: > > +static inline uint32_t tb_hash_func5(uint64_t a0, uint64_t b0, uint32_t e, > > int seed) > > I would keep just this version and unconditionally zer

Re: [Qemu-devel] [PATCH 0/2] Add gpio_key and use it for ARM virt power button

2016-04-06 Thread Emilio G. Cota
On Wed, Mar 23, 2016 at 16:12:46 +0000, Peter Maydell wrote: > On 17 March 2016 at 13:25, Shannon Zhao wrote: > > From: Shannon Zhao > > > > There is a problem for power button that it will not work if an early > > system_powerdown request happens before guest gpio driver loads. > > > > Here we a

Re: [Qemu-devel] [PATCH 06/10] include: add xxhash.h

2016-04-06 Thread Emilio G. Cota
On Wed, Apr 06, 2016 at 12:39:42 +0100, Alex Bennée wrote: > > Emilio G. Cota writes: > > > xxhash is a fast, high-quality hashing function. The appended > > brings in the 32-bit version of it, with the small modification that > > it assumes the data to be hashed is

Re: [Qemu-devel] [PATCH 07/10] tb hash: hash phys_pc, pc, and flags with xxhash

2016-04-06 Thread Emilio G. Cota
On Wed, Apr 06, 2016 at 20:23:42 +0200, Paolo Bonzini wrote: > On 06/04/2016 19:44, Emilio G. Cota wrote: > > I like this idea, because the ugliness of the sizeof checks is significant. > > However, the quality of the resulting hash is not as good when always using > > fu

[Qemu-devel] [PATCH v2 05/13] include/processor.h: define cpu_relax()

2016-04-07 Thread Emilio G. Cota
Taken from the linux kernel. Signed-off-by: Emilio G. Cota --- include/qemu/processor.h | 28 1 file changed, 28 insertions(+) create mode 100644 include/qemu/processor.h diff --git a/include/qemu/processor.h b/include/qemu/processor.h new file mode 100644 index

[Qemu-devel] [PATCH] tb: consistently use uint32_t for tb->flags

2016-04-07 Thread Emilio G. Cota
-by: Laurent Desnogues Suggested-by: Richard Henderson Signed-off-by: Emilio G. Cota --- cpu-exec.c | 6 +++--- exec.c | 2 +- hw/i386/kvmvapic.c | 2 +- include/exec/exec-all.h | 5 +++-- target-alpha/cpu.h | 2 +- target-arm/cpu.h | 2 +- tar

[Qemu-devel] [PATCH v2 12/13] qht: add test program

2016-04-07 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 6 ++- tests/test-qht.c | 109 +++ 3 files changed, 115 insertions(+), 1 deletion(-) create mode 100644 tests/test-qht.c diff --git a/tests/.gitignore b/tests

[Qemu-devel] [PATCH v2 09/13] exec: add tb_hash_func5, derived from xxhash

2016-04-07 Thread Emilio G. Cota
This will be used by upcoming changes for hashing the tb hash. Add this into a separate file to include the copyright notice from xxhash. Signed-off-by: Emilio G. Cota --- include/exec/tb-hash-xx.h | 103 ++ 1 file changed, 103 insertions(+) create

[Qemu-devel] [PATCH v2 04/13] seqlock: rename write_lock/unlock to write_begin/end

2016-04-07 Thread Emilio G. Cota
It is a more appropriate name, now that the mutex embedded in the seqlock is gone. Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- cpus.c | 28 ++-- include/qemu/seqlock.h | 4 ++-- 2 files changed, 16 insertions(+), 16 deletions(-) diff

[Qemu-devel] [PATCH v2 06/13] qemu-thread: add simple test-and-set spinlock

2016-04-07 Thread Emilio G. Cota
From: Guillaume Delbergue Signed-off-by: Guillaume Delbergue [Rewritten. - Paolo] Signed-off-by: Paolo Bonzini --- include/qemu/thread.h | 31 +++ 1 file changed, 31 insertions(+) diff --git a/include/qemu/thread.h b/include/qemu/thread.h index bdae6df..1aa843b 100

[Qemu-devel] [PATCH v2 10/13] tb hash: hash phys_pc, pc, and flags with xxhash

2016-04-07 Thread Emilio G. Cota
or other workloads that do not translate as many blocks (600K+ for debian-jessie arm bootup). This is dealt with later in this series. Signed-off-by: Emilio G. Cota --- cpu-exec.c | 2 +- include/exec/tb-hash.h | 8 ++-- translate-all.c | 6 +++--- 3 files changed, 10 inser

[Qemu-devel] [PATCH v2 02/13] compiler.h: add QEMU_ALIGNED() to enforce struct alignment

2016-04-07 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota --- include/qemu/compiler.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h index 8f1cc7b..1978ddc 100644 --- a/include/qemu/compiler.h +++ b/include/qemu/compiler.h @@ -41,6 +41,8 @@ # define QEMU_PACKED

[Qemu-devel] [PATCH v2 08/13] qemu-thread: optimize spin_lock for uncontended locks

2016-04-07 Thread Emilio G. Cota
je 4ad980 4ad996: f3 90 pause 4ad998: eb f6 jmp 4ad990 Signed-off-by: Emilio G. Cota --- include/qemu/thread.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/qemu/thread.h b/include/qemu/thread.h

[Qemu-devel] [PATCH v2 07/13] qemu-thread: call cpu_relax() while spinning

2016-04-07 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota --- include/qemu/thread.h | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/include/qemu/thread.h b/include/qemu/thread.h index 1aa843b..599965e 100644 --- a/include/qemu/thread.h +++ b/include/qemu/thread.h @@ -2,6 +2,7 @@ #define

Re: [Qemu-devel] [PATCH] tb: consistently use uint32_t for tb->flags

2016-04-07 Thread Emilio G. Cota
On Thu, Apr 07, 2016 at 13:19:22 -0400, Emilio G. Cota wrote: > We are inconsistent with the type of tb->flags: usage varies loosely > between int and uint64_t. Settle to uint32_t everywhere, which is > superior to both: at least one target (aarch64) uses the most significant > bit

[Qemu-devel] [PATCH v2 00/10] tb hash improvements

2016-04-07 Thread Emilio G. Cota
See v1 for context: https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg00587.html All patches in v2 are checkpatch-clean, except 05 (checkpatch should be ignored for this one) and 06, which I took unmodified (later patches fix those warnings while doing other things, anyway). Note that pat

[Qemu-devel] [PATCH v2 03/13] seqlock: remove optional mutex

2016-04-07 Thread Emilio G. Cota
This option is unused; besides, it bloats the struct when not needed. Let's just let writers define their own locks elsewhere. Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- cpus.c | 2 +- include/qemu/seqlock.h | 10 +- 2 files changed, 2 insertions(+

Re: [Qemu-devel] [PATCH 09/10] qht: add test program

2016-04-19 Thread Emilio G. Cota
On Fri, Apr 08, 2016 at 11:45:41 +0100, Alex Bennée wrote: (snip the entire patch) > A couple of notes: > > - these should use the gtester boiler plate for reporting results Done in v3. > - AFAICT they are not exercising the multi-element hashing we actually > use in the main code > -

Re: [Qemu-devel] [PATCH 08/10] qht: QEMU's fast, resizable and scalable Hash Table

2016-04-19 Thread Emilio G. Cota
Hi Alex, I'm sending a v3 in a few minutes. I've addressed all your comments there, so I won't duplicate them here; please find inline my replies to some questions you raised. On Fri, Apr 08, 2016 at 11:27:19 +0100, Alex Bennée wrote: > Emilio G. Cota writes: (snip) > >

[Qemu-devel] [PATCH v3 02/11] seqlock: remove optional mutex

2016-04-19 Thread Emilio G. Cota
This option is unused; besides, it bloats the struct when not needed. Let's just let writers define their own locks elsewhere. Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- cpus.c | 2 +- include/qemu/seqlock.h | 10 +---

[Qemu-devel] [PATCH v3 03/11] seqlock: rename write_lock/unlock to write_begin/end

2016-04-19 Thread Emilio G. Cota
It is a more appropriate name, now that the mutex embedded in the seqlock is gone. Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- cpus.c | 28 ++-- include/qemu/seqlock.h | 4 ++-- 2 files changed, 16

[Qemu-devel] [PATCH v3 10/11] tb hash: track translated blocks with qht

2016-04-19 Thread Emilio G. Cota
e.g. most bootup code is thrown away after boot); it makes sense to grow the hash table as more code blocks are translated. This also avoids the complication of having to build downsizing hysteresis logic into qht. Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- cpu-ex

[Qemu-devel] [PATCH v3 05/11] qemu-thread: add simple test-and-set spinlock

2016-04-19 Thread Emilio G. Cota
From: Guillaume Delbergue Signed-off-by: Guillaume Delbergue [Rewritten. - Paolo] Signed-off-by: Paolo Bonzini [Emilio's additions: call cpu_relax() while spinning; optimize for uncontended locks by acquiring the lock with xchg+test instead of test+xchg+test.] Signed-off-by: Emilio G.
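[Editor's note] The two additions named in the bracketed note above (relaxing the CPU while spinning, and trying the exchange before testing) can be sketched roughly as follows. This is an illustration with C11 atomics and assumed names (`SketchSpin`, `sketch_spin_*`, `sketch_cpu_relax`), not the contents of include/qemu/thread.h; the memory orders are the editor's choice, and the trylock convention here (1 on success) is also an assumption.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a cpu_relax()-style spin hint. */
static inline void sketch_cpu_relax(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__("pause" ::: "memory");
#else
    atomic_signal_fence(memory_order_seq_cst);  /* compiler barrier only */
#endif
}

typedef struct {
    atomic_int value;   /* 0 = free, 1 = held */
} SketchSpin;

static inline void sketch_spin_init(SketchSpin *s)
{
    atomic_init(&s->value, 0);
}

static inline void sketch_spin_lock(SketchSpin *s)
{
    /* Exchange first: an uncontended acquire costs a single atomic op. */
    while (atomic_exchange_explicit(&s->value, 1, memory_order_acquire)) {
        /* Contended: spin on plain loads until the lock looks free. */
        while (atomic_load_explicit(&s->value, memory_order_relaxed)) {
            sketch_cpu_relax();
        }
    }
}

/* Returns 1 if the lock was acquired, 0 if it was already held. */
static inline int sketch_spin_trylock(SketchSpin *s)
{
    return atomic_exchange_explicit(&s->value, 1, memory_order_acquire) == 0;
}

static inline void sketch_spin_unlock(SketchSpin *s)
{
    assert(atomic_load_explicit(&s->value, memory_order_relaxed) == 1);
    atomic_store_explicit(&s->value, 0, memory_order_release);
}
```

Spinning on loads in the inner loop keeps the cache line in a shared state among waiters, so only the thread that sees the lock free retries the more expensive exchange.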

[Qemu-devel] [PATCH v3 01/11] compiler.h: add QEMU_ALIGNED() to enforce struct alignment

2016-04-19 Thread Emilio G. Cota
Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- include/qemu/compiler.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h index 8f1cc7b..1978ddc 100644 --- a/include/qemu/compiler.h +++ b/include/qemu/compiler.h @@ -41,6

[Qemu-devel] [PATCH v3 11/11] translate-all: add tb hash bucket info to 'info jit' dump

2016-04-19 Thread Emilio G. Cota
Suggested-by: Richard Henderson Signed-off-by: Emilio G. Cota --- translate-all.c | 5 + 1 file changed, 5 insertions(+) diff --git a/translate-all.c b/translate-all.c index 617a572..769bffc 100644 --- a/translate-all.c +++ b/translate-all.c @@ -1664,6 +1664,8 @@ void dump_exec_info(FILE

[Qemu-devel] [PATCH v3 09/11] qht: add test program

2016-04-19 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 6 ++- tests/test-qht.c | 139 +++ 3 files changed, 145 insertions(+), 1 deletion(-) create mode 100644 tests/test-qht.c diff --git a/tests/.gitignore b/tests

[Qemu-devel] [PATCH v3 08/11] qht: QEMU's fast, resizable and scalable Hash Table

2016-04-19 Thread Emilio G. Cota
we can already benefit from the single-threaded speedup that qht also provides. Signed-off-by: Emilio G. Cota --- include/qemu/qht.h | 54 + util/Makefile.objs | 2 +- util/qht.c | 590 + 3 files changed, 645 insertions(+), 1

[Qemu-devel] [PATCH v3 04/11] include/processor.h: define cpu_relax()

2016-04-19 Thread Emilio G. Cota
Taken from the linux kernel. Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- include/qemu/processor.h | 28 1 file changed, 28 insertions(+) create mode 100644 include/qemu/processor.h diff --git a/include/qemu/processor.h b/include/qemu

[Qemu-devel] [PATCH v3 00/11] tb hash improvements

2016-04-19 Thread Emilio G. Cota
See v2 here: https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg01307.html Changes from v2: - Dropped "add missing fold of tb_ctx into tcg_ctx", already merged upstream as commit 7e6bd36d611. - Added reviewed-by tags from Alex and Richard - xxhash: + use rol32 from qemu/bitops.h +

[Qemu-devel] [PATCH v3 06/11] exec: add tb_hash_func5, derived from xxhash

2016-04-19 Thread Emilio G. Cota
This will be used by upcoming changes for hashing the tb hash. Add this into a separate file to include the copyright notice from xxhash. Signed-off-by: Emilio G. Cota --- include/exec/tb-hash-xx.h | 94 +++ 1 file changed, 94 insertions(+) create

[Qemu-devel] [PATCH v3 07/11] tb hash: hash phys_pc, pc, and flags with xxhash

2016-04-19 Thread Emilio G. Cota
or other workloads that do not translate as many blocks (600K+ for debian-jessie arm bootup). This is dealt with later in this series. Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- cpu-exec.c | 2 +- include/exec/tb-hash.h | 8 ++-- translate-all.c |

Re: [Qemu-devel] [PATCH] tb: consistently use uint32_t for tb->flags

2016-04-19 Thread Emilio G. Cota
On Thu, Apr 07, 2016 at 13:19:22 -0400, Emilio G. Cota wrote: > We are inconsistent with the type of tb->flags: usage varies loosely > between int and uint64_t. Settle to uint32_t everywhere, which is > superior to both: at least one target (aarch64) uses the most significant > bit

Re: [Qemu-devel] [PATCH v3 04/11] include/processor.h: define cpu_relax()

2016-04-20 Thread Emilio G. Cota
On Wed, Apr 20, 2016 at 14:15:41 +0200, KONRAD Frederic wrote: > Le 20/04/2016 01:07, Emilio G. Cota a écrit : > >Taken from the linux kernel. > > > >Reviewed-by: Richard Henderson > >Signed-off-by: Emilio G. Cota > >--- > > include/qemu/processor.h | 28 +

Re: [Qemu-devel] [PATCH v3 05/11] qemu-thread: add simple test-and-set spinlock

2016-04-20 Thread Emilio G. Cota
On Wed, Apr 20, 2016 at 08:18:56 -0700, Richard Henderson wrote: > On 04/19/2016 04:07 PM, Emilio G. Cota wrote: > >From: Guillaume Delbergue > > > >Signed-off-by: Guillaume Delbergue > >[Rewritten. - Paolo] > >Signed-off-by: Paolo Bonzini > >[Emilio'

[Qemu-devel] [UPDATED v3 04/11] include/processor.h: define cpu_relax()

2016-04-20 Thread Emilio G. Cota
Taken from the linux kernel. Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- include/qemu/processor.h | 34 ++ 1 file changed, 34 insertions(+) create mode 100644 include/qemu/processor.h diff --git a/include/qemu/processor.h b/include/qemu

[Qemu-devel] [UPDATED v3 05/11] qemu-thread: add simple test-and-set spinlock

2016-04-20 Thread Emilio G. Cota
milio G. Cota --- include/qemu/thread.h | 34 ++ 1 file changed, 34 insertions(+) diff --git a/include/qemu/thread.h b/include/qemu/thread.h index bdae6df..a216941 100644 --- a/include/qemu/thread.h +++ b/include/qemu/thread.h @@ -1,6 +1,9 @@ #ifndef __QEMU_THR

Re: [Qemu-devel] [PATCH v3 05/11] qemu-thread: add simple test-and-set spinlock

2016-04-20 Thread Emilio G. Cota
On Wed, Apr 20, 2016 at 10:55:45 -0700, Richard Henderson wrote: > On 04/20/2016 10:17 AM, Emilio G. Cota wrote: > >I've tried to find a GCC intrinsic for test-and-set, and I've only found > >lock_test_and_set, which is what we use for atomic_xchg (except on ppc) > >

Re: [Qemu-devel] [PATCH v3 05/11] qemu-thread: add simple test-and-set spinlock

2016-04-21 Thread Emilio G. Cota
On Wed, Apr 20, 2016 at 12:39:45 -0700, Richard Henderson wrote: > On 04/20/2016 11:11 AM, Emilio G. Cota wrote: > >On Wed, Apr 20, 2016 at 10:55:45 -0700, Richard Henderson wrote: > >>On 04/20/2016 10:17 AM, Emilio G. Cota wrote: (snip) > >My comment was related to thi

[Qemu-devel] [PATCH 2/2] translate-all: add missing munmap of the code_gen guard page for MIPS

2016-04-21 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota --- translate-all.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/translate-all.c b/translate-all.c index e700399..bba9b62 100644 --- a/translate-all.c +++ b/translate-all.c @@ -668,39 +668,39 @@ static inline void *alloc_code_gen_buffer

[Qemu-devel] [PATCH 1/2] translate-all: remove redundant setting of tcg_ctx.code_gen_buffer_size

2016-04-21 Thread Emilio G. Cota
= size_code_gen_buffer(tb_size); Signed-off-by: Emilio G. Cota --- translate-all.c | 1 - 1 file changed, 1 deletion(-) diff --git a/translate-all.c b/translate-all.c index 769bffc..e700399 100644 --- a/translate-all.c +++ b/translate-all.c @@ -505,7 +505,6 @@ static inline size_t size_code_gen_buffer

[Qemu-devel] [RFC] translate-all: protect code_gen_buffer with RCU

2016-04-21 Thread Emilio G. Cota
g the assert. If you have a working multi-threaded workload that would be good to test this, please let me know. - Windows; not even compile-tested! Signed-off-by: Emilio G. Cota --- translate-all.c | 122 +--- 1 file changed, 117 insertions(

Re: [Qemu-devel] [PATCH v3 01/11] compiler.h: add QEMU_ALIGNED() to enforce struct alignment

2016-04-22 Thread Emilio G. Cota
On Fri, Apr 22, 2016 at 10:35:42 +0100, Peter Maydell wrote: > On 20 April 2016 at 00:07, Emilio G. Cota wrote: > > +#define QEMU_ALIGNED(B) __attribute__((aligned(B))) > > A rather trivial thing, but if we have to respin this series > for some other reason could we use
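[Editor's note] As a small illustration of what an alignment macro of this shape buys, the sketch below pads a per-thread counter to a full cache line so that neighbouring array elements do not share one. The macro body is the one quoted above; the 64-byte line size and the names `SKETCH_CACHELINE`, `sketch_counter` and `sketch_per_thread` are assumptions made for the example, and the attribute syntax is GCC/Clang specific.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CACHELINE 64   /* assumed line size; hosts may differ */
#define QEMU_ALIGNED(B) __attribute__((aligned(B)))

/* One counter per thread, each padded out to its own cache line. */
struct sketch_counter {
    unsigned long hits;
} QEMU_ALIGNED(SKETCH_CACHELINE);

/* The aligned attribute also rounds sizeof up to a multiple of the alignment. */
static_assert(sizeof(struct sketch_counter) % SKETCH_CACHELINE == 0,
              "counter must occupy whole cache lines");

static struct sketch_counter sketch_per_thread[4];

static void sketch_count_hit(size_t thread)
{
    sketch_per_thread[thread].hits++;
}
```

If the host's real line size is larger than the assumed constant, the struct is still correctly aligned for its type; the only cost is that two elements may again land on one line.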

Re: [Qemu-devel] [PATCH v3 11/11] translate-all: add tb hash bucket info to 'info jit' dump

2016-04-22 Thread Emilio G. Cota
On Fri, Apr 22, 2016 at 10:41:25 -0700, Richard Henderson wrote: > On 04/19/2016 04:07 PM, Emilio G. Cota wrote: > > +ht_avg_len = qht_avg_bucket_chain_length(&tcg_ctx.tb_ctx.htable, > > &ht_heads); > > +cpu_fprintf(f, "TB hash avg ch

Re: [Qemu-devel] [PATCH v3 11/11] translate-all: add tb hash bucket info to 'info jit' dump

2016-04-22 Thread Emilio G. Cota
On Fri, Apr 22, 2016 at 12:59:52 -0700, Richard Henderson wrote: > FWIW, so that I could get an idea of how the stats change as we improve the > hashing, I inserted the attachment 1 patch between patches 5 and 6, and with > attachment 2 attempting to fix the accounting for patches 9 and 10. For qh

Re: [Qemu-devel] [RFC] translate-all: protect code_gen_buffer with RCU

2016-04-23 Thread Emilio G. Cota
On Fri, Apr 22, 2016 at 15:41:13 +0100, Alex Bennée wrote: > Emilio G. Cota writes: (snip) > > Known issues: > > - Basically compile-tested only, since I've only run this with > > single-threaded TCG; I also tried running it with linux-user, > > but in order t

[Qemu-devel] [RFC v2] translate-all: protect code_gen_buffer with RCU

2016-04-23 Thread Emilio G. Cota
Alex' unit test with low enough -tb-size, see https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg03465.html Seems to work in MTTCG, although I've only tested with tb_lock always being taken in tb_find_fast. - Windows; not even compile-tested! Signed-off-by: Emilio G. Cota

Re: [Qemu-devel] [PATCH v3 08/11] qht: QEMU's fast, resizable and scalable Hash Table

2016-04-24 Thread Emilio G. Cota
On Sun, Apr 24, 2016 at 13:01:31 -0700, Richard Henderson wrote: > On 04/19/2016 04:07 PM, Emilio G. Cota wrote: > >+static void qht_insert__locked(struct qht *ht, struct qht_map *map, > >+ struct qht_bucket *head, void *p, uint32_t > >hash)

Re: [Qemu-devel] [PATCH v3 11/11] translate-all: add tb hash bucket info to 'info jit' dump

2016-04-24 Thread Emilio G. Cota
On Sun, Apr 24, 2016 at 12:46:08 -0700, Richard Henderson wrote: > On 04/22/2016 04:57 PM, Emilio G. Cota wrote: > >On Fri, Apr 22, 2016 at 12:59:52 -0700, Richard Henderson wrote: > >>FWIW, so that I could get an idea of how the stats change as we improve the > >

Re: [Qemu-devel] [RFC v2] translate-all: protect code_gen_buffer with RCU

2016-04-25 Thread Emilio G. Cota
On Mon, Apr 25, 2016 at 16:19:59 +0100, Alex Bennée wrote: > > Emilio G. Cota writes: > > > [ Applies on top of bennee/mttcg/enable-mttcg-for-armv7-v1 after > > reverting "translate-all: introduces tb_flush_safe". A trivial > > conflict must be solved afte

[Qemu-devel] [RFC v3] translate-all: protect code_gen_buffer with RCU

2016-04-25 Thread Emilio G. Cota
: [<0020>]lr : [<40010800>]psr: 6153 sp : 400b45c0 ip : 400b34e8 fp : 40032ca8 r10: r9 : r8 : r7 : r6 : r5 : r4 : r3 : r2 : r1 : 00ff r0 : Flags: nZCv IRQs on FIQs of

Re: [Qemu-devel] [PATCH v3 11/11] translate-all: add tb hash bucket info to 'info jit' dump

2016-04-26 Thread Emilio G. Cota
On Sun, Apr 24, 2016 at 18:06:51 -0400, Emilio G. Cota wrote: > On Sun, Apr 24, 2016 at 12:46:08 -0700, Richard Henderson wrote: > > On 04/22/2016 04:57 PM, Emilio G. Cota wrote: > > >On Fri, Apr 22, 2016 at 12:59:52 -0700, Richard Henderson wrote: > > >>FWIW, so t

[Qemu-devel] [PATCH v4 10/14] qdist: add test program

2016-04-29 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 6 +- tests/test-qdist.c | 363 + 3 files changed, 369 insertions(+), 1 deletion(-) create mode 100644 tests/test-qdist.c diff --git a/tests/.gitignore b

[Qemu-devel] [PATCH v4 03/14] seqlock: rename write_lock/unlock to write_begin/end

2016-04-29 Thread Emilio G. Cota
It is a more appropriate name, now that the mutex embedded in the seqlock is gone. Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- cpus.c | 28 ++-- include/qemu/seqlock.h | 4 ++-- 2 files changed, 16

[Qemu-devel] [PATCH v4 05/14] atomics: add atomic_test_and_set

2016-04-29 Thread Emilio G. Cota
This new helper expands to __atomic_test_and_set where available; otherwise it expands to atomic_xchg. Signed-off-by: Emilio G. Cota --- include/qemu/atomic.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h index 5bc4d6c..6132dad 100644 --- a
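[Editor's note] A helper with the shape described above might look like the sketch below. The names (`sketch_test_and_set`, `sketch_clear`, `sketch_try_claim`, `sketch_release`) are hypothetical, the sequentially consistent memory order is an assumption, and the fallback uses the legacy `__sync_lock_test_and_set` builtin as a stand-in for an exchange; none of this is copied from include/qemu/atomic.h.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__ATOMIC_SEQ_CST)
/* GCC >= 4.7 and Clang: use the dedicated test-and-set builtin. */
# define sketch_test_and_set(ptr) __atomic_test_and_set((ptr), __ATOMIC_SEQ_CST)
# define sketch_clear(ptr)        __atomic_clear((ptr), __ATOMIC_SEQ_CST)
#else
/* Older compilers: an exchange that stores 1 and returns the old value. */
# define sketch_test_and_set(ptr) (__sync_lock_test_and_set((ptr), 1) != 0)
# define sketch_clear(ptr)        __sync_lock_release(ptr)
#endif

/* Returns true if this caller flipped the flag from clear to set. */
static bool sketch_try_claim(bool *flag)
{
    return !sketch_test_and_set(flag);
}

static void sketch_release(bool *flag)
{
    /* Setting an already-set flag is harmless and doubles as a sanity check. */
    bool was_set = sketch_test_and_set(flag);
    assert(was_set);
    (void)was_set;
    sketch_clear(flag);
}
```

On some hosts a dedicated test-and-set can be cheaper than a general exchange, which is one plausible reason to prefer it when the compiler offers it.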

[Qemu-devel] [PATCH v4 04/14] include/processor.h: define cpu_relax()

2016-04-29 Thread Emilio G. Cota
Taken from the linux kernel. Reviewed-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- include/qemu/processor.h | 34 ++ 1 file changed, 34 insertions(+) create mode 100644 include/qemu/processor.h diff --git a/include/qemu

[Qemu-devel] [PATCH v4 14/14] translate-all: add tb hash bucket info to 'info jit' dump

2016-04-29 Thread Emilio G. Cota
: [15.0,16.7)|▁▂▅▄█▅|[30.3,32.0] Suggested-by: Richard Henderson Signed-off-by: Emilio G. Cota --- translate-all.c | 36 1 file changed, 36 insertions(+) diff --git a/translate-all.c b/translate-all.c index 0bf76d7..775ea79 100644 --- a/translate-all.c +++ b

[Qemu-devel] [PATCH v4 08/14] tb hash: hash phys_pc, pc, and flags with xxhash

2016-04-29 Thread Emilio G. Cota
or other workloads that do not translate as many blocks (600K+ for debian-jessie arm bootup). This is dealt with later in this series. Reviewed-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- cpu-exec.c | 4 ++-- include/exec/tb-hash.h | 8 ++-

[Qemu-devel] [PATCH v4 06/14] qemu-thread: add simple test-and-set spinlock

2016-04-29 Thread Emilio G. Cota
TATAS.] Signed-off-by: Emilio G. Cota --- include/qemu/thread.h | 34 ++ 1 file changed, 34 insertions(+) diff --git a/include/qemu/thread.h b/include/qemu/thread.h index bdae6df..39ff1ac 100644 --- a/include/qemu/thread.h +++ b/include/qemu/thread.h @@ -1,6
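A minimal sketch of a test-and-set spinlock, for context. The snippet above is truncated, so none of this is taken from the patch: the `SketchSpin` type and the function names are hypothetical, and the inner read-only loop (which makes this the test-and-test-and-set, or TATAS, variant) is included only to show the difference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock type; false means unlocked. */
typedef struct {
    bool locked;
} SketchSpin;

static inline void sketch_spin_init(SketchSpin *s)
{
    __atomic_clear(&s->locked, __ATOMIC_RELAXED);
}

static inline void sketch_spin_lock(SketchSpin *s)
{
    /* Test-and-set: retry the atomic write until the old value was "false". */
    while (__atomic_test_and_set(&s->locked, __ATOMIC_ACQUIRE)) {
        /*
         * TATAS refinement: wait with plain loads until the lock looks free.
         * Drop this inner loop to get the plain test-and-set lock.
         */
        while (__atomic_load_n(&s->locked, __ATOMIC_RELAXED)) {
        }
    }
}

/* Returns true if the lock was acquired (a convention chosen for this sketch). */
static inline bool sketch_spin_trylock(SketchSpin *s)
{
    return !__atomic_test_and_set(&s->locked, __ATOMIC_ACQUIRE);
}

static inline void sketch_spin_unlock(SketchSpin *s)
{
    __atomic_clear(&s->locked, __ATOMIC_RELEASE);
}
```

The acquire/release pairing is the usual minimum for a lock; a real implementation may choose stronger ordering.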

[Qemu-devel] [PATCH v4 12/14] qht: add test program

2016-04-29 Thread Emilio G. Cota
Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 5 +- tests/test-qht.c | 177 +++ 3 files changed, 182 insertions(+), 1 deletion(-) create mode 100644 tests/test-qht.c diff --git a

[Qemu-devel] [PATCH v4 09/14] qdist: add module to represent frequency distributions of data

2016-04-29 Thread Emilio G. Cota
dist, 39); [...] qdist_inc(&dist, 80); char *str = qdist_pr(&dist, 9, QDIST_PR_LABELS); // -> [39.0,43.6)▂▂ █▂ ▂ ▄[75.4,80.0] g_free(str); char *str = qdist_pr(&dist, 4, QDIST_PR_LABELS); // -> [39.0,49.2)▁█▁▁[69.8,80.0] g_free(str); Signed-off-by: Emilio G. Cota --- include/qemu/qdi
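The labels in the example output above ([39.0,43.6) and [75.4,80.0] for 9 bins) are consistent with equal-width bins over [min, max]. Assuming that is how the bins are laid out, the edges can be computed as in this sketch; `sketch_bin_range` is a hypothetical helper, not part of qdist.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: compute the range of bin i when [min, max] is split
 * into n_bins equal-width bins. The last bin is closed on the right so that
 * it includes max.
 */
static inline void sketch_bin_range(double min, double max, size_t n_bins,
                                    size_t i, double *lo, double *hi)
{
    double width = (max - min) / (double)n_bins;

    *lo = min + (double)i * width;
    *hi = (i + 1 == n_bins) ? max : min + (double)(i + 1) * width;
}
```

With min = 39, max = 80 and 9 bins the width is 41/9, about 4.56, so the first bin ends near 43.56 and the last one starts near 75.44, matching the labels once rounded to one decimal.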

[Qemu-devel] [PATCH v4 07/14] exec: add tb_hash_func5, derived from xxhash

2016-04-29 Thread Emilio G. Cota
This will be used by upcoming changes for hashing the tb hash. Add this into a separate file to include the copyright notice from xxhash. Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- include/exec/tb-hash-xx.h | 94 +++ 1 file

Re: [Qemu-devel] [RFC v3] translate-all: protect code_gen_buffer with RCU

2016-04-29 Thread Emilio G. Cota
On Tue, Apr 26, 2016 at 07:32:39 +0100, Alex Bennée wrote: > Emilio G. Cota writes: > > With two code_gen "halves", if two tb_flush calls are done in the same > > RCU read critical section, we're screwed. I added a cpu_exit at the end > > of tb_flush to try to

[Qemu-devel] [PATCH v4 00/14] tb hash improvements

2016-04-29 Thread Emilio G. Cota
Changes from v3: - added reviewed-by tags from v3. I dropped the review tags from the 'qht' and 'info jit' patches because they have changed quite a bit from v3. - qdist: new module to print intuitive histograms, see 'info jit' below. - qht: + bug fix: remove unnecessary requirement of hashe

[Qemu-devel] [PATCH v4 11/14] qht: QEMU's fast, resizable and scalable Hash Table

2016-04-29 Thread Emilio G. Cota
we can already benefit from the single-threaded speedup that qht also provides. Signed-off-by: Emilio G. Cota --- include/qemu/qht.h | 67 + util/Makefile.objs | 2 +- util/qht.c | 722 + 3 files changed, 790 insertions(+), 1

[Qemu-devel] [PATCH v4 13/14] tb hash: track translated blocks with qht

2016-04-29 Thread Emilio G. Cota
e.g. most bootup code is thrown away after boot); it makes sense to grow the hash table as more code blocks are translated. This also avoids the complication of having to build downsizing hysteresis logic into qht. Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- cpu-ex

[Qemu-devel] [PATCH v4 02/14] seqlock: remove optional mutex

2016-04-29 Thread Emilio G. Cota
This option is unused; besides, it bloats the struct when not needed. Let's just let writers define their own locks elsewhere. Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- cpus.c | 2 +- include/qemu/seqlock.h | 10 +---
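To make the seqlock changes in this series easier to follow, here is a rough sketch of a seqlock without an embedded mutex, using the write_begin/write_end naming. It is an illustration under assumptions, not QEMU's code: all names are hypothetical, and sequentially consistent C11 atomics are used throughout for simplicity rather than speed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical seqlock: the sequence is odd while a write is in progress. */
typedef struct {
    atomic_uint sequence;
} SketchSeqLock;

/* Writers must serialise among themselves with a lock of their own. */
static inline void sketch_seqlock_write_begin(SketchSeqLock *sl)
{
    atomic_fetch_add(&sl->sequence, 1);
}

static inline void sketch_seqlock_write_end(SketchSeqLock *sl)
{
    atomic_fetch_add(&sl->sequence, 1);
}

/* Returns an even value, so a read started mid-write always retries. */
static inline unsigned sketch_seqlock_read_begin(SketchSeqLock *sl)
{
    return atomic_load(&sl->sequence) & ~1u;
}

static inline bool sketch_seqlock_read_retry(SketchSeqLock *sl, unsigned start)
{
    return atomic_load(&sl->sequence) != start;
}

/* Example payload: two fields that must be observed together. */
typedef struct {
    SketchSeqLock sl;
    atomic_int a, b;
} SketchPair;

static inline void sketch_pair_init(SketchPair *p)
{
    atomic_init(&p->sl.sequence, 0);
    atomic_init(&p->a, 0);
    atomic_init(&p->b, 0);
}

/* Caller is assumed to hold whatever lock serialises writers. */
static inline void sketch_pair_write(SketchPair *p, int v)
{
    sketch_seqlock_write_begin(&p->sl);
    atomic_store(&p->a, v);
    atomic_store(&p->b, v);
    sketch_seqlock_write_end(&p->sl);
}

static inline int sketch_pair_read_sum(SketchPair *p)
{
    unsigned start;
    int a, b;

    do {
        start = sketch_seqlock_read_begin(&p->sl);
        a = atomic_load(&p->a);
        b = atomic_load(&p->b);
    } while (sketch_seqlock_read_retry(&p->sl, start));
    return a + b;
}
```

Since the lock no longer owns a mutex, "begin/end" describes what the writer side does more accurately than "lock/unlock": it only brackets the update for readers.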

[Qemu-devel] [PATCH v4 01/14] compiler.h: add QEMU_ALIGNED() to enforce struct alignment

2016-04-29 Thread Emilio G. Cota
Reviewed-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- include/qemu/compiler.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h index 8f1cc7b..b64f899 100644 --- a/include/qemu/compiler.h +++ b/include
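For illustration, an alignment macro of this kind is typically a thin wrapper around the GCC/Clang `aligned` attribute. The sketch below assumes that form; `SKETCH_ALIGNED` and `struct sketch_bucket` are hypothetical and only show the effect on a struct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an alignment macro (GCC/Clang attribute syntax). */
#define SKETCH_ALIGNED(X) __attribute__((aligned(X)))

/*
 * Example: force a struct onto a 64-byte boundary, e.g. so that one instance
 * does not straddle two cache lines (64 is an assumed cache-line size).
 */
struct sketch_bucket {
    unsigned hashes[4];
    void *pointers[4];
} SKETCH_ALIGNED(64);
```

The attribute raises both the alignment and, through padding, the size of the struct to a multiple of 64 bytes, so arrays of it stay aligned element by element.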

Re: [Qemu-devel] [PATCH v4 13/14] tb hash: track translated blocks with qht

2016-05-03 Thread Emilio G. Cota
On Tue, May 03, 2016 at 08:36:35 +0100, Alex Bennée wrote: > Emilio G. Cota writes: > > static TranslationBlock *tb_find_physical(CPUState *cpu, > >target_ulong pc, > >

Re: [Qemu-devel] [PATCH v4 13/14] tb hash: track translated blocks with qht

2016-05-04 Thread Emilio G. Cota
On Tue, May 03, 2016 at 19:24:31 -1000, Richard Henderson wrote: > On 05/03/2016 07:26 AM, Emilio G. Cota wrote: > >Ouch, sorry. Won't happen again. > > > >Grab the missing pre-requisite patch from: > > http://patchwork.ozlabs.org/patch/607662/mbox/ > &

Re: [Qemu-devel] [PATCH v4 13/14] tb hash: track translated blocks with qht

2016-05-05 Thread Emilio G. Cota
On Wed, May 04, 2016 at 07:22:16 -1000, Richard Henderson wrote: > On 05/04/2016 05:36 AM, Emilio G. Cota wrote: > >BTW in the last couple of days I did some more work beyond v4: > > > >- Added a benchmark (not a correctness test) to measure parallel > > performance o

Re: [Qemu-devel] [RFC v3] translate-all: protect code_gen_buffer with RCU

2016-05-09 Thread Emilio G. Cota
On Mon, May 09, 2016 at 13:21:50 +0200, Paolo Bonzini wrote: > On 30/04/2016 05:40, Emilio G. Cota wrote: > >> The tb_flush > >> > is a fairly rare occurrence its not like its on the critical performance > >> > path (although of course pathological cases are p
