Re: [Qemu-devel] [RFC v1 01/11] tcg: move tb_find_fast outside the tb_lock critical section

2016-03-21 Thread Emilio G. Cota
On Fri, Mar 18, 2016 at 16:18:42 +0000, Alex Bennée wrote: > From: KONRAD Frederic > > Signed-off-by: KONRAD Frederic > Signed-off-by: Paolo Bonzini > [AJB: minor checkpatch fixes] > Signed-off-by: Alex Bennée > > --- > v1(ajb) > - checkpatch fixes > --- > diff --git a/cpu-exec.c b/cpu-exec

Re: [Qemu-devel] [RFC v1 03/11] tcg: comment on which functions have to be called with tb_lock held

2016-03-21 Thread Emilio G. Cota
On Fri, Mar 18, 2016 at 17:59:46 +0100, Paolo Bonzini wrote: > On 18/03/2016 17:18, Alex Bennée wrote: > > + > > +/* Protected by tb_lock. */ > > Only writes are protected by tb_lock. Reads happen outside the lock. > > Reads are not quite thread safe yet, because of tb_flush. In order to >

Re: [Qemu-devel] [RFC v1 01/11] tcg: move tb_find_fast outside the tb_lock critical section

2016-03-21 Thread Emilio G. Cota
On Mon, Mar 21, 2016 at 22:08:06 +0000, Peter Maydell wrote: > It is not _necessary_, but it is a performance optimization to > speed up the "missed in the TLB" case. (A TLB flush will wipe > the tb_jmp_cache table.) From the thread where the move-to-front-of-list > behaviour was added in 2010, ben

Re: [Qemu-devel] [PATCH 1/2] atomics: do not use __atomic primitives for RCU atomics

2016-05-24 Thread Emilio G. Cota
On Tue, May 24, 2016 at 09:08:01 +0200, Paolo Bonzini wrote: > On 23/05/2016 19:09, Emilio G. Cota wrote: > > PS. And really equating smp_wmb/rmb to release/acquire as we have under > > #ifdef __ATOMIC is hard to justify, other than to please tsan. > > That only makes a diffe

[Qemu-devel] [PATCH v2 0/3] atomics: fix RCU perf. regression + update documentation

2016-05-24 Thread Emilio G. Cota
v1: https://lists.gnu.org/archive/html/qemu-devel/2016-05/msg03661.html Patch 1 hasn't changed from v1 (where it was patch 2 though). Patches 2 and 3 fix a not-so-small-after-all RCU performance regression we introduced when transitioning to __atomic primitives. I got an arm64 machine to test tod

[Qemu-devel] [PATCH v2 1/3] docs/atomics: update atomic_read/set comparison with Linux

2016-05-24 Thread Emilio G. Cota
tions pertaining to the variable they apply to; this, however, has no effect on surrounding statements like barriers do. For more details on this, see: https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html Signed-off-by: Emilio G. Cota --- docs/atomics.txt | 16 +--- 1 file changed,

[Qemu-devel] [PATCH v2 2/3] atomics: emit an smp_read_barrier_depends() barrier only for Sparc and Thread Sanitizer

2016-05-24 Thread Emilio G. Cota
rwise, only emit the barrier for Sparc hosts. Note that we still guarantee that smp_read_barrier_depends() is a compiler barrier. Signed-off-by: Emilio G. Cota --- include/qemu/atomic.h | 7 +++ 1 file changed, 7 insertions(+) diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h

[Qemu-devel] [PATCH v2 3/3] atomics: do not emit consume barrier for atomic_rcu_read

2016-05-24 Thread Emilio G. Cota
after applying this patch: $ tests/qht-bench -d 5 -n 1 Before: 9.78 MT/s After: 10.96 MT/s Signed-off-by: Emilio G. Cota --- include/qemu/atomic.h | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h index 4a4f2fb..c5b6c8d

Re: [Qemu-devel] [PATCH v2 2/3] atomics: emit an smp_read_barrier_depends() barrier only for Sparc and Thread Sanitizer

2016-05-24 Thread Emilio G. Cota
On Tue, May 24, 2016 at 23:09:47 +0300, Sergey Fedorov wrote: > On 24/05/16 23:06, Emilio G. Cota wrote: > > For correctness, smp_read_barrier_depends() is only required to > > emit a barrier on Sparc hosts. However, we are currently emitting > > a consume fence unconditional

Re: [Qemu-devel] [PATCH v5 13/18] qht: support parallel writes

2016-05-24 Thread Emilio G. Cota
On Mon, May 23, 2016 at 23:28:27 +0300, Sergey Fedorov wrote: > What if we turn qht::lock into a mutex and change the function as follows: > > static inline > struct qht_bucket *qht_bucket_lock__no_stale(struct qht *ht, > uint32_t hash, >

Re: [Qemu-devel] [PATCH v5 13/18] qht: support parallel writes

2016-05-24 Thread Emilio G. Cota
On Wed, May 25, 2016 at 01:17:21 +0300, Sergey Fedorov wrote: > >> With this implementation we could: > >> (1) get rid of qht_map::stale > >> (2) don't waste cycles waiting for resize to complete > > I'll include this in v6. > > How is it by perf? Not much of a difference, since resize is a slo

[Qemu-devel] [PATCH v6 06/15] exec: add tb_hash_func5, derived from xxhash

2016-05-24 Thread Emilio G. Cota
This will be used by upcoming changes for hashing the tb hash. Add this into a separate file to include the copyright notice from xxhash. Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- include/exec/tb-hash-xx.h | 94 +++ 1 file

[Qemu-devel] [PATCH v6 03/15] seqlock: rename write_lock/unlock to write_begin/end

2016-05-24 Thread Emilio G. Cota
It is a more appropriate name, now that the mutex embedded in the seqlock is gone. Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- cpus.c | 28 ++-- include/qemu/seqlock.h | 4 ++-- 2 files changed, 16

[Qemu-devel] [PATCH v6 04/15] include/processor.h: define cpu_relax()

2016-05-24 Thread Emilio G. Cota
Taken from the linux kernel. Reviewed-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- include/qemu/processor.h | 30 ++ 1 file changed, 30 insertions(+) create mode 100644 include/qemu/processor.h diff --git a/include/qemu

[Qemu-devel] [PATCH v6 15/15] translate-all: add tb hash bucket info to 'info jit' dump

2016-05-24 Thread Emilio G. Cota
: [15.0,16.7)|▁▂▅▄█▅|[30.3,32.0] Suggested-by: Richard Henderson Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- translate-all.c | 36 1 file changed, 36 insertions(+) diff --git a/translate-all.c b/translate-all.c index 5357737..c8074cf 100644

[Qemu-devel] [PATCH v6 02/15] seqlock: remove optional mutex

2016-05-24 Thread Emilio G. Cota
This option is unused; besides, it bloats the struct when not needed. Let's just let writers define their own locks elsewhere. Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- cpus.c | 2 +- include/qemu/seqlock.h | 10 +---

[Qemu-devel] [PATCH v6 05/15] qemu-thread: add simple test-and-set spinlock

2016-05-24 Thread Emilio G. Cota
ks by acquiring the lock with TAS instead of TATAS; add qemu_spin_locked().] Signed-off-by: Emilio G. Cota --- include/qemu/thread.h | 35 +++ 1 file changed, 35 insertions(+) diff --git a/include/qemu/thread.h b/include/qemu/thread.h index bdae6df..c5d71cf 100644

[Qemu-devel] [PATCH v6 07/15] tb hash: hash phys_pc, pc, and flags with xxhash

2016-05-24 Thread Emilio G. Cota
t do not translate as many blocks (600K+ for debian-jessie arm bootup). This is dealt with later in this series. Reviewed-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- cpu-exec.c | 4 ++-- include/exec/tb-hash.h | 8 ++-- translate-all.c

[Qemu-devel] [PATCH v6 08/15] qdist: add module to represent frequency distributions of data

2016-05-24 Thread Emilio G. Cota
dist, 39); [...] qdist_inc(&dist, 80); char *str = qdist_pr(&dist, 9, QDIST_PR_LABELS); // -> [39.0,43.6)▂▂ █▂ ▂ ▄[75.4,80.0] g_free(str); char *str = qdist_pr(&dist, 4, QDIST_PR_LABELS); // -> [39.0,49.2)▁█▁▁[69.8,80.0] g_free(str); Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota

[Qemu-devel] [PATCH v6 09/15] qdist: add test program

2016-05-24 Thread Emilio G. Cota
Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 6 +- tests/test-qdist.c | 369 + 3 files changed, 375 insertions(+), 1 deletion(-) create mode 100644 tests/test-qdist.c

[Qemu-devel] [PATCH v6 10/15] qht: QEMU's fast, resizable and scalable Hash Table

2016-05-24 Thread Emilio G. Cota
those changes arrive we can already benefit from the single-threaded speedup that qht also provides. Signed-off-by: Emilio G. Cota --- include/qemu/qht.h | 183 util/Makefile.objs | 1 + util/qht.c | 837 + 3 files changed

[Qemu-devel] [PATCH v6 11/15] qht: add test program

2016-05-24 Thread Emilio G. Cota
Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 6 ++- tests/test-qht.c | 159 +++ 3 files changed, 165 insertions(+), 1 deletion(-) create mode 100644

[Qemu-devel] [PATCH v6 01/15] compiler.h: add QEMU_ALIGNED() to enforce struct alignment

2016-05-24 Thread Emilio G. Cota
Reviewed-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- include/qemu/compiler.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h index 8f1cc7b..b64f899 100644 --- a/include/qemu/compiler.h +++ b/include

[Qemu-devel] [PATCH v6 14/15] tb hash: track translated blocks with qht

2016-05-24 Thread Emilio G. Cota
e.g. most bootup code is thrown away after boot); it makes sense to grow the hash table as more code blocks are translated. This also avoids the complication of having to build downsizing hysteresis logic into qht. Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G.

[Qemu-devel] [PATCH v6 12/15] qht: add qht-bench, a performance benchmark

2016-05-24 Thread Emilio G. Cota
Number of threads Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile| 3 +- tests/qht-bench.c | 474 ++ 3 files changed, 477 insertions(+), 1 deletion(-) create mode 100644 tests/qht-bench.c diff --g

[Qemu-devel] [PATCH v6 00/15] tb hash improvements

2016-05-24 Thread Emilio G. Cota
v5: https://lists.gnu.org/archive/html/qemu-devel/2016-05/msg02366.html v6 applies cleanly on top of tcg-next (8b1fe3f4 "cpu-exec: Clean up 'interrupt_request' reloading", tagged "pull-tcg-20160512"). Changes from v5, mostly from Sergey's review: - processor.h: use #ifdef #elif throughout the fi

[Qemu-devel] [PATCH v6 13/15] qht: add test-qht-par to invoke qht-bench from 'check' target

2016-05-24 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 5 - tests/test-qht-par.c | 56 3 files changed, 61 insertions(+), 1 deletion(-) create mode 100644 tests/test-qht-par.c diff --git a/tests/.gitignore

Re: [Qemu-devel] [PATCH v2 2/3] atomics: emit an smp_read_barrier_depends() barrier only for Sparc and Thread Sanitizer

2016-05-25 Thread Emilio G. Cota
On Wed, May 25, 2016 at 14:16:56 +0200, Paolo Bonzini wrote: > On 24/05/2016 22:06, Emilio G. Cota wrote: > > For correctness, smp_read_barrier_depends() is only required to > > emit a barrier on Sparc hosts. However, we are currently emitting > > a consume fence unconditio

Re: [Qemu-devel] [PATCH] docs/atomics: update comparison with Linux

2016-05-25 Thread Emilio G. Cota
r only; QEMU provides all of inc, dec, and, sub, and, or. Not clear whether the last 'and' is redundant or is being used as a conjunction. Either way it would be clearer to just remove it. Other than that: Reviewed-by: Emilio G. Cota Emilio

Re: [Qemu-devel] [PATCH v6 04/15] include/processor.h: define cpu_relax()

2016-05-27 Thread Emilio G. Cota
On Fri, May 27, 2016 at 23:53:01 +0300, Sergey Fedorov wrote: > On 25/05/16 04:13, Emilio G. Cota wrote: > > Taken from the linux kernel. > > > > Reviewed-by: Richard Henderson > > Reviewed-by: Alex Bennée > > Signed-off-by: Emilio G. Cota > >

Re: [Qemu-devel] [PATCH v6 12/15] qht: add qht-bench, a performance benchmark

2016-05-31 Thread Emilio G. Cota
On Tue, May 31, 2016 at 16:12:32 +0100, Alex Bennée wrote: > Emilio G. Cota writes: > > This serves as a performance benchmark as well as a stress test > > for QHT. We can tweak quite a number of things, including the > > number of resize threads and how frequently

Re: [Qemu-devel] [PATCH v6 10/15] qht: QEMU's fast, resizable and scalable Hash Table

2016-06-03 Thread Emilio G. Cota
On Sun, May 29, 2016 at 22:52:27 +0300, Sergey Fedorov wrote: > I was just wondering if it could be worthwhile to pass a hash function > when initializing a QHT. Then we could have variants of qht_insert(), > qht_remove() and qht_lookup() which does not require a computed hash > value but call the

Re: [Qemu-devel] [PATCH v6 10/15] qht: QEMU's fast, resizable and scalable Hash Table

2016-06-03 Thread Emilio G. Cota
On Sun, May 29, 2016 at 22:52:27 +0300, Sergey Fedorov wrote: > > +/** > > + * qht_remove - remove a pointer from the hash table > > + * @ht: QHT to remove from > > + * @p: pointer to be removed > > + * @hash: hash corresponding to @p > > + * > > + * Attempting to remove a NULL @p is a bug. > > + *

Re: [Qemu-devel] [PATCH v6 13/15] qht: add test-qht-par to invoke qht-bench from 'check' target

2016-06-03 Thread Emilio G. Cota
On Sun, May 29, 2016 at 23:53:42 +0300, Sergey Fedorov wrote: > On 25/05/16 04:13, Emilio G. Cota wrote: > (snip) > > + > > +#define TEST_QHT_STRING "tests/qht-bench 1>/dev/null 2>&1 -R -S0.1 -D1 > > -N1" > > + > > +static vo

Re: [Qemu-devel] [PATCH v6 12/15] qht: add qht-bench, a performance benchmark

2016-06-03 Thread Emilio G. Cota
On Sun, May 29, 2016 at 23:45:23 +0300, Sergey Fedorov wrote: > On 25/05/16 04:13, Emilio G. Cota wrote: > > diff --git a/tests/qht-bench.c b/tests/qht-bench.c > > new file mode 100644 > > index 0000000..30d27c8 > > --- /dev/null > > +++ b/tests/qht-benc

Re: [Qemu-devel] [PATCH v6 08/15] qdist: add module to represent frequency distributions of data

2016-06-03 Thread Emilio G. Cota
On Sat, May 28, 2016 at 21:15:06 +0300, Sergey Fedorov wrote: > On 25/05/16 04:13, Emilio G. Cota wrote: > (snip) > > +double qdist_avg(const struct qdist *dist) > > +{ > > +unsigned long count; > > +size_t i; > > +double ret = 0; > >

Re: [Qemu-devel] [PATCH v6 08/15] qdist: add module to represent frequency distributions of data

2016-06-06 Thread Emilio G. Cota
On Fri, Jun 03, 2016 at 20:46:07 +0300, Sergey Fedorov wrote: > On 03/06/16 20:29, Sergey Fedorov wrote: > > On 03/06/16 20:22, Emilio G. Cota wrote: > >> On Sat, May 28, 2016 at 21:15:06 +0300, Sergey Fedorov wrote: > >>> On 25/05/16 04:13, Emilio G. Cota wrot

Re: [Qemu-devel] [PATCH v6 08/15] qdist: add module to represent frequency distributions of data

2016-06-06 Thread Emilio G. Cota
On Sat, May 28, 2016 at 21:15:06 +0300, Sergey Fedorov wrote: > On 25/05/16 04:13, Emilio G. Cota wrote: > > diff --git a/util/qdist.c b/util/qdist.c > > new file mode 100644 > > index 0000000..3343640 > > --- /dev/null > > +++ b/util/qdist.c > > @@

Re: [Qemu-devel] [PATCH v6 08/15] qdist: add module to represent frequency distributions of data

2016-06-07 Thread Emilio G. Cota
On Tue, Jun 07, 2016 at 17:06:16 +0300, Sergey Fedorov wrote: > On 07/06/16 02:40, Emilio G. Cota wrote: > > On Fri, Jun 03, 2016 at 20:46:07 +0300, Sergey Fedorov wrote: > >> Maybe something like > >> https://en.wikipedia.org/wiki/Kahan_summation_algorithm could
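The Kahan summation algorithm linked in the quoted suggestion can be sketched as below. This is a generic illustration and not code from the qdist patch; the name demo_kahan_sum is hypothetical, and this excerpt does not show whether the series adopted the technique. It assumes the compiler is not reassociating floating-point arithmetic (for example, no -ffast-math).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compensated (Kahan) summation: 'comp' carries the low-order bits that
 * were lost when the previous element was added to the running sum.
 * demo_kahan_sum is a hypothetical name, not part of qdist.
 */
static double demo_kahan_sum(const double *values, size_t n)
{
    double sum = 0.0;
    double comp = 0.0;
    size_t i;

    assert(values != NULL || n == 0);
    for (i = 0; i < n; i++) {
        double y = values[i] - comp; /* apply the pending correction */
        double t = sum + y;          /* low-order bits of y may be lost here */

        comp = (t - sum) - y;        /* recover what was lost, negated */
        sum = t;
    }
    return sum;
}
```

The extra subtraction per element is the price for keeping the rounding error from growing with the number of elements, which matters mostly when small values are added to a much larger running total.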

Re: [Qemu-devel] [PATCH v6 08/15] qdist: add module to represent frequency distributions of data

2016-06-07 Thread Emilio G. Cota
On Tue, Jun 07, 2016 at 18:56:48 +0300, Sergey Fedorov wrote: > On 07/06/16 04:05, Emilio G. Cota wrote: > > On Sat, May 28, 2016 at 21:15:06 +0300, Sergey Fedorov wrote: > >> On 25/05/16 04:13, Emilio G. Cota wrote: > >>> diff --git a/util/qdist.c b/util/qdis

Re: [Qemu-devel] [PATCH v6 00/15] tb hash improvements

2016-06-08 Thread Emilio G. Cota
On Wed, Jun 08, 2016 at 07:25:33 +0100, Alex Bennée wrote: > Emilio G. Cota writes: > > > v5: https://lists.gnu.org/archive/html/qemu-devel/2016-05/msg02366.html > > > > v6 applies cleanly on top of tcg-next (8b1fe3f4 "cpu-exec: > > Clean up 'interru

Re: [Qemu-devel] [PATCH v6 08/15] qdist: add module to represent frequency distributions of data

2016-06-08 Thread Emilio G. Cota
On Wed, Jun 08, 2016 at 17:10:03 +0300, Sergey Fedorov wrote: > On 08/06/16 03:02, Emilio G. Cota wrote: > > -dist->entries = g_realloc(dist->entries, > > - sizeof(*dist->entries) * (dist->n + 1)); > > +if (unlikely(dist->n

[Qemu-devel] [PATCH v7 11/15] qht: add test program

2016-06-08 Thread Emilio G. Cota
Acked-by: Sergey Fedorov Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 6 ++- tests/test-qht.c | 159 +++ 3 files changed, 165 insertions(+), 1

[Qemu-devel] [PATCH v7 06/15] exec: add tb_hash_func5, derived from xxhash

2016-06-08 Thread Emilio G. Cota
This will be used by upcoming changes for hashing the tb hash. Add this into a separate file to include the copyright notice from xxhash. Reviewed-by: Sergey Fedorov Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- include/exec/tb-hash-xx.h | 94

[Qemu-devel] [PATCH v7 01/15] compiler.h: add QEMU_ALIGNED() to enforce struct alignment

2016-06-08 Thread Emilio G. Cota
Reviewed-by: Sergey Fedorov Reviewed-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- include/qemu/compiler.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h index 8f1cc7b..b64f899 100644 --- a/include
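For readers unfamiliar with alignment macros of this kind, here is a minimal sketch assuming a GCC- or Clang-compatible compiler. DEMO_ALIGNED, struct demo_bucket and demo_is_aligned are hypothetical names chosen for illustration, and the 64-byte figure is an assumed cache-line size; neither is taken from the truncated diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an alignment macro (GCC/Clang attribute). */
#define DEMO_ALIGNED(X) __attribute__((aligned(X)))

/* Illustrative struct padded and aligned to an assumed 64-byte cache line. */
struct demo_bucket {
    uint32_t hashes[4];
    void *pointers[4];
} DEMO_ALIGNED(64);

/* Returns nonzero if p is aligned to 'align', which must be a power of two. */
static inline int demo_is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

Aligning a struct this way also rounds its size up to a multiple of the alignment, so adjacent array elements never share a cache line.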

[Qemu-devel] [PATCH v7 07/15] tb hash: hash phys_pc, pc, and flags with xxhash

2016-06-08 Thread Emilio G. Cota
t do not translate as many blocks (600K+ for debian-jessie arm bootup). This is dealt with later in this series. Reviewed-by: Sergey Fedorov Reviewed-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- cpu-exec.c | 4 ++-- include/exec/tb-hash.h |

[Qemu-devel] [PATCH v7 13/15] qht: add test-qht-par to invoke qht-bench from 'check' target

2016-06-08 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 5 - tests/test-qht-par.c | 56 3 files changed, 61 insertions(+), 1 deletion(-) create mode 100644 tests/test-qht-par.c diff --git a/tests/.gitignore

[Qemu-devel] [PATCH v7 09/15] qdist: add test program

2016-06-08 Thread Emilio G. Cota
Acked-by: Sergey Fedorov Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 6 +- tests/test-qdist.c | 384 + 3 files changed, 390 insertions(+), 1 deletion(-) create mode

[Qemu-devel] [PATCH v7 03/15] seqlock: rename write_lock/unlock to write_begin/end

2016-06-08 Thread Emilio G. Cota
It is a more appropriate name, now that the mutex embedded in the seqlock is gone. Reviewed-by: Sergey Fedorov Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- cpus.c | 28 ++-- include/qemu/seqlock.h | 4
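To illustrate why begin/end reads better than lock/unlock once the seqlock no longer owns a mutex, here is a minimal sketch of a sequence counter. All demo_ names are hypothetical and this is not the code in include/qemu/seqlock.h; it assumes GCC-style __atomic builtins and uses sequentially consistent fences as a conservative stand-in for whatever barriers the real header uses. Writers are assumed to be serialized by a lock defined elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sequence counter; an odd value means a write is in progress. */
struct demo_seqlock {
    unsigned sequence;
};

static inline void demo_seqlock_init(struct demo_seqlock *sl)
{
    assert(sl != NULL);
    sl->sequence = 0;
}

/* Writer side. The caller must already hold the lock that serializes writers. */
static inline void demo_seqlock_write_begin(struct demo_seqlock *sl)
{
    unsigned seq = __atomic_load_n(&sl->sequence, __ATOMIC_RELAXED);

    __atomic_store_n(&sl->sequence, seq + 1, __ATOMIC_RELAXED);
    __atomic_thread_fence(__ATOMIC_SEQ_CST); /* counter update before data writes */
}

static inline void demo_seqlock_write_end(struct demo_seqlock *sl)
{
    unsigned seq = __atomic_load_n(&sl->sequence, __ATOMIC_RELAXED);

    __atomic_thread_fence(__ATOMIC_SEQ_CST); /* data writes before counter update */
    __atomic_store_n(&sl->sequence, seq + 1, __ATOMIC_RELAXED);
}

/* Reader side: returns the even snapshot to pass to demo_seqlock_read_retry(). */
static inline unsigned demo_seqlock_read_begin(const struct demo_seqlock *sl)
{
    unsigned seq = __atomic_load_n(&sl->sequence, __ATOMIC_RELAXED);

    __atomic_thread_fence(__ATOMIC_SEQ_CST);
    return seq & ~1u; /* an odd snapshot can never match, forcing a retry */
}

static inline bool demo_seqlock_read_retry(const struct demo_seqlock *sl,
                                           unsigned start)
{
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
    return __atomic_load_n(&sl->sequence, __ATOMIC_RELAXED) != start;
}
```

A reader would copy the protected data between demo_seqlock_read_begin() and demo_seqlock_read_retry(), repeating the copy for as long as the retry check returns true.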

[Qemu-devel] [PATCH v7 14/15] tb hash: track translated blocks with qht

2016-06-08 Thread Emilio G. Cota
e.g. most bootup code is thrown away after boot); it makes sense to grow the hash table as more code blocks are translated. This also avoids the complication of having to build downsizing hysteresis logic into qht. Reviewed-by: Sergey Fedorov Reviewed-by: Alex Bennée Reviewed-by: Richard Henders

[Qemu-devel] [PATCH v7 08/15] qdist: add module to represent frequency distributions of data

2016-06-08 Thread Emilio G. Cota
dist, 39); [...] qdist_inc(&dist, 80); char *str = qdist_pr(&dist, 9, QDIST_PR_LABELS); // -> [39.0,43.6)▂▂ █▂ ▂ ▄[75.4,80.0] g_free(str); char *str = qdist_pr(&dist, 4, QDIST_PR_LABELS); // -> [39.0,49.2)▁█▁▁[69.8,80.0] g_free(str); Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota

[Qemu-devel] [PATCH v7 04/15] include/processor.h: define cpu_relax()

2016-06-08 Thread Emilio G. Cota
Taken from the linux kernel. Reviewed-by: Sergey Fedorov Reviewed-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- include/qemu/processor.h | 30 ++ 1 file changed, 30 insertions(+) create mode 100644 include/qemu/processor.h diff

[Qemu-devel] [PATCH v7 10/15] qht: QEMU's fast, resizable and scalable Hash Table

2016-06-08 Thread Emilio G. Cota
those changes arrive we can already benefit from the single-threaded speedup that qht also provides. Signed-off-by: Emilio G. Cota --- include/qemu/qht.h | 183 util/Makefile.objs | 1 + util/qht.c | 833 + 3 files changed

[Qemu-devel] [PATCH v7 12/15] qht: add qht-bench, a performance benchmark

2016-06-08 Thread Emilio G. Cota
of threads Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile| 3 +- tests/qht-bench.c | 488 ++ 3 files changed, 491 insertions(+), 1 deletion(-) create mode 100644 tests/qht-bench.c diff --git a/tests/.gitigno

[Qemu-devel] [PATCH v7 05/15] qemu-thread: add simple test-and-set spinlock

2016-06-08 Thread Emilio G. Cota
imize for uncontended locks by acquiring the lock with TAS instead of TATAS; add qemu_spin_locked().] Signed-off-by: Emilio G. Cota --- include/qemu/thread.h | 35 +++ 1 file changed, 35 insertions(+) diff --git a/include/qemu/thread.h b/include/qemu/thread.h
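The TAS-first approach described in this changelog note can be sketched as follows. This is a minimal illustration assuming GCC-style __atomic builtins; the demo_ names are hypothetical and the body is not the code added to include/qemu/thread.h. The lock attempt is a plain test-and-set, and only after it fails does the loop fall back to spinning on a read, which is the "test" half of TATAS. demo_relax() is a compiler barrier standing in for an architecture-specific cpu_relax().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spinlock type; 0 means unlocked, 1 means locked. */
struct demo_spin {
    int value;
};

/* Stand-in for cpu_relax(): only a compiler barrier in this sketch. */
static inline void demo_relax(void)
{
    __asm__ __volatile__("" ::: "memory");
}

static inline void demo_spin_init(struct demo_spin *s)
{
    assert(s != NULL);
    __atomic_store_n(&s->value, 0, __ATOMIC_RELAXED);
}

static inline void demo_spin_lock(struct demo_spin *s)
{
    /* Uncontended case: a single test-and-set acquires the lock. */
    while (__atomic_exchange_n(&s->value, 1, __ATOMIC_ACQUIRE)) {
        /* Contended case: spin on a read to avoid bouncing the cache line. */
        while (__atomic_load_n(&s->value, __ATOMIC_RELAXED)) {
            demo_relax();
        }
    }
}

/* Returns 0 on success and -EBUSY if the lock is already held. */
static inline int demo_spin_trylock(struct demo_spin *s)
{
    return __atomic_exchange_n(&s->value, 1, __ATOMIC_ACQUIRE) ? -EBUSY : 0;
}

static inline bool demo_spin_locked(struct demo_spin *s)
{
    return __atomic_load_n(&s->value, __ATOMIC_RELAXED) != 0;
}

static inline void demo_spin_unlock(struct demo_spin *s)
{
    __atomic_store_n(&s->value, 0, __ATOMIC_RELEASE);
}
```

Compared with a pure TATAS loop, this ordering saves one load on the uncontended path at the cost of one extra atomic write when the lock turns out to be held.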

[Qemu-devel] [PATCH v7 15/15] translate-all: add tb hash bucket info to 'info jit' dump

2016-06-08 Thread Emilio G. Cota
: [15.0,16.7)|▁▂▅▄█▅|[30.3,32.0] Acked-by: Sergey Fedorov Suggested-by: Richard Henderson Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- translate-all.c | 36 1 file changed, 36 insertions(+) diff --git a/translate-all.c b/translate-all.c

[Qemu-devel] [PATCH v7 00/15] tb hash improvements

2016-06-08 Thread Emilio G. Cota
v6 on qemu-devel: https://lists.gnu.org/archive/html/qemu-devel/2016-05/msg04251.html All changes in this iteration come from comments by Sergey unless otherwise noted. All patches are checkpatch-clean once false positives are taken into account. Changes from v6: - Add reviewed-by tags from v6

[Qemu-devel] [PATCH v7 02/15] seqlock: remove optional mutex

2016-06-08 Thread Emilio G. Cota
This option is unused; besides, it bloats the struct when not needed. Let's just let writers define their own locks elsewhere. Reviewed-by: Sergey Fedorov Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- cpus.c | 2 +- include

Re: [Qemu-devel] [PULL 00/15] tb hash improvements

2016-06-10 Thread Emilio G. Cota
On Fri, Jun 10, 2016 at 16:33:10 +0100, Peter Maydell wrote: > Fails to build on ppc64be :-( > > In file included from /home/pm215/qemu/include/qemu/thread.h:4:0, > from /home/pm215/qemu/include/block/aio.h:20, > from /home/pm215/qemu/include/block/block.h:4, >

Re: [Qemu-devel] [PULL 00/15] tb hash improvements

2016-06-10 Thread Emilio G. Cota
On Fri, Jun 10, 2016 at 17:41:26 +0100, Peter Maydell wrote: > On 10 June 2016 at 17:34, Emilio G. Cota wrote: > > On Fri, Jun 10, 2016 at 16:33:10 +0100, Peter Maydell wrote: > >> Fails to build on ppc64be :-( > >> > >> In file included from /home/

[Qemu-devel] [PATCH v5 03/18] seqlock: rename write_lock/unlock to write_begin/end

2016-05-13 Thread Emilio G. Cota
It is a more appropriate name, now that the mutex embedded in the seqlock is gone. Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- cpus.c | 28 ++-- include/qemu/seqlock.h | 4 ++-- 2 files changed, 16

[Qemu-devel] [PATCH v5 02/18] seqlock: remove optional mutex

2016-05-13 Thread Emilio G. Cota
This option is unused; besides, it bloats the struct when not needed. Let's just let writers define their own locks elsewhere. Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- cpus.c | 2 +- include/qemu/seqlock.h | 10 +---

[Qemu-devel] [PATCH v5 07/18] qemu-thread: add simple test-and-set spinlock

2016-05-13 Thread Emilio G. Cota
h TAS instead of TATAS; add qemu_spin_locked().] Signed-off-by: Emilio G. Cota --- include/qemu/thread.h | 39 +++ 1 file changed, 39 insertions(+) diff --git a/include/qemu/thread.h b/include/qemu/thread.h index bdae6df..4b74ee5 100644 --- a/include/qemu/thr

[Qemu-devel] [PATCH v5 01/18] compiler.h: add QEMU_ALIGNED() to enforce struct alignment

2016-05-13 Thread Emilio G. Cota
Reviewed-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- include/qemu/compiler.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h index 8f1cc7b..b64f899 100644 --- a/include/qemu/compiler.h +++ b/include

[Qemu-devel] [PATCH v5 08/18] exec: add tb_hash_func5, derived from xxhash

2016-05-13 Thread Emilio G. Cota
This will be used by upcoming changes for hashing the tb hash. Add this into a separate file to include the copyright notice from xxhash. Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- include/exec/tb-hash-xx.h | 94 +++ 1 file

[Qemu-devel] [PATCH v5 05/18] atomics: add atomic_test_and_set_acquire

2016-05-13 Thread Emilio G. Cota
This new helper expands to __atomic_test_and_set with acquire semantics where available; otherwise it expands to __sync_lock_test_and_set, which has acquire semantics. Signed-off-by: Emilio G. Cota --- include/qemu/atomic.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/include/qemu/atomic.h
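A minimal sketch of the selection described in this commit message, assuming a GCC-compatible compiler. The demo_ names are hypothetical and this is not the macro added to include/qemu/atomic.h; the fallback uses GCC's legacy __sync_lock_test_and_set and __sync_lock_release builtins, which GCC documents as acquire and release barriers respectively. The flag is a bool because GCC documents __atomic_test_and_set for bool or char operands only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers; the names do not come from QEMU. */
#ifdef __ATOMIC_ACQUIRE
static inline bool demo_test_and_set_acquire(bool *flag)
{
    assert(flag != NULL);
    /* Returns true if the flag was already set. */
    return __atomic_test_and_set(flag, __ATOMIC_ACQUIRE);
}

static inline void demo_clear_release(bool *flag)
{
    assert(flag != NULL);
    __atomic_clear(flag, __ATOMIC_RELEASE);
}
#else
static inline bool demo_test_and_set_acquire(bool *flag)
{
    assert(flag != NULL);
    /* Legacy builtin: atomic exchange with acquire semantics. */
    return __sync_lock_test_and_set(flag, true);
}

static inline void demo_clear_release(bool *flag)
{
    assert(flag != NULL);
    /* Legacy builtin: writes 0 with release semantics. */
    __sync_lock_release(flag);
}
#endif
```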

[Qemu-devel] [PATCH v5 00/18] tb hash improvements

2016-05-13 Thread Emilio G. Cota
This patchset applies on top of tcg-next (8b1fe3f4 "cpu-exec: Clean up 'interrupt_request' reloading", tagged "pull-tcg-20160512"). For reference, here is v4: https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg04670.html Changes from v4: - atomics.h: + Add atomic_read_acquire and atomi

[Qemu-devel] [PATCH v5 09/18] tb hash: hash phys_pc, pc, and flags with xxhash

2016-05-13 Thread Emilio G. Cota
t do not translate as many blocks (600K+ for debian-jessie arm bootup). This is dealt with later in this series. Reviewed-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- cpu-exec.c | 4 ++-- include/exec/tb-hash.h | 8 ++-- translate-all.c

[Qemu-devel] [PATCH v5 06/18] atomics: add atomic_read_acquire and atomic_set_release

2016-05-13 Thread Emilio G. Cota
When __atomic is not available, we use full memory barriers instead of smp_rmb/wmb, since acquire/release barriers apply to all memory operations and not just to loads/stores, respectively. Signed-off-by: Emilio G. Cota --- include/qemu/atomic.h | 27 +++ 1 file changed, 27
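The reasoning in this commit message can be sketched as follows, assuming a GCC-compatible compiler. The demo_ names are hypothetical and the code is not the macros added to include/qemu/atomic.h. With __atomic available, the load and store carry acquire and release ordering directly. In the fallback, a full barrier (__sync_synchronize) is placed after the load and before the store, because a read-only or write-only barrier would not order the access against both loads and stores.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for an int; the names do not come from QEMU. */
#if defined(__ATOMIC_ACQUIRE) && defined(__ATOMIC_RELEASE)
static inline int demo_read_acquire(const int *p)
{
    assert(p != NULL);
    return __atomic_load_n(p, __ATOMIC_ACQUIRE);
}

static inline void demo_set_release(int *p, int v)
{
    assert(p != NULL);
    __atomic_store_n(p, v, __ATOMIC_RELEASE);
}
#else
static inline int demo_read_acquire(const int *p)
{
    int v;

    assert(p != NULL);
    v = *(const volatile int *)p;
    __sync_synchronize(); /* full barrier: later loads and stores stay after the read */
    return v;
}

static inline void demo_set_release(int *p, int v)
{
    assert(p != NULL);
    __sync_synchronize(); /* full barrier: earlier loads and stores stay before the write */
    *(volatile int *)p = v;
}
#endif
```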

[Qemu-devel] [PATCH v5 16/18] qht: add test-qht-par to invoke qht-bench from 'check' target

2016-05-13 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 5 - tests/test-qht-par.c | 56 3 files changed, 61 insertions(+), 1 deletion(-) create mode 100644 tests/test-qht-par.c diff --git a/tests/.gitignore

[Qemu-devel] [PATCH v5 04/18] include/processor.h: define cpu_relax()

2016-05-13 Thread Emilio G. Cota
Taken from the linux kernel. Reviewed-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota --- include/qemu/processor.h | 34 ++ 1 file changed, 34 insertions(+) create mode 100644 include/qemu/processor.h diff --git a/include/qemu

[Qemu-devel] [PATCH v5 18/18] translate-all: add tb hash bucket info to 'info jit' dump

2016-05-13 Thread Emilio G. Cota
: [15.0,16.7)|▁▂▅▄█▅|[30.3,32.0] Suggested-by: Richard Henderson Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- translate-all.c | 36 1 file changed, 36 insertions(+) diff --git a/translate-all.c b/translate-all.c index 5357737..c8074cf 100644

[Qemu-devel] [PATCH v5 11/18] qdist: add test program

2016-05-13 Thread Emilio G. Cota
Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 6 +- tests/test-qdist.c | 369 + 3 files changed, 375 insertions(+), 1 deletion(-) create mode 100644 tests/test-qdist.c

[Qemu-devel] [PATCH v5 14/18] qht: add test program

2016-05-13 Thread Emilio G. Cota
Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile | 6 ++- tests/test-qht.c | 159 +++ 3 files changed, 165 insertions(+), 1 deletion(-) create mode 100644

[Qemu-devel] [PATCH v5 17/18] tb hash: track translated blocks with qht

2016-05-13 Thread Emilio G. Cota
e.g. most bootup code is thrown away after boot); it makes sense to grow the hash table as more code blocks are translated. This also avoids the complication of having to build downsizing hysteresis logic into qht. Reviewed-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Emilio G.

[Qemu-devel] [PATCH v5 13/18] qht: support parallel writes

2016-05-13 Thread Emilio G. Cota
() check after acquiring the lock of the bucket they operate on. Signed-off-by: Emilio G. Cota --- include/qemu/qht.h | 10 +- util/qht.c | 276 - 2 files changed, 215 insertions(+), 71 deletions(-) diff --git a/include/qemu/qht.h b

[Qemu-devel] [PATCH v5 10/18] qdist: add module to represent frequency distributions of data

2016-05-13 Thread Emilio G. Cota
dist, 39); [...] qdist_inc(&dist, 80); char *str = qdist_pr(&dist, 9, QDIST_PR_LABELS); // -> [39.0,43.6)▂▂ █▂ ▂ ▄[75.4,80.0] g_free(str); char *str = qdist_pr(&dist, 4, QDIST_PR_LABELS); // -> [39.0,49.2)▁█▁▁[69.8,80.0] g_free(str); Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota

[Qemu-devel] [PATCH v5 15/18] qht: add qht-bench, a performance benchmark

2016-05-13 Thread Emilio G. Cota
Number of threads Signed-off-by: Emilio G. Cota --- tests/.gitignore | 1 + tests/Makefile| 3 +- tests/qht-bench.c | 473 ++ 3 files changed, 476 insertions(+), 1 deletion(-) create mode 100644 tests/qht-bench.c diff --g

[Qemu-devel] [PATCH v5 12/18] qht: QEMU's fast, resizable and scalable Hash Table

2016-05-13 Thread Emilio G. Cota
those changes arrive we can already benefit from the single-threaded speedup that qht also provides. Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota --- include/qemu/qht.h | 66 + util/Makefile.objs | 1 + util/qht.c | 703

Re: [Qemu-devel] [PATCH v5 06/18] atomics: add atomic_read_acquire and atomic_set_release

2016-05-16 Thread Emilio G. Cota
On Sun, May 15, 2016 at 06:22:36 -0400, Pranith Kumar wrote: > On Fri, May 13, 2016 at 11:34 PM, Emilio G. Cota wrote: > > When __atomic is not available, we use full memory barriers instead > > of smp_rmb/wmb, since acquire/release barriers apply to all memory > > operations

Re: [Qemu-devel] [PATCH v5 07/18] qemu-thread: add simple test-and-set spinlock

2016-05-17 Thread Emilio G. Cota
On Tue, May 17, 2016 at 23:20:11 +0300, Sergey Fedorov wrote: > On 17/05/16 23:04, Emilio G. Cota wrote: (snip) > > +/* > > + * We might be tempted to use __atomic_test_and_set with __ATOMIC_ACQUIRE; > > + * however, the documentation explicitly says that we should only pass

Re: [Qemu-devel] [PATCH v5 07/18] qemu-thread: add simple test-and-set spinlock

2016-05-17 Thread Emilio G. Cota
On Tue, May 17, 2016 at 23:35:57 +0300, Sergey Fedorov wrote: > On 17/05/16 22:38, Emilio G. Cota wrote: > > On Tue, May 17, 2016 at 20:13:24 +0300, Sergey Fedorov wrote: > >> On 14/05/16 06:34, Emilio G. Cota wrote: > (snip) > >>> +

Re: [Qemu-devel] [PATCH v5 09/18] tb hash: hash phys_pc, pc, and flags with xxhash

2016-05-17 Thread Emilio G. Cota
On Tue, May 17, 2016 at 20:47:52 +0300, Sergey Fedorov wrote: > On 14/05/16 06:34, Emilio G. Cota wrote: > > For some workloads such as arm bootup, tb_phys_hash is performance-critical. > > This is due to the high frequency of accesses to the hash table, originated > > by (

Re: [Qemu-devel] [PATCH v5 07/18] qemu-thread: add simple test-and-set spinlock

2016-05-17 Thread Emilio G. Cota
On Tue, May 17, 2016 at 20:13:24 +0300, Sergey Fedorov wrote: > On 14/05/16 06:34, Emilio G. Cota wrote: (snip) > > +static inline void qemu_spin_lock(QemuSpin *spin) > > +{ > > +while (atomic_test_and_set_acquire(&spin->value)) { > > From gcc-4.8

Re: [Qemu-devel] [PATCH v5 07/18] qemu-thread: add simple test-and-set spinlock

2016-05-17 Thread Emilio G. Cota
t gets rid of any guesswork (as in my previous email). I've changed the patch to: commit 8f89d36b6203b78df2bf1e3f82871b8aa2ca83b7 Author: Emilio G. Cota Date: Thu Apr 28 10:56:26 2016 -0400 atomics: add atomic_test_and_set_acquire Signed-off-by: Emilio G. Cota diff --git

Re: [Qemu-devel] [PATCH v5 08/18] exec: add tb_hash_func5, derived from xxhash

2016-05-17 Thread Emilio G. Cota
On Tue, May 17, 2016 at 20:22:52 +0300, Sergey Fedorov wrote: > On 14/05/16 06:34, Emilio G. Cota wrote: (snip) > > +static inline > > +uint32_t tb_hash_func5(uint64_t a0, uint64_t b0, uint32_t e) > > +{ > > +uint32_t v1 = TB_HASH_XX_SEED + PRIME32_1 + PRIM

Re: [Qemu-devel] [PATCH v5 07/18] qemu-thread: add simple test-and-set spinlock

2016-05-18 Thread Emilio G. Cota
pu_relax() while spinning; optimize for uncontended locks by acquiring the lock with TAS instead of TATAS; add qemu_spin_locked().] Signed-off-by: Emilio G. Cota diff --git a/include/qemu/thread.h b/include/qemu/thread.h index bdae6df..2d225ff 100644 --- a/include/qemu/thread.h +++ b/include/q
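
The changelog above mentions three things: relaxing the CPU while spinning, taking the lock with a plain test-and-set (TAS) on the uncontended path, and a "locked" query. A minimal sketch of that shape in portable C11 follows. It is not the QemuSpin code from the patch: every sketch_* name is hypothetical, C11 atomic_exchange stands in for the test-and-set primitive discussed in the thread, and sketch_cpu_relax() is only a compiler-level placeholder where a real implementation would presumably issue an architecture-specific pause hint.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type and names; not the QemuSpin API from the patch. */
typedef struct {
    atomic_bool value;
} SketchSpin;

/* Placeholder only: a real cpu_relax() would likely emit a pause hint. */
static inline void sketch_cpu_relax(void)
{
    atomic_signal_fence(memory_order_seq_cst);
}

static inline void sketch_spin_init(SketchSpin *spin)
{
    atomic_init(&spin->value, false);
}

/* Racy by nature: the answer may be stale by the time the caller uses it. */
static inline bool sketch_spin_locked(SketchSpin *spin)
{
    return atomic_load_explicit(&spin->value, memory_order_relaxed);
}

static inline void sketch_spin_lock(SketchSpin *spin)
{
    /* Uncontended path: a single test-and-set with acquire ordering. */
    while (atomic_exchange_explicit(&spin->value, true,
                                    memory_order_acquire)) {
        /* Contended path: spin on plain reads until the lock looks free. */
        while (atomic_load_explicit(&spin->value, memory_order_relaxed)) {
            sketch_cpu_relax();
        }
    }
}

/* Returns 0 on success, -EBUSY if the lock was already held. */
static inline int sketch_spin_trylock(SketchSpin *spin)
{
    if (atomic_exchange_explicit(&spin->value, true, memory_order_acquire)) {
        return -EBUSY;
    }
    return 0;
}

static inline void sketch_spin_unlock(SketchSpin *spin)
{
    assert(sketch_spin_locked(spin));
    atomic_store_explicit(&spin->value, false, memory_order_release);
}
```

Trying the exchange first keeps the uncontended acquire to one atomic operation; the inner read-only loop is entered only after a failed attempt, so waiters do not keep writing to the contended cache line.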

Re: [Qemu-devel] [PATCH v5 04/18] include/processor.h: define cpu_relax()

2016-05-18 Thread Emilio G. Cota
0f173e473c27c5a Author: Emilio G. Cota Date: Wed Apr 6 18:21:08 2016 -0400 include/processor.h: define cpu_relax() Taken from the linux kernel. Reviewed-by: Richard Henderson Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota diff --git a/include/qemu/processor.

Re: [Qemu-devel] [PATCH v5 07/18] qemu-thread: add simple test-and-set spinlock

2016-05-18 Thread Emilio G. Cota
On Wed, May 18, 2016 at 21:21:26 +0300, Sergey Fedorov wrote: > On 14/05/16 06:34, Emilio G. Cota wrote: > > +static inline int qemu_spin_trylock(QemuSpin *spin) > > +{ > > +if (atomic_test_and_set_acquire(&spin->value)) { > > +return -EBUSY; >

Re: [Qemu-devel] [PATCH v5 07/18] qemu-thread: add simple test-and-set spinlock

2016-05-18 Thread Emilio G. Cota
On Wed, May 18, 2016 at 22:51:09 +0300, Sergey Fedorov wrote: > On 14/05/16 06:34, Emilio G. Cota wrote: > > +static inline void qemu_spin_lock(QemuSpin *spin) > > +{ > > +while (atomic_test_and_set_acquire(&spin->value)) { > > A possible optimization might

Re: [Qemu-devel] [PATCH v5 12/18] qht: QEMU's fast, resizable and scalable Hash Table

2016-05-20 Thread Emilio G. Cota
Hi Sergey, Any 'Ack' below means the change has made it to my tree. On Sat, May 21, 2016 at 01:13:20 +0300, Sergey Fedorov wrote: > > +#include "qemu/osdep.h" > > +#include "qemu-common.h" > > There's no need in qemu-common.h Ack > > +#include "qemu/seqlock.h" > > +#include "qemu/qdist.h" > >

Re: [Qemu-devel] [PATCH v5 12/18] qht: QEMU's fast, resizable and scalable Hash Table

2016-05-21 Thread Emilio G. Cota
On Fri, May 20, 2016 at 22:48:11 -0400, Emilio G. Cota wrote: > On Sat, May 21, 2016 at 01:13:20 +0300, Sergey Fedorov wrote: > > > +static inline > > > +void *qht_do_lookup(struct qht_bucket *head, qht_lookup_func_t func, > > > +co

[Qemu-devel] [PATCH 1/2] atomics: do not use __atomic primitives for RCU atomics

2016-05-21 Thread Emilio G. Cota
for tsan, though. Signed-off-by: Emilio G. Cota --- include/qemu/atomic.h | 99 ++- 1 file changed, 43 insertions(+), 56 deletions(-) diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h index 5bc4d6c..4d62425 100644 --- a/include/qemu/at

[Qemu-devel] [PATCH 0/2] atomics: fix small RCU perf. regression + update documentation

2016-05-21 Thread Emilio G. Cota
Patch 1 fixes a small performance regression introduced when moving our atomics to __atomic primitives. The regression can be measured on RMO architectures (I used aarch64); the effect is very small but consistently measurable: for instance, rcutorture performance degraded by about 0.3%. Patch 2 o

[Qemu-devel] [PATCH 2/2] docs/atomics: update atomic_read/set comparison with Linux

2016-05-21 Thread Emilio G. Cota
tions pertaining to the variable they apply to; this, however, has no effect on surrounding statements like barriers do. For more details on this, see: https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html Signed-off-by: Emilio G. Cota --- docs/atomics.txt | 16 +--- 1 file changed,
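
The distinction drawn above (a volatile access constrains the compiler only for the variable it touches, whereas a barrier also constrains the surrounding statements) can be illustrated with a small sketch. The macros below are hypothetical stand-ins, not QEMU's atomic_read/atomic_set definitions; they assume the GNU __typeof__ extension and GCC-style inline asm. Only compiler-level ordering is shown, so on weakly ordered hardware a real publisher/consumer pair would also need CPU memory barriers.

```c
#include <assert.h>

/* Hypothetical macros; GNU __typeof__ and GCC-style asm are assumed. */
#define SKETCH_READ(ptr)     (*(volatile __typeof__(*(ptr)) *)(ptr))
#define SKETCH_SET(ptr, val) ((void)(*(volatile __typeof__(*(ptr)) *)(ptr) = (val)))
#define sketch_barrier()     __asm__ __volatile__("" ::: "memory")

static int sketch_data;   /* accessed with plain loads and stores */
static int sketch_ready;  /* accessed only through the volatile macros */

/* Publishes a non-negative value; -1 is reserved as "not ready". */
static void sketch_publish(int v)
{
    assert(v >= 0);
    sketch_data = v;
    /*
     * Without this, the compiler may reorder the plain store above past
     * the volatile store below: volatile says nothing about sketch_data.
     */
    sketch_barrier();
    SKETCH_SET(&sketch_ready, 1);
}

/* Returns the published value, or -1 if nothing has been published yet. */
static int sketch_consume(void)
{
    if (!SKETCH_READ(&sketch_ready)) {
        return -1;
    }
    sketch_barrier();  /* keep the plain load below after the flag check */
    return sketch_data;
}
```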

Re: [Qemu-devel] [PATCH v5 12/18] qht: QEMU's fast, resizable and scalable Hash Table

2016-05-22 Thread Emilio G. Cota
On Sun, May 22, 2016 at 09:01:59 +0100, Alex Bennée wrote: > Emilio G. Cota writes: > > A small update: I just got rid of all the atomic_read/set's that > > apply to the hashes, since retries will take care of possible races. > > I guess the potential hash-clash from a

Re: [Qemu-devel] Any topics for today's MTTCG sync-up call?

2016-05-23 Thread Emilio G. Cota
On Mon, May 23, 2016 at 14:57:12 +0200, Claudio Fontana wrote: > Hi, at some point in the past there was a set of performance benchmarks > which were showing the improvements using mttcg, is there some update > on that? Any scalable parallel workload should do. I've used the C/C++ benchmarks in s

Re: [Qemu-devel] Any topics for today's MTTCG sync-up call?

2016-05-23 Thread Emilio G. Cota
On Mon, May 23, 2016 at 11:57:26 +0100, Alex Bennée wrote: > Emilio, is there anything you want to add? I've been following the QHT > stuff which is a really positive addition which my v3 base patches is > based upon (making the hot-path non lock contended). Do you have > anything in the works abov

Re: [Qemu-devel] [PATCH 1/2] atomics: do not use __atomic primitives for RCU atomics

2016-05-23 Thread Emilio G. Cota
On Mon, May 23, 2016 at 16:21:36 +0200, Paolo Bonzini wrote: > On 21/05/2016 22:42, Emilio G. Cota wrote: > > Commit a0aa44b4 ("include/qemu/atomic.h: default to __atomic functions") > > set all atomics to default (on recent GCC versions) to __atomic primitives. >

Re: [Qemu-devel] [PATCH 1/2] atomics: do not use __atomic primitives for RCU atomics

2016-05-23 Thread Emilio G. Cota
On Mon, May 23, 2016 at 09:53:00 -0700, Richard Henderson wrote: > On 05/21/2016 01:42 PM, Emilio G. Cota wrote: > >In the process, the atomic_rcu_read/set were converted to implement > >consume/release semantics, respectively. This is inefficient; for > >correctness and m
