On Fri, Mar 18, 2016 at 16:18:42 +, Alex Bennée wrote:
> From: KONRAD Frederic
>
> Signed-off-by: KONRAD Frederic
> Signed-off-by: Paolo Bonzini
> [AJB: minor checkpatch fixes]
> Signed-off-by: Alex Bennée
>
> ---
> v1(ajb)
> - checkpatch fixes
> ---
> diff --git a/cpu-exec.c b/cpu-exec
On Fri, Mar 18, 2016 at 17:59:46 +0100, Paolo Bonzini wrote:
> On 18/03/2016 17:18, Alex Bennée wrote:
> > +
> > +/* Protected by tb_lock. */
>
> Only writes are protected by tb_lock. Reads happen outside the lock.
>
> Reads are not quite thread safe yet, because of tb_flush. In order to
>
On Mon, Mar 21, 2016 at 22:08:06 +, Peter Maydell wrote:
> It is not _necessary_, but it is a performance optimization to
> speed up the "missed in the TLB" case. (A TLB flush will wipe
> the tb_jmp_cache table.) From the thread where the move-to-front-of-list
> behaviour was added in 2010, ben
On Tue, May 24, 2016 at 09:08:01 +0200, Paolo Bonzini wrote:
> On 23/05/2016 19:09, Emilio G. Cota wrote:
> > PS. And really equating smp_wmb/rmb to release/acquire as we have under
> > #ifdef __ATOMIC is hard to justify, other than to please tsan.
>
> That only makes a diffe
v1: https://lists.gnu.org/archive/html/qemu-devel/2016-05/msg03661.html
Patch 1 hasn't changed from v1 (where it was patch 2 though).
Patches 2 and 3 fix a not-so-small-after-all RCU performance regression
we introduced when transitioning to __atomic primitives. I got
an arm64 machine to test tod
tions pertaining to the variable they apply to; this, however,
has no effect on surrounding statements like barriers do. For more
details on this, see:
https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html
Signed-off-by: Emilio G. Cota
---
docs/atomics.txt | 16 +---
1 file changed,
rwise,
only emit the barrier for Alpha hosts. Note that we still guarantee
that smp_read_barrier_depends() is a compiler barrier.
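
To illustrate the idea described above, here is a minimal sketch of a dependency barrier that is a real fence only where the hardware needs one and a plain compiler barrier everywhere else. It uses Alpha, the one host architecture known to reorder dependent loads. All sketch_* names are hypothetical and are not the contents of include/qemu/atomic.h.

```c
#include <assert.h>
#include <stddef.h>

/* Compiler-only barrier: emits no instruction, but stops the compiler
 * from reordering memory accesses across it. */
#define sketch_barrier() __asm__ __volatile__("" ::: "memory")

#if defined(__alpha__)
/* Alpha may reorder dependent loads, so a real memory barrier is needed. */
#define sketch_read_barrier_depends() __asm__ __volatile__("mb" ::: "memory")
#else
/* Elsewhere the data dependency already orders the loads in hardware. */
#define sketch_read_barrier_depends() sketch_barrier()
#endif

/* Hypothetical payload published through a shared pointer. */
struct sketch_node {
    int payload;
};

static struct sketch_node *sketch_shared; /* set by a writer thread */

/* Reader: load the pointer, then dereference it after the barrier. */
static int sketch_read_payload(void)
{
    struct sketch_node *p = __atomic_load_n(&sketch_shared, __ATOMIC_RELAXED);

    sketch_read_barrier_depends();
    return p ? p->payload : -1;
}
```

On hosts other than Alpha the macro costs nothing at run time, yet the compiler still may not hoist the dereference above the pointer load.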
Signed-off-by: Emilio G. Cota
---
include/qemu/atomic.h | 7 +++
1 file changed, 7 insertions(+)
diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h
after applying this patch:
$ tests/qht-bench -d 5 -n 1
Before: 9.78 MT/s
After: 10.96 MT/s
Signed-off-by: Emilio G. Cota
---
include/qemu/atomic.h | 11 +--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h
index 4a4f2fb..c5b6c8d
On Tue, May 24, 2016 at 23:09:47 +0300, Sergey Fedorov wrote:
> On 24/05/16 23:06, Emilio G. Cota wrote:
> > For correctness, smp_read_barrier_depends() is only required to
> > emit a barrier on Alpha hosts. However, we are currently emitting
> > a consume fence unconditional
On Mon, May 23, 2016 at 23:28:27 +0300, Sergey Fedorov wrote:
> What if we turn qht::lock into a mutex and change the function as follows:
>
> static inline
> struct qht_bucket *qht_bucket_lock__no_stale(struct qht *ht,
> uint32_t hash,
>
On Wed, May 25, 2016 at 01:17:21 +0300, Sergey Fedorov wrote:
> >> With this implementation we could:
> >> (1) get rid of qht_map::stale
> >> (2) don't waste cycles waiting for resize to complete
> > I'll include this in v6.
>
> How is it by perf?
Not much of a difference, since resize is a slo
This will be used by upcoming changes for hashing the tb hash.
Add this into a separate file to include the copyright notice from
xxhash.
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
include/exec/tb-hash-xx.h | 94 +++
1 file
It is a more appropriate name, now that the mutex embedded
in the seqlock is gone.
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
cpus.c | 28 ++--
include/qemu/seqlock.h | 4 ++--
2 files changed, 16
Taken from the linux kernel.
Reviewed-by: Richard Henderson
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
include/qemu/processor.h | 30 ++
1 file changed, 30 insertions(+)
create mode 100644 include/qemu/processor.h
diff --git a/include/qemu
:
[15.0,16.7)|▁▂▅▄█▅|[30.3,32.0]
Suggested-by: Richard Henderson
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
translate-all.c | 36
1 file changed, 36 insertions(+)
diff --git a/translate-all.c b/translate-all.c
index 5357737..c8074cf 100644
This option is unused; besides, it bloats the struct when not needed.
Let's just let writers define their own locks elsewhere.
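
As a rough illustration of a sequence lock without an embedded mutex, the sketch below leaves writer-side mutual exclusion entirely to the caller. The sketch_* names are hypothetical, and the full (seq_cst) fences are a deliberately conservative assumption, not necessarily what include/qemu/seqlock.h uses.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    unsigned sequence; /* odd while a write is in progress */
} SketchSeqLock;

/* Writer side. The caller must already hold its own lock (not shown)
 * so that only one writer runs at a time. */
static inline void sketch_seqlock_write_begin(SketchSeqLock *sl)
{
    __atomic_store_n(&sl->sequence, sl->sequence + 1, __ATOMIC_RELAXED);
    __atomic_thread_fence(__ATOMIC_SEQ_CST); /* bump before data writes */
}

static inline void sketch_seqlock_write_end(SketchSeqLock *sl)
{
    __atomic_thread_fence(__ATOMIC_SEQ_CST); /* data writes before bump */
    __atomic_store_n(&sl->sequence, sl->sequence + 1, __ATOMIC_RELAXED);
}

/* Reader side: lock-free, retries if a writer interfered. */
static inline unsigned sketch_seqlock_read_begin(SketchSeqLock *sl)
{
    unsigned s = __atomic_load_n(&sl->sequence, __ATOMIC_RELAXED);

    __atomic_thread_fence(__ATOMIC_SEQ_CST);
    return s & ~1u; /* an odd value can never match, forcing a retry */
}

static inline bool sketch_seqlock_read_retry(SketchSeqLock *sl, unsigned start)
{
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
    return __atomic_load_n(&sl->sequence, __ATOMIC_RELAXED) != start;
}

static long sketch_value; /* hypothetical data protected by the seqlock */

static long sketch_read_value(SketchSeqLock *sl)
{
    unsigned start;
    long v;

    do {
        start = sketch_seqlock_read_begin(sl);
        v = __atomic_load_n(&sketch_value, __ATOMIC_RELAXED);
    } while (sketch_seqlock_read_retry(sl, start));
    return v;
}
```

Because the struct holds only the counter, users that already serialize their writers by other means pay nothing for an unused mutex.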
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
cpus.c | 2 +-
include/qemu/seqlock.h | 10 +---
ks by acquiring the lock with TAS instead
of TATAS; add qemu_spin_locked().]
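
For readers unfamiliar with the TAS/TATAS distinction, here is a minimal sketch of both acquisition strategies using GCC's __atomic builtins. The sketch_* names are hypothetical and the relax hint is reduced to a compiler barrier; this is not the code in include/qemu/thread.h.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool value; /* true while the lock is held */
} SketchSpin;

/* Placeholder for a cpu_relax()-style hint; only a compiler barrier here. */
static inline void sketch_cpu_relax(void)
{
    __asm__ __volatile__("" ::: "memory");
}

/* TAS first: the uncontended case costs a single atomic operation.
 * Only after a failed attempt do we fall back to read-only spinning. */
static inline void sketch_spin_lock(SketchSpin *s)
{
    while (__atomic_test_and_set(&s->value, __ATOMIC_ACQUIRE)) {
        while (__atomic_load_n(&s->value, __ATOMIC_RELAXED)) {
            sketch_cpu_relax();
        }
    }
}

/* Plain TATAS: always test with a load before attempting the atomic
 * operation, which adds a load to the uncontended path. */
static inline void sketch_spin_lock_tatas(SketchSpin *s)
{
    for (;;) {
        while (__atomic_load_n(&s->value, __ATOMIC_RELAXED)) {
            sketch_cpu_relax();
        }
        if (!__atomic_test_and_set(&s->value, __ATOMIC_ACQUIRE)) {
            return;
        }
    }
}

static inline bool sketch_spin_locked(SketchSpin *s)
{
    return __atomic_load_n(&s->value, __ATOMIC_RELAXED);
}

static inline void sketch_spin_unlock(SketchSpin *s)
{
    __atomic_clear(&s->value, __ATOMIC_RELEASE);
}
```

Both variants spin on a plain load under contention; they differ only in whether the very first attempt is the atomic operation or a load.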
Signed-off-by: Emilio G. Cota
---
include/qemu/thread.h | 35 +++
1 file changed, 35 insertions(+)
diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index bdae6df..c5d71cf 100644
t do not translate as many blocks (600K+ for debian-jessie arm
bootup). This is dealt with later in this series.
Reviewed-by: Richard Henderson
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
cpu-exec.c | 4 ++--
include/exec/tb-hash.h | 8 ++--
translate-all.c
dist, 39);
[...]
qdist_inc(&dist, 80);
char *str = qdist_pr(&dist, 9, QDIST_PR_LABELS);
// -> [39.0,43.6)▂▂ █▂ ▂ ▄[75.4,80.0]
g_free(str);
char *str = qdist_pr(&dist, 4, QDIST_PR_LABELS);
// -> [39.0,49.2)▁█▁▁[69.8,80.0]
g_free(str);
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
tests/.gitignore | 1 +
tests/Makefile | 6 +-
tests/test-qdist.c | 369 +
3 files changed, 375 insertions(+), 1 deletion(-)
create mode 100644 tests/test-qdist.c
those changes arrive we can already
benefit from the single-threaded speedup that qht also provides.
Signed-off-by: Emilio G. Cota
---
include/qemu/qht.h | 183
util/Makefile.objs | 1 +
util/qht.c | 837 +
3 files changed
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
tests/.gitignore | 1 +
tests/Makefile | 6 ++-
tests/test-qht.c | 159 +++
3 files changed, 165 insertions(+), 1 deletion(-)
create mode 100644
Reviewed-by: Richard Henderson
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
include/qemu/compiler.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 8f1cc7b..b64f899 100644
--- a/include/qemu/compiler.h
+++ b/include
e.g. most bootup code is
thrown away after boot); it makes sense to grow the hash table as
more code blocks are translated. This also avoids the complication of
having to build downsizing hysteresis logic into qht.
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G.
Number of threads
Signed-off-by: Emilio G. Cota
---
tests/.gitignore | 1 +
tests/Makefile| 3 +-
tests/qht-bench.c | 474 ++
3 files changed, 477 insertions(+), 1 deletion(-)
create mode 100644 tests/qht-bench.c
diff --g
v5: https://lists.gnu.org/archive/html/qemu-devel/2016-05/msg02366.html
v6 applies cleanly on top of tcg-next (8b1fe3f4 "cpu-exec:
Clean up 'interrupt_request' reloading", tagged "pull-tcg-20160512").
Changes from v5, mostly from Sergey's review:
- processor.h: use #ifdef #elif throughout the fi
Signed-off-by: Emilio G. Cota
---
tests/.gitignore | 1 +
tests/Makefile | 5 -
tests/test-qht-par.c | 56
3 files changed, 61 insertions(+), 1 deletion(-)
create mode 100644 tests/test-qht-par.c
diff --git a/tests/.gitignore
On Wed, May 25, 2016 at 14:16:56 +0200, Paolo Bonzini wrote:
> On 24/05/2016 22:06, Emilio G. Cota wrote:
> > For correctness, smp_read_barrier_depends() is only required to
> > emit a barrier on Alpha hosts. However, we are currently emitting
> > a consume fence unconditio
r only; QEMU provides all of inc, dec, and, sub, and, or.
Not clear whether the last 'and' is redundant or is being used as
a conjunction. Either way it would be clearer to just remove it.
Other than that:
Reviewed-by: Emilio G. Cota
Emilio
On Fri, May 27, 2016 at 23:53:01 +0300, Sergey Fedorov wrote:
> On 25/05/16 04:13, Emilio G. Cota wrote:
> > Taken from the linux kernel.
> >
> > Reviewed-by: Richard Henderson
> > Reviewed-by: Alex Bennée
> > Signed-off-by: Emilio G. Cota
> >
On Tue, May 31, 2016 at 16:12:32 +0100, Alex Bennée wrote:
> Emilio G. Cota writes:
> > This serves as a performance benchmark as well as a stress test
> > for QHT. We can tweak quite a number of things, including the
> > number of resize threads and how frequently
On Sun, May 29, 2016 at 22:52:27 +0300, Sergey Fedorov wrote:
> I was just wondering if it could be worthwhile to pass a hash function
> when initializing a QHT. Then we could have variants of qht_insert(),
> qht_remove() and qht_lookup() which do not require a computed hash
> value but call the
On Sun, May 29, 2016 at 22:52:27 +0300, Sergey Fedorov wrote:
> > +/**
> > + * qht_remove - remove a pointer from the hash table
> > + * @ht: QHT to remove from
> > + * @p: pointer to be removed
> > + * @hash: hash corresponding to @p
> > + *
> > + * Attempting to remove a NULL @p is a bug.
> > + *
On Sun, May 29, 2016 at 23:53:42 +0300, Sergey Fedorov wrote:
> On 25/05/16 04:13, Emilio G. Cota wrote:
> (snip)
> > +
> > +#define TEST_QHT_STRING "tests/qht-bench 1>/dev/null 2>&1 -R -S0.1 -D1 -N1"
> > +
> > +static vo
On Sun, May 29, 2016 at 23:45:23 +0300, Sergey Fedorov wrote:
> On 25/05/16 04:13, Emilio G. Cota wrote:
> > diff --git a/tests/qht-bench.c b/tests/qht-bench.c
> > new file mode 100644
> > index 000..30d27c8
> > --- /dev/null
> > +++ b/tests/qht-benc
On Sat, May 28, 2016 at 21:15:06 +0300, Sergey Fedorov wrote:
> On 25/05/16 04:13, Emilio G. Cota wrote:
> (snip)
> > +double qdist_avg(const struct qdist *dist)
> > +{
> > +unsigned long count;
> > +size_t i;
> > +double ret = 0;
> >
On Fri, Jun 03, 2016 at 20:46:07 +0300, Sergey Fedorov wrote:
> On 03/06/16 20:29, Sergey Fedorov wrote:
> > On 03/06/16 20:22, Emilio G. Cota wrote:
> >> On Sat, May 28, 2016 at 21:15:06 +0300, Sergey Fedorov wrote:
> >>> On 25/05/16 04:13, Emilio G. Cota wrot
On Sat, May 28, 2016 at 21:15:06 +0300, Sergey Fedorov wrote:
> On 25/05/16 04:13, Emilio G. Cota wrote:
> > diff --git a/util/qdist.c b/util/qdist.c
> > new file mode 100644
> > index 000..3343640
> > --- /dev/null
> > +++ b/util/qdist.c
> > @@
On Tue, Jun 07, 2016 at 17:06:16 +0300, Sergey Fedorov wrote:
> On 07/06/16 02:40, Emilio G. Cota wrote:
> > On Fri, Jun 03, 2016 at 20:46:07 +0300, Sergey Fedorov wrote:
> >> Maybe something like
> >> https://en.wikipedia.org/wiki/Kahan_summation_algorithm could
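
For reference, the Kahan (compensated) summation mentioned in the link can be sketched as follows. The function name is hypothetical and this is not the code in util/qdist.c; note that the compensation only works if the compiler is not allowed to reassociate floating-point operations (so no -ffast-math).

```c
#include <assert.h>
#include <stddef.h>

/* Sum n doubles while carrying a running compensation term that
 * recovers the low-order bits lost in each addition. */
static double sketch_kahan_sum(const double *v, size_t n)
{
    double sum = 0.0;
    double comp = 0.0; /* accumulated rounding error */
    size_t i;

    for (i = 0; i < n; i++) {
        double y = v[i] - comp; /* apply the correction to the next term */
        double t = sum + y;     /* low-order bits of y may be lost here */

        comp = (t - sum) - y;   /* recover what was lost */
        sum = t;
    }
    return sum;
}
```

An average would then be this sum divided by the number of samples; the extra cost over a naive loop is three floating-point operations per element.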
On Tue, Jun 07, 2016 at 18:56:48 +0300, Sergey Fedorov wrote:
> On 07/06/16 04:05, Emilio G. Cota wrote:
> > On Sat, May 28, 2016 at 21:15:06 +0300, Sergey Fedorov wrote:
> >> On 25/05/16 04:13, Emilio G. Cota wrote:
> >>> diff --git a/util/qdist.c b/util/qdis
On Wed, Jun 08, 2016 at 07:25:33 +0100, Alex Bennée wrote:
> Emilio G. Cota writes:
>
> > v5: https://lists.gnu.org/archive/html/qemu-devel/2016-05/msg02366.html
> >
> > v6 applies cleanly on top of tcg-next (8b1fe3f4 "cpu-exec:
> > Clean up 'interru
On Wed, Jun 08, 2016 at 17:10:03 +0300, Sergey Fedorov wrote:
> On 08/06/16 03:02, Emilio G. Cota wrote:
> > -dist->entries = g_realloc(dist->entries,
> > - sizeof(*dist->entries) * (dist->n + 1));
> > +if (unlikely(dist->n
Acked-by: Sergey Fedorov
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
tests/.gitignore | 1 +
tests/Makefile | 6 ++-
tests/test-qht.c | 159 +++
3 files changed, 165 insertions(+), 1
This will be used by upcoming changes for hashing the tb hash.
Add this into a separate file to include the copyright notice from
xxhash.
Reviewed-by: Sergey Fedorov
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
include/exec/tb-hash-xx.h | 94
Reviewed-by: Sergey Fedorov
Reviewed-by: Richard Henderson
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
include/qemu/compiler.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 8f1cc7b..b64f899 100644
--- a/include
t do not translate as many blocks (600K+ for debian-jessie arm
bootup). This is dealt with later in this series.
Reviewed-by: Sergey Fedorov
Reviewed-by: Richard Henderson
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
cpu-exec.c | 4 ++--
include/exec/tb-hash.h |
Signed-off-by: Emilio G. Cota
---
tests/.gitignore | 1 +
tests/Makefile | 5 -
tests/test-qht-par.c | 56
3 files changed, 61 insertions(+), 1 deletion(-)
create mode 100644 tests/test-qht-par.c
diff --git a/tests/.gitignore
Acked-by: Sergey Fedorov
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
tests/.gitignore | 1 +
tests/Makefile | 6 +-
tests/test-qdist.c | 384 +
3 files changed, 390 insertions(+), 1 deletion(-)
create mode
It is a more appropriate name, now that the mutex embedded
in the seqlock is gone.
Reviewed-by: Sergey Fedorov
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
cpus.c | 28 ++--
include/qemu/seqlock.h | 4
e.g. most bootup code is
thrown away after boot); it makes sense to grow the hash table as
more code blocks are translated. This also avoids the complication of
having to build downsizing hysteresis logic into qht.
Reviewed-by: Sergey Fedorov
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henders
dist, 39);
[...]
qdist_inc(&dist, 80);
char *str = qdist_pr(&dist, 9, QDIST_PR_LABELS);
// -> [39.0,43.6)▂▂ █▂ ▂ ▄[75.4,80.0]
g_free(str);
char *str = qdist_pr(&dist, 4, QDIST_PR_LABELS);
// -> [39.0,49.2)▁█▁▁[69.8,80.0]
g_free(str);
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
Taken from the linux kernel.
Reviewed-by: Sergey Fedorov
Reviewed-by: Richard Henderson
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
include/qemu/processor.h | 30 ++
1 file changed, 30 insertions(+)
create mode 100644 include/qemu/processor.h
diff
those changes arrive we can already
benefit from the single-threaded speedup that qht also provides.
Signed-off-by: Emilio G. Cota
---
include/qemu/qht.h | 183
util/Makefile.objs | 1 +
util/qht.c | 833 +
3 files changed
of threads
Signed-off-by: Emilio G. Cota
---
tests/.gitignore | 1 +
tests/Makefile| 3 +-
tests/qht-bench.c | 488 ++
3 files changed, 491 insertions(+), 1 deletion(-)
create mode 100644 tests/qht-bench.c
diff --git a/tests/.gitigno
imize for uncontended locks by acquiring the lock with TAS instead
of TATAS; add qemu_spin_locked().]
Signed-off-by: Emilio G. Cota
---
include/qemu/thread.h | 35 +++
1 file changed, 35 insertions(+)
diff --git a/include/qemu/thread.h b/include/qemu/thread.h
:
[15.0,16.7)|▁▂▅▄█▅|[30.3,32.0]
Acked-by: Sergey Fedorov
Suggested-by: Richard Henderson
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
translate-all.c | 36
1 file changed, 36 insertions(+)
diff --git a/translate-all.c b/translate-all.c
v6 on qemu-devel:
https://lists.gnu.org/archive/html/qemu-devel/2016-05/msg04251.html
All changes in this iteration come from comments by Sergey
unless otherwise noted. All patches are checkpatch-clean
once false positives are taken into account.
Changes from v6:
- Add reviewed-by tags from v6
This option is unused; besides, it bloats the struct when not needed.
Let's just let writers define their own locks elsewhere.
Reviewed-by: Sergey Fedorov
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
cpus.c | 2 +-
include
On Fri, Jun 10, 2016 at 16:33:10 +0100, Peter Maydell wrote:
> Fails to build on ppc64be :-(
>
> In file included from /home/pm215/qemu/include/qemu/thread.h:4:0,
> from /home/pm215/qemu/include/block/aio.h:20,
> from /home/pm215/qemu/include/block/block.h:4,
>
On Fri, Jun 10, 2016 at 17:41:26 +0100, Peter Maydell wrote:
> On 10 June 2016 at 17:34, Emilio G. Cota wrote:
> > On Fri, Jun 10, 2016 at 16:33:10 +0100, Peter Maydell wrote:
> >> Fails to build on ppc64be :-(
> >>
> >> In file included from /home/
It is a more appropriate name, now that the mutex embedded
in the seqlock is gone.
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
cpus.c | 28 ++--
include/qemu/seqlock.h | 4 ++--
2 files changed, 16
This option is unused; besides, it bloats the struct when not needed.
Let's just let writers define their own locks elsewhere.
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
cpus.c | 2 +-
include/qemu/seqlock.h | 10 +---
h TAS instead of TATAS; add qemu_spin_locked().]
Signed-off-by: Emilio G. Cota
---
include/qemu/thread.h | 39 +++
1 file changed, 39 insertions(+)
diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index bdae6df..4b74ee5 100644
--- a/include/qemu/thr
Reviewed-by: Richard Henderson
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
include/qemu/compiler.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 8f1cc7b..b64f899 100644
--- a/include/qemu/compiler.h
+++ b/include
This will be used by upcoming changes for hashing the tb hash.
Add this into a separate file to include the copyright notice from
xxhash.
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
include/exec/tb-hash-xx.h | 94 +++
1 file
This new helper expands to __atomic_test_and_set with acquire semantics
where available; otherwise it expands to __sync_lock_test_and_set, which
has acquire semantics.
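
A minimal sketch of such a helper, assuming GCC-style builtins: the macro name is hypothetical, and the fallback uses __sync_lock_test_and_set, GCC's legacy builtin that is documented as an acquire barrier. This is not the definition from include/qemu/atomic.h.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__ATOMIC_ACQUIRE)
/* Preferred path: returns true if the flag was already set. */
#define sketch_test_and_set_acquire(ptr) \
    __atomic_test_and_set((ptr), __ATOMIC_ACQUIRE)
#else
/* Legacy path: returns the previous value; normalize it to 0 or 1. */
#define sketch_test_and_set_acquire(ptr) \
    (__sync_lock_test_and_set((ptr), 1) != 0)
#endif

/* GCC documents __atomic_test_and_set for bool or char objects only,
 * so the sketch keeps the flag in a char. */
static char sketch_flag;

/* Returns true if the caller managed to set the flag first. */
static bool sketch_try_acquire(void)
{
    return !sketch_test_and_set_acquire(&sketch_flag);
}

static void sketch_release(void)
{
    __atomic_clear(&sketch_flag, __ATOMIC_RELEASE);
}
```

Only the first caller sees the flag unset; everyone else gets a true return from the macro until the flag is cleared again.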
Signed-off-by: Emilio G. Cota
---
include/qemu/atomic.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/include/qemu/atomic.h
This patchset applies on top of tcg-next (8b1fe3f4 "cpu-exec:
Clean up 'interrupt_request' reloading", tagged "pull-tcg-20160512").
For reference, here is v4:
https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg04670.html
Changes from v4:
- atomics.h:
+ Add atomic_read_acquire and atomi
t do not translate as many blocks (600K+ for debian-jessie arm
bootup). This is dealt with later in this series.
Reviewed-by: Richard Henderson
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
cpu-exec.c | 4 ++--
include/exec/tb-hash.h | 8 ++--
translate-all.c
When __atomic is not available, we use full memory barriers instead
of smp/wmb, since acquire/release barriers apply to all memory
operations and not just to loads/stores, respectively.
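
The fallback described above can be sketched like this. The macro names are hypothetical, and the legacy branch relies on GCC statement expressions and __sync_synchronize() as the full barrier; it is an illustration, not the patch itself.

```c
#include <assert.h>

#if defined(__ATOMIC_ACQUIRE)
/* Native acquire load and release store. */
#define sketch_read_acquire(ptr) \
    __atomic_load_n((ptr), __ATOMIC_ACQUIRE)
#define sketch_set_release(ptr, val) \
    __atomic_store_n((ptr), (val), __ATOMIC_RELEASE)
#else
/* No __atomic builtins: a read or write barrier alone would order only
 * loads or only stores, so use a full barrier on the proper side. */
#define sketch_read_acquire(ptr)                                        \
    ({                                                                  \
        __typeof__(*(ptr)) _v = *(volatile __typeof__(*(ptr)) *)(ptr);  \
        __sync_synchronize(); /* later accesses stay after the load */  \
        _v;                                                             \
    })
#define sketch_set_release(ptr, val)                                    \
    do {                                                                \
        __sync_synchronize(); /* earlier accesses stay before store */  \
        *(volatile __typeof__(*(ptr)) *)(ptr) = (val);                  \
    } while (0)
#endif

static int sketch_ready; /* hypothetical flag published with release */

static void sketch_publish(int v)
{
    sketch_set_release(&sketch_ready, v);
}

static int sketch_observe(void)
{
    return sketch_read_acquire(&sketch_ready);
}
```

The full barrier is stronger than strictly needed, which is the price of staying correct on compilers that lack the __atomic builtins.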
Signed-off-by: Emilio G. Cota
---
include/qemu/atomic.h | 27 +++
1 file changed, 27
Signed-off-by: Emilio G. Cota
---
tests/.gitignore | 1 +
tests/Makefile | 5 -
tests/test-qht-par.c | 56
3 files changed, 61 insertions(+), 1 deletion(-)
create mode 100644 tests/test-qht-par.c
diff --git a/tests/.gitignore
Taken from the linux kernel.
Reviewed-by: Richard Henderson
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
---
include/qemu/processor.h | 34 ++
1 file changed, 34 insertions(+)
create mode 100644 include/qemu/processor.h
diff --git a/include/qemu
:
[15.0,16.7)|▁▂▅▄█▅|[30.3,32.0]
Suggested-by: Richard Henderson
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
translate-all.c | 36
1 file changed, 36 insertions(+)
diff --git a/translate-all.c b/translate-all.c
index 5357737..c8074cf 100644
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
tests/.gitignore | 1 +
tests/Makefile | 6 +-
tests/test-qdist.c | 369 +
3 files changed, 375 insertions(+), 1 deletion(-)
create mode 100644 tests/test-qdist.c
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
tests/.gitignore | 1 +
tests/Makefile | 6 ++-
tests/test-qht.c | 159 +++
3 files changed, 165 insertions(+), 1 deletion(-)
create mode 100644
e.g. most bootup code is
thrown away after boot); it makes sense to grow the hash table as
more code blocks are translated. This also avoids the complication of
having to build downsizing hysteresis logic into qht.
Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G.
() check after
acquiring the lock of the bucket they operate on.
Signed-off-by: Emilio G. Cota
---
include/qemu/qht.h | 10 +-
util/qht.c | 276 -
2 files changed, 215 insertions(+), 71 deletions(-)
diff --git a/include/qemu/qht.h b
dist, 39);
[...]
qdist_inc(&dist, 80);
char *str = qdist_pr(&dist, 9, QDIST_PR_LABELS);
// -> [39.0,43.6)▂▂ █▂ ▂ ▄[75.4,80.0]
g_free(str);
char *str = qdist_pr(&dist, 4, QDIST_PR_LABELS);
// -> [39.0,49.2)▁█▁▁[69.8,80.0]
g_free(str);
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
Number of threads
Signed-off-by: Emilio G. Cota
---
tests/.gitignore | 1 +
tests/Makefile| 3 +-
tests/qht-bench.c | 473 ++
3 files changed, 476 insertions(+), 1 deletion(-)
create mode 100644 tests/qht-bench.c
diff --g
those changes arrive we can already
benefit from the single-threaded speedup that qht also provides.
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
include/qemu/qht.h | 66 +
util/Makefile.objs | 1 +
util/qht.c | 703
On Sun, May 15, 2016 at 06:22:36 -0400, Pranith Kumar wrote:
> On Fri, May 13, 2016 at 11:34 PM, Emilio G. Cota wrote:
> > When __atomic is not available, we use full memory barriers instead
> > of smp/wmb, since acquire/release barriers apply to all memory
> > operations
On Tue, May 17, 2016 at 23:20:11 +0300, Sergey Fedorov wrote:
> On 17/05/16 23:04, Emilio G. Cota wrote:
(snip)
> > +/*
> > + * We might be tempted to use __atomic_test_and_set with __ATOMIC_ACQUIRE;
> > + * however, the documentation explicitly says that we should only pass
On Tue, May 17, 2016 at 23:35:57 +0300, Sergey Fedorov wrote:
> On 17/05/16 22:38, Emilio G. Cota wrote:
> > On Tue, May 17, 2016 at 20:13:24 +0300, Sergey Fedorov wrote:
> >> On 14/05/16 06:34, Emilio G. Cota wrote:
> (snip)
> >>> +
On Tue, May 17, 2016 at 20:47:52 +0300, Sergey Fedorov wrote:
> On 14/05/16 06:34, Emilio G. Cota wrote:
> > For some workloads such as arm bootup, tb_phys_hash is performance-critical.
> > This is due to the high frequency of accesses to the hash table, originated
> > by (
On Tue, May 17, 2016 at 20:13:24 +0300, Sergey Fedorov wrote:
> On 14/05/16 06:34, Emilio G. Cota wrote:
(snip)
> > +static inline void qemu_spin_lock(QemuSpin *spin)
> > +{
> > +while (atomic_test_and_set_acquire(&spin->value)) {
>
> From gcc-4.8
t gets rid of any guesswork (as in my previous email).
I've changed the patch to:
commit 8f89d36b6203b78df2bf1e3f82871b8aa2ca83b7
Author: Emilio G. Cota
Date: Thu Apr 28 10:56:26 2016 -0400
atomics: add atomic_test_and_set_acquire
Signed-off-by: Emilio G. Cota
diff --git
On Tue, May 17, 2016 at 20:22:52 +0300, Sergey Fedorov wrote:
> On 14/05/16 06:34, Emilio G. Cota wrote:
(snip)
> > +static inline
> > +uint32_t tb_hash_func5(uint64_t a0, uint64_t b0, uint32_t e)
> > +{
> > +uint32_t v1 = TB_HASH_XX_SEED + PRIME32_1 + PRIM
pu_relax() while spinning; optimize for uncontended locks by
acquiring the lock with TAS instead of TATAS; add qemu_spin_locked().]
Signed-off-by: Emilio G. Cota
diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index bdae6df..2d225ff 100644
--- a/include/qemu/thread.h
+++ b/include/q
0f173e473c27c5a
Author: Emilio G. Cota
Date: Wed Apr 6 18:21:08 2016 -0400
include/processor.h: define cpu_relax()
Taken from the linux kernel.
Reviewed-by: Richard Henderson
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
diff --git a/include/qemu/processor.
On Wed, May 18, 2016 at 21:21:26 +0300, Sergey Fedorov wrote:
> On 14/05/16 06:34, Emilio G. Cota wrote:
> > +static inline int qemu_spin_trylock(QemuSpin *spin)
> > +{
> > +if (atomic_test_and_set_acquire(&spin->value)) {
> > +return -EBUSY;
>
>
On Wed, May 18, 2016 at 22:51:09 +0300, Sergey Fedorov wrote:
> On 14/05/16 06:34, Emilio G. Cota wrote:
> > +static inline void qemu_spin_lock(QemuSpin *spin)
> > +{
> > +while (atomic_test_and_set_acquire(&spin->value)) {
>
> A possible optimization might
Hi Sergey,
Any 'Ack' below means the change has made it to my tree.
On Sat, May 21, 2016 at 01:13:20 +0300, Sergey Fedorov wrote:
> > +#include "qemu/osdep.h"
> > +#include "qemu-common.h"
>
> There's no need for qemu-common.h
Ack
> > +#include "qemu/seqlock.h"
> > +#include "qemu/qdist.h"
> >
On Fri, May 20, 2016 at 22:48:11 -0400, Emilio G. Cota wrote:
> On Sat, May 21, 2016 at 01:13:20 +0300, Sergey Fedorov wrote:
> > > +static inline
> > > +void *qht_do_lookup(struct qht_bucket *head, qht_lookup_func_t func,
> > > +co
for tsan,
though.
Signed-off-by: Emilio G. Cota
---
include/qemu/atomic.h | 99 ++-
1 file changed, 43 insertions(+), 56 deletions(-)
diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h
index 5bc4d6c..4d62425 100644
--- a/include/qemu/at
Patch 1 fixes a small performance regression introduced when moving
our atomics to __atomic primitives. The regression can be measured
on RMO architectures (I used aarch64); the effect is very small
but consistently measurable: for instance, rcutorture performance
degraded by about 0.3%.
Patch 2 o
tions pertaining to the variable they apply to; this, however,
has no effect on surrounding statements like barriers do. For more
details on this, see:
https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html
Signed-off-by: Emilio G. Cota
---
docs/atomics.txt | 16 +---
1 file changed,
On Sun, May 22, 2016 at 09:01:59 +0100, Alex Bennée wrote:
> Emilio G. Cota writes:
> > A small update: I just got rid of all the atomic_read/set's that
> > apply to the hashes, since retries will take care of possible races.
>
> I guess the potential hash-clash from a
On Mon, May 23, 2016 at 14:57:12 +0200, Claudio Fontana wrote:
> Hi, at some point in the past there was a set of performance benchmarks
> which were showing the improvements using mttcg, is there some update
> on that?
Any scalable parallel workload should do.
I've used the C/C++ benchmarks in s
On Mon, May 23, 2016 at 11:57:26 +0100, Alex Bennée wrote:
> Emilio, is there anything you want to add? I've been following the QHT
> stuff which is a really positive addition which my v3 base patches are
> based upon (making the hot-path non lock contended). Do you have
> anything in the works abov
On Mon, May 23, 2016 at 16:21:36 +0200, Paolo Bonzini wrote:
> On 21/05/2016 22:42, Emilio G. Cota wrote:
> > Commit a0aa44b4 ("include/qemu/atomic.h: default to __atomic functions")
> > set all atomics to default (on recent GCC versions) to __atomic primitives.
>
On Mon, May 23, 2016 at 09:53:00 -0700, Richard Henderson wrote:
> On 05/21/2016 01:42 PM, Emilio G. Cota wrote:
> >In the process, the atomic_rcu_read/set were converted to implement
> >consume/release semantics, respectively. This is inefficient; for
> >correctness and m