On Wed, Aug 15, 2018 at 11:09:42 +0800, Fam Zheng wrote:
> On Mon, 08/13 13:11, Emilio G. Cota wrote:
> > +  --enable-sync-profiler) sync_profiler="yes"
> > +  ;;
>
> Curious, not asking for a change: can this be made a runtime option instead of
> compile time, since there's no library dependencies? That should make this
> somewhat easier to use.

Good point. I'll do some profiling tomorrow to see how the latency of
the locking primitives could be minimized (ideally, not using the
profiler should just add a well-predicted branch).

> > +
> > +#define QSP_GEN_VOID(type_, qsp_t_, func_, impl_)                       \
> > +    void func_(type_ *obj, const char *file, unsigned line)             \
> > +    {                                                                   \
> > +        struct qsp_entry *e = qsp_entry_get(obj, file, line, qsp_t_);   \
> > +        int64_t t;                                                      \
> > +                                                                        \
>
> No qsp_init()?
>
> > +        t = get_clock();                                                \
> > +        impl_(obj, file, line);                                         \
> > +        atomic_set(&e->ns, e->ns + get_clock() - t);                    \
> > +        atomic_set(&e->n_acqs, e->n_acqs + 1);                          \
> > +    }
> > +
> > +#define QSP_GEN_RET1(type_, qsp_t_, func_, impl_)                       \
> > +    int func_(type_ *obj, const char *file, unsigned line)              \
> > +    {                                                                   \
> > +        struct qsp_entry *e = qsp_entry_get(obj, file, line, qsp_t_);   \
> > +        int64_t t;                                                      \
> > +        int err;                                                        \
> > +                                                                        \
>
> Same here.

qsp_init is called by qsp_entry_get.

(snip)
> > +void qsp_cond_wait(QemuCond *cond, QemuMutex *mutex, const char *file,
> > +                   unsigned line)
> > +{
> > +    struct qsp_entry *e;
> > +    int64_t t;
> > +
> > +    qsp_init();
> > +
> > +    e = qsp_entry_get(cond, file, line, QSP_CONDVAR);
> > +    t = get_clock();
> > +    qemu_cond_wait_impl(cond, mutex, file, line);
> > +    atomic_set(&e->ns, e->ns + get_clock() - t);
> > +    atomic_set(&e->n_acqs, e->n_acqs + 1);
>
> Why not atomic_add (both here and in above macros)? Because fetching e->ns and
> then updating it is not "atomic" this way.

This isn't a read-modify-write op; atomic_set is used here as
"write_once". Note that struct qsp_entry is only ever modified by
the current thread (thread_ptr is part of the struct; yes, this uses
a lot more memory, but that's the price of scalability). The struct
might be read at any time by other threads though, so we have to use
atomic_set to avoid undefined behaviour (e.g. torn reads/writes).

Thanks,

		Emilio