On Mon, Dec 30, 2013 at 03:05:56PM -0800, Pravin Shelar wrote:
> On Mon, Dec 30, 2013 at 2:44 PM, Ben Pfaff <b...@nicira.com> wrote:
> > On Mon, Dec 30, 2013 at 02:37:38PM -0800, Pravin Shelar wrote:
> >> On Mon, Dec 30, 2013 at 1:34 PM, Ben Pfaff <b...@nicira.com> wrote:
> >> > On Mon, Dec 30, 2013 at 11:10:22AM -0800, Pravin Shelar wrote:
> >> >> On Mon, Dec 30, 2013 at 10:49 AM, Ben Pfaff <b...@nicira.com> wrote:
> >> >> > On Mon, Dec 30, 2013 at 10:40:13AM -0800, Pravin Shelar wrote:
> >> >> >> On Fri, Dec 27, 2013 at 8:03 PM, Ben Pfaff <b...@nicira.com> wrote:
> >> >> >> > ovsthread_counter is an abstract interface that could be
> >> >> >> > implemented different ways.  The initial implementation is
> >> >> >> > simple but less than optimally efficient.
> >> >> >> >
> >> >> >> > Signed-off-by: Ben Pfaff <b...@nicira.com>
> >> >> >> >
> >> >> >> > +void
> >> >> >> > +ovsthread_counter_inc(struct ovsthread_counter *c, unsigned long long int n)
> >> >> >> > +{
> >> >> >> > +    c = &c[hash_int(ovsthread_id_self(), 0) % N_COUNTERS];
> >> >> >> > +
> >> >> >>
> >> >> >> Does it make sense to optimize this locking so that threads
> >> >> >> running on the same NUMA node likely share a lock?  We can use
> >> >> >> processor id hashing to achieve it easily.
> >> >> >
> >> >> > Yes, that makes a lot of sense.  How do we do it?
> >> >>
> >> >> Use the processor id (sched_getcpu()) to hash it.  In case
> >> >> sched_getcpu() is not available, we can read the thread affinity
> >> >> using sched_getaffinity() and return the assigned CPU; in a properly
> >> >> optimized environment we can assume that a thread would be pinned to
> >> >> one CPU only.  But I am not sure how to do this on platforms other
> >> >> than Linux.
> >> >
> >> > That's reasonable.
> >> >
> >> > But, on second thought, I am not sure of the benefit from threads on
> >> > the same node sharing a lock.  I see that there are benefits from
> >> > threads on different nodes having different locks, but I'm not sure
> >> > that using only one lock on a single node really saves anything.
> >> > What do you think?
> >>
> >> Then how about having a per-CPU lock?
> >
> > That would be ideal, but how would I do it?
>
> Create an array of ovsthread_counter__ for all possible CPUs, so that
> N_COUNTERS would be a variable equal to the number of possible CPUs.
> To increment, use the counter at index sched_getcpu().
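[For readers following the thread: below is a minimal, self-contained
sketch of the striped counter that the quoted +ovsthread_counter_inc()
hunk implies.  The N_COUNTERS value, the mutex-per-bucket layout, and
the ovsthread_counter_create()/ovsthread_counter_read() functions are
assumptions filled in around the one quoted line, and the real OVS
helpers hash_int() and ovsthread_id_self() are replaced by a crude
thread-local placeholder so the sketch compiles on its own.]

/* Sketch only: a striped counter where each thread hashes to one of
 * N_COUNTERS mutex-protected buckets. */
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define N_COUNTERS 16                   /* Assumed stripe count. */

struct ovsthread_counter {
    pthread_mutex_t mutex;
    unsigned long long int value;
};

/* Crude stand-in for hash_int(ovsthread_id_self(), 0): derive a bucket
 * from the address of a thread-local variable (GCC/Clang __thread). */
static unsigned int
thread_bucket(void)
{
    static __thread char slot;
    return (unsigned int) ((uintptr_t) &slot >> 6) % N_COUNTERS;
}

struct ovsthread_counter *
ovsthread_counter_create(void)
{
    struct ovsthread_counter *c = calloc(N_COUNTERS, sizeof *c);
    for (int i = 0; i < N_COUNTERS; i++) {
        pthread_mutex_init(&c[i].mutex, NULL);
    }
    return c;
}

void
ovsthread_counter_inc(struct ovsthread_counter *c, unsigned long long int n)
{
    c = &c[thread_bucket()];            /* Pick this thread's stripe. */

    pthread_mutex_lock(&c->mutex);
    c->value += n;
    pthread_mutex_unlock(&c->mutex);
}

/* Reading the total means visiting every stripe, locking each in turn. */
unsigned long long int
ovsthread_counter_read(struct ovsthread_counter *c)
{
    unsigned long long int total = 0;
    for (int i = 0; i < N_COUNTERS; i++) {
        pthread_mutex_lock(&c[i].mutex);
        total += c[i].value;
        pthread_mutex_unlock(&c[i].mutex);
    }
    return total;
}

[The point of the striping is that concurrent threads usually land on
different mutexes, while a reader still gets an exact total by visiting
every stripe.]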
You make it sound easy ;-)

I think that's harder than it sounds.  "All possible CPUs" is hard to
figure out (is it even possible from userspace?), especially since the
number of CPUs can increase or decrease.  I don't know of a guarantee
that CPUs are numbered consecutively from 0 (or from 1?), so we'd
probably need a hash table instead of an array.  We'd probably need a
mutex anyway, because there's no guarantee that the process doesn't get
migrated between CPUs while running this code.  And we'd still need a
fallback for non-Linux platforms.

Per-thread would be easier.  I'd have just used a per-thread variable
here, except that POSIX doesn't provide a way to get access to other
threads' data, which makes it hard to compute the sum.

In the end, my goal here is to provide a useful abstraction and a
simple implementation that is likely to be correct.  Then, later, if
profiling shows that this is an actual bottleneck, we can optimize it.
But I don't want to spend a bunch of time optimizing code that I don't
even know is important.
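[The per-thread alternative Ben mentions is worth unpacking:
pthread_getspecific() only returns the calling thread's own value, so
plain thread-specific data gives a reader no way to walk every thread's
counter and total them.  A common workaround, sketched below, is to
register each thread's private counter on a global, mutex-protected
list.  This is explicitly not what the patch does, and every name in
the sketch is hypothetical.]

/* Hypothetical sketch: per-thread counters linked into a global
 * registry so that a reader can still compute the sum. */
#include <pthread.h>
#include <stdlib.h>

struct per_thread_counter {
    struct per_thread_counter *next;    /* Protected by registry_mutex. */
    unsigned long long int value;       /* Written only by its owner. */
};

static pthread_mutex_t registry_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct per_thread_counter *registry;     /* All threads' counters. */
static __thread struct per_thread_counter *my_counter;

static struct per_thread_counter *
get_my_counter(void)
{
    if (!my_counter) {
        my_counter = calloc(1, sizeof *my_counter);
        pthread_mutex_lock(&registry_mutex);
        my_counter->next = registry;
        registry = my_counter;
        pthread_mutex_unlock(&registry_mutex);
    }
    return my_counter;
}

void
counter_inc(unsigned long long int n)
{
    get_my_counter()->value += n;       /* Uncontended fast path, no lock. */
}

unsigned long long int
counter_read(void)
{
    unsigned long long int total = 0;

    pthread_mutex_lock(&registry_mutex);
    for (struct per_thread_counter *c = registry; c; c = c->next) {
        /* A real implementation would read 'value' atomically or take a
         * per-counter lock; a plain read keeps the sketch short. */
        total += c->value;
    }
    pthread_mutex_unlock(&registry_mutex);
    return total;
}

[The fast path avoids locking entirely, but the sketch glosses over real
problems: the plain read in counter_read() is formally a data race, and
counters belonging to exited threads are never reclaimed, which is part
of why the simple striped version is attractive as a first cut.]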