Krishna Yenduri wrote:
> Hi All,
> 
>  I have a user-level benchmark that does
>  for (i = 0; i < nthreads; i++)
>       (void) thr_create(NULL, 0, testaes, (void *)0,
>                             THR_NEW_LWP, &tid);
> 
>  I found that running this benchmark with nthreads == ncpus
>  schedules each thread to a separate CPU. The system is a Niagara 2
>  with 128 CPUs/strands.
> 
>  However, for a kernel module/benchmark that does
>  for (i = 0; i < nthreads; i++)
>     (void) thread_create(NULL, 0, &process_aes, (void *)i, 0, &p0,
>                                 TS_RUN, minclsyspri);
> 
>  the scheduling is very uneven: CPUs 64-127 did not have any
>  thread scheduled on them at all, and the distribution among
>  CPUs 0-63 is also uneven.
> 
>  I assume the thread scheduling behavior is different for system threads
>  which do not have an LWP. But is this not suboptimal? Is the
>  assumption that kernel subsystems that need to use a large number
>  of threads do their own CPU binding/scheduling to ensure an even distribution?
> 
> Thanks,
> -Krishna

Keep in mind the differences between LWPs and kernel threads, especially on
NUMA (MPO) platforms.  Note that lgrp_choose() isn't called for kernel
threads.
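
Roughly speaking, if a kernel module wants an even spread it has to ask for
it itself.  Something along these lines might do -- an untested sketch, not
the actual benchmark code: worker/spawn_workers are made-up names, and it
assumes CPU ids 0..ncpus-1 all exist and are online (real code would
validate that, e.g. under cpu_lock):

#include <sys/types.h>
#include <sys/systm.h>
#include <sys/thread.h>
#include <sys/cpuvar.h>
#include <sys/proc.h>
#include <sys/disp.h>

/* Each worker weakly binds itself to the CPU id passed as its argument. */
static void
worker(void *arg)
{
        processorid_t cpu_id = (processorid_t)(uintptr_t)arg;

        /*
         * Weak binding: keeps this kernel thread on one strand so the
         * workers end up spread across all of them.  Assumes cpu_id is
         * a valid, online CPU.
         */
        thread_affinity_set(curthread, cpu_id);

        /* ... do the AES work here ... */

        thread_affinity_clear(curthread);
        thread_exit();
}

static void
spawn_workers(int nthreads)
{
        int i;

        for (i = 0; i < nthreads; i++)
                (void) thread_create(NULL, 0, worker,
                    (void *)(uintptr_t)(i % ncpus), 0, &p0,
                    TS_RUN, minclsyspri);
}
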

What are you trying to do?

Trying to use all the CPUs in the system at minclsyspri is likely to make
interactive use awkward, to say the least.
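
For comparison, the user-level benchmark already gets the even spread from
the dispatcher, but placement can also be forced explicitly with
processor_bind(2).  A minimal sketch -- bound_testaes/spawn are made-up
wrapper names, it reuses the testaes worker from the original benchmark,
and it assumes CPU ids 0..nthreads-1 are online:

#include <thread.h>
#include <sys/types.h>
#include <sys/processor.h>
#include <sys/procset.h>

extern void *testaes(void *);   /* the worker from the original benchmark */

/* Wrapper: bind the calling LWP to one strand, then run the real worker. */
static void *
bound_testaes(void *arg)
{
        processorid_t cpu_id = (processorid_t)(uintptr_t)arg;

        (void) processor_bind(P_LWPID, P_MYID, cpu_id, NULL);
        return (testaes(arg));
}

static void
spawn(int nthreads)
{
        thread_t tid;
        int i;

        for (i = 0; i < nthreads; i++)
                (void) thr_create(NULL, 0, bound_testaes,
                    (void *)(uintptr_t)i, THR_NEW_LWP, &tid);
}
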

- Bart



-- 
Bart Smaalders                  Solaris Kernel Performance
[EMAIL PROTECTED]               http://blogs.sun.com/barts
"You will contribute more with mercurial than with thunderbird."
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
