Hi,
I think you already said this, but are you using pthread_cond_signal, or
pthread_cond_broadcast to wake up threads?
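
To illustrate the distinction behind this question, here is a minimal sketch
in C. It is not taken from the program under discussion: `struct pool`,
`waiter`, `run_wakeup_demo` and the "tickets" predicate are all hypothetical
names chosen for the example. With pthread_cond_signal the waker issues one
call per work item and each call unblocks at least one waiter; with
pthread_cond_broadcast a single call unblocks every waiter, and all of them
must then reacquire the same mutex, which may matter with thousands of
threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical shared state; not from the original program. */
struct pool {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    int tickets;   /* work items available; this is the wait predicate */
    int finished;  /* waiters that consumed a ticket and exited */
};

static void *waiter(void *arg)
{
    struct pool *p = arg;

    pthread_mutex_lock(&p->lock);
    while (p->tickets == 0)              /* loop guards against spurious wakeups */
        pthread_cond_wait(&p->cv, &p->lock);
    p->tickets--;
    p->finished++;
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

/* Starts nthreads waiters, releases them with either signal or broadcast,
 * and returns the number of wake calls issued. *finished receives the
 * number of waiters that ran to completion. */
int run_wakeup_demo(int nthreads, int use_broadcast, int *finished)
{
    struct pool p;
    pthread_t *tids = malloc(sizeof *tids * (size_t)nthreads);
    int calls = 0;

    assert(tids != NULL);
    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.cv, NULL);
    p.tickets = 0;
    p.finished = 0;

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, waiter, &p);
        assert(rc == 0);
    }

    if (use_broadcast) {
        /* One call: publish all tickets, wake every waiter at once. */
        pthread_mutex_lock(&p.lock);
        p.tickets = nthreads;
        pthread_cond_broadcast(&p.cv);
        pthread_mutex_unlock(&p.lock);
        calls = 1;
    } else {
        /* One call per ticket: each signal wakes at least one waiter. */
        for (int i = 0; i < nthreads; i++) {
            pthread_mutex_lock(&p.lock);
            p.tickets++;
            pthread_cond_signal(&p.cv);
            pthread_mutex_unlock(&p.lock);
            calls++;
        }
    }

    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);

    pthread_cond_destroy(&p.cv);
    pthread_mutex_destroy(&p.lock);
    free(tids);
    *finished = p.finished;
    return calls;
}
```

Calling run_wakeup_demo(8, 0, &n) exercises the signal path and
run_wakeup_demo(8, 1, &n) the broadcast path. Both release all waiters; they
differ in how many wake calls are made and in how many threads contend for
the mutex at the same moment.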

max

Zeljko Vrba wrote:
> On Thu, Apr 10, 2008 at 01:02:34PM -0700, Alexander Kolbasov wrote:
>   
>> It may be useful to observe prstat -mL which will report micro-state 
>> accounting data for each thread. 
>>
>>     
> prstat -mL takes a *lot* of time with 16k threads.  Nevertheless, I have some
> further data: rerunning dtrace gave me the top 10 consumers of CPU time on
> each CPU: the winner is "idle" on CPU0 (671 of 2029 samples; the next
> highest has 114 samples), and again "idle" on CPU1 (776 of 3891 samples;
> the next highest has 106 samples).  Ordinary prstat shows that my process is
> often in the sleep state.
>
> Furthermore, I do not think that the problem lies in TLB thrashing.  Here
> are three different runs:
>
> 2^28 B total block size (256MB), 2^14 B chunk size (= 2^{28-14} = 2^14 threads),
> 2^7 repetitions (= 2^35 B (32 GB) encrypted in total): 33.6 seconds
>
> 2^24 B total block size (16MB), 2^10 B chunk size (= again 2^14 threads),
> 2^7 repetitions (= 2^31 B (2 GB) encrypted in total): 25.4 seconds 
>
> 16MB block size is only twice the TLB capacity (2048 entries x 4kB = 8MB). 
> Lowering the block size to 4MB (half the TLB capacity) gives the following:
>
> 2^22 B total block size (4 MB), 2^8 B chunk size (= 2^14 threads),
> 2^7 repetitions ( = 2^29 B (512 MB) encrypted in total): 24.5 seconds
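
The arithmetic in the three runs above can be checked with a short sketch in
C. The names (`struct run`, `reported_runs`, `chunk_jobs` and so on) are
hypothetical, and `chunk_jobs` assumes that every thread processes its chunk
once per repetition, which the message does not state explicitly. Under that
assumption each run performs the same 2^21 chunk jobs, while the data volume
shrinks by a factor of 64 and the elapsed time barely changes.

```c
#include <assert.h>
#include <stdint.h>

/* One benchmark run, with sizes given as base-2 logarithms. */
struct run {
    unsigned block_log2;  /* total block size in bytes */
    unsigned chunk_log2;  /* per-thread chunk size in bytes */
    unsigned reps_log2;   /* number of repetitions */
    double   seconds;     /* elapsed time reported in the message */
};

/* The three runs as reported in the message. */
static const struct run reported_runs[3] = {
    { 28, 14, 7, 33.6 },
    { 24, 10, 7, 25.4 },
    { 22,  8, 7, 24.5 },
};

/* Threads = block size / chunk size. */
uint64_t thread_count(const struct run *r)
{
    return UINT64_C(1) << (r->block_log2 - r->chunk_log2);
}

/* Bytes encrypted = block size * repetitions. */
uint64_t total_bytes(const struct run *r)
{
    return UINT64_C(1) << (r->block_log2 + r->reps_log2);
}

/* Assumption: one chunk job per thread per repetition. */
uint64_t chunk_jobs(const struct run *r)
{
    return thread_count(r) << r->reps_log2;
}

/* Throughput in MiB per second. */
double mib_per_second(const struct run *r)
{
    return (double)total_bytes(r) / (1024.0 * 1024.0) / r->seconds;
}

/* Memory covered by a TLB with `entries` entries of `page_bytes` each. */
uint64_t tlb_reach(uint64_t entries, uint64_t page_bytes)
{
    return entries * page_bytes;
}
```

For example, tlb_reach(2048, 4096) gives the 8MB TLB capacity quoted in the
message, and mib_per_second() applied to each entry of reported_runs shows
throughput falling roughly in proportion to the chunk size, which is what one
would expect if per-thread overhead rather than data volume dominates.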
>
> ==
>
> Are there any backoff heuristics in the mutex/CV/whatever code that put the
> thread to sleep under high contention?  Adaptive mutexes?  I'm off to browse
> the opensolaris code on the net.
>
> ==
>
> _______________________________________________
> perf-discuss mailing list
> perf-discuss@opensolaris.org
>
>   

