While investigating a performance problem in my prototype workspace, I
found that the dtlb-miss rate is about 16% higher than that of the
baseline, so I suspect that the extra TLB misses could be the issue.

So, how can I get statistics on the PC addresses at which the TLB
misses happen? Can DTrace or some other facility help here? Thank you
in advance!

I've developed a prototype DTrace cpc provider which would basically
do what you are asking here. The provider utilises the counter overflow
mechanism available in all processors (that we support anyway).
You simply specify an event (TLB misses in your case), a mode
(system/user) and an overflow count and you're good to go.

The following example shows user-land TLB activity by executable
on an Opteron processor. The probename in the probe specification
may well change, but it consists of the event name followed by the
mode specification and the count:

# dtrace -qn 'cpc:::[EMAIL PROTECTED] = count();}'
^C

 dtrace                                                            1
 in.rlogind                                                        1
 ls                                                                1
 man                                                               1
 nroff                                                             1
 nscd                                                              1
 pt_chmod                                                          1
 quota                                                             1
 sh                                                                1
 sendmail                                                          2
 inetd                                                             3
 svc.configd                                                       3
 ksh                                                               4
 intrd                                                             6
 login                                                            10

So, in the above we fire the probe every 1000 user-land TLB misses and
from there we simply aggregate on execname. In your case you would
simply change the mode/count to suit and aggregate on stack() (to
start with anyway). Note that we are not getting an absolute view of
event causation here but a sampled view; this should provide a
statistically valid picture of the event activity though.
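
To make the sampling point concrete, here is a rough sketch in Python
(not part of the provider) that scales the firings above back up to
approximate miss counts. It assumes the overflow count of 1000 mentioned
above and that each firing is charged to whatever was on CPU when the
counter overflowed; the names OVERFLOW_COUNT and estimate_events are
made up for illustration.

```python
# Probe firings per executable, copied from the aggregation output above.
firings = {
    "dtrace": 1, "in.rlogind": 1, "ls": 1, "man": 1, "nroff": 1,
    "nscd": 1, "pt_chmod": 1, "quota": 1, "sh": 1, "sendmail": 2,
    "inetd": 3, "svc.configd": 3, "ksh": 4, "intrd": 6, "login": 10,
}

# Assumption: the probe fires once per 1000 user-land TLB misses.
OVERFLOW_COUNT = 1000


def estimate_events(samples, overflow_count):
    """Scale sampled firings to an approximate number of events."""
    return {name: n * overflow_count for name, n in samples.items()}


estimates = estimate_events(firings, OVERFLOW_COUNT)
total = sum(estimates.values())

# Print executables from most to fewest estimated misses.
for name, est in sorted(estimates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} ~{est:6d} misses ({est / total:6.1%} of sampled)")
```

With so few firings per executable the estimates are coarse; a longer
run or a smaller overflow count would tighten them.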

I can happily provide you with a kernel to try but it comes as a
Cap-Eye and you, obviously, may not be able to use it if your bits
are outside of a single module or so.

Cheers.

Jon.


- yxn

Attached is the sample trapstat output (the itlb columns are omitted):

Mine:
cpu m size| dtlb-miss %tim dtsb-miss %tim |%tim
----------+-------------------------------+----
 0 u   8k|      7399  0.3         0  0.0 | 0.3
 0 u  64k|         0  0.0         0  0.0 | 0.0
 0 u 512k|         0  0.0         0  0.0 | 0.0
 0 u   4m|         0  0.0         0  0.0 | 0.0
- - - - - + - - - - - - - - - - - - - - - + - -
 0 k   8k|    397423 14.5       237  0.0 |14.5
 0 k  64k|         0  0.0         0  0.0 | 0.0
 0 k 512k|         0  0.0         0  0.0 | 0.0
 0 k   4m|         0  0.0         0  0.0 | 0.0
==========+===============================+====
     ttl |    404822 14.7       237  0.0 |14.8

Baseline:
cpu m size| dtlb-miss %tim dtsb-miss %tim |%tim
----------+-------------------------------+----
 0 u   8k|      8667  0.3         9  0.0 | 0.3
 0 u  64k|         0  0.0         0  0.0 | 0.0
 0 u 512k|         0  0.0         0  0.0 | 0.0
 0 u   4m|         0  0.0         0  0.0 | 0.0
- - - - - + - - - - - - - - - - - - - - - + - -
 0 k   8k|    342333 12.5       233  0.0 |12.6
 0 k  64k|         0  0.0         0  0.0 | 0.0
 0 k 512k|         0  0.0         0  0.0 | 0.0
 0 k   4m|        21  0.0         0  0.0 | 0.0
==========+===============================+====
     ttl |    351021 12.8       242  0.0 |12.9
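
For reference, the "about 16%" figure can be checked against the two
tables with a few lines of Python. This is only a quick sketch using the
dtlb-miss counts shown above; the helper name relative_change is made up
for illustration.

```python
# dtlb-miss counts copied from the two trapstat tables above.
mine = {"user 8k": 7399, "kernel 8k": 397423, "total": 404822}
baseline = {"user 8k": 8667, "kernel 8k": 342333, "total": 351021}


def relative_change(new, old):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


changes = {row: relative_change(mine[row], baseline[row]) for row in mine}

for row, pct in changes.items():
    print(f"{row:10s} {pct:+6.1f}%")
```

On these numbers the kernel 8k row is up by roughly 16% and the total by
roughly 15%, while the user 8k row is actually lower than the baseline,
so the increase appears to be entirely on the kernel side.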


_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org