Hi Rodrigo,
You are right in that it very much depends on the nature of the workload, the
system configuration, and a variety of other factors. But over time I (and
others, I'm sure) have tended to correlate "typical" values that a tool like
mpstat provides with certain workload / system load characteristics.
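For anyone curious where those numbers come from: mpstat's per-CPU columns
are derived from the kernel's per-CPU "cpu_stat" kstats, which a program can
read directly. Below is a minimal sketch (assuming Solaris libkstat, with
error handling abbreviated) that dumps the raw tick counters behind the
usr/sys/wt/idl columns; compile with -lkstat.

/*
 * Minimal sketch: walk the kstat chain and print the per-CPU tick
 * counters that mpstat's usr/sys/wt/idl columns are computed from.
 */
#include <stdio.h>
#include <string.h>
#include <kstat.h>
#include <sys/sysinfo.h>

int
main(void)
{
    kstat_ctl_t *kc = kstat_open();
    kstat_t *ksp;

    if (kc == NULL)
        return (1);

    for (ksp = kc->kc_chain; ksp != NULL; ksp = ksp->ks_next) {
        cpu_stat_t cs;

        if (strcmp(ksp->ks_module, "cpu_stat") != 0)
            continue;
        if (kstat_read(kc, ksp, &cs) == -1)
            continue;

        /* Raw tick counters behind mpstat's usr/sys/wt/idl columns. */
        printf("cpu%d usr=%u sys=%u wt=%u idl=%u\n",
            ksp->ks_instance,
            cs.cpu_sysinfo.cpu[CPU_USER],
            cs.cpu_sysinfo.cpu[CPU_KERNEL],
            cs.cpu_sysinfo.cpu[CPU_WAIT],
            cs.cpu_sysinfo.cpu[CPU_IDLE]);
    }
    (void) kstat_close(kc);
    return (0);
}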
I love happy endings. :)
-Eric
> Do you have the bug# around for these problems?
6417367 cpu_wakeup() could be more locality/CMT aware
4933914 scheduler support for shared cache CMT processors
Thanks,
-Eric
> That being said, when I have Linux/*BSD on here there
> is none of this sluggish feeling, and when I run
> Linux/*BSD I use the betas of those, like
> Rawhide (Fedora) / CURRENT (FreeBSD), so I know they
> have debugging on and that; I don't think it can just
> simply be running DEBUG on Solaris at t
> Do you have the bug# around for these problems?
I don't believe a bug/RFE has yet been filed for this. I'll do so, and will
follow up with the specifics.
Thanks,
-Eric
> Doesn't matter. Could even be dedicated
> point-to-point links between all chips. My
> assumption is that a processor on a chip can access
> the memory controller without sending messages to
> other chips via the xbar/hypertransport links. Of
> course this can't be done naively...
Right. The
> Also, to expand on the NUMA configuration I have in
> mind: consider a system with 4 hypothetical Niagara+
> chips connected together (yes, original Niagara only
> supports a Single-CMP). Each Niagara has its own
> local memory controllers. Threads running on a chip
> should ideally allocate p
> Right now, Simics tells Solaris that all of the
> memory is on a single board, even though my add-on
> module to Simics actually carries out the timing of
> NUMA. The bottom line is that we currently model
> the timing of NUMA, however Solaris does not do any
> memory placement optimization bec
Hi Mike,
> I would like NUMA in-order to simulate future NUMA
> chip-multiprocessors using Virtutech Simics and the
> Wisconsin GEMS toolkit.
Could you expand on this a bit? Solaris implements different policies
for NUMA and CMT (although affinity and load balancing
tend to be a common theme).
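To make the NUMA side a bit more concrete, here is a rough sketch (using the
Solaris liblgrp interfaces from userland, not the kernel policy code itself)
of a thread querying its home locality group and asking to stay in it;
compile with -llgrp.

/*
 * Sketch only: query this thread's home lgroup and set a strong
 * affinity for it so the dispatcher keeps it there.
 */
#include <stdio.h>
#include <sys/types.h>
#include <sys/procset.h>
#include <sys/lgrp_user.h>

int
main(void)
{
    lgrp_cookie_t cookie = lgrp_init(LGRP_VIEW_CALLER);
    lgrp_id_t home = lgrp_home(P_LWPID, P_MYID);

    if (cookie == LGRP_COOKIE_NONE || home == -1)
        return (1);

    printf("running in lgroup %d of %d\n", (int)home, lgrp_nlgrps(cookie));

    /* Ask that this LWP be kept in its home lgroup. */
    if (lgrp_affinity_set(P_LWPID, P_MYID, home, LGRP_AFF_STRONG) != 0)
        perror("lgrp_affinity_set");

    (void) lgrp_fini(cookie);
    return (0);
}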
> I have a related comment: I think it may be better to
> change the
> traverse code to scan local CPU first - in the order
> local core, local
> board, local machine and then all remaining boards -
> to avoid massive
> communication overhead in NUMA/COMA machines.
The code already traverses other
> I also wonder whether the pre-existing loop over cpus
> (in lpl order)
> in disp_getwork on systems with many cpus is going to
> access
> a large number of cpu_t and effectively flush the
> TLBs (as happened
> in the mutex_vector_enter perf fix). I guess this is
> a less frequent
> operation and
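For what it's worth, the traversal order being suggested can be sketched at
user level like this (purely illustrative, with made-up types; not the actual
disp_getwork() code): look for runnable work at the nearest locality level
first, and only widen the search when that level comes up empty.

/*
 * Illustrative sketch only (not disp_getwork()): level 0 = local core,
 * level 1 = local board, level 2 = the rest of the machine.  Search
 * nearest-first to avoid crossing the interconnect when there is work
 * close by.
 */
#include <stddef.h>

#define NLEVELS 3

typedef struct fake_cpu {
    int fc_id;
    int fc_runnable;    /* threads waiting on this CPU's run queue */
} fake_cpu_t;

/* cpus[level] holds ncpus[level] candidate CPUs at that locality level. */
fake_cpu_t *
find_work(fake_cpu_t *cpus[NLEVELS], size_t ncpus[NLEVELS])
{
    int level;
    size_t i;

    for (level = 0; level < NLEVELS; level++) {
        for (i = 0; i < ncpus[level]; i++) {
            if (cpus[level][i].fc_runnable > 0)
                return (&cpus[level][i]);
        }
    }
    return (NULL);  /* nothing to steal anywhere */
}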
I ran into the same problem while trying to watch a DVD using xine. The
performance was terrible. I have an ATI Radeon-based chipset, and I understand at
least part of the issue is related to the driver. Hopefully someone who knows
more can comment...I would *love* to see this fixed.
-Eric
Sorry for the delay...
Here's a webrev with the Nevada changes:
http://cr.grommit.com/~esaxe/6336786
Comments / questions welcome.
Thanks,
-Eric
Hi Stefan,
You should be able to get some additional information now about 6336786 from the
OpenSolaris bug database (bugs.opensolaris.org). The fix is planned to be
available as of Nevada build 28...and I'm working on getting the webrev (web
based diffs) for the fix viewable ATM...
Thanks,
-Eric
AMD posted an interesting write up discussing some corner cases that
operating systems (like Solaris) should consider when using the TSC
(Time Stamp Counter) in conjunction with the power management features
provided by current Opteron and Athlon 64 processors.
This was posted last Friday to comp.
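The practical upshot for timing code is the usual one: where the TSC rate can
change with power management (or the counters drift apart between cores),
gethrtime() is the safer interface on Solaris. A small sketch, assuming x86
and gcc-style inline assembly, just to show the two side by side:

/*
 * Sketch only: read the raw TSC next to gethrtime().  gethrtime() is
 * the Solaris monotonic high-resolution timer and keeps working when
 * the TSC rate changes under power management.
 */
#include <stdio.h>
#include <sys/time.h>

static unsigned long long
read_tsc(void)
{
    unsigned int lo, hi;

    __asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
    return (((unsigned long long)hi << 32) | lo);
}

int
main(void)
{
    unsigned long long t0 = read_tsc();
    hrtime_t h0 = gethrtime();

    /* ... the work being timed would go here ... */

    printf("tsc delta: %llu cycles, hrtime delta: %lld ns\n",
        read_tsc() - t0, (long long)(gethrtime() - h0));
    return (0);
}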
> Let me followup with a question. In this application, processes have
> not only their "own" memory, ie heap, stack program text and data, etc,
> but they also share a moderately large (~ 2-5GB today) amount of memory
> in the form of mmap'd files. From Sherry Moore's previous posts, I'm
> assum
Hi David,
Since your v1280 system has NUMA characteristics, the bias that you see for
one of the boards may be a result of the kernel trying to run your
application's threads "close" to where they have allocated their memory. We
also generally try to keep threads in the same process together,
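On the application side, one relevant knob for those mmap'd files is the
memory placement advice to madvise(). The sketch below is illustrative only
(the file name and mapping size are made up, and it assumes the Solaris
MADV_ACCESS_LWP advice): it maps a shared file and hints that the calling
LWP will be its main consumer, so the kernel can place those pages nearby.

/*
 * Sketch only: map a shared file and advise the kernel that the
 * calling LWP will do most of the accesses, so the pages can be
 * placed in its locality group.  The path and length are made up.
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

int
main(void)
{
    size_t len = 64UL * 1024 * 1024;    /* 64MB, illustrative */
    int fd = open("/data/shared.db", O_RDWR);
    char *p;

    if (fd == -1)
        return (1);

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        return (1);

    /* Hint: this LWP will do most of the accesses to [p, p + len). */
    if (madvise(p, len, MADV_ACCESS_LWP) != 0)
        perror("madvise");

    /* ... use the mapping ... */
    (void) munmap(p, len);
    (void) close(fd);
    return (0);
}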
Ben Cooper wrote:
> Interestingly when I ran the cascade_flock 200 test
> manually I got
> the resource temporarily unavailable fork error
> straight away, so I
> don't think it's libMicro not waiting for the
> processes of previous
> tests to run. When I tried putting the kern.maxproc
> an
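As an aside, the usual way to cope when fork() hits a process limit like that
is to treat EAGAIN as transient: reap any children that have already exited,
back off briefly, and retry instead of failing the run outright. A small
illustrative sketch (not libMicro code; the helper name is made up):

/*
 * Sketch only: retry fork() when it fails with EAGAIN (e.g. the
 * per-user or system process limit has been hit), reaping finished
 * children and backing off between attempts.
 */
#include <errno.h>
#include <sys/wait.h>
#include <unistd.h>

pid_t
fork_retry(int attempts)
{
    int i;

    for (i = 0; i < attempts; i++) {
        pid_t pid = fork();

        if (pid != -1 || errno != EAGAIN)
            return (pid);   /* success, or a hard failure */

        /* Reap anything that has already exited, then back off. */
        while (waitpid(-1, NULL, WNOHANG) > 0)
            ;
        (void) sleep(1);
    }
    errno = EAGAIN;
    return (-1);
}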
Hey look at that. My previous message has been robbed of line breaks and the
first part of
this thread has gone missing. Looks like the html I embedded really confused
things. Here's
a more readable version of my last message. I'll let Derek know that this
thread needs to be fixed...
-Eric
On
Dan Price points out that this could be a case where some "volatility" is
necessary.
I made the loop counter "i" volatile, and the problem went away.
Looking at the disassembly and comparing the two cases (siglongjmp)
(volatile)
benchmark+0x1b: 8d 45 fc leal -0x4(%ebp),%eax
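The underlying rule here is the setjmp(3C) one: an automatic variable that is
modified after sigsetjmp() and examined again after siglongjmp() has an
indeterminate value unless it is declared volatile, because the optimizer is
free to keep it in a register that the longjmp does not restore. A
stripped-down illustration (not the libMicro benchmark itself):

/*
 * Illustrative sketch (not the libMicro benchmark): without "volatile",
 * the value of "i" observed after siglongjmp() returns through
 * sigsetjmp() is indeterminate, because the compiler may have kept it
 * in a register.
 */
#include <stdio.h>
#include <setjmp.h>

static sigjmp_buf env;

int
main(void)
{
    volatile int i = 0;     /* drop "volatile" and an optimizing gcc may break this */

    if (sigsetjmp(env, 1) == 0) {
        i = 42;             /* modified after sigsetjmp() */
        siglongjmp(env, 1);
    }

    printf("i = %d\n", i);  /* reliably 42 only because i is volatile */
    return (0);
}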
> You're right. It seems to work OK with 'cc', but
> these 3 benchmarks
> also fail with gcc 3.4.3 (which is the version that
> is part of Solaris10
> right now). I'm on Nevada build 14.
This seems to be related to the gcc optimizer. I added
a "printf()" to the "if" loop in siglongjmp.c:be