Sean Meighan wrote:
Hi all; I just joined the group. My team has created a tool to watch
Solaris, Linux and Windows. It runs non-root on both the client and the
server. We currently use it to watch the 750 Sun Ray servers inside Sun. The
basic thing it does is execute a 55-command shell script on the client and
send the 1000-line output up to the central server. We do this every 10
minutes across the 750 machines. We have a Niagara T2000, 32 x 1000 MHz
server running Solaris 10 Generic_118833-08. We put ZFS on this box two
months ago. We currently have 3.5 million files and 3 billion lines of ASCII
sitting on the internal drive of the Niagara. The box runs at less than 20%
load.
Everything had been working perfectly until two days ago; now it can take
10 minutes to exit from vi.
Hi Sean,
May I ask what happened between 13/6 and 14/6?
(Forgive my European bias in dates.)
Without getting into the ZFS details, *what changes occurred in the
system* around that time? That is about two days before the "two days
ago", which I assume refers to 15/6.
We don't need any statistical process analysis here, but just from
looking at the graphs I would say that until the end of the day on 13/6,
the system shows very regular activity in cpu *and* disk.
The assertion cpu > users holds for most of this period, with only
a few exceptions. After 14/6, it definitely no longer holds.
* At 18:00 13/6, the baseline for both cpu and disk drops visibly.
* Around 08:00 14/6, the baseline jumps back up to previous levels.
* Between 14:00 and 16:00, baseline disk activity reaches new heights
  for the week. Variance is quite low during this time.
* At the same time, CPU load rises steadily, with little variation,
  until halfway through, when it starts a new, more chaotic behaviour
  that deviates significantly from previous patterns.
* After 16:00 14/6, disk activity also enters a new, more chaotic
  pattern with a higher variance.
* There are a couple of polling gaps on 15/6 and 16/6, which was after
  you realised something was happening.
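The baseline and variance remarks above are read off the graphs by eye. If
the raw samples are available, the same comparison could be made roughly as
follows. This is only a sketch: the function name window_stats, the window
size and the sample numbers are all made up for illustration and are not
taken from your data.

```python
from statistics import median, pvariance

def window_stats(samples, window):
    """Split samples into consecutive windows and return one
    (median, variance) pair per window. The median stands in for the
    'baseline'; the variance shows how chaotic the window is."""
    stats = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        if len(chunk) < 2:
            continue  # too few points to say anything about spread
        stats.append((median(chunk), pvariance(chunk)))
    return stats

# Hypothetical load percentages, one per 10-minute poll (invented values).
steady = [12, 13, 12, 14, 13, 12]
chaotic = [10, 35, 8, 50, 15, 42]

result = window_stats(steady + chaotic, window=6)
for base, var in result:
    print(f"baseline={base:.1f} variance={var:.1f}")
```

A window whose variance jumps while its median stays put would match the
"more chaotic" pattern; a shift in the median alone would match a baseline
change.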
The line graph has some disadvantages - I'd like to see a scatterplot,
if possible with a log scale from 1 to 100 on the Y-axis...
I no longer have access to StarOffice and the stats plugin, and I don't
know whether you do, but they could be helpful for data analysis.
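Without a plotting package at hand, a rough text-mode version of such a
scatterplot can be sketched as below. It is an illustration only: the
helper names to_log_rows and ascii_scatter, the row count and the sample
values are my own assumptions, and values outside 1..100 are simply clamped
to the edges of the scale.

```python
import math

def to_log_rows(values, lo=1.0, hi=100.0, rows=10):
    """Clamp each value to [lo, hi] and map it to a row index
    (0 = bottom) on a log10 scale."""
    span = math.log10(hi) - math.log10(lo)
    out = []
    for v in values:
        v = min(max(v, lo), hi)
        frac = (math.log10(v) - math.log10(lo)) / span
        out.append(min(rows - 1, int(frac * rows)))
    return out

def ascii_scatter(values, rows=10):
    """Return a text scatterplot: one column per sample, top row = hi."""
    idx = to_log_rows(values, rows=rows)
    lines = []
    for r in range(rows - 1, -1, -1):
        lines.append("".join("*" if i == r else "." for i in idx))
    return "\n".join(lines)

# Hypothetical load percentages, one per poll (invented values).
plot = ascii_scatter([1, 2, 3, 10, 12, 40, 100, 8, 5])
print(plot)
```

On a log scale the low-load samples spread out over several rows instead
of being squashed against the bottom axis, which is the point of asking
for it.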
Cheers,
Henk
But first check what actually happened between 13 and 14 June. Were new
h/w resources added? Was new s/w installed or accessed?
Cheers,
Henk Langeveld