Sean Meighan wrote:
Hi all; I just joined the group. My team has created a tool to watch Solaris, Linux, Windows. It runs non-root on both the client and the server. We are currently using it to watch the 750 Sun Ray servers inside Sun. Basically, it executes a 55-command shell script on the client and sends the 1000-line output up to the central server. We do this every 10 minutes across the 750 machines. We have a Niagara T2000, 32 x 1000 MHz server running Solaris 10 Generic_118833-08. We put ZFS on this box two months ago. We currently have 3.5 million files and 3 billion lines of ASCII sitting on the internal drive of the Niagara. The box runs at less than 20% load.

Everything worked perfectly until two days ago; now it can take 10 minutes to exit from vi.

Hi Sean,

May I ask what happened between 13/6 and 14/6?
(Forgive my European bias in dates.)

Without getting into the ZFS details: *what changes occurred in the
system* around that time? That is about two days before your "two days
ago", which I assume refers to 15/6.

We don't need any statistical process analysis here, but just from
looking at the graphs I would say that until the end of the day on
13/6, the system shows very regular activity in CPU *and* disk.
The assertion cpu > users is true for most of this period, with only
a few exceptions.  After 14/6, it definitely no longer holds.

* At 18:00 on 13/6, the baseline for both CPU and disk drops visibly.
* Around 08:00 on 14/6, the baseline jumps back up to previous levels.
* Between 14:00 and 16:00 on 14/6, baseline disk activity reaches new
  heights for the week.  Variance is quite low during this time.
* At the same time, CPU load rises steadily, with little variation,
  until halfway through, when it starts a new, more chaotic behaviour
  that deviates significantly from previous patterns.
* After 16:00 on 14/6, disk activity also enters a new, more chaotic
  pattern with a higher variance.
* There are a couple of polling gaps on 15/6 and 16/6, which were after
  you realised something was happening.
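
The baseline and variance observations above were read off the graphs by eye. As a rough cross-check, the same quantities could be computed per time window. This is only a minimal sketch: the function names, the window size, the threshold and the sample numbers are all hypothetical, and the data is made up for illustration (it is not the data from the graphs).

```python
from statistics import median, pvariance

def window_stats(samples, window):
    """Split samples into consecutive windows; return (median, variance) per window."""
    stats = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        stats.append((median(chunk), pvariance(chunk)))
    return stats

def baseline_shifts(stats, threshold):
    """Indices of windows whose median moved more than threshold from the previous window."""
    return [i for i in range(1, len(stats))
            if abs(stats[i][0] - stats[i - 1][0]) > threshold]

# Made-up CPU percentages, one sample per 10-minute poll (illustrative only).
cpu = [12, 14, 13, 12, 13, 14,   # regular activity
       5, 6, 5, 6, 5, 6,         # baseline drops
       13, 12, 14, 13, 12, 14,   # baseline back at the previous level
       20, 45, 18, 60, 25, 52]   # chaotic pattern, higher variance

stats = window_stats(cpu, window=6)
for i, (med, var) in enumerate(stats):
    print(f"window {i}: median={med:.1f} variance={var:.1f}")
print("baseline shifts at windows:", baseline_shifts(stats, threshold=5))
```

The median is used as the baseline here because a few spikes do not move it much; the variance per window then shows separately whether the pattern has become more chaotic.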

The line graph has some disadvantages - I'd like to see a scatterplot, if possible with a log scale from 1 to 100 on the Y-axis...
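
For illustration, here is a minimal sketch of such a plot as plain text, with one column per sample and a log10 Y-axis running from 1 to 100. The function names and the row count are hypothetical, and values outside 1..100 (including zero, which has no logarithm) are simply clamped to the axis limits, which is an assumption of this sketch.

```python
import math

def log_row(value, rows, lo=1.0, hi=100.0):
    """Map value to a row index on a log10 axis: row 0 is lo, row rows-1 is hi."""
    clamped = min(max(value, lo), hi)  # clamp so that log10 is always defined
    frac = (math.log10(clamped) - math.log10(lo)) / (math.log10(hi) - math.log10(lo))
    return round(frac * (rows - 1))

def ascii_scatter(values, rows=9):
    """Return a text scatterplot: one column per sample, log-scaled Y-axis."""
    grid = [[" "] * len(values) for _ in range(rows)]
    for col, v in enumerate(values):
        grid[log_row(v, rows)][col] = "*"
    # Emit the top row (highest values) first.
    return "\n".join("".join(line) for line in reversed(grid))

# Made-up load percentages (illustrative only).
print(ascii_scatter([2, 3, 2, 4, 3, 40, 8, 90, 15, 70]))
```

On a log axis the steps from 1 to 10 and from 10 to 100 take up the same height, so low-level baseline changes stay visible next to large spikes.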

I don't have access to StarOffice and the stats plugin anymore, and I
don't know if you do, but for data analysis they could be helpful.

Cheers,
Henk

But first check what actually happened between 13 and 14 June. Were new
h/w resources added?  New s/w installed or accessed?

Cheers,
Henk Langeveld
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
