Well, this has happened again, exactly a week later, same time too….
So the SSD ZILs didn't do the trick.
I think I am going to turn off the ZFS auto snapshot service ….
Jun 7 15:50:22 hagrid fct: [ID 132490 kern.notice] NOTICE: qlt1,0 LINK UP,
portid 20300, topology Fabric Pt-to-Pt,speed
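
To check whether the link resets really recur on a fixed schedule, the fct LINK UP notices can be pulled out of the messages file and the gaps between them computed. The following is only a rough sketch, not something anyone in this thread ran. The function names are made up for illustration, the sample lines are the ones quoted in this thread (including the two that are cut off), and the year 2012 is an assumption because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Matches fct "LINK UP" notices in the form quoted in this thread.
# A leading "> " quote marker is tolerated because we use search(), not match().
LINK_UP = re.compile(
    r"(?P<stamp>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+fct:.*?"
    r"NOTICE:\s+(?P<port>qlt\d+,\d+)\s+LINK UP"
)

def link_up_events(lines, year=2012):
    """Return sorted (timestamp, port) pairs; the year is assumed, not parsed."""
    events = []
    for line in lines:
        m = LINK_UP.search(line)
        if m:
            when = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
            events.append((when, m["port"]))
    return sorted(events)

def intervals(events, port):
    """Time gaps between consecutive LINK UP notices on one port."""
    times = [t for t, p in events if p == port]
    return [later - earlier for earlier, later in zip(times, times[1:])]

# Log lines as they appear in this thread (two of them are truncated there).
SAMPLE = [
    "May 17 17:33:47 hagrid fct: [ID 132490 kern.notice] NOTICE: qlt1,0 LINK UP, "
    "portid 20300, topology Fabric Pt-to-Pt,speed 8G",
    "May 17 17:33:48 hagrid fct: [ID 132490 kern.notice] NOTICE: qlt0,0 LINK UP, "
    "portid 10400",
    "Jun 7 15:50:22 hagrid fct: [ID 132490 kern.notice] NOTICE: qlt1,0 LINK UP, "
    "portid 20300, topology Fabric Pt-to-Pt,speed",
]

events = link_up_events(SAMPLE)
for gap in intervals(events, "qlt1,0"):
    print("qlt1,0 gap between LINK UP notices:", gap)
```

Feeding it the whole of /var/adm/messages instead of SAMPLE would list every gap, which makes a weekly pattern (or the lack of one) easy to spot.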
A quick update for those who might be following this thread. I started to
collect zilstats, and I have found that about once every four days a
transaction takes over half an hour:
TIME  txg  N-Bytes  N-Bytes/s  N-Max-Rate  B-Bytes  B-Bytes/s  B-Max-Rate  ops
Hi Adrian,
The SanBoxes? - Nexsan nothing in their logs
OK
Dmesg? :
> May 17 17:33:47 hagrid fct: [ID 132490 kern.notice] NOTICE: qlt1,0 LINK UP,
> portid 20300, topology Fabric Pt-to-Pt,speed 8G
> May 17 17:33:48 hagrid fct: [ID 132490 kern.notice] NOTICE: qlt0,0 LINK UP,
> portid 10400
Hi Mike:
The server is a SuperMicro 2042G-TRF, 256GB RAM, 4 x AMD 6272, 2 x QLogic
2562, 1 x Intel I250T
The SanBoxes? - Nexsan nothing in their logs
Dmesg? :
> May 17 17:33:47 hagrid fct: [ID 132490 kern.notice] NOTICE: qlt1,0 LINK UP,
> portid 20300, topology Fabric Pt-to-Pt,speed 8G
>
At one time I posted a DTrace script to track txg open times. Look for it in
the forum archives, or I can repost it. Some other folks posted similar
scripts ...
I would not be surprised to find a txg being open for an unusually long time
when the problem happens.
That would indicate a problem i
Hi Adrian,
What logs have you checked?
The SanBoxes?
Dmesg?
Stmf service?
Are you running snapshots?
Dedup?
Compression?
IRQ sharing?
echo ::interrupts | mdb -k
-Original Message-
From: Adrian Carpenter [mailto:ta...@wbic.cam.ac.uk]
Sent: Friday, May 18, 2012 3:26 AM
To: openindiana-