Hello Richard,
Monday, May 5, 2008, 4:12:23 PM, you wrote:
RE> Rustam wrote:
>> Hello Robert,
>>
>>> Which would happen if you have a problem with HW and you're getting
>>> wrong checksums on both sides of your mirrors. Maybe PS?
>>>
>>> Try memtest anyway or sunvts
>>>
>> Unfortunately, Sun
On May 5, 2008, at 4:43 PM, Bob Friesenhahn wrote:
> On Mon, 5 May 2008, eric kustarz wrote:
>>
>> That's not true:
>> http://blogs.sun.com/erickustarz/entry/zil_disable
>>
>> Perhaps people are using "consistency" to mean different things
>> here...
>
> Consistency means that fsync() assures t
On Mon, 5 May 2008, Marcelo Leal wrote:
> I'm calling consistency, "a coherent local view"...
> I think that was one option to debug (if not an NFS server), without
> generating a corrupted filesystem.
In other words your flight reservation will not be lost if the system
crashes.
Bob
==
On Mon, 5 May 2008, eric kustarz wrote:
>
> That's not true:
> http://blogs.sun.com/erickustarz/entry/zil_disable
>
> Perhaps people are using "consistency" to mean different things here...
Consistency means that fsync() assures that the data will be written
to disk so no data is lost. It is not
On May 5, 2008, at 1:43 PM, Bob Friesenhahn wrote:
> On Mon, 5 May 2008, Marcelo Leal wrote:
>
>> Hello, If you believe that the problem can be related to ZIL code,
>> you can try to disable it to debug (isolate) the problem. If it is
>> not a fileserver (NFS), disabling the zil should not impact
On Mon, 5 May 2008, Marcelo Leal wrote:
> Hello, If you believe that the problem can be related to ZIL code,
> you can try to disable it to debug (isolate) the problem. If it is
> not a fileserver (NFS), disabling the zil should not impact
> consistency.
In what way is NFS special when it come
Hello Leal,
I've already been warned
(http://www.opensolaris.org/jive/message.jspa?messageID=231349) that ZIL could
be a cause and I made tests with zil_disabled. I ran a scrub and the system crashed
after exactly the same period and with the same error. ZIL is known to cause some
problems on writes, whi
Hello,
If you believe that the problem can be related to ZIL code, you can try to
disable it to debug (isolate) the problem. If it is not a fileserver (NFS),
disabling the zil should not impact consistency.
Leal.
This message posted from opensolaris.org
Rustam wrote:
> Hello Robert,
>
>> Which would happen if you have a problem with HW and you're getting
>> wrong checksums on both sides of your mirrors. Maybe PS?
>>
>> Try memtest anyway or sunvts
>>
> Unfortunately, SunVTS doesn't run on non-Sun/OEM hardware. And memtest
> requires too much
Hello Robert,
> Which would happen if you have a problem with HW and you're getting
> wrong checksums on both sides of your mirrors. Maybe PS?
>
> Try memtest anyway or sunvts
Unfortunately, SunVTS doesn't run on non-Sun/OEM hardware. And memtest requires
too much downtime which I cannot afford right
Hello Rustam,
Saturday, May 3, 2008, 9:16:41 AM, you wrote:
R> I don't think that this is a hardware issue; however, I don't rule it out. I'll
try to explain why.
R> 1. I've replaced all memory modules which are more likely to cause such a
problem.
R> 2. There are many different applications run
I don't think that this is a hardware issue; however, I don't rule it out. I'll
try to explain why.
1. I've replaced all memory modules which are more likely to cause such a
problem.
2. There are many different applications running on that server (Apache,
PostgreSQL, etc.). However, if you look a
Rustam code.az> writes:
>
> Didn't help. Keeps crashing.
> The worst thing is that I don't know where's the problem. More ideas on
> how to find problem?
Lots of CKSUM errors like the ones you see are often indicative of bad hardware. Run
memtest for 24-48 hours.
-marc
> Seems kind of old. I am using Generic_127112-11 here.
>
> Probably many hundreds of nasty bugs have been
> eliminated since the version you are using.
I've updated to the latest available kernel 127128-11 (from 28 Apr) which
included a number of fixes to AHCI SATA driver and ZFS.
Didn't help
On Thu, 1 May 2008, Rustam wrote:
> operating system: 5.10 Generic_127112-07 (i86pc)
Seems kind of old. I am using Generic_127112-11 here.
Probably many hundreds of nasty bugs have been eliminated since the
version you are using.
Bob
==
Bob Friesenhahn
> Is your ZFS pool configured with redundancy (e.g. mirrors, raidz) or is
> it non-redundant? If non-redundant, then there is not much that ZFS
> can really do if a device begins to fail.
It's RAID 10 (more info here:
http://www.opensolaris.org/jive/thread.jspa?threadID=57425):
NAME STATE READ WR
Rustam wrote:
> Today my production server crashed 4 times. THIS IS NIGHTMARE!
> Self-healing file system?! For me ZFS is SELF-KILLING filesystem.
>
> I cannot fsck it, there's no such tool. I cannot scrub it, it crashes
> 30-40 minutes after scrub starts. I cannot use it, it crashes a
> number
On Thu, 1 May 2008, Rustam wrote:
> Today my production server crashed 4 times. THIS IS NIGHTMARE!
> Self-healing file system?! For me ZFS is SELF-KILLING filesystem.
>
> I cannot fsck it, there's no such tool.
> I cannot scrub it, it crashes 30-40 minutes after scrub starts.
> I cannot use it, i
Today my production server crashed 4 times. THIS IS NIGHTMARE!
Self-healing file system?! For me ZFS is SELF-KILLING filesystem.
I cannot fsck it, there's no such tool.
I cannot scrub it, it crashes 30-40 minutes after scrub starts.
I cannot use it, it crashes a number of times every day! And wi