Greetings all.

I am facing serious problems running ZFS on a storage server assembled from 
commodity hardware that is supposedly Solaris-compatible.

Although I am quite familiar with Linux distros and other Unix-like systems, 
I am new to Solaris, so any suggestions are highly appreciated.



First I tried SXDE 1/08, creating the following pool:

        -bash-3.2# zpool status -v tank
          pool: tank
         state: ONLINE
         scrub: none requested
        config:

                NAME        STATE     READ WRITE CKSUM
                tank        ONLINE       0     0     0
                  raidz1    ONLINE       0     0     0
                    c5t0d0  ONLINE       0     0     0
                    c5t1d0  ONLINE       0     0     0
                    c6t1d0  ONLINE       0     0     0
                    c6t0d0  ONLINE       0     0     0
                    c7t0d0  ONLINE       0     0     0

        errors: No known data errors
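
For reference, the pool was created along these lines (I am reconstructing 
the exact command from the layout above, so take it as a sketch):

        -bash-3.2# zpool create tank raidz1 c5t0d0 c5t1d0 c6t1d0 c6t0d0 c7t0d0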

All went well until I tried pulling files from the server to another machine 
running 64-bit Vista Ultimate SP1 via its built-in NFS client. After copying 
roughly 100 of them (split archives, each 100MB in size, i.e. about 10GB of 
data) I always get a "The semaphore timeout period has expired." error. The 
machines are currently connected by a 1Gbps switch, but I have tried several 
other devices as well (some supporting only 100Mbps).
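
In case the setup matters: the pool is exported over NFS and mounted on the 
Vista box roughly as follows (the server name and drive letter below are 
placeholders, not the real ones):

        -bash-3.2# zfs set sharenfs=on tank

and on the Vista side, using the Services for NFS mount command:

        C:\> mount \\server\tank Z: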


When this happens, Solaris is still responsive, but any zpool command I try 
locks up. E.g. "zpool status tank" prints just the following

          pool: tank
         state: ONLINE
         scrub: none requested

and then lock up.


This gives me the impression that after several minutes of usage, the ZFS 
subsystem on the machine locks up and anything that tries to touch it locks up 
as well.
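
If it would help diagnose this, the next time it happens I can try to 
capture where the hung command is stuck; a sketch of what I have in mind 
(<pid> being whatever pgrep reports):

        -bash-3.2# pgrep zpool
        -bash-3.2# truss -p <pid>
        -bash-3.2# echo "::threadlist -v" | mdb -k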

The only way I have found to make the server run again is a hardware reset; 
a software reboot/shutdown locks up as well.


Another, possibly related problem: instead of, or in addition to, this 
lock-up, ZFS would sometimes degrade my pool, marking one of the discs as 
faulty.

It was always the same disc, regardless of the port it was plugged into.

The weird thing, though, is that the disc appears to be perfectly functional: 
running the thorough Samsung ESTOOL diagnostics on it many times found no 
problems. Clearing the errors and scrubbing the pool would make it 
operational again, at least for a while.
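
(By clearing and scrubbing I mean the standard commands, run against the 
pool shown above:)

        -bash-3.2# zpool clear tank
        -bash-3.2# zpool scrub tank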


I have since replaced SXDE with OpenSolaris 2008.05, but it did not seem to 
affect these problems at all.

I also bought more discs, hoping that replacing the faulting one would solve 
the problems. Unfortunately it did not solve all of them: the array doesn't 
degrade due to a "faulty disc" anymore, but ZFS still seems to lock up after 
several minutes of use.
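
The replacement itself was done the usual way; a sketch, with c5t1d0 
standing in for whichever device the faulted disc happened to be at the 
time:

        -bash-3.2# zpool replace tank c5t1d0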



Thanks in advance for any suggestions on how best to approach these problems.



Server HW:

        Mobo:   MSI K9N Diamond
        CPU:    Athlon 64 X2 5200+
        Mem:    Corsair TWIN2x4096-6400C4DHX
        PSU:    Corsair HX620W
        Case:   ThermalTake Armor+
        GFX:    MSI N9600GT-T2D1G-OC
        HDDs:   Spinpoint F1 HD103UJ (1TB, 32MB, 7200rpm, SATA2)

        All HDDs are the same model.
        The machine is not overclocked.
 
 