On Sat, Aug 06, 2011 at 07:19:56PM +0200, Eugen Leitl wrote:
> 
> Upgrading to hacked N36L BIOS seems to have done the trick:
> 
> eugen@nexenta:~$ zpool status tank
>   pool: tank
>  state: ONLINE
>  scan: none requested
> config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0     0
>           raidz2-0  ONLINE       0     0     0
>             c0t0d0  ONLINE       0     0     0
>             c0t1d0  ONLINE       0     0     0
>             c0t2d0  ONLINE       0     0     0
>             c0t3d0  ONLINE       0     0     0
>         logs
>           c0t5d0s0  ONLINE       0     0     0
>         cache
>           c0t5d0s1  ONLINE       0     0     0
> 
> errors: No known data errors
> 
> Anecdotally, the drive noise and system load have gone
> down as well. It seems even with small SSDs hybrid pools
> are definitely worthwhile.

System is still stable. zilstat on a lightly loaded box
(this is an N36L with 8 GByte RAM and 4 Seagate drives, 1 and
1.5 TByte, in raidz2):

root@nexenta:/tank/tank0/eugen# ./zilstat.ksh -t 60
TIME                    N-Bytes  N-Bytes/s N-Max-Rate    B-Bytes  B-Bytes/s B-Max-Rate    ops  <=4kB 4-32kB >=32kB
2011 Aug 11 10:38:31   17475360     291256    5464560   34078720     567978   10747904    260      0      0    260
2011 Aug 11 10:39:31   10417568     173626    6191832   20447232     340787   12189696    156      0      0    156
2011 Aug 11 10:40:31   19264288     321071    5975840   34603008     576716    9961472    264      0      0    264
2011 Aug 11 10:41:31   11176512     186275    6124832   22151168     369186   12189696    169      0      0    169
2011 Aug 11 10:42:31   14544432     242407   13321424   26738688     445644   24117248    204      0      0    204
2011 Aug 11 10:43:31   13470688     224511    5019744   25821184     430353    9961472    197      0      0    197
2011 Aug 11 10:44:31    9147112     152451    4225464   18350080     305834    8519680    140      0      0    140
2011 Aug 11 10:45:31   12167552     202792    7760864   23068672     384477   15204352    176      0      0    176
2011 Aug 11 10:46:31   13306192     221769    8467424   25034752     417245   15335424    191      0      0    191
2011 Aug 11 10:47:31    8634288     143904    8254112   15990784     266513   15204352    122      0      0    122
2011 Aug 11 10:48:31    4442896      74048    4078408    9175040     152917    8257536     70      0      0     70
2011 Aug 11 10:49:31    8256312     137605    5283744   15859712     264328    9961472    121      0      0    121
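
A quick sanity check on those numbers, as a rough Python sketch. It assumes
that N-Bytes is the payload handed to the ZIL and B-Bytes the buffer space
allocated for it (my reading of zilstat, not something established in this
thread), and that the samples are 60 s apart as the timestamps suggest. The
names SAMPLES, INTERVAL_S and the helper functions are made up for
illustration.

```python
# (N-Bytes, B-Bytes, ops) for each 60-second sample in the zilstat run.
SAMPLES = [
    (17475360, 34078720, 260),
    (10417568, 20447232, 156),
    (19264288, 34603008, 264),
    (11176512, 22151168, 169),
    (14544432, 26738688, 204),
    (13470688, 25821184, 197),
    (9147112, 18350080, 140),
    (12167552, 23068672, 176),
    (13306192, 25034752, 191),
    (8634288, 15990784, 122),
    (4442896, 9175040, 70),
    (8256312, 15859712, 121),
]
INTERVAL_S = 60  # assumed sample interval, taken from the timestamps


def buffer_bytes_per_op(b_bytes, ops):
    """Buffer bytes allocated per ZIL operation in one sample."""
    return b_bytes / ops


def mean_payload_rate(samples, interval_s):
    """Average N-Bytes per second over the whole run."""
    total = sum(n for n, _, _ in samples)
    return total / (len(samples) * interval_s)


def payload_fraction(samples):
    """Share of the allocated buffer space that carried payload."""
    total_n = sum(n for n, _, _ in samples)
    total_b = sum(b for _, b, _ in samples)
    return total_n / total_b


if __name__ == "__main__":
    for n, b, ops in SAMPLES:
        print(ops, buffer_bytes_per_op(b, ops))
    print(mean_payload_rate(SAMPLES, INTERVAL_S))
    print(payload_fraction(SAMPLES))
```

Under those assumptions every operation works out to exactly 131072 buffer
bytes (128 KiB), which fits all ops landing in the >=32kB column, and roughly
half of the allocated buffer space is payload.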

I've also run bonnie++ and a scrub under about the same load;
the scrub was doing 80-90 MBytes/s.
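
To notice quickly if the log or cache slice falls out of the pool again (as
in the DEGRADED status quoted further down), something like the following
Python sketch could scan saved `zpool status` text for devices that are not
ONLINE. The function name unhealthy_devices and the SAMPLE text are only for
illustration; the parsing assumes the column layout shown in this thread and
is not a robust parser.

```python
# Device states that zpool status can report for a vdev.
STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}

# Config section copied from the DEGRADED status in this thread.
SAMPLE = """\
  pool: tank
 state: DEGRADED
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          raidz2-0  ONLINE       0     0     0
            c0t0d0  ONLINE       0     0     0
            c0t1d0  ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0
            c0t3d0  ONLINE       0     0     0
        logs
          c3d1s0    FAULTED      0     4     0  too many errors
        cache
          c3d1s1    FAULTED     13 7.68K     0  too many errors

errors: No known data errors
"""


def unhealthy_devices(status_text):
    """Return (name, state) for each config row whose state is not ONLINE."""
    bad = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if not in_config:
            # The device table starts right after the NAME/STATE header row.
            in_config = fields[:2] == ["NAME", "STATE"]
            continue
        if fields and fields[0] == "errors:":
            break  # end of the config section
        # Rows such as "logs" and "cache" have a single field and are skipped.
        if len(fields) >= 2 and fields[1] in STATES and fields[1] != "ONLINE":
            bad.append((fields[0], fields[1]))
    return bad


if __name__ == "__main__":
    for name, state in unhealthy_devices(SAMPLE):
        print(name, state)
```

Fed the status output from Aug 5, this reports the pool itself plus both SSD
slices; on the current pool it should report nothing.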
 
> 
> On Fri, Aug 05, 2011 at 10:43:02AM +0200, Eugen Leitl wrote:
> > 
> > I think I've found the source of my problem: I need to reflash
> > the N36L BIOS to a hacked russian version (sic) which allows
> > AHCI in the 5th drive bay
> > 
> > http://terabyt.es/2011/07/02/nas-build-guide-hp-n36l-microserver-with-nexenta-napp-it/
> > 
> > ...
> > 
> > Update BIOS and install hacked Russian BIOS
> > 
> > The HP BIOS for N36L does not support anything but legacy IDE emulation on 
> > the internal ODD SATA port and the external eSATA port. This is a problem 
> > for Nexenta which can detect false disk errors when using the ODD drive on 
> > emulated IDE mode. Luckily an unknown Russian hacker somewhere has modified 
> > the BIOS to allow AHCI mode on both the internal and eSATA ports. I have 
> > always said, “Give the Russians two weeks and they will crack anything” and 
> > usually that has held true. Huge thank you to whomever has modified this 
> > BIOS given HPs complete failure to do so.
> > 
> > I have enabled this with good results. The main one being no emails from 
> > Nexenta informing you that the syspool has moved to a degraded state when 
> > it actually hasn’t :) 
> > 
> > ...
> > 
> > On Fri, Aug 05, 2011 at 09:05:07AM +0200, Eugen Leitl wrote:
> > > On Thu, Aug 04, 2011 at 11:58:47PM +0200, Eugen Leitl wrote:
> > > > On Thu, Aug 04, 2011 at 02:43:30PM -0700, Larry Liu wrote:
> > > > >
> > > > >> root@nexenta:/export/home/eugen# zpool add tank log /dev/dsk/c3d1p0
> > > > >
> > > > > You should use c3d1s0 here.
> > > > >
> > > > >> Th
> > > > >> root@nexenta:/export/home/eugen# zpool add tank cache /dev/dsk/c3d1p1
> > > > >
> > > > > Use c3d1s1.
> > > > 
> > > > Thanks, that did the trick!
> > > > 
> > > > root@nexenta:/export/home/eugen# zpool status tank
> > > >   pool: tank
> > > >  state: ONLINE
> > > > >  scan: scrub repaired 0 in 0h0m with 0 errors on Fri Aug  5 03:04:57 2011
> > > > config:
> > > > 
> > > >         NAME        STATE     READ WRITE CKSUM
> > > >         tank        ONLINE       0     0     0
> > > >           raidz2-0  ONLINE       0     0     0
> > > >             c0t0d0  ONLINE       0     0     0
> > > >             c0t1d0  ONLINE       0     0     0
> > > >             c0t2d0  ONLINE       0     0     0
> > > >             c0t3d0  ONLINE       0     0     0
> > > >         logs
> > > >           c3d1s0    ONLINE       0     0     0
> > > >         cache
> > > >           c3d1s1    ONLINE       0     0     0
> > > > 
> > > > errors: No known data errors
> > > 
> > > Hmm, it doesn't seem to last more than a couple hours
> > > under test load (mapped as a CIFS share receiving a
> > > bittorrent download with 10 k small files in it at
> > > about 10 MByte/s) before falling from the pool:
> > > 
> > > root@nexenta:/export/home/eugen# zpool status tank
> > >   pool: tank
> > >  state: DEGRADED
> > > status: One or more devices are faulted in response to persistent errors.
> > > >         Sufficient replicas exist for the pool to continue functioning in a
> > > >         degraded state.
> > > > action: Replace the faulted device, or use 'zpool clear' to mark the device
> > > >         repaired.
> > >  scan: none requested
> > > config:
> > > 
> > >         NAME        STATE     READ WRITE CKSUM
> > >         tank        DEGRADED     0     0     0
> > >           raidz2-0  ONLINE       0     0     0
> > >             c0t0d0  ONLINE       0     0     0
> > >             c0t1d0  ONLINE       0     0     0
> > >             c0t2d0  ONLINE       0     0     0
> > >             c0t3d0  ONLINE       0     0     0
> > >         logs
> > >           c3d1s0    FAULTED      0     4     0  too many errors
> > >         cache
> > >           c3d1s1    FAULTED     13 7.68K     0  too many errors
> > > 
> > > errors: No known data errors
> > > 
> > > dmesg sez
> > > 
> > > Aug  5 05:53:26 nexenta EVENT-TIME: Fri Aug  5 05:53:26 CEST 2011
> > > > Aug  5 05:53:26 nexenta PLATFORM: ProLiant-MicroServer, CSN: CN7051P024, HOSTNAME: nexenta
> > > > Aug  5 05:53:26 nexenta SOURCE: zfs-diagnosis, REV: 1.0
> > > > Aug  5 05:53:26 nexenta EVENT-ID: 516e9c7c-9e29-c504-a422-db37838fa676
> > > > Aug  5 05:53:26 nexenta DESC: A ZFS device failed.  Refer to http://sun.com/msg/ZFS-8000-D3 for more information.
> > > > Aug  5 05:53:26 nexenta AUTO-RESPONSE: No automated response will occur.
> > > > Aug  5 05:53:26 nexenta IMPACT: Fault tolerance of the pool may be compromised.
> > > > Aug  5 05:53:26 nexenta REC-ACTION: Run 'zpool status -x' and replace the bad device.
> > > > Aug  5 05:53:39 nexenta fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
> > > > Aug  5 05:53:39 nexenta EVENT-TIME: Fri Aug  5 05:53:39 CEST 2011
> > > > Aug  5 05:53:39 nexenta PLATFORM: ProLiant-MicroServer, CSN: CN7051P024, HOSTNAME: nexenta
> > > > Aug  5 05:53:39 nexenta SOURCE: zfs-diagnosis, REV: 1.0
> > > > Aug  5 05:53:39 nexenta EVENT-ID: 3319749a-b6f7-c305-ec86-d94897dde85b
> > > > Aug  5 05:53:39 nexenta DESC: The number of I/O errors associated with a ZFS device exceeded
> > > > Aug  5 05:53:39 nexenta              acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-FD for more information.
> > > > Aug  5 05:53:39 nexenta AUTO-RESPONSE: The device has been offlined and marked as faulted.  An attempt
> > > > Aug  5 05:53:39 nexenta              will be made to activate a hot spare if available.
> > > > Aug  5 05:53:39 nexenta IMPACT: Fault tolerance of the pool may be compromised.
> > > > Aug  5 05:53:39 nexenta REC-ACTION: Run 'zpool status -x' and replace the bad device.
> > > 
> > > -- 
> > > Eugen* Leitl <a href="http://leitl.org";>leitl</a> http://leitl.org
> > > ______________________________________________________________
> > > ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
> > > 8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
> > > _______________________________________________
> > > zfs-discuss mailing list
> > > zfs-discuss@opensolaris.org
> > > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> > -- 
> > Eugen* Leitl <a href="http://leitl.org";>leitl</a> http://leitl.org
> > ______________________________________________________________
> > ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
> > 8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
> > _______________________________________________
> > zfs-discuss mailing list
> > zfs-discuss@opensolaris.org
> > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> -- 
> Eugen* Leitl <a href="http://leitl.org";>leitl</a> http://leitl.org
> ______________________________________________________________
> ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
> 8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
-- 
Eugen* Leitl <a href="http://leitl.org";>leitl</a> http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
