Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-04 Thread Fajar A. Nugraha
On Fri, Oct 3, 2008 at 10:37 PM, Vasile Dumitrescu
<[EMAIL PROTECTED]> wrote:

> VMWare 6.0.4 running on Debian unstable,
> Linux bigsrv 2.6.26-1-amd64 #1 SMP Wed Sep 24 13:59:41 UTC 2008 x86_64 
> GNU/Linux
>
> Solaris is vanilla snv_90 installed with no GUI.


>
> in summary: physical disks, assigned 100% to the VM

That's weird. I thought one of the points of using physical disks
instead of files was to avoid problems caused by caching on the host/dom0?


[zfs-discuss] data corruption causes unremovable files

2008-10-04 Thread David Gwynne
This week I melted a RAID HBA in a machine twice, which ended up causing real
data corruption on the disks holding the zpool. As a result, I have the
following output from zpool status:

# zpool status -v cache
  pool: cache
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed with 21 errors on Sat Oct  4 15:51:43 2008
config:

        NAME        STATE     READ WRITE CKSUM
        cache       ONLINE       0     0 4.99K
          c5t0d0    ONLINE       0     0     0
          c5t1d0    ONLINE       0     0     0
          c5t2d0    ONLINE       0     0 2.51K
          c5t3d0    ONLINE       0     0 2.48K

errors: Permanent errors have been detected in the following files:

/cache/staff/home/d1/d62/f16
/cache/staff_dumps/home-20081003-1300.dump
cache/students_dumps:<0x0>

I believe the last entry is the ZIL on that dataset, but the others
are real files. However, when I try to remove them:

[EMAIL PROTECTED]:~# rm /cache/staff_dumps/home-20081003-1300.dump
rm: /cache/staff_dumps/home-20081003-1300.dump not removed: I/O error
[EMAIL PROTECTED]:~# : > /cache/staff_dumps/home-20081003-1300.dump
bash: /cache/staff_dumps/home-20081003-1300.dump: I/O error

How do I get rid of these files?

dlg
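
One sequence that is sometimes worth trying before restoring from backup is to
clear the pool's error counters, re-scrub, and then retry the removal. This is
only a sketch, reusing the pool and file names from the output above, and it
will only help if the damage is confined to the file's data rather than its
metadata or the containing directory:

# zpool clear cache
# zpool scrub cache
# zpool status -v cache
# rm /cache/staff_dumps/home-20081003-1300.dump

(Let the scrub finish before re-checking the error list.) If rm still fails
with an I/O error, the remaining options are usually the ones the fault message
suggests: restore the affected files over the damaged ones, or destroy and
recreate the affected dataset from backup.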


Re: [zfs-discuss] ZFS with Fusion-IO?

2008-10-04 Thread David Flynn
Yes, we've been pleasantly surprised by the demand. But that doesn't mean we
aren't eager to expand our ability to address such an important market as
OpenSolaris and ZFS.

We're actively working on OpenSolaris drivers.  We don't expect it to take long 
- I'll keep you posted.

-David Flynn

CTO Fusion-io
[EMAIL PROTECTED]


Re: [zfs-discuss] ZFS with Fusion-IO?

2008-10-04 Thread Ross
Woohoo!  Best news I've heard all week. Thanks for posting, David :)


[zfs-discuss] An slog experiment (my NAS can beat up your NAS)

2008-10-04 Thread Chris Greer
I currently have a traditional NFS cluster hardware setup in the lab (2 hosts
with FC-attached JBOD storage) but no cluster software yet.  I've been wanting
to try out a separate ZIL to see what it might do to boost performance.  My
problem is that I don't have any cool SSD devices, much less ones that I could
share between two hosts.  Commercial arrays have custom hardware with mirrored
cache, which got me thinking about a way to do this with regular hardware.

So I tried this experiment this week...
On each host (OpenSolaris 2008.05), I created an 8GB ramdisk with ramdiskadm.
I shared each host's ramdisk to the other via the iSCSI target and initiator
over a 1Gb crossconnect cable (jumbo frames enabled), and added the two
ramdisks as mirrored slog devices in a zpool.
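
Roughly, that looks like the following (a sketch only: the pool name tank, the
ramdisk name slog0, the crossconnect address, and the resulting iSCSI device
name c9t0d0 are placeholders, and the exact iscsitadm/iscsiadm syntax may
differ between builds).

On each host, create the ramdisk and export it as an iSCSI target:

# ramdiskadm -a slog0 8g
# iscsitadm create target -b /dev/ramdisk/slog0 slog0

On each host, discover the other host's target over the crossconnect:

# iscsiadm add discovery-address 192.168.100.2:3260
# iscsiadm modify discovery -t enable
# devfsadm -i iscsi

Then add the local ramdisk plus the remote iSCSI LUN as the mirrored slog:

# zpool add tank log mirror /dev/ramdisk/slog0 c9t0d0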

The end result was a pool that I could import and export between hosts, and
that can survive one of the hosts dying.  I also copied a dd image of my
ramdisk device to stable storage with the pool exported (and thus flushed),
which allowed me to shut the entire cluster down, power one node up, recreate
the ramdisk, dd the image back, and re-import the pool.
I'm not sure I could survive a crash of both nodes; I'm going to try to test
that some more.
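
The dd step is nothing fancy; with the pool exported (so the slog is quiesced)
and with placeholder names, it is roughly:

# dd if=/dev/rramdisk/slog0 of=/var/tmp/slog0.img bs=1024k

and after a full shutdown, on the first node to come back up:

# ramdiskadm -a slog0 8g
# dd if=/var/tmp/slog0.img of=/dev/rramdisk/slog0 bs=1024k
# zpool import tank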

The big thing here is that I got a MASSIVE boost in performance even with the
overhead of the 1Gb link and iSCSI.  The iorate test I was using went from
3073 IOPS on 90% sequential writes to 23953 IOPS with the RAM slog added.  The
service times were also significantly better than on the physical disks.
It also boosted reads significantly, and I'm guessing that's because updating
the access times on the files was completely cached.

So what are the downsides to this?  If both nodes were to crash and I used the
same technique to recreate the ramdisk, I would lose any transactions that were
in the slog at the time of the crash, but the on-disk image would still be in a
consistent state, right (just not from my app's point of view)?  Does anyone
have any idea what difference InfiniBand might make for the crossconnect?  In
some tests I did completely saturate the 1Gb link between the boxes.

So is this idea completely crazy?  It also raises the question of correctly
sizing your slog in relation to the physical disks on the back end.  It looks
like if the ZIL can handle significantly more I/O than the physical disks, the
effect will be short-lived, as the system has to throttle things while it
spends more time flushing from the slog to physical disk.  The 8GB looked like
overkill in my case: in a lot of the tests it drove the individual disks in the
system to 100% busy and pushed service times on the physical disks into the
900 - 1000ms range (although my app never saw that because of the slog).
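
As a rough sizing sanity check (back-of-the-envelope numbers; this assumes the
slog only has to hold sync writes that have not yet been committed in a
transaction group, and that a txg commits every few seconds):

  1Gb link at saturation:       ~125 MB/s of incoming sync writes
  time data sits in the slog:   ~10 s (a few txg commit intervals)
  worst-case slog occupancy:    125 MB/s x 10 s = ~1.25 GB

so for a 1Gb crossconnect, a 1-2GB slog should already cover the worst case,
which fits the observation that 8GB was overkill.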