[zfs-discuss] Recover zpool/zfs

2008-11-07 Thread Thomas Kloeber

This is the 2nd attempt, so my apologies, if this mail got to you already...

Folkses,

I'm in an absolute state of panic because I lost about 160GB of data
which were on an external USB disk.
Here is what happened:
1. I added a 500GB USB disk to my Ultra25/Solaris 10
2. I created a zpool and a zfs for the whole disk
3. I copied lots of data on to the disk
4. I 'zfs umount'ed the disk, unplugged the USB cable
5. I attached the disk to a PC/Solaris 10 and tried to get the data
6. but I couldn't find it (didn't know about 'zpool import' then)
7. re-attached the disk to the Ultra25/Solaris 10
and now 'zpool status' tells me that the pool is FAULTED and the data is
corrupted. SUN support tells me that I have lost all data. They also
suggested to try this disk on a Nevada system which I did. There the
zpool showed up correctly but the zfs file system didn't.

If I do a hex dump of the physical device I can see all the data but I
can't get to it... aaarrrggghhh

Has anybody successfully patched/tweaked/whatever a zpool or zfs to
recover from this?
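
(For reference, a hedged sketch of the usual first steps before any label
surgery - letting ZFS search for importable pools explicitly and dumping
whatever labels are still readable; the device path is the one from the
status output below:)

zpool import                # list any pools ZFS can still discover
zpool import -d /dev/dsk    # search a device directory explicitly
zdb -l /dev/dsk/c5t0d0s0    # dump the four vdev labels on the suspect slice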

I would be most and forever grateful if somebody could give me a hint.

Thanx,

Thomas

Following is the 'zpool status':
zpool status -xv
pool: usbDisk
state: FAULTED
status: One or more devices could not be used because the label is missing
or invalid. There are insufficient replicas for the pool to continue
functioning.
action: Destroy and re-create the pool from a backup source.
see: http://www.sun.com/msg/ZFS-8000-5E
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
usbDisk FAULTED 0 0 0 corrupted data
c5t0d0s0 FAULTED 0 0 0 corrupted data

and the output of zdb:
disk1
version=4
name='disk1'
state=0
txg=4
pool_guid=10847216141970570446
vdev_tree
type='root'
id=0
guid=10847216141970570446
children[0]
type='disk'
id=0
guid=5919680419735522389
path='/dev/dsk/c1t1d0s1'
devid='id1,[EMAIL PROTECTED]/b'
whole_disk=0
metaslab_array=14
metaslab_shift=30
ashift=9
asize=141729202176
usbDisk
version=4
name='usbDisk'
state=0
txg=4
pool_guid=18060149023938226877
vdev_tree
type='root'
id=0
guid=18060149023938226877
children[0]
type='disk'
id=0
guid=4959744788625823079
path='/dev/dsk/c5t0d0s0'
devid='id1,[EMAIL PROTECTED]/a'
whole_disk=0
metaslab_array=14
metaslab_shift=32
ashift=9
asize=500100497408
obelix2:/# zdb -l /dev/dsk/c5t0d0s0

LABEL 0

version=4
name='WD'
state=0
txg=36
pool_guid=7944655440509665617
hostname='opensolaris'
top_guid=13767930161106254968
guid=13767930161106254968
vdev_tree
type='disk'
id=0
guid=13767930161106254968
path='/dev/dsk/c8t0d0p0'
devid='id1,[EMAIL PROTECTED]/q'
phys_path='/[EMAIL PROTECTED],0/pci1028,[EMAIL PROTECTED],1/[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0:q'
whole_disk=0
metaslab_array=14
metaslab_shift=32
ashift=9
asize=500103118848
is_log=0
DTL=18

LABEL 1

version=4
name='WD'
state=0
txg=36
pool_guid=7944655440509665617
hostname='opensolaris'
top_guid=13767930161106254968
guid=13767930161106254968
vdev_tree
type='disk'
id=0
guid=13767930161106254968
path='/dev/dsk/c8t0d0p0'
devid='id1,[EMAIL PROTECTED]/q'
phys_path='/[EMAIL PROTECTED],0/pci1028,[EMAIL PROTECTED],1/[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0:q'
whole_disk=0
metaslab_array=14
metaslab_shift=32
ashift=9
asize=500103118848
is_log=0
DTL=18

LABEL 2

version=4
name='usbDisk'
state=0
txg=4
pool_guid=18060149023938226877
top_guid=4959744788625823079
guid=4959744788625823079
vdev_tree
type='disk'
id=0
guid=4959744788625823079
path='/dev/dsk/c5t0d0s0'
devid='id1,[EMAIL PROTECTED]/a'
whole_disk=0
metaslab_array=14
metaslab_shift=32
ashift=9
asize=500100497408

LABEL 3

version=4
name='usbDisk'
state=0
txg=4
pool_guid=18060149023938226877
top_guid=4959744788625823079
guid=4959744788625823079
vdev_tree
type='disk'
id=0
guid=4959744788625823079
path='/dev/dsk/c5t0d0s0'
devid='id1,[EMAIL PROTECTED]/a'
whole_disk=0
metaslab_array=14
metaslab_shift=32
ashift=9
asize=500100497408
--
Intelligent Communication Software Vertriebs GmbH
Firmensitz: Kistlerhof Str. 111, 81379 München
Registergericht: Amtsgericht München, HRB 88283
Geschäftsführer: Albert Fuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help recovering zfs filesystem

2008-11-07 Thread Sherwood Glazier
That's exactly what I was looking for.  Hopefully Sun will see fit to include 
this functionality in the OS soon.

Thanks!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] recovering data from a dettach mirrored vdev

2008-11-07 Thread Krzys
I was wondering if this ever made it into zfs as a fix for bad labels?

On Wed, 7 May 2008, Jeff Bonwick wrote:

> Yes, I think that would be useful.  Something like 'zpool revive'
> or 'zpool undead'.  It would not be completely general-purpose --
> in a pool with multiple mirror devices, it could only work if
> all replicas were detached in the same txg -- but for the simple
> case of a single top-level mirror vdev, or a clean 'zpool split',
> it's actually pretty straightforward.
>
> Jeff
>
> On Tue, May 06, 2008 at 11:16:25AM +0100, Darren J Moffat wrote:
>> Great tool, any chance we can have it integrated into zpool(1M) so that
>> it can find and "fixup" on import detached vdevs as new pools ?
>>
>> I'd think it would be reasonable to extend the meaning of
>> 'zpool import -D' to list detached vdevs as well as destroyed pools.
>>
>> --
>> Darren J Moffat
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS, Kernel Panic on import

2008-11-07 Thread Andrew
I woke up yesterday morning, only to discover my system kept rebooting..

It's been running fine for the last while. I upgraded to snv 98 a couple weeks 
back (from 95), and had upgraded my RaidZ Zpool from version 11 to 13 for 
improved scrub performance.

After some research it turned out that, on bootup, importing my 4tb raidZ array 
was causing the system to panic (similar to this OP's error). I got that 
bypassed, and can now at least boot the system..

However, when I try anything (like mdb -kw), it advises me that there is no 
command line editing because: "mdb: no terminal data available for TERM=vt320. 
term init failed: command-line editing and prompt will not be available". This 
means I can't really try what aldredmr had done in mdb, and I really don't have 
any experience in it. I upgraded to snv_100 (November), but I'm experiencing the
exact same issues.

If anyone has some insight, it would be greatly appreciated. Thanks
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help recovering zfs filesystem

2008-11-07 Thread Nigel Smith
FYI, here are the links to the 'labelfix' utility.
It is an attachment to one of Jeff Bonwick's posts in this thread:

http://www.opensolaris.org/jive/thread.jspa?messageID=229969

or here:

http://mail.opensolaris.org/pipermail/zfs-discuss/2008-May/047267.html
http://mail.opensolaris.org/pipermail/zfs-discuss/2008-May/047270.html
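
(For reference, a hedged sanity check to run before and after applying the
labelfix tool - the device path is a placeholder - is simply to dump the vdev
labels and confirm all four become readable, then retry the import:)

zdb -l /dev/dsk/c5t0d0s0    # should show LABEL 0 through LABEL 3 once fixed
zpool import                # the pool should then be visible for import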

Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Ian Collins
River Tarnell wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> hi,
>
> i have two systems, A (Solaris 10 update 5) and B (Solaris 10 update 6).  i'm
> using 'zfs send -i' to replicate changes on A to B.  however, the 'zfs recv' 
> on
> B is running extremely slowly.  
I'm sorry, I didn't notice the "-i" in your original message.

I get the same problem sending incremental streams between Thumpers.

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems which scrub can't find?

2008-11-07 Thread Matt Ingenthron
Off the lists, someone suggested to me that the "Inconsistent 
filesystem" may be the boot archive and not the ZFS filesystem (though I 
still don't know what's wrong with booting b99).

Regardless, I tried rebuilding the boot_archive with bootadm 
update-archive -vf and verified it by mounting it  and peeking inside.  
I also tried both with and without /etc/hostid.  I still get the same 
behavior.
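
(For anyone wanting to repeat this, a minimal sketch of the rebuild-and-inspect
sequence - assuming an x86 boot_archive in UFS format and the broken root
mounted at /a from the live CD, both of which are assumptions:)

bootadm update-archive -vf -R /a
lofiadm -a /a/platform/i86pc/boot_archive    # prints the lofi device, e.g. /dev/lofi/1
mount -F ufs -o ro /dev/lofi/1 /mnt          # peek inside the rebuilt archive
ls /mnt
umount /mnt
lofiadm -d /dev/lofi/1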

Any thoughts?

Thanks in advance,

- Matt

[EMAIL PROTECTED] wrote:
> Hi,
>
> After a recent pkg image-update to OpenSolaris build 100, my system 
> booted once and now will no longer boot.  After exhausting other 
> options, I am left wondering if there is some kind of ZFS issue a 
> scrub won't find.
>
> The current behavior is that it will load GRUB, but trying to boot the
> most recent boot environment (b100 based) I get "Error 16: Inconsistent
> filesystem structure".  The pool has gone through two scrubs from a 
> livecd based on b101a without finding anything wrong.  If I select the 
> previous boot environment (b99 based), I get a kernel panic.
>
> I've tried replacing the /etc/hostid based on a hunch from one of the 
> engineers working on Indiana and ZFS boot.  I also tried rebuilding 
> the boot_archive and reloading the GRUB based on build 100.  I then 
> tried reloading the build 99 grub to hopefully get to where I could 
> boot build 99.  No luck with any of these thus far.
>
> More below, and some comments in this bug:
> http://defect.opensolaris.org/bz/show_bug.cgi?id=3965, though may need
> to be a separate bug.
>
> I'd appreciate any suggestions and be glad to gather any data to 
> diagnose this if possible.
>
>
> == Screen when trying to boot b100 after boot menu ==
>
>  Booting 'opensolaris-15'
>
> bootfs rpool/ROOT/opensolaris-15
> kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
> loading '/platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS' ...
> cpu: 'GenuineIntel' family 6 model 15 step 11
> [BIOS accepted mixed-mode target setting!]
>   [Multiboot-kludge, loadaddr=0xbffe38, text-and-data=0x1931a8, bss=0x0,
> entry=0xc0]
> '/platform/i86pc/kernel/amd64/unix -B
> zfs-bootfs=rpool/391,bootpath="[EMAIL PROTECTED],0/pci1179,[EMAIL 
> PROTECTED],2/[EMAIL PROTECTED],0:a",diskdevid="id1,[EMAIL PROTECTED]/a"' 
>
> is loaded
> module$ /platform/i86pc/$ISADIR/boot_archive
> loading '/platform/i86pc/$ISADIR/boot_archive' ...
>
> Error 16: Inconsistent filesystem structure
>
> Press any key to continue...
>
>
>
> == Booting b99 ==
> (by selecting the grub entry from the GRUB menu and adding -kd then 
> doing a :c to continue I get the following stack trace)
>
> debug_enter+37 ()
> panicsys+40b ()
> vpanic+15d ()
> panic+9c ()
> (lines above typed in from ::stack, lines below typed in from when it 
> dropped into the debugger)
> unix:die+ea ()
> unix:trap+3d0 ()
> unix:cmntrap+e9 ()
> unix:mutex_owner_running+d ()
> genunix:lokuppnat+bc ()
> genunix:vn_removeat+7c ()
> genunix:vn_remove_28 ()
> zfs:spa_config_write+18d ()
> zfs:spa_config_sync+102 ()
> zfs:spa_open_common+24b ()
> zfs:spa_open+1c ()
> zfs:dsl_dsobj_to_dsname+37 ()
> zfs:zfs_parse_bootfs+68 ()
> zfs:zfs_mountroot+10a ()
> genunxi:fsop_mountroot+1a ()
> genunix:rootconf+d5 ()
> genunix:vfs_mountroot+65 ()
> genunix:main+e6 ()
> unix:_locore_start+92 ()
>
> panic: entering debugger (no dump device, continue to reboot)
> Loaded modules: [ scsi_vhci uppc sd zfs specfs pcplusmp cpu.generic ]
> kmdb: target stopped at:
> kmdb_enter+0xb: movq   %rax,%rdi
>
>
>
> == Output from zdb ==
> 
> LABEL 0
> 
>version=10
>name='rpool'
>state=1
>txg=327816
>pool_guid=6981480028020800083
>hostid=95693
>hostname='opensolaris'
>top_guid=5199095267524632419
>guid=5199095267524632419
>vdev_tree
>type='disk'
>id=0
>guid=5199095267524632419
>path='/dev/dsk/c4t0d0s0'
>devid='id1,[EMAIL PROTECTED]/a'
>phys_path='/[EMAIL PROTECTED],0/pci1179,[EMAIL PROTECTED],2/[EMAIL 
> PROTECTED],0:a'
>whole_disk=0
>metaslab_array=14
>metaslab_shift=29
>ashift=9
>asize=90374406144
>is_log=0
>DTL=161
> 
> LABEL 1
> 
>version=10
>name='rpool'
>state=1
>txg=327816
>pool_guid=6981480028020800083
>hostid=95693
>hostname='opensolaris'
>top_guid=5199095267524632419
>guid=5199095267524632419
>vdev_tree
>type='disk'
>id=0
>guid=5199095267524632419
>path='/dev/dsk/c4t0d0s0'
>devid='id1,[EMAIL PROTECTED]/a'
>phys_path='/[EMAIL PROTECTED],0/pci1179,[EMAIL PROTECTED],2/[EMAIL 
> PROTECTED],0:a'
>whole_disk=0
>metaslab_array=14
>metaslab_shift=29
>ashift=9
>asize=90374406144
>is_log=0
>DTL=161
> -

Re: [zfs-discuss] ZFS, Kernel Panic on import

2008-11-07 Thread Jim Dunham
Andrew,

> I woke up yesterday morning, only to discover my system kept  
> rebooting..
>
> It's been running fine for the last while. I upgraded to snv 98 a  
> couple weeks back (from 95), and had upgraded my RaidZ Zpool from  
> version 11 to 13 for improved scrub performance.
>
> After some research it turned out that, on bootup, importing my 4tb  
> raidZ array was causing the system to panic (similar to this OP's  
> error). I got that bypassed, and can now at least boot the system..
>
> However, when I try anything (like mdb -kw), it advises me that  
> there is no command line editing because: "mdb: no terminal data  
> available for TERM=vt320. term init failed: command-line editing and  
> prompt will not be available". This means I can't really try what  
> aldredmr had done in mdb, and I really don't have any experience in  
> it. I upgraded to snv_100 (November), but experiencing the exact  
> same issues
>
> If anyone has some insight, it would be greatly appreciated. Thanks

I have the same problem SSH'ing in from my Mac OS X, which sets the  
TERM type to 'xterm-color', also not supported.

Do the following, depending on your default shell. and you should be  
all set.

TERM=vt100; export TERM
or
setenv TERM vt100

>
> -- 
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Jim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] How to diagnose zfs - iscsi - nfs hang

2008-11-07 Thread Jacob Ritorto
I have a PC server running Solaris 10 5/08 which seems to frequently become 
unable to share zfs filesystems via the shareiscsi and sharenfs options.  It 
appears, from the outside, to be hung -- all clients just freeze, and while 
they're able to ping the host, they're not able to transfer nfs or iSCSI data.  
They're in the same subnet and I've found no network problems thus far.  

After hearing so much about the Marvell problems I'm beginning to wonder if 
they're the culprit, though they're supposed to be fixed in 127128-11, which is 
the kernel I'm running.  

I have an exact hardware duplicate of this machine running Nevada b91 (iirc) 
that doesn't exhibit this problem.

There's nothing in /var/adm/messages and I'm not sure where else to begin.  

Would someone please help me in diagnosing this failure?  
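
(Not an answer, but a hedged set of generic first-look commands for a hang like
this, run on the server while the clients are frozen:)

svcs -xv                           # any NFS or iSCSI target services in maintenance?
iostat -xnz 5 3                    # are the disks still completing I/O?
fmdump -eV | tail -50              # recent fault management error telemetry
echo "::threadlist -v" | mdb -k    # kernel thread stacks; look for threads stuck in zfs/nfs/iscsi code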

thx
jake
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Ended up in GRUB prompt after the installation on ZFS

2008-11-07 Thread jan damborsky

Hi ZFS team,

when testing installation with recent OpenSolaris builds,
we have been encountering that in some cases, people end up
in GRUB prompt after the installation - it seems that menu.lst
can't be accessed for some reason. At least two following bugs
seems to be describing the same manifestation of the problem
which root cause has not been identified yet:

4051 opensolaris b99b/b100a does not install on 1.5 TB disk or boot 
fails after install

4591 Install failure on a Sun Fire X4240 with Opensolaris 200811

Nothing in those bug reports indicates they might be ZFS related,
and there are probably more scenarios in which this might happen.

But when I hit that problem today while testing the Automated Installer
(it is a part of the Caiman project and will replace the current JumpStart
install technology), I was able to make GRUB find 'menu.lst' just by
using the 'zpool import' command - please see below for the detailed procedure.

Based on this, could you please take a look at those observations
and, if possible, help me understand whether there is anything obvious
that might be wrong, and whether you think this is somehow related to
the ZFS technology?

Thank you very much for your help,
Jan


configuration:
--
HW: Ultra 20, 1GB RAM, 1 250GB SATA drive
SW: Opensolaris build 100, 64bit mode

steps used:
---
[1] OpenSolaris 100 installed using Automated Installer
   - Solaris 2 partition created during installation

* partition configuration before installation:

# fdisk -W - c2t0d0p0
...
* Id   Act  Bhead  Bsect  Bcyl  Ehead  Esect  Ecyl  Rsect     Numsect
  192  0    0      1      1     254    63     1023  16065     22491000


* partition configuration after installation:

# fdisk -W - c2t0d0p0
...
* Id   Act  Bhead  Bsect  Bcyl  Ehead  Esect  Ecyl  Rsect     Numsect
  192  0    0      1      1     254    63     1023  16065     22491000
  191  128  254    63     1023  254    63     1023  22507065  3000


[2] When I reboot the system after the installation, I ended up in GRUB 
prompt:

grub> root
(hd0,1,a): Filesystem type unknown, partition type 0xbf

grub> cat /rpool/boot/grub/menu.lst

Error 17: Cannot mount selected partition

grub>

[3] I rebooted into AI and did 'zpool import'
# zdb -l /dev/rdsk/c2t0d0s0 > /tmp/zdb_before_import.txt (attached)
# zpool import -f rpool
# zdb -l /dev/rdsk/c2t0d0s0 > /tmp/zdb_after_import.txt (attached)
# diff /tmp/zdb_before_import.txt /tmp/zdb_after_import.txt
7c7
< txg=21
---
> txg=2675
9c9
< hostid=4741222
---
> hostid=4247690
17a18
> devid='id1,[EMAIL PROTECTED]/a'
31c32
...
# reboot

[4] Now GRUB can access menu.lst and Solaris is booted

hypothesis
--
It seems that for some reason, when the ZFS pool was
created, the 'devid' information was not added to the
ZFS label.

When 'zpool import' was called, 'devid' got populated.
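
(A quick hedged check for this condition on an installed disk - device path as
in the procedure above - is to grep the label dump for the devid entry; empty
output would mean the GRUB ZFS reader quoted below bails out with
ERR_NO_BOOTPATH:)

zdb -l /dev/rdsk/c2t0d0s0 | grep devid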

Looking at the GRUB ZFS plug-in, it seems that 'devid'
(ZPOOL_CONFIG_DEVID attribute) is required in order to
be able to access ZFS filesystem:

In grub/grub-0.95/stage2/fsys_zfs.c:

vdev_get_bootpath()
{
...
   if (strcmp(type, VDEV_TYPE_DISK) == 0) {
   if (vdev_validate(nv) != 0 ||
   (nvlist_lookup_value(nv, ZPOOL_CONFIG_PHYS_PATH,
   bootpath, DATA_TYPE_STRING, NULL) != 0) ||
   (nvlist_lookup_value(nv, ZPOOL_CONFIG_DEVID,
   devid, DATA_TYPE_STRING, NULL) != 0))
   return (ERR_NO_BOOTPATH);
...
}

additional observations:

[1] If 'devid' is populated during installation after 'zpool create'
operation, the problem doesn't occur.

[2] Following the described procedure, the problem is reproducible
at will on the system where it was initially reproduced
(please see above for the configuration).

[3] I was not able to reproduce that using exactly the same
procedure on following configurations:
* Ferrari 4000 with 160GB IDE disk
* vmware - installation done on IDE disk

[4] When installation is done into an existing Solaris2 partition containing
a Solaris instance, 'devid' is always populated and the problem
doesn't occur (it doesn't matter whether the partition is marked
'active' or not).


LABEL 0

version=13
name='rpool'
state=0
txg=21
pool_guid=7190133845720836012
hostid=4741222
hostname='opensolaris'
top_guid=13119992024029372510
guid=13119992024029372510
vdev_tree
type='disk'
id=0
guid=13119992024029372510
path='/dev/dsk/c2t0d0s0'
phys_path='/[EMAIL PROTECTED],0/pci108e,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0:a'
whole_disk=0
metaslab_array=23
metaslab_shift=27
ashift=9
asize=15327035392
is_log=0

LABEL 1

version=13
name='rpool'
state=0
txg=21
pool_guid=7190133845720836012
hostid=4741222
hostname='opensolaris'
top_g

Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Ian Collins
River Tarnell wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Ian Collins:
>   
>> That's very slow.  What's the nature of your data?
>> 
>  
> mainly two sets of mid-sized files; one of 200KB-2MB in size and other under
> 50KB.  they are organised into subdirectories, A/B/C/.  each directory
> has 18,000-25,000 files.  total data size is around 2.5TB.
>
> hm, something changed while i was writing this mail: now the transfer is
> running at 2MB/sec, and the read i/o has disappeared.  that's still slower 
> than
> i'd expect, but an improvement.
>
>   
The transfer I mentioned just completed: 1.45TB sent in 84832 seconds
(17.9MB/sec).  This was during a working day when the server and network
were busy.

The best ftp speed I managed was 59 MB/sec over the same network.

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disk space usage of zfs snapshots and filesystems -my math doesn't add up

2008-11-07 Thread Miles Nordin
> "n" == none  <[EMAIL PROTECTED]> writes:

 n> snapshots referring to old data which has been deleted from
 n> the current filesystem and I'd like to find out which
 n> snapshots refer to how much data

Imagine you have a filesystem containing ten 1MB files,

  zfs create root/export/home/carton/t
  cd t
  n=0; while [ $n -lt 10 ]; do mkfile 1m $n; n=$(( $n + 1 )); done

and you change nothing on the filesystem except to slowly delete one
file at a time.  After you delete each of them, you take another
snapshot.

  n=0; while [ $n -lt 10 ]; do 
 zfs snapshot root/export/home/carton/[EMAIL PROTECTED]; 
 rm $n; 
 n=$(( $n + 1 )); 
  done

When they're all gone, you stop. Now you want to know, destroying
which snapshot can save you space.

If you think about how you got the filesystem to this state, the ideal
answer is, only deleting the oldest snapshot will save space.
Deleting any middle snapshot saves no space.  Here's what 'zfs list'
says:

bash-3.2# zfs list -r root/export/home/carton/t
NAME  USED  AVAIL  REFER  MOUNTPOINT
root/export/home/carton/t10.2M  26.6G18K  /export/home/carton/t
root/export/home/carton/[EMAIL PROTECTED]  1.02M  -  10.0M  -
root/export/home/carton/[EMAIL PROTECTED]18K  -  9.04M  -
root/export/home/carton/[EMAIL PROTECTED]18K  -  8.04M  -
root/export/home/carton/[EMAIL PROTECTED]18K  -  7.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  6.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  5.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  4.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  3.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  2.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -

now, uselessly destroy a middle snapshot:

bash-3.2# zfs destroy root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# zfs list -r root/export/home/carton/t
NAME  USED  AVAIL  REFER  MOUNTPOINT
root/export/home/carton/t10.2M  26.6G18K  /export/home/carton/t
root/export/home/carton/[EMAIL PROTECTED]  1.02M  -  10.0M  -
root/export/home/carton/[EMAIL PROTECTED]18K  -  9.04M  -
root/export/home/carton/[EMAIL PROTECTED]18K  -  8.04M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  6.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  5.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  4.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  3.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  2.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -

still using 10MB.  but if you destroy the oldest snapshot:

bash-3.2# zfs destroy root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# zfs list -r root/export/home/carton/t
NAME  USED  AVAIL  REFER  MOUNTPOINT
root/export/home/carton/t9.17M  26.6G18K  /export/home/carton/t
root/export/home/carton/[EMAIL PROTECTED]  1.02M  -  9.04M  -
root/export/home/carton/[EMAIL PROTECTED]18K  -  8.04M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  6.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  5.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  4.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  3.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  2.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -

space is freed.  However, there are other cases where deleting a
middle snapshot will save space:

bash-3.2# mkfile 1m 0
bash-3.2# zfs snapshot root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# mkfile 1m 1
bash-3.2# zfs snapshot root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# rm 1
bash-3.2# zfs snapshot root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# rm 0
bash-3.2# zfs list -r root/export/home/carton/t
NAME  USED  AVAIL  REFER  MOUNTPOINT
root/export/home/carton/t2.07M  26.6G18K  /export/home/carton/t
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -
root/export/home/carton/[EMAIL PROTECTED]  1.02M  -  2.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -
bash-3.2# zfs destroy root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# zfs list -r root/export/home/carton/t
NAME  USED  AVAIL  REFER  MOUNTPOINT
root/export/home/carton/t1.05M  26.6G18K  /export/home/carton/t
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -

To me, it looks as though the USED column tells you ``how much space
would be freed if I destroyed the thing on this row,'' and the REFER
column tells you ``about what would 'du -s' report if I ran it on the
root of this dataset.''  The answer to your question is then to look
at the USED column, but with one caveat: after you delete something, the
USED column will reshuffle.

Re: [zfs-discuss] ZFS, Kernel Panic on import

2008-11-07 Thread Andrew
Thanks a lot! Google didn't seem to cooperate as well as I had hoped.

Still no dice on the import. I only have shell access on my Blackberry Pearl 
from where I am, so it's kind of hard, but I'm managing.. I've tried the OP's 
exact commands, and even trying to import array as ro, yet the system still 
wants to panic.. I really hope I don't have to redo my array, and lose 
everything as I still have faith in ZFS...
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Is ZFS already the default file System for Solaris 10?

2008-11-07 Thread Kumar, Amit H.
Is ZFS already the default file System for Solaris 10?
If yes has anyone tested it on Thumper ??
Thank you,
Amit

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel panic at zpool import

2008-11-07 Thread Andrew
Do you guys have any more information about this? I've tried the offset 
methods, zfs_recover, aok=1, mounting read only, yada yada, with still 0 luck. 
I have about 3TBs of data on my array, and I would REALLY hate to lose it.

Thanks!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is ZFS already the default file System for Solaris 10?

2008-11-07 Thread Rich Teer
On Fri, 7 Nov 2008, Kumar, Amit H. wrote:

> Is ZFS already the default file System for Solaris 10?

ZFS isn't the default file system for Solaris 10, but it is
selectable as the root file system with the most recent update.

-- 
Rich Teer, SCSA, SCNA, SCSECA

CEO,
My Online Home Inventory

URLs: http://www.rite-group.com/rich
  http://www.linkedin.com/in/richteer
  http://www.myonlinehomeinventory.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang

2008-11-07 Thread Brent Jones
On Fri, Nov 7, 2008 at 9:11 AM, Jacob Ritorto <[EMAIL PROTECTED]> wrote:
> I have a PC server running Solaris 10 5/08 which seems to frequently become 
> unable to share zfs filesystems via the shareiscsi and sharenfs options.  It 
> appears, from the outside, to be hung -- all clients just freeze, and while 
> they're able to ping the host, they're not able to transfer nfs or iSCSI 
> data.  They're in the same subnet and I've found no network problems thus far.
>
> After hearing so much about the Marvell problems I'm beginning to wonder if 
> they're the culprit, though they're supposed to be fixed in 127128-11, which 
> is the kernel I'm running.
>
> I have an exact hardware duplicate of this machine running Nevada b91 (iirc) 
> that doesn't exhibit this problem.
>
> There's nothing in /var/adm/messages and I'm not sure where else to begin.
>
> Would someone please help me in diagnosing this failure?
>
> thx
> jake
> --
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>

I saw this in Nev b87, where for whatever reason, CIFS and NFS would
completely hang and no longer serve requests (I don't use iscsi,
unable to confirm if that had hung too).
The server was responsive, SSH was fine and could execute commands,
clients could ping it and reach it, but CIFS and NFS were essentially
hung.
Intermittently, the system would recover and resume offering shares;
no triggering events could be correlated.
Since upgrading to newer builds, I haven't seen similar issues.

-- 
Brent Jones
[EMAIL PROTECTED]
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disk space usage of zfs snapshots and filesystems-my math doesn't add up

2008-11-07 Thread Adam.Bodette
I really think there is something wrong with how space is being reported
by zfs list in terms of snapshots.

Stealing the example from earlier, where a new file system was created, ten
1MB files were created, and then the cycle was snap, remove a file, snap,
remove a file, until they are all gone and you are left with:
 
> bash-3.2# zfs list -r root/export/home/carton/t
> NAME  USED  AVAIL  REFER  MOUNTPOINT
> root/export/home/carton/t10.2M  26.6G18K  
> /export/home/carton/t
> root/export/home/carton/[EMAIL PROTECTED]  1.02M  -  10.0M  -
> root/export/home/carton/[EMAIL PROTECTED]18K  -  9.04M  -
> root/export/home/carton/[EMAIL PROTECTED]18K  -  8.04M  -
> root/export/home/carton/[EMAIL PROTECTED]18K  -  7.03M  -
> root/export/home/carton/[EMAIL PROTECTED]17K  -  6.03M  -
> root/export/home/carton/[EMAIL PROTECTED]17K  -  5.03M  -
> root/export/home/carton/[EMAIL PROTECTED]17K  -  4.03M  -
> root/export/home/carton/[EMAIL PROTECTED]17K  -  3.02M  -
> root/export/home/carton/[EMAIL PROTECTED]17K  -  2.02M  -
> root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -

So the file system itself is now empty of files (18K refer for overhead)
but still using 10MB because of the snapshots still holding onto all 10
1MB files.

As I understand snapshots, the oldest one actually holds the file
after it is deleted, and any newer snapshot just points to what that
oldest one is holding.

So because the 0 snapshot was taken first it knows about all 10 files,
snap 1 only knows about 9, etc.

The refer numbers all match up correctly as that is how much data
existed at the time of the snapshot.

But the used seems wrong.

The 0 snapshot should be holding onto all 10 files so I would expect it
to be 10MB "Used" when it's only reporting 1MB used.  Where is the other
9MB hiding?  It only exists because a snapshot is holding it so that
space should be charged to a snapshot.  Since snapshot 1-9 should only
be pointing at the data held by 0 their numbers are correct.

To take the idea further you can delete snapshots 1-9 and snapshot 0
will still say it has 1MB "Used", so where again is the other 9MB?

Adding up the total "used" by the snapshots and the "refer" of the file
system *should* equal the "used" of the file system for it all to
make sense, right?

Another way to look at it: if you have all 10 snapshots and you delete 0,
I would expect snapshot 1 to change from 18K used (overhead) to 9MB used
since it would now be the oldest snapshot and official holder of the
data with snapshots 2-9 now pointing at the data it is holding.  The
first 1MB file delete would now be gone forever.

Am I missing something or is the math to account for snapshot space just
not working right in zfs list/get?
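
(One hedged aside: builds recent enough to expose the usedby* space-breakdown
properties - an assumption, since older builds may not have them - can at least
report the aggregate space charged to all snapshots of a dataset, even though
the per-snapshot USED numbers behave as described above:)

zfs get used,usedbydataset,usedbysnapshots root/export/home/carton/t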

Adam
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Ian Collins
Brent Jones wrote:
> Theres been a couple threads about this now, tracked some bug ID's/ticket:
>
> 6333409
> 6418042
I see these are fixed in build 102.

Are they targeted to get back to Solaris 10 via a patch? 

If not, is it worth escalating the issue with support to get a patch?

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Recover zpool/zfs

2008-11-07 Thread Richard Elling
Thomas Kloeber wrote:
> This is the 2nd attempt, so my apologies, if this mail got to you 
> already...
>
> Folkses,
>
> I'm in an absolute state of panic because I lost about 160GB of data
> which were on an external USB disk.
> Here is what happened:
> 1. I added a 500GB USB disk to my Ultra25/Solaris 10
> 2. I created a zpool and a zfs for the whole disk
> 3. I copied lots of data on to the disk
> 4. I 'zfs umount'ed the disk, unplugged the USB cable
> 5. I attached the disk to a PC/Solaris 10 and tried to get the data
> 6. but I couldn't find it (didn't know about 'zpool import' then)
> 7. re-attached the disk to the Ultra25/Solaris 10
> and now 'zpool status' tells me that the pool is FAULTED and the data is
> corrupted. SUN support tells me that I have lost all data. They also
> suggested to try this disk on a Nevada system which I did. There the
> zpool showed up correctly but the zfs file system didn't.

Can you explain this?  If the zpool import succeeded, then
what does "zfs list" show?
 -- richard

>
> If I do a hex dump of the physical device I can see all the data but I
> can't get to it... aaarrrggghhh
>
> Has anybody successfully patched/tweaked/whatever a zpool or zfs to
> recover from this?
>
> I would be most and forever grateful if somebody could give me a hint.
>
> Thanx,
>
> Thomas
>
> Following is the 'zpool status':
> zpool status -xv
> pool: usbDisk
> state: FAULTED
> status: One or more devices could not be used because the label is missing
> or invalid. There are insufficient replicas for the pool to continue
> functioning.
> action: Destroy and re-create the pool from a backup source.
> see: http://www.sun.com/msg/ZFS-8000-5E
> scrub: none requested
> config:
>
> NAME STATE READ WRITE CKSUM
> usbDisk FAULTED 0 0 0 corrupted data
> c5t0d0s0 FAULTED 0 0 0 corrupted data
>
> and the output of zdb:
> disk1
> version=4
> name='disk1'
> state=0
> txg=4
> pool_guid=10847216141970570446
> vdev_tree
> type='root'
> id=0
> guid=10847216141970570446
> children[0]
> type='disk'
> id=0
> guid=5919680419735522389
> path='/dev/dsk/c1t1d0s1'
> devid='id1,[EMAIL PROTECTED]/b'
> whole_disk=0
> metaslab_array=14
> metaslab_shift=30
> ashift=9
> asize=141729202176
> usbDisk
> version=4
> name='usbDisk'
> state=0
> txg=4
> pool_guid=18060149023938226877
> vdev_tree
> type='root'
> id=0
> guid=18060149023938226877
> children[0]
> type='disk'
> id=0
> guid=4959744788625823079
> path='/dev/dsk/c5t0d0s0'
> devid='id1,[EMAIL PROTECTED]/a'
> whole_disk=0
> metaslab_array=14
> metaslab_shift=32
> ashift=9
> asize=500100497408
> obelix2:/# zdb -l /dev/dsk/c5t0d0s0
> 
> LABEL 0
> 
> version=4
> name='WD'
> state=0
> txg=36
> pool_guid=7944655440509665617
> hostname='opensolaris'
> top_guid=13767930161106254968
> guid=13767930161106254968
> vdev_tree
> type='disk'
> id=0
> guid=13767930161106254968
> path='/dev/dsk/c8t0d0p0'
> devid='id1,[EMAIL PROTECTED]/q'
> phys_path='/[EMAIL PROTECTED],0/pci1028,[EMAIL PROTECTED],1/[EMAIL 
> PROTECTED]/[EMAIL PROTECTED],0:q'
> whole_disk=0
> metaslab_array=14
> metaslab_shift=32
> ashift=9
> asize=500103118848
> is_log=0
> DTL=18
> 
> LABEL 1
> 
> version=4
> name='WD'
> state=0
> txg=36
> pool_guid=7944655440509665617
> hostname='opensolaris'
> top_guid=13767930161106254968
> guid=13767930161106254968
> vdev_tree
> type='disk'
> id=0
> guid=13767930161106254968
> path='/dev/dsk/c8t0d0p0'
> devid='id1,[EMAIL PROTECTED]/q'
> phys_path='/[EMAIL PROTECTED],0/pci1028,[EMAIL PROTECTED],1/[EMAIL 
> PROTECTED]/[EMAIL PROTECTED],0:q'
> whole_disk=0
> metaslab_array=14
> metaslab_shift=32
> ashift=9
> asize=500103118848
> is_log=0
> DTL=18
> 
> LABEL 2
> 
> version=4
> name='usbDisk'
> state=0
> txg=4
> pool_guid=18060149023938226877
> top_guid=4959744788625823079
> guid=4959744788625823079
> vdev_tree
> type='disk'
> id=0
> guid=4959744788625823079
> path='/dev/dsk/c5t0d0s0'
> devid='id1,[EMAIL PROTECTED]/a'
> whole_disk=0
> metaslab_array=14
> metaslab_shift=32
> ashift=9
> asize=500100497408
> 
> LABEL 3
> 
> version=4
> name='usbDisk'
> state=0
> txg=4
> pool_guid=18060149023938226877
> top_guid=4959744788625823079
> guid=4959744788625823079
> vdev_tree
> type='disk'
> id=0
> guid=4959744788625823079
> path='/dev/dsk/c5t0d0s0'
> devid='id1,[EMAIL PROTECTED]/a'
> whole_disk=0
> metaslab_array=14
> metaslab_shift=32
> ashift=9
> asize=500100497408
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Re: [zfs-discuss] Disk space usage of zfs snapshots andfilesystems-my math doesn't add up

2008-11-07 Thread Adam.Bodette
I decided to do some more test situations to try to figure out how
adding/removing snapshots changes the space used reporting.

First I set up a test area - a new zfs file system - created some test
files, and then created snapshots while removing the files one by one.

> mkfile 1m 0
> mkfile 1m 1
> mkfile 1m 2
> mkfile 1m 3
> zfs snapshot u01/[EMAIL PROTECTED]
> rm 0
> zfs snapshot u01/[EMAIL PROTECTED]
> rm 1  
> zfs snapshot u01/[EMAIL PROTECTED]
> rm 2
> zfs snapshot u01/[EMAIL PROTECTED]
> rm 3
> zfs list -r u01/foo
NAMEUSED  AVAIL  REFER  MOUNTPOINT
u01/foo4.76M  1.13T  55.0K  /u01/foo
u01/[EMAIL PROTECTED]  1.18M  -  4.56M  -
u01/[EMAIL PROTECTED]  50.5K  -  3.43M  -
u01/[EMAIL PROTECTED]  50.5K  -  2.31M  -
u01/[EMAIL PROTECTED]  50.5K  -  1.18M  -


So 4M used on the file system but only 1M used by the snapshots so they
claim.

If I delete @1

> zfs destroy u01/[EMAIL PROTECTED]
> zfs list -r u01/foo  
NAMEUSED  AVAIL  REFER  MOUNTPOINT
u01/foo4.71M  1.13T  55.0K  /u01/foo
u01/[EMAIL PROTECTED]  2.30M  -  4.56M  -
u01/[EMAIL PROTECTED]  50.5K  -  2.31M  -
u01/[EMAIL PROTECTED]  50.5K  -  1.18M  -

now suddenly the @0 snapshot is claiming to use more space?

If I delete the newest of the snapshots @3

> zfs destroy u01/[EMAIL PROTECTED]
> zfs list -r u01/foo
NAMEUSED  AVAIL  REFER  MOUNTPOINT
u01/foo4.66M  1.13T  55.0K  /u01/foo
u01/[EMAIL PROTECTED]  2.30M  -  4.56M  -
u01/[EMAIL PROTECTED]  50.5K  -  2.31M  -

No change in the claimed used space by the @0 snapshot!

Now I delete the @2 snapshot

> zfs destroy u01/[EMAIL PROTECTED]
> zfs list -r u01/foo  
NAMEUSED  AVAIL  REFER  MOUNTPOINT
u01/foo4.61M  1.13T  55.0K  /u01/foo
u01/[EMAIL PROTECTED]  4.56M  -  4.56M  -

The @0 snapshot finally claims all the space it's really been holding
all along.

Up until that point, other than subtracting the space used (as reported by
df) from the used figure for the file system (as reported by zfs), there was
no way to know how much space the snapshots were really using.

Something is not right in the space accounting for snapshots.

---
Adam 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> [EMAIL PROTECTED]
> Sent: Friday, November 07, 2008 14:21
> To: [EMAIL PROTECTED]; zfs-discuss@opensolaris.org
> Subject: Re: [zfs-discuss] Disk space usage of zfs snapshots 
> andfilesystems-my math doesn't add up
> 
> I really think there is something wrong with how space is 
> being reported
> by zfs list in terms of snapshots.
> 
> Stealing for the example earlier where a new file system was 
> created, 10
> 1MB files were created and then do snap, remove a file, snap, remove a
> file, until they are all gone and you are left with:
>  
> > bash-3.2# zfs list -r root/export/home/carton/t
> > NAME  USED  AVAIL  REFER  MOUNTPOINT
> > root/export/home/carton/t10.2M  26.6G18K  
> > /export/home/carton/t
> > root/export/home/carton/[EMAIL PROTECTED]  1.02M  -  10.0M  -
> > root/export/home/carton/[EMAIL PROTECTED]18K  -  9.04M  -
> > root/export/home/carton/[EMAIL PROTECTED]18K  -  8.04M  -
> > root/export/home/carton/[EMAIL PROTECTED]18K  -  7.03M  -
> > root/export/home/carton/[EMAIL PROTECTED]17K  -  6.03M  -
> > root/export/home/carton/[EMAIL PROTECTED]17K  -  5.03M  -
> > root/export/home/carton/[EMAIL PROTECTED]17K  -  4.03M  -
> > root/export/home/carton/[EMAIL PROTECTED]17K  -  3.02M  -
> > root/export/home/carton/[EMAIL PROTECTED]17K  -  2.02M  -
> > root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -
> 
> So the file system itself is now empty of files (18K refer 
> for overhead)
> but still using 10MB because of the snapshots still holding 
> onto all 10
> 1MB files.
> 
> By how I understand snapshots the oldest one actually holds the file
> after it is deleted and any newer snapshot just points to what that
> oldest one is holding.
> 
> So because the 0 snapshot was taken first it knows about all 10 files,
> snap 1 only knows about 9, etc.
> 
> The refer numbers all match up correctly as that is how much data
> existed at the time of the snapshot.
> 
> But the used seems wrong.
> 
> The 0 snapshot should be holding onto all 10 files so I would 
> expect it
> to be 10MB "Used" when it's only reporting 1MB used.  Where 
> is the other
> 9MB hiding?  It only exists because a snapshot is holding it so that
> space should be charged to a snapshot.  Since snapshot 1-9 should only
> be pointing at the data held by 0 their numbers are correct.
> 
> To take the idea further you can delete snapshots 1-9 and snapshot 0
> will still say it has 1MB "Used", so where again is the other 9MB?
> 
> Adding up the total "used" by snapshots and the

Re: [zfs-discuss] Is ZFS already the default file System for Solaris 10?

2008-11-07 Thread Neal Pollack

On 11/07/08 11:24, Kumar, Amit H. wrote:

Is ZFS already the default file System for Solaris 10?
If yes has anyone tested it on Thumper ??


Yes.   Formal Sun support is for Thumper running s10.  For the latest
ZFS bug fixes, it is important to run the most recent s10 update release.
Right now, that should be s10u6 any day now, if it's not already released
for download.

There are many customers on this list running Thumpers with s10u5 plus 
patches.


Cheers,

Neal


Thank you,
Amit
 
 
 



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] A couple of newbie questions about ZFS compression

2008-11-07 Thread Ross Becker
I'm about to enable compression on my ZFS filesystem, as most of the data I 
intend to store should be highly compressible.

Before I do so, I'd like to ask a couple of newbie questions

First -  if you were running a ZFS without compression, wrote some files to it, 
then turned compression on, will those original uncompressed files ever get 
compressed via some background work, or will they need to be copied in order to 
compress them?

Second- clearly the "du" command shows post-compression size; opensolaris 
doesn't have a man page for it, but I'm wondering if there's either an option 
to show "original" size for du, or if there's a suitable replacement I can use 
which will show me the uncompressed size of a directory full of files? (no, 
knowing the compression ratio of the whole filesystem and the du size isn't 
suitable;  I'm looking for a straight-up du substitute which would tell me 
original sizes")


Thanks
   Ross
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] A couple of newbie questions about ZFS compression

2008-11-07 Thread Ian Collins
Ross Becker wrote:
> I'm about to enable compression on my ZFS filesystem, as most of the data I 
> intend to store should be highly compressible.
>
> Before I do so, I'd like to ask a couple of newbie questions
>
> First -  if you were running a ZFS without compression, wrote some files to 
> it, then turned compression on, will those original uncompressed files ever 
> get compressed via some background work, or will they need to be copied in 
> order to compress them?
>
>   
Changed filesystem compress properties only apply to new writes.
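
(In other words - a hedged sketch with a placeholder dataset name - compression
only affects blocks written from then on, so existing files have to be
rewritten before they shrink:)

zfs set compression=on tank/data
zfs get compression,compressratio tank/data
# existing files only compress once their blocks are rewritten,
# e.g. by copying each file in place:
cp -p somefile somefile.tmp && mv somefile.tmp somefile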

> Second- clearly the "du" command shows post-compression size; opensolaris 
> doesn't have a man page for it, but I'm wondering if there's either an option 
> to show "original" size for du, or if there's a suitable replacement I can 
> use which will show me the uncompressed size of a directory full of files? 
> (no, knowing the compression ratio of the whole filesystem and the du size 
> isn't suitable;  I'm looking for a straight-up du substitute which would tell 
> me original sizes")
>
>   
I guess the obvious question is why?  ls shows the file size.

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS, Kernel Panic on import

2008-11-07 Thread Victor Latushkin
Andrew,

Andrew wrote:
> Thanks a lot! Google didn't seem to cooperate as well as I had hoped.
> 
> 
> Still no dice on the import. I only have shell access on my
> Blackberry Pearl from where I am, so it's kind of hard, but I'm
> managing.. I've tried the OP's exact commands, and even trying to
> import array as ro, yet the system still wants to panic.. I really
> hope I don't have to redo my array, and lose everything as I still
> have faith in ZFS...

could you please post a little bit more details - at least panic string 
and stack backtrace during panic. That would help to get an idea about 
what might went wrong.

regards,
victor
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] A couple of newbie questions about ZFS compression

2008-11-07 Thread Ross Becker
The compress on-write behavior is what I expected, but I wanted to validate 
that for sure.  Thank you.

On the 2nd question, the obvious answer is that I'm doing work where the total
file size tells me how much work has been completed, and I don't have any other
feedback which tells me how far along a job is.  When it's a directory of 100+
files, or a whole tree with hundreds of files, it's not convenient to add the
file sizes up by hand to get the answer.  I could write a perl script, but it
honestly should be a built-in command.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS + OpenSolaris for home NAS?

2008-11-07 Thread Peter Bridge
Just as a follow-up: I went ahead with the original hardware purchase; it was
so much cheaper than the alternatives that it was hard to resist.

Anyway, OS 2008.05 installed very nicely, although it mentions 32-bit while
booting, so I need to investigate that at some point.  The actual hardware
seems stable and fast enough so far.  The CPU fan is much louder than I
expected, so I'll probably swap that out since I want this box running 24/7 in
the office and can't stand noisy machines.  I'm booting from a laptop IDE
drive with two SATA disks in a mirrored zpool.  This seems to work fine,
although my testing has been limited due to some network problems...

I really need a step-by-step 'how to' to access this box from my OSX Leopard 
based macbook pro.

I've spent about 5 hours trying to get NFS working, with minimal progress.  I
tried with nwam disabled, although the two lines I entered to turn it off
seemed to miss a lot of other config items, so I ended up turning it back on.
The reason I did that was that DHCP was failing to get an address, or that's
what it looked like, so I wanted to try a static address.  I eventually found
a way to use a static address with nwam turned on.  To cut a long story short,
I can now ping between the machines, and they have identical users (id=501)
and groups (id=501), but OSX simply refuses to connect to the NFS share,
whether from the command line with various flags or via Cmd-K.

I'm going to clean-install OpenSolaris in the morning, maybe with the newest
build (100?), and try again, but I'd really appreciate some help from someone
who has got this working.
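
(On the Solaris side, a minimal hedged sketch for publishing a dataset over NFS
- pool and dataset names are placeholders:)

zfs set sharenfs=rw tank/share    # or sharenfs=on; plain rw exports to all clients
share                             # the dataset should now appear in the share list
svcs -l network/nfs/server        # confirm the NFS server service is online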
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disk space usage of zfs snapshots and filesystems -my math doesn't add up

2008-11-07 Thread none
Hi Miles,
Thanks for your reply.
My zfs situation is a little different from your test. I have a few early 
snapshots which I believe would still be sharing most of their data with the 
current filesystem. Then later I have snapshots which would still be holding 
onto data that is now deleted.

So I'd like to get more detailed reports on just what and how much of the data 
is shared between the snapshots and current filesystem, rather than relying on 
my memory. For example, a report showing for all data blocks where they are 
shared would be what I really need:
10% unique to storage/fs
20% shared by storage/fs, [EMAIL PROTECTED], [EMAIL PROTECTED]
40% shared by [EMAIL PROTECTED], [EMAIL PROTECTED]
10% unique to [EMAIL PROTECTED]
10% unique to [EMAIL PROTECTED]
10% unique to [EMAIL PROTECTED]


Unfortunately the USED column is of little help since it only shows you the 
data unique to that snapshot. In my case almost all data is shared amongst the 
snapshots so it only shows up in the USED of the whole fs.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disk space usage of zfs snapshots and filesystems-my math doesn't add up

2008-11-07 Thread Miles Nordin
> "a" ==   <[EMAIL PROTECTED]> writes:
> "c" == Miles Nordin <[EMAIL PROTECTED]> writes:
> "n" == none  <[EMAIL PROTECTED]> writes:

 n> Unfortunately the USED column is of little help since it only
 n> shows you the data unique to that snapshot. In my case almost
 n> all data is shared amongst the snapshots so it only shows up
 n> in the USED of the whole fs.

Unfortunately sounds right to me.

bash-3.2# mkfile 1m 0
bash-3.2# zfs snapshot root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# zfs snapshot root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# rm 0
bash-3.2# zfs list -r root/export/home/carton/t
NAME  USED  AVAIL  REFER  MOUNTPOINT
root/export/home/carton/t1.04M  26.6G18K  /export/home/carton/t
root/export/home/carton/[EMAIL PROTECTED]  0  -  1.02M  -
root/export/home/carton/[EMAIL PROTECTED]  0  -  1.02M  -

It seems like taking a lot of snapshots can make 'zfs list' output
less useful.

I've trouble imagining an interface to expose the information you
want.  The obvious extension of what we have now is ``show
hypothetical 'zfs list' output after I'd deleted this dataset,'' but I
think that'd be clumsy to use and inefficient to implement.

Maybe btrfs will think of something, since AIUI their model is to have
mandatory infinite snapshots, and instead of explicitly requesting
point-in-time snapshots, your manual action is to free up space by
specifying ranges of snapshot you'd like to delete.  They'll likely
have some interface to answer questions like ``how much space is used
by the log between 2008-01-25 and 2008-02-22?'', and they will have
background log-quantitization daemons that delete snapshot ranges
analagous to our daemons that take snapshots.

Maybe imagining infinitely-granular snapshots is the key to a better
interface: ``show me the USED value for snapshots [EMAIL PROTECTED] - [EMAIL 
PROTECTED]
inclusive,'' and you must always give a contiguous range.  That's
analagous to btrfs-world's hypothetical ``show me the space consumed
by the log between these two timestamps.''

the btrfs guy in here seemed to be saying there's dedup across clone
branches, which ZFS does not do.  I suppose that would complicate
space reporting and their interface to do it.

 a> I really think there is something wrong with how space is
 a> being reported by zfs list in terms of snapshots.

I also had trouble understanding it.

right:

  conduct tests and use them to form a rational explanation of the
  numbers' meanings.  If you can't determine one, ask for help and try
  harder.  Create experiments to debunk a series of proposed, wrong
  explanations.  If a right explanation doesn't emerge, complain it's
  broken chaos.

wrong:

  use the one-word column titles to daydream about what you THINK the
  numbers ought to mean if you'd designed it, and act obstinately
  surprised when the actual meaning doesn't match your daydream.

  One word isn't enough to get you and the designers on the same page.
  You must mentally prepare yourself to accept column-meanings other
  than your assumption.

Imagine the columns were given to you without headings.  Or, reread
your 'zfs list' output assuming the column headings are ORANGE,
TASMANIAN_DEVIL, and BARBIE, then try to understand what each means
through experiment rather than assumption.

 a> But the used seems wrong.

%$"#$%!  But I can only repeat myself:

 c> USED column tells you ``how much
 c> space would be freed if I destroyed the thing on this row,''

AND

 c> after you delete something, the USED column will
 c> reshuffle,

I'd been staring at the USED column for over a year, and did not know
either of those two things until I ran the experiment in my post.

I'm not saying it's necessarily working properly, but I see no
evidence of a flaw from the numbers in the mail I posted, and in your
writeup I see a bunch of assumptions that are not worth
untangling---paragraphs-long explanations don't work for everyone,
including me.  Adam, suggest you keep doing tests until you understand
the actual behavior, because at least the behavior seen in my post is
understandable and sane to me.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS, Kernel Panic on import

2008-11-07 Thread Andrew
hey Victor,

Where would I find that? I'm still somewhat getting used to the Solaris 
environment. /var/adm/messages doesn't seem to show any Panic info.. I only 
have remote access via SSH, so I hope I can do something with dtrace to pull it.

Thanks,
Andrew
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS + OpenSolaris for home NAS?

2008-11-07 Thread Miles Nordin
> "pb" == Peter Bridge <[EMAIL PROTECTED]> writes:

pb> I really need a step-by-step 'how to' to access this box from
pb> my OSX Leopard

What you need for NFS on a laptop is a good automount daemon and a
'umount -f' command that actually does what the man page claims.

The automounter in Leopard works well.  The one in Tiger doesn't---has
quirks and often needs to be rebooted.  AFAICT both 10.4 and 10.5 have
an 'umount -f' that works better than Linux and *BSD.

You can use the automounter by editing files in /etc, but a better way
might be to use the Directory Utility.  Here is an outdated guide, not
for Leopard:

 http://www.behanna.org/osx/nfs/howto4.html

The Leopard steps are:

 1. Open Directory Utility
 2. pick Mounts in the tab-bar
 3. click the Lock in the lower-left corner and authenticate
 4. press +
 5. unroll Advanced Mount Parameters
 6. fill out the form something like this:

   Remote NFS URL: nfs://10.100.100.149/export
   Mount location: /Network/terabithia/export
   Advanced Mount Parameters: nosuid nodev locallocks
   [x] Ignore "set user ID" privileges

 7. Press Verify
 8. Press Apply, or press Command-S

The locallocks mount parameter should probably be removed in your case.  I
need it with old versions of Linux and Mac OS 10.5.  I don't need it with
10.4 + old Linux, and hopefully 10.5 + Solaris won't need it either.
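
(If you just want to check that the Solaris export works before fiddling
with the automounter, a one-off mount from Terminal would look something
like the following---the /private/tmp/nfstest mount point is just a
scratch directory I made up:

  sudo mkdir -p /private/tmp/nfstest
  sudo mount -t nfs -o nosuid,nodev 10.100.100.149:/export /private/tmp/nfstest

and add locallocks to the -o list only if you find you need it.)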

There is a 'net' mount parameter which changes Finder's behavior.
Also it might be better to mount on some tree outside /Network and
/Volumes since these seem to have some strange special meanings.
ymmv.

At my site I also put 'umask 000' in /etc/launchd.conf to, err, match
user expectations of how the office used to work with SMB.  For something
fancier, you may have to get the Macs and Suns to share userids with
LDAP.  Someone recently posted a PDF here about an Aussie site serious
about Macs that'd done so:

 http://www.afp548.com/filemgmt_data/files/OSX%20HSM.pdf

Another thing worth knowing: Macs throw up these boxes that say
something like

 Server gone.

   [Disconnect]

with a big red Disconnect button.  If your NFS server reboots, they'll
throw one of these boxes at you.  It looks like you only have one
choice.  You actually have three:

 1. ignore the box
 2. press the tiny red [x] in the upper-left corner of the box
 3. press disconnect

If you do (3), Mac OS will do 'umount -f' for you.  At the time the
box appears, the umount has not been done yet.

If you do (1) or (2), any application using the NFS server (including
potentially Finder and all its windows, not just the NFS windows) will
pinwheel until the NFS server comes back.  When it does come back, all
the apps will continue without data loss, and if you did (1) the box
will also disappear on its own.  The error handling is quite good and
in line with Unixy expectations and NFS statelessness.




Re: [zfs-discuss] ZFS, Kernel Panic on import

2008-11-07 Thread Victor Latushkin
Andrew пишет:
> hey Victor,
> 
> Where would i find that? I'm still somewhat getting used to the
> Solaris environment. /var/adm/messages doesn't seem to show any Panic
> info.. I only have remote access via SSH, so I hope I can do
> something with dtrace to pull it.

Do you have anything in /var/crash/ ?

If yes, then do something like this and provide output:

cd /var/crash/
echo "::status" | mdb -k 
echo "::stack" | mdb -k 
echo "::msgbuf -v" | mdb -k 

victor



Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Andrew Gabriel
Ian Collins wrote:
> Brent Jones wrote:
>> Theres been a couple threads about this now, tracked some bug ID's/ticket:
>>
>> 6333409
>> 6418042
> I see these are fixed in build 102.
> 
> Are they targeted to get back to Solaris 10 via a patch? 
> 
> If not, is it worth escalating the issue with support to get a patch?

Given that the issue described is slow zfs recv over the network, I suspect
this is:

6729347 Poor zfs receive performance across networks

This is quite easily worked around by putting a buffering program 
between the network and the zfs receive.  There is a public domain 
"mbuffer" which should work, although I haven't tried it as I wrote my 
own.  The buffer size you need is about 5 seconds' worth of data.  In my 
case of 7200 RPM disks (in a mirror and not striped) and a gigabit 
Ethernet link, the disks are the limiting factor at around 57MB/sec 
sustained I/O, so I used a 250MB buffer to best effect.  If I recall 
correctly, that sped up the zfs send/recv across the network by about 
3 times, and it then ran at the disk platter speed.
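
For example (a sketch only, since as I say I haven't actually tried
mbuffer myself---the pool, snapshot and host names here are invented):

  # sending host: pipe the stream through ssh as usual, but buffer on
  # the receiving side before zfs recv
  zfs send tank/fs@snap | ssh recvhost 'mbuffer -s 128k -m 250M | zfs recv tank/fs'

  # 250M is roughly 5 seconds at the 57MB/sec platter rate above; scale
  # -m to about 5 seconds of whatever your slower side sustains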

-- 
Andrew


Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Ian Collins
Andrew Gabriel wrote:
> Ian Collins wrote:
>   
>> Brent Jones wrote:
>> 
>>> Theres been a couple threads about this now, tracked some bug ID's/ticket:
>>>
>>> 6333409
>>> 6418042
>>>   
>> I see these are fixed in build 102.
>>
>> Are they targeted to get back to Solaris 10 via a patch? 
>>
>> If not, is it worth escalating the issue with support to get a patch?
>> 
>
> Given the issue described is slow zfs recv over network, I suspect this is:
>
> 6729347 Poor zfs receive performance across networks
>
> This is quite easily worked around by putting a buffering program 
> between the network and the zfs receive. There is a public domain 
> "mbuffer" which should work, although I haven't tried it as I wrote my 
> own. The buffer size you need is about 5 seconds worth of data. In my 
> case of 7200RPM disks (in a mirror and not striped) and a gigabit 
> ethernet link, the disks are the limiting factor at around 57MB/sec 
> sustained i/o, so I used a 250MB buffer to best effect. If I recall 
> correctly, that speeded up the zfs send/recv across the network by about 
> 3 times, and it then ran at the disk platter speed.
>
>   
Did this apply to incremental sends as well?  I can live with ~20MB/sec
for full sends, but ~1MB/sec for incremental sends is a killer.

-- 
Ian.



Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Andrew Gabriel
Ian Collins wrote:
> Andrew Gabriel wrote:
>> Ian Collins wrote:
>>   
>>> Brent Jones wrote:
>>> 
 Theres been a couple threads about this now, tracked some bug ID's/ticket:

 6333409
 6418042
   
>>> I see these are fixed in build 102.
>>>
>>> Are they targeted to get back to Solaris 10 via a patch? 
>>>
>>> If not, is it worth escalating the issue with support to get a patch?
>>> 
>> Given the issue described is slow zfs recv over network, I suspect this is:
>>
>> 6729347 Poor zfs receive performance across networks
>>
>> This is quite easily worked around by putting a buffering program 
>> between the network and the zfs receive. There is a public domain 
>> "mbuffer" which should work, although I haven't tried it as I wrote my 
>> own. The buffer size you need is about 5 seconds worth of data. In my 
>> case of 7200RPM disks (in a mirror and not striped) and a gigabit 
>> ethernet link, the disks are the limiting factor at around 57MB/sec 
>> sustained i/o, so I used a 250MB buffer to best effect. If I recall 
>> correctly, that speeded up the zfs send/recv across the network by about 
>> 3 times, and it then ran at the disk platter speed.
>  
> Did this apply to incremental sends as well?  I can live with ~20MB/sec
> for full sends, but ~1MB/sec for incremental sends is a killer.

It doesn't help the ~1MB/sec periods in incrementals, but it does help 
the fast periods in incrementals.

-- 
Andrew


Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Ian Collins
Andrew Gabriel wrote:
> Ian Collins wrote:
>> Andrew Gabriel wrote:
>>> Ian Collins wrote:
>>>  
 Brent Jones wrote:

> Theres been a couple threads about this now, tracked some bug
> ID's/ticket:
>
> 6333409
> 6418042
>   
 I see these are fixed in build 102.

 Are they targeted to get back to Solaris 10 via a patch?
 If not, is it worth escalating the issue with support to get a patch?
 
>>> Given the issue described is slow zfs recv over network, I suspect
>>> this is:
>>>
>>> 6729347 Poor zfs receive performance across networks
>>>
>>> This is quite easily worked around by putting a buffering program
>>> between the network and the zfs receive. There is a public domain
>>> "mbuffer" which should work, although I haven't tried it as I wrote
>>> my own. The buffer size you need is about 5 seconds worth of data.
>>> In my case of 7200RPM disks (in a mirror and not striped) and a
>>> gigabit ethernet link, the disks are the limiting factor at around
>>> 57MB/sec sustained i/o, so I used a 250MB buffer to best effect. If
>>> I recall correctly, that speeded up the zfs send/recv across the
>>> network by about 3 times, and it then ran at the disk platter speed.
>>  
>> Did this apply to incremental sends as well?  I can live with ~20MB/sec
>> for full sends, but ~1MB/sec for incremental sends is a killer.
>
> It doesn't help the ~1MB/sec periods in incrementals, but it does help
> the fast periods in incrementals.
>
:)

I don't see the 5-second bursty behaviour described in the bug report.
It's more like gaps in the network traffic at 5-second intervals while
the data is written to disk.

-- 
Ian.



Re: [zfs-discuss] ZFS, Kernel Panic on import

2008-11-07 Thread Andrew
Not too sure if it's much help.  I enabled kernel pages and curproc.  Let me
know if I need to enable "all" then.

solaria crash # echo "::status" | mdb -k
debugging live kernel (64-bit) on solaria
operating system: 5.11 snv_98 (i86pc)
solaria crash # echo "::stack" | mdb -k
solaria crash # echo "::msgbuf -v" | mdb -k
   TIMESTAMP   LOGCTL MESSAGE
2008 Nov  7 18:53:55 ff01c901dcf0   capacity = 1953525168 sectors
2008 Nov  7 18:53:55 ff01c901db70 /[EMAIL PROTECTED],0/pci1022,[EMAIL 
PROTECTED]/pci1095,[EMAIL PROTECTED] :
2008 Nov  7 18:53:55 ff01c901d9f0   SATA disk device at port 0
2008 Nov  7 18:53:55 ff01c901d870
model ST31000340AS
2008 Nov  7 18:53:55 ff01c901d6f0   firmware SD15
2008 Nov  7 18:53:55 ff01c901d570   serial number 
2008 Nov  7 18:53:55 ff01c901d3f0   supported features:
2008 Nov  7 18:53:55 ff01c901d270
 48-bit LBA, DMA, Native Command Queueing, SMART self-test
2008 Nov  7 18:53:55 ff01c901d0f0   SATA Gen1 signaling speed (1.5Gbps)
2008 Nov  7 18:53:55 ff01c901adf0   Supported queue depth 32, limited to 31
2008 Nov  7 18:53:55 ff01c901ac70   capacity = 1953525168 sectors
2008 Nov  7 18:53:55 ff01c901aaf0 /[EMAIL PROTECTED],0/pci1022,[EMAIL 
PROTECTED]/pci1095,[EMAIL PROTECTED] :
2008 Nov  7 18:53:55 ff01c901a970   SATA disk device at port 0
2008 Nov  7 18:53:55 ff01c901a7f0
model Maxtor 6L250S0
2008 Nov  7 18:53:55 ff01c901a670   firmware BANC1G10
2008 Nov  7 18:53:55 ff01c901a4f0   serial number
2008 Nov  7 18:53:55 ff01c901a370   supported features:
2008 Nov  7 18:53:55 ff01c901a2b0
 48-bit LBA, DMA, Native Command Queueing, SMART self-test
2008 Nov  7 18:53:55 ff01c901a130   SATA Gen1 signaling speed (1.5Gbps)
2008 Nov  7 18:53:55 ff01c901a070   Supported queue depth 32, limited to 31
2008 Nov  7 18:53:55 ff01c9017ef0   capacity = 490234752 sectors
2008 Nov  7 18:53:55 ff01c9017d70 pseudo-device: ramdisk1024
2008 Nov  7 18:53:55 ff01c9017bf0 ramdisk1024 is /pseudo/[EMAIL PROTECTED]
2008 Nov  7 18:53:55 ff01c9017a70 NOTICE: e1000g0 registered
2008 Nov  7 18:53:55 ff01c90179b0
pcplusmp: pci8086,100e (e1000g) instance 0 vector 0x14 ioapic 0x2 intin 0x14 is
bound to cpu 0
2008 Nov  7 18:53:55 ff01c90178f0
Intel(R) PRO/1000 Network Connection, Driver Ver. 5.2.12
2008 Nov  7 18:53:56 ff01c9017830 pseudo-device: lockstat0
2008 Nov  7 18:53:56 ff01c9017770 lockstat0 is /pseudo/[EMAIL PROTECTED]
2008 Nov  7 18:53:56 ff01c90176b0 sd6 at si31240: target 0 lun 0
2008 Nov  7 18:53:56 ff01c90175f0
sd6 is /[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1095,[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0
2008 Nov  7 18:53:56 ff01c9017530 sd5 at si31242: target 0 lun 0
2008 Nov  7 18:53:56 ff01c9017470
sd5 is /[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1095,[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0
2008 Nov  7 18:53:56 ff01c90173b0 sd4 at si31241: target 0 lun 0
2008 Nov  7 18:53:56 ff01c90172f0
sd4 is /[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1095,[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0
2008 Nov  7 18:53:56 ff01c9017230
/[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1095,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd4) online
2008 Nov  7 18:53:56 ff01c9017170 /[EMAIL PROTECTED],0/pci1022,[EMAIL 
PROTECTED]/pci1095,[EMAIL PROTECTED] :
2008 Nov  7 18:53:56 ff01c90170b0   SATA disk device at port 1
2008 Nov  7 18:53:56 ff01c9087f30
model ST31000340AS
2008 Nov  7 18:53:56 ff01c9087e70   firmware SD15
2008 Nov  7 18:53:56 ff01c9087db0   serial number
2008 Nov  7 18:53:56 ff01c9087cf0   supported features:
2008 Nov  7 18:53:56 ff01c9087c30
 48-bit LBA, DMA, Native Command Queueing, SMART self-test
2008 Nov  7 18:53:56 ff01c9087b70   SATA Gen1 signaling speed (1.5Gbps)
2008 Nov  7 18:53:56 ff01c9087ab0   Supported queue depth 32, limited to 31
2008 Nov  7 18:53:56 ff01c90879f0   capacity = 1953525168 sectors
2008 Nov  7 18:53:56 ff01c9087930
/[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1095,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd6) online
2008 Nov  7 18:53:56 ff01c9087870
/[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1095,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd5) online
2008 Nov  7 18:53:56 ff01c90877b0 /[EMAIL PROTECTED],0/pci1022,[EMAIL 
PROTECTED]/pci1095,[EMAIL PROTECTED] :
2008 Nov  7 18:53:56 ff01c90876f0   SATA disk device at port 1
2008 Nov  7 18:53:56 ff01c9087630
model ST31000340AS
2008 Nov  7 18:53:56 ff01c9087570   firmware SD15
2008 Nov  7 18:53:56 ff01c90874b0   serial number 
2008 Nov  7 18:53:56 ff01c90873f0   supported features:
2008 Nov  7 18:53:56 ff01c9087330
 48-bit LBA, DMA, Native Command Queueing, SMART self-test
2008 Nov  7 18:53:56 ff01c9087270   SATA Gen1 signaling speed (1.5Gbps)
2008 Nov  7 18:53:56 ff01c90871b0   Supported queu

Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Andrew Gabriel
Ian Collins wrote:
> Andrew Gabriel wrote:
>> Ian Collins wrote:
>>> Andrew Gabriel wrote:
 Given the issue described is slow zfs recv over network, I suspect
 this is:

 6729347 Poor zfs receive performance across networks

 This is quite easily worked around by putting a buffering program
 between the network and the zfs receive. There is a public domain
 "mbuffer" which should work, although I haven't tried it as I wrote
 my own. The buffer size you need is about 5 seconds worth of data.
 In my case of 7200RPM disks (in a mirror and not striped) and a
 gigabit ethernet link, the disks are the limiting factor at around
 57MB/sec sustained i/o, so I used a 250MB buffer to best effect. If
 I recall correctly, that speeded up the zfs send/recv across the
 network by about 3 times, and it then ran at the disk platter speed.
>>>  
>>> Did this apply to incremental sends as well?  I can live with ~20MB/sec
>>> for full sends, but ~1MB/sec for incremental sends is a killer.
>> It doesn't help the ~1MB/sec periods in incrementals, but it does help
>> the fast periods in incrementals.
>>
> :)
> 
> I don't see the 5 second bursty behaviour described in the bug report. 
> It's more like 5 second interval gaps in the network traffic while the
> data is written to disk.

That is exactly the issue.  When the zfs recv data has been written, zfs 
recv starts reading the network again, but there's only a tiny amount of 
data buffered in the TCP/IP stack, so it has to wait for the network to 
heave more data across.  In effect, it's a single-buffered copy.  The 
addition of a buffer program turns it into a double-buffered (or 
cyclically buffered) copy, with the disks running flat out continuously 
and the network streaming data across continuously at the disk platter 
speed.

What are your theoretical max speeds for network and disk I/O?
Taking the smaller of the two, are you seeing the sustained send/recv 
performance match it (excluding the ~1MB/sec periods, which are some 
other problem)?

The effect described in that bug is most obvious when the disk and 
network speeds are the same order of magnitude (as in the example I gave 
above).  Given my disk I/O rate above, if the network is much faster 
(say, 10 gigabit), then it's going to cope with the bursty nature of the 
traffic better.  If the network is much slower (say, 100 megabit), then 
it's going to be running flat out anyway and again you won't notice the 
bursty reads (a colleague measured only a 20% gain in that case, rather 
than my 200% gain).

-- 
Andrew

