[zfs-discuss] Validating alignment of NTFS/VMDK/ZFS blocks

2010-03-18 Thread Chris Murray
Good evening,
I understand that NTFS & VMDK do not relate to Solaris or ZFS, but I was 
wondering if anyone has any experience of checking the alignment of data blocks 
through that stack?

I have a VMware ESX 4.0 host using storage presented over NFS from ZFS 
filesystems (recordsize 4KB). Within virtual machine VMDK files, I have 
formatted NTFS filesystems, block size 4KB. Dedup is turned on. When I run zdb 
-DD, I see a figure of unique blocks which is higher than I expect, which makes 
me wonder whether any given 4KB block in the NTFS filesystem is perfectly 
aligned with a 4KB block in ZFS.

e.g. consider two virtual machines sharing lots of the same blocks. Assuming 
there /is/ a misalignment between NTFS & VMDK, or VMDK & ZFS, then if the shared 
blocks aren't in the same order within each NTFS filesystem, they don't align, 
and will actually produce different blocks in ZFS:

VM1
NTFS      1---2---3---
          AAAABBBBCCCC
ZFS     1---2---3---4---

ZFS blocks are "  AA", "AABB" and so on ...
Then in another virtual machine, the blocks are in a different order:

VM2
NTFS      1---2---3---
          CCCCAAAABBBB
ZFS     1---2---3---4---
ZFS blocks for this VM would be "  CC", "CCAA", "AABB" etc. So, no overlap 
between virtual machines, and no benefit from dedup.

I may have it wrong, and there are indeed 30,785,627 unique blocks in my setup, 
but if there's a mechanism for checking alignment, I'd find that very helpful.
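
For anyone wanting to run the same check: the simplest test, assuming the 
partition's starting offset can be read from inside the Windows guest (e.g. 
with wmic or msinfo32), is whether that offset divides evenly by 4096. A 
sketch, where 32256 is just the classic 63-sector MBR offset and 1048576 a 
1MB-aligned one:

# expr 32256 % 4096
3584
# expr 1048576 % 4096
0

A non-zero remainder means no NTFS cluster can ever line up with a 4KB ZFS 
record, whatever recordsize is set on the ZFS side.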

Thanks,
Chris
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Validating alignment of NTFS/VMDK/ZFS blocks

2010-03-18 Thread Chris Murray
Please excuse my pitiful example. :-)

I meant to say "*less* overlap between virtual machines", as clearly
block "AABB" occurs in both.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Validating alignment of NTFS/VMDK/ZFS blocks

2010-03-20 Thread Chris Murray
That's a good idea, thanks. I get the feeling the remainder won't be zero, 
which will back up the misalignment theory. After a bit more digging, it seems 
the problem is just an NTFS issue and can be addressed irrespective of 
underlying storage system.

I think I'm going to try the process in the following link:
http://www.tuxyturvy.com/blog/index.php?/archives/59-Aligning-Windows-Partitions-Without-Losing-Data.html

With any luck I'll then see a smaller dedup table, and better performance!

Thanks to those for feedback,
Chris
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Expanding RAIDZ with larger disks - can't see all space.

2009-09-27 Thread Chris Murray
Posting this question again as I originally tagged it onto the end of a series 
of longwinded posts of mine where I was having problems replacing a drive. 
After dodgy cabling and a few power cuts, I finally got the new drive 
resilvered.

Before this final replace, I had 3 x 1TB & 1 x 750GB drives in a RAIDZ1 zpool.

After the replace, all four are 1TB, but I can still only see a total of 2.73TB 
in zpool list.

I have tried:
1. Reboot.
2. zpool export, then a zpool import.
3. zpool export, reboot, zpool import.

However, I can still only see 2.73TB total. Any ideas what it could be?
All four disks show as 931.51GB in format. Where should I start troubleshooting 
to see why ZFS isn't using all of this space?
I'm currently on SXCE119. I tried a mock-scenario in VMware using the 2009.06 
live cd, which worked correctly after an export and import. Can't do this on my 
setup, however, as I have upgraded my zpool to the latest version, and it can't 
be read using the CD now.

Thanks,
Chris
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Expanding RAIDZ with larger disks - can't see all space.

2009-09-27 Thread Chris Murray
I knew it would be something simple!!  :-)

Now 3.63TB, as expected, and no need to export and import either! Thanks
Richard, that's done the trick.

Chris
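
The reply with the actual fix isn't quoted above, but for anyone hitting the 
same thing on builds of this vintage (snv_117 or later), the two usual ways to 
pick up the extra capacity after all devices in a raidz have been replaced 
with larger ones are the pool-level autoexpand property or an explicit 
per-device expansion - a sketch, with the device name purely illustrative:

# zpool set autoexpand=on zp
# zpool online -e zp c0t1d0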

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Troubleshooting dedup performance

2009-12-16 Thread Chris Murray
So if the ZFS checksum is set to fletcher4 at the pool level, and
dedup=on, which checksum will it be using?

If I attempt to set dedup=fletcher4, I do indeed get this:

cannot set property for 'zp': 'dedup' must be one of 'on | off | verify
| sha256[,verify]'

Could it be that my performance troubles are due to the calculation of
two different checksums?

Thanks,
Chris

-Original Message-
From: cyril.pli...@gmail.com [mailto:cyril.pli...@gmail.com] On Behalf
Of Cyril Plisko
Sent: 16 December 2009 17:09
To: Andrey Kuzmin
Cc: Chris Murray; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Troubleshooting dedup performance

>> I've set dedup to what I believe are the least resource-intensive
>> settings - "checksum=fletcher4" on the pool, & "dedup=on" rather than
>
> I believe checksum=fletcher4 is acceptable in dedup=verify mode only.
> What you're doing is seemingly deduplication with weak checksum w/o
> verification.

I think fletcher4 use for deduplication purposes was disabled entirely [1],
right before the build 129 cut.


[1]
http://hg.genunix.org/onnv-gate.hg/diff/93c7076216f6/usr/src/common/zfs/zfs_prop.c


-- 
Regards,
Cyril

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Troubleshooting dedup performance

2009-12-21 Thread Chris Murray
In case the overhead in calculating SHA256 was the cause, I set ZFS
checksums to SHA256 at the pool level, and left for a number of days.
This worked fine.

Setting dedup=on immediately crippled performance, and then setting
dedup=off fixed things again. I did notice through a zpool iostat that
disk IO increased while dedup was on, although it didn't from the ESXi
side. Could it be that dedup tables don't fit in memory? I don't have a
great deal - 3GB. Is there a measure of how large the tables are in
bytes, rather than number of entries?
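
One rough way to put a byte figure on it - a sketch, using the pool name from 
the earlier message in this thread:

# zdb -DD zp

The output includes lines of the form "DDT-sha256-zap-unique: N entries, size 
X on disk, Y in core"; summing N multiplied by the in-core figure over those 
lines gives an approximate RAM footprint for the dedup table. If that total is 
anywhere near the 3GB in the box, the table can't stay cached and deduped 
writes degenerate into random reads of the DDT.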

Chris

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zpool import hang - possibly dedup related?

2010-02-16 Thread Chris Murray
I'm trying to import a pool into b132 which once had dedup enabled, after the 
machine was shut down with an "init 5".

However, the import hangs the whole machine and I eventually get kicked off my 
SSH sessions. As it's a VM, I can see that processor usage jumps up to near 
100% very quickly, and stays there. Longest I've left it is 12 hours. Before I 
shut down the VM, there was only around 5GB of data in that zpool. There 
doesn't appear to be any disk activity while it's in this stuck state.

Are there any troubleshooting tips on where I can start to look for answers?
The virtual machine is running on ESXi 4, with two virtual CPUs and 3GB RAM.
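
If a root shell is still usable while the import spins, a couple of read-only 
kernel inspections can at least show where the time is going - a sketch:

# echo "::threadlist -v" | mdb -k > /var/tmp/threads.txt
# echo "::stacks -m zfs" | mdb -k > /var/tmp/zfs-stacks.txt

Stacks parked in ZFS routines with the CPU pegged would point at a long 
dedup/free replay rather than a hang on a dead device.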

Thanks in advance,
Chris
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Hang on zpool import (dedup related)

2010-09-12 Thread Chris Murray
Another hang on zpool import thread, I'm afraid, because I don't seem to have 
observed any great successes in the others and I hope there's a way of saving 
my data ...

In March, using OpenSolaris build 134, I created a zpool, some zfs filesystems, 
enabled dedup on them, moved content into them and promptly discovered how slow 
it was because I only have 4GB RAM. Even with 30GB L2ARC, the performance was 
unacceptable. The trouble started when the machine hung one day. Ever since, 
I've been unable to import my pool without it hanging again. At the time I saw 
posts from others who had run into similar problems, so I thought it best that 
I wait until a later build, on the assumption that some ZFS dedup bug would be 
fixed and I could see my data again. I've been waiting ever since, and only 
just had a chance to try build 147, thanks to illumos and a schillix live CD.

However, the pool still won't import, so I'd much appreciate any 
troubleshooting hints and tips to help me on my way.

schillix b147i

My process is:
1. boot the live CD.
2. on the console session, run  vmstat 1
3. from another machine, SSH in with multiple sessions and:
vmstat 60
vmstat 1
zpool import -f zp
zpool iostat zp 1
zpool iostat zp -v 5
4. wait until it all stops

What I observe is that the zpool import command never finishes, there will be a 
lengthy period of read activity made up of very small reads which then stops 
before an even longer period of what looks like no disk activity.

zp   512G  1.31T  0  0  0  0

The box will be responsive for quite some time, seemingly doing not a great 
deal:

 kthr  memorypagedisk  faults  cpu
 r b w   swap  free  re  mf pi po fr de sr cd cd rm s0   in   sy   cs us sy id
 0 0 0 2749064 3122988 0  7  0  0  0  0  0  0  1  0  0  365  218  714  0  1 99

Then after a matter of hours it'll hang. SSH sessions are no longer responsive. 
On the console I can press return which creates a new line, but vmstat will 
have stopped updating.

Interestingly, what I observed in b134 was the same thing, however the free 
memory would slowly decrease over the course of hours, before a sudden 
nose-dive right before the lock up. Now it appears to hang without that same 
effect.
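
One way to capture some evidence from the spare SSH sessions before the box 
wedges - a sketch:

# iostat -xn 5 > /var/tmp/iostat.log &
# echo "::memstat" | mdb -k
# echo "::arc" | mdb -k

::memstat splits memory between kernel, ZFS and free pages, and ::arc shows 
the ARC size and target; comparing a few samples over the hours would show 
whether the same slow memory exhaustion seen on b134 is still happening 
underneath, even though free memory no longer visibly nose-dives in vmstat.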

While the import appears to be working, I can cd to /zp and look at the content 
of 5 of the 9 "esx*" filesystems. Coincidence or not, it's the last four which 
appear to be empty - esx_prod onward.

# zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
zp 905G  1.28T23K  /zp
zp/nfs 889G  1.28T32K  /zp/nfs
zp/nfs/esx_dev 264G  1.28T   264G  /zp/nfs/esx_dev
zp/nfs/esx_hedgehog   25.8G  1.28T  25.8G  /zp/nfs/esx_hedgehog
zp/nfs/esx_meerkat 223G  1.28T   223G  /zp/nfs/esx_meerkat
zp/nfs/esx_meerkat_dedup   938M  1.28T   938M  /zp/nfs/esx_meerkat_dedup
zp/nfs/esx_page   8.90G  1.28T  8.90G  /zp/nfs/esx_page
zp/nfs/esx_prod306G  1.28T   306G  /zp/nfs/esx_prod
zp/nfs/esx_skunk21K  1.28T21K  /zp/nfs/esx_skunk
zp/nfs/esx_temp   45.5G  1.28T  45.5G  /zp/nfs/esx_temp
zp/nfs/esx_template   15.2G  1.28T  15.2G  /zp/nfs/esx_template

Any help would be appreciated. What could be going wrong here? Is it getting 
progressively closer to becoming imported each time I try this, or will it be 
starting from scratch? Feels to me like there's an action in the 
/zp/nfs/esx_prod filesystem it's trying to replay and never getting to the end 
of, for some reason. In case it was getting in a muddle with the l2arc, I 
removed the cache device a matter of minutes into this run. It hasn't hung yet, 
vmstat is still updating, but I tried a 'zpool import' in one of the windows to 
see if I could even see a pool on another disk, and that hasn't returned me 
back to the prompt yet. Also tried to SSH in with another session, and that 
hasn't produced the login prompt.

Thanks in advance,
Chris
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hang on zpool import (dedup related)

2010-09-12 Thread Chris Murray
Absolutely spot on George. The import with -N took seconds.

Working on the assumption that esx_prod is the one with the problem, I bumped 
that to the bottom of the list. Each mount was done in a second:

# zfs mount zp
# zfs mount zp/nfs
# zfs mount zp/nfs/esx_dev
# zfs mount zp/nfs/esx_hedgehog
# zfs mount zp/nfs/esx_meerkat
# zfs mount zp/nfs/esx_meerkat_dedup
# zfs mount zp/nfs/esx_page
# zfs mount zp/nfs/esx_skunk
# zfs mount zp/nfs/esx_temp
# zfs mount zp/nfs/esx_template

And those directories have the content in them that I'd expect. Good!

So now I try to mount esx_prod, and the influx of reads has started in 
'zpool iostat zp 1'.

This is the filesystem with the issue, but what can I do now?

Thanks again.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Single VDEV pool permanent and checksum errors after replace

2011-01-03 Thread Chris Murray
Hi,

I have some strange goings-on with my VM of Solaris Express 11, and I
hope someone can help.

It shares out other virtual machine files for use in ESXi 4.0 (it,
too, runs in there)

I had two disks inside the VM - one for rpool and one for 'vmpool'.
All was fine.
vmpool has some deduped data. That was also fine.
I added a Samsung SSD to the ESXi host, created a 512MB VMDK and a
20GB VMDK, and added as log and cache, respectively. This also worked
fine.

At this point, the pool is made of c8t1d0 (data), c8t2d0 (logs),
c8t3d0 (cache). I decide that to add some redundancy, I'll add a
mirrored virtual disk. At this point, it happens that the VMDK for
this disk (c8t4d0) actually resides on the same physical disk as
c8t1d0. The idea was to perform the logical split in Solaris Express
first, deal with the IO penalty of writing everything twice to the
same physical disk (even though Solaris thinks they're two separate
ones), then move that VMDK onto a separate physical disk shortly. This
should in the short term protect against bit-flips and small errors on
the single physical disk that ESXi has, until a second one is
installed. I have a think about capacity, though, and decide I'd
prefer the mirror to be of c8t4d0 and c8t5d0 instead. So, it seems I
want to go from one single disk (c8t1d0), to a mirror of c8t4d0 and
c8t5d0. In my mind, that's a 'zpool replace' onto c8t4d0 and a 'zpool
attach' of c8t5d0. I kick off the replace, and all goes fine. Part way
through I try to do the attach as well, but am politely told I can't.

The replace itself completed without complaint, however on completion,
virtual machines whose disks are inside 'vmpool' start hanging,
checksum errors rapidly start counting up, and since there's no
redundancy, nothing can be done to repair them.


 pool: vmpool
state: DEGRADED
status: One or more devices has experienced an error resulting in data
       corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
       entire pool from backup.
  see: http://www.sun.com/msg/ZFS-8000-8A
scan: resilvered 48.2G in 2h53m with 0 errors on Mon Jan  3 20:45:49 2011
config:

       NAME        STATE     READ WRITE CKSUM
       vmpool      DEGRADED     0     0 25.6K
         c8t4d0    DEGRADED     0     0 25.6K  too many errors
       logs
         c8t2d0    ONLINE       0     0     0
       cache
         c8t3d0    ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

       /vmpool/nfs/duck/duck_1-flat.vmdk
       /vmpool/nfs/panda/template.xppro-flat.vmdk


At this point, I remove disk c8t1d0, and snapshot the entire VM in
case I do any further damage. This leads to my first two questions:

#1 - are there any suspicions as to what's happened here? How come
the resilver completed fine but now there are checksum errors on the
replacement disk? It does reside on the same physical disk, after all.
Could this be something to do with me attempting the attach during the
replace?
#2 - in my mind, c8t1d0 contains the state of the pool just prior
to the cutover to c8t4d0. Is there any way I can get this back, and
scrap the contents of c8t4d0? A 'zpool import -D' is fruitless, but I
imagine there's some way of tricking Solaris into seeing c8t1d0 this
as a single disk pool again?
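
A read-only look at what is still on the old disk should answer part of #2 - 
a sketch, using the device name from above:

# zdb -l /dev/dsk/c8t1d0s0

Intact labels will show the pool GUID and a vdev tree, which is the starting 
point for any attempt to bring that disk back on its own.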

Now that I've snapshotted the VM and have a sort of safety net, I run
a scrub, which unsurprisingly unearths checksum errors and lists all
of the files which have problems:


 pool: vmpool
state: ONLINE
status: One or more devices has experienced an error resulting in data
       corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
       entire pool from backup.
  see: http://www.sun.com/msg/ZFS-8000-8A
scan: scrub repaired 0 in 0h30m with 95 errors on Mon Jan  3 21:47:25 2011
config:

       NAME        STATE     READ WRITE CKSUM
       vmpool      ONLINE       0     0   190
         c8t4d0    ONLINE       0     0   190
       logs
         c8t2d0    ONLINE       0     0     0
       cache
         c8t3d0    ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

       /vmpool/nfs/duck/duck-flat.vmdk
       /vmpool/nfs/duck/Windows Server 2003 Standard Edition.nvram
       /vmpool/nfs/duck/duck_1-flat.vmdk
       /vmpool/nfs/eagle/eagle-flat.vmdk
       /vmpool/nfs/eagle/eagle_1-flat.vmdk
       /vmpool/nfs/eagle/eagle_2-flat.vmdk
       /vmpool/nfs/eagle/eagle_3-flat.vmdk
       /vmpool/nfs/eagle/eagle_5-flat.vmdk
       /vmpool/nfs/panda/Windows XP Professional.nvram
       /vmpool/nfs/panda/panda-flat.vmdk
       /vmpool/nfs/panda/template.xppro-flat.vmdk


I 'zpool clear vmpool', power on one of the VMs, and the checksum
count quickly reaches 970.

#3 - why would this be the case? I thought the purpose of a scrub
was to traverse all blocks, read them, and unearth problems? I'm
wondering why these 970 errors haven't already been unearthed by the scrub.

Re: [zfs-discuss] Single VDEV pool permanent and checksum errors after replace

2011-01-05 Thread Chris Murray
Hi Edward,

Thank you for the feedback. All makes sense.

To clarify, yes, I snapshotted the VM within ESXi, not the filesystems within 
the pool. Unfortunately, because of my misunderstanding of how ESXi 
snapshotting works, I'm now left without the option of investigating whether 
the replaced disk could be used to create a new pool.

For anyone interested, I removed the c8t1d0 disk from the VM, snapshotted, 
messed around a little, removed the 'corrupt' disks, added c8t1d0 back in, 
performed a 'zdb -l' which did show a disk of type 'replacing', with two 
children. That looked quite promising, but I wanted to wait until anyone had 
chipped in with some suggestions about how to recover from the replaced disk, 
so I decided to look at the corrupt data again. I reverted back to the snapshot 
in ESXi, bringing back my corrupt disks (as you'd expect), but which 
unfortunately *deleted* (!?) the VMDK files which related to c8t1d0. Not a 
ZFS/Solaris issue of any kind, I know, but one to watch out for potentially if 
anyone else is trying things out in this unsupported configuration.

Shame I can't look into getting data back from the 'good' virtual disk - that's 
probably something I'd like answered so I might look into again once I've put 
this matter to bed.

In the meantime, I'll see what I can do with dd_rescue or dd with 
'noerror,sync' to produce some swiss-cheese VMDK files and see whether the 
content can be repaired. It's not the end of the world if they're gone, but I'd 
like to satisfy my own curiosity with this little exercise in recovery.
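
As a sketch of that plan (the output path is illustrative, and bs is just a 
trade-off between speed and how much gets zeroed around each bad spot):

# dd if=/vmpool/nfs/duck/duck-flat.vmdk of=/backup/duck-flat.vmdk \
    bs=128k conv=noerror,sync

conv=noerror,sync carries on past blocks that ZFS refuses to return, padding 
them with zeros so the copy keeps its original size and the offsets inside the 
VMDK stay intact for any NTFS repair afterwards.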

Thanks again for the input,
Chris
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Single VDEV pool permanent and checksum errors after replace

2011-01-06 Thread Chris Murray
On 5 January 2011 13:26, Edward Ned Harvey
 wrote:
> One comment about etiquette though:
>


I'll certainly bear your comments in mind in future, however I'm not
sure what happened to the subject, as I used the interface at
http://opensolaris.org/jive/. I thought that would keep the subject
the same. Plus, my gmail account appears to have joined up my reply
from the web interface with the original thread too? Anyhow, I do see
your point about quoting, and will do from now.

For anyone wondering about the extent of checksum problems in my VMDK
files, they range from only 128KB worth in some, to 640KB in others.
Unfortunately it appears that the bad parts are in critical parts of
the filesystem, but it's not a ZFS matter so I'll see what can be done
by way of repair with Windows/NTFS inside each affected VM. So
whatever went wrong, it was only a small amount of data.

Thanks again,
Chris
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Single VDEV pool permanent and checksum errors after replace

2011-01-06 Thread Chris Murray
On 6 January 2011 20:02, Chris Murray  wrote:
> On 5 January 2011 13:26, Edward Ned Harvey
>  wrote:
>> One comment about etiquette though:
>>
>
>
> I'll certainly bear your comments in mind in future, however I'm not
> sure what happened to the subject, as I used the interface at
> http://opensolaris.org/jive/. I thought that would keep the subject
> the same. Plus, my gmail account appears to have joined up my reply
> from the web interface with the original thread too? Anyhow, I do see
> your point about quoting, and will do from now.
>
> For anyone wondering about the extent of checksum problems in my VMDK
> files, they range from only 128KB worth in some, to 640KB in others.
> Unfortunately it appears that the bad parts are in critical parts of
> the filesystem, but it's not a ZFS matter so I'll see what can be done
> by way of repair with Windows/NTFS inside each affected VM. So
> whatever went wrong, it was only a small amount of data.
>
> Thanks again,
> Chris
>

I'll get the hang of this e-mail lark one of these days, I'm sure  :-)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Unable to import zpool since system hang during zfs destroy

2008-08-25 Thread Chris Murray
Hi all, 

I have a RAID-Z zpool made up of 4 x SATA drives running on Nexenta 1.0.1 
(OpenSolaris b85 kernel). It has on it some ZFS filesystems and few volumes 
that are shared to various windows boxes over iSCSI. On one particular iSCSI 
volume, I discovered that I had mistakenly deleted some files from the FAT32 
partition that is on it. The files were still in a ZFS snapshot that was made 
earlier in the morning so I made use of the ZFS clone command to create a 
separate copy of the volume. I accessed it in Windows, got the files I needed, 
and then proceeded to delete it using "zfs destroy". During the process, disk 
activity stopped, my SSH windows stopped responding and Windows lost all iSCSI 
connections, reporting delayed write failed for the volumes that disappeared. 

I powered down the Nexenta box and started it back up, where it hung with the 
following output: 

SunOS Release 5.11 Version NexentaOS_20080312 64-bit 
Loading Nexenta... 
Hostname: mammoth 

This is before the usual "Reading ZFS config: done" and "Mounting ZFS 
filesystems" indicators. The only way I could bring system up was to disconnect 
all four SATA drives before power-on. I can then export the zpool, reboot, and 
the system comes up without complaint. However, of course, the pool isn't 
imported. When I execute "zpool import", the pool is detected fine: 

  pool: zp
    id: 2070286287887108251
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

        zp          ONLINE
          raidz1    ONLINE
            c0t1d0  ONLINE
            c0t0d0  ONLINE
            c0t3d0  ONLINE
            c0t2d0  ONLINE

The next issue is that when the pool is actually imported ("zpool import -f 
zp"), it too hangs the whole system, albeit after a minute or so of disk 
activity. A "zpool iostat zp 10" during that time is below: 

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zp          1.73T  1018G  1.13K      7  4.43M  23.6K
zp          1.73T  1018G  1.05K      0  4.07M      0
zp          1.73T  1018G  1.15K      0  4.88M      0
zp          1.73T  1018G    457      0  1.36M      0
zp          1.73T  1018G    668      0  2.49M      0
zp          1.73T  1018G    411      0  1.80M      0
[system stopped at this point and wouldn't accept keypresses any more] 

I'm lost as to what to do - every time the pool is imported, it briefly turns 
up in "zpool status", but will then hang the system to the extent that I must 
power off, disconnect drives, power up, zpool export, and reboot, just to be 
able to start typing commands again!! 

So far I've tried: 
1. Rebooting with only one of the SATA drives attached at a time. All four 
times the OS came up fine, but of course "zpool status" reported the pool as 
having insufficient replicas. I don't know whether powering up with two or 
three drives will work; I didn't want to try any permutations in case I made 
things worse. 

2. Checking with "fmdump -e", the only output relating to zfs is regarding 
missing vdev's and is presumably from when I have been rebooting with drives 
disconnected. 

3. "dd if=/dev/rdsk/c0t0d0 of=/"dev/null bs=1048576" and the equivalents for 
the other three drives are all currently running and I await the results. Given 
that a scrub takes about 7 hours, I expect I'll have to leave this overnight. 

4. "zdb -e zp" is now at the stage of "Traversing all blocks to verify 
checksums and verify nothing leaked ...". I expect this will also take some 
time. 

While I wait for the results from "dd" and "zdb", is there anything else I can 
try in order to get the pool up and running again? 

I have spotted some previous, similar posts regarding hanging, notably this 
one: 
http://opensolaris.org/jive/thread.jspa?threadID=70205&tstart=15 
Unfortunately, I am a bit of a Nexenta/OpenSolaris/Unix newbie so a lot of that 
is way over my head, and when the system completely hangs, I have no choice but 
to power off. Any help is much appreciated! 

Thanks,
Chris
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Unable to import zpool since system hang during zfs destroy

2008-08-25 Thread Chris Murray
Ah-ha! That certainly looks like the same issue Miles - well spotted! As it 
happens, the "zdb" command failed with "out of memory -- generating core dump" 
whereas all four dd's completed successfully.

I'm downloading snv96 right now - I'll install in the morning and post my 
results both here, and in the thread you mention.

If this works, I may stay with OpenSolaris again - I've been unable to use 
Nexenta as an iSCSI target for ESXi 3.5 because of the b85 kernel, so this 
upgrade to b96 may kill two birds with one stone.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Unable to import zpool since system hang during zfs destroy

2008-08-25 Thread Chris Murray
That's a good point - I'll try snv94 if I can get my hands on it - any idea 
where the download for it is? I've been going round in circles and all I can 
come up with are the variants of snv96 - CD, DVD (2 images), DVD (single 
image). Maybe that's a sign I should give up for the night!

Chris
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Unable to import zpool since system hang during zfs destroy

2008-08-26 Thread Chris Murray
Ok, used the development 2008.11 (b95) livecd earlier this morning to import 
the pool, and it worked fine. I then rebooted back into Nexenta and all is 
well. Many thanks for the help guys!

Chris
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] can anyone help me?

2008-08-26 Thread Chris Murray
Hi all,
I can confirm that this is fixed too. I ran into the exact same issue yesterday 
after destroying a clone:
http://www.opensolaris.org/jive/thread.jspa?threadID=70459&tstart=0

I used the b95-based 2008.11 development livecd this morning and the pool is 
now back up and running again after a quick import and export.

Chris
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS file permissions - some files missing over SMB?

2009-07-14 Thread Chris Murray
Hello,

Hopefully a quick and easy permissions problem here, but I'm stumped and 
quickly reached the end of my Unix knowledge.

I have a ZFS filesystem called "fs/itunes" on pool "zp". In it, the "iTunes 
music" folder contained a load of other folders - one for each artist.

During a resilver operation which was going to take a week, I decided to delete 
this data (amongst other things) and restore it from backup once the resilver 
was complete. It finished on Sunday night, so I started copying content from my 
Windows machine + an NTFS external disk, to "/zp/fs/itunes/iTunes music", using 
winscp.

Now, if I browse to that folder over SMB from the Windows machine, I have a 
subset of all of the artist names, and I can't identify exactly why some are 
there and others aren't. I'm accessing it using the "sharesmb=on" option, and 
user "chris".

So:
*  "\\mammoth\itunes\iTunes music" contains *some* folders.
*  If I winscp using user "chris" and browse to the same folder, everything is 
there.
*  Inspecting the properties on a folder that is visible, I see that it has 
group "staff", owner "chris", and permissions "0777".
*  One that is visible in winscp but NOT through Windows has the same ... ?!

I'm not sure what I've done here, but clearly there's something I don't 
understand about permissions. If I try to create one of the missing folders 
through Windows, I'm told "Cannot rename New Folder: A file with the name you 
specified already exists. Specify a different file name.", so they appear to be 
hidden from view in some way.

Thanks in advance,
Chris
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS file permissions - some files missing over SMB?

2009-07-14 Thread Chris Murray
The plot thickens ... I had a brainwave and tried accessing a 'missing' folder 
with the following on Windows:

explorer "\\mammoth\itunes\iTunes music\Dubfire"

I can open files within it and can rename them too. So .. still looks like a 
permissions problem to me, but in what way, I'm not quite sure.

Forgot to mention, I'm using SXCE b105.

Thanks again.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS file permissions - some files missing over SMB?

2009-07-14 Thread Chris Murray
Thanks Mark. I ran the script and found references in the output to 'aclmode' 
and 'aclinherit'. I had in the back of my mind that I've had to mess on with 
ZFS ACL's in the past, aside from using chmod with the usual numeric values. 
That's given me something to go on. I'll post to cifs-discuss if I don't get 
anywhere.
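
For reference, the things to compare are the full NFSv4 ACLs (which the 
numeric chmod modes don't show) and the dataset's ACL behaviour - a sketch, 
where 'Dubfire' is the folder already known to be hidden and the second name 
is illustrative:

# ls -Vd "/zp/fs/itunes/iTunes music/Dubfire"
# ls -Vd "/zp/fs/itunes/iTunes music/SomeVisibleArtist"
# zfs get aclmode,aclinherit zp/fs/itunes

A deny entry (or a missing read_attributes/read_acl allow) present on the 
hidden folders but not on the visible ones would explain folders that winscp 
can list but the SMB client filters out of directory listings.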

Thanks for pointing me in the right direction.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] *Almost* empty ZFS filesystem - 14GB?

2009-08-16 Thread Chris Murray
Accidentally posted the below earlier against ZFS Code, rather than ZFS 
Discuss. 

My ESXi box now uses ZFS filesystems which have been shared over NFS. Spotted 
something odd this afternoon - a filesystem which I thought didn't have any 
files in it, weighs in at 14GB. Before I start deleting the empty folders to 
see what happens, any ideas what's happened here?

# zfs list | grep temp
zp/nfs/esx_temp 14.0G 225G 14.0G /zp/nfs/esx_temp
# ls -la /zp/nfs/esx_temp
total 20
drwxr-xr-x 5 root root 5 Aug 13 12:54 .
drwxr-xr-x 7 root root 7 Aug 13 12:40 ..
drwxr-xr-x 2 root root 2 Aug 13 12:53 iguana
drwxr-xr-x 2 root root 2 Aug 13 12:54 meerkat
drwxr-xr-x 2 root root 2 Aug 16 19:39 panda
# ls -la /zp/nfs/esx_temp/iguana/
total 8
drwxr-xr-x 2 root root 2 Aug 13 12:53 .
drwxr-xr-x 5 root root 5 Aug 13 12:54 ..
# ls -la /zp/nfs/esx_temp/meerkat/
total 8
drwxr-xr-x 2 root root 2 Aug 13 12:54 .
drwxr-xr-x 5 root root 5 Aug 13 12:54 ..
# ls -la /zp/nfs/esx_temp/panda/
total 8
drwxr-xr-x 2 root root 2 Aug 16 19:39 .
drwxr-xr-x 5 root root 5 Aug 13 12:54 ..
#

Could there be something super-hidden, which I can't see here?

There don't appear to be any snapshots relating to zp/nfs/esx_temp.

On a suggestion, I have run the following:

# zfs list -r zp/nfs/esx_temp
NAME USED AVAIL REFER MOUNTPOINT
zp/nfs/esx_temp 14.0G 225G 14.0G /zp/nfs/esx_temp
# du -sh /zp/nfs/esx_temp
8K /zp/nfs/esx_temp
#
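
One read-only way to pin down where the space has gone is to dump the 
dataset's object list and look for anything large with no resolvable path - 
a sketch:

# zdb -dddd zp/nfs/esx_temp

A big 'ZFS plain file' object whose path shows as ??? usually means an 
unlinked (zero-link) file that is still held open or has leaked, which would 
square with a filesystem that du sees as 8K but zfs list charges 14GB for.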

Thanks,
Chris
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] *Almost* empty ZFS filesystem - 14GB?

2009-08-16 Thread Chris Murray
Thanks Tim. Results are below:

# zfs list -t snapshot -r zp/nfs/esx_temp
no datasets available
# zfs get refquota,refreservation,quota,reservation zp/nfs/esx_temp
NAME PROPERTYVALUE  SOURCE
zp/nfs/esx_temp  refquotanone   default
zp/nfs/esx_temp  refreservation  none   default
zp/nfs/esx_temp  quota   none   default
zp/nfs/esx_temp  reservation none   default
#

I'm 99% sure I've done something 'obvious', which is escaping me at the minute. 
The filesystem only has 3 folders in it, which are all empty, so I could very 
quickly destroy and recreate, but it has piqued my curiosity enough to wonder 
what's happened here.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] *Almost* empty ZFS filesystem - 14GB?

2009-08-18 Thread Chris Murray
I don't have quotas set, so I think I'll have to put this down to some sort of 
bug. I'm on SXCE 105 at the minute, ZFS version is 3, but zpool is version 13 
(could be 14 if I upgrade). I don't have everything backed-up so won't do a 
"zpool upgrade" just at the minute. I think when SXCE 120 is released, I'll 
install that, upgrade my pool and see if the filesystem still registers as 
14GB. If it does, I'll destroy and recreate - no biggie! :-)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] *Almost* empty ZFS filesystem - 14GB?

2009-08-21 Thread Chris Murray
Nico, what is a zero-link file, and how would I go about finding whether I have 
one? You'll have to bear with me, I'm afraid, as I'm still building my Solaris 
knowledge at the minute - I was brought up on Windows. I use Solaris for my 
storage needs now though, and slowly improving on my knowledge so I can move 
away from Windows one day  :)

If it makes any difference, the problem persists after a full reboot, and I've 
deleted the three folders, so now there is literally nothing in that filesystem 
.. yet it reports 14GB.

It's not too much of an inconvenience, but it does make me wonder whether the 
'used' figures on my other filesystems and zvols are correct.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] *Almost* empty ZFS filesystem - 14GB?

2009-08-21 Thread Chris Murray
That looks like it indeed. Output of zdb -

    Object  lvl   iblk   dblk  lsize  asize  type
         9    5    16K     8K   150G  14.0G  ZFS plain file
                                        264  bonus  ZFS znode
        path    ???

Thanks for the help in clearing this up - satisfies my curiosity.  Nico, I'll 
add those commands to the little list in my mind and they'll push the Windows 
ones out in no time  :)

I'll go to b120 when it is available.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Persistent errors - do I believe?

2009-09-17 Thread Chris Murray
I can flesh this out with detail if needed, but a brief chain of events is:

1. RAIDZ1 zpool with drives A, B, C & D (I don't have access to see original 
drive names)
2. New disk E. Replaced A with E.
3. Part way through resilver, drive D was 'removed'
4. 700+ persistent errors detected, and lots of checksum errors on all drives. 
Surprised by this - I thought the absence of one drive could be tolerated?
5. Exported, rebooted, imported. Drive D present now. Good. :-)
6. Drive D disappeared again. Bad.  :-(
7. This time, only one persistent error.

Does this mean that there aren't errors in the other 700+ files that it 
reported the first time, or have I lost my chance to note these down, and they 
are indeed still corrupt?

I've re-ran step 5 again, so it is now on the third attempted resilver. 
Hopefully drive D won't remove itself again, and I'll actually have 30+ hours 
of stability while the new drive resilvers ...

Chris
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Persistent errors - do I believe?

2009-09-17 Thread Chris Murray
Thanks David. Maybe I mis-understand how a replace works? When I added disk E, 
and used 'zpool replace [A] [E]' (still can't remember those drive names), I 
thought that disk A would still be part of the pool, and read from in order to 
build the contents of disk E? Sort of like a safer way of doing the old 'swap 
one drive at a time' trick with RAID-5 arrays?

Chris
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Persistent errors - do I believe?

2009-09-20 Thread Chris Murray
Ok, the resilver has been restarted a number of times over the past few days 
due to two main issues - a drive disconnecting itself, and power failure. I 
think my troubles are 100% down to these environmental factors, but would like 
some confidence that after the resilver has completed, if it reports there 
aren't any persistent errors, that there actually aren't any.

Attempt #1: the resilver started after I initiated the replace on my SXCE105 
install. All was well until the box lost power. On starting back up, it hung 
while starting OpenSolaris - just after the line containing the system 
hostname. I've had this before when a scrub is in progress. My usual tactic is 
to boot with the 2009.06 live CD, import the pool, stop the scrub, export, 
reboot into SXCE105 again, and import. Of course, you can't stop a replace 
that's in progress, so the remaining attempts are in the 2009.06 live CD (build 
111b perhaps?)

Attempt #2: the resilver started on importing the pool in 2009.06. It was 
resilvering fine until one drive reported itself as offline. dmesg showed that 
the drive was 'gone'. I then noticed a lot of checksum errors at the pool 
level, and RAIDZ1 level, and a large number of 'permanent' errors. In a panic, 
thinking that the resilver was now doing more harm than good, I exported the 
pool and rebooted.

Attempt #3: I imported in 2009.06 again. This time, the drive that was 
disconnected last attempt was online again, and proceeded to resilver along 
with the original drive. There was only one permanent error - in a particular 
snapshot of a ZVOL I'm not too concerned about. This is the point that I wrote 
the original post, wondering if all of those 700+ errors reported the first 
time around weren't a problem any more. I have been running zpool clear in a 
loop because there were checksum errors on another of the drives (neither of 
the two part of the replacing vdev, and not the one that was removed 
previously). I didn't want it to be marked as faulty, so I kept the zpool clear 
running. Then .. power failure.

Attempt #4: I imported in 2009.06. This time, no errors detected at all. Is 
that a result of my zpool clear? Would that clear any 'permanent' errors? From 
the wording, I'd say it wouldn't, and therefore the action of starting the 
resilver again with all of the correct disks in place hasn't found any errors 
so far ... ? Then, disk removal again ... :-(

Attempt #5: I'm convinced that drive removal is down to faulty cabling. I move 
the machine, completely disconnect all drives, re-wire all connections with new 
cables, and start the scrub again in 2009.06. Now, there are checksum errors 
again, so I'm running zpool clear in order to keep drives from being marked as 
faulted .. but I also have this:

errors: Permanent errors have been detected in the following files:
zp/iscsi/meerkat_t...@20090905_1631:<0x1>

I have a few of my usual VMs powered up (ESXi connecting using NFS), and they 
appear to be fine. I've ran a chkdsk in the windows VMs, and no errors are 
reported. Although I can't be 100% confident that any of those files were in 
the original list of 700+ errors. In the absence of iscsitgtd, I'm not powering 
up the ones that rely on iSCSI just yet.

My next steps will be:
1. allow the resilver to finish. Assuming I don't have yet another power cut, 
this will be in about 24 hours.
2. zpool export
3. reboot into SXCE
4. zpool import
5. start all my usual virtual machines on the ESXi host
6. note whether that permanent error is still there <-- this will be an 
interesting one for me - will the export & import clear the error? will my 
looped zpool clear have simply reset the checksum counters to zero, or will it 
have cleared this too?
7. zpool scrub to see what else turns up.

Chris
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Persistent errors - do I believe?

2009-09-22 Thread Chris Murray
I've had an interesting time with this over the past few days ...

After the resilver completed, I had the message "no known data errors" in a 
zpool status.

I guess the title of my post should have been "how permanent are permanent 
errors?". Now, I don't know whether the action of completing the resilver was 
the thing that fixed the one remaining error (in the snapshot of the 'meerkat' 
zvol), or whether my looped zpool clear commands have done it. Anyhow, for 
space/noise reasons, I set the machine back up with the original cables 
(eSATA), in its original tucked-away position, installed SXCE 119 to get me 
remotely up to date, and imported the pool.

So far so good. I then powered up a load of my virtual machines. None of them 
report errors when running a chkdsk, and SQL Server 'DBCC CHECKDB' hasn't 
reported any problems yet. Things are looking promising on the corruption front 
- feels like the errors that were reported while the resilvers were in progress 
have finally been fixed by the final (successful) resilver! Microsoft Exchange 
2003 did complain of corruption of mailbox stores, however I have seen this a 
few times as a result of unclean shutdowns, and don't think it's related to the 
errors that ZFS was reporting on the pool during resilver.

Then, 'disk is gone' again - I think I can definitely put my original troubles 
down to cabling, which I'll sort out for good in the next few days. Now, I'm 
back on the same SATA cables which saw me through the resilvering operation.

One of the drives is showing read errors when I run dmesg. I'm having one 
problem after another with this pool!! I think the disk I/O during the resilver 
has tipped this disk over the edge. I'll replace it ASAP, and then I'll test 
the drive in a separate rig and RMA it.

Anyhow, there is one last thing that I'm struggling with - getting the pool to 
expand to use the size of the new disk. Before my original replace, I had 3x1TB 
and 1x750GB disk. I replaced the 750 with another 1TB, which by my reckoning 
should give me around 4TB as a total size even after checksums and metadata. No:

# zpool list
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
rpool    74G  8.81G  65.2G    11%  ONLINE  -
zp     2.73T  2.36T   379G    86%  ONLINE  -

2.73T? I'm convinced I've expanded a pool in this way before. What am I missing?

Chris
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Persistent errors - do I believe?

2009-09-24 Thread Chris Murray
Cheers, I did try that, but still got the same total on import - 2.73TB

I even thought I might have just made a mistake with the numbers, so I made a 
sort of 'quarter scale model' in VMware and OSOL 2009.06, with 3x250G and 
1x187G. That gave me a size of 744GB, which is *approx* 1/4 of what I get in 
the physical machine. That makes sense. I then replaced the 187 with another 
250, still 744GB total, as expected. Exported & imported - now 996GB. So, the 
export and import process seems to be the thing to do, but why it's not working 
on my physical machine (SXCE119) is a mystery. I even contemplated that there 
might have still been a 750GB drive left in the setup, but they're all 1TB 
(well, 931.51GB).

Any ideas what else it could be?

For anyone interested in the checksum/permanent error thing, I'm running a 
scrub now. 59% done and not one error.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Error: "Volume size exceeds limit for this system"

2007-11-07 Thread Chris Murray
Hi all,

I am experiencing an issue when trying to set up a large ZFS volume in 
OpenSolaris build 74 and the same problem in Nexenta alpha 7. I have looked on 
Google for the error and have found zero (yes, ZERO) results, so I'm quite 
surprised! Please can someone help?

I am setting up a test environment in VMware Workstation 6.0 before I unleash 
this on real hardware. The virtual system has an IDE HDD, IDE CD-ROM and 4 750 
GB SCSI disks (c2t0d0 --> c2t3d0). Once working, I hope to use the iSCSI target 
to see the volume from a Windows box.

I create the zpool with:
zpool create zp raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0

This completes fine, and then a "zpool iostat" reports the pool as having 
2.93T free. All makes sense so far.

Then, I try to create a ZFS volume in there of size 1.25T with the following 
command:
zfs create -V 1.25T zp/test

This then fails with:
cannot create 'zp/test': volume size exceeds limit for this system

Please can someone help me with this? I've tried making sense of the word 
"system" in this context but haven't come up with much. A volume of 1T works 
fine. Could it be something to do with having a 32 bit CPU?
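
A quick check of that suspicion - a sketch:

# isainfo -kv
32-bit i386 kernel modules

On a 32-bit kernel a zvol (like any block device there) is limited to 1TB, 
which matches the 1T volume working and the 1.25T one being refused; a 64-bit 
kernel doesn't have that limit.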

Many thanks,
Chris
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Error: "Volume size exceeds limit for this system"

2007-11-12 Thread Chris Murray
Thanks for the help guys - unfortunately the only hardware at my disposal just 
at the minute is all 32 bit, so I'll just have to wait a while and fork out on 
some 64-bit kit before I get the drives. I'm a home user so I'm glad I didn't 
buy the drives and discover I couldn't use them without spending even more!!

Chris
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Assistance needed expanding RAIDZ with larger drives

2008-01-10 Thread Chris Murray
Hi all,

Please can you help with my ZFS troubles:

I currently have 3 x 400 GB Seagate NL35's and a 500 GB Samsung Spinpoint in a 
RAIDZ array that I wish to expand by systematically replacing each drive with a 
750 GB Western Digital Caviar.

After failing miserably, I'd like to start from scratch again if possible. When 
I last tried, the replace command hung for an age, network connectivity was 
lost (I have to do this by SSH, so that dropped too), and I got strange outputs 
from zpool -status where a particular device (but not the one that was being 
replaced) was listed twice. I also managed to mess up 3 of the 4 WD drives in 
such a way that they were detected by the bios, but no system after that would 
work:

In solaris, a format wouldn't show the drive (c2d0 should have been there, but 
it wasn't)
Using an old Ghost boot CD, the old 3 drives would list, but the new one 
wouldn't.
GParted couldn't see the new drive (the old 3 were there!)

Again, in all of these situations, the drive WAS detected in the BIOS and 
everything looked perfectly fine in there. I have now managed to salvage one 
using Windows and a USB caddy in another computer. This is formatted as one big 
NTFS volume. So the drive does work.

Here is the output from zpool status -v:

| # zpool status -v
|   pool: zp
|  state: ONLINE
|  scrub: scrub in progress, 30.76% done, 4h33m to go
| config:
| 
| NAMESTATE READ WRITE CKSUM
| zp  ONLINE   0 0 0
|   raidz1ONLINE   0 0 0
| c2d1ONLINE   0 0 0
| c3d1ONLINE   0 0 0
| c2d0ONLINE   0 0 0
| c3d0ONLINE   0 0 0
|
| errors: No known data errors

And here is the output from ZDB:

| # zdb
| zp
| version=9
| name='zp'
| state=0
| txg=110026
| pool_guid=5629347939003043989
| hostid=823611165
| hostname='mammoth'
| vdev_tree
| type='root'
| id=0
| guid=5629347939003043989
| children[0]
| type='raidz'
| id=0
| guid=1325151684809734884
| nparity=1
| metaslab_array=14
| metaslab_shift=33
| ashift=9
| asize=1600289505280
| is_log=0
| children[0]
| type='disk'
| id=0
| guid=5385778296365299126
| path='/dev/dsk/c2d1s0'
| devid='id1,[EMAIL PROTECTED]/a'
| phys_path='/[EMAIL PROTECTED],0/[EMAIL 
PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
| whole_disk=1
| DTL=33
| children[1]
| type='disk'
| id=1
| guid=15098521488705848306
| path='/dev/dsk/c3d1s0'
| devid='id1,[EMAIL PROTECTED]/a'
| phys_path='/[EMAIL PROTECTED],0/[EMAIL 
PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
| whole_disk=1
| DTL=32
| children[2]
| type='disk'
| id=2
| guid=4518340092563481291
| path='/dev/dsk/c2d0s0'
| devid='id1,[EMAIL PROTECTED]/a'
| phys_path='/[EMAIL PROTECTED],0/[EMAIL 
PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
| whole_disk=1
| DTL=31
| children[3]
| type='disk'
| id=3
| guid=7852006658048665355
| path='/dev/dsk/c3d0s0'
| devid='id1,[EMAIL PROTECTED]/a'
| phys_path='/[EMAIL PROTECTED],0/[EMAIL 
PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
| whole_disk=1
| DTL=30

I have found many, many posts regarding issues when replacing drives, and have 
used bits of them to get myself in a working state .. but I'm now confused with 
all the imports, exports, detaches etc... Please can someone help me out 
step-by-step with this one and start with:

The array is functioning perfectly but I want to replace c2d0 with a new WD 
drive. Once the scrub completes, what do I do?
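
For reference, the basic sequence once the scrub finishes, assuming there is 
a spare port so the new disk can sit alongside the old one (c4d0 below is just 
a stand-in for wherever the new 750GB WD shows up in format):

# zpool replace zp c2d0 c4d0
# zpool status -v zp

Leave both disks connected until zpool status shows the resilver has completed 
and c2d0 has dropped out of the config. If the old disk has to come out first, 
a plain 'zpool replace zp c2d0' after physically swapping the drive does the 
same job by rebuilding from parity instead.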

Many thanks,

Chris Murray
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Assistance needed expanding RAIDZ with larger drives

2008-01-13 Thread Chris Murray
> About that issue, please check my post in:
> http://www.opensolaris.org/jive/thread.jspa?threadID=48483&tstart=0

Thanks - when I originally tried to replace the first drive, my intention was 
to:
1. Move solaris box and drives
2. Power up to test it still works
3. Power down
4. Replace drive.

I suspect I may have missed out 2 & 3, and ran into the same situation that you 
did.

Anyhow, I seem to now be in an even bigger mess than earlier - when I tried to 
simply swap out one of the old drives with a new one and perform a replace, I 
ran into problems:

1. The hard drive light on the PC lit up, and I heard lots of disk noise, as 
you would expect
2. The light went off. My continuous ping did the following:

Reply from 192.168.0.10: bytes=32 time<1ms TTL=255
Reply from 192.168.0.10: bytes=32 time<1ms TTL=255
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.0.10: bytes=32 time=2092ms TTL=255
Reply from 192.168.0.10: bytes=32 time<1ms TTL=255
Reply from 192.168.0.10: bytes=32 time<1ms TTL=255

3. The light came back on again .. more disk noise. Good - perhaps the pause 
was just a momentary blip
4. Light goes off (this is about 20 minutes since the start)
5. zpool status reports that a resilver completed, and there are errors in 
zp/storage and zp/VMware, and suggests that I should restore from backup
6. I nearly cry, as these are the only 2 files I use.
7. I have heard of ZFS thinking that there are unrecoverable errors before, so 
I run zpool scrub and then zpool clear a number of times. Seems to make no 
difference.

This whole project started when I wanted to move 900 GB of data from a server 
2003 box containing the 4 old disks, to a solaris box. I borrowed 2 x 500 GB 
drives from a friend, copied all the data onto them, put the 4 old drives into 
the solaris box, created the zpool, created my storage and VMware volumes, 
shared them out using iSCSI, created NTFS volumes on the server 2003 box and 
copied the data back onto them. Aside from a couple of networking issues, this 
worked absolutely perfectly. Then I decided I'd like some more space, and 
that's where it all went wrong.

Despite the reports of corruption, the storage and VMware "drives" do still 
work in Windows. The iSCSI initiator still picks them up, and if I run dir /a 
/s, I can see all of the files that were on these NTFS volumes before I tried 
this morning's replace. However, should I trust this? I suspect that even if I 
ran a chkdsk /f, a successful result may not be all that it seems. I still have 
the 2 x 500 GB drives with my data from weeks ago. I'd be sad to lose a few 
weeks worth of work, but that would be better than assuming that ZFS is 
incorrect in saying the volumes are corrupt and then discovering in months time 
that I cannot get at NTFS files because of this root cause.

Since the report of corruption in these 2 volumes, I had a genius 
troubleshooting idea - "what if the problem is not with ZFS, but instead with 
Solaris not liking the drives in general?". I exported my current zpool, 
disconnected all drives, plugged in the 4 new ones, and waited for the system 
to boot again... nothing. The system had stopped in the BIOS, requesting that I 
press F1 as SMART reports that one of the drives is bad! Already?!? I only 
bought the drives a few days ago!!! Now the problem is that I know which of 
these drives is bad, but I don't know whether this was the one that was plugged 
in when zpool status reported all the read/write/checksum errors.

So maybe I have a duff batch of drives .. I leave the remaining 3 plugged in 
and create a brand new zpool called test. No problems at all. I create a 1300 
GB volume on it. Also no problem. I'm currently overwriting it with random data:

dd if=/dev/urandom of=/dev/zvol/rdsk/test/test bs=1048576 count=1331200

I throw in the odd zpool scrub to see how things are doing so far and as yet, 
there hasn't been a single error of any sort. So, 3 of the WD drives (0430739, 
0388708, 0417089) appear to be fine and one is dead already (0373211).

So this leads me to the conclusion that (ignoring the bad one), these drives 
work fine with Solaris. They work fine with ZFS too. It's just the act of 
trying to replace a drive from my old zpool with a new one that causes issues.

My next step will be to run the WD diagnostics on all drives, send the broken 
one back, and then have 4 fully functioning 750 GB drives. I'll also import the 
old zpool into the solaris box - it'll undoubtedly complain that one of the 
drives is missing (the one that I tried to add earlier and got all the errors), 
so I think I'll try one more replace to get all 4 old drives back in the pool. 

So, what do I do after that?

1. Create a brand new pool out of the WD drives, share it using iSCSI and copy 
onto that my data from my friend's drives? I'll have lost a good few weeks of 
work but I'll be confident that it isn't corrupt.
2. Ignore the corruption reports, since the storage and VMware volumes still 
appear to work from Windows, and carry on using them?

Re: [zfs-discuss] Q : change disks to get bigger pool

2008-01-20 Thread Chris Murray
This process should work, but make sure you don't swap any cables around while 
you replace a drive, or you'll run into the situation described in the 
following thread, as I did:

http://www.opensolaris.org/jive/thread.jspa?threadID=48483&tstart=0

Chris
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss