[zfs-discuss] Validating alignment of NTFS/VMDK/ZFS blocks
Good evening,

I understand that NTFS & VMDK do not relate to Solaris or ZFS, but I was wondering if anyone has any experience of checking the alignment of data blocks through that stack? I have a VMware ESX 4.0 host using storage presented over NFS from ZFS filesystems (recordsize 4KB). Within virtual machine VMDK files, I have formatted NTFS filesystems, block size 4KB. Dedup is turned on.

When I run zdb -DD, I see a figure of unique blocks which is higher than I expect, which makes me wonder whether any given 4KB block in the NTFS filesystem is perfectly aligned with a 4KB block in ZFS. e.g. consider two virtual machines sharing lots of the same blocks. Assuming there /is/ a misalignment between NTFS & VMDK, or between VMDK & ZFS, then even blocks with identical content don't line up on the same boundaries, and will actually produce different blocks in ZFS:

VM1
NTFS    1---2---3---
ZFS   1---2---3---4---

ZFS blocks are " AA", "AABB" and so on ...

Then in another virtual machine, the blocks are in a different order:

VM2
NTFS    1---2---3---
ZFS   1---2---3---4---

ZFS blocks for this VM would be " CC", "CCAA", "AABB" etc.

So, no overlap between virtual machines, and no benefit from dedup.

I may have it wrong, and there are indeed 30,785,627 unique blocks in my setup, but if there's a mechanism for checking alignment, I'd find that very helpful.

Thanks,
Chris
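For what it's worth, one way to check the guest partition alignment from the ZFS side is to read the MBR out of the flat VMDK and see where the first partition starts. This is only a sketch - it assumes an MBR-partitioned guest, a preallocated ("-flat") VMDK, an x86 host, and ksh/bash arithmetic; the path is made up for illustration:

# start sector of the first MBR partition entry: 4 bytes, little-endian, at byte offset 454
start=$(dd if=/zp/nfs/vm1/vm1-flat.vmdk bs=1 skip=454 count=4 2>/dev/null | od -A n -t u4 | tr -d ' ')
echo "partition starts at byte $((start * 512))"
echo "remainder against 4KB: $(( (start * 512) % 4096 ))"   # 0 means the NTFS partition is 4K-aligned

The Windows XP / Server 2003 era default of sector 63 gives a remainder of 3584, i.e. misaligned against a 4KB recordsize.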
Re: [zfs-discuss] Validating alignment of NTFS/VMDK/ZFS blocks
Please excuse my pitiful example. :-) I meant to say "*less* overlap between virtual machines", as clearly block "AABB" occurs in both.
Re: [zfs-discuss] Validating alignment of NTFS/VMDK/ZFS blocks
That's a good idea, thanks. I get the feeling the remainder won't be zero, which will back up the misalignment theory.

After a bit more digging, it seems the problem is really an NTFS partition alignment issue and can be addressed irrespective of the underlying storage system. I think I'm going to try the process in the following link:

http://www.tuxyturvy.com/blog/index.php?/archives/59-Aligning-Windows-Partitions-Without-Losing-Data.html

With any luck I'll then see a smaller dedup table, and better performance!

Thanks to everyone for the feedback,
Chris
[zfs-discuss] Expanding RAIDZ with larger disks - can't see all space.
Posting this question again as I originally tagged it onto the end of a series of longwinded posts of mine where I was having problems replacing a drive. After dodgy cabling and a few power cuts, I finally got the new drive resilvered.

Before this final replace, I had 3 x 1TB & 1 x 750GB drives in a RAIDZ1 zpool. After the replace, all four are 1TB, but I can still only see a total of 2.73TB in zpool list. I have tried:

1. Reboot.
2. zpool export, then a zpool import.
3. zpool export, reboot, zpool import.

However, I can still only see 2.73TB total. Any ideas what it could be? All four disks show as 931.51GB in format. Where should I start troubleshooting to see why ZFS isn't using all of this space?

I'm currently on SXCE119. I tried a mock-scenario in VMware using the 2009.06 live CD, which worked correctly after an export and import. Can't do this on my setup, however, as I have upgraded my zpool to the latest version, and it can't be read using the CD now.

Thanks,
Chris
Re: [zfs-discuss] Expanding RAIDZ with larger disks - can't see all space.
I knew it would be something simple!! :-) Now 3.63TB, as expected, and no need to export and import either! Thanks Richard, that's done the trick.

Chris
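The suggestion that fixed this isn't quoted above; my guess (not confirmed by the thread) is the autoexpand behaviour available on later builds, which lets a pool grow into replaced, larger disks without an export/import. A sketch, using this pool's name and an example device:

# zpool set autoexpand=on zp
# zpool online -e zp c0t1d0     # or expand one device explicitly; the device name here is only an example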
Re: [zfs-discuss] Troubleshooting dedup performance
So if the ZFS checksum is set to fletcher4 at the pool level, and dedup=on, which checksum will it be using? If I attempt to set dedup=fletcher4, I do indeed get this:

cannot set property for 'zp': 'dedup' must be one of 'on | off | verify | sha256[,verify]'

Could it be that my performance troubles are due to the calculation of two different checksums?

Thanks,
Chris

-----Original Message-----
From: cyril.pli...@gmail.com [mailto:cyril.pli...@gmail.com] On Behalf Of Cyril Plisko
Sent: 16 December 2009 17:09
To: Andrey Kuzmin
Cc: Chris Murray; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Troubleshooting dedup performance

>> I've set dedup to what I believe are the least resource-intensive
>> settings - "checksum=fletcher4" on the pool, & "dedup=on" rather than
>
> I believe checksum=fletcher4 is acceptable in dedup=verify mode only.
> What you're doing is seemingly deduplication with weak checksum w/o
> verification.

I think fletcher4 use for the deduplication purposes was disabled [1] at all, right before build 129 cut.

[1] http://hg.genunix.org/onnv-gate.hg/diff/93c7076216f6/usr/src/common/zfs/zfs_prop.c

--
Regards,
Cyril
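As I understand it (worth verifying against your build), turning dedup on makes ZFS checksum the deduplicated blocks with sha256 regardless of the dataset's checksum property, so fletcher4 would not be used for those blocks. The effective settings can at least be confirmed per dataset:

# zfs get -r checksum,dedup zp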
Re: [zfs-discuss] Troubleshooting dedup performance
In case the overhead in calculating SHA256 was the cause, I set ZFS checksums to SHA256 at the pool level, and left for a number of days. This worked fine. Setting dedup=on immediately crippled performance, and then setting dedup=off fixed things again. I did notice through a zpool iostat that disk IO increased while dedup was on, although it didn't from the ESXi side.

Could it be that dedup tables don't fit in memory? I don't have a great deal - 3GB. Is there a measure of how large the tables are in bytes, rather than number of entries?

Chris
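A rough way to gauge whether the dedup table fits in RAM, assuming the commonly quoted estimate of roughly 320 bytes of ARC per DDT entry (an approximation, not an exact figure):

# zdb -DD zp                                             # prints a DDT summary and histogram, including total entries
# echo "30785627 * 320 / 1024 / 1024 / 1024" | bc -l     # ~9.2 GiB in core for ~30.8 million entries

A table that size against 3GB of RAM would go a long way towards explaining the collapse in performance once dedup is switched on.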
[zfs-discuss] zpool import hang - possibly dedup related?
I'm trying to import a pool into b132 which once had dedup enabled, after the machine was shut down with an "init 5". However, the import hangs the whole machine and I eventually get kicked off my SSH sessions. As it's a VM, I can see that processor usage jumps up to near 100% very quickly, and stays there. Longest I've left it is 12 hours.

Before I shut down the VM, there was only around 5GB of data in that zpool. There doesn't appear to be any disk activity while it's in this stuck state.

Are there any troubleshooting tips on where I can start to look for answers? The virtual machine is running on ESXi 4, with two virtual CPUs and 3GB RAM.

Thanks in advance,
Chris
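One thing that can help narrow down a hung import, while the console still responds, is dumping kernel thread stacks and seeing where the ZFS threads are sitting - a sketch, assuming mdb behaves this way on the build in question:

# echo "::threadlist -v" | mdb -k     # all kernel threads with stack traces
# echo "::stacks -m zfs" | mdb -k     # on builds whose mdb has ::stacks, just the threads in the zfs module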
[zfs-discuss] Hang on zpool import (dedup related)
Another hang on zpool import thread, I'm afraid, because I don't seem to have observed any great successes in the others and I hope there's a way of saving my data ...

In March, using OpenSolaris build 134, I created a zpool, some zfs filesystems, enabled dedup on them, moved content into them and promptly discovered how slow it was because I only have 4GB RAM. Even with 30GB L2ARC, the performance was unacceptable. The trouble started when the machine hung one day. Ever since, I've been unable to import my pool without it hanging again. At the time I saw posts from others who had run into similar problems, so I thought it best that I wait until a later build, on the assumption that some ZFS dedup bug would be fixed and I could see my data again. I've been waiting ever since, and only just had a chance to try build 147, thanks to illumos and a schillix live CD. However, the pool still won't import, so I'd much appreciate any troubleshooting hints and tips to help me on my way.

schillix b147i

My process is:
1. boot the live CD.
2. on the console session, run vmstat 1
3. from another machine, SSH in with multiple sessions and:
   vmstat 60
   vmstat 1
   zpool import -f zp
   zpool iostat zp 1
   zpool iostat zp -v 5
4. wait until it all stops

What I observe is that the zpool import command never finishes. There will be a lengthy period of read activity made up of very small reads, which then stops before an even longer period of what looks like no disk activity:

zp           512G  1.31T      0      0      0      0

The box will be responsive for quite some time, seemingly doing not a great deal:

 kthr      memory            page            disk          faults      cpu
 r b w   swap    free   re mf pi po fr de sr cd cd rm s0   in   sy   cs us sy id
 0 0 0 2749064 3122988   0  7  0  0  0  0  0  0  1  0  0  365  218  714  0  1 99

Then after a matter of hours it'll hang. SSH sessions are no longer responsive. On the console I can press return which creates a new line, but vmstat will have stopped updating.

Interestingly, what I observed in b134 was the same thing, however the free memory would slowly decrease over the course of hours, before a sudden nose-dive right before the lock up. Now it appears to hang without that same effect.

While the import appears to be working, I can cd to /zp and look at content of the filesystems of 5 of the 9 "esx*" directories. Coincidence or not, it's the last four which appear to be empty - esx_prod onward.

# zfs list
NAME                       USED  AVAIL  REFER  MOUNTPOINT
zp                         905G  1.28T    23K  /zp
zp/nfs                     889G  1.28T    32K  /zp/nfs
zp/nfs/esx_dev             264G  1.28T   264G  /zp/nfs/esx_dev
zp/nfs/esx_hedgehog       25.8G  1.28T  25.8G  /zp/nfs/esx_hedgehog
zp/nfs/esx_meerkat         223G  1.28T   223G  /zp/nfs/esx_meerkat
zp/nfs/esx_meerkat_dedup   938M  1.28T   938M  /zp/nfs/esx_meerkat_dedup
zp/nfs/esx_page           8.90G  1.28T  8.90G  /zp/nfs/esx_page
zp/nfs/esx_prod            306G  1.28T   306G  /zp/nfs/esx_prod
zp/nfs/esx_skunk            21K  1.28T    21K  /zp/nfs/esx_skunk
zp/nfs/esx_temp           45.5G  1.28T  45.5G  /zp/nfs/esx_temp
zp/nfs/esx_template       15.2G  1.28T  15.2G  /zp/nfs/esx_template

Any help would be appreciated. What could be going wrong here? Is it getting progressively closer to becoming imported each time I try this, or will it be starting from scratch? Feels to me like there's an action in the /zp/nfs/esx_prod filesystem it's trying to replay and never getting to the end of, for some reason. In case it was getting in a muddle with the l2arc, I removed the cache device a matter of minutes into this run.
It hasn't hung yet, vmstat is still updating, but I tried a 'zpool import' in one of the windows to see if I could even see a pool on another disk, and that hasn't returned me back to the prompt yet. Also tried to SSH in with another session, and that hasn't produced the login prompt.

Thanks in advance,
Chris
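For reference, the approach the follow-up below describes boils down to importing the pool without mounting anything, then mounting datasets one at a time to isolate the troublesome one - a sketch using this pool's names:

# zpool import -N zp          # -N: import the pool but don't mount any of its filesystems
# zfs mount zp
# zfs mount zp/nfs
# zfs mount zp/nfs/esx_dev    # ... and so on, leaving the suspect dataset until last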
Re: [zfs-discuss] Hang on zpool import (dedup related)
Absolutely spot on George. The import with -N took seconds. Working on the assumption that esx_prod is the one with the problem, I bumped that to the bottom of the list. Each mount was done in a second:

# zfs mount zp
# zfs mount zp/nfs
# zfs mount zp/nfs/esx_dev
# zfs mount zp/nfs/esx_hedgehog
# zfs mount zp/nfs/esx_meerkat
# zfs mount zp/nfs/esx_meerkat_dedup
# zfs mount zp/nfs/esx_page
# zfs mount zp/nfs/esx_skunk
# zfs mount zp/nfs/esx_temp
# zfs mount zp/nfs/esx_template

And those directories have the content in them that I'd expect. Good!

So now I try to mount esx_prod, and the influx of reads has started in "zpool iostat zp 1". This is the filesystem with the issue, but what can I do now?

Thanks again.
[zfs-discuss] Single VDEV pool permanent and checksum errors after replace
Hi, I have some strange goings-on with my VM of Solaris Express 11, and I hope someone can help. It shares out other virtual machine files for use in ESXi 4.0 (it, too, runs in there).

I had two disks inside the VM - one for rpool and one for 'vmpool'. All was fine. vmpool has some deduped data. That was also fine. I added a Samsung SSD to the ESXi host, created a 512MB VMDK and a 20GB VMDK, and added them as log and cache, respectively. This also worked fine. At this point, the pool is made of c8t1d0 (data), c8t2d0 (logs), c8t3d0 (cache).

I decide that to add some redundancy, I'll add a mirrored virtual disk. At this point, it happens that the VMDK for this disk (c8t4d0) actually resides on the same physical disk as c8t1d0. The idea was to perform the logical split in Solaris Express first, deal with the IO penalty of writing everything twice to the same physical disk (even though Solaris thinks they're two separate ones), then move that VMDK onto a separate physical disk shortly. This should in the short term protect against bit-flips and small errors on the single physical disk that ESXi has, until a second one is installed.

I have a think about capacity, though, and decide I'd prefer the mirror to be of c8t4d0 and c8t5d0 instead. So, it seems I want to go from one single disk (c8t1d0) to a mirror of c8t4d0 and c8t5d0. In my mind, that's a 'zpool replace' onto c8t4d0 and a 'zpool attach' of c8t5d0. I kick off the replace, and all goes fine. Part way through I try to do the attach as well, but am politely told I can't. The replace itself completed without complaint, however on completion, virtual machines whose disks are inside 'vmpool' start hanging, checksum errors rapidly start counting up, and since there's no redundancy, nothing can be done to repair them.

  pool: vmpool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
  scan: resilvered 48.2G in 2h53m with 0 errors on Mon Jan 3 20:45:49 2011
config:

        NAME        STATE     READ WRITE CKSUM
        vmpool      DEGRADED     0     0 25.6K
          c8t4d0    DEGRADED     0     0 25.6K  too many errors
        logs
          c8t2d0    ONLINE       0     0     0
        cache
          c8t3d0    ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /vmpool/nfs/duck/duck_1-flat.vmdk
        /vmpool/nfs/panda/template.xppro-flat.vmdk

At this point, I remove disk c8t1d0, and snapshot the entire VM in case I do any further damage. This leads to my first two questions:

#1 - are there any suspicions as to what's happened here? How come the resilver completed fine but now there are checksum errors on the replacement disk? It does reside on the same physical disk, after all. Could this be something to do with me attempting the attach during the replace?

#2 - in my mind, c8t1d0 contains the state of the pool just prior to the cutover to c8t4d0. Is there any way I can get this back, and scrap the contents of c8t4d0? A 'zpool import -D' is fruitless, but I imagine there's some way of tricking Solaris into seeing c8t1d0 as a single-disk pool again?

Now that I've snapshotted the VM and have a sort of safety net, I run a scrub, which unsurprisingly unearths checksum errors and lists all of the files which have problems:

  pool: vmpool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 0h30m with 95 errors on Mon Jan 3 21:47:25 2011
config:

        NAME        STATE     READ WRITE CKSUM
        vmpool      ONLINE       0     0   190
          c8t4d0    ONLINE       0     0   190
        logs
          c8t2d0    ONLINE       0     0     0
        cache
          c8t3d0    ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /vmpool/nfs/duck/duck-flat.vmdk
        /vmpool/nfs/duck/Windows Server 2003 Standard Edition.nvram
        /vmpool/nfs/duck/duck_1-flat.vmdk
        /vmpool/nfs/eagle/eagle-flat.vmdk
        /vmpool/nfs/eagle/eagle_1-flat.vmdk
        /vmpool/nfs/eagle/eagle_2-flat.vmdk
        /vmpool/nfs/eagle/eagle_3-flat.vmdk
        /vmpool/nfs/eagle/eagle_5-flat.vmdk
        /vmpool/nfs/panda/Windows XP Professional.nvram
        /vmpool/nfs/panda/panda-flat.vmdk
        /vmpool/nfs/panda/template.xppro-flat.vmdk

I 'zpool clear vmpool', power on one of the VMs, and the checksum count quickly reaches 970.

#3 - why would this be the case? I thought the purpose of a scrub was to traverse all blocks, read them, and unearth problems? I'm wondering why these 970 errors hadn't already been found by the scrub.
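On question #2, one place to start is the ZFS labels still sitting on the old disk; they record the vdev configuration it last belonged to (the follow-up below confirms this was tried):

# zdb -l /dev/dsk/c8t1d0s0    # dumps the four labels; a vdev of type 'replacing' with two children means the
                              # disk still describes itself as part of the interrupted replace, not a standalone pool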
Re: [zfs-discuss] Single VDEV pool permanent and checksum errors after replace
Hi Edward,

Thank you for the feedback. All makes sense. To clarify, yes, I snapshotted the VM within ESXi, not the filesystems within the pool. Unfortunately, because of my misunderstanding of how ESXi snapshotting works, I'm now left without the option of investigating whether the replaced disk could be used to create a new pool.

For anyone interested, I removed the c8t1d0 disk from the VM, snapshotted, messed around a little, removed the 'corrupt' disks, added c8t1d0 back in, and performed a 'zdb -l' which did show a disk of type 'replacing', with two children. That looked quite promising, but I wanted to wait until anyone had chipped in with some suggestions about how to recover from the replaced disk, so I decided to look at the corrupt data again. I reverted back to the snapshot in ESXi, bringing back my corrupt disks (as you'd expect), but which unfortunately *deleted* (!?) the VMDK files which related to c8t1d0. Not a ZFS/Solaris issue of any kind, I know, but one to watch out for potentially if anyone else is trying things out in this unsupported configuration.

Shame I can't look into getting data back from the 'good' virtual disk - that's probably something I'd like answered, so I might look into it again once I've put this matter to bed. In the meantime, I'll see what I can do with dd_rescue or dd with 'noerror,sync' to produce some swiss-cheese VMDK files and see whether the content can be repaired. It's not the end of the world if they're gone, but I'd like to satisfy my own curiosity with this little exercise in recovery.

Thanks again for the input,
Chris
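For the salvage step mentioned above, plain dd can produce the "swiss-cheese" copies: conv=noerror carries on past unreadable blocks, and conv=sync pads the short reads with zeros so file offsets stay intact. The destination path is hypothetical, and a smaller block size loses less data around each bad block at the cost of speed:

# dd if=/vmpool/nfs/duck/duck-flat.vmdk of=/recovery/duck-flat.vmdk bs=512 conv=noerror,sync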
Re: [zfs-discuss] Single VDEV pool permanent and checksum errors after replace
On 5 January 2011 13:26, Edward Ned Harvey wrote:
> One comment about etiquette though:

I'll certainly bear your comments in mind in future, however I'm not sure what happened to the subject, as I used the interface at http://opensolaris.org/jive/. I thought that would keep the subject the same. Plus, my gmail account appears to have joined up my reply from the web interface with the original thread too? Anyhow, I do see your point about quoting, and will do so from now on.

For anyone wondering about the extent of checksum problems in my VMDK files, they range from only 128KB worth in some, to 640KB in others. Unfortunately it appears that the bad parts are in critical parts of the filesystem, but it's not a ZFS matter, so I'll see what can be done by way of repair with Windows/NTFS inside each affected VM. So whatever went wrong, it was only a small amount of data.

Thanks again,
Chris
Re: [zfs-discuss] Single VDEV pool permanent and checksum errors after replace
On 6 January 2011 20:02, Chris Murray wrote:
> ...

I'll get the hang of this e-mail lark one of these days, I'm sure :-)
[zfs-discuss] Unable to import zpool since system hang during zfs destroy
Hi all,

I have a RAID-Z zpool made up of 4 x SATA drives running on Nexenta 1.0.1 (OpenSolaris b85 kernel). It has on it some ZFS filesystems and a few volumes that are shared to various Windows boxes over iSCSI.

On one particular iSCSI volume, I discovered that I had mistakenly deleted some files from the FAT32 partition that is on it. The files were still in a ZFS snapshot that was made earlier in the morning, so I made use of the ZFS clone command to create a separate copy of the volume. I accessed it in Windows, got the files I needed, and then proceeded to delete it using "zfs destroy". During the process, disk activity stopped, my SSH windows stopped responding and Windows lost all iSCSI connections, reporting "delayed write failed" for the volumes that disappeared.

I powered down the Nexenta box and started it back up, where it hung with the following output:

SunOS Release 5.11 Version NexentaOS_20080312 64-bit
Loading Nexenta...
Hostname: mammoth

This is before the usual "Reading ZFS config: done" and "Mounting ZFS filesystems" indicators. The only way I could bring the system up was to disconnect all four SATA drives before power-on. I can then export the zpool, reboot, and the system comes up without complaint. However, of course, the pool isn't imported. When I execute "zpool import", the pool is detected fine:

  pool: zp
    id: 2070286287887108251
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

        zp          ONLINE
          raidz1    ONLINE
            c0t1d0  ONLINE
            c0t0d0  ONLINE
            c0t3d0  ONLINE
            c0t2d0  ONLINE

The next issue is that when the pool is actually imported ("zpool import -f zp"), it too hangs the whole system, albeit after a minute or so of disk activity. A "zpool iostat zp 10" during that time is below:

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zp          1.73T  1018G  1.13K      7  4.43M  23.6K
zp          1.73T  1018G  1.05K      0  4.07M      0
zp          1.73T  1018G  1.15K      0  4.88M      0
zp          1.73T  1018G    457      0  1.36M      0
zp          1.73T  1018G    668      0  2.49M      0
zp          1.73T  1018G    411      0  1.80M      0

[system stopped at this point and wouldn't accept keypresses any more]

I'm lost as to what to do - every time the pool is imported, it briefly turns up in "zpool status", but will then hang the system to the extent that I must power off, disconnect drives, power up, zpool export, and reboot, just to be able to start typing commands again!!

So far I've tried:

1. Rebooting with only one of the SATA drives attached at a time. All four times the OS came up fine, but of course "zpool status" reported the pool as having insufficient replicas. I don't know whether powering up with two or three drives will work; I didn't want to try any permutations in case I made things worse.

2. Checking with "fmdump -e"; the only output relating to zfs is regarding missing vdevs and is presumably from when I have been rebooting with drives disconnected.

3. "dd if=/dev/rdsk/c0t0d0 of=/dev/null bs=1048576" and the equivalents for the other three drives are all currently running, and I await the results. Given that a scrub takes about 7 hours, I expect I'll have to leave this overnight.

4. "zdb -e zp" is now at the stage of "Traversing all blocks to verify checksums and verify nothing leaked ...". I expect this will also take some time.

While I wait for the results from "dd" and "zdb", is there anything else I can try in order to get the pool up and running again?
I have spotted some previous, similar posts regarding hanging, notably this one:

http://opensolaris.org/jive/thread.jspa?threadID=70205&tstart=15

Unfortunately, I am a bit of a Nexenta/OpenSolaris/Unix newbie so a lot of that is way over my head, and when the system completely hangs, I have no choice but to power off. Any help is much appreciated!

Thanks,
Chris
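The read test in step 3 can be run against all four disks in parallel from one shell - a sketch, assuming the p0 whole-disk device nodes exist for these drives:

for d in c0t0d0 c0t1d0 c0t2d0 c0t3d0; do
    dd if=/dev/rdsk/${d}p0 of=/dev/null bs=1048576 &    # one sequential read per disk, in the background
done
wait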
Re: [zfs-discuss] Unable to import zpool since system hang during zfs destroy
Ah-ha! That certainly looks like the same issue Miles - well spotted! As it happens, the "zdb" command failed with "out of memory -- generating core dump" whereas all four dd's completed successfully. I'm downloading snv96 right now - I'll install in the morning and post my results both here, and in the thread you mention. If this works, I may stay with OpenSolaris again - I've been unable to use Nexenta as an iSCSI target for ESXi 3.5 because of the b85 kernel, so this upgrade to b96 may kill two birds with one stone.
Re: [zfs-discuss] Unable to import zpool since system hang during zfs destroy
That's a good point - I'll try snv94 if I can get my hands on it - any idea where the download for it is? I've been going round in circles and all I can come up with are the variants of snv96 - CD, DVD (2 images), DVD (single image). Maybe that's a sign I should give up for the night!

Chris
Re: [zfs-discuss] Unable to import zpool since system hang during zfs destroy
Ok, used the development 2008.11 (b95) livecd earlier this morning to import the pool, and it worked fine. I then rebooted back into Nexenta and all is well. Many thanks for the help guys!

Chris
Re: [zfs-discuss] can anyone help me?
Hi all, I can confirm that this is fixed too. I ran into the exact same issue yesterday after destroying a clone: http://www.opensolaris.org/jive/thread.jspa?threadID=70459&tstart=0 I used the b95-based 2008.11 development livecd this morning and the pool is now back up and running again after a quick import and export.

Chris
[zfs-discuss] ZFS file permissions - some files missing over SMB?
Hello,

Hopefully a quick and easy permissions problem here, but I'm stumped and have quickly reached the end of my Unix knowledge.

I have a ZFS filesystem called "fs/itunes" on pool "zp". In it, the "iTunes music" folder contained a load of other folders - one for each artist. During a resilver operation which was going to take a week, I decided to delete this data (amongst other things) and restore it from backup once the resilver was complete. It finished on Sunday night, so I started copying content from my Windows machine + an NTFS external disk, to "/zp/fs/itunes/iTunes music", using WinSCP.

Now, if I browse to that folder over SMB from the Windows machine, I have a subset of all of the artist names, and I can't identify exactly why some are there and others aren't. I'm accessing it using the "sharesmb=on" option, and user "chris". So:

* "\\mammoth\itunes\iTunes music" contains *some* folders.
* If I connect with WinSCP as user "chris" and browse to the same folder, everything is there.
* Inspecting the properties on a folder that is visible, I see that it has group "staff", owner "chris", and permissions "0777".
* One that is visible in WinSCP but NOT through Windows has the same ... ?!

I'm not sure what I've done here, but clearly there's something I don't understand about permissions. If I try to create one of the missing folders through Windows, I'm told "Cannot rename New Folder: A file with the name you specified already exists. Specify a different file name.", so they appear to be hidden from view in some way.

Thanks in advance,
Chris
Re: [zfs-discuss] ZFS file permissions - some files missing over SMB?
The plot thickens ... I had a brainwave and tried accessing a 'missing' folder with the following on Windows:

explorer "\\mammoth\itunes\iTunes music\Dubfire"

I can open files within it and can rename them too. So .. still looks like a permissions problem to me, but in what way, I'm not quite sure. Forgot to mention, I'm using SXCE b105.

Thanks again.
Re: [zfs-discuss] ZFS file permissions - some files missing over SMB?
Thanks Mark. I ran the script and found references in the output to 'aclmode' and 'aclinherit'. I had in the back of my mind that I've had to mess around with ZFS ACLs in the past, aside from using chmod with the usual numeric values. That's given me something to go on. I'll post to cifs-discuss if I don't get anywhere.

Thanks for pointing me in the right direction.
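For anyone following along, the relevant properties and the ACLs themselves can be inspected directly; ls -V (or the more verbose ls -v) shows the NFSv4-style ACL entries that the CIFS server actually evaluates:

# zfs get aclmode,aclinherit zp/fs/itunes
# ls -dV '/zp/fs/itunes/iTunes music'
# ls -dV '/zp/fs/itunes/iTunes music/Dubfire'    # compare a hidden folder against one that shows up over SMB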
[zfs-discuss] *Almost* empty ZFS filesystem - 14GB?
Accidentally posted the below earlier against ZFS Code, rather than ZFS Discuss.

My ESXi box now uses ZFS filesystems which have been shared over NFS. Spotted something odd this afternoon - a filesystem which I thought didn't have any files in it weighs in at 14GB. Before I start deleting the empty folders to see what happens, any ideas what's happened here?

# zfs list | grep temp
zp/nfs/esx_temp  14.0G   225G  14.0G  /zp/nfs/esx_temp
# ls -la /zp/nfs/esx_temp
total 20
drwxr-xr-x   5 root     root           5 Aug 13 12:54 .
drwxr-xr-x   7 root     root           7 Aug 13 12:40 ..
drwxr-xr-x   2 root     root           2 Aug 13 12:53 iguana
drwxr-xr-x   2 root     root           2 Aug 13 12:54 meerkat
drwxr-xr-x   2 root     root           2 Aug 16 19:39 panda
# ls -la /zp/nfs/esx_temp/iguana/
total 8
drwxr-xr-x   2 root     root           2 Aug 13 12:53 .
drwxr-xr-x   5 root     root           5 Aug 13 12:54 ..
# ls -la /zp/nfs/esx_temp/meerkat/
total 8
drwxr-xr-x   2 root     root           2 Aug 13 12:54 .
drwxr-xr-x   5 root     root           5 Aug 13 12:54 ..
# ls -la /zp/nfs/esx_temp/panda/
total 8
drwxr-xr-x   2 root     root           2 Aug 16 19:39 .
drwxr-xr-x   5 root     root           5 Aug 13 12:54 ..
#

Could there be something super-hidden, which I can't see here? There don't appear to be any snapshots relating to zp/nfs/esx_temp. On a suggestion, I have run the following:

# zfs list -r zp/nfs/esx_temp
NAME             USED  AVAIL  REFER  MOUNTPOINT
zp/nfs/esx_temp  14.0G  225G  14.0G  /zp/nfs/esx_temp
# du -sh /zp/nfs/esx_temp
   8K   /zp/nfs/esx_temp
#

Thanks,
Chris
Re: [zfs-discuss] *Almost* empty ZFS filesystem - 14GB?
Thanks Tim. Results are below:

# zfs list -t snapshot -r zp/nfs/esx_temp
no datasets available
# zfs get refquota,refreservation,quota,reservation zp/nfs/esx_temp
NAME             PROPERTY        VALUE  SOURCE
zp/nfs/esx_temp  refquota        none   default
zp/nfs/esx_temp  refreservation  none   default
zp/nfs/esx_temp  quota           none   default
zp/nfs/esx_temp  reservation     none   default
#

I'm 99% sure I've done something 'obvious', which is escaping me at the minute. The filesystem only has 3 folders in it, which are all empty, so I could very quickly destroy and recreate, but it has piqued my curiosity enough to wonder what's happened here.
Re: [zfs-discuss] *Almost* empty ZFS filesystem - 14GB?
I don't have quotas set, so I think I'll have to put this down to some sort of bug. I'm on SXCE 105 at the minute, ZFS version is 3, but zpool is version 13 (could be 14 if I upgrade). I don't have everything backed-up so won't do a "zpool upgrade" just at the minute. I think when SXCE 120 is released, I'll install that, upgrade my pool and see if the filesystem still registers as 14GB. If it does, I'll destroy and recreate - no biggie! :-)
Re: [zfs-discuss] *Almost* empty ZFS filesystem - 14GB?
Nico, what is a zero-link file, and how would I go about finding whether I have one? You'll have to bear with me, I'm afraid, as I'm still building my Solaris knowledge at the minute - I was brought up on Windows. I use Solaris for my storage needs now though, and I'm slowly improving my knowledge so I can move away from Windows one day :)

If it makes any difference, the problem persists after a full reboot, and I've deleted the three folders, so now there is literally nothing in that filesystem .. yet it reports 14GB. It's not too much of an inconvenience, but it does make me wonder whether the 'used' figures on my other filesystems and zvols are correct.
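One way to look for such an orphan (a file that has been unlinked but is still being held, or is stuck on the dataset's delete queue) is to have zdb list the dataset's objects; the missing 14GB should show up as a large "ZFS plain file" with no path, which is what the later post in this thread shows:

# zdb -dddd zp/nfs/esx_temp      # dumps every object in the dataset, with sizes and (where known) paths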
Re: [zfs-discuss] *Almost* empty ZFS filesystem - 14GB?
That looks like it indeed. Output of zdb:

    Object  lvl   iblk   dblk  lsize  asize  type
         9    5    16K     8K   150G  14.0G  ZFS plain file
                                        264  bonus  ZFS znode
        path    ???

Thanks for the help in clearing this up - satisfies my curiosity. Nico, I'll add those commands to the little list in my mind and they'll push the Windows ones out in no time :) I'll go to b120 when it is available.
[zfs-discuss] Persistent errors - do I believe?
I can flesh this out with detail if needed, but a brief chain of events is:

1. RAIDZ1 zpool with drives A, B, C & D (I don't have access to see original drive names)
2. New disk E. Replaced A with E.
3. Part way through resilver, drive D was 'removed'
4. 700+ persistent errors detected, and lots of checksum errors on all drives. Surprised by this - I thought the absence of one drive could be tolerated?
5. Exported, rebooted, imported. Drive D present now. Good. :-)
6. Drive D disappeared again. Bad. :-(
7. This time, only one persistent error. Does this mean that there aren't errors in the other 700+ files that it reported the first time, or have I lost my chance to note these down, and they are indeed still corrupt?

I've re-run step 5 again, so it is now on the third attempted resilver. Hopefully drive D won't remove itself again, and I'll actually have 30+ hours of stability while the new drive resilvers ...

Chris
Re: [zfs-discuss] Persistent errors - do I believe?
Thanks David. Maybe I misunderstand how a replace works? When I added disk E, and used 'zpool replace [A] [E]' (still can't remember those drive names), I thought that disk A would still be part of the pool, and read from in order to build the contents of disk E? Sort of like a safer way of doing the old 'swap one drive at a time' trick with RAID-5 arrays?

Chris
Re: [zfs-discuss] Persistent errors - do I believe?
Ok, the resilver has been restarted a number of times over the past few days due to two main issues - a drive disconnecting itself, and power failure. I think my troubles are 100% down to these environmental factors, but I would like some confidence that after the resilver has completed, if it reports there aren't any persistent errors, that there actually aren't any.

Attempt #1: the resilver started after I initiated the replace on my SXCE105 install. All was well until the box lost power. On starting back up, it hung while starting OpenSolaris - just after the line containing the system hostname. I've had this before when a scrub is in progress. My usual tactic is to boot with the 2009.06 live CD, import the pool, stop the scrub, export, reboot into SXCE105 again, and import. Of course, you can't stop a replace that's in progress, so the remaining attempts are in the 2009.06 live CD (build 111b perhaps?)

Attempt #2: the resilver started on importing the pool in 2009.06. It was resilvering fine until one drive reported itself as offline. dmesg showed that the drive was 'gone'. I then noticed a lot of checksum errors at the pool level, and RAIDZ1 level, and a large number of 'permanent' errors. In a panic, thinking that the resilver was now doing more harm than good, I exported the pool and rebooted.

Attempt #3: I imported in 2009.06 again. This time, the drive that was disconnected last attempt was online again, and proceeded to resilver along with the original drive. There was only one permanent error - in a particular snapshot of a ZVOL I'm not too concerned about. This is the point that I wrote the original post, wondering if all of those 700+ errors reported the first time around weren't a problem any more. I have been running zpool clear in a loop because there were checksum errors on another of the drives (neither of the two part of the replacing vdev, and not the one that was removed previously). I didn't want it to be marked as faulty, so I kept the zpool clear running. Then .. power failure.

Attempt #4: I imported in 2009.06. This time, no errors detected at all. Is that a result of my zpool clear? Would that clear any 'permanent' errors? From the wording, I'd say it wouldn't, and therefore the action of starting the resilver again with all of the correct disks in place hasn't found any errors so far ... ? Then, disk removal again ... :-(

Attempt #5: I'm convinced that drive removal is down to faulty cabling. I move the machine, completely disconnect all drives, re-wire all connections with new cables, and start the scrub again in 2009.06. Now, there are checksum errors again, so I'm running zpool clear in order to keep drives from being marked as faulted .. but I also have this:

errors: Permanent errors have been detected in the following files:

        zp/iscsi/meerkat_t...@20090905_1631:<0x1>

I have a few of my usual VMs powered up (ESXi connecting using NFS), and they appear to be fine. I've run a chkdsk in the Windows VMs, and no errors are reported. Although I can't be 100% confident that any of those files were in the original list of 700+ errors. In the absence of iscsitgtd, I'm not powering up the ones that rely on iSCSI just yet.

My next steps will be:

1. allow the resilver to finish. Assuming I don't have yet another power cut, this will be in about 24 hours.
2. zpool export
3. reboot into SXCE
4. zpool import
5. start all my usual virtual machines on the ESXi host
6. note whether that permanent error is still there <-- this will be an interesting one for me - will the export & import clear the error? Will my looped zpool clear have simply reset the checksum counters to zero, or will it have cleared this too?
7. zpool scrub to see what else turns up.

Chris
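The looped clear mentioned in attempts #3 and #5 is just a one-liner along these lines; worth noting that it resets the per-device error counters, so it can hide a genuinely failing drive as easily as a flaky cable:

# while true; do zpool clear zp; sleep 60; done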
Re: [zfs-discuss] Persistent errors - do I believe?
I've had an interesting time with this over the past few days ...

After the resilver completed, I had the message "no known data errors" in a zpool status. I guess the title of my post should have been "how permanent are permanent errors?". Now, I don't know whether the action of completing the resilver was the thing that fixed the one remaining error (in the snapshot of the 'meerkat' zvol), or whether my looped zpool clear commands have done it.

Anyhow, for space/noise reasons, I set the machine back up with the original cables (eSATA), in its original tucked-away position, installed SXCE 119 to get me remotely up to date, and imported the pool. So far so good. I then powered up a load of my virtual machines. None of them report errors when running a chkdsk, and SQL Server 'DBCC CHECKDB' hasn't reported any problems yet. Things are looking promising on the corruption front - feels like the errors that were reported while the resilvers were in progress have finally been fixed by the final (successful) resilver! Microsoft Exchange 2003 did complain of corruption of mailbox stores, however I have seen this a few times as a result of unclean shutdowns, and don't think it's related to the errors that ZFS was reporting on the pool during resilver.

Then, 'disk is gone' again - I think I can definitely put my original troubles down to cabling, which I'll sort out for good in the next few days. Now, I'm back on the same SATA cables which saw me through the resilvering operation. One of the drives is showing read errors when I run dmesg. I'm having one problem after another with this pool!! I think the disk I/O during the resilver has tipped this disk over the edge. I'll replace it ASAP, and then I'll test the drive in a separate rig and RMA it.

Anyhow, there is one last thing that I'm struggling with - getting the pool to expand to use the size of the new disk. Before my original replace, I had 3 x 1TB and 1 x 750GB disk. I replaced the 750 with another 1TB, which by my reckoning should give me around 4TB as a total size even after checksums and metadata. No:

# zpool list
NAME    SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
rpool    74G  8.81G  65.2G   11%  ONLINE  -
zp     2.73T  2.36T   379G   86%  ONLINE  -

2.73T? I'm convinced I've expanded a pool in this way before. What am I missing?

Chris
Re: [zfs-discuss] Persistent errors - do I believe?
Cheers, I did try that, but still got the same total on import - 2.73TB.

I even thought I might have just made a mistake with the numbers, so I made a sort of 'quarter scale model' in VMware and OSOL 2009.06, with 3 x 250G and 1 x 187G. That gave me a size of 744GB, which is *approx* 1/4 of what I get in the physical machine. That makes sense. I then replaced the 187 with another 250 - still 744GB total, as expected. Exported & imported - now 996GB. So, the export and import process seems to be the thing to do, but why it's not working on my physical machine (SXCE119) is a mystery. I even contemplated that there might have still been a 750GB drive left in the setup, but they're all 1TB (well, 931.51GB). Any ideas what else it could be?

For anyone interested in the checksum/permanent error thing, I'm running a scrub now. 59% done and not one error.
[zfs-discuss] Error: "Volume size exceeds limit for this system"
Hi all,

I am experiencing an issue when trying to set up a large ZFS volume in OpenSolaris build 74, and the same problem in Nexenta alpha 7. I have looked on Google for the error and have found zero (yes, ZERO) results, so I'm quite surprised! Please can someone help?

I am setting up a test environment in VMware Workstation 6.0 before I unleash this on real hardware. The virtual system has an IDE HDD, IDE CD-ROM and 4 x 750 GB SCSI disks (c2t0d0 --> c2t3d0). Once working, I hope to use the iSCSI target to see the volume from a Windows box.

I create the zpool with:

zpool create zp raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0

This completes fine, and then a "zpool iostat" reports the pool as having 2.93T free. All makes sense so far. Then, I try to create a ZFS volume in there of size 1.25T with the following command:

zfs create -V 1.25T zp/test

This then fails with:

cannot create 'zp/test': volume size exceeds limit for this system

Please can someone help me with this? I've tried making sense of the word "system" in this context but haven't come up with much. A volume of 1T works fine. Could it be something to do with having a 32-bit CPU?

Many thanks,
Chris
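The 1T ceiling does point at a 32-bit kernel; a quick way to confirm which kernel is actually running:

# isainfo -kv      # reports e.g. "32-bit i386 kernel modules" or "64-bit amd64 kernel modules"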
Re: [zfs-discuss] Error: "Volume size exceeds limit for this system"
Thanks for the help guys - unfortunately the only hardware at my disposal just at the minute is all 32-bit, so I'll just have to wait a while and fork out on some 64-bit kit before I get the drives. I'm a home user, so I'm glad I didn't buy the drives and discover I couldn't use them without spending even more!!

Chris
[zfs-discuss] Assistance needed expanding RAIDZ with larger drives
Hi all,

Please can you help with my ZFS troubles: I currently have 3 x 400 GB Seagate NL35's and a 500 GB Samsung Spinpoint in a RAIDZ array that I wish to expand by systematically replacing each drive with a 750 GB Western Digital Caviar. After failing miserably, I'd like to start from scratch again if possible.

When I last tried, the replace command hung for an age, network connectivity was lost (I have to do this by SSH, so that dropped too), and I got strange outputs from zpool status where a particular device (but not the one that was being replaced) was listed twice. I also managed to mess up 3 of the 4 WD drives in such a way that they were detected by the BIOS, but no system after that would work:

* In Solaris, a format wouldn't show the drive (c2d0 should have been there, but it wasn't)
* Using an old Ghost boot CD, the old 3 drives would list, but the new one wouldn't.
* GParted couldn't see the new drive (the old 3 were there!)

Again, in all of these situations, the drive WAS detected in the BIOS and everything looked perfectly fine in there. I have now managed to salvage one using Windows and a USB caddy in another computer. This is formatted as one big NTFS volume. So the drive does work.

Here is the output from zpool status:

| # zpool status -v
|   pool: zp
|  state: ONLINE
|  scrub: scrub in progress, 30.76% done, 4h33m to go
| config:
|
|         NAME        STATE     READ WRITE CKSUM
|         zp          ONLINE       0     0     0
|           raidz1    ONLINE       0     0     0
|             c2d1    ONLINE       0     0     0
|             c3d1    ONLINE       0     0     0
|             c2d0    ONLINE       0     0     0
|             c3d0    ONLINE       0     0     0
|
| errors: No known data errors

And here is the output from zdb:

| # zdb
| zp
|     version=9
|     name='zp'
|     state=0
|     txg=110026
|     pool_guid=5629347939003043989
|     hostid=823611165
|     hostname='mammoth'
|     vdev_tree
|         type='root'
|         id=0
|         guid=5629347939003043989
|         children[0]
|             type='raidz'
|             id=0
|             guid=1325151684809734884
|             nparity=1
|             metaslab_array=14
|             metaslab_shift=33
|             ashift=9
|             asize=1600289505280
|             is_log=0
|             children[0]
|                 type='disk'
|                 id=0
|                 guid=5385778296365299126
|                 path='/dev/dsk/c2d1s0'
|                 devid='id1,[EMAIL PROTECTED]/a'
|                 phys_path='/[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
|                 whole_disk=1
|                 DTL=33
|             children[1]
|                 type='disk'
|                 id=1
|                 guid=15098521488705848306
|                 path='/dev/dsk/c3d1s0'
|                 devid='id1,[EMAIL PROTECTED]/a'
|                 phys_path='/[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
|                 whole_disk=1
|                 DTL=32
|             children[2]
|                 type='disk'
|                 id=2
|                 guid=4518340092563481291
|                 path='/dev/dsk/c2d0s0'
|                 devid='id1,[EMAIL PROTECTED]/a'
|                 phys_path='/[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
|                 whole_disk=1
|                 DTL=31
|             children[3]
|                 type='disk'
|                 id=3
|                 guid=7852006658048665355
|                 path='/dev/dsk/c3d0s0'
|                 devid='id1,[EMAIL PROTECTED]/a'
|                 phys_path='/[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
|                 whole_disk=1
|                 DTL=30

I have found many, many posts regarding issues when replacing drives, and have used bits of them to get myself into a working state .. but I'm now confused with all the imports, exports, detaches etc...

Please can someone help me out step-by-step with this one, and start with: the array is functioning perfectly but I want to replace c2d0 with a new WD drive. Once the scrub completes, what do I do?

Many thanks,
Chris Murray
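For a straightforward one-disk-at-a-time swap, the usual sequence looks like the sketch below, assuming each new drive goes into the same controller position as the one it replaces (so the device name stays c2d0, and so on):

# zpool status zp            # let the running scrub finish first
# zpool replace zp c2d0      # after shutting down, physically swapping the drive, and booting back up
# zpool status -v zp         # watch the resilver and let it complete before touching the next drive

Repeat for each of the four drives; the extra capacity only becomes visible once the last, smallest drive has been replaced, and on these older builds an export/import may also be needed before the new space shows up.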
Re: [zfs-discuss] Assistance needed expanding RAIDZ with larger drives
> About that issue, please check my post in:
> http://www.opensolaris.org/jive/thread.jspa?threadID=48483&tstart=0

Thanks - when I originally tried to replace the first drive, my intention was to:

1. Move solaris box and drives
2. Power up to test it still works
3. Power down
4. Replace drive.

I suspect I may have missed out 2 & 3, and ran into the same situation that you did.

Anyhow, I seem to now be in an even bigger mess than earlier - when I tried to simply swap out one of the old drives with a new one and perform a replace, I ran into problems:

1. The hard drive light on the PC lit up, and I heard lots of disk noise, as you would expect
2. The light went off. My continuous ping did the following:

Reply from 192.168.0.10: bytes=32 time<1ms TTL=255
Reply from 192.168.0.10: bytes=32 time<1ms TTL=255
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.0.10: bytes=32 time=2092ms TTL=255
Reply from 192.168.0.10: bytes=32 time<1ms TTL=255
Reply from 192.168.0.10: bytes=32 time<1ms TTL=255

3. The light came back on again .. more disk noise. Good - perhaps the pause was just a momentary blip
4. Light goes off (this is about 20 minutes since the start)
5. zpool status reports that a resilver completed, and there are errors in zp/storage and zp/VMware, and suggests that I should restore from backup
6. I nearly cry, as these are the only 2 files I use.
7. I have heard of ZFS thinking that there are unrecoverable errors before, so I run zpool scrub and then zpool clear a number of times. Seems to make no difference.

This whole project started when I wanted to move 900 GB of data from a Server 2003 box containing the 4 old disks, to a Solaris box. I borrowed 2 x 500 GB drives from a friend, copied all the data onto them, put the 4 old drives into the Solaris box, created the zpool, created my storage and VMware volumes, shared them out using iSCSI, created NTFS volumes on the Server 2003 box and copied the data back onto them. Aside from a couple of networking issues, this worked absolutely perfectly. Then I decided I'd like some more space, and that's where it all went wrong.

Despite the reports of corruption, the storage and VMware "drives" do still work in Windows. The iSCSI initiator still picks them up, and if I dir /a /s, I can see all of the files that were on these NTFS volumes before I tried this morning's replace. However, should I trust this? I suspect that even if I ran a chkdsk /f, a successful result may not be all that it seems.

I still have the 2 x 500 GB drives with my data from weeks ago. I'd be sad to lose a few weeks' worth of work, but that would be better than assuming that ZFS is incorrect in saying the volumes are corrupt and then discovering in months' time that I cannot get at NTFS files because of this root cause.

Since the report of corruption in these 2 volumes, I had a genius troubleshooting idea - "what if the problem is not with ZFS, but instead with Solaris not liking the drives in general?". I exported my current zpool, disconnected all drives, plugged in the 4 new ones, and waited for the system to boot again... nothing. The system had stopped in the BIOS, requesting that I press F1 as SMART reports that one of the drives is bad! Already?!? I only bought the drives a few days ago!!!

Now the problem is that I know which of these drives is bad, but I don't know whether this was the one that was plugged in when zpool status reported all the read/write/checksum errors.
So maybe I have a duff batch of drives .. I leave the remaining 3 plugged in and create a brand new zpool called test. No problems at all. I create a 1300 GB volume on it. Also no problem. I'm currently overwriting it with random data:

dd if=/dev/urandom of=/dev/zvol/rdsk/test/test bs=1048576 count=1331200

I throw in the odd zpool scrub to see how things are doing so far, and as yet there hasn't been a single error of any sort. So, 3 of the WD drives (0430739, 0388708, 0417089) appear to be fine and one is dead already (0373211). So this leads me to the conclusion that (ignoring the bad one) these drives work fine with Solaris. They work fine with ZFS too. It's just the act of trying to replace a drive from my old zpool with a new one that causes issues.

My next step will be to run the WD diagnostics on all drives, send the broken one back, and then have 4 fully functioning 750 GB drives. I'll also import the old zpool into the Solaris box - it'll undoubtedly complain that one of the drives is missing (the one that I tried to add earlier and got all the errors), so I think I'll try one more replace to get all 4 old drives back in the pool.

So, what do I do after that?

1. Create a brand new pool out of the WD drives, share it using iSCSI and copy onto that my data from my friend's drives? I'll have lost a good few weeks of work, but I'll be confident that it isn't corrupt.
2. Ignore the corruption reports, keep what is currently in the old pool, and hope that the NTFS volumes really are intact?
Re: [zfs-discuss] Q : change disks to get bigger pool
This process should work, but make sure you don't swap any cables around while you replace a drive, or you'll run into the situation described in the following thread, as I did: http://www.opensolaris.org/jive/thread.jspa?threadID=48483&tstart=0

Chris