Hi,

For a few months now I've been using a ZFS pool inside a NexentaOS VMware guest, on a single real disk partition, to store some backups.
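(For reference, the pool sits on that single slice with no redundancy. I don't have the exact command in my notes any more, but it would have been something along the lines of:

    # single-slice pool, no mirror/raidz -- reconstructed from memory,
    # not copied from my shell history
    zpool create pool c2t0d0s1

i.e. just the one vdev, c2t0d0s1.)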
Today I noticed that some directories were missing inside 2 separate filesystems, which I found strange. I went to the backup logs (also stored inside the pool), and it looks like at least one of the directories (/pool/backup/var) went missing yesterday *while* the backup was ongoing (it was working inside /pool/backup/var/log at the time). The backup process is a few simple rsyncs from the VMware host (running Linux) to the VMware guest (running Nexenta), followed by snapshot creation; there's a rough sketch of the job further down. The filesystems were *not* NFS-mounted - I had the rsync server process running on the ZFS box.

I tried to look into the snapshots, but 'zfs set snapdir=visible pool/backup' didn't make the .zfs directory appear. I did a 'zpool status' and it didn't report any errors or checksum failures whatsoever. I assumed there was probably memory corruption in the running kernel instance, so I rebooted. Now I can't even mount the pool!

[EMAIL PROTECTED]:~# zpool status -x
  pool: pool
 state: FAULTED
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        pool        UNAVAIL      0     0     0  insufficient replicas
          c2t0d0s1  UNAVAIL      0     0     0  cannot open

[EMAIL PROTECTED]:~# zdb -l /dev/dsk/c2t0d0s1
--------------------------------------------
LABEL 0
--------------------------------------------
    version=3
    name='pool'
    state=0
    txg=143567
    pool_guid=3667491715056107646
    top_guid=8396736522625936678
    guid=8396736522625936678
    vdev_tree
        type='disk'
        id=0
        guid=8396736522625936678
        path='/dev/dsk/c2t0d0s1'
        devid='id1,[EMAIL PROTECTED]/b'
        whole_disk=0
        metaslab_array=13
        metaslab_shift=30
        ashift=9
        asize=117896380416
        DTL=22
--------------------------------------------
LABEL 1
--------------------------------------------
    version=3
    name='pool'
    state=0
    txg=143567
    pool_guid=3667491715056107646
    top_guid=8396736522625936678
    guid=8396736522625936678
    vdev_tree
        type='disk'
        id=0
        guid=8396736522625936678
        path='/dev/dsk/c2t0d0s1'
        devid='id1,[EMAIL PROTECTED]/b'
        whole_disk=0
        metaslab_array=13
        metaslab_shift=30
        ashift=9
        asize=117896380416
        DTL=22
--------------------------------------------
LABEL 2
--------------------------------------------
    version=3
    name='pool'
    state=0
    txg=143567
    pool_guid=3667491715056107646
    top_guid=8396736522625936678
    guid=8396736522625936678
    vdev_tree
        type='disk'
        id=0
        guid=8396736522625936678
        path='/dev/dsk/c2t0d0s1'
        devid='id1,[EMAIL PROTECTED]/b'
        whole_disk=0
        metaslab_array=13
        metaslab_shift=30
        ashift=9
        asize=117896380416
        DTL=22
--------------------------------------------
LABEL 3
--------------------------------------------
    version=3
    name='pool'
    state=0
    txg=143567
    pool_guid=3667491715056107646
    top_guid=8396736522625936678
    guid=8396736522625936678
    vdev_tree
        type='disk'
        id=0
        guid=8396736522625936678
        path='/dev/dsk/c2t0d0s1'
        devid='id1,[EMAIL PROTECTED]/b'
        whole_disk=0
        metaslab_array=13
        metaslab_shift=30
        ashift=9
        asize=117896380416
        DTL=22
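(For completeness, the backup job I mentioned above is essentially the following; the host name, rsync module name, paths and snapshot name here are from memory and only meant as an illustration, not a copy of the real script:

    # on the Linux host: push each tree to the rsync daemon on the Nexenta guest
    rsync -a /var/ nexenta::backup/var/

    # afterwards, on the guest: snapshot the backup filesystem
    zfs snapshot pool/backup@2006-12-14

Nothing fancier than that.)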
I don't know much about Solaris partitions, but here's how I set up the disk (I needed to keep swap on it as well):

partition> print
Current partition table (original):
Total disk cylinders available: 14466 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0       swap    wm       1 -   131        1.00GB    (131/0/0)     2104515
  1   reserved    wm     132 - 14465      109.80GB    (14334/0/0) 230275710
  2     backup    wu       0 - 14465      110.82GB    (14466/0/0) 232396290
  3 unassigned    wm       0                0         (0/0/0)             0
  4 unassigned    wm       0                0         (0/0/0)             0
  5 unassigned    wm       0                0         (0/0/0)             0
  6 unassigned    wm       0                0         (0/0/0)             0
  7 unassigned    wm       0                0         (0/0/0)             0
  8       boot    wu       0 -     0        7.84MB    (1/0/0)         16065
  9 unassigned    wm       0                0         (0/0/0)             0

However, I found this rather strange:

[EMAIL PROTECTED]:~# stat -L /dev/dsk/c2t0d0s1
  File: `/dev/dsk/c2t0d0s1'
  Size: 9223372036854775807     Blocks: 0          IO Block: 8192   block special file
Device: 4380000h/70778880d      Inode: 26214405    Links: 1     Device type: 32,1
Access: (0640/brw-r-----)  Uid: (    0/    root)   Gid: (    3/     sys)
Access: 2006-12-15 05:11:56.000000000 +0000
Modify: 2006-12-15 05:11:56.000000000 +0000
Change: 2006-12-15 05:11:56.000000000 +0000

(The -L option to GNU stat means "follow symlinks".) Notice the size! For comparison, stat -L /dev/dsk/c2t0d0s0 reports a size of 1077511680. Interestingly, I never had any problems until now - I even did weekly scrubs, and they *never* reported any errors or checksum failures.

I've tried stopping Nexenta, deleting the disk in VMware, re-adding it, booting, and running 'devfsadm -C'. It didn't solve anything.

Is there any way to recover the data? I don't really know how to begin diagnosing/solving the problem, and I don't understand why stat is reporting that strange size (for what it's worth, it is 2^63 - 1, the largest signed 64-bit integer).

Thanks.
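P.S. In case it helps with suggestions: the next thing I was planning to try (I have *not* run this yet, so it's only a sketch of a plan, not something tested) is to see whether the pool shows up as importable at all, and force the import if it does:

    zpool export -f pool     # drop the faulted pool from the configuration
    zpool import             # does anything show up on c2t0d0s1?
    zpool import -f pool     # force the import if the pool is listed

but I'd rather ask here first before making things worse.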