Re: [zfs-discuss] virtualbox rawdisk discrepancy

2011-11-22 Thread Jim Klimov

2011-11-22 10:24, Frank Cusack wrote:

On Mon, Nov 21, 2011 at 10:06 PM, Frank Cusack <fr...@linetwo.net> wrote:

grub does need to have an idea of the device path, maybe in vbox
it's seen as the 3rd disk (c0t2), so the boot device name written to
grub.conf is "disk3" (whatever the terminology for that is in
grub-speak), but when I boot on the Sun hardware it is seen as
"disk0" and this just doesn't work.  If it's that easy that'd be
awesome, all I need is an alternate grub entry.




Regarding the disk numbering, GRUB and other x86 loaders
assume that the current "BIOS boot disk" (the one passed
from the BIOS as the boot device) is always number zero.
Numbering of secondary drives may vary from boot to boot
(e.g. if you boot from one disk or another of a mirrored root).




Or maybe not.  I guess this was findroot() in sol10 but in sol11 this
seems to have gone away.


I haven't used sol11 yet, so I can't say for certain.
But it is possible that the default boot (without findroot)
would use the bootfs property of your root pool. At least
that's the way it worked in intermediate SXCE releases,
where you had to set the whole bootfs path (the zfs dataset
name) in the grub menu, or use the bootfs pool property.
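
A minimal sketch of both variants (the BE name is just a
placeholder here):

# zpool set bootfs=rpool/ROOT/myBE rpool

or, spelled out per stanza in menu.lst:

bootfs rpool/ROOT/myBE
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS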





Also, I was wrong about the disk target.  When I do the install I
configure the USB stick as disk0, seen by Solaris as c3t0, and no
findroot() line gets written to menu.lst.  Maybe it needs that line
when it boots as a USB stick on real hardware?

I'll try import/export and a reconfigure boot when I get a chance.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] virtualbox rawdisk discrepancy

2011-11-22 Thread Fajar A. Nugraha
On Tue, Nov 22, 2011 at 7:32 PM, Jim Klimov  wrote:
>> Or maybe not.  I guess this was findroot() in sol10 but in sol11 this
>> seems to have gone away.
>
> I haven't used sol11 yet, so I can't say for certain.
> But it is possible that the default boot (without findroot)
> would use the bootfs property of your root pool.

Nope.

S11's grub specifies bootfs for every stanza in menu.lst. The
bootfs pool property is no longer used.

Anyway, after some testing, I found out you CAN use a vbox-installed
S11 USB stick on a real notebook (enough hardware difference there).
The trick is that you have to import and export the pool on the
system you're going to boot the stick on. That means you need to
have an S11 live CD/USB handy and boot with that first, before
booting from the stick itself.
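
Roughly like this from the live environment (a sketch; use your
actual pool name, and -f only if the pool complains about being
in use by another system):

# zpool import -f rpool
# zpool export rpool

then reboot and boot from the stick. Presumably the import records
the device paths as seen on the new machine in the pool labels,
and the clean export writes that out, so the next boot can find
the root pool.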

-- 
Fajar
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] early save the date: What's New and What's Coming in ZFS for illumos, Jan 10

2011-11-22 Thread Deirdre Straughan
Yes, I do plan to stream (assuming no technical difficulties) and record
the session - just ordered new AV equipment in fact (you can thank Bryan
for that). I'd like to encourage and assist worldwide participation in all
meetings, though I recognize that an evening time slot here in California
isn't ideal for most.

-- 


best regards,
Deirdré Straughan
SmartOS Community Manager
Joyent

cell 720 371 4107
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] SUMMARY: mounting datasets from a read-only pool with aid of tmpfs

2011-11-22 Thread Jim Klimov

Hello all,

  I'd like to report a tricky situation and a workaround
I've found useful - hope this helps someone in similar
situations.

  To cut a long story short, I could not properly mount
some datasets from a readonly pool which had a non-"legacy"
mountpoint attribute set, because the mountpoint was not
available (the directory was absent or not empty). In this
case I could neither create/clean up the mountpoints nor
change the dataset property to mountpoint=legacy.

  After a while I managed to override a higher-level
point in the FS tree by mounting tmpfs over it, and in
that tmpfs I could make the mountpoints needed by zfs.

  I don't want to call this adventure a bug (in this case
zfs actually works as documented), but rather an inconvenience
that might deserve some improvement, i.e. allowing (forced?)
use of "mount -F zfs" even for datasets with the mountpoint
property defined.
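
In other words, what I would have liked to be able to run was
something like the following (shown only to illustrate the
request; this is refused today for datasets whose mountpoint
is not "legacy", and the dataset name is a placeholder):

# mount -F zfs -O rpool/ROOT/oldroot /mnt/1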

  Here is the detailed version:

  I was evacuating data from a corrupted rpool which I could
only import read-only while booted from a LiveUSB. As I wrote
previously, I could not use "zfs send" (bug now submitted to
the Illumos tracker), so I resorted to directly mounting datasets
and copying data off them into another location (a similar
dataset hierarchy on my data pool), i.e.:

# zpool import -R /POOL pool
# zpool import -R /RPOOL -o readonly=on rpool

  My last-used root fs (rpool/ROOT/openindiana-1) had the
property mountpoint=/ set, so it got mounted into /RPOOL of
the LiveUSB environment. I copied my data off it, roughly
like this:

# cd /RPOOL && ( find . -xdev -depth -print | \
  cpio -pvdm /POOL/rpool-backup/openindiana-1 ; \
  rsync -avPHK ./ /POOL/rpool-backup/openindiana-1/ )

Likewise for many other filesystems, like those with legacy
mountpoints (mounted wherever I like, like /mnt/1) or those
with existing valid mountpoints (like /export).
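
For the legacy-mountpoint datasets that was simply something
like the following (the dataset name is a placeholder):

# mkdir -p /mnt/1
# mount -F zfs rpool/somedataset /mnt/1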

However, I ran into trouble with secondary (older) root FSes
which I wanted to keep for posterity. For whatever reason,
their "mountpoint" property was set to "/mnt". This directory
was not empty, and I found no way to pass the "-O" flag to
mount (to mount over non-empty directories) for ZFS-automounted
FSes. On the read-only rpool I couldn't clean up the "/mnt"
directory. I could not use the "mount -F zfs" command because
the dataset's mountpoint was defined and not "legacy".
I could not change it to legacy, because rpool is read-only.
And if I unmounted the "rpool/ROOT/openindiana-1" dataset,
there was no "/mnt" left at all, and no way to create one on
a read-only pool.

So I thought of tmpfs: I can mount one with "-O" anywhere,
and it needs no resources. First I tried mounting tmpfs over
/RPOOL/mnt with "openindiana-1" mounted, but I couldn't mount
the older root over this mountpoint directly.

So I unmounted all datasets of "rpool", keeping the pool
imported, and mounted tmpfs over the pool's alternate mount
point. Now I could do my trick:

# mount -F tmpfs -O - /RPOOL
# mkdir /RPOOL/mnt
# zfs mount rpool/ROOT/openindiana

To make this walkthrough complete for those who might need it:
since I wanted to retain the benefits of root-dataset cloning,
I did the same with my backup:

# zfs snapshot pool/rpool-backup/openindiana-1@20110501
# zfs clone pool/rpool-backup/openindiana-1@20110501 \
  pool/rpool-backup/openindiana
# cd /RPOOL/mnt && rsync -avPHK --delete-after \
  ./ /POOL/rpool-backup/openindiana/

Thanks to rsync, I got only differing (older) files written
onto that copy, with newer files removed.

"cpio" and "rsync" both barked on some unreadable files
(I/O errors) which I believe were in the zfs blocks with
mismatching checksums, initially leading to the useless
rpool. I replaced these files with those on the LiveUSB
in the copy on "pool".

Finally I exported and recreated rpool on the same device,
and manually repopulated the zpool properties (failmode,
bootfs) as well as initial values of zfs properties that
I wanted (copies=2, dedup=off, compression=off). Then I
used installgrub to ensure that the new rpool is bootable.
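
For reference, the installgrub invocation is the usual one
(the raw device below is just an example; use whatever disk
your rpool lives on):

# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c3t0d0s0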

I also set the zfs properties I wanted on the copy of the rpool
hierarchy in "pool" (i.e. copies=2, plus canmount, mountpoint,
caiman.* and others set by OpenIndiana, and compression=on where
allowed, on non-root datasets). Even though the actual data on
"pool" was written with copies=1, the property value copies=2
will be carried over and applied during "zfs send | zfs recv".
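
For example (the dataset names below are placeholders for the
actual hierarchy):

# zfs set copies=2 pool/rpool-backup
# zfs set canmount=noauto pool/rpool-backup/openindiana-1
# zfs set compression=on pool/rpool-backup/export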

Now I could "zfs send" the hierarchical replication stream
from my copy in "pool" to the new "rpool", kind of like this:

# zfs snapshot -r pool/rpool-backup@2019-05
# zfs send -R pool/rpool-backup@2019-05 | zfs recv -vF rpool

Since the hardware was all the same, there was little else
to do. I revised "RPOOL/rpool/boot/grub/menu.lst" and
"RPOOL/etc/vfstab" just in case, but otherwise was ready
to reboot. Luckily for me, the system came up as expected.

That kind of joy does not always happen as planned,
especially when you're half a globe away from the
computer you're repairing.

Re: [zfs-discuss] SUMMARY: mounting datasets from a read-only pool with aid of tmpfs

2011-11-22 Thread Lori Alt



Did you try a temporary mount point?

zfs mount -o mountpoint=/whatever 

- lori


On 11/22/11 15:11, Jim Klimov wrote:

Hello all,

  I'd like to report a tricky situation and a workaround
I've found useful - hope this helps someone in similar
situations.

  To cut the long story short, I could not properly mount
some datasets from a readonly pool, which had a non-"legacy"
mountpoint attribute value set, but the mountpoint was not
available (directory absent or not empty). In this case I
could neither create/cleanup the mountpoints, nor change
the dataset properties to mountpoint=legacy.

  After a while I managed to override a higher-level
point in the FS tree by mounting tmpfs over it, and in
that tmpfs I could make the mountpoints needed by zfs.


Re: [zfs-discuss] SUMMARY: mounting datasets from a read-only pool with aid of tmpfs

2011-11-22 Thread Jim Klimov

2011-11-23 2:26, Lori Alt wrote:

Did you try a temporary mount point?
zfs mount -o mountpoint=/whatever 

- lori



I don't want to lie, so I'll hold off on a definite answer.
I think I've tried that, but I'm not certain now. I'll try
to recreate the situation later and respond responsibly ;)

If this does work, that's a good idea ;)
Should it work relative to the alternate root as well (just
like a default/predefined mountpoint value would)?

BTW, is there a way to do overlay mounts (like "mount -O")
with zfs automount attributes? For now I have to use legacy
mounts and /etc/vfstab for that on some systems, but I would
like to avoid such complication if possible...
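
For reference, the legacy-mount workaround mentioned above looks
roughly like this (the dataset and mountpoint names are
placeholders):

# zfs set mountpoint=legacy pool/somedata
# mount -F zfs -O pool/somedata /some/dir

with a corresponding line in /etc/vfstab where a boot-time mount
is wanted:

pool/somedata   -   /some/dir   zfs   -   yes   -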

//Jim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Compression

2011-11-22 Thread Matt Breitbach
So I'm looking at files on my ZFS volume that are compressed, and I'm
wondering to myself, "self, are the values shown here the size on disk, or
are they the pre-compressed values".  Google gives me no great results on
the first few pages, so I headed here.

This really relates to my VMware environment.  I had some "things" happen
on my platform that required me to Storage vMotion everything off a
particular zpool.  When I did that, I saw most VMs inflate to nearly their
thick-provisioned size.  What didn't swell to that size went to about 2/3
provisioned (non-Nexenta storage).

I have been seeing 1.3-1.5x compression ratios on pretty much everything I
turn compression on for (these are general-use VMs: webservers, SQL,
firewalls, etc).

My question is this - when I'm looking in the file structure, or in the
datastore browser in VMware, am I seeing the uncompressed file size, or the
compressed filesize?

My gut tells me that, since they inflated _so_ badly when I Storage
vMotioned them, they are the compressed values, but I would love to know
for sure.

-Matt Breitbach


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Compression

2011-11-22 Thread Jim Klimov

2011-11-23 7:39, Matt Breitbach wrote:

So I'm looking at files on my ZFS volume that are compressed, and I'm
wondering to myself, "self, are the values shown here the size on disk, or
are they the pre-compressed values".  Google gives me no great results on
the first few pages, so I headed here.


Alas, I can't give a good hint about which values VMware
uses. But here are some numbers it might see (most likely
"du"- or "ls"-style sizes are in play):

Locally on a ZFS-enabled system you can use "ls" to list
your files normally. This shows the logical POSIX file size,
including any referenced-but-not-allocated sparse blocks
(logical size = big, physical size = zero), etc.
Basically, this just gives the range of byte offsets that
you can address in the file; depending on the underlying FS,
not all of these bytes are necessarily backed 1:1 by physical
storage.

If you use "du" on the ZFS filesystem, you'll see the logical
storage size, which takes into account compression and sparse
bytes. So the "du" size should be not greater than "ls" size.

Harder to see is the physical allocation size, which depends
on your data pool's redundancy (raidzN level, copies=N and
so on), but you can more or less calculate it from the "du"
size. Your files on ZFS also indirectly consume space through
metadata blocks (usually one per data block, and usually
comparatively small); that is pool metadata and does not
easily show up as "file size" either. If you're really
interested, you might search for howtos on using the "zdb"
command to "debug" your ZFS pool and gather statistics.
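
For instance (a sketch only; the dataset name and object number
are placeholders, and zdb output formats vary between releases):

# zdb -d rpool/export/home
# zdb -dddd rpool/export/home 7
# zdb -b rpool

The first lists the objects in a dataset, the second dumps
detailed per-object information (including sizes) for one
object, and the last traverses the pool and prints block
statistics.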

If your new storage system does support some sort of
compression, at least for contiguous runs of zero bytes,
you might write some large zero-filled files into your VMs'
filesystems and then delete them. This overwrites the blocks
previously used by already-deleted files inside the VM and
may give temporary space savings.
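
One way to do that from inside a Unix-like guest (the file name
is arbitrary; the dd run stops when the filesystem fills up, so
remove the file right away):

# dd if=/dev/zero of=/zerofill bs=1024k
# rm /zerofill
# sync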





HTH,
//Jim Klimov

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Compression

2011-11-22 Thread Ian Collins

On 11/23/11 04:58 PM, Jim Klimov wrote:

If you use "du" on the ZFS filesystem, you'll see the logical
storage size, which takes into account compression and sparse
bytes. So the "du" size should be not greater than "ls" size.


It can be significantly bigger:

ls -sh x
   2 x

du -sh x
   1K x

-- Ian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] early save the date: What's New and What's Coming in ZFS for illumos, Jan 10

2011-11-22 Thread Zaeem Arshad
On Wed, Nov 23, 2011 at 2:31 AM, Deirdre Straughan <deirdre.straug...@joyent.com> wrote:

> Yes, I do plan to stream (assuming no technical difficulties) and record
> the session - just ordered new AV equipment in fact (you can thank Bryan
> for that). I'd like to encourage and assist worldwide participation in all
> meetings, though I recognize that an evening time slot here in California
> isn't ideal for most.
>

Thank you. Looking forward to the recorded sessions.

Regards

--
Zaeem
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Compression

2011-11-22 Thread Jim Klimov

2011-11-23 8:21, Ian Collins wrote:

If you use "du" on the ZFS filesystem, you'll see the logical
storage size, which takes into account compression and sparse
bytes. So the "du" size should be not greater than "ls" size.


It can be significantly bigger:

ls -sh x
2 x

du -sh x
1K x


Pun accepted ;)

Ian is right: the "du" size also reflects block-size usage,
i.e. how many bytes are actually allocated on the FS (above
the redundancy layer). Even if your files are smaller than a
single block, one full block is the minimum they will take
from your disk.

However, the original question was about VM datastores,
so large files were assumed.

//Jim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Compression

2011-11-22 Thread Richard Elling
Hi Matt,

On Nov 22, 2011, at 7:39 PM, Matt Breitbach wrote:

> So I'm looking at files on my ZFS volume that are compressed, and I'm
> wondering to myself, "self, are the values shown here the size on disk, or
> are they the pre-compressed values".  Google gives me no great results on
> the first few pages, so I headed here.
> 
> My question is this - when I'm looking in the file structure, or in the
> datastore browser in VMware, am I seeing the uncompressed file size, or the
> compressed filesize?
> 

How are you measuring the space?

Are you using block (iscsi/fc) or NFS to access the datastores from ESXi?
 -- richard

-- 

ZFS and performance consulting
http://www.RichardElling.com
LISA '11, Boston, MA, December 4-9 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss