[zfs-discuss] ZFS compression API (Was Re: [osol-discuss] Re: Re: where to start?)

2006-05-23 Thread Darren J Moffat

[EMAIL PROTECTED] wrote:

Robert Milkowski wrote:


But only if compression is turned on for a filesystem.

Of course, and the default is off.


However I think it would be good to have an API so application can
decide what to compress and what not.
I agree that an API would be good.  However I don't think using the API 
should allow an application to write compressed data if the file system 
has that functionality turned off.  It's a policy thing: if the admin has 
compression off, it is off for a reason.  Or maybe what we need is another 
property value for compression that allows the app to request it, but by 
default we don't do it.


*Only* if we fail to come up with a mechanism to do this properly,
efficiently automagically.


Which bit did you mean?  The API itself or the policy part?

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS compression API (Was Re: [osol-discuss] Re: Re: where to start?)

2006-05-23 Thread Casper . Dik

>[EMAIL PROTECTED] wrote:
>>> Robert Milkowski wrote:
>>>
 But only if compression is turned on for a filesystem.
>>> Of course, and the default is off.
>>>
 However I think it would be good to have an API so application can
 decide what to compress and what not.
>>> I agree that an API would be good.  However I don't think using the API 
>>> should allow an application to write compressed data if the file system 
>>> has that functionality turned off.  Its a policy thing if the admin has 
>>> compression off it is off for a reason. Or maybe what we need is another 
>>> property value for compression that allows the app to request it but by 
>>> default we don't do it.
>> 
>> *Only* if we fail to come up with a mechanism to do this properly,
>> efficiently automagically.
>
>Which bit did you mean ? The API itself or the policy part  ?

The API (and therefore also the policy).

If we can make it work without, then that is much better.

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] tracking error to file

2006-05-23 Thread Wout Mertens
Can that same method be used to figure out what files changed between  
snapshots?


Wout.

On 22 May 2006, at 08:25, Matthew Ahrens wrote:


On Fri, May 19, 2006 at 01:23:02PM -0600, Gregory Shaw wrote:

  DATASET  OBJECT  RANGE
  1b   2402lvl=0 blkid=1965

I haven't found a way to report in human terms what the above object
refers to.  Is there such a method?


There isn't any great method currently, but you can use 'zdb' to find
this information.  The quickest way would be to first determine the
name of dataset 0x1b (=27):

# zdb local | grep "ID 27,"
Dataset local/ahrens [ZPL], ID 27, ...

Then get info on that particular object in that filesystem:

# zdb -vvv <dataset> 2402
...
Object  lvl   iblk   dblk  lsize  asize  type
      2402    1    16K  3.50K  3.50K  2.50K  ZFS plain file
                                 264  bonus  ZFS znode
        path    /raidz/usr/src/uts/common/fs/zfs/dmu.c
...

The "path" listed is relative to the filesystem's mountpoint.

--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: iostat numbers for ZFS disks, build 39

2006-05-23 Thread Wes Williams
> is anyone else seeing this? I couldn't find any
> references to this in
> the bug database.
> 
I'm also seeing this behavior on occasion with b36 and b38. The following is from the b36 box:

Sun Microsystems Inc.   SunOS 5.11  snv_36  October 2007
-bash-3.00$ iostat 1
***snip***
   tty         sd0           sd1           sd2           sd3          cpu
 tin tout kps tps serv  kps tps serv  kps tps serv  kps tps serv   us sy wt id
   0   80   0   000   002   200   001  5  0 95
   0  235   0   000   000   000   000  3  0 97
   0   80   0   000   00  24210 193   750   000  6  0 94
   0   82   0   000   00   82  3430   001  6  0 94
   0   80   0   000   000   000   000  3  0 96
   0   80   0   000   000   000   000  2  0 97
   0   80   0   000   000   000   001  3  0 96
   0   80   0   000   00  24192 189   770   000  5  0 95
   0   82   0   000   00  18016418222030220   900   00    1  6  0 93
   0   94   0   000   000   000   000  2  0 97
   0   80   0   000   000   000   001  3  0 96
^C
-bash-3.00$
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS desktop integration demo

2006-05-23 Thread Tim Foster
hey all,

I just posted some stuff I'd been playing around with regarding more desktop
integration of ZFS functionality; the full story is at:

http://blogs.sun.com/roller/page/timf?entry=zfs_on_your_desktop

it's not much, but it's a start...

cheers,
tim



[ Oh, and if you hadn't noticed, Alo also did some desktop integration
work, concerning ACLs recently, at
http://blogs.sun.com/roller/page/alvaro#gnome_zfs_and_the_acl ]

-- 
Tim Foster, Sun Microsystems Inc, Operating Platforms Group
Engineering Operations    http://blogs.sun.com/timf

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS compression API (Was Re: [osol-discuss] Re: Re: where to start?)

2006-05-23 Thread Robert Milkowski
Hello Darren,

Tuesday, May 23, 2006, 11:12:15 AM, you wrote:

DJM> [EMAIL PROTECTED] wrote:
>>> Robert Milkowski wrote:
>>>
 But only if compression is turned on for a filesystem.
>>> Of course, and the default is off.
>>>
 However I think it would be good to have an API so application can
 decide what to compress and what not.
>>> I agree that an API would be good.  However I don't think using the API 
>>> should allow an application to write compressed data if the file system 
>>> has that functionality turned off.  Its a policy thing if the admin has 
>>> compression off it is off for a reason. Or maybe what we need is another 
>>> property value for compression that allows the app to request it but by 
>>> default we don't do it.
>> 
>> *Only* if we fail to come up with a mechanism to do this properly,
>> efficiently automagically.

It's not a good idea IMHO.
The problem is that, with stronger compression algorithms, for
performance reasons I want to decide which algorithms to use and which
files ZFS should try to compress. For some files I write a lot of data,
and even if they compress quite well I don't want it - too much CPU
would be consumed. However, other files in the same pool are not written
that often, so compression for them would be useful (and cheap in terms
of CPU).


-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS Web administration interface

2006-05-23 Thread Bob Miller
Steve,

Thanks for the update, I will try again with Build 40. I generally use the CLI 
anyway, but when showing ZFS to others it is always nice to include the web GUI 
as some people feel more comfortable with it.

Bob
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS compression API (Was Re: [osol-discuss] Re: Re: where to start?)

2006-05-23 Thread Darren J Moffat

Robert Milkowski wrote:

The problem is that with stronger compression algorithms due to
performance reasons I want to decide which algorithms and to what
files ZFS should try to compress. For some files I write a lot of data
even if they are compressing quite good i don't want it - too much CPU
would be consumed. However for other files in the same data pool I
know we do not write them that often so compression for them would be
useful (and cheap in terms of CPU).


This is exactly the reason I don't want applications running as a 
normal user to be able to turn on more expensive compression algorithms 
if the administrator of the dataset has explicitly not turned 
compression on.


I think there are three possible modes here:

1) No compression ever.
2) Current behaviour.
3) New: best effort from ZFS (i.e. current behaviour) with hints from the 
application about the data, e.g. "don't bother trying, this is an MP3".
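
As a very rough sketch, those three modes could surface as values of the 
existing compression property (the dataset name and the 'hint' value are 
purely illustrative; nothing like mode 3 exists today):

  zfs set compression=off   pool/data   # 1) no compression, ever
  zfs set compression=on    pool/data   # 2) current behaviour, admin's choice
  zfs set compression=hint  pool/data   # 3) hypothetical: ZFS best effort,
                                        #    applications may pass hints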



--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] RFE filesystem ownership

2006-05-23 Thread James Dickens

Hi

I think ZFS should add the concept of ownership to a ZFS filesystem,
so if I create a filesystem for joe, he should be able to use his
space however he sees fit; if he wants to turn on compression or
take 5000 snapshots, it's his filesystem, let him. If he wants to
destroy snapshots he created, that should be allowed, but he should
not be allowed to do the same with carol's filesystem. The current
filesystem management is not fine-grained enough to deal with this. Of
course, if we don't assign an owner the filesystem should behave much
as it does today.

zfs set owner=joe pool/joe
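
As a sketch of what the RFE would allow in practice (the owner property
and the delegation semantics are hypothetical; none of this exists today):

  zfs set owner=joe pool/joe
  # then, running as joe:
  zfs set compression=on pool/joe        # tune his own filesystem
  zfs snapshot pool/joe@monday           # create his own snapshots...
  zfs destroy pool/joe@monday            # ...and destroy them
  zfs destroy pool/carol@backup          # denied: joe does not own carol's filesystem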
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] tracking error to file

2006-05-23 Thread Matthew Ahrens
On Tue, May 23, 2006 at 11:49:47AM +0200, Wout Mertens wrote:
> Can that same method be used to figure out what files changed between  
> snapshots?

To figure out what files changed, we need to (a) figure out what object
numbers changed, and (b) do the object number to file name translation.

The method I described (using zdb) will not be involved in either step.
zdb is an undocumented interface, and using it for this purpose is only
a workaround.  However, the same algorithms implemented in zdb will be
used to do step (b), the object number to file name translation.
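
In the meantime, the zdb workaround can be wrapped in a small script. A
rough sketch follows; the pool name, dataset id and object number are just
the ones from the example earlier in this thread, and the sed pattern
assumes zdb output in the same format as that example:

  #!/bin/ksh
  # translate a DATASET/OBJECT pair from an error report into a file path,
  # using the two zdb steps described earlier in the thread
  POOL=local
  DSID=$((16#1b))        # dataset id from the report, hex -> decimal (27)
  OBJ=2402               # object number from the report

  # find the name of the dataset whose ID matches
  DS=$(zdb $POOL | sed -n "s/^Dataset \([^ ]*\) .*ID ${DSID},.*/\1/p")

  # dump that object and keep only the path line
  zdb -vvv "$DS" "$OBJ" | grep -w path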

--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS Web administration interface

2006-05-23 Thread Ron Halstead
Just a note about build 39. I tried Tim's smreg add suggestion and it worked; I 
now see the ZFS Admin page. But when I click on the link, the next page shows:

Application Error
org.apache.jasper.JasperException: /jsp/zfsmodule/DevicesTree.jsp(51,2) The end 
tag "http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE filesystem ownership

2006-05-23 Thread Darren J Moffat

James Dickens wrote:

Hi

I think ZFS should add the concept of ownership to a ZFS filesystem,
so if i create a filesystem for joe, he should be able to use his
space how ever he see's fit, if he wants to turn on compression or
take 5000 snapshots its his filesystem, let him. If he wants to
destroy snapshots, he created them it should be allowed, but he should
not be allowed to do the same with carol's filesystem. The current
filesystem management is not fine grained enough to deal with this. Of
course if we don't assign an owner the filesystem should perform much
like it does today.


Yes we do need something like this.

This is already covered by the following CRs 6280676, 6421209.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Mirror options pros and cons

2006-05-23 Thread Tom Gendron

Hi,

I have these two pools, four LUNs each. One has two mirrors of two LUNs 
each; the other is one mirror of four LUNs.


I am trying to figure out what the pros and cons are of these two configs.

One thing I have noticed is that the single-mirror four-LUN config can 
survive as many as three LUN failures.  The other config only two.
I am thinking that space efficiency is similar because ZFS stripes across 
all the LUNs in both configs.


So, that being said, I would like to hear from others on the pros and cons 
of these two approaches.


Thanks ahead,
-tomg

  NAME              STATE     READ WRITE CKSUM
  mypool            ONLINE       0     0     0
    mirror          ONLINE       0     0     0
      /export/lun5  ONLINE       0     0     0
      /export/lun2  ONLINE       0     0     0
    mirror          ONLINE       0     0     0
      /export/lun3  ONLINE       0     0     0
      /export/lun4  ONLINE       0     0     0

  NAME              STATE     READ WRITE CKSUM
  newpool           ONLINE       0     0     0
    mirror          ONLINE       0     0     0
      /export/luna  ONLINE       0     0     0
      /export/lunb  ONLINE       0     0     0
      /export/lund  ONLINE       0     0     0
      /export/lunc  ONLINE       0     0     0
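
For reference, the two layouts above would be created with commands along 
these lines (plain-file vdevs, matching the status output):

  # mypool: two 2-way mirrors, striped
  zpool create mypool  mirror /export/lun5 /export/lun2  mirror /export/lun3 /export/lun4

  # newpool: one 4-way mirror
  zpool create newpool mirror /export/luna /export/lunb /export/lund /export/lunc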

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re[2]: ZFS compression API (Was Re: [osol-discuss] Re: Re: where to start?)

2006-05-23 Thread Robert Milkowski
Hello Darren,

Tuesday, May 23, 2006, 4:19:05 PM, you wrote:

DJM> Robert Milkowski wrote:
>> The problem is that with stronger compression algorithms due to
>> performance reasons I want to decide which algorithms and to what
>> files ZFS should try to compress. For some files I write a lot of data
>> even if they are compressing quite good i don't want it - too much CPU
>> would be consumed. However for other files in the same data pool I
>> know we do not write them that often so compression for them would be
>> useful (and cheap in terms of CPU).

DJM> This is exactly the reason I don't want the applications running as a 
DJM> normal user to be able to turn on more expensive compression algorithms
DJM> if the administrator of the data set has explicitly not turned on 
DJM> compression.

DJM> I think there are three possible modes here:

DJM> 1) No compression ever.
DJM> 2) Current behaviour
DJM> 3) New: Best effort from ZFS (ie current behaviour) with hints from 
DJM> application on the data eg, don't bother trying this is an MP3.

Definitely, that's something I had in mind.
To be more precise, case #3 should be something like: either ZFS tries
to determine by itself which compression to use for each file (block?),
or the administrator can set up a default compression for a dataset.
However, in both cases the application is able to either hint ZFS or
even force a specific compression algorithm for a file (block? a given
write? what about mmap()?).

In some cases an algorithm per block (per write()) would be preferred -
for example, if our application writes an email it already knows what
kind of attachments there are, so it doesn't make sense to even try to
compress that block (however, the rest of the mail could be compressed).
Something like this:

ioctl(fd, strong compression)
write(fd, buf1, s1) - headers and body
ioctl(fd, no compression)
write(fd, buf2, s2) - JPEG attachment

However, applications would often prefer writev() or sendfilev(), and I'm
not sure how to cope with that situation.

To make the API more universal, applications should have an option not
only to request a specific compression algorithm but also to request some
kind of compression level without actually naming the algorithm - like
"I want light compression" or "I don't care about CPU and want heavy
compression" (or something in the middle).
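
As a rough sketch, that could look like additional (purely hypothetical)
values of the compression property:

  zfs set compression=light pool/mail      # hypothetical: cheap algorithm, little CPU
  zfs set compression=heavy pool/archive   # hypothetical: best ratio, CPU-hungry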

P.S. And of course, if the admin chooses option #1 an application can't
force ZFS to use compression; and in #2 only the compression selected by
the sysadmin is used.

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE filesystem ownership

2006-05-23 Thread Robert Milkowski
Hello James,

Tuesday, May 23, 2006, 6:43:11 PM, you wrote:

JD> Hi

JD> I think ZFS should add the concept of ownership to a ZFS filesystem,
JD> so if i create a filesystem for joe, he should be able to use his
JD> space how ever he see's fit, if he wants to turn on compression or
JD> take 5000 snapshots its his filesystem, let him. If he wants to
JD> destroy snapshots, he created them it should be allowed, but he should
JD> not be allowed to do the same with carol's filesystem. The current
JD> filesystem management is not fine grained enough to deal with this. Of
JD> course if we don't assign an owner the filesystem should perform much
JD> like it does today.

JD> zfs set owner=joe pool/joe


IIRC it's planned. However, I'm not sure a user should be able to
turn on compression, especially when it has been explicitly turned off
by the sysadmin.

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Mirror options pros and cons

2006-05-23 Thread Robert Milkowski
Hello Tom,

Tuesday, May 23, 2006, 9:46:24 PM, you wrote:

TG> Hi,

TG> I have these two pools, four luns each. One has two mirrors x two luns,
TG> the other is one mirror x 4 luns.

TG> I am trying to figure out what the pro's and cons are of these two configs.

TG> One thing I have noticed is that the single mirror 4 lun config can 
TG> survive as many as three lun failures.  The other config only two.
TG> I am thinking that space efficiency is similar because zfs strips across
TG> all the luns in both configs.

TG> So that being said. I would like to here from others on pro's and cons
TG> of these two approaches.

TG> Thanks ahead,
TG> -tomg

TG>NAME  STATE READ WRITE CKSUM
TG> mypool ONLINE   0 0 0
TG>   mirror ONLINE   0 0 0
TG> /export/lun5   ONLINE   0 0 0
TG> /export/lun2   ONLINE   0 0 0
TG>   mirror ONLINE   0 0 0
TG> /export/lun3  ONLINE   0 0 0
TG> /export/lun4  ONLINE   0 0 0

TG> NAME  STATE READ WRITE CKSUM
TG> newpool   ONLINE   0 0 0
TG>   mirrorONLINE   0 0 0
TG> /export/luna  ONLINE   0 0 0
TG> /export/lunb  ONLINE   0 0 0
TG> /export/lund  ONLINE   0 0 0
TG> /export/lunc  ONLINE   0 0 0


In the first config you get a pool with capacity equal to 2x the LUN
size; in the second config, only 1x the LUN size.
So the second config gives you better redundancy but only half the
storage.

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE filesystem ownership

2006-05-23 Thread James Dickens

On 5/23/06, Robert Milkowski <[EMAIL PROTECTED]> wrote:

Hello James,

Tuesday, May 23, 2006, 6:43:11 PM, you wrote:

JD> Hi

JD> I think ZFS should add the concept of ownership to a ZFS filesystem,
JD> so if i create a filesystem for joe, he should be able to use his
JD> space how ever he see's fit, if he wants to turn on compression or
JD> take 5000 snapshots its his filesystem, let him. If he wants to
JD> destroy snapshots, he created them it should be allowed, but he should
JD> not be allowed to do the same with carol's filesystem. The current
JD> filesystem management is not fine grained enough to deal with this. Of
JD> course if we don't assign an owner the filesystem should perform much
JD> like it does today.

JD> zfs set owner=joe pool/joe


IIRC it's planned - however I'm not sure it a user should be able to
turn on compression, especially when it's directly turned off by sys
admin.



Perhaps if compression is turned/forced off by the admin then it
shouldn't be allowed, but enabling compression on a filesystem that
just has it off by default should be allowed; of course, this
complicates the implementation. There could be a system-wide config
file disallowing compression/encryption, etc.

James
uadmin.blogspot.com



--
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE filesystem ownership

2006-05-23 Thread Roland Mainz
Darren J Moffat wrote:
> James Dickens wrote:
> > I think ZFS should add the concept of ownership to a ZFS filesystem,
> > so if i create a filesystem for joe, he should be able to use his
> > space how ever he see's fit, if he wants to turn on compression or
> > take 5000 snapshots its his filesystem, let him. If he wants to
> > destroy snapshots, he created them it should be allowed, but he should
> > not be allowed to do the same with carol's filesystem. The current
> > filesystem management is not fine grained enough to deal with this. Of
> > course if we don't assign an owner the filesystem should perform much
> > like it does today.
> 
> Yes we do need something like this.
> 
> This is already covered by the following CRs 6280676, 6421209.

That could be done if "zfs" were based on ksh93... you could simply
run it as a "profile shell" (pfksh93) and make a profile for that
user + ZFS filesystem...



Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) [EMAIL PROTECTED]
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Mirror options pros and cons

2006-05-23 Thread Tom Gendron



Robert Milkowski wrote:

Hello Tom,

Tuesday, May 23, 2006, 9:46:24 PM, you wrote:

TG> Hi,

TG> I have these two pools, four luns each. One has two mirrors x two luns,
TG> the other is one mirror x 4 luns.

TG> I am trying to figure out what the pro's and cons are of these two configs.

TG> One thing I have noticed is that the single mirror 4 lun config can 
TG> survive as many as three lun failures.  The other config only two.

TG> I am thinking that space efficiency is similar because zfs strips across
TG> all the luns in both configs.

TG> So that being said. I would like to here from others on pro's and cons
TG> of these two approaches.

TG> Thanks ahead,
TG> -tomg

TG>NAME  STATE READ WRITE CKSUM
TG> mypool ONLINE   0 0 0
TG>   mirror ONLINE   0 0 0
TG> /export/lun5   ONLINE   0 0 0
TG> /export/lun2   ONLINE   0 0 0
TG>   mirror ONLINE   0 0 0
TG> /export/lun3  ONLINE   0 0 0
TG> /export/lun4  ONLINE   0 0 0

TG> NAME  STATE READ WRITE CKSUM
TG> newpool   ONLINE   0 0 0
TG>   mirrorONLINE   0 0 0
TG> /export/luna  ONLINE   0 0 0
TG> /export/lunb  ONLINE   0 0 0
TG> /export/lund  ONLINE   0 0 0
TG> /export/lunc  ONLINE   0 0 0


In the first config you should get a pool storage with capacity equal to
'2x lun size'. In the second config only '1x lun size'.
So in the second config you get better redundancy but only half
storage size.

  

OK, I see that; df shows it explicitly.

[EMAIL PROTECTED]> df -F zfs -h
Filesystem size   used  avail capacity  Mounted on
mypool 2.0G39M   1.9G 2%/mypool
newpool   1000M 8K  1000M 1%/newpool

What confused me is that ZFS does dynamic striping, and if I write to the 
2 x 2-LUN mirror pool all of the disks get I/O. But my error in thought was 
in how the data gets spread out. It must be that the writes get striped for 
bandwidth utilization, but a given block and its mirror copies are not 
spread across both mirrors. I'd like to understand that better.
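
One way to watch how the writes are spread over the vdevs while a copy is 
running is per-vdev iostat, e.g.:

  zpool iostat -v mypool 5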


It sure is good to be able to experiment with devices.

-tomg
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[2]: [zfs-discuss] RFE filesystem ownership

2006-05-23 Thread Robert Milkowski
Hello James,

Tuesday, May 23, 2006, 10:25:10 PM, you wrote:

JD> On 5/23/06, Robert Milkowski <[EMAIL PROTECTED]> wrote:
>> Hello James,
>>
>> Tuesday, May 23, 2006, 6:43:11 PM, you wrote:
>>
>> JD> Hi
>>
>> JD> I think ZFS should add the concept of ownership to a ZFS filesystem,
>> JD> so if i create a filesystem for joe, he should be able to use his
>> JD> space how ever he see's fit, if he wants to turn on compression or
>> JD> take 5000 snapshots its his filesystem, let him. If he wants to
>> JD> destroy snapshots, he created them it should be allowed, but he should
>> JD> not be allowed to do the same with carol's filesystem. The current
>> JD> filesystem management is not fine grained enough to deal with this. Of
>> JD> course if we don't assign an owner the filesystem should perform much
>> JD> like it does today.
>>
>> JD> zfs set owner=joe pool/joe
>>
>>
>> IIRC it's planned - however I'm not sure it a user should be able to
>> turn on compression, especially when it's directly turned off by sys
>> admin.
>>

JD> perhaps if compression turned/forced off by the admin then it
JD> shouldn't be alowed, but enabling compression on a filesystem that
JD> just had it off by default should be allowed but of course this
JD> complicates implementation. They could create a system wide config
JD> file disallowing compression/encryption  etc.

Something like 'zfs set compression=user dataset'

or, instead of 'user', 'allow'.

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[2]: [zfs-discuss] RFE filesystem ownership

2006-05-23 Thread Robert Milkowski
Hello Roland,

Tuesday, May 23, 2006, 10:31:37 PM, you wrote:

RM> Darren J Moffat wrote:
>> James Dickens wrote:
>> > I think ZFS should add the concept of ownership to a ZFS filesystem,
>> > so if i create a filesystem for joe, he should be able to use his
>> > space how ever he see's fit, if he wants to turn on compression or
>> > take 5000 snapshots its his filesystem, let him. If he wants to
>> > destroy snapshots, he created them it should be allowed, but he should
>> > not be allowed to do the same with carol's filesystem. The current
>> > filesystem management is not fine grained enough to deal with this. Of
>> > course if we don't assign an owner the filesystem should perform much
>> > like it does today.
>> 
>> Yes we do need something like this.
>> 
>> This is already covered by the following CRs 6280676, 6421209.

RM> That could be done if "zfs" would be based on ksh93... you could simply
RM> run it as "profile shell" (pfksh93) and make a profile for that user+ZFS
RM> filesystem...

Maybe I'm missing something, but it has nothing to do with ksh93 or any
other shell. It should just work for a given user regardless of their
shell, etc. - just the uid and the proper privileges.

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[2]: [zfs-discuss] Mirror options pros and cons

2006-05-23 Thread Robert Milkowski
Hello Tom,

Tuesday, May 23, 2006, 10:37:31 PM, you wrote:


TG> Robert Milkowski wrote:
>> Hello Tom,
>>
>> Tuesday, May 23, 2006, 9:46:24 PM, you wrote:
>>
>> TG> Hi,
>>
>> TG> I have these two pools, four luns each. One has two mirrors x two luns,
>> TG> the other is one mirror x 4 luns.
>>
>> TG> I am trying to figure out what the pro's and cons are of these two 
>> configs.
>>
>> TG> One thing I have noticed is that the single mirror 4 lun config can 
>> TG> survive as many as three lun failures.  The other config only two.
>> TG> I am thinking that space efficiency is similar because zfs strips across
>> TG> all the luns in both configs.
>>
>> TG> So that being said. I would like to here from others on pro's and cons
>> TG> of these two approaches.
>>
>> TG> Thanks ahead,
>> TG> -tomg
>>
>> TG>NAME  STATE READ WRITE CKSUM
>> TG> mypool ONLINE   0 0 0
>> TG>   mirror ONLINE   0 0 0
>> TG> /export/lun5   ONLINE   0 0 0
>> TG> /export/lun2   ONLINE   0 0 0
>> TG>   mirror ONLINE   0 0 0
>> TG> /export/lun3  ONLINE   0 0 0
>> TG> /export/lun4  ONLINE   0 0 0
>>
>> TG> NAME  STATE READ WRITE CKSUM
>> TG> newpool   ONLINE   0 0 0
>> TG>   mirrorONLINE   0 0 0
>> TG> /export/luna  ONLINE   0 0 0
>> TG> /export/lunb  ONLINE   0 0 0
>> TG> /export/lund  ONLINE   0 0 0
>> TG> /export/lunc  ONLINE   0 0 0
>>
>>
>> In the first config you should get a pool storage with capacity equal to
>> '2x lun size'. In the second config only '1x lun size'.
>> So in the second config you get better redundancy but only half
>> storage size.
>>
>>   
TG> Ok I see that,  df shows it explicitly.

[EMAIL PROTECTED]>> df -F zfs -h
TG> Filesystem size   used  avail capacity  Mounted on
TG> mypool 2.0G39M   1.9G 2%/mypool
TG> newpool   1000M 8K  1000M 1%/newpool

TG> What confused me is that ZFS does dynamic striping and if I write to the
TG> 2x lun mirror all of the disks get IO. But my error in thought was in 
TG> how the data gets spread out. It must be that the writes get striped for
TG> bandwidth utilization but the blocks and their copies are not spread 
TG> across the mirrors. I'd like to understand that better.

TG> It sure is good to be able to experiment with devious.

Well, "mirror A B mirror C D" with ZFS actually behaves like RAID-10
(a stripe over mirrors). The main difference here is the variable stripe
width, but when it comes to protection it's just RAID-10 plus checksums
for data and metadata. You can imagine such a config as stacked RAID -
the same as if you had created two mirrors on a HW RAID box, exposed
those two mirror LUNs to the host, and then just striped across them
using ZFS (zpool create pool X Y, where X is one mirror of two disks and
Y is another mirror of two disks). The difference is the variable stripe
width and the checksums (and a more clever I/O scheduler?).

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[9]: [zfs-discuss] Re: Re: Due to 128KB limit in ZFS it can't saturate disks

2006-05-23 Thread Robert Milkowski
Hello Roch,

Monday, May 22, 2006, 3:42:41 PM, you wrote:



RBPE>   Robert Says:

RBPE>   Just to be sure - you did reconfigure system to actually allow larger
RBPE>   IO sizes?


RBPE> Sure enough, I messed up (I had no tuning to get the above data); So
RBPE> 1 MB was my max transfer sizes. Using 8MB I now see:

RBPE>    Bytes   Elapse of phys IO   Size
RBPE>    Sent
RBPE>
RBPE>    8 MB;   3576 ms of phys; avg sz : 16 KB; throughput 2 MB/s
RBPE>    9 MB;   1861 ms of phys; avg sz : 32 KB; throughput 4 MB/s
RBPE>    31 MB;  3450 ms of phys; avg sz : 64 KB; throughput 8 MB/s
RBPE>    78 MB;  4932 ms of phys; avg sz : 128 KB; throughput 15 MB/s
RBPE>    124 MB; 4903 ms of phys; avg sz : 256 KB; throughput 25 MB/s
RBPE>    178 MB; 4868 ms of phys; avg sz : 512 KB; throughput 36 MB/s
RBPE>    226 MB; 4824 ms of phys; avg sz : 1024 KB; throughput 46 MB/s
RBPE>    226 MB; 4816 ms of phys; avg sz : 2048 KB; throughput 54 MB/s (was 46 MB/s)
RBPE>    32 MB;  686 ms of phys; avg sz : 4096 KB; throughput 58 MB/s (was 46 MB/s)
RBPE>    224 MB; 4741 ms of phys; avg sz : 8192 KB; throughput 59 MB/s (was 47 MB/s)
RBPE>    272 MB; 4336 ms of phys; avg sz : 16384 KB; throughput 58 MB/s (new data)
RBPE>    288 MB; 4327 ms of phys; avg sz : 32768 KB; throughput 59 MB/s (new data)

RBPE> Data  was corrected  after  it was pointed out   that, physio will  be
RBPE> throttled by maxphys. New data was obtained after settings

RBPE> /etc/system: set maxphys=8338608
RBPE> /kernel/drv/sd.conf sd_max_xfer_size=0x80
RBPE> /kernel/drv/ssd.cond ssd_max_xfer_size=0x80

RBPE> And setting un_max_xfer_size in "struct sd_lun".
RBPE> That address was figured out using dtrace and knowing that
RBPE> sdmin() calls ddi_get_soft_state (details avail upon request).
RBPE> 
RBPE> And of course disabling the write cache (using format -e)

RBPE> With this in place I verified that each sdwrite() up to 8M 
RBPE> would lead to a single biodone interrupts using this:

RBPE> dtrace -n 'biodone:entry,sdwrite:[EMAIL PROTECTED], 
stack(20)]=count()}'

RBPE> Note that for 16M and 32M raw device writes, each default_physio
RBPE> will issue a series of 8M I/O. And so we don't
RBPE> expect any more throughput from that.


RBPE> The script  used  to measure  the  rates  (phys.d)  was also
RBPE> modified since I was  counting the bytes  before the I/O had
RBPE> completed and that made a  big difference for the very large
RBPE> I/O sizes.

RBPE> If you take the  8M case, the  above rates correspond to the
RBPE> time it takes to issue  and wait for a  single 8M I/O to the
RBPE> sd driver. So this time certainly does include  1 seek and ~
RBPE> 0.13 seconds  of data transfer, then  the time to respond to
RBPE> the  interrupt, finally the wakeup  of the thread waiting in
RBPE> default_physio(). Given that the data  transfer rate using 4
RBPE> MB is very close to  the one using 8  MB, I'd say that at 60
RBPE> MB/sec all the fixed-cost  element are well amortized.  So I
RBPE> would conclude from this that  the limiting factor is now at
RBPE> the  device itself or  on the data  channel between the disk
RBPE> and the host.


RBPE> Now recall that the throughput that ZFS gets during an
RBPE> spa_sync when submitted to a single dd and knowing that ZFS
RBPE> will work with 128K I/O:

RBPE>1431 MB; 23723 ms of spa_sync; avg sz : 127 KB; throughput 60 MB/s
RBPE>1387 MB; 23044 ms of spa_sync; avg sz : 127 KB; throughput 60 MB/s
RBPE>2680 MB; 44209 ms of spa_sync; avg sz : 127 KB; throughput 60 MB/s
RBPE>1359 MB; 24223 ms of spa_sync; avg sz : 127 KB; throughput 56 MB/s
RBPE>1143 MB; 19183 ms of spa_sync; avg sz : 126 KB; throughput 59 MB/s


RBPE> My disk is

RBPE>.

Is it over FC or just SCSI/SAS?

I have to try again with SAS/SCSI - maybe due to more overhead in FC,
larger I/Os give better results there than on SCSI?

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[9]: [zfs-discuss] Re: Re: Due to 128KB limit in ZFS it can't saturate disks

2006-05-23 Thread Robert Milkowski
Hello Roch,

Friday, May 19, 2006, 3:53:35 PM, you wrote:

RBPE> Robert Milkowski writes:
 >> Hello Roch,
 >> 
 >> Monday, May 15, 2006, 3:23:14 PM, you wrote:
 >> 
 >> RBPE> The question put forth is whether the ZFS 128K blocksize is sufficient
 >> RBPE> to saturate a regular disk. There is great body of evidence that shows
 >> RBPE> that the bigger the write sizes and matching large FS clustersize lead
 >> RBPE> to more throughput. The counter point is that ZFS schedules it's I/O
 >> RBPE> like nothing else seen before and manages to sature a single disk
 >> RBPE> using enough concurrent 128K I/O.
 >> 
 >> Nevertheless I get much more throughput using UFS and writing with
 >> large block than using ZFS on the same disk. And the difference is
 >> actually quite big in favor of UFS.
 >> 

RBPE> Absolutely. Isn't this issue though ?

RBPE> 6415647 Sequential writing is jumping

RBPE> We will have to fix this to allow dd to get more throughput.
RBPE> I'm pretty sure the fix won't need to increase the
RBPE> blocksize though.

Maybe - but it also means that until this is addressed it doesn't
make much sense to compare ZFS to other filesystems on sequential
writes... The question is how well the above problem is understood and
when it is going to be corrected. And why, in your test cases which are
similar to mine, do you not see dd to the raw device being faster by
any significant factor? (Again, maybe you are using SCSI while I use
FC.)






-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Misc questions

2006-05-23 Thread Jeff Victor
Some miscellaneous questions:

* When you share a ZFS fs via NFS, what happens to files and filesystems that 
exceed the limits of NFS?

* Is there a recommendation or some guidelines to help answer the question "how 
full should a pool be before deciding it's time to add disk space to the pool?"

* Migrating pre-ZFS backups to ZFS backups: is there a better method than 
"restore the old backup into a ZFS fs, then back it up using 'zfs send'"?

* Are ZFS quotas enforced against the compressed size of data, or the 
uncompressed size?  The former seems to imply that the following would create a mess:
  1) Turn on compression
  2) Store data in the pool until the pool is almost full
  3) Turn off compression
  4) Read and re-write every file (thus expanding each file)

* What block sizes will ZFS use?  Is there an explanation somewhere about its 
method of choosing blocksize for a particular workload?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE filesystem ownership

2006-05-23 Thread Mark Shellenbaum

Darren J Moffat wrote:

James Dickens wrote:


Hi

I think ZFS should add the concept of ownership to a ZFS filesystem,
so if i create a filesystem for joe, he should be able to use his
space how ever he see's fit, if he wants to turn on compression or
take 5000 snapshots its his filesystem, let him. If he wants to
destroy snapshots, he created them it should be allowed, but he should
not be allowed to do the same with carol's filesystem. The current
filesystem management is not fine grained enough to deal with this. Of
course if we don't assign an owner the filesystem should perform much
like it does today.



Yes we do need something like this.

This is already covered by the following CRs 6280676, 6421209.




These RFEs are currently being investigated.  The basic idea is that 
an administrator will be allowed to grant specific users/groups the 
ability to perform various ZFS administrative tasks, such as create, 
destroy, clone, changing properties, and so on.


Once the ZFS team is in agreement as to what the interfaces should be, 
I will forward it to zfs-discuss for further feedback.


  -Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Misc questions

2006-05-23 Thread Matthew Ahrens
On Tue, May 23, 2006 at 02:34:30PM -0700, Jeff Victor wrote:
> * When you share a ZFS fs via NFS, what happens to files and
> filesystems that exceed the limits of NFS?

What limits do you have in mind?  I'm not an NFS expert, but I think
that NFSv4 (and probably v3) supports 64-bit file sizes, so there would
be no limit mismatch there.

> * Is there a recommendation or some guidelines to help answer the
> question "how full should a pool be before deciding it's time add disk
> space to a pool?"

I'm not sure, but I'd guess around 90%.

> * Migrating pre-ZFS backups to ZFS backups: is there a better method
> than "restore the old backup into a ZFS fs, then back it up using "zfs
> send"?

No.

> * Are ZFS quotas enforced assuming that compressed data is compressed,
> or uncompressed?

Quotas apply to the amount of space used, after compression.  This is
the space reported by 'zfs list', 'zfs get used', 'df', 'du', etc.
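
For example (the dataset name is just for illustration), the
post-compression numbers are the ones charged against the quota:

# zfs set compression=on pool/home/joe
# zfs set quota=10g pool/home/joe
# zfs get used,available,compressratio pool/home/joe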

> The former seems to imply that the following would create a mess:
>   1) Turn on compression
>   2) Store data in the pool until the pool is almost full
>   3) Turn off compression
>   4) Read and re-write every file (thus expanding each file)

Since this example doesn't involve quotas, their behavior is not
applicable here.  In this example, there will be insufficient space in
the pool to store your data, so your write operation will fail with
ENOSPC.  Perhaps a messy situation, but I don't see any alternative.  If
this is a concern, don't use compression.

If you filled up a filesystem's quota rather than a pool, the behavior
would be the same except you would get EDQUOT rather than ENOSPC.

> * What block sizes will ZFS use?  Is there an explanation somewhere
> about its method of choosing blocksize for a particular workload?

Files smaller than 128k will be stored in a single block, whose size is
rounded up to the nearest sector (512 bytes).  Files larger than 128k
will be stored in multiple 128k blocks (unless the recordsize property
has been set -- see the zfs(1m) manpage for an explanation of this).
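
For example, a database with fixed-size 8K records is the classic case for
setting that property (names and sizes here are just an illustration); note
it only affects files written after it is set:

# zfs create pool/db
# zfs set recordsize=8k pool/db
# zfs get recordsize pool/db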

Thanks for using zfs!
--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss