[zfs-discuss] Questions

2008-06-26 Thread Mickaël ABISROR

>> Hi all,
>>
>> here are some questions about ZFS
>>
>> When accessing raw volumes under ZFS, do we go through the cache or
>> not? If yes, can we disable it?
>> They use Veritas QuickIO (ODM) and they want to compare with ZFS and 
>> also ZFS with and without cache.
>>
>> Do we have some benchmarks or figures about the CPU overhead of
>> compression with ZFS? And for checksumming?
>>
>> I've seen 1 GHz/500 MBytes of data for checksumming, but that seems
>> huge to me...
>
> What about updating ZFS as well? Do we update with packages, patches,
> kernel patches or a Solaris release?
>
>>
>> Cheers
>>
>> Mike
>



Re: [zfs-discuss] zfs send and recordsize

2008-06-26 Thread Peter Boros
Hi,

Is there any way to do this in Solaris 10? For example, create an empty
zfs filesystem before receiving, and then receive the stream without
touching the metadata?
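
For illustration, a minimal sketch of the situation being asked about (dataset
names are hypothetical): without -R, the recordsize property is not carried
over, and re-setting it on the received filesystem only affects blocks written
from then on:

  zfs send tank/db@snap1 | zfs receive backup/db
  zfs set recordsize=8k backup/db    # applies only to data written afterwards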

On Jun 25, 2008, at 2:37 PM, James Andrewartha wrote:

> Peter Boros wrote:
>> Hi James,
>>
>> Of course, changing the recordsize was the first thing I did, after I
>> created the original filesystem. I copied some files on it, made a
>> snapshot, and then performed the zfs send (with the decreased
>> recordsize). After I performed a zfs receive, the recordsize was the
>> default (128k) on the new filesystem.
>
> Ah, I was using the -R option to zfs send, which does what you want.  
> It's
> been in since nv77, PSARC/2007/574. To quote the manpage:
>  -R   Generate a replication stream package, which will
>       replicate the specified filesystem, and all descendant
>       file systems, up to the named snapshot. When received,
>       all properties, snapshots, descendent file systems,
>       and clones are preserved.
>
>       If the -i or -I flags are used in conjunction with the
>       -R flag, an incremental replication stream is generated.
>       The current values of properties, and current snapshot
>       and file system names are set when the stream is
>       received. If the -F flag is specified when this stream
>       is received, snapshots and file systems that do not
>       exist on the sending side are destroyed.
>
> -- 
> James Andrewartha

Peter Boros
Technical Specialist
Sun Microsystems Inc.
Hungary-1027 Budapest, Kapas u. 11-15.
Phone: +36 1 489-8900



Re: [zfs-discuss] raid card vs zfs

2008-06-26 Thread Mertol Ozyoney
Good analysis.
Generally the X4500 can stream 2.5 GB/sec disk to RAM and around 1.2-1.4 GB/sec
RAM to disk.
Network-to-disk or disk-to-network throughput most of the time depends on the
network interface.

With InfiniBand we are seeing 1 GB/sec transfer speeds [2 OS disks, 6 spares].
With dual IB, initial results show a good sustained 1.2 GB/sec throughput.

When you compare this performance with anything offered today, it's very cost
effective.
Changing to PCI-E and offering more independent buses will increase the
performance slightly, but today the X4500 already offers more than enough
performance for most workloads.

Mertol 




Mertol Ozyoney 
Storage Practice - Sales Manager

Sun Microsystems, TR
Istanbul TR
Phone +902123352200
Mobile +905339310752
Fax +90212335
Email [EMAIL PROTECTED]



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bob Friesenhahn
Sent: Wednesday, June 25, 2008 6:45 PM
To: Mertol Ozyoney
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] raid card vs zfs

I see that the configuration tested in this X4500 writeup only uses 
the four built-in gigabit ethernet interfaces.  This places a natural 
limit on the amount of data which can stream from the system.  For 
local host access, I am achieving this level of read performance using 
one StorageTek 2540 (6 mirror pairs) and a single reading process. 
The X4500 with 48 drives should be capable of far more.

The X4500 has two expansion bus slots but they are only 64-bit 133MHz 
PCI-X so it seems that the ability to add bandwidth via more 
interfaces is limited.  A logical improvement to the design is to 
offer PCI-E slots which can support 10Gbit ethernet, Infiniband, or 
Fiber Channel cards so that more of the internal disk bandwidth is 
available to "power user" type clients.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/



Re: [zfs-discuss] raid card vs zfs

2008-06-26 Thread Mertol Ozyoney
Please note that I/O speeds exceeding 1 GB/sec can be limited by several
components, including the OS or device drivers.
The current maximum performance I have seen is around 1.25 GB/sec through dual IB.
Anyway, that's great real-world performance, considering the price and size
of the unit.



Mertol Ozyoney 
Storage Practice - Sales Manager

Sun Microsystems, TR
Istanbul TR
Phone +902123352200
Mobile +905339310752
Fax +90212335
Email [EMAIL PROTECTED]



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Lida Horn
Sent: Wednesday, June 25, 2008 11:14 PM
To: Tim
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] raid card vs zfs

Tim wrote:
>
>
> On Wed, Jun 25, 2008 at 10:44 AM, Bob Friesenhahn 
> <[EMAIL PROTECTED] > 
> wrote:
>
> I see that the configuration tested in this X4500 writeup only uses
> the four built-in gigabit ethernet interfaces.  This places a natural
> limit on the amount of data which can stream from the system.  For
> local host access, I am achieving this level of read performance using
> one StorageTek 2540 (6 mirror pairs) and a single reading process.
> The X4500 with 48 drives should be capable of far more.
>
> The X4500 has two expansion bus slots but they are only 64-bit 133MHz
> PCI-X so it seems that the ability to add bandwidth via more
> interfaces is limited.  A logical improvement to the design is to
> offer PCI-E slots which can support 10Gbit ethernet, Infiniband, or
> Fiber Channel cards so that more of the internal disk bandwidth is
> available to "power user" type clients.
>
> Bob
> ==
> Bob Friesenhahn
> [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
>
>
>
>
> Uhhh... 64bit/133MHz is 17Gbit/sec.  I *HIGHLY* doubt that bus will be 
> a limit.  Without some serious offloading, you aren't pushing that 
> amount of bandwidth out the card.  Most systems I've seen top out 
> around 6 Gbit/sec with current drivers.
Ummm, 133MHz is just slightly above 1/8 GHz, and 64 bits is 8 x 8 bits.
Multiplying yields roughly 8 Gbit/sec, or about 1 GByte/sec.
So even if you have two independent PCI-X (64-bit/133MHz) slots, that would
yield at best 2 GB/sec.
The SunFire X4500 is capable of doing >3 GB/sec of I/O to the disks, so you
would still be network-bandwidth limited.  Of course, if you are using ZFS
and/or mirroring, that >3 GB/sec from the disks goes down dramatically, so
for practical purposes the 2 GB/sec limit may well be enough.
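
Spelling out the arithmetic from the figures above (theoretical peaks, before
protocol overhead):

  64 bit x 133 MHz        =  8.5 Gbit/s  ~ 1.06 GB/s per PCI-X slot
  2 slots x ~1.06 GB/s    ~  2.1 GB/s of expansion-slot bandwidth
  4 x 1 Gbit/s GbE        ~  0.5 GB/s of built-in network bandwidth
  internal disk bandwidth >  3 GB/s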



Re: [zfs-discuss] Questions

2008-06-26 Thread Richard Elling
Mickaël ABISROR wrote:
>>> Hi all,
>>>
>>> here are some questions about ZFS
>>>
>>> When accessing raw volumes under ZFS, do we go through the cache or
>>> not? If yes, can we disable it?
>>>   

I believe Zvols are uncached.

>>> They use Veritas QuickIO (ODM) and they want to compare with ZFS and 
>>> also ZFS with and without cache.
>>>   

Please see the discussions on "directio" at
http://www.solarisinternals.com//wiki/index.php?title=Category:ZFS
http://blogs.sun.com/bobs/entry/one_i_o_two_i
http://blogs.sun.com/bobs/entry/oracle_i_o_supply_and
http://www.solarisinternals.com/wiki/index.php/Direct_I/O
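
As an aside, on builds that have the primarycache property (which may postdate
the release being discussed here), ARC caching can be restricted per dataset;
a minimal sketch with a hypothetical pool name:

  # create a 10 GB raw volume (zvol) and limit ARC caching to metadata only
  zfs create -V 10g tank/rawvol
  zfs set primarycache=metadata tank/rawvol

  # verify
  zfs get primarycache tank/rawvol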


>>> Do we have some benchmarks or figures about the CPU overhead of
>>> compression with ZFS? And for checksumming?
>>>
>>> I've seen 1 GHz/500 MBytes of data for checksumming, but that seems
>>> huge to me...
>>>   

I believe you mean 1 GHz per 500 MBytes/s, which is reasonable.
That is a rather substantial bandwidth for a database.
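
One hedged way to measure the overhead on your own workload is to time the same
writes into datasets that differ only in those properties (dataset names are
hypothetical; disabling checksums is only for a throwaway benchmark dataset):

  zfs create tank/bench-default                    # defaults: checksum on, no compression
  zfs create -o compression=lzjb tank/bench-lzjb   # compression enabled
  zfs create -o checksum=off tank/bench-nocksum    # checksums off (benchmark only)

  # write the same data into each, compare elapsed and CPU time, then check
  # how well the compressed dataset actually compressed
  zfs get compressratio tank/bench-lzjb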

>> What about updating ZFS as well? Do we update with packages, patches,
>> kernel patches or a Solaris release?
>> 

Yes.  The exact mix depends on your OS and release.
 -- richard



Re: [zfs-discuss] Oops: zfs-auto-snapshot with at scheduling

2008-06-26 Thread Nils Goroll
Hi all,

I'll attach a new version of zfs-auto-snapshot including some more
improvements, and probably some new bugs. Seriously, I have
tested it, but certainly not all functionality, so please let me know
about any (new) problems you come across.

Excerpt from the change log:

  - Added support to schedule using at(1); see
    README.zfs-auto-snapshot.txt (a sketch of the idea follows below)
  - take_snapshot will only run if the option "manual" is given or the
    FMRI is online
  - added command line action "unschedule"
  - retry all jobs (at and cron) using a short-term reschedule
    (AT_RETRY="now + 5 minutes") if they can't be run due to a
    temporary condition such as
    - smf service offline or uninitialized
    - pool being scrubbed
  - added command line action "list" to list configured cron and at
    jobs (mainly to get human-readable information about which
    snapshot is due when)
  - added validation functions for various SMF properties
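
Not the actual script, just a minimal sketch of the at(1) scheduling and retry
idea described above (paths and dataset names are hypothetical):

  #!/bin/sh
  # /var/tmp/snapjob.sh: take a snapshot; on failure, re-queue this same job
  # a few minutes out (the spirit of the AT_RETRY behaviour above)
  if ! /usr/sbin/zfs snapshot tank/home@auto-`date +%Y%m%d%H%M`; then
          /usr/bin/at -f /var/tmp/snapjob.sh now + 5 minutes
  fi

  # schedule the first run an hour out, then list what is queued
  at -f /var/tmp/snapjob.sh now + 1 hour
  at -l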

Cheers,

Nils
 
 

zfs-auto-snapshot-0.10_atjobs2.tar.bz2
Description: BZip2 compressed data


[zfs-discuss] How about an zfs-auto-snapshot project

2008-06-26 Thread Nils Goroll
And how about making this an official project?
 
 


Re: [zfs-discuss] Oops: zfs-auto-snapshot with at scheduling

2008-06-26 Thread Tim Foster
Hey Nils,

On Thu, 2008-06-26 at 09:37 -0700, Nils Goroll wrote:
> I'll attach a new version of zfs-auto-snapshot including some more
> improvements, and probably some new bugs.

Just to let you know that I *have* seen your mails on this and really
appreciate getting the feedback, but I just haven't had a chance to dig
into your code yet.  I wanted to spend time following up properly,
rather than giving you a quick (but probably un-researched!) answer :-)

  I'll try to follow up on the list tomorrow...

cheers,
tim



>  Seriously, I have
> tested it, but certainly not all functionality, so please let me know
> about any (new) problems you come across.
> 
> Excerpt from the change log:
> 
>   - Added support to schedule using at(1), see
> README.zfs-auto-snapshot.txt
>   - take_snapshot will only run if the option "manual" is given or the
> FMRI is online
>   - added command line action "unschedule"
>   - retry all jobs (at and cron) using a short-term reschedule
> (AT_RETRY="now + 5 minutes") if they can't be run due to a
> temporary condition such as
> - smf service offline or uninitialized
> - pool being scrubbed
>   - added command line action "list" to list configured cron and at
> jobs (mainly to get human readable information about which
> snapshot is due when)
>   - added validation functions for various SMF properties
> 
> Cheers,
> 
> Nils
>  
> 



[zfs-discuss] zfs send/receive locks system threads (Bug?)

2008-06-26 Thread Paul
Dear experts, 

when migrating from a 6-disk to a 4-disk raidz1 zpool with 'zfs send -R | zfs
receive', the system's responsiveness dropped noticeably.
The run queue increased to 28-32, and 'prstat -Lm' showed that several system
threads were locked repeatedly - notably fmd, devfsd and many of the
svc.{configd,startd} processes.
The system is an E4500 with 4 UltraSPARC II CPUs, so I added 6 CPUs on the fly,
which helped a bit, but the run queue was still about 10. Understandably,
things were worst when the migration was processing gzip-compressed FS's, but
surprisingly the run queue was >2 with uncompressed FS's, too.
OS is SXDE snv79b.

2nd observation was that even when copying gzip'ed FS's, the CPUs had about
30% idle cycles and zero wio%.
How come? Is this (locking of system threads -> increased run queue) a bug or
expected behaviour?

3rd: The zpool property 'delegation' was not copied. Same Q: bug or expected?
Note: migration was from zpool version 8 to zpool version 10.

4th: Performance was way below my expectations.  I set up regular recursive
snapshots via cron, so every FS had many (42) snapshots, many empty or of small
size. Even the empty snaps took 5 seconds each (34 bytes/5 sec), and max.
throughput was 8 MB/s with the few snaps that had some GB of data. I understand
that even copying an empty snapshot needs some time to process, and I guess the
reason for the poor performance is the locking described above. Still the same
Q arises: Is a max. throughput of 8 MB/s when writing to a raidz1 of 4 FC disks
ok with send/receive?

Workaround: If you recursively send/receive a zpool with many snapshots,
consider decreasing the # of snapshots first. E.g. when you have rotating
snapshots like every 10 minutes ([EMAIL PROTECTED],min-10,min-20,...}) and
every hour ([EMAIL PROTECTED],hourly-01,...}), delete these snapshots prior
to the send/receive operation.
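
A hedged sketch of that workaround (pool and snapshot names are hypothetical):

  # drop the rotating 10-minute snapshots on the whole pool before migrating
  for s in min-00 min-10 min-20 min-30 min-40 min-50; do
          zfs destroy -r tank@$s
  done

  # then take a fresh recursive snapshot and replicate it
  zfs snapshot -r tank@migrate
  zfs send -R tank@migrate | zfs receive -F -d newpool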

Thanks in advance, Paul
 
 


[zfs-discuss] 'zfs list' output showing incorrect mountpoint after boot -Z

2008-06-26 Thread Sumit Gupta
I installed snv_92 with a ZFS root. Then I took a snapshot of the root and
cloned it. Now I am booting from the clone using the -Z option. The system
boots fine from the clone, but 'zfs list' still shows that '/' is mounted on
the original mountpoint instead of the clone, even though the output of
'mount' shows that '/' is mounted on the clone. Here is the output.

# zfs list
NAME                          USED  AVAIL  REFER  MOUNTPOINT
rpool                        22.1G  44.8G    63K  /rpool
rpool/ROOT                   6.11G  44.8G    18K  legacy
rpool/ROOT/snv_92            6.10G  44.8G  6.09G  /
rpool/ROOT/[EMAIL PROTECTED] 5.55M      -  6.09G  -
rpool/ROOT/snv_92.backup     8.00M  44.8G  6.10G  legacy
rpool/dump                   8.00G  44.8G  8.00G  -
rpool/export                   38K  44.8G    20K  /export
rpool/export/home              18K  44.8G    18K  /export/home
rpool/swap                   8.00G  52.8G  2.55M  -
# mount
/ on rpool/ROOT/snv_92.backup read/write/setuid/devices/dev=3f50002 on
Wed Dec 31 16:00:00 1969

[EMAIL PROTECTED] is the snapshot of the original installation on snv_92.
snv_92.backup is the clone. You can see that '/' is mounted on snv_92.backup,
but in the zfs list output it still shows that '/' is mounted on snv_92.

Is that a bug or did I miss a step?

Thanks
Sumit
 
 


Re: [zfs-discuss] 'zfs list' output showing incorrect mountpoint after boot -Z

2008-06-26 Thread Mark J Musante

On Jun 26, 2008, at 8:11 PM, Sumit Gupta wrote:

> [EMAIL PROTECTED] is the snapshot of the original installation on  
> snv_92. snv_92.backup is the clone. You can see that the / is  
> mounted on snv_92.backup but in zfs list output it still shows that  
> '/' is mounted on snv_92.
>

It's showing where it *would* mount, if it were mounted.  It's not  
mounted, however (type 'zfs mount' to see a list of currently mounted  
datasets), so everything is OK.

To see if a dataset will be automatically mounted at boot, look at the  
'canmount' property of the dataset.  Only those datasets with a  
canmount set to 'on' will be mounted.  If the property is 'off', then  
it will not be mounted.  If it is set to 'noauto', then it will only  
be mounted if it is the root dataset (or the /var dataset).
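
For illustration, using the datasets from the listing above:

  # show only the datasets that are actually mounted right now
  zfs mount

  # check whether each boot environment would be mounted automatically
  zfs get canmount rpool/ROOT/snv_92 rpool/ROOT/snv_92.backup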


Regards,
markm
