Re: [zfs-discuss] Case study/recommended ZFS setup for home file

2008-07-10 Thread James C. McPherson
Florin Iucha wrote:
> On Wed, Jul 09, 2008 at 08:42:37PM -0700, Bohdan Tashchuk wrote:
>>> I cannot use OpenSolaris 2008.05 since it does not
>>> recognize the SATA disks attached to the southbridge.
>>> A fix for this problem went into build 93.
>> Which forum/mailing list discusses SATA issues like the above?
> 
> #opensolaris in freenode.net
> 
> I booted from the OpenSolaris LiveCD/installer, and noticing the lack
> of available disks, I cried for help on #irc.  There were a few
> helpful people that gave me some commands to run, to try and get this
> going.  After their efforts failed, I googled for solaris and SB600
> (this is the ATI SouthBridge chip) and found a forum posting from
> another user, back in February, and the hit in bugzilla, pointing to
> the resolution of the bug, with the target being snv_93.


For the reference of other people, the bug in question is

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6665032
6665032 ahci driver doesn't work for ATI SB600 AHCI chipset (ASUS M2A-VM)

which is fixed in snv_93




James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slog device

2008-07-10 Thread Ross
Yes, but that talks about Flash systems, and the end of the year.  My concern 
is whether Sun will also be releasing flash add-on cards that we can make use 
of elsewhere, including on already purchased Sun kit.
 
Much as I'd love to see Sun add a lightning fast flash boosted server to their 
x64 range, that's not going to help me with ZFS and my existing hardware.  I'd 
really like to know if Sun have any plans for a PCI-X or PCI-E flash card with 
Solaris drivers, and if they don't have any plans along those lines, I'd love 
to see some Solaris drivers for the Fusion-io product.
 
Ross
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Case study/recommended ZFS setup for home file server

2008-07-10 Thread Ross
My recommendation:  buy a small, cheap 2.5" SATA hard drive (or 1.8" SSD) and 
use that as your boot volume, I'd even bolt it to the side of your case if you 
have to.  Then use the whole of your three large disks as a raid-z set.
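
For example, a rough sketch of the commands (the device names below are just 
placeholders -- substitute whatever format(1M) reports on your box):

# zpool create tank raidz c2t0d0 c2t1d0 c2t2d0
# zfs create tank/home
# zfs create tank/photos

Giving ZFS the whole disks (no slice suffix) also lets it manage the drives' 
write cache itself.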

If I were in your shoes I would also have bought 4 drives for ZFS instead of 3, 
and gone for raid-z2.  

And finally, I don't know how much room you have in your current case, but if 
you're ever looking for one that takes more drives I can highly recommend the 
Antec P182.  I've got 6x 1TB drives in my home server and in that case it's so 
quiet I can't even hear it turn on.  My watch ticking easily drowns out this 
server.

PS.  If you're going to be using CIFS, avoid build 93.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Case study/recommended ZFS setup for home file server

2008-07-10 Thread Fajar A. Nugraha

Brandon High wrote:

On Wed, Jul 9, 2008 at 3:37 PM, Florin Iucha <[EMAIL PROTECTED]> wrote:
  

The question is, how should I partition the drives, and what tuning
parameters should I use for the pools and file systems?  From reading
the best practices guides [1], [2], it seems that I cannot have the
root file system on a RAID-5 pool, but it has to be a separate storage
pool.  This seems to be slightly at odds with the suggestion of using
whole-disks for ZFS, not just slices/partitions.



The reason for using a whole disk is that ZFS will turn on the drive's
cache. When using slices, the cache is normally disabled. If all
slices are using ZFS, you can turn the drive cache back on. I don't
think it happens by default right now, but you can set it manually.

  
As I recall, using a whole disk for ZFS also changes the disk label to EFI, 
meaning you can't boot from it.
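
Re the write cache point above: if you do end up using slices, a rough sketch 
of turning the drive's write cache back on by hand (this is format(1M) in 
expert mode; I'm going from memory on the menu names, so double-check on your 
build, and only do it if every slice on the disk is ZFS):

# format -e
(select the disk)
format> cache
cache> write_cache
write_cache> enable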



Another alternative is to use an IDE to Compact Flash adapter, and
boot off of flash.

Just curious, what will that flash contain?
e.g. will it be similar to Linux's /boot, or will it contain the full 
Solaris root?

How do you manage redundancy (e.g. mirror) for that boot device?


My plan right now is to create a 20 GB and a 720 GB slice on each
disk, then create two storage pools, one RAID-1 (20 GB) and one RAID-5
(1.440 TB).  Create the root, var, usr and opt file systems in the
first pool, and home, library and photos in the second.  


Good plan.
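
A rough sketch of what that plan looks like as commands (slice and device 
names are just placeholders; with three disks the small pool would be a 
three-way mirror):

# zpool create rpool mirror c1t0d0s0 c1t1d0s0 c1t2d0s0
# zpool create tank raidz c1t0d0s1 c1t1d0s1 c1t2d0s1
# zfs create tank/home
# zfs create tank/library
# zfs create tank/photos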

I hope I
won't need swap, but I could create three 1 GB slices (one on each
disk) for that.



If you have enough memory (say 4gb) you probably won't need swap. I
believe swap can live in a ZFS pool now too, so you won't necesarily
need another slice. You'll just have RAID-Z protected swap.
  

Really? I think Solaris still needs non-ZFS swap for the default dump device.

Regards,

Fajar


smime.p7s
Description: S/MIME Cryptographic Signature
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-10 Thread Ross
I think it's a cracking upgrade Richard.  I was hoping Sun would do something 
like this, so it's great to see it arrive.

As others have said though, I think Sun are missing a trick by not working with 
Vmetro or Fusion-io to add nvram cards to the range now.  In particular, if Sun 
were to work with Fusion-io and add Solaris drivers for the ioDrive, you'd be 
in a position right now to offer a 48TB server with 64GB of read cache, and 
80GB of write cache.

You could even offer the same card on the smaller x4240.  Can you imagine how 
well those machines would work as NFS servers?  Either one would make a superb 
NFS storage platform for VMware:  You've got incredible performance, ZFS 
snapshots for backups, and ZFS send/receive to replicate the data elsewhere.  
NetApp and EMC charge a small fortune for a NAS that can do all that, and they 
don't offer anywhere near that amount of fast cache.  Both servers would take 
Infiniband too, which is dirt cheap these days at $125 a card, is supported by 
VMware, and particularly on the smaller server, is way faster than anything EMC 
or NetApp offer.

As a NFS storage platform, you'd be beating EMC and NetApp on price, spindle 
count, features and performance.  I really hope somebody at Sun considers this, 
and thinks about expanding the "What can you do with an x4540" section on the 
website to include VMware. 

Ross
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slog device

2008-07-10 Thread Ross
The problem with that is that I'd need to mirror them to guard against failure, 
I'd lose storage capacity, and the peak throughput would be horrible when 
compared to the array.

I'd be sacrificing streaming speed for random write speed, whereas with a PCIe 
nvram card I can have my cake and eat it.

ps.  Yes, I'm greedy :D
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-10 Thread Ross
Heh, I like the way you think Tim.  I'm sure Sun hate people like us.  The 
first thing I tested when I had an x4500 on trial was to make sure an off the 
shelf 1TB disk worked in it :)
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-10 Thread Tommaso Boccali
.. And the answer was yes, I hope.  We are seriously thinking of buying
48 1 TB disks to replace those in a 1-year-old Thumper.

Please confirm it again :)

2008/7/10, Ross <[EMAIL PROTECTED]>:
> Heh, I like the way you think Tim.  I'm sure Sun hate people like us.  The
> first thing I tested when I had an x4500 on trial was to make sure an off
> the shelf 1TB disk worked in it :)
>
>
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>


-- 
Tommaso Boccali
INFN Pisa
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Case study/recommended ZFS setup for home file server

2008-07-10 Thread Darren J Moffat
Fajar A. Nugraha wrote:
>> If you have enough memory (say 4gb) you probably won't need swap. I
>> believe swap can live in a ZFS pool now too, so you won't necesarily
>> need another slice. You'll just have RAID-Z protected swap.
>>   
> Really? I think solaris still needs non-zfs swap for default dump device.

No longer true, you can swap and dump to a ZVOL (but not the same one). 
  This change came in after OpenSolaris 2008.05 LiveCD/Install was cut 
so it doesn't take advantage of that.   There was a big long thread 
cross posted to this list about it just recently.

The current SX:CE installer (ie Nevada) uses ZVOL for swap and dump.
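
A rough sketch of setting it up by hand on a recent build (pool name and sizes 
are just placeholders; remember swap and dump must be two separate zvols):

# zfs create -V 2G rpool/swap
# swap -a /dev/zvol/dsk/rpool/swap
# zfs create -V 2G rpool/dump
# dumpadm -d /dev/zvol/dsk/rpool/dump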

-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Darren J Moffat
I regularly create new zfs filesystems or snapshots and I find it 
annoying that I have to type the full dataset name in all of those cases.

I propose we allow zfs(1) to infer the part of the dataset name up to the 
current working directory.  For example:

Today:

$ zfs create cube/builds/darrenm/bugs/6724478

With this proposal:

$ pwd
/cube/builds/darrenm/bugs
$ zfs create 6724478

Both of these would result in a new dataset cube/builds/darrenm/bugs/6724478

This will need some careful thought about how to deal with cases like this:

$ pwd
/cube/builds/
$ zfs create 6724478/test

What should that do?  Should it create cube/builds/6724478 and 
cube/builds/6724478/test?  Or should it fail?  -p already provides
some capabilities in this area.

Maybe the easiest way out of the ambiguity is to add a flag to zfs 
create for the partial dataset name, e.g.:

$ pwd
/cube/builds/darrenm/bugs
$ zfs create -c 6724478

Why "-c"?  -c is for "current directory"; "-p" (partial) is already taken to 
mean "create all non-existing parents" and "-r" (relative) is already used 
consistently as "recurse" in other zfs(1) commands (as well as lots of 
other places).

Alternately:

$ pwd
/cube/builds/darrenm/bugs
$ zfs mkdir 6724478

Which would act like mkdir does (including allowing a -p and -m flag 
with the same meaning as mkdir(1)) but creates datasets instead of 
directories.
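
In the meantime something close can be faked in the shell.  A minimal sketch 
(bash/ksh93; the function name is made up, and it only works when the current 
directory is exactly a dataset's mountpoint):

zmkfs() {
    # map the current directory back to its dataset via the mountpoint column
    parent=$(zfs list -H -o name,mountpoint | awk -v d="$PWD" '$2 == d {print $1}')
    [ -n "$parent" ] && zfs create "$parent/$1"
}

$ cd /cube/builds/darrenm/bugs
$ zmkfs 6724478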

Thoughts ?  Is this useful for anyone else ?  My above examples are some 
of the shorter dataset names I use, ones in my home directory can be 
even deeper.

-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] We have a driver for the MM-5425CN

2008-07-10 Thread Ross
Hey everybody,

Well, my pestering paid off.  I have a Solaris driver which you're welcome to 
download, but please be aware that it comes with NO SUPPORT WHATSOEVER.

I'm very grateful to the chap who provided this driver, please don't abuse his 
generosity by calling Micro Memory or Vmetro if you have any problems.

I've no idea which version of Solaris this was developed for, how many other 
cards it works with, or if it even works in the current version of Solaris.  
Use at your own risk.
http://www.averysilly.com/Micro_Memory_MM-5425CN.zip

Ross
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Mark Phalan
On Thu, 2008-07-10 at 11:42 +0100, Darren J Moffat wrote:
> I regularly create new zfs filesystems or snapshots and I find it 
> annoying that I have to type the full dataset name in all of those cases.
> 
> I propose we allow zfs(1) to infer the part of the dataset name upto the 
> current working directory.  For example:
> 
> Today:
> 
> $ zfs create cube/builds/darrenm/bugs/6724478
> 
> With this proposal:
> 
> $ pwd
> /cube/builds/darrenm/bugs
> $ zfs create 6724478
> 
> Both of these would result in a new dataset cube/builds/darrenm/6724478

I find this annoying as well. Another way that would help (but is fairly
orthogonal to your suggestion) would be to write a completion module for
zsh/bash/whatever that could tab-complete options to the z* commands
including zfs filesystems.
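
As a minimal sketch of the idea (bash; it just offers dataset names as 
completions for the last argument, nothing clever about per-subcommand 
options):

_zfs_names() {
    # complete against every dataset name zfs knows about
    COMPREPLY=( $(compgen -W "$(zfs list -H -o name)" -- "${COMP_WORDS[COMP_CWORD]}") )
}
complete -F _zfs_names zfs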

-M

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Tim Foster
On Thu, 2008-07-10 at 13:01 +0200, Mark Phalan wrote:
> > Both of these would result in a new dataset cube/builds/darrenm/6724478
> 
> I find this annoying as well. Another way that would help (but is fairly
> orthogonal to your suggestion) would be to write a completion module for
> zsh/bash/whatever that could -complete options to the z* commands
> including zfs filesystems.

Mark Musante (famous for recently beating the crap out of lu) wrote one
of these - 
http://www.sun.com/bigadmin/jsp/descFile.jsp?url=descAll/bash_tabcompletion_

I don't use bash, but would love a ksh93 version of this!

cheers,
tim



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Mark J Musante
On Thu, 10 Jul 2008, Mark Phalan wrote:

> I find this annoying as well. Another way that would help (but is fairly 
> orthogonal to your suggestion) would be to write a completion module for 
> zsh/bash/whatever that could -complete options to the z* commands 
> including zfs filesystems.

You mean something like this?

http://www.sun.com/bigadmin/jsp/descFile.jsp?url=descAll/bash_tabcompletion_


Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Mark Phalan
On Thu, 2008-07-10 at 07:12 -0400, Mark J Musante wrote:
> On Thu, 10 Jul 2008, Mark Phalan wrote:
> 
> > I find this annoying as well. Another way that would help (but is fairly 
> > orthogonal to your suggestion) would be to write a completion module for 
> > zsh/bash/whatever that could -complete options to the z* commands 
> > including zfs filesystems.
> 
> You mean something like this?
> 
> http://www.sun.com/bigadmin/jsp/descFile.jsp?url=descAll/bash_tabcompletion_

Yes! Exactly! Now I just need to re-write it for zsh..

Thanks,

-M

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Mark J Musante
On Thu, 10 Jul 2008, Tim Foster wrote:

> Mark Musante (famous for recently beating the crap out of lu)

Heh.  Although at this point it's hard to tell who's the beat-er and who's 
the beat-ee...


Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-10 Thread Tim
On Thu, Jul 10, 2008 at 3:37 AM, Ross <[EMAIL PROTECTED]> wrote:

> I think it's a cracking upgrade Richard.  I was hoping Sun would do
> something like this, so it's great to see it arrive.
>
> As others have said though, I think Sun are missing a trick by not working
> with Vmetro or Fusion-io to add nvram cards to the range now.  In
> particular, if Sun were to work with Fusion-io and add Solaris drivers for
> the ioDrive, you'd be in a position right now to offer a 48TB server with
> 64GB of read cache, and 80GB of write cache
>
> You could even offer the same card on the smaller x4240.  Can you imagine
> how well those machines would work as NFS servers?  Either one would make a
> superb NFS storage platform for VMware:  You've got incredible performance,
> ZFS snapshots for backups, and ZFS send/receive to replicate the data
> elsewhere.  NetApp and EMC charge a small fortune for a NAS that can do all
> that, and they don't offer anywhere near that amount of fast cache.  Both
> servers would take Infiniband too, which is dirt cheap these days at $125 a
> card, is supported by VMware, and particularly on the smaller server, is way
> faster than anything EMC or NetApp offer.
>
> As a NFS storage platform, you'd be beating EMC and NetApp on price,
> spindle count, features and performance.  I really hope somebody at Sun
> considers this, and thinks about expanding the "What can you do with an
> x4540" section on the website to include VMware.
>
> Ross
>
>
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>



I believe NetApp has several (valid) patents in this area that may be
preventing Sun from doing this.  Perhaps that's on the table for
cross-licensing negotiation talks?

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-10 Thread Ross
Oh god, I hope not.  A patent on fitting a card in a PCI-E slot, or using nvram 
with RAID (which raid controllers have been doing for years) would just be 
ridiculous.  This is nothing more than cache, and even with the American patent 
system I'd have thought it hard to get that past the obviousness test.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Case study/recommended ZFS setup for home file server

2008-07-10 Thread Florin Iucha
On Thu, Jul 10, 2008 at 12:47:26AM -0700, Ross wrote:
> My recommendation:  buy a small, cheap 2.5" SATA hard drive (or 1.8" SSD) and 
> use that as your boot volume, I'd even bolt it to the side of your case if 
> you have to.  Then use the whole of your three large disks as a raid-z set.

Yup, I'm going with 4GB of mirrored flash for root/var/usr and I'll keep
the main spindles only for data.

> If I were in your shoes I would also have bought 4 drives for ZFS instead of 
> 3, and gone for raid-z2.  

No room - Antec NSK-2440 - and too much power draw.  My server idles at
57-64 W (under Linux) and I'd like to keep it that way.

> And finally, I don't know how much room you have in your current case, but if 
> you're ever looking for one that takes more drives I can highly recommend the 
> Antec P182.  I've got 6x 1TB drives in my home server and in that case it's 
> so quiet I can't even hear it turn on.  My watch ticking easily drowns out 
> this server.

Heh - I do have the P180 as my workstation case.  But I don't have
that much room for servers 8^)

> PS.  If you're going to be using CIFS, avoid build 93.

Can you please give a link to the discussion, or a bug id?

Thanks,
florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163


pgpmP7PaXForT.pgp
Description: PGP signature
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Glaser, David
Hi all,

I'm a little (ok, a lot) confused on the whole zfs send/receive commands. I've 
seen mention of using zfs send between two different machines, but no good 
howto in order to make it work. I have one try-n-buy x4500 that we are trying 
to move data from onto a new x4500 that we've purchased. Right now I'm using 
rsync over ssh (via 1GB/s network) to copy the data but it is almost painfully 
slow (700GB over 24 hours). Yeah, it's a load of small files for the most part. 
Anyway, would zfs send/receive work better? Do you have to set up a service on 
the receiving machine in order to receive the zfs stream?

The machine is an x4500 running Solaris 10 u5.

Thanks
Dave

David Glaser
Systems Administrator
LSA Information Technology
University of Michigan
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] File System snapshot and Hierarchical Storage Management

2008-07-10 Thread Rick Matthews
Hi. I am working on the ADM project within OpenSolaris. ADM is a Hierarchical 
Storage Manager (HSM) for ZFS. An HSM serves several purposes. One is 
virtualization of the amount of disk space, by copying (archiving to other 
media) a file's data and then freeing that data space when needed. Another is 
as a near-continuous backup of users' data. The archive process will capture 
file data at a much shorter interval than the typical backup cycle. These 
archives can be used as "vault" copies, etc. A backup has occurred when the 
complete data has been captured, and the ZFS metadata referencing those 
archive components has been made and retained.

Long background. Whew. My question to the communities is what a user's or 
administrator's expectation would be when an HSM is used in conjunction with a 
snapshot-capable file system like ZFS, which uses a block-sharing technology 
for snapshots. The project team and I have some ideas, but would appreciate 
some real-life application and use case examples for validation and/or course 
correction to the ADM project.

Although I am posting to zfs-discuss and storage-discuss, I would prefer 
replies to storage-discuss.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Carson Gaspar
Darren J Moffat wrote:

> Today:
> 
> $ zfs create cube/builds/darrenm/bugs/6724478
> 
> With this proposal:
> 
> $ pwd
> /cube/builds/darrenm/bugs
> $ zfs create 6724478
> 
> Both of these would result in a new dataset cube/builds/darrenm/6724478
...
> Maybe the easiest way out of the ambiquity is to add a flag to zfs 
> create for the partial dataset name eg:
> 
> $ pwd
> /cube/builds/darrenm/bugs
> $ zfs create -c 6724478
> 
> Why "-c" ?  -c for "current directory"  "-p" partial is already taken to 
> mean "create all non existing parents" and "-r" relative is already used 
> consistently as "recurse" in other zfs(1) commands (as well as lots of 
> other places).

Why not "zfs create $PWD/6724478"?  Works today, traditional UNIX 
behaviour, no coding required.  Unless you're in some bizarroland shell 
(like csh?)...

-- 
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-10 Thread Spencer Shepler

On Jul 10, 2008, at 7:05 AM, Ross wrote:

> Oh god, I hope not.  A patent on fitting a card in a PCI-E slot, or  
> using nvram with RAID (which raid controllers have been doing for  
> years) would just be rediculous.  This is nothing more than cache,  
> and even with the American patent system I'd have though it hard to  
> get that past the obviousness test.

How quickly they forget.

Take a look at the Prestoserve User's Guide for a refresher...

http://docs.sun.com/app/docs/doc/801-4896-11



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-10 Thread Kyle McDonald
Tommaso Boccali wrote:
> .. And the answer was yes I hope. we are sriously thinking of buying
> 48 1 tb disk to replace those in a 1 year old thumper
>
> please confirm it again :)
>
>   
In my 15 years of experience with Sun products, I've never known one to care 
about drive brand, model, or firmware. If it was standards-compliant for both 
physical interface and protocol, the machine would use it, in my experience. 
This was mainly with host-attached JBOD though (which the x4500 and x4540 
are). In RAID arrays my guess is that it wouldn't care either, though you'd 
be opening yourself up to weird interactions between the array and the drive 
firmware if you didn't use a tested combination.

The drive carriers were a different story though. Some were easy to get. 
Others extremely hard. There was one carrier that we couldn't get 
separately even when I worked at Sun.

   -kyle

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-10 Thread Torrey McMahon
Spencer Shepler wrote:
> On Jul 10, 2008, at 7:05 AM, Ross wrote:
>
>   
>> Oh god, I hope not.  A patent on fitting a card in a PCI-E slot, or  
>> using nvram with RAID (which raid controllers have been doing for  
>> years) would just be rediculous.  This is nothing more than cache,  
>> and even with the American patent system I'd have though it hard to  
>> get that past the obviousness test.
>> 
>
> How quickly they forget.
>
> Take a look at the Prestoserve User's Guide for a refresher...
>
> http://docs.sun.com/app/docs/doc/801-4896-11

Or Fast Write Cache

http://docs.sun.com/app/docs/coll/fast-write-cache2.0
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-10 Thread Ross
Yup, worked fine.  We removed a 500GB disk and replaced it with a 1TB one, 
didn't even need any downtime (although you have to be quick with the 
screwdriver).  I spoke to the x64 line product manager last year and they 
confirmed that Sun plan to support 2TB drives, and probably even 4TB drives as 
they're released.
 
I bought a spare drive bay with the latest x2200 we bought since they look to 
be the same type of mount.  When we do finally buy a Thumper I'm planning to 
have a few spare bays so we can upgrade 4 disks at a time.
 
Now I'm very new to Solaris, so I don't know if this is really the best 
procedure, but this is the best way I found to upgrade the drives on the x4500:
 

Upgrading Drives
Use the procedure for replacing a failed drive with a slight modification.  
When a drive is replaced with a hot spare, ZFS sees that as a temporary change, 
and will revert back to the original drive once it has been replaced.
If you are upgrading, it's more efficient to make the swap permanent, i.e.:
old drive -> new drive
Instead of:
old drive -> spare -> new drive
Overview:

Remove a spare from the ZFS pool 
Use the cfgadm program as documented below to replace the spare with a bigger 
drive 
Replace a drive in the ZFS pool with the new large drive 
Then offline that drive, and repeat 
1. Remove the spare and upgrade it
# zpool remove <pool> c5t7d0
# cfgadm | grep c5t7d0
sata1/1::dsk/c5t7d0   disk   connected   configured   ok
# cfgadm -c unconfigure sata1/1
Now remove and replace the physical disk.

2. Start the rolling replacements:
# cfgadm -c configure sata1/1
# zpool replace <pool> <old drive> c5t7d0
... wait
# cfgadm | grep <old drive>
sata1/2::dsk/<old drive>   disk   connected   configured   ok
# cfgadm -c unconfigure sata1/2
Replace that drive and repeat.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Moore, Joe
Carson Gaspar wrote:
> Darren J Moffat wrote:
> > $ pwd
> > /cube/builds/darrenm/bugs
> > $ zfs create -c 6724478
> > 
> > Why "-c" ?  -c for "current directory"  "-p" partial is 
> already taken to 
> > mean "create all non existing parents" and "-r" relative is 
> already used 
> > consistently as "recurse" in other zfs(1) commands (as well 
> as lots of 
> > other places).
> 
> Why not "zfs create $PWD/6724478". Works today, traditional UNIX 
> behaviour, no coding required. Unles you're in some bizarroland shell 
> (like csh?)...

Because the zfs dataset mountpoint may not be the same as the zfs pool
name.  This makes things a bit complicated for the initial request.

Personally, I haven't played with datasets where the mountpoint is
different.  If you have a zpool tank mounted on /tank and /tank/homedirs
with mountpoint=/export/home, do you create the next dataset
/tank/homedirs/carson, or /export/home/carson ?  And does the mountpoint
get inherited in the obvious (vs. the simple vs. not at all) way?  I
don't know.

Also $PWD has a leading / in this example.

--Joe
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-10 Thread Bob Friesenhahn
On Thu, 10 Jul 2008, Ross wrote:
>
> As a NFS storage platform, you'd be beating EMC and NetApp on price, 
> spindle count, features and performance.  I really hope somebody at 
> Sun considers this, and thinks about expanding the "What can you do 
> with an x4540" section on the website to include VMware.

I expect that Sun is realizing that it is already undercutting much of 
the rest of its product line.  These minor updates would allow the 
X4540 to compete against much more expensive StorageTek SAN hardware. 
How can other products remain profitable when competing against such a 
star performer?

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Darren J Moffat
Carson Gaspar wrote:
> Why not "zfs create $PWD/6724478". Works today, traditional UNIX 
> behaviour, no coding required. Unles you're in some bizarroland shell 

Did you actually try that ?

braveheart# echo $PWD
/tank/p2/2/1
braveheart# zfs create $PWD/44
cannot create '/tank/p2/2/1/44': leading slash in name

It doesn't work, because zfs create takes a dataset name and $PWD gives you a 
pathname starting with /.  Dataset names don't start with /.

Also this assumes that your mountpoint hierarchy is identical to your 
dataset name hierarchy (other than the leading /) which isn't 
necessarily true,  ie if any of the datasets have a non default 
mountpoint property.
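
For example (names made up), with a non-default mountpoint the two hierarchies 
diverge immediately:

# zfs create -o mountpoint=/export/home tank/homedirs
# cd /export/home
# zfs list -H -o name /export/home
tank/homedirs

so $PWD would hand you /export/home/... when the dataset you actually want to 
create under is tank/homedirs.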

-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Darren J Moffat
Glaser, David wrote:
> Hi all,
> 
> I'm a little (ok, a lot) confused on the whole zfs send/receive commands.
 >  I've seen mention of using zfs send between two different machines,
 > but no good howto in order to make it work.

zfs(1) man page, Examples 12 and 13 show how to use send/receive with 
ssh.  What isn't clear about them ?

 > Do you have to set up a service on the receiving machine in order to 
receive the zfs stream?

No.

-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Mike Gerdts
On Thu, Jul 10, 2008 at 5:42 AM, Darren J Moffat <[EMAIL PROTECTED]> wrote:
> Thoughts ?  Is this useful for anyone else ?  My above examples are some
> of the shorter dataset names I use, ones in my home directory can be
> even deeper.

Quite usable and should be done.

The key problem I see is how to deal with ambiguity.

# zpool create pool
# zfs create pool/home
# zfs set mountpoint=/home  pool/home
# zfs create pool/home/adams  (for Dilbert's master)
...
# zfs create pool/home/gerdts  (for Me)
...
# zfs create pool/home/pool   (for Ms. Pool)
...
# cd /home
# zfs snapshot [EMAIL PROTECTED]

What just got snapshotted?

My vote would be that it would try the traditional match first, then
try to do it by resolving the path.  That is, if it would have failed
in the past, it should see if the specified path is the root
(mountpoint?) of a data set.  That way things like the following
should work unambiguously:

# zfs snapshot ./[EMAIL PROTECTED]
# zfs snapshot `pwd`/[EMAIL PROTECTED]

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Bob Friesenhahn
On Thu, 10 Jul 2008, Glaser, David wrote:
> x4500 that we've purchased. Right now I'm using rsync over ssh (via 
> 1GB/s network) to copy the data but it is almost painfully slow 
> (700GB over 24 hours). Yeah, it's a load of small files for the most 
> part. Anyway, would zfs send/receive work better? Do you have to set 
> up a service on the receiving machine in order to receive the zfs 
> stream?

You don't need to set up a service on the remote machine.  You can use 
ssh to invoke the zfs receive and pipe the data across the ssh 
connection, which is similar to what rsync is doing.  For example 
(from the zfs docs):

   zfs send tank/[EMAIL PROTECTED] | ssh newsys zfs recv sandbox/[EMAIL 
PROTECTED]

For a fresh copy, the bottleneck is quite likely ssh itself.  Ssh uses 
fancy encryption algorithms which take lots of CPU time and really 
slows things down.  The "blowfish" algorithm seems to be fastest so 
passing

   -c blowfish

as an ssh option can significantly speed things up.  For example, this 
is how you can tell rsync to use ssh with your own options:

   --rsh='/usr/bin/ssh -c blowfish'

In order to achieve even more performance (but without encryption), 
you can use Netcat as the underlying transport.  See 
http://netcat.sourceforge.net/.

Lastly, if you have much more CPU available than bandwidth, then it is 
worthwhile to install and use the 'lzop' compression program which 
compresses very quickly to a format only about 30% less compressed 
than what gzip achieves but fast enough for real-time data 
transmission.  It is easy to insert lzop into the pipeline so that 
less data is sent across the network.
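
Putting those pieces together, a rough sketch of the whole pipeline (pool, 
filesystem and snapshot names are placeholders, and lzop needs to be installed 
on both ends):

   zfs send tank/fs@snap | lzop -c | \
       ssh -c blowfish newsys 'lzop -dc | zfs recv sandbox/fs'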

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Tim Spriggs
Darren J Moffat wrote:
> Glaser, David wrote:
>   
>> Hi all,
>>
>> I'm a little (ok, a lot) confused on the whole zfs send/receive commands.
>> 
>  >  I've seen mention of using zfs send between two different machines,
>  > but no good howto in order to make it work.
>
> zfs(1) man page, Examples 12 and 13 show how to use senn/receive with 
> ssh.  What isn't clear about them ?
>
>  > Do you have to set up a service on the receiving machine in order to 
> receive the zfs stream?
>
> No.
>   
I found that the overhead of SSH really hampered my ability to transfer 
data between thumpers as well. When I simply ran a set of sockets and a 
pipe things went much faster (filled a 1G link). Essentially I used 
netcat instead of SSH.

-Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Florin Iucha
On Thu, Jul 10, 2008 at 09:02:35AM -0700, Tim Spriggs wrote:
> > zfs(1) man page, Examples 12 and 13 show how to use senn/receive with 
> > ssh.  What isn't clear about them ?
>
> I found that the overhead of SSH really hampered my ability to transfer 
> data between thumpers as well. When I simply ran a set of sockets and a 
> pipe things went much faster (filled a 1G link). Essentially I used 
> netcat instead of SSH.

You can use blowfish [0] or arcfour [1] as they are faster than the
default algorithm (3des).

Cheers,
florin

0: ssh(1) man page
1: http://www.psc.edu/networking/projects/hpn-ssh/theory.php

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163


pgpvZKLQ7qBAI.pgp
Description: PGP signature
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-10 Thread Richard Elling
Kyle McDonald wrote:
> Tommaso Boccali wrote:
>   
>> .. And the answer was yes I hope. we are sriously thinking of buying
>> 48 1 tb disk to replace those in a 1 year old thumper
>>
>> please confirm it again :)
>>
>>   
>> 
> In my 15 year experience with Sun Products, I've never known one to care 
> about drive brand, model, or firmware. If it was standards compliant for 
> both physical interface, and protocol the machine would use it in my 
> experience. This was mainly with host attached JBOD though (which the 
> x4500 and x4540 are.) In RAID arrays my guess is that it wouldn't care 
> then either, though you'd be opening yourself up to wierd interactions 
> between the array and the drive firmware if you didn't use a tested 
> combination.
>   

In general, yes, industry standard drives should be industry standard.
We do favor the enterprise-class drives, mostly because they are lower
cost over time -- it costs real $$ to answer the phone for a field
replacement request.  Usually, there is a Sun-specific label because,
though we source from many vendors, products like hardware
RAID controllers get upset when the replacement disk reports a
different size.

> The drive carriers were a different story though. Some were easy to get. 
> Others extrememly hard. There was one carrier that we couldn't get 
> separately even when I worked at Sun.
>   

Drive carriers are a different ballgame.  AFAIK, there is no
industry standard carrier that meets our needs.  We require
service LEDs for many of our modern disk carriers, so there
is a little bit of extra electronics there.  You will see more
electronics for some of the newer products as I explain here:
http://blogs.sun.com/relling/entry/this_ain_t_your_daddy

I won't get into the "support" issue... it hurts my brain.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Darren J Moffat
Florin Iucha wrote:
> On Thu, Jul 10, 2008 at 09:02:35AM -0700, Tim Spriggs wrote:
>>> zfs(1) man page, Examples 12 and 13 show how to use senn/receive with 
>>> ssh.  What isn't clear about them ?
>> I found that the overhead of SSH really hampered my ability to transfer 
>> data between thumpers as well. When I simply ran a set of sockets and a 
>> pipe things went much faster (filled a 1G link). Essentially I used 
>> netcat instead of SSH.
> 
> You can use blowfish [0] or arcfour [1] as they are faster than the
> default algorithm (3des).

The default algorithm for ssh on Solaris is not 3des it is aes128-ctr.

-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Darren J Moffat
Mike Gerdts wrote:
> On Thu, Jul 10, 2008 at 5:42 AM, Darren J Moffat <[EMAIL PROTECTED]> wrote:
>> Thoughts ?  Is this useful for anyone else ?  My above examples are some
>> of the shorter dataset names I use, ones in my home directory can be
>> even deeper.
> 
> Quite usable and should be done.
> 
> The key problem I see is how to deal with ambiguity.
> 
> # zpool create pool
> # zfs create pool/home
> # zfs set mountpoint=/home  pool/home
> # zfs create pool/home/adams  (for Dilbert's master)
> ...
> # zfs create pool/home/gerdts  (for Me)
> ...
> # zfs create pool/home/pool   (for Ms. Pool)
> ...
> # cd /home
> # zfs snapshot [EMAIL PROTECTED]
> 
> What just got snapshotted?

The dataset named "pool" only.  I don't see how that could be ambiguous 
now or with what I proposed.

If you said zfs snapshot -r [EMAIL PROTECTED] then all of them.

-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Glaser, David
Is that faster than blowfish?

Dave


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Darren J Moffat
Sent: Thursday, July 10, 2008 12:27 PM
To: Florin Iucha
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS send/receive questions

Florin Iucha wrote:
> On Thu, Jul 10, 2008 at 09:02:35AM -0700, Tim Spriggs wrote:
>>> zfs(1) man page, Examples 12 and 13 show how to use senn/receive with
>>> ssh.  What isn't clear about them ?
>> I found that the overhead of SSH really hampered my ability to transfer
>> data between thumpers as well. When I simply ran a set of sockets and a
>> pipe things went much faster (filled a 1G link). Essentially I used
>> netcat instead of SSH.
>
> You can use blowfish [0] or arcfour [1] as they are faster than the
> default algorithm (3des).

The default algorithm for ssh on Solaris is not 3des it is aes128-ctr.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-10 Thread Tim
On Thu, Jul 10, 2008 at 10:20 AM, Bob Friesenhahn <
[EMAIL PROTECTED]> wrote:

> On Thu, 10 Jul 2008, Ross wrote:
> >
> > As a NFS storage platform, you'd be beating EMC and NetApp on price,
> > spindle count, features and performance.  I really hope somebody at
> > Sun considers this, and thinks about expanding the "What can you do
> > with an x4540" section on the website to include VMware.
>
> I expect that Sun is realizing that it is already undercutting much of
> the rest of its product line.  These minor updates would allow the
> X4540 to compete against much more expensive StorageTek SAN hardware.
> How can other products remain profitable when competing against such a
> star performer?
>
> Bob
> ==
> Bob Friesenhahn
> [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
>
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>


Because at the end of the day, the x4540 still isn't *there* (and probably
never will be) for 24/7 SAN/LUN access.  AFAIK, nothing in the storagetek
line-up is worth a damn as far as NAS goes that would compete with this.  I
honestly don't believe anyone looking at a home-grown x4540 is TRULY in the
market for a high end STK SAN anyways.

It's the same reason you don't see HDS or EMC rushing to adjust the price of
the SYM or USP-V based on Sun releasing the thumpers.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Mike Gerdts
On Thu, Jul 10, 2008 at 11:31 AM, Darren J Moffat <[EMAIL PROTECTED]> wrote:
> Mike Gerdts wrote:
>>
>> On Thu, Jul 10, 2008 at 5:42 AM, Darren J Moffat <[EMAIL PROTECTED]>
>> wrote:
>>>
>>> Thoughts ?  Is this useful for anyone else ?  My above examples are some
>>> of the shorter dataset names I use, ones in my home directory can be
>>> even deeper.
>>
>> Quite usable and should be done.
>>
>> The key problem I see is how to deal with ambiguity.
>>
>> # zpool create pool
>> # zfs create pool/home
>> # zfs set mountpoint=/home  pool/home
>> # zfs create pool/home/adams  (for Dilbert's master)
>> ...
>> # zfs create pool/home/gerdts  (for Me)
>> ...
>> # zfs create pool/home/pool   (for Ms. Pool)
>> ...
>> # cd /home
>> # zfs snapshot [EMAIL PROTECTED]
>>
>> What just got snapshotted?
>
> The dataset named "pool" only.  I don't see how that could be ambiguous now
> or with what I proposed.
>
> If you said zfs snapshot -r [EMAIL PROTECTED] then all of them.

Which dataset named pool?  The one at /pool (the root of the zpool, if
you will) or the one at /home/pool (Ms. Pool's home directory) which
happens to be `pwd`/pool.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Glaser, David
I guess what I was wondering is if there was a direct method rather than the 
overhead of ssh.



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Darren J Moffat
Sent: Thursday, July 10, 2008 11:40 AM
To: Glaser, David
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS send/receive questions

Glaser, David wrote:
> Hi all,
>
> I'm a little (ok, a lot) confused on the whole zfs send/receive commands.
 >  I've seen mention of using zfs send between two different machines,
 > but no good howto in order to make it work.

zfs(1) man page, Examples 12 and 13 show how to use send/receive with
ssh.  What isn't clear about them ?

 > Do you have to set up a service on the receiving machine in order to
receive the zfs stream?

No.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-10 Thread Keith Bierman

On Jul 10, 2008, at 9:20 AM, Bob Friesenhahn wrote:

>
> I expect that Sun is realizing that it is already undercutting much of
> the rest of its product line.

a) Failure to do so just means that someone else does, and wins the  
customer.
b) A lot of "enterprise class" infrastructure wonks are very, very  
conservative. They'll want to continue with the architecture they  
have for a long time to come. I suspect that the real "loss" is the  
potential to convert up and coming shops to the premium storage ...  
but that would have been an uphill battle anyway, so moving that  
class of consumer into "open storage" is an easier sell anyway.

-- 
Keith H. Bierman   [EMAIL PROTECTED]  | AIM kbiermank
5430 Nassau Circle East  |
Cherry Hills Village, CO 80113   | 303-997-2749
 Copyright 2008




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Darren J Moffat
Mike Gerdts wrote:
> On Thu, Jul 10, 2008 at 11:31 AM, Darren J Moffat <[EMAIL PROTECTED]> wrote:
>> Mike Gerdts wrote:
>>> On Thu, Jul 10, 2008 at 5:42 AM, Darren J Moffat <[EMAIL PROTECTED]>
>>> wrote:
 Thoughts ?  Is this useful for anyone else ?  My above examples are some
 of the shorter dataset names I use, ones in my home directory can be
 even deeper.
>>> Quite usable and should be done.
>>>
>>> The key problem I see is how to deal with ambiguity.
>>>
>>> # zpool create pool
>>> # zfs create pool/home
>>> # zfs set mountpoint=/home  pool/home
>>> # zfs create pool/home/adams  (for Dilbert's master)
>>> ...
>>> # zfs create pool/home/gerdts  (for Me)
>>> ...
>>> # zfs create pool/home/pool   (for Ms. Pool)
>>> ...
>>> # cd /home
>>> # zfs snapshot [EMAIL PROTECTED]
>>>
>>> What just got snapshotted?
>> The dataset named "pool" only.  I don't see how that could be ambiguous now
>> or with what I proposed.
>>
>> If you said zfs snapshot -r [EMAIL PROTECTED] then all of them.
> 
> Which dataset named pool?  The one at /pool (the root of the zpool, if
> you will) or the one at /home/pool (Ms. Pool's home directory) which
> happens to be `pwd`/pool.

Ah sorry I missed that your third dataset ended in pool.

The answer is still the same, though, if the proposal to use a new flag 
for partial paths is taken.  Which is why I suggested that: it is 
ambiguous in the example you gave if zfs(1) commands other than create 
can take relative paths too [which would be useful].

-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Darren J Moffat
Glaser, David wrote:
> I guess what I was wondering if there was a direct method rather than the 
> overhead of ssh.

As others have suggested use netcat (/usr/bin/nc) however you get no 
over the wire data confidentiality or integrity and no strong 
authentication with that.

If you need those then a combination of netcat and IPsec might help.

-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Glaser, David
Thankfully right now it's over a private IP network between the two 
machines.  I'll play with it a bit and let folks know if I can't get it to work.

Thanks,
Dave


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Darren J Moffat
Sent: Thursday, July 10, 2008 12:50 PM
To: Glaser, David
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS send/receive questions

Glaser, David wrote:
> I guess what I was wondering if there was a direct method rather than the 
> overhead of ssh.

As others have suggested use netcat (/usr/bin/nc) however you get no
over the wire data confidentiality or integrity and no strong
authentication with that.

If you need those then a combination of netcat and IPsec might help.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Will Murnane
On Thu, Jul 10, 2008 at 12:43, Glaser, David <[EMAIL PROTECTED]> wrote:
> I guess what I was wondering if there was a direct method rather than the 
> overhead of ssh.
On receiving machine:
nc -l 12345 | zfs recv mypool/[EMAIL PROTECTED]
and on sending machine:
zfs send sourcepool/[EMAIL PROTECTED] | nc othermachine.umich.edu 12345
You'll need to build your own netcat, but this is fairly simple.  If
you run into trouble let me know and I'll post an x86 package.

Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Glaser, David
Could I trouble you for the x86 package? I don't seem to have much in the way 
of software on this try-n-buy system...

Thanks,
Dave


-Original Message-
From: Will Murnane [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 10, 2008 12:58 PM
To: Glaser, David
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS send/receive questions

On Thu, Jul 10, 2008 at 12:43, Glaser, David <[EMAIL PROTECTED]> wrote:
> I guess what I was wondering if there was a direct method rather than the 
> overhead of ssh.
On receiving machine:
nc -l 12345 | zfs recv mypool/[EMAIL PROTECTED]
and on sending machine:
zfs send sourcepool/[EMAIL PROTECTED] | nc othermachine.umich.edu 12345
You'll need to build your own netcat, but this is fairly simple.  If
you run into trouble let me know and I'll post an x86 package.

Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-10 Thread Richard Elling
Torrey McMahon wrote:
> Spencer Shepler wrote:
>   
>> On Jul 10, 2008, at 7:05 AM, Ross wrote:
>>
>>   
>> 
>>> Oh god, I hope not.  A patent on fitting a card in a PCI-E slot, or  
>>> using nvram with RAID (which raid controllers have been doing for  
>>> years) would just be rediculous.  This is nothing more than cache,  
>>> and even with the American patent system I'd have though it hard to  
>>> get that past the obviousness test.
>>> 
>>>   
>> How quickly they forget.
>>
>> Take a look at the Prestoserve User's Guide for a refresher...
>>
>> http://docs.sun.com/app/docs/doc/801-4896-11
>> 
>
> Or Fast Write Cache
>
> http://docs.sun.com/app/docs/coll/fast-write-cache2.0
>   

Yeah, the J-shaped scar just below my right shoulder blade...

For the benefit of the alias, these sorts of products have a very limited
market because they store state inside the server and use batteries.
RAS guys hate batteries, especially those which are sitting on
non-hot-pluggable I/O cards.  While there are some specific cards
which do allow hardware assisted remote replication (a previous
Sun technology called "reflective memory" as used by VAXclusters)
most of the issues are with serviceability and not availability.  It
is really bad juju to leave state in the wrong place during a service
event.

Where I think the jury is deadlocked is whether these are actually
faster than RAID cards like
http://www.sun.com/storagetek/storage_networking/hba/raid/

But from a performability perspective, the question is whether or
not such cards perform significantly better than SSDs?  Thoughts?
 -- richard


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Case study/recommended ZFS setup for home file server

2008-07-10 Thread Richard Elling
Fajar A. Nugraha wrote:
> Brandon High wrote:
>> Another alternative is to use an IDE to Compact Flash adapter, and
>> boot off of flash.
> Just curious, what will that flash contain?
> e.g. will it be similar to linux's /boot, or will it contain the full 
> solaris root?
> How do you manage redundancy (e.g. mirror) for that boot device?
>

zfs set copies=2 :-)


hmm... I need to dig up my notes on that and blog it...
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Tim Spriggs
Will Murnane wrote:
> On Thu, Jul 10, 2008 at 12:43, Glaser, David <[EMAIL PROTECTED]> wrote:
>   
>> I guess what I was wondering if there was a direct method rather than the 
>> overhead of ssh.
>> 
> On receiving machine:
> nc -l 12345 | zfs recv mypool/[EMAIL PROTECTED]
> and on sending machine:
> zfs send sourcepool/[EMAIL PROTECTED] | nc othermachine.umich.edu 12345
> You'll need to build your own netcat, but this is fairly simple.  If
> you run into trouble let me know and I'll post an x86 package.
>
> Will
>   
If you are running Nexenta you can also "apt-get install sunwnetcat"
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-10 Thread Will Murnane
On Thu, Jul 10, 2008 at 13:05, Glaser, David <[EMAIL PROTECTED]> wrote:
> Could I trouble you for the x86 package? I don't seem to have much in the way 
> of software on this try-n-buy system...
No problem.  Packages are posted at
http://will.incorrige.us/solaris-packages/ .  You'll need gettext and
iconv as well as netcat, as it links against libiconv.  Download the
gzip files, decompress them with gzip -d, then pkgtrans $packagefile
$tempdir and run pkgadd -d $tempdir.  Files will be installed in the
/usr/site hierarchy.  The executable is called "netcat", not "nc",
because that's what it builds as by default.  I believe I got all the
dependencies, but if not I'll be glad to post whatever is missing as
well.
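
For example, roughly (the file names here are just placeholders for whatever 
you download; repeat for the gettext and iconv packages):

$ gzip -d netcat-x86.pkg.gz
$ pkgtrans netcat-x86.pkg /var/tmp/ncpkg
# pkgadd -d /var/tmp/ncpkg all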

If you'd rather have spec files and sources (which you can assemble
with pkgbuild) than binaries, I can provide those instead.

Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-10 Thread Will Murnane
On Thu, Jul 10, 2008 at 12:14, Richard Elling <[EMAIL PROTECTED]> wrote:
> Drive carriers are a different ballgame.  AFAIK, there is no
> industry standard carrier that meets our needs.  We require
> service LEDs for many of our modern disk carriers, so there
> is a little bit of extra electronics there.  You will see more
> electronics for some of the newer products as I explain here:
> http://blogs.sun.com/relling/entry/this_ain_t_your_daddy
While I do appreciate the need for redundancy, and thus interposers, I
wish Sun's pricing on the disk/interposer combinations were a little
more aggressive.  Here's current costs straight from Sun's site:
XTA-ST1NJ-250G7K  $ 170.00 250GB
XTA-ST1NJ-500G7K  $ 400.00 500GB
XTA-ST1NJ-750G7K  $ 525.00 750GB
XTA-ST1NJ-1T7K $ 950.00 1000GB
Compare for a moment with prices for raw disks (Seagate NS series,
from ZipZoomFly; I picked these because they look an awful lot like
what I got when I bought disks from Sun):
ST3250310NS $70  250GB
ST3500320NS $100 500GB
ST3750330NS $150 750GB
ST31000340NS $235 1000GB
So for a 250 gig drive the interposer and caddy etc cost me $100, for
a 500 gb disk it costs me $300, for a 750 it costs $375, and for 1TB
disks it costs an extra $715!  Look at it another way - if I buy a
250GB disk from Sun and a 1TB disk of my own for the prices above, it
costs me $405, less than half what buying a 1TB disk from Sun would
cost.  I can buy two of those assemblies for less than one from Sun,
and leave one as a cold spare.

Another way to look at it: Suppose that Sun charges a flat $100 for
the extra electronics.  Then the 250GB disk is priced at market value,
the 500 at three times market, the 750 at 2.8 times market, and the
1TB at 3.6 times market.

If the prices on disks were lower on these, they would be interesting
for low-end businesses or even high-end home users.  The chassis is
within reach of reasonable, but the disk prices look ludicrously high
from where I sit.  An empty one only costs $3k, sure, but fill it with
twelve disks and it's up to $20k.  Are there some extra electronics
required for larger disks that help explain this steep slope of cost?
I can't think of any reasons off the top of my head (other than the
understandable profit motive).

Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-10 Thread Peter Tribble
On Thu, Jul 10, 2008 at 4:13 PM, Kyle McDonald <[EMAIL PROTECTED]> wrote:
> In my 15 year experience with Sun Products, I've never known one to care
> about drive brand, model, or firmware. If it was standards compliant for
> both physical interface, and protocol the machine would use it in my
> experience. This was mainly with host attached JBOD though (which the
> x4500 and x4540 are.) In RAID arrays my guess is that it wouldn't care
> then either, though you'd be opening yourself up to weird interactions
> between the array and the drive firmware if you didn't use a tested
> combination.

My experience with RAID arrays (mostly Sun's) has been that they're incredibly
picky about the drives they talk to, firmware in particular.  You pretty much
have to have one of the few supported configurations for it to work.  If
you're lucky the array will update the firmware for you.  I've also seen the
intelligent controllers in some of Sun's JBOD units (the S1, and the 3000
series) fail to recognize drives that work perfectly well elsewhere.

I'm slightly disappointed that there wasn't a model for 2.5 inch drives in
there, though.

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Carson Gaspar
Moore, Joe wrote:
> Carson Gaspar wrote:
>> Darren J Moffat wrote:
>>> $ pwd
>>> /cube/builds/darrenm/bugs
>>> $ zfs create -c 6724478
>>>
>>> Why "-c" ?  -c for "current directory"  "-p" partial is 
>> already taken to 
>>> mean "create all non existing parents" and "-r" relative is 
>> already used 
>>> consistently as "recurse" in other zfs(1) commands (as well 
>> as lots of 
>>> other places).
>> Why not "zfs create $PWD/6724478". Works today, traditional UNIX 
>> behaviour, no coding required. Unles you're in some bizarroland shell 
>> (like csh?)...
> 
> Because the zfs dataset mountpoint may not be the same as the zfs pool
> name.  This makes things a bit complicated for the initial request.

The leading slash will be a problem with the current code. I forgot 
about that... make that ${PWD#/} (or change the code to ignore the 
leading slash...). That is, admittedly, more typing than a single 
character option, but not much.
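
Using Darren's example from above, that would look roughly like this (assuming
the dataset is mounted at a path that matches its pool/dataset name):

$ pwd
/cube/builds/darrenm/bugs
$ zfs create ${PWD#/}/6724478
(i.e. zfs create cube/builds/darrenm/bugs/6724478)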

And yes, if your mount name and pool names don't match, extra code would 
be required to determine the parent pool/fs of the path passed. But no 
more code than magic CWD goo... I really don't like special case options 
whose sole purpose is to shorten command line length.

-- 
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-10 Thread Torrey McMahon
Richard Elling wrote:
> Torrey McMahon wrote:
>> Spencer Shepler wrote:
>>  
>>> On Jul 10, 2008, at 7:05 AM, Ross wrote:
>>>
>>>  
 Oh god, I hope not.  A patent on fitting a card in a PCI-E slot,
 or using nvram with RAID (which raid controllers have been doing
 for years) would just be ridiculous.  This is nothing more than
 cache, and even with the American patent system I'd have thought it
 hard to get that past the obviousness test.
   
>>> How quickly they forget.
>>>
>>> Take a look at the Prestoserve User's Guide for a refresher...
>>>
>>> http://docs.sun.com/app/docs/doc/801-4896-11
>>> 
>>
>> Or Fast Write Cache
>>
>> http://docs.sun.com/app/docs/coll/fast-write-cache2.0
>>   
>
> Yeah, the J-shaped scar just below my right shoulder blade...
>
> For the benefit of the alias, these sorts of products have a very limited
> market because they store state inside the server and use batteries.
> RAS guys hate batteries, especially those which are sitting on
> non-hot-pluggable I/O cards.  While there are some specific cards
> which do allow hardware assisted remote replication (a previous
> Sun technology called "reflective memory" as used by VAXclusters)
> most of the issues are with serviceability and not availability.  It
> is really bad juju to leave state in the wrong place during a service
> event.
>
> Where I think the jury is deadlocked is whether these are actually
> faster than RAID cards like
> http://www.sun.com/storagetek/storage_networking/hba/raid/
>
> But from a performability perspective, the question is whether or
> not such cards perform significantly better than SSDs?  Thoughts?

In the past we saw some good performance gains with on-server caching of 
the i/o even if it went to a large array with lots of cache down the 
road. Mr. Sneed had some numbers from back in the day but for some 
workloads there was quite the impact by ack'ing writes and coalescing 
them on the host NVRAM before sending them to the array. You cut the i/o 
time down for the app *and* the i/o you do send is a lot less bursty. 
(Of course if you filled up the memory on the card you'd hit other 
issues.)




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] proposal partial/relative paths for zfs(1)

2008-07-10 Thread Mike Gerdts
On Thu, Jul 10, 2008 at 1:40 PM, Carson Gaspar <[EMAIL PROTECTED]> wrote:
> Moore, Joe wrote:
>> Because the zfs dataset mountpoint may not be the same as the zfs pool
>> name.  This makes things a bit complicated for the initial request.
>
> The leading slash will be a problem with the current code. I forgot
> about that... make that ${PWD#/} (or change the code to ignore the
> leading slash...). That is, admittedly, more typing than a single
> character option, but not much.

In most of the places where I use zfs, the ${PWD#/} trick would not work.
I have a pool of storage with an arbitrary name, and I mount its datasets in
various places to match up with names I have traditionally used.  That is, I
have a pool named local that has:

local/zones on /zones
local/ws on /ws
local/home on /export/home
...

I think that if you take a look at a machine that uses zfsroot you
will find very few, if any, datasets used directly at
/rpool/.
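
So with a layout like that, stripping the leading slash just yields a path,
not a dataset name.  A quick sketch (the user name is made up, and the error
text is approximate):

$ cd /export/home
$ zfs create ${PWD#/}/mike
cannot create 'export/home/mike': no such pool 'export'

The dataset I actually want is local/home/mike, and nothing on the command
line tells zfs(1) that.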

> And yes, if your mount name and pool names don't match, extra code would
> be required to determine the parent pool/fs of the path passed. But no
> more code than magic CWD goo... I really don't like special case options
> whose sole purpose is to shorten command line length.

It's not just shortening command line length.  If a user has
permissions to do things in his/her datasets, there should be no need
for that user to know about the overall structure of the zpool.  This
is user-visible complexity that will turn into a long-term management
problem as sysadmins split or merge pools, change pool naming schemes,
reorganize dataset hierarchies, etc.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RFE: ZFS commands "zmv" and "zcp"

2008-07-10 Thread Raquel K. Sanborn
No, the problem is that the data must be moved or copied from where it is to a 
different ZFS filesystem.

Raquel
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-10 Thread Richard Elling
Will Murnane wrote:
> On Thu, Jul 10, 2008 at 12:14, Richard Elling <[EMAIL PROTECTED]> wrote:
>   
>> Drive carriers are a different ballgame.  AFAIK, there is no
>> industry standard carrier that meets our needs.  We require
>> service LEDs for many of our modern disk carriers, so there
>> is a little bit of extra electronics there.  You will see more
>> electronics for some of the newer products as I explain here:
>> http://blogs.sun.com/relling/entry/this_ain_t_your_daddy
>> 
> While I do appreciate the need for redundancy, and thus interposers, I
> wish Sun's pricing on the disk/interposer combinations were a little
> more aggressive.  Here's current costs straight from Sun's site:
> XTA-ST1NJ-250G7K  $ 170.00 250GB
> XTA-ST1NJ-500G7K  $ 400.00 500GB
> XTA-ST1NJ-750G7K  $ 525.00 750GB
> XTA-ST1NJ-1T7K $ 950.00 1000GB
> Compare for a moment with prices for raw disks (Seagate NS series,
> from ZipZoomFly; I picked these because they look an awful lot like
> what I got when I bought disks from Sun):
> ST3250310NS $70  250GB
> ST3500320NS $100 500GB
> ST3750330NS $150 750GB
> ST31000340NS $235 1000GB
> So for a 250 gig drive the interposer and caddy etc cost me $100, for
> a 500 gb disk it costs me $300, for a 750 it costs $375, and for 1TB
> disks it costs an extra $715!  Look at it another way - if I buy a
> 250GB disk from Sun and a 1TB disk of my own for the prices above, it
> costs me $405, less than half what buying a 1TB disk from Sun would
> cost.  I can buy two of those assemblies for less than one from Sun,
> and leave one as a cold spare.
>
> Another way to look at it: Suppose that Sun charges a flat $100 for
> the extra electronics.  Then the 250GB disk is priced at market value,
> the 500 at three times market, the 750 at 2.8 times market, and the
> 1TB at 3.6 times market.
>
> If the prices on disks were lower on these, they would be interesting
> for low-end businesses or even high-end home users.  The chassis is
> within reach of reasonable, but the disk prices look ludicrously high
> from where I sit.  An empty one only costs $3k, sure, but fill it with
> twelve disks and it's up to $20k.  Are there some extra electronics
> required for larger disks that help explain this steep slope of cost?
> I can't think of any reasons off the top of my head (other than the
> understandable profit motive).
>   

Never confuse price with cost.  Price is set by market forces.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS Mirroring - Scenario

2008-07-10 Thread Robb Snavely
I have a scenario (a tray failure) where I am trying to "predict" how zfs 
will behave, and am looking for some input.  Coming from the world of 
SVM, ZFS is WAY different ;)

If we have 2 racks containing 4 trays each, and 2 6540s that present 8D 
RAID-5 LUNs to the OS/zfs, and through zfs we set up a mirror config such 
that (I'm oversimplifying here, but):

Rack 1 - Tray 1 = lun 0    Rack 2 - Tray 1 = lun 4
Rack 1 - Tray 2 = lun 1    Rack 2 - Tray 2 = lun 5
Rack 1 - Tray 3 = lun 2    Rack 2 - Tray 3 = lun 6
Rack 1 - Tray 4 = lun 3    Rack 2 - Tray 4 = lun 7

so the zpool command would be:

zpool create somepool mirror 0 4 mirror 1 5 mirror 2 6 mirror 3 7   
<---(just for ease of explanation using the "supposed" lun numbers)

so a status output would look similar to:

somepool
  mirror
  0
  4
  mirror
  1
  5
  mirror
  2
  6
  mirror
  3
  7

Now in the VERY unlikely event that we lost the first tray in each rack 
which contain 0 and 4 respectively...

somepool
  mirror---
  0   |
  4   |   Bye Bye
 ---
  mirror
  1
  5
  mirror
  2
  6
  mirror
  3
  7


Would the entire "somepool" zpool die?  Would it affect ALL users in 
this pool or a portion of the users?  Is there a way in zfs to be able 
to tell what individual users are hosed (my group is a bunch of control 
freaks ;)?  How would zfs react to something like this?  Also, any 
feedback on a better way to do this is more than welcome.

Please keep in mind I am a "ZFS noob" so detailed explanations would be 
awesome.

Thanks in advance

Robb
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Mirroring - Scenario

2008-07-10 Thread Bob Friesenhahn
On Thu, 10 Jul 2008, Robb Snavely wrote:
>
> Now in the VERY unlikely event that we lost the first tray in each rack
> which contain 0 and 4 respectively...
>
> somepool
>  mirror---
>  0   |
>  4   |   Bye Bye
> ---
>  mirror
>  1
>  5
>  mirror
>  2
>  6
>  mirror
>  3
>  7
>
>
> Would the entire "somepool" zpool die?  Would it affect ALL users in
> this pool or a portion of the users?  Is there a way in zfs to be able

ZFS load-shares the pool over the VDEVs (mirror is a type of VDEV), so 
your entire pool would become dead and unusable until that VDEV is 
restored.

You want your mirrors to be based on hardware which is as distinct as 
possible.  If necessary you could consider a triple mirror.
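
For example (device names purely illustrative, and assuming you had a third
set of LUNs on hardware separate from the first two racks):

zpool create somepool mirror lun0 lun4 lun8 mirror lun1 lun5 lun9 \
    mirror lun2 lun6 lun10 mirror lun3 lun7 lun11

That way every top-level mirror keeps a surviving side even if an entire
tray, or even an entire rack, goes away.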

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS/Install related question

2008-07-10 Thread Andre
Hi there,

I'm currently setting up a new system to my lab. 4 SATA drives would be turned 
into the main file system (ZFS?) running on a soft raid (raid-z?).

My main target is reliability; my experience with Linux SoftRaid was 
catastrophic and the array could not be restored after some testing simulating 
power failures (thank god I did the tests before relying on that...)

From what I've seen so far, Solaris cannot boot from a raid-z system.  Is that 
correct?

In this case, what needs to stay out of the array?  For example, on a Linux 
system I could set /boot to be on an old 256MB USB flash drive (as long as the 
boot loader and kernel were outside the array, the system would boot).  What 
are the requirements for booting from the USB device but loading the system 
from the array?

Second, how do I proceed during the Install process?

I know it's a little bit weird but I must confess I'm doing it on purpose. :-)

I thank you in advance
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] please help with raid / failure / rebuild calculations

2008-07-10 Thread User Name
I am building a 14 disk raid 6 array with 1 TB seagate AS (non-enterprise) 
drives.

So there will be 14 disks total, 2 of them will be parity, 12 TB space 
available.

My drives have a BER of 10^14

I am quite scared by my calculations - it appears that if one drive fails, and 
I do a rebuild, I will perform:

13 * 8*10^12 = 104 * 10^12

reads.  But my BER is smaller:

10^14 = 100 * 10^12

So I am (theoretically) guaranteed to lose another drive on raid rebuild.  Then 
the calculation for _that_ rebuild is:

12 * 8*10^12 = 96 * 10^12

So no longer guaranteed, but 96% isn't good.

I have looked all over, and these seem to be the accepted calculations - which 
means if I ever have to rebuild, I'm toast.

But here is the question - the part I am having trouble understanding:

The 13*8*10^12 read operations required for the first rebuild -- isn't that the 
number for _the entire array_?  Any given 1 TB disk only has 8*10^12 bits on it 
_total_.  So why would I ever do more than 8*10^12 read operations on any one disk?

It seems very odd to me that a raid controller would have to access any given 
bit more than once to do a rebuild ... and the total number of bits on a drive 
is 8*10^12, which is far below the 10^14 BER number.

So I guess my question is - why are we all doing this calculation, wherein we 
apply the total operations across an entire array rebuild to a single drives 
BER number ?

Thanks.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] please help with raid / failure / rebuild calculations

2008-07-10 Thread Richard Elling
User Name wrote:
> I am building a 14 disk raid 6 array with 1 TB seagate AS (non-enterprise) 
> drives.
>
> So there will be 14 disks total, 2 of them will be parity, 12 TB space 
> available.
>
> My drives have a BER of 10^14
>
> I am quite scared by my calculations - it appears that if one drive fails, 
> and I do a rebuild, I will perform:
>
> 13 * 8*10^12 = 104 * 10^12
>
> reads.  But my BER is smaller:
>
> 10^14 = 100 * 10^12
>
> So I am (theoretically) guaranteed to lose another drive on raid rebuild.  
> Then the calculation for _that_ rebuild is:
>
> 12 * 8*10^12 = 96 * 10^12
>
> So no longer guaranteed, but 96% isn't good.
>
> I have looked all over, and these seem to be the accepted calculations - 
> which means if I ever have to rebuild, I'm toast.
>   

If you were using RAID-5, you might be concerned.  For RAID-6,
or at least raidz2, you could recover an unrecoverable read during
the rebuild of one disk.

> But here is the question - the part I am having trouble understanding:
>
> The 13*8*10^12 read operations required for the first rebuild -- isn't that the 
> number for _the entire array_?  Any given 1 TB disk only has 8*10^12 bits on 
> it _total_.  So why would I ever do more than 8*10^12 read operations on any one disk?
>   

Actually, ZFS only rebuilds the data.  So you need to multiply by
the space utilization of the pool, which will usually be less than
100%.

> It seems very odd to me that a raid controller would have to access any given 
> bit more than once to do a rebuild ... and the total number of bits on a 
> drive is 8*10^12, which is far below the 10^14 BER number.
>
> So I guess my question is - why are we all doing this calculation, wherein we 
> apply the total operations across an entire array rebuild to a single drives 
> BER number ?
>   

You might also be interested in this blog
http://blogs.zdnet.com/storage/?p=162

A couple of things seem to be at work here.  I study field data
failure rates.  We tend to see unrecoverable read failure rates
at least an order of magnitude better than the specifications.
This is a good thing, but simply points out that the specifications
are often sand-bagged -- they are not a guarantee.  However,
you are quite right in your intuition that if you have a lot of
bits of data, then you need to pay attention to the bit-error
rate (BER) of unrecoverable reads on disks. This sort of model
can be used to determine a mean time to data loss (MTTDL) as
I explain here:
http://blogs.sun.com/relling/entry/a_story_of_two_mttdl
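
As a rough back-of-the-envelope for the original numbers, treating the
spec'd rate as an independent per-bit probability (itself a simplification):

bits read during the rebuild:   13 * 8*10^12 = 1.04*10^14
expected unrecoverable reads:   1.04*10^14 * 10^-14 = 1.04
P(at least one) = 1 - (1 - 10^-14)^(1.04*10^14) ~= 1 - e^-1.04 ~= 65%

So even taking the spec at face value it is "more likely than not," not a
certainty; the exposure scales with how full the pool actually is, since only
allocated data is resilvered; and with raidz2 a single unrecoverable read
during a one-disk rebuild is still repairable from the remaining parity.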

Perhaps it would help if we changed the math to show the risk
as a function of the amount of data given the protection scheme? 
hmmm something like probability of data loss per year for
N TBytes with configuration XYZ.  Would that be more
useful for evaluating configurations?
 -- richard


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-10 Thread Ross
Yes, but pricing that's so obviously disconnected from cost leads customers to 
feel they're being ripped off.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss