Re: [zfs-discuss] Petabytes on a budget - blog

2009-09-02 Thread Bill Moore
On Wed, Sep 02, 2009 at 02:54:42PM -0400, Jacob Ritorto wrote: > Torrey McMahon wrote: > >> 3) Performance isn't going to be that great with their design >> but...they might not need it. > > > Would you be able to qualify this assertion? Thinking through it a bit, > even if the disks are better

Re: [zfs-discuss] Issue with simultaneous IO to lots of ZFS pools

2008-04-30 Thread Bill Moore
A silly question: Why are you using 132 ZFS pools as opposed to a single ZFS pool with 132 ZFS filesystems? --Bill On Wed, Apr 30, 2008 at 01:53:32PM -0400, Chris Siebenmann wrote: > I have a test system with 132 (small) ZFS pools[*], as part of our > work to validate a new ZFS-based fileserve
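
A single pool with many filesystems is usually the better fit; a minimal sketch, with hypothetical pool, device, and filesystem names:

  # zpool create tank mirror c1t0d0 c2t0d0
  # zfs create tank/fs001
  # zfs create tank/fs002
  ...
  # zfs list -r tank

Each filesystem is cheap to create and shares the pool's free space, so there is no need to carve out 132 separate pools.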

Re: [zfs-discuss] Best hardware

2007-11-10 Thread Bill Moore
I would recommend the 64-bit system, but make sure your controller card will work in it, first. The bottleneck will most likely be the incoming network connection (100MB/s) in any case. Assuming, of course, that you have more than one disk. With the 64-bit system, you'll run into fewer issues in

Re: [zfs-discuss] Count objects/inodes

2007-11-09 Thread Bill Moore
You can just do something like this:

  # zfs list tank/home/billm
  NAME              USED   AVAIL  REFER  MOUNTPOINT
  tank/home/billm   83.9G  5.56T  74.1G  /export/home/billm

  # zdb tank/home/billm
  Dataset tank/home/billm [ZPL], ID 83, cr_txg 541, 74.1G, 111066 objects

Re: [zfs-discuss] Fracture Clone Into FS

2007-10-17 Thread Bill Moore
I may not be understanding your usage case correctly, so bear with me. Here is what I understand your request to be. Time is increasing from left to right:

  A -- B -- C -- D -- E
   \
    -- F -- G

Where E and G are writable filesystems and the others are snapshots. I think y

Re: [zfs-discuss] import zpool error if use loop device as vdev

2007-09-19 Thread Bill Moore
Also, there is no need to go through lofi. You can directly add files to a zpool:

  # mkfile -v 100m /var/tmp/disk1
  # mkfile -v 100m /var/tmp/disk2
  # zpool create pool_1and2 /var/tmp/disk1 /var/tmp/disk2

I do this in demos all the time, and it's quite handy. And as George pointed out,

Re: [zfs-discuss] ext3 on zvols journal performance pathologies?

2007-09-11 Thread Bill Moore
I would also suggest setting the recordsize property to 4k on the zvol when you create it; 4k is, I think, the native ext3 block size. If you don't do this and allow ZFS to use its 128k default blocksize, then a 4k write from ext3 will turn into a 128k read/modify/write on the ZFS side. This c
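
For zvols the block size is fixed at creation time with the -b option (the volblocksize property); a minimal sketch, assuming a hypothetical pool and volume name:

  # zfs create -V 10G -b 4k tank/ext3vol
  # zfs get volblocksize tank/ext3vol

This way a 4k write from ext3 maps onto a single 4k ZFS block instead of a 128k read/modify/write.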

Re: [zfs-discuss] (politics) Sharks in the waters

2007-09-05 Thread Bill Moore
On Wed, Sep 05, 2007 at 03:43:38PM -0500, Rob Windsor wrote: > (No, I'm not defending Sun in it's apparent patent-growling, either, it > all sucks IMO.) In contrast to the positioning by NetApp, Sun didn't start the patent fight. It was started by StorageTek, well prior to Sun's acquisition of t

Re: [zfs-discuss] Stop a resilver

2007-06-05 Thread Bill Moore
Did you try issuing:

  zpool detach your_pool_name new_device

That should detach the new device and stop the resilver. If you just want to stop the resilver (and leave the device), you should be able to do:

  zpool scrub -s your_pool_name

which will stop the scrub/resilver. --Bill On

Re: [zfs-discuss] ZFS copies and fault tolerance

2007-04-21 Thread Bill Moore
See my blog on this topic: http://blogs.sun.com/bill/entry/ditto_blocks_the_amazing_tape The quick summary is that if there is more than one vdev comprising the pool, the copies will be spread across multiple vdevs. If there is only one, then the copies are spread out physically (at least by
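
For reference, the copies property is set per-filesystem; a minimal sketch with a hypothetical dataset name:

  # zfs set copies=2 tank/home
  # zfs get copies tank/home

Only blocks written after the property is changed get the extra copies; existing blocks are unaffected.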

Re: [zfs-discuss] ZFS+NFS on storedge 6120 (sun t4)

2007-04-20 Thread Bill Moore
When you say rewrites, can you give more detail? For example, are you rewriting in 8K chunks, random sizes, etc? The reason I ask is because ZFS will, by default, use 128K blocks for large files. If you then rewrite a small chunk at a time, ZFS is forced to read 128K, modify the small chunk you'
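
If the rewrites really are small and fixed-size, matching the recordsize to them avoids the read/modify/write cycle; a sketch assuming a hypothetical dataset and an 8K write pattern:

  # zfs set recordsize=8k tank/nfsdata
  # zfs get recordsize tank/nfsdata

As with other ZFS properties, this only affects files written after the change.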

Re: [zfs-discuss] Re: Cheap ZFS homeserver.

2007-01-31 Thread Bill Moore
On Wed, Jan 31, 2007 at 05:01:19AM -0800, Tom Buskey wrote: > As a followup, the system I'm trying to use this on is a dual PII 400 > with 512MB. Real low budget. > > 2 500 GB drives with 2 120 GB in a RAIDZ. The idea is that I can get > 2 more 500 GB drives later to get full capacity. I tested

Re: [zfs-discuss] hot spares - in standby?

2007-01-29 Thread Bill Moore
You could easily do this in Solaris today by just using power.conf(4). Just have it spin down any drives that have been idle for a day or more. The periodic testing part would be an interesting project to kick off. --Bill On Mon, Jan 29, 2007 at 08:21:16PM -0200, Toby Thain wrote: > Hi, > > T
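
A rough sketch of the power.conf(4) side, with a hypothetical device path and threshold; check the man page for the exact syntax on your release:

  # tail -1 /etc/power.conf
  device-thresholds   /dev/dsk/c2t1d0   24h
  # pmconfig

pmconfig re-reads /etc/power.conf and applies the new thresholds.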

Re: [zfs-discuss] NFS and ZFS, a fine combination

2007-01-08 Thread Bill Moore
On Mon, Jan 08, 2007 at 03:47:31PM +0100, Peter Schuller wrote: > > http://blogs.sun.com/roch/entry/nfs_and_zfs_a_fine > > So just to confirm; disabling the zil *ONLY* breaks the semantics of fsync() > and synchronous writes from the application perspective; it will do *NOTHING* > to lesse
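
For completeness, the global knob being discussed in that era was the zil_disable tunable (a diagnostic setting, not recommended for production); a hedged sketch:

  # echo "set zfs:zil_disable = 1" >> /etc/system
  (reboot for the setting to take effect)

It disables the ZIL for every dataset on the system, so synchronous write semantics are lost everywhere, not just where you intend.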

Re: [zfs-discuss] HOWTO make a mirror after the fact

2007-01-07 Thread Bill Moore
On Sun, Jan 07, 2007 at 06:28:04PM -0500, Dennis Clarke wrote: > Now then, I have a collection of six disks on controller c0 that I would > like to now mirror with this ZPool zfs0. Thats the wrong way of thinking > really. In the SVM world I would create stripes and then mirror them to get > eithe
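
The command in question is zpool attach, which turns an existing single-disk vdev into a mirror; a minimal sketch with hypothetical device names:

  # zpool attach zfs0 c0t0d0 c1t0d0
  # zpool status zfs0
  (wait for the resilver to finish)

The second device is resilvered from the first, and from then on that vdev is a two-way mirror.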

Re: [zfs-discuss] zfs pool in degraded state, zpool offline fails with no valid replicas

2007-01-05 Thread Bill Moore
On Fri, Jan 05, 2007 at 10:14:21AM -0800, Eric Hill wrote: > I have a pool of 48 500GB disks across four SCSI channels (12 per > channel). One of the disks failed, and was replaced. The pool is now > in a degraded state, but I can't seem to get the pool to be happy with > the replacement. I did

Re: [zfs-discuss] ZFS over NFS extra slow?

2007-01-02 Thread Bill Moore
Another thing to keep an eye out for is disk caching. With ZFS, whenever the NFS server tells us to make sure something is on disk, we actually make sure it's on disk by asking the drive to flush dirty data in its write cache out to the media. Needless to say, this takes a while. With UFS, it is

Re: [zfs-discuss] A Plea for Help: Thumper/ZFS/NFS/B43

2006-12-09 Thread Bill Moore
On Fri, Dec 08, 2006 at 12:15:27AM -0800, Ben Rockwood wrote: > Clearly ZFS file creation is just amazingly heavy even with ZIL > disabled. If creating 4,000 files in a minute squashes 4 2.6Ghz Opteron > cores we're in big trouble in the longer term. In the meantime I'm > going to find a new h

Re: [zfs-discuss] Production ZFS Server Death (06/06)

2006-11-28 Thread Bill Moore
They both use checksums and can provide self-healing data. --Bill On Tue, Nov 28, 2006 at 02:54:56PM -0700, Jason J. W. Williams wrote: > Do both RAID-Z and Mirror redundancy use checksums on ZFS? Or just RAID-Z? > > Thanks in advance, > J > > On 11/28/06, David Dyer-Bennet <[EMAIL PROTECTED]

Re: Re: [zfs-discuss] poor NFS/ZFS performance

2006-11-23 Thread Bill Moore
On Thu, Nov 23, 2006 at 03:37:33PM +0100, Roch - PAE wrote: > Al Hopper writes: > > Hi Roch - you are correct in that the data presented was incomplete. I > > did'nt present data for the same test with an NFS mount from the same > > server, for a UFS based filesystem. So here is that data poin

Re: [zfs-discuss] ZFS problems

2006-11-18 Thread Bill Moore
Hi Michael. Based on the output, there should be no user-visible file corruption. ZFS saw a bunch of checksum errors on the disk, but was able to recover in every instance. While 2-disk RAID-Z is really a fancy (and slightly more expensive, CPU-wise) way of doing mirroring, at no point should yo
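
To confirm nothing user-visible was affected, a quick check along these lines (hypothetical pool name):

  # zpool scrub tank
  # zpool status -v tank

A clean scrub with no entries in the errors section means every block verified against its checksum.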

Re: [zfs-discuss] Metaslab alignment on RAID-Z

2006-09-26 Thread Bill Moore
Thanks, Chris, for digging into this and sharing your results. These seemingly stranded sectors are properly accounted for in terms of space utilization, since they really are unusable while maintaining integrity in the face of a single drive failure. The way the RAID-Z space accountin

Re: [zfs-discuss] Re: reslivering, how long will it take?

2006-09-15 Thread Bill Moore
On Fri, Sep 15, 2006 at 01:26:21PM -0700, Tim Cook wrote: > says it's online now so I can only assume it's working. Doesn't seem > to be reading from any of the other disks in the array though. Can it > sliver without traffic to any other disks? /noob Can you send the output of "zpool status -v

Re: [zfs-discuss] Re: reslivering, how long will it take?

2006-09-15 Thread Bill Moore
On Fri, Sep 15, 2006 at 01:10:25PM -0700, Tim Cook wrote: > the status showed 19.46% the first time I ran it, then 9.46% the > second. The question I have is I added the new disk, but it's showing > the following: > > Device: c5d0 > Storage Pool: fserv > Type: Disk > Device State: Faulted (cannot

Re: [zfs-discuss] reslivering, how long will it take?

2006-09-15 Thread Bill Moore
On Fri, Sep 15, 2006 at 12:43:19PM -0700, Tim Cook wrote: > Being resilvered 444.00 GB 168.21 GB 158.73 GB > > Just wondering if anyone has any rough guesstimate of how long this > will take? It's 3x1200JB ata drives and one Seagate SATA drive. The > SATA drive is the one that w

Re: [zfs-discuss] Re: Re: Re: Re: Proposal: multiple copies of user data

2006-09-15 Thread Bill Moore
On Fri, Sep 15, 2006 at 01:23:31AM -0700, can you guess? wrote: > Implementing it at the directory and file levels would be even more > flexible: redundancy strategy would no longer be tightly tied to path > location, but directories and files could themselves still inherit > defaults from the fil

Re: [zfs-discuss] Re: Re: Corrupted LUN in RAIDZ group -- How to repair?

2006-09-14 Thread Bill Moore
On Thu, Sep 14, 2006 at 08:09:07AM -0700, David Smith wrote: > I have run zpool scrub again, and I now see checksum errors again. > Wouldn't the checksum errors gotten fixed with the first zpool scrub? > > Can anyone recommend actions I should do at this point? After running the first scrub, d

Re: [zfs-discuss] Re: Re: SCSI synchronize cache cmd

2006-08-22 Thread Bill Moore
On Tue, Aug 22, 2006 at 11:46:30AM -0700, Anton B. Rang wrote: > I realized just now that we're actually sending the wrong variant of > SYNCHRONIZE CACHE, at least for SCSI devices which support SBC-2. > > SBC-2 (or possibly even SBC-1, I don't have it handy) added the > SYNC_NV bit to the command

Re: [zfs-discuss] Encryption on ZFS / Disk Usage

2006-08-22 Thread Bill Moore
On Tue, Aug 22, 2006 at 07:02:53PM +0200, Thomas Deutsch wrote: > >ZFS' RAIDZ1 uses one parity disk per RAIDZ set, similarly to RAID-5. > >ZFS' RAIDZ2 uses two parity disks per RAIDZ set. > > This means that RAIDZ2 allows problems with two disks? That's right. A third failure would cause data lo
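
A minimal sketch of a raidz2 pool with hypothetical devices; any two of the five disks can fail without losing data, while a third concurrent failure loses the pool:

  # zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0
  # zpool status tank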

Re: [zfs-discuss] Re: SCSI synchronize cache cmd

2006-08-21 Thread Bill Moore
On Mon, Aug 21, 2006 at 02:40:40PM -0700, Anton B. Rang wrote: > Yes, ZFS uses this command very frequently. However, it only does this > if the whole disk is under the control of ZFS, I believe; so a > workaround could be to use slices rather than whole disks when > creating a ZFS pool on a buggy
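
The slice-based workaround mentioned above looks something like this (hypothetical devices):

  # zpool create tank c1t0d0s0 c1t1d0s0

Note that ZFS only enables the drive's write cache automatically when it is given whole disks.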

Re: [zfs-discuss] in-kernel gzip compression

2006-08-19 Thread Bill Moore
On Sat, Aug 19, 2006 at 01:25:21PM +0200, michael schuster wrote: > maybe a stupid question: what do we use for compressing dump data on the > dump device? We use a variant of Lempel-Ziv called lzjb (the jb is for Jeff Bonwick). The algorithm was designed for very small code/memory footprint and

Re: [zfs-discuss] ZFS vs. Apple XRaid

2006-07-31 Thread Bill Moore
On Mon, Jul 31, 2006 at 06:08:04PM -0400, Jan Schaumann wrote:
> # echo '::offsetof vdev_t vdev_nowritecache' | mdb -k
> offsetof (vdev_t, vdev_nowritecache) = 0x4c0

Ok, then try this:

  echo '::spa -v' | mdb -k | awk '/dev.dsk/{print $1"+4c0/W1"}' | mdb -kw

--Bill

Re: [zfs-discuss] ZFS vs. Apple XRaid

2006-07-31 Thread Bill Moore
On Mon, Jul 31, 2006 at 03:59:23PM -0400, Jan Schaumann wrote:
> Thanks for the suggestion. However, I'm not sure if the above pipeline
> is correct:
>
> 2# !! | awk '/dev.dsk/{print $1"::print -a vdev_t vdev_nowritecache"}'
> 857a0580::print -a vdev_t vdev_nowritecache
> 3# !! | mdb -k
>

Re: [zfs-discuss] ZFS vs. Apple XRaid

2006-07-31 Thread Bill Moore
On Mon, Jul 31, 2006 at 02:17:00PM -0400, Jan Schaumann wrote: > Is there anybody here who's using ZFS on Apple XRaids and serving them > via NFS? Does anybody have any other ideas what I could do to solve > this? (I have, in the mean time, converted the XRaid to plain old UFS, > and performance

Re: [zfs-discuss] zpool panic

2006-07-31 Thread Bill Moore
Interesting. When you do the import, try doing this: zpool import -o ro yourpool And see if that fares any better. If it works, could you send the output of "zpool status -v"? Also, how big is the pool in question? Either access to the machine, or a way to copy the crash dump would be usef

Re: [zfs-discuss] zfs sucking down my memory!?

2006-07-21 Thread Bill Moore
On Sat, Jul 22, 2006 at 12:44:16AM +0800, Darren Reed wrote: > Bart Smaalders wrote: > > >I just swap on a zvol w/ my ZFS root machine. > > > I haven't been watching...what's the current status of using > ZFS for swap/dump? > > Is a/the swap solution to use mkswap and then specify that file > i

Re: [zfs-discuss] Can't remove corrupt file

2006-07-21 Thread Bill Moore
On Fri, Jul 21, 2006 at 07:22:17AM -0600, Gregory Shaw wrote: > After reading the ditto blocks blog (good article, btw), an idea > occurred to me: > > Since we use ditto blocks to preserve critical filesystem data, would > it be practical to add a filesystem property that would cause all > f

Re: [zfs-discuss] Can't remove corrupt file

2006-07-20 Thread Bill Moore
any block > of ZFS metadata is destroyed, we always have another copy. > Bill Moore describes ditto blocks in detail here: > > http://blogs.sun.com/roller/page/bill?entry=ditto_blocks_the_amazing_tape Right. And I should point out that if Eric had been running build 38 or later, this

Re: [zfs-discuss] Enabling compression/encryption on a populated filesystem

2006-07-18 Thread Bill Moore
On Wed, Jul 19, 2006 at 03:10:00AM +0200, [EMAIL PROTECTED] wrote: > So how many of the 128 bits of the blockpointer are used for things > other than to point where the block is? 128 *bits*? What filesystem have you been using? :) We've got luxury-class block pointers that are 128 *bytes*. We

Re: [zfs-discuss] Re: Transactional RAID-Z?

2006-07-11 Thread Bill Moore
On Tue, Jul 11, 2006 at 11:03:17PM -0400, David Abrahams wrote: > How can RAID-Z preserve transactional semantics when a single > FS block write requires writing to multiple physical devices? ZFS uses a technique that's been used in databases for years: phase trees. First you write all

Re: [zfs-discuss] ZFS needs a viable backup mechanism

2006-07-07 Thread Bill Moore
On Fri, Jul 07, 2006 at 08:20:50AM -0400, Dennis Clarke wrote: > As near as I can tell the ZFS filesystem has no way to backup easily to a > tape in the same way that ufsdump has served for years and years. > > ... > > Of course it took a number of hours for that I/O error to appear because the > t

Re: [zfs-discuss] x86 CPU Choice for ZFS

2006-07-07 Thread Bill Moore
On Fri, Jul 07, 2006 at 09:50:47AM +0100, Darren J Moffat wrote: > Eric Schrock wrote: > >On Thu, Jul 06, 2006 at 09:53:32PM +0530, Pramod Batni wrote: > >> offtopic query : > >> How can ZFS require more VM address space but not more VM ? > >> > > > >The real problem is VA fragmentation, not co

Re: [zfs-discuss] ZFS questions

2006-06-20 Thread Bill Moore
On Tue, Jun 20, 2006 at 11:17:42AM -0700, Jonathan Adams wrote: > On Tue, Jun 20, 2006 at 09:32:58AM -0700, Richard Elling wrote: > > Flash is (can be) a bit more sophisticated. The problem is that they > > have a limited write endurance -- typically spec'ed at 100k writes to > > any single bit.

Re: [zfs-discuss] disk write cache, redux

2006-06-02 Thread Bill Moore
On Fri, Jun 02, 2006 at 12:42:53PM -0700, Philip Brown wrote: > hi folks... > I've just been exposed to zfs directly, since I'm trying it out on > "a certain 48-drive box with 4 cpus" :-) > > I read in the archives, the recent " hard drive write cache " > thread. in which someone at sun made the c

Re: [zfs-discuss] The 12.5% compression rule

2006-05-24 Thread Bill Moore
On Thu, May 11, 2006 at 12:34:45PM +0100, Darren J Moffat wrote: > Where does the 12.5% compression rule in zio_compress_data() come from ? > Given that this is in the generic function for all compression > algorithms rather than in the implementation of lzjb I wonder where the > number comes fro

Re: [zfs-discuss] Re: Re: zfs snapshot for backup, Quota

2006-05-18 Thread Bill Moore
On Thu, May 18, 2006 at 12:46:28PM -0700, Charlie wrote: > Eric Schrock wrote: > > > Using traditional tools or ZFS send/receive? > > Traditional (amanda). I'm not seeing a way to dump zfs file systems to > tape without resorting to 'zfs send' being piped through gtar or > something. Even then, th
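
A rough sketch of the zfs send route being discussed, with hypothetical dataset, snapshot, and tape device names:

  # zfs snapshot tank/home@backup1
  # zfs send tank/home@backup1 > /dev/rmt/0n
  # zfs receive tank/restored < /dev/rmt/0n

Unlike ufsdump, the stream can only be restored with zfs receive; there is no way to pull individual files off the tape.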

Re: [zfs-discuss] ZFS: More information on ditto blocks?

2006-05-05 Thread Bill Moore
On Fri, May 05, 2006 at 10:19:56AM +0200, Constantin Gonzalez wrote: > (apologies if this was discussed before, I _did_ some research, but this > one may have slipped for me...) I'm in the process of writing a blog on this one. Give me another day or so. > Looking through the current Sun ZFS Tec

Re: [zfs-discuss] write sizes

2006-05-04 Thread Bill Moore
On Thu, May 04, 2006 at 09:55:37AM -0700, Adam Leventhal wrote: > Is there a way, given a dataset or pool, to get some statistics about the > sizes of writes that were made to the underlying vdevs? Does zdb -bsv give you what you want? --Bill