I have a Sun A5000, 22x 73GB 15K disks in split-bus configuration, two dual 2Gb
HBAs and four fibre cables from server to array, all for just under $200.
The array gives 4Gb/s of aggregate throughput in each direction across two
11-disk buses.
Right now it is the main array, but when we outgrow it
David Dyer-Bennet wrote:
> My choice of mirrors rather than RAIDZ is based on the fact that I have
> only 8 hot-swap bays (I still think of this as LARGE for a home server;
> the competition, things like the Drobo, tends to have 4 or 5), that I
> don't need really large amounts of storage (af
On Jun 3, 2010 7:35 PM, David Magda wrote:
> On Jun 3, 2010, at 13:36, Garrett D'Amore wrote:
>
> > Perhaps you have been unlucky. Certainly, there is a window with N+1
> > redundancy where a single failure leaves the system exposed in the face
> > of a 2nd fault. This is a statistics
>
> I have a small question about the depth of scrub in a raidz/2/3
> configuration. I'm quite sure scrub does not check spares or unused areas
> of the disks (it could check if the disks detect any errors there). But
> what about the parity?
From some informal performance testing of RAIDZ2/3
Joachim Worringen wrote:
> Greetings,
>
> we are running a few databases of currently 200GB (growing) in total
> for data warehousing:
> - new data via INSERTs for (up to) millions of rows per day; sometimes with UPDATEs
> - most data in a single table (=> 10 to 100s of millions of rows)
> - q
> I think the request is to remove vdev's from a pool.
> Not currently possible. Is this in the works?
Actually, I think this is two requests, hashed over hundreds of times in this
forum:
1. Remove a vdev from a pool
2. Nondisruptively change vdev geometry
#1 above has a stunningly obvious use
Cindy wrote:
> Mirrored pools are more flexible and generally provide good performance.
>
> You can easily create a mirrored pool of two disks and then add two more
> disks later. You can also replace each disk with larger disks if needed.
> See the example below.
There is no dispute that m
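For completeness, the kind of example Cindy has in mind is roughly the
following; the pool and device names are placeholders, not a recipe for your
system:
  zpool create tank mirror c0t0d0 c0t1d0
  zpool add tank mirror c0t2d0 c0t3d0      # grow the pool with a second mirror
  zpool replace tank c0t0d0 c0t4d0         # swap one side for a larger disk
Replace both sides of a mirror with larger disks and the extra capacity becomes
available (on recent builds, with the autoexpand pool property set to on).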
> I've found plenty of documentation on how to create a ZFS volume, iSCSI
> share it, and then do a fresh install of Fedora or Windows on the volume.
Really? I have found just the opposite: how to move your functioning
Windows/Linux install to iSCSI.
I am fumbling through this process for Ubu
> > ' iostat -Eni ' indeed outputs Device ID on some of the drives, but I
> > still can't understand how it helps me to identify the model of a
> > specific drive.
Get and install smartmontools. Period. I resisted it for a few weeks but it
has been an amazing tool. It will tell you more than you
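On Solaris I point it at the raw device, something like the lines below; the
device path is just an example and some controllers want an explicit -d type:
  smartctl -i /dev/rdsk/c0t1d0s0     # vendor, model, firmware, serial number
  smartctl -a /dev/rdsk/c0t1d0s0     # full SMART attributes and error log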
Michael Shadle wrote:
>Actually I guess my real question is why iostat hasn't logged any
> errors in its counters even though the device has been bad in there
> for months?
One of my arrays had a drive in slot 4 fault -- lots of reset something or
other errors. I cleared the errors and the po
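When that happens I compare what the OS and FMA each think they saw; roughly
(the device name here is only illustrative):
  iostat -En c7t4d0     # soft/hard/transport error counters for that device
  fmdump -e             # FMA error telemetry, including bus resets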
> If the format utility is not displaying the WD drives correctly, then ZFS
> won't see them correctly either. You need to find out why.
>
> I would export this pool and recheck all of your device connections.
I didn't see it in the postings, but are the same serial numbers showing up
mult
> Hi,
> Out of pure curiosity, I was wondering, what would happen if one tries to
> use a regular 7200RPM (or 10K) drive as slog or L2ARC (or both)?
I have done both with success.
At one point my backup pool was a collection of USB attached drives (please
keep the laughter down) with dedup=ver
> ahh that explains it all, god damn that base 1000 standard, only useful
> for sales people :)
As much as it all annoys me too, the SI prefixes are used correctly pretty much
everywhere except in operating systems.
A kilometer is not 1024 meters and a megawatt is not 1048576 watts.
Us, the I
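To put a number on the gap: a "1 TB" drive is 10^12 bytes, and 10^12 / 2^30 is
about 931, which is why the OS reports it as roughly 931 "GB" (really GiB).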
Erik Trimble wrote:
> On 8/10/2010 9:57 PM, Peter Taps wrote:
> > Hi Eric,
> >
> > Thank you for your help. At least one part is clear now.
> >
> > I still am confused about how the system is still functional after one
> > disk fails.
> >
> > Consider my earlier example of 3 disks zpool configure
Peter wrote:
> One question though. Marty mentioned that raidz parity is limited to 3.
> But in my experiment, it seems I can get parity to any level.
>
> You create a raidz zpool as:
>
> # zpool create mypool raidzx disk1 disk2
>
> Here, x in raidzx is a numeric value indicating the d
> Hello,
>
> I would like to backup my main zpool (originally called "data") inside an
> equally originally named "backup" zpool, which will also hold other kinds
> of backups.
>
> Basically I'd like to end up with
> backup/data
> backup/data/dataset1
> backup/data/dataset2
> backup/otherthing
Script attached.
Cheers,
Marty
Erik Trimble wrote:
> As always, the devil is in the details. In this case, the primary problem
> I'm having is maintaining two different block mapping schemes (one for the
> old disk layout, and one for the new disk layout) and still being able to
> interrupt the expansion process. My
> > Hi Ross,
> >
> > What about old good raid10? It's a pretty reasonable choice for heavy
> > loaded storages, isn't it?
> >
> > I remember when I migrated raidz2 to 8xdrives raid10 the application
> > administrators were just really happy with the new access speed. (we
> > didn't use
Bob Friesenhahn wrote:
> Why are people talking about "RAID-5", "RAID-6", and "RAID-10" on this
> list? This is the zfs-discuss list and zfs does not do "RAID-5",
> "RAID-6", or "RAID-10".
>
> Applying classic RAID terms to zfs is just plain wrong and misleading
> since zfs does not direc
Bob Friesenhahn wrote:
> On Tue, 22 Dec 2009, Marty Scholes wrote:
> >
> > That's not entirely true, is it?
> > * RAIDZ is RAID5 + checksum + COW
> > * RAIDZ2 is RAID6 + checksum + COW
> > * A stack of mirror vdevs is RAID10 + checksum + COW
>
risner wrote:
> If I understand correctly, raidz{1} is 1 drive protection and space is
> (drives - 1) available. Raidz2 is 2 drive protection and space is
> (drives - 2) etc. Same for raidz3 being 3 drive protection.
Yes.
> Everything I've seen, you should stay around 6-9 drives for raidz, so
Michael Herf wrote:
> I've written about my slow-to-dedupe RAIDZ.
>
> After a week of waiting, I finally bought a little $100 30G OCZ Vertex and
> plugged it in as a cache.
>
> After <2 hours of warmup, my zfs send/receive rate on the pool is
> >16MB/sec (reading and writing each at 16M
Ian wrote:
> Why did you set dedup=verify on the USB pool?
Because that is my last-ditch copy of the data and MUST be correct. At the
same time, I want to cram as much data as possible into the pool.
If I ever go to the USB pool, something has already gone horribly wrong and I
am desperate. I
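For reference, that is just a per-dataset property; on my setup it amounts to
something like the following (the pool name is mine, not a recommendation):
  zfs set dedup=verify backup
  zfs get dedup backup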
--- On Thu, 1/7/10, Tiernan OToole wrote:
> Sorry to hijack the thread, but can you explain your setup? Sounds
> interesting, but need more info...
This is just a home setup to amuse me and placate my three boys, each of whom
has several Windows instances running under Virtualbox.
Server is a
> Any news regarding this issue? I'm having the same
> problems.
Me too. My v40z with U320 drives in the internal bay will lock up partway
through a scrub.
I backed the whole SCSI chain down to U160, but it seems a shame that U320
speeds can't be used.
> To fix it, I swapped out the Adaptec controller and
> put in LSI Logic
> and all the problems went away.
I'm using Sun's built-in LSI controller with (I presume) the original internal
cable shipped by Sun.
Still, no joy for me at U320 speeds. To be precise, when the controller is set
at U3
>> Was my raidz2 performance comment above correct? That the write speed is
>> that of the slowest disk? That is what I believe I have read.
> You are sort-of-correct that it's the write speed of the slowest disk.
My experience is not in line with that statement. RAIDZ will write a co
Bob Friesenhahn wrote:
> It is unreasonable to spend more than 24 hours to resilver a single
> drive. It is unreasonable to spend more than 6 days resilvering all
> of the devices in a RAID group (the 7th day is reserved for the system
> administrator). It is unreasonable to spend very much time
I can't stop myself; I have to respond. :-)
Richard wrote:
> The ideal pool has one inexpensive, fast, and reliable device :-)
My ideal pool has become one inexpensive, fast and reliable "device" built on
whatever I choose.
> I'm not sure how to connect those into the system (USB 3?)
Me neith
This paper is exactly what is needed -- giving an overview to a wide audience
of the ZFS fundamental components and benefits.
I found several grammar errors -- to be expected in a draft -- and I think at
least one technical error.
The paper seems to imply that multiple vdevs will induce striping a
> Is it currently or near future possible to shrink a
> zpool "remove a disk"
As others have noted, no, not until the mythical bp_rewrite() function is
introduced.
So far I have found no documentation on bp_rewrite(), other than it is the
solution to evacuating a vdev, restriping a vdev, defra
Erik wrote:
> Actually, your biggest bottleneck will be the IOPS limits of the drives.
> A 7200RPM SATA drive tops out at 100 IOPS. Yup. That's it.
> So, if you need to do 62.5e6 IOPS, and the rebuild drive can do just 100
> IOPS, that means you will finish (best case) in 62.5e4 seconds
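(For scale, 62.5e4 seconds works out to a little over 7 days.)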
I am speaking from my own observations and nothing scientific such as reading
the code or designing the process.
> A) Resilver = Defrag. True/false?
False
> B) If I buy larger drives and resilver, does defrag
> happen?
No. The first X sectors of the bigger drive are identical to the smaller
Richard Elling wrote:
> Define "fragmentation"?
Maybe this is the wrong thread. I have noticed that an old pool can take 4
hours to scrub, with a large portion of the time spent reading from the pool
disks at 150+ MB/s while zpool iostat reports only 2 MB/s of read throughput.
My naive interpretation i
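The mismatch is easy to see by watching both views side by side, e.g. (the
pool name is only an example):
  zpool iostat -v tank 5     # what ZFS thinks it is reading
  iostat -xn 5               # what the disks are actually doing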
David Dyer-Bennet wrote:
> Sure, if only a single thread is ever writing to the disk store at a time.
>
> This situation doesn't exist with any kind of enterprise disk appliance,
> though; there are always multiple users doing stuff.
Ok, I'll bite.
Your assertion seems to be that "any kind of
Alexander Skwar wrote:
> Okay. This contradicts the ZFS Best Practices Guide, which states:
>
> # For production environments, configure ZFS so that it can repair data
> # inconsistencies. Use ZFS redundancy, such as RAIDZ, RAIDZ-2, RAIDZ-3,
> # mirror, or copies > 1, regardless of the R
Is this a sector size issue?
I see two of the disks each doing the same amount of work in roughly half the
I/O operations, with each operation taking about twice as long compared to
each of the remaining six drives.
I know nothing about either drive, but I wonder if one type of drive has twice
the
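One quick way to compare what the drives report about themselves (device
names are just examples):
  prtvtoc /dev/rdsk/c0t0d0s2 | head     # the header includes bytes/sector
  iostat -En                            # model, firmware and size per drive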
Roy Sigurd Karlsbakk wrote:
> device r/s w/s kr/s kw/s wait actv svc_t %w %b
> cmdk0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
> cmdk1 0.0 163.6 0.0 20603.7 1.6 0.5 12.9 24 24
> fd0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
> sd0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
> sd1 0.5 140.3 0.3 2426.3 0.0 1.0 7.2 0 14
> sd2 0
Have you had a lot of activity since the scrub started?
I have noticed what appears to be extra I/O at the end of a scrub when activity
took place during the scrub. It's as if the scrub estimator does not take the
extra activity into account.
I think you are seeing ZFS store up the writes, coalesce them, then flush to
disk every 30 seconds.
Unless the writes are synchronous, the ZIL won't be used, but the writes will
be cached instead, then flushed.
If you think about it, this is far more sane than flushing to disk every time
the w
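You can watch the behavior directly; with a one-second interval the writes
show up as a burst at each txg sync rather than a steady trickle (the pool
name is an example, and if I recall correctly the interval lives in the
zfs_txg_timeout tunable):
  zpool iostat tank 1
  echo zfs_txg_timeout/D | mdb -k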
Ok,
Let's think about this for a minute. The log drive is c1t11d0 and it appears
to be almost completely unused, so we probably can rule out a ZIL bottleneck.
I run Ubuntu booting iSCSI against OSol 128a and the writes do not appear to be
synchronous. So, writes aren't the issue.
From the
I apologize if this has been covered before. I have not seen a blow-by-blow
installation guide for Ubuntu onto an iSCSI target.
The install guides I have seen assume that you can make a target visible to
all, which is a problem if you want multiple iSCSI installations on the same
COMSTAR targe
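In case it helps anyone else searching, the target-side steps I have pieced
together look roughly like this; names and sizes are placeholders, and the
view/host-group handling is the part to double-check against the COMSTAR docs:
  zfs create -V 20g tank/ubuntu-root
  stmfadm create-lu /dev/zvol/rdsk/tank/ubuntu-root
  stmfadm add-view <lu-guid>     # or scope it with host groups (stmfadm create-hg)
  itadm create-target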
> Here are some more findings...
>
> The Nexenta box has 3 pools:
> syspool: made of 2 mirrored (hardware RAID) local SAS disks
> pool_sas: made of 22 15K SAS disks in ZFS mirrors on 2 JBODs on 2 controllers
> pool_sata: made of 42 SATA disks in 6 RAIDZ2 vdevs on a single controller
>
> Whe
> I've had a few people sending emails directly suggesting it might have
> something to do with the ZIL/SLOG. I guess I should have said that the
> issue happens both ways, whether we copy TO or FROM the Nexenta box.
You mentioned a second Nexenta box earlier. To rule out client-side issues,
Sorry, I can't not respond...
Edward Ned Harvey wrote:
> whatever you do, *don't* configure one huge raidz3.
Peter, whatever you do, *don't* make a decision based on blanket
generalizations.
> If you can afford mirrors, your risk is much lower. Because although it's
> physically possible for
> On Fri, Oct 15, 2010 at 3:16 PM, Marty Scholes wrote:
> > My home server's main storage is a 22 (19 + 3) disk RAIDZ3 pool backed
> > up hourly to a 14 (11+3) RAIDZ3 backup pool.
>
> How long does it take to resilver a disk in that pool? And how lo
> Richard wrote:
> Yep, it depends entirely on how you use the pool. As soon as you
> come up with a credible model to predict that, then we can optimize
> accordingly :-)
You say that somewhat tongue-in-cheek, but Edward's right. If the resilver
code progresses in slab/transaction-group/whatev
Richard wrote:
>
> Untrue. The performance of a 21-disk raidz3 will be nowhere near the
> performance of a 20 disk 2-way mirror.
You know this stuff better than I do. Assuming no bus/cpu bottlenecks, a 21
disk raidz3 should provide sequential throughput of 18 disks and random
throughput of 1 d
> 2011/5/26 Eugen Leitl :
> > How bad would raidz2 do on mostly sequential writes and reads
> > (Athlon64 single-core, 4 GByte RAM, FreeBSD 8.2)?
> >
> > The best way to go is striping mirrored pools, right?
> > I'm worried about losing the two "wrong" drives out of 8.
> > These are all 72
While I am by no means an expert on this, I went through a similar mental
exercise previously and came to the conclusion that in order to service a
particular read request, zfs may need to read more from the disk. For example,
a 16KB request in a stripe might need to retrieve the full 128KB str
I'll throw out some (possibly bad) ideas.
Is ARC satisfying the caching needs? 32 GB for ARC should almost cover the
40GB of total reads, suggesting that the L2ARC doesn't add any value for this
test.
Are the SSD devices saturated from an I/O standpoint? Put another way, can ZFS
put data to
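A couple of things I would check before blaming the SSDs (kstat names from
memory, pool name is an example):
  kstat -n arcstats | egrep 'size|hits|misses'   # ARC size and hit/miss counters
  zpool iostat -v tank 5                         # per-vdev load, including cache devices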
> > Are some of the reads sequential? Sequential reads don't go to L2ARC.
>
> That'll be it. I assume the L2ARC is just taking metadata. In situations
> such as mine, I would quite like the option of routing sequential read
> data to the L2ARC also.
The good news is that it is almost a c
> This is not a true statement. If the primarycache
> policy is set to the default, all data will
> be cached in the ARC.
Richard, you know this stuff so well that I am hesitant to disagree with you.
At the same time, I have seen this myself, trying to load video files into
L2ARC without succes
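For what it's worth, the knobs involved are plain dataset properties; I ended
up poking at them like this (the dataset name is mine):
  zfs get primarycache,secondarycache tank/video
  zfs set secondarycache=all tank/video     # all | metadata | none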
> If it is true that unlike ZFS itself, the replication stream format has no
> redundancy (even of ECC/CRC sort), how can it be used for long-term
> retention "on tape"?
It can't. I don't think it has been documented anywhere, but I believe that it
has been well understood that if you don't
> I stored a snapshot stream to a file
The tragic irony here is that the file was stored on a non-ZFS filesystem. You
had undetected bitrot which unknowingly corrupted the stream. Other files might
have been silently corrupted as well.
You may have just made one of the strongest case
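If a stream file has to sit on a non-ZFS filesystem, the least one can do is
store a checksum next to it, e.g. (paths are examples):
  zfs send tank/fs@snap > /backup/fs.zfs
  digest -a sha256 /backup/fs.zfs > /backup/fs.zfs.sha256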
I am assuming you will put all of the vdevs into a single pool, which is a
good idea unless you have a specific reason for keeping them separate, e.g. you
want to be able to destroy / rebuild a particular vdev while leaving the others
intact.
Fewer disks per vdev implies more vdevs, providing
Just for completeness, there is also VirtualBox which runs Solaris nicely.
It sounds like you are getting a good plan together.
> The only thing though I seem to remember reading that adding vdevs to
> pools way after the creation of the pool and data had been written to it,
> that things aren't spread evenly - is that right? So it might actually make
> sense to buy all
> Has there been any change to the server hardware with respect to number of
> drives since ZFS has come out? Many of the servers around still have an
> even number of drives (2, 4) etc. and it seems far from optimal from a ZFS
> standpoint. All you can do is make one or two mirrors, or a 3
> Lights. Good.
Agreed. In a fit of desperation and stupidity I once enumerated disks by
pulling them one by one from the array to see which zfs device faulted.
On a busy array it is hard even to use the LEDs as indicators.
It makes me wonder how large shops with thousands of spindles handle
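For Sun FC enclosures there is at least luxadm, which can blink a drive's LED
instead of yanking it; the exact argument syntax below is from memory, so
check the man page:
  luxadm probe                       # list enclosures and their paths
  luxadm led_blink <path-to-drive>   # blink the LED on one slot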
Funny you say that.
My Sun v40z connected to a pair of Sun A5200 arrays running OSol 128a can't see
the enclosures. The luxadm command comes up blank.
Except for that annoyance (and similar other issues) the Sun gear works well
with a Sun operating system.
> didn't seem to we would need zfs to provide that redundancy also.
There was a time when I fell for this line of reasoning too. The problem (if
you want to call it that) with zfs is that it will show you, front and center,
the corruption taking place in your stack.
> Since we're on SAN with R
After moving from SXCE to 2009.06, my ZFS pools/file systems were at too new of
a version. I upgraded to the latest dev and recently upgraded to 122, but am
not too thrilled with the instability, especially zfs send / recv lockups
(don't recall the bug number).
I keep a copy of all of my criti
> The zfs send stream is dependent on the version of the filesystem, so the
> only way to create an older stream is to create a back-versioned filesystem:
>
> zfs create -o version=N pool/filesystem
>
> You can see what versions your system supports by using the zfs upgrade
> comman
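Concretely, that is something like the following, where the version number is
only an example:
  zfs upgrade -v                        # list the filesystem versions this build supports
  zfs create -o version=3 tank/compat   # dataset an older receiver can understand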
> Generally speaking, striping mirrors will be faster than raidz or raidz2,
> but it will require a higher number of disks and therefore higher cost to
> The main reason to use raidz or raidz2 instead of striping mirrors would be
> to keep the cost down, or to get higher usable space out
Lori Alt wrote:
> As for being able to read streams of a later format on an earlier version
> of ZFS, I don't think that will ever be supported. In that case, we really
> would have to somehow convert the format of the objects stored within the
> send stream and we have no plans to impl
> This line of reasoning doesn't get you very far. It is much better to take
> a look at the mean time to data loss (MTTDL) for the various configurations.
> I wrote a series of blogs to show how this is done.
> http://blogs.sun.com/relling/tags/mttdl
> Yes. This is a mathematical way of saying "lose any P+1 of N disks."
I am hesitant to beat this dead horse, yet it is a nuance that either I have
completely misunderstood or many people I've met have completely missed.
Whether a stripe of mirrors or a mirror of stripes, any single failure m