Hi,
Just been reading the apache.org incident report for 8/28/2009
( https://blogs.apache.org/infra/entry/apache_org_downtime_report )
The use of Solaris and ZFS on the European server was interesting, including the
recovery.
However, what I found more interesting was the use of one time passwords
Let me explain what I have and you decide if it's what you're looking for.
I run a home NAS based on ZFS (due to hardware issues I am using FreeBSD 7.2
as my OS, but all the data is on ZFS)
This system has multiple uses. I have about 10 users and 4 HTPCs connected
via gigabit. I have ZFS filesystems
Scott Lawson writes:
> Also you may wish to look at the output of 'iostat -xnce 1' as well.
>
> You can post those to the list if you have a specific problem.
>
> You want to be looking for error counts increasing and specifically 'asvc_t'
> for the service times on the disks. A higher number
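In practice that looks something like the following (a minimal sketch; device
names will be whatever your system reports):
  # sample extended stats once per second; watch the asvc_t column
  # (average service time, in ms) and the error counters added by -e
  iostat -xnce 1
  # one-shot per-device error summary, handy for spotting a failing disk
  iostat -En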
Do you have the zfs primarycache property on this release?
If so, you could set it to 'metadata' or 'none'.
primarycache=all | none | metadata
Controls what is cached in the primary cache (ARC). If
this property is set to "all", then both user data and
metadata are cached.
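For example (the dataset name here is made up):
  # cache only metadata in the ARC for this dataset, leaving user data uncached
  zfs set primarycache=metadata tank/data
  zfs get primarycache tank/data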
Bill Moore (sun.com) writes:
>
> Moving on, modern high-capacity SATA drives are in the 100-120MB/s
> range. Let's call it 125MB/s for easier math. A 5-port port multiplier
> (PM) has 5 links to the drives, and 1 uplink. SATA-II speed is 3Gb/s,
> which, after all the framing overhead, can get you
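Rough numbers behind that, treating the 3Gb/s link as ~300MB/s after 8b/10b
coding and a bit less once framing is counted:
  5 drives x 125 MB/s = 625 MB/s of raw disk bandwidth, behind
  1 uplink  at ~300 MB/s (minus framing overhead)
  => the shared uplink, not the drives, is the bottleneck, roughly 2:1 oversubscribed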
Marc Bevand (gmail.com) writes:
>
> So in conclusion, my SBNSWAG (scientific but not so wild-ass guess)
> is that the max I/O throughput when reading from all the disks on
> 1 of their storage pod is about 1000MB/s.
Correction: the SiI3132 are on x1 (not x2) links, so my guess as to
the aggregate
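For reference, the per-link arithmetic:
  2.5 Gb/s (PCIe 1.x, x1) x 8/10 (8b/10b coding) = ~250 MB/s per direction
so each SiI3132 on an x1 link tops out around 250MB/s regardless of how many
drives sit behind it.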
"100% random writes produce around 200 IOPS with a 4-6 second pause
around every 10 seconds. "
This indicates that the bandwidth you're able to transfer
through the protocol is about 50% greater than the bandwidth
the pool can offer to ZFS. Since this is not sustainable, you
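A rough worked example with assumed numbers (150 MB/s over the wire vs.
100 MB/s into the pool) shows where the 4-6 second pauses come from:
  accepted in a 10 s interval:  150 MB/s x 10 s = 1500 MB
  written  in a 10 s interval:  100 MB/s x 10 s = 1000 MB
  backlog:                      500 MB / 100 MB/s = ~5 s stall to catch up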
stuart anderson writes:
> > > > Question:
> > > >
> > > > Is there a way to change the volume blocksize, say
> > > > via 'zfs snapshot send/receive'?
> > > >
> > > > As I see things, this isn't possible as the target
> > > > volume (including property values) gets overwritten
>
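If that's right, one workaround (a sketch only, with made-up pool/volume names)
is to create a fresh zvol with the desired volblocksize, which can only be set
at creation time, and copy the contents at the block level instead of using
send/receive:
  zfs create -V 100G -o volblocksize=8k tank/newvol
  dd if=/dev/zvol/rdsk/tank/oldvol of=/dev/zvol/rdsk/tank/newvol bs=1M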
Lori Alt wrote:
The -n option does some verification. It verifies that the record
headers distributed throughout the stream are syntactically valid.
Since each record header contains a length field which allows the next
header to be found, one bad header will cause the processing of the
stream
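So a saved stream can be sanity-checked without creating anything on the
receiving side, e.g. (stream path and target name are made up):
  # dry run: parse the record headers but create nothing; -v reports progress
  zfs receive -n -v tank/restore < /backup/nightly.zfs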
Roch Bourbonnais wrote:
"100% random writes produce around 200 IOPS with a 4-6 second pause
around every 10 seconds."
This indicates that the bandwidth you're able to transfer
through the protocol is about 50% greater than the bandwidth
the pool can offer to ZFS. Since this is not sustainabl
On 09/04/09 09:41, dick hoogendijk wrote:
Lori Alt wrote:
The -n option does some verification. It verifies that the record
headers distributed throughout the stream are syntactically valid.
Since each record header contains a length field which allows the
next header to be found, one bad he
Hi zfs cognoscenti,
a few quick questions about my hardware choice (a bit late, since the
box is up already):
A 3U supermicro chassis with 16x SATA/SAS hotplug
Supermicro X8DDAi (2x Xeon QC 1.26 GHz S1366, 24 GByte RAM, IPMI)
2x LSI SAS3081E-R
16x WD2002FYPS
Right now I'm running Solaris 10 5/9
On Sep 3, 2009, at 10:32 PM, Tim Cook wrote:
On Fri, Sep 4, 2009 at 12:17 AM, Ross wrote:
Hi Richard,
Actually, reading your reply has made me realise I was overlooking
something when I talked about tar, star, etc... How do you backup a
ZFS volume? That's something traditional tools can't
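One answer, for what it's worth (names made up): snapshot the volume and stream
the snapshot with zfs send, since file-level tools like tar and star can't see
inside a zvol:
  zfs snapshot tank/vm01@backup
  zfs send tank/vm01@backup | gzip > /backup/vm01.zfs.gz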
Lori Alt wrote:
The -u option to zfs recv (which was just added to support flash
archive installs, but it's useful for other reasons too) suppresses
all mounts of the received file systems. So you can mount them
yourself afterward in whatever order is appropriate, or do a 'zfs
mount -a'.
You
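Something like this, with made-up names:
  # receive without mounting anything...
  zfs receive -u tank/restore < /backup/full.zfs
  # ...then mount in whatever order is appropriate, or all at once:
  zfs mount -a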
On 09/04/09 09:54, Scott Meilicke wrote:
Roch Bourbonnais wrote:
"100% random writes produce around 200 IOPS with a 4-6 second pause
around every 10 seconds. "
This indicates that the bandwidth you're able to transfer
through the protocol is about 50% greater than the bandwidth
the po
I am still not buying it :) I need to research this to satisfy myself.
I can understand that the writes come from memory to disk during a txg write
for async, and that is the behavior I see in testing.
But for sync, data must be committed, and an SSD/ZIL makes that faster because
you are writing
This sounds like the same behavior as OpenSolaris 2009.06. I had several disks
recently go UNAVAIL, and the spares did not take over. But as soon as I
physically removed a disk, the spare started replacing the removed disk. It
seems UNAVAIL is not the same as the disk not being there. I wish the
Scott Meilicke wrote:
> So what happens during the txg commit?
>
> For example, if the ZIL is a separate device, SSD for this example, does it
> not work like:
>
> 1. A sync operation commits the data to the SSD
> 2. A txg commit happens, and the data from the SSD are written to the
> spinning
So what happens during the txg commit?
For example, if the ZIL is a separate device, SSD for this example, does it not
work like:
1. A sync operation commits the data to the SSD
2. A txg commit happens, and the data from the SSD are written to the spinning
disk
So this is two writes, correct?
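For reference, the separate log device being described is attached like this
(pool and device names made up); sync writes commit to it first, and the same
data is later written to the main disks from memory at the next txg commit:
  zpool add tank log c2t0d0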
We have a number of shared spares configured in our ZFS pools, and
we're seeing weird issues where spares don't get used under some
circumstances. We're running Solaris 10 U6 using pools made up of
mirrored vdevs, and what I've seen is:
* if ZFS detects enough checksum errors on an active disk,
Doh! I knew that, but then forgot...
So, for the case of no separate device for the ZIL, the ZIL lives on the disk
pool. In which case, the data are written to the pool twice during a sync:
1. To the ZIL (on disk)
2. From RAM to disk during txg
If this is correct (and my history in this thread
On Fri, 4 Sep 2009, Scott Meilicke wrote:
So what happens during the txg commit?
For example, if the ZIL is a separate device, SSD for this example, does it not
work like:
1. A sync operation commits the data to the SSD
2. A txg commit happens, and the data from the SSD are written to the spi
So, I just re-read the thread, and you can forget my last post. I had thought
the argument was that the data were not being written to disk twice (assuming
no separate device for the ZIL), but it was just explaining to me that the data
are not read from the ZIL to disk, but rather from memory to
On Fri, Sep 4, 2009 at 5:36 AM, Marc Bevand wrote:
> Marc Bevand (gmail.com) writes:
> >
> > So in conclusion, my SBNSWAG (scientific but not so wild-ass guess)
> > is that the max I/O throughput when reading from all the disks on
> > 1 of their storage pod is about 1000MB/s.
>
> Correction: the
Scott Meilicke wrote:
I am still not buying it :) I need to research this to satisfy myself.
I can understand that the writes come from memory to disk during a txg write
for async, and that is the behavior I see in testing.
But for sync, data must be committed, and an SSD/ZIL makes that faster
We have groups generating terabytes a day of image data from lab instruments
and saving them to an X4500.
We have tried lzjb:    compressratio = 1.13 in 11 hours,   1.3 TB -> 1.1 TB
             gzip -9:  compressratio = 1.68 in > 37 hours, 1.3 TB -> 0.75 TB
The filesystem performance was
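For reference, the algorithm is a per-dataset property and the achieved ratio
can be read back afterwards (dataset name made up):
  zfs set compression=gzip-9 tank/images
  zfs get compression,compressratio tank/images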
On Wed, Sep 2, 2009 at 4:56 PM, David Magda wrote:
> Said support was committed only two to three weeks ago:
>
>> PSARC/2009/394 SATA Framework Port Multiplier Support
>> 6422924 sata framework has to support port multipliers
>> 6691950 ahci driver needs to support SIL3726/4726 SATA port multiplier
Hi Brandon
To answer your question, all you need to do is look up those bug numbers:
http://bugs.opensolaris.org/view_bug.do?bug_id=6422924
http://bugs.opensolaris.org/view_bug.do?bug_id=6691950
..and you see the fix should be in release snv_122.
You're in luck, as the OpenSolaris dev repository
On Sep 4, 2009, at 2:22 PM, Scott Meilicke wrote:
So, I just re-read the thread, and you can forget my last post. I
had thought the argument was that the data were not being written to
disk twice (assuming no separate device for the ZIL), but it was
just explaining to me that the data are
Yes, I was getting confused. Thanks to you (and everyone else) for clarifying.
Sync or async, I see the txg flushing to disk starve read IO.
Scott
--
This message posted from opensolaris.org
On Sep 4, 2009, at 12:23 PM, Len Zaifman wrote:
We have groups generating terabytes a day of image data from lab
instruments and saving them to an X4500.
Wouldn't it be easier to compress at the application, or between the
application and the archiving file system?
We have tried lzjb: com
On Sep 4, 2009, at 4:33 PM, Scott Meilicke wrote:
Yes, I was getting confused. Thanks to you (and everyone else) for
clarifying.
Sync or async, I see the txg flushing to disk starve read IO.
Well, try the kernel setting and see how it helps.
Honestly though, if you can say it's all sync wr
On Fri, 2009-09-04 at 13:41 -0700, Richard Elling wrote:
> On Sep 4, 2009, at 12:23 PM, Len Zaifman wrote:
>
> > We have groups generating terabytes a day of image data from lab
> > instruments and saving them to an X4500.
>
> Wouldn't it be easier to compress at the application, or between th
I only see the blocking while load testing, not during regular usage, so I am
not so worried. I will try the kernel settings to see if that helps if/when I
see the issue in production.
For what it is worth, here is the pattern I see when load testing NFS (iometer,
60% random, 65% read, 8k chunks
On Fri, Sep 04, 2009 at 01:41:15PM -0700, Richard Elling wrote:
> On Sep 4, 2009, at 12:23 PM, Len Zaifman wrote:
> >We have groups generating terabytes a day of image data from lab
> >instruments and saving them to an X4500.
>
> Wouldn't it be easier to compress at the application, or between
On 09/04/09 10:17, dick hoogendijk wrote:
Lori Alt wrote:
The -u option to zfs recv (which was just added to support flash
archive installs, but it's useful for other reasons too) suppresses
all mounts of the received file systems. So you can mount them
yourself afterward in whatever order is
On Fri, 4 Sep 2009, Louis-Frédéric Feuillette wrote:
JPEG2000 uses arithmetic encoding to do the final compression step.
Arithmetic encoding has a higher compression rate (in general) than
gzip-9, lzjb or others. There is an open-source implementation of
jpeg2000 called jasper[1]. Jasper is the
On Fri, 4 Sep 2009, Scott Meilicke wrote:
I only see the blocking while load testing, not during regular
usage, so I am not so worried. I will try the kernel settings to see
if that helps if/when I see the issue in production.
The flipside of the "pulsing" is that the deferred writes diminish
On Sep 4, 2009, at 6:33 PM, Bob Friesenhahn wrote:
On Fri, 4 Sep 2009, Scott Meilicke wrote:
I only see the blocking while load testing, not during regular
usage, so I am not so worried. I will try the kernel settings to
see if that helps if/when I see the issue in production.
The flips
On Sep 4, 2009, at 5:25 PM, Scott Meilicke wrote:
I only see the blocking while load testing, not during regular
usage, so I am not so worried. I will try the kernel settings to see
if that helps if/when I see the issue in production.
For what it is worth, here is the pattern I see when l
On Fri, Sep 4, 2009 at 1:12 PM, Nigel Smith wrote:
> Let us know if you can get the port multipliers working..
>
> But remember, there is a problem with ZFS raidz in that release, so be
> careful:
I saw that, so I think I'll be waiting until snv_124 to update. The
system that I'm thinking of usin
On Fri, 4 Sep 2009, Ross Walker wrote:
I guess one can find a silver lining in any grey cloud, but for myself I'd
just rather see a more linear approach to writes. Anyway I have never seen
any reads happen during these write flushes.
I have yet to see a read happen during the write flush either
On Sep 4, 2009, at 8:59 PM, Bob Friesenhahn wrote:
On Fri, 4 Sep 2009, Ross Walker wrote:
I guess one can find a silver lining in any grey cloud, but for
myself I'd just rather see a more linear approach to writes. Anyway
I have never seen any reads happen during these write flushes.
I
On Fri, 4 Sep 2009, Ross Walker wrote:
I have yet to see a read happen during the write flush either. That
impacts my application since it needs to read in order to proceed, and it
does a similar amount of writes as it does reads.
The ARC makes it hard to tell if they are satisfied from cache
On Sep 4, 2009, at 21:44, Ross Walker wrote:
Though I have only heard good comments from my ESX admins since
moving the VMs off iSCSI and on to ZFS over NFS, so it can't be that
bad.
What's your pool configuration? Striped mirrors? RAID-Z with SSDs?
Other?
On Sep 4, 2009, at 10:02 PM, David Magda wrote:
On Sep 4, 2009, at 21:44, Ross Walker wrote:
Though I have only heard good comments from my ESX admins since
moving the VMs off iSCSI and on to ZFS over NFS, so it can't be
that bad.
What's your pool configuration? Striped mirrors? RAID-Z w
On Thu, Sep 3, 2009 at 4:57 AM, Karel Gardas wrote:
> Hello,
> your "(open)solaris for ECC support (which seems to have been dropped from
> 2009.06)" is a misunderstanding. OS 2009.06 also supports ECC as 2005 did. Just
> install it and use my updated ecccheck.pl script to get informed about
> errors
I've been sending daily incrementals off-site for a while now, but
recently they failed so I had to send an incremental covering a number
of snapshots. I expected the incremental to be approximately the sum
of the snapshots, but it seems to be considerably larger and still
going. The source machine
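One way to compare (dataset and snapshot names made up): look at the
per-snapshot space accounting on the source, keeping in mind that a snapshot's
"used" only counts space unique to that snapshot, and measure the stream itself:
  zfs list -r -t snapshot -o name,used,referenced tank/data
  zfs send -I tank/data@2009-08-01 tank/data@2009-09-04 | wc -c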
Tim Cook (cook.ms) writes:
>
> What's the point of arguing what the back-end can do anyways? This is bulk
> data storage. Their MAX input is ~100MB/sec. The backend can more than
> satisfy that. Who cares at that point whether it can push 500MB/s or
> 5000MB/s? It's not a database processing tran
On Sat, Sep 5, 2009 at 12:30 AM, Marc Bevand wrote:
> Tim Cook (cook.ms) writes:
> >
> > What's the point of arguing what the back-end can do anyways? This is bulk
> > data storage. Their MAX input is ~100MB/sec. The backend can more than
> > satisfy that. Who cares at that point whether it can