> "r" == Ross <[EMAIL PROTECTED]> writes:
rs> I don't think it likes it if the iscsi targets aren't
rs> available during boot.
from my cheatsheet:
-8<-
ok boot -m milestone=none
[boots. enter root password for maintenance.]
bash-3.00# /sbin/mount -o remount,rw / [<-- otherw
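(Not part of the cheatsheet above, just roughly what I do next from that maintenance shell so the box can finish booting without the targets -- the initiator FMRI is from memory and varies between builds, so check with svcs first:)
bash-3.00# svcs -a | grep -i iscsi                  [<-- find the initiator service FMRI]
bash-3.00# svcadm disable network/iscsi_initiator   [<-- name from memory; use whatever svcs shows]
bash-3.00# svcadm milestone all                     [<-- carry on booting to the normal milestone]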
Yeah, thanks Maurice, I just saw that one this afternoon. I guess you
can't reboot with iscsi full stop... o_0
And I've seen the iscsi bug before (I was just too lazy to look it up
lol), I've been complaining about that since February.
In fact it's been a bad week for iscsi here, I've managed to
>2. With iscsi, you can't reboot with sendtargets enabled; static
>discovery still seems to be the order of the day.
I'm seeing this problem with static discovery:
http://bugs.opensolaris.org/view_bug.do?bug_id=6775008.
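For anyone who hasn't switched yet, moving from SendTargets to static discovery looks roughly like this (the IQN and address below are made-up placeholders):
# iscsiadm modify discovery --sendtargets disable
# iscsiadm modify discovery --static enable
# iscsiadm add static-config iqn.1986-03.com.sun:02:mytarget,192.168.10.1:3260
# iscsiadm list target                       [<-- confirm the LUNs are still visible]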
>4. iSCSI still has a 3 minute timeout, during which time your pool
>wil
Ok, I've done some more testing today and I almost don't know where to start.
I'll begin with the good news for Miles :)
- Rebooting doesn't appear to cause ZFS to lose the resilver status (but see
1. below)
- Resilvering appears to work fine, once complete I never saw any checksum
errors when
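(For reference, the commands I used to watch that kind of test are nothing exotic -- 'tank' and the device names are placeholders:)
# zpool offline tank c2t1d0     [<-- take one half of the mirror away]
# zpool online tank c2t1d0      [<-- bring it back and let it resilver]
# zpool status -v tank          [<-- resilver progress and per-device checksum counts]
# zpool scrub tank              [<-- once the resilver finishes, a scrub confirms no latent checksum errors]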
On 2-Dec-08, at 3:35 PM, Miles Nordin wrote:
>> "r" == Ross <[EMAIL PROTECTED]> writes:
>
> r> style before I got half way through your post :) [...status
> r> problems...] could be a case of oversimplifying things.
> ...
> And yes, this is a religious argument. Just because it sp
> "r" == Ross <[EMAIL PROTECTED]> writes:
r> style before I got half way through your post :) [...status
r> problems...] could be a case of oversimplifying things.
yeah I was a bit inappropriate, but my frustration comes from the
(partly paranoid) imagining of how the idea ``we nee
Hi Miles,
It's probably a bad sign that although that post came through as anonymous in
my e-mail, I recognised your style before I got half way through your post :)
I agree, the zpool status being out of date is weird, I'll dig out the bug
number for that at some point as I'm sure I've mention
> "rs" == Ross Smith <[EMAIL PROTECTED]> writes:
rs> 4. zpool status still reports out of date information.
I know people are going to skim this message and not hear this.
They'll say ``well of course zpool status says ONLINE while the pool
is hung. ZFS is patiently waiting. It doesn't
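A quick way to see the mismatch for yourself while a device is unresponsive (pool and device names are placeholders):
# zpool status -v tank          [<-- may still report everything ONLINE while I/O is hung]
# iostat -xn 5                  [<-- the stuck device shows outstanding I/O and no throughput]
# fmdump -eV | tail             [<-- FMA error reports often name the ailing device before zpool status notices]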
Hi Richard,
Thanks, I'll give that a try. I think I just had a kernel dump while
trying to boot this system back up though, I don't think it likes it
if the iscsi targets aren't available during boot. Again, that rings
a bell, so I'll go see if that's another known bug.
Changing that setting on
Incidentally, while I've reported this again as a RFE, I still haven't seen a
CR number for this. Could somebody from Sun check whether it's been filed, please?
thanks,
Ross
Hey folks,
I've just followed up on this, testing iSCSI with a raided pool, and
it still appears to be struggling when a device goes offline.
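The test itself is easy to reproduce -- the pool and device names below are placeholders for the iSCSI LUNs, which just show up as ordinary cXtYdZ devices:
# zpool create itest raidz c2t1d0 c3t1d0 c4t1d0
# ptime dd if=/dev/zero of=/itest/tf bs=128k count=10000   [<-- time a write while one target is unplugged at the far end]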
>>> I don't see how this could work except for mirrored pools. Would that
>>> carry enough market to be worthwhile?
>>> -- richard
>>>
>>
>> I have to adm
Ross Smith wrote:
> On Fri, Nov 28, 2008 at 5:05 AM, Richard Elling <[EMAIL PROTECTED]> wrote:
>
>> Ross wrote:
>>
>>> Well, you're not alone in wanting to use ZFS and iSCSI like that, and in
>>> fact my change request suggested that this is exactly one of the things that
>>> could be addre
On Fri, Nov 28, 2008 at 5:05 AM, Richard Elling <[EMAIL PROTECTED]> wrote:
> Ross wrote:
>>
>> Well, you're not alone in wanting to use ZFS and iSCSI like that, and in
>> fact my change request suggested that this is exactly one of the things that
>> could be addressed:
>>
>> "The idea is really a
Ross wrote:
> Well, you're not alone in wanting to use ZFS and iSCSI like that, and in fact
> my change request suggested that this is exactly one of the things that could
> be addressed:
>
> "The idea is really a two stage RFE, since just the first part would have
> benefits. The key is to imp
> Well, you're not alone in wanting to use ZFS and
> iSCSI like that, and in fact my change request
> suggested that this is exactly one of the things that
> could be addressed:
Thank you! Yes, this was also to tell you that you are not alone :-)
I agree completely with you on your technical poi
Well, you're not alone in wanting to use ZFS and iSCSI like that, and in fact
my change request suggested that this is exactly one of the things that could
be addressed:
"The idea is really a two stage RFE, since just the first part would have
benefits. The key is to improve ZFS availability,
Hello,
Thank you for this very interesting thread!
I want to confirm that Synchronous Distributed Storage is the main goal when
using ZFS!
The target architecture is 1 local drive and 2 (or more) remote iSCSI targets,
with ZFS being the iSCSI initiator.
The system is designed/cut so that local dis
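In zpool terms that architecture is just a three-way mirror, something like this (device names are placeholders; c0t0d0 local, the other two being the iSCSI LUNs):
# zpool create dpool mirror c0t0d0 c2t1d0 c3t1d0
# zpool status dpool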
Thanks James, I've e-mailed Alan and submitted this one again.
On Thu, 27 Nov 2008 04:33:54 -0800 (PST)
Ross <[EMAIL PROTECTED]> wrote:
> Hmm... I logged this CR ages ago, but now that I've come to find it in
> the bug tracker, I can't see it anywhere.
>
> I actually logged three CR's back to back, the first appears to have
> been created ok, but two have just di
Hmm... I logged this CR ages ago, but now that I've come to find it in the bug
tracker, I can't see it anywhere.
I actually logged three CR's back to back, the first appears to have been
created ok, but two have just disappeared. The one I created ok is:
http://bugs.opensolaris.org/view_bug.do?bug
Ross wrote:
> Hey folks,
>
> Well, there haven't been any more comments knocking holes in this idea, so
> I'm wondering now if I should log this as an RFE?
>
go for it!
> Is this something others would find useful?
>
Yes. But remember that this has a very limited scope. Basically
it
Hey folks,
Well, there haven't been any more comments knocking holes in this idea, so I'm
wondering now if I should log this as an RFE?
Is this something others would find useful?
Ross
Thinking about it, we could make use of this too. The ability to add a
remote iSCSI mirror to any pool without sacrificing local performance
could be a huge benefit.
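In zpool terms that's just attaching the remote LUN as an extra mirror half (names are placeholders) -- the open question in this thread being whether reads and write acknowledgements can then be kept local:
# zpool attach tank c0t0d0 c5t2d0   [<-- c5t2d0 being the remote iSCSI LUN]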
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> CC: [EMAIL PROTECTED]; zfs-discuss@opensolaris.org
> Subject: Re: Availabilit
Ross Smith wrote:
> Triple mirroring you say? That'd be me then :D
>
> The reason I really want to get ZFS timeouts sorted is that our long
> term goal is to mirror that over two servers too, giving us a pool
> mirrored across two servers, each of which is actually a zfs iscsi
> volume hosted o
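A rough sketch of the backend half of that setup, assuming the old shareiscsi property (pre-COMSTAR) -- sizes and names are placeholders:
# zfs create -V 100G backend/vol0
# zfs set shareiscsi=on backend/vol0    [<-- the old-style way of turning a zvol into an iSCSI target]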
On Thu, Aug 28, 2008 at 11:21 PM, Ian Collins <[EMAIL PROTECTED]> wrote:
> Miles Nordin writes:
>
> > suggested that unlike the SVM feature it should be automatic, because
> > by so being it becomes useful as an availability tool rather than just
> > performance optimisation.
> >
> So on a server
Miles Nordin writes:
>> "bf" == Bob Friesenhahn <[EMAIL PROTECTED]> writes:
>
> bf> You are saying that I can't split my mirrors between a local
> bf> disk in Dallas and a remote disk in New York accessed via
> bf> iSCSI?
>
> nope, you've misread. I'm saying reads should go to
Eric Schrock writes:
>
> A better option would be to not use this to perform FMA diagnosis, but
> instead work into the mirror child selection code. This has already
> been alluded to before, but it would be cool to keep track of latency
> over time, and use this to both a) prefer one drive over
lt, with the optional setting being to allow the pool to
continue accepting writes while the pool is in a non-redundant state.
Ross
> Date: Sat, 30 Aug 2008 10:59:19 -0500
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> CC: zfs-discuss@opensolaris.org
> Subject: Re: [zfs-disc
On Sat, 30 Aug 2008, Ross wrote:
> while the problem is diagnosed. - With that said, could the write
> timeout default to on when you have a slog device? After all, the
> data is safely committed to the slog, and should remain there until
> it's written to all devices. Bob, you seemed the most
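As an aside, a dedicated slog is just another vdev type as far as the pool is concerned (device name is a placeholder):
# zpool add tank log c6t0d0     [<-- synchronous writes commit to the slog first, then get flushed to the main vdevs]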
Wow, some great comments on here now, even a few people agreeing with me which
is nice :D
I'll happily admit I don't have the in-depth understanding of storage many of
you guys have, but since the idea doesn't seem pie-in-the-sky crazy, I'm going
to try to write up all my current thoughts on ho
Miles Nordin wrote:
>> "re" == Richard Elling <[EMAIL PROTECTED]> writes:
>>
>
> re> if you use Ethernet switches in the interconnect, you need to
> re> disable STP on the ports used for interconnects or risk
> re> unnecessary cluster reconfigurations.
>
> RSTP/802.
> "re" == Richard Elling <[EMAIL PROTECTED]> writes:
re> if you use Ethernet switches in the interconnect, you need to
re> disable STP on the ports used for interconnects or risk
re> unnecessary cluster reconfigurations.
RSTP/802.1w plus setting the ports connected to Solaris as `
On Fri, 29 Aug 2008, Miles Nordin wrote:
>
> I guess I'm changing my story slightly. I *would* want ZFS to collect
> drive performance statistics and report them to FMA, but I wouldn't
Your email *totally* blew my limited buffer size, but this little bit
remained for me to look at. It left me w
> "es" == Eric Schrock <[EMAIL PROTECTED]> writes:
es> The main problem with exposing tunables like this is that they
es> have a direct correlation to service actions, and
es> mis-diagnosing failures costs everybody (admin, companies,
es> Sun, etc) lots of time and money. Once
Nicolas Williams wrote:
> On Thu, Aug 28, 2008 at 11:29:21AM -0500, Bob Friesenhahn wrote:
>
>> Which of these do you prefer?
>>
>>o System waits substantial time for devices to (possibly) recover in
>> order to ensure that subsequently written data has the least
>> chance of being
On Thu, Aug 28, 2008 at 01:05:54PM -0700, Eric Schrock wrote:
> As others have mentioned, things get more difficult with writes. If I
> issue a write to both halves of a mirror, should I return when the first
> one completes, or when both complete? One possibility is to expose this
> as a tunable
On Thu, Aug 28, 2008 at 11:29:21AM -0500, Bob Friesenhahn wrote:
> Which of these do you prefer?
>
>o System waits substantial time for devices to (possibly) recover in
> order to ensure that subsequently written data has the least
> chance of being lost.
>
>o System immediately
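Worth noting that the failmode pool property (added around the build 77 error-handling work, if I remember right) already covers part of the spectrum quoted above for the whole-pool-failure case -- 'tank' is a placeholder:
# zpool get failmode tank
# zpool set failmode=continue tank   [<-- wait (the default), continue or panic when the pool loses access to its devices]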
Bill Sommerfeld wrote:
> On Thu, 2008-08-28 at 13:05 -0700, Eric Schrock wrote:
>
>> A better option would be to not use this to perform FMA diagnosis, but
>> instead work into the mirror child selection code. This has already
>> been alluded to before, but it would be cool to keep track of lat
On Thu, 2008-08-28 at 13:05 -0700, Eric Schrock wrote:
> A better option would be to not use this to perform FMA diagnosis, but
> instead work into the mirror child selection code. This has already
> been alluded to before, but it would be cool to keep track of latency
> over time, and use this to
On Thu, 28 Aug 2008, Miles Nordin wrote:
> None of the decisions I described its making based on performance
> statistics are ``haywire''---I said it should funnel reads to the
> faster side of the mirror, and do this really quickly and
> unconservatively. What's your issue with that?
From what
On Thu, Aug 28, 2008 at 08:34:24PM +0100, Ross Smith wrote:
>
> Personally, if a SATA disk wasn't responding to any requests after 2
> seconds I really don't care if an error has been detected, as far as
> I'm concerned that disk is faulty.
Unless you have power management enabled, or there's a b
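One quick check for the power-management case (directive names from memory, so treat this as a sketch):
# grep -v '^#' /etc/power.conf     [<-- look for autopm / device-thresholds entries that spin disks down]
# pmconfig                         [<-- re-read power.conf after changing it]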
> "bf" == Bob Friesenhahn <[EMAIL PROTECTED]> writes:
bf> If the system or device is simply overwelmed with work, then
bf> you would not want the system to go haywire and make the
bf> problems much worse.
None of the decisions I described its making based on performance
statistics
> "es" == Eric Schrock <[EMAIL PROTECTED]> writes:
es> I don't think you understand how this works. Imagine two
es> I/Os, just with different sd timeouts and retry logic - that's
es> B_FAILFAST. It's quite simple, and independent of any
es> hardware implementation.
AIUI the
but feel it should have that same approach to management of its drives.
However, that said, I'll be more than willing to test the new
B_FAILFAST logic on iSCSI once it's released. Just let me know when
it's out.
Ross
> Date: Thu, 28 Aug 2008 11:29:21 -0500
> From: [EMAIL
On Thu, 28 Aug 2008, Miles Nordin wrote:
>
> you're right in terms of fixed timeouts, but there's no reason it
> can't compare the performance of redundant data sources, and if one
> vdev performs an order of magnitude slower than another set of vdevs
> with sufficient redundancy, stop issuing read
On Thu, Aug 28, 2008 at 02:17:08PM -0400, Miles Nordin wrote:
>
> you're right in terms of fixed timeouts, but there's no reason it
> can't compare the performance of redundant data sources, and if one
> vdev performs an order of magnitude slower than another set of vdevs
> with sufficient redunda
> "es" == Eric Schrock <[EMAIL PROTECTED]> writes:
es> Finally, imposing additional timeouts in ZFS is a bad idea.
es> [...] As such, it doesn't have the necessary context to know
es> what constitutes a reasonable timeout.
you're right in terms of fixed timeouts, but there's no re
Ross, thanks for the feedback. A couple points here -
A lot of work went into improving the error handling around build 77 of
Nevada. There are still problems today, but a number of the
complaints we've seen are on s10 software or older nevada builds that
didn't have these fixes. Anything from
On Thu, 28 Aug 2008, Ross wrote:
>
> I believe ZFS should apply the same tough standards to pool
> availability as it does to data integrity. A bad checksum makes ZFS
> read the data from elsewhere; why shouldn't a timeout do the same
> thing?
A problem is that for some devices, a five minute
Since somebody else has just posted about their entire system locking up when
pulling a drive, I thought I'd raise this for discussion.
I think Ralf made a very good point in the other thread. ZFS can guarantee
data integrity; what it can't do is guarantee data availability. The problem
is, t