On 2007.02.04 02:13:51 +0100, Björn Steinbrink wrote:
> On 2007.02.02 23:48:14 -0600, Robert Hancock wrote:
> > There's a patch in -mm (sata_nv-use-adma-for-nodata-commands.patch)
> > which should hopefully avoid this problem for the cache flush commands,
> > at least - can you try that one out?
On 2007.02.02 23:48:14 -0600, Robert Hancock wrote:
> Björn Steinbrink wrote:
> >On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote:
> >>On 2007.01.23 17:18:43 -0600, Robert Hancock wrote:
> >>>Larry Walton wrote:
> The last patch (sata_nv-force-int-dev-in-interrupt.patch)
> seems to hav
Björn Steinbrink wrote:
On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote:
On 2007.01.23 17:18:43 -0600, Robert Hancock wrote:
Larry Walton wrote:
The last patch (sata_nv-force-int-dev-in-interrupt.patch)
seems to have fixed the problem. Much appreciated,
thank you. I'd consider it a must h
On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote:
> On 2007.01.23 17:18:43 -0600, Robert Hancock wrote:
> > Larry Walton wrote:
> > >The last patch (sata_nv-force-int-dev-in-interrupt.patch)
> > >seems to have fixed the problem. Much appreciated,
> > >thank you. I'd consider it a must have in
On 2007.01.24 09:24:00 +0100, Ian Kumlien wrote:
> On Tue, 2007-01-23 at 17:18 -0600, Robert Hancock wrote:
> > Larry Walton wrote:
> > > The last patch (sata_nv-force-int-dev-in-interrupt.patch)
> > > seems to have fixed the problem. Much appreciated,
> > > thank you. I'd consider it a must have
On Tue, 2007-01-23 at 17:18 -0600, Robert Hancock wrote:
> Larry Walton wrote:
> > The last patch (sata_nv-force-int-dev-in-interrupt.patch)
> > seems to have fixed the problem. Much appreciated,
> > thank you. I'd consider it a must have in 2.6.20.
>
> Can any of the rest of you that have been s
On 2007.01.23 17:18:43 -0600, Robert Hancock wrote:
> Larry Walton wrote:
> >The last patch (sata_nv-force-int-dev-in-interrupt.patch)
> >seems to have fixed the problem. Much appreciated,
> >thank you. I'd consider it a must have in 2.6.20.
>
> Can any of the rest of you that have been seeing th
Larry Walton wrote:
The last patch (sata_nv-force-int-dev-in-interrupt.patch)
seems to have fixed the problem. Much appreciated,
thank you. I'd consider it a must have in 2.6.20.
Can any of the rest of you that have been seeing this problem also
confirm that this fixes it?
--
Robert Hancock
The last patch (sata_nv-force-int-dev-in-interrupt.patch)
seems to have fixed the problem. Much appreciated,
thank you. I'd consider it a must have in 2.6.20.
--
*--* Mail: [EMAIL PROTECTED]
*--* Voice: 206.892.6269
*--* Cell: 206.225.0154
*--* HTTP://real.com
--
Björn Steinbrink wrote:
Hm, I don't think it is unhappy about looking at NV_INT_STATUS_CK804.
I'm running 2.6.20-rc5 with the INT_DEV check removed for 8 hours now
without a single problem and that should still look at
NV_INT_STATUS_CK804, right?
I just noticed that my last email might not have b
On 2007.01.22 19:24:22 -0600, Robert Hancock wrote:
> Björn Steinbrink wrote:
> >>>Running a kernel with the return statement replaced by a line that prints
> >>>the irq_stat instead.
> >>>
> >>>Currently I'm seeing lots of 0x10 on ata1 and 0x0 on ata2.
> >>40 minutes stress test now and no exceptio
Alistair John Strachan wrote:
On Tuesday 23 January 2007 01:24, Robert Hancock wrote:
As a final aside, this is another case where the hardware docs for this
controller would really be useful, in order to know whether we are
actually supposed to be reading that register in ADMA mode or not. I
se
On Tuesday 23 January 2007 01:24, Robert Hancock wrote:
> As a final aside, this is another case where the hardware docs for this
> controller would really be useful, in order to know whether we are
> actually supposed to be reading that register in ADMA mode or not. I
> sent a query to Allen Marti
Björn Steinbrink wrote:
Running a kernel with the return statement replaced by a line that prints
the irq_stat instead.
Currently I'm seeing lots of 0x10 on ata1 and 0x0 on ata2.
40 minutes stress test now and no exception yet. What's interesting is
that ata1 saw exactly one interrupt with irq_s
On 1/15/07, Jeff Garzik <[EMAIL PROTECTED]> wrote:
Jens Axboe wrote:
> On Mon, Jan 15 2007, Jeff Garzik wrote:
>> Jens Axboe wrote:
>>> I'd be surprised if the device would not obey the 7 second timeout rule
>>> that seems to be set in stone and not allow more dirty in-drive cache
>>> than it cou
On 2007.01.22 17:57:08 +0100, Björn Steinbrink wrote:
> On 2007.01.22 17:12:40 +0100, Björn Steinbrink wrote:
> > On 2007.01.21 18:17:01 -0600, Robert Hancock wrote:
> > > Hmm, another miss, apparently.. Has anyone tried removing these lines
> > > from nv_host_intr in 2.6.20-rc5 sata_nv.c and see
On 2007.01.22 17:12:40 +0100, Björn Steinbrink wrote:
> On 2007.01.21 18:17:01 -0600, Robert Hancock wrote:
> > Björn Steinbrink wrote:
> > >On 2007.01.21 13:58:01 -0600, Robert Hancock wrote:
> > >>Björn Steinbrink wrote:
> > >>>All kernels were bad using that approach. So back to square 1. :/
> >
On 2007.01.21 18:17:01 -0600, Robert Hancock wrote:
> Björn Steinbrink wrote:
> >On 2007.01.21 13:58:01 -0600, Robert Hancock wrote:
> >>Björn Steinbrink wrote:
> >>>All kernels were bad using that approach. So back to square 1. :/
> >>>
> >>>Björn
> >>>
> >>OK guys, here's a new patch to try again
On Monday, 22. January 2007 03:39, Tejun Heo wrote:
> Hello,
>
> Chr wrote:
> > Ok, you won't believe this... I opened my case and rewired my drives...
> > And guess what, my second (aka the "good") HDD is now failing!
> > I guess, my mainboard has a (but maybe two, or three :( ) "bad"
> > sata
Hello,
Chr wrote:
Ok, you won't believe this... I opened my case and rewired my drives...
And guess what, my second (aka the "good") HDD is now failing!
I guess, my mainboard has a (but maybe two, or three :( ) "bad" sata-port(s)!
Or you have a power-related problem. Try to rewire the power
Björn Steinbrink wrote:
On 2007.01.21 13:58:01 -0600, Robert Hancock wrote:
Björn Steinbrink wrote:
All kernels were bad using that approach. So back to square 1. :/
Björn
OK guys, here's a new patch to try against 2.6.20-rc5:
Right now when switching between ADMA mode and legacy mode (i.e.
Björn Steinbrink wrote:
On 2007.01.21 23:08:11 +0100, Björn Steinbrink wrote:
On 2007.01.21 13:58:01 -0600, Robert Hancock wrote:
Björn Steinbrink wrote:
All kernels were bad using that approach. So back to square 1. :/
Björn
OK guys, here's a new patch to try against 2.6.20-rc5:
Right now
On 2007.01.21 13:58:01 -0600, Robert Hancock wrote:
> Björn Steinbrink wrote:
> >All kernels were bad using that approach. So back to square 1. :/
> >
> >Björn
> >
>
> OK guys, here's a new patch to try against 2.6.20-rc5:
>
> Right now when switching between ADMA mode and legacy mode (i.e. when
On 2007.01.21 23:08:11 +0100, Björn Steinbrink wrote:
> On 2007.01.21 13:58:01 -0600, Robert Hancock wrote:
> > Björn Steinbrink wrote:
> > >All kernels were bad using that approach. So back to square 1. :/
> > >
> > >Björn
> > >
> >
> > OK guys, here's a new patch to try against 2.6.20-rc5:
> >
On Sunday, 21. January 2007 19:01, Björn Steinbrink wrote:
> On 2007.01.21 18:34:40 +0100, Chr wrote:
>
> I run those two in parallel:
> while /bin/true; do ls -lR / > /dev/null 2>&1; done
> while /bin/true; do echo 255 > /proc/sys/vm/drop_caches; sleep 1; done
>
> Not sure if running them in paral
Björn Steinbrink wrote:
All kernels were bad using that approach. So back to square 1. :/
Björn
OK guys, here's a new patch to try against 2.6.20-rc5:
Right now when switching between ADMA mode and legacy mode (i.e. when
going from doing normal DMA reads/writes to doing a FLUSH CACHE) we ju
On 2007.01.21 09:36:18 +0100, Björn Steinbrink wrote:
> On 2007.01.21 00:39:20 -0600, Robert Hancock wrote:
> > Björn Steinbrink wrote:
> > >On 2007.01.20 22:34:27 -0500, Jeff Garzik wrote:
> > >>Robert Hancock wrote:
> > >>>change in 2.6.20-rc is either causing or triggering this problem. It
> >
On 2007.01.21 18:34:40 +0100, Chr wrote:
> On Sunday, 21. January 2007 09:36, Björn Steinbrink wrote:
> > On 2007.01.21 00:39:20 -0600, Robert Hancock wrote:
> >
> > Ah, right... sata_nv.c of course interacts with the outside world, d'oh!
> >
> > Up to now, I only got bad kernels, latest tested bei
On Sunday, 21. January 2007 09:36, Björn Steinbrink wrote:
> On 2007.01.21 00:39:20 -0600, Robert Hancock wrote:
>
> Ah, right... sata_nv.c of course interacts with the outside world, d'oh!
>
> Up to now, I only got bad kernels, latest tested being:
> 94fcda1f8ab5e0cacc381c5ca1cc9aa6ad523576
>
> Wh
On 2007.01.21 00:39:20 -0600, Robert Hancock wrote:
> Björn Steinbrink wrote:
> >On 2007.01.20 22:34:27 -0500, Jeff Garzik wrote:
> >>Robert Hancock wrote:
> >>>change in 2.6.20-rc is either causing or triggering this problem. It
> >>>would be useful if you could try git bisect between 2.6.19 and
Björn Steinbrink wrote:
On 2007.01.20 22:34:27 -0500, Jeff Garzik wrote:
Robert Hancock wrote:
change in 2.6.20-rc is either causing or triggering this problem. It
would be useful if you could try git bisect between 2.6.19 and
2.6.20-rc5, keeping the latest sata_nv.c each time, and see if that
On 2007.01.20 22:34:27 -0500, Jeff Garzik wrote:
> Robert Hancock wrote:
> >change in 2.6.20-rc is either causing or triggering this problem. It
> >would be useful if you could try git bisect between 2.6.19 and
> >2.6.20-rc5, keeping the latest sata_nv.c each time, and see if that
>
>
> Yes, '
Robert Hancock wrote:
change in 2.6.20-rc is either causing or triggering this problem. It
would be useful if you could try git bisect between 2.6.19 and
2.6.20-rc5, keeping the latest sata_nv.c each time, and see if that
Yes, 'git bisect' would be the next step in figuring out this puzzle.
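For anyone following along, the bisect suggested above (walking 2.6.19..2.6.20-rc5 while always building the newest sata_nv.c) could look roughly like the sketch below. The tag names v2.6.19 and v2.6.20-rc5, the driver path drivers/ata/sata_nv.c, and the DRY_RUN preview switch are assumptions for illustration and are not spelled out in this thread, so adjust them to your tree. The newer driver may also fail to build on some intermediate commits.

```shell
#!/bin/sh
# Sketch only: bisect between a good and a bad kernel while overlaying the
# newest sata_nv.c at every step.
# Assumed, not taken from the thread: the tag names, the driver path, and
# the DRY_RUN switch (DRY_RUN=1 prints commands instead of running them).
GOOD=${GOOD:-v2.6.19}
BAD=${BAD:-v2.6.20-rc5}
DRIVER=${DRIVER:-drivers/ata/sata_nv.c}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start the bisection and tell git which end is bad and which is good.
bisect_begin() {
    run git bisect start
    run git bisect bad "$BAD"
    run git bisect good "$GOOD"
}

# Overlay the newest driver onto whatever commit bisect has checked out.
overlay_driver() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "git show $BAD:$DRIVER > $DRIVER"
    else
        git show "$BAD:$DRIVER" > "$DRIVER"
    fi
}

# After building, booting and stress-testing: restore the overlaid file so
# the tree is clean, then record the verdict ("good" or "bad").
bisect_mark() {
    run git checkout -- "$DRIVER"
    run git bisect "$1"
}
```

Typical use would be bisect_begin once, then overlay_driver, build, boot, test, and bisect_mark good or bisect_mark bad for each step, finishing with git bisect reset. Restoring the file before marking keeps git from refusing to check out the next commit because of local changes.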
Chr wrote:
Could you (or anyone else) test what happens if you take the 2.6.20-rc5
version of sata_nv.c and try it on 2.6.19? That would tell us whether
it's this change or whether it's something else (i.e. in libata core).
Ok, did that! (got a fresh 2.6.19 tar ball, and used 2.6.20-rc5' sata_n
On Saturday, 20. January 2007 20:59, you wrote:
> Ian Kumlien wrote:
> > Hi,
> >
> > I went from 2.6.19+sata_nv-adma-ncq-v7.patch, with no problems and ADMA
> > enabled, to 2.6.20-rc5, which gave me problems almost instantly.
> >
> > I just thought that it might be interesting to know that it DID
On Sat, 2007-01-20 at 21:43 +, Alistair John Strachan wrote:
> On Saturday 20 January 2007 19:59, Robert Hancock wrote:
> > Ian Kumlien wrote:
> > > Hi,
> > >
> > > I went from 2.6.19+sata_nv-adma-ncq-v7.patch, with no problems and ADMA
> > > enabled, to 2.6.20-rc5, which gave me problems almo
On Saturday 20 January 2007 19:59, Robert Hancock wrote:
> Ian Kumlien wrote:
> > Hi,
> >
> > I went from 2.6.19+sata_nv-adma-ncq-v7.patch, with no problems and ADMA
> > enabled, to 2.6.20-rc5, which gave me problems almost instantly.
> >
> > I just thought that it might be interesting to know tha
Ian Kumlien wrote:
Hi,
I went from 2.6.19+sata_nv-adma-ncq-v7.patch, with no problems and ADMA
enabled, to 2.6.20-rc5, which gave me problems almost instantly.
I just thought that it might be interesting to know that it DID work
nicely.
CC since I'm not on the ml
(I'm ccing more of the peo
On Saturday, 20. January 2007 03:41, Robert Hancock wrote:
> Alistair John Strachan wrote:
> > On Tuesday 16 January 2007 01:53, Jeff Garzik wrote:
> >> Robert Hancock wrote:
> >>> I'll try your stress test when I get a chance, but I doubt I'll run
> >>> into the same problem and I haven't seen any
Hi,
I went from 2.6.19+sata_nv-adma-ncq-v7.patch, with no problems and ADMA
enabled, to 2.6.20-rc5, which gave me problems almost instantly.
I just thought that it might be interesting to know that it DID work
nicely.
CC since I'm not on the ml
--
Ian Kumlien -- http://pomac.netswarm.net
s
On 2007.01.19 20:41:36 -0600, Robert Hancock wrote:
> Alistair John Strachan wrote:
> >On Tuesday 16 January 2007 01:53, Jeff Garzik wrote:
> >>Robert Hancock wrote:
> >>>I'll try your stress test when I get a chance, but I doubt I'll run into
> >>>the same problem and I haven't seen any similar re
On Saturday 20 January 2007 02:41, Robert Hancock wrote:
> By the way, I assume that you guys are using reiserfs or xfs, as it
> appears no other file systems issue flush commands automatically. I had
> to test this by "echo 1 > delete" on the SCSI disk in sysfs, as I am
> using ext3.
I'll give it
Alistair John Strachan wrote:
On Tuesday 16 January 2007 01:53, Jeff Garzik wrote:
Robert Hancock wrote:
I'll try your stress test when I get a chance, but I doubt I'll run into
the same problem and I haven't seen any similar reports. Perhaps it's
some kind of weird timing issue or incompatibil
On Friday, 19. January 2007 16:05, Alistair John Strachan wrote:
> On Tuesday 16 January 2007 01:53, Jeff Garzik wrote:
> > Robert Hancock wrote:
> > > I'll try your stress test when I get a chance, but I doubt I'll run
> > > into the same problem and I haven't seen any similar reports. Perhaps
> >
On Tuesday 16 January 2007 01:53, Jeff Garzik wrote:
> Robert Hancock wrote:
> > I'll try your stress test when I get a chance, but I doubt I'll run into
> > the same problem and I haven't seen any similar reports. Perhaps it's
> > some kind of weird timing issue or incompatibility between the
> >
On Tuesday 16 January 2007 00:34, Robert Hancock wrote:
> I'll try your stress test when I get a chance, but I doubt I'll run into
> the same problem and I haven't seen any similar reports. Perhaps it's
> some kind of weird timing issue or incompatibility between the
> controller and that drive whe
On 2007.01.18 18:09:50 -0600, Robert Hancock wrote:
> I heard from Larry Walton who was apparently seeing this problem as
> well. He tried my recent "sata_nv: cleanup ADMA error handling v2" patch
> and originally thought it fixed the problem, but it turned out to only
> make it happen less ofte
I heard from Larry Walton who was apparently seeing this problem as
well. He tried my recent "sata_nv: cleanup ADMA error handling v2" patch
and originally thought it fixed the problem, but it turned out to only
make it happen less often.
I wouldn't expect that patch to have an effect on this
Björn Steinbrink wrote:
It should be correct the way it is - that check is trying to prevent
ATAPI commands from using DMA until the slave_config function has been
called to set up the DMA parameters properly. When the
NV_ADMA_ATAPI_SETUP_COMPLETE flag is not set, this returns 1 which
disallow
Robert Hancock wrote:
I'll try your stress test when I get a chance, but I doubt I'll run into
the same problem and I haven't seen any similar reports. Perhaps it's
some kind of weird timing issue or incompatibility between the
controller and that drive when running in ADMA mode? I seem to reme
Robert Hancock wrote:
Note that the ATA-7 spec for FLUSH CACHE says that "This command may
take longer than 30 s to complete."
Yep...
Jeff
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at h
Jens Axboe wrote:
On Mon, Jan 15 2007, Jeff Garzik wrote:
Jens Axboe wrote:
I'd be surprised if the device would not obey the 7 second timeout rule
that seems to be set in stone and not allow more dirty in-drive cache
than it could flush out in approximately that time.
AFAIK Windows flush-cache
On 2007.01.15 18:34:43 -0600, Robert Hancock wrote:
> Björn Steinbrink wrote:
> >>My latest bisection attempt actually led to your sata_nv ADMA commit. [1]
> >>I've now backed out that patch from 2.6.20-rc5 and have my stress test
> >>running for 20 minutes now ("record" for a bad kernel surviving
Björn Steinbrink wrote:
My latest bisection attempt actually led to your sata_nv ADMA commit. [1]
I've now backed out that patch from 2.6.20-rc5 and have my stress test
running for 20 minutes now ("record" for a bad kernel surviving that
test is about 40 minutes IIRC). I'll keep it running for at
On Mon, Jan 15 2007, Jeff Garzik wrote:
> Jens Axboe wrote:
> >I'd be surprised if the device would not obey the 7 second timeout rule
> >that seems to be set in stone and not allow more dirty in-drive cache
> >than it could flush out in approximately that time.
>
> AFAIK Windows flush-cache timeo
On 2007.01.15 22:17:24 +0100, Björn Steinbrink wrote:
> On 2007.01.14 17:43:53 -0600, Robert Hancock wrote:
> > Björn Steinbrink wrote:
> > >Hi,
> > >
> > >with 2.6.20-rc{2,4,5} (no other tested yet) I see SATA exceptions quite
> > >often, with 2.6.19 there are no such exceptions. dmesg and lspci -
On 2007.01.14 17:43:53 -0600, Robert Hancock wrote:
> Björn Steinbrink wrote:
> >Hi,
> >
> >with 2.6.20-rc{2,4,5} (no other tested yet) I see SATA exceptions quite
> >often, with 2.6.19 there are no such exceptions. dmesg and lspci -v
> >output follows. In the meantime, I'll start bisecting.
>
> .
On 2007.01.15 07:48:23 +0100, Mikael Pettersson wrote:
> Notice how the problems started exactly at the point the
> "NVRM" NVIDIA module (whatever it is) was loaded ...
That's not the reason. Yeah, I should not have sent a log of a run with
the nvidia module loaded, but the same thing happens with
Jens Axboe wrote:
I'd be surprised if the device would not obey the 7 second timeout rule
that seems to be set in stone and not allow more dirty in-drive cache
than it could flush out in approximately that time.
AFAIK Windows flush-cache timeout is 30 seconds, not 7 as with other
commands...
Mikael Pettersson wrote:
Notice how the problems started exactly at the point the
"NVRM" NVIDIA module (whatever it is) was loaded ...
Yes, that's a bit suspicious...
Jeff
Björn Steinbrink writes:
> Hi,
>
> with 2.6.20-rc{2,4,5} (no other tested yet) I see SATA exceptions quite
> often, with 2.6.19 there are no such exceptions. dmesg and lspci -v
> output follows. In the meantime, I'll start bisecting.
>
> Thanks
> Björn
>
>
> Linux version 2.6.20-rc2
On Sun, Jan 14 2007, Robert Hancock wrote:
> Jeff Garzik wrote:
> >>Looks like all of these errors are from a FLUSH CACHE command and the
> >>drive is indicating that it is no longer busy, so presumably done.
> >>That's not a DMA-mapped command, so it wouldn't go through the ADMA
> >>machinery a
On 2007.01.15 01:34:48 +0100, Björn Steinbrink wrote:
> On 2007.01.14 19:22:51 -0500, Jeff Garzik wrote:
> > Robert Hancock wrote:
> > >Björn Steinbrink wrote:
> > >>Hi,
> > >>
> > >>with 2.6.20-rc{2,4,5} (no other tested yet) I see SATA exceptions quite
> > >>often, with 2.6.19 there are no such e
Jeff Garzik wrote:
Looks like all of these errors are from a FLUSH CACHE command and the
drive is indicating that it is no longer busy, so presumably done.
That's not a DMA-mapped command, so it wouldn't go through the ADMA
machinery and I wouldn't have expected this to be handled any
differen
On 2007.01.14 19:22:51 -0500, Jeff Garzik wrote:
> Robert Hancock wrote:
> >Björn Steinbrink wrote:
> >>Hi,
> >>
> >>with 2.6.20-rc{2,4,5} (no other tested yet) I see SATA exceptions quite
> >>often, with 2.6.19 there are no such exceptions. dmesg and lspci -v
> >>output follows. In the meantime, I
Robert Hancock wrote:
Björn Steinbrink wrote:
Hi,
with 2.6.20-rc{2,4,5} (no other tested yet) I see SATA exceptions quite
often, with 2.6.19 there are no such exceptions. dmesg and lspci -v
output follows. In the meantime, I'll start bisecting.
...
ata1.00: exception Emask 0x0 SAct 0x0 SErr
Björn Steinbrink wrote:
Hi,
with 2.6.20-rc{2,4,5} (no other tested yet) I see SATA exceptions quite
often, with 2.6.19 there are no such exceptions. dmesg and lspci -v
output follows. In the meantime, I'll start bisecting.
...
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen