From: Shyam Sundar S K
[ Upstream commit 186edbb510bd60e748f93975989ccba25ee99c50 ]
The current driver calls netif_carrier_off() late in the link teardown,
which can result in a netdev watchdog timeout.
Calling netif_carrier_off() immediately after netif_tx_stop_all_queues()
avoids the warning.
On Wed, 19 Aug 2020 10:29:09 -0700
Jesse Brandeburg wrote:
> What I don't understand in the stack trace is this:
> > > [ 107.654661] Call Trace:
> > > [ 107.657735]
> > > [ 107.663155] ? ftrace_graph_caller+0xc0/0xc0
> > > [ 107.667929] call_timer_fn+0x3b/0x1b0
> > > [ 107.672238] ? ne
racepoints stat_sleep, stat_iowait,
> > stat_blocked and stat_runtime require the kernel parameter
> > schedstats=enable or kernel.sched_schedstats=1
> > [ 88.139387] Scheduler tracepoints stat_sleep, stat_iowait,
> > stat_blocked and stat_runtime require the kernel parameter
hedstats=1
> [ 88.139387] Scheduler tracepoints stat_sleep, stat_iowait,
> stat_blocked and stat_runtime require the kernel parameter
> schedstats=enable or kernel.sched_schedstats=1
> [ 107.507991] [ cut here ]
> [ 107.513103] NETDEV WATCHDOG: eth0 (igb):
kernel.sched_schedstats=1
[ 107.507991] [ cut here ]
[ 107.513103] NETDEV WATCHDOG: eth0 (igb): transmit queue 2 timed out
[ 107.519973] WARNING: CPU: 1 PID: 331 at net/sched/sch_generic.c:442
dev_watchdog+0x4c7/0x4d0
[ 107.528907] Modules linked in: x86_pkg_temp_thermal
While running selftests bpf test_sysctl on stable rc 5.6 branch kernel
on arm64 hikey device, the following warning was noticed.
[ 1097.207013] NETDEV WATCHDOG: eth0 (asix): transmit queue 0 timed out
[ 1097.387913] WARNING: CPU: 0 PID: 206 at
/usr/src/kernel/net/sched/sch_generic.c:443
While running selftests bpf test_sysctl on stable rc 4.19 branch kernel
on arm64 hikey device, the following warning was noticed.
[ 118.957395] test_bpf: #296 BPF_MAXINSNS: exec all MSH
[ 148.966435] [ cut here ]
[ 148.988349] NETDEV WATCHDOG: eth0 (asix): transmit
On Tue, Mar 20, 2018 at 11:41:06AM +0530, Satish Baddipadige wrote:
> Can you please test the attached patch?
Well, the network connection just died with it. It didn't fire the
netdev watchdog but I still had to down and up eth0 in order to continue
using it. ssh connection into the box
On Tue, Mar 20, 2018 at 11:41:06AM +0530, Satish Baddipadige wrote:
> Can you please test the attached patch?
Sure, will do when I get back next week.
Thx.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply. Srsly.
On Wed, Feb 28, 2018 at 7:40 PM, Siva Reddy Kallam
wrote:
> On Sat, Feb 24, 2018 at 3:48 PM, Borislav Petkov wrote:
>> Hi,
>>
>> this didn't happen before but after 4.16-rc1 my tg3 nic stops for
>> whatever reason and the connection to the machine is dead. It didn't show
>> anything in dmesg unti
On Sat, Feb 24, 2018 at 3:48 PM, Borislav Petkov wrote:
> Hi,
>
> this didn't happen before but after 4.16-rc1 my tg3 nic stops for
> whatever reason and the connection to the machine is dead. It didn't show
> anything in dmesg until today.
>
> The IO pagefaults look like it is trying to access so
ddress=0x0001f180 flags=0x]
[ 64.992145] [ cut here ]--------
[ 64.992406] NETDEV WATCHDOG: eth0 (tg3): transmit queue 0 timed out
[ 64.992742] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:464
dev_watchdog+0x1fe/0x210
[ 64.992744] Modules linked in: ar
Hi Folks,
I'm running slackware linux 14 32 bits as a firewall/ipsec
gateway with linux 3.18.0
I got this error just after 12 hours uptime:
WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:303 dev_watchdog+0xee/0x174()
NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
Modules link
eric.c:255 dev_watchdog+0x165/0x220()
Nov 26 04:18:34 localhost kernel: [ 7814.197892] NETDEV WATCHDOG: eth0
(igb): transmit queue 7 timed out
Nov 26 04:18:34 localhost kernel: [ 7814.197894] Modules linked in: tun
nfsv3 nfs_acl nfs fscache dm_multipath scsi_dh lockd sunrpc openvswitch
ipt_REJEC
On 05/02/14 20:43, Andrew Cooper wrote:
On 05/02/2014 20:23, Zoltan Kiss wrote:
On 04/02/14 19:47, Michael Chan wrote:
On Fri, 2014-01-31 at 14:29 +0100, Zoltan Kiss wrote:
[ 5417.275472] WARNING: at net/sched/sch_generic.c:255
dev_watchdog+0x156/0x1f0()
[ 5417.275474] NETDEV WATCHDOG: eth1
On 05/02/2014 20:23, Zoltan Kiss wrote:
> On 04/02/14 19:47, Michael Chan wrote:
>> On Fri, 2014-01-31 at 14:29 +0100, Zoltan Kiss wrote:
>>> [ 5417.275472] WARNING: at net/sched/sch_generic.c:255
>>> dev_watchdog+0x156/0x1f0()
>>> [ 5417.275474] NETDEV WAT
On 05/02/14 20:23, Zoltan Kiss wrote:
On 04/02/14 19:47, Michael Chan wrote:
On Fri, 2014-01-31 at 14:29 +0100, Zoltan Kiss wrote:
[ 5417.275472] WARNING: at net/sched/sch_generic.c:255
dev_watchdog+0x156/0x1f0()
[ 5417.275474] NETDEV WATCHDOG: eth1 (bnx2): transmit queue 2 timed out
The
On 04/02/14 19:47, Michael Chan wrote:
On Fri, 2014-01-31 at 14:29 +0100, Zoltan Kiss wrote:
[ 5417.275472] WARNING: at net/sched/sch_generic.c:255
dev_watchdog+0x156/0x1f0()
[ 5417.275474] NETDEV WATCHDOG: eth1 (bnx2): transmit queue 2 timed out
The dump shows an internal IRQ pending on MSIX
On 31/01/14 18:56, Wei Liu wrote:
On Thu, Jan 30, 2014 at 07:08:11PM +, Zoltan Kiss wrote:
Hi,
I've experienced some queue timeout problems mentioned in the
subject with igb and bnx2 cards. I haven't seen them on other cards
so far. I'm using XenServer with 3.10 Dom0 kernel (however igb wer
On Fri, 2014-01-31 at 14:29 +0100, Zoltan Kiss wrote:
> [ 5417.275472] WARNING: at net/sched/sch_generic.c:255
> dev_watchdog+0x156/0x1f0()
> [ 5417.275474] NETDEV WATCHDOG: eth1 (bnx2): transmit queue 2 timed out
The dump shows an internal IRQ pending on MSIX vector 2 which matche
On Thu, Jan 30, 2014 at 07:08:11PM +, Zoltan Kiss wrote:
> Hi,
>
> I've experienced some queue timeout problems mentioned in the
> subject with igb and bnx2 cards. I haven't seen them on other cards
> so far. I'm using XenServer with 3.10 Dom0 kernel (however igb were
> already updated to late
that may be useful. Thanks.
Hi,
Here is some:
[ 5417.275463] [ cut here ]
[ 5417.275472] WARNING: at net/sched/sch_generic.c:255
dev_watchdog+0x156/0x1f0()
[ 5417.275474] NETDEV WATCHDOG: eth1 (bnx2): transmit queue 2 timed out
[ 5417.275476] Modules linked in: tun
On Thu, 2014-01-30 at 19:08 +, Zoltan Kiss wrote:
> I've experienced some queue timeout problems mentioned in the subject
> with igb and bnx2 cards.
Please provide the full tx timeout dmesg. bnx2 dumps some diagnostic
information during tx timeout that may be useful. Thanks.
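
When collecting the timeout lines from a dmesg or syslog dump like the ones requested here, a small filter can help. The following is only a sketch: the regular expression is modelled on the message format seen in the reports in this listing ("NETDEV WATCHDOG: eth1 (bnx2): transmit queue 2 timed out"), and the function name find_watchdog_events is hypothetical. It does not match the older "eth0: transmit timed out" format or messages that a mail client wrapped across two lines.

```python
import re

# Pattern modelled on the per-queue watchdog message quoted in these reports.
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_watchdog_events(log_text):
    """Return a list of (interface, driver, queue) tuples found in log_text."""
    events = []
    for match in WATCHDOG_RE.finditer(log_text):
        events.append(
            (match.group("iface"), match.group("driver"), int(match.group("queue")))
        )
    return events
```

Feeding it the full log text yields one tuple per matching line, which makes it easy to see which driver and queue are affected.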
--
To unsubsc
Hi,
I've experienced some queue timeout problems mentioned in the subject
with igb and bnx2 cards. I haven't seen them on other cards so far. I'm
using XenServer with 3.10 Dom0 kernel (however igb were already updated
to latest version), and there are Windows guests sending data through
these
Nick,
You could try 7.3.21-k8-NAPI in tree or the out-of-tree version as
Bjorn mentioned.
Reading and debugging an old driver version is not an interesting thing for
somebody to do.
Thanks,
Ethan
On Tue, Dec 3, 2013 at 9:33 PM, Nick Pegg wrote:
> On Mon, Dec 2, 2013 at 10:51 PM, Ethan Zhao wrote
On Mon, Dec 2, 2013 at 10:51 PM, Ethan Zhao wrote:
> Bjorn,
>Seems not the same bug as http://sourceforge.net/p/e1000/bugs/367/
> , Nick is not running his kernel on bare metal, per the error log,
> he runs his kernel as HVM DomU guest or Dom0 on XEN ? so just a check
> of NULL will not fix
le similar report: http://sourceforge.net/p/e1000/bugs/367/ (no
> real data there).
>
>>
>> Nov 16 07:03:19 rx ----[ cut here ]
>> Nov 16 07:03:19 rx WARNING: at net/sched/sch_generic.c:255
>> dev_watchdog+0x25b/0x270()
>> Nov 16 07:0
From: Nick Pegg [mailto:n...@nickpegg.com]
Sent: Monday, December 02, 2013 2:57 PM
To: linux-kernel@vger.kernel.org; e1000-de...@lists.sourceforge.net
Subject: Re: [E1000-devel] NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0
timed out
> Intel maintains newer drivers out-of-tree at
> Intel maintains newer drivers out-of-tree at
> http://sourceforge.net/projects/e1000/, and it's possible this is some
> bug that has already been fixed. The current version there looks like
> e1000e-2.5.4, released 2013-09-05.
>
> Possible similar report: http://sourceforge.net/p/e1000/bugs/367/
plies
since I'm not subscribed to this list. Thanks!
-Nick
Nov 16 07:03:19 rx [ cut here ]
Nov 16 07:03:19 rx WARNING: at net/sched/sch_generic.c:255
dev_watchdog+0x25b/0x270()
Nov 16 07:03:19 rx Hardware name: X8DT6
Nov 16 07:03:19 rx NETDEV WATCHDOG: eth0 (
Hi
> On 26 March 2013 13:44, Andrew Brooks wrote:
>> Using niu driver for this card: Oracle/SUN Multithreaded 10-Gigabit
>> Ethernet Network Controller and after a period the interface will hang
>> with errors every 5 seconds
>> "niu: xxx: eth2: Transmit timed out, resetting"
Here's more informa
"
>
> Sometimes also in syslog are messages
> WARNING: at sch_generic:255 dev_watchdog
> NETDEV WATCHDOG: eth2 (niu): transmit queue 10 timed out
Do you think this could be caused by a problem I've seen reported
by other machines on the network
"received unsolicited ack for DL_UN
ARNING: at sch_generic:255 dev_watchdog
NETDEV WATCHDOG: eth2 (niu): transmit queue 10 timed out
Does anyone know which driver revision has fixed this problem or if
it's still buggy?
Thanks!
Andrew
P.S. My guess is the commit on 2012-10-02 ??
2013-02-04 ethernet: Remove unnecessary all
Mar 20 04:55:33 2013] WARNING: at
/home/abuild/rpmbuild/BUILD/kernel-desktop-3.8.3/linux-3.8/net/sched/sch_generic.c:254
dev_watchdog+0x1e0/0x1f0()
[Wed Mar 20 04:55:33 2013] Hardware name: 200763G
[Wed Mar 20 04:55:33 2013] NETDEV WATCHDOG: eth1 (ipheth): transmit queue 0
timed out
[Wed Mar 20 04
Jörg Otte :
[...]
> To summarize: two net regressions were introduced in v3.8 (driver r8169):
>
> 1) NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> was introduced by commit
> e0c075577965d1c01b30038d38bf637b027a1df3
> ("r8169: enable ALDPS for power saving"
>r8169: enable ALDPS for power saving
>>
>> That's it! This fixes the problem for me!
>>
>> Thanks, Jörg
>
>
> We are closely before v3.8 and I didn't see a solution
> so far.
> What is the plan regarding this issue(s)?
>
> Thanks, Jörg
2013/1/6 Jörg Otte :
> 2013/1/5 Francois Romieu :
>> Can you check if things improve with v3.8-rc2 after removing :
>>
>> 1. 9ecb9aabaf634677c77af467f4e3028b09d7bcda
>>r8169: workaround for missing extended GigaMAC registers
>> 2. d64ec841517a25f6d468bde9f67e5b4cffdc67c7
>>r8169: enable int
2013/1/5 Francois Romieu :
> Can you check if things improve with v3.8-rc2 after removing :
>
> 1. 9ecb9aabaf634677c77af467f4e3028b09d7bcda
>r8169: workaround for missing extended GigaMAC registers
> 2. d64ec841517a25f6d468bde9f67e5b4cffdc67c7
>r8169: enable internal ASPM and clock request
Jörg Otte :
[...]
> jojo@ahorn:~$ dmesg | grep XID
> [1.808847] r8169 :02:00.0 eth0: RTL8168evl/8111evl at
> 0xc9054000, 5c:9a:d8:69:2b:39, XID 0c900800 IRQ 42
Can you check if things improve with v3.8-rc2 after removing :
1. 9ecb9aabaf634677c77af467f4e3028b09d7bcda
r8169: wo
2013/1/5 Francois Romieu :
> Jörg Otte :
> [...]
>> It's a regression, it never happened before 3.8-rc.
>
> Please check that 'dmesg | grep XID' exhibits a 8168evl.
jojo@ahorn:~$ dmesg | grep XID
[1.808847] r8169 :02:00.0 eth0: RTL8168evl/8111evl at
0xc9054000, 5c:9a:d8:69:2b:39, X
Jörg Otte :
[...]
> It's a regression, it never happened before 3.8-rc.
Please check that 'dmesg | grep XID' exhibits a 8168evl.
I'll showe and dig it. It's epidemic.
--
Ueimor
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.
I frequently see the following in the syslog:
[ 184.552914] [ cut here ]
[ 184.552927] WARNING: at
/data/kernel/linux/net/sched/sch_generic.c:254
dev_watchdog+0xf2/0x151()
[ 184.552929] Hardware name: LIFEBOOK AH532
[ 184.552932] NETDEV WATCHDOG: eth0 (r8169): transmit
Steffen Klassert wrote:
On Fri, Nov 23, 2007 at 04:52:39PM +0100, BERTRAND Joël wrote:
BERTRAND Joël wrote:
Hello,
Since I have installed a 2.6.23.1 linux kernel on my U60, I can see
several NETDEV WATCHDOG. This trouble never occurs with 2.6.23-rc4.
This bug occurs after a random
On Fri, Nov 23, 2007 at 04:52:39PM +0100, BERTRAND Joël wrote:
> BERTRAND Joël wrote:
> >Hello,
> >
> >Since I have installed a 2.6.23.1 linux kernel on my U60, I can see
> >several NETDEV WATCHDOG. This trouble never occurs with 2.6.23-rc4.
> >This
BERTRAND Joël wrote:
Hello,
Since I have installed a 2.6.23.1 linux kernel on my U60, I can see
several NETDEV WATCHDOG. This trouble never occurs with 2.6.23-rc4.
This bug occurs after a random uptime.
I made the same observation this evening on an amd64/up with two
3C905 and
Hi,
after reading about issues with the nics on kontron boards I did a
bios upgrade,
but this did not change anything.
However, yesterday the nic (onboard) I used died. No link at all,
after switching to
the next onboard nic I got a NETDEV transmit timeout with that one on
kernel 2.6.22-r2.
It se
Hi Francois,
this is what I found and sent:
The error exists from patch 2 on. I did some network testing with
patch 1 and currently use it and have no errors so far.
From my experience so far, patch 1 should be error free.
Do you need additional info?
2007/9/12, Francois Romieu <[EMAIL PROT
Karl Meyer <[EMAIL PROTECTED]> :
[...]
> I have been looking into this issue for some time now, but there were no
> errors in 2.6.22-r2 (gentoo speak, I guess this is 2.6.22.2
> officially), I also ran git-bisect (for more information see the older
> messages in this thread).
2.6.22-r2 in gentoo is base
>
> On 01/09/07, Karl Meyer <[EMAIL PROTECTED]> wrote:
> > This is what happened today:
> >
> > Sep 1 21:08:01 frege NETDEV WATCHDOG: eth0: transmit timed out
> > frege ~ # uname -r
> > 2.6.22.5-cfs-v20.5
>
> Can you reproduce this on 2.6.22 (not 2.
Hi,
On 01/09/07, Karl Meyer <[EMAIL PROTECTED]> wrote:
> This is what happened today:
>
> Sep 1 21:08:01 frege NETDEV WATCHDOG: eth0: transmit timed out
> frege ~ # uname -r
> 2.6.22.5-cfs-v20.5
Can you reproduce this on 2.6.22 (not 2.6.22.x - it might be a -stable
regressi
This is what happened today:
Sep 1 21:08:01 frege NETDEV WATCHDOG: eth0: transmit timed out
frege ~ # uname -r
2.6.22.5-cfs-v20.5
2007/8/16, Francois Romieu <[EMAIL PROTECTED]>:
> (please do not remove the netdev Cc:)
>
> Francois Romieu <[EMAIL PROTECTED]> :
> [...]
On 21-08-2007 12:56, Karl Meyer wrote:
> fyi:
> I do not know whether it is related to the problem, but since using
> the version you told me there are these entries in my log:
> frege Hangcheck: hangcheck value past margin!
...
BTW, I don't know whether it's related too, but I think you should try
fyi:
I do not know whether it is related to the problem, but since using
the version you told me there are these entries in my log:
frege Hangcheck: hangcheck value past margin!
frege Hangcheck: hangcheck value past margin!
frege Hangcheck: hangcheck value past margin!
2007/8/16, Francois Romie
The error exists from patch 2 on. I did some network testing with
patch 1 and currently use it and have no errors so far.
From my experience so far, patch 1 should be error free.
2007/8/16, Francois Romieu <[EMAIL PROTECTED]>:
> (please do not remove the netdev Cc:)
>
> Francois Romieu <[EMAIL
I did some testing today and found that the error occurs after
applying some of the patches. However I did not figure out the exact
patch in which the error "starts" since it sometimes occurs immediately
when moving some data over the net and sometimes it takes 30 min till
I get the transmit timeout
(please do not remove the netdev Cc:)
Francois Romieu <[EMAIL PROTECTED]> :
[...]
> If it does not work I'll dissect 0e4851502f846b13b29b7f88f1250c980d57e944
> tomorrow.
You will find a tgz archive in attachment which contains a series of patches
(0001-... to 0005-...) to walk from 6dccd16b7c2703e
Karl Meyer <[EMAIL PROTECTED]> :
> I did some additional testing, the results are:
> [0e4851502f846b13b29b7f88f1250c980d57e944] r8169: merge with version
> 8.001.00 of Realtek's r8168 driver
> does not work; after some traffic the transmit timeout occurs.
> [6dccd16b7c2703e8bbf8bca62b5cf248332afb
I did some additional testing, the results are:
[0e4851502f846b13b29b7f88f1250c980d57e944] r8169: merge with version
8.001.00 of Realtek's r8168 driver
does not work; after some traffic the transmit timeout occurs.
[6dccd16b7c2703e8bbf8bca62b5cf248332afbe2] r8169: merge with version
6.001.00 of R
Sorry, I was wrong, still testing
2007/8/14, Francois Romieu <[EMAIL PROTECTED]>:
> Karl Meyer <[EMAIL PROTECTED]> :
> [...]
> > dmesg, interrupts and .config are attached. I will have a look at git
> > bisect.
>
> Can you reproduce the problem when nvidia binary-only stuff is not loaded
> af
it by doing "git revert
0127215c17414322b350c3c6fbd1a7d8dd13856f" on my git clone, now I am
happily running 2.6.23-rc3-ge60a without the NETDEV WATCHDOG message.
2007/8/14, Francois Romieu <[EMAIL PROTECTED]>:
> Karl Meyer <[EMAIL PROTECTED]> :
> [...]
> > dmesg, interrupts and .config are att
Karl Meyer <[EMAIL PROTECTED]> :
[...]
> dmesg, interrupts and .config are attached. I will have a look at git bisect.
Can you reproduce the problem when nvidia binary-only stuff is not loaded
after boot ?
--
Ueimor
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the
g. when installing packages over the nvsv4 share, all
> network stuff freezes for some time and syslog tells me:
> Aug 13 13:16:09 frege NETDEV WATCHDOG: eth0: transmit timed out
> Aug 13 13:16:39 frege NETDEV WATCHDOG: eth0: transmit timed out
> Aug 13 13:17:09 frege NETDEV WATCHDOG: eth
for some time and syslog tells me:
Aug 13 13:16:09 frege NETDEV WATCHDOG: eth0: transmit timed out
Aug 13 13:16:39 frege NETDEV WATCHDOG: eth0: transmit timed out
Aug 13 13:17:09 frege NETDEV WATCHDOG: eth0: transmit timed out
Aug 13 13:17:57 frege NETDEV WATCHDOG: eth0: transmit timed out
Some info
-stable review patch. If anyone has any objections, please let us know.
--
There's a bug in the driver that only initializes half of the context
memory on the 5708. Surprisingly, this works most of the time except
for some occasional netdev watchdogs when sending a lot of 64-by
On Thu, 19 Apr 2007, Tomasz Chmielewski wrote:
I also have recurrent problems with
NETDEV WATCHDOG: eth0: transmit timed out
If you search the list, you'll find several similar reports about the tulip
driver (NETDEV WATCHDOG: eth0: transmit timed out).
Adding noapic/nolapic/noacpi op
> I also have recurrent problems with
> NETDEV WATCHDOG: eth0: transmit timed out
I remember having it with some older kernels on Fujitsu-Siemens Scenic
machines.
If you search the list, you'll find several similar reports about the
tulip driver (NETDEV WATCHDOG: eth0: transmi
Package: linux-kernel
Version: 2.6.18-4-686 (Debian 2.6.18.dfsg.1-12)
(Submitted to linux-kernel@vger.kernel.org && [EMAIL PROTECTED])
I also have recurrent problems with
NETDEV WATCHDOG: eth0: transmit timed out
I am running on a Pentium 3 with a Linksys LNE100TX V5.1
PCI ethernet car
On Fri, Apr 06, 2007 at 07:19:25PM +0100, Christian Kujau wrote:
> On Wed, 4 Apr 2007, Christian Kujau wrote:
> >>Maybe it's a real locking problem. Here are some more
> >>suggestions for testing (if you don't find anything better):
> >>- try without SMP, so: 'acpi=off lapic nosmp'
>
> We were abl
On Fri, 6 Apr 2007, Christian Kujau wrote:
but yes, these seem to be different problems; for the curious among you I've
put details here: http://nerdbynature.de/bits/2.6.20.4/db2/
that's http://nerdbynature.de/bits/2.6.20.4/db1/2/ sorry.
--
BOFH excuse #270:
Someone has messed up the kerne
On Wed, 4 Apr 2007, Christian Kujau wrote:
Maybe it's a real locking problem. Here are some more
suggestions for testing (if you don't find anything better):
- try without SMP, so: 'acpi=off lapic nosmp'
We were able to have our hosting provider replace the 8139too with an
E100, the onboard
On Wed, Apr 04, 2007 at 02:20:23PM +0100, Christian Kujau wrote:
> On Wed, 4 Apr 2007, Jarek Poplawski wrote:
> >So, it's a lot sooner than before. (BTW, isn't there anything
> >in debug log?)
>
> No, nothing. I've set up remote-syslogging to the other node (node1
> logging to node2 and vice versa
On Wed, 4 Apr 2007, Francois Romieu wrote:
No serial cable ?
No, unfortunately this hosting provider does not have a serial console
to access :(
4 - try:
http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.21-rc5/r8169-20070402
Are they in -rc5 yet or 'not in -rc5 but should be applied to -r
Christian Kujau <[EMAIL PROTECTED]> :
[...]
> Actually I was thinking about *using* netconsole, since even setting up
> remote (userspace-)syslog left nothing on the syslog-server, when the
> machine crashed. But if it's b0rked in 8139, I will refrain from doing
> so.
Please refrain :o)
No ser
On Wed, 4 Apr 2007, Denys wrote:
IMHO it can be a hardware issue also; I had something very similar with faulty
hardware combinations.
Since it's happening on 2 nodes, I somehow doubt that...
--
BOFH excuse #447:
According to Microsoft, it's by design
-
To unsubscribe from this list: send the l
IMHO it can be a hardware issue also; I had something very similar with faulty
hardware combinations.
On Wed, 4 Apr 2007 13:21:00 +0200, Jarek Poplawski wrote
> On Tue, Apr 03, 2007 at 04:19:46PM +0100, Christian Kujau wrote:
> > On Tue, 3 Apr 2007, Jarek Poplawski wrote:
> > >Did you try with 8139
On Wed, 4 Apr 2007, Jarek Poplawski wrote:
So, it's a lot sooner than before. (BTW, isn't there anything
in debug log?)
No, nothing. I've set up remote-syslogging to the other node (node1
logging to node2 and vice versa) - nothing :(
I see both CPUs did interrupt handling again.
Yes, when
On Tue, 3 Apr 2007, Francois Romieu wrote:
Christian Kujau <[EMAIL PROTECTED]> :
If the apic voodoo makes no difference, you can:
1 - leave it enabled
Well, we tried to boot with ACPI compiled in again, but disabled during
boot:
- acpi=off lapic, crashed after 1h (almost exactly) of service
On Tue, Apr 03, 2007 at 04:19:46PM +0100, Christian Kujau wrote:
> On Tue, 3 Apr 2007, Jarek Poplawski wrote:
> >Did you try with 8139cp instead of 8139too?
>
> Tried that, 8139cp could not be loaded :(
Sorry for misleading!
> >(Maybe even try some other card to narrow the problem?)
> >You could
Christian Kujau <[EMAIL PROTECTED]> :
[...]
> Please see http://nerdbynature.de/bits/2.6.20.4/ for details for both
> hosts and feel free to ask for more details. Although both boxes are in
> production we'll be happy to test more boot options/patches and the like.
If the apic voodoo makes no differ
Christian Kujau <[EMAIL PROTECTED]> :
> On Tue, 3 Apr 2007, Jarek Poplawski wrote:
> >Did you try with 8139cp instead of 8139too?
>
> Tried that, 8139cp could not be loaded :(
It is a different beast.
--
Ueimor
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the bod
On 4/3/07, Christian Kujau <[EMAIL PROTECTED]> wrote:
On Tue, 3 Apr 2007, Robert Hancock wrote:
> Although it's not as bad with servers, many machines are designed to run only
> Windows (which normally always uses ACPI) and simply aren't tested well or at
> all with ACPI disabled so you can run i
On Tue, 3 Apr 2007, Robert Hancock wrote:
These days I think it's usually best to have ACPI on with current systems.
Whooha, really? While I honor the acpi-folks' work when using a desktop
machine, I am otherwise always reminded of the comment in
arch/i386/kernel/apm.c, which basically says: "
On Tue, 3 Apr 2007, Jarek Poplawski wrote:
Did you try with 8139cp instead of 8139too?
Tried that, 8139cp could not be loaded :(
(Maybe even try some other card to narrow the problem?)
You could also try to test without ehci, if it's possible.
USB has been disabled completely. After booting
On Mon, 2 Apr 2007, Chuck Ebbert wrote:
Where is the info from before you changed to "noapic"? Or were the
machines always using XT-PIC for all the interrupts???
We booted with 'acpi=off lapic' (with ACPI options compiled in, to be
able to boot with acpi=on later on) and the box locked up agai
Christian Kujau wrote:
Len et al., do you even suggest using ACPI on a server system at all? I
myself always thought of ACPI as evil and something to avoid when possible
(thus switching it off completely on a server system).
These days I think it's usually best to have ACPI on with current
systems.
On Tue, 3 Apr 2007, Jarek Poplawski wrote:
Did you try with 8139cp instead of 8139too?
I forgot about that, thanks.
(Maybe even try some other card to narrow the problem?)
We're trying to convince our hosting provider to replace the NIC with an
e1000.
You could also try to test without ehci
On 02-04-2007 21:41, Christian Kujau wrote:
>
> Hi there,
>
> we have serious problems with 2 of our servers: both shiny new amd64
> dual core, with both 2GB RAM, 32bit kernel+userland (Debian/testing).
> Both servers have 2 NICs, RTL8139 (eth0, irq10) and RTL8169s
> (eth1, irq11).
Hi,
Did you
be able to do so?
Len et al., do you even suggest using ACPI on a server system at all? I
myself always thought of ACPI as evil and something to avoid when possible
(thus switching it off completely on a server system).
Since these NETDEV WATCHDOG issues seem to be a "known issue" (kinda,
On Mon, 2 Apr 2007, Chuck Ebbert wrote:
Where is the info from before you changed to "noapic"? Or were the
machines always using XT-PIC for all the interrupts???
XT-PIC is only used since we switched to noapic, before there was
IO-APIC-fasteoi on both ethernet cards and interrupts were balance
69s
> (eth1, irq11).
>
> Both boxes are running fine but after "a while" they lock up and
> eventually restart all of a sudden. The last messages in the logfile
> are:
>
> 14:15:11 db2 kernel: NETDEV WATCHDOG: eth0: transmit timed out
> 14:15:14 db2 kernel: eth0: lin
On Mon, 2 Apr 2007, Chuck Ebbert wrote:
Please see http://nerdbynature.de/bits/2.6.20.4/ for details for both
hosts and feel free to ask for more details. Although both boxes are in
production we'll be happy to test more boot options/patches and the like.
Where is the info from before you changed t
Christian Kujau wrote:
>
> Please see http://nerdbynature.de/bits/2.6.20.4/ for details for both
> hosts and feel free to ask for more details. Although both boxes are in
> production we'll be happy to test more boot options/patches and the like.
Where is the info from before you changed to "noapic"?
ock up and
eventually restart all of a sudden. The last messages in the logfile
are:
14:15:11 db2 kernel: NETDEV WATCHDOG: eth0: transmit timed out
14:15:14 db2 kernel: eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Then the box reboots, nothing else in the log.
As the servers have been set u
How can you ensure
it's a kernel panic if you only see a freeze ? I mean, there can be a
deadlock somewhere causing a freeze without necessarily a panic,
even though you have an oops.
> Before that panic occurs, I have "NETDEV WATCHDOG: eth1: transmit
> timed out" . always right
I would really appreciate someone looking at that oops below, and just
giving me some hints to the problem please.
Oops:
Unable to handle kernel paging request at virtual address 909fe955
c013b582
*pde =
Oops:
CPU: 0
eax: 909fe93d ebx: 0001 ecx: 909fe93d edx:
nning,
freezes. It is clearly a kernel panic.
Before that panic occurs, I have "NETDEV WATCHDOG: eth1: transmit
timed out" . always right before crash.
I am sure, the software doesn't have memleaks. Everything in pcap was
programmed according with examples, and double checked - but I can
After many hours of stressing the network I could reproduce NETDEV
WATCHDOG: eth1: transmit timed out once on rt-14, but it is not as frequent as
on rt15. I reproduced it only once on rt14, under very heavy stress; on
rt15 it is much more frequent.
On Sun, 2006-12-17 at 00:20 +, Sergio Monteiro Basto