On 1/9/07, Jarek Poplawski <[EMAIL PROTECTED]> wrote:
> On Tue, Jan 09, 2007 at 11:27:59AM +0100, Thibaut VARENE wrote:
> ...
> > I suspected both and changed both the disk and the ram for quality
> > parts, that I tested afterwards. Both passed thorough tests.
> >
> > Finally, using the other NIC on the box (a VIA Rhine II, 100Mbps),
> > works absolutely fine.

> If you are not tired, I'd suggest two more tests:

I volunteered to help :)

For the sake of testing up-to-date code, I performed the following
tests with 2.6.20-rc4.

The first test was the usual NFS video playback; the box went down in
about 20 minutes. The crash dump is panic-2.6.20-rc4-nfs.txt.

> - as above but with NIC set to 100Mbps also,

Couldn't crash the machine (or at least it didn't happen in the time
frame I was willing to wait while doing ftp downloads, ~20 min). One
note though:

The throughput of the card was terrible when set to 100-FD: I
couldn't get more than 5.5 MB/s doing an ftp get written to /dev/null
(to rule out disk performance), i.e. less than half the max link speed,
though the /only/ thing I changed in the setup was the link speed (same
switch - made sure it properly detected link speed/duplex, same file
server, same everything else).

When configured in 1000-FD, still writing to /dev/null, I could get
about 60 MB/s. Again about half the link speed, but there I suppose the
remote fileserver couldn't pull data from the disks any faster :)
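
For reference, the "half link speed" figures can be checked with a few
lines of arithmetic. This is only a rough sketch: the helper name
link_utilisation is made up for illustration, MB is assumed to mean
10^6 bytes, and the nominal line rate is used as the ceiling, so
Ethernet/IP/TCP header overhead is ignored.

```python
def link_utilisation(throughput_mb_s, link_mbit_s):
    """Fraction of the nominal link rate used by a measured throughput.

    Hypothetical helper: assumes MB = 10^6 bytes and ignores protocol
    overhead, so the real achievable ceiling is somewhat lower.
    """
    link_mb_s = link_mbit_s / 8.0  # 8 bits per byte
    return throughput_mb_s / link_mb_s


# Figures reported above: 5.5 MB/s at 100-FD, about 60 MB/s at 1000-FD.
for label, measured, link in (("100-FD", 5.5, 100), ("1000-FD", 60.0, 1000)):
    ratio = link_utilisation(measured, link)
    print(f"{label}: {measured} MB/s is {ratio:.0%} of the nominal line rate")
```

With these assumptions both runs come out a little under half of the
nominal rate.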

> - long downloading but without nfs e.g. ftp

That was fast and easy. In 1000-FD, I took down the box in 2 s (after
downloading 90 MB). The crash dump is panic-2.6.20-rc4-ftp.txt.

> (btw. there were some patches after 2.6.19
> for rpc memory races).

It seems that's something else. I think I also reproduced the bug
while surfing the internet with Firefox, but unfortunately I didn't
have a serial line hooked up to capture a dump.

> PS: Maintainers were cc-ed, I hope?

Now they are :)

HTH

T-Bone

--
Thibaut VARENE
http://www.parisc-linux.org/~varenet/
Debian GNU/Linux 4.0 Alucard ttyS0                                              
                                                                                
Alucard login: ------------[ cut here ]------------                             
kernel BUG at drivers/net/mv643xx_eth.c:1071!                                   
Oops: Exception in kernel mode, sig: 5 [#1]                                     
PREEMPT                                                                         
Modules linked in: eeprom sbp2 scsi_mod eth1394 uhci_hcd ohci1394 parport_pc pae
NIP: C0210B40 LR: C02126DC CTR: C0212620                                        
REGS: da247ac0 TRAP: 0700   Not tainted  (2.6.20-rc4)                           
MSR: 00021032 <ME,IR,DR>  CR: 28222488  XER: 00000000                           
TASK = db82a050[1780] 'ncftp' THREAD: da246000                                  
GPR00: 00000000 DA247B70 DB82A050 CFB14260 CFB14000 0000000B DED5FD72 00000000  
GPR08: 00000819 00000001 00001000 0000081A 48222422 10056CD0 28004422 C03D9BF8  
GPR16: 00000000 00000000 00000000 DA246000 00000001 CFB142BC 00009032 00000000  
GPR24: 00000000 00000000 C03E0000 CFB14000 C0212620 DEDFD160 CFB14260 DED5FD40  
NIP [C0210B40] eth_alloc_tx_desc_index+0x44/0x50                                
LR [C02126DC] mv643xx_eth_start_xmit+0xbc/0x3b8                                 
Call Trace:                                                                     
[DA247B70] [DED5FD70] 0xded5fd70 (unreliable)                                   
[DA247BB0] [C029F258] dev_hard_start_xmit+0x1d4/0x2c8                           
[DA247BD0] [C02A1BF4] dev_queue_xmit+0x2bc/0x334                                
[DA247BF0] [C02BC8A8] ip_output+0x120/0x244                                     
[DA247C10] [C02BD8DC] ip_queue_xmit+0x17c/0x408                                 
[DA247C80] [C02CEB1C] tcp_transmit_skb+0x358/0x7bc                              
[DA247CC0] [C02CBF80] __tcp_ack_snd_check+0x64/0xbc                             
[DA247CD0] [C02CDA94] tcp_rcv_established+0x5d4/0x980                           
[DA247D00] [C02D4764] tcp_v4_do_rcv+0xe0/0x3c0                                  
[DA247D30] [C0294B58] release_sock+0x7c/0xf4                                    
[DA247D50] [C02C5C1C] tcp_recvmsg+0x4c8/0xbcc                                   
[DA247DB0] [C0294490] sock_common_recvmsg+0x3c/0x60                             
[DA247DD0] [C02920E4] sock_aio_read+0x10c/0x114                                 
[DA247E30] [C006F210] do_sync_read+0xc4/0x138                                   
[DA247EF0] [C006FECC] vfs_read+0x19c/0x1a4                                      
[DA247F10] [C00702E4] sys_read+0x4c/0x90                                        
[DA247F40] [C00122EC] ret_from_syscall+0x0/0x38                                 
--- Exception: c01 at 0xff5ba98                                                 
    LR = 0x10032fc0                                                             
Instruction dump:                                                               
5400fffe 0f000000 81030020 81230024 39680001 7c0b53d6 7c0051d6 7d605850         
7d694a78 91630020 7d290034 5529d97e <0f090000> 7d034378 4e800020 2f840001       
 <0>Kernel panic - not syncing: Fatal exception in interrupt                    
 <0>Rebooting in 180 seconds..<4>atkbd.c: Spurious ACK on isa0060/serio0. Some .
atkbd.c: Spurious ACK on isa0060/serio0. Some program might be trying access ha.
atkbd.c: Spurious ACK on isa0060/serio0. Some program might be trying access ha.
atkbd.c: Spurious ACK on isa0060/serio0. Some program might be trying access ha.
atkbd.c: Spurious ACK on isa0060/serio0. Some program might be trying access ha.
Debian GNU/Linux 4.0 Alucard ttyS0                                              
                                                                                
Alucard login: [drm] Setting GART location based on new memory map              
[drm] Loading R200 Microcode                                                    
[drm] writeback test succeeded in 1 usecs                                       
------------[ cut here ]------------                                            
kernel BUG at drivers/net/mv643xx_eth.c:1071!                                   
Oops: Exception in kernel mode, sig: 5 [#1]                                     
PREEMPT                                                                         
Modules linked in: nfs lockd sunrpc                                             
NIP: C0210B40 LR: C02126DC CTR: C0212620                                        
REGS: d8961aa0 TRAP: 0700   Not tainted  (2.6.20-rc4)                           
MSR: 00021032 <ME,IR,DR>  CR: 24022088  XER: 00000000                           
TASK = dffd91e0[3879] 'rpciod/0' THREAD: d8960000                               
GPR00: 00000000 D8961B50 DFFD91E0 CFB1E260 CFB1E000 0000000B DECA91B2 00000000  
GPR08: 00000B6A 00000001 00001000 00000B6B 44022022 FFF045B4 009B52B4 017FFA7C  
GPR16: 009B52AC 017FFA80 009B4E68 D8960000 C03B0000 CFB1E2BC 00009032 00000000  
GPR24: 00000000 00000000 C03E0000 CFB1E000 C0212620 DF0033A0 CFB1E260 DECA9180  
NIP [C0210B40] eth_alloc_tx_desc_index+0x44/0x50                                
LR [C02126DC] mv643xx_eth_start_xmit+0xbc/0x3b8                                 
Call Trace:                                                                     
[D8961B50] [DECA91B0] 0xdeca91b0 (unreliable)                                   
[D8961B90] [C029F258] dev_hard_start_xmit+0x1d4/0x2c8                           
[D8961BB0] [C02A1BF4] dev_queue_xmit+0x2bc/0x334                                
[D8961BD0] [C02BC8A8] ip_output+0x120/0x244                                     
[D8961BF0] [C02BD8DC] ip_queue_xmit+0x17c/0x408                                 
[D8961C60] [C02CEB1C] tcp_transmit_skb+0x358/0x7bc                              
[D8961CA0] [C02CBF80] __tcp_ack_snd_check+0x64/0xbc                             
[D8961CB0] [C02CDA94] tcp_rcv_established+0x5d4/0x980                           
[D8961CE0] [C02D4764] tcp_v4_do_rcv+0xe0/0x3c0                                  
[D8961D10] [C02D6F2C] tcp_v4_rcv+0x760/0x940                                    
[D8961D40] [C02B805C] ip_local_deliver+0xe4/0x1a4                               
[D8961D60] [C02B8518] ip_rcv+0x288/0x46c                                        
[D8961D90] [C029EE4C] netif_receive_skb+0x214/0x304                             
[D8961DC0] [C0213744] mv643xx_poll+0x41c/0x48c                                  
[D8961E10] [C02A1064] net_rx_action+0x98/0x200                                  
[D8961E40] [C0026D48] __do_softirq+0x80/0xf4                                    
[D8961E70] [C00068F4] do_softirq+0x58/0x5c                                      
[D8961E80] [C00267FC] irq_exit+0x60/0x80                                        
[D8961E90] [C00069A0] do_IRQ+0xa8/0xc8                                          
[D8961EA0] [C0012994] ret_from_except+0x0/0x14                                  
--- Exception: 501 at add_wait_queue+0x50/0x84                                  
    LR = worker_thread+0x100/0x154                                              
[D8961F60] [D988CE28] 0xd988ce28 (unreliable)                                   
[D8961F70] [C0035AB4] worker_thread+0x100/0x154                                 
[D8961FC0] [C0039B4C] kthread+0xc0/0xfc                                         
[D8961FF0] [C00131C4] kernel_thread+0x44/0x60                                   
Instruction dump:                                                               
5400fffe 0f000000 81030020 81230024 39680001 7c0b53d6 7c0051d6 7d605850         
7d694a78 91630020 7d290034 5529d97e <0f090000> 7d034378 4e800020 2f840001       
 <0>Kernel panic - not syncing: Fatal exception in interrupt                    
 <0>Rebooting in 180 seconds..<4>atkbd.c: Spurious ACK on isa0060/serio0. Some .
atkbd.c: Spurious ACK on isa0060/serio0. Some program might be trying access ha.
atkbd.c: Spurious ACK on isa0060/serio0. Some program might be trying access ha.
atkbd.c: Spurious ACK on isa0060/serio0. Some program might be trying access ha.
