Hi there,

We have a moderately loaded NFS server running Debian Sarge with the stock kernel 2.6.8-10-amd64-k8 (the current Debian binary package). It serves around 10 NFS client connections from a Solaris 8 box. For what it's worth, the exported filesystems are also shared via Samba to Windows clients at the same time (although I doubt this is the problem).
We're getting a kernel general protection fault (GPF) about once every 3 weeks, and it seems to happen in the nfsd process. At least that was the case on the two occasions where a trace was left in the logs (see below). I checked the RAM with memtest86+, and I even replaced the Ethernet card with a more stable e1000, but the problem must be somewhere else...

Any ideas?

===============
Apr 8 12:09:38 anakin kernel: general protection fault: 0000 [1]
Apr 8 12:09:38 anakin kernel: CPU 0
Apr 8 12:09:38 anakin kernel: Modules linked in: ipv6 nfsd exportfs lockd sunrpc evdev ehci_hcd ohci_hcd ide_cd cdrom forcedeth rtc raid1 md ext2 ext3 jbd mbcache ide_generic ide_disk amd74xx ide_core unix font vesafb cfbcopyarea cfbimgblt cfbfillrect
Apr 8 12:09:38 anakin kernel: Pid: 7980, comm: nfsd Not tainted 2.6.8-10-amd64-k8
Apr 8 12:09:38 anakin kernel: RIP: 0010:[<ffffffff80152aeb>] <ffffffff80152aeb>{cache_alloc_refill+283}
Apr 8 12:09:38 anakin kernel: RSP: 0000:0000010037a55788 EFLAGS: 00010086
Apr 8 12:09:38 anakin kernel: RAX: 8e00000018b8dd89 RBX: 000001003ec92000 RCX: 0000010000100000
Apr 8 12:09:38 anakin kernel: RDX: 000001003f552210 RSI: 000000000000000e RDI: 0000010016f1d028
Apr 8 12:09:38 anakin kernel: RBP: 000001003f552200 R08: 000001003ec92010 R09: 000001003f552220
Apr 8 12:09:38 anakin kernel: R10: 000001003f552230 R11: 0000010007640018 R12: 000001003f552210
Apr 8 12:09:38 anakin kernel: R13: 000001003f92a220 R14: 0000000000000050 R15: 000001003d48ad80
Apr 8 12:09:38 anakin kernel: FS: 0000000000000000(0000) GS:ffffffff803b1180(0000) knlGS:00000000557cb080
Apr 8 12:09:38 anakin kernel: CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
Apr 8 12:09:38 anakin kernel: CR2: 0000000055be4000 CR3: 0000000000101000 CR4: 00000000000006e0
Apr 8 12:09:38 anakin kernel: Process nfsd (pid: 7980, threadinfo 0000010037a54000, task 000001003e9dea40)
Apr 8 12:09:38 anakin kernel: Stack: 000001001b7a3360 0000000000000000 000001003f4c0800 00000000001a8831
Apr 8 12:09:38 anakin kernel: 000001003f4c0800 0000010002051f40 000001003d48ad80 ffffffff8015294b
Apr 8 12:09:38 anakin kernel: 0000000000000212 ffffffffa006ddd5
Apr 8 12:09:38 anakin kernel: Call Trace:<ffffffff8015294b>{kmem_cache_alloc+43} <ffffffffa006ddd5>{:ext3:ext3_alloc_inode+21}
Apr 8 12:09:38 anakin kernel: <ffffffff8017e105>{alloc_inode+21} <ffffffff8017f378>{iget_locked+168}
Apr 8 12:09:38 anakin kernel: <ffffffffa006ae74>{:ext3:ext3_lookup+100} <ffffffff80174a02>{__lookup_hash+258}
Apr 8 12:09:38 anakin kernel: <ffffffff80174abc>{lookup_one_len+108} <ffffffffa01298fd>{:nfsd:compose_entry_fh+205}
Apr 8 12:09:38 anakin kernel: <ffffffffa0129b25>{:nfsd:encode_entry+437} <ffffffff8011d942>{pci_map_sg+642}
Apr 8 12:09:38 anakin kernel: <ffffffffa0067da4>{:ext3:ext3_get_block_handle+228}
Apr 8 12:09:38 anakin kernel: <ffffffffa0023a28>{:ide_core:__ide_dma_begin+40} <ffffffffa008a849>{:ide_disk:__ide_do_rw_disk+809}
Apr 8 12:09:38 anakin kernel: <ffffffffa0129e60>{:nfsd:nfs3svc_encode_entry_plus+16}
Apr 8 12:09:38 anakin kernel: <ffffffffa0064995>{:ext3:ext3_readdir+1157} <ffffffffa0129e50>{:nfsd:nfs3svc_encode_entry_plus+0}
Apr 8 12:09:38 anakin kernel: <ffffffffa011deb5>{:nfsd:fh_verify+1333} <ffffffffa00e6e11>{:sunrpc:svc_sock_enqueue+561}
Apr 8 12:09:38 anakin kernel: <ffffffffa0129e50>{:nfsd:nfs3svc_encode_entry_plus+0}
Apr 8 12:09:38 anakin kernel: <ffffffff80178d5d>{vfs_readdir+157} <ffffffffa0129e50>{:nfsd:nfs3svc_encode_entry_plus+0}
Apr 8 12:09:38 anakin kernel: <ffffffffa0120205>{:nfsd:nfsd_readdir+149} <ffffffffa0126bd1>{:nfsd:nfsd3_proc_readdirplus+241}
Apr 8 12:09:38 anakin kernel: <ffffffffa011b5f0>{:nfsd:nfsd_dispatch+240} <ffffffffa00e6922>{:sunrpc:svc_process+914}
Apr 8 12:09:38 anakin kernel: <ffffffffa011b1e0>{:nfsd:nfsd+0} <ffffffffa011b39a>{:nfsd:nfsd+442}
Apr 8 12:09:38 anakin kernel: <ffffffff8012edae>{schedule_tail+14} <ffffffff80110c67>{child_rip+8}
Apr 8 12:09:38 anakin kernel: <ffffffffa011b1e0>{:nfsd:nfsd+0} <ffffffffa011b1e0>{:nfsd:nfsd+0}
Apr 8 12:09:38 anakin kernel: <ffffffff80110c5f>{child_rip+0}
Apr 8 12:09:38 anakin kernel:
Apr 8 12:09:38 anakin kernel: Code: 48 89 50 08 48 89 02 66 83 79 24 ff 48 c7 01 00 01 10 00 48
Apr 8 12:09:38 anakin kernel: RIP <ffffffff80152aeb>{cache_alloc_refill+283} RSP <0000010037a55788>
===============

===============
May 5 15:51:15 anakin kernel: general protection fault: 0000 [1]
May 5 15:51:15 anakin kernel: CPU 0
May 5 15:51:15 anakin kernel: Modules linked in: nfsd exportfs lockd sunrpc ipv6 evdev forcedeth ehci_hcd ohci_hcd ide_cd cdrom e1000 rtc raid1 md ext2 ext3 jbd mbcache ide_generic ide_disk amd74xx ide_core unix font vesafb cfbcopyarea cfbimgblt cfbfillrect
May 5 15:51:15 anakin kernel: Pid: 1609, comm: nfsd Not tainted 2.6.8-10-amd64-k8
May 5 15:51:15 anakin kernel: RIP: 0010:[<ffffffff80152aeb>] <ffffffff80152aeb>{cache_alloc_refill+283}
May 5 15:51:15 anakin kernel: RSP: 0000:000001003de09788 EFLAGS: 00010086
May 5 15:51:15 anakin kernel: RAX: 8e00000018b8dd89 RBX: 000001003ecb4000 RCX: 0000010000100000
May 5 15:51:15 anakin kernel: RDX: 000001003f556210 RSI: 0000000000000009 RDI: 000001000eb6f028
May 5 15:51:15 anakin kernel: RBP: 000001003f556200 R08: 000001003ecb4010 R09: 000001003f556220
May 5 15:51:15 anakin kernel: R10: 000001003f556230 R11: 000001002582f0c0 R12: 000001003f556210
May 5 15:51:15 anakin kernel: R13: 000001003f92e220 R14: 0000000000000050 R15: 000001003f2d3380
May 5 15:51:15 anakin kernel: FS: 0000000000000000(0000) GS:ffffffff803b1180(0000) knlGS:00000000557cb080
May 5 15:51:15 anakin kernel: CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
May 5 15:51:15 anakin kernel: CR2: 000000000816fd78 CR3: 0000000000101000 CR4: 00000000000006e0
May 5 15:51:15 anakin kernel: Process nfsd (pid: 1609, threadinfo 000001003de08000, task 000001003e524430)
May 5 15:51:15 anakin kernel: Stack: 000001003de097f8 0000000000000000 000001003f4c2800 00000000004451fc
May 5 15:51:15 anakin kernel: 000001003f4c2800 00000100020de130 000001003f2d3380 ffffffff8015294b
May 5 15:51:15 anakin kernel: 0000000000000212 ffffffffa006ddd5
May 5 15:51:15 anakin kernel: Call Trace:<ffffffff8015294b>{kmem_cache_alloc+43} <ffffffffa006ddd5>{:ext3:ext3_alloc_inode+21}
May 5 15:51:15 anakin kernel: <ffffffff8017e105>{alloc_inode+21} <ffffffff8017f378>{iget_locked+168}
May 5 15:51:15 anakin kernel: <ffffffffa006ae74>{:ext3:ext3_lookup+100} <ffffffff80174a02>{__lookup_hash+258}
May 5 15:51:15 anakin kernel: <ffffffff80174abc>{lookup_one_len+108} <ffffffffa01818fd>{:nfsd:compose_entry_fh+205}
May 5 15:51:15 anakin kernel: <ffffffffa0181b25>{:nfsd:encode_entry+437} <ffffffff8011d942>{pci_map_sg+642}
May 5 15:51:15 anakin kernel: <ffffffffa0067da4>{:ext3:ext3_get_block_handle+228}
May 5 15:51:15 anakin kernel: <ffffffffa0023a28>{:ide_core:__ide_dma_begin+40} <ffffffffa008a849>{:ide_disk:__ide_do_rw_disk+809}
May 5 15:51:15 anakin kernel: <ffffffffa0181e60>{:nfsd:nfs3svc_encode_entry_plus+16}
May 5 15:51:15 anakin kernel: <ffffffffa0064995>{:ext3:ext3_readdir+1157} <ffffffffa0181e50>{:nfsd:nfs3svc_encode_entry_plus+0}
May 5 15:51:15 anakin kernel: <ffffffffa0175eb5>{:nfsd:fh_verify+1333} <ffffffffa013ee11>{:sunrpc:svc_sock_enqueue+561}
May 5 15:51:15 anakin kernel: <ffffffffa0181e50>{:nfsd:nfs3svc_encode_entry_plus+0}
May 5 15:51:15 anakin kernel: <ffffffff80178d5d>{vfs_readdir+157} <ffffffffa0181e50>{:nfsd:nfs3svc_encode_entry_plus+0}
May 5 15:51:15 anakin kernel: <ffffffffa0178205>{:nfsd:nfsd_readdir+149} <ffffffffa017ebd1>{:nfsd:nfsd3_proc_readdirplus+241}
May 5 15:51:15 anakin kernel: <ffffffffa01735f0>{:nfsd:nfsd_dispatch+240} <ffffffffa013e922>{:sunrpc:svc_process+914}
May 5 15:51:15 anakin kernel: <ffffffffa01731e0>{:nfsd:nfsd+0} <ffffffffa017339a>{:nfsd:nfsd+442}
May 5 15:51:15 anakin kernel: <ffffffff8012edae>{schedule_tail+14} <ffffffff80110c67>{child_rip+8}
May 5 15:51:15 anakin kernel: <ffffffffa01731e0>{:nfsd:nfsd+0} <ffffffffa01731e0>{:nfsd:nfsd+0}
May 5 15:51:15 anakin kernel: <ffffffff80110c5f>{child_rip+0}
May 5 15:51:15 anakin kernel:
May 5 15:51:15 anakin kernel: Code: 48 89 50 08 48 89 02 66 83 79 24 ff 48 c7 01 00 01 10 00 48
May 5 15:51:15 anakin kernel: RIP <ffffffff80152aeb>{cache_alloc_refill+283} RSP <000001003de09788>
===============

--
JL
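
P.S. In case it helps anyone compare fault reports like the two in this message: here is a minimal Python sketch that tallies which process and which kernel symbol each report points at. The names `summarize_faults` and `EVENT_RE` are my own and hypothetical, not part of any kernel tooling, and the patterns assume the 2.6.8 x86_64 report layout seen here ("comm: <name>" near the top, a final "RIP <addr>{symbol+offset}" summary line). Other kernel versions print this differently.

```python
import re
from collections import Counter

# Hypothetical helper pattern: matches either the process name on the
# "Pid: ..., comm: ..." line, or the final "RIP <addr>{symbol+offset}"
# summary line. The earlier "RIP: 0010:[<...>]" line is intentionally
# not matched, because it has a colon after "RIP".
EVENT_RE = re.compile(
    r"comm: (?P<comm>\S+)"
    r"|RIP <[0-9a-f]+>\{(?P<sym>[^}+]+)\+(?P<off>\d+)\}"
)


def summarize_faults(log_text):
    """Count fault reports by (process name, faulting symbol, offset).

    Scans the whole text rather than line by line, so it also works on
    logs whose lines were re-wrapped by a mail client.
    """
    counts = Counter()
    comm = None
    for match in EVENT_RE.finditer(log_text):
        if match.group("comm"):
            comm = match.group("comm")
        elif comm is not None:
            counts[(comm, match.group("sym"), int(match.group("off")))] += 1
            comm = None  # wait for the next report's "comm:" line
    return counts


# A few lines copied from the two reports in this message.
sample = """\
Apr 8 12:09:38 anakin kernel: Pid: 7980, comm: nfsd Not tainted 2.6.8-10-amd64-k8
Apr 8 12:09:38 anakin kernel: RIP: 0010:[<ffffffff80152aeb>] <ffffffff80152aeb>{cache_alloc_refill+283}
Apr 8 12:09:38 anakin kernel: RIP <ffffffff80152aeb>{cache_alloc_refill+283} RSP <0000010037a55788>
May 5 15:51:15 anakin kernel: Pid: 1609, comm: nfsd Not tainted 2.6.8-10-amd64-k8
May 5 15:51:15 anakin kernel: RIP <ffffffff80152aeb>{cache_alloc_refill+283} RSP <000001003de09788>
"""

for (comm, sym, off), n in summarize_faults(sample).items():
    print(f"{n} x {comm} in {sym}+{off}")
```

Run against the full log, this only confirms what can be read off by eye: both reports are in nfsd and both stop at the same offset in cache_alloc_refill. It says nothing about the cause.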