Stephen Hemminger wrote:
Looks like memory overcommit on small machines?
Begin forwarded message:
Date: Fri, 19 Oct 2007 01:35:33 -0700 (PDT)
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bug 9189] New: Oops in kernel 2.6.21-rc4 through 2.6.23, page
allocation failure
[snip]
Problem Description: After a recent upgrade to kernel 2.6.23 (from 2.6.20) I
have started seeing kernel oopses in the networking code. The problem is 100%
reproducible in my environment. I've seen two slightly different backtraces, but
both seem to be caused by the same commit.
I ran git bisect and tracked the problem down to this commit:
53cdcc04c1e85d4e423b2822b66149b6f2e52c2c [TCP]: Fix tcp_mem[] initialization
Once I revert this commit in 2.6.23 the problem goes away (this is also true
for the kernel version that git bisect landed on, 2.6.21-rc4).
Backtrace #1:
page allocation failure. order:1, mode:0x20
[<c0131581>] __alloc_pages+0x2e1/0x300
[<c0144bee>] cache_alloc_refill+0x29e/0x4b0
[<c0144e6e>] __kmalloc+0x6e/0x80
[<c0227103>] __alloc_skb+0x53/0x110
[<c024de5c>] tcp_collapse+0x1ac/0x370
[<c024e11d>] tcp_prune_queue+0xfd/0x2c0
[<c024eaad>] tcp_data_queue+0x7cd/0xbb0
[<c0225c2d>] skb_checksum+0x4d/0x2a0
[<c02504ee>] tcp_rcv_established+0x36e/0x6a0
[<c02561e4>] tcp_v4_do_rcv+0xb4/0x2a0
[<c0131379>] __alloc_pages+0xd9/0x300
[<c0258269>] tcp_v4_rcv+0x6a9/0x6c0
[<c023ddb1>] ip_local_deliver+0x91/0x110
[<c023e130>] ip_rcv+0x230/0x3c0
[<c0227103>] __alloc_skb+0x53/0x110
[<c022b742>] netif_receive_skb+0x152/0x1e0
[<c022ce6f>] process_backlog+0x6f/0xe0
[<c022cf3c>] net_rx_action+0x5c/0xf0
[<c0115af2>] __do_softirq+0x42/0x90
[<c0115b67>] do_softirq+0x27/0x30
[<c01044fd>] do_IRQ+0x3d/0x70
[<c0115818>] sys_gettimeofday+0x28/0x80
[<c0102967>] common_interrupt+0x23/0x28
=======================
I'm not surprised that this commit would make a difference in this
situation, since it does change the fraction of memory TCP is allowed to
use. (If it really is too much in this situation, we should tweak the
function.) However, I don't think this is the root cause. Why does it
oops here when the allocation fails?
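
To put a number on that fraction on an affected machine, something like
the following could be used. This is only a sketch: the helper names are
made up for illustration, and it assumes the usual
/proc/sys/net/ipv4/tcp_mem format (three page counts: low, pressure,
high) and the MemTotal line in /proc/meminfo (reported in kB).

```python
import os


def parse_tcp_mem(text):
    """Parse the three page counts (low, pressure, high) from tcp_mem."""
    low, pressure, high = (int(field) for field in text.split())
    return low, pressure, high


def parse_mem_total_kb(meminfo_text):
    """Return the MemTotal value, in kB, from /proc/meminfo contents."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def tcp_mem_fractions(tcp_mem_text, meminfo_text, page_size=4096):
    """Express each tcp_mem threshold as a fraction of total memory."""
    total_pages = parse_mem_total_kb(meminfo_text) * 1024 // page_size
    return tuple(pages / total_pages for pages in parse_tcp_mem(tcp_mem_text))


def read_live_fractions():
    """Hypothetical helper: compute the fractions for the running system."""
    with open("/proc/sys/net/ipv4/tcp_mem") as f:
        tcp_mem_text = f.read()
    with open("/proc/meminfo") as f:
        meminfo_text = f.read()
    return tcp_mem_fractions(
        tcp_mem_text, meminfo_text, page_size=os.sysconf("SC_PAGE_SIZE")
    )
```

Calling read_live_fractions() before and after reverting the commit would
show how much the thresholds moved relative to the memory in the box.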
-John