On Sun, 2013-03-17 at 09:52 -0700, Eric Dumazet wrote:
>
> We perform a sock_hold() somewhere while the socket is already dead.
Oh well, that's not exactly it.
When processing ICMP messages we perform a full inet[6] lookup and can
find a LISTEN socket.
Please try the following fix:
diff --g
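
The failure mode described in the message above (taking a reference on a
socket whose count has already reached zero) can be illustrated with a small
userspace sketch. This is not kernel code and not the fix from this thread,
whose diff is truncated here. The names toy_sock, toy_hold,
toy_hold_unless_dead and toy_put are hypothetical stand-ins, and the
conditional hold only mirrors the general idea behind the kernel's
atomic_inc_not_zero(), assuming C11 atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object; NOT the kernel's struct sock. */
struct toy_sock {
    atomic_int refcnt;
};

/* Unconditional hold: bumps the count even if it has already dropped to
 * zero, which "resurrects" an object that may be in the middle of teardown. */
static void toy_hold(struct toy_sock *sk)
{
    atomic_fetch_add(&sk->refcnt, 1);
}

/* Conditional hold: take a reference only while the count is nonzero.
 * Returns false if the object is already dead. */
static bool toy_hold_unless_dead(struct toy_sock *sk)
{
    int old = atomic_load(&sk->refcnt);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&sk->refcnt, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true when the caller released the last one. */
static bool toy_put(struct toy_sock *sk)
{
    int old = atomic_fetch_sub(&sk->refcnt, 1);

    assert(old > 0); /* dropping below zero would be a refcount bug */
    return old == 1;
}
```

With a count that has already fallen to zero, toy_hold() silently raises it
back to one, while toy_hold_unless_dead() refuses and lets the caller treat
the lookup as a miss.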
On Sun, 2013-03-17 at 09:33 -0700, Eric Dumazet wrote:
> On Sun, 2013-03-17 at 02:21 -0700, dormando wrote:
>
> > Hope you don't mind a screenshot:
> > http://www.dormando.me/p/3.8.2-trace-crash.jpg
> >
> > (I put the patches on 3.8.2). box is on another continent so screenshot
> > via IPMI is wh
On Sun, 2013-03-17 at 02:21 -0700, dormando wrote:
> Hope you don't mind a screenshot:
> http://www.dormando.me/p/3.8.2-trace-crash.jpg
>
> (I put the patches on 3.8.2). box is on another continent so screenshot
> via IPMI is what I get. If this isn't enough or isn't right I'll try
> harder to ge
> On Sat, 2013-03-16 at 10:36 -0700, Eric Dumazet wrote:
> > On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote:
> >
> > > Thanks, that's really useful; we might be failing to increment the
> > > socket refcount in a timer setup.
> > >
> >
> > Hmm, please add following debugging patch as well
> >
> > diff -
On Sun, Mar 17, 2013 at 07:39:48AM +0100, Hannes Frederic Sowa wrote:
> On Sat, Mar 16, 2013 at 10:36:06AM -0700, Eric Dumazet wrote:
> > On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote:
> >
> > > Thanks, that's really useful; we might be failing to increment the
> > > socket refcount in a timer setup.
On Sat, Mar 16, 2013 at 10:36:06AM -0700, Eric Dumazet wrote:
> On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote:
>
> > Thanks, that's really useful; we might be failing to increment the
> > socket refcount in a timer setup.
> >
>
> Hmm, please add following debugging patch as well
>
> diff --git a/in
> On Sat, 2013-03-16 at 10:36 -0700, Eric Dumazet wrote:
> > On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote:
> >
> > > Thanks, that's really useful; we might be failing to increment the
> > > socket refcount in a timer setup.
> > >
> >
> > Hmm, please add following debugging patch as well
> >
> > diff -
On Sat, 2013-03-16 at 10:36 -0700, Eric Dumazet wrote:
> On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote:
>
> > Thanks, that's really useful; we might be failing to increment the
> > socket refcount in a timer setup.
> >
>
> Hmm, please add following debugging patch as well
>
> diff --git a/include/n
On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote:
> Thanks, that's really useful; we might be failing to increment the
> socket refcount in a timer setup.
>
Hmm, please add following debugging patch as well
diff --git a/include/net/sock.h b/include/net/sock.h
index 14f6e9d..fe7c8a6 100644
--- a/includ
On Thu, 2013-03-14 at 16:15 -0700, dormando wrote:
> *sigh*. it's been a long month, sorry:
>
> [58377.436522] IPv4: Attempt to release TCP socket family 2 in state 1
> 8813fbad9500
> [58377.436539] ------------[ cut here ]------------
> [58377.436545] WARNING: at net/ipv4/af_inet.c:146
> ine
> On Thu, 2013-03-14 at 14:21 -0700, dormando wrote:
> > >
> > > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> > > index 68f6a94..1d4d97e 100644
> > > --- a/net/ipv4/af_inet.c
> > > +++ b/net/ipv4/af_inet.c
> > > @@ -141,8 +141,9 @@ void inet_sock_destruct(struct sock *sk)
> > > sk_mem_r
On Thu, 2013-03-14 at 14:21 -0700, dormando wrote:
> >
> > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> > index 68f6a94..1d4d97e 100644
> > --- a/net/ipv4/af_inet.c
> > +++ b/net/ipv4/af_inet.c
> > @@ -141,8 +141,9 @@ void inet_sock_destruct(struct sock *sk)
> > sk_mem_reclaim(sk);
>
> On Wed, 2013-03-06 at 16:41 -0800, dormando wrote:
>
> > Ok... bridge module is loaded but nothing seems to be using it. No
> > bond/tunnels/anything enabled. I couldn't quickly figure out what was
> > causing it to load.
> >
> > We removed the need for macvlan, started machines with a fresh boot
On Wed, 2013-03-06 at 16:41 -0800, dormando wrote:
> Ok... bridge module is loaded but nothing seems to be using it. No
> bond/tunnels/anything enabled. I couldn't quickly figure out what was
> causing it to load.
>
> We removed the need for macvlan, started machines with a fresh boot, and
> they
> On Mon, 2013-03-04 at 21:44 -0800, dormando wrote:
>
> > No 3rd party modules. There's a tiny patch for controlling initcwnd from
> > userspace and another one for the extra_free_kbytes tunable that I brought
> > up in another thread. We've had the initcwnd patch in for a long time
> > without tr
On Mon, 2013-03-04 at 21:44 -0800, dormando wrote:
> No 3rd party modules. There's a tiny patch for controlling initcwnd from
> userspace and another one for the extra_free_kbytes tunable that I brought
> up in another thread. We've had the initcwnd patch in for a long time
> without trouble. The
On Mon, 4 Mar 2013, Eric Dumazet wrote:
> On Tue, 2013-03-05 at 11:47 +0800, Cong Wang wrote:
> > (Cc'ing the right netdev mailing list...)
> >
> > On 03/05/2013 08:01 AM, dormando wrote:
> > > Hi!
> > >
> > > I have a (core lockup?) with 3.7.6+ and 3.8.2 which appears to be under
> > > ixgbe. T
On Tue, 2013-03-05 at 11:47 +0800, Cong Wang wrote:
> (Cc'ing the right netdev mailing list...)
>
> On 03/05/2013 08:01 AM, dormando wrote:
> > Hi!
> >
> > I have a (core lockup?) with 3.7.6+ and 3.8.2 which appears to be under
> > ixgbe. The machine appears to still be up but network stays in a s
(Cc'ing the right netdev mailing list...)
On 03/05/2013 08:01 AM, dormando wrote:
Hi!
I have a (core lockup?) with 3.7.6+ and 3.8.2 which appears to be under
ixgbe. The machine appears to still be up but network stays in a severely
hobbled state. Either lagging or not responding to the network
Hi!
I have a (core lockup?) with 3.7.6+ and 3.8.2 which appears to be under
ixgbe. The machine appears to still be up but network stays in a severely
hobbled state. Either lagging or not responding to the network at all.
On a new box the hang happens within 8-24 hours of giving it production
netw