On Fri, Jan 11, 2013 at 12:51:05AM +, Eric Wong wrote:
> Mel Gorman wrote:
> > mm: compaction: Partially revert capture of suitable high-order page
>
>
>
> > Reported-by: Eric Wong
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Mel Gorman
>
> Thanks, my original use case and test wor
Mel Gorman wrote:
> mm: compaction: Partially revert capture of suitable high-order page
> Reported-by: Eric Wong
> Cc: sta...@vger.kernel.org
> Signed-off-by: Mel Gorman
Thanks, my original use case and test works great after several hours!
Tested-by: Eric Wong
Unfortunately, I also hi
On Thu, 2013-01-10 at 19:42 +, Mel Gorman wrote:
> Thanks Eric, it's much appreciated. However, I'm still very much in favour
> of a partial revert as in retrospect the implementation of capture took the
> wrong approach. Could you confirm the following patch works for you?
> It should funct
Mel Gorman wrote:
> Thanks Eric, it's much appreciated. However, I'm still very much in favour
> of a partial revert as in retrospect the implementation of capture took the
> wrong approach. Could you confirm the following patch works for you?
> It should functionally have the same effect as the
On Thu, Jan 10, 2013 at 09:25:11AM +, Eric Wong wrote:
> Mel Gorman wrote:
> > page->pfmemalloc can be left set for captured pages so try this but as
> > capture is rarely used I'm strongly favouring a partial revert even if
> > this works for you. I haven't reproduced this using your workload
Mel Gorman wrote:
> page->pfmemalloc can be left set for captured pages so try this but as
> capture is rarely used I'm strongly favouring a partial revert even if
> this works for you. I haven't reproduced this using your workload yet
> but I have found that high-order allocation stress tests for
Mel Gorman wrote:
> When I looked at it for long enough I found a number of problems. Most
> affect timing but two serious issues are in there. One affects how long
> kswapd spends compacting versus reclaiming and the other increases lock
> contention meaning that async compaction can abort early.
On Wed, Jan 09, 2013 at 01:37:46PM +, Mel Gorman wrote:
> On Tue, Jan 08, 2013 at 11:23:25PM +, Eric Wong wrote:
> > Mel Gorman wrote:
> > > Please try the following patch. However, even if it works the benefit of
> > > capture may be so marginal that partially reverting it and simplifying
On Tue, Jan 08, 2013 at 06:32:29PM -0800, Eric Dumazet wrote:
> On Tue, 2013-01-08 at 18:14 -0800, Eric Dumazet wrote:
> > On Tue, 2013-01-08 at 23:23 +, Eric Wong wrote:
> > > Mel Gorman wrote:
> > > > Please try the following patch. However, even if it works the benefit of
> > > > capture ma
On Tue, Jan 08, 2013 at 11:23:25PM +, Eric Wong wrote:
> Mel Gorman wrote:
> > Please try the following patch. However, even if it works the benefit of
> > capture may be so marginal that partially reverting it and simplifying
> > compaction.c is the better decision.
>
> I already got my VM s
Eric Wong wrote:
> Oops, I had to restart my test :x. However, I was able to reproduce the
> issue very quickly again with your patch. I've double-checked I'm
> booting into the correct kernel, but I do have more load on this
> laptop host now, so maybe that made it happen more quickly...
Oops,
Eric Wong wrote:
> Eric Dumazet wrote:
> > On Tue, 2013-01-08 at 18:32 -0800, Eric Dumazet wrote:
> > > Hmm, it seems sk_filter() can return -ENOMEM because skb has the
> > > pfmemalloc() set.
> >
> > >
> > > One TCP socket keeps retransmitting an SKB via loopback, and TCP stack
> > > drops th
Eric Dumazet wrote:
> On Tue, 2013-01-08 at 18:32 -0800, Eric Dumazet wrote:
> > Hmm, it seems sk_filter() can return -ENOMEM because skb has the
> > pfmemalloc() set.
>
> >
> > One TCP socket keeps retransmitting an SKB via loopback, and TCP stack
> > drops the packet again and again.
>
> soc
On Tue, 2013-01-08 at 18:32 -0800, Eric Dumazet wrote:
>
> Hmm, it seems sk_filter() can return -ENOMEM because skb has the
> pfmemalloc() set.
>
> One TCP socket keeps retransmitting an SKB via loopback, and TCP stack
> drops the packet again and again.
sock_init_data() sets sk->sk_allocatio
On Tue, 2013-01-08 at 18:14 -0800, Eric Dumazet wrote:
> On Tue, 2013-01-08 at 23:23 +, Eric Wong wrote:
> > Mel Gorman wrote:
> > > Please try the following patch. However, even if it works the benefit of
> > > capture may be so marginal that partially reverting it and simplifying
> > > compa
On Tue, 2013-01-08 at 23:23 +, Eric Wong wrote:
> Mel Gorman wrote:
> > Please try the following patch. However, even if it works the benefit of
> > capture may be so marginal that partially reverting it and simplifying
> > compaction.c is the better decision.
>
> I already got my VM stuck on
Mel Gorman wrote:
> Please try the following patch. However, even if it works the benefit of
> capture may be so marginal that partially reverting it and simplifying
> compaction.c is the better decision.
I already got my VM stuck on this one. I had two toosleepy instances,
2774 was the one that
On Mon, Jan 07, 2013 at 10:38:50PM +, Eric Wong wrote:
> Mel Gorman wrote:
> > Right now it's difficult to see how the capture could be the source of
> > this bug but I'm not ruling it out either so try the following (untested
> > but should be ok) patch. It's not a proper revert, it just dis
Eric Wong wrote:
> Mel Gorman wrote:
> > Right now it's difficult to see how the capture could be the source of
> > this bug but I'm not ruling it out either so try the following (untested
> > but should be ok) patch. It's not a proper revert, it just disables the
> > capture page logic to see i
Eric Dumazet wrote:
> It would not surprise me if sk_stream_wait_memory() has plain bug(s) or
> race(s).
>
> In 2010, in commit 482964e56e132 Nagendra Tomar fixed a pretty severe
> long standing bug.
>
> This path is not taken very often on most machines.
>
> I would try the following patch :
Mel Gorman wrote:
> Right now it's difficult to see how the capture could be the source of
> this bug but I'm not ruling it out either so try the following (untested
> but should be ok) patch. It's not a proper revert, it just disables the
> capture page logic to see if it's at fault.
Things loo
On Mon, 2013-01-07 at 12:25 +, Mel Gorman wrote:
>
> > ===> 28014[28017]/stack <===
> > [] release_sock+0xe5/0x11b
> > [] sk_stream_wait_memory+0x1f7/0x1fc
> > [] autoremove_wake_function+0x0/0x2a
> > [] tcp_sendmsg+0x710/0x86d
> > [] sock_sendmsg+0x7b/0x93
> > [] sys_sendto+0xee/0x145
> > []
On Sun, Jan 06, 2013 at 12:07:00PM +, Eric Wong wrote:
> Mel Gorman wrote:
> > Using a 3.7.1 or 3.8-rc2 kernel, can you reproduce the problem and then
> > answer the following questions please?
>
> This is on my main machine running 3.8-rc2
>
> > 1. What are the contents of /proc/vmstat at t
Mel Gorman wrote:
> Using a 3.7.1 or 3.8-rc2 kernel, can you reproduce the problem and then
> answer the following questions please?
This is on my main machine running 3.8-rc2
> 1. What are the contents of /proc/vmstat at the time it is stuck?
===> /proc/vmstat <===
nr_free_pages 40305
nr_inact
Mel Gorman wrote:
> On Wed, Jan 02, 2013 at 08:08:48PM +, Eric Wong wrote:
> > Instead, I disabled THP+compaction under v3.7.1 and I've been unable to
> > reproduce the issue without THP+compaction.
> >
>
> Implying that it's stuck in compaction somewhere. It could be the case
> that compact
Mel Gorman wrote:
> On Wed, Jan 02, 2013 at 08:08:48PM +, Eric Wong wrote:
> > Instead, I disabled THP+compaction under v3.7.1 and I've been unable to
> > reproduce the issue without THP+compaction.
> >
>
> Implying that it's stuck in compaction somewhere. It could be the case
> that compact
On Fri, 2013-01-04 at 16:01 +, Mel Gorman wrote:
> Implying that it's stuck in compaction somewhere. It could be the case
> that compaction alters timing enough to trigger another bug. You say it
> tests differently depending on whether TCP or unix sockets are used
> which might indicate multi
On Wed, Jan 02, 2013 at 08:08:48PM +, Eric Wong wrote:
> (changing Cc:)
>
> Eric Wong wrote:
> > I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a
> > local TCP socket. The isolated code below can reproduce the issue
> > after many minutes (<1 hour). It might be easier to
Eric Wong wrote:
> Eric Wong wrote:
> > I think this requires frequent dirtying/cycling of pages to reproduce.
> > (from copying large files around) to interact with compaction.
> > I'll see if I can reproduce the issue with read-only FS activity.
>
> Still successfully running the read-only tes
Eric Wong wrote:
> I think this requires frequent dirtying/cycling of pages to reproduce.
> (from copying large files around) to interact with compaction.
> I'll see if I can reproduce the issue with read-only FS activity.
Still successfully running the read-only test on my main machine, will
pro
Eric Wong wrote:
> Eric Dumazet wrote:
> > With the following patch, I can't reproduce the 'apparent stuck'
>
> Right, the output is just an approximation and the logic there
> was bogus.
>
> Thanks for looking at this.
I'm still able to reproduce the issue under v3.8-rc2 with your patch
for to
Eric Dumazet wrote:
> On Wed, 2013-01-02 at 20:47 +, Eric Wong wrote:
> > Eric Wong wrote:
> > > [1] my full setup is very strange.
> > >
> > > Other than the FUSE component I forgot to mention, little depends on
> > > the kernel. With all this, the standalone toosleepy can get stuc
On Wed, 2013-01-02 at 20:47 +, Eric Wong wrote:
> Eric Wong wrote:
> > [1] my full setup is very strange.
> >
> > Other than the FUSE component I forgot to mention, little depends on
> > the kernel. With all this, the standalone toosleepy can get stuck.
> > I'll try to reproduce
Eric Wong wrote:
> [1] my full setup is very strange.
>
> Other than the FUSE component I forgot to mention, little depends on
> the kernel. With all this, the standalone toosleepy can get stuck.
> I'll try to reproduce it with less...
I just confirmed my toosleepy processes will ge
(changing Cc:)
Eric Wong wrote:
> I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a
> local TCP socket. The isolated code below can reproduce the issue
> after many minutes (<1 hour). It might be easier to reproduce on
> a busy system while disk I/O is happening.
s/might be/
Eric Wong wrote:
> Eric Wong wrote:
> > I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a
> > local TCP socket. The isolated code below can reproduce the issue
> > after many minutes (<1 hour). It might be easier to reproduce on
> > a busy system while disk I/O is happening.
Eric Wong wrote:
> I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a
> local TCP socket. The isolated code below can reproduce the issue
> after many minutes (<1 hour). It might be easier to reproduce on
> a busy system while disk I/O is happening.
Ugh, I can't seem to reprod