c21b48 causes nfs client to hang and kernel BUG at ./include/linux/mm.h:432 (bisected)

2017-07-23 Thread Trevor Cordes
Hi! I've bisected a bug I'm seeing to: c21b48cc1bbf2f5af3ef54ada559f7fadf8b508b net: adjust skb->truesize in ___pskb_trim() The bug manifests as my NFS4 (TCP) client mount hanging after 5-10s of heavy read data transfer, which also produces: kernel BUG at ./include/linux/mm.h:462 (see full trac

Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-02-05 Thread Trevor Cordes
On 2017-02-05 Michal Hocko wrote: > On Fri 03-02-17 18:36:54, Trevor Cordes wrote: > > I ran to_test/linus-tree/oom_hickups branch (4.10.0-rc6+) for 50 > > hours and it does NOT have the bug! No problems at all so far. > > OK, that is definitely good to know. My o

Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-02-03 Thread Trevor Cordes
On 2017-02-01 Michal Hocko wrote: > On Wed 01-02-17 03:29:28, Trevor Cordes wrote: > > On 2017-01-30 Michal Hocko wrote: > [...] > > > Testing with vanilla rc6 released just yesterday would be a good > > > fit. There are some more fixes sitting on mmotm on top a

Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-02-01 Thread Trevor Cordes
On 2017-01-30 Michal Hocko wrote: > On Sun 29-01-17 16:50:03, Trevor Cordes wrote: > > On 2017-01-25 Michal Hocko wrote: > > > On Wed 25-01-17 04:02:46, Trevor Cordes wrote: > > > > OK, I patched & compiled mhocko's git tree from the other day > >

Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-01-29 Thread Trevor Cordes
On 2017-01-25 Michal Hocko wrote: > On Wed 25-01-17 04:02:46, Trevor Cordes wrote: > > OK, I patched & compiled mhocko's git tree from the other day > > 4.9.0+. (To confirm, weird, but mhocko's git tree I'm using from a > > couple of weeks

Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-01-26 Thread Trevor Cordes
On 2017-01-24 Michal Hocko wrote: > On Sun 22-01-17 18:45:59, Trevor Cordes wrote: > [...] > > Also, completely separate from your patch I ran mhocko's 4.9 tree > > with mem=2G to see if lower ram amount would help, but it didn't. > > Even with 2G the system oom a

Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-01-25 Thread Trevor Cordes
On 2017-01-23 Mel Gorman wrote: > On Sun, Jan 22, 2017 at 06:45:59PM -0600, Trevor Cordes wrote: > > On 2017-01-20 Mel Gorman wrote: > > > > > > > > Thanks for the OOM report. I was expecting it to be a particular > > > > shape and my exp

Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-01-22 Thread Trevor Cordes
On 2017-01-20 Mel Gorman wrote: > > > > Thanks for the OOM report. I was expecting it to be a particular > > shape and my expectations were not matched so it took time to > > consider it further. Can you try the cumulative patch below? It > > combines three patches that > > > > 1. Allow slab shri

Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-01-19 Thread Trevor Cordes
On 2017-01-19 Michal Hocko wrote: > On Thu 19-01-17 03:48:50, Trevor Cordes wrote: > > On 2017-01-17 Michal Hocko wrote: > > > On Tue 17-01-17 14:21:14, Mel Gorman wrote: > > > > On Tue, Jan 17, 2017 at 02:52:28PM +0100, Michal Hocko > > > > wrote:

Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-01-19 Thread Trevor Cordes
On 2017-01-17 Michal Hocko wrote: > On Tue 17-01-17 14:21:14, Mel Gorman wrote: > > On Tue, Jan 17, 2017 at 02:52:28PM +0100, Michal Hocko wrote: > > > On Mon 16-01-17 11:09:34, Mel Gorman wrote: > > > [...] > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c > > > > index 532a2a750952..46aac487b89a

Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-01-17 Thread Trevor Cordes
On 2017-01-17 Michal Hocko wrote: > On Tue 17-01-17 14:21:14, Mel Gorman wrote: > > On Tue, Jan 17, 2017 at 02:52:28PM +0100, Michal Hocko wrote: > > > On Mon 16-01-17 11:09:34, Mel Gorman wrote: > > > [...] > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c > > > > index 532a2a750952..46aac487b89a

Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-01-17 Thread Trevor Cordes
On 2017-01-16 Mel Gorman wrote: > > > You can easily check whether this is memcg related by trying to > > > run the same workload with cgroup_disable=memory kernel command > > > line parameter. This will put all the memcg specifics out of the > > > way. > > > > I will try booting now into cgroup

Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-01-14 Thread Trevor Cordes
On 2017-01-12 Michal Hocko wrote: > On Wed 11-01-17 16:52:32, Trevor Cordes wrote: > [...] > > I'm not sure how I can tell if my bug is because of memcgs so here > > is a full first oom example (attached). > > 4.7 kernel doesn't contain 71c799f4982d ("mm

Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-01-11 Thread Trevor Cordes
On 2017-01-11 Mel Gorman wrote: > On Wed, Jan 11, 2017 at 12:11:46PM +, Mel Gorman wrote: > > On Wed, Jan 11, 2017 at 04:32:43AM -0600, Trevor Cordes wrote: > > > Hi! I have bisected a nightly oom-killer flood and crash/hang on > > > one of the boxes I admin. It

mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-01-11 Thread Trevor Cordes
Hi! I have bisected a nightly oom-killer flood and crash/hang on one of the boxes I admin. It doesn't crash on Fedora 23/24 4.7.10 kernel but does on any 4.8 Fedora kernel. I did a vanilla bisect and the bug is here: commit b2e18757f2c9d1cdd746a882e9878852fdec9501 Author: Mel Gorman Date:

Re: netfilter regression causes lost pings "operation not permitted"

2016-12-07 Thread Trevor Cordes
On 2016-12-07 Trevor Cordes wrote: > Bisected down to: > 870190a9ec9075205c0fa795a09fa931694a3ff1 > 7c9664351980aaa6a4b8837a314360b3a4ad382a Oh! I forgot to mention the most important point: iptable_nat module MUST be loaded for the bug to show up! modprobe iptable_nat If you rmmod it

netfilter regression causes lost pings "operation not permitted"

2016-12-07 Thread Trevor Cordes
Bisected down to: 870190a9ec9075205c0fa795a09fa931694a3ff1 7c9664351980aaa6a4b8837a314360b3a4ad382a Hi! 4.8.x caused a script of mine that pings all IPs on my LAN /24 subnet in about 0.5s, and nmap doing the same, to error on the send() call with "operation not permitted". This happens after a

Re: [PATCH v3] ktime: Fix ktime_divns to do signed division

2015-05-11 Thread Trevor Cordes
nd it seems to work fine / stable, and fixes my bug. I think it's a done deal! Thanks once again! > Cc: Nicolas Pitre > Cc: Thomas Gleixner > Cc: Ingo Molnar > Cc: Josh Boyer > Cc: One Thousand Gnomes > Cc: Trevor Cordes > Cc: # 3.17+ for regression > Test

Re: [PATCH] ktime: Fix ktime_divns to do signed division

2015-05-02 Thread Trevor Cordes
: https://bugzilla.kernel.org/show_bug.cgi?id=95431 > Cc: Nicolas Pitre > Cc: Thomas Gleixner > Cc: Josh Boyer > Cc: One Thousand Gnomes > Reported-by: Trevor Cordes > Signed-off-by: John Stultz Tested-by: Trevor Cordes (runtime test on i686) > --- > include/linux/k

Re: regression in ktime.h circa 3.16.0-rc5+ breaks lirc irsend, bad commit 166afb64511

2015-05-01 Thread Trevor Cordes
On 2015-04-30 John Stultz wrote: > From your description it does seem like some sort of edge case > problem w/ the 32bit ktime_divns(), but I don't see it right off, and I agree > with Alan to do both calculations and print out warn when that > happens. > > There's also not a ton of users of t

Re: regression in ktime.h circa 3.16.0-rc5+ breaks lirc irsend, bad commit 166afb64511

2015-04-29 Thread Trevor Cordes
igned below and it's probably wrong, so ignore my prognosticating.) Please see my rhbz link near the bottom for the full details. Thanks everyone! On 2015-03-23 Trevor Cordes wrote: > Hello everyone, this is my first attempt at bisecting a kernel to > solve a bug. Please bear wi

regression in ktime.h circa 3.16.0-rc5+ breaks lirc irsend, bad commit 166afb64511

2015-03-23 Thread Trevor Cordes
Hello everyone, this is my first attempt at bisecting a kernel to solve a bug. Please bear with me. I have successfully bisected and located a commit that is causing my problem. Look at commit 166afb64511. ktime_to_us returns s64, but the commit changes it so ktime_to_us just returns what kt

memory bug ever since 3.12, oom-killer invoked, computer freezes

2014-07-08 Thread Trevor Cordes
Excuse a novice on his first post to this list. I have tried to obtain help elsewhere with no success. I have been dealing with a bad kernel bug since 3.12 came out. It is present in 3.12, 3.13 and 3.14 up to 3.14.8 (Fedora 19 kernel). What happens is around the same time every day, using the b