Re: d_lookup: Unable to handle kernel paging request

2019-06-29 Thread Vicente Bergas
On Tuesday, June 25, 2019 12:48:17 PM CEST, Vicente Bergas wrote: On Tuesday, June 25, 2019 11:46:02 AM CEST, Will Deacon wrote: [+Marc] Hi again, Vicente, On Mon, Jun 24, 2019 at 12:47:41PM +0100, Will Deacon wrote: ... Hi Will, the memtest test is still pending... Hi Will, i've just ran

Re: d_lookup: Unable to handle kernel paging request

2019-06-25 Thread Vicente Bergas
On Tuesday, June 25, 2019 11:46:02 AM CEST, Will Deacon wrote: [+Marc] Hi again, Vicente, On Mon, Jun 24, 2019 at 12:47:41PM +0100, Will Deacon wrote: On Sat, Jun 22, 2019 at 08:02:19PM +0200, Vicente Bergas wrote: ... Before you rush over to LAKML, please could you provide your full dmesg o

Re: d_lookup: Unable to handle kernel paging request

2019-06-25 Thread Will Deacon
[+Marc] Hi again, Vicente, On Mon, Jun 24, 2019 at 12:47:41PM +0100, Will Deacon wrote: > On Sat, Jun 22, 2019 at 08:02:19PM +0200, Vicente Bergas wrote: > > Hi Al, > > i think have a hint of what is going on. > > With the last kernel built with your sentinels at hlist_bl_*lock > > it is very eas

Re: d_lookup: Unable to handle kernel paging request

2019-06-24 Thread Will Deacon
On Sat, Jun 22, 2019 at 08:02:19PM +0200, Vicente Bergas wrote: > Hi Al, > i think have a hint of what is going on. > With the last kernel built with your sentinels at hlist_bl_*lock > it is very easy to reproduce the issue. > In fact it is so unstable that i had to connect a serial port > in order

Re: d_lookup: Unable to handle kernel paging request

2019-06-22 Thread Vicente Bergas
Hi Al, i think have a hint of what is going on. With the last kernel built with your sentinels at hlist_bl_*lock it is very easy to reproduce the issue. In fact it is so unstable that i had to connect a serial port in order to save the kernel trace. Unfortunately all the traces are at different ad

Re: d_lookup: Unable to handle kernel paging request

2019-06-19 Thread Al Viro
On Wed, Jun 19, 2019 at 06:51:51PM +0200, Vicente Bergas wrote: > > What's your config, BTW? SMP and DEBUG_SPINLOCK, specifically... > > Hi Al, > here it is: > https://paste.debian.net/1088517 Aha... So LIST_BL_LOCKMASK is 1 there (same as on distro builds)... Hell knows - how about static in

Re: d_lookup: Unable to handle kernel paging request

2019-06-19 Thread Will Deacon
On Wed, Jun 19, 2019 at 06:51:51PM +0200, Vicente Bergas wrote: > here it is: > https://paste.debian.net/1088517 No modules and OPTIMIZE_INLINING=n, so this isn't either of my first thoughts. Hmm. I guess I should try to reproduce the issue locally. Will

Re: d_lookup: Unable to handle kernel paging request

2019-06-19 Thread Will Deacon
Hi all, On Wed, Jun 19, 2019 at 05:28:02PM +0100, Al Viro wrote: > [arm64 maintainers Cc'd; I'm not adding a Cc to moderated list, > sorry] Thanks for adding us. > On Wed, Jun 19, 2019 at 02:42:16PM +0200, Vicente Bergas wrote: > > > Hi Al, > > i have been running the distro-provided kernel the

Re: d_lookup: Unable to handle kernel paging request

2019-06-19 Thread Vicente Bergas
On Wednesday, June 19, 2019 6:28:02 PM CEST, Al Viro wrote: [arm64 maintainers Cc'd; I'm not adding a Cc to moderated list, sorry] On Wed, Jun 19, 2019 at 02:42:16PM +0200, Vicente Bergas wrote: Hi Al, i have been running the distro-provided kernel the last few weeks and had no issues at all.

Re: d_lookup: Unable to handle kernel paging request

2019-06-19 Thread Al Viro
[arm64 maintainers Cc'd; I'm not adding a Cc to moderated list, sorry] On Wed, Jun 19, 2019 at 02:42:16PM +0200, Vicente Bergas wrote: > Hi Al, > i have been running the distro-provided kernel the last few weeks > and had no issues at all. > https://archlinuxarm.org/packages/aarch64/linux-aarch64

Re: d_lookup: Unable to handle kernel paging request

2019-06-19 Thread Vicente Bergas
On Tuesday, June 18, 2019 8:35:48 PM CEST, Al Viro wrote: On Tue, May 28, 2019 at 11:38:43AM +0200, Vicente Bergas wrote: On Wednesday, May 22, 2019 6:29:46 PM CEST, Al Viro wrote: ... __d_lookup() running into &dentry->d_hash == 0x0100 at some point in hash chain and trying to look at -

Re: d_lookup: Unable to handle kernel paging request

2019-06-18 Thread Al Viro
On Tue, Jun 18, 2019 at 07:35:48PM +0100, Al Viro wrote: > So far it looks like something is buggering a forward reference > in hash chain in a fairly specific way - the values seen had been > 01000 and > 880001000. Does that smell like anything from arm64-specific > data stru

Re: d_lookup: Unable to handle kernel paging request

2019-06-18 Thread Al Viro
On Tue, May 28, 2019 at 11:38:43AM +0200, Vicente Bergas wrote: > On Wednesday, May 22, 2019 6:29:46 PM CEST, Al Viro wrote: > > ... > > IOW, here we have also run into bogus hlist forward pointer or head - > > same 0x100 in one case and 0x88000100 in two others. > > > > Have you tried

Re: d_lookup: Unable to handle kernel paging request

2019-05-28 Thread Vicente Bergas
On Wednesday, May 22, 2019 6:29:46 PM CEST, Al Viro wrote: ... IOW, here we have also run into bogus hlist forward pointer or head - same 0x100 in one case and 0x88000100 in two others. Have you tried to see if KASAN catches anything on those loads? Use-after-free, for example... An

Re: d_lookup: Unable to handle kernel paging request

2019-05-24 Thread Vicente Bergas
On Wednesday, May 22, 2019 6:29:46 PM CEST, Al Viro wrote: On Wed, May 22, 2019 at 05:44:30PM +0200, Vicente Bergas wrote: ... IOW, here we have also run into bogus hlist forward pointer or head - same 0x100 in one case and 0x88000100 in two others. Have you tried to see if KASAN cat

Re: d_lookup: Unable to handle kernel paging request

2019-05-22 Thread Al Viro
On Wed, May 22, 2019 at 05:44:30PM +0200, Vicente Bergas wrote: > 2d30: f8617893 ldr x19, [x4, x1, lsl #3] > 2d34: f27ffa73 ands x19, x19, #0xfffe > 2d38: 54000920 b.eq 2e5c <__d_lookup_rcu+0x15c> // b.none > 2d3c: aa0003f

Re: d_lookup: Unable to handle kernel paging request

2019-05-22 Thread Vicente Bergas
Hi Al, On Wednesday, May 22, 2019 3:53:31 PM CEST, Al Viro wrote: On Wed, May 22, 2019 at 12:40:55PM +0200, Vicente Bergas wrote: Hi, since a recent update the kernel is reporting d_lookup errors. They appear randomly and after each error the affected file or directory is no longer accessible.

Re: d_lookup: Unable to handle kernel paging request

2019-05-22 Thread Al Viro
On Wed, May 22, 2019 at 12:40:55PM +0200, Vicente Bergas wrote: > Hi, > since a recent update the kernel is reporting d_lookup errors. > They appear randomly and after each error the affected file or directory > is no longer accessible. > The kernel is built with GCC 9.1.0 on ARM64. > Four traces f

d_lookup: Unable to handle kernel paging request

2019-05-22 Thread Vicente Bergas
Hi, since a recent update the kernel is reporting d_lookup errors. They appear randomly and after each error the affected file or directory is no longer accessible. The kernel is built with GCC 9.1.0 on ARM64. Four traces from different workloads follow. This trace is from v5.1-12511-g72cf0b07418