Re: d_lookup: Unable to handle kernel paging request

2019-06-29 Thread Vicente Bergas
On Tuesday, June 25, 2019 12:48:17 PM CEST, Vicente Bergas wrote: On Tuesday, June 25, 2019 11:46:02 AM CEST, Will Deacon wrote: [+Marc] Hi again, Vicente, On Mon, Jun 24, 2019 at 12:47:41PM +0100, Will Deacon wrote: ... Hi Will, the memtest test is still pending... Hi Will, i've just run

Re: d_lookup: Unable to handle kernel paging request

2019-06-25 Thread Vicente Bergas
On Tuesday, June 25, 2019 11:46:02 AM CEST, Will Deacon wrote: [+Marc] Hi again, Vicente, On Mon, Jun 24, 2019 at 12:47:41PM +0100, Will Deacon wrote: On Sat, Jun 22, 2019 at 08:02:19PM +0200, Vicente Bergas wrote: ... Before you rush over to LAKML, please could you provide your full dmesg o

Re: d_lookup: Unable to handle kernel paging request

2019-06-25 Thread Will Deacon
[+Marc] Hi again, Vicente, On Mon, Jun 24, 2019 at 12:47:41PM +0100, Will Deacon wrote: > On Sat, Jun 22, 2019 at 08:02:19PM +0200, Vicente Bergas wrote: > > Hi Al, > > i think i have a hint of what is going on. > > With the last kernel built with your sentinels at hlist_bl_*lock > > it is very eas

Re: d_lookup: Unable to handle kernel paging request

2019-06-24 Thread Will Deacon
On Sat, Jun 22, 2019 at 08:02:19PM +0200, Vicente Bergas wrote: > Hi Al, > i think i have a hint of what is going on. > With the last kernel built with your sentinels at hlist_bl_*lock > it is very easy to reproduce the issue. > In fact it is so unstable that i had to connect a serial port > in order

Re: d_lookup: Unable to handle kernel paging request

2019-06-22 Thread Vicente Bergas
Hi Al, i think i have a hint of what is going on. With the last kernel built with your sentinels at hlist_bl_*lock it is very easy to reproduce the issue. In fact it is so unstable that i had to connect a serial port in order to save the kernel trace. Unfortunately all the traces are at different ad

Re: d_lookup: Unable to handle kernel paging request

2019-06-19 Thread Al Viro
On Wed, Jun 19, 2019 at 06:51:51PM +0200, Vicente Bergas wrote: > > What's your config, BTW? SMP and DEBUG_SPINLOCK, specifically... > > Hi Al, > here it is: > https://paste.debian.net/1088517 Aha... So LIST_BL_LOCKMASK is 1 there (same as on distro builds)... Hell knows - how about static in

Re: d_lookup: Unable to handle kernel paging request

2019-06-19 Thread Will Deacon
On Wed, Jun 19, 2019 at 06:51:51PM +0200, Vicente Bergas wrote: > here it is: > https://paste.debian.net/1088517 No modules and OPTIMIZE_INLINING=n, so this isn't either of my first thoughts. Hmm. I guess I should try to reproduce the issue locally. Will

Re: d_lookup: Unable to handle kernel paging request

2019-06-19 Thread Will Deacon
Hi all, On Wed, Jun 19, 2019 at 05:28:02PM +0100, Al Viro wrote: > [arm64 maintainers Cc'd; I'm not adding a Cc to moderated list, > sorry] Thanks for adding us. > On Wed, Jun 19, 2019 at 02:42:16PM +0200, Vicente Bergas wrote: > > > Hi Al, > > i have been running the distro-provided kernel the

Re: d_lookup: Unable to handle kernel paging request

2019-06-19 Thread Vicente Bergas
On Wednesday, June 19, 2019 6:28:02 PM CEST, Al Viro wrote: [arm64 maintainers Cc'd; I'm not adding a Cc to moderated list, sorry] On Wed, Jun 19, 2019 at 02:42:16PM +0200, Vicente Bergas wrote: Hi Al, i have been running the distro-provided kernel the last few weeks and had no issues at all.

Re: d_lookup: Unable to handle kernel paging request

2019-06-19 Thread Al Viro
[arm64 maintainers Cc'd; I'm not adding a Cc to moderated list, sorry] On Wed, Jun 19, 2019 at 02:42:16PM +0200, Vicente Bergas wrote: > Hi Al, > i have been running the distro-provided kernel the last few weeks > and had no issues at all. > https://archlinuxarm.org/packages/aarch64/linux-aarch64

Re: d_lookup: Unable to handle kernel paging request

2019-06-19 Thread Vicente Bergas
On Tuesday, June 18, 2019 8:35:48 PM CEST, Al Viro wrote: On Tue, May 28, 2019 at 11:38:43AM +0200, Vicente Bergas wrote: On Wednesday, May 22, 2019 6:29:46 PM CEST, Al Viro wrote: ... __d_lookup() running into &dentry->d_hash == 0x0100 at some point in hash chain and trying to look at -

Re: d_lookup: Unable to handle kernel paging request

2019-06-18 Thread Al Viro
On Tue, Jun 18, 2019 at 07:35:48PM +0100, Al Viro wrote: > So far it looks like something is buggering a forward reference > in hash chain in a fairly specific way - the values seen had been > 01000 and > 880001000. Does that smell like anything from arm64-specific > data stru

Re: d_lookup: Unable to handle kernel paging request

2019-06-18 Thread Al Viro
On Tue, May 28, 2019 at 11:38:43AM +0200, Vicente Bergas wrote: > On Wednesday, May 22, 2019 6:29:46 PM CEST, Al Viro wrote: > > ... > > IOW, here we have also run into bogus hlist forward pointer or head - > > same 0x100 in one case and 0x88000100 in two others. > > > > Have you tried

Re: d_lookup: Unable to handle kernel paging request

2019-05-28 Thread Vicente Bergas
On Wednesday, May 22, 2019 6:29:46 PM CEST, Al Viro wrote: ... IOW, here we have also run into bogus hlist forward pointer or head - same 0x100 in one case and 0x88000100 in two others. Have you tried to see if KASAN catches anything on those loads? Use-after-free, for example... An

Re: d_lookup: Unable to handle kernel paging request

2019-05-24 Thread Vicente Bergas
On Wednesday, May 22, 2019 6:29:46 PM CEST, Al Viro wrote: On Wed, May 22, 2019 at 05:44:30PM +0200, Vicente Bergas wrote: ... IOW, here we have also run into bogus hlist forward pointer or head - same 0x100 in one case and 0x88000100 in two others. Have you tried to see if KASAN cat

Re: d_lookup: Unable to handle kernel paging request

2019-05-22 Thread Al Viro
On Wed, May 22, 2019 at 05:44:30PM +0200, Vicente Bergas wrote: > 2d30: f8617893 ldr x19, [x4, x1, lsl #3] > 2d34: f27ffa73 ands x19, x19, #0xfffffffffffffffe > 2d38: 54000920 b.eq 2e5c <__d_lookup_rcu+0x15c> // b.none > 2d3c: aa0003f
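
For readers following the disassembly quoted in this message: the `ldr` loads a hash-bucket head (base in x4, index in x1 scaled by 8), the `ands` clears bit 0 of that value, and the `b.eq` leaves the loop when the result is zero, i.e. the bucket is empty. That fits a list head whose lowest bit doubles as a lock bit, which is what a LIST_BL_LOCKMASK of 1 implies. Below is a minimal userspace sketch of that pattern, not the kernel's code: the struct layouts are simplified and the helper names `bl_first`, `bl_is_locked` and `bl_chain_length` are hypothetical, chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: lock mask of 1, as reported for the config in this thread. */
#define LIST_BL_LOCKMASK 1UL

/* Simplified model of a bit-locked hash list (illustrative, not kernel code). */
struct hlist_bl_node {
    struct hlist_bl_node *next;
    struct hlist_bl_node **pprev;
};

struct hlist_bl_head {
    struct hlist_bl_node *first; /* bit 0 is the lock bit, not part of the address */
};

/* Hypothetical helper: strip the lock bit to recover the real first-node pointer. */
static inline struct hlist_bl_node *bl_first(const struct hlist_bl_head *h)
{
    return (struct hlist_bl_node *)((uintptr_t)h->first & ~LIST_BL_LOCKMASK);
}

/* Hypothetical helper: report whether the lock bit is currently set. */
static inline int bl_is_locked(const struct hlist_bl_head *h)
{
    return ((uintptr_t)h->first & LIST_BL_LOCKMASK) != 0;
}

/* Hypothetical helper: walk the chain. Only the head carries the lock bit;
 * the next pointers are followed unmasked, so a corrupted forward pointer
 * would be dereferenced directly. */
static inline size_t bl_chain_length(const struct hlist_bl_head *h)
{
    size_t n = 0;
    for (const struct hlist_bl_node *p = bl_first(h); p != NULL; p = p->next)
        n++;
    return n;
}
```

In this model only the head pointer is masked, so a bogus value stored in a `next` field is followed as-is by the walker. To try it, build a two-node chain, store the address of the first node ORed with LIST_BL_LOCKMASK in the head, and check that `bl_first` still returns the unmodified node address.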

Re: d_lookup: Unable to handle kernel paging request

2019-05-22 Thread Vicente Bergas
Hi Al, On Wednesday, May 22, 2019 3:53:31 PM CEST, Al Viro wrote: On Wed, May 22, 2019 at 12:40:55PM +0200, Vicente Bergas wrote: Hi, since a recent update the kernel is reporting d_lookup errors. They appear randomly and after each error the affected file or directory is no longer accessible.

Re: d_lookup: Unable to handle kernel paging request

2019-05-22 Thread Al Viro
On Wed, May 22, 2019 at 12:40:55PM +0200, Vicente Bergas wrote: > Hi, > since a recent update the kernel is reporting d_lookup errors. > They appear randomly and after each error the affected file or directory > is no longer accessible. > The kernel is built with GCC 9.1.0 on ARM64. > Four traces f