Re: Fwd: Memory corruption in multithreaded user space program while calling fork

2023-07-08 Thread Thorsten Leemhuis
[adding Linus to the list of recipients to ensure the fix makes it into
-rc1 (and can finally be backported to -stable).

Linus, here is the backstory, as I assume you haven't seen this yet:

CONFIG_PER_VMA_LOCK (which defaults to Y; merged for v6.4-rc1 in
0bff0aaea03 ("x86/mm: try VMA lock-based page fault handling first"))
sometimes causes memory corruption reported here:
https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf...@kernel.org/
https://bugzilla.kernel.org/show_bug.cgi?id=217624

The plan since early this week is to mark CONFIG_PER_VMA_LOCK as broken;
latest patch that does this is this one afaics:
https://lore.kernel.org/all/20230706011400.2949242-3-sur...@google.com/

But that change or something similar hasn't reached you yet afaics;
note, this is the second patch of a series with two patches]

On 05.07.23 17:49, Andrew Morton wrote:
> On Wed, 5 Jul 2023 10:51:57 +0200 "Linux regression tracking (Thorsten Leemhuis)" wrote:
> 
>>>>> I'm in wait-a-few-days-mode on this.  To see if we have a backportable
>>>>> fix rather than disabling the feature in -stable.
>>
>> Andrew, how long will you remain in "wait-a-few-days-mode"? Given what
>> Greg said below and that we already had three reports I know of I'd
>> prefer if we could fix this sooner rather than later in mainline --
>> especially as Arch Linux and openSUSE Tumbleweed likely have switched to
>> 6.4.y already or will do so soon.
> 
> I'll send today's 2-patch series to Linus today or tomorrow.

That afaics did not happen until now. :-(

This makes me regret that I did not CC Linus earlier. I always feel like
a snitcher when I do that. But in retrospect it seems it would have
been the right thing to do given the problem, as I suspect Linus would
have quickly applied the patch or marked the feature as broken himself.

So thx to this (and a handful of earlier, similar situations) I have now
fully made my peace with feeling like a snitcher (I always knew that
it's kinda part of the position). When something in me says "Ick, this
looks bad to my untrained eyes" I'll immediately CC Linus.

Linus, if I take things too far just let me know. But I assume you get a
lot of mails and won't mind a few more.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.


Re: linux-next: Tree for Jul 13 (drivers/video/fbdev/ps3fb.c)

2023-07-18 Thread Thorsten Leemhuis
On 18.07.23 05:32, Bagas Sanjaya wrote:
> On Thu, Jul 13, 2023 at 09:11:10AM -0700, Randy Dunlap wrote:
>> On 7/12/23 19:37, Stephen Rothwell wrote:
>>> Changes since 20230712:
>>
>> on ppc64:
>>
>> In file included from ../include/linux/device.h:15,
>>  from ../arch/powerpc/include/asm/io.h:22,
>>  from ../include/linux/io.h:13,
>>  from ../include/linux/irq.h:20,
>>  from ../arch/powerpc/include/asm/hardirq.h:6,
>>  from ../include/linux/hardirq.h:11,
>>  from ../include/linux/interrupt.h:11,
>>  from ../drivers/video/fbdev/ps3fb.c:25:
>> ../drivers/video/fbdev/ps3fb.c: In function 'ps3fb_probe':
>> ../drivers/video/fbdev/ps3fb.c:1172:40: error: 'struct fb_info' has no member named 'dev'
>>  1172 |  dev_driver_string(info->dev), dev_name(info->dev),
>>   |^~
>> ../include/linux/dev_printk.h:110:37: note: in definition of macro 'dev_printk_index_wrap'
>>   110 | _p_func(dev, fmt, ##__VA_ARGS__); \
>>   | ^~~
>> ../drivers/video/fbdev/ps3fb.c:1171:9: note: in expansion of macro 'dev_info'
>>  1171 | dev_info(info->device, "%s %s, using %u KiB of video memory\n",
>>   | ^~~~
>> ../drivers/video/fbdev/ps3fb.c:1172:61: error: 'struct fb_info' has no member named 'dev'
>>  1172 |  dev_driver_string(info->dev), dev_name(info->dev),
>>   | ^~
>> ../include/linux/dev_printk.h:110:37: note: in definition of macro 'dev_printk_index_wrap'
>>   110 | _p_func(dev, fmt, ##__VA_ARGS__); \
>>   | ^~~
>> ../drivers/video/fbdev/ps3fb.c:1171:9: note: in expansion of macro 'dev_info'
>>  1171 | dev_info(info->device, "%s %s, using %u KiB of video memory\n",
>>   | ^~~~
> 
> Hmm, there is no response from Thomas yet. I guess we should go with
> reverting bdb616479eff419, right?

I'm missing something here:

* What makes you think this is caused by bdb616479eff419? I didn't see
anything in the thread that claims this, but I might be missing something
* related: if I understand Randy right, this is only happening in -next;
so why would bdb616479eff419 be the culprit, given that it has also been
in mainline since the end of June?

And asking for a revert already is jumping the gun a bit; sure, it would
be good to get this fixed, but remember: developers have a lot on their
plate and thus sometimes are forced to set priorities; they also
sometimes go on vacation or are afk for other reasons; and sometimes
they just miss a mail or two. These are just a few of the good reasons
why Thomas might not have looked into this yet, hence please
first ask really kindly before asking for a revert.

Ciao, Thorsten


Re: linux-next: Tree for Jul 13 (drivers/video/fbdev/ps3fb.c)

2023-07-19 Thread Thorsten Leemhuis
On 19.07.23 14:36, Bagas Sanjaya wrote:
> On 7/18/23 17:06, Thorsten Leemhuis wrote:
>> I'm missing something here:
>>
>> * What makes you think this is caused by bdb616479eff419? I didn't see
>> anything in the thread that claims this, but I might be missing something
>> * related: if I understand Randy right, this is only happening in -next;
>> so why is bdb616479eff419 the culprit, which is also in mainline since
>> End of June?
> 
> Actually drivers/video/fbdev/ps3fb.c only had two non-merge commits during
> the previous cycle: 25ec15abb06194 and bdb616479eff419. The former simply
> added the .owner field in ps3fb_ops (hence trivial), so I inferred that the
> culprit was likely the latter (as it was authored by Thomas).

As you can see from Michael's reply, this was misguided, as it was an
external change that broke the driver. This happens all the time, so the
culprit cannot be reliably inferred this way.

Ciao, Thorsten


Re: Issues with the first PowerPC updates for the kernel 6.1 #forregzbot

2022-10-30 Thread Thorsten Leemhuis
[Note: this mail is primarily sent for documentation purposes and/or for
regzbot, my Linux kernel regression tracking bot. That's why I removed
most or all folks from the list of recipients, but left any that looked
like mailing lists. These mails usually contain '#forregzbot' in the
subject, to make them easy to spot and filter out.]

[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few template
paragraphs you might have encountered already in similar form.]

Hi, this is your Linux kernel regression tracker.

On 12.10.22 08:51, Christian Zigotzky wrote:
> Hi All,
> 
> I use the Nemo board with a PASemi PA6T CPU and have some issues since the 
> first PowerPC updates for the kernel 6.1.
> 
> I successfully compiled the git kernel with the first PowerPC updates two 
> days ago.
> 
> Unfortunately this kernel is really dangerous. Many things for example 
> Network Manager and LightDM don't work anymore and produced several gigabyte 
> of config files till the partition has been filled.
> 
> I deleted some files like the resolv.conf that had a size over 200 GB!
> 
> Unfortunately, MintPPC was still damaged. For example LightDM doesn't work 
> anymore and the MATE desktop doesn't display any icons anymore because Caja 
> wasn't able to reserve memory anymore.
> 
> In this case, bisecting isn't an option and I have to wait some weeks. It is
> really difficult to find the issue if the userland gets damaged again and
> again.

Thanks for the report. To be sure the issue below doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot ^introduced c2e7a19827eec443a7cb
#regzbot title ppc: PASemi PA6T CPU: Network Manager and LightDM and
fill volume with data
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally with also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained
in the Linux kernel's documentation; the webpage above explains why this is
important for tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.


Re: [PATCH] KVM: PPC: Book3S HV nestedv2: Cancel pending HDEC exception

2024-04-04 Thread Thorsten Leemhuis
On 05.04.24 05:20, Michael Ellerman wrote:
> "Linux regression tracking (Thorsten Leemhuis)"
>  writes:
>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
>> for once, to make this easily accessible to everyone.
>>
>> Was this regression ever resolved? Doesn't look like it, but maybe I
>> just missed something.
> 
> I'm not sure how it ended up on the regression list.

That is easy to explain: I let lei search for mails containing words
like regress, bisect, and revert to become aware of regressions that
might need tracking. And...

> IMHO it's not really a regression.

...sometimes I misjudge or misinterpret something and add it to the
regression tracking. Looks like that happened here.

Sorry for that and the noise it caused!

#regzbot resolve: invalid: was not really a regression in the first place

Ciao, Thorsten
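
A keyword search of that kind is cheap but imprecise, which is how a false
positive like this one can slip in. As a rough illustration in C (this is
not how lei works internally; the function names are hypothetical and only
the three word stems come from the mail above), a case-insensitive
substring filter could look like this:

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive substring test written with standard C only. */
static int contains_nocase(const char *text, const char *word)
{
    if (*word == '\0')
        return 1;
    for (; *text != '\0'; text++) {
        size_t i = 0;

        while (word[i] != '\0' && text[i] != '\0' &&
               tolower((unsigned char)text[i]) ==
               tolower((unsigned char)word[i]))
            i++;
        if (word[i] == '\0')
            return 1;
    }
    return 0;
}

/* Hypothetical triage filter: returns 1 if the mail body contains any of
 * the stems "regress", "bisect" or "revert", ignoring case. */
static int looks_like_regression_mail(const char *body)
{
    static const char *const stems[] = { "regress", "bisect", "revert" };

    for (size_t i = 0; i < sizeof(stems) / sizeof(stems[0]); i++)
        if (contains_nocase(body, stems[i]))
            return 1;
    return 0;
}
```

Such a filter flags "I bisected today." but also "This is not a
regression", so a human still has to judge every hit.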


Re: [PASEMI] Nemo board doesn't reboot anymore after the commit "HID: usbhid: Add ALWAYS_POLL quirk for some mice" #forregzbot

2022-12-22 Thread Thorsten Leemhuis
[Note: this mail contains only information for Linux kernel regression
tracking. Mails like these contain '#forregzbot' in the subject to make
them easy to spot and filter out. The author also tried to remove most
or all individuals from the list of recipients to spare them the hassle.]

On 22.12.22 11:42, Christian Zigotzky wrote:
> 
> The Nemo board [1] doesn't reboot anymore since the final kernel 6.1.
> The reboot works with the RC8 of kernel 6.1.
> Actually, a reboot works but the CFE firmware is not loaded. Maybe there
> is still something in the memory after the reboot.
> 
> I bisected today. [2]

Thanks for the report. To be sure the issue below doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot ^introduced f6d910a89a23
#regzbot title hid: PASEMI Nemo board doesn't reboot anymore
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.


Bug 215658 - arch/powerpc/mm/mmu_context.o Assembler messages: Error: unrecognized opcode: `dssall' (PowerMac G4)

2022-03-10 Thread Thorsten Leemhuis
Hi, this is your Linux kernel regression tracker.

I noticed a regression report in bugzilla.kernel.org that afaics nobody
has acted upon since it was reported about a week ago. That's why I decided
to forward it to the lists and to add a few relevant people to the CC. To
quote from the ticket:

> 5.16.12 kernel build for my G4 DP on my Talos II fails with:
> 
> [...]
>   CC  arch/powerpc/mm/init_32.o
>   CC  arch/powerpc/mm/pgtable_32.o
>   CC  arch/powerpc/mm/pgtable-frag.o
>   CC  arch/powerpc/mm/ioremap.o
>   CC  arch/powerpc/mm/ioremap_32.o
>   CC  arch/powerpc/mm/init-common.o
>   CC  arch/powerpc/mm/mmu_context.o
> {standard input}: Assembler messages:
> {standard input}:30: Error: unrecognized opcode: `dssall'
> make[2]: *** [scripts/Makefile.build:287: arch/powerpc/mm/mmu_context.o] Error 1
> make[1]: *** [scripts/Makefile.build:549: arch/powerpc/mm] Error 2
> make: *** [Makefile:1846: arch/powerpc] Error 2
> 
> This seems to have been introduced by commit 
> d51f86cfd8e378d4907958db77da3074f6dce3ba "powerpc/mm: Switch obsolete dssall 
> to .long"
> 
> Reverting this commit fixes the build for my G4.

Could somebody take a look into this? Or was this discussed somewhere
else already? Or even fixed?

Anyway, to get this tracked:

#regzbot introduced: d51f86cfd8e378d4907958db77da3074f6dce3ba
#regzbot from: Erhard F 
#regzbot title:  arch/powerpc/mm/mmu_context.o Assembler messages:
Error: unrecognized opcode: `dssall' (PowerMac G4)
#regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=215658

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I'm getting a lot of
reports on my table. I can only look briefly into most of them and lack
knowledge about most of the areas they concern. I thus unfortunately
will sometimes get things wrong or miss something important. I hope
that's not the case here; if you think it is, don't hesitate to tell me
in a public reply, it's in everyone's interest to set the public record
straight.

-- 
Additional information about regzbot:

If you want to know more about regzbot, check out its web-interface, the
getting started guide, and the reference documentation:

https://linux-regtracking.leemhuis.info/regzbot/
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/getting_started.md
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/reference.md

The last two documents explain how you can interact with regzbot
yourself if you want to.

Hint for reporters: when reporting a regression it's in your interest to
CC the regression list and tell regzbot about the issue, as that ensures
the regression makes it onto the radar of the Linux kernel's regression
tracker and your report won't fall through the cracks unnoticed.

Hint for developers: you normally don't need to care about regzbot once
it's involved. Fix the issue as you normally would, just remember to
include 'Link:' tags in the patch description pointing to all reports
about the issue. This has been expected from developers even before
regzbot showed up for reasons explained in
'Documentation/process/submitting-patches.rst' and
'Documentation/process/5.Posting.rst'.
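
For readers who want to script around mails like the one above: the
'#regzbot' lines are plain text, so splitting them into a command and an
argument is straightforward. What follows is only a rough sketch in C under
my own assumptions; split_regzbot_line() is a hypothetical helper, and
nothing here claims to reflect how regzbot itself parses mails (the
reference documentation linked above is the place to look for that).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not regzbot's own parser: if 'line' starts with
 * "#regzbot" followed by whitespace, copy the command word and the rest of
 * the line into the caller's buffers and return 1; otherwise return 0. */
static int split_regzbot_line(const char *line,
                              char *cmd, size_t cmd_size,
                              char *arg, size_t arg_size)
{
    static const char prefix[] = "#regzbot";
    const size_t prefix_len = sizeof(prefix) - 1;
    size_t len, copy;

    if (cmd_size == 0 || arg_size == 0)
        return 0;
    if (strncmp(line, prefix, prefix_len) != 0)
        return 0;
    line += prefix_len;
    if (*line != ' ' && *line != '\t')
        return 0;                   /* e.g. "#regzbotx" is not a directive */
    line += strspn(line, " \t");

    /* The command is the first word; a trailing ':' is dropped so that
     * "introduced:" and "introduced" yield the same command string. */
    len = strcspn(line, " \t\r\n");
    if (len == 0)
        return 0;
    copy = len;
    if (line[copy - 1] == ':')
        copy--;
    if (copy >= cmd_size)
        copy = cmd_size - 1;        /* truncate rather than overflow */
    memcpy(cmd, line, copy);
    cmd[copy] = '\0';
    line += len;
    line += strspn(line, " \t");

    /* The argument is the remainder, without the line terminator. */
    len = strcspn(line, "\r\n");
    if (len >= arg_size)
        len = arg_size - 1;
    memcpy(arg, line, len);
    arg[len] = '\0';
    return 1;
}
```

Called on the line '#regzbot introduced: d51f86cfd8e378d4907958db77da3074f6dce3ba',
this fills cmd with "introduced" and arg with the commit id. Directives that
wrap onto a second line, like the title in the mail above, would need the
caller to join the lines first.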


Re: Bug 215658 - arch/powerpc/mm/mmu_context.o Assembler messages: Error: unrecognized opcode: `dssall' (PowerMac G4)

2022-03-10 Thread Thorsten Leemhuis
On 10.03.22 12:22, Christophe Leroy wrote:
> Le 10/03/2022 à 11:39, Thorsten Leemhuis a écrit :
>> Hi, this is your Linux kernel regression tracker.
>>
>> I noticed a regression report in bugzilla.kernel.org that afaics nobody
>> acted upon since it was reported about a week ago, that's why I decided
>> to forward it to the lists and a few relevant people to the CC. To quote
>> from the ticket:
> 
> I already looked at it when the ticket was opened and that's a bit puzzling.

Yeah, same here, but I decided to pick it up, as that's what I'm here for.

> With v5.16.12 and the config file in the bug report I have no such problem:
> 
>CC  arch/powerpc/mm/fault.o
>CC  arch/powerpc/mm/mem.o
> [...]

Maybe it's one of those bugs related to the version of binutils?

> The bug is puzzling because it says the problem is introduced by commit 
> d51f86cfd8e3 ("powerpc/mm: Switch obsolete dssall to .long") whereas the 
> purpose of that commit is exactly to fix the issue you are reporting.
>
> And as far as I can see that commit is not in v5.16.12, so my feeling is 
> that somethings wrong with the bug report.
> 
> By the way I think that cherry-picking that commit into v5.16.12 should 
> fix it.

Maybe that's what he meant to write? Maybe your comment in the
ticket will lead to some enlightenment.

Thx for looking into this.

Ciao, Thorsten
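
Side note on the commit title quoted above ("Switch obsolete dssall to
.long"): writing an instruction as a raw .long means handing the assembler
the 32-bit instruction word instead of the mnemonic, so the assembler does
not need to recognize the mnemonic at all. The sketch below shows how such a
word can be put together in C. The field values (primary opcode 31, extended
opcode 822, and a flag bit selecting all data streams) are my assumptions
based on the AltiVec instruction layout, not something stated in this
thread, and encode_dssall() is a hypothetical name.

```c
#include <stdint.h>

/* Assumed field values for the AltiVec 'dss' family (not taken from this
 * thread): primary opcode 31, extended opcode 822, and an 'A' flag that
 * selects all data streams, which is what the 'all' in 'dssall' refers to. */
#define PPC_PRIMARY_OPCODE_31  31u
#define DSS_EXTENDED_OPCODE    822u
#define DSS_ALL_STREAMS_FLAG   (1u << 25)

/* Build the 32-bit instruction word for 'dssall' from its fields. */
static uint32_t encode_dssall(void)
{
    uint32_t word = PPC_PRIMARY_OPCODE_31 << 26;  /* top six bits */

    word |= DSS_ALL_STREAMS_FLAG;                 /* 'A' bit */
    word |= DSS_EXTENDED_OPCODE << 1;             /* extended opcode field */
    return word;
}
```

Under those assumptions the function returns 0x7e00066c, which is the kind
of constant a '.long' directive would carry in place of the mnemonic.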

>>> 5.16.12 kernel build for my G4 DP on my Talos II fails with:
>>>
>>> [...]
>>>CC  arch/powerpc/mm/init_32.o
>>>CC  arch/powerpc/mm/pgtable_32.o
>>>CC  arch/powerpc/mm/pgtable-frag.o
>>>CC  arch/powerpc/mm/ioremap.o
>>>CC  arch/powerpc/mm/ioremap_32.o
>>>CC  arch/powerpc/mm/init-common.o
>>>CC  arch/powerpc/mm/mmu_context.o
>>> {standard input}: Assembler messages:
>>> {standard input}:30: Error: unrecognized opcode: `dssall'
>>> make[2]: *** [scripts/Makefile.build:287: arch/powerpc/mm/mmu_context.o] Error 1
>>> make[1]: *** [scripts/Makefile.build:549: arch/powerpc/mm] Error 2
>>> make: *** [Makefile:1846: arch/powerpc] Error 2
>>>
>>> This seems to have been introduced by commit 
>>> d51f86cfd8e378d4907958db77da3074f6dce3ba "powerpc/mm: Switch obsolete 
>>> dssall to .long"
>>>
>>> Reverting this commit fixes the build for my G4.
>>
>> Could somebody take a look into this? Or was this discussed somewhere
>> else already? Or even fixed?
>>
>> Anyway, to get this tracked:
>>
>> #regzbot introduced: d51f86cfd8e378d4907958db77da3074f6dce3ba
>> #regzbot from: Erhard F 
>> #regzbot title:  arch/powerpc/mm/mmu_context.o Assembler messages:
>> Error: unrecognized opcode: `dssall' (PowerMac G4)
>> #regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=215658
>>
>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>>
>> P.S.: As the Linux kernel's regression tracker I'm getting a lot of
>> reports on my table. I can only look briefly into most of them and lack
>> knowledge about most of the areas they concern. I thus unfortunately
>> will sometimes get things wrong or miss something important. I hope
>> that's not the case here; if you think it is, don't hesitate to tell me
>> in a public reply, it's in everyone's interest to set the public record
>> straight.


Re: Bug 215658 - arch/powerpc/mm/mmu_context.o Assembler messages: Error: unrecognized opcode: `dssall' (PowerMac G4)

2022-03-10 Thread Thorsten Leemhuis
On 10.03.22 13:22, Thorsten Leemhuis wrote:
> On 10.03.22 12:22, Christophe Leroy wrote:
>> Le 10/03/2022 à 11:39, Thorsten Leemhuis a écrit :
>>> Hi, this is your Linux kernel regression tracker.
>>>
>>> I noticed a regression report in bugzilla.kernel.org that afaics nobody
>>> acted upon since it was reported about a week ago, that's why I decided
>>> to forward it to the lists and a few relevant people to the CC. To quote
>>> from the ticket:
>> I already looked at it when the ticket was opened and that's a bit puzzling.
> Yeah, same here, but I decided to pick it up, as that's what I'm here for.

TWIMC, the reporter stated in bugzilla:

```
> This was Gentoo Sources v5.16.12 not upstream sources. But now I am
> not able to reproduce it which is even more strange... Also Gentoos'
> v5.16.13 builds ok.
> 
> What I did in the meantime was downgrading to binutils 2.37 (had 2.38
> before) and rebuilding the toolchain afterwards.
> 
> So this probably was never a bug but an issue with my setup. ;)
> Closing here
```

Thus removing it from the regression tracking as well:

#regzbot invalid: reporter can't reproduce anymore and the report was a
bit puzzling anyway

Ciao, Thorsten

>> With v5.16.12 and the config file in the bug report I have no such problem:
>>
>>CC  arch/powerpc/mm/fault.o
>>CC  arch/powerpc/mm/mem.o
>> [...]
> 
> Maybe it's one of those bugs related to the version of binutils?
> 
>> The bug is puzzling because it says the problem is introduced by commit 
>> d51f86cfd8e3 ("powerpc/mm: Switch obsolete dssall to .long") whereas the 
>> purpose of that commit is exactly to fix the issue you are reporting.
>>
>> And as far as I can see that commit is not in v5.16.12, so my feeling is 
>> that somethings wrong with the bug report.
>>
>> By the way I think that cherry-picking that commit into v5.16.12 should 
>> fix it.
> 
> Maybe that's what he had meant to be writing? Maybe your comment in the
> ticket will lead to some enlightenment.
> 
> Thx for looking into this.
> 
> Ciao, Thorsten
> 
>>>> 5.16.12 kernel build for my G4 DP on my Talos II fails with:
>>>>
>>>> [...]
>>>>CC  arch/powerpc/mm/init_32.o
>>>>CC  arch/powerpc/mm/pgtable_32.o
>>>>CC  arch/powerpc/mm/pgtable-frag.o
>>>>CC  arch/powerpc/mm/ioremap.o
>>>>CC  arch/powerpc/mm/ioremap_32.o
>>>>CC  arch/powerpc/mm/init-common.o
>>>>CC  arch/powerpc/mm/mmu_context.o
>>>> {standard input}: Assembler messages:
>>>> {standard input}:30: Error: unrecognized opcode: `dssall'
>>>> make[2]: *** [scripts/Makefile.build:287: arch/powerpc/mm/mmu_context.o] Error 1
>>>> make[1]: *** [scripts/Makefile.build:549: arch/powerpc/mm] Error 2
>>>> make: *** [Makefile:1846: arch/powerpc] Error 2
>>>>
>>>> This seems to have been introduced by commit 
>>>> d51f86cfd8e378d4907958db77da3074f6dce3ba "powerpc/mm: Switch obsolete 
>>>> dssall to .long"
>>>>
>>>> Reverting this commit fixes the build for my G4.
>>>
>>> Could somebody take a look into this? Or was this discussed somewhere
>>> else already? Or even fixed?
>>>
>>> Anyway, to get this tracked:
>>>
>>> #regzbot introduced: d51f86cfd8e378d4907958db77da3074f6dce3ba
>>> #regzbot from: Erhard F 
>>> #regzbot title:  arch/powerpc/mm/mmu_context.o Assembler messages:
>>> Error: unrecognized opcode: `dssall' (PowerMac G4)
>>> #regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=215658
>>>
>>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>>>
>>> P.S.: As the Linux kernel's regression tracker I'm getting a lot of
>>> reports on my table. I can only look briefly into most of them and lack
>>> knowledge about most of the areas they concern. I thus unfortunately
>>> will sometimes get things wrong or miss something important. I hope
>>> that's not the case here; if you think it is, don't hesitate to tell me
>>> in a public reply, it's in everyone's interest to set the public record
>>> straight.


Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.

2021-11-15 Thread Thorsten Leemhuis
Hi, this is your Linux kernel regression tracker speaking.

This looks stalled, as afaics nothing has happened to get this regression
fixed since the mail below. How can we get things rolling again?

Eugene, were you able to look into the patch from Joakim?

Or did I miss anything and some progress to fix this was made elsewhere?
Please let me know if that's the case.

Ciao, Thorsten (wearing his Linux kernel regression tracker hat)

P.S.: As a Linux kernel regression tracker I'm getting a lot of reports
on my table. I can only look briefly into most of them. I will therefore
unfortunately sometimes get things wrong or miss something important.
I hope that's not the case here; if you think it is, don't hesitate to
tell me about it in a public reply. That's in everyone's interest, as
what I wrote above might be misleading to everyone reading this; any
suggestion I gave might thus send someone reading this down the wrong
rabbit hole, which none of us wants.

P.P.S.: Feel free to ignore the following lines, they are only meant for
regzbot, my Linux kernel regression tracking bot
(https://linux-regtracking.leemhuis.info/regzbot/):

#regzbot poke

On 02.11.21 22:15, Joakim Tjernlund wrote:
> On Sat, 2021-10-30 at 14:20 +, Joakim Tjernlund wrote:
>> On Fri, 2021-10-29 at 17:14 +, Eugene Bordenkircher wrote:
>
>>> We've discovered a situation where the FSL udc driver 
>>> (drivers/usb/gadget/udc/fsl_udc_core.c) will enter a loop iterating over 
>>> the request queue, but the queue has been corrupted at some point so it 
>>> loops infinitely.  I believe we have narrowed into the offending code, but 
>>> we are in need of assistance trying to find an appropriate fix for the 
>>> problem.  The identified code appears to be in all versions of the Linux 
>>> kernel the driver exists in.
>>>
>>> The problem appears to be when handling a USB_REQ_GET_STATUS request.  The 
>>> driver gets this request and then calls the ch9getstatus() function.  In 
>>> this function, it starts a request by "borrowing" the per device 
>>> status_req, filling it in, and then queuing it with a call to 
>>> list_add_tail() to add the request to the endpoint queue.  Right before it 
>>> exits the function however, it's calling ep0_prime_status(), which is 
>>> filling out that same status_req structure and then queuing it with another 
>>> call to list_add_tail() to add the request to the endpoint queue.  This 
>>> adds two instances of the exact same LIST_HEAD to the endpoint queue, which 
>>> breaks the list since the prev and next pointers end up pointing to the 
>>> wrong things.  This ends up causing a hard loop the next time nuke() gets 
>>> called, which happens on the next setup IRQ.
>>>
>>> I'm not sure what the appropriate fix to this problem is, mostly due to my 
>>> lack of expertise in USB and this driver stack.  The code has been this way 
>>> in the kernel for a very long time, which suggests that it has been 
>>> working, unless USB_REQ_GET_STATUS requests are never made.  This further 
>>> suggests that there is something else going on that I don't understand.  
>>> Deleting the call to ep0_prime_status() and the following ep0stall() call 
>>> appears, on the surface, to get the device working again, but may have side 
>>> effects that I'm not seeing.
>>>
>>> I'm hopeful someone in the community can help provide some information on 
>>> what I may be missing or help come up with a solution to the problem.  A 
>>> big thank you to anyone who would like to help out.
>>>
>>> Eugene
>>
>> Ran into this too a while ago. Found the bug and a few more fixes.
>> This is against 4.19 so you may have to tweak them a bit.
>> Feel free to upstream them.
>>
>>  Jocke 
> 
> Curious, did my patches help? Good to know once we upgrade as well.
> 
>  Jocke
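
The failure mode Eugene describes can be reproduced outside the kernel with
a few lines of C. The following is a minimal user-space sketch, not the
driver's code: struct list_head and list_add_tail() are re-implemented here
in the usual circular doubly linked form, and init_list_head(),
walk_terminates() and double_add_breaks_walk() are hypothetical helpers
added for the demonstration. It assumes the simplest case, where the same
entry is queued twice with nothing else queued in between.

```c
#include <stddef.h>

/* Minimal user-space stand-in for a circular doubly linked list. It mirrors
 * the shape of the kernel's struct list_head and list_add_tail(), but it is
 * an illustration only, not the kernel's implementation. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert 'entry' just before 'head', i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    struct list_head *prev = head->prev;

    entry->next = head;
    entry->prev = prev;
    prev->next = entry;
    head->prev = entry;
}

/* Walk the list from 'head' and report whether the walk gets back to 'head'
 * within 'limit' steps. A healthy list with n entries needs n + 1 steps. */
static int walk_terminates(const struct list_head *head, size_t limit)
{
    const struct list_head *pos = head->next;

    for (size_t steps = 0; steps < limit; steps++) {
        if (pos == head)
            return 1;
        pos = pos->next;
    }
    return 0;
}

/* Hypothetical scenario: queue the same entry twice, as in the report's
 * description of status_req being added to the endpoint queue twice.
 * Returns 1 if the list was fine after the first add and broken after the
 * second one. */
static int double_add_breaks_walk(void)
{
    struct list_head queue, status_req;

    init_list_head(&queue);
    list_add_tail(&status_req, &queue);   /* first queuing: list is fine */
    if (!walk_terminates(&queue, 16))
        return 0;
    list_add_tail(&status_req, &queue);   /* second queuing of the same node */
    return !walk_terminates(&queue, 16);  /* 1 if the walk no longer ends */
}
```

In this simplified case the second list_add_tail() makes the entry's next
pointer refer to the entry itself, so a walk that starts at the queue head
never gets back to it and double_add_breaks_walk() returns 1. That is
consistent with the hard loop the report describes for nuke().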


Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.

2021-11-25 Thread Thorsten Leemhuis
Hi, this is your Linux kernel regression tracker speaking.

Top-posting for once, to make this easy to process for everyone:

Li Yang and Felipe Balbi: how do we move forward with this? It's quite an
old regression, but it is one nevertheless and thus should be fixed. Part of
my position is to make that happen and thus to remind developers and
maintainers about this until the regression is resolved.

Ciao, Thorsten

On 16.11.21 20:11, Eugene Bordenkircher wrote:
> On 02.11.21 22:15, Joakim Tjernlund wrote:
>> On Sat, 2021-10-30 at 14:20 +, Joakim Tjernlund wrote:
>>> On Fri, 2021-10-29 at 17:14 +, Eugene Bordenkircher wrote:
>>
>>>> We've discovered a situation where the FSL udc driver
>>>> (drivers/usb/gadget/udc/fsl_udc_core.c) will enter a loop iterating over
>>>> the request queue, but the queue has been corrupted at some point so it
>>>> loops infinitely.  I believe we have narrowed into the offending code, but
>>>> we are in need of assistance trying to find an appropriate fix for the
>>>> problem.  The identified code appears to be in all versions of the Linux
>>>> kernel the driver exists in.
>>>>
>>>> The problem appears to be when handling a USB_REQ_GET_STATUS request.  The
>>>> driver gets this request and then calls the ch9getstatus() function.  In
>>>> this function, it starts a request by "borrowing" the per device
>>>> status_req, filling it in, and then queuing it with a call to
>>>> list_add_tail() to add the request to the endpoint queue.  Right before it
>>>> exits the function however, it's calling ep0_prime_status(), which is
>>>> filling out that same status_req structure and then queuing it with
>>>> another call to list_add_tail() to add the request to the endpoint queue.
>>>> This adds two instances of the exact same LIST_HEAD to the endpoint queue,
>>>> which breaks the list since the prev and next pointers end up pointing to
>>>> the wrong things.  This ends up causing a hard loop the next time nuke()
>>>> gets called, which happens on the next setup IRQ.
>>>>
>>>> I'm not sure what the appropriate fix to this problem is, mostly due to my
>>>> lack of expertise in USB and this driver stack.  The code has been this
>>>> way in the kernel for a very long time, which suggests that it has been
>>>> working, unless USB_REQ_GET_STATUS requests are never made.  This further
>>>> suggests that there is something else going on that I don't understand.
>>>> Deleting the call to ep0_prime_status() and the following ep0stall() call
>>>> appears, on the surface, to get the device working again, but may have
>>>> side effects that I'm not seeing.
>>>>
>>>> I'm hopeful someone in the community can help provide some information on
>>>> what I may be missing or help come up with a solution to the problem.  A
>>>> big thank you to anyone who would like to help out.
>>>
>>> Ran into this too a while ago. Found the bug and a few more fixes.
>>> This is against 4.19 so you may have to tweak them a bit.
>>> Feel free to upstream them.
>>
>> Curious, did my patches help? Good to know once we upgrade as well.
> 
> There's good news and bad news.
> 
> The good news is that this appears to stop the driver from entering
> an infinite loop, which prevents the Linux system from locking up and
> never recovering.  So I'm willing to say we've made the behavior
> better.
> 
> The bad news is that once we get past this point, there is new bad
> behavior.  What is on top of this driver in our system is the RNDIS
> gadget driver communicating to a Laptop running Win10 -1809.
> Everything appears to work fine with the Linux system until there is
> a USB disconnect.  After the disconnect, the Linux side appears to
> continue on just fine, but the Windows side doesn't seem to recognize
> the disconnect, which causes the USB driver on that side to hang
> forever and eventually blue screen the box.  This doesn't happen on
> all machines, just a select few.   I think we can isolate the
> behavior to a specific antivirus/security software driver that is
> inserting itself into the USB stack and filtering the disconnect
> message, but we're still proving that.
> 
> I'm about 90% certain this is a different problem and we can call
> this patchset good, at least for our test setup.  My only hesitation
> is if the Linux side is sending a set of responses that are confusing
> the Windows side (specifically this antivirus) or not.  I'd be
> content calling that a separate defect though and letting this one
> close up with that patchset.

P.S.: As a Linux kernel regression tracker I'm getting a lot of reports
on my table. I can only look briefly into most of them. I will therefore
unfortunately sometimes get things wrong or miss something important.
I hope that's not the case here; if you think it is, don't hesitate to
tell me about it in a public reply. That's in everyone's interest, as
what I wrote above might be misleading to everyone reading this; any
suggestion I gave might thus send someone reading this down the wrong
rabbit hole, which none of us wants.

Re: [BUG] mtd: cfi_cmdset_0002: write regression since v4.17-rc1

2021-12-13 Thread Thorsten Leemhuis
[TLDR: adding this regression to regzbot; most of this mail is compiled
from a few template paragraphs some of you might have seen already.]

Hi, this is your Linux kernel regression tracker speaking.

Top-posting for once, to make this easily accessible to everyone.

Thanks for the report.

Adding the regression mailing list to the list of recipients, as it
should be in the loop for all regressions, as explained here:
https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

To be sure this issue doesn't fall through the cracks unnoticed, I'm
adding it to regzbot, my Linux kernel regression tracking bot:

#regzbot ^introduced dfeae1073583
#regzbot title mtd: cfi_cmdset_0002: flash write accesses on the
hardware fail on a PowerPC MPC8313 to a 8-bit-parallel S29GL064N flash
#regzbot ignore-activity

Reminder: when fixing the issue, please add a 'Link:' tag with the URL
to the report (the parent of this mail), then regzbot will automatically
mark the regression as resolved once the fix lands in the appropriate
tree. For more details about regzbot see footer.
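
For readers unfamiliar with the commands above: each one is a line that
starts with '#regzbot', followed by a command name and an optional
argument. The Python sketch below pulls such lines out of a mail body. It
is only an illustration and not how regzbot itself is implemented; the
function name extract_regzbot_commands and the rule that a wrapped line
continues the previous command are assumptions made for this example.

```python
def extract_regzbot_commands(body):
    """Return (command, argument) pairs for every '#regzbot' line in body.

    Assumption of this sketch: a non-empty line directly after a command
    is treated as the wrapped remainder of that command's argument.
    """
    commands = []
    current = None  # [name, argument] of the command being collected
    for raw in body.splitlines():
        line = raw.strip()
        parts = line.split(maxsplit=2)
        if parts and parts[0] == "#regzbot":
            if current is not None:
                commands.append(tuple(current))
            name = parts[1] if len(parts) > 1 else ""
            arg = parts[2] if len(parts) > 2 else ""
            current = [name, arg]
        elif current is not None and line:
            # Wrapped continuation of the previous command's argument.
            current[1] = " ".join(p for p in (current[1], line) if p)
        else:
            # Blank line or ordinary text: close any open command.
            if current is not None:
                commands.append(tuple(current))
                current = None
    if current is not None:
        commands.append(tuple(current))
    return commands


SAMPLE = """adding it to regzbot, my Linux kernel regression tracking bot:

#regzbot ^introduced dfeae1073583
#regzbot title mtd: cfi_cmdset_0002: flash write accesses on the
hardware fail on a PowerPC MPC8313 to a 8-bit-parallel S29GL064N flash
#regzbot ignore-activity

Reminder: when fixing the issue, please add a 'Link:' tag
"""

parsed = extract_regzbot_commands(SAMPLE)
for name, arg in parsed:
    print(name, "|", arg)
```

Joining the wrapped title line is a convenience of this sketch; a blank
line or the next '#regzbot' line ends the current command.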

Sending this to everyone that got the initial report, to make all aware
of the tracking. I also hope that messages like this motivate people to
directly get at least the regression mailing list and ideally even
regzbot involved when dealing with regressions, as messages like this
wouldn't be needed then.

Don't worry, I'll send further messages wrt to this regression just to
the lists (with a tag in the subject so people can filter them away), as
long as they are intended just for regzbot. With a bit of luck no such
messages will be needed anyway.

Ciao, Thorsten (wearing his 'Linux kernel regression tracker' hat).

P.S.: As a Linux kernel regression tracker I'm getting a lot of reports
on my table. I can only look briefly into most of them. Unfortunately
therefore I sometimes will get things wrong or miss something important.
I hope that's not the case here; if you think it is, don't hesitate to
tell me about it in a public reply. That's in everyone's interest, as
what I wrote above might be misleading to everyone reading this; any
suggestion I gave thus might send someone reading this down the wrong
rabbit hole, which none of us wants.

BTW, I have no personal interest in this issue, which is tracked using
regzbot, my Linux kernel regression tracking bot
(https://linux-regtracking.leemhuis.info/regzbot/). I'm only posting
this mail to get things rolling again and hence don't need to be CC on
all further activities wrt to this regression.

On 13.12.21 14:24, Ahmad Fatoum wrote:
> Hi,
> 
> I've been investigating a breakage on a PowerPC MPC8313: The SoC is connected
> via the "Enhanced Local Bus Controller" to a 8-bit-parallel S29GL064N flash,
> which is represented as a memory-mapped cfi-flash.
> 
> The regression began in v4.17-rc1 with
> 
>   dfeae1073583 ("mtd: cfi_cmdset_0002: Change write buffer to check correct 
> value")
> 
> and causes all flash write accesses on the hardware to fail. Example output
> after v5.1-rc2[1]:
> 
>   root@host:~# mount -t jffs2 /dev/mtdblock0 /mnt
>   MTD do_write_buffer_wait(): software timeout, address:0x000c000b.
>   jffs2: Write clean marker to block at 0x000c failed: -5
> 
> This issue still persists with v5.16-rc. Reverting aforementioned patch fixes
> it, but I am still looking for a change that keeps both Tokunori's and my
> hardware happy.
> 
> What Tokunori's patch did is that it strengthened the success condition
> for flash writes:
> 
>  - Prior to the patch, DQ polling was done until bits
>    stopped toggling. This was taken as an indicator that the write succeeded
>    and was reported up the stack. i.e. success condition is chip_ready()
> 
>  - After the patch, polling continues until the just written data is
>    actually read back, i.e. success condition is chip_good()
> 
> This new condition never holds for me, when DQ stabilizes, it reads 0xFF,
> never the just written data. The data is still written and can be read back
> on subsequent reads, just not at that point of time in the poll loop.
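
The difference between the two success conditions described above can be
illustrated with a small simulation. This is a hedged sketch in Python,
not the driver's code: the helper names make_reader and poll, the status
values 0x40/0x00 and the datum 0x5A are made up for illustration. Only
the names chip_ready() and chip_good() and the stable 0xFF reading are
taken from the report.

```python
from itertools import chain, repeat

DATUM = 0x5A                                   # hypothetical byte just written
STATUS_READS = [0x40, 0x00, 0x40, 0x00, 0xFF]  # toggling status, then stable 0xFF


def make_reader(values):
    """Return a read() callable that replays values, then repeats the last one."""
    stream = chain(values, repeat(values[-1]))
    return lambda: next(stream)


def chip_ready(read):
    """Pre-patch condition: two consecutive reads agree (bits stopped toggling)."""
    first, second = read(), read()
    return first == second


def chip_good(read, expected):
    """Post-patch condition: reads agree and match the datum just written."""
    first, second = read(), read()
    return first == second and second == expected


def poll(done, max_polls=100):
    """Call done() up to max_polls times; return the successful attempt or None."""
    for attempt in range(1, max_polls + 1):
        if done():
            return attempt
    return None  # corresponds to the software timeout in the report


ready_reader = make_reader(STATUS_READS)
good_reader = make_reader(STATUS_READS)

ready_polls = poll(lambda: chip_ready(ready_reader))
good_polls = poll(lambda: chip_good(good_reader, DATUM))

print("chip_ready succeeded on poll:", ready_polls)
print("chip_good succeeded on poll:", good_polls)
```

With these made-up reads the first condition is met as soon as the value
stops changing, while the second is never met because the stable value is
0xFF rather than the datum. That mirrors the timeout described in the
report, under the assumption that the chip keeps returning 0xFF for the
whole poll loop.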
> 
> We haven't had write issues for the years predating that patch. As the
> regression has been mainline for a while, I am wondering what about my setup
> that makes it pop up here, but not elsewhere?
> 
> I consulted the data sheet[2] and found Figure 27, which describes DQ polling
> during embedded algorithms. DQ switches from status output to "True" (I assume
> True == all bits set == 0xFF) until CS# is reasserted. 
> 
> I compared with another chip's datasheet, and it (Figure 8.4) doesn't describe
> such an intermittent "True" state. In any case, the driver polls a few hundred
> times, however, before giving up, so there should be enough CS# toggles.
> 
> 
> Locally, I'll revert this patch for now. I think accepting 0xFF as a success
> condition may be appropriate, but I don't yet have the rationale to back it 
> up.
> 
> I am investigating this some more, probably with a logic trace, b

Re: [BUG] mtd: cfi_cmdset_0002: write regression since v4.17-rc1

2022-01-20 Thread Thorsten Leemhuis
Hi, this is your Linux kernel regression tracker speaking.

On 15.12.21 18:34, Tokunori Ikegami wrote:
> Hi Ahmad-san,
> 
> Sorry for the regression issue by the change: dfeae1073583.
> To make sure could you please try with the word write instead of the
> buffered writes?

Ahmad, did you try what Tokunori asked? Was any progress made to get
this regression fixed? To me it looks like it fell through the cracks.
Can anyone provide a status update please?

Ciao, Thorsten

P.S.: As a Linux kernel regression tracker I'm getting a lot of reports
on my table. I can only look briefly into most of them. Unfortunately
therefore I sometimes will get things wrong or miss something important.
I hope that's not the case here; if you think it is, don't hesitate to
tell me about it in a public reply, that's in everyone's interest.

BTW, I have no personal interest in this issue, which is tracked using
regzbot, my Linux kernel regression tracking bot
(https://linux-regtracking.leemhuis.info/regzbot/). I'm only posting
this mail to get things rolling again and hence don't need to be CC on
all further activities wrt to this regression.

#regzbot poke

> FYI: There are some changes to disable the buffered writes as below.
>   1.
> https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob;f=target/linux/ar71xx/patches-4.9/411-mtd-cfi_cmdset_0002-force-word-write.patch;h=ddd69f17e1ac16e8fc3a694c56231fee1e2ef149;hb=fec8fe806963c96a6506c2aebc3572d3a11f285f
> 
>   2.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/mtd/chips/cfi_cmdset_0002.c?h=v5.16-rc5&id=7e4404113686868858a34210c28ae122e967aa64
> 
> 
> Note:
>   Currently I am not able to investigate the issue on the product for
> the change before.
> 
>   By the way in the past I had investigated the similar issue on Buffalo
> WZR-HP-G300NH using the S29GL256N.
>   It was not able to find the root cause by the investigation since not
> required actually at that time.
>   Also actually the buffered writes were disabled on the OpenWrt
> firmware as the change [1] above.
>   But I am not sure the reason detail to disable the buffered writes on
> the OpenWrt firmware.
>   I thought the issue not caused by the change: dfeae1073583 since the
> issue happened without the change.
> 
>   So I am not sure why the above change [2] needed to disable the
> buffered writes on Buffalo WZR-HP-G300NH.
>   Probably seems needed to disable the buffered writes on the other
> firmware also but not OpenWrt firmware.
> 
>   Anyway there are difference with your regression issue as below.
>     1. Flash device: S29GL064N (Your regression issue), S29GL256N
> (WZR-HP-G300NH)
>     2. Regression issue: Yes (Your regression issue), No (WZR-HP-G300NH
> as I investigated before)
> 
> Regards,
> Ikegami
> 
> On 2021/12/14 16:23, Thorsten Leemhuis wrote:
>> [TLDR: adding this regression to regzbot; most of this mail is compiled
>> from a few template paragraphs some of you might have seen already.]
>>
>> Hi, this is your Linux kernel regression tracker speaking.
>>
>> Top-posting for once, to make this easily accessible to everyone.
>>
>> Thanks for the report.
>>
>> Adding the regression mailing list to the list of recipients, as it
>> should be in the loop for all regressions, as explained here:
>> https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
>>
>> To be sure this issue doesn't fall through the cracks unnoticed, I'm
>> adding it to regzbot, my Linux kernel regression tracking bot:
>>
>> #regzbot ^introduced dfeae1073583
>> #regzbot title mtd: cfi_cmdset_0002: flash write accesses on the
>> hardware fail on a PowerPC MPC8313 to a 8-bit-parallel S29GL064N flash
>> #regzbot ignore-activity
>>
>> Reminder: when fixing the issue, please add a 'Link:' tag with the URL
>> to the report (the parent of this mail), then regzbot will automatically
>> mark the regression as resolved once the fix lands in the appropriate
>> tree. For more details about regzbot see footer.
>>
>> Sending this to everyone that got the initial report, to make all aware
>> of the tracking. I also hope that messages like this motivate people to
>> directly get at least the regression mailing list and ideally even
>> regzbot involved when dealing with regressions, as messages like this
>> wouldn't be needed then.
>>
>> Don't worry, I'll send further messages wrt to this regression just to
>> the lists (with a tag in the subject so people can filter them away), as
>> long as they are intended just for regzbot. With a bit of luck no such
>> messages will be needed anyway.

Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.

2022-01-20 Thread Thorsten Leemhuis
Hi, this is your Linux kernel regression tracker speaking.

On 04.12.21 01:40, Leo Li wrote:
>> -Original Message-
>> From: Joakim Tjernlund 
>> Sent: Thursday, December 2, 2021 4:45 PM
>> To: regressi...@leemhuis.info; Leo Li ;
>> eugene_bordenkirc...@selinc.com; linux-...@vger.kernel.org; linuxppc-
>> d...@lists.ozlabs.org
>> Cc: gre...@linuxfoundation.org; ba...@kernel.org
>> Subject: Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to
>> unrecoverable loop.
>>
>> On Thu, 2021-12-02 at 20:35 +, Leo Li wrote:
>>>
>>>> -Original Message-
>>>> From: Joakim Tjernlund 
>>>> Sent: Wednesday, December 1, 2021 8:19 AM
>>>> To: regressi...@leemhuis.info; Leo Li ;
>>>> eugene_bordenkirc...@selinc.com; linux-...@vger.kernel.org;
>>>> linuxppc- d...@lists.ozlabs.org
>>>> Cc: gre...@linuxfoundation.org; ba...@kernel.org
>>>> Subject: Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list
>>>> leads to unrecoverable loop.
>>>>
>>>> On Tue, 2021-11-30 at 12:56 +0100, Joakim Tjernlund wrote:
>>>>> On Mon, 2021-11-29 at 23:48 +, Eugene Bordenkircher wrote:
>>>>>> Agreed,
>>>>>>
>>>>>> We are happy pick up the torch on this, but I'd like to try and
>>>>>> hear from
>>>> Joakim first before we do.  The patch set is his, so I'd like to
>>>> give him the opportunity.  I think he's the only one that can add a
>>>> truly proper description as well because he mentioned that this
>>>> includes a "few more fixes" than just the one we ran into.  I'd
>>>> rather hear from him than try to reverse engineer what was being
>> addressed.
>>>>>>
>>>>>> Joakim, if you are still watching the thread, would you like to
>>>>>> take a stab
>>>> at it?  If I don't hear from you in a couple days, we'll pick up the
>>>> torch and do what we can.

Did anything happen? Sure, it's an old regression from the v3.4-rc4 days,
but there iirc was already a tested proto-patch in that thread that
fixes the issue. Or was progress made and I just missed it?

Ciao, Thorsten

P.S.: As a Linux kernel regression tracker I'm getting a lot of reports
on my table. I can only look briefly into most of them. Unfortunately
therefore I sometimes will get things wrong or miss something important.
I hope that's not the case here; if you think it is, don't hesitate to
tell me about it in a public reply, that's in everyone's interest.

BTW, I have no personal interest in this issue, which is tracked using
regzbot, my Linux kernel regression tracking bot
(https://linux-regtracking.leemhuis.info/regzbot/). I'm only posting
this mail to get things rolling again and hence don't need to be CC on
all further activities wrt to this regression.

#regzbot ignore-activity

>>>>> I am far away from this now and still on 4.19. I don't mind if you
>>>>> tweak
>>>> tweak the patches for better "upstreamability"
>>>>
>>>> Even better would be to migrate to the chipidea driver, I am told
>>>> just a few tweaks are needed but this is probably something NXP
>>>> should do as they have access to other SOC's using chipidea.
>>>
>>> I agree with this direction but the problem was with bandwidth.  As this
>> controller was only used on legacy platforms, it is harder to justify new 
>> effort
>> on it now.
>>
>> Legacy? All PPC is legacy and not supported now?
> 
> I'm not saying that they are not supported, but they are in maintenance only 
> mode.


Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.

2022-02-17 Thread Thorsten Leemhuis
Hi, this is your Linux kernel regression tracker speaking. Top-posting
for once, to make this easily accessible to everyone.

Sadly it looks to me like nobody is going to address this (quite old)
regression (that afaics only very few people will hit), despite the rough
patch to fix it that was already posted and tested in this thread.

Well, guess that's how it is sometimes. Marking it as "on back burner"
in regzbot to reduce the noise there:

#regzbot backburner: Tested patch available, but things nevertheless got
stuck

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I'm getting a lot of
reports on my table. I can only look briefly into most of them and lack
knowledge about most of the areas they concern. I thus unfortunately
will sometimes get things wrong or miss something important. I hope
that's not the case here; if you think it is, don't hesitate to tell me
in a public reply, it's in everyone's interest to set the public record
straight.

#regzbot poke



On 20.01.22 13:54, Thorsten Leemhuis wrote:
> Hi, this is your Linux kernel regression tracker speaking.
> 
> On 04.12.21 01:40, Leo Li wrote:
>>> -Original Message-
>>> From: Joakim Tjernlund 
>>> Sent: Thursday, December 2, 2021 4:45 PM
>>> To: regressi...@leemhuis.info; Leo Li ;
>>> eugene_bordenkirc...@selinc.com; linux-...@vger.kernel.org; linuxppc-
>>> d...@lists.ozlabs.org
>>> Cc: gre...@linuxfoundation.org; ba...@kernel.org
>>> Subject: Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to
>>> unrecoverable loop.
>>>
>>> On Thu, 2021-12-02 at 20:35 +, Leo Li wrote:
>>>>
>>>>> -Original Message-
>>>>> From: Joakim Tjernlund 
>>>>> Sent: Wednesday, December 1, 2021 8:19 AM
>>>>> To: regressi...@leemhuis.info; Leo Li ;
>>>>> eugene_bordenkirc...@selinc.com; linux-...@vger.kernel.org;
>>>>> linuxppc- d...@lists.ozlabs.org
>>>>> Cc: gre...@linuxfoundation.org; ba...@kernel.org
>>>>> Subject: Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list
>>>>> leads to unrecoverable loop.
>>>>>
>>>>> On Tue, 2021-11-30 at 12:56 +0100, Joakim Tjernlund wrote:
>>>>>> On Mon, 2021-11-29 at 23:48 +, Eugene Bordenkircher wrote:
>>>>>>> Agreed,
>>>>>>>
>>>>>>> We are happy pick up the torch on this, but I'd like to try and
>>>>>>> hear from
>>>>> Joakim first before we do.  The patch set is his, so I'd like to
>>>>> give him the opportunity.  I think he's the only one that can add a
>>>>> truly proper description as well because he mentioned that this
>>>>> includes a "few more fixes" than just the one we ran into.  I'd
>>>>> rather hear from him than try to reverse engineer what was being
>>> addressed.
>>>>>>>
>>>>>>> Joakim, if you are still watching the thread, would you like to
>>>>>>> take a stab
>>>>> at it?  If I don't hear from you in a couple days, we'll pick up the
>>>>> torch and do what we can.
> 
> Did anything happen? Sure, it's an old regression from the v3.4-rc4 days,
> but there iirc was already a tested proto-patch in that thread that
> fixes the issue. Or was progress made and I just missed it?
> 
> Ciao, Thorsten
> 
> P.S.: As a Linux kernel regression tracker I'm getting a lot of reports
> on my table. I can only look briefly into most of them. Unfortunately
> therefore I sometimes will get things wrong or miss something important.
> I hope that's not the case here; if you think it is, don't hesitate to
> tell me about it in a public reply, that's in everyone's interest.
> 
> BTW, I have no personal interest in this issue, which is tracked using
> regzbot, my Linux kernel regression tracking bot
> (https://linux-regtracking.leemhuis.info/regzbot/). I'm only posting
> this mail to get things rolling again and hence don't need to be CC on
> all further activities wrt to this regression.
> 
> #regzbot ignore-activity
> 
>>>>>> I am far away from this now and still on 4.19. I don't mind if you
>>>>>> tweak
>>>>> tweak the patches for better "upstreamability"
>>>>>
>>>>> Even better would be to migrate to the chipidea driver, I am told
>>>>> just a few tweaks are needed but this is probably something NXP
>>>>> should do as they have access to other SOC's using chipidea.
>>>>
>>>> I agree with this direction but the problem was with bandwidth.  As this
>>> controller was only used on legacy platforms, it is harder to justify new 
>>> effort
>>> on it now.
>>>
>>> Legacy? All PPC is legacy and not supported now?
>>
>> I'm not saying that they are not supported, but they are in maintenance only 
>> mode.


Linux 4.11: Reported regressions as of Tuesday, 2017-03-14

2017-03-14 Thread Thorsten Leemhuis
Hi! Find below my first regression report for Linux 4.11. It lists 9
regressions I'm currently aware of.

As always: Are you aware of any other regressions? Then please let me
know (simply CC regressi...@leemhuis.info). And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Sorry, I didn't compile any regression reports for 4.10: I 
didn't find time due to various reasons (vacation, a cold, regular 
work, and attending two conferences). Reminder: compiling these reports
has nothing to do with my paid job and I'm doing it in my spare time
just because I think someone should do it.

P.P.S: Dear Gmane mailing list archive web interface, please come
back soon. I really really miss you. It hurts every single day.
Don't you miss me, too? ;-)

== Current regressions ==

Desc: PowerPC crashes on boot, bisected to commit 5657933dbb6e
Repo: 2017-03-02 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1343553.html
Stat: 2017-03-09 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1349568.html
Note: patch hopefully heading mainline

Desc: thinkpad x220: GPU hang
Repo: 2017-03-05 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1345689.html
Stat: n/a 
Note: poked discussion for a status update

Desc: e1000e: __pci_enable_msi_range fails before/after resume
Repo: 2017-03-06 https://bugzilla.kernel.org/show_bug.cgi?id=194801
Stat: 2017-03-06 https://bugzilla.kernel.org/show_bug.cgi?id=194801#c1
Note: poked bug for status; might need to get forwarded to network people

Desc: Two batteries is detected on DEXP Ursus 7W tablet instead of one
Repo: 2017-03-07 https://bugzilla.kernel.org/show_bug.cgi?id=194811
Stat: 2017-03-12 https://bugzilla.kernel.org/show_bug.cgi?id=194811#c8
Note: patch likely heading mainline

Desc: [lkp-robot] [f2fs] 4ac912427c: -33.7% aim7.jobs-per-min regression
Repo: 2017-03-08 https://www.spinics.net/lists/kernel/msg2459239.html
Stat: 2017-03-13 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1353085.html
Note: patch discussed and heading mainline

Desc: VM with virtio-scsi drive often crashes during boot with kernel 4.11rc1
Repo: 2017-03-09 https://bugzilla.kernel.org/show_bug.cgi?id=194837
Stat: n/a 
Note: will forward this to scsi & virtio & kvm people

Desc: general protection fault: inet6_fill_ifaddr+0x6c/0x230
Repo: 2017-03-11 https://bugzilla.kernel.org/show_bug.cgi?id=194849
Stat: n/a 
Note: poked bug for status; might need to get forwarded to network people

Desc: Synaptics RMI4 touchpad regression in 4.11-rc1
Repo: 2017-03-11 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1351561.html
Stat: 2017-03-13 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1352399.html
Note: solution discussed, no patch yet

Desc: DRM BUG while initializing cape verde (2nd card)
Repo: 2017-03-13 https://bugzilla.kernel.org/show_bug.cgi?id=194867
Stat: n/a 
Note: patch proposed by reporter



Linux 4.11: Reported regressions as of Sunday, 2017-04-02

2017-04-02 Thread Thorsten Leemhuis
Hi! Find below my second regression report for Linux 4.11. It lists 13
regressions I'm currently aware of. It also lists 6 fixed regressions.
Some of them were in the first report from three weeks ago; a few were
supposed to go into a second report I prepared last week, but which I
wasn't able to finish :-/

As always: Are you aware of any other regressions? Then please let me
know (simply CC regressi...@leemhuis.info). And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports, that makes
compiling these reports a whole lot easier!

== Current regressions ==

Desc: malta_defconfig regressions
Repo: 2017-03-31 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1367470.html
Stat: n/a 
Note: some patches already heading mainline

Desc: Commit d8514d8edb5b ("ovl: copy up regular file using O_TMPFILE") breaks 
ubifs
Repo: 2017-03-28 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1363879.html
Stat: 2017-03-30 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1366190.html 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1366208.html
Note: patches being tested; tests looking good so far

Desc: 07ec51480b5e  ("virtio_pci: use shared interrupts for virtqueues") causes 
some kworker grief in -rt too
Repo: 2017-03-27 
https://www.mail-archive.com/search?l=mid&q=1490605644.14634.50.ca...@gmx.de
Stat: 2017-03-31 
https://www.mail-archive.com/search?l=mid&q=20170331082049.ga4...@lst.de
Note: hch is looking into this

Desc: HP 820 G3 becomes unstable after resume from suspend
Repo: 2017-03-25 https://bugzilla.kernel.org/show_bug.cgi?id=195041
Stat: 2017-03-26 
Note: might be a duplicate of 
https://bugzilla.kernel.org/show_bug.cgi?id=194801 ; revert for that bug is 
heading upstream via davem (see 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1361534.html )

Desc: NVMe APST? Samsung PM951 NVMe sudden controller death
Repo: 2017-03-25 https://bugzilla.kernel.org/show_bug.cgi?id=195039
Stat: 2017-03-29 
Note: Got luto and axboe into the loop, investigation ongoing, maybe a 
blacklist update is needed; issue might be the same as 
https://bugzilla.kernel.org/show_bug.cgi?id=194921 (see below)

Desc: NVMe APST? NVMe resets leads to capacity change to 0, leading to panics; 
Samsung SSD as well
Repo: 2017-03-18 https://bugzilla.kernel.org/show_bug.cgi?id=194921
Stat: 2017-03-28 
Note: Got luto and axboe into the loop, investigation ongoing; maybe a 
blacklist update is needed;  issue might be related to 
https://bugzilla.kernel.org/show_bug.cgi?id=195039 (see above)

Desc: Perf regression after enabling nvme APST
Repo: 2017-03-17 https://lkml.org/lkml/2017/3/17/177
Stat: 2017-03-20 https://lkml.org/lkml/2017/3/20/998
Note: luto: lying disk?

Desc: i915 gpu hangs under load
Repo: 2017-03-22 
https://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg116227.html 
https://bugs.freedesktop.org/show_bug.cgi?id=100181
Stat: 2017-04-02 
https://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg117315.html
Note: Reporter: "there's a fix out there. I don't know if it's in rc5 though."

Desc: 4.10/4.11: mmc: core: HS DDR switch, don't change timing before checking 
status, as it might lead to boot problems
Repo: 2017-03-10 https://patchwork.kernel.org/patch/9617489/
Stat: 2017-03-24 
Note: looks like the real root cause was found, but then the discussion stalled 
afaics

Desc: Synaptics RMI4 touchpad regression in 4.11-rc1: pointer jumps
Repo: 2017-03-11 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1351561.html
Stat: 2017-03-31 
https://www.mail-archive.com/search?l=mid&q=20170331085751.gf22...@mail.corp.redhat.com
Note: two patches to improve the situation available; discussion which to use

Desc: Synaptics RMI4 touchpad regression in 4.11-rc1: palm detection
Repo: 2017-03-11 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1351561.html
Stat: 2017-03-19 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1356832.html
Note: discussion stalled; asked for an update

Desc: e1000e: __pci_enable_msi_range fails before/after resume
Repo: 2017-03-06 https://bugzilla.kernel.org/show_bug.cgi?id=194801
Stat: 2017-03-14 https://bugzilla.kernel.org/show_bug.cgi?id=194801#c1
Note: revert for that bug is heading upstream via davem (see 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1361534.html )

Desc: thinkpad x220: GPU hang
Repo: 2017-03-05 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1345689.html
Stat: 2017-03-25 
https://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg180860.html
Note: Ignored by DRI people? Pavel wrote: "We know where the bug is, but 
there's no fix for it. There was one patch, but it was quickly withdrawn."


== Stalled, waiting for feedback from reporter ==

Desc: pine64 defconfig: WARNING: CPU: 0 PID: 86 at drivers/base/dd.c:349 
driver_probe_device+0x258/0x2c0
Repo: 2017-03-25 https://b

Linux 4.11: Reported regressions as of Sunday, 2017-04-09

2017-04-09 Thread Thorsten Leemhuis
Hi! Find below my third regression report for Linux 4.11. It lists 15
regressions I'm currently aware of. 5 regressions mentioned in last
week's report got fixed.

As always: Are you aware of any other regressions? Then please let me
know (simply CC regressi...@leemhuis.info). And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, as that makes compiling these reports a whole lot easier!

== Current regressions ==

Desc: Problems since 5b52330bbfe6 "audit: fix auditd/kernel connection state 
tracking"
Repo: 2017-04-09 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1373243.html
Stat: 2017-04-09 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1373315.html
Note: solution discussed

Desc: "Synaptics Touch Digitizer V04" no longer working
Repo: 2017-04-08 https://bugzilla.kernel.org/show_bug.cgi?id=195287
Stat: n/a 
Note: brand new

Desc: crash since 617f01211baf ("8139too: use napi_complete_done()")
Repo: 2017-04-07 
https://www.mail-archive.com/netdev@vger.kernel.org/msg162162.html
Stat: 2017-04-08 
https://www.mail-archive.com/netdev@vger.kernel.org/msg162285.html
Note: quite new, people are looking into it

Desc: Warning since RC4: sed_opal:OPAL: Error on step function: 0 with error 
-95: Unknown Error
Repo: 2017-04-07 https://bugzilla.kernel.org/show_bug.cgi?id=195277
Stat: 2017-04-07 https://bugzilla.kernel.org/show_bug.cgi?id=195277#c1
Note: patch submitted

Desc: networking/softirq performance regression due to a part of 374ad05ab64d
Repo: 2017-03-29 https://lkml.org/lkml/2017/3/29/758
Stat: 2017-04-05 https://lkml.org/lkml/2017/4/5/553
Note: might be stalled

Desc: 4.11 PowerMac G5 970MP: [drm:.r600_ring_test [radeon]] *ERROR* radeon: 
ring 0 test failed (scratch(0x8504)=0xCAFEDEAD
Repo: 2017-03-24  https://bugs.freedesktop.org/show_bug.cgi?id=99851#c26
Stat: n/a 
Note: looks like reporter (who does not see the problem in 4.10) could need 
some help bisecting

Desc: Commit d8514d8edb5b ("ovl: copy up regular file using O_TMPFILE") breaks 
ubifs
Repo: 2017-03-28 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1363879.html
Stat: 2017-03-30 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1366190.html 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1366208.html
Note: looks stalled after patches to fix issue seemed to do well in tests

Desc: 07ec51480b5e  ("virtio_pci: use shared interrupts for virtqueues") causes 
some kworker grief in -rt too
Repo: 2017-03-27 
https://www.mail-archive.com/search?l=mid&q=1490605644.14634.50.ca...@gmx.de
Stat: 2017-04-07 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1372848.html
Note: WIP

Desc: pine64 defconfig: WARNING: CPU: 0 PID: 86 at drivers/base/dd.c:349 
driver_probe_device+0x258/0x2c0
Repo: 2017-03-25 https://bugzilla.kernel.org/show_bug.cgi?id=195037 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1368488.html
Stat: 2017-03-28 https://bugzilla.kernel.org/show_bug.cgi?id=195037#c2
Note: André doesn't have a clue and needs someone with insights into devm and devres

Desc: NVMe APST? Samsung PM951 NVMe sudden controller death
Repo: 2017-03-25 https://bugzilla.kernel.org/show_bug.cgi?id=195039
Stat: 2017-03-29 
Note: Poked luto, as it looked stalled; maybe a blacklist update is needed;  
issue might be related to https://bugzilla.kernel.org/show_bug.cgi?id=194921 
(see below)

Desc: NVMe APST? NVMe resets leads to capacity change to 0, leading to panics; 
Samsung SSD as well
Repo: 2017-03-18 https://bugzilla.kernel.org/show_bug.cgi?id=194921
Stat: 2017-03-28 
Note: Poked luto, as it looked stalled; maybe a blacklist update is needed;  
issue might be related to https://bugzilla.kernel.org/show_bug.cgi?id=195039 
(see above)

Desc: Perf regression after enabling nvme APST
Repo: 2017-03-17 https://lkml.org/lkml/2017/3/17/177
Stat: 2017-03-20 https://lkml.org/lkml/2017/3/20/998
Note: stalled

Desc: i915 gpu hangs under load
Repo: 2017-03-22 
https://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg116227.html 
https://bugs.freedesktop.org/show_bug.cgi?id=100181 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1372182.html
Stat: 2017-04-07 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1372182.html
Note: A few patches from Andrea (who's seeing problems as well) are being 
discussed

Desc: Synaptics RMI4 touchpad regression in 4.11-rc1: pointer jumps
Repo: 2017-03-11 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1351561.html
Stat: 2017-03-31 
https://www.mail-archive.com/search?l=mid&q=20170331085751.gf22...@mail.corp.redhat.com
Note: stalled; two patches to improve the situation available

Desc: Synaptics RMI4 touchpad regression in 4.11-rc1: palm detection
Repo: 2017-03-11 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1351561.html
Stat: 2017-03-19 
https://www.mail-archive.com/linux-kernel@

Linux 4.11: Reported regressions as of Friday, 2017-04-21

2017-04-21 Thread Thorsten Leemhuis
Hi! Find below my fourth regression report for Linux 4.11. It lists 10
regressions I'm currently aware of. 7 regressions mentioned in last
week's report got fixed.

As always: Are you aware of any other regressions? Then please let me
know (simply CC regressi...@leemhuis.info). And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, it makes compiling these reports a whole lot easier!

== Current regressions ==

Desc: Client system gets spammed with 'NFSv4 Callback' kthreads - eventually 
causing NFS shares to become unresponsive
Repo: 2017-04-17 https://bugzilla.kernel.org/show_bug.cgi?id=195449
Stat: n/a 
Note: quite new, needs to get forwarded to the network people

Desc: boot problems when virtio-scsi is used
Repo: 2017-04-19 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1380561.html
Stat: 2017-04-20 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1381637.html
Note: patch heading mainline

Desc: unable to handle kernel NULL pointer dereference, 
mtip_irq_handler+0x262/0x3c0 [mtip32xx]
Repo: 2017-04-12 https://bugzilla.kernel.org/show_bug.cgi?id=195429
Stat: n/a 
Note: not entirely clear if this is a regression, but bhelgaas is looking into 
it

Desc: tytso: systemd doesn't see most devices
Repo: 2017-04-11 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1375255.html
Stat: 2017-04-12 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1375851.html
Note: heisenbug?

Desc: crash since 617f01211baf ("8139too: use napi_complete_done()")
Repo: 2017-04-07 
https://www.mail-archive.com/netdev@vger.kernel.org/msg162162.html
Stat: 2017-04-08 
https://www.mail-archive.com/netdev@vger.kernel.org/msg162285.html
Note: there is a patch in that thread, but it afaics is not heading mainline 
and the discussion stalled

Desc: Busy softirq can cause userspace not to be scheduled
Repo: 2017-03-28 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1363884.html 
Stat: 2017-04-20 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1381585.html
Note: Frederic: "working on reproducing that one"

Desc: Commit d8514d8edb5b ("ovl: copy up regular file using O_TMPFILE") breaks 
ubifs
Repo: 2017-03-28 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1363879.html
Stat: 2017-04-17 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1378740.html
Note: Patch hopefully heading upstream soon

Desc: pine64 defconfig: WARNING: CPU: 0 PID: 86 at drivers/base/dd.c:349 
driver_probe_device+0x258/0x2c0
Repo: 2017-03-25 https://bugzilla.kernel.org/show_bug.cgi?id=195037 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1368488.html
Stat: 2017-04-18 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1379209.html
Note: André doesn't have a clue and needs someone with insight into devm and 
devres; seems to be gone in -next

Desc: NVMe APST: Samsung NVMe sudden controller death
Repo: 2017-03-18 https://bugzilla.kernel.org/show_bug.cgi?id=195039 
https://bugzilla.kernel.org/show_bug.cgi?id=194921
Stat: 2017-04-20 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1381954.html
Note: blacklist update heading mainline 

Desc: Perf regression after enabling nvme APST
Repo: 2017-03-17 https://lkml.org/lkml/2017/3/17/177
Stat: 2017-04-20 https://lkml.org/lkml/2017/3/20/998
Note: stalled


== Stalled, waiting for feedback from reporter ==

Desc: kvm: workqueue lockup
Repo: 2017-03-14 https://bugzilla.kernel.org/show_bug.cgi?id=194883
Stat: 2017-03-27 https://bugzilla.kernel.org/show_bug.cgi?id=194883#c1
Note: asked reporter if issue still present in RC4

Desc: general protection fault: inet6_fill_ifaddr+0x6c/0x230
Repo: 2017-03-11 https://bugzilla.kernel.org/show_bug.cgi?id=194849
Stat: 2017-03-13 https://bugzilla.kernel.org/show_bug.cgi?id=194849#c1
Note: Cong Wang asked reporter for details, didn't get any reply yet

Desc: 4.11 PowerMac G5 970MP: [drm:.r600_ring_test [radeon]] *ERROR* radeon: 
ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
Repo: 2017-03-24 https://bugs.freedesktop.org/show_bug.cgi?id=99851#c26
Stat: n/a 
Note: stalled; looks like reporter (who does not see the problem in 4.10) could 
need some help bisecting


== Going to be removed from the list ==

Desc: i915 gpu hangs under load (2)
Repo: 2017-03-22 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1372182.html
Stat: 2017-04-07 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1372182.html
Note: Andrea saw issues and sent some patches that were discussed; some were 
applied, others discussed, but then the discussion stalled; I assume it's 
fixed, as Andrea knows how all this works

Desc: Warning since RC4: sed_opal:OPAL: Error on step function: 0 with error 
-95: Unknown Error
Repo: 2017-04-07 https://bugzilla.kernel.org/show_bug.cgi?id=195277
Stat: 2017-04-07 https://www.spinics.net/lists/linux-block/msg11136.html

Linux 4.14: Reported regressions as of Sunday, 2017-10-01

2017-10-01 Thread Thorsten Leemhuis
Hi! Find below my first regression report for Linux 4.14. It lists 4
regressions I'm currently aware of (two of the reports are my own). I
skimmed LKML, bugzilla.kernel.org, but those were all I found that
looked worthy. And nobody pointed me to any regressions directly. Sigh.
Either we are doing really well this cycle or nobody wants his
regression tracked...

As always: Are you aware of any other regressions? Then please let me
know by mail (a simple bounce in my direction is enough!). For details
see http://bit.ly/lnxregtrackid And please tell me if there is anything
in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

Module removal-related regression
Status: Fix "driver core: suppress sending MODALIAS in UNBIND uevents"
is sitting in greg's driver-core-linus
Reported: 2017-09-09
https://marc.info/?l=linux-kernel&m=150497889508778
Cause: 1455cf8dbfd06aa7651dcfccbadb7a093944ca65

stalls, short lived or long lived lockups very shortly after boot.
Status: Nothing new since 10+ days; maybe fixed already? poked list for
a status update
Reported: 2017-09-19
https://marc.info/?l=linux-kernel&m=150583434416295
Cause: 74def747bcd0

CIFS SMB2+ combined with Python's xattr.listxattr leads to "IOError:
[Errno 61]"
Status: no reaction from developers yet; need to poke list again
Note: Disclaimer: A regression the regression tracker reported
Reported: 2017-09-26
https://marc.info/?l=linux-cifs&m=150644485708526
Cause: 8dc5b3a6cb2f (assumed)

Ath10k disconnects
Status: brand new, might turn out to be a false alarm
Note: Disclaimer: A regression the regression tracker reported
Reported: 2017-10-01
http://lists.infradead.org/pipermail/ath10k/2017-October/010189.html


== Reported, but not added to the report for one reason or another ==

New default s2idle does not work on Dell XPS 13 9360
Status: Might be fixed, asked for a status update.
https://bugzilla.kernel.org/show_bug.cgi?id=196907

dd1c1f2f20: will-it-scale.per_process_ops -5% regression
Status: Linus: "Sadly, while I love the concept of performance tracking,
the "will-it-scale" reports haven't really been reliable enough to
really be useful." https://lkml.org/lkml/2017/9/3/103
https://marc.info/?l=linux-kernel&m=150457748221543

52306e882f: stress-ng.lockofd.ops_per_sec -11% regression
Status: see above
https://marc.info/?l=linux-kernel&m=150658588810343

9e52fc2b50:  will-it-scale.per_thread_ops -16% regression
Status: see above; but there was some discussion about this here, so
maybe it should be on the list
https://marc.info/?l=linux-kernel&m=150649209519462



Linux 4.14: Reported regressions as of Sunday, 2017-10-08

2017-10-08 Thread Thorsten Leemhuis
Hi! Find below my second regression report for Linux 4.14. It lists 8
regressions I'm currently aware of. One regression was fixed since last
week's report. One was in there that shouldn't have been there.

As always: Are you aware of any other regressions? Then please let me
know by mail (a simple bounce in my direction is enough!). For details
see http://bit.ly/lnxregtrackid And please tell me if there is anything
in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to Adam and Igor for pointing me at two regressions they face.
And thx to Yanko for pointing out a stupidity I did in last week's report.

== Current regressions ==

"hangs when building e.g. perf" & "Random insta-reboots on AMD Phenom II"
Status: "Mr. Luto better revert the new lazy TLB flushing fun'n'games"
-> "Yeah, working on it.  It's not a straightforward revert."
Note: TWIMC: Workaround: wrmsr -a 0xc0010015 0x118
Reported: 2017-09-05
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1484723.html
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1501379.html
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1501570.html
Cause: 94b1b03b519b81c494900cb112aa00ed205cc2d9

New default s2idle does not work on Dell XPS 13 9360
Status: works fine for several other owners of this laptop; maybe
specific to the variant or the particular machine the reporter owns;
looks related to the storage device used
Reported: 2017-09-11
https://bugzilla.kernel.org/show_bug.cgi?id=196907
Cause: e870c6c87cf9484090d28f2a68aa29e008960c93 (assumed)

CIFS SMB2+ combined with Python's xattr.listxattr leads to "IOError:
[Errno 61]"
Status: no reaction from developers yet; reporter needs to reverify and
poke list
Note: Disclaimer: A regression the regression tracker reported
Reported: 2017-09-26
https://marc.info/?l=linux-cifs&m=150644485708526
Cause: 8dc5b3a6cb2f (assumed)

Ath10k disconnects
Status: WIP, recently got bisected
Note: Only happens with some wifi routers; Disclaimer: A regression the
regression tracker reported
Reported: 2017-10-01
http://lists.infradead.org/pipermail/ath10k/2017-October/010189.html
Cause: c9353bf483d3724c116a9d502c0ead9cec54a61a

Oops in nouveau_fbcon_set_suspend_work during boot or resume on some
machines
Status: No reaction from developers; ask reporter for bisection
Note: Something specific to ThinkPad W530 and W531?
Reported: 2017-10-02
https://bugzilla.kernel.org/show_bug.cgi?id=197103
https://bugs.freedesktop.org/show_bug.cgi?id=102381

networking doesn't work in opensuse 42.2 due to apparmor: add base
infastructure for socket mediation
Status: there was a debate if this is a regression or not that faded out
Reported: 2017-10-03
https://lkml.kernel.org/r/1507003338.3174.4.ca...@hansenpartnership.com
Cause: 651e28c5537abb39076d3949fb7618536f1d242e

nftables oops with 4.14.0-rc3 on arm64 (Rock64 board)
Status: Tested fix: http://patchwork.ozlabs.org/patch/821334/
Reported: 2017-10-04
https://bugzilla.kernel.org/show_bug.cgi?id=197123
Cause: 9f08ea848117

WiFi stopped working with 4.14
Status: told reporters they should better bring this to netdev
Note: maybe these are two different issues; one with rtl8723bs and one
with 8265 that is related to switching BT on and off
Reported: 2017-10-05
https://bugzilla.kernel.org/show_bug.cgi?id=197137


== Fixed since last report ==

stalls, short lived or long lived lockups very shortly after boot.
Status: got reverted weeks ago and I missed it when compiling last week's
report (sorry!)
Reported: 2017-09-19
https://marc.info/?l=linux-kernel&m=150583434416295
Cause: 74def747bcd0

Module removal-related regression
Status: Fixed: 6878e7de6af726de47f9f3bec649c3f49e786586
Reported: 2017-09-09
https://marc.info/?l=linux-kernel&m=150497889508778
Cause: 1455cf8dbfd06aa7651dcfccbadb7a093944ca65


Linux 4.14: Reported regressions as of Sunday, 2017-10-15

2017-10-15 Thread Thorsten Leemhuis
Hi! Find below my third regression report for Linux 4.14. It lists 9
regressions I'm currently aware of. Two regressions got fixed since last
week's report.

As always: Are you aware of any other regressions? Then please let me
know by mail (a simple bounce or forward in my direction is enough!).
For details see http://bit.ly/lnxregtrackid And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

New default s2idle does not work on Dell XPS 13 9360 with hynix 512GB
Status: works fine for several other owners of this laptop; appears to
be related to NVMe (and specifically to the particular Hynix 512G NVMe
SSD in that machine)
Note: this seems to be a regression in 4.13 that is exposed by a change
in 4.14; judge yourself if this should be in the list or not
Reported: 2017-09-11
https://bugzilla.kernel.org/show_bug.cgi?id=196907
https://lkml.kernel.org/r/3347223.tl9vk2v...@aspire.rjw.lan
Cause: e870c6c87cf9484090d28f2a68aa29e008960c93 (assumed)

CIFS SMB2+ combined with Python's xattr.listxattr leads to "IOError:
[Errno 61]"
Status: no reaction from developers yet; reporter too lazy, he still
needs to reverify and poke list
Note: Disclaimer: A regression the regression tracker reported
Reported: 2017-09-26
https://marc.info/?l=linux-cifs&m=150644485708526
Cause: 8dc5b3a6cb2f (assumed)

Ath10k disconnects
Status: no progress in the last week; will poke on Monday
Note: Only happens with some wifi routers; Disclaimer: A regression the
regression tracker reported
Reported: 2017-10-01
http://lists.infradead.org/pipermail/ath10k/2017-October/010189.html
Cause: c9353bf483d3724c116a9d502c0ead9cec54a61a

Oops in nouveau_fbcon_set_suspend_work during boot or resume on some
machines
Status: Three people see this now; one is considering a bisection
Reported: 2017-10-02
https://bugzilla.kernel.org/show_bug.cgi?id=197103
https://bugs.freedesktop.org/show_bug.cgi?id=102381

networking doesn't work in opensuse 42.2 due to apparmor: add base
infastructure for socket mediation
Status: stalled: after the report by James there was a debate if this
is a regression or not that in the end faded out
Note: Maybe this should be removed from the list
Reported: 2017-10-03
https://lkml.kernel.org/r/1507003338.3174.4.ca...@hansenpartnership.com
Cause: 651e28c5537abb39076d3949fb7618536f1d242e

WiFi stopped working with 4.14 (two reports: a staging driver and iwlwifi)
Status: told reporters a week ago they should better bring this to
netdev; didn't happen afaics
Note: maybe these are two different issues; one with rtl8723bs
(Staging!) and one with 8265 that is related to switching BT on
and off
Reported: 2017-10-05
https://bugzilla.kernel.org/show_bug.cgi?id=197137

Dramatic lockdep slowdown in 4.14
Status: WIP
Note: "Now since it's lockdep I guess this can't really be considered a
regression if these changes did improve lockdep correctness, but still,
this dramatic slow down essentially forces me to disable PROVE_LOCKING
by default on this system."
Reported: 2017-10-13
https://lkml.kernel.org/r/20171013090333.GA17356@localhost
Cause: 28a903f63ec0

Hikey620: it's easy to trigger a panic with "rcu_preempt detected stalls
on CPUs/tasks"
https://lkml.kernel.org/r/20171010142725.GA24797@leoy-linaro
Status: WIP
Cause: e3067861ba66

On first generation i486 processors it immediately resets the system
after the "Booting the kernel" message.
Status: brand new
https://lkml.kernel.org/r/cap8wd_a-6dpezhrsq4yrkkkmypkxw1wobxchj0ojcrvjmsc...@mail.gmail.com
Cause: 87e81786b13b267c4355e0d23e33c7e4c08fa63f


== Fixed since last report ==

"hangs when building e.g. perf" & "Random insta-reboots on AMD Phenom II"
Status: Fixed by https://git.kernel.org/torvalds/c/67bb8e999e0a
Reported: 2017-09-05
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1484723.html
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1501379.html
https://lkml.kernel.org/r/cover.1508000261.git.l...@kernel.org
Cause: 94b1b03b519b81c494900cb112aa00ed205cc2d9

nftables oops with 4.14.0-rc3 on arm64 (Rock64 board)
Status: Fixed by https://git.kernel.org/torvalds/c/5f9bfe0ef622
Reported: 2017-10-04
https://bugzilla.kernel.org/show_bug.cgi?id=197123
Cause: 9f08ea848117


Linux 4.14: Reported regressions as of Sunday, 2017-10-29

2017-10-29 Thread Thorsten Leemhuis
Hi! Find below my fourth regression report for Linux 4.14. It lists 6
regressions I'm currently aware of; for most of them fixes are in the
works. 4 regressions got fixed since the last report; 1 turned out to
not be a regression.

As always: Are you aware of any other regressions? Then please let me
know by mail (a simple bounce or forward in my direction is enough!).
For details see http://bit.ly/lnxregtrackid And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Sorry, was travelling to kernel summit, that's why there wasn't a
report last week

== Current regressions ==

New default s2idle does not work on Dell XPS 13 9360 with hynix 512GB
Status: Patch to blacklist the problematic machines up for testing
Note: appears to be related to NVMe (and specifically to the particular
Hynix 512G NVMe SSD in the reporter's machine); this seems to be a
regression in 4.13 that is exposed by a change in 4.14; judge yourself
if this should be in the list or not
Reported: 2017-09-11
https://bugzilla.kernel.org/show_bug.cgi?id=196907
Cause: e870c6c87cf9484090d28f2a68aa29e008960c93 (indirectly)

CIFS SMB2+ combined with Python's xattr.listxattr leads to "IOError:
[Errno 61]"
Status: reporter was asked to provide additional details, but didn't
provide them yet (sorry, been busy)
Note: Disclaimer: A regression the regression tracker reported
Reported: 2017-09-26
https://marc.info/?l=linux-cifs&m=150644485708526
Cause: 8dc5b3a6cb2f (assumed)

Ath10k disconnects
Status: revert planned:
http://lists.infradead.org/pipermail/ath10k/2017-October/010368.html
Note: Only happens with some wifi routers; Disclaimer: A regression the
regression tracker reported
Reported: 2017-10-01
http://lists.infradead.org/pipermail/ath10k/2017-October/010189.html
Cause: c9353bf483d3724c116a9d502c0ead9cec54a61a

Oops in nouveau_fbcon_set_suspend_work during boot or resume on some
machines
Status: asked if https://git.kernel.org/torvalds/c/481376632537 fixes
this; if not then bisect needed
Reported: 2017-10-02
https://bugzilla.kernel.org/show_bug.cgi?id=197103
https://bugs.freedesktop.org/show_bug.cgi?id=102381

blk_partition_remap: fail for partition 3 on ARM board Odroid U3, with
root fs on eMMC
Status: told reporter to bring this to the list and CC the developers in
the suspected commit
Reported: 2017-10-17
https://bugzilla.kernel.org/show_bug.cgi?id=197303
Cause: 74d46992e0d9dee7f1f376de0d56d31614c8a17a (likely)

tun devices not working anymore in openvpn
Status: Fix in dave's tree already: 5c25f65fd1e42685f7ccd80e0621829c105785d9
Reported: 2017-10-28
https://lkml.kernel.org/r/ac1aaeab-b9f5-a034-56a8-4305494db...@eikelenboom.it
Cause: 0ad646c81b2182f7fa67ec0c8c825e0ee165696d


== Going to get removed ==

Hikey620: it's easy to trigger a panic with "rcu_preempt detected stalls
on CPUs/tasks"
Status: "[…] this issue is quite likely related with CA53 errata,
especialy ERRATA_A53_855873 is the relative one. So I changed to use
ARM-TF mainline code with ERRATA fixing, this issue can be dismissed. […]"
https://lkml.kernel.org/r/20171010142725.GA24797@leoy-linaro
Cause: e3067861ba66


== Fixed since last report ==

networking doesn't work in opensuse 42.2 due to apparmor: add base
infastructure for socket mediation
Status: Fixed by a revert: https://git.kernel.org/torvalds/c/80c094a47dd4
Reported: 2017-10-03
https://lkml.kernel.org/r/1507003338.3174.4.ca...@hansenpartnership.com
Cause: 651e28c5537abb39076d3949fb7618536f1d242e

WiFi stopped working with 4.14 (two reports: a staging driver and iwlwifi)
Status: Fixed by a revert: https://git.kernel.org/torvalds/c/80c094a47dd4
Reported: 2017-10-05
https://bugzilla.kernel.org/show_bug.cgi?id=197137
Cause: 651e28c5537abb39076d3949fb7618536f1d242e

Dramatic lockdep slowdown in 4.14
Status: fixed by a revert: https://git.kernel.org/torvalds/c/b483cf3bc249
Reported: 2017-10-13
https://lkml.kernel.org/r/20171013090333.GA17356@localhost
Cause: 28a903f63ec0

On first generation i486 processors it immediately resets the system
after the "Booting the kernel" message.
Status: Fixed by https://git.kernel.org/torvalds/c/9c48c0965b97
https://lkml.kernel.org/r/cap8wd_a-6dpezhrsq4yrkkkmypkxw1wobxchj0ojcrvjmsc...@mail.gmail.com
Cause: 87e81786b13b267c4355e0d23e33c7e4c08fa63f


Linux 4.9: Reported regressions as of Sunday, 2016-12-04

2016-12-04 Thread Thorsten Leemhuis
Hi! Here is my fifth regression report for Linux 4.9. It lists 11
regressions I'm aware of. 4 of them are new; 6 got fixed since 
the last report -- that was two weeks ago, because I yet again
didn't find any spare time to compile a report last Sunday :-/

As always: Are you aware of any other regressions? Then please let me
know (simply CC regressi...@leemhuis.info). And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

Desc: System hang up with call trace during doing S3/S4 stress
Repo: 16-12-01 https://bugzilla.kernel.org/show_bug.cgi?id=189421
Stat: n/a 
Note: Brand new

Desc: x86/unwind: Fix guess-unwinder regression // With frame pointers 
disabled, /proc/<pid>/stack is broken
Repo: 16-11-28 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1281244.html
Stat: n/a 
Note: Fix heading upstream

Desc: [lkp] [mremap] 5d1904204c: will-it-scale.per_thread_ops -13.1% regression
Repo: 16-11-27 https://www.spinics.net/lists/linux-mm/msg117307.html
Stat: n/a 
Note: Aaron could not reproduce the issue on two of his machines

Desc: sched: fix find_idlest_group for fork/performance regression in hackbench
Repo: 16-11-25 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1280514.html
Stat: 16-12-04 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1285936.html
Note: Report contains patch to fix this, testing ongoing

Desc: GPU hang on resume from hibernation
Repo: 16-10-16 https://bugs.freedesktop.org/show_bug.cgi?id=98288#c9 
https://bugzilla.kernel.org/show_bug.cgi?id=177701
Stat: n/a 
Note: confusing bug, but seems the issue is still present

Desc: GPU hang on PlaneShift
Repo: 16-10-16 https://bugs.freedesktop.org/show_bug.cgi?id=98922 
https://bugzilla.kernel.org/show_bug.cgi?id=177701
Stat: n/a 
Note: confusing bug, but the issue might still be present

Desc: [i.MX6 DRM IPUv3] Regression 4.9-rc5: greenish screen with YUV420 video
Repo: 16-11-17 https://www.spinics.net/lists/kernel/msg2385550.html
Stat: 16-12-02 https://www.spinics.net/lists/kernel/msg2396720.html
Note: patch available: 3fd8b292ae6b ("drm/imx: ipuv3-plane: merge 
ipu_plane_atomic_set_base into atomic_update")

Desc: Failed boots/Package drops bisected to 4cd13c21b207 "softirq: Let 
ksoftirqd do its job"
Repo: 16-11-16 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1273344.html
Stat: 16-11-25 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1280379.html
Note: Stalled

Desc: builddeb: fix cross-building to arm64 producing host-arch debs
Repo: 16-11-04 https://www.spinics.net/lists/linux-kbuild/msg13635.html
Stat: 16-11-11 https://www.spinics.net/lists/linux-kbuild/msg13696.html
Note: Nothing happened when Adam pinged Michael in 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1276268.html


== Stalled, waiting for feedback from reporter ==

Desc: 4.9-rc1 boot regression, ambiguous bisect result
Repo: 2016-10-19 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1253369.html
Stat: 16-10-21 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1255296.html
Note: Waiting for Dan or someone else to look into this

Desc: Skylake gen6 suspend/resume video regression
Repo: 16-10-16 https://bugzilla.kernel.org/show_bug.cgi?id=177731 
https://bugs.freedesktop.org/show_bug.cgi?id=98517
Stat: 16-10-31 https://bugzilla.kernel.org/show_bug.cgi?id=177731#c7
Note: Stalled

== Fixed since last report ==

Desc: module loading broken due to kbuild changes
Repo: 16-10-15 http://www.gossamer-threads.com/lists/linux/kernel/2544734 and 
various other threads (https://lwn.net/Articles/707520/ )
Fix:  
https://git.kernel.org/torvalds/c/cd3caefb4663e3811d37cc2afad3cce642d60061 
https://git.kernel.org/torvalds/c/faaae2a581435f32781a105dda3501df388fddcb 
(among others)

Desc: irq 16: nobody cared (try booting with the "irqpoll" option) since 
0b9e2988ab226 (ahci: use pci_alloc_irq_vectors)
Repo: 16-11-19 https://bugzilla.kernel.org/show_bug.cgi?id=188181
Stat: 16-12-03 https://bugzilla.kernel.org/show_bug.cgi?id=188181#c9
Note: Fixed according to reporter (might be thx to 
https://git.kernel.org/torvalds/c/6929ef385e09c0065b87fda3e7b872a5070ac783 )

Desc: build regression: make.cross ARCH=mips fails with "No rule to make 
target 'alchemy/devboards/'."
Repo: 16-10-30 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1262410.html 
https://marc.info/?l=linux-kernel&m=147780880425626
Fix:  https://git.kernel.org/torvalds/c/818f38c5b7c4482abd71c64ac4d49911fbefaf9e

Desc: oops due to 493b2ed3f760 ("crypto: algif_hash - Handle NULL hashes 
correctly")
Repo: 16-11-17 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1273867.html
Fix:  https://git.kernel.org/torvalds/c/a8348bca2944d397a528772f5c0ccb47a8b58af4

Desc: MSI is no longer enabled for many/most Intel SATA controllers in 4.9
Repo: 16-11-16 https://bugzilla.kernel.org/show_bug.cgi?id=187821
Fix:  https://git.kernel.org/

Linux 4.16: Reported regressions as of Monday, 2018-02-19 (Was: Linux 4.16-rc2)

2018-02-19 Thread Thorsten Leemhuis
Hi! Find below my first regression report for Linux 4.16. It lists 2
regressions I'm currently aware of.

Are you aware of any other regressions? Then please let me know by mail
(a simple bounce or forward to the email address is enough!).

For details see http://bit.ly/lnxregtrackid And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Sorry, I didn't find spare time to work on regression reporting
and related issues during the 4.15 cycle. I also have a huge todo list
after the regression tracking discussion during maintainer and kernel
summit last fall. I will try to work on that sooner or later; that
in the end will change nearly all aspects of how regression tracking is
done currently. But that will take some time (which I struggle to find
currently :-/ ) to realize, so for this cycle I'm continuing to do
regression reports like I used to do them.

== Current regressions ==

Debian kernel package tool make-kpkg stalls indefinitely during kernel
build due to commit "kconfig: remove check_stdin()"
Status: stalled after some discussions
Note: Shouldn't be a problem to back this one out either if it turns out
to cause massive amounts of pain in practice I guess, even if it's the
Debian tools doing something weird.
Reported: 2018-02-12
https://marc.info/?l=linux-kernel&m=151846414807219
Cause: d2a04648a5dbc3d1d043b35257364f0197d4d868
Linux-Regression-ID: 2fd778

DM Regression: read() returns data when it shouldn't
Status: Developers are looking into it
Reported: 2018-02-14
https://marc.info/?l=linux-kernel&m=151861337518109&w=2
Cause: 18a25da84354c6bb655320de6072c00eda6eb602
Linux-Regression-ID: 9e195f


Linux 4.16: Reported regressions as of Monday, 2018-02-26 (Was: Linux 4.16-rc3)

2018-02-26 Thread Thorsten Leemhuis
On 26.02.2018 04:05, Linus Torvalds wrote:
> We're on the normal schedule for 4.16 and everything still looks very regular.

Hi! Find below my second regression report for Linux 4.16. It lists 8
regressions I'm currently aware of.

To anyone reading this: Are you aware of any other regressions that got
introduced this development cycle? Then please let me know by mail (a
simple bounce or forward to the email address is enough!).

For details see http://bit.ly/lnxregtrackid And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

Debian kernel package tool make-kpkg stalls indefinitely during kernel
build due to commit "kconfig: remove check_stdin()"
Status: stalled after some discussions; seems nobody really cares deeply?
Note: From the discussion: "Shouldn't be a problem to back this one out
either if it turns out to cause massive amounts of pain in practice I
guess, even if it's the Debian tools doing something weird."
Reported: 2018-02-12
https://marc.info/?l=linux-kernel&m=151846414807219
Cause: d2a04648a5dbc3d1d043b35257364f0197d4d868
Linux-Regression-ID: 2fd778

Dell R640 does not boot due to SCSI/SATA failure
Reported: 2018-02-22
https://marc.info/?l=linux-kernel&m=151931128006031
Cause: 84676c1f21e8
Linux-Regression-ID: 15a115

On Nokia N900:/dev/input/event6 aka AV Jack support disappeared
Status: quite new
Reported: 2018-02-24
https://marc.info/?l=linux-omap&m=151950886524308&w=2
Cause: 14e3e295b2b9
Linux-Regression-ID: 4b650f

SD card reader stopped working
Status: quite new
Reported: 2018-02-24
https://bugzilla.kernel.org/show_bug.cgi?id=198917
Linux-Regression-ID: 9adeaf

[mm, mlock, vmscan]  9c4e6b1a70:  stress-ng.hdd.ops_per_sec -7.9% regression
Status: WIP
Note: performance regression found by lkp-robot
Reported: 2018-02-25
https://marc.info/?l=linux-kernel&m=151956997301994
Cause: 9c4e6b1a7027f102990c0395296015a812525f4d

aim7.jobs-per-min -18.0% regression
Status: WIP; bisection result is doubted, but was verified
Note: performance regression found by lkp-robot
Reported: 2018-02-25
https://marc.info/?l=linux-kernel&m=151957120702272&w=2
Cause: c0cef30e4ff0dc025f4a1660b8f0ba43ed58426e

Interrupt storm after suspend causes one busy kworker
Status: quite new
Reported: 2018-02-25
https://bugzilla.kernel.org/show_bug.cgi?id=198929
Linux-Regression-ID: 41c451


== Fix heading mainline ==

Dell XPS 13 9360 keyboard no longer works
Status: Fix in the platform tree from what I've heard
Reported: 2018-02-22
https://marc.info/?l=linux-kernel&m=151927645427980
Cause: 30323fb6d552c41997baca5292bf7001366cab57


== Fixed since last report ==

DM Regression: read() returns data when it shouldn't
Status: Was fixed already when last report was compiled
Reported: 2018-02-14
https://marc.info/?l=linux-kernel&m=151861337518109&w=2
Cause: 18a25da84354c6bb655320de6072c00eda6eb602
Linux-Regression-ID: 9e195f


== Fixed before it was properly added to the report ==

kconfig.h: Include compiler types to avoid missed struct attributes
https://git.kernel.org/torvalds/c/28128c61e08e

i2c-designware: sound and s2ram broken on Broadwell-U system since
commit ca382f5b38f367b6 (add i2c gpio recovery option)
https://git.kernel.org/torvalds/c/d1fa74520dcd


Linux 4.16: Reported regressions as of Monday, 2018-03-05 (Was: Linux 4.16-rc4)

2018-03-05 Thread Thorsten Leemhuis
On 05.03.2018 00:15, Linus Torvalds wrote:
> Hmm. A reasonably calm week - the biggest change is to the 'kvm-stat'
> tool, not any actual kernel files.

Hi! Find below my third regression report for Linux 4.16. It lists 7
regressions I'm currently aware of. 3 were fixed since last week's report.

To anyone reading this: Are you aware of any other regressions that got
introduced this development cycle? Then please let me know by mail (a
simple bounce or forward to the email address is enough!).

For details see http://bit.ly/lnxregtrackid And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

Dell R640 does not boot due to SCSI/SATA failure
Status: Reporter looked into this and indicated the change might have
triggered a firmware bug on his machine
Reported: 2018-02-22
https://marc.info/?l=linux-kernel&m=151931128006031
Last known developer activity:
https://marc.info/?l=linux-kernel&m=152026091325037
Cause: 84676c1f21e8
Linux-Regression-ID: 15a115

[mm, mlock, vmscan]  9c4e6b1a70:  stress-ng.hdd.ops_per_sec -7.9% regression
Status: WIP; side note: lkp-robot warned about something else triggered
by the same commit:
https://lkml.kernel.org/r/20180302093940.GE25699@yexl-desktop is related
Note: performance regression found by lkp-robot
Reported: 2018-02-25
https://marc.info/?l=linux-kernel&m=151956997301994
Cause: 9c4e6b1a7027f102990c0395296015a812525f4d

aim7.jobs-per-min -18.0% regression
Status: some discussion last week, but no real solution yet
Note: performance regression found by lkp-robot
Reported: 2018-02-25
https://marc.info/?l=linux-kernel&m=151957120702272&w=2
Cause: c0cef30e4ff0dc025f4a1660b8f0ba43ed58426e

Interrupt storm after suspend causes one busy kworker
Status: stalled?
Reported: 2018-02-25
https://bugzilla.kernel.org/show_bug.cgi?id=198929
Linux-Regression-ID: 41c451

hci_bcm: Streamline runtime PM code change for 4.16 kernel breaks
bluetooth on ASUS T100TA
Status: poked reporters for an update if they reported it to the relevant
developers
Reported: 2018-03-01
https://bugzilla.kernel.org/show_bug.cgi?id=198953
Cause: 43fff768346810042836df325d736bd2c2a634a7


== Regressions with fixes heading mainline ==

selftests: memory-hotplug: fix emit_tests regression
https://marc.info/?l=linux-kernel&m=151993543423651


== Going to get removed from the report ==

Debian kernel package tool make-kpkg stalls indefinitely during kernel
build due to commit "kconfig: remove check_stdin()"
Status: stalled after some discussions; seems nobody really cares that much
Note: From the discussion: "Shouldn't be a problem to back this one out
either if it turns out to cause massive amounts of pain in practice I
guess, even if it's the Debian tools doing something weird."
Reported: 2018-02-12
https://marc.info/?l=linux-kernel&m=151846414807219
Cause: d2a04648a5dbc3d1d043b35257364f0197d4d868
Linux-Regression-ID: 2fd778


== Fixed since last report ==

Dell XPS 13 9360 keyboard no longer works
Status: https://git.kernel.org/torvalds/c/de9647efeaa9
Reported: 2018-02-22
https://marc.info/?l=linux-kernel&m=151927645427980
Cause: 30323fb6d552c41997baca5292bf7001366cab57

on Nokia N900:/dev/input/event6 aka AV Jack support disappeared
Status: https://git.kernel.org/torvalds/c/6662ae6af82d
https://git.kernel.org/torvalds/c/ce27fb2c56db
Reported: 2018-02-24
https://marc.info/?l=linux-omap&m=151950886524308&w=2
Cause: 14e3e295b2b9
Linux-Regression-ID: 4b650f

SD card reader stopped working
Status: Fixed in 4.16.0-rc3 according to reporter
Reported: 2018-02-24
https://bugzilla.kernel.org/show_bug.cgi?id=198917
Linux-Regression-ID: 9adeaf



Linux 4.16: Reported regressions as of Monday, 2018-03-12 (Was: Linux 4.16-rc5)

2018-03-12 Thread Thorsten Leemhuis
On 12.03.2018 01:42, Linus Torvalds wrote:
> This continue to be pretty normal - this rc is slightly larger than
> rc4 was, but that looks like one of the normal fluctuations

Hi! Find below my fourth regression report for Linux 4.16. It lists 9
regressions I'm currently aware of. 1 was fixed since last week's report.

To anyone reading this: Are you aware of any other regressions that got
introduced this development cycle? Then please let me know by mail (a
simple bounce or forward to the sender of this email address is enough!).

For details see http://bit.ly/lnxregtrackid And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

Dell R640 does not boot due to SCSI/SATA failure
- Status: WIP; some other changes planned for 4.17 solve this regression,
but there is a discussion whether they are too big for 4.16-rc
- Cause: 84676c1f21e8
- Reported: 2018-02-22
https://marc.info/?l=linux-kernel&m=151931128006031
- Last known developer activity:
https://marc.info/?i=1520515113.20980.31.camel%40gmail.com
- Last relevant activity:
https://marc.info/?l=linux-kernel&m=152026091325037
- Linux-Regression-ID: 15a115

[mm, mlock, vmscan]  9c4e6b1a70:  stress-ng.hdd.ops_per_sec -7.9% regression
- Status: looks stalled; side note: lkp-robot warned about something
else triggered by the same commit:
https://lkml.kernel.org/r/20180302093940.GE25699@yexl-desktop is related
- Cause: 9c4e6b1a7027f102990c0395296015a812525f4d
- Reported: 2018-02-25
https://marc.info/?l=linux-kernel&m=151956997301994
- Note: performance regression found by lkp-robot

aim7.jobs-per-min -18.0% regression
- Status: looks stalled after some discussions happened the week before last
- Cause: c0cef30e4ff0dc025f4a1660b8f0ba43ed58426e
- Reported: 2018-02-25
https://marc.info/?l=linux-kernel&m=151957120702272&w=2
- Note: performance regression found by lkp-robot

Interrupt storm after suspend causes one busy kworker
- Status: Waiting for data from reporter
- Reported: 2018-02-25
https://bugzilla.kernel.org/show_bug.cgi?id=198929
- Linux-Regression-ID: 41c451

hci_bcm: Streamline runtime PM code change for 4.16 kernel breaks
bluetooth on ASUS T100TA
- Status: WIP
- Cause: 43fff768346810042836df325d736bd2c2a634a7
- Reported: 2018-03-01
https://bugzilla.kernel.org/show_bug.cgi?id=198953

Error updating SMART data during runtime and could not connect to lv
["Possible Regression"]
- Status: Brand new
- Reported: 2018-03-11
https://marc.info/?l=linux-kernel&m=152075643627082
https://bugzilla.kernel.org/show_bug.cgi?id=199077

Regression from efi: call get_event_log before ExitBootServices
- Status: WIP, looks like a firmware issue that gets triggered
- Cause: 33b6d03469b2
- Reported: 2018-03-06
https://marc.info/?l=linux-kernel&m=152035206220237&w=2

15% longer running times on lvm2 test suite
- Status: Quite new
- Cause: 44c02a2c3dc55835e9f0d8ef73966406cd805001
- Reported: 2018-03-11
https://marc.info/?l=linux-kernel&m=152077333230274


== Regressions with fixes heading mainline ==

[amdgpu CARRIZO] disabled backlight after S3 resume
- Status: Alex provided a patch "drm/amdgpu: save/restore backlight
level in legacy dce code"; assuming he'll send it upstream
- Cause: 4ec6ecf48c64d1da82a008f6fb0be86c4044287d
- Reported: 2018-03-07
https://bugzilla.kernel.org/show_bug.cgi?id=199047


== Fixed since last report ==

selftests: memory-hotplug: fix emit_tests regression
- Status: Fixed by https://git.kernel.org/torvalds/c/ba004a2955f7
- Cause: 16c513b13477
https://marc.info/?l=linux-kernel&m=151993543423651



Linux 4.16: Reported regressions as of Monday, 2018-03-19 (Was: Re: Linux 4.16-rc6)

2018-03-19 Thread Thorsten Leemhuis
On 19.03.2018 02:14, Linus Torvalds wrote:
> This has been a nice quiet week, so rc6 is pretty tiny. Everything
> looks like we're on a usual schedule - I'll make an rc7, but hopefully
> that will be it.

Hi! Find below my fifth regression report for Linux 4.16. It lists 7
regressions I'm currently aware of. 2 were fixed since last week's
report; 1 is new; 2 are going to be removed (see below for details).

Are you aware of any other regressions that got introduced this
development cycle? Then please let me know by mail (a simple bounce or
forward to the sender of this email address is enough!). And please tell
me if there is anything in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

Dell R640 does not boot due to SCSI/SATA failure
- Status: WIP; some other changes developed for 4.17 solve this
regression, but there is a discussion for a proper fix
- Cause:  https://git.kernel.org/torvalds/c/84676c1f21e8
- Reported: 2018-02-22
https://marc.info/?l=linux-kernel&m=151931128006031
- Last known developer activity: 2018-03-14
https://marc.info/?l=linux-block&m=152102086831636&w=2
- Other relevant links:
https://marc.info/?l=linux-block&m=152051511802229&w=2
https://marc.info/?l=linux-kernel&m=152026091325037

hci_bcm: Streamline runtime PM code change for 4.16 kernel breaks
bluetooth on ASUS T100TA
- Status: Hans is working on it, lots of activity in bugzilla
- Cause:  https://git.kernel.org/torvalds/c/43fff7683468
- Reported: 2018-03-01
https://bugzilla.kernel.org/show_bug.cgi?id=198953

Error updating SMART data during runtime and could not connect to lv
["Possible Regression"]
- Status: Two issues discussed here; not much progress yet on the
regression (latency issues in the MU03 version of the firmware,
triggered by polling SMART data, which causes lvmetad to time out in some
cases)
- Reported: 2018-03-11
https://marc.info/?l=linux-kernel&m=152075643627082
https://bugzilla.kernel.org/show_bug.cgi?id=199077
- Last known developer activity: 2018-03-19
https://marc.info/?l=linux-kernel&m=152145306610330
- Other relevant links:
https://www.mail-archive.com/linux-block@vger.kernel.org/msg18338.html
https://marc.info/?l=linux-scsi&m=152095303312164&w=2

15% longer running times on lvm2 test suite
- Status: Seems the real problem is in the way the test scripts interact
with the kernel
- Cause:  https://git.kernel.org/torvalds/c/44c02a2c3dc5
- Reported: 2018-03-11
https://marc.info/?l=linux-kernel&m=152077333230274
- Last known developer activity: 2018-03-13
https://marc.info/?l=linux-kernel&m=152097761921525

sdhci-acpi not recognizing eMMC
- Status: Told reporter to contact developers and ML by mail
- Cause:  https://git.kernel.org/torvalds/c/1b7ba57ecc86
- Reported: 2018-03-13
https://bugzilla.kernel.org/show_bug.cgi?id=199105


== Waiting for clarification from reporter ==

Interrupt storm after suspend causes one busy kworker
- Status: Still waiting for data from reporter
- Reported: 2018-02-25
https://bugzilla.kernel.org/show_bug.cgi?id=198929

AMDGPU Fury X random screen flicker on Linux kernel 4.16rc5
- Status: New, but data missing; unclear if this really is a regression
in 4.16, but looks a lot like one
- Reported: 2018-03-13
https://bugzilla.kernel.org/show_bug.cgi?id=199101


== Going to get removed from the report ==

aim7.jobs-per-min -18.0% regression
- Status: not an issue; looks like something weird happened when compiling
the kernel which led to bogus results
- Cause:  https://git.kernel.org/torvalds/c/c0cef30e4ff0
- Reported: 2018-02-25
https://marc.info/?l=linux-kernel&m=151957120702272&w=2
- Last known developer activity:
https://marc.info/?l=linux-kernel&m=152113601602820

[mm, mlock, vmscan]  9c4e6b1a70:  stress-ng.hdd.ops_per_sec -7.9% regression
- Status: looks stalled, seems nobody cares; side note: lkp-robot warned
about something else triggered by the same commit:
https://lkml.kernel.org/r/20180302093940.GE25699@yexl-desktop is related
- Cause:  https://git.kernel.org/torvalds/c/9c4e6b1a7027
- Reported: 2018-02-25
https://marc.info/?l=linux-kernel&m=151956997301994
- Note: performance regression found by lkp-robot


== Fixed since last report ==

Regression from efi: call get_event_log before ExitBootServices
- Status: Fixed by https://git.kernel.org/torvalds/c/79832f0b5f71
- Cause:  https://git.kernel.org/torvalds/c/33b6d03469b2
- Reported: 2018-03-06
https://marc.info/?l=linux-kernel&m=152035206220237&w=2

[amdgpu CARRIZO] disabled backlight after S3 resume
- Status: Fixed by https://git.kernel.org/torvalds/c/b5e324131697
- Cause:  https://git.kernel.org/torvalds/c/4ec6ecf48c64
- Reported: 2018-03-07
https://bugzilla.kernel.org/show_bug.cgi?id=199047


Linux 4.16: Reported regressions as of Tuesday, 2018-03-27 (Was: Linux 4.16-rc7)

2018-03-27 Thread Thorsten Leemhuis
On 26.03.2018 01:37, Linus Torvalds wrote:
> […] Anyway. Go out and test. And let's hope next week is nice and calm and
> I can release the final 4.16 next Sunday without any extra  rc's.
> 
>Linus

Hi! Find below my sixth regression report for Linux 4.16. It lists 7
regressions I'm currently aware of. 2 were fixed since last week's
report; 2 are new.

Are you aware of any other regressions that got introduced this
development cycle? Then please let me know by mail (a simple bounce or
forward to the sender of this email address is enough!). And please tell
me if there is anything in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

Dell R640 does not boot due to SCSI/SATA failure
- Status: Afaics still unfixed; lost track, asked reporter for an update
on Monday morning, no reply yet
- Cause:  https://git.kernel.org/torvalds/c/84676c1f21e8
- Reported: 2018-02-22
https://marc.info/?l=linux-kernel&m=151931128006031
- Note: Issue understood and even (kind of accidentally) fixed by a
patch series that was proposed for 4.17 (see links)
- Last known developer activity: 2018-03-14
https://marc.info/?l=linux-block&m=152102086831636&w=2
- Other relevant links:
https://marc.info/?l=linux-block&m=152051511802229&w=2
https://marc.info/?l=linux-kernel&m=152026091325037

Error updating SMART data during runtime and could not connect to lv
- Status: Stalled afaics
- Reported: 2018-03-11
https://marc.info/?l=linux-kernel&m=152075643627082
https://bugzilla.kernel.org/show_bug.cgi?id=199077
- Note: Two issues discussed in that thread; only one is a regression
(latency issues in the MU03 version of the firmware, triggered by
polling SMART data, which causes lvmetad to time out in some cases)
- Last known developer activity: 2018-03-19
https://marc.info/?l=linux-kernel&m=152145306610330
- Other relevant links:
https://marc.info/?l=linux-kernel&m=152146297613525
https://marc.info/?l=linux-scsi&m=152095303312164&w=2

15% longer running times on lvm2 test suite
- Status: Stalled afaics
- Cause:  https://git.kernel.org/torvalds/c/44c02a2c3dc5
- Reported: 2018-03-11
https://marc.info/?l=linux-kernel&m=152077333230274
- Note: Seems the real problem is in the way the test scripts interact
with the kernel
- Last known developer activity: 2018-03-13
https://marc.info/?l=linux-kernel&m=152097761921525

AMDGPU Fury X random screen flicker on Linux kernel 4.16rc5
- Status: might be stalled
- Reported: 2018-03-13
https://bugzilla.kernel.org/show_bug.cgi?id=199101

ASUS XG-C100C 10G Network Adapter no longer working
- Status: told reporter to bring it to netdev; looks like he needs help
debugging
- Reported: 2018-03-22
https://bugzilla.kernel.org/show_bug.cgi?id=199177

multi_v7_defconfig fails to boot on many OMAP systems
- Status: quite new, but patch is being prepared
- Cause:  https://git.kernel.org/torvalds/c/c083dc5f3738
- Reported: 2018-03-23
https://marc.info/?l=linux-clk&m=152198452423677&w=2
- Last known developer activity: 2018-03-27
https://marc.info/?l=linux-clk&m=152199237525182&w=2


== Waiting for clarification from reporter ==

Interrupt storm after suspend causes one busy kworker
- Status: Still waiting for data from reporter
- Reported: 2018-02-25
https://bugzilla.kernel.org/show_bug.cgi?id=198929


== Fixed since last report ==

hci_bcm: Streamline runtime PM code change for 4.16 kernel breaks
bluetooth on ASUS T100TA
- Status: Fixed by https://git.kernel.org/torvalds/c/b09c61522c81
- Cause:  https://git.kernel.org/torvalds/c/43fff7683468
- Reported: 2018-03-01
https://bugzilla.kernel.org/show_bug.cgi?id=198953

sdhci-acpi not recognizing eMMC
- Status: Fixed by https://git.kernel.org/torvalds/c/d58ac803cfbb
- Cause:  https://git.kernel.org/torvalds/c/1b7ba57ecc86
- Reported: 2018-03-13
https://bugzilla.kernel.org/show_bug.cgi?id=199105


Linux 4.16: Reported regressions as of Friday, 2018-03-30

2018-03-30 Thread Thorsten Leemhuis
On 26.03.2018 01:37, Linus Torvalds wrote:
> […] Anyway. Go out and test. And let's hope next week is nice and calm and
> I can release the final 4.16 next Sunday without any extra  rc's.
> 
>Linus

Hi! Find below my seventh regression report for Linux 4.16; it's a "the
final release is getting closer" special release. It lists 7 regressions
I'm currently aware of. 1 was fixed since the report I sent on Tuesday;
1 is new.

Are you aware of any other regressions that got introduced this
development cycle? Then please let me know by mail (a simple bounce or
forward to the sender of this email address is enough!). And please tell
me if there is anything in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

Error updating SMART data during runtime and could not connect to lv
["Possible Regression"]
- Status: Stalled afaics
- Reported: 2018-03-11
https://marc.info/?l=linux-kernel&m=152075643627082
https://bugzilla.kernel.org/show_bug.cgi?id=199077
- Note: Two issues discussed in that thread; only one is a regression
(latency issues in the MU03 version of the firmware, triggered by
polling SMART data, which causes lvmetad to time out in some cases)
- Last known developer activity: 2018-03-19
https://marc.info/?l=linux-kernel&m=152145306610330
- Other relevant links:
https://marc.info/?l=linux-kernel&m=152146297613525
https://marc.info/?l=linux-scsi&m=152095303312164&w=2

15% longer running times on lvm2 test suite
- Status: Stalled afaics
- Cause:  https://git.kernel.org/torvalds/c/44c02a2c3dc5
- Reported: 2018-03-11
https://marc.info/?l=linux-kernel&m=152077333230274
- Note: Seems the real problem is in the way the test scripts interact
with the kernel
- Last known developer activity: 2018-03-13
https://marc.info/?l=linux-kernel&m=152097761921525

AMDGPU Fury X random screen flicker on Linux kernel 4.16rc5
- Status: waiting for bisect
- Reported: 2018-03-13
https://bugzilla.kernel.org/show_bug.cgi?id=199101

ASUS XG-C100C 10G Network Adapter no longer working
- Status: got driver maintainer involved who asked reporter for more details
- Reported: 2018-03-22
https://bugzilla.kernel.org/show_bug.cgi?id=199177

multi_v7_defconfig fails to boot on many OMAP systems
- Status: patch available: "clk: ti: fix flag space conflict with
clkctrl clocks" https://marc.info/?l=linux-arm-kernel&m=152217288709609&w=2
- Cause:  https://git.kernel.org/torvalds/c/49159a9dc3da
- Reported: 2018-03-23
https://marc.info/?l=linux-clk&m=152198452423677&w=2
- Last known developer activity: 2018-03-27
https://marc.info/?l=linux-clk&m=152199237525182&w=2

hugetlbfs overflow checking regression on 32bit
- Status: patch was proposed, but has issues, too
- Cause:  https://git.kernel.org/torvalds/c/63489f8e8211
- Reported: 2018-03-29
https://marc.info/?l=linux-kernel&m=152229704211382&w=2
- Last known developer activity: 2018-03-29
https://marc.info/?l=linux-mm&m=152235614429445&w=2
- Other relevant links:
https://marc.info/?l=linux-kernel&m=152229710411390&w=2


== Waiting for clarification from reporter ==

Interrupt storm after suspend causes one busy kworker
- Status: Still waiting for data from reporter
- Reported: 2018-02-25
https://bugzilla.kernel.org/show_bug.cgi?id=198929


== Fixed since last report ==

Dell R640 does not boot due to SCSI/SATA failure
- Status: Fixed by 2f31115e940c 8b834bff1b73 adbe552349f2 c3506df85091
b5b6e8c8d3b4
- Cause:  https://git.kernel.org/torvalds/c/84676c1f21e8
- Reported: 2018-02-22
https://marc.info/?l=linux-kernel&m=151931128006031
- Note: Thx Artem and Dsterba for pointers


Linux 4.8: Reported regressions as of Sunday, 2016-09-18

2016-09-18 Thread Thorsten Leemhuis
Hi! Here is my fourth regression report for Linux 4.8. It lists 14
regressions I'm aware of. 5 of them are new; 1 mentioned in last
week's report got fixed.

As always: Are you aware of any other regressions? Then please let me
know (simply CC regressi...@leemhuis.info). And pls tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

Desc: genirq: Flags mismatch irq 8, 0088 (mmc0) vs. 0080 (rtc0). mmc0: 
Failed to request irq 8: -16
Repo: 2016-08-01 https://bugzilla.kernel.org/show_bug.cgi?id=150881
Stat: 2016-09-09 https://bugzilla.kernel.org/show_bug.cgi?id=150881#c34
Note: stalled; root cause somewhere in the main gpio merge for 4.8, but 
problematic commit still unknown

Desc: [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
Repo: 2016-08-09 http://www.spinics.net/lists/kernel/msg2317052.html
Stat: 2016-09-09 https://marc.info/?t=14734151953&r=1&w=2
Note: looks like post-4.8 material at this point: Mel working on it in his 
spare time, but "The progression of this series has been unsatisfactory."

Desc: scsi host6: runtime PM trying to activate child device host6 but parent 
(2-2:1.0) is not active
Repo: 2016-08-15 https://bugzilla.kernel.org/show_bug.cgi?id=153171
Stat: 2016-09-14 https://bugzilla.kernel.org/show_bug.cgi?id=153171#c5
Note: patch available; mkp "I would like the change to get a bit of soak time 
before we backport it."

Desc: DT/OCTEON driver probing broken
Repo: 2016-08-16 http://www.spinics.net/lists/devicetree/msg138990.html
Stat: 2016-08-30 http://www.spinics.net/lists/devicetree/msg140682.html
Note: stalled, poked Rob a week ago, got no reply

Desc: gpio-leds broken on OCTEON
Repo: 2016-08-23 http://www.spinics.net/lists/devicetree/msg139863.html
Stat: 2016-09-12 http://www.spinics.net/lists/devicetree/msg140179.html
Note: patch in Linux MIPS patchwork 
https://patchwork.linux-mips.org/patch/14091/

Desc: Skylake graphics regression: projector failure with 4.8-rc3
Repo: 2016-08-26 http://www.spinics.net/lists/intel-gfx/msg105478.html
Stat: 2016-09-01 https://lkml.org/lkml/2016/8/31/946
Note: stalled, poked lists

Desc: lk 4.8 + !CONFIG_SHMEM + shmat() = oops
Repo: 2016-08-30 http://www.spinics.net/lists/linux-mm/msg112920.html
Stat: 2016-09-07 http://www.spinics.net/lists/linux-mm/msg113177.html
Note: just like last week: patch "ipc/shm: fix crash if CONFIG_SHMEM is not 
set" is going to fix this, but has not hit mainline yet

Desc: regression in re-read operation by iozone ~10%
Repo: 2016-09-02 https://bugzilla.kernel.org/show_bug.cgi?id=155821
Stat: n/a 
Note: stalled; told reporter he might be better off posting about the issue to
some mailing list

Desc: brcmfmac is preventing suspend (Dell XPS 13 9350 / Skylake)
Repo: 2016-09-13 https://bugzilla.kernel.org/show_bug.cgi?id=156631
Stat: n/a 
Note: used to be https://bugzilla.kernel.org/show_bug.cgi?id=156361 in last
week's report

Desc: CPU speed set very low
Repo: 2016-09-09 https://lkml.org/lkml/2016/9/9/608
Stat: 2016-09-14 https://lkml.org/lkml/2016/9/14/588
Note: Larry: "Testing continues."

Desc: pinctrl: qcom: Clear all function selection bits introduced a regression 
by not properly masking the calculated value.
Repo: 2016-09-12 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1229055.html
Stat: n/a 
Note: Patch available in initial report

Desc: WARN in vb2_dma_sg_alloc makes dvbstream fail with AverMedia
Hybrid+FM
Repo: 2016-09-13 https://bugzilla.kernel.org/show_bug.cgi?id=156751
Stat: n/a 
Note: told reporter he might be better off posting the issue to linux-media

Desc: it is now common that machine needs re-run of xrandr after resume
Repo: 2016-09-13 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1230782.html
Stat: 2016-09-15 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1232834.html 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1231104.html
Note: Seems Martin isn't seeing the problem anymore; Pavel's issue might be a 
different problem anyway; and he now has a userland-problem that complicates 
testing

Desc: usb: gadget: udc: atmel: fix endpoint name
Repo: 2016-09-15 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1232309.html 
http://www.mail-archive.com/linux-usb@vger.kernel.org/msg80645.html 
Stat: n/a 
Note: Patch available: "Felipe, Greg, It is clearly a regression and material
for 4.8-fixes."

== Fixed since last report ==

Desc: ath9k: bring back direction setting in ath9k_{start_stop}
Repo: 2016-09-01 https://marc.info/?l=linux-wireless&m=147292415030585&w=2
Fix:  https://git.kernel.org/torvalds/c/e34f2ff40e0339f6a379e1ecf49e8f2759056453


Linux 4.8: Reported regressions as of Sunday, 2016-09-25

2016-09-25 Thread Thorsten Leemhuis
Hi! Here is my fifth regression report for Linux 4.8. It lists 15
regressions I'm aware of. 5 of them are new (for many of those
there are patches available to fix the regression); 3 mentioned
in last week's report got fixed; 1 is going to be removed.

As always: Are you aware of any other regressions? Then please let me
know (simply CC regressi...@leemhuis.info). And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

Desc: genirq: Flags mismatch irq 8, 0088 (mmc0) vs. 0080 (rtc0). mmc0: 
Failed to request irq 8: -16
Repo: 2016-08-01 https://bugzilla.kernel.org/show_bug.cgi?id=150881
Stat: 2016-09-09 https://bugzilla.kernel.org/show_bug.cgi?id=150881#c34
Note: stalled; root cause somewhere in the main gpio merge for 4.8, but 
problematic commit still unknown

Desc: scsi host6: runtime PM trying to activate child device host6 but parent 
(2-2:1.0) is not active
Repo: 2016-08-15 https://bugzilla.kernel.org/show_bug.cgi?id=153171
Stat: 2016-09-14 https://bugzilla.kernel.org/show_bug.cgi?id=153171#c5
Note: patch available; mkp "I would like the change to get a bit of soak time
before we backport it."

Desc: DT/OCTEON driver probing broken
Repo: 2016-08-16 http://www.spinics.net/lists/devicetree/msg138990.html
Stat: 2016-08-30 http://www.spinics.net/lists/devicetree/msg140682.html
Note: stalled, poked Rob two weeks ago, got no reply

Desc: gpio-leds broken on OCTEON
Repo: 2016-08-23 http://www.spinics.net/lists/devicetree/msg139863.html
Stat: 2016-09-12 http://www.spinics.net/lists/devicetree/msg140179.html
Note: patch in Linux MIPS patchwork 
https://patchwork.linux-mips.org/patch/14091/

Desc: regression in re-read operation by iozone ~10%
Repo: 2016-09-02 https://bugzilla.kernel.org/show_bug.cgi?id=155821
Stat: n/a 
Note: stalled; told reporter he might be better off posting about the issue to
some mailing list

Desc: brcmfmac is preventing suspend (Dell XPS 13 9350 / Skylake)
Repo: 2016-09-13 https://bugzilla.kernel.org/show_bug.cgi?id=156631
Stat: n/a 
Note: stalled; told reporter he might be better off posting about the issue to
some mailing list; this used to be
https://bugzilla.kernel.org/show_bug.cgi?id=156361 in last week's report

Desc: CPU speed set very low
Repo: 2016-09-09 https://lkml.org/lkml/2016/9/9/608
Stat: 2016-09-23 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1238862.html
Note: Larry: "My debugging continues. […] I think the bug lies between commit
581e0cd (bad) and f7816ad (good). I will need to do a long test on commit
1b05cf6, which did not fail with a shorter run."

Desc: pinctrl: qcom: Clear all function selection bits introduced a regression 
by not properly masking the calculated value.
Repo: 2016-09-12 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1229055.html
Stat: n/a 
Note: stalled; patch available in initial report

Desc: it is now common that machine needs re-run of xrandr after resume
Repo: 2016-09-13 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1230782.html
Stat: n/a 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1232834.html 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1231104.html
Note: stalled; seems Martin isn't seeing the problem anymore; Pavel's issue 
might be a different problem anyway; and he now has a userland-problem that 
complicates testing

Desc: usb: gadget: udc: atmel: fix endpoint name
Repo: 2016-09-15 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1232309.html 
Stat: 2016-09-23 
http://www.mail-archive.com/linux-usb@vger.kernel.org/msg80981.html
Note: Patch available: Greg: "It's Felipe's area, not mine"

Desc: cifs mount regression in 4.8 and 4.4 stable
Repo: 2016-09-22 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1237249.html
Stat: 2016-09-23 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1238305.html
Note: root cause seems to be a6b5058 fs/cifs: make share unaccessible at root 
level mountable

Desc: i2c-core: acpi_i2c_get_info() touches non-existent devices
Repo: 2016-09-19 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1234457.html
Stat: 2016-09-22 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1236953.html
Note: Wolfram added a patch to his "for-next" branch: "If someone wants
it backported, it needs to be rewritten and re-tested."

Desc: [media] solo6x10: avoid delayed register write
Repo: 2016-09-21 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1236732.html
Stat: n/a 
Note: patch available in report

Desc: sched/fair: Do not decay new task load on first enqueue
Repo: 2016-09-23 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1238302.html
Stat: 2016-09-23 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1238499.html
Note: patch available in report

Desc: MIPS: smp-cps: Avoid BUG() when offlining pre-r6 CPUs
Repo: 2016-09-23 
http://www.mail-archive.com/linux-kern

Linux 4.9: Reported regressions as of Sunday, 2016-10-23

2016-10-23 Thread Thorsten Leemhuis
Hi! Here is my first regression report for Linux 4.9. It lists 14
regressions I'm aware of. 
 
As always: Are you aware of any other regressions? Then please let me
know (simply CC regressi...@leemhuis.info). And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: I ran out of time today (I really need to automate some things, 
but I do not find time for it…; and some of the regression tracking work 
simply can not be automated :/) and could not make it completely 
through my backlog. That's why I likely missed lots of regressions 
that were reported during the merge window and remain unfixed as of now 
:-/ Let me know about those, please.

== Current regressions ==

Desc: PPC32: fails to boot on my PowerBook G4 Aluminum; bisected to commit 
05fd007e4629
Repo: 2016-10-20 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1253391.html
Stat: 2016-10-22 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1255516.html
Note: no response from responsible developers yet

Desc: Nokia N900 (omap3-n900) with "VDD1: ramp_delay not set" string in
printk frequently
Repo: 2016-10-19 https://bugzilla.kernel.org/show_bug.cgi?id=178371
Stat: n/a 
Note: told reporter whom to contact

Desc: radeon performance drop from 4.8 to 4.9-rc1 in Shadow of Mordor
Repo: 2016-10-14 https://bugzilla.kernel.org/show_bug.cgi?id=178221 
https://lists.freedesktop.org/archives/dri-devel/2016-October/120693.html
Stat: 2016-10-20 
https://lists.freedesktop.org/archives/dri-devel/2016-October/121427.html
Note: WIP

Desc: can't boot with root fs on md raid 0; mdadm: no devices listed in conf 
file were found. 
Repo: 2016-10-17 https://bugzilla.kernel.org/show_bug.cgi?id=178211
Stat: 2016-10-18 https://bugzilla.kernel.org/show_bug.cgi?id=178211#c1
Note: Root cause unknown; might be a controller driver issue

Desc: unable to handle kernel NULL pointer dereference at fuse_setattr
Repo: 2016-10-17 https://bugzilla.kernel.org/show_bug.cgi?id=177801
Stat: 2016-10-18 https://bugzilla.kernel.org/show_bug.cgi?id=177801#c5
Note: Fix heading upstream

Desc: Skylake gen6 suspend/resume video regression
Repo: 2016-10-16 https://bugzilla.kernel.org/show_bug.cgi?id=177731
Stat: n/a 
Note: 

Desc: warning in intel_dp_aux_transfer: CPU: 0 PID: 4 at 
drivers/gpu/drm/i915/intel_dp.c:1062 intel_dp_aux_transfer+0x1ed/0x230
Repo: 2016-10-16 https://bugzilla.kernel.org/show_bug.cgi?id=177701
Stat: n/a 
Note: 

Desc: "Failed to find cpu0 device node" in dmesg
Repo: 2016-10-16 https://bugzilla.kernel.org/show_bug.cgi?id=177681 
https://bugzilla.kernel.org/show_bug.cgi?id=180031
Stat: n/a 
Note: 

Desc: boot failure of Intel Mobile Internet Devices due to a change in the PCI 
subsystem that appeared in v4.9-rc1.
Repo: 2016-10-23 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1255643.html
Stat: n/a 
Note: Fix proposed

Desc: Regression with 0b9e2988ab22 ("ahci: use pci_alloc_irq_vectors")
Repo: 2016-10-21 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1254825.html
Stat: 2016-10-21 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1254882.html
Note: WIP

Desc: 761ed4a94582 tty: serial_core: convert uart_close to use tty_port_close
Repo: 2016-10-21 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1254753.html
Stat: 2016-10-21 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1254987.html
Note: WIP

Desc: [selinux/audit/netlink, regression?] Warning at kernel/softirq.c:161
Repo: 2016-10-20 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1254148.html
Stat: 2016-10-20 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1254151.html 
https://patchwork.ozlabs.org/patch/684753/
Note: WIP

Desc: some gpio drivers broken by commit 762c2e46
Repo: 2016-10-18 https://www.spinics.net/lists/linux-gpio/msg17283.html
Stat: 2016-10-21 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1253906.html
Note: WIP

Desc: 4.9-rc1 boot regression, ambiguous bisect result
Repo: 2016-10-19 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1253369.html
Stat: 2016-10-21 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1255296.html
Note: WIP


Linux 4.9: Reported regressions as of Sunday, 2016-10-30

2016-10-30 Thread Thorsten Leemhuis
Hi! Here is my second regression report for Linux 4.9. It lists 14
regressions I'm aware of. 4 of them are new; 3 got fixed since last week's
report.

As always: Are you aware of any other regressions? Then please let me
know (simply CC regressi...@leemhuis.info). And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

Desc: tpm0: TPM self test failed
Repo: 2016-10-28 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1259943.html
Stat: 2016-10-28 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1260452.html 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1256705.html 
Note: Fix hopefully soon heading upstream

Desc: Radeon Oops on shutdown
Repo: 2016-10-19 https://bugzilla.kernel.org/show_bug.cgi?id=178421
Stat: 2016-10-30 https://bugzilla.kernel.org/show_bug.cgi?id=178421#c6
Note: WIP

Desc: module loading broken due to kbuild changes
Repo: 2016-10-15 http://www.gossamer-threads.com/lists/linux/kernel/2544734
Stat: 2016-10-27 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1259418.html
Note: Fix available, waiting for Michal to get back from vacation; wondering if 
those will fix https://bugzilla.kernel.org/show_bug.cgi?id=185581 and 
https://lkml.org/lkml/2016/10/27/471 and 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1250105.html as 
well

Desc: pci: artpec-6: imprecise external abort
Repo: 2016-10-14 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1249646.html
Stat: 2016-10-14 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1249922.html
Note: Patch available

Desc: PPC32: fails to boot on my PowerBook G4 Aluminum; bisected to commit 
05fd007e4629
Repo: 2016-10-20 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1253391.html
Stat: 2016-10-22 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1255516.html 
https://www.linux-mips.org/archives/linux-mips/2016-10/msg00176.html 
https://lkml.org/lkml/2016/10/18/142
Note: Larry made a hack that works for him.

Desc: Nokia N900 (omap3-n900) with "VDD1: ramp_delay not set" string in 
printk frequently
Repo: 2016-10-19 https://bugzilla.kernel.org/show_bug.cgi?id=178371
Stat: n/a https://lkml.org/lkml/2016/10/27/527
Note: discussion on lkml ongoing

Desc: radeon performance drop from 4.8 to 4.9-rc1 in Shadow of Mordor
Repo: 2016-10-14 https://bugzilla.kernel.org/show_bug.cgi?id=178221 
https://lists.freedesktop.org/archives/dri-devel/2016-October/120693.html
Stat: 2016-10-20 
https://lists.freedesktop.org/archives/dri-devel/2016-October/121427.html
Note: Stuck? Poked bugzilla

Desc: unable to handle kernel NULL pointer dereference at fuse_setattr
Repo: 2016-10-17 https://bugzilla.kernel.org/show_bug.cgi?id=177801
Stat: 2016-10-18 https://bugzilla.kernel.org/show_bug.cgi?id=177801#c5
Note: Fix heading upstream

Desc: Skylake gen6 suspend/resume video regression
Repo: 2016-10-16 https://bugzilla.kernel.org/show_bug.cgi?id=177731
Stat: 2016-10-25 https://bugzilla.kernel.org/show_bug.cgi?id=177731#c3
Note: WIP

Desc: warning in intel_dp_aux_transfer: CPU: 0 PID: 4 at 
drivers/gpu/drm/i915/intel_dp.c:1062 intel_dp_aux_transfer+0x1ed/0x230#
Repo: 2016-10-16 https://bugzilla.kernel.org/show_bug.cgi?id=177701
Stat: 2016-10-27 https://bugs.freedesktop.org/show_bug.cgi?id=97344
Note: Poked Janni to give a statement

Desc: "Failed to find cpu0 device node" in dmesg
Repo: 2016-10-16 https://bugzilla.kernel.org/show_bug.cgi?id=177681 
https://bugzilla.kernel.org/show_bug.cgi?id=180031
Stat: 2016-10-28 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1260133.html
Note: Patches afaics in the works

Desc: boot failure of Intel Mobile Internet Devices due to a change in the PCI 
subsystem that appeared in v4.9-rc1.
Repo: 2016-10-23 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1255643.html
Stat: 2016-10-26 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1258579.html
Note: Fix proposed

Desc: 4.9-rc1 boot regression, ambiguous bisect result
Repo: 2016-10-19 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1253369.html
Stat: 2016-10-21 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1255296.html
Note: Poked list, as this looks stuck or is discussed (or even fixed) 
somewhere else


== Stalled, waiting for feedback from reporter ==

Desc: can't boot with root fs on md raid 0; mdadm: no devices listed in conf 
file were found. 
Repo: 2016-10-17 https://bugzilla.kernel.org/show_bug.cgi?id=178211
Stat: 2016-10-18 https://bugzilla.kernel.org/show_bug.cgi?id=178211#c1
Note: Root cause unknown; might be a controller driver issue


== Going to be removed from the list ==

Desc: some gpio drivers broken by commit 762c2e46
Repo: 2016-10-18 https://www.spinics.net/lists/linux-gpio/msg17283.html
Stat: 2016-10-21 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1253906.html
Note: Not a 4.9 regression


== Fixed sinc

Re: Linux 4.9: Reported regressions as of Sunday, 2016-10-30

2016-11-06 Thread Thorsten Leemhuis
Lo! On 01.11.2016 09:18, Paul Bolle wrote:
> On Sun, 2016-10-30 at 14:20 +0100, Thorsten Leemhuis wrote:
>> As always: Are you aware of any other regressions? Then please let me
>> know (simply CC regressi...@leemhuis.info).
> Do build regressions count?

That's a good question.

> Because I was trying to fix an obscure build issue in arch/mips, choose
> a random configuration that should hit that issue, and promptly ran
> into
> https://lkml.kernel.org/r/<201610301405.k82kqqw0%25fengguang...@intel.com>
> The same configuration does build under v4.8, I tested that of course.

I'd say it's a practical problem that users run into and hence it's a
regression. Sure, in this case it hits only those that compile kernels
themselves; but those are users, too, and we don't want to scare them
away with things that suddenly stop working.

IOW: I'll include it in this week's report.

Ciao, Thorsten


Linux 4.9: Reported regressions as of Sunday, 2016-11-06

2016-11-06 Thread Thorsten Leemhuis
Hi! Here is my third regression report for Linux 4.9. It lists 17
regressions I'm aware of. 6 of them are new; 3 got fixed since
last week's report (a fourth looks fixed as well). The console
problem ("console: don't prefer first registered [...]") got
reported to me multiple times, but the revert to finally get
this fixed is in -mm already.

As always: Are you aware of any other regressions? Then please let me
know (simply CC regressi...@leemhuis.info). And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

Desc: thinkpad x60: BIOS limit stops working
Repo: 16-11-05 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1264916.html
Stat: n/a 
Note: WIP

Desc: thinkpad x60:  thermal passive cooling can not prevent the system from 
overheating, when there is no BIOS limit.
Repo: 16-11-05 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1264916.html
Stat: n/a 
Note: WIP

Desc: test failures of sendfile(2) and splice(2) 
Repo: 16-11-01 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1262400.html
Stat: 16-11-01 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1262648.html
Note: WIP, patch available

Desc: amdgpu, topaz: powerplay initialization failed
Repo: 16-10-31 https://bugzilla.kernel.org/show_bug.cgi?id=185681 
https://bugs.freedesktop.org/show_bug.cgi?id=98357#
Stat: 16-11-04 https://bugzilla.kernel.org/show_bug.cgi?id=185681#c7
Note: WIP

Desc: mangled display since -rc1 (two systems: one with intel, one with nvidia 
gpu)
Repo: 16-10-31 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1261699.html
Stat: n/a 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1262493.html
Note: root cause unknown, proper bisect needed (would be good if somebody could 
help the reporter)

Desc: build regression: make.cross ARCH=mips fails with "No rule to make 
target 'alchemy/devboards/'."
Repo: 16-10-30 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1262410.html 
https://marc.info/?l=linux-kernel&m=147780880425626
Stat: n/a 
Note: nothing happened yet; BTW: Should build regressions be on this list at 
all?

Desc: tpm0: TPM self test failed & can't request region for resource
Repo: 16-10-28 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1259943.html 
https://bugzilla.kernel.org/show_bug.cgi?id=185631
Stat: 16-11-03 
https://www.mail-archive.com/tpmdd-devel@lists.sourceforge.net/msg02010.html
Note: Partly fixed by 
https://git.kernel.org/torvalds/c/befd99656c5eb765fe9d96045c4cba099fd938db , 
but it seems more fixes are needed (and available!)

Desc: boot failure of Intel Mobile Internet Devices due to a change in the PCI 
subsystem that appeared in v4.9-rc1.
Repo: 16-10-23 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1255643.html
Stat: 16-10-26 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1258579.html
Note: Poked list, as it looks like the proposed fix got forgotten

Desc: Radeon Oops on shutdown / Panic on shutdown in routine 
radeon_connector_unregister()
Repo: 16-10-19 https://bugzilla.kernel.org/show_bug.cgi?id=178421 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1261699.html
Stat: 16-10-30 https://bugzilla.kernel.org/show_bug.cgi?id=178421#c6
Note: Patch available

Desc: "console: don't prefer first registered if DT specifies stdout-path" 
breaks console on video outputs of various ARM boards; breaks some ppc machines 
as well
Repo: 16-10-18 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1264523.html 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1253391.html 
https://www.linux-mips.org/archives/linux-mips/2016-10/msg00176.html
Stat: 16-11-06 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1265059.html 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1264422.html
Note: revert discussed and also in -mm; Side note: this seems to be a 
regression that annoys quite a lot of people

Desc: unable to handle kernel NULL pointer dereference at fuse_setattr
Repo: 16-10-17 https://bugzilla.kernel.org/show_bug.cgi?id=177801
Stat: 16-10-18 https://bugzilla.kernel.org/show_bug.cgi?id=177801#c5
Note: poked Miklos, as the fix is not yet upstream afaics

Desc: Skylake gen6 suspend/resume video regression
Repo: 16-10-16 https://bugzilla.kernel.org/show_bug.cgi?id=177731 
https://bugs.freedesktop.org/show_bug.cgi?id=98517
Stat: 16-10-25 https://bugzilla.kernel.org/show_bug.cgi?id=177731#c3
Note: WIP

Desc: warning in intel_dp_aux_transfer: CPU: 0 PID: 4 at 
drivers/gpu/drm/i915/intel_dp.c:1062 intel_dp_aux_transfer+0x1ed/0x230#
Repo: 16-10-16 https://bugzilla.kernel.org/show_bug.cgi?id=177701
Stat: 16-10-27 https://bugs.freedesktop.org/show_bug.cgi?id=97344
Note: Poked Janni a week ago to give a status update, but didn't hear anything 
yet

Desc: module loading broken due to kbuild changes
Repo: 16-10-15 http://www.gossamer-threads.com/lists/linux/kernel/2544

Linux 4.9: Reported regressions as of Sunday, 2016-11-20

2016-11-20 Thread Thorsten Leemhuis
Hi! Here is my fourth regression report for Linux 4.9. It lists 10
regressions I'm aware of. 6 of them are new; 11 got fixed (wow!)
since the last report -- that was two weeks ago, because I 
didn't find any spare time to compile a report last Sunday :-/

As always: Are you aware of any other regressions? Then please let me
know (simply CC regressi...@leemhuis.info). And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

== Current regressions ==

Desc: irq 16: nobody cared (try booting with the "irqpoll" option) since 
0b9e2988ab226 (ahci: use pci_alloc_irq_vectors)
Repo: 16-11-19 https://bugzilla.kernel.org/show_bug.cgi?id=188181
Stat: n/a 
Note: new

Desc: [i.MX6 DRM IPUv3] Regression 4.9-rc5: greenish screen with YUV420 video
Repo: 16-11-17 https://www.spinics.net/lists/kernel/msg2385550.html
Stat: n/a 
Note: new

Desc: oops due to 493b2ed3f760 ("crypto: algif_hash - Handle NULL hashes 
correctly")
Repo: 16-11-17 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1273867.html
Stat: 16-11-17 https://patchwork.kernel.org/patch/9434741/
Note: WIP, Patch available

Desc: MSI is no longer enabled for many/most Intel SATA controllers in 4.9
Repo: 16-11-16 https://bugzilla.kernel.org/show_bug.cgi?id=187821
Stat: 16-11-17 https://bugzilla.kernel.org/show_bug.cgi?id=187821#c3
Note: WIP, Patch available

Desc: Failed boots bisected to 4cd13c21b207 "softirq: Let ksoftirqd do its 
job"
Repo: 16-11-16 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1273344.html
Stat: 16-11-18 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1275668.html
Note: WIP

Desc: qla2xxx: do not abort all commands in the adapter during EEH recovery
Repo: 16-11-14 
https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg55186.html
Stat: 16-11-14 
https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg55198.html
Note: Fix heading mainline

Desc: builddeb: fix cross-building to arm64 producing host-arch debs
Repo: 16-11-04 https://www.spinics.net/lists/linux-kbuild/msg13635.html
Stat: 16-11-10 https://www.spinics.net/lists/linux-kbuild/msg13676.html
Note: Looks stalled

Desc: build regression: make.cross ARCH=mips fails with "No rule to make 
target 'alchemy/devboards/'."
Repo: 16-10-30 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1262410.html 
https://marc.info/?l=linux-kernel&m=147780880425626
Stat: n/a 
Note: nothing happened yet; BTW: Should build regressions be on this list at 
all?


== Stalled, waiting for feedback from reporter ==

Desc: 4.9-rc1 boot regression, ambiguous bisect result
Repo: 2016-10-19 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1253369.html
Stat: 16-10-21 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1255296.html
Note: Waiting for Dan or someone else to look into this

Desc: Skylake gen6 suspend/resume video regression
Repo: 16-10-16 https://bugzilla.kernel.org/show_bug.cgi?id=177731 
https://bugs.freedesktop.org/show_bug.cgi?id=98517
Stat: 16-10-25 https://bugzilla.kernel.org/show_bug.cgi?id=177731#c3
Note: Stalled, poked bugzilla


== Going to be removed from the list ==

Desc: warning in intel_dp_aux_transfer: CPU: 0 PID: 4 at 
drivers/gpu/drm/i915/intel_dp.c:1062 intel_dp_aux_transfer+0x1ed/0x230#
Repo: 16-10-16 https://bugzilla.kernel.org/show_bug.cgi?id=177701
Stat: n/a 
Note: the warning seems to be fixed, but the problem that Martin saw might 
still be there; this will get a new entry in the regression list once confirmed

Desc: can't boot with root fs on md raid 0; mdadm: no devices listed in conf 
file were found. 
Repo: 2016-10-17 https://bugzilla.kernel.org/show_bug.cgi?id=178211
Stat: 16-11-03 https://bugzilla.kernel.org/show_bug.cgi?id=178211#c4
Note: SATA adapters not detected; reporter is unable to debug and could use 
some help


== Fixed since last report ==

Desc: "console: don't prefer first registered if DT specifies stdout-path" 
breaks console on video outputs of various ARM boards; breaks some ppc machines 
as well
Repo: 16-10-18 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1264523.html 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1253391.html 
https://www.linux-mips.org/archives/linux-mips/2016-10/msg00176.html
Fix:  https://git.kernel.org/torvalds/c/c6c7d83b9c9e6a8b3e6d84c820ac61fbffc9e396

Desc: thinkpad x60, T40p: overheat with v4.9-rc4 (was Re: v4.8-rc1: thinkpad 
x60: running at low frequency even during kernel build)
Repo: 16-11-05 
https://www.mail-archive.com/ibm-acpi-devel@lists.sourceforge.net/msg03909.html 
https://bugzilla.kernel.org/show_bug.cgi?id=187311 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1264916.html
Fix:  https://git.kernel.org/torvalds/c/e2174b0c24caca170ca61eda2ae49c9561ff8896

Desc: mangled display since -rc1 (two systems: one with intel, one with nvidia 
gpu)
Repo: 16-10-31 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1261699.html
Fix:  http

Linux 4.7: Reported regressions as of Saturday, 2016-07-02

2016-07-03 Thread Thorsten Leemhuis
Hi! Here is my fourth regression report for Linux 4.7; a day earlier than usual.
It has 14 entries;
 * 2 of them are new
 * 9 regressions (not included here) were fixed since the last report(¹)
 * 1 made it to the list after last Sunday (thx for telling me about it Kalle!),
   but was fixed before I compiled this one
 * 2 turned out to not be regressions at all
 * 1 was a duplicate

Fixes for 4 of the regressions in this report are heading towards mainline. 
One of the remaining regressions is likely fixed already. 

Please let me know if a regression is missing in the list; or if
there is something on the list which shouldn't be there.

HTH, CU, Thorsten

(¹) previous reports can be found at
http://thread.gmane.org/gmane.linux.kernel/2241805
http://thread.gmane.org/gmane.linux.kernel/2247804
http://thread.gmane.org/gmane.linux.kernel/2253623


Desc: Bad flicker on skylake HQD due to code in the 4.7 merge window
Repo: 2016-05-30 http://thread.gmane.org/gmane.linux.kernel/2230377
Stat: 2016-06-23 http://thread.gmane.org/gmane.linux.kernel/2230377/focus=158274
Note: vswing issue, maybe primarily a firmware problem; nevertheless a 
regression; nothing happened for a week, so poked list

Desc: 795ae7a0de: pixz.throughput -9.1% regression
Repo: 2016-06-02 http://thread.gmane.org/gmane.linux.kernel/2233056/
Stat: 2016-06-22 
http://thread.gmane.org/gmane.linux.kernel/2233056/focus=2251134
Note: hannes still can't reproduce; stalled, until reporter posts results from 
more tests

Desc: RadeonSI gets a huge performance dip when used with the nine state tracker
Repo: 2016-06-04 https://bugzilla.kernel.org/show_bug.cgi?id=119631
Stat: 2016-06-20 https://bugzilla.kernel.org/show_bug.cgi?id=119631#c14
Note: poked bugzilla, as it looked stalled

Desc: BUG: unable to handle kernel NULL pointer dereference […] 
qla24xx_process_response_queue+0x49/0x4b0 [qla2xxx]
Repo: 2016-06-14 https://bugzilla.kernel.org/show_bug.cgi?id=120201
Stat: 2016-06-30 
http://thread.gmane.org/gmane.linux.kernel/2257008/focus=2257139 
http://thread.gmane.org/gmane.linux.kernel/2247804/focus=2252004 
Note: Patch which fixes the problem (at least afaics) was posted on 2016-06-30

Desc: Performance drop 30-40% for SPECjbb2005 and SPECjvm2008 benchmarks
Repo: 2016-06-16 https://bugzilla.kernel.org/show_bug.cgi?id=120481
Stat: 2016-06-25 https://bugzilla.kernel.org/show_bug.cgi?id=120481#c15 
http://thread.gmane.org/gmane.linux.kernel/2245977/focus=2253079
Note: solution heading upstream (made it from sched/urgent to tip 
7dd4912594daf769a46744848b05bd5bc6d62469 
ea1dc6fc6242f991656e35e2ed3d90ec1cd13418 )

Desc: performance drop on SFC interface around 30 %
Repo: 2016-06-17 https://bugzilla.kernel.org/show_bug.cgi?id=120461
Stat: 2016-07-01 https://bugzilla.kernel.org/show_bug.cgi?id=120461#c12
Note: poked bugzilla, as it looked stalled

Desc: System hang when plug/un-plug USB 3.1 key via thunderbolt port on Dell 
XPS 13
Repo: 2016-06-14 https://bugzilla.kernel.org/show_bug.cgi?id=120241
Stat: 2016-07-02 https://bugzilla.kernel.org/show_bug.cgi?id=120241#c7
Note: wip

Desc: lk 4.7 regression: EDAC, amd64_edac: Drop pci_register_driver() use
Repo: 2016-06-15 http://thread.gmane.org/gmane.linux.kernel/2245115/
Stat: 2016-07-01 (private mail) 
http://thread.gmane.org/gmane.linux.kernel/2246008/focus=2246009
Note: solution heading upstream (solution in tip: 
1ead852dd88779eda12cb09cc894a03d9abfe1ec )

Desc: regression in 8250 uart driver
Repo: 2016-06-14 
http://thread.gmane.org/gmane.linux.kernel/2243130/focus=2243653
Stat: 2016-07-02 
http://thread.gmane.org/gmane.linux.kernel/2243130/focus=2258232
Note: andy has it still on his todo list, but something else has a higher 
priority

Desc: sound disappeared, ACPI/PCI/IRQ related
Repo: 2016-06-20 http://thread.gmane.org/gmane.linux.kernel/2247930/
Stat: 2016-06-30 
http://thread.gmane.org/gmane.linux.kernel/2247930/focus=2256773 
http://thread.gmane.org/gmane.linux.acpi.devel/85281/focus=20097
Note: Rafael queued fix

Desc: Multi-thread udp 4.7 regression, bisected to 71d8c47fc653; performance 
decreased or complete failure with a test
Repo: 2016-06-27 http://thread.gmane.org/gmane.linux.network/418861/focus=418908
Stat: 2016-06-27 http://thread.gmane.org/gmane.linux.network/418861/focus=418908
Note: will poke someone on Monday; looks like it fell off the radar

Desc: kmemcheck: Caught 64-bit read from uninitialized memory; 
iptables/nf_register_net_hooks
Repo: 2016-06-19 https://bugzilla.kernel.org/show_bug.cgi?id=120651
Stat: 2016-07-02 
Note: told reporter how to contact the relevant developers

Desc: Rename file corruption due to VFS-Changes?
Repo: 2016-06-16 http://thread.gmane.org/gmane.linux.kernel/2245402/
Stat: 2016-06-22 
http://thread.gmane.org/gmane.linux.kernel/2245402/focus=2250254
Note: issue likely fixed in mainline by 9f541801 + e7d6ef979; see also: 
http://thread.gmane.org/gmane.linux.kernel/2231928/focus=76124

Desc: UBSAN splat in d

Linux 4.7: Reported regressions as of Sunday, 2016-07-10

2016-07-10 Thread Thorsten Leemhuis
Hi! Here is my fifth regression report for Linux 4.7. It lists 10
regressions I'm currently aware of; 2 of them are new; 1 of those 
seems to be a side effect of a fix for another regression.

The report also mentions 3 regressions that I removed from the list, as
it looks like those issues are no regressions.

Find also details on 4 regressions that were fixed since the last
report(¹)

As usual: Please let me know about any regression missing on the list  
or if it contains something which shouldn't be there.

HTH, CU, Thorsten

(¹) previous reports can be found at
http://thread.gmane.org/gmane.linux.kernel/2241805
http://thread.gmane.org/gmane.linux.kernel/2247804
http://thread.gmane.org/gmane.linux.kernel/2253623
http://thread.gmane.org/gmane.linux.kernel/2258287

== Current regressions ==

Desc: Bad flicker on skylake HQD due to code in the 4.7 merge window
Repo: 2016-05-30 http://thread.gmane.org/gmane.linux.kernel/2230377
Stat: 2016-07-08 http://thread.gmane.org/gmane.linux.kernel/2230377/focus=94140
Note: vswing issue, maybe primarily a firmware problem; a few patches improved 
the situation, but did not fix the problem

Desc: 795ae7a0de: pixz.throughput -9.1% regression
Repo: 2016-06-02 http://thread.gmane.org/gmane.linux.kernel/2233056/
Stat: 2016-06-22 
http://thread.gmane.org/gmane.linux.kernel/2233056/focus=2251134
Note: hannes still can't reproduce; stalled until reporter posts results from 
more tests

Desc: System hang when plug/un-plug USB 3.1 key via thunderbolt port on Dell 
XPS 13
Repo: 2016-06-14 https://bugzilla.kernel.org/show_bug.cgi?id=120241
Stat: 2016-07-08 https://bugzilla.kernel.org/show_bug.cgi?id=120241#c7
Note: there was wip, but root cause hard to track down; responsible developer 
now afk till early August

Desc: regression in 8250 uart driver
Repo: 2016-06-14 
http://thread.gmane.org/gmane.linux.kernel/2243130/focus=2243653
Stat: 2016-07-02 
http://thread.gmane.org/gmane.linux.kernel/2243130/focus=2258232
Note: andy has it still on his todo list, but something else has a higher 
priority

Desc: Multi-thread udp 4.7 regression, bisected to 71d8c47fc653; performance 
decreased or complete failure with a test
Repo: 2016-06-27 http://thread.gmane.org/gmane.linux.network/418861/focus=418908
Stat: 2016-07-05 http://thread.gmane.org/gmane.linux.network/418861/focus=63936
Note: waiting for feedback from reporter

Desc: kmemcheck: Caught 64-bit read from uninitialized memory; 
iptables/nf_register_net_hooks
Repo: 2016-06-19 https://bugzilla.kernel.org/show_bug.cgi?id=120651
Stat: 2016-07-02 
Note: afaics stalled, after I told reporter a week ago how to contact the 
relevant developers

Desc: UBSAN splat in drivers/acpi/acpica/dsutils.c:641:16
Repo: 2016-06-15 https://bugzilla.kernel.org/show_bug.cgi?id=120351
Stat: 2016-06-22 https://bugzilla.kernel.org/show_bug.cgi?id=120411#c3
Note: wonder if this is important enough to investigate further; related, 
similar bugs: https://bugzilla.kernel.org/show_bug.cgi?id=120411 
https://bugzilla.kernel.org/show_bug.cgi?id=120391 
https://bugzilla.kernel.org/show_bug.cgi?id=120371 
https://bugzilla.kernel.org/show_bug.cgi?id=120361 Not sure which of them are 
regressions

Desc: Kernel does not boot with 7ed18e2d1b6782989eb399ef79a8cc1a1b583b3c ( 
acpi-4.7-rc7 aka All of these fix recent regressions in ACPICA, in the ACPI PCI 
IRQ management code and in the ACPI AML debugger.)
Repo: 2016-07-08 https://bugzilla.kernel.org/show_bug.cgi?id=121701
Stat: 2016-07-10 
http://thread.gmane.org/gmane.linux.kernel.pci/53279/focus=85652
Note: pointed Rafael to bugzilla on LKML

Desc: ACPI EC problems/ ACPI / EC: Fix an order issue in ec_remove_handlers()
Repo: 2016-07-06 http://thread.gmane.org/gmane.linux.kernel/2260279/
Stat: 2016-07-08 http://thread.gmane.org/gmane.linux.kernel/2260279/focus=85583
Note: patch under discussion


== Going to be removed from the list ==

Desc: RadeonSI gets a huge performance dip when used with the nine state tracker
Repo: 2016-06-04 https://bugzilla.kernel.org/show_bug.cgi?id=119631
Stat: 2016-07-04 https://bugzilla.kernel.org/show_bug.cgi?id=119631#c14
Note: not a regression according to one of the AMD graphics driver developers; 
otoh it looks like one to me; yes, the real bug is in the gallium nine, wine 
side, but afaics it only showed up after a kernel change introduced a new 
feature

Desc: performance drop on SFC interface around 30 %
Repo: 2016-06-17 https://bugzilla.kernel.org/show_bug.cgi?id=120461
Stat: 2016-07-06 https://bugzilla.kernel.org/show_bug.cgi?id=120461#c20
Note: reporter and developer agree: not a regression in 4.7

Desc: Rename file corruption due to VFS-Changes?
Repo: 2016-06-16 http://thread.gmane.org/gmane.linux.kernel/2245402/
Stat: 2016-06-22 
http://thread.gmane.org/gmane.linux.kernel/2245402/focus=2250254
Note: issue likely fixed in mainline by 9f541801 + e7d6ef979; see also: 
http://thread.gmane.org/gmane.linux.kernel/2231928/focus=76124


== Fixed since last report ==

Desc: BUG: unable to handle 

Linux 4.7: Reported regressions as of Sunday, 2016-07-17

2016-07-17 Thread Thorsten Leemhuis
Hi! Here is my sixth regression report for Linux 4.7. It lists 8
regressions I'm currently aware of; 2 of them are new. 

The report also mentions 3 regressions that were fixed since the last
report(¹). There were a few ones that were reported to me in the past
week (many thx for that!) and fixed already, which I did not add to
the list.

As usual: Please let me know about any regression missing on the list 
or if it contains something which shouldn't be there. I'm afk next
weekend, so this likely will be the last regression report before 4.7
gets released. But I guess I'll write at least one or two more reports
after that to make sure regressions found after the release get
collected somewhere.

HTH, CU, Thorsten

(¹) previous reports can be found at
http://thread.gmane.org/gmane.linux.kernel/2241805
http://thread.gmane.org/gmane.linux.kernel/2247804
http://thread.gmane.org/gmane.linux.kernel/2253623
http://thread.gmane.org/gmane.linux.kernel/2258287
http://thread.gmane.org/gmane.linux.kernel/2262759


== Current regressions ==

Desc: 795ae7a0de: pixz.throughput -9.1% regression
Repo: 2016-06-02 http://thread.gmane.org/gmane.linux.kernel/2233056/
Stat: 2016-06-22 
http://thread.gmane.org/gmane.linux.kernel/2233056/focus=2251134
Note: stalled; hannes can't reproduce and waits for reporter to post results 
from more tests

Desc: System hang when plug/un-plug USB 3.1 key via thunderbolt port on Dell 
XPS 13
Repo: 2016-06-14 https://bugzilla.kernel.org/show_bug.cgi?id=120241
Stat: 2016-07-16 https://bugzilla.kernel.org/show_bug.cgi?id=120241#c37
Note: wip, but root cause hard to track down

Desc: regression in 8250 uart driver
Repo: 2016-06-14 
http://thread.gmane.org/gmane.linux.kernel/2243130/focus=2243653
Stat: 2016-07-02 
http://thread.gmane.org/gmane.linux.kernel/2243130/focus=2258232
Note: andy has it still on his todo list, but something else has a higher 
priority

Desc: Multi-thread udp 4.7 regression, bisected to 71d8c47fc653; performance 
decreased or complete failure with a test
Repo: 2016-06-27 http://thread.gmane.org/gmane.linux.network/418861/focus=418908
Stat: 2016-07-12 http://thread.gmane.org/gmane.linux.network/418861/focus=63936
Note: patch "netfilter: conntrack: skip clash resolution if nat is in place" in 
the net tree fixes this

Desc: kmemcheck: Caught 64-bit read from uninitialized memory; 
iptables/nf_register_net_hooks
Repo: 2016-06-19 https://bugzilla.kernel.org/show_bug.cgi?id=120651
Stat: 2016-07-11 https://bugzilla.kernel.org/show_bug.cgi?id=120651#c2
Note: reporter wanted to recheck, nothing new since then

Desc: UBSAN splat in drivers/acpi/acpica/dsutils.c:641:16
Repo: 2016-06-15 https://bugzilla.kernel.org/show_bug.cgi?id=120351
Stat: 2016-06-22 https://bugzilla.kernel.org/show_bug.cgi?id=120411#c3
Note: wonder if this is important enough to investigate further; related, 
similar bugs: https://bugzilla.kernel.org/show_bug.cgi?id=120411 
https://bugzilla.kernel.org/show_bug.cgi?id=120391 
https://bugzilla.kernel.org/show_bug.cgi?id=120371 
https://bugzilla.kernel.org/show_bug.cgi?id=120361 Not sure which of them are 
regressions

Desc: After commit f21a21983ef13a, the i915 display is turned off
Repo: 2016-06-16 https://bugs.freedesktop.org/show_bug.cgi?id=96675
Stat: 2016-07-15 http://thread.gmane.org/gmane.linux.kernel/2266539/focus=94597 
https://bugs.freedesktop.org/show_bug.cgi?id=96675#c11
Note: fix in the works: https://bugs.freedesktop.org/show_bug.cgi?id=96675

Desc: legacy gamma table updates are completely broken for Intel on 
Linux-4.7-rc7
Repo: 2016-07-12 http://thread.gmane.org/gmane.comp.video.dri.devel/159323/
Stat: 2016-07-14 
http://thread.gmane.org/gmane.comp.video.dri.devel/159323/focus=159663
Note: seems https://patchwork.freedesktop.org/patch/89111/ (drm/i915: add 
missing condition for committing planes on crtc) fixes this


== Fixed since last report ==

Desc: Bad flicker on skylake HQD due to code in the 4.7 merge window
Repo: 2016-05-30 http://thread.gmane.org/gmane.linux.kernel/2230377
Stat: 2016-07-08 http://thread.gmane.org/gmane.linux.kernel/2230377/focus=94140
Note: Fixed by http://thread.gmane.org/gmane.linux.kernel/2230377/focus=94140; 
vswing issue, maybe primarily a firmware problem; a few patches improved 
the situation, but did not fix the problem

Desc: Kernel does not boot with 7ed18e2d1b6782989eb399ef79a8cc1a1b583b3c ( 
acpi-4.7-rc7 aka All of these fix recent regressions in ACPICA, in the ACPI PCI 
IRQ management code and in the ACPI AML debugger.)
Repo: 2016-07-08 https://bugzilla.kernel.org/show_bug.cgi?id=121701
Stat: 2016-07-10 
http://thread.gmane.org/gmane.linux.kernel.pci/53279/focus=85652
Note: fixed by 
https://git.kernel.org/linus/f1b5e4fac164ff43b189d996e4f05f95cc57b984

Desc: ACPI EC problems/ ACPI / EC: Fix an order issue in ec_remove_handlers()
Repo: 2016-07-06 http://thread.gmane.org/gmane.linux.kernel/2260279/
Stat: 2016-07-08 http://thread.gmane.org/gmane.linux.kernel/2260279/focus=85583
Note: fixed by 
https://git.kernel.org/t

Linux 4.7: Reported regressions as of Sunday, 2016-08-07

2016-08-07 Thread Thorsten Leemhuis
Hi! Here is my seventh regression report for Linux 4.7. It lists 9
regressions I'm currently aware of. 4 of them are new; 2 are stalled
until the reporter provides more feedback.

The report also mentions 3 regressions that were fixed since the last
report. There is also 1 I plan to remove because it's just referring 
to new warnings from UBSAN (let me know if you think it should stay).

As usual: Please let me know about any regressions missing on the list
or if it contains something which shouldn't be there. Since the 
release of 4.7 there were quite a few new bugs filed on 
bugzilla.kernel.org that mentioned 4.7; I briefly looked at them, but
it looks like most of them are not regressions (sometimes it's not 
entirely clear, but I currently lack the time to investigate further 
into each of those). 

I'll send the first regression report for 4.8 next week.

HTH, CU, Thorsten

P.S.: Gmane, I miss you. A lot. 

== Current regressions ==

Desc: System hang when plug/un-plug USB 3.1 key via thunderbolt port on Dell 
XPS 13
Repo: 2016-06-14 https://bugzilla.kernel.org/show_bug.cgi?id=120241
Stat: 2016-08-05 https://bugzilla.kernel.org/show_bug.cgi?id=120241#c40
Note: wip

Desc: regression in 8250 uart driver
Repo: 2016-06-14 
http://thread.gmane.org/gmane.linux.kernel/2243130/focus=2243653
Stat: 2016-07-02 https://lkml.org/lkml/2016/7/2/94
Note: need to poke andy for a status update

Desc: ath9k/ ar5008 Oopses on v4.7-rc7+
Repo: 2016-07-17 https://marc.info/?l=linux-wireless&m=146878954223067&w=2
Stat: 2016-08-04 https://marc.info/?l=linux-wireless&m=147029189600658&w=2
Note: fix ath9k-fix-accessing-gpio-warning.patch available

Desc: Regression: commit 12549e2 clocksource/drivers/time-armada-370-xp: 
Convert init function to return error
Repo: 2016-07-28 https://bugzilla.kernel.org/show_bug.cgi?id=150571
Stat: 2016-08-06 https://lkml.org/lkml/2016/8/6/65
Note: proper developer now in the loop

Desc: ath9k cpuidle warnings ath9k_beacon_config (No adverse effects are 
noticed but this must be from quite a recent commit.)
Repo: 2016-08-02 https://bugzilla.kernel.org/show_bug.cgi?id=151271
Stat: n/a 
Note: needs further investigation; seems I or somebody else needs to help here

Desc: /dev/input/by-path links for PS/2 input devices are not created
Repo: 2016-08-02 https://bugzilla.kernel.org/show_bug.cgi?id=151331
Stat: n/a 
Note: needs further investigation; seems I or somebody else needs to help here

Desc: performance degradation on Opteron 6272 related to COD technology
Repo: 2016-06-28 https://marc.info/?l=linux-kernel&m=146715574528055&w=2
Stat: 2016-07-28 https://marc.info/?l=linux-kernel&m=146974252111026&w=2
Note: wip


== Stalled, waiting for feedback from reporter ==

Desc: 795ae7a0de: pixz.throughput -9.1% regression
Repo: 2016-06-02 http://thread.gmane.org/gmane.linux.kernel/2233056/
Stat: 2016-06-22 https://lkml.org/lkml/2016/6/22/754
Note: stalled; hannes could not reproduce and asked reporter for further 
feedback weeks ago

Desc: kmemcheck: Caught 64-bit read from uninitialized memory; 
iptables/nf_register_net_hooks
Repo: 2016-06-19 https://bugzilla.kernel.org/show_bug.cgi?id=120651
Stat: 2016-07-17 https://bugzilla.kernel.org/show_bug.cgi?id=120651#c8
Note: waiting for reporter to check a proposed fix


== Going to be removed from the list ==

Desc: UBSAN splat in drivers/acpi/acpica/dsutils.c:641:16
Repo: 2016-06-15 https://bugzilla.kernel.org/show_bug.cgi?id=120351
Stat: 2016-06-22 https://bugzilla.kernel.org/show_bug.cgi?id=120411#c3
Note: this debugging feature found new problems, but it seems the bugs don't 
get much attention (like many others on bugzilla.kernel.org); if nobody 
complains I'll remove this regression from the report to spend time on more 
important bugs; related, similar bugs: 
https://bugzilla.kernel.org/show_bug.cgi?id=120411 
https://bugzilla.kernel.org/show_bug.cgi?id=120391 
https://bugzilla.kernel.org/show_bug.cgi?id=120371 
https://bugzilla.kernel.org/show_bug.cgi?id=120361
Not sure which of them are regressions.


== Fixed since last report ==

Desc: Multi-thread udp 4.7 regression, bisected to 71d8c47fc653; performance 
decreased or complete failure with a test
Repo: 2016-06-27 http://thread.gmane.org/gmane.linux.network/418861/focus=418908
Stat: 2016-07-12 http://thread.gmane.org/gmane.linux.network/418861/focus=63936
Note: fixed by 
https://git.kernel.org/torvalds/c/590b52e10d410e1439ae86be9fe19d75fdab628b

Desc: After commit f21a21983ef13a, the i915 display is turned off
Repo: 2016-06-16 https://bugs.freedesktop.org/show_bug.cgi?id=96675
Stat: 2016-07-15 http://thread.gmane.org/gmane.linux.kernel/2266539/focus=94597 
https://bugs.freedesktop.org/show_bug.cgi?id=96675#c11
Note: fixed by 
https://git.kernel.org/torvalds/c/c71d4d58981bed3366769ef5cf1f20e588fe16d0

Desc: legacy gamma table updates are completely broken for Intel on 
Linux-4.7-rc7
Repo: 2016-07-12 http://thread.gmane.org/gmane.comp.video.dri.devel/159323/
Stat: 2016-07-14 
http://thread.gmane.org/gmane.c

Linux 4.7: Reported regressions as of Sunday, 2016-08-14

2016-08-14 Thread Thorsten Leemhuis
Hi! Here is my eighth regression report for Linux 4.7. It lists 13
regressions I'm currently aware of. 6 of them are new; none were fixed.

As usual: Please let me know about any regressions missing on the list
or if it contains something which shouldn't be there. Since the release
of 4.7 there were quite a few new bugs filed on bugzilla.kernel.org
that mentioned 4.7; I briefly looked at them, but it looks like most of
them are not regressions.

Ciao, Thorsten

== Current regressions ==

Desc: System hang when plug/un-plug USB 3.1 key via thunderbolt port on Dell 
XPS 13
Repo: 2016-06-14 https://bugzilla.kernel.org/show_bug.cgi?id=120241
Stat: 2016-08-12 https://bugzilla.kernel.org/show_bug.cgi?id=120241#c41
Note: wip (slowly)

Desc: regression in 8250 uart driver
Repo: 2016-06-14 
http://thread.gmane.org/gmane.linux.kernel/2243130/focus=2243653
Stat: 2016-07-02 https://lkml.org/lkml/2016/7/2/94
Note: need to poke andy for a status update

Desc: ath9k/ ar5008 Oopses on v4.7-rc7+
Repo: 2016-07-17 https://marc.info/?l=linux-wireless&m=146878954223067&w=2
Stat: 2016-08-04 https://marc.info/?l=linux-wireless&m=147029189600658&w=2
Note: https://patchwork.kernel.org/patch/9262875/; not yet in mainline

Desc: Regression: commit 12549e2 clocksource/drivers/time-armada-370-xp: 
Convert init function to return error
Repo: 2016-07-28 https://bugzilla.kernel.org/show_bug.cgi?id=150571
Stat: 2016-08-06 https://lkml.org/lkml/2016/8/6/65
Note: proper developers now in the loop

Desc: ath9k cpuidle warnings ath9k_beacon_config (No adverse effects are 
noticed but this must be from quite a recent commit.)
Repo: 2016-08-02 https://bugzilla.kernel.org/show_bug.cgi?id=151271
Stat: n/a 
Note: needs further investigation; seems I or somebody else needs to help here

Desc: /dev/input/by-path links for PS/2 input devices are not created
Repo: 2016-08-02 https://bugzilla.kernel.org/show_bug.cgi?id=151331
Stat: n/a 
Note: needs further investigation; seems I or somebody else needs to help here

Desc: performance degradation on Opteron 6272 related to COD technology
Repo: 2016-06-28 https://marc.info/?l=linux-kernel&m=146715574528055&w=2
Stat: 2016-07-28 https://marc.info/?l=linux-kernel&m=146974252111026&w=2
Note: wip

Desc: ext4 error in dx_probe (due to the new parallel dir lookup code?)
Repo: 2016-07-18 https://lkml.org/lkml/2016/7/18/191
Stat: 2016-08-09 https://lkml.org/lkml/2016/8/7/105 
https://lkml.org/lkml/2016/8/9/86
Note: looks like it's fixed in mainline by b47820edd

Desc: Display lost on Kirkwood/OpenRD Client
Repo: 2016-08-11 http://www.spinics.net/lists/kernel/msg2318358.html
Stat: 2016-08-11 http://www.spinics.net/lists/kernel/msg2319139.html
Note: proper patch in the works

Desc: [AR9285] WiFi LED stops working after suspend/hibernate cycle
Repo: 2016-08-08 https://bugzilla.kernel.org/show_bug.cgi?id=151711
Stat: n/a 
Note: 

Desc: Updating from 4.6.4 to 4.7 breaks pptp pass through
Repo: 2016-08-12 https://bugzilla.kernel.org/show_bug.cgi?id=152101
Stat: n/a 
Note: 

Desc: e1000e Failed to restore TIMINCA clock rate delta: -22
Repo: 2016-08-12 https://bugzilla.kernel.org/show_bug.cgi?id=152131
Stat: n/a 
Note: 

Desc: iwlmvm warnings on 4.7.0 (7265D firmwares 16 and 21)
Repo: 2016-08-14 https://bugzilla.kernel.org/show_bug.cgi?id=153061
Stat: n/a 
Note: 

== Stalled, waiting for feedback from reporter ==

Desc: 795ae7a0de: pixz.throughput -9.1% regression
Repo: 2016-06-02 http://thread.gmane.org/gmane.linux.kernel/2233056/
Stat: 2016-06-22 https://lkml.org/lkml/2016/6/22/754
Note: stalled; hannes could not reproduce and asked reporter for further 
feedback weeks ago

Desc: kmemcheck: Caught 64-bit read from uninitialized memory; 
iptables/nf_register_net_hooks
Repo: 2016-06-19 https://bugzilla.kernel.org/show_bug.cgi?id=120651
Stat: 2016-07-17 https://bugzilla.kernel.org/show_bug.cgi?id=120651#c8
Note: waiting for reporter to check a proposed fix


Linux 4.8: Reported regressions as of Sunday, 2016-08-14

2016-08-14 Thread Thorsten Leemhuis
Hi! Here is my first regression report for Linux 4.8. It lists 11
regressions. I was told about or found 10 more, but it turned out all 
of them were fixed already in the past few days. Nice, but this is one 
of the reasons why compiling this report took way more hours than the 
past few reports :-/

Anyway: Are you aware of any other regressions? Then please let me 
know. And you know the drill: if there is anything on the report that 
shouldn't be there please let me know.

Ciao, Thorsten

P.S.: Gmane, I still miss you a lot :-/

P.P.S.: "There are only two hard things in Computer Science: cache 
invalidation, naming things, and off-by-one errors" is my reply to 
those who asked to borrow my time machine after reading the line "I'll
send the first regression report for 4.9 next week" that last week's
report contained ;-)

== Current regressions ==

Desc: irqdomain: Don't set type when mapping an IRQ breaks nexus7 gpio buttons
Repo: 2016-07-30 https://marc.info/?l=linux-kernel&m=146985356305280&w=2
Stat: 2016-08-12 https://marc.info/?l=linux-kernel&m=147099735609904&w=2
Note: wip

Desc: genirq: Flags mismatch irq 8, 0088 (mmc0) vs. 0080 (rtc0). mmc0: 
Failed to request irq 8: -16
Repo: 2016-08-01 https://bugzilla.kernel.org/show_bug.cgi?id=150881
Stat: 2016-08-13 https://bugzilla.kernel.org/show_bug.cgi?id=150881#c26
Note: wip

Desc: Failed to create /dev/root: -14 after commit e6978e4bf1 (ARM: save and 
reset the address limit when entering an exception)
Repo: 2016-08-02 https://lkml.org/lkml/2016/8/2/2085
Stat: 2016-08-11 https://lkml.org/lkml/2016/8/10/858
Note: patch heading upstream

Desc: wlcore: NULL pointer dereference in wlcore_op_get_expected_throughput
Repo: 2016-08-04 https://marc.info/?l=linux-kernel&m=147031427806879&w=2
Stat: n/a 
Note: patches heading upstream

Desc: Crashes in refresh_zone_stat_thresholds when some nodes have no memory
Repo: 2016-08-04 https://marc.info/?t=14702932181&r=1&w=2
Stat: n/a 
Note: nothing happened since the report

Desc: Intermittent crash on reboot (acpi_ex_system_reset_event)
Repo: 2016-08-04 https://bugzilla.kernel.org/show_bug.cgi?id=151441
Stat: 2016-08-10 https://bugzilla.kernel.org/show_bug.cgi?id=151441#c2
Note: wip

Desc: [lkp] [mm, page_alloc] e6cbd7f2ef: pixz.throughput -5.1% regression due 
to mm, page_alloc: remove fair zone allocation policy
Repo: 2016-08-08 https://marc.info/?l=linux-kernel&m=147064515210332&w=3
Stat: n/a 
Note: nothing happened since the report

Desc: [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
Repo: 2016-08-09 http://www.spinics.net/lists/kernel/msg2317052.html
Stat: 2016-08-14 http://www.spinics.net/lists/kernel/msg2320159.html
Note: wip

Desc: 362899b (macvtap: switch to use skb array) causes oops during teardown
Repo: 2016-08-10 http://www.spinics.net/lists/netdev/msg389465.html
Stat: 2016-08-10 http://www.spinics.net/lists/netdev/msg389465.html
Note: tested fix: http://www.spinics.net/lists/netdev/msg389623.html

Desc: mmc: Secure discard is broken
Repo: 2016-08-11 http://www.spinics.net/lists/linux-mmc/msg38481.html
Stat: 2016-08-11 http://www.spinics.net/lists/linux-mmc/msg38491.html
Note: report contains patch

Desc: imx-drm: Possible regression after update to atomic: black screen after 
Weston starts
Repo: 2016-08-13 http://www.spinics.net/lists/kernel/msg2320013.html
Stat: n/a 
Note: 


Linux 4.8: Reported regressions as of Sunday, 2016-08-28

2016-08-28 Thread Thorsten Leemhuis
Hi! Here is my second regression report for Linux 4.8. It lists 11
regressions. 5 of them are new; 5 mentioned in the last report two 
weeks ago got fixed.

FWIW: A small detail: I did not include "Regression - SATA disks behind 
USB ones on v4.8-rc1, breaking boot. [Re: Who reordered my disks]" 
(http://www.spinics.net/lists/linux-usb/msg144871.html ) in the list 
below. The discussion mentions that device names like /dev/sd? are not 
considered stable as they might change depending on various factors -- 
like the order in which modules are loaded or other timing issues (like 
in this case). That is how it is afaik (even if it's not well known), 
and that's why I didn't include the issue; let me know if you think it 
should be on the list.

OTOH I included "Commit cb4f71c429 deliberately changes order of 
network interfaces" (http://www.spinics.net/lists/kernel/msg2325600.html )
for now, as I think traditional network interface names (eth0, eth1, ...)
might be considered stable -- but I'm not sure, that's why I raise it
here.

Anyway, you know the drill: Are you aware of any other regressions?
Then please let me know. And tell me if there is anything in the
report that shouldn't be there.

Ciao, Thorsten

P.S.: Thanks to Aaro Koskinen, Hans de Goede, and Pavel Machek for 
CCing me when reporting regressions. Much appreciated! Ohh, and thx 
to all those that replied when I asked them for status updates
when things looked stuck. 

P.P.S: Sorry, I did not manage to compile a report last weekend. I 
visited a FLOSS conference and got hit by a flu there that slowed me 
down all week :-/ That's why there won't be a regression report for 
4.7. I'll be travelling again next weekend, so there won't be a 
regression report next Sunday :-/

== Current regressions ==

Desc: irqdomain: Don't set type when mapping an IRQ breaks nexus7 gpio buttons
Repo: 2016-07-30 https://marc.info/?l=linux-kernel&m=146985356305280&w=2
Stat: 2016-08-12 https://marc.info/?l=linux-kernel&m=147093069326172&w=2
Note: fix found two weeks ago, not in mainline afaics; waiting for tglx to get 
back?

Desc: genirq: Flags mismatch irq 8, 0088 (mmc0) vs. 0080 (rtc0). mmc0: 
Failed to request irq 8: -16
Repo: 2016-08-01 https://bugzilla.kernel.org/show_bug.cgi?id=150881
Stat: 2016-08-19 https://bugzilla.kernel.org/show_bug.cgi?id=150881#c34
Note: reporter is bisecting the bug that got introduced by the main gpio merge 
for 4.8

Desc: Intermittent crash on reboot (acpi_ex_system_reset_event)
Repo: 2016-08-04 https://bugzilla.kernel.org/show_bug.cgi?id=151441
Stat: 2016-08-15 https://bugzilla.kernel.org/show_bug.cgi?id=151441#c7
Note: poked reporter, as nothing had happened after he was asked to test a 
revert

Desc: [lkp] [mm, page_alloc] e6cbd7f2ef: pixz.throughput -5.1% regression due 
to mm, page_alloc: remove fair zone allocation policy
Repo: 2016-08-08 https://marc.info/?l=linux-kernel&m=147064515210332&w=3
Stat: n/a 
Note: poked Mel, as nothing happened after the report

Desc: [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
Repo: 2016-08-09 http://www.spinics.net/lists/kernel/msg2317052.html
Stat: 2016-08-20 http://www.spinics.net/lists/kernel/msg2325437.html
Note: hmm, long discussion about potential issues, but no simple solution in 
sight and discussion ended about a week ago

Desc: imx-drm: Possible regression after update to atomic: black screen after 
Weston starts
Repo: 2016-08-13 http://www.spinics.net/lists/kernel/msg2320013.html
Stat: n/a 
Note: poked Dave, as nothing happened after the report

Desc: DT/OCTEON driver probing broken
Repo: 2016-08-16 http://www.spinics.net/lists/devicetree/msg138990.html
Stat: 2016-08-28 
Note: proposed fix: https://patchwork.linux-mips.org/patch/14041/

Desc: gpio-leds broken on OCTEON
Repo: 2016-08-23 http://www.spinics.net/lists/devicetree/msg139863.html
Stat: 2016-08-25 http://www.spinics.net/lists/devicetree/msg140179.html
Note: wip

Desc: Commit cb4f71c429 deliberately changes order of network interfaces
Repo: 2016-08-21 http://www.spinics.net/lists/kernel/msg2325600.html
Stat: n/a 
Note: long, ongoing discussion if this is considered a regression

Desc: Bluetooth: Fix bt_sock_recvmsg when MSG_TRUNC is not set
Repo: 2016-08-25 http://www.spinics.net/lists/linux-bluetooth/msg68012.html
Stat: 2016-08-25 http://www.spinics.net/lists/linux-bluetooth/msg68049.html
Note: patch in bluetooth-stable tree

Desc: Skylake graphics regression: projector failure with 4.8-rc3
Repo: 2016-08-26 http://www.spinics.net/lists/intel-gfx/msg105478.html
Stat: n/a 
Note: brand new, more details expected

== Fixed since last report ==

Desc: Failed to create /dev/root: -14 after commit e6978e4bf1 (ARM: save and 
reset the address limit when entering an exception)
Repo: 2016-08-02 https://lkml.org/lkml/2016/8/2/2085
Fix:  https://git.kernel.org/torvalds/c/87eed3c74d7c65556f744230a90bf9556dd29146

Desc: wlcore: NULL pointer dereference in wlcore_op_get_expected_throughput
Repo: 2016

Linux 4.8: Reported regressions as of Sunday, 2016-09-11

2016-09-11 Thread Thorsten Leemhuis
Hi! Here is my third regression report for Linux 4.8. It lists 10
regressions I'm aware of. 6 of them are new; 3 mentioned in the last
report sent two weeks ago got fixed; 3 got removed for other reasons
(see below).

As always: Are you aware of any other regressions? Then please let me
know (simply CC regressi...@leemhuis.info). And tell me if there is
anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Sorry for not sending a report last week, I did not find the time
as I was travelling :-/


== Current regressions ==

Desc: genirq: Flags mismatch irq 8, 0088 (mmc0) vs. 0080 (rtc0). mmc0: 
Failed to request irq 8: -16
Repo: 2016-08-01 https://bugzilla.kernel.org/show_bug.cgi?id=150881
Stat: 2016-09-09 https://bugzilla.kernel.org/show_bug.cgi?id=150881#c34
Note: root cause somewhere in the main gpio merge for 4.8, but problematic 
commit still unknown

Desc: [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
Repo: 2016-08-09 http://www.spinics.net/lists/kernel/msg2317052.html
Stat: 2016-09-09 https://marc.info/?t=14734151953&r=1&w=2
Note: Mel working on it in his spare time, but "The progression of this series 
has been unsatisfactory."

Desc: DT/OCTEON driver probing broken
Repo: 2016-08-16 http://www.spinics.net/lists/devicetree/msg138990.html
Stat: 2016-08-30 http://www.spinics.net/lists/devicetree/msg140682.html
Note: stalled, poked Rob

Desc: gpio-leds broken on OCTEON
Repo: 2016-08-23 http://www.spinics.net/lists/devicetree/msg139863.html
Stat: 2016-08-25 http://www.spinics.net/lists/devicetree/msg140179.html
Note: stalled, poked list

Desc: Skylake graphics regression: projector failure with 4.8-rc3
Repo: 2016-08-26 http://www.spinics.net/lists/intel-gfx/msg105478.html
Stat: 2016-09-01 https://lkml.org/lkml/2016/8/31/946
Note: looks a bit stalled, but guess jejb still has not forgotten about it

Desc: lk 4.8 + !CONFIG_SHMEM + shmat() = oops
Repo: 2016-08-30 http://www.spinics.net/lists/linux-mm/msg112920.html
Stat: 2016-09-07 http://www.spinics.net/lists/linux-mm/msg113177.html
Note: patch "ipc/shm: fix crash if CONFIG_SHMEM is not set" is going to fix this

Desc: ath9k: bring back direction setting in ath9k_{start_stop}
Repo: 2016-09-01 https://marc.info/?l=linux-wireless&m=147292415030585&w=2
Stat: 2016-09-07 https://marc.info/?l=linux-wireless&m=147325468810927&w=2
Note: fix heading upstream

Desc: regression in re-read operation by iozone ~10%
Repo: 2016-09-02 https://bugzilla.kernel.org/show_bug.cgi?id=155821
Stat: n/a 
Note: told reporter he might be better off posting about the issue to some 
mailing list

Desc: Suspend fails & system is unusable after attempt (Dell XPS 13 9350 / 
Skylake)
Repo: 2016-09-08 https://bugzilla.kernel.org/show_bug.cgi?id=156361
Stat: n/a 
Note: quite new

Desc: CPU speed set very low
Repo: 2016-09-09 https://lkml.org/lkml/2016/9/9/608
Stat: n/a 
Note: quite new

== Going to be removed from the list ==

Desc: [lkp] [mm, page_alloc] e6cbd7f2ef: pixz.throughput -5.1% regression due 
to mm, page_alloc: remove fair zone allocation policy
Repo: 2016-08-08 https://marc.info/?l=linux-kernel&m=147064515210332&w=3
Stat: 2016-08-30 https://lkml.org/lkml/2016/8/30/176
Note: Quoting Mel: Drop it for the moment. My expectation is that it's a 
relatively minor hazard. […]

Desc: Intermittent crash on reboot (acpi_ex_system_reset_event)
Repo: 2016-08-04 https://bugzilla.kernel.org/show_bug.cgi?id=151441
Stat: 2016-08-15 https://bugzilla.kernel.org/show_bug.cgi?id=151441#c7
Note: reporter stopped seeing this crash

Desc: Commit cb4f71c429 deliberately changes order of network interfaces
Repo: 2016-08-21 http://www.spinics.net/lists/kernel/msg2325600.html
Stat: n/a 
Note: long discussion if this is considered a regression at all; mentioned it 
in the last report to make sure people are aware of the issue; nothing happened 
since then, so I assume people do not consider it a regression


== Fixed since last report ==

Desc: irqdomain: Don't set type when mapping an IRQ breaks nexus7 gpio buttons
Repo: 2016-07-30 https://marc.info/?l=linux-kernel&m=146985356305280&w=2
Fix:  https://git.kernel.org/torvalds/c/1e12c4a9393b75a744aada2c8115434572698bc3

Desc: imx-drm: Possible regression after update to atomic: black screen after 
Weston starts
Repo: 2016-08-13 http://www.spinics.net/lists/kernel/msg2320013.html
Fix:  https://git.kernel.org/torvalds/c/c6c1f9bc798bee7cfc2e172cd2c9b48187d801a7

Desc: Bluetooth: Fix bt_sock_recvmsg when MSG_TRUNC is not set
Repo: 2016-08-25 http://www.spinics.net/lists/linux-bluetooth/msg68012.html
Fix:  https://git.kernel.org/torvalds/c/90a56f72edb088c678083c32d05936c7c8d9a948



Linux 4.13: Reported regressions as of Sunday, 2017-07-30

2017-07-30 Thread Thorsten Leemhuis
Hi! Find below my first regression report for Linux 4.13. It lists 8
regressions I'm currently aware of (a few others I had on my list got
fixed in the past few days). You can also find it at
http://bit.ly/lnxregrep413 where I try to update it every now and then.

As always: Are you aware of any other regressions? Then please let me
know. For details see http://bit.ly/lnxregtrackid
And please tell me if there is anything in the report that shouldn't be
there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, it makes compiling these reports a whole lot easier!

== Current regressions ==

[x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
2017-07-10 http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
Due to https://git.kernel.org/torvalds/c/e585513b76

Null dereference in rt5677_i2c_probe()
2017-07-17 lr#96bd63 https://bugzilla.kernel.org/show_bug.cgi?id=196397
Due to https://git.kernel.org/torvalds/c/a36afb0ab6
Status: Takashi proposed a patch that fixes the issue
Latest discussion: https://bugzilla.kernel.org/show_bug.cgi?id=196397#c6
(2017-07-17)

[I945GM] Pasted text not shown after mouse middle-click
2017-07-17 lr#d672f3 https://bugs.freedesktop.org/show_bug.cgi?id=101819
Status: could not get reproduced yet
Notes: related to the regression that was fixed rc2+
https://bugs.freedesktop.org/show_bug.cgi?id=101790

[Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
keyboard
2017-07-24 lr#bd29ab https://bugzilla.kernel.org/show_bug.cgi?id=196459
Due to https://git.kernel.org/torvalds/c/33e4f80ee6
Status: it's a tracking bug, looks like issue is handled by Intel devs
already
Notes: suspend-to-idle is rare

[lkp-robot] [Btrfs]  28785f70ef: xfstests.generic.273.fail
2017-07-26 lr#a7d273 https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
Due to https://git.kernel.org/torvalds/c/28785f70ef

Dell XPS 13 9360: Touchscreen does not report events
2017-07-28 lr#fe68bb https://bugzilla.kernel.org/show_bug.cgi?id=196519
Status: afaics waiting to get forwarded to linux-usb by reporter
Notes: might be the same as
https://bugzilla.kernel.org/show_bug.cgi?id=196431

Xen HVM guest with KASLR enabled wouldn't boot any longer
2017-07-28 https://lkml.kernel.org/r/20170728102314.29100-1-jgr...@suse.com
Status: WIP, patches up for review

NULL pointer deref in networking
2017-07-29 lr#084be9 https://bugzilla.kernel.org/show_bug.cgi?id=196529
Status: told reporter he might be better off posting to netdev


Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-06 Thread Thorsten Leemhuis
Hi! Find below my second regression report for Linux 4.13. It lists 10
regressions I'm currently aware of (albeit in one case it's not entirely
clear yet if it's a regression in 4.13). One regression got fixed since
last week's report. You can also find the report at
http://bit.ly/lnxregrep413 where I try to update it every now and then.

As always: Are you aware of any other regressions? Then please let me
know. For details see http://bit.ly/lnxregtrackid And please tell me if
there is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, it makes compiling these reports a whole lot easier!

== Current regressions ==

[x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
(2017-07-10)
http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
Status: Asked on the list, but issue still gets ignored by everyone
Cause: https://git.kernel.org/torvalds/c/e585513b76
Note: I'm a bit unsure if adding this issue to this list was a good idea.

Null dereference in rt5677_i2c_probe() (2017-07-17)
https://bugzilla.kernel.org/show_bug.cgi?id=196397
Linux-Regression-ID: lr#96bd63
Status: Patch is available in asoc-next as commit ddc9e69b9dc2, but
was not part of the changes to this subsystem that got merged a few days ago
Cause: https://git.kernel.org/torvalds/c/a36afb0ab6
Latest: https://bugzilla.kernel.org/show_bug.cgi?id=196397#c6 (2017-07-17)

[I945GM] Pasted text not shown after mouse middle-click (2017-07-17)
https://bugs.freedesktop.org/show_bug.cgi?id=101819
Linux-Regression-ID: lr#d672f3
Status: could not get reproduced yet
Note: looks like it's getting ignored
Latest: https://bugs.freedesktop.org/show_bug.cgi?id=101819#c8 (2017-07-17)

[Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
keyboard (2017-07-24)
https://bugzilla.kernel.org/show_bug.cgi?id=196459
Linux-Regression-ID: lr#bd29ab
Status: it's a tracking bug, looks like issue is handled by Intel devs
already
Cause: https://git.kernel.org/torvalds/c/33e4f80ee6
Note: suspend-to-idle is rare

[lkp-robot] [Btrfs]  28785f70ef: xfstests.generic.273.fail (2017-07-26)
https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
Linux-Regression-ID: lr#a7d273
Status: Seems it gets ignored by everyone
Cause: https://git.kernel.org/torvalds/c/28785f70ef

Xen HVM guest with KASLR enabled wouldn't boot any longer  (2017-07-28)
https://lkml.kernel.org/r/20170728102314.29100-1-jgr...@suse.com
Status: WIP, patches up for review, but were not part of the changes to
this subsystem that got merged a few days ago

bio-integrity: Fix regression if profile verify_fn is NULL (2017-08-02)
https://lkml.kernel.org/r/20170802122750.12216-1-gmazyl...@gmail.com
Linux-Regression-ID: lr#35498d
Status: Discussion ongoing how to fix it properly
Latest: https://lkml.kernel.org/r/yq13795epil@oracle.com (2017-08-02)

CIFS mount error -112 (2017-08-06)
https://bugzilla.kernel.org/show_bug.cgi?id=196599
Linux-Regression-ID: lr#60efe5
Status: Brand new


== Waiting for reporter ==

NULL pointer deref in networking (2017-07-29)
https://bugzilla.kernel.org/show_bug.cgi?id=196529
Linux-Regression-ID: lr#084be9
Status: maybe reporter lost interest

SGI UV300/UV300: kernel BUG at arch/x86/mm/init_64.c:350! during boot
(2017-08-02)
https://bugzilla.kernel.org/show_bug.cgi?id=196561
Status: not 100% sure if this is a regression
Note: related to https://bugzilla.kernel.org/show_bug.cgi?id=196565 ?


== Fixed since last week's report ==

Dell XPS 13 9360: Touchscreen does not report events (2017-07-28)
https://bugzilla.kernel.org/show_bug.cgi?id=196519
Linux-Regression-ID: lr#fe68bb
Status: Fixed in rc3


== Legend ==

First few lines -> short summary followed by date and a link to the
 report that led to inclusion in this report
Cause -> commit that causes this regression
Status -> short status summary written by regression tracker
Note -> additional note written by regression tracker
Latest -> most recent and informative point where issue was discussed
See also -> other places where this issue was or is discussed

Everything apart from the description and the link to the report is
optional.
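
For anyone who wants to process these reports with a script, here is a
rough sketch of how one entry in the layout described by the legend could
be parsed. It is only an illustration: the function name parse_entry and
the dictionary keys are made up for this example, and the field list
additionally includes the "Linux-Regression-ID" field that shows up in the
entries above even though the legend does not mention it.

```python
import re

# Field names taken from the legend, plus "Linux-Regression-ID" as seen
# in the entries; everything else here is an illustrative assumption.
FIELD_RE = re.compile(
    r"^(Cause|Status|Note|Latest|See also|Linux-Regression-ID):\s*(.*)$"
)
DATE_RE = re.compile(r"\((\d{4}-\d{2}-\d{2})\)\s*$")


def parse_entry(text):
    """Parse one blank-line-delimited report entry into a dict."""
    entry = {"summary": "", "reported": None, "report": None}
    head = []        # summary lines seen before the report link
    last_key = None  # field that a wrapped continuation line belongs to
    for line in text.strip().splitlines():
        line = line.strip()
        match = FIELD_RE.match(line)
        if match:
            last_key = match.group(1)
            entry[last_key] = match.group(2)
        elif last_key is not None:
            # Wrapped continuation of the previous field.
            entry[last_key] += " " + line
        elif entry["report"] is None and line.startswith("http"):
            entry["report"] = line
        else:
            head.append(line)
    summary = " ".join(head)
    date = DATE_RE.search(summary)
    if date:
        # The summary ends with the report date in parentheses.
        entry["reported"] = date.group(1)
        summary = summary[:date.start()].rstrip()
    entry["summary"] = summary
    return entry
```

Only the summary and the report link are expected to be present; all other
keys simply stay absent when an entry omits the field.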

EOF


Linux 4.13: Reported regressions as of Monday, 2017-08-14

2017-08-14 Thread Thorsten Leemhuis
Hi! Find below my third regression report for Linux 4.13. It lists 11
regressions I'm currently aware of (or 10 if you count the two scsi-mq
regressions discussions as one). 4 regressions are new; 3 got fixed
since last weeks report (two others didn't even make it to the report,
as they were quickly fixed); 1 gets removed. You can also find the
report at http://bit.ly/lnxregrep413 where I try to update it every now
and then.

As always: Are you aware of any other regressions? Then please let me
know. For details see http://bit.ly/lnxregtrackid And please tell me if
there is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, it makes compiling these reports a whole lot easier!

P.P.S.: Sorry, I adjusted the report structure again because I added a
new field that shows the date when a proper kernel developer (normally:
one that is working in the affected subsystem) looked into the issue. That
should hopefully make it easier to spot regressions that are getting
ignored or got stuck somehow.
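
To illustrate how the new field could be used to spot ignored
regressions, here is a small sketch that computes how many days have
passed without developer activity. It assumes the "Reported: <date>
Developer activity: <date or none>" line format used in the entries of
this report; the function name developer_silence is made up for this
example, and treating "none" as "silent since the report date" is my
assumption.

```python
import re
from datetime import date

# Matches e.g. "Reported: 2017-07-10 Developer activity: none"
LINE_RE = re.compile(
    r"^Reported:\s*(\d{4}-\d{2}-\d{2})\s+Developer activity:\s*(\S+)\s*$"
)


def developer_silence(line, today):
    """Return the days since the last developer activity, or None if the
    line is not a 'Reported: ... Developer activity: ...' line."""
    match = LINE_RE.match(line.strip())
    if not match:
        return None
    reported = date.fromisoformat(match.group(1))
    activity = match.group(2)
    # Assumption: with no developer activity at all, count from the
    # day the regression was reported.
    last = reported if activity == "none" else date.fromisoformat(activity)
    return (today - last).days
```

Sorting entries by the returned value would put the regressions that have
been quiet the longest at the top.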


== Current regressions ==

[x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
Status: Asked on the list, but looks like issue gets ignored by everyone
Note: I'm a bit unsure if adding this issue to this list was a good
idea. Side note: Was reported against linux-next in May already
Reported: 2017-07-10 Developer activity: none
http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
Cause: https://git.kernel.org/torvalds/c/e585513b76

Null dereference in rt5677_i2c_probe()
Status: Patch is available in asoc-next as commit ddc9e69b9dc2
Reported: 2017-07-17 Developer activity: 2017-07-27
https://bugzilla.kernel.org/show_bug.cgi?id=196397
https://bugzilla.kernel.org/show_bug.cgi?id=196397#c6
Cause: https://git.kernel.org/torvalds/c/a36afb0ab6
Linux-Regression-ID: lr#96bd63

[Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
keyboard
Status: it's a tracking bug for an issue that seems to get handled by
Intel devs already
Note: suspend-to-idle is rare
Reported: 2017-07-24 Developer activity: 2017-07-24
https://bugzilla.kernel.org/show_bug.cgi?id=196459
Cause: https://git.kernel.org/torvalds/c/33e4f80ee6
Linux-Regression-ID: lr#bd29ab

[lkp-robot] [Btrfs] 28785f70ef: xfstests.generic.273.fail
Status: Jeff: "We're not ignoring it. […] collection of bugs that
approximate a correct result, and we're addressing them individually.[…]"
Reported: 2017-07-26 Developer activity: 2017-08-10
https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
https://lkml.kernel.org/r/bcd49705-e63a-4439-1620-57cd16f5b...@suse.com
Cause: https://git.kernel.org/torvalds/c/28785f70ef
Linux-Regression-ID: lr#a7d273

SCSI-MQ performance regression due to blk-mq scheduler
Status: Revert planned
https://lkml.kernel.org/r/20170813174422.16197-1-...@lst.de
Note: see also "Switching to MQ by default may generate some bug reports"
Reported: 2017-07-31 Developer activity: 2017-08-13
https://lkml.kernel.org/r/20170731165111.11536-2-ming@redhat.com
https://lkml.kernel.org/r/20170813174422.16197-1-...@lst.de
Cause: https://git.kernel.org/torvalds/c/5c279bd9e4

Switching to MQ by default may generate some bug reports
Status: Revert planned
https://lkml.kernel.org/r/20170813174422.16197-1-...@lst.de
Note: see also "SCSI-MQ performance regression due to blk-mq scheduler"
Reported: 2017-08-03 Developer activity: 2017-08-13
https://lkml.kernel.org/r/20170803085115.r2jfz2lofy5sp...@techsingularity.net
https://lkml.kernel.org/r/20170813174422.16197-1-...@lst.de
Cause: https://git.kernel.org/torvalds/c/5c279bd9e4

CIFS mount error -112 due to "SMB3 by default for security reasons"
Status: Reminded people they need to get the issue to the mailing list
Note: Due to the changes in 908b852df1d5d27d289e915fea7bfc16d38b8a76.
That's a security change, but one that IMHO at least could have been
handled a lot better by giving users a hint what's wrong
Reported: 2017-08-06 Developer activity: none
https://bugzilla.kernel.org/show_bug.cgi?id=196599
https://bugzilla.kernel.org/show_bug.cgi?id=196599#c6
Cause: https://git.kernel.org/torvalds/c/eef914a9eb
Linux-Regression-ID: lr#60efe5

clang build regression in ext4
Status: report contains patch to fix issue
Reported: 2017-08-07 Developer activity: 2017-08-12
https://lkml.kernel.org/r/20170807105701.3835991-1-a...@arndb.de
Cause: https://git.kernel.org/torvalds/c/2df2c3402f

ACPI/IORT: fix build regression without IOMMU
Status: report contains patch to fix issue
Reported: 2017-08-10 Developer activity: 2017-08-10
https://lkml.kernel.org/r/20170810121114.2509560-1-a...@arndb.de
Cause: https://git.kernel.org/torvalds/c/bc8648d49a

Build regression: cc1: error: '-march=r3000' requires '-mfp32'
Status: brand new
Reported: 2017-08-13 Developer activity: none
https://lkml.kernel.org/r/59901cdb.b0ndvwhnqacjcnum%fengguang...@intel.com
Cause: https://git.kernel.org/torvalds/c/89a55278de

Lockdep: possi

Linux 4.13: Reported regressions as of Monday, 2017-08-28

2017-08-28 Thread Thorsten Leemhuis
Hi! Find below my fourth regression report for Linux 4.13. It lists 6
regressions I'm currently aware of. 1 of them is new, 5 got fixed since
the last report (that was two weeks ago; didn't find time for compiling
one last week; sorry). You can also find the report at
http://bit.ly/lnxregrep413 where I try to update it every now and then.
That didn't work too well in the past few weeks; but I'll try to update
it at the end of the week as the 4.13 release gets closer.

As always: Are you aware of any other regressions? Then please let me
know. For details see http://bit.ly/lnxregtrackid And please tell me if
there is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, it makes compiling these reports a whole lot easier!

== Current regressions ==

[x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
Status: Asked on the list, but looks like issue gets ignored by everyone
Note: I'm a bit unsure if adding this issue to this list was a good
idea. Side note: Was reported against linux-next in May already
Reported: 2017-07-10 Developer activity: none known
http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
Cause: https://git.kernel.org/torvalds/c/e585513b76f7

[Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
keyboard
Status: it's a tracking bug for an issue that seems to get handled by
Intel devs already
Note: suspend-to-idle is rare
Reported: 2017-07-24 Developer activity: 2017-07-24
https://bugzilla.kernel.org/show_bug.cgi?id=196459
Cause: https://git.kernel.org/torvalds/c/33e4f80ee69b
Linux-Regression-ID: lr#bd29ab

[lkp-robot] [Btrfs] 28785f70ef: xfstests.generic.273.fail
Status: Jeff: "We're not ignoring it. […] collection of bugs that
approximate a correct result, and we're addressing them individually.[…]"
Reported: 2017-07-26 Developer activity: 2017-08-10
https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
https://lkml.kernel.org/r/bcd49705-e63a-4439-1620-57cd16f5b...@suse.com
Cause: https://git.kernel.org/torvalds/c/28785f70ef88
Linux-Regression-ID: lr#a7d273

CIFS mount error -112 due to "SMB3 by default for security reasons"
Status: Issue was raised on the mailing list, but ignored
Note: That's a security change, but one that IMHO at least could have
been handled a lot better by giving users a hint what's wrong (and not
"mount: […] Host is down"). I'm considering submitting a revert as an RFC
to get a discussion going.
Reported: 2017-08-06 Developer activity: none known
https://bugzilla.kernel.org/show_bug.cgi?id=196599
https://www.spinics.net/lists/linux-cifs/msg12992.html
Cause: https://git.kernel.org/torvalds/c/eef914a9eb5e &
https://git.kernel.org/torvalds/c/908b852df1d5
Linux-Regression-ID: lr#60efe5

Build regression: cc1: error: '-march=r3000' requires '-mfp32'
Note: is there any way to query if this is still happening in 0-day?
Reported: 2017-08-13 Developer activity: none known
https://lkml.kernel.org/r/59901cdb.b0ndvwhnqacjcnum%fengguang...@intel.com
Cause: https://git.kernel.org/torvalds/c/89a55278dee4

regression when ATI chipsets detected
Status: Fix proposed
Reported: 2017-08-23 Developer activity: 2017-08-24
https://lkml.kernel.org/r/1503485760-15146-1-git-send-email-sandeep.si...@amd.com
https://lkml.kernel.org/r/1503548835-27057-1-git-send-email-sandeep.si...@amd.com
Cause: https://git.kernel.org/torvalds/c/e788787ef4f9


== Going to get removed ==

SGI UV300/UV300: kernel BUG at arch/x86/mm/init_64.c:350! during boot
Status: not 100% sure if this is a regression; reporter didn't provide
feedback
Note: related to https://bugzilla.kernel.org/show_bug.cgi?id=196565 ?
Reported: 2017-08-02 Developer activity: none known
https://bugzilla.kernel.org/show_bug.cgi?id=196561

ACPI/IORT: build regression without IOMMU
Note: Adding this was a mistake, as the causing commit was in linux-next
and not yet in mainline. Sorry for the noise.
Reported: 2017-08-10 Developer activity: none known
https://lkml.kernel.org/r/20170810121114.2509560-1-a...@arndb.de


== Fixed since last report ==

Null dereference in rt5677_i2c_probe()
Status: Fixed in https://git.kernel.org/torvalds/c/9ce76511b67b
Reported: 2017-07-17 Developer activity: none known
https://bugzilla.kernel.org/show_bug.cgi?id=196397
Cause: https://git.kernel.org/torvalds/c/a36afb0ab648
Linux-Regression-ID: lr#96bd63

SCSI-MQ performance regression due to blk-mq scheduler
Status: Fixed in https://git.kernel.org/torvalds/c/cbe7dfa26eee
Reported: 2017-07-31 Developer activity: none known
https://lkml.kernel.org/r/20170731165111.11536-2-ming@redhat.com
https://lkml.kernel.org/r/20170813174422.16197-1-...@lst.de
Cause: https://git.kernel.org/torvalds/c/5c279bd9e406

Switching to MQ by default may generate some bug reports
Status: Fixed in https://git.kernel.org/torvalds/c/cbe7dfa26eee
Reported: 2017-08-03 Developer activity: none known
https://lkml.kernel.org/r/20170803085115.r2jfz2lofy5sp...@tech

Linux 4.13: Reported regressions as of Sunday, 2017-09-03

2017-09-03 Thread Thorsten Leemhuis
Hi! Find below my fifth regression report for Linux 4.13. It lists 4
regressions I'm currently aware of. There are no new ones; 2 got fixed
since the last report.

You can also find the report at http://bit.ly/lnxregrep413 where I try
to update it every now and then.

As always: Are you aware of any other regressions? Then please let me
know. For details see http://bit.ly/lnxregtrackid And please tell me if
there is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, it makes compiling these reports a whole lot easier!

== Current regressions ==

[x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
Status: Asked on the list, but looks like issue gets ignored by everyone
Note: I'm a bit unsure if adding this issue to this list was a good
idea. Side note: Was reported against linux-next in May already
Reported: 2017-07-10
http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
Cause: https://git.kernel.org/torvalds/c/e585513b76f7

[lkp-robot] [Btrfs] 28785f70ef: xfstests.generic.273.fail
Status: Jeff: "We're not ignoring it. […] collection of bugs that
approximate a correct result, and we're addressing them individually.[…]"
Reported: 2017-07-26 Last known developer activity: 2017-08-10
https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
https://lkml.kernel.org/r/bcd49705-e63a-4439-1620-57cd16f5b...@suse.com
Cause: https://git.kernel.org/torvalds/c/28785f70ef88
Linux-Regression-ID: lr#a7d273

Build regression: cc1: error: '-march=r3000' requires '-mfp32'
Note: is there any way to query 0-day to see if this is still happening?
Reported: 2017-08-13
https://lkml.org/lkml/2017/8/13/38
Cause: https://git.kernel.org/torvalds/c/89a55278dee4

usb:xhci: regression when ATI chipsets detected
Status: Fix in usb-next/usb-testing
Reported: 2017-08-23 Last known developer activity: 2017-08-28
https://lkml.kernel.org/r/1503485760-15146-1-git-send-email-sandeep.si...@amd.com
Cause: https://git.kernel.org/torvalds/c/e788787ef4f9


== Fixed since last report ==

[Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
keyboard
Status: was a tracking bug that got closed by the developer that created it
Reported: 2017-07-24
https://bugzilla.kernel.org/show_bug.cgi?id=196459
Cause: https://git.kernel.org/torvalds/c/33e4f80ee69b
Linux-Regression-ID: lr#bd29ab

CIFS mount error -112 due to "SMB3 by default for security reasons"
Status: Situation not perfect, but improved a lot by
https://git.kernel.org/torvalds/c/7e682f766f28
Note: https://lkml.org/lkml/2017/8/31/843
Reported: 2017-08-06
https://bugzilla.kernel.org/show_bug.cgi?id=196599
Cause: https://git.kernel.org/torvalds/c/eef914a9eb5e /
https://git.kernel.org/torvalds/c/908b852df1d5
Linux-Regression-ID: lr#60efe5
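
The entries in these reports follow a loose, line-oriented layout: a title, then labeled lines such as "Status:", "Reported:" and "Cause:", plus bare links. A minimal Python sketch of how one such entry could be split into fields follows. The field list, the name parse_entry and the continuation rule are assumptions made for illustration; they are not taken from any tool mentioned in the reports.

```python
import re

# Labels seen in the entries above; treating them as a fixed schema is an
# assumption of this sketch.
FIELD_RE = re.compile(r"^(Status|Note|Reported|Cause|Linux-Regression-ID):\s*(.*)$")


def parse_entry(text):
    """Split one report entry into a title, labeled fields, and bare links."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    entry = {"title": lines[0], "fields": {}, "links": []}
    last = None
    for line in lines[1:]:
        match = FIELD_RE.match(line)
        if match:
            last = match.group(1)
            entry["fields"][last] = match.group(2)
        elif line.startswith(("http://", "https://")):
            # Bare URLs are collected as links, even when they continue a field.
            entry["links"].append(line)
        elif last is not None:
            # Wrapped text continues the most recent labeled field.
            entry["fields"][last] += " " + line
        else:
            # Wrapped text before any label continues the title.
            entry["title"] += " " + line
    return entry


SAMPLE = """\
Build regression: cc1: error: '-march=r3000' requires '-mfp32'
Note: is there any way to query 0-day to see if this is still happening?
Reported: 2017-08-13
https://lkml.org/lkml/2017/8/13/38
Cause: https://git.kernel.org/torvalds/c/89a55278dee4
"""

entry = parse_entry(SAMPLE)
print(entry["title"])
print(entry["fields"]["Reported"], entry["fields"]["Cause"])
```

Note that a "Reported:" line that also carries "Developer activity:" ends up as one combined value here; splitting it further would need an extra rule.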


Re: [PATCH 04/23] scsi: initialize scsi midlayer limits before allocating the queue

2024-05-29 Thread Thorsten Leemhuis
On 29.05.24 16:36, Linux regression tracking (Thorsten Leemhuis) wrote:
> [CCing the regression list, as it should be in the loop for regressions:
> https://docs.kernel.org/admin-guide/reporting-regressions.html]
> 
> On 20.05.24 17:15, Christoph Hellwig wrote:
>> Adding ben and the linuxppc list.
> 
> Hmm, no reply and no other progress to get this resolved afaics. So let's
> bring Michael into the mix, he might be able to help out.
> 
> BTW TWIMC: a PowerMac G5 user reported similar symptoms here
> recently: https://bugzilla.kernel.org/show_bug.cgi?id=218858

And yet another report with similar symptoms, this time with a
"PowerMac7,2 PPC970 0x390202 PowerMac":
https://bugzilla.kernel.org/show_bug.cgi?id=218905

Ciao, Thorsten

>> Context: pata_macio initialization now fails as we enforce that the
>> segment size is set properly.
>>
>> On Wed, May 15, 2024 at 04:52:29PM -0700, Guenter Roeck wrote:
>>> pata_macio_common_init() Calling ata_host_activate() with limit 65280
>>> ...
>>> max_segment_size is 65280; PAGE_SIZE is 65536; BLK_MAX_SEGMENT_SIZE is 65536
>>> WARNING: CPU: 0 PID: 12 at block/blk-settings.c:202 
>>> blk_validate_limits+0x2d4/0x364
>>> ...
>>>
>>> This is with PPC_BOOK3S_64 which selects a default page size of 64k.
>>
>> Yeah.  Did you actually manage to use pata macio previously?  Or is
>> it just used because it's part of the pmac default config?
>>
>>> Looking at the old code, I think it did what you suggested above,
>>
>>> but assuming that the driver requested a lower limit on purpose that
>>> may not be the best solution.
>>
>>> Never mind, though - I updated my test configuration to explicitly
>>> configure the page size to 4k to work around the problem. With that,
>>> please consider this report a note in case someone hits the problem
>>> on a real system (and sorry for the noise).
>>
>> Yes, the idea behind this change was to catch such errors.  So far
>> most errors have been drivers setting lower limits than what the
>> hardware can actually handle, but I'd love to track this down.
>>
>> If the hardware can't actually handle the lower limit we should
>> probably just fail the probe gracefully with a well-commented if
>> statement instead.
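
The numbers in the quoted log can be checked by hand. The sketch below assumes the validation boils down to requiring that one segment can hold at least a full page (max_segment_size >= PAGE_SIZE). That is a reading of the quoted warning, not a copy of blk_validate_limits(); the function name is made up for illustration.

```python
def segment_limit_ok(max_segment_size, page_size):
    """Return True when one segment can hold at least a full page.

    Assumed rule for this sketch; the real kernel check may differ.
    """
    return max_segment_size >= page_size


PATA_MACIO_LIMIT = 65280  # value from the quoted log

# 64k pages (the PPC_BOOK3S_64 default in the report) versus 4k pages
# (the workaround mentioned in the thread).
for page_size in (64 * 1024, 4 * 1024):
    verdict = "ok" if segment_limit_ok(PATA_MACIO_LIMIT, page_size) else "rejected"
    print(f"PAGE_SIZE={page_size}: max_segment_size={PATA_MACIO_LIMIT} {verdict}")
```

Under that assumption the limit of 65280 is 256 bytes short of a 64k page, which matches why switching the test configuration to 4k pages made the warning go away.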


Re: Xorg doesn't start and some other issues with the RC1 of kernel 6.10

2024-05-31 Thread Thorsten Leemhuis
On 31.05.24 11:03, Michael Ellerman wrote:
> Michael Ellerman  writes:
>> Christian Zigotzky  writes:
>>> On 28.05.24 22:00, Christian Zigotzky wrote:
>>>> Hi All,
>>>>
>>>> Xorg doesn't start anymore since the RC1 of kernel 6.10. We tested it 
>>>> with the VirtIO GPU and with some Radeon cards.
>>>>
>>>> Another error message: Failed to start Setup Virtual Console.
>>>>
>>>> Maybe this is the issue: + CONFIG_ARCH_HAS_KERNEL_FPU_SUPPORT=y
>>>>
>>>> Tested with FSL P5040, FSL P5020, and PASEMI boards.
>>>>
>>>> Could you please test Xorg on your PowerPC machines?
>>>>
>>>> Thanks,
>>>> Christian
>>> I tested the RC1 in a virtual e5500 QEMU PowerPC machine with Bochs VGA 
>>> (-device VGA,vgamem_mb=256) and Xorg doesn't start either.
>>>
>>> Error message: xf86OpenConsole: KDSETMODE KD_GRAPHICS failed 
>>> Inappropriate ioctl for device.
>>
>> That is presumably because of this:
>>   
>> https://lore.kernel.org/all/0da9785e-ba44-4718-9d08-4e96c1ba7...@kernel.org/
> 
> Attempting to regzbot this.
> 
> #regzbot introduced: 8c467f330059
> #regzbot monitor: 
> https://lore.kernel.org/all/0da9785e-ba44-4718-9d08-4e96c1ba7...@kernel.org/

Thx, I already had an eye on this, but thought tracking would not be
needed, as Greg (now CCed) wanted to revert 8c467f3300591a ("VT: Use
macros to define ioctls") two days ago:
https://lore.kernel.org/all/2024052901-police-trash-e9f9@gregkh/

But that commit is not yet in -next afaics. :-/

/me meanwhile wonders if it would be wise to fix this before -rc2

Ciao, Thorsten


Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-09-29 Thread Linux regression tracking (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 29.09.23 13:27, Erhard Furtner wrote:
> Greetings!
> 
> Kernel 6.5.5 boots fine on my PowerMac G5 11,2 but kernel 6.6-rc3 fails to 
> boot with following dmesg shown on the OpenFirmware console (transcribed 
> screenshot):
> 
> [...]
> SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> rcu: Hierarchical RCU implementation.
>  Tracing variant of Tasks RCU enabled.
> rcu: RCU calculated value of scheduler-enlistment delay is 30 jiffies.
> NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
> mpic: Setting up MPIC " MPIC 1   " version 1.2 at f804, max 2 CPUs
> mpic: ISU size: 124, shift: 7, mask: 7f
> mpic: Initializing for 124 sources
> mpic: Setting up HT PICs workarounds for U3/U4
> BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe
> Faulting instruction address: 0xc005dc40
> Oops: Kernel access of bad area, sig: 11 [#1]
> BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Tainted: GT  6.6.0-rc3-PMacGS #1
> Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac
> NIP:  c005dc40 LR: c000 CTR: c0007730
> REGS: c22bf510 TRAP: 0380   Tainted: GT 
> (6.6.0-rc3-PMacGS)
> MSR:  90001032   CR: 44004242  XER: 
> IRQMASK: 3
> GPR00:  c22bf7b0 c10c0b00 01ac
> GPR04: 03c8 0300 c000f20001ae 0300
> GPR08: 0006 feffbb62ffec65ff 0001 
> GPR12: 90001032 c2362000 c0f76b80 0349ecd8
> GPR16: 02367ba8 02367f08 0006 
> GPR20: 01ac c0f6f920 c22cd985 000c
> GPR24: 0300 0003b0a3691d c0003e00803e 
> GPR28: c00c c000f20001ee feffbb62ffec65fe 01ac
> NIP [c005dc40] hash_page_do_lazy_icache+0x50/0x100
> LR [c000] __hash_page_4K+0x420/0x590
> Call Trace:
> [c22bf7e0] [] 0x
> [c22bf8c0] [c005e164] hash_page_mm+0x364/0x6f0
> [c22bf990] [c005e684] do_hash_fault+0x114/0x2b0
> [c22bf9c0] [c00078e8] data_access_common_virt+0x198/0x1f0
> --- interrupt: 300 at mpic_init+0x4bc/0x10c4
> NIP:  c2020a5c LR: c2020a04 CTR: 
> REGS: c22bf9f0 TRAP: 0300   Tainted: GT 
> (6.6.0-rc3-PMacGS)
> MSR:  90001032   CR: 24004248  XER: 
> DAR: c0003e00803e DSISR: 4000 IRQMASK: 1
> GPR00:  c22bfc90 c10c0b00 c0003e008030
> GPR04:    
> GPR08:  221b80894c06df2f  
> GPR12:  c2362000 c0f76b80 0349ecd8
> GPR16: 02367ba8 02367f08 02367c70 
> GPR20: 567ce25e8c9202b7 c0f6f920 0001 c0003e008030
> GPR24: c226f348 0004 c404c640 
> GPR28: c0003e008030 c404c000 45886d8559cb69b4 c22bfc90
> NIP [c005dc40] mpic_init+0x4bc/0x10c4
> LR [c000] mpic_init+0x464/0x10c4
> ~~~ interrupt: 300
> [c22bfd90] [c2022ae4] pmac_setup_one_mpic+0x258/0x2dc
> [c22bf2e0] [c2022df4] pmac_pic_init+0x28c/0x3d8
> [c22bfef0] [c200b750] init_IRQ+0x90/0x140
> [c22bff30] [c20053c0] start_kernel+0x57c/0x78c
> [c22bffe0] [c000cb48] start_here_common+0x1c/0x20
> Code: 0929 7c292040 4081007c fbc10020 3d220127 78843664 3929d700 ebc9 
> 7fde2214 e93e 712a0001 40820064  71232000 40820048 e93e
> ---[ end trace  ]---
> 
> Kernel panic - not syncing: Fatal exception
> Rebooting in 40 seconds..
> 
> 
> I bisected the issue and got 9fee28baa601f4dbf869b1373183b312d2d5ef3d as 1st 
> bad commit:
> 
>  # git bisect good
> 9fee28baa601f4dbf869b1373183b312d2d5ef3d is the first bad commit
> commit 9fee28baa601f4dbf869b1373183b312d2d5ef3d
> Author: Matthew Wilcox (Oracle) 
> Date:   Wed Aug 2 16:13:49 2023 +0100
> 
> powerpc: implement the new page table range API
> 
> Add set_ptes(), update_mmu_cache_range() and flush_dcache_folio().  Change
> the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to per-folio.
> 
> [wi...@infradead.org: re-export flush_dcache_icache_folio()]
>   Link: https://lkml.kernel.org/r/zmx1daywvd9em...@c

Re: [PATCH] powerpc: Don't clobber fr0/vs0 during fp|altivec register save

2023-11-18 Thread Linux regression tracking (Thorsten Leemhuis)
On 19.11.23 00:45, Timothy Pearson wrote:
> During floating point and vector save to thread data fr0/vs0 are clobbered
> by the FPSCR/VSCR store routine.  This leads to userspace register corruption
> and application data corruption / crash under the following rare condition:
> [...]
> Tested-by: Timothy Pearson 

Many thx for this, good to see you finally found the problem.

FWIW, you might want to add a

 Closes:
https://lore.kernel.org/all/480932026.45576726.1699374859845.javamail.zim...@raptorengineeringinc.com/

here. Yes, I care about those tags because of regression tracking. But
it only relies on Link:/Closes: tags because they were meant to be used
in the first place to link to backstories and details of a change[1].

And you and Jens did such good debugging in that thread, which is why
it's IMHO really worth linking here in case anyone ever needs to look
into the backstory later.

> Signed-off-by: Timothy Pearson 
> [..]

Thx again for all the work you put into this.

Ciao, Thorsten

[1] see Documentation/process/submitting-patches.rst
(http://docs.kernel.org/process/submitting-patches.html) and
Documentation/process/5.Posting.rst
(https://docs.kernel.org/process/5.Posting.html)

See also these mails from Linus:
https://lore.kernel.org/all/CAHk-=wjMmSZzMJ3Xnskdg4+GGz=5p5p+gsyyfbth0f-dgvd...@mail.gmail.com/
https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nowvkvzjpm3vfu1zobp37fwd_h9iad...@mail.gmail.com/
https://lore.kernel.org/all/CAHk-=wjxzafG-=j8ot30s7upn4rhbs6tx-uvfz5rme+l5_d...@mail.gmail.com/
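
As a rough illustration of why such tags are convenient for tooling, here is a small Python sketch that pulls "Link:" and "Closes:" trailers out of a commit message. The regular expression, the function name and the sample message (including its URL and sign-off) are all hypothetical; real tools may parse trailers differently.

```python
import re

# Matches "Link: <url>" or "Closes: <url>" on a single line.
TAG_RE = re.compile(r"^[ \t]*(Link|Closes):[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def report_links(commit_message):
    """Return (tag, url) pairs for Link:/Closes: lines in a commit message."""
    return TAG_RE.findall(commit_message)


# Made-up commit message; the message id and address are placeholders.
SAMPLE_COMMIT = """\
powerpc: example subject line

Explain the problem and the fix here.

Closes: https://lore.kernel.org/all/hypothetical-message-id@example.org/
Signed-off-by: A Developer <dev@example.org>
"""

for tag, url in report_links(SAMPLE_COMMIT):
    print(tag, url)
```

The sketch expects the URL on the same line as the tag; a tag whose URL wraps onto the next line, as in the mail above, would not be picked up.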


Re: Fwd: Memory corruption in multithreaded user space program while calling fork

2023-07-03 Thread Linux regression tracking (Thorsten Leemhuis)
On 02.07.23 14:27, Bagas Sanjaya wrote:
> I notice a regression report on Bugzilla [1]. Quoting from it:
> 
>> After upgrading to kernel version 6.4.0 from 6.3.9, I noticed frequent but 
>> random crashes in a user space program.  After a lot of reduction, I have 
>> come up with the following reproducer program:
> [...]
>> After tuning the various parameters for my computer, exit code 2, which 
>> indicates that memory corruption was detected, occurs approximately 99% of 
>> the time.  Exit code 1, which occurs approximately 1% of the time, means it 
>> ran out of statically-allocated memory before reproducing the issue, and 
>> increasing the memory usage any more only leads to diminishing returns.  
>> There is also something like a 0.1% chance that it segfaults due to memory 
>> corruption elsewhere than in the statically-allocated buffer.
>>
>> With this reproducer in hand, I was able to perform the following bisection:
> [...]
>
> See Bugzilla for the full thread.

Additional details from
https://bugzilla.kernel.org/show_bug.cgi?id=217624#c5 :

```
I can confirm that v6.4 with 0bff0aaea03e2a3ed6bfa302155cca8a432a1829
reverted no longer causes any memory corruption with either my
reproducer or the original program.
```

FWIW: 0bff0aaea03 ("x86/mm: try VMA lock-based page fault handling
first") [merged for v6.4-rc1, authored by Suren Baghdasaryan [already CCed]]

That's the same commit that causes build problems with go:

https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf...@kernel.org/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot introduced: 0bff0aaea03e2a3


Re: [PATCH v4 29/33] x86/mm: try VMA lock-based page fault handling first

2023-07-03 Thread Linux regression tracking (Thorsten Leemhuis)
On 29.06.23 16:40, Jiri Slaby wrote:
> On 27. 02. 23, 18:36, Suren Baghdasaryan wrote:
>> Attempt VMA lock-based page fault handling first, and fall back to the
>> existing mmap_lock-based handling if that fails.
>>
>> Signed-off-by: Suren Baghdasaryan 
>> ---
>>   arch/x86/Kconfig    |  1 +
>>   arch/x86/mm/fault.c | 36 
>>   2 files changed, 37 insertions(+)
>>
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index a825bf031f49..df21fba77db1 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -27,6 +27,7 @@ config X86_64
>>   # Options that are inherently 64-bit kernel only:
>>   select ARCH_HAS_GIGANTIC_PAGE
>>   select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
>> +    select ARCH_SUPPORTS_PER_VMA_LOCK
>>   select ARCH_USE_CMPXCHG_LOCKREF
>>   select HAVE_ARCH_SOFT_DIRTY
>>   select MODULES_USE_ELF_RELA
>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
>> index a498ae1fbe66..e4399983c50c 100644
>> --- a/arch/x86/mm/fault.c
>> +++ b/arch/x86/mm/fault.c
>> @@ -19,6 +19,7 @@
>>   #include     /* faulthandler_disabled()    */
>>   #include     /*
>> efi_crash_gracefully_on_page_fault()*/
>>   #include 
>> +#include     /* find_and_lock_vma() */
>>     #include     /* boot_cpu_has, ...    */
>>   #include     /* dotraplinkage, ...    */
>> @@ -1333,6 +1334,38 @@ void do_user_addr_fault(struct pt_regs *regs,
>>   }
>>   #endif
>>   +#ifdef CONFIG_PER_VMA_LOCK
>> +    if (!(flags & FAULT_FLAG_USER))
>> +    goto lock_mmap;
>> +
>> +    vma = lock_vma_under_rcu(mm, address);
>> +    if (!vma)
>> +    goto lock_mmap;
>> +
>> +    if (unlikely(access_error(error_code, vma))) {
>> +    vma_end_read(vma);
>> +    goto lock_mmap;
>> +    }
>> +    fault = handle_mm_fault(vma, address, flags |
>> FAULT_FLAG_VMA_LOCK, regs);
>> +    vma_end_read(vma);
>> +
>> +    if (!(fault & VM_FAULT_RETRY)) {
>> +    count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>> +    goto done;
>> +    }
>> +    count_vm_vma_lock_event(VMA_LOCK_RETRY);
> 
> This is apparently not strong enough as it causes go build failures like:

TWIMC & for the record: there is another report about trouble caused by
this change; for details see

https://bugzilla.kernel.org/show_bug.cgi?id=217624

And a "forward to devs and lists" thread about that report:

https://lore.kernel.org/all/facbfec3-837a-51ed-85fa-31021c17d...@gmail.com/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

> [  409s] strconv
> [  409s] releasep: m=0x579e2000 m->p=0x5781c600 p->m=0x0 p->status=2
> [  409s] fatal error: releasep: invalid p state
> [  409s]
> 
> [  325s] hash/adler32
> [  325s] hash/crc32
> [  325s] cmd/internal/codesign
> [  336s] fatal error: runtime: out of memory
> 
> There are many kinds of similar errors. It happens in 1-3 out of 20
> builds only.
> 
> If I revert the commit on top of 6.4, they all disappear. Any idea?
> 
> The downstream report:
> https://bugzilla.suse.com/show_bug.cgi?id=1212775
> 
>> +
>> +    /* Quick path to respond to signals */
>> +    if (fault_signal_pending(fault, regs)) {
>> +    if (!user_mode(regs))
>> +    kernelmode_fixup_or_oops(regs, error_code, address,
>> + SIGBUS, BUS_ADRERR,
>> + ARCH_DEFAULT_PKEY);
>> +    return;
>> +    }
>> +lock_mmap:
>> +#endif /* CONFIG_PER_VMA_LOCK */
>> +
>>   /*
>>    * Kernel-mode access to the user address space should only occur
>>    * on well-defined single instructions listed in the exception
>> @@ -1433,6 +1466,9 @@ void do_user_addr_fault(struct pt_regs *regs,
>>   }
>>     mmap_read_unlock(mm);
>> +#ifdef CONFIG_PER_VMA_LOCK
>> +done:
>> +#endif
>>   if (likely(!(fault & VM_FAULT_ERROR)))
>>   return;
>>   
> 
> thanks,


Re: Fwd: Memory corruption in multithreaded user space program while calling fork

2023-07-05 Thread Linux regression tracking (Thorsten Leemhuis)
On 05.07.23 09:08, Greg KH wrote:
> On Tue, Jul 04, 2023 at 01:22:54PM -0700, Suren Baghdasaryan wrote:
>> On Tue, Jul 4, 2023 at 9:18 AM Andrew Morton  
>> wrote:
>>> On Tue, 4 Jul 2023 09:00:19 +0100 Greg KH  
>>> wrote:
 Thanks! I'll investigate this later today. After discussing with
 Andrew, we would like to disable CONFIG_PER_VMA_LOCK by default until
 the issue is fixed. I'll post a patch shortly.
>>>
>>> Posted at: 
>>> https://lore.kernel.org/all/20230703182150.2193578-1-sur...@google.com/
>>
>> As that change fixes something in 6.4, why not cc: stable on it as well?
>
> Sorry, I thought since per-VMA locks were introduced in 6.4 and this
> patch is fixing 6.4 I didn't need to send it to stable for older
> versions. Did I miss something?

 6.4.y is a stable kernel tree right now, so yes, it needs to be included
 there :)
>>>
>>> I'm in wait-a-few-days-mode on this.  To see if we have a backportable
>>> fix rather than disabling the feature in -stable.

Andrew, how long will you remain in "wait-a-few-days-mode"? Given what
Greg said below and that we already had three reports I know of, I'd
prefer if we could fix this sooner rather than later in mainline --
especially as Arch Linux and openSUSE Tumbleweed likely have switched to
6.4.y already or will do so soon.

>> Ok, I think we have a fix posted at [2] and it cleanly applies to
>> 6.4.y stable branch as well. However fork() performance might slightly
>> regress, therefore disabling per-VMA locks by default for now seems to
>> be preferable even with this fix (see discussion at
>> https://lore.kernel.org/all/54cd9ffb-8f4b-003f-c2d6-3b6b0d2cb...@google.com/).
>> IOW, both [1] and [2] should be applied to 6.4.y stable. Both apply
>> cleanly and I CC'ed stable on [2]. Greg, should I send [1] separately
>> to stable@vger?
> 
> We can't do anything for stable until it lands in Linus's tree, so if
> you didn't happen to have the stable@ tag in the patch, just email us
> the git SHA1 and I can pick it up that way.
> 
> thanks,
> 
> greg k-h

Ciao, Thorsten


Re: linux-next: Tree for Jul 13 (drivers/video/fbdev/ps3fb.c)

2023-07-18 Thread Linux regression tracking (Thorsten Leemhuis)
Michael, thx for looking into this!

On 18.07.23 13:48, Michael Ellerman wrote:
> Bagas Sanjaya  writes:
>> On Thu, Jul 13, 2023 at 09:11:10AM -0700, Randy Dunlap wrote:
>>> on ppc64:
>>>
>>> In file included from ../include/linux/device.h:15,
>>>  from ../arch/powerpc/include/asm/io.h:22,
>>>  from ../include/linux/io.h:13,
>>>  from ../include/linux/irq.h:20,
>>>  from ../arch/powerpc/include/asm/hardirq.h:6,
>>>  from ../include/linux/hardirq.h:11,
>>>  from ../include/linux/interrupt.h:11,
>>>  from ../drivers/video/fbdev/ps3fb.c:25:
>>> ../drivers/video/fbdev/ps3fb.c: In function 'ps3fb_probe':
>>> ../drivers/video/fbdev/ps3fb.c:1172:40: error: 'struct fb_info' has no 
>>> member named 'dev'
>>>  1172 |  dev_driver_string(info->dev), dev_name(info->dev),
>>>   |^~
>>> ../include/linux/dev_printk.h:110:37: note: in definition of macro 
>>> 'dev_printk_index_wrap'
>>>   110 | _p_func(dev, fmt, ##__VA_ARGS__);   
>>> \
>>>   | ^~~
>>> ../drivers/video/fbdev/ps3fb.c:1171:9: note: in expansion of macro 
>>> 'dev_info'
>>>  1171 | dev_info(info->device, "%s %s, using %u KiB of video 
>>> memory\n",
>>>   | ^~~~
>>> ../drivers/video/fbdev/ps3fb.c:1172:61: error: 'struct fb_info' has no 
>>> member named 'dev'
>>>  1172 |  dev_driver_string(info->dev), dev_name(info->dev),
>>>   | ^~
>>> ../include/linux/dev_printk.h:110:37: note: in definition of macro 
>>> 'dev_printk_index_wrap'
>>>   110 | _p_func(dev, fmt, ##__VA_ARGS__);   
>>> \
>>>   | ^~~
>>> ../drivers/video/fbdev/ps3fb.c:1171:9: note: in expansion of macro 
>>> 'dev_info'
>>>  1171 | dev_info(info->device, "%s %s, using %u KiB of video 
>>> memory\n",
>>>   | ^~~~
>>>
>>>
>>
>> Hmm, there is no response from Thomas yet. I guess we should go with
>> reverting bdb616479eff419, right? Regardless, I'm adding this build 
>> regression
>> to regzbot so that parties involved are aware of it:
>>
>> #regzbot ^introduced: bdb616479eff419
>> #regzbot title: build regression in PS3 framebuffer
> 
> Does regzbot track issues in linux-next?

It can; I made sure of that in case somebody wants to use this sooner or
later (and it wasn't much work). But I don't actively use this
functionality right now and do not plan to do so, as there are more
important issues to spend time on.

> They're not really regressions because they're not in a release yet.
> 
> Anyway I don't see where bdb616479eff419 comes from.

That makes two of us :-D

> The issue was introduced by:
> 
>   701d2054fa31 fbdev: Make support for userspace interfaces configurable

Ahh, that makes a lot more sense. While at it, let me tell regzbot:

#regzbot introduced: 701d2054fa31

Ciao, Thorsten
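
The "#regzbot" lines in these mails are plain text commands. A minimal Python sketch of how they could be picked out of a mail body follows; it only looks at unquoted lines and accepts the command with or without a trailing colon, since both spellings appear in this archive. The regular expression and function name are assumptions for illustration, not regzbot's actual parser.

```python
import re

# "#regzbot <command>[:] <argument>" at the start of an unquoted line.
CMD_RE = re.compile(r"^#regzbot[ \t]+(\^?[\w-]+):?[ \t]*(.*)$", re.MULTILINE)


def regzbot_commands(mail_body):
    """Return (command, argument) pairs from unquoted '#regzbot' lines."""
    return [(cmd, arg.strip()) for cmd, arg in CMD_RE.findall(mail_body)]


SAMPLE_MAIL = """\
>> #regzbot ^introduced: bdb616479eff419
>> #regzbot title: build regression in PS3 framebuffer

Ahh, that makes a lot more sense. While at it, let me tell regzbot:

#regzbot introduced: 701d2054fa31
"""

print(regzbot_commands(SAMPLE_MAIL))
```

Skipping quoted lines is a choice made in this sketch, so that a command already issued in an earlier mail is not counted again when it shows up in a reply.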


Re: linux-next: Tree for Jul 13 (drivers/video/fbdev/ps3fb.c)

2023-07-31 Thread Linux regression tracking (Thorsten Leemhuis)
On 18.07.23 18:15, Randy Dunlap wrote:
> On 7/18/23 04:48, Michael Ellerman wrote:
>> Bagas Sanjaya  writes:
>>> On Thu, Jul 13, 2023 at 09:11:10AM -0700, Randy Dunlap wrote:
>>>> on ppc64:
>>>>
>>>> In file included from ../include/linux/device.h:15,
>>>>  from ../arch/powerpc/include/asm/io.h:22,
>>>>  from ../include/linux/io.h:13,
>>>>  from ../include/linux/irq.h:20,
>>>>  from ../arch/powerpc/include/asm/hardirq.h:6,
>>>>  from ../include/linux/hardirq.h:11,
>>>>  from ../include/linux/interrupt.h:11,
>>>>  from ../drivers/video/fbdev/ps3fb.c:25:
>>>> ../drivers/video/fbdev/ps3fb.c: In function 'ps3fb_probe':
>>>> ../drivers/video/fbdev/ps3fb.c:1172:40: error: 'struct fb_info' has no 
>>>> member named 'dev'
> [...]
>>
>> Does regzbot track issues in linux-next?

Seems your patch didn't make any progress, at least I can't see it in
-next. Is there a reason why, or did I miss anything?

And yes, sure, I'm aware that it's -next and a driver that people might
not enable regularly. But I noticed it and thought "quickly bring it up,
might be good to fix this sooner rather than later before other people
run into it (and who knows, maybe it'll switch a light in some CI system
from red to green as well)"

Ciao, Thorsten

>> The driver seems to only use info->dev in that one dev_info() line,
>> which seems purely cosmetic, so I think it could just be removed, eg:
>>
>> diff --git a/drivers/video/fbdev/ps3fb.c b/drivers/video/fbdev/ps3fb.c
>> index d4abcf8aff75..a304a39d712b 100644
>> --- a/drivers/video/fbdev/ps3fb.c
>> +++ b/drivers/video/fbdev/ps3fb.c
>> @@ -1168,8 +1168,7 @@ static int ps3fb_probe(struct ps3_system_bus_device 
>> *dev)
>>  
>>  ps3_system_bus_set_drvdata(dev, info);
>>  
>> -dev_info(info->device, "%s %s, using %u KiB of video memory\n",
>> - dev_driver_string(info->dev), dev_name(info->dev),
>> +dev_info(info->device, "using %u KiB of video memory\n",
>>   info->fix.smem_len >> 10);
>>  
>>  task = kthread_run(ps3fbd, info, DEVICE_NAME);
> 
> 
> Tested-by: Randy Dunlap  # build-tested
> 
> Thanks.
> 


Re: Probing nvme disks fails on Upstream kernels on powerpc Maxconfig

2023-04-04 Thread Linux regression tracking (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

On 23.03.23 10:53, Srikar Dronamraju wrote:
> 
> I am unable to boot upstream kernels from v5.16 to the latest upstream
> kernel on a maxconfig system. (Machine config details given below)
> 
> At boot, we see a series of messages like the below.
> 
> dracut-initqueue[13917]: Warning: dracut-initqueue: timeout, still waiting 
> for following initqueue hooks:
> dracut-initqueue[13917]: Warning: 
> /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2fdisk\x2fby-uuid\x2f93dc0767-18aa-467f-afa7-5b4e9c13108a.sh:
>  "if ! grep -q After=remote-fs-pre.target 
> /run/systemd/generator/systemd-cryptsetup@*.service 2>/dev/null; then
> dracut-initqueue[13917]: [ -e 
> "/dev/disk/by-uuid/93dc0767-18aa-467f-afa7-5b4e9c13108a" ]
> dracut-initqueue[13917]: fi"

Alexey, did you look into this? This is apparently caused by a commit of
yours (see quoted part below) that Michael applied. Looks like it fell
through the cracks from here, but maybe I'm missing something.

Anyway, for the rest of this mail:

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 387273118714
#regzbot title powerpc/pseries/dma: Probing nvme disks fails on powerpc
Maxconfig
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

> journalctl shows the below warning.
> 
>  WARNING: CPU: 242 PID: 1219 at 
> /home/srikar/work/linux.git/arch/powerpc/kernel/iommu.c:227 
> iommu_range_alloc+0x3d4/0x450
>  Modules linked in: lpfc(E+) nvmet_fc(E) nvmet(E) configfs(E) qla2xxx(E+) 
> nvme_fc(E) nvme_fabrics(E) vmx_crypto(E) gf128mul(E) xhci_pci(E) 
> xhci_pci_renesas(E) xhci_hcd(E) ipr(E+) nvme(E) usbcore(E) libata(E) 
> nvme_core(E) t10_pi(E) scsi_transport_fc(E) usb_common(E) btrfs(E) 
> blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) raid6_pq(E) sg(E) 
> dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) 
> scsi_mod(E) scsi_common(E)
>  CPU: 242 PID: 1219 Comm: kworker/u3843:0 Tainted: GW   EL
> 5.15.0-sp4+ #33 91e1c36ffe385108bbe4a3834506a047dc78552d
>  Workqueue: nvme-reset-wq nvme_reset_work [nvme]
>  NIP:  c005a134 LR: c005a128 CTR: 
>  REGS: c7fd4c7eb580 TRAP: 0700   Tainted: GW   EL 
> (5.15.0-sp4+)
>  MSR:  80029033   CR: 24002424  XER: 
>  CFAR: c020972c IRQMASK: 0
>  GPR00: c005a128 c7fd4c7eb820 c2aa4b00 0001
>  GPR04: c273d648 0003 0bfbcb21 c2d88390
>  GPR08:   00f2 c2b05240
>  GPR12: 2000 cbfbdfffcb00  c7fd4c9d1c40
>  GPR16:    
>  GPR20:   c2bab580 
>  GPR24: c73b30c8   
>  GPR28: c7fd7133  0001 0001
>  NIP [c005a134] iommu_range_alloc+0x3d4/0x450
>  LR [c005a128] iommu_range_alloc+0x3c8/0x450
>  Call Trace:
>  [c7fd4c7eb820] [c005a128] iommu_range_alloc+0x3c8/0x450 
> (unreliable)
>  [c7fd4c7eb8e0] [c005a580] iommu_alloc+0x60/0x170
>  [c7fd4c7eb930] [c005bd4c] iommu_alloc_coherent+0x11c/0x1d0
>  [c7fd4c7eb9d0] [c00597e8] dma_iommu_alloc_coherent+0x38/0x50
>  [c7fd4c7eb9f0] [c0249ce8] dma_alloc_attrs+0x128/0x180
>  [c7fd4c7eba60] [c0080001093210d8] nvme_alloc_queue+0x90/0x2b0 [nvme]
>  [c7fd4c7ebac0] [c008000109326034] nvme_reset_work+0x44c/0x1870 [nvme]
>  [c7fd4c7ebc30] [c01870b8] process_one_work+0x388/0x730
>  [c7fd4c7ebd10] [c01874d8] worker_thread+0x78/0x5b0
>  [c7fd4c7ebda0] [c01945cc] kthread+0

Re: [PASEMI NEMO] Boot issue with the PowerPC updates 6.4-1

2023-05-08 Thread Linux regression tracking (Thorsten Leemhuis)
On 08.05.23 14:49, Michael Ellerman wrote:
> "Linux regression tracking #adding (Thorsten Leemhuis)"
>  writes:
>> [CCing the regression list, as it should be in the loop for regressions:
>> https://docs.kernel.org/admin-guide/reporting-regressions.html]
>>
>> [TLDR: I'm adding this report to the list of tracked Linux kernel
>> regressions; the text you find below is based on a few template
>> paragraphs you might have encountered already in similar form.
>> See link in footer if these mails annoy you.]
> 
> Patch is in testing.
> https://lore.kernel.org/linuxppc-dev/20230505171816.3175865-1-r...@kernel.org/

Ahh, great, thx for letting me know.

Thanks to a proper Link tag regzbot would have noticed that fix once it
landed in next, but it's nevertheless good to know that the fix is
already under review. :-D

Fun fact: sometimes I wish we would not post fixes in new threads, as
that makes it hard to find the proposed fix for anybody that runs into
reported issues and also manages to find the report (e.g. this thread).
But whatever, that's just a detail.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot monitor:
https://lore.kernel.org/linuxppc-dev/20230505171816.3175865-1-r...@kernel.org/

>> On 02.05.23 04:22, Christian Zigotzky wrote:
>>> Hello,
>>>
>>> Our PASEMI Nemo board [1] doesn't boot with the PowerPC updates 6.4-1 [2].
>>>
>>> The kernel hangs right after the booting Linux via __start() @
>>> 0x ...
>>>
>>> I was able to revert the PowerPC updates 6.4-1 [2] with the following
>>> command: git revert 70cc1b5307e8ee3076fdf2ecbeb89eb973aa0ff7 -m 1
>>>
>>> After recompiling, the kernel boots without any problems without the
>>> PowerPC updates 6.4-1 [2].
>>>
>>> Could you please explain to me what you have done in the boot area?
>>>
>>> Please find attached the kernel config.
>>
>> Thanks for the report. To be sure the issue doesn't fall through the
>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
>> tracking bot:
>>
>> #regzbot ^introduced e4ab08be5b4902e5
>> #regzbot title powerpc: boot issues on PASEMI Nemo board
>> #regzbot ignore-activity
>>
>> This isn't a regression? This issue or a fix for it are already
>> discussed somewhere else? It was fixed already? You want to clarify when
>> the regression started to happen? Or point out I got the title or
>> something else totally wrong? Then just reply and tell me -- ideally
>> while also telling regzbot about it, as explained by the page listed in
>> the footer of this mail.
>>
>> Developers: When fixing the issue, remember to add 'Link:' tags pointing
>> to the report (the parent of this mail). See page linked in footer for
>> details.
>>
>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>> --
>> Everything you wanna know about Linux kernel regression tracking:
>> https://linux-regtracking.leemhuis.info/about/#tldr
>> That page also explains what to do if mails like this annoy you.
> 


Re: [PASEMI NEMO] Boot issue with the PowerPC updates 6.4-1

2023-05-08 Thread Linux regression tracking (Thorsten Leemhuis)



On 08.05.23 14:58, Bagas Sanjaya wrote:
> On Mon, May 08, 2023 at 01:29:22PM +0200, Linux regression tracking #adding 
> (Thorsten Leemhuis) wrote:
>> [CCing the regression list, as it should be in the loop for regressions:
>> https://docs.kernel.org/admin-guide/reporting-regressions.html]
>>
>> [TLDR: I'm adding this report to the list of tracked Linux kernel
>> regressions; the text you find below is based on a few template
>> paragraphs you might have encountered already in similar form.
>> See link in footer if these mails annoy you.]
>>
>> On 02.05.23 04:22, Christian Zigotzky wrote:
>>> Hello,
>>>
>>> Our PASEMI Nemo board [1] doesn't boot with the PowerPC updates 6.4-1 [2].
>>>
>>> The kernel hangs right after the booting Linux via __start() @
>>> 0x ...
>>>
>>> I was able to revert the PowerPC updates 6.4-1 [2] with the following
>>> command: git revert 70cc1b5307e8ee3076fdf2ecbeb89eb973aa0ff7 -m 1
>>>
>>  ... 
>> Thanks for the report. To be sure the issue doesn't fall through the
>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
>> tracking bot:
>>
>> #regzbot ^introduced e4ab08be5b4902e5
> 
> Why and how can you conclude that the culprit is e4ab08be5b4902 
> ("powerpc/isa-bridge:
> Remove open coded "ranges" parsing") rather than powerpc PR merge commit
> 70cc1b5307e8ee ("Merge tag 'powerpc-6.4-1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux")? 

I looked at the thread and noticed it was mentioned later (
https://lore.kernel.org/all/3fa42c8c-09bd-d0f0-401b-315b484f4...@xenosoft.de/
).

Ciao, Thorsten


Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)

2023-06-22 Thread Linux regression tracking (Thorsten Leemhuis)
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

As Linus will likely release 6.4 on this or the following Sunday a quick
question: is there any hope this regression might be fixed any time
soon? Doesn't look like it, as it seems nothing happened for a few days,
but maybe I missed something.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 15.06.23 06:57, Sachin Sant wrote:
> 
>>> [ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 6000 6000 
>>> 6000 7c0802a6 fbe10078 7c7f1b78 f8010090 e9230728  2c2c 
>>> 41820020 7d8903a6 
>>
>>  2c:   28 07 23 e9 ld  r9,1832(r3)
>>  30:   50 00 89 e9 ld  r12,80(r9)
>>
>> Where r3 is *chip.
>> r9 is NULL, and 80 = 0x50.
>>
>> Looks like a NULL chip->ops, which oopses in:
>>
>> static int tpm_request_locality(struct tpm_chip *chip)
>> {
>> int rc;
>>
>> if (!chip->ops->request_locality)
>>
>>
>> Can you test the patch below?
>>
> 
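To illustrate the reasoning quoted above: when a structure pointer is NULL, loading one of its members faults at an address equal to that member's offset, which is why a fault at 0x50 (decimal 80) points at a member 80 bytes into whatever r9 was supposed to reference. A minimal sketch in plain C follows; `struct demo_ops` and `member_load_address()` are hypothetical stand-ins made up for this example and do not mirror the real kernel structure layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in; the padding only forces the callback to offset 0x50. */
struct demo_ops {
	unsigned char earlier_members[0x50];
	int (*request_locality)(void *chip);
};

/* Address a member load would touch, given the structure's base address. */
static uintptr_t member_load_address(uintptr_t base, size_t member_offset)
{
	/* Guard against wrap-around so the sum is a meaningful address. */
	assert(member_offset <= UINTPTR_MAX - base);
	return base + member_offset;
}
```

With a base of 0 (NULL), the computed address is simply the member offset, which is the pattern the oops message shows.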
> It proceeds further but then run into following crash
> 
> [  103.269574] Kernel attempted to read user page (18) - exploit attempt? 
> (uid: 0)
> [  103.269589] BUG: Kernel NULL pointer dereference on read at 0x0018
> [  103.269595] Faulting instruction address: 0xc09dcf34
> [  103.269599] Oops: Kernel access of bad area, sig: 11 [#1]
> [  103.269602] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> [  103.269606] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) 
> nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) 
> nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) 
> nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) 
> rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) 
> aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) 
> libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) 
> crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) 
> vmx_crypto(E) fuse(E)
> [  103.269644] CPU: 18 PID: 6872 Comm: kexec Kdump: loaded Tainted: G 
>E  6.4.0-rc6-dirty #8
> [  103.269649] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf06 
> of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
> [  103.269653] NIP:  c09dcf34 LR: c09dd2bc CTR: 
> c09eaa60
> [  103.269656] REGS: c000a113f510 TRAP: 0300   Tainted: GE
>(6.4.0-rc6-dirty)
> [  103.269660] MSR:  8280b033   CR: 
> 88484886  XER: 0001
> [  103.269669] CFAR: c09dd2b8 DAR: 0018 DSISR: 4000 IRQMASK: 0
> [  103.269669] GPR00: c09dd2bc c000a113f7b0 c14a1500 c0009031
> [  103.269669] GPR04: c0009f77 0016 06007a01 0016
> [  103.269669] GPR08: c0009f77   8000
> [  103.269669] GPR12: c09eaa60 c0135fab7f00
> [  103.269669] GPR16:
> [  103.269669] GPR20:
> [  103.269669] GPR24:  0016 c0009031 1000
> [  103.269669] GPR28: c0009f77 7a01 c0009f77 c0009031
> [  103.269707] NIP [c09dcf34] tpm_try_transmit+0x74/0x300
> [  103.269713] LR [c09dd2bc] tpm_transmit+0xfc/0x190
> [  103.269717] Call Trace:
> [  103.269718] [c000a113f7b0] [c000a113f880] 0xc000a113f880 
> (unreliable)
> [  103.269724] [c000a113f840] [c09dd2bc] tpm_transmit+0xfc/0x190
> [  103.269727] [c000a113f900] [c09dd398] 
> tpm_transmit_cmd+0x48/0x110
> [  103.269731] [c000a113f980] [c09df1b0] 
> tpm2_get_tpm_pt+0x140/0x230
> [  103.269736] [c000a113fa20] [c09db208] 
> tpm_amd_is_rng_defective+0xb8/0x250
> [  103.269739] [c000a113faa0] [c09db828] 
> tpm_chip_unregister+0x138/0x160
> [  103.269743] [c000a113fae0] [c09eaa94] 
> tpm_ibmvtpm_remove+0x34/0x130
> [  103.269748] [c000a113fb50] [c0115738] vio_bus_remove+0x58/0xd0
> [  103.269754] [c000a113fb90] [c0a01dcc] 
> device_shutdown+0x21c/0x39c
> [  103.269758] [c000a113fc20] [c01a2684] 
> kernel_restart_prepare+0x54/0x70
> [  103.269762] [c000a113fc40] [c0292c48] kernel_kexec+0xa8/0x100
> [  103.269766] [c000a113fcb0] [c01a2cd4] 
> __do_sys_reboot+0x214/0x2c0
> [  103.269770] [c000a113fe10] [c0034adc] 
> system_call_exception+0x13c/0x340
> [  103.269776] [c000a113fe50] [c000d05c] 
> system_call_vectored_common+

Re: 6.2-rc7 fails building on Talos II: memory.c:(.text+0x2e14): undefined reference to `hash__tlb_flush'

2023-02-16 Thread Linux regression tracking (Thorsten Leemhuis)
[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

On 16.02.23 00:55, Erhard F. wrote:
> Just noticed a build failure on 6.2-rc7 for my Talos 2 (.config attached):
> 
>  # make
>   CALLscripts/checksyscalls.sh
>   UPD include/generated/utsversion.h
>   CC  init/version-timestamp.o
>   LD  .tmp_vmlinux.kallsyms1
> ld: ld: DWARF error: could not find abbrev number 6
> mm/memory.o: in function `unmap_page_range':
> memory.c:(.text+0x2e14): undefined reference to `hash__tlb_flush'
> ld: memory.c:(.text+0x2f8c): undefined reference to `hash__tlb_flush'
> ld: ld: DWARF error: could not find abbrev number 3117
> mm/mmu_gather.o: in function `tlb_remove_table':
> mmu_gather.c:(.text+0x584): undefined reference to `hash__tlb_flush'
> ld: mmu_gather.c:(.text+0x6c4): undefined reference to `hash__tlb_flush'
> ld: mm/mmu_gather.o: in function `tlb_flush_mmu':
> mmu_gather.c:(.text+0x80c): undefined reference to `hash__tlb_flush'
> ld: mm/mmu_gather.o:mmu_gather.c:(.text+0xbe0): more undefined references to 
> `hash__tlb_flush' follow
> make[1]: *** [scripts/Makefile.vmlinux:35: vmlinux] Fehler 1
> make: *** [Makefile:1264: vmlinux] Error 2
> 
> As 6.2-rc6 was good on this machine I did a quick bisect which revealed this 
> commit:
> 
>  # git bisect bad
> 1665c027afb225882a5a0b014c45e84290b826c2 is the first bad commit
> [...]

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 1665c027afb225
#regzbot title powerpc: 6.2-rc7 fails building on Talos II
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


Re: [PATCH] KVM: PPC: Book3S HV nestedv2: Cancel pending HDEC exception

2024-04-04 Thread Linux regression tracking (Thorsten Leemhuis)
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

Was this regression ever resolved? Doesn't look like it, but maybe I
just missed something.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 20.03.24 14:43, Nicholas Piggin wrote:
> On Wed Mar 13, 2024 at 5:26 PM AEST, Vaibhav Jain wrote:
>> This reverts commit 180c6b072bf360b686e53d893d8dcf7dbbaec6bb ("KVM: PPC:
>> Book3S HV nestedv2: Do not cancel pending decrementer exception") which
>> prevented cancelling a pending HDEC exception for nestedv2 KVM guests. It
>> was done to avoid overhead of a H_GUEST_GET_STATE hcall to read the 'HDEC
>> expiry TB' register which was higher compared to handling extra decrementer
>> exceptions.
>>
>> This overhead of reading 'HDEC expiry TB' register has been mitigated
>> recently by the L0 hypervisor(PowerVM) by putting the value of this
>> register in L2 guest-state output buffer on trap to L1. From there the
>> value of this register is cached, made available in kvmhv_run_single_vcpu()
>> to compare it against host(L1) timebase and cancel the pending hypervisor
>> decrementer exception if needed.
> 
> Ah, I figured out the problem here. Guest entry never clears the
> queued dec, because it's level triggered on the DEC MSB so it
> doesn't go away when it's delivered. So upstream code is indeed
> buggy and I think I take the blame for suggesting this nestedv2
> workaround.
> 
> I actually don't think that is necessary though, we could treat it
> like other interrupts.  I think that would solve the problem without
> having to test dec here.
> 
> I am wondering though, what workload slows down that this patch
> was needed in the first place. We'd only get here after a cede
> returns, then we'd dequeue the dec and stop having to GET_STATE
> it here.
> 
> Thanks,
> Nick
> 
>>
>> Fixes: 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending 
>> decrementer exception")
>> Signed-off-by: Vaibhav Jain 
>> ---
>>  arch/powerpc/kvm/book3s_hv.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 0b921704da45..e47b954ce266 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -4856,7 +4856,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 
>> time_limit,
>>   * entering a nested guest in which case the decrementer is now owned
>>   * by L2 and the L1 decrementer is provided in hdec_expires
>>   */
>> -if (!kvmhv_is_nestedv2() && kvmppc_core_pending_dec(vcpu) &&
>> +if (kvmppc_core_pending_dec(vcpu) &&
>>  ((tb < kvmppc_dec_expires_host_tb(vcpu)) ||
>>   (trap == BOOK3S_INTERRUPT_SYSCALL &&
>>kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
> 


Re: [PATCH 04/23] scsi: initialize scsi midlayer limits before allocating the queue

2024-05-29 Thread Linux regression tracking (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

On 20.05.24 17:15, Christoph Hellwig wrote:
> Adding ben and the linuxppc list.

Hmm, no reply and no other progress to get this resolved afaics. So let's
bring Michael into the mix, he might be able to help out.

BTW TWIMC: a PowerMac G5 user reported similar symptoms here
recently: https://bugzilla.kernel.org/show_bug.cgi?id=218858

Ciao, Thorsten

> Context: pata_macio initialization now fails as we enforce that the
> segment size is set properly.
> 
> On Wed, May 15, 2024 at 04:52:29PM -0700, Guenter Roeck wrote:
>> pata_macio_common_init() Calling ata_host_activate() with limit 65280
>> ...
>> max_segment_size is 65280; PAGE_SIZE is 65536; BLK_MAX_SEGMENT_SIZE is 65536
>> WARNING: CPU: 0 PID: 12 at block/blk-settings.c:202 
>> blk_validate_limits+0x2d4/0x364
>> ...
>>
>> This is with PPC_BOOK3S_64 which selects a default page size of 64k.
> 
> Yeah.  Did you actually manage to use pata macio previously?  Or is
> it just used because it's part of the pmac default config?
> 
>> Looking at the old code, I think it did what you suggested above,
> 
>> but assuming that the driver requested a lower limit on purpose that
>> may not be the best solution.
> 
>> Never mind, though - I updated my test configuration to explicitly
>> configure the page size to 4k to work around the problem. With that,
>> please consider this report a note in case someone hits the problem
>> on a real system (and sorry for the noise).
> 
> Yes, the idea behind this change was to catch such errors.  So far
> most errors have been drivers setting lower limits than what the
> hardware can actually handle, but I'd love to track this down.
> 
> If the hardware can't actually handle the lower limit we should
> probably just fail the probe gracefully with a well-commented if
> statement instead.
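
As a rough illustration of the check discussed in the quoted exchange, the sketch below compares a driver's segment-size limit against the page size, using the numbers from the quoted log. It assumes the warning fires when the limit is smaller than one page, which is what the logged values and the 4k workaround suggest; `segment_limit_ok()` is a hypothetical helper written for this example, not the kernel's `blk_validate_limits()`.

```c
#include <assert.h>
#include <stdbool.h>

/* True when the driver's max segment size can hold at least one full page. */
static bool segment_limit_ok(unsigned int max_segment_size,
			     unsigned int page_size)
{
	/* A zero page size would make the comparison meaningless. */
	assert(page_size != 0);
	return max_segment_size >= page_size;
}
```

With the reported limit of 65280, the check fails for 64k pages (65536) and passes for 4k pages (4096), which matches the workaround of configuring a 4k page size.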


Re: Xorg doesn't start and some other issues with the RC1 of kernel 6.10

2024-05-31 Thread Linux regression tracking (Thorsten Leemhuis)
On 01.06.24 08:34, Greg KH wrote:
> On Fri, May 31, 2024 at 12:16:50PM +0200, Greg KH wrote:
>> On Fri, May 31, 2024 at 12:02:15PM +0200, Greg KH wrote:
>>> On Fri, May 31, 2024 at 11:19:34AM +0200, Thorsten Leemhuis wrote:
>>>> Thx, I already had an eye on this, but thought tracking would not be
>>>> needed, as Greg (now CCed) wanted to revert 8c467f3300591a ("VT: Use
>>>> macros to define ioctls") two days ago:
>>>> https://lore.kernel.org/all/2024052901-police-trash-e9f9@gregkh/
>>>>
>>>> But that commit is not yet in -next afaics. :-/
>>>>
>>>> /me meanwhile wonders if it would be wise to fix this before -rc2
>>>
>>> I do, sorry, been traveling this week with geen vrije tijd.  Will get to
>>> it tomorrow.
>>
>> Ugh, sorry for the Dutch, I have "no free time" because I am studying
>> the language this week.  It is bleeding over here into my emails now...

:-D

> Pull request now sent:
>   https://lore.kernel.org/r/zlq8ymiubtois...@kroah.com

Dank u wel![1, 2] And good luck with your studies! Ciao, Thorsten

[1] "Many thx!"

[2] I understand some Dutch (more than enough for "geen vrije tijd"),
but do not really speak it; but it was enough to get that simple phrase
right on the first attempt.

#regzbot fix: 7bc4244c882a7d7d


Re: [PATCH] tpm: ibmvtpm: Call tpm2_sessions_init() to initialize session support

2024-06-28 Thread Linux regression tracking (Thorsten Leemhuis)
[CCing the regression list]

On 20.06.24 00:34, Stefan Berger wrote:
> Jarkko,
>   are you ok with this patch?

Hmmm, hope I did not miss anything, but looks like nothing happened for
about 10 days here. Hence:

Jarkko, looks like some feedback from your side really would help to
find a path to get this regression resolved before 6.10 is released.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

> On 6/17/24 15:34, Stefan Berger wrote:
>> Fix the following type of error message caused by a missing call to
>> tpm2_sessions_init() in the IBM vTPM driver:
>>
>> [    2.987131] tpm tpm0: tpm2_load_context: failed with a TPM error
>> 0x01C4
>> [    2.987140] ima: Error Communicating to TPM chip, result: -14
>>
>> Fixes: d2add27cf2b8 ("tpm: Add NULL primary creation")
>> Signed-off-by: Stefan Berger 
>> ---
>>   drivers/char/tpm/tpm_ibmvtpm.c | 4 
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/char/tpm/tpm_ibmvtpm.c
>> b/drivers/char/tpm/tpm_ibmvtpm.c
>> index d3989b257f42..1e5b107d1f3b 100644
>> --- a/drivers/char/tpm/tpm_ibmvtpm.c
>> +++ b/drivers/char/tpm/tpm_ibmvtpm.c
>> @@ -698,6 +698,10 @@ static int tpm_ibmvtpm_probe(struct vio_dev
>> *vio_dev,
>>   rc = tpm2_get_cc_attrs_tbl(chip);
>>   if (rc)
>>   goto init_irq_cleanup;
>> +
>> +    rc = tpm2_sessions_init(chip);
>> +    if (rc)
>> +    goto init_irq_cleanup;
>>   }
>>     return tpm_chip_register(chip);


Re: [PATCH v4 29/33] x86/mm: try VMA lock-based page fault handling first

2023-06-29 Thread Linux regression tracking #adding (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

On 29.06.23 16:40, Jiri Slaby wrote:
> 
> On 27. 02. 23, 18:36, Suren Baghdasaryan wrote:
>> Attempt VMA lock-based page fault handling first, and fall back to the
>> existing mmap_lock-based handling if that fails.
> [...]
>> +    fault = handle_mm_fault(vma, address, flags |
>> FAULT_FLAG_VMA_LOCK, regs);
>> +    vma_end_read(vma);
>> +
>> +    if (!(fault & VM_FAULT_RETRY)) {
>> +    count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>> +    goto done;
>> +    }
>> +    count_vm_vma_lock_event(VMA_LOCK_RETRY);
> 
> This is apparently not strong enough as it causes go build failures like:
> 
> [  409s] strconv
> [  409s] releasep: m=0x579e2000 m->p=0x5781c600 p->m=0x0 p->status=2
> [  409s] fatal error: releasep: invalid p state
> [  409s]
> 
> [  325s] hash/adler32
> [  325s] hash/crc32
> [  325s] cmd/internal/codesign
> [  336s] fatal error: runtime: out of memory
> 
> There are many kinds of similar errors. It happens in 1-3 out of 20
> builds only.
> 
> If I revert the commit on top of 6.4, they all disappear. Any idea?
> 
> The downstream report:
> https://bugzilla.suse.com/show_bug.cgi?id=1212775
> [...]

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 0bff0aaea03e2a3ed6bfa3021
https://bugzilla.suse.com/show_bug.cgi?id=1212775
#regzbot title mm: failures when building go in 1-3 out of 20 builds
#regzbot ignore-activity

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
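
For readers unfamiliar with the pattern in the patch description quoted in this mail (try the VMA lock-based path first, fall back to the mmap_lock-based path on retry), here is a minimal standalone sketch. All names in it (`handle_fault()`, `fault_handler_fn`, `struct fault_stats`, the demo handlers) are hypothetical and only model the control flow of the quoted hunk; this is not kernel code and says nothing about where the reported corruption comes from.

```c
#include <assert.h>
#include <stddef.h>

enum fault_result { FAULT_DONE, FAULT_RETRY };

typedef enum fault_result (*fault_handler_fn)(unsigned long address);

/* Counters standing in for the success/retry events in the quoted hunk. */
struct fault_stats {
	unsigned long fast_success;
	unsigned long fast_retry;
};

/* Try the fine-grained path first; use the coarse path only on retry. */
static enum fault_result handle_fault(unsigned long address,
				      fault_handler_fn fast_path,
				      fault_handler_fn slow_path,
				      struct fault_stats *stats)
{
	assert(fast_path != NULL && slow_path != NULL && stats != NULL);

	if (fast_path(address) != FAULT_RETRY) {
		stats->fast_success++;
		return FAULT_DONE;
	}
	stats->fast_retry++;
	return slow_path(address);
}

/* Demo handlers for exercising both branches. */
static enum fault_result demo_always_done(unsigned long address)
{
	(void)address;
	return FAULT_DONE;
}

static enum fault_result demo_always_retry(unsigned long address)
{
	(void)address;
	return FAULT_RETRY;
}
```

The point of the structure is that the fallback keeps the old behaviour available, so the fast path only has to be correct when it reports success.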


Re: Kernel Crash Dump (kdump) broken with 6.5

2023-07-19 Thread Linux regression tracking #adding (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 17.07.23 16:45, Sachin Sant wrote:
> Kdump seems to be broken with 6.5 for ppc64le.
> [...]
> 
> 6.4 was good. Git bisect points to following patch
> 
> commit 606787fed7268feb256957872586370b56af697a
> powerpc/64s: Remove support for ELFv1 little endian userspace
> 
> Reverting this patch allows a successful capture of vmcore.

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 606787fed7268feb256957872586370b56af69
#regzbot title powerpc/64s: Crash Dump (kdump) broken with 6.5
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


Re: Kernel Crash Dump (kdump) broken with 6.5

2023-07-30 Thread Linux regression tracking #update (Thorsten Leemhuis)
On 19.07.23 18:19, Linux regression tracking #adding (Thorsten Leemhuis)
wrote:
> On 17.07.23 16:45, Sachin Sant wrote:
>> Kdump seems to be broken with 6.5 for ppc64le.
>> [...]
>>
>> 6.4 was good. Git bisect points to following patch
>>
>> commit 606787fed7268feb256957872586370b56af697a
>> powerpc/64s: Remove support for ELFv1 little endian userspace
>>
>> Reverting this patch allows a successful capture of vmcore.

Was fixed by revert:

#regzbot fix: 106ea7ffd56
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


Re: [PASEMI NEMO] Boot issue with the PowerPC updates 6.4-1

2023-05-08 Thread Linux regression tracking #adding (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 02.05.23 04:22, Christian Zigotzky wrote:
> Hello,
> 
> Our PASEMI Nemo board [1] doesn't boot with the PowerPC updates 6.4-1 [2].
> 
> The kernel hangs right after the booting Linux via __start() @
> 0x ...
> 
> I was able to revert the PowerPC updates 6.4-1 [2] with the following
> command: git revert 70cc1b5307e8ee3076fdf2ecbeb89eb973aa0ff7 -m 1
> 
> After recompiling, the kernel boots without any problems without the
> PowerPC updates 6.4-1 [2].
> 
> Could you please explain to me what you have done in the boot area?
> 
> Please find attached the kernel config.

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced e4ab08be5b4902e5
#regzbot title powerpc: boot issues on PASEMI Nemo board
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


Re: Fwd: Linux 6.3.1 + AMD RX 570 on ppc64le 4K: Kernel attempted to read user page (1128) - exploit attempt? (uid: 0)

2023-05-21 Thread Linux regression tracking #update (Thorsten Leemhuis)
/me removes a few people from CC, as this thread already annoyed a few
people

On 11.05.23 10:06, Bagas Sanjaya wrote:
> 
> I noticed a regression report on bugzilla ([1]). As many developers
> don't keep an eye on it, I decided to forward it by email.
> [...] 
> #regzbot introduced: v6.2..v6.3 
> https://bugzilla.kernel.org/show_bug.cgi?id=217427
> #regzbot title: No video output from AMD RX 570 and kernel exploit attempt on ppc64le

per https://gitlab.freedesktop.org/drm/amd/-/issues/2553#note_1911308

#regzbot fix: 3cf7cd3f770a0b89dc
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.



Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)

2023-06-15 Thread Linux regression tracking #adding (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 14.06.23 17:12, Sachin Sant wrote:
> Following crash is observed during a kexec operation on 
> IBM Power10 server:
> 
> [ 34.381548] Kernel attempted to read user page (50) - exploit attempt? (uid: 
> 0)
> [ 34.381562] BUG: Kernel NULL pointer dereference on read at 0x0050
> [ 34.381565] Faulting instruction address: 0xc09db1e4
> [ 34.381569] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 34.381572] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> [ 34.381576] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) 
> nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) 
> nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) 
> nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) 
> rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) 
> aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) 
> libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) 
> crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) 
> vmx_crypto(E) fuse(E)
> [ 34.381613] CPU: 18 PID: 5918 Comm: kexec Kdump: loaded Tainted: G E 
> 6.4.0-rc6-00037-gb6dad5178cea #3
> [ 34.381618] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf06 
> of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
> [ 34.381621] NIP: c09db1e4 LR: c09db928 CTR: c09eab60
> [ 34.381625] REGS: c0009742f780 TRAP: 0300 Tainted: G E 
> (6.4.0-rc6-00037-gb6dad5178cea)
> [ 34.381628] MSR: 8280b033  CR: 
> 4444 XER: 0001
> [ 34.381638] CFAR: c09db19c DAR: 0050 DSISR: 4000 
> IRQMASK: 0 
> [ 34.381638] GPR00: c09db928 c0009742fa20 c14a1500 
> c81d 
> [ 34.381638] GPR04: cd842c50 cd842c50 0025 
> fffe 
> [ 34.381638] GPR08:   0009 
> c00800785280 
> [ 34.381638] GPR12: c09eab60 c0135fab7f00  
>  
> [ 34.381638] GPR16:    
>  
> [ 34.381638] GPR20:    
>  
> [ 34.381638] GPR24:    
> c2e21e08 
> [ 34.381638] GPR28: cd842c48 c2a02208 c321c0c0 
> c81d 
> [ 34.381674] NIP [c09db1e4] tpm_amd_is_rng_defective+0x74/0x240
> [ 34.381681] LR [c09db928] tpm_chip_unregister+0x138/0x160
> [ 34.381685] Call Trace:
> [ 34.381686] [c0009742faa0] [c09db928] 
> tpm_chip_unregister+0x138/0x160
> [ 34.381690] [c0009742fae0] [c09eab94] 
> tpm_ibmvtpm_remove+0x34/0x130
> [ 34.381695] [c0009742fb50] [c0115738] vio_bus_remove+0x58/0xd0
> [ 34.381701] [c0009742fb90] [c0a01ecc] device_shutdown+0x21c/0x39c
> [ 34.381705] [c0009742fc20] [c01a2684] 
> kernel_restart_prepare+0x54/0x70
> [ 34.381710] [c0009742fc40] [c0292c48] kernel_kexec+0xa8/0x100
> [ 34.381714] [c0009742fcb0] [c01a2cd4] __do_sys_reboot+0x214/0x2c0
> [ 34.381718] [c0009742fe10] [c0034adc] 
> system_call_exception+0x13c/0x340
> [ 34.381723] [c0009742fe50] [c000d05c] 
> system_call_vectored_common+0x15c/0x2ec
> [ 34.381729] --- interrupt: 3000 at 0x7fff9c5459f0
> [ 34.381732] NIP: 7fff9c5459f0 LR:  CTR: 
> [ 34.381735] REGS: c0009742fe80 TRAP: 3000 Tainted: G E 
> (6.4.0-rc6-00037-gb6dad5178cea)
> [ 34.381738] MSR: 8280f033  CR: 
> 42422884 XER: 
> [ 34.381747] IRQMASK: 0 
> [ 34.381747] GPR00: 0058 7ad83d70 00012fc47f00 
> fee1dead 
> [ 34.381747] GPR04: 28121969 45584543  
> 0003 
> [ 34.381747] GPR08: 0010   
>  
> [ 34.381747] GPR12:  7fff9c7bb2c0 00012fc3f598 
>  
> [ 34.381747] GPR16:   00012fc1fcc0 
>  
> [ 34.381747] GPR20: 8913 8914 00014b891020 
> 0003 
> [ 34.381747] GPR24:  0001 0003 
> 7ad83ef0 
> [ 34.381747] GPR28: 00012fc19f10 7fff9c6419c0 00014b891080 
> 00014b891040 
> [ 34.381781] NIP [7fff9c5459f0] 0x7fff9c5459f0
> [ 34.381784] LR [] 0x0
> [ 34.381786] --- interrupt: 3000
> [ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 6

Re: 6.2-rc7 fails building on Talos II: memory.c:(.text+0x2e14): undefined reference to `hash__tlb_flush'

2023-02-17 Thread Linux regression tracking #update (Thorsten Leemhuis)
[TLDR: This mail is primarily relevant for Linux regression tracking. A
change or fix related to the regression discussed in this thread was
posted or applied, but it did not use a Link: tag to point to the
report, as Linus and the documentation call for. Things happen, no
worries -- but now the regression tracking bot needs to be told manually
about the fix. See link in footer if these mails annoy you.]

On 16.02.23 11:09, Linux regression tracking (Thorsten Leemhuis) wrote:
> 
> On 16.02.23 00:55, Erhard F. wrote:
>> Just noticed a build failure on 6.2-rc7 for my Talos 2 (.config attached):
>>
>>  # make
>>   CALL    scripts/checksyscalls.sh
>>   UPD     include/generated/utsversion.h
>>   CC      init/version-timestamp.o
>>   LD      .tmp_vmlinux.kallsyms1
>> ld: ld: DWARF error: could not find abbrev number 6
>> mm/memory.o: in function `unmap_page_range':
>> memory.c:(.text+0x2e14): undefined reference to `hash__tlb_flush'
>> ld: memory.c:(.text+0x2f8c): undefined reference to `hash__tlb_flush'
>> ld: ld: DWARF error: could not find abbrev number 3117
>> mm/mmu_gather.o: in function `tlb_remove_table':
>> mmu_gather.c:(.text+0x584): undefined reference to `hash__tlb_flush'
>> ld: mmu_gather.c:(.text+0x6c4): undefined reference to `hash__tlb_flush'
>> ld: mm/mmu_gather.o: in function `tlb_flush_mmu':
>> mmu_gather.c:(.text+0x80c): undefined reference to `hash__tlb_flush'
>> ld: mm/mmu_gather.o:mmu_gather.c:(.text+0xbe0): more undefined references to 
>> `hash__tlb_flush' follow
>> make[1]: *** [scripts/Makefile.vmlinux:35: vmlinux] Error 1
>> make: *** [Makefile:1264: vmlinux] Error 2

#regzbot fix: 4302abc628fc0dc08e5855f21bbfa
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.