On Sat, 2016-03-12 at 17:18 +0100, Ingo Molnar wrote: > * Toshi Kani <toshi.k...@hpe.com> wrote: > > > On Fri, 2016-03-11 at 09:13 +0000, Ingo Molnar wrote: > > > * Ingo Molnar <mi...@kernel.org> wrote: > > > > > > > > > > > * Toshi Kani <toshi.k...@hpe.com> wrote: > > > > > > > > > MTRR manages PAT initialization as it implements a rendezvous > > > > > handler that initializes PAT as part of MTRR initialization. > > > > > > > > > > When CPU does not support MTRR, ex. qemu32 virtual CPU, MTRR > > > > > simply skips PAT init, which causes PAT left enabled without > > > > > initialization. [...] > > > > > > > > What practical effects does this have to the user? Does the kernel > > > > crash? > > > > > > Btw., I find this omission _highly_ annoying: describing what > > > negative effects a bug _causes in practice_ is the most important > > > part of a changelog. How on earth can an experienced contributor omit > > > such an important component from a patch description? > > > > > > Most readers of changelogs couldn't care less about technical details > > > of how the bug is fixed (of course others will read it so it's nice > > > to have too), but what symptoms a bug causes, how serious is it, > > > whether it should be backported are like super important compared to > > > everything else you wrote - and both the description and the > > > changelogs are totally silent on those topics ... > > > > > > I've seen this in other PAT patches - please try to improve this. > > > > My apology. I agree the importance of describing the negative effect of > > the issue. This case is complicated to describe thoroughly, but here is > > a summary. > > The new changelog looks very good, thanks! > > > The issue was reported as a regression caused by 'commit 9cd25aac1f44 > > ("x86/mm/pat: Emulate PAT when it is disabled")'. So, the goal of this > > patchset is to fix this regression. > > https://lkml.org/lkml/2016/3/3/828 > > So one thing that matters more than anything else in the changelog, the > title! Right now the title is: > > x86/mtrr: Refactor PAT initialization code > > ... that's a nice title for a true refactoring of the code, but this > isn't really that, the purpose of this fix is to fix a bad Xorg crash for > Qemu users. > > The principle you need to remember is that readers of your changelogs > will be _very happy_ about 'negative' phrases like: > > bad bug > Xorg crash > boot failure > kernel crash > NULL dereference > > I.e. the 'best' title for a bug fix is to characterize it in the most > negative truthful fashion in the changelog. It sounds a bit > counterintuitive but it's true. > > So in this case the best changelog title would be something like: > > x86/pat: Fix Xorg crashes in Qemu sessions > > People will absolutely _love_ such titles, because: > > - users who are trying to find mysterious Xorg failures can grep for it > and might find it before it hits a stable kernel they are using > > - maintainers (like me) are able to see it at a glance that this fix > should go to Linus more urgently than other fixes. (and definitely more > urgently than feature patches.) > > - stable kernel maintainers and distro backporters can see it > immediately at a glance that they really want this fix. > > So by being intentionally and maximally negative in the title, you are > being very helpful to your fellow developers and users! > > Now consider the original title: > > x86/mtrr: Refactor PAT initialization code > > 99% of people will glance over such a title, which is not good. > Furhermore, maintainers like me will get _annoyed_ at such titles, > because this neutrally formulated title, while very polite, actively > hides the important detail that these patches fix real negative bugs for > real users. > > Okay?
Thanks for all the explanation and guidance! That's very helpful. Yes, I will keep this in mind. > And please also note that in the Linux kernel no-one ever 'blames' other > people for bugs. Bugs are part of the human condition and they happen all > the time as long as they are not introduced by carelessness. So in the > typical case you cannot possibly socially embarrass any good kernel > developer by reporting and fixing a bug he introduced. The typical > reaction you will get is 'oh great, one bug less to worry about!', so > socially you can be absolutely honest and 'impolite' about the negative > effects of bugs. Understood. > > The negative effects of the issue were two failures in Xorg on qemu32 > > env, which was triggered by the fact that its virtual CPU does not > > support MTRR. > > https://lkml.org/lkml/2016/3/4/775 > > #1. copy_process() failed in the check in reserve_pfn_range() > > #2. error path in copy_process() then hit WARN_ON_ONCE in > > untrack_pfn(). > > Yeah, it's nice to quote actual crash signatures as well (in a short > form) - because people hitting the crashes often do a google search and > might find the fix based on such patterns. Will do. Thanks! -Toshi