Re: [PATCH] vt_ioctl: make VT_RESIZEX behave like VT_RESIZE

2021-04-11 Thread Linus Torvalds
On Sun, Apr 11, 2021 at 2:43 PM Maciej W. Rozycki  wrote:
>
>  So it does trigger with vgacon and my old server, which I have started
> experimenting with and for a start I have switched to a new kernel for an
> unrelated purpose (now that I have relieved it from all its usual tasks
> except for the last remaining one for which I haven't got the required
> user software ported to the new system yet):
>
> "struct vt_consize"->v_vlin is ignored. Please report if you need this.
> "struct vt_consize"->v_clin is ignored. Please report if you need this.

Note that it's entirely possible that things continue to work well
despite this warning. It's unclear to me from your email if you
actually see any difference (and apparently you're not able to see it
right now due to not being close to the machine).

The fact that v_vlin/v_clin are ignored doesn't necessarily mean that
they are different from what they were before, so the warning may be a
non-issue.

> It continues using svgatextmode with its glass (CRT) VT to set my usual
> 80x37 text mode (720x576 pixel resolution) by manipulating the VGA clock
> chip and the CRT controller appropriately for a nice refresh rate of 85Hz:
>
> Chipset = `TVGA8900', Textmode clock = 44.90 MHz, 80x37 chars, CharCell = 
> 9x16. Refresh = 52.51kHz/84.7Hz.

That doesn't seem necessarily wrong to me.

>  So what's the supposed impact of this change that prompted the inclusion
> of the messages?

There _may_ be no impact at all apart from the messages.

The code _used_ to set the scan lines (v_vlin) and font height
(v_clin) from those numbers if they were non-zero, and now it just
ignores them and warns instead.

The code now just sets the font height from the actual font size when
the font is set. Which is honestly the only thing that ever made
sense. Trying to set it with v_clin is ignored, but it's entirely
possible - maybe even likely - that your user of VT_RESIZEX sets it to
the same values it already has.

Exactly because setting a font line number to anything else than the
font size isn't exactly sensible.

But if your screen looks different than it used to, that is obviously
interesting and says something actually changed (outside of the
message itself).

   Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


New warnings with gcc-11

2021-04-27 Thread Linus Torvalds
I've updated to Fedora 34 on one of my machines, and it causes a lot
of i915 warnings like

  drivers/gpu/drm/i915/intel_pm.c: In function ‘ilk_setup_wm_latency’:
  drivers/gpu/drm/i915/intel_pm.c:3059:9: note: referencing argument 3
of type ‘const u16 *’ {aka ‘const short unsigned int *’}
  drivers/gpu/drm/i915/intel_pm.c:2994:13: note: in a call to function
‘intel_print_wm_latency’

and the reason is that gcc now seems to look at the argument array
size more, and notices that

 (a) intel_print_wm_latency() takes a "const u16 wm[8]" argument

but

 (b) most of the arrays passed in tend to look like 'u16 pri_latency[5]'

I think I will make the argument type to intel_print_wm_latency() be
just "const u16 wm[]" for now, just to avoid seeing a ton of silly
warnings.

I'm not sure if there is a better solution (like making all of those
latency arrays be 8 entries in size), so I'm just letting you know
about my change in this area in case anybody has a better idea.

 Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: New warnings with gcc-11

2021-04-27 Thread Linus Torvalds
On Tue, Apr 27, 2021 at 4:43 PM Linus Torvalds
 wrote:
>
> I think I will make the argument type to intel_print_wm_latency() be
> just "const u16 wm[]" for now, just to avoid seeing a ton of silly
> warnings.

After fixing the trivial ones, this one remains:

  drivers/gpu/drm/i915/display/intel_dp.c: In function
‘intel_dp_check_mst_status’:
  drivers/gpu/drm/i915/display/intel_dp.c:4554:22: warning:
‘drm_dp_channel_eq_ok’ reading 6 bytes from a region of size 4
[-Wstringop-overread]
   4554 | !drm_dp_channel_eq_ok(&esi[10],
intel_dp->lane_count)) {
|
^~~~
  drivers/gpu/drm/i915/display/intel_dp.c:4554:22: note: referencing
argument 1 of type ‘const u8 *’ {aka ‘const unsigned char *’}
  In file included from drivers/gpu/drm/i915/display/intel_dp.c:38:
  ./include/drm/drm_dp_helper.h:1459:6: note: in a call to function
‘drm_dp_channel_eq_ok’
   1459 | bool drm_dp_channel_eq_ok(const u8 link_status[DP_LINK_STATUS_SIZE],
|  ^~~~

and I'm not fixing that one, because it actually looks like a valid
warning, and doesn't have an obvious fix.

That "esi[]" array is 14 bytes in size (DP_DPRX_ESI_LEN). So when it
does that "&esi[10]" and passes it in as an argument, then only 4
bytes remain of the array.

And drm_dp_channel_eq_ok() supposedly takes a "const u8
link_status[DP_LINK_STATUS_SIZE]", which is 6 bytes.

There may be some reason this is ok, but it does look a bit fishy, and
the compiler warning is appropriate.

Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [git pull] drm for 5.13-rc1

2021-04-28 Thread Linus Torvalds
On Tue, Apr 27, 2021 at 8:32 PM Dave Airlie  wrote:
>
> There aren't a massive amount of conflicts, only with vmwgfx when I
> did a test merge into your master yesterday, I think you should be
> able to handle them yourself, but let me know if you want me to push a
> merged tree somewhere (or if I missed something).

The conflict was easy enough to resolve, but was unusual in that my
tree had vmwgfx fixes that weren't in the development tree (ie that
commit 2ef4fb92363c: "drm/vmwgfx: Make sure bo's are unpinned before
putting them back").

Usually when I merge stuff, I can see that the fixes that were pushed
to my tree are also in the development branch. Not so this time.

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [git pull] drm for 5.13-rc1

2021-04-28 Thread Linus Torvalds
On Wed, Apr 28, 2021 at 11:14 AM Daniel Vetter  wrote:
>
> Maybe we're overdoing it a bit, but we're trying to not backmerge all
> the time for no reason at all.

Oh, I'm not complaining. I think it's reasonable if some particular
issue doesn't hold up further development.

So my email was more a statement of surprise at a new pattern than
anything else.

Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [git pull] drm for 5.13-rc1

2021-04-28 Thread Linus Torvalds
On Tue, Apr 27, 2021 at 8:32 PM Dave Airlie  wrote:
>
> This is the main drm pull request for 5.13. The usual lots of work all
> over the place. [...]
>
> Mikita Lipski:
>   drm/amd/display: Add MST capability to trigger_hotplug interface

Hmm. I've already merged this, but my clang build shows that this looks buggy:

drivers/gpu/drm/amd/amdgpu/amdgpu_dm/amdgpu_dm_debugfs.c:3015:53:
warning: address of 'aconnector->mst_port->mst_mgr' will always
evaluate to 'true' [-Wpointer-bool-conversion]
if (!(aconnector->port &&
&aconnector->mst_port->mst_mgr))
   ~~  ~~^~~

and yeah, checking for the address of a member of a structure benign
NULL doesn't really work.

I'm assuming the '&' is just a left-over cut-and-paste error or something.

Please fix after reviewing (I'm not going to blindly just remove the
'&' just to silence the warning, since I don't know the code).

Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [git pull] drm fixes for 5.14-rc4

2021-08-05 Thread Linus Torvalds
This might possibly have been fixed already by the previous drm pull,
but I wanted to report it anyway, just in case.

It happened after an uptime of over a week, so it might not be trivial
to reproduce.

It's a NULL pointer dereference in dc_stream_retain() with the code being

lock xadd %eax,0x390(%rdi) <-- trapping instruction

and that's just the

kref_get(&stream->refcount);

with a NULL 'stream' argument.

  Call Trace:
   dc_resource_state_copy_construct+0x13f/0x190 [amdgpu]
   amdgpu_dm_atomic_commit_tail+0xd5/0x1540 [amdgpu]
   commit_tail+0x97/0x180 [drm_kms_helper]
   process_one_work+0x1df/0x3a0

the oops is followed by a stream of

  [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:55:crtc-1]
hw_done or flip_done timed out

and the machine was not usable afterwards.

lspci says this is a

 49:00.0 VGA compatible controller [0300]:
   Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere
   [Radeon RX 470/480/570/570X/580/580X/590]
   [1002:67df] (rev e7) (prog-if 00 [VGA controller])

Full oops in the attachment, but I think the above is all the really
salient details.

   Linus


amd-gpu-ooops
Description: Binary data


Re: [Intel-gfx] [BUG] on reboot: bisected to: drm/i915: Shut down displays gracefully on reboot

2021-01-14 Thread Linus Torvalds
On Thu, Jan 14, 2021 at 2:01 PM Steven Rostedt  wrote:
>
> Thanks, I take it, it will be going into mainline soon.

Just got merged - it might be a good idea to verify that your problem is solved.

Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [git pull] drm for 5.12-rc1

2021-02-21 Thread Linus Torvalds
On Thu, Feb 18, 2021 at 10:06 PM Dave Airlie  wrote:
>
> Let me know if there are any issues,

gcc was happy, and I obviously already pushed out my merge, but then
when I did my clang build afterwards, it reports:

  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:764:2: warning:
variable 'structure_size' is used uninitialized whenever switch
default is taken [-Wsometimes-uninitialized]
  default:
  ^~~
  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:770:23: note:
uninitialized use occurs here
  memset(header, 0xFF, structure_size);
   ^~

and clang is very very right. That "default" case is completely
broken, and will generate a randomly sized memset. Not good.

Presumably that default case never happens, but if so it shouldn't exist.

Perhaps better yet, make the "default" case just do a "return" instead
of a break. Breaking out of the switch statement to code that cannot
possibly work is all kinds of mindless.

Kevin/Alex? This was introduced by commit de4b7cd8cb87
("drm/amd/pm/swsmu: unify the init soft gpu metrics function")

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PULL] fixes around VM_PFNMAP and follow_pfn for 5.12 merge window

2021-02-22 Thread Linus Torvalds
On Mon, Feb 22, 2021 at 2:25 AM Daniel Vetter  wrote:
>
> Cc all the mailing lists ... my usual script crashed and I had to
> hand-roll the email and screwed it up ofc :-/

Oh, and my reply thus also became just a reply to you personally.

So repeating it here, in case somebody has comments about that
access_process_vm() issue.

On Mon, Feb 22, 2021 at 2:23 AM Daniel Vetter  wrote:
>
> I've stumbled over this for my own learning and then realized there's a
> bunch of races around VM_PFNMAP mappings vs follow pfn.
>
> If you're happy with this [..]

Happy? No. But it seems an improvement.

I did react to some of this: commit 0fb1b1ed7dd9 ("/dev/mem: Only set
filp->f_mapping") talks about _what_ it does, but not so much _why_ it
does it. It doesn't seem to actually matter, and seems almost
incidental (because you've looked at f_mapping and i_mapping just
didn't matter but was adjacent.

And generic_access_phys() remains horrific. Does anything actually use
this outside of the odd magical access_remote_vm() code?

I'm wondering if that code shouldn't just be removed entirely. It's
quite old, I'm not sure it's really relevant. See commit 28b2ee20c7cb
("access_process_vm device memory infrastructure").

I guess you do debug the X server, but still.. Do you actually ever
look at device memory through the debugger? I'd hope that you'd use an
access function and make gdb call it in the context of the debuggee?

Whatever. I've pulled it, and I'm not _unhappy_ with it, but I'd also
not call myself overly giddy and over the moon happy about this code.

 Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PULL] fixes around VM_PFNMAP and follow_pfn for 5.12 merge window

2021-02-22 Thread Linus Torvalds
On Mon, Feb 22, 2021 at 2:25 AM Daniel Vetter  wrote:
>
> Cc all the mailing lists ... my usual script crashed and I had to
> hand-roll the email and screwed it up ofc :-/

Oh, and you also didn't get a pr-tracker-bot response for this for the
same reason.

Even your fixed email was ignored by the bot (I think because of the
"Re:" in the subject line).

So consider this your manual pr-tracker-bot replacement.

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PULL] topic/iomem-mmap-vs-gup

2021-05-06 Thread Linus Torvalds
[ You had a really odd Reply-to on this one ]

On Mon, May 3, 2021 at 12:15 PM Daniel Vetter  wrote:
>
> Anyway here's a small pull for you to ponder, now that the big ones are
> all through.

Well, _now_ I'm all caught up. Knock wood. Anyway, time to look at it:

> Follow-up to my pull from last merge window: kvm and vfio lost their
> very unsafe use of follow_pfn, this appropriately marks up the very
> last user for some userptr-as-buffer use-cases in media. There was
> some resistance to outright removing it, maybe we can do this in a few
> releases.

Hmm. So this looks mostly ok to me, although I think the change to the
nommu case is pretty ridiculous.

On nommu, unsafe_follow_pfn() should just be a wrapper around
follow_pfn(). There's no races when you can't remap anything. No?

Do the two media cases even work on nommu?

Finally - did you intend fo this to be a real pull request? Because
the email read to me like "think about this and tell me what you
think" rather than "please pull"..

And I have now fulfilled that "think about and tell me" part ;)

  Linus


Re: [PULL] topic/iomem-mmap-vs-gup

2021-05-08 Thread Linus Torvalds
[ Daniel, please fix your broken email setup. You have this insane
"Reply-to" list that just duplicates all the participants. Very
broken, very annoying ]

On Fri, May 7, 2021 at 8:53 AM Daniel Vetter  wrote:
>
> So personally I think the entire thing should just be thrown out, it's all
> levels of scary and we have zero-copy buffer sharing done properly with
> dma-buf since years in v4l.

So I've been looking at this more, and the more I look at it, the less
I like this series.

I think the proper fix is to just fix things.

For example, I'm looking at the v4l users of follow_pfn(), and I find
get_vaddr_frames(), which is just broken.

Fine, we know users are broken, but look at what appears to be the
main user of get_vaddr_frames(): vb2_dc_get_userptr().

What does that function do? Immediately after doing
get_vaddr_frames(), it tries to turn those pfn's into page pointers,
and then do sg_alloc_table_from_pages() on the end result.

Yes, yes, it also has that "ok, that failed, let's try to see if it's
some physically contiguous mapping" and do DMA directly to those
physical pages, but the point there is that that only happens when
they weren't normal pages to begin with.

So thew *fix* for at least that path is to

 (a) just use the regular pin_user_pages() for normal pages

 (b) perhaps keep the follow_pfn() case, but then limit it to that "no
page backing" and that physical pages case.

And honestly, the "struct frame_vector" thing already *has* support
for this, and the problem is simply that the v4l code has decided to
have the callers ask for pfn's rather than have the callers just ask
for a frame-vector that is either "pfn's with no paeg backing" _or_
"page list with proper page reference counting".

So this series of yours that just disables follow_pfn() actually seems
very wrong.

I think follow_pfn() is ok for the actual "this is not a 'struct page'
backed area", and disabling that case is wrong even going forward.

End result, I think the proper model is:

 - keep follow_pfn(), but limit it to the "not vm_normal_page()" case,
and return error for some real page mapping

 - make the get_vaddr_frames() first try "pin_user_pages()" (and
create a page array) and fall back to "follow_pfn()" if that fails (or
the other way around). Set the

IOW, get_vaddr_frames() would just do

vec->got_ref = is_pages;
vec->is_pfns = !is_pages;

and everything would just work out - the v4l code seems to already
have all the support for "it's a ofn array" vs "it's properly
refcounted pages".

So the only case we should disallow is the mixed case, that the v4l
code already seems to not be able to handle anyway (and honestly, it
looks like "got_ref/is_pfns" should be just one flag - they always
have to have the opposite values).

So I think this "unsafe_follow_pfn()" halfway step is actively wrong.
It doesn't move us forward. Quite the reverse. It just makes the
proper fix harder.

End result: not pulling it, unless somebody can explain to me in small
words why I'm wrong and have the mental capacity of a damaged rodent.

Linus


Re: New warnings with gcc-11

2021-05-08 Thread Linus Torvalds
I have heard nothing about this, and it remains the only warning from
my allmodconfig build (I have another one for drm compiled with clang,
but there I at least heard back that a fix exists).

Since I am going to release rc1 tomorrow, and I don't want to release
it with an ugly compiler warning, I took it upon myself to just fix
the code:

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fec4d42724a1bf3dcba52307e55375fdb967b852

HOWEVER.

That commit fixes the warning, and is at worst harmless. At best it
fixes an access to random stack memory. But it does smell like
somebody who actually knows how these arrays work should look at that
code.

IOW, maybe the code should actually have read 16 bytes from the Event
Status Indicator? Maybe offset 10 was wrong? Maybe
drm_dp_channel_eq_ok() should never have taken six bytes to begin
with?

It's a mystery, and I haven't heard anything otherwise, so there it is.

  Linus

On Wed, Apr 28, 2021 at 12:27 AM Jani Nikula  wrote:
>
> On Tue, 27 Apr 2021, Linus Torvalds  wrote:
> > I've updated to Fedora 34 on one of my machines, and it causes a lot
> > of i915 warnings like
> >
> >   drivers/gpu/drm/i915/intel_pm.c: In function ‘ilk_setup_wm_latency’:
> >   drivers/gpu/drm/i915/intel_pm.c:3059:9: note: referencing argument 3
> > of type ‘const u16 *’ {aka ‘const short unsigned int *’}
> >   drivers/gpu/drm/i915/intel_pm.c:2994:13: note: in a call to function
> > ‘intel_print_wm_latency’
> >
> > and the reason is that gcc now seems to look at the argument array
> > size more, and notices that
>
> Arnd Bergmann reported some of these a while back. I think we have some
> of them fixed in our -next already, but not all. Thanks for the
> reminder.


Re: [git pull] drm fixes round two for 5.13-rc1

2021-05-09 Thread Linus Torvalds
On Sun, May 9, 2021 at 11:16 AM Dave Airlie  wrote:
>
> Bit later than usual, I queued them all up on Friday then promptly
> forgot to write the pull request email. This is mainly amdgpu fixes,
> with some radeon/msm/fbdev and one i915 gvt fix thrown in.

Hmm. Gcc seems ok with this, but clang complains:

   drivers/video/fbdev/core/fbmem.c:736:21: warning: attribute
declaration must precede definition [-Wignored-attributes]
   static const struct __maybe_unused seq_operations proc_fb_seq_ops = {
   ^

but I noticed it only after I had already pushed out the pull.

I'm actually surprised that gcc accepted that horrid mess: putting
"__maybe_unused" between the "struct" and the struct name is very very
wrong.

I fixed it up after the merge due to not noticing earlier..

Maybe the drm test robots should start testing with clang too?

   Linus


Re: [PATCH] video: fbdev: vga16fb: fix OOB write in vga16fb_imageblit()

2021-05-14 Thread Linus Torvalds
On Fri, May 14, 2021 at 9:20 AM Tetsuo Handa
 wrote:
>
> Currently it is impossible to control upper limit of rows/columns values
> based on amount of memory reserved for the graphical screen, for
> resize_screen() calls vc->vc_sw->con_resize() only if vc->vc_mode is not
> already KD_GRAPHICS

Honestly, the saner approach would seem to be to simply error out if
vc_mode is KD_GRAPHICS.

Doing VT_RESIZE while in KD_GRAPHICS mode seems _very_ questionable,
and is clearly currently very buggy.

So why not just say "that clearly already doesn't work, so make it
explicitly not permitted"?

  Linus


Re: [PATCH] video: fbdev: vga16fb: fix OOB write in vga16fb_imageblit()

2021-05-14 Thread Linus Torvalds
On Fri, May 14, 2021 at 10:29 AM Linus Torvalds
 wrote:
>
> So why not just say "that clearly already doesn't work, so make it
> explicitly not permitted"?

IOW, something like this would seem fairly simple and straightforward:

  diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
  index 01645e87b3d5..f24e627b7402 100644
  --- a/drivers/tty/vt/vt.c
  +++ b/drivers/tty/vt/vt.c
  @@ -1171,8 +1171,13 @@ static inline int resize_screen(struct
vc_data *vc, int width, int height,
  /* Resizes the resolution of the display adapater */
  int err = 0;

  -   if (vc->vc_mode != KD_GRAPHICS && vc->vc_sw->con_resize)
  +   if (vc->vc_sw->con_resize) {
  +   // If we have a resize function but are in KD_GRAPHICS mode,
  +   // we can't actually do a resize and need to error out.
  +   if (vc->vc_mode == KD_GRAPHICS)
  +   return -EINVAL;
  err = vc->vc_sw->con_resize(vc, width, height, user);
  +   }

  return err;
   }

not tested, but it looks ObviouslyCorrect(tm), and since we know the
old case didn't work right, it seems very safe to do.

   Linus


Re: [git pull] drm fixes for 5.13-rc2

2021-05-14 Thread Linus Torvalds
On Thu, May 13, 2021 at 7:34 PM Dave Airlie  wrote:
>
> Just realised I got the tag header wrong, these are the rc2 fixes.

Heh. The tag message also says:

> vc4:
> - drop an used function

which just makes me think you may have started drinking early ;)

I fixed it up. Skål!

 Linus


Re: [PATCH] video: fbdev: vga16fb: fix OOB write in vga16fb_imageblit()

2021-05-14 Thread Linus Torvalds
On Fri, May 14, 2021 at 10:37 AM Linus Torvalds
 wrote:
>
> IOW, something like this would seem fairly simple and straightforward:

Proper patch in case syzbot can test this..

  Linus
From b33ca195cecea478768de353b3ae976c07a65615 Mon Sep 17 00:00:00 2001
From: Linus Torvalds 
Date: Fri, 14 May 2021 11:06:12 -0700
Subject: [PATCH] vt: don't allow text-mode resizing when in KD_GRAPHICS mode

The VT layer itself just keeps track of the underlying text contents
just fine, but if the underlying hardware driver has a con_resize()
function, we can't just ignore it when in KD_GRAPHICS mode.

So just refuse to do a text mode resize if we're not in text mode.

Reported-by: Tetsuo Handa 
Reported-by: syzbot 
Signed-off-by: Linus Torvalds 
---
 drivers/tty/vt/vt.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 01645e87b3d5..f24e627b7402 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -1171,8 +1171,13 @@ static inline int resize_screen(struct vc_data *vc, int width, int height,
 	/* Resizes the resolution of the display adapater */
 	int err = 0;
 
-	if (vc->vc_mode != KD_GRAPHICS && vc->vc_sw->con_resize)
+	if (vc->vc_sw->con_resize) {
+		// If we have a resize function but are in KD_GRAPHICS mode,
+		// we can't actually do a resize and need to error out.
+		if (vc->vc_mode == KD_GRAPHICS)
+			return -EINVAL;
 		err = vc->vc_sw->con_resize(vc, width, height, user);
+	}
 
 	return err;
 }
-- 
2.31.1.365.ga2a05a39c5



Re: [PATCH] video: fbdev: vga16fb: fix OOB write in vga16fb_imageblit()

2021-05-14 Thread Linus Torvalds
On Fri, May 14, 2021 at 1:25 PM Maciej W. Rozycki  wrote:
>
>  Overall I think it does make sense to resize the text console at any
> time, even if the visible console (VT) chosen is in the graphics mode,

It might make sense, but only if we call the function to update the
low-level data.

Not calling it, and then starting to randomly use the (wrong)
geometry, and just limiting it so that it's all within the buffer -
THAT does not make sense.

So I think your patch is fundamentally wrong. It basically says "let's
use random stale incorrect data, but just make sure that the end
result is still within the allocated buffer".

My patch is at least conceptually sane.

An alternative would be to just remove the "vcmode != KD_GRAPHICS"
check entirely, and always call con_resize() to update the low-level
data, but honestly, that seems very likelty to break something very
fundamentally, since it's not how any of fbcon has ever been tested,

Another alternative would be to just delay the resize to when vcmode
is put back to text mode again. That sounds somewhat reasonable to me,
but it's a pretty big thing.

But no, your patch to just "knowingly use entirely wrong values, then
add a limit check because we know the values are possibly garbage and
not consistent with reality" is simply not acceptable.

  Linus


Re: [PATCH] video: fbdev: vga16fb: fix OOB write in vga16fb_imageblit()

2021-05-14 Thread Linus Torvalds
On Fri, May 14, 2021 at 1:32 PM Linus Torvalds
 wrote:
>
> Another alternative would be to just delay the resize to when vcmode
> is put back to text mode again. That sounds somewhat reasonable to me,
> but it's a pretty big thing.

Actually thinking more about that option, it sounds horrible. It would
mean that we'd continue to use the old geometry for the actual VC
buffers for a random time, and then change it to the new geometry at
some arbitrary point.

So I think the only reasonable approach (apart from just my "don't do
that then") might be to just always call ->con_resize().

There are only actually three cases of "->con_resize()", so it might
not be too bad.

Looking at it, both sisusbcon_resize() and vgacon_resize() seem to be
trivially fine in KD_GRAPHICS mode.

vgacon already seems to have that "!vga_is_gfx" test, and does
vgacon_doresize() at vgacon_switch(). It might need to add a
vgacon_doresize() to the vgacon_blank() case 0 code so that it
actually does the right thing when going back to KD_TEXT mode.

And fbcon_resize() looks like it might be mostly ok with it too.
Again, there is a con_is_visible() test, and I suspect that might need
to be changed to

if (con_is_visible(vc) && vc->vc_mode == KD_TEXT)

instead,  but it doesn't look _too_ bad.

So I think just removing the "vc->vc_mode != KD_GRAPHICS" test from
resize_screen() might be the way to go. That way, the low-level data
structures actually are in sync with the resize, and the "out of
bounds" bug should never happen.

Would you mind testing that?

   Linus


Re: [PATCH v2] tty: vt: always invoke vc->vc_sw->con_resize callback

2021-05-15 Thread Linus Torvalds
On Sat, May 15, 2021 at 9:33 AM Maciej W. Rozycki  wrote:
>
>  NB I suggest that you request your change to be backported, i.e. post v3
> with:
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Cc: sta...@vger.kernel.org # v2.6.12+

I've applied it to my tree, but let's wait to see that it doesn't
cause any issues before notifying the stable people.

   Linus


Re: [PATCH AUTOSEL 5.12 5/5] tty: vt: always invoke vc->vc_sw->con_resize callback

2021-05-17 Thread Linus Torvalds
On Mon, May 17, 2021 at 6:09 PM Sasha Levin  wrote:
>
> From: Tetsuo Handa 
>
> [ Upstream commit ffb324e6f874121f7dce5bdae5e05d02baae7269 ]

So I think the commit is fine, and yes, it should be applied to
stable, but it's one of those "there were three different patches in
as many days to fix the problem, and this is the right one, but maybe
stable should hold off for a while to see that there aren't any
problem reports".

I don't think there will be any problems from this, but while the
patch is tiny, it's conceptually quite a big change to something that
people haven't really touched for a long time.

So use your own judgement, but it might be a good idea to wait a week
before backporting this to see if anything screams.

  Linus


Re: [git pull] drm for 5.14-rc1

2021-07-01 Thread Linus Torvalds
On Wed, Jun 30, 2021 at 9:34 PM Dave Airlie  wrote:
>
> Hi Linus,
>
> This is the main drm pull request for 5.14-rc1.
>
> I've done a test pull into your current tree, and hit two conflicts
> (one in vc4, one in amdgpu), both seem pretty trivial, the amdgpu one
> is recent and sfr sent out a resolution for it today.

Well, the resolutions may be trivial, but the conflict made me look at
the code, and it's buggy.

Commit 04d8d73dbcbe ("drm/amdgpu: add common HMM get pages function")
is broken. It made the code do

mmap_read_lock(mm);
vma = find_vma(mm, start);
mmap_read_unlock(mm);

and then it *uses* that "vma" after it has dropped the lock.

That's a big no-no - once you've dropped the lock, the vma contents
simply aren't reliable any more. That mapping could now be unmapped
and removed at any time.

Now, the conflict actually made one of the uses go away (switching to
vma_lookup() means that the subsequent code no longer needs to look at
"vm_start" to verify we're actually _inside_ the vma), but it still
checks for vma->vm_file afterwards.

So those locking changes in commit 04d8d73dbcbe are completely bogus.

I tried to fix up that bug while handling the conflict, but who knows
what else similar is going on elsewhere.

So I would ask people to

 (a) verify that I didn't make things worse as I fixed things up (note
how I had to change the last argument to amdgpu_hmm_range_get_pages()
from false to true etc).

 (b) go and look at their vma lookup code: you can't just look up a
vma under the lock, and then drop the lock, and then think things stay
stable.

In particular for that (b) case: it is *NOT* enough to look up
vma->vm_file inside the lock and cache that. No - if the test is about
"no backing file before looking up pages", then you have to *keep*
holding the lock until after you've actually looked up the pages!

Because otherwise any test for "vma->vm_file" is entirely pointless,
for the same reason it's buggy to even look at it after dropping the
lock: because once you've dropped the lock, the thing you just tested
for might not be true any more.

So no, it's not valid to do

bool has_file = vma && vma->vm_file;

and then drop the lock, because you don't use 'vma' any more as a
pointer, and then use 'has_file' outside the lock. Because after
you've dropped the lock, 'has_file' is now meaningless.

So it's not just about "you can't look at vma->vm_file after dropping
the lock". It's more fundamental than that. Any *decision* you make
based on the vma is entirely pointless and moot after the lock is
dropped!

Did I fix it up correctly? Who knows. The code makes more sense to me
now and seems valid. But I really *really* want to stress how locking
is important.

You also can't just unlock in the middle of an operation - even if you
then take the lock *again* later (as amdgpu_hmm_range_get_pages() then
did), the fact that you unlocked in the middle means that all the
earlier tests you did are simply no longer valid when you re-take the
lock.

 Linus


Re: Linux 5.14-rc1

2021-07-12 Thread Linus Torvalds
On Mon, Jul 12, 2021 at 12:08 AM Jon Masters  wrote:
>
> I happened to be installing a Fedora 34 (x86) VM for something and did a
> test kernel compile that hung on boot. Setting up a serial console I get
> the below backtrace from ttm but I have not had chance to look at it.

It's a NULL pointer in qxl_bo_delete_mem_notify(), with the code
disassembling to

  16: 55push   %rbp
  17: 48 89 fd  mov%rdi,%rbp
  1a: e8 a2 02 00 00callq  0x2c1
  1f: 84 c0test   %al,%al
  21: 74 0dje 0x30
  23: 48 8b 85 68 01 00 00 mov0x168(%rbp),%rax
  2a:* 83 78 10 03  cmpl   $0x3,0x10(%rax) <-- trapping instruction
  2e: 74 02je 0x32
  30: 5dpop%rbp
  31: c3retq

and that "cmpl $3" looks exactly like that

if (bo->resource->mem_type == TTM_PL_PRIV

and the bug is almost certainly from commit d3116756a710 ("drm/ttm:
rename bo->mem and make it a pointer"), which did

-   if (bo->mem.mem_type == TTM_PL_PRIV ...
+   if (bo->resource->mem_type == TTM_PL_PRIV ...

and claimed "No functional change".

But clearly the "bo->resource" pointer is NULL.

Added guilty parties and dri-devel mailing list.

Christian? Full report at

   
https://lore.kernel.org/lkml/a9473821-1d53-0037-7590-aeaf8e85e...@jonmasters.org/

but there's not a whole lot else there that is interesting except for
the call trace:

  ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
  ttm_bo_release+0x1a1/0x300 [ttm]
  ttm_bo_delayed_delete+0x1be/0x220 [ttm]
  ttm_device_delayed_workqueue+0x18/0x40 [ttm]
  process_one_work+0x1ec/0x390
  worker_thread+0x53/0x3e0

so it's presumably the cleanup phase and perhaps "bo->resource" has
been deallocated and cleared?

  Linus


Re: [patch V3 22/37] highmem: High implementation details and document API

2020-11-03 Thread Linus Torvalds
On Tue, Nov 3, 2020 at 2:33 AM Thomas Gleixner  wrote:
>
> +static inline void *kmap(struct page *page)
> +{
> +   void *addr;
> +
> +   might_sleep();
> +   if (!PageHighMem(page))
> +   addr = page_address(page);
> +   else
> +   addr = kmap_high(page);
> +   kmap_flush_tlb((unsigned long)addr);
> +   return addr;
> +}
> +
> +static inline void kunmap(struct page *page)
> +{
> +   might_sleep();
> +   if (!PageHighMem(page))
> +   return;
> +   kunmap_high(page);
> +}

I have no complaints about the patch, but it strikes me that if people
want to actually have much better debug coverage, this is where it
should be (I like the "every other address" thing too, don't get me
wrong).

In particular, instead of these PageHighMem(page) tests, I think
something like this would be better:

   #ifdef CONFIG_DEBUG_HIGHMEM
 #define page_use_kmap(page) ((page),1)
   #else
 #define page_use_kmap(page) PageHighMem(page)
   #endif

adn then replace those "if (!PageHighMem(page))" tests with "if
(!page_use_kmap())" instead.

IOW, in debug mode, it would _always_ remap the page, whether it's
highmem or not. That would really stress the highmem code and find any
fragilities.

No?

Anyway, this is all sepatrate from the series, which still looks fine
to me. Just a reaction to seeing the patch, and Thomas' earlier
mention that the highmem debugging doesn't actually do much.

   Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [git pull] drm next pull for 5.10-rc1

2020-11-15 Thread Linus Torvalds
On Tue, Nov 3, 2020 at 2:20 PM Kirill A. Shutemov  wrote:
>
> On Thu, Oct 15, 2020 at 11:33:08AM +1000, Dave Airlie wrote:
> >   drm/nouveau/kms: Search for encoders' connectors properly
>
> This commit (09838c4efe9a) broke boot for me. These two hunks in
> particular:

Christ. It's been two weeks. I'm doing -rc4 today, and I still don't
have the fix.

The problem seems entirely obvious, as reported by Kirill: the nv50
code unconditionally calls the "atomic_{dis,en}able()" functions, even
when not everybody was converted.

The fix seems to be to either just do the conversion of the remaining
cases (which looks like just adding an argument to the remaining
functions, and using that for the "atomic" callback), or the trivial
suggestion by Kirill from two weeks ago:

> I hacked up patch to use help->disable/help->enable if atomic_ versions
> are NULL. It worked.

Kirill, since the nouveau people aren't fixing this, can you just send
me your tested patch?

Lyude/Ben - let me just say that I think this is all a huge disgrace.

You had a problem report with a bisected commit, a suggested fix, and
two weeks later there's absolutely _nothing_.

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [git pull] drm nouveau urgent fixes for 5.10-rc4

2020-11-15 Thread Linus Torvalds
On Sun, Nov 15, 2020 at 12:41 PM Dave Airlie  wrote:
>
> As mentioned I did have a fixes pull from Ben from after I'd sent you
> out stuff, it contains the fix for the regression reported in the rc1
> thread along with two others.

Thanks, pulled,

 Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: Linux 5.10-rc4

2020-11-18 Thread Linus Torvalds
On Wed, Nov 18, 2020 at 4:12 AM David Laight  wrote:
>
> I've got the 'splat' below during boot.
> This is an 8-core C2758 Atom cpu using the on-board/cpu graphics.
> User space is Ubuntu 20.04.
>
> Additionally the X display has all the colours and alignment slightly
> messed up.
> 5.9.0 was ok.
> I'm just guessing the two issues are related.

Sounds likely.  But it would be lovely if you could bisect when
exactly the problem(s) started to both verify that, and just to
pinpoint the exact change..

I'm adding Thomas Zimmermann to the cc, because he did that "drm/ast:
Program display mode in CRTC's atomic_enable" which looks relevant in
that it's right in that call-chain.

Did some initialization perhaps get overlooked?

And Dave and Daniel and the drm list cc'd as well..

Full splat left quoted below for new people and list.

Linus

> [   20.809891] WARNING: CPU: 0 PID: 973 at 
> drivers/gpu/drm/drm_gem_vram_helper.c:284 drm_gem_vram_offset+0x35/0x40 
> [drm_vram_helper]
> [   20.821543] Modules linked in: nls_iso8859_1 dm_multipath scsi_dh_rdac 
> scsi_dh_emc scsi_dh_alua ipmi_ssif intel_powerclamp coretemp kvm_intel kvm 
> joydev input_leds ipmi_si intel_cstate ipmi_devintf ipmi_msghandler mac_hid 
> sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 btrfs 
> blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy 
> async_pq async_xor async_tx libcrc32c xor raid6_pq raid1 raid0 multipath 
> linear ast drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt 
> fb_sys_fops cec drm_ttm_helper ttm crct10dif_pclmul crc32_pclmul 
> ghash_clmulni_intel gpio_ich drm aesni_intel hid_generic glue_helper 
> crypto_simd igb usbhid cryptd ahci i2c_i801 hid libahci i2c_smbus lpc_ich dca 
> i2c_ismt i2c_algo_bit
> [   20.887477] CPU: 0 PID: 973 Comm: gnome-shell Not tainted 5.10.0-rc4+ #78
> [   20.894274] Hardware name: Supermicro A1SAi/A1SRi, BIOS 1.1a 08/27/2015
> [   20.900896] RIP: 0010:drm_gem_vram_offset+0x35/0x40 [drm_vram_helper]
> [   20.907342] Code: 00 48 89 e5 85 c0 74 17 48 83 bf 78 01 00 00 00 74 18 48 
> 8b 87 80 01 00 00 5d 48 c1 e0 0c c3 0f 0b 48 c7 c0 ed ff ff ff 5d c3 <0f> 0b 
> 31 c0 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 87 18 06
> [   20.926100] RSP: 0018:9f59811d3a68 EFLAGS: 00010246
> [   20.931339] RAX: 0002 RBX: 8b46861e20c0 RCX: 
> c032d600
> [   20.938479] RDX: 8b468f47a000 RSI: 8b46861e2000 RDI: 
> 8b468f9acc00
> [   20.945622] RBP: 9f59811d3a68 R08: 0040 R09: 
> 8b46864ce288
> [   20.952769] R10:  R11: 0001 R12: 
> 8b468f47a000
> [   20.959915] R13:  R14:  R15: 
> 8b468ad2bf00
> [   20.967057] FS:  7f5b37ac5cc0() GS:8b49efc0() 
> knlGS:
> [   20.975149] CS:  0010 DS:  ES:  CR0: 80050033
> [   20.980904] CR2: 7f5b3d093f00 CR3: 000103438000 CR4: 
> 001006f0
> [   20.988047] Call Trace:
> [   20.990506]  ast_cursor_page_flip+0x22/0x100 [ast]
> [   20.995313]  ast_cursor_plane_helper_atomic_update+0x46/0x70 [ast]
> [   21.001524]  drm_atomic_helper_commit_planes+0xbd/0x220 [drm_kms_helper]
> [   21.008243]  drm_atomic_helper_commit_tail_rpm+0x3a/0x70 [drm_kms_helper]
> [   21.015062]  commit_tail+0x99/0x130 [drm_kms_helper]
> [   21.020050]  drm_atomic_helper_commit+0x123/0x150 [drm_kms_helper]
> [   21.026269]  drm_atomic_commit+0x4a/0x50 [drm]
> [   21.030737]  drm_atomic_helper_update_plane+0xe7/0x140 [drm_kms_helper]
> [   21.037384]  __setplane_atomic+0xcc/0x110 [drm]
> [   21.041953]  drm_mode_cursor_universal+0x13e/0x260 [drm]
> [   21.047299]  drm_mode_cursor_common+0xef/0x220 [drm]
> [   21.052287]  ? alloc_set_pte+0x10d/0x6d0
> [   21.056244]  ? drm_mode_cursor_ioctl+0x60/0x60 [drm]
> [   21.061242]  drm_mode_cursor2_ioctl+0xe/0x10 [drm]
> [   21.066067]  drm_ioctl_kernel+0xae/0xf0 [drm]
> [   21.070455]  drm_ioctl+0x241/0x3f0 [drm]
> [   21.074415]  ? drm_mode_cursor_ioctl+0x60/0x60 [drm]
> [   21.079401]  __x64_sys_ioctl+0x91/0xc0
> [   21.083167]  do_syscall_64+0x38/0x90
> [   21.086755]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [   21.091813] RIP: 0033:0x7f5b3cf1350b
> [   21.095403] Code: 0f 1e fa 48 8b 05 85 39 0d 00 64 c7 00 26 00 00 00 48 c7 
> c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 
> 01 f0 ff ff 73 01 c3 48 8b 0d 55 39 0d 00 f7 d8 64 89 01 48
> [   21.114154] RSP: 002b:7ffef1966588 EFLAGS: 0246 ORIG_RAX: 
> 0010
> [   21.121730] RAX: ffda RBX: 7ffef19665c0 RCX: 
> 7f5b3cf1350b
> [   21.128870] RDX: 7ffef19665c0 RSI: c02464bb RDI: 
> 0009
> [   21.136013] RBP: c02464bb R08: 0040 R09: 
> 0004
> [   21.143157] R10: 0002 R11: 0246 R12: 
> 561ec9d10060
> [   21.150295] R13: 0009 R14: 561eca2cc9a0 R15: 
> 0040

Re: [git pull] drm for 5.18-rc1

2022-03-24 Thread Linus Torvalds
On Wed, Mar 23, 2022 at 7:30 PM Dave Airlie  wrote:
>
> This is the main drm pull request for 5.18.
>
> The summary changelog is below, lots of work all over,
> Intel improving DG2 support, amdkfd CRIU support, msm
> new hw support, and faster fbdev support.

Ok, so this was annoying.

I've merged it, but will note three things that I really hope get
fixed / checked:

 (1) My merge resolution looked mostly trivial, except for an annoying
conflict between commits

4ed545e7d100 ("dt-bindings: display: mediatek: disp: split
each block to individual yaml")

and

6d0990e6e844 ("media: dt-binding: mediatek: Get rid of
mediatek,larb for multimedia HW")

where one of them splits up a file that is modified by the other.

I ended up just getting rid of all the "mediatek,larb" mentions in the
split-up files, despite the fact that (a) those mentions can be found
elsewhere and (b) the split-up did other changes too, so maybe it's
wrong.

 (2) As Guenter reported, the fbdev performance improvement of
cfb_imageblit() is broken.

I was going to just revert it, but I see that there is a two-series
patch to fix it at

https://lore.kernel.org/all/20220313192952.12058-1-tzimmerm...@suse.de/

so I merged it in that broken form, in the hope that this set of fixes
will be sent to me asap.

 (3) Very similarly to (2), but broken mediatek DT files.

I hope my changes in (1) didn't make things worse, but there's a
series of fixes as

https://lore.kernel.org/all/20220309134702.9942-1-jason-jh@mediatek.com/

and again I hope I'll get those fixes from the proper places asap.

I considered just delaying merging this all entirely, but it seems
better to get this all in, with the known problems and known fixes,
and see if we hit something _else_ too.

Anyway, let's hope I didn't miss anything, and that those are the only
major issues.

  Linus


Re: [git pull] drm fixes for 5.18-rc1

2022-03-25 Thread Linus Torvalds
On Thu, Mar 24, 2022 at 7:13 PM Dave Airlie  wrote:
>
> Some fixes were queued up in and in light of the fbdev regressions,
> I've pulled those in as well,

Thanks, pulled.

It sounds (from the other thread) that the mediatek DT issue is also
about to be fixed, even if it's not yet here.

But that hopefully (probably?) affects fewer people and testing robots.

 Linus


amdgpu link problem (was Re: [git pull] drm for 5.18-rc1)

2022-03-28 Thread Linus Torvalds
I didn't notice this until now, probably because everything still
_works_, but I get a new big warning splat at bootup on my main
workstation these days as of the merge window changes.

The full warning is attached, but it's basically the ASSERT(0) at line 938 in

  drivers/gpu/drm/amd/display/dc/core/dc_link.c

and it looks to have been introduced by commit c282d9512cdd
("drm/amd/display: factor out dp detection link training and mst top
detection").

This is the same old setup I've reported before, with some random
Radeon card with two monitors attached (PCI ID 1002:67df rev e7,
subsystem ID 1da2:e353).

I think the card went by the name "Sapphire Pulse RX 580 8GB" in case
any of that matters, but it's been working fine.

It still works fine, it just has a big ugly boot-time splat.

As mentioned, two 4K monitors attached, both over HDMI.

If there is any particular info you want, just let me know where/how
to find it, and I can provide.

  Linus


warn
Description: Binary data


Re: [PATCH] Kbuild: remove -std=gnu89 from compiler arguments

2022-02-27 Thread Linus Torvalds
On Sun, Feb 27, 2022 at 1:54 PM Arnd Bergmann  wrote:
>
> Since the differences between gnu99, gnu11 and gnu17 are fairly minimal
> and mainly impact warnings at the -Wpedantic level that the kernel
> never enables, the easiest way is to just leave out the -std=gnu89
> argument entirely, and rely on the compiler default language setting,
> which is gnu11 for gcc-5, and gnu1x/gnu17 for all other supported
> versions of gcc or clang.

Honestly, I'd rather keep the C version we support as some explicit
thing, instead of "whatever the installed compiler is".

Not only do I suspect that you can set it in gcc spec files (so the
standard version might actually be site-specific, not compiler version
specific), but particularly with clang, I'd like that "GNU extensions
enabled" to be explicit. Yes, maybe it's the default, but let's make
sure.

The C version level has traditionally had a lot of odd semantic
meaning details - you mention "inline", others have existed. So it's
not just the actual new features that some C version implements, it's
those kind of "same syntax, different meaning" issues. I really don't
think that's something we want in the kernel any more.

Been there, done that, and we did the explicit standards level for a reason.

It may be true that c99/c11/c17 are all very similar, and don't have
those issues. Or maybe they do.

And I don't want somebody with a newer compiler version to not notice
that he or she ended up using a c17 feature, just because _that_
compiler supported it, and then other people get build errors because
their compilers use gnu11 instead by default.

Put another way: I see absolutely no upside to allowing different
users using higher/lower versions of the standard. There are only
downsides.

If gnu11 is supported by gcc-5.1 and up, and all the relevant clang
versions, then let's just pick that.

And if there are any possible future advantages to gnu17 (or eventual
gnu2x versions), let's document those, so that we can say "once our
compiler version requirements go up sufficiently, we'll move to gnuXX
because we want to take advantage of YY".

Please?

   Linus

   Linus


Re: [greybus-dev] [PATCH] Kbuild: remove -std=gnu89 from compiler arguments

2022-02-27 Thread Linus Torvalds
On Sun, Feb 27, 2022 at 3:04 PM Alex Elder  wrote:
>
> Glancing at the Greybus code, I don't believe there's any
> reason it needs to shift a negative value.  Such warnings
> could be fixed by making certain variables unsigned, for
> example.

As mentioned in the original thread, making things unsigned actually
is likely to introduce bugs and make things worse.

The warning is simply bogus, and the fact that it was enabled by
-Wextra in gcc for std=gnu99 and up was a mistake that looks likely to
be fixed in gcc.

So don't try to "fix" the code to make any possible warnings go away.
You may just make things worse.

(That is often true in general for the more esoteric warnings, but in
this case it's just painfully more obvious).

  Linus


Re: [PATCH] [v2] Kbuild: move to -std=gnu11

2022-02-28 Thread Linus Torvalds
On Mon, Feb 28, 2022 at 3:37 AM Arnd Bergmann  wrote:
>
> I think the KBUILD_USERCFLAGS portion and the modpost.c fix for it
> make sense regardless of the -std=gnu11 change

I do think they make sense, but I want to note again that people doing
cross builds obviously use different tools for user builds than for
the kernel. In fact, even not cross-building, we've had situations
where the "kbuild" compiler is different from the host compiler,
because people have upgraded one but not the other (upgrading the
kernel build environment is actually much easier than upgrading the
host build environment, because you don't need all the random
libraries etc, and you can literally _just_ build your own gcc and
binutils)

And we have *not* necessarily required that the host tools match the
kernel tools.

So I could well imagine that there are people who build their kernels,
but their host build environment might be old enough that -std=gnu11
is problematic for that part.

And note how any change to  KBUILD_USERCFLAGS is reflected in KBUILD_HOSTCFLAGS.

So I would suggest that the KBUILD_USERCFLAGS part of the patch (and
the modpost.c change that goes with it) be done as a separate commit.
Because we might end up reverting that part.

Hmm?

   Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Linus Torvalds
On Mon, Feb 28, 2022 at 4:19 AM Christian König
 wrote:
>
> I don't think that using the extra variable makes the code in any way
> more reliable or easier to read.

So I think the next step is to do the attached patch (which requires
that "-std=gnu11" that was discussed in the original thread).

That will guarantee that the 'pos' parameter of list_for_each_entry()
is only updated INSIDE the for_each_list_entry() loop, and can never
point to the (wrongly typed) head entry.

And I would actually hope that it should actually cause compiler
warnings about possibly uninitialized variables if people then use the
'pos' pointer outside the loop. Except

 (a) that code in sgx/encl.c currently initializes 'tmp' to NULL for
inexplicable reasons - possibly because it already expected this
behavior

 (b) when I remove that NULL initializer, I still don't get a warning,
because we've disabled -Wno-maybe-uninitialized since it results in so
many false positives.

Oh well.

Anyway, give this patch a look, and at least if it's expanded to do
"(pos) = NULL" in the entry statement for the for-loop, it will avoid
the HEAD type confusion that Jakob is working on. And I think in a
cleaner way than the horrid games he plays.

(But it won't avoid possible CPU speculation of such type confusion.
That, in my opinion, is a completely different issue)

I do wish we could actually poison the 'pos' value after the loop
somehow - but clearly the "might be uninitialized" I was hoping for
isn't the way to do it.

Anybody have any ideas?

Linus


p
Description: Binary data


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Linus Torvalds
On Mon, Feb 28, 2022 at 11:56 AM Linus Torvalds
 wrote:
>
> I do wish we could actually poison the 'pos' value after the loop
> somehow - but clearly the "might be uninitialized" I was hoping for
> isn't the way to do it.

Side note: we do need *some* way to do it.

Because if we make that change, and only set it to another pointer
when not the head, then we very much change the semantics of
"list_for_each_head()". The "list was empty" case would now exit with
'pos' not set at all (or set to NULL if we add that). Or it would be
set to the last entry.

And regardless, that means that all the

if (pos == head)

kinds of checks after the loop would be fundamentally broken.

Darn. I really hoped for (and naively expected) that we could actually
have the compiler warn about the use-after-loop case. That whole
"compiler will now complain about bad use" was integral to my clever
plan to use the C99 feature of declaring the iterator inside the loop.

But my "clever plan" was apparently some ACME-level Wile E. Coyote sh*t.

Darn.

   Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Linus Torvalds
On Mon, Feb 28, 2022 at 12:03 PM Linus Torvalds
 wrote:
>
> Side note: we do need *some* way to do it.

Ooh.

This patch is a work of art.

And I mean that in the worst possible way.

We can do

typeof(pos) pos

in the 'for ()' loop, and never use __iter at all.

That means that inside the for-loop, we use a _different_ 'pos' than outside.

And then the compiler will not see some "might be uninitialized", but
the outer 'pos' *will* be uninitialized.

Unless, of course, the outer 'pos' had that pointless explicit initializer.

Here - can somebody poke holes in this "work of art" patch?

 Linus
 Makefile   | 2 +-
 arch/x86/kernel/cpu/sgx/encl.c | 2 +-
 include/linux/list.h   | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index daeb5c88b50b..cc4b0a266af0 100644
--- a/Makefile
+++ b/Makefile
@@ -515,7 +515,7 @@ KBUILD_CFLAGS   := -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs \
 		   -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE \
 		   -Werror=implicit-function-declaration -Werror=implicit-int \
 		   -Werror=return-type -Wno-format-security \
-		   -std=gnu89
+		   -std=gnu11
 KBUILD_CPPFLAGS := -D__KERNEL__
 KBUILD_AFLAGS_KERNEL :=
 KBUILD_CFLAGS_KERNEL :=
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 48afe96ae0f0..87db2f3936b0 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -450,7 +450,7 @@ static void sgx_mmu_notifier_release(struct mmu_notifier *mn,
  struct mm_struct *mm)
 {
 	struct sgx_encl_mm *encl_mm = container_of(mn, struct sgx_encl_mm, mmu_notifier);
-	struct sgx_encl_mm *tmp = NULL;
+	struct sgx_encl_mm *tmp;
 
 	/*
 	 * The enclave itself can remove encl_mm.  Note, objects can't be moved
diff --git a/include/linux/list.h b/include/linux/list.h
index dd6c2041d09c..708078b2f24d 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -634,9 +634,9 @@ static inline void list_splice_tail_init(struct list_head *list,
  * @head:	the head for your list.
  * @member:	the name of the list_head within the struct.
  */
-#define list_for_each_entry(pos, head, member)\
-	for (pos = list_first_entry(head, typeof(*pos), member);	\
-	 !list_entry_is_head(pos, head, member);			\
+#define list_for_each_entry(pos, head, member)	\
+	for (typeof(pos) pos = list_first_entry(head, typeof(*pos), member);	\
+	 !list_entry_is_head(pos, head, member);	\
 	 pos = list_next_entry(pos, member))
 
 /**


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Linus Torvalds
On Mon, Feb 28, 2022 at 12:10 PM Linus Torvalds
 wrote:
>
> We can do
>
> typeof(pos) pos
>
> in the 'for ()' loop, and never use __iter at all.
>
> That means that inside the for-loop, we use a _different_ 'pos' than outside.

The thing that makes me throw up in my mouth a bit is that in that

typeof(pos) pos

the first 'pos' (that we use for just the typeof) is that outer-level
'pos', IOW it's a *different* 'pos' than the second 'pos' in that same
declaration that declares the inner level shadowing new 'pos'
variable.

If I was a compiler person, I would say "Linus, that thing is too ugly
to live", and I would hate it. I'm just hoping that even compiler
people say "that's *so* ugly it's almost beautiful".

Because it does seem to work. It's not pretty, but hey, it's not like
our headers are really ever be winning any beauty contests...

Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Linus Torvalds
On Mon, Feb 28, 2022 at 12:16 PM Matthew Wilcox  wrote:
>
> Then we can never use -Wshadow ;-(  I'd love to be able to turn it on;
> it catches real bugs.

Oh, we already can never use -Wshadow regardless of things like this.
That bridge hasn't just been burned, it never existed in the first
place.

The whole '-Wshadow' thing simply cannot work with local variables in
macros - something that we've used since day 1.

Try this (as a "p.c" file):

#define min(a,b) ({ \
typeof(a) __a = (a);\
typeof(b) __b = (b);\
__a < __b ? __a : __b; })

int min3(int a, int b, int c)
{
return min(a,min(b,c));
}

and now do "gcc -O2 -S t.c".

Then try it with -Wshadow.

In other words, -Wshadow is simply not acceptable. Never has been,
never will be, and that has nothing to do with the

typeof(pos) pos

kind of thing.

Your argument just isn't an argument.

  Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Linus Torvalds
On Mon, Feb 28, 2022 at 12:29 PM Johannes Berg
 wrote:
>
> If we're willing to change the API for the macro, we could do
>
>   list_for_each_entry(type, pos, head, member)
>
> and then actually take advantage of -Wshadow?

See my reply to Willy. There is no way -Wshadow will ever happen.

I considered that (type, pos, head, member) kind of thing, to the
point of trying it for one file, but it ends up as horrendous syntax.
It turns out that declaring the type separately really helps, and
avoids crazy long lines among other things.

It would be unacceptable for another reason too - the amount of churn
would just be immense. Every single use of that macro (and related
macros) would change, even the ones that really don't need it or want
it (ie the good kinds that already only use the variable inside the
loop).

So "typeof(pos) pos" may be ugly - but it's a very localized ugly.

Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Linus Torvalds
On Mon, Feb 28, 2022 at 1:47 PM Jakob Koschel  wrote:
>
> The goal of this is to get compiler warnings right? This would indeed be 
> great.

Yes, so I don't mind having a one-time patch that has been gathered
using some automated checker tool, but I don't think that works from a
long-term maintenance perspective.

So if we have the basic rule being "don't use the loop iterator after
the loop has finished, because it can cause all kinds of subtle
issues", then in _addition_ to fixing the existing code paths that
have this issue, I really would want to (a) get a compiler warning for
future cases and (b) make it not actually _work_ for future cases.

Because otherwise it will just happen again.

> Changing the list_for_each_entry() macro first will break all of those cases
> (e.g. the ones using 'list_entry_is_head()).

So I have no problems with breaking cases that we basically already
have a patch for due to  your automated tool. There were certainly
more than a handful, but it didn't look _too_ bad to just make the
rule be "don't use the iterator after the loop".

Of course, that's just based on that patch of yours. Maybe there are a
ton of other cases that your patch didn't change, because they didn't
match your trigger case, so I may just be overly optimistic here.

But basically to _me_, the important part is that the end result is
maintainable longer-term. I'm more than happy to have a one-time patch
to fix a lot of dubious cases if we can then have clean rules going
forward.

> I assumed it is better to fix those cases first and then have a simple
> coccinelle script changing the macro + moving the iterator into the scope
> of the macro.

So that had been another plan of mine, until I actually looked at
changing the macro. In the one case I looked at, it was ugly beyond
belief.

It turns out that just syntactically, it's really nice to give the
type of the iterator from outside the way we do now. Yeah, it may be a
bit odd, and maybe it's partly because I'm so used to the
"list_for_each_list_entry()" syntax, but moving the type into the loop
construct really made it nasty - either one very complex line, or
having to split it over two lines which was even worse.

Maybe the place I looked at just happened to have a long typename, but
it's basically always going to be a struct, so it's never a _simple_
type. And it just looked very odd adn unnatural to have the type as
one of the "arguments" to that list_for_each_entry() macro.

So yes, initially my idea had been to just move the iterator entirely
inside the macro. But specifying the type got so ugly that I think
that

typeof (pos) pos

trick inside the macro really ends up giving us the best of all worlds:

 (a) let's us keep the existing syntax and code for all the nice cases
that did everything inside the loop anyway

 (b) gives us a nice warning for any normal use-after-loop case
(unless you explicitly initialized it like that
sgx_mmu_notifier_release() function did for no good reason

 (c) also guarantees that even if you don't get a warning,
non-converted (or newly written) bad code won't actually _work_

so you end up getting the new rules without any ambiguity or mistaken

> With this you are no longer able to set the 'outer' pos within the list
> iterator loop body or am I missing something?

Correct. Any assignment inside the loop will be entirely just to the
local loop case. So any "break;" out of the loop will have to set
another variable - like your updated patch did.

> I fail to see how this will make most of the changes in this
> patch obsolete (if that was the intention).

I hope my explanation above clarifies my thinking: I do not dislike
your patch, and in fact your patch is indeed required to make the new
semantics work.

What I disliked was always the maintainability of your patch - making
the rules be something that isn't actually visible in the source code,
and letting the old semantics still work as well as they ever did, and
having to basically run some verification pass to find bad users.

(I also disliked your original patch that mixed up the "CPU
speculation type safety" with the actual non-speculative problems, but
that was another issue).

Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Linus Torvalds
On Mon, Feb 28, 2022 at 3:26 PM Matthew Wilcox  wrote:
>
> #define ___PASTE(a, b)  a##b
> #define __PASTE(a, b) ___PASTE(a, b)
> #define _min(a, b, u) ({ \

Yeah, except that's ugly beyond belief, plus it's literally not what
we do in the kernel.

Really. The "-Wshadow doesn't work on the kernel" is not some new
issue, because you have to do completely insane things to the source
code to enable it.

Just compare your uglier-than-sin version to my straightforward one.
One does the usual and obvious "use a private variable to avoid the
classic multi-use of a macro argument". And the other one is an
abomination.

  Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Linus Torvalds
On Mon, Feb 28, 2022 at 4:45 PM Linus Torvalds
 wrote:
>
> Yeah, except that's ugly beyond belief, plus it's literally not what
> we do in the kernel.

(Of course, I probably shouldn't have used 'min()' as an example,
because that is actually one of the few places where we do exactly
that, using our __UNIQUE_ID() macros. Exactly because people _have_
tried to do -Wshadow when doing W=2).

 Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Linus Torvalds
On Mon, Feb 28, 2022 at 4:38 PM Segher Boessenkool
 wrote:
>
> In C its scope is the rest of the declaration and the entire loop, not
> anything after it.  This was the same in C++98 already, btw (but in
> pre-standard versions of C++ things were like you remember, yes, and it
> was painful).

Yeah, the original C++ model was just unadulterated garbage, with no
excuse for it, and the scope was not the loop, but the block the loop
existed in.

That would never have been acceptable for the kernel - it's basically
just an even uglier version of "put variable declarations in the
middle of code" (and we use "-Wdeclaration-after-statement" to
disallow that for kernel code, although apparently some of our user
space tooling code doesn't enforce or follow that rule).

The actual C99 version is the sane one which actually makes it easier
and clearer to have loop iterators that are clearly just in loop
scope.

That's a good idea in general, and I have wanted to start using that
in the kernel even aside from some of the loop construct macros.
Because putting variables in natural minimal scope is a GoodThing(tm).

Of course, we shouldn't go crazy with it. Even after we do that
-std=gnu11 thing, we'll have backports to worry about. And it's not
clear that we necessarily want to backport that gnu11 thing - since
people who run old stable kernels also may be still using those really
old compilers...

Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-01 Thread Linus Torvalds
On Tue, Mar 1, 2022 at 10:14 AM Kees Cook  wrote:
>
> The first big glitch with -Wshadow was with shadowed global variables.
> GCC 4.8 fixed that, but it still yells about shadowed functions. What
> _almost_ works is -Wshadow=local.

Heh. Yeah, I just have long memories of "-Wshadow was a disaster". You
looked into the details.

> Another way to try to catch misused shadow variables is
> -Wunused-but-set-varible, but it, too, has tons of false positives.

That on the face of it should be an easy warning to get technically
right for a compiler.

So I assume the "false positives" are simply because we end up having
various variables that really don't end up being used - and
"intentionally" so).

Or rather, they might only be used under some config option - perhaps
the use is even syntactically there and parsed, but the compiler
notices that it's turned off under some

if (IS_ENABLED(..))

option? Because yeah, we have a lot of those.

I think that's a common theme with a lot of compiler warnings: on the
face of it they sound "obviously sane" and nobody should ever write
code like that.

A conditional that is always true? Sounds idiotic, and sounds like a
reasonable thing for a compiler to warn about, since why would you
have a conditional in the first place for that?

But then you realize that maybe the conditional is a build config
option, and "always true" suddenly makes sense. Or it's a test for
something that is always true on _that_architecture_ but not in some
general sense (ie testing "sizeof()"). Or it's a purely syntactic
conditional, like "do { } while (0)".

It's why I'm often so down on a lot of the odd warnings that are
hiding under W=1 and friends. They all may make sense in the trivial
case ("That is insane") but then in the end they happen for sane code.

And yeah, -Wshadow has had tons of history with macro nesting, and
just being badly done in the first place (eg "strlen" can be a
perfectly fine local variable).

That said, maybe people could ask the gcc and clan people for a way to
_mark_ the places where we expect to validly see shadowing. For
example, that "local variable in a macro expression statement" thing
is absolutely horrendous to fix with preprocessor tricks to try to
make for unique identifiers.

But I think it would be much more syntactically reasonable to add (for
example) a "shadow" attribute to such a variable exactly to tell the
compiler "yeah, yeah, I know this identifier could shadow an outer
one" and turn it off that way.

   Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-01 Thread Linus Torvalds
On Mon, Feb 28, 2022 at 2:29 PM James Bottomley
 wrote:
>
> However, if the desire is really to poison the loop variable then we
> can do
>
> #define list_for_each_entry(pos, head, member)  \
> for (pos = list_first_entry(head, typeof(*pos), member);\
>  !list_entry_is_head(pos, head, member) && ((pos = NULL) == NULL; 
>   \
>  pos = list_next_entry(pos, member))
>
> Which would at least set pos to NULL when the loop completes.

That would actually have been excellent if we had done that
originally. It would not only avoid the stale and incorrectly typed
head entry left-over turd, it would also have made it very easy to
test for "did I find an entry in the loop".

But I don't much like it in the situation we are now.

Why? Mainly because it basically changes the semantics of the loop
_without_ any warnings about it.  And we don't actually get the
advantage of the nicer semantics, because we can't actually make code
do

list_for_each_entry(entry, ) {
..
}
if (!entry)
return -ESRCH;
.. use the entry we found ..

because that would be a disaster for back-porting, plus it would be a
flag-day issue (ie we'd have to change the semantics of the loop at
the same time we change every single user).

So instead of that simple "if (!entry)", we'd effectively have to
continue to use something that still works with the old world order
(ie that "if (list_entry_is_head())" model).

So we couldn't really take _advantage_ of the nicer semantics, and
we'd not even get a warning if somebody does it wrong - the code would
just silently do the wrong thing.

IOW: I don't think you are wrong about that patch: it would solve the
problem that Jakob wants to solve, and it would have absolutely been
much better if we had done this from the beginning. But I think that
in our current situation, it's actually a really fragile solution to
the "don't do that then" problem we have.

  Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-01 Thread Linus Torvalds
On Tue, Mar 1, 2022 at 11:06 AM Linus Torvalds
 wrote:
>
> So instead of that simple "if (!entry)", we'd effectively have to
> continue to use something that still works with the old world order
> (ie that "if (list_entry_is_head())" model).

Just to prove my point about how this is painful, that doesn't work at all.

If the loop iterator at the end is NULL (good, in theory), we can't
use "list_entry_is_head()" to check whether we ended. We'd have to use
a new thing entirely, to handle the "list_for_each_entry() has the
old/new semantics" cases.

That's largely why I was pushing for the "let's make it impossible to
use the loop iterator at all outside the loop". It avoids the
confusing case, and the patches to move to that stricter semantic can
be merged independently (and before) doing the actual semantic change.

I'm not saying my suggested approach is wonderful either. Honestly,
it's painful that we have so nasty semantics for the end-of-loop case
for list_for_each_entry().

The minimal patch would clearly be to keep those broken semantics, and
just force everybody to use the list_entry_is_head() case. That's the
"we know we messed up, we are too lazy to fix it, we'll just work
around it and people need to be careful" approach.

And laziness is a virtue. But bad semantics are bad semantics. So it's
a question of balancing those two issues.

   Linus


Re: [PATCH 6/6] treewide: remove check of list iterator against head past the loop body

2022-03-01 Thread Linus Torvalds
So looking at this patch, I really reacted to the fact that quite
often the "use outside the loop" case is all kinds of just plain
unnecessary, but _used_ to be a convenience feature.

I'll just quote the first chunk in it's entirely as an example - not
because I think this chunk is particularly important, but because it's
a good example:

On Mon, Feb 28, 2022 at 3:09 AM Jakob Koschel  wrote:
>
> diff --git a/arch/arm/mach-mmp/sram.c b/arch/arm/mach-mmp/sram.c
> index 6794e2db1ad5..fc47c107059b 100644
> --- a/arch/arm/mach-mmp/sram.c
> +++ b/arch/arm/mach-mmp/sram.c
> @@ -39,19 +39,22 @@ static LIST_HEAD(sram_bank_list);
>  struct gen_pool *sram_get_gpool(char *pool_name)
>  {
> struct sram_bank_info *info = NULL;
> +   struct sram_bank_info *tmp;
>
> if (!pool_name)
> return NULL;
>
> mutex_lock(&sram_lock);
>
> -   list_for_each_entry(info, &sram_bank_list, node)
> -   if (!strcmp(pool_name, info->pool_name))
> +   list_for_each_entry(tmp, &sram_bank_list, node)
> +   if (!strcmp(pool_name, tmp->pool_name)) {
> +   info = tmp;
> break;
> +   }
>
> mutex_unlock(&sram_lock);
>
> -   if (&info->node == &sram_bank_list)
> +   if (!info)
> return NULL;
>
> return info->gpool;

I realize this was probably at least auto-generated with coccinelle,
but maybe that script could be taught to do avoid the "use after loop"
by simply moving the code _into_ the loop.

IOW, this all would be cleaner and clear written as

if (!pool_name)
return NULL;

mutex_lock(&sram_lock);
list_for_each_entry(info, &sram_bank_list, node) {
if (!strcmp(pool_name, info->pool_name)) {
mutex_unlock(&sram_lock);
return info;
}
}
mutex_unlock(&sram_lock);
return NULL;

Ta-daa - no use outside the loop, no need for new variables, just a
simple "just do it inside the loop". Yes, we end up having that lock
thing twice, but it looks worth it from a "make the code obvious"
standpoint.

Would it be even cleaner if the locking was done in the caller, and
the loop was some simple helper function? It probably would. But that
would require a bit more smarts than probably a simple coccinelle
script would do.

Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-01 Thread Linus Torvalds
On Tue, Mar 1, 2022 at 2:58 PM David Laight  wrote:
>
> Can it be resolved by making:
> #define list_entry_is_head(pos, head, member) ((pos) == NULL)
> and double-checking that it isn't used anywhere else (except in
> the list macros themselves).

Well, yes, except for the fact that then the name is entirely misleading...

And somebody possibly uses it together with list_first_entry() etc, so
it really is completely broken to mix that change with the list
traversal change.

 Linus

   Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-01 Thread Linus Torvalds
On Tue, Mar 1, 2022 at 3:19 PM David Laight  wrote:
>
> Having said that there are so few users of list_entry_is_head()
> it is reasonable to generate two new names.

Well, the problem is that the users of list_entry_is_head() may be few
- but there are a number of _other_ ways to check "was that the HEAD
pointer", and not all of them are necessarily correct.

IOW, different places do different random tests for "did we walk the
whole loop without breaking out". And many of them happen to work. In
fact, in practice, pretty much *all* of them happen to work, and you
have to have the right struct layout and really really bad luck to hit
a case of "type confusion ended up causing the test to not work".

And *THAT* is the problem here. It's not the "there are 25ish places
that current use list_entry_is_head()".

It's the "there are ~480 places that use the type-confused HEAD entry
that has been cast to the wrong type".

And THAT is why I think we'd be better off with that bigger change
that simply means that you can't use the iterator variable at all
outside the loop, and try to make it something where the compiler can
help catch mis-uses.

Now, making the list_for_each_entry() thing force the iterator to NULL
at the end of the loop does fix the problem. The issue I have with it
is really just that you end up getting no warning at all from the
compiler if you mix old-style and new-style semantics. Now, you *will*
get an oops (if using a new-style iterator with an old-style check),
but many of these things will be in odd driver code and may happen
only for error cases.

And if you use a new-style check with an old-style iterator (ie some
backport problem), you will probably end up getting random memory
corruption, because you'll decide "it's not a HEAD entry", and then
you'll actually *use* the HEAD that has the wrong type cast associated
with it.

See what my worry is?

With the "don't use iterator outside the loop" approach, the exact
same code works in both the old world order and the new world order,
and you don't have the semantic confusion. And *if* you try to use the
iterator outside the loop, you'll _mostly_ (*) get a compiler warning
about it not being initialized.

 Linus

(*) Unless somebody initializes the iterator pointer pointlessly.
Which clearly does happen. Thus the "mostly". It's not perfect, and
that's most definitely not nice - but it should at least hopefully
make it that much harder to mess up.


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Linus Torvalds
On Wed, Mar 2, 2022 at 12:07 PM Kees Cook  wrote:
>
> I've long wanted to change kfree() to explicitly set pointers to NULL on
> free. https://github.com/KSPP/linux/issues/87

We've had this discussion with the gcc people in the past, and gcc
actually has some support for it, but it's sadly tied to the actual
function name (ie gcc has some special-casing for "free()")

See

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94527

for some of that discussion.

Oh, and I see some patch actually got merged since I looked there last
so that you can mark "deallocator" functions, but I think it's only
for the context matching, not for actually killing accesses to the
pointer afterwards.

   Linus


Re: [git pull] drm for 5.15-rc1

2021-09-01 Thread Linus Torvalds
On Mon, Aug 30, 2021 at 10:53 PM Dave Airlie  wrote:
>
> There are a bunch of conflicts with your tree, but none of them seem
> too serious, but I might have missed something.

No worries. I enjoyed seeing the AMD code-names in the conflicts, they
are using positively kernel-level naming.

That said, I wonder why AMD people can't use consistent formatting,
mixing ALL-CAPS with underscores, spaces, whatever:

/* Sienna_Cichlid */
/* Yellow Carp */
/* Navy_Flounder */
/* DIMGREY_CAVEFISH */
/* Aldebaran */
/* CYAN_SKILLFISH */
/* BEIGE_GOBY */

which shows a distinct lack of professionalism and caring in the silly naming.

 Linus


Re: [git pull] drm for 5.15-rc1

2021-09-01 Thread Linus Torvalds
On Wed, Sep 1, 2021 at 10:57 AM Linus Torvalds
 wrote:
>
> No worries. I enjoyed seeing the AMD code-names in the conflicts, they
> are using positively kernel-level naming.

Oh, I spoke too soon.

The conflict in amdgpu_ras_eeprom.c is trivial to fix up, but the
*code* is garbage.

It does this (from commit 14fb496a84f1: "drm/amdgpu: set RAS EEPROM
address from VBIOS"):

...
control->i2c_address = 0;

if (amdgpu_atomfirmware_ras_rom_addr(adev,
(uint8_t*)&control->i2c_address))
{
if (control->i2c_address == 0xA0)
control->i2c_address = 0;
...

and honestly, that just hurts to look at. It's completely wrong, even
if it happens to work on a little-endian machine.

Yes, yes, BE is irrelevant, and doubly so for an AMD GPU driver, but still.

It's assigning a 8-bit value to a 32-bit entity by doing a pointer
cast on the address, and then mixing things up by using/assigning to
that same field.

That's just *wrong* and nasty.

Oh, the resolution would be easy - just take that broken code as-is -
but I can't actually make myself do that.

So I fixed it up to not be that incredibly ugly garbage.

Please holler if I did something wrong.

 Linus


Re: [PATCH] drm/ttm: fix the type mismatch error on sparc64

2021-09-14 Thread Linus Torvalds
On Tue, Sep 14, 2021 at 12:48 PM Alex Deucher  wrote:
>
> On Tue, Sep 7, 2021 at 6:25 AM Christian König  
> wrote:
> >
> >
> > Reviewed-by: Christian König 
>
> Is one of you going to push this to drm-misc?

I was assuming it was there already.

I guess I'll just apply it directly.

 Linus


Re: [git pull] drm for 5.14-rc1

2021-09-18 Thread Linus Torvalds
On Sat, Sep 18, 2021 at 2:18 AM Michael Stapelberg
 wrote:
>
> torvalds at linux-foundation.org (Linus Torvalds) writes:
> > Did I fix it up correctly? Who knows. The code makes more sense to me
> > now and seems valid. But I really *really* want to stress how locking
> > is important.
>
> As far as I can tell, this merge conflict resolution made my Raspberry
> Pi 3 hang on boot.

Ok, that's a different merge issue than the locking one (which is
about the amd ttm code).

But the VC4 driver did have changes close to each other in the hdmi
detection and clock setting code.

And it doesn't seem to be just RPi3, there was a report back a couple
of weeks ago about RPi4 also having regressed (with an Ubuntu
install). That one was an oops in vc4_hdmi_audio_prepare(). I don't
know if that got resolved, I heard nothing about it after the report.

So there's something seriously wrong in VC4 space.

The main issue seems to be the runtime power management changes. As
far as I can tell, the commits that didn't come in through that drm
pull were these two

  9984d6664ce9 ("drm/vc4: hdmi: Make sure the controller is powered in detect")
  411efa18e4b0 ("drm/vc4: hdmi: Move the HSM clock enable to runtime_pm")

Michael - do things work if you revert those two (sadly, they don't
revert cleanly exactly _because_ of the other changes in the same
area)?

Maxime?

 Linus


Re: [git pull] drm for 5.14-rc1

2021-09-18 Thread Linus Torvalds
On Sat, Sep 18, 2021 at 1:13 PM Michael Stapelberg
 wrote:
>
> > Michael - do things work if you revert those two (sadly, they don't
> > revert cleanly exactly _because_ of the other changes in the same
> > area)?
>
> Reverting only 9984d6664ce9 is not sufficient, but reverting both
> 9984d6664ce9 and 411efa18e4b0 does indeed make my Raspberry Pi 3 boot
> again!

Since you did those reverts and fixed up the conflicts, would you mind
sending out the resulting patch so that maybe Sudip can test it too?

Maybe the RPi4 hdmi audio issues are related to the RPi4 hdmi problems
despite the symptoms apparently being rather different..

  Linus


Re: [git pull] drm for 5.14-rc1

2021-09-18 Thread Linus Torvalds
On Sat, Sep 18, 2021 at 3:00 PM Sudip Mukherjee
 wrote:
>
> Its still there. I am seeing it every night. This was from last night
> - https://lava.qa.codethink.co.uk/scheduler/job/164#L1356

Note that that web server is not available at least to me.  Looks like
some internal name or limited dns, I just get

lava.qa.codethink.co.uk: Name or service not known

so your reports aren't getting a lot of traction.

 Linus


Re: [git pull] drm for 5.14-rc1

2021-09-18 Thread Linus Torvalds
On Sat, Sep 18, 2021 at 3:48 PM Sudip Mukherjee
 wrote:
>
> Also, I have now tested by reverting those two commits and I still get
> the same trace on rpi4.

Ok. I'm afraid we really need to have the VC4 people figure it out - I
count do the two reverts that are reported to fix the RPi3 issue, but
it looks like the RPi4 pulseaudio issue needs something else.

Any chance you could bisect that?

Actually, looking at that first report of yours, the RPi4 sound
problem apparently happened in the range

9e9fb7655ed5..7c636d4d20f8

and while that range does have a drm merge from Dave Airlie, it _also_
has a sound merge from Takashi and various ARM SoC updates from Arnd.

But most (all?) of the changes in that range to the vc4 source code
came from the drm tree. And Maxime in particular. I think.

Linus


Re: [git pull] drm for 5.14-rc1

2021-09-19 Thread Linus Torvalds
On Sun, Sep 19, 2021 at 4:05 AM Sudip Mukherjee
 wrote:
>
> And indeed, reverting 27da370e0fb3 ("drm/vc4: hdmi: Remove
> drm_encoder->crtc usage") on top of d4d016caa4b8 ("alpha: move
> __udiv_qrnnd library function to arch/alpha/lib/")
> has fixed the error.

Ok, since your pulseaudio problem was reported already over two weeks
ago with no apparent movement, I've done that revert in my tree.

I reverted the two runtime PM changes that cause problems for Michael too.

  Linus


Re: [git pull] drm for 5.14-rc1

2021-09-20 Thread Linus Torvalds
On Mon, Sep 20, 2021 at 5:17 AM Maxime Ripard  wrote:
>
> I'm sorry, but I find that situation to be a bit ridiculous.

Honestly, the original report about the pulseaudio problem came in
over two weeks ago, and all you seemed to do was to ignore everything
that Sudip said and reported.

THAT is the ridiculous part.

The rules for the kernel are very clear: regressions get fixed or they
get reverted. And I saw absolutely _zero_ indication that you took
that pulseaudio oops at all seriously.

Linus


Re: [git pull] drm for 5.14-rc1

2021-09-20 Thread Linus Torvalds
On Mon, Sep 20, 2021 at 10:33 AM Maxime Ripard  wrote:
>
> What I was interested in was more about the context itself, and I'd
> still like an answer on whether it's ok to wait for a review for 5
> months though, or if it's an expectation from now on that we are
> supposed to fix bugs over the week-end.

Oh, it's definitely not "over a weekend". These reverts happened on a
Sunday just because that's when I do rc releases, and this was one of
those pending issues that had been around long enough that I went "ok,
I'm reverting now since it's been bisected and verified".

So it happened on a weekend, but that's pretty incidental.

You should not wait for 5 months to send bug-fixes. That's not the
point of review, and review shouldn't hold up reported regressions of
existing code. That's just basic _testing_ - either the fix should be
applied, or - if the fix is too invasive or too ugly - the problematic
source of the regression should be reverted.

Review should be about new code, it shouldn't be holding up "there's a
bug report, here's the obvious fix".

And for something like a NULL pointer dereference, there really should
generally be an "obvious fix".

Of course, a corollary to that "fixes are different from new
development", though, is that bug fixes need to be kept separate from
new code - just so that they _can_ be handled separately and so that
you could have sent Sudip (and Michael, although that was apparently a
very different bug, and the report came in later) a "can you test this
fix" kind of thing.

I don't know what the review issue on the vc4 drm side is, but I
suspect that the vc4 people are just perhaps not as integrated with a
lot of the other core drm people. Or maybe review of new features are
held off because there are bug reports on the old code.

   Linus


Re: Regression with mainline kernel on rpi4

2021-09-22 Thread Linus Torvalds
On Wed, Sep 22, 2021 at 3:11 AM Sudip Mukherjee
 wrote:
>
> That test script is triggering the openqa job, but its running only
> after lava is able to login. The trace is appearing before the login
> prompt even, so test_mainline.sh should not matter here.

Side note: the traces might be more legible if you have debug info in
the kernel, and run the dmesg through the script in

  scripts/decode_stacktrace.sh

which should give line numbers and inlining information.

That often makes it much easier to see which access it is that hits a
NULL pointer dereference.

On x86-64, generally just decode the instruction stream, and look at
the instruction patterns and try to figure out where an oops is coming
from, but that's much less useful on arm64 (partly because I'm not as
used to it, but because the arm64 oopses don't print out much of the
instructions so there's often little to go by).

 Linus


Re: Regression with mainline kernel on rpi4

2021-09-22 Thread Linus Torvalds
On Wed, Sep 22, 2021 at 10:02 AM Sudip Mukherjee
 wrote:
>
>
> Attached is a complete dmesg and also the decoded trace.
> This is done on 4357f03d6611 ("Merge tag 'pm-5.15-rc2' of
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm")

drivers/gpu/drm/vc4/vc4_hdmi.c:1214 is

tmp = (u64)(mode->clock * 1000) * n;

in vc4_hdmi_set_n_cts(), which has apparently been inlined from
vc4_hdmi_audio_prepare() in vc4_hdmi.c:1398.

So it looks like 'mode' is some offset off a NULL pointer.

Which looks not impossible:

  1207  struct drm_connector *connector = &vc4_hdmi->connector;
  1208  struct drm_crtc *crtc = connector->state->crtc;
  1209  const struct drm_display_mode *mode =
&crtc->state->adjusted_mode;

looks like crtc->state perhaps might be NULL.

Although it's entirely possible that it's 'crtc' itself that is NULL
or one of the earlier indirection accesses.

The exact line information from the debug info is very useful and
mostly correct, but at the same time should always be taken with a
small pinch of salt.

Compiler optimizations means that code gets munged and moved around,
and since this is the first access to 'mode', I would not be surprised
if some of the calculations and accesses to get 'mode' might be moved
around to it.

   Linus


Re: Regression with mainline kernel on rpi4

2021-09-22 Thread Linus Torvalds
On Wed, Sep 22, 2021 at 1:19 PM Sudip Mukherjee
 wrote:
>
> I added some debugs to print the addresses, and I am getting:
> [   38.813809] sudip crtc 
>
> This is from struct drm_crtc *crtc = connector->state->crtc;

Yeah, that was my personal suspicion, because while the line number
implied "crtc->state" being NULL, the drm data structure documentation
and other drivers both imply that "crtc" was the more likely one.

I suspect a simple

if (!crtc)
return;

in vc4_hdmi_set_n_cts() is at least part of the fix for this all, but
I didn't check if there is possibly something else that needs to be
done too.

Linus


Re: [BUG 5.15-rc3] kernel BUG at drivers/gpu/drm/i915/i915_sw_fence.c:245!

2021-10-02 Thread Linus Torvalds
On Sat, Oct 2, 2021 at 5:17 AM Steven Rostedt  wrote:
>
> On Sat, 2 Oct 2021 03:17:29 -0700 (PDT)
> Hugh Dickins  wrote:
>
> > Yes (though bisection doesn't work right on this one): the fix
>
> Interesting, as it appeared to be very reliable. But I didn't do the
> "try before / after" on the patch.

Well, even the before/after might well have worked, since the problem
depended on how that sw_fence_dummy_notify() function ended up
aligned. So random unrelated changes could re-align it just by
mistake.

Patch applied directly.

I'd also like to point out how that BUG_ON() actually made things
worse, and made this harder to debug. If it had been a WARN_ON_ONCE(),
this would presumably not even have needed bisecting, it would have
been obvious.

BUG_ON() really is pretty much *always* the wrong thing to do. It
onl;y results in problems being harder to see because you end up with
a dead machine and the message is often hidden.

  Linus


Re: [GIT PULL] fbdev updates for v5.17-rc1

2022-01-16 Thread Linus Torvalds
On Sun, Jan 16, 2022 at 9:32 PM Helge Deller  wrote:
>
> This pull request contains only one single initial patch which adds
> myself to the MAINTAINERS file for the FRAMBUFFER LAYER.

I'll pull this (as my test builds for other things complete), but this
is just a note to say that this pull request email was marked as spam
for me, with gmail saying something along the lines of "lots of emails
from gmx.de have been marked as spam"

I see nothing odd in the email itself, and it has proper SPF and DKIM,
but it's possible that you end up sharing a subnet (or an ISP) with
spammers...

Or maybe it was a random one-off. We'll see. I check spam filters
enough that I _usually_ tend to catch these things.

Linus


Re: [PATCH 2/2] Revert "fbcon: Disable accelerated scrolling"

2022-01-19 Thread Linus Torvalds
On Wed, Jan 19, 2022 at 2:29 PM Helge Deller  wrote:
>
> >>
> >> Ah, no, that was just the soft scrollback code I was thinking of, which
>
> Right.
> That was commit 973c096f6a85 and it was about vgacon, not fbcon.

No, fbcon had some bug too, although I've paged out the details. See
commit 50145474f6ef ("fbcon: remove soft scrollback code").

If I remember correctly (and it's entirely possible that I don't), the
whole "softback_lines" logic had serious problems with resizing the
console (or maybe changing the font size).

There may have been some other bad interaction with
foreground/background consoles too, I forget.

   Linus


Re: [git pull] drm fixes + one missed next for 5.16-rc1

2021-11-12 Thread Linus Torvalds
On Thu, Nov 11, 2021 at 7:25 PM Dave Airlie  wrote:
>
> I missed a drm-misc-next pull for the main pull last week. It wasn't
> that major and isn't the bulk of this at all. This has a bunch of
> fixes all over, a lot for amdgpu and i915.

Ugh.

The i915 conflict was trivial, but made me aware of that absolutely
disgusting "wbinvd_on_all_cpus()" hack.

And that thing is much too ugly to survive. I made my merge resolution
remove that disgusting thing.

That driver is x86-only anyway, so it all seemed completely bogus in
the first place.

And if there is some actual non-x86 work in progress for i915, then
that wbinvd_on_all_cpus() needs to be replaced with something proper
and architecture-neutral anyway, most definitely involving a name
change, and almost certainly also involving a range for the cache
writeback.

Because that "create broken macro on other architectures" thing is
*NOT* acceptable.

And I sincerely hope to the gods that no cache-incoherent i915 mess
ever makes it out of the x86 world. Incoherent IO was always a
historical mistake and should never ever happen again, so we should
not spread that horrific pattern around.

Linus


Re: regression with mainline kernel

2021-11-13 Thread Linus Torvalds
[ Hmm. This email was marked as spam for me. I see no obvious reason
for it being marked as spam, but it happens.. ]

On Thu, Nov 11, 2021 at 12:52 PM Sudip Mukherjee
 wrote:
>
> # first bad commit: [cd7f5ca33585918febe5e2f6dc090a21cfa775b0]
> drm/virtio: implement context init: add virtio_gpu_fence_event

Hmm. Judging from your automated screenshots, the login never appears.

> And, indeed reverting cd7f5ca33585 on top of debe436e77c7 has fixed
> the problem I was seeing on my qemu test of x86_64. The qemu image is
> based on Ubuntu.

Presumably either that commit is somehow buggy in itself - or it does
exactly what it means to do, and the new poll() semantics just
confuses the heck out of the X server (or wayland or whatever).

And honestly, if I read that thing correctly, the patch is entirely
broken. The new poll function (virtio_gpu_poll()) will unconditionally
remove the first event from the event list, and then report "Yeah, I
had events".

This is completely bogus for a few reasons:

 - poll() really should be idempotent, because the poll function gets
called multiple times

 - it doesn't even seem to check that the event that it removes is the
new VIRTGPU_EVENT_FENCE_SIGNALED_INTERNAL kind of event, so it will
unconditionally just remove random events.

 - it does seem to check the "vfpriv->ring_idx_mask" and do the old
thing if that is zero, but I see absolutely no reason for that (and
that check itself has caused problems, see below)

Honestly, my reaction to this all is that that commit is fundamentally
broken and probably should be reverted regardless as "this commit does
bad things".

HOWEVER - it has had a fix for a NULL pointer dereference in the
meantime - can you check whether the current top of tree happens to
work for you? Maybe your problem isn't due to "that commit does
unnatural things", but simply due to the bug fixed in d89c0c8322ec
("drm/virtio: Fix NULL dereference error in virtio_gpu_poll").

And if it's still broken with that commit, I'll happily revert it and
people need to go back to the drawing board.

In fact, I would really suggest that people look at that
virtio_gpu_poll() function regardless. That odd "let's unconditionally
just drop events in the poll function is really REALLY broken
behavior.

  Linus


Re: [git pull] drm fixes + one missed next for 5.16-rc1

2021-11-14 Thread Linus Torvalds
On Sun, Nov 14, 2021 at 1:00 PM Dave Airlie  wrote:
>
> i915 will no longer be x86-64 only in theory, since Intel now produces
> PCIe graphics cards using the same hw designs.

Well, at least in my tree, it still has the "depends on X86", along
with several other x86-only things (like "select INTEL_GTT", which is
also x86-only)

So by the time that non-x86 theory becomes reality, hopefully the i915
people will also have figured out how to do the cache flushing
properly.

And hopefully that "do it properly" ends up being simply that the
particular configuration that ends up being portable simply doesn't
need to do it at all and can statically just not build it,
sidestepping the issue entirely.

Fingers crossed.

.. of course, I'm also sure some clueless hardware engineer is still
convinced that non-coherent IO is the way to go for graphics, and that
doing cross-CPU IPIs to write back all caches is somehow still a valid
model. Because some people were still convinced about that not _that_
long ago. Hopefully reality (perhaps in the form of Apple) has caused
people to finally reconsider.

 Linus


Re: [git pull] drm for 5.11-rc1

2020-12-14 Thread Linus Torvalds
On Thu, Dec 10, 2020 at 7:52 PM Dave Airlie  wrote:
>
> This is an early pull request for drm for 5.11 merge window. I'm going
> to be out for most of the first week of the merge window so thought
> I'd just preempt things and send this early.

Ok, I've merged this, and Dave is likely gone already, but it _looks_
like there's a problem with the drm tree on my Radeon workstation.

It works fine on my i915 laptop.

The symptoms are that the boot just hangs after

   fb0: switching to amdgpudrmfb from EFI VGA

when on a working kernel the next lines are

  amdgpu :49:00.0: vgaarb: deactivate vga console
  [drm] initializing kernel modesetting (POLARIS10 0x1002:0x67DF
0x1DA2:0xE353 0xE7).
  amdgpu :49:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
  [drm] register mmio base: 0xE1C0
  [drm] register mmio size: 262144

I'm bisecting for more details, but it might take a while, and I
thought I'd start the ball rolling with this initial report in case
somebody goes "Aahh.., we know about that case already, the fix is
XYZ"...

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [git pull] drm for 5.11-rc1

2020-12-14 Thread Linus Torvalds
On Mon, Dec 14, 2020 at 2:29 PM Alex Deucher  wrote:
>
> The relevant fixes are:

Ok, I can confirm that applying those two patches gets my workstation
working properly again.

Would it be possible to get those submitted properly (or I can just
take them as-is, but would like to get a "please just pick them
directly")?

I don't like to continue to do merges during the merge windows with a
known broken base - it makes for a nightmare of bisection when you
have multiple independent problems, and I assume this hits not just me
but a lot of people with radeon/amdgpu graphics?

   Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [git pull] drm fixes for 5.11-rc1

2020-12-24 Thread Linus Torvalds
On Wed, Dec 23, 2020 at 6:29 PM Dave Airlie  wrote:
>
> Xmas eve pull request present. Just some fixes that trickled in this
> past week. Mostly amdgpu fixes, with a dma-buf/mips build fix and some
> misc komeda fixes.

Well, I already pulled and pushed out my merge, but only noticed
afterwards that clang complains about this, and I think it's a real
bug:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_mpc.c:475:6: warning:
 variable 'val' is used uninitialized whenever 'if' condition is
false [-Wsometimes-uninitialized]

and it sure is true: the code literally does

uint32_t val;

if (opp_id < MAX_OPP && REG(MUX[opp_id]))
REG_GET(MUX[opp_id], MPC_OUT_MUX, &val);

return val;

so clearly 'val' isn't initialized if that if-statement isn't true.

I assume 'opp_id' is always presumed to be valid, but that code really
is disgusting.

Just make it return 0 (or whatever) for invalid, possibly together
with a WARN_ON_ONCE(). Ok?

 Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [git pull] drm final fixes for 5.16

2022-01-07 Thread Linus Torvalds
On Thu, Jan 6, 2022 at 7:23 PM Dave Airlie  wrote:
>
> There is only the amdgpu runtime pm regression fix in here.

Thanks, from a quick test it works for me - the backlight actually
does eventually go away.

It does so only on the second time the monitors say "no signal, going
to power save", but that has been true before too.

So I think there's still some confusion in this area, but it might be
elsewhere - who knows what Wayland and friends do. At least it doesn't
look like a regression to me any more.

Thanks,
Linus


Re: [git pull] drm fixes for 5.16-rc3

2022-01-10 Thread Linus Torvalds
On Sun, Jan 9, 2022 at 11:38 PM Geert Uytterhoeven  wrote:
>
> The commit that merged this branch made a seemingly innocent change to
> the top Makefile:

"Seemingly" innocent?

Or something darker and more sinister, related to the unrelenting
slaughter of flightless fowl?

You be the judge.

 Linus


Re: [git pull] drm for 5.17-rc1 (pre-merge window pull)

2022-01-10 Thread Linus Torvalds
On Thu, Jan 6, 2022 at 10:12 PM Dave Airlie  wrote:
>
> nouveau_fence.c is the only conflict I've seen and I've taken the result from
> our rerere cache in the merge above. It's non trivial, would be good to have
> Christian confirm it as well.

Thanks, that conflict really ended up being one that I would have done
very differently without having had that pointer to your reference
merge. And I would almost certainly have messed it up in the process.

So what I did was to look at your merge resolution (or possibly
Christian's? I don't know how you guys share your trees and the origin
of that rerere), and tried to understand it, and basically recreate
it.

It's not exactly the same (different whitespace and variable
lifetimes), but I think I got the gist of it.

Thanks for the pointer, and hopefully I didn't mess it up _despite_
your merge showing me what I should aim for ;)

   Linus


Re: [git pull] drm for 5.17-rc1 (pre-merge window pull)

2022-01-10 Thread Linus Torvalds
On Thu, Jan 6, 2022 at 10:12 PM Dave Airlie  wrote:
>
>   git://anongit.freedesktop.org/drm/drm tags/drm-next-2022-01-07

Gaah. I merged things and it built cleanly, and I pushed it out.

But then I actually *booted* it, and that's not pretty.

It *works", but it's almost unusable because of random scanline
flickering.  I'm not sure how to explain it, but it's as if there
wasn't quite enough bandwidth on the scan-out, so you get these lines
of noise and/or shifted output. They are temporary - so the
framebuffer contents themselves is not damaged (although I don't know
how the compositor works - maybe the problem happens before scanout).

This is on the same Radeon device:

   49:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
[AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7)

with dual 4k monitors.

Any idea?

  Linus


Re: [git pull] drm for 5.17-rc1 (pre-merge window pull)

2022-01-10 Thread Linus Torvalds
On Mon, Jan 10, 2022 at 2:13 PM Alex Deucher  wrote:
>
> Sounds like something related to watermarks.  That said, we haven't
> really touched the display code for DCE11 cards in quite a while.  Can
> you provide your dmesg output?

I'm not seeing anything that would look interesting, but here's the
parts that look relevant for drm..

  Linus


dmesg-gpu
Description: Binary data


Re: [git pull] drm for 5.17-rc1 (pre-merge window pull)

2022-01-10 Thread Linus Torvalds
On Mon, Jan 10, 2022 at 5:11 PM Alex Deucher  wrote:
>
> We are putting together a system to try and repro the issue.  Does it
> happen with a single monitor or only with two?

Nope. With a single monitor everything seems to look fine. And when I
plug in the second monitor, it immediately starts happening again.

It also seems to depend a bit on the screen contents - or possibly on
what else is going on. Hiding the browser window makes it happen less,
I think. But I suspect that's about "less gpu activity" than anything
else.

I'll see if I can bisect it at least partially.

  Linus


Re: [git pull] drm for 5.17-rc1 (pre-merge window pull)

2022-01-10 Thread Linus Torvalds
On Mon, Jan 10, 2022 at 5:21 PM Linus Torvalds
 wrote:
>
> It also seems to depend a bit on the screen contents - or possibly on
> what else is going on. Hiding the browser window makes it happen less,
> I think. But I suspect that's about "less gpu activity" than anything
> else.

Actually, sometimes "more activity" makes it go away too. Moving a
window around wildly with the mouse makes it *stop* happen.

But moving the mouse over different elements of the screen - or
writing text in the web browser email window - seems to make it worse.

Funky.

It does "feel" to me like some bandwidth limitation, it has kind of
the same behavior that I remember from the bad old times when you were
pushing the video card past a resolution that it could really handle.

But that can't be the case, this card has had no problems with this before.

   Linus


Re: [git pull] drm for 5.17-rc1 (pre-merge window pull)

2022-01-10 Thread Linus Torvalds
On Mon, Jan 10, 2022 at 5:21 PM Linus Torvalds
 wrote:
>
> I'll see if I can bisect it at least partially.

It seems to be very reliable. I can see the flickering even at early
boot before gdb has started - the graphical screen where you type the
encrypted disk password at boot already shows it as you type.

Right now it is

  bad: 9602044d1cc12280e20c5f2cd640ae80f69e
  good: 3867e3704f136beadf5e004b61696ef7f990bee4

so it's going to be one of these:

  9602044d1cc1 drm/amd/display: Fix for the no Audio bug with Tiled Displays
  a896f870f8a5 drm/amd/display: Fix for otg synchronization logic
  aba3c3fede54 drm/amd/display: Clear DPCD lane settings after repeater training
  9311ed1e1241 drm/amd/display: add hdmi disable debug check
  6421f7c750e9 drm/amd/display: Allow DSC on supported MST branch devices
  ebe5ffd8e271 drm/amd/display: Enable P010 for DCN3x ASICs
  c022375ae095 drm/amd/display: Add DP-HDMI FRL PCON Support in DC
  50b1f44ec547 drm/amd/display: Add DP-HDMI FRL PCON SST Support in DM
  81d104f4afbf drm/amdgpu: Don't halt RLC on GFX suspend
  fe9c5c9affc9 drm/amdgpu: Use MAX_HWIP instead of HW_ID_MAX
  370016988665 drm/amdgpu: fix the missed handling for SDMA2 and SDMA3
  6c18ecefaba7 drm/amdgpu: declare static function to fix compiler warning
  94a80b5bc7a2 amdgpu/pm: Modify implmentations of
get_power_profile_mode to use amdgpu_pp_profile_name

and I guess I'll do the few more bisections to pick out the exact one.

 Linus


Re: [git pull] drm for 5.17-rc1 (pre-merge window pull)

2022-01-10 Thread Linus Torvalds
On Mon, Jan 10, 2022 at 6:22 PM Linus Torvalds
 wrote:
>
> and I guess I'll do the few more bisections to pick out the exact one.

a896f870f8a5f23ec961d16baffd3fda1f8be57c is the first bad commit.

Attaching ther BISECT_LOG in case anybody cares.

I'll double-check to see if a revert fixes it at the top of my tree.

Linus


BISECT_LOG
Description: Binary data


Re: [git pull] drm for 5.17-rc1 (pre-merge window pull)

2022-01-10 Thread Linus Torvalds
On Mon, Jan 10, 2022 at 6:44 PM Linus Torvalds
 wrote:
>
> I'll double-check to see if a revert fixes it at the top of my tree.

Yup. It reverts cleanly, and the end result builds and works fine, and
doesn't show the horrendous flickering.

I have done that revert, and will continue the merge window work.
Somebody else gets to figure out what the actual bug is, but that
commit was horribly broken on my machine (Sapphire Pulse RX 580 8GB,
fwiw).

   Linus


Re: [git pull] drm for 5.17-rc1 (pre-merge window pull)

2022-01-11 Thread Linus Torvalds
On Tue, Jan 11, 2022 at 7:38 AM Harry Wentland  wrote:
>
> Attached is a v2 of the buggy patch that should get this right.
> If you have a chance to try it out let us know

I can confirm that I do not see the horribly flickering behavior with
this patch.

I didn't look at what the actual differences were from the one I
reverted, but at least on my machine this version works.

Linus


[git pull] drm for v4.8

2016-08-01 Thread Linus Torvalds
On Mon, Aug 1, 2016 at 9:32 PM, Dave Airlie  wrote:
>
> This is the main drm pull request for 4.8, I'm down with a cold at the moment
> so hopefully this isn't in too bad a state, I finished pulling stuff last
> week mostly (nouveau fixes just went in today), so only this message should
> be influenced by illness. Apologies to anyone who's major feature I missed :-)
>
> i915:
> BXT support enabled by default
> GVT-g infrastructure
> GuC command submission and fixes
> BXT workarounds
> SKL/BKL workarounds
> Demidlayering device registration
> Thundering herd fixes
> Missing pci ids
> Atomic updates

Hmm. I did the merge and pushed it out, but testing it on my laptop
shows some very annoying flickering problem.

The screen goes dark for a very short while (one frame? Who knows?
Seems longer occasionally). I have no idea what triggers it, but it
happens quite a lot when it happens. Like once every second or two.
And it seems to happen most of the time, although right now it happens
to be behaving nicely, so sometimes it goes for a while without the
flickering.

Things *work*, but the flickering is nasty enough to make the end
result painful to use.

The only thing I see in dmesg that looks bad is

   [drm:intel_set_cpu_fifo_underrun_reporting [i915]] *ERROR*
uncleared fifo underrun on pipe A
   [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A
FIFO underrun

but I've seen that before, and it happens a couple of times during
boot. Not once per second.

This is my old Vaio 11 Pro, now running Fedora 24 (up-to-date as of today).

So it's bog-standard intel graphics (i5-4200U - Haswell ULT).

Suggestions to try?

 Linus


[git pull] drm for v4.8

2016-08-02 Thread Linus Torvalds
On Tue, Aug 2, 2016 at 4:10 AM, Ville Syrjälä
 wrote:
>
> I have a couple of pending PSR patches you may want to try as well,
> if i915.enable_psr=0 helps.

Yes. i915.enable_psr=0 seems to make the bad flickering go away.

I'll try your git trees out later, but what exactly changed with
regards to psr lately? It's set as a "dangerous" option, and even just
clearing it caused

  Setting dangerous option enable_psr - tainting kernel

which seems entirely bogus.

  Linus


[git pull] drm for v4.8

2016-08-02 Thread Linus Torvalds
On Tue, Aug 2, 2016 at 4:10 AM, Ville Syrjälä
 wrote:
>
> So PSR seems more likely. The underruns might point at some watermark
> fail though :(
>
> I have a couple of pending PSR patches you may want to try as well,
> if i915.enable_psr=0 helps.
>
> First set is here:
> git://github.com/vsyrjala/linux.git psr_setup_time_2
> This should be perfectly safe to go in actually, as it will only result
> in disabling PSR with certain panels.

This first git pull fixes it for me, as far as I can tell. I'm not
sure that the problem is 100% reproducible, but I booted into each
kernel twice, and the current git tree is broken, while with your
psr_setup_time_2 branch pulled it works.  So it does seem to be the
fix.

> The second set is here:
> git://github.com/vsyrjala/linux.git psr_fixes_2

I didn't even test that one.

Should I just pull that psr_setup_time2 branch for real? I'd like to
get a real pull request with explanations etc, but other than that it
looks good to go.

Linus


Linux 4.8-rc1 Cirrus QEMU crashes on boot.

2016-08-08 Thread Linus Torvalds
On Mon, Aug 8, 2016 at 1:24 PM, Eric W. Biederman  
wrote:
>
> Booting up v4.8-rc1 in qemu today I ran I ran into this beautiful oops.
>
> I am just about to head out the door on vacation until the end of the
> month so hopefully I am tossing out enough information to the right
> people.
>
> I have confirmed that disabling CONFIG_DRM_CIRRUS_QEMU avoids this
> crash.
>
> [0.720167] BUG: unable to handle kernel NULL pointer dereference at 
> 0018
> [0.721348] IP: [] drm_pick_crtcs+0xfc/0x270
> [0.722249] PGD 0
> [0.722659] Oops:  [#1] SMP
> [0.723141] Modules linked in:
> [0.723643] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc1+ #116
> [0.724602] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [0.725450] task: 8800bbf28000 task.stack: 8800bbf3
> [0.726327] RIP: 0010:[]  [] 
> drm_pick_crtcs+0xfc/0x270
> [0.727648] RSP: :8800bbf33978  EFLAGS: 00010217
> [0.728444] RAX: 8d623a20 RBX:  RCX: 
> 8800baf3f860
> [0.729498] RDX:  RSI: 1000 RDI: 
> 8800baf3f800
> [0.730626] RBP: 8800bbf339f8 R08:  R09: 
> 0001
> [0.731704] R10: 0001 R11:  R12: 
> 8800bbaf3af0
> [0.732827] R13: 8800bbaf3af8 R14:  R15: 
> 8800baf3fc00
> [0.733863] FS:  () GS:8800bf80() 
> knlGS:
> [0.735106] CS:  0010 DS:  ES:  CR0: 80050033
> [0.735964] CR2: 0018 CR3: 3b219000 CR4: 
> 06f0
> [0.737021] Stack:
> [0.737342]  8800bbf33998 0246 8800bbaf3b18 
> 8800bbaf3b00
> [0.738602]  1001  8800bbaf3b00 
> 8800baf3f800
> [0.739807]  00031000 8800bbaf3b18 8800bbf339e8 
> 8800baf3fc00
> [0.740993] Call Trace:
> [0.741393]  [] drm_setup_crtcs+0x3bd/0xb50
> [0.742252]  [] ? mark_held_locks+0x71/0x90
> [0.743176]  [] ? __mutex_unlock_slowpath+0xeb/0x1c0
> [0.744138]  [] drm_fb_helper_initial_config+0x81/0x3c0
> [0.745166]  [] ? drm_modeset_unlock_all+0x54/0x80
> [0.746136]  [] cirrus_fbdev_init+0xa0/0xb0
> [0.747033]  [] cirrus_modeset_init+0x18b/0x1e0
> .. snip snip ..
> [0.771023] Code: e8 ba e1 ff ff 48 8b 55 b8 48 83 f8 01 83 5d c4 ff 48 8b 
> 7d b8 48 8b 82 98 02 00 00 49 8b 57 08 48 8b 40 10 48 8b 92 00 0a 00 00 <48> 
> 83 7a 18 00 74 09 48 85 c0 0f 84 53 01 00 00 ff d0 49 89 c3

This decodes to

   0: e8 ba e1 ff ff   callq  ...
   5: 48 8b 55 b8   mov-0x48(%rbp),%rdx
   9: 48 83 f8 01   cmp$0x1,%rax
   d: 83 5d c4 ff   sbbl   $0x,-0x3c(%rbp)
  11: 48 8b 7d b8   mov-0x48(%rbp),%rdi
  15: 48 8b 82 98 02 00 00 mov0x298(%rdx),%rax
  1c: 49 8b 57 08   mov0x8(%r15),%rdx
  20: 48 8b 40 10   mov0x10(%rax),%rax
  24: 48 8b 92 00 0a 00 00 mov0xa00(%rdx),%rdx
  2b:* 48 83 7a 18 00   cmpq   $0x0,0x18(%rdx) <-- trapping instruction
  30: 74 09 je 0x3b
  32: 48 85 c0 test   %rax,%rax
  35: 0f 84 53 01 00 00 je 0x18e
  3b: ff d0 callq  *%rax
  3d: 49 89 c3 mov%rax,%r11


and trying to find matching code I think the oops is from this:

if (fb_helper->dev->mode_config.funcs->atomic_commit &&
!connector_funcs->best_encoder)
encoder = drm_atomic_helper_best_encoder(connector);
else
encoder = connector_funcs->best_encoder(connector);

in particular, the trapping instruction is the load of "atomic_commit".

(The "callq *%rax" seems to be the "connector_funcs->best_encoder()" call)

So I *think* we have "fb_helper->dev->mode_config.funcs" being NULL.

But I might have gotten the code matching wrong. I don't have the same
compiler as Eric, so even when using his config my code generation
doesn't really match. But there is only one indirect call in that
function, which is that ->best_encoder() call, so it does seem to
match up.

Adding Boris Brezillon and Daniel Vetter to the cc, since assuming I'm
right, the main suspect is commit c61b93fe51b1 ("drm/atomic: Fix
remaining places where !funcs->best_encoder is valid").

Boris? Daniel?

  Linus


Re: [git pull] drm for 5.2-rc1

2019-05-08 Thread Linus Torvalds
Dave,

On Wed, May 8, 2019 at 8:28 PM Dave Airlie  wrote:
>
> This is the main drm pull request for 5.2.

Thanks. I've merged it, but I got a couple of conflicts with fixes
(reverts) to mainline in the meantime.

The one to the i915 driver you seem to have applied again (after the
function was moved and renamed).

The one to the virtgpu driver, I really don't know if is needed any
more. I suspect I completely unnecessarily merged that
virtgpu_gem_prime_import_sg_table() function that came in because I
decided to do the merge of the revert.

It's a trivial function that just returns an error, and your tree just
leaves it as NULL, and I suspect my merge doesn't hurt, but it also
probably doesn't matter.

So you should check my merge.

Thanks,

Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [RFC][PATCH] kernel.h: Add generic roundup_64() macro

2019-05-23 Thread Linus Torvalds
On Thu, May 23, 2019 at 7:00 AM Steven Rostedt  wrote:
>
> +# define roundup_64(x, y) (\
> +{  \
> +   typeof(y) __y = y;  \
> +   typeof(x) __x = (x) + (__y - 1);\
> +   do_div(__x, __y);   \
> +   __x * __y;  \
> +}  \

The thing about this is that it absolutely sucks for power-of-two arguments.

The regular roundup() that uses division has the compiler at least
optimize them to shifts - at least for constant cases. But do_div() is
meant for "we already know it's not a power of two", and the compiler
doesn't have any understanding of the internals.

And it looks to me like the use case you want this for is very much
probably a power of two. In which case division is all kinds of just
stupid.

And we already have a power-of-two round up function that works on
u64. It's called "round_up()".

I wish we had a better visual warning about the differences between
"round_up()" (limited to powers-of-two, but efficient, and works with
any size) and "roundup()" (generic, potentially horribly slow, and
doesn't work for 64-bit on 32-bit).

Side note: "round_up()" has the problem that it uses "x" twice.

End result: somebody should look at this, but I really don't like the
"force division" case that is likely horribly slow and nasty.

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [RFC][PATCH] kernel.h: Add generic roundup_64() macro

2019-05-23 Thread Linus Torvalds
On Thu, May 23, 2019 at 8:27 AM Steven Rostedt  wrote:
>
> I haven't yet tested this, but what about something like the following:

So that at least handles the constant case that the normal "roundup()"
case also handles.

At the same time, in the case you are talking about, I really do
suspect that we have a (non-constant) power of two, and that you
should have just used "round_up()" which works fine regardless of
size, and is always efficient.

On a slight tangent.. Maybe we should have something like this:

#define size_fn(x, prefix, ...) ({  \
typeof(x) __ret;\
switch (sizeof(x)) {\
case 1: __ret = prefix##8(__VA_ARGS__); break;  \
case 2: __ret = prefix##16(__VA_ARGS__); break; \
case 4: __ret = prefix##32(__VA_ARGS__); break; \
case 8: __ret = prefix##64(__VA_ARGS__); break; \
default: __ret = prefix##bad(__VA_ARGS__);  \
} __ret; })

#define type_fn(x, prefix, ...) ({  \
typeof(x) __ret;\
if ((typeof(x))-1 > 1)  \
__ret = size_fn(x, prefix##_u, __VA_ARGS__);\
else\
__ret = size_fn(x, prefix##_s, __VA_ARGS__);\
__ret; })

which would allow typed integer functions like this. So you could do
something like

 #define round_up(x, y) size_fn(x, round_up_size, x, y)

and then you define functions for round_up_size8/16/32/64 (and you
have toi declare - but not define - round_up_sizebad()).

Of course, you probably want the usual "at least use 'int'" semantics,
in which case the "type" should be "(x)+0":

 #define round_up(x, y) size_fn((x)+0, round_up_size, x, y)

 and the 8-bit and 16-bit cases will never be used.

We have a lot of cases where we end up using "type overloading" by
size. The most explicit case is perhaps "get_user()" and "put_user()",
but this whole round_up thing is another example.

Maybe we never really care about "char" and "short", and always want
just the "int-vs-long-vs-longlong"? That would make the cases simpler
(32 and 64). And maybe we never care about sign. But we could try to
have some unified helper model like the above..

  Linus


Re: [RFC][PATCH] kernel.h: Add generic roundup_64() macro

2019-05-23 Thread Linus Torvalds
On Thu, May 23, 2019 at 10:36 AM Steven Rostedt  wrote:
>
> >
> > Of course, you probably want the usual "at least use 'int'" semantics,
> > in which case the "type" should be "(x)+0":
> >
> >  #define round_up(x, y) size_fn((x)+0, round_up_size, x, y)
> >
> >  and the 8-bit and 16-bit cases will never be used.
>
> I'm curious to what the advantage of that is?

Let's say that you have a structure with a 'unsigned char' member,
because the value range is 0-255.

What happens if you do

   x = round_up(p->member, 4);

and the value is 255?

Now, if you stay in 'unsigned char' the end result is 0. If you follow
the usual C integer promotion rules ("all arithmetic promotes to at
least 'int'"), you get 256.

Most people probably expect 256, and that implies that even if you
pass an 'unsigned char' to an arithmetic function like this, you
expect any math to be done in 'int'. Doing the "(x)+0" forces that,
because the "+0" changes the type of the expression from "unsigned
char" to "int" due to C integer promotion.

Yes. The C integer type rules are subtle and sometimes surprising. One
of the things I've wanted is to have some way to limit silent
promotion (and silent truncation!), and cause warnings. 'sparse' does
some of that with some special-case types (ie __bitwise), but it's
pretty limited.

  Linus


Re: [git pull] drm fixes for 5.0-rc3

2019-01-17 Thread Linus Torvalds
On Fri, Jan 18, 2019 at 12:43 PM Dave Airlie  wrote:
>
> Going to be at LCA next week in Christchurch, but should be fine for
> normal pulls.

.. hey, me too.

>   git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2019-01-18

Pulled,

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [git pull] drm fixes for v5.2-rc4 (v2)

2019-06-07 Thread Linus Torvalds
On Fri, Jun 7, 2019 at 12:24 AM Dave Airlie  wrote:
>
> I sent this out earlier, but I forgot the subject, and then Ben asked
> about some nouveau firmware fixes.

Well, the first one at least had the address to pull from, and the diffstat.

The second one has the subject, and mentions nouveau, but doesn't
actually have the tag name or the expected diffstat and shortlog.

Third time is the charm?

   Linus


Re: [git pull] drm fixes for v5.2-rc4 (v2)

2019-06-07 Thread Linus Torvalds
On Fri, Jun 7, 2019 at 10:20 AM Linus Torvalds
 wrote:
>
> The second one has the subject, and mentions nouveau, but doesn't
> actually have the tag name or the expected diffstat and shortlog.

Hmm. I'm guessing you meant for me to pull the

  'tags/drm-fixes-2019-06-07-1'

thing, which looks likely, but I'd like to have confirmation.

 Linus


Re: [git pull] drm fixes for 5.2-rc6

2019-06-21 Thread Linus Torvalds
On Thu, Jun 20, 2019 at 9:21 PM Dave Airlie  wrote:
>
> Just catching up on the week since back from holidays, everything
> seems quite sane.

.. well, except for anongit.freedesktop.org itself, which seems very
sick indeed.

Does it work for you? I'm getting a connection reset, no data.

   Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [git pull] drm fixes for 5.2-rc6

2019-06-21 Thread Linus Torvalds
On Fri, Jun 21, 2019 at 10:06 AM Daniel Stone  wrote:
>
> > Does it work for you? I'm getting a connection reset, no data.
>
> It is quite sick indeed; it's fallen down an NFS hole and is being
> restarted. Should be back in a few minutes.

Thanks,  everything does indeed look good again now,

 Linus


Re: [GIT PULL] Please pull hmm changes

2019-07-14 Thread Linus Torvalds
On Tue, Jul 9, 2019 at 12:24 PM Jason Gunthorpe  wrote:
>
> I'm sending it early as it is now a dependency for several patches in
> mm's quilt.

.. but I waited to merge it until I had time to review it more
closely, because I expected the review to be painful.

I'm happy to say that I was overly pessimistic, and that instead of
finding things to hate, I found it all looking good.

Particularly the whole "use reference counts properly, so that
lifetimes make sense and all those nasty cases can't happen" parts.

It's all merged, just waiting for the test-build to verify that I
didn't miss anything (well, at least nothing obvious).

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: drm pull for v5.3-rc1

2019-07-15 Thread Linus Torvalds
On Mon, Jul 15, 2019 at 12:08 AM Dave Airlie  wrote:
>
> VMware had some mm helpers go in via my tree (looking back I'm not
> sure Thomas really secured enough acks on these, but I'm going with it
> for now until I get push back).

Yeah, this is the kind of completely unacceptable stuff that I was
_afraid_ I'd get from the hmm tree, but didn't.

It's not just "mm helpers". It's core changes like changing how
do_page_mkwrite() works. With not a single ack or review from any of
the VM people.

Maybe that commit is fine. But there's a whole slew of questionable
core VM changes there, and absolutely none of them look obvious, and
none of them hack acks from any of the VM people.

The hmm tree looked like good cleanups that largely removed broken code.

This looks like it *adds* broken code, or at least adds code that had
absolutely no real review outside of vmware.

I'm not pulling this. Why did you merge it into your tree, when
apparently you were aware of how questionable it is judging by the drm
pull request.

 Linus


  1   2   3   4   5   6   7   8   >