Re: [Intel-gfx] [PATCH v6 08/10] dt-bindings: msm/dp: Add bindings for HDCP registers

2023-01-18 Thread Johan Hovold
On Wed, Jan 18, 2023 at 07:30:13PM +, Mark Yacoub wrote:
> From: Sean Paul 
> 
> This patch adds the bindings for the MSM DisplayPort HDCP registers
> which are required to write the HDCP key into the display controller as
> well as the registers to enable HDCP authentication/key
> exchange/encryption.
> 
> We'll use a new compatible string for this since the fields are optional.
> 
> Cc: Rob Herring 
> Cc: Stephen Boyd 
> Reviewed-by: Rob Herring 
> Signed-off-by: Sean Paul 
> Signed-off-by: Mark Yacoub 

Just a drive-by comment: Your mail address is missing an 'm' here.

Perhaps check the rest of the series as well (checkpatch should catch
this).

Johan


[Intel-gfx] NULL derefs after failed suspend (i915, pm, ext4, slub)

2014-10-28 Thread Johan Hovold
Hi, 

I have had some problems with crashes involving suspend-to-disk after
updating to v3.16.

Below is a log with 3.16.6 from a failed suspend attempt after which I
get a NULL deref in ext4 code.

A couple of weeks ago I got something similar, with backtraces from 
ext4 (ext4_alloc_inode) and NULL-derefs in vfs (vfs_get_attr_nosec) when
trying to do IO after resuming from suspend. That was with 3.16.3 and I
was hoping that whatever it was would have been fixed in 3.16.6 (there
were some ext4 error handling patches in there). I only got photos of
those oopses but it involved kmem_cache_alloc (slub) and a NULL-deref in
vfs_get_attr_nosec. I can put the photos up somewhere. That time I also
got back to X and could issue a dmesg in an xterm, but any process trying
to do IO died.

Something similar happened with 3.16.1 but unfortunately I do not have
any logs from that.

I also have experienced occasional hangs during suspend, but I believe I
have seen this with older kernels as well so not sure if related. Seems
to be more frequent with 3.16.

This is my main machine so not keen on trying to bisect this on it.

It's an i7-4770 on an Intel DH87MC using the integrated HD Graphics 4600.

I'm CCing the Intel graphics guys due to some errors drm errors in the
logs, and reports of other people having problems involving suspend and
this driver.

Note that there was nothing obviously i915 related in the 3.16.3 oops
but I see now that it occurred soon after

video LNXVIDEO:00: Restoring backlight state

just like it does below.

[137452.181187] [drm:intel_dp_start_link_train] *ERROR* failed to enable link 
training
[137452.183960] [drm:intel_dp_complete_link_train] *ERROR* failed to start 
channel equalization
[138715.329101] [drm:intel_dp_start_link_train] *ERROR* failed to enable link 
training
[138715.331798] [drm:intel_dp_complete_link_train] *ERROR* failed to start 
channel equalization

I get the above occasionally when coming out of dpms-off and my
secondary monitor fails to power up. But here's the suspend attempt:

[148288.654477] s2disk.sh (24636): drop_caches: 3
[148288.654667] PM: Syncing filesystems ... done.
[148288.836275] Freezing user space processes ... (elapsed 0.000 seconds) done.
[148288.837641] PM: Preallocating image memory... done (allocated 1668895 pages)
[148289.299805] PM: Allocated 6675580 kbytes in 0.46 seconds (14512.13 MB/s)
[148289.299806] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) 
done.
[148289.301127] Suspending console(s) (use no_console_suspend to debug)
[148289.331656] PM: freeze of devices complete after 30.513 msecs
[148289.331925] PM: late freeze of devices complete after 0.267 msecs
[148289.332470] PM: noirq freeze of devices complete after 0.544 msecs
[148289.332471] Disabling non-boot CPUs ...
[148289.332488] intel_pstate CPU 1 exiting
[148289.333592] kvm: disabling virtualization on CPU1
[148289.333598] smpboot: CPU 1 is now offline
[148289.333979] intel_pstate CPU 2 exiting
[148289.335092] kvm: disabling virtualization on CPU2
[148289.335099] smpboot: CPU 2 is now offline
[148289.335370] intel_pstate CPU 3 exiting
[148289.336487] kvm: disabling virtualization on CPU3
[148289.336492] smpboot: CPU 3 is now offline
[148289.336757] intel_pstate CPU 4 exiting
[148289.337871] kvm: disabling virtualization on CPU4
[148289.338128] smpboot: CPU 4 is now offline
[148289.338344] intel_pstate CPU 5 exiting
[148289.339436] kvm: disabling virtualization on CPU5
[148289.339441] smpboot: CPU 5 is now offline
[148289.339608] intel_pstate CPU 6 exiting
[148289.340828] kvm: disabling virtualization on CPU6
[148289.340853] smpboot: CPU 6 is now offline
[148289.341407] intel_pstate CPU 7 exiting
[148289.342605] kvm: disabling virtualization on CPU7
[148289.342651] smpboot: CPU 7 is now offline
[148289.342943] PM: Creating hibernation image:
[148289.644672] PM: Need to copy 1665588 pages

Suspend has failed, resuming:

[148289.343968] Enabling non-boot CPUs ...
[148289.343991] x86: Booting SMP configuration:
[148289.343992] smpboot: Booting Node 0 Processor 1 APIC 0x2
[148289.355895] kvm: enabling virtualization on CPU1
[148289.357989] Intel pstate controlling: cpu 1
[148289.358014] CPU1 is up
[148289.358025] smpboot: Booting Node 0 Processor 2 APIC 0x4
[148289.369916] kvm: enabling virtualization on CPU2
[148289.371995] Intel pstate controlling: cpu 2
[148289.372016] CPU2 is up
[148289.372025] smpboot: Booting Node 0 Processor 3 APIC 0x6
[148289.383905] kvm: enabling virtualization on CPU3
[148289.385993] Intel pstate controlling: cpu 3
[148289.386013] CPU3 is up
[148289.386023] smpboot: Booting Node 0 Processor 4 APIC 0x1
[148289.397882] kvm: enabling virtualization on CPU4
[148289.39] Intel pstate controlling: cpu 4
[148289.400021] CPU4 is up
[148289.400031] smpboot: Booting Node 0 Processor 5 APIC 0x3
[148289.411884] kvm: enabling virtualization on CPU5
[148289.414003] Intel pstate controlling: cpu 5
[148289.414025] CPU5 is up
[148289.414034] smpboot: Bootin

Re: [Intel-gfx] NULL derefs after failed suspend (i915, pm, ext4, slub)

2014-10-29 Thread Johan Hovold
On Tue, Oct 28, 2014 at 05:06:01PM +0200, Jani Nikula wrote:
> On Tue, 28 Oct 2014, Johan Hovold  wrote:
> > Hi, 
> >
> > I have had some problems with crashes involving suspend-to-disk after
> > updating to v3.16.
> >
> > Below is a log with 3.16.6 from a failed suspend attempt after which I
> > get a NULL deref in ext4 code.
> >
> > A couple of weeks ago I got something similar, with backtraces from 
> > ext4 (ext4_alloc_inode) and NULL-derefs in vfs (vfs_get_attr_nosec) when
> > trying to do IO after resuming from suspend. That was with 3.16.3 and I
> > was hoping that whatever it was would have been fixed in 3.16.6 (there
> > were some ext4 error handling patches in there). I only got photos of
> > those oopses but it involved kmem_cache_alloc (slub) and a NULL-deref in
> > vfs_get_attr_nosec. I can put the photos up somewhere. That time I also
> > got back to X and could issue a dmesg in an xterm, but any process trying
> > to do IO died.
> >
> > Something similar happened with 3.16.1 but unfortunately I do not have
> > any logs from that.
> >
> > I also have experienced occasional hangs during suspend, but I believe I
> > have seen this with older kernels as well so not sure if related. Seems
> > to be more frequent with 3.16.
> >
> > This is my main machine so not keen on trying to bisect this on it.
> >
> > It's an i7-4770 on an Intel DH87MC using the integrated HD Graphics 4600.
> >
> > I'm CCing the Intel graphics guys due to some errors drm errors in the
> > logs, and reports of other people having problems involving suspend and
> > this driver.
> 
> My first suggestion would be to try to reproduce the NULL deref without
> i915 loaded, and track the issues you have independently.

I actually don't think this is i915 related, the new drm errors after
failed suspend could possibly just be a side effect of whatever is
causing the apparent memory corruption. As I mentioned, the first log I
have of this do not seem to point at i915 (even if backlight-restore
happens when tasks are restarted).

> Please file any i915 issues against DRM/Intel at [1].

I'll see if I can get around to that. There are bug reports in various
distro tracker about the intel_ddi_pll_enable warning dating back to
April.

It's there on every resume. For instance this morning:

[108109.324398] WARNING: CPU: 1 PID: 7298 at 
/home/johan/src/linux/linux-xi/drivers/gpu/drm/i915/intel_ddi.c:911 
intel_ddi_pll_enable+0x233/0x240()
[108109.324398] WRPLL1 already enabled
[108109.324399] Modules linked in:
[108109.324400] CPU: 1 PID: 7298 Comm: kworker/u16:8 Tainted: GW 
3.16.6 #1
[108109.324401] Hardware name:  /DH87MC, BIOS 
MCH8710H.86A.0154.2014.0123.1542 01/23/2014
[108109.324403] Workqueue: events_unbound async_run_entry_fn
[108109.324405]   0009 81739c03 
88053e89baf8
[108109.324405]  810850f6 8807fadf b035061f 
0001
[108109.324406]  00046040 81a10a41 810851d5 
81a10a83
[108109.324407] Call Trace:
[108109.324410]  [] ? dump_stack+0x49/0x6a
[108109.324412]  [] ? warn_slowpath_common+0x86/0xb0
[108109.324414]  [] ? warn_slowpath_fmt+0x45/0x50
[108109.324415]  [] ? intel_ddi_pll_enable+0x233/0x240
[108109.324417]  [] ? haswell_crtc_mode_set+0x1a/0x30
[108109.324419]  [] ? __intel_set_mode+0x6a8/0x1590
[108109.324420]  [] ? intel_modeset_setup_hw_state+0x817/0xd10
[108109.324422]  [] ? drm_modeset_lock_all_crtcs+0x39/0x50
[108109.324424]  [] ? pci_pm_suspend_noirq+0x1b0/0x1b0
[108109.324426]  [] ? __i915_drm_thaw+0x11e/0x1a0
[108109.324426]  [] ? i915_resume+0x1f/0x40
[108109.324428]  [] ? dpm_run_callback+0x4f/0x150
[108109.324428]  [] ? device_resume+0x93/0x1d0
[108109.324429]  [] ? async_resume+0x14/0x40
[108109.324430]  [] ? async_run_entry_fn+0x2d/0x120
[108109.324433]  [] ? process_one_work+0x158/0x410
[108109.324434]  [] ? worker_thread+0x116/0x510
[108109.324435]  [] ? __wake_up_common+0x4c/0x80
[108109.324436]  [] ? init_pwq+0x160/0x160
[108109.324437]  [] ? kthread+0xbc/0xe0
[108109.324439]  [] ? workqueue_sysfs_register+0x110/0x150
[108109.324440]  [] ? kthread_freezable_should_stop+0x60/0x60
[108109.324442]  [] ? ret_from_fork+0x7c/0xb0
[108109.324443]  [] ? kthread_freezable_should_stop+0x60/0x60

Thanks,
Johan
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] NULL derefs after failed suspend (i915, pm, ext4, slub)

2014-11-19 Thread Johan Hovold
On Tue, Oct 28, 2014 at 03:29:10PM +0100, Johan Hovold wrote:
> Hi, 
> 
> I have had some problems with crashes involving suspend-to-disk after
> updating to v3.16.
> 
> Below is a log with 3.16.6 from a failed suspend attempt after which I
> get a NULL deref in ext4 code.
> 
> A couple of weeks ago I got something similar, with backtraces from 
> ext4 (ext4_alloc_inode) and NULL-derefs in vfs (vfs_get_attr_nosec) when
> trying to do IO after resuming from suspend. That was with 3.16.3 and I
> was hoping that whatever it was would have been fixed in 3.16.6 (there
> were some ext4 error handling patches in there). I only got photos of
> those oopses but it involved kmem_cache_alloc (slub) and a NULL-deref in
> vfs_get_attr_nosec. I can put the photos up somewhere. That time I also
> got back to X and could issue a dmesg in an xterm, but any process trying
> to do IO died.
> 
> Something similar happened with 3.16.1 but unfortunately I do not have
> any logs from that.

For the record, it seems this problem went away with 3.17 (.2 and .3).
Been suspending daily now for almost three weeks without triggering this
again.

Since 3.16 is EOL (and 3.15 did not suffer from this) I guess there's no
need to track it down for stable.

Johan
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [PATCH v3 1/4] drm/dp: Add helper to set LTTPRs in transparent mode

2025-01-07 Thread Johan Hovold
On Fri, Jan 03, 2025 at 02:58:15PM +0200, Abel Vesa wrote:
> According to the DisplayPort standard, LTTPRs have two operating
> modes:
>  - non-transparent - it replies to DPCD LTTPR field specific AUX
>requests, while passes through all other AUX requests
>  - transparent - it passes through all AUX requests.
> 
> Switching between this two modes is done by the DPTX by issuing
> an AUX write to the DPCD PHY_REPEATER_MODE register.
> 
> Add a generic helper that allows switching between these modes.
> 
> Also add a generic wrapper for the helper that handles the explicit
> disabling of non-transparent mode and its disable->enable sequence
> mentioned in the DP Standard v2.0 section 3.6.6.1. Do this in order
> to move this handling out of the vendor specific driver implementation
> into the generic framework.
> 
> Tested-by: Johan Hovold 
> Signed-off-by: Abel Vesa 

> +/**
> + * drm_dp_lttpr_init - init LTTPR transparency mode according to DP standard
> + *
> + * @aux: DisplayPort AUX channel
> + * @lttpr_count: Number of LTTPRs. Between 0 and 8, according to DP standard.
> + *   Negative error code for any non-valid number.
> + *   See drm_dp_lttpr_count().
> + *
> + * Returns 0 on success or a negative error code on failure.
> + */
> +int drm_dp_lttpr_init(struct drm_dp_aux *aux, int lttpr_count)
> +{
> + int ret;
> +
> + if (!lttpr_count)
> + return 0;
> +
> + /*
> +  * See DP Standard v2.0 3.6.6.1 about the explicit disabling of
> +  * non-transparent mode and the disable->enable non-transparent mode
> +  * sequence.
> +  */
> + ret = drm_dp_lttpr_set_transparent_mode(aux, true);
> + if (ret)
> + return ret;
> +
> + if (lttpr_count < 0)
> + return -ENODEV;
> +
> + /*
> +  * Roll-back to tranparent mode if setting non-tranparent mode failed

typo: transparent (2x)

> +  */

I think this comment now needs to go inside the conditional, if you want
to keep it at all.

> + if (drm_dp_lttpr_set_transparent_mode(aux, false)) {
> + drm_dp_lttpr_set_transparent_mode(aux, true);
> +     return -EINVAL;
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(drm_dp_lttpr_init);

This looks much better to me now, so with the above addressed: 

Reviewed-by: Johan Hovold 


Re: [PATCH v3 4/4] drm/msm/dp: Add support for LTTPR handling

2025-01-07 Thread Johan Hovold
On Fri, Jan 03, 2025 at 02:58:18PM +0200, Abel Vesa wrote:
> Link Training Tunable PHY Repeaters (LTTPRs) are defined in DisplayPort
> 1.4a specification. As the name suggests, these PHY repeaters are
> capable of adjusting their output for link training purposes.
> 
> According to the DisplayPort standard, LTTPRs have two operating
> modes:
>  - non-transparent - it replies to DPCD LTTPR field specific AUX
>requests, while passes through all other AUX requests
>  - transparent - it passes through all AUX requests.
> 
> Switching between this two modes is done by the DPTX by issuing
> an AUX write to the DPCD PHY_REPEATER_MODE register.
> 
> The msm DP driver is currently lacking any handling of LTTPRs.
> This means that if at least one LTTPR is found between DPTX and DPRX,
> the link training would fail if that LTTPR was not already configured
> in transparent mode.
> 
> The section 3.6.6.1 from the DisplayPort v2.0 specification mandates
> that before link training with the LTTPR is started, the DPTX may place
> the LTTPR in non-transparent mode by first switching to transparent mode
> and then to non-transparent mode. This operation seems to be needed only
> on first link training and doesn't need to be done again until device is
> unplugged.
> 
> It has been observed on a few X Elite-based platforms which have
> such LTTPRs in their board design that the DPTX needs to follow the
> procedure described above in order for the link training to be successful.
> 
> So add support for reading the LTTPR DPCD caps to figure out the number
> of such LTTPRs first. Then, for platforms (or Type-C dongles) that have
> at least one such an LTTPR, set its operation mode to transparent mode
> first and then to non-transparent, just like the mentioned section of
> the specification mandates.
> 
> Tested-by: Johan Hovold 
> Signed-off-by: Abel Vesa 

Reviewed-by: Johan Hovold 


Re: [PATCH v2 1/4] drm/dp: Add helper to set LTTPRs in transparent mode

2024-12-11 Thread Johan Hovold
On Wed, Dec 11, 2024 at 03:04:12PM +0200, Abel Vesa wrote:
 
> +/**
> + * drm_dp_lttpr_set_transparent_mode - set the LTTPR in transparent mode
> + * @aux: DisplayPort AUX channel
> + * @enable: Enable or disable transparent mode
> + *
> + * Returns 0 on success or a negative error code on failure.
> + */
> +int drm_dp_lttpr_set_transparent_mode(struct drm_dp_aux *aux, bool enable)
> +{
> + u8 val = enable ? DP_PHY_REPEATER_MODE_TRANSPARENT :
> +   DP_PHY_REPEATER_MODE_NON_TRANSPARENT;
> + int ret = drm_dp_dpcd_writeb(aux, DP_PHY_REPEATER_MODE, val);
> +
> + return ret == 1 ? 0 : ret;

This looks correct, but I had to go look at drm_dp_dpcd_writeb() to make
sure it never returns 0 (for short transfers).

> +}
> +EXPORT_SYMBOL(drm_dp_lttpr_set_transparent_mode);

This appears to be what the driver currently uses, but why not
EXPORT_SYMBOL_GPL?

> +
> +/**
> + * drm_dp_lttpr_init - init LTTPR transparency mode according to DP standard
> + *
> + * @aux: DisplayPort AUX channel
> + * @lttpr_count: Number of LTTPRs
> + *
> + * Returns 0 on success or a negative error code on failure.
> + */
> +int drm_dp_lttpr_init(struct drm_dp_aux *aux, int lttpr_count)
> +{
> + if (!lttpr_count)
> + return 0;
> +
> + /*
> +  * See DP Standard v2.0 3.6.6.1 about the explicit disabling of
> +  * non-transparent mode and the disable->enable non-transparent mode
> +  * sequence.
> +  */
> + drm_dp_lttpr_set_transparent_mode(aux, true);

Error handling?

> +
> + if (lttpr_count > 0 && !drm_dp_lttpr_set_transparent_mode(aux, false))

No need to check lttpr_count again here.

> + return 0;

I'd check for errors instead of success here and do the rollback before
returning -EINVAL.

> +
> + /*
> +  * Roll-back to tranparent mode if setting non-tranparent mode failed or
> +  * the number of LTTPRs is invalid
> +  */
> + drm_dp_lttpr_set_transparent_mode(aux, true);
> +
> + return -EINVAL;

And return 0 explicitly here.

> +}
> +EXPORT_SYMBOL(drm_dp_lttpr_init);

In any case this works well and is needed for external display on the
Lenovo ThinkPad T14s, while not breaking the X13s which does not need
it:

Tested-by: Johan Hovold 

Johan


Re: [PATCH v2 4/4] drm/msm/dp: Add support for LTTPR handling

2024-12-11 Thread Johan Hovold
On Wed, Dec 11, 2024 at 03:04:15PM +0200, Abel Vesa wrote:
 
> +static void msm_dp_display_lttpr_init(struct msm_dp_display_private *dp)
> +{
> + int lttpr_count;
> +
> + if (drm_dp_read_lttpr_common_caps(dp->aux, dp->panel->dpcd,
> +   dp->lttpr_caps))
> + return;
> +
> + lttpr_count = drm_dp_lttpr_count(dp->lttpr_caps);

I was gonna say shouldn't you handle errors here, but that explains the
non-negative check I commented on the first patch in the series.

This looks error prone, but I think you should at least update the
kernel doc comment to drm_dp_lttpr_init() in the first patch so that
it's clear that you pass in the number of LTTPRs *or* an errno.

> +
> + drm_dp_lttpr_init(dp->aux, lttpr_count);
> +}
> +
>  static int msm_dp_display_process_hpd_high(struct msm_dp_display_private *dp)
>  {
>   struct drm_connector *connector = dp->msm_dp_display.connector;
>   const struct drm_display_info *info = &connector->display_info;
>   int rc = 0;
>  
> + msm_dp_display_lttpr_init(dp);

It looks like you ignore errors on purpose so I guess that's fine.

> +
>   rc = msm_dp_panel_read_sink_caps(dp->panel, connector);
>   if (rc)
>   goto end;

Either way, this is needed for external display on my x1e80100 machines,
while not breaking the X13s:

Tested-by: Johan Hovold 

Johan


Re: [PATCH v2 1/4] drm/dp: Add helper to set LTTPRs in transparent mode

2024-12-12 Thread Johan Hovold
On Thu, Dec 12, 2024 at 09:31:09AM +0100, neil.armstr...@linaro.org wrote:
> On 11/12/2024 15:42, Johan Hovold wrote:

> >> +EXPORT_SYMBOL(drm_dp_lttpr_set_transparent_mode);
> > 
> > This appears to be what the driver currently uses, but why not
> > EXPORT_SYMBOL_GPL?
> 
> drivers/gpu/drm/display/drm_dp_helper.c is not GPL licenced, so
> this is the right macro to use.

Wow. I should have checked the header. Thanks for explaining.

Johan