On 7/19/25 5:10 AM, Felix Richter wrote:
Thanks for the reply.
I am aware that I can read the `edid` via sysfs from the drm device. I
did not know about `drm_info` but from a quick look at it I don't think
it provides the information I need.
The problem is not that I need more information about the attached
display. The problem is that there is not enough information about
which `i2c` device corresponds to which monitor's ddc channel. Relying on
udev hierarchies is not sufficient, because in many cases the relevant
i2c device has no parent drm output device. So when I have no
information about the i2c device I need to get more information by
reading from it. Then I know more and can map the device to the correct
display. I am happy to change the approach if there is a simpler way for
me to get this information.
❯ ls -alh /sys/class/drm/*/ddc
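The `ddc` links listed by the command above can also be walked programmatically. The following is only a minimal sketch, assuming the driver exposes a `ddc` symlink under each connector directory; the function name `map_connectors_to_i2c` and the `drm_root` parameter are made up for illustration. Connectors without such a link are simply missing from the result, which is exactly the gap described above, so this does not solve the mapping problem by itself.

```python
import os
from pathlib import Path


def map_connectors_to_i2c(drm_root="/sys/class/drm"):
    """Return {connector_name: i2c_adapter_name} for connectors with a ddc symlink.

    `drm_root` is a parameter only so the function can be pointed at a test tree.
    """
    mapping = {}
    root = Path(drm_root)
    if not root.is_dir():
        return mapping
    for connector in sorted(root.iterdir()):
        ddc = connector / "ddc"
        if ddc.is_symlink():
            # The link resolves to the I2C adapter directory, e.g. ".../i2c-5",
            # whose basename matches the /dev/i2c-N character device.
            mapping[connector.name] = os.path.basename(os.path.realpath(ddc))
    return mapping


if __name__ == "__main__":
    for name, bus in map_connectors_to_i2c().items():
        print(f"{name} -> /dev/{bus}")
```

Any `/dev/i2c-N` device that does not show up in this mapping is one of the unparented adapters that would still need another way to be identified.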
Ultimately I don't think that me accessing the bus should be the issue
here … This issue did not happen with kernel 6.6, so it definitely
qualifies as a regression. In my mind it is the job of the driver to
handle resource allocation, so if the bus is in use by somebody else it
is the kernel's job to handle who uses it. It is not the user's job to
have to worry about some sort of synchronization issue. That is the
operating system's job.
I get where you're coming from, but there are cases that are ultimately
impossible to prevent when it comes to "long" or "frequent" sequences
and responding to interrupts. The kernel has lots of examples like this:
if you break what a driver is doing with a device from a userspace
interface, you get to pick up the pieces.
I'll give you two examples:
1) You can access R/W PCI config data.
/sys/bus/pci/devices/*/config
You can break power management state machines, bus mastering, really
anything a device driver can do from a userspace application. For
example if I had a userspace app that did something like this:
dd if=/dev/zero of=/sys/bus/pci/devices/${BDF}/config bs=1 count=4096
and it broke, how can the kernel do anything about it?
2) There was a case where fwupd was doing something very similar to you
with a "probe" but with the DP aux character device. It was trying to
detect devices with updates and would fight specifically with link
training. The outcome was non-functional devices. The workaround
currently employed is that fwupd will wait a few seconds (5 or 10, I
forget) and then do the probe to avoid that fight. This doesn't solve
things though because there are pulse interrupts that could still come
at any time. The DP spec has response requirements for these.
We talked about it at the display next hackfest this year and the
decision was that the information fwupd needs should be pushed
into the kernel (let fwupd probe a sysfs file that returns cached data
the driver fetched).
People have been experiencing similar screen freezing issues randomly on
this drm issue thread:
https://gitlab.freedesktop.org/drm/amd/-/issues/4141#note_3016182
This example highlights an issue that can be triggered reliably with a
very similar effect. It may not be the same issue, but they may be related.
Yeah; I'm aware of this thread and agree it's an issue with similar
symptoms.
On 7/18/25 20:02, Mario Limonciello wrote:
At least to me, this issue sounds like a case where multiple entities
are trying to communicate with the panel at the same time.
By setting dcdebugmask=0x10 what you're essentially doing is stopping
the display hardware from trying to put the panel into PSR. So there
is "less" I2C traffic to fight with.
*Why* are you using I2C to read the EDID like this? Could you instead
use /sys/class/drm/cardX-inputY/edid? Or even better - can you use
the information from drm_info to make decisions?
I think the less I2C traffic done directly from userspace the better
when it comes to synchronization issues.
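For reference, reading the EDID through sysfs rather than over I2C could look roughly like the sketch below. It assumes each connector directory under `/sys/class/drm` has an `edid` file holding the raw EDID, and it decodes only the identity fields of the 128-byte base block (manufacturer ID, product code, serial number). The function names and the `drm_root` parameter are illustrative, not part of any existing tool.

```python
from pathlib import Path

# Fixed 8-byte pattern that starts every EDID base block.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def parse_edid_identity(edid):
    """Return (manufacturer, product_code, serial) or None if not a valid base block."""
    if len(edid) < 128 or edid[:8] != EDID_HEADER:
        return None
    # Bytes 8-9: big-endian word holding three 5-bit letters, where 1 means 'A'.
    raw = (edid[8] << 8) | edid[9]
    manufacturer = "".join(
        chr(((raw >> shift) & 0x1F) + ord("A") - 1) for shift in (10, 5, 0)
    )
    product_code = int.from_bytes(edid[10:12], "little")  # bytes 10-11
    serial = int.from_bytes(edid[12:16], "little")        # bytes 12-15
    return manufacturer, product_code, serial


def read_connector_identities(drm_root="/sys/class/drm"):
    """Return {connector_name: identity} for connectors with a readable, valid EDID."""
    identities = {}
    root = Path(drm_root)
    if not root.is_dir():
        return identities
    for connector in sorted(root.iterdir()):
        try:
            data = (connector / "edid").read_bytes()
        except OSError:
            continue  # no edid file here, or it could not be read
        identity = parse_edid_identity(data)
        if identity is not None:
            identities[connector.name] = identity
    return identities
```

A disconnected output typically yields an empty `edid` file, which `parse_edid_identity` rejects, so only connectors with an attached display end up in the result.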