Hi, On Wed, Sep 27, 2023 at 4:05 AM Christian König <christian.koe...@amd.com> wrote: > I'm not an expert for that stuff, but as far as I know the whole purpose > of the blocking functionality is to make sure that the CPU overhead > caused by the commit is accounted to the right process. I'm not an expert either, but that's clearly wrong.
The point of blocking functionality in drmAtomicCommit is for the ioctl to block until the operation is completed, so userspace knows that they can commit again after it returns without getting EBUSY, and they can be sure, e.g., the mode is set or the displays are disabled or whatever. To say the "whole point" is about CPU overhead accounting sounds rather absurd to me. Is that really what you meant? You could try to make the argument that one of the points of the blocking call is about CPU accounting (instead of the whole point), maybe that's what you meant? That seems likely wrong to me too. I mean just check the docs: RLIMIT_RTTIME (since Linux 2.6.25) This is a limit (in microseconds) on the amount of CPU time that a process scheduled under a real‐time scheduling policy may consume without making a blocking system call. For the purpose of this limit, each time a process makes a blocking system call, the count of its consumed CPU time is reset to zero. drmAtomicCommit() is making a blocking system call so the limit should be reset, in my opinion. Furthermore, a lot happens under the covers as part of drmAtomicCommit. Are you telling me that in all the drivers handling drmAtomicCommit, none of the work from those drivers gets deferred outside of process context if DRM_MODE_ATOMIC_NONBLOCK isn't set? I find that very hard to believe. But it would have to be true, if one of the main points of the blocking call is about CPU accounting, right? You can't say the point is about CPU accounting if some of the work is handled in a helper thread or work queue or whatever. ╎❯ git grep -E 'INIT_WORK|DECLARE_TASKLET|open_softirq|timer_setup|kthread_run' | wc -l 182 There seems to be a lot of non-process context execution going on in the tree... > So what you are suggesting here is to actually break that functionality > and that doesn't seem to make sense. What's the use case here that could break? Walk me through a real-world, sensible example where a program depends on drmAtomicCommit() blocking and continuing to eat into process cpu time while it blocks. I just want to see where you are coming from. Maybe I'm being unimaginative but I just don't see it. I do see actual main-stream display servers breaking today because the current behavior defies expectations. > When it's really not desirable to account the CPU overhead to the > process initiating it then you probably rather want to use an non > blocking commit plus a dma_fence to wait for the work to end from userspace. Well, first I don't think that's very convenient. You're talking about a per-plane property, so there would need to be a separate file descriptor allocated for every plane, right? and user-space would have to block on all of them before proceeding? Also, are you sure that works in all cases? The problematic case we're facing right now is when all planes and all crtcs are getting disabled. I'm just skimming over the code here, but I see this: → for_each_new_crtc_in_state(state, crtc, crtc_state, i) {• ... → → if (arg->flags & DRM_MODE_PAGE_FLIP_EVENT || fence_ptr) {• ... → → → e = create_vblank_event(crtc, arg->user_data);• ... → → → crtc_state->event = e;• → → }• ... → → if (fence_ptr) {• ... → → → fence = drm_crtc_create_fence(crtc);• ... → → → ret = setup_out_fence(&f[(*num_fences)++], fence);• ... → → → crtc_state->event->base.fence = fence;• → → }• Is the code really going to allocate a vblank_event when all the crtc's are disabled ? I guess it's possible, but that's counterintuitive to me. I really don't know. You're confident a set of fences will actually work? Regardless, this seems like a roundabout way to address a problem that we could just ... fix. --Ray