"Fixes" for page flipping under PRIME on AMD & nouveau
On 08/17/2016 07:02 PM, Christian König wrote: > Am 17.08.2016 um 18:35 schrieb Mario Kleiner: >> On 08/17/2016 06:27 PM, Christian König wrote: AMD uses copy swaps because radeon/amdgpu kms can't switch the scanout mode from tiled to linear on the fly during flips. >>> Well I'm not an expert on this, but as far as I know the bigger problem >>> is that the dedicated AMD hardware generations you are targeting usually >>> can't reliable scanout from system memory without a rather complicated >>> setup. >>> >>> So that is a complete NAK to the radeon changes. >> >> Hi Christian, >> >> thanks for the feedback, but i think that's a misunderstanding. The >> patches don't make them scanout from system memory, they just enforce >> a fresh copy from RAM/GTT -> VRAM before scanning out a buffer again. >> I just assume there is a more elegant/clean way than this "fake" >> pin/unpin to GTT to essentially tell the driver that its current VRAM >> content is stale and needs a refresh from the up to date dmabuf in >> system RAM. > > I was already wondering how the heck you got that working. > > What do you mean with a fresh copy from GTT to VRAM? A buffer exported > by DMA-buf should never move as long as it is exported, same for a > buffer pinned to VRAM. > Under DRI3/Present, the way it is currently implemented in the X-Server and Mesa, the display gpu (= normally integrated one) is importing the dma-buf that was exported by the render offload gpu. So the actual dmabuf doesn't move, but just stays where it is in system RAM. Afaiu the prime importing display gpu generates its own gem buffer handle (prime_fd_to_handle) from that dmabuf, importing scather-gather tables to access the dmabuf in system ram. As far as page flipping is concerned, so far those gem buffers / radeon_bo's aren't treated any different than native ones. During pageflip setup they get pinned into VRAM, which moves (=copies) their content from the RAM dmabuf backing store into VRAM. Then they get flipped and scanned out as usual. The disconnect happens when such a buffer gets flipped off the scanout (and unpinned) and later on page-flipped to the scanout again. Now the driver just reuses the bo that still likely resides in VRAM (although not pinned anymore) and forgets that it was associated with some dmabuf backing in RAM which may have updated visual content. So the exporting renderoffload gpu happily renders new frames into the dmabuf in ram, while radeon kms happily displays stale frames from its own copy in VRAM. > So using a DMA-buf for scanout is impossible and actually not valuable > cause is shouldn't matter if we copy from GTT to VRAM because of a > buffer migration or because of a copy triggered by the DDX. > > What are you actually trying to do here? > Make a typical Enduro laptop with an AMD iGPU + AMD dGPU work under DRI3/Present, without tearing and other ugliness, e.g., DRI_PRIME=1 glxgears -fullscreen -> discrete gpu renders, integrated gpu displays the rendered frames. Currently the drivers use copies for handling the PresentPixmap requests, which sort of works in showing the right pictures, but gives bad tearing and undefined timing. With copies we are too slow to keep ahead of the scanout and Present doesn't even guarantee that the copy starts vsync'ed. So at all levels, from delays in the x-server, mesa's way of doing things, commmand submission and the hw itself we end up blitting in the middle of scanout. And the presentation timing isn't ever trustworthy for timing sensitive applications unless we present via page flipping. The hack in my patch tricks the driver into migrating the bo back to GTT (skipping the actual pointless data copy though) and then back into VRAM to force a copy of fresh content from the imported dmabuf into VRAM, so page flipping flips up to date content into the scanout. -mario > Regards, > Christian. > >> >> Btw. i'll be offline for the next few hours, just wanted to get this >> out now. >> >> thanks, >> -mario >> >>> >>> Regards, >>> Christian. >>> >>> Am 17.08.2016 um 18:12 schrieb Mario Kleiner: Hi, i spent some time playing with DRI3/Present + PRIME for testing how well it works for Optimus/Enduro style setups wrt. page flipping on the current kernel/mesa/xorg. I want page flipping, because neuroscience/medical applications need the reliable timing/timestamping and tear free presentation we currently only can get via page flipping, but not the copyswap path. Intel as display gpu + nouveau for render offload worked nicely on intel-ddx with page flipping, proper timing, dmabuf fence sync and all. AMD uses copy swaps because radeon/amdgpu kms can't switch the scanout mode from tiled to linear on the fly during flips. That's a todo in itself. For the moment i used the ati-ddx with Option "ColorTiling/ColorTiling2D" "off" to force my pair of old Radeon HD-5770's into l
"Fixes" for page flipping under PRIME on AMD & nouveau
On 08/17/2016 07:43 PM, Alex Deucher wrote: > On Wed, Aug 17, 2016 at 12:35 PM, Mario Kleiner > wrote: >> On 08/17/2016 06:27 PM, Christian König wrote: AMD uses copy swaps because radeon/amdgpu kms can't switch the scanout mode from tiled to linear on the fly during flips. >>> >>> Well I'm not an expert on this, but as far as I know the bigger problem >>> is that the dedicated AMD hardware generations you are targeting usually >>> can't reliable scanout from system memory without a rather complicated >>> setup. >>> >>> So that is a complete NAK to the radeon changes. >> >> >> Hi Christian, >> >> thanks for the feedback, but i think that's a misunderstanding. The patches >> don't make them scanout from system memory, they just enforce a fresh copy >> from RAM/GTT -> VRAM before scanning out a buffer again. I just assume there >> is a more elegant/clean way than this "fake" pin/unpin to GTT to essentially >> tell the driver that its current VRAM content is stale and needs a refresh >> from the up to date dmabuf in system RAM. >> > > I think the ddx should handle the copy rather than the kernel. That > also takes care of the tiling. I.e., copy from the linear shared > buffer in system memory to the tiled scanout buffer in vram. The ddx > should also be able to take damage into account and only copy the > delta. From a bandwidth perspective, I'm not sure how much sense > pageflipping makes since there are so many copies already. > > Alex That's what the ati-ddx/amdgpu-ddx does at the moment, as it detects the mismatch in tiling flags and uses the DRI3/Present copy path instead of the pageflip path. The problem is that the servers Present implementation doesn't request a vsync'ed start of the copy operation and the whole procedure is too slow to keep ahead of the scanout, so it tears pretty badly for many animations. Also no page flipping = no reliable timestamps. And the modesetting ddx doesn't handle it at all, as it doesn't know about the tiling mismatch. You are right, going through page flipping doesn't save any bandwith, may even use more without damage handling, but it prevents tearing and undefined presentation timing. So it sounds as if the bug is not that page flipping doesn't quite work without my hack, but that i even managed to get this far? There is this other approach from NVidia's Alex Goins for their proprietary driver, whose patches landed in the X-Server 1.19 master branch a couple of weeks ago. I haven't read his patches in detail yet, and i so far couldn't successfully test them with the reference implementation in modesetting ddx 1.19. Afaik there the display gpu exports a pair of scanout friendly, page flipping compatible dmabufs (i assume linear, contiguous, accessible by the display engines), and the offload gpu imports those and renders into them. That saves one extra copy, so should be somewhat more efficient. Setting it up seems to be more involved and less flexible though. So far i couldn't make it work here for testing. Maybe bugs, maybe mistakes on my side, maybe i just have the wrong hardware for it. Need to read the patches first in detail to understand how it is supposed to work. -mario > >> Btw. i'll be offline for the next few hours, just wanted to get this out >> now. >> >> thanks, >> -mario >> >> >>> >>> Regards, >>> Christian. >>> >>> Am 17.08.2016 um 18:12 schrieb Mario Kleiner: Hi, i spent some time playing with DRI3/Present + PRIME for testing how well it works for Optimus/Enduro style setups wrt. page flipping on the current kernel/mesa/xorg. I want page flipping, because neuroscience/medical applications need the reliable timing/timestamping and tear free presentation we currently only can get via page flipping, but not the copyswap path. Intel as display gpu + nouveau for render offload worked nicely on intel-ddx with page flipping, proper timing, dmabuf fence sync and all. AMD uses copy swaps because radeon/amdgpu kms can't switch the scanout mode from tiled to linear on the fly during flips. That's a todo in itself. For the moment i used the ati-ddx with Option "ColorTiling/ColorTiling2D" "off" to force my pair of old Radeon HD-5770's into linear mode so page flipping can be used for prime. The current modesetting-ddx will use page flipping in any case as it doesn't detect the tiling format mismatch. nouveau uses page flips. Turns out that prime + page flipping currently doesn't work on nouveau and amd. The first offload rendered images from the imported dmabufs show up properly, but then the display is stuck alternating between the first two or three rendered frames. The problem is that during the pageflip ioctl we pin the dmabuf into VRAM in preparation for scanout, then unpin it when we are done with it at next flip, but the buffer stays in the V
[Bug 93288] dpm malfunction on radeon grenada r9 390X
https://bugs.freedesktop.org/show_bug.cgi?id=93288 --- Comment #11 from Jacinto --- Created attachment 125858 --> https://bugs.freedesktop.org/attachment.cgi?id=125858&action=edit screen corruption linux 4.7 mesa git , radeon ci r7 260x -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/17124607/attachment.html>
[Bug 93288] dpm malfunction on radeon grenada r9 390X
https://bugs.freedesktop.org/show_bug.cgi?id=93288 --- Comment #12 from Jacinto --- I added an attachment I might be hitting the same bug. With 4.7 the screen will artifact some white squares and then it will hard lock , I had to hard reboot. With 4.8 there are no artifacts It will freeze, and after a while I can get to tty dmesg show the same message. [drm:ci_dpm_enable [radeon]] *ERROR* ci_start_dpm failed [drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/056c0874/attachment.html>
[Bug 93288] dpm malfunction on radeon grenada r9 390X
https://bugs.freedesktop.org/show_bug.cgi?id=93288 --- Comment #13 from Jacinto --- Created attachment 125859 --> https://bugs.freedesktop.org/attachment.cgi?id=125859&action=edit syslog linux 4.8 rc2 mesa git radeon ci r7 260x syslog linux kernel 4.8 rc2 , mesa git . -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/354402b0/attachment.html>
[PATCH v6 8/8] drm/rockchip: Add dmc notifier in vop driver
Hi, On 2016å¹´08æ18æ¥ 02:14, Sean Paul wrote: > On Tue, Aug 16, 2016 at 3:36 PM, Lin Huang wrote: >> when in ddr frequency scaling process, vop can not do >> enable or disable operation, since dcf will base on vop vblank >> time to do frequency scaling and need to get vop irq if there >> have vop enabled. > I'm a little confused by this. Does this mean you need vblank irq to > be enabled all the time when vop is enabled? We regularly disable it > when there aren't new fbs coming in. maybe the commit message lead to misunderstanding, in dcf it will read the vop register to check whether is it in vblank status, if not, it will wait until it into vblank status. When vop enable, we need the vop clock enable, so we can read the vop register. > > >> So need register to devfreq notifier, and we can >> get the dmc status. >> Also, when there have two vop enabled, we need >> to disable dmc, since dcf only base on one vop vblank time, so the >> other panel will flicker when do ddr frequency scaling. >> >> Signed-off-by: Lin Huang >> Reviewed-by: Chanwoo Choi >> --- >> Changes in v6: >> - fix a build error >> >> Changes in v5: >> - improve some nits >> >> Changes in v4: >> - register notifier to devfreq_register_notifier >> - use DEVFREQ_PRECHANGE and DEVFREQ_POSTCHANGE to get dmc status >> - when two vop enable, disable dmc >> - when two vop back to one vop, enable dmc >> >> Changes in v3: >> - when do vop eanble/disable, dmc will wait until it finish >> >> Changes in v2: >> - None >> >> Changes in v1: >> - use wait_event instead usleep >> >> drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 121 >> +++- >> 1 file changed, 119 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >> b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >> index 31744fe..199529e 100644 >> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >> @@ -12,6 +12,8 @@ >>* GNU General Public License for more details. >>*/ >> >> +#include >> +#include >> #include >> #include >> #include >> @@ -118,6 +120,13 @@ struct vop { >> >> const struct vop_data *data; >> >> + struct devfreq *devfreq; >> + struct devfreq_event_dev *devfreq_event_dev; >> + struct notifier_block dmc_nb; >> + int dmc_in_process; >> + int vop_switch_status; >> + wait_queue_head_t wait_dmc_queue; >> + wait_queue_head_t wait_vop_switch_queue; >> uint32_t *regsbak; >> void __iomem *regs; >> >> @@ -428,21 +437,56 @@ static void vop_dsp_hold_valid_irq_disable(struct vop >> *vop) >> spin_unlock_irqrestore(&vop->irq_lock, flags); >> } >> >> +static int dmc_notify(struct notifier_block *nb, unsigned long event, >> + void *data) >> +{ >> + struct vop *vop = container_of(nb, struct vop, dmc_nb); >> + >> + if (event == DEVFREQ_PRECHANGE) { >> + /* >> +* check if vop in enable or disable process, >> +* if yes, wait until it finishes, use 200ms as >> +* timeout. >> +*/ >> + if (!wait_event_timeout(vop->wait_vop_switch_queue, >> + !vop->vop_switch_status, HZ / 5)) >> + dev_warn(vop->dev, >> +"Timeout waiting for vop swtich status\n"); >> + vop->dmc_in_process = 1; >> + } else if (event == DEVFREQ_POSTCHANGE) { >> + vop->dmc_in_process = 0; >> + wake_up(&vop->wait_dmc_queue); >> + } >> + >> + return NOTIFY_OK; >> +} >> + >> static void vop_enable(struct drm_crtc *crtc) >> { >> struct vop *vop = to_vop(crtc); >> + int num_enabled_crtc = 0; >> int ret; >> >> + /* >> +* if in dmc scaling frequency process, wait until it finishes >> +* use 100ms as timeout time. >> +*/ >> + if (!wait_event_timeout(vop->wait_dmc_queue, >> + !vop->dmc_in_process, HZ / 5)) >> + dev_warn(vop->dev, >> +"Timeout waiting for dmc when vop enable\n"); >> + >> + vop->vop_switch_status = 1; >> ret = pm_runtime_get_sync(vop->dev); >> if (ret < 0) { >> dev_err(vop->dev, "failed to get pm runtime: %d\n", ret); >> - return; >> + goto err; >> } >> >> ret = clk_enable(vop->hclk); >> if (ret < 0) { >> dev_err(vop->dev, "failed to enable hclk - %d\n", ret); >> - return; >> + goto err; >> } >> >> ret = clk_enable(vop->dclk); >> @@ -485,6 +529,21 @@ static void vop_enable(struct drm_crtc *crtc) >> >> drm_crtc_vblank_on(crtc); >> >> + vop->vop_switch_status = 0; >> + wake_up(&vop->wait_vop_switch_queue); >> + >> + /* c
[Bug 97260] [bisected] R9 290 low performance in Linux 4.7
https://bugs.freedesktop.org/show_bug.cgi?id=97260 --- Comment #23 from Michel Dänzer --- Note that since the crash happens when Xorg tries to use the GPU, testing with DRI3 and Xorg not using the GPU directly cannot trigger the crash. -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/357a9a5a/attachment.html>
[Bug 97260] [bisected] R9 290 low performance in Linux 4.7
https://bugs.freedesktop.org/show_bug.cgi?id=97260 --- Comment #24 from Michel Dänzer --- Sorry, wrong bug. :( -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/dcf7142d/attachment.html>
"Fixes" for page flipping under PRIME on AMD & nouveau
On 18/08/16 01:12 AM, Mario Kleiner wrote: > > Intel as display gpu + nouveau for render offload worked nicely > on intel-ddx with page flipping, proper timing, dmabuf fence sync > and all. How about with AMD instead of nouveau in this case? > Turns out that prime + page flipping currently doesn't work > on nouveau and amd. The first offload rendered images from > the imported dmabufs show up properly, but then the display > is stuck alternating between the first two or three rendered > frames. > > The problem is that during the pageflip ioctl we pin the > dmabuf into VRAM in preparation for scanout, then unpin it > when we are done with it at next flip, but the buffer stays > in the VRAM memory domain. Sounds like you found a bug here: BOs which are being shared between different GPUs should always be pinned to GTT, moving them to VRAM (and consequently the page flip) should fail. The latest versions of DCE support scanning out from GTT, so that might be a good solution at least for Carrizo and newer APUs, not sure it makes sense for dGPUs though. > AMD, as tested with dual Radeon HD-5770 seems to be fast as prime > importer/display gpu, but very slow as prime exporter/render offload, > e.g., taking 16 msecs to get a 1920x1080 framebuffer into RAM. Seems > that Mesa's blitImage function is the slow bit here. On r600 it seems > to draw a textured triangle strip to detile the gpu renderbuffer and > copy it into GTT. As drawing a textured fullscreen quad is normally > much faster, something special seems to be going on there wrt. DMA? Maybe the rasterization as two triangles results in bad PCIe bandwidth utilization. Using the asynchronous DMA engine for these transfers would probably be ideal, but having the 3D engine rasterize a single rectangle (either using the rectangle primitive or a large triangle with scissor) might already help. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
"Fixes" for page flipping under PRIME on AMD & nouveau
On 18/08/16 08:51 AM, Mario Kleiner wrote: > > That's what the ati-ddx/amdgpu-ddx does at the moment, as it detects the > mismatch in tiling flags and uses the DRI3/Present copy path instead of > the pageflip path. The problem is that the servers Present > implementation doesn't request a vsync'ed start of the copy operation [...] It waits for vblank before starting the copy. > There is this other approach from NVidia's Alex Goins for their > proprietary driver, whose patches landed in the X-Server 1.19 master > branch a couple of weeks ago. I haven't read his patches in detail yet, > and i so far couldn't successfully test them with the reference > implementation in modesetting ddx 1.19. Afaik there the display gpu > exports a pair of scanout friendly, page flipping compatible dmabufs (i > assume linear, contiguous, accessible by the display engines), FWIW, that wouldn't be possible with our "older" GPUs which can't scan out from GTT: A BO can be either shared with another GPU or scanout friendly, not both at the same time. > and the offload gpu imports those and renders into them. That saves > one extra copy, so should be somewhat more efficient. Using two shared buffers actually isn't as efficient as possible wrt inter-GPU bandwidth. > Setting it up seems to be more involved and less flexible though. So far > i couldn't make it work here for testing. Maybe bugs, maybe mistakes on > my side, maybe i just have the wrong hardware for it. Yeah, my impression has been it's a rather complicated solution geared towards the Intel iGPU + proprietary nVidia use case. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
[git pull] drm fixes
Hi Linus, Pretty quiet so far, a few amdgpu/radeon fixup for pcie pm changes, and a couple of amdgpu fixes, along with some build fixes, and printk fix. Dave. The following changes since commit 4b9eaf33d83d91430b7ca45d0ebf8241da489c92: Merge branch 'akpm' (patches from Andrew) (2016-08-11 16:58:24 -0700) are available in the git repository at: git://people.freedesktop.org/~airlied/linux tags/drm-fixes-for-4.8-rc3 for you to fetch changes up to 91d62d9f30206be6f7749a0e6f7fa58c6d70c702: Merge branch 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux into drm-fixes (2016-08-18 12:51:27 +1000) Alex Deucher (2): Revert "drm/amdgpu: work around lack of upstream ACPI support for D3cold" Revert "drm/radeon: work around lack of upstream ACPI support for D3cold" Arnd Bergmann (3): drm/mediatek: add COMMON_CLK dependency drm/mediatek: add CONFIG_OF dependency drm/mediatek: add ARM_SMCCC dependency Chunming Zhou (1): drm/amdgpu: fix vm init error path Colin Ian King (1): drm/amdkfd: print doorbell offset as a hex value Dave Airlie (4): Merge tag 'drm-amdkfd-fixes-2016-08-09' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes Merge branch 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux into drm-fixes Merge tag 'mediatek-drm-fixes-2016-08-12' of git://git.pengutronix.de/git/pza/linux into drm-fixes Merge branch 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux into drm-fixes Felix Kuehling (1): drm/amdgpu: Change GART offset to 64-bit Jay Cornwall (1): drm/amdgpu: Fix memory trashing if UVD ring test fails drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c | 9 - drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 3 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 - drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c| 2 +- drivers/gpu/drm/mediatek/Kconfig | 3 +++ drivers/gpu/drm/radeon/radeon_atpx_handler.c | 9 - 8 files changed, 14 insertions(+), 25 deletions(-)
[Bug 153121] UVD not responding, trying to reset the VCPU, ATI Mobility Radeon HD 5650
https://bugzilla.kernel.org/show_bug.cgi?id=153121 Kontantin Ivanov changed: What|Removed |Added Severity|normal |low --- Comment #28 from Kontantin Ivanov --- Hah. Now, when I update my kernel to 4.8-rc2 (4.8.0-040800rc2) the error messages in dmesg are also present [6.361530] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!! [7.376047] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!! [8.390569] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!! [9.405078] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!! [ 10.419596] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!! [ 11.434111] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!! [ 12.448626] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!! [ 13.463148] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!! [ 14.477689] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!! [ 15.492208] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!! [ 15.512208] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, giving up!!! [ 16.535049] [drm:r600_ib_test [radeon]] *ERROR* radeon: fence wait timed out. [ 16.535153] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on GFX ring (-110). [ 16.535235] [drm:radeon_device_init [radeon]] *ERROR* ib ring test failed (-110). But it appears only on system start and never again. When I start DRI_PRIME=1 glxinfo and DRI_PRIME=1 glxgears the programs are started without delay and without error messages in both stderr and dmesg. What is interesting? Ok, see here: glxinfo | grep -i opengl | egrep -i '(version|renderer)' OpenGL renderer string: Mesa DRI Intel(R) Ironlake Mobile OpenGL version string: 2.1 Mesa 11.2.0 OpenGL shading language version string: 1.20 OpenGL ES profile version string: OpenGL ES 2.0 Mesa 11.2.0 OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.0.16 DRI_PRIME=1 glxinfo | grep -i opengl | egrep -i '(version|renderer)' OpenGL renderer string: Mesa DRI Intel(R) Ironlake Mobile OpenGL version string: 2.1 Mesa 11.2.0 OpenGL shading language version string: 1.20 OpenGL ES profile version string: OpenGL ES 2.0 Mesa 11.2.0 OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.0.16 Both graphic cards (DRI_PRIME=0 and DRI_PRIME=1) looks like are the same card Intel(R) Ironlake Mobile. But let's render something: glxgears Running synchronized to the vertical refresh. The framerate should be approximately the same as the monitor refresh rate. 304 frames in 5.0 seconds = 60.605 FPS XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0" after 1263 requests (1263 known processed) with 0 events remaining. DRI_PRIME=1 glxgears Running synchronized to the vertical refresh. The framerate should be approximately the same as the monitor refresh rate. 8731 frames in 5.0 seconds = 1746.049 FPS XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0" after 24162 requests (24162 known processed) with 0 events remaining. Ha-ha, FPS is different and DRI_PRIME=1 looks like more powerful. I changed importance of this bug to low. -- You are receiving this mail because: You are watching the assignee of the bug.
[Bug 153121] UVD not responding, trying to reset the VCPU, ATI Mobility Radeon HD 5650
https://bugzilla.kernel.org/show_bug.cgi?id=153121 --- Comment #29 from Michel Dänzer --- (In reply to Kontantin Ivanov from comment #28) > DRI_PRIME=1 glxinfo | grep -i opengl | egrep -i '(version|renderer)' > > OpenGL renderer string: Mesa DRI Intel(R) Ironlake Mobile [...] > DRI_PRIME=1 glxgears > > Running synchronized to the vertical refresh. The framerate should be > approximately the same as the monitor refresh rate. > 8731 frames in 5.0 seconds = 1746.049 FPS I suspect you just get a higher framerate because DRI_PRIME=1 implicitly disables sync-to-vblank with DRI2, even when it ends up using the same GPU, and you'll see with glxgears -info that it's also using the Intel GPU. Presumably because Xorg couldn't initialize the Radeon GPU. -- You are receiving this mail because: You are watching the assignee of the bug.
[Bug 153121] UVD not responding, trying to reset the VCPU, ATI Mobility Radeon HD 5650
https://bugzilla.kernel.org/show_bug.cgi?id=153121 --- Comment #30 from Kontantin Ivanov --- Created attachment 229231 --> https://bugzilla.kernel.org/attachment.cgi?id=229231&action=edit 4.8.0-040800rc2 dmesg dmesg for 4.8.0 rc2 -- You are receiving this mail because: You are watching the assignee of the bug.
[Bug 153121] UVD not responding, trying to reset the VCPU, ATI Mobility Radeon HD 5650
https://bugzilla.kernel.org/show_bug.cgi?id=153121 --- Comment #31 from Kontantin Ivanov --- Created attachment 229241 --> https://bugzilla.kernel.org/attachment.cgi?id=229241&action=edit 4.8.0-040800rc2 Xorg.0.log Xorg log for 4.8.0 rc2 -- You are receiving this mail because: You are watching the assignee of the bug.
[PATCH for v4.8] cec-edid: check for IEEE identifier
The cec_get_edid_spa_location() function did not verify that the IEEE identifier in the Vendor Specific Data Block matched the HDMI-LLC identifier. This could result in the wrong VSDB block being returned. For example, for HDMI 2.0 EDIDs there is also a HDMI Forum VSDB. So check the IEEE identifier as well. Signed-off-by: Hans Verkuil --- diff --git a/drivers/media/cec-edid.c b/drivers/media/cec-edid.c index 7001824..5719b99 100644 --- a/drivers/media/cec-edid.c +++ b/drivers/media/cec-edid.c @@ -70,7 +70,10 @@ static unsigned int cec_get_edid_spa_location(const u8 *edid, unsigned int size) u8 tag = edid[i] >> 5; u8 len = edid[i] & 0x1f; - if (tag == 3 && len >= 5 && i + len <= end) + if (tag == 3 && len >= 5 && i + len <= end && + edid[i + 1] == 0x03 && + edid[i + 2] == 0x0c && + edid[i + 3] == 0x00) return i + 4; i += len + 1; } while (i < end);
[PATCH v12 2/7] drm/i915/skl: Add support for the SAGV, fix underrun hangs
Hey, Op 17-08-16 om 21:55 schreef Lyude: > Since the watermark calculations for Skylake are still broken, we're apt > to hitting underruns very easily under multi-monitor configurations. > While it would be lovely if this was fixed, it's not. Another problem > that's been coming from this however, is the mysterious issue of > underruns causing full system hangs. An easy way to reproduce this with > a skylake system: > > - Get a laptop with a skylake GPU, and hook up two external monitors to > it > - Move the cursor from the built-in LCD to one of the external displays > as quickly as you can > - You'll get a few pipe underruns, and eventually the entire system will > just freeze. > > After doing a lot of investigation and reading through the bspec, I > found the existence of the SAGV, which is responsible for adjusting the > system agent voltage and clock frequencies depending on how much power > we need. According to the bspec: > > "The display engine access to system memory is blocked during the > adjustment time. SAGV defaults to enabled. Software must use the > GT-driver pcode mailbox to disable SAGV when the display engine is not > able to tolerate the blocking time." > > The rest of the bspec goes on to explain that software can simply leave > the SAGV enabled, and disable it when we use interlaced pipes/have more > then one pipe active. > > Sure enough, with this patchset the system hangs resulting from pipe > underruns on Skylake have completely vanished on my T460s. Additionally, > the bspec mentions turning off the SAGV with more then one pipe enabled > as a workaround for display underruns. While this patch doesn't entirely > fix that, it looks like it does improve the situation a little bit so > it's likely this is going to be required to make watermarks on Skylake > fully functional. > > This will still need additional work in the future: we shouldn't be > enabling the SAGV if any of the currently enabled planes can't enable WM > levels that introduce latencies >= 30 µs. > > Changes since v11: > - Add skl_can_enable_sagv() > - Make sure we don't enable SAGV when not all planes can enable >watermarks >= the SAGV engine block time. I was originally going to >save this for later, but I recently managed to run into a machine >that was having problems with a single pipe configuration + SAGV. > - Make comparisons to I915_SKL_SAGV_NOT_CONTROLLED explicit > - Change I915_SAGV_DYNAMIC_FREQ to I915_SAGV_ENABLE > - Move printks outside of mutexes > - Don't print error messages twice > Changes since v10: > - Apparently sandybridge_pcode_read actually writes values and reads >them back, despite it's misleading function name. This means we've >been doing this mostly wrong and have been writing garbage to the >SAGV control. Because of this, we no longer attempt to read the SAGV >status during initialization (since there are no helpers for this). > - mlankhorst noticed that this patch was breaking on some very early >pre-release Skylake machines, which apparently don't allow you to >disable the SAGV. To prevent machines from failing tests due to SAGV >errors, if the first time we try to control the SAGV results in the >mailbox indicating an invalid command, we just disable future attempts >to control the SAGV state by setting dev_priv->skl_sagv_status to >I915_SKL_SAGV_NOT_CONTROLLED and make a note of it in dmesg. > - Move mutex_unlock() a little higher in skl_enable_sagv(). This >doesn't actually fix anything, but lets us release the lock a little >sooner since we're finished with it. > Changes since v9: > - Only enable/disable sagv on Skylake > Changes since v8: > - Add intel_state->modeset guard to the conditional for >skl_enable_sagv() > Changes since v7: > - Remove GEN9_SAGV_LOW_FREQ, replace with GEN9_SAGV_IS_ENABLED (that's >all we use it for anyway) > - Use GEN9_SAGV_IS_ENABLED instead of 0x1 for clarification > - Fix a styling error that snuck past me > Changes since v6: > - Protect skl_enable_sagv() with intel_state->modeset conditional in >intel_atomic_commit_tail() > Changes since v5: > - Don't use is_power_of_2. Makes things confusing > - Don't use the old state to figure out whether or not to >enable/disable the sagv, use the new one > - Split the loop in skl_disable_sagv into it's own function > - Move skl_sagv_enable/disable() calls into intel_atomic_commit_tail() > Changes since v4: > - Use is_power_of_2 against active_crtcs to check whether we have > 1 >pipe enabled > - Fix skl_sagv_get_hw_state(): (temp & 0x1) indicates disabled, 0x0 >enabled > - Call skl_sagv_enable/disable() from pre/post-plane updates > Changes since v3: > - Use time_before() to compare timeout to jiffies > Changes since v2: > - Really apply minor style nitpicks to patch this time > Changes since v1: > - Added comments about this probably being one of the requirements to >fixing Skylake's wat
"Fixes" for page flipping under PRIME on AMD & nouveau
> Afaiu the prime importing display gpu generates its own gem buffer > handle (prime_fd_to_handle) from that dmabuf, importing scather-gather > tables to access the dmabuf in system ram. As far as page flipping is > concerned, so far those gem buffers / radeon_bo's aren't treated any > different than native ones. During pageflip setup they get pinned into > VRAM, which moves (=copies) their content from the RAM dmabuf backing > store into VRAM. Your understanding isn't correct. Buffers imported using prime always stay in GTT, they can't be moved to VRAM. It's the DDX which copies the buffer content from the imported prime handle into a native on which is enabled to scan out. Regards, Christian. Am 18.08.2016 um 01:29 schrieb Mario Kleiner: > On 08/17/2016 07:02 PM, Christian König wrote: >> Am 17.08.2016 um 18:35 schrieb Mario Kleiner: >>> On 08/17/2016 06:27 PM, Christian König wrote: > AMD uses copy swaps because radeon/amdgpu kms can't switch the > scanout mode from tiled to linear on the fly during flips. Well I'm not an expert on this, but as far as I know the bigger problem is that the dedicated AMD hardware generations you are targeting usually can't reliable scanout from system memory without a rather complicated setup. So that is a complete NAK to the radeon changes. >>> >>> Hi Christian, >>> >>> thanks for the feedback, but i think that's a misunderstanding. The >>> patches don't make them scanout from system memory, they just enforce >>> a fresh copy from RAM/GTT -> VRAM before scanning out a buffer again. >>> I just assume there is a more elegant/clean way than this "fake" >>> pin/unpin to GTT to essentially tell the driver that its current VRAM >>> content is stale and needs a refresh from the up to date dmabuf in >>> system RAM. >> >> I was already wondering how the heck you got that working. >> >> What do you mean with a fresh copy from GTT to VRAM? A buffer exported >> by DMA-buf should never move as long as it is exported, same for a >> buffer pinned to VRAM. >> > > Under DRI3/Present, the way it is currently implemented in the > X-Server and Mesa, the display gpu (= normally integrated one) is > importing the dma-buf that was exported by the render offload gpu. So > the actual dmabuf doesn't move, but just stays where it is in system RAM. > > Afaiu the prime importing display gpu generates its own gem buffer > handle (prime_fd_to_handle) from that dmabuf, importing scather-gather > tables to access the dmabuf in system ram. As far as page flipping is > concerned, so far those gem buffers / radeon_bo's aren't treated any > different than native ones. During pageflip setup they get pinned into > VRAM, which moves (=copies) their content from the RAM dmabuf backing > store into VRAM. Then they get flipped and scanned out as usual. The > disconnect happens when such a buffer gets flipped off the scanout > (and unpinned) and later on page-flipped to the scanout again. Now the > driver just reuses the bo that still likely resides in VRAM (although > not pinned anymore) and forgets that it was associated with some > dmabuf backing in RAM which may have updated visual content. So the > exporting renderoffload gpu happily renders new frames into the dmabuf > in ram, while radeon kms happily displays stale frames from its own > copy in VRAM. > >> So using a DMA-buf for scanout is impossible and actually not valuable >> cause is shouldn't matter if we copy from GTT to VRAM because of a >> buffer migration or because of a copy triggered by the DDX. >> >> What are you actually trying to do here? >> > > Make a typical Enduro laptop with an AMD iGPU + AMD dGPU work under > DRI3/Present, without tearing and other ugliness, e.g., > > DRI_PRIME=1 glxgears -fullscreen > > -> discrete gpu renders, integrated gpu displays the rendered frames. > > Currently the drivers use copies for handling the PresentPixmap > requests, which sort of works in showing the right pictures, but gives > bad tearing and undefined timing. With copies we are too slow to keep > ahead of the scanout and Present doesn't even guarantee that the copy > starts vsync'ed. So at all levels, from delays in the x-server, mesa's > way of doing things, commmand submission and the hw itself we end up > blitting in the middle of scanout. And the presentation timing isn't > ever trustworthy for timing sensitive applications unless we present > via page flipping. > > The hack in my patch tricks the driver into migrating the bo back to > GTT (skipping the actual pointless data copy though) and then back > into VRAM to force a copy of fresh content from the imported dmabuf > into VRAM, so page flipping flips up to date content into the scanout. > > -mario > >> Regards, >> Christian. >> >>> >>> Btw. i'll be offline for the next few hours, just wanted to get this >>> out now. >>> >>> thanks, >>> -mario >>> Regards, Christian. Am 17.08.2
"Fixes" for page flipping under PRIME on AMD & nouveau
Am 18.08.2016 um 04:32 schrieb Michel Dänzer: > On 18/08/16 08:51 AM, Mario Kleiner wrote: >> There is this other approach from NVidia's Alex Goins for their >> proprietary driver, whose patches landed in the X-Server 1.19 master >> branch a couple of weeks ago. I haven't read his patches in detail yet, >> and i so far couldn't successfully test them with the reference >> implementation in modesetting ddx 1.19. Afaik there the display gpu >> exports a pair of scanout friendly, page flipping compatible dmabufs (i >> assume linear, contiguous, accessible by the display engines), > FWIW, that wouldn't be possible with our "older" GPUs which can't scan > out from GTT: A BO can be either shared with another GPU or scanout > friendly, not both at the same time. And even for newer GPUs it is quite complicated to setup. As far as I understood it you need to make sure that at least: 1. A whole line buffered is continuous. E.g. if you want to scan out 1920x1080 32bpp without tilling you need 1920*4=7680 bytes of linear memory. The result is that you need to special allocate your GTT buffer. 2. You can't use multiple layer page tables for the system domain (we already do this). 3. The MC needs to guarantee enough PCIe bandwith for the CRTC. This means you need to reprogram some priorities in the MC differently which can only be done when the whole GPU is idle and we haven't released documentation for at all. But keep in mind that this is only *AFAIK* and from a document on how the DCE works I read quite a while ago. Regards, Christian.
"Fixes" for page flipping under PRIME on AMD & nouveau
On 18/08/16 04:41 PM, Christian König wrote: >> Afaiu the prime importing display gpu generates its own gem buffer >> handle (prime_fd_to_handle) from that dmabuf, importing scather-gather >> tables to access the dmabuf in system ram. As far as page flipping is >> concerned, so far those gem buffers / radeon_bo's aren't treated any >> different than native ones. During pageflip setup they get pinned into >> VRAM, which moves (=copies) their content from the RAM dmabuf backing >> store into VRAM. > > Your understanding isn't correct. Buffers imported using prime always > stay in GTT, they can't be moved to VRAM. That's the theory, but based on Mario's description it's clear that there is at least one bug which either actually allows a shared buffer to be moved to VRAM, or at least doesn't propagate the error correctly, so the page flip operation "succeeds". > It's the DDX which copies the buffer content from the imported prime > handle into a native on which is enabled to scan out. There is no such code which could explain what Mario is seeing. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
[PATCH v2] drm: drop DRIVER_HAVE_IRQ flag for some drivers
Acked-by: Vincent Abriou On 08/16/2016 09:06 AM, Shawn Guo wrote: > Since commit 4984979b9b90 ("drm/irq: simplify irq checks in > drm_wait_vblank"), the drm driver feature flag DRIVER_HAVE_IRQ is only > required for drivers that have an IRQ handler managed by the DRM core. > Some drivers, armada, etnaviv, kirin and sti, set this flag without > .irq_handler setup in drm_driver. These drivers manage IRQ handler > by themselves and the flag DRIVER_HAVE_IRQ makes no sense there. > > Drop the flag for these drivers to avoid confusion. > > Signed-off-by: Shawn Guo > Cc: Vincent Abriou > Cc: Xinliang Liu > Acked-by: Russell King (for armada and > etnaviv) > --- > Changes for v2: > - Improve commit log per Russell's suggestion > - Add Russell's ACK tag > > drivers/gpu/drm/armada/armada_drv.c | 2 +- > drivers/gpu/drm/etnaviv/etnaviv_drv.c | 3 +-- > drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c | 2 +- > drivers/gpu/drm/sti/sti_drv.c | 2 +- > 4 files changed, 4 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/armada/armada_drv.c > b/drivers/gpu/drm/armada/armada_drv.c > index f5ebdd681445..1e0e68f608e4 100644 > --- a/drivers/gpu/drm/armada/armada_drv.c > +++ b/drivers/gpu/drm/armada/armada_drv.c > @@ -211,7 +211,7 @@ static struct drm_driver armada_drm_driver = { > .desc = "Armada SoC DRM", > .date = "20120730", > .driver_features= DRIVER_GEM | DRIVER_MODESET | > - DRIVER_HAVE_IRQ | DRIVER_PRIME, > + DRIVER_PRIME, > .ioctls = armada_ioctls, > .fops = &armada_drm_fops, > }; > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c > b/drivers/gpu/drm/etnaviv/etnaviv_drv.c > index ffd1b32caa8d..fd0ed61565f3 100644 > --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c > +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c > @@ -488,8 +488,7 @@ static const struct file_operations fops = { > }; > > static struct drm_driver etnaviv_drm_driver = { > - .driver_features= DRIVER_HAVE_IRQ | > - DRIVER_GEM | > + .driver_features= DRIVER_GEM | > DRIVER_PRIME | > DRIVER_RENDER, > .open = etnaviv_open, > diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c > b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c > index 1edd9bc80294..1fc2f502d20d 100644 > --- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c > +++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c > @@ -169,7 +169,7 @@ static int kirin_gem_cma_dumb_create(struct drm_file > *file, > > static struct drm_driver kirin_drm_driver = { > .driver_features= DRIVER_GEM | DRIVER_MODESET | DRIVER_PRIME | > - DRIVER_ATOMIC | DRIVER_HAVE_IRQ, > + DRIVER_ATOMIC, > .fops = &kirin_drm_fops, > > .gem_free_object_unlocked = drm_gem_cma_free_object, > diff --git a/drivers/gpu/drm/sti/sti_drv.c b/drivers/gpu/drm/sti/sti_drv.c > index 96bd3d08b2d4..f8311b2bc84e 100644 > --- a/drivers/gpu/drm/sti/sti_drv.c > +++ b/drivers/gpu/drm/sti/sti_drv.c > @@ -282,7 +282,7 @@ static const struct file_operations sti_driver_fops = { > }; > > static struct drm_driver sti_driver = { > - .driver_features = DRIVER_HAVE_IRQ | DRIVER_MODESET | > + .driver_features = DRIVER_MODESET | > DRIVER_GEM | DRIVER_PRIME | DRIVER_ATOMIC, > .gem_free_object_unlocked = drm_gem_cma_free_object, > .gem_vm_ops = &drm_gem_cma_vm_ops, >
"Fixes" for page flipping under PRIME on AMD & nouveau
Am 18.08.2016 um 09:52 schrieb Michel Dänzer: > On 18/08/16 04:41 PM, Christian König wrote: >>> Afaiu the prime importing display gpu generates its own gem buffer >>> handle (prime_fd_to_handle) from that dmabuf, importing scather-gather >>> tables to access the dmabuf in system ram. As far as page flipping is >>> concerned, so far those gem buffers / radeon_bo's aren't treated any >>> different than native ones. During pageflip setup they get pinned into >>> VRAM, which moves (=copies) their content from the RAM dmabuf backing >>> store into VRAM. >> Your understanding isn't correct. Buffers imported using prime always >> stay in GTT, they can't be moved to VRAM. > That's the theory, but based on Mario's description it's clear that > there is at least one bug which either actually allows a shared buffer > to be moved to VRAM, or at least doesn't propagate the error correctly, > so the page flip operation "succeeds". > > >> It's the DDX which copies the buffer content from the imported prime >> handle into a native on which is enabled to scan out. > There is no such code which could explain what Mario is seeing. How should this work then otherwise? I agree that I don't understand fully either what is happening here, but I find it quite unlikely that we actually scan out from system memory without the proper hardware setup. On the other hand that we accidentally move a prime imported buffer to VRAM could be possible, but this would clearly be a rather severe bug we hopefully have noticed already. Any other idea what actually happens here? Regards, Christian.
"Fixes" for page flipping under PRIME on AMD & nouveau
On 18/08/16 05:20 PM, Christian König wrote: > Am 18.08.2016 um 09:52 schrieb Michel Dänzer: >> On 18/08/16 04:41 PM, Christian König wrote: Afaiu the prime importing display gpu generates its own gem buffer handle (prime_fd_to_handle) from that dmabuf, importing scather-gather tables to access the dmabuf in system ram. As far as page flipping is concerned, so far those gem buffers / radeon_bo's aren't treated any different than native ones. During pageflip setup they get pinned into VRAM, which moves (=copies) their content from the RAM dmabuf backing store into VRAM. >>> Your understanding isn't correct. Buffers imported using prime always >>> stay in GTT, they can't be moved to VRAM. >> That's the theory, but based on Mario's description it's clear that >> there is at least one bug which either actually allows a shared buffer >> to be moved to VRAM, or at least doesn't propagate the error correctly, >> so the page flip operation "succeeds". >> >> >>> It's the DDX which copies the buffer content from the imported prime >>> handle into a native on which is enabled to scan out. >> There is no such code which could explain what Mario is seeing. > > How should this work then otherwise? [...] > On the other hand that we accidentally move a prime imported buffer to > VRAM could be possible, but this would clearly be a rather severe bug we > hopefully have noticed already. That's what seems to be happening, based on Mario's description and patches. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
[PATCH v2 2/2] drm/fence: allow fence waiting to be interrupted by userspace
Op 16-08-16 om 01:05 schreef Gustavo Padovan: > From: Gustavo Padovan > > If userspace is running an synchronously atomic commit and interrupts the > atomic operation during fence_wait() it will hang until the timer expires, > so here we change the wait to be interruptible so it stop immediately when > userspace wants to quit. > > Also adds the necessary error checking for fence_wait(). > > v2: Comment by Daniel Vetter > - Add error checking for fence_wait() > > v3: Rebase on top of new atomic noblocking support > > v4: Comment by Maarten Lankhorst > - remove 'swapped' bitfield as it was duplicating information > > Signed-off-by: Gustavo Padovan I think it would make sense to squash this patch with 1/2. > --- > drivers/gpu/drm/drm_atomic_helper.c | 33 - > 1 file changed, 28 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c > b/drivers/gpu/drm/drm_atomic_helper.c > index bf1bc43..292d15b 100644 > --- a/drivers/gpu/drm/drm_atomic_helper.c > +++ b/drivers/gpu/drm/drm_atomic_helper.c > @@ -1022,20 +1022,37 @@ int drm_atomic_helper_wait_for_fences(struct > drm_device *dev, > { > struct drm_plane *plane; > struct drm_plane_state *plane_state; > + struct fence *fence; > int i, ret; > > for_each_plane_in_state(state, plane, plane_state, i) { > - if (!plane->state->fence) > + if (!plane->state->fence && !plane_state->fence) > continue; Move this check to below, or put if (!intr) plane_state = plane->state on top, and only use plane_state. Daniel, I keep thinking more and more that a for_each_XXX_(old,new,both)_state would be useful, could we add those perhaps? > - WARN_ON(!plane->state->fb); > + /* > + * If the caller asks for an interruptible wait it means > + * that the state were not swapped yet and the operation > + * can still be interrupted by userspace, so we need > + * to look to plane_state instead. > + */ > + if (intr) { > + fence = plane_state->fence; > + WARN_ON(!plane_state->fb); > + } else { > + fence = plane->state->fence; > + WARN_ON(!plane->state->fb); > + } ^ if (!fence) continue; > - ret = fence_wait(plane->state->fence, intr); > + ret = fence_wait(fence, intr); > if (ret) > return ret; > > - fence_put(plane->state->fence); > - plane->state->fence = NULL; > + fence_put(fence); > + > + if (intr) > + plane_state->fence = NULL; > + else > + plane->state->fence = NULL; > } > > return 0; > @@ -1244,6 +1261,12 @@ int drm_atomic_helper_commit(struct drm_device *dev, > if (ret) > return ret; > > + if (!nonblock) { > + ret = drm_atomic_helper_wait_for_fences(dev, state, true); > + if (ret) > + return ret; > + } > + > /* >* This is the point of no return - everything below never fails except >* when the hw goes bonghits. Which means we can commit the new state on
Use of copy_from_user in msm_gem_submit.c while holding a spin_lock
On Wed, Aug 17, 2016 at 08:31:20PM +0100, Al Viro wrote: > On Wed, Aug 17, 2016 at 03:24:38PM -0400, Rob Clark wrote: > > > hmm, looks like, at least on arm (not sure about arm64), > > > > #define __copy_from_user_inatomic __copy_from_user > > > > ie. copy_from_user() minus the access_ok() and memset in the > > !access_ok() path.. but maybe what I want is just the > > pagefault_disable() if that disables copy_from_user() being able to > > block.. > > On a bunch of platforms copy_from_user() starts with might_sleep(); again, > that'll spread to all of the pretty soon. > > Right now those primitives are very badly out of sync; this will change, > but let's not add more PITA sources. That sounds great, as part of discussing this on irc with Rob I too noticed that the the *copy*user* funcs are all rather out of sync. On i915.ko we go full evil mode and pass (faulting) i915 buffer objects in as targets for all these copy*user operations. And for added evilness we have debugfs interfaces to force-unmap/evict these bo, which is used to make sure that the fault handling in slow-paths (after dropping locks and reacquiring them) also works - some of i915 code has slow-slow path fallbacks ;-) Oh and we have a debugfs knob to disable the prefaulting we do, since without those the race is way too small. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Use of copy_from_user in msm_gem_submit.c while holding a spin_lock
On Wed, Aug 17, 2016 at 05:29:31PM -0400, Rob Clark wrote: > On Wed, Aug 17, 2016 at 3:15 PM, Al Viro wrote: > > On Wed, Aug 17, 2016 at 02:49:32PM -0400, Rob Clark wrote: > >> If there is a copy_from_user() variant that will return an error > >> instead of blocking, I think that is really what I want so I can > >> implement a slow-path that drops the spin-lock temporarily. > > > > *shrug* > > > > pagefault_disable()/pagefault_enable() are there for purpose, so's > > __copy_from_user_inatomic()... Just remember that > > __copy_from_user_inatomic() > > does not check if the addresses are userland ones (i.e. the caller needs > > to check access_ok() itself) and it is *NOT* guaranteed to zero what it > > hadn't copied over. Currently it does zero tail on some, but not all > > architectures; come next cycle it and it will not do that zeroing on any > > of those. > > Ok, this is what I came up with.. let me know what you think. The > first hunk was just to see the problem in the first place (no idea if > other places on arm would have problems w/ that hunk so it wouldn't be > part of my fix+cc-stable patch.. but it seems like I good idea, I > would have discovered this issue much sooner if we had it) > > -- > diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h > index 35c9db8..ce2e182 100644 > --- a/arch/arm/include/asm/uaccess.h > +++ b/arch/arm/include/asm/uaccess.h > @@ -542,6 +542,7 @@ __clear_user(void __user *addr, unsigned long n) > > static inline unsigned long __must_check copy_from_user(void *to, > const void __user *from, unsigned long n) > { > +might_fault(); > if (access_ok(VERIFY_READ, from, n)) > n = __copy_from_user(to, from, n); > else /* security hole - plug it */ > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c > b/drivers/gpu/drm/msm/msm_gem_submit.c > index 5cd4e9b..3cca013 100644 > --- a/drivers/gpu/drm/msm/msm_gem_submit.c > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c > @@ -66,6 +66,14 @@ void msm_gem_submit_free(struct msm_gem_submit *submit) > kfree(submit); > } > > +static inline unsigned long __must_check > +copy_from_user_inatomic(void *to, const void __user *from, unsigned long n) > +{ > +if (access_ok(VERIFY_READ, from, n)) > +return __copy_from_user_inatomic(to, from, n); > +return -EFAULT; > +} > + > static int submit_lookup_objects(struct msm_gem_submit *submit, > struct drm_msm_gem_submit *args, struct drm_file *file) > { > @@ -73,6 +81,7 @@ static int submit_lookup_objects(struct > msm_gem_submit *submit, > int ret = 0; > > spin_lock(&file->table_lock); > +pagefault_disable(); Hm, I thought with spinlocks this isn't needed ... we have that in some fastpaths to avoid locking inversion in i915_gem_execbuf.c, but that's because a mutex critical section can still sleep. > > for (i = 0; i < args->nr_bos; i++) { > struct drm_msm_gem_submit_bo submit_bo; > @@ -86,10 +95,15 @@ static int submit_lookup_objects(struct > msm_gem_submit *submit, > */ > submit->bos[i].flags = 0; > > -ret = copy_from_user(&submit_bo, userptr, sizeof(submit_bo)); > -if (ret) { > -ret = -EFAULT; > -goto out_unlock; > +ret = copy_from_user_inatomic(&submit_bo, userptr, > sizeof(submit_bo)); > +if (unlikely(ret)) { > +pagefault_enable(); > +spin_unlock(&file->table_lock); > +ret = copy_from_user(&submit_bo, userptr, sizeof(submit_bo)); > +if (ret) > +return -EFAULT; > +spin_lock(&file->table_lock); > +pagefault_disable(); > } > > if (submit_bo.flags & ~MSM_SUBMIT_BO_FLAGS) { > @@ -130,6 +144,7 @@ static int submit_lookup_objects(struct > msm_gem_submit *submit, > > out_unlock: > submit->nr_bos = i; > +pagefault_enable(); > spin_unlock(&file->table_lock); > > return ret; > -- > > danvet suggested the doubleplus-super-evil variant of the test > program, using an unfaulted but mmap'd gem bo passed in to submit > ioctl for the bo's table, which has the additional fun of wanting to > acquire the already held struct_mutex in msm_gem_fault(). Which > needed the further fix: > > -- > diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c > index 6cd4af4..4502e4b 100644 > --- a/drivers/gpu/drm/msm/msm_gem.c > +++ b/drivers/gpu/drm/msm/msm_gem.c > @@ -201,6 +201,13 @@ int msm_gem_fault(struct vm_area_struct *vma, > struct vm_fault *vmf) > pgoff_t pgoff; > int ret; > > +/* I think this should only happen if userspace tries to pass a > + * mmap'd but unfaulted gem bo vaddr into submit ioctl, triggering > + * a page fault while struct_mutex is already held > + */ > +if (mutex_is_locked_by(&dev->struct_mutex, current)) > +return VM_FAULT_SIGBUS; This is an ok (well still horrible) heuristics for the shrinker, but for correctn
[PATCH 2/2] drm/rockchip: Use DRM_DEV_ERROR in vop
On 2016å¹´08æ13æ¥ 01:00, Sean Paul wrote: > Since we can have multiple vops, use DRM_DEV_ERROR to > make logs easier to process. > > Signed-off-by: Sean Paul looks good for me Acked-by: Mark Yao > --- > drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 24 ++-- > 1 file changed, 14 insertions(+), 10 deletions(-) > > diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > index 31744fe..ec8ad00 100644 > --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > @@ -238,7 +238,7 @@ static enum vop_data_format vop_convert_format(uint32_t > format) > case DRM_FORMAT_NV24: > return VOP_FMT_YUV444SP; > default: > - DRM_ERROR("unsupport format[%08x]\n", format); > + DRM_ERROR("unsupported format[%08x]\n", format); > return -EINVAL; > } > } > @@ -315,7 +315,7 @@ static void scl_vop_cal_scl_fac(struct vop *vop, const > struct vop_win_data *win, > int vskiplines = 0; > > if (dst_w > 3840) { > - DRM_ERROR("Maximum destination width (3840) exceeded\n"); > + DRM_DEV_ERROR(vop->dev, "Maximum dst width (3840) exceeded\n"); > return; > } > > @@ -353,11 +353,11 @@ static void scl_vop_cal_scl_fac(struct vop *vop, const > struct vop_win_data *win, > VOP_SCL_SET_EXT(vop, win, lb_mode, lb_mode); > if (lb_mode == LB_RGB_3840X2) { > if (yrgb_ver_scl_mode != SCALE_NONE) { > - DRM_ERROR("ERROR : not allow yrgb ver scale\n"); > + DRM_DEV_ERROR(vop->dev, "not allow yrgb ver scale\n"); > return; > } > if (cbcr_ver_scl_mode != SCALE_NONE) { > - DRM_ERROR("ERROR : not allow cbcr ver scale\n"); > + DRM_DEV_ERROR(vop->dev, "not allow cbcr ver scale\n"); > return; > } > vsu_mode = SCALE_UP_BIL; > @@ -970,7 +970,8 @@ static void vop_crtc_enable(struct drm_crtc *crtc) > VOP_CTRL_SET(vop, mipi_en, 1); > break; > default: > - DRM_ERROR("unsupport connector_type[%d]\n", s->output_type); > + DRM_DEV_ERROR(vop->dev, "unsupported connector_type [%d]\n", > + s->output_type); > } > VOP_CTRL_SET(vop, out_mode, s->output_mode); > > @@ -1154,7 +1155,8 @@ static irqreturn_t vop_isr(int irq, void *data) > > /* Unhandled irqs are spurious. */ > if (active_irqs) > - DRM_ERROR("Unknown VOP IRQs: %#02x\n", active_irqs); > + DRM_DEV_ERROR(vop->dev, "Unknown VOP IRQs: %#02x\n", > + active_irqs); > > return ret; > } > @@ -1189,7 +1191,8 @@ static int vop_create_crtc(struct vop *vop) > win_data->phy->nformats, > win_data->type, NULL); > if (ret) { > - DRM_ERROR("failed to initialize plane\n"); > + DRM_DEV_ERROR(vop->dev, "failed to init plane %d\n", > + ret); > goto err_cleanup_planes; > } > > @@ -1227,7 +1230,8 @@ static int vop_create_crtc(struct vop *vop) > win_data->phy->nformats, > win_data->type, NULL); > if (ret) { > - DRM_ERROR("failed to initialize overlay plane\n"); > + DRM_DEV_ERROR(vop->dev, "failed to init overlay %d\n", > + ret); > goto err_cleanup_crtc; > } > drm_plane_helper_add(&vop_win->base, &plane_helper_funcs); > @@ -1235,8 +1239,8 @@ static int vop_create_crtc(struct vop *vop) > > port = of_get_child_by_name(dev->of_node, "port"); > if (!port) { > - DRM_ERROR("no port node found in %s\n", > - dev->of_node->full_name); > + DRM_DEV_ERROR(vop->dev, "no port node found in %s\n", > + dev->of_node->full_name); > ret = -ENOENT; > goto err_cleanup_crtc; > } -- ï¼ark Yao
[PATCH v3 1/5] drm/rockchip: sort registers define by chip's number
Hi Sean Thanks for send v3 patch for rk3399 vop support. But sorry for that, I had changed my mind, those patches are deprecated, I have new rk3399 patch on my downstream kernel, I will upstream soon. Thanks. On 2016å¹´08æ18æ¥ 01:20, Sean Paul wrote: > From: Mark Yao > > No functional changes, sort the vop registers to make > code more readable. > > Signed-off-by: Mark Yao > [seanpaul resolved conflict with name change from _3066 to _3036] > Signed-off-by: Sean Paul > --- > > Changes in v3: > - Fix typo from _3066 _3036 (Tomasz Figa) > > drivers/gpu/drm/rockchip/rockchip_vop_reg.c | 168 > ++-- > 1 file changed, 84 insertions(+), 84 deletions(-) > > diff --git a/drivers/gpu/drm/rockchip/rockchip_vop_reg.c > b/drivers/gpu/drm/rockchip/rockchip_vop_reg.c > index 919992c..44caf14 100644 > --- a/drivers/gpu/drm/rockchip/rockchip_vop_reg.c > +++ b/drivers/gpu/drm/rockchip/rockchip_vop_reg.c > @@ -50,6 +50,88 @@ static const uint32_t formats_win_lite[] = { > DRM_FORMAT_BGR565, > }; > > +static const struct vop_scl_regs rk3036_win_scl = { > + .scale_yrgb_x = VOP_REG(RK3036_WIN0_SCL_FACTOR_YRGB, 0x, 0x0), > + .scale_yrgb_y = VOP_REG(RK3036_WIN0_SCL_FACTOR_YRGB, 0x, 16), > + .scale_cbcr_x = VOP_REG(RK3036_WIN0_SCL_FACTOR_CBR, 0x, 0x0), > + .scale_cbcr_y = VOP_REG(RK3036_WIN0_SCL_FACTOR_CBR, 0x, 16), > +}; > + > +static const struct vop_win_phy rk3036_win0_data = { > + .scl = &rk3036_win_scl, > + .data_formats = formats_win_full, > + .nformats = ARRAY_SIZE(formats_win_full), > + .enable = VOP_REG(RK3036_SYS_CTRL, 0x1, 0), > + .format = VOP_REG(RK3036_SYS_CTRL, 0x7, 3), > + .rb_swap = VOP_REG(RK3036_SYS_CTRL, 0x1, 15), > + .act_info = VOP_REG(RK3036_WIN0_ACT_INFO, 0x1fff1fff, 0), > + .dsp_info = VOP_REG(RK3036_WIN0_DSP_INFO, 0x0fff0fff, 0), > + .dsp_st = VOP_REG(RK3036_WIN0_DSP_ST, 0x1fff1fff, 0), > + .yrgb_mst = VOP_REG(RK3036_WIN0_YRGB_MST, 0x, 0), > + .uv_mst = VOP_REG(RK3036_WIN0_CBR_MST, 0x, 0), > + .yrgb_vir = VOP_REG(RK3036_WIN0_VIR, 0x, 0), > + .uv_vir = VOP_REG(RK3036_WIN0_VIR, 0x1fff, 16), > +}; > + > +static const struct vop_win_phy rk3036_win1_data = { > + .data_formats = formats_win_lite, > + .nformats = ARRAY_SIZE(formats_win_lite), > + .enable = VOP_REG(RK3036_SYS_CTRL, 0x1, 1), > + .format = VOP_REG(RK3036_SYS_CTRL, 0x7, 6), > + .rb_swap = VOP_REG(RK3036_SYS_CTRL, 0x1, 19), > + .act_info = VOP_REG(RK3036_WIN1_ACT_INFO, 0x1fff1fff, 0), > + .dsp_info = VOP_REG(RK3036_WIN1_DSP_INFO, 0x0fff0fff, 0), > + .dsp_st = VOP_REG(RK3036_WIN1_DSP_ST, 0x1fff1fff, 0), > + .yrgb_mst = VOP_REG(RK3036_WIN1_MST, 0x, 0), > + .yrgb_vir = VOP_REG(RK3036_WIN1_VIR, 0x, 0), > +}; > + > +static const struct vop_win_data rk3036_vop_win_data[] = { > + { .base = 0x00, .phy = &rk3036_win0_data, > + .type = DRM_PLANE_TYPE_PRIMARY }, > + { .base = 0x00, .phy = &rk3036_win1_data, > + .type = DRM_PLANE_TYPE_CURSOR }, > +}; > + > +static const int rk3036_vop_intrs[] = { > + DSP_HOLD_VALID_INTR, > + FS_INTR, > + LINE_FLAG_INTR, > + BUS_ERROR_INTR, > +}; > + > +static const struct vop_intr rk3036_intr = { > + .intrs = rk3036_vop_intrs, > + .nintrs = ARRAY_SIZE(rk3036_vop_intrs), > + .status = VOP_REG(RK3036_INT_STATUS, 0xf, 0), > + .enable = VOP_REG(RK3036_INT_STATUS, 0xf, 4), > + .clear = VOP_REG(RK3036_INT_STATUS, 0xf, 8), > +}; > + > +static const struct vop_ctrl rk3036_ctrl_data = { > + .standby = VOP_REG(RK3036_SYS_CTRL, 0x1, 30), > + .out_mode = VOP_REG(RK3036_DSP_CTRL0, 0xf, 0), > + .pin_pol = VOP_REG(RK3036_DSP_CTRL0, 0xf, 4), > + .htotal_pw = VOP_REG(RK3036_DSP_HTOTAL_HS_END, 0x1fff1fff, 0), > + .hact_st_end = VOP_REG(RK3036_DSP_HACT_ST_END, 0x1fff1fff, 0), > + .vtotal_pw = VOP_REG(RK3036_DSP_VTOTAL_VS_END, 0x1fff1fff, 0), > + .vact_st_end = VOP_REG(RK3036_DSP_VACT_ST_END, 0x1fff1fff, 0), > + .cfg_done = VOP_REG(RK3036_REG_CFG_DONE, 0x1, 0), > +}; > + > +static const struct vop_reg_data rk3036_vop_init_reg_table[] = { > + {RK3036_DSP_CTRL1, 0x}, > +}; > + > +static const struct vop_data rk3036_vop = { > + .init_table = rk3036_vop_init_reg_table, > + .table_size = ARRAY_SIZE(rk3036_vop_init_reg_table), > + .ctrl = &rk3036_ctrl_data, > + .intr = &rk3036_intr, > + .win = rk3036_vop_win_data, > + .win_size = ARRAY_SIZE(rk3036_vop_win_data), > +}; > + > static const struct vop_scl_extension rk3288_win_full_scl_ext = { > .cbcr_vsd_mode = VOP_REG(RK3288_WIN0_CTRL1, 0x1, 31), > .cbcr_vsu_mode = VOP_REG(RK3288_WIN0_CTRL1, 0x1, 30), > @@ -190,93 +272,11 @@ static const struct vop_data rk3288_vop = { > .win_size = ARRAY_SIZE(rk3288_vop_win_data), > }; > > -static const struct vop_scl_regs rk3036_win_scl = { > - .scale_yrgb_x = VOP_REG(RK3036_WIN0_SCL_FACTOR_YRGB, 0x, 0x0), > -
[PATCH v3 1/5] drm/rockchip: sort registers define by chip's number
On Thu, Aug 18, 2016 at 05:08:14PM +0800, Mark yao wrote: > Hi Sean > > Thanks for send v3 patch for rk3399 vop support. > > But sorry for that, I had changed my mind, those patches are deprecated, > I have new rk3399 patch on my downstream kernel, I will upstream soon. Wut? Imo merge Sean's patch here, and then rebase your downstream patches on top of it. That you have a downstream tree which is out of sync with upstream shouldn't be a reason to stall upstream development. -Daniel > > Thanks. > > On 2016å¹´08æ18æ¥ 01:20, Sean Paul wrote: > > From: Mark Yao > > > > No functional changes, sort the vop registers to make > > code more readable. > > > > Signed-off-by: Mark Yao > > [seanpaul resolved conflict with name change from _3066 to _3036] > > Signed-off-by: Sean Paul > > --- > > > > Changes in v3: > > - Fix typo from _3066 _3036 (Tomasz Figa) > > > > drivers/gpu/drm/rockchip/rockchip_vop_reg.c | 168 > > ++-- > > 1 file changed, 84 insertions(+), 84 deletions(-) > > > > diff --git a/drivers/gpu/drm/rockchip/rockchip_vop_reg.c > > b/drivers/gpu/drm/rockchip/rockchip_vop_reg.c > > index 919992c..44caf14 100644 > > --- a/drivers/gpu/drm/rockchip/rockchip_vop_reg.c > > +++ b/drivers/gpu/drm/rockchip/rockchip_vop_reg.c > > @@ -50,6 +50,88 @@ static const uint32_t formats_win_lite[] = { > > DRM_FORMAT_BGR565, > > }; > > +static const struct vop_scl_regs rk3036_win_scl = { > > + .scale_yrgb_x = VOP_REG(RK3036_WIN0_SCL_FACTOR_YRGB, 0x, 0x0), > > + .scale_yrgb_y = VOP_REG(RK3036_WIN0_SCL_FACTOR_YRGB, 0x, 16), > > + .scale_cbcr_x = VOP_REG(RK3036_WIN0_SCL_FACTOR_CBR, 0x, 0x0), > > + .scale_cbcr_y = VOP_REG(RK3036_WIN0_SCL_FACTOR_CBR, 0x, 16), > > +}; > > + > > +static const struct vop_win_phy rk3036_win0_data = { > > + .scl = &rk3036_win_scl, > > + .data_formats = formats_win_full, > > + .nformats = ARRAY_SIZE(formats_win_full), > > + .enable = VOP_REG(RK3036_SYS_CTRL, 0x1, 0), > > + .format = VOP_REG(RK3036_SYS_CTRL, 0x7, 3), > > + .rb_swap = VOP_REG(RK3036_SYS_CTRL, 0x1, 15), > > + .act_info = VOP_REG(RK3036_WIN0_ACT_INFO, 0x1fff1fff, 0), > > + .dsp_info = VOP_REG(RK3036_WIN0_DSP_INFO, 0x0fff0fff, 0), > > + .dsp_st = VOP_REG(RK3036_WIN0_DSP_ST, 0x1fff1fff, 0), > > + .yrgb_mst = VOP_REG(RK3036_WIN0_YRGB_MST, 0x, 0), > > + .uv_mst = VOP_REG(RK3036_WIN0_CBR_MST, 0x, 0), > > + .yrgb_vir = VOP_REG(RK3036_WIN0_VIR, 0x, 0), > > + .uv_vir = VOP_REG(RK3036_WIN0_VIR, 0x1fff, 16), > > +}; > > + > > +static const struct vop_win_phy rk3036_win1_data = { > > + .data_formats = formats_win_lite, > > + .nformats = ARRAY_SIZE(formats_win_lite), > > + .enable = VOP_REG(RK3036_SYS_CTRL, 0x1, 1), > > + .format = VOP_REG(RK3036_SYS_CTRL, 0x7, 6), > > + .rb_swap = VOP_REG(RK3036_SYS_CTRL, 0x1, 19), > > + .act_info = VOP_REG(RK3036_WIN1_ACT_INFO, 0x1fff1fff, 0), > > + .dsp_info = VOP_REG(RK3036_WIN1_DSP_INFO, 0x0fff0fff, 0), > > + .dsp_st = VOP_REG(RK3036_WIN1_DSP_ST, 0x1fff1fff, 0), > > + .yrgb_mst = VOP_REG(RK3036_WIN1_MST, 0x, 0), > > + .yrgb_vir = VOP_REG(RK3036_WIN1_VIR, 0x, 0), > > +}; > > + > > +static const struct vop_win_data rk3036_vop_win_data[] = { > > + { .base = 0x00, .phy = &rk3036_win0_data, > > + .type = DRM_PLANE_TYPE_PRIMARY }, > > + { .base = 0x00, .phy = &rk3036_win1_data, > > + .type = DRM_PLANE_TYPE_CURSOR }, > > +}; > > + > > +static const int rk3036_vop_intrs[] = { > > + DSP_HOLD_VALID_INTR, > > + FS_INTR, > > + LINE_FLAG_INTR, > > + BUS_ERROR_INTR, > > +}; > > + > > +static const struct vop_intr rk3036_intr = { > > + .intrs = rk3036_vop_intrs, > > + .nintrs = ARRAY_SIZE(rk3036_vop_intrs), > > + .status = VOP_REG(RK3036_INT_STATUS, 0xf, 0), > > + .enable = VOP_REG(RK3036_INT_STATUS, 0xf, 4), > > + .clear = VOP_REG(RK3036_INT_STATUS, 0xf, 8), > > +}; > > + > > +static const struct vop_ctrl rk3036_ctrl_data = { > > + .standby = VOP_REG(RK3036_SYS_CTRL, 0x1, 30), > > + .out_mode = VOP_REG(RK3036_DSP_CTRL0, 0xf, 0), > > + .pin_pol = VOP_REG(RK3036_DSP_CTRL0, 0xf, 4), > > + .htotal_pw = VOP_REG(RK3036_DSP_HTOTAL_HS_END, 0x1fff1fff, 0), > > + .hact_st_end = VOP_REG(RK3036_DSP_HACT_ST_END, 0x1fff1fff, 0), > > + .vtotal_pw = VOP_REG(RK3036_DSP_VTOTAL_VS_END, 0x1fff1fff, 0), > > + .vact_st_end = VOP_REG(RK3036_DSP_VACT_ST_END, 0x1fff1fff, 0), > > + .cfg_done = VOP_REG(RK3036_REG_CFG_DONE, 0x1, 0), > > +}; > > + > > +static const struct vop_reg_data rk3036_vop_init_reg_table[] = { > > + {RK3036_DSP_CTRL1, 0x}, > > +}; > > + > > +static const struct vop_data rk3036_vop = { > > + .init_table = rk3036_vop_init_reg_table, > > + .table_size = ARRAY_SIZE(rk3036_vop_init_reg_table), > > + .ctrl = &rk3036_ctrl_data, > > + .intr = &rk3036_intr, > > + .win = rk3036_vop_win_data, > > + .win_size = ARRAY_SIZE(rk3036_vop_win_data), > > +}; > > + > > static const struct vop_scl_extension rk3288_win_full_scl_ex
[Bug 97371] AMDGPU/Iceland amdgpu: failed testing IB on ring 9/10
https://bugs.freedesktop.org/show_bug.cgi?id=97371 Christian König changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #9 from Christian König --- Good, the patch should show up in the next rc. Thanks for testing, Christian. -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/ba1edcc1/attachment.html>
[Bug 97099] Regression in 9ef8537e6 "drm/radeon: don't use fractional dividers on RS[78]80 if SS is enabled" (RV620)
https://bugs.freedesktop.org/show_bug.cgi?id=97099 Christian König changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #6 from Christian König --- Thanks for testing. Yeah, it's especially tricky when the hardware is already a couple of years old and even the people who created it don't remember all the nasty details. The patch should show up in the next kernel release. -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/5dc00601/attachment-0001.html>
[PATCH v3 1/5] drm/rockchip: sort registers define by chip's number
On 2016å¹´08æ18æ¥ 17:11, Daniel Vetter wrote: > On Thu, Aug 18, 2016 at 05:08:14PM +0800, Mark yao wrote: >> >Hi Sean >> > >> >Thanks for send v3 patch for rk3399 vop support. >> > >> >But sorry for that, I had changed my mind, those patches are deprecated, >> >I have new rk3399 patch on my downstream kernel, I will upstream soon. > Wut? Imo merge Sean's patch here, and then rebase your downstream patches > on top of it. That you have a downstream tree which is out of sync with > upstream shouldn't be a reason to stall upstream development. > -Daniel > Yeah, Sorry for that. In fact, on my downstream kernel, also have those patches, my new rk3399 patches are based on them, but the new rk3399 patches will cover the those patches, Sean's patches is old version. I just want to fast forward, don't want to send two version drivers to upstream. but if you and Dave feel ok for that, I have no problem:-) . merged Sean's patches and then apply new version patches. Thanks. -- ï¼ark Yao
[Intel-gfx] [PATCH 1/4] drm: add picture aspect ratio flags
On 4 August 2016 at 17:09, Daniel Vetter wrote: > On Thu, Aug 04, 2016 at 03:31:45PM +0100, Emil Velikov wrote: >> On 4 August 2016 at 14:15, Sharma, Shashank >> wrote: >> > On 8/4/2016 5:04 PM, Emil Velikov wrote: >> >> >> >> On 4 August 2016 at 11:16, Sharma, Shashank >> >> wrote: >> >>> >> >>> Hello Emil, >> >>> >> >>> Thanks for your time. >> >>> >> >>> I have got mixed opinion on this. >> >>> >> >>> IMHO we should expose them to userspace too, as UI agents like Hardware >> >>> composer/X/Wayland must know what does these >> >>> >> >>> flags means, so that they can display them on the end user screen (like >> >>> settings menu) >> >>> >> >>> But few people even think thats its too complex to be exposed to >> >>> userspace >> >>> agents. >> >>> >> >> If we want these/such flags passed between kernel and user space one must: >> >> - Provide a kernel interface how to do that >> >> - Provide a userspace (acked by respective developers/maintainers) >> >> that the approach is sane and proves useful. >> >> >> >> Since the above can take some time, I'd suggest dropping those from >> >> the UAPI header(s)... for now. >> >> >> >> -Emil >> > >> > Please guide me a bit more on this problem, Emil, Daniel. >> > The reason why I want to pass this to userspace is, so that, HWC/X/any >> > other >> > UI manager, can send a modeset >> > which is accurate upto aspect ratio. >> > >> Nobody(?) is arguing that you don't to pass such information to/from >> userspace :-) >> Your series does the internal parsing/management of the attribute and >> has no actual UAPI implementation and/or userspace references (to >> code/discussions). Thus the UAPI changes should _not_ be part of these >> patches. >> >> Daniel's blog [1] has many educational materials why and how things >> are done upstream. I would kindly invite you to give them (another?) >> courtesy read. > > It reuses the already existing uapi mode structure. And since it extends > them both on the probe side and on the modeset set this means userspace > can just pass through the right probed mode and it'll all magically work. > At least that's the idea. > > Now if you actually care about the different bits then you can select the > right mode, but that's about it. So if you want to compensate for the > non-square pixels in rendering then you need to look at it, but otherwise > it's just a case of select the mode you want. I don't see what new > userspace we'd need for that really, current one should all work nicely > as-is. At least the entire point of doing this was seamless support with > existing userspace. I failed to click that drm_mode_convert_umode does input validation, apart from the actual conversion. Thanks a lot gents and sorry for the noise. Emil
Use of copy_from_user in msm_gem_submit.c while holding a spin_lock
On Thu, Aug 18, 2016 at 4:36 AM, Daniel Vetter wrote: > On Wed, Aug 17, 2016 at 05:29:31PM -0400, Rob Clark wrote: >> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c >> index 6cd4af4..4502e4b 100644 >> --- a/drivers/gpu/drm/msm/msm_gem.c >> +++ b/drivers/gpu/drm/msm/msm_gem.c >> @@ -201,6 +201,13 @@ int msm_gem_fault(struct vm_area_struct *vma, >> struct vm_fault *vmf) >> pgoff_t pgoff; >> int ret; >> >> +/* I think this should only happen if userspace tries to pass a >> + * mmap'd but unfaulted gem bo vaddr into submit ioctl, triggering >> + * a page fault while struct_mutex is already held >> + */ >> +if (mutex_is_locked_by(&dev->struct_mutex, current)) >> +return VM_FAULT_SIGBUS; > > This is an ok (well still horrible) heuristics for the shrinker, but for > correctness it kinda doesn't cut it. What you need to do instead is drop > all the locks, copy relocations into a temp memory area and then proceed > in the msm command submission path above. > > Also reentrant mutexes are evil ;-) Please note that this is not a reentrant mutex in the fault path, it bails with VM_FAULT_SIGBUG! There is never a legit reason to use a gem bo for the bos (or cmds) table in the ioctl, so while this may not be pretty, I believe it is an acceptable solution. BR, -R
[PATCH 7/9] drm: Extract drm_property.[hc]
Hi Daniel, On 17 August 2016 at 21:56, Daniel Vetter wrote: > --- /dev/null > +++ b/include/drm/drm_property.h > +#ifndef __DRM_PROPERTY_H__ > +#define __DRM_PROPERTY_H__ > + > +#include > +#include > +#include > + Add the following fwd declaration since we use a pointer to the said struct ? From a brief look the other newly introduced headers could/should use it as well. struct drm_device; The declarations in include/drm should be the ones meant for drivers and there's no core drm internal ones ? Where/how does one manage API that should be kept private between the different core DRM components, even if there's symbol dependency between the different modules ? Thanks ! Emil
[PATCH] staging/android: add Doc for SW_SYNC ioctl interface
On Thu 2016-08-11 13:45:53, Gustavo Padovan wrote: > From: Gustavo Padovan > > This interface is hidden from kernel headers and it is intended for use > only for testing. So testers would have to add the ioctl information > internally. This is to prevent misuse of this feature. > > v2: take in Eric suggestions for the Documentation > > v3: really take in Eric suggestions > > Signed-off-by: Gustavo Padovan > Reviewed-by: Eric Engestrom Acked-by: Pavel Machek -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
[Bug 97039] The Talos Principle and Serious Sam 3 GPU faults
https://bugs.freedesktop.org/show_bug.cgi?id=97039 --- Comment #14 from Marek Olšák --- Thanks. I know where the problem is and I'm working on it. Just FYI, the VM fault is completely harmless. -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/68b32006/attachment.html>
Use of copy_from_user in msm_gem_submit.c while holding a spin_lock
On Thu, Aug 18, 2016 at 06:55:12AM -0400, Rob Clark wrote: > On Thu, Aug 18, 2016 at 4:36 AM, Daniel Vetter wrote: > > On Wed, Aug 17, 2016 at 05:29:31PM -0400, Rob Clark wrote: > >> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c > >> index 6cd4af4..4502e4b 100644 > >> --- a/drivers/gpu/drm/msm/msm_gem.c > >> +++ b/drivers/gpu/drm/msm/msm_gem.c > >> @@ -201,6 +201,13 @@ int msm_gem_fault(struct vm_area_struct *vma, > >> struct vm_fault *vmf) > >> pgoff_t pgoff; > >> int ret; > >> > >> +/* I think this should only happen if userspace tries to pass a > >> + * mmap'd but unfaulted gem bo vaddr into submit ioctl, triggering > >> + * a page fault while struct_mutex is already held > >> + */ > >> +if (mutex_is_locked_by(&dev->struct_mutex, current)) > >> +return VM_FAULT_SIGBUS; > > > > This is an ok (well still horrible) heuristics for the shrinker, but for > > correctness it kinda doesn't cut it. What you need to do instead is drop > > all the locks, copy relocations into a temp memory area and then proceed > > in the msm command submission path above. > > > > Also reentrant mutexes are evil ;-) > > Please note that this is not a reentrant mutex in the fault path, it > bails with VM_FAULT_SIGBUG! Except on UP it totally deadlocks ;-) -Daniel > There is never a legit reason to use a gem bo for the bos (or cmds) > table in the ioctl, so while this may not be pretty, I believe it is > an acceptable solution. -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
[PATCH 7/9] drm: Extract drm_property.[hc]
On Thu, Aug 18, 2016 at 12:11:35PM +0100, Emil Velikov wrote: > Hi Daniel, > > On 17 August 2016 at 21:56, Daniel Vetter wrote: > > --- /dev/null > > +++ b/include/drm/drm_property.h > > > +#ifndef __DRM_PROPERTY_H__ > > +#define __DRM_PROPERTY_H__ > > + > > +#include > > +#include > > +#include > > + > Add the following fwd declaration since we use a pointer to the said > struct ? From a brief look the other newly introduced headers > could/should use it as well. > > struct drm_device; tbh this is only a half-useful attempt at untangling the header loop mess. Once drm_crtc.[hc] is fully split I guess we can look into what makes sense. I'll definitely be happy to review patches, but personally I don't mind loops in headers much. It's annoying, but as long as things are reasonably split and it's possible to untangle at least the code/structures and documentation itself I'm happy. Wrt all things drm_device: I'm not sure when (if ever) I'll cough up the courage to split up and untangle drmP.h ;-) > The declarations in include/drm should be the ones meant for drivers > and there's no core drm internal ones ? Where/how does one manage API > that should be kept private between the different core DRM components, > even if there's symbol dependency between the different modules ? drm_crtc_helper_internal.h, drm_crtc_internal.h and drm_internal.h, all in drivers/gpu/drm/ Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Use of copy_from_user in msm_gem_submit.c while holding a spin_lock
On Thu, Aug 18, 2016 at 9:08 AM, Daniel Vetter wrote: > On Thu, Aug 18, 2016 at 06:55:12AM -0400, Rob Clark wrote: >> On Thu, Aug 18, 2016 at 4:36 AM, Daniel Vetter wrote: >> > On Wed, Aug 17, 2016 at 05:29:31PM -0400, Rob Clark wrote: >> >> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c >> >> index 6cd4af4..4502e4b 100644 >> >> --- a/drivers/gpu/drm/msm/msm_gem.c >> >> +++ b/drivers/gpu/drm/msm/msm_gem.c >> >> @@ -201,6 +201,13 @@ int msm_gem_fault(struct vm_area_struct *vma, >> >> struct vm_fault *vmf) >> >> pgoff_t pgoff; >> >> int ret; >> >> >> >> +/* I think this should only happen if userspace tries to pass a >> >> + * mmap'd but unfaulted gem bo vaddr into submit ioctl, triggering >> >> + * a page fault while struct_mutex is already held >> >> + */ >> >> +if (mutex_is_locked_by(&dev->struct_mutex, current)) >> >> +return VM_FAULT_SIGBUS; >> > >> > This is an ok (well still horrible) heuristics for the shrinker, but for >> > correctness it kinda doesn't cut it. What you need to do instead is drop >> > all the locks, copy relocations into a temp memory area and then proceed >> > in the msm command submission path above. >> > >> > Also reentrant mutexes are evil ;-) >> >> Please note that this is not a reentrant mutex in the fault path, it >> bails with VM_FAULT_SIGBUG! > > Except on UP it totally deadlocks ;-) except UP does not exist ;-) like I mentioned, I can add 'depends SMP' to kconfig BR, -R > -Daniel > >> There is never a legit reason to use a gem bo for the bos (or cmds) >> table in the ioctl, so while this may not be pretty, I believe it is >> an acceptable solution. > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
[Bug 97039] The Talos Principle and Serious Sam 3 GPU faults
https://bugs.freedesktop.org/show_bug.cgi?id=97039 --- Comment #15 from smoki --- It is not harmless, as lower end machine is (APU) then you even lost sort of 20% with these constant non harmeless messAgess on top of already low perf. Also i obesrved one random GPU lockup if continue playing Serious Sam 3 with it, but let just say that is separate issue for now. -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/dcb6debc/attachment.html>
[PATCH v12 2/7] drm/i915/skl: Add support for the SAGV, fix underrun hangs
On Thu, 2016-08-18 at 09:39 +0200, Maarten Lankhorst wrote: > Hey, > > Op 17-08-16 om 21:55 schreef Lyude: > > > > Since the watermark calculations for Skylake are still broken, we're apt > > to hitting underruns very easily under multi-monitor configurations. > > While it would be lovely if this was fixed, it's not. Another problem > > that's been coming from this however, is the mysterious issue of > > underruns causing full system hangs. An easy way to reproduce this with > > a skylake system: > > > > - Get a laptop with a skylake GPU, and hook up two external monitors to > >  it > > - Move the cursor from the built-in LCD to one of the external displays > >  as quickly as you can > > - You'll get a few pipe underruns, and eventually the entire system will > >  just freeze. > > > > After doing a lot of investigation and reading through the bspec, I > > found the existence of the SAGV, which is responsible for adjusting the > > system agent voltage and clock frequencies depending on how much power > > we need. According to the bspec: > > > > "The display engine access to system memory is blocked during the > >  adjustment time. SAGV defaults to enabled. Software must use the > >  GT-driver pcode mailbox to disable SAGV when the display engine is not > >  able to tolerate the blocking time." > > > > The rest of the bspec goes on to explain that software can simply leave > > the SAGV enabled, and disable it when we use interlaced pipes/have more > > then one pipe active. > > > > Sure enough, with this patchset the system hangs resulting from pipe > > underruns on Skylake have completely vanished on my T460s. Additionally, > > the bspec mentions turning off the SAGV with more then one pipe > > enabled > > as a workaround for display underruns. While this patch doesn't entirely > > fix that, it looks like it does improve the situation a little bit so > > it's likely this is going to be required to make watermarks on Skylake > > fully functional. > > > > This will still need additional work in the future: we shouldn't be > > enabling the SAGV if any of the currently enabled planes can't enable WM > > levels that introduce latencies >= 30 µs. > > > > Changes since v11: > >  - Add skl_can_enable_sagv() > >  - Make sure we don't enable SAGV when not all planes can enable > >    watermarks >= the SAGV engine block time. I was originally going to > >    save this for later, but I recently managed to run into a machine > >    that was having problems with a single pipe configuration + SAGV. > >  - Make comparisons to I915_SKL_SAGV_NOT_CONTROLLED explicit > >  - Change I915_SAGV_DYNAMIC_FREQ to I915_SAGV_ENABLE > >  - Move printks outside of mutexes > >  - Don't print error messages twice > > Changes since v10: > >  - Apparently sandybridge_pcode_read actually writes values and reads > >    them back, despite it's misleading function name. This means we've > >    been doing this mostly wrong and have been writing garbage to the > >    SAGV control. Because of this, we no longer attempt to read the SAGV > >    status during initialization (since there are no helpers for this). > >  - mlankhorst noticed that this patch was breaking on some very early > >    pre-release Skylake machines, which apparently don't allow you to > >    disable the SAGV. To prevent machines from failing tests due to SAGV > >    errors, if the first time we try to control the SAGV results in the > >    mailbox indicating an invalid command, we just disable future attempts > >    to control the SAGV state by setting dev_priv->skl_sagv_status to > >    I915_SKL_SAGV_NOT_CONTROLLED and make a note of it in dmesg. > >  - Move mutex_unlock() a little higher in skl_enable_sagv(). This > >    doesn't actually fix anything, but lets us release the lock a little > >    sooner since we're finished with it. > > Changes since v9: > >  - Only enable/disable sagv on Skylake > > Changes since v8: > >  - Add intel_state->modeset guard to the conditional for > >    skl_enable_sagv() > > Changes since v7: > >  - Remove GEN9_SAGV_LOW_FREQ, replace with GEN9_SAGV_IS_ENABLED (that's > >    all we use it for anyway) > >  - Use GEN9_SAGV_IS_ENABLED instead of 0x1 for clarification > >  - Fix a styling error that snuck past me > > Changes since v6: > >  - Protect skl_enable_sagv() with intel_state->modeset conditional in > >    intel_atomic_commit_tail() > > Changes since v5: > >  - Don't use is_power_of_2. Makes things confusing > >  - Don't use the old state to figure out whether or not to > >    enable/disable the sagv, use the new one > >  - Split the loop in skl_disable_sagv into it's own function > >  - Move skl_sagv_enable/disable() calls into intel_atomic_commit_tail() > > Changes since v4: > >  - Use is_power_of_2 against active_crtcs to check whether we have > 1 > >    pipe enabled > >  - Fix skl_sagv_get_hw_state(): (temp & 0x1) indicates disabled, 0x0 >
[PATCH 4.4 088/138] drm: Restore double clflush on the last partial cacheline
4.4-stable review patch. If anyone has any objections, please let me know. -- From: Chris Wilson commit 396f5d62d1a5fd99421855a08ffdef8edb43c76e upstream. This effectively reverts commit afcd950cafea6e27b739fe7772cbbeed37d05b8b Author: Chris Wilson Date: Wed Jun 10 15:58:01 2015 +0100 drm: Avoid the double clflush on the last cache line in drm_clflush_virt_range() as we have observed issues with serialisation of the clflush operations on Baytrail+ Atoms with partial updates. Applying the double flush on the last cacheline forces that clflush to be ordered with respect to the previous clflush, and the mfence then protects against prefetches crossing the clflush boundary. The same issue can be demonstrated in userspace with igt/gem_exec_flush. Fixes: afcd950cafea6 (drm: Avoid the double clflush on the last cache...) Testcase: igt/gem_concurrent_blit Testcase: igt/gem_partial_pread_pwrite Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92845 Signed-off-by: Chris Wilson Cc: dri-devel at lists.freedesktop.org Cc: Akash Goel Cc: Imre Deak Cc: Daniel Vetter Cc: Jason Ekstrand Reviewed-by: Mika Kuoppala Signed-off-by: Daniel Vetter Link: http://patchwork.freedesktop.org/patch/msgid/1467880930-23082-6-git-send-email-chris at chris-wilson.co.uk Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/drm_cache.c |1 + 1 file changed, 1 insertion(+) --- a/drivers/gpu/drm/drm_cache.c +++ b/drivers/gpu/drm/drm_cache.c @@ -136,6 +136,7 @@ drm_clflush_virt_range(void *addr, unsig mb(); for (; addr < end; addr += size) clflushopt(addr); + clflushopt(end - 1); /* force serialisation */ mb(); return; }
[Bug 97039] The Talos Principle and Serious Sam 3 GPU faults
https://bugs.freedesktop.org/show_bug.cgi?id=97039 --- Comment #16 from Jan Ziak <0xe2.0x9a.0x9b at gmail.com> --- (In reply to smoki from comment #15) > It is not harmless, as lower end machine is (APU) then you even lost sort > of 20% with these constant non harmless messages on top of already low > perf. Of course, I (and probably also you) will do a performance measurement after Marek's patch is available. (There is lot of work to be done in Mesa to make it perform generally better and approach Nvidia Linux performance. Considering the amount of work required I do not expect it to materialize this year.) -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/d6736f7e/attachment.html>
[PATCH 4.7 132/186] drm: Restore double clflush on the last partial cacheline
4.7-stable review patch. If anyone has any objections, please let me know. -- From: Chris Wilson commit 396f5d62d1a5fd99421855a08ffdef8edb43c76e upstream. This effectively reverts commit afcd950cafea6e27b739fe7772cbbeed37d05b8b Author: Chris Wilson Date: Wed Jun 10 15:58:01 2015 +0100 drm: Avoid the double clflush on the last cache line in drm_clflush_virt_range() as we have observed issues with serialisation of the clflush operations on Baytrail+ Atoms with partial updates. Applying the double flush on the last cacheline forces that clflush to be ordered with respect to the previous clflush, and the mfence then protects against prefetches crossing the clflush boundary. The same issue can be demonstrated in userspace with igt/gem_exec_flush. Fixes: afcd950cafea6 (drm: Avoid the double clflush on the last cache...) Testcase: igt/gem_concurrent_blit Testcase: igt/gem_partial_pread_pwrite Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92845 Signed-off-by: Chris Wilson Cc: dri-devel at lists.freedesktop.org Cc: Akash Goel Cc: Imre Deak Cc: Daniel Vetter Cc: Jason Ekstrand Reviewed-by: Mika Kuoppala Signed-off-by: Daniel Vetter Link: http://patchwork.freedesktop.org/patch/msgid/1467880930-23082-6-git-send-email-chris at chris-wilson.co.uk Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/drm_cache.c |1 + 1 file changed, 1 insertion(+) --- a/drivers/gpu/drm/drm_cache.c +++ b/drivers/gpu/drm/drm_cache.c @@ -136,6 +136,7 @@ drm_clflush_virt_range(void *addr, unsig mb(); for (; addr < end; addr += size) clflushopt(addr); + clflushopt(end - 1); /* force serialisation */ mb(); return; }
[Bug 97039] The Talos Principle and Serious Sam 3 GPU faults
https://bugs.freedesktop.org/show_bug.cgi?id=97039 --- Comment #17 from smoki --- I already do measurement month ago, when i bisected this. I would like to have 2core Temash APU to show even bigger issue that is i think highly recommended for perf measurements, but Kabini is slowest i have :D If i don't care i wouldn't be here, otherwise i would run Pascal Titan X with blob and be as ignorant as i can. But no i am not, not me. Here i actually don't pretend or push on higher perf at all, but opposite - just things to not regress this much... so it is small regression testing contribution from me. -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/e3b172d9/attachment.html>
[RESEND PATCH v2 1/3] drm: dw_hdmi: use of_get_i2c_adapter_by_node interface
On Tue, Aug 16, 2016 at 11:26:43PM +0300, Vladimir Zapolskiy wrote: > This change is needed to properly lock I2C bus driver, which serves > DDC. > > The change fixes an overflow over zero of I2C bus driver user counter: > > root at imx6q:~# lsmod > Not tainted > dw_hdmi_ahb_audio 4082 0 - Live 0xbf02c000 > dw_hdmi_imx 3498 0 - Live 0xbf00d000 > dw_hdmi 16398 2 dw_hdmi_ahb_audio,dw_hdmi_imx, Live 0xbf004000 > i2c_imx 16687 0 - Live 0xbf017000 > > root at imx6q:~# rmmod dw_hdmi_imx > root at imx6q:~# lsmod > Not tainted > dw_hdmi_ahb_audio 4082 0 - Live 0xbf02c000 > dw_hdmi 16398 1 dw_hdmi_ahb_audio, Live 0xbf004000 > i2c_imx 16687 -1 - Live 0xbf017000 > ^^ > > root at imx6q:~# rmmod i2c_imx > rmmod: ERROR: Module i2c_imx is in use > > Note that prior to this change put_device() coupled with > of_find_i2c_adapter_by_node() was missing on error path of > dw_hdmi_bind(), added i2c_put_adapter() there along with the change. I *guess* this is the right thing, but it would be nice to see the results with the patch applied in the commit description. Nevertheless: Acked-by: Russell King I'd also like to see the DDC bits extracted from the various imx drivers, because this is surely common code - I've had this floating around for a few years but never got around to finishing it off and submitting it. If folk think this is a good idea, and want to take the idea on, that's fine by me. drivers/gpu/drm/Makefile| 2 + drivers/gpu/drm/bridge/dw-hdmi.c| 98 + drivers/gpu/drm/drm_ddc_connector.c | 91 ++ drivers/gpu/drm/imx/imx-tve.c | 59 ++ include/drm/drm_ddc_connector.h | 33 + 5 files changed, 163 insertions(+), 120 deletions(-) diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 0238bf8bc8c3..e31c72f86f8c 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -21,6 +21,8 @@ drm-$(CONFIG_DRM_PANEL) += drm_panel.o drm-$(CONFIG_OF) += drm_of.o drm-$(CONFIG_AGP) += drm_agpsupport.o +drm-y += drm_ddc_connector.o + drm_kms_helper-y := drm_crtc_helper.o drm_dp_helper.o drm_probe_helper.o \ drm_plane_helper.o drm_dp_mst_topology.o drm_atomic_helper.o \ drm_kms_helper_common.o drm_dp_dual_mode_helper.o \ diff --git a/drivers/gpu/drm/bridge/dw-hdmi.c b/drivers/gpu/drm/bridge/dw-hdmi.c index 65dd102689ef..786c0c076585 100644 --- a/drivers/gpu/drm/bridge/dw-hdmi.c +++ b/drivers/gpu/drm/bridge/dw-hdmi.c @@ -27,6 +27,7 @@ #include #include #include +#include #include #include "dw-hdmi.h" @@ -103,7 +104,7 @@ struct hdmi_data_info { }; struct dw_hdmi { - struct drm_connector connector; + struct drm_ddc_connector *ddc_conn; struct drm_encoder *encoder; struct drm_bridge *bridge; @@ -124,10 +125,7 @@ struct dw_hdmi { bool phy_enabled; struct drm_display_mode previous_mode; - struct i2c_adapter *ddc; void __iomem *regs; - bool sink_is_hdmi; - bool sink_has_audio; struct mutex mutex; /* for state below and previous_mode */ enum drm_connector_force force; /* mutex-protected force state */ @@ -862,7 +860,7 @@ static int dw_hdmi_phy_init(struct dw_hdmi *hdmi) bool cscon; /*check csc whether needed activated in HDMI mode */ - cscon = hdmi->sink_is_hdmi && is_color_space_conversion(hdmi); + cscon = hdmi->ddc_conn->sink_is_hdmi && is_color_space_conversion(hdmi); /* HDMI Phy spec says to do the phy initialization sequence twice */ for (i = 0; i < 2; i++) { @@ -1039,7 +1037,7 @@ static void hdmi_av_composer(struct dw_hdmi *hdmi, HDMI_FC_INVIDCONF_IN_I_P_INTERLACED : HDMI_FC_INVIDCONF_IN_I_P_PROGRESSIVE; - inv_val |= hdmi->sink_is_hdmi ? + inv_val |= hdmi->ddc_conn->sink_is_hdmi ? HDMI_FC_INVIDCONF_DVI_MODEZ_HDMI_MODE : HDMI_FC_INVIDCONF_DVI_MODEZ_DVI_MODE; @@ -1238,7 +1236,7 @@ static int dw_hdmi_setup(struct dw_hdmi *hdmi, struct drm_display_mode *mode) /* HDMI Initialization Step B.3 */ dw_hdmi_enable_video_path(hdmi); - if (hdmi->sink_has_audio) { + if (hdmi->ddc_conn->sink_has_audio) { dev_dbg(hdmi->dev, "sink has audio support\n"); /* HDMI Initialization Step E - Configure audio */ @@ -1262,7 +1260,7 @@ static int dw_hdmi_setup(struct dw_hdmi *hdmi, struct drm_display_mode *mode) hdmi_tx_hdcp_config(hdmi); dw_hdmi_clear_overflow(hdmi); - if (hdmi->cable_plugin && hdmi->sink_is_hdmi) + if (hdmi->cable_plugin && hdmi->ddc_conn->sink_is_hdmi) hdmi_enable_overflow_interrupts(hdmi); return 0; @@ -1438,8 +1436,7 @@ static void dw_hdmi_bridge_enable(struct drm_bridge *bridge) static enum drm_connector_status dw_hdmi_connector_detect(struct
[PATCH] drm: bridge/dw-hdmi: Fix colorspace and scan information registers values
Colorspace and scan information values were being written in wrong offsets. This patch corrects this and writes the values at the offsets specified in the databook. Signed-off-by: Jose Abreu Cc: Carlos Palminha Cc: David Airlie Cc: Russell King Cc: Daniel Vetter Cc: dri-devel at lists.freedesktop.org Cc: linux-kernel at vger.kernel.org --- drivers/gpu/drm/bridge/dw-hdmi.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/bridge/dw-hdmi.c b/drivers/gpu/drm/bridge/dw-hdmi.c index 77ab473..cdf39aa 100644 --- a/drivers/gpu/drm/bridge/dw-hdmi.c +++ b/drivers/gpu/drm/bridge/dw-hdmi.c @@ -940,10 +940,11 @@ static void hdmi_config_AVI(struct dw_hdmi *hdmi, struct drm_display_mode *mode) */ /* -* AVI data byte 1 differences: Colorspace in bits 4,5 rather than 5,6, -* active aspect present in bit 6 rather than 4. +* AVI data byte 1 differences: Colorspace in bits 0,1 rather than 5,6, +* scan info in bits 4,5 rather than 0,1 and active aspect present in +* bit 6 rather than 4. */ - val = (frame.colorspace & 3) << 4 | (frame.scan_mode & 0x3); + val = (frame.scan_mode & 3) << 4 | (frame.colorspace & 3); if (frame.active_aspect & 15) val |= HDMI_FC_AVICONF0_ACTIVE_FMT_INFO_PRESENT; if (frame.top_bar || frame.bottom_bar) -- 2.1.4
[PATCH v2 0/4] Picture aspect ratio support in DRM layer
Hi, On 09-08-2016 15:55, Shashank Sharma wrote: > This patch series adds 4 patches. > - The first two patches add aspect ratio support in DRM layes > - Next two patches add new aspect ratios defined in CEA-861-F > supported for HDMI 2.0 4k modes. > > Adding aspect ratio support in DRM layer: > - The CEA videmodes contain aspect ratio information, which we > parse when we read the modes from EDID. But while transforming > user_mode to kernel_mode or viceversa, DRM layer lose this > information. > - HDMI compliance testing for CEA modes, expects the AVI info frames > to contain exact VIC no for the 'video mode under test'. Now CEA > modes have different VIC for same modes but different aspect ratio > for example: > VIC 2 = 720x480 at 60 4:3 > VIC 3 = 720x480 at 60 16:9 > In this way, lack of aspect ratio information, can cause wrong VIC > no in AVI IF, causing HDMI complaince test to fail. > - This patch set adds code, which embeds the aspect ratio information > also in DRM video mode flags, and uses it while comparing two modes. > > Adding new aspect ratios for HDMI 2.0 > - CEA-861-F defines two new aspect ratios, to be used for 4k HDMI 2.0 > modes. > - 64:27 > - 256:135 > Last two patches in the series, adds code to handle these new > aspect ratios. > > V2: Fixed review comments from Sean, Emil, Daniel > > Shashank Sharma (4): > drm: add picture aspect ratio flags > drm: Add aspect ratio parsing in DRM layer > video: Add new aspect ratios for HDMI 2.0 > drm: Add and handle new aspect ratios in DRM layer I am using these patches to run HDMI 2.0 compliance so: Tested-by: Jose Abreu I also have code ready that does the EDID parsing of HDMI 2.0 blocks (HF-VSDB, 4:2:0 VDB and 4:2:0 VCB). The code was tested and validated against bridge driver dw-hdmi resorting to HDMI compliance equipment. Besides the parsing I also added support for YCbCr 4:2:0 encoding. Still, to send the patches I need this series to get accepted first. > > drivers/gpu/drm/drm_modes.c | 43 +++ > drivers/video/hdmi.c| 4 > include/linux/hdmi.h| 2 ++ > include/uapi/drm/drm_mode.h | 24 +++- > 4 files changed, 68 insertions(+), 5 deletions(-) > Best regards, Jose Miguel Abreu
Fw: Re: BUG : unable to handle kernel NULL pointer dereference at 00000148 on a ThinkCentre
Best, Rares Original Message Subject: Re: BUG : unable to handle kernel NULL pointer dereference at 0148 on a ThinkCentre Local Time: August 18, 2016 5:59 AM UTC Time: August 18, 2016 2:59 AM From: airl...@linux.ie To: schbax at protonmail.ch Hi, can you post this to dri-devel at lists.freedesktop.org? Thanks, Dave. On Sun, 14 Aug 2016, Rares Aioanei wrote: > Hey, > > While booting 4.8.0-rc1-00282-g118253a I noticed that my monitor started > displaying "Not Optimum Mode Recommended Mode : > 1440x900 60Hz" - monitor is Samsung SyncMaster 923nw. Looking at the dmesg > (attached) via ssh, I see that the kernel > crashes. The distribution is Debian staqble/testing. Perhaps note-worthy is > the fact that kernel > 4.7.0-rc2-00300-g3d0f0b6 and before work fine. If there is anything else, > please let me know. > > Best, > Rares > > > > -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/d96d05fe/attachment-0001.html>
[PATCH] drm: bridge/dw-hdmi: Fix colorspace and scan information registers values
On Thu, Aug 18, 2016 at 03:34:12PM +0100, Jose Abreu wrote: > Colorspace and scan information values were being written in wrong > offsets. This patch corrects this and writes the values at the > offsets specified in the databook. > > Signed-off-by: Jose Abreu > Cc: Carlos Palminha > Cc: David Airlie > Cc: Russell King > Cc: Daniel Vetter > Cc: dri-devel at lists.freedesktop.org > Cc: linux-kernel at vger.kernel.org Acked-by: Russell King Thanks. > --- > drivers/gpu/drm/bridge/dw-hdmi.c | 7 --- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/bridge/dw-hdmi.c > b/drivers/gpu/drm/bridge/dw-hdmi.c > index 77ab473..cdf39aa 100644 > --- a/drivers/gpu/drm/bridge/dw-hdmi.c > +++ b/drivers/gpu/drm/bridge/dw-hdmi.c > @@ -940,10 +940,11 @@ static void hdmi_config_AVI(struct dw_hdmi *hdmi, > struct drm_display_mode *mode) >*/ > > /* > - * AVI data byte 1 differences: Colorspace in bits 4,5 rather than 5,6, > - * active aspect present in bit 6 rather than 4. > + * AVI data byte 1 differences: Colorspace in bits 0,1 rather than 5,6, > + * scan info in bits 4,5 rather than 0,1 and active aspect present in > + * bit 6 rather than 4. >*/ > - val = (frame.colorspace & 3) << 4 | (frame.scan_mode & 0x3); > + val = (frame.scan_mode & 3) << 4 | (frame.colorspace & 3); > if (frame.active_aspect & 15) > val |= HDMI_FC_AVICONF0_ACTIVE_FMT_INFO_PRESENT; > if (frame.top_bar || frame.bottom_bar) > -- > 2.1.4 > -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
IGT on ARM - Build # 137 - Failure!
IGT on ARM - Build # 137 - Failure: Check console output at https://jenkins.freedesktop.org/job/IGT-ARM/137/ to view the results. -- next part -- A non-text attachment was scrubbed... Name: build.log Type: application/octet-stream Size: 1275211 bytes Desc: not available URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/cf75ea45/attachment-0001.obj>
[Bug 97025] flip queue failed: Device or resource busy
https://bugs.freedesktop.org/show_bug.cgi?id=97025 --- Comment #13 from Bernd Steinhauser --- One more remark: I've only observed the effect when the OpenGL 3.1 compositing backend in kwin is active. I tested with OpenGL 2 backend over the last week and have not seen this happening since. I should also mention that I've had the egl interface activated, which is not recommended for kwin. I've not had issues with it before, but it could be related, so the next thing I'm testing is glx/OpenGL 3.1 and hope I can narrow this down this way. -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/94f36a74/attachment.html>
[PATCH 2/2] drm/rockchip: Use DRM_DEV_ERROR in vop
On Thu, Aug 18, 2016 at 1:53 AM, Mark yao wrote: > On 2016å¹´08æ13æ¥ 01:00, Sean Paul wrote: >> >> Since we can have multiple vops, use DRM_DEV_ERROR to >> make logs easier to process. >> >> Signed-off-by: Sean Paul > > > looks good for me > > Acked-by: Mark Yao > Applied to drm-misc >> --- >> drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 24 >> ++-- >> 1 file changed, 14 insertions(+), 10 deletions(-) >> >> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >> b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >> index 31744fe..ec8ad00 100644 >> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >> @@ -238,7 +238,7 @@ static enum vop_data_format >> vop_convert_format(uint32_t format) >> case DRM_FORMAT_NV24: >> return VOP_FMT_YUV444SP; >> default: >> - DRM_ERROR("unsupport format[%08x]\n", format); >> + DRM_ERROR("unsupported format[%08x]\n", format); >> return -EINVAL; >> } >> } >> @@ -315,7 +315,7 @@ static void scl_vop_cal_scl_fac(struct vop *vop, const >> struct vop_win_data *win, >> int vskiplines = 0; >> if (dst_w > 3840) { >> - DRM_ERROR("Maximum destination width (3840) exceeded\n"); >> + DRM_DEV_ERROR(vop->dev, "Maximum dst width (3840) >> exceeded\n"); >> return; >> } >> @@ -353,11 +353,11 @@ static void scl_vop_cal_scl_fac(struct vop *vop, >> const struct vop_win_data *win, >> VOP_SCL_SET_EXT(vop, win, lb_mode, lb_mode); >> if (lb_mode == LB_RGB_3840X2) { >> if (yrgb_ver_scl_mode != SCALE_NONE) { >> - DRM_ERROR("ERROR : not allow yrgb ver scale\n"); >> + DRM_DEV_ERROR(vop->dev, "not allow yrgb ver >> scale\n"); >> return; >> } >> if (cbcr_ver_scl_mode != SCALE_NONE) { >> - DRM_ERROR("ERROR : not allow cbcr ver scale\n"); >> + DRM_DEV_ERROR(vop->dev, "not allow cbcr ver >> scale\n"); >> return; >> } >> vsu_mode = SCALE_UP_BIL; >> @@ -970,7 +970,8 @@ static void vop_crtc_enable(struct drm_crtc *crtc) >> VOP_CTRL_SET(vop, mipi_en, 1); >> break; >> default: >> - DRM_ERROR("unsupport connector_type[%d]\n", >> s->output_type); >> + DRM_DEV_ERROR(vop->dev, "unsupported connector_type >> [%d]\n", >> + s->output_type); >> } >> VOP_CTRL_SET(vop, out_mode, s->output_mode); >> @@ -1154,7 +1155,8 @@ static irqreturn_t vop_isr(int irq, void *data) >> /* Unhandled irqs are spurious. */ >> if (active_irqs) >> - DRM_ERROR("Unknown VOP IRQs: %#02x\n", active_irqs); >> + DRM_DEV_ERROR(vop->dev, "Unknown VOP IRQs: %#02x\n", >> + active_irqs); >> return ret; >> } >> @@ -1189,7 +1191,8 @@ static int vop_create_crtc(struct vop *vop) >>win_data->phy->nformats, >>win_data->type, NULL); >> if (ret) { >> - DRM_ERROR("failed to initialize plane\n"); >> + DRM_DEV_ERROR(vop->dev, "failed to init plane >> %d\n", >> + ret); >> goto err_cleanup_planes; >> } >> @@ -1227,7 +1230,8 @@ static int vop_create_crtc(struct vop *vop) >>win_data->phy->nformats, >>win_data->type, NULL); >> if (ret) { >> - DRM_ERROR("failed to initialize overlay plane\n"); >> + DRM_DEV_ERROR(vop->dev, "failed to init overlay >> %d\n", >> + ret); >> goto err_cleanup_crtc; >> } >> drm_plane_helper_add(&vop_win->base, &plane_helper_funcs); >> @@ -1235,8 +1239,8 @@ static int vop_create_crtc(struct vop *vop) >> port = of_get_child_by_name(dev->of_node, "port"); >> if (!port) { >> - DRM_ERROR("no port node found in %s\n", >> - dev->of_node->full_name); >> + DRM_DEV_ERROR(vop->dev, "no port node found in %s\n", >> + dev->of_node->full_name); >> ret = -ENOENT; >> goto err_cleanup_crtc; >> } > > > > -- > ï¼ark Yao > >
[PATCH v4] drm: Introduce DRM_DEV_* log messages
On Tue, Aug 16, 2016 at 9:56 AM, Eric Engestrom wrote: > On Tue, Aug 16, 2016 at 09:18:34AM -0700, Sean Paul wrote: >> On Tue, Aug 16, 2016 at 5:28 AM, Eric Engestrom >> wrote: >> > On Mon, Aug 15, 2016 at 04:18:04PM -0700, Sean Paul wrote: >> >> This patch consolidates all the various log functions/macros into >> >> one uber function, drm_log. It also introduces some new DRM_DEV_* >> >> variants that print the device name to delineate multiple devices >> >> of the same type. >> >> >> >> Signed-off-by: Sean Paul >> >> --- >> >> >> >> Changes in v2: >> >> - Use dev_printk for the dev variant (Chris Wilson) >> >> >> >> Changes in v3: >> >> - Rename drm_log to drm_dev_printk (Chris Wilson) >> >> - Break out drm_printk from drm_dev_printk to reduce >> >> image growth due to passing NULL around (Chris Wilson) >> >> >> >> Changes in v4: >> >> - Pull format string out into #define (Eric Engestrom) >> >> >> >> >> >> drivers/gpu/drm/drm_drv.c | 27 ++--- >> >> include/drm/drmP.h| 140 >> >> +++--- >> >> 2 files changed, 103 insertions(+), 64 deletions(-) >> >> >> >> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c >> >> index 57ce973..a7f6282 100644 >> >> --- a/drivers/gpu/drm/drm_drv.c >> >> +++ b/drivers/gpu/drm/drm_drv.c >> >> @@ -63,37 +63,48 @@ static struct idr drm_minors_idr; >> >> >> >> static struct dentry *drm_debugfs_root; >> >> >> >> -void drm_err(const char *format, ...) >> >> +#define DRM_PRINTK_FMT "[" DRM_NAME ":%s]%s %pV" >> >> + >> >> +void drm_dev_printk(const struct device *dev, const char *level, >> >> + unsigned int category, const char *function_name, >> >> + const char *prefix, const char *format, ...) >> >> { >> >> struct va_format vaf; >> >> va_list args; >> >> >> >> - va_start(args, format); >> >> + if (category != DRM_UT_NONE && !(drm_debug & category)) >> >> + return; >> >> >> >> + va_start(args, format); >> >> vaf.fmt = format; >> >> vaf.va = &args; >> >> >> >> - printk(KERN_ERR "[" DRM_NAME ":%ps] *ERROR* %pV", >> >> -__builtin_return_address(0), &vaf); >> >> + dev_printk(level, dev, DRM_PRINTK_FMT, function_name, prefix, >> >> +&vaf); >> >> >> >> va_end(args); >> >> } >> >> -EXPORT_SYMBOL(drm_err); >> >> +EXPORT_SYMBOL(drm_dev_printk); >> >> >> >> -void drm_ut_debug_printk(const char *function_name, const char *format, >> >> ...) >> >> +void drm_printk(const char *level, unsigned int category, >> >> + const char *function_name, const char *prefix, >> >> + const char *format, ...) >> >> { >> >> struct va_format vaf; >> >> va_list args; >> >> >> >> + if (category != DRM_UT_NONE && !(drm_debug & category)) >> >> + return; >> >> + >> >> va_start(args, format); >> >> vaf.fmt = format; >> >> vaf.va = &args; >> >> >> >> - printk(KERN_DEBUG "[" DRM_NAME ":%s] %pV", function_name, &vaf); >> >> + printk("%s" DRM_PRINTK_FMT, level, function_name, prefix, &vaf); >> >> >> >> va_end(args); >> >> } >> >> -EXPORT_SYMBOL(drm_ut_debug_printk); >> >> +EXPORT_SYMBOL(drm_printk); >> >> >> >> /* >> >> * DRM Minors >> >> diff --git a/include/drm/drmP.h b/include/drm/drmP.h >> >> index f8e87fd..94eb138 100644 >> >> --- a/include/drm/drmP.h >> >> +++ b/include/drm/drmP.h >> >> @@ -127,6 +127,7 @@ struct dma_buf_attachment; >> >> * run-time by echoing the debug value in its sysfs node: >> >> * # echo 0xf > /sys/module/drm/parameters/debug >> >> */ >> >> +#define DRM_UT_NONE 0x00 >> >> #define DRM_UT_CORE 0x01 >> >> #define DRM_UT_DRIVER0x02 >> >> #define DRM_UT_KMS 0x04 >> >> @@ -134,11 +135,15 @@ struct dma_buf_attachment; >> >> #define DRM_UT_ATOMIC0x10 >> >> #define DRM_UT_VBL 0x20 >> >> >> >> -extern __printf(2, 3) >> >> -void drm_ut_debug_printk(const char *function_name, >> >> - const char *format, ...); >> >> -extern __printf(1, 2) >> >> -void drm_err(const char *format, ...); >> >> +extern __printf(6, 7) >> >> +void drm_dev_printk(const struct device *dev, const char *level, >> >> + unsigned int category, const char *function_name, >> >> + const char *prefix, const char *format, ...); >> >> + >> >> +extern __printf(5, 6) >> >> +void drm_printk(const char *level, unsigned int category, >> >> + const char *function_name, const char *prefix, >> >> + const char *format, ...); >> >> >> >> /***/ >> >> /** \name DRM template customization defaults */ >> >> @@ -169,8 +174,12 @@ void drm_err(const char *format, ...); >> >> * \param fmt printf() like format string. >> >> * \param arg arguments >> >> */ >> >> -#define DRM_ERROR(fmt, ...)
[PATCH] drm/edid: CEA mode 64 1080p100 vsync pulse width incorrect
On Mon, Aug 15, 2016 at 10:31 AM, wrote: > From: Clint Taylor > > In the CEA-861 specification VIC 64 specifies a vsync pulse of 5 and > a backporch of 36. Adjust vsync pulse width to match specification. > > Cc: Ville Syrjälä > Signed-off-by: Clint Taylor Seems to line up with the spec. Applied to drm-misc Sean > --- > drivers/gpu/drm/drm_edid.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c > index 637a0aa..ea27aaa 100644 > --- a/drivers/gpu/drm/drm_edid.c > +++ b/drivers/gpu/drm/drm_edid.c > @@ -991,7 +991,7 @@ static const struct drm_display_mode edid_cea_modes[] = { > .vrefresh = 120, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_16_9, }, > /* 64 - 1920x1080 at 100Hz */ > { DRM_MODE("1920x1080", DRM_MODE_TYPE_DRIVER, 297000, 1920, 2448, > - 2492, 2640, 0, 1080, 1084, 1094, 1125, 0, > + 2492, 2640, 0, 1080, 1084, 1089, 1125, 0, >DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC), > .vrefresh = 100, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_16_9, }, > }; > -- > 1.7.9.5 > > ___ > dri-devel mailing list > dri-devel at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH v4 1/4] drm: extra printk() wrapper macros
We had only DRM_INFO() and DRM_ERROR(), whereas the underlying printk() provides several other useful intermediate levels such as NOTICE and WARNING. So this patch fills out the set by providing both regular and once-only macros for each of the levels INFO, NOTICE, and WARNING, using a common underlying macro that does all the token-pasting. DRM_ERROR is unchanged, as it's not just a printk wrapper. v2: Fix whitespace, missing ## (Eric Engestrom) Signed-off-by: Dave Gordon Reviewed-by: Eric Engestrom Cc: dri-devel at lists.freedesktop.org --- include/drm/drmP.h | 26 -- 1 file changed, 20 insertions(+), 6 deletions(-) diff --git a/include/drm/drmP.h b/include/drm/drmP.h index f8e87fd..734e4fb 100644 --- a/include/drm/drmP.h +++ b/include/drm/drmP.h @@ -163,6 +163,26 @@ void drm_err(const char *format, ...); /** \name Macros to make printk easier */ /*@{*/ +#define _DRM_PRINTK(once, level, fmt, ...) \ + do {\ + printk##once(KERN_##level "[" DRM_NAME "] " fmt,\ +##__VA_ARGS__);\ + } while (0) + +#define DRM_INFO(fmt, ...) \ + _DRM_PRINTK(, INFO, fmt, ##__VA_ARGS__) +#define DRM_NOTE(fmt, ...) \ + _DRM_PRINTK(, NOTICE, fmt, ##__VA_ARGS__) +#define DRM_WARN(fmt, ...) \ + _DRM_PRINTK(, WARNING, fmt, ##__VA_ARGS__) + +#define DRM_INFO_ONCE(fmt, ...) \ + _DRM_PRINTK(_once, INFO, fmt, ##__VA_ARGS__) +#define DRM_NOTE_ONCE(fmt, ...) \ + _DRM_PRINTK(_once, NOTICE, fmt, ##__VA_ARGS__) +#define DRM_WARN_ONCE(fmt, ...) \ + _DRM_PRINTK(_once, WARNING, fmt, ##__VA_ARGS__) + /** * Error output. * @@ -188,12 +208,6 @@ void drm_err(const char *format, ...); drm_err(fmt, ##__VA_ARGS__);\ }) -#define DRM_INFO(fmt, ...) \ - printk(KERN_INFO "[" DRM_NAME "] " fmt, ##__VA_ARGS__) - -#define DRM_INFO_ONCE(fmt, ...)\ - printk_once(KERN_INFO "[" DRM_NAME "] " fmt, ##__VA_ARGS__) - /** * Debug output. * -- 1.9.1
[PATCH 1/2] drm: Allow drivers to modify plane_state in prepare_fb/cleanup_fb
The drivers have to modify the atomic plane state during the prepare_fb callback so they track allocations, reservations and dependencies for this atomic operation involving this fb. In particular, how else do we set the plane->fence from the framebuffer! Signed-off-by: Chris Wilson Cc: Daniel Vetter Cc: dri-devel at lists.freedesktop.org --- drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c | 4 ++-- drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c | 4 ++-- drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c | 4 ++-- drivers/gpu/drm/i915/intel_display.c| 4 ++-- drivers/gpu/drm/i915/intel_drv.h| 4 ++-- drivers/gpu/drm/msm/mdp/mdp4/mdp4_plane.c | 4 ++-- drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c | 4 ++-- drivers/gpu/drm/omapdrm/omap_plane.c| 4 ++-- drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 4 ++-- drivers/gpu/drm/tegra/dc.c | 4 ++-- include/drm/drm_modeset_helper_vtables.h| 4 ++-- 11 files changed, 22 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c index 146809a97a07..72e6b7dd457b 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c @@ -755,7 +755,7 @@ static int atmel_hlcdc_plane_atomic_check(struct drm_plane *p, } static int atmel_hlcdc_plane_prepare_fb(struct drm_plane *p, - const struct drm_plane_state *new_state) + struct drm_plane_state *new_state) { /* * FIXME: we should avoid this const -> non-const cast but it's @@ -780,7 +780,7 @@ static int atmel_hlcdc_plane_prepare_fb(struct drm_plane *p, } static void atmel_hlcdc_plane_cleanup_fb(struct drm_plane *p, - const struct drm_plane_state *old_state) +struct drm_plane_state *old_state) { /* * FIXME: we should avoid this const -> non-const cast but it's diff --git a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c index e50467a0deb0..2a3e92976700 100644 --- a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c +++ b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c @@ -171,13 +171,13 @@ static void fsl_dcu_drm_plane_atomic_update(struct drm_plane *plane, static void fsl_dcu_drm_plane_cleanup_fb(struct drm_plane *plane, -const struct drm_plane_state *new_state) +struct drm_plane_state *new_state) { } static int fsl_dcu_drm_plane_prepare_fb(struct drm_plane *plane, -const struct drm_plane_state *new_state) +struct drm_plane_state *new_state) { return 0; } diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c index 91188f33b1d9..6417158dad61 100644 --- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c +++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c @@ -818,14 +818,14 @@ static void ade_disable_channel(struct ade_plane *aplane) } static int ade_plane_prepare_fb(struct drm_plane *plane, - const struct drm_plane_state *new_state) + struct drm_plane_state *new_state) { /* do nothing */ return 0; } static void ade_plane_cleanup_fb(struct drm_plane *plane, -const struct drm_plane_state *old_state) +struct drm_plane_state *old_state) { /* do nothing */ } diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 8a203b5f347e..123112c240e0 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -14439,7 +14439,7 @@ static const struct drm_crtc_funcs intel_crtc_funcs = { */ int intel_prepare_plane_fb(struct drm_plane *plane, - const struct drm_plane_state *new_state) + struct drm_plane_state *new_state) { struct drm_device *dev = plane->dev; struct drm_framebuffer *fb = new_state->fb; @@ -14525,7 +14525,7 @@ intel_prepare_plane_fb(struct drm_plane *plane, */ void intel_cleanup_plane_fb(struct drm_plane *plane, - const struct drm_plane_state *old_state) + struct drm_plane_state *old_state) { struct drm_device *dev = plane->dev; struct intel_plane_state *old_intel_state; diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 1c700b0c3cea..774aab342f40 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -1251,9 +1251,9 @@ void intel_finish_page_flip_cs(struct drm_i915_private *dev_priv, int pipe); void intel_finish_page_flip_mmio(struct drm_i915_private *dev_priv, int pipe); void intel_check_pa
[PATCH 2/2] drm/i915: Replace intel_plane->wait_req with plane->fence
Now that we subclass our request from struct fence, we start using the common primitives more freely and so avoid hand-rolling routines already provided for by the helpers. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_atomic_plane.c | 3 -- drivers/gpu/drm/i915/intel_display.c | 52 +++ drivers/gpu/drm/i915/intel_drv.h | 1 - 3 files changed, 5 insertions(+), 51 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_atomic_plane.c b/drivers/gpu/drm/i915/intel_atomic_plane.c index b82de3072d4f..b41bf380f2ab 100644 --- a/drivers/gpu/drm/i915/intel_atomic_plane.c +++ b/drivers/gpu/drm/i915/intel_atomic_plane.c @@ -77,14 +77,12 @@ intel_plane_duplicate_state(struct drm_plane *plane) struct intel_plane_state *intel_state; intel_state = kmemdup(plane->state, sizeof(*intel_state), GFP_KERNEL); - if (!intel_state) return NULL; state = &intel_state->base; __drm_atomic_helper_plane_duplicate_state(plane, state); - intel_state->wait_req = NULL; return state; } @@ -101,7 +99,6 @@ void intel_plane_destroy_state(struct drm_plane *plane, struct drm_plane_state *state) { - WARN_ON(state && to_intel_plane_state(state)->wait_req); drm_atomic_helper_plane_destroy_state(plane, state); } diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 123112c240e0..1b5f653d595b 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -13944,9 +13944,7 @@ static int intel_atomic_prepare_commit(struct drm_device *dev, bool nonblock) { struct drm_i915_private *dev_priv = to_i915(dev); - struct drm_plane_state *plane_state; struct drm_crtc_state *crtc_state; - struct drm_plane *plane; struct drm_crtc *crtc; int i, ret; @@ -13969,27 +13967,6 @@ static int intel_atomic_prepare_commit(struct drm_device *dev, ret = drm_atomic_helper_prepare_planes(dev, state); mutex_unlock(&dev->struct_mutex); - if (!ret && !nonblock) { - for_each_plane_in_state(state, plane, plane_state, i) { - struct intel_plane_state *intel_plane_state = - to_intel_plane_state(plane_state); - - if (!intel_plane_state->wait_req) - continue; - - ret = i915_wait_request(intel_plane_state->wait_req, - true, NULL, NULL); - if (ret) { - /* Any hang should be swallowed by the wait */ - WARN_ON(ret == -EIO); - mutex_lock(&dev->struct_mutex); - drm_atomic_helper_cleanup_planes(dev, state); - mutex_unlock(&dev->struct_mutex); - break; - } - } - } - return ret; } @@ -14076,27 +14053,12 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state) struct drm_crtc_state *old_crtc_state; struct drm_crtc *crtc; struct intel_crtc_state *intel_cstate; - struct drm_plane *plane; - struct drm_plane_state *plane_state; bool hw_check = intel_state->modeset; unsigned long put_domains[I915_MAX_PIPES] = {}; unsigned crtc_vblank_mask = 0; - int i, ret; - - for_each_plane_in_state(state, plane, plane_state, i) { - struct intel_plane_state *intel_plane_state = - to_intel_plane_state(plane_state); - - if (!intel_plane_state->wait_req) - continue; - - ret = i915_wait_request(intel_plane_state->wait_req, - true, NULL, NULL); - /* EIO should be eaten, and we can't get interrupted in the -* worker, and blocking commits have waited already. */ - WARN_ON(ret); - } + int i; + drm_atomic_helper_wait_for_fences(dev, state); drm_atomic_helper_wait_for_dependencies(state); if (intel_state->modeset) { @@ -14506,9 +14468,9 @@ intel_prepare_plane_fb(struct drm_plane *plane, } if (ret == 0) { - to_intel_plane_state(new_state)->wait_req = - i915_gem_active_get(&obj->last_write, - &obj->base.dev->struct_mutex); + new_state->fence = + &i915_gem_active_get(&obj->last_write, + &obj->base.dev->struct_mutex)->fence; } return ret; @@ -14529,7 +14491,6 @@ intel_cleanup_plane_fb(struct drm_plane *plane, { struct drm_device *dev = plane->dev; stru
[PATCH 2/4] drm/tilcdc: Add blue-and-red-wiring -device tree property
On Tue, Aug 16, 2016 at 12:24:28PM +0300, Jyri Sarha wrote: > Add "blue-and-red-wiring"-device tree property and update devicetree > binding document. The red and blue components are reversed between 24 > and 16 bit modes on am335x LCDC output pins. To get 24 RGB format the > red and blue wires has to be crossed and this in turn causes 16 colors > output to be in BGR format. With straight wiring the 16 color is RGB > and 24 bit is BGR. The new property describes whether the red and blue > wires are crossed or not. The am335x-evm has the wires going to LCD > crossed and that is chosen to be the default mode if > "blue-and-red-wiring"-property is not found. > > For more details see section 3.1.1 in AM335x Silicon Errata: > http://www.ti.com/general/docs/lit/getliterature.tsp?baseLiteratureNumber=sprz360 > > Signed-off-by: Jyri Sarha > --- > .../devicetree/bindings/display/tilcdc/tilcdc.txt | 12 ++ > drivers/gpu/drm/tilcdc/tilcdc_drv.c| 44 > ++ > drivers/gpu/drm/tilcdc/tilcdc_drv.h| 4 ++ > drivers/gpu/drm/tilcdc/tilcdc_plane.c | 9 ++--- > 4 files changed, 63 insertions(+), 6 deletions(-) > > diff --git a/Documentation/devicetree/bindings/display/tilcdc/tilcdc.txt > b/Documentation/devicetree/bindings/display/tilcdc/tilcdc.txt > index 6efa4c5..992803b 100644 > --- a/Documentation/devicetree/bindings/display/tilcdc/tilcdc.txt > +++ b/Documentation/devicetree/bindings/display/tilcdc/tilcdc.txt > @@ -17,6 +17,8 @@ Optional properties: > the lcd controller. > - max-pixelclock: The maximum pixel clock that can be supported > by the lcd controller in KHz. > + - blue-and-red-wiring: Either "crossed" or "straight", if not present > + crossed wiring is assumed for dtb backward compatibility. [1] Just do a boolean and restrict it to a particular compatible string so we don't have to carry it forever on new parts. Rob
"Fixes" for page flipping under PRIME on AMD & nouveau
On Thu, Aug 18, 2016 at 4:23 AM, Michel Dänzer wrote: > Maybe the rasterization as two triangles results in bad PCIe bandwidth > utilization. Using the asynchronous DMA engine for these transfers would > probably be ideal, but having the 3D engine rasterize a single rectangle > (either using the rectangle primitive or a large triangle with scissor) > might already help. There is only one thing that's bad for PCIe when the surface is linear: the 3D engine. Disabling all but the first shader engine and all but the first 2 RBs should improve performance for blits from VRAM to GTT. The closed driver does that, but I don't remember if the destination must be linear, must be in GTT, or both. In any case, SDMA should still be the best for VRAM->GTT blits. Marek
[PATCH 0/2] GPU-DRM-Savage: Fine-tuning for savage_bci_cmdbuf()
From: Markus Elfring Date: Thu, 18 Aug 2016 21:38:37 +0200 A few update suggestions were taken into account from static source code analysis. Markus Elfring (2): Use memdup_user() rather than duplicating its implementation Less function calls after error detection drivers/gpu/drm/savage/savage_state.c | 42 +++ 1 file changed, 18 insertions(+), 24 deletions(-) -- 2.9.3
[PATCH 1/2] GPU-DRM-Savage: Use memdup_user() rather than duplicating
From: Markus Elfring Date: Thu, 18 Aug 2016 18:12:03 +0200 Reuse existing functionality from memdup_user() instead of keeping duplicate source code. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring --- drivers/gpu/drm/savage/savage_state.c | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/savage/savage_state.c b/drivers/gpu/drm/savage/savage_state.c index c01ad0a..3dc0d8f 100644 --- a/drivers/gpu/drm/savage/savage_state.c +++ b/drivers/gpu/drm/savage/savage_state.c @@ -1001,15 +1001,9 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, struct drm_file *file_ cmdbuf->cmd_addr = kcmd_addr; } if (cmdbuf->vb_size) { - kvb_addr = kmalloc(cmdbuf->vb_size, GFP_KERNEL); - if (kvb_addr == NULL) { - ret = -ENOMEM; - goto done; - } - - if (copy_from_user(kvb_addr, cmdbuf->vb_addr, - cmdbuf->vb_size)) { - ret = -EFAULT; + kvb_addr = memdup_user(cmdbuf->vb_addr, cmdbuf->vb_size); + if (IS_ERR(kvb_addr)) { + ret = PTR_ERR(kvb_addr); goto done; } cmdbuf->vb_addr = kvb_addr; -- 2.9.3
[PATCH 2/2] GPU-DRM-Savage: Less function calls in savage_bci_cmdbuf() after error detection
From: Markus Elfring Date: Thu, 18 Aug 2016 21:28:58 +0200 The kfree() function was called in a few cases by the savage_bci_cmdbuf() function during error handling even if a passed variable contained a null pointer. Adjust jump targets according to the Linux coding style convention. Signed-off-by: Markus Elfring --- drivers/gpu/drm/savage/savage_state.c | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/savage/savage_state.c b/drivers/gpu/drm/savage/savage_state.c index 3dc0d8f..5b484aa 100644 --- a/drivers/gpu/drm/savage/savage_state.c +++ b/drivers/gpu/drm/savage/savage_state.c @@ -1004,7 +1004,7 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, struct drm_file *file_ kvb_addr = memdup_user(cmdbuf->vb_addr, cmdbuf->vb_size); if (IS_ERR(kvb_addr)) { ret = PTR_ERR(kvb_addr); - goto done; + goto free_cmd; } cmdbuf->vb_addr = kvb_addr; } @@ -1013,13 +1013,13 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, struct drm_file *file_ GFP_KERNEL); if (kbox_addr == NULL) { ret = -ENOMEM; - goto done; + goto free_vb; } if (copy_from_user(kbox_addr, cmdbuf->box_addr, cmdbuf->nbox * sizeof(struct drm_clip_rect))) { ret = -EFAULT; - goto done; + goto free_vb; } cmdbuf->box_addr = kbox_addr; } @@ -1052,7 +1052,7 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, struct drm_file *file_ "beyond end of command buffer\n"); DMA_FLUSH(); ret = -EINVAL; - goto done; + goto free_box; } /* fall through */ case SAVAGE_CMD_DMA_PRIM: @@ -1071,7 +1071,7 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, struct drm_file *file_ cmdbuf->vb_stride, cmdbuf->nbox, cmdbuf->box_addr); if (ret != 0) - goto done; + goto free_box; first_draw_cmd = NULL; } } @@ -1086,7 +1086,7 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, struct drm_file *file_ "beyond end of command buffer\n"); DMA_FLUSH(); ret = -EINVAL; - goto done; + goto free_box; } ret = savage_dispatch_state(dev_priv, &cmd_header, (const uint32_t *)cmdbuf->cmd_addr); @@ -1099,7 +1099,7 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, struct drm_file *file_ "beyond end of command buffer\n"); DMA_FLUSH(); ret = -EINVAL; - goto done; + goto free_box; } ret = savage_dispatch_clear(dev_priv, &cmd_header, cmdbuf->cmd_addr, @@ -1117,12 +1117,12 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, struct drm_file *file_ cmd_header.cmd.cmd); DMA_FLUSH(); ret = -EINVAL; - goto done; + goto free_box; } if (ret != 0) { DMA_FLUSH(); - goto done; + goto free_box; } } @@ -1133,7 +1133,7 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, struct drm_file *file_ cmdbuf->nbox, cmdbuf->box_addr); if (ret != 0) { DMA_FLUSH(); - goto done; + goto free_box; } } @@ -1147,11 +1147,11 @@ int savage_bci_cmdbuf(struct drm_device *dev, void *data, struct drm_file *file_ savage_freelist_put(dev, dmabuf); } -done: - /* If we didn't need to allocate them, these'll be NULL */ - kfree(kcmd_addr); - kfree(kvb_addr); +free_box: kfree(kbox_addr); - +free_vb: + kfree(kvb_addr); +free_cmd: + kfree(kcmd_addr);
[git pull] drm fixes
On Thu, Aug 18, 2016 at 4:58 AM, Dave Airlie wrote: > > Hi Linus, > > Pretty quiet so far, a few amdgpu/radeon fixup for pcie pm changes, > and a couple of amdgpu fixes, along with some build fixes, and printk fix. It's missing the intel fixes pile from earlier this week, already pinged Dave over irc bug I guess was too late in the evening to respin. -Daniel > > Dave. > > The following changes since commit 4b9eaf33d83d91430b7ca45d0ebf8241da489c92: > > Merge branch 'akpm' (patches from Andrew) (2016-08-11 16:58:24 -0700) > > are available in the git repository at: > > git://people.freedesktop.org/~airlied/linux tags/drm-fixes-for-4.8-rc3 > > for you to fetch changes up to 91d62d9f30206be6f7749a0e6f7fa58c6d70c702: > > Merge branch 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux > into drm-fixes (2016-08-18 12:51:27 +1000) > > > Alex Deucher (2): > Revert "drm/amdgpu: work around lack of upstream ACPI support for > D3cold" > Revert "drm/radeon: work around lack of upstream ACPI support for > D3cold" > > Arnd Bergmann (3): > drm/mediatek: add COMMON_CLK dependency > drm/mediatek: add CONFIG_OF dependency > drm/mediatek: add ARM_SMCCC dependency > > Chunming Zhou (1): > drm/amdgpu: fix vm init error path > > Colin Ian King (1): > drm/amdkfd: print doorbell offset as a hex value > > Dave Airlie (4): > Merge tag 'drm-amdkfd-fixes-2016-08-09' of > git://people.freedesktop.org/~gabbayo/linux into drm-fixes > Merge branch 'drm-fixes-4.8' of > git://people.freedesktop.org/~agd5f/linux into drm-fixes > Merge tag 'mediatek-drm-fixes-2016-08-12' of > git://git.pengutronix.de/git/pza/linux into drm-fixes > Merge branch 'drm-fixes-4.8' of > git://people.freedesktop.org/~agd5f/linux into drm-fixes > > Felix Kuehling (1): > drm/amdgpu: Change GART offset to 64-bit > > Jay Cornwall (1): > drm/amdgpu: Fix memory trashing if UVD ring test fails > > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 ++-- > drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c | 9 - > drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 4 ++-- > drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 3 ++- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 - > drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c| 2 +- > drivers/gpu/drm/mediatek/Kconfig | 3 +++ > drivers/gpu/drm/radeon/radeon_atpx_handler.c | 9 - > 8 files changed, 14 insertions(+), 25 deletions(-) > ___ > dri-devel mailing list > dri-devel at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
[Bug 97402] AMDGPU/Iceland kernel panic on shutdown
https://bugs.freedesktop.org/show_bug.cgi?id=97402 Bug ID: 97402 Summary: AMDGPU/Iceland kernel panic on shutdown Product: DRI Version: unspecified Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Severity: normal Priority: medium Component: DRM/AMDgpu Assignee: dri-devel at lists.freedesktop.org Reporter: krejzi at email.com Created attachment 125892 --> https://bugs.freedesktop.org/attachment.cgi?id=125892&action=edit Kernel panic capture Linux 4.8-rc2. See attached screenshot. -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/f1bd7855/attachment.html>
[Bug 97402] AMDGPU/Iceland kernel panic on shutdown
https://bugs.freedesktop.org/show_bug.cgi?id=97402 Armin K changed: What|Removed |Added Attachment #125892|0 |1 is obsolete|| --- Comment #1 from Armin K --- Created attachment 125893 --> https://bugs.freedesktop.org/attachment.cgi?id=125893&action=edit Kernel panic capture Wrong picture in the first comment, ignore it. -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/15fb49de/attachment-0001.html>
[PATCH] virtio-gpu: Use memdup_user() rather than duplicating its implementation
From: Markus Elfring Date: Thu, 18 Aug 2016 22:35:14 +0200 Reuse existing functionality from memdup_user() instead of keeping duplicate source code. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring --- drivers/gpu/drm/virtio/virtgpu_ioctl.c | 13 - 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index c046903..512e7cd 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -152,15 +152,10 @@ static int virtio_gpu_execbuffer(struct drm_device *dev, if (ret) goto out_free; - buf = kmalloc(exbuf->size, GFP_KERNEL); - if (!buf) { - ret = -ENOMEM; - goto out_unresv; - } - if (copy_from_user(buf, (void __user *)(uintptr_t)exbuf->command, - exbuf->size)) { - kfree(buf); - ret = -EFAULT; + buf = memdup_user((void __user *)(uintptr_t)exbuf->command, + exbuf->size); + if (IS_ERR(buf)) { + ret = PTR_ERR(buf); goto out_unresv; } virtio_gpu_cmd_submit(vgdev, buf, exbuf->size, -- 2.9.3
[PATCH 1/2] drm: Allow drivers to modify plane_state in prepare_fb/cleanup_fb
On Thu, Aug 18, 2016 at 07:00:16PM +0100, Chris Wilson wrote: > The drivers have to modify the atomic plane state during the prepare_fb > callback so they track allocations, reservations and dependencies for > this atomic operation involving this fb. In particular, how else do we > set the plane->fence from the framebuffer! > > Signed-off-by: Chris Wilson > Cc: Daniel Vetter > Cc: dri-devel at lists.freedesktop.org Yeah, Laurent just pinged me today about how he should store the dma mapping when he has hw with an iommu per plane. We concluded that the const is bogus and needs to go ;-) Applied to drm-misc, thanks. -Daniel > --- > drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c | 4 ++-- > drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c | 4 ++-- > drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c | 4 ++-- > drivers/gpu/drm/i915/intel_display.c| 4 ++-- > drivers/gpu/drm/i915/intel_drv.h| 4 ++-- > drivers/gpu/drm/msm/mdp/mdp4/mdp4_plane.c | 4 ++-- > drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c | 4 ++-- > drivers/gpu/drm/omapdrm/omap_plane.c| 4 ++-- > drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 4 ++-- > drivers/gpu/drm/tegra/dc.c | 4 ++-- > include/drm/drm_modeset_helper_vtables.h| 4 ++-- > 11 files changed, 22 insertions(+), 22 deletions(-) > > diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c > b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c > index 146809a97a07..72e6b7dd457b 100644 > --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c > +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c > @@ -755,7 +755,7 @@ static int atmel_hlcdc_plane_atomic_check(struct > drm_plane *p, > } > > static int atmel_hlcdc_plane_prepare_fb(struct drm_plane *p, > - const struct drm_plane_state *new_state) > + struct drm_plane_state *new_state) > { > /* >* FIXME: we should avoid this const -> non-const cast but it's > @@ -780,7 +780,7 @@ static int atmel_hlcdc_plane_prepare_fb(struct drm_plane > *p, > } > > static void atmel_hlcdc_plane_cleanup_fb(struct drm_plane *p, > - const struct drm_plane_state *old_state) > + struct drm_plane_state *old_state) > { > /* >* FIXME: we should avoid this const -> non-const cast but it's > diff --git a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c > b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c > index e50467a0deb0..2a3e92976700 100644 > --- a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c > +++ b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c > @@ -171,13 +171,13 @@ static void fsl_dcu_drm_plane_atomic_update(struct > drm_plane *plane, > > static void > fsl_dcu_drm_plane_cleanup_fb(struct drm_plane *plane, > - const struct drm_plane_state *new_state) > + struct drm_plane_state *new_state) > { > } > > static int > fsl_dcu_drm_plane_prepare_fb(struct drm_plane *plane, > - const struct drm_plane_state *new_state) > + struct drm_plane_state *new_state) > { > return 0; > } > diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c > b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c > index 91188f33b1d9..6417158dad61 100644 > --- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c > +++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c > @@ -818,14 +818,14 @@ static void ade_disable_channel(struct ade_plane > *aplane) > } > > static int ade_plane_prepare_fb(struct drm_plane *plane, > - const struct drm_plane_state *new_state) > + struct drm_plane_state *new_state) > { > /* do nothing */ > return 0; > } > > static void ade_plane_cleanup_fb(struct drm_plane *plane, > - const struct drm_plane_state *old_state) > + struct drm_plane_state *old_state) > { > /* do nothing */ > } > diff --git a/drivers/gpu/drm/i915/intel_display.c > b/drivers/gpu/drm/i915/intel_display.c > index 8a203b5f347e..123112c240e0 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -14439,7 +14439,7 @@ static const struct drm_crtc_funcs intel_crtc_funcs = > { > */ > int > intel_prepare_plane_fb(struct drm_plane *plane, > -const struct drm_plane_state *new_state) > +struct drm_plane_state *new_state) > { > struct drm_device *dev = plane->dev; > struct drm_framebuffer *fb = new_state->fb; > @@ -14525,7 +14525,7 @@ intel_prepare_plane_fb(struct drm_plane *plane, > */ > void > intel_cleanup_plane_fb(struct drm_plane *plane, > -const struct drm_plane_state *old_state) > +struct drm_plane_state *old_state) > { > struct drm_device *dev = p
[PATCH v3 5/5] dt-bindings: add compatible strings for big/little rockchip vops
From: Mark Yao This patch documents the compatible strings for the big and little vop in rockchip's drm driver. Cc: Rob Herring Cc: Pawel Moll Cc: Mark Rutland Cc: Ian Campbell Cc: Kumar Gala Reviewed-by: Tomasz Figa Signed-off-by: Mark Yao [seanpaul removed superfluous description per tfiga's review] Signed-off-by: Sean Paul --- Resending to add devicetree list Changes in v3: - Updated subject (Sean Paul) - Removed unnecessary description (Tomasz) Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt | 2 ++ 1 file changed, 2 insertions(+) diff --git a/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt b/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt index 196121f..9eb3f0a 100644 --- a/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt +++ b/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt @@ -8,6 +8,8 @@ Required properties: - compatible: value should be one of the following "rockchip,rk3036-vop"; "rockchip,rk3288-vop"; + "rockchip,rk3399-vop-big"; + "rockchip,rk3399-vop-lit"; - interrupts: should contain a list of all VOP IP block interrupts in the order: VSYNC, LCD_SYSTEM. The interrupt specifier -- 2.8.0.rc3.226.g39d4020
[PATCH v3 2/5] dt-bindings: sort Rockchip vop compatible by chip's number
From: Mark Yao Reorder the compatible vop devices to be sorted by chip number in ascending order. Reviewed-by: Tomasz Figa Signed-off-by: Mark Yao [seanpaul added commit description per tfiga's review] Signed-off-by: Sean Paul --- Resending to add devicetree list Changes in v3: - Updated commit message (Tomasz Figa) Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt b/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt index 5489b59..196121f 100644 --- a/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt +++ b/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt @@ -6,8 +6,8 @@ buffer to an external LCD interface. Required properties: - compatible: value should be one of the following - "rockchip,rk3288-vop"; "rockchip,rk3036-vop"; + "rockchip,rk3288-vop"; - interrupts: should contain a list of all VOP IP block interrupts in the order: VSYNC, LCD_SYSTEM. The interrupt specifier -- 2.8.0.rc3.226.g39d4020
[PATCH v3 03/11] drm/i915: return EACCES for check_cmd() failures
On Mon, Aug 15, 2016 at 4:04 PM, Chris Wilson wrote: > On Mon, Aug 15, 2016 at 03:41:20PM +0100, Robert Bragg wrote: > > check_cmd() is checking whether a command adheres to certain > > restrictions that ensure it's safe to execute within a privileged batch > > buffer. Returning false implies a privilege problem, not that the > > command is invalid. > > > > The distinction makes the difference between allowing the buffer to be > > executed as an unprivileged batch buffer or returning an EINVAL error to > > userspace without executing anything. > > Ah, but you choose to actually execute it instead. We can't allow that > either. > Okey, I've got myself a bit confused over this, going in circles a few times now... Initially I took this to imply the cmd parser is not only there to enable more than the HW policy allows, and there must be some stuff we're blocking that the HW policy otherwise allows (and therefore it's bad to return -EACCES here and fallback to the HW policy). One of the things that's confusing me is that looking at the command parser code, it looks like it's possible to bail early with an MI_BATCH_BUFFER_START command, returning -EACCES and falling back to the non-privileged HW policy. If the HW policy isn't strict enough in some cases then presumably we wouldn't ever allow an early fallback to the HW policy? oacontrol does look to be an example where it seems a little dubious that the HW considers it ok to write from a non-privileged buffer for gen8+, but for gen7 all LRIs are considered privileged afik and the -EACCES fallback should result in an oacontrol LRI becoming a NOOP. The change appears to be having the desired effect with respect to mesa's check for oacontrol writes failing gracefully with AMD_performance_monitor not being advertised. The fallback via -EACCES to the non-privileged HW policy seems to be NOOPing the LRIs based on running gputop to capture system-wide metrics (oacontrol enabled via i915 perf interface) and then running Mesa/GL applications that that would otherwise clobber oacontrol with test LRIs when deciding to advertise AMD_performance_monitor. This would interfere with gputop by disabling the OA unit if the LRIs were successful. Unless I've got the wrong end of the stick, I think this is ok for Haswell. - Robert > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre > -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/b83a8aaa/attachment-0001.html>
[Bug 97403] AMDGPU/Iceland DPM not properly working on 4.9-wip
] ring test on 9 succeeded in 4 usecs [ 15.835449] [drm] ring test on 10 succeeded in 3 usecs [ 15.835649] [drm] ib test on ring 0 succeeded [ 15.835829] [drm] ib test on ring 1 succeeded [ 15.835939] [drm] ib test on ring 2 succeeded [ 15.836043] [drm] ib test on ring 3 succeeded [ 15.836153] [drm] ib test on ring 4 succeeded [ 15.836256] [drm] ib test on ring 5 succeeded [ 15.836274] [drm] ib test on ring 6 succeeded [ 15.836291] [drm] ib test on ring 7 succeeded [ 15.836307] [drm] ib test on ring 8 succeeded [ 15.836321] [drm] ib test on ring 9 succeeded [ 15.836333] [drm] ib test on ring 10 succeeded [ 15.838598] [drm] Initialized amdgpu 3.3.0 20150101 for :01:00.0 on minor 1 [ 20.998755] VI should always have 2 performance levels Contents of various pp_* files from /sys/class/drm/card1/device # cat power_dpm_force_performance_level off # cat power_dpm_state performance # cat pp_cur_state 0 # cat pp_dpm_mclk 0: 300Mhz 1: 600Mhz 2: 1000Mhz # cat pp_dpm_pcie 0: 2.5GB, x8 1: 2.5GB, x8 2: 8.0GB, x16 3: 8.0GB, x16 4: 8.0GB, x16 5: 8.0GB, x16 # cat pp_dpm_sclk 0: 300Mhz 1: 551Mhz 2: 678Mhz 3: 754Mhz 4: 810Mhz 5: 867Mhz 6: 943Mhz 7: 1021Mhz # cat pp_force_state (empty, not the actual output) # cat pp_mclk_od 0 # cat pp_num_states states: 3 0 boot 1 performance 2 battery # cat pp_sclk_od 0 -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/7bb321a5/attachment.html>
[Bug 97403] AMDGPU/Iceland DPM not properly working on 4.9-wip
https://bugs.freedesktop.org/show_bug.cgi?id=97403 --- Comment #1 from Armin K --- Correction: DPM does seen to work, as indicated in performance increase in Talos Principle (17 FPS on 4.8 to 44 FPS on 4.9), but the rendering is broken. Still, message about VI performance levels is a bit confusing. I assume HP's atombios has different perf levels. It would be nice if that would be fixed. -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160818/ecc16e4d/attachment.html>
[PATCH v4 00/11] Enable i915 perf stream for Haswell OA unit
I've updated the stream->ops->read() interface to avoid the struct i915_perf_read_state so it's hopefully a bit clearer to see the state being passed around: int (*read)(struct i915_perf_stream *stream, char __user *buf, size_t count, size_t *offset); The inout offset is the buf offset (aka bytes read) which are still tracked separate from any error state until we get back to the common code that handles the final check for whether the error should be squashed if any data was successfully copied. This avoids some duplication and more fiddly error paths in the ->read implementations. The initialization code is now spit into an i915_perf_init() called in i915_driver_init_early() and an i915_perf_register() called in i915_driver_register() once we're visible to userspace, after sysfs has been initialized. - Robert Robert Bragg (11): drm/i915: Add i915 perf infrastructure drm/i915: rename OACONTROL GEN7_OACONTROL drm/i915: return EACCES for check_cmd() failures drm/i915: don't whitelist oacontrol in cmd parser drm/i915: Add 'render basic' Haswell OA unit config drm/i915: Enable i915 perf stream for Haswell OA unit drm/i915: advertise available metrics via sysfs drm/i915: Add dev.i915.perf_event_paranoid sysctl option drm/i915: add oa_event_min_timer_exponent sysctl drm/i915: Add more Haswell OA metric sets drm/i915: Add a kerneldoc summary for i915_perf.c drivers/gpu/drm/i915/Makefile |4 + drivers/gpu/drm/i915/i915_cmd_parser.c | 40 +- drivers/gpu/drm/i915/i915_drv.c |9 + drivers/gpu/drm/i915/i915_drv.h | 160 +++ drivers/gpu/drm/i915/i915_gem_context.c | 22 +- drivers/gpu/drm/i915/i915_oa_hsw.c | 659 drivers/gpu/drm/i915/i915_oa_hsw.h | 38 + drivers/gpu/drm/i915/i915_perf.c| 1668 +++ drivers/gpu/drm/i915/i915_reg.h | 340 ++- drivers/gpu/drm/i915/intel_ringbuffer.c |7 +- include/uapi/drm/i915_drm.h | 133 +++ 11 files changed, 3038 insertions(+), 42 deletions(-) create mode 100644 drivers/gpu/drm/i915/i915_oa_hsw.c create mode 100644 drivers/gpu/drm/i915/i915_oa_hsw.h create mode 100644 drivers/gpu/drm/i915/i915_perf.c -- 2.9.2
[PATCH v4 01/11] drm/i915: Add i915 perf infrastructure
Adds base i915 perf infrastructure for Gen performance metrics. This adds a DRM_IOCTL_I915_PERF_OPEN ioctl that takes an array of uint64 properties to configure a stream of metrics and returns a new fd usable with standard VFS system calls including read() to read typed and sized records; ioctl() to enable or disable capture and poll() to wait for data. A stream is opened something like: uint64_t properties[] = { /* Single context sampling */ DRM_I915_PERF_PROP_CTX_HANDLE,ctx_handle, /* Include OA reports in samples */ DRM_I915_PERF_PROP_SAMPLE_OA, true, /* OA unit configuration */ DRM_I915_PERF_PROP_OA_METRICS_SET,metrics_set_id, DRM_I915_PERF_PROP_OA_FORMAT, report_format, DRM_I915_PERF_PROP_OA_EXPONENT, period_exponent, }; struct drm_i915_perf_open_param parm = { .flags = I915_PERF_FLAG_FD_CLOEXEC | I915_PERF_FLAG_FD_NONBLOCK | I915_PERF_FLAG_DISABLED, .properties_ptr = (uint64_t)properties, .num_properties = sizeof(properties) / 16, }; int fd = drmIoctl(drm_fd, DRM_IOCTL_I915_PERF_OPEN, ¶m); Records read all start with a common { type, size } header with DRM_I915_PERF_RECORD_SAMPLE being of most interest. Sample records contain an extensible number of fields and it's the DRM_I915_PERF_PROP_SAMPLE_xyz properties given when opening that determine what's included in every sample. No specific streams are supported yet so any attempt to open a stream will return an error. Signed-off-by: Robert Bragg --- drivers/gpu/drm/i915/Makefile| 3 + drivers/gpu/drm/i915/i915_drv.c | 4 + drivers/gpu/drm/i915/i915_drv.h | 91 drivers/gpu/drm/i915/i915_perf.c | 448 +++ include/uapi/drm/i915_drm.h | 67 ++ 5 files changed, 613 insertions(+) create mode 100644 drivers/gpu/drm/i915/i915_perf.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 6092f0e..9a2f1f9 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -106,6 +106,9 @@ i915-y += dvo_ch7017.o \ # virtual gpu code i915-y += i915_vgpu.o +# perf code +i915-y += i915_perf.o + ifeq ($(CONFIG_DRM_I915_GVT),y) i915-y += intel_gvt.o include $(src)/gvt/Makefile diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 83afdd0..92f668e 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -851,6 +851,8 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv, intel_device_info_dump(dev_priv); + i915_perf_init(dev_priv); + /* Not all pre-production machines fall into this category, only the * very first ones. Almost everything should work, except for maybe * suspend/resume. And we don't implement workarounds that affect only @@ -872,6 +874,7 @@ err_workqueues: */ static void i915_driver_cleanup_early(struct drm_i915_private *dev_priv) { + i915_perf_fini(dev_priv); i915_gem_load_cleanup(&dev_priv->drm); i915_workqueues_cleanup(dev_priv); } @@ -2578,6 +2581,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = { DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW), }; static struct drm_driver driver = { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 65ada5d..9f27a74 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1724,6 +1724,84 @@ struct intel_wm_config { bool sprites_scaled; }; +struct i915_perf_stream; + +struct i915_perf_stream_ops { + /* Enables the collection of HW samples, either in response to +* I915_PERF_IOCTL_ENABLE or implicitly called when stream is +* opened without I915_PERF_FLAG_DISABLED. +*/ + void (*enable)(struct i915_perf_stream *stream); + + /* Disables the collection of HW samples, either in response to +* I915_PERF_IOCTL_DISABLE or implicitly called before +* destroying the stream. +*/ + void (*disable)(struct i915_perf_stream *stream); + + /* Return: true if any i915 perf records are ready to read() +* for this stream. +*/ + bool (*can_read)(struct i915_perf_stream *stream); + + /* Call poll_wait, passing a wait queue that will be woken +* once there is something ready to read() for the stream +*/ + void (*poll_wait)(struct i915_perf_stream *stream, + struct file *file, + poll_table *wait); + + /* For handling a blocking read, wait until
[PATCH v4 02/11] drm/i915: rename OACONTROL GEN7_OACONTROL
OACONTROL changes quite a bit for gen8, with some bits split out into a per-context OACTXCONTROL register. Rename now before adding more gen7 OA registers Signed-off-by: Robert Bragg --- drivers/gpu/drm/i915/i915_cmd_parser.c | 4 ++-- drivers/gpu/drm/i915/i915_reg.h| 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 1db829c..cfe3e7a 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -446,7 +446,7 @@ static const struct drm_i915_reg_descriptor gen7_render_regs[] = { REG64(PS_INVOCATION_COUNT), REG64(PS_DEPTH_COUNT), REG64_IDX(RING_TIMESTAMP, RENDER_RING_BASE), - REG32(OACONTROL), /* Only allowed for LRI and SRM. See below. */ + REG32(GEN7_OACONTROL), /* Only allowed for LRI and SRM. See below. */ REG64(MI_PREDICATE_SRC0), REG64(MI_PREDICATE_SRC1), REG32(GEN7_3DPRIM_END_OFFSET), @@ -1097,7 +1097,7 @@ static bool check_cmd(const struct intel_engine_cs *engine, * to the register. Hence, limit OACONTROL writes to * only MI_LOAD_REGISTER_IMM commands. */ - if (reg_addr == i915_mmio_reg_offset(OACONTROL)) { + if (reg_addr == i915_mmio_reg_offset(GEN7_OACONTROL)) { if (desc->cmd.value == MI_LOAD_REGISTER_MEM) { DRM_DEBUG_DRIVER("CMD: Rejected LRM to OACONTROL\n"); return false; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 9397dde..ba91eff 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -616,7 +616,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define HSW_CS_GPR(n) _MMIO(0x2600 + (n) * 8) #define HSW_CS_GPR_UDW(n) _MMIO(0x2600 + (n) * 8 + 4) -#define OACONTROL _MMIO(0x2360) +#define GEN7_OACONTROL _MMIO(0x2360) #define _GEN7_PIPEA_DE_LOAD_SL 0x70068 #define _GEN7_PIPEB_DE_LOAD_SL 0x71068 -- 2.9.2
[PATCH v4 03/11] drm/i915: return EACCES for check_cmd() failures
check_cmd() is checking whether a command adheres to certain restrictions that ensure it's safe to execute within a privileged batch buffer. Returning false implies a privilege problem, not that the command is invalid. The distinction makes the difference between allowing the buffer to be executed as an unprivileged batch buffer or returning an EINVAL error to userspace without executing anything. In a case where userspace may want to test whether it can successfully write to a register that needs privileges the distinction may be important and an EINVAL error may be considered fatal. In particular this is currently true for Mesa, which includes a test for whether OACONTROL can be written too, but Mesa treats any error when flushing a batch buffer as fatal, calling exit(1). As it is currently Mesa can gracefully handle a failure to write to OACONTROL if the command parser is disabled, but if we were to remove OACONTROL from the parser's whitelist then the returned EINVAL would break Mesa applications as they attempt an OACONTROL write. Signed-off-by: Robert Bragg --- drivers/gpu/drm/i915/i915_cmd_parser.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index cfe3e7a..71e778b 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -1261,7 +1261,7 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, if (!check_cmd(engine, desc, cmd, length, is_master, &oacontrol_set)) { - ret = -EINVAL; + ret = -EACCES; break; } -- 2.9.2
[PATCH v4 04/11] drm/i915: don't whitelist oacontrol in cmd parser
Being able to program OACONTROL from a non-privileged batch buffer is not sufficient to be able to configure the OA unit. This was originally allowed to help enable Mesa to expose OA counters via the INTEL_performance_query extension, but the current implementation based on programming OACONTROL via a batch buffer isn't able to report useable data without a more complete OA unit configuration. Mesa handles the possibility that writes to OACONTROL may not be allowed and so only advertises the extension after explicitly testing that a write to OACONTROL succeeds. Based on this; removing OACONTROL from the whitelist should be ok for userspace. Removing this simplifies adding a new kernel api for configuring the OA unit without needing to consider the possibility that userspace might trample on OACONTROL state which we'd like to start managing within the kernel instead. In particular running any Mesa based GL application currently results in clearing OACONTROL when initializing which would disable the capturing of metrics. Signed-off-by: Robert Bragg --- drivers/gpu/drm/i915/i915_cmd_parser.c | 38 ++ 1 file changed, 2 insertions(+), 36 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 71e778b..ac03c71 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -446,7 +446,6 @@ static const struct drm_i915_reg_descriptor gen7_render_regs[] = { REG64(PS_INVOCATION_COUNT), REG64(PS_DEPTH_COUNT), REG64_IDX(RING_TIMESTAMP, RENDER_RING_BASE), - REG32(GEN7_OACONTROL), /* Only allowed for LRI and SRM. See below. */ REG64(MI_PREDICATE_SRC0), REG64(MI_PREDICATE_SRC1), REG32(GEN7_3DPRIM_END_OFFSET), @@ -1049,8 +1048,7 @@ bool intel_engine_needs_cmd_parser(struct intel_engine_cs *engine) static bool check_cmd(const struct intel_engine_cs *engine, const struct drm_i915_cmd_descriptor *desc, const u32 *cmd, u32 length, - const bool is_master, - bool *oacontrol_set) + const bool is_master) { if (desc->flags & CMD_DESC_REJECT) { DRM_DEBUG_DRIVER("CMD: Rejected command: 0x%08X\n", *cmd); @@ -1088,31 +1086,6 @@ static bool check_cmd(const struct intel_engine_cs *engine, } /* -* OACONTROL requires some special handling for -* writes. We want to make sure that any batch which -* enables OA also disables it before the end of the -* batch. The goal is to prevent one process from -* snooping on the perf data from another process. To do -* that, we need to check the value that will be written -* to the register. Hence, limit OACONTROL writes to -* only MI_LOAD_REGISTER_IMM commands. -*/ - if (reg_addr == i915_mmio_reg_offset(GEN7_OACONTROL)) { - if (desc->cmd.value == MI_LOAD_REGISTER_MEM) { - DRM_DEBUG_DRIVER("CMD: Rejected LRM to OACONTROL\n"); - return false; - } - - if (desc->cmd.value == MI_LOAD_REGISTER_REG) { - DRM_DEBUG_DRIVER("CMD: Rejected LRR to OACONTROL\n"); - return false; - } - - if (desc->cmd.value == MI_LOAD_REGISTER_IMM(1)) - *oacontrol_set = (cmd[offset + 1] != 0); - } - - /* * Check the value written to the register against the * allowed mask/value pair given in the whitelist entry. */ @@ -1202,7 +1175,6 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, { u32 *cmd, *batch_base, *batch_end; struct drm_i915_cmd_descriptor default_desc = { 0 }; - bool oacontrol_set = false; /* OACONTROL tracking. See check_cmd() */ int ret = 0; batch_base = copy_batch(shadow_batch_obj, batch_obj, @@ -1259,8 +1231,7 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, break; } - if (!check_cmd(engine, desc, cmd, length, is_master, - &oacontrol_set)) { + if (!check_cmd(engine, desc, cmd, length, is_master)) { ret = -EACCES; break; } @@ -1268,11 +1239,6 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, cmd += length; } -
[PATCH v4 05/11] drm/i915: Add 'render basic' Haswell OA unit config
Adds a static OA unit, MUX + B Counter configuration for basic render metrics on Haswell. This is auto generated from an XML description of metric sets, currently maintained in gputop, ref: https://github.com/rib/gputop > gputop-data/oa-*.xml > scripts/i915-perf-kernelgen.py $ make -C gputop-data -f Makefile.xml SYSFS=0 WHITELIST=RenderBasic Signed-off-by: Robert Bragg --- drivers/gpu/drm/i915/Makefile | 3 +- drivers/gpu/drm/i915/i915_drv.h| 14 drivers/gpu/drm/i915/i915_oa_hsw.c | 132 + drivers/gpu/drm/i915/i915_oa_hsw.h | 34 ++ 4 files changed, 182 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/i915/i915_oa_hsw.c create mode 100644 drivers/gpu/drm/i915/i915_oa_hsw.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 9a2f1f9..0eb4c83 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -107,7 +107,8 @@ i915-y += dvo_ch7017.o \ i915-y += i915_vgpu.o # perf code -i915-y += i915_perf.o +i915-y += i915_perf.o \ + i915_oa_hsw.o ifeq ($(CONFIG_DRM_I915_GVT),y) i915-y += intel_gvt.o diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 9f27a74..9070794 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1724,6 +1724,11 @@ struct intel_wm_config { bool sprites_scaled; }; +struct i915_oa_reg { + i915_reg_t addr; + u32 value; +}; + struct i915_perf_stream; struct i915_perf_stream_ops { @@ -2096,6 +2101,15 @@ struct drm_i915_private { bool initialized; struct mutex lock; struct list_head streams; + + struct { + u32 metrics_set; + + const struct i915_oa_reg *mux_regs; + int mux_regs_len; + const struct i915_oa_reg *b_counter_regs; + int b_counter_regs_len; + } oa; } perf; /* Abstract the submission mechanism (legacy ringbuffer or execlists) away */ diff --git a/drivers/gpu/drm/i915/i915_oa_hsw.c b/drivers/gpu/drm/i915/i915_oa_hsw.c new file mode 100644 index 000..3e6006ec --- /dev/null +++ b/drivers/gpu/drm/i915/i915_oa_hsw.c @@ -0,0 +1,132 @@ +/* + * Autogenerated file, DO NOT EDIT manually! + * + * Copyright (c) 2015 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + */ + +#include "i915_drv.h" + +enum metric_set_id { + METRIC_SET_ID_RENDER_BASIC = 1, +}; + +int i915_oa_n_builtin_metric_sets_hsw = 1; + +static const struct i915_oa_reg b_counter_config_render_basic[] = { + { _MMIO(0x2724), 0x0080 }, + { _MMIO(0x2720), 0x }, + { _MMIO(0x2714), 0x0080 }, + { _MMIO(0x2710), 0x }, +}; + +static const struct i915_oa_reg mux_config_render_basic[] = { + { _MMIO(0x253a4), 0x0160 }, + { _MMIO(0x25440), 0x0010 }, + { _MMIO(0x25128), 0x }, + { _MMIO(0x2691c), 0x0800 }, + { _MMIO(0x26aa0), 0x0150 }, + { _MMIO(0x26b9c), 0x6000 }, + { _MMIO(0x2791c), 0x0800 }, + { _MMIO(0x27aa0), 0x0150 }, + { _MMIO(0x27b9c), 0x6000 }, + { _MMIO(0x2641c), 0x0400 }, + { _MMIO(0x25380), 0x0010 }, + { _MMIO(0x2538c), 0x }, + { _MMIO(0x25384), 0x0800 }, + { _MMIO(0x25400), 0x0004 }, + { _MMIO(0x2540c), 0x06029000 }, + { _MMIO(0x25410), 0x0002 }, + { _MMIO(0x25404), 0x5c30 }, + { _MMIO(0x25100), 0x0016 }, + { _MMIO(0x25110), 0x0400 }, + { _MMIO(0x25104), 0x }, + { _MMIO(0x26804), 0x1211 }, + { _MMIO(0x26884), 0x0100 }, + { _MMIO(0x26900), 0x0002 }, + { _MMIO(0x26908), 0x0070 }, + { _MMIO(0x26904), 0x000
[PATCH v4 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit
Gen graphics hardware can be set up to periodically write snapshots of performance counters into a circular buffer via its Observation Architecture and this patch exposes that capability to userspace via the i915 perf interface. Cc: Chris Wilson Signed-off-by: Robert Bragg Signed-off-by: Zhenyu Wang --- drivers/gpu/drm/i915/i915_drv.h | 70 ++- drivers/gpu/drm/i915/i915_gem_context.c | 22 +- drivers/gpu/drm/i915/i915_perf.c| 986 +++- drivers/gpu/drm/i915/i915_reg.h | 338 +++ drivers/gpu/drm/i915/intel_ringbuffer.c | 7 +- include/uapi/drm/i915_drm.h | 70 ++- 6 files changed, 1459 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 9070794..4c302cd 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1724,6 +1724,11 @@ struct intel_wm_config { bool sprites_scaled; }; +struct i915_oa_format { + u32 format; + int size; +}; + struct i915_oa_reg { i915_reg_t addr; u32 value; @@ -1744,11 +1749,6 @@ struct i915_perf_stream_ops { */ void (*disable)(struct i915_perf_stream *stream); - /* Return: true if any i915 perf records are ready to read() -* for this stream. -*/ - bool (*can_read)(struct i915_perf_stream *stream); - /* Call poll_wait, passing a wait queue that will be woken * once there is something ready to read() for the stream */ @@ -1758,9 +1758,7 @@ struct i915_perf_stream_ops { /* For handling a blocking read, wait until there is something * to ready to read() for the stream. E.g. wait on the same -* wait queue that would be passed to poll_wait() until -* ->can_read() returns true (if its safe to call ->can_read() -* without the i915 perf lock held). +* wait queue that would be passed to poll_wait(). */ int (*wait_unlocked)(struct i915_perf_stream *stream); @@ -1800,11 +1798,28 @@ struct i915_perf_stream { struct list_head link; u32 sample_flags; + int sample_size; struct i915_gem_context *ctx; bool enabled; - struct i915_perf_stream_ops *ops; + const struct i915_perf_stream_ops *ops; +}; + +struct i915_oa_ops { + void (*init_oa_buffer)(struct drm_i915_private *dev_priv); + int (*enable_metric_set)(struct drm_i915_private *dev_priv); + void (*disable_metric_set)(struct drm_i915_private *dev_priv); + void (*oa_enable)(struct drm_i915_private *dev_priv); + void (*oa_disable)(struct drm_i915_private *dev_priv); + void (*update_oacontrol)(struct drm_i915_private *dev_priv); + void (*update_hw_ctx_id_locked)(struct drm_i915_private *dev_priv, + u32 ctx_id); + int (*read)(struct i915_perf_stream *stream, + char __user *buf, + size_t count, + size_t *offset); + bool (*oa_buffer_is_empty)(struct drm_i915_private *dev_priv); }; struct drm_i915_private { @@ -2099,16 +2114,47 @@ struct drm_i915_private { struct { bool initialized; + struct mutex lock; struct list_head streams; + spinlock_t hook_lock; + struct { - u32 metrics_set; + struct i915_perf_stream *exclusive_stream; + + u32 specific_ctx_id; + + struct hrtimer poll_check_timer; + wait_queue_head_t poll_wq; + atomic_t pollin; + + bool periodic; + int period_exponent; + int timestamp_frequency; + + int tail_margin; + + int metrics_set; const struct i915_oa_reg *mux_regs; int mux_regs_len; const struct i915_oa_reg *b_counter_regs; int b_counter_regs_len; + + struct { + struct drm_i915_gem_object *obj; + u32 gtt_offset; + u8 *addr; + int format; + int format_size; + } oa_buffer; + + u32 gen7_latched_oastatus1; + + struct i915_oa_ops ops; + const struct i915_oa_format *oa_formats; + int n_builtin_sets; } oa; } perf; @@ -3475,6 +3521,8 @@ struct drm_i915_gem_object * i915_gem_alloc_context_obj(struct drm_device *dev, size_t size); struct i915_gem_context * i915_gem_context_create_gvt(struct drm_device *dev); +int i915_gem_context_pin_legacy_rcs_state(struct drm_i915_private *dev_priv,
[PATCH v4 07/11] drm/i915: advertise available metrics via sysfs
Each metric set is given a sysfs entry like: /sys/class/drm/card0/metrics//id This allows userspace to enumerate the specific sets that are available for the current system. The 'id' file contains an unsigned integer that can be used to open the associated metric set via DRM_IOCTL_I915_PERF_OPEN. The is a globally unique ID for a specific OA unit register configuration that can be reliably used by userspace as a key to lookup corresponding counter meta data and normalization equations. The guid registry is currently maintained as part of gputop along with the XML metric set descriptions and code generation scripts, ref: https://github.com/rib/gputop > gputop-data/guids.xml > scripts/update-guids.py > gputop-data/oa-*.xml > scripts/i915-perf-kernelgen.py $ make -C gputop-data -f Makefile.xml SYSFS=1 WHITELIST=RenderBasic Signed-off-by: Robert Bragg --- drivers/gpu/drm/i915/i915_drv.c| 5 + drivers/gpu/drm/i915/i915_drv.h| 4 drivers/gpu/drm/i915/i915_oa_hsw.c | 45 + drivers/gpu/drm/i915/i915_oa_hsw.h | 4 drivers/gpu/drm/i915/i915_perf.c | 46 ++ 5 files changed, 104 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 92f668e..0f5f51b 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1172,6 +1172,9 @@ static void i915_driver_register(struct drm_i915_private *dev_priv) * cannot run before the connectors are registered. */ intel_fbdev_initial_config_async(dev); + + /* Depends on sysfs having been initialized */ + i915_perf_register(dev_priv); } /** @@ -1180,6 +1183,8 @@ static void i915_driver_register(struct drm_i915_private *dev_priv) */ static void i915_driver_unregister(struct drm_i915_private *dev_priv) { + i915_perf_unregister(dev_priv); + i915_audio_component_cleanup(dev_priv); intel_gpu_ips_teardown(); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 4c302cd..dd88eb1 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2115,6 +2115,8 @@ struct drm_i915_private { struct { bool initialized; + struct kobject *metrics_kobj; + struct mutex lock; struct list_head streams; @@ -3694,6 +3696,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, /* i915_perf.c */ extern void i915_perf_init(struct drm_i915_private *dev_priv); extern void i915_perf_fini(struct drm_i915_private *dev_priv); +extern void i915_perf_register(struct drm_i915_private *dev_priv); +extern void i915_perf_unregister(struct drm_i915_private *dev_priv); /* i915_suspend.c */ extern int i915_save_state(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/i915_oa_hsw.c b/drivers/gpu/drm/i915/i915_oa_hsw.c index 3e6006ec..c32b5f8 100644 --- a/drivers/gpu/drm/i915/i915_oa_hsw.c +++ b/drivers/gpu/drm/i915/i915_oa_hsw.c @@ -24,6 +24,8 @@ * */ +#include + #include "i915_drv.h" enum metric_set_id { @@ -130,3 +132,46 @@ int i915_oa_select_metric_set_hsw(struct drm_i915_private *dev_priv) return -ENODEV; } } + +static ssize_t +show_render_basic_id(struct device *kdev, struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_BASIC); +} + +static struct device_attribute dev_attr_render_basic_id = { + .attr = { .name = "id", .mode = S_IRUGO }, + .show = show_render_basic_id, + .store = NULL, +}; + +static struct attribute *attrs_render_basic[] = { + &dev_attr_render_basic_id.attr, + NULL, +}; + +static struct attribute_group group_render_basic = { + .name = "403d8832-1a27-4aa6-a64e-f5389ce7b212", + .attrs = attrs_render_basic, +}; + +int +i915_perf_init_sysfs_hsw(struct drm_i915_private *dev_priv) +{ + int ret; + + ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_basic); + if (ret) + goto error_render_basic; + + return 0; + +error_render_basic: + return ret; +} + +void +i915_perf_deinit_sysfs_hsw(struct drm_i915_private *dev_priv) +{ + sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic); +} diff --git a/drivers/gpu/drm/i915/i915_oa_hsw.h b/drivers/gpu/drm/i915/i915_oa_hsw.h index b618a1f..e4ba89d 100644 --- a/drivers/gpu/drm/i915/i915_oa_hsw.h +++ b/drivers/gpu/drm/i915/i915_oa_hsw.h @@ -31,4 +31,8 @@ extern int i915_oa_n_builtin_metric_sets_hsw; extern int i915_oa_select_metric_set_hsw(struct drm_i915_private *dev_priv); +extern int i915_perf_init_sysfs_hsw(struct drm_i915_private *dev_priv); + +extern void i915_perf_deinit_sysfs_hsw(struct drm_i915_private *dev_priv); + #endif diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 7873d15..7e1fc6b 100644 --- a/drivers/gpu/drm/i915/i
[PATCH v4 08/11] drm/i915: Add dev.i915.perf_event_paranoid sysctl option
Consistent with the kernel.perf_event_paranoid sysctl option that can allow non-root users to access system wide cpu metrics, this can optionally allow non-root users to access system wide OA counter metrics from Gen graphics hardware. Signed-off-by: Robert Bragg --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_perf.c | 45 +++- 2 files changed, 45 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index dd88eb1..558cc0b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2116,6 +2116,7 @@ struct drm_i915_private { bool initialized; struct kobject *metrics_kobj; + struct ctl_table_header *sysctl_header; struct mutex lock; struct list_head streams; diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 7e1fc6b..ac1f600 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -62,6 +62,8 @@ #define POLL_FREQUENCY 200 #define POLL_PERIOD (NSEC_PER_SEC / POLL_FREQUENCY) +static u32 i915_perf_stream_paranoid = true; + /* The maximum exponent the hardware accepts is 63 (essentially it selects one * of the 64bit timestamp bits to trigger reports from) but there's currently * no known use case for sampling as infrequently as once per 47 thousand years. @@ -1158,7 +1160,13 @@ int i915_perf_open_ioctl_locked(struct drm_device *dev, } } - if (!specific_ctx && !capable(CAP_SYS_ADMIN)) { + /* Similar to perf's kernel.perf_paranoid_cpu sysctl option +* we check a dev.i915.perf_stream_paranoid sysctl option +* to determine if it's ok to access system wide OA counters +* without CAP_SYS_ADMIN privileges. +*/ + if (!specific_ctx && + i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) { DRM_ERROR("Insufficient privileges to open system-wide i915 perf stream\n"); ret = -EACCES; goto err_ctx; @@ -1399,6 +1407,37 @@ void i915_perf_unregister(struct drm_i915_private *dev_priv) dev_priv->perf.metrics_kobj = NULL; } +static struct ctl_table oa_table[] = { + { +.procname = "perf_stream_paranoid", +.data = &i915_perf_stream_paranoid, +.maxlen = sizeof(i915_perf_stream_paranoid), +.mode = 0644, +.proc_handler = proc_dointvec, +}, + {} +}; + +static struct ctl_table i915_root[] = { + { +.procname = "i915", +.maxlen = 0, +.mode = 0555, +.child = oa_table, +}, + {} +}; + +static struct ctl_table dev_root[] = { + { +.procname = "dev", +.maxlen = 0, +.mode = 0555, +.child = i915_root, +}, + {} +}; + void i915_perf_init(struct drm_i915_private *dev_priv) { if (!IS_HASWELL(dev_priv)) @@ -1431,6 +1470,8 @@ void i915_perf_init(struct drm_i915_private *dev_priv) dev_priv->perf.oa.n_builtin_sets = i915_oa_n_builtin_metric_sets_hsw; + dev_priv->perf.sysctl_header = register_sysctl_table(dev_root); + dev_priv->perf.initialized = true; } @@ -1439,6 +1480,8 @@ void i915_perf_fini(struct drm_i915_private *dev_priv) if (!dev_priv->perf.initialized) return; + unregister_sysctl_table(dev_priv->perf.sysctl_header); + memset(&dev_priv->perf.oa.ops, 0, sizeof(dev_priv->perf.oa.ops)); dev_priv->perf.initialized = false; } -- 2.9.2
[PATCH v4 09/11] drm/i915: add oa_event_min_timer_exponent sysctl
The minimal sampling period is now configurable via a dev.i915.oa_min_timer_exponent sysctl parameter. Following the precedent set by perf, the default is the minimum that won't (on its own) exceed the default kernel.perf_event_max_sample_rate default of 10 samples/s. Signed-off-by: Robert Bragg --- drivers/gpu/drm/i915/i915_perf.c | 42 1 file changed, 30 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index ac1f600..3d0ba09 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -74,6 +74,23 @@ static u32 i915_perf_stream_paranoid = true; */ #define OA_EXPONENT_MAX 31 +/* for sysctl proc_dointvec_minmax of i915_oa_min_timer_exponent */ +static int zero; +static int oa_exponent_max = OA_EXPONENT_MAX; + +/* Theoretically we can program the OA unit to sample every 160ns but don't + * allow that by default unless root... + * + * The period is derived from the exponent as: + * + * period = 80ns * 2^(exponent + 1) + * + * Referring to perf's kernel.perf_event_max_sample_rate for a precedent + * (10 by default); with an OA exponent of 6 we get a period of 10.240 + * microseconds - just under 10Hz + */ +static u32 i915_oa_min_timer_exponent = 6; + /* XXX: beware if future OA HW adds new report formats that the current * code assumes all reports have a power-of-two size and ~(size - 1) can * be used as a mask to align the OA tail pointer. @@ -1303,21 +1320,13 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv, return -EINVAL; } - /* NB: The exponent represents a period as follows: -* -* 80ns * 2^(period_exponent + 1) -* -* Theoretically we can program the OA unit to sample + /* Theoretically we can program the OA unit to sample * every 160ns but don't allow that by default unless * root. -* -* Referring to perf's -* kernel.perf_event_max_sample_rate for a precedent -* (10 by default); with an OA exponent of 6 we get -* a period of 10.240 microseconds -just under 10Hz */ - if (value < 6 && !capable(CAP_SYS_ADMIN)) { - DRM_ERROR("Sampling period too high without root privileges\n"); + if (value < i915_oa_min_timer_exponent && + !capable(CAP_SYS_ADMIN)) { + DRM_ERROR("OA timer exponent too low without root privileges\n"); return -EACCES; } @@ -1415,6 +1424,15 @@ static struct ctl_table oa_table[] = { .mode = 0644, .proc_handler = proc_dointvec, }, + { +.procname = "oa_min_timer_exponent", +.data = &i915_oa_min_timer_exponent, +.maxlen = sizeof(i915_oa_min_timer_exponent), +.mode = 0644, +.proc_handler = proc_dointvec_minmax, +.extra1 = &zero, +.extra2 = &oa_exponent_max, +}, {} }; -- 2.9.2
[PATCH v4 10/11] drm/i915: Add more Haswell OA metric sets
This adds 'compute', 'compute extended', 'memory reads', 'memory writes' and 'sampler balance' metric sets for Haswell. The code is auto generated from an XML description of metric sets, currently maintained in gputop, ref: https://github.com/rib/gputop > gputop-data/oa-*.xml > scripts/i915-perf-kernelgen.py $ make -C gputop-data -f Makefile.xml Signed-off-by: Robert Bragg --- drivers/gpu/drm/i915/i915_oa_hsw.c | 484 - 1 file changed, 483 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_oa_hsw.c b/drivers/gpu/drm/i915/i915_oa_hsw.c index c32b5f8..81e5628 100644 --- a/drivers/gpu/drm/i915/i915_oa_hsw.c +++ b/drivers/gpu/drm/i915/i915_oa_hsw.c @@ -30,9 +30,14 @@ enum metric_set_id { METRIC_SET_ID_RENDER_BASIC = 1, + METRIC_SET_ID_COMPUTE_BASIC, + METRIC_SET_ID_COMPUTE_EXTENDED, + METRIC_SET_ID_MEMORY_READS, + METRIC_SET_ID_MEMORY_WRITES, + METRIC_SET_ID_SAMPLER_BALANCE, }; -int i915_oa_n_builtin_metric_sets_hsw = 1; +int i915_oa_n_builtin_metric_sets_hsw = 6; static const struct i915_oa_reg b_counter_config_render_basic[] = { { _MMIO(0x2724), 0x0080 }, @@ -118,6 +123,333 @@ static int select_render_basic_config(struct drm_i915_private *dev_priv) return 0; } +static const struct i915_oa_reg b_counter_config_compute_basic[] = { + { _MMIO(0x2710), 0x }, + { _MMIO(0x2714), 0x0080 }, + { _MMIO(0x2718), 0x }, + { _MMIO(0x271c), 0x }, + { _MMIO(0x2720), 0x }, + { _MMIO(0x2724), 0x0080 }, + { _MMIO(0x2728), 0x }, + { _MMIO(0x272c), 0x }, + { _MMIO(0x2740), 0x }, + { _MMIO(0x2744), 0x }, + { _MMIO(0x2748), 0x }, + { _MMIO(0x274c), 0x }, + { _MMIO(0x2750), 0x }, + { _MMIO(0x2754), 0x }, + { _MMIO(0x2758), 0x }, + { _MMIO(0x275c), 0x }, + { _MMIO(0x236c), 0x }, +}; + +static const struct i915_oa_reg mux_config_compute_basic[] = { + { _MMIO(0x253a4), 0x }, + { _MMIO(0x2681c), 0x01f00800 }, + { _MMIO(0x26820), 0x1000 }, + { _MMIO(0x2781c), 0x01f00800 }, + { _MMIO(0x26520), 0x0007 }, + { _MMIO(0x265a0), 0x0007 }, + { _MMIO(0x25380), 0x0010 }, + { _MMIO(0x2538c), 0x0030 }, + { _MMIO(0x25384), 0xaa8a }, + { _MMIO(0x25404), 0x }, + { _MMIO(0x26800), 0x4202 }, + { _MMIO(0x26808), 0x00605817 }, + { _MMIO(0x2680c), 0x10001005 }, + { _MMIO(0x26804), 0x }, + { _MMIO(0x27800), 0x0102 }, + { _MMIO(0x27808), 0x0c0701e0 }, + { _MMIO(0x2780c), 0x000200a0 }, + { _MMIO(0x27804), 0x }, + { _MMIO(0x26484), 0x4400 }, + { _MMIO(0x26704), 0x4400 }, + { _MMIO(0x26500), 0x0006 }, + { _MMIO(0x26510), 0x0001 }, + { _MMIO(0x26504), 0x8800 }, + { _MMIO(0x26580), 0x0006 }, + { _MMIO(0x26590), 0x0020 }, + { _MMIO(0x26584), 0x }, + { _MMIO(0x26104), 0x5582 }, + { _MMIO(0x26184), 0xaa86 }, + { _MMIO(0x25420), 0x08320c83 }, + { _MMIO(0x25424), 0x06820c83 }, + { _MMIO(0x2541c), 0x }, + { _MMIO(0x25428), 0x0c03 }, +}; + +static int select_compute_basic_config(struct drm_i915_private *dev_priv) +{ + dev_priv->perf.oa.mux_regs = + mux_config_compute_basic; + dev_priv->perf.oa.mux_regs_len = + ARRAY_SIZE(mux_config_compute_basic); + + dev_priv->perf.oa.b_counter_regs = + b_counter_config_compute_basic; + dev_priv->perf.oa.b_counter_regs_len = + ARRAY_SIZE(b_counter_config_compute_basic); + + return 0; +} + +static const struct i915_oa_reg b_counter_config_compute_extended[] = { + { _MMIO(0x2724), 0xf080 }, + { _MMIO(0x2720), 0x }, + { _MMIO(0x2714), 0xf080 }, + { _MMIO(0x2710), 0x }, + { _MMIO(0x2770), 0x0007fe2a }, + { _MMIO(0x2774), 0xff00 }, + { _MMIO(0x2778), 0x0007fe6a }, + { _MMIO(0x277c), 0xff00 }, + { _MMIO(0x2780), 0x0007fe92 }, + { _MMIO(0x2784), 0xff00 }, + { _MMIO(0x2788), 0x0007fea2 }, + { _MMIO(0x278c), 0xff00 }, + { _MMIO(0x2790), 0x0007fe32 }, + { _MMIO(0x2794), 0xff00 }, + { _MMIO(0x2798), 0x0007fe9a }, + { _MMIO(0x279c), 0xff00 }, + { _MMIO(0x27a0), 0x0007ff23 }, + { _MMIO(0x27a4), 0xff00 }, + { _MMIO(0x27a8), 0x0007fff3 }, + { _MMIO(0x27ac), 0xfffe }, +}; + +static const struct i915_oa_reg mux_config_compute_extended[] = { + { _MMIO(0x2681c), 0x3eb00800 }, + { _MMIO(0x26820), 0x0090 }, + { _MMIO(0x25384), 0x02aa }, + { _MMIO(0x25404), 0x03ff }, + { _MMIO(0x26800), 0x00142284 }
[PATCH v4 11/11] drm/i915: Add a kerneldoc summary for i915_perf.c
In particular this tries to capture for posterity some of the early challenges we had with using the core perf infrastructure in case we ever want to revisit adapting perf for device metrics. Cc: Chris Wilson Signed-off-by: Robert Bragg --- drivers/gpu/drm/i915/i915_perf.c | 163 +++ 1 file changed, 163 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 3d0ba09..2798d70 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -24,6 +24,169 @@ * Robert Bragg */ + +/** + * DOC: i915 Perf, streaming API for GPU metrics + * + * Gen graphics supports a large number of performance counters that can help + * driver and application developers understand and optimize their use of the + * GPU. + * + * This i915 perf interface enables userspace to configure and open a file + * descriptor representing a stream of GPU metrics which can then be read() as + * a stream of sample records. + * + * The interface is particularly suited to exposing buffered metrics that are + * captured by DMA from the GPU, unsynchronized with and unrelated to the CPU. + * + * Streams representing a single context are accessible to applications with a + * corresponding drm file descriptor, such that OpenGL can use the interface + * without special privileges. Access to system-wide metrics requires root + * privileges by default, unless changed via the dev.i915.perf_event_paranoid + * sysctl option. + * + * + * The interface was initially inspired by the core Perf infrastructure but + * some notable differences are: + * + * i915 perf file descriptors represent a "stream" instead of an "event"; where + * a perf event primarily corresponds to a single 64bit value, while a stream + * might sample sets of tightly-coupled counters, depending on the + * configuration. For example the Gen OA unit isn't designed to support + * orthogonal configurations of individual counters; it's configured for a set + * of related counters. Samples for an i915 perf stream capturing OA metrics + * will include a set of counter values packed in a compact HW specific format. + * The OA unit supports a number of different packing formats which can be + * selected by the user opening the stream. Perf has support for grouping + * events, but each event in the group is configured, validated and + * authenticated individually with separate system calls. + * + * i915 perf stream configurations are provided as an array of u64 (key,value) + * pairs, instead of a fixed struct with multiple miscellaneous config members, + * interleaved with event-type specific members. + * + * i915 perf doesn't support exposing metrics via an mmap'd circular buffer. + * The supported metrics are being written to memory by the GPU unsynchronized + * with the CPU, using HW specific packing formats for counter sets. Sometimes + * the constraints on HW configuration require reports to be filtered before it + * would be acceptable to expose them to unprivileged applications - to hide + * the metrics of other processes/contexts. For these use cases a read() based + * interface is a good fit, and provides an opportunity to filter data as it + * gets copied from the GPU mapped buffers to userspace buffers. + * + * + * Some notes regarding Linux Perf: + * + * + * The first prototype of this driver was based on the core perf + * infrastructure, and while we did make that mostly work, with some changes to + * perf, we found we were breaking or working around too many assumptions baked + * into perf's currently cpu centric design. + * + * In the end we didn't see a clear benefit to making perf's implementation and + * interface more complex by changing design assumptions while we knew we still + * wouldn't be able to use any existing perf based userspace tools. + * + * Also considering the Gen specific nature of the Observability hardware and + * how userspace will sometimes need to combine i915 perf OA metrics with + * side-band OA data captured via MI_REPORT_PERF_COUNT commands; we're + * expecting the interface to be used by a platform specific userspace such as + * OpenGL or tools. This is to say; we aren't inherently missing out on having + * a standard vendor/architecture agnostic interface by not using perf. + * + * + * For posterity, in case we might re-visit trying to adapt core perf to be + * better suited to exposing i915 metrics these were the main pain points we + * hit: + * + * - The perf based OA PMU driver broke some significant design assumptions: + * + * Existing perf pmus are used for profiling work on a cpu and we were + * introducing the idea of _IS_DEVICE pmus with different security + * implications, the need to fake cpu-related data (such as user/kernel + * registers) to fit with perf's current design, and adding _DEVICE records + * as a way to forward device-specific status records. + * + * The OA unit writes r
[PATCH 0/3] drm/rockchip: Improve PSR/VBLANK interaction
Instead of using vblank to trigger psr enable/disable, use the flush timer built for fb_dirty all of the time. This allows us to mitiage the various edge cases, e.g. cursor. non-event updates, etc.. Sean Paul (3): drm/rockchip: Remove unnecessary vblank get/put from vop driver drm/rockchip: Don't key off vblank for psr drm/rockchip: Reduce psr flush time to 100ms drivers/gpu/drm/rockchip/rockchip_drm_fb.c | 2 +- drivers/gpu/drm/rockchip/rockchip_drm_psr.c | 74 - drivers/gpu/drm/rockchip/rockchip_drm_psr.h | 8 ++-- drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 21 +++- 4 files changed, 63 insertions(+), 42 deletions(-) -- 2.8.0.rc3.226.g39d4020
[PATCH 1/3] drm/rockchip: Remove unnecessary vblank get/put from vop driver
vblank references are handled in the top-level atomic complete function, so no need to grab them again in the vop driver. Signed-off-by: Sean Paul --- drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 11 --- 1 file changed, 11 deletions(-) diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c index 1787084..7003e4d 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c @@ -112,7 +112,6 @@ struct vop { struct device *dev; struct drm_device *drm_dev; bool is_enabled; - bool vblank_active; /* mutex vsync_ work */ struct mutex vsync_mutex; @@ -1108,10 +1107,6 @@ static void vop_crtc_atomic_begin(struct drm_crtc *crtc, struct vop *vop = to_vop(crtc); spin_lock_irq(&crtc->dev->event_lock); - vop->vblank_active = true; - WARN_ON(drm_crtc_vblank_get(crtc) != 0); - WARN_ON(vop->event); - if (crtc->state->event) { vop->event = crtc->state->event; crtc->state->event = NULL; @@ -1201,11 +1196,6 @@ static void vop_handle_vblank(struct vop *vop) if (vop->event) { drm_crtc_send_vblank_event(crtc, vop->event); vop->event = NULL; - - } - if (vop->vblank_active) { - vop->vblank_active = false; - drm_crtc_vblank_put(crtc); } spin_unlock_irqrestore(&drm->event_lock, flags); @@ -1476,7 +1466,6 @@ static int vop_initial(struct vop *vop) clk_disable(vop->aclk); vop->is_enabled = false; - vop->vblank_active = false; return 0; -- 2.8.0.rc3.226.g39d4020
[PATCH 2/3] drm/rockchip: Don't key off vblank for psr
Instead of keying off vblank for psr, just flush every time we get an atomic update. This ensures that cursor updates will properly disable psr (without turning vblank on/off), and unifies the paths between fb_dirty and atomic psr enable/disable. Signed-off-by: Sean Paul --- drivers/gpu/drm/rockchip/rockchip_drm_fb.c | 2 +- drivers/gpu/drm/rockchip/rockchip_drm_psr.c | 72 - drivers/gpu/drm/rockchip/rockchip_drm_psr.h | 8 ++-- drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 10 ++-- 4 files changed, 62 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c index ba45d9d..10cafbc 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c @@ -70,7 +70,7 @@ static int rockchip_drm_fb_dirty(struct drm_framebuffer *fb, struct drm_clip_rect *clips, unsigned int num_clips) { - rockchip_drm_psr_flush(fb->dev); + rockchip_drm_psr_flush_all(fb->dev); return 0; } diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_psr.c b/drivers/gpu/drm/rockchip/rockchip_drm_psr.c index c6ac5d0..de6252f 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_psr.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_psr.c @@ -31,6 +31,7 @@ struct psr_drv { struct drm_encoder *encoder; spinlock_t lock; + boolactive; enum psr_state state; struct timer_list flush_timer; @@ -67,11 +68,7 @@ static void psr_set_state_locked(struct psr_drv *psr, enum psr_state state) * v || * PSR_DISABLE < - - - - - - - - - */ - if (state == psr->state) - return; - - /* Requesting a flush when disabled is a noop */ - if (state == PSR_FLUSH && psr->state == PSR_DISABLE) + if (state == psr->state || !psr->active) return; psr->state = state; @@ -115,45 +112,79 @@ static void psr_flush_handler(unsigned long data) } /** - * rockchip_drm_psr_enable - enable the encoder PSR which bind to given CRTC + * rockchip_drm_psr_activate - activate PSR on the given pipe * @crtc: CRTC to obtain the PSR encoder * * Returns: * Zero on success, negative errno on failure. */ -int rockchip_drm_psr_enable(struct drm_crtc *crtc) +int rockchip_drm_psr_activate(struct drm_crtc *crtc) { struct psr_drv *psr = find_psr_by_crtc(crtc); + unsigned long flags; if (IS_ERR(psr)) return PTR_ERR(psr); - psr_set_state(psr, PSR_ENABLE); + spin_lock_irqsave(&psr->lock, flags); + psr->active = true; + spin_unlock_irqrestore(&psr->lock, flags); + return 0; } -EXPORT_SYMBOL(rockchip_drm_psr_enable); +EXPORT_SYMBOL(rockchip_drm_psr_activate); /** - * rockchip_drm_psr_disable - disable the encoder PSR which bind to given CRTC + * rockchip_drm_psr_deactivate - deactivate PSR on the given pipe * @crtc: CRTC to obtain the PSR encoder * * Returns: * Zero on success, negative errno on failure. */ -int rockchip_drm_psr_disable(struct drm_crtc *crtc) +int rockchip_drm_psr_deactivate(struct drm_crtc *crtc) { struct psr_drv *psr = find_psr_by_crtc(crtc); + unsigned long flags; + + if (IS_ERR(psr)) + return PTR_ERR(psr); + + spin_lock_irqsave(&psr->lock, flags); + psr->active = false; + spin_unlock_irqrestore(&psr->lock, flags); + del_timer_sync(&psr->flush_timer); + + return 0; +} +EXPORT_SYMBOL(rockchip_drm_psr_deactivate); + +static void rockchip_drm_do_flush(struct psr_drv *psr) +{ + mod_timer(&psr->flush_timer, + round_jiffies_up(jiffies + PSR_FLUSH_TIMEOUT)); + psr_set_state(psr, PSR_FLUSH); +} +/** + * rockchip_drm_psr_flush - flush a single pipe + * @crtc: CRTC of the pipe to flush + * + * Returns: + * 0 on success, -errno on fail + */ +int rockchip_drm_psr_flush(struct drm_crtc *crtc) +{ + struct psr_drv *psr = find_psr_by_crtc(crtc); if (IS_ERR(psr)) return PTR_ERR(psr); - psr_set_state(psr, PSR_DISABLE); + rockchip_drm_do_flush(psr); return 0; } -EXPORT_SYMBOL(rockchip_drm_psr_disable); +EXPORT_SYMBOL(rockchip_drm_psr_flush); /** - * rockchip_drm_psr_flush - force to flush all registered PSR encoders + * rockchip_drm_psr_flush_all - force to flush all registered PSR encoders * @dev: drm device * * Disable the PSR function for all registered encoders, and then enable the @@ -164,22 +195,18 @@ EXPORT_SYMBOL(rockchip_drm_psr_disable); * Returns: * Zero on success, negative errno on failure. */ -void rockchip_drm_psr_flush(struct drm_device *dev) +void rockchip_drm_psr_flush_all(struct drm_device *dev) { struct rockchip_drm_private *drm_drv = dev->dev_private; struct psr_drv *psr; uns
[PATCH 3/3] drm/rockchip: Reduce psr flush time to 100ms
3 seconds is a bit too conservative, drop this to 100ms for better power savings. Signed-off-by: Sean Paul --- drivers/gpu/drm/rockchip/rockchip_drm_psr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_psr.c b/drivers/gpu/drm/rockchip/rockchip_drm_psr.c index de6252f..2cdd6eb 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_psr.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_psr.c @@ -18,7 +18,7 @@ #include "rockchip_drm_drv.h" #include "rockchip_drm_psr.h" -#define PSR_FLUSH_TIMEOUT msecs_to_jiffies(3000) /* 3 seconds */ +#define PSR_FLUSH_TIMEOUT msecs_to_jiffies(100) enum psr_state { PSR_FLUSH, -- 2.8.0.rc3.226.g39d4020