[RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-19 Thread Marek Olšák
Hi, This is our initial proposal for explicit fences everywhere and new memory management that doesn't use BO fences. It's a redesign of how Linux graphics drivers work, and it can coexist with what we have now. *1. Introduction* (skip this if you are already sold on explicit fences) The curren

Re: [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-19 Thread Marek Olšák
ne fences would work across processes and how resilient they would be to segfaults. Marek On Mon, Apr 19, 2021 at 11:48 AM Jason Ekstrand wrote: > Not going to comment on everything on the first pass... > > On Mon, Apr 19, 2021 at 5:48 AM Marek Olšák wrote: > > > > Hi,

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-20 Thread Marek Olšák
rand: > > > Not going to comment on everything on the first pass... > > > > > > On Mon, Apr 19, 2021 at 5:48 AM Marek Olšák wrote: > > >> Hi, > > >> > > >> This is our initial proposal for explicit fences everywhere and new > memory m

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-20 Thread Marek Olšák
Daniel, imagine hardware that can only do what Windows does: future fences signalled by userspace whenever userspace wants, and no kernel queues like we have today. The only reason why current AMD GPUs work is because they have a ring buffer per queue with pointers to userspace command buffers fol

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-20 Thread Marek Olšák
On Tue, Apr 20, 2021 at 2:39 PM Daniel Vetter wrote: > On Tue, Apr 20, 2021 at 6:25 PM Marek Olšák wrote: > > > > Daniel, imagine hardware that can only do what Windows does: future > fences signalled by userspace whenever userspace wants, and no kernel > queues like we ha

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-26 Thread Marek Olšák
Thanks everybody. The initial proposal is dead. Here are some thoughts on how to do it differently. I think we can have direct command submission from userspace via memory-mapped queues ("user queues") without changing window systems. The memory management doesn't have to use GPU page faults like

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-27 Thread Marek Olšák
, 2021, 04:02 Daniel Vetter, wrote: > On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote: > > Thanks everybody. The initial proposal is dead. Here are some thoughts on > > how to do it differently. > > > > I think we can have direct command submission from users

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-27 Thread Marek Olšák
> > That could only be a problem for A+I Laptops. > > Memory management will just work with preemption fences which pause the > user queues of a process before evicting something. That will be a > dma_fence, but also a well known approach. > > Christian. > > Am 27.04.21

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-27 Thread Marek Olšák
e "external -> amd" dependency too? Marek On Tue., Apr. 27, 2021, 08:15 Daniel Vetter, wrote: > On Tue, Apr 27, 2021 at 2:11 PM Marek Olšák wrote: > > Ok. I'll interpret this as "yes, it will work, let's do it". > > It works if all you care ab

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-27 Thread Marek Olšák
have the same problem in the kernel. > > The good news is I think we can relatively easily convert i915 and older > amdgpu device to something which is compatible with user fences. > > So yes, getting that fixed case by case should work. > > Christian > > Am 27.04.21 um

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-27 Thread Marek Olšák
Supporting interop with any device is always possible. It depends on which drivers we need to interoperate with and update. We've already found the path forward for amdgpu. We just need to find out how many other drivers need to be updated and evaluate the cost/benefit aspect. Marek On Tue,

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-27 Thread Marek Olšák
to keep the kernel out of the picture. Marek On Tue, Apr 27, 2021 at 3:41 PM Jason Ekstrand wrote: > Trying to figure out which e-mail in this mess is the right one to reply > to > > On Tue, Apr 27, 2021 at 12:31 PM Lucas Stach > wrote: > > > > Hi, > > >

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-27 Thread Marek Olšák
On Wed., Apr. 28, 2021, 00:01 Jason Ekstrand, wrote: > On Tue, Apr 27, 2021 at 4:59 PM Marek Olšák wrote: > > > > Jason, both memory-based signalling as well as interrupt-based > signalling to the CPU would be supported by amdgpu. External devices don't > need to

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-05-01 Thread Marek Olšák
On Wed, Apr 28, 2021 at 5:07 AM Michel Dänzer wrote: > On 2021-04-28 8:59 a.m., Christian König wrote: > > Hi Dave, > > > > Am 27.04.21 um 21:23 schrieb Marek Olšák: > >> Supporting interop with any device is always possible. It depends on > which drivers we nee

Re: Is LLVM 13 (git) really ready for testing/development? libclc didn't compile

2021-03-05 Thread Marek Olšák
Hi, I can't answer this because our Mesa team doesn't work on LLVM and we don't build libclc. Marek On Thu, Mar 4, 2021 at 10:20 PM Dieter Nützel wrote: > Hello Marek, > > can't compile anything, here. > Poor Intel Nehalem X3470. > > Trying LLVM 12-rc2 later. > > Greetings, > Dieter > > llvm-p

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-05-03 Thread Marek Olšák
hink where that leaves us is that there is no safe place to > create a dma_fence except for inside the ioctl which submits the work > and only after any necessary memory has been allocated. That's a > pretty stiff requirement. We may still be able to interact with > userspace a bit

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-05-03 Thread Marek Olšák
but in the kernel without concurrency/preemption. Is this now safe enough for dma_fence? Marek On Mon, May 3, 2021 at 4:36 PM Marek Olšák wrote: > What about direct submit from the kernel where the process still has write > access to the GPU ring buffer but doesn't use it? I

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-05-04 Thread Marek Olšák
I see some mentions of XNACK and recoverable page faults. Note that all gaming AMD hw that has userspace queues doesn't have XNACK, so there is no overhead in compute units. My understanding is that recoverable page faults are still supported without XNACK, but instead of the compute unit replaying

Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-29 Thread Marek Olšák
I don't know what iris does, but I would guess that the same problems as with AMD GPUs apply, making GPU resets very fragile. Marek On Tue., Mar. 29, 2022, 08:14 Christian König, wrote: > My main question is what does the iris driver better than radeonsi when > the client doesn't support the r

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2022-01-17 Thread Marek Olšák
I don't think fork() would work with userspace where all buffers are shared. It certainly doesn't work now. The driver needs to be notified that a buffer or texture is shared to ensure data coherency between processes, and the driver must execute decompression and other render passes when a buffer

"Fixes" for page flipping under PRIME on AMD & nouveau

2016-08-18 Thread Marek Olšák
On Thu, Aug 18, 2016 at 4:23 AM, Michel Dänzer wrote: > Maybe the rasterization as two triangles results in bad PCIe bandwidth > utilization. Using the asynchronous DMA engine for these transfers would > probably be ideal, but having the 3D engine rasterize a single rectangle > (either using the

[PATCH 1/2] Revert "include/uapi/drm/amdgpu_drm.h: use __u32 and __u64 from "

2016-08-19 Thread Marek Olšák
From: Marek Olšák This reverts commit 2ce9dde0d47f2f94ab25c73a30596a7328bcdf1f. See the comment in the code. Basically, don't do cleanups in this header. Signed-off-by: Marek Olšák --- include/uapi/drm/amdgpu_drm.h | 295 +- 1 file changed, 150 inse

[PATCH 2/2] Revert "radeon_drm.h: use __u32 and __u64 from linux/types.h"

2016-08-19 Thread Marek Olšák
From: Marek Olšák This reverts commit 31b4dfe24e903e995a32f17e9a9cafbbecabc77a. See the comment in the code. Basically, don't do cleanups in this header. Signed-off-by: Marek Olšák --- include/uapi/drm/radeon_drm.h | 133 ++ 1 file changed, 69 inser

[PATCH 1/2] Revert "include/uapi/drm/amdgpu_drm.h: use __u32 and __u64 from "

2016-08-19 Thread Marek Olšák
On Fri, Aug 19, 2016 at 4:52 PM, Mikko Rapeli wrote: > On Fri, Aug 19, 2016 at 04:26:40PM +0200, Christian König wrote: >> Am 19.08.2016 um 15:50 schrieb Marek Olšák: >> >From: Marek Olšák >> > >> >This reverts commit 2ce9dde0d47f2f94ab25c73a30596a7328bcdf1f. >> > >> >See the comment in the

[PATCH 1/2] Revert "include/uapi/drm/amdgpu_drm.h: use __u32 and __u64 from "

2016-08-19 Thread Marek Olšák
On Fri, Aug 19, 2016 at 7:12 PM, Daniel Vetter wrote: > On Fri, Aug 19, 2016 at 7:11 PM, Daniel Vetter wrote: >> On Fri, Aug 19, 2016 at 5:22 PM, Marek Olšák wrote: >>> On Fri, Aug 19, 2016 at 4:52 PM, Mikko Rapeli >>> wrote: On Fri, Aug 19, 2016 at 04:26:40PM +0200, Christian König wr

[PATCH 1/2] Revert "include/uapi/drm/amdgpu_drm.h: use __u32 and __u64 from "

2016-08-20 Thread Marek Olšák
On Sat, Aug 20, 2016 at 12:54 AM, Emil Velikov wrote: > On 19 August 2016 at 15:26, Christian König > wrote: >> Am 19.08.2016 um 15:50 schrieb Marek Olšák: >>> >>> From: Marek Olšák >>> >>> This reverts commit 2ce9dde0d47f2f94ab25c73a30596a7328bcdf1f. >>> >>> See the comment in the code.

[PATCH 1/2] Revert "include/uapi/drm/amdgpu_drm.h: use __u32 and __u64 from "

2016-08-20 Thread Marek Olšák
On Sat, Aug 20, 2016 at 1:08 PM, Emil Velikov wrote: > On 20 August 2016 at 11:05, Marek Olšák wrote: >> On Sat, Aug 20, 2016 at 12:54 AM, Emil Velikov >> wrote: >>> On 19 August 2016 at 15:26, Christian König >>> wrote: Am 19.08.2016 um 15:50 schrieb Marek Olšák: > > From:

[PATCH 1/2] Revert "include/uapi/drm/amdgpu_drm.h: use __u32 and __u64 from "

2016-08-20 Thread Marek Olšák
On Sat, Aug 20, 2016 at 2:20 PM, Emil Velikov wrote: > On 20 August 2016 at 12:47, Marek Olšák wrote: >> On Sat, Aug 20, 2016 at 1:08 PM, Emil Velikov >> wrote: >>> On 20 August 2016 at 11:05, Marek Olšák wrote: On Sat, Aug 20, 2016 at 12:54 AM, Emil Velikov >>> gmail.com> wrote: >>>

[PATCH 1/2] Revert "include/uapi/drm/amdgpu_drm.h: use __u32 and __u64 from "

2016-08-20 Thread Marek Olšák
On Sat, Aug 20, 2016 at 8:08 PM, Mikko Rapeli wrote: > Cc'ing lkml. > > On Sat, Aug 20, 2016 at 12:05:54PM +0200, Marek Olšák wrote: >> On Sat, Aug 20, 2016 at 12:54 AM, Emil Velikov >> wrote: >> > On 19 August 2016 at 15:26, Christian König >> > wrote: >> >> Am 19.08.2016 um 15:50 schrieb

[PATCH 1/2] Revert "include/uapi/drm/amdgpu_drm.h: use __u32 and __u64 from "

2016-08-20 Thread Marek Olšák
On Sat, Aug 20, 2016 at 8:28 PM, Marek Olšák wrote: > On Sat, Aug 20, 2016 at 8:08 PM, Mikko Rapeli wrote: >> Cc'ing lkml. >> >> On Sat, Aug 20, 2016 at 12:05:54PM +0200, Marek Olšák wrote: >>> On Sat, Aug 20, 2016 at 12:54 AM, Emil Velikov >> gmail.com> wrote: >>> > On 19 August 2016 at 15:2

Re: [PATCH 4/4] drm/amdgpu: stop removing BOs from the LRU during CS

2019-05-10 Thread Marek Olšák
Hi, This patch series doesn't help with the OOM errors due to GDS. Reproducible with: AMD_DEBUG=testgdsmm glxgears & AMD_DEBUG=testgdsmm glxgears Marek On Fri, May 10, 2019 at 10:13 AM Christian König < ckoenig.leichtzumer...@gmail.com> wrote: > This avoids OOM situations when we have lots of

Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS

2019-05-14 Thread Marek Olšák
This series fixes the OOM errors. However, if I torture the kernel driver more, I can get it to deadlock and end up with unkillable processes. I can also get an OOM error. I just ran the test 5 times: AMD_DEBUG=testgdsmm glxgears & AMD_DEBUG=testgdsmm glxgears & AMD_DEBUG=testgdsmm glxgears & AMD_

[ANNOUNCE] libdrm 2.4.97

2019-01-22 Thread Marek Olšák
Marchi (2): gitignore: sort file gitignore: add _build Marek Olšák (3): amdgpu: update amdgpu_drm.h amdgpu: add a faster BO list API Bump the version to 2.4.97 Mauro Rossi (1): android: Fix 32-bit app crashing in 64-bit Android git tag: libdrm-2.4.

[ANNOUNCE] libdrm 2.4.99

2019-07-02 Thread Marek Olšák
with memset rather than "= {0}" Leo Liu (1): tests/amdgpu/vcn: add VCN2.0 decode support Lucas Stach (1): etnaviv: drop etna_bo_from_handle symbol Marek Olšák (1): Bump version to 2.4.99 Marek Vasut (1): etnaviv: Fix double-free in etna_bo_cache_free() M

Re: Why is Thunderbolt 3 limited to 2.5 GT/s on Linux?

2019-07-03 Thread Marek Olšák
You can run: AMD_DEBUG=testdmaperf glxgears It tests transfer sizes of up to 128 MB, and it tests ~60 slightly different methods of transferring data. Marek On Wed, Jul 3, 2019 at 4:07 AM Michel Dänzer wrote: > On 2019-07-02 11:49 a.m., Timur Kristóf wrote: > > On Tue, 2019-07-02 at 10:09 +0200

Re: [PATCH 1/2] mesa: Fix clang build error w/ util_blitter_get_color_format_for_zs()

2019-07-03 Thread Marek Olšák
logic. > > Cc: Rob Clark > Cc: Emil Velikov > Cc: Amit Pundir > Cc: Sumit Semwal > Cc: Alistair Strachan > Cc: Greg Hartman > Cc: Tapani Pälli > Cc: Marek Olšák > Signed-off-by: John Stultz > --- > src/gallium/auxiliary/util/u_blitter.h | 3 +++ > 1 f

Re: Why is Thunderbolt 3 limited to 2.5 GT/s on Linux?

2019-07-05 Thread Marek Olšák
On Fri, Jul 5, 2019 at 5:27 AM Timur Kristóf wrote: > On Wed, 2019-07-03 at 14:44 -0400, Marek Olšák wrote: > > You can run: > > AMD_DEBUG=testdmaperf glxgears > > > > It tests transfer sizes of up to 128 MB, and it tests ~60 slightly > > different methods o

[ANNOUNCE] libdrm 2.4.100

2019-10-16 Thread Marek Olšák
with kernel Marek Olšák (5): include: update amdgpu_drm.h amdgpu: add amdgpu_cs_query_reset_state2 for AMDGPU_CTX_OP_QUERY_STATE2 Bump the version to 2.4.100 Revert "libdrm: remove autotools support" Bump the version to 2.4.100 for autotools Niclas

Re: [RFC] drm: Add AMD GFX9+ format modifiers.

2019-10-17 Thread Marek Olšák
On Wed, Oct 16, 2019 at 9:48 AM Bas Nieuwenhuizen wrote: > This adds initial format modifiers for AMD GFX9 and newer GPUs. > > This is particularly useful to determine if we can use DCC, and whether > we need an extra display compatible DCC metadata plane. > > Design decisions: > - Always expos

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-29 Thread Marek Olšák
For Mesa, we could run CI only when Marge pushes, so that it's a strictly pre-merge CI. Marek On Sat., Feb. 29, 2020, 17:20 Nicolas Dufresne, wrote: > Le samedi 29 février 2020 à 15:54 -0600, Jason Ekstrand a écrit : > > On Sat, Feb 29, 2020 at 3:47 PM Timur Kristóf > wrote: > > > On Sat, 2020

Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem

2020-03-13 Thread Marek Olšák
There is no synchronization between processes (e.g. 3D app and compositor) within X on AMD hw. It works because of some hacks in Mesa. Marek On Wed, Mar 11, 2020 at 1:31 PM Jason Ekstrand wrote: > All, > > Sorry for casting such a broad net with this one. I'm sure most people > who reply will g

Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem

2020-03-15 Thread Marek Olšák
how implicit sync works, I'd like to have it corrected. People continue > claiming that AMD is somehow special but I have yet to grasp what makes it > so. (Not that anyone has bothered to try all that hard to explain it.) > > > --Jason > > On March 13, 2020 21:03:21 Mar

Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem

2020-03-16 Thread Marek Olšák
On Mon, Mar 16, 2020 at 5:57 AM Michel Dänzer wrote: > On 2020-03-16 4:50 a.m., Marek Olšák wrote: > > The synchronization works because the Mesa driver waits for idle (drains > > the GFX pipeline) at the end of command buffers and there is only 1 > > graphics queue, so

Re: Plumbing explicit synchronization through the Linux ecosystem

2020-03-17 Thread Marek Olšák
On Tue., Mar. 17, 2020, 06:02 Michel Dänzer, wrote: > On 2020-03-16 7:33 p.m., Marek Olšák wrote: > > On Mon, Mar 16, 2020 at 5:57 AM Michel Dänzer > wrote: > >> On 2020-03-16 4:50 a.m., Marek Olšák wrote: > >>> The synchronization works because the Mes

Re: Plumbing explicit synchronization through the Linux ecosystem

2020-03-19 Thread Marek Olšák
On Thu., Mar. 19, 2020, 06:51 Daniel Vetter, wrote: > On Tue, Mar 17, 2020 at 11:01:57AM +0100, Michel Dänzer wrote: > > On 2020-03-16 7:33 p.m., Marek Olšák wrote: > > > On Mon, Mar 16, 2020 at 5:57 AM Michel Dänzer > wrote: > > >> On 2020-03-16 4:50 a.m

Re: [PATCH] drm: add drm device name

2019-09-16 Thread Marek Olšák
> On Fri, Sep 6, 2019 at 3:16 PM Marek Olšák wrote: > > > > > > + dri-devel > > > > > > On Tue, Sep 3, 2019 at 5:41 PM Jiang, Sonny > wrote: > > >> > > >> Add DRM device name and use DRM_IOCTL_VERSION ioctl drmVersion::desc

Re: [PATCH] drm: add drm device name

2019-09-17 Thread Marek Olšák
>> Am 17.09.19 um 10:17 schrieb Daniel Vetter: > >>>>> On Tue, Sep 17, 2019 at 10:12 AM Christian König > >>>>> wrote: > >>>>>> Am 17.09.19 um 07:47 schrieb Jani Nikula: > >>>>>>> On Mon, 16 Sep 2019, Marek Olšák w

Re: [PATCH] drm: add drm device name

2019-09-18 Thread Marek Olšák
On Wed, Sep 18, 2019 at 10:03 AM Michel Dänzer wrote: > On 2019-09-18 1:41 a.m., Marek Olšák wrote: > > drmVersion::name = amdgpu, radeon, intel, etc. > > drmVersion::desc = vega10, vega12, vega20, ... > > > > The common Mesa code will use name and desc to select th

Re: [PATCH] drm: add drm device name

2019-09-18 Thread Marek Olšák
Let's drop this patch. Mesa will use family_id. Marek On Wed, Sep 18, 2019 at 4:10 PM Marek Olšák wrote: > On Wed, Sep 18, 2019 at 10:03 AM Michel Dänzer wrote: > >> On 2019-09-18 1:41 a.m., Marek Olšák wrote: >> > drmVersion::name = amdgpu, radeon, intel, etc. >

Re: [PATCH] drm: add drm device name

2019-09-06 Thread Marek Olšák
+ dri-devel On Tue, Sep 3, 2019 at 5:41 PM Jiang, Sonny wrote: > Add DRM device name and use DRM_IOCTL_VERSION ioctl drmVersion::desc > passing it to user space > instead of unused DRM driver name descriptor. > > Change-Id: I809f6d3e057111417efbe8fa7cab8f0113ba4b21 > Signed-off-by: Sonny Jiang

Linux Graphics Next: Userspace submission update

2021-05-27 Thread Marek Olšák
Hi, Since Christian believes that we can't deadlock the kernel with some changes there, we just need to make everything nice for userspace too. Instead of explaining how it will work, I will explain the cases where future hardware (and its kernel driver) will break existing userspace in order to p

Re: Linux Graphics Next: Userspace submission update

2021-05-28 Thread Marek Olšák
iggest problem are the sync_files for Android, since they are really > not easy to support at all. If Android wants to support user queues we > would probably have to do some changes there. > > Regards, > Christian. > > Am 27.05.21 um 23:51 schrieb Marek Olšák: > > Hi

Re: Linux Graphics Next: Userspace submission update

2021-05-28 Thread Marek Olšák
e, but it's not possible to know which process is guilty (all processes holding the buffer handle would be suspects). Marek On Fri, May 28, 2021 at 6:25 PM Marek Olšák wrote: > If both implicit and explicit synchronization are handled the same, then > the kernel won't be able to iden

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-01 Thread Marek Olšák
On 2021-06-01 12:21 p.m., Christian König wrote: > >>>> Am 01.06.21 um 11:02 schrieb Michel Dänzer: > >>>>> On 2021-05-27 11:51 p.m., Marek Olšák wrote: > >>>>>> 3) Compositors (and other privileged processes, and display > flipping) can

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Marek Olšák
Yes, we can't break anything because we don't want to complicate things for us. It's pretty much all NAK'd already. We are trying to gather more knowledge and then make better decisions. The idea we are considering is that we'll expose memory-based sync objects to userspace for read only, and the

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Marek Olšák
On Wed, Jun 2, 2021 at 5:34 AM Marek Olšák wrote: > Yes, we can't break anything because we don't want to complicate things > for us. It's pretty much all NAK'd already. We are trying to gather more > knowledge and then make better decisions. > > The idea we a

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Marek Olšák
On Wed, Jun 2, 2021 at 5:44 AM Christian König < ckoenig.leichtzumer...@gmail.com> wrote: > Am 02.06.21 um 10:57 schrieb Daniel Stone: > > Hi Christian, > > > > On Tue, 1 Jun 2021 at 13:51, Christian König > > wrote: > >> Am 01.06.21 um 14:30 schrieb Daniel Vetter: > >>> If you want to enable thi

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Marek Olšák
On Wed, Jun 2, 2021 at 2:48 PM Daniel Vetter wrote: > On Wed, Jun 02, 2021 at 05:38:51AM -0400, Marek Olšák wrote: > > On Wed, Jun 2, 2021 at 5:34 AM Marek Olšák wrote: > > > > > Yes, we can't break anything because we don't want to complicate things > >

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-03 Thread Marek Olšák
On Thu, Jun 3, 2021 at 3:47 AM Daniel Vetter wrote: > On Wed, Jun 02, 2021 at 11:16:39PM -0400, Marek Olšák wrote: > > On Wed, Jun 2, 2021 at 2:48 PM Daniel Vetter wrote: > > > > > On Wed, Jun 02, 2021 at 05:38:51AM -0400, Marek Olšák wrote: > > > > On Wed

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-03 Thread Marek Olšák
On Thu., Jun. 3, 2021, 06:03 Daniel Vetter, wrote: > On Thu, Jun 03, 2021 at 04:20:18AM -0400, Marek Olšák wrote: > > On Thu, Jun 3, 2021 at 3:47 AM Daniel Vetter wrote: > > > > > On Wed, Jun 02, 2021 at 11:16:39PM -0400, Marek Olšák wrote: > > > > On Wed,

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-03 Thread Marek Olšák
r the hw does it. It's the same code, just in a different place. Thanks, Marek On Thu, Jun 3, 2021 at 7:22 AM Daniel Vetter wrote: > On Thu, Jun 3, 2021 at 12:55 PM Marek Olšák wrote: > > > > On Thu., Jun. 3, 2021, 06:03 Daniel Vetter, wrote: > >> > >> On T

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-03 Thread Marek Olšák
On Thu., Jun. 3, 2021, 15:18 Daniel Vetter, wrote: > On Thu, Jun 3, 2021 at 7:53 PM Marek Olšák wrote: > > > > Daniel, I think what you are suggesting is that we need to enable user > queues with the drm scheduler and dma_fence first, and once that works, we > can investi

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-10 Thread Marek Olšák
Hi Daniel, We just talked about this whole topic internally and we came to the conclusion that the hardware needs to understand sync object handles and have high-level wait and signal operations in the command stream. Sync objects will be backed by memory, but they won't be readable or writable

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-14 Thread Marek Olšák
d that. > > If the hardware says that seq - 1 was written fine, but seq is missing > then the kernel blames whoever was supposed to write seq. > > Just piping the write through a privileged instance should be fine to > make sure that we don't run into issues. > > Chri

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-17 Thread Marek Olšák
> As long as we can figure out who touched a certain sync object last > that > > would indeed work, yes. > > Don't you need to know who will touch it next, i.e. who is holding up your > fence? Or maybe I'm just again totally confused. > -Daniel > > > >

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-17 Thread Marek Olšák
against such behavior because it will receive them too. I don't know if that would work with dma_fence. Marek On Thu, Jun 17, 2021 at 3:04 PM Daniel Vetter wrote: > On Thu, Jun 17, 2021 at 02:28:06PM -0400, Marek Olšák wrote: > > The kernel will know who should touch the implicit-

Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-24 Thread Marek Olšák
Soft resets are fatal just as hard resets, but no reset is "always fatal". There are cases when apps keep working depending on which features are being used. It's still unsafe. Marek On Mon, Apr 24, 2023, 03:03 Christian König wrote: > Am 24.04.23 um 03:43 schrieb André Almeida: > > When a DRM

Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-25 Thread Marek Olšák
ed by userspace may result in persistent corruption. Marek On Tue, Apr 25, 2023 at 6:27 AM Michel Dänzer wrote: > On 4/24/23 18:45, Marek Olšák wrote: > > Soft resets are fatal just as hard resets, but no reset is "always > fatal". There are cases when apps keep working

Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-25 Thread Marek Olšák
y the case. If an application has > enabled robustness it should notice that something went wrong and act > appropriately. > > The only thing we need to handle is for applications without robustness > in case of a hard reset or otherwise it will trigger a reset over and > over again.

Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-26 Thread Marek Olšák
probability. No such app can be allowed to continue executing after a reset. Marek On Wed, Apr 26, 2023 at 5:51 AM Michel Dänzer wrote: > On 4/25/23 21:11, Marek Olšák wrote: > > The last 3 comments in this thread contain arguments that are false and > were specifically pointed out as fals

Re: [RFC PATCH 0/1] Add AMDGPU_INFO_GUILTY_APP ioctl

2023-05-03 Thread Marek Olšák
GPU hangs are pretty common post-bringup. They are not common per user, but if we gather all hangs from all users, we can have lots and lots of them. GPU hangs are indeed not very debuggable. There are however some things we can do: - Identify the hanging IB by its VA (the kernel should know it) -

Re: [RFC PATCH 0/1] Add AMDGPU_INFO_GUILTY_APP ioctl

2023-05-03 Thread Marek Olšák
WRITE_DATA with ENGINE=PFP will execute the packet on the frontend engine, while ENGINE=ME will execute the packet on the backend engine. Marek On Wed, May 3, 2023 at 1:08 PM Marek Olšák wrote: > GPU hangs are pretty common post-bringup. They are not common per user, > but if we gath

Re: [RFC PATCH 0/1] Add AMDGPU_INFO_GUILTY_APP ioctl

2023-05-03 Thread Marek Olšák
On Wed, May 3, 2023, 14:53 André Almeida wrote: > Em 03/05/2023 14:08, Marek Olšák escreveu: > > GPU hangs are pretty common post-bringup. They are not common per user, > > but if we gather all hangs from all users, we can have lots and lots of > > them. > > > >

Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-06-27 Thread Marek Olšák
On Tue, Jun 27, 2023, 09:23 André Almeida wrote: > Create a section that specifies how to deal with DRM device resets for > kernel and userspace drivers. > > Acked-by: Pekka Paalanen > Signed-off-by: André Almeida > --- > > v4: > https://lore.kernel.org/lkml/20230626183347.55118-1-andrealm...@i

Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-06-27 Thread Marek Olšák
On Tue, Jun 27, 2023 at 5:31 PM André Almeida wrote: > Hi Marek, > > Em 27/06/2023 15:57, Marek Olšák escreveu: > > On Tue, Jun 27, 2023, 09:23 André Almeida > <mailto:andrealm...@igalia.com>> wrote: > > > > +User Mode Driver > > +---

Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-06-30 Thread Marek Olšák
That's a terrible idea. Ignoring API calls would be identical to a freeze. You might as well disable GPU recovery because the result would be the same. There are 2 scenarios: - robust contexts: report the GPU reset status and skip API calls; let the app recreate the context to recover - non-robust

Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-07-03 Thread Marek Olšák
On Mon, Jul 3, 2023, 03:12 Michel Dänzer wrote: > On 6/30/23 22:32, Marek Olšák wrote: > > On Fri, Jun 30, 2023 at 11:11 AM Michel Dänzer < > michel.daen...@mailbox.org <mailto:michel.daen...@mailbox.org>> wrote: > >> On 6/30/23 16:59, Alex Deucher wrote: >

Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-07-03 Thread Marek Olšák
On Mon, Jul 3, 2023, 22:38 Randy Dunlap wrote: > > > On 7/3/23 19:34, Marek Olšák wrote: > > > > > > On Mon, Jul 3, 2023, 03:12 Michel Dänzer <mailto:michel.daen...@mailbox.org>> wrote: > > > > Marek, > Please stop sending html emails to the ma

Re: Non-robust apps and resets (was Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations)

2023-07-25 Thread Marek Olšák
On Tue, Jul 25, 2023 at 4:03 AM Michel Dänzer wrote: > > On 7/25/23 04:55, André Almeida wrote: > > Hi everyone, > > > > It's not clear what we should do about non-robust OpenGL apps after GPU > > resets, so I'll try to summarize the topic, show some options and my > > proposal to move forward o

Re: Non-robust apps and resets (was Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations)

2023-08-02 Thread Marek Olšák
A screen that doesn't update isn't usable. Killing the window system and returning to the login screen is one option. Killing the window system manually from a terminal or over ssh and then returning to the login screen is another option, but 99% of users either hard-reset the machine or do sysrq+R

Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-08-08 Thread Marek Olšák
It's the same situation as SIGSEGV. A process can catch the signal, but if it doesn't, it gets killed. GL and Vulkan APIs give you a way to catch the GPU error and prevent the process termination. If you don't use the API, you'll get undefined behavior, which means anything can happen, including pr

Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-08-09 Thread Marek Olšák
On Wed, Aug 9, 2023 at 3:35 AM Michel Dänzer wrote: > > On 8/8/23 19:03, Marek Olšák wrote: > > It's the same situation as SIGSEGV. A process can catch the signal, > > but if it doesn't, it gets killed. GL and Vulkan APIs give you a way > > to catch the

Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-07-04 Thread Marek Olšák
On Tue, Jul 4, 2023, 03:55 Michel Dänzer wrote: > On 7/4/23 04:34, Marek Olšák wrote: > > On Mon, Jul 3, 2023, 03:12 Michel Dänzer <mailto:michel.daen...@mailbox.org>> wrote: > > On 6/30/23 22:32, Marek Olšák wrote: > > > On Fri, Jun 30, 2023 at 11:11

Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-07-05 Thread Marek Olšák
On Wed, Jul 5, 2023 at 3:32 AM Michel Dänzer wrote: > > On 7/5/23 08:30, Marek Olšák wrote: > > On Tue, Jul 4, 2023, 03:55 Michel Dänzer wrote: > > On 7/4/23 04:34, Marek Olšák wrote: > > > On Mon, Jul 3, 2023, 03:12 Michel Dänzer > > wrote: > >

mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]

2016-10-16 Thread Marek Olšák
On Fri, Oct 14, 2016 at 3:33 AM, Michel Dänzer wrote: > > [ Adding Dan Williams and dri-devel ] > > On 14/10/16 03:28 AM, Shawn Starr wrote: >> Hello AMD folks, >> >> I have discovered a problem in Linus master that affects AMDGPU, nobody would >> notice this in drm-next-4.9-wip since its not in

Unix Device Memory Allocation project

2016-10-19 Thread Marek Olšák
Hi, The text below describes how open source AMDGPU buffer sharing works. I hope you'll find some useful bits in it. Producer = allocates a buffer (or texture), and exports its handle (DMABUF, etc.), and can use the buffer in various ways Consumer = imports the handle, and can use the buffer in

mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]

2016-10-19 Thread Marek Olšák
On Wed, Oct 19, 2016 at 8:42 AM, Dave Airlie wrote: > On 18 October 2016 at 23:53, Dan Williams wrote: >> On Mon, Oct 17, 2016 at 8:48 PM, Dave Airlie wrote: >> [..] > Aren't there only 2 possibilities for this regression? > > 1/ a memtype entry was never made so track_pfn_insert() r

Unix Device Memory Allocation project

2016-10-19 Thread Marek Olšák
On Oct 19, 2016 8:24 AM, "Daniel Vetter" wrote: > > On Wed, Oct 19, 2016 at 1:40 AM, Marek Olšák wrote: > > - The producer-consumer interop API doesn't know about the metadata. > > All you need to pass around is a buffer handle. (KMS, DMABUF, etc.) > > * There was a note during the talk that

Unix Device Memory Allocation project

2016-10-19 Thread Marek Olšák
On Oct 19, 2016 2:33 PM, "Nicolai Hähnle" wrote: > > On 19.10.2016 01:40, Marek Olšák wrote: >> >> * We can build upon this idea. I think the worst thing to do would >> be to add metadata handling to driver-agnostic userspace APIs. Really, >> driver-agnostic APIs shouldn't know about that, be

Unix Device Memory Allocation project

2016-10-19 Thread Marek Olšák
On Wed, Oct 19, 2016 at 4:10 PM, Daniel Vetter wrote: > On Wed, Oct 19, 2016 at 03:15:08PM +0200, Marek Olšák wrote: >> On Oct 19, 2016 8:24 AM, "Daniel Vetter" wrote: >> > On Wed, Oct 19, 2016 at 1:40 AM, Marek Olšák wrote: >> > > - The producer-consumer interop API doesn't know about the m

mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]

2016-10-20 Thread Marek Olšák
On Thu, Oct 20, 2016 at 3:11 AM, Michel Dänzer wrote: > On 19/10/16 07:33 PM, Marek Olšák wrote: >> On Wed, Oct 19, 2016 at 8:42 AM, Dave Airlie wrote: >>> On 18 October 2016 at 23:53, Dan Williams >>> wrote: On Mon, Oct 17, 2016 at 8:48 PM, Dave Airlie wrote: [..] >>> Aren't

[ANNOUNCE] libdrm 2.4.76

2017-03-29 Thread Marek Olšák
dd uvd unit test support for vega10 tests/amdgpu: add vce unit test support for vega10 amdgpu_drm: add AMDGPU_HW_IP_UVD_ENC Marek Olšák (3): amdgpu: sync amdgpu_drm.h with kernel 4.11-rc2 amdgpu: update amdgpu_drm.h for Vega10 configure.ac: bump version for relea

Re: [PATCH libdrm 3/3] amdgpu: add REPLACE and CLEAR checking for VA op (v2)

2017-04-03 Thread Marek Olšák
> - if (ops != AMDGPU_VA_OP_MAP && ops != AMDGPU_VA_OP_UNMAP) > + if (ops != AMDGPU_VA_OP_MAP && ops != AMDGPU_VA_OP_UNMAP && > + ops != AMDGPU_VA_OP_REPLACE && ops != AMDGPU_VA_OP_CLEAR) > + Spurious empty line? Other than that, the seri
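The check quoted in that patch can be read as a small whitelist of accepted VA operations. The sketch below restates it as a standalone C function. The `EX_VA_OP_*` constants and their numeric values are stand-ins invented for this example (the real `AMDGPU_VA_OP_*` definitions come from `amdgpu_drm.h`), `check_va_op` is a hypothetical helper, and returning `-EINVAL` on rejection is an assumption, since the quoted hunk does not show the body of the `if`.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in operation codes for illustration only; not the real header values. */
enum {
    EX_VA_OP_MAP     = 1,
    EX_VA_OP_UNMAP   = 2,
    EX_VA_OP_CLEAR   = 3,
    EX_VA_OP_REPLACE = 4
};

/*
 * Mirror of the patched condition: accept MAP, UNMAP, REPLACE and CLEAR,
 * reject everything else. Returns 0 if accepted, -EINVAL otherwise
 * (the error value is an assumption, see the lead-in).
 */
int check_va_op(uint32_t ops)
{
    if (ops != EX_VA_OP_MAP && ops != EX_VA_OP_UNMAP &&
        ops != EX_VA_OP_REPLACE && ops != EX_VA_OP_CLEAR)
        return -EINVAL;

    return 0;
}
```

Before the patch, only the first two comparisons were present, so a REPLACE or CLEAR request would have been rejected by the same test.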

[ANNOUNCE] libdrm 2.4.77

2017-04-04 Thread Marek Olšák
Erik Faye-Lund (1): tegra: update symbol-check Junwei Zhang (1): amdgpu: add REPLACE and CLEAR checking for VA op (v2) Marek Olšák (1): configure.ac: bump the version to 2.4.77 Nicolai Hähnle (3): amdgpu: add amdgpu_bo_va_op_raw headers: sync amdgpu_drm.h from

[ANNOUNCE] libdrm 2.4.79

2017-04-08 Thread Marek Olšák
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Marek Olšák (1): configure.ac: bump version for release Samuel Pitoiset (1): amdgpu: allow to query GPU sensor related information git tag: libdrm-2.4.79 https://dri.freedesktop.org/libdrm/libdrm-2.4.79.tar.bz2 MD5

[PATCH libdrm] headers: Add README file

2016-11-11 Thread Marek Olšák
On Fri, Nov 11, 2016 at 5:21 PM, Alex Deucher wrote: > On Fri, Nov 11, 2016 at 8:44 AM, Emil Velikov > wrote: >> On 10 November 2016 at 21:07, Alex Deucher wrote: >>> On Thu, Nov 10, 2016 at 11:44 AM, Emil Velikov wrote: From: Emil Velikov Since we're trying to sta

Re: [PATCH 6/9] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS

2017-06-26 Thread Marek Olšák
On Mon, Jun 26, 2017 at 11:27 AM, Michel Dänzer wrote: > On 25/06/17 03:00 AM, Christian König wrote: >> Am 23.06.2017 um 19:39 schrieb John Brooks: >>> When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by >>> userspace, >>> it should only be treated as a hint to initially place a BO so

Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS

2017-07-07 Thread Marek Olšák
On Fri, Jun 30, 2017 at 8:47 AM, Christian König wrote: > Am 30.06.2017 um 04:24 schrieb Michel Dänzer: >> >> On 29/06/17 07:05 PM, Daniel Vetter wrote: >>> >>> On Thu, Jun 29, 2017 at 06:58:05PM +0900, Michel Dänzer wrote: On 29/06/17 05:23 PM, Christian König wrote: > > Am 29.0

Re: [PATCH libdrm] libdrm_amdgpu: add kernel semaphore support

2017-07-11 Thread Marek Olšák
On Tue, Jul 11, 2017 at 11:20 AM, Dave Airlie wrote: > On 11 July 2017 at 18:36, Christian König wrote: >> Am 11.07.2017 um 08:49 schrieb Dave Airlie: >>> >>> On 7 July 2017 at 19:07, Christian König wrote: Hi Dave, on first glance that looks rather good to me, but there is o

Re: [PATCH libdrm] libdrm/amdgpu: add interface for kernel semaphores

2017-03-14 Thread Marek Olšák
While it's nice that you are all having fun here, I don't think that's the way to communicate this. The truth is, if AMD had an open source driver using the semaphores (e.g. Vulkan) and if the libdrm semaphore code was merged, Dave wouldn't be able to change it, ever. If a dependent open source pr
