On Mon, Apr 13, 2015 at 05:35:04PM +0200, Christian König wrote:
> On 13.04.2015 17:25, Serguei Sagalovitch wrote:
> >> the BO to be kept in the same place while it is mapped inside the kernel
> >page table
> >...
> >> So this requires that we pin down the BO for the duration of the wait
> >IOCTL.
> >
> >But my understanding is that it should be not duration of "wait" IOCTL but
> >"duration" of command buffer execution.
> >
> >BTW: I would assume that this is not the new scenario.
> >
> > This is scenario:
> > - User allocate BO
> > - User get CPU address for BO
> > - User submit command buffer to write to BO
> > - User could "poll" / "read" or "write" BO data by CPU
> >
> >So when TTM needs to move BO to another location it should also update
> >CPU "mapping" correctly so user will always read / write the correct data.
> >
> >Did I miss anything?
>
> The problem is that kernel mappings are not updated when TTM moves the
> buffer around. In the case of a swapped out buffer that wouldn't even be
> possible cause kernel mappings aren't pageable.
>
> You just can't map the BO into kernel space unless you have it pinned down,
> so you can't check the current value written in the BO in your IOCTL.
>
> One alternative is to send all interrupts in question unfiltered to user
> space and let userspace do the check if the right value was written or not.
> But I assume that this would be rather bad for performance.

Yeah, this would most likely be seriously bad. It might even allow malicious
userspace to force an IRQ storm.

>
> Another alternative would be to use the userspace mapping to check the BO
> value, but this approach isn't compatible with a GPU scheduler. E.g. you
> can't really do cross process space memory access in device drivers.

Not to mention that you would need mmu_notifier to protect you from munmap.

I think the solution I proposed in the other mail is the simplest and safest.

Cheers,
Jérôme

>
> Regards,
> Christian.
>
> >
> >
> >Sincerely yours,
> >Serguei Sagalovitch
> >
> >On 15-04-13 10:52 AM, Christian König wrote:
> >>Hello everyone,
> >>
> >>we have a requirement for a bit different kind of fence handling. Currently
> >>we handle fences completely inside the kernel, but in the future we would
> >>like to emit multiple fences inside the same IB as well.
> >>
> >>This works by adding multiple fence commands into an IB which just write
> >>their value to a specific location inside a BO and trigger the appropriate
> >>hardware interrupt.
> >>
> >>The user part of the driver stack should then be able to call an IOCTL to
> >>wait for the interrupt and block for the value (or something larger) to be
> >>written to the specific location.
> >>
> >>This has the advantage that you can have multiple synchronization points in
> >>the same IB and don't need to split up your draw commands over several IBs
> >>so that the kernel can insert kernel fences in between.
> >>
> >>The following set of patches tries to implement exactly this IOCTL. The big
> >>problem with that IOCTL is that TTM needs the BO to be kept in the same
> >>place while it is mapped inside the kernel page table. So this requires
> >>that we pin down the BO for the duration of the wait IOCTL.
> >>
> >>This practically gives userspace a way of pinning down BOs for as long as
> >>it wants, without the ability for the kernel for intervention.
> >>
> >>Any ideas how to avoid those problems? Or better ideas how to handle the
> >>new requirements?
> >>
> >>Please note that the patches are only hacked together quick&dirty to
> >>demonstrate the problem and not very well tested.
> >>
> >>Best regards,
> >>Christian.
> >
>
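
For reference, the check under discussion ("block for the value (or something larger) to be written to the specific location") could look roughly like the sketch below. This is only an illustration, not code from the patch set: the names fence_value_reached() and fence_poll() are hypothetical, a 32-bit fence value is assumed, and the wraparound-safe comparison is my own assumption about how "or something larger" would be handled. The slot pointer stands in for the location inside the BO; the sketch says nothing about how that location gets mapped, which is the actual problem in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true once the value written by the fence
 * command has reached the value the caller waits for, or something larger.
 * Assumption: 32-bit fence values. The signed difference keeps the result
 * correct across wraparound as long as both values are less than 2^31 apart.
 */
static bool fence_value_reached(uint32_t written, uint32_t wanted)
{
    return (int32_t)(written - wanted) >= 0;
}

/*
 * Hypothetical poll step: 'slot' stands in for the specific location inside
 * the BO. Whoever calls this needs a valid mapping of that location.
 */
static bool fence_poll(const volatile uint32_t *slot, uint32_t wanted)
{
    return fence_value_reached(*slot, wanted);
}
```

Whether such a check runs in the kernel (which needs the BO pinned while mapped) or in userspace (woken by the interrupt) is exactly the trade-off debated above.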