On Wed, Jul 22, 2020 at 9:27 PM Chia-I Wu <olva...@gmail.com> wrote: > > On Wed, Jul 22, 2020 at 4:28 AM Daniel Vetter <dan...@ffwll.ch> wrote: > > > > On Wed, Jul 22, 2020 at 1:12 PM Christian König > > <christian.koe...@amd.com> wrote: > > > > > > Am 22.07.20 um 09:32 schrieb Daniel Vetter: > > > > On Wed, Jul 22, 2020 at 9:19 AM Christian König > > > > <christian.koe...@amd.com> wrote: > > > >> Am 22.07.20 um 02:22 schrieb Gurchetan Singh: > > > >> > > > >> +Christian who added DMABUF_MOVE_NOTIFY which added the relevant blurb: > > > >> > > > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fdma-buf%2FKconfig%23n46&data=02%7C01%7Cchristian.koenig%40amd.com%7C916f6a9b64884e21fd7f08d82e115c8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637310000122752271&sdata=WKs3KSr3K1DdVmDaIZ%2FnV8VEBPWMGMSGweeay0HIOxw%3D&reserved=0 > > > >> > > > >> Currently, the user seems to amdgpu for P2P dma-buf and it seems to > > > >> plumb ttm (*move_notify) callback to dma-buf. We're not sure if it's > > > >> a security issue occurring across DRM drivers, or one more specific to > > > >> the new amdgpu use case. > > > >> > > > >> On Tue, Jul 21, 2020 at 1:03 PM Chia-I Wu <olva...@gmail.com> wrote: > > > >>> Hi list, > > > >>> > > > >>> virtio-gpu is moving in the direction where BO pages are pinned for > > > >>> the lifetime for simplicity. I am wondering if that is considered a > > > >>> security issue in general, especially aCan you elaborate a little bit > > > >>> what these other problems might be? Memory fragmentation?fter > > > >>> running into the > > > >>> description of the new DMABUF_MOVE_NOTIFY config option. > > > >> > > > >> Yes, that is generally considered a deny of service possibility and so > > > >> far Dave and Daniel have rejected all tries to upstream stuff like > > > >> this as far as I know. > > > > Uh we have merged pretty much all arm-soc drivers without real > > > > shrinkers. Whether that was a good idea or not is maybe different > > > > question - now that we do have pretty good helpers maybe we should > > > > poke this a bit more. But then SoCs Suck (tm). > > > > > > I was under the impression that those SoC drivers still use the GEM > > > helpers which unpinns stuff when it is not in use. But I might be wrong. > > > > It's kinda mostly there, even some helpers for shrinking but a) > > helpers on, not all drivers use it b) for purgeable objects only, not > > generally for inactive stuff - there's no active use tracking c) cma > > helpers (ok that one is only for vc4 as the render driver) don't even > > have that. I had some slow burner series to get us towards dma_resv > > locking in shmem helpers and then maybe even a common shrinker helper > > with some "actually kick it out now" callback, but yeah never got > > there. > My quick survey of the SoC drivers also told me that they tend to > demonstrate a) or b). > > About b), I was thinking maybe that's because the systems the drivers > run on are historically swap-less. There is no place to write the > dirty pages back to and thus less incentive to support shrinking > inactive objects. > > You both mentioned that the lack of swap is irrelevant (or at least > not the only factor). Can you elaborate a little bit on that? > Shrinking inactive objects returns the pages to swap cache... hmm, I > guess that helps memory defragmentation?
Yup, kernel can then move it around as it sees fit. Also, zswap is a thing nowadays, and I think used by default on android now. -Daniel > > > > So maybe per-device object shrinker helper would be something neat we > > could lift out of ttm (when it's happening), maybe with a simple > > callback somewhere in it's lru tracking. Probably best if the shrinker > > lru is outright separate from anything else or it just gets messy. > > -Daniel > > > > > > But for real gpus they do indeed all have shrinkers, and not just "pin > > > > everything forever" model. Real gpus = stuff you might run on servers > > > > or multi-app and all that stuff, not with a simple "we just kill all > > > > background jobs if memory gets low" model like on android and other > > > > such things. > > > > > > > >> DMA-buf an pinning for scanout are the only exceptions since the > > > >> implementation wouldn't have been possible otherwise. > > > >> > > > >>> Most drivers do not have a shrinker, or whether a BO is purgeable is > > > >>> entirely controlled by the userspace (madvice). They can be > > > >>> categorized as "a security problem where userspace is able to pin > > > >>> unrestricted amounts of memory". But those drivers are normally found > > > >>> on systems without swap. I don't think the issue applies. > > > >> > > > >> This is completely independent of the availability of swap or not. > > > >> > > > >> Pinning of pages in large quantities can result in all kind of > > > >> problems and needs to be prevented even without swap. > > > > Yeah you don't just kill swap, you kill a ton of other kernel services > > > > with mass pinning. I think even the pinning of scanout buffers for > > > > i915 from system memory is somewhat questionable (but I guess small > > > > enough to not matter in practice). > > > > > > Yeah, we had a really hard time explaining that internally as well. > > > > > > Christian. > > > > > > >> Otherwise you can ran into problems even with simple I/O operations > > > >> for example. > > > >> > > > >>> Of the desktop GPU drivers, i915's shrinker certainly supports purging > > > >>> to swap. TTM is a bit hard to follow. I can't really tell if amdgpu > > > >>> or nouveau supports that. virtio-gpu is more commonly found on > > > >>> systems with swaps so I think it should follow the desktop practices? > > > >> > > > >> What we do at least in the amdgpu, radeon, i915 and nouveau is to only > > > >> allow it for scanout and that in turn is limited by the physical > > > >> number of CRTCs on the board. > > > >> > > > >>> Truth is, the emulated virtio-gpu device always supports page moves > > > >>> with VIRTIO_GPU_CMD_RESOURCE_{ATTACH,DETACH}_BACKING. It is just that > > > >>> the driver does not make use of them. That makes this less of an > > > >>> issue because the driver can be fixed anytime (finger crossed that the > > > >>> emulator won't have bugs in these untested paths). This issue becomes > > > >>> more urgent because we are considering adding a new HW command[1] > > > >>> where page moves will be disallowed. We definitely don't want a HW > > > >>> command that is inherently insecure, if BO pages pinned for the > > > >>> lifetime is considered a security issue on desktops. > > > >> > > > >> Yeah, that's probably not such a good idea :) > > > > Well if the pinning is just for the duration of the hw command, it's > > > > fine, just like batch buffers. But if it's long term pinning then that > > > > doesn't sound like a good idea. RDMA has this as their inherit hw > > > > programming model (except if your hw is really fancy and has hw page > > > > fault handling on the rdma nic), and they hard limit such pins to what > > > > you can mlock (or something similar within rdma). > Yeah, the driver pins the pages for the lifetime of any BO at the moment. > > There are existing HW commands that the driver can use to notify the > emulator about page migrations. But because the driver pins pages > forever, those commands are unused. We were considering taking that > further by disallowing those commands in some cases. I am glad to > learn that it is a bad idea :) > > > > > -Daniel > > > > > > > >> Regards, > > > >> Christian. > > > >> > > > >>> [1] VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB > > > >>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fvirgl%2Fdrm-misc-next%2F-%2Fblob%2Fvirtio-gpu-next%2Finclude%2Fuapi%2Flinux%2Fvirtio_gpu.h%23L396&data=02%7C01%7Cchristian.koenig%40amd.com%7C916f6a9b64884e21fd7f08d82e115c8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637310000122752271&sdata=AWtwzk%2BP7P7031ibTumr2J%2FQB%2Fzssg1Imag%2FstysR5A%3D&reserved=0 > > > >> > > > >> _______________________________________________ > > > >> dri-devel mailing list > > > >> dri-devel@lists.freedesktop.org > > > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&data=02%7C01%7Cchristian.koenig%40amd.com%7C916f6a9b64884e21fd7f08d82e115c8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637310000122752271&sdata=I0VHr96oWzpWyiNXjD1jG9%2Fp2%2F848CdPc4%2FTf6SkWm8%3D&reserved=0 > > > > > > > > > > > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch > > _______________________________________________ > > dri-devel mailing list > > dri-devel@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/dri-devel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel