On Friday, May 4, 2018 12:02:33 AM PDT Chris Wilson wrote:
> Quoting Kenneth Graunke (2018-05-04 02:12:36)
> > This introduces a new fast virtual memory allocator integrated with our
> > BO cache bucketing. For larger objects, it falls back to the simple
> > free-list allocator (util_vma).
>
> I wouldn't say fast just yet ;) If you want to explain any advantages,
> focus on the lack of relocation processing required in userspace, and
> the novel approach of memzones.
>
> Note that you can use user allocated addresses *without* softpin just
> fine (suggest an address to the kernel and it will use it, if
> empty/idle). That lets you preassign an address and avoid relocations on
> the first pass; just without softpin you cannot force the kernel to use
> it and so must supply the relocation fixups just in case.
>
> Since you do use NO_RELOC, the kernel should never have to touch the
> relocation arrays, and I don't have a profile that makes me worry about
> the drm_mm range manager performance.
>
> Hmm, something else to note is that this vma manager works best with
> DRI3. Importing a bo for a single frame (and then reimporting it again
> for the next frame etc) is unlikely to keep the same vma allocation.
> -Chris
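
(For reference: as I understand it, the "suggest an address without
softpin" mechanism Chris describes above boils down to pre-filling
obj.offset in the execbuf2 call. A minimal sketch, using the uapi names
from i915_drm.h; the helper and its arguments are made up for
illustration and aren't code from this series:)

/* Sketch only: pre-fill obj.offset and set NO_RELOC; adding
 * EXEC_OBJECT_PINNED is what turns the suggestion into a requirement. */
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>   /* include path may vary (libdrm vs. kernel uapi) */

static int
submit_one_bo(int fd, uint32_t handle, uint64_t suggested_offset,
              uint32_t batch_len, bool softpin)
{
   struct drm_i915_gem_exec_object2 obj = {
      .handle = handle,
      .offset = suggested_offset,   /* address we'd like the kernel to use */
      .flags  = softpin ? EXEC_OBJECT_PINNED : 0,
      /* Without softpin, .relocs_ptr / .relocation_count would still be
       * supplied as a fallback in case the kernel moves the BO. */
   };
   struct drm_i915_gem_execbuffer2 execbuf = {
      .buffers_ptr  = (uintptr_t) &obj,
      .buffer_count = 1,
      .batch_len    = batch_len,
      .flags        = I915_EXEC_RENDER | I915_EXEC_NO_RELOC,
   };
   return ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &execbuf);
}

With softpin = false the offset is only a hint, which is why the
relocation entries still have to be supplied as the fallback Chris
mentions; EXEC_OBJECT_PINNED is what makes the kernel honor the address
unconditionally.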
I wasn't actually trying to compare the performance of the userspace
allocator to the kernel allocator - I was more saying that the
bitfield-based allocator should be cheap. Grabbing the first element and
doing ffsll should definitely be cheaper than a freelist with interval
comparisons (rough sketch at the end of this mail). I noticed that we
allocate enough BOs on the fly in i965 that I figured it would be better
to do something a bit nicer than util_vma's simple approach.

You're right that the main advantages are that we...

- Don't even need to think about relocation processing.
- Can use memzones to stream out state without worrying about hitting a
  limit, having to make a larger BO, and memcpy stuff.
- Can pre-bake things like SURFACE_STATE up front, including addresses.

I thought about allocating addresses even with relocations, but I don't
think that would look quite like this...we'd want to "free" our assigned
VMA when updating GTT offsets after execbuf...and probably mark the new
ranges returned by the kernel as in-use...it gets a bit messy.

For full-PPGTT platforms, we definitely don't want to keep relocs. For
non-full-PPGTT platforms...yeah, we can assign addresses...but if we
conflict with some other process, the kernel will just move them anyway.
I guess we could choose random addresses and srand based on the process
ID or GEM context ID or something. I'm not sure it's worth it?
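
To make the ffsll point above a bit more concrete, here's roughly the
shape of the per-bucket structure I have in mind (illustration only --
the names and details don't match the actual patch). Each node covers 64
equally sized VMA slots:

/* Rough sketch of "grab the first node and ffsll" for one cache bucket. */
#define _GNU_SOURCE        /* ffsll() on glibc */
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct vma_node {
   struct vma_node *next;
   uint64_t start;          /* address of slot 0 in this node */
   uint64_t free_bitmap;    /* bit i set => slot i is free */
};

struct vma_bucket {
   struct vma_node *nodes;  /* only nodes that still have a free slot */
   uint64_t slot_size;
};

static uint64_t
vma_bucket_alloc(struct vma_bucket *bucket)
{
   struct vma_node *node = bucket->nodes;
   if (!node)
      return 0;   /* empty bucket: the real code would grab a fresh node */

   int slot = ffsll(node->free_bitmap) - 1;   /* first free slot */
   assert(slot >= 0);
   node->free_bitmap &= ~(1ull << slot);

   if (node->free_bitmap == 0)
      bucket->nodes = node->next;   /* node is full now; unlink it */

   return node->start + (uint64_t)slot * bucket->slot_size;
}

Freeing is just setting the bit again and relinking the node if it had
been full, so both paths stay O(1) with no interval walking.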