On Wed 15 Mar 2017, Jason Ekstrand wrote:
> This cache allows us to easily ensure that we have a unique anv_bo for
> each gem handle. We'll need this in order to support multiple-import of
> memory objects and semaphores.
>
> v2 (Jason Ekstrand):
>  - Reject BO imports if the size doesn't match the prime fd size as
>    reported by lseek().
>
> v3 (Jason Ekstrand):
>  - Fix reference counting around cache_release (Chris Willson)
>  - Move the mutex_unlock() later in cache_release
> ---
>  src/intel/vulkan/anv_allocator.c | 261 +++++++++++++++++++++++++++++++++++++++
>  src/intel/vulkan/anv_private.h   |  26 ++++
>  2 files changed, 287 insertions(+)
> +static uint32_t
> +hash_uint32_t(const void *key)
> +{
> +   return (uint32_t)(uintptr_t)key;
> +}

This hash function does not appear hashy. If I correctly understand the
details of Mesa's struct hash_table, choosing the identity function for the
hash function causes unwanted clustering when inserting consecutive gem
handles. Since the kernel does allocate gem handles consecutively, the
problem is real.

For proof, consider the following:

  - Suppose a long-running process (maybe the compositor) has thrashed on
    the hash table long enough that its bucket count is
    ht->size = hash_sizes[7].size = 283. Suppose a spike of compositor
    activity raises the hash table's density to about 0.5. And suppose the
    hash table buckets are filled with the consecutive gem handles
    { 0, 0, 0, 0, 4, 5, 6, 7, 8, 9, ..., 127, 128, 0, 0, 0, ..., 0 }.
    The exact density is (128 - 4 + 1) / 283 = 0.4417.

  - Next, some other in-process activity (maybe OpenGL) generated a lot of
    gem handles after Vulkan's most recently imported gem handle, 128.

  - Next, a new compositor client appears. When the compositor imports the
    new client's dma_buf, PRIME_FD_TO_HANDLE returns 287.

  - For an open-addressing, linear double-hashing table like Mesa's, with a
    load factor of 0.4417 and a perfectly random hash function, the
    expected number of probes when inserting a new key is 1.7862, according
    to my hacky Python script.

  - When inserting 287, the actual number of probes is
    1 + ceil((128 - 4) / 7) = 19, where 7 is the double-hash reprobe step
    for this key (1 + 287 % ht->rehash = 1 + 287 % 281 = 7). (I used a
    hacky Python script to confirm this by simulation.)
        Expected probes: 1.7862
        Actual probes: 19

  - Again, PRIME_FD_TO_HANDLE returns 288. Insert it into the table...
        Expected probes: 1.7975
        Actual probes: 17

  - Again, PRIME_FD_TO_HANDLE returns 289. Insert it into the table...
        Expected probes: 1.8089
        Actual probes: 15

  - One more time, PRIME_FD_TO_HANDLE returns 290.
        Expected probes: 1.8201
        Actual probes: 14

You see the problem: insertion is roughly 10x to 5x slower than expected,
and it stays degraded for a long time.

Replacing the identity hash function with _mesa_hash_pointer() may fix the
linear degradation.

Or... you could just say "Meh, in the common case, the hash table is super
fast; it's effectively a direct-addressed array. In Chad's pathological
case, the table is still fast enough. I want super-fast for the common
case", and keep the identity hash.
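If you would rather keep hashing the handle itself instead of switching to
_mesa_hash_pointer(), a cheap 32-bit mixer would also break up the
clustering. Here is a minimal sketch of what I mean; the function name is
mine, it is not part of the patch, and the constants are the well-known
murmur3 32-bit finalizer:

#include <stdint.h>

/* Sketch only, not part of the patch: mix the gem handle so that
 * consecutive handles land in well-scattered buckets.  This is the
 * murmur3 fmix32 finalizer; it costs a handful of ALU ops per lookup.
 */
static uint32_t
hash_gem_handle(const void *key)
{
   uint32_t h = (uint32_t)(uintptr_t)key;

   h ^= h >> 16;
   h *= 0x85ebca6bu;
   h ^= h >> 13;
   h *= 0xc2b2ae35u;
   h ^= h >> 16;

   return h;
}

Either way the lookup stays amortized O(1); the mixer just removes the long
occupied runs that the identity hash produces for consecutive gem handles.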