On Wed 15 Mar 2017, Jason Ekstrand wrote:
> This cache allows us to easily ensure that we have a unique anv_bo for
> each gem handle. We'll need this in order to support multiple-import of
> memory objects and semaphores.
>
> v2 (Jason Ekstrand):
>  - Reject BO imports if the size doesn't match the prime fd size as
>    reported by lseek().
>
> v3 (Jason Ekstrand):
>  - Fix reference counting around cache_release (Chris Willson)
>  - Move the mutex_unlock() later in cache_release
> ---
>  src/intel/vulkan/anv_allocator.c | 261 +++++++++++++++++++++++++++++++++++++++
>  src/intel/vulkan/anv_private.h   |  26 ++++
>  2 files changed, 287 insertions(+)
> +static uint32_t
> +hash_uint32_t(const void *key)
> +{
> +   return (uint32_t)(uintptr_t)key;
> +}

This hash function does not appear hashy. If I correctly understand the
details of Mesa's struct hash_table, choosing the identity function for the
hash function causes unwanted clustering when inserting consecutive gem
handles. Since the kernel does allocate gem handles consecutively, the
problem is real.

For proof, consider the following:

  - Suppose a long-running process (maybe the compositor) has thrashed on
    the hash table long enough that its bucket count is
    ht->size = hash_sizes[7].size = 283. Suppose a spike of compositor
    activity raises the hash table's density to about 0.5. And suppose the
    hash table buckets are filled with the consecutive gem handles
    { 0, 0, 0, 0, 4, 5, 6, 7, 8, 9, ..., 127, 128, 0, 0, 0, ..., 0 }.
    The exact density is (128 - 4 + 1) / 283 = 0.4417.

  - Next, some other in-process activity (maybe OpenGL) generated a lot of
    gem handles after Vulkan's most recently imported gem handle, 128.

  - Next, a new compositor client appears. When the compositor imports the
    new client's dma_buf, PRIME_FD_TO_HANDLE returns 287.

  - For an open-addressing, linear double-hashing table like Mesa's, with a
    load factor of 0.4417 and a perfectly random hash function, the
    expected number of probes when inserting a new key is 1.7862, according
    to my hacky Python script.

  - When inserting 287, the actual number of probes is
    1 + ceil((128 - 4) / 7) = 19, where 7 is the double-hash reprobe step
    for this key (1 + 287 % ht->rehash = 1 + 287 % 281 = 7). (I used a
    hacky Python script to confirm this by simulation.)
        Expected probes: 1.7862
        Actual probes: 19

  - Again, PRIME_FD_TO_HANDLE returns 288. Insert it into the table...
        Expected probes: 1.7975
        Actual probes: 17

  - Again, PRIME_FD_TO_HANDLE returns 289. Insert it into the table...
        Expected probes: 1.8089
        Actual probes: 15

  - One more time, PRIME_FD_TO_HANDLE returns 290.
        Expected probes: 1.8201
        Actual probes: 14

You see the problem: insertion is roughly 10x to 5x slower than expected,
and it stays degraded for a long time.

Replacing the identity hash function with _mesa_hash_pointer() may fix the
linear degradation.

Or... you could just say "Meh, in the common case, the hash table is super
fast; it's effectively a direct-addressed array. In Chad's pathological
case, the table is still fast enough. I want super-fast for the common
case", and keep the identity hash.
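If you would rather keep hashing the handle itself instead of switching to
_mesa_hash_pointer(), a cheap 32-bit mixer would also break up the
clustering. Here is a minimal sketch of what I mean; the function name is
mine, it is not part of the patch, and the constants are the well-known
murmur3 32-bit finalizer:

#include <stdint.h>

/* Sketch only, not part of the patch: mix the gem handle so that
 * consecutive handles land in well-scattered buckets.  This is the
 * murmur3 fmix32 finalizer; it costs a handful of ALU ops per lookup.
 */
static uint32_t
hash_gem_handle(const void *key)
{
   uint32_t h = (uint32_t)(uintptr_t)key;

   h ^= h >> 16;
   h *= 0x85ebca6bu;
   h ^= h >> 13;
   h *= 0xc2b2ae35u;
   h ^= h >> 16;

   return h;
}

Either way the lookup stays amortized O(1); the mixer just removes the long
occupied runs that the identity hash produces for consecutive gem handles.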