On Sat, 2016-11-05 at 15:28 +0100, Steinar H. Gunderson wrote: > There is currently no protection against walking a hash (using > _mesa_HashWalk()) and modifying it at the same time, for instance by > inserting > or deleting elements. This leads to segfaults in multithreaded code > if e.g. > someone calls glTexImage2D (which may have to walk the list of FBOs) > while > another thread is calling glDeleteFramebuffers on another thread with > the two > contexts sharing lists. > > The reason for this is that _mesa_HashWalk() doesn't actually take > the mutex > that normally protects the hash; it takes an entirely different > mutex. > Thus, walks are only protected against other walks, and there is also > no > outer lock taking this. There is an old comment saying that this is > to fix > problems with deadlock if the callback needs to take a mutex, but now > that we > have recursive mutexes, this is a non-issue.
Can you please change this line to say something like. With this change we switch to using a recursive mutex, which avoids the deadlock. I was a little confused. At first I thought it was already using a recursive mutex. Otherwise the change looks correct to me. Reviewed-by: Timothy Arceri <timothy.arc...@collabora.com> One thing that concerns me is that we can now call _mesa_HashRemove from the callback in _mesa_HashDeleteAll. I think we may need a patch before this one that changes _mesa_HashDeleteAll to wrap _mesa_hash_table_remove(table->ht, entry); with a deleted entry check otherwise the entry counts will get all messed up. /* Check that the cb didn't already delete the entry. If not * remove it. */ if (entry->key != table->ht->deleted_key) { _mesa_hash_table_remove(table->ht, entry); } > > A demonstration Helgrind hit from a real application: > > ==13412== Possible data race during write of size 8 at 0x3498C6A8 by > thread #1 > ==13412== Locks held: 2, at addresses 0x1AF09530 0x2B3DF400 > ==13412== at 0x1F040C99: _mesa_hash_table_remove > (hash_table.c:395) > ==13412== by 0x1EE98174: _mesa_HashRemove_unlocked (hash.c:350) > ==13412== by 0x1EE98174: _mesa_HashRemove (hash.c:365) > ==13412== by 0x1EE2372D: _mesa_DeleteFramebuffers > (fbobject.c:2669) > ==13412== by 0x6105AA4: > movit::ResourcePool::cleanup_unlinked_fbos(void*) > (resource_pool.cpp:473) > ==13412== by 0x610615B: movit::ResourcePool::release_fbo(unsigned > int) (resource_pool.cpp:442) > [...] > ==13412== This conflicts with a previous read of size 8 by thread #20 > ==13412== Locks held: 2, at addresses 0x1AF09558 0x1AF73318 > ==13412== at 0x1F040CD9: _mesa_hash_table_next_entry > (hash_table.c:415) > ==13412== by 0x1EE982A8: _mesa_HashWalk (hash.c:426) > ==13412== by 0x1EED6DFD: _mesa_update_fbo_texture.part.33 > (teximage.c:2683) > ==13412== by 0x1EED9410: _mesa_update_fbo_texture > (teximage.c:3043) > ==13412== by 0x1EED9410: teximage (teximage.c:3073) > ==13412== by 0x1EEDA28F: _mesa_TexImage2D (teximage.c:3105) > ==13412== by 0x166A68: operator() (mixer.cpp:454) > > There are many more interactions than just these two possible. > > Cc: 11.2 12.0 13.0 <mesa-sta...@lists.freedesktop.org> > Signed-off-by: Steinar H. Gunderson <steinar+m...@gunderson.no> > --- > src/mesa/main/hash.c | 18 +++++++----------- > 1 file changed, 7 insertions(+), 11 deletions(-) > > diff --git a/src/mesa/main/hash.c b/src/mesa/main/hash.c > index 7d8a5fd..670438a 100644 > --- a/src/mesa/main/hash.c > +++ b/src/mesa/main/hash.c > @@ -59,7 +59,6 @@ struct _mesa_HashTable { > struct hash_table *ht; > GLuint MaxKey; /**< highest key inserted > so far */ > mtx_t Mutex; /**< mutual exclusion lock */ > - mtx_t WalkMutex; /**< for _mesa_HashWalk() */ > GLboolean InDeleteAll; /**< Debug check */ > /** Value that would be in the table for DELETED_KEY_VALUE. */ > void *deleted_key_data; > @@ -129,8 +128,11 @@ _mesa_NewHashTable(void) > } > > _mesa_hash_table_set_deleted_key(table->ht, > uint_key(DELETED_KEY_VALUE)); > - mtx_init(&table->Mutex, mtx_plain); > - mtx_init(&table->WalkMutex, mtx_plain); > + /* > + * Needs to be recursive, since the callback in > _mesa_HashWalk() > + * is allowed to call _mesa_HashRemove(). > + */ > + mtx_init(&table->Mutex, mtx_recursive); > } > else { > _mesa_error_no_memory(__func__); > @@ -161,7 +163,6 @@ _mesa_DeleteHashTable(struct _mesa_HashTable > *table) > _mesa_hash_table_destroy(table->ht, NULL); > > mtx_destroy(&table->Mutex); > - mtx_destroy(&table->WalkMutex); > free(table); > } > > @@ -401,11 +402,6 @@ _mesa_HashDeleteAll(struct _mesa_HashTable > *table, > > /** > * Walk over all entries in a hash table, calling callback function > for each. > - * Note: we use a separate mutex in this function to avoid a > recursive > - * locking deadlock (in case the callback calls _mesa_HashRemove()) > and to > - * prevent multiple threads/contexts from getting tangled up. > - * A lock-less version of this function could be used when the table > will > - * not be modified. > * \param table the hash table to walk > * \param callback the callback function > * \param userData arbitrary pointer to pass along to the callback > @@ -422,13 +418,13 @@ _mesa_HashWalk(const struct _mesa_HashTable > *table, > > assert(table); > assert(callback); > - mtx_lock(&table2->WalkMutex); > + mtx_lock(&table2->Mutex); > hash_table_foreach(table->ht, entry) { > callback((uintptr_t)entry->key, entry->data, userData); > } > if (table->deleted_key_data) > callback(DELETED_KEY_VALUE, table->deleted_key_data, > userData); > - mtx_unlock(&table2->WalkMutex); > + mtx_unlock(&table2->Mutex); > } > > static void _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev