On 16.11.2015 at 10:38, Andreas Färber wrote:
> On 16.11.2015 at 09:16, Christian Borntraeger wrote:
>> On 11/16/2015 08:13 AM, Pavel Fedin wrote:
>>>>>> (process:4102): GLib-CRITICAL **: g_hash_table_iter_next: assertion
>>>>>> 'ri->version == ri->hash_table->version' failed
>>>>>>
>>>>>> (process:4102): GLib-CRITICAL **: g_hash_table_iter_next: assertion
>>>>>> 'ri->version == ri->hash_table->version' failed
>>>>>>
>>>>>> (process:4102): GLib-CRITICAL **: iter_remove_or_steal: assertion
>>>>>> 'ri->version == ri->hash_table->version' failed
>>>
>>> Wow... Actually this may come from attempts to modify the tree inside
>>> iteration.
>>>
>>>> Thanks! sclp_init() seems to violate several QOM design principles in
>>>> that it uses object_new() during TypeInfo::instance_init() and uses a
>>>> TYPE_... constant as property name. But nothing else stands out
>>>> immediately.
>>>
>>> I think we should refactor this and retry. If not all problems go away,
>>> then we are indeed modifying the tree during iteration, and
>>> we have to find some solution.
>>
>> David, Conny,
>>
>> the current tree of afaerber
>>
>> https://github.com/afaerber/qemu-cpu/commits/qom-next
>>
>> has this patch:
>>
>>> From: Pavel Fedin <p.fe...@samsung.com>
>>>
>>> ARM GICv3 systems with large number of CPUs create lots of IRQ pins. Since
>>> every pin is represented as a property, number of these properties becomes
>>> very large. Every property add first makes sure there's no duplicates.
>>> Traversing the list becomes very slow, therefore qemu initialization takes
>>> significant time (several seconds for e. g. 16 CPUs).
>>>
>>> This patch replaces list with GHashTable, making lookup very fast. The only
>>> drawback is that object_child_foreach() and object_child_foreach_recursive()
>>> cannot modify their objects during traversal, since GHashTableIter does not
>>> have modify-safe version. However, the code seems not to modify objects via
>>> these functions.
>>>
>>> Signed-off-by: Daniel P. Berrange <berra...@redhat.com>
>>> Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
>>
>> which causes failures in make check. A simple reproducer is
>>
>> qemu-system-s390x -device sclp,help
>>
>>
>> any idea what would be the most simple fix?
>> Can we refactor this to create the event facility and the bus in the
>> machine or whatever?
>
> I believe it is rather a very general problem with the new
> object_property_del_all() implementation. It iterates through
> properties, releasing child<> and link<> properties, which results in an
> unref, which at some point unparents that device, removing it in the
> parent's properties hashtable while the parent is iterating through it.
>
> In this case it seems to be about the bus child<> on the event facility.
>
>>> I wonder... Could we have both list and hashtable? hashtable for searching
>>> by name and list for iteration. In this case we would
>>> not have to use glib's iterators, and would be free of problems with them.
>>> Just keep the list and hashtable in sync.
>>> Or, is there any hashtable implementation out there which would keep
>>> iterators valid during modification?
>>> OTOH, glib has a function "remove the element at iterator's position", and
>>> we could postpone addition. So, perhaps, using both
>>> containers would be an overkill, just refactor the code to adapt to the new
>>> behavior.
>
> My idea, which I wanted to investigate after the weekend, is iterating
> through the hashtable to create a list of prop->release functions and
> call them only after finishing the iteration. That might not work
> either, so we may need to loop over the releasing to allow for released
> properties to disappear after prop->release().

I went with the latter and squashed the attached fixup (without the last
two hunks; I am preparing a separate patch for those), interrupting each
iteration after prop->release() to be safe. That seems to fix it.

Will prepend and test Dan's unit test next.

Thanks,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Graham Norton; HRB 21284 (AG Nürnberg)

diff --git a/qom/object.c b/qom/object.c
index 0ac3bc1..284fa38 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -377,14 +377,22 @@ static void object_property_del_all(Object *obj)
     ObjectProperty *prop;
     GHashTableIter iter;
     gpointer key, value;
+    bool released;
 
-    g_hash_table_iter_init(&iter, obj->properties);
-    while (g_hash_table_iter_next(&iter, &key, &value)) {
-        prop = value;
-        if (prop->release) {
-            prop->release(obj, prop->name, prop->opaque);
+    do {
+        released = false;
+        g_hash_table_iter_init(&iter, obj->properties);
+        while (g_hash_table_iter_next(&iter, &key, &value)) {
+            prop = value;
+            if (prop->release) {
+                prop->release(obj, prop->name, prop->opaque);
+                prop->release = NULL;
+                released = true;
+                break;
+            }
+            g_hash_table_iter_remove(&iter);
         }
-    }
+    } while (released);
 
     g_hash_table_unref(obj->properties);
 }
@@ -401,7 +409,15 @@ static void object_property_del_child(Object *obj, Object *child, Error **errp)
         if (object_property_is_child(prop) && prop->opaque == child) {
             if (prop->release) {
                 prop->release(obj, prop->name, prop->opaque);
+                prop->release = NULL;
             }
+            break;
+        }
+    }
+    g_hash_table_iter_init(&iter, obj->properties);
+    while (g_hash_table_iter_next(&iter, &key, &value)) {
+        prop = value;
+        if (object_property_is_child(prop) && prop->opaque == child) {
             g_hash_table_iter_remove(&iter);
             break;
         }
@@ -856,7 +872,7 @@ void object_ref(Object *obj)
     if (!obj) {
         return;
     }
-    atomic_inc(&obj->ref);
+    atomic_inc(&obj->ref);
 }
 
 void object_unref(Object *obj)
@@ -864,7 +880,7 @@ void object_unref(Object *obj)
     if (!obj) {
         return;
     }
-    g_assert(obj->ref > 0);
+    g_assert_cmpint(obj->ref, >, 0);
 
     /* parent always holds a reference to its children */
     if (atomic_fetch_dec(&obj->ref) == 1) {
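
For readers who want to see the restart-after-release loop from the first hunk in isolation, here is a minimal sketch that does not depend on GLib or QEMU. It is only a toy model under stated assumptions: Table, Prop, Iter, iter_next(), iter_remove(), release_with_side_effect(), table_del_all() and demo_remaining_after_del_all() are all hypothetical names, and the version counter merely imitates the kind of check behind the "ri->version == ri->hash_table->version" assertion quoted above. The release callback removes a sibling entry behind the iterator's back, which is roughly what an unparent does to the parent's property table.

```c
/* Toy model, not QEMU or GLib code; every name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PROPS 4

typedef struct Table Table;
typedef struct Prop Prop;

struct Prop {
    bool used;
    void (*release)(Table *t, Prop *p);
    int victim;        /* slot dropped as a side effect of release, or -1 */
};

struct Table {
    Prop props[MAX_PROPS];
    unsigned version;  /* bumped on every structural change */
    int release_calls;
};

typedef struct {
    Table *t;
    unsigned version;  /* table version this iterator was created for */
    int pos;
} Iter;

/* Returns the next used slot, or -1 at the end. */
static int iter_next(Iter *it)
{
    assert(it->version == it->t->version);  /* stale-iterator check */
    while (++it->pos < MAX_PROPS) {
        if (it->t->props[it->pos].used) {
            return it->pos;
        }
    }
    return -1;
}

/* Removal through the iterator keeps that iterator valid. */
static void iter_remove(Iter *it)
{
    it->t->props[it->pos].used = false;
    it->version = ++it->t->version;
}

/* Release callback that, like an unparent, drops another table entry
 * behind the iterator's back. */
static void release_with_side_effect(Table *t, Prop *p)
{
    t->release_calls++;
    if (p->victim >= 0 && t->props[p->victim].used) {
        t->props[p->victim].used = false;
        t->version++;
    }
}

/* Same shape as the patched loop: restart after every release. */
static void table_del_all(Table *t)
{
    bool released;
    int slot;

    do {
        Iter it = { t, t->version, -1 };  /* fresh iterator per pass */
        released = false;
        while ((slot = iter_next(&it)) >= 0) {
            Prop *p = &t->props[slot];
            if (p->release) {
                p->release(t, p);
                p->release = NULL;  /* never call it twice */
                released = true;
                break;              /* iterator may be stale now */
            }
            iter_remove(&it);
        }
    } while (released);
}

/* Deletes three entries; returns how many are left in the table. */
int demo_remaining_after_del_all(int *release_calls)
{
    Table t = { .props = {
        { true, NULL, -1 },                     /* slot 0: plain */
        { true, release_with_side_effect, 2 },  /* slot 1: child */
        { true, NULL, -1 },                     /* slot 2: sibling */
    } };
    int remaining = 0;

    table_del_all(&t);
    for (int i = 0; i < MAX_PROPS; i++) {
        remaining += t.props[i].used;
    }
    *release_calls = t.release_calls;
    return remaining;
}
```

Without the break, the inner loop would call iter_next() on an iterator whose version no longer matches the table and trip the assert, which is the toy equivalent of the GLib-CRITICAL messages. The loop terminates because each outer pass either consumes one release callback or finishes by removing every remaining entry.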