On Mon, Nov 16, 2015 at 05:44:35PM +0100, Andreas Färber wrote:
> On 16.11.2015 at 10:38, Andreas Färber wrote:
> > On 16.11.2015 at 09:16, Christian Borntraeger wrote:
> >> On 11/16/2015 08:13 AM, Pavel Fedin wrote:
> >>>>>> (process:4102): GLib-CRITICAL **: g_hash_table_iter_next: assertion
> >>>>>> 'ri->version == ri->hash_table->version' failed
> >>>>>>
> >>>>>> (process:4102): GLib-CRITICAL **: g_hash_table_iter_next: assertion
> >>>>>> 'ri->version == ri->hash_table->version' failed
> >>>>>>
> >>>>>> (process:4102): GLib-CRITICAL **: iter_remove_or_steal: assertion
> >>>>>> 'ri->version == ri->hash_table->version' failed
> >>>
> >>> Wow... Actually this may come from attempts to modify the tree inside
> >>> iteration.
> >>>
> >>>> Thanks! sclp_init() seems to violate several QOM design principles in
> >>>> that it uses object_new() during TypeInfo::instance_init() and uses a
> >>>> TYPE_... constant as property name. But nothing else stands out
> >>>> immediately.
> >>>
> >>> I think we should refactor this and retry. If not all problems go away,
> >>> then we are indeed modifying the tree during iteration, and
> >>> we have to find some solution.
> >>
> >> David, Conny,
> >>
> >> the current tree of afaerber
> >>
> >> https://github.com/afaerber/qemu-cpu/commits/qom-next
> >>
> >> has this patch:
> >>
> >>> From: Pavel Fedin <p.fe...@samsung.com>
> >>>
> >>> ARM GICv3 systems with large number of CPUs create lots of IRQ pins. Since
> >>> every pin is represented as a property, number of these properties becomes
> >>> very large. Every property add first makes sure there's no duplicates.
> >>> Traversing the list becomes very slow, therefore qemu initialization takes
> >>> significant time (several seconds for e. g. 16 CPUs).
> >>>
> >>> This patch replaces list with GHashTable, making lookup very fast.
> >>> The only drawback is that object_child_foreach() and
> >>> object_child_foreach_recursive() cannot modify their objects during
> >>> traversal, since GHashTableIter does not have modify-safe version.
> >>> However, the code seems not to modify objects via these functions.
> >>>
> >>> Signed-off-by: Daniel P. Berrange <berra...@redhat.com>
> >>> Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
> >>
> >> which causes failures in make check. A simple reproducer is
> >>
> >> qemu-system-s390x -device sclp,help
> >>
> >>
> >> any idea what would be the most simple fix?
> >> Can we refactor this to create the event facility and the bus in the
> >> machine or whatever?
> >
> > I believe it is rather a very general problem with the new
> > object_property_del_all() implementation. It iterates through
> > properties, releasing child<> and link<> properties, which results in an
> > unref, which at some point unparents that device, removing it in the
> > parent's properties hashtable while the parent is iterating through it.
> >
> > In this case it seems to be about the bus child<> on the event facility.
> >
> >>> I wonder... Could we have both list and hashtable? hashtable for
> >>> searching by name and list for iteration. In this case we would
> >>> not have to use glib's iterators, and would be free of problems with
> >>> them. Just keep the list and hashtable in sync.
> >>> Or, is there any hashtable implementation out there which would keep
> >>> iterators valid during modification?
> >>> OTOH, glib has a function "remove the element at iterator's position",
> >>> and we could postpone addition. So, perhaps, using both
> >>> containers would be an overkill, just refactor the code to adapt to the
> >>> new behavior.
> >
> > My idea, which I wanted to investigate after the weekend, is iterating
> > through the hashtable to create a list of prop->release functions and
> > call them only after finishing the iteration. That might not work
> > either, so we may need to loop over the releasing to allow for released
> > properties to disappear after prop->release().
>
> I went with the latter and squashed the attached fixup (without last two
> hunks, preparing a separate patch for that), interrupting each iteration
> after prop->release() to be safe. That seems to fix it.
>
> Will prepend and test Dan's unit test next.
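
For anyone following the thread, the failure mode behind the GLib-CRITICAL messages quoted above can be sketched in plain C. This is only a toy model, not GLib's implementation: ToyTable, ToyIter and the toy_* helpers are made-up names, and the only thing taken from the log is the idea that an iterator remembers a table version and complains when it no longer matches.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_CAP 8

/* Toy container, NOT GLib: a fixed array plus a modification counter. */
typedef struct {
    const char *keys[TOY_CAP];
    int len;
    int version;            /* bumped on every structural change */
} ToyTable;

typedef struct {
    ToyTable *table;
    int pos;
    int version;            /* snapshot taken when the iterator is set up */
} ToyIter;

static void toy_add(ToyTable *t, const char *key)
{
    assert(t->len < TOY_CAP);
    t->keys[t->len++] = key;
    t->version++;
}

static void toy_remove_at(ToyTable *t, int idx)
{
    for (int i = idx; i < t->len - 1; i++) {
        t->keys[i] = t->keys[i + 1];
    }
    t->len--;
    t->version++;
}

static void toy_iter_init(ToyIter *it, ToyTable *t)
{
    it->table = t;
    it->pos = 0;
    it->version = t->version;
}

/* Returns false at the end, or when the table changed behind the iterator. */
static bool toy_iter_next(ToyIter *it, const char **key, bool *stale)
{
    *stale = it->version != it->table->version;
    if (*stale || it->pos >= it->table->len) {
        return false;
    }
    *key = it->table->keys[it->pos++];
    return true;
}

/* Removes an entry in the middle of a walk; reports whether the iterator noticed. */
bool toy_demo_stale(void)
{
    ToyTable t = { .len = 0, .version = 0 };
    ToyIter it;
    const char *key;
    bool stale = false;

    toy_add(&t, "bus");
    toy_add(&t, "event-facility");
    toy_iter_init(&it, &t);
    while (toy_iter_next(&it, &key, &stale)) {
        toy_remove_at(&t, 0);   /* stands in for release() unparenting a child */
    }
    return stale;
}
```

In this toy, the second call to toy_iter_next() sees a version mismatch and stops, which is roughly the situation the assertion text in the log describes.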
> diff --git a/qom/object.c b/qom/object.c
> index 0ac3bc1..284fa38 100644
> --- a/qom/object.c
> +++ b/qom/object.c
> @@ -377,14 +377,22 @@ static void object_property_del_all(Object *obj)
>      ObjectProperty *prop;
>      GHashTableIter iter;
>      gpointer key, value;
> +    bool released;
>  
> -    g_hash_table_iter_init(&iter, obj->properties);
> -    while (g_hash_table_iter_next(&iter, &key, &value)) {
> -        prop = value;
> -        if (prop->release) {
> -            prop->release(obj, prop->name, prop->opaque);
> +    do {
> +        released = false;
> +        g_hash_table_iter_init(&iter, obj->properties);
> +        while (g_hash_table_iter_next(&iter, &key, &value)) {
> +            prop = value;
> +            if (prop->release) {
> +                prop->release(obj, prop->name, prop->opaque);
> +                prop->release = NULL;
> +                released = true;
> +                break;
> +            }
> +            g_hash_table_iter_remove(&iter);
>          }
> -    }
> +    } while (released);
>  
>      g_hash_table_unref(obj->properties);
>  }
> @@ -401,7 +409,15 @@ static void object_property_del_child(Object *obj, Object *child, Error **errp)
>          if (object_property_is_child(prop) && prop->opaque == child) {
>              if (prop->release) {
>                  prop->release(obj, prop->name, prop->opaque);
> +                prop->release = NULL;
>              }
> +            break;
> +        }
> +    }
> +    g_hash_table_iter_init(&iter, obj->properties);
> +    while (g_hash_table_iter_next(&iter, &key, &value)) {
> +        prop = value;
> +        if (object_property_is_child(prop) && prop->opaque == child) {
>              g_hash_table_iter_remove(&iter);
>              break;
>          }
> @@ -856,7 +872,7 @@ void object_ref(Object *obj)
>      if (!obj) {
>          return;
>      }
> -     atomic_inc(&obj->ref);
> +    atomic_inc(&obj->ref);
>  }
>  
>  void object_unref(Object *obj)
> @@ -864,7 +880,7 @@ void object_unref(Object *obj)
>      if (!obj) {
>          return;
>      }
> -    g_assert(obj->ref > 0);
> +    g_assert_cmpint(obj->ref, >, 0);
>  
>      /* parent always holds a reference to its children */
>      if (atomic_fetch_dec(&obj->ref) == 1) {

This looks good to me, so you can add

  Signed-off-by: Daniel P. Berrange

to this change.
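
To spell out the restart-after-release idea from the first hunk, here is a rough standalone sketch in plain C. It deliberately does not use GLib or the real QOM types: ToyObject, ToyProp and all toy_* functions are hypothetical stand-ins, and the fixed-size array only imitates the property table. The point is just the control flow: call one release hook, clear it, abandon the scan, and start over until a full pass finds nothing left to release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_PROPS 8

typedef struct ToyObject ToyObject;
typedef void (*ToyRelease)(ToyObject *obj, const char *name);

/* Hypothetical stand-in for a property entry; not QEMU's real layout. */
typedef struct {
    const char *name;
    ToyRelease release;
    bool in_use;
} ToyProp;

struct ToyObject {
    ToyProp props[TOY_MAX_PROPS];
    int released_count;
};

static void toy_prop_add(ToyObject *obj, const char *name, ToyRelease release)
{
    for (int i = 0; i < TOY_MAX_PROPS; i++) {
        if (!obj->props[i].in_use) {
            obj->props[i] = (ToyProp){ name, release, true };
            return;
        }
    }
    assert(!"toy property table is full");
}

static void toy_prop_remove(ToyObject *obj, const char *name)
{
    for (int i = 0; i < TOY_MAX_PROPS; i++) {
        if (obj->props[i].in_use && strcmp(obj->props[i].name, name) == 0) {
            obj->props[i].in_use = false;
        }
    }
}

static void toy_release_plain(ToyObject *obj, const char *name)
{
    (void)name;
    obj->released_count++;
}

/* Release hook with a side effect: it drops another entry from the table. */
static void toy_release_unparent(ToyObject *obj, const char *name)
{
    (void)name;
    obj->released_count++;
    toy_prop_remove(obj, "sibling");
}

static void toy_prop_del_all(ToyObject *obj)
{
    bool released;

    do {
        released = false;
        for (int i = 0; i < TOY_MAX_PROPS; i++) {
            ToyProp *prop = &obj->props[i];

            if (!prop->in_use) {
                continue;
            }
            if (prop->release) {
                prop->release(obj, prop->name);
                prop->release = NULL;   /* never call the same hook twice */
                released = true;
                break;                  /* table may have changed: rescan */
            }
            prop->in_use = false;       /* nothing left to release: drop it */
        }
    } while (released);
}

/* Returns the number of hooks called, or -1 if any entry survived. */
int toy_demo_del_all(void)
{
    ToyObject obj = { .released_count = 0 };
    int left = 0;

    toy_prop_add(&obj, "bus", toy_release_unparent);
    toy_prop_add(&obj, "sibling", NULL);
    toy_prop_add(&obj, "link", toy_release_plain);
    toy_prop_del_all(&obj);

    for (int i = 0; i < TOY_MAX_PROPS; i++) {
        left += obj.props[i].in_use;
    }
    return left == 0 ? obj.released_count : -1;
}
```

The cost of this shape is that each release restarts the scan, so the loop is quadratic in the worst case; that seems acceptable for teardown, where correctness matters more than speed.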

Regards,
Daniel
-- 
|: http://berrange.com       -o-  http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org        -o-  http://virt-manager.org :|
|: http://autobuild.org      -o-  http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o-  http://live.gnome.org/gtk-vnc :|