> From: Luigi Rizzo <ri...@iet.unipi.it>
> Date: Wed, Apr 14, 2010 10:16:30AM +0200
>
> [Cc-ing Fabio as he may have more details]
> ...
> > BTW. So you decided to implement insert/remove functionality after all.
> > I have some questions:
> >
> > - It is implemented as internal gsched hack, which is a pity, because
> >   this might be very useful functionality for other classes in the future.
> >   Is there a plan to make it more general and move it to the GEOM itself?
>
> yes there is such a plan -- of course if nobody has objections.
> In principle it is only a library extensions with no modifications
> to geom internals.
>
> However, when we developed that last year, we hit some corner case
> where removal of an active node causes either a race or (if you try
> to protect from the race) a livelock. Fixing this may require some
> small cleanup to geom internals (we discusses the issue with phk
> at last EuroBSDCon. Fabio may give you more details, as far as i
> remember the problem was that some geom code takes shortcuts instead
> of following a chain of pointers, and this can end up in the wrong
> place in case of a removal.)
>
> For this reason, at this time i am not recommending to remove a
> node from a chain with outstanding transactions until the issue is solved.
>

I'm not sure I remember all the details, but the major issues were:

- g_disk_done() dereferences bio->parent->bio_to->geom->softc, so
  changing bio_to->geom on the fly was not possible with pending
  requests (they would find the wrong softc when bubbling up). For this
  reason we added a loop in g_insert_proxy() that waits for all the
  pending requests to complete before inserting the proxy; but:

- g_slice_finish_hot() completes requests in the event handling path,
  so that loop (itself executed from an event handler) could result in
  a deadlock. To avoid this, g_insert_proxy() fails if it takes too
  long to complete the old requests. This should be far from frequent,
  considering how hotspots are used in the slice code.

- Some classes (from a quick look I've seen only g_mirror, but I'm
  pretty sure there were some other ones) cache pointers to their
  providers. With these classes this implementation does not work.
  For the scheduler this is not a big issue, because its natural
  position is as close as possible to the disk device, but it makes
  the mechanism quite hard to use in a more generic context.

> > - Why g_sched_flush_pending() operates on global structure? I think it
> >   will break if you try to insert and remove at the same time.
>

If I'm not wrong, this should be safe because the global list is used
only in critical sections protected by the topology lock.
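
To make the "wait for pending requests, but give up after a while" behaviour of g_insert_proxy() more concrete, here is a minimal userspace sketch in C. It is not the gsched code: struct fake_provider, fake_tick(), drain_or_fail() and insert_proxy_sketch() are all hypothetical names, the polling interval is modelled as a simple tick counter, and returning EBUSY on timeout is an assumption (the actual error code is not stated above).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a provider with in-flight requests. */
struct fake_provider {
	int pending;            /* requests issued but not yet completed */
	int completes_per_tick; /* completions per polling interval; 0 models a stall */
};

/* Simulate one polling interval: some pending requests complete. */
static void
fake_tick(struct fake_provider *pp)
{
	int done = pp->completes_per_tick;

	if (done > pp->pending)
		done = pp->pending;
	pp->pending -= done;
}

/*
 * Wait for the pending requests to drain, polling at most max_ticks
 * times. Returns 0 once nothing is pending, or EBUSY (an assumed error
 * code) if the requests did not complete in time, e.g. because their
 * completion depends on the very context that is doing the waiting.
 */
static int
drain_or_fail(struct fake_provider *pp, int max_ticks)
{
	assert(pp != NULL);
	for (int tick = 0; pp->pending > 0; tick++) {
		if (tick >= max_ticks)
			return (EBUSY);
		fake_tick(pp);
	}
	return (0);
}

/*
 * Sketch of the insertion path: the proxy is only inserted (modelled
 * here by setting *inserted) after the old requests have drained, so
 * no completion can observe a changed geom/softc while bubbling up.
 */
static int
insert_proxy_sketch(struct fake_provider *pp, int max_ticks, int *inserted)
{
	int error = drain_or_fail(pp, max_ticks);

	if (error != 0)
		return (error);
	*inserted = 1;
	return (0);
}
```

With a provider that completes two requests per tick and six pending, insert_proxy_sketch() succeeds after three ticks; with a provider that never completes anything (the deadlock-prone case), it gives up after max_ticks and leaves the chain untouched.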