Am 16.11.2018 um 17:34 hat Max Reitz geschrieben: > On 16.11.18 16:51, Kevin Wolf wrote: > > Am 16.11.2018 um 16:27 hat Max Reitz geschrieben: > >> On 16.11.18 16:18, Kevin Wolf wrote: > >>> Am 16.11.2018 um 16:03 hat Alberto Garcia geschrieben: > >>>>> I don't think anything needs a way to generally block graph changes > >>>>> around some node. We only need to prevent changes to very specific > >>>>> sets of edges. This is something that the permission system just > >>>>> cannot do. > >>>> > >>>> But what would you do then? > >>> > >>> I agree with you mostly in that I think that most problems that Max > >>> mentioned aren't readl. The only real problem I see with GRAPH_MOD as a > >>> permission on the node level is this overblocking > >> > >> I wholeheartedly disagree. Yes, it is true that most of the issues I > >> thought of can be fixed, and so those problems are not problems in a > >> technical sense. But to me this whole discussion points to the greatest > >> issue I have, which is that GRAPH_MOD is just too complicated to > >> understand. And I don't like a solution that works on a technical level > >> but that everybody is too afraid to touch because it's too weird. > > > > GRAPH_MOD with the meaning that Berto suggested isn't weird or > > complicated to understand. It's only the wrong tool because it blocks > > more than we want to block. But if we didn't care about that, it could > > be just another permission like any other. > > The meaning Berto has suggested (AFAIU) is never taking the permission > at all but just querying whether it is shared by all current parents > before doing a graph change. > > That is very weird to me because permissions are there to be taken.
It would be a shortcut for taking the permission temporarily, which would possibly work inside QEMU because we're holding the BQL, though I'm not sure whether I/O threads couldn't influence this. Anyway, with our mapping of permissions to filesystem locks, it's definitely not correct any more and you do need to actually temporarily acquire the locks. > The meaning I have suggested is that it would be taken before a graph > change and released right afterwards. I think this is still weird > because we don't take, say, the WRITE permission before every > bdrv*_write*() call and release it afterwards. > > Sure you could say "actually, why not". Please don't.[1] Actually, why not? Not in the sense that it would be the right model for WRITE, but it might still be the right one for GRAPH_MOD in most cases. We do take the RESIZE permission in qmp_block_resize() before calling blk_truncate(), and drop it right afterwards (by means of creating a temporary BlockBackend only for the purpose of resizing), so there is precedence. Of course, all of this is moot because GRAPH_MOD can't provide the exact semantics we want to have. > > If you want to change the graph, you'd need to get the permission first, > > and bdrv_replace_child_noperm() could assert that at least one parent of > > the parent node has acquired the permission (unless you want to pass the > > exact parent BdrvChild to it; maybe this is what we would really do > > then). > > Yep, that assertion would be like we do it in bdrv_co_write_req_prepare(). > > Thing is, you (as in "the caller that actually wants to do something", > like mirror) don't just "get the permission". Permissions are > associated with children, so you cannot get this permission before you > create your child. > > So this is where "parent of a parent node" comes in? I'm afraid I don't > understand that. Including GRAPH_MOD in the permission system means that it controls an operation that reaches a node through a BdrvChild. In this case, the operation is "modifiying a BdrvChild of the node that the permission holding BdrvChild points to". Okay, if you say that affecting children of a node instead of just the node itself is a bit unusual, I might have to agree. > What would make sense to me would be for the generic block layer to take > this permission on new children (and drop it immediately after having > attached the child) and children that are about to be detached. Which > again is just different from any other permission, because things > outside of block.c can never take it, they can only choose not to share > it. So it'd be an internal permission, which is visible to the outside, > and I think this is at least weirder than just having an internal > counter with a clear external interface (inc/dec). I don't think I'd want to implement it like that (but then, I'll repeat that I wouldn't want to implement it at all because it's the wrong tool). > Also, please excuse me for it being Friday and all, but I don't quite > believe in "It's actually clear, I have no idea why you claim to always > take so long to understand it." The code clearly has no idea what it's > supposed to do, and I do not believe that the sole reason for that is > that someone who did understand didn't have the time or the need to fix > it, or to implement it correctly in the first place. > > On the other hand it sounded a bit like Berto was about to fix it, > though, so there's that. > > Or I just got you wrong and this is just the usual case of "Now that > we've talked about this through a couple of lengthy emails, it does make > sense" which is just all too familiar and exactly my issue with the > whole thing. It's actually "now that we have a precise definition what GRAPH_MOD means", at least for the sake of this discussion. It's not a useful definition, so I don't think this is actually meaningful progress, but having this definition, you can reason about it now. It's not very complicated on the conceptual level (the implementation looks differernt because we'd have to introduce another BdrvChild parameter everywhere to know where the operation comes from), just not a very good tool either. What we noted down about counters was more meaningful progress because we think this can actually provide the semantics that we need. Of course, there is still some hand-waving involved (we'll just check the counter when modifying the graph), which might not be that trivial to implement because bdrv_replace_child() can't return an error... > >> We have this discussion again and again, and in the end we always come > >> up with something that looks like it might work, but it's just so weird > >> that we can't even remember it. > > > > I don't think we ever come up with something, weird or not, that > > achieves what we wanted to achieve - because the problem simply can't be > > solved properly at the node level. > > Yes, I've phrased my disagreement wrong. Of course I don't disagree > with you on that. I just disagree that this is the only issue with > GRAPH_MOD. > > >> Maybe it's just me, though. Frankly, I think the permission system > >> itself is already too complicated as it is, but I don't have a simpler > >> solution there. > > > > It doesn't feel too bad to me, but that's subjective. > > It's not worse than quiescing/draining, that's for sure. How would _you_ know? :-) (But I agree.) > [1] Please don't because on one hand I wouldn't have a counter-argument > for this. Yes, maybe we should actually take permissions right before > the operations that require them. Maybe keeping them around for as long > as a child stays attached to a parent is just being lazy. No, it's the very mechanism that requires users throughout the stack to declare what they want to do with their nodes. If you just automatically got write permissions when you do a write request, you could attach read-write devices to a read-only node and things would go up in flames as soon as the device actually writes. This would be consistent with GRAPH_MOD (or actually even the counters), if block jobs requested this from the beginning and not only when they complete. For the implementation with counters, maybe we actually need two of them like perm/shared in the permission system. One that means that we're going to modify the graph, and the other one that means that we can't tolerate graph modifications. For the public interface, we might actually treat it mostly like permissions, just edge permissions, not node permissions. > I suppose the main argument against taking permissions just when you > need them is that this introduces additional points of failure that are > mostly unnecessary. For GRAPH_MOD, however, these points would be > limited and easily handled, because they just involve making > bdrv_attach_child() fail. You probably don't even want to start the whole graph modification (which may well involve multiple changed children) when one of the changes is going to fail. So I think taking the permission beforehand is useful. I'd consider taking it even at the job start, but the latest I would take it is at the start of the .prepare callback. (This is relevant for the counters, too.) > So, no, I do not have a good technical counter-argument. But my > argument of complexity still stands after having gone through > understanding how GRAPH_MOD might actually work for at least the third > time now, without it ever having made the next time any easier. > > Even if GRAPH_MOD were correctly implemented tomorrow, in a way > conforming to what we've discussed now, I have the strong feeling that I > still would get to a point where I forgot its meaning and would have to > go through all of this again. > > And that's completely disregarding people new to the codebase. As I said, it's moot anyway. It doesn't provide the right semantics. Kevin
signature.asc
Description: PGP signature