On Mon, 2023-11-13 at 22:22 +0100, Johannes Berg wrote:
> > My suggestion was to add the virtual interrupt line
> > state change as a message to the calendar.
>
> Yes, but it doesn't work unless the other side _already_ knows that it
> will happen, because it breaks the rule of "only one thing is
> running".
>
> Again, you can argue that it's fine to return to the calendar even if
> the receiving process won't actually do anything, but because the
> receiving process is still thinking about it, you end up with all the
> contortions you have to do in patches 4 and 7 ... because even the
> _calendar_ doesn't know when the request will actually come to it.
>
> Perhaps one way of handling this that doesn't require all those
> contortions would be for the sender of the event to actually
> _completely_ handle the calendar request for the interrupt on behalf
> of the receiver, so that the receiver doesn't actually do _anything_
> but mark "this IRQ is pending" in this case. Once it actually gets a
> RUN message it will actually start running, since it assumes that it
> will not get a RUN message without having requested a calendar entry.
> If the calendar entry were already handled on its behalf, you'd not
> need the request and therefore not need the special handling for the
> request from patch 4.
> You'd need a different implementation of patches 2/3, and get rid of
> patches 4, 6, 7 and 8.
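To spell the quoted idea out in code before going on: the receiver side
might then look roughly like this. This is purely a sketch --
line_sigio_handler, handle_run_message and deliver_line_interrupt are
made-up names, not anything in the tree:

	#include <signal.h>

	static void deliver_line_interrupt(void);	/* made up */

	/* hypothetical receiver side, per the quoted proposal */
	static volatile sig_atomic_t irq_pending;

	static void line_sigio_handler(int sig)
	{
		/* don't touch the calendar or any event lists here */
		irq_pending = 1;
	}

	static void handle_run_message(void)
	{
		/*
		 * We only ever get RUN because the sender already entered
		 * a calendar entry on our behalf, so just deliver now.
		 */
		if (irq_pending) {
			irq_pending = 0;
			deliver_line_interrupt();
		}
	}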
So maybe for my future self and all the bystanders, I'll try to explain
how I see the issue that causes patches 4, 6, 7 and 8 to be needed:

Actors: UML, ES (Event Sender), Cal (Calendar)

In UML, we're using a line-based protocol as in drivers/line.c; the ES
is connected to that line and can send it messages. (The same scenario
would probably also apply the other way around, in theory, so we might
need a way to implement this with the roles reversed.)

Let's start by sending a message to UML, that's the whole point of this
discussion:

 ES ---> UML  // so far nothing happened, the host kernel
              // queued SIGIO to UML
 ES: continues running until idle and returns to Cal

Now we already have a few options, and I don't know which one got
implemented:

A. Perhaps ES also told Cal to expect that UML is running:
    Cal ---RUN--> UML

B. Perhaps Cal just asks everyone what their current request is:
    Cal ---GET--> UML

C. Perhaps something else?

In any case, we already see a first race here. Let's say A happened,
and now:

 ! SIGIO -> UML
 UML calls simple_timetravel_handler -> time_travel_add_irq_event
 UML ---REQUEST--> Cal
 UML <----RUN----- Cal
 // UML really confused -> patch 4

Or maybe:

 UML <----RUN----- Cal
 UML: starts running, starts manipulating time event list
 ! SIGIO -> UML
 UML calls simple_timetravel_handler -> time_travel_add_irq_event
     -> manipulates time event list
 // UML corrupts data structures -> patch 8

Or:

 UML <----RUN----- Cal
 UML: runs only really briefly, sends new request,
      time_travel_ext_prev_request_valid = true
 ! SIGIO -> UML
 UML calls simple_timetravel_handler -> time_travel_add_irq_event
     -> sees time_travel_ext_prev_request_valid == true
 // no new request, Cal confused -> patch 6, and maybe 7

I'm not sure that's all completely accurate, and almost certainly
there are more scenarios that cause issues. But in all cases the root
cause is the asynchronous nature of doing this; partially internally
in UML (list protection etc.) and partially outside UML (the calendar
doesn't know what's supposed to happen next until the async SIGIO has
been processed.)

In contrast, with virtio, you get:

 ES -- event ----> UML  // so far nothing happened, the host kernel
                        // queued SIGIO to UML
                        // ES waits for ACK
 ! SIGIO -> UML
 UML calls simple_timetravel_handler -> time_travel_add_irq_event
 UML ---REQUEST--> Cal
 UML <--ACK------- Cal
 ES <---ACK------- UML
 ES: continues running until idle and returns to Cal
 Cal: schedules next runnable entry, likely UML

Note how, due to the "ES waits for ACK" step, nothing bad happens:
even if the host doesn't schedule the SIGIO to the UML process
immediately, or that takes some time, _nothing_ else in the simulation
makes progress either until UML has.

IMNSHO that's a far simpler model than enumerating all the potential
races (of which I outlined some above) and trying to work with them.

Then I went on to say that we could basically make it _all_ the
sender's responsibility on behalf of the receiver, and then we'd get
(a rough sender-side sketch follows below):

 ES -- event ----> UML  // so far nothing happened, the host kernel
                        // queued SIGIO to UML
 ES -- add UML --> Cal
 ES <--- ACK ----- Cal
 ES: continues running until idle and returns to Cal
 Cal: schedules next runnable entry, likely UML

Now the only concurrency we still need to handle comes from the two
ways this scenario can continue:

1.
 Cal --- RUN ----> UML
 ! SIGIO -> UML

and

2.
 ! SIGIO -> UML
 Cal --- RUN ----> UML

Obviously 2 is what you expect, but as before you can have races like
in 1, and the SIGIO can happen at roughly any time.
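For concreteness, in that "sender does it all" model the ES side would
have to do something like the following. Again just a sketch:
calendar_add_on_behalf, calendar_wait_ack, current_time and UML_ID
don't exist in any protocol today, which is exactly the problem I get
to further down:

	#include <stddef.h>
	#include <unistd.h>

	/* none of these exist today -- pure illustration */
	void calendar_add_on_behalf(int cal_fd, int id,
				    unsigned long long time);
	void calendar_wait_ack(int cal_fd);
	unsigned long long current_time(void);
	#define UML_ID 1

	static void send_line_event(int line_fd, int cal_fd,
				    const void *buf, size_t len)
	{
		/* "ES -- event ----> UML": host queues SIGIO to UML */
		write(line_fd, buf, len);

		/* "ES -- add UML --> Cal": entry on UML's behalf */
		calendar_add_on_behalf(cal_fd, UML_ID, current_time());

		/* "ES <--- ACK ----- Cal": don't continue until acked */
		calendar_wait_ack(cal_fd);
	}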
I'm tempted to say the only reasonable way to handle those two
orderings would be to basically not do _anything_ in the SIGIO
handler, but instead poll all the file descriptors this might happen
for when a RUN message is received.

We could even go so far as to add a new RUN_TRIGGERED_BY_OTHER message
that would make it clear that someone else entered the entry into the
calendar, and only epoll for interrupts for this message, but I'm not
sure that's needed or even makes sense (if you were going to wake up
anyway at that time, you'd still handle interrupts.)

In any case, that feels like a _far_ more tractable problem, and the
only concurrency left would be between running and the SIGIO, where
the SIGIO basically no longer matters.

However ... if we consider this from the other side, we can actually
see that it's much harder to implement than it sounds: now suddenly,
instead of just having to connect the sockets with each other etc.,
you also have to give the implementation knowledge about who is on the
other side, and how to even make a calendar request on their behalf!

We don't even have a protocol for making a request on someone else's
behalf today (though with the shared memory you could just enter one),
and actually changing the protocol to support it might even be tricky,
since there isn't much space for extra data in the messages ... maybe
if we split 'op' or use 'seq' for it ... But you'd need to have a
(pre-assigned?) ID for each participant, or have them exchange IDs
over something, ideally the socket that's actually used for
communication, but that limits the kinds of sockets you can use, etc.

So it's by no means trivial either. And I understand that. I just
don't think making the protocol and the implementations handle all the
races that happen when you start doing things asynchronously is really
a good idea.

johannes
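PS: in code, the "do nothing in SIGIO, poll on RUN" idea would be
roughly the following. Sketch only; epfd and deliver_irq_for_fd are
made up, and the real thing would live in the arch/um IRQ code:

	#include <sys/epoll.h>

	#define MAX_EVENTS 16

	static void deliver_irq_for_fd(int fd);	/* made up */
	static int epfd;  /* all interrupt-capable fds registered here */

	static void handle_run(void)
	{
		struct epoll_event events[MAX_EVENTS];
		int i, n;

		/* timeout 0: just collect whatever is pending right now */
		n = epoll_wait(epfd, events, MAX_EVENTS, 0);
		for (i = 0; i < n; i++)
			deliver_irq_for_fd(events[i].data.fd);
	}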
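PPS: for reference, the calendar message I mean above is, if I
remember include/uapi/linux/um_timetravel.h correctly, just this:

	struct um_timetravel_msg {
		__u32 op;	/* UM_TIMETRAVEL_* operation */
		__u32 seq;	/* sequence number, echoed back in the ACK */
		__u64 time;	/* time in nanoseconds */
	};

so a "request on someone else's behalf" would have to squeeze an ID
into 'op' or 'seq', or grow the message.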