Hey Nikita,

> Thank you for the proposal. Ergonomics of async I/O in PHP certainly leave
> something to be desired right now, and improvements in this area are
> welcome.
>
> Despite your explanations in the RFC and this thread, I'm still having a
> hard time understanding the purpose of the FiberScheduler.
>
> My current understanding is that the FiberScheduler is a special type of
> fiber that cannot be explicitly scheduled by the user -- it is
> automatically scheduled by Fiber::suspend() and automatically un-scheduled
> by Fiber::resume() or Fiber::throw(). It's the fiber that runs between
> fibers :) Does that sound accurate?
>
Yes, that's accurate. Fibers are used for cooperative multi-tasking and
there's usually a single scheduler responsible for the scheduling. Multiple
schedulers would block each other or busy wait. Having multiple schedulers
is therefore strongly discouraged in long-running applications. It might,
however, be acceptable in traditional applications, i.e. PHP-FPM. In PHP-FPM,
multiple schedulers partially blocking each other is still better than
blocking entirely for every I/O operation.

> What's not clear to me is why the scheduling fiber needs to be
> distinguished from other fibers. If we want to stick with the general
> approach, why is Fiber::suspend($scheduler) not Fiber::transferTo($fiber),
> where $fiber would be the fiber serving as scheduler (but otherwise a
> normal Fiber)? I would expect that context-switching between arbitrary
> fibers would be both most expressive, and make for the smallest interface.
>

There are a few reasons to make a difference here:

 - Scheduler fibers are run to completion at script end, which isn't the
   case for normal fibers.
 - Terminating fibers need a fiber to return to. For schedulers it's fine
   if a resumed fiber terminates; for normal fibers it should be an
   exception if the scheduler fiber terminates without explicitly resuming
   the suspended fiber.
 - Keeping the previous fiber for each suspension point is complicated, if
   not impossible, to get right, and generally complicates the
   implementation and increases cognitive load. See the following example:

   main -> A -> B -> C -> A (terminates) -> C (previous) -> B (terminates)
   -> C (previous, terminates) -> main

   In the example above, the linked list of previous fibers from C back to
   main needs to be optimized at some point; otherwise A and B need to be
   kept in memory and thus leak memory until C is resumed.

I'm sure Aaron can present a few other reasons to keep the separation.

> The more limited alternative is to instead have Fiber::suspend() return to
> the parent fiber (the one that resume()d it).
> Here, the parent fiber
> effectively becomes the scheduler fiber. If I understand right, the reason
> why you don't want to use that approach, is that it doesn't allow you to
> call some AMP function from the {main} fiber, create the scheduler there
> and then treat {main} just like any other fiber. Is that correct?
>

Correct, this wouldn't allow top-level Fiber::suspend(). It would also make
the starting / previously resuming party responsible for resuming the fiber
instead of the fiber being able to "choose" the scheduler for a specific
suspension point. One fiber would thus be effectively limited to a single
scheduler.

Best,
Niklas