Re: [PHP-DEV] PHP True Async RFC
Good day, Alex.

> Can you please share a bit more details on how the Scheduler is implemented, to make sure that I understand why this contradiction exists? Also with some examples, if possible.

Yes, of course, let's try to look at this in more detail. Here is the classic code demonstrating how Fiber works:

```php
$fiber1 = new Fiber(function () {
    echo "Fiber 1 starts\n";

    $fiber2 = new Fiber(function () {
        echo "Fiber 2 starts\n";

        Fiber::suspend(); // Suspend the inner fiber; control returns to Fiber 1
        echo "Fiber 2 resumes\n";
    });

    $fiber2->start();  // Run Fiber 2 until it suspends
    echo "Fiber 1 is back in control\n";
    $fiber2->resume(); // Continue Fiber 2 after its suspension point
});

$fiber1->start();
```

Fiber1 creates Fiber2. When Fiber2 yields control, execution returns to Fiber1.

Now, let's try to do the same thing with Fiber3. Inside Fiber2, we create Fiber3. Everything will work perfectly: Fiber3 will return control to Fiber2, and Fiber2 will return it to Fiber1. This forms a hierarchy.

Now, imagine that we want to turn Fiber1 into a *Scheduler* while following these rules. To achieve this, we need to ensure that all Fiber instances are created from the *Scheduler*, so that control can always be properly returned.

```php
class Scheduler {
    /** @var Fiber[] */
    private array $queue = [];

    public function add(callable $task) {
        $fiber = new Fiber($task);
        $this->queue[] = $fiber;
    }

    public function run() {
        while (!empty($this->queue)) {
            $fiber = array_shift($this->queue);

            if (!$fiber->isStarted()) {
                // A new fiber has to be started first; the Scheduler is passed as the task argument
                $fiber->start($this);
            } elseif ($fiber->isSuspended()) {
                $fiber->resume();
            }
        }
    }

    public function yield() {
        $fiber = Fiber::getCurrent();
        if ($fiber) {
            // Re-queue the current fiber, then hand control back to run()
            $this->queue[] = $fiber;
            Fiber::suspend();
        }
    }
}

$scheduler = new Scheduler();

$scheduler->add(function (Scheduler $scheduler) {
    echo "Task 1 - Step 1\n";
    $scheduler->yield();
    echo "Task 1 - Step 2\n";
});

$scheduler->add(function (Scheduler $scheduler) {
    echo "Task 2 - Step 1\n";
    $scheduler->yield();
    echo "Task 2 - Step 2\n";
});

$scheduler->run();
```

So, to successfully switch between Fibers:

1. A Fiber must return control to the *Scheduler*.
2. The *Scheduler* selects the next Fiber from the queue and switches to it.
3. That Fiber then returns control back to the *Scheduler* again.

This algorithm has one drawback: *it requires two context switches instead of one*. We could instead switch *FiberX* to *FiberY* directly.

Breaking the contract not only disrupts the code in this RFC but also affects Revolt's functionality. However, in the case of Revolt, you can say: *"If you use this library, follow the library's contracts and do not use Fiber directly."*

But PHP is not just a library, it's a language that must remain consistent and cohesive.

> Reading the RFC initially, I thought that the Scheduler is using fibers for everything that runs.

Exactly.

> You mean that when one of the fibers started by the Scheduler is starting other fibers they would usually await for them to finish, and that is a blocking operation that blocks also the Scheduler?

When a *Fiber* from the *Scheduler* decides to create another *Fiber* and then tries to call blocking functions inside it, control can no longer return to the *Scheduler* from those functions.

Of course, it would be possible to track the state and disable the concurrency mode flag when the user manually creates a *Fiber*. But… this wouldn't lead to anything good. Not only would it complicate the code, but it would also result in a mess with different behavior inside and outside of *Fiber*.

This is even worse than calling *startScheduler*.

The hierarchical switching rule is a *design flaw* that happened because a *low-level component* was introduced into the language as part of the implementation of a *higher-level component*. However, the high-level component is in *User-land*, while the low-level component is in *PHP core*.

It's the same as implementing $this in OOP but requiring it to be explicitly passed in every method. This would lead to inconsistent behavior.

So, this situation needs to be resolved one way or another.

--
Ed
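The point about manually created fibers can be illustrated with the existing Fiber API alone. The following is a minimal sketch, not code from the RFC: `$blockingCall` is a hypothetical stand-in for a scheduler-aware blocking function, and the "scheduler" is reduced to a single `start()` call. It shows that `Fiber::suspend()` inside a hand-made fiber returns control to the fiber that started it, not to the outer loop.

```php
<?php
// Records the order of events so the control flow can be inspected.
$log = [];

// Hypothetical stand-in for a blocking function that tries to yield to the
// scheduler by suspending the *current* fiber.
$blockingCall = function () use (&$log): void {
    $log[] = 'inner: suspending';
    Fiber::suspend();
    $log[] = 'inner: resumed';
};

// A task owned by the (imaginary) scheduler loop.
$task = new Fiber(function () use (&$log, $blockingCall): void {
    $log[] = 'task: start';

    // The user creates a fiber by hand, without the scheduler knowing.
    $inner = new Fiber($blockingCall);
    $inner->start();

    // The suspension lands here, in the parent fiber, not in the scheduler.
    $log[] = 'task: got control back from inner';

    $inner->resume();
    $log[] = 'task: end';
});

// The "scheduler loop", reduced to one step for illustration.
$task->start();
$log[] = 'scheduler: control returned';

echo implode("\n", $log), "\n";
```

In this sketch the scheduler only regains control after the task itself has finished, which is the hierarchy problem described in the message.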
Re: [PHP-DEV] PHP True Async RFC
> Maybe, we could create a different version of fibers ("managed fibers", maybe?) distinct from the current implementation, with the idea to deprecate them in PHP 10?
> Then, at least, the scheduler could always be running. If you are using existing code that uses fibers, you can't use the new fibers but it will "just work" if you aren't using the new fibers (since the scheduler will never pick up those fibers).

Yes, that can be done. It would be good to maintain compatibility with XDEBUG, but that needs to be investigated.

During our discussion, everything seems to be converging on the idea that the changes introduced by the RFC into Fiber would be better moved to a separate class. This would reduce confusion between the old and new solutions. That way, developers wouldn't wonder why Fiber and coroutines behave differently: they are simply different classes.

The new *Coroutine* class could have a different interface with new logic. This sounds like an excellent solution.

The interface could look like this:

- *suspend* (or another clear name) – a method that explicitly hands over execution to the *Scheduler*.
- *defer* – a handler that is called when the coroutine completes.
- *cancel* – a method to cancel the coroutine.
- *context* – a property that stores the execution context.
- *parent* (public property or getParent() method) – returns the parent coroutine.

(*Just an example for now.*)

The *Scheduler* would be activated automatically when a coroutine is created. If the index.php script reaches the end, the interpreter would wait for the *Scheduler* to finish its work under the hood.

Do you like this approach?

---

Ed.
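The member list above could be written down roughly as follows. This is only a sketch under stated assumptions: `CoroutineSketch` is a placeholder name, none of these methods exist in PHP today, `suspend()` is assumed to be static, and the context and parent are exposed through getters because interfaces cannot declare properties before PHP 8.4.

```php
<?php
// Hypothetical sketch only: not part of PHP or of the RFC text.
interface CoroutineSketch
{
    /** Explicitly hand execution over to the Scheduler (assumed static). */
    public static function suspend(): void;

    /** Register a handler that is called when the coroutine completes. */
    public function defer(callable $handler): void;

    /** Request cancellation of the coroutine. */
    public function cancel(): void;

    /** Return the object that stores the execution context. */
    public function getContext(): object;

    /** Return the parent coroutine, or null for the root coroutine. */
    public function getParent(): ?self;
}
```

Keeping this separate from `Fiber` matches the idea in the message: the existing class stays untouched and the new behavior lives behind a new name.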
Re: [PHP-DEV] PHP True Async RFC
On Sun, Mar 9, 2025, 09:05 Edmond Dantes wrote:

> When a *Fiber* from the *Scheduler* decides to create another *Fiber* and then tries to call blocking functions inside it, control can no longer return to the *Scheduler* from those functions.
>
> Of course, it would be possible to track the state and disable the concurrency mode flag when the user manually creates a *Fiber*. But… this wouldn't lead to anything good. Not only would it complicate the code, but it would also result in a mess with different behavior inside and outside of *Fiber*.

Thank you for explaining the problem space. Now let's see what solutions we can find.

First of all, I think it would be better for the language to assume the Scheduler is always running and not have to be manually started.

An idea that I have for now: have a different method `Fiber::suspendToScheduler(Resume $resume)` that would return the control to the Scheduler. This one would be used by all internal functions that do blocking operations, and maybe also by userland ones if they need to. Of course, the name can be better, like `Fiber::await`.

Maybe that is what we need: to be able to return control both to the parent fiber for custom logic that might be needed, and to the Scheduler so that the language would be concurrent.

As for userland event loops, like Revolt, I am not so sure they fit with the new language level async model. But I can see how they could implement a different Event loop that would run only one "loop", schedule a deferred callback and pass control to the Scheduler (that would return the control in the next iteration to perform one more loop, and so on).

--
Alex
Re: [PHP-DEV] PHP True Async RFC
On Sun, Mar 9, 2025, at 09:05, Edmond Dantes wrote:
> When a *Fiber* from the *Scheduler* decides to create another *Fiber* and then tries to call blocking functions inside it, control can no longer return to the *Scheduler* from those functions.
>
> Of course, it would be possible to track the state and disable the concurrency mode flag when the user manually creates a *Fiber*. But… this wouldn't lead to anything good. Not only would it complicate the code, but it would also result in a mess with different behavior inside and outside of *Fiber*.
>
> This is even worse than calling *startScheduler*.
>
> The hierarchical switching rule is a *design flaw* that happened because a *low-level component* was introduced into the language as part of the implementation of a *higher-level component*. However, the high-level component is in *User-land*, while the low-level component is in *PHP core*.
>
> It's the same as implementing `$this` in OOP but requiring it to be explicitly passed in every method. This would lead to inconsistent behavior.
>
> So, this situation needs to be resolved one way or another.
>
> --
> Ed

Hi Ed,

If I remember correctly, the original implementation of Fibers was built in such a way that extensions could create their own fiber types that were distinct from fibers but reused the context switch code.

From the original RFC:

> An extension may still optionally provide their own custom fiber implementation, but an internal API would allow the extension to use the fiber implementation provided by PHP.

Maybe, we could create a different ve
Re: [PHP-DEV] PHP True Async RFC
> The wait_all block is EXPLICITLY DESIGNED to meddle with the internals of async libraries,

How exactly does it interfere with the implementation of asynchronous libraries? Especially considering that these libraries operate at the User-land level? It’s a contract. No more. No less.

> Libraries can full well handle cleanup of fibers in __destruct by themselves, without a wait_all block forcing them to reduce concurrency whenever the caller pleases.

Fiber is a final class, so there can be no destructors here. Even if you create a "Coroutine" class and allow defining a destructor, the result will be overly verbose code. I and many other developers have tested this. And the creators of AMPHP did not take this approach. Go doesn’t have it either. This is not a coincidence.

> It is, imo, a MAJOR FOOTGUN, and should not be even considered for implementation.

Why exactly is this a FOOTGUN?

- Does this block lead to new violations of language integrity?
- Does this block increase the likelihood of errors?

A FOOTGUN is something that significantly breaks the language and pushes developers toward writing bad code. This is a rather serious flaw.
Re: [PHP-DEV] PHP True Async RFC
On 08/03/2025 22:28, Daniil Gentili wrote:

> Even if its use is optional, its presence in the language could lead library developers to reduce concurrency in order to allow calls from async blocks, (i.e. don't spawn any background fiber in a method call because it might be called from an async {} block) which is what I meant by crippling async PHP.

I think you've misunderstood what I meant by optional. I meant that putting the fiber into the managed context would be optional *at the point where the fiber was spawned*.

A library wouldn't need to "avoid spawning background fibers", it would simply have the choice between "spawn a fiber that is expected to finish within the current managed scope, if any", and "spawn a fiber that I promise to manage myself, and please ignore anyone trying to manage it for me".

There have been various suggestions of exactly what that could look like, e.g. in https://externals.io/message/126537#126625 and https://externals.io/message/126537#126630

> The naming of "async {}" is also very misleading, as it does the opposite of making things async, if anything it should be called "wait_all {}"

Yes, "async{}" is a bit of a generic placeholder name; I think Larry was the first to use it in an illustration, and we've been discussing exactly what it might mean. As we pin down more precise suggestions, we can probably come up with clearer names for them.

The tone of your recent e-mails suggests you believe someone is forcing this precise keyword into the language, right now, and you urgently need to stop it before it's too late. That's not where we are at all; we're trying to work out if some such facility would be useful, and what it might look like.

It sounds like you think:

1) The language absolutely needs a "spawn detached" operation, i.e. a way of starting a new fiber which is queued in the global scheduler, but has no automatic relationship to its parent.
2) If the language offered both "spawn managed" and "spawn detached", the "detached" mode would be overwhelmingly more common (i.e. users and library authors would want to manage the lifecycle of their coroutines manually), so the "spawn managed" mode isn't worth implementing.

Would that be a fair summary of your opinion?

--
Rowan Tommins
[IMSoP]
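The difference between the two spawn modes can be sketched in userland with today's Fiber API. Everything here is an assumption made for illustration: `MiniScheduler`, `ManagedScope`, `spawnDetached()` and `waitAll()` are hypothetical names, not syntax or functions proposed in the thread. The scope waits only for the fibers spawned through it, while a detached fiber keeps its own lifecycle.

```php
<?php
// Hypothetical sketch: a global queue plus an optional managed scope.
final class MiniScheduler
{
    /** @var Fiber[] */
    private static array $queue = [];

    // "Spawn detached": queued globally, no relationship to any scope.
    public static function spawnDetached(callable $task): Fiber
    {
        $fiber = new Fiber($task);
        self::$queue[] = $fiber;
        return $fiber;
    }

    // Run one step of the next queued fiber; returns false if nothing is queued.
    public static function tick(): bool
    {
        $fiber = array_shift(self::$queue);
        if ($fiber === null) {
            return false;
        }
        // Queued fibers are either not started yet or suspended.
        $fiber->isStarted() ? $fiber->resume() : $fiber->start();
        if (!$fiber->isTerminated()) {
            self::$queue[] = $fiber; // suspended again: run it later
        }
        return true;
    }
}

final class ManagedScope
{
    /** @var Fiber[] */
    private array $children = [];

    // "Spawn managed": same queue, but the scope remembers the fiber.
    public function spawn(callable $task): Fiber
    {
        $fiber = MiniScheduler::spawnDetached($task);
        $this->children[] = $fiber;
        return $fiber;
    }

    // Keep the scheduler turning until every managed child has finished.
    public function waitAll(): void
    {
        foreach ($this->children as $child) {
            while (!$child->isTerminated() && MiniScheduler::tick()) {
                // intentionally empty: tick() does the work
            }
        }
    }
}

$log = [];
$scope = new ManagedScope();
$scope->spawn(function () use (&$log): void {
    $log[] = 'managed: step 1';
    Fiber::suspend();
    $log[] = 'managed: step 2';
});
$background = MiniScheduler::spawnDetached(function () use (&$log): void {
    $log[] = 'detached: step 1';
    Fiber::suspend();
    $log[] = 'detached: step 2';
});
$scope->waitAll();
$log[] = 'scope finished';
```

In this sketch the scope finishes while the detached fiber is still suspended, so the library that spawned it decides when it ends.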
Re: [PHP-DEV] Re: PHP True Async RFC
> I think the same thing applies to scheduling coroutines: we want the Scheduler to take over the "null fiber",

Yes, you have quite accurately described a possible implementation. When a programmer loads the initial index.php, its code is already running inside a coroutine. We can call it the main coroutine or the root coroutine.

When the index.php script reaches its last instruction, the coroutine finishes, execution is handed over to the Scheduler, and then everything proceeds as usual.

Accordingly, if the Scheduler has more coroutines in the queue, reaching the last line of index.php does not mean the script terminates. Instead, it continues executing the queue until... there is nothing left to execute.

> At that point, the relationship to a block syntax perhaps becomes clearer:

Thanks to the extensive discussion, I realized that the implementation with startScheduler raises too many questions, and it's better to sacrifice a bit of backward compatibility for the sake of language elegance.

After all, Fiber is unlikely to be used by ordinary programmers.
Re: [PHP-DEV] PHP True Async RFC
> I can give you several examples where such logic is used in Amphp libraries, and it will break if they are invoked within an async block.

Got it, it looks like I misunderstood the post due to my focus. So, essentially, you're talking not so much about wait_all itself, but rather about the parent-child vs. free model.

This question is what concerns me the most right now.

If you have real examples of how this can cause problems, I would really appreciate it if you could share them. Code is the best criterion of truth.

> You misunderstand:

Yes, I misunderstood. It would be interesting to see the code with the destructor to analyze this approach better.

*Let me summarize the current state for today:*

1. I am abandoning startScheduler and the idea of preserving backward compatibility with await_all or anything else in that category. The scheduler will be initialized implicitly, and this does not concern user-land. Consequently, the spawn function() code will work everywhere and always.
2. I will not base the implementation on Fiber (perhaps only on the low-level part). Instead of Fiber, there will be a separate class. There will be no changes to Fiber at all. This decision follows the principle of Win32 COM/DCOM: old interfaces should never be changed. If an old interface needs modification, it should be given a new name. This should have been done from the start.
3. I am abandoning low-level objects in PHP-land (FiberHandle, SocketHandle, etc.). Over time, no one has voted for them, which means they are unnecessary. There might be a low-level interface for compatibility with Revolt.
4. It might be worth restricting microtasks in PHP-land and keeping them only for C code. This would simplify the interface, but we need to ensure that it doesn’t cause any issues.

The remaining question on the agenda: deciding which model to choose, *parent-child* or the *Go-style model*.

Thanks

---

Ed
Re: [PHP-DEV] Re: PHP True Async RFC
On 08/03/2025 20:22, Edmond Dantes wrote:

> For coroutines to work, a Scheduler must be started. There can be only one Scheduler per OS thread. That means creating a new async task does not create a new Scheduler.
>
> Apparently, async {} in the examples above is the entry point for the Scheduler.

I've been pondering this, and I think talking about "starting" or "initialising" the Scheduler is slightly misleading, because it implies that the Scheduler is something that "happens over there".

It sounds like we'd be writing this:

    // No scheduler running, this is probably an error
    Async\runOnScheduler( something(...) );

    Async\startScheduler();
    // Great, now it's running...

    Async\runOnScheduler( something(...) );

    // If we can start it, we can stop it I guess?
    Async\stopScheduler();

But that's not what we're talking about. As the RFC says:

> Once the Scheduler is activated, it will take control of the Null-Fiber context, and execution within it will pause until all Fibers, all microtasks, and all event loop events have been processed.

The actual flow in the RFC is like this:

    // This is queued somewhere special, ready for a scheduler to pick it up later
    Async\enqueueForScheduler( something(...) );

    // Only now does anything actually run
    Async\runSchedulerUntilQueueEmpty();
    // At this point, the scheduler isn't running any more

    // If we add to the queue now, it won't run unless we run another scheduler
    Async\enqueueForScheduler( something(...) );

Pondering this, I think one of the things we've been missing is what Unix[-like] systems call "process 0". I'm not an expert, so may get details wrong, but my understanding is that if you had a single-tasking OS, and used it to bootstrap a Unix[-like] system, it would look something like this:

1. You would replace the currently running single process with the new kernel / scheduler process
2. That scheduler would always start with exactly one process in the queue, traditionally called "init"
3. The scheduler would hand control to process 0 (because it's the only thing in the queue), and that process would be responsible for starting all the other processes in the system: TTYs and login prompts, network daemons, etc

I think the same thing applies to scheduling coroutines: we want the Scheduler to take over the "null fiber", but in order to be useful, it needs something in its queue.

So I propose we have a similar "coroutine zero" [name for illustration only]:

    // No scheduler running, this is an error
    Async\runOnScheduler( something(...) );

    Async\runScheduler(
        coroutine_zero: something(...)
    );
    // At this point, the scheduler isn't running any more

It's then the responsibility of "coroutine 0", here the function "something", to schedule what's actually wanted, like a network listener, or a worker pool reading from a queue, etc.

At that point, the relationship to a block syntax perhaps becomes clearer:

    async {
        spawn start_network_listener();
    }

is roughly (ignoring the difference between a code block and a closure) sugar for:

    Async\runScheduler(
        coroutine_zero: function() {
            spawn start_network_listener();
        }
    );

That leaves the question of whether it would ever make sense to nest those blocks (indirectly, e.g. something() itself contains an async{} block, or calls something else which does).

I guess in our analogy, nested blocks could be like running Containers within the currently running OS: they don't actually start a new Scheduler, but they mark a namespace of related coroutines, that can be treated specially in some way.

Alternatively, it could simply be an error, like trying to run the kernel as a userland program.

--
Rowan Tommins
[IMSoP]
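The "coroutine zero" idea can be approximated with the existing Fiber API. The sketch below is only an illustration under stated assumptions: `SketchScheduler`, `run()` and `spawn()` are hypothetical names standing in for `Async\runScheduler()` and the `spawn` keyword, neither of which exists. The queue always starts with exactly one entry, and `run()` returns only when the queue is empty.

```php
<?php
// Hypothetical userland sketch of "run the scheduler with a coroutine zero".
final class SketchScheduler
{
    /** @var Fiber[] */
    private static array $queue = [];

    // Stand-in for "spawn": enqueue a new task on the running scheduler.
    public static function spawn(callable $task): void
    {
        self::$queue[] = new Fiber($task);
    }

    // Stand-in for Async\runScheduler(coroutine_zero: ...).
    public static function run(callable $coroutineZero): void
    {
        // The queue starts with exactly one entry: coroutine zero.
        self::$queue = [new Fiber($coroutineZero)];

        while (self::$queue !== []) {
            $fiber = array_shift(self::$queue);

            if (!$fiber->isStarted()) {
                $fiber->start();
            } elseif ($fiber->isSuspended()) {
                $fiber->resume();
            }

            if (!$fiber->isTerminated()) {
                self::$queue[] = $fiber; // suspended: give it another turn later
            }
        }
        // Queue is empty: the scheduler is no longer running.
    }
}

$log = [];
SketchScheduler::run(function () use (&$log): void {
    $log[] = 'zero: start';
    SketchScheduler::spawn(function () use (&$log): void {
        $log[] = 'listener: step 1';
        Fiber::suspend();
        $log[] = 'listener: step 2';
    });
    SketchScheduler::spawn(function () use (&$log): void {
        $log[] = 'worker: done';
    });
    $log[] = 'zero: end';
});

echo implode("\n", $log), "\n";
```

Here coroutine zero only schedules the real work and then ends, while the scheduler keeps running until the spawned tasks are done, mirroring the init analogy.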
Re: [PHP-DEV][RFC][VOTE] Add mb_levenshtein function
On Sat, Mar 8, 2025 at 19:06, Niels Dossche wrote:

> On 08/03/2025 03:30, youkidearitai wrote:
> > Hi, Internals
> >
> > The add mb_levenshtein was end and declined.
> > Vote result is one yes and 5 no.
> >
> > Thank you very much voting.
> >
> > By the way, This message is means add grapheme_levenshtein instead of mb_levenshtein?
> > Or nothing to do?
> > Feel free to comment.
> >
> > Thank you again.
> > Yuya.
>
> Hi Yuya
>
> I think an RFC for grapheme_levenshtein would be better, it would have my vote at least.
> Levenshtein makes more sense on graphemes than on unicode codepoints.
>
> Kind regards
> Niels

Hi, Niels

Thank you very much for your reply.
Okay. I will go ahead with a grapheme_levenshtein RFC.

Kind regards
Yuya

--
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- https://github.com/youkidearitai
Re: [PHP-DEV] PHP True Async RFC
> I think you've misunderstood what I meant by optional. I meant that putting the fiber into the managed context would be optional *at the point where the fiber was spawned*.
>
> It sounds like you think:
>
> 1) The language absolutely needs a "spawn detached" operation, i.e. a way of starting a new fiber which is queued in the global scheduler, but has no automatic relationship to its parent.
> 2) If the language offered both "spawn managed" and "spawn detached", the "detached" mode would be overwhelmingly more common (i.e. users and library authors would want to manage the lifecycle of their coroutines manually), so the "spawn managed" mode isn't worth implementing.
>
> Would that be a fair summary of your opinion?

Indeed, yes! That would be a complete summary of my opinion.

If the user could choose whether to add fibers to the managed context or not, that would be more acceptable IMO.

Then again see point 2, plus even an optional managed fiber context still introduces a certain degree of "magicness" and non-obvious/implicit behavior on the initiative of the caller, that can be avoided by simply explicitly returning and awaiting any spawned fibers.

Regards,
Daniil Gentili.
Re: [PHP-DEV] PHP True Async RFC
> Have a different method `Fiber::suspendToScheduler(Resume $resume)` that would return the control to the Scheduler.

That's exactly how it works. The RFC includes the method Async\wait() (Fiber::await() is nice), which hands control over to the Scheduler. At the PHP core level, there is an equivalent method used by all blocking functions.

In other words, Fiber::suspend is not needed; instead, the Scheduler API is used.

The only question is backward compatibility. If, for example, it is agreed that the necessary changes will be made in Revolt when this feature is released and we do not support the old behavior, then there is no problem.

> Maybe that is what we need: to be able to return control both to the parent fiber for custom logic that might be needed, and to the Scheduler so that the language would be concurrent.

100% yes.

> As for userland event loops, like Revolt, I am not so sure they fit with the new language level async model.

Revolt can be adapted to this RFC by modifying the Driver module. I actually reviewed its code again today to assess the complexity of this change. It looks like it shouldn’t be difficult at all.

The only problem arises with the code that has already been written and is publicly available. I know that the AMPHP stack is in use, so we need a *flow* that ensures a smooth transition.

As I understand it, you believe that it’s better to introduce more radical changes and not be afraid of breaking old code. In that case, there are no questions at all.
Re: [PHP-DEV] PHP True Async RFC
>> The wait_all block is EXPLICITLY DESIGNED to meddle with the internals of async libraries,
>
> How exactly does it interfere with the implementation of asynchronous libraries?
> Especially considering that these libraries operate at the User-land level?
> It’s a contract. No more. No less.

It does so by forcing all code within it to terminate all running fibers.

If any library invoked within a wait_all block suddenly decides to spawn a long-running fiber that is not stopped when exiting the block, but for example later, when the library itself decides to, the wait_all block will not exit, essentially forcing the library user or developer to mess with the internals and forcefully terminate the background fiber.

The choice should never be up to the caller, and the presence of the wait_all block gives any caller the option to break the internal logic of libraries.

I can give you several examples where such logic is used in Amphp libraries, and it will break if they are invoked within an async block.

>> Libraries can full well handle cleanup of fibers in __destruct by themselves, without a wait_all block forcing them to reduce concurrency whenever the caller pleases.
>
> Fiber is a *final* class, so there can be no destructors here. Even if you create a "Coroutine" class and allow defining a destructor, the result will be overly verbose code. I and many other developers have tested this.

You misunderstand: this is about storing the FiberHandles of spawned fibers and awaiting them in the __destruct of an object (the same object that spawned them in a method), in order to make sure all spawned fibers are awaited and all unhandled exceptions are handled somewhere (in absence of an event loop error handler).

Also see my discussion about ignoring referenced futures: https://externals.io/message/126537#126661

>> It is, imo, a MAJOR FOOTGUN, and should not be even considered for implementation.
>
> Why exactly is this a FOOTGUN?
>
> * Does this block lead to new violations of language integrity?
> * Does this block increase the likelihood of errors?

1) Yes, because it gives users tools to mess with the internal behavior of userland libraries.
2) Yes, because (especially given how it's named) accidental usage will break existing and new async libraries by endlessly awaiting upon background fibers when exiting an async {} block haphazardly used by a newbie when calling most async libraries, or even worse force library developers to reduce concurrency, killing async PHP just because users can use async {} blocks.

> A FOOTGUN is something that significantly breaks the language and pushes developers toward writing bad code. This is a rather serious flaw.

Indeed, this is precisely the case.

As the maintainer of Psalm, among others, I fully understand the benefits of purity and immutability: however, this keyword is a toy exercise in purity, with no real usecases (all real usecases being already covered by awaitAll), which cannot work in the real world in current codebases and will break real-world applications if used, with consequences on the ecosystem.

I don't know what else to say on the topic, I feel like I've made myself clear on the matter: if you still feel like it's a good idea and it should be added to the RFC as a separate poll, I can only hope that the majority will see the danger of adding such a useless keyword and vote against on that specific matter.

Regards,
Daniil Gentili.
Re: [PHP-DEV] Re: PHP True Async RFC
Edmond,

The language barrier is bigger (because of me, I cannot properly explain it) so I will keep it simple. Having "await" makes it sync, not async. In hardware we use interrupts but we have to do it grandma style... The main loop checks variables that are set by the interrupts, which is async. So you have a main loop that checks a variable, but that variable is set from another part of the processor cycle that has nothing to do with the main loop (it is not fire-and-forget style; it is in real time).

Basically you can have a standard `int main()` function that is sync because you can delay in it (yep sleep(0)), and while you block it you have an event that interrupts a function that works on another register which is independent from the main function. More details on this will probably not be interesting, so I will stop.

If you want to make async PHP with multiple processes you have to check variables protected by semaphores to make it work.

On Sun, Mar 9, 2025 at 8:16 PM Edmond Dantes wrote:

> > I think the same thing applies to scheduling coroutines: we want the Scheduler to take over the "null fiber",
>
> Yes, you have quite accurately described a possible implementation.
> When a programmer loads the initial index.php, its code is already running inside a coroutine.
> We can call it the main coroutine or the root coroutine.
>
> When the index.php script reaches its last instruction, the coroutine finishes, execution is handed over to the Scheduler, and then everything proceeds as usual.
>
> Accordingly, if the Scheduler has more coroutines in the queue, reaching the last line of index.php does not mean the script terminates. Instead, it continues executing the queue until... there is nothing left to execute.
>
> > At that point, the relationship to a block syntax perhaps becomes clearer:
>
> Thanks to the extensive discussion, I realized that the implementation with startScheduler raises too many questions, and it's better to sacrifice a bit of backward compatibility for the sake of language elegance.
>
> After all, Fiber is unlikely to be used by ordinary programmers.

--
Iliya Miroslavov Iliev
i.mirosla...@gmail.com
Re: [PHP-DEV] Re: PHP True Async RFC
> Edmond,
>
> If you want to make async PHP with multiple processes you have to check variables protected by semaphores to make it work.

Hello, Iliya.

Thank you for your feedback. I'm not sure I fully understood the entire context, but at the moment I have no intention of adding multitasking to PHP in the same way it works in Go.

Therefore, code will not require synchronization. The current RFC proposes adding only asynchronous execution. That means each thread will have its own event loop, its own memory, and its own coroutines.

P.S. I also know Russian and a bit of asm.

Ed.
Re: [PHP-DEV] Re: PHP True Async RFC
On Sun, Mar 9, 2025, at 14:17, Rowan Tommins [IMSoP] wrote:

> On 08/03/2025 20:22, Edmond Dantes wrote:
> >
> > For coroutines to work, a Scheduler must be started. There can be only
> > one Scheduler per OS thread. That means creating a new async task does
> > not create a new Scheduler.
> >
> > Apparently, async {} in the examples above is the entry point for the
> > Scheduler.
>
> I've been pondering this, and I think talking about "starting" or
> "initialising" the Scheduler is slightly misleading, because it implies
> that the Scheduler is something that "happens over there".
>
> It sounds like we'd be writing this:
>
> // No scheduler running, this is probably an error
> Async\runOnScheduler( something(...) );
>
> Async\startScheduler();
> // Great, now it's running...
>
> Async\runOnScheduler( something(...) );
>
> // If we can start it, we can stop it I guess?
> Async\stopScheduler();
>
> But that's not what we're talking about. As the RFC says:
>
> > Once the Scheduler is activated, it will take control of the
> > Null-Fiber context, and execution within it will pause until all Fibers,
> > all microtasks, and all event loop events have been processed.
>
> The actual flow in the RFC is like this:
>
> // This is queued somewhere special, ready for a scheduler to pick it up later
> Async\enqueueForScheduler( something(...) );
>
> // Only now does anything actually run
> Async\runSchedulerUntilQueueEmpty();
> // At this point, the scheduler isn't running any more
>
> // If we add to the queue now, it won't run unless we run another scheduler
> Async\enqueueForScheduler( something(...) );
>
> Pondering this, I think one of the things we've been missing is what
> Unix[-like] systems call "process 0". I'm not an expert, so may get
> details wrong, but my understanding is that if you had a single-tasking
> OS, and used it to bootstrap a Unix[-like] system, it would look
> something like this:
>
> 1. You would replace the currently running single process with the new
> kernel / scheduler process
> 2. That scheduler would always start with exactly one process in the
> queue, traditionally called "init"
> 3. The scheduler would hand control to process 0 (because it's the only
> thing in the queue), and that process would be responsible for starting
> all the other processes in the system: TTYs and login prompts, network
> daemons, etc

Slightly off-topic, but you may find the following article interesting: https://manybutfinite.com/post/kernel-boot-process/

It's a bit old, but probably still relevant for the most part. At least for x86.

— Rob
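The "enqueue first, then run until the queue is empty" flow in the quoted text can be approximated in userland with today's `Fiber` class. The following is only a rough sketch under stated assumptions: `QueueScheduler`, `enqueue()`, and `runUntilQueueEmpty()` are hypothetical names invented for illustration, not the RFC's API, and the real proposal would also drive microtasks and event loop events, which this sketch ignores.

```php
<?php
declare(strict_types=1);

// Hypothetical userland stand-in for the flow described above; not the RFC's API.
final class QueueScheduler
{
    /** @var list<Fiber> */
    private array $queue = [];

    // Queue a task; nothing runs at this point.
    public function enqueue(callable $task): void
    {
        $this->queue[] = new Fiber($task);
    }

    // Take over the calling context until every queued fiber has terminated.
    public function runUntilQueueEmpty(): void
    {
        while ($this->queue !== []) {
            $fiber = array_shift($this->queue);

            if (!$fiber->isStarted()) {
                $fiber->start();
            } elseif ($fiber->isSuspended()) {
                $fiber->resume();
            }

            // A fiber that yielded goes to the back of the queue.
            if (!$fiber->isTerminated()) {
                $this->queue[] = $fiber;
            }
        }
    }
}

$log = [];
$scheduler = new QueueScheduler();

$scheduler->enqueue(function () use (&$log): void {
    $log[] = 'A1';
    Fiber::suspend(); // hand control back to the scheduler
    $log[] = 'A2';
});
$scheduler->enqueue(function () use (&$log): void {
    $log[] = 'B1';
});

$log[] = 'main: queued';  // both tasks are queued, neither has run
$scheduler->runUntilQueueEmpty();
$log[] = 'main: drained'; // the scheduler is no longer running here
```

In this sketch the main script is the only thing that can start the drain loop, which is exactly the "happens over there" feel the quoted text objects to; treating the main script itself as the first queued task would remove that asymmetry.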
Re: [PHP-DEV] Re: PHP True Async RFC
On Sun, Mar 9, 2025, at 8:17 AM, Rowan Tommins [IMSoP] wrote:

> That leaves the question of whether it would ever make sense to nest
> those blocks (indirectly, e.g. something() itself contains an async{}
> block, or calls something else which does).
>
> I guess in our analogy, nested blocks could be like running Containers
> within the currently running OS: they don't actually start a new
> Scheduler, but they mark a namespace of related coroutines, that can be
> treated specially in some way.
>
> Alternatively, it could simply be an error, like trying to run the
> kernel as a userland program.

Support for nested blocks is absolutely mandatory, whatever else we do. If you cannot nest one async block (scheduler instance, coroutine, whatever it is) inside another, then basically no code can do anything async except the top level framework.

This function needs to be possible, and work anywhere, regardless of whether there's an "open" async session 5 stack calls up.

    function par_map(iterable $it, callable $c) {
        $result = [];
        async {
            foreach ($it as $val) {
                $result[] = $c($val);
            }
        }
        return $result;
    }

However it gets spelled, the above code needs to be supported.

--Larry Garfield
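For comparison, the nesting requirement can be approximated with the existing `Fiber` class. This is a minimal sketch, assuming array-like input with integer or string keys; it is a stand-in for the proposed `async {}` block, not the RFC's semantics. Each call runs its own local drain loop and does not return until all of its fibers have terminated, so it behaves the same whether or not it is itself running inside another fiber.

```php
<?php
declare(strict_types=1);

// Sketch only: emulates a nestable "async block" with plain Fibers.
// Assumes the iterable's keys are ints or strings.
function par_map(iterable $it, callable $c): array
{
    $fibers = [];
    foreach ($it as $key => $val) {
        $fiber = new Fiber($c);
        $fiber->start($val); // runs until the first suspend or until it returns
        $fibers[$key] = $fiber;
    }

    // Local drain loop: nothing started here outlives this function call.
    $result = [];
    while ($fibers !== []) {
        foreach ($fibers as $key => $fiber) {
            if ($fiber->isTerminated()) {
                $result[$key] = $fiber->getReturn();
                unset($fibers[$key]);
            } else {
                $fiber->resume();
            }
        }
    }

    ksort($result); // restore input order regardless of completion order
    return $result;
}

// Flat use: every callback yields once before producing its value.
$doubled = par_map([1, 2, 3], function (int $n): int {
    Fiber::suspend();
    return $n * 2;
});

// Nested use: the outer callback itself calls par_map().
$nested = par_map(
    [[1, 2], [3]],
    fn (array $xs): array => par_map($xs, fn (int $x): int => $x + 1)
);
```

The inner call blocks its enclosing fiber until the inner work is finished. That is the hierarchical behaviour of current Fibers, and it is also why this sketch cannot overlap inner and outer work the way a real scheduler could.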
Re: [PHP-DEV] PHP True Async RFC
On Sun, Mar 9, 2025, at 11:56 AM, Edmond Dantes wrote:

> *Let me summarize the current state for today:*
>
> 1. I am abandoning `startScheduler` and the idea of preserving
> backward compatibility with `await_all` or anything else in that
> category. The scheduler will be initialized implicitly, and this does
> not concern user-land. Consequently, the `spawn function()` code will
> work everywhere and always.
>
> 2. I will not base the implementation on `Fiber` (perhaps only on the
> low-level part). Instead of `Fiber`, there will be a separate class.
> There will be no changes to `Fiber` at all. This decision follows the
> principle of Win32 COM/DCOM: old interfaces should never be changed. If
> an old interface needs modification, it should be given a new name.
> This should have been done from the start.
>
> 3. I am abandoning low-level objects in PHP-land (FiberHandle,
> SocketHandle etc). Over time, no one has voted for them, which means
> they are unnecessary. There might be a low-level interface for
> compatibility with Revolt.
>
> 4. It might be worth restricting microtasks in PHP-land and keeping
> them only for C code. This would simplify the interface, but we need to
> ensure that it doesn’t cause any issues.
>
> The remaining question on the agenda: deciding which model to choose —
> *parent-child* or the *Go-style model*.

As noted, I am in broad agreement with the previously linked article on "playpens" (even if I hate that name), that the "go style model" is too analogous to goto statements.

Basically, this is asking "so do we use gotos or for loops?" For which the answer is, I hope obviously, for loops.

Offering both, frankly, undermines the whole point of having structured, predictable concurrency. The entire goal of that is to be able to know if there's some stray fiber running off in the background somewhere still doing who knows what, manipulating shared data, keeping references to objects, and other nefarious things.
With a nursery, you don't have that problem... *but only if you remove goto*. A language with both a for loop and an arbitrary goto statement gets basically no systemic benefit from having the for loop, because neither developers nor compilers get any guarantees of what will or won't happen.

Especially when, as demonstrated, the "this can run in the background and I don't care about the result" use case can be solved more elegantly with nested blocks and channels, and in a way that, in practice, would probably get subsumed into DI Containers eventually so most devs don't have to worry about it.

Of interesting note along similar lines would be Rust, and... PHP.

Rust's whole thing is memory safety. The language simply will not let you write memory-unsafe code, even if it means the code is a bit more verbose as a result. In exchange for the borrow checker, you get enough memory guarantees to write extremely safe parallel code. However, the designers acknowledge that occasionally you do need to turn off the checker and do something manually... in very edge-y cases in very small blocks set off with the keyword "unsafe". Viz, "I know what I'm doing is stupid, but trust me." The discouragement of doing so is built into the language, and tooling, and culture.

PHP... has a goto operator. It was added late, kind of as a joke, but it's there. However, it is not a full goto. It can only jump within the current function, and only "up" control structures. It's basically a named break. While it only rarely has value, it's not all that harmful unless you do something really dumb with it. And then it's only harmful within the scope of the function that uses it. And, very very rarely, there's some micro-optimization to be had. (cf. this classic: https://github.com/igorw/retry/issues/3).

But PHP has survived quite well for 30 years without an arbitrary goto statement.
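To make the "named break" point concrete, here is a small illustrative sketch; the function name `firstNegative` is made up for this example. The jump leaves two nested loops but stays inside the function. PHP rejects a `goto` that targets a label inside a loop or `switch`, or one that tries to leave the function.

```php
<?php
declare(strict_types=1);

// Illustration only: goto used as a "named break" out of nested loops.
function firstNegative(array $rows): ?int
{
    foreach ($rows as $row) {
        foreach ($row as $value) {
            if ($value < 0) {
                $found = $value;
                goto done; // jumps "up" and out of both loops, same function
            }
        }
    }
    return null; // no negative value anywhere

    done:
    return $found;
}

$hit  = firstNegative([[1, 2], [3, -4, 5], [-6]]);
$miss = firstNegative([[1], [2, 3]]);
```

The same result is available with `break 2` or an early `return`, which is part of why the restricted form does little damage: its effects never escape the function that uses it.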
So if we start from a playpen-like, structured concurrency assumption, which (as demonstrated) gives us much more robust code that is easier to follow and still covers nearly all use cases, there are two questions to answer:

1. Is there still a need for an "unsafe {}" block or in-function goto equivalent?
2. If so, what would that look like?

I am not convinced of 1 yet, honestly. But if it really is needed, we should be targeting the least-uncontrolled option possible to allow for those edge cases. A quick-n-easy "I'mma violate the structured concurrency guarantees, k?" undermines the entire purpose of structured concurrency.

> During our discussion, everything seems to be converging on the idea
> that the changes introduced by the RFC into `Fiber` would be better
> moved to a separate class. This would reduce confusion between the old
> and new solutions. That way, developers wouldn't wonder why `Fiber` and
> coroutines behave differently—they are simply different classes.
> The new *Coroutine* class could have a different interface w
Re: [PHP-DEV] RFC: short and inner classes
On Thu, Mar 6, 2025, at 09:04, Tim Düsterhus wrote:

> Hi
>
> Am 2025-03-06 07:23, schrieb Rob Landers:
> > So, technically, they aren’t required to be in the same RFC; but also,
> > they complement each other very well.
>
> They really should be separate RFCs then. Your RFC text acknowledges
> that in the very first sentence: “two significant enhancements to the
> language”. Each individual proposal likely has sufficient bike-shedding
> potential on its own and discussion will likely get messy, because one
> needs to closely follow which of the two proposals an argument relates
> to.

I put a lot of thought into this issue off and on, all day. I've decided to remove short syntax from the RFC and focus on inner classes. If this passes, then I will propose short syntax as a separate RFC. Introducing them concurrently makes little sense in light of the feedback I have gotten so far, and it is turning out that there is much more to discuss than I initially expected.

Thus, I will skip replying about short classes.

> As for the “Inner classes” proposal:
>
> - “abstract is not allowed as an inner class cannot be parent classes.”
> - Why?

This is mostly a technical reason, as I was unable to determine a grammar rule that didn't result in ambiguity. Another reason is to ensure encapsulation and prevent usages outside their intended scope. We can always add it later.

> - “type hint” - PHP does not have type hints, types are enforced. You
> mean “Type declaration”.

Thank you for pointing this out! I learned something new today! I've updated the RFC.

> - “this allows you to redefine an inner class in a subclass, allowing
> rich hierarchies” - The RFC does not specify if and how this interacts
> with the LSP checks.

It doesn't affect LSP. I've updated the RFC accordingly.

On Thu, Mar 6, 2025, at 20:08, Niels Dossche wrote:

> Hi Rob
>
> Without looking too deep (yet) into the details, I'm generally in favor of
> the idea.
> What I'm less in favor of is the implementation choice to expose the inner
> class as a property/const and using a fetch mode to grab it.
> That feels quite weird to me honestly. How did you arrive at this choice?
>
> Kind regards
> Niels

It's a slightly interesting story about how I arrived at this particular implementation. If you noticed the branch name, this is the second implementation.

The first implementation used a dedicated list on the class-entry for inner classes. Since I wanted to prevent static property/consts from being declared with the same name, I had just set it to a string of the full class name as a placeholder. That implementation also required some pretty dramatic OPcache changes, which I didn't like.

At one point, I went to add the first test that did `new Outer::Inner()` and the test passed... You can imagine my surprise to see a test pass that I had expected to fail, and it was then that I went into the details of what was going on.

Any `new ClassName` essentially results in the following AST:

    ZEND_AST_NEW -- ZEND_AST_ZVAL -- "ClassName" -- (... args)

The original grammar, at the time, was to reuse the existing static property access AST until I could properly understand OPcache/JIT. My change had resulted in (approximately) this AST:

    ZEND_AST_NEW -- ZEND_AST_ZVAL -- ZEND_AST_STATIC_PROP -- "Outer::Inner" -- (... args)

Which effectively resulted in emitting opcodes that found the prop + string value I happened to put there as a placeholder until I figured out a better solution, handling autoloading properly and everything. This pretty much negated all efforts up to that point, and I was stunned.

So, I branched off from an earlier point and eventually wrote the version you see today. It's 1000x simpler and faster than the original implementation (literally), since it uses all pre-existing (optimized) infrastructure instead of creating entirely new infrastructure.
It doesn't have to check another hashmap (which is slow) for static props vs. constants vs. inner classes.

In essence, while the diff can be improved further, it is quite simple; the core of it is less than 500 lines of code.

I'd recommend leaving any comments about the PR on the PR itself (or via private email if you'd prefer that). I'm by no means an expert on this code base, and if it is not what you'd expect, being an expert yourself, I'd love to hear any suggestions for improvements or other approaches.

On Thu, Mar 6, 2025, at 20:33, Larry Garfield wrote:

> My biggest concern with this is that it makes methods and short-classes
> mutually incompatible. So if you have a class that uses short-syntax, and as
> it evolves you realize it needs one method, sucks to be you, now you have to
> rewrite basically the whole class to a long-form constructor. That sucks
> even more than rewriting a short-lambda arrow function to a long-form
> closure, except without the justification of capture semantics.

I literally fell out of my chair laugh