Hi Hammed, thank you for taking the time to read through this and share your 
thoughts.


> On Sep 19, 2024, at 1:41 PM, Hammed Ajao <hamieg...@gmail.com> wrote:
> 
> 
> 
> 
> On Tue, Sep 17, 2024 at 8:30 PM Dennis Snell <dennis.sn...@automattic.com
>> wrote:
> 
>> 
>> 
>> 
>>> On Sep 17, 2024, at 2:03 PM, Rob Landers <rob@bottled.codes> wrote:
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Sep 17, 2024, at 14:57, Adam Zielinski wrote:
>>> 
>>>>> To summarize, I think PHP would benefit from:
>>>> 
>>>>>
>>>> 
>>>>> 1. Adding WASM for simple low-level extensibility that could run on
>>>> 
>>>>> shared hosts for things that are just not possible in PHP as described a
>>>> 
>>>>> few paragraphs prior, and where we could enhance functionality over time,
>>>> 
>>>>>
>>>> 
>>>>> 2. Constantly improving PHP the language, which is what you are solely
>>>> 
>>>>> advocating for over extensibility,
>>>> 
>>>> Hi Mike,
>>>> 
>>>> 
>>>> 
>>>> I’m Adam, I'm building WordPress Playground [1] – it's WordPress running 
>>>> in the browser via a WebAssembly PHP build [2]. I'm excited to see this 
>>>> discussion and wanted to offer my perspective.
>>>> 
>>>> 
>>>> 
>>>> WebAssembly support in PHP core would be a huge security and productivity 
>>>> improvement for the PHP and WordPress communities.
>>>> 
>>>> 
>>>> 
>>>>> To summarize, I think PHP would benefit from:
>>>> 
>>>>>
>>>> 
>>>>> 1. Adding WASM for simple low-level extensibility that could run on
>>>> 
>>>>> shared hosts for things that are just not possible in PHP as described a
>>>> 
>>>>> few paragraphs prior, and where we could enhance functionality over time,
>>>> 
>>>> 
>>>> 
>>>> Exactly this! With WASM, WordPress would get access to fast, safe, and 
>>>> battle-tested libraries.
>>>> 
>>>> 
>>>> 
>>>> Today, we're recreating a lot of existing libraries just to be able to use 
>>>> them in PHP, e.g. parsers for HTML [3], XML [4], Zip [5], MySQL [6], or an 
>>>> HTTP client [7]. There are just no viable alternatives. Viable, as in 
>>>> working on all webhosts, having stellar compliance with each format's 
>>>> specification, supporting stream parsing, and having low footprint. For 
>>>> example, the curl PHP extension is brilliant, but it's unavailable on 
>>>> many webhosts.
>>>> 
>>>> 
>>>> 
>>>> With WebAssembly support, we could stop rewriting and start leaning on the 
>>>> popular C, Rust, etc. libraries instead. Who knows, maybe we could even 
>>>> polyfill the missing PHP extensions?
>>>> 
>>>> 
>>>> 
>>>>> 2. Constantly improving PHP the language, which is what you are solely
>>>> 
>>>>> advocating for over extensibility,
>>>> 
>>>> 
>>>> 
>>>> Just to add to that – I think WASM support is important for PHP to stay 
>>>> relevant. There's an exponential advantage to building a library once and 
>>>> reusing it across the language boundaries. A lot of companies are invested 
>>>> in PHP and that won't change in a day. However, lacking access to the WASM 
>>>> ecosystem, I can easily imagine the ecosystem slowly gravitating towards 
>>>> JavaScript, Python, Go, Rust, and other WASM-enabled languages.
>>>> 
>>>> 
>>>> 
>>>> Security-wise, WebAssembly is sandboxed and would enable safe processing 
>>>> of untrusted files. Vulnerabilities like Zip slip [8] wouldn't affect a 
>>>> sandboxed filesystem. Perhaps we could even create a secure enclave for 
>>>> running composer packages and WordPress plugins without having to fully 
>>>> trust them.
>>>> 
>>>> 
>>>> 
>>>> Another use-case is code reuse between JavaScript and PHP. I'm sceptical 
>>>> this could work with reasonable speed and resource consumption, but let's 
>>>> assume for a moment there is an ultra low overhead JavaScript runtime in 
>>>> WebAssembly. WordPress could have a consistent templating language. PHP 
>>>> backend would render the website markup using the same templates and 
>>>> libraries as the JavaScript frontend. Half the code would achieve the same 
>>>> task.
>>>> 
>>>> 
>>>> 
>>>> Also, here's a few interesting "WASM in PHP" projects I found – maybe they 
>>>> would be helpful:
>>>> 
>>>> - WebAssembly runtime built in PHP (!) 
>>>> https://github.com/jasperweyne/unwasm
>>>> 
>>>> 
>>>> - WebAssembly runtime as a PHP language extension: 
>>>> https://github.com/veewee/ext-wasm
>>>> 
>>>> 
>>>> - WebAssembly runtime as a PHP language extension: 
>>>> https://github.com/extism/php-sdk
>>>> 
>>>> 
>>>> 
>>>> 
>>>> [1] https://github.com/WordPress/wordpress-playground/
>>>> 
>>>> 
>>>> [2] 
>>>> https://github.com/WordPress/wordpress-playground/tree/trunk/packages/php-wasm/compile
>>>> 
>>>> 
>>>> [3] https://developer.wordpress.org/reference/classes/wp_html_processor/
>>>> 
>>>> 
>>>> [4] https://github.com/WordPress/wordpress-develop/pull/6713
>>>> 
>>>> 
>>>> [5] 
>>>> https://github.com/WordPress/blueprints-library/blob/87afea1f9a244062a14aeff3949aae054bf74b70/src/WordPress/Zip/ZipStreamReader.php
>>>> 
>>>> 
>>>> [6] https://github.com/WordPress/sqlite-database-integration/pull/157
>>>> 
>>>> 
>>>> [7] 
>>>> https://github.com/WordPress/blueprints-library/blob/trunk/src/WordPress/AsyncHttp/Client.php
>>>> 
>>>> 
>>>> [8] https://security.snyk.io/research/zip-slip-vulnerability
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -Adam
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> Hey Adam,
>>> 
>>> 
>>> 
>>> I actually went down something like this road for a bit when working at 
>>> Automattic. My repo even probably still exists in the graveyard repository… 
>>> but I had plugins running in C# and Java over a couple of weeks. This was 
>>> long before wasm was a thing, and while cool, nobody really could think of 
>>> a use for it.
>>> 
>>> 
>>> 
>>> It seems like you have a use for it though, and I’m reasonably certain you 
>>> could get it working over ffi in a few weeks; yet you mention hosts not 
>>> even having the curl extension installed, so I doubt that even if wasm came 
>>> to be, it would be available on those hosts.
>>> 
>>> 
>> 
>> There are two major areas I have found that would benefit from having a WASM 
>> runtime in PHP:
>> 
>> 
>> Obviously, being able to run the same algorithms on the frontend and backend 
>> is a huge win for consistency in applications. 
>> 
>> 
>> 
> 
> I'm not convinced. That's what they said about nodejs(same algos and same 
> language on FE and BE). Except it's not really that consistent because there 
> are several discrepancies between the browser and node runtime. I'll believe 
> it when I see it.
> 
> 
> 

One point here is worth calling out, and you probably already know it: 
JavaScript runtimes provide a standard library, while a WASM runtime is mostly 
just a virtual machine. As far as I’m aware, WASM itself also provides no 
filesystem or network access, and those are the major areas where in-browser 
JavaScript and NodeJS backends differ (because the browser and server 
environments are fundamentally limited by different needs).


As things stand, projects are compiled into WebAssembly and literally run 
identically in the different runtimes because it’s the bytecode that’s 
specified, not specific functions or libraries. Whereas with JavaScript we’re 
shipping source code and interacting with very different systems, WASM bundles 
are a few steps removed from that, and have no DOM or system access to interact 
with.
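
To make that concrete, here is a minimal sketch in Python (chosen only for 
illustration; nothing here is PHP API). It walks the sections of a tiny 
hand-assembled WASM binary that exports a single `add` function. The byte 
layout follows the published binary format; the helper names `read_uleb128` 
and `list_sections` are my own and hypothetical. The point is that a module 
with no import section has no way to reach anything outside itself.

```python
# Hand-assembled WebAssembly module exporting add(i32, i32) -> i32.
WASM_ADD = bytes([
    0x00, 0x61, 0x73, 0x6D,                                  # magic "\0asm"
    0x01, 0x00, 0x00, 0x00,                                  # binary version 1
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7F, 0x7F, 0x01, 0x7F,    # type section
    0x03, 0x02, 0x01, 0x00,                                  # function section
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,    # export "add"
    0x0A, 0x09, 0x01, 0x07, 0x00,                            # code section header
    0x20, 0x00, 0x20, 0x01, 0x6A, 0x0B,                      # local.get 0/1, i32.add, end
])

SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table",
    5: "memory", 6: "global", 7: "export", 8: "start", 9: "element",
    10: "code", 11: "data", 12: "data count",
}


def read_uleb128(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def list_sections(module):
    """Return the names of the top-level sections in a WASM binary."""
    if module[:4] != b"\x00asm":
        raise ValueError("not a WebAssembly binary")
    pos = 8  # skip magic and version
    names = []
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_uleb128(module, pos + 1)
        names.append(SECTION_NAMES.get(section_id, "unknown"))
        pos += size  # skip the section body
    return names


sections = list_sections(WASM_ADD)
print(sections)
print("imports anything from the host:", "import" in sections)
```

Any filesystem or network capability would have to show up as an entry in an 
import section that the embedding host chooses to satisfy.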


I’m fairly confident we can say that it’s non-controversial that folks 
routinely run identical algorithms across different WASM runtimes in different 
environments. As you mentioned elsewhere, it’s very much akin to how Java and 
Clojure and Scala all run on the JVM just fine together despite being different 
languages, except in this case the runtime is an isolated sandbox by default 
with no external system access. WASM is a lovely little VM, successful in ways 
many before it haven’t been.


>  
>> Particularly with text-related algorithms it’s really easy for 
>> inconsistencies to develop due to the features available in each language’s 
>> standard library, as well as due to differences in how each language handles 
>> and processes strings.
>> 
>> 
>> 
> 
> I can see the appeal of that though.
> 
> 
>> 
>> 
>> The other major area is similar, and we’ve seen this with the HTML and XML 
>> parsing work recently undertaken in WordPress.
>> 
>> 
>> 
> 
> 
> 
> Yeah you could talk about html parsing before 8.4 but with 8.4 we get lexbor 
> (thanks to niels) and that's as good as it gets. Php already has beautiful 
> support for XML though so I'm not sure why you would implement a parser 
> yourself.
> 

It’s wonderful that PHP is finally getting a spec-compliant HTML DOM parser for 
the first time in its history, but \Dom\HTMLDocument is not the right interface 
for every server need, and remains ill-suited for the kind of work typical in a 
WordPress site, which needs to run on low memory budgets, perform as fast as 
possible, and exceed the safety of what a generic DOM parser produces (there 
are cases that \Dom\HTMLDocument will still introduce vulnerabilities into an 
HTML document because it’s able to create DOM trees that cannot be represented 
by HTML upon serialization, and as it implements the HTML spec, it cannot 
prevent creating those trees). There are still a number of steps every 
developer needs to take to properly set up the parser and get the right results 
back, and the parser has to load the entire DOM tree into memory before any 
reads or manipulations can be performed on it.


WordPress’ HTML API is a near-zero memory overhead streaming HTML parser 
designed around safe-by-default reading and writing of HTML which requires no 
configuration or manual steps to get “the right thing.” It’s also significantly 
slower in user-space PHP than it needs to be. I hope one day that PHP has its 
own copy of this streaming parser design, which is performant and available in 
every copy of PHP (which is another issue with code only available in 
extensions), but even if that never happens, running C or Rust code compiled to 
WebAssembly would provide almost the same value as having that design 
implemented in the language.
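
As a rough illustration of the streaming style (not of the HTML API itself), 
here is a sketch using Python’s standard `html.parser`, which is event-based 
and not a spec-compliant HTML5 parser. The class `LinkCollector` and the 
function `collect_hrefs` are hypothetical names for this example. Input 
arrives in arbitrary chunks, no tree is ever built, and the only state kept is 
the handful of values the caller asked for.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values as tags stream past; builds no DOM tree."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append(value)


def collect_hrefs(chunks):
    """Feed HTML piece by piece; a tag may be split across two chunks."""
    parser = LinkCollector()
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()
    return parser.hrefs


chunks = ['<p>See <a hre', 'f="/one">one</a> and ', '<a href="/two">two</a></p>']
print(collect_hrefs(chunks))
```

A DOM-based approach would need the whole document in memory before the first 
link could be read; here memory use stays proportional to what is collected.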


> 
> 
> 
>> There are plenty of cases where efficient and spec-compliant operations are 
>> valuable, but these kinds of things tend to cost significantly more in 
>> user-space PHP.
>> 
>> 
>> 
> 
>> Being able to jump into WASM, even with the overhead of exchanging that data 
>> and running another VM, would very likely result in a noticeable net 
>> improvement in runtime performance.
>> 
>> 
>> 
> 
> 
> 
> What exactly do you mean by jump into wasm? Like hand write it? Or you mean 
> jump into a language that can be compiled to wasm? How about debugging at 
> runtime? And if you mean better performance than PHP, while that is likely, 
> it isn't guaranteed. PHP is pretty fast and will be faster for some routines 
> that are optimized by the engine. Wasm will never be as fast as extensions 
> though because with extensions, all you're doing is extending the engine. 
> Same as any internal extension. With wasm you're interoperating with an 
> entirely separate VM.
> 

By jumping into WASM I’m talking about the second thing you mention: calling 
functions written in languages compiled to WebAssembly. Even with the overhead 
of marshaling data, the things that WebAssembly is good at are the things that 
PHP is slow at: specifically things like raw numeric computation and string 
manipulation and parsing. I write a lot of parsing code and frequently am 
surprised at the overhead cost of string processing and array operations in 
PHP. There are a number of straightforward operations available in C that just 
can’t be done in PHP. I don’t see this as a failing of PHP, just an aspect of 
how it is.
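
For anyone unfamiliar with what “marshaling” means here, a small sketch in 
Python, under stated assumptions: core WASM functions only take and return 
numbers, so a string is copied into the module’s linear memory and passed as a 
(pointer, length) pair. `LinearMemory`, `guest_count_byte`, and 
`count_newlines` are hypothetical stand-ins; the “guest” is plain Python 
playing the role of compiled code, and the bump allocator is deliberately 
naive.

```python
PAGE_SIZE = 65536  # WASM linear memory is sized in 64 KiB pages


class LinearMemory:
    """Stand-in for a module's linear memory: one flat byte array."""

    def __init__(self, pages=1):
        self.data = bytearray(pages * PAGE_SIZE)
        self.next_free = 0  # naive bump allocator, never frees

    def write_bytes(self, payload):
        """Copy bytes from the host into memory; return (ptr, length)."""
        ptr = self.next_free
        end = ptr + len(payload)
        if end > len(self.data):
            raise MemoryError("linear memory exhausted")
        self.data[ptr:end] = payload
        self.next_free = end
        return ptr, len(payload)


def guest_count_byte(memory, ptr, length, needle):
    """Plays the guest: it sees only integers plus its own memory."""
    return memory.data[ptr:ptr + length].count(needle)


def count_newlines(memory, text):
    """Host side: encode, copy in, call with integers, get an integer back."""
    ptr, length = memory.write_bytes(text.encode("utf-8"))
    return guest_count_byte(memory, ptr, length, 0x0A)


memory = LinearMemory()
print(count_newlines(memory, "first\nsecond\nthird"))
```

The copy in `write_bytes` is the marshaling cost; everything after it is the 
kind of tight byte-level loop that compiled code handles well.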


For runtime debugging I don’t have any particular thoughts. I’m not aware of 
anyone who has ever tried to runtime debug CURL calls or things like 
`mb_convert_encoding()`. Functions invoked in the WASM runtime would more or 
less be library functions, like those in `ffmpeg`. Debugging would most likely 
happen in the library’s own project, and the compiled result would be dropped 
into the PHP application with no expectation of debugging it there.


Effectively these are user-space PHP extensions, and are very convenient 
because they can be updated without recompiling PHP or begging web hosts to 
update their PHP version, or to do that every other Tuesday, or whenever 
another security exploit is fixed in some image processor. On that note, the 
ability to sandbox image processing code (and any other user-provided content) 
is a huge perk. Many of the exploits of past PHP extensions could be contained 
inside the VM, which has limited ability to reach out into the system. Fixing 
vulnerabilities and bugs becomes something any auto-updater can accomplish, 
requiring no effort or interaction on the part of the host.


> 
> 
>> Additionally, it’s a perk being able to write such algorithms in languages 
>> that aid that development through more powerful type systems.
>> 
> 
> 
> 
> We can agree on that. But I use C++ for my extensions so there's also that.
> 
> 
>> There’s additional value in a number of other separate tasks. Converting 
>> images or generating thumbnails is a good example where raw performance is 
>> less of a concern than being able to ensure that the image library is 
>> available and not exposing the host system to risk. 
>> 
> 
> 
> 
> Imo this is where FFI should shine but I'll admit that the current 
> implementation is lacking in both security and functionality.
> 
> 
> 
> 
>> I imagine plenty of “PHP lite-extensions” appearing in this space because it 
>> would give people the opportunity to experiment with features that are 
>> impractical in user-space PHP before fully committing the language itself to 
>> that interface or library. It would extend the reach of PHP’s usability 
>> because it would make it possible for folks, who happen to be running on 
>> cheap shared hosts, to run more complicated processing tasks than are 
>> practical today. While big software shops and SaaS vendors do and can run 
>> their own custom PHP extensions, there’s no great way to share those 
>> generally with people without the same full control over their stack.
>> 
> 
> 
> 
> Shared hosting for php gets you the worst possible version of php. 
> 

Couldn’t have said it better myself!


> Can't recompile to enable any bundled extension, can't install any new 
> extensions, so how exactly would you approach this? Wasm bundled with the 
> engine by default? Or some kind of opt in mechanism that shared hosters won't 
> even be able to use?
> 

As with many of the things I’ve been writing on this list lately, to me, an 
embedded WASM runtime makes the most sense as a central language feature that 
is available everywhere PHP is deployed. There are a few core basic subsystems 
that either are foundational to the environment PHP operates in (for example, 
web-related technologies like HTTP and HTML and URLs) or which bring so much 
value to the language that it opens up brand new paradigms or potentially 
removes major maintenance burdens.


If we could ship `imagemagick` as a WASM extension there would be no need for 
the native `imagick` extension. The security environment out of the box is so 
much better that the extra performance a native extension could offer isn’t 
worth giving it up. Someone may not agree with this, and that’s fine because they 
can always install a native extension or utilize the FFI on infrastructure they 
control.


I think at times WordPress sees a very different picture of the world than many 
great PHP projects see. Our reality is that we’re writing code that runs on 
hardware we don’t control or even know about. We cannot in any way install or 
force certain extensions to be present. The worst possible version of PHP is 
literally the constraint at which we are allowed to code. Anything beyond that 
and we can’t ship it because a large fraction of the internet will start 
crashing. It’s frustrating, but also an honor to be able to ensure that people 
who can’t afford high end servers can still build their own place on the world 
wide web.


Over the past several years, though, WordPress has also been a positive 
influence on persuading hosts to update their PHP versions, because PHP has 
improved enough that the argument is easy: upgrade to PHP 7 and your data 
center costs will drop X%. It’s not too hard to imagine winning similarly on 
the security argument.


WASM code on memory-constrained, oversubscribed, CPU-poor hosts is still 
considerably better for certain kinds of computation than user-space PHP code 
on memory-constrained, oversubscribed, CPU-poor hosts.


> 
> 
> >> 
>> 
>>> 
>>> 
>>> 
>>> However, plugins basically work via hooks/filters. So as long as you 
>>> register the right listeners and handle serialization properly, you can 
>>> simply run a separate process for the plugin, or call a socket for “remote” 
>>> plugins.
>>> 
>>> 
>>> 
>>> I don’t see anything stopping anyone from implementing that today.
>>> 
>>> 
>>> — Rob
>>> 
>> 
>> I’m excited to see this conversation. I’ve wanted to propose it a number of 
>> times myself.
>> 
>> 
>> Warmly,
>> Dennis Snell 
>> 
> 
> I actually love wasm, I'm currently in the process of compiling my mini php 
> runtime to wasm (basically a browser only version of 3v4l). I'm not against 
> this for any personal reasons, I'm simply not sure it's the right approach.
> 
> 
> 

That sounds awesome. The WordPress Playground ships a copy of PHP compiled to 
WASM, and it’s been an incredible journey realizing just what’s possible with 
this technology. It’s really boosted the developer experience working on 
WordPress itself and also that of those building their own projects using 
WordPress. Some are already bringing in libraries like ffmpeg to convert images 
and media on the frontend, though it’s sad that can’t also be done on the 
server yet.


> 
> Cheers,
> Hammed
> 
> 
> 

Hope you have a nice weekend. Cheers.
