[PHP-DEV] Modules, again.

Michael Morris Sun, 04 May 2025 00:35:57 -0700

It's been 9 months. Been researching, working on other projects, mulling over
points raised the last time I brought this up. And at the moment I don't think
PHP 8.5 is in its final weeks so this isn't a distraction for that.  The
previous discussion got seriously, seriously derailed and I got lost even though
I started it. I'm not going to ask anyone to dig into the archives, let's just
start afresh.


--------------------------------------------------------------------------------
THE PROBLEM
--------------------------------------------------------------------------------
PHP has no way of dealing with userland code trying to write to the same entry
on the symbol tables.  Namespaces are a workaround for this, and combined with
autoloaders they've yielded the package environments we currently have (usually
composer), but at the end of the day if two code blocks want different versions
of the same package loaded at the same time, the second declaration fails with
a fatal error. This is seen most visibly in WordPress plugins that use composer,
as they resort to monkey-patching the composer packages they consume (typically
by rewriting the package namespaces) to avoid such collisions.
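
To make the collision concrete, here is a minimal sketch in today's PHP. The
class name HttpClient and the two "plugins" are hypothetical; the point is only
that the class table has a single slot per name, so whichever copy is declared
first wins and the second can never be loaded alongside it.

```php
<?php
// Plugin A bundles version 1.0 of a hypothetical package class.
class HttpClient
{
    const VERSION = '1.0';
}

// Plugin B bundles version 2.0 under the same name. Declaring it
// unconditionally would be a fatal "name is already in use" error,
// so the best it can do today is check and silently go without.
$secondCopyLoaded = false;
if (!class_exists('HttpClient', false)) {
    class HttpClient
    {
        const VERSION = '2.0';
    }
    $secondCopyLoaded = true;
}

echo HttpClient::VERSION, PHP_EOL; // prints 1.0
```

Plugin B ends up running against version 1.0 whether it likes it or not.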

--------------------------------------------------------------------------------
PROPOSAL
--------------------------------------------------------------------------------
Modules - blocks of PHP code with independent symbol tables and autoload queues.

Instead of introducing new keywords, along with the backwards compatibility
problems that creates, three existing keywords will be used in a new way:
"use", "require", and "yield".

The first file that PHP loads will always be on the "main thread". To bring in
code as a module the use require structure is used. The simplest possible
version of this is as follows:

PHP Code
-----------------------------------------------------------------------

use require 'mymodule.php';
--------------------------------------------------------------------------------

The contents of 'mymodule.php' have two requirements. First, a namespace is
*required* of a module. Second, yield statements are used to mark the functions,
classes and constants the module exports, and at least one such yield must be
present. Hence the file may look something like this:

PHP Code
-----------------------------------------------------------------------

namespace MyModule;

yield function sum($a, $b) { return $a + $b; }

--------------------------------------------------------------------------------

Returning to our caller, it could make use of this function as follows.

PHP Code
-----------------------------------------------------------------------

use require 'mymodule.php';

echo MyModule::sum(3, 4);
--------------------------------------------------------------------------------

So far there is nothing here that couldn't have been done with a static class.
The important difference though is in behavior.

1. The module does not affect or see variables on the main thread - including
the superglobals. A module can only get to them if it receives them as an
argument in some sort of setter yield function.

2. The module does not affect or see constants or functions established on the
main thread or in other modules. It can see and autoload classes from the main
thread if the module author opts into this (discussed below).

The use case above is not typical - usually inclusions from modules are more
targeted.

PHP Code
-----------------------------------------------------------------------

use sum require 'mymodule.php';

echo sum(3, 4); // 7
--------------------------------------------------------------------------------

Here no MyModule class is created in the main thread; only the yielded function
is.

Aliases and multiple symbols can be declared in the use just as is the case now.

PHP Code
-----------------------------------------------------------------------

use sum, difference as subtract require 'mymodule.php';
--------------------------------------------------------------------------------

And if desired the namespace of the module can be aliased on the fly using the
wildcard operator:

PHP Code
-----------------------------------------------------------------------

use * as AliasModule require 'mymodule.php';

AliasModule::sum(3,4);
--------------------------------------------------------------------------------

Require continues to invoke autoloaders, even when used in the context of use
require.  The callback defined in spl_autoload_register will receive a second
argument from this context: boolean true if the main thread is loading a module,
or the requesting module's namespace string if a module is loading a module.
Hence:

PHP Code
-----------------------------------------------------------------------

require('./vendor/autoload.php');

use require 'mymodule';
// Callback will receive args ('mymodule', true)
new A();
// Callback will receive args ('A', false) - the false case should preserve BC.
--------------------------------------------------------------------------------

And in a module

PHP Code
-----------------------------------------------------------------------
namespace MyModule;

/*
 * Mod Author can elect to use the global autoloader
 * by passing the string "global" instead of a callback
 */
spl_autoload_register('global');

use require 'othermodule';
// Callback will receive args ('othermodule', 'MyModule')
new A();
// Callback will receive args ('A', 'MyModule')
--------------------------------------------------------------------------------

Autoload callbacks are currently required to directly load the file. For
modules this *will not work* because the loading of a module file involves the
setup of a new symbol table, autoload queue, and slightly different parsing
rules (again - namespace is required, not optional, and at least one top level
yield statement must be present). So when an autoloader is asked about a module
it must return the absolute path to the module, or false if it can't resolve it
(handing off to the next loader in the queue). PHP will then load the module as
if the filepath had been placed in the use require statement in the first place.
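
A minimal sketch of what a path-returning callback could look like under this
contract. The function name resolveModule, the $baseDir parameter and the
modules/ directory layout are all hypothetical, and the function is called
directly rather than registered, since today's spl_autoload_register only
passes the symbol name to its callbacks.

```php
<?php
// Hypothetical resolver following the proposed contract.
// $context: false  = ordinary class load (not a module request),
//           true   = module requested from the main thread,
//           string = namespace of the module making the request.
// Returns the absolute path to the module file, or false to hand off
// to the next loader in the queue. It never includes the file itself.
function resolveModule(string $name, bool|string $context = false, ?string $baseDir = null): string|false
{
    if ($context === false) {
        return false; // ordinary class autoload, handled elsewhere
    }

    // Only accept simple names so a request cannot escape the base directory.
    if (preg_match('/^[A-Za-z0-9_]+$/', $name) !== 1) {
        return false;
    }

    $baseDir ??= __DIR__ . '/modules';

    // realpath() returns false when the file does not exist.
    $path = realpath($baseDir . '/' . $name . '.php');

    return ($path !== false && is_file($path)) ? $path : false;
}
```

Under the proposal the engine, not the callback, would then load the returned
path with its own symbol table and autoload queue.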


--------------------------------------------------------------------------------
CLOSING REMARKS
--------------------------------------------------------------------------------

I want to take a moment to ruminate on what doors are opened by the above, but
all that follows is NOT part of my proposal.  The above gets what I feel is
needed - a way to cleanly run disparate packages in an application whose authors
refuse to update it to embrace composer (cough - WordPress - cough). But even
for projects that do embrace composer and its package management the above makes
the prospect of a large API change in a major version far less frightening. In
the current setup all the extensions have to be kept current and more or less
on the same page. The larger the extension authoring set becomes the less
feasible this is - projects get abandoned or stagnate, and if you have a site
that needs one of them then finding a replacement or upgrading it personally can
be a pain.

The string that gets passed to the autoloader could be anything btw - rules for
which versions the module might accept, like 'mymodule@7.x', can work if the
package manager is written to parse them out. But the rules themselves, not to
mention what the composer.json file or equivalent would need to look like, are
outside the scope of this proposal.
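
Purely as an illustration of the parsing side, a package manager's callback
could split such a string like this. The helper name parseModuleRequest and
the returned array shape are hypothetical, and how the constraint is then
interpreted is deliberately left open.

```php
<?php
// Hypothetical helper: split a request such as 'mymodule@7.x' into the
// module name and an optional version constraint (null when absent).
function parseModuleRequest(string $request): array
{
    // Limit of 2 so anything after the first '@' stays in the constraint.
    $parts = explode('@', $request, 2);

    return [
        'name'       => $parts[0],
        'constraint' => $parts[1] ?? null,
    ];
}
```

With this, 'mymodule@7.x' yields the name 'mymodule' and the constraint '7.x',
while a bare 'mymodule' yields a null constraint.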

It should be noted that Modules offer a much more black box behavior than the
current namespaces and autoloaders can provide.  The only part of a module the
outside world can see is what it yields. If a class isn't yielded the outside
world can't make an instance of it.  This shielding of internal APIs should be
useful because no matter how big a note you make in the comments about how a
class shouldn't be used by outside code, sooner or later someone will do it and
their code will break when you change the internal API. While it is their fault
for doing so, it can be a concern, especially if the use of such "internal"
APIs becomes commonplace (Drupal has several instances of this).

Finally, my previous writings on this have mused about possibly having module
files possess vastly different parsing rules. I bring this up as a possibility
but I won't dwell on it as it could easily become a thread-derailing
distraction. That said, it should be possible to use the autoload callback to
signal to PHP that a block of code should be loaded using the module parsing
rules if having them be different in any way is desired. That or a new
require_module statement could be used to pull in code under the module parsing
rules.

As to why this might be desirable - I'm no expert on the engine but I'm going
to guess that giving each module an independent symbol table will incur
overhead, possibly significant overhead. One way to claw back performance is to
fix bugs that have been unfixable for backwards compatibility reasons. There is
no existing module code, so in one fell swoop this code can step away from those
problems.  If this is done though it has to be done right as the window closes
once projects with modules start appearing.
