On Sun, 6 Mar 2022 at 23:43, Martin Di Paola <martinp.dipa...@gmail.com> wrote:
>
> Hi everyone. Some time ago I implemented a small plugin engine to load
> code dynamically.
>
> So far it has worked well, but a few days ago a user told me that he
> wasn't able to run a piece of code in parallel on macOS.
>
> He was using multiprocessing.Process to run the code and, on macOS, the
> default start method for such processes is "spawn". My understanding
> is that Python spawns an independent Python server (the child) which
> receives what to execute (the target function) from the parent process.
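
(Side note: you don't need a Mac to reproduce this. A minimal sketch, using only the standard multiprocessing API, of asking for the "spawn" start method explicitly. macOS has defaulted to "spawn" since Python 3.8, and Windows has only ever offered "spawn".)

```python
import multiprocessing

# Start methods this platform supports; "spawn" is available everywhere.
available = multiprocessing.get_all_start_methods()

# A context pinned to "spawn", independent of the platform default.
# Processes created via spawn_ctx.Process(...) start a fresh interpreter.
spawn_ctx = multiprocessing.get_context("spawn")

print(available, spawn_ctx.get_start_method())
```

Using a context rather than set_start_method() keeps the choice local, so it doesn't affect other code in the same program.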

> Because Python does not really serialize code but only enough
> information to reload it, the serialization of "objs[0].sayhi" just
> points to its module, "foo".
>
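
To make that concrete, here is a small sketch of what "pickled by reference" means in practice. The module name demo_plugin_foo and the hand-built module are made up for illustration, standing in for whatever a plugin engine creates; removing the module from sys.modules only simulates a fresh child that never loaded it.

```python
import pickle
import sys
import types

# Build a module by hand, roughly as a plugin loader might (hypothetical name).
plugin = types.ModuleType("demo_plugin_foo")
exec("def sayhi():\n    return 'hi'\n", plugin.__dict__)
sys.modules["demo_plugin_foo"] = plugin

# Pickling the function stores its module name and its own name, not its code.
payload = pickle.dumps(plugin.sayhi)
names_only = b"demo_plugin_foo" in payload and b"sayhi" in payload

# Simulate a freshly spawned child: the module was never loaded there.
del sys.modules["demo_plugin_foo"]
try:
    pickle.loads(payload)
    child_error = None
except ModuleNotFoundError as exc:
    # Unpickling tries to import the module by name and cannot find it.
    child_error = exc
```

The parent can pickle the function because the module is registered there. The failure only shows up on the loading side, which matches what was reported.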

Hmm. This is a route that has some tricky hazards on it. Generally, in
Python code, we can assume that a module is itself, no matter what; it
won't be a perfect clone of itself, it will actually be the same
module.

If you want to support multiprocessing, I would recommend
disconnecting yourself from the concept of loaded modules, and instead
identify the target by its module name.
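
As a rough sketch of what I mean by identifying the target by name: resolve() below is a hypothetical helper, and json.dumps merely stands in for a plugin function that lives in a module importable by name.

```python
import importlib


def resolve(module_name, func_name):
    """Look a callable up from two plain strings."""
    # import_module returns the already-loaded module if there is one.
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


# Only the two strings would need to cross the process boundary.
target = resolve("json", "dumps")
result = target([1, 2])
```

Strings always pickle cleanly, so nothing about the parent's loaded modules has to survive the trip to the child.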

> I came up with a hack: use a trampoline() function to load the plugins
> in the child before executing the target function.
>
> In pseudo code it is:
>
> modules = loader() # load the plugins (Python modules at the end)
> objs = init(modules) # initialize the plugins
>
> def trampoline(target_str):
>     loader() # load the plugins now that we are in the child process
>
>     # deserialize the target and call it
>     target = reduction.loads(target_str)
>     target()
>
> # Serialize the real target function, but have the child call
> # trampoline() instead. Because the child can import trampoline(),
> # starting the process will not fail
> target_str = reduction.dumps(objs[0].sayhi)
> ch = multiprocessing.Process(target=trampoline, args=(target_str,))
> ch.start()
>
> The hack works but is this the correct way to do it?
>

The way you've described it, it's a hack. Allow me to slightly redescribe it.

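import multiprocessing

def invoke(mod, func):
    # Runs in the child process, so the plugins get loaded there.
    # I'm assuming that the loader is smart enough to not load
    # a module that's already loaded. Alternatively, load just the
    # module you need, if that's a possibility.
    modules = loader()
    target = getattr(modules[mod], func)
    target()

if __name__ == "__main__":
    # The guard matters under "spawn": the child re-imports this file.
    modules = loader()
    objs = init(modules)
    ch = multiprocessing.Process(target=invoke, args=("some_module", "sayhi"))
    ch.start()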


Written like this, it achieves the same goal, but looks a lot less
hacky, and as such, I would say that yes, this absolutely IS a correct
way to do it. (I won't say "the" correct way, as there are other valid
ways, but there's certainly nothing wrong with this idea.)
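
For a fuller picture, here is a sketch of the same name-based approach with a stand-in loader. I'm assuming each plugin is a single .py file; load_plugin(), run_in_child() and the "plugin_" naming scheme are all invented for the example and are not part of your engine.

```python
import importlib.util
import multiprocessing
import sys
from pathlib import Path


def load_plugin(path):
    """Load a plugin file as a module, reusing it if already loaded."""
    name = "plugin_" + Path(path).stem  # hypothetical naming scheme
    if name in sys.modules:
        return sys.modules[name]
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


def invoke(path, func_name):
    """Child-side entry point: load the plugin here, then call by name."""
    module = load_plugin(path)
    return getattr(module, func_name)()


def run_in_child(path, func_name):
    """Parent side: only two strings cross the process boundary."""
    ctx = multiprocessing.get_context("spawn")
    child = ctx.Process(target=invoke, args=(str(path), func_name))
    child.start()
    child.join()
    return child.exitcode
```

Call run_in_child() from under an if __name__ == "__main__": guard, since with "spawn" the child re-imports the main module before it runs invoke().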

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
