So is this just a matter of documentation, examples, and getting the
knowledge out there?

Incidentally, I'm not a fan of modules that "do" things when you import
them, nor of the "try it as a module, then as a class" sort of fallback,
as opposed to choosing the type you expect, sticking with it, and giving
very clear error messages. Also, "ImportError" is going to be
misinterpreted 99% of the time. Having something that calls a named
function seems like it'll be a better experience all around.
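
To make the "named function" idea concrete, here is a rough sketch, not
anything that exists in Beam today. The names run_init_hook and
WorkerInitError and the "package.module:function" spec format are all made
up for illustration. It accepts exactly one form, calls the function
explicitly, and wraps ImportError in a message that says what was being
attempted.

```python
import importlib


class WorkerInitError(RuntimeError):
    """Hypothetical error type for an init hook that cannot be run."""


def run_init_hook(spec):
    """Resolve 'package.module:function' and call it with no arguments.

    The spec format is an assumption made for this sketch.
    """
    module_name, sep, func_name = spec.partition(":")
    if not sep or not module_name or not func_name:
        raise WorkerInitError(
            f"Init hook {spec!r} must look like 'package.module:function'.")
    try:
        module = importlib.import_module(module_name)
    except ImportError as e:
        # Say what was being attempted rather than surfacing a bare ImportError.
        raise WorkerInitError(
            f"Could not import module {module_name!r} "
            f"for init hook {spec!r}: {e}") from e
    func = getattr(module, func_name, None)
    if not callable(func):
        raise WorkerInitError(
            f"Module {module_name!r} has no callable named {func_name!r} "
            f"(init hook {spec!r}).")
    # No fallback to "maybe it's a class" or "maybe importing was enough".
    return func()
```

With something like this, a typo in the spec fails with a message that
names the hook, instead of a stack trace ending in ImportError.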

Kenn

On Fri, Dec 13, 2024 at 4:38 PM Robert Bradshaw via dev <dev@beam.apache.org>
wrote:

> We already have
>
> https://github.com/apache/beam/blob/release-2.40.0/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L141
> that allows arbitrary code to be imported and executed on worker
> startup. (Perhaps we could generalize to let it also reference a
> function to be called rather than just a module.)
>
> On Fri, Dec 13, 2024 at 12:52 PM Danny McCormick via dev
> <dev@beam.apache.org> wrote:
> >
> > Thanks - I actually was thinking about this today and was annoyed that
> > we don't have this ability. I'm +1 to the proposed approach.
> >
> > I dropped a comment, but also upleveling in case there is broader
> > interest; it would be nice to have a similar capability for expansion
> > service containers as well.
> >
> > On Fri, Dec 13, 2024 at 3:23 PM Valentyn Tymofieiev via dev
> > <dev@beam.apache.org> wrote:
> >>
> >> Hi everyone,
> >>
> >> Currently we don't have a straightforward and documented way to do
> >> simple initialization steps on every Beam Python SDK worker before data
> >> processing starts. It is a rough edge that I've encountered on several
> >> occasions myself and in conversations with Beam users.
> >>
> >> I put together some thoughts on how we could provide that capability in
> >> https://s.apache.org/python_sdk_worker_initialization . Looking forward
> >> to your ideas and other feedback on this topic.
> >>
> >> Thanks,
> >> Valentyn
>
