On Tue, Dec 17, 2024 at 9:31 AM Kenneth Knowles <k...@apache.org> wrote:
>
> So is it just a documentation / examples / getting the knowledge out there 
> problem?

Possibly.

> Incidentally I'm not a fan of modules that "do" things when you import them, 
> nor am I a fan of the "try it as a module then a class" sort of fallback 
> stuff vs just choosing the type you expect and sticking with it, giving very 
> clear error messages. Also "ImportError" is going to be misinterpreted 99% of 
> the time. Having something that calls a named function seems like it'll be a 
> better experience all around.

This was initially introduced to register things like filesystems, as
Python doesn't have the service provider interface stuff that Java
has, so we need to "run some code on startup" to register it. I agree
a named function would be better, just thinking it might be preferable
to avoid two distinct ways of doing almost the same thing.
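
To make the distinction concrete, here is a minimal sketch of what "call a
named function" could look like. It is not Beam's actual implementation: the
helper name run_worker_init, the "module:function" spec format, and the
my_worker_setup module are all hypothetical, chosen only for illustration.
The idea is to pick one expected form and fail with a specific message rather
than surfacing a bare ImportError.

```python
import importlib
import sys
import types


def run_worker_init(spec: str) -> None:
    """Import a module and call a named function from a 'module:function' spec.

    Hypothetical helper for illustration; not part of the Beam SDK.
    """
    module_name, sep, func_name = spec.partition(":")
    if not sep or not module_name or not func_name:
        raise ValueError(f"Expected 'module:function', got {spec!r}")
    try:
        module = importlib.import_module(module_name)
    except ImportError as e:
        # Wrap the error so the message says what the worker was trying to do.
        raise RuntimeError(
            f"Could not import worker init module {module_name!r}") from e
    func = getattr(module, func_name, None)
    if not callable(func):
        raise RuntimeError(
            f"{module_name!r} has no callable named {func_name!r}")
    func()


# Stand-in for a user module; importing it has no side effects on its own.
REGISTERED = []
demo = types.ModuleType("my_worker_setup")
demo.register_filesystems = lambda: REGISTERED.append("myfs")
sys.modules["my_worker_setup"] = demo

# Registration happens only when the named function is called explicitly.
run_worker_init("my_worker_setup:register_filesystems")
```

With this shape, importing the module does nothing by itself, and a typo in
either half of the spec produces an error that names the missing piece.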

>
> Kenn
>
> On Fri, Dec 13, 2024 at 4:38 PM Robert Bradshaw via dev <dev@beam.apache.org> 
> wrote:
>>
>> We already have
>> https://github.com/apache/beam/blob/release-2.40.0/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L141
>> that allows arbitrary code to be imported and executed on worker
>> startup. (Perhaps we could generalize to let it also reference a
>> function to be called rather than just a module.)
>>
>> On Fri, Dec 13, 2024 at 12:52 PM Danny McCormick via dev
>> <dev@beam.apache.org> wrote:
>> >
>> > Thanks - I actually was thinking about this today and was annoyed that we 
>> > don't have this ability. I'm +1 to the proposed approach.
>> >
>> > I dropped a comment, but also upleveling in case there is broader 
>> > interest; it would be nice to have a similar capability for expansion 
>> > service containers as well.
>> >
>> > On Fri, Dec 13, 2024 at 3:23 PM Valentyn Tymofieiev via dev 
>> > <dev@beam.apache.org> wrote:
>> >>
>> >> Hi everyone,
>> >>
>> >> Currently we don't have a straightforward and documented way to do simple 
>> >> initialization steps on every Beam Python SDK worker before data 
>> >> processing starts. It is a rough edge that I've encountered on several 
>> >> occasions myself and in conversations with Beam users.
>> >>
>> >> I put together some thoughts on how we could provide that capability in 
>> >> https://s.apache.org/python_sdk_worker_initialization . Looking forward 
>> >> to your ideas and other feedback on this topic.
>> >>
>> >> Thanks,
>> >> Valentyn
