We already have a mechanism at
https://github.com/apache/beam/blob/release-2.40.0/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L141
that allows arbitrary code to be imported and executed on worker
startup. (Perhaps we could generalize it to also accept a reference to
a function to be called, rather than just a module.)
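
A rough sketch of what that generalization might look like, using only the
standard library. The "module:function" spec format and the name
run_startup_hook are hypothetical, made up for illustration; they are not an
existing Beam option or API.

```python
import importlib


def run_startup_hook(spec):
    """Import a module and optionally call a function defined in it.

    `spec` is either "package.module" (import only) or
    "package.module:function" (import, then call with no arguments).
    This spec format is an assumption, not an existing Beam option.
    """
    module_name, _, func_name = spec.partition(":")
    # Importing runs the module's top-level code, as a plain import would.
    module = importlib.import_module(module_name)
    if func_name:
        # Look up the named callable on the module and invoke it.
        return getattr(module, func_name)()
    return None
```

With this, run_startup_hook("json") only imports the module, while
run_startup_hook("os:getcwd") imports os and then calls os.getcwd().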
On Fri, Dec 13, 2024 at 12:52 PM Danny McCormick via dev
<dev@beam.apache.org> wrote:
>
> Thanks - I actually was thinking about this today and was annoyed that we 
> don't have this ability. I'm +1 to the proposed approach.
>
> I dropped a comment, but I'm also raising it here in case there is broader
> interest: it would be nice to have a similar capability for expansion service
> containers as well.
>
> On Fri, Dec 13, 2024 at 3:23 PM Valentyn Tymofieiev via dev 
> <dev@beam.apache.org> wrote:
>>
>> Hi everyone,
>>
>> Currently we don't have a straightforward and documented way to do simple 
>> initialization steps on every Beam Python SDK worker before data processing  
>> starts. It is a rough edge that I've encountered on several occasions myself 
>> and in conversations with Beam users.
>>
>> I put together some thoughts on how we could provide that capability in 
>> https://s.apache.org/python_sdk_worker_initialization . Looking forward to 
>> your ideas and other feedback on this topic.
>>
>> Thanks,
>> Valentyn
