Thanks, Rui, for reminding me about the UDF lifecycle. It looks like there isn't one. I checked the code, and it looks like we create a new instance of the UDF for each message:

org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.runtime.SqlFunctions.isTrue(new com.paloaltonetworks.cortex.streamcompute.functions.MyUDFFunction().apply(current.getString(2))

Do you think we should create the UDF instance in the setup function and call it in processElement? Do you see any other issues besides this? Or is there a different way I can achieve my goal? I thought I could use the Table Provider approach, but it has no mechanism for updating its data.

Thanks

On Wed, Feb 10, 2021 at 12:41 PM Talat Uyarer <tuya...@paloaltonetworks.com> wrote:

> Does beam create UDF function for every bundle or in setup of pipeline ?
>
> I will keep internal state in memory. The Async thread will update that in
> memory state based on an interval such as every hour etc. If beam keeps UDF
> instance more than one bundle it is ok for me.
>
> On Wed, Feb 10, 2021, 12:37 PM Rui Wang <ruw...@google.com> wrote:
>
>> The problem that I can think of is maybe before the async call is
>> completed, the UDF life cycle has reached to the end.
>>
>> -Rui
>>
>> On Wed, Feb 10, 2021 at 12:34 PM Talat Uyarer <
>> tuya...@paloaltonetworks.com> wrote:
>>
>>> Hi,
>>>
>>> We plan to use UDF on our sql. We want to achieve some kind of
>>> filtering based on internal states. We want to update that internal state
>>> with a separate async thread in UDF. Before implementing that thing I want
>>> to get your options. Is there any limitation for UDF to have multi-thread
>>> implementation ? Our UDF is a scalar function. It will get 1 or 2 input
>>> and return boolean.
>>>
>>> I will appreciate your comments in advance.
>>>
>>> Thanks
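
For discussion, here is a minimal sketch of one possible workaround, assuming the UDF really is instantiated once per message: keep the state in a static, JVM-wide holder that a single daemon thread refreshes on a schedule, so that every short-lived UDF instance reads the same snapshot. The names AllowListState, AllowListUdf, and loadAllowList are hypothetical, the loader is a stub, and the sketch uses only the JDK. In Beam the UDF class would presumably also implement SerializableFunction<String, Boolean>.

```java
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical JVM-wide holder shared by every UDF instance in one worker. */
final class AllowListState {
  // Readers always see a complete, immutable snapshot.
  private static final AtomicReference<Set<String>> CURRENT =
      new AtomicReference<>(Set.of());
  private static ScheduledExecutorService refresher;

  private AllowListState() {}

  /** Starts the refresh thread on the first call; later calls are no-ops. */
  static synchronized void startOnce(
      Supplier<Set<String>> loader, long period, TimeUnit unit) {
    if (refresher != null) {
      return;
    }
    CURRENT.set(Set.copyOf(loader.get())); // initial synchronous load
    refresher = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "allow-list-refresher");
      t.setDaemon(true); // must not keep the worker JVM alive
      return t;
    });
    refresher.scheduleAtFixedRate(() -> {
      try {
        CURRENT.set(Set.copyOf(loader.get()));
      } catch (RuntimeException e) {
        // Keep the previous snapshot; an uncaught exception here would
        // cancel all future refreshes.
      }
    }, period, period, unit);
  }

  static boolean contains(String key) {
    return key != null && CURRENT.get().contains(key);
  }
}

/** Hypothetical scalar UDF; cheap to construct because it holds no state. */
final class AllowListUdf {
  public Boolean apply(String input) {
    AllowListState.startOnce(AllowListUdf::loadAllowList, 1, TimeUnit.HOURS);
    return AllowListState.contains(input);
  }

  // Stub loader (assumption): a real one would query an external service.
  private static Set<String> loadAllowList() {
    return Set.of("a", "b");
  }
}
```

With this shape it would not matter how often the UDF object is created, since the refresh thread belongs to the holder class and not to any instance. The trade-off is that the state is per JVM, so each worker keeps and refreshes its own copy.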