Hi,

An UDTF will be wrapped into an operator, an operator instance will be
executed by a slot (or parallelism/thread) ,
About operator, task, slot, you can refer to [1] for more details.
A TM (a JVM process) may has multiple slots, that means a JVM process may
has multiple UDTF instances.
It's better to make sure your UDTF stateless, otherwise you should care
about thread-safe problem.

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/concepts/runtime.html#task-slots-and-resources

Best,
Godfrey



lec ssmi <shicheng31...@gmail.com> 于2020年4月16日周四 下午6:20写道:

> Hi:
>    I always wonder how much instance has been initialized in the whole
> flink application.
>    Suppose there is such a scenario:
>        I have a  UDTF  called '*mongo_join'*  through  which the flink
> table can join with external different mongo table  according to the
> parameters passed in.
>        So ,I have a sql table called    *trade . *Throughout  all the
> pipeline, I  join the *trade *table with  *item, * And *payment. *The sql
> statement as bellows:
>
>           * create view  trade_payment as  select trade_id, payment_id
> from trade , lateral table (mongo_join('payment')) as T(payment_id);*
> *          create view trade_item as  select trade_id,item_id from trade ,
> , lateral table (mongo_join('item')) as T(payment_id); *
>
>     As everyone thinks, I use  some *member variables* to store  the
> different MongoConnection  in the  instance of the UDTF.
>     So , will there be concurrency problems?  And how are the instances of
> the function distributed?
>
>   Thanks!
>
>

Reply via email to