Hi, An UDTF will be wrapped into an operator, an operator instance will be executed by a slot (or parallelism/thread) , About operator, task, slot, you can refer to [1] for more details. A TM (a JVM process) may has multiple slots, that means a JVM process may has multiple UDTF instances. It's better to make sure your UDTF stateless, otherwise you should care about thread-safe problem.
[1] https://ci.apache.org/projects/flink/flink-docs-stable/concepts/runtime.html#task-slots-and-resources Best, Godfrey lec ssmi <shicheng31...@gmail.com> 于2020年4月16日周四 下午6:20写道: > Hi: > I always wonder how much instance has been initialized in the whole > flink application. > Suppose there is such a scenario: > I have a UDTF called '*mongo_join'* through which the flink > table can join with external different mongo table according to the > parameters passed in. > So ,I have a sql table called *trade . *Throughout all the > pipeline, I join the *trade *table with *item, * And *payment. *The sql > statement as bellows: > > * create view trade_payment as select trade_id, payment_id > from trade , lateral table (mongo_join('payment')) as T(payment_id);* > * create view trade_item as select trade_id,item_id from trade , > , lateral table (mongo_join('item')) as T(payment_id); * > > As everyone thinks, I use some *member variables* to store the > different MongoConnection in the instance of the UDTF. > So , will there be concurrency problems? And how are the instances of > the function distributed? > > Thanks! > >