Re: CSV join in batch mode

2022-02-23 Thread Guowei Ma
Hi Killian, sorry for responding late! I think there is no simple way to catch CSV processing errors, which means you need to handle them yourself (correct me if I am missing something). I think you could use the RocksDB state backend [1], which would spill data to disk. [1] https://nightlies.apac
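
A minimal sketch of Guowei's suggestion, assuming Flink 1.13+ (where the class is EmbeddedRocksDBStateBackend) and an illustrative checkpoint path:

    import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    // RocksDB keeps working state on local disk instead of the JVM heap,
    // so a join over a large CSV input does not have to fit in memory
    env.setStateBackend(new EmbeddedRocksDBStateBackend());
    env.getCheckpointConfig().setCheckpointStorage("file:///tmp/flink-checkpoints");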

Re: java.lang.Exception: Job leader for job id 0efd8681eda64b072b72baef58722bc0 lost leadership.

2022-02-23 Thread Nicolaus Weidner
Hi Jai, On Tue, Feb 22, 2022 at 9:19 PM Jai Patel wrote: > It seems like the errors are similar to those discussed here: > - https://issues.apache.org/jira/browse/FLINK-14316 > - https://cdmana.com/2020/11/20201116104527255b.html > I couldn't find any other existing issue apart from the one you

Re: java.lang.Exception: Job leader for job id 0efd8681eda64b072b72baef58722bc0 lost leadership.

2022-02-23 Thread Jai Patel
Hi Nico, Thanks for getting back to us. We are using Flink 1.14.0 and we are using RocksDB. We are currently using the default memory settings. We'll look into increasing our managed memory fraction to 0.6 and see what happens. Do writes to ValueStates/MapStates have a direct impact on churn of the F
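
For reference, a flink-conf.yaml sketch of the change Jai describes (0.6 is the value from the thread, not a general recommendation):

    # share of task manager memory reserved as managed memory, which
    # RocksDB uses for its block cache and write buffers
    taskmanager.memory.managed.fraction: 0.6
    # keep RocksDB bounded by the managed memory budget (default: true)
    state.backend.rocksdb.memory.managed: true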

Restart from checkpoint - kinesis consumer

2022-02-23 Thread Vijayendra Yadav
Hi Team, I am running the Flink 1.11 Kinesis consumer with, say, N Kinesis shards, but I want to increase/decrease the shards to N+M or N-M. Once I do that, my Flink consumer needs to be restarted with changed parallelism. But I am unable to restart from the existing checkpoint because of the change in number of s

Re: Restart from checkpoint - kinesis consumer

2022-02-23 Thread Danny Cranmer
Hello Vijay, > Once I do that, my Flink consumer needs to be restarted with changed parallelism. Why is this? The Flink consumer continuously scans for new shards, and will auto-scale the number of shard consumer threads up/down to accommodate Kinesis resharding. Flink job/operator parallelism does
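
A short sketch of Danny's point: the Kinesis source discovers shards at runtime, so resharding does not force a parallelism change. Stream name, region, and interval below are illustrative; the classes are from flink-connector-kinesis:

    import java.util.Properties;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer;
    import org.apache.flink.streaming.connectors.kinesis.config.ConsumerConfigConstants;

    Properties consumerConfig = new Properties();
    consumerConfig.setProperty(ConsumerConfigConstants.AWS_REGION, "us-east-1");
    // how often the consumer checks Kinesis for newly split/merged shards
    consumerConfig.setProperty(ConsumerConfigConstants.SHARD_DISCOVERY_INTERVAL_MILLIS, "10000");

    DataStream<String> stream = env.addSource(
        new FlinkKinesisConsumer<>("my-stream", new SimpleStringSchema(), consumerConfig));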

Re: Task manager errors with Flink ZooKeeper High Availability

2022-02-23 Thread Sigalit Eliazov
Thanks for the response on this issue. With the same configuration defined: high-availability: zookeeper, high-availability.zookeeper.quorum: zk-noa-edge-infra:2181, high-availability.zookeeper.path.root: /flink, high-availability.storageDir: /flink_state, high-availabi
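
For readability, the settings quoted above as a flink-conf.yaml sketch; note that high-availability.storageDir must point at storage every JobManager can reach (typically a distributed filesystem or object store), not a pod-local path:

    high-availability: zookeeper
    high-availability.zookeeper.quorum: zk-noa-edge-infra:2181
    high-availability.zookeeper.path.root: /flink
    high-availability.storageDir: /flink_state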

Flink SQL: How to define an `ANY` subtype in Constructed Data Types (MAP, ARRAY, ...)

2022-02-23 Thread zhouhaifengmath
When I define UDF parameters like: public @DataTypeHint("Row") Row eval(@DataTypeHint("MAP") Map mapData), it gives the error: Please check for implementation mistakes and/or provide a corresponding hint. at org.apac

Flink job recovery after task manager failure

2022-02-23 Thread Afek, Ifat (Nokia - IL/Kfar Sava)
Hi, I am trying to use the Flink checkpoints solution in order to support task manager recovery. I'm running Flink using Beam with filesystem storage and the following parameters: checkpointingInterval=3 checkpointingMode=EXACTLY_ONCE. What I see is that if I kill a task manager pod, it takes f

Re: Trouble sinking to Kafka

2022-02-23 Thread Nicolaus Weidner
Hi Marco, I'm no expert on the Kafka producer, but I will try to help. [1] seems to have a decent explanation of possible causes for the error you encountered. Which leads me to two questions: if (druidProducerTransactionMaxTimeoutMs > 0) { > > properties.setProperty("transaction.max

Re: Flink job recovery after task manager failure

2022-02-23 Thread Zhilong Hong
Hi, Afek! When a TaskManager is killed, the JobManager will not notice until a heartbeat timeout happens. Currently, the default value of heartbeat.timeout is 50 seconds [1]. That's why it takes more than 30 seconds for Flink to trigger a failover. If you'd like to shorten the time a failover
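
A flink-conf.yaml sketch of the knobs Zhilong refers to (values are illustrative; shorter timeouts trade faster failover for more false positives during GC pauses):

    # default 50000 ms: how long the JobManager waits before declaring
    # a TaskManager dead
    heartbeat.timeout: 10000
    # default 10000 ms: how often heartbeats are sent
    heartbeat.interval: 2000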

Re: Trouble sinking to Kafka

2022-02-23 Thread Marco Villalobos
I fixed this, but I'm not 100% sure why. Here is my theory: My checkpoint interval is one minute, and the minimum pause interval is also one minute. My transaction timeout time is also one minute. I think the checkpoint causes Flink to hold the transaction open for one minute, and thus it times o

Re: Flink SQL: How to define an `ANY` subtype in Constructed Data Types (MAP, ARRAY, ...)

2022-02-23 Thread Jie Han
There is no built-in LogicalType for 'ANY'; it's an invalid token. > On Feb 23, 2022, at 10:29 PM, zhouhaifengmath wrote: > > > When I define UDF parameters like: > public @DataTypeHint("Row") Row > eval(@DataTypeHint("MAP") Map mapData) > > It gives the error: > Please check for implementation mistak
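
Since there is no ANY element type, one workaround is to accept the whole argument as an arbitrary object via InputGroup.ANY and cast inside the function. A sketch with a hypothetical UDF name:

    import org.apache.flink.table.annotation.DataTypeHint;
    import org.apache.flink.table.annotation.InputGroup;
    import org.apache.flink.table.functions.ScalarFunction;

    public static class MapSizeFunction extends ScalarFunction {
        // accept any input type; casting it back is the function's responsibility
        public int eval(@DataTypeHint(inputGroup = InputGroup.ANY) Object mapData) {
            return ((java.util.Map<?, ?>) mapData).size();
        }
    }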

[Flink-1.14.3] Restart of pod due to duplicate job submission

2022-02-23 Thread Parag Somani
Hello, Recently, due to the log4j vulnerabilities, we upgraded to Apache Flink 1.14.3. What we observed is that we are getting the following exception, and because of it the pod gets into CrashLoopBackOff. We have seen this issue especially at upgrade or deployment time, when an existing pod is already running.

Re: Restart from checkpoint - kinesis consumer

2022-02-23 Thread Vijayendra Yadav
Thanks Danny, Let me come back with results. > > On Feb 23, 2022, at 3:41 AM, Danny Cranmer wrote: > > Hello Vijay, > > > Once I do that, my Flink consumer needs to be restarted with changed > > parallelism. > Why is this? The Flink consumer continuously scans for new shards, and will > aut

RE: Trouble sinking to Kafka

2022-02-23 Thread Schwalbe Matthias
Good morning Marco, Your fix is pretty plausible: * Kafka transactions get started at the beginning of a checkpoint period and contain all events collected through this period, * At the end of the checkpoint period the associated transaction is committed and concurrently the transactio
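
A sketch of the resulting rule of thumb: the Kafka transaction timeout must outlive a full checkpoint period plus the pause, and the broker caps it at transaction.max.timeout.ms (15 minutes by default). Values below are illustrative:

    Properties producerProps = new Properties();
    // must cover checkpoint interval + min pause + checkpoint duration,
    // and must not exceed the broker's transaction.max.timeout.ms
    producerProps.setProperty("transaction.timeout.ms", "900000");

    env.enableCheckpointing(60_000);  // 1-minute checkpoints, as in the thread
    env.getCheckpointConfig().setMinPauseBetweenCheckpoints(60_000);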

Re: Trouble sinking to Kafka

2022-02-23 Thread Nicolaus Weidner
Hi Marco, The documentation kind of suggests this is the cause: > https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/connectors/kafka.html > > However, I think the documentation could benefit from a few examples and > scenarios that cover ill-considered configurations. > Matthias alre