Re: Left join query not clearing state after migrating from 1.9.0 to 1.14.3

2022-03-03 Thread Prakhar Mathur
Hi, Can someone kindly help and take a look at this? It's a major blocker for us. Thanks, Prakhar On Wed, Mar 2, 2022 at 2:11 PM Prakhar Mathur wrote: > Hello, > > We recently did a migration of our Flink jobs from version 1.9.0 to > 1.14.3. These jobs consume from Kafka and produce to respect

Re: Example with JSONKeyValueDeserializationSchema?

2022-03-03 Thread HG
Hi Kamil, Aeden and others, It was already answered. This was the complete solution: KafkaSource source = KafkaSource.builder() .setProperties(kafkaProps) .setProperty("ssl.truststore.type",trustStoreType) .setProperty("ssl.truststore.password",trustStorePassword) .s

Pyflink1.13 or JavaFlink1.13 + Jpython + Python2.7, which way has better performance?

2022-03-03 Thread vtygoss
Hi, community! I am working on optimizing our data processing architecture, moving from a full data pipeline to an incremental data pipeline, and from PySpark with Python code to one of the two options below: 1. PyFlink 1.13 + Python 2.7 2. JavaFlink 1.13 + JPython + Python 2.7 As far as I know, the Python APIs only

Re:Re: source code build failure

2022-03-03 Thread Edwin
Hi, yu'an: Many thanks for your reply, it has been fixed :), it turns out to be related to some local environmental settings. Best regards, Edwin At 2022-03-03 15:58:43, "yu'an huang" wrote: Hi Edwin, I suddenly realised that I replied to you directly, so I just sent

Version Upgrade of FlinkSQL (1.10 to 1.12)

2022-03-03 Thread zihao chen
Hi all, I would like to upgrade the Flink version of several FlinkSQL jobs from 1.10 to 1.12, and I want to restore the state saved by version 1.10 on version 1.12. After looking at the StreamGraph, JobGraph, Checkpoint and some other related information, I found these points that would cause the st

Re: Flink job recovery after task manager failure

2022-03-03 Thread yidan zhao
I think you should use NFS, which is easy to deploy, unlike HDFS. The state is written and read by the TM. ZK is used to record some metadata of the checkpoint, such as the checkpoint file path. Finally, I don't think your job can be recovered normally if you are not running with shared storage.

Max parallelism and reactive mode

2022-03-03 Thread Alexis Sarda-Espinosa
Hi everyone, I have some questions regarding max parallelism and how it interacts with deployment modes. The documentation states that max parallelism should be "set on a per-job and per-operator granularity" but doesn't provide more details. Is it possible to have different values of max parallel

Re: Example with JSONKeyValueDeserializationSchema?

2022-03-03 Thread Aeden Jameson
I believe you can solve this issue with .setDeserializer(KafkaRecordDeserializationSchema.of(new JSONKeyValueDeserializationSchema(false))) On Thu, Mar 3, 2022 at 8:07 AM Kamil ty wrote: > > Hello, > > Sorry for the late reply. I have checked the issue and it seems to be a type > issue as the e

Task Manager shutdown causing jobs to fail

2022-03-03 Thread Puneet Duggal
Hi, Currently in production, I have an HA session-mode Flink cluster with 3 job managers and multiple task managers with more than enough free task slots. But I have seen multiple times that whenever a task manager goes down (e.g. due to a heartbeat issue), so do all the jobs running on it, even wh

Re: Help with pom dependencies for Flink with Table API

2022-03-03 Thread Adesh Dsilva
Hi, Yes, you are right. I was mixing some dependencies without knowing. I did a complete reset of all dependencies and started with a fresh pom and it fixed it. Many Thanks! > On 2 Mar 2022, at 17:37, Adesh Dsilva wrote: > > Hello, > > I think I accidentally posted this question on the wron

StreamingFileSink bulk formats - small files

2022-03-03 Thread Kamil ty
Hello all, In multiple jobs I'm saving data using the DataStream API with StreamingFileSink and various bulk formats (Avro, Parquet). As bulk formats require a rolling policy that extends the CheckpointRollingPolicy, I have created a policy that additionally rolls on file size. Unfortunately for so

Re: Example with JSONKeyValueDeserializationSchema?

2022-03-03 Thread Kamil ty
Hello, Sorry for the late reply. I have checked the issue and it seems to be a type issue, as the exception suggests. What happens is that the JSONKeyValueDeserializationSchema included in Flink implements a KafkaDeserializationSchema. The .setDeserializer method expects a Deserialization schema th

Re: Questions regarding connecting local FlinkSQL client to remote JobManager in K8s

2022-03-03 Thread yu'an huang
Hi Elkhan, I confirm that the FlinkSQL Client communicates with the JM via the REST endpoint. After I changed "rest.port", the SQL client threw an exception: "[ERROR] Could not execute SQL statement. Reason: java.net.ConnectException: Connection refused". So for your case, since Flink will creat

error: cannot find symbol .setDeliverGuarantee in KafkaRecordSerializationSchemaBuilder

2022-03-03 Thread HG
As per the documentation, https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/kafka/ a Kafka sink can be defined as shown further below. But in fact it fails with: error: cannot find symbol .setDeliverGuarantee(DeliveryGuarantee.AT_LEAST_ONCE) ^ symbol: meth

Re: AWS Kinesis Flink vs K8s

2022-03-03 Thread Puneet Duggal
Hi Jeremy, Thank you for this detailed answer, and yes, this surely helps. Regards, Puneet > On 16-Feb-2022, at 9:21 PM, Ber, Jeremy wrote: > > Hi Puneet, > Amazon Kinesis Data Analytics for Apache Flink is a managed Apache Flink > offering--it removes the need to set up your own checkpointing

Re: How to sort Iterable in ProcessWindowFunction?

2022-03-03 Thread HG
Hi, I need to sort the input of the ProcessWindowFunction by one of the fields of the Tuple4 that is in the Iterable. Any advice as to what the best way is? static class MyProcessWindowFunction extends ProcessWindowFunction, String, String, TimeWindow> { @Override public vo

How to sort Iterable in ProcessWindowFunction?

2022-03-03 Thread HG
Hi, I need to sort the input of the ProcessWindowFunction by one of the fields of the Tuple4 that is in the Iterable. static class MyProcessWindowFunction extends ProcessWindowFunction, String, String, TimeWindow> { @Override public void process(String key, Context context, It
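One common way to approach the sorting question above is to copy the Iterable into a list and sort that copy with a Comparator. The following is only a minimal sketch in plain Java with no Flink dependency: `Row4` is a hypothetical stand-in for a Tuple4-like record with public fields f0..f3, and `sortedCopy` is an illustrative helper name, not a Flink API. It assumes all elements of one window fit in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SortIterableSketch {

    // Hypothetical stand-in for a Tuple4-like record with public fields f0..f3.
    static final class Row4 {
        final String f0;
        final Long f1;
        final String f2;
        final Integer f3;

        Row4(String f0, Long f1, String f2, Integer f3) {
            this.f0 = f0;
            this.f1 = f1;
            this.f2 = f2;
            this.f3 = f3;
        }
    }

    // Copies the elements into a new list and sorts the copy.
    // The input Iterable itself is left untouched.
    static <T> List<T> sortedCopy(Iterable<T> input, Comparator<? super T> order) {
        List<T> copy = new ArrayList<>();
        for (T element : input) {
            copy.add(element);
        }
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        // Plays the role of the Iterable handed to a window function.
        Iterable<Row4> windowInput = Arrays.asList(
                new Row4("a", 30L, "x", 1),
                new Row4("a", 10L, "y", 2),
                new Row4("a", 20L, "z", 3));

        // Sort ascending by the second field (f1).
        List<Row4> sorted = sortedCopy(windowInput, Comparator.comparing((Row4 r) -> r.f1));
        for (Row4 r : sorted) {
            System.out.println(r.f1 + " " + r.f2);
        }
    }
}
```

Inside a window function the same helper could presumably be called on the Iterable argument of process(), with the results emitted from the sorted list; which field to compare on depends on the actual Tuple4 type, which is not visible in the truncated message.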

Re: Help with pom dependencies for Flink with Table API

2022-03-03 Thread Francesco Guardiani
Hi, The moving of org.apache.flink.connector.file.table.factories.BulkReaderFormatFactory was done in master a couple of months ago by me, and it should be only on 1.15+. Could it be you're somehow mixing master snapshots with 1.14.x? Are you trying to run the job on a cluster using a Flink distri

Re: Questions regarding connecting local FlinkSQL client to remote JobManager in K8s

2022-03-03 Thread yu'an huang
Hi Elkhan, Besides the JM having an external IP address, I think port 6123 also needs to be opened. You may need to set a host port for 6123 in the JM pod or expose this port through a Kubernetes service. But I am not sure whether the sql-client communicates with the JM via the REST endpoint or the RPC port. Hopes som