Re: Initializing broadcast state

2021-01-26 Thread Nick Bendtner
…I don't think you could update the `myState` in the `processBroadcastElement`. It is because you need a key to update the keyed state, but there is no key in `processBroadcastElement`. Best, Guowei. On Tue, Jan 26, 2021 at 6:31 AM Nick Bendtner wrote: …
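
A minimal sketch of the pattern this thread converges on, assuming a hypothetical job in which keyed String events are enriched from a single broadcast config value; the state names and types are illustrative, not from the thread. Events that arrive before the broadcast state is populated are parked in keyed list state, and `applyToKeyedState` drains every key's backlog once the broadcast element shows up:

```java
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction;
import org.apache.flink.util.Collector;

// Hypothetical job: keyed String events enriched by a single broadcast config value.
public class EnrichFunction
        extends KeyedBroadcastProcessFunction<String, String, String, String> {

    // Broadcast state; empty until the first broadcast element arrives.
    private static final MapStateDescriptor<String, String> CONFIG =
            new MapStateDescriptor<>("config", String.class, String.class);

    // Keyed state used to park events that arrive before the config does.
    private static final ListStateDescriptor<String> PENDING =
            new ListStateDescriptor<>("pending", String.class);

    @Override
    public void processElement(String event, ReadOnlyContext ctx, Collector<String> out)
            throws Exception {
        String config = ctx.getBroadcastState(CONFIG).get("config");
        if (config == null) {
            // Broadcast state not populated yet: cache the element and wait.
            getRuntimeContext().getListState(PENDING).add(event);
        } else {
            out.collect(config + ":" + event);
        }
    }

    @Override
    public void processBroadcastElement(String config, Context ctx, Collector<String> out)
            throws Exception {
        ctx.getBroadcastState(CONFIG).put("config", config);
        // No key is in scope here, but applyToKeyedState visits the cached
        // state of every key and drains the backlog.
        ctx.applyToKeyedState(PENDING, (key, pending) -> {
            for (String event : pending.get()) {
                out.collect(config + ":" + event);
            }
            pending.clear();
        });
    }
}
```

Note that `processBroadcastElement` still never sees a key directly; `applyToKeyedState` is what iterates the keyed state on its behalf.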

Re: Initializing broadcast state

2021-01-25 Thread Nick Bendtner
…cache the element and handle it when the broadcast side elements arrive. Especially if you are using the `KeyedBroadcastProcessFunction`, you could use `applyToKeyedState` to access the elements you cached before. Best, Guowei. On Mon, Jan 2…

Re: Initializing broadcast state

2021-01-24 Thread Nick Bendtner
…cache the element and handle it when the broadcast side elements arrive. Especially if you are using the `KeyedBroadcastProcessFunction`, you could use `applyToKeyedState` to access the elements you cached before. Best, Guowei. On Mon, Jan 25…

Initializing broadcast state

2021-01-24 Thread Nick Bendtner
Hi guys, what is the way to initialize broadcast state (say with default values) before the first element shows up in the broadcasting stream? I do a lookup on the broadcast state to process transactions which come from another stream. The problem is the broadcast state is empty until the first elem…

Re: Status of a job when a kafka source dies

2020-08-13 Thread Nick Bendtner
Like `request.timeout.ms` or `consumer.timeout.ms`? But as I wrote before, that would be more a question for the Kafka folks. Piotrek [1] http://kafka.apache.org/20/documentation/ On Wed, Aug 5, 2020 at 19:58 Nick Bendtner wrote: +user group.

Re: Status of a job when a kafka source dies

2020-08-05 Thread Nick Bendtner
+user group. On Wed, Aug 5, 2020 at 12:57 PM Nick Bendtner wrote: Thanks Piotr, but shouldn't this event be handled by the FlinkKafkaConsumer, since the poll happens inside the FlinkKafkaConsumer? How can I catch this event in my code since I don't have control over…

Re: Status of a job when a kafka source dies

2020-08-04 Thread Nick Bendtner
…flink/flink-docs-stable/dev/task_failure_recovery.html Best, Aljoscha. On 15.07.20 17:42, Nick Bendtner wrote: Hi guys, I want to know what the default behavior of the Kafka source is when a Kafka cluster goes down during streaming. Will the job statu…
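
For context, the failure handling Aljoscha points to is governed by the configured restart strategy; a minimal sketch (the retry count and delay are placeholders, not recommendations):

```java
import java.util.concurrent.TimeUnit;
import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RestartConfigExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // If a task fails (e.g. the Kafka poll throws), restart the job up to
        // 3 times with a 10-second delay before it transitions to FAILED.
        env.setRestartStrategy(
                RestartStrategies.fixedDelayRestart(3, Time.of(10, TimeUnit.SECONDS)));
    }
}
```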

Status of a job when a kafka source dies

2020-07-15 Thread Nick Bendtner
Hi guys, I want to know what the default behavior of the Kafka source is when a Kafka cluster goes down during streaming. Will the job status go to failing, or is the exception caught and a back-off applied before the source tries to poll for more events? Best, Nick.

Non parallel file sources

2020-06-23 Thread Nick Bendtner
Hi guys, what is the best way to process a file from a Unix file system, given there is no guarantee as to which task manager will be assigned to process the file? We run Flink in standalone mode. We currently follow the brute-force approach of copying the file to every task manager; is there a bet…
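
One common alternative, sketched below under the assumption that some shared storage (HDFS, an NFS mount, S3, …) is available: stage the file where every task manager can reach it, so it no longer matters which TM is assigned the source subtask. The path is a placeholder:

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SingleFileJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Read from a location every task manager can reach, so it no longer
        // matters which TM hosts the single source subtask.
        DataStream<String> lines =
                env.readTextFile("hdfs:///shared/input/events.txt").setParallelism(1);

        lines.print();
        env.execute("single file job");
    }
}
```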

State backend considerations

2020-06-21 Thread Nick Bendtner
Hi guys, I have a few questions on state backends. Is there a guideline on how big the state has to be before it makes sense to use RocksDB rather than the FsStateBackend? Is there an analysis of full-checkpoint latency for the FsStateBackend as state size grows? Best, Nick.
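
For reference, switching between the two backends in question is a one-line change, so they can be benchmarked against a real workload; a sketch with a placeholder checkpoint path (RocksDB additionally needs the flink-statebackend-rocksdb dependency):

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BackendChoice {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Heap-based state snapshotted to a file system; state must fit in memory,
        // but reads and writes touch plain Java objects.
        env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));

        // Or: state in embedded RocksDB on local disk, so it can exceed memory,
        // at the cost of (de)serialization on every state access.
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));
    }
}
```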

Re: kerberos integration with flink

2020-06-01 Thread Nick Bendtner
…ticket cache. I'll try to do a deeper investigation. [1] https://bugs.openjdk.java.net/browse/JDK-8058290 Best, Yangze Guo. On Sat, May 30, 2020 at 3:07 AM Nick Bendtner wrote: Hi Guo, thanks again for your inputs. If I periodic…

Re: kerberos integration with flink

2020-05-29 Thread Nick Bendtner
…tokens for the job manager and task executor. As the documentation says, the main drawback is that the cluster is necessarily short-lived, since the generated delegation tokens will expire (typically within a week). Best, Yangze Guo. On Sat, May 23, 2020 at 1:23 AM Nick Bendtner…

Re: kerberos integration with flink

2020-05-22 Thread Nick Bendtner
…projects/flink/flink-docs-master/ops/security-kerberos.html#hadoop-security-module [4] https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java Best, Yangze Guo. On Thu, May 21, 2020 at 11:0…

kerberos integration with flink

2020-05-21 Thread Nick Bendtner
Hi guys, is there any difference in providing the Kerberos config to the Flink JVM using this method in the Flink configuration? env.java.opts: -Dconfig.resource=qa.conf -Djava.library.path=/usr/mware/flink-1.7.2/simpleapi/lib/ -Djava.security.auth.login.config=/usr/mware/flink-1.7.2/Jaas/kafka-jaas.…
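
The alternative discussed in the replies is Flink's built-in Kerberos support, configured in flink-conf.yaml rather than through JVM options; a sketch with a placeholder keytab path, principal, and JAAS contexts:

```yaml
security.kerberos.login.use-ticket-cache: false
security.kerberos.login.keytab: /path/to/flink.keytab
security.kerberos.login.principal: flink-user@EXAMPLE.COM
# JAAS contexts the credentials are provided to (e.g. Kafka, ZooKeeper).
security.kerberos.login.contexts: Client,KafkaClient
```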

Re: ProducerRecord with Kafka Sink for 1.8.0

2020-05-14 Thread Nick Bendtner
…FLINK-11693 to Flink 1.8 yourself and build a custom Kafka connector. On Tue, May 12, 2020 at 10:04 PM Nick Bendtner wrote: Hi Gary, it's because the Flink distribution of the cluster is 1.7.2. We use a standalone cluster, so in the lib directory in f…

Re: ProducerRecord with Kafka Sink for 1.8.0

2020-05-12 Thread Nick Bendtner
…with provided scope [1]. Best, Gary [1] https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#dependency-scope On Tue, May 12, 2020 at 5:41 PM Nick Bendtner wrote: Hi Gary, thanks for the info. I a…

Re: ProducerRecord with Kafka Sink for 1.8.0

2020-05-12 Thread Nick Bendtner
…flink/flink-docs-release-1.9/api/java/org/apache/flink/streaming/connectors/kafka/KafkaSerializationSchema.html On Mon, May 11, 2020 at 10:59 PM Nick Bendtner wrote: Hi guys, I use version 1.8.0 of flink-connector-kafka. Do you have any recommendations on…
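
A sketch of the interface linked above, which ships with the universal connector from Flink 1.9 on; the element type, topic handling, and header are placeholders. Because `serialize` returns a Kafka `ProducerRecord`, headers can be attached directly:

```java
import java.nio.charset.StandardCharsets;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EventSerializationSchema implements KafkaSerializationSchema<String> {

    private final String topic;

    public EventSerializationSchema(String topic) {
        this.topic = topic;
    }

    @Override
    public ProducerRecord<byte[], byte[]> serialize(String element, Long timestamp) {
        ProducerRecord<byte[], byte[]> record = new ProducerRecord<>(
                topic, null, timestamp, null, element.getBytes(StandardCharsets.UTF_8));
        // Returning a ProducerRecord is what makes Kafka headers possible.
        record.headers().add("source", "flink".getBytes(StandardCharsets.UTF_8));
        return record;
    }
}
```

An instance of this schema is then handed to the `FlinkKafkaProducer` constructor in place of a plain serialization schema.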

ProducerRecord with Kafka Sink for 1.8.0

2020-05-11 Thread Nick Bendtner
Hi guys, I use version 1.8.0 of flink-connector-kafka. Do you have any recommendations on how to produce a ProducerRecord from a Kafka sink? I'm looking to add support for Kafka headers, hence the interest in ProducerRecord. Any thoughts are highly appreciated. Best, Nick.

Using flink-connector-kafka-1.9.1 with flink-core-1.7.2

2020-05-06 Thread Nick Bendtner
Hi guys, I am using Flink version 1.7.2. I have to deserialize data from Kafka into ConsumerRecords, so I decided to update flink-connector-kafka to 1.9.1, which provides support for ConsumerRecord. We use child-first class loading. However, it seems I have a compatibility issue, as I g…

Casting from org.apache.flink.api.java.tuple.Tuple2 to scala.Product; using java and scala in flink job

2020-05-05 Thread Nick Bendtner
Hi guys, in our Flink job we use a Java source for deserializing a message from Kafka using a Kafka deserializer. The signature is as below: public class CustomAvroDeserializationSchema implements KafkaDeserializationSchema<Tuple2<…>> The other parts of the streaming job are in Scala. When data has to b…
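
`scala.Tuple2` implements `scala.Product`, while `org.apache.flink.api.java.tuple.Tuple2` does not, so a cast between them fails; one hedged option is to convert explicitly at the Java/Scala boundary. A sketch with placeholder element types:

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;

public class TupleBridge {
    // org.apache.flink.api.java.tuple.Tuple2 does not implement scala.Product,
    // so a direct cast fails. Build a scala.Tuple2 explicitly instead.
    public static DataStream<scala.Tuple2<String, String>> toScala(
            DataStream<Tuple2<String, String>> in) {
        return in.map(new MapFunction<Tuple2<String, String>, scala.Tuple2<String, String>>() {
            @Override
            public scala.Tuple2<String, String> map(Tuple2<String, String> t) {
                return new scala.Tuple2<>(t.f0, t.f1);
            }
        });
    }
}
```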

Re: Help with flink hdfs sink

2020-03-25 Thread Nick Bendtner
…"hdfs:///tmp/auditlog/")". There is one additional / after hdfs://, which is the protocol name. Best, Jingsong Lee. On Fri, Mar 20, 2020 at 3:13 AM Nick Bendtner wrote: Hi guys, I am using flink version…
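
To make the path layout concrete, a sketch of a row-format StreamingFileSink with the three-slash URI Jingsong describes (hdfs:// is the scheme, and the third slash begins the absolute path); the encoder choice is a placeholder:

```java
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class AuditLogSink {
    public static StreamingFileSink<String> build() {
        // hdfs:// is the scheme; the third slash starts the absolute path, and
        // the name node itself is resolved from the Hadoop configuration.
        return StreamingFileSink
                .forRowFormat(new Path("hdfs:///tmp/auditlog/"),
                              new SimpleStringEncoder<String>("UTF-8"))
                .build();
    }
}
```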

Help with flink hdfs sink

2020-03-19 Thread Nick Bendtner
Hi guys, I am using Flink version 1.7.2 and trying to write to an HDFS sink from my Flink job. I set up HADOOP_HOME. Here is the debug log for this: 2020-03-19 18:59:34,316 DEBUG org.apache.flink.runtime.util.HadoopUtils - Cannot find hdfs-default configuration-file path in Flin…

Re: Providing hdfs name node IP for streaming file sink

2020-03-02 Thread Nick Bendtner
…Best, Yang [1] http://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html On Sat, Feb 29, 2020 at 6:00 AM Nick Bendtner wrote: To add to this question, do I need to set up env.hadoop.conf.dir to point…

Re: Providing hdfs name node IP for streaming file sink

2020-02-28 Thread Nick Bendtner
…at 12:56 PM Nick Bendtner wrote: Hi guys, I am trying to write to HDFS from the streaming file sink. Where should I provide the IP address of the name node? Can I provide it as part of the flink-conf.yaml file, or should I provide it like this: final Stre…

Providing hdfs name node IP for streaming file sink

2020-02-28 Thread Nick Bendtner
Hi guys, I am trying to write to HDFS from the streaming file sink. Where should I provide the IP address of the name node? Can I provide it as part of the flink-conf.yaml file, or should I provide it like this: final StreamingFileSink sink = StreamingFileSink.forBulkFormat(hdfs://nameno…
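
Following Yang's pointer to the HA documentation, one way to answer this: keep the address out of the job code and let the Hadoop client configs name it. A flink-conf.yaml sketch with a placeholder directory:

```yaml
# Point Flink at the Hadoop client configs; fs.defaultFS in core-site.xml
# (or the HA nameservice defined in hdfs-site.xml) then supplies the name node.
env.hadoop.conf.dir: /etc/hadoop/conf
```

With that in place the sink path can stay scheme-only (hdfs:///some/dir), or name the HA nameservice id (hdfs://mycluster/some/dir), rather than hard-coding an IP.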