Re: How to handle Flink Job with 400MB+ Uberjar with 800+ containers ?

2019-08-29 Thread SHI Xiaogang
Hi Datashov, We faced similar problems in our production clusters. Now both launching and stopping of containers are performed in the main thread of YarnResourceManager. As containers are launched and stopped one after another, it usually takes a long time to bootstrap large jobs. Things get worse wh

How to handle Flink Job with 400MB+ Uberjar with 800+ containers ?

2019-08-29 Thread Elkhan Dadashov
Dear Flink developers, Having difficulty getting a Flink job started. The job's uberjar/fat jar is around 400MB, and I need to kick off 800+ containers. The default HDFS replication is 3. *The Yarn queue is empty, and 800 containers are allocated almost immediately by Yarn RM.* It takes v
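
The replies above are truncated, but a common mitigation for this kind of localization bottleneck is to raise the HDFS replication factor of the shipped uberjar so that 800 containers do not all pull it from 3 datanodes; the path and replication count below are placeholders:

    # Increase replication of the already-uploaded fat jar before job submission
    # (path and factor are hypothetical).
    hdfs dfs -setrep -w 20 /user/flink/jobs/my-job-uberjar.jar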

Re: Flink and kerberos

2019-08-29 Thread Vishwas Siravara
Thanks, I'll check it out. On Thu, Aug 29, 2019 at 1:08 PM David Morin wrote: > Vishwas, > > A config that works on my Kerberized cluster (Flink on Yarn). > I hope this will help you. > > Flink conf: > security.kerberos.login.use-ticket-cache: true > security.kerberos.login.keytab: /home/myuser/

Re: Flink and kerberos

2019-08-29 Thread David Morin
Vishwas, A config that works on my Kerberized cluster (Flink on Yarn). I hope this will help you. Flink conf: security.kerberos.login.use-ticket-cache: true security.kerberos.login.keytab: /home/myuser/myuser.keytab security.kerberos.login.principal: myuser@ security.kerberos.login.contexts:
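
Completing the truncated keys above, a minimal flink-conf.yaml sketch for a keytab-based setup; the realm and the contexts value are placeholders (Client,KafkaClient is the typical value when both ZooKeeper and Kafka require Kerberos):

    security.kerberos.login.use-ticket-cache: true
    security.kerberos.login.keytab: /home/myuser/myuser.keytab
    security.kerberos.login.principal: myuser@EXAMPLE.COM    # hypothetical realm
    security.kerberos.login.contexts: Client,KafkaClient     # assumed value, truncated in the mail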

Re: Flink 1.9, MapR secure cluster, high availability

2019-08-29 Thread Stephan Ewen
Hi Maxim! The change of the MapR dependency should not have an impact on that. Do you know if the same thing worked in prior Flink versions? Is that a regression in 1.9? The exception that you report, is that from Flink's HA services trying to connect to ZK, or from the MapR FS client trying to c
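
For reference, the ZooKeeper HA settings involved (Flink 1.9 key names); hosts and paths below are placeholders, and on MapR the ZooKeeper quorum typically listens on port 5181 rather than 2181:

    high-availability: zookeeper
    high-availability.zookeeper.quorum: zk1:5181,zk2:5181,zk3:5181   # placeholder hosts
    high-availability.storageDir: maprfs:///flink/ha                 # placeholder path
    high-availability.cluster-id: /my-flink-app                      # placeholder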

Re: End of Window Marker

2019-08-29 Thread Eduardo Winpenny Tejedor
Hi, I'll chip in with an approach I'm trying at the moment that seems to work, and I say seems because I'm only running this on a personal project. Personally, I don't have anything against end-of-message markers per partition; Padarn, you seem not to prefer this option as it overloads the meaning
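
As a generic sketch of the marker-record idea discussed in this thread (not necessarily the exact approach above, which is cut off), a ProcessWindowFunction can emit its window results and then one end-of-window record per key; the Output type is a hypothetical placeholder:

    import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
    import org.apache.flink.util.Collector;

    // Hypothetical output record: either a real result or an end-of-window marker.
    class Output {
        final String key;
        final long value;
        final long windowEnd;
        final boolean endOfWindow;

        Output(String key, long value, long windowEnd, boolean endOfWindow) {
            this.key = key;
            this.value = value;
            this.windowEnd = windowEnd;
            this.endOfWindow = endOfWindow;
        }
    }

    // Emits all window results, then a single marker per key and window so that
    // downstream consumers know no more data for this window will follow.
    public class WindowWithMarker
            extends ProcessWindowFunction<Long, Output, String, TimeWindow> {

        @Override
        public void process(String key, Context ctx, Iterable<Long> values, Collector<Output> out) {
            for (Long v : values) {
                out.collect(new Output(key, v, ctx.window().getEnd(), false));
            }
            out.collect(new Output(key, -1L, ctx.window().getEnd(), true));
        }
    }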

Re: Flink and kerberos

2019-08-29 Thread Vishwas Siravara
Hey David, My consumers are registered, here is the debug log. The problem is the broker does not belong to me, so I can’t see what is going on there. But this is a new consumer group, so there is no state yet. org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase - Consumer s

Re: Kafka consumer behavior with same group Id, 2 instances of apps in separate cluster?

2019-08-29 Thread Becket Qin
Hi Ashish, You are right. Flink does not use Kafka based group management. So if you have two clusters consuming the same topic, they will not divide the partitions. The cross cluster HA is not quite possible at this point. It would be good to know the reason you want to have such HA and see if Fl

Re: Flink and kerberos

2019-08-29 Thread David Morin
Hello Vishwas, You can use a keytab if you prefer. You generate a keytab for your user and then you can reference it in the Flink configuration. Then this keytab will be handled by Flink in a secure way and TGT will be created based on this keytab. However, that seems to be working. Did you check
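
One quick sanity check before wiring a keytab into flink-conf.yaml is to authenticate with it directly (path and principal below are placeholders); if this fails, the problem is the keytab or principal rather than Flink:

    kinit -kt /home/myuser/myuser.keytab myuser@EXAMPLE.COM
    klist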

Re: problem with avro serialization

2019-08-29 Thread Debasish Ghosh
Any update on this? regards. On Tue, May 14, 2019 at 2:22 PM Tzu-Li (Gordon) Tai wrote: > Hi, > > Aljoscha opened a JIRA just recently for this issue: > https://issues.apache.org/jira/browse/FLINK-12501. > > Do you know if this is a regression from previous Flink versions? > I'm asking just to

Re: Kafka consumer behavior with same group Id, 2 instances of apps in separate cluster?

2019-08-29 Thread ashish pok
Looks like Flink is using “assign” for partitions instead of “subscribe”, which will not allow participating in a group, if I read the code correctly. Has anyone solved this type of problem in the past for active-active HA across 2 clusters using Kafka? - Ashish On Wednesday, August 28, 2019, 6:52 PM,
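
For reference, the difference in plain Kafka-client terms (broker, topic and partition below are placeholders): subscribe() joins broker-side group rebalancing, while assign(), which Flink's Kafka source uses, pins partitions explicitly and uses group.id only for committing offsets:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class AssignVsSubscribe {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker:9092");   // placeholder
            props.put("group.id", "my-group");
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            // Group management: the coordinator balances partitions across every
            // consumer that subscribes with the same group.id.
            KafkaConsumer<String, String> subscribing = new KafkaConsumer<>(props);
            subscribing.subscribe(Collections.singletonList("my-topic"));

            // Manual assignment: the client decides which partitions it reads and
            // no rebalancing happens, regardless of other consumers in the group.
            KafkaConsumer<String, String> assigning = new KafkaConsumer<>(props);
            assigning.assign(Collections.singletonList(new TopicPartition("my-topic", 0)));

            subscribing.close();
            assigning.close();
        }
    }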

Re: Per Partition Watermarking source idleness

2019-08-29 Thread Eduardo Winpenny Tejedor
Hey Prakhar, Sorry for taking so long to reply. One possible strategy is to advance the watermark after not receiving new messages for T milliseconds. In order to do this you must be fairly confident that you will not get messages delayed for longer than T milliseconds. To this end I've written my
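
Since the reply is cut off, here is a minimal sketch of that strategy on the pre-1.11 watermark API (AssignerWithPeriodicWatermarks): if no element has arrived for T milliseconds of wall-clock time, let the watermark advance with processing time. MyEvent and its getTimestamp() are placeholders, and the timeout semantics are assumptions:

    import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
    import org.apache.flink.streaming.api.watermark.Watermark;

    interface MyEvent { long getTimestamp(); }   // hypothetical event type

    public class IdleAwareWatermarks implements AssignerWithPeriodicWatermarks<MyEvent> {

        private final long idleTimeoutMs;        // T: silence before forcing progress
        private final long maxOutOfOrdernessMs;  // allowed event-time lateness

        private long maxTimestampSeen = Long.MIN_VALUE;
        private long lastEventWallClock = System.currentTimeMillis();

        public IdleAwareWatermarks(long idleTimeoutMs, long maxOutOfOrdernessMs) {
            this.idleTimeoutMs = idleTimeoutMs;
            this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        }

        @Override
        public long extractTimestamp(MyEvent event, long previousElementTimestamp) {
            long ts = event.getTimestamp();
            maxTimestampSeen = Math.max(maxTimestampSeen, ts);
            lastEventWallClock = System.currentTimeMillis();
            return ts;
        }

        @Override
        public Watermark getCurrentWatermark() {
            long now = System.currentTimeMillis();
            if (now - lastEventWallClock > idleTimeoutMs) {
                // Idle source: advance event time with the wall clock, which is only
                // safe if no event can arrive delayed by more than idleTimeoutMs.
                return new Watermark(now - maxOutOfOrdernessMs);
            }
            if (maxTimestampSeen == Long.MIN_VALUE) {
                return new Watermark(Long.MIN_VALUE);   // nothing seen yet
            }
            return new Watermark(maxTimestampSeen - maxOutOfOrdernessMs);
        }
    }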

Left Anti-Join

2019-08-29 Thread ddwcg
hi, I want to calculate the amount of consumption relative to users added in the previous year, but the result of the following SQL calculation is incorrect. The "appendTable" is a table registered from an append stream. select a.years,a.shopId,a.userId, a.amount from (select years,shopId,userId
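
The query above is cut off, but for the left anti-join in the subject the usual relational formulation is a left outer join filtered on a NULL right-side key; lastYearUsers and the join columns below are placeholders, and whether the planner in use accepts this as a regular streaming join still needs checking:

    SELECT a.years, a.shopId, a.userId, a.amount
    FROM appendTable a
    LEFT JOIN lastYearUsers b
      ON a.shopId = b.shopId AND a.userId = b.userId
    WHERE b.userId IS NULL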

Re: Loading dylibs

2019-08-29 Thread Yang Wang
Hi Vishwas, I think it is just because the dylib is loaded more than once in a JVM process (TaskManager). Multiple tasks are deployed in one TaskManager and run in different threads. So if you want the dylib to be loaded only once, maybe you can use the parent classloader. You could use the followin
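
The concrete suggestion is cut off above; one common way to ensure native code is loaded exactly once per TaskManager is to put the jar that calls System.loadLibrary() on Flink's lib/ classpath (owned by the parent classloader) and, if needed, force the relevant package to resolve parent-first in flink-conf.yaml (the package name below is a placeholder):

    classloader.parent-first-patterns.additional: com.mycompany.nativeloader.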

Re: Flink and kerberos

2019-08-29 Thread Vishwas Siravara
I see this log as well, but I can't see any messages. I know for a fact that the topic I am subscribed to has messages, as I checked with a simple Java consumer with a different group. org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase - Consumer subtask 0 will start reading th

Re: Kafka producer failed with InvalidTxnStateException when performing commit transaction

2019-08-29 Thread Tony Wei
Hi, Has anyone run into the same problem? I have updated my producer transaction timeout to 1.5 hours, but the problem still happened when I restarted the broker with the active controller. It might not be due to the checkpoint duration being too long and causing the transaction timeout. I had no more clue
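
For context, a sketch of the producer setting mentioned above; the broker name is a placeholder, and the broker-side cap transaction.max.timeout.ms (15 minutes by default) must be raised to at least the same value or the producer's request is rejected:

    import java.util.Properties;

    public class TxnTimeoutProps {
        public static Properties kafkaProducerProps() {
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "broker:9092");   // placeholder
            // 1.5 hours, matching the timeout mentioned above; pass these Properties
            // to the exactly-once FlinkKafkaProducer (Semantic.EXACTLY_ONCE).
            props.setProperty("transaction.timeout.ms", String.valueOf(90 * 60 * 1000));
            return props;
        }
    }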

Flink and kerberos

2019-08-29 Thread Vishwas Siravara
Hi guys, I am using Kerberos for my Kafka source. I pass the JAAS config and krb5.conf in env.java.opts: -Dconfig.resource=qa.conf -Djava.library.path=/usr/mware/SimpleAPI/voltage-simple-api-java-05.12.-Linux-x86_64-64b-r234867/lib/ -Djava.security.auth.login.config=/home/was/Jaas/kafka-jaa
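
For reference, a static JAAS file of the kind pointed to by -Djava.security.auth.login.config usually looks like the sketch below (keytab path and principal are placeholders); alternatively, as noted in the replies above, Flink can generate this entry itself when security.kerberos.login.contexts includes KafkaClient:

    KafkaClient {
      com.sun.security.auth.module.Krb5LoginModule required
      useKeyTab=true
      storeKey=true
      keyTab="/path/to/myuser.keytab"
      principal="myuser@EXAMPLE.COM";
    };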