Re: Limitations with Retract Streams on SQL

2018-05-22 Thread Rong Rong
The SQL UNION is the reason here that's causing (a) the table is not append only, and (b) the inner GroupBy. If you check out the UNION operator[1], it suggests that: "Any duplicate records are automatically removed unless UNION ALL is used". So: (1) it is definitely not append-only operation as y

Re: Batch job stuck in Canceled state in Flink 1.5

2018-05-22 Thread Till Rohrmann
Hi Amit, it looks as if the current cancellation cause is not the same as the initially reported cancellation cause. In the current case, it looks as if the deployment of your tasks takes so long that that maximum `taskmanager.network.request-backoff.max` value has been reached. When this happens

How to restore state from savepoint with flink SQL

2018-05-22 Thread Yan Zhou [FDS Science]
Hi, My application use flink SQL and it's running in production. How can i update my application with topology changes yet doesn't lose the state data? Is there a way to assign UID to the operators that are translated from SQL? If not, is it intended and whats the rationality behind it? Acco

program args size for running jobs

2018-05-22 Thread Esteban Serrano
Hi Flink users Looking for some ideas on alternatives/workaround for running a job which requires a large number of parameters (one of them being a long JSON string). When all params are taken together, the resulting REST API URL puts the size of the URL over the current 4096 bytes limit. Ideally,

Limitations with Retract Streams on SQL

2018-05-22 Thread Gregory Fee
I'm trying to get a stream of data from a Table I've formed with roughly this SQL: SELECT user_id, count(msg), HOP_END(rowtime, INTERVAL '1' second, INTERVAL '1' minute) FROM (SELECT rowtime, user_id, action_name AS msg FROM event_client_action WHERE /* various clause

Order of events with chanined keyBy calls of same key

2018-05-22 Thread Shimony, Shay
Hi everyone, I have this question in StackOverflow, and would be happy if you could answer. https://stackoverflow.com/questions/50340107/order-of-events-with-chanined-keyby-calls-of-same-key Thanks! Shay

Delete Flink logs from YARN in a lifetime running job

2018-05-22 Thread Georgi Stoyanov
Hi guys, I have the following set – Yarn on top of Hadoop 2.8 and Flink on it and running it on 2-3 nodes. The problem is that during the running of the job the log files are growing and I’m not able to delete them, cause there is lock on them. I don’t want to stop the job in order to release t

Re: Multiple hdfs

2018-05-22 Thread Kien Truong
You only need to modify the core-site and hdfs-site read by Flink. Regards, Kiên On 5/22/2018 9:07 PM, Deepak Sharma wrote: Wouldnt 2 core-site and hdfs-site xmls need to be provided in this case then ? Thanks Deepak On Tue, May 22, 2018, 19:34 Raul Valdoleiros

Re: This server is not the leader for that topic-partition

2018-05-22 Thread gerardg
I've seen the same error while upgrading Kafka. We are using FlinkKafkaProducer011 and Kafka version 1.0.0. While upgrading to Kafka 1.1.0, each time a server was restarted, an already running Flink job failed with the same message. Gerard -- Sent from: http://apache-flink-user-mailing-list-arc

Re: Multiple hdfs

2018-05-22 Thread Deepak Sharma
Wouldnt 2 core-site and hdfs-site xmls need to be provided in this case then ? Thanks Deepak On Tue, May 22, 2018, 19:34 Raul Valdoleiros < raul.valdoleiros.olive...@gmail.com> wrote: > Hi Kien, > > Thanks for you reply. > > Your goal is to store the checkpoints in one hdfs cluster and the data

Re: Multiple hdfs

2018-05-22 Thread Raul Valdoleiros
Hi Kien, Thanks for you reply. Your goal is to store the checkpoints in one hdfs cluster and the data in other hdfs cluster. So the flink should be able to connect to two different hdfs clusters. Thanks 2018-05-22 15:00 GMT+01:00 Kien Truong : > Hi, > > If your cluster are not high-availabili

Re: Multiple hdfs

2018-05-22 Thread Kien Truong
Hi, If your cluster are not high-availability clusters then just use the full path to the cluster. For example, to refer to directory /checkpoint on cluster1, use hdfs://namenode1_ip:port/checkpoint Like wise, /data on cluster2 will be hdfs://namenode2_ip:port/data If your cluster is a HA

Re: Batch job stuck in Canceled state in Flink 1.5

2018-05-22 Thread Nico Kruber
Hi Amit, thanks for providing the logs, I'll look into it. We currently have a suspicion of this being caused by https://issues.apache.org/jira/browse/FLINK-9406 which we found by looking over the surrounding code. The RC4 has been cancelled since we see this as a release blocker. To rule out furt

Jobs running on a yarn per-job cluster fail to restart when a task manager is lost

2018-05-22 Thread 杨力
Hi, I am running a streaming job without checkpointing enabled. A failute rate restart strategy have been set with StreamExecutionEvironment.setRestartStrategy. When a task manager is lost because of memory problems, the job manager try to restart the job without launching a new task manager, and

mongodb connector

2018-05-22 Thread Sofer, Tovi
Hi group, Is there a ready to use mongo DB sink? I found online this closed issue of bahir-flink, but didn't find some version of this code to use\download... https://issues.apache.org/jira/browse/FLINK-6573 https://issues.apache.org/jira/browse/BAHIR-133 Thanks, Tovi