Re: Error during Kafka connection

2017-08-11 Thread Kien Truong
Hi, You mentioned that your kafka broker is behind a proxy. This could be a problem, because when the client tries to get the cluster's topology, it will get the brokers' private addresses, which are not reachable. Regards, Kien On Aug 11, 2017, at 18:18, "Tzu-Li (Gordon) Tai" wrote:
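The usual remedy in that situation is to make the broker advertise an address that external clients can actually reach, and to point the Flink consumer at that same address. Below is a minimal sketch, assuming a hypothetical public hostname and topic name; the broker-side key is Kafka's advertised.listeners setting.

```java
import java.util.Properties;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaPublicAddressExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        // Hypothetical public endpoint; whatever address the client bootstraps from,
        // the *advertised* broker address returned in the metadata must also be reachable.
        props.setProperty("bootstrap.servers", "public.example.com:9092");
        props.setProperty("group.id", "test-group");

        DataStream<String> stream = env.addSource(
                new FlinkKafkaConsumer010<>("my-topic", new SimpleStringSchema(), props));
        stream.print();

        env.execute("Kafka public address example");
        // Broker-side (server.properties), the fix usually looks like:
        //   advertised.listeners=PLAINTEXT://public.example.com:9092
    }
}
```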

Re: No file system found with scheme s3

2017-08-11 Thread Ted Yu
Shouldn't the config key be: org.apache.hadoop.fs.s3.S3FileSystem? Cheers On Fri, Aug 11, 2017 at 5:38 PM, ant burton wrote: > Hello, > > After following the instructions to set the S3 filesystem in the > documentation (https://ci.apache.org/projects/flink/flink-docs- > release-1.3/setup/aws.h

No file system found with scheme s3

2017-08-11 Thread ant burton
Hello, After following the instructions to set the S3 filesystem in the documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/aws.html#set-s3-filesystem) I encountered the
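For reference, the linked Flink 1.3 setup page describes pointing fs.hdfs.hadoopconf in flink-conf.yaml at a Hadoop configuration directory whose core-site.xml declares the S3 implementation class. A rough sketch of such a core-site.xml (values may need adapting to the Hadoop/EMR setup in use):

```xml
<configuration>
  <!-- Map the s3:// scheme to Hadoop's S3A implementation (adjust to your Hadoop version). -->
  <property>
    <name>fs.s3.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>
  <!-- Local buffer directory used by S3A for uploads. -->
  <property>
    <name>fs.s3a.buffer.dir</name>
    <value>/tmp</value>
  </property>
</configuration>
```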

Re: Standalone cluster - taskmanager settings ignored

2017-08-11 Thread Kaepke, Marc
Hi Greg, I guess I restarted the cluster too fast, combined with a high CPU load inside the cluster. I tested it again a few minutes ago and there was no issue! With „$ jps“ I checked whether there were any Java processes -> there weren't. But if the master doesn't know slave5, how can slave5 reconnect to the JobMan

Re: [EXTERNAL] Re: difference between checkpoints & savepoints

2017-08-11 Thread Raja . Aravapalli
Thanks for the discussion. That answered many questions I had. Also, along the same lines, can someone detail the difference between a State Backend & an external checkpoint? Also, through which methods of the programmatic API can we configure those? Regards, Raja. From: Stefan Richter Date: Thursday, Augus
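A minimal sketch of how both are typically set through the programmatic API in Flink 1.3 (the checkpoint path is a placeholder):

```java
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointConfigExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // State backend: where checkpointed state is stored (path is a placeholder).
        env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));

        // Checkpoint every 60s; externalized checkpoints are retained on cancellation,
        // so they can later be used to restore the job, much like a savepoint.
        env.enableCheckpointing(60_000);
        env.getCheckpointConfig().enableExternalizedCheckpoints(
                ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

        // ... build and execute the job here ...
    }
}
```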

FsStateBackend with incremental backup enable does not work with Keyed CEP

2017-08-11 Thread Daiqing Li
Hi, I am running Flink 1.3.1 on EMR, but I am getting this exception after running for a while. java.lang.RuntimeException: Exception occurred while processing valve output watermark: at org.apache.flink.streaming.runtime.io.StreamInputProcessor$ForwardingValveOutputHandler.handleWat

Re: Standalone cluster - taskmanager settings ignored

2017-08-11 Thread Greg Hogan
Hi Marc, By chance did you edit the slaves file before shutting down the cluster? If so, then the removed worker would not be stopped and would reconnect to the restarted JobManager. Greg > On Aug 11, 2017, at 11:25 AM, Kaepke, Marc wrote: > > Hi, > > I have a cluster of 4 dedicated machin

Re: Standalone cluster - taskmanager settings ignored

2017-08-11 Thread Kaepke, Marc
I start my cluster with: bigdata@master:/usr/lib/flink-1.3.2$ ./bin/start-cluster.sh Starting cluster. Starting jobmanager daemon on host master. Starting taskmanager daemon on host master. Starting taskmanager daemon on host slave1. Starting taskmanager daemon on host slave3. And if I stop it: b

Standalone cluster - taskmanager settings ignored

2017-08-11 Thread Kaepke, Marc
Hi, I have a cluster of 4 dedicated machines (no VMs). My previous config was: 1 master and 3 slaves. Each machine provides a task- or jobmanager. Now I want to reduce my cluster and have 1 master and 3 slaves, but one machine provides a jobmanager and one task manager in parallel. I changed al

Re: stream partitioning to avoid network overhead

2017-08-11 Thread Urs Schoenenberger
Hi Karthik, maybe I'm misunderstanding, but there are a few things in your description that seem strange to me: - Your "slow" operator seems to be slow not because it's compute-heavy, but because it's waiting for a response. Is AsyncIO ( https://ci.apache.org/projects/flink/flink-docs-release-1.3
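For reference, a minimal sketch of the async I/O pattern Urs is pointing to, against the Flink 1.3 API; the lookup itself is a made-up stand-in for whatever external call the slow operator is waiting on:

```java
import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;
import org.apache.flink.streaming.api.functions.async.collector.AsyncCollector;

public class AsyncEnrichmentExample {

    // Hypothetical enrichment: look up each element against an external service
    // without blocking the operator thread.
    public static class AsyncLookup extends RichAsyncFunction<String, String> {
        @Override
        public void asyncInvoke(String key, AsyncCollector<String> collector) {
            CompletableFuture
                    .supplyAsync(() -> key + "-enriched")   // stand-in for the remote call
                    .thenAccept(result -> collector.collect(Collections.singleton(result)));
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> input = env.fromElements("a", "b", "c");

        // Up to 100 requests in flight per subtask, 1s timeout, result order not preserved.
        DataStream<String> enriched = AsyncDataStream.unorderedWait(
                input, new AsyncLookup(), 1000, TimeUnit.MILLISECONDS, 100);

        enriched.print();
        env.execute("Async I/O sketch");
    }
}
```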

Re: Classloader and removal of native libraries

2017-08-11 Thread Conrad Crampton
Hi Aljoscha, “Hope that helps”… ABSOLUTELY!!! I have dug through the javacpp source code to find out how the Loader class uses the temp cache location for the native libraries, and in the open method of my RichMapFunction I am now setting the system property to a random location, so if the job restart
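A rough sketch of what that open() workaround might look like; org.bytedeco.javacpp.cachedir is JavaCPP's cache-directory property, while the mapper itself is just illustrative:

```java
import java.nio.file.Files;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class NativeLibMapper extends RichMapFunction<String, String> {

    @Override
    public void open(Configuration parameters) throws Exception {
        // Point JavaCPP's native-library cache at a fresh temp directory per (re)start,
        // so a restarted job does not collide with libraries extracted by a previous run.
        String cacheDir = Files.createTempDirectory("javacpp-cache").toString();
        System.setProperty("org.bytedeco.javacpp.cachedir", cacheDir);
    }

    @Override
    public String map(String value) throws Exception {
        // ... call into the native library here ...
        return value;
    }
}
```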

Re: Using latency markers

2017-08-11 Thread Aljoscha Krettek
Ok, I can also confirm that it doesn't work. I'll look into this and will let you know what I find. Best, Aljoscha > On 11. Aug 2017, at 14:57, Kien Truong wrote: > > Yes, we also tried changing the tracking interval to no avail, still no > latency metric. > Kien > > On 8/11/2017 7:26 PM, Gy

Re: Evolving serializers and impact on flink managed states

2017-08-11 Thread Biplob Biswas
Thanks a ton Stefan, that was really helpful. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Evolving-serializers-and-impact-on-flink-managed-states-tp14777p14837.html Sent from the Apache Flink User Mailing List archive. mailing list arch

Re: Overwrite environment variables in single-job deployment on YARN

2017-08-11 Thread Mariusz Wojakowski
Thanks! It works :) I’ve tried that but I’ve also misspelled ’taskmanager’… Sorry & thanks one more time! Mariusz > On 11 Aug 2017, at 14:28, Aljoscha Krettek wrote: > > Hi, > > Have you tried > > ./bin/flink run -m yarn-cluster -yD > yarn.taskmanager.env.JAVA_HOME=“/opt/jre1.8.0” > > -yD

Re: Using latency markers

2017-08-11 Thread Kien Truong
Yes, we also tried changing the tracking interval to no avail, still no latency metric. Kien On 8/11/2017 7:26 PM, Gyula Fóra wrote: Yes, they are enabled by default I think. Gyula On Fri, Aug 11, 2017, 14:14 Aljoscha Krettek wrote: It seems you have to

Re: Overwrite environment variables in single-job deployment on YARN

2017-08-11 Thread Aljoscha Krettek
Hi, Have you tried ./bin/flink run -m yarn-cluster -yD yarn.taskmanager.env.JAVA_HOME=“/opt/jre1.8.0” -yD can be used to dynamically specify settings. Best, Aljoscha > On 11. Aug 2017, at 13:20, Mariusz Wojakowski wrote: > > Hi, > > I want to run Flink job on YARN and I need to overwrite J

Re: Using latency markers

2017-08-11 Thread Gyula Fóra
Yes, they are enabled by default I think. Gyula On Fri, Aug 11, 2017, 14:14 Aljoscha Krettek wrote: > It seems you have to enable latency tracking > via ExecutionConfig.setLatencyTrackingInterval(...). This will make the > sources emit latency tokens, which then in turn update the latency metri

Re: Using latency markers

2017-08-11 Thread Aljoscha Krettek
It seems you have to enable latency tracking via ExecutionConfig.setLatencyTrackingInterval(...). This will make the sources emit latency tokens, which then in turn update the latency metric: https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html#latency-tracking
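A minimal sketch of enabling it, assuming a 1-second tracking interval:

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LatencyTrackingExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Emit a latency marker from every source every 1000 ms; the markers feed the
        // latency metrics discussed in this thread.
        env.getConfig().setLatencyTrackingInterval(1000L);
        // ... sources, transformations, env.execute(...) ...
    }
}
```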

Re: Evolving serializers and impact on flink managed states

2017-08-11 Thread Stefan Richter
Hi, yes, the assumption is correct. This is not instability, but actually stops the user from corrupting data by attempting an unsupported operation. "Migration required" is the outcome of the compatibility check that would start a migration process. For Avro, the serializer does not have to si

Re: Evolving serializers and impact on flink managed states

2017-08-11 Thread Biplob Biswas
Hi Stefan, Thanks a lot for such a helpful response. That really made things a lot clearer for me. Although at this point I have one more, and probably last, question. According to the Flink documentation, [Attention] Currently, as of Flink 1.3, if the result of the compatibility check acknowledge

Re: Error during Kafka connection

2017-08-11 Thread AndreaKinn
I just tried to use telnet to public ip:port from outside and it works. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-during-Kafka-connection-tp14822p14829.html Sent from the Apache Flink User Mailing List archive. mailing list archiv

Re: Error during Kafka connection

2017-08-11 Thread AndreaKinn
I tried running the console consumer and producer from localhost on the cluster: this tells me that the broker is currently active. To reach the cluster from outside I use a redirect from a public (ip, port), because the ip of the kafka broker is private... I suspect the problem may be there. -- View

Overwrite environment variables in single-job deployment on YARN

2017-08-11 Thread Mariusz Wojakowski
Hi, I want to run a Flink job on YARN and I need to overwrite JAVA_HOME on the Task Managers. In multi-job mode I can do it by passing a parameter when starting the cluster: ./yarn-session.sh -D yarn.taskamanager.env.JAVA_HOME=“/opt/jre1.8.0” I want to overwrite this variable in single-job deployment, bu

Re: Error during Kafka connection

2017-08-11 Thread Tzu-Li (Gordon) Tai
No, there should be no difference between setting it up on Ubuntu or OS X. I can’t really tell anything suspicious from the information provided so far, unfortunately. Perhaps you can first try checking that the Kafka topic is consumable from where you’re running Flink, e.g. using the exampl

Re: Aggregation based on Timestamp

2017-08-11 Thread Tzu-Li (Gordon) Tai
Hi, Yes, this is definitely doable in Flink, and should be very straightforward. Basically, what you would do is define a FlinkKafkaConsumer source for your Kafka topic [1], followed by a keyBy operation on the hostname [2], and then a 1-minute time window aggregation [3]. At the end of your
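A minimal sketch of that pipeline under Flink 1.3, assuming records arrive as simple "hostname,value" strings and that a per-host sum over 1-minute windows is wanted (topic, broker address, and field layout are placeholders):

```java
import java.util.Properties;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class HostAggregationExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092");   // placeholder
        props.setProperty("group.id", "host-agg");

        // [1] Kafka source; each record is assumed to be "hostname,value"
        DataStream<String> raw = env.addSource(
                new FlinkKafkaConsumer010<>("metrics", new SimpleStringSchema(), props));

        DataStream<Tuple2<String, Long>> parsed =
                raw.map(new MapFunction<String, Tuple2<String, Long>>() {
                    @Override
                    public Tuple2<String, Long> map(String line) {
                        String[] parts = line.split(",");
                        return Tuple2.of(parts[0], Long.parseLong(parts[1]));
                    }
                });

        // [2] key by hostname, [3] 1-minute window, summing the value field
        parsed.keyBy(0)
              .timeWindow(Time.minutes(1))
              .sum(1)
              .print();

        env.execute("Per-host 1-minute aggregation");
    }
}
```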

Re: Error during Kafka connection

2017-08-11 Thread AndreaKinn
The Kafka version I use is the latest (0.11.0.0). But to be honest, I also use 0.11.0.0 locally and in that case it works correctly. Anyway, the latest Kafka connector in Flink is designed for Kafka 0.10.x.x. I use OS X locally and Ubuntu on the cluster. Does that matter? -- View this message in

Re: Error during Kafka connection

2017-08-11 Thread Tzu-Li (Gordon) Tai
Hi, AFAIK, Kafka group coordinators are supposed to always be marked dead, because we use static assignment internally and therefore Kafka's group coordination functionality is disabled. Though it may be obvious, to get that out of the way first: are you sure that the Kafka installation ve

Error during Kafka connection

2017-08-11 Thread AndreaKinn
Hi, In the last week I have successfully deployed a Flink program which gets data from a Kafka broker on my local machine. Now I'm trying to do the same thing but moving the Kafka broker to a cluster. I didn't change any line of code; I report it here: DataStream> stream = env