Re: Flink EventTime Processing Watermark is always coming as 9223372036854725808

2020-03-03 Thread aj
Thanks, Robert for mentioning this, I will take care of it in future posts. I am able to figure out the issue. When I disable checkpoint then the watermark is getting updated and its working. I need to understand 2 things : 1. Please help to understand what is happening when I enable checkpointin

Re: StreamingFileSink Not Flushing All Data

2020-03-03 Thread Kostas Kloudas
Hi Austin and Rafi, @Rafi Thanks for providing the pointers! Unfortunately there is no progress on the FLIP (or the issue). @ Austin In the meantime, what you could do --assuming that your input is bounded -- you could simply not stop the job after the whole input is processed, then wait until t

Re: Operator latency metric not working in 1.9.1

2020-03-03 Thread orips
Thanks for the response. In 1.5 the docs also state that it should be enabled [1], however, it always worked without setting latencyTrackingInterval [1] https://ci.apache.org/projects/flink/flink-docs-release-1.5/monitoring/metrics.html#latency-tracking -- Sent from: http://apache-flink-user-m

Unable to recover from savepoint and checkpoint

2020-03-03 Thread Puneet Kinra
Hi Stuck with the simple program regarding the checkpointing Flink version I am using 1.10.0 *Here I have created DummySource for testing* *DummySource* package com.nudge.stateful; import org.apache.flink.api.java.tuple.Tuple2; import org.apache.flink.streaming.api.functions.source.SourceFuncti

Re: Unable to recover from savepoint and checkpoint

2020-03-03 Thread Puneet Kinra
Sorry for the missed information On recovery the value is coming as false instead of true, state.backend has been configured in flink-conf.yaml along the the path for checkpointing and savepoint. On Tue, Mar 3, 2020 at 3:34 PM Puneet Kinra < puneet.ki...@customercentria.com> wrote: > Hi > > Stu

Alink and Flink ML

2020-03-03 Thread Flavio Pompermaier
Hi to all, since Alink has been open sourced, is there any good reason to keep both Flink ML and Alink? >From what I understood Alink already contains the best ML implementation available for Flink..am I wrong? Maybe it could make sense to replace the current Flink ML with that of Alink..or is that

java.util.concurrent.ExecutionException

2020-03-03 Thread kant kodali
Hi All, I am just trying to read edges which has the following format in Kafka 1,2 1,3 1,5 using the Table API and then converting to DataStream of Edge Objects and printing them. However I am getting java.util.concurrent.ExecutionException but not sure why? Here is the sample code import org.

zookeeper.connect is not needed but Flink requires it

2020-03-03 Thread kant kodali
Hi All, The zookeeper.connect is not needed for KafkaConsumer or KafkaAdminClient however Flink requires it. You can also see in the Flink TaskManager logs the KafkaConsumer is not recognizing this property anyways. bsTableEnv.connect( new Kafka() .property("bootstrap.servers", "local

Re: Operator latency metric not working in 1.9.1

2020-03-03 Thread Gary Yao
Hi, There is a release note for Flink 1.7 that could be relevant for you [1] Granularity of latency metrics The default granularity for latency metrics has been modified. To restore the previous behavior users have to explicitly set the granularity to subtask. Best, Gary [1] https://ci.

Re: Unable to recover from savepoint and checkpoint

2020-03-03 Thread Gary Yao
Hi Puneet, Can you describe how you validated that the state is not restored properly? Specifically, how did you introduce faults to the cluster? Best, Gary On Tue, Mar 3, 2020 at 11:08 AM Puneet Kinra < puneet.ki...@customercentria.com> wrote: > Sorry for the missed information > > On recovery

Re: Alink and Flink ML

2020-03-03 Thread Gary Yao
Hi Flavio, I am looping in Becket (cc'ed) who might be able to answer your question. Best, Gary On Tue, Mar 3, 2020 at 12:19 PM Flavio Pompermaier wrote: > Hi to all, > since Alink has been open sourced, is there any good reason to keep both > Flink ML and Alink? > From what I understood Alink

Re: Operator latency metric not working in 1.9.1

2020-03-03 Thread orips
Thanks, that makes sense! In addition, I've just found the reason for this in the code: This is 1.5 (default value is 2000L): https://github.com/apache/flink/blob/release-1.5/flink-core/src/main/java/org/apache/flink/configuration/MetricOptions.java#L109 This is 1.9 (default value is 0L) https:/

Re: zookeeper.connect is not needed but Flink requires it

2020-03-03 Thread Jark Wu
Hi Kant, You are right. It is not needed since Kafka 0.9+. We already have an issue to make it optional. https://issues.apache.org/jira/browse/FLINK-16125 Best, Jark On Tue, 3 Mar 2020 at 20:17, kant kodali wrote: > Hi All, > > The zookeeper.connect is not needed for KafkaConsumer or KafkaAdmi

[Survey] Default size for the new JVM Metaspace limit in 1.10

2020-03-03 Thread Andrey Zagrebin
Hi All, Recently, FLIP-49 [1] introduced the new JVM Metaspace limit in the 1.10 release [2]. Flink scripts, which start the task manager JVM process, set this limit by adding the corresponding JVM argument. This has been done to properly plan resources. especially to derive container size for Yar

Re: Providing hdfs name node IP for streaming file sink

2020-03-03 Thread Vishwas Siravara
Thanks Yang. Going with setting the HADOOP_CONF_DIR in the flink application. It integrates neatly with flink. Best, Nick. On Mon, Mar 2, 2020 at 7:42 PM Yang Wang wrote: > It may work. However, you need to set your own retry policy(similar as > `ConfiguredFailoverProxyProvider` in hadoop). > A

Re: java.util.concurrent.ExecutionException

2020-03-03 Thread Gary Yao
Hi, Can you post the complete stacktrace? Best, Gary On Tue, Mar 3, 2020 at 1:08 PM kant kodali wrote: > Hi All, > > I am just trying to read edges which has the following format in Kafka > > 1,2 > 1,3 > 1,5 > > using the Table API and then converting to DataStream of Edge Objects and > printi

Re: java.util.concurrent.ExecutionException

2020-03-03 Thread kant kodali
The program finished with the following exception: org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: f57b682f5867a8bf6ff6e1ddce93a1ab) at org.apache.flink.client.program.Pack

Re: Flink Stream Sink ParquetWriter Failing with ClassCastException

2020-03-03 Thread Till Rohrmann
Hi Anuj, if you use the exact same schema with which the data has been written for reading and if there is no bug in the parquet Avro support, then it should indeed not fail. Hence, I suspect that the producer of your data might produce slightly different Avro records compared to what Parquet is e

Re: java.util.concurrent.ExecutionException

2020-03-03 Thread kant kodali
Hi Gary, This has to do with my Kafka. After restarting Kafka it seems to work fine! Thanks! On Tue, Mar 3, 2020 at 8:18 AM kant kodali wrote: > The program finished with the following exception: > > > org.apache.flink.client.program.ProgramInvocationException: The main > method caused an erro

RE: Building with Hadoop 3

2020-03-03 Thread LINZ, Arnaud
Hello, Have you shared it somewhere on the web already? Best, Arnaud De : vino yang Envoyé : mercredi 4 décembre 2019 11:55 À : Márton Balassi Cc : Chesnay Schepler ; Foster, Craig ; user@flink.apache.org; d...@flink.apache.org Objet : Re: Building with Hadoop 3 Hi Marton, Thanks for your exp

Re: java.util.concurrent.ExecutionException

2020-03-03 Thread Gary Yao
Hi, Thanks for getting back, and I am glad that you were able to resolve the issue. The root cause in the stacktrace you posted also indicates a problem related to Kafka: Caused by: org.apache.flink.kafka.shaded.org.apache.kafka.common.errors.TimeoutException: Timeout of 6ms expired befor

Re: Single stream, two sinks

2020-03-03 Thread John Smith
If I understand correctly he wants the HTTP result in the DB. So I do not think side output works here. The DB would have to be the sink. Also sinks in Flink are the final destination. So it would have to be RabbitMQ -> Some Cool Business Logic Operators Here > Async I/O HTTP Operator

Flink Session Windows State TTL

2020-03-03 Thread karl.pullicino
Hi all,We have an Apache Flink application which generates player sessions based on player events keyed by playerId. Sessions are based on EventTime. A session is created on first event event for that player and closes if there are 30 mins of inactivity. Events are merged in our custom /PlayerSessi

Re: Flink Session Windows State TTL

2020-03-03 Thread karl.pullicino
Added flink_oom_exception.txt as originally forgot to attach it -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

How to print the aggregated state everytime it is updated?

2020-03-03 Thread kant kodali
Hi All, I have a custom aggregated state that is represent by Set and I have a stream of values coming in from Kafka where I inspect, compute the custom aggregation and store it in Set. Now, I am trying to figureout how do I print the updated value everytime this state is updated? Imagine I have

Should I use a Sink or Connector? Or Both?

2020-03-03 Thread Castro, Fernando C.
Hello folks! I’m new to Flink and data streaming in general, just initial FYI ;) I’m currently doing this successfully: 1 - streaming data from Kafka in Flink 2 - aggregating the data with Flink’s sqlQuery API 3 - outputting the result of #2 into STDOUT via toRetreatStream() My objective is to ch

JobMaster does not register with ResourceManager in high availability setup

2020-03-03 Thread Bajaj, Abhinav
Hi, We recently came across an issue where JobMaster does not register with ResourceManager in Fink high availability setup. Let me share the details below. Setup * Flink 1.7.1 * K8s * High availability mode with a single Jobmanager and 3 zookeeper nodes in quorum. Scenario *

Re: StreamingFileSink Not Flushing All Data

2020-03-03 Thread Austin Cawley-Edwards
Hi all, Thanks for the docs pointer/ FLIP Rafi, and the workaround strategy Kostas -- strange though, as I wasn't using a bounded source when I first ran into this issue. I have updated the example repo to use an unbounded source[1], and the same file corruption problems remain. Anything else I c

Re: Should I use a Sink or Connector? Or Both?

2020-03-03 Thread John Smith
The sink if for Streaming API, it looks like you are using SQL and tables. So you can use the connector to output the table result to Elastic. Unless you want to convert from table to stream first. On Tue, 3 Mar 2020 at 16:25, Castro, Fernando C. wrote: > Hello folks! I’m new to Flink and data s

Very large _metadata file

2020-03-03 Thread Jacob Sevart
Per the documentation: "The meta data file of a Savepoint contains (primarily) pointers to all files on stable storage that are part of the Savepoint, in form of absolute paths." I somehow have a _metadata file that's 1.9GB. Running *strings *on it I find 962 strings, most of which look like HDFS

Re: Should I use a Sink or Connector? Or Both?

2020-03-03 Thread Jark Wu
John is right. Could you provide more detailed code? So that we can help to investigate. Best, Jark On Wed, 4 Mar 2020 at 06:20, John Smith wrote: > The sink if for Streaming API, it looks like you are using SQL and tables. > So you can use the connector to output the table result to Elastic.

Re: JobMaster does not register with ResourceManager in high availability setup

2020-03-03 Thread Xintong Song
Hi Abhinav, The JobMaster log "Connecting to ResourceManager ..." is printed after JobMaster retrieve ResourceManager address from ZooKeeper. In your case, I assume there's some ZK problem that JM cannot resolve RM address. Have you confirmed whether the ZK pods are recovered after the second di

Flink Web UI display nothing in k8s when use ingress

2020-03-03 Thread LakeShen
Hi community, now we plan to move all flink tasks to k8s cluster. For one flink task , we want to see this flink task web ui . First , we create the k8s Service to expose 8081 port of jobmanager, then we use ingress controller so that we can see it outside.But the flink web like this : [im

Re: Flink Web UI display nothing in k8s when use ingress

2020-03-03 Thread LakeShen
In my thought , I think I should config the correct flink jobserver for flink task LakeShen 于2020年3月4日周三 下午2:07写道: > Hi community, > now we plan to move all flink tasks to k8s cluster. For one flink > task , we want to see this flink task web ui . First , we create the k8s > Service to e

Re:Flink Web UI display nothing in k8s when use ingress

2020-03-03 Thread ouywl
Hi lake:    Ok, Show the jobmanager pod logs, Can you see the jm pods is running ok? Try use cube-proxy, or NodePort, That you can see the webUI? Best,Ouywl On 03/4/