RE: Resuming Savepoint issue with upgraded Flink version 1.11.2

2020-10-23 Thread Partha Mishra
Hi, None of the operators has been renamed or removed. Testing was carried out with exactly the same binary on 1.9 and 1.11.2. A checkpoint saved in 1.9 cannot be restored in 1.11.2. From: Sivaprasanna Sent: Friday, October 23, 2020 10:57 AM To: Partha Mishra Cc: user@flink.apache.org Sub

Re: Flink 1.8.3 GC issues

2020-10-23 Thread Piotr Nowojski
Hi Josson, Thanks for the great investigation and for coming back to us. Aljoscha, could you help us here? It looks like you were involved in the original BEAM-3087 issue. Best, Piotrek On Fri, 23 Oct 2020 at 07:36, Josson Paul wrote: > @Piotr Nowojski @Nico Kruber > > An update. > > I am able to

Logging when building and testing Flink

2020-10-23 Thread Juha Mynttinen
Hey there, I noticed that when building and testing Flink itself, logging seems to be non-existing or very quiet. I had a look at the logging conf files (such as flink-tests/src/test/resources/log4j2-test.properties) and the pattern seems to be that the logging is turned off in tests. At least it

Re: How to understand NOW() in SQL when using Table & SQL API to develop a streaming app?

2020-10-23 Thread Till Rohrmann
Hi Longdexin, thanks for reaching out to the Flink community. I am pulling in Jark who might be able to help you with this question. Cheers, Till On Thu, Oct 22, 2020 at 2:56 PM Longdexin <274522...@qq.com> wrote: > From my point of view, the value of NOW() function in SQL is certain by the > t

Re: Dependency vulnerabilities with flink 1.11.1 version

2020-10-23 Thread Till Rohrmann
Hi Suchithra, thanks for doing this analysis. I think we should try to upgrade the affected libraries. I have opened issues to do these changes [1, 2, 3, 4, 5]. In the future, it would be great if you could first reach out to priv...@flink.apache.org so that we can fix these problems without drawi

Re: Flink Table SQL and MongoDB connector?

2020-10-23 Thread Till Rohrmann
Hi Dan, afaik Flink does not have a dedicated MongoDB connector (except for the DataSet API which is rather old). Hence I believe that the 2nd option seems to be more promising. Cheers, Till On Thu, Oct 22, 2020 at 6:45 AM Dan Hill wrote: > Has anyone connected these two? > > Looking through p

Re: Flink - Kafka topic null error; happens only when running on cluster

2020-10-23 Thread Timo Walther
Hi Manas, that is a good point. Feel free to open an issue for this. It is not the first time that your question appeared on the mailing list. Regards, Timo On 23.10.20 07:22, Manas Kale wrote: Hi Timo, I figured it out, thanks a lot for your help. Are there any articles detailing the pre-fl

Re: Flink Job Manager Memory Usage Keeps on growing when enabled checkpoint

2020-10-23 Thread Till Rohrmann
Hi Eleanore, how much memory did you assign to the JM pod? Maybe the limit is so high that it takes a bit of time until GC is triggered. Have you tried whether the same problem also occurs with newer Flink versions? The difference between checkpoints enabled and disabled is that the JM needs to d

Re: Job Restart Failure

2020-10-23 Thread Till Rohrmann
Hi Navneeth, sorry for the late reply. To me it looks as if /mnt/checkpoints/150dee2a70cecdd41b63a06b42a95649/chk-52/76363f89-d19f-44aa-aaf9-b33d89ec7c6c has not been mounted to the EC2 machine you are using to run the job. Could you try to log in to the machine when the problem occurs and chec

Re: expected behavior when Flink job cluster exhausted all restarts

2020-10-23 Thread Till Rohrmann
Hi Eleanore, if you want to tolerate JM restarts, then you have to enable HA. W/o HA, a JM restart is effectively a submission of a new job. In order to tell you more about the Task submission rejection by the TaskExecutor, I would need to take a look at the logs of the JM and the rejecting TaskE

Re: savepoint failure

2020-10-23 Thread Till Rohrmann
Hi Rado, it is hard to tell the reason w/o a bit more details. Could you share with us the complete logs of the problematic run? Also the job you are running and the types of the state you are storing in RocksDB and use as events in your job are very important. In the linked SO question, the probl

[SURVEY] Remove Mesos support

2020-10-23 Thread Robert Metzger
Hi all, I wanted to discuss if it makes sense to remove support for Mesos in Flink. It seems that nobody is actively maintaining that component (except for necessary refactorings because of interfaces we are changing), and there are almost no users reporting issues or asking for features. The Apa

Re: [SURVEY] Remove Mesos support

2020-10-23 Thread Konstantin Knauf
Hi Robert, +1 to the plan you outlined. If we were to drop support in Flink 1.13+, we would still support it in Flink 1.12- with bug fixes for some time so that users have time to move on. It would certainly be very interesting to hear from current Flink on Mesos users, on how they see the evolut

Re: [SURVEY] Remove Mesos support

2020-10-23 Thread Xintong Song
+1 for adding a warning in 1.12 about planning to remove Mesos support. With my developer hat on, removing the Mesos support would definitely reduce the maintenance overhead for the deployment and resource management related components. On the other hand, the Flink on Mesos users' voices definite

Re: KryoException UnsupportedOperationException when writing Avro GenericRecords to Parquet

2020-10-23 Thread Till Rohrmann
Hi Averell, it looks as if the org.apache.avro.Schema$Field contains a field which is an unmodifiable collection. The Kryo serializer will try to deserialize this field by creating an unmodifiable collection and then trying to add the elements into it. This will fail. I would recommend using the
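The failure mode described in this reply can be reproduced without Kryo or Avro. The following is a minimal sketch in plain Java, assuming a generic "create the collection, then add the elements" copy strategy; the class name `UnmodifiableCollectionDemo` and the helper `tryRefill` are hypothetical and are not part of Kryo, Avro, or Flink.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; it only illustrates why "instantiate, then add()"
// cannot work for an unmodifiable collection.
class UnmodifiableCollectionDemo {

    // Mimics a generic collection copy: add every element to the target.
    // Returns false if the target rejects modification.
    static boolean tryRefill(Collection<String> target, List<String> elements) {
        try {
            for (String element : elements) {
                target.add(element);
            }
            return true;
        } catch (UnsupportedOperationException e) {
            // Unmodifiable wrappers throw this on every mutating call.
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> elements = Arrays.asList("a", "b");

        Collection<String> mutable = new ArrayList<>();
        Collection<String> unmodifiable =
                Collections.unmodifiableList(new ArrayList<String>());

        System.out.println("mutable target refilled: " + tryRefill(mutable, elements));
        System.out.println("unmodifiable target refilled: " + tryRefill(unmodifiable, elements));
    }
}
```

The sketch only shows the JDK behaviour behind the exception in the subject line; which serializer to use instead is covered by the (truncated) recommendation in the reply itself.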

Re: [SURVEY] Remove Mesos support

2020-10-23 Thread Till Rohrmann
Thanks for starting this survey Robert! I second Konstantin and Xintong in the sense that our Mesos users' opinions should matter most here. If our community is no longer using the Mesos integration, then I would be +1 for removing it in order to decrease the maintenance burden. Cheers, Till On F

Re: [SURVEY] Remove Mesos support

2020-10-23 Thread Kostas Kloudas
+1 for adding a warning about the removal of Mesos support and I would also propose to state explicitly in the warning the version that we are planning to actually remove it (e.g. 1.13 or even 1.14 if we feel it is too aggressive). This will help as a reminder to users and devs about the upcoming

Re: savepoint failure

2020-10-23 Thread Till Rohrmann
Glad to hear that you solved your problem. Afaik Flink should not read the fields of messages and call hashCode on them. Cheers, Till On Fri, Oct 23, 2020 at 2:18 PM Radoslav Smilyanov < radoslav.smilya...@smule.com> wrote: > Hi Till, > > I found my problem. It was indeed related to a mutable ha

A group window expects a time attribute for grouping in a stream environment.

2020-10-23 Thread ??????
I'm learning GroupBy Window Aggregation from the documentation at https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/tableApi.html My code is: https://paste.ubuntu.com/p/GQqR4cqdp6/ The POJO is: https://paste.ubuntu.com/p/CF4yttTGQ4/ I got: A group window expects a time attribute for grouping in

Re: Building Flink on VirtualBox VM failing

2020-10-23 Thread Juha Mynttinen
I'm trying again running the tests, now I have four cores (previously five) and 12 GB RAM (previously 8 GB). I'm still hit by the OOM killer. The command I'm running is: mvn -Dflink.forkCount=1 -Dflink.forkCountTestPackage=1 clean verify [INFO] BUILD FAILURE [INFO] --

how to register TableAggregateFunction?

2020-10-23 Thread ??????
I'm studying the Flat Aggregate part of the documentation. My code is: https://paste.ubuntu.com/p/HmB4q2WJSb/ Could you tell me how to register a TableAggregateFunction? Thanks for your help

Re: [SURVEY] Remove Mesos support

2020-10-23 Thread Piyush Narang
Hi folks, We at Criteo are active users of the Flink on Mesos resource management component. We are pretty heavy users of Mesos for scheduling workloads on our edge datacenters and we do want to continue to be able to run some of our Flink topologies (to compute machine learning short term feat

Re: [SURVEY] Remove Mesos support

2020-10-23 Thread Kostas Kloudas
Thanks Piyush for the message. After this, I revoke my +1. I agree with the previous opinions that we cannot drop code that is actively used by users, especially if it is something as deep in the stack as support for a cluster management framework. Cheers, Kostas On Fri, Oct 23, 2020 at 4:15 PM Piyu

Re: [SURVEY] Remove Mesos support

2020-10-23 Thread Piyush Narang
Thanks Kostas. If there's items we can help with, I'm sure we'd be able to find folks who would be excited to contribute / help in any way. -- Piyush On 10/23/20, 10:25 AM, "Kostas Kloudas" wrote: Thanks Piyush for the message. After this, I revoke my +1. I agree with the previous

Re: [SURVEY] Remove Mesos support

2020-10-23 Thread Robert Metzger
Hey Piyush, thanks a lot for raising this concern. I believe we should keep Mesos in Flink then in the foreseeable future. Your offer to help is much appreciated. We'll let you know once there is something. On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang wrote: > Thanks Kostas. If there's items we

Running flink in a Local Execution Environment for Production Workloads

2020-10-23 Thread Joseph Lorenzini
Hi all, I plan to run Flink jobs as Docker containers in AWS Elastic Container Service. I will have checkpointing enabled, with state stored in an S3 bucket. Each deployment will run in per-job mode. Are there any non-obvious downsides to running these jobs with a local executi

Re: expected behavior when Flink job cluster exhausted all restarts

2020-10-23 Thread Eleanore Jin
Hi Till, thanks a lot for the explanation. I'm using Flink 1.10.2 with Java 11. Thanks! Eleanore On Fri, Oct 23, 2020 at 4:31 AM Till Rohrmann wrote: > Hi Eleanore, > > if you want to tolerate JM restarts, then you have to enable HA. W/o HA, a > JM restart is effectively a submission of a new j

Re: Flink Job Manager Memory Usage Keeps on growing when enabled checkpoint

2020-10-23 Thread Eleanore Jin
Hi Till, Thanks a lot for the prompt response, please see below information. 1. how much memory assign to JM pod? 6g for container memory limit, 5g for jobmanager.heap.size, I think this is the only available jm memory configuration for flink 1.10.2 2. Have you tried with newer Flink versions? I

Re: Flink Job Manager Memory Usage Keeps on growing when enabled checkpoint

2020-10-23 Thread Eleanore Jin
Hi Till, please see the screenshot of heap dump: https://ibb.co/92Hzrpr Thanks! Eleanore On Fri, Oct 23, 2020 at 9:25 AM Eleanore Jin wrote: > Hi Till, > Thanks a lot for the prompt response, please see below information. > > 1. how much memory assign to JM pod? > 6g for container memory limit

Re: [SURVEY] Remove Mesos support

2020-10-23 Thread Lasse Nedergaard
Hi, at Trackunit we have been using Mesos for a long time but have now moved to k8s. Best regards, Lasse Nedergaard > On 23 Oct 2020 at 17:01, Robert Metzger wrote: > > > Hey Piyush, > thanks a lot for raising this concern. I believe we should keep Mesos in > Flink then

Re: Trying to run Flink tests

2020-10-23 Thread Dan Hill
Downgrading to Maven 3.2 shows an error. It seems like I'm hitting flaky tests. I hit one error and then a different error when running again. I'm not blocked now. My diff was already merged and the related tests pass. Neither of these failures looks related to my diff. <<< FAILURE! - in o

FLINK 1.11 Graphite Metrics

2020-10-23 Thread Vijayendra Yadav
Hi Team, with Flink 1.11 Graphite metrics I see the following error in the log. Any suggestions? 2020-10-23 21:55:14,652 ERROR org.apache.flink.runtime.metrics.ReporterSetup- Could not instantiate metrics reporter grph. Metrics might not be exposed/reported. java.lang.ClassNotFound

Re: Flink Job Manager Memory Usage Keeps on growing when enabled checkpoint

2020-10-23 Thread Eleanore Jin
I also tried enabling native memory tracking via jcmd; here is the memory breakdown: https://ibb.co/ssrZB4F Since the job manager memory configuration for Flink 1.10.2 only has jobmanager.heap.size, and it only translates to heap settings, should I also set -XX:MaxDirectMemorySize and -XX:MaxMetaspaceS
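The distinction raised in this message, between a heap setting and the rest of the JVM's memory, can be inspected from inside any JVM with the standard `java.lang.management` API. The sketch below is a generic illustration and not Flink code; the class name `JvmMemoryAreas` and the helper `describe` are hypothetical. It reports heap and non-heap usage separately, and a maximum of -1 means no limit is defined for that area. Direct buffers are not covered by this sketch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class: prints heap and non-heap usage of the current JVM.
class JvmMemoryAreas {

    private static final long MIB = 1024L * 1024L;

    // Formats one memory area; getMax() returns -1 when no maximum is defined.
    static String describe(String area, MemoryUsage usage) {
        String max = usage.getMax() < 0 ? "undefined" : (usage.getMax() / MIB) + " MiB";
        return area + ": used=" + (usage.getUsed() / MIB) + " MiB, max=" + max;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        // The heap is the area bounded by -Xmx.
        System.out.println(describe("heap", memory.getHeapMemoryUsage()));
        // Non-heap covers areas such as metaspace and the code cache.
        System.out.println(describe("non-heap", memory.getNonHeapMemoryUsage()));
    }
}
```

Running this in a JVM started with only a heap limit typically shows a bounded heap and an unbounded non-heap area, which is one way to check whether additional limits are in effect.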

Re: Trying to run Flink tests

2020-10-23 Thread Xintong Song
Hi Dan, I think these are unstable test cases. As we are approaching the feature freeze date for release 1.12.0, people have been busy merging new features recently, which leads to test instability. I'm not aware of any issue reported on the `OrcColumnarRowSplitReaderTest`. From what you described,