[Survey] Demand collection for stream SQL window join

2020-08-26 Thread Danny Chan
Hi, users, here i want to collect some use cases about the window join[1], which is a supported feature on the data stream. The purpose is to make a decision whether to support it also on the SQL side, for example, 2 tumbling window join may look like this: ```sql select ... window_start, windo

Re: flink interval join后按窗口聚组问题

2020-08-26 Thread Benchao Li
Hi Danny, You are right, we have already considered the watermark lateness in this case. However our Interval Join Operator has some bug that will still produce records later than watermark. I've created a issue[1], we can discuss it in the jira issue. [1] https://issues.apache.org/jira/browse/FL

Re: Default Flink Metrics Graphite

2020-08-26 Thread Vijayendra Yadav
Hi Chesnay and Dawid, I see multiple entries as following in Log: 2020-08-26 23:46:19,105 WARN org.apache.flink.runtime.metrics.MetricRegistryImpl - Error while registering metric: numRecordsIn. java.lang.IllegalArgumentException: A metric named ip-99--99-99.taskmanager.container_159605640970

Re: Failures due to inevitable high backpressure

2020-08-26 Thread David Anderson
One other thought: some users experiencing this have found it preferable to increase the checkpoint timeout to the point where it is effectively infinite. Checkpoints that can't timeout are likely to eventually complete, which is better than landing in the vicious cycle you described. David On We

Re: Failures due to inevitable high backpressure

2020-08-26 Thread David Anderson
You should begin by trying to identify the cause of the backpressure, because the appropriate fix depends on the details. Possible causes that I have seen include: - the job is inadequately provisioned - blocking i/o is being done in a user function - a huge number of timers are firing simultaneo

Re: OOM error for heap state backend.

2020-08-26 Thread Vishwas Siravara
Thanks Andrey, My question is related to The FsStateBackend is encouraged for: - Jobs with large state, long windows, large key/value states. - All high-availability setups. How large is large state without any overhead added by the framework? Best, Vishwas On Wed, Aug 26, 2020 at 12:10

Re: OOM error for heap state backend.

2020-08-26 Thread Andrey Zagrebin
Hi Vishwas, is this quantifiable with respect to JVM heap size on a single node > without the node being used for other tasks ? I don't quite understand this question. I believe the recommendation in docs has the same reason: use larger state objects so that the Java object overhead pays off. R

Re: Setting job/task manager memory management in kubernetes

2020-08-26 Thread Alexey Trenikhun
Hello, What version of Flink do you use? If you use 1.10+ please check [1] (different properties names) [1] - https://ci.apache.org/projects/flink/flink-docs-stable/ops/memory/mem_setup.html Thanks, Alexey From: Sakshi Bansal Sent: Monday, August 24, 2020 3:30

Re: Example flink run with security options? Running on k8s in my case

2020-08-26 Thread Andrey Zagrebin
Hi Adam, maybe also check your SSL setup in a local cluster to exclude possibly related k8s things. Best, Andrey On Wed, Aug 26, 2020 at 3:59 PM Adam Roberts wrote: > Hey Nico - thanks for the prompt response, good catch - I've just tried > with the two security options (enabling rest and inte

Re: OOM error for heap state backend.

2020-08-26 Thread Andrey Zagrebin
Hi Vishwas, I believe the screenshots are from a heap size of 1GB? There are indeed many internal Flink state objects. They are overhead which is required for Flink to organise and track the state on-heap. Depending on the actual size of your state objects, the overhead may be relatively large or

Failures due to inevitable high backpressure

2020-08-26 Thread Hubert Chen
Hello, My Flink application has entered into a bad state and I was wondering if I could get some advice on how to resolve the issue. The sequence of events that led to a bad state: 1. A failure occurs (e.g., TM timeout) within the cluster 2. The application successfully recovers from the last co

Re: Idle stream does not advance watermark in connected stream

2020-08-26 Thread Aljoscha Krettek
Yes, I'm afraid this analysis is correct. The StreamOperator, AbstractStreamOperator to be specific, computes the combined watermarks from both inputs here: https://github.com/apache/flink/blob/f0ed29c06d331892a06ee9bddea4173d6300835d/flink-streaming-java/src/main/java/org/apache/flink/streaming

Re: Default Flink Metrics Graphite

2020-08-26 Thread Chesnay Schepler
metrics.reporter.grph.class: org.apache.flink.metrics.graphite.GraphiteReporter https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/metrics.html#graphite-orgapacheflinkmetricsgraphitegraphitereporter On 26/08/2020 16:40, Vijayendra Yadav wrote: Hi Dawid, I have 1.10.0 vers

Re: Default Flink Metrics Graphite

2020-08-26 Thread Dawid Wysakowicz
I'd recommend then following this instruction from older docs[1] The difference are that you should set: |metrics.reporter.grph.class: org.apache.flink.metrics.graphite.GraphiteReporter| and put the reporter jar to the /lib folder: In order to use this reporter you must copy |/opt/flink-metrics

Re: Default Flink Metrics Graphite

2020-08-26 Thread Vijayendra Yadav
Hi Dawid, I have 1.10.0 version of flink. What is alternative for this version ? Regards, Vijay > > On Aug 25, 2020, at 11:44 PM, Dawid Wysakowicz wrote: > >  > Hi Vijay, > > I think the problem might be that you are using a wrong version of the > reporter. > > You say you used flink-metr

Resource leak in DataSourceNode?

2020-08-26 Thread Mark Davis
Hi, I am trying to investigate a problem with non-released resources in my application. I have a stateful application which submits Flink DataSetjobs using code very similar to the code in CliFrontend. I noticed what I am getting a lot of non-closed connections to my data store (HBase in my ca

RE: Example flink run with security options? Running on k8s in my case

2020-08-26 Thread Adam Roberts
Hey Nico - thanks for the prompt response, good catch - I've just tried with the two security options (enabling rest and internal SSL communications) and still hit the same problem   I've also tried turning off security (both in my Job definition and in my Flink cluster JobManager/TaskManager setti

Re: How to visit outer service in batch for sql

2020-08-26 Thread Danny Chan
Hi, did you try to define a UDAF there within your group window sql, where you can have a custom service there. I’m afraid you are right, SQL only supports time windows. Best, Danny Chan 在 2020年8月26日 +0800 PM8:02,刘建刚 ,写道: >       For API, we can visit outer service in batch through countWindow,

Re: flink interval join后按窗口聚组问题

2020-08-26 Thread Danny Chan
For SQL, we always hold back the watermark when we emit the elements, for time interval: return Math.max(leftRelativeSize, rightRelativeSize) + allowedLateness; For your case, the watermark would hold back for 1 hour, so the left join records would not delay when it is used by subsequent operat

How to visit outer service in batch for sql

2020-08-26 Thread 刘建刚
For API, we can visit outer service in batch through countWindow, such as the following. We can visit outer service every 1000 records. If we visit outer service every record, it will be very slow for our job. source.keyBy(new KeySelector()) .countWindow(1000) .apply((WindowF

Re: [ANNOUNCE] Apache Flink 1.10.2 released

2020-08-26 Thread Leonard Xu
Thanks ZhuZhu for being the release manager and everyone who contributed to this release. Best, Leonard

Re: [ANNOUNCE] Apache Flink 1.10.2 released

2020-08-26 Thread Congxian Qiu
Thanks ZhuZhu for managing this release and everyone else who contributed to this release! Best, Congxian Xingbo Huang 于2020年8月26日周三 下午1:53写道: > Thanks Zhu for the great work and everyone who contributed to this release! > > Best, > Xingbo > > Guowei Ma 于2020年8月26日周三 下午12:43写道: > >> Hi, >> >>

Re: Monitor the usage of keyed state

2020-08-26 Thread Yun Tang
Hi Mu I want to share something more about the memory usage of RocksDB. If you enable managed memory for rocksDB (which is enabled by default) [1], you should refer to the block cache usage as we cast index & filter into cache and counting write buffer usage in cache. We could refer to the usag

Re: Example flink run with security options? Running on k8s in my case

2020-08-26 Thread Nico Kruber
Hi Adam, the flink binary will pick up any configuration from the flink-conf.yaml of its directory. If that is the same as in the cluster, you wouldn't have to pass most of your parameters manually. However, if you prefer not having a flink-conf.yaml in place, you could remove the security.ssl.i

Re: Idle stream does not advance watermark in connected stream

2020-08-26 Thread Dawid Wysakowicz
Hi Kien, I am afraid this is a valid bug. I am not 100% sure but the way I understand the code the idleness mechanism applies to input channels, which means e.g. when multiple parallell instances shuffle its results to downstream operators. In case of a two input operator, combining the watermark

Re: Loading FlinkKafkaProducer fails with LinkError

2020-08-26 Thread Arvid Heise
Hi, @Chesnay Schepler The issue is that the uber-jar is first loaded with Flink's app classloader (because it's in lib) and then when the application starts, it gets loaded again in the ChildFirstCL and since it's child-first, the class is loaded anyways. What I don't quite understand is why the

Re: Why consecutive calls of orderBy are forbidden?

2020-08-26 Thread Dawid Wysakowicz
Hi, I think you are hitting a bug here. It should be possible what you are trying to do. Would you like to open a bug for it? However, the bug applies to the legacy batch planner (you are using the BatchTableEnvironment), which is no longer maintained and there were discussions already to drop it