Hi users, here I want to collect some use cases for the window join[1],
which is a feature already supported on the DataStream API. The purpose is to
decide whether to also support it on the SQL side. For example, a join of two
tumbling windows might look like this:
```sql
select ... window_start, windo
```
Hi Danny,
You are right, we have already considered the watermark lateness in this
case.
However, our interval join operator has a bug that still produces
records later than the watermark.
I've created an issue[1]; we can discuss it there in Jira.
[1] https://issues.apache.org/jira/browse/FL
Hi Chesnay and Dawid,
I see multiple entries like the following in the log:
2020-08-26 23:46:19,105 WARN
org.apache.flink.runtime.metrics.MetricRegistryImpl - Error while
registering metric: numRecordsIn.
java.lang.IllegalArgumentException: A metric named
ip-99--99-99.taskmanager.container_159605640970
One other thought: some users experiencing this have found it preferable to
increase the checkpoint timeout to the point where it is effectively
infinite. Checkpoints that can't time out are likely to eventually complete,
which is better than landing in the vicious cycle you described.
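If it helps, a minimal sketch of setting that programmatically (the checkpoint
interval below is just an example value; the same timeout can also be set in
flink-conf.yaml):
```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class EffectivelyInfiniteCheckpointTimeout {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Checkpointing must be enabled for the timeout to matter; 60s is an arbitrary interval.
        env.enableCheckpointing(60_000L);
        // Raise the timeout so high that checkpoints effectively never time out.
        env.getCheckpointConfig().setCheckpointTimeout(Long.MAX_VALUE);
    }
}
```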
David
On We
You should begin by trying to identify the cause of the backpressure,
because the appropriate fix depends on the details.
Possible causes that I have seen include:
- the job is inadequately provisioned
- blocking i/o is being done in a user function
- a huge number of timers are firing simultaneo
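For the blocking I/O case in particular, the problematic pattern tends to look
like the hypothetical sketch below (the names and the lookup are made up): a
synchronous call per record, which caps throughput and shows up upstream as
backpressure.
```java
import org.apache.flink.api.common.functions.RichMapFunction;

// Hypothetical enrichment step that blocks on an external system per record.
public class BlockingLookupMap extends RichMapFunction<String, String> {

    @Override
    public String map(String key) throws Exception {
        // One synchronous round trip per element.
        return lookupExternalService(key);
    }

    private String lookupExternalService(String key) throws Exception {
        Thread.sleep(50); // stand-in for a slow blocking network call
        return key + ":enriched";
    }
}
```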
Thanks Andrey,
My question is related to:
The FsStateBackend is encouraged for:
- Jobs with large state, long windows, large key/value states.
- All high-availability setups.
How large is "large state", not counting any overhead added by the framework?
Best,
Vishwas
On Wed, Aug 26, 2020 at 12:10
Hi Vishwas,
> is this quantifiable with respect to JVM heap size on a single node
> without the node being used for other tasks?
I don't quite understand this question. I believe the recommendation in the
docs is there for the same reason: use larger state objects so that the Java
object overhead pays off.
R
Hello,
What version of Flink are you using? If you use 1.10+, please check [1] (the
property names are different).
[1] -
https://ci.apache.org/projects/flink/flink-docs-stable/ops/memory/mem_setup.html
Thanks,
Alexey
From: Sakshi Bansal
Sent: Monday, August 24, 2020 3:30
Hi Adam,
maybe also check your SSL setup in a local cluster to rule out possibly
related k8s issues.
Best,
Andrey
On Wed, Aug 26, 2020 at 3:59 PM Adam Roberts wrote:
> Hey Nico - thanks for the prompt response, good catch - I've just tried
> with the two security options (enabling rest and inte
Hi Vishwas,
I believe the screenshots are from a heap size of 1GB?
There are indeed many internal Flink state objects. They are overhead that
Flink needs in order to organise and track the state on-heap.
Depending on the actual size of your state objects, the overhead may be
relatively large or
Hello,
My Flink application has entered a bad state and I was wondering if I
could get some advice on how to resolve the issue.
The sequence of events that led to a bad state:
1. A failure occurs (e.g., TM timeout) within the cluster
2. The application successfully recovers from the last co
Yes, I'm afraid this analysis is correct. The StreamOperator
(AbstractStreamOperator, to be specific) computes the combined watermark
from both inputs here:
https://github.com/apache/flink/blob/f0ed29c06d331892a06ee9bddea4173d6300835d/flink-streaming-java/src/main/java/org/apache/flink/streaming
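In essence the combination takes the minimum of the two input watermarks; a
simplified sketch of the idea (not the actual Flink code):
```java
// Simplified illustration of two-input watermark combination.
public class CombinedWatermarkTracker {
    private long input1Watermark = Long.MIN_VALUE;
    private long input2Watermark = Long.MIN_VALUE;
    private long combinedWatermark = Long.MIN_VALUE;

    public long processWatermark1(long mark) {
        input1Watermark = mark;
        return advance();
    }

    public long processWatermark2(long mark) {
        input2Watermark = mark;
        return advance();
    }

    private long advance() {
        // The operator can only advance as far as the slower input allows.
        long min = Math.min(input1Watermark, input2Watermark);
        if (min > combinedWatermark) {
            combinedWatermark = min;
        }
        return combinedWatermark;
    }
}
```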
metrics.reporter.grph.class:
org.apache.flink.metrics.graphite.GraphiteReporter
https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/metrics.html#graphite-orgapacheflinkmetricsgraphitegraphitereporter
On 26/08/2020 16:40, Vijayendra Yadav wrote:
Hi Dawid,
I have 1.10.0 vers
I'd recommend then following these instructions from the older docs[1].
The difference is that you should set:
metrics.reporter.grph.class:
org.apache.flink.metrics.graphite.GraphiteReporter
and put the reporter jar into the /lib folder:
In order to use this reporter you must copy
/opt/flink-metrics
Hi Dawid,
I have Flink version 1.10.0. What is the alternative for this version?
Regards,
Vijay
>
> On Aug 25, 2020, at 11:44 PM, Dawid Wysakowicz wrote:
>
>
> Hi Vijay,
>
> I think the problem might be that you are using a wrong version of the
> reporter.
>
> You say you used flink-metr
Hi,
I am trying to investigate a problem with non-released resources in my
application.
I have a stateful application which submits Flink DataSet jobs using code very
similar to the code in CliFrontend.
I noticed that I am getting a lot of non-closed connections to my data store
(HBase in my ca
Hey Nico - thanks for the prompt response, good catch - I've just tried with the two security options (enabling rest and internal SSL communications) and still hit the same problem
I've also tried turning off security (both in my Job definition and in my Flink cluster JobManager/TaskManager setti
Hi, did you try to define a UDAF within your group window SQL, where you can
call your custom service?
I'm afraid you are right: SQL only supports time windows.
Best,
Danny Chan
On August 26, 2020 at 8:02 PM +0800, 刘建刚 wrote:
> For API, we can visit outer service in batch through countWindow,
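As a rough sketch of the UDAF idea suggested above: the function could
accumulate the keys of a group window and call the external service once in
getValue(). The class name and the batched lookup are made up, so treat this
only as an illustration.
```java
import java.util.ArrayList;
import java.util.List;
import org.apache.flink.table.functions.AggregateFunction;

// Hypothetical UDAF: collects keys per group window and queries the external
// service once per window instead of once per record.
public class BatchedLookupUdaf extends AggregateFunction<Integer, List<String>> {

    @Override
    public List<String> createAccumulator() {
        return new ArrayList<>();
    }

    public void accumulate(List<String> acc, String key) {
        acc.add(key);
    }

    @Override
    public Integer getValue(List<String> acc) {
        // Single batched call to the (made-up) external service for the whole window.
        return lookupExternalService(acc);
    }

    private Integer lookupExternalService(List<String> keys) {
        return keys.size(); // stand-in for the real batched lookup
    }
}
```
The trade-off compared to the DataStream countWindow approach is that SQL group
windows are time-based, so the batch size is bounded by time rather than by a
fixed record count.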
For SQL, we always hold back the watermark when we emit the elements; for the
time interval it is:
return Math.max(leftRelativeSize, rightRelativeSize) + allowedLateness;
For your case, the watermark would be held back for 1 hour, so the left join
records would not be late when they are used by subsequent operat
For the API, we can call the external service in batches through countWindow,
as in the following. We call the external service once every 1000 records; if
we called it for every record, it would be very slow for our job.
source.keyBy(new KeySelector())
.countWindow(1000)
.apply((WindowF
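A complete version of that pattern might look roughly like the sketch below;
the types, the key selector and the batched service call are placeholders, not
the original code.
```java
import java.util.ArrayList;
import java.util.List;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;
import org.apache.flink.util.Collector;

// Hypothetical sketch: batch calls to an external service once per 1000 records per key.
public class CountWindowBatching {

    public static DataStream<String> enrich(DataStream<String> source) {
        return source
            .keyBy((KeySelector<String, String>) value -> value)
            .countWindow(1000)
            .apply(new WindowFunction<String, String, String, GlobalWindow>() {
                @Override
                public void apply(String key,
                                  GlobalWindow window,
                                  Iterable<String> records,
                                  Collector<String> out) {
                    List<String> batch = new ArrayList<>();
                    records.forEach(batch::add);
                    // One call to the (made-up) external service for the whole batch.
                    for (String enriched : lookupExternalService(batch)) {
                        out.collect(enriched);
                    }
                }
            });
    }

    private static List<String> lookupExternalService(List<String> batch) {
        return batch; // stand-in for the real batched lookup
    }
}
```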
Thanks ZhuZhu for being the release manager and everyone who contributed to
this release.
Best,
Leonard
Thanks ZhuZhu for managing this release and everyone else who contributed
to this release!
Best,
Congxian
Xingbo Huang wrote on Wednesday, August 26, 2020 at 1:53 PM:
> Thanks Zhu for the great work and everyone who contributed to this release!
>
> Best,
> Xingbo
>
> Guowei Ma wrote on Wednesday, August 26, 2020 at 12:43 PM:
>
>> Hi,
>>
>>
Hi Mu,
I want to share something more about the memory usage of RocksDB.
If you enable managed memory for RocksDB (which is enabled by default) [1], you
should refer to the block cache usage, as we place the index & filter blocks
into the cache and count the write buffer usage against the cache.
We could refer to the usag
Hi Adam,
the Flink binary will pick up any configuration from the flink-conf.yaml in its
conf directory. If that is the same as in the cluster, you wouldn't have to
pass most of your parameters manually. However, if you prefer not having a
flink-conf.yaml in place, you could remove the security.ssl.i
Hi Kien,
I am afraid this is a valid bug. I am not 100% sure, but the way I
understand the code, the idleness mechanism applies to input channels,
which means e.g. when multiple parallel instances shuffle their results
to downstream operators.
In the case of a two-input operator, combining the watermark
Hi,
@Chesnay Schepler The issue is that the uber-jar is
first loaded with Flink's app classloader (because it's in lib) and then,
when the application starts, it gets loaded again in the ChildFirstCL, and
since it's child-first, the class is loaded anyway.
What I don't quite understand is why the
Hi,
I think you are hitting a bug here. It should be possible to do what you are
trying to do. Would you like to open a bug report for it?
However, the bug applies to the legacy batch planner (you are using the
BatchTableEnvironment), which is no longer maintained, and there have already
been discussions to drop it