We are eagerly waiting for
- Extended Streaming Sinks:
  - Bucketing Sink should support S3 properly (compensate for eventual
    consistency), work with Flink's shaded S3 file systems, and efficiently
    support formats that compress/index across individual rows (Parquet, ORC,
    ...)
Especially for O
One more, since we have to deal with it often:
- Idling sources (Kafka in particular) and proper watermark propagation:
FLINK-5018 / FLINK-5479
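
To illustrate why idle sources matter for watermark propagation, here is a minimal sketch. It assumes the usual rule that a downstream watermark is the minimum over all input partitions. The function name, the partition names, and the MIN_WATERMARK constant are hypothetical and are not Flink API; the sketch does not describe what the tickets above actually implement.

```python
from typing import Dict, Optional, Set

# Stand-in for "no watermark emitted yet" (hypothetical constant for this sketch).
MIN_WATERMARK = -(2 ** 63)


def combined_watermark(partition_watermarks: Dict[str, int],
                       idle_partitions: Set[str]) -> Optional[int]:
    """Return the minimum watermark over partitions not marked idle.

    Returns None when every partition is idle, since there is then
    nothing to base a watermark on.
    """
    active = [wm for p, wm in partition_watermarks.items()
              if p not in idle_partitions]
    return min(active) if active else None


# p2 has never received data, so it still reports the initial watermark.
watermarks = {"p0": 1000, "p1": 1500, "p2": MIN_WATERMARK}

# Without idleness handling, the silent partition holds everything back.
stalled = combined_watermark(watermarks, set())

# With p2 marked idle, the watermark follows the slowest active partition.
advancing = combined_watermark(watermarks, {"p2"})

print(stalled == MIN_WATERMARK, advancing)
```

In this sketch, a single partition without data keeps the combined watermark at its initial value, so event-time windows downstream would never fire until that partition is excluded.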
On Fri, Jun 8, 2018 at 2:58 PM, Elias Levy
wrote:
> Since wishes are free:
>
> - Standalone cluster job isolation: https://issues.
> apache.org/jira
The error for core-default.xml is interesting.
Flink doesn't have this file. It probably came with YARN. Please check the
Hadoop version Flink was built with versus the Hadoop version in your cluster.
Thanks
Original message From: Garvit Sharma
Date: 6/16/18 7:23 AM (GMT-08:0
Hi Fabian,
I'm still eager to expose # of active sessions as a key metric of our service
but I haven’t figured it out yet.
First of all, I want to ask you some questions regarding your suggestion.
> You could implement a Trigger that fires when a new window is created and
> when the window is c
I am not able to figure this out and have been badly stuck on it for the past
week. Any help would be appreciated.
2018-06-16 19:25:10,523 DEBUG
org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator -
Parallelism set: 1 for 8
2018-06-16 19:25:10,578 DEBUG
org.apache.flink.streaming.api.gra
Hi mates, since the 1.5 release, BucketingSink has the ability to configure
the suffix of the part file. It’s very useful when it’s necessary to set a
specific file extension.
While using it, I’ve found an issue: when a new part file is created, it has
the same part index as the index of the just closed f
Hi Jorn, please find the source at https://github.com/swyow/flink_sample_git
Thank you!
From: Jörn Franke
Sent: Saturday, June 16, 2018 6:03 PM
To: Siew Wai Yow
Cc: user@flink.apache.org
Subject: Re: Flink application does not scale as expected, please help!
Can y
Can you share the app source on gitlab, github or bitbucket etc?
> On 16. Jun 2018, at 11:46, Siew Wai Yow wrote:
>
> Hi, there is an interesting finding: the reason low parallelism works much
> better is that all tasks run in the same TM. Once we scale out, the tasks
> are distributed t
Hi, there is an interesting finding: the reason low parallelism works much
better is that all tasks run in the same TM. Once we scale out, the tasks
are distributed to different TMs and performance is worse than in the low
parallelism case. Is this expected? The more I scale the less
Hi Jorn, the input data is 1 KB per record. In production there will be 10
billion records per day, and that will increase, so scalability is quite
important for us to handle more data. Unfortunately, this does not work as
expected even with only 10 million records of test data. The test application is
How large is the input data? If the input data is very small, then it does not
make sense to scale it even more. The larger the data, the more parallelism
you will have. You can, of course, modify this behavior by changing the
partitioning of the Dataset.
> On 16. Jun 2018, at 10:41, Siew Wai Yow