Piotr Nowojski created FLINK-12868:
--
Summary: Yarn cluster can not be deployed if plugins dir does not
exist
Key: FLINK-12868
URL: https://issues.apache.org/jira/browse/FLINK-12868
Project: Flink
Hi all,
Thank you for starting the discussion. To start with, I have to say I am
not entirely against leaving them. On the other hand, I totally disagree
that the semantics are clearly defined. Actually, the design is
fundamentally flawed.
1. We use String as a selector for elements. This is not th
Hi Dawid,
As the select method is only allowed on SplitStreams, it's impossible to
construct the example ds.split().select("a", "b").select("c", "d").
Do you mean ds.split().select("a", "b").split().select("c", "d")?
If so, then the tagging in the first split operation should not affect the
s
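For reference, a minimal sketch of that chaining constraint in the Java
DataStream API (Flink 1.8; the selector logic is made up for illustration):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SplitStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import java.util.Collections;

public class SplitSelectChain {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<Integer> ds = env.fromElements(1, 2, 3, 4);

        // select() is only defined on SplitStream and returns a plain DataStream,
        // so ds.split(...).select("a", "b").select("c", "d") does not compile.
        SplitStream<Integer> first = ds.split(v ->
            Collections.singletonList(v % 2 == 0 ? "a" : "b"));

        // A second select therefore requires a second split in between:
        first.select("a", "b")
            .split(v -> Collections.singletonList(v > 2 ? "c" : "d"))
            .select("c", "d")
            .print();

        env.execute("split-select chain");
    }
}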
Nicolas Fraison created FLINK-12869:
---
Summary: Add yarn acls capability to flink containers
Key: FLINK-12869
URL: https://issues.apache.org/jira/browse/FLINK-12869
Project: Flink
Issue Type
Hi Vino,
Thanks for the proposal, I think it is a very good feature!
One thing I want to make sure of is the semantics of `localKeyBy`. From
the document, the `localKeyBy` API returns an instance of `KeyedStream`
which can also perform sum(), so in this case, what are the semantics of
`localKeyB
Hi Eugene,
I'd say that what you essentially want is not "sort in windows", because
(as you mention) you want to emit elements from windows as soon as the
watermark passes some timestamp. Maybe a better approach would be to
implement this using stateful processing, where you keep a buffer of
(un
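For illustration, a rough sketch of that buffering approach as a
KeyedProcessFunction (assuming string events with timestamps and watermarks
already assigned; late data and state cleanup are ignored for brevity):

import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;
import java.util.ArrayList;
import java.util.List;

// Buffers elements per timestamp and flushes them in timestamp order once
// the watermark passes that timestamp.
public class SortFunction extends KeyedProcessFunction<String, String, String> {

    private transient MapState<Long, List<String>> buffer;

    @Override
    public void open(Configuration parameters) {
        buffer = getRuntimeContext().getMapState(new MapStateDescriptor<>(
            "buffer", Types.LONG, Types.LIST(Types.STRING)));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
        long ts = ctx.timestamp();
        List<String> atTs = buffer.get(ts);
        if (atTs == null) {
            atTs = new ArrayList<>();
        }
        atTs.add(value);
        buffer.put(ts, atTs);
        // The event-time timer fires once the watermark passes this timestamp.
        ctx.timerService().registerEventTimeTimer(ts);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        for (String value : buffer.get(timestamp)) {
            out.collect(value);
        }
        buffer.remove(timestamp);
    }
}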
Alexander Fedulov created FLINK-12870:
-
Summary: Improve documentation of keys schema evolution
Key: FLINK-12870
URL: https://issues.apache.org/jira/browse/FLINK-12870
Project: Flink
Issu
Hi Zili,
thank you for adding these threads :) I would have otherwise picked them up
next week, just couldn't put everything into one email.
Cheers,
Konstantin
On Sun, Jun 16, 2019 at 11:07 PM Zili Chen wrote:
> Hi Konstantin and all,
>
> Thank Konstantin very much for reviving this tradition
Hi Hequn,
Thanks for your reply.
The purpose of the localKeyBy API is to provide a tool which lets users do
pre-aggregation locally. The behavior of the pre-aggregation is
similar to the keyBy API.
So the three cases are different; I will describe them one by one:
1. input.keyBy(0).sum(1)
*In
Hi all,
In the review of the PR for FLINK-12473, there were a few comments regarding
pipeline exportation. We would like to start a follow-up discussion to
address some related comments.
Currently, FLIP-39 proposal gives a way for users to persist a pipeline in
JSON format. But it does not specify h
Good to know that you have solved your problem :)
Piotrek
> On 15 Jun 2019, at 00:48, Timothy Farkas wrote:
>
> Resolved the issue. I was running a very old version of macos, after
> upgrading to Mojave the issue disappeared. Disk usage stopped spiking and I
> stopped running out of disk space.
On a related subject, it would be interesting to have the capability to encrypt
savepoints. That would allow processing and storing of sensitive data in Flink.
-----Original Message-----
From: Tzu-Li (Gordon) Tai
Sent: Monday, 17 June 2019 04:15
To: dev
Subject: Re: [DISCUSS] FLIP-41: Unify K
Is there any possibility to have something like Apache Livy [1] also for
Flink in the future?
[1] https://livy.apache.org/
On Tue, Jun 11, 2019 at 5:23 PM Jeff Zhang wrote:
> >>> Any API we expose should not have dependencies on the runtime
> (flink-runtime) package or other implementation det
Yes, you are correct. The problem I described applies to split, not
select, as I wrote in the first email. Sorry for that.
I will try to prepare a correct example. Let's have a look at this example:
val splitted1 = ds.split(x => if (x == 1) Seq("a") else Seq.empty)
val splitted2 = ds.split(x => if (x != 1) Seq("a") else Seq.empty)
I
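For reference, the same example in the Java API (Flink 1.8); whether each
select sees only the elements tagged by its own split is exactly the
question discussed in this thread:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SplitStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import java.util.Collections;

public class DoubleSplitExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<Integer> ds = env.fromElements(1, 2, 3);

        // Two independent splits over the same stream, both tagging with "a".
        SplitStream<Integer> splitted1 = ds.split(v ->
            v == 1 ? Collections.singletonList("a") : Collections.<String>emptyList());
        SplitStream<Integer> splitted2 = ds.split(v ->
            v != 1 ? Collections.singletonList("a") : Collections.<String>emptyList());

        splitted1.select("a").print();
        splitted2.select("a").print();
        env.execute("double split");
    }
}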
Hi Jeff and Flavio,
Thanks a lot, Jeff, for proposing the design document.
We are also working on refactoring ClusterClient to allow flexible and
efficient job management in our real-time platform.
We would like to draft a document to share our ideas with you.
I think it's a good idea to have some
Hi Vino,
Thanks for the proposal.
Regarding "input.keyBy(0).sum(1)" vs
"input.localKeyBy(0).countWindow(5).sum(1).keyBy(0).sum(1)", have you done
any benchmarks?
I'm curious how much performance improvement we can get by
using a count window as the local operator.
Best,
Jark
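For readers skimming the thread, the two pipelines being compared are
roughly the following (localKeyBy is the API proposed in the design
document, not an existing Flink call; word-count-style Tuple2<String, Long>
input is assumed):

// Baseline: every record is shuffled to the downstream aggregating task
// for its key, so a hot key loads a single task.
input.keyBy(0).sum(1)

// Proposal: pre-aggregate locally in count windows of 5 records, then
// aggregate the partial results globally.
input.localKeyBy(0).countWindow(5).sum(1).keyBy(0).sum(1)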
Nico Kruber created FLINK-12871:
---
Summary: Wrong SSL setup examples in docs
Key: FLINK-12871
URL: https://issues.apache.org/jira/browse/FLINK-12871
Project: Flink
Issue Type: Bug
Comp
Piotr Nowojski created FLINK-12872:
--
Summary: WindowOperator may fail with
UnsupportedOperationException when merging windows
Key: FLINK-12872
URL: https://issues.apache.org/jira/browse/FLINK-12872
P
+1 from my side.
Best,
Congxian
Tzu-Li (Gordon) Tai wrote on Monday, June 17, 2019, at 10:20 AM:
> Hi Flink devs,
>
> I want to officially start a voting thread to formally adopt FLIP-41 [1].
>
> There are two relevant discussions threads for this feature [2] [3].
>
> The voting time will end on June 19th 17:00 CE
Hi Dawid,
Thanks a lot for your example.
I think most users will expect splitted1 to be empty in the example.
The unexpected results produced are, in my opinion, due to our problematic
implementation rather than confusing semantics.
We can fix the problem if we add a SELECT operator to filter
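One way to read that suggestion, sketched as a hypothetical helper
(selectByTag is made up, not Flink API): implement select as an explicit
filter over the tags assigned by its own split, so that two splits on the
same stream cannot interfere:

import org.apache.flink.streaming.api.collector.selector.OutputSelector;
import org.apache.flink.streaming.api.datastream.DataStream;

public class SelectAsFilter {
    // Re-applies the selector of its own split and keeps only elements
    // carrying the requested tag.
    public static <T> DataStream<T> selectByTag(
            DataStream<T> input, OutputSelector<T> selector, String tag) {
        return input.filter(v -> {
            for (String assigned : selector.select(v)) {
                if (tag.equals(assigned)) {
                    return true;
                }
            }
            return false;
        });
    }
}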
Andrey Zagrebin created FLINK-12873:
---
Summary: Create a separate maven module for Shuffle API
Key: FLINK-12873
URL: https://issues.apache.org/jira/browse/FLINK-12873
Project: Flink
Issue Ty
Hi Jan
Thanks for the quick reply.
Doing a stateful transformation requires re-writing the same logic which is
already defined in Flink itself. Let's consider the example from my original
message:
There can be out-of-order data -> data should be propagated to the next
operator only when the watermark crosses
+1
With the restriction that it should be the "canonical format"/"unified format" (or
something like it) and not the savepoint format, i.e. not
KeyedBackendSavepointStrategyBase in the doc, for example
Aljoscha
> On 17. Jun 2019, at 14:05, Congxian Qiu wrote:
>
> +1 from my side.
> Best,
> Congxian
Timo Walther created FLINK-12874:
Summary: Improve the semantics of zero length character strings
Key: FLINK-12874
URL: https://issues.apache.org/jira/browse/FLINK-12874
Project: Flink
Issue
Hi all,
Thanks for sharing your thoughts on this topic.
First, we must admit that the current implementation of split/select is
flawed. I roughly went through the source code; the problem may be that for
consecutive select/split calls, the former one will be overridden by the latter one
during S
Hi all,
The scheduled meetup is only about a week away. Please note that RSVP at
meetup.com is required. In order for us to get the actual headcount to
prepare for the event, please sign up as soon as possible if you plan to
join. Thank you very much for your cooperation.
Regards,
Xuefu
On Thu,
Bowen Li created FLINK-12875:
Summary: support char, varchar, timestamp, date, decimal in input
arg conversion for Hive functions
Key: FLINK-12875
URL: https://issues.apache.org/jira/browse/FLINK-12875
Pr
Hi Vino,
Thanks for the proposal, I like the general idea and IMO it's a very useful
feature.
But after reading through the document, I feel that we may be over-designing
the operator required for proper local aggregation. The main reason is we want to
have a clear definition and behavior for the "local
Hi all,
Thanks a lot for the discussion. I'm also in favor of rewriting/redesigning the
split/select API instead of removing it. It is a consensus that the
side output API can achieve all the functionality of the split/select API.
The question is whether we should also support some eas
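For context, this is what the side-output equivalent of a basic
split/select looks like (tag names are made up):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SideOutputSplit {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Anonymous subclass so Flink can extract the tag's type information.
        final OutputTag<Integer> oddTag = new OutputTag<Integer>("odd") {};

        SingleOutputStreamOperator<Integer> evens = env.fromElements(1, 2, 3, 4)
            .process(new ProcessFunction<Integer, Integer>() {
                @Override
                public void processElement(Integer value, Context ctx, Collector<Integer> out) {
                    if (value % 2 == 0) {
                        out.collect(value);         // main output: evens
                    } else {
                        ctx.output(oddTag, value);  // side output: odds
                    }
                }
            });

        DataStream<Integer> odds = evens.getSideOutput(oddTag);
        evens.print();
        odds.print();
        env.execute("side-output split");
    }
}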
Zhu Zhu created FLINK-12876:
---
Summary: Adapt region failover NG for legacy scheduler
Key: FLINK-12876
URL: https://issues.apache.org/jira/browse/FLINK-12876
Project: Flink
Issue Type: Sub-task
Hi Jark,
We have done a comparative test. The effect is obvious.
From our observation, the optimization effect mainly depends on two factors:
- the degree of the skew: this factor depends on the user's business;
- the size of the window: localKeyBy supports all the types of window
which provid
Bowen Li created FLINK-12877:
Summary: Unify catalog database implementations and remove
CatalogDatabase interfaces
Key: FLINK-12877
URL: https://issues.apache.org/jira/browse/FLINK-12877
Project: Flink
Hi Kurt,
Thanks for your comments.
It seems we both implemented a local aggregation feature to mitigate the
issue of data skew.
However, IMHO, the API level at which the optimization applies is different.
Your optimization benefits Flink SQL and is not user-facing. (If I
understand it incorrectly, ple
godfrey he created FLINK-12878:
--
Summary: Add travis profile for
flink-table-planner-blink/flink-table-runtime-blink
Key: FLINK-12878
URL: https://issues.apache.org/jira/browse/FLINK-12878
Project: Flink
Hi Vino,
Now I feel that we may have different understandings about what kind of
problems or improvements you want to
resolve. Currently, most of the feedback focuses on *how to do a
proper local aggregation to improve performance
and maybe solve the data skew issue*. And my gut feeling is
Liya Fan created FLINK-12879:
Summary: Improve the performance of AbstractBinaryWriter
Key: FLINK-12879
URL: https://issues.apache.org/jira/browse/FLINK-12879
Project: Flink
Issue Type: Improveme
Hi Kurt,
Thanks for your reply.
Actually, I am not against you raising your design.
From your earlier description, I can only imagine that your high-level
implementation is about SQL and that the optimization is internal to the API. Is it
automatic? How do you provide the configuration option for triggering the
pre
Hi all,
I think we are getting closer to a consensus. I think most of us already
agree that the current behavior is broken. The remaining difference I
see is that I think those problems are caused by the design of the
split/select method. The current contract of the split method is that it
is actu
Yeah, sorry for not expressing myself clearly. I will try to provide more
details to make sure we are on the same page.
For the DataStream API, it shouldn't be optimized automatically. You have to
explicitly call an API to do local aggregation
and to specify the trigger policy of the local aggregation. Take
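As a rough illustration of explicitly calling an operator with a
count-based trigger policy, local aggregation can be hand-rolled today with
a chained function that flushes partial sums every N records (a sketch
only, not the proposed localKeyBy API; the heap buffer is not checkpointed,
so in-flight partials would be lost on failure):

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;
import java.util.HashMap;
import java.util.Map;

// Pre-aggregates sums per key inside each subtask and emits the partial
// results once FLUSH_THRESHOLD records have been buffered.
public class LocalPreAggregator
        implements FlatMapFunction<Tuple2<String, Long>, Tuple2<String, Long>> {

    private static final int FLUSH_THRESHOLD = 1000;
    private final Map<String, Long> partials = new HashMap<>();
    private int seen = 0;

    @Override
    public void flatMap(Tuple2<String, Long> value, Collector<Tuple2<String, Long>> out) {
        partials.merge(value.f0, value.f1, Long::sum);
        if (++seen >= FLUSH_THRESHOLD) {
            for (Map.Entry<String, Long> e : partials.entrySet()) {
                out.collect(Tuple2.of(e.getKey(), e.getValue()));
            }
            partials.clear();
            seen = 0;
        }
    }
}

// Usage: input.flatMap(new LocalPreAggregator()).keyBy(0).sum(1)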
Hi dev,
I noticed that all the Travis tests triggered by pull requests fail
with the same error:
"Cached flink dir /home/travis/flink_cache/x/flink does not exist.
Exiting build."
Does anyone have a clue about what happened and how to fix this?
Best,
Kurt