But on the Kafka source level it should be perfectly fine to do what Elias
proposed. This is of course not the perfect solution but could bring us
forward quite a bit. The changes required for this should also be minimal.
This would become obsolete once we have something like shared state. But
u
The reason this selective reading doesn't work well in Flink at the moment is
checkpointing. For checkpointing, checkpoint barriers travel within
the streams. If we selectively read from inputs based on timestamps, this is
akin to blocking an input if that input is very far ahead in ev
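The selective-reading idea discussed above can be sketched in plain Java (hypothetical names, not Flink code): always consume from the input whose next record carries the lowest timestamp. The sketch also makes the checkpointing concern concrete, since holding back the "fast" input in this way would also hold back any checkpoint barrier travelling inside it.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

// Minimal sketch of timestamp-based selective reading across two inputs:
// always poll the input whose head record has the lowest timestamp.
// Note: blocking the input that is ahead also blocks the checkpoint
// barrier inside that input's stream, which is the problem raised above.
public class SelectiveReader {
    public static Integer nextToRead(Queue<Integer> inputA, Queue<Integer> inputB) {
        // Peek at the head timestamps; read from the input that is behind.
        if (inputA.isEmpty()) return inputB.poll();
        if (inputB.isEmpty()) return inputA.poll();
        return inputA.peek() <= inputB.peek() ? inputA.poll() : inputB.poll();
    }

    public static void main(String[] args) {
        Queue<Integer> a = new ArrayDeque<>(List.of(1, 4, 9));
        Queue<Integer> b = new ArrayDeque<>(List.of(2, 3, 8));
        StringBuilder order = new StringBuilder();
        Integer t;
        while ((t = nextToRead(a, b)) != null) order.append(t);
        System.out.println(order); // prints "123489": merged in timestamp order
    }
}
```

The merged output stays in event-time order precisely because the reader stalls whichever input has run ahead.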
Would it maybe make sense to provide Flink as an engine on Hive
("flink-on-Hive")? E.g. to address 4, 5, 6, 8, 9, 10. This could be more loosely
coupled than integrating Hive into all possible Flink core modules and thus
introducing a very tight dependency on Hive in the core.
1,2,3 could be achieved via
Hi Fabian/Vino,
Thank you very much for your encouragement and inquiry. Sorry that I didn't see
Fabian's email until I read Vino's response just now. (Somehow Fabian's went to
the spam folder.)
My proposal contains long-term and short-terms goals. Nevertheless, the effort
will focus on the followin
vinoyang created FLINK-10527:
Summary: Cleanup constant isNewMode in YarnTestBase
Key: FLINK-10527
URL: https://issues.apache.org/jira/browse/FLINK-10527
Project: Flink
Issue Type: Bug
Hi Xuefu,
Appreciate this proposal, and like Fabian, it would look better if you can
give more details of the plan.
Thanks, vino.
On Wed, Oct 10, 2018 at 5:27 PM, Fabian Hueske wrote:
> Hi Xuefu,
>
> Welcome to the Flink community and thanks for starting this discussion!
> Better Hive integration would be re
Yan Yan created FLINK-10526:
---
Summary: Hadoop FileSystem not initialized properly on Yarn
Key: FLINK-10526
URL: https://issues.apache.org/jira/browse/FLINK-10526
Project: Flink
Issue Type: Bug
On Wed, Oct 10, 2018 at 9:33 AM Fabian Hueske wrote:
> I think the new source interface would be designed to be able to leverage
> shared state to achieve time alignment.
> I don't think this would be possible without some kind of shared state.
>
> The problem of tasks that are far ahead in time
Hi Piotrek,
Thanks for the feedback and reviews.
Yes, as I explained previously in reply to the (2B) point, I think it is
possible to create our own customized window assigner without any API
change if we eliminate the requirement that
*"the same key should always result in the same offset"*
I ha
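A customized window assigner of the kind discussed above could be sketched as follows (plain Java, not the actual Flink WindowAssigner API; `offsetFor` is a hypothetical helper): the tumbling-window offset is derived deterministically from the key, so windows for different keys are staggered instead of all firing at once, while assignment stays consistent for any given key.

```java
// Sketch of key-dependent tumbling-window assignment. The per-key offset
// is deterministic, so every record of a given key lands on the same
// window grid, which is what makes the assignment well-defined.
public class KeyedOffsetWindows {
    // Hypothetical helper: a deterministic per-key offset in [0, windowSize).
    public static long offsetFor(String key, long windowSize) {
        return Math.floorMod(key.hashCode(), windowSize);
    }

    // Start of the tumbling window that (key, timestamp) falls into.
    public static long windowStart(String key, long timestamp, long windowSize) {
        long offset = offsetFor(key, windowSize);
        return timestamp - Math.floorMod(timestamp - offset, windowSize);
    }

    public static void main(String[] args) {
        long size = 60_000; // one-minute windows
        long start = windowStart("user-a", 125_000, size);
        // The start lies at most one window size behind the timestamp,
        // and re-assigning the start itself is a fixed point.
        System.out.println(125_000 - start < size);                       // true
        System.out.println(windowStart("user-a", start, size) == start);  // true
    }
}
```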
Thanks for the feedback and comments so far.
I want to elaborate more on the need for the shared state and awareness of
watermark alignment in the source implementation. Sources like Kafka and
Kinesis pull from the external system and then emit the records. For
Kinesis, we have multiple consumer t
I think the new source interface would be designed to be able to leverage
shared state to achieve time alignment.
I don't think this would be possible without some kind of shared state.
The problem of tasks that are far ahead in time cannot be solved with
back-pressure.
That's because a task canno
On Wed, Oct 10, 2018 at 8:15 AM Aljoscha Krettek
wrote:
> I think the two things (shared state and new source interface) are
> somewhat orthogonal. The new source interface itself alone doesn't solve
> the problem, we would still need some mechanism for sharing the event-time
> information betwee
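The shared event-time state being discussed could take a shape like the following (plain Java, all names hypothetical, not a Flink API): each source subtask publishes its local watermark to shared state, and a subtask pauses reading once it runs more than some bound ahead of the slowest subtask. This also illustrates why back-pressure alone cannot help: a subtask that is far ahead in event time is not necessarily producing data faster than downstream can consume it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of shared watermark state across source subtasks. Each subtask
// reports its local watermark; a subtask should pause once its watermark
// exceeds the global minimum by more than maxAhead milliseconds.
public class WatermarkAlignment {
    private final Map<Integer, Long> watermarks = new ConcurrentHashMap<>();
    private final long maxAhead;

    public WatermarkAlignment(long maxAhead) { this.maxAhead = maxAhead; }

    public void report(int subtask, long watermark) {
        watermarks.put(subtask, watermark);
    }

    // Smallest watermark reported by any subtask so far.
    public long globalMinimum() {
        return watermarks.values().stream()
                .mapToLong(Long::longValue).min().orElse(Long.MIN_VALUE);
    }

    // A subtask pauses when it is too far ahead of the global minimum.
    public boolean shouldPause(int subtask) {
        Long local = watermarks.get(subtask);
        return local != null && local - globalMinimum() > maxAhead;
    }

    public static void main(String[] args) {
        WatermarkAlignment shared = new WatermarkAlignment(1_000);
        shared.report(0, 5_000); // fast subtask
        shared.report(1, 1_000); // slow subtask
        System.out.println(shared.shouldPause(0)); // true: 4000 ms ahead
        System.out.println(shared.shouldPause(1)); // false: it is the minimum
    }
}
```

Whatever mechanism carries this state between subtasks (Akka, JGroups, or something else, as raised later in the thread), the per-subtask logic would look roughly like this.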
Hi all,
I opened a PR [1] to add the PR review guide to the Flink website.
Cheers, Fabian
[1] https://github.com/apache/flink-web/pull/126
On Wed., Oct 10, 2018 at 17:27, Aljoscha Krettek <
aljos...@apache.org> wrote:
> +1
>
> > On 9. Oct 2018, at 17:11, Hequn Cheng wrote:
> >
> > +1
> >
+1
> On 9. Oct 2018, at 17:11, Hequn Cheng wrote:
>
> +1
>
> On Tue, Oct 9, 2018 at 3:25 PM Till Rohrmann wrote:
>
>> +1
>>
>> On Tue, Oct 9, 2018 at 9:08 AM Zhijiang(wangzhijiang999)
>> wrote:
>>
>>> +1
>>> --
>>> From: vino ya
+1
I did
- verify all changes between 4.0 and 5.0
- check signature and hash of the source release
- build a work-in-progress branch for Scala 2.12 support using the new shaded
asm6 package
> On 10. Oct 2018, at 15:11, Chesnay Schepler wrote:
>
> Hi everyone,
> Please review and vote on the
Sorry for also derailing this a bit earlier...
I think the two things (shared state and new source interface) are somewhat
orthogonal. The new source interface itself alone doesn't solve the problem, we
would still need some mechanism for sharing the event-time information between
different sub
Rinat Sharipov created FLINK-10525:
--
Summary: Deserialization schema: skip data that couldn't be
properly deserialized
Key: FLINK-10525
URL: https://issues.apache.org/jira/browse/FLINK-10525
Project
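The idea behind FLINK-10525 can be sketched in plain Java (hypothetical names, not the actual Flink `DeserializationSchema` interface): instead of throwing on a corrupt record and failing the job, the schema signals "no result" and the caller drops the record.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Sketch of a deserialization schema that skips unparseable records
// rather than failing: a corrupt record yields Optional.empty(), which
// the consuming loop simply drops.
public class SkippingDeserializer {
    public static Optional<Integer> deserialize(byte[] message) {
        try {
            return Optional.of(Integer.parseInt(
                    new String(message, StandardCharsets.UTF_8)));
        } catch (NumberFormatException e) {
            return Optional.empty(); // corrupt record: skip instead of failing
        }
    }

    public static void main(String[] args) {
        System.out.println(
                deserialize("42".getBytes(StandardCharsets.UTF_8)));   // Optional[42]
        System.out.println(
                deserialize("oops".getBytes(StandardCharsets.UTF_8))); // Optional.empty
    }
}
```

Whether the skip is silent or surfaced through a metric or side output is a separate design question; the sketch only shows the control flow.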
Also, I'm afraid I derailed this thread just a bit.. So also back to
Thomas's original question..
If we decide state-sharing across source subtasks is the way forward for
now -- does anybody have thoughts to share on what form this should take?
Thomas mentioned Akka or JGroups. Other thoughts?
Okay, so I think there is a lot of agreement here about (a) This is a real
issue for people, and (b) an ideal long-term approach to solving it.
As Aljoscha and Elias said a full solution to this would be to also
redesign the source interface such that individual partitions are exposed
in the API a
Chesnay Schepler created FLINK-10524:
Summary: HeartbeatManagerTest failed on travis
Key: FLINK-10524
URL: https://issues.apache.org/jira/browse/FLINK-10524
Project: Flink
Issue Type: Bug
Hi everyone,
Please review and vote on the release candidate #1 for the version 5.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)
This release
* adds jackson-dataformat-csv to the shaded-jackson module (used for
CSV table fac
Chesnay Schepler created FLINK-10523:
Summary: Add jackson-dataformat-csv to flink-shaded
Key: FLINK-10523
URL: https://issues.apache.org/jira/browse/FLINK-10523
Project: Flink
Issue Type
Hi everyone,
thanks for the feedback that we got so far. I will update the document
in the next couple of hours such that we can continue with the discussion.
Regarding the table type: Actually I just didn't mention it in the
document, because the table type is a SQL Client/External catalog
Kostas Kloudas created FLINK-10522:
--
Summary: Check if RecoverableWriter supportsResume and act accordingly.
Key: FLINK-10522
URL: https://issues.apache.org/jira/browse/FLINK-10522
Project: Flink
Hi everyone, thx for all the comments and feedback. Let me address
everything individually:
@Till: yes, for the start my plan would be to just touch the
flink-runtime-web/web-dashboard repo/folder.
@Jin Sun:
- smaller icons on increasing server counts: yes, that's also something I
already t
Hi Xuefu,
Welcome to the Flink community and thanks for starting this discussion!
Better Hive integration would be really great!
Can you go into details of what you are proposing? I can think of a couple
ways to improve Flink in that regard:
* Support for Hive UDFs
* Support for Hive metadata cat
Florian Schmidt created FLINK-10521:
---
Summary: TaskManager metrics are not reported to prometheus after
running a job
Key: FLINK-10521
URL: https://issues.apache.org/jira/browse/FLINK-10521
Project: