Re: [DISCUSS] Repository split

2019-08-11 Thread Maximilian Michels
Apart from a technical explanation, the initial suggestion does not propose how 
the repository should be split up. The only meaningful split I see is for the 
connectors.

This discussion dates back a few years: 
https://lists.apache.org/thread.html/4ee502667a5801d23d76a01406e747e1a934417dc67ef7d26fb7f79c@1449757911@%3Cdev.flink.apache.org%3E

I would be in favor of keeping the mono repository. As already mentioned 
here, there are other ways to resolve the build time issue. For instance, in Beam 
we have granular build triggers that allow testing only specific components and 
their dependencies: 
https://github.com/apache/beam/blob/a2b57e3b55a09d641cee8c3b796cc6971a008db0/.test-infra/jenkins/job_PreCommit_Java.groovy#L26
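
For a Maven build like Flink's, a similar effect can be approximated with the
reactor options; a hedged sketch (the module path is just an example):

    # Run the tests of one changed module plus all modules that depend on it.
    mvn verify -pl flink-connectors/flink-connector-kafka -amd

Here `-pl` selects the changed module and `-amd` ("also make dependents") adds
every module depending on it, so unrelated modules are skipped.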

Thanks,
Max

On 09.08.19 09:14, Biao Liu wrote:
> Hi folks,
>
> Thanks for bringing this discussion Chesnay.
>
> +1 for the motivation. Waiting a long time for Travis builds is a really
> bad experience.
>
> WRT the solution, personally I agree with Dawid/David.
>
> IMO the biggest benefit of splitting the repository is reducing build time.
> I think that could be achieved without splitting the repository, which would
> be the best solution for me.
>
> And there are several pain points I really care about if we were to split
> the repository.
>
> 1. Most of our users are developers. The non-developer users probably do not
> care about the code structure at all; they might use the released binaries
> directly. For developers, multiple repositories are not friendly for
> reading, building, or testing the code. I think it's a big regression.
>
> 2. It's definitely a nightmare to work across repositories. As Piotr said,
> it should be a rare case. However, Jark raised a good example: debugging a
> sub-repository IT case. Imagine the scenario: I'm debugging an unstable Kafka
> IT case. I need to add some logs in the runtime components to find some
> clues. What should I do? I would have to locally install the flink-main
> project every time after adding logs, and it's easy to make mistakes when
> switching between repositories again and again.
>
> To sum up, at least for now I agree with Dawid that we should go toward
> splitting the CI builds, not the repository.
>
> Thanks,
> Biao /'bɪ.aʊ/
>
>
>
> On Fri, Aug 9, 2019 at 12:55 AM Jark Wu  wrote:
>
> > Hi,
> >
> > First of all, I agree with Dawid and David's point.
> >
> > I will share some experience with a repository split. We went through it
> > for Alibaba Blink, which I think is the most worthwhile project to learn
> > from here.
> > We split the Blink project into "blink-connectors" and "blink", but we
> > didn't get much benefit in terms of a better development process. On the
> > contrary, it sometimes slowed development down.
> > As far as I can see, we have suffered from the following issues since the
> > split:
> >
> > 1. Unstable build and test:
> > Interface or behavior changes in the underlying modules (e.g. core, table)
> > lead to build and test failures in the connectors repo. AFAIK, the Table
> > API is still under heavy evolution.
> > This makes the connectors repo more unstable and keeps us busy fixing
> > build and test problems **after-commit**.
> > First, it's not easy to locate which commit of the main repo caused the
> > connectors repo to fail (we have 70+ commits every day in Flink master
> > now, and the number is growing).
> > Second, when 2 or 3 build/test problems happen at the same time, they are
> > hard to fix because we can't make the build/tests pass in separate
> > hotfix pull requests.
> >
> > 2. Debugging difficulty:
> > As modules are separated into different repositories, if we want to debug
> > a Kafka IT case, we may need to debug some code in the Flink runtime, or
> > verify whether a runtime code change fixes the Kafka case. However, this
> > becomes more complex because they are not in one project.
> >
> > IMO, this actually slows down the development process.
> >
> > --
> >
> > In my understanding, the issues we want to solve with the split include:
> > 1) long build/testing time
> > 2) unstable tests
> > 3) increasing number of PRs
> >
> > Ad. 1 I think we have several ways to reduce the build/testing time. As
> > Dawid said, we can trigger the corresponding CI builds within a single
> > repository (without running all the tests).
> > An easy way might be to analyse the pom.xml files to find which modules
> > depend on the changed module. And one thing we can do right now is skip
> > all tests for documentation-only changes.
> >
> > Ad. 2 I can't see how unstable connector tests could be fixed more quickly
> > after being moved to separate repositories. As far as I can tell, this
> > problem might even become more significant.
> >
> > Ad. 3 I also doubt that a repository split would help with this. I think it
> > would give the sub-repositories less exposure, and bahir-flink [1] is an
> > example (only 3 commits in the last 2 months).
> >
> > In the end, from my point of view,
> >   1) if we want to reduce build/testing time, we can start a new thread to
> > collect ideas f

Re: [DISCUSS] Flink project bylaws

2019-08-11 Thread Maximilian Michels
I'm a bit late to the discussion here. Three suggestions:

1) Procedure for "insufficient active binding voters to reach 2/3 majority"

> 1. Wait until the minimum length of the voting passes.
> 2. Publicly reach out to the remaining binding voters in the voting mail 
> thread for at least 2 attempts with at least 7 days between two attempts.
> 3. If the binding voter being contacted still failed to respond after all 
> the attempts, the binding voter will be considered as inactive for the 
> purpose of this particular voting.
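
(For illustration, assuming the 2/3 threshold is counted against all binding
voters: with 30 binding voters of whom only 12 respond, even a unanimous 12
+1s fall short of the 20 required, which is the situation this procedure is
meant to resolve.)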

Step 2 should include a personal email to the PMC members in question.
I'm afraid reminders inside the vote thread could be overlooked easily.

2) "Consensus" => "Lazy Consensus"

The way the terms are described in the draft, the consensus is "lazy",
i.e. requires only 3 binding votes. I'd suggest renaming it to "Lazy
Consensus". This is in line with the other definitions such as "Lazy
Majority".

3) Committer / PMC Removal

Removing a committer / PMC member only requires 3 binding votes. I'd
expect an important action like this to require a 2/3 majority.


Do you think we could incorporate those suggestions?

Thanks,
Max

On 11.08.19 10:14, Becket Qin wrote:
> Hi folks,
> 
> Thanks for all the discussion and support. I have started the voting thread.
> 
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Flink-Project-Bylaws-td31558.html
> 
> Thanks,
> 
> Jiangjie (Becket) Qin
> 
> On Thu, Aug 8, 2019 at 2:56 PM Fabian Hueske  wrote:
> 
>> Thanks for the update and driving the discussion Becket!
>> +1 for starting a vote.
>>
>> Am Mi., 7. Aug. 2019 um 11:44 Uhr schrieb Becket Qin >> :
>>
>>> Thanks Stephan.
>>>
>>> I think we have resolved all the comments on the wiki page. There are two
>>> minor changes made to the bylaws since last week.
>>> 1. For 2/3 majority, the required attempts to reach out to binding voters
>>> is reduced from 3 to 2. This helps shorten the voting process a little
>> bit.
>>> 2. Clarified what is considered as the adoption of new codebase.
>>>
>>> I think we have almost reached consensus. I'll start a voting thread in
>>> two days if there are no new concerns.
>>>
>>> Thanks,
>>>
>>> Jiangjie (Becket) Qin
>>>
>>> On Mon, Aug 5, 2019 at 1:09 PM Stephan Ewen  wrote:
>>>
 I added a clarification to the table, clarifying that the current
>>> phrasing
 means that committers do not need another +1 for their commits.

 On Mon, Jul 29, 2019 at 2:11 PM Fabian Hueske 
>> wrote:

> Hi Becket,
>
> Thanks a lot for pushing this forward and addressing the feedback.
> I'm very happy with the current state of the bylaws.
> +1 to wait until Friday before starting a vote.
>
> Best, Fabian
>
> Am Mo., 29. Juli 2019 um 13:47 Uhr schrieb Becket Qin <
> becket@gmail.com
>> :
>
>> Hi Everyone,
>>
>> Thanks for all the discussion and feedback.
>>
>> It seems that we have almost reached consensus. I'll leave the
 discussion
>> thread open until this Friday. If there is no more concerns raised,
 I'll
>> start a voting thread after that.
>>
>> Thanks,
>>
>> Jiangjie (Becket) Qin
>>
>> On Fri, Jul 26, 2019 at 6:49 PM Yu Li  wrote:
>>
>>> Hi Becket,
>>>
>>> Thanks for noticing and resolving my comment around PMC removal and ASF
>>> rules of the PMC membership change process, which you seem to have
>>> neglected in the summary of updates (smile).
>>>
>>> Best Regards,
>>> Yu
>>>
>>>
>>> On Wed, 24 Jul 2019 at 04:32, Becket Qin 
 wrote:
>>>
 Hi folks,

 Thanks for all the feedback.

 It seems that there are a few concerns over the emeritus status after 6
 months of inactivity. Given that the main purpose is just to make sure a
 2/3 majority can pass and we sort of have a solution for that, I just
 updated the draft with the following changes:

 1. Removed the inactivity term for emeritus committers / PMCs. A committer
 / PMC will only be considered emeritus by their own claim.
 2. Removed the approval process for reinstatement of emeritus committers /
 PMCs. An emeritus committer / PMC will be reinstated when they send an
 email to priv...@flink.apache.org.
 3. Added the term to ensure 2/3 majority voting is still doable when there
 are non-emeritus committers / PMCs who do not cast the vote.

 Please let me know if you have any further thoughts.

 Thanks,

 Jiangjie (Becket) Qin

 On Tue, Jul 23, 2019 at 10:18 AM Becket Qin <
>>> becket@gmail.com>
>>> wrote:

> Hi Fabian,
>
> Thanks for the feedback.
>
> I agree that if we don't like emeritus com

Re: [DISCUSS] Flink project bylaws

2019-08-13 Thread Maximilian Michels
Hi Becket,

Thanks for clarifying and updating the draft. The changes look good to me.

I don't feel strong about a 2/3 majority in case of committer/PMC
removal. Like you pointed out, both provide a significant hurdle due to
possible vetos or a 2/3 majority.

Thanks,
Max

On 13.08.19 10:36, Becket Qin wrote:
> Piotr just reminded me that there was a previous suggestion to clarify a
> committer's responsibility when committing his/her own patch. So I'd like to
> incorporate that into the bylaws. The additional clarification follows, in
> bold and italic font.
> 
> one +1 from a committer followed by a Lazy approval (not counting the vote
>> of the contributor), moving to lazy majority if a -1 is received.
>>
> 
> 
> Note that this implies that committers can +1 their own commits and merge
>> right away. *However, the committers should use their best judgement to
>> respect the component's expertise and ongoing development plan.*
> 
> 
> This does not really change any of the existing bylaws; it is merely a
> clarification.
> 
> If there is no objection to this additional clarification, after the bylaws
> wiki is updated, I'll send an update notice to the voting thread to inform
> those who already voted about this addition.
> 
> Thanks,
> 
> Jiangjie (Becket) Qin
> 
> On Mon, Aug 12, 2019 at 11:19 AM Becket Qin  wrote:
> 
>> Hi Maximillian,
>>
>> Thanks for the feedback. Please see the reply below:
>>
>> Step 2 should include a personal email to the PMC members in question.
>>
>> I'm afraid reminders inside the vote thread could be overlooked easily.
>>
>>
>> This is exactly what I meant to say by "reach out" :) I just made it more
>> explicit.
>>
>> The way the terms are described in the draft, the consensus is "lazy",
>>> i.e. requires only 3 binding votes. I'd suggest renaming it to "Lazy
>>> Consensus". This is in line with the other definitions such as "Lazy
>>> Majority".
>>
>> It was initially called "lazy consensus", but Robert pointed out that
>> "lazy consensus" actually means something different in ASF terms [1].
>> Here "lazy" pretty much means "assume everyone is +1 unless someone says
>> otherwise". This means any vote that requires a minimum number of +1s is
>> not really a "lazy" vote.
>>
>> Removing a committer / PMC member only requires 3 binding votes. I'd
>>> expect an important action like this to require a 2/3 majority.
>>
>> Personally I think consensus is good enough here. PMC members can cast a
>> veto if they disagree about the removal. In some sense, it is more
>> difficult than with 2/3 majority to remove a committer / PMC member. Also,
>> it might be a hard decision for some PMC members if they have never worked
>> with the person in question. That said, I am OK to change it to 2/3
>> majority as this will happen very rarely.
>>
>> Thanks,
>>
>> Jiangjie (Becket) Qin
>>
>> [1] https://www.apache.org/foundation/voting.html#LazyConsensus
>>
>> On Sun, Aug 11, 2019 at 5:00 PM Maximilian Michels  wrote:
>>
>>> I'm a bit late to the discussion here. Three suggestions:
>>>
>>> 1) Procedure for "insufficient active binding voters to reach 2/3 majority
>>>
>>>> 1. Wait until the minimum length of the voting passes.
>>>> 2. Publicly reach out to the remaining binding voters in the voting
>>> mail thread for at least 2 attempts with at least 7 days between two
>>> attempts.
>>>> 3. If the binding voter being contacted still failed to respond
>>> after all the attempts, the binding voter will be considered as inactive
>>> for the purpose of this particular voting.
>>>
>>> Step 2 should include a personal email to the PMC members in question.
>>> I'm afraid reminders inside the vote thread could be overlooked easily.
>>>
>>> 2) "Consensus" => "Lazy Consensus"
>>>
>>> The way the terms are described in the draft, the consensus is "lazy",
>>> i.e. requires only 3 binding votes. I'd suggest renaming it to "Lazy
>>> Consensus". This is in line with the other definitions such as "Lazy
>>> Majority".
>>>
>>> 3) Committer / PMC Removal
>>>
>>> Removing a committer / PMC member only requires 3 binding votes. I'd
>>> expect an important action like this to require a 2/3 majority.
>>>
>>>

Re: [VOTE] Flink Project Bylaws

2019-08-13 Thread Maximilian Michels
+1 It's good that we formalize this.

On 13.08.19 10:41, Fabian Hueske wrote:
> +1 for the proposed bylaws.
> Thanks for pushing this Becket!
>
> Cheers, Fabian
>
> Am Mo., 12. Aug. 2019 um 16:31 Uhr schrieb Robert Metzger <
> rmetz...@apache.org>:
>
> > I changed the permissions of the page.
> >
> > On Mon, Aug 12, 2019 at 4:21 PM Till Rohrmann 
> > wrote:
> >
> >> +1 for the proposal. Thanks a lot for driving this discussion Becket!
> >>
> >> Cheers,
> >> Till
> >>
> >> On Mon, Aug 12, 2019 at 3:02 PM Becket Qin  wrote:
> >>
> >>> Hi Robert,
> >>>
> >>> That's a good suggestion. Will you help to change the permission on
> > that
> >>> page?
> >>>
> >>> Thanks,
> >>>
> >>> Jiangjie (Becket) Qin
> >>>
> >>> On Mon, Aug 12, 2019 at 2:41 PM Robert Metzger 
> >>> wrote:
> >>>
>  Thanks for starting the vote.
>  How about putting a specific version in the wiki up for voting, or
>  restricting edit access to the page to the PMC?
>  There were already two changes (very minor) to the page since the vote
>  started:
> 
> 
> >>>
> >>
> > https://cwiki.apache.org/confluence/pages/viewpreviousversions.action?pageId=120731026
>  I suggest restricting edit access to the page.
> 
> 
> 
>  On Mon, Aug 12, 2019 at 11:43 AM Timo Walther 
> >>> wrote:
> 
> > +1
> >
> > Thanks for all the efforts you put into this for documenting how
> > the
> > project operates.
> >
> > Regards,
> > Timo
> >
> > Am 12.08.19 um 10:44 schrieb Aljoscha Krettek:
> >> +1
> >>
> >>> On 11. Aug 2019, at 10:07, Becket Qin 
> >> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> I would like to start a voting thread on the project bylaws of
> >>> Flink.
>  It
> >>> aims to help the community coordinate more smoothly. Please see
> >> the
> > bylaws
> >>> wiki page below for details.
> >>>
> >>>
> >
> 
> >>>
> >>
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> >>>
> >>> The discussion thread is following:
> >>>
> >>>
> >
> 
> >>>
> >>
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-project-bylaws-td30409.html
> >>>
> >>> The vote will be open for at least 6 days. PMC members' votes are
> >>> considered binding. The vote requires a 2/3 majority of the binding
> >>> +1s to pass.
> >>>
> >>> Thanks,
> >>>
> >>> Jiangjie (Becket) Qin
> >
> >
> >
> 
> >>>
> >>
> >
>



Re: [DISCUSS] Releasing Flink 1.8.2

2019-08-30 Thread Maximilian Michels

Hi Jincheng,

+1 I would be in favor of a 1.8.2 release so that we can fix the problems with
the nested closure cleaner which currently block Beam users on Flink 1.8.1:
https://issues.apache.org/jira/browse/FLINK-13367


Thanks,
Max

On 30.08.19 11:25, jincheng sun wrote:

Hi Flink devs,

It has been nearly 2 months since 1.8.1 was released. So, what do you think
about releasing Flink 1.8.2 soon?

We already have some blocker and critical fixes in the release-1.8 branch:

[Blocker]
- FLINK-13159 java.lang.ClassNotFoundException when restore job
- FLINK-10368 'Kerberized YARN on Docker test' unstable
- FLINK-12578 Use secure URLs for Maven repositories

[Critical]
- FLINK-12736 ResourceManager may release TM with allocated slots
- FLINK-12889 Job keeps in FAILING state
- FLINK-13484 ConnectedComponents end-to-end test instable with
NoResourceAvailableException
- FLINK-13508 CommonTestUtils#waitUntilCondition() may attempt to sleep
with negative time
- FLINK-13806 Metric Fetcher floods the JM log with errors when TM is lost

Furthermore, I think the following blocker issue should be merged before the
1.8.2 release.

- FLINK-13897: OSS FS NOTICE file is placed in wrong directory

It would also be great if we could include the fix for the Elasticsearch 6.x
connector thread leak (FLINK-13689), which is classified as major, in the
1.8.2 release.

Please let me know what you think.

Cheers,
Jincheng



Re: [DISCUSS] FLINK-31873: Add setMaxParallelism to the DataStreamSink Class

2023-04-25 Thread Maximilian Michels
+1

On Tue, Apr 25, 2023 at 5:24 PM David Morávek  wrote:
>
> Hi Eric,
>
> this sounds reasonable; there are definitely cases where you need to limit
> the sink parallelism, for example to avoid overloading the storage or to
> limit the number of output files
>
> +1
>
> Best,
> D.
>
> On Sun, Apr 23, 2023 at 1:09 PM Weihua Hu  wrote:
>
> > Hi, Eric
> >
> > Thanks for bringing this discussion.
> > I think it's reasonable to add "setMaxParallelism" to DataStreamSink.
> >
> > +1
> >
> > Best,
> > Weihua
> >
> >
> > On Sat, Apr 22, 2023 at 3:20 AM eric xiao  wrote:
> >
> > > Hi there devs,
> > >
> > > I would like to start a discussion thread for FLINK-31873[1].
> > >
> > > We are in the process of enabling Flink reactive mode as the default
> > > scheduling mode. According to the configuration docs [2] (I believe this
> > > was also mentioned during one of the training sessions at Flink Forward
> > > 2022), one can/should replace all setParallelism calls with
> > > setMaxParallelism when migrating to reactive mode.
> > >
> > > This currently isn't possible on a sink in a Flink pipeline, as we do not
> > > expose setMaxParallelism on the DataStreamSink class [3]. The underlying
> > > Transformation class has both setMaxParallelism and setParallelism
> > > defined [4], but only setParallelism is offered in the DataStreamSink
> > > class.
> > >
> > > I believe adding setMaxParallelism would be beneficial not just for Flink
> > > reactive mode, but also for other ways of running a Flink pipeline
> > > (non-reactive mode, Flink autoscaling).
> > >
> > > Best,
> > >
> > > Eric Xiao
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-31873
> > > [2]
> > >
> > >
> > https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration
> > > [3]
> > >
> > >
> > https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java
> > > [4]
> > >
> > >
> > https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285
> > >
> >
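
For illustration, the proposed addition could look roughly like the following
sketch, mirroring how DataStreamSink#setParallelism delegates to the underlying
Transformation (the body shown here is an assumption, not the final API):

    import org.apache.flink.api.dag.Transformation;

    public class DataStreamSink<T> {
        private final Transformation<T> transformation;

        public DataStreamSink(Transformation<T> transformation) {
            this.transformation = transformation;
        }

        // Proposed: cap the parallelism the scheduler may pick for this sink,
        // e.g. under reactive mode, by delegating like setParallelism does.
        public DataStreamSink<T> setMaxParallelism(int maxParallelism) {
            transformation.setMaxParallelism(maxParallelism);
            return this;
        }
    }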


Re: [DISCUSS] Preventing Mockito usage for the new code with Checkstyle

2023-04-26 Thread Maximilian Michels
If we ban Mockito imports, I can still write tests using the full
qualifiers, right?

For example:

org.mockito.Mockito.when(somethingThatShouldHappen).thenReturn(somethingThatNeverActuallyHappens)

Just kidding, +1 on the proposal.

-Max
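
For reference, the kind of rule being discussed is typically Checkstyle's
IllegalImport check; a minimal sketch (the exact configuration and suppression
setup are assumptions):

    <module name="IllegalImport">
      <!-- Ban Mockito imports in new code; existing usages get suppressions. -->
      <property name="illegalPkgs" value="org.mockito"/>
    </module>

As the joke above hints, an import ban alone does not catch fully-qualified
usages; those would need an additional regexp-based check (e.g.
RegexpSinglelineJava).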

On Wed, Apr 26, 2023 at 9:02 AM Panagiotis Garefalakis
 wrote:
>
> Thanks for bringing this up!  +1 for the proposal
>
> @Jing Ge -- we don't necessarily need to completely migrate to Junit5 (even
> though it would be ideal).
> We could introduce the checkstyle rule and add suppressions for the
> existing problematic paths (as we do today for other rules e.g.,
> AvoidStarImport)
>
> Cheers,
> Panagiotis
>
> On Tue, Apr 25, 2023 at 11:48 PM Weihua Hu  wrote:
>
> > Thanks for driving this.
> >
> > +1 for Mockito and Junit4.
> >
> > A clarity checkstyle will be of great help to new developers.
> >
> > Best,
> > Weihua
> >
> >
> > On Wed, Apr 26, 2023 at 1:47 PM Jing Ge 
> > wrote:
> >
> > > This is a great idea, thanks for bringing this up. +1
> > >
> > > Also +1 for banning JUnit4. If I am not mistaken, that could only be done
> > > after the JUnit5 migration is complete.
> > >
> > > @Chesnay thanks for the hint. Do we have any doc about it? If not, it
> > might
> > > deserve one. WDYT?
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Wed, Apr 26, 2023 at 5:13 AM Lijie Wang 
> > > wrote:
> > >
> > > > Thanks for driving this. +1 for the proposal.
> > > >
> > > > Can we also prevent JUnit4 usage in new code this way? Because currently
> > > > we are aiming to migrate our codebase to JUnit 5.
> > > >
> > > > Best,
> > > > Lijie
> > > >
> > > > Piotr Nowojski  于2023年4月25日周二 23:02写道:
> > > >
> > > > > Ok, thanks for the clarification.
> > > > >
> > > > > Piotrek
> > > > >
> > > > > wt., 25 kwi 2023 o 16:38 Chesnay Schepler 
> > > > napisał(a):
> > > > >
> > > > > > The checkstyle rule would just ban certain imports.
> > > > > > We'd add exclusions for all existing usages as we did when
> > > introducing
> > > > > > other rules.
> > > > > > So far we have usually disabled checkstyle rules for specific files.
> > > > > >
> > > > > > On 25/04/2023 16:34, Piotr Nowojski wrote:
> > > > > > > +1 to the idea.
> > > > > > >
> > > > > > > How would this checkstyle rule work? Are you suggesting to start
> > > > with a
> > > > > > > number of exclusions? On what level will those exclusions be? Per
> > > > file?
> > > > > > Per
> > > > > > > line?
> > > > > > >
> > > > > > > Best,
> > > > > > > Piotrek
> > > > > > >
> > > > > > > wt., 25 kwi 2023 o 13:18 David Morávek 
> > > napisał(a):
> > > > > > >
> > > > > > >> Hi Everyone,
> > > > > > >>
> > > > > > >> A long time ago, the community decided not to use Mockito-based
> > > > tests
> > > > > > >> because those are hard to maintain. This is already baked into our
> > > > Code
> > > > > > Style
> > > > > > >> and Quality Guide [1].
> > > > > > >>
> > > > > > >> Because we still have Mockito imported into the code base, it's
> > > very
> > > > > > easy
> > > > > > >> for newcomers to unconsciously introduce new tests violating the
> > > > code
> > > > > > style
> > > > > > >> because they're unaware of the decision.
> > > > > > >>
> > > > > > >> I propose to prevent Mockito usage with a Checkstyle rule for a
> > > new
> > > > > > code,
> > > > > > >> which would eventually allow us to eliminate it. This could also
> > > > > prevent
> > > > > > >> some wasted work and unnecessary feedback cycles during reviews.
> > > > > > >>
> > > > > > >> WDYT?
> > > > > > >>
> > > > > > >> [1]
> > > > > > >>
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> > https://flink.apache.org/how-to-contribute/code-style-and-quality-common/#avoid-mockito---use-reusable-test-implementations
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> D.
> > > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >


Re: [DISCUSS] Planning Flink 2.0

2023-04-26 Thread Maximilian Michels
Thanks for starting the discussion, Jark and Xingtong!

Flink 2.0 is long overdue. In the past, the expectations for such a
release were unreasonably high, and I think everybody had a different
understanding of what exactly the criteria were. This led to 18 minor
releases for the current major version.

What I'm most excited about for Flink 2.0 is the removal of baggage that
Flink has accumulated over the years:

- Removal of Scala, deprecated interfaces, unmaintained libraries and
APIs (DataSet)
- Consolidation of configuration
- Merging of multiple scheduler implementations
- Ability to freely combine batch / streaming tasks in the runtime

When I look at 
https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit
, I'm a bit skeptical we will even be able to reach all these goals. I
think we have to prioritize and try to establish a deadline. Otherwise
we will end up never releasing 2.0.

+1 on Flink 2.0 by May 2024 (not a hard deadline but I think having a
deadline helps).

-Max


On Wed, Apr 26, 2023 at 10:08 AM Chesnay Schepler  wrote:
>
>  > /Instead of defining compatibility guarantees as "this API won't
> change in all 1.x/2.x series", what if we define it as "this API won't
> change in the next 2/3 years"./
>
> I can see some benefits to this approach (all APIs having a fixed
> minimum lifetime) but it's just gonna be difficult to communicate.
> Technically this implies that every minor release may contain breaking
> changes, which is exactly what users don't want.
>
> What problems to do you see in creating major releases every N years?
>
>  > /IIUC, the milestone releases are a breakdown of the 2.0 release,
> while we are free to introduce breaking changes between them. And you
> suggest using longer-living feature branches to keep the master branch
> in a releasable state (in terms of milestone releases). Am I
> understanding it correctly?/
>
> I think you got the general idea. There are a lot of details to be
> ironed out though (e.g., do we release connectors for each milestone?...).
>
> Conflicts in the long-lived branches are certainly a concern, but I
> think those will be inevitable. Right now I'm not _too_ worried about
> them, at least based on my personal wish-list.
> Maybe the milestones could even help with that, as we could preemptively
> decide on an order for certain changes that have a high chance of
> conflicting with each other?
> I guess we could do that anyway.
> Maybe we should explicitly evaluate how invasive a change is (in
> relation to other breaking changes!) and manage things accordingly
>
>
> Other thoughts:
>
> We need to figure out what this release means for connectors
> compatibility-wise. The current rules for which versions a connector
> must support don't cover major releases at all.
> (This depends a bit on the scope of 2.0; if we add binary compatibility
> to Public APIs and promote a few Evolving ones then compatibility across
> minor releases becomes trivial)
>
> What process are you thinking of for deciding what breaking changes to
> make? The obvious choice would be FLIPs, but I'm worried that this will
> overload the mailing list / wiki for lots of tiny changes.
>
> Provided that we agree on doing 2.0, when would we cut the 2.0 branch?
> Would we wait a few months for people to prepare/agree on changes so we
> reduce the time we need to merge things into 2 branches?
>
> On 26/04/2023 05:51, Xintong Song wrote:
> > Thanks all for the positive feedback.
> >
> > @Martijn
> >
> > If we want to have that roadmap, should we consolidate this into a
> >> dedicated Confluence page over storing it in a Google doc?
> >>
> > Having a dedicated wiki page is definitely a good way for the roadmap
> > discussion. I haven't created one yet because it's still a proposal to have
> > such roadmap discussion. If the community agrees with our proposal, the
> > release manager team can decide how they want to drive and track the
> > roadmap discussion.
> >
> > @Chesnay
> >
> > We should discuss how regularly we will ship major releases from now on.
> >> Let's avoid again making breaking changes because we "gotta do it now
> >> because 3.0 isn't happening anytime soon". (e.g., every 2 years or
> >> something)
> >
> > I'm not entirely sure about shipping major releases regularly. But I do
> > agree that we may want to avoid the situation that "breaking changes can
> > only happen now, or no idea when". Instead of defining compatibility
> > guarantees as "this API won't change in all 1.x/2.x series", what if we
> > define it as "this API won't change in the next 2/3 years". That should
> > allow us to incrementally iterate the APIs.
> >
> > E.g., in 2.a, all APIs annotated as `@Stable` will be guaranteed compatible
> > until 2 years after 2.a is shipped, and in 2.b if the API is still
> > annotated `@Stable` it extends the compatibility guarantee to 2 years after
> > 2.b is shipped. To remove an API, we would need to mark it as `@Deprecated`
> > and w

Re: [VOTE] Apache Flink Kubernetes Operator Release 1.5.0, release candidate #2

2023-05-16 Thread Maximilian Michels
+1 (binding)

1. Downloaded the archives, checksums, and signatures
2. Verified the signatures and checksums
3. Extract and inspect the source code for binaries
4. Verified license files / headers
5. Compiled and tested the source code via mvn verify
6. Deployed helm chart to test cluster
7. Ran example job
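
For anyone repeating the signature/checksum verification (steps 1-2), it
typically boils down to (file names assumed from the 1.5.0 RC layout):

    gpg --verify flink-kubernetes-operator-1.5.0-src.tgz.asc flink-kubernetes-operator-1.5.0-src.tgz
    sha512sum --check flink-kubernetes-operator-1.5.0-src.tgz.sha512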

-Max

On Tue, May 16, 2023 at 3:10 AM Jim Busche  wrote:
>
> +1 (Non-binding)
> I tested the following:
>
> - helm repo install from flink-kubernetes-operator-1.5.0-helm.tgz (See note 1 
> below)
> - podman Dockerfile build from source, looked good. (See note 2 below)
> - twistlock vulnerability scans of the proposed
> ghcr.io/apache/flink-kubernetes-operator:be07be7 image look good, except for the
> known Snake item.
> - UI, basic sample, basic session jobs look good. Logs look as expected.
> - Checksums looked good
> - Tested OLM build/install on OpenShift 4.10.54 and OpenShift 4.12.7
>
> Note 1: To install on OpenShift, I had to add an extra flink-operator 
> clusterrole resource.  See 
> https://github.com/apache/flink-kubernetes-operator/pull/600 and issue 
> https://issues.apache.org/jira/browse/FLINK-32103
>
> Note 2: For some reason, I can't use podman on Red Hat 8 to build Flink, but 
> the Podman from Red Hat 9.0 worked fine.
>
>
> Thanks, Jim


Re: [ANNOUNCE] Apache Flink Kubernetes Operator 1.5.0 released

2023-05-23 Thread Maximilian Michels
Niceee. Thanks for managing the release, Gyula!

-Max

On Wed, May 17, 2023 at 8:25 PM Márton Balassi  wrote:
>
> Thanks, awesome! :-)
>
> On Wed, May 17, 2023 at 2:24 PM Gyula Fóra  wrote:
>>
>> The Apache Flink community is very happy to announce the release of Apache 
>> Flink Kubernetes Operator 1.5.0.
>>
>> The Flink Kubernetes Operator allows users to manage their Apache Flink 
>> applications and their lifecycle through native k8s tooling like kubectl.
>>
>> Release highlights:
>>  - Autoscaler improvements
>>  - Operator stability, observability improvements
>>
>> Release blogpost:
>> https://flink.apache.org/2023/05/17/apache-flink-kubernetes-operator-1.5.0-release-announcement/
>>
>> The release is available for download at: 
>> https://flink.apache.org/downloads.html
>>
>> Maven artifacts for Flink Kubernetes Operator can be found at: 
>> https://search.maven.org/artifact/org.apache.flink/flink-kubernetes-operator
>>
>> Official Docker image for Flink Kubernetes Operator applications can be 
>> found at: https://hub.docker.com/r/apache/flink-kubernetes-operator
>>
>> The full release notes are available in Jira:
>> https://issues.apache.org/jira/projects/FLINK/versions/12352931
>>
>> We would like to thank all contributors of the Apache Flink community who 
>> made this release possible!
>>
>> Regards,
>> Gyula Fora


Re: [DISCUSS] FLIP 333 - Redesign Apache Flink website

2023-07-17 Thread Maximilian Michels
+1

On Mon, Jul 17, 2023 at 10:45 AM Chesnay Schepler  wrote:
>
> +1
>
> On 16/07/2023 08:10, Mohan, Deepthi wrote:
> > @Chesnay
> >
> > Thank you for your feedback.
> >
> > An important takeaway from the previous discussion [1] and your feedback 
> > was to keep the design and text/diagram changes separate as each change for 
> > text and diagrams likely requires deeper discussion. Therefore, as a first 
> > step I am proposing only UX changes with minimal text changes for the pages 
> > mentioned in the FLIP.
> >
> > The feedback we received from customers covers both aesthetic and 
> > functional aspects of the website. Note that most feedback is focused only 
> > on the main Flink website [2].
> >
> > 1) New customers who are considering Flink have said about the website 
> > “there is a lot going on”, “looks too complicated”, “I am not sure *why* I 
> > should use this" and similar feedback. The proposed redesign in this FLIP 
> > helps partially address this category of feedback, but we may need to make 
> > the use cases and value proposition “pop” more than we have currently 
> > proposed in the redesign. I’d like to get the community’s thoughts on this.
> >
> > 2) On the look and feel of the website, I've already shared this feedback 
> > before, and I repeat it here: "like a wiki page thrown together by developers." 
> > Customers also point out other related Apache project websites: [3] and [4] 
> > as having “modern” user design. The proposed redesign in this FLIP will 
> > help address this feedback. Modernizing the look and feel of the website 
> > will appeal to customers who are used to what they encounter on other 
> > contemporary websites.
> >
> > 3) New and existing Flink developers have said “I am not sure what the 
> > diagram is supposed to depict” - referencing the main diagram on [2] and 
> > have said that the website lacks useful graphics and colors. Apart from 
> > removing the diagram on the main page [2], the current FLIP does not propose 
> > major changes to the diagrams in the rest of the website; we can discuss them 
> > separately as they become available. I'd like to keep the FLIP focused only 
> > on the website redesign.
> >
> > Ultimately, to Chesnay’s point in the earlier discussion in [1], I do not 
> > want to boil the ocean with all the changes at once. In this FLIP, my 
> > proposal is to first work on the UX design as that gives us a good starting 
> > point. We can use it as a framework to make iterative changes and 
> > enhancements to diagrams and the actual website content incrementally.
> >
> > I’ve added a few more screenshots of additional pages to the FLIP that will 
> > give you a clearer picture of the proposed changes for the main page, What 
> > is Flink [Architecture, Applications, and Operations] pages.
> >
> > And finally, I am not proposing any tooling changes.
> >
> > [1] https://lists.apache.org/thread/c3pt00cf77lrtgt242p26lgp9l2z5yc8
> > [2]https://flink.apache.org/
> > [3] https://spark.apache.org/
> > [4] https://kafka.apache.org/
> >
> > On 7/13/23, 6:25 AM, "Chesnay Schepler"  wrote:
> >
> > On 13/07/2023 08:07, Mohan, Deepthi wrote:
> >> However, even these developers when explicitly asked in our conversations 
> >> often comment that the website could do with a redesign
> >
> > Can you go into more detail as to their specific concerns? Are there
> > functional problems with the page, or is this just a matter of "I don't
> > like the way it looks"?
> >
> >
> > What had they trouble with? Which information was
> > missing/unnecessary/too hard to find?
> >
> >
> > The FLIP states that "/we want to modernize the website so that new and
> > existing users can easily find information to understand what Flink is,
> > the primary use cases where Flink is useful, and clearly understand its
> > value proposition/."
> >
> >
> >  From the mock-ups I don't /really/ see how these stated goals are
> > achieved. It mostly looks like a fresh coat of paint, with a compressed
> > nav bar (which does reduce how much information and links we throw at
> > people at once (which isn't necessarily bad)).
> >
> >
> > Can you go into more detail w.r.t. to the proposed
> > text/presentation/diagram changes?
> >
> >
> > I assume you are not proposing any tooling changes?


Re: [DISCUSS][2.0] FLIP-340: Remove rescale REST endpoint

2023-07-18 Thread Maximilian Michels
+1

On Tue, Jul 18, 2023 at 12:29 PM Gyula Fóra  wrote:
>
> +1
>
> On Tue, 18 Jul 2023 at 12:12, Xintong Song  wrote:
>
> > +1
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Tue, Jul 18, 2023 at 4:25 PM Chesnay Schepler 
> > wrote:
> >
> > > The endpoint hasn't been working for years and was only kept to inform
> > > users about it. Let's finally remove it.
> > >
> > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-340%3A+Remove+rescale+REST+endpoint
> > >
> > >
> >


Re: FLIP-401: Remove brackets around keys returned by MetricGroup#getAllVariables

2023-07-18 Thread Maximilian Michels
Hi Chesnay,

+1 Sounds good to me!

-Max

On Tue, Jul 18, 2023 at 10:59 AM Chesnay Schepler  wrote:
>
> MetricGroup#getAllVariables returns all variables associated with the
> metric, for example:
>
> |<host> = abcde|
> |<subtask_index> = 0|
>
> The keys are surrounded by brackets for no particular reason.
>
> In virtually every use-case for this method the user is stripping the
> brackets from keys, as done in:
>
>   * our datadog reporter:
> 
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpReporter.java#L244
> 
> 
>   * our prometheus reporter (implicitly via a character filter):
> 
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/AbstractPrometheusReporter.java#L236
> 
> 
>   * our JMX reporter:
> 
> https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-metrics/flink-metrics-jmx/src/main/java/org/apache/flink/metrics/jmx/JMXReporter.java#L223
> 
> 
>
> I propose to change the method spec and implementation to remove the
> brackets around keys.
>
> For migration purposes it may make sense to add a new method with the
> new behavior (|getVariables()|) and deprecate the old method.
>
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425202
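
The bracket-stripping those reporters perform is essentially the following
(a hedged sketch; the variable name is illustrative):

    // Today, every reporter strips the surrounding '<' and '>' itself.
    String rawKey = "<host>";
    String key = rawKey.startsWith("<") && rawKey.endsWith(">")
            ? rawKey.substring(1, rawKey.length() - 1)
            : rawKey; // -> "host"

With the proposed change, the new getVariables() would return "host" directly.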


Re: [DISCUSS] Flink Kubernetes Operator cleanup procedure

2023-07-18 Thread Maximilian Michels
Hi Daren,

The behavior is consistent with the regular FlinkDeployment where the
cleanup will also cancel any running jobs. Are you intending to
recover jobs from another session cluster?

-Max

On Mon, Jul 17, 2023 at 4:48 PM Wong, Daren
 wrote:
>
> Hi devs,
>
> I would like to enquire about the cleanup procedure upon FlinkSessionJob 
> deletion. Currently, FlinkSessionJobController would trigger a cleanup in the 
> SessionJobReconciler which in turn cancels the Flink Job.
>
> Link to code: 
> https://github.com/apache/flink-kubernetes-operator/blob/371a2e6bbb8008c8ffccfff8fc338fb39bda19e2/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/sessionjob/SessionJobReconciler.java#L110
>
> This makes sense to me, as we want to ensure the Flink job is ended gracefully
> when the FlinkSessionJob associated with it is deleted. Otherwise, the Flink
> job would be "leaked" in the Flink cluster without a FlinkSessionJob
> associated with it for the Kubernetes Operator to control.
>
> That being said, I was wondering if we should consider scenarios where
> users may not want FlinkSessionJob deletion to create a side effect such as
> cancelJob. For example, the user may simply want to delete the whole
> namespace. One way of achieving this could be to put the job into SUSPENDED
> state instead of cancelling it.
>
> I am opening this discussion thread to get feedback and input on whether this
> alternative cleanup procedure is worth considering, and whether anyone else sees
> any potential use cases/benefits/downsides with it.
>
> Thank you very much.
>
> Regards,
> Daren


Re: [VOTE] Apache Flink Kubernetes Operator Release 1.6.0, release candidate #1

2023-07-31 Thread Maximilian Michels
+1 (binding)

1. Downloaded the source and helm archives, checksums, and signatures
2. Verified the signatures and checksums
3. Extract and inspect the source code for binaries
4. Compiled and tested the source code via mvn verify
5. Verified license files / headers
6. Deployed helm chart to test cluster
7. Ran example job

-Max

On Sun, Jul 30, 2023 at 5:49 PM Mate Czagany  wrote:
>
> +1 (non-binding)
>
> - Verified checksums and signatures
> - Found no binaries in source
> - Helm chart points to correct docker image
> - Installed via remote helm repo
> - Reactive example up- and down-scaled with- and without reactive mode
> - Autoscale with Kafka source
> - HA stateful deployment, with savepoint and restart
> - 1.18 in-place autoscale with adaptive scheduler
>
> Best regards,
> Mate
>
> Rui Fan <1996fan...@gmail.com> ezt írta (időpont: 2023. júl. 30., V, 6:20):
>
> > +1 (non-binding)
> >
> > - Compiled and tested the source code via mvn verify
> > - Verified the signatures
> > - Downloaded the image : docker pull
> > ghcr.io/apache/flink-kubernetes-operator:e7045a6
> > - Deployed helm chart to test cluster
> > - Ran example job
> >
> > Best,
> > Rui Fan
> >
> > On Thu, Jul 27, 2023 at 10:53 PM Gyula Fóra  wrote:
> >
> > > Hi Everyone,
> > >
> > > Please review and vote on the release candidate #1 for the version 1.6.0
> > of
> > > Apache Flink Kubernetes Operator,
> > > as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > > **Release Overview**
> > >
> > > As an overview, the release consists of the following:
> > > a) Kubernetes Operator canonical source distribution (including the
> > > Dockerfile), to be deployed to the release repository at dist.apache.org
> > > b) Kubernetes Operator Helm Chart to be deployed to the release
> > repository
> > > at dist.apache.org
> > > c) Maven artifacts to be deployed to the Maven Central Repository
> > > d) Docker image to be pushed to dockerhub
> > >
> > > **Staging Areas to Review**
> > >
> > > The staging areas containing the above mentioned artifacts are as
> > follows,
> > > for your review:
> > > * All artifacts for a,b) can be found in the corresponding dev repository
> > > at dist.apache.org [1]
> > > * All artifacts for c) can be found at the Apache Nexus Repository [2]
> > > * The docker image for d) is staged on github [3]
> > >
> > > All artifacts are signed with the key 21F06303B87DAFF1 [4]
> > >
> > > Other links for your review:
> > > * JIRA release notes [5]
> > > * source code tag "release-1.6.0-rc1" [6]
> > > * PR to update the website Downloads page to
> > > include Kubernetes Operator links [7]
> > >
> > > **Vote Duration**
> > >
> > > The voting time will run for at least 72 hours.
> > > It is adopted by majority approval, with at least 3 PMC affirmative
> > votes.
> > >
> > > **Note on Verification**
> > >
> > > You can follow the basic verification guide here[8].
> > > Note that you don't need to verify everything yourself, but please make
> > > note of what you have tested together with your +- vote.
> > >
> > > Cheers!
> > > Gyula Fora
> > >
> > > [1]
> > >
> > >
> > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.6.0-rc1/
> > > [2]
> > > https://repository.apache.org/content/repositories/orgapacheflink-1646/
> > > [3] ghcr.io/apache/flink-kubernetes-operator:e7045a6
> > > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [5]
> > >
> > >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353230
> > > [6]
> > >
> > https://github.com/apache/flink-kubernetes-operator/tree/release-1.6.0-rc1
> > > [7] https://github.com/apache/flink-web/pull/666
> > > [8]
> > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Kubernetes+Operator+Release
> > >
> >


Re: [DISCUSS] FLIP-334 : Decoupling autoscaler and kubernetes

2023-08-01 Thread Maximilian Michels
Hi Rui,

Thanks for the proposal. I think it makes a lot of sense to decouple
the autoscaler from Kubernetes-related dependencies. A couple of notes
when I read the proposal:

1. You propose AutoScalerEventHandler, AutoScalerStateStore, and
AutoScalerStateStoreFactory. AutoScalerStateStore is a generic key/value
store (methods: "get"/"put"/"delete"). I would propose to refine this
interface and make it less general-purpose, e.g. add methods for persisting
scaling decisions as well as any metrics gathered for the current metric
window (see the sketch below). For simplicity, I'd even go so far as to
remove the state store entirely and rather handle state in the
AutoScalerEventHandler, which will receive all related scaling and metric
collection events and can keep track of any state.

2. You propose to make the current autoscaler module
Kubernetes-agnostic by moving the Kubernetes parts into the main
operator module. I think that makes sense since the Kubernetes
implementation will continue to be tightly coupled with Kubernetes.
The goal of the separate module was to make the autoscaler logic
pluggable, but this will continue to be possible with the new
"flink-autoscaler" module which contains the autoscaling logic and
interfaces. In the long run, the autoscaling logic can move to a
separate repository, although this will complicate the release
process, so I would defer this unless there is strong interest.

3. The proposal mentions some removal of tests. It is critical for us
that all test coverage of the current implementation remains active.
It is ok if some of the test coverage only covers the Kubernetes
implementation. We can eventually move more tests without Kubernetes
significance into the implementation-agnostic autoscaler tests.
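
A sketch of the kind of domain-specific store meant in point 1 (all names and
signatures are illustrative assumptions, not the FLIP's actual API):

    import java.time.Instant;
    import java.util.Optional;
    import java.util.SortedMap;

    /** Illustrative refinement of the generic get/put/delete state store. */
    interface AutoScalerStateStore<JOB> {
        // Persist the scaling decisions taken for a job.
        void storeScalingHistory(JOB job, String serializedHistory);

        Optional<String> getScalingHistory(JOB job);

        // Persist the metrics gathered for the current metric window.
        void storeEvaluatedMetrics(JOB job, SortedMap<Instant, String> metricWindow);

        // Drop all state for a job, e.g. after the job is deleted.
        void clearAll(JOB job);
    }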

-Max

On Tue, Aug 1, 2023 at 9:46 AM Rui Fan  wrote:
>
> Hi all,
>
> Samrat (cc'ed) and I created FLIP-334 [1] to decouple the autoscaler
> from Kubernetes.
>
> Currently, the flink-autoscaler is tightly integrated with Kubernetes.
> There are compelling reasons to extend the use of flink-autoscaler to
> more types of Flink jobs:
> 1. With the recent merge of the Externalized Declarative Resource
> Management (FLIP-291[2]), in-place scaling is now supported
> across all types of Flink jobs. This development has made scaling Flink on
> YARN a straightforward process.
> 2. Several discussions [3] within the Flink user community, as observed on
> the mailing list, have emphasized the need for flink-autoscaler to support
> Flink on YARN.
>
> Please refer to the FLIP[1] document for more details about the proposed
> design and implementation. We welcome any feedback and opinions on
> this proposal.
>
> [1] https://cwiki.apache.org/confluence/x/x4qzDw
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-291%3A+Externalized+Declarative+Resource+Management
> [3] https://lists.apache.org/thread/pr0r8hq8kqpzk3q1zrzkl3rp1lz24v7v


Re: [VOTE] Apache Flink Kubernetes Operator Release 1.6.0, release candidate #1

2023-08-07 Thread Maximilian Michels
Unfortunately, I've found an issue which might need to be fixed for
the release: https://issues.apache.org/jira/browse/FLINK-32774

-Max

On Wed, Aug 2, 2023 at 12:16 PM Gyula Fóra  wrote:
>
> +1 (binding)
>
> - Verified checksums and signatures, binaries, licenses
> - Tested release helm chart, docker image
> - Verified doc build, links
> - Ran basic stateful example, upgrade, savepoint. Checked logs, no errors
>
> Gyula
>
> On Mon, Jul 31, 2023 at 2:24 PM Maximilian Michels  wrote:
>
> > +1 (binding)
> >
> > 1. Downloaded the source and helm archives, checksums, and signatures
> > 2. Verified the signatures and checksums
> > 3. Extract and inspect the source code for binaries
> > 4. Compiled and tested the source code via mvn verify
> > 5. Verified license files / headers
> > 6. Deployed helm chart to test cluster
> > 7. Ran example job
> >
> > -Max
> >
> > On Sun, Jul 30, 2023 at 5:49 PM Mate Czagany  wrote:
> > >
> > > +1 (non-binding)
> > >
> > > - Verified checksums and signatures
> > > - Found no binaries in source
> > > - Helm chart points to correct docker image
> > > - Installed via remote helm repo
> > > - Reactive example up- and down-scaled with- and without reactive mode
> > > - Autoscale with Kafka source
> > > - HA stateful deployment, with savepoint and restart
> > > - 1.18 in-place autoscale with adaptive scheduler
> > >
> > > Best regards,
> > > Mate
> > >
> > > Rui Fan <1996fan...@gmail.com> ezt írta (időpont: 2023. júl. 30., V,
> > 6:20):
> > >
> > > > +1 (non-binding)
> > > >
> > > > - Compiled and tested the source code via mvn verify
> > > > - Verified the signatures
> > > > - Downloaded the image : docker pull
> > > > ghcr.io/apache/flink-kubernetes-operator:e7045a6
> > > > - Deployed helm chart to test cluster
> > > > - Ran example job
> > > >
> > > > Best,
> > > > Rui Fan
> > > >
> > > > On Thu, Jul 27, 2023 at 10:53 PM Gyula Fóra  wrote:
> > > >
> > > > > Hi Everyone,
> > > > >
> > > > > Please review and vote on the release candidate #1 for the version
> > 1.6.0
> > > > of
> > > > > Apache Flink Kubernetes Operator,
> > > > > as follows:
> > > > > [ ] +1, Approve the release
> > > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > > >
> > > > > **Release Overview**
> > > > >
> > > > > As an overview, the release consists of the following:
> > > > > a) Kubernetes Operator canonical source distribution (including the
> > > > > Dockerfile), to be deployed to the release repository at
> > dist.apache.org
> > > > > b) Kubernetes Operator Helm Chart to be deployed to the release
> > > > repository
> > > > > at dist.apache.org
> > > > > c) Maven artifacts to be deployed to the Maven Central Repository
> > > > > d) Docker image to be pushed to dockerhub
> > > > >
> > > > > **Staging Areas to Review**
> > > > >
> > > > > The staging areas containing the above mentioned artifacts are as
> > > > follows,
> > > > > for your review:
> > > > > * All artifacts for a,b) can be found in the corresponding dev
> > repository
> > > > > at dist.apache.org [1]
> > > > > * All artifacts for c) can be found at the Apache Nexus Repository
> > [2]
> > > > > * The docker image for d) is staged on github [3]
> > > > >
> > > > > All artifacts are signed with the key 21F06303B87DAFF1 [4]
> > > > >
> > > > > Other links for your review:
> > > > > * JIRA release notes [5]
> > > > > * source code tag "release-1.6.0-rc1" [6]
> > > > > * PR to update the website Downloads page to
> > > > > include Kubernetes Operator links [7]
> > > > >
> > > > > **Vote Duration**
> > > > >
> > > > > The voting time will run for at least 72 hours.
> > > > > It is adopted by majority approval, with at least 3 PMC affirmative
> > > > votes.
> > > > >
> > > > > **Note on Verification**
> > > > >
> > > > > You can follow the basic verification guide here[8].
> > > > > Note that you don't need to verify everything yourself, but please
> > make
> > > > > note of what you have tested together with your +- vote.
> > > > >
> > > > > Cheers!
> > > > > Gyula Fora
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.6.0-rc1/
> > > > > [2]
> > > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1646/
> > > > > [3] ghcr.io/apache/flink-kubernetes-operator:e7045a6
> > > > > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > > [5]
> > > > >
> > > > >
> > > >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353230
> > > > > [6]
> > > > >
> > > >
> > https://github.com/apache/flink-kubernetes-operator/tree/release-1.6.0-rc1
> > > > > [7] https://github.com/apache/flink-web/pull/666
> > > > > [8]
> > > > >
> > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Kubernetes+Operator+Release
> > > > >
> > > >
> >


Re: [ANNOUNCE] New Apache Flink PMC Member - Matthias Pohl

2023-08-08 Thread Maximilian Michels
Congrats, well done, and welcome to the PMC Matthias!

-Max

On Tue, Aug 8, 2023 at 8:36 AM yh z  wrote:
>
> Congratulations, Matthias!
>
> Best,
> Yunhong Zheng (Swuferhong)
>
> Ryan Skraba  于2023年8月7日周一 21:39写道:
>
> > Congratulations Matthias -- very well-deserved, the community is lucky to
> > have you <3
> >
> > All my best, Ryan
> >
> > On Mon, Aug 7, 2023 at 3:04 PM Lincoln Lee  wrote:
> >
> > > Congratulations!
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > >
> > > Feifan Wang  于2023年8月7日周一 20:13写道:
> > >
> > > > Congrats Matthias!
> > > >
> > > >
> > > >
> > > > ——
> > > > Name: Feifan Wang
> > > > Email: zoltar9...@163.com
> > > >
> > > >
> > > >  Replied Message 
> > > > | From | Matthias Pohl |
> > > > | Date | 08/7/2023 16:16 |
> > > > | To |  |
> > > > | Subject | Re: [ANNOUNCE] New Apache Flink PMC Member - Matthias Pohl
> > |
> > > > Thanks everyone. :)
> > > >
> > > > On Mon, Aug 7, 2023 at 3:18 AM Andriy Redko  wrote:
> > > >
> > > > Congrats Matthias, well deserved!!
> > > >
> > > > DC> Congrats Matthias!
> > > >
> > > > DC> Very well deserved, thank you for your continuous, consistent
> > > > contributions.
> > > > DC> Welcome.
> > > >
> > > > DC> Thanks,
> > > > DC> Danny
> > > >
> > > > DC> On Fri, Aug 4, 2023 at 9:30 AM Feng Jin 
> > > wrote:
> > > >
> > > > Congratulations, Matthias!
> > > >
> > > > Best regards
> > > >
> > > > Feng
> > > >
> > > > On Fri, Aug 4, 2023 at 4:29 PM weijie guo 
> > > > wrote:
> > > >
> > > > Congratulations, Matthias!
> > > >
> > > > Best regards,
> > > >
> > > > Weijie
> > > >
> > > >
> > > > Wencong Liu  于2023年8月4日周五 15:50写道:
> > > >
> > > > Congratulations, Matthias!
> > > >
> > > > Best,
> > > > Wencong Liu
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > At 2023-08-04 11:18:00, "Xintong Song" 
> > > > wrote:
> > > > Hi everyone,
> > > >
> > > > On behalf of the PMC, I'm very happy to announce that Matthias Pohl
> > > > has
> > > > joined the Flink PMC!
> > > >
> > > > Matthias has been consistently contributing to the project since
> > > > Sep
> > > > 2020,
> > > > and became a committer in Dec 2021. He mainly works in Flink's
> > > > distributed
> > > > coordination and high availability areas. He has worked on many
> > > > FLIPs
> > > > including FLIP195/270/285. He helped a lot with the release
> > > > management,
> > > > being one of the Flink 1.17 release managers and also very active
> > > > in
> > > > Flink
> > > > 1.18 / 2.0 efforts. He also contributed a lot to improving the
> > > > build
> > > > stability.
> > > >
> > > > Please join me in congratulating Matthias!
> > > >
> > > > Best,
> > > >
> > > > Xintong (on behalf of the Apache Flink PMC)
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >


Re: [VOTE] FLIP-322 Cooldown period for adaptive scheduler. Second vote.

2023-08-09 Thread Maximilian Michels
+1 (binding)

-Max

On Tue, Aug 8, 2023 at 10:56 AM Etienne Chauchot  wrote:
>
> Hi all,
>
> According to the Flink bylaws, binding votes on FLIP changes are those of
> active committers.
>
> Up to now, we have only 2 binding votes. Can one of the committers/PMC
> members vote on this FLIP?
>
> Thanks
>
> Etienne
>
>
> Le 08/08/2023 à 10:19, Etienne Chauchot a écrit :
> >
> > Hi Joseph,
> >
> > Thanks for the detailled review !
> >
> > Best
> >
> > Etienne
> >
> > Le 14/07/2023 à 11:57, Prabhu Joseph a écrit :
> >> *+1 (non-binding)*
> >>
> >> Thanks for working on this. We have seen a good improvement from the
> >> cooldown period this feature adds.
> >> Below are details on the test results from one of our clusters:
> >>
> >> On a scale-out operation, 8 new nodes were added one by one with a gap of
> >> ~30 seconds. There were 8 restarts within 4 minutes with the default
> >> behaviour,
> >> whereas only one with this feature (cooldown period of 4 minutes).
> >>
> >> The number of records processed by the job during the restart window is
> >> higher with this feature (2909764), whereas it is only 1323960 with the
> >> default behaviour due to multiple restarts, where the job spends most of
> >> the time recovering, and any progress the tasks made after the last
> >> successfully completed checkpoint is lost.
> >>
> >> Metric              | Default Adaptive Scheduler                         | With Cooldown Period
> >> NumRecordsProcessed | 1323960                                            | 2909764
> >> Job Parallelism     | 13 -> 20 -> 27 -> 34 -> 41 -> 48 -> 55 -> 62 -> 69 | 13 -> 69
> >> NumRestarts         | 8                                                  | 1
> >>
> >> Remarks:
> >> 1. The NumRecordsProcessed metric indicates the difference the cooldown
> >> period makes. When the job does multiple restarts, the tasks spend most of
> >> the time recovering, and the progress made since the last checkpoint is
> >> lost on each restart.
> >> 2. There is only one restart with the cooldown period, which happened when
> >> the 8th node was added back.
> >>
> >> On Wed, Jul 12, 2023 at 8:03 PM Etienne Chauchot
> >> wrote:
> >>
> >>> Hi all,
> >>>
> >>> I'm going on vacation tonight for 3 weeks.
> >>>
> >>> Even though the vote is not finished, as the implementation is rather quick
> >>> and the design discussion had settled, I went ahead and implemented
> >>> FLIP-322 [1] to allow people to take a look while I'm off.
> >>>
> >>> [1]https://github.com/apache/flink/pull/22985
> >>>
> >>> Best
> >>>
> >>> Etienne
> >>>
> >>> On 12/07/2023 at 09:56, Etienne Chauchot wrote:
>  Hi all,
> 
>  Would you mind casting your vote to this second vote thread (opened
>  after new discussions) so that the subject can move forward ?
> 
>  @David, @Chesnay, @Robert, you took part in the discussions; can you
>  please send your vote?
> 
>  Thank you very much
> 
>  Best
> 
>  Etienne
> 
>  On 06/07/2023 at 13:02, Etienne Chauchot wrote:
> > Hi all,
> >
> > Thanks for your feedback about the FLIP-322: Cooldown period for
> > adaptive scheduler [1].
> >
> > This FLIP was discussed in [2].
> >
> > I'd like to start a vote for it. The vote will be open for at least 72
> > hours (until July 9th 15:00 GMT) unless there is an objection or
> > insufficient votes.
> >
> > [1]
> >
> >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler
> > [2]https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6
> >
> > Best,
> >
> > Etienne
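
A minimal sketch of enabling the cooldown window measured above, assuming the adaptive scheduler is in use; the option keys are the ones proposed by FLIP-322 and are assumptions here, not verified API:

    import org.apache.flink.configuration.Configuration;

    public class CooldownConfigSketch {
        public static Configuration cooldownConfig() {
            Configuration conf = new Configuration();
            // FLIP-322's cooldown applies to the adaptive scheduler.
            conf.setString("jobmanager.scheduler", "adaptive");
            // At most one rescale per cooldown window; "4 min" mirrors the
            // test setup reported above. Key name assumed from FLIP-322.
            conf.setString("jobmanager.adaptive-scheduler.scaling-interval.min", "4 min");
            return conf;
        }
    }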


Re: [VOTE] Apache Flink Kubernetes Operator Release 1.6.0, release candidate #2

2023-08-14 Thread Maximilian Michels
+1 (binding)

1. Downloaded the archives, checksums, and signatures
2. Verified the signatures and checksums
3. Extracted and inspected the source code for binaries
4. Compiled and tested the source code via mvn verify
5. Verified license files / headers
6. Deployed helm chart to test cluster
7. Ran example job
8. Tested autoscaling without rescaling API

-Max

On Mon, Aug 14, 2023 at 3:44 PM Márton Balassi  wrote:
>
> Thank you, team.
>
> +1 (binding)
>
> - Verified Helm repo works as expected, points to correct image tag, build,
> version
> - Verified basic examples + checked operator logs everything looks as
> expected
> - Verified hashes, signatures and source release contains no binaries
> - Ran built-in tests, built jars + docker image from source successfully
>
> Best,
> Marton
>
> On Mon, Aug 14, 2023 at 1:24 PM Rui Fan <1996fan...@gmail.com> wrote:
>
> > Thanks Gyula for the release!
> >
> > +1 (non-binding)
> >
> > - Compiled and tested the source code via mvn verify
> > - Verified the signatures
> > - Downloaded the image
> > - Deployed helm chart to test cluster
> > - Ran example job
> >
> > Best,
> > Rui
> >
> > On Mon, Aug 14, 2023 at 3:58 PM Gyula Fóra  wrote:
> >
> > > +1 (binding)
> > >
> > > Verified:
> > >  - Hashes, signatures, source files contain no binaries
> > >  - Maven repo contents look good
> > >  - Verified helm chart, image, deployed stateful and autoscaling
> > examples.
> > > Operator logs look good
> > >
> > > Cheers,
> > > Gyula
> > >
> > > On Thu, Aug 10, 2023 at 3:03 PM Gyula Fóra  wrote:
> > >
> > > > Hi Everyone,
> > > >
> > > > Please review and vote on the release candidate #2 for the
> > > > version 1.6.0 of Apache Flink Kubernetes Operator,
> > > > as follows:
> > > > [ ] +1, Approve the release
> > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > >
> > > > **Release Overview**
> > > >
> > > > As an overview, the release consists of the following:
> > > > a) Kubernetes Operator canonical source distribution (including the
> > > > Dockerfile), to be deployed to the release repository at
> > dist.apache.org
> > > > b) Kubernetes Operator Helm Chart to be deployed to the release
> > > repository
> > > > at dist.apache.org
> > > > c) Maven artifacts to be deployed to the Maven Central Repository
> > > > d) Docker image to be pushed to dockerhub
> > > >
> > > > **Staging Areas to Review**
> > > >
> > > > The staging areas containing the above mentioned artifacts are as
> > > follows,
> > > > for your review:
> > > > * All artifacts for a,b) can be found in the corresponding dev
> > repository
> > > > at dist.apache.org [1]
> > > > * All artifacts for c) can be found at the Apache Nexus Repository [2]
> > > > * The docker image for d) is staged on github [3]
> > > >
> > > > All artifacts are signed with the key 21F06303B87DAFF1 [4]
> > > >
> > > > Other links for your review:
> > > > * JIRA release notes [5]
> > > > * source code tag "release-1.6.0-rc2" [6]
> > > > * PR to update the website Downloads page to
> > > > include Kubernetes Operator links [7]
> > > >
> > > > **Vote Duration**
> > > >
> > > > The voting time will run for at least 72 hours.
> > > > It is adopted by majority approval, with at least 3 PMC affirmative
> > > votes.
> > > >
> > > >
> > > > **Note on Verification**
> > > >
> > > > You can follow the basic verification guide here[8].
> > > > Note that you don't need to verify everything yourself, but please make
> > > > note of what you have tested together with your +- vote.
> > > >
> > > > Cheers!
> > > > Gyula Fora
> > > >
> > > > [1]
> > > >
> > >
> > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.6.0-rc2/
> > > > [2]
> > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1649/
> > > > [3] ghcr.io/apache/flink-kubernetes-operator:ebb1fed
> > > > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > [5]
> > > >
> > >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353230
> > > > [6]
> > > >
> > >
> > https://github.com/apache/flink-kubernetes-operator/tree/release-1.6.0-rc2
> > > > [7] https://github.com/apache/flink-web/pull/666
> > > > [8]
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Kubernetes+Operator+Release
> > > >
> > >
> >


Re: [DISCUSS] FLIP-361: Improve GC Metrics

2023-09-05 Thread Maximilian Michels
Hi Gyula,

+1 The proposed changes make sense and are in line with what is
available for other metrics, e.g. number of records processed.

-Max

On Tue, Sep 5, 2023 at 2:43 PM Gyula Fóra  wrote:
>
> Hi Devs,
>
> I would like to start a discussion on FLIP-361: Improve GC Metrics [1].
>
> The current Flink GC metrics [2] are not very useful for monitoring
> purposes as they require post-processing logic that is also dependent on
> the current runtime environment.
>
> Problems:
>  - Total time is not very relevant for long running applications, only the
> rate of change (msPerSec)
>  - In most cases it's best to simply aggregate the time/count across the
> different GarbageCollectors; however, the specific collectors depend
> on the current Java runtime
>
> We propose to improve the current situation by:
>  - Exposing rate metrics per GarbageCollector
>  - Exposing aggregated Total time/count/rate metrics
>
> These new metrics are all derived from the existing ones with minimal
> overhead.
>
> Looking forward to your feedback.
>
> Cheers,
> Gyula
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-361%3A+Improve+GC+Metrics
> [2]
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/#garbagecollection
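
The proposal derives rates from the totals the JVM already exposes; a minimal, self-contained sketch of that derivation (class and field names are illustrative, not Flink's actual implementation):

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    /** Illustrative only: aggregates GC time/count across collectors and derives rates. */
    public class GcRateSketch {
        private long lastTimeMs;
        private long lastCount;
        private long lastUpdateNanos = System.nanoTime();

        /** Call periodically; returns {msPerSec, collectionsPerSec} since the last call. */
        public double[] update() {
            long timeMs = 0;
            long count = 0;
            // Aggregating across all beans keeps the result independent of
            // which GarbageCollectors the current Java runtime happens to use.
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                timeMs += Math.max(gc.getCollectionTime(), 0);
                count += Math.max(gc.getCollectionCount(), 0);
            }
            long now = System.nanoTime();
            double seconds = (now - lastUpdateNanos) / 1e9;
            double[] rates = {(timeMs - lastTimeMs) / seconds, (count - lastCount) / seconds};
            lastTimeMs = timeMs;
            lastCount = count;
            lastUpdateNanos = now;
            return rates;
        }
    }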


Re: [DISCUSS] FLIP-334 : Decoupling autoscaler and kubernetes

2023-09-05 Thread Maximilian Michels
Thanks Rui for the update!

Alongside with the refactoring to decouple autoscaler logic from the
deployment logic, are we planning to add an alternative implementation
against the new interfaces? I think the best way to get the interfaces
right, is to have an alternative implementation in addition to
Kubernetes. YARN or a standalone mode implementation were already
mentioned. Ultimately, this is the reason we are doing the
refactoring. Without a new implementation, it becomes harder to
justify the refactoring work.

Cheers,
Max

On Tue, Sep 5, 2023 at 9:48 AM Rui Fan  wrote:
>
> After discussing this FLIP-334[1] offline with Gyula and Max,
> I updated the FLIP based on the latest conclusion.
>
> Big thanks to Gyula and Max for their professional advice!
>
> > Does the interface function of handlerRecommendedParallelism
> > in AutoScalerEventHandler conflict with
> > handlerScalingFailure/handlerScalingReport (one of them
> > handles the event of scaling failure, and the other handles
> > the event of scaling success).
> Hi Matt,
>
> You can take a look at the FLIP, I think the issue has been fixed.
> Currently, we introduced the ScalingRealizer and
> AutoScalerEventHandler interface.
>
> The ScalingRealizer handles scaling actions.
>
> The AutoScalerEventHandler interface handles loggable events.
>
>
> Looking forward to your feedback, thanks!
>
> [1] https://cwiki.apache.org/confluence/x/x4qzDw
>
> Best,
> Rui
>
> On Thu, Aug 24, 2023 at 10:55 AM Matt Wang  wrote:
>>
>> Sorry for the late reply, I still have a small question here:
>> Does the interface function of handlerRecommendedParallelism
>> in AutoScalerEventHandler conflict with
>> handlerScalingFailure/handlerScalingReport (one of them
>> handles the event of scaling failure, and the other handles
>> the event of scaling success).
>>
>>
>>
>> --
>>
>> Best,
>> Matt Wang
>>
>>
>>  Replied Message 
>> | From | Rui Fan<1996fan...@gmail.com> |
>> | Date | 08/21/2023 17:41 |
>> | To |  |
>> | Cc | Maximilian Michels ,
>> Gyula Fóra ,
>> Matt Wang |
>> | Subject | Re: [DISCUSS] FLIP-334 : Decoupling autoscaler and kubernetes |
>> Hi Max, Gyula and Matt,
>>
>> Do you have any other comments?
>>
>> The flink-kubernetes-operator 1.6 has been released recently,
>> it's a good time to kick off this FLIP.
>>
>> Please let me know if you have any questions or concerns,
>> looking forward to your feedback, thanks!
>>
>> Best,
>> Rui
>>
>> On Wed, Aug 9, 2023 at 11:55 AM Rui Fan <1996fan...@gmail.com> wrote:
>>
>> Hi Matt Wang,
>>
>> Thanks for your discussion here.
>>
>> it is recommended to unify the descriptions of AutoScalerHandler
>> and AutoScalerEventHandler in the FLIP
>>
>> Good catch, I have updated all AutoScalerHandler to
>> AutoScalerEventHandler.
>>
>> Can it support the use of ZooKeeper (ZooKeeper is a relatively
>> common choice for Flink HA)?
>>
>> In my opinion, it's a good suggestion. However, I prefer we
>> implement other state stores in the other FLINK JIRA, and
>> this FLIP focus on the decoupling and implementing the
>> necessary state store. Does that make sense?
>>
>> Regarding the scaling information, can it be persisted in
>> the shared file system through the filesystem? I think it will
>> be a more valuable requirement to support viewing
>> Autoscaling info on the UI in the future, which can provide
>> some foundations in advance;
>>
>> This is a good suggestion as well. It's useful for users to check
>> the scaling information. I propose to add a CompositeEventHandler,
>> it can include multiple EventHandlers.
>>
>> However, as the last question, I prefer we implement other
>> event handler in the other FLINK JIRA. What do you think?
>>
>> A solution mentioned in FLIP is to initialize the
>> AutoScalerEventHandler object every time an event is
>> processed.
>>
>> No, the FLIP mentioned `The AutoScalerEventHandler  object is shared for
>> all flink jobs`,
>> So the AutoScalerEventHandler is only initialized once.
>>
>> And we call the AutoScalerEventHandler#handlerXXX
>> every time an event is processed.
>>
>> Best,
>> Rui
>>
>> On Tue, Aug 8, 2023 at 9:40 PM Matt Wang  wrote:
>>
>> Hi Rui
>>
>> Thanks for driving the FLIP.
>>
>> I agree with the point of this FLIP. This FLIP first provides a
>> general function of Autoscaler in Flink repo, and there is no
>
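
To make the decoupling concrete, here is a rough sketch of the two interfaces discussed in this thread; the type parameters and method shapes are assumptions inferred from the discussion, not FLIP-334's exact signatures:

    import java.util.Map;

    /** Sketch only: applies the parallelism decided by the autoscaler to a job. */
    interface ScalingRealizer<Context> {
        void realize(Context context, Map<String, String> parallelismOverrides);
    }

    /** Sketch only: receives the autoscaler's loggable events. */
    interface AutoScalerEventHandler<Context> {
        enum Type { Normal, Warning }

        void handleEvent(Context context, Type type, String reason, String message);
    }

A Kubernetes implementation would turn these events into Kubernetes events and the overrides into spec changes, while a YARN or standalone implementation could log the events and apply the overrides through Flink's REST API.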

Re: [DISSCUSS] Kubernetes Operator Flink Version Support Policy

2023-09-05 Thread Maximilian Michels
+1 Sounds good! Four releases give a decent amount of time to migrate
to the next Flink version.

On Tue, Sep 5, 2023 at 5:33 PM Őrhidi Mátyás  wrote:
>
> +1
>
> On Tue, Sep 5, 2023 at 8:03 AM Thomas Weise  wrote:
>
> > +1, thanks for the proposal
> >
> > On Tue, Sep 5, 2023 at 8:13 AM Gyula Fóra  wrote:
> >
> > > Hi All!
> > >
> > > @Maximilian Michels  has raised the question of Flink
> > > version support in the operator before the last release. I would like to
> > > open this discussion publicly so we can finalize this before the next
> > > release.
> > >
> > > Background:
> > > Currently the Flink Operator supports all Flink versions since Flink 1.13.
> > > While this is great for the users, it introduces a lot of backward
> > > compatibility-related code in the operator logic and also adds considerable
> > > time to the CI. We should strike a reasonable balance here that allows us
> > > to move forward and eliminate some of this tech debt.
> > >
> > > In the current model it is also impossible to support all features for all
> > > Flink versions, which leads to some confusion over time.
> > >
> > > Proposal:
> > > Since it's a key feature of the kubernetes operator to support several
> > > versions at the same time, I propose to support the last 4 stable Flink
> > > minor versions. Currently this would mean supporting Flink 1.14-1.17 (and
> > > dropping 1.13 support). When Flink 1.18 is released we would drop 1.14
> > > support, and so on. Given the Flink release cadence this means about 2
> > > years of support for each Flink version.
> > >
> > > What do you think?
> > >
> > > Cheers,
> > > Gyula
> > >
> >


Re: [DISCUSS] FLIP-334 : Decoupling autoscaler and kubernetes

2023-09-06 Thread Maximilian Michels
Hey Rui, hey Samrat,

I want to ensure this is not just an exercise but has actual benefits
for the community. In the past, I've seen the effort stop halfway
through: the refactoring gets done, with some regressions, but
actual alternative implementations based on the new design never
follow.

We need to go through these phases for the FLIP to be meaningful:

1. Decouple autoscaler from current autoscaler (generalization)
2. Ensure 100% functionality and test coverage of Kubernetes implementation
3. Interface with another backend (e.g. YARN or standalone)

If we don't follow through with this plan, I'm not sure we are better
off than with the current implementation. Apologies if I'm being a bit
strict here but the autoscaling code has become a critical
infrastructure component. We need to carefully weigh the pros and cons
here to avoid risks for our users, some of them using this code in
production and relying on it on a day to day basis.

That said, we are open to following through with the FLIP and we can
definitely help review code changes and build on the new design.

-Max


On Wed, Sep 6, 2023 at 11:26 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> Hi Max,
>
> As the FLIP mentioned, we have the plan to add the
> alternative implementation.
>
> First of all, we will develop a generic autoscaler. This generic
> autoscaler will not have knowledge of specific jobs, and users
> will have the flexibility to pass the JobAutoScalerContext
> when utilizing the generic autoscaler. Communication with
> Flink jobs can be achieved through the RestClusterClient.
>
>- The generic ScalingRealizer is based on the rescale API (FLIP-291).
>- The generic EventHandler is based on the logger.
>- The generic StateStore is based on the Heap. This means that the state
>information is stored in memory and can be lost if the autoscaler restarts.
>
>
> Secondly, for the YARN implementation, as Samrat mentioned,
> there is currently no flink-yarn-operator, and we cannot
> easily obtain the job list, so we are not yet sure how to manage
> YARN Flink jobs. To prevent the FLIP from growing too large,
> after confirming with Gyula and Samrat, we decided
> that the current FLIP will not implement the automated
> YARN autoscaler; that will be a separate FLIP in the future.
>
>
> After this part is finished, Flink users or other Flink platforms can easily
> use the autoscaler: they just pass the Context, and the autoscaler
> can find the Flink job using the RestClient.
>
> The first part will be done in this FLIP. And we can discuss
> whether the second part should be done in this FLIP as well.
>
> Best,
> Rui
>
> On Wed, Sep 6, 2023 at 4:34 AM Samrat Deb  wrote:
>
> > Hi Max,
> >
> > > are we planning to add an alternative implementation
> > against the new interfaces?
> >
> > Yes, we are simultaneously working on the YARN implementation using the
> > interface. During the initial interface design, we encountered some
> > anomalies while implementing it in YARN.
> >
> > Once the interfaces are finalized, we will proceed to raise a pull request
> > (PR) for YARN as well.
> >
> > Our initial approach was to create a decoupled interface as part of
> > FLIP-334 and then implement it for YARN in the subsequent phase.
> > However, if you recommend combining both phases, we can certainly consider
> > that option.
> >
> > We look forward to hearing your thoughts on whether to have the YARN
> > implementation as part of FLIP-334 or as a separate one.
> >
> > Bests
> > Samrat
> >
> >
> >
> > On Tue, Sep 5, 2023 at 8:41 PM Maximilian Michels  wrote:
> >
> > > Thanks Rui for the update!
> > >
> > > Alongside with the refactoring to decouple autoscaler logic from the
> > > deployment logic, are we planning to add an alternative implementation
> > > against the new interfaces? I think the best way to get the interfaces
> > > right, is to have an alternative implementation in addition to
> > > Kubernetes. YARN or a standalone mode implementation were already
> > > mentioned. Ultimately, this is the reason we are doing the
> > > refactoring. Without a new implementation, it becomes harder to
> > > justify the refactoring work.
> > >
> > > Cheers,
> > > Max
> > >
> > > On Tue, Sep 5, 2023 at 9:48 AM Rui Fan  wrote:
> > > >
> > > > After discussing this FLIP-334[1] offline with Gyula and Max,
> > > > I updated the FLIP based on the latest conclusion.
> > > >
> > > > Big thanks to Gyula and Max for their professional advice!

Re: [VOTE] FLIP-334: Decoupling autoscaler and kubernetes and support the Standalone Autoscaler

2023-09-13 Thread Maximilian Michels
+1 (binding)

On Wed, Sep 13, 2023 at 12:28 PM Gyula Fóra  wrote:
>
> +1 (binding)
>
> Gyula
>
> On Wed, 13 Sep 2023 at 09:33, Matt Wang  wrote:
>
> > Thank you for driving this FLIP,
> >
> > +1 (non-binding)
> >
> >
> > --
> >
> > Best,
> > Matt Wang
> >
> >
> >  Replied Message 
> > | From | ConradJam |
> > | Date | 09/13/2023 15:28 |
> > | To |  |
> > | Subject | Re: [VOTE] FLIP-334: Decoupling autoscaler and kubernetes and
> > support the Standalone Autoscaler |
> > best idea
> > +1 (non-binding)
> >
> > Ahmed Hamdy wrote on Wed, Sep 13, 2023 at 15:23:
> >
> > Hi Rui,
> > I have gone through the thread.
> > +1 (non-binding)
> >
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Wed, 13 Sept 2023 at 03:53, Rui Fan <1996fan...@gmail.com> wrote:
> >
> > Hi all,
> >
> > Thanks for all the feedback about the FLIP-334:
> > Decoupling autoscaler and kubernetes and
> > support the Standalone Autoscaler[1].
> > This FLIP was discussed in [2].
> >
> > I'd like to start a vote for it. The vote will be open for at least 72
> > hours (until Sep 16th 11:00 UTC+8) unless there is an objection or
> > insufficient votes.
> >
> > [1]
> >
> >
> >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-334+%3A+Decoupling+autoscaler+and+kubernetes+and+support+the+Standalone+Autoscaler
> > [2] https://lists.apache.org/thread/kmm03gls1vw4x6vk1ypr9ny9q9522495
> >
> > Best,
> > Rui
> >
> >
> >
> >
> > --
> > Best
> >
> > ConradJam
> >


Re: [VOTE] FLIP-361: Improve GC Metrics

2023-09-14 Thread Maximilian Michels
+1 (binding)

On Thu, Sep 14, 2023 at 4:26 AM Venkatakrishnan Sowrirajan
 wrote:
>
> +1 (non-binding)
>
> On Wed, Sep 13, 2023, 6:55 PM Matt Wang  wrote:
>
> > +1 (non-binding)
> >
> >
> > Thanks for driving this FLIP
> >
> >
> >
> >
> > --
> >
> > Best,
> > Matt Wang
> >
> >
> >  Replied Message 
> > | From | Xintong Song |
> > | Date | 09/14/2023 09:54 |
> > | To |  |
> > | Subject | Re: [VOTE] FLIP-361: Improve GC Metrics |
> > +1 (binding)
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Thu, Sep 14, 2023 at 2:40 AM Samrat Deb  wrote:
> >
> > +1 ( non binding)
> >
> > These improved GC metrics will be a great addition.
> >
> > Bests,
> > Samrat
> >
> > On Wed, 13 Sep 2023 at 7:58 PM, ConradJam  wrote:
> >
> > +1 (non-binding)
> > gc metrics help with autoscale tuning features
> >
> > Chen Zhanghao wrote on Wed, Sep 13, 2023 at 22:16:
> >
> > +1 (non-binding). Looking forward to it
> >
> > Best,
> > Zhanghao Chen
> > 
> > From: Gyula Fóra
> > Sent: September 13, 2023 21:16
> > To: dev
> > Subject: [VOTE] FLIP-361: Improve GC Metrics
> >
> > Hi All!
> >
> > Thanks for all the feedback on FLIP-361: Improve GC Metrics [1][2]
> >
> > I'd like to start a vote for it. The vote will be open for at least 72
> > hours unless there is an objection or insufficient votes.
> >
> > Cheers,
> > Gyula
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-361%3A+Improve+GC+Metrics
> > [2]
> > https://lists.apache.org/thread/qqqv54vyr4gbp63wm2d12q78m8h95xb2
> >
> >
> >
> > --
> > Best
> >
> > ConradJam
> >
> >
> >


Re: [DISCUSS] FLIP-364: Improve the restart-strategy

2023-10-19 Thread Maximilian Michels
Hey Rui,

+1 for making exponential backoff the default. I agree with Konstantin
that retrying forever is a good default for exponential backoff
because oftentimes the issue will resolve eventually. The purpose of
exponential backoff is precisely to continue to retry without causing
too much load. However, I'm not against adding an optional max number
of retries.

-Max

On Thu, Oct 19, 2023 at 11:35 AM Konstantin Knauf  wrote:
>
> Hi Rui,
>
> Thank you for this proposal and working on this. I also agree that
> exponential back off makes sense as a new default in general. I think
> restarting indefinitely (no max attempts) makes sense by default, though,
> but of course allowing users to change is valuable.
>
> So, overall +1.
>
> Cheers,
>
> Konstantin
>
> On Tue, Oct 17, 2023 at 07:11, Rui Fan <1996fan...@gmail.com> wrote:
>
> > Hi all,
> >
> > I would like to start a discussion on FLIP-364: Improve the
> > restart-strategy[1]
> >
> > As we know, the restart-strategy is critical for Flink jobs; it mainly
> > has two functions:
> > 1. When an exception occurs in the flink job, quickly restart the job
> > so that the job can return to the running state.
> > 2. When a job cannot be recovered after frequent restarts within
> > a certain period of time, Flink will not retry but will fail the job.
> >
> > The current restart-strategy support for function 2 has some issues:
> > 1. The exponential-delay doesn't have the max attempts mechanism,
> > which means that Flink will restart indefinitely even if the job fails frequently.
> > 2. For multi-region streaming jobs and all batch jobs, the failure of
> > each region will increase the total number of job failures by +1,
> > even if these failures occur at the same time. If the number of
> > failures increases too quickly, it will be difficult to set a reasonable
> > number of retries.
> > If the maximum number of failures is set too low, the job can easily
> > reach the retry limit, causing the job to fail. If set too high, some jobs
> > will never fail.
> >
> > In addition, when the above two problems are solved, we can also
> > discuss whether exponential-delay can replace fixed-delay as the
> > default restart-strategy. In theory, exponential-delay is smarter and
> > friendlier than fixed-delay.
> >
> > I also thank Zhu Zhu for his suggestions on the option name in
> > FLINK-32895[2] in advance.
> >
> > Looking forward to everyone's feedback and suggestions, thank
> > you.
> >
> > [1] https://cwiki.apache.org/confluence/x/uJqzDw
> > [2] https://issues.apache.org/jira/browse/FLINK-32895
> >
> > Best,
> > Rui
> >
>
>
> --
> https://twitter.com/snntrable
> https://github.com/knaufk
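
For reference, a sketch of opting in to exponential-delay programmatically; the ConfigOption constants are the ones in org.apache.flink.configuration.RestartStrategyOptions (to the best of my knowledge; double-check against the Flink version in use), and the values mirror exponential-delay's documented defaults (1s initial, 2.0 multiplier, 5 min max):

    import java.time.Duration;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.configuration.RestartStrategyOptions;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ExponentialDelaySketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set(RestartStrategyOptions.RESTART_STRATEGY, "exponential-delay");
            // Shown explicitly for clarity; these match the documented defaults.
            conf.set(RestartStrategyOptions.RESTART_STRATEGY_EXPONENTIAL_DELAY_INITIAL_BACKOFF,
                    Duration.ofSeconds(1));
            conf.set(RestartStrategyOptions.RESTART_STRATEGY_EXPONENTIAL_DELAY_MAX_BACKOFF,
                    Duration.ofMinutes(5));
            conf.set(RestartStrategyOptions.RESTART_STRATEGY_EXPONENTIAL_DELAY_BACKOFF_MULTIPLIER, 2.0);
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(conf);
            // ... define and execute the job with env ...
        }
    }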


Re: [VOTE] Apache Flink Kubernetes Operator Release 1.6.1, release candidate #1

2023-10-26 Thread Maximilian Michels
+1 (binding)

1. Downloaded the archives, checksums, and signatures
2. Verified the signatures and checksums ( gpg --recv-keys
B2D64016B940A7E0B9B72E0D7D0528B28037D8BC )
3. Extracted and inspected the source code for binaries
4. Compiled and tested the source code via mvn verify
5. Verified license files / headers
6. Deployed helm chart to test cluster
7. Ran example job
8. Tested autoscaling without rescaling API

@Rui Can you add your key to
https://dist.apache.org/repos/dist/release/flink/KEYS ?

-Max

On Thu, Oct 26, 2023 at 1:53 PM Márton Balassi  wrote:
>
> Thank you, team. @David Radley: Not having Rui's key signed is not ideal,
> but is acceptable for the release.
>
> +1 (binding)
>
> - Verified Helm repo works as expected, points to correct image tag, build,
> version
> - Verified basic examples + checked operator logs everything looks as
> expected
> - Verified hashes, signatures and source release contains no binaries
> - Ran built-in tests, built jars + docker image from source successfully
>
> Best,
> Marton
>
> On Thu, Oct 26, 2023 at 12:24 PM David Radley 
> wrote:
>
> > Hi,
> > I downloaded the artifacts.
> >
> >   *   I did an install of the operator and ran the basic sample
> >   *   I checked the checksums
> >   *   Checked the GPG signatures
> >   *   Ran the UI
> >   *   Ran a Twistlock scan
> >   *   I installed 1.6 then did a helm upgrade
> >   *   I have not managed to do the source build and subsequent install yet.
> > I wanted to check these 2 things are what you would expect:
> >
> >   1.  I followed link
> > https://github.com/apache/flink-kubernetes-operator/pkgs/container/flink-kubernetes-operator/139454270?tag=51eeae1
> > and noticed that it does not have a description. Is this correct?
> >
> >   2.  I get this from the gpg verification. Is this OK?
> >
> >
> > gpg --verify flink-kubernetes-operator-1.6.1-src.tgz.asc
> > gpg: assuming signed data in 'flink-kubernetes-operator-1.6.1-src.tgz'
> > gpg: Signature made Fri 20 Oct 2023 04:07:48 PDT
> > gpg:                using RSA key B2D64016B940A7E0B9B72E0D7D0528B28037D8BC
> > gpg: Good signature from "Rui Fan <fan...@apache.org>" [unknown]
> > gpg: WARNING: This key is not certified with a trusted signature!
> > gpg:          There is no indication that the signature belongs to the owner.
> > Primary key fingerprint: B2D6 4016 B940 A7E0 B9B7  2E0D 7D05 28B2 8037 D8BC
> >
> >
> >
> >
> > Hi Everyone,
> >
> > Please review and vote on the release candidate #1 for the version 1.6.1 of
> > Apache Flink Kubernetes Operator,
> > as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > **Release Overview**
> >
> > As an overview, the release consists of the following:
> > a) Kubernetes Operator canonical source distribution (including the
> > Dockerfile), to be deployed to the release repository at dist.apache.org
> > b) Kubernetes Operator Helm Chart to be deployed to the release repository
> > at dist.apache.org
> > c) Maven artifacts to be deployed to the Maven Central Repository
> > d) Docker image to be pushed to dockerhub
> >
> > **Staging Areas to Review**
> >
> > The staging areas containing the above mentioned artifacts are as follows,
> > for your review:
> > * All artifacts for a,b) can be found in the corresponding dev repository
> > at dist.apache.org [1]
> > * All artifacts for c) can be found at the Apache Nexus Repository [2]
> > * The docker image for d) is staged on github [3]
> >
> > All artifacts are signed with the
> > key B2D64016B940A7E0B9B72E0D7D0528B28037D8BC [4]
> >
> > Other links for your review:
> > * source code tag "release-1.6.1-rc1" [5]
> > * PR to update the website Downloads page to
> > include Kubernetes Operator links [6]
> > * PR to update the doc version of flink-kubernetes-operator[7]
> >
> > **Vote Duration**
> >
> > The voting time will run for at least 72 hours.
> > It is adopted by majority approval, with at least 3 PMC affirmative votes.
> >
> > **Note on Verification**
> >
> > You can follow the basic verification guide here[8].
> > Note that you don't need to verify everything yourself, but please make
> > note of what you have tested together with your +- vote.
> >
> > [1]
> >
> > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.6.1-rc1/
> > [2]
> > https://repository.apache.org/content/repositories/orgapacheflink-1663/
> > [3]
> >
> > https://github.com/apache/flink-kubernetes-operator/pkgs/container/flink-kubernetes-operator/139454270?tag=51eeae1
> > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [5]
> > https://github.com/apache/flink-kubernetes-operator/tree/release-1.6.1-rc1
> > [6] https://github.com/apache/flink-web/pull/690
> > [7] https://github.com/apache/flink-kubernetes-operator/pull/687
> > [8]
> >
> > https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Kubernetes+Operator+Release
> >
> > Best,
> > Rui
> >
>

Re: off for a week

2023-10-26 Thread Maximilian Michels
Have a great time off, Etienne!

On Thu, Oct 26, 2023 at 3:38 PM Etienne Chauchot  wrote:
>
> Hi,
>
> FYI, I'll be off and unresponsive for a week starting tomorrow evening.
> For ongoing work, please ping me before tomorrow evening or within a week
>
> Best
>
> Etienne


Re: [DISCUSS] Kubernetes Operator 1.7.0 release planning

2023-11-01 Thread Maximilian Michels
+1 for targeting the release as soon as possible. Given the effort
that Rui has undergone to decouple the autoscaling implementation, it
makes sense to also include an alternative implementation with the
release. In the long run, I wonder whether the standalone
implementation should even be part of the Kubernetes operator
repository. It can be hosted in a different repository and simply
consume the flink-autoscaler jar. But the same applies to the
flink-autoscaler module. For this release, we can keep everything
together.

I have a minor issue [1] I would like to include in the release.

-Max

[1] https://issues.apache.org/jira/browse/FLINK-33429

On Tue, Oct 31, 2023 at 11:14 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> Thanks Gyula for driving this release!
>
> I'd like to check with you and community, could we
> postpone the code freeze by a week?
>
> I'm developing the FLINK-33099[1], and the prod code is done.
> I need some time to develop the tests. I hope this feature is included in
> 1.7.0 for two main reasons:
>
> 1. We have completed the decoupling of the autoscaler and
> kubernetes-operator in 1.7.0. During the decoupling period, we modified
> a large number of autoscaler-related interfaces. The standalone autoscaler
> is an autoscaler process that can run independently. It can help us confirm
> whether the new interface is reasonable.
> 2. Flink 1.18.0 was recently released; the standalone autoscaler allows more
> users to try out autoscaling and in-place rescaling.
>
> I have created a draft PR[2] for FLINK-33099, it just includes prod code.
> I have run it manually, it works well. And I will try my best to finish all
> unit tests before Friday, but must finish all unit tests before next Monday
> at the latest.
>
> WDYT?
>
> I'm deeply sorry for the request to postpone the release.
>
> [1] https://issues.apache.org/jira/browse/FLINK-33099
> [2] https://github.com/apache/flink-kubernetes-operator/pull/698
>
> Best,
> Rui
>
> On Tue, Oct 31, 2023 at 4:10 PM Samrat Deb  wrote:
>
> > Thank you Gyula
> >
> > (+1 non-binding) in support of you taking on the role of release manager.
> >
> > > I think this is reasonable as I am not aware of any big features / bug
> > > fixes being worked on right now. Given the size of the changes related to
> > > the autoscaler module refactor we should try to focus the remaining time on
> > > testing.
> >
> > I completely agree with you. Since the changes are quite extensive, it's
> > crucial to allocate more time for thorough testing and verification.
> >
> > Regarding working with you for the release, I might not have the necessary
> > privileges for that.
> >
> > However, I'd be more than willing to assist with testing the changes,
> > validating various features, and checking for any potential regressions in
> > the flink-kubernetes-operator.
> > Just let me know how I can support the testing efforts.
> >
> > Bests,
> > Samrat
> >
> >
> > On Tue, 31 Oct 2023 at 12:59 AM, Gyula Fóra  wrote:
> >
> > > Hi all!
> > >
> > > I would like to kick off the release planning for the operator 1.7.0
> > > release. We have added quite a lot of new functionality over the last few
> > > weeks and I think the operator is in a good state to kick this off.
> > >
> > > Based on the original release schedule we had Nov 1 as the proposed feature
> > > freeze date and Nov 7 as the date for the release cut / rc1.
> > >
> > > I think this is reasonable as I am not aware of any big features / bug
> > > fixes being worked on right now. Given the size of the changes related to
> > > the autoscaler module refactor we should try to focus the remaining time on
> > > testing.
> > >
> > > I am happy to volunteer as a release manager but I am of course open to
> > > working together with someone on this.
> > >
> > > What do you think?
> > >
> > > Cheers,
> > > Gyula
> > >
> >


Re: [DISCUSS] FLIP-394: Add Metrics for Connector Agnostic Autoscaling

2023-11-17 Thread Maximilian Michels
Hi Mason,

Thank you for the proposal. This is a highly requested feature to make
the source scaling of Flink Autoscaling generic across all sources.
The current implementation handles every source individually, and if
we don't find any backlog metrics, we default to using busy time only.
At this point Kafka is the only supported source. We collect the
backlog size (pending metrics), as well as the number of available
splits / partitions.

For Kafka, we always read from all splits but I like how for the
generic interface we take note of both assigned and unassigned splits.
This allows for more flexible integration with other sources where we
might have additional splits we read from at a later point in time.

Considering Rui's point, I agree it makes sense to outline the
integration with existing sources. Other than that, +1 from my side
for the proposal.

Thanks,
Max

On Fri, Nov 17, 2023 at 4:06 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> Hi Mason,
>
> Thank you for driving this proposal!
>
> Currently, Autoscaler only supports the maximum source parallelism
> of KafkaSource. Introducing the generic metric to support it is good
> to me, +1 for this proposal.
>
> I have a question:
> You added the metric in the flink repo, and the autoscaler will fetch this
> metric. But I didn't see any connector registering this metric. Currently,
> only IteratorSourceEnumerator calls setUnassignedSplitsGauge,
> and KafkaSource doesn't register it. IIUC, if we don't do that, the autoscaler
> still cannot fetch this metric, right?
>
> If yes, I suggest this FLIP include the metric-registration part; otherwise
> these metrics still won't work.
>
> Please correct me if I misunderstood anything, thanks~
>
> Best,
> Rui
>
> On Fri, Nov 17, 2023 at 6:53 AM Mason Chen  wrote:
>
> > Hi all,
> >
> > I would like to start a discussion on FLIP-394: Add Metrics for Connector
> > Agnostic Autoscaling [1].
> >
> > This FLIP recommends adding two metrics to make autoscaling work for
> > bounded split source implementations like IcebergSource. These metrics are
> > required by the Flink Kubernetes Operator autoscaler algorithm [2] to
> > retrieve information for the backlog and the maximum source parallelism.
> > The changes would affect the `@PublicEvolving` `SplitEnumeratorMetricGroup`
> > API of the source connector framework.
> >
> > Best,
> > Mason
> >
> > [1]
> >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-394%3A+Add+Metrics+for+Connector+Agnostic+Autoscaling
> > [2]
> >
> > https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/autoscaler/#limitations
> >
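
On the connector side, the existing SplitEnumeratorMetricGroup hook mentioned above is enough to sketch what registration could look like; the surrounding helper is hypothetical, and only setUnassignedSplitsGauge is taken from the existing API:

    import java.util.ArrayDeque;
    import java.util.Queue;
    import org.apache.flink.api.connector.source.SourceSplit;
    import org.apache.flink.api.connector.source.SplitEnumeratorContext;

    /** Hypothetical helper showing where a split enumerator would register the gauge. */
    public class BacklogMetricsSketch {
        public static <T extends SourceSplit> Queue<T> trackUnassignedSplits(
                SplitEnumeratorContext<T> context) {
            Queue<T> unassigned = new ArrayDeque<>();
            // The autoscaler can read this gauge to bound the maximum useful
            // source parallelism (no point scaling beyond the split count).
            context.metricGroup().setUnassignedSplitsGauge(() -> (long) unassigned.size());
            return unassigned;
        }
    }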


Re: [VOTE] Apache Flink Kubernetes Operator Release 1.7.0, release candidate #1

2023-11-20 Thread Maximilian Michels
+1 (binding)

1. Downloaded the archives, checksums, and signatures
2. Verified the signatures and checksums
3. Extracted and inspected the source code for binaries
4. Compiled and tested the source code via mvn verify
5. Verified license files / headers
6. Deployed helm chart to test cluster
7. Build and ran dynamic autoscaling example image
8. Tested autoscaling without rescaling API

Hit a non-fatal error collecting metrics in the stabilization phase
(this is a new feature), not a release blocker though [1].

-Max

[1] Caused by: org.apache.flink.runtime.rest.util.RestClientException:
[org.apache.flink.runtime.rest.handler.RestHandlerException: Cannot
connect to ResourceManager right now. Please try to refresh.
    at org.apache.flink.runtime.rest.handler.resourcemanager.AbstractResourceManagerHandler.lambda$getResourceManagerGateway$0(AbstractResourceManagerHandler.java:91)
    at java.base/java.util.Optional.orElseThrow(Unknown Source)
    at org.apache.flink.runtime.rest.handler.resourcemanager.AbstractResourceManagerHandler.getResourceManagerGateway(AbstractResourceManagerHandler.java:89)
    ...

On Mon, Nov 20, 2023 at 5:48 PM Márton Balassi  wrote:
>
> +1 (binding)
>
> - Verified Helm repo works as expected, points to correct image tag, build,
> version
> - Verified basic examples + checked operator logs everything looks as
> expected
> - Verified hashes, signatures and source release contains no binaries
> - Ran built-in tests, built jars + docker image from source successfully
> - Upgraded the operator and the CRD from 1.6.1 to 1.7.0
>
> Best,
> Marton
>
> On Mon, Nov 20, 2023 at 2:03 PM Gyula Fóra  wrote:
>
> > +1 (binding)
> >
> > Verified:
> >  - Release files, maven repo contents, checksums, signature
> >  - Verified and installed from Helm chart
> >  - Ran basic stateful example and verified
> >- Upgrade flow
> >- No errors in logs
> >- Autoscaler (turn on/off, verify configmap cleared correctly)
> >- In-place scaling with 1.18 and adaptive scheduler
> >  - Built from source with Java 11 & 17
> >  - Checked release notes
> >
> > Cheers,
> > Gyula
> >
> > On Fri, Nov 17, 2023 at 1:59 PM Rui Fan <1996fan...@gmail.com> wrote:
> >
> > > +1(non-binding)
> > >
> > > - Downloaded artifacts from dist
> > > - Verified SHA512 checksums
> > > - Verified GPG signatures
> > > - Build the source with java-11 and java-17
> > > - Verified the license header
> > > - Verified that chart and appVersion matches the target release
> > > - RC repo works as Helm rep(helm repo add flink-operator-repo-1.7.0-rc1
> > >
> > >
> > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.7.0-rc1/
> > > )
> > > - Verified Helm chart can be installed  (helm install
> > > flink-kubernetes-operator
> > > flink-operator-repo-1.7.0-rc1/flink-kubernetes-operator --set
> > > webhook.create=false)
> > > - Submitted the autoscaling demo, the autoscaler works well with rescale
> > > api (kubectl apply -f autoscaling.yaml)
> > > - Download Autoscaler standalone: wget
> > >
> > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1672/org/apache/flink/flink-autoscaler-standalone/1.7.0/flink-autoscaler-standalone-1.7.0.jar
> > > - Ran Autoscaler standalone locally, it works well with rescale api
> > >
> > > Best,
> > > Rui
> > >
> > > On Fri, Nov 17, 2023 at 2:45 AM Mate Czagany  wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > - Checked signatures, checksums
> > > > - No binaries found in the source release
> > > > - Verified all source files contain the license header
> > > > - All pom files point to the correct version
> > > > - Verified Helm chart version and appVersion
> > > > - Verified Docker image tag
> > > > - Ran flink-autoscaler-standalone JAR downloaded from the maven
> > > repository
> > > > - Tested autoscaler upscales correctly on load with Flink 1.18
> > rescaling
> > > > API
> > > >
> > > > Thanks,
> > > > Mate
> > > >
> > > > Gyula Fóra wrote (on Wed, Nov 15, 2023, 16:37):
> > > >
> > > > > Hi Everyone,
> > > > >
> > > > > Please review and vote on the release candidate #1 for the version
> > > 1.7.0
> > > > of
> > > > > Apache Flink Kubernetes Operator,
> > > > > as follows:
> > > > > [ ] +1, Approve the release
> > > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > > >
> > > > > **Release Overview**
> > > > >
> > > > > As an overview, the release consists of the following:
> > > > > a) Kubernetes Operator canonical source distribution (including the
> > > > > Dockerfile), to be deployed to the release repository at
> > > dist.apache.org
> > > > > b) Kubernetes Operator Helm Chart to be deployed to the release
> > > > repository
> > > > > at dist.apache.org
> > > > > c) Maven artifacts to be deployed to the Maven Central Repository
> > > > > d) Docker image to be pushed to dockerhub
> > > > >
> > > > > **Staging Areas to Review**
> > > > >
> > > > > The staging areas containing the above mentioned artifact

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-24 Thread Maximilian Michels
Thanks for reviving the efforts here Matthias! +1 for the transition
to GitHub Actions.

As for ASF Infra Jenkins, it works fine. Jenkins is extremely
feature-rich. Not sure about the spare capacity though. I know that
for Apache Beam, Google donated a bunch of servers to get additional
build capacity.

-Max


On Thu, Nov 23, 2023 at 10:30 AM Matthias Pohl
 wrote:
>
> Btw. even though we've been focusing on GitHub Actions with this FLIP, I'm
> curious whether somebody has experience with Apache Infra's Jenkins
> deployment. The discussion I found about Jenkins [1] is quite out-dated
> (2014). I haven't worked with it myself but could imagine that there are
> some features provided through plugins which are missing in GitHub Actions.
>
> [1] https://lists.apache.org/thread/vs81xdhn3q777r7x9k7wd4dyl9kvoqn4
>
> On Tue, Nov 21, 2023 at 4:19 PM Matthias Pohl 
> wrote:
>
> > That's a valid point. I updated the FLIP accordingly:
> >
> >> Currently, the secrets (e.g. for S3 access tokens) are maintained by
> >> certain PMC members with access to the corresponding configuration in the
> >> Azure CI project. This responsibility will be moved to Apache Infra. They
> >> are in charge of handling secrets in the Apache organization. As a
> >> consequence, updating secrets is becoming a bit more complicated. This can
> >> be still considered an improvement from a legal standpoint because the
> >> responsibility is transferred from an individual company (i.e. Ververica
> >> who's the maintainer of the Azure CI project) to the Apache Foundation.
> >
> >
> > On Tue, Nov 21, 2023 at 3:37 PM Martijn Visser 
> > wrote:
> >
> >> Hi Matthias,
> >>
> >> Thanks for the write-up and for the efforts on this. I really hope
> >> that we can move away from Azure towards GHA for a better integration
> >> as well (directly seeing if a PR can be merged due to CI passing for
> >> example).
> >>
> >> The one thing I'm missing in the FLIP is how we would setup the
> >> secrets for the nightly runs (for the S3 tests, potential tests with
> >> external services etc). My guess is we need to provide the secret to
> >> ASF Infra and then we would be able to refer to them in a pipeline?
> >>
> >> Best regards,
> >>
> >> Martijn
> >>
> >> On Tue, Nov 21, 2023 at 3:05 PM Matthias Pohl
> >>  wrote:
> >> >
> >> > I realized that I mixed up FLIP IDs. FLIP-395 is already reserved [1]. I
> >> > switched to FLIP-396 [2] for the sake of consistency. 8)
> >> >
> >> > [1] https://lists.apache.org/thread/wjd3nbvg6nt93lb0sd52f0lzls6559tv
> >> > [2]
> >> >
> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Migration+to+GitHub+Actions
> >> >
> >> > On Tue, Nov 21, 2023 at 2:58 PM Matthias Pohl 
> >> > wrote:
> >> >
> >> > > Hi everyone,
> >> > >
> >> > > The Flink community discussed migrating from Azure CI to GitHub
> >> Actions
> >> > > quite some time ago [1]. The efforts around that stalled due to
> >> limitations
> >> > > around self-hosted runner support from Apache Infra’s side. There
> >> were some
> >> > > recent developments on that topic. Apache Infra is experimenting with
> >> > > ephemeral runners now which might enable us to move ahead with GitHub
> >> > > Actions.
> >> > >
> >> > > The goal is to join the trial phase for ephemeral runners and
> >> experiment
> >> > > with our CI workflows in terms of stability and performance. At the
> >> end we
> >> > > can decide whether we want to abandon Azure CI and move to GitHub
> >> Actions
> >> > > or stick to the former one.
> >> > >
> >> > > Nico Weidner and Chesnay laid the groundwork on this topic in the
> >> past. I
> >> > > picked up the work they did and continued experimenting with it in my
> >> own
> >> > > fork XComp/flink [2] the past few weeks. The workflows are in a state
> >> where
> >> > > I think that we start moving the relevant code into Flink’s
> >> repository.
> >> > > Example runs for the basic workflow [3] and the extended (nightly)
> >> workflow
> >> > > [4] are provided.
> >> > >
> >> > > This will bring a few more changes to the Flink contributors. That is
> >> why
> >> > > I wanted to bring this discussion to the mailing list first. I did a
> >> write
> >> > > up on (hopefully) all related topics in FLIP-395 [5].
> >> > >
> >> > > I’m looking forward to your feedback.
> >> > >
> >> > > Matthias
> >> > >
> >> > > [1] https://lists.apache.org/thread/vcyx2nx0mhklqwm827vgykv8pc54gg3k
> >> > >
> >> > > [2] https://github.com/XComp/flink/actions
> >> > >
> >> > > [3] https://github.com/XComp/flink/actions/runs/6926309782
> >> > >
> >> > > [4] https://github.com/XComp/flink/actions/runs/6927443941
> >> > >
> >> > > [5]
> >> > >
> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-395%3A+Migration+to+GitHub+Actions
> >> > >
> >> > >
> >> > > --
> >> > >
> >> > > [image: Aiven] 
> >> > >
> >> > > *Matthias Pohl*
> >> > > Opensource Software Engineer, *Aiven*
> >> > > matthias.p...@aiven.io|  +49 170 9869525
> >> > > aiven.io 

Re: [VOTE] FLIP-364: Improve the restart-strategy

2023-11-30 Thread Maximilian Michels
+1 (binding)

-Max

On Thu, Nov 30, 2023 at 9:15 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> +1(binding)
>
> Best,
> Rui
>
> On Mon, Nov 13, 2023 at 11:01 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> > Hi everyone,
> >
> > Thank you to everyone for the feedback on FLIP-364: Improve the
> > restart-strategy[1]
> > which has been discussed in this thread [2].
> >
> > I would like to start a vote for it. The vote will be open for at least 72
> > hours unless there is an objection or not enough votes.
> >
> > [1] https://cwiki.apache.org/confluence/x/uJqzDw
> > [2] https://lists.apache.org/thread/5cgrft73kgkzkgjozf9zfk0w2oj7rjym
> >
> > Best,
> > Rui
> >


Re: [ANNOUNCE] Apache Flink 1.16.3 released

2023-11-30 Thread Maximilian Michels
Thank you Rui for driving this!

On Thu, Nov 30, 2023 at 3:01 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> The Apache Flink community is very happy to announce the release of
> Apache Flink 1.16.3, which is the
> third bugfix release for the Apache Flink 1.16 series.
>
>
>
> Apache Flink® is an open-source stream processing framework for
> distributed, high-performing, always-available, and accurate data
> streaming applications.
>
>
>
> The release is available for download at:
>
> https://flink.apache.org/downloads.html
>
>
>
> Please check out the release blog post for an overview of the
> improvements for this bugfix release:
>
> https://flink.apache.org/2023/11/29/apache-flink-1.16.3-release-announcement/
>
>
>
> The full release notes are available in Jira:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353259
>
>
>
> We would like to thank all contributors of the Apache Flink community
> who made this release possible!
>
>
>
> Feel free to reach out to the release managers (or respond to this
> thread) with feedback on the release process. Our goal is to
> constantly improve the release process. Feedback on what could be
> improved or things that didn't go so well are appreciated.
>
>
>
> Regards,
>
> Release Manager


Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-07 Thread Maximilian Michels
Hey Rui,

+1 for changing the default restart strategy to exponential-delay.
This is something all users eventually run into. They end up changing
the restart strategy to exponential-delay. I think the current
defaults are quite balanced. Restarts happen quickly enough unless
there are consecutive failures where I think it makes sense to double
the waiting time up till the max.

-Max


On Wed, Dec 6, 2023 at 12:51 AM Mason Chen  wrote:
>
> Hi Rui,
>
> Sorry for the late reply. I was suggesting that perhaps we could do some
> testing with Kubernetes wrt configuring values for the exponential restart
> strategy. We've noticed that the default strategy in 1.17 caused a lot of
> requests to the K8s API server for unstable deployments.
>
> However, people in different Kubernetes setups will have different limits
> so it would be challenging to provide a general benchmark. Another thing I
> found helpful in the past is to refer to Kubernetes--for example, the
> default strategy is exponential for pod restarts and we could draw
> inspiration from what they have set as a general purpose default config.
>
> Best,
> Mason
>
> On Sun, Nov 19, 2023 at 9:43 PM Rui Fan <1996fan...@gmail.com> wrote:
>
> > Hi David and Mason,
> >
> > Thanks for your feedback!
> >
> > To David:
> >
> > > Given that the new default feels more complex than the current behavior,
> > if we decide to do this I think it will be important to include the
> > rationale you've shared in the documentation.
> >
> > > That makes sense to me; I will add the related doc if we
> > update the default strategy.
> >
> > To Mason:
> >
> > > I suppose we could do some benchmarking on what works well for the
> > resource providers that Flink relies on e.g. Kubernetes. Based on
> > conferences and blogs,
> > > it seems most people are relying on Kubernetes to deploy Flink and the
> > restart strategy has a large dependency on how well Kubernetes can scale to
> > requests to redeploy the job.
> >
> > Sorry, I didn't understand what type of benchmarking
> > we should do, could you elaborate on it? Thanks a lot.
> >
> > Best,
> > Rui
> >
> > On Sat, Nov 18, 2023 at 3:32 AM Mason Chen  wrote:
> >
> >> Hi Rui,
> >>
> >> I suppose we could do some benchmarking on what works well for the
> >> resource providers that Flink relies on e.g. Kubernetes. Based on
> >> conferences and blogs, it seems most people are relying on Kubernetes to
> >> deploy Flink and the restart strategy has a large dependency on how well
> >> Kubernetes can scale to requests to redeploy the job.
> >>
> >> Best,
> >> Mason
> >>
> >> On Fri, Nov 17, 2023 at 10:07 AM David Anderson 
> >> wrote:
> >>
> >>> Rui,
> >>>
> >>> I don't have any direct experience with this topic, but given the
> >>> motivation you shared, the proposal makes sense to me. Given that the new
> >>> default feels more complex than the current behavior, if we decide to do
> >>> this I think it will be important to include the rationale you've shared 
> >>> in
> >>> the documentation.
> >>>
> >>> David
> >>>
> >>> On Wed, Nov 15, 2023 at 10:17 PM Rui Fan <1996fan...@gmail.com> wrote:
> >>>
>  Hi dear flink users and devs:
> 
>  FLIP-364[1] intends to make some improvements to restart-strategy
>  and discuss updating some of the default values of exponential-delay,
>  and whether exponential-delay can be used as the default
>  restart-strategy.
>  After discussing on the dev mailing list [2], we hope to collect more feedback
>  from Flink users.
> 
>  # Why does the default restart-strategy need to be updated?
> 
>  If checkpointing is enabled, the default value is fixed-delay with
>  Integer.MAX_VALUE restart attempts and '1 s' delay[3]. It means
>  the job will restart infinitely with high frequency when a job
>  continues to fail.
> 
>  When the Kafka cluster fails, a large number of flink jobs will be
>  restarted frequently. After the kafka cluster is recovered, a large
>  number of high-frequency restarts of flink jobs may cause the
>  kafka cluster to avalanche again.
> 
>  Considering the exponential-delay as the default strategy with
>  a couple of reasons:
> 
>  - The exponential-delay can reduce the restart frequency when
>    a job continues to fail.
>  - It can restart a job quickly when a job fails occasionally.
>  - The restart-strategy.exponential-delay.jitter-factor can avoid
>    restarting multiple jobs at the same time. It’s useful to prevent
>    avalanches.
> 
>  # What are the current default values[4] of exponential-delay?
> 
>  restart-strategy.exponential-delay.initial-backoff : 1s
>  restart-strategy.exponential-delay.backoff-multiplier : 2.0
>  restart-strategy.exponential-delay.jitter-factor : 0.1
>  restart-strategy.exponential-delay.max-backoff : 5 min
>  restart-strategy.exponential-delay.reset-backoff-threshold : 1h
> 
>  backoff-multiplier=2 means 
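
Worked out with these defaults: backoff-multiplier=2.0 doubles the delay after each consecutive failure, so the sequence is 1s, 2s, 4s, 8s, ..., 256s, after which it is capped at the 5 min max-backoff; once the job runs for reset-backoff-threshold (1h) without failing, the delay resets to 1s.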

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-12 Thread Maximilian Michels
Thank you Rui! I think a 1.5 multiplier is a reasonable tradeoff
between restarting fast but not putting too much pressure on the
cluster due to restarts.

-Max

On Tue, Dec 12, 2023 at 8:19 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> Hi Maximilian and Mason,
>
> Thanks a lot for your feedback!
>
> After an offline consultation with Max, I guess I understand your
> concern for now: when flink job restarts, it will make a bunch of
> calls to the Kubernetes API, e.g. read/write to config maps, create
> task managers. Currently, the default restart strategy is fixed-delay
> with 1s delay time, so flink will restart jobs with high frequency
> even if flink jobs cannot be started. It will cause the Kubernetes
> cluster became unstable.
>
> That's why I propose changing the default restart strategy to
> exponential-delay. It can achieve: restarts happen quickly
> enough unless there are consecutive failures. It is helpful for
> the stability of external components.
>
> After discussing with Max and Zhu Zhu at the PR comment[1],
> Max suggested using 1.5 as the default value of backoff-multiplier
> instead of 1.2. The 1.2 is a little small (the delay time is too short).
> This picture[2] is the relationship between restart-attempts and
> retry-delay-time when backoff-multiplier is 1.2 and 1.5:
>
> - The delay-time will reach 1 min after 12 attempts when backoff-multiplier 
> is 1.5
> - The delay-time will reach 1 min after 24 attempts when backoff-multiplier 
> is 1.2
>
> Is there any other suggestion? Looking forward to more feedback, thanks~
>
> BTW, as Zhu said in the comment[1], if we update the default value,
> a new vote is needed for this default value. So I will pause
> FLINK-33736[1] first, and the rest of the JIRAs of FLIP-364 will be
> continued.
>
> To Mason:
>
> If I understand your concerns correctly, I still don't know how
> to benchmark. The kubernetes cluster instability only happens
> when one cluster has a lot of jobs. In general, the test cannot
> reproduce the pressure. Could you elaborate on how to
> benchmark for this?
>
> After this FLIP, the default restart frequency will be reduced
> significantly. Especially when a job fails consecutively.
> Do you think the benchmark is necessary?
>
> Looking forward to your feedback, thanks~
>
> [1] https://github.com/apache/flink/pull/23247#discussion_r1422626734
> [2] 
> https://github.com/apache/flink/assets/38427477/642c57e0-b415-4326-af05-8b506c5fbb3a
> [3] https://issues.apache.org/jira/browse/FLINK-33736
>
> Best,
> Rui
>
> On Thu, Dec 7, 2023 at 10:57 PM Maximilian Michels  wrote:
>>
>> Hey Rui,
>>
>> +1 for changing the default restart strategy to exponential-delay.
>> This is something all users eventually run into. They end up changing
>> the restart strategy to exponential-delay. I think the current
>> defaults are quite balanced. Restarts happen quickly enough unless
>> there are consecutive failures where I think it makes sense to double
>> the waiting time up till the max.
>>
>> -Max
>>
>>
>> On Wed, Dec 6, 2023 at 12:51 AM Mason Chen  wrote:
>> >
>> > Hi Rui,
>> >
>> > Sorry for the late reply. I was suggesting that perhaps we could do some
>> > testing with Kubernetes wrt configuring values for the exponential restart
>> > strategy. We've noticed that the default strategy in 1.17 caused a lot of
>> > requests to the K8s API server for unstable deployments.
>> >
>> > However, people in different Kubernetes setups will have different limits
>> > so it would be challenging to provide a general benchmark. Another thing I
>> > found helpful in the past is to refer to Kubernetes--for example, the
>> > default strategy is exponential for pod restarts and we could draw
>> > inspiration from what they have set as a general purpose default config.
>> >
>> > Best,
>> > Mason
>> >
>> > On Sun, Nov 19, 2023 at 9:43 PM Rui Fan <1996fan...@gmail.com> wrote:
>> >
>> > > Hi David and Mason,
>> > >
>> > > Thanks for your feedback!
>> > >
>> > > To David:
>> > >
>> > > > Given that the new default feels more complex than the current 
>> > > > behavior,
>> > > if we decide to do this I think it will be important to include the
>> > > rationale you've shared in the documentation.
>> > >
>> > > That makes sense to me; I will add the related docs if we
>> > > update the default strategy.
>> > >
>> > > To Mason

Re: [VOTE] FLIP-401: REST API JSON response deserialization unknown field tolerance

2023-12-12 Thread Maximilian Michels
+1 (binding)

On Tue, Dec 12, 2023 at 2:23 PM Peter Huang  wrote:
>
> +1 Non-binding
>
>
> Peter Huang
>
> Őrhidi Mátyás wrote on Tue, Dec 12, 2023 at 9:14 PM:
>
> > +1
> > Matyas
> >
> > On Mon, Dec 11, 2023 at 10:26 PM Gyula Fóra  wrote:
> >
> > > +1
> > >
> > > Gyula
> > >
> > > On Mon, Dec 11, 2023 at 1:26 PM Gabor Somogyi  > >
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > I'd like to start a vote on FLIP-401: REST API JSON response
> > > > deserialization unknown field tolerance [1] which has been discussed in
> > > > this thread [2].
> > > >
> > > > The vote will be open for at least 72 hours unless there is an
> > objection
> > > or
> > > > not enough votes.
> > > >
> > > > BR,
> > > > G
> > > >
> > > > [1]
> > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-401%3A+REST+API+JSON+response+deserialization+unknown+field+tolerance
> > > > [2] https://lists.apache.org/thread/s52w9cf60d6s10bpzv9qjczpl6m394rz
> > > >
> > >
> >
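
For context, unknown-field tolerance in a Jackson-based REST client typically
looks like the sketch below. This shows the general mechanism the FLIP's title
refers to, not the exact patch:

    import com.fasterxml.jackson.databind.DeserializationFeature;
    import com.fasterxml.jackson.databind.ObjectMapper;

    public class TolerantJsonParsing {
        // Ignore fields this client version does not know about, so responses
        // from newer or older servers still deserialize cleanly.
        private static final ObjectMapper MAPPER =
                new ObjectMapper().disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);

        static <T> T parse(String json, Class<T> type) throws Exception {
            return MAPPER.readValue(json, type);
        }
    }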


Re: [DISCUSS] Should Configuration support getting value based on String key?

2023-12-13 Thread Maximilian Michels
Hi Rui,

+1 for removing the @Deprecated annotation from `getString(String key,
String defaultValue)`. I would remove the other typed variants with
default values but I'm ok with keeping them if they are still used.

-Max

On Wed, Dec 13, 2023 at 4:59 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> Hi devs,
>
> I'd like to start a discussion on whether Configuration should support
> getting values based on a String key.
>
> In the FLIP-77[1] and FLINK-14493[2], a series of methods of Configuration
> are marked as @Deprecated, for example:
> - public String getString(String key, String defaultValue)
> - public long getLong(String key, long defaultValue)
> - public boolean getBoolean(String key, boolean defaultValue)
> - public int getInteger(String key, int defaultValue)
>
> The Javadoc suggests using getString(ConfigOption, String) or
> getOptional(ConfigOption), i.e. using a ConfigOption as the key
> instead of a String.
>
> They have been deprecated since Flink 1.10, but these methods are still
> used in a lot of code. I think getString(String key, String
> defaultValue)
> shouldn't be deprecated, for 2 reasons:
>
> 1. A lot of scenarios don't define a ConfigOption; they use
> String as the key and value directly, such as StreamConfig,
> TaskConfig, DistributedCache, etc.
>
> 2. Some code needs to convert all keys or values. This conversion
> is generic, so getString(String key, String defaultValue) is needed,
> e.g. in the kubernetes-operator [3].
>
> Based on it, I have 2 solutions:
>
> 1. Removing the @Deprecated for these methods.
>
> 2. Only removing the @Deprecated for `public String getString(String key,
> String defaultValue)`
> and deleting the other getXxx(String key, Xxx defaultValue) methods directly.
> They were deprecated 8 minor versions ago. In general,
> getString can replace getInteger, getBoolean, etc.
>
> I prefer solution1, because these getXxx methods are still in use;
> they are easy to use and don't bring large maintenance costs.
>
> Note: The alternative to public String getString(String key, String
> defaultValue)
> is Configuration.toMap, but it is much less convenient to use.
>
> Looking forward to hearing more thoughts about it! Thank you~
> Also, very much looking forward to feedback from Dawid, the author of
> FLIP-77.
>
> [1] https://cwiki.apache.org/confluence/x/_RPABw
> [2] https://issues.apache.org/jira/browse/FLINK-14493
> [3]
> https://github.com/apache/flink-kubernetes-operator/pull/729/files#r1424811105
>
> Best,
> Rui
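
As a side note, the two access styles under discussion look like this in
practice; the key name below is made up for illustration:

    import org.apache.flink.configuration.ConfigOption;
    import org.apache.flink.configuration.ConfigOptions;
    import org.apache.flink.configuration.Configuration;

    public class ConfigAccessExample {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            conf.setString("pipeline.my-custom-key", "42");

            // String-based access (the deprecated variant discussed above):
            // no ConfigOption needs to exist, which is what generic code such
            // as the kubernetes-operator relies on.
            String viaStringKey = conf.getString("pipeline.my-custom-key", "0");

            // ConfigOption-based access, as recommended by FLIP-77:
            ConfigOption<String> option =
                    ConfigOptions.key("pipeline.my-custom-key")
                            .stringType()
                            .defaultValue("0");
            String viaOption = conf.get(option);

            System.out.println(viaStringKey + " / " + viaOption);
        }
    }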


Re: [DISCUSS] Release flink-connector-parent v1.01

2023-12-21 Thread Maximilian Michels
> Anyone for pushing my pub key to apache dist ?

Done.

On Thu, Dec 21, 2023 at 2:36 PM Etienne Chauchot  wrote:
>
> Hello,
>
> All the ongoing PRs on this repo were merged. But, I'd like to leave
> some more days until feature freeze in case someone had a feature ready
> to integrate.
>
> Let's put the feature freeze at 00:00:00 UTC on December 27th.
>
> Best
>
> Etienne
>
> Le 15/12/2023 à 16:41, Ryan Skraba a écrit :
> > Hello!  I've been following this discussion (while looking and
> > building a lot of the connectors):
> >
> > +1 (non-binding) to doing a 1.1.0 release adding the configurability
> > of surefire and jvm flags.
> >
> > Thanks for driving this!
> >
> > Ryan
> >
> > On Fri, Dec 15, 2023 at 2:06 PM Etienne Chauchot  
> > wrote:
> >> Hi PMC members,
> >>
> >> Version will be 1.1.0 and not 1.0.1 as one of the PMC members already
> >> created this version tag in jira and tickets are targeted to this version.
> >>
> >> Anyone for pushing my pub key to apache dist ?
> >>
> >> Thanks
> >>
> >> Etienne
> >>
> >> Le 14/12/2023 à 17:51, Etienne Chauchot a écrit :
> >>> Hi all,
> >>>
> >>> It has been 2 weeks since the start of this release discussion. For
> >>> now only Sergey agreed to release. On a lazy consensus basis, let's
> >>> say that we leave until Monday for people to express concerns about
> >>> releasing connector-parent.
> >>>
> >>> In the meantime, I'm doing my environment setup and I'm missing the rights
> >>> to upload my GPG pub key to flink apache dist repo. Can one of the PMC
> >>> members push it ?
> >>>
> >>> Attached to this email is the updated KEYS file with my pub key added.
> >>>
> >>> Thanks
> >>>
> >>> Best
> >>>
> >>> Etienne
> >>>
> >>> Le 05/12/2023 à 16:30, Etienne Chauchot a écrit :
>  Hi Péter,
> 
>  My answers are inline
> 
> 
>  Best
> 
>  Etienne
> 
> 
>  Le 05/12/2023 à 05:27, Péter Váry a écrit :
> > Hi Etienne,
> >
> > Which branch would you cut the release from?
>  the parent_pom branch (consisting of a single maven pom file)
> > I find the flink-connector-parent branches confusing.
> >
> > If I merge a PR to the ci_utils branch, would it immediately change the 
> > CI
> > workflow of all of the connectors?
>  The ci_utils branch is basically one ci.yml workflow. _testing.yml
>  and maven test-project are both for testing the ci.yml workflow and
>  for showing connector authors what it can do.
> 
>  As the connector workflows reference ci.yml like this:
>  apache/flink-connector-shared-utils/.github/workflows/ci.yml@ci_utils,
>  merging changes to ci.yml will change CI in all the connector
>  repos.
> 
> > If I merge something to the release_utils branch, would it immediately
> > change the release process of all of the connectors?
>  I don't know how release-utils scripts are integrated with the
>  connectors' code yet
> > I would like to add the possibility of creating Python packages for the
> > connectors [1]. This would consist of some common code, which should 
> > reside
> > in flink-connector-parent, like:
> > - scripts for running Python tests - test infra. I expect that this would
> > evolve in time
> > - ci workflow - this would be more slow moving, but might change if the
> > infra is changing
> > - release scripts - this would be slow moving, but might change too.
> >
> > I think we should have a release for all of the above components, so the
> > connectors could move forward at their own pace.
> 
>  I think this is quite out of scope for this release: here we are
>  only talking about releasing a parent Maven pom file for the connectors.
> 
> > What do you think?
> >
> > Thanks,
> > Péter
> >
> > [1]https://issues.apache.org/jira/browse/FLINK-33528
> >
> > On Thu, Nov 30, 2023, 16:55 Etienne Chauchot   
> > wrote:
> >
> >> Thanks Sergey for your vote. Indeed I have listed only the PRs merged
> >> since last release but there are these 2 open PRs that could be worth
> >> reviewing/merging before release.
> >>
> >> https://github.com/apache/flink-connector-shared-utils/pull/25
> >>
> >> https://github.com/apache/flink-connector-shared-utils/pull/20
> >>
> >> Best
> >>
> >> Etienne
> >>
> >>
> >> Le 30/11/2023 à 11:12, Sergey Nuyanzin a écrit :
> >>> thanks for volunteering Etienne
> >>>
> >>> +1 for releasing
> >>> however there is one more PR to enable custom jvm flags for connectors,
> >>> in a similar way to how it is done in the Flink main repo for modules.
> >>> It will simplify support for Java 17 a bit.
> >>>
> >>> could we have this as well in the coming release?
> >>>
> >>>
> >>>
> >>> On Wed, Nov 29, 2023 at 11:40 AM Etienne 
> >>> Chauchot
> >>> wrote:
> >>>
>  Hi all,
> 
>  I would li

[ANNOUNCE] New Apache Flink Committer - Alexander Fedulov

2024-01-02 Thread Maximilian Michels
Happy New Year everyone,

I'd like to start the year off by announcing Alexander Fedulov as a
new Flink committer.

Alex has been active in the Flink community since 2019. He has
contributed more than 100 commits to Flink, its Kubernetes operator,
and various connectors [1][2].

Especially noteworthy are his contributions on deprecating and
migrating the old Source API functions and test harnesses, the
enhancement to flame graphs, the dynamic rescale time computation in
Flink Autoscaling, as well as all the small enhancements Alex has
contributed which make a huge difference.

Beyond code contributions, Alex has been an active community member
with his activity on the mailing lists [3][4], as well as various
talks and blog posts about Apache Flink [5][6].

Congratulations Alex! The Flink community is proud to have you.

Best,
The Flink PMC

[1] https://github.com/search?type=commits&q=author%3Aafedulov+org%3Aapache
[2] 
https://issues.apache.org/jira/browse/FLINK-28229?jql=status%20in%20(Resolved%2C%20Closed)%20AND%20assignee%20in%20(afedulov)%20ORDER%20BY%20resolved%20DESC%2C%20created%20DESC
[3] https://lists.apache.org/list?dev@flink.apache.org:lte=100M:Fedulov
[4] https://lists.apache.org/list?u...@flink.apache.org:lte=100M:Fedulov
[5] 
https://flink.apache.org/2020/01/15/advanced-flink-application-patterns-vol.1-case-study-of-a-fraud-detection-system/
[6] 
https://www.ververica.com/blog/presenting-our-streaming-concepts-introduction-to-flink-video-series


Re: [Discuss][Flink-31326] Flink autoscaler code

2024-01-04 Thread Maximilian Michels
We discussed in the PR that it's actually a feature, but thanks Yang
for bringing it up and improving the docs around this piece of code!

-Max

On Tue, Jan 2, 2024 at 10:06 PM Yang LI  wrote:
>
> Hello Rui,
>
> Here is the JIRA ticket https://issues.apache.org/jira/browse/FLINK-33966; I 
> have pushed a tiny PR for this ticket.
>
> Regards,
> Yang
>
> On Tue, 2 Jan 2024 at 16:15, Rui Fan <1996fan...@gmail.com> wrote:
>>
>> Thanks Yang for reporting this issue!
>>
>> You are right, these 2 conditions are indeed the same. It's unexpected IIUC.
>> Would you like to fix it?
>>
>> Feel free to create a FLINK JIRA to fix it if you would like to, and I'm
>> happy to
>> review!
>>
>> And cc @Maximilian Michels 
>>
>> Best,
>> Rui
>>
>> On Tue, Jan 2, 2024 at 11:03 PM Yang LI  wrote:
>>
>> > Hello,
>> >
>> > I see we have the same condition check twice in the
>> > function getNumRecordsInPerSecond (L220
>> > <
>> > https://github.com/apache/flink-kubernetes-operator/blob/main/flink-autoscaler/src/main/java/org/apache/flink/autoscaler/metrics/ScalingMetrics.java#L220
>> > >
>> > and
>> > L224
>> > <
>> > https://github.com/apache/flink-kubernetes-operator/blob/main/flink-autoscaler/src/main/java/org/apache/flink/autoscaler/metrics/ScalingMetrics.java#L224
>> > >).
>> > I imagine you want to use SOURCE_TASK_NUM_RECORDS_OUT_PER_SEC when the
>> > operator is not the source. Can you confirm this, and whether we have a
>> > JIRA ticket to fix this?
>> >
>> > Regards,
>> > Yang LI
>> >
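
For context, a paraphrased sketch of the pattern being discussed (simplified,
not the operator's exact code): the two checks look identical, but each one is
a successive fallback to a different source metric, which is why the PR
discussion concluded the repetition is intentional:

    import java.util.Map;

    class ScalingMetricsSketch {
        static Double getNumRecordsInPerSecond(Map<String, Double> metrics, boolean isSource) {
            Double rate = metrics.get("NUM_RECORDS_IN_PER_SEC");
            if (isSource && (rate == null || rate == 0)) {
                rate = metrics.get("SOURCE_TASK_NUM_RECORDS_IN_PER_SEC"); // fallback 1
            }
            if (isSource && (rate == null || rate == 0)) {
                // Same condition again, but it only fires if the first
                // fallback also found nothing.
                rate = metrics.get("SOURCE_TASK_NUM_RECORDS_OUT_PER_SEC"); // fallback 2
            }
            return rate;
        }
    }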


Re: Re: [VOTE] Accept Flink CDC into Apache Flink

2024-01-10 Thread Maximilian Michels
+1 (binding)

On Wed, Jan 10, 2024 at 11:22 AM Martijn Visser
 wrote:
>
> +1 (binding)
>
> On Wed, Jan 10, 2024 at 4:43 AM Xingbo Huang  wrote:
> >
> > +1 (binding)
> >
> > Best,
> > Xingbo
> >
> > Dian Fu  wrote on Wed, Jan 10, 2024 at 11:35:
> >
> > > +1 (binding)
> > >
> > > Regards,
> > > Dian
> > >
> > > On Wed, Jan 10, 2024 at 5:09 AM Sharath  wrote:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > Best,
> > > > Sharath
> > > >
> > > > On Tue, Jan 9, 2024 at 1:02 PM Venkata Sanath Muppalla <
> > > sanath...@gmail.com>
> > > > wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Thanks,
> > > > > Sanath
> > > > >
> > > > > On Tue, Jan 9, 2024 at 11:16 AM Peter Huang <
> > > huangzhenqiu0...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > >
> > > > > > Best Regards
> > > > > > Peter Huang
> > > > > >
> > > > > >
> > > > > > On Tue, Jan 9, 2024 at 5:26 AM Jane Chan 
> > > wrote:
> > > > > >
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > Best,
> > > > > > > Jane
> > > > > > >
> > > > > > > On Tue, Jan 9, 2024 at 8:41 PM Lijie Wang <
> > > wangdachui9...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > +1 (non-binding)
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Lijie
> > > > > > > >
> > > > > > > > Jiabao Sun  wrote on Tue, Jan 9, 2024
> > > at 19:28:
> > > > > > > >
> > > > > > > > > +1 (non-binding)
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Jiabao
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On 2024/01/09 09:58:04 xiangyu feng wrote:
> > > > > > > > > > +1 (non-binding)
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > > Xiangyu Feng
> > > > > > > > > >
> > > > > > > > > > Danny Cranmer  wrote on Tue, Jan 9, 2024 at 17:50:
> > > > > > > > > >
> > > > > > > > > > > +1 (binding)
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Danny
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Jan 9, 2024 at 9:31 AM Feng Jin 
> > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > +1 (non-binding)
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Feng Jin
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Jan 9, 2024 at 5:29 PM Yuxin Tan <
> > > ta...@gmail.com>
> > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > +1 (non-binding)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Yuxin
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Márton Balassi  wrote on Tue, Jan 9, 2024 at 17:25:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > +1 (binding)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, Jan 9, 2024 at 10:15 AM Leonard Xu <
> > > > > > xb...@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > +1(binding)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Leonard
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Jan 9, 2024 at 5:08 PM, Yangze Guo 
> > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > +1 (non-binding)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Tue, Jan 9, 2024 at 5:06 PM Robert Metzger <
> > > > > > > > > > > rmetz...@apache.org
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >> +1 (binding)
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >> On Tue, Jan 9, 2024 at 9:54 AM Guowei Ma <
> > > > > > > gu...@gmail.com
> > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >>> +1 (binding)
> > > > > > > > > > > > > > > >>> Best,
> > > > > > > > > > > > > > > >>> Guowei
> > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > >>> On Tue, Jan 9, 2024 at 4:49 PM Rui Fan <
> > > > > > > 19...@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > >  +1 (non-binding)
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > >  Best,
> > > > > > > > > > > > > > >  Rui
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > >  On Tue, Jan 9, 2024 at 4:41 PM Hang Ruan <
> > > > > > > > > > > > ruanhang1...@gmail.com>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > +1 (non-binding)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > Hang
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > gongzhongqiang 
> > > wrote on Tue, Jan 9, 2024
> > > > > > > > > > > > at 16:25:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> +1 non-binding
> > > > > > > > > > > > > > > >>
> > > >

Re: [VOTE] FLIP-78: Flink Python UDF Environment and Dependency Management

2019-10-16 Thread Maximilian Michels
I'm also late to the party here :) When I saw the first draft, I was 
thinking how exactly the design doc would tie in with Beam. Thanks for 
the update.


A couple of comments with this regard:


Flink has provided a distributed cache mechanism and allows users to upload their files using 
"registerCachedFile" method in ExecutionEnvironment/StreamExecutionEnvironment. The python files users 
specified through "add_python_file", "set_python_requirements" and "add_python_archive" 
are also uploaded through this method eventually.


For process-based execution we use Flink's cache distribution instead of 
Beam's artifact staging.


Apache Beam Portability Framework already supports artifact staging that works out of the box with the Docker environment. We can use the artifact staging service defined in Apache Beam to transfer the dependencies from the operator to Python SDK harness running in the docker container. 


Do we want to implement two different ways of staging artifacts? It 
seems sensible to use the same artifact staging functionality also for 
the process-based execution. Apart from being simpler, this would also 
allow the process-based execution to run in other environments than the 
Flink TaskManager environment.


Thanks,
Max
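
Since the dependency transfer above builds on Flink's distributed cache, here
is a minimal sketch of that mechanism in its plain Java form; the file path
and the "py-deps" handle are made up for illustration:

    import java.io.File;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class DistributedCacheExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // Ship a file to every TaskManager under an arbitrary handle.
            env.registerCachedFile("hdfs:///tmp/requirements.txt", "py-deps");

            env.fromElements("a", "b")
                    .map(new RichMapFunction<String, String>() {
                        @Override
                        public String map(String value) {
                            File deps = getRuntimeContext()
                                    .getDistributedCache()
                                    .getFile("py-deps");
                            return value + " -> " + deps.getName();
                        }
                    })
                    .print();

            env.execute("distributed-cache-demo");
        }
    }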

On 15.10.19 11:13, Wei Zhong wrote:

Hi Thomas,

Thanks a lot for your suggestion!

As you can see from the section "Goals", this FLIP focuses on 
dependency management in process mode. However, the APIs and design proposed in this FLIP 
also apply to the docker mode. So it makes sense to me to also describe how this 
design is integrated with the artifact staging service of Apache Beam in docker mode. I have 
updated the design doc and am looking forward to your feedback.

Thanks,
Wei


在 2019年10月15日,01:54,Thomas Weise  写道:

Sorry for joining the discussion late.

The Beam environment already supports artifact staging, it works out of the
box with the Docker environment. I think it would be helpful to explain in
the FLIP how this proposal relates to what Beam offers / how it would be
integrated.

Thanks,
Thomas


On Mon, Oct 14, 2019 at 8:09 AM Jeff Zhang  wrote:


+1

Hequn Cheng  wrote on Mon, Oct 14, 2019 at 10:55 PM:


+1

Good job, Wei!

Best, Hequn

On Mon, Oct 14, 2019 at 2:54 PM Dian Fu  wrote:


Hi Wei,

+1 (non-binding). Thanks for driving this.

Thanks,
Dian


On Oct 14, 2019, at 1:40 PM, jincheng sun  wrote:

+1

Wei Zhong  wrote on Sat, Oct 12, 2019 at 8:41 PM:


Hi all,

I would like to start the vote for FLIP-78[1] which is discussed and
reached consensus in the discussion thread[2].

The vote will be open for at least 72 hours. I'll try to close it by
2019-10-16 18:00 UTC, unless there is an objection or not enough

votes.


Thanks,
Wei

[1]






https://cwiki.apache.org/confluence/display/FLINK/FLIP-78%3A+Flink+Python+UDF+Environment+and+Dependency+Management

<






https://cwiki.apache.org/confluence/display/FLINK/FLIP-78:+Flink+Python+UDF+Environment+and+Dependency+Management



[2]






http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-UDF-Environment-and-Dependency-Management-td33514.html

<






http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-UDF-Environment-and-Dependency-Management-td33514.html














--
Best Regards

Jeff Zhang





Re: [VOTE] Accept Stateful Functions into Apache Flink

2019-10-25 Thread Maximilian Michels

+1 (binding)

On 25.10.19 14:31, Congxian Qiu wrote:

+1 (non-binding)
Best,
Congxian


Terry Wang  wrote on Thu, Oct 24, 2019 at 11:15 AM:


+1 (non-binding)

Best,
Terry Wang




On Oct 24, 2019 at 10:31, Jingsong Li  wrote:

+1 (non-binding)

Best,
Jingsong Lee

On Wed, Oct 23, 2019 at 9:02 PM Yu Li  wrote:


+1 (non-binding)

Best Regards,
Yu


On Wed, 23 Oct 2019 at 16:56, Haibo Sun  wrote:


+1 (non-binding)Best,
Haibo


At 2019-10-23 09:07:41, "Becket Qin"  wrote:

+1 (binding)

Thanks,

Jiangjie (Becket) Qin

On Tue, Oct 22, 2019 at 11:44 PM Tzu-Li (Gordon) Tai <

tzuli...@apache.org


wrote:


+1 (binding)

Gordon

On Tue, Oct 22, 2019, 10:58 PM Zhijiang 
wrote:


+1 (non-binding)

Best,
Zhijiang


--
From:Zhu Zhu 
Send Time:2019 Oct. 22 (Tue.) 16:33
To:dev 
Subject:Re: [VOTE] Accept Stateful Functions into Apache Flink

+1 (non-binding)

Thanks,
Zhu Zhu

Biao Liu  wrote on Tue, Oct 22, 2019 at 11:06 AM:


+1 (non-binding)

Thanks,
Biao /'bɪ.aʊ/



On Tue, 22 Oct 2019 at 10:26, Jark Wu  wrote:


+1 (non-binding)

Best,
Jark

On Tue, 22 Oct 2019 at 09:38, Hequn Cheng 


wrote:



+1 (non-binding)

Best, Hequn

On Tue, Oct 22, 2019 at 9:21 AM Dian Fu <

dian0511...@gmail.com>

wrote:



+1 (non-binding)

Regards,
Dian


On Oct 22, 2019, at 9:10 AM, Kurt Young  wrote:

+1 (binding)

Best,
Kurt


On Tue, Oct 22, 2019 at 12:56 AM Fabian Hueske <

fhue...@gmail.com>

wrote:



+1 (binding)

On Mon, Oct 21, 2019 at 16:18, Thomas Weise wrote <

t...@apache.org

:



+1 (binding)


On Mon, Oct 21, 2019 at 7:10 AM Timo Walther <

twal...@apache.org



wrote:



+1 (binding)

Thanks,
Timo


On 21.10.19 15:59, Till Rohrmann wrote:

+1 (binding)

Cheers,
Till

On Mon, Oct 21, 2019 at 12:13 PM Robert Metzger <

rmetz...@apache.org



wrote:



+1 (binding)

On Mon, Oct 21, 2019 at 12:06 PM Stephan Ewen <

se...@apache.org



wrote:



This is the official vote whether to accept the

Stateful

Functions

code

contribution to Apache Flink.

The current Stateful Functions code, documentation,

and

website

can

be

found here:
https://statefun.io/
https://github.com/ververica/stateful-functions

This vote should capture whether the Apache Flink

community

is

interested

in accepting, maintaining, and evolving Stateful

Functions.


Reiterating my original motivation, I believe that

this

project

is

a

great

match for Apache Flink, because it helps Flink to

grow

the

community

into a

new set of use cases. We see current users

interested

in

such

use

cases,

but they are not well supported by Flink as it

currently

is.


I also personally commit to put time into making

sure

this

integrates

well

with Flink and that we grow contributors and

committers

to

maintain

this

new component well.

This is a "Adoption of a new Codebase" vote as per

the

Flink

bylaws

[1].

Only PMC votes are binding. The vote will be open at

least

6

days

(excluding weekends), meaning until Tuesday Oct.29th

12:00

UTC,

or

until

we

achieve the 2/3rd majority.

Happy voting!

Best,
Stephan

[1]


























https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026





























--
Best, Jingsong Lee







Re: [VOTE] FLIP-78: Flink Python UDF Environment and Dependency Management

2019-10-25 Thread Maximilian Michels
 python worker is
launched by the operator, so it is always in the same environment as

the

operator.

Thanks again for your feedback, and it is valuable for finding out the

final

best architecture.

Feel free to correct me if there is anything incorrect.

Best,
Jincheng

Maximilian Michels  于2019年10月16日周三 下午4:23写道:


I'm also late to the party here :) When I saw the first draft, I

was

thinking how exactly the design doc would tie in with Beam. Thanks

for

the update.

A couple of comments with this regard:


Flink has provided a distributed cache mechanism and allows users

to

upload their files using "registerCachedFile" method in
ExecutionEnvironment/StreamExecutionEnvironment. The python files

users

specified through "add_python_file", "set_python_requirements" and
"add_python_archive" are also uploaded through this method

eventually.


For process-based execution we use Flink's cache distribution

instead

of

Beam's artifact staging.


Apache Beam Portability Framework already supports artifact

staging

that

works out of the box with the Docker environment. We can use the

artifact

staging service defined in Apache Beam to transfer the dependencies

from

the operator to Python SDK harness running in the docker container.

Do we want to implement two different ways of staging artifacts? It
seems sensible to use the same artifact staging functionality also

for

the process-based execution. Apart from being simpler, this would

also

allow the process-based execution to run in other environments than

the

Flink TaskManager environment.

Thanks,
Max

On 15.10.19 11:13, Wei Zhong wrote:

Hi Thomas,

Thanks a lot for your suggestion!

As you can see from the section "Goals", this FLIP focuses on

the

dependency management in process mode. However, the APIs and design
proposed in this FLIP also apply to the docker mode. So it makes

sense

to me to also describe how this design is integrated with the artifact

staging

service of Apache Beam in docker mode. I have updated the design

doc

and

looking forward to your feedback.


Thanks,
Wei


在 2019年10月15日,01:54,Thomas Weise  写道:

Sorry for joining the discussion late.

The Beam environment already supports artifact staging, it works

out

of

the

box with the Docker environment. I think it would be helpful to

explain

in

the FLIP how this proposal relates to what Beam offers / how it

would

be

integrated.

Thanks,
Thomas


On Mon, Oct 14, 2019 at 8:09 AM Jeff Zhang 

wrote:



+1

Hequn Cheng  wrote on Mon, Oct 14, 2019 at 10:55 PM:


+1

Good job, Wei!

Best, Hequn

On Mon, Oct 14, 2019 at 2:54 PM Dian Fu <

dian0511...@gmail.com>

wrote:



Hi Wei,

+1 (non-binding). Thanks for driving this.

Thanks,
Dian


On Oct 14, 2019, at 1:40 PM, jincheng sun 


wrote:


+1

Wei Zhong  wrote on Sat, Oct 12, 2019 at 8:41 PM:


Hi all,

I would like to start the vote for FLIP-78[1] which is

discussed

and

reached consensus in the discussion thread[2].

The vote will be open for at least 72 hours. I'll try to

close

it

by

2019-10-16 18:00 UTC, unless there is an objection or not

enough

votes.


Thanks,
Wei

[1]
















https://cwiki.apache.org/confluence/display/FLINK/FLIP-78%3A+Flink+Python+UDF+Environment+and+Dependency+Management

<
















https://cwiki.apache.org/confluence/display/FLINK/FLIP-78:+Flink+Python+UDF+Environment+and+Dependency+Management



[2]
















http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-UDF-Environment-and-Dependency-Management-td33514.html

<
















http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-UDF-Environment-and-Dependency-Management-td33514.html














--
Best Regards

Jeff Zhang

















Spreading Tasks across TaskManagers

2018-10-11 Thread Maximilian Michels

Hi everyone,

I've recently come across a cluster scheduling problem users are facing. 
Clusters where TaskManagers have more slots than the parallelism 
(#tm_slots > job_parallelism) tend to schedule all job tasks on a 
single TaskManager.


This is not good for spreading load and has been discussed in FLINK-1003 
[1] and the other duplicate JIRA issues.


I know that this is not really an issue if the cluster is created 
exclusively for the job, or if the number of slots per TaskManager is 
smaller than the parallelism. However, this seems like a rather easy 
improvement to the Scheduler which would have a huge impact on performance.


On the JIRA issue page it has been mentioned that this was put on hold 
to work on dynamic scaling first.


Now that the basic building blocks for dynamic scaling are in place, do 
you think it would be possible to tackle FLINK-1003?


Thanks,
Max


[1] https://issues.apache.org/jira/browse/FLINK-1003
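
For readers landing on this thread later: newer Flink versions (see
FLINK-12122) added an opt-in switch for exactly this behavior. A minimal
sketch, assuming a version that supports the option:

    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class SpreadOutSlotsExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Prefer spreading slots across all registered TaskManagers.
            // On a real cluster this would go into flink-conf.yaml instead.
            conf.setString("cluster.evenly-spread-out-slots", "true");

            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.createLocalEnvironment(4, conf);
            // ... build and execute the job as usual
        }
    }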


Re: Spreading Tasks across TaskManagers

2018-10-16 Thread Maximilian Michels

the community is currently working on Flink's scheduler component [1]
That sounds great! I agree that spreading tasks across the nodes is not 
always desirable but it would be nice to give users an option to provide 
hints to the scheduler. The location aware bulk scheduling you mentioned 
would be useful.


Today, there is already the option to assign Resources to a 
StreamTransformation. From a quick test, it seems like those resource 
specifications are not honored yet.


-Max

On 13.10.18 01:41, Thomas Weise wrote:

Hi Till,

Thanks for the pointer, glad that this is being worked on.

It almost looks like the non-deterministic distribution behavior started
with 1.5.x (?) and that surprised us.

https://issues.apache.org/jira/browse/BEAM-5713

I agree that there is no one strategy that fits every use case. If an
application is limited by a resource per machine that the scheduler does
not understand (like let's say CPU or disk I/O), then it would be nice to
have a way to hint that round-robin distribution is desired (or to achieve the
same through anti-affinity or resource constraints).

Thanks,
Thomas



On Fri, Oct 12, 2018 at 2:06 AM Till Rohrmann  wrote:


Hi Max,

the community is currently working on Flink's scheduler component [1]. One
of the things we want to enable in the future is bulk scheduling. With
this, it should also be possible to add strategies how to distribute tasks
across multiple TMs (spreading vs. co-locating).

In general, I'm not 100% sure whether spreading out tasks is always the
best strategy. Especially if you have a network-heavy job, co-locating tasks
on the same TM could have benefits over spreading the tasks out.

[1] https://issues.apache.org/jira/browse/FLINK-10429

Cheers,
Till

On Thu, Oct 11, 2018 at 8:16 PM Maximilian Michels  wrote:


Hi everyone,

I've recently come across a cluster scheduling problem users are facing.
Clusters where TaskManagers have more slots than the parallelism
(#tm_slots > job_parallelism), tend to schedule all job tasks on a
single TaskManager.

This is not good for spreading load and has been discussed in FLINK-1003
[1] and the other duplicate JIRA issues.

I know that this is not really an issue if the cluster is created
exclusively for the Job, or if the number of slots per Taskmanager is
smaller than the parallelism. However, this seems like a rather easy
improvement to the Scheduler which would have a huge impact on

performance.


On the JIRA issue page it has been mentioned that this was put on hold
to work on dynamic scaling first.

Now that the basic building blocks for dynamic scaling are in place, do
you think it would be possible to tackle FLINK-1003?

Thanks,
Max


[1] https://issues.apache.org/jira/browse/FLINK-1003







Re: [ANNOUNCE] Apache Flink 1.5.5 released

2018-10-30 Thread Maximilian Michels

Great work! :)

Was just trying the new release out with Beam and found that 
"force-shading" is not published for 1.5.5. Also visible here: 
https://repo.maven.apache.org/maven2/org/apache/flink/force-shading/


Was this intentional? It is not a big problem because it is a dependency 
which is only required at build time to trick the Shade plugin. It can 
otherwise be excluded.


-Max

On 29.10.18 11:55, Till Rohrmann wrote:

Great news. Thanks a lot to you Chesnay for being our release manager and
the community for making this release possible.

Cheers,
Till

On Mon, Oct 29, 2018 at 8:36 AM Chesnay Schepler  wrote:


The Apache Flink community is very happy to announce the release of
Apache Flink 1.5.5, which is the fifth bugfix release for the Apache
Flink 1.5 series.

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data
streaming applications.

The release is available for download at:
https://flink.apache.org/downloads.html

Please check out the release blog post for an overview of the
improvements for this bugfix release:
https://flink.apache.org/news/2018/10/29/release-1.5.5.html

The full release notes are available in Jira:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344112

We would like to thank all contributors of the Apache Flink community
who made this release possible!

Regards,
Chesnay






JIRA notifications

2018-11-21 Thread Maximilian Michels

Hi!

Do you think it would make sense to send JIRA notifications to a 
separate mailing list? Some people just want to casually follow the 
mailing list and it requires a filter to delete all the JIRA mails.


We already have an "issues" mailing list which receives the JIRA 
notifications: https://mail-archives.apache.org/mod_mbox/flink-issues/


What do you think?

Thanks,
Max


Re: JIRA notifications

2018-11-25 Thread Maximilian Michels
It seems that some people find it useful to receive JIRA mails for new 
issues only. The "issues" mailing list gets all JIRA notifications, 
including updates to existing issues.


So I'd leave things as they are now because JIRA doesn't allow 
individuals to configure receiving notifications only for newly created 
issues.


Thanks,
Max

On 21.11.18 12:52, vino yang wrote:

+1

Flavio Pompermaier  wrote on Wed, Nov 21, 2018 at 7:24 PM:


+1

On Wed, Nov 21, 2018 at 12:05 PM Saar Bar  wrote:


💯 agree

Sent from my iPhone


On 21 Nov 2018, at 13:03, Maximilian Michels  wrote:

Hi!

Do you think it would make sense to send JIRA notifications to a

separate mailing list? Some people just want to casually follow the

mailing

list and it requires a filter to delete all the JIRA mails.


We already have an "issues" mailing list which receives the JIRA

notifications: https://mail-archives.apache.org/mod_mbox/flink-issues/


What do you think?

Thanks,
Max








Re: Create version 1.7.1 in JIRA

2018-12-14 Thread Maximilian Michels

Hi Chesnay,

Just saw this. I had unarchived the version because I thought it had been 
archived accidentally. I have archived it again.


I hope it is ok to include this fix: 
https://issues.apache.org/jira/browse/FLINK-10566


Otherwise, feel free to move it to 1.7.2.

Thanks,
Max

On 13.12.18 21:42, Chesnay Schepler wrote:
The 1.7.1 version already exists. As I've already started the release process 
for 1.7.1, I have archived this version temporarily to save me the hassle of 
updating JIRAs that people now mark as fixed for 1.7.1 even though they aren't 
included.


On 13.12.2018 21:17, Thomas Weise wrote:

Hi,

Can a PMC member please create the version number 1.7.1. There are already
some JIRAs with 1.7.2 version that may have to updated as well.

https://issues.apache.org/jira/projects/FLINK?selectedItem=com.atlassian.jira.jira-projects-plugin:release-page&status=released-unreleased 



Thanks





Re: [DISCUSS] Creating last bug fix release for 1.5 branch

2018-12-14 Thread Maximilian Michels
I have pushed this fix to the release-1.5 branch: 
https://issues.apache.org/jira/browse/FLINK-10566


Would be great if we could include it because it has been blocking some 
pipelines on the Beam side.


Thanks,
Max

On 13.12.18 20:22, Chesnay Schepler wrote:
FLINK-11023: will not be fixed for 1.5.6; this would take significantly longer 
to implement, and TBH I'm not really keen on doing that for a final bugfix release.
FLINK-7991: This is just a minor cleanup; the issue doesn't affect users in any 
way. It is thus not particularly important to have for this release and can be 
omitted IMO; I would also have to double-check whether the open PR applies 
properly to 1.5.6, and frankly I don't have the time for that right now anyway.
FLINK-10251: has been in review for a while, but will likely not be merged this 
year from what I know.
FLINK-9253: appears to require additional changes and is also quite outdated (it 
is from May after all), and looks more like a general improvement than a bug fix 
from the JIRA description. I would omit this from the release, unless Nico objects.


On 13.12.2018 17:08, Thomas Weise wrote:

Hi,

I would be interested to try my hand at being the release manger for this.

There are currently still 5 in-progress issues [1], all except [2] with an
open PR.

Nico, Chesnay, Till: Can you please take a look and see if these can be
completed?

Thanks,
Thomas


[1]
https://issues.apache.org/jira/issues/?jql=statusCategory%20%3D%20indeterminate%20AND%20project%20%3D%2012315522%20AND%20fixVersion%20%3D%2012344315%20ORDER%20BY%20priority%20DESC%2C%20key%20ASC 


[2] https://issues.apache.org/jira/browse/FLINK-9010






On Mon, Dec 10, 2018 at 3:15 PM Thomas Weise  wrote:


Thanks Till and my belated +1 for a final patch release :)

On Mon, Dec 10, 2018 at 5:47 AM Till Rohrmann 
wrote:


Thanks for the feedback! I conclude that the community is in favour of a
last 1.5.6 release. I'll try to make the arrangements in the next two
weeks.

Cheers,
Till

On Mon, Dec 10, 2018 at 2:40 AM jincheng sun 
wrote:


+1. There are incompatible improvements between 1.5.x and 1.6/1.7, so

many

1.5.x users may not be willing to upgrade to 1.6 or 1.7 due to migration
costs, so it makes sense to create a last bug fix release for the 1.5

branch.

Bests,
Jincheng

Jeff Zhang  wrote on Mon, Dec 10, 2018 at 9:24 AM:


+1, I think very few people would use 1.6 or 1.7 in their production

in

the near future, so I expect they would use 1.5 in production for a long
period, so it makes sense to provide a stable version for production

usage.

Ufuk Celebi  wrote on Sun, Dec 9, 2018 at 6:07 PM:


+1. This seems reasonable to me. Since the fixes are already in and
also part of other releases, the release overhead should be
manageable.

@Vino: I agree with your assessment.

@Qi: As Till mentioned, the official project guideline is to support
the last two minor releases, e.g. currently 1.7 and 1.6.

Best,

Ufuk

On Sun, Dec 9, 2018 at 3:48 AM qi luo  wrote:

Hi Till,

Does Flink have an agreement on how long a major version will be

supported? Some companies may need a long time to upgrade Flink

major

versions in production. If Flink terminates support for a major

version

too

quickly, it may be a concern for companies.

Best,
Qi


On Dec 8, 2018, at 10:57 AM, vino yang 

wrote:

Hi Till,

I think it makes sense to release a bug fix version (especially

some

serious bug fixes) for flink 1.5.
Consider that some companies' production environments are more

cautious

about upgrading large versions.
I think some organizations are still using 1.5.x or even 1.4.x.

Best,
Vino

Till Rohrmann  wrote on Fri, Dec 7, 2018 at 11:39 PM:


Dear community,

I wanted to reach out to you and discuss whether we should

release a

last

bug fix release for the 1.5 branch.

Since we have already released Flink 1.7.0, we only need to

support

the

1.6.x and 1.7.x branches (last two major releases). However,

the

current

release-1.5 branch contains 45 unreleased fixes. Some of the

fixes

address

serializer duplication problems (FLINK-10839, FLINK-10693),

fixing

retractions (FLINK-10674) or prevent a deadlock in the
SpillableSubpartition (FLINK-10491). I think it would be nice

for

our

users

if we officially terminated the Flink 1.5.x support with a last

1.5.6

release. What do you think?

Cheers,
Till



--
Best Regards

Jeff Zhang





Re: Thanks for hiding ASF GitHub Bot logs on JIRA

2018-12-14 Thread Maximilian Michels

Can confirm they still go to the main comment section.

Would be great if we could send them to the Work Log instead. We do that for the 
Beam JIRA project and it has proven very useful for reducing noise.


Thanks,
Max

On 13.12.18 22:15, Tzu-Li Chen wrote:

hmm..then what is it


Chesnay Schepler  wrote on Fri, Dec 14, 2018 at 5:07 AM:


ehh... from what I can tell this is not the case.

On 13.12.2018 22:00, Tzu-Li Chen wrote:

Not sure how we arrived here, but the somewhat noisy comments made by the ASF GitHub

Bot

are now hidden in the Work Log. Thank you!

Best,
tison.








Re: [DISCUSS] Python (and Non-JVM) Language Support in Flink

2018-12-14 Thread Maximilian Michels

Hi Xianda, hi Shaoxuan,

I'd be in favor of option (1). There is great potential in Beam and Flink 
joining forces on this one. Here's why:


The Beam project spent at least a year developing a portability layer with a 
reasonable number of people working on it. Developing a new portability layer 
from scratch will probably take about the same amount of time and resources.


Concerning option (2): There is already a Python API for Flink but an API is 
only one part of the portability story. In Beam the portability is structured 
into three components:


- SDK (API, its Protobuf serialization, and interaction with the SDK Harness)
- Runner (Translation from Protobuf pipeline to Flink job)
- SDK Harness (UDF execution, Interaction with the SDK and the execution engine)

I could imagine the Flink Python API would be another SDK which could have its 
own API but would reuse code for the interaction with the SDK Harness.


We would be able to focus on the optimizations instead of rebuilding a 
portability layer from scratch.


Thanks,
Max

On 13.12.18 11:52, Shaoxuan Wang wrote:

RE: Stephen's options (
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Usage-of-flink-python-and-flink-streaming-python-td25793.html
)
* Option (1): Language portability via Apache Beam
* Option (2): Implement own Python API
* Option (3): Implement own portability layer

Hi Stephen,
Eventually, I think we should support both option1 and option3. IMO, these
two options are orthogonal. I agree with you that we can leverage the
existing work and ecosystem in Beam by supporting option1. But the problem
with Beam is that it skips (to the best of my knowledge) the natural
table/SQL optimization framework provided by Flink. We should spend all the
needed efforts to support solution1 (as it is the better alternative of the
current Flink python API), but cannot solely bet on it. Option3 is the
ideal choice for Flink to support all non-JVM languages, which we should
plan to achieve. We have done some preliminary prototypes for
option2/option3, and it seems not overly complex or difficult to accomplish.

Regards,
Shaoxuan


On Thu, Dec 13, 2018 at 4:58 PM Xianda Ke  wrote:


Currently there is an ongoing survey about Python usage of Flink [1]. Some
discussion was also brought up there regarding non-jvm language support
strategy in general. To avoid polluting the survey thread, we are starting
this discussion thread and would like to move the discussions here.

In the interest of facilitating the discussion, we would like to first
share the following design doc which describes what we have done at Alibaba
about Python API for Flink. It could serve as a good reference to the
discussion.

  [DISCUSS] Flink Python API
<
https://docs.google.com/document/d/1JNGWdLwbo_btq9RVrc1PjWJV3lYUgPvK0uEWDIfVNJI/edit?usp=drive_web




As of now, we've implemented and delivered Python UDF for SQL for the
internal users at Alibaba.
We are starting to implement Python API.

To recap and continue the discussion from the survey thread, I agree with
@Stephan that we should figure out in which general direction Python
support should go. Stephan also list three options there:
* Option (1): Language portability via Apache Beam
* Option (2): Implement own Python API
* Option (3): Implement own portability layer

 From my perspective,
(1). Flink language APIs and Beam's language support are not mutually
exclusive.
It is nice that Beam has Python/NodeJS/Go APIs, and supports Flink as the
runner.
Flink's own Python(or NodeJS/Go) APIs will benefit Flink's ecosystem.

(2). Python API / portability layer
To support non-JVM languages in Flink,
  * at client side, Flink would provide language interfaces, which will
translate user's application to Flink StreamGraph.
* at server side, Flink would execute user's UDF code at runtime
The non-JVM languages communicate with the JVM via RPC (or low-level sockets,
an embedded interpreter, and so on). What the portability layer can do, perhaps, is
abstract the RPC layer. Even when the portability layer is ready, there is still
a lot of work to do for a specific language. For Python, say, we may still
have to write the interface classes by hand for the users, because generated
code without detailed documentation is unacceptable to users, or handle
the serialization of lambdas/closures, which is not a built-in feature
in Python. Maybe we can start with the Python API, then extend to other
languages and abstract the common logic into the portability layer.

---
References:
[1] [SURVEY] Usage of flink-python and flink-streaming-python

Regards,
Xianda





Status of FLINK-2491 (Checkpointing of shutdown sources)

2019-03-13 Thread Maximilian Michels

Hi,

Has there been any progress on 
https://issues.apache.org/jira/browse/FLINK-2491?


For the Flink Runner in Apache Beam we keep operators alive to prevent 
checkpointing from stopping [1]. Users of Flink's native API have to 
take care of this themselves.


To fix FLINK-2491 we have to:

  1) Remove shutdown operators from the list of to-be-checkpointed
     operators.

  2) Persist the shutdown operators in checkpoints to be able to
     restore the job correctly afterwards.

It would be great to fix this long-standing issue. Apart from removing 
the need for workarounds it would also simplify some of the test setup 
which relies on checkpointing to continue working when operators shut down.


Do you think we can make progress on this matter?

Cheers,
Max

[1] 
https://github.com/apache/beam/blob/6e89b6c7a8191429fc48c5b8d5c75c9caa05/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java#L198
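
For reference, the keep-alive workaround mentioned above boils down to a
source that idles instead of returning from run(). A simplified sketch, not
Beam's exact code:

    import org.apache.flink.streaming.api.functions.source.SourceFunction;

    public class KeepAliveSource implements SourceFunction<Long> {

        private volatile boolean running = true;

        @Override
        public void run(SourceContext<Long> ctx) throws Exception {
            for (long i = 0; i < 100 && running; i++) {
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect(i);
                }
            }
            // All data emitted: idle instead of finishing, so the task stays
            // RUNNING and checkpointing keeps working.
            while (running) {
                Thread.sleep(100);
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }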


Re: [DISCUSS] FLIP-38 Support python language in flink TableAPI

2019-04-24 Thread Maximilian Michels

Hi Stephan,

This is excited! Thanks for sharing. The inter-process communication 
code looks like the most natural choice as a common ground. To go 
further, there are indeed some challenges to solve.



=> Biggest question is whether the language-independent DAG is expressive 
enough to capture all the expressions that we want to map directly to Table API 
expressions. Currently much is hidden in opaque UDFs. Kenn mentioned the structure 
should be flexible enough to capture more expressions transparently.


Just to add some context on how this could be done, there is the concept of 
a FunctionSpec which is part of a transform in the DAG. A FunctionSpec 
contains a URN and a payload. A FunctionSpec can either be (1) 
translated by the Runner directly, e.g. mapped to Table API concepts, or (2) 
run as a user-defined function within an Environment. It could be feasible 
for Flink to choose the direct path, whereas Beam Runners would leverage 
the more generic approach using UDFs. Granted, compatibility across 
Flink and Beam would only work if both of the translation paths yielded 
the same semantics.



 If the DAG is generic enough to capture the additional information, we 
probably still need some standardization, so that all the different language 
APIs represent their expressions the same way


I wonder whether that's necessary as a first step. I think it would be 
fine for Flink to have its own way to represent API concepts in the Beam 
DAG which Beam Runners may not be able to understand. We could then 
successively add the capability for these transforms to run with Beam.



 Similarly, it makes sense to standardize the type system (and type inference) 
as far as built-in expressions and their interaction with UDFs are concerned. 
The Flink Table API and Blink teams found this to be essential for a consistent 
API behavior. This would not prevent all-UDF programs from still using purely 
binary/opaque types.


Beam has a set of standard coders which can be used across languages. We 
will have to expand those to play well with Flink's: 
https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/tableApi.html#data-types


I think we will need to exchange more ideas to work out a model that 
will work for both Flink and Beam. A regular meeting could be helpful.


Thanks,
Max

On 23.04.19 21:23, Stephan Ewen wrote:

Hi all!

Below are my notes on the discussion last week on how to collaborate 
between Beam and Flink.
The discussion was between Tyler, Kenn, Luke, Ahmed, Xiaowei, Shaoxuan, 
Jincheng, and me.


This represents my understanding of the discussion; please augment this 
where I missed something or where your conclusion was different.


Best,
Stephan

===

*Beams Python and Portability Framework*

   - Portability core to Beam
   - Language independent dataflow DAG that is defined via ProtoBuf
   - DAG can be generated from various languages (Java, Python, Go)
   - The DAG describes the pipelines and contains additional parameters 
to describe each operator, and contains artifacts that need to be 
deployed / executed as part of an operator execution.
   - Operators execute in language-specific containers, data is 
exchanged between the language-specific container and the runner 
container (JVM) via gRPC.


*Flink's desiderata for Python API*

   - Python API should mirror Java / Scala Table API
   - All relational expressions that correspond to built-in functions 
should be translated to corresponding expressions in the Table API. That 
way the planner generates Java code for the data types and built-in 
expressions, meaning no Python code is necessary during execution

   - UDFs should be supported and run similarly as in Beam's approach
   - Python programs should be similarly created and submitted/deployed 
as Java / Scala programs (CLI, web, containerized, etc.)


*Consensus to share inter-process communication code*

   - Crucial code for robust setup and high performance data exchange 
across processes
   - The code for the SDK harness, the artifact bootstrapping, and the 
data exchange make sense to share.
   - Ongoing discussion whether this can be a dedicated module with slim 
dependencies in Beam


*Potential Long Term Perspective: Share language-independent DAG 
representation*


   - Beam's language independent DAG could become a standard 
representation used in both projects
   - Flink would need an way to receive that DAG, map it to the Table 
API, execute it from there
   - The DAG would need to have a standardized representation of 
functions and expressions that then get mapped to Table API expressions 
to let the planner optimize those and generate Java code for those
   - Similar as UDFs are supported in the Table API, there would be 
additional "external UDFs" that would go through the above mentioned 
inter-process communication layer


   - _Advantages:_
     => Flink and Beam could share more language bindings
     => Flink 

Re: [DISCUSS] Java code style

2015-10-23 Thread Maximilian Michels
I don't think lazily adding comments will work. However, I'm fine with
adding all the checkstyle rules one module at a time (with a jira
issue to keep track of the modules already converted). It's not going
to happen that we lazily add comments because that's the reason why
comments are missing in the first place...

On Fri, Oct 23, 2015 at 12:05 AM, Henry Saputra  wrote:
> Could we make certain rules to give warning instead of error?
>
> This would allow us to cherry-pick certain rules we would like people
> to follow but not strictly enforced.
>
> - Henry
>
> On Thu, Oct 22, 2015 at 9:20 AM, Stephan Ewen  wrote:
>> I don't think a "let add comments to everything" effort gives us good
>> comments, actually. It just gives us checkmark comments that make the rules
>> pass.
>>
>> On Thu, Oct 22, 2015 at 3:29 PM, Fabian Hueske  wrote:
>>
>>> Sure, I don't expect it to be free.
>>> But everybody should be aware of the cost of adding this code style, i.e.,
>>> spending a huge amount of time on reformatting and documenting code.
>>>
>>> Alternatively, we could drop the JavaDocs rule and make the transition
>>> significantly cheaper.
>>>
>>> 2015-10-22 15:24 GMT+02:00 Till Rohrmann :
>>>
>>> > There ain’t no such thing as a free lunch and code style.
>>> >
>>> > On Thu, Oct 22, 2015 at 3:13 PM, Maximilian Michels 
>>> > wrote:
>>> >
>>> > > I think we have to document all these classes. Code Style doesn't come
>>> > > for free :)
>>> > >
>>> > > On Thu, Oct 22, 2015 at 3:09 PM, Fabian Hueske 
>>> > wrote:
>>> > > > Any ideas how to deal with the mandatory JavaDoc rule for existing
>>> > code?
>>> > > > Just adding empty headers to make the checkstyle pass or start a
>>> > serious
>>> > > > effort to add the missing docs?
>>> > > >
>>> > > > 2015-10-21 13:31 GMT+02:00 Matthias J. Sax :
>>> > > >
>>> > > >> Agreed. That's the reason why I am in favor of using vanilla Google
>>> > code
>>> > > >> style.
>>> > > >>
>>> > > >> On 10/21/2015 12:31 PM, Stephan Ewen wrote:
>>> > > >> > We started out originally with mixed tab/spaces, but it ended up
>>> > with
>>> > > >> > people mixing spaces and tabs arbitrarily, and there is little way
>>> > to
>>> > > >> > enforce Matthias' specific suggestion via checkstyle.
>>> > > >> > That's why we dropped spaces alltogether...
>>> > > >> >
>>> > > >> > On Wed, Oct 21, 2015 at 12:03 PM, Gyula Fóra <
>>> gyula.f...@gmail.com>
>>> > > >> wrote:
>>> > > >> >
>>> > > >> >> I think the nice thing about a common codestyle is that everyone
>>> > can
>>> > > set
>>> > > >> >> the template in the IDE and use the formatting commands.
>>> > > >> >>
>>> > > >> >> Matthias's suggestion makes this practically impossible so -1 for
>>> > > mixed
>>> > > >> >> tabs/spaces from my side.
>>> > > >> >>
>>> > > >> >> Matthias J. Sax  wrote (on Wed, Oct 21, 2015,
>>> > > >> >> 11:46):
>>> > > >> >>
>>> > > >> >>> I actually like tabs a lot, however, in a "mixed" style together
>>> > > with
>>> > > >> >>> spaces. Example:
>>> > > >> >>>
>>> > > >> >>> myVar.callMethod(param1, // many more
>>> > > >> >>> .................paramX); // the dots mark space
>>> indention
>>> > > >> >>>
>>> > > >> >>> indenting "paramX" with tabs does not give nice alignment. Not
>>> sure
>>> > if
>>> > > >> >>> this would be a feasible compromise to keep tabs in general,
>>> but
>>> > > use
>>> > > >> >>> space for cases as above.
>>> > > >> >>>
>>> > > &

From 0.10 to 1.0

2015-10-23 Thread Maximilian Michels
Dear Flink community,

We have forked the current release candidate from the master to the
release-0.10-rc0 branch. Changes for the next release candidate should
be pushed to the release-0.10 branch. The master needed to be updated
to deploy only one snapshot version to the Maven repository and to
continue development for the next release.

Some of us have suggested going to 1.0 after 0.10. That's why I
updated the version to 1.0-SNAPSHOT in the master. Only after I
committed did I realize that not everyone might agree with that. I wanted to
emphasize that this is just my personal opinion. We may change the
version again if we agree on a different version.

Let me know how you feel about 1.0 as the next targeted release.
Personally, I think it's great for Flink and we can really push for a
great 1.0 release after 0.10 is out.

Cheers,
Max


Re: FastR-Flink: a new open source Truffle project

2015-10-23 Thread Maximilian Michels
Great project. Thanks for sharing!

On Thu, Oct 22, 2015 at 9:29 PM, Ufuk Celebi  wrote:

> Wow! Very nice. Thanks for sharing. I will try it out :-)
>
> On Thursday, October 22, 2015, Kunft, Andreas 
> wrote:
>
> > FYI:
> >
> >
> > FastR on Flink, a project to combine the R programming language with
> > Flink, just went open source:
> >
> >
> >
> http://mail.openjdk.java.net/pipermail/graal-dev/2015-October/003728.html?
> >
> >
> > Best
> >
> > Andreas
> >
>


Re: Broken link for master Javadocs

2015-10-26 Thread Maximilian Michels
Thanks for reporting, Suneel. On my machine the Java docs build.

Here's the build log:
https://ci.apache.org/builders/flink-docs-master/builds/122/steps/Java%20%26%20Scala%20docs/logs/stdio


[ERROR] 
/home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:35:
error: not found: type ILoopCompat
[ERROR]   extends ILoopCompat(in0, out0) {
[ERROR]   ^
[ERROR] 
/home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:29:
error: too many arguments for constructor Object: ()Object
[ERROR] class FlinkILoop(
[ERROR] ^
[ERROR] 
/home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:118:
error: value createInterpreter is not a member of AnyRef
[ERROR] super.createInterpreter()
[ERROR]   ^
[ERROR] 
/home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:120:
error: not found: value addThunk
[ERROR] addThunk {
[ERROR] ^
[ERROR] 
/home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:138:
error: not found: value intp
[ERROR] val vd = intp.virtualDirectory
[ERROR]  ^
[ERROR] 
/home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:186:
error: not found: value echo
[ERROR] echo(
[ERROR] ^
[ERROR] 
/home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkShell.scala:151:
error: value process is not a member of
org.apache.flink.api.scala.FlinkILoop
[ERROR]   repl.foreach(_.process(settings))
[ERROR]  ^
[ERROR] 
/home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkShell.scala:153:
error: value closeInterpreter is not a member of
org.apache.flink.api.scala.FlinkILoop
[ERROR]   repl.foreach(_.closeInterpreter())
[ERROR]  ^
[ERROR] 8 errors found


Not sure what the issue is. I'll try to look into it later.

Thanks,
Max

On Mon, Oct 26, 2015 at 7:12 AM, Henry Saputra 
wrote:

> Thanks for the heads up, Suneel.
>
> Seems like the master Java API (api/java/index.html) is not being built:
> https://ci.apache.org/projects/
>
> I have filed a ticket with infra to help figure out why.
>
> - Henry
>
> On Sat, Oct 24, 2015 at 5:45 PM, Suneel Marthi  wrote:
> > https://ci.apache.org/projects/flink/flink-docs-master/api/java
> >
> > needs to be fixed.
>


Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc0)

2015-10-26 Thread Maximilian Michels
Now that the pressing issues of the rc0 have been fixed and pushed to the
release-0.10 branch, I would like to go ahead and create a new rc1.

There are still FLINK-2763 and FLINK-2800 but no immediate fix seems to be
available.

Any further issues we would like to fix in the new release candidate?

On Thu, Oct 22, 2015 at 6:00 PM, Stephan Ewen  wrote:

> Thanks for bringing up the projects and the record API:
>
>  - Concerning the projects: Nice to have, but not critical, unless we want
> to change the names of the Maven artifacts. I would rather not rush this
>
>  - Removal of Record API. Good thing to have, but should not be a release
> blocker. I would be fine with doing this for 1.0
>
> On Thu, Oct 22, 2015 at 5:24 PM, Fabian Hueske  wrote:
>
> > Hmm, it took IntelliJ some time to figure out all the consequences of
> > removing the Record API.
> > Seems to be more than I initially expected.
> >
> > @Chesnay, do you want to help? I would push my current version to my
> > repository and you could take over some packages and fix the tests. Just
> > reply to me directly to coordinate. Thanks.
> >
> > 2015-10-22 16:45 GMT+02:00 Fabian Hueske :
> >
> > > I just deleted the Record API to check what would break.
> > > Doesn't look too scary, just a few tests that need to be adapted. I'm
> > > right in the middle of that. Hope to open a PR soon.
> > >
> > > 2015-10-22 16:42 GMT+02:00 Chesnay Schepler :
> > >
> > >> @RecordAPI: Yes, I was curious where we are at regarding the removal of
> > >> the Record API.
> > >> If there are still tests left to port (or other related things) I'd be
> > >> more than happy to do it (got a /lot/ of free time on my hands).
> > >> The related JIRA issues weren't particularly helpful though in figuring
> > >> out what still needs to be done.
> > >>
> > >> @Project Restructuring: I prefer doing it now.
> > >>
> > >>
> > >> On 22.10.2015 15:54, Till Rohrmann wrote:
> > >>
> > >>> This reminded me that at some point we wanted to remove the old
> record
> > >>> API (
> > >>> https://issues.apache.org/jira/browse/FLINK-1681). I think that
> > Chesnay
> > >>> checked with Henry on this topic in JIRA.
> > >>>
> > >>> On Thu, Oct 22, 2015 at 3:38 PM, Fabian Hueske 
> > >>> wrote:
> > >>>
> > >>> I'd like to bring up Vasia's question on the project structure.
> > >>>>
> > >>>> Stephan started the discussion and proposed a new project structure
> > >>>> about
> > >>>> three weeks ago [1].
> > >>>> The proposal was refined a bit and eventually backed by many +1s.
> > >>>>
> > >>>> Do we want to make this happen in 0.10 or do we postpone it after
> the
> > >>>> release?
> > >>>>
> > >>>> Cheers,
> > >>>> Fabian
> > >>>>
> > >>>> [1][
> > >>>>
> > >>>>
> > >>>>
> >
> http://mail-archives.apache.org/mod_mbox/flink-dev/201510.mbox/%3CCANC1h_u6qtEsF1WCcoU1d38JGd%2BXTAQWmvp9Stx4vfe68BOjBw%40mail.gmail.com%3E
> > >>>>
> > >>>>
> > >>>> 2015-10-22 15:10 GMT+02:00 Suneel Marthi :
> > >>>>
> > >>>>> We are actually targeting Flink 0.10, since 0.10 would be out by the
> > >>>>> time we have Flink-Mahout integration in place.
> > >>>>>
> > >>>>> On Thu, Oct 22, 2015 at 9:02 AM, Till Rohrmann <
> trohrm...@apache.org
> > >
> > >>>>> wrote:
> > >>>>>
> > >>>>> Forget my last mail. Just found out that the Mahout guys are still
> > >>>>>>
> > >>>>> running
> > >>>>>
> > >>>>>> on 0.9-SNAPSHOT.
> > >>>>>> ​
> > >>>>>>
> > >>>>>> On Thu, Oct 22, 2015 at 2:53 PM, Till Rohrmann <
> > trohrm...@apache.org>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>> I found another issue (FLINK-2894
> > >>>>>>> <https://issues.apache.org/jira/browse/FLINK-2894>) while
> help

Re: Web Page Issue

2015-10-26 Thread Maximilian Michels
Thanks Matthias for pointing this out. I opened an issue some time ago with
a similar description: https://issues.apache.org/jira/browse/FLINK-2752

I agree with Fabian and Ufuk that it makes sense to separate the website
and the source repository. However, the distinction between the
documentation and the homepage should be more clear.

On Mon, Oct 26, 2015 at 10:35 AM, Ufuk Celebi  wrote:

>
> > On 26 Oct 2015, at 10:27, Fabian Hueske  wrote:
> >
> > The website consists of two parts which are maintained in two separate
> > respositories:
> >
> > 1) The project website about features, community, etc.
> > 2) The documentation of the project
> >
> > We have the separation because we want to be able to update source and
> > documentation in one repository and avoid the documentation getting out of
> > sync. The documentation is built every night and hosted at ci.apache.org to
> > achieve that.
> >
> > IMO, this separation makes sense, because the project website is not
> > changed very often whereas the documentation should be touched whenever the
> > API or behavior is changed. I think it is very important to have the
> > documentation in sync with the code. In fact, I believe both parts of the
> > website should not be related to each other, so there shouldn't be a way
> > for the two parts to get out of sync, except for layout / design, which is
> > nice to have but not crucial. We might even think about changing the
> > color scheme of the documentation to make the difference clearer.
>
> Yes, Max pointed this out in the beginning. Let’s change the colors/design
> to make the distinction clear. The confusion comes from the fact that they
> look similar. It only makes sense to assume that they are hosted on the
> same web server etc. But as Fabian said, there are good reasons against it.
>
> – Ufuk


Scala 2.10/2.11 Maven dependencies

2015-10-26 Thread Maximilian Michels
Hi Flinksters,

We have recently committed an easy way to change Flink's Scala version. The
question arises now whether we should ship Scala 2.11 as binaries and via
Maven. For the rc0, I created all binaries twice, for Scala 2.10 and 2.11.
However, I didn't create Maven artifacts. This follows our current shipping
strategy where we only ship Hadoop1 and Hadoop 2.3.0 Maven dependencies but
additionally Hadoop 2.4, 2.6, 2.7 as binaries.

Should we also upload Maven dependencies for Scala 2.11?

If so, the next question arises: What version pattern should we have for
the Flink Scala 2.11 dependencies? For Hadoop, we append -hadoop1 to the
VERSION, e.g. artifactID=flink-core, version=0.9.1-hadoop1.

However, it is common practice to append the suffix to the artifactID of
the Maven dependency, e.g. artifactID=flink-core_2.11, version=0.9.1. This
is mostly for historical reasons but is widely used.

Whatever naming pattern we choose, it should be consistent. I would be in
favor of changing our artifact names to contain the Hadoop and Scala
version. This would also imply that all Scala dependent Maven modules
receive a Scala suffix (also the default Scala 2.10 modules).
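
To make the two patterns concrete, here is what a downstream pom.xml would
look like in each case (an illustrative sketch only; the coordinates follow
the examples above):

    <!-- Pattern A: suffix on the version, as we currently do for Hadoop -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-core</artifactId>
      <version>0.9.1-hadoop1</version>
    </dependency>

    <!-- Pattern B: suffix on the artifactId, the common Scala convention -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-core_2.11</artifactId>
      <version>0.9.1</version>
    </dependency>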

Cheers,
Max


[VOTE] [RESULT] Release Apache Flink 0.10.0 (release-0.10.0-rc0)

2015-10-26 Thread Maximilian Michels
The vote is cancelled in favor of a new release candidate.

On Mon, Oct 26, 2015 at 3:52 PM, Till Rohrmann  wrote:

> I wasn't able to reproduce the problem of FLINK-2800 either. Still looking
> into it.
>
> On Mon, Oct 26, 2015 at 10:38 AM, Fabian Hueske  wrote:
>
> > +1 for a new RC.
> >
> > I tried to reproduce FLINK-2800 but did not succeed yet. I will spend a
> bit
> > more time on it and if we have a fix within time (before a new RC) we can
> > include it.
> >
> > 2015-10-26 10:36 GMT+01:00 Maximilian Michels :
> >
> > > Now that the pressing issues of the rc0 have been fixed and pushed to
> the
> > > release-0.10 branch, I would like to go ahead and create a new rc1.
> > >
> > > There are still FLINK-2763 and FLINK-2800 but no immediate fix seems to
> > be
> > > available.
> > >
> > > Any further issues we would like to fix in the new release candidate?
> > >
> > > On Thu, Oct 22, 2015 at 6:00 PM, Stephan Ewen 
> wrote:
> > >
> > > > Thanks for bringing up the projects and the record API:
> > > >
> > > >  - Concerning the projects: Nice to have, but not critical, unless we
> > > want
> > > > to change the names of the Maven artifacts. I would rather not rush
> > this
> > > >
> > > >  - Removal of Record API. Good thing to have, but should not be a
> > release
> > > > blocker. I would be fine with doing this for 1.0
> > > >
> > > > On Thu, Oct 22, 2015 at 5:24 PM, Fabian Hueske 
> > > wrote:
> > > >
> > > > > Hmm, it took IntelliJ some time to figure out all the consequences
> of
> > > > > removing the Record API.
> > > > > Seems to be more than I initially expected.
> > > > >
> > > > > @Chesnay, do you want to help? I would push my current version to
> my
> > > > > repository and you could take over some packages and fix the tests.
> > > Just
> > > > > reply to me directly to coordinate. Thanks.
> > > > >
> > > > > 2015-10-22 16:45 GMT+02:00 Fabian Hueske :
> > > > >
> > > > > > I just deleted the Record API to check what would break.
> > > > > > Doesn't look too scary, just a few tests that need to be adapted.
> > I'm
> > > > > > right in the middle of that. Hope to open a PR soon.
> > > > > >
> > > > > > 2015-10-22 16:42 GMT+02:00 Chesnay Schepler  >:
> > > > > >
> > > > > >> @RecordAPI: Yes, i was curious where we are at regarding the
> > removal
> > > > of
> > > > > >> the Record API.
> > > > > >> If there are still tests left to port (or other related things)
> > I'd
> > > be
> > > > > >> more than happy to do it (got a /lot/ of free time on my hands).
> > > > > >> The related JIRA issues weren't particularly helpful though in
> > > > figuring
> > > > > >> out what still needs to be done.
> > > > > >>
> > > > > >> @Project Restructuring: I prefer doing it now.
> > > > > >>
> > > > > >>
> > > > > >> On 22.10.2015 15:54, Till Rohrmann wrote:
> > > > > >>
> > > > > >>> This reminded me that at some point we wanted to remove the old
> > > > record
> > > > > >>> API (
> > > > > >>> https://issues.apache.org/jira/browse/FLINK-1681). I think
> that
> > > > > Chesnay
> > > > > >>> checked with Henry on this topic in JIRA.
> > > > > >>>
> > > > > >>> On Thu, Oct 22, 2015 at 3:38 PM, Fabian Hueske <
> > fhue...@gmail.com>
> > > > > >>> wrote:
> > > > > >>>
> > > > > >>> I'd like to bring up Vasia's question on the project structure.
> > > > > >>>>
> > > > > >>>> Stephan started the discussion and proposed a new project
> > > structure
> > > > > >>>> about
> > > > > >>>> three weeks ago [1].
> > > > > >>>> The proposal was refined a bit and eventually backed by many
> > +1s.
> > > > > >>>>
> > > > > >>>> Do we want to make this happen

[VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc1)

2015-10-26 Thread Maximilian Michels
Please vote on releasing the following candidate as Apache Flink version
0.10.0:

The commit to be voted on:
d4479404a9a9245ed897189973d8f6dadb9c814b

Branch:
release-0.10.0-rc1 (see
https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)

The release artifacts to be voted on can be found at:
http://people.apache.org/~mxm/flink-0.10.0-rc1/

The release artifacts are signed with the key with fingerprint C2909CBF:
http://www.apache.org/dist/flink/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapacheflink-1048

-

The vote is open for the next 72 hours and passes if a majority of at least
three +1 PMC votes are cast.

The vote ends on Thursday October 29, 2015.

[ ] +1 Release this package as Apache Flink 0.10.0
[ ] -1 Do not release this package because ...

===

The following commits have been added on top of release-0.10.0-rc0:

65fcd3a [FLINK-2891] [streaming] Set keys for key/value state in window
evaluation of fast-path windows.
8e4cb0a [FLINK-2888] [streaming] State backends return copies of the
default values
c2811ce [FLINK-2866] [runtime] Eagerly close FSDataInputStream in file
state handle
ec1730b [docs] add information on how to use Kerberos
856b278 [hotfix] Improve handling of Window Trigger results
bc5b852 [hotfix] Add Window Parameter in
Trigger.onEventTime/onProcessingTime
15d3f10 [FLINK-2895] Duplicate immutable object creation
8ec828c [FLINK-2893] [runtime] Consistent naming of recovery config
parameters
c0d7073 [FLINK-1982] [record-api] Remove dependencies on Record API from
flink-runtime tests
712c868 [hotfix] Fix Mutable Object window aggregator/Disable Object Copy
45ab0eb [hotfix] Fix broken copy in OperatorChain
c257abf Add copy() to Tuple base class.
85b73e0 [hotfix] Fix processing time triggering on Window Operator
c72eff4 [FLINK-2874] Fix recognition of Scala default setters
42b5ead [FLINK-2874] Fix Avro getter/setter recognition
5c3eb8b [FLINK-2668] [DataSet] [api-breaking] Chained Projections are no
longer appended
dadb1a8 [FLINK-2206] [webui] Fix incorrect counts of finished, canceled,
and failed jobs in new web dashboard
e340f83 [FLINK-2891] Set KV-State key upon Window Evaluation in General
Windows
db19973 [FLINK-2887] [gelly] make sendMessageToAllNeighbors respect the
EdgeDirection if set in the configuration


Re: Broken link for master Javadocs

2015-10-27 Thread Maximilian Michels
Hi Henry,

Yes, there is. The Commits@ list actually gets notifications on failures
and recoveries. I figured sending them to dev@ would bother too many people
because sometimes the infrastructure is flaky and it fails for no
particular reason.

Cheers,
Max

On Tue, Oct 27, 2015 at 4:18 AM, Henry Saputra 
wrote:

> Hi Max,
>
> Is there a way that dev@ list gets email notification if the build fail
> for
> the build bot?
>
> - Henry
>
> On Monday, October 26, 2015, Maximilian Michels  wrote:
>
> > Thanks for reporting, Suneel. On my machine the Java docs build.
> >
> > Here's the build log:
> >
> >
> https://ci.apache.org/builders/flink-docs-master/builds/122/steps/Java%20%26%20Scala%20docs/logs/stdio
> >
> >
> > [ERROR]
> >
> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:35:
> > error: not found: type ILoopCompat
> > [ERROR]   extends ILoopCompat(in0, out0) {
> > [ERROR]   ^
> > [ERROR]
> >
> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:29:
> > error: too many arguments for constructor Object: ()Object
> > [ERROR] class FlinkILoop(
> > [ERROR] ^
> > [ERROR]
> >
> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:118:
> > error: value createInterpreter is not a member of AnyRef
> > [ERROR] super.createInterpreter()
> > [ERROR]   ^
> > [ERROR]
> >
> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:120:
> > error: not found: value addThunk
> > [ERROR] addThunk {
> > [ERROR] ^
> > [ERROR]
> >
> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:138:
> > error: not found: value intp
> > [ERROR] val vd = intp.virtualDirectory
> > [ERROR]  ^
> > [ERROR]
> >
> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:186:
> > error: not found: value echo
> > [ERROR] echo(
> > [ERROR] ^
> > [ERROR]
> >
> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkShell.scala:151:
> > error: value process is not a member of
> > org.apache.flink.api.scala.FlinkILoop
> > [ERROR]   repl.foreach(_.process(settings))
> > [ERROR]  ^
> > [ERROR]
> >
> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkShell.scala:153:
> > error: value closeInterpreter is not a member of
> > org.apache.flink.api.scala.FlinkILoop
> > [ERROR]   repl.foreach(_.closeInterpreter())
> > [ERROR]  ^
> > [ERROR] 8 errors found
> >
> >
> > Not sure what the issue is. I'll try to look into it later.
> >
> > Thanks,
> > Max
> >
> > On Mon, Oct 26, 2015 at 7:12 AM, Henry Saputra  > >
> > wrote:
> >
> > > Thanks for the heads up, Suneel.
> > >
> > > Seems like the master Java API (api/java/index.html) is not being built:
> > > https://ci.apache.org/projects/
> > >
> > > I have filed a ticket with infra to help figure out why.
> > >
> > > - Henry
> > >
> > > On Sat, Oct 24, 2015 at 5:45 PM, Suneel Marthi  > > wrote:
> > > > https://ci.apache.org/projects/flink/flink-docs-master/api/java
> > > >
> > > > needs to be fixed.
> > >
> >
>


Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc1)

2015-10-27 Thread Maximilian Michels
I've prepared a new testing document:
https://docs.google.com/document/d/1S3niz5dPElA4dX-SfLwf9JwkzEAYHVLD7vsLLOAu06Q/edit

Please take one or two checks and verify them. Thanks and happy testing :)



On Mon, Oct 26, 2015 at 11:06 PM, Maximilian Michels  wrote:

> Please vote on releasing the following candidate as Apache Flink version
> 0.10.0:
>
> The commit to be voted on:
> d4479404a9a9245ed897189973d8f6dadb9c814b
>
> Branch:
> release-0.10.0-rc1 (see
> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)
>
> The release artifacts to be voted on can be found at:
> http://people.apache.org/~mxm/flink-0.10.0-rc1/
>
> The release artifacts are signed with the key with fingerprint C2909CBF:
> http://www.apache.org/dist/flink/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapacheflink-1048
>
> -
>
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PMC votes are cast.
>
> The vote ends on Thursday October 29, 2015.
>
> [ ] +1 Release this package as Apache Flink 0.10.0
> [ ] -1 Do not release this package because ...
>
> ===
>
> The following commits have been added on top of release-0.10.0-rc0:
>
> 65fcd3a [FLINK-2891] [streaming] Set keys for key/value state in window
> evaluation of fast-path windows.
> 8e4cb0a [FLINK-2888] [streaming] State backends return copies of the
> default values
> c2811ce [FLINK-2866] [runtime] Eagerly close FSDataInputStream in file
> state handle
> ec1730b [docs] add information on how to use Kerberos
> 856b278 [hotfix] Improve handling of Window Trigger results
> bc5b852 [hotfix] Add Window Parameter in
> Trigger.onEventTime/onProcessingTime
> 15d3f10 [FLINK-2895] Duplicate immutable object creation
> 8ec828c [FLINK-2893] [runtime] Consistent naming of recovery config
> parameters
> c0d7073 [FLINK-1982] [record-api] Remove dependencies on Record API from
> flink-runtime tests
> 712c868 [hotfix] Fix Mutable Object window aggregator/Disable Object Copy
> 45ab0eb [hotfix] Fix broken copy in OperatorChain
> c257abf Add copy() to Tuple base class.
> 85b73e0 [hotfix] Fix processing time triggering on Window Operator
> c72eff4 [FLINK-2874] Fix recognition of Scala default setters
> 42b5ead [FLINK-2874] Fix Avro getter/setter recognition
> 5c3eb8b [FLINK-2668] [DataSet] [api-breaking] Chained Projections are no
> longer appended
> dadb1a8 [FLINK-2206] [webui] Fix incorrect counts of finished, canceled,
> and failed jobs in new web dashboard
> e340f83 [FLINK-2891] Set KV-State key upon Window Evaluation in General
> Windows
> db19973 [FLINK-2887] [gelly] make sendMessageToAllNeighbors respect the
> EdgeDirection if set in the configuration
>
>


Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc1)

2015-10-27 Thread Maximilian Michels
Thanks for spotting this, Aljoscha. The main issue is the quickstart
files. It's quite odd that the release scripts didn't catch that.

On Tue, Oct 27, 2015 at 12:18 PM, Ufuk Celebi  wrote:

>
> > On 27 Oct 2015, at 12:12, Aljoscha Krettek  wrote:
> >
> > There are still references to 0.10-SNAPSHOT in the release. Especially
> for the quickstarts this is problematic:
> >
> > ~/D/flink (release-0.10.0-rc1|✔) $ ag "0.10-SNAPSHOT"
> > docs/_config.yml
> > 30:version: "0.10-SNAPSHOT"
> >
> > docs/apis/best_practices.md
> > 331:  0.10-SNAPSHOT
> > 346:  0.10-SNAPSHOT
> > 361:  0.10-SNAPSHOT
> >
> > docs/apis/storm_compatibility.md
> > 276:`flink-storm-examples-0.10-SNAPSHOT.jar` is **no** valid jar file
> for job execution (it is only a standard maven artifact).
> >
> > docs/internals/monitoring_rest_api.md
> > 89:  "flink-version": "0.10-SNAPSHOT",
> >
> >
> flink-quickstart/flink-quickstart-java/src/main/resources/archetype-resources/pom.xml
> > 33:   0.10-SNAPSHOT
> >
> >
> flink-quickstart/flink-quickstart-scala/src/main/resources/archetype-resources/pom.xml
> > 48:   0.10-SNAPSHOT
> >
> > tools/change-version.sh
> > 21:NEW="0.10-SNAPSHOT”
>
> We need to adapt the _config.yml. The docs should use the variable and not
> the hardcoded values.
>
> The docs will be updated on the web page (built from release-0.10 branch
> and not the RC), so we can change this after the release.
>
> @Max: Let’s add a note to the releasing Wiki page?
>
> – Ufuk
>
>
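
As a concrete illustration of the fix Ufuk suggests: docs/_config.yml
already defines the version (see the grep output above), and Jekyll exposes
any key in _config.yml to every page as a site variable, so the docs can
reference it instead of hardcoding the string. A minimal sketch, assuming
the standard Jekyll templating the docs build uses:

    # docs/_config.yml
    version: "0.10-SNAPSHOT"

    <!-- in a docs page, instead of the hardcoded string: -->
    <version>{{ site.version }}</version>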


Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc1)

2015-10-27 Thread Maximilian Michels
Good catch, Aljoscha. As far as I know the plan visualizer is only broken
for Safari. It works for me with Firefox.

On Tue, Oct 27, 2015 at 3:14 PM, Aljoscha Krettek 
wrote:

> The plan visualizer does not show anything for the output generated by
> “bin/flink info”
>
> > On 27 Oct 2015, at 13:48, Aljoscha Krettek  wrote:
> >
> > start-cluster-streaming.sh and start-local-streaming.sh don’t work if
> the flink path has spaces. I’m fixing it on master and on release-0.10.
> >> On 26 Oct 2015, at 23:06, Maximilian Michels  wrote:
> >>
> >> Please vote on releasing the following candidate as Apache Flink version
> >> 0.10.0:
> >>
> >> The commit to be voted on:
> >> d4479404a9a9245ed897189973d8f6dadb9c814b
> >>
> >> Branch:
> >> release-0.10.0-rc1 (see
> >> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)
> >>
> >> The release artifacts to be voted on can be found at:
> >> http://people.apache.org/~mxm/flink-0.10.0-rc1/
> >>
> >> The release artifacts are signed with the key with fingerprint C2909CBF:
> >> http://www.apache.org/dist/flink/KEYS
> >>
> >> The staging repository for this release can be found at:
> >> https://repository.apache.org/content/repositories/orgapacheflink-1048
> >>
> >> -
> >>
> >> The vote is open for the next 72 hours and passes if a majority of at
> least
> >> three +1 PMC votes are cast.
> >>
> >> The vote ends on Thursday October 29, 2015.
> >>
> >> [ ] +1 Release this package as Apache Flink 0.10.0
> >> [ ] -1 Do not release this package because ...
> >>
> >> ===
> >>
> >> The following commits have been added on top of release-0.10.0-rc0:
> >>
> >> 65fcd3a [FLINK-2891] [streaming] Set keys for key/value state in window
> >> evaluation of fast-path windows.
> >> 8e4cb0a [FLINK-2888] [streaming] State backends return copies of the
> >> default values
> >> c2811ce [FLINK-2866] [runtime] Eagerly close FSDataInputStream in file
> >> state handle
> >> ec1730b [docs] add information on how to use Kerberos
> >> 856b278 [hotfix] Improve handling of Window Trigger results
> >> bc5b852 [hotfix] Add Window Parameter in
> >> Trigger.onEventTime/onProcessingTime
> >> 15d3f10 [FLINK-2895] Duplicate immutable object creation
> >> 8ec828c [FLINK-2893] [runtime] Consistent naming of recovery config
> >> parameters
> >> c0d7073 [FLINK-1982] [record-api] Remove dependencies on Record API from
> >> flink-runtime tests
> >> 712c868 [hotfix] Fix Mutable Object window aggregator/Disable Object
> Copy
> >> 45ab0eb [hotfix] Fix broken copy in OperatorChain
> >> c257abf Add copy() to Tuple base class.
> >> 85b73e0 [hotfix] Fix processing time triggering on Window Operator
> >> c72eff4 [FLINK-2874] Fix recognition of Scala default setters
> >> 42b5ead [FLINK-2874] Fix Avro getter/setter recognition
> >> 5c3eb8b [FLINK-2668] [DataSet] [api-breaking] Chained Projections are no
> >> longer appended
> >> dadb1a8 [FLINK-2206] [webui] Fix incorrect counts of finished, canceled,
> >> and failed jobs in new web dashboard
> >> e340f83 [FLINK-2891] Set KV-State key upon Window Evaluation in General
> >> Windows
> >> db19973 [FLINK-2887] [gelly] make sendMessageToAllNeighbors respect the
> >> EdgeDirection if set in the configuration
> >
>
>


Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc1)

2015-10-27 Thread Maximilian Michels
Thank you all for testing so far!

We fixed the version number problem. Ufuk's configuration parameters and
shading for EMR have been pulled into the release-0.10 branch. I will go
ahead and create a new RC.

On Tue, Oct 27, 2015 at 6:13 PM, Aljoscha Krettek 
wrote:

> Apart from a few minor problems the release seems to be in good shape.
>  - The version number problem should be easy to fix
>  - I don’t know if we’re going to fix the visualizer for Safari before the
> release
>  - The start-*-streaming.sh scripts are already fixed to support spaces in
> paths
>
> Today I ran the tests in the document, including:
>  - Running in different cluster/local/streaming modes
>  - running the examples with builtin/external data
>  - Running a stateful job on a cluster with Yarn and Kafka to test
> resilience to TaskManager failures and JobManager failures (with Ufuk)
>
> These all worked, so if we can quickly create a new release candidate it
> could be successful.
>
> @Ufuk, do you still want to fix the missing default configuration
> parameters before the next RC?
>
> What do you think?
> > On 27 Oct 2015, at 15:46, Aljoscha Krettek  wrote:
> >
> > Yes, I can confirm that it works with Chrome on OS X
> >> On 27 Oct 2015, at 15:26, Vasiliki Kalavri 
> wrote:
> >>
> >> I tested this for rc0, and I confirm.
> >> Worked fine for Firefox and Chrome, didn't work for Safari (I left a
> note
> >> in the previous testing doc).
> >>
> >> -Vasia.
> >>
> >> On 27 October 2015 at 15:18, Maximilian Michels  wrote:
> >>
> >>> Good catch, Aljoscha. As far as I know the plan visualizer is only
> broken
> >>> for Safari. It works for me with Firefox.
> >>>
> >>> On Tue, Oct 27, 2015 at 3:14 PM, Aljoscha Krettek  >
> >>> wrote:
> >>>
> >>>> The plan visualizer does not show anything for the output generated by
> >>>> “bin/flink info”
> >>>>
> >>>>> On 27 Oct 2015, at 13:48, Aljoscha Krettek 
> >>> wrote:
> >>>>>
> >>>>> start-cluster-streaming.sh and start-local-streaming.sh don’t work if
> >>>> the flink path has spaces. I’m fixing it on master and on
> release-0.10.
> >>>>>> On 26 Oct 2015, at 23:06, Maximilian Michels 
> wrote:
> >>>>>>
> >>>>>> Please vote on releasing the following candidate as Apache Flink
> >>> version
> >>>>>> 0.10.0:
> >>>>>>
> >>>>>> The commit to be voted on:
> >>>>>> d4479404a9a9245ed897189973d8f6dadb9c814b
> >>>>>>
> >>>>>> Branch:
> >>>>>> release-0.10.0-rc1 (see
> >>>>>> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)
> >>>>>>
> >>>>>> The release artifacts to be voted on can be found at:
> >>>>>> http://people.apache.org/~mxm/flink-0.10.0-rc1/
> >>>>>>
> >>>>>> The release artifacts are signed with the key with fingerprint
> >>> C2909CBF:
> >>>>>> http://www.apache.org/dist/flink/KEYS
> >>>>>>
> >>>>>> The staging repository for this release can be found at:
> >>>>>>
> >>> https://repository.apache.org/content/repositories/orgapacheflink-1048
> >>>>>>
> >>>>>> -
> >>>>>>
> >>>>>> The vote is open for the next 72 hours and passes if a majority of
> at
> >>>> least
> >>>>>> three +1 PMC votes are cast.
> >>>>>>
> >>>>>> The vote ends on Thursday October 29, 2015.
> >>>>>>
> >>>>>> [ ] +1 Release this package as Apache Flink 0.10.0
> >>>>>> [ ] -1 Do not release this package because ...
> >>>>>>
> >>>>>> ===
> >>>>>>
> >>>>>> The following commits have been added on top of release-0.10.0-rc0:
> >>>>>>
> >>>>>> 65fcd3a [FLINK-2891] [streaming] Set keys for key/value state in
> >>> window
> >>>>>> evaluation of fast-path windows.
> >>>>>> 8e4cb0a [FLINK-2888] [streaming] State backends return copies of the
> >>>>>> default values
> >

Re: [VOTE] [RESULT] Release Apache Flink 0.10.0 (release-0.10.0-rc1)

2015-10-27 Thread Maximilian Michels
The vote is cancelled in favor of a new release candidate.

On Tue, Oct 27, 2015 at 10:10 PM, Maximilian Michels  wrote:

> Thank you all for testing so far!
>
> We fixed the version number problem. Ufuk's configuration parameters and
> shading for EMR have been pulled into the release-0.10 branch. I will go
> ahead and create a new RC.
>
> On Tue, Oct 27, 2015 at 6:13 PM, Aljoscha Krettek 
> wrote:
>
>> Apart from a few minor problems the release seems to be in good shape.
>>  - The version number problem should be easy to fix
>>  - I don’t know if we’re going to fix the visualizer for Safari before
>> the release
>>  - The start-*-streaming.sh scripts are already fixed to support spaces
>> in paths
>>
>> Today I ran the tests in the document, including:
>>  - Running in different cluster/local/streaming modes
>>  - running the examples with builtin/external data
>>  - Running a stateful job on a cluster with Yarn and Kafka to test
>> resilience to TaskManager failures and JobManager failures (with Ufuk)
>>
>> These all worked, so if we can quickly create a new release candidate it
>> could be successful.
>>
>> @Ufuk, do you still want to fix the missing default configuration
>> parameters before the next RC?
>>
>> What do you think?
>> > On 27 Oct 2015, at 15:46, Aljoscha Krettek  wrote:
>> >
>> > Yes, I can confirm that it works with Chrome on OS X
>> >> On 27 Oct 2015, at 15:26, Vasiliki Kalavri 
>> wrote:
>> >>
>> >> I tested this for rc0, and I confirm.
>> >> Worked fine for Firefox and Chrome, didn't work for Safari (I left a
>> note
>> >> in the previous testing doc).
>> >>
>> >> -Vasia.
>> >>
>> >> On 27 October 2015 at 15:18, Maximilian Michels 
>> wrote:
>> >>
>> >>> Good catch, Aljoscha. As far as I know the plan visualizer is only
>> broken
>> >>> for Safari. It works for me with Firefox.
>> >>>
>> >>> On Tue, Oct 27, 2015 at 3:14 PM, Aljoscha Krettek <
>> aljos...@apache.org>
>> >>> wrote:
>> >>>
>> >>>> The plan visualizer does not show anything for the output generated
>> by
>> >>>> “bin/flink info”
>> >>>>
>> >>>>> On 27 Oct 2015, at 13:48, Aljoscha Krettek 
>> >>> wrote:
>> >>>>>
>> >>>>> start-cluster-streaming.sh and start-local-streaming.sh don’t work
>> if
>> >>>> the flink path has spaces. I’m fixing it on master and on
>> release-0.10.
>> >>>>>> On 26 Oct 2015, at 23:06, Maximilian Michels 
>> wrote:
>> >>>>>>
>> >>>>>> Please vote on releasing the following candidate as Apache Flink
>> >>> version
>> >>>>>> 0.10.0:
>> >>>>>>
>> >>>>>> The commit to be voted on:
>> >>>>>> d4479404a9a9245ed897189973d8f6dadb9c814b
>> >>>>>>
>> >>>>>> Branch:
>> >>>>>> release-0.10.0-rc1 (see
>> >>>>>> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)
>> >>>>>>
>> >>>>>> The release artifacts to be voted on can be found at:
>> >>>>>> http://people.apache.org/~mxm/flink-0.10.0-rc1/
>> >>>>>>
>> >>>>>> The release artifacts are signed with the key with fingerprint
>> >>> C2909CBF:
>> >>>>>> http://www.apache.org/dist/flink/KEYS
>> >>>>>>
>> >>>>>> The staging repository for this release can be found at:
>> >>>>>>
>> >>>
>> https://repository.apache.org/content/repositories/orgapacheflink-1048
>> >>>>>>
>> >>>>>> -
>> >>>>>>
>> >>>>>> The vote is open for the next 72 hours and passes if a majority of
>> at
>> >>>> least
>> >>>>>> three +1 PMC votes are cast.
>> >>>>>>
>> >>>>>> The vote ends on Thursday October 29, 2015.
>> >>>>>>
>> >>>>>> [ ] +1 Release this package as Apache Flink 0.10.0
>> >>>>>> [ ] -1 Do not release this package because ...
>> >

[VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc2)

2015-10-27 Thread Maximilian Michels
Please vote on releasing the following candidate as Apache Flink version
0.10.0:

The commit to be voted on:
ed75049dfc9748eae81ace9d4d686907dcd7835c

Branch:
release-0.10.0-rc2 (see
https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)

The release artifacts to be voted on can be found at:
http://people.apache.org/~mxm/flink-0.10.0-rc2/

The release artifacts are signed with the key with fingerprint C2909CBF:
http://www.apache.org/dist/flink/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapacheflink-1049

-

The vote is open for the next 72 hours and passes if a majority of at least
three +1 PMC votes are cast.

The vote ends on Friday, October 30, 2015.

[ ] +1 Release this package as Apache Flink 0.10.0
[ ] -1 Do not release this package because ...

===

The following commits have been added on top of release-0.10.0-rc1:

ae19d2b [FLINK-2927] [runtime] Provide default required configuration keys
in flink-conf of binary distribution
874c500 Add org.apache.httpcomponents:(httpcore, httpclient) to dependency
management
04e25e1 [docs] remove hard-coded version artifacts
b240a80 [FLINK-2878] [webmonitor] Fix unexpected leader address pattern
16a9edc [hotfix] Fix issue with spaces in Path in start-*-streaming.sh
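
For context on the last fix: the usual cause of spaces-in-path breakage is
an unquoted variable expansion in the shell scripts. A minimal sketch of
the kind of change involved (the variable name is illustrative; this is not
the actual patch):

    # broken: word-splits if the Flink directory contains spaces
    . $FLINK_BIN_DIR/config.sh

    # fixed: quote every expansion that can contain a path
    . "$FLINK_BIN_DIR"/config.sh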


Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc2)

2015-10-28 Thread Maximilian Michels
Thanks for testing, Vasia :)

Here is the new document:
https://docs.google.com/document/d/1CR3DH4tUJvukxGFQ1ySxfnzO00LjPhSTwkeE7Mf98CY/edit

I've transferred results which are unaffected by the changes of the new RC.

On Wed, Oct 28, 2015 at 10:33 AM, Vasiliki Kalavri <
vasilikikala...@gmail.com> wrote:

> Is there a new testing doc for rc2 or are we using the previous one?
> Thanks!
>
> On 27 October 2015 at 22:17, Maximilian Michels  wrote:
>
> > Please vote on releasing the following candidate as Apache Flink version
> > 0.10.0:
> >
> > The commit to be voted on:
> > ed75049dfc9748eae81ace9d4d686907dcd7835c
> >
> > Branch:
> > release-0.10.0-rc2 (see
> > https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)
> >
> > The release artifacts to be voted on can be found at:
> > http://people.apache.org/~mxm/flink-0.10.0-rc2/
> >
> > The release artifacts are signed with the key with fingerprint C2909CBF:
> > http://www.apache.org/dist/flink/KEYS
> >
> > The staging repository for this release can be found at:
> > https://repository.apache.org/content/repositories/orgapacheflink-1049
> >
> > -
> >
> > The vote is open for the next 72 hours and passes if a majority of at
> least
> > three +1 PMC votes are cast.
> >
> > The vote ends on Friday, October 30, 2015.
> >
> > [ ] +1 Release this package as Apache Flink 0.10.0
> > [ ] -1 Do not release this package because ...
> >
> > ===
> >
> > The following commits have been added on top of release-0.10.0-rc1:
> >
> > ae19d2b [FLINK-2927] [runtime] Provide default required configuration
> keys
> > in flink-conf of binary distribution
> > 874c500 Add org.apache.httpcomponents:(httpcore, httpclient) to
> dependency
> > management
> > 04e25e1 [docs] remove hard-coded version artifacts
> > b240a80 [FLINK-2878] [webmonitor] Fix unexpected leader address pattern
> > 16a9edc [hotfix] Fix issue with spaces in Path in start-*-streaming.sh
> >
>


Re: Flink "Material"

2015-10-28 Thread Maximilian Michels
Yes, you can find lots of Flink slides on Slideshare.

On Tue, Oct 27, 2015 at 9:46 PM, Matthias J. Sax  wrote:

> Hi,
>
> I just "discovered" that on the Flink "Material" page, a couple of slide
> decks are listed (https://flink.apache.org/material.html). I guess, this
> list is far from complete. Should we try to extend it?
>
> -Matthias
>
>


Re: Web Page Issue

2015-10-28 Thread Maximilian Michels
> > >> => "Python", "Interactive Scale Shell", "Connectors", "Iterations",
> > >> "Hadoop" could go to "DataSet API"
> > >> => "Storm" could go to "DataStream API"
> > >> => as an alternative, "Pyhton", "Hadoop", and "Storm" could go to
> > >> "Libraries" too
> > >>
> > >>
> > >> -Matthias
> > >>
> > >>
> > >>
> > >> On 10/27/2015 11:42 AM, Fabian Hueske wrote:
> > >>> Hi Matthias,
> > >>>
> > >>> thanks for taking care of this issue.
> > >>> How about we change the menu completely, i.e., have menu entries for:
> > >>>
> > >>> - Project Website
> > >>> - Setup
> > >>>   - Local
> > >>>   - Cluster
> > >>>   - Yarn
> > >>> - DataSet API
> > >>>   - Programming guide
> > >>>   - transformations
> > >>> - DataStream API
> > >>>   - Programming Guide
> > >>> - Internals
> > >>>
> > >>> This is not a complete list, just what came to my mind right now.
> > >>>
> > >>> Cheers,
> > >>> Fabian
> > >>>
> > >>> 2015-10-27 3:39 GMT+01:00 Matthias J. Sax :
> > >>>
> > >>>> I started to work on this. Please see here:
> > >>>> https://github.com/mjsax/flink/tree/flink-2752-webpage
> > >>>>
> > >>>> Basically, I just changed the color scheme of the menu. I also removed
> > >>>> "How to Contribute" and "Coding Guidelines" from "Internals".
> > >>>>
> > >>>> To get an even better separation, I would like to change the menu from
> > >>>> the main web page, too.
> > >>>>
> > >>>>  - At least we should change the link of "Overview", which is not useful
> > >>>> at all right now (is it broken or intentional?)
> > >>>>  - I would also move "Quickstart" to a sub-menu point of "Documentation"
> > >>>>  - maybe we could move "Overview" to a sub-menu point of "Documentation",
> > >>>> too
> > >>>>
> > >>>> From my point of view, having a different menu structure and color
> > >>>> should be good enough to make the distinction of both pages clear.
> > >>>>
> > >>>> Btw: the link "setup guide" in *Getting Started* section at the main
> > >>>> page is broken... I would fix this together with those changes (if
> > >>>> accepted).
> > >>>>
> > >>>> Please give feedback.
> > >>>>
> > >>>> -Matthias
> > >>>>
> > >>>> On 10/26/2015 10:40 AM, Maximilian Michels wrote:
> > >>>>> Thanks Matthias for pointing this out. I opened an issue some time
> > ago
> > >>>> with
> > >>>>> a similar description:
> > >> https://issues.apache.org/jira/browse/FLINK-2752
> > >>>>>
> > >>>>> I agree with Fabian and Ufuk that it makes sense to separate the
> > >> website
> > >>>>> and the source repository. However, the distinction between the
> > >>>>> documentation and the homepage should be more clear.
> > >>>>>
> > >>>>> On Mon, Oct 26, 2015 at 10:35 AM, Ufuk Celebi 
> > wrote:
> > >>>>>
> > >>>>>>
> > >>>>>>> On 26 Oct 2015, at 10:27, Fabian Hueske  wrote:
> > >>>>>>>
> > >>>>>>> The website consists of two parts which are maintained in two
> > >> separate
> > >>>>>>> respositories:
> > >>>>>>>
> > >>>>>>> 1) The project website about features, community, etc.
> > >>>>>>> 2) The documentation of the project
> > >>>>>>>
> > >>>>>>> We have the separation because we want to be able to update source
> > >> and
> > >>>>>>> documentation in one repository to avoid that the documentation
> > gets
> > >>>> out
> > >>>>>> of
> > >>>>>>> sync. The documentation is built every night and hosted at
> > >>>> ci.apache.org
> > >>>>>> to
> > >>>>>>> achieve that.
> > >>>>>>>
> > >>>>>>> IMO, this separation makes sense, because the project website is
> > not
> > >>>>>>> changed very often whereas the documentation should be touched
> > >> whenever
> > >>>>>> the
> > >>>>>>> API or behavior is changed. I think it is very important to have
> > >>>>>>> documentation in sync with the code. In fact, I believe both parts
> > of
> > >>>> the
> > >>>>>>> website should not be related to each other, so they shouldn't be a
> > >> way
> > >>>>>> to
> > >>>>>>> have both parts getting out-of-sync, except for layout / design
> > which
> > >>>> is
> > >>>>>>> nice to have but not crucial. We might even think about changing
> > the
> > >>>>>>> color-scheme of the documentation to make the difference more
> > clear.
> > >>>>>>
> > >>>>>> Yes, Max pointed this out in the beginning. Let’s change the
> > >>>> colors/design
> > >>>>>> to make the distinction clear. The confusion comes from the fact
> > that
> > >>>> they
> > >>>>>> look similar. It only makes sense to assume that they are hosted on
> > >> the
> > >>>>>> same web server etc. But as Fabian said, there are good reasons
> > >> against
> > >>>> it.
> > >>>>>>
> > >>>>>> – Ufuk
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >>
> > >
> >
> >


Re: [DISCUSS] flink-external

2015-10-28 Thread Maximilian Michels
Thanks Matthias! I made a comment. Please open a pull request.

On Tue, Oct 27, 2015 at 10:37 PM, Matthias J. Sax  wrote:
> Just updated this. Improved the layout and added FastR project.
>
> https://github.com/mjsax/flink-web/tree/flink-external-page
>
> -Matthias
>
> On 10/27/2015 03:25 AM, Matthias J. Sax wrote:
>> Hi,
>>
>> I updated the flink-external section on the Flink Web-Page:
>> https://github.com/mjsax/flink-web/tree/flink-external-page
>>
>> The section is now located on the "Contribute" page. The layout needs some
>> refinement though... Some projects are "previews", i.e., Flink support was
>> announced but there is no information on the corresponding project web
>> pages. We might want to reach out to those people to see if we should
>> include those projects already or just add them later on.
>>
>> Please give feedback.
>>
>>
>> -Matthias
>>
>>
>>
>>
>> On 10/09/2015 03:34 PM, Maximilian Michels wrote:
>>> Yes, Community is a better place. You can also add the Dataflow Runner
>>> https://github.com/dataArtisans/flink-dataflow.
>>>
>>> On Fri, Oct 9, 2015 at 3:32 PM, Vasiliki Kalavri
>>>  wrote:
>>>> Thank you Matthias!
>>>>
>>>> I'm not sure whether the "Downloads" section is the right place for this.
>>>> I would actually put it under "Community", with a header "External
>>>> Contributions" or something like this, but I'm not feeling strong about
>>>> this :)
>>>>
>>>> -Vasia.
>>>>
>>>>
>>>> On 9 October 2015 at 15:29, Matthias J. Sax  wrote:
>>>>
>>>>> I was not sure what we should add and was hoping for input from the
>>>>> community.
>>>>>
>>>>> I am aware of the following projects we might want to add:
>>>>>
>>>>>   - Zeppelin
>>>>>   - SAMOA
>>>>>   - Mahout
>>>>>   - Cascading (dataartisan repo)
>>>>>   - BigPetStore
>>>>>   - Gradoop
>>>>>
>>>>>
>>>>> -Matthias
>>>>>
>>>>>
>>>>>
>>>>> On 10/09/2015 03:07 PM, Maximilian Michels wrote:
>>>>>> Cool. Right now the list is empty. Do you already have a list you
>>>>>> could include in the upcoming pull request? :)
>>>>>>
>>>>>> On Fri, Oct 9, 2015 at 2:29 PM, Matthias J. Sax 
>>>>> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I just started this. Please see
>>>>>>> https://github.com/mjsax/flink-web/tree/flink-external-page
>>>>>>>
>>>>>>> I think the best way is to extend the "Downloads" page. I would also
>>>>>>> add a link to this on the main page's "Getting Started" section.
>>>>>>>
>>>>>>> As a first try, I started like this:
>>>>>>>> Third party packages
>>>>>>>>
>>>>>>>> This is a list of third-party packages (i.e., libraries, system
>>>>>>>> extensions, or examples) built for Flink. The Flink community only collects
>>>>>>>> links to those packages but does not maintain them. Thus, they do not
>>>>>>>> belong to the Apache Flink project and the community cannot give any
>>>>>>>> support for them.
>>>>>>>> Package Name
>>>>>>>>
>>>>>>>> Available for Flink 0.8.x and 0.9.x
>>>>>>>>
>>>>>>>> Short description
>>>>>>>>
>>>>>>>> Please let us know if we missed your package. Be aware that we might
>>>>>>>> remove listed packages without notice.
>>>>>>>
>>>>>>> Can you please give me some input on which projects I should add initially?
>>>>>>>
>>>>>>>
>>>>>>> -Matthias
>>>>>>>
>>>>>>>
>>>>>>> On 10/08/2015 04:03 PM, Maximilian Michels wrote:
>>>>>>>> IMHO we can do that. There should be a disclaimer that the third party
>>>>>>>> software is not officially supported.
>>>>>>>>
>>>>>>>> On Thu, Oct 8, 2015 at 2:25 PM, Matthias J. Sax 

Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc2)

2015-10-28 Thread Maximilian Michels
It is supposed to show some general statistics about a job, but it is
currently just a placeholder. The accumulators are shown in the
overview. This page should be removed before the release.

Thanks,
Max

On Wed, Oct 28, 2015 at 12:54 PM, Vasiliki Kalavri
 wrote:
> I have a question regarding the web interface :)
> What is the "Job Accumulator/Statistics" tab supposed to show? No matter
> what job I run, the values are the same (operator=1, parallelism=2,
> subtasks=3). Are these hard-coded defaults?
>
> Thanks!
> -Vasia.
>
> On 28 October 2015 at 10:50, Maximilian Michels  wrote:
>
>> Thanks for testing, Vasia :)
>>
>> Here is the new document:
>>
>> https://docs.google.com/document/d/1CR3DH4tUJvukxGFQ1ySxfnzO00LjPhSTwkeE7Mf98CY/edit
>>
>> I've transferred results which are unaffected by the changes of the new RC.
>>
>> On Wed, Oct 28, 2015 at 10:33 AM, Vasiliki Kalavri <
>> vasilikikala...@gmail.com> wrote:
>>
>> > Is there a new testing doc for rc2 or are we using the previous one?
>> > Thanks!
>> >
>> > On 27 October 2015 at 22:17, Maximilian Michels  wrote:
>> >
>> > > Please vote on releasing the following candidate as Apache Flink
>> version
>> > > 0.10.0:
>> > >
>> > > The commit to be voted on:
>> > > ed75049dfc9748eae81ace9d4d686907dcd7835c
>> > >
>> > > Branch:
>> > > release-0.10.0-rc2 (see
>> > > https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)
>> > >
>> > > The release artifacts to be voted on can be found at:
>> > > http://people.apache.org/~mxm/flink-0.10.0-rc2/
>> > >
>> > > The release artifacts are signed with the key with fingerprint
>> C2909CBF:
>> > > http://www.apache.org/dist/flink/KEYS
>> > >
>> > > The staging repository for this release can be found at:
>> > > https://repository.apache.org/content/repositories/orgapacheflink-1049
>> > >
>> > > -
>> > >
>> > > The vote is open for the next 72 hours and passes if a majority of at
>> > least
>> > > three +1 PMC votes are cast.
>> > >
>> > > The vote ends on Friday, October 30, 2015.
>> > >
>> > > [ ] +1 Release this package as Apache Flink 0.10.0
>> > > [ ] -1 Do not release this package because ...
>> > >
>> > > ===
>> > >
>> > > The following commits have been added on top of release-0.10.0-rc1:
>> > >
>> > > ae19d2b [FLINK-2927] [runtime] Provide default required configuration
>> > keys
>> > > in flink-conf of binary distribution
>> > > 874c500 Add org.apache.httpcomponents:(httpcore, httpclient) to
>> > dependency
>> > > management
>> > > 04e25e1 [docs] remove hard-coded version artifacts
>> > > b240a80 [FLINK-2878] [webmonitor] Fix unexpected leader address pattern
>> > > 16a9edc [hotfix] Fix issue with spaces in Path in start-*-streaming.sh
>> > >
>> >
>>


Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc2)

2015-10-28 Thread Maximilian Michels
@Vasia:

- There are two types of shapes which are colored :) The circles mark
the running/finished/cancelled/failed jobs while the squares mark the
status of a task within a job
(cancelled/running/failed/restart/pending/finished/total).

- I can see all four columns in the "Plan" tab on Firefox. Which
version are you using? Does resizing the window make any difference?

@Sachin: Thanks for your pull requests. Will pull them in for the next RC.

On Wed, Oct 28, 2015 at 2:03 PM, Vasiliki Kalavri
 wrote:
> I think I found 2 more issues with the web interface.
>
> When inside a running job's view:
> - I think the colorful boxes with the number of tasks in each status show
> wrong values (or show something else?). I get different values than the
> ones I see in "Overview" and "Running Jobs" tabs.
> - In the "Plan" tab, it seems that some information is hidden and I cannot
> scroll right to see it. I can only see 3 columns for each operator: bytes
> read, records read and bytes written. I'm using Firefox.
>
> -Vasia.
>
> On 28 October 2015 at 13:13, Sachin Goel  wrote:
>
>> While we're at it, we should also remove the dummy log and stdout tabs for
>> task managers. The work on that hasn't been finished yet.
>> I'll file a jira for both.
>> On Oct 28, 2015 5:39 PM, "Vasiliki Kalavri" 
>> wrote:
>>
>> > I see, thank you! +1 for removing before the release :)
>> >
>> > On 28 October 2015 at 13:06, Sachin Goel 
>> wrote:
>> >
>> > > Those are hard coded values.
>> > > What exactly should be there, I'm not sure either.
>> > > On Oct 28, 2015 5:25 PM, "Vasiliki Kalavri" > >
>> > > wrote:
>> > >
>> > > > I have a question regarding the web interface :)
>> > > > What is the "Job Accumulator/Statistics" tab supposed to show? No
>> > matter
>> > > > what job I run, the values are the same (operator=1, parallelism=2,
>> > > > subtasks=3). Are these hard-coded defaults?
>> > > >
>> > > > Thanks!
>> > > > -Vasia.
>> > > >
>> > > > On 28 October 2015 at 10:50, Maximilian Michels 
>> > wrote:
>> > > >
>> > > > > Thanks for testing, Vasia :)
>> > > > >
>> > > > > Here is the new document:
>> > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://docs.google.com/document/d/1CR3DH4tUJvukxGFQ1ySxfnzO00LjPhSTwkeE7Mf98CY/edit
>> > > > >
>> > > > > I've transferred results which are unaffected by the changes of the
>> > new
>> > > > RC.
>> > > > >
>> > > > > On Wed, Oct 28, 2015 at 10:33 AM, Vasiliki Kalavri <
>> > > > > vasilikikala...@gmail.com> wrote:
>> > > > >
>> > > > > > Is there a new testing doc for rc2 or are we using the previous
>> > one?
>> > > > > > Thanks!
>> > > > > >
>> > > > > > On 27 October 2015 at 22:17, Maximilian Michels 
>> > > > wrote:
>> > > > > >
>> > > > > > > Please vote on releasing the following candidate as Apache
>> Flink
>> > > > > version
>> > > > > > > 0.10.0:
>> > > > > > >
>> > > > > > > The commit to be voted on:
>> > > > > > > ed75049dfc9748eae81ace9d4d686907dcd7835c
>> > > > > > >
>> > > > > > > Branch:
>> > > > > > > release-0.10.0-rc2 (see
>> > > > > > > https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)
>> > > > > > >
>> > > > > > > The release artifacts to be voted on can be found at:
>> > > > > > > http://people.apache.org/~mxm/flink-0.10.0-rc2/
>> > > > > > >
>> > > > > > > The release artifacts are signed with the key with fingerprint
>> > > > > C2909CBF:
>> > > > > > > http://www.apache.org/dist/flink/KEYS
>> > > > > > >
>> > > > > > > The staging repository for this release can be found at:
>> > > > > > >
>> > > >
>> https://repository.apache.org/content/repositories/orgapacheflink-1049
>> > > > > > >
>> > > > > > > -
>> > > > > > >
>> > > > > > > The vote is open for the next 72 hours and passes if a majority
>> > of
>> > > at
>> > > > > > least
>> > > > > > > three +1 PMC votes are cast.
>> > > > > > >
>> > > > > > > The vote ends on Friday, October 30, 2015.
>> > > > > > >
>> > > > > > > [ ] +1 Release this package as Apache Flink 0.10.0
>> > > > > > > [ ] -1 Do not release this package because ...
>> > > > > > >
>> > > > > > > ===
>> > > > > > >
>> > > > > > > The following commits have been added on top of
>> > release-0.10.0-rc1:
>> > > > > > >
>> > > > > > > ae19d2b [FLINK-2927] [runtime] Provide default required
>> > > configuration
>> > > > > > keys
>> > > > > > > in flink-conf of binary distribution
>> > > > > > > 874c500 Add org.apache.httpcomponents:(httpcore, httpclient) to
>> > > > > > dependency
>> > > > > > > management
>> > > > > > > 04e25e1 [docs] remove hard-coded version artifacts
>> > > > > > > b240a80 [FLINK-2878] [webmonitor] Fix unexpected leader address
>> > > > pattern
>> > > > > > > 16a9edc [hotfix] Fix issue with spaces in Path in
>> > > > start-*-streaming.sh
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>


Re: Web Page Issue

2015-10-28 Thread Maximilian Michels
+1 for keeping the Quickstart on the main page but I'm against
removing it from the documentation because it is, essentially, a part
of the documentation.

On Wed, Oct 28, 2015 at 2:25 PM, Matthias J. Sax  wrote:
> Good point.
>
> How often does "Quickstart" change? Seems to be fairly stable. Maybe we
> could move it from doc page to main page?
>
> Btw: The link "Flink on Windows" is broken in the Quickstart guide.
>
> -Matthias
>
> On 10/28/2015 02:09 PM, Aljoscha Krettek wrote:
>> I think the quickstarts should be very easy to discover, so we should keep 
>> them on the main page. If you just browse to flink.apache.org you would not 
>> be aware that they exist.
>>> On 28 Oct 2015, at 13:37, Matthias J. Sax  wrote:
>>>
>>> What about "Quickstart" menu point... I really would like to move it
>>> under "Documentation" (to get rid of linking to the doc page on two
>>> places...)
>>>
>>> Furthermore, I would suggest to rename "Overview" to "Home" on main page
>>> and keep "Overview" on doc page.
>>>
>>> Logo/Text: I am fine with a text-link (but would remove the logo link --
>>> there is no advantage in having it)
>>>
>>> If there are no strict objections, I will just update my branch
>>> accordingly, such that everybody can try it out. After that, we can try
>>> to come to a conclusion.
>>>
>>> I guess, together with the darker color scheme on the doc page menu, the
>>> distinction between both pages should become clear.
>>>
>>> -Matthias
>>>
>>> On 10/28/2015 11:19 AM, Fabian Hueske wrote:
>>>> I agree with Max.
>>>> Renaming Overview in Documentation and adding a clear link back to the
>>>> project website are the most important issues, IMO.
>>>>
>>>>
>>>> 2015-10-28 10:59 GMT+01:00 Maximilian Michels :
>>>>
>>>>> We should be careful not to break links to the docs again. I'm in
>>>>> favor of making it more clear what is the Flink web site and what its
>>>>> documentation is. For me, it would be enough to change "Overview 1.0"
>>>>> to "Documentation 1.0" and have a clear link which says "Back to Flink
>>>>> website". That should do it.
>>>>>
>>>>> In the light of the release, all other changes are not that important
>>>>> to me right now but I wouldn't deny the structure of the documentation
>>>>> can be improved.
>>>>>
>>>>> On Wed, Oct 28, 2015 at 10:05 AM, Fabian Hueske  wrote:
>>>>>>
>>>>>> I agree, two Overview links pointing to different locations should be
>>>>>> changed.
>>>>>> I am not so sure about the Logo issue. IMO, there should be always a text
>>>>>> link. The logo link should only be an addition.
>>>>>>
>>>>>> Maybe we should wait for more opinions, before we continue.
>>>>>> The website has been changed a couple of times, so it would be good to
>>>>> get
>>>>>> input from those who built the current website, IMO.
>>>>>>
>>>>>>
>>>>>> 2015-10-28 9:56 GMT+01:00 Matthias J. Sax :
>>>>>>
>>>>>>> Yes, but I think that the "Overview" link to index.html is
>>>>>>> confusing/wrong. The doc web page has "Overview" too, and it points to
>>>>>>> https://ci.apache.org/projects/flink/flink-docs-master/index.html
>>>>>>>
>>>>>>> There should be only one page with name "Overview" (either on the web
>>>>>>> page or on the doc page). I actually thought, that the "Overview" link
>>>>>>> from the main page should point to the documentation overview and is
>>>>>>> currently just wrong?
>>>>>>>
>>>>>>> Last but not least, using the Logo and an additional "Overview" menu
>>>>>>> point both pointing to the same location is redundant. I would just go
>>>>>>> with the logo as a link (or if an explicit menu point is used, disable
>>>>>>> the logo as a link -- to me, it is always confusing if two links next
>>>>>>> to each other do the same thing).
>>>>>>>
>>>>>>> -Matthias

Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc2)

2015-10-28 Thread Maximilian Michels
@Vasia: This is a CSS problem which manifests because of a long class
name. The colored boxes show the status of tasks from your job which
you are viewing. Are the numbers not correct?

@Sachin: Could you fix the wrapping of the column?

On Wed, Oct 28, 2015 at 2:44 PM, Sachin Goel  wrote:
> The name of the vertex is very long and isn't getting wrapped around to
> accommodate all the columns. There's a TODO at the relevant place in
> app/scripts/common/filters.coffee which was probably meant to handle this.
>
>
> -- Sachin Goel
> Computer Science, IIT Delhi
> m. +91-9871457685
>
> On Wed, Oct 28, 2015 at 7:04 PM, Vasiliki Kalavri wrote:
>
>> It's Firefox 41.0.2. Resizing doesn't work :/
>> See this screenshot for the colored boxes I'm referring to:
>>
>> https://drive.google.com/file/d/0BzQJrI2eGlyYX1VYQTlrQWNfUkE/view?usp=sharing
>> .
>> Shouldn't these numbers show tasks?
>>
>> On 28 October 2015 at 14:26, Maximilian Michels  wrote:
>>
>> > @Vasia:
>> >
>> > - There are two types of shapes which are colored :) The circles mark
>> > the running/finished/cancelled/failed jobs while the squares mark the
>> > status of a task within a job
>> > (cancelled/running/failed/restart/pending/finished/total).
>> >
>> > - I can see all four columns in the "Plan" tab on Firefox. Which
>> > version are you using? Does resizing the window make any difference?
>> >
>> > @Sachin: Thanks for your pull requests. Will pull them in for the next
>> RC.
>> >
>> > On Wed, Oct 28, 2015 at 2:03 PM, Vasiliki Kalavri
>> >  wrote:
>> > > I think I found 2 more issues with the web interface.
>> > >
>> > > When inside a running job's view:
>> > > - I think the colorful boxes with the number of tasks in each status
>> show
>> > > wrong values (or show something else?). I get different values than the
>> > > ones I see in "Overview" and "Running Jobs" tabs.
>> > > - In the "Plan" tab, it seems that some information is hidden and I
>> > cannot
>> > > scroll right to see it. I can only see 3 columns for each operator:
>> bytes
>> > > read, records read and bytes written. I'm using Firefox.
>> > >
>> > > -Vasia.
>> > >
>> > > On 28 October 2015 at 13:13, Sachin Goel 
>> > wrote:
>> > >
>> > >> While we're at it, we should also remove the dummy log and stdout tabs
>> > for
>> > >> task managers. The work on that hasn't been finished yet.
>> > >> I'll file a jira for both.
>> > >> On Oct 28, 2015 5:39 PM, "Vasiliki Kalavri" <
>> vasilikikala...@gmail.com>
>> > >> wrote:
>> > >>
>> > >> > I see, thank you! +1 for removing before the release :)
>> > >> >
>> > >> > On 28 October 2015 at 13:06, Sachin Goel 
>> > >> wrote:
>> > >> >
>> > >> > > Those are hard coded values.
>> > >> > > What exactly should be there, I'm not sure either.
>> > >> > > On Oct 28, 2015 5:25 PM, "Vasiliki Kalavri" <
>> > vasilikikala...@gmail.com
>> > >> >
>> > >> > > wrote:
>> > >> > >
>> > >> > > > I have a question regarding the web interface :)
>> > >> > > > What is the "Job Accumulator/Statistics" tab supposed to show?
>> No
>> > >> > matter
>> > >> > > > what job I run, the values are the same (operator=1,
>> > parallelism=2,
>> > >> > > > subtasks=3). Are these hard-coded defaults?
>> > >> > > >
>> > >> > > > Thanks!
>> > >> > > > -Vasia.
>> > >> > > >
>> > >> > > > > On 28 October 2015 at 10:50, Maximilian Michels
>> > >> > wrote:
>> > >> > > >
>> > >> > > > > Thanks for testing, Vasia :)
>> > >> > > > >
>> > >> > > > > Here is the new document:
>> > >> > > > >
>> > >> > > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> >
>> https://docs.google.com/document/d/1CR3DH4tUJvuk

Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc2)

2015-10-28 Thread Maximilian Michels
Yes, that's correct. One is running operators (top of the job view)
while the other lists all the parallel tasks (overview page, and
detail view in job view). I think it makes sense where they are
displayed at the moment. It's just confusing how they are displayed.
Could we add a label at the top of the job view to denote that these
are operator-level numbers?

On Wed, Oct 28, 2015 at 3:24 PM, Sachin Goel  wrote:
> I think the squares on top of the job page are showing the status of
> vertices, not tasks. The squares on overview pages however show the number
> of tasks. Should we make it vertices or tasks everywhere, for consistency?
>
> -- Sachin Goel
> Computer Science, IIT Delhi
> m. +91-9871457685
>
> On Wed, Oct 28, 2015 at 7:39 PM, Sachin Goel 
> wrote:
>
>> @Max, I will try to get the wrap working [rather ellipsifying the text in
>> this case.]. Not very good with CSS unfortunately.
>>
>> @Vasia, there seems to be different things which are being used to render
>> those two. For the running jobs page, job.tasks is rendered, while for the
>> job page, job.status-counts is being used. Looking into it now.
>>
>> -- Sachin Goel
>> Computer Science, IIT Delhi
>> m. +91-9871457685
>>
>> On Wed, Oct 28, 2015 at 7:26 PM, Vasiliki Kalavri <
>> vasilikikala...@gmail.com> wrote:
>>
>>> The numbers I see in the overview are different.
>>>
>>> See
>>>
>>> https://drive.google.com/file/d/0BzQJrI2eGlyYMHZZUGs2ZFJzaXc/view?usp=sharing
>>> vs.
>>>
>>> https://drive.google.com/file/d/0BzQJrI2eGlyYc3kzMlQ4OXN6a3c/view?usp=sharing
>>>
>>> -Vasia.
>>>
>>> On 28 October 2015 at 14:51, Maximilian Michels  wrote:
>>>
>>> > @Vasia: This is a CSS problem which manifests because of a long class
>>> > name. The colored boxes show the status of tasks from your job which
>>> > you are viewing. Are the numbers not correct?
>>> >
>>> > @Sachin: Could you fix the wrapping of the column?
>>> >
>>> > On Wed, Oct 28, 2015 at 2:44 PM, Sachin Goel 
>>> > wrote:
>>> > > The name of the vertex is very long and isn't getting wrapped around
>>> to
>>> > > accommodate all the columns. There's a TODO at the relevant place in
>>> > > app/scripts/common/filters.coffee which was probably meant to handle
>>> > this.
>>> > >
>>> > >
>>> > > -- Sachin Goel
>>> > > Computer Science, IIT Delhi
>>> > > m. +91-9871457685
>>> > >
>>> > > On Wed, Oct 28, 2015 at 7:04 PM, Vasiliki Kalavri <
>>> > vasilikikala...@gmail.com
>>> > >> wrote:
>>> > >
>>> > >> It's Firefox 41.0.2. Resizing doesn't work :/
>>> > >> See this screenshot for the colored boxes I'm referring to:
>>> > >>
>>> > >>
>>> >
>>> https://drive.google.com/file/d/0BzQJrI2eGlyYX1VYQTlrQWNfUkE/view?usp=sharing
>>> > >> .
>>> > >> Shouldn't these numbers show tasks?
>>> > >>
>>> > >> On 28 October 2015 at 14:26, Maximilian Michels 
>>> wrote:
>>> > >>
>>> > >> > @Vasia:
>>> > >> >
>>> > >> > - There are two types of shapes which are colored :) The circles
>>> mark
>>> > >> > the running/finished/cancelled/failed jobs while the squares mark
>>> the
>>> > >> > status of a task within a job
>>> > >> > (cancelled/running/failed/restart/pending/finished/total).
>>> > >> >
>>> > >> > - I can see all four columns in the "Plan" tab on Firefox. Which
>>> > >> > version are you using? Does resizing the window make any
>>> difference?
>>> > >> >
>>> > >> > @Sachin: Thanks for your pull requests. Will pull them in for the
>>> next
>>> > >> RC.
>>> > >> >
>>> > >> > On Wed, Oct 28, 2015 at 2:03 PM, Vasiliki Kalavri
>>> > >> >  wrote:
>>> > >> > > I think I found 2 more issues with the web interface.
>>> > >> > >
>>> > >> > > When inside a running job's view:
>>> > >> > > - I think the colorful boxes with the number of tasks in each
>>> status
>>> > >> show

Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc2)

2015-10-28 Thread Maximilian Michels
@Sachin: I've tried it out. It has the tendency to make things a bit
harder to read (because it breaks words at arbitrary positions).
However, we don't have a better fix.
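
For context, the kind of CSS change involved is probably along these lines.
This is only a guess at what Sachin's branch does, and the .vertex-name
selector is made up:

/* hypothetical sketch: let long vertex names wrap inside their table cell */
.vertex-name {
  word-wrap: break-word;  /* wrap long unbroken strings */
  word-break: break-all;  /* allow breaks at arbitrary characters */
}

word-break: break-all is what causes the arbitrary mid-word breaks mentioned
above.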

On Wed, Oct 28, 2015 at 4:46 PM, Sachin Goel  wrote:
> Sorry. Wrong commit. In case you've pulled already.
>
> -- Sachin Goel
> Computer Science, IIT Delhi
> m. +91-9871457685
>
> On Wed, Oct 28, 2015 at 9:09 PM, Sachin Goel 
> wrote:
>
>> @Max: Here's a fix for the wrapping issue:
>> https://github.com/sachingoel0101/flink/tree/long-vertex-name. It's just
>> two lines, so I don't think opening a PR makes sense. Lemme know if I
>> should.
>>
>> @Vasia: Can you test it out on your job? I've checked on firefox, chrome
>> and IE and it seems to work. [You might wanna rebuild flink-runtime-web
>> followed by flink-dist first. :)]
>>
>>
>>
>> -- Sachin Goel
>> Computer Science, IIT Delhi
>> m. +91-9871457685
>>
>> On Wed, Oct 28, 2015 at 8:22 PM, Vasiliki Kalavri <
>> vasilikikala...@gmail.com> wrote:
>>
>>> Ah I see. Thanks Sachin, Max. I think a label would be nice there, yes.
>>>
>>> On 28 October 2015 at 15:45, Maximilian Michels  wrote:
>>>
>>> > Yes, that's correct. One is running operators (top of the job view)
>>> > while the other lists all the parallel tasks (overview page, and
>>> > detail view in job view). I think it makes sense where they are
>>> > displayed at the moment. It's just confusing how they are displayed.
>>> > Could we add a label at the top of the job view to denote that these
>>> > are operator-level numbers?
>>> >
>>> > On Wed, Oct 28, 2015 at 3:24 PM, Sachin Goel 
>>> > wrote:
>>> > > I think the squares on top of the job page are showing the status of
>>> > > vertices, not tasks. The squares on overview pages however show the
>>> > number
>>> > > of tasks. Should we make it vertices or tasks everywhere, for
>>> > consistency?
>>> > >
>>> > > -- Sachin Goel
>>> > > Computer Science, IIT Delhi
>>> > > m. +91-9871457685
>>> > >
>>> > > On Wed, Oct 28, 2015 at 7:39 PM, Sachin Goel <
>>> sachingoel0...@gmail.com>
>>> > > wrote:
>>> > >
>>> > >> @Max, I will try to get the wrap working [rather ellipsifying the
>>> text
>>> > in
>>> > >> this case.]. Not very good with CSS unfortunately.
>>> > >>
>>> > >> @Vasia, there seems to be different things which are being used to
>>> > render
>>> > >> those two. For the running jobs page, job.tasks is rendered, while
>>> for
>>> > the
>>> > >> job page, job.status-counts is being used. Looking into it now.
>>> > >>
>>> > >> -- Sachin Goel
>>> > >> Computer Science, IIT Delhi
>>> > >> m. +91-9871457685
>>> > >>
>>> > >> On Wed, Oct 28, 2015 at 7:26 PM, Vasiliki Kalavri <
>>> > >> vasilikikala...@gmail.com> wrote:
>>> > >>
>>> > >>> The numbers I see in the overview are different.
>>> > >>>
>>> > >>> See
>>> > >>>
>>> > >>>
>>> >
>>> https://drive.google.com/file/d/0BzQJrI2eGlyYMHZZUGs2ZFJzaXc/view?usp=sharing
>>> > >>> vs.
>>> > >>>
>>> > >>>
>>> >
>>> https://drive.google.com/file/d/0BzQJrI2eGlyYc3kzMlQ4OXN6a3c/view?usp=sharing
>>> > >>>
>>> > >>> -Vasia.
>>> > >>>
>>> > >>> On 28 October 2015 at 14:51, Maximilian Michels 
>>> > wrote:
>>> > >>>
>>> > >>> > @Vasia: This is a CSS problem which manifests because of a long
>>> class
>>> > >>> > name. The colored boxes show the status of tasks from your job
>>> which
>>> > >>> > you are viewing. Are the numbers not correct?
>>> > >>> >
>>> > >>> > @Sachin: Could you fix the wrapping of the column?
>>> > >>> >
>>> > >>> > On Wed, Oct 28, 2015 at 2:44 PM, Sachin Goel <
>>> > sachingoel0...@gmail.com>
>>> > >>> > wrote:
>>> > >>>

Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc2)

2015-10-28 Thread Maximilian Michels
Not sure. I think I'd rather leave it as it is because it renders the
normal view (when your screen is wide enough) unreadable. I'd rather
wait for a proper fix.

Now
https://drive.google.com/file/d/0BziY9U_qva1sYzdxR3RJakltM0E/view?usp=sharing
Afterwards
https://drive.google.com/file/d/0BziY9U_qva1sSmg1ZVJ6NmlVSGs/view?usp=sharing

On Wed, Oct 28, 2015 at 5:28 PM, Sachin Goel  wrote:
> Yes. I just made it work in firefox. It was already working in Chrome this
> way.
> Maybe Piotr will have a better fix later on.
>
> -- Sachin Goel
> Computer Science, IIT Delhi
> m. +91-9871457685
>
> On Wed, Oct 28, 2015 at 9:42 PM, Maximilian Michels  wrote:
>
>> @Sachin: I've tried it out. It has the tendency to make things a bit
>> harder to read (because it breaks words at arbitrary positions).
>> However, we don't have a better fix.
>>
>> On Wed, Oct 28, 2015 at 4:46 PM, Sachin Goel 
>> wrote:
>> > Sorry. Wrong commit. In case you've pulled already.
>> >
>> > -- Sachin Goel
>> > Computer Science, IIT Delhi
>> > m. +91-9871457685
>> >
>> > On Wed, Oct 28, 2015 at 9:09 PM, Sachin Goel 
>> > wrote:
>> >
>> >> @Max: Here's a fix for the wrapping issue:
>> >> https://github.com/sachingoel0101/flink/tree/long-vertex-name. It's
>> just
>> >> two lines, so I don't think opening a PR makes sense. Lemme know if I
>> >> should.
>> >>
>> >> @Vasia: Can you test it out on your job? I've checked on firefox, chrome
>> >> and IE and it seems to work. [You might wanna rebuild flink-runtime-web
>> >> followed by flink-dist first. :)]
>> >>
>> >>
>> >>
>> >> -- Sachin Goel
>> >> Computer Science, IIT Delhi
>> >> m. +91-9871457685
>> >>
>> >> On Wed, Oct 28, 2015 at 8:22 PM, Vasiliki Kalavri <
>> >> vasilikikala...@gmail.com> wrote:
>> >>
>> >>> Ah I see. Thanks Sachin, Max. I think a label would be nice there, yes.
>> >>>
>> >>> On 28 October 2015 at 15:45, Maximilian Michels 
>> wrote:
>> >>>
>> >>> > Yes, that's correct. One is running operators (top of the job view)
>> >>> > while the other lists all the parallel tasks (overview page, and
>> >>> > detail view in job view). I think it makes sense where they are
>> >>> > displayed at the moment. It's just confusing how they are displayed.
>> >>> > Could we add a label at the top of the job view to denote that these
>> >>> > are operator-level numbers?
>> >>> >
>> >>> > On Wed, Oct 28, 2015 at 3:24 PM, Sachin Goel <
>> sachingoel0...@gmail.com>
>> >>> > wrote:
>> >>> > > I think the squares on top of the job page are showing the status
>> of
>> >>> > > vertices, not tasks. The squares on overview pages however show the
>> >>> > number
>> >>> > > of tasks. Should we make it vertices or tasks everywhere, for
>> >>> > consistency?
>> >>> > >
>> >>> > > -- Sachin Goel
>> >>> > > Computer Science, IIT Delhi
>> >>> > > m. +91-9871457685
>> >>> > >
>> >>> > > On Wed, Oct 28, 2015 at 7:39 PM, Sachin Goel <
>> >>> sachingoel0...@gmail.com>
>> >>> > > wrote:
>> >>> > >
>> >>> > >> @Max, I will try to get the wrap working [rather ellipsifying the
>> >>> text
>> >>> > in
>> >>> > >> this case.]. Not very good with CSS unfortunately.
>> >>> > >>
>> >>> > >> @Vasia, there seems to be different things which are being used to
>> >>> > render
>> >>> > >> those two. For the running jobs page, job.tasks is rendered, while
>> >>> for
>> >>> > the
>> >>> > >> job page, job.status-counts is being used. Looking into it now.
>> >>> > >>
>> >>> > >> -- Sachin Goel
>> >>> > >> Computer Science, IIT Delhi
>> >>> > >> m. +91-9871457685
>> >>> > >>
>> >>> > >> On Wed, Oct 28, 2015 at 7:26 PM, Vasiliki Kalavri <
>> >>> > >> vasilikikala...@gmail.com> wrote:
>> >>

Re: Broken link for master Javadocs

2015-10-28 Thread Maximilian Michels
The issue with our Java Docs has been resolved. The link works again.

On Tue, Oct 27, 2015 at 3:57 PM, Henry Saputra  wrote:
> Ah thanks Max, sending to commits@ is good
>
> - Henry
>
> On Tue, Oct 27, 2015 at 2:35 AM, Maximilian Michels  wrote:
>> Hi Henry,
>>
>> Yes, there is. The Commits@ list actually gets notifications on failures
>> and recoveries. I figured sending them to dev@ would bother too many people
>> because sometimes the infrastructure is flaky and it fails for no
>> particular reason.
>>
>> Cheers,
>> Max
>>
>> On Tue, Oct 27, 2015 at 4:18 AM, Henry Saputra 
>> wrote:
>>
>>> Hi Max,
>>>
>>> Is there a way that dev@ list gets email notification if the build fail
>>> for
>>> the build bot?
>>>
>>> - Henry
>>>
>>> On Monday, October 26, 2015, Maximilian Michels  wrote:
>>>
>>> > Thanks for reporting, Suneel. On my machine the Java docs build.
>>> >
>>> > Here's the build log:
>>> >
>>> >
>>> https://ci.apache.org/builders/flink-docs-master/builds/122/steps/Java%20%26%20Scala%20docs/logs/stdio
>>> >
>>> >
>>> > [ERROR]
>>> >
>>> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:35:
>>> > error: not found: type ILoopCompat
>>> > [ERROR]   extends ILoopCompat(in0, out0) {
>>> > [ERROR]   ^
>>> > [ERROR]
>>> >
>>> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:29:
>>> > error: too many arguments for constructor Object: ()Object
>>> > [ERROR] class FlinkILoop(
>>> > [ERROR] ^
>>> > [ERROR]
>>> >
>>> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:118:
>>> > error: value createInterpreter is not a member of AnyRef
>>> > [ERROR] super.createInterpreter()
>>> > [ERROR]   ^
>>> > [ERROR]
>>> >
>>> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:120:
>>> > error: not found: value addThunk
>>> > [ERROR] addThunk {
>>> > [ERROR] ^
>>> > [ERROR]
>>> >
>>> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:138:
>>> > error: not found: value intp
>>> > [ERROR] val vd = intp.virtualDirectory
>>> > [ERROR]  ^
>>> > [ERROR]
>>> >
>>> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkILoop.scala:186:
>>> > error: not found: value echo
>>> > [ERROR] echo(
>>> > [ERROR] ^
>>> > [ERROR]
>>> >
>>> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkShell.scala:151:
>>> > error: value process is not a member of
>>> > org.apache.flink.api.scala.FlinkILoop
>>> > [ERROR]   repl.foreach(_.process(settings))
>>> > [ERROR]  ^
>>> > [ERROR]
>>> >
>>> /home/buildslave2/slave2/flink-docs-master/build/flink-staging/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkShell.scala:153:
>>> > error: value closeInterpreter is not a member of
>>> > org.apache.flink.api.scala.FlinkILoop
>>> > [ERROR]   repl.foreach(_.closeInterpreter())
>>> > [ERROR]  ^
>>> > [ERROR] 8 errors found
>>> >
>>> >
>>> > Not sure what the issue is. I'll try to look into later.
>>> >
>>> > Thanks,
>>> > Max
>>> >
>>> > > On Mon, Oct 26, 2015 at 7:12 AM, Henry Saputra
>>> > wrote:
>>> >
>>> > > Thanks for the heads up, Suneel.
>>> > >
>>> > > Seemed like master Java api (api/java/index.html) is not being built:
>>> > > https://ci.apache.org/projects/
>>> > >
>>> > > I have filed ticket with infra to help figure out why.
>>> > >
>>> > > - Henry
>>> > >
>>> > > > On Sat, Oct 24, 2015 at 5:45 PM, Suneel Marthi wrote:
>>> > > > https://ci.apache.org/projects/flink/flink-docs-master/api/java
>>> > > >
>>> > > > needs to be fixed.
>>> > >
>>> >
>>>


Re: Caching information from a stream

2015-10-28 Thread Maximilian Michels
Hi Andra,

What you thought of turns out to be one of the core features of the Flink
streaming API. Flink's operators support state. State can be partitioned by
the key using keyBy(field).

You may use a MapFunction to achieve what you wanted like so:

public static void main(String[] args) throws Exception {

   final StreamExecutionEnvironment env =
         StreamExecutionEnvironment.getExecutionEnvironment();

   env.fromElements(new Tuple2<>(1L, 3L),
         new Tuple2<>(2L, 5L),
         new Tuple2<>(6L, 7L),
         new Tuple2<>(1L, 5L))
      .keyBy(0)
      .map(new StatefulMapper())
      .print();

   env.execute();
}

The output is the following on my machine (discarded the output of the
print):

Key: 2 Previous state was: -1 Update state to: 5
Key: 1 Previous state was: -1 Update state to: 3
Key: 6 Previous state was: -1 Update state to: 7
Key: 1 Previous state was: 3 Update state to: 5


Cheers,
Max



On Wed, Oct 28, 2015 at 4:30 PM, Andra Lungu  wrote:

> Hey guys!
>
> I've been thinking about this one today:
>
> Say you have a stream of data in the form of (id, value) - This will
> evidently be a DataStream of Tuple2.
> I need to cache this data in some sort of static stream (perhaps even a
> DataSet).
> Then, if in the input stream, I see an id that was previously stored, I
> should update its value with the most recent entry.
>
> On an example:
>
> 1, 3
> 2, 5
> 6, 7
> 1, 5
>
> The value cached for the id 1 should be 5.
>
> How would you recommend caching the data? And what would be used for the
> update? A join function?
>
> As far as I see things, you cannot really combine DataSets with DataStreams
> although a DataSet is, in essence, just a finite stream.
> If this can indeed be done, some pseudocode would be nice :)
>
> Thanks!
> Andra
>


Re: Caching information from a stream

2015-10-28 Thread Maximilian Michels
Oups, forgot the mapper :)

static class StatefulMapper extends RichMapFunction<Tuple2<Long, Long>, Tuple2<Long, Long>> {

   // Per-key state; keyBy(0) scopes it automatically to the current key.
   private OperatorState<Long> counter;

   @Override
   public Tuple2<Long, Long> map(Tuple2<Long, Long> value) throws Exception {
      System.out.println("Key: " + value.f0 +
            " Previous state was: " + counter.value() +
            " Update state to: " + value.f1);
      counter.update(value.f1);
      return value;
   }

   @Override
   public void open(Configuration config) {
      // "mystate" is the state's name; -1L is the default returned
      // before the first update for a key.
      counter = getRuntimeContext().getKeyValueState("mystate",
            Long.class, -1L);
   }
}



On Wed, Oct 28, 2015 at 7:39 PM, Maximilian Michels  wrote:

> Hi Andra,
>
> What you thought of turns out to be one of the core features of the Flink
> streaming API. Flink's operators support state. State can be partitioned by
> the key using keyBy(field).
>
> You may use a MapFunction to achieve what you wanted like so:
>
> public static void main(String[] args) throws Exception {
>
>final StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
>
>env.fromElements(new Tuple2<>(1L, 3L),
>  new Tuple2<>(2L, 5L),
>  new Tuple2<>(6L, 7L),
>  new Tuple2<>(1L, 5L))
>
>.keyBy(0)
>
>.map(new StatefulMapper())
>
>.print();
>
>env.execute();
>
> }
>
> The output is the following on my machine (discarded the output of the
> print):
>
> Key: 2 Previous state was: -1 Update state to: 5
> Key: 1 Previous state was: -1 Update state to: 3
> Key: 6 Previous state was: -1 Update state to: 7
> Key: 1 Previous state was: 3 Update state to: 5
>
>
> Cheers,
> Max
>
>
>
> On Wed, Oct 28, 2015 at 4:30 PM, Andra Lungu 
> wrote:
>
>> Hey guys!
>>
>> I've been thinking about this one today:
>>
>> Say you have a stream of data in the form of (id, value) - This will
>> evidently be a DataStream of Tuple2.
>> I need to cache this data in some sort of static stream (perhaps even a
>> DataSet).
>> Then, if in the input stream, I see an id that was previously stored, I
>> should update its value with the most recent entry.
>>
>> On an example:
>>
>> 1, 3
>> 2, 5
>> 6, 7
>> 1, 5
>>
>> The value cached for the id 1 should be 5.
>>
>> How would you recommend caching the data? And what would be used for the
>> update? A join function?
>>
>> As far as I see things, you cannot really combine DataSets with
>> DataStreams
>> although a DataSet is, in essence, just a finite stream.
>> If this can indeed be done, some pseudocode would be nice :)
>>
>> Thanks!
>> Andra
>>
>
>


Re: Scala 2.10/2.11 Maven dependencies

2015-10-29 Thread Maximilian Michels
Seems like we agree that we need artifacts for different versions of Scala
on Maven. There also seems to be a preference for including the version in
the artifact name.

I've created an issue and marked it to be resolved for 1.0. For the 0.10
release, we will have binaries but no Maven artifacts. The biggest
challenge I see is to remove Scala from as many modules as possible. For
example, flink-java depends on Scala at the moment..

https://issues.apache.org/jira/browse/FLINK-2940
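
For illustration, consuming a Scala-suffixed artifact would then look like
this in a POM (artifact id and version are hypothetical until FLINK-2940 is
implemented):

<dependency>
  <groupId>org.apache.flink</groupId>
  <!-- the _2.11 suffix marks the Scala version the jar was built against -->
  <artifactId>flink-streaming-scala_2.11</artifactId>
  <version>1.0-SNAPSHOT</version>
</dependency>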

On Wed, Oct 28, 2015 at 7:31 PM, Frederick F. Kautz IV 
wrote:

> No idea if I get a vote ;) Nevertheless, +1 to have binaries for both
> versions in Maven and explicitly "scala versioned".
>
> Some background on this for those not as familiar with scala versioning:
>
> It's considered best practice to label what version of scala a library
> uses in the artifact id.
>
> The reason is compiled scala code is only compatible with the major
> version of scala it was compiled for. For example, a library compatible
> with 2.10 is not compatible with 2.11. The same will be true with 2.12 once
> it is released. Mixing versions will result in undefined behavior which
> will likely manifest itself as runtime exceptions.
>
> The convention to fix this problem is for all published libraries to
> specify the version of scala they are compatible with. Leaving out the
> scala version in a library is akin to saying "We don't depend on scala for
> this library, so feel free to use whatever you want." Sbt users will
> typically specify the version of scala they use and tooling is built around
> ensuring consistency with the "%%" operator.
>
> E.g.
>
> scalaVersion := "2.11.4"
>
> // this resolves to artifactId: "scalacheck_2.11"
> libraryDependencies += "org.scalacheck" %% "scalacheck" % "1.12.0" % "test"
>
> The most important part of this is that the scala version is explicit
> which eliminates the problem for downstream users.
>
> Cheers,
> Frederick
>
>
> On 10/28/2015 10:55 AM, Fabian Hueske wrote:
>
>> +1 to have binaries for both versions in Maven and as build to download.
>>
>> 2015-10-26 17:11 GMT+01:00 Theodore Vasiloudis <
>> theodoros.vasilou...@gmail.com>:
>>
>> +1 for having binaries, I'm working on a Spark application currently with
>>> Scala 2.11 and having to rebuild everything when deploying e.g. to EC2
>>> is a
>>> pain.
>>>
>>> On Mon, Oct 26, 2015 at 4:22 PM, Ufuk Celebi  wrote:
>>>
>>> I agree with Till, but is this something you want to address in this
>>>> release already?
>>>>
>>>> I would postpone it to 1.0.0.
>>>>
>>>> – Ufuk
>>>>
>>>> On 26 Oct 2015, at 16:17, Till Rohrmann  wrote:
>>>>>
>>>>> I would be in favor of deploying also Scala 2.11 artifacts to Maven
>>>>>
>>>> since
>>>
>>>> more and more people will try out Flink with Scala 2.11. Having the
>>>>> dependencies in the Maven repository makes it considerably easier for
>>>>> people to get their Flink jobs running.
>>>>>
>>>>> Furthermore, I observed that people are not aware that our deployed
>>>>> artifacts, e.g. flink-runtime, are built with Scala 2.10. As a
>>>>>
>>>> consequence,
>>>>
>>>>> they mix flink dependencies with other dependencies pulling in Scala
>>>>>
>>>> 2.11
>>>
>>>> and then they wonder that the program crashes. It would be, imho,
>>>>>
>>>> clearer
>>>
>>>> if all our dependencies which depend on a specific Scala version would
>>>>>
>>>> have
>>>>
>>>>> the corresponding Scala suffix appended.
>>>>>
>>>>> Adding the 2.10 suffix would also spare us the hassle of upgrading to a
>>>>> newer Scala version in the future, because then the artifacts wouldn't
>>>>> share the same artifact name.
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>> On Mon, Oct 26, 2015 at 4:04 PM, Maximilian Michels 
>>>>>
>>>> wrote:
>>>>
>>>>> Hi Flinksters,
>>>>>>
>>>>>> We have recently committed an easy way to change Flink's Scala
>>>>>>
>>>>> version.
>>>
>>>> The
>>>>
>>>>> question arises now wh

Re: Diagnosing TaskManager disappearance

2015-10-29 Thread Maximilian Michels
Hi Greg,

Thanks for reporting. You wrote you didn't see any output in the .out files
of the task managers. What about the .log files of these instances?

Where and when did you produce the thread dump you included?

Thanks,
Max
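
As a side note: if a TaskManager is still alive but misbehaving, a thread
dump can be taken from outside with the standard JDK tools (the PID below is
made up):

# find the TaskManager JVM on the machine
jps -l | grep TaskManager

# dump all thread stacks of that JVM
jstack 12345 > taskmanager-threads.txt

If the process has already died, the .log files and any JVM crash logs
(hs_err_pid*.log in the working directory) are the next places to look.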

On Thu, Oct 29, 2015 at 1:46 PM, Greg Hogan  wrote:

> I am testing again on a 64 node cluster (the JobManager is running fine
> having reduced some operator's parallelism and fixed the string conversion
> performance).
>
> I am seeing TaskManagers drop like flies every other job or so. I am not
> seeing any output in the .out log files corresponding to the crashed
> TaskManagers.
>
> Below is the stack trace from a java.hprof heap dump.
>
> How should I be debugging this?
>
> Thanks,
> Greg
>
>
> Threads at the heap dump:
>
> Unknown thread
>
>
> "Memory Logger" daemon prio=1 tid=119 TIMED_WAITING
> at java.lang.Thread.<init>(Thread.java:507)
> at
>
> org.apache.flink.runtime.taskmanager.MemoryLogger.<init>(MemoryLogger.java:67)
> at
>
> org.apache.flink.runtime.taskmanager.TaskManager$.runTaskManager(TaskManager.scala:1494)
> at
>
> org.apache.flink.runtime.taskmanager.TaskManager$.selectNetworkInterfaceAndRunTaskManager(TaskManager.scala:1330)
>
>
> "Flink Netty Server (59693) Thread 0" daemon prio=5 tid=193 RUNNABLE
> at java.lang.Thread.<init>(Thread.java:674)
> at
>
> java.util.concurrent.Executors$DefaultThreadFactory.newThread(Executors.java:613)
> at
>
> org.apache.flink.shaded.com.google.common.util.concurrent.ThreadFactoryBuilder$1.newThread(ThreadFactoryBuilder.java:162)
> at
>
> io.netty.util.concurrent.SingleThreadEventExecutor.<init>(SingleThreadEventExecutor.java:106)
>
>
> "flink-akka.remote.default-remote-dispatcher-6" daemon prio=5 tid=30
> TIMED_WAITING
> at java.lang.Thread.<init>(Thread.java:507)
> at
>
> scala.concurrent.forkjoin.ForkJoinWorkerThread.<init>(ForkJoinWorkerThread.java:48)
> at
>
> akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread.<init>(ThreadPoolBuilder.scala:164)
> at
>
> akka.dispatch.MonitorableThreadFactory.newThread(ThreadPoolBuilder.scala:187)
>
>
> "flink-akka.actor.default-dispatcher-4" daemon prio=5 tid=28 WAITING
> at java.lang.Thread.<init>(Thread.java:507)
> at
>
> scala.concurrent.forkjoin.ForkJoinWorkerThread.<init>(ForkJoinWorkerThread.java:48)
> at
>
> akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread.<init>(ThreadPoolBuilder.scala:164)
> at
>
> akka.dispatch.MonitorableThreadFactory.newThread(ThreadPoolBuilder.scala:187)
>
>
> "flink-akka.remote.default-remote-dispatcher-5" daemon prio=5 tid=29
> WAITING
> at java.lang.Thread.<init>(Thread.java:507)
> at
>
> scala.concurrent.forkjoin.ForkJoinWorkerThread.<init>(ForkJoinWorkerThread.java:48)
> at
>
> akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread.<init>(ThreadPoolBuilder.scala:164)
> at
>
> akka.dispatch.MonitorableThreadFactory.newThread(ThreadPoolBuilder.scala:187)
>
>
> "flink-akka.actor.default-dispatcher-2" daemon prio=5 tid=26 WAITING
> at java.lang.Thread.<init>(Thread.java:507)
> at
>
> scala.concurrent.forkjoin.ForkJoinWorkerThread.<init>(ForkJoinWorkerThread.java:48)
> at
>
> akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread.<init>(ThreadPoolBuilder.scala:164)
> at
>
> akka.dispatch.MonitorableThreadFactory.newThread(ThreadPoolBuilder.scala:187)
>
>
> "SIGTERM handler" daemon prio=9 tid=268 RUNNABLE
> at java.lang.Thread.<init>(Thread.java:547)
> at sun.misc.Signal.dispatch(Signal.java:216)
>
>
> "HPROF gc_finish watcher" daemon prio=10 tid=5 RUNNABLE
>
>
> "Reference Handler" daemon prio=10 tid=2 WAITING
>
>
> "main" prio=5 tid=1 WAITING
>
>
> "Signal Dispatcher" daemon prio=9 tid=4 RUNNABLE
>
>
> "Finalizer" daemon prio=8 tid=3 WAITING
>
>
> "flink-akka.actor.default-dispatcher-3" daemon prio=5 tid=27 TIMED_WAITING
> at java.lang.Thread.<init>(Thread.java:507)
> at
>
> scala.concurrent.forkjoin.ForkJoinWorkerThread.<init>(ForkJoinWorkerThread.java:48)
> at
>
> akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread.<init>(ThreadPoolBuilder.scala:164)
> at
>
> akka.dispatch.MonitorableThreadFactory.newThread(ThreadPoolBuilder.scala:187)
>
>
> "New I/O worker #1" daemon prio=5 tid=31 RUNNABLE
> at java.lang.Thread.<init>(Thread.java:547)
> at
>
> akka.dispatch.MonitorableThreadFactory.newThread(ThreadPoolBuilder.scala:193)
> at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.<init>(ThreadPoolExecutor.java:612)
> at
>
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:925)
>
>
> "flink-scheduler-1" daemon prio=5 tid=25 TIMED_WAITING
> at java.lang.Thread.<init>(Thread.java:547)
> at
>
> akka.dispatch.MonitorableThreadFactory.newThread(ThreadPoolBuilder.scala:193)
> at akka.actor.LightArrayRevolverScheduler.<init>(Scheduler.scala:337)
> at
>
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(NativeConstructorAccessorImpl.java)
>
>
> "New I/O worker #2" daemon prio=5 tid=32 RUNNABLE
> at java.lang.Thread.<init>(Thread.java:547)
> at
>
> akka.dispatch.MonitorableThreadFactory.newThread(ThreadPoolBuilder.scala:193)
> at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.<init>(ThreadPoolExecutor.java:612)
> at
>
> java.util.c

Re: New JobManager web frontend

2015-10-29 Thread Maximilian Michels
Hi Matthias,

There is currently no cancel button in the web frontend. Just filed this
ticket today: https://issues.apache.org/jira/browse/FLINK-2939

Cheers,
Max
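
In the meantime, jobs can be cancelled from the command line. If memory
serves, the CLI has had a cancel action for a while (the job ID below is a
placeholder):

bin/flink cancel <jobID>

The job ID is printed on submission and also shown in the web frontend's
job list.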

On Thu, Oct 29, 2015 at 4:49 PM, Matthias J. Sax  wrote:

> Hi,
>
> I was just playing with the new JobManager web frontend and missing a
> button to cancel a running job. It there no such button, or is it hidden
> somewhere?
>
> -Matthias
>
>


Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc2)

2015-10-30 Thread Maximilian Michels
Thanks for reporting and providing a fix, Till! We have also fixed two
issues with the new job manager web frontend and pushed them to the
release-0.10 branch. Please review these changes in the new release
candidate.

On Wed, Oct 28, 2015 at 6:55 PM, Till Rohrmann  wrote:

> -1 from my side. I just found a serious issue with the KryoSerializer
> (FLINK-2800) which in some cases produced duplicated elements or corrupted
> data. I opened a PR to fix the issue (
> https://github.com/apache/flink/pull/1308).
>
> Cheers,
> Till
> ​
>
> On Wed, Oct 28, 2015 at 5:38 PM, Maximilian Michels 
> wrote:
>
> > Not sure. I think I'd rather leave it as it is because it renders the
> > normal view (when your screen is wide enough) unreadable. I'd rather
> > wait for a proper fix.
> >
> > Now
> >
> >
> https://drive.google.com/file/d/0BziY9U_qva1sYzdxR3RJakltM0E/view?usp=sharing
> > Afterwards
> >
> >
> https://drive.google.com/file/d/0BziY9U_qva1sSmg1ZVJ6NmlVSGs/view?usp=sharing
> >
> > On Wed, Oct 28, 2015 at 5:28 PM, Sachin Goel 
> > wrote:
> > > Yes. I just made it work in firefox. It was already working in Chrome
> > this
> > > way.
> > > Maybe Piotr will have a better fix later on.
> > >
> > > -- Sachin Goel
> > > Computer Science, IIT Delhi
> > > m. +91-9871457685
> > >
> > > On Wed, Oct 28, 2015 at 9:42 PM, Maximilian Michels 
> > wrote:
> > >
> > >> @Sachin: I've tried it out. It has the tendency to make things a bit
> > >> harder to read (because it breaks words at arbitrary positions).
> > >> However, we don't have a better fix.
> > >>
> > >> On Wed, Oct 28, 2015 at 4:46 PM, Sachin Goel <
> sachingoel0...@gmail.com>
> > >> wrote:
> > >> > Sorry. Wrong commit. In case you've pulled already.
> > >> >
> > >> > -- Sachin Goel
> > >> > Computer Science, IIT Delhi
> > >> > m. +91-9871457685
> > >> >
> > >> > On Wed, Oct 28, 2015 at 9:09 PM, Sachin Goel <
> > sachingoel0...@gmail.com>
> > >> > wrote:
> > >> >
> > >> >> @Max: Here's a fix for the wrapping issue:
> > >> >> https://github.com/sachingoel0101/flink/tree/long-vertex-name.
> It's
> > >> just
> > >> >> two lines, so I don't think opening a PR makes sense. Lemme know
> if I
> > >> >> should.
> > >> >>
> > >> >> @Vasia: Can you test it out on your job? I've checked on firefox,
> > chrome
> > >> >> and IE and it seems to work. [You might wanna rebuild
> > flink-runtime-web
> > >> >> followed by flink-dist first. :)]
> > >> >>
> > >> >>
> > >> >>
> > >> >> -- Sachin Goel
> > >> >> Computer Science, IIT Delhi
> > >> >> m. +91-9871457685
> > >> >>
> > >> >> On Wed, Oct 28, 2015 at 8:22 PM, Vasiliki Kalavri <
> > >> >> vasilikikala...@gmail.com> wrote:
> > >> >>
> > >> >>> Ah I see. Thanks Sachin, Max. I think a label would be nice there,
> > yes.
> > >> >>>
> > >> >>> On 28 October 2015 at 15:45, Maximilian Michels 
> > >> wrote:
> > >> >>>
> > >> >>> > Yes, that's correct. One is running operators (top of the job
> > view)
> > >> >>> > while the other lists all the parallel tasks (overview page, and
> > >> >>> > detail view in job view). I think it makes sense where they are
> > >> >>> > displayed at the moment. It's just confusing how they are
> > displayed.
> > >> >>> > Could we add a label at the top of the job view to denote that
> > these
> > >> >>> > are operator-level numbers?
> > >> >>> >
> > >> >>> > On Wed, Oct 28, 2015 at 3:24 PM, Sachin Goel <
> > >> sachingoel0...@gmail.com>
> > >> >>> > wrote:
> > >> >>> > > I think the squares on top of the job page are showing the
> > status
> > >> of
> > >> >>> > > vertices, not tasks. The squares on overview pages however
> show
> > the
> > >> >>> > number
> > >> >>> > >

[VOTE] [RESULT] Release Apache Flink 0.10.0 (release-0.10.0-rc2)

2015-10-30 Thread Maximilian Michels
This vote is cancelled in favor of a new RC.

On Fri, Oct 30, 2015 at 9:03 AM, Maximilian Michels  wrote:

> Thanks for reporting and providing a fix, Till! We have also fixed two
> issues with the new job manager web frontend and pushed them to the
> release-0.10 branch. Please review these changes in the new release
> candidate.
>
> On Wed, Oct 28, 2015 at 6:55 PM, Till Rohrmann 
> wrote:
>
>> -1 from my side. I just found a serious issue with the KryoSerializer
>> (FLINK-2800) which in some cases produced duplicated elements or corrupted
>> data. I opened a PR to fix the issue (
>> https://github.com/apache/flink/pull/1308).
>>
>> Cheers,
>> Till
>> ​
>>
>> On Wed, Oct 28, 2015 at 5:38 PM, Maximilian Michels 
>> wrote:
>>
>> > Not sure. I think I'd rather leave it as it is because it renders the
>> > normal view (when your screen is wide enough) unreadable. I'd rather
>> > wait for a proper fix.
>> >
>> > Now
>> >
>> >
>> https://drive.google.com/file/d/0BziY9U_qva1sYzdxR3RJakltM0E/view?usp=sharing
>> > Afterwards
>> >
>> >
>> https://drive.google.com/file/d/0BziY9U_qva1sSmg1ZVJ6NmlVSGs/view?usp=sharing
>> >
>> > On Wed, Oct 28, 2015 at 5:28 PM, Sachin Goel 
>> > wrote:
>> > > Yes. I just made it work in firefox. It was already working in Chrome
>> > this
>> > > way.
>> > > Maybe Piotr will have a better fix later on.
>> > >
>> > > -- Sachin Goel
>> > > Computer Science, IIT Delhi
>> > > m. +91-9871457685
>> > >
>> > > On Wed, Oct 28, 2015 at 9:42 PM, Maximilian Michels 
>> > wrote:
>> > >
>> > >> @Sachin: I've tried it out. It has the tendency to make things a bit
>> > >> harder to read (because it breaks words at arbitrary positions).
>> > >> However, we don't have a better fix.
>> > >>
>> > >> On Wed, Oct 28, 2015 at 4:46 PM, Sachin Goel <
>> sachingoel0...@gmail.com>
>> > >> wrote:
>> > >> > Sorry. Wrong commit. In case you've pulled already.
>> > >> >
>> > >> > -- Sachin Goel
>> > >> > Computer Science, IIT Delhi
>> > >> > m. +91-9871457685
>> > >> >
>> > >> > On Wed, Oct 28, 2015 at 9:09 PM, Sachin Goel <
>> > sachingoel0...@gmail.com>
>> > >> > wrote:
>> > >> >
>> > >> >> @Max: Here's a fix for the wrapping issue:
>> > >> >> https://github.com/sachingoel0101/flink/tree/long-vertex-name.
>> It's
>> > >> just
>> > >> >> two lines, so I don't think opening a PR makes sense. Lemme know
>> if I
>> > >> >> should.
>> > >> >>
>> > >> >> @Vasia: Can you test it out on your job? I've checked on firefox,
>> > chrome
>> > >> >> and IE and it seems to work. [You might wanna rebuild
>> > flink-runtime-web
>> > >> >> followed by flink-dist first. :)]
>> > >> >>
>> > >> >>
>> > >> >>
>> > >> >> -- Sachin Goel
>> > >> >> Computer Science, IIT Delhi
>> > >> >> m. +91-9871457685
>> > >> >>
>> > >> >> On Wed, Oct 28, 2015 at 8:22 PM, Vasiliki Kalavri <
>> > >> >> vasilikikala...@gmail.com> wrote:
>> > >> >>
>> > >> >>> Ah I see. Thanks Sachin, Max. I think a label would be nice
>> there,
>> > yes.
>> > >> >>>
>> > >> >>> On 28 October 2015 at 15:45, Maximilian Michels 
>> > >> wrote:
>> > >> >>>
>> > >> >>> > Yes, that's correct. One is running operators (top of the job
>> > view)
>> > >> >>> > while the other lists all the parallel tasks (overview page,
>> and
>> > >> >>> > detail view in job view). I think it makes sense where they are
>> > >> >>> > displayed at the moment. It's just confusing how they are
>> > displayed.
>> > >> >>> > Could we add a label at the top of the job view to denote that
>> > these
>> > >> >>> > are operator-level numbers?

[VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc3)

2015-10-30 Thread Maximilian Michels
Please vote on releasing the following candidate as Apache Flink version
0.10.0:

The commit to be voted on:
2cd5a3c05ceec7bb9c5969c502c2d51b1ec00d0c

Branch:
release-0.10.0-rc3 (see
https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)

The release artifacts to be voted on can be found at:
http://people.apache.org/~mxm/flink-0.10.0-rc3/

The release artifacts are signed with the key with fingerprint C2909CBF:
http://www.apache.org/dist/flink/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapacheflink-1050

-

The vote is open for the next 72 hours and passes if a majority of at least
three +1 PMC votes are cast.

The vote ends on Monday November 2, 2015.

[ ] +1 Release this package as Apache Flink 0.10.0
[ ] -1 Do not release this package because ...

===

The following commits have been added on top of release-0.10.0-rc2:

e1f30b0 [FLINK-2559] Clean up JavaDocs
44b03f2 [FLINK-2800] [kryo] Fix Kryo serialization to clear buffered data
cdc0dfd [FLINK-2932] Examples in docs now download shell script using https
instead of http
fcc1eed [FLINK-2902][web-dashboard] Sort finished jobs by their end time,
running jobs by start time
6a13b9f [FLINK-2934] Remove placeholder pages for job.statistics,
taskmanager.log and taskmanager.stdout
51ac46e [FLINK-1610][docs] fix javadoc building for aggregate-scaladoc
profile
54375b9 [scala-shell][docs] add scala sources in earlier phase


Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc3)

2015-10-30 Thread Maximilian Michels
For testing, please refer to this document:
https://docs.google.com/document/d/1OtiAwILpnIwCqPF1Sk_8EcXuJOVc4uYtlP4i8m2c9rg/edit


On Fri, Oct 30, 2015 at 9:05 AM, Maximilian Michels  wrote:

> Please vote on releasing the following candidate as Apache Flink version
> 0.10.0:
>
> The commit to be voted on:
> 2cd5a3c05ceec7bb9c5969c502c2d51b1ec00d0c
>
> Branch:
> release-0.10.0-rc3 (see
> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)
>
> The release artifacts to be voted on can be found at:
> http://people.apache.org/~mxm/flink-0.10.0-rc3/
>
> The release artifacts are signed with the key with fingerprint C2909CBF:
> http://www.apache.org/dist/flink/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapacheflink-1050
>
> -
>
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PMC votes are cast.
>
> The vote ends on Monday November 2, 2015.
>
> [ ] +1 Release this package as Apache Flink 0.10.0
> [ ] -1 Do not release this package because ...
>
> ===
>
> The following commits have been added on top of release-0.10.0-rc2:
>
> e1f30b0 [FLINK-2559] Clean up JavaDocs
> 44b03f2 [FLINK-2800] [kryo] Fix Kryo serialization to clear buffered data
> cdc0dfd [FLINK-2932] Examples in docs now download shell script using
> https instead of http
> fcc1eed [FLINK-2902][web-dashboard] Sort finished jobs by their end time,
> running jobs by start time
> 6a13b9f [FLINK-2934] Remove placeholder pages for job.statistics,
> taskmanager.log and taskmanager.stdout
> 51ac46e [FLINK-1610][docs] fix javadoc building for aggregate-scaladoc
> profile
> 54375b9 [scala-shell][docs] add scala sources in earlier phase
>


Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc3)

2015-10-30 Thread Maximilian Michels
Hmpf. Just looked into this. In the Hadoop 2.X Scala 2.11 jar, Curator is
not shaded. Thus, it fails to load the shaded classes. After we fix this,
we will have to create a new RC.
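
For anyone who wants to check a build locally, listing the fat jar shows
whether the relocation happened (the jar name is illustrative):

jar tf flink-dist-0.10.0.jar | grep -i curator

In a correctly shaded build, the Curator classes should appear under
org/apache/flink/shaded/org/apache/curator/ instead of org/apache/curator/.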

On Fri, Oct 30, 2015 at 11:57 AM, Fabian Hueske  wrote:

> I'm sorry, but I have to give a -1 for this RC.
>
> Starting a Scala 2.11 build (hadoop2 and hadoop24) with
> ./bin/start-local.sh fails with a ClassNotFoundException:
>
> java.lang.NoClassDefFoundError:
> org/apache/flink/shaded/org/apache/curator/RetryPolicy
> at
>
> org.apache.flink.runtime.jobmanager.JobManager$.parseArgs(JobManager.scala:1721)
> at
>
> org.apache.flink.runtime.jobmanager.JobManager$.liftedTree2$1(JobManager.scala:1384)
> at
> org.apache.flink.runtime.jobmanager.JobManager$.main(JobManager.scala:1383)
> at
> org.apache.flink.runtime.jobmanager.JobManager.main(JobManager.scala)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.flink.shaded.org.apache.curator.RetryPolicy
> at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>
> This happens on OSX and Windows 10 with Cygwin.
>
> 2015-10-30 10:47 GMT+01:00 Maximilian Michels :
>
> > For testing, please refer to this document:
> >
> >
> https://docs.google.com/document/d/1OtiAwILpnIwCqPF1Sk_8EcXuJOVc4uYtlP4i8m2c9rg/edit
> >
> >
> > On Fri, Oct 30, 2015 at 9:05 AM, Maximilian Michels 
> > wrote:
> >
> > > Please vote on releasing the following candidate as Apache Flink
> version
> > > 0.10.0:
> > >
> > > The commit to be voted on:
> > > 2cd5a3c05ceec7bb9c5969c502c2d51b1ec00d0c
> > >
> > > Branch:
> > > release-0.10.0-rc3 (see
> > > https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)
> > >
> > > The release artifacts to be voted on can be found at:
> > > http://people.apache.org/~mxm/flink-0.10.0-rc3/
> > >
> > > The release artifacts are signed with the key with fingerprint
> C2909CBF:
> > > http://www.apache.org/dist/flink/KEYS
> > >
> > > The staging repository for this release can be found at:
> > > https://repository.apache.org/content/repositories/orgapacheflink-1050
> > >
> > > -
> > >
> > > The vote is open for the next 72 hours and passes if a majority of at
> > > least three +1 PMC votes are cast.
> > >
> > > The vote ends on Monday November 2, 2015.
> > >
> > > [ ] +1 Release this package as Apache Flink 0.10.0
> > > [ ] -1 Do not release this package because ...
> > >
> > > ===
> > >
> > > The following commits have been added on top of release-0.10.0-rc2:
> > >
> > > e1f30b0 [FLINK-2559] Clean up JavaDocs
> > > 44b03f2 [FLINK-2800] [kryo] Fix Kryo serialization to clear buffered
> data
> > > cdc0dfd [FLINK-2932] Examples in docs now download shell script using
> > > https instead of http
> > > fcc1eed [FLINK-2902][web-dashboard] Sort finished jobs by their end
> time,
> > > running jobs by start time
> > > 6a13b9f [FLINK-2934] Remove placeholder pages for job.statistics,
> > > taskmanager.log and taskmanager.stdout
> > > 51ac46e [FLINK-1610][docs] fix javadoc building for aggregate-scaladoc
> > > profile
> > > 54375b9 [scala-shell][docs] add scala sources in earlier phase
> > >
> >
>


[VOTE] [RESULT] Release Apache Flink 0.10.0 (release-0.10.0-rc3)

2015-10-30 Thread Maximilian Michels
This vote is cancelled in favor of a new RC.

On Fri, Oct 30, 2015 at 12:06 PM, Maximilian Michels  wrote:

> Hmpf. Just looked into this. In the Hadoop 2.X Scala 2.11 jar, Curator is
> not shaded. Thus, it fails to load the shaded classes. After we fix this,
> we will have to create a new RC.
>
> On Fri, Oct 30, 2015 at 11:57 AM, Fabian Hueske  wrote:
>
>> I'm sorry, but I have to give a -1 for this RC.
>>
>> Starting a Scala 2.11 build (hadoop2 and hadoop24) with
>> ./bin/start-local.sh fails with a ClassNotFoundException:
>>
>> java.lang.NoClassDefFoundError:
>> org/apache/flink/shaded/org/apache/curator/RetryPolicy
>> at
>>
>> org.apache.flink.runtime.jobmanager.JobManager$.parseArgs(JobManager.scala:1721)
>> at
>>
>> org.apache.flink.runtime.jobmanager.JobManager$.liftedTree2$1(JobManager.scala:1384)
>> at
>>
>> org.apache.flink.runtime.jobmanager.JobManager$.main(JobManager.scala:1383)
>> at
>> org.apache.flink.runtime.jobmanager.JobManager.main(JobManager.scala)
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.flink.shaded.org.apache.curator.RetryPolicy
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>
>> This happens on OSX and Windows 10 with Cygwin.
>>
>> 2015-10-30 10:47 GMT+01:00 Maximilian Michels :
>>
>> > For testing, please refer to this document:
>> >
>> >
>> https://docs.google.com/document/d/1OtiAwILpnIwCqPF1Sk_8EcXuJOVc4uYtlP4i8m2c9rg/edit
>> >
>> >
>> > On Fri, Oct 30, 2015 at 9:05 AM, Maximilian Michels 
>> > wrote:
>> >
>> > > Please vote on releasing the following candidate as Apache Flink
>> version
>> > > 0.10.0:
>> > >
>> > > The commit to be voted on:
>> > > 2cd5a3c05ceec7bb9c5969c502c2d51b1ec00d0c
>> > >
>> > > Branch:
>> > > release-0.10.0-rc3 (see
>> > > https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)
>> > >
>> > > The release artifacts to be voted on can be found at:
>> > > http://people.apache.org/~mxm/flink-0.10.0-rc3/
>> > >
>> > > The release artifacts are signed with the key with fingerprint
>> C2909CBF:
>> > > http://www.apache.org/dist/flink/KEYS
>> > >
>> > > The staging repository for this release can be found at:
>> > >
>> https://repository.apache.org/content/repositories/orgapacheflink-1050
>> > >
>> > > -
>> > >
>> > > The vote is open for the next 72 hours and passes if a majority of at
>> > > least three +1 PMC votes are cast.
>> > >
>> > > The vote ends on Monday November 2, 2015.
>> > >
>> > > [ ] +1 Release this package as Apache Flink 0.10.0
>> > > [ ] -1 Do not release this package because ...
>> > >
>> > > ===
>> > >
>> > > The following commits have been added on top of release-0.10.0-rc2:
>> > >
>> > > e1f30b0 [FLINK-2559] Clean up JavaDocs
>> > > 44b03f2 [FLINK-2800] [kryo] Fix Kryo serialization to clear buffered
>> data
>> > > cdc0dfd [FLINK-2932] Examples in docs now download shell script using
>> > > https instead of http
>> > > fcc1eed [FLINK-2902][web-dashboard] Sort finished jobs by their end
>> time,
>> > > running jobs by start time
>> > > 6a13b9f [FLINK-2934] Remove placeholder pages for job.statistics,
>> > > taskmanager.log and taskmanager.stdout
>> > > 51ac46e [FLINK-1610][docs] fix javadoc building for aggregate-scaladoc
>> > > profile
>> > > 54375b9 [scala-shell][docs] add scala sources in earlier phase
>> > >
>> >
>>
>
>


[VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc4)

2015-10-30 Thread Maximilian Michels
Please vote on releasing the following candidate as Apache Flink version
0.10.0:

The commit to be voted on:
6044b7f0366deec547022e4bc40c49e1b1c83f28

Branch:
release-0.10.0-rc4 (see
https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)

The release artifacts to be voted on can be found at:
http://people.apache.org/~mxm/flink-0.10.0-rc4/

The release artifacts are signed with the key with fingerprint C2909CBF:
http://www.apache.org/dist/flink/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapacheflink-1051

-

The vote is open for the next 72 hours and passes if a majority of at least
three +1 PMC votes are cast.

The vote ends on Monday November 2, 2015.

[ ] +1 Release this package as Apache Flink 0.10.0
[ ] -1 Do not release this package because ...

===

The following commits have been added on top of release-0.10.0-rc3:

698fbc3 [release][scripts] shade away curator correctly with different
Scala versions


Re: [VOTE] Release Apache Flink 0.10.0 (release-0.10.0-rc4)

2015-10-30 Thread Maximilian Michels
We can continue testing now:
https://docs.google.com/document/d/1keGYj2zj_AOOKH1bC43Xc4MDz0eLhTErIoxevuRtcus/edit

On Fri, Oct 30, 2015 at 3:49 PM, Maximilian Michels  wrote:

> Please vote on releasing the following candidate as Apache Flink version
> 0.10.0:
>
> The commit to be voted on:
> 6044b7f0366deec547022e4bc40c49e1b1c83f28
>
> Branch:
> release-0.10.0-rc4 (see
> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git)
>
> The release artifacts to be voted on can be found at:
> http://people.apache.org/~mxm/flink-0.10.0-rc4/
>
> The release artifacts are signed with the key with fingerprint C2909CBF:
> http://www.apache.org/dist/flink/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapacheflink-1051
>
> -
>
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PMC votes are cast.
>
> The vote ends on Monday November 2, 2015.
>
> [ ] +1 Release this package as Apache Flink 0.10.0
> [ ] -1 Do not release this package because ...
>
> ===
>
> The following commits have been added on top of release-0.10.0-rc3:
>
> 698fbc3 [release][scripts] shade away curator correctly with different
> Scala versions
>


Re: [DISCUSS] Java code style

2015-10-30 Thread Maximilian Michels
I looked into whether the Checkstyle plugin would also support tabs
together with a fixed line length. Indeed, this is possible because a tab
can be mapped to a fixed number of spaces.

I've modified the default Google Style Checkstyle file. I changed the
indentation to tabs (counted as 2 spaces each) and increased the line length
to 120:
https://gist.github.com/mxm/2ca4ef7702667c167d10
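
For reference, the relevant knobs in the Checkstyle configuration are roughly
the following (a minimal sketch; module placement differs between Checkstyle
versions, so treat it as illustrative):

<module name="Checker">
  <!-- count each tab as 2 spaces when measuring line length -->
  <property name="tabWidth" value="2"/>
  <module name="LineLength">
    <property name="max" value="120"/>
  </module>
</module>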

The scan of the entire Flink project resulted in 27,992 items in 1601
files. This roughly corresponds to the number of lines we would
have to touch to comply with the style rules. Note that one line may
contain multiple items. A lot of the items are import statements.

Next, I tried running the vanilla Google Style Checkstyle file over
the entire code base but my IntelliJ crashed. Using Maven, I wasn't
able to get a total result displayed but I'm assuming it would be
almost all lines of Flink code that had a violation due to tabs.

On Mon, Oct 26, 2015 at 6:56 PM, Suneel Marthi  wrote:
> 2 spaces is the convention that's followed on Mahout and Oryx.
>
> On Mon, Oct 26, 2015 at 1:42 PM, Till Rohrmann  wrote:
>
>> Concerning question 2 Tabs vs. Spaces, in case of spaces we would have to
>> decide on the number of spaces, too. The Google Java style says to use a 2
>> space indentation, which is in my opinion sufficient to distinguish
>> different indentations levels from each other. Furthermore, it would save
>> some space.
>>
>> I would not vote -1 if we keep tabs.
>>
>>
>>
>> On Sat, Oct 24, 2015 at 8:33 PM, Henry Saputra 
>> wrote:
>>
>> > +1 for adding restriction for Javadoc at least at the header of public
>> > classes and methods.
>> >
>> > We did the exercise in Twill and seemed to work pretty well.
>> >
>> > On Fri, Oct 23, 2015 at 1:34 AM, Maximilian Michels 
>> > wrote:
>> > > I don't think lazily adding comments will work. However, I'm fine with
>> > > adding all the checkstyle rules one module at a time (with a jira
>> > > issue to keep track of the modules already converted). It's not going
>> > > to happen that we lazily add comments because that's the reason why
>> > > comments are missing in the first place...
>> > >
>> > > On Fri, Oct 23, 2015 at 12:05 AM, Henry Saputra <
>> > > henry.sapu...@gmail.com> wrote:
>> > >> Could we make certain rules to give warning instead of error?
>> > >>
>> > >> This would allow us to cherry-pick certain rules we would like people
>> > >> to follow but not strictly enforced.
>> > >>
>> > >> - Henry
>> > >>
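
For what it's worth, Checkstyle supports this distinction: a module
can carry a severity property, so individual rules can be demoted
from errors to warnings. A minimal sketch, using the JavadocType
check as an example:

<module name="JavadocType">
    <!-- report violations without failing the build -->
    <property name="severity" value="warning"/>
</module>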
>> > >> On Thu, Oct 22, 2015 at 9:20 AM, Stephan Ewen  wrote:
>> > >>> I don't think a "let's add comments to everything" effort gives
>> > >>> us good comments, actually. It just gives us checkmark comments
>> > >>> that make the rules pass.
>> > >>>
>> > >>>> On Thu, Oct 22, 2015 at 3:29 PM, Fabian Hueske  wrote:
>> > >>>
>> > >>>> Sure, I don't expect it to be free.
>> > >>>> But everybody should be aware of the cost of adding this code
>> > >>>> style, i.e., spending a huge amount of time on reformatting and
>> > >>>> documenting code.
>> > >>>>
>> > >>>> Alternatively, we could drop the JavaDocs rule and make the
>> > >>>> transition significantly cheaper.
>> > >>>>
>> > >>>> 2015-10-22 15:24 GMT+02:00 Till Rohrmann :
>> > >>>>
>> > >>>> > There ain’t no such thing as a free lunch and code style.
>> > >>>> >
>> > >>>> > > On Thu, Oct 22, 2015 at 3:13 PM, Maximilian Michels <
>> > >>>> > > m...@apache.org> wrote:
>> > >>>> >
>> > >>>> > > I think we have to document all these classes. Code Style
>> > >>>> > > doesn't come for free :)
>> > >>>> > >
>> > >>>> > > On Thu, Oct 22, 2015 at 3:09 PM, Fabian Hueske <
>> > >>>> > > fhue...@gmail.com> wrote:
>> > >>>> > > > Any ideas how to deal with the mandatory JavaDoc rule
>> > >>>> > > > for existing code? Just adding empty headers to make the
>> > >>>> > > > checkstyle pass?

Re: Scala 2.10/2.11 Maven dependencies

2015-11-01 Thread Maximilian Michels
Good point. Didn't know that. We can still add them for the release.

On Sat, Oct 31, 2015 at 1:51 PM, Alexander Alexandrov
 wrote:
> My two cents - there are already Maven artifacts deployed for 2.11 in the
> SNAPSHOT repository. I think it might be confusing if they suddenly
> disappear for the stable release.
>
>
> 2015-10-29 11:58 GMT+01:00 Maximilian Michels :
>
>> Seems like we agree that we need artifacts for different versions of Scala
>> on Maven. There also seems to be a preference for including the version in
>> the artifact name.
>>
>> I've created an issue and marked it to be resolved for 1.0. For the 0.10
>> release, we will have binaries but no Maven artifacts. The biggest
>> challenge I see is to remove Scala from as many modules as possible. For
>> example, flink-java depends on Scala at the moment.
>>
>> https://issues.apache.org/jira/browse/FLINK-2940
>>
>> On Wed, Oct 28, 2015 at 7:31 PM, Frederick F. Kautz IV 
>> wrote:
>>
>> > No idea if I get a vote ;) Nevertheless, +1 to have binaries for both
>> > versions in Maven and explicitly "scala versioned".
>> >
>> > Some background on this for those not as familiar with scala versioning:
>> >
>> > It's considered best practice to label the version of Scala a
>> > library is built against in the artifact id.
>> >
>> > The reason is that compiled Scala code is only compatible with the
>> > major version of Scala it was compiled for. For example, a library
>> > compatible with 2.10 is not compatible with 2.11. The same will be
>> > true with 2.12 once it is released. Mixing versions will result in
>> > undefined behavior, which will likely manifest itself as runtime
>> > exceptions.
>> >
>> > The convention to fix this problem is for all published libraries
>> > to specify the version of Scala they are compatible with. Leaving
>> > out the Scala version in a library is akin to saying "We don't
>> > depend on Scala for this library, so feel free to use whatever you
>> > want." sbt users will typically specify the version of Scala they
>> > use, and the tooling is built around ensuring consistency with the
>> > "%%" operator.
>> >
>> > E.g.
>> >
>> > scalaVersion := "2.11.4"
>> >
>> > // this resolves to artifactId "scalacheck_2.11"
>> > libraryDependencies += "org.scalacheck" %% "scalacheck" % "1.12.0" % "test"
>> >
>> > The most important part of this is that the scala version is explicit
>> > which eliminates the problem for downstream users.
>> >
>> > Cheers,
>> > Frederick
>> >
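
A side note on the Maven side of this: Maven has no equivalent of
sbt's "%%", so users have to spell out the suffix in the artifactId
themselves. A hypothetical dependency under the naming scheme
discussed in FLINK-2940 (this exact artifact/version did not exist
at the time of writing) would look like:

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-runtime_2.11</artifactId>
    <version>1.0.0</version>
</dependency>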
>> >
>> > On 10/28/2015 10:55 AM, Fabian Hueske wrote:
>> >
>> >> +1 to have binaries for both versions in Maven and as builds to
>> >> download.
>> >>
>> >> 2015-10-26 17:11 GMT+01:00 Theodore Vasiloudis <
>> >> theodoros.vasilou...@gmail.com>:
>> >>
>> >>> +1 for having binaries. I'm currently working on a Spark
>> >>> application with Scala 2.11, and having to rebuild everything when
>> >>> deploying, e.g. to EC2, is a pain.
>> >>>
>> >>> On Mon, Oct 26, 2015 at 4:22 PM, Ufuk Celebi  wrote:
>> >>>
>> >>>> I agree with Till, but is this something you want to address in
>> >>>> this release already?
>> >>>>
>> >>>> I would postpone it to 1.0.0.
>> >>>>
>> >>>> – Ufuk
>> >>>>
>> >>>> On 26 Oct 2015, at 16:17, Till Rohrmann  wrote:
>> >>>>>
>> >>>>> I would be in favor of also deploying Scala 2.11 artifacts to
>> >>>>> Maven, since more and more people will try out Flink with Scala
>> >>>>> 2.11. Having the dependencies in the Maven repository makes it
>> >>>>> considerably easier for people to get their Flink jobs running.
>> >>>>>
>> >>>>> Furthermore, I observed that people are not aware that our
>> >>>>> deployed artifacts, e.g. flink-runtime, are built with Scala
>> >>>>> 2.10. As a consequence, they mix flink dependencies with other
>> >>>>> dependencies pulling in Scala 2.11.
