date:20200512

Jark Wu created FLINK-17630:
---

 Summary: Implement format factory for Avro serialization and 
deserialization schema
 Key: FLINK-17630
 URL: https://issues.apache.org/jira/browse/FLINK-17630
 Project: Flink
  Issue Type: Sub-task
Reporter: Jark Wu






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-17629) Implement format factory for JSON serialization and deserialization schema

Jark Wu created FLINK-17629:
---

 Summary: Implement format factory for JSON serialization and 
deserialization schema
 Key: FLINK-17629
 URL: https://issues.apache.org/jira/browse/FLINK-17629
 Project: Flink
  Issue Type: Sub-task
Reporter: Jark Wu






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Re: What's the best practice to determine whether a job has finished or not?

Hi,

The problem is that the JobClient is talking to the wrong system. In YARN 
per-job mode the cluster will only run as long as the job runs so there will be 
no-one there to respond with the job status after the job is finished.

I think the solution is that the JobClient should talk to the right system, in 
the YARN example it should talk directly to YARN for figuring out the job 
status (and maybe for other things as well).

Best,
Aljoscha

On Fri, May 8, 2020, at 15:05, Kurt Young wrote:
> +dev 
> 
> Best,
> Kurt
> 
> 
> On Fri, May 8, 2020 at 3:35 PM Caizhi Weng  wrote:
> 
> > Hi Jeff,
> >
> > Thanks for the response. However I'm using executeAsync so that I can run
> > the job asynchronously and get a JobClient to monitor the job. JobListener
> > only works for synchronous execute method. Is there other way to achieve
> > this?
> >
> > Jeff Zhang  于2020年5月8日周五 下午3:29写道：
> >
> >> I use JobListener#onJobExecuted to be notified that the flink job is
> >> done.
> >> It is pretty reliable for me, the only exception is the client process is
> >> down.
> >>
> >> BTW, the reason you see ApplicationNotFound exception is that yarn app
> >> is terminated which means the flink cluster is shutdown. While for
> >> standalone mode, the flink cluster is always up.
> >>
> >>
> >> Caizhi Weng  于2020年5月8日周五 下午2:47写道：
> >>
> >>> Hi dear Flink community,
> >>>
> >>> I would like to determine whether a job has finished (no matter
> >>> successfully or exceptionally) in my code.
> >>>
> >>> I used to think that JobClient#getJobStatus is a good idea, but I found
> >>> that it behaves quite differently under different executing environments.
> >>> For example, under a standalone session cluster it will return the 
> >>> FINISHED
> >>> status for a finished job, while under a yarn per job cluster it will 
> >>> throw
> >>> a ApplicationNotFound exception. I'm afraid that there might be other
> >>> behaviors for other environments.
> >>>
> >>> So what's the best practice to determine whether a job has finished or
> >>> not? Note that I'm not waiting for the job to finish. If the job hasn't
> >>> finished I would like to know it and do something else.
> >>>
> >>
> >>
> >> --
> >> Best Regards
> >>
> >> Jeff Zhang
> >>
> >
>

[VOTE] FLIP-126: FLIP-126: Unify (and separate) Watermark Assigners

Hi all,

I would like to start the vote for FLIP-126 [1], which is discussed and reached 
a consensus in the discussion thread [2].

The vote will be open until May 15th (72h), unless there is an objection or not 
enough votes.

Best,
Aljoscha

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-126%3A+Unify+%28and+separate%29+Watermark+Assigners

[2] 
https://lists.apache.org/thread.html/r7988ddfe5ca8d85e666039cf6240e1007a2ca337a52108f684b66d90%40%3Cdev.flink.apache.org%3E

Re: [DISCUSS] Releasing "fat" and "slim" Flink distributions

2020-05-12 Thread Jingsong Li

Thanks for your discussion.

Sorry to start discussing another thing:

The biggest problem I see is the variety of problems caused by users' lack
of format dependency.
As Aljoscha said, these three formats are very small and no third party
dependence, and they are widely used by table users.
Actually, we don't have any other built-in table formats now... In total
151K...

73K flink-avro-1.10.0.jar
36K flink-csv-1.10.0.jar
42K flink-json-1.10.0.jar

So, Can we just put them into "lib/" or flink-table-uber?
It not solve all problems and maybe it is independent of "fat" and "slim".
But also improve usability.
What do you think? Any objections?

Best,
Jingsong Lee

On Mon, May 11, 2020 at 5:48 PM Chesnay Schepler  wrote:

> One downside would be that we're shipping more stuff when running on
> YARN for example, since the entire plugins directory is shiped by default.
>
> On 17/04/2020 16:38, Stephan Ewen wrote:
> > @Aljoscha I think that is an interesting line of thinking. the swift-fs
> may
> > be rarely enough used to move it to an optional download.
> >
> > I would still drop two more thoughts:
> >
> > (1) Now that we have plugins support, is there a reason to have a metrics
> > reporter or file system in /opt instead of /plugins? They don't spoil the
> > class path any more.
> >
> > (2) I can imagine there still being a desire to have a "minimal" docker
> > file, for users that want to keep the container images as small as
> > possible, to speed up deployment. It is fine if that would not be the
> > default, though.
> >
> >
> > On Fri, Apr 17, 2020 at 12:16 PM Aljoscha Krettek 
> > wrote:
> >
> >> I think having such tools and/or tailor-made distributions can be nice
> >> but I also think the discussion is missing the main point: The initial
> >> observation/motivation is that apparently a lot of users (Kurt and I
> >> talked about this) on the chinese DingTalk support groups, and other
> >> support channels have problems when first using the SQL client because
> >> of these missing connectors/formats. For these, having additional tools
> >> would not solve anything because they would also not take that extra
> >> step. I think that even tiny friction should be avoided because the
> >> annoyance from it accumulates of the (hopefully) many users that we want
> >> to have.
> >>
> >> Maybe we should take a step back from discussing the "fat"/"slim" idea
> >> and instead think about the composition of the current dist. As
> >> mentioned we have these jars in opt/:
> >>
> >>17M flink-azure-fs-hadoop-1.10.0.jar
> >>52K flink-cep-scala_2.11-1.10.0.jar
> >> 180K flink-cep_2.11-1.10.0.jar
> >> 746K flink-gelly-scala_2.11-1.10.0.jar
> >> 626K flink-gelly_2.11-1.10.0.jar
> >> 512K flink-metrics-datadog-1.10.0.jar
> >> 159K flink-metrics-graphite-1.10.0.jar
> >> 1.0M flink-metrics-influxdb-1.10.0.jar
> >> 102K flink-metrics-prometheus-1.10.0.jar
> >>10K flink-metrics-slf4j-1.10.0.jar
> >>12K flink-metrics-statsd-1.10.0.jar
> >>36M flink-oss-fs-hadoop-1.10.0.jar
> >>28M flink-python_2.11-1.10.0.jar
> >>22K flink-queryable-state-runtime_2.11-1.10.0.jar
> >>18M flink-s3-fs-hadoop-1.10.0.jar
> >>31M flink-s3-fs-presto-1.10.0.jar
> >> 196K flink-shaded-netty-tcnative-dynamic-2.0.25.Final-9.0.jar
> >> 518K flink-sql-client_2.11-1.10.0.jar
> >>99K flink-state-processor-api_2.11-1.10.0.jar
> >>25M flink-swift-fs-hadoop-1.10.0.jar
> >> 160M opt
> >>
> >> The "filesystem" connectors ar ethe heavy hitters, there.
> >>
> >> I downloaded most of the SQL connectors/formats and this is what I got:
> >>
> >>73K flink-avro-1.10.0.jar
> >>36K flink-csv-1.10.0.jar
> >>55K flink-hbase_2.11-1.10.0.jar
> >>88K flink-jdbc_2.11-1.10.0.jar
> >>42K flink-json-1.10.0.jar
> >>20M flink-sql-connector-elasticsearch6_2.11-1.10.0.jar
> >> 2.8M flink-sql-connector-kafka_2.11-1.10.0.jar
> >>24M sql-connectors-formats
> >>
> >> We could just add these to the Flink distribution without blowing it up
> >> by much. We could drop any of the existing "filesystem" connectors from
> >> opt and add the SQL connectors/formats and not change the size of Flink
> >> dist. So maybe we should do that instead?
> >>
> >> We would need some tooling for the sql-client shell script to pick-up
> >> the connectors/formats up from opt/ because we don't want to add them to
> >> lib/. We're already doing that for finding the flink-sql-client jar,
> >> which is also not in lib/.
> >>
> >> What do you think?
> >>
> >> Best,
> >> Aljoscha
> >>
> >> On 17.04.20 05:22, Jark Wu wrote:
> >>> Hi,
> >>>
> >>> I like the idea of web tool to assemble fat distribution. And the
> >>> https://code.quarkus.io/ looks very nice.
> >>> All the users need to do is just select what he/she need (I think this
> >> step
> >>> can't be omitted anyway).
> >>> We can also provide a default fat distribution on the web which default
> >>> selects some popular connectors.
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>> On Fri, 17 Apr 2020 a

[jira] [Created] (FLINK-17631) Supports char type for ConfigOption

2020-05-12 Thread Danny Chen (Jira)

Danny Chen created FLINK-17631:
--

 Summary: Supports char type for ConfigOption
 Key: FLINK-17631
 URL: https://issues.apache.org/jira/browse/FLINK-17631
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.11.0
Reporter: Danny Chen
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Re: [VOTE] Release 1.10.1, release candidate #3

2020-05-12 Thread Wei Zhong

+1 (non-binding)

- checked signatures
- the python package looks good
- the python shell looks good
- submitted a python job on local cluster

Best,
Wei

> 在 2020年5月12日，11:42，Zhu Zhu  写道：
> 
> +1 (non-binding)
> 
> - checked release notes
> - checked signatures
> - built from source
> - submitted an example job on yarn cluster
> - WebUI and logs look good
> 
> Thanks,
> Zhu Zhu
> 
> Leonard Xu  于2020年5月12日周二 上午11:10写道：
> 
>> +1 (non-binding)
>> 
>> - checked/verified signatures and hashes
>> - checked the release note
>> - checked that there are no missing artifacts in staging area
>> - built from source sing scala 2.11 succeeded
>> - started cluster and run some e2e test succeeded
>> - started a cluster, WebUI was accessible, submitted a wordcount job and
>> ran succeeded, no suspicious log output
>> - the web PR looks good
>> 
>> Best,
>> Leonard Xu
>> 
>>> 在 2020年5月12日，10:24，jincheng sun  写道：
>>> 
>>> +1(binding)
>>> 
>>> - built from source and run Streaming word count without unexpected
>>> information.
>>> - check the signatures, looks good!
>>> 
>>> BTW: I've added your PyPI account(carp84) as a maintainer role. Great job
>>> Yu!
>>> 
>>> Best,
>>> Jincheng
>>> 
>>> Hequn Cheng  于2020年5月12日周二 上午9:49写道：
>>> 
 +1 (binding)
 
 - Go through all commits from 1.10.0 to 1.10.1 and spot no new license
 problems.
 - Built from source archive successfully.
 - Travis e2e tests have passed.
 - Signatures and hash are correct.
 - Run SocketWindowWordCount on the local cluster and check web ui &
>> logs.
 - Install Python package and run Python WordCount example.
 - Website PR looks good.
 
 Best,
 Hequn
 
 On Mon, May 11, 2020 at 10:39 PM Ufuk Celebi  wrote:
 
> +1 (binding)
> 
> - checked release notes
> - verified sums and hashes
> - reviewed website PR
> - successfully built an internal Flink distribution based on the
 1.10.1-rc3
> commit
> - successfully built internal jobs against the staging repo and
>> deployed
> those jobs to a 1.10.1 job cluster on Kubernetes and tested
>> checkpointing
> 
> –  Ufuk
> 
> On Mon, May 11, 2020 at 11:47 AM Tzu-Li (Gordon) Tai <
 tzuli...@apache.org>
> wrote:
>> 
>> +1 (binding)
>> 
>> Legal checks:
>> - checksum & gpg OK
>> - changes to Kinesis connector NOTICE files looks good
>> - built from source
>> 
>> Downstream checks in flink-statefun:
>> - Built StateFun with Flink 1.10.1 + e2e tests enabled (mvn clean
 install
>> -Prun-e2e-tests), builds and passes
>> - Previous issue preventing successful failure recovery when using the
> new
>> scheduler, is now fixed with this RC
>> 
>> Cheers,
>> Gordon
>> 
>> On Mon, May 11, 2020 at 2:47 PM Congxian Qiu 
> wrote:
>> 
>>> +1 (no-binding)
>>> 
>>> - checksum & gpg ok
>>> - build from source OK
>>> - all pom files points to the same version OK
>>> - LICENSE OK
>>> - run demos OK
>>> Best,
>>> Congxian
>>> 
>>> 
>>> Dian Fu  于2020年5月10日周日 下午10:14写道：
>>> 
 +1 (non-binding)
 
 - checked the dependency changes since 1.10.0
 1) kafka was bumped from 0.10.2.1 to 0.10.2.2 for
 flink-connector-kafka-0.10 and it has been reflected in the notice
> file
 2) amazon-kinesis-producer was bumped from 0.13.1 to 0.14.0 and
 it
> has
 been reflected in the notice file
 3) the dependencies com.carrotsearch:hppc,
 com.github.spullara.mustache.java,
> org.elasticsearch:elasticsearch-geo
>>> and
 org.elasticsearch.plugin:lang-mustache-client was bundled in the
 jar
> of
 flink-sql-connector-elasticsearch7 and it has been reflected in the
>>> notice
 file
 4) influxdb-java was bumped from 2.16 to 2.17 and it has been
> reflected
 in the notice file
 - verified the checksum and signature
 - checked that the PyFlink package could be pip installed
 - have left a few minor comments on the website PR
 
 Regards,
 Dian
 
 On Sat, May 9, 2020 at 12:02 PM Thomas Weise 
 wrote:
 
> Thanks for the RC!
> 
> +1 (binding)
> 
> - repeated benchmark runs
> 
> 
> On Fri, May 8, 2020 at 10:52 AM Robert Metzger <
> rmetz...@apache.org>
> wrote:
> 
>> Thanks a lot for creating another RC!
>> 
>> +1 (binding)
>> 
>> - checked diff to last RC:
>> 
>> 
> 
 
>>> 
> 
> 
 
>> https://github.com/apache/flink/compare/release-1.10.1-rc2...release-1.10.1-rc3
>> - kinesis dependency change is properly documented
>> - started Flink locally (on Java11 :) )
>>  - seems to be build off the specified commit
>

Re: [VOTE] Release 1.10.1, release candidate #3

2020-05-12 Thread Yu Li

Thanks all for checking and voting for the release!

With 11 +1 (6 binding) and no -1 votes, this vote has passed. Will
officially conclude the vote in another thread.

Best Regards,
Yu


On Tue, 12 May 2020 at 17:02, Wei Zhong  wrote:

> +1 (non-binding)
>
> - checked signatures
> - the python package looks good
> - the python shell looks good
> - submitted a python job on local cluster
>
> Best,
> Wei
>
> > 在 2020年5月12日，11:42，Zhu Zhu  写道：
> >
> > +1 (non-binding)
> >
> > - checked release notes
> > - checked signatures
> > - built from source
> > - submitted an example job on yarn cluster
> > - WebUI and logs look good
> >
> > Thanks,
> > Zhu Zhu
> >
> > Leonard Xu  于2020年5月12日周二 上午11:10写道：
> >
> >> +1 (non-binding)
> >>
> >> - checked/verified signatures and hashes
> >> - checked the release note
> >> - checked that there are no missing artifacts in staging area
> >> - built from source sing scala 2.11 succeeded
> >> - started cluster and run some e2e test succeeded
> >> - started a cluster, WebUI was accessible, submitted a wordcount job and
> >> ran succeeded, no suspicious log output
> >> - the web PR looks good
> >>
> >> Best,
> >> Leonard Xu
> >>
> >>> 在 2020年5月12日，10:24，jincheng sun  写道：
> >>>
> >>> +1(binding)
> >>>
> >>> - built from source and run Streaming word count without unexpected
> >>> information.
> >>> - check the signatures, looks good!
> >>>
> >>> BTW: I've added your PyPI account(carp84) as a maintainer role. Great
> job
> >>> Yu!
> >>>
> >>> Best,
> >>> Jincheng
> >>>
> >>> Hequn Cheng  于2020年5月12日周二 上午9:49写道：
> >>>
>  +1 (binding)
> 
>  - Go through all commits from 1.10.0 to 1.10.1 and spot no new license
>  problems.
>  - Built from source archive successfully.
>  - Travis e2e tests have passed.
>  - Signatures and hash are correct.
>  - Run SocketWindowWordCount on the local cluster and check web ui &
> >> logs.
>  - Install Python package and run Python WordCount example.
>  - Website PR looks good.
> 
>  Best,
>  Hequn
> 
>  On Mon, May 11, 2020 at 10:39 PM Ufuk Celebi  wrote:
> 
> > +1 (binding)
> >
> > - checked release notes
> > - verified sums and hashes
> > - reviewed website PR
> > - successfully built an internal Flink distribution based on the
>  1.10.1-rc3
> > commit
> > - successfully built internal jobs against the staging repo and
> >> deployed
> > those jobs to a 1.10.1 job cluster on Kubernetes and tested
> >> checkpointing
> >
> > –  Ufuk
> >
> > On Mon, May 11, 2020 at 11:47 AM Tzu-Li (Gordon) Tai <
>  tzuli...@apache.org>
> > wrote:
> >>
> >> +1 (binding)
> >>
> >> Legal checks:
> >> - checksum & gpg OK
> >> - changes to Kinesis connector NOTICE files looks good
> >> - built from source
> >>
> >> Downstream checks in flink-statefun:
> >> - Built StateFun with Flink 1.10.1 + e2e tests enabled (mvn clean
>  install
> >> -Prun-e2e-tests), builds and passes
> >> - Previous issue preventing successful failure recovery when using
> the
> > new
> >> scheduler, is now fixed with this RC
> >>
> >> Cheers,
> >> Gordon
> >>
> >> On Mon, May 11, 2020 at 2:47 PM Congxian Qiu <
> qcx978132...@gmail.com>
> > wrote:
> >>
> >>> +1 (no-binding)
> >>>
> >>> - checksum & gpg ok
> >>> - build from source OK
> >>> - all pom files points to the same version OK
> >>> - LICENSE OK
> >>> - run demos OK
> >>> Best,
> >>> Congxian
> >>>
> >>>
> >>> Dian Fu  于2020年5月10日周日 下午10:14写道：
> >>>
>  +1 (non-binding)
> 
>  - checked the dependency changes since 1.10.0
>  1) kafka was bumped from 0.10.2.1 to 0.10.2.2 for
>  flink-connector-kafka-0.10 and it has been reflected in the notice
> > file
>  2) amazon-kinesis-producer was bumped from 0.13.1 to 0.14.0 and
>  it
> > has
>  been reflected in the notice file
>  3) the dependencies com.carrotsearch:hppc,
>  com.github.spullara.mustache.java,
> > org.elasticsearch:elasticsearch-geo
> >>> and
>  org.elasticsearch.plugin:lang-mustache-client was bundled in the
>  jar
> > of
>  flink-sql-connector-elasticsearch7 and it has been reflected in
> the
> >>> notice
>  file
>  4) influxdb-java was bumped from 2.16 to 2.17 and it has been
> > reflected
>  in the notice file
>  - verified the checksum and signature
>  - checked that the PyFlink package could be pip installed
>  - have left a few minor comments on the website PR
> 
>  Regards,
>  Dian
> 
>  On Sat, May 9, 2020 at 12:02 PM Thomas Weise 
>  wrote:
> 
> > Thanks for the RC!
> >
> > +1 (binding)
> >
> > - repeated benchmark runs
> >
> >
> > On Fri,

Re: [PROPOSAL] Google Season of Docs 2020.

This is great newst :-) Thanks Marta for driving this effort!

On Mon, May 11, 2020 at 4:22 PM Sivaprasanna 
wrote:

> Awesome. Great job.
>
> On Mon, 11 May 2020 at 7:22 PM, Seth Wiesman  wrote:
>
> > Thank you for putting this together Marta!
> >
> > On Mon, May 11, 2020 at 8:35 AM Fabian Hueske  wrote:
> >
> > > Thanks Marta and congratulations!
> > >
> > > Am Mo., 11. Mai 2020 um 14:55 Uhr schrieb Robert Metzger <
> > > rmetz...@apache.org>:
> > >
> > > > Awesome :)
> > > > Thanks a lot for driving this Marta!
> > > >
> > > > Nice to see Flink (by virtue of having Apache as part of the name) so
> > > high
> > > > on the list, with other good open source projects :)
> > > >
> > > >
> > > > On Mon, May 11, 2020 at 2:18 PM Marta Paes Moreira <
> > ma...@ververica.com>
> > > > wrote:
> > > >
> > > > > I'm happy to announce that we were *accepted* into this year's
> Google
> > > > > Season of Docs!
> > > > >
> > > > > The list of projects was published today [1]. The next step is for
> > > > > technical writers to reach out (May 11th-June 8th) and apply (June
> > > > 9th-July
> > > > > 9th).
> > > > >
> > > > > Thanks again to everyone involved! I'm really looking forward to
> > > kicking
> > > > > off the project in September.
> > > > >
> > > > > [1]
> https://developers.google.com/season-of-docs/docs/participants/
> > > > >
> > > > > Marta
> > > > >
> > > > > On Thu, Apr 30, 2020 at 5:14 PM Marta Paes Moreira <
> > > ma...@ververica.com>
> > > > > wrote:
> > > > >
> > > > > > The application to Season of Docs 2020 is close to being
> finalized.
> > > > I've
> > > > > > created a PR with the application announcement for the Flink blog
> > [1]
> > > > (as
> > > > > > required by Google OSS).
> > > > > >
> > > > > > Thanks a lot to everyone who pitched in — and special thanks to
> > > > Aljoscha
> > > > > > and Seth for volunteering as mentors!
> > > > > >
> > > > > > I'll send an update to this thread once the results are out (May
> > > 11th).
> > > > > >
> > > > > > [1] https://github.com/apache/flink-web/pull/332
> > > > > >
> > > > > > On Mon, Apr 27, 2020 at 9:28 PM Seth Wiesman <
> sjwies...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > >> Hi Marta,
> > > > > >>
> > > > > >> I think this is a great idea, I'd be happy to help mentor a
> table
> > > > > >> documentation project.
> > > > > >>
> > > > > >> Seth
> > > > > >>
> > > > > >> On Thu, Apr 23, 2020 at 8:38 AM Marta Paes Moreira <
> > > > ma...@ververica.com
> > > > > >
> > > > > >> wrote:
> > > > > >>
> > > > > >> > Thanks for the feedback!
> > > > > >> >
> > > > > >> > So far, the projects on the table are:
> > > > > >> >
> > > > > >> >1. Improving the Table API/SQL documentation.
> > > > > >> >2. Improving the documentation about Deployments.
> > > > > >> >3. Restructuring and standardizing the documentation about
> > > > > >> Connectors.
> > > > > >> >4. Finishing the Chinese translation.
> > > > > >> >
> > > > > >> > I think 2. would require a lot of technical knowledge about
> > Flink,
> > > > > which
> > > > > >> > might not be a good fit for GSoD (as discussed last year).
> > > > > >> >
> > > > > >> > As for mentors, we have:
> > > > > >> >
> > > > > >> >- Aljoscha (Table API/SQL)
> > > > > >> >- Till (Deployments)
> > > > > >> >- Stephan also said he'd be happy to participate as a
> mentor
> > if
> > > > > >> needed.
> > > > > >> >
> > > > > >> > For the translation project, I'm pulling in the people
> involved
> > in
> > > > > last
> > > > > >> > year's thread (Jark and Jincheng), as we would need two
> > > > > chinese-speaking
> > > > > >> > mentors.
> > > > > >> >
> > > > > >> > I'll follow up with a draft proposal early next week, once we
> > > reach
> > > > a
> > > > > >> > consensus and have enough mentors (2 per project). Thanks
> again!
> > > > > >> >
> > > > > >> > Marta
> > > > > >> >
> > > > > >> >
> > > > > >> > On Fri, Apr 17, 2020 at 2:53 PM Till Rohrmann <
> > > trohrm...@apache.org
> > > > >
> > > > > >> > wrote:
> > > > > >> >
> > > > > >> > > Thanks for driving this effort Marta.
> > > > > >> > >
> > > > > >> > > I'd be up for mentoring improvements for the deployment
> > section
> > > as
> > > > > >> > > described in FLIP-42.
> > > > > >> > >
> > > > > >> > > Cheers,
> > > > > >> > > Till
> > > > > >> > >
> > > > > >> > > On Fri, Apr 17, 2020 at 11:20 AM Aljoscha Krettek <
> > > > > >> aljos...@apache.org>
> > > > > >> > > wrote:
> > > > > >> > >
> > > > > >> > > > Hi,
> > > > > >> > > >
> > > > > >> > > > first, excellent that you're driving this, Marta!
> > > > > >> > > >
> > > > > >> > > > By now I have made quite some progress on the FLIP-42
> > > > > restructuring
> > > > > >> so
> > > > > >> > > > that is not a good effort for someone to join now. Plus
> > there
> > > is
> > > > > >> also
> > > > > >> > > > [1], which is about incorporating the existing Flink
> > Training
> > > > > >> material
> > > > > >> > > > into the concepts section of the Flink doc.
> > > > > >> > > >
> > > > >

Re: [DISCUSS] Align the behavior of internal return result of MapState#entries, keys, values and iterator.

A bit late but also +1 for the proposal to return an empty
iterator/collection.

Cheers,
Till

On Mon, May 11, 2020 at 11:17 AM SteNicholas  wrote:

> Hi Tang Yun,
> I have already created new issue for FLINK-17610 to align the behavior
> of result of internal map state to return empty iterator. Please check this
> issue, if you have any question for it, please let me know about this.
>
>
> Thanks,
> Nicholas Jiang
>
>
>
> --
> Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>

Re: [DISCUSS] Add a material web page under "https://flink.apache.org/"

I think this is a very good idea. With a team committed to keep the site up
to date the main concern should also be resolved.

Cheers,
Till

On Mon, May 11, 2020 at 5:25 PM Yuan Mei  wrote:

> Yep, Marta's concern is definitely a great point in terms of maintenance
> costs.
>
> What I would like to highlight is a place to collect scattered materials
> `altogether` so that they are easy to search, tag, and update. In that
> sense, I really like Fabian's idea.
>
> It is even better if we could have some level of interaction so that people
> can comment and give feedback to the author directly and ask questions
> under those materials. We can even analyze what type of materials are more
> popular based on feedback, and accordingly, we can adjust our promotion
> strategy. I think these would be great complementation to what we can
> achieve solely through mailing lists, slack, or stack overflow today.
>
> I can ask whether the team running to promote the Flink community in China
> can help to build such a site if appropriate. It should not be too
> difficult if we do not intend to do fancy stuff :-).
>
> What do you say?
>
> Best
>
> Yuan
>
>
>
> On Mon, May 11, 2020 at 7:54 PM Robert Metzger 
> wrote:
>
> > Fabian's idea is definitely nice, if we find somebody who wants to
> develop
> > such a site :)
> >
> > Marta has raised valid concerns. Maybe we could start by creating a Wiki
> > page, that nicely categorizes the different items.
> > I'm happy to prominently link the Wiki page from the "Powered By" and
> > Starting page.
> >
> > On Mon, May 11, 2020 at 11:05 AM Fabian Hueske 
> wrote:
> >
> > > Hi everyone,
> > >
> > > I like the idea of having a place to collect material (blogs, talk
> > > recordings, slides) on Flink.
> > > As Robert said, we had this before and dropped it at some point because
> > the
> > > content became stale.
> > > Also I feel we would need something more dynamic than a static website
> > > (like flink.apache.org) but something with search, tags, up/down
> votes.
> > >
> > > What do you think about collecting content on an external site similar
> to
> > > flink-packages.org.
> > > Maybe flink-packages.org can be extended to also list content pieces
> or
> > a
> > > new page (flink-content.org/flink-materials.org/???) could be built.
> > > This is of course significantly more work, but might pay off because it
> > can
> > > offer more interaction and we could also allow externals (i.e,
> > > non-committers) to post new content.
> > >
> > > What do you think?
> > >
> > > Cheers, Fabian
> > >
> > >
> > > Am Mo., 11. Mai 2020 um 09:44 Uhr schrieb Marta Paes Moreira <
> > > ma...@ververica.com>:
> > >
> > > > Hi, Yuan!
> > > >
> > > > Thanks for starting this discussion! It would be really good to make
> > > those
> > > > resources more prominent in the Flink website somehow.
> > > >
> > > > You can find an example close to the idea you're bringing up in [1].
> My
> > > > concern is that placing all resources in a page can be hard to
> maintain
> > > and
> > > > a bit overwhelming. For users, it also ends up being hard to pinpoint
> > > what
> > > > they are looking for if there is too much material.
> > > >
> > > > Rather than trying to maintain such a page, I think it could be
> > valuable
> > > to
> > > > rethink the "Use Cases" section of the website and incorporate some
> > > > relevant materials in association with the right context (e.g.
> > industry,
> > > > use case), then point users to a different place where they can find
> > more
> > > > (e.g. the Flink Forward playlist/Slideshare or other dedicated Flink
> > > > channel).
> > > >
> > > > For blogposts, what we usually do is reach out to the authors of
> > > > blogposts that are relevant to the community, but posted elsewhere,
> and
> > > ask
> > > > to submit them to the Flink blog as well. Could this also be an
> option
> > > for
> > > > the Chinese version of the blog? As far as I see, for now it's just a
> > > > mirror of the English blog.
> > > >
> > > > Marta
> > > >
> > > > [1]
> https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink
> > /
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/Various+Presentations
> > > >
> > > > On Mon, May 11, 2020 at 9:11 AM Yuan Mei 
> > wrote:
> > > >
> > > > > Hey folks,
> > > > >
> > > > > I am looking for a home to put and organize materials (mostly in
> > > Chinese)
> > > > > altogether for easier reference or search in the future.
> > > > >
> > > > > Those materials mainly include:
> > > > > - video records for live broadcasting of Flink related content.
> For
> > > > > example, the set of selected Flink Forward talks reinterpreted and
> > > > > broadcasted in Chinese.
> > > > > - regular meet up topics, minutes, and PPTs. Meet-ups are held
> twice
> > > > every
> > > > > month on average.
> > > > > - write-ups/blogs to share user experiences e.t.c.
> > > > >
> > > > > Currently, those materials are published through the official Flink
> > > > WeChat
> >

Re: What's the best practice to determine whether a job has finished or not?

For that attached YARN per-job mode, the cluster should wait until the
result of the job has been fetched. Can't this be used to report the final
status of the job?

Cheers,
Till

On Tue, May 12, 2020 at 9:52 AM Aljoscha Krettek 
wrote:

> Hi,
>
> The problem is that the JobClient is talking to the wrong system. In YARN
> per-job mode the cluster will only run as long as the job runs so there
> will be no-one there to respond with the job status after the job is
> finished.
>
> I think the solution is that the JobClient should talk to the right
> system, in the YARN example it should talk directly to YARN for figuring
> out the job status (and maybe for other things as well).
>
> Best,
> Aljoscha
>
> On Fri, May 8, 2020, at 15:05, Kurt Young wrote:
> > +dev 
> >
> > Best,
> > Kurt
> >
> >
> > On Fri, May 8, 2020 at 3:35 PM Caizhi Weng  wrote:
> >
> > > Hi Jeff,
> > >
> > > Thanks for the response. However I'm using executeAsync so that I can
> run
> > > the job asynchronously and get a JobClient to monitor the job.
> JobListener
> > > only works for synchronous execute method. Is there other way to
> achieve
> > > this?
> > >
> > > Jeff Zhang  于2020年5月8日周五 下午3:29写道：
> > >
> > >> I use JobListener#onJobExecuted to be notified that the flink job is
> > >> done.
> > >> It is pretty reliable for me, the only exception is the client
> process is
> > >> down.
> > >>
> > >> BTW, the reason you see ApplicationNotFound exception is that yarn app
> > >> is terminated which means the flink cluster is shutdown. While for
> > >> standalone mode, the flink cluster is always up.
> > >>
> > >>
> > >> Caizhi Weng  于2020年5月8日周五 下午2:47写道：
> > >>
> > >>> Hi dear Flink community,
> > >>>
> > >>> I would like to determine whether a job has finished (no matter
> > >>> successfully or exceptionally) in my code.
> > >>>
> > >>> I used to think that JobClient#getJobStatus is a good idea, but I
> found
> > >>> that it behaves quite differently under different executing
> environments.
> > >>> For example, under a standalone session cluster it will return the
> FINISHED
> > >>> status for a finished job, while under a yarn per job cluster it
> will throw
> > >>> a ApplicationNotFound exception. I'm afraid that there might be other
> > >>> behaviors for other environments.
> > >>>
> > >>> So what's the best practice to determine whether a job has finished
> or
> > >>> not? Note that I'm not waiting for the job to finish. If the job
> hasn't
> > >>> finished I would like to know it and do something else.
> > >>>
> > >>
> > >>
> > >> --
> > >> Best Regards
> > >>
> > >> Jeff Zhang
> > >>
> > >
> >
>

[RESULT] [VOTE] Release 1.10.1, release candidate #3

2020-05-12 Thread Yu Li

Hi everyone,

I'm happy to announce that we have unanimously approved this release.

There are 11 approving votes, 6 of which are binding:
* Robert Metzger (binding)
* Thomas Weise (binding)
* Tzu-Li (Gordon) Tai (binding)
* Ufuk Celebi (binding)
* Hequn Cheng (binding)
* Jincheng Sun (binding)
* Dian Fu
* Congxian Qiu
* Leonard Xu
* Zhu Zhu
* Wei Zhong

There are no disapproving votes.

Thanks everyone!

Cheers,
Yu

[jira] [Created] (FLINK-17632) Support to specify a remote path for job jar

2020-05-12 Thread Yang Wang (Jira)

Yang Wang created FLINK-17632:
-

 Summary: Support to specify a remote path for job jar
 Key: FLINK-17632
 URL: https://issues.apache.org/jira/browse/FLINK-17632
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission, Deployment / YARN
Reporter: Yang Wang


After FLINK-13938, we could support to register remote shared libs as Yarn 
local resources and prevent unnecessary uploading and downloading for system 
Flink jars.

However, we still need to specify a local user jar to run a Flink job on Yarn. 
This ticket aims to add the remote path support. It have at least two purposes.
 * Accelerate the submission
 * Better integration with application mode. Since the user main code is 
executed in the cluster(jobmanager), we do not need the user jar exists locally.

A very typical use case is like following.
{code:java}
./bin/flink run-application -p 10 -t yarn-application \
-yD yarn.provided.lib.dirs="hdfs://myhdfs/flink/lib" \ 
hdfs://myhdfs/jars/WindowJoin.jar{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-17633) Improve FactoryUtil to align with new format options keys

Jark Wu created FLINK-17633:
---

 Summary: Improve FactoryUtil to align with new format options keys
 Key: FLINK-17633
 URL: https://issues.apache.org/jira/browse/FLINK-17633
 Project: Flink
  Issue Type: Sub-task
Reporter: Jark Wu
Assignee: Jark Wu
 Fix For: 1.11.0


As discussed in 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Hierarchies-in-ConfigOption-td40920.html,
 we propose to use the following new format options: 

{code:java}
format = json
json.fail-on-missing-field = true
json.ignore-parse-error = true

value.format = json
value.json.fail-on-missing-field = true
value.json.ignore-parse-error = true
{code}

However, the current {{discoverScan/SinkFormat}} of {{FactoryUtil}} uses 
{{value.format.fail-on-missing-field}} as the key, which is not handy for 
connectors to use. 




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-17634) Reject multiple handler registrations under the same URL

2020-05-12 Thread Chesnay Schepler (Jira)

Chesnay Schepler created FLINK-17634:


 Summary: Reject multiple handler registrations under the same URL
 Key: FLINK-17634
 URL: https://issues.apache.org/jira/browse/FLINK-17634
 Project: Flink
  Issue Type: Bug
  Components: Runtime / REST
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.11.0


In FLINK-11853 a handler was added the is being registered under the same URL 
as another handler. This should never happen, and we should add a check to 
ensure this doesn't happen again.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

Hi Aljoscha,

Sorry for adding comments during the vote, but I have some really minor
suggestions that should not influence the voting thread imo.

1) Does it make sense to have the TimestampAssigner extend from Flink's
Function? This implies it has to be serializable which with the factory
pattern is not strictly necessary, right? BTW I really like that you
suggested the FunctionInterface annotation there.

2) Could we rename the IdentityTimestampAssigner to e.g.
RecordTimestampAssigner/SystemTimestampAssigner/NativeTimestampAssigner...
Personally I found the IdentityTimestampAssigner a bit misleading as it
usually mean a no-op. Which did not click for me, as I assumed it
somehow returns the incoming record itself.

3) Could we rename the second parameter of TimestampAssigner#extract to
e.g. recordTimestamp/nativeTimestamp? This is similar to the point
above. This parameter was also a bit confusing for me as I thought at
times its somehow related to
TimerService#currentProcessingTimestamp()/currentWatermark() as the
whole system currentTimestamp.

Other than those three points I like the proposal and I was about to
vote +1 if it was not for those three points.

Best,

Dawid

On 11/05/2020 16:57, Jark Wu wrote:
> Thanks for the explanation. I like the fatory pattern to make the member
> variables immutable and final.
>
> So +1 to the proposal.
>
> Best,
> Jark
>
> On Mon, 11 May 2020 at 22:01, Stephan Ewen  wrote:
>
>> I am fine with that.
>>
>> Much of the principles seem agreed upon. I understand the need to support
>> code-generated extractors and we should support most of it already (as
>> Aljoscha mentioned via the factories) can extend this if needed.
>>
>> I think that the factory approach supports code-generated extractors in a
>> cleaner way even than an extractor with an open/init method.
>>
>>
>> On Mon, May 11, 2020 at 3:38 PM Aljoscha Krettek 
>> wrote:
>>
>>> We're slightly running out of time. I would propose we vote on the basic
>>> principle and remain open to later additions. This feature is quite
>>> important to make the new Kafka Source that is developed as part of
>>> FLIP-27 useful. Otherwise we would have to use the legacy interfaces in
>>> the newly added connector.
>>>
>>> I know that's a bit unorthodox but would everyone be OK with what's
>>> currently there and then we iterate?
>>>
>>> Best,
>>> Aljoscha
>>>
>>> On 11.05.20 13:57, Aljoscha Krettek wrote:
 Ah, I meant to write this in my previous email, sorry about that.

 The WatermarkStrategy, which is basically a factory for a
 WatermarkGenerator is the replacement for the open() method. This is
>> the
 same strategy that was followed for StreamOperatorFactory, which was
 introduced to allow code generation in the Table API [1]. If we need
 metrics or other things we would add that as a parameter to the factory
 method. What do you think?

 Best,
 Aljoscha

 [1] https://issues.apache.org/jira/browse/FLINK-11974

 On 10.05.20 05:07, Jark Wu wrote:
> Hi,
>
> Regarding to the `open()/close()`, I think it's necessary for
> Table&SQL to
> compile the generated code.
> In Table&SQL, the watermark strategy and event-timestamp is defined
>>> using
> SQL expressions, we will
> translate and generate Java code for the expressions. If we have
> `open()/close()`, we don't need lazy initialization.
> Besides that, I can see a need to report some metrics, e.g. the
>> current
> watermark, the dirty timestamps (null value), etc.
> So I think a simple `open()/close()` with a context which can get
> MetricGroup is nice and not complex for the first version.
>
> Best,
> Jark
>
>
>
> On Sun, 10 May 2020 at 00:50, Stephan Ewen  wrote:
>
>> Thanks, Aljoscha, for picking this up.
>>
>> I agree with the approach of doing the here proposed set of changes
>> for
>> now. It already makes things simpler and adds idleness support
>> everywhere.
>>
>> Rich functions and state always add complexity, let's do this in a
>> next
>> step, if we have a really compelling case.
>>
>>
>> On Wed, Apr 29, 2020 at 7:24 PM Aljoscha Krettek <
>> aljos...@apache.org>
>> wrote:
>>
>>> Regarding the WatermarkGenerator (WG) interface itself. The proposal
>>> is
>>> basically to turn emitting into a "flatMap", we give the
>>> WatermarkGenerator a "collector" (the WatermarkOutput) and the WG
>> can
>>> decide whether to output a watermark or not and can also mark the
>>> output
>>> as idle. Changing the interface to return a Watermark (as the
>> previous
>>> watermark assigner interface did) would not allow that flexibility.
>>>
>>> Regarding checkpointing the watermark and keeping track of the
>> minimum
>>> watermark, this would be the responsibility of the framework (or the
>>> KafkaConsumer in the current implementation). The user-suppl

[jira] [Created] (FLINK-17635) Add documentation about view support

2020-05-12 Thread Kurt Young (Jira)

Kurt Young created FLINK-17635:
--

 Summary: Add documentation about view support 
 Key: FLINK-17635
 URL: https://issues.apache.org/jira/browse/FLINK-17635
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Kurt Young
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-17636) SingleInputGateTest.testConcurrentReadStateAndProcessAndClose: Trying to read from released RecoveredInputChannel

2020-05-12 Thread Robert Metzger (Jira)

Robert Metzger created FLINK-17636:
--

 Summary: 
SingleInputGateTest.testConcurrentReadStateAndProcessAndClose: Trying to read 
from released RecoveredInputChannel
 Key: FLINK-17636
 URL: https://issues.apache.org/jira/browse/FLINK-17636
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Affects Versions: 1.11.0
Reporter: Robert Metzger


CI: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=1080&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=4ed44b66-cdd6-5dcf-5f6a-88b07dda665d

{code}
2020-05-12T11:39:28.7058732Z [ERROR] Tests run: 22, Failures: 0, Errors: 1, 
Skipped: 0, Time elapsed: 2.643 s <<< FAILURE! - in 
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGateTest
2020-05-12T11:39:28.7066377Z [ERROR] 
testConcurrentReadStateAndProcessAndClose(org.apache.flink.runtime.io.network.partition.consumer.SingleInputGateTest)
  Time elapsed: 0.032 s  <<< ERROR!
2020-05-12T11:39:28.7067491Z java.util.concurrent.ExecutionException: 
java.lang.IllegalStateException: Trying to read from released 
RecoveredInputChannel
2020-05-12T11:39:28.7068238Zat 
java.util.concurrent.FutureTask.report(FutureTask.java:122)
2020-05-12T11:39:28.7068795Zat 
java.util.concurrent.FutureTask.get(FutureTask.java:192)
2020-05-12T11:39:28.7069538Zat 
org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannelTest.submitTasksAndWaitForResults(RemoteInputChannelTest.java:1228)
2020-05-12T11:39:28.7070595Zat 
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGateTest.testConcurrentReadStateAndProcessAndClose(SingleInputGateTest.java:235)
2020-05-12T11:39:28.7075974Zat 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2020-05-12T11:39:28.7076784Zat 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2020-05-12T11:39:28.7077522Zat 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2020-05-12T11:39:28.7078212Zat 
java.lang.reflect.Method.invoke(Method.java:498)
2020-05-12T11:39:28.7078846Zat 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
2020-05-12T11:39:28.7079607Zat 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
2020-05-12T11:39:28.7080383Zat 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
2020-05-12T11:39:28.7081173Zat 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
2020-05-12T11:39:28.7081937Zat 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
2020-05-12T11:39:28.7082708Zat 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
2020-05-12T11:39:28.7083422Zat 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
2020-05-12T11:39:28.7084148Zat 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
2020-05-12T11:39:28.7084933Zat 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
2020-05-12T11:39:28.7085562Zat 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
2020-05-12T11:39:28.7086162Zat 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
2020-05-12T11:39:28.7086806Zat 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
2020-05-12T11:39:28.7087434Zat 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
2020-05-12T11:39:28.7088036Zat 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
2020-05-12T11:39:28.7088647Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
2020-05-12T11:39:28.7089328Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
2020-05-12T11:39:28.7090106Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
2020-05-12T11:39:28.7090811Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
2020-05-12T11:39:28.7091674Zat 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
2020-05-12T11:39:28.7102178Zat 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
2020-05-12T11:39:28.7103048Zat 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
2020-05-12T11:39:28.7103701Zat 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
2020-05-12T11:39:28.7104367Z Caused by: java.lang.IllegalStateException: Trying 
to read from released RecoveredInputChannel
2020-05-12T11:39:28.7105513Zat 
org.apache.flink.util.Preconditions.checkState(Preconditions.java:195)
2020-05-12T11:39:28.7106574Zat 
org.apache.flink.runtime.io.network.partition.consumer

[jira] [Created] (FLINK-17637) HadoopS3RecoverableWriterITCase fails with Expected exception: java.io.FileNotFoundException

2020-05-12 Thread Robert Metzger (Jira)

Robert Metzger created FLINK-17637:
--

 Summary: HadoopS3RecoverableWriterITCase fails with Expected 
exception: java.io.FileNotFoundException
 Key: FLINK-17637
 URL: https://issues.apache.org/jira/browse/FLINK-17637
 Project: Flink
  Issue Type: Bug
  Components: FileSystems
Reporter: Robert Metzger


{code}
[ERROR] Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 70.885 
s <<< FAILURE! - in org.apache.flink.fs.s3hadoop.HadoopS3RecoverableWriterITCase
[ERROR] 
testCleanupRecoverableState(org.apache.flink.fs.s3hadoop.HadoopS3RecoverableWriterITCase)
  Time elapsed: 4.864 s  <<< FAILURE!
java.lang.AssertionError: Expected exception: java.io.FileNotFoundException
at 
org.junit.internal.runners.statements.ExpectException.evaluate(ExpectException.java:32)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)

{code}[ERROR] Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
70.885 s <<< FAILURE! - in 
org.apache.flink.fs.s3hadoop.HadoopS3RecoverableWriterITCase
[ERROR] 
testCleanupRecoverableState(org.apache.flink.fs.s3hadoop.HadoopS3RecoverableWriterITCase)
  Time elapsed: 4.864 s  <<< FAILURE!
java.lang.AssertionError: Expected exception: java.io.FileNotFoundException
at 
org.junit.internal.runners.statements.ExpectException.evaluate(ExpectException.java:32)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execu

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-12 Thread Stephan Ewen

+1 to all of Dawid's suggestions, makes a lot of sense to me

On Tue, May 12, 2020 at 2:32 PM Dawid Wysakowicz 
wrote:

> Hi Aljoscha,
>
> Sorry for adding comments during the vote, but I have some really minor
> suggestions that should not influence the voting thread imo.
>
> 1) Does it make sense to have the TimestampAssigner extend from Flink's
> Function? This implies it has to be serializable which with the factory
> pattern is not strictly necessary, right? BTW I really like that you
> suggested the FunctionInterface annotation there.
>
> 2) Could we rename the IdentityTimestampAssigner to e.g.
> RecordTimestampAssigner/SystemTimestampAssigner/NativeTimestampAssigner...
> Personally I found the IdentityTimestampAssigner a bit misleading as it
> usually mean a no-op. Which did not click for me, as I assumed it
> somehow returns the incoming record itself.
>
> 3) Could we rename the second parameter of TimestampAssigner#extract to
> e.g. recordTimestamp/nativeTimestamp? This is similar to the point
> above. This parameter was also a bit confusing for me as I thought at
> times its somehow related to
> TimerService#currentProcessingTimestamp()/currentWatermark() as the
> whole system currentTimestamp.
>
> Other than those three points I like the proposal and I was about to
> vote +1 if it was not for those three points.
>
> Best,
>
> Dawid
>
> On 11/05/2020 16:57, Jark Wu wrote:
> > Thanks for the explanation. I like the fatory pattern to make the member
> > variables immutable and final.
> >
> > So +1 to the proposal.
> >
> > Best,
> > Jark
> >
> > On Mon, 11 May 2020 at 22:01, Stephan Ewen  wrote:
> >
> >> I am fine with that.
> >>
> >> Much of the principles seem agreed upon. I understand the need to
> support
> >> code-generated extractors and we should support most of it already (as
> >> Aljoscha mentioned via the factories) can extend this if needed.
> >>
> >> I think that the factory approach supports code-generated extractors in
> a
> >> cleaner way even than an extractor with an open/init method.
> >>
> >>
> >> On Mon, May 11, 2020 at 3:38 PM Aljoscha Krettek 
> >> wrote:
> >>
> >>> We're slightly running out of time. I would propose we vote on the
> basic
> >>> principle and remain open to later additions. This feature is quite
> >>> important to make the new Kafka Source that is developed as part of
> >>> FLIP-27 useful. Otherwise we would have to use the legacy interfaces in
> >>> the newly added connector.
> >>>
> >>> I know that's a bit unorthodox but would everyone be OK with what's
> >>> currently there and then we iterate?
> >>>
> >>> Best,
> >>> Aljoscha
> >>>
> >>> On 11.05.20 13:57, Aljoscha Krettek wrote:
>  Ah, I meant to write this in my previous email, sorry about that.
> 
>  The WatermarkStrategy, which is basically a factory for a
>  WatermarkGenerator is the replacement for the open() method. This is
> >> the
>  same strategy that was followed for StreamOperatorFactory, which was
>  introduced to allow code generation in the Table API [1]. If we need
>  metrics or other things we would add that as a parameter to the
> factory
>  method. What do you think?
> 
>  Best,
>  Aljoscha
> 
>  [1] https://issues.apache.org/jira/browse/FLINK-11974
> 
>  On 10.05.20 05:07, Jark Wu wrote:
> > Hi,
> >
> > Regarding to the `open()/close()`, I think it's necessary for
> > Table&SQL to
> > compile the generated code.
> > In Table&SQL, the watermark strategy and event-timestamp is defined
> >>> using
> > SQL expressions, we will
> > translate and generate Java code for the expressions. If we have
> > `open()/close()`, we don't need lazy initialization.
> > Besides that, I can see a need to report some metrics, e.g. the
> >> current
> > watermark, the dirty timestamps (null value), etc.
> > So I think a simple `open()/close()` with a context which can get
> > MetricGroup is nice and not complex for the first version.
> >
> > Best,
> > Jark
> >
> >
> >
> > On Sun, 10 May 2020 at 00:50, Stephan Ewen  wrote:
> >
> >> Thanks, Aljoscha, for picking this up.
> >>
> >> I agree with the approach of doing the here proposed set of changes
> >> for
> >> now. It already makes things simpler and adds idleness support
> >> everywhere.
> >>
> >> Rich functions and state always add complexity, let's do this in a
> >> next
> >> step, if we have a really compelling case.
> >>
> >>
> >> On Wed, Apr 29, 2020 at 7:24 PM Aljoscha Krettek <
> >> aljos...@apache.org>
> >> wrote:
> >>
> >>> Regarding the WatermarkGenerator (WG) interface itself. The
> proposal
> >>> is
> >>> basically to turn emitting into a "flatMap", we give the
> >>> WatermarkGenerator a "collector" (the WatermarkOutput) and the WG
> >> can
> >>> decide whether to output a watermark or not and can also mark the
> >>> output
> >>> as

Re: [VOTE] FLIP-126: FLIP-126: Unify (and separate) Watermark Assigners

2020-05-12 Thread Stephan Ewen

+1 from me

There are discussions about minor changes (names and serializability)
pending, but these should not conflict with the design here.

On Tue, May 12, 2020 at 10:01 AM Aljoscha Krettek 
wrote:

> Hi all,
>
> I would like to start the vote for FLIP-126 [1], which is discussed and
> reached a consensus in the discussion thread [2].
>
> The vote will be open until May 15th (72h), unless there is an objection
> or not enough votes.
>
> Best,
> Aljoscha
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-126%3A+Unify+%28and+separate%29+Watermark+Assigners
>
> [2]
> https://lists.apache.org/thread.html/r7988ddfe5ca8d85e666039cf6240e1007a2ca337a52108f684b66d90%40%3Cdev.flink.apache.org%3E
>

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

Definitely +1 to point 2) raised by Dawid. I'm not sure on points 1) and 3).

1) I can see the benefit of that but in reality most timestamp assigners
will probably need to be Serializable. If you look at my (updated) POC
branch [1] you can see how a TimestampAssigner would be specified on the
WatermarkStrategies helper class: [2]. The signature of this would have
to be changed to something like:

public & Serializable>
WatermarkStrategies withTimestampAssigner(TA timestampAssigner)

Then, however, it would not be possible for users to specify a lambda or
anonymous inner function for the TimestampAssigner like this:

WatermarkStrategy testWmStrategy = WatermarkStrategies
.forGenerator(new PeriodicTestWatermarkGenerator())
.withTimestampAssigner((event, timestamp) -> event)
.build();

3) This makes sense if we only allow WatermarkStrategies on sources,
where the previous timestamp really is the "native" timestamp.
Currently, we also allow setting watermark strategies at arbitrary
points in the graph. I'm thinking we probably should only allow that in
sources but it's not the reality currently. I'm not against renaming it,
just voicing those thoughts.

Best,
Aljoscha

[1] https://github.com/aljoscha/flink/tree/flink-xxx-wm-generators-rebased
[2]
https://github.com/aljoscha/flink/blob/flink-xxx-wm-generators-rebased/flink-core/src/main/java/org/apache/flink/api/common/eventtime/WatermarkStrategies.java#L81

On 12.05.20 15:48, Stephan Ewen wrote:

+1 to all of Dawid's suggestions, makes a lot of sense to me

On Tue, May 12, 2020 at 2:32 PM Dawid Wysakowicz
wrote:

Hi Aljoscha,

Sorry for adding comments during the vote, but I have some really minor
suggestions that should not influence the voting thread imo.

1) Does it make sense to have the TimestampAssigner extend from Flink's
Function? This implies it has to be serializable which with the factory
pattern is not strictly necessary, right? BTW I really like that you
suggested the FunctionInterface annotation there.

2) Could we rename the IdentityTimestampAssigner to e.g.
RecordTimestampAssigner/SystemTimestampAssigner/NativeTimestampAssigner...
Personally I found the IdentityTimestampAssigner a bit misleading as it
usually mean a no-op. Which did not click for me, as I assumed it
somehow returns the incoming record itself.

3) Could we rename the second parameter of TimestampAssigner#extract to
e.g. recordTimestamp/nativeTimestamp? This is similar to the point
above. This parameter was also a bit confusing for me as I thought at
times its somehow related to
TimerService#currentProcessingTimestamp()/currentWatermark() as the
whole system currentTimestamp.

Other than those three points I like the proposal and I was about to
vote +1 if it was not for those three points.

Best,

Dawid

On 11/05/2020 16:57, Jark Wu wrote:

Thanks for the explanation. I like the fatory pattern to make the member
variables immutable and final.

So +1 to the proposal.

Best,
Jark

On Mon, 11 May 2020 at 22:01, Stephan Ewen wrote:

I am fine with that.

Much of the principles seem agreed upon. I understand the need to

support

code-generated extractors and we should support most of it already (as
Aljoscha mentioned via the factories) can extend this if needed.

I think that the factory approach supports code-generated extractors in

cleaner way even than an extractor with an open/init method.

On Mon, May 11, 2020 at 3:38 PM Aljoscha Krettek
wrote:

We're slightly running out of time. I would propose we vote on the

basic

principle and remain open to later additions. This feature is quite
important to make the new Kafka Source that is developed as part of
FLIP-27 useful. Otherwise we would have to use the legacy interfaces in
the newly added connector.

I know that's a bit unorthodox but would everyone be OK with what's
currently there and then we iterate?

Best,
Aljoscha

On 11.05.20 13:57, Aljoscha Krettek wrote:

Ah, I meant to write this in my previous email, sorry about that.

The WatermarkStrategy, which is basically a factory for a
WatermarkGenerator is the replacement for the open() method. This is

the

same strategy that was followed for StreamOperatorFactory, which was
introduced to allow code generation in the Table API [1]. If we need
metrics or other things we would add that as a parameter to the

factory

method. What do you think?

Best,
Aljoscha

[1] https://issues.apache.org/jira/browse/FLINK-11974

On 10.05.20 05:07, Jark Wu wrote:

Hi,

Regarding to the `open()/close()`, I think it's necessary for
Table&SQL to
compile the generated code.
In Table&SQL, the watermark strategy and event-timestamp is defined

using

SQL expressions, we will
translate and generate Java code for the expressions. If we have
`open()/close()`, we don't need lazy initialization.
Besides that, I can see a need to report some metrics, e.g. the

current

watermark, the dirty timestamps (null value), etc.
So I thi

[jira] [Created] (FLINK-17638) FlinkKafkaConsumerBase restore from empty state will be set consum from earliest forced

2020-05-12 Thread chenchuangchuang (Jira)

chenchuangchuang created FLINK-17638:


 Summary: FlinkKafkaConsumerBase restore from empty state will be 
set consum from earliest forced
 Key: FLINK-17638
 URL: https://issues.apache.org/jira/browse/FLINK-17638
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.10.0, 1.9.3, 1.9.0
 Environment: Flink 1.9.0

kafka 1.1.0

jdk 1.8
Reporter: chenchuangchuang


my work target and data  is like this    :
 # i need count the number of post per user create last 30 days in my system
 # the total and realtime data is in MYSQL
 # i can get increment MYSQL binlog  from  kafka-1.1.1 ( it just  store the 
last 7 days binlog), the topic name is "binlog_post_topic"
 # so , i have to combine the MYSQL data and the binlog data

 

i do it in this way:
 # first , i carry a snapshot of MYSQL data to kafka  topic in order of 
create_time ( topic name is "init-post-topic"), and consume from kafka topic    
"init-post-topic" as flink data-stream with the SlidingEventTimeWindows
 # second, after the task do all the data in the topic "init-post-topic" , i 
create a save point for the task , call the save point  save-point-a
 # third, i modify my code ,
 ## the data source is "binlog_post_topic"  topic of kafka ,
 ## other operotor will not change,
 ## and the "binlog_post_topic"  is setted consuming  from  special timestamp 
(when the snapshot of MYSQL create )
 # forth, i restart my task from save-point-a

but i find the kafka consumer for the "binlog_post_topic" do not consume data 
from the timestamp i setted, but from the earlist,  i find the log in the task 
manager 
{code:java}
//代码占位符
2020-05-11 17:20:47,228 INFO  
org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase  - Consumer 
subtask 0 restored state: {}.

...
2020-05-12 20:14:52,641 INFO  
org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase  - Consumer 
subtask 0 will start reading 1 partitions with offsets in restored state: 
{KafkaTopicPartition{topic='binlog-kk_social-post', partition=0}=-915623761775}
2020-05-11 17:20:47,414 INFO  
org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase  - Consumer 
subtask 0 creating fetcher with offsets 
{KafkaTopicPartition{topic='binlog-kk_social-post', partition=0}=-915623761775}.


{code}

i guess this may be caused by the FlinkKafkaConsumerBase
then i find code like this 

in the method FlinkKafkaConsumerBase.initializeState()
{code:java}
//代码占位符
if (context.isRestored() && !restoredFromOldState) {
   restoredState = new TreeMap<>(new KafkaTopicPartition.Comparator());
{code}
this code mean that  if a task is restart from the save point ,that 
restoredState will not be null, at least be an empty TreeMap;

and in FlinkKafkaConsumerBase.open()
{code:java}
//代码占位符
if (restoredState != null) {
   for (KafkaTopicPartition partition : allPartitions) {
  if (!restoredState.containsKey(partition)) {
 restoredState.put(partition, 
KafkaTopicPartitionStateSentinel.EARLIEST_OFFSET);
  }
   }
{code}
in this place will init the consumer , if a task is restart from a save-point , 
restoredState at least  is an empty TreeMap, then in this code , the consumer 
will be setted consume from 

KafkaTopicPartitionStateSentinel.EARLIEST_OFFSET

i change this code like this 
{code:java}
//代码占位符
if (restoredState != null && !restoredState.isEmpty()) {


{code}
 

and this work well for me .

 

 

 

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-12 Thread Stephan Ewen

@Aljoscha

About (1) could we have an interface SerializableTimestampAssigner that
simply mixes in the java.io.Serializable interface? Or will this be too
clumsy?

About (3) RecordTimeStamp seems to fit both cases (in-source-record
timestamp, in stream-record timestamp).

On Tue, May 12, 2020 at 4:12 PM Aljoscha Krettek 
wrote:

> Definitely +1 to point 2) raised by Dawid. I'm not sure on points 1) and
> 3).
>
> 1) I can see the benefit of that but in reality most timestamp assigners
> will probably need to be Serializable. If you look at my (updated) POC
> branch [1] you can see how a TimestampAssigner would be specified on the
> WatermarkStrategies helper class: [2]. The signature of this would have
> to be changed to something like:
>
> public  & Serializable>
> WatermarkStrategies withTimestampAssigner(TA timestampAssigner)
>
> Then, however, it would not be possible for users to specify a lambda or
> anonymous inner function for the TimestampAssigner like this:
>
> WatermarkStrategy testWmStrategy = WatermarkStrategies
> .forGenerator(new PeriodicTestWatermarkGenerator())
> .withTimestampAssigner((event, timestamp) -> event)
> .build();
>
> 3) This makes sense if we only allow WatermarkStrategies on sources,
> where the previous timestamp really is the "native" timestamp.
> Currently, we also allow setting watermark strategies at arbitrary
> points in the graph. I'm thinking we probably should only allow that in
> sources but it's not the reality currently. I'm not against renaming it,
> just voicing those thoughts.
>
> Best,
> Aljoscha
>
>
> [1] https://github.com/aljoscha/flink/tree/flink-xxx-wm-generators-rebased
> [2]
>
> https://github.com/aljoscha/flink/blob/flink-xxx-wm-generators-rebased/flink-core/src/main/java/org/apache/flink/api/common/eventtime/WatermarkStrategies.java#L81
>
> On 12.05.20 15:48, Stephan Ewen wrote:
> > +1 to all of Dawid's suggestions, makes a lot of sense to me
> >
> > On Tue, May 12, 2020 at 2:32 PM Dawid Wysakowicz  >
> > wrote:
> >
> >> Hi Aljoscha,
> >>
> >> Sorry for adding comments during the vote, but I have some really minor
> >> suggestions that should not influence the voting thread imo.
> >>
> >> 1) Does it make sense to have the TimestampAssigner extend from Flink's
> >> Function? This implies it has to be serializable which with the factory
> >> pattern is not strictly necessary, right? BTW I really like that you
> >> suggested the FunctionInterface annotation there.
> >>
> >> 2) Could we rename the IdentityTimestampAssigner to e.g.
> >>
> RecordTimestampAssigner/SystemTimestampAssigner/NativeTimestampAssigner...
> >> Personally I found the IdentityTimestampAssigner a bit misleading as it
> >> usually mean a no-op. Which did not click for me, as I assumed it
> >> somehow returns the incoming record itself.
> >>
> >> 3) Could we rename the second parameter of TimestampAssigner#extract to
> >> e.g. recordTimestamp/nativeTimestamp? This is similar to the point
> >> above. This parameter was also a bit confusing for me as I thought at
> >> times its somehow related to
> >> TimerService#currentProcessingTimestamp()/currentWatermark() as the
> >> whole system currentTimestamp.
> >>
> >> Other than those three points I like the proposal and I was about to
> >> vote +1 if it was not for those three points.
> >>
> >> Best,
> >>
> >> Dawid
> >>
> >> On 11/05/2020 16:57, Jark Wu wrote:
> >>> Thanks for the explanation. I like the fatory pattern to make the
> member
> >>> variables immutable and final.
> >>>
> >>> So +1 to the proposal.
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>> On Mon, 11 May 2020 at 22:01, Stephan Ewen  wrote:
> >>>
>  I am fine with that.
> 
>  Much of the principles seem agreed upon. I understand the need to
> >> support
>  code-generated extractors and we should support most of it already (as
>  Aljoscha mentioned via the factories) can extend this if needed.
> 
>  I think that the factory approach supports code-generated extractors
> in
> >> a
>  cleaner way even than an extractor with an open/init method.
> 
> 
>  On Mon, May 11, 2020 at 3:38 PM Aljoscha Krettek  >
>  wrote:
> 
> > We're slightly running out of time. I would propose we vote on the
> >> basic
> > principle and remain open to later additions. This feature is quite
> > important to make the new Kafka Source that is developed as part of
> > FLIP-27 useful. Otherwise we would have to use the legacy interfaces
> in
> > the newly added connector.
> >
> > I know that's a bit unorthodox but would everyone be OK with what's
> > currently there and then we iterate?
> >
> > Best,
> > Aljoscha
> >
> > On 11.05.20 13:57, Aljoscha Krettek wrote:
> >> Ah, I meant to write this in my previous email, sorry about that.
> >>
> >> The WatermarkStrategy, which is basically a factory for a
> >> WatermarkGenerator is the replacement for the

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

I have similar thoughts to @Stephan

Ad. 1 I tried something like this on your branch:

    /**
     * Adds the given {@link TimestampAssigner} to this {@link
WatermarkStrategies}. For top-level classes that implement both
Serializable and TimestampAssigner
     */
    public  & Serializable>
WatermarkStrategies withTimestampAssigner(TA timestampAssigner) {
        checkNotNull(timestampAssigner, "timestampAssigner");
        this.timestampAssigner = timestampAssigner;
        return this;
    }   

   @FunctionalInterface
    public interface SerializableTimestampAssigner extends
TimestampAssigner, Serializable {
    }   

 /**
      * Adds the given {@link TimestampAssigner} to this {@link
WatermarkStrategies}.
 * Helper method for serializable lambdas.
     */
    public WatermarkStrategies
withTimestampAssigner(SerializableTimestampAssigner timestampAssigner) {
        checkNotNull(timestampAssigner, "timestampAssigner");
        this.timestampAssigner = timestampAssigner;
        return this;
    }

But I understand if that's too hacky. It's just a pity that we must
enforce limitations on an interface that are not strictly necessary.

Ad 2/3

I am aware the watermark assigner/timestamp extractor can be applied
further down the graph. Originally I also wanted to suggest
sourceTimestamp and SourceTimestampAssigner, but then I realized it can
be used also after the sources as you correctly pointed out. Even if the
TimestampAssigner is used after the source there might be some
native/record timestamp in the StreamRecord, that could've been
extracted by previous assigner.

Best,

Dawid

On 12/05/2020 16:47, Stephan Ewen wrote:
> @Aljoscha
>
> About (1) could we have an interface SerializableTimestampAssigner that
> simply mixes in the java.io.Serializable interface? Or will this be too
> clumsy?
>
> About (3) RecordTimeStamp seems to fit both cases (in-source-record
> timestamp, in stream-record timestamp).
>
> On Tue, May 12, 2020 at 4:12 PM Aljoscha Krettek 
> wrote:
>
>> Definitely +1 to point 2) raised by Dawid. I'm not sure on points 1) and
>> 3).
>>
>> 1) I can see the benefit of that but in reality most timestamp assigners
>> will probably need to be Serializable. If you look at my (updated) POC
>> branch [1] you can see how a TimestampAssigner would be specified on the
>> WatermarkStrategies helper class: [2]. The signature of this would have
>> to be changed to something like:
>>
>> public  & Serializable>
>> WatermarkStrategies withTimestampAssigner(TA timestampAssigner)
>>
>> Then, however, it would not be possible for users to specify a lambda or
>> anonymous inner function for the TimestampAssigner like this:
>>
>> WatermarkStrategy testWmStrategy = WatermarkStrategies
>> .forGenerator(new PeriodicTestWatermarkGenerator())
>> .withTimestampAssigner((event, timestamp) -> event)
>> .build();
>>
>> 3) This makes sense if we only allow WatermarkStrategies on sources,
>> where the previous timestamp really is the "native" timestamp.
>> Currently, we also allow setting watermark strategies at arbitrary
>> points in the graph. I'm thinking we probably should only allow that in
>> sources but it's not the reality currently. I'm not against renaming it,
>> just voicing those thoughts.
>>
>> Best,
>> Aljoscha
>>
>>
>> [1] https://github.com/aljoscha/flink/tree/flink-xxx-wm-generators-rebased
>> [2]
>>
>> https://github.com/aljoscha/flink/blob/flink-xxx-wm-generators-rebased/flink-core/src/main/java/org/apache/flink/api/common/eventtime/WatermarkStrategies.java#L81
>>
>> On 12.05.20 15:48, Stephan Ewen wrote:
>>> +1 to all of Dawid's suggestions, makes a lot of sense to me
>>>
>>> On Tue, May 12, 2020 at 2:32 PM Dawid Wysakowicz >>
>>> wrote:
>>>
 Hi Aljoscha,

 Sorry for adding comments during the vote, but I have some really minor
 suggestions that should not influence the voting thread imo.

 1) Does it make sense to have the TimestampAssigner extend from Flink's
 Function? This implies it has to be serializable which with the factory
 pattern is not strictly necessary, right? BTW I really like that you
 suggested the FunctionInterface annotation there.

 2) Could we rename the IdentityTimestampAssigner to e.g.

>> RecordTimestampAssigner/SystemTimestampAssigner/NativeTimestampAssigner...
 Personally I found the IdentityTimestampAssigner a bit misleading as it
 usually mean a no-op. Which did not click for me, as I assumed it
 somehow returns the incoming record itself.

 3) Could we rename the second parameter of TimestampAssigner#extract to
 e.g. recordTimestamp/nativeTimestamp? This is similar to the point
 above. This parameter was also a bit confusing for me as I thought at
 times its somehow related to
 TimerService#currentProcessingTimestamp()/currentWatermark() as the
 whole system currentTimestamp.

 Other than those three points I like the pr

[jira] [Created] (FLINK-17639) Document which FileSystems are supported by the StreamingFileSink

2020-05-12 Thread Kostas Kloudas (Jira)

Kostas Kloudas created FLINK-17639:
--

 Summary: Document which FileSystems are supported by the 
StreamingFileSink
 Key: FLINK-17639
 URL: https://issues.apache.org/jira/browse/FLINK-17639
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.10.1
Reporter: Kostas Kloudas


This issue targets at documenting which of the supported filesystems in Flink 
are also supported by the {{StreamingFileSink}}. Currently there is no such 
documentation, and, for example, users of Azure can only figure out at runtime 
that their FS is not supported.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

Yes, I am also ok with a SerializableTimestampAssigner. This only looks 
a bit clumsy in the API but as a user (that uses lambdas) you should not 
see this. I pushed changes for this to my branch: 
https://github.com/aljoscha/flink/tree/flink-xxx-wm-generators-rebased


And yes, recordTimestamp sounds good for the TimestampAssigner. I admit 
I didn't read this well enough and only saw nativeTimestamp.


Best,
Aljoscha

On 12.05.20 17:16, Dawid Wysakowicz wrote:

I have similar thoughts to @Stephan

Ad. 1 I tried something like this on your branch:

     /**
      * Adds the given {@link TimestampAssigner} to this {@link
WatermarkStrategies}. For top-level classes that implement both
Serializable and TimestampAssigner
      */
     public  & Serializable>
WatermarkStrategies withTimestampAssigner(TA timestampAssigner) {
         checkNotNull(timestampAssigner, "timestampAssigner");
         this.timestampAssigner = timestampAssigner;
         return this;
     }

    @FunctionalInterface
     public interface SerializableTimestampAssigner extends
TimestampAssigner, Serializable {
     }

  /**
       * Adds the given {@link TimestampAssigner} to this {@link
WatermarkStrategies}.
  * Helper method for serializable lambdas.
      */
     public WatermarkStrategies
withTimestampAssigner(SerializableTimestampAssigner timestampAssigner) {
         checkNotNull(timestampAssigner, "timestampAssigner");
         this.timestampAssigner = timestampAssigner;
         return this;
     }

But I understand if that's too hacky. It's just a pity that we must
enforce limitations on an interface that are not strictly necessary.

Ad 2/3

I am aware the watermark assigner/timestamp extractor can be applied
further down the graph. Originally I also wanted to suggest
sourceTimestamp and SourceTimestampAssigner, but then I realized it can
be used also after the sources as you correctly pointed out. Even if the
TimestampAssigner is used after the source there might be some
native/record timestamp in the StreamRecord, that could've been
extracted by previous assigner.

Best,

Dawid

On 12/05/2020 16:47, Stephan Ewen wrote:

@Aljoscha

About (1) could we have an interface SerializableTimestampAssigner that
simply mixes in the java.io.Serializable interface? Or will this be too
clumsy?

About (3) RecordTimeStamp seems to fit both cases (in-source-record
timestamp, in stream-record timestamp).

On Tue, May 12, 2020 at 4:12 PM Aljoscha Krettek 
wrote:


Definitely +1 to point 2) raised by Dawid. I'm not sure on points 1) and
3).

1) I can see the benefit of that but in reality most timestamp assigners
will probably need to be Serializable. If you look at my (updated) POC
branch [1] you can see how a TimestampAssigner would be specified on the
WatermarkStrategies helper class: [2]. The signature of this would have
to be changed to something like:

public  & Serializable>
WatermarkStrategies withTimestampAssigner(TA timestampAssigner)

Then, however, it would not be possible for users to specify a lambda or
anonymous inner function for the TimestampAssigner like this:

WatermarkStrategy testWmStrategy = WatermarkStrategies
 .forGenerator(new PeriodicTestWatermarkGenerator())
 .withTimestampAssigner((event, timestamp) -> event)
 .build();

3) This makes sense if we only allow WatermarkStrategies on sources,
where the previous timestamp really is the "native" timestamp.
Currently, we also allow setting watermark strategies at arbitrary
points in the graph. I'm thinking we probably should only allow that in
sources but it's not the reality currently. I'm not against renaming it,
just voicing those thoughts.

Best,
Aljoscha


[1] https://github.com/aljoscha/flink/tree/flink-xxx-wm-generators-rebased
[2]

https://github.com/aljoscha/flink/blob/flink-xxx-wm-generators-rebased/flink-core/src/main/java/org/apache/flink/api/common/eventtime/WatermarkStrategies.java#L81

On 12.05.20 15:48, Stephan Ewen wrote:

+1 to all of Dawid's suggestions, makes a lot of sense to me

On Tue, May 12, 2020 at 2:32 PM Dawid Wysakowicz 
Hi Aljoscha,

Sorry for adding comments during the vote, but I have some really minor
suggestions that should not influence the voting thread imo.

1) Does it make sense to have the TimestampAssigner extend from Flink's
Function? This implies it has to be serializable which with the factory
pattern is not strictly necessary, right? BTW I really like that you
suggested the FunctionInterface annotation there.

2) Could we rename the IdentityTimestampAssigner to e.g.


RecordTimestampAssigner/SystemTimestampAssigner/NativeTimestampAssigner...

Personally I found the IdentityTimestampAssigner a bit misleading as it
usually mean a no-op. Which did not click for me, as I assumed it
somehow returns the incoming record itself.

3) Could we rename the second parameter of TimestampAssigner#extract to
e.g. recordTimestamp/nativeTimestamp? This is similar to the point
above. This paramet

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

Thank you for the update and sorry again for chiming in so late...

Best,

Dawid


On 12/05/2020 18:21, Aljoscha Krettek wrote:
> Yes, I am also ok with a SerializableTimestampAssigner. This only
> looks a bit clumsy in the API but as a user (that uses lambdas) you
> should not see this. I pushed changes for this to my branch:
> https://github.com/aljoscha/flink/tree/flink-xxx-wm-generators-rebased
>
> And yes, recordTimestamp sounds good for the TimestampAssigner. I
> admit I didn't read this well enough and only saw nativeTimestamp.
>
> Best,
> Aljoscha
>
> On 12.05.20 17:16, Dawid Wysakowicz wrote:
>> I have similar thoughts to @Stephan
>>
>> Ad. 1 I tried something like this on your branch:
>>
>>  /**
>>   * Adds the given {@link TimestampAssigner} to this {@link
>> WatermarkStrategies}. For top-level classes that implement both
>> Serializable and TimestampAssigner
>>   */
>>  public  & Serializable>
>> WatermarkStrategies withTimestampAssigner(TA timestampAssigner) {
>>      checkNotNull(timestampAssigner, "timestampAssigner");
>>      this.timestampAssigner = timestampAssigner;
>>      return this;
>>  }
>>
>>     @FunctionalInterface
>>  public interface SerializableTimestampAssigner extends
>> TimestampAssigner, Serializable {
>>  }
>>
>>   /**
>>    * Adds the given {@link TimestampAssigner} to this {@link
>> WatermarkStrategies}.
>>   * Helper method for serializable lambdas.
>>   */
>>  public WatermarkStrategies
>> withTimestampAssigner(SerializableTimestampAssigner
>> timestampAssigner) {
>>      checkNotNull(timestampAssigner, "timestampAssigner");
>>      this.timestampAssigner = timestampAssigner;
>>      return this;
>>  }
>>
>> But I understand if that's too hacky. It's just a pity that we must
>> enforce limitations on an interface that are not strictly necessary.
>>
>> Ad 2/3
>>
>> I am aware the watermark assigner/timestamp extractor can be applied
>> further down the graph. Originally I also wanted to suggest
>> sourceTimestamp and SourceTimestampAssigner, but then I realized it can
>> be used also after the sources as you correctly pointed out. Even if the
>> TimestampAssigner is used after the source there might be some
>> native/record timestamp in the StreamRecord, that could've been
>> extracted by previous assigner.
>>
>> Best,
>>
>> Dawid
>>
>> On 12/05/2020 16:47, Stephan Ewen wrote:
>>> @Aljoscha
>>>
>>> About (1) could we have an interface SerializableTimestampAssigner that
>>> simply mixes in the java.io.Serializable interface? Or will this be too
>>> clumsy?
>>>
>>> About (3) RecordTimeStamp seems to fit both cases (in-source-record
>>> timestamp, in stream-record timestamp).
>>>
>>> On Tue, May 12, 2020 at 4:12 PM Aljoscha Krettek 
>>> wrote:
>>>
 Definitely +1 to point 2) raised by Dawid. I'm not sure on points
 1) and
 3).

 1) I can see the benefit of that but in reality most timestamp
 assigners
 will probably need to be Serializable. If you look at my (updated) POC
 branch [1] you can see how a TimestampAssigner would be specified
 on the
 WatermarkStrategies helper class: [2]. The signature of this would
 have
 to be changed to something like:

 public  & Serializable>
 WatermarkStrategies withTimestampAssigner(TA timestampAssigner)

 Then, however, it would not be possible for users to specify a
 lambda or
 anonymous inner function for the TimestampAssigner like this:

 WatermarkStrategy testWmStrategy = WatermarkStrategies
  .forGenerator(new PeriodicTestWatermarkGenerator())
  .withTimestampAssigner((event, timestamp) -> event)
  .build();

 3) This makes sense if we only allow WatermarkStrategies on sources,
 where the previous timestamp really is the "native" timestamp.
 Currently, we also allow setting watermark strategies at arbitrary
 points in the graph. I'm thinking we probably should only allow
 that in
 sources but it's not the reality currently. I'm not against
 renaming it,
 just voicing those thoughts.

 Best,
 Aljoscha


 [1]
 https://github.com/aljoscha/flink/tree/flink-xxx-wm-generators-rebased
 [2]

 https://github.com/aljoscha/flink/blob/flink-xxx-wm-generators-rebased/flink-core/src/main/java/org/apache/flink/api/common/eventtime/WatermarkStrategies.java#L81


 On 12.05.20 15:48, Stephan Ewen wrote:
> +1 to all of Dawid's suggestions, makes a lot of sense to me
>
> On Tue, May 12, 2020 at 2:32 PM Dawid Wysakowicz
> 
> wrote:
>
>> Hi Aljoscha,
>>
>> Sorry for adding comments during the vote, but I have some really
>> minor
>> suggestions that should not influence the voting thread imo.
>>
>> 1) Does it make sense to have the TimestampAssigner extend from
>> Flink's
>> Function? Th

Re: [VOTE] FLIP-126: FLIP-126: Unify (and separate) Watermark Assigners

+1 from me

Best,

Dawid

On 12/05/2020 15:49, Stephan Ewen wrote:
> +1 from me
>
> There are discussions about minor changes (names and serializability)
> pending, but these should not conflict with the design here.
>
> On Tue, May 12, 2020 at 10:01 AM Aljoscha Krettek 
> wrote:
>
>> Hi all,
>>
>> I would like to start the vote for FLIP-126 [1], which is discussed and
>> reached a consensus in the discussion thread [2].
>>
>> The vote will be open until May 15th (72h), unless there is an objection
>> or not enough votes.
>>
>> Best,
>> Aljoscha
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-126%3A+Unify+%28and+separate%29+Watermark+Assigners
>>
>> [2]
>> https://lists.apache.org/thread.html/r7988ddfe5ca8d85e666039cf6240e1007a2ca337a52108f684b66d90%40%3Cdev.flink.apache.org%3E
>>



signature.asc
Description: OpenPGP digital signature

[jira] [Created] (FLINK-17640) RecoveredInputChannelTest.testConcurrentReadStateAndProcessAndRelease() failed

2020-05-12 Thread Arvid Heise (Jira)

Arvid Heise created FLINK-17640:
---

 Summary: 
RecoveredInputChannelTest.testConcurrentReadStateAndProcessAndRelease() failed
 Key: FLINK-17640
 URL: https://issues.apache.org/jira/browse/FLINK-17640
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network
Affects Versions: 1.11.0
Reporter: Arvid Heise
 Fix For: 1.11.0


Here is the 
[instance|[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=1093&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=4ed44b66-cdd6-5dcf-5f6a-88b07dda665d].]

Easy to reproduce locally by running the test a few 100 times.
{noformat}
java.util.concurrent.ExecutionException: java.lang.AssertionError   at 
java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at 
org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannelTest.submitTasksAndWaitForResults(RemoteInputChannelTest.java:1228)
at 
org.apache.flink.runtime.io.network.partition.consumer.RecoveredInputChannelTest.testConcurrentReadStateAndProcessAndRelease(RecoveredInputChannelTest.java:215)
at 
org.apache.flink.runtime.io.network.partition.consumer.RecoveredInputChannelTest.testConcurrentReadStateAndProcessAndRelease(RecoveredInputChannelTest.java:82)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
Caused by: java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.flink.runtime.io.network.partition.consumer.RecoveredInputChannelTest.lambda$processRecoveredBufferTask$1(RecoveredInputChannelTest.java:257)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748){noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[Discuss] Breaking API Change in 1.10.1

2020-05-12 Thread Seth Wiesman

Hi Everyone,

I realize I'm about 7 hours late but I just realized there is a breaking
API change in 1.10.1. FLINK-16684 changes the API of the streaming file
sink in a binary incompatible way. Since the release has been approved I'm
not sure the correct course of action but I wanted to bring this to the
communities attention.

Seth

https://issues.apache.org/jira/browse/FLINK-16684

Re: [Discuss] Breaking API Change in 1.10.1

2020-05-12 Thread Thomas Weise

We also noticed that and had to make an adjustment downstream.

It would be good to mention this in the release notes (if that's not
already the case).

Thomas

On Tue, May 12, 2020 at 10:06 AM Seth Wiesman  wrote:

> Hi Everyone,
>
> I realize I'm about 7 hours late but I just realized there is a breaking
> API change in 1.10.1. FLINK-16684 changes the API of the streaming file
> sink in a binary incompatible way. Since the release has been approved I'm
> not sure the correct course of action but I wanted to bring this to the
> communities attention.
>
> Seth
>
> https://issues.apache.org/jira/browse/FLINK-16684
>

[jira] [Created] (FLINK-17641) How to secure flink applications on yarn on multi-tenant environment

2020-05-12 Thread Ethan Li (Jira)

Ethan Li created FLINK-17641:


 Summary: How to secure flink applications on yarn on multi-tenant 
environment
 Key: FLINK-17641
 URL: https://issues.apache.org/jira/browse/FLINK-17641
 Project: Flink
  Issue Type: Wish
Reporter: Ethan Li


This is a question I wish to get some insights on. 

We are trying to support and secure flink on shared yarn cluster. Besides the 
security provided by yarn side (queueACL, kerberos), what I noticed is that 
flink CLI can still interact with the flink job as long as it knows the 
jobmanager rpc port/hostname and rest.port, which can be obtained easily with 
yarn command. 

Also on the UI side, on yarn cluster, users can visit flink job UI via yarn 
proxy using browser. As long as the user can authenticate and view yarn 
resourcemanager webpage, he/she can visit the flink UI without any problem. 
This basically means Flink UI is wide-open to corp internal users.

On the internal connection side, I am aware of the support added in 1.10 to 
limit the mTLS connection by configuring security.ssl.internal.cert.fingerprint 
(https://ci.apache.org/projects/flink/flink-docs-stable/ops/security-ssl.html)

This works but it is not very flexible. Users need to update the config if the 
cert changes before they submit a new job.

I asked the similar question on the mailing list before. I am really interested 
in how other folks deal with this issue. Thanks.












--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-17642) Exception while reading broken ORC file is hidden

2020-05-12 Thread Nikola (Jira)

Nikola created FLINK-17642:
--

 Summary: Exception while reading broken ORC file is hidden
 Key: FLINK-17642
 URL: https://issues.apache.org/jira/browse/FLINK-17642
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.10.1, 1.9.3, 1.8.3
Reporter: Nikola


I have a simple setup of a batch job like this:


{code:java}
BatchTableEnvironment tableEnvFirst = BatchTableEnvironment.create(env);

OrcTableSource orcTableSource = OrcTableSource.builder()
 .path(String.format("path"), true)
 .forOrcSchema(ORC.getSchema())
 .withConfiguration(hdfsConfig)
 .build();

tableEnvFirst.registerTableSource("table", orcTableSource);

Table nnfTable = tableEnvFirst.sqlQuery(sqlString);

return tableEnvFirst.toDataSet(nnfTable, Row.class);{code}
 

 

And that works just fine to fetch ORC files from hdfs as a DataSet.

However, there are some ORC files which are broken. "Broken" means that they 
are invalid in some way and cannot be processed / fetch normally. They throw 
exceptions. Examples of those are:


{code:java}
org.apache.orc.FileFormatException: Malformed ORC file /user/hdfs/orcfile-1 
Invalid postscript length 2org.apache.orc.FileFormatException: Malformed ORC 
file /user/hdfs/orcfile-1 Invalid postscript length 2 at 
org.apache.orc.impl.ReaderImpl.ensureOrcFooter(ReaderImpl.java:258) at 
org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:562) at 
org.apache.orc.impl.ReaderImpl.(ReaderImpl.java:370) at 
org.apache.orc.OrcFile.createReader(OrcFile.java:342) at 
org.apache.flink.orc.OrcRowInputFormat.open(OrcRowInputFormat.java:225) at 
org.apache.flink.orc.OrcRowInputFormat.open(OrcRowInputFormat.java:63) at 
org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:173)
 at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705) at 
org.apache.flink.runtime.taskmanager.Task.run(Task.java:530) at 
java.lang.Thread.run(Thread.java:748){code}
 

 

 
{code:java}
com.google.protobuf.InvalidProtocolBufferException: Protocol message contained 
an invalid tag (zero).com.google.protobuf.InvalidProtocolBufferException: 
Protocol message contained an invalid tag (zero). at 
com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
 at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108) at 
org.apache.orc.OrcProto$PostScript.(OrcProto.java:18526) at 
org.apache.orc.OrcProto$PostScript.(OrcProto.java:18490) at 
org.apache.orc.OrcProto$PostScript$1.parsePartialFrom(OrcProto.java:18628) at 
org.apache.orc.OrcProto$PostScript$1.parsePartialFrom(OrcProto.java:18623) at 
com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:89) at 
com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:95) at 
com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49) at 
org.apache.orc.OrcProto$PostScript.parseFrom(OrcProto.java:19022) at 
org.apache.orc.impl.ReaderImpl.extractPostScript(ReaderImpl.java:436) at 
org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:564) at 
org.apache.orc.impl.ReaderImpl.(ReaderImpl.java:370) at 
org.apache.orc.OrcFile.createReader(OrcFile.java:342) at 
org.apache.flink.orc.OrcRowInputFormat.open(OrcRowInputFormat.java:225) at 
org.apache.flink.orc.OrcRowInputFormat.open(OrcRowInputFormat.java:63) at 
org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:173)
 at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705) at 
org.apache.flink.runtime.taskmanager.Task.run(Task.java:530) at 
java.lang.Thread.run(Thread.java:748){code}
 

 

Given that some specific files are broken, that's OK to throw exception. 
However, the issue is that I cannot catch those exceptions and they make my job 
to fail. I tried to wrap everything in a try-catch block just to see what I can 
catch and handle, but it seems that when flink runs it, it's not run from that 
place, but rather from DataSourceTask.invoke()

I can digged a little bit to find out why don't I get an exception and I can 
see that {{OrcTableSource}} creates {{OrcRowInputFormat}} instance 
[here|[https://github.com/apache/flink/blob/c8a23c74e618b752bbdc58dca62d997ddd303d40/flink-formats/flink-orc/src/main/java/org/apache/flink/orc/OrcTableSource.java#L157]]
 which then calls open() and open() has this signature: 


{code:java}
public void open(FileInputSplit fileSplit) throws IOException {{code}
 

 

So the open() throws the exception but I am not able to catch it. 

Is what I am doing correct or is there any other way to handle exception coming 
from DataSourceTask.invoke()?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-17643) LaunchCoordinatorTest fails

2020-05-12 Thread Arvid Heise (Jira)

Arvid Heise created FLINK-17643:
---

 Summary: LaunchCoordinatorTest fails
 Key: FLINK-17643
 URL: https://issues.apache.org/jira/browse/FLINK-17643
 Project: Flink
  Issue Type: Bug
  Components: Deployment / Mesos
Reporter: Arvid Heise
 Fix For: 1.11.0


Here is the 
[instance|[https://dev.azure.com/arvidheise0209/arvidheise/_build/results?buildId=234&view=logs&j=764762df-f65b-572b-3d5c-65518c777be4&t=8d823410-c7c7-5a4d-68bb-fa7b08da17b9].]

 
{noformat}
[ERROR] Tests run: 24, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 1.828 
s <<< FAILURE! - in org.apache.flink.mesos.scheduler.LaunchCoordinatorTest
[ERROR] The LaunchCoordinator when in state GatheringOffers should handle 
StateTimeout which stays in GatheringOffers when task queue is 
non-empty(org.apache.flink.mesos.scheduler.LaunchCoordinatorTest)  Time 
elapsed: 0.021 s  <<< ERROR!
java.lang.IllegalStateException: cannot reserve actor name '$$u': terminating
at 
akka.actor.dungeon.ChildrenContainer$TerminatingChildrenContainer.reserve(ChildrenContainer.scala:188)
at akka.actor.dungeon.Children$class.reserveChild(Children.scala:135)
at akka.actor.ActorCell.reserveChild(ActorCell.scala:429)
at akka.testkit.TestActorRef.(TestActorRef.scala:33)
at akka.testkit.TestFSMRef.(TestFSMRef.scala:40)
at akka.testkit.TestFSMRef$.apply(TestFSMRef.scala:91)
at 
org.apache.flink.mesos.scheduler.LaunchCoordinatorTest$Context.(LaunchCoordinatorTest.scala:254)
at 
org.apache.flink.mesos.scheduler.LaunchCoordinatorTest$$anonfun$1$$anonfun$apply$mcV$sp$8$$anonfun$apply$mcV$sp$15$$anonfun$apply$mcV$sp$34$$anon$25.(LaunchCoordinatorTest.scala:459)
at 
org.apache.flink.mesos.scheduler.LaunchCoordinatorTest$$anonfun$1$$anonfun$apply$mcV$sp$8$$anonfun$apply$mcV$sp$15$$anonfun$apply$mcV$sp$34.apply(LaunchCoordinatorTest.scala:459)
at 
org.apache.flink.mesos.scheduler.LaunchCoordinatorTest$$anonfun$1$$anonfun$apply$mcV$sp$8$$anonfun$apply$mcV$sp$15$$anonfun$apply$mcV$sp$34.apply(LaunchCoordinatorTest.scala:459)
at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
at org.scalatest.Transformer.apply(Transformer.scala:22)
at org.scalatest.Transformer.apply(Transformer.scala:20)
at org.scalatest.WordSpecLike$$anon$1.apply(WordSpecLike.scala:1078)
at org.scalatest.TestSuite$class.withFixture(TestSuite.scala:196)
at 
org.apache.flink.mesos.scheduler.LaunchCoordinatorTest.withFixture(LaunchCoordinatorTest.scala:57)
at 
org.scalatest.WordSpecLike$class.invokeWithFixture$1(WordSpecLike.scala:1075)
at 
org.scalatest.WordSpecLike$$anonfun$runTest$1.apply(WordSpecLike.scala:1088)
at 
org.scalatest.WordSpecLike$$anonfun$runTest$1.apply(WordSpecLike.scala:1088)
at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
at org.scalatest.WordSpecLike$class.runTest(WordSpecLike.scala:1088)
at 
org.apache.flink.mesos.scheduler.LaunchCoordinatorTest.runTest(LaunchCoordinatorTest.scala:57)
at 
org.scalatest.WordSpecLike$$anonfun$runTests$1.apply(WordSpecLike.scala:1147)
at 
org.scalatest.WordSpecLike$$anonfun$runTests$1.apply(WordSpecLike.scala:1147)
at 
org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396)
at 
org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384)
at scala.collection.immutable.List.foreach(List.scala:392)
at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418){noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-17644) Add support for state TTL.

2020-05-12 Thread Igal Shilman (Jira)

Igal Shilman created FLINK-17644:


 Summary: Add support for state TTL.
 Key: FLINK-17644
 URL: https://issues.apache.org/jira/browse/FLINK-17644
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Reporter: Igal Shilman


The DataStream API supports[state 
TTL|[https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#state-time-to-live-ttl]],
 and it can be made accessible to stateful functions users.

To facilitate uses cases as described in 
[http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Window-processing-in-Stateful-Functions-td34966.html]

 

The proposed approach would extend the PersistedValue, PersistedTable and 
PersistedBuffer

with another constructor that accepts ExpireAfter object

which has:
 * java time duration
 * refresh on read (boolean)

(we should never return an expired entry)

In addition, we need to extend the remote function state to support state 
expiration. 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

changing the output files names in Streamfilesink from part-00 to something else

2020-05-12 Thread dhurandar S

We want to change the name of the file being generated as the output of our
StreamFileSink.
, when files are generated they are named part-00*, is there a way that we
can change the name.

In Hadoop, we can change RecordWriters and MultipleOutputs. May I please
some help in this regard. This is causing blockers for us and will force us
t move to MR job

-- 
Thank you and regards,
Dhurandar

[jira] [Created] (FLINK-17645) REAPER_THREAD in SafetyNetCloseableRegistry start() failed, causing the repeated failover.

2020-05-12 Thread Zakelly Lan (Jira)

Zakelly Lan created FLINK-17645:
---

 Summary: REAPER_THREAD in SafetyNetCloseableRegistry start() 
failed, causing the repeated failover.
 Key: FLINK-17645
 URL: https://issues.apache.org/jira/browse/FLINK-17645
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Task
Affects Versions: 1.6.3
Reporter: Zakelly Lan


I'm running a modified version of Flink, and encountered the exception below 
when task start:

 
{code:java}
2020-05-12 00:46:19,037 ERROR [***] org.apache.flink.runtime.taskmanager.Task   
- Encountered an unexpected exception
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:802)
at 
org.apache.flink.core.fs.SafetyNetCloseableRegistry.(SafetyNetCloseableRegistry.java:73)
at 
org.apache.flink.core.fs.FileSystemSafetyNet.initializeSafetyNetForThread(FileSystemSafetyNet.java:89)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:586)
at java.lang.Thread.run(Thread.java:834)
2020-05-12 00:46:19,038 INFO  [***] org.apache.flink.runtime.taskmanager.Task 
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:802)
at 
org.apache.flink.core.fs.SafetyNetCloseableRegistry.(SafetyNetCloseableRegistry.java:73)
at 
org.apache.flink.core.fs.FileSystemSafetyNet.initializeSafetyNetForThread(FileSystemSafetyNet.java:89)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:586)
at java.lang.Thread.run(Thread.java:834)
{code}
 

The REAPER_THREAD.start() fails because of OOM, and REAPER_THREAD will never be 
null. Since then, every time SafetyNetCloseableRegistry init in this VM will 
cause an IllegalStateException:

 
{code:java}
java.lang.IllegalStateException
at 
org.apache.flink.util.Preconditions.checkState(Preconditions.java:179)
at 
org.apache.flink.core.fs.SafetyNetCloseableRegistry.(SafetyNetCloseableRegistry.java:71)
at 
org.apache.flink.core.fs.FileSystemSafetyNet.initializeSafetyNetForThread(FileSystemSafetyNet.java:89)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:586)
at java.lang.Thread.run(Thread.java:834){code}
 

This may happen in very old version of Flink as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-17646) Reduce the python package size of PyFlink

2020-05-12 Thread Wei Zhong (Jira)

Wei Zhong created FLINK-17646:
-

 Summary: Reduce the python package size of PyFlink
 Key: FLINK-17646
 URL: https://issues.apache.org/jira/browse/FLINK-17646
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong


Currently the python package size of PyFlink has increased to about 320MB, 
which exceeds the size limit of pypi.org (300MB). We need to remove unnecessary 
jars to reduce the package size.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-17647) Improve new connector options exception in old planner