[jira] [Created] (FLINK-19433) An Error example of FROM_UNIXTIME function in document

2020-09-27 Thread Kyle Zhang (Jira)
Kyle Zhang created FLINK-19433:
--

 Summary: An Error example of FROM_UNIXTIME function in document
 Key: FLINK-19433
 URL: https://issues.apache.org/jira/browse/FLINK-19433
 Project: Flink
  Issue Type: Bug
  Components: Documentation, Table SQL / API
Reporter: Kyle Zhang


In the documentation
[https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/functions/systemFunctions.html#temporal-functions]
there is an example for the FROM_UNIXTIME function:
{code:java}
E.g., FROM_UNIXTIME(44) returns '1970-01-01 09:00:44' if in UTC time zone, but 
returns '1970-01-01 09:00:44' if in 'Asia/Tokyo' time zone.
{code}
However, the correct result should be 1970-01-01 00:00:44 in the UTC time zone 
('1970-01-01 09:00:44' is the result for the 'Asia/Tokyo' time zone only).
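
The expected values can be verified with plain java.time, independent of Flink 
(a minimal sketch; FROM_UNIXTIME interprets its argument as seconds since the 
epoch):

{code:java}
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
Instant t = Instant.ofEpochSecond(44);
System.out.println(fmt.format(t.atZone(ZoneId.of("UTC"))));        // 1970-01-01 00:00:44
System.out.println(fmt.format(t.atZone(ZoneId.of("Asia/Tokyo")))); // 1970-01-01 09:00:44
{code}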

--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19434) Add StreamJobGraphGenerator support for source chaining

2020-09-27 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-19434:
---

 Summary: Add StreamJobGraphGenerator support for source chaining
 Key: FLINK-19434
 URL: https://issues.apache.org/jira/browse/FLINK-19434
 Project: Flink
  Issue Type: Sub-task
Reporter: Caizhi Weng


Add changes around {{StreamJobGraphGenerator}} to generate the job graph with 
source chaining, so that both the DataStream and Table APIs can take advantage 
of it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-33: Standardize connector metrics

2020-09-27 Thread Stephan Ewen
+1

On Mon, Sep 21, 2020 at 11:48 AM Yu Li  wrote:

> +1
>
> I could see this is a well written document after a long and thorough
> discussion [1]. Thanks for driving this all along, Becket!
>
> Best Regards,
> Yu
>
> [1]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-33-Standardize-connector-metrics-td26869.html
>
>
> On Mon, 21 Sep 2020 at 17:39, jincheng sun 
> wrote:
>
> > +1, Thanks for driving this, Becket!
> >
> > Best,
> > Jincheng
> >
> >
> > Becket Qin wrote on Mon, Sep 21, 2020 at 11:31 AM:
> >
> > > Hi all,
> > >
> > > I would like to start the voting thread for FLIP-33 which proposes to
> > > standardize the metrics of Flink connectors.
> > >
> > > In short, we would like to provide a convention and guidance for Flink
> > > connector metrics. It will help simplify monitoring and alerting on
> > > Flink jobs. The FLIP link is as follows:
> > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics
> > >
> > > The vote will be open for at least 72 hours.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> >
>


Re: [DISCUSS] FLIP-142: Disentangle StateBackends from Checkpointing

2020-09-27 Thread Stephan Ewen
Good to see this FLIP moving.

From what I understand, the remaining questions are mainly about how to
express the roles of the CheckpointStorage and specifically the behavior of
JMCheckpointStorage and FsCheckpointStorage in the docs.
This sounds like details we should discuss over the concrete text proposals
in the PR.
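
For context, a minimal sketch of how the disentangled configuration could look
in user code, using the class names from the FLIP-142 proposal (the final API
may well differ):

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// local working state, chosen independently of durability
env.setStateBackend(new HashMapStateBackend());
// durability of checkpoint snapshots, configured separately
env.getCheckpointConfig().setCheckpointStorage(
        new FileSystemCheckpointStorage("hdfs:///flink/checkpoints"));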

On Sun, Sep 27, 2020 at 5:38 AM Yu Li  wrote:

> Thanks Seth, the updated FLIP overall LGTM, and I've left some inline
> comments in the doc.
>
> Best Regards,
> Yu
>
>
> On Fri, 25 Sep 2020 at 20:58, Seth Wiesman  wrote:
>
> > Done
> >
> > Seth
> >
> > On Fri, Sep 25, 2020 at 2:47 AM Yu Li  wrote:
> >
> > > *bq. I think it might help to highlight specific stumbling blocks users
> > > have today and why I believe this change addresses those issues.*
> > > Thanks for adding more details, I believe adding these blocks to the
> FLIP
> > > doc could make the motivation more vivid and convincing.
> > >
> > > *bq. To be concrete I think the JavaDoc for setCheckpointStorage would
> be
> > > something like...*
> > > I can see this definition extracts the existing description from the
> > > current `StateBackend` interface; it's a valid option. Let me quote it
> > > again:
> > > - CheckpointStorage defines how checkpoint snapshots are persisted for
> > > fault tolerance. Various implementations store their checkpoints in
> > > different fashions and have different requirements and availability
> > > guarantees.
> > > - JobManagerCheckpointStorage stores checkpoints in the memory of the
> > > JobManager. It is lightweight and without additional dependencies but
> is
> > > not highly available.
> > > - FileSystemCheckpointStorage stores checkpoints in a file system. For
> > > systems like HDFS, NFS Drives, S3, and GCS, this storage policy
> supports
> > > large state size, in the magnitude of many terabytes while providing a
> > > highly available foundation for stateful applications. This checkpoint
> > > storage policy is recommended for most production deployments.
> > >
> > > Sticking to this definition, I think we should have the below migration
> > > plans for existing backends:
> > > - `MemoryStateBackend(null, String savepointPath)` to
> > > `HashMapStateBackend() + JobManagerCheckpointStorage()`
> > > - `MemoryStateBackend(String checkpointPath, String savepointPath)` to
> > > `HashMapStateBackend() + FileSystemCheckpointStorage()`
> > > in addition to the existing:
> > > - `MemoryStateBackend()` to `HashMapStateBackend() +
> > > JobManagerCheckpointStorage()`
> > > and could be summarized as:
> > > - MemoryStateBackend with checkpoint path: `HashMapStateBackend() +
> > > FileSystemCheckpointStorage()`
> > > - MemoryStateBackend w/o checkpoint path: `HashMapStateBackend() +
> > > JobManagerCheckpointStorage()`
> > >
> > > And I believe adding the above highlighted blocks to the FLIP doc (the
> > "New
> > > StateBackend User API" and "Migration Plan" sections, separately) could
> > > make it more complete.
> > >
> > > PS. Please note that although the current javadoc of `StateBackend` states
> > > "MemoryStateBackend is not highly available", it actually supports
> > > persisting the checkpoint data to DFS when a checkpoint path is given, so
> > > the mapping between old and new APIs is not that straightforward and needs
> > > some clarification, from my point of view.
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Fri, 25 Sep 2020 at 08:33, Seth Wiesman 
> wrote:
> > >
> > > > Hi Yu,
> > > >
> > > > *bq. I thought the FLIP aims at resolving some existing confusion, i.e.
> > > > the durability mystery to users.*
> > > >
> > > > I think it might help to highlight specific stumbling blocks users
> have
> > > > today and why I believe this change addresses those issues. Some
> > frequent
> > > > things I've heard over the past several years include:
> > > >
> > > > 1) "We use RocksDB because we don't need fault tolerance."
> > > > 2) "We don't use RocksDB because we don't want to manage an external
> > > > database."
> > > > 3) Believing RocksDB is reading and writing directly with S3 or HDFS
> > (vs.
> > > > local disk)
> > > > 4) Believing FsStateBackend spills to disk or has anything to do with
> > the
> > > > local filesystem
> > > > 5) Pointing RocksDB at network-attached storage, believing that the
> > state
> > > > backend needs to be fault-tolerant
> > > >
> > > > This question from the ml is very representative of where users are
> > > > struggling[1]. Many of these questions were not from new users but
> from
> > > > organizations that were in production! Just yesterday I was on the
> > phone
> > > > with a company that didn't realize they were in production without
> > > > checkpointing; honestly, you would be shocked how often this happens.
> > The
> > > current state backend abstraction is too complex for many of our
> users.
> > > What
> > > > all these questions have in common is misunderstanding the
> relationship
> > > between how data is stored locally

[jira] [Created] (FLINK-19435) jdbc JDBCOutputFormat open function invoke Class.forName(drivername)

2020-09-27 Thread xiaodao (Jira)
xiaodao created FLINK-19435:
---

 Summary: jdbc JDBCOutputFormat open function invoke 
Class.forName(drivername)
 Key: FLINK-19435
 URL: https://issues.apache.org/jira/browse/FLINK-19435
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / JDBC
Affects Versions: 1.10.2
Reporter: xiaodao
 Fix For: 1.10.3


When we sink data to multiple JDBC output formats concurrently, the current 
implementation

{code:java}
protected void establishConnection() throws SQLException, ClassNotFoundException {
    Class.forName(drivername);
    if (username == null) {
        connection = DriverManager.getConnection(dbURL);
    } else {
        connection = DriverManager.getConnection(dbURL, username, password);
    }
}
{code}

may cause a JDBC driver deadlock. It needs to be changed to a synchronized method.
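
A minimal sketch of such a fix (the static lock is an assumption; an
instance-level synchronized method would not serialize driver loading across
separate output format instances):

{code:java}
private static final Object DRIVER_LOAD_LOCK = new Object();

protected void establishConnection() throws SQLException, ClassNotFoundException {
    // serialize driver loading across all JDBCOutputFormat instances
    synchronized (DRIVER_LOAD_LOCK) {
        Class.forName(drivername);
    }
    if (username == null) {
        connection = DriverManager.getConnection(dbURL);
    } else {
        connection = DriverManager.getConnection(dbURL, username, password);
    }
}
{code}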



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-145: Support SQL windowing table-valued function

2020-09-27 Thread liupengcheng
Hi Jark,
Thanks for the reply. Yes, I think it's a good feature; it can improve the 
NRT scenarios you mentioned in the FLIP. I also think it can improve 
streaming SQL greatly, supporting richer window operations in Flink SQL and 
bringing great convenience to users (we currently only support the group 
window in Flink).

Regarding the SESSION window, I think it's especially useful for user 
behavior analysis (e.g. counting user visits on a news website or social 
platform), but I agree that we can keep it out of the FLIP for now to catch 
up with 1.12.

Recently, I've done some work on the stream planner with the TVFs, and 
I'm willing to contribute
to this part. Is it in the plan of this FLIP?

Best,
PengchengLiu


On 2020/9/26, 11:09 PM, "Jark Wu" wrote:

Hi pengcheng,

It's great to see you also have the need for window join.
You are right, the windowing TVF is a powerful feature which can support
more operations in the future.
I think of it as the date-time "partition" selection in batch SQL jobs; with
this new syntax, I think it is possible to migrate traditional batch SQL jobs
to Flink SQL by changing only a few lines.

Regarding the SESSION window, it is kept out of the FLIP on purpose, because
we want to keep the FLIP small to catch up with 1.12, and a SESSION TVF is
rarely useful (e.g. session window join?).

Best,
Jark

On Fri, 25 Sep 2020 at 22:59, liupengcheng 
wrote:

> Hi, Jark,
> I'm very interested in this feature, and I'm also working on this
> recently.
> I just had a glance at the FLIP; it's good, but I found that
> there is no plan to add SESSION windows.
> Also, I think there can be more things we can do based on this new
> syntax. For example,
> - window sort support
> - window union/intersect/minus support
> - Improve dimension table join
> We can have a deeper discussion on this new feature later.
> I've also recently opened a JIRA that is related to this feature:
> https://issues.apache.org/jira/browse/FLINK-18830
>
> Best!
> PengchengLiu
>
> On 2020/9/25, 10:30 PM, "Jark Wu" wrote:
>
> Hi everyone,
>
> I want to start a FLIP about supporting windowing table-valued
> functions
> (TVF).
> The main purpose of this FLIP is to improve the near real-time (NRT)
> experience of Flink.
>
> FLIP-145:
>
> 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-145%3A+Support+SQL+windowing+table-valued+function
>
> We want to introduce TUMBLE, HOP, and CUMULATE windowing TVFs; CUMULATE is
> a new kind of window.
> With the windowing TVFs, we can support richer operations on windows,
> including window join, window TopN and so on.
> This makes things simple: we only need to assign windows at the
> beginning
> of the query, and then apply operations after that like traditional
> batch
> SQL.
> We hope it can help to reduce the learning curve of windows, improve
> NRT
> for Flink, and attract more batch users.
>
> A simple code snippet for 10 minutes tumbling window aggregate:
>
> SELECT window_start, window_end, SUM(price)
> FROM TABLE(
> TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))
> GROUP BY window_start, window_end;
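>
> And a sketch of the proposed CUMULATE variant, which emits early results
> every 2 minutes within a growing window of up to 10 minutes (argument
> order taken from the FLIP):
>
> SELECT window_start, window_end, SUM(price)
> FROM TABLE(
> CUMULATE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '2' MINUTES, INTERVAL '10' MINUTES))
> GROUP BY window_start, window_end;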
>
> I'm looking forward to your feedback.
>
> Best,
> Jark
>
>
>




.ORC file reading from Apache Flink +SCALA

2020-09-27 Thread Ajay Kumar
Hi Team,

I am struggling to read an .ORC file as a source in Apache Flink.
Unfortunately, there is also no good reference on the internet.

Appreciate your help !

Thanks in advance !

Regards
Ajay


Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-27 Thread Steven Wu
It is more about extended outages of the metastore. E.g., if we commit every 2
minutes, a 4-hour metastore outage can lead to over 120 GlobalCommT items.
During metastore outages, it is undesirable for streaming jobs to
fail and keep restarting. It is better to keep processing records
(avoiding backlog) and uploading to DFS (like S3); the commit will succeed
whenever the metastore comes back. It also provides a nice automatic
recovery story. Since a GlobalCommT combines all data files (hundreds or
thousands in one checkpoint cycle) into a single item in state, this really
makes it scalable and efficient to deal with extended metastore outages.

"CommitResult commit(GlobalCommitT)" API can work, although it is less
efficient and flexible for some sinks. It is probably better to let sink
implementations decide what is the best retry behavior: one by one vs a big
batch/transaction. Hence I would propose APIs like these.
--
interface GlobalCommitter {
  // commit all pending GlobalCommitT items accumulated
  CommitResult commit(List)
}

interface CommitResult {
  List getSucceededCommitables();
  List getFailedCommitables();

  // most likely, framework just need to check and roll over the retryable
list to the next commit try
  List getRetrableCommittables();
}
---
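
As a usage sketch, the framework side could then simply roll the retryable
items over (hypothetical framework pseudologic, not part of the proposal):

List<GlobalCommT> pending = new ArrayList<>(restoredCommittables);
CommitResult<GlobalCommT> result = globalCommitter.commit(pending);
// succeeded items are dropped, failed items surface as errors, and the
// retryable items carry over into the next commit attempt
pending = new ArrayList<>(result.getRetryableCommittables());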

Anyway, I am going to vote yes on the voting thread, since it is important
to move forward to meet the 1.12 goal. We can also discuss the small tweak
during the implementation phase.

Thanks,
Steven


On Sat, Sep 26, 2020 at 8:46 PM Guowei Ma  wrote:

> Hi Steven
>
> Thank you very much for your detailed explanation.
>
> Now I get your point; I can see that there are benefits to committing a
> collection of `GlobalCommT` as a whole when the external metastore
> environment is unstable.
>
> But I have two small concerns about committing the collection
> of `GlobalCommT`:
>
> 1. For Option 1: CommitResult commit(List<GlobalCommT>). This option
> implies that users should commit the collection of `GlobalCommT` as a
> whole.
> But maybe not every system can do it as a whole; for example, changing
> some file names cannot. If that is the case, I think someone
> would always ask the same question I asked in the previous mail.
>
> 2. For Option 2: List<GlobalCommT> commit(List<GlobalCommT>). This option
> is clearer than the first one. But IMHO this option only has benefits
> when the external metastore is unstable and we want to retry many times and
> not fail the job. Maybe we should not retry so many times and end up with a
> lot of uncommitted `GlobalCommT` items. If that is the case, maybe we should
> make the API more clear/simple for the normal scenario. In addition, there
> is only one GlobalCommitter instance, so I think the external system could
> bear the pressure.
>
> So personally I would like to say we might keep the API simpler at the
> beginning in 1.12
>
> What do you think?
>
> Best,
> Guowei
>
>
> On Fri, Sep 25, 2020 at 9:30 PM Steven Wu  wrote:
>
> > I should clarify my last email a little more.
> >
> > For the example where commits for checkpoints 1-100 failed, the job is
> > still up (processing records and uploading files). When the commit for
> > checkpoint 101 comes, IcebergSink would prefer the framework to pass in
> > all 101 GlobalCommT items (100 old + 1 new) so that it can commit all of
> > them in one transaction. It is more efficient than 101 separate
> > transactions.
> >
> > Maybe the GlobalCommitter#commit semantics is to give the sink all
> > uncommitted GlobalCommT items and let sink implementation decide whether
> to
> > retry one by one or in a single transaction. It could mean that we need
> to
> > expand the CommitResult (e.g. a list for each result type, SUCCESS,
> > FAILURE, RETRY) interface. We can also start with the simple enum style
> > result for the whole list for now. If we need to break the experimental
> > API, it is also not a big deal since we only need to update a few sink
> > implementations.
> >
> > Thanks,
> > Steven
> >
> > On Fri, Sep 25, 2020 at 5:56 AM Steven Wu  wrote:
> >
> > > > 1. The framework cannot know which `GlobalCommT` to retry if we use
> > > > List<GlobalCommT> as the parameter when `commit` returns `RETRY`.
> > > > 2. Of course we can let the `commit` return more detailed info but it
> > > might
> > > > be too complicated.
> > >
> > > If commit(List<GlobalCommT>) returns RETRY, it means the whole list
> > > needs
> > > to be retried. E.g. we have some outage with metadata service, commits
> > for
> > > checkpoints 1-100 failed. We can accumulate 100 GlobalCommT items. we
> > don't
> > > want to commit them one by one. It is faster to commit the whole list
> as
> > > one batch.
> > >
> > > > 3. On the other hand, I think only when restoring IcebergSink needs a
> > > > collection of `GlobalCommT` and giving back another collection of
> > > > `GlobalCommT` that are not committed
> > >
> > > That is when the job restarted due to failure or deployment.
> > >
> > >
> >

Re: [VOTE] FLIP-143: Unified Sink API

2020-09-27 Thread Steven Wu
+1 (non-binding)

Although I would love to continue the discussion about tweaking the
CommitResult/GlobalCommitter interface, maybe during the implementation phase.

On Fri, Sep 25, 2020 at 5:35 AM Aljoscha Krettek 
wrote:

> +1 (binding)
>
> Aljoscha
>
> On 25.09.20 14:26, Guowei Ma wrote:
> > From the discussion [1] we can see that the FLIP focuses on providing a
> > unified transactional sink API. So I updated the FLIP's title to "Unified
> > Transactional Sink API". But I found that the old link could not be
> > opened again.
> >
> > I would update the link[2] here. Sorry for the inconvenience.
> >
> > [1]
> >
> https://lists.apache.org/thread.html/rf09dfeeaf35da5ee98afe559b5a6e955c9f03ade0262727f6b5c4c1e%40%3Cdev.flink.apache.org%3E
> > [2] https://cwiki.apache.org/confluence/x/KEJ4CQ
> >
> > Best,
> > Guowei
> >
> >
> > On Thu, Sep 24, 2020 at 8:13 PM Guowei Ma  wrote:
> >
> >> Hi, all
> >>
> >> After the discussion in [1], I would like to open a voting thread for
> >> FLIP-143 [2], which proposes a unified sink api.
> >>
> >> The vote will be open until September 29th (72h + weekend), unless there
> >> is an objection or not enough votes.
> >>
> >> [1]
> >>
> https://lists.apache.org/thread.html/rf09dfeeaf35da5ee98afe559b5a6e955c9f03ade0262727f6b5c4c1e%40%3Cdev.flink.apache.org%3E
> >> [2]
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> >>
> >> Best,
> >> Guowei
> >>
> >
>
>


Re: [VOTE] FLIP-143: Unified Sink API

2020-09-27 Thread Kostas Kloudas
+1 (binding)

@Steven Wu I think there will be opportunities to fine tune the API
during the implementation.

Cheers,
Kostas

On Sun, Sep 27, 2020 at 7:56 PM Steven Wu  wrote:
>
> +1 (non-binding)
>
> Although I would love to continue the discussion about tweaking the
> CommitResult/GlobalCommitter interface, maybe during the implementation phase.
>
> On Fri, Sep 25, 2020 at 5:35 AM Aljoscha Krettek 
> wrote:
>
> > +1 (binding)
> >
> > Aljoscha
> >
> > On 25.09.20 14:26, Guowei Ma wrote:
> > > From the discussion [1] we can see that the FLIP focuses on providing a
> > > unified transactional sink API. So I updated the FLIP's title to "Unified
> > > Transactional Sink API". But I found that the old link could not be
> > > opened again.
> > >
> > > I would update the link[2] here. Sorry for the inconvenience.
> > >
> > > [1]
> > >
> > https://lists.apache.org/thread.html/rf09dfeeaf35da5ee98afe559b5a6e955c9f03ade0262727f6b5c4c1e%40%3Cdev.flink.apache.org%3E
> > > [2] https://cwiki.apache.org/confluence/x/KEJ4CQ
> > >
> > > Best,
> > > Guowei
> > >
> > >
> > > On Thu, Sep 24, 2020 at 8:13 PM Guowei Ma  wrote:
> > >
> > >> Hi, all
> > >>
> > >> After the discussion in [1], I would like to open a voting thread for
> > >> FLIP-143 [2], which proposes a unified sink api.
> > >>
> > >> The vote will be open until September 29th (72h + weekend), unless there
> > >> is an objection or not enough votes.
> > >>
> > >> [1]
> > >>
> > https://lists.apache.org/thread.html/rf09dfeeaf35da5ee98afe559b5a6e955c9f03ade0262727f6b5c4c1e%40%3Cdev.flink.apache.org%3E
> > >> [2]
> > >>
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > >>
> > >> Best,
> > >> Guowei
> > >>
> > >
> >
> >


Re: [VOTE] FLIP-33: Standardize connector metrics

2020-09-27 Thread Becket Qin
Hi all,

Thanks everyone for voting. Here is the result:

binding +1s: 4 (Stephan, Yu, Jincheng, Becket)
There are no -1s or non-binding +1s.

FLIP-33 has been accepted.

Thanks,

Jiangjie (Becket) Qin

On Sun, Sep 27, 2020 at 7:18 PM Stephan Ewen  wrote:

> +1
>
> On Mon, Sep 21, 2020 at 11:48 AM Yu Li  wrote:
>
> > +1
> >
> > I could see this is a well written document after a long and thorough
> > discussion [1]. Thanks for driving this all along, Becket!
> >
> > Best Regards,
> > Yu
> >
> > [1]
> >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-33-Standardize-connector-metrics-td26869.html
> >
> >
> > On Mon, 21 Sep 2020 at 17:39, jincheng sun 
> > wrote:
> >
> > > +1, Thanks for driving this, Becket!
> > >
> > > Best,
> > > Jincheng
> > >
> > >
> > > Becket Qin wrote on Mon, Sep 21, 2020 at 11:31 AM:
> > >
> > > > Hi all,
> > > >
> > > > I would like to start the voting thread for FLIP-33 which proposes to
> > > > standardize the metrics of Flink connectors.
> > > >
> > > > In short, we would like to provide a convention and guidance for Flink
> > > > connector metrics. It will help simplify monitoring and alerting on
> > > > Flink jobs. The FLIP link is as follows:
> > > >
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics
> > > >
> > > > The vote will be open for at least 72 hours.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > >
> >
>


[jira] [Created] (FLINK-19436) TPC-DS end-to-end test (Blink planner) failed during shutdown

2020-09-27 Thread Dian Fu (Jira)
Dian Fu created FLINK-19436:
---

 Summary: TPC-DS end-to-end test (Blink planner) failed during 
shutdown
 Key: FLINK-19436
 URL: https://issues.apache.org/jira/browse/FLINK-19436
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.11.0
Reporter: Dian Fu


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=7009&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=2b7514ee-e706-5046-657b-3430666e7bd9

{code}
2020-09-27T22:37:53.2236467Z Stopping taskexecutor daemon (pid: 2992) on host 
fv-az655.
2020-09-27T22:37:53.4450715Z Stopping standalonesession daemon (pid: 2699) on 
host fv-az655.
2020-09-27T22:37:53.8014537Z Skipping taskexecutor daemon (pid: 11173), because 
it is not running anymore on fv-az655.
2020-09-27T22:37:53.8019740Z Skipping taskexecutor daemon (pid: 11561), because 
it is not running anymore on fv-az655.
2020-09-27T22:37:53.8022857Z Skipping taskexecutor daemon (pid: 11849), because 
it is not running anymore on fv-az655.
2020-09-27T22:37:53.8023616Z Skipping taskexecutor daemon (pid: 12180), because 
it is not running anymore on fv-az655.
2020-09-27T22:37:53.8024327Z Skipping taskexecutor daemon (pid: 12950), because 
it is not running anymore on fv-az655.
2020-09-27T22:37:53.8025027Z Skipping taskexecutor daemon (pid: 13472), because 
it is not running anymore on fv-az655.
2020-09-27T22:37:53.8025727Z Skipping taskexecutor daemon (pid: 16577), because 
it is not running anymore on fv-az655.
2020-09-27T22:37:53.8026417Z Skipping taskexecutor daemon (pid: 16959), because 
it is not running anymore on fv-az655.
2020-09-27T22:37:53.8027086Z Skipping taskexecutor daemon (pid: 17250), because 
it is not running anymore on fv-az655.
2020-09-27T22:37:53.8027770Z Skipping taskexecutor daemon (pid: 17601), because 
it is not running anymore on fv-az655.
2020-09-27T22:37:53.8028400Z Stopping taskexecutor daemon (pid: 18438) on host 
fv-az655.
2020-09-27T22:37:53.8029314Z 
/home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT/bin/taskmanager.sh:
 line 99: 18438 Terminated  "${FLINK_BIN_DIR}"/flink-daemon.sh 
$STARTSTOP $ENTRYPOINT "${ARGS[@]}"
2020-09-27T22:37:53.8029895Z [FAIL] Test script contains errors.
2020-09-27T22:37:53.8032092Z Checking for errors...
2020-09-27T22:37:55.3713368Z No errors in log files.
2020-09-27T22:37:55.3713935Z Checking for exceptions...
2020-09-27T22:37:56.9046391Z No exceptions in log files.
2020-09-27T22:37:56.9047333Z Checking for non-empty .out files...
2020-09-27T22:37:56.9064402Z No non-empty .out files.
2020-09-27T22:37:56.9064859Z 
2020-09-27T22:37:56.9065588Z [FAIL] 'TPC-DS end-to-end test (Blink planner)' 
failed after 16 minutes and 54 seconds! Test exited with exit code 1
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Apply for permission of jira contributor

2020-09-27 Thread Xie Billy
Hi Guys,

I want to contribute to Apache Flink.
Would you please give me contributor permissions?
My JIRA ID is billyxie. Thanks a lot.



Best Regards!
Billy xie(谢志民)


[jira] [Created] (FLINK-19437) FileSourceTextLinesITCase.testContinuousTextFileSource failed with "SimpleStreamFormat is not splittable, but found split end (0) different from file length (198)"

2020-09-27 Thread Dian Fu (Jira)
Dian Fu created FLINK-19437:
---

 Summary: FileSourceTextLinesITCase.testContinuousTextFileSource 
failed with "SimpleStreamFormat is not splittable, but found split end (0) 
different from file length (198)"
 Key: FLINK-19437
 URL: https://issues.apache.org/jira/browse/FLINK-19437
 Project: Flink
  Issue Type: Bug
  Components: Connectors / FileSystem
Affects Versions: 1.12.0
Reporter: Dian Fu


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=7008&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=62110053-334f-5295-a0ab-80dd7e2babbf

{code}
2020-09-27T21:58:38.9199090Z [ERROR] 
testContinuousTextFileSource(org.apache.flink.connector.file.src.FileSourceTextLinesITCase)
  Time elapsed: 0.517 s  <<< ERROR!
2020-09-27T21:58:38.9199619Z java.lang.RuntimeException: Failed to fetch next 
result
2020-09-27T21:58:38.9200118Zat 
org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:106)
2020-09-27T21:58:38.9200722Zat 
org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:77)
2020-09-27T21:58:38.9201290Zat 
org.apache.flink.streaming.api.datastream.DataStreamUtils.collectRecordsFromUnboundedStream(DataStreamUtils.java:150)
2020-09-27T21:58:38.9201920Zat 
org.apache.flink.connector.file.src.FileSourceTextLinesITCase.testContinuousTextFileSource(FileSourceTextLinesITCase.java:136)
2020-09-27T21:58:38.9202570Zat 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2020-09-27T21:58:38.9203054Zat 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2020-09-27T21:58:38.9203539Zat 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2020-09-27T21:58:38.9203968Zat 
java.lang.reflect.Method.invoke(Method.java:498)
2020-09-27T21:58:38.9204369Zat 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
2020-09-27T21:58:38.9204844Zat 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
2020-09-27T21:58:38.9205359Zat 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
2020-09-27T21:58:38.9205814Zat 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
2020-09-27T21:58:38.9206240Zat 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
2020-09-27T21:58:38.9206611Zat 
org.junit.rules.RunRules.evaluate(RunRules.java:20)
2020-09-27T21:58:38.9206971Zat 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
2020-09-27T21:58:38.9207404Zat 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
2020-09-27T21:58:38.9207971Zat 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
2020-09-27T21:58:38.9208404Zat 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
2020-09-27T21:58:38.9208877Zat 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
2020-09-27T21:58:38.9209279Zat 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
2020-09-27T21:58:38.9209680Zat 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
2020-09-27T21:58:38.9210064Zat 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
2020-09-27T21:58:38.9210476Zat 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
2020-09-27T21:58:38.9210881Zat 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
2020-09-27T21:58:38.9211272Zat 
org.junit.rules.RunRules.evaluate(RunRules.java:20)
2020-09-27T21:58:38.9211638Zat 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
2020-09-27T21:58:38.9212305Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
2020-09-27T21:58:38.9213157Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
2020-09-27T21:58:38.9213663Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
2020-09-27T21:58:38.9214123Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
2020-09-27T21:58:38.9214620Zat 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
2020-09-27T21:58:38.9215148Zat 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
2020-09-27T21:58:38.9215650Zat 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
2020-09-27T21:58:38.9216095Zat 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
2020-09-27T21:58:38.9216516Z Caused by: java.io.IOException: Failed to fetch 
job execution result
2020-09-27T21:58:38.9217004Zat 
org.apache.flink.streaming

[jira] [Created] (FLINK-19438) Queryable State need implement both read-uncommitted and read-committed

2020-09-27 Thread sheep (Jira)
sheep created FLINK-19438:
-

 Summary: Queryable State need implement both read-uncommitted and 
read-committed
 Key: FLINK-19438
 URL: https://issues.apache.org/jira/browse/FLINK-19438
 Project: Flink
  Issue Type: Wish
  Components: Runtime / Queryable State
Reporter: sheep


Flink exposes its managed keyed (partitioned) state to the outside world and 
allows the user to query a job’s state from outside Flink. From a traditional 
database isolation-level viewpoint, the queries access uncommitted state, thus 
following the read-uncommitted isolation level.

I fully understand that Flink provides read-uncommitted state queries in order 
to query real-time state. But read-committed state is also important (I cannot 
fully explain why). Since Flink 1.9, querying and even modifying the state in a 
checkpoint has been possible. The state in a checkpoint is equivalent to 
read-committed state, so users can query read-committed state via the State 
Processor API.
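
A minimal sketch of such a read-committed query with the State Processor API 
(the savepoint path, operator uid "my-uid", state name "count", and key type 
are hypothetical):

{code:java}
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.state.api.functions.KeyedStateReaderFunction;
import org.apache.flink.util.Collector;

ExecutionEnvironment bEnv = ExecutionEnvironment.getExecutionEnvironment();
ExistingSavepoint savepoint =
    Savepoint.load(bEnv, "hdfs:///savepoints/savepoint-x", new MemoryStateBackend());

DataSet<Long> counts = savepoint.readKeyedState(
    "my-uid",
    new KeyedStateReaderFunction<String, Long>() {
        ValueState<Long> count;

        @Override
        public void open(Configuration parameters) {
            count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Types.LONG));
        }

        @Override
        public void readKey(String key, Context ctx, Collector<Long> out) throws Exception {
            // reads the committed value for this key from the snapshot
            out.collect(count.value());
        }
    });
{code}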

*Flink should let users configure the isolation level of Queryable State by 
integrating the two levels of state query.*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Support registering custom JobStatusListeners when scheduling a job

2020-09-27 Thread 季文昊
Hi Aljoscha,

Yes, that is not enough, since the `JobListener`s are called only once when
`execute()` or `executeAsync()` is called. And in order to sync the status,
I also have to call `JobClient#getJobStatus` periodically.
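
For reference, the existing hook only covers submission and completion, so
intermediate status changes stay invisible (a sketch; report() is a
placeholder for our platform callback):

env.registerJobListener(new JobListener() {
    @Override
    public void onJobSubmitted(JobClient jobClient, Throwable throwable) {
        // called once at submission; afterwards the status can only be polled
        jobClient.getJobStatus().thenAccept(status -> report(status));
    }

    @Override
    public void onJobExecuted(JobExecutionResult result, Throwable throwable) {
        // called once when the job terminates
    }
});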

On Fri, Sep 25, 2020 at 8:12 PM Aljoscha Krettek 
wrote:

> Hi,
>
> I understand from your email that
> `StreamExecutionEnvironment.registerJobListener()` would not be enough
> for you because you want to be notified of changes on the cluster side,
> correct? That is when the job status changes on the master.
>
> Best,
> Aljoscha
>
> On 23.09.20 14:31, 季文昊 wrote:
> > Hi there,
> >
> > I'm working on a Flink platform in my corp, which provides a service to
> > provision and manage multiple dedicated Flink clusters. The problem is
> > that we want to sync a job's status, without delay, after its submission
> > through our platform, as soon as the status changes.
> >
> > Since we want to update this in time and keep our services stateless,
> > polling a job's status periodically is not a good solution. I have not
> > found any proper way to achieve this by letting a job manager push changes
> > directly to our platform except changing the source code, which registers
> > an additional `JobStatusListener` in the method
> > `org.apache.flink.runtime.jobmaster.JobMaster#startScheduling`.
> >
> > I wonder if we can enhance `JobStatusListener` a little bit so that a
> Flink
> > user can register his custom JobStatusListener at the startup.
> >
> > To be specific, we can have a `JobStatusListenerFactory` interface and
> its
> > corresponding `ServiceLoader`, where
> > the JobStatusListenerFactory will have the following method:
> >   - JobStatusListener createJobStatusListener(Properties properties);
> >
> > Custom listeners will be created during the JobMaster#startScheduling
> > method.
> >
> > If someone would like to implement his own JobStatusListener, he will
> > package all the related classes into a standalone jar with a
> >
> `META-INF/services/org.apache.flink.runtime.executiongraph.JobStatusListener`
> > file and place it under the `lib/` directory.
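> >
> > A rough sketch of that factory (hypothetical class and property names;
> > assuming the current JobStatusListener#jobStatusChanges(JobID, JobStatus,
> > long, Throwable) signature):
> >
> > public interface JobStatusListenerFactory {
> >     JobStatusListener createJobStatusListener(Properties properties);
> > }
> >
> > // hypothetical implementation, discovered via ServiceLoader at JobMaster startup
> > public class HttpPushListenerFactory implements JobStatusListenerFactory {
> >     @Override
> >     public JobStatusListener createJobStatusListener(Properties properties) {
> >         String endpoint = properties.getProperty("listener.http.endpoint");
> >         return (jobId, newJobStatus, timestamp, error) ->
> >                 pushToPlatform(endpoint, jobId, newJobStatus); // placeholder HTTP call
> >     }
> > }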
> >
> > In addition, I found a Jira ticket similar to what I'm
> > asking: FLINK-17104, but I do not see any comments or updates yet. I hope
> > someone could help me move this feature forward or give me some
> > suggestions about it.
> >
> > Thanks,
> > Wenhao
> >
>
>


[ANNOUNCE] Apache Flink Stateful Functions 2.2.0 released

2020-09-27 Thread Tzu-Li (Gordon) Tai
The Apache Flink community is very happy to announce the release of Apache
Flink Stateful Functions 2.2.0.

Stateful Functions is an API that simplifies the building of distributed
stateful applications with a runtime built for serverless architectures.
It's based on functions with persistent state that can interact dynamically
with strong consistency guarantees.

Please check out the release blog post for an overview of the release:
https://flink.apache.org/news/2020/09/28/release-statefun-2.2.0.html

The release is available for download at:
https://flink.apache.org/downloads.html

Maven artifacts for Stateful Functions can be found at:
https://search.maven.org/search?q=g:org.apache.flink%20statefun

Python SDK for Stateful Functions published to the PyPI index can be found
at:
https://pypi.org/project/apache-flink-statefun/

Official Docker image for building Stateful Functions applications is
currently being published to Docker Hub. Progress for creating the Docker
Hub repository can be tracked at:
https://github.com/docker-library/official-images/pull/7749

In the meantime, before the official Docker images are available,
Ververica has volunteered to make Stateful Functions images available for
the community via their public registry:
https://hub.docker.com/r/ververica/flink-statefun

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348350

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Cheers,
Gordon