Re: [DISCUSS] FLIP-84 Feedback Summary
Hi Fabian,

Thanks for your suggestions.

I agree with you that `UNQ(f2, f3)` is clearer.

A table can have only ONE primary key, and this primary key can consist of a single column or of multiple columns. [1]
If the primary key consists of a single column, we can simply use `PRI` (or `PRI(xx)`) to represent it.
If the primary key has multiple columns, we should use `PRI(xx, yy, ...)` to represent it.

A table may have multiple unique keys, and each unique key can consist of a single column or of multiple columns. [2]
If there is only one unique key and this unique key has only a single column, we can simply use `UNQ` (or `UNQ(xx)`) to represent it.
Otherwise, we should use `UNQ(xx, yy, ...)` to represent it.
(A corner case: two unique keys with the same columns, like `UNQ(f2, f3)`, `UNQ(f2, f3)`; we can forbid this case or add a unique name for each key in the future.)

Example of a primary key and a unique key with multiple columns:

create table MyTable (
  f0 BIGINT NOT NULL,
  f1 ROW,
  f2 VARCHAR<256>,
  f3 AS f0 + 1,
  f4 TIMESTAMP(3) NOT NULL,
  f5 BIGINT NOT NULL,
  PRIMARY KEY (f0, f5),
  UNIQUE (f3, f2),
  WATERMARK f4 AS f4 - INTERVAL '3' SECOND
) with (...)

+------+--------------+-------+-------------+----------------+--------------------------+
| name | type         | null  | key         | compute column | watermark                |
+------+--------------+-------+-------------+----------------+--------------------------+
| f0   | BIGINT       | false | PRI(f0, f5) | (NULL)         | (NULL)                   |
| f1   | ROW          | true  | (NULL)      | (NULL)         | (NULL)                   |
| f2   | VARCHAR<256> | true  | UNQ(f2, f3) | (NULL)         | (NULL)                   |
| f3   | BIGINT       | false | UNQ(f2, f3) | f0 + 1         | (NULL)                   |
| f4   | TIMESTAMP(3) | false | (NULL)      | (NULL)         | f4 - INTERVAL '3' SECOND |
| f5   | BIGINT       | false | PRI(f0, f5) | (NULL)         | (NULL)                   |
+------+--------------+-------+-------------+----------------+--------------------------+

Regarding the watermark on nested columns: that's a good approach, which can both support watermarks on nested columns in the future and keep the current table form.

[1] https://www.w3schools.com/sql/sql_primarykey.asp
[2] https://www.w3schools.com/sql/sql_unique.ASP

Best,
Godfrey

On Thu, May 7, 2020 at 12:03 AM, Fabian Hueske wrote:

> Hi Godfrey,
>
> This looks good to me.
>
> One side note: indicating unique constraints with "UNQ" is probably not enough.
> There might be multiple unique constraints, and users would like to know which field combinations are unique.
> So in your example above, "UNQ(f2, f3)" might be a better marker.
>
> Just as a thought: if we would later add support for watermarks on nested columns, we could add a row just for the nested field (in addition to the top-level field) like this:
>
> | f4.nested.rowtime | TIMESTAMP(3) | false | (NULL) | (NULL) | f4.nested.rowtime - INTERVAL '3' SECOND |
>
> Thanks,
> Fabian
>
> On Wed, May 6, 2020 at 17:51, godfrey he wrote:
>
>> Hi @fhue...@gmail.com @Timo Walther @Dawid Wysakowicz
>> What do you think about requiring watermarks to be defined on a top-level column?
>>
>> If we do that, we can add an expression column to represent the watermark, like the compute column.
>> An example of all cases:
>> create table MyTable (
>>   f0 BIGINT NOT NULL,
>>   f1 ROW,
>>   f2 VARCHAR<256>,
>>
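[For illustration only, not part of the original thread: a minimal sketch of how the DESCRIBE output discussed above could be obtained through the `executeSql`/`TableResult` API that FLIP-84 proposes. The DDL is abbreviated and the connector option is a placeholder; whether DESCRIBE is routed through executeSql is an assumption of this sketch.]

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class DescribeTableSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().build());

        // Abbreviated DDL; the thread's full example also declares computed
        // columns, a UNIQUE constraint and a watermark.
        tEnv.executeSql(
                "CREATE TABLE MyTable ("
                        + "  f0 BIGINT NOT NULL,"
                        + "  f5 BIGINT NOT NULL,"
                        + "  PRIMARY KEY (f0, f5) NOT ENFORCED"
                        + ") WITH ('connector' = '...')");

        // Prints one row per column; the 'key' field would carry the markers
        // proposed in this thread, such as PRI(f0, f5) or UNQ(f2, f3).
        tEnv.executeSql("DESCRIBE MyTable").print();
    }
}
{code}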
Re: [VOTE] FLIP-36 - Support Interactive Programming in Flink Table API
Hi,

Thanks for the feedback, @Timo @Kurt. I agree that we can postpone the vote until after May 15th, since the FLIP is not targeted for Flink 1.11 and the questions you brought up need more discussion. I prefer moving the discussion back to the discussion thread and keeping the vote thread clean. I will restart this voting thread after May 15th if we come to a consensus. I will share my thoughts about your feedback in the discussion thread. Please stay tuned.

Best,
Xuannan

On Thu, Apr 30, 2020 at 9:38 PM Kurt Young wrote:

> +1 to what Timo has said.
>
> One more comment about the relation of this FLIP and FLIP-84: in FLIP-84 we start to deprecate all APIs which buffer table operations or plans. Think of APIs like `sqlUpdate`; `insertInto` is some kind of buffering operation, and all buffered operations are executed by TableEnv.execute().
>
> This causes ambiguous API behavior, since we have other operations which are executed eagerly, like passing a DDL statement to `sqlUpdate`.
>
> From the example of this FLIP, I think it still follows the buffering kind of API design, which I think needs more discussion.
>
> Best,
> Kurt
>
> On Thu, Apr 30, 2020 at 6:57 PM Timo Walther wrote:
>
>> Hi Xuannan,
>>
>> Sorry for not entering the discussion earlier. Could you please update the FLIP to how it would look after FLIP-84? I think your proposal makes sense to me and aligns well with the other efforts from an API perspective. However, here are some thoughts from my side:
>>
>> It would be nice to already think about how we can support this feature in a multi-statement SQL file. Would we use planner hints (as discussed in FLIP-113) or rather introduce some custom DCL on top of views?
>>
>> Actually, the semantics of a cached table are very similar to materialized views. Maybe we should think about unifying these concepts?
>>
>> CREATE MATERIALIZED VIEW x SELECT * FROM t;
>>
>> table.materialize();
>>
>> table.unmaterialize();
>>
>> From a runtime perspective, even for streaming, a user could define a caching service that is backed by a regular table source/table sink.
>>
>> Currently, people are busy with the feature freeze of Flink 1.11. Maybe we could postpone the discussion until after May 15th. I guess this FLIP is targeted for Flink 1.12 anyway.
>>
>> Regards,
>> Timo
>>
>> On 30.04.20 09:00, Xuannan Su wrote:
>>> Hi all,
>>>
>>> I'd like to start the vote for FLIP-36 [1], which has been discussed in thread [2].
>>>
>>> The vote will be open for 72h, until May 3, 2020, 07:00 AM UTC, unless there's an objection.
>>>
>>> Best,
>>> Xuannan
>>>
>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-36%3A+Support+Interactive+Programming+in+Flink
>>> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-36-Support-Interactive-Programming-in-Flink-Table-API-td40592.html
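[For illustration only, not from the thread: a sketch of the buffered-vs-eager behavior Kurt describes, using the pre-FLIP-84 method names as they existed before the deprecation. Table names and connector options are made up.]

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class BufferedVsEagerSketch {
    public static void main(String[] args) throws Exception {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().build());

        // Eager: a DDL statement passed to sqlUpdate() takes effect immediately.
        tEnv.sqlUpdate(
                "CREATE TABLE sinkTable (v BIGINT) WITH ('connector.type' = '...')");

        // Buffered: this INSERT is only recorded in the environment ...
        tEnv.sqlUpdate("INSERT INTO sinkTable SELECT v FROM sourceTable");

        // ... and nothing runs until execute() is called, which submits all
        // buffered operations as a single job.
        tEnv.execute("buffered job");
    }
}
{code}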
Re: [DISCUSS] FLIP-84 Feedback Summary
Thanks for the update Godfrey!

+1 to this approach.
Since there can be only one primary key, I'd also be fine to just use `PRI` even if it is composite, but `PRI(f0, f5)` might be more convenient for users.

Thanks,
Fabian

On Thu, May 7, 2020 at 09:31, godfrey he wrote:

> Hi Fabian,
>
> Thanks for your suggestions.
>
> I agree with you that `UNQ(f2, f3)` is clearer.
>
> A table can have only ONE primary key, and this primary key can consist of a single column or of multiple columns. [1]
> If the primary key consists of a single column, we can simply use `PRI` (or `PRI(xx)`) to represent it.
> If the primary key has multiple columns, we should use `PRI(xx, yy, ...)` to represent it.
>
> A table may have multiple unique keys, and each unique key can consist of a single column or of multiple columns. [2]
> If there is only one unique key and this unique key has only a single column, we can simply use `UNQ` (or `UNQ(xx)`) to represent it.
> Otherwise, we should use `UNQ(xx, yy, ...)` to represent it.
> (A corner case: two unique keys with the same columns, like `UNQ(f2, f3)`, `UNQ(f2, f3)`; we can forbid this case or add a unique name for each key in the future.)
>
> Example of a primary key and a unique key with multiple columns:
>
> create table MyTable (
>   f0 BIGINT NOT NULL,
>   f1 ROW,
>   f2 VARCHAR<256>,
>   f3 AS f0 + 1,
>   f4 TIMESTAMP(3) NOT NULL,
>   f5 BIGINT NOT NULL,
>   PRIMARY KEY (f0, f5),
>   UNIQUE (f3, f2),
>   WATERMARK f4 AS f4 - INTERVAL '3' SECOND
> ) with (...)
>
> +------+--------------+-------+-------------+----------------+--------------------------+
> | name | type         | null  | key         | compute column | watermark                |
> +------+--------------+-------+-------------+----------------+--------------------------+
> | f0   | BIGINT       | false | PRI(f0, f5) | (NULL)         | (NULL)                   |
> | f1   | ROW          | true  | (NULL)      | (NULL)         | (NULL)                   |
> | f2   | VARCHAR<256> | true  | UNQ(f2, f3) | (NULL)         | (NULL)                   |
> | f3   | BIGINT       | false | UNQ(f2, f3) | f0 + 1         | (NULL)                   |
> | f4   | TIMESTAMP(3) | false | (NULL)      | (NULL)         | f4 - INTERVAL '3' SECOND |
> | f5   | BIGINT       | false | PRI(f0, f5) | (NULL)         | (NULL)                   |
> +------+--------------+-------+-------------+----------------+--------------------------+
>
> Regarding the watermark on nested columns: that's a good approach, which can both support watermarks on nested columns in the future and keep the current table form.
>
> [1] https://www.w3schools.com/sql/sql_primarykey.asp
> [2] https://www.w3schools.com/sql/sql_unique.ASP
>
> Best,
> Godfrey
>
> On Thu, May 7, 2020 at 12:03 AM, Fabian Hueske wrote:
>
>> Hi Godfrey,
>>
>> This looks good to me.
>>
>> One side note: indicating unique constraints with "UNQ" is probably not enough.
>> There might be multiple unique constraints, and users would like to know which field combinations are unique.
>> So in your example above, "UNQ(f2, f3)" might be a better marker.
>>
>> Just as a thought: if we would later add support for watermarks on nested columns, we could add a row just for the nested field (in addition to the top-level field) like this:
>>
>> | f4.nested.rowtime | TIMESTAMP(3) | false | (NULL) | (NULL) | f4.nested.rowtime - INTERVAL '3' SECOND |
Re: [VOTE] Release 1.10.1, release candidate #2
Thanks Yu! On Thu, May 7, 2020 at 7:44 AM Yu Li wrote: > @Dawid Yes we will include it into RC3. Thanks for the note. > > @Till Thanks for the quick fix and the note. > > I've checked and confirmed there are no open issues under 1.10.1, neither > any resolved/closed ones under 1.10.2, so will start to prepare RC3. > > Best Regards, > Yu > > > On Wed, 6 May 2020 at 17:36, Till Rohrmann wrote: > > > I've merged the fix for FLINK-17514. > > > > Cheers, > > Till > > > > On Wed, May 6, 2020 at 10:53 AM Dawid Wysakowicz > > > wrote: > > > > > Hi all, > > > > > > I wonder if we could also include FLINK-17313 which I backported into > > > 1.10 branch yesterday. > > > > > > Best, > > > > > > Dawid > > > > > > On 06/05/2020 07:26, Yu Li wrote: > > > > Thanks Till and Thomas, will include fix for both FLINK-17496 and > > > > FLINK-17514 in the next RC. > > > > > > > > Best Regards, > > > > Yu > > > > > > > > > > > > On Tue, 5 May 2020 at 22:10, Thomas Weise wrote: > > > > > > > >> I opened a PR to backport the Kinesis fix - it would be nice to > > include > > > if > > > >> there is another RC: > > > >> > > > >> https://github.com/apache/flink/pull/11998 > > > >> > > > >> > > > >> On Tue, May 5, 2020 at 4:50 AM Till Rohrmann > > > wrote: > > > >> > > > >>> I've opened a PR for FLINK-17514. > > > >>> > > > >>> Cheers, > > > >>> Till > > > >>> > > > >>> On Tue, May 5, 2020 at 11:46 AM Yu Li wrote: > > > >>> > > > Thanks all for the efforts. > > > > > > I'm hereby canceling the vote due to FLINK-17514, will prepare > > another > > > >> RC > > > after the issue is fixed (hopefully soon). > > > > > > Best Regards, > > > Yu > > > > > > > > > On Tue, 5 May 2020 at 17:28, Till Rohrmann > > > >> wrote: > > > > I agree with Aljoscha. This is something we should fix since this > > is > > > >>> very > > > > important for Flink's stability. I will prepare a fix for the > > > >> problem. > > > > Cheers, > > > > Till > > > > > > > > On Tue, May 5, 2020 at 10:30 AM Congxian Qiu < > > qcx978132...@gmail.com > > > > wrote: > > > > > > > >> +1 (no-binding) > > > >> - sha and gpg, ok > > > >> - all pom files point to same version, ok > > > >> - build from souce, ok > > > >> - LICENCE, ok > > > >> - run demo in standalone cluster, ok > > > >> > > > >> Best, > > > >> Congxian > > > >> > > > >> > > > >> Aljoscha Krettek 于2020年5月5日周二 下午3:50写道: > > > >> > > > >>> Unfortunately, I found this bug which prevents the > > > TaskCancelerWatchdog > > > >>> [sic] from working: > > > https://issues.apache.org/jira/browse/FLINK-17514. > > > > I > > > >>> think it's quite crucial that this failsafe mechanism works > > > correctly. > > > >>> We should cancel the release and fix it. > > > >>> > > > >>> Best, > > > >>> Aljoscha > > > >>> > > > >>> On 05.05.20 05:55, Hequn Cheng wrote: > > > Thanks a lot for managing the release! > > > > > > +1 (binding) > > > > > > - Go through all new commits for 1.10.1 and spot no new > license > > > >> problems. > > > - Built from source archive successfully. > > > - Signatures and hash are correct. > > > - Run SocketWindowWordCount on the local cluster. > > > - Install Python package and run Python WordCount example. > > > - Reviewed website PR > > > > > > Best, > > > Hequn > > > > > > On Sun, May 3, 2020 at 9:10 PM Robert Metzger < > > > >>> rmetz...@apache.org > > > >>> wrote: > > > > Thanks a lot for addressing the issues from the last release > > > > candidate > > > >>> and > > > > creating this one! 
> > > > > > > > +1 (binding) > > > > > > > > - Started Flink on YARN on Google Cloud DataProc by setting > > > > HADOOP_CLASSPATH > > > > - checked staging repo > > > > > > > > > > > > > > > > On Sat, May 2, 2020 at 6:57 PM Thomas Weise > > > wrote: > > > >> +1 (binding) > > > >> > > > >> Checked signatures and hashes. > > > >> > > > >> Run internal benchmark applications. > > > >> > > > >> I found a regression that was actually introduced with > > > >> 1.10.0, > > > > hence > > > >>> not > > > > a > > > >> blocker for this release: > > > >> > > > >> https://github.com/apache/flink/pull/11975 > > > >> > > > >> Thanks, > > > >> Thomas > > > >> > > > >> > > > >> On Fri, May 1, 2020 at 5:37 AM Yu Li > > > >> wrote: > > > >>> Hi everyone, > > > >>> > > > >>> Please review and vote on the release candidate #2 for > > > >> version > > > >> 1.10.1, > > > > as > > > >>> follows: > > > >>> [
Re: [DISCUSS] Send issue and pull request notifications for flink-web and flink-shaded to iss...@flink.apache.org
Quick update: The .asf.yamls have been committed and we should no longer see issue and pr notifications coming from flink-web and flink-shaded on dev@f.a.o. These notifications go now to issues@f.a.o. Cheers, Till On Thu, May 7, 2020 at 8:02 AM Zhu Zhu wrote: > +1 > > Thanks, > Zhu Zhu > > Congxian Qiu 于2020年5月6日周三 下午12:02写道: > > > +1 for this > > > > Best, > > Congxian > > > > > > Benchao Li 于2020年5月5日周二 下午5:22写道: > > > > > belated +1 > > > > > > Till Rohrmann 于2020年5月5日周二 下午2:24写道: > > > > > > > Thanks for all the feedback. I will open the PRs now which change the > > > > notification settings of the above-mentioned repositories. > > > > > > > > Cheers, > > > > Till > > > > > > > > On Tue, May 5, 2020 at 6:26 AM Hequn Cheng wrote: > > > > > > > > > +1, thanks a lot for driving this. > > > > > > > > > > Best, Hequn > > > > > > > > > > On Tue, May 5, 2020 at 11:56 AM Dian Fu > > wrote: > > > > > > > > > > > +1 > > > > > > > > > > > > Regards, > > > > > > Dian > > > > > > > > > > > > > 在 2020年5月5日,上午9:58,Yangze Guo 写道: > > > > > > > > > > > > > > +1 > > > > > > > > > > > > > > Best, > > > > > > > Yangze Guo > > > > > > > > > > > > > > On Tue, May 5, 2020 at 6:14 AM Thomas Weise > > > wrote: > > > > > > >> > > > > > > >> +1 > > > > > > >> > > > > > > >> > > > > > > >> On Mon, May 4, 2020 at 10:02 AM Marta Paes Moreira < > > > > > ma...@ververica.com > > > > > > > > > > > > > >> wrote: > > > > > > >> > > > > > > >>> +1, this is quite annoying and distracting. > > > > > > >>> > > > > > > >>> Marta > > > > > > >>> > > > > > > >>> On Mon, May 4, 2020 at 6:27 PM Yu Li > wrote: > > > > > > >>> > > > > > > +1 > > > > > > > > > > > > Best Regards, > > > > > > Yu > > > > > > > > > > > > > > > > > > On Tue, 5 May 2020 at 00:21, Konstantin Knauf < > > > kna...@apache.org> > > > > > > wrote: > > > > > > > > > > > > > Yes, please. > > > > > > > > > > > > > > On Mon, May 4, 2020 at 5:50 PM Dawid Wysakowicz < > > > > > > >>> dwysakow...@apache.org> > > > > > > > wrote: > > > > > > > > > > > > > >> +1 > > > > > > >> > > > > > > >> Yes, please. I've also observed a lot of noise in the past > > > days. > > > > > > >> > > > > > > >> Best, > > > > > > >> > > > > > > >> Dawid > > > > > > >> > > > > > > >> On 04/05/2020 17:48, Tzu-Li (Gordon) Tai wrote: > > > > > > >>> +1 > > > > > > >>> > > > > > > >>> All the recent new repos, flink-statefun / > > > > flink-statefun-docker > > > > > / > > > > > > >>> flink-training etc. are also sending notifications to > > issues@ > > > . > > > > > > >>> > > > > > > >>> Gordon > > > > > > >>> > > > > > > >>> > > > > > > >>> On Mon, May 4, 2020, 11:44 PM Till Rohrmann < > > > > > trohrm...@apache.org> > > > > > > >> wrote: > > > > > > >>> > > > > > > Hi everyone, > > > > > > > > > > > > due to some changes on the ASF side, we are now seeing > > issue > > > > and > > > > > > pull > > > > > > request notifications for the flink-web [1] and > > flink-shaded > > > > [2] > > > > > > repo > > > > > > > on > > > > > > dev@flink.apache.org. I think this is not ideal since > the > > > dev > > > > > ML > > > > > > >>> is > > > > > > >> much > > > > > > more noisy now. > > > > > > > > > > > > I would propose to send these notifications to > > > > > > > iss...@flink.apache.org > > > > > > >> as > > > > > > we are currently doing it for the Flink main repo [3]. > > > > > > > > > > > > What do you think? 
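[For reference; the committed files themselves are not shown in this thread. Per ASF infra conventions, an .asf.yaml along the following lines in each repository would produce the routing Till describes. The exact contents are an assumption, not the verbatim committed file.]

{code:yaml}
# Sketch of the notifications section of .asf.yaml (assumed): route issue
# and PR mails to issues@, keeping commit mails on commits@.
notifications:
  commits:      commits@flink.apache.org
  issues:       issues@flink.apache.org
  pullrequests: issues@flink.apache.org
{code}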
> > > > > > > > > > > > [1] https://github.com/apache/flink-web > > > > > > [2] https://github.com/apache/flink-shaded > > > > > > [3] https://gitbox.apache.org/schemes.cgi?flink > > > > > > > > > > > > Cheers, > > > > > > Till > > > > > > > > > > > > >> > > > > > > >> > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > Konstantin Knauf > > > > > > > > > > > > > > https://twitter.com/snntrable > > > > > > > > > > > > > > https://github.com/knaufk > > > > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Benchao Li > > > School of Electronics Engineering and Computer Science, Peking > University > > > Tel:+86-15650713730 > > > Email: libenc...@gmail.com; libenc...@pku.edu.cn > > > > > >
Re: [DISCUSS] FLIP-36 - Support Interactive Programming in Flink Table API
Hi,

There is some feedback from @Timo and @Kurt in the voting thread for FLIP-36, and I want to share my thoughts here.

1. How would FLIP-36 look after FLIP-84?
I don't think FLIP-84 will affect FLIP-36 from the public API perspective. Users can call .cache on a Table object, and the cached table will be generated whenever the table job is triggered to execute, either by Table#executeInsert or StatementSet#execute. I think that FLIP-36 should be aware of the changes made by FLIP-84, but it shouldn't be a problem. At the end of the day, FLIP-36 only requires the ability to add a sink to a node, submit a table job with multiple sinks, and replace the cached table with a source.

2. How can we support cache in a multi-statement SQL file?
The most intuitive way to support cache in a multi-statement SQL file is by using a view, where the view corresponds to a cached table.

3. Unifying the cached table and materialized views
It is true that the cached table and the materialized view are similar in some ways. However, I think the materialized view is a more complex concept. First, a materialized view requires some kind of refresh mechanism to synchronize with the table. Secondly, the life cycle of a materialized view is longer: the materialized view should be accessible even after the application exits and should be accessible by another application, while the cached table is only accessible in the application where it is created. The cached table is introduced to avoid recomputation of an intermediate table, to support interactive programming in the Flink Table API. And I think the materialized view needs more discussion and certainly deserves a whole new FLIP.

Please let me know your thoughts.

Best,
Xuannan

On Wed, Apr 29, 2020 at 3:53 PM Xuannan Su wrote:

> Hi folks,
>
> FLIP-36 has been updated according to the discussion with Becket. In the meantime, any comments are very welcome.
>
> If there are no further comments, I would like to start the voting thread by tomorrow.
>
> Thanks,
> Xuannan
>
> On Sun, Apr 26, 2020 at 9:34 AM Xuannan Su wrote:
>
>> Hi Becket,
>>
>> You are right. It makes sense to treat the retry of job 2 as an ordinary job. And the config does introduce some unnecessary confusion. Thank you for your comment. I will update the FLIP.
>>
>> Best,
>> Xuannan
>>
>> On Sat, Apr 25, 2020 at 7:44 AM Becket Qin wrote:
>>
>>> Hi Xuannan,
>>>
>>> Suppose a user submits job 1 and it generates a cached intermediate result, and later on the user submits job 2, which should ideally use that intermediate result. In that case, if job 2 fails due to the missing intermediate result, job 2 should be retried with its full DAG. After that, when job 2 runs, it will also re-generate the cache. However, once job 2 has fallen back to the original DAG, should it just be treated as an ordinary job that follows the recovery strategy? Having a separate configuration seems a little confusing. In other words, re-generating the cache is just a byproduct of running the full DAG of job 2, not the main purpose. It is just like when job 1 runs to generate the cache: it does not have a separate retry config to make sure the cache is generated. If it fails, it just fails like an ordinary job.
>>>
>>> What do you think?
>>> >>> Thanks, >>> >>> Jiangjie (Becket) Qin >>> >>> On Fri, Apr 24, 2020 at 5:00 PM Xuannan Su >>> wrote: >>> >>> > Hi Becket, >>> > >>> > The intermediate result will indeed be automatically re-generated by >>> > resubmitting the original DAG. And that job could fail as well. In that >>> > case, we need to decide if we should resubmit the original DAG to >>> > re-generate the intermediate result or give up and throw an exception >>> to >>> > the user. And the config is to indicate how many resubmit should happen >>> > before giving up. >>> > >>> > Thanks, >>> > Xuannan >>> > >>> > On Fri, Apr 24, 2020 at 4:19 PM Becket Qin >>> wrote: >>> > >>> > > Hi Xuannan, >>> > > >>> > > I am not entirely sure if I understand the cases you mentioned. The >>> > users >>> > > > can use the cached table object returned by the .cache() method in >>> > other >>> > > > job and it should read the intermediate result. The intermediate >>> result >>> > > can >>> > > > gone in the following three cases: 1. the user explicitly call the >>> > > > invalidateCache() method 2. the TableEnvironment is closed 3. >>> failure >>> > > > happens on the TM. When that happens, the intermeidate result will >>> not >>> > be >>> > > > available unless it is re-generated. >>> > > >>> > > >>> > > What confused me was that why do we need to have a *cache.retries.max >>> > > *config? >>> > > Shouldn't the missing intermediate result always be automatically >>> > > re-generated if it is gone? >>> > > >>> > > Thanks, >>> > > >>> > > Jiangjie (Becket) Qin >>> > > >>> > > >>> > > On Fri, Apr 24, 2020 at 3:59 PM Xuannan Su >>> > wrote: >>> > > >>> > > > Hi Becket, >>> > > > >
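[For illustration only: a sketch (fragment, not a full program) of the API as proposed in FLIP-36. This is a proposal at the time of this thread, not an existing Flink API, and the details may change; the 'clicks' table and the sink names are made up.]

{code:java}
// Build an intermediate table that several downstream jobs reuse.
Table counts = tEnv.sqlQuery(
        "SELECT id, COUNT(*) AS cnt FROM clicks GROUP BY id");

// cache() (proposed) marks the intermediate result for reuse; it is
// materialized when the first job containing it is triggered to execute.
CachedTable cached = counts.cache();

// Subsequent jobs would replace the cached table with a source that reads
// the intermediate result instead of recomputing the aggregation.
cached.executeInsert("sinkA");
cached.executeInsert("sinkB");

// Explicitly releases the intermediate result (it is also released when
// the TableEnvironment is closed).
cached.invalidateCache();
{code}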
[jira] [Created] (FLINK-17552) UnionInputGate shouldn't be caching InputChannels
Piotr Nowojski created FLINK-17552: -- Summary: UnionInputGate shouldn't be caching InputChannels Key: FLINK-17552 URL: https://issues.apache.org/jira/browse/FLINK-17552 Project: Flink Issue Type: Bug Components: Runtime / Network Reporter: Piotr Nowojski Fix For: 1.11.0 For example currently UnionInputGate#getChannel can become inconsistent with SingleInputGate#getChannel after updating a channel inside SingleInputGate. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17553) Group by constant and window causes error: Unsupported call: TUMBLE_END(TIMESTAMP(3) NOT NULL)
Terry Wang created FLINK-17553: -- Summary: Group by constant and window causes error: Unsupported call: TUMBLE_END(TIMESTAMP(3) NOT NULL) Key: FLINK-17553 URL: https://issues.apache.org/jira/browse/FLINK-17553 Project: Flink Issue Type: Bug Components: Table SQL / Planner Reporter: Terry Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17554) Add release hooks for user code class loader
Till Rohrmann created FLINK-17554:
-------------------------------------

Summary: Add release hooks for user code class loader
Key: FLINK-17554
URL: https://issues.apache.org/jira/browse/FLINK-17554
Project: Flink
Issue Type: New Feature
Components: Runtime / Coordination
Affects Versions: 1.11.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
Fix For: 1.11.0

Release hooks for the user code class loader, which run just before the user code class loader is released, would allow cleaning up static references to classes of the user code class loader. This is important because these static references could prevent the user code classes from being garbage collected, eventually causing metaspace OOMs. Hence, I suggest extending the {{RuntimeContext}} with an additional method {{registerUserCodeClassLoaderReleaseHook(Runnable releaseHook)}} which allows the user code to register a release hook for the user code class loader.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
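[A sketch of how user code might use the suggested hook. The method is only proposed in this ticket and does not exist yet; MyStaticCache is a made-up example of a static reference that would otherwise pin the user code class loader.]

{code:java}
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class CacheAwareMapper extends RichMapFunction<String, String> {

    @Override
    public void open(Configuration parameters) {
        // Proposed API: run this just before the user code class loader is
        // released, so the static cache no longer pins its classes.
        getRuntimeContext().registerUserCodeClassLoaderReleaseHook(
                MyStaticCache::clear);
    }

    @Override
    public String map(String value) {
        return MyStaticCache.lookup(value);
    }
}
{code}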
[jira] [Created] (FLINK-17555) docstring of pyflink.table.descriptors.FileSystem:1:duplicate object description of pyflink.table.descriptors.FileSystem
Piotr Nowojski created FLINK-17555:
--------------------------------------

Summary: docstring of pyflink.table.descriptors.FileSystem:1: duplicate object description of pyflink.table.descriptors.FileSystem
Key: FLINK-17555
URL: https://issues.apache.org/jira/browse/FLINK-17555
Project: Flink
Issue Type: Bug
Components: API / Python
Affects Versions: 1.10.0
Reporter: Piotr Nowojski

A documentation check failed on Travis:
https://api.travis-ci.org/v3/job/684185591/log.txt

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Created] (FLINK-17556) FATAL: Thread 'flink-metrics-akka.remote.default-remote-dispatcher-3' produced an uncaught exception. Stopping the process... java.lang.OutOfMemoryError: Direct buffer m
Tammy zhang created FLINK-17556:
-----------------------------------

Summary: FATAL: Thread 'flink-metrics-akka.remote.default-remote-dispatcher-3' produced an uncaught exception. Stopping the process... java.lang.OutOfMemoryError: Direct buffer memory
Key: FLINK-17556
URL: https://issues.apache.org/jira/browse/FLINK-17556
Project: Flink
Issue Type: Bug
Reporter: Tammy zhang

My job consumes data from Kafka and then processes it. After the job has been running for a while, the following error appears:

ERROR org.apache.flink.runtime.util.FatalExitExceptionHandler - FATAL: Thread 'flink-metrics-akka.remote.default-remote-dispatcher-3' produced an uncaught exception. Stopping the process...
java.lang.OutOfMemoryError: Direct buffer memory

I have set the "max.poll.records" property to "250", and it does not work.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[VOTE] Release 1.10.1, release candidate #3
Hi everyone, Please review and vote on the release candidate #3 for version 1.10.1, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) The complete staging area is available for your review, which includes: * JIRA release notes [1], * the official Apache source release and binary convenience releases to be deployed to dist.apache.org [2], which are signed with the key with fingerprint D8D3D42E84C753CA5F170BDF93C07902771AB743 [3], * all artifacts to be deployed to the Maven Central Repository [4], * source code tag "release-1.10.1-rc3" [5], * website pull request listing the new release and adding announcement blog post [6]. The vote will be open for at least 72 hours. It is adopted by majority approval, with at least 3 PMC affirmative votes. Thanks, Yu [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346891 [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.10.1-rc3/ [3] https://dist.apache.org/repos/dist/release/flink/KEYS [4] https://repository.apache.org/content/repositories/orgapacheflink-1364/ [5] https://github.com/apache/flink/commit/c5915cf87f96e1c7ebd84ad00f7eabade7e7fe37 [6] https://github.com/apache/flink-web/pull/330
[jira] [Created] (FLINK-17557) Add configuration to disallow ambiguous file schemes
Seth Wiesman created FLINK-17557:
------------------------------------

Summary: Add configuration to disallow ambiguous file schemes
Key: FLINK-17557
URL: https://issues.apache.org/jira/browse/FLINK-17557
Project: Flink
Issue Type: Improvement
Reporter: Seth Wiesman

Flink supports the usage of both S3 Presto and S3 Hadoop within the same session via the plugin system. When this happens, the scheme 's3://' is ambiguous and references whichever filesystem happened to have been loaded last. Instead, users should use the schemes 's3p://' and 's3a://' respectively, which are unique. There should be a configuration option that disallows the use of ambiguous file schemes to prevent strange behavior in production.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
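[For illustration only (not from the ticket): with both S3 filesystems on the classpath, the unambiguous schemes pin the implementation. Bucket and paths are placeholders.]

{code:java}
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExplicitSchemeSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // 's3p://' always selects flink-s3-fs-presto, regardless of which
        // plugin happened to be loaded last ...
        env.setStateBackend(new FsStateBackend("s3p://my-bucket/checkpoints"));

        // ... and 's3a://' always selects flink-s3-fs-hadoop. The plain
        // 's3://' scheme would resolve to whichever plugin won the race.
        env.readTextFile("s3a://my-bucket/input").print();

        env.execute("explicit s3 schemes");
    }
}
{code}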
Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL
Hi everyone, Thanks everyone for joining the discussion on this. Please let me summarize what I have understood so far. 1) For joining an append-only table and a temporal table the syntax the "FOR SYSTEM_TIME AS OF " seems to be preferred (Fabian, Timo, Seth). 2) To define a temporal table based on a changelog stream from an external system CREATE TEMPORAL TABLE (as suggested by Timo/Fabian) could be used. 3) In order to also support temporal tables derived from an append-only stream, we either need to support TEMPORAL VIEW (as mentioned by Fabian) or need to have a way to convert an append-only table into a changelog table (briefly discussed in [1]). It is not completely clear to me how a temporal table based on an append-only table would be with the syntax proposed in [1] and 2). @Jark Wu could you elaborate a bit on that? How do we move forward with this? * It seems that a two-phased approach (1 + 2 now, 3 later) makes sense. What do you think? * If we proceed like this, what would this mean for the current syntax of LATERAL TABLE? Would we keep it? Would we eventually deprecate and drop it? Since only after 3) we would be on par with the current temporal table function join, I assume, we could only drop it thereafter. Thanks, Konstantin [1] https://docs.google.com/document/d/1onyIUUdWAHfr_Yd5nZOE7SOExBc6TiW5C4LiL5FrjtQ/edit#heading=h.kduaw9moein6 On Sat, Apr 18, 2020 at 3:07 PM Jark Wu wrote: > Hi Fabian, > > Just to clarify a little bit, we decided to move the "converting > append-only table into changelog table" into future work. > So FLIP-105 only introduced some CDC formats (debezium) and new TableSource > interfaces proposed in FLIP-95. > I should have started a new FLIP for the new CDC formats and keep FLIP-105 > as it is to avoid the confusion, sorry about that. > > Best, > Jark > > > On Sat, 18 Apr 2020 at 00:35, Fabian Hueske wrote: > > > Thanks Jark! > > > > I certainly need to read up on FLIP-105 (and I'll try to adjust my > > terminology to changelog table from now on ;-) ) > > If FLIP-105 addresses the issue of converting an append-only table into a > > changelog table that upserts on primary key (basically what the VIEW > > definition in my first email did), > > TEMPORAL VIEWs become much less important. > > In that case, we would be well served with TEMPORAL TABLE and TEMPORAL > VIEW > > would be a nice-to-have feature for some later time. > > > > Cheers, Fabian > > > > > > > > > > > > > > Am Fr., 17. Apr. 2020 um 18:13 Uhr schrieb Jark Wu : > > > > > Hi Fabian, > > > > > > I think converting an append-only table into temporal table contains > two > > > things: > > > (1) converting append-only table into changelog table (or retraction > > table > > > as you said) > > > (2) define the converted changelog table (maybe is a view now) as > > temporal > > > (or history tracked). > > > > > > The first thing is also mentioned and discussed in FLIP-105 design > draft > > > [1] which proposed a syntax > > > to convert the append-only table into a changelog table. > > > > > > I think TEMPORAL TABLE is quite straightforward and simple, and can > > satisfy > > > most existing changelog > > > data with popular CDC formats. TEMPORAL VIEW is flexible but will > involve > > > more SQL codes. I think > > > we can support them both. 
> > > > > > Best, > > > Jark > > > > > > [1]: > > > > > > > > > https://docs.google.com/document/d/1onyIUUdWAHfr_Yd5nZOE7SOExBc6TiW5C4LiL5FrjtQ/edit#heading=h.sz656g8mb2wb > > > > > > On Fri, 17 Apr 2020 at 23:52, Fabian Hueske wrote: > > > > > > > Hi, > > > > > > > > I agree with most of what Timo said. > > > > > > > > The TEMPORAL keyword (which unfortunately might be easily confused > with > > > > TEMPORARY...) looks very intuitive and I think using the only time > > > > attribute for versioning would be a good choice. > > > > > > > > However, TEMPORAL TABLE on retraction tables do not solve the full > > > problem. > > > > I believe there will be also cases where we need to derive a temporal > > > table > > > > from an append only table (what TemporalTableFunctions do right now). > > > > I think the best choice for this would be TEMPORAL VIEW but as I > > > explained, > > > > it might be a longer way until this can be supported. > > > > TEMPORAL VIEW would also address the problem of preprocessing. > > > > > > > > > Regarding retraction table with a primary key and a time-attribute: > > > > > These semantics are still unclear to me. Can retractions only occur > > > > > within watermarks? Or are they also used for representing late > > updates? > > > > > > > > Time attributes and retraction streams are a challenging topic that I > > > > haven't completely understood yet. > > > > So far we treated time attributes always as part of the data. > > > > In combination with retractions, it seems that they become metadata > > that > > > > specifies when a change was done. > > > > I think this is different from treating time attributes as regular > > data. > > > > > > >
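[For illustration only, not from the thread: a sketch of the join syntax preferred in point 1) above, wrapped in the Table API. 'Orders' (append-only) and 'Rates' (temporal) are hypothetical tables, and this syntax was still under discussion, not fully supported, at the time of the thread.]

{code:java}
// Join each order with the version of the rate that was valid at the
// order's time attribute.
tEnv.executeSql(
        "SELECT o.order_id, o.amount * r.rate AS converted_amount "
                + "FROM Orders AS o "
                + "JOIN Rates FOR SYSTEM_TIME AS OF o.order_time AS r "
                + "ON o.currency = r.currency");
{code}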
Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL
Thanks for the summary Konstantin. I think you got all points right. IMO, the way forward would be to work on a FLIP to define * the concept of temporal tables, * how to feed them from retraction tables * how to feed them from append-only tables * their specification with CREATE TEMPORAL TABLE, * how to use temporal tables in temporal table joins * how (if at all) to use temporal tables in other types of queries We would keep the LATERAL TABLE syntax because it used for regular table-valued functions. However, we would probably remove the TemporalTableFunction (which is a built-in table-valued function) after we deprecated it for a while. Cheers, Fabian Am Do., 7. Mai 2020 um 18:03 Uhr schrieb Konstantin Knauf : > Hi everyone, > > Thanks everyone for joining the discussion on this. Please let me summarize > what I have understood so far. > > 1) For joining an append-only table and a temporal table the syntax the > "FOR > SYSTEM_TIME AS OF " seems to be preferred (Fabian, Timo, > Seth). > > 2) To define a temporal table based on a changelog stream from an external > system CREATE TEMPORAL TABLE (as suggested by Timo/Fabian) could be used. > 3) In order to also support temporal tables derived from an append-only > stream, we either need to support TEMPORAL VIEW (as mentioned by Fabian) or > need to have a way to convert an append-only table into a changelog table > (briefly discussed in [1]). It is not completely clear to me how a temporal > table based on an append-only table would be with the syntax proposed in > [1] and 2). @Jark Wu could you elaborate a bit on that? > > How do we move forward with this? > > * It seems that a two-phased approach (1 + 2 now, 3 later) makes sense. > What do you think? * If we proceed like this, what would this mean for the > current syntax of LATERAL TABLE? Would we keep it? Would we eventually > deprecate and drop it? Since only after 3) we would be on par with the > current temporal table function join, I assume, we could only drop it > thereafter. > > Thanks, Konstantin > > [1] > > https://docs.google.com/document/d/1onyIUUdWAHfr_Yd5nZOE7SOExBc6TiW5C4LiL5FrjtQ/edit#heading=h.kduaw9moein6 > > > On Sat, Apr 18, 2020 at 3:07 PM Jark Wu wrote: > > > Hi Fabian, > > > > Just to clarify a little bit, we decided to move the "converting > > append-only table into changelog table" into future work. > > So FLIP-105 only introduced some CDC formats (debezium) and new > TableSource > > interfaces proposed in FLIP-95. > > I should have started a new FLIP for the new CDC formats and keep > FLIP-105 > > as it is to avoid the confusion, sorry about that. > > > > Best, > > Jark > > > > > > On Sat, 18 Apr 2020 at 00:35, Fabian Hueske wrote: > > > > > Thanks Jark! > > > > > > I certainly need to read up on FLIP-105 (and I'll try to adjust my > > > terminology to changelog table from now on ;-) ) > > > If FLIP-105 addresses the issue of converting an append-only table > into a > > > changelog table that upserts on primary key (basically what the VIEW > > > definition in my first email did), > > > TEMPORAL VIEWs become much less important. > > > In that case, we would be well served with TEMPORAL TABLE and TEMPORAL > > VIEW > > > would be a nice-to-have feature for some later time. > > > > > > Cheers, Fabian > > > > > > > > > > > > > > > > > > > > > Am Fr., 17. Apr. 
2020 um 18:13 Uhr schrieb Jark Wu : > > > > > > > Hi Fabian, > > > > > > > > I think converting an append-only table into temporal table contains > > two > > > > things: > > > > (1) converting append-only table into changelog table (or retraction > > > table > > > > as you said) > > > > (2) define the converted changelog table (maybe is a view now) as > > > temporal > > > > (or history tracked). > > > > > > > > The first thing is also mentioned and discussed in FLIP-105 design > > draft > > > > [1] which proposed a syntax > > > > to convert the append-only table into a changelog table. > > > > > > > > I think TEMPORAL TABLE is quite straightforward and simple, and can > > > satisfy > > > > most existing changelog > > > > data with popular CDC formats. TEMPORAL VIEW is flexible but will > > involve > > > > more SQL codes. I think > > > > we can support them both. > > > > > > > > Best, > > > > Jark > > > > > > > > [1]: > > > > > > > > > > > > > > https://docs.google.com/document/d/1onyIUUdWAHfr_Yd5nZOE7SOExBc6TiW5C4LiL5FrjtQ/edit#heading=h.sz656g8mb2wb > > > > > > > > On Fri, 17 Apr 2020 at 23:52, Fabian Hueske > wrote: > > > > > > > > > Hi, > > > > > > > > > > I agree with most of what Timo said. > > > > > > > > > > The TEMPORAL keyword (which unfortunately might be easily confused > > with > > > > > TEMPORARY...) looks very intuitive and I think using the only time > > > > > attribute for versioning would be a good choice. > > > > > > > > > > However, TEMPORAL TABLE on retraction tables do not solve the full > > > > problem. > > > > > I believe there will be also cases where we need to derive a > temporal > > > > table >
[jira] [Created] (FLINK-17558) Partitions are released in TaskExecutor Main Thread
Gary Yao created FLINK-17558: Summary: Partitions are released in TaskExecutor Main Thread Key: FLINK-17558 URL: https://issues.apache.org/jira/browse/FLINK-17558 Project: Flink Issue Type: Bug Components: Runtime / Coordination Affects Versions: 1.11.0 Reporter: Gary Yao Fix For: 1.11.0 Partitions are released in the main thread of the TaskExecutor (see the stacktrace below). This can lead to missed heartbeats, timeouts of RPCs, etc. because deleting files is blocking I/O. The partitions should be released in a devoted I/O thread pool ({{TaskExecutor#ioExecutor}} is a candidate). {noformat} 2020-05-06T19:13:12.4383402Z "flink-akka.actor.default-dispatcher-35" #3555 prio=5 os_prio=0 tid=0x7f7fcc071000 nid=0x1f3f9 runnable [0x7f7fd302c000] 2020-05-06T19:13:12.4383983Zjava.lang.Thread.State: RUNNABLE 2020-05-06T19:13:12.4384519Zat sun.nio.fs.UnixNativeDispatcher.unlink0(Native Method) 2020-05-06T19:13:12.4384971Zat sun.nio.fs.UnixNativeDispatcher.unlink(UnixNativeDispatcher.java:146) 2020-05-06T19:13:12.4385465Zat sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:231) 2020-05-06T19:13:12.4386000Zat sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) 2020-05-06T19:13:12.4386458Zat java.nio.file.Files.delete(Files.java:1126) 2020-05-06T19:13:12.4386968Zat org.apache.flink.runtime.io.network.partition.FileChannelBoundedData.close(FileChannelBoundedData.java:93) 2020-05-06T19:13:12.4388088Zat org.apache.flink.runtime.io.network.partition.BoundedBlockingSubpartition.checkReaderReferencesAndDispose(BoundedBlockingSubpartition.java:247) 2020-05-06T19:13:12.4388765Zat org.apache.flink.runtime.io.network.partition.BoundedBlockingSubpartition.release(BoundedBlockingSubpartition.java:208) 2020-05-06T19:13:12.4389444Z- locked <0xff836d78> (a java.lang.Object) 2020-05-06T19:13:12.4389905Zat org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:290) 2020-05-06T19:13:12.4390481Zat org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartition(ResultPartitionManager.java:80) 2020-05-06T19:13:12.4391118Z- locked <0x9d452b90> (a java.util.HashMap) 2020-05-06T19:13:12.4391597Zat org.apache.flink.runtime.io.network.NettyShuffleEnvironment.releasePartitionsLocally(NettyShuffleEnvironment.java:153) 2020-05-06T19:13:12.4392267Zat org.apache.flink.runtime.io.network.partition.TaskExecutorPartitionTrackerImpl.stopTrackingAndReleaseJobPartitions(TaskExecutorPartitionTrackerImpl.java:62) 2020-05-06T19:13:12.4392914Zat org.apache.flink.runtime.taskexecutor.TaskExecutor.releaseOrPromotePartitions(TaskExecutor.java:776) 2020-05-06T19:13:12.4393366Zat sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source) 2020-05-06T19:13:12.4393813Zat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 2020-05-06T19:13:12.4394257Zat java.lang.reflect.Method.invoke(Method.java:498) 2020-05-06T19:13:12.4394693Zat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:279) 2020-05-06T19:13:12.4395202Zat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:199) 2020-05-06T19:13:12.4395686Zat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152) 2020-05-06T19:13:12.4396165Zat org.apache.flink.runtime.rpc.akka.AkkaRpcActor$$Lambda$72/775020844.apply(Unknown Source) 2020-05-06T19:13:12.4396606Zat akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) 2020-05-06T19:13:12.4397015Zat 
akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) 2020-05-06T19:13:12.4397447Zat scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) 2020-05-06T19:13:12.4397874Zat akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) 2020-05-06T19:13:12.4398414Zat scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) 2020-05-06T19:13:12.4398879Zat scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) 2020-05-06T19:13:12.4399321Zat scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) 2020-05-06T19:13:12.4399737Zat akka.actor.Actor$class.aroundReceive(Actor.scala:517) 2020-05-06T19:13:12.4400138Zat akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) 2020-05-06T19:13:12.4400552Zat akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) 2020-05-06T19:13:12.4400930Zat akka.actor.ActorCell.invoke(ActorCell.scala:561) 2020-05-06T19:13:12.4401390Zat akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) 2020-05-06T19:13:12.4401763Zat akka.dispatch.Mailbox.run(Mailbox.scala:225) 2020-05-06T19:13:12.4402135Zat akka.dis
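[A sketch of the fix direction the ticket names, not the actual TaskExecutor code: hand the blocking file deletion to the dedicated I/O pool instead of the RPC main thread. The method is simplified; partitionTracker and ioExecutor are assumed to be the existing TaskExecutor fields.]

{code:java}
// Inside TaskExecutor (simplified): releasing partitions deletes files,
// i.e. blocking I/O, so it must not run in the actor/main thread.
public void releaseOrPromotePartitions(
        JobID jobId,
        Set<ResultPartitionID> partitionsToRelease,
        Set<ResultPartitionID> partitionsToPromote) {
    ioExecutor.execute(() ->
            partitionTracker.stopTrackingAndReleaseJobPartitions(partitionsToRelease));
    // ... promotion handling stays as before ...
}
{code}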
[jira] [Created] (FLINK-17559) Backpressure seems to be broken when not going through network
Luis created FLINK-17559:
----------------------------

Summary: Backpressure seems to be broken when not going through network
Key: FLINK-17559
URL: https://issues.apache.org/jira/browse/FLINK-17559
Project: Flink
Issue Type: Bug
Components: API / Core, Connectors / RabbitMQ
Affects Versions: 1.8.2
Reporter: Luis
Attachments: Screenshot from 2020-05-07 13-31-23.png

Backpressure in Flink seems broken. Someone please correct me: from what I understand, it only works across network transfers. If I have a source with no thread sleep, then there is no backpressure, and some operation will accumulate data and crash. I even tried removing chaining with env.disableOperatorChaining().

From this I can conclude that if I have any map function that produces more output than is coming in, it will eventually crash if there is no network boundary dividing them to allow for backpressure. Is this correct?

{code:java}
java.lang.OutOfMemoryError: Java heap space
2020-05-07 18:27:37,942 ERROR org.apache.flink.runtime.util.FatalExitExceptionHandler - FATAL: Thread 'flink-scheduler-1' produced an uncaught exception. Stopping the process...
java.lang.OutOfMemoryError: Java heap space
at akka.dispatch.AbstractNodeQueue.(AbstractNodeQueue.java:32)
at akka.actor.LightArrayRevolverScheduler$TaskQueue.(LightArrayRevolverScheduler.scala:305)
at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:270)
at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)
at java.lang.Thread.run(Thread.java:748)
2020-05-07 18:27:35,725 ERROR org.apache.flink.runtime.util.FatalExitExceptionHandler - FATAL: Thread 'flink-metrics-8' produced an uncaught exception. Stopping the process...
java.lang.OutOfMemoryError: Java heap space
2020-05-07 18:27:35,725 ERROR com.rabbitmq.client.impl.ForgivingExceptionHandler - An unexpected connection driver error occured
java.lang.OutOfMemoryError: Java heap space
at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:120)
at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:164)
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:580)
at java.lang.Thread.run(Thread.java:748)
{code}

https://stackoverflow.com/questions/61465789/how-do-i-create-a-flink-richparallelsourcefunction-with-backpressure

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
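[Not a conclusion from the ticket. One hedged mitigation, assuming the RabbitMQ client itself is buffering unboundedly (which the AMQConnection frames in the stack trace hint at): bound the broker-side prefetch so the client holds at most a fixed number of unacknowledged messages. Host and limit are placeholders.]

{code:java}
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class BoundedPrefetchSketch {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // placeholder

        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {
            // The broker pushes at most 250 unacknowledged messages to this
            // consumer, so a slow pipeline cannot grow the client-side
            // buffer without bound.
            channel.basicQos(250);
            // ... register the consumer and process deliveries ...
        }
    }
}
{code}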
[jira] [Created] (FLINK-17560) No Slots available exception in Apache Flink Job Manager while Scheduling
josson paul kalapparambath created FLINK-17560:
--------------------------------------------------

Summary: No Slots available exception in Apache Flink Job Manager while Scheduling
Key: FLINK-17560
URL: https://issues.apache.org/jira/browse/FLINK-17560
Project: Flink
Issue Type: Bug
Affects Versions: 1.8.3
Environment: Flink version 1.8.3
Session cluster
Reporter: josson paul kalapparambath

Setup
-----
Flink version 1.8.3
Zookeeper HA cluster
1 ResourceManager/Dispatcher (same node)
1 TaskManager
4 pipelines running with various parallelisms

Issue
-----
Occasionally, when the JobManager gets restarted, we noticed that not all of the pipelines get scheduled. The error reported by the JobManager is 'not enough slots are available'. This should not be the case, because the TaskManager was deployed with sufficient slots for the number of pipelines/parallelisms we have. We further noticed that the slot report sent by the TaskManager contains slots filled with old CANCELLED job IDs. I am not sure why the TaskManager still holds the details of the old jobs. A thread dump on the TaskManager confirms that the old pipelines are not running.

I am aware of https://issues.apache.org/jira/browse/FLINK-12865, but this is not the issue happening in this case.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Created] (FLINK-17561) Table program cannot be compiled
linweijiang created FLINK-17561: --- Summary: Table program cannot be compiled Key: FLINK-17561 URL: https://issues.apache.org/jira/browse/FLINK-17561 Project: Flink Issue Type: Bug Components: Table SQL / Runtime Affects Versions: 1.10.0 Reporter: linweijiang Exception info is: /* 1 */ /* 2 */ public class SourceConversion$136 extends org.apache.flink.table.runtime.operators.AbstractProcessStreamOperator /* 3 */ implements org.apache.flink.streaming.api.operators.OneInputStreamOperator { /* 4 */ /* 5 */ private final Object[] references; /* 6 */ private transient org.apache.flink.table.dataformat.DataFormatConverters.PojoConverter converter$135; /* 7 */ private final org.apache.flink.streaming.runtime.streamrecord.StreamRecord outElement = new org.apache.flink.streaming.runtime.streamrecord.StreamRecord(null); /* 8 */ /* 9 */ public SourceConversion$136( /* 10 */ Object[] references, /* 11 */ org.apache.flink.streaming.runtime.tasks.StreamTask task, /* 12 */ org.apache.flink.streaming.api.graph.StreamConfig config, /* 13 */ org.apache.flink.streaming.api.operators.Output output) throws Exception { /* 14 */ this.references = references; /* 15 */ converter$135 = (((org.apache.flink.table.dataformat.DataFormatConverters.PojoConverter) references[0])); /* 16 */ this.setup(task, config, output); /* 17 */ } /* 18 */ /* 19 */ @Override /* 20 */ public void open() throws Exception { /* 21 */ super.open(); /* 22 */ /* 23 */ } /* 24 */ /* 25 */ @Override /* 26 */ public void processElement(org.apache.flink.streaming.runtime.streamrecord.StreamRecord element) throws Exception { /* 27 */ org.apache.flink.table.dataformat.BaseRow in1 = (org.apache.flink.table.dataformat.BaseRow) (org.apache.flink.table.dataformat.BaseRow) converter$135.toInternal((com.xxx.flink.bean.RegisterBean) element.getValue()); /* 28 */ /* 29 */ /* 30 */ /* 31 */ output.collect(outElement.replace(in1)); /* 32 */ } /* 33 */ /* 34 */ /* 35 */ /* 36 */ @Override /* 37 */ public void close() throws Exception { /* 38 */ super.close(); /* 39 */ /* 40 */ } /* 41 */ /* 42 */ /* 43 */ } /* 44 */ 2020-05-08 10:35:24,144 ERROR org.apache.flink.runtime.webmonitor.handlers.JarPlanHandler - Unhandled exception. org.apache.flink.util.FlinkRuntimeException: org.apache.flink.api.common.InvalidProgramException: Table program cannot be compiled. This is a bug. Please file an issue. 
at org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:68) at org.apache.flink.table.runtime.generated.GeneratedClass.compile(GeneratedClass.java:78) at org.apache.flink.table.runtime.generated.GeneratedClass.getClass(GeneratedClass.java:96) at org.apache.flink.table.runtime.operators.CodeGenOperatorFactory.getStreamOperatorClass(CodeGenOperatorFactory.java:62) at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.preValidate(StreamingJobGraphGenerator.java:214) at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createJobGraph(StreamingJobGraphGenerator.java:149) at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createJobGraph(StreamingJobGraphGenerator.java:104) at org.apache.flink.streaming.api.graph.StreamGraph.getJobGraph(StreamGraph.java:777) at org.apache.flink.streaming.api.graph.StreamGraphTranslator.translateToJobGraph(StreamGraphTranslator.java:52) at org.apache.flink.client.FlinkPipelineTranslationUtil.getJobGraph(FlinkPipelineTranslationUtil.java:43) at org.apache.flink.client.program.PackagedProgramUtils.createJobGraph(PackagedProgramUtils.java:57) at org.apache.flink.runtime.webmonitor.handlers.utils.JarHandlerUtils$JarHandlerContext.toJobGraph(JarHandlerUtils.java:128) at org.apache.flink.runtime.webmonitor.handlers.JarPlanHandler.lambda$handleRequest$1(JarPlanHandler.java:100) at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.flink.shaded.guava18.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.flink.api.common.InvalidProgramException: Table program cannot be compiled. This is a bug. Please file an issue. at org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203) at org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache.get(LocalCache.java:3937) at org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4739) at org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:66) ... 16 more Caused by: org.apache.flink.api.common.InvalidProgramException: Table program cannot be compiled. This is a b
Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL
Hi,

I agree with what Fabian said above. Besides, IMO, (3) has a lower priority and will involve many more things. It makes sense to me to do it in two phases.

Regarding (3), the key point in converting an append-only table into a changelog table is that the framework should know the operation type, so we introduced a special CREATE VIEW syntax for this in the documentation [1]. Here is an example:

-- my_binlog table is registered as an append-only table
CREATE TABLE my_binlog (
  before ROW<...>,
  after ROW<...>,
  op STRING,
  op_ms TIMESTAMP(3)
) WITH (
  'connector.type' = 'kafka',
  ...
);

-- interpret my_binlog as a changelog on the op_type and id key
CREATE VIEW my_table AS
  SELECT after.*
  FROM my_binlog
  CHANGELOG OPERATION BY op
  UPDATE KEY BY (id);

-- my_table will materialize the insert/delete/update changes
-- if we have 4 records in dbz that
--   a create for 1004
--   an update for 1004
--   a create for 1005
--   a delete for 1004
> SELECT COUNT(*) FROM my_table;
+----------+
| COUNT(*) |
+----------+
|        1 |
+----------+

Best,
Jark

[1]: https://docs.google.com/document/d/1onyIUUdWAHfr_Yd5nZOE7SOExBc6TiW5C4LiL5FrjtQ/edit#heading=h.sz656g8mb2wb

On Fri, 8 May 2020 at 00:24, Fabian Hueske wrote:

> Thanks for the summary Konstantin.
> I think you got all points right.
>
> IMO, the way forward would be to work on a FLIP to define
> * the concept of temporal tables,
> * how to feed them from retraction tables,
> * how to feed them from append-only tables,
> * their specification with CREATE TEMPORAL TABLE,
> * how to use temporal tables in temporal table joins,
> * how (if at all) to use temporal tables in other types of queries.
>
> We would keep the LATERAL TABLE syntax because it is used for regular table-valued functions.
> However, we would probably remove the TemporalTableFunction (which is a built-in table-valued function) after we have deprecated it for a while.
>
> Cheers, Fabian
>
> On Thu, May 7, 2020 at 18:03, Konstantin Knauf <kna...@apache.org> wrote:
>
>> Hi everyone,
>>
>> Thanks everyone for joining the discussion on this. Please let me summarize what I have understood so far.
>>
>> 1) For joining an append-only table and a temporal table, the "FOR SYSTEM_TIME AS OF " syntax seems to be preferred (Fabian, Timo, Seth).
>>
>> 2) To define a temporal table based on a changelog stream from an external system, CREATE TEMPORAL TABLE (as suggested by Timo/Fabian) could be used.
>>
>> 3) In order to also support temporal tables derived from an append-only stream, we either need to support TEMPORAL VIEW (as mentioned by Fabian) or need to have a way to convert an append-only table into a changelog table (briefly discussed in [1]). It is not completely clear to me how a temporal table based on an append-only table would be defined with the syntax proposed in [1] and 2). @Jark Wu could you elaborate a bit on that?
>>
>> How do we move forward with this?
>>
>> * It seems that a two-phased approach (1 + 2 now, 3 later) makes sense. What do you think?
>> * If we proceed like this, what would this mean for the current syntax of LATERAL TABLE? Would we keep it? Would we eventually deprecate and drop it? Since only after 3) we would be on par with the current temporal table function join, I assume we could only drop it
>> thereafter.
>> >> Thanks, Konstantin >> >> [1] >> >> https://docs.google.com/document/d/1onyIUUdWAHfr_Yd5nZOE7SOExBc6TiW5C4LiL5FrjtQ/edit#heading=h.kduaw9moein6 >> >> >> On Sat, Apr 18, 2020 at 3:07 PM Jark Wu wrote: >> >> > Hi Fabian, >> > >> > Just to clarify a little bit, we decided to move the "converting >> > append-only table into changelog table" into future work. >> > So FLIP-105 only introduced some CDC formats (debezium) and new >> TableSource >> > interfaces proposed in FLIP-95. >> > I should have started a new FLIP for the new CDC formats and keep >> FLIP-105 >> > as it is to avoid the confusion, sorry about that. >> > >> > Best, >> > Jark >> > >> > >> > On Sat, 18 Apr 2020 at 00:35, Fabian Hueske wrote: >> > >> > > Thanks Jark! >> > > >> > > I certainly need to read up on FLIP-105 (and I'll try to adjust my >> > > terminology to changelog table from now on ;-) ) >> > > If FLIP-105 addresses the issue of converting an append-only table >> into a >> > > changelog table that upserts on primary key (basically what the VIEW >> > > definition in my first email did), >> > > TEMPORAL VIEWs become much less important. >> > > In that case, we would be well served with TEMPORAL TABLE and TEMPORAL >> > VIEW >> > > would be a nice-to-have feature for some later time. >> > > >> > > Cheers, Fabian >> > > >> > > >> > > >> > > >> > > >> > > >> > > Am Fr., 17. Apr. 2020 um 18:13 Uhr schrieb Jark Wu > >: >> > > >> > > > Hi Fabian, >> > > > >> > > > I think converting an append-only table into temporal table contains >> > two >> > > > things: >> > > > (1) converting append-only table into changelog table (or retraction >> >
[jira] [Created] (FLINK-17562) Monitoring REST API page has duplicate items
AT-Fieldless created FLINK-17562:
Summary: Monitoring REST API page has duplicate items
Key: FLINK-17562
URL: https://issues.apache.org/jira/browse/FLINK-17562
Project: Flink
Issue Type: Bug
Components: Documentation
Affects Versions: 1.10.0
Reporter: AT-Fieldless

The [Monitoring REST API|https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/rest_api.html#jars-jarid-plan-1] page has two */jars/:jarid/plan* entries, and their descriptions are the same. So I think one of them should be removed.
[jira] [Created] (FLINK-17563) YARN Setup page has a broken link
AT-Fieldless created FLINK-17563:
Summary: YARN Setup page has a broken link
Key: FLINK-17563
URL: https://issues.apache.org/jira/browse/FLINK-17563
Project: Flink
Issue Type: Bug
Reporter: AT-Fieldless

In the Flink YARN Session section, the _FAQ section_ link is broken. Is it deprecated?
[jira] [Created] (FLINK-17564) Inflight data of incoming channel may be disordered for unaligned checkpoint
Yingjie Cao created FLINK-17564:
Summary: Inflight data of incoming channel may be disordered for unaligned checkpoint
Key: FLINK-17564
URL: https://issues.apache.org/jira/browse/FLINK-17564
Project: Flink
Issue Type: Bug
Components: Runtime / Checkpointing
Affects Versions: 1.11.0
Reporter: Yingjie Cao
Fix For: 1.11.0

For unaligned checkpoints, when checkpointing the inflight data of an incoming channel, both the task thread and the Netty thread may add data to the channel state writer. More specifically, the task thread will first request the inflight buffers from the input channel and add them to the channel state writer, and then the Netty thread will add the follow-up buffers (if any) to the channel state writer. The buffer adding of the task thread and the Netty thread is not synchronized, so the Netty thread may add buffers before the task thread does, which leads to disordering of the data and corruption of the data stream.
[jira] [Created] (FLINK-17565) Bump fabric8 version from 4.5.2 to 4.10.1
Canbin Zheng created FLINK-17565:
Summary: Bump fabric8 version from 4.5.2 to 4.10.1
Key: FLINK-17565
URL: https://issues.apache.org/jira/browse/FLINK-17565
Project: Flink
Issue Type: Improvement
Components: Deployment / Kubernetes
Reporter: Canbin Zheng

Currently, we are using version 4.5.2; it would be better to upgrade to 4.10.1 for features like K8s 1.17 support, LeaderElection support, etc. For more details, please refer to the [fabric8 releases|https://github.com/fabric8io/kubernetes-client/releases].
[jira] [Created] (FLINK-17566) Fix potential K8s resources leak after JobManager finishes in Application mode
Canbin Zheng created FLINK-17566:
Summary: Fix potential K8s resources leak after JobManager finishes in Application mode
Key: FLINK-17566
URL: https://issues.apache.org/jira/browse/FLINK-17566
Project: Flink
Issue Type: Bug
Components: Deployment / Kubernetes
Reporter: Canbin Zheng

FLINK-10934 introduces application mode support in the native K8s setup, but as discussed in https://github.com/apache/flink/pull/12003, there is a large probability that all the K8s resources leak after the JobManager finishes, except that the replica count of the Deployment is scaled down to 0. We need to find out the root cause and fix it.

This may be related to the way the fabric8 SDK deletes a Deployment. It splits the procedure into three steps as follows:
# Scale the replicas down to 0
# Wait until the scale-down succeeds
# Delete the ReplicaSet
Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL
I might have missed something, but why do we need a new "TEMPORAL TABLE" syntax?

According to Fabian's first mail:

> Hence, the requirements for a temporal table are:
> * The temporal table has a primary key / unique attribute
> * The temporal table has a time-attribute that defines the start of the
> validity interval of a row (processing time or event time)
> * The system knows that the history of the table is tracked and can infer
> how to look up a version.

I think a primary key plus a proper event-time attribute is already sufficient. So a join query like:

"Fact join Dim FOR SYSTEM_TIME AS OF Fact.some_event_time ON Fact.id = Dim.id"

would mean that for every record belonging to Fact, we use Fact.some_event_time as Dim's version (which keeps only the records from the Dim table with an event time less than or equal to Fact.some_event_time, and only one record for each primary key). The temporal behavior is actually triggered by the join syntax "FOR SYSTEM_TIME AS OF Fact.some_event_time", not by the DDL description.

Best,
Kurt
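To make Kurt's point concrete, here is a minimal sketch of a dimension table declared with nothing more than a primary key and an event-time attribute, so that the temporal semantics come entirely from the join clause rather than from any TEMPORAL keyword in the DDL. The table layouts, the PRIMARY KEY ... NOT ENFORCED clause, and the WATERMARK definitions are illustrative assumptions, not settled syntax:

-- dimension table: just a primary key plus an event-time attribute
CREATE TABLE Dim (
  id BIGINT,
  rate DECIMAL(10, 2),
  update_time TIMESTAMP(3),
  PRIMARY KEY (id) NOT ENFORCED,
  WATERMARK FOR update_time AS update_time - INTERVAL '5' SECOND
) WITH (...);

-- append-only fact table with an event-time attribute
CREATE TABLE Fact (
  id BIGINT,
  amount DECIMAL(10, 2),
  some_event_time TIMESTAMP(3),
  WATERMARK FOR some_event_time AS some_event_time - INTERVAL '5' SECOND
) WITH (...);

-- the temporal behavior is triggered here, not in the DDL: each Fact row
-- sees, per id, the latest Dim row with update_time <= Fact.some_event_time
SELECT Fact.id, Fact.amount, Dim.rate
FROM Fact
JOIN Dim FOR SYSTEM_TIME AS OF Fact.some_event_time
ON Fact.id = Dim.id;

Under this reading, no dedicated CREATE TEMPORAL TABLE statement would be needed whenever the primary key and the event-time attribute are already declared on the table.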