[GitHub] [flink] wuchong commented on pull request #17038: [FLINK-24024][table-planner] Fix syntax mistake in session Window TVF
wuchong commented on pull request #17038: URL: https://github.com/apache/flink/pull/17038#issuecomment-908084677 Looks good to me. cc @godfreyhe
[GitHub] [flink] gyfora commented on pull request #17022: [FLINK-24020][web] Aggregate HTTP requests before custom netty handlers are getting the data
gyfora commented on pull request #17022: URL: https://github.com/apache/flink/pull/17022#issuecomment-908087499 @flinkbot run azure
[GitHub] [flink] wuchong commented on pull request #16723: [FLINK-22885][table] Supports 'SHOW COLUMNS' syntax.
wuchong commented on pull request #16723: URL: https://github.com/apache/flink/pull/16723#issuecomment-908087936 @RocMarshal Please do not use "git merge" to rebase branches; otherwise the changes are hard to track. Please use "git rebase" instead. IntelliJ IDEA provides an easy tool for this; you can find it via `VCS -> Git -> Rebase`.
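For anyone following along, the command-line equivalent is roughly the following sketch (the remote and branch names are assumptions; adjust them to your fork setup):

```bash
# fetch the latest upstream state
git fetch origin
# replay your branch's commits on top of the upstream master branch
git rebase origin/master
# update the PR branch; --force-with-lease refuses to overwrite remote work you haven't seen
git push --force-with-lease
```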
[GitHub] [flink] flinkbot edited a comment on pull request #16963: [FLINK-23582][docs][table] Add documentation for session window tvf.
flinkbot edited a comment on pull request #16963: URL: https://github.com/apache/flink/pull/16963#issuecomment-904569734 ## CI report: * 3f642deaa05216d448a83522964416044891297b Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23064)
[GitHub] [flink] flinkbot edited a comment on pull request #17022: [FLINK-24020][web] Aggregate HTTP requests before custom netty handlers are getting the data
flinkbot edited a comment on pull request #17022: URL: https://github.com/apache/flink/pull/17022#issuecomment-907158087 ## CI report: * f954017eb4d81acc1283bae464d7247bda38d904 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22963) Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23071)
[GitHub] [flink] flinkbot edited a comment on pull request #17019: [FLINK-23854][connectors/Kafka] Ensure lingering Kafka transactions are aborted reliably.
flinkbot edited a comment on pull request #17019: URL: https://github.com/apache/flink/pull/17019#issuecomment-907120059 ## CI report: * fa25b27d642ab7e6af356ca1e9738a006d48 UNKNOWN * 6120b584b4b4ee9286f4998065f122c9a7de Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23047) * 0a6ed86ec26cf3fd8a85ccb00d5b504aa31c4d9b UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #17044: [FLINK-24010][connector/common] HybridSourceReader/Enumerator delegate checkpoint notifications
flinkbot edited a comment on pull request #17044: URL: https://github.com/apache/flink/pull/17044#issuecomment-907990627 ## CI report: * d0e16223f3b69b417b90c1320117168b8dc5e59e Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23060)
[GitHub] [flink] flinkbot edited a comment on pull request #17045: [FLINK-23924][python][examples] Add PyFlink examples
flinkbot edited a comment on pull request #17045: URL: https://github.com/apache/flink/pull/17045#issuecomment-908007478 ## CI report: * 77e0e48a0f9c3f2f62e6e19ca5f4253060c13b07 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23061)
[jira] [Assigned] (FLINK-23686) KafkaSource metric "commitsSucceeded" should count per-commit instead of per-partition
[ https://issues.apache.org/jira/browse/FLINK-23686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler reassigned FLINK-23686: Assignee: Qingsheng Ren

> KafkaSource metric "commitsSucceeded" should count per-commit instead of per-partition
> Key: FLINK-23686
> URL: https://issues.apache.org/jira/browse/FLINK-23686
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Kafka
> Affects Versions: 1.14.0, 1.13.2
> Reporter: Qingsheng Ren
> Assignee: Qingsheng Ren
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.14.0, 1.13.3
>
> Currently, if a successful offset commit includes multiple topic partitions (let's say 4), the counter will increase by 4 instead of 1
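The intended per-commit semantics can be sketched with Kafka's `OffsetCommitCallback`, which fires once per commit attempt regardless of how many partitions the commit covers. The holder class below is purely illustrative; Flink's actual metric wiring in the Kafka source differs:

```java
import java.util.Map;

import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.consumer.OffsetCommitCallback;
import org.apache.kafka.common.TopicPartition;

class CommitMetricsSketch {
    private long commitsSucceeded;

    OffsetCommitCallback perCommitCallback() {
        return (Map<TopicPartition, OffsetAndMetadata> offsets, Exception error) -> {
            if (error == null) {
                // Count once per successful commit, not once per partition:
                // offsets.size() may be 4, but this is still a single commit.
                commitsSucceeded++;
            }
        };
    }
}
```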
[GitHub] [flink] flinkbot commented on pull request #17046: [FLINK-24049][python] Properly handle field types that need conversion in TupleTypeInfo
flinkbot commented on pull request #17046: URL: https://github.com/apache/flink/pull/17046#issuecomment-908096451 ## CI report: * 0ce3f1e021ffd0a8278412dda43198f8c1eb0b8c UNKNOWN
[GitHub] [flink] flinkbot commented on pull request #17047: [FLINK-24027][filesystems] Remove excessive dependencies from NOTICE files
flinkbot commented on pull request #17047: URL: https://github.com/apache/flink/pull/17047#issuecomment-908096579 ## CI report: * cba327f4c75a11dcb018505333abf42e50f0b7f7 UNKNOWN
[GitHub] [flink] zentol commented on a change in pull request #17027: [FLINK-23954][tests] Fix e2e test test_queryable_state_restart_tm.sh
zentol commented on a change in pull request #17027: URL: https://github.com/apache/flink/pull/17027#discussion_r698253003

## File path: flink-end-to-end-tests/flink-queryable-state-test/src/main/java/org/apache/flink/streaming/tests/queryablestate/QsStateProducer.java

```diff
@@ -158,24 +161,34 @@ public void open(Configuration parameters) throws Exception {
         stateDescriptor.setQueryable(QsConstants.QUERY_NAME);
         state = getRuntimeContext().getMapState(stateDescriptor);
-        updateCount();
-
-        LOG.info("Open {} with a count of {}.", getClass().getSimpleName(), count);
-    }
-
-    private void updateCount() throws Exception {
-        count = Iterables.size(state.keys());
+        countsAtCheckpoint = new HashMap<>();
+        count = -1;
+        lastCompletedCheckpoint = -1;
     }

     @Override
     public void flatMap(Email value, Collector out) throws Exception {
         state.put(value.getEmailId(), new EmailInformation(value));
-        updateCount();
+        count = Iterables.size(state.keys());
     }

     @Override
     public void notifyCheckpointComplete(long checkpointId) {
-        System.out.println("Count on snapshot: " + count); // we look for it in the test
+        if (checkpointId > lastCompletedCheckpoint) {
```

Review comment: Is this required to ensure the count (as seen from the test) does not decrease?
[GitHub] [flink] zentol commented on pull request #17034: [BP-1.13][FLINK-23947] Improve logging of granting/revoking leadership in DefaultDispatcherRunner
zentol commented on pull request #17034: URL: https://github.com/apache/flink/pull/17034#issuecomment-908106689 Could you rebase the PR again? 😢
[jira] [Updated] (FLINK-24044) Errors are output when compiling flink-runtime-web
[ https://issues.apache.org/jira/browse/FLINK-24044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-24044: Component/s: Build System

> Errors are output when compiling flink-runtime-web
> Key: FLINK-24044
> URL: https://issues.apache.org/jira/browse/FLINK-24044
> Project: Flink
> Issue Type: Bug
> Components: Build System, Runtime / Web Frontend
> Affects Versions: 1.14.0
> Reporter: Zhilong Hong
> Priority: Minor
>
> When compiling the module {{flink-runtime-web}}, the terminal is filled with errors like the following:
> {code:bash}
> [ERROR] - Generating browser application bundles (phase: setup)...
> [ERROR]
> [ERROR] Compiling @angular/core : es2015 as esm2015
> [ERROR]
> [ERROR] Compiling @angular/common : es2015 as esm2015
> [ERROR]
> [ERROR] Compiling @angular/platform-browser : es2015 as esm2015
> [ERROR]
> [ERROR] Compiling @angular/platform-browser-dynamic : es2015 as esm2015
> [ERROR]
> [ERROR] Compiling @angular/router : es2015 as esm2015
> ...
> {code}
> Although it doesn't break the compilation, we should probably fix this. I'm not familiar with the module flink-runtime-web or Angular. I found some references that may be useful:
> [https://github.com/angular/angular/issues/36513]
> [https://github.com/angular/angular/issues/31853]
> [https://stackoverflow.com/questions/57220392/even-though-i-am-using-target-esm2015-ivy-is-compiles-as-a-esm5-module/57220445#57220445]
> [https://stackoverflow.com/questions/61304548/simple-angular-9-project-compiles-whole-angular-material-es2015-as-esm2015]
[jira] [Updated] (FLINK-23555) Improve common subexpression elimination by using local references
[ https://issues.apache.org/jira/browse/FLINK-23555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] weibowen updated FLINK-23555: Description:

When we write a SQL statement like
{code:java}
select udf2(udf1(field)), udf3(udf1(field)) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, this has a huge impact on the whole task: the more times it is invoked, the bigger the impact. I hope that no matter how many times udf1(field) is written in the SQL, Flink will take advantage of common subexpression elimination and invoke it only once. I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic; I read the source from Kafka and sink into a blackhole, with parallelism 1. The UDF `testcse` does nothing except sleep for 20 milliseconds, while `testcse2`, `testcse3` and `testcse4` are the same UDF under different aliases that do nothing at all. As expected, the throughput after the optimization is approximately 3 times higher than before, since `testcse(sid)` appears 3 times in the SQL.

> Improve common subexpression elimination by using local references
> Key: FLINK-23555
> URL: https://issues.apache.org/jira/browse/FLINK-23555
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / Planner
> Reporter: weibowen
> Priority: Major
> Fix For: 1.14.0
> Attachments: performance_after_optimization.png, performance_before_optimization.png, sql.png, udf.png
[GitHub] [flink] zhuzhurk commented on pull request #16856: [FLINK-23833][coordination] Individually clean up the cache for ShuffleDescriptors
zhuzhurk commented on pull request #16856: URL: https://github.com/apache/flink/pull/16856#issuecomment-908108473 Merging.
[jira] [Updated] (FLINK-23555) Improve common subexpression elimination by using local references
[ https://issues.apache.org/jira/browse/FLINK-23555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] weibowen updated FLINK-23555: Issue Type: Bug (was: Improvement)
[GitHub] [flink] flinkbot edited a comment on pull request #17019: [FLINK-23854][connectors/Kafka] Ensure lingering Kafka transactions are aborted reliably.
flinkbot edited a comment on pull request #17019: URL: https://github.com/apache/flink/pull/17019#issuecomment-907120059 ## CI report: * fa25b27d642ab7e6af356ca1e9738a006d48 UNKNOWN * 6120b584b4b4ee9286f4998065f122c9a7de Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23047) * 0a6ed86ec26cf3fd8a85ccb00d5b504aa31c4d9b Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23068)
[GitHub] [flink] flinkbot edited a comment on pull request #17011: [FLINK-23663][table-planner] Reduce state size of ChangelogNormalize
flinkbot edited a comment on pull request #17011: URL: https://github.com/apache/flink/pull/17011#issuecomment-907015222 ## CI report: * e13ee18ff9f927cb23c670236f83956873cfcc1e Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22958) * 211af17558902443958144ba9f202b75c719982e UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #17022: [FLINK-24020][web] Aggregate HTTP requests before custom netty handlers are getting the data
flinkbot edited a comment on pull request #17022: URL: https://github.com/apache/flink/pull/17022#issuecomment-907158087 ## CI report: * f954017eb4d81acc1283bae464d7247bda38d904 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23071) Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22963)
[GitHub] [flink] flinkbot edited a comment on pull request #17038: [FLINK-24024][table-planner] Fix syntax mistake in session Window TVF
flinkbot edited a comment on pull request #17038: URL: https://github.com/apache/flink/pull/17038#issuecomment-907732897 ## CI report: * c6982f9e3518e2908a4c206d0aad363d22fab2d7 UNKNOWN * 209ebeb33f4cb30e50d87728af0db0c9088e0303 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23059)
[GitHub] [flink] flinkbot edited a comment on pull request #17047: [FLINK-24027][filesystems] Remove excessive dependencies from NOTICE files
flinkbot edited a comment on pull request #17047: URL: https://github.com/apache/flink/pull/17047#issuecomment-908096579 ## CI report: * cba327f4c75a11dcb018505333abf42e50f0b7f7 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23073)
[GitHub] [flink] flinkbot edited a comment on pull request #17046: [FLINK-24049][python] Properly handle field types that need conversion in TupleTypeInfo
flinkbot edited a comment on pull request #17046: URL: https://github.com/apache/flink/pull/17046#issuecomment-908096451 ## CI report: * 0ce3f1e021ffd0a8278412dda43198f8c1eb0b8c Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23072)
[jira] [Updated] (FLINK-23555) Improve common subexpression elimination by using local references
[ https://issues.apache.org/jira/browse/FLINK-23555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] weibowen updated FLINK-23555: Description:

When we write a SQL statement like
{code:java}
select udf2(udf1(field)), udf3(udf1(field)) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, this has a huge impact on the whole task: the more times it is invoked, the bigger the impact. I hope that no matter how many times udf1(field) is written in the SQL, Flink will take advantage of common subexpression elimination and invoke it only once. I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic:
!sql.png!
I read the source from Kafka and sink into a blackhole, with parallelism 1. The UDF `testcse` does nothing except sleep for 20 milliseconds, while `testcse2`, `testcse3` and `testcse4` are the same UDF under different aliases that do nothing at all:
!udf.png!
As expected, the throughput after the optimization is approximately 3 times higher than before, since `testcse(sid)` appears 3 times in the SQL.
before:
!performance_before_optimization.png!
after:
!performance_after_optimization.png!
[GitHub] [flink] zhuzhurk closed pull request #16856: [FLINK-23833][coordination] Individually clean up the cache for ShuffleDescriptors
zhuzhurk closed pull request #16856: URL: https://github.com/apache/flink/pull/16856
[jira] [Updated] (FLINK-23833) Cache of ShuffleDescriptors should be individually cleaned up
[ https://issues.apache.org/jira/browse/FLINK-23833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-23833: Fix Version/s: 1.15.0

> Cache of ShuffleDescriptors should be individually cleaned up
> Key: FLINK-23833
> URL: https://issues.apache.org/jira/browse/FLINK-23833
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.14.0
> Reporter: Zhilong Hong
> Assignee: Zhilong Hong
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.14.0, 1.15.0
>
> In FLINK-23005, we introduced a cache of the compressed serialized ShuffleDescriptors to improve deployment performance. To make sure the cache doesn't linger and become a burden for GC, it is cleaned up when the partition is released or reset for a new execution. In the implementation, the cache of the entire IntermediateResult is cleaned up, because a partition used to be released only when the entire IntermediateResult was released.
>
> However, since FLINK-22017, a BLOCKING result partition is allowed to become consumable individually. That also means a result partition doesn't need to wait for the other result partitions and can be released individually. After this change, the following scenario is possible: when one result partition finishes, the cache of the IntermediateResult on the blob store is deleted while other result partitions of the same IntermediateResult are still being deployed to TaskExecutors. When those TaskExecutors then try to download the TDD from the blob store, they find the blob deleted and get stuck.
>
> This bug only occurs for jobs with POINTWISE BLOCKING edges, and only when {{blob.offload.minsize}} is set to an extremely small value, since the ShuffleDescriptors of POINTWISE BLOCKING edges are usually small. To solve this issue, we just need to clean up the cache of ShuffleDescriptors individually.
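The core idea of the fix, per-partition instead of per-IntermediateResult cleanup, can be sketched as below. All names here are illustrative assumptions, not Flink's actual classes (the real cache also interacts with the blob store):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ShuffleDescriptorCacheSketch {
    // cached serialized descriptors, keyed per result partition
    private final Map<String, byte[]> cacheByPartition = new ConcurrentHashMap<>();

    void put(String partitionId, byte[] serializedDescriptor) {
        cacheByPartition.put(partitionId, serializedDescriptor);
    }

    // Called when a single partition is released: only its own entry is
    // dropped, so the descriptors of sibling partitions that are still
    // being deployed remain available to TaskExecutors.
    void releasePartition(String partitionId) {
        cacheByPartition.remove(partitionId);
    }
}
```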
[GitHub] [flink] zentol commented on pull request #17022: [FLINK-24020][web] Aggregate HTTP requests before custom netty handlers are getting the data
zentol commented on pull request #17022: URL: https://github.com/apache/flink/pull/17022#issuecomment-908116741 The HistoryServerTest is getting stuck on CI, which is likely caused by this PR.
[GitHub] [flink] beyond1920 commented on pull request #16997: [FLINK-23287][docs][table] Create user document for Window Join in SQL
beyond1920 commented on pull request #16997: URL: https://github.com/apache/flink/pull/16997#issuecomment-908118313 @wuchong Jark, I've updated the input schema to use lower-case, non-reserved-keyword field names.
[jira] [Commented] (FLINK-23971) PulsarSourceITCase.testIdleReader failed on azure
[ https://issues.apache.org/jira/browse/FLINK-23971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406602#comment-17406602 ] Hang Ruan commented on FLINK-23971: Fine, I will provide a PR to fix it.

> PulsarSourceITCase.testIdleReader failed on azure
> Key: FLINK-23971
> URL: https://issues.apache.org/jira/browse/FLINK-23971
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Pulsar
> Affects Versions: 1.14.0
> Reporter: Roman Khachatryan
> Priority: Blocker
> Labels: test-stability
> Fix For: 1.14.0
> Attachments: error.log
>
> {code:java}
> [INFO] Running org.apache.flink.connector.pulsar.source.PulsarSourceITCase
> [ERROR] Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 353.527 s <<< FAILURE! - in org.apache.flink.connector.pulsar.source.PulsarSourceITCase
> [ERROR] testIdleReader{TestEnvironment, ExternalContext}[2] Time elapsed: 4.549 s <<< FAILURE!
> java.lang.AssertionError:
> Expected: Records consumed by Flink should be identical to test data and preserve the order in multiple splits
> but: Unexpected record 'tj7MpFRWX95GzBpSF3CCjxKSal6bRhR0aF'
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
> at org.apache.flink.connectors.test.common.testsuites.SourceTestSuiteBase.testIdleReader(SourceTestSuiteBase.java:193)
> {code}
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22819&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=24448]
> This is the same error as in FLINK-23828 (kafka).
[jira] [Closed] (FLINK-23833) Cache of ShuffleDescriptors should be individually cleaned up
[ https://issues.apache.org/jira/browse/FLINK-23833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-23833. Resolution: Fixed

Fixed via:
master:
8d3ec4a2fb1c9e07ee386dced15e1f64f359c6a2
ef6ef809b17b4547194c247975fb4566a19679f0
eba8f574c550123004ed4f557cef28ff557cd88e
1b4340a416bb7da39bb194da2faf3fc5e83fc5d9
1d38803c408fd4f4984913605030394d56bb160e
release-1.14:
f1eaf314ef232c1e604a49122857bbd99ad71a97
d4f8bb18ebe98ab4fd0d08ec24b00794d732cb38
f7bedb0603c33cb4e25c62c9899edb709b264371
daf997b7a13a62e251a9275a1b631697dea413bd
be857d1fdf8853b0315152521e27714ced6b1204
[jira] [Commented] (FLINK-24043) Reuse the code of 'check savepoint preconditions'.
[ https://issues.apache.org/jira/browse/FLINK-24043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406605#comment-17406605 ] Roc Marshal commented on FLINK-24043: Thanks [~jark] for the reply. Hi, [~gaoyunhaii] [~tangyun] Could you help me to review it if you have free time? Thank you.

> Reuse the code of 'check savepoint preconditions'.
> Key: FLINK-24043
> URL: https://issues.apache.org/jira/browse/FLINK-24043
> Project: Flink
> Issue Type: Technical Debt
> Components: Runtime / Task
> Reporter: Roc Marshal
> Priority: Minor
> Labels: pull-request-available
>
> [here|https://github.com/apache/flink/blob/da944a8c90477d7b0210024028abd1011e250f14/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SchedulerBase.java#L822]
[jira] [Commented] (FLINK-23555) Improve common subexpression elimination by using local references
[ https://issues.apache.org/jira/browse/FLINK-23555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406607#comment-17406607 ] Enze Liu commented on FLINK-23555: The description has been updated. Please help check again. Thanks. Our approach is to keep a record of the local reference and implement {{visitLocalRef}} in {{ExprCodeGenerator}}. We can come up with the PR if needed.
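To make the proposal concrete, the sketch below shows the effect of binding a shared subexpression to a local reference, which is what a {{visitLocalRef}} implementation would let the generated code do. It is a standalone illustration with made-up UDFs, not actual planner-generated code:

```java
public class CseSketch {

    // stands in for the expensive UDF: sleeps 20 ms per call
    static int udf1(int x) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return x + 1;
    }

    static int udf2(int a, int b) { return a * b; }

    static int udf3(int a) { return a - 1; }

    public static void main(String[] args) {
        int field = 42;

        // Without CSE: udf1 is evaluated once per occurrence (40 ms here).
        int before = udf2(udf1(field), udf3(udf1(field)));

        // With CSE: the common subexpression is evaluated once (20 ms),
        // bound to a local reference, and reused everywhere it occurs.
        int local0 = udf1(field);
        int after = udf2(local0, udf3(local0));

        System.out.println(before == after); // true, at half the UDF cost
    }
}
```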
[GitHub] [flink] pnowojski commented on a change in pull request #17037: [FLINK-24035][network] Guarantee that the LocalBufferPool is initialized with one buffer
pnowojski commented on a change in pull request #17037: URL: https://github.com/apache/flink/pull/17037#discussion_r698277969

## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/buffer/LocalBufferPool.java

```diff
@@ -216,9 +220,10 @@
         // Lock is only taken, because #checkAvailability asserts it. It's a small penalty for
         // thread safety.
         synchronized (this.availableMemorySegments) {
-            if (checkAvailability()) {
-                availabilityHelper.resetAvailable();
-            }
+            // guarantee that we have one buffer on initialization
```

Review comment:
1. Can you expand the Javadoc on why this is important? Also maybe mention the difference between input and output?
2. Can you respect `numberOfRequiredMemorySegments` here? If this value is zero, we should probably skip requesting.
3. Can you also keep the best-effort code `checkAvailability()` that was requesting all buffers at once if they were available?
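A toy model of the behavior being discussed, guaranteeing one buffer at construction and then taking the rest on a best-effort basis, might look like the following. All names are illustrative assumptions and do not match the real `LocalBufferPool` internals:

```java
import java.util.ArrayDeque;
import java.util.Queue;

class ToyBufferPool {

    interface GlobalPool {
        byte[] requestBlocking(); // blocks until a segment is free
        byte[] tryRequest();      // returns null if none is free right now
    }

    private final Queue<byte[]> availableMemorySegments = new ArrayDeque<>();

    ToyBufferPool(int numberOfRequiredMemorySegments, GlobalPool global) {
        synchronized (availableMemorySegments) {
            // Skip requesting entirely when no segments are required (point 2).
            if (numberOfRequiredMemorySegments > 0) {
                // Hard guarantee: block for the first buffer so the pool is
                // never completely empty after initialization.
                availableMemorySegments.add(global.requestBlocking());
            }
            // Best effort (point 3): grab the remaining required buffers
            // if the global pool can hand them out right now.
            while (availableMemorySegments.size() < numberOfRequiredMemorySegments) {
                byte[] segment = global.tryRequest();
                if (segment == null) {
                    break;
                }
                availableMemorySegments.add(segment);
            }
        }
    }
}
```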
[GitHub] [flink] fapaul commented on a change in pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…
fapaul commented on a change in pull request #16966: URL: https://github.com/apache/flink/pull/16966#discussion_r698282237

## File path: flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/CompoundReducingUpsertWriterStateSerializer.java

## @@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.kafka.table;
+
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.core.io.SimpleVersionedSerializer;
+import org.apache.flink.core.memory.DataInputViewStreamWrapper;
+import org.apache.flink.core.memory.DataOutputViewStreamWrapper;
+import org.apache.flink.table.data.RowData;
+
+import javax.annotation.Nullable;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+class CompoundReducingUpsertWriterStateSerializer<WriterState>
+        implements SimpleVersionedSerializer<CompoundReducingUpsertWriterState<WriterState>> {
+
+    private final TypeSerializer<RowData> rowDataTypeSerializer;
+    @Nullable private final SimpleVersionedSerializer<WriterState> wrappedStateSerializer;
+
+    CompoundReducingUpsertWriterStateSerializer(
+            TypeSerializer<RowData> rowDataTypeSerializer,
+            @Nullable SimpleVersionedSerializer<WriterState> wrappedStateSerializer) {
+        this.wrappedStateSerializer = wrappedStateSerializer;
+        this.rowDataTypeSerializer = checkNotNull(rowDataTypeSerializer);
+    }
+
+    CompoundReducingUpsertWriterStateSerializer(TypeSerializer<RowData> rowDataTypeSerializer) {
+        this(rowDataTypeSerializer, null);
+    }
+
+    @Override
+    public int getVersion() {
+        return 1;
+    }
+
+    @Override
+    public byte[] serialize(CompoundReducingUpsertWriterState<WriterState> state)
+            throws IOException {
+        try (final ByteArrayOutputStream baos = new ByteArrayOutputStream();
+                final DataOutputViewStreamWrapper out = new DataOutputViewStreamWrapper(baos)) {
+            final List<WriterState> wrappedStates = state.getWrappedStates();
+            if (wrappedStateSerializer != null) {
+                out.writeInt(wrappedStateSerializer.getVersion());
+                out.writeInt(wrappedStates.size());
+                for (final WriterState wrappedState : wrappedStates) {
+                    final byte[] serializedWrappedState =
+                            wrappedStateSerializer.serialize(wrappedState);
+                    out.writeInt(serializedWrappedState.length);
+                    out.write(serializedWrappedState);
+                }

Review comment: Thanks for the hint, I did not know that such a class exists, but I think it does not help much here. The `SimpleVersionedSerialization` always writes the version in front of the serialized data, which in this case leads to a lot of duplicated version writes. In general, `SimpleVersionedSerialization` does not seem well suited to writing `List<>` data.
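A minimal sketch of the single-version-header pattern discussed in this thread — writing the wrapped serializer's version once for the whole list, with length-prefixed elements — using only Flink's public `SimpleVersionedSerializer` contract. This is an illustration, not code from the PR; the class and method names are invented:

{code:java}
import org.apache.flink.core.io.SimpleVersionedSerializer;
import org.apache.flink.core.memory.DataInputViewStreamWrapper;
import org.apache.flink.core.memory.DataOutputViewStreamWrapper;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class VersionedListSerialization {

    /** Writes the element serializer's version once, then length-prefixed elements. */
    static <T> byte[] serializeList(
            SimpleVersionedSerializer<T> serializer, List<T> elements) throws IOException {
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
                DataOutputViewStreamWrapper out = new DataOutputViewStreamWrapper(baos)) {
            out.writeInt(serializer.getVersion());
            out.writeInt(elements.size());
            for (T element : elements) {
                final byte[] serialized = serializer.serialize(element);
                out.writeInt(serialized.length);
                out.write(serialized);
            }
            out.flush();
            return baos.toByteArray();
        }
    }

    /** Reads the list back using the single version header written above. */
    static <T> List<T> deserializeList(
            SimpleVersionedSerializer<T> serializer, byte[] data) throws IOException {
        try (ByteArrayInputStream bais = new ByteArrayInputStream(data);
                DataInputViewStreamWrapper in = new DataInputViewStreamWrapper(bais)) {
            final int version = in.readInt();
            final int size = in.readInt();
            final List<T> elements = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                final byte[] serialized = new byte[in.readInt()];
                in.readFully(serialized);
                elements.add(serializer.deserialize(version, serialized));
            }
            return elements;
        }
    }
}
{code}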
[jira] [Commented] (FLINK-23971) PulsarSourceITCase.testIdleReader failed on azure
[ https://issues.apache.org/jira/browse/FLINK-23971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406615#comment-17406615 ] Xintong Song commented on FLINK-23971: -- Thanks [~ruanhang1993], you're assigned. Please move ahead. > PulsarSourceITCase.testIdleReader failed on azure > - > > Key: FLINK-23971 > URL: https://issues.apache.org/jira/browse/FLINK-23971 > Project: Flink > Issue Type: Bug > Components: Connectors / Pulsar >Affects Versions: 1.14.0 >Reporter: Roman Khachatryan >Priority: Blocker > Labels: test-stability > Fix For: 1.14.0 > > Attachments: error.log > > > {code:java} > [INFO] Running org.apache.flink.connector.pulsar.source.PulsarSourceITCase > [ERROR] Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 353.527 s <<< FAILURE! - in > org.apache.flink.connector.pulsar.source.PulsarSourceITCase > [ERROR] testIdleReader{TestEnvironment, ExternalContext}[2] Time elapsed: > 4.549 s <<< FAILURE! > java.lang.AssertionError: > Expected: Records consumed by Flink should be identical to test data and > preserve the order in multiple splits > but: Unexpected record 'tj7MpFRWX95GzBpSF3CCjxKSal6bRhR0aF' >at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) >at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8) >at > org.apache.flink.connectors.test.common.testsuites.SourceTestSuiteBase.testIdleReader(SourceTestSuiteBase.java:193) > {code} > [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22819&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=24448] > This is the same error as in FLINK-23828 (kafka). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-23971) PulsarSourceITCase.testIdleReader failed on azure
[ https://issues.apache.org/jira/browse/FLINK-23971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song reassigned FLINK-23971: Assignee: Hang Ruan > PulsarSourceITCase.testIdleReader failed on azure > - > > Key: FLINK-23971 > URL: https://issues.apache.org/jira/browse/FLINK-23971 > Project: Flink > Issue Type: Bug > Components: Connectors / Pulsar >Affects Versions: 1.14.0 >Reporter: Roman Khachatryan >Assignee: Hang Ruan >Priority: Blocker > Labels: test-stability > Fix For: 1.14.0 > > Attachments: error.log > > > {code:java} > [INFO] Running org.apache.flink.connector.pulsar.source.PulsarSourceITCase > [ERROR] Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 353.527 s <<< FAILURE! - in > org.apache.flink.connector.pulsar.source.PulsarSourceITCase > [ERROR] testIdleReader{TestEnvironment, ExternalContext}[2] Time elapsed: > 4.549 s <<< FAILURE! > java.lang.AssertionError: > Expected: Records consumed by Flink should be identical to test data and > preserve the order in multiple splits > but: Unexpected record 'tj7MpFRWX95GzBpSF3CCjxKSal6bRhR0aF' >at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) >at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8) >at > org.apache.flink.connectors.test.common.testsuites.SourceTestSuiteBase.testIdleReader(SourceTestSuiteBase.java:193) > {code} > [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22819&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=24448] > This is the same error as in FLINK-23828 (kafka). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23190) Make task-slot allocation much more evenly
[ https://issues.apache.org/jira/browse/FLINK-23190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406616#comment-17406616 ] loyi commented on FLINK-23190: -- I have submitted a [PR|https://github.com/apache/flink/pull/16929] for this feature, do you have time to take a look? Thanks. [~trohrmann]

> Make task-slot allocation much more evenly
> --
>
> Key: FLINK-23190
> URL: https://issues.apache.org/jira/browse/FLINK-23190
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.3
> Reporter: loyi
> Assignee: loyi
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2021-07-16-10-34-30-700.png
>
> FLINK-12122 only guarantees spreading out tasks across the set of TMs which
> are registered at the time of scheduling, but our jobs all run in active
> yarn mode, and a job with a smaller source parallelism often causes
> load-balance issues.
>
> For this job:
> {code:java}
> // -ys 4 means 10 taskmanagers
> env.addSource(...).name("A").setParallelism(10)
>     .map(...).name("B").setParallelism(30)
>     .map(...).name("C").setParallelism(40)
>     .addSink(...).name("D").setParallelism(20);
> {code}
>
> Flink-1.12.3 task allocation:
> ||operator||tm1||tm2||tm3||tm4||tm5||tm6||tm7||tm8||tm9||tm10||
> |A|1|{color:#de350b}2{color}|{color:#de350b}2{color}|1|1|{color:#de350b}3{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|
> |B|3|3|3|3|3|3|3|3|{color:#de350b}2{color}|{color:#de350b}4{color}|
> |C|4|4|4|4|4|4|4|4|4|4|
> |D|2|2|2|2|2|{color:#de350b}1{color}|{color:#de350b}1{color}|2|2|{color:#de350b}4{color}|
>
> Suggestions:
> When a TaskManager starts registering slots to the SlotManager, the current
> processing logic chooses the first pendingSlot that meets its resource
> requirements. This "random" strategy usually causes uneven task allocation
> when the source operator's parallelism is significantly below the processing
> operators'.
> A simple, feasible idea is to {color:#de350b}partition{color} the current
> "{color:#de350b}pendingSlots{color}" by their "JobVertexIds" (e.g. let the
> AllocationID carry this detail), then allocate the slots proportionally to
> each JobVertexGroup, as sketched in the code below.
>
> For the above case, the 40 pendingSlots could be divided into 4 groups:
> [ABCD]: 10 // A, B, C, D represent {color:#de350b}jobVertexIds{color}
> [BCD]: 10
> [CD]: 10
> [D]: 10
>
> Every TaskManager provides 4 slots at a time, and each group gets 1 slot
> according to its proportion (1/4); the final allocation result is below:
> [ABCD]: deployed on 10 different taskmanagers
> [BCD]: deployed on 10 different taskmanagers
> [CD]: deployed on 10 different taskmanagers
> [D]: deployed on 10 different taskmanagers
>
> I have implemented a [concept
> code|https://github.com/saddays/flink/commit/dc82e60a7c7599fbcb58c14f8e3445bc8d07ace1]
> based on Flink-1.12.3; the patched version achieves {color:#de350b}fully
> even{color} task allocation and works well on my workload. Are there other
> points that have not been considered, or does it conflict with future plans?
> Sorry for my poor English.
>
-- This message was sent by Atlassian Jira (v8.3.4#803005)
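A hypothetical sketch of the proportional-sharing idea described in the ticket. The group keys and counts are illustrative placeholders, not Flink's internal scheduler types, and remainder distribution is left out for brevity:

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

final class ProportionalSlotShares {

    /**
     * Splits one TaskManager's slot offer across pending request groups in
     * proportion to their outstanding request counts.
     */
    static <G> Map<G, Integer> share(Map<G, Integer> pendingByGroup, int slotsPerOffer) {
        final int totalPending =
                pendingByGroup.values().stream().mapToInt(Integer::intValue).sum();
        final Map<G, Integer> shares = new LinkedHashMap<>();
        if (totalPending == 0) {
            return shares;
        }
        for (Map.Entry<G, Integer> entry : pendingByGroup.entrySet()) {
            // e.g. 10 pending out of 40 total with a 4-slot offer -> 1 slot
            shares.put(entry.getKey(), slotsPerOffer * entry.getValue() / totalPending);
        }
        return shares;
    }
}
{code}

With the ticket's numbers (four groups of 10 pending slots each and a 4-slot offer per TaskManager), every group receives exactly one slot per offer, which yields the fully even placement the reporter describes.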
[GitHub] [flink] fapaul commented on a change in pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…
fapaul commented on a change in pull request #16966: URL: https://github.com/apache/flink/pull/16966#discussion_r698286541

## File path: flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/CompoundReducingUpsertWriterState.java

## @@ -0,0 +1,68 @@
+package org.apache.flink.streaming.connectors.kafka.table;
+
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.table.data.RowData;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+class CompoundReducingUpsertWriterState<WriterState> {

Review comment: I am in favor of your first comment because afaik there will be some efforts to integrate the `ReducingUpsert` functionality into the Table API operators and make it available to all upsert sinks. Therefore, generalizing it now may not have much impact because the class will likely be dropped soon.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-18807) FlinkKafkaProducerITCase.testScaleUpAfterScalingDown failed with "Timeout expired after 60000milliseconds while awaiting EndTxn(COMMIT)"
[ https://issues.apache.org/jira/browse/FLINK-18807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406617#comment-17406617 ] Xintong Song commented on FLINK-18807: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23062&view=logs&j=c5f0071e-1851-543e-9a45-9ac140befc32&t=15a22db7-8faa-5b34-3920-d33c9f0ca23c&l=7399

> FlinkKafkaProducerITCase.testScaleUpAfterScalingDown failed with "Timeout
> expired after 60000milliseconds while awaiting EndTxn(COMMIT)"
>
> Key: FLINK-18807
> URL: https://issues.apache.org/jira/browse/FLINK-18807
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Kafka, Tests
> Affects Versions: 1.11.0, 1.12.0
> Reporter: Dian Fu
> Priority: Minor
> Labels: auto-deprioritized-major, test-stability
> Fix For: 1.14.0
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5142&view=logs&j=c5f0071e-1851-543e-9a45-9ac140befc32&t=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5
> {code}
> 2020-08-03T22:06:45.9078498Z [ERROR] testScaleUpAfterScalingDown(org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase) Time elapsed: 76.498 s <<< ERROR!
> 2020-08-03T22:06:45.9079233Z org.apache.kafka.common.errors.TimeoutException: Timeout expired after 60000milliseconds while awaiting EndTxn(COMMIT)
> {code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink-statefun] igalshilman closed pull request #245: [FLINK-23520][datastream] improve statefun <-> datastream interop
igalshilman closed pull request #245: URL: https://github.com/apache/flink-statefun/pull/245 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-statefun] igalshilman commented on pull request #245: [FLINK-23520][datastream] improve statefun <-> datastream interop
igalshilman commented on pull request #245: URL: https://github.com/apache/flink-statefun/pull/245#issuecomment-908144474 @sjwiesman I'm closing this for now, because we will try to solve it without copying the sources, by properly shading protobuf. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16723: [FLINK-22885][table] Supports 'SHOW COLUMNS' syntax.
flinkbot edited a comment on pull request #16723: URL: https://github.com/apache/flink/pull/16723#issuecomment-893364145 ## CI report: * 542b00a0777a8b6483a0e61eaf47787ee43f98b8 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23018) * 1de6e0b83ee64c22c861a9d8e639483aaaba9e7f UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-23971) PulsarSourceITCase.testIdleReader failed on azure
[ https://issues.apache.org/jira/browse/FLINK-23971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406618#comment-17406618 ] Qingsheng Ren commented on FLINK-23971: --- Thanks [~ruanhang1993] for the investigation! I double checked the code and [~ruanhang1993]'s assumption is reasonable. Also I think this ticket shares the same cause as FLINK-23828. I'll mark it as related to this one. > PulsarSourceITCase.testIdleReader failed on azure > - > > Key: FLINK-23971 > URL: https://issues.apache.org/jira/browse/FLINK-23971 > Project: Flink > Issue Type: Bug > Components: Connectors / Pulsar >Affects Versions: 1.14.0 >Reporter: Roman Khachatryan >Assignee: Hang Ruan >Priority: Blocker > Labels: test-stability > Fix For: 1.14.0 > > Attachments: error.log > > > {code:java} > [INFO] Running org.apache.flink.connector.pulsar.source.PulsarSourceITCase > [ERROR] Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 353.527 s <<< FAILURE! - in > org.apache.flink.connector.pulsar.source.PulsarSourceITCase > [ERROR] testIdleReader{TestEnvironment, ExternalContext}[2] Time elapsed: > 4.549 s <<< FAILURE! > java.lang.AssertionError: > Expected: Records consumed by Flink should be identical to test data and > preserve the order in multiple splits > but: Unexpected record 'tj7MpFRWX95GzBpSF3CCjxKSal6bRhR0aF' >at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) >at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8) >at > org.apache.flink.connectors.test.common.testsuites.SourceTestSuiteBase.testIdleReader(SourceTestSuiteBase.java:193) > {code} > [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22819&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=24448] > This is the same error as in FLINK-23828 (kafka). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #17011: [FLINK-23663][table-planner] Reduce state size of ChangelogNormalize
flinkbot edited a comment on pull request #17011: URL: https://github.com/apache/flink/pull/17011#issuecomment-907015222 ## CI report: * e13ee18ff9f927cb23c670236f83956873cfcc1e Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22958) * 211af17558902443958144ba9f202b75c719982e Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23075) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16997: [FLINK-23287][docs][table] Create user document for Window Join in SQL
flinkbot edited a comment on pull request #16997: URL: https://github.com/apache/flink/pull/16997#issuecomment-906324142 ## CI report: * 717cc232df32c0a19b2026b7246af7b307c14dc7 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23056) * c096ede2ac4bd991486ec9fcff878f96b02f2f41 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-23544) Window TVF Supports session window
[ https://issues.apache.org/jira/browse/FLINK-23544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406619#comment-17406619 ] Timo Walther commented on FLINK-23544: -- [~godfreyhe][~qingru zhang] Sorry for jumping in so late, but are we sure that this change has correct SQL semantics? As far as I can see, it violates the syntax mentioned in the FLIP, which states that SESSION window TVFs need a {{PARTITION BY}} clause. This was also the reason why we didn't support SESSION in the first version, right?

{code}
SELECT *
FROM TABLE(
    SESSION(
        TABLE Bid PARTITION BY bidder,
        DESCRIPTOR(bidtime),
        DESCRIPTOR(bidder),
        INTERVAL '5' MINUTES));
{code}

> Window TVF Supports session window
> --
>
> Key: FLINK-23544
> URL: https://issues.apache.org/jira/browse/FLINK-23544
> Project: Flink
> Issue Type: Sub-task
> Components: Table SQL / Planner
> Reporter: JING ZHANG
> Assignee: JING ZHANG
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.14.0
>
> Window TVF would support SESSION window in the following case:
> # SESSION window TVF followed by Window Aggregate; in this case the SESSION
> window TVF would be pulled up into the WindowAggregate, so the window
> assigner would run in the WindowAggregate
>
> *Note, SESSION window TVF only works in limited cases currently; the
> following use cases are not supported yet:*
> # *SESSION window TVF followed by Window JOIN*
> # *SESSION window TVF followed by Window RANK*
> *BESIDES, SESSION window Aggregate does not support the following performance
> improvements yet:*
> 1. Split Distinct Aggregation
> 2. Local-global Aggregation
> 3. Mini-batch Aggregate
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23791) Enable RocksDB log again
[ https://issues.apache.org/jira/browse/FLINK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Li updated FLINK-23791: -- Fix Version/s: 1.15.0

> Enable RocksDB log again
> 
>
> Key: FLINK-23791
> URL: https://issues.apache.org/jira/browse/FLINK-23791
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / State Backends
> Reporter: Yun Tang
> Assignee: Zakelly Lan
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.15.0, 1.14.1
>
> FLINK-15068 disabled RocksDB's local LOG because the previous RocksDB version
> could not limit the local log files.
> After upgrading to a newer RocksDB version, we can enable the RocksDB log
> again.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] AHeise commented on a change in pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…
AHeise commented on a change in pull request #16966: URL: https://github.com/apache/flink/pull/16966#discussion_r698299396

## File path: flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/CompoundReducingUpsertWriterState.java

Review comment: I'm okay with that. I just think that it could be a general building block for advanced sources/sinks. But we can generalize it then.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-23814) Test FLIP-143 KafkaSink
[ https://issues.apache.org/jira/browse/FLINK-23814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406624#comment-17406624 ] Fabian Paul commented on FLINK-23814: - [~ruanhang1993] we consolidated all the fixes we found in this PR [https://github.com/apache/flink/pull/17019], and once it is merged we will close all the related tickets. Sorry for the confusion.

> Test FLIP-143 KafkaSink
> ---
>
> Key: FLINK-23814
> URL: https://issues.apache.org/jira/browse/FLINK-23814
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Kafka
> Reporter: Fabian Paul
> Assignee: Hang Ruan
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.14.0
>
> The following scenarios are worth testing:
> * Start a simple job with none/at-least-once delivery guarantee and write
> records to a Kafka topic
> * Start a simple job with exactly-once delivery guarantee and write records
> to a Kafka topic. The records should only be visible to a `read-committed`
> consumer
> * Stop a job with exactly-once delivery guarantee and restart it with
> different parallelism (scale-down, scale-up)
> * Restart/kill a TaskManager while writing in exactly-once mode
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] AHeise commented on a change in pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…
AHeise commented on a change in pull request #16966: URL: https://github.com/apache/flink/pull/16966#discussion_r698299782

## File path: flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/CompoundReducingUpsertWriterStateSerializer.java

Review comment: That's why I was hoping we could add it to `SimpleVersionedSerialization`.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
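For context on the exchange above: `SimpleVersionedSerialization` is Flink's helper for prepending a serializer's version to a single datum, so using it per element of a list repeats that header for every element. A sketch of that per-element usage, with invented class and method names:

{code:java}
import org.apache.flink.core.io.SimpleVersionedSerialization;
import org.apache.flink.core.io.SimpleVersionedSerializer;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class PerElementVersioning {

    /** Serializing element-by-element repeats the version header per element. */
    static <T> List<byte[]> serializeEach(
            SimpleVersionedSerializer<T> serializer, List<T> elements) throws IOException {
        final List<byte[]> result = new ArrayList<>(elements.size());
        for (T element : elements) {
            // Each call prepends the serializer version (and a length) again.
            result.add(
                    SimpleVersionedSerialization.writeVersionAndSerialize(serializer, element));
        }
        return result;
    }
}
{code}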
[GitHub] [flink-statefun-docker] igalshilman merged pull request #15: Flink StateFun Release 3.1.0
igalshilman merged pull request #15: URL: https://github.com/apache/flink-statefun-docker/pull/15 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] twalthr opened a new pull request #17048: [FLINK-24033][table-planner] Propagate unique keys for fromChangelogStream
twalthr opened a new pull request #17048: URL: https://github.com/apache/flink/pull/17048

## What is the purpose of the change

This fixes a serious issue which also justified that `from/toChangelogStream` were marked as `@Experimental` in 1.13: unique keys were not set correctly. Fixing it even required modifying examples and tests whose behavior had accidentally been classified as a planner shortcoming instead of an actual bug.

## Brief change log

Set statistics for `fromChangelogStream`.

## Verifying this change

This change added tests and can be verified as follows: `DataStreamJavaITCase`

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
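A minimal sketch of the API this PR touches, modeled on the documented `fromChangelogStream` usage; the columns and rows here are illustrative only, and the inline comment marks the part the fix concerns:

{code:java}
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Schema;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.table.connector.ChangelogMode;
import org.apache.flink.types.Row;
import org.apache.flink.types.RowKind;

public class FromChangelogStreamUniqueKey {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

        // An upsert stream keyed by the first field.
        DataStream<Row> upserts =
                env.fromElements(
                        Row.ofKind(RowKind.UPDATE_AFTER, 1, "one"),
                        Row.ofKind(RowKind.UPDATE_AFTER, 1, "uno"));

        // The declared primary key is what this fix propagates to the planner
        // as a unique key, so downstream operators can rely on it.
        Table table =
                tableEnv.fromChangelogStream(
                        upserts,
                        Schema.newBuilder()
                                .column("f0", "INT NOT NULL")
                                .column("f1", "STRING")
                                .primaryKey("f0")
                                .build(),
                        ChangelogMode.upsert());

        table.execute().print();
    }
}
{code}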
[GitHub] [flink] fapaul commented on a change in pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…
fapaul commented on a change in pull request #16966: URL: https://github.com/apache/flink/pull/16966#discussion_r698300654

## File path: flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/CompoundReducingUpsertWriterStateSerializer.java

Review comment: Now I understand your point, I'll have a look.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] gaborgsomogyi commented on pull request #17022: [FLINK-24020][web] Aggregate HTTP requests before custom netty handers are getting the data
gaborgsomogyi commented on pull request #17022: URL: https://github.com/apache/flink/pull/17022#issuecomment-908155425 @zentol having a look... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-24033) Propagate unique keys for fromChangelogStream
[ https://issues.apache.org/jira/browse/FLINK-24033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-24033: --- Labels: pull-request-available (was: ) > Propagate unique keys for fromChangelogStream > - > > Key: FLINK-24033 > URL: https://issues.apache.org/jira/browse/FLINK-24033 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.13.2 >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Blocker > Labels: pull-request-available > Fix For: 1.14.0, 1.13.3 > > > Similar to FLINK-23915, we are not propagating unique keys for > {{fromChangelogStream}} because it is not written into statistics. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #17048: [FLINK-24033][table-planner] Propagate unique keys for fromChangelogStream
flinkbot commented on pull request #17048: URL: https://github.com/apache/flink/pull/17048#issuecomment-908156791 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 44a4c5b89d03c403df73fae28d464a557444323a (Mon Aug 30 08:38:52 UTC 2021) ✅no warnings Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wsry closed pull request #17030: [FLINK-24035][network] Notify the buffer listeners when the local buffer pool receives available notification from the global pool
wsry closed pull request #17030: URL: https://github.com/apache/flink/pull/17030 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wsry commented on pull request #17030: [FLINK-24035][network] Notify the buffer listeners when the local buffer pool receives available notification from the global pool
wsry commented on pull request #17030: URL: https://github.com/apache/flink/pull/17030#issuecomment-908159549 Abandoned. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-24050) Support primary keys on metadata columns
Ingo Bürk created FLINK-24050: - Summary: Support primary keys on metadata columns Key: FLINK-24050 URL: https://issues.apache.org/jira/browse/FLINK-24050 Project: Flink Issue Type: Improvement Components: Table SQL / API Reporter: Ingo Bürk

Currently, primary keys are required to consist solely of physical columns. However, there might be scenarios where the actual payload/records do not contain a suitable primary key, but a unique identifier is available through metadata. In this case it would make sense to define the primary key on such a metadata column:

{code:java}
CREATE TABLE T (
  uid METADATA,
  content STRING,
  PRIMARY KEY (uid) NOT ENFORCED
) WITH (…)
{code}

A simple example for this would be IMAP: there is nothing unique about any single email as a record, but each email in a specific folder on an IMAP server has a unique UID (I'm excluding some irrelevant technical details here).

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #16723: [FLINK-22885][table] Supports 'SHOW COLUMNS' syntax.
flinkbot edited a comment on pull request #16723: URL: https://github.com/apache/flink/pull/16723#issuecomment-893364145 ## CI report: * 542b00a0777a8b6483a0e61eaf47787ee43f98b8 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23018) * 1de6e0b83ee64c22c861a9d8e639483aaaba9e7f Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23078) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…
flinkbot edited a comment on pull request #16966: URL: https://github.com/apache/flink/pull/16966#issuecomment-904741081 ## CI report: * ed07cb76926d2747ac25953e5f5e74c48e64ef25 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22827) Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22799) * 758c42cbc3442c3b95389dc65ef4bd67e2ff3344 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16997: [FLINK-23287][docs][table] Create user document for Window Join in SQL
flinkbot edited a comment on pull request #16997: URL: https://github.com/apache/flink/pull/16997#issuecomment-906324142 ## CI report: * 717cc232df32c0a19b2026b7246af7b307c14dc7 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23056) * c096ede2ac4bd991486ec9fcff878f96b02f2f41 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23079) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17045: [FLINK-23924][python][examples] Add PyFlink examples
flinkbot edited a comment on pull request #17045: URL: https://github.com/apache/flink/pull/17045#issuecomment-908007478 ## CI report: * 77e0e48a0f9c3f2f62e6e19ca5f4253060c13b07 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23061) * edd075f9e5a4dd29de26db54bafedace3c92b9ef UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann commented on pull request #16915: [FLINK-9925][tests] Harden ClientTest by making handler shareable
tillrohrmann commented on pull request #16915: URL: https://github.com/apache/flink/pull/16915#issuecomment-908164719 Thanks for the review @zentol. The failing test case is https://issues.apache.org/jira/browse/FLINK-23097 and should be unrelated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-24051) Make consumer.group-id optional for KafkaSource
Fabian Paul created FLINK-24051: --- Summary: Make consumer.group-id optional for KafkaSource Key: FLINK-24051 URL: https://issues.apache.org/jira/browse/FLINK-24051 Project: Flink Issue Type: Improvement Components: Connectors / Kafka Affects Versions: 1.14.0, 1.15.0 Reporter: Fabian Paul

For most users it is not necessary to provide a group-id, and the source itself can generate a meaningful group-id during startup.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
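For illustration, a sketch of the current builder usage where the group id is set explicitly; the servers, topic, and group name are placeholders, and under this ticket the `setGroupId` call would become optional:

{code:java}
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KafkaSourceGroupIdExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        KafkaSource<String> source =
                KafkaSource.<String>builder()
                        .setBootstrapServers("localhost:9092")
                        .setTopics("input-topic")
                        .setGroupId("my-consumer-group") // would become optional
                        .setStartingOffsets(OffsetsInitializer.earliest())
                        .setValueOnlyDeserializer(new SimpleStringSchema())
                        .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source").print();
        env.execute();
    }
}
{code}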
[jira] [Assigned] (FLINK-24051) Make consumer.group-id optional for KafkaSource
[ https://issues.apache.org/jira/browse/FLINK-24051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Paul reassigned FLINK-24051: --- Assignee: Fabian Paul

> Make consumer.group-id optional for KafkaSource
> ---
>
> Key: FLINK-24051
> URL: https://issues.apache.org/jira/browse/FLINK-24051
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Kafka
> Affects Versions: 1.14.0, 1.15.0
> Reporter: Fabian Paul
> Assignee: Fabian Paul
> Priority: Major
>
> For most users it is not necessary to provide a group-id, and the source
> itself can generate a meaningful group-id during startup.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink-training] NicoK merged pull request #35: [FLINK-24022] Change how to enable Scala and use in CI
NicoK merged pull request #35: URL: https://github.com/apache/flink-training/pull/35 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-24050) Support primary keys on metadata columns
[ https://issues.apache.org/jira/browse/FLINK-24050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406632#comment-17406632 ] Jark Wu commented on FLINK-24050: - Sounds good to me. From my perspective, metadata columns are also a kind of physical column.

> Support primary keys on metadata columns
> 
>
> Key: FLINK-24050
> URL: https://issues.apache.org/jira/browse/FLINK-24050
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / API
> Reporter: Ingo Bürk
> Priority: Major
>
> Currently, primary keys are required to consist solely of physical columns.
> However, there might be scenarios where the actual payload/records do not
> contain a suitable primary key, but a unique identifier is available through
> metadata. In this case it would make sense to define the primary key on such
> a metadata column:
> {code:java}
> CREATE TABLE T (
>   uid METADATA,
>   content STRING,
>   PRIMARY KEY (uid) NOT ENFORCED
> ) WITH (…)
> {code}
> A simple example for this would be IMAP: there is nothing unique about any
> single email as a record, but each email in a specific folder on an IMAP
> server has a unique UID (I'm excluding some irrelevant technical details
> here).
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-24022) Scala checks not running in flink-training CI
[ https://issues.apache.org/jira/browse/FLINK-24022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nico Kruber closed FLINK-24022. --- Resolution: Fixed Fixed on master via 65cde7910e5bff6283eef7ea536fde3ea5d4c9d4

> Scala checks not running in flink-training CI
> -
>
> Key: FLINK-24022
> URL: https://issues.apache.org/jira/browse/FLINK-24022
> Project: Flink
> Issue Type: Bug
> Components: Documentation / Training / Exercises
> Affects Versions: 1.14.0, 1.13.3
> Reporter: Nico Kruber
> Assignee: Nico Kruber
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.14.0, 1.13.3
>
> FLINK-23339 disabled Scala by default, which thereby also disabled CI for
> newly checked-in changes to the Scala code.
> We should run CI with Scala enabled.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23764) add RuntimeContext to SourceReaderContext
[ https://issues.apache.org/jira/browse/FLINK-23764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406633#comment-17406633 ] Fabian Paul commented on FLINK-23764: - [~ruanhang1993] you made a good point. I will have a look into making the group-id optional for the KafkaSource https://issues.apache.org/jira/browse/FLINK-24051. Does it resolve your problem or do you need more information from the RuntimeContext?

> add RuntimeContext to SourceReaderContext
> -
>
> Key: FLINK-23764
> URL: https://issues.apache.org/jira/browse/FLINK-23764
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Common
> Reporter: Hang Ruan
> Priority: Major
> Labels: connector
>
> The RuntimeContext carries important information for the SourceReader. Besides
> the subtask index, we sometimes need other information from the RuntimeContext,
> such as the operator id.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
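For reference, a small sketch of what a reader implementation can already derive from `SourceReaderContext` today; the group-id naming scheme below is purely illustrative, and, as the ticket notes, the operator id is not reachable from this interface:

{code:java}
import org.apache.flink.api.connector.source.SourceReaderContext;

final class ReaderContextUsage {

    // What is available today without a full RuntimeContext: the subtask
    // index is exposed directly on the SourceReaderContext.
    static String deriveGroupId(SourceReaderContext context) {
        int subtask = context.getIndexOfSubtask();
        // Purely illustrative naming scheme, not a Flink convention.
        return "flink-source-consumer-" + subtask;
    }
}
{code}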
[jira] [Updated] (FLINK-24050) Support primary keys on metadata columns
[ https://issues.apache.org/jira/browse/FLINK-24050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ingo Bürk updated FLINK-24050: -- Description:

Currently, primary keys are required to consist solely of physical columns. However, there might be scenarios where the actual payload/records do not contain a suitable primary key, but a unique identifier is available through metadata. In this case it would make sense to define the primary key on such a metadata column:

{code:java}
CREATE TABLE T (
  uid STRING METADATA,
  content STRING,
  PRIMARY KEY (uid) NOT ENFORCED
) WITH (…)
{code}

A simple example for this would be IMAP: there is nothing unique about any single email as a record, but each email in a specific folder on an IMAP server has a unique UID (I'm excluding some irrelevant technical details here).

was:

Currently, primary keys are required to consist solely of physical columns. However, there might be scenarios where the actual payload/records do not contain a suitable primary key, but a unique identifier is available through metadata. In this case it would make sense to define the primary key on such a metadata column:

{code:java}
CREATE TABLE T (
  uid METADATA,
  content STRING,
  PRIMARY KEY (uid) NOT ENFORCED
) WITH (…)
{code}

A simple example for this would be IMAP: there is nothing unique about any single email as a record, but each email in a specific folder on an IMAP server has a unique UID (I'm excluding some irrelevant technical details here).

> Support primary keys on metadata columns
> 
>
> Key: FLINK-24050
> URL: https://issues.apache.org/jira/browse/FLINK-24050
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / API
> Reporter: Ingo Bürk
> Priority: Major
>
> Currently, primary keys are required to consist solely of physical columns.
> However, there might be scenarios where the actual payload/records do not
> contain a suitable primary key, but a unique identifier is available through
> metadata. In this case it would make sense to define the primary key on such
> a metadata column:
> {code:java}
> CREATE TABLE T (
>   uid STRING METADATA,
>   content STRING,
>   PRIMARY KEY (uid) NOT ENFORCED
> ) WITH (…)
> {code}
> A simple example for this would be IMAP: there is nothing unique about any
> single email as a record, but each email in a specific folder on an IMAP
> server has a unique UID (I'm excluding some irrelevant technical details
> here).
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] Airblader commented on a change in pull request #17048: [FLINK-24033][table-planner] Propagate unique keys for fromChangelogStream
Airblader commented on a change in pull request #17048: URL: https://github.com/apache/flink/pull/17048#discussion_r698307830 ## File path: flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/connectors/DynamicSourceUtils.java ## @@ -90,13 +90,18 @@ public static RelNode convertDataStreamToRel( final DynamicTableSource tableSource = new ExternalDynamicSource<>( identifier, dataStream, physicalDataType, isTopLevelRecord, changelogMode); +final FlinkStatistic statistic = +FlinkStatistic.builder() +// this is a temporary solution, FLINK-15123 will resolve this Review comment: (The good ol' two-year-and-counting temporary solution… :-)) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann closed pull request #16915: [FLINK-9925][tests] Harden ClientTest by making handler shareable
tillrohrmann closed pull request #16915: URL: https://github.com/apache/flink/pull/16915 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pnowojski merged pull request #17009: [FLINK-23466][network] Fix the bug that buffer listeners may not be notified when recycling buffers
pnowojski merged pull request #17009: URL: https://github.com/apache/flink/pull/17009 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-23466) UnalignedCheckpointITCase hangs on Azure
[ https://issues.apache.org/jira/browse/FLINK-23466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-23466: --- Description: https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20813&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7&l=16016 The problem is the buffer listener will be removed from the listener queue when notified and then it will be added to the listener queue again if it needs more buffers. However, if some buffers are recycled meanwhile, the buffer listener will not be notified of the available buffers. For example: 1. Thread 1 calls LocalBufferPool#recycle(). 2. Thread 1 reaches LocalBufferPool#fireBufferAvailableNotification() and listener.notifyBufferAvailable() is invoked, but Thread 1 sleeps before acquiring the lock to registeredListeners.add(listener). 3. Thread 2 is being woken up as a result of notifyBufferAvailable() call. It takes the buffer, but it needs more buffers. 4. Other threads, return all buffers, including this one that has been recycled. None are taken. Are all in the LocalBufferPool. 5. Thread 1 wakes up, and continues fireBufferAvailableNotification() invocation. 6. Thread 1 re-adds listener that's waiting for more buffer registeredListeners.add(listener). 7. Thread 1 exits loop LocalBufferPool#recycle(MemorySegment, int) inside, as the original memory segment has been used. At the end we have a state where all buffers are in the LocalBufferPool, so no new recycle() calls will happen, but there is still one listener waiting for a buffer (despite buffers being available). was:https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20813&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7&l=16016 > UnalignedCheckpointITCase hangs on Azure > > > Key: FLINK-23466 > URL: https://issues.apache.org/jira/browse/FLINK-23466 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.14.0 >Reporter: Dawid Wysakowicz >Assignee: Yingjie Cao >Priority: Blocker > Labels: pull-request-available, test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20813&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7&l=16016 > The problem is the buffer listener will be removed from the listener queue > when notified and then it will be added to the listener queue again if it > needs more buffers. However, if some buffers are recycled meanwhile, the > buffer listener will not be notified of the available buffers. For example: > 1. Thread 1 calls LocalBufferPool#recycle(). > 2. Thread 1 reaches LocalBufferPool#fireBufferAvailableNotification() and > listener.notifyBufferAvailable() is invoked, but Thread 1 sleeps before > acquiring the lock to registeredListeners.add(listener). > 3. Thread 2 is being woken up as a result of notifyBufferAvailable() > call. It takes the buffer, but it needs more buffers. > 4. Other threads, return all buffers, including this one that has been > recycled. None are taken. Are all in the LocalBufferPool. > 5. Thread 1 wakes up, and continues fireBufferAvailableNotification() > invocation. > 6. Thread 1 re-adds listener that's waiting for more buffer > registeredListeners.add(listener). > 7. Thread 1 exits loop LocalBufferPool#recycle(MemorySegment, int) > inside, as the original memory segment has been used. 
> At the end we have a state where all buffers are in the LocalBufferPool, so > no new recycle() calls will happen, but there is still one listener waiting > for a buffer (despite buffers being available). -- This message was sent by Atlassian Jira (v8.3.4#803005)
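To make the interleaving above easier to follow, here is a minimal, self-contained model of the lost-notification pattern and of the shape of a fix. This is simplified stand-in code with illustrative names, not Flink's actual LocalBufferPool and not the merged patch: the key idea is that the availability re-check must happen under the same lock that re-registers the listener, so a buffer returned during the unlocked notification window triggers another notification round instead of being stranded.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Simplified stand-ins for the classes in the report; illustrative only. */
interface BufferListener {
    /** Consumes a segment; returns true if the listener still needs more buffers. */
    boolean notifyBufferAvailable(Object segment);
}

class TinyBufferPool {
    private final Queue<Object> availableMemorySegments = new ArrayDeque<>();
    private final Queue<BufferListener> registeredListeners = new ArrayDeque<>();

    void recycle(Object segment) {
        BufferListener listener;
        synchronized (availableMemorySegments) {
            listener = registeredListeners.poll();
            if (listener == null) {
                availableMemorySegments.add(segment);
                return;
            }
        }

        // The notification runs outside the lock: this is the window from the
        // report in which other threads can return every buffer to the pool
        // without seeing the listener (it is not registered at this point).
        boolean needsMore = listener.notifyBufferAvailable(segment);

        if (needsMore) {
            Object recycledMeanwhile;
            synchronized (availableMemorySegments) {
                registeredListeners.add(listener);
                // Re-check under the same lock that re-registered the listener:
                // any segment that came back during the unlocked window is
                // picked up here instead of being stranded in the pool.
                recycledMeanwhile = availableMemorySegments.poll();
            }
            if (recycledMeanwhile != null) {
                recycle(recycledMeanwhile); // runs another notification round
            }
        }
    }
}
```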
[jira] [Closed] (FLINK-23466) UnalignedCheckpointITCase hangs on Azure
[ https://issues.apache.org/jira/browse/FLINK-23466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-23466. -- Resolution: Fixed Merged to master as 48a384dffc7 Merged to release-1.14 as 0067d35cc0f
> UnalignedCheckpointITCase hangs on Azure
>
> Key: FLINK-23466
> URL: https://issues.apache.org/jira/browse/FLINK-23466
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.14.0
> Reporter: Dawid Wysakowicz
> Assignee: Yingjie Cao
> Priority: Blocker
> Labels: pull-request-available, test-stability
> Fix For: 1.14.0
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20813&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7&l=16016
> The problem is that the buffer listener is removed from the listener queue
> when notified and is then added to the listener queue again if it needs
> more buffers. However, if some buffers are recycled in the meantime, the
> buffer listener will not be notified of the available buffers. For example:
> 1. Thread 1 calls LocalBufferPool#recycle().
> 2. Thread 1 reaches LocalBufferPool#fireBufferAvailableNotification() and
> listener.notifyBufferAvailable() is invoked, but Thread 1 sleeps before
> acquiring the lock for registeredListeners.add(listener).
> 3. Thread 2 is woken up as a result of the notifyBufferAvailable() call.
> It takes the buffer, but it needs more buffers.
> 4. Other threads return all buffers, including the one that has just been
> recycled. None are taken; all of them end up in the LocalBufferPool.
> 5. Thread 1 wakes up and continues the fireBufferAvailableNotification()
> invocation.
> 6. Thread 1 re-adds the listener that is waiting for more buffers via
> registeredListeners.add(listener).
> 7. Thread 1 exits the loop inside LocalBufferPool#recycle(MemorySegment, int),
> as the original memory segment has been used.
> At the end we have a state where all buffers are in the LocalBufferPool, so
> no new recycle() calls will happen, but there is still one listener waiting
> for a buffer (despite buffers being available).
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-24030) PulsarSourceITCase>SourceTestSuiteBase.testMultipleSplits failed
[ https://issues.apache.org/jira/browse/FLINK-24030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-24030. -- Resolution: Duplicate
> PulsarSourceITCase>SourceTestSuiteBase.testMultipleSplits failed
>
> Key: FLINK-24030
> URL: https://issues.apache.org/jira/browse/FLINK-24030
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Pulsar
> Affects Versions: 1.14.0
> Reporter: Piotr Nowojski
> Priority: Major
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22936&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461
> root cause:
> {noformat}
> Aug 27 09:41:42 Caused by: org.apache.pulsar.client.api.PulsarClientException$BrokerMetadataException: Consumer not found
> Aug 27 09:41:42 at org.apache.pulsar.client.api.PulsarClientException.unwrap(PulsarClientException.java:987)
> Aug 27 09:41:42 at org.apache.pulsar.client.impl.PulsarClientImpl.close(PulsarClientImpl.java:658)
> Aug 27 09:41:42 at org.apache.flink.connector.pulsar.source.reader.source.PulsarSourceReaderBase.close(PulsarSourceReaderBase.java:83)
> Aug 27 09:41:42 at org.apache.flink.connector.pulsar.source.reader.source.PulsarOrderedSourceReader.close(PulsarOrderedSourceReader.java:170)
> Aug 27 09:41:42 at org.apache.flink.streaming.api.operators.SourceOperator.close(SourceOperator.java:308)
> Aug 27 09:41:42 at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:141)
> Aug 27 09:41:42 at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.closeAllOperators(RegularOperatorChain.java:127)
> Aug 27 09:41:42 at org.apache.flink.streaming.runtime.tasks.StreamTask.closeAllOperators(StreamTask.java:1015)
> Aug 27 09:41:42 at org.apache.flink.streaming.runtime.tasks.StreamTask.afterInvoke(StreamTask.java:859)
> Aug 27 09:41:42 at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:747)
> Aug 27 09:41:42 at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
> Aug 27 09:41:42 at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:937)
> Aug 27 09:41:42 at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
> Aug 27 09:41:42 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
> {noformat}
> Top level error:
> {noformat}
> WARNING: The following warnings have been detected: WARNING: Return type, java.util.Map<java.lang.String, org.apache.pulsar.common.policies.data.NamespaceIsolationData>, of method, public java.util.Map<java.lang.String, org.apache.pulsar.common.policies.data.NamespaceIsolationData> org.apache.pulsar.broker.admin.impl.ClustersBase.getNamespaceIsolationPolicies(java.lang.String) throws java.lang.Exception, is not resolvable to a concrete type.
> Aug 27 09:41:42 [ERROR] Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 357.849 s <<< FAILURE! - in org.apache.flink.connector.pulsar.source.PulsarSourceITCase
> Aug 27 09:41:42 [ERROR] testMultipleSplits{TestEnvironment, ExternalContext}[1] Time elapsed: 5.391 s <<< ERROR!
> Aug 27 09:41:42 java.lang.RuntimeException: Failed to fetch next result > Aug 27 09:41:42 at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:109) > Aug 27 09:41:42 at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:80) > Aug 27 09:41:42 at > org.apache.flink.connectors.test.common.utils.TestDataMatchers$MultipleSplitDataMatcher.matchesSafely(TestDataMatchers.java:151) > Aug 27 09:41:42 at > org.apache.flink.connectors.test.common.utils.TestDataMatchers$MultipleSplitDataMatcher.matchesSafely(TestDataMatchers.java:133) > Aug 27 09:41:42 at > org.hamcrest.TypeSafeDiagnosingMatcher.matches(TypeSafeDiagnosingMatcher.java:55) > Aug 27 09:41:42 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:12) > Aug 27 09:41:42 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8) > Aug 27 09:41:42 at > org.apache.flink.connectors.test.common.testsuites.SourceTestSuiteBase.testMultipleSplits(SourceTestSuiteBase.java:156) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-9925) ClientTest.testSimpleRequests fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-9925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann closed FLINK-9925. Resolution: Fixed Fixed via 1.15.0: bad537d6b4517341e33253354d6273153e571747 d505e6ed9bd596d1b6efcc3943dabd2b6a880744 1.14.0: 9da4069821a3ca6d06add3eab06b0af38f43112a 9d742bceef839bb3e9087946f434979d2d3f847d 1.13.3: c2ad55f6edf8426b12390fc06ee64c57d9ae86b7 6b455b41202c85dd8bb41813bbee8e2a9e3c2ea0 1.12.6: 68316d82a1c46b598e63f351335a02e71e5a1f84 58039eee6e3963444d09ca9e4703509cd3f8c4f6 > ClientTest.testSimpleRequests fails on Travis > - > > Key: FLINK-9925 > URL: https://issues.apache.org/jira/browse/FLINK-9925 > Project: Flink > Issue Type: Bug > Components: Runtime / Queryable State, Tests >Affects Versions: 1.7.0 >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Major > Labels: auto-deprioritized-critical, pull-request-available, > test-stability > Fix For: 1.14.0, 1.12.6, 1.13.3 > > > {{ClientTest.testSimpleRequests}} fails on Travis with an {{AssertionError}}: > https://api.travis-ci.org/v3/job/405690023/log.txt -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…
flinkbot edited a comment on pull request #16966: URL: https://github.com/apache/flink/pull/16966#issuecomment-904741081 ## CI report: * ed07cb76926d2747ac25953e5f5e74c48e64ef25 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22827) Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22799) * 758c42cbc3442c3b95389dc65ef4bd67e2ff3344 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23080) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-22483) Recover checkpoints when JobMaster gains leadership
[ https://issues.apache.org/jira/browse/FLINK-22483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406647#comment-17406647 ] ming li commented on FLINK-22483: - Hi, [~dmvk], [~edu05]. Could you please help answer one of my questions? If we can only recover the ??CompletedCheckpointStore?? once at startup, why does the ??sharedStateRegistry?? need to be re-registered every time the job is restarted? Is it possible to re-register only once at startup?
> Recover checkpoints when JobMaster gains leadership
>
> Key: FLINK-22483
> URL: https://issues.apache.org/jira/browse/FLINK-22483
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.13.0
> Reporter: Robert Metzger
> Assignee: David Morávek
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.14.0
>
> Recovering checkpoints (from the CompletedCheckpointStore) is a potentially
> long-lasting/blocking operation, for example if the file system
> implementation is retrying to connect to an unavailable storage backend.
> Currently, we are calling the CompletedCheckpointStore.recover() method from
> the main thread of the JobManager, making it unresponsive to any RPC call
> while the recover method is blocked:
> {code}
> 2021-04-02 20:33:31,384 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job XXX switched from state RUNNING to RESTARTING.
> com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to minio.minio.svc:9000 [minio.minio.svc/] failed: Connection refused (Connection refused)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1207) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1153) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530) ~[?:?]
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5062) ~[?:?]
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5008) ~[?:?]
> at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1490) ~[?:?]
> at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$openStream$1(PrestoS3FileSystem.java:905) ~[?:?]
> at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?]
> at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:902) ~[?:?]
> at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:887) ~[?:?]
> at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.seekStream(PrestoS3FileSystem.java:880) ~[?:?]
> at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$read$0(PrestoS3FileSystem.java:819) ~[?:?]
> at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?] > at > com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.read(PrestoS3FileSystem.java:818) > ~[?:?] > at java.io.BufferedInputStream.read1(BufferedInputStream.java:284) > ~[?:1.8.0_282] > at XXX.recover(KubernetesHaCheckpointStore.java:69) > ~[vvp-flink-ha-kubernetes-flink112-1.1.0.jar:?] > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedStateInternal(CheckpointCoordinator.java:1511) > ~[flink-dist_2.12-1.12.2-stream1.jar:1.12.2-stream1] > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedStateToAll(CheckpointCoordinator.java:1451) > ~[flink-dist_2.12-1.12.2-stream1.jar:1.12.2-stream1] > at > org.apache.flink.runtime.scheduler.SchedulerBase.restoreState(SchedulerBase.java:421) > ~[flink-dist_2.12-1.12.2-stream1.jar:1.12.2-stream1] > at > org.apache.flink.runtime.scheduler.DefaultScheduler.lambda$restartTasks$2(DefaultScheduler.java:314) > ~[flink-dis
[GitHub] [flink] flinkbot edited a comment on pull request #17045: [FLINK-23924][python][examples] Add PyFlink examples
flinkbot edited a comment on pull request #17045: URL: https://github.com/apache/flink/pull/17045#issuecomment-908007478 ## CI report: * 77e0e48a0f9c3f2f62e6e19ca5f4253060c13b07 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23061) * edd075f9e5a4dd29de26db54bafedace3c92b9ef Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23081) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #17048: [FLINK-24033][table-planner] Propagate unique keys for fromChangelogStream
flinkbot commented on pull request #17048: URL: https://github.com/apache/flink/pull/17048#issuecomment-908188488 ## CI report: * 44a4c5b89d03c403df73fae28d464a557444323a UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-22483) Recover checkpoints when JobMaster gains leadership
[ https://issues.apache.org/jira/browse/FLINK-22483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406647#comment-17406647 ] ming li edited comment on FLINK-22483 at 8/30/21, 9:25 AM: --- Hi, [~dmvk], [~edu05]. Could you please help answer one of my questions? If we can only recover the {{CompletedCheckpointStore}} once at startup, why does the {{sharedStateRegistry}} need to be re-registered every time the job is restarted? Is it possible to re-register only once at startup?
was (Author: ming li): Hi, [~dmvk], [~edu05]. Could you please help answer one of my questions? If we can only recover the ??CompletedCheckpointStore?? once at startup, why does the ??sharedStateRegistry?? need to be re-registered every time the job is restarted? Is it possible to re-register only once at startup?
> Recover checkpoints when JobMaster gains leadership
>
> Key: FLINK-22483
> URL: https://issues.apache.org/jira/browse/FLINK-22483
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.13.0
> Reporter: Robert Metzger
> Assignee: David Morávek
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.14.0
>
> Recovering checkpoints (from the CompletedCheckpointStore) is a potentially
> long-lasting/blocking operation, for example if the file system
> implementation is retrying to connect to an unavailable storage backend.
> Currently, we are calling the CompletedCheckpointStore.recover() method from
> the main thread of the JobManager, making it unresponsive to any RPC call
> while the recover method is blocked:
> {code}
> 2021-04-02 20:33:31,384 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job XXX switched from state RUNNING to RESTARTING.
> com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to minio.minio.svc:9000 [minio.minio.svc/] failed: Connection refused (Connection refused)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1207) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1153) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550) ~[?:?]
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530) ~[?:?]
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5062) ~[?:?]
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5008) ~[?:?]
> at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1490) ~[?:?]
> at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$openStream$1(PrestoS3FileSystem.java:905) ~[?:?]
> at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?]
> at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:902) ~[?:?]
> at > com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:887) > ~[?:?] > at > com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.seekStream(PrestoS3FileSystem.java:880) > ~[?:?] > at > com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$read$0(PrestoS3FileSystem.java:819) > ~[?:?] > at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?] > at > com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.read(PrestoS3FileSystem.java:818) > ~[?:?] > at java.io.BufferedInputStream.read1(BufferedInputStream.java:284) > ~[?:1.8.0_282] > at XXX.recover(KubernetesHaCheckpointStore.java:69) > ~[vvp-flink-ha-kubernetes-flink112-1.1.0.jar:?] > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedStateInternal(CheckpointCoordinator.java:1511) > ~[flink-dist_2.12-1.12.2-stream1.jar:1.12.2-stream1] > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedStateToA
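The fix direction the ticket describes amounts to running the blocking recovery on an I/O executor instead of the component's main thread. A minimal sketch of that pattern, assuming an illustrative CheckpointStore interface with a blocking recover() method (the names here are stand-ins, not Flink's actual API or the final implementation):

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Illustrative stand-in for a checkpoint store; not Flink's actual API. */
interface CheckpointStore {
    List<String> recover() throws Exception; // may block on a slow or unavailable file system
}

public class RecoveryOffMainThread {

    /**
     * Runs the blocking recovery on a dedicated I/O pool so the caller's main
     * thread stays responsive to RPCs while the storage backend retries.
     */
    static CompletableFuture<List<String>> recoverAsync(
            CheckpointStore store, ExecutorService ioExecutor) {
        return CompletableFuture.supplyAsync(
                () -> {
                    try {
                        return store.recover();
                    } catch (Exception e) {
                        throw new RuntimeException("Checkpoint recovery failed", e);
                    }
                },
                ioExecutor);
    }

    public static void main(String[] args) {
        ExecutorService ioExecutor = Executors.newSingleThreadExecutor();
        CheckpointStore store = () -> Collections.singletonList("checkpoint-42");

        // The main thread only attaches a callback; it never blocks on I/O.
        CompletableFuture<Void> done =
                recoverAsync(store, ioExecutor)
                        .thenAccept(cp -> System.out.println("Recovered: " + cp));

        done.join(); // only so this demo waits for the printout before exiting
        ioExecutor.shutdown();
    }
}
```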
[GitHub] [flink] wsry commented on a change in pull request #17037: [FLINK-24035][network] Guarantee that the LocalBufferPool is initialized with one buffer
wsry commented on a change in pull request #17037: URL: https://github.com/apache/flink/pull/17037#discussion_r698335837
## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/buffer/LocalBufferPool.java ##
@@ -216,9 +220,10 @@
 // Lock is only taken, because #checkAvailability asserts it. It's a small penalty for
 // thread safety.
 synchronized (this.availableMemorySegments) {
-if (checkAvailability()) {
-availabilityHelper.resetAvailable();
-}
+// guarantee that we have one buffer on initialization
Review comment: 1. Added more comments. 2. The argument check already guarantees that numberOfRequiredMemorySegments must be greater than 0. 3. From the code, it seems that checkAvailability only requests one buffer rather than all buffers at once (correct me if I am wrong). Do you mean we should enhance this logic to request all buffers at once? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
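For readers skimming the thread: the review is about whether the pool's constructor should eagerly obtain only its first memory segment or all required segments. A rough, self-contained sketch of the two options under discussion follows; this is simplified stand-in code with illustrative names, not the actual LocalBufferPool or the merged change.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Simplified stand-in for a buffer pool; illustrative only. */
class PoolInitializationSketch {
    private final Queue<byte[]> availableMemorySegments = new ArrayDeque<>();
    private final Queue<byte[]> globalPool; // where segments are requested from
    private final int numberOfRequiredMemorySegments;

    PoolInitializationSketch(Queue<byte[]> globalPool, int numberOfRequiredMemorySegments) {
        // The argument check guarantees at least one required segment.
        if (numberOfRequiredMemorySegments <= 0) {
            throw new IllegalArgumentException("numberOfRequiredMemorySegments must be > 0");
        }
        this.globalPool = globalPool;
        this.numberOfRequiredMemorySegments = numberOfRequiredMemorySegments;

        synchronized (availableMemorySegments) {
            // Option A (the PR's guarantee): request exactly one segment so the
            // pool is available immediately after construction.
            byte[] first = globalPool.poll();
            if (first != null) {
                availableMemorySegments.add(first);
            }

            // Option B (raised in the review): keep requesting until all
            // required segments are held, e.g.
            // while (availableMemorySegments.size() < numberOfRequiredMemorySegments) {
            //     byte[] next = globalPool.poll();
            //     if (next == null) { break; }
            //     availableMemorySegments.add(next);
            // }
        }
    }
}
```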
[GitHub] [flink-training] NicoK commented on a change in pull request #31: [FLINK-23653] exercise and test rework
NicoK commented on a change in pull request #31: URL: https://github.com/apache/flink-training/pull/31#discussion_r698320553
## File path: common/src/test/java/org/apache/flink/training/exercises/testing/ComposedRichCoFlatMapFunction.java ##
@@ -0,0 +1,70 @@
+package org.apache.flink.training.exercises.testing;
+
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction;
+import org.apache.flink.training.exercises.common.utils.MissingSolutionException;
+import org.apache.flink.util.Collector;
+
+/**
+ * A RichCoFlatMapFunction that can delegate to a RichCoFlatMapFunction in either the exercise or in
+ * the solution. The implementation in the exercise is tested first, and if it throws
+ * MissingSolutionException, then the solution is tested instead.
+ *
+ * This can be used to write test harness tests.
+ *
+ * @param <IN1> first input type
+ * @param <IN2> second input type
+ * @param <OUT> output type
+ */
+public class ComposedRichCoFlatMapFunction<IN1, IN2, OUT>
+        extends RichCoFlatMapFunction<IN1, IN2, OUT> {
+    private final RichCoFlatMapFunction<IN1, IN2, OUT> exercise;
+    private final RichCoFlatMapFunction<IN1, IN2, OUT> solution;
+    private boolean useExercise;
+
+    public ComposedRichCoFlatMapFunction(
+            RichCoFlatMapFunction<IN1, IN2, OUT> exercise,
+            RichCoFlatMapFunction<IN1, IN2, OUT> solution) {
+
+        this.exercise = exercise;
+        this.solution = solution;
+        this.useExercise = true;
+    }
+
+    @Override
+    public void open(Configuration parameters) throws Exception {
+
+        try {
+            exercise.setRuntimeContext(this.getRuntimeContext());
+            exercise.open(parameters);
+        } catch (Exception e) {
+            if (MissingSolutionException.ultimateCauseIsMissingSolution(e)) {
+                this.useExercise = false;
+                solution.setRuntimeContext(this.getRuntimeContext());
+                solution.open(parameters);
+            } else {
+                throw e;
+            }
+        }
+    }
+
+    @Override
+    public void flatMap1(IN1 value, Collector<OUT> out) throws Exception {
+
+        if (useExercise) {
+            exercise.flatMap1(value, out);
+        } else {
+            solution.flatMap1(value, out);
+        }
Review comment: Looking at this pattern again, what do you think about this alternative: - create a new member `private RichCoFlatMapFunction<IN1, IN2, OUT> functionToExecute;` (or maybe you have a better idea for the name) - assign either `exercise` or `solution` to it - just call `functionToExecute.flatMap1()` etc -> Less code, don't have to wait for JIT to optimise the boolean away (hopefully it can), no additional memory required (just a new reference that replaces the boolean)
## File path: hourly-tips/src/main/java/org/apache/flink/training/exercises/hourlytips/HourlyTipsExercise.java ##
@@ -40,18 +56,32 @@
 */
 public static void main(String[] args) throws Exception {
+    HourlyTipsSolution job =
+            new HourlyTipsSolution(new TaxiFareGenerator(), new PrintSinkFunction<>());
+
+    job.execute();
+}
+
+/**
+ * Create and execute the hourly tips pipeline.
+ *
+ * @return {JobExecutionResult}
+ * @throws Exception which occurs during job execution.
+ */
+public JobExecutionResult execute() throws Exception {
+
 // set up streaming execution environment
 StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
-env.setParallelism(ExerciseBase.parallelism);
 // start the data generator
-DataStream<TaxiFare> fares = env.addSource(fareSourceOrTest(new TaxiFareGenerator()));
-
-throw new MissingSolutionException();
+DataStream<TaxiFare> fares = env.addSource(new TaxiFareGenerator());
Review comment:
```suggestion
        DataStream<TaxiFare> fares = env.addSource(source);
```
## File path: hourly-tips/src/test/java/org/apache/flink/training/exercises/hourlytips/HourlyTipsTest.java ##
@@ -72,27 +84,39 @@ public void testMaxAcrossDrivers() throws Exception {
 TaxiFare tenFor1In2 = testFare(1, t(90), 10.0F);
Review comment: Can you add a couple more test fares so that chances are higher that these are processed on different parallel instances (if the users' code supports parallel instances)?
## File path: hourly-tips/src/test/java/org/apache/flink/training/exercises/hourlytips/HourlyTipsTest.java ##
@@ -72,27 +84,39 @@ public void testMaxAcrossDrivers() throws Exception {
 TaxiFare tenFor1In2 = testFare(1, t(90), 10.0F);
 TaxiFare twentyFor2In2 = testFare(2, t(90), 20.0F);
-TestFareSource source =
-new TestFareSource(oneFor1In1, fiveFor1In1, tenFor1In2, twentyFor2In2);
+ParallelTestSource<TaxiFare> source =
+new ParallelTestSource<>(oneFor1In1, fiveFor1In1, tenFor1In2, twentyFor2In2);
Tuple3 hour
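The alternative NicoK describes in the first comment — a single delegate reference assigned once in open() instead of a boolean checked at every call site — would look roughly like this. It is a sketch derived from the class shown above, not necessarily what the PR ended up merging:

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction;
import org.apache.flink.training.exercises.common.utils.MissingSolutionException;
import org.apache.flink.util.Collector;

public class ComposedRichCoFlatMapFunction<IN1, IN2, OUT>
        extends RichCoFlatMapFunction<IN1, IN2, OUT> {

    private final RichCoFlatMapFunction<IN1, IN2, OUT> exercise;
    private final RichCoFlatMapFunction<IN1, IN2, OUT> solution;

    // Single delegate reference instead of a boolean flag; assigned once in
    // open(), so every call site below is a plain delegation.
    private RichCoFlatMapFunction<IN1, IN2, OUT> functionToExecute;

    public ComposedRichCoFlatMapFunction(
            RichCoFlatMapFunction<IN1, IN2, OUT> exercise,
            RichCoFlatMapFunction<IN1, IN2, OUT> solution) {
        this.exercise = exercise;
        this.solution = solution;
    }

    @Override
    public void open(Configuration parameters) throws Exception {
        try {
            exercise.setRuntimeContext(getRuntimeContext());
            exercise.open(parameters);
            functionToExecute = exercise;
        } catch (Exception e) {
            if (MissingSolutionException.ultimateCauseIsMissingSolution(e)) {
                solution.setRuntimeContext(getRuntimeContext());
                solution.open(parameters);
                functionToExecute = solution;
            } else {
                throw e;
            }
        }
    }

    @Override
    public void flatMap1(IN1 value, Collector<OUT> out) throws Exception {
        functionToExecute.flatMap1(value, out);
    }

    @Override
    public void flatMap2(IN2 value, Collector<OUT> out) throws Exception {
        functionToExecute.flatMap2(value, out);
    }
}
```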
[GitHub] [flink] twalthr commented on a change in pull request #17011: [FLINK-23663][table-planner] Reduce state size of ChangelogNormalize
twalthr commented on a change in pull request #17011: URL: https://github.com/apache/flink/pull/17011#discussion_r698306556 ## File path: flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TableFactoryHarness.java ## @@ -0,0 +1,361 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.planner.factories; + +import org.apache.flink.configuration.ConfigOption; +import org.apache.flink.configuration.ConfigOptions; +import org.apache.flink.configuration.ReadableConfig; +import org.apache.flink.streaming.api.functions.sink.SinkFunction; +import org.apache.flink.streaming.api.functions.source.SourceFunction; +import org.apache.flink.table.api.Schema; +import org.apache.flink.table.api.TableDescriptor; +import org.apache.flink.table.api.ValidationException; +import org.apache.flink.table.catalog.CatalogTable; +import org.apache.flink.table.connector.ChangelogMode; +import org.apache.flink.table.connector.sink.DynamicTableSink; +import org.apache.flink.table.connector.sink.DynamicTableSink.SinkRuntimeProvider; +import org.apache.flink.table.connector.sink.SinkFunctionProvider; +import org.apache.flink.table.connector.source.DynamicTableSource; +import org.apache.flink.table.connector.source.LookupTableSource; +import org.apache.flink.table.connector.source.ScanTableSource; +import org.apache.flink.table.connector.source.ScanTableSource.ScanRuntimeProvider; +import org.apache.flink.table.connector.source.SourceFunctionProvider; +import org.apache.flink.table.connector.source.TableFunctionProvider; +import org.apache.flink.table.data.RowData; +import org.apache.flink.table.factories.DynamicTableFactory; +import org.apache.flink.table.factories.DynamicTableSinkFactory; +import org.apache.flink.table.factories.DynamicTableSourceFactory; +import org.apache.flink.table.factories.FactoryUtil; +import org.apache.flink.table.functions.TableFunction; +import org.apache.flink.testutils.junit.SharedObjects; +import org.apache.flink.util.InstantiationUtil; + +import java.io.IOException; +import java.io.Serializable; +import java.util.Base64; +import java.util.HashSet; +import java.util.Set; + +/** + * Provides a flexible testing harness for table factories. + * + * This testing harness allows writing custom sources and sinks which can be directly + * instantiated from the test. This avoids having to implement a factory, and enables using the + * {@link SharedObjects} rule to get direct access to the underlying source/sink from the test. + * + * Note that the underlying source/sink must be {@link Serializable}. 
It is recommended to extend
+ * from {@link ScanSourceBase}, {@link LookupSourceBase}, or {@link SinkBase} which provide default
+ * implementations for most methods as well as some convenience methods.
+ *
+ * The harness provides a {@link Factory}. You can register a source / sink through configuration
+ * by passing a base64-encoded serialization. The harness provides convenience methods to make this
+ * process as simple as possible.
+ *
+ * Example:
+ *
+ * {@code
+ * public class CustomSourceTest {
+ *     {@literal @}Rule public SharedObjects sharedObjects = SharedObjects.create();
+ *
+ *     {@literal @}Test
+ *     public void test() {
+ *         SharedReference<List<Long>> appliedLimits = sharedObjects.add(new ArrayList<>());
+ *
+ *         Schema schema = Schema.newBuilder().build();
Review comment: nit: use the new `Schema.derived()`
## File path: flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TableFactoryHarness.java ##
@@ -0,0 +1,361 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ *
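The registration trick the harness javadoc describes — passing a serializable source or sink through a string-typed option as a base64-encoded serialization — reduces to standard Java serialization plus base64. A generic round-trip sketch follows; it uses plain JDK APIs for self-containedness, while the harness itself imports Flink's InstantiationUtil and java.util.Base64 for the same purpose:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

/** Generic encode/decode helpers for smuggling objects through string options. */
final class SerializedOption {

    /** Serializes the object and wraps the bytes in a base64 string. */
    static String encode(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    /** Reverses encode(): decodes the base64 string and deserializes the object. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T decode(String encoded)
            throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (T) in.readObject();
        }
    }

    private SerializedOption() {}
}
```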
[jira] [Assigned] (FLINK-23850) Test Kafka table connector with new runtime provider
[ https://issues.apache.org/jira/browse/FLINK-23850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias reassigned FLINK-23850: Assignee: Matthias
> Test Kafka table connector with new runtime provider
>
> Key: FLINK-23850
> URL: https://issues.apache.org/jira/browse/FLINK-23850
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Kafka
> Affects Versions: 1.14.0
> Reporter: Qingsheng Ren
> Assignee: Matthias
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.14.0
>
> The runtime provider of the Kafka table connector has been replaced with the
> new KafkaSource and KafkaSink. The table connector needs to be tested to make
> sure there are no surprises for Table/SQL API users.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #17037: [FLINK-24035][network] Guarantee that the LocalBufferPool is initialized with one buffer
flinkbot edited a comment on pull request #17037: URL: https://github.com/apache/flink/pull/17037#issuecomment-907731112 ## CI report: * 71ddbbb89bf81896a5acca9466e50d0a6e544758 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23020) * a1fc01a4a4c3373f4f931c688aa587fb0d3dfc01 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-24052) Flink SQL reads S3 bucket data.
Moses created FLINK-24052: - Summary: Flink SQL reads S3 bucket data. Key: FLINK-24052 URL: https://issues.apache.org/jira/browse/FLINK-24052 Project: Flink Issue Type: Improvement Components: Table SQL / Ecosystem Reporter: Moses
I want to use Flink SQL to read S3 bucket data, but I found that it ONLY supports absolute paths, which means I cannot read all the content in the bucket. My SQL statements are written as below:
{code:sql}
CREATE TABLE file_data (
  a BIGINT, b STRING, c STRING, d DOUBLE, e BOOLEAN, f DATE, g STRING, h STRING,
  i STRING, j STRING, k STRING, l STRING, m STRING, n STRING, o STRING, p FLOAT
) WITH (
  'connector' = 'filesystem',
  'path' = 's3a://my-bucket',
  'format' = 'parquet'
);
SELECT COUNT(*) FROM file_data;
{code}
The exception info:
{code:java}
Caused by: java.lang.IllegalArgumentException: path must be absolute
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) ~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
    at org.apache.hadoop.fs.s3a.s3guard.PathMetadata.<init>(PathMetadata.java:68) ~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
    at org.apache.hadoop.fs.s3a.s3guard.PathMetadata.<init>(PathMetadata.java:60) ~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
    at org.apache.hadoop.fs.s3a.s3guard.PathMetadata.<init>(PathMetadata.java:56) ~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
    at org.apache.hadoop.fs.s3a.s3guard.S3Guard.putAndReturn(S3Guard.java:149) ~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
{code}
Is there any solution to meet my requirement?
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-24036) SSL cannot be installed on CI
[ https://issues.apache.org/jira/browse/FLINK-24036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-24036: - Fix Version/s: (was: 1.15.0) > SSL cannot be installed on CI > - > > Key: FLINK-24036 > URL: https://issues.apache.org/jira/browse/FLINK-24036 > Project: Flink > Issue Type: Bug > Components: Build System / CI >Affects Versions: 1.14.0, 1.12.5, 1.13.2, 1.15.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Blocker > Fix For: 1.14.0, 1.12.6, 1.13.3 > > > {code} > # install libssl1.0.0 for netty tcnative > wget > http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.6_amd64.deb > sudo apt install ./libssl1.0.0_1.0.2n-1ubuntu5.6_amd64.deb > {code} > {code} > --2021-08-27 20:48:49-- > http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.6_amd64.deb > Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, > 91.189.91.38, 2001:67c:1562::15, ... > Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... > connected. > HTTP request sent, awaiting response... 404 Not Found > 2021-08-27 20:48:49 ERROR 404: Not Found. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] zentol merged pull request #17047: [FLINK-24027][filesystems] Remove excessive dependencies from NOTICE files
zentol merged pull request #17047: URL: https://github.com/apache/flink/pull/17047 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-24027) FileSystems list excessive dependencies in NOTICE
[ https://issues.apache.org/jira/browse/FLINK-24027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-24027. Resolution: Fixed master: 401d7241c7e1eb432d988af3ef097dc5a633c9b4 1.14: 4c23df38555e364df9f14c67452d06de653ecdc2 > FileSystems list excessive dependencies in NOTICE > - > > Key: FLINK-24027 > URL: https://issues.apache.org/jira/browse/FLINK-24027 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.14.0 >Reporter: Chesnay Schepler >Assignee: Ingo Bürk >Priority: Blocker > Labels: pull-request-available > Fix For: 1.14.0 > > > The LicenseChecker finds several dependencies that are listed in the NOTICE > but do not show up in the shade-plugin output. It could be that after the > recent AWS/Hadoop bumps these are no longer being bundled (needs > confirmation!). > {code} > 17:05:14,651 WARN NoticeFileChecker [] - Dependency > com.fasterxml.jackson.core:jackson-annotations:2.12.1 is mentioned in NOTICE > file > /__w/1/s/flink-filesystems/flink-azure-fs-hadoop/src/main/resources/META-INF/NOTICE, > but is not expected there > 17:05:14,651 WARN NoticeFileChecker [] - Dependency > com.fasterxml.jackson.core:jackson-databind:2.12.1 is mentioned in NOTICE > file > /__w/1/s/flink-filesystems/flink-azure-fs-hadoop/src/main/resources/META-INF/NOTICE, > but is not expected there > 17:05:14,652 WARN NoticeFileChecker [] - Dependency > org.apache.hadoop.thirdparty:hadoop-shaded-protobuf_3_7:1.1.1 is mentioned in > NOTICE file > /__w/1/s/flink-filesystems/flink-s3-fs-presto/src/main/resources/META-INF/NOTICE, > but is not expected there > 17:05:14,652 WARN NoticeFileChecker [] - Dependency > org.apache.hadoop.thirdparty:hadoop-shaded-guava:1.1.1 is mentioned in NOTICE > file > /__w/1/s/flink-filesystems/flink-s3-fs-presto/src/main/resources/META-INF/NOTICE, > but is not expected there > 17:05:14,652 WARN NoticeFileChecker [] - Dependency > org.wildfly.openssl:wildfly-openssl:1.0.7.Final is mentioned in NOTICE file > /__w/1/s/flink-filesystems/flink-s3-fs-presto/src/main/resources/META-INF/NOTICE, > but is not expected there > 17:05:14,652 WARN NoticeFileChecker [] - Dependency > commons-lang:commons-lang:2.6 is mentioned in NOTICE file > /__w/1/s/flink-filesystems/flink-s3-fs-presto/src/main/resources/META-INF/NOTICE, > but is not expected there > 17:05:14,741 WARN NoticeFileChecker [] - Dependency > commons-lang:commons-lang:2.6 is mentioned in NOTICE file > /__w/1/s/flink-filesystems/flink-fs-hadoop-shaded/src/main/resources/META-INF/NOTICE, > but is not expected there > 17:05:14,743 WARN NoticeFileChecker [] - Dependency > org.apache.hadoop.thirdparty:hadoop-shaded-protobuf_3_7:1.1.1 is mentioned in > NOTICE file > /__w/1/s/flink-filesystems/flink-s3-fs-hadoop/src/main/resources/META-INF/NOTICE, > but is not expected there > 17:05:14,743 WARN NoticeFileChecker [] - Dependency > org.apache.hadoop.thirdparty:hadoop-shaded-guava:1.1.1 is mentioned in NOTICE > file > /__w/1/s/flink-filesystems/flink-s3-fs-hadoop/src/main/resources/META-INF/NOTICE, > but is not expected there > 17:05:14,743 WARN NoticeFileChecker [] - Dependency > org.wildfly.openssl:wildfly-openssl:1.0.7.Final is mentioned in NOTICE file > /__w/1/s/flink-filesystems/flink-s3-fs-hadoop/src/main/resources/META-INF/NOTICE, > but is not expected there > 17:05:14,743 WARN NoticeFileChecker [] - Dependency > commons-lang:commons-lang:2.6 is mentioned in NOTICE file > /__w/1/s/flink-filesystems/flink-s3-fs-hadoop/src/main/resources/META-INF/NOTICE, > 
but is not expected there > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #17048: [FLINK-24033][table-planner] Propagate unique keys for fromChangelogStream
flinkbot edited a comment on pull request #17048: URL: https://github.com/apache/flink/pull/17048#issuecomment-908188488 ## CI report: * 44a4c5b89d03c403df73fae28d464a557444323a Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23088) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann commented on a change in pull request #17027: [FLINK-23954][tests] Fix e2e test test_queryable_state_restart_tm.sh
tillrohrmann commented on a change in pull request #17027: URL: https://github.com/apache/flink/pull/17027#discussion_r698366255
## File path: flink-end-to-end-tests/flink-queryable-state-test/src/main/java/org/apache/flink/streaming/tests/queryablestate/QsStateProducer.java ##
@@ -158,24 +161,34 @@ public void open(Configuration parameters) throws Exception {
 stateDescriptor.setQueryable(QsConstants.QUERY_NAME);
 state = getRuntimeContext().getMapState(stateDescriptor);
-updateCount();
-
-LOG.info("Open {} with a count of {}.", getClass().getSimpleName(), count);
-}
-
-private void updateCount() throws Exception {
-count = Iterables.size(state.keys());
+countsAtCheckpoint = new HashMap<>();
+count = -1;
+lastCompletedCheckpoint = -1;
 }
 @Override
 public void flatMap(Email value, Collector out) throws Exception {
 state.put(value.getEmailId(), new EmailInformation(value));
-updateCount();
+count = Iterables.size(state.keys());
 }
 @Override
 public void notifyCheckpointComplete(long checkpointId) {
-System.out.println("Count on snapshot: " + count); // we look for it in the test
+if (checkpointId > lastCompletedCheckpoint) {
Review comment: Yes, it is a safety net so that we don't output decreasing values. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann commented on pull request #17027: [FLINK-23954][tests] Fix e2e test test_queryable_state_restart_tm.sh
tillrohrmann commented on pull request #17027: URL: https://github.com/apache/flink/pull/17027#issuecomment-908221973 Thanks for the review @zentol. Merging this PR now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org