[GitHub] [flink] wuchong commented on pull request #17038: [FLINK-24024][table-planner] Fix syntax mistake in session Window TVF

2021-08-30 Thread GitBox


wuchong commented on pull request #17038:
URL: https://github.com/apache/flink/pull/17038#issuecomment-908084677


   Looks good to me. 
   
   cc @godfreyhe 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] gyfora commented on pull request #17022: [FLINK-24020][web] Aggregate HTTP requests before custom netty handlers are getting the data

2021-08-30 Thread GitBox


gyfora commented on pull request #17022:
URL: https://github.com/apache/flink/pull/17022#issuecomment-908087499


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wuchong commented on pull request #16723: [FLINK-22885][table] Supports 'SHOW COLUMNS' syntax.

2021-08-30 Thread GitBox


wuchong commented on pull request #16723:
URL: https://github.com/apache/flink/pull/16723#issuecomment-908087936


   @RocMarshal  Please do not use "git merge" to rebase branches; otherwise the 
changes are hard to track. Please use "git rebase" instead. IntelliJ IDEA 
provides an easy tool for rebasing; you can find it via `VCS -> Git -> 
Rebase`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16963: [FLINK-23582][docs][table] Add documentation for session window tvf.

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #16963:
URL: https://github.com/apache/flink/pull/16963#issuecomment-904569734


   
   ## CI report:
   
   * 3f642deaa05216d448a83522964416044891297b Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23064)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17022: [FLINK-24020][web] Aggregate HTTP requests before custom netty handlers are getting the data

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17022:
URL: https://github.com/apache/flink/pull/17022#issuecomment-907158087


   
   ## CI report:
   
   * f954017eb4d81acc1283bae464d7247bda38d904 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22963)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23071)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17019: [FLINK-23854][connectors/Kafka] Ensure lingering Kafka transactions are aborted reliably.

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17019:
URL: https://github.com/apache/flink/pull/17019#issuecomment-907120059


   
   ## CI report:
   
   * fa25b27d642ab7e6af356ca1e9738a006d48 UNKNOWN
   * 6120b584b4b4ee9286f4998065f122c9a7de Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23047)
 
   * 0a6ed86ec26cf3fd8a85ccb00d5b504aa31c4d9b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17044: [FLINK-24010][connector/common] HybridSourceReader/Enumerator delegate checkpoint notifications

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17044:
URL: https://github.com/apache/flink/pull/17044#issuecomment-907990627


   
   ## CI report:
   
   * d0e16223f3b69b417b90c1320117168b8dc5e59e Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23060)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17045: [FLINK-23924][python][examples] Add PyFlink examples

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17045:
URL: https://github.com/apache/flink/pull/17045#issuecomment-908007478


   
   ## CI report:
   
   * 77e0e48a0f9c3f2f62e6e19ca5f4253060c13b07 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23061)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-23686) KafkaSource metric "commitsSucceeded" should count per-commit instead of per-partition

2021-08-30 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-23686:


Assignee: Qingsheng Ren

> KafkaSource metric "commitsSucceeded" should count per-commit instead of 
> per-partition
> --
>
> Key: FLINK-23686
> URL: https://issues.apache.org/jira/browse/FLINK-23686
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.0, 1.13.2
>Reporter: Qingsheng Ren
>Assignee: Qingsheng Ren
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0, 1.13.3
>
>
> Currently, if a successful offset commit includes multiple topic partitions 
> (say, 4), the counter increases by 4 instead of 1.
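
A minimal sketch of the expected counting behavior, using hypothetical names 
rather than Flink's actual metric API:

{code:java}
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical metric holder: one successful commit bumps the counter once,
// no matter how many topic partitions the commit covered.
class CommitMetrics {
    private final AtomicLong commitsSucceeded = new AtomicLong();

    void onCommitSuccess(Map<String, Long> committedOffsetsPerPartition) {
        // Per-commit counting: +1 here, not +committedOffsetsPerPartition.size().
        commitsSucceeded.incrementAndGet();
    }

    long getCommitsSucceeded() {
        return commitsSucceeded.get();
    }
}
{code}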



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #17046: [FLINK-24049][python] Properly handle field types that need conversion in TupleTypeInfo

2021-08-30 Thread GitBox


flinkbot commented on pull request #17046:
URL: https://github.com/apache/flink/pull/17046#issuecomment-908096451


   
   ## CI report:
   
   * 0ce3f1e021ffd0a8278412dda43198f8c1eb0b8c UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #17047: [FLINK-24027][filesystems] Remove excessive dependencies from NOTICE files

2021-08-30 Thread GitBox


flinkbot commented on pull request #17047:
URL: https://github.com/apache/flink/pull/17047#issuecomment-908096579


   
   ## CI report:
   
   * cba327f4c75a11dcb018505333abf42e50f0b7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol commented on a change in pull request #17027: [FLINK-23954][tests] Fix e2e test test_queryable_state_restart_tm.sh

2021-08-30 Thread GitBox


zentol commented on a change in pull request #17027:
URL: https://github.com/apache/flink/pull/17027#discussion_r698253003



##
File path: flink-end-to-end-tests/flink-queryable-state-test/src/main/java/org/apache/flink/streaming/tests/queryablestate/QsStateProducer.java
##
@@ -158,24 +161,34 @@ public void open(Configuration parameters) throws Exception {
 
         stateDescriptor.setQueryable(QsConstants.QUERY_NAME);
         state = getRuntimeContext().getMapState(stateDescriptor);
-        updateCount();
-
-        LOG.info("Open {} with a count of {}.", getClass().getSimpleName(), count);
-    }
-
-    private void updateCount() throws Exception {
-        count = Iterables.size(state.keys());
+        countsAtCheckpoint = new HashMap<>();
+        count = -1;
+        lastCompletedCheckpoint = -1;
     }
 
     @Override
     public void flatMap(Email value, Collector out) throws Exception {
         state.put(value.getEmailId(), new EmailInformation(value));
-        updateCount();
+        count = Iterables.size(state.keys());
     }
 
     @Override
     public void notifyCheckpointComplete(long checkpointId) {
-        System.out.println("Count on snapshot: " + count); // we look for it in the test
+        if (checkpointId > lastCompletedCheckpoint) {

Review comment:
   Is this required to ensure the count (as seen from the test) does not 
decrease?
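
   For context, a standalone toy version of the guard under discussion (not 
the PR's actual code): completion notifications can arrive late or twice, so 
only a strictly larger checkpointId may publish its recorded count, which 
keeps the count seen by the test from going backwards.

```java
import java.util.HashMap;
import java.util.Map;

class CheckpointCountPublisher {
    private final Map<Long, Long> countsAtCheckpoint = new HashMap<>();
    private long lastCompletedCheckpoint = -1;

    // Record the state size observed when checkpoint `checkpointId` was taken.
    void onSnapshot(long checkpointId, long count) {
        countsAtCheckpoint.put(checkpointId, count);
    }

    // Publish only for checkpoints newer than anything published so far.
    void notifyCheckpointComplete(long checkpointId) {
        if (checkpointId > lastCompletedCheckpoint) {
            lastCompletedCheckpoint = checkpointId;
            Long count = countsAtCheckpoint.remove(checkpointId);
            if (count != null) {
                System.out.println("Count on snapshot: " + count);
            }
        }
    }
}
```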




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol commented on pull request #17034: [BP-1.13][FLINK-23947] Improve logging of granting/revoking leadership in DefaultDispatcherRunner

2021-08-30 Thread GitBox


zentol commented on pull request #17034:
URL: https://github.com/apache/flink/pull/17034#issuecomment-908106689


   Could you rebase the PR again? 😢 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-24044) Errors are output when compiling flink-runtime-web

2021-08-30 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-24044:
-
Component/s: Build System

> Errors are output when compiling flink-runtime-web
> --
>
> Key: FLINK-24044
> URL: https://issues.apache.org/jira/browse/FLINK-24044
> Project: Flink
>  Issue Type: Bug
>  Components: Build System, Runtime / Web Frontend
>Affects Versions: 1.14.0
>Reporter: Zhilong Hong
>Priority: Minor
>
> When compiling the module {{flink-runtime-web}}, the terminal would be filled 
> with errors as below:
> {code:bash}
> [ERROR] - Generating browser application bundles (phase: setup)... 
> [ERROR] 
> [ERROR] Compiling @angular/core : es2015 as esm2015 
> [ERROR] 
> [ERROR] Compiling @angular/common : es2015 as esm2015 
> [ERROR] 
> [ERROR] Compiling @angular/platform-browser : es2015 as esm2015 
> [ERROR] 
> [ERROR] Compiling @angular/platform-browser-dynamic : es2015 as esm2015 
> [ERROR] 
> [ERROR] Compiling @angular/router : es2015 as esm2015
> ...
> {code}
>  
>  Although it doesn't break the compilation, maybe we should fix this.
> I'm not familiar with the module flink-runtime-web or Angular. I found 
> something that may be useful: 
> [https://github.com/angular/angular/issues/36513]
> [https://github.com/angular/angular/issues/31853]
> [https://stackoverflow.com/questions/57220392/even-though-i-am-using-target-esm2015-ivy-is-compiles-as-a-esm5-module/57220445#57220445]
> [https://stackoverflow.com/questions/61304548/simple-angular-9-project-compiles-whole-angular-material-es2015-as-esm2015]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23555) Improve common subexpression elimination by using local references

2021-08-30 Thread weibowen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibowen updated FLINK-23555:
-
Description: 
When we write SQL like 
{code:java}
select udf2(udf1(field), udf3(udf1(field) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, it will have a huge 
impact on the whole task: the more times it is invoked, the bigger the impact.

I hope that no matter how many times udf1(field) is written in the SQL, Flink 
will take advantage of common subexpression elimination and invoke it only 
once.

I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic, and I read the source from Kafka 
and sink into a blackhole. The parallelism is 1.

The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
aliases and do nothing at all.

As expected, the performance after the optimization is approximately 3 times 
better than before, since I write `testcse(sid)` 3 times in the SQL.

  was:
When we write SQL like 
{code:java}
select udf2(udf1(field), udf3(udf1(field) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, it will have a huge 
impact on the whole task: the more times it is invoked, the bigger the impact.

I hope that no matter how many times udf1(field) is written in the SQL, Flink 
will take advantage of common subexpression elimination and invoke it only 
once.

I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic, and I read the source from Kafka 
and sink into a blackhole. The parallelism is 1.

The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
aliases and do nothing at all.

As expected, the performance after the optimization is approximately 3 times 
better than before, since I write `testcse(sid)` 3 times in the SQL.


> Improve common subexpression elimination by using local references
> --
>
> Key: FLINK-23555
> URL: https://issues.apache.org/jira/browse/FLINK-23555
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Reporter: weibowen
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: performance_after_optimization.png, 
> performance_before_optimization.png, sql.png, udf.png
>
>
> When we write SQL like 
> {code:java}
> select udf2(udf1(field), udf3(udf1(field) ...{code}
> udf1(field) will be invoked twice. If udf1 performs badly, it will have a 
> huge impact on the whole task: the more times it is invoked, the bigger the 
> impact.
> I hope that no matter how many times udf1(field) is written in the SQL, 
> Flink will take advantage of common subexpression elimination and invoke it 
> only once.
> I did some work on this, and the attachments show the result.
>  
> The sql.png attachment shows the SQL logic, and I read the source from 
> Kafka and sink into a blackhole. The parallelism is 1.
> The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
> UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
> aliases and do nothing at all.
> As expected, the performance after the optimization is approximately 3 
> times better than before, since I write `testcse(sid)` 3 times in the SQL.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zhuzhurk commented on pull request #16856: [FLINK-23833][coordination] Individually clean up the cache for ShuffleDescriptors

2021-08-30 Thread GitBox


zhuzhurk commented on pull request #16856:
URL: https://github.com/apache/flink/pull/16856#issuecomment-908108473


   Merging.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-23555) Improve common subexpression elimination by using local references

2021-08-30 Thread weibowen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibowen updated FLINK-23555:
-
Issue Type: Bug  (was: Improvement)

> Improve common subexpression elimination by using local references
> --
>
> Key: FLINK-23555
> URL: https://issues.apache.org/jira/browse/FLINK-23555
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: weibowen
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: performance_after_optimization.png, 
> performance_before_optimization.png, sql.png, udf.png
>
>
> When we write SQL like 
> {code:java}
> select udf2(udf1(field), udf3(udf1(field) ...{code}
> udf1(field) will be invoked twice. If udf1 performs badly, it will have a 
> huge impact on the whole task: the more times it is invoked, the bigger the 
> impact.
> I hope that no matter how many times udf1(field) is written in the SQL, 
> Flink will take advantage of common subexpression elimination and invoke it 
> only once.
> I did some work on this, and the attachments show the result.
>  
> The sql.png attachment shows the SQL logic, and I read the source from 
> Kafka and sink into a blackhole. The parallelism is 1.
> The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
> UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
> aliases and do nothing at all.
> As expected, the performance after the optimization is approximately 3 
> times better than before, since I write `testcse(sid)` 3 times in the SQL.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23555) Improve common subexpression elimination by using local references

2021-08-30 Thread weibowen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibowen updated FLINK-23555:
-
Description: 
When we write SQL like 
{code:java}
select udf2(udf1(field), udf3(udf1(field) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, it will have a huge 
impact on the whole task: the more times it is invoked, the bigger the impact.

I hope that no matter how many times udf1(field) is written in the SQL, Flink 
will take advantage of common subexpression elimination and invoke it only 
once.

I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic,

!!sql.png

and I read the source from Kafka and sink into a blackhole. The parallelism 
is 1.

The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
aliases and do nothing at all.

!!udf.png

As expected, the performance after the optimization is approximately 3 times 
better than before, since I write `testcse(sid)` 3 times in the SQL.

!!performance_befort_optimization.png

!!performance_after_optimization.png

  was:
When we write SQL like 
{code:java}
select udf2(udf1(field), udf3(udf1(field) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, it will have a huge 
impact on the whole task: the more times it is invoked, the bigger the impact.

I hope that no matter how many times udf1(field) is written in the SQL, Flink 
will take advantage of common subexpression elimination and invoke it only 
once.

I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic,

!!sql.png

and I read the source from Kafka and sink into a blackhole. The parallelism 
is 1.

The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
aliases and do nothing at all.

!!udf.png

As expected, the performance after the optimization is approximately 3 times 
better than before, since I write `testcse(sid)` 3 times in the SQL.

!!

 


> Improve common subexpression elimination by using local references
> --
>
> Key: FLINK-23555
> URL: https://issues.apache.org/jira/browse/FLINK-23555
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: weibowen
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: performance_after_optimization.png, 
> performance_before_optimization.png, sql.png, udf.png
>
>
> When we write SQL like 
> {code:java}
> select udf2(udf1(field), udf3(udf1(field) ...{code}
> udf1(field) will be invoked twice. If udf1 performs badly, it will have a 
> huge impact on the whole task: the more times it is invoked, the bigger the 
> impact.
> I hope that no matter how many times udf1(field) is written in the SQL, 
> Flink will take advantage of common subexpression elimination and invoke it 
> only once.
> I did some work on this, and the attachments show the result.
>  
> The sql.png attachment shows the SQL logic,
> !!sql.png
> and I read the source from Kafka and sink into a blackhole. The parallelism 
> is 1.
> The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
> UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
> aliases and do nothing at all.
> !!udf.png
> As expected, the performance after the optimization is approximately 3 
> times better than before, since I write `testcse(sid)` 3 times in the SQL.
> !!performance_befort_optimization.png
> !!performance_after_optimization.png
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23555) Improve common subexpression elimination by using local references

2021-08-30 Thread weibowen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibowen updated FLINK-23555:
-
Description: 
When we write SQL like 
{code:java}
select udf2(udf1(field), udf3(udf1(field) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, it will have a huge 
impact on the whole task: the more times it is invoked, the bigger the impact.

I hope that no matter how many times udf1(field) is written in the SQL, Flink 
will take advantage of common subexpression elimination and invoke it only 
once.

I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic,

!!sql.png

and I read the source from Kafka and sink into a blackhole. The parallelism 
is 1.

The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
aliases and do nothing at all.

!!udf.png

As expected, the performance after the optimization is approximately 3 times 
better than before, since I write `testcse(sid)` 3 times in the SQL.

!!

  was:
When we write SQL like 
{code:java}
select udf2(udf1(field), udf3(udf1(field) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, it will have a huge 
impact on the whole task: the more times it is invoked, the bigger the impact.

I hope that no matter how many times udf1(field) is written in the SQL, Flink 
will take advantage of common subexpression elimination and invoke it only 
once.

I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic, and I read the source from Kafka 
and sink into a blackhole. The parallelism is 1.

The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
aliases and do nothing at all.

As expected, the performance after the optimization is approximately 3 times 
better than before, since I write `testcse(sid)` 3 times in the SQL.

 


> Improve common subexpression elimination by using local references
> --
>
> Key: FLINK-23555
> URL: https://issues.apache.org/jira/browse/FLINK-23555
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: weibowen
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: performance_after_optimization.png, 
> performance_before_optimization.png, sql.png, udf.png
>
>
> When we write SQL like 
> {code:java}
> select udf2(udf1(field), udf3(udf1(field) ...{code}
> udf1(field) will be invoked twice. If udf1 performs badly, it will have a 
> huge impact on the whole task: the more times it is invoked, the bigger the 
> impact.
> I hope that no matter how many times udf1(field) is written in the SQL, 
> Flink will take advantage of common subexpression elimination and invoke it 
> only once.
> I did some work on this, and the attachments show the result.
>  
> The sql.png attachment shows the SQL logic,
> !!sql.png
> and I read the source from Kafka and sink into a blackhole. The parallelism 
> is 1.
> The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
> UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
> aliases and do nothing at all.
> !!udf.png
> As expected, the performance after the optimization is approximately 3 
> times better than before, since I write `testcse(sid)` 3 times in the SQL.
> !!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23555) Improve common subexpression elimination by using local references

2021-08-30 Thread weibowen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibowen updated FLINK-23555:
-
Description: 
When we write SQL like 
{code:java}
select udf2(udf1(field), udf3(udf1(field) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, it will have a huge 
impact on the whole task: the more times it is invoked, the bigger the impact.

I hope that no matter how many times udf1(field) is written in the SQL, Flink 
will take advantage of common subexpression elimination and invoke it only 
once.

I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic,

!sql.png!

and I read the source from Kafka and sink into a blackhole. The parallelism 
is 1.

The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
aliases and do nothing at all.

!udf.png!

As expected, the performance after the optimization is approximately 3 times 
better than before, since I write `testcse(sid)` 3 times in the SQL.

!performance_befort_optimization.png!

!performance_after_optimization.png!

  was:
When we write SQL like 
{code:java}
select udf2(udf1(field), udf3(udf1(field) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, it will have a huge 
impact on the whole task: the more times it is invoked, the bigger the impact.

I hope that no matter how many times udf1(field) is written in the SQL, Flink 
will take advantage of common subexpression elimination and invoke it only 
once.

I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic,

!!sql.png

and I read the source from Kafka and sink into a blackhole. The parallelism 
is 1.

The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
aliases and do nothing at all.

!!udf.png

As expected, the performance after the optimization is approximately 3 times 
better than before, since I write `testcse(sid)` 3 times in the SQL.

!!performance_befort_optimization.png

!!performance_after_optimization.png

 


> Improve common subexpression elimination by using local references
> --
>
> Key: FLINK-23555
> URL: https://issues.apache.org/jira/browse/FLINK-23555
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: weibowen
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: performance_after_optimization.png, 
> performance_before_optimization.png, sql.png, udf.png
>
>
> When we write SQL like 
> {code:java}
> select udf2(udf1(field), udf3(udf1(field) ...{code}
> udf1(field) will be invoked twice. If udf1 performs badly, it will have a 
> huge impact on the whole task: the more times it is invoked, the bigger the 
> impact.
> I hope that no matter how many times udf1(field) is written in the SQL, 
> Flink will take advantage of common subexpression elimination and invoke it 
> only once.
> I did some work on this, and the attachments show the result.
>  
> The sql.png attachment shows the SQL logic,
> !sql.png!
> and I read the source from Kafka and sink into a blackhole. The parallelism 
> is 1.
> The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
> UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
> aliases and do nothing at all.
> !udf.png!
> As expected, the performance after the optimization is approximately 3 
> times better than before, since I write `testcse(sid)` 3 times in the SQL.
> !performance_befort_optimization.png!
> !performance_after_optimization.png!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #17019: [FLINK-23854][connectors/Kafka] Ensure lingering Kafka transactions are aborted reliably.

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17019:
URL: https://github.com/apache/flink/pull/17019#issuecomment-907120059


   
   ## CI report:
   
   * fa25b27d642ab7e6af356ca1e9738a006d48 UNKNOWN
   * 6120b584b4b4ee9286f4998065f122c9a7de Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23047)
 
   * 0a6ed86ec26cf3fd8a85ccb00d5b504aa31c4d9b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23068)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17011: [FLINK-23663][table-planner] Reduce state size of ChangelogNormalize

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17011:
URL: https://github.com/apache/flink/pull/17011#issuecomment-907015222


   
   ## CI report:
   
   * e13ee18ff9f927cb23c670236f83956873cfcc1e Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22958)
 
   * 211af17558902443958144ba9f202b75c719982e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17022: [FLINK-24020][web] Aggregate HTTP requests before custom netty handlers are getting the data

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17022:
URL: https://github.com/apache/flink/pull/17022#issuecomment-907158087


   
   ## CI report:
   
   * f954017eb4d81acc1283bae464d7247bda38d904 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23071)
 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22963)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17038: [FLINK-24024][table-planner] Fix syntax mistake in session Window TVF

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17038:
URL: https://github.com/apache/flink/pull/17038#issuecomment-907732897


   
   ## CI report:
   
   * c6982f9e3518e2908a4c206d0aad363d22fab2d7 UNKNOWN
   * 209ebeb33f4cb30e50d87728af0db0c9088e0303 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23059)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17047: [FLINK-24027][filesystems] Remove excessive dependencies from NOTICE files

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17047:
URL: https://github.com/apache/flink/pull/17047#issuecomment-908096579


   
   ## CI report:
   
   * cba327f4c75a11dcb018505333abf42e50f0b7f7 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23073)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-23555) Improve common subexpression elimination by using local references

2021-08-30 Thread weibowen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibowen updated FLINK-23555:
-
Description: 
When we write SQL like 
{code:java}
select udf2(udf1(field), udf3(udf1(field) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, it will have a huge 
impact on the whole task: the more times it is invoked, the bigger the impact.

I hope that no matter how many times udf1(field) is written in the SQL, Flink 
will take advantage of common subexpression elimination and invoke it only 
once.

I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic,

!sql.png!

and I read the source from Kafka and sink into a blackhole. The parallelism 
is 1.

The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
aliases and do nothing at all.

!udf.png!

As expected, the performance after the optimization is approximately 3 times 
better than before, since I write `testcse(sid)` 3 times in the SQL.

!performace_before_optimization.png!

!performance_after_optimization.png!

  was:
When we write SQL like 
{code:java}
select udf2(udf1(field), udf3(udf1(field) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, it will have a huge 
impact on the whole task: the more times it is invoked, the bigger the impact.

I hope that no matter how many times udf1(field) is written in the SQL, Flink 
will take advantage of common subexpression elimination and invoke it only 
once.

I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic,

!sql.png!

and I read the source from Kafka and sink into a blackhole. The parallelism 
is 1.

The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
aliases and do nothing at all.

!udf.png!

As expected, the performance after the optimization is approximately 3 times 
better than before, since I write `testcse(sid)` 3 times in the SQL.

!performance_befort_optimization.png!

!performance_after_optimization.png!

 


> Improve common subexpression elimination by using local references
> --
>
> Key: FLINK-23555
> URL: https://issues.apache.org/jira/browse/FLINK-23555
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: weibowen
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: performance_after_optimization.png, 
> performance_before_optimization.png, sql.png, udf.png
>
>
> When we write SQL like 
> {code:java}
> select udf2(udf1(field), udf3(udf1(field) ...{code}
> udf1(field) will be invoked twice. If udf1 performs badly, it will have a 
> huge impact on the whole task: the more times it is invoked, the bigger the 
> impact.
> I hope that no matter how many times udf1(field) is written in the SQL, 
> Flink will take advantage of common subexpression elimination and invoke it 
> only once.
> I did some work on this, and the attachments show the result.
>  
> The sql.png attachment shows the SQL logic,
> !sql.png!
> and I read the source from Kafka and sink into a blackhole. The parallelism 
> is 1.
> The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
> UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
> aliases and do nothing at all.
> !udf.png!
> As expected, the performance after the optimization is approximately 3 
> times better than before, since I write `testcse(sid)` 3 times in the SQL.
> !performace_before_optimization.png!
> !performance_after_optimization.png!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #17046: [FLINK-24049][python] Properly handle field types that need conversion in TupleTypeInfo

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17046:
URL: https://github.com/apache/flink/pull/17046#issuecomment-908096451


   
   ## CI report:
   
   * 0ce3f1e021ffd0a8278412dda43198f8c1eb0b8c Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23072)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-23555) Improve common subexpression elimination by using local references

2021-08-30 Thread weibowen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibowen updated FLINK-23555:
-
Description: 
When we write SQL like 
{code:java}
select udf2(udf1(field), udf3(udf1(field) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, it will have a huge 
impact on the whole task: the more times it is invoked, the bigger the impact.

I hope that no matter how many times udf1(field) is written in the SQL, Flink 
will take advantage of common subexpression elimination and invoke it only 
once.

I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic,

!sql.png!

and I read the source from Kafka and sink into a blackhole. The parallelism 
is 1.

The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
aliases and do nothing at all.

!udf.png!

As expected, the performance after the optimization is approximately 3 times 
better than before, since I write `testcse(sid)` 3 times in the SQL.

before:

!performance_before_optimization.png!

after:

!performance_after_optimization.png!

  was:
When we write SQL like 
{code:java}
select udf2(udf1(field), udf3(udf1(field) ...{code}
udf1(field) will be invoked twice. If udf1 performs badly, it will have a huge 
impact on the whole task: the more times it is invoked, the bigger the impact.

I hope that no matter how many times udf1(field) is written in the SQL, Flink 
will take advantage of common subexpression elimination and invoke it only 
once.

I did some work on this, and the attachments show the result.

The sql.png attachment shows the SQL logic,

!sql.png!

and I read the source from Kafka and sink into a blackhole. The parallelism 
is 1.

The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
aliases and do nothing at all.

!udf.png!

As expected, the performance after the optimization is approximately 3 times 
better than before, since I write `testcse(sid)` 3 times in the SQL.

!performace_before_optimization.png!

!performance_after_optimization.png!

 


> Improve common subexpression elimination by using local references
> --
>
> Key: FLINK-23555
> URL: https://issues.apache.org/jira/browse/FLINK-23555
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: weibowen
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: performance_after_optimization.png, 
> performance_before_optimization.png, sql.png, udf.png
>
>
> When we write SQL like 
> {code:java}
> select udf2(udf1(field), udf3(udf1(field) ...{code}
> udf1(field) will be invoked twice. If udf1 performs badly, it will have a 
> huge impact on the whole task: the more times it is invoked, the bigger the 
> impact.
> I hope that no matter how many times udf1(field) is written in the SQL, 
> Flink will take advantage of common subexpression elimination and invoke it 
> only once.
> I did some work on this, and the attachments show the result.
>  
> The sql.png attachment shows the SQL logic,
> !sql.png!
> and I read the source from Kafka and sink into a blackhole. The parallelism 
> is 1.
> The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
> UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
> aliases and do nothing at all.
> !udf.png!
> As expected, the performance after the optimization is approximately 3 
> times better than before, since I write `testcse(sid)` 3 times in the SQL.
> before:
> !performance_before_optimization.png!
> after:
> !performance_after_optimization.png!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zhuzhurk closed pull request #16856: [FLINK-23833][coordination] Individually clean up the cache for ShuffleDescriptors

2021-08-30 Thread GitBox


zhuzhurk closed pull request #16856:
URL: https://github.com/apache/flink/pull/16856


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-23833) Cache of ShuffleDescriptors should be individually cleaned up

2021-08-30 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-23833:

Fix Version/s: 1.15.0

> Cache of ShuffleDescriptors should be individually cleaned up
> -
>
> Key: FLINK-23833
> URL: https://issues.apache.org/jira/browse/FLINK-23833
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0
>Reporter: Zhilong Hong
>Assignee: Zhilong Hong
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.14.0, 1.15.0
>
>
> {color:#172b4d}In FLINK-23005, we introduced a cache of the compressed 
> serialized ShuffleDescriptors to improve deployment performance. To make 
> sure the cache wouldn't linger and become a burden for GC, it is cleaned up 
> when the partition is released or reset for a new execution. In the 
> implementation, the cache of the entire IntermediateResult is cleaned up, 
> because a partition used to be released only when the entire 
> IntermediateResult was released.{color}
> {color:#172b4d}However, since FLINK-22017, a BLOCKING result partition may 
> be consumed individually. This also means that a result partition doesn't 
> need to wait for the other result partitions and can be released 
> individually. After this change, the following scenario is possible: when a 
> result partition is finished, the cache of its IntermediateResult on the 
> blob server is deleted while other result partitions of the same 
> IntermediateResult are still being deployed to TaskExecutors. When those 
> TaskExecutors then try to download the TDD from the blob server, they find 
> the blob deleted and get stuck.{color}
> {color:#172b4d}This bug only happens for jobs with a POINTWISE BLOCKING 
> edge, and only when {{blob.offload.minsize}} is set to an extremely small 
> value, since the ShuffleDescriptors of POINTWISE BLOCKING edges are usually 
> small. To solve this issue, we just need to clean up the cache of 
> ShuffleDescriptors individually.{color}
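
A minimal sketch of per-partition eviction, with hypothetical names (Flink's 
actual classes and cache keys differ):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache keyed by result partition id. Evicting one entry leaves
// the cached descriptors of sibling partitions of the same IntermediateResult
// intact, so their TaskExecutors can still download them.
class ShuffleDescriptorCache {
    private final Map<String, byte[]> compressedDescriptors = new ConcurrentHashMap<>();

    void put(String partitionId, byte[] compressedDescriptor) {
        compressedDescriptors.put(partitionId, compressedDescriptor);
    }

    // Called when a single result partition is released or reset for a new
    // execution, instead of clearing the whole IntermediateResult at once.
    void releasePartition(String partitionId) {
        compressedDescriptors.remove(partitionId);
    }
}
{code}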



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zentol commented on pull request #17022: [FLINK-24020][web] Aggregate HTTP requests before custom netty handlers are getting the data

2021-08-30 Thread GitBox


zentol commented on pull request #17022:
URL: https://github.com/apache/flink/pull/17022#issuecomment-908116741


   The HistoryServerTest is getting stuck on CI, which is likely caused by 
this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] beyond1920 commented on pull request #16997: [FLINK-23287][docs][table] Create user document for Window Join in SQL

2021-08-30 Thread GitBox


beyond1920 commented on pull request #16997:
URL: https://github.com/apache/flink/pull/16997#issuecomment-908118313


   @wuchong Jark, I've updated the input schema to use lower-case, 
non-reserved-keyword field names.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-23971) PulsarSourceITCase.testIdleReader failed on azure

2021-08-30 Thread Hang Ruan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406602#comment-17406602
 ] 

Hang Ruan commented on FLINK-23971:
---

Fine, I will provide a PR to fix it.

> PulsarSourceITCase.testIdleReader failed on azure
> -
>
> Key: FLINK-23971
> URL: https://issues.apache.org/jira/browse/FLINK-23971
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.14.0
>Reporter: Roman Khachatryan
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.14.0
>
> Attachments: error.log
>
>
> {code:java}
> [INFO] Running org.apache.flink.connector.pulsar.source.PulsarSourceITCase
> [ERROR] Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 353.527 s <<< FAILURE! - in 
> org.apache.flink.connector.pulsar.source.PulsarSourceITCase
> [ERROR] testIdleReader{TestEnvironment, ExternalContext}[2]  Time elapsed: 
> 4.549 s  <<< FAILURE!
> java.lang.AssertionError:
> Expected: Records consumed by Flink should be identical to test data and 
> preserve the order in multiple splits
>  but: Unexpected record 'tj7MpFRWX95GzBpSF3CCjxKSal6bRhR0aF'
>at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
>at 
> org.apache.flink.connectors.test.common.testsuites.SourceTestSuiteBase.testIdleReader(SourceTestSuiteBase.java:193)
> {code}
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22819&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=24448]
> This is the same error as in FLINK-23828 (kafka).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-23833) Cache of ShuffleDescriptors should be individually cleaned up

2021-08-30 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-23833.
---
Resolution: Fixed

Fixed via:
master:
8d3ec4a2fb1c9e07ee386dced15e1f64f359c6a2
ef6ef809b17b4547194c247975fb4566a19679f0
eba8f574c550123004ed4f557cef28ff557cd88e
1b4340a416bb7da39bb194da2faf3fc5e83fc5d9
1d38803c408fd4f4984913605030394d56bb160e

release-1.14:
f1eaf314ef232c1e604a49122857bbd99ad71a97
d4f8bb18ebe98ab4fd0d08ec24b00794d732cb38
f7bedb0603c33cb4e25c62c9899edb709b264371
daf997b7a13a62e251a9275a1b631697dea413bd
be857d1fdf8853b0315152521e27714ced6b1204

> Cache of ShuffleDescriptors should be individually cleaned up
> -
>
> Key: FLINK-23833
> URL: https://issues.apache.org/jira/browse/FLINK-23833
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0
>Reporter: Zhilong Hong
>Assignee: Zhilong Hong
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.14.0, 1.15.0
>
>
> {color:#172b4d}In FLINK-23005, we introduced a cache of the compressed 
> serialized ShuffleDescriptors to improve deployment performance. To make 
> sure the cache wouldn't linger and become a burden for GC, it is cleaned up 
> when the partition is released or reset for a new execution. In the 
> implementation, the cache of the entire IntermediateResult is cleaned up, 
> because a partition used to be released only when the entire 
> IntermediateResult was released.{color}
> {color:#172b4d}However, since FLINK-22017, a BLOCKING result partition may 
> be consumed individually. This also means that a result partition doesn't 
> need to wait for the other result partitions and can be released 
> individually. After this change, the following scenario is possible: when a 
> result partition is finished, the cache of its IntermediateResult on the 
> blob server is deleted while other result partitions of the same 
> IntermediateResult are still being deployed to TaskExecutors. When those 
> TaskExecutors then try to download the TDD from the blob server, they find 
> the blob deleted and get stuck.{color}
> {color:#172b4d}This bug only happens for jobs with a POINTWISE BLOCKING 
> edge, and only when {{blob.offload.minsize}} is set to an extremely small 
> value, since the ShuffleDescriptors of POINTWISE BLOCKING edges are usually 
> small. To solve this issue, we just need to clean up the cache of 
> ShuffleDescriptors individually.{color}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-24043) Reuse the code of 'check savepoint preconditions'.

2021-08-30 Thread Roc Marshal (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406605#comment-17406605
 ] 

Roc Marshal commented on FLINK-24043:
-

Thanks [~jark] for the reply.


Hi [~gaoyunhaii], [~tangyun], could you help review it when you have free 
time? Thank you.

> Reuse the code of 'check savepoint preconditions'. 
> ---
>
> Key: FLINK-24043
> URL: https://issues.apache.org/jira/browse/FLINK-24043
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Runtime / Task
>Reporter: Roc Marshal
>Priority: Minor
>  Labels: pull-request-available
>
> [here|https://github.com/apache/flink/blob/da944a8c90477d7b0210024028abd1011e250f14/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SchedulerBase.java#L822]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-23555) Improve common subexpression elimination by using local references

2021-08-30 Thread Enze Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406607#comment-17406607
 ] 

Enze Liu commented on FLINK-23555:
--

The description has been updated. Please check it again. Thanks.

Our approach is to keep a record of the local references and implement 
{{visitLocalRef}} in {{ExprCodeGenerator}}.

We can open a PR if needed.
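
As a rough illustration of the idea (hand-written stand-ins, not Flink's 
generated code or the actual ExprCodeGenerator API): the generator binds the 
common subexpression to a local reference once, and every later use resolves 
through that reference.

{code:java}
public class CseSketch {
    // Stand-in for an expensive UDF; sleeps 20 ms like `testcse` above.
    static String udf1(String s) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return s + "#1";
    }

    static String udf2(String s) { return s + "#2"; }
    static String udf3(String s) { return s + "#3"; }

    public static void main(String[] args) {
        String field = "sid";

        // Without CSE, generated code evaluates udf1(field) at every use site:
        String a = udf2(udf1(field));
        String b = udf3(udf1(field)); // udf1 pays its 20 ms cost a second time

        // With CSE, the common subexpression is bound to a local reference
        // once; a visitLocalRef-style lookup resolves later uses to it:
        String local$0 = udf1(field);
        String a2 = udf2(local$0);
        String b2 = udf3(local$0);

        System.out.println(a + " " + b + " " + a2 + " " + b2);
    }
}
{code}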

> Improve common subexpression elimination by using local references
> --
>
> Key: FLINK-23555
> URL: https://issues.apache.org/jira/browse/FLINK-23555
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: weibowen
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: performance_after_optimization.png, 
> performance_before_optimization.png, sql.png, udf.png
>
>
> When we write SQL like 
> {code:java}
> select udf2(udf1(field), udf3(udf1(field) ...{code}
> udf1(field) will be invoked twice. If udf1 performs badly, it will have a 
> huge impact on the whole task: the more times it is invoked, the bigger the 
> impact.
> I hope that no matter how many times udf1(field) is written in the SQL, 
> Flink will take advantage of common subexpression elimination and invoke it 
> only once.
> I did some work on this, and the attachments show the result.
>  
> The sql.png attachment shows the SQL logic,
> !sql.png!
> and I read the source from Kafka and sink into a blackhole. The parallelism 
> is 1.
> The UDF `testcse` does nothing except sleep for 20 milliseconds, while the 
> UDFs `testcse2`, `testcse3` and `testcse4` are the same UDF under different 
> aliases and do nothing at all.
> !udf.png!
> As expected, the performance after the optimization is approximately 3 
> times better than before, since I write `testcse(sid)` 3 times in the SQL.
> before:
> !performance_before_optimization.png!
> after:
> !performance_after_optimization.png!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] pnowojski commented on a change in pull request #17037: [FLINK-24035][network] Guarantee that the LocalBufferPool is initialized with one buffer

2021-08-30 Thread GitBox


pnowojski commented on a change in pull request #17037:
URL: https://github.com/apache/flink/pull/17037#discussion_r698277969



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/buffer/LocalBufferPool.java
##
@@ -216,9 +220,10 @@
 // Lock is only taken, because #checkAvailability asserts it. It's a 
small penalty for
 // thread safety.
 synchronized (this.availableMemorySegments) {
-if (checkAvailability()) {
-availabilityHelper.resetAvailable();
-}
+// guarantee that we have one buffer on initialization

Review comment:
   1. Can you expand the Javadoc on why this is important? Also, maybe mention 
the difference between input and output?
   2. Can you respect `numberOfRequiredMemorySegments` here? If this value is 
zero, we should probably skip requesting.
   3. Can you also keep the best-effort code `checkAvailability()` that was 
requesting all buffers at once when they were available?
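   For what it's worth, a self-contained sketch of how the three points could 
combine. Everything below is hypothetical stand-in code, not the actual 
`LocalBufferPool` implementation:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch only; names and structure are assumptions.
class BufferPoolInitSketch {
    private final Queue<Object> availableMemorySegments = new ArrayDeque<>();
    private final int numberOfRequiredMemorySegments;

    BufferPoolInitSketch(int numberOfRequiredMemorySegments) {
        this.numberOfRequiredMemorySegments = numberOfRequiredMemorySegments;
    }

    void initialize() {
        synchronized (availableMemorySegments) {
            // (2) Skip requesting entirely if no segments are required.
            if (numberOfRequiredMemorySegments > 0) {
                // (1) Guarantee one buffer on initialization so both input
                // and output sides can make progress immediately.
                availableMemorySegments.add(requestSegmentFromGlobalPool());
            }
            // (3) Best effort: also grab any further segments that happen
            // to be free right now, up to the required number.
            while (availableMemorySegments.size() < numberOfRequiredMemorySegments
                    && globalPoolHasFreeSegment()) {
                availableMemorySegments.add(requestSegmentFromGlobalPool());
            }
        }
    }

    // Stand-ins for interaction with the global network buffer pool.
    private Object requestSegmentFromGlobalPool() {
        return new Object();
    }

    private boolean globalPoolHasFreeSegment() {
        return false;
    }
}
```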




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] fapaul commented on a change in pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…

2021-08-30 Thread GitBox


fapaul commented on a change in pull request #16966:
URL: https://github.com/apache/flink/pull/16966#discussion_r698282237



##
File path: 
flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/CompoundReducingUpsertWriterStateSerializer.java
##
@@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.kafka.table;
+
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.core.io.SimpleVersionedSerializer;
+import org.apache.flink.core.memory.DataInputViewStreamWrapper;
+import org.apache.flink.core.memory.DataOutputViewStreamWrapper;
+import org.apache.flink.table.data.RowData;
+
+import javax.annotation.Nullable;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+class CompoundReducingUpsertWriterStateSerializer
+implements 
SimpleVersionedSerializer> {
+
+private final TypeSerializer rowDataTypeSerializer;
+@Nullable private final SimpleVersionedSerializer 
wrappedStateSerializer;
+
+CompoundReducingUpsertWriterStateSerializer(
+TypeSerializer rowDataTypeSerializer,
+@Nullable SimpleVersionedSerializer 
wrappedStateSerializer) {
+this.wrappedStateSerializer = wrappedStateSerializer;
+this.rowDataTypeSerializer = checkNotNull(rowDataTypeSerializer);
+}
+
+CompoundReducingUpsertWriterStateSerializer(TypeSerializer 
rowDataTypeSerializer) {
+this(rowDataTypeSerializer, null);
+}
+
+@Override
+public int getVersion() {
+return 1;
+}
+
+@Override
+public byte[] serialize(CompoundReducingUpsertWriterState 
state)
+throws IOException {
+try (final ByteArrayOutputStream baos = new ByteArrayOutputStream();
+final DataOutputViewStreamWrapper out = new 
DataOutputViewStreamWrapper(baos)) {
+final List wrappedStates = state.getWrappedStates();
+if (wrappedStateSerializer != null) {
+out.writeInt(wrappedStateSerializer.getVersion());
+out.writeInt(wrappedStates.size());
+for (final WriterState wrappedState : wrappedStates) {
+final byte[] serializedWrappedState =
+wrappedStateSerializer.serialize(wrappedState);
+out.writeInt(serializedWrappedState.length);
+out.write(wrappedStateSerializer.serialize(wrappedState));
+}

Review comment:
   Thanks for the hint, I did not know that such a class exists, but I think 
it does not help much here. `SimpleVersionedSerialization` always writes the 
version in front of the serialized data, which in this case leads to a lot of 
duplicated version writes.
   In general, `SimpleVersionedSerialization` seems not to be great for 
writing `List<>` data.

##
File path: 
flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/CompoundReducingUpsertWriterStateSerializer.java
##
@@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+

[jira] [Commented] (FLINK-23971) PulsarSourceITCase.testIdleReader failed on azure

2021-08-30 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406615#comment-17406615
 ] 

Xintong Song commented on FLINK-23971:
--

Thanks [~ruanhang1993], you're assigned. Please move ahead.

> PulsarSourceITCase.testIdleReader failed on azure
> -
>
> Key: FLINK-23971
> URL: https://issues.apache.org/jira/browse/FLINK-23971
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.14.0
>Reporter: Roman Khachatryan
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.14.0
>
> Attachments: error.log
>
>
> {code:java}
> [INFO] Running org.apache.flink.connector.pulsar.source.PulsarSourceITCase
> [ERROR] Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 353.527 s <<< FAILURE! - in 
> org.apache.flink.connector.pulsar.source.PulsarSourceITCase
> [ERROR] testIdleReader{TestEnvironment, ExternalContext}[2]  Time elapsed: 
> 4.549 s  <<< FAILURE!
> java.lang.AssertionError:
> Expected: Records consumed by Flink should be identical to test data and 
> preserve the order in multiple splits
>  but: Unexpected record 'tj7MpFRWX95GzBpSF3CCjxKSal6bRhR0aF'
>at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
>at 
> org.apache.flink.connectors.test.common.testsuites.SourceTestSuiteBase.testIdleReader(SourceTestSuiteBase.java:193)
> {code}
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22819&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=24448]
> This is the same error as in FLINK-23828 (kafka).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-23971) PulsarSourceITCase.testIdleReader failed on azure

2021-08-30 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song reassigned FLINK-23971:


Assignee: Hang Ruan

> PulsarSourceITCase.testIdleReader failed on azure
> -
>
> Key: FLINK-23971
> URL: https://issues.apache.org/jira/browse/FLINK-23971
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.14.0
>Reporter: Roman Khachatryan
>Assignee: Hang Ruan
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.14.0
>
> Attachments: error.log
>
>
> {code:java}
> [INFO] Running org.apache.flink.connector.pulsar.source.PulsarSourceITCase
> [ERROR] Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 353.527 s <<< FAILURE! - in 
> org.apache.flink.connector.pulsar.source.PulsarSourceITCase
> [ERROR] testIdleReader{TestEnvironment, ExternalContext}[2]  Time elapsed: 
> 4.549 s  <<< FAILURE!
> java.lang.AssertionError:
> Expected: Records consumed by Flink should be identical to test data and 
> preserve the order in multiple splits
>  but: Unexpected record 'tj7MpFRWX95GzBpSF3CCjxKSal6bRhR0aF'
>at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
>at 
> org.apache.flink.connectors.test.common.testsuites.SourceTestSuiteBase.testIdleReader(SourceTestSuiteBase.java:193)
> {code}
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22819&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=24448]
> This is the same error as in FLINK-23828 (kafka).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-23190) Make task-slot allocation much more evenly

2021-08-30 Thread loyi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406616#comment-17406616
 ] 

loyi commented on FLINK-23190:
--

I have submitted a [PR|https://github.com/apache/flink/pull/16929] for this 
feature; do you have time to take a look? Thanks, [~trohrmann] 

> Make task-slot allocation much more evenly
> --
>
> Key: FLINK-23190
> URL: https://issues.apache.org/jira/browse/FLINK-23190
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.12.3
>Reporter: loyi
>Assignee: loyi
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-07-16-10-34-30-700.png
>
>
> FLINK-12122 only guarantees spreading out tasks across the set of TMs which 
> are registered at the time of scheduling, but our jobs all run in active 
> YARN mode, and jobs with smaller source parallelism often cause 
> load-balancing issues. 
>  
> For this job:
> {code:java}
> //  -ys 4 means 10 taskmanagers
> env.addSource(...).name("A").setParallelism(10).
>  map(...).name("B").setParallelism(30)
>  .map(...).name("C").setParallelism(40)
>  .addSink(...).name("D").setParallelism(20);
> {code}
>  
>  Flink-1.12.3 task allocation: 
> ||operator||tm1||tm2||tm3||tm4||tm5||tm6||tm7||tm8||tm9||tm10||
> |A|1|2|2|1|1|3|0|0|0|0|
> |B|3|3|3|3|3|3|3|3|2|4|
> |C|4|4|4|4|4|4|4|4|4|4|
> |D|2|2|2|2|2|1|1|2|2|4|
>  
> Suggestions:
> When a TaskManager starts registering slots with the SlotManager, the 
> current processing logic chooses the first pendingSlot which meets its 
> resource requirements. This "random" strategy usually causes uneven task 
> allocation when a source operator's parallelism is significantly below the 
> processing operators'. A simple, feasible idea is to partition the current 
> "pendingSlots" by their "JobVertexIds" (e.g. let the AllocationID carry the 
> detail), then allocate the slots proportionally to each JobVertexGroup.
>  
> For the above case, the 40 pendingSlots could be divided into 4 groups:
> [ABCD]: 10        // A, B, C, D represent jobVertexIds
> [BCD]: 10
> [CD]: 10
> [D]: 10
>  
> Every TaskManager will provide 4 slots at a time, and each group will get 1 
> slot according to its proportion (1/4). The final allocation result is below:
> [ABCD]: deployed on 10 different TaskManagers
> [BCD]: deployed on 10 different TaskManagers
> [CD]: deployed on 10 different TaskManagers
> [D]: deployed on 10 different TaskManagers
>  
> I have implemented a [concept 
> code|https://github.com/saddays/flink/commit/dc82e60a7c7599fbcb58c14f8e3445bc8d07ace1]
>  based on Flink 1.12.3; the patched version has fully even task allocation 
> and works well on my workload. Are there other points that have not been 
> considered, or does it conflict with future plans? Sorry for my poor 
> English.
>  
>  
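A minimal, self-contained Java sketch of the grouping idea from the quoted 
description; the types and method names are hypothetical, and the actual PR 
may differ:

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: partition pending slots by the job vertices they host,
// then deal slots out of each group in turn so every TaskManager receives a
// proportional share of each group.
class SlotSpreadSketch {

    static final class PendingSlot {
        final Set<String> jobVertexIds;

        PendingSlot(Set<String> jobVertexIds) {
            this.jobVertexIds = jobVertexIds;
        }
    }

    static Map<Set<String>, List<PendingSlot>> groupByVertices(List<PendingSlot> pending) {
        Map<Set<String>, List<PendingSlot>> groups = new HashMap<>();
        for (PendingSlot slot : pending) {
            groups.computeIfAbsent(slot.jobVertexIds, k -> new ArrayList<>()).add(slot);
        }
        return groups;
    }

    // When a TaskManager offers n slots, take one slot from each non-empty
    // group per round instead of draining a single group first.
    static List<PendingSlot> pickEvenly(Map<Set<String>, List<PendingSlot>> groups, int n) {
        List<PendingSlot> picked = new ArrayList<>();
        boolean progress = true;
        while (picked.size() < n && progress) {
            progress = false;
            for (List<PendingSlot> group : groups.values()) {
                if (picked.size() == n) {
                    break;
                }
                if (!group.isEmpty()) {
                    picked.add(group.remove(group.size() - 1));
                    progress = true;
                }
            }
        }
        return picked;
    }
}
{code}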



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] fapaul commented on a change in pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…

2021-08-30 Thread GitBox


fapaul commented on a change in pull request #16966:
URL: https://github.com/apache/flink/pull/16966#discussion_r698286541



##
File path: 
flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/CompoundReducingUpsertWriterState.java
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.kafka.table;
+
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.table.data.RowData;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+class CompoundReducingUpsertWriterState {

Review comment:
   I am in favor of your first comment because afaik there will be some 
efforts to integrate the `ReducingUpsert` functionality into the Table API 
operators and make it available to all upsert sinks.
   Therefore generalizing it now may not have much impact, because it will be 
dropped soonish.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-18807) FlinkKafkaProducerITCase.testScaleUpAfterScalingDown failed with "Timeout expired after 60000milliseconds while awaiting EndTxn(COMMIT)"

2021-08-30 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406617#comment-17406617
 ] 

Xintong Song commented on FLINK-18807:
--

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23062&view=logs&j=c5f0071e-1851-543e-9a45-9ac140befc32&t=15a22db7-8faa-5b34-3920-d33c9f0ca23c&l=7399

> FlinkKafkaProducerITCase.testScaleUpAfterScalingDown failed with "Timeout 
> expired after 6milliseconds while awaiting EndTxn(COMMIT)"
> 
>
> Key: FLINK-18807
> URL: https://issues.apache.org/jira/browse/FLINK-18807
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Tests
>Affects Versions: 1.11.0, 1.12.0
>Reporter: Dian Fu
>Priority: Minor
>  Labels: auto-deprioritized-major, test-stability
> Fix For: 1.14.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5142&view=logs&j=c5f0071e-1851-543e-9a45-9ac140befc32&t=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5
> {code}
> 2020-08-03T22:06:45.9078498Z [ERROR] 
> testScaleUpAfterScalingDown(org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase)
>   Time elapsed: 76.498 s  <<< ERROR!
> 2020-08-03T22:06:45.9079233Z org.apache.kafka.common.errors.TimeoutException: 
> Timeout expired after 6milliseconds while awaiting EndTxn(COMMIT)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink-statefun] igalshilman closed pull request #245: [FLINK-23520][datastream] improve statefun <-> datastream interop

2021-08-30 Thread GitBox


igalshilman closed pull request #245:
URL: https://github.com/apache/flink-statefun/pull/245


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-statefun] igalshilman commented on pull request #245: [FLINK-23520][datastream] improve statefun <-> datastream interop

2021-08-30 Thread GitBox


igalshilman commented on pull request #245:
URL: https://github.com/apache/flink-statefun/pull/245#issuecomment-908144474


   @sjwiesman I'm closing this for now, because we will try to solve it without 
copying the sources, by properly shading protobuf.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16723: [FLINK-22885][table] Supports 'SHOW COLUMNS' syntax.

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #16723:
URL: https://github.com/apache/flink/pull/16723#issuecomment-893364145


   
   ## CI report:
   
   * 542b00a0777a8b6483a0e61eaf47787ee43f98b8 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23018)
 
   * 1de6e0b83ee64c22c861a9d8e639483aaaba9e7f UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-23971) PulsarSourceITCase.testIdleReader failed on azure

2021-08-30 Thread Qingsheng Ren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406618#comment-17406618
 ] 

Qingsheng Ren commented on FLINK-23971:
---

Thanks [~ruanhang1993] for the investigation! I double-checked the code, and 
[~ruanhang1993]'s assumption is reasonable.

Also, I think this ticket shares the same cause as FLINK-23828. I'll mark it 
as related to this one. 

> PulsarSourceITCase.testIdleReader failed on azure
> -
>
> Key: FLINK-23971
> URL: https://issues.apache.org/jira/browse/FLINK-23971
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.14.0
>Reporter: Roman Khachatryan
>Assignee: Hang Ruan
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.14.0
>
> Attachments: error.log
>
>
> {code:java}
> [INFO] Running org.apache.flink.connector.pulsar.source.PulsarSourceITCase
> [ERROR] Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 353.527 s <<< FAILURE! - in 
> org.apache.flink.connector.pulsar.source.PulsarSourceITCase
> [ERROR] testIdleReader{TestEnvironment, ExternalContext}[2]  Time elapsed: 
> 4.549 s  <<< FAILURE!
> java.lang.AssertionError:
> Expected: Records consumed by Flink should be identical to test data and 
> preserve the order in multiple splits
>  but: Unexpected record 'tj7MpFRWX95GzBpSF3CCjxKSal6bRhR0aF'
>at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
>at 
> org.apache.flink.connectors.test.common.testsuites.SourceTestSuiteBase.testIdleReader(SourceTestSuiteBase.java:193)
> {code}
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22819&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=24448]
> This is the same error as in FLINK-23828 (kafka).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #17011: [FLINK-23663][table-planner] Reduce state size of ChangelogNormalize

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17011:
URL: https://github.com/apache/flink/pull/17011#issuecomment-907015222


   
   ## CI report:
   
   * e13ee18ff9f927cb23c670236f83956873cfcc1e Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22958)
 
   * 211af17558902443958144ba9f202b75c719982e Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23075)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16997: [FLINK-23287][docs][table] Create user document for Window Join in SQL

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #16997:
URL: https://github.com/apache/flink/pull/16997#issuecomment-906324142


   
   ## CI report:
   
   * 717cc232df32c0a19b2026b7246af7b307c14dc7 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23056)
 
   * c096ede2ac4bd991486ec9fcff878f96b02f2f41 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-23544) Window TVF Supports session window

2021-08-30 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406619#comment-17406619
 ] 

Timo Walther commented on FLINK-23544:
--

[~godfreyhe][~qingru zhang] Sorry for jumping in so late, but are we sure that 
this change has correct SQL semantics? As far as I can see, it violates the 
syntax mentioned in the FLIP, which states that SESSION window TVFs need a 
{{PARTITION BY}} clause. This was also the reason why we didn't support SESSION 
in the first version, right?

{code}
SELECT *
  FROM TABLE(
SESSION(TABLE Bid PARTITION BY bidder, DESCRIPTOR(bidtime), 
DESCRIPTOR(bidder), INTERVAL '5' MINUTES));
{code}

> Window TVF Supports session window
> --
>
> Key: FLINK-23544
> URL: https://issues.apache.org/jira/browse/FLINK-23544
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: JING ZHANG
>Assignee: JING ZHANG
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Window TVF would support SESSION window in the following case:
>  # SESSION Window TVF followed by Window Aggregate; in this case the SESSION 
> window TVF would be pulled up into the WindowAggregate, so the window 
> assigner would run in the WindowAggregate.
>  
> *Note, the SESSION window TVF only works in limited cases currently; the 
> following use cases are not supported yet:*
>  # *SESSION WINDOW TVF followed by Window JOIN*
>  # *SESSION WINDOW TVF followed by Window RANK*
> *BESIDES, SESSION window Aggregate does not support the following performance 
> improvements yet:*
>      1. Split Distinct Aggregation
>      2. Local-global Aggregation
>      3. Mini-batch Aggregate



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23791) Enable RocksDB log again

2021-08-30 Thread Yu Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated FLINK-23791:
--
Fix Version/s: 1.15.0

> Enable RocksDB log again
> 
>
> Key: FLINK-23791
> URL: https://issues.apache.org/jira/browse/FLINK-23791
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / State Backends
>Reporter: Yun Tang
>Assignee: Zakelly Lan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0, 1.14.1
>
>
> FLINK-15068 disabled the RocksDB's local LOG due to previous RocksDB cannot 
> limit the local log files.
> After we upgraded to newer RocksDB version, we can then enable RocksDB log 
> again.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] AHeise commented on a change in pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…

2021-08-30 Thread GitBox


AHeise commented on a change in pull request #16966:
URL: https://github.com/apache/flink/pull/16966#discussion_r698299396



##
File path: 
flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/CompoundReducingUpsertWriterState.java
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.kafka.table;
+
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.table.data.RowData;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+class CompoundReducingUpsertWriterState {

Review comment:
   I'm okay with that. I just think that it could be a general building 
block for advanced sources/sinks. But we can generalize it then.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-23814) Test FLIP-143 KafkaSink

2021-08-30 Thread Fabian Paul (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406624#comment-17406624
 ] 

Fabian Paul commented on FLINK-23814:
-

[~ruanhang1993] we consolidated all the fixes we found in this PR 
[https://github.com/apache/flink/pull/17019], and once it is merged we will 
close all the related tickets. Sorry for the confusion.

> Test FLIP-143 KafkaSink
> ---
>
> Key: FLINK-23814
> URL: https://issues.apache.org/jira/browse/FLINK-23814
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Kafka
>Reporter: Fabian Paul
>Assignee: Hang Ruan
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.14.0
>
>
> The following scenarios are worthwhile to test
>  * Start simple job with None/At-least once delivery guarantee and write 
> records to kafka topic
>  * Start simple job with exactly-once delivery guarantee and write records to 
> kafka topic. The records should only be visible with a `read-committed` 
> consumer
>  * Stop a job with exactly-once delivery guarantee and restart it with 
> different parallelism (scale-down, scale-up)
>  * Restart/kill a taskmanager while writing in exactly-once mode



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] AHeise commented on a change in pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…

2021-08-30 Thread GitBox


AHeise commented on a change in pull request #16966:
URL: https://github.com/apache/flink/pull/16966#discussion_r698299782



##
File path: 
flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/CompoundReducingUpsertWriterStateSerializer.java
##
@@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.kafka.table;
+
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.core.io.SimpleVersionedSerializer;
+import org.apache.flink.core.memory.DataInputViewStreamWrapper;
+import org.apache.flink.core.memory.DataOutputViewStreamWrapper;
+import org.apache.flink.table.data.RowData;
+
+import javax.annotation.Nullable;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+class CompoundReducingUpsertWriterStateSerializer
+implements 
SimpleVersionedSerializer> {
+
+private final TypeSerializer rowDataTypeSerializer;
+@Nullable private final SimpleVersionedSerializer 
wrappedStateSerializer;
+
+CompoundReducingUpsertWriterStateSerializer(
+TypeSerializer rowDataTypeSerializer,
+@Nullable SimpleVersionedSerializer 
wrappedStateSerializer) {
+this.wrappedStateSerializer = wrappedStateSerializer;
+this.rowDataTypeSerializer = checkNotNull(rowDataTypeSerializer);
+}
+
+CompoundReducingUpsertWriterStateSerializer(TypeSerializer 
rowDataTypeSerializer) {
+this(rowDataTypeSerializer, null);
+}
+
+@Override
+public int getVersion() {
+return 1;
+}
+
+@Override
+public byte[] serialize(CompoundReducingUpsertWriterState 
state)
+throws IOException {
+try (final ByteArrayOutputStream baos = new ByteArrayOutputStream();
+final DataOutputViewStreamWrapper out = new 
DataOutputViewStreamWrapper(baos)) {
+final List wrappedStates = state.getWrappedStates();
+if (wrappedStateSerializer != null) {
+out.writeInt(wrappedStateSerializer.getVersion());
+out.writeInt(wrappedStates.size());
+for (final WriterState wrappedState : wrappedStates) {
+final byte[] serializedWrappedState =
+wrappedStateSerializer.serialize(wrappedState);
+out.writeInt(serializedWrappedState.length);
+out.write(wrappedStateSerializer.serialize(wrappedState));
+}

Review comment:
   That's why I was hoping we could add it to 
`SimpleVersionedSerialization`.
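   A sketch of what such an addition might look like: the version is written 
once for the whole list rather than once per element. This is a hypothetical 
helper, not an existing method of `SimpleVersionedSerialization`:

```java
import org.apache.flink.core.io.SimpleVersionedSerializer;
import org.apache.flink.core.memory.DataOutputView;

import java.io.IOException;
import java.util.List;

// Hypothetical helper for discussion; not part of Flink's API.
final class VersionedListSerialization {

    private VersionedListSerialization() {}

    static <T> void writeVersionAndSerializeList(
            SimpleVersionedSerializer<T> serializer, List<T> items, DataOutputView out)
            throws IOException {
        // Write the serializer version once for the whole list.
        out.writeInt(serializer.getVersion());
        out.writeInt(items.size());
        for (T item : items) {
            final byte[] bytes = serializer.serialize(item);
            out.writeInt(bytes.length);
            out.write(bytes);
        }
    }
}
```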




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-statefun-docker] igalshilman merged pull request #15: Flink StateFun Release 3.1.0

2021-08-30 Thread GitBox


igalshilman merged pull request #15:
URL: https://github.com/apache/flink-statefun-docker/pull/15


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] twalthr opened a new pull request #17048: [FLINK-24033][table-planner] Propagate unique keys for fromChangelogStream

2021-08-30 Thread GitBox


twalthr opened a new pull request #17048:
URL: https://github.com/apache/flink/pull/17048


   ## What is the purpose of the change
   
   This fixes a serious issue, which also justifies why 
`from/toChangelogStream` were marked as `@Experimental` in 1.13: unique keys 
were not set correctly. Fixing it even required modifying examples and tests 
whose behavior had accidentally been classified as a planner shortcoming 
instead of an actual bug.
   
   
   ## Brief change log
   
   Set statistics for `fromChangelogStream`.
   
   ## Verifying this change
   
   This change added tests and can be verified as follows: 
`DataStreamJavaITCase`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] fapaul commented on a change in pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…

2021-08-30 Thread GitBox


fapaul commented on a change in pull request #16966:
URL: https://github.com/apache/flink/pull/16966#discussion_r698300654



##
File path: 
flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/CompoundReducingUpsertWriterStateSerializer.java
##
@@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.kafka.table;
+
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.core.io.SimpleVersionedSerializer;
+import org.apache.flink.core.memory.DataInputViewStreamWrapper;
+import org.apache.flink.core.memory.DataOutputViewStreamWrapper;
+import org.apache.flink.table.data.RowData;
+
+import javax.annotation.Nullable;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+class CompoundReducingUpsertWriterStateSerializer
+implements 
SimpleVersionedSerializer> {
+
+private final TypeSerializer rowDataTypeSerializer;
+@Nullable private final SimpleVersionedSerializer 
wrappedStateSerializer;
+
+CompoundReducingUpsertWriterStateSerializer(
+TypeSerializer rowDataTypeSerializer,
+@Nullable SimpleVersionedSerializer 
wrappedStateSerializer) {
+this.wrappedStateSerializer = wrappedStateSerializer;
+this.rowDataTypeSerializer = checkNotNull(rowDataTypeSerializer);
+}
+
+CompoundReducingUpsertWriterStateSerializer(TypeSerializer 
rowDataTypeSerializer) {
+this(rowDataTypeSerializer, null);
+}
+
+@Override
+public int getVersion() {
+return 1;
+}
+
+@Override
+public byte[] serialize(CompoundReducingUpsertWriterState 
state)
+throws IOException {
+try (final ByteArrayOutputStream baos = new ByteArrayOutputStream();
+final DataOutputViewStreamWrapper out = new 
DataOutputViewStreamWrapper(baos)) {
+final List wrappedStates = state.getWrappedStates();
+if (wrappedStateSerializer != null) {
+out.writeInt(wrappedStateSerializer.getVersion());
+out.writeInt(wrappedStates.size());
+for (final WriterState wrappedState : wrappedStates) {
+final byte[] serializedWrappedState =
+wrappedStateSerializer.serialize(wrappedState);
+out.writeInt(serializedWrappedState.length);
+out.write(wrappedStateSerializer.serialize(wrappedState));
+}

Review comment:
   Now I understand your point; I'll have a look.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] gaborgsomogyi commented on pull request #17022: [FLINK-24020][web] Aggregate HTTP requests before custom netty handers are getting the data

2021-08-30 Thread GitBox


gaborgsomogyi commented on pull request #17022:
URL: https://github.com/apache/flink/pull/17022#issuecomment-908155425


   @zentol having a look...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-24033) Propagate unique keys for fromChangelogStream

2021-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-24033:
---
Labels: pull-request-available  (was: )

> Propagate unique keys for fromChangelogStream
> -
>
> Key: FLINK-24033
> URL: https://issues.apache.org/jira/browse/FLINK-24033
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.13.2
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.14.0, 1.13.3
>
>
> Similar to FLINK-23915, we are not propagating unique keys for 
> {{fromChangelogStream}} because it is not written into statistics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #17048: [FLINK-24033][table-planner] Propagate unique keys for fromChangelogStream

2021-08-30 Thread GitBox


flinkbot commented on pull request #17048:
URL: https://github.com/apache/flink/pull/17048#issuecomment-908156791


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 44a4c5b89d03c403df73fae28d464a557444323a (Mon Aug 30 
08:38:52 UTC 2021)
   
✅no warnings
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required. Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wsry closed pull request #17030: [FLINK-24035][network] Notify the buffer listeners when the local buffer pool receives available notification from the global pool

2021-08-30 Thread GitBox


wsry closed pull request #17030:
URL: https://github.com/apache/flink/pull/17030


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wsry commented on pull request #17030: [FLINK-24035][network] Notify the buffer listeners when the local buffer pool receives available notification from the global pool

2021-08-30 Thread GitBox


wsry commented on pull request #17030:
URL: https://github.com/apache/flink/pull/17030#issuecomment-908159549


   Abandoned.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-24050) Support primary keys on metadata columns

2021-08-30 Thread Jira
Ingo Bürk created FLINK-24050:
-

 Summary: Support primary keys on metadata columns
 Key: FLINK-24050
 URL: https://issues.apache.org/jira/browse/FLINK-24050
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Reporter: Ingo Bürk


Currently, primary keys are required to consist solely of physical columns. 
However, there might be scenarios where the actual payload/records do not 
contain a suitable primary key, but a unique identifier is available through 
metadata. In this case it would make sense to define the primary key on such a 
metadata column:
{code:java}
CREATE TABLE T (
  uid METADATA,
  content STRING,
  PRIMARY KEY (uid) NOT ENFORCED
) WITH (…)
{code}
A simple example for this would be IMAP: there is nothing unique about any 
single email as a record, but each email in a specific folder on an IMAP server 
has a unique UID (I'm excluding some irrelevant technical details here).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #16723: [FLINK-22885][table] Supports 'SHOW COLUMNS' syntax.

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #16723:
URL: https://github.com/apache/flink/pull/16723#issuecomment-893364145


   
   ## CI report:
   
   * 542b00a0777a8b6483a0e61eaf47787ee43f98b8 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23018)
 
   * 1de6e0b83ee64c22c861a9d8e639483aaaba9e7f Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23078)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #16966:
URL: https://github.com/apache/flink/pull/16966#issuecomment-904741081


   
   ## CI report:
   
   * ed07cb76926d2747ac25953e5f5e74c48e64ef25 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22827)
 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22799)
 
   * 758c42cbc3442c3b95389dc65ef4bd67e2ff3344 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16997: [FLINK-23287][docs][table] Create user document for Window Join in SQL

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #16997:
URL: https://github.com/apache/flink/pull/16997#issuecomment-906324142


   
   ## CI report:
   
   * 717cc232df32c0a19b2026b7246af7b307c14dc7 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23056)
 
   * c096ede2ac4bd991486ec9fcff878f96b02f2f41 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23079)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17045: [FLINK-23924][python][examples] Add PyFlink examples

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17045:
URL: https://github.com/apache/flink/pull/17045#issuecomment-908007478


   
   ## CI report:
   
   * 77e0e48a0f9c3f2f62e6e19ca5f4253060c13b07 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23061)
 
   * edd075f9e5a4dd29de26db54bafedace3c92b9ef UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tillrohrmann commented on pull request #16915: [FLINK-9925][tests] Harden ClientTest by making handler shareable

2021-08-30 Thread GitBox


tillrohrmann commented on pull request #16915:
URL: https://github.com/apache/flink/pull/16915#issuecomment-908164719


   Thanks for the review @zentol. The failing test case is 
https://issues.apache.org/jira/browse/FLINK-23097 and should be unrelated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-24051) Make consumer.group-id optional for KafkaSource

2021-08-30 Thread Fabian Paul (Jira)
Fabian Paul created FLINK-24051:
---

 Summary: Make consumer.group-id optional for KafkaSource
 Key: FLINK-24051
 URL: https://issues.apache.org/jira/browse/FLINK-24051
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Kafka
Affects Versions: 1.14.0, 1.15.0
Reporter: Fabian Paul


For most users it is not necessary to generate a group-id, and the source 
itself can provide a meaningful group-id during startup.
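A sketch of the idea from the user's side; the generated naming scheme below 
is an assumption, not the actual implementation:

{code:java}
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;

import java.util.UUID;

class GroupIdSketch {
    static KafkaSource<String> build() {
        // Hypothetical default: if the user does not set a group id, the
        // source could fall back to a generated one like this.
        String generatedGroupId = "KafkaSource-" + UUID.randomUUID();
        return KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("events")
                .setGroupId(generatedGroupId) // today this call is effectively required
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
    }
}
{code}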



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-24051) Make consumer.group-id optional for KafkaSource

2021-08-30 Thread Fabian Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Paul reassigned FLINK-24051:
---

Assignee: Fabian Paul

> Make consumer.group-id optional for KafkaSource
> ---
>
> Key: FLINK-24051
> URL: https://issues.apache.org/jira/browse/FLINK-24051
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Affects Versions: 1.14.0, 1.15.0
>Reporter: Fabian Paul
>Assignee: Fabian Paul
>Priority: Major
>
> For most users it is not necessary to generate a group-id, and the source 
> itself can provide a meaningful group-id during startup.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink-training] NicoK merged pull request #35: [FLINK-24022] Change how to enable Scala and use in CI

2021-08-30 Thread GitBox


NicoK merged pull request #35:
URL: https://github.com/apache/flink-training/pull/35


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-24050) Support primary keys on metadata columns

2021-08-30 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406632#comment-17406632
 ] 

Jark Wu commented on FLINK-24050:
-

Sounds good to me. From my perspective, metadata columns are also, in some 
sense, physical columns.

> Support primary keys on metadata columns
> 
>
> Key: FLINK-24050
> URL: https://issues.apache.org/jira/browse/FLINK-24050
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Ingo Bürk
>Priority: Major
>
> Currently, primary keys are required to consist solely of physical columns. 
> However, there might be scenarios where the actual payload/records do not 
> contain a suitable primary key, but a unique identifier is available through 
> metadata. In this case it would make sense to define the primary key on such 
> a metadata column:
> {code:java}
> CREATE TABLE T (
>   uid METADATA,
>   content STRING,
>   PRIMARY KEY (uid) NOT ENFORCED
> ) WITH (…)
> {code}
> A simple example for this would be IMAP: there is nothing unique about any 
> single email as a record, but each email in a specific folder on an IMAP 
> server has a unique UID (I'm excluding some irrelevant technical details 
> here).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-24022) Scala checks not running in flink-training CI

2021-08-30 Thread Nico Kruber (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber closed FLINK-24022.
---
Resolution: Fixed

Fixed on master via 65cde7910e5bff6283eef7ea536fde3ea5d4c9d4

> Scala checks not running in flink-training CI
> -
>
> Key: FLINK-24022
> URL: https://issues.apache.org/jira/browse/FLINK-24022
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation / Training / Exercises
>Affects Versions: 1.14.0, 1.13.3
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0, 1.13.3
>
>
> FLINK-23339 disabled Scala by default but therefore also disabled CI for 
> newly checked-in changes on the Scala code.
> We should run CI with Scala enabled



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-23764) add RuntimeContext to SourceReaderContext

2021-08-30 Thread Fabian Paul (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406633#comment-17406633
 ] 

Fabian Paul commented on FLINK-23764:
-

[~ruanhang1993] you made a good point. I will look into making the group-id 
optional for the KafkaSource: 
https://issues.apache.org/jira/browse/FLINK-24051. Does that resolve your 
problem, or do you need more information from the RuntimeContext?

> add RuntimeContext to SourceReaderContext
> -
>
> Key: FLINK-23764
> URL: https://issues.apache.org/jira/browse/FLINK-23764
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common
>Reporter: Hang Ruan
>Priority: Major
>  Labels: connector
>
> RuntimeContext is the important information for sourceReader. Not only the 
> subtask index, we sometimes need to get other information in RuntimeContext 
> like the operator id.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-24050) Support primary keys on metadata columns

2021-08-30 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-24050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ingo Bürk updated FLINK-24050:
--
Description: 
Currently, primary keys are required to consist solely of physical columns. 
However, there might be scenarios where the actual payload/records do not 
contain a suitable primary key, but a unique identifier is available through 
metadata. In this case it would make sense to define the primary key on such a 
metadata column:
{code:java}
CREATE TABLE T (
  uid STRING METADATA,
  content STRING,
  PRIMARY KEY (uid) NOT ENFORCED
) WITH (…)
{code}
A simple example for this would be IMAP: there is nothing unique about any 
single email as a record, but each email in a specific folder on an IMAP server 
has a unique UID (I'm excluding some irrelevant technical details here).

  was:
Currently, primary keys are required to consist solely of physical columns. 
However, there might be scenarios where the actual payload/records do not 
contain a suitable primary key, but a unique identifier is available through 
metadata. In this case it would make sense to define the primary key on such a 
metadata column:
{code:java}
CREATE TABLE T (
  uid METADATA,
  content STRING,
  PRIMARY KEY (uid) NOT ENFORCED
) WITH (…)
{code}
A simple example for this would be IMAP: there is nothing unique about any 
single email as a record, but each email in a specific folder on an IMAP server 
has a unique UID (I'm excluding some irrelevant technical details here).


> Support primary keys on metadata columns
> 
>
> Key: FLINK-24050
> URL: https://issues.apache.org/jira/browse/FLINK-24050
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Ingo Bürk
>Priority: Major
>
> Currently, primary keys are required to consist solely of physical columns. 
> However, there might be scenarios where the actual payload/records do not 
> contain a suitable primary key, but a unique identifier is available through 
> metadata. In this case it would make sense to define the primary key on such 
> a metadata column:
> {code:java}
> CREATE TABLE T (
>   uid STRING METADATA,
>   content STRING,
>   PRIMARY KEY (uid) NOT ENFORCED
> ) WITH (…)
> {code}
> A simple example for this would be IMAP: there is nothing unique about any 
> single email as a record, but each email in a specific folder on an IMAP 
> server has a unique UID (I'm excluding some irrelevant technical details 
> here).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
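
To make the requested capability concrete, here is a sketch of how it could
look with the Kafka connector, whose (partition, offset) metadata pair
uniquely identifies a record. This assumes FLINK-24050 were implemented; the
planner does not accept such a primary key today:

{code:sql}
-- Sketch only: a primary key consisting purely of metadata columns.
CREATE TABLE kafka_events (
  `part`   INT METADATA FROM 'partition',  -- Kafka partition of the record
  `offset` BIGINT METADATA,                -- offset within the partition
  content  STRING,
  PRIMARY KEY (`part`, `offset`) NOT ENFORCED
) WITH (
  'connector' = 'kafka',
  'topic' = 'events',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json'
);
{code}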


[GitHub] [flink] Airblader commented on a change in pull request #17048: [FLINK-24033][table-planner] Propagate unique keys for fromChangelogStream

2021-08-30 Thread GitBox


Airblader commented on a change in pull request #17048:
URL: https://github.com/apache/flink/pull/17048#discussion_r698307830



##
File path: 
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/connectors/DynamicSourceUtils.java
##
@@ -90,13 +90,18 @@ public static RelNode convertDataStreamToRel(
         final DynamicTableSource tableSource =
                 new ExternalDynamicSource<>(
                         identifier, dataStream, physicalDataType, isTopLevelRecord, changelogMode);
+        final FlinkStatistic statistic =
+                FlinkStatistic.builder()
+                        // this is a temporary solution, FLINK-15123 will resolve this

Review comment:
   (The good ol' two-year-and-counting temporary solution… :-))




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tillrohrmann closed pull request #16915: [FLINK-9925][tests] Harden ClientTest by making handler shareable

2021-08-30 Thread GitBox


tillrohrmann closed pull request #16915:
URL: https://github.com/apache/flink/pull/16915


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] pnowojski merged pull request #17009: [FLINK-23466][network] Fix the bug that buffer listeners may not be notified when recycling buffers

2021-08-30 Thread GitBox


pnowojski merged pull request #17009:
URL: https://github.com/apache/flink/pull/17009


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-23466) UnalignedCheckpointITCase hangs on Azure

2021-08-30 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-23466:
---
Description: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20813&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7&l=16016

The problem is the buffer listener will be removed from the listener queue when 
notified and then it will be added to the listener queue again if it needs more 
buffers. However, if some buffers are recycled meanwhile, the buffer listener 
will not be notified of the available buffers. For example:

1. Thread 1 calls LocalBufferPool#recycle().
2. Thread 1 reaches LocalBufferPool#fireBufferAvailableNotification() and 
listener.notifyBufferAvailable() is invoked, but Thread 1 sleeps before 
acquiring the lock to registeredListeners.add(listener).
3. Thread 2 is woken up as a result of the notifyBufferAvailable() call. 
It takes the buffer, but it needs more buffers.
4. Other threads return all buffers, including the one that has been 
recycled. None are taken; all are now in the LocalBufferPool.
5. Thread 1 wakes up, and continues fireBufferAvailableNotification() 
invocation.
6. Thread 1 re-adds listener that's waiting for more buffer 
registeredListeners.add(listener).
7. Thread 1 exits loop LocalBufferPool#recycle(MemorySegment, int) inside, 
as the original memory segment has been used.

At the end we have a state where all buffers are in the LocalBufferPool, so no 
new recycle() calls will happen, but there is still one listener waiting for a 
buffer (despite buffers being available).

  
was:https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20813&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7&l=16016


> UnalignedCheckpointITCase hangs on Azure
> 
>
> Key: FLINK-23466
> URL: https://issues.apache.org/jira/browse/FLINK-23466
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.14.0
>Reporter: Dawid Wysakowicz
>Assignee: Yingjie Cao
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Fix For: 1.14.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20813&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7&l=16016
> The problem is the buffer listener will be removed from the listener queue 
> when notified and then it will be added to the listener queue again if it 
> needs more buffers. However, if some buffers are recycled meanwhile, the 
> buffer listener will not be notified of the available buffers. For example:
> 1. Thread 1 calls LocalBufferPool#recycle().
> 2. Thread 1 reaches LocalBufferPool#fireBufferAvailableNotification() and 
> listener.notifyBufferAvailable() is invoked, but Thread 1 sleeps before 
> acquiring the lock to registeredListeners.add(listener).
> 3. Thread 2 is woken up as a result of the notifyBufferAvailable() 
> call. It takes the buffer, but it needs more buffers.
> 4. Other threads return all buffers, including the one that has been 
> recycled. None are taken; all are now in the LocalBufferPool.
> 5. Thread 1 wakes up, and continues fireBufferAvailableNotification() 
> invocation.
> 6. Thread 1 re-adds listener that's waiting for more buffer 
> registeredListeners.add(listener).
> 7. Thread 1 exits loop LocalBufferPool#recycle(MemorySegment, int) 
> inside, as the original memory segment has been used.
> At the end we have a state where all buffers are in the LocalBufferPool, so 
> no new recycle() calls will happen, but there is still one listener waiting 
> for a buffer (despite buffers being available).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
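
One way to close the window described above is to make "notify the listener"
and "re-register the listener if it still needs buffers" atomic with respect
to concurrent recycle() calls. A minimal, self-contained sketch of that
pattern (simplified types, not the actual Flink patch):

{code:java}
import java.util.ArrayDeque;

class BufferPoolSketch {
    interface BufferListener {
        /** Returns true if the listener is still waiting for more buffers. */
        boolean notifyBufferAvailable(Object segment);
    }

    private final ArrayDeque<BufferListener> registeredListeners = new ArrayDeque<>();
    private final ArrayDeque<Object> availableMemorySegments = new ArrayDeque<>();

    void recycle(Object segment) {
        synchronized (availableMemorySegments) {
            BufferListener listener = registeredListeners.poll();
            if (listener == null) {
                // nobody is waiting: park the segment in the pool
                availableMemorySegments.add(segment);
            } else if (listener.notifyBufferAvailable(segment)) {
                // the listener needs more buffers: re-register it before the
                // lock is released, so a concurrent recycle() can never see
                // an empty listener queue while someone is still waiting
                registeredListeners.add(listener);
            }
        }
    }
}
{code}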


[jira] [Closed] (FLINK-23466) UnalignedCheckpointITCase hangs on Azure

2021-08-30 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski closed FLINK-23466.
--
Resolution: Fixed

Merged to master as 48a384dffc7
Merged to release-1.14 as 0067d35cc0f

> UnalignedCheckpointITCase hangs on Azure
> 
>
> Key: FLINK-23466
> URL: https://issues.apache.org/jira/browse/FLINK-23466
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.14.0
>Reporter: Dawid Wysakowicz
>Assignee: Yingjie Cao
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Fix For: 1.14.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20813&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7&l=16016
> The problem is the buffer listener will be removed from the listener queue 
> when notified and then it will be added to the listener queue again if it 
> needs more buffers. However, if some buffers are recycled meanwhile, the 
> buffer listener will not be notified of the available buffers. For example:
> 1. Thread 1 calls LocalBufferPool#recycle().
> 2. Thread 1 reaches LocalBufferPool#fireBufferAvailableNotification() and 
> listener.notifyBufferAvailable() is invoked, but Thread 1 sleeps before 
> acquiring the lock to registeredListeners.add(listener).
> 3. Thread 2 is woken up as a result of the notifyBufferAvailable() 
> call. It takes the buffer, but it needs more buffers.
> 4. Other threads return all buffers, including the one that has been 
> recycled. None are taken; all are now in the LocalBufferPool.
> 5. Thread 1 wakes up, and continues fireBufferAvailableNotification() 
> invocation.
> 6. Thread 1 re-adds listener that's waiting for more buffer 
> registeredListeners.add(listener).
> 7. Thread 1 exits loop LocalBufferPool#recycle(MemorySegment, int) 
> inside, as the original memory segment has been used.
> At the end we have a state where all buffers are in the LocalBufferPool, so 
> no new recycle() calls will happen, but there is still one listener waiting 
> for a buffer (despite buffers being available).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-24030) PulsarSourceITCase>SourceTestSuiteBase.testMultipleSplits failed

2021-08-30 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski closed FLINK-24030.
--
Resolution: Duplicate

> PulsarSourceITCase>SourceTestSuiteBase.testMultipleSplits failed
> 
>
> Key: FLINK-24030
> URL: https://issues.apache.org/jira/browse/FLINK-24030
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.14.0
>Reporter: Piotr Nowojski
>Priority: Major
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22936&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461
> root cause:
> {noformat}
> Aug 27 09:41:42 Caused by: 
> org.apache.pulsar.client.api.PulsarClientException$BrokerMetadataException: 
> Consumer not found
> Aug 27 09:41:42   at 
> org.apache.pulsar.client.api.PulsarClientException.unwrap(PulsarClientException.java:987)
> Aug 27 09:41:42   at 
> org.apache.pulsar.client.impl.PulsarClientImpl.close(PulsarClientImpl.java:658)
> Aug 27 09:41:42   at 
> org.apache.flink.connector.pulsar.source.reader.source.PulsarSourceReaderBase.close(PulsarSourceReaderBase.java:83)
> Aug 27 09:41:42   at 
> org.apache.flink.connector.pulsar.source.reader.source.PulsarOrderedSourceReader.close(PulsarOrderedSourceReader.java:170)
> Aug 27 09:41:42   at 
> org.apache.flink.streaming.api.operators.SourceOperator.close(SourceOperator.java:308)
> Aug 27 09:41:42   at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:141)
> Aug 27 09:41:42   at 
> org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.closeAllOperators(RegularOperatorChain.java:127)
> Aug 27 09:41:42   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.closeAllOperators(StreamTask.java:1015)
> Aug 27 09:41:42   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.afterInvoke(StreamTask.java:859)
> Aug 27 09:41:42   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:747)
> Aug 27 09:41:42   at 
> org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
> Aug 27 09:41:42   at 
> org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:937)
> Aug 27 09:41:42   at 
> org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
> Aug 27 09:41:42   at 
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
> {noformat}
> Top level error:
> {noformat}
> WARNING: The following warnings have been detected: WARNING: Return type, 
> java.util.Map org.apache.pulsar.common.policies.data.NamespaceIsolationData>, of method, 
> public java.util.Map org.apache.pulsar.common.policies.data.NamespaceIsolationData> 
> org.apache.pulsar.broker.admin.impl.ClustersBase.getNamespaceIsolationPolicies(java.lang.String)
>  throws java.lang.Exception, is not resolvable to a concrete type.
> Aug 27 09:41:42 [ERROR] Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, 
> Time elapsed: 357.849 s <<< FAILURE! - in 
> org.apache.flink.connector.pulsar.source.PulsarSourceITCase
> Aug 27 09:41:42 [ERROR] testMultipleSplits{TestEnvironment, 
> ExternalContext}[1]  Time elapsed: 5.391 s  <<< ERROR!
> Aug 27 09:41:42 java.lang.RuntimeException: Failed to fetch next result
> Aug 27 09:41:42   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:109)
> Aug 27 09:41:42   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:80)
> Aug 27 09:41:42   at 
> org.apache.flink.connectors.test.common.utils.TestDataMatchers$MultipleSplitDataMatcher.matchesSafely(TestDataMatchers.java:151)
> Aug 27 09:41:42   at 
> org.apache.flink.connectors.test.common.utils.TestDataMatchers$MultipleSplitDataMatcher.matchesSafely(TestDataMatchers.java:133)
> Aug 27 09:41:42   at 
> org.hamcrest.TypeSafeDiagnosingMatcher.matches(TypeSafeDiagnosingMatcher.java:55)
> Aug 27 09:41:42   at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:12)
> Aug 27 09:41:42   at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
> Aug 27 09:41:42   at 
> org.apache.flink.connectors.test.common.testsuites.SourceTestSuiteBase.testMultipleSplits(SourceTestSuiteBase.java:156)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-9925) ClientTest.testSimpleRequests fails on Travis

2021-08-30 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-9925.

Resolution: Fixed

Fixed via

1.15.0:
bad537d6b4517341e33253354d6273153e571747
d505e6ed9bd596d1b6efcc3943dabd2b6a880744

1.14.0:
9da4069821a3ca6d06add3eab06b0af38f43112a
9d742bceef839bb3e9087946f434979d2d3f847d

1.13.3:
c2ad55f6edf8426b12390fc06ee64c57d9ae86b7
6b455b41202c85dd8bb41813bbee8e2a9e3c2ea0

1.12.6:
68316d82a1c46b598e63f351335a02e71e5a1f84
58039eee6e3963444d09ca9e4703509cd3f8c4f6

> ClientTest.testSimpleRequests fails on Travis
> -
>
> Key: FLINK-9925
> URL: https://issues.apache.org/jira/browse/FLINK-9925
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Queryable State, Tests
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Major
>  Labels: auto-deprioritized-critical, pull-request-available, 
> test-stability
> Fix For: 1.14.0, 1.12.6, 1.13.3
>
>
> {{ClientTest.testSimpleRequests}} fails on Travis with an {{AssertionError}}: 
> https://api.travis-ci.org/v3/job/405690023/log.txt



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #16966: [FLINK-23875][connectors/kafka] Snapshot reduceBuffer of ReducingUpse…

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #16966:
URL: https://github.com/apache/flink/pull/16966#issuecomment-904741081


   
   ## CI report:
   
   * ed07cb76926d2747ac25953e5f5e74c48e64ef25 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22827)
 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22799)
 
   * 758c42cbc3442c3b95389dc65ef4bd67e2ff3344 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23080)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-22483) Recover checkpoints when JobMaster gains leadership

2021-08-30 Thread ming li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406647#comment-17406647
 ] 

ming li commented on FLINK-22483:
-

Hi, [~dmvk], [~edu05]. Could you please help answer one of my questions? If we 
can only recover the ??CompletedCheckpointStore?? once at startup, why does the 
??sharedStateRegistry?? need to be re-registered every time when the job is 
restarted? Is it possible to re-register only once at startup?

> Recover checkpoints when JobMaster gains leadership
> ---
>
> Key: FLINK-22483
> URL: https://issues.apache.org/jira/browse/FLINK-22483
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Robert Metzger
>Assignee: David Morávek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Recovering checkpoints (from the CompletedCheckpointStore) is a potentially 
> long-lasting/blocking operation, for example if the file system 
> implementation is retrying to connect to an unavailable storage backend.
> Currently, we are calling the CompletedCheckpointStore.recover() method from 
> the main thread of the JobManager, making it unresponsive to any RPC call 
> while the recover method is blocked:
> {code}
> 2021-04-02 20:33:31,384 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Job XXX 
> switched from state RUNNING to RESTARTING.
> com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to 
> minio.minio.svc:9000 [minio.minio.svc/] failed: Connection refused 
> (Connection refused)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1207)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1153)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550) ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530) ~[?:?]
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5062) 
> ~[?:?]
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5008) 
> ~[?:?]
>   at 
> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1490) 
> ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$openStream$1(PrestoS3FileSystem.java:905)
>  ~[?:?]
>   at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:902)
>  ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:887)
>  ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.seekStream(PrestoS3FileSystem.java:880)
>  ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$read$0(PrestoS3FileSystem.java:819)
>  ~[?:?]
>   at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.read(PrestoS3FileSystem.java:818)
>  ~[?:?]
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:284) 
> ~[?:1.8.0_282]
>   at XXX.recover(KubernetesHaCheckpointStore.java:69) 
> ~[vvp-flink-ha-kubernetes-flink112-1.1.0.jar:?]
>   at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedStateInternal(CheckpointCoordinator.java:1511)
>  ~[flink-dist_2.12-1.12.2-stream1.jar:1.12.2-stream1]
>   at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedStateToAll(CheckpointCoordinator.java:1451)
>  ~[flink-dist_2.12-1.12.2-stream1.jar:1.12.2-stream1]
>   at 
> org.apache.flink.runtime.scheduler.SchedulerBase.restoreState(SchedulerBase.java:421)
>  ~[flink-dist_2.12-1.12.2-stream1.jar:1.12.2-stream1]
>   at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.lambda$restartTasks$2(DefaultScheduler.java:314)
>  ~[flink-dis
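
The gist of the improvement tracked here is to take the blocking recover()
call off the JobMaster's main thread. A minimal sketch of such a hand-off
(assumed interface shapes, not the actual Flink classes):

{code:java}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

class RecoverySketch {
    interface CheckpointStore {
        // may block for a long time on a slow or unavailable filesystem
        void recover() throws Exception;
    }

    /** Runs the blocking recovery on ioExecutor so the main thread stays responsive. */
    static CompletableFuture<Void> recoverAsync(CheckpointStore store, Executor ioExecutor) {
        return CompletableFuture.runAsync(
                () -> {
                    try {
                        store.recover();
                    } catch (Exception e) {
                        throw new RuntimeException("Checkpoint recovery failed", e);
                    }
                },
                ioExecutor);
    }
}
{code}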

[GitHub] [flink] flinkbot edited a comment on pull request #17045: [FLINK-23924][python][examples] Add PyFlink examples

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17045:
URL: https://github.com/apache/flink/pull/17045#issuecomment-908007478


   
   ## CI report:
   
   * 77e0e48a0f9c3f2f62e6e19ca5f4253060c13b07 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23061)
 
   * edd075f9e5a4dd29de26db54bafedace3c92b9ef Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23081)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #17048: [FLINK-24033][table-planner] Propagate unique keys for fromChangelogStream

2021-08-30 Thread GitBox


flinkbot commented on pull request #17048:
URL: https://github.com/apache/flink/pull/17048#issuecomment-908188488


   
   ## CI report:
   
   * 44a4c5b89d03c403df73fae28d464a557444323a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Comment Edited] (FLINK-22483) Recover checkpoints when JobMaster gains leadership

2021-08-30 Thread ming li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406647#comment-17406647
 ] 

ming li edited comment on FLINK-22483 at 8/30/21, 9:25 AM:
---

Hi, [~dmvk], [~edu05]. Could you please help answer one of my questions? If we 
can only recover the {{CompletedCheckpointStore}} once at startup, why does the 
{{sharedStateRegistry}} need to be re-registered every time when the job is 
restarted? Is it possible to re-register only once at startup?


was (Author: ming li):
Hi, [~dmvk], [~edu05]. Could you please help answer one of my questions? If we 
can only recover the ??CompletedCheckpointStore?? once at startup, why does the 
??sharedStateRegistry?? need to be re-registered every time when the job is 
restarted? Is it possible to re-register only once at startup?

> Recover checkpoints when JobMaster gains leadership
> ---
>
> Key: FLINK-22483
> URL: https://issues.apache.org/jira/browse/FLINK-22483
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Robert Metzger
>Assignee: David Morávek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Recovering checkpoints (from the CompletedCheckpointStore) is a potentially 
> long-lasting/blocking operation, for example if the file system 
> implementation is retrying to connect to an unavailable storage backend.
> Currently, we are calling the CompletedCheckpointStore.recover() method from 
> the main thread of the JobManager, making it unresponsive to any RPC call 
> while the recover method is blocked:
> {code}
> 2021-04-02 20:33:31,384 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Job XXX 
> switched from state RUNNING to RESTARTING.
> com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to 
> minio.minio.svc:9000 [minio.minio.svc/] failed: Connection refused 
> (Connection refused)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1207)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1153)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550) ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530) ~[?:?]
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5062) 
> ~[?:?]
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5008) 
> ~[?:?]
>   at 
> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1490) 
> ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$openStream$1(PrestoS3FileSystem.java:905)
>  ~[?:?]
>   at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:902)
>  ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:887)
>  ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.seekStream(PrestoS3FileSystem.java:880)
>  ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$read$0(PrestoS3FileSystem.java:819)
>  ~[?:?]
>   at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.read(PrestoS3FileSystem.java:818)
>  ~[?:?]
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:284) 
> ~[?:1.8.0_282]
>   at XXX.recover(KubernetesHaCheckpointStore.java:69) 
> ~[vvp-flink-ha-kubernetes-flink112-1.1.0.jar:?]
>   at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedStateInternal(CheckpointCoordinator.java:1511)
>  ~[flink-dist_2.12-1.12.2-stream1.jar:1.12.2-stream1]
>   at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedStateToA

[GitHub] [flink] wsry commented on a change in pull request #17037: [FLINK-24035][network] Guarantee that the LocalBufferPool is initialized with one buffer

2021-08-30 Thread GitBox


wsry commented on a change in pull request #17037:
URL: https://github.com/apache/flink/pull/17037#discussion_r698335837



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/buffer/LocalBufferPool.java
##
@@ -216,9 +220,10 @@
         // Lock is only taken, because #checkAvailability asserts it. It's a small penalty for
         // thread safety.
         synchronized (this.availableMemorySegments) {
-            if (checkAvailability()) {
-                availabilityHelper.resetAvailable();
-            }
+            // guarantee that we have one buffer on initialization

Review comment:
   1. Added more comments.
   2. The argument check already guarantees that the 
numberOfRequiredMemorySegments must be greater than 0.
   3. From the code, it seems that checkAvailability only requests one buffer 
rather than all buffers at once (correct me if I am wrong). Do you mean we 
should enhance this logic and request all buffers at once?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
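
For context, the "guarantee one buffer on initialization" idea from the diff
above, reduced to a self-contained sketch (assumed shapes and names, not the
actual patch):

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

/** Sketch of eagerly reserving the first buffer at construction time. */
class EagerInitPoolSketch<T> {
    private final ArrayDeque<T> availableSegments = new ArrayDeque<>();
    private boolean available;

    EagerInitPoolSketch(Supplier<T> globalPool) {
        synchronized (availableSegments) {
            T first = globalPool.get(); // may be null if the global pool is exhausted
            if (first != null) {
                availableSegments.add(first); // one buffer is guaranteed up front
                available = true;             // the pool starts out available
            }
        }
    }
}
```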




[GitHub] [flink-training] NicoK commented on a change in pull request #31: [FLINK-23653] exercise and test rework

2021-08-30 Thread GitBox


NicoK commented on a change in pull request #31:
URL: https://github.com/apache/flink-training/pull/31#discussion_r698320553



##
File path: 
common/src/test/java/org/apache/flink/training/exercises/testing/ComposedRichCoFlatMapFunction.java
##
@@ -0,0 +1,70 @@
+package org.apache.flink.training.exercises.testing;
+
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction;
+import org.apache.flink.training.exercises.common.utils.MissingSolutionException;
+import org.apache.flink.util.Collector;
+
+/**
+ * A RichCoFlatMapFunction that can delegate to a RichCoFlatMapFunction in either the exercise or
+ * in the solution. The implementation in the exercise is tested first, and if it throws
+ * MissingSolutionException, then the solution is tested instead.
+ *
+ * <p>This can be used to write test harness tests.
+ *
+ * @param <IN1> first input type
+ * @param <IN2> second input type
+ * @param <OUT> output type
+ */
+public class ComposedRichCoFlatMapFunction<IN1, IN2, OUT>
+        extends RichCoFlatMapFunction<IN1, IN2, OUT> {
+    private final RichCoFlatMapFunction<IN1, IN2, OUT> exercise;
+    private final RichCoFlatMapFunction<IN1, IN2, OUT> solution;
+    private boolean useExercise;
+
+    public ComposedRichCoFlatMapFunction(
+            RichCoFlatMapFunction<IN1, IN2, OUT> exercise,
+            RichCoFlatMapFunction<IN1, IN2, OUT> solution) {
+
+        this.exercise = exercise;
+        this.solution = solution;
+        this.useExercise = true;
+    }
+
+    @Override
+    public void open(Configuration parameters) throws Exception {
+
+        try {
+            exercise.setRuntimeContext(this.getRuntimeContext());
+            exercise.open(parameters);
+        } catch (Exception e) {
+            if (MissingSolutionException.ultimateCauseIsMissingSolution(e)) {
+                this.useExercise = false;
+                solution.setRuntimeContext(this.getRuntimeContext());
+                solution.open(parameters);
+            } else {
+                throw e;
+            }
+        }
+    }
+
+    @Override
+    public void flatMap1(IN1 value, Collector<OUT> out) throws Exception {
+
+        if (useExercise) {
+            exercise.flatMap1(value, out);
+        } else {
+            solution.flatMap1(value, out);
+        }

Review comment:
   Looking at this pattern again, what do you think about this alternative (see the sketch below):
   - create a new member `private RichCoFlatMapFunction<IN1, IN2, OUT> functionToExecute;` (or maybe you have a better idea for the name)
   - assign either `exercise` or `solution` to it
   - just call `functionToExecute.flatMap1()` etc
   
   -> Less code, don't have to wait for the JIT to optimise the boolean away (hopefully it can), no additional memory required (just a new reference that replaces the boolean)
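
   A sketch of how that variant could look, slotted into the class quoted above (same field and type names as the PR):
   
   ```java
   private RichCoFlatMapFunction<IN1, IN2, OUT> functionToExecute;
   
   @Override
   public void open(Configuration parameters) throws Exception {
       try {
           exercise.setRuntimeContext(getRuntimeContext());
           exercise.open(parameters);
           functionToExecute = exercise; // the exercise opened fine
       } catch (Exception e) {
           if (MissingSolutionException.ultimateCauseIsMissingSolution(e)) {
               solution.setRuntimeContext(getRuntimeContext());
               solution.open(parameters);
               functionToExecute = solution; // fall back to the solution
           } else {
               throw e;
           }
       }
   }
   
   @Override
   public void flatMap1(IN1 value, Collector<OUT> out) throws Exception {
       functionToExecute.flatMap1(value, out); // single dispatch, no boolean flag
   }
   ```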

##
File path: 
hourly-tips/src/main/java/org/apache/flink/training/exercises/hourlytips/HourlyTipsExercise.java
##
@@ -40,18 +56,32 @@
  */
 public static void main(String[] args) throws Exception {
 
+        HourlyTipsSolution job =
+                new HourlyTipsSolution(new TaxiFareGenerator(), new PrintSinkFunction<>());
+
+        job.execute();
+    }
+
+    /**
+     * Create and execute the hourly tips pipeline.
+     *
+     * @return {JobExecutionResult}
+     * @throws Exception which occurs during job execution.
+     */
+    public JobExecutionResult execute() throws Exception {
+
         // set up streaming execution environment
         StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
-        env.setParallelism(ExerciseBase.parallelism);
 
         // start the data generator
-        DataStream<TaxiFare> fares = env.addSource(fareSourceOrTest(new TaxiFareGenerator()));
-
-        throw new MissingSolutionException();
+        DataStream<TaxiFare> fares = env.addSource(new TaxiFareGenerator());

Review comment:
   ```suggestion
        DataStream<TaxiFare> fares = env.addSource(source);
   ```

##
File path: 
hourly-tips/src/test/java/org/apache/flink/training/exercises/hourlytips/HourlyTipsTest.java
##
@@ -72,27 +84,39 @@ public void testMaxAcrossDrivers() throws Exception {
 TaxiFare tenFor1In2 = testFare(1, t(90), 10.0F);

Review comment:
   Can you add a couple more test fares so that the chances are higher that 
these are processed on different parallel instances (if the users' code 
supports parallel instances)?

##
File path: 
hourly-tips/src/test/java/org/apache/flink/training/exercises/hourlytips/HourlyTipsTest.java
##
@@ -72,27 +84,39 @@ public void testMaxAcrossDrivers() throws Exception {
 TaxiFare tenFor1In2 = testFare(1, t(90), 10.0F);
 TaxiFare twentyFor2In2 = testFare(2, t(90), 20.0F);
 
-        TestFareSource source =
-                new TestFareSource(oneFor1In1, fiveFor1In1, tenFor1In2, twentyFor2In2);
+        ParallelTestSource<TaxiFare> source =
+                new ParallelTestSource<>(oneFor1In1, fiveFor1In1, tenFor1In2, twentyFor2In2);
 
 Tuple3 hour

[GitHub] [flink] twalthr commented on a change in pull request #17011: [FLINK-23663][table-planner] Reduce state size of ChangelogNormalize

2021-08-30 Thread GitBox


twalthr commented on a change in pull request #17011:
URL: https://github.com/apache/flink/pull/17011#discussion_r698306556



##
File path: 
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TableFactoryHarness.java
##
@@ -0,0 +1,361 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.factories;
+
+import org.apache.flink.configuration.ConfigOption;
+import org.apache.flink.configuration.ConfigOptions;
+import org.apache.flink.configuration.ReadableConfig;
+import org.apache.flink.streaming.api.functions.sink.SinkFunction;
+import org.apache.flink.streaming.api.functions.source.SourceFunction;
+import org.apache.flink.table.api.Schema;
+import org.apache.flink.table.api.TableDescriptor;
+import org.apache.flink.table.api.ValidationException;
+import org.apache.flink.table.catalog.CatalogTable;
+import org.apache.flink.table.connector.ChangelogMode;
+import org.apache.flink.table.connector.sink.DynamicTableSink;
+import 
org.apache.flink.table.connector.sink.DynamicTableSink.SinkRuntimeProvider;
+import org.apache.flink.table.connector.sink.SinkFunctionProvider;
+import org.apache.flink.table.connector.source.DynamicTableSource;
+import org.apache.flink.table.connector.source.LookupTableSource;
+import org.apache.flink.table.connector.source.ScanTableSource;
+import 
org.apache.flink.table.connector.source.ScanTableSource.ScanRuntimeProvider;
+import org.apache.flink.table.connector.source.SourceFunctionProvider;
+import org.apache.flink.table.connector.source.TableFunctionProvider;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.factories.DynamicTableFactory;
+import org.apache.flink.table.factories.DynamicTableSinkFactory;
+import org.apache.flink.table.factories.DynamicTableSourceFactory;
+import org.apache.flink.table.factories.FactoryUtil;
+import org.apache.flink.table.functions.TableFunction;
+import org.apache.flink.testutils.junit.SharedObjects;
+import org.apache.flink.util.InstantiationUtil;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.Base64;
+import java.util.HashSet;
+import java.util.Set;
+
+/**
+ * Provides a flexible testing harness for table factories.
+ *
+ * <p>This testing harness allows writing custom sources and sinks which can be directly
+ * instantiated from the test. This avoids having to implement a factory, and enables using the
+ * {@link SharedObjects} rule to get direct access to the underlying source/sink from the test.
+ *
+ * <p>Note that the underlying source/sink must be {@link Serializable}. It is recommended to
+ * extend from {@link ScanSourceBase}, {@link LookupSourceBase}, or {@link SinkBase} which provide
+ * default implementations for most methods as well as some convenience methods.
+ *
+ * <p>The harness provides a {@link Factory}. You can register a source / sink through
+ * configuration by passing a base64-encoded serialization. The harness provides convenience
+ * methods to make this process as simple as possible.
+ *
+ * <p>Example:
+ *
+ * {@code
+ * public class CustomSourceTest {
+ * {@literal @}Rule public SharedObjects sharedObjects = SharedObjects.create();
+ *
+ * {@literal @}Test
+ * public void test() {
+ * SharedReference<List<Long>> appliedLimits = sharedObjects.add(new ArrayList<>());
+ *
+ * Schema schema = Schema.newBuilder().build();

Review comment:
   nit: use the new `Schema.derived()`

##
File path: 
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TableFactoryHarness.java
##
@@ -0,0 +1,361 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ *

[jira] [Assigned] (FLINK-23850) Test Kafka table connector with new runtime provider

2021-08-30 Thread Matthias (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias reassigned FLINK-23850:


Assignee: Matthias

> Test Kafka table connector with new runtime provider
> 
>
> Key: FLINK-23850
> URL: https://issues.apache.org/jira/browse/FLINK-23850
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Kafka
>Affects Versions: 1.14.0
>Reporter: Qingsheng Ren
>Assignee: Matthias
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.14.0
>
>
> The runtime provider of the Kafka table connector has been replaced with the new 
> KafkaSource and KafkaSink. The table connector needs to be tested to make 
> sure there are no surprises for Table/SQL API users. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #17037: [FLINK-24035][network] Guarantee that the LocalBufferPool is initialized with one buffer

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17037:
URL: https://github.com/apache/flink/pull/17037#issuecomment-907731112


   
   ## CI report:
   
   * 71ddbbb89bf81896a5acca9466e50d0a6e544758 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23020)
 
   * a1fc01a4a4c3373f4f931c688aa587fb0d3dfc01 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-24052) Flink SQL reads S3 bucket data.

2021-08-30 Thread Moses (Jira)
Moses created FLINK-24052:
-

 Summary: Flink SQL reads S3 bucket data.
 Key: FLINK-24052
 URL: https://issues.apache.org/jira/browse/FLINK-24052
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Ecosystem
Reporter: Moses


I want to use Flink SQL to read S3 bucket data. But I found it ONLY supports 
absolute paths, which means I cannot read all of the content in the bucket. 
My SQL statements are written as below:
{code:sql}
CREATE TABLE file_data (
a BIGINT, b STRING, c STRING, d DOUBLE, e BOOLEAN, f DATE, g STRING,h 
STRING,
i STRING, j STRING, k STRING, l STRING, m STRING, n STRING, o STRING, p 
FLOAT
) WITH (
'connector' = 'filesystem',
'path' = 's3a://my-bucket',
'format' = 'parquet'
);

SELECT COUNT(*) FROM file_data;
{code}
The exception info:


{code:java}
Caused by: java.lang.IllegalArgumentException: path must be absolute
at 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) 
~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
at 
org.apache.hadoop.fs.s3a.s3guard.PathMetadata.<init>(PathMetadata.java:68) 
~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
at 
org.apache.hadoop.fs.s3a.s3guard.PathMetadata.<init>(PathMetadata.java:60) 
~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
at 
org.apache.hadoop.fs.s3a.s3guard.PathMetadata.<init>(PathMetadata.java:56) 
~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
at 
org.apache.hadoop.fs.s3a.s3guard.S3Guard.putAndReturn(S3Guard.java:149) 
~[flink-s3-fs-hadoop-1.13.1.jar:1.13.1]
{code}

Is there any solution to meet my requirement?






--
This message was sent by Atlassian Jira
(v8.3.4#803005)
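
One thing worth trying, though this is an assumption rather than a verified
fix: give the URI an explicit root path (a trailing slash), so that the
resulting Hadoop path component is non-empty and absolute:

{code:sql}
-- Untested workaround sketch: point the connector at the bucket root
-- with an explicit path component.
CREATE TABLE file_data (
    a BIGINT, b STRING
    -- ... remaining columns as in the original definition
) WITH (
    'connector' = 'filesystem',
    'path' = 's3a://my-bucket/',   -- note the trailing slash
    'format' = 'parquet'
);
{code}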


[jira] [Updated] (FLINK-24036) SSL cannot be installed on CI

2021-08-30 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-24036:
-
Fix Version/s: (was: 1.15.0)

> SSL cannot be installed on CI
> -
>
> Key: FLINK-24036
> URL: https://issues.apache.org/jira/browse/FLINK-24036
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.14.0, 1.12.5, 1.13.2, 1.15.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Blocker
> Fix For: 1.14.0, 1.12.6, 1.13.3
>
>
> {code}
> # install libssl1.0.0 for netty tcnative
> wget 
> http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.6_amd64.deb
> sudo apt install ./libssl1.0.0_1.0.2n-1ubuntu5.6_amd64.deb
> {code}
> {code}
> --2021-08-27 20:48:49--  
> http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.6_amd64.deb
> Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 
> 91.189.91.38, 2001:67c:1562::15, ...
> Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... 
> connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2021-08-27 20:48:49 ERROR 404: Not Found.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zentol merged pull request #17047: [FLINK-24027][filesystems] Remove excessive dependencies from NOTICE files

2021-08-30 Thread GitBox


zentol merged pull request #17047:
URL: https://github.com/apache/flink/pull/17047


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-24027) FileSystems list excessive dependencies in NOTICE

2021-08-30 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-24027.

Resolution: Fixed

master: 401d7241c7e1eb432d988af3ef097dc5a633c9b4
1.14: 4c23df38555e364df9f14c67452d06de653ecdc2 

> FileSystems list excessive dependencies in NOTICE
> -
>
> Key: FLINK-24027
> URL: https://issues.apache.org/jira/browse/FLINK-24027
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Affects Versions: 1.14.0
>Reporter: Chesnay Schepler
>Assignee: Ingo Bürk
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> The LicenseChecker finds several dependencies that are listed in the NOTICE 
> but do not show up in the shade-plugin output. It could be that after the 
> recent AWS/Hadoop bumps these are no longer being bundled (needs 
> confirmation!).
> {code}
> 17:05:14,651 WARN  NoticeFileChecker [] - Dependency 
> com.fasterxml.jackson.core:jackson-annotations:2.12.1 is mentioned in NOTICE 
> file 
> /__w/1/s/flink-filesystems/flink-azure-fs-hadoop/src/main/resources/META-INF/NOTICE,
>  but is not expected there
> 17:05:14,651 WARN  NoticeFileChecker [] - Dependency 
> com.fasterxml.jackson.core:jackson-databind:2.12.1 is mentioned in NOTICE 
> file 
> /__w/1/s/flink-filesystems/flink-azure-fs-hadoop/src/main/resources/META-INF/NOTICE,
>  but is not expected there
> 17:05:14,652 WARN  NoticeFileChecker [] - Dependency 
> org.apache.hadoop.thirdparty:hadoop-shaded-protobuf_3_7:1.1.1 is mentioned in 
> NOTICE file 
> /__w/1/s/flink-filesystems/flink-s3-fs-presto/src/main/resources/META-INF/NOTICE,
>  but is not expected there
> 17:05:14,652 WARN  NoticeFileChecker [] - Dependency 
> org.apache.hadoop.thirdparty:hadoop-shaded-guava:1.1.1 is mentioned in NOTICE 
> file 
> /__w/1/s/flink-filesystems/flink-s3-fs-presto/src/main/resources/META-INF/NOTICE,
>  but is not expected there
> 17:05:14,652 WARN  NoticeFileChecker [] - Dependency 
> org.wildfly.openssl:wildfly-openssl:1.0.7.Final is mentioned in NOTICE file 
> /__w/1/s/flink-filesystems/flink-s3-fs-presto/src/main/resources/META-INF/NOTICE,
>  but is not expected there
> 17:05:14,652 WARN  NoticeFileChecker [] - Dependency 
> commons-lang:commons-lang:2.6 is mentioned in NOTICE file 
> /__w/1/s/flink-filesystems/flink-s3-fs-presto/src/main/resources/META-INF/NOTICE,
>  but is not expected there
> 17:05:14,741 WARN  NoticeFileChecker [] - Dependency 
> commons-lang:commons-lang:2.6 is mentioned in NOTICE file 
> /__w/1/s/flink-filesystems/flink-fs-hadoop-shaded/src/main/resources/META-INF/NOTICE,
>  but is not expected there
> 17:05:14,743 WARN  NoticeFileChecker [] - Dependency 
> org.apache.hadoop.thirdparty:hadoop-shaded-protobuf_3_7:1.1.1 is mentioned in 
> NOTICE file 
> /__w/1/s/flink-filesystems/flink-s3-fs-hadoop/src/main/resources/META-INF/NOTICE,
>  but is not expected there
> 17:05:14,743 WARN  NoticeFileChecker [] - Dependency 
> org.apache.hadoop.thirdparty:hadoop-shaded-guava:1.1.1 is mentioned in NOTICE 
> file 
> /__w/1/s/flink-filesystems/flink-s3-fs-hadoop/src/main/resources/META-INF/NOTICE,
>  but is not expected there
> 17:05:14,743 WARN  NoticeFileChecker [] - Dependency 
> org.wildfly.openssl:wildfly-openssl:1.0.7.Final is mentioned in NOTICE file 
> /__w/1/s/flink-filesystems/flink-s3-fs-hadoop/src/main/resources/META-INF/NOTICE,
>  but is not expected there
> 17:05:14,743 WARN  NoticeFileChecker [] - Dependency 
> commons-lang:commons-lang:2.6 is mentioned in NOTICE file 
> /__w/1/s/flink-filesystems/flink-s3-fs-hadoop/src/main/resources/META-INF/NOTICE,
>  but is not expected there
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #17048: [FLINK-24033][table-planner] Propagate unique keys for fromChangelogStream

2021-08-30 Thread GitBox


flinkbot edited a comment on pull request #17048:
URL: https://github.com/apache/flink/pull/17048#issuecomment-908188488


   
   ## CI report:
   
   * 44a4c5b89d03c403df73fae28d464a557444323a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23088)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tillrohrmann commented on a change in pull request #17027: [FLINK-23954][tests] Fix e2e test test_queryable_state_restart_tm.sh

2021-08-30 Thread GitBox


tillrohrmann commented on a change in pull request #17027:
URL: https://github.com/apache/flink/pull/17027#discussion_r698366255



##
File path: 
flink-end-to-end-tests/flink-queryable-state-test/src/main/java/org/apache/flink/streaming/tests/queryablestate/QsStateProducer.java
##
@@ -158,24 +161,34 @@ public void open(Configuration parameters) throws Exception {
 
             stateDescriptor.setQueryable(QsConstants.QUERY_NAME);
             state = getRuntimeContext().getMapState(stateDescriptor);
-            updateCount();
-
-            LOG.info("Open {} with a count of {}.", getClass().getSimpleName(), count);
-        }
-
-        private void updateCount() throws Exception {
-            count = Iterables.size(state.keys());
+            countsAtCheckpoint = new HashMap<>();
+            count = -1;
+            lastCompletedCheckpoint = -1;
         }
 
         @Override
         public void flatMap(Email value, Collector out) throws Exception {
             state.put(value.getEmailId(), new EmailInformation(value));
-            updateCount();
+            count = Iterables.size(state.keys());
         }
 
         @Override
         public void notifyCheckpointComplete(long checkpointId) {
-            System.out.println("Count on snapshot: " + count); // we look for it in the test
+            if (checkpointId > lastCompletedCheckpoint) {

Review comment:
   Yes, it is a safety net so that we don't output decreasing values.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tillrohrmann commented on pull request #17027: [FLINK-23954][tests] Fix e2e test test_queryable_state_restart_tm.sh

2021-08-30 Thread GitBox


tillrohrmann commented on pull request #17027:
URL: https://github.com/apache/flink/pull/17027#issuecomment-908221973


   Thanks for the review @zentol. Merging this PR now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



