[jira] [Commented] (FLINK-33592) The return type of the function is void,not convenient to use

2023-11-19 Thread Junrui Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787624#comment-17787624
 ] 

Junrui Li commented on FLINK-33592:
---

Hi [~zhangtao9876] ,

The community has already accepted FLIP-381: Deprecate configuration 
getters/setters that return/set complex Java objects 
([here|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=278464992]).
 Therefore, modifying this method is not recommended; please use ConfigOption 
instead of the setter API. Additionally, since the setRestartStrategy method is 
annotated with @PublicEvolving, any modification to it would require community 
discussion.
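
To illustrate the option-based style, here is a minimal, self-contained sketch. `TypedOption`, `SimpleConfig` and `RestartOptionsSketch` are hypothetical stand-ins written for this example, not Flink classes, and the key strings are only illustrative; the real options should be taken from Flink's configuration reference.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a typed option key; this is NOT Flink's ConfigOption.
final class TypedOption<T> {
    final String key;
    final T defaultValue;

    TypedOption(String key, T defaultValue) {
        this.key = key;
        this.defaultValue = defaultValue;
    }
}

// Hypothetical stand-in for a configuration object that stores option values.
final class SimpleConfig {
    private final Map<String, Object> values = new HashMap<>();

    // Returns this, so several options can be set in one chain.
    <T> SimpleConfig set(TypedOption<T> option, T value) {
        values.put(option.key, value);
        return this;
    }

    // Falls back to the option's default when nothing was set.
    @SuppressWarnings("unchecked")
    <T> T get(TypedOption<T> option) {
        Object v = values.get(option.key);
        return v == null ? option.defaultValue : (T) v;
    }
}

final class RestartOptionsSketch {
    // Key strings are illustrative assumptions, not verified against a Flink release.
    static final TypedOption<String> STRATEGY =
            new TypedOption<>("restart-strategy.type", "none");
    static final TypedOption<Integer> ATTEMPTS =
            new TypedOption<>("restart-strategy.fixed-delay.attempts", 1);
    static final TypedOption<Duration> DELAY =
            new TypedOption<>("restart-strategy.fixed-delay.delay", Duration.ofSeconds(1));

    // Describes a fixed-delay restart strategy purely through option values.
    static SimpleConfig fixedDelay(int attempts, Duration delay) {
        return new SimpleConfig()
                .set(STRATEGY, "fixed-delay")
                .set(ATTEMPTS, attempts)
                .set(DELAY, delay);
    }

    public static void main(String[] args) {
        SimpleConfig conf = fixedDelay(3, Duration.ofSeconds(10));
        System.out.println(conf.get(STRATEGY) + " / " + conf.get(ATTEMPTS) + " / " + conf.get(DELAY));
    }
}
```

With this style the restart behaviour is plain data, so no setter that accepts a complex Java object is needed.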

> The return type of the function is void,not convenient to use
> -
>
> Key: FLINK-33592
> URL: https://issues.apache.org/jira/browse/FLINK-33592
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: ZhangTao
>Priority: Minor
>  Labels: pull-request-available
>
> {code:java}
> @PublicEvolving
> public void setRestartStrategy(
> RestartStrategies.RestartStrategyConfiguration 
> restartStrategyConfiguration) {
> config.setRestartStrategy(restartStrategyConfiguration);
> } {code}
> StreamExecutionEnvironment usually has many parameters that need to be 
> set. Because the return type is void, this method cannot be chained with the 
> other setters, which makes it inconvenient to use.
> The other set methods 'return this;'; only this method has a void return type.
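
The difference the reporter describes can be sketched as follows. `FluentEnvSketch` is a hypothetical class written for this example, not Flink's StreamExecutionEnvironment, and its fields are assumptions chosen only to show the pattern.

```java
// Hypothetical stand-in for an environment class; NOT Flink's StreamExecutionEnvironment.
final class FluentEnvSketch {
    private int parallelism = 1;
    private long bufferTimeoutMs = 100L;
    private String restartStrategy = "none";

    // Fluent setter: returns this, so calls can be chained.
    FluentEnvSketch setParallelism(int parallelism) {
        this.parallelism = parallelism;
        return this;
    }

    // Fluent setter: returns this, so calls can be chained.
    FluentEnvSketch setBufferTimeout(long timeoutMs) {
        this.bufferTimeoutMs = timeoutMs;
        return this;
    }

    // Void setter, shaped like the method quoted in the issue: it ends any chain.
    void setRestartStrategy(String strategy) {
        this.restartStrategy = strategy;
    }

    String describe() {
        return "parallelism=" + parallelism
                + ", bufferTimeoutMs=" + bufferTimeoutMs
                + ", restartStrategy=" + restartStrategy;
    }

    public static void main(String[] args) {
        // The fluent setters chain in one expression.
        FluentEnvSketch env = new FluentEnvSketch()
                .setParallelism(4)
                .setBufferTimeout(50L);
        // The void setter needs a statement of its own.
        env.setRestartStrategy("fixed-delay");
        System.out.println(env.describe());
    }
}
```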



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-31597][table] Cleanup usage of deprecated TableEnvironment#registerFunction [flink]

2023-11-19 Thread via GitHub


snuyanzin opened a new pull request, #23751:
URL: https://github.com/apache/flink/pull/23751

   
   
   
   ## What is the purpose of the change
   
   Removes the usage of the deprecated method 
`TableEnvironment#registerFunction`. 
   
   The PR is based on https://github.com/apache/flink/pull/23748
   
   ## Brief change log
   
   - Replace the usage of `TableEnvironment#registerFunction` with 
`TableEnvironment#createTemporarySystemFunction`
   
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-31597][table] Cleanup usage of deprecated TableEnvironment#registerFunction [flink]

2023-11-19 Thread via GitHub


snuyanzin commented on PR #23751:
URL: https://github.com/apache/flink/pull/23751#issuecomment-1818024788

   @wuchong since you were involved in this, could you please have a look?
   
   It seems I have found a way to clean up these deprecated usages.





Re: [PR] [FLINK-31597][table] Cleanup usage of deprecated TableEnvironment#registerFunction [flink]

2023-11-19 Thread via GitHub


flinkbot commented on PR #23751:
URL: https://github.com/apache/flink/pull/23751#issuecomment-1818027266

   
   ## CI report:
   
   * aaf86d607973080e260b734317ba2d082f9c436c UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





Re: [PR] [FLINK-31597][table] Cleanup usage of deprecated TableEnvironment#registerFunction [flink]

2023-11-19 Thread via GitHub


wuchong commented on PR #23751:
URL: https://github.com/apache/flink/pull/23751#issuecomment-181811

   cc @lincoln-lil , @LadyForest , could you help to review this?





[PR] [FLINK-33583][table] support state ttl for join [flink]

2023-11-19 Thread via GitHub


xuyangzhong opened a new pull request, #23752:
URL: https://github.com/apache/flink/pull/23752

   ## What is the purpose of the change
   
   Support state ttl hint for join node.
   
   
   ## Brief change log
   
   *(for example:)*
 - *rename join hints to query hints*
 - *introduce state ttl hints*
 - *support state ttl hints for join*
   
   
   ## Verifying this change
   
   Some tests are added to verify this pr.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? yes
 - If yes, how is the feature documented? A follow-up PR will add the documentation.





[jira] [Updated] (FLINK-33583) Support state ttl hint for join

2023-11-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33583:
---
Labels: pull-request-available  (was: )

> Support state ttl hint for join 
> 
>
> Key: FLINK-33583
> URL: https://issues.apache.org/jira/browse/FLINK-33583
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API, Table SQL / Planner
>Affects Versions: 1.19.0
>Reporter: xuyang
>Priority: Major
>  Labels: pull-request-available
>






Re: [PR] [FLINK-31597][table] Cleanup usage of deprecated TableEnvironment#registerFunction [flink]

2023-11-19 Thread via GitHub


LadyForest commented on PR #23751:
URL: https://github.com/apache/flink/pull/23751#issuecomment-1818108433

   > cc @lincoln-lil , @LadyForest , could you help to review this?
   
   Sure. I'll take a look ASAP.





Re: [PR] [FLINK-33583][table] support state ttl hint for join [flink]

2023-11-19 Thread via GitHub


flinkbot commented on PR #23752:
URL: https://github.com/apache/flink/pull/23752#issuecomment-1818110177

   
   ## CI report:
   
   * 1808b1bd3c924f93bcec8f2e9e33fb3601c3a690 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Assigned] (FLINK-33583) Support state ttl hint for join

2023-11-19 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan reassigned FLINK-33583:
-

Assignee: xuyang

> Support state ttl hint for join 
> 
>
> Key: FLINK-33583
> URL: https://issues.apache.org/jira/browse/FLINK-33583
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API, Table SQL / Planner
>Affects Versions: 1.19.0
>Reporter: xuyang
>Assignee: xuyang
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Updated] (FLINK-33583) Support state TTL hint for regular join

2023-11-19 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-33583:
--
Summary: Support state TTL hint for regular join  (was: Support state ttl 
hint for join )

> Support state TTL hint for regular join
> ---
>
> Key: FLINK-33583
> URL: https://issues.apache.org/jira/browse/FLINK-33583
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API, Table SQL / Planner
>Affects Versions: 1.19.0
>Reporter: xuyang
>Assignee: xuyang
>Priority: Major
>  Labels: pull-request-available
>






Re: [PR] [FLINK-33588][Flink-Runtime] Fix NullArgumentException in DescriptiveStatisticsHistogramStatistics [flink]

2023-11-19 Thread via GitHub


zhutong6688 commented on PR #23749:
URL: https://github.com/apache/flink/pull/23749#issuecomment-1818145860

   issue link is https://issues.apache.org/jira/browse/FLINK-33588





[jira] [Commented] (FLINK-20896) Support SupportsAggregatePushDown for JDBC TableSource

2023-11-19 Thread Jane Chan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787792#comment-17787792
 ] 

Jane Chan commented on FLINK-20896:
---

Hi [~shared_ptr], I was wondering if there have been any updates.

> Support SupportsAggregatePushDown for JDBC TableSource
> --
>
> Key: FLINK-20896
> URL: https://issues.apache.org/jira/browse/FLINK-20896
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / JDBC, Table SQL / Ecosystem
>Reporter: Sebastian Liu
>Priority: Major
>  Labels: auto-unassigned
>
> Will add SupportsAggregatePushDown implementation for JDBC TableSource.





[PR] [hotfix][docs] Fix the messed navigation bar on Flink’s official website [flink]

2023-11-19 Thread via GitHub


KarmaGYZ opened a new pull request, #23753:
URL: https://github.com/apache/flink/pull/23753

   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   





Re: [PR] [hotfix][docs] Fix the messed navigation bar on Flink’s official website [flink]

2023-11-19 Thread via GitHub


flinkbot commented on PR #23753:
URL: https://github.com/apache/flink/pull/23753#issuecomment-1818168351

   
   ## CI report:
   
   * e9cebf147e9776dcb267087d764a901520239217 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Updated] (FLINK-33593) Left title bar of the official documentation for the master branch is misaligned

2023-11-19 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu updated FLINK-33593:
--
Description: 
The left title bar of the official documentation for the Master branch is 
misaligned, but the 1.18 branch is normal, so I'm guessing there's something 
wrong with this and we should fix it.

!image-2023-11-20-11-34-07-084.png!

  was:
The left title bar of the official documentation for the Master branch is 
misaligned, but the 1.18 branch is normal, so I'm guessing there's something 
wrong with this and I should fix it.

!image-2023-11-20-11-34-07-084.png!


> Left title bar of the official documentation for the master branch is 
> misaligned
> 
>
> Key: FLINK-33593
> URL: https://issues.apache.org/jira/browse/FLINK-33593
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.19.0
>Reporter: dalongliu
>Priority: Major
> Attachments: image-2023-11-20-11-34-07-084.png
>
>
> The left title bar of the official documentation for the Master branch is 
> misaligned, but the 1.18 branch is normal, so I'm guessing there's something 
> wrong with this and we should fix it.
> !image-2023-11-20-11-34-07-084.png!





[jira] [Created] (FLINK-33593) Left title bar of the official documentation for the master branch is misaligned

2023-11-19 Thread dalongliu (Jira)
dalongliu created FLINK-33593:
-

 Summary: Left title bar of the official documentation for the 
master branch is misaligned
 Key: FLINK-33593
 URL: https://issues.apache.org/jira/browse/FLINK-33593
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.19.0
Reporter: dalongliu
 Attachments: image-2023-11-20-11-34-07-084.png

The left title bar of the official documentation for the Master branch is 
misaligned, but the 1.18 branch is normal, so I'm guessing there's something 
wrong with this and I should fix it.

!image-2023-11-20-11-34-07-084.png!





Re: [PR] [hotfix][docs] Fix the messed navigation bar on Flink’s official website [flink]

2023-11-19 Thread via GitHub


KarmaGYZ merged PR #23753:
URL: https://github.com/apache/flink/pull/23753





[PR] [FLINK-32871][SQL] Support BuiltInMethod TO_TIMESTAMP with timezone options [flink]

2023-11-19 Thread via GitHub


xuzifu666 opened a new pull request, #23754:
URL: https://github.com/apache/flink/pull/23754

   …ptions
   
   
   





[jira] [Commented] (FLINK-33315) Optimize memory usage of large StreamOperator

2023-11-19 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787823#comment-17787823
 ] 

Rui Fan commented on FLINK-33315:
-

All 3 subtasks were merged some time ago, and since these improvements our 
Flink batch jobs with large operators have been running well, so I am closing 
this JIRA.

Many thanks to everyone who helped with the review.

> Optimize memory usage of large StreamOperator
> -
>
> Key: FLINK-33315
> URL: https://issues.apache.org/jira/browse/FLINK-33315
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration, Runtime / Task
>Affects Versions: 1.17.0, 1.18.0
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
> Attachments: 
> 130f436613b52b321bd9bd0211dd109f0b010220e860f292a13c0702016976850466192b.png,
>  image-2023-10-19-16-28-16-077.png
>
>
> Some of our batch jobs were upgraded from flink-1.15 to flink-1.17, and the TMs 
> always fail with java.lang.OutOfMemoryError: Java heap space.
>  
> Here is an example: a Hive table with a lot of data, where 
> HiveSource#partitionBytes is 281MB.
> After analysis, the root cause is that TM maintains the big object with 3 
> replicas:
>  * Replica_1: SourceOperatorFactory (it's necessary for running task)
>  * Replica_2: Temporarily generate the duplicate SourceOperatorFactory object.
>  ** It's introduced in FLINK-30536 (1.17), it's not necessary. ([code 
> link|https://github.com/apache/flink/blob/c2e14ff411e806f9ccf176c85eb8249b8ff12e56/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/OperatorChain.java#L646])
>  ** When creating a successor operator to a SourceOperator, the call stack is:
>  *** OperatorChain#createOperatorChain ->
>  *** wrapOperatorIntoOutput ->
>  *** getOperatorRecordsOutCounter ->
>  *** operatorConfig.getStreamOperatorFactory(userCodeClassloader)
>  ** It will generate the SourceOperatorFactory temporarily and just check 
> whether it's SinkWriterOperatorFactory
>  * Replica_3: The value of StreamConfig#SERIALIZEDUDF
>  ** It is used to generate SourceOperatorFactory.
>  ** Now the value is always maintained in heap memory.
>  ** However, after generating we can release it or store it in the disk if 
> needed.
>  *** We can define a threshold, when the value size is less than threshold, 
> the release strategy doesn't take effect.
>  ** If so, we can save a lot of heap memory.
> These three replicas use about 800MB of memory. Please note that this is for 
> just one subtask. Since each TM has 4 slots, it runs 4 HiveSources at the same 
> time, so 12 replicas are kept in TM memory, about 3.3 GB in total.
> These large objects in the JVM cannot be reclaimed, causing the TM to OOM 
> frequently.
> This JIRA focuses on optimizing Replica_2 and Replica_3.
>  
> !image-2023-10-19-16-28-16-077.png!
>  
> !https://f.haiserve.com/download/130f436613b52b321bd9bd0211dd109f0b010220e860f292a13c0702016976850466192b?userid=146850&token=4e7b7352b30d6e5d2dd2bb7a7479fc93!
>  
>  
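
The memory estimate in the description can be checked with a small calculation. This is only a sketch: `ReplicaMemorySketch` is a name made up for this example, and it assumes that "MB" in the report means MiB and that every replica is the same size as HiveSource#partitionBytes.

```java
// Hypothetical helper for re-deriving the figures in the report; not part of Flink.
final class ReplicaMemorySketch {
    static final long MIB = 1024L * 1024L;

    // Heap held by one subtask: each replica is assumed to hold a full copy.
    static long perSubtaskBytes(long objectBytes, int replicas) {
        return objectBytes * replicas;
    }

    // Heap held by one TaskManager when every slot runs one such subtask.
    static long perTaskManagerBytes(long objectBytes, int replicas, int slots) {
        return perSubtaskBytes(objectBytes, replicas) * slots;
    }

    public static void main(String[] args) {
        long objectBytes = 281L * MIB; // HiveSource#partitionBytes from the report
        long perSubtask = perSubtaskBytes(objectBytes, 3);        // 3 replicas
        long perTm = perTaskManagerBytes(objectBytes, 3, 4);      // 4 slots per TM
        System.out.println("per subtask (MiB): " + perSubtask / MIB);
        System.out.println("per TM (MiB): " + perTm / MIB);
    }
}
```

Under these assumptions one subtask holds 3 x 281 = 843 MiB and one TM holds 12 x 281 = 3372 MiB, which is in line with the "about 800MB" and "about 3.3 GB" quoted above.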





[jira] [Resolved] (FLINK-33315) Optimize memory usage of large StreamOperator

2023-11-19 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan resolved FLINK-33315.
-
Fix Version/s: 1.19.0
   Resolution: Fixed

> Optimize memory usage of large StreamOperator
> -
>
> Key: FLINK-33315
> URL: https://issues.apache.org/jira/browse/FLINK-33315
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration, Runtime / Task
>Affects Versions: 1.17.0, 1.18.0
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
> Fix For: 1.19.0
>
> Attachments: 
> 130f436613b52b321bd9bd0211dd109f0b010220e860f292a13c0702016976850466192b.png,
>  image-2023-10-19-16-28-16-077.png
>
>





Re: [PR] [FLINK-32871][SQL] Support BuiltInMethod TO_TIMESTAMP with timezone options [flink]

2023-11-19 Thread via GitHub


xuzifu666 commented on PR #23754:
URL: https://github.com/apache/flink/pull/23754#issuecomment-1818298330

   @leonardBang 





[jira] [Commented] (FLINK-33531) Nightly Python fails with NPE at metadataHandlerProvider on AZP (StreamDependencyTests.test_add_python_archive)

2023-11-19 Thread Xingbo Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787824#comment-17787824
 ] 

Xingbo Huang commented on FLINK-33531:
--

After doing some experiments, I came to the following conclusions:
1. In a Python 3.9 + Cython 0.29.36 environment, the `test_dependency.py` test 
fails consistently in my private Azure pipeline, although I don't think the 
Python and Cython versions have anything to do with this test failure.
2. After changing the Python or Cython version used for this test, the failure 
no longer appears.
3. The problem cannot be reproduced locally using the same versions of all 
packages, such as Python and Cython.
4. After reverting the commit that may have caused the problem, the case still 
fails in Azure. (I didn't revert all the commits because I don't think they are 
the root cause.)

My preferred solution right now is to upgrade Cython to address testing issues 
caused by the Azure environment.

> Nightly Python fails with NPE at metadataHandlerProvider on AZP 
> (StreamDependencyTests.test_add_python_archive)
> ---
>
> Key: FLINK-33531
> URL: https://issues.apache.org/jira/browse/FLINK-33531
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.19.0
>Reporter: Sergey Nuyanzin
>Assignee: Xingbo Huang
>Priority: Blocker
>  Labels: test-stability
>
> It seems starting 02.11.2023 every master nightly fails with this (that's why 
> it is a blocker)
> for instance
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54512&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=c67e71ed-6451-5d26-8920-5a8cf9651901]
> {noformat}
> 2023-11-12T02:10:24.5082784Z Nov 12 02:10:24 if is_error(answer)[0]:
> 2023-11-12T02:10:24.5083620Z Nov 12 02:10:24 if len(answer) > 1:
> 2023-11-12T02:10:24.5084326Z Nov 12 02:10:24 type = answer[1]
> 2023-11-12T02:10:24.5085164Z Nov 12 02:10:24 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
> 2023-11-12T02:10:24.5086061Z Nov 12 02:10:24 if answer[1] == REFERENCE_TYPE:
> 2023-11-12T02:10:24.5086850Z Nov 12 02:10:24 >   raise Py4JJavaError(
> 2023-11-12T02:10:24.5087677Z Nov 12 02:10:24 "An error occurred while calling {0}{1}{2}.\n".
> 2023-11-12T02:10:24.5088538Z Nov 12 02:10:24 format(target_id, ".", name), value)
> 2023-11-12T02:10:24.5089551Z Nov 12 02:10:24 E   py4j.protocol.Py4JJavaError: An error occurred while calling o3371.executeInsert.
> 2023-11-12T02:10:24.5090832Z Nov 12 02:10:24 E   : java.lang.NullPointerException: metadataHandlerProvider
> 2023-11-12T02:10:24.5091832Z Nov 12 02:10:24 E   at java.util.Objects.requireNonNull(Objects.java:228)
> 2023-11-12T02:10:24.5093399Z Nov 12 02:10:24 E   at org.apache.calcite.rel.metadata.RelMetadataQueryBase.getMetadataHandlerProvider(RelMetadataQueryBase.java:122)
> 2023-11-12T02:10:24.5094480Z Nov 12 02:10:24 E   at org.apache.calcite.rel.metadata.RelMetadataQueryBase.revise(RelMetadataQueryBase.java:118)
> 2023-11-12T02:10:24.5095365Z Nov 12 02:10:24 E   at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:844)
> 2023-11-12T02:10:24.5096306Z Nov 12 02:10:24 E   at org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:307)
> 2023-11-12T02:10:24.5097238Z Nov 12 02:10:24 E   at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:337)
> 2023-11-12T02:10:24.5098014Z Nov 12 02:10:24 E   at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
> 2023-11-12T02:10:24.5098753Z Nov 12 02:10:24 E   at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:420)
> 2023-11-12T02:10:24.5099517Z Nov 12 02:10:24 E   at org.apache.calcite.plan.hep.HepPlanner.executeRuleInstance(HepPlanner.java:243)
> 2023-11-12T02:10:24.5100373Z Nov 12 02:10:24 E   at org.apache.calcite.plan.hep.HepInstruction$RuleInstance$State.execute(HepInstruction.java:178)
> 2023-11-12T02:10:24.5101313Z Nov 12 02:10:24 E   at org.apache.calcite.plan.hep.HepPlanner.lambda$executeProgram$0(HepPlanner.java:211)
> 2023-11-12T02:10:24.5102410Z Nov 12 02:10:24 E   at org.apache.flink.calcite.shaded.com.google.common.collect.ImmutableList.forEach(Immu

[PR] [FLINK-33531][python] Remove cython upper bounds [flink]

2023-11-19 Thread via GitHub


HuangXingBo opened a new pull request, #23755:
URL: https://github.com/apache/flink/pull/23755

   ## What is the purpose of the change
   
   *This pull request will remove cython upper bounds*
   
   
   ## Brief change log
   
 - *remove cython upper bounds*
   
   ## Verifying this change
- *original tests*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33594][SQL] Support BuiltInMethod TO_TIMESTAMP with timezone options [flink]

2023-11-19 Thread via GitHub


flinkbot commented on PR #23754:
URL: https://github.com/apache/flink/pull/23754#issuecomment-1818300655

   
   ## CI report:
   
   * 4a6efba11f072edea167bda533e574ff772b9e83 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Created] (FLINK-33594) Support BuiltInMethod TO_TIMESTAMP with timezone options

2023-11-19 Thread xy (Jira)
xy created FLINK-33594:
--

 Summary: Support BuiltInMethod TO_TIMESTAMP with timezone options
 Key: FLINK-33594
 URL: https://issues.apache.org/jira/browse/FLINK-33594
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Affects Versions: 1.8.4
Reporter: xy








[jira] [Updated] (FLINK-33594) Support BuiltInMethod TO_TIMESTAMP with timezone options

2023-11-19 Thread xy (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xy updated FLINK-33594:
---
Description: 
Support a time zone option for the built-in TO_TIMESTAMP function. TO_TIMESTAMP 
currently only uses the UTC time zone, but many scenarios require choosing the 
time zone, so a PR is needed to support calls such as 
TO_TIMESTAMP('2023-08-10', '-MM-dd', 'Asia/Shanghai').

Presto, Trino, and StarRocks already cover this scenario, for example:

SELECT timestamp '2012-10-31 01:00 UTC' AT TIME ZONE 'America/Los_Angeles';
2012-10-30 18:00:00.000 America/Los_Angeles

So we may need the same capability in TO_TIMESTAMP.
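
To make the requested semantics concrete, here is a minimal sketch in plain Java (java.time only, no Flink APIs). The class and method names are hypothetical, the pattern `yyyy-MM-dd` is an assumption about what the description intends, and the sketch only handles date-only input. It shows the two behaviours mentioned above: interpreting a local date in a chosen zone, and rendering one instant in another zone as `AT TIME ZONE` does.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; this is not Flink's TO_TIMESTAMP implementation.
final class ToTimestampZoneSketch {

    // Parses a date-only string and interprets its midnight in the given zone,
    // returning the corresponding instant as epoch milliseconds.
    static long toEpochMillis(String text, String pattern, String zoneId) {
        LocalDate date = LocalDate.parse(text, DateTimeFormatter.ofPattern(pattern));
        return date.atStartOfDay(ZoneId.of(zoneId)).toInstant().toEpochMilli();
    }

    // Mirrors AT TIME ZONE: keeps the instant, returns the wall-clock time in another zone.
    static LocalDateTime atTimeZone(ZonedDateTime ts, String zoneId) {
        return ts.withZoneSameInstant(ZoneId.of(zoneId)).toLocalDateTime();
    }

    public static void main(String[] args) {
        // The same date string maps to different instants depending on the zone.
        System.out.println(toEpochMillis("2023-08-10", "yyyy-MM-dd", "Asia/Shanghai"));
        System.out.println(toEpochMillis("2023-08-10", "yyyy-MM-dd", "UTC"));

        // The Presto/Trino example from the description.
        ZonedDateTime utc = ZonedDateTime.of(2012, 10, 31, 1, 0, 0, 0, ZoneId.of("UTC"));
        System.out.println(atTimeZone(utc, "America/Los_Angeles"));
    }
}
```

The two results for the same date string differ by eight hours, which is the ambiguity an explicit zone argument would remove.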

> Support BuiltInMethod TO_TIMESTAMP with timezone options
> 
>
> Key: FLINK-33594
> URL: https://issues.apache.org/jira/browse/FLINK-33594
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.8.4
>Reporter: xy
>Priority: Major
>
> Support a time zone option for the built-in TO_TIMESTAMP function. TO_TIMESTAMP 
> currently only uses the UTC time zone, but many scenarios require choosing the 
> time zone, so a PR is needed to support calls such as 
> TO_TIMESTAMP('2023-08-10', '-MM-dd', 'Asia/Shanghai').
> Presto, Trino, and StarRocks already cover this scenario, for example:
> SELECT timestamp '2012-10-31 01:00 UTC' AT TIME ZONE 'America/Los_Angeles';
> 2012-10-30 18:00:00.000 America/Los_Angeles
> So we may need the same capability in TO_TIMESTAMP.





[jira] [Updated] (FLINK-33594) Support BuiltInMethod TO_TIMESTAMP with timezone options

2023-11-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33594:
---
Labels: pull-request-available  (was: )

> Support BuiltInMethod TO_TIMESTAMP with timezone options
> 
>
> Key: FLINK-33594
> URL: https://issues.apache.org/jira/browse/FLINK-33594
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.8.4
>Reporter: xy
>Priority: Major
>  Labels: pull-request-available
>
> Support a time zone option for the built-in TO_TIMESTAMP function. TO_TIMESTAMP 
> currently only uses the UTC time zone, but many scenarios require choosing the 
> time zone, so a PR is needed to support calls such as 
> TO_TIMESTAMP('2023-08-10', '-MM-dd', 'Asia/Shanghai').
> Presto, Trino, and StarRocks already cover this scenario, for example:
> SELECT timestamp '2012-10-31 01:00 UTC' AT TIME ZONE 'America/Los_Angeles';
> 2012-10-30 18:00:00.000 America/Los_Angeles
> So we may need the same capability in TO_TIMESTAMP.





[jira] [Updated] (FLINK-33531) Nightly Python fails with NPE at metadataHandlerProvider on AZP (StreamDependencyTests.test_add_python_archive)

2023-11-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33531:
---
Labels: pull-request-available test-stability  (was: test-stability)

> Nightly Python fails with NPE at metadataHandlerProvider on AZP 
> (StreamDependencyTests.test_add_python_archive)
> ---
>
> Key: FLINK-33531
> URL: https://issues.apache.org/jira/browse/FLINK-33531
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.19.0
>Reporter: Sergey Nuyanzin
>Assignee: Xingbo Huang
>Priority: Blocker
>  Labels: pull-request-available, test-stability
>
> It seems that since 02.11.2023 every master nightly has failed with this 
> error (that's why it is a blocker).
> For instance:
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54512&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=c67e71ed-6451-5d26-8920-5a8cf9651901]
> {noformat}
> 2023-11-12T02:10:24.5082784Z Nov 12 02:10:24 if is_error(answer)[0]:
> 2023-11-12T02:10:24.5083620Z Nov 12 02:10:24 if len(answer) > 1:
> 2023-11-12T02:10:24.5084326Z Nov 12 02:10:24 type = answer[1]
> 2023-11-12T02:10:24.5085164Z Nov 12 02:10:24 value = 
> OUTPUT_CONVERTER[type](answer[2:], gateway_client)
> 2023-11-12T02:10:24.5086061Z Nov 12 02:10:24 if answer[1] == 
> REFERENCE_TYPE:
> 2023-11-12T02:10:24.5086850Z Nov 12 02:10:24 >   raise 
> Py4JJavaError(
> 2023-11-12T02:10:24.5087677Z Nov 12 02:10:24 "An 
> error occurred while calling {0}{1}{2}.\n".
> 2023-11-12T02:10:24.5088538Z Nov 12 02:10:24 
> format(target_id, ".", name), value)
> 2023-11-12T02:10:24.5089551Z Nov 12 02:10:24 E   
> py4j.protocol.Py4JJavaError: An error occurred while calling 
> o3371.executeInsert.
> 2023-11-12T02:10:24.5090832Z Nov 12 02:10:24 E   : 
> java.lang.NullPointerException: metadataHandlerProvider
> 2023-11-12T02:10:24.5091832Z Nov 12 02:10:24 E   at 
> java.util.Objects.requireNonNull(Objects.java:228)
> 2023-11-12T02:10:24.5093399Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQueryBase.getMetadataHandlerProvider(RelMetadataQueryBase.java:122)
> 2023-11-12T02:10:24.5094480Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQueryBase.revise(RelMetadataQueryBase.java:118)
> 2023-11-12T02:10:24.5095365Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:844)
> 2023-11-12T02:10:24.5096306Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:307)
> 2023-11-12T02:10:24.5097238Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:337)
> 2023-11-12T02:10:24.5098014Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
> 2023-11-12T02:10:24.5098753Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:420)
> 2023-11-12T02:10:24.5099517Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.executeRuleInstance(HepPlanner.java:243)
> 2023-11-12T02:10:24.5100373Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleInstance$State.execute(HepInstruction.java:178)
> 2023-11-12T02:10:24.5101313Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.lambda$executeProgram$0(HepPlanner.java:211)
> 2023-11-12T02:10:24.5102410Z Nov 12 02:10:24 E   at 
> org.apache.flink.calcite.shaded.com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422)
> 2023-11-12T02:10:24.5103343Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:210)
> 2023-11-12T02:10:24.5104105Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepProgram$State.execute(HepProgram.java:118)
> 2023-11-12T02:10:24.5104868Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:205)
> 2023-11-12T02:10:24.5105616Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:191)
> 2023-11-12T02:10:24.5106421Z Nov 12 02:10:24 E   at 
> org.apache.flink.table.planner.plan.optim

Re: [PR] [FLINK-33531][python] Remove cython upper bounds [flink]

2023-11-19 Thread via GitHub


flinkbot commented on PR #23755:
URL: https://github.com/apache/flink/pull/23755#issuecomment-1818305522

   
   ## CI report:
   
   * 8630ff975634fae60f3562f52f5b7110d69741a1 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Commented] (FLINK-33568) SinkV2MetricsITCase.testCommitterMetrics fails with NullPointerException

2023-11-19 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787827#comment-17787827
 ] 

Sergey Nuyanzin commented on FLINK-33568:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54679&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba

> SinkV2MetricsITCase.testCommitterMetrics fails with NullPointerException
> 
>
> Key: FLINK-33568
> URL: https://issues.apache.org/jira/browse/FLINK-33568
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Priority: Critical
>  Labels: pull-request-available
> Attachments: Screenshot 2023-11-16 at 22.30.51.png
>
>
> {code}
> Nov 16 01:48:57 01:48:57.537 [ERROR] Tests run: 2, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 6.023 s <<< FAILURE! - in 
> org.apache.flink.test.streaming.runtime.SinkV2MetricsITCase
> Nov 16 01:48:57 01:48:57.538 [ERROR] 
> org.apache.flink.test.streaming.runtime.SinkV2MetricsITCase.testCommitterMetrics
>   Time elapsed: 0.745 s  <<< ERROR!
> Nov 16 01:48:57 java.lang.NullPointerException
> Nov 16 01:48:57   at 
> org.apache.flink.test.streaming.runtime.SinkV2MetricsITCase.assertSinkCommitterMetrics(SinkV2MetricsITCase.java:254)
> Nov 16 01:48:57   at 
> org.apache.flink.test.streaming.runtime.SinkV2MetricsITCase.testCommitterMetrics(SinkV2MetricsITCase.java:153)
> Nov 16 01:48:57   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> [...]
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54602&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=ea7cf968-e585-52cb-e0fc-f48de023a7ca&l=8546
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54602&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8605





[jira] [Updated] (FLINK-33502) HybridShuffleITCase caused a fatal error

2023-11-19 Thread Wencong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wencong Liu updated FLINK-33502:

Attachment: image-2023-11-20-14-37-37-321.png

> HybridShuffleITCase caused a fatal error
> 
>
> Key: FLINK-33502
> URL: https://issues.apache.org/jira/browse/FLINK-33502
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: test-stability
> Attachments: image-2023-11-20-14-37-37-321.png
>
>
> [https://github.com/XComp/flink/actions/runs/6789774296/job/18458197040#step:12:9177]
> {code:java}
> Error: 21:21:35 21:21:35.379 [ERROR] Error occurred in starting fork, check 
> output in log
> 9168Error: 21:21:35 21:21:35.379 [ERROR] Process Exit Code: 239
> 9169Error: 21:21:35 21:21:35.379 [ERROR] Crashed tests:
> 9170Error: 21:21:35 21:21:35.379 [ERROR] 
> org.apache.flink.test.runtime.HybridShuffleITCase
> 9171Error: 21:21:35 21:21:35.379 [ERROR] 
> org.apache.maven.surefire.booter.SurefireBooterForkException: 
> ExecutionException The forked VM terminated without properly saying goodbye. 
> VM crash or System.exit called?
> 9172Error: 21:21:35 21:21:35.379 [ERROR] Command was /bin/sh -c cd 
> /root/flink/flink-tests && /usr/lib/jvm/jdk-11.0.19+7/bin/java -XX:+UseG1GC 
> -Xms256m -XX:+IgnoreUnrecognizedVMOptions 
> --add-opens=java.base/java.util=ALL-UNNAMED 
> --add-opens=java.base/java.io=ALL-UNNAMED -Xmx1536m -jar 
> /root/flink/flink-tests/target/surefire/surefirebooter10811559899200556131.jar
>  /root/flink/flink-tests/target/surefire 2023-11-07T20-32-50_466-jvmRun4 
> surefire6242806641230738408tmp surefire_1603959900047297795160tmp
> 9173Error: 21:21:35 21:21:35.379 [ERROR] Error occurred in starting fork, 
> check output in log
> 9174Error: 21:21:35 21:21:35.379 [ERROR] Process Exit Code: 239
> 9175Error: 21:21:35 21:21:35.379 [ERROR] Crashed tests:
> 9176Error: 21:21:35 21:21:35.379 [ERROR] 
> org.apache.flink.test.runtime.HybridShuffleITCase
> 9177Error: 21:21:35 21:21:35.379 [ERROR]  at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:532)
> 9178Error: 21:21:35 21:21:35.379 [ERROR]  at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:479)
> 9179Error: 21:21:35 21:21:35.379 [ERROR]  at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:322)
> 9180Error: 21:21:35 21:21:35.379 [ERROR]  at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:266)
> [...] {code}





[jira] [Commented] (FLINK-33502) HybridShuffleITCase caused a fatal error

2023-11-19 Thread Wencong Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787828#comment-17787828
 ] 

Wencong Liu commented on FLINK-33502:
-

Thank you for your detailed reply. I am currently trying to download the build 
artifacts for the corresponding stage. 
However, I noticed that the log collection downloaded using the method shown in 
the figure is different from the logs-ci-test_ci_tests-1699014739.zip that you 
mentioned.
!image-2023-11-20-14-37-37-321.png|width=839,height=434!  
Could you please advise me on how to download 
logs-ci-test_ci_tests-1699014739.zip?






[jira] [Commented] (FLINK-33531) Nightly Python fails with NPE at metadataHandlerProvider on AZP (StreamDependencyTests.test_add_python_archive)

2023-11-19 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787829#comment-17787829
 ] 

Sergey Nuyanzin commented on FLINK-33531:
-

[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54679&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=c67e71ed-6451-5d26-8920-5a8cf9651901]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54679&view=logs&j=821b528f-1eed-5598-a3b4-7f748b13f261&t=6bb545dd-772d-5d8c-f258-f5085fba3295]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54679&view=logs&j=bf5e383b-9fd3-5f02-ca1c-8f788e2e76d3&t=85189c57-d8a0-5c9c-b61d-fc05cfac62cf]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54679&view=logs&j=e92ecf6d-e207-5a42-7ff7-528ff0c5b259&t=40fc352e-9b4c-5fd8-363f-628f24b01ec2]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54679&view=logs&j=b53e1644-5cb4-5a3b-5d48-f523f39bcf06&t=b68c9f5c-04c9-5c75-3862-a3a27aabbce3]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54679&view=logs&j=3e4dd1a2-fe2f-5e5d-a581-48087e718d53&t=b4612f28-e3b5-5853-8a8b-610ae894217a]


[jira] [Commented] (FLINK-33502) HybridShuffleITCase caused a fatal error

2023-11-19 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787830#comment-17787830
 ] 

Sergey Nuyanzin commented on FLINK-33502:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54679&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9






[jira] [Resolved] (FLINK-30235) Comprehensive benchmarks on changelog checkpointing

2023-11-19 Thread Hangxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hangxiang Yu resolved FLINK-30235.
--
Fix Version/s: 1.17.0
 Assignee: Rui Xia
   Resolution: Fixed

Resolved this in 1.17.0
See the [blog 
post|https://www.ververica.com/blog/generic-log-based-incremental-checkpoint] 
for more details about the benchmark results.

> Comprehensive benchmarks on changelog checkpointing
> ---
>
> Key: FLINK-30235
> URL: https://issues.apache.org/jira/browse/FLINK-30235
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Reporter: Rui Xia
>Assignee: Rui Xia
>Priority: Minor
>  Labels: performance
> Fix For: 1.17.0
>
>
> Changelog checkpointing is functionally usable right now. To make it a 
> production-ready feature, more comprehensive benchmarks are required. In this 
> issue, I aim to address the following two major concerns:
>  * The expansion of full checkpoint size caused by changelog persistence;
>  * The TPS regression caused by the DSTL double-write.
> In addition, I will present other metrics related to checkpointing.





[jira] [Comment Edited] (FLINK-32972) TaskTest.testInterruptibleSharedLockInInvokeAndCancel causes a JVM shutdown with exit code 239

2023-11-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759513#comment-17759513
 ] 

Matthias Pohl edited comment on FLINK-32972 at 11/20/23 7:24 AM:
-

1.17: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52682&view=logs&j=4d4a0d10-fca2-5507-8eed-c07f0bdf4887&t=7b25afdf-cc6c-566f-5459-359dc2585798&l=8716]


was (Author: sergey nuyanzin):
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52682&view=logs&j=4d4a0d10-fca2-5507-8eed-c07f0bdf4887&t=7b25afdf-cc6c-566f-5459-359dc2585798&l=8716

> TaskTest.testInterruptibleSharedLockInInvokeAndCancel causes a JVM shutdown 
> with exit code 239
> --
>
> Key: FLINK-32972
> URL: https://issues.apache.org/jira/browse/FLINK-32972
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2
>Reporter: Sergey Nuyanzin
>Priority: Major
>  Labels: test-stability
>
> Within this build 
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52668&view=logs&j=b0a398c0-685b-599c-eb57-c8c2a771138e&t=747432ad-a576-5911-1e2a-68c6bedc248a&l=8677]
> it looks like task 
> {{1ec32305eb0f926acae926007429c142__0_0}} was 
> canceled
> {noformat}
> 
> Test 
> testInterruptibleSharedLockInInvokeAndCancel(org.apache.flink.runtime.taskmanager.TaskTest)
>  is running.
> 
> 01:30:05,140 [main] INFO  
> org.apache.flink.runtime.io.network.NettyShuffleServiceFactory [] - Created a 
> new FileChannelManager for storing result partitions of BLOCKING shuffles. 
> Used directories:
>   /tmp/flink-netty-shuffle-82415974-782a-46db-afbc-8f18f30a4ec5
> 01:30:05,177 [main] INFO  
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool [] - Allocated 
> 32 MB for network buffer pool (number of memory segments: 1024, bytes per 
> segment: 32768).
> 01:30:05,181 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from CREATED to DEPLOYING.
> 01:30:05,190 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Loading JAR 
> files for task Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> [DEPLOYING].
> 01:30:05,192 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from DEPLOYING to INITIALIZING.
> 01:30:05,192 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from INITIALIZING to RUNNING.
> 01:30:05,195 [main] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Attempting 
> to cancel task Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0).
> 01:30:05,196 [main] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from RUNNING to CANCELING.
> 01:30:05,196 [main] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Triggering 
> cancellation of task code Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0).
> 01:30:05,197 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from CANCELING to CANCELED.
> 01:30:05,198 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Freeing 
> task resources for Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0).
> {noformat}
> and after that there are records in the logs complaining that the task did not react
> {noformat}
> 01:30:05,337 [Canceler/Interrupts for Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0).] 
> WARN  org.apache.flink.runtime.taskmanager.Task[] - Task 
> 'Test Task (1/1)#0' did not react to cancelling signal - interrupting; it is 
> stuck for 0

[jira] [Commented] (FLINK-25358) New File Sink end-to-end test failed due to TM could not connect to RM

2023-11-19 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787849#comment-17787849
 ] 

Sergey Nuyanzin commented on FLINK-25358:
-

Another instance, in test testShouldShutdownIfRegistrationWithJobManagerFails:
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54681&view=logs&j=4d4a0d10-fca2-5507-8eed-c07f0bdf4887&t=7b25afdf-cc6c-566f-5459-359dc2585798

> New File Sink end-to-end test failed due to TM could not connect to RM
> --
>
> Key: FLINK-25358
> URL: https://issues.apache.org/jira/browse/FLINK-25358
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Runtime / Coordination
>Affects Versions: 1.14.2
>Reporter: Yun Gao
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor, 
> test-stability
>
> {code:java}
> 2021-12-16T16:53:21.8386046Z Dec 16 16:53:15 2021-12-16 16:39:16,946 INFO  
> org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Start job 
> leader service.
> 2021-12-16T16:53:21.8386987Z Dec 16 16:53:15 2021-12-16 16:39:16,948 INFO  
> org.apache.flink.runtime.filecache.FileCache [] - User file 
> cache uses directory 
> /tmp/flink-dist-cache-3a780c9f-c395-4355-b496-ade62a0757ad
> 2021-12-16T16:53:21.8388219Z Dec 16 16:53:15 2021-12-16 16:39:16,960 INFO  
> org.apache.flink.runtime.taskexecutor.TaskExecutor   [] - Connecting 
> to ResourceManager 
> akka.ssl.tcp://flink@localhost:6123/user/rpc/resourcemanager_*().
> 2021-12-16T16:53:21.8389898Z Dec 16 16:53:15 2021-12-16 16:39:17,485 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/temp/_user_rpc_resourcemanager_*$a]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8391767Z Dec 16 16:53:15 2021-12-16 16:39:18,851 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8393631Z Dec 16 16:53:15 2021-12-16 16:39:20,374 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8395509Z Dec 16 16:53:15 2021-12-16 16:39:20,376 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8397548Z Dec 16 16:53:15 2021-12-16 16:39:27,018 INFO  
> org.apache.flink.runtime.taskexecutor.TaskExecutor   [] - Could not 
> resolve ResourceManager address 
> akka.ssl.tcp://flink@localhost:***@localhost:6123/user/rpc/resourcemanager_*.
> 2021-12-16T16:53:21.8399173Z Dec 16 16:53:15 2021-12-16 16:39:28,864 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8401223Z Dec 16 16:53:15 2021-12-16 16:39:30,357 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8403087Z Dec 16 16:53:15 2021-12-16 16:39:37,036 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/temp/_user_rpc_resourcemanager_*$b]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16

[jira] [Comment Edited] (FLINK-32972) TaskTest.testInterruptibleSharedLockInInvokeAndCancel causes a JVM shutdown with exit code 239

2023-11-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17785773#comment-17785773
 ] 

Matthias Pohl edited comment on FLINK-32972 at 11/20/23 7:24 AM:
-

1.17: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54198&view=logs&j=d89de3df-4600-5585-dadc-9bbc9a5e661c&t=be5a4b15-4b23-56b1-7582-795f58a645a2&l=8428]


was (Author: mapohl):
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54198&view=logs&j=d89de3df-4600-5585-dadc-9bbc9a5e661c&t=be5a4b15-4b23-56b1-7582-795f58a645a2&l=8428

> TaskTest.testInterruptibleSharedLockInInvokeAndCancel causes a JVM shutdown 
> with exit code 239
> --
>
> Key: FLINK-32972
> URL: https://issues.apache.org/jira/browse/FLINK-32972
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2
>Reporter: Sergey Nuyanzin
>Priority: Major
>  Labels: test-stability
>
> Within this build 
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52668&view=logs&j=b0a398c0-685b-599c-eb57-c8c2a771138e&t=747432ad-a576-5911-1e2a-68c6bedc248a&l=8677]
> it looks like task 
> {{1ec32305eb0f926acae926007429c142__0_0}} was 
> canceled
> {noformat}
> 
> Test 
> testInterruptibleSharedLockInInvokeAndCancel(org.apache.flink.runtime.taskmanager.TaskTest)
>  is running.
> 
> 01:30:05,140 [main] INFO  
> org.apache.flink.runtime.io.network.NettyShuffleServiceFactory [] - Created a 
> new FileChannelManager for storing result partitions of BLOCKING shuffles. 
> Used directories:
>   /tmp/flink-netty-shuffle-82415974-782a-46db-afbc-8f18f30a4ec5
> 01:30:05,177 [main] INFO  
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool [] - Allocated 
> 32 MB for network buffer pool (number of memory segments: 1024, bytes per 
> segment: 32768).
> 01:30:05,181 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from CREATED to DEPLOYING.
> 01:30:05,190 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Loading JAR 
> files for task Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> [DEPLOYING].
> 01:30:05,192 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from DEPLOYING to INITIALIZING.
> 01:30:05,192 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from INITIALIZING to RUNNING.
> 01:30:05,195 [main] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Attempting 
> to cancel task Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0).
> 01:30:05,196 [main] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from RUNNING to CANCELING.
> 01:30:05,196 [main] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Triggering 
> cancellation of task code Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0).
> 01:30:05,197 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from CANCELING to CANCELED.
> 01:30:05,198 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Freeing 
> task resources for Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0).
> {noformat}
> and after that there are records in the logs complaining that the task did not react
> {noformat}
> 01:30:05,337 [Canceler/Interrupts for Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0).] 
> WARN  org.apache.flink.runtime.taskmanager.Task[] - Task 
> 'Test Task (1/1)#0' did not react to cancelling signal - interrupting; it is 
> stuck for 0 seconds 

[jira] [Commented] (FLINK-32972) TaskTest.testInterruptibleSharedLockInInvokeAndCancel causes a JVM shutdown with exit code 239

2023-11-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787848#comment-17787848
 ] 

Matthias Pohl commented on FLINK-32972:
---

1.17: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54681&view=logs&j=4d4a0d10-fca2-5507-8eed-c07f0bdf4887&t=7b25afdf-cc6c-566f-5459-359dc2585798&l=8750

> TaskTest.testInterruptibleSharedLockInInvokeAndCancel causes a JVM shutdown 
> with exit code 239
> --
>
> Key: FLINK-32972
> URL: https://issues.apache.org/jira/browse/FLINK-32972
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2
>Reporter: Sergey Nuyanzin
>Priority: Major
>  Labels: test-stability
>
> Within this build 
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52668&view=logs&j=b0a398c0-685b-599c-eb57-c8c2a771138e&t=747432ad-a576-5911-1e2a-68c6bedc248a&l=8677]
> it looks like task 
> {{1ec32305eb0f926acae926007429c142__0_0}} was 
> canceled
> {noformat}
> 
> Test 
> testInterruptibleSharedLockInInvokeAndCancel(org.apache.flink.runtime.taskmanager.TaskTest)
>  is running.
> 
> 01:30:05,140 [main] INFO  
> org.apache.flink.runtime.io.network.NettyShuffleServiceFactory [] - Created a 
> new FileChannelManager for storing result partitions of BLOCKING shuffles. 
> Used directories:
>   /tmp/flink-netty-shuffle-82415974-782a-46db-afbc-8f18f30a4ec5
> 01:30:05,177 [main] INFO  
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool [] - Allocated 
> 32 MB for network buffer pool (number of memory segments: 1024, bytes per 
> segment: 32768).
> 01:30:05,181 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from CREATED to DEPLOYING.
> 01:30:05,190 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Loading JAR 
> files for task Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> [DEPLOYING].
> 01:30:05,192 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from DEPLOYING to INITIALIZING.
> 01:30:05,192 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from INITIALIZING to RUNNING.
> 01:30:05,195 [main] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Attempting 
> to cancel task Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0).
> 01:30:05,196 [main] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from RUNNING to CANCELING.
> 01:30:05,196 [main] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Triggering 
> cancellation of task code Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0).
> 01:30:05,197 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Test Task 
> (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0) 
> switched from CANCELING to CANCELED.
> 01:30:05,198 [   Test Task (1/1)#0] INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Freeing 
> task resources for Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0).
> {noformat}
> and after that there are records in the logs complaining that the task did not react
> {noformat}
> 01:30:05,337 [Canceler/Interrupts for Test Task (1/1)#0 
> (1ec32305eb0f926acae926007429c142__0_0).] 
> WARN  org.apache.flink.runtime.taskmanager.Task[] - Task 
> 'Test Task (1/1)#0' did not react to cancelling signal - interrupting; it is 
> stuck for 0 seconds in method:
>  
> app//org.apache.flink.runtime.metrics.groups.AbstractMetricGroup.close(AbstractMetricGroup.java:322)
> app//org.apache.flink.runtime.metrics.groups.AbstractMetricGroup.close(AbstractMetricGroup.java:327)
> app//org.apache.flink.runt

[jira] [Updated] (FLINK-25358) New File Sink end-to-end test failed due to TM could not connect to RM

2023-11-19 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin updated FLINK-25358:

Affects Version/s: 1.17.3

> New File Sink end-to-end test failed due to TM could not connect to RM
> --
>
> Key: FLINK-25358
> URL: https://issues.apache.org/jira/browse/FLINK-25358
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Runtime / Coordination
>Affects Versions: 1.14.2, 1.17.3
>Reporter: Yun Gao
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor, 
> test-stability
>
> {code:java}
> 2021-12-16T16:53:21.8386046Z Dec 16 16:53:15 2021-12-16 16:39:16,946 INFO  
> org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Start job 
> leader service.
> 2021-12-16T16:53:21.8386987Z Dec 16 16:53:15 2021-12-16 16:39:16,948 INFO  
> org.apache.flink.runtime.filecache.FileCache [] - User file 
> cache uses directory 
> /tmp/flink-dist-cache-3a780c9f-c395-4355-b496-ade62a0757ad
> 2021-12-16T16:53:21.8388219Z Dec 16 16:53:15 2021-12-16 16:39:16,960 INFO  
> org.apache.flink.runtime.taskexecutor.TaskExecutor   [] - Connecting 
> to ResourceManager 
> akka.ssl.tcp://flink@localhost:6123/user/rpc/resourcemanager_*().
> 2021-12-16T16:53:21.8389898Z Dec 16 16:53:15 2021-12-16 16:39:17,485 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/temp/_user_rpc_resourcemanager_*$a]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8391767Z Dec 16 16:53:15 2021-12-16 16:39:18,851 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8393631Z Dec 16 16:53:15 2021-12-16 16:39:20,374 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8395509Z Dec 16 16:53:15 2021-12-16 16:39:20,376 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8397548Z Dec 16 16:53:15 2021-12-16 16:39:27,018 INFO  
> org.apache.flink.runtime.taskexecutor.TaskExecutor   [] - Could not 
> resolve ResourceManager address 
> akka.ssl.tcp://flink@localhost:***@localhost:6123/user/rpc/resourcemanager_*.
> 2021-12-16T16:53:21.8399173Z Dec 16 16:53:15 2021-12-16 16:39:28,864 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8401223Z Dec 16 16:53:15 2021-12-16 16:39:30,357 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8403087Z Dec 16 16:53:15 2021-12-16 16:39:37,036 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/temp/_user_rpc_resourcemanager_*$b]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8404956Z Dec 16 16:53:15 2021-12-16 16:39:38,886 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl

Re: [PR] [FLINK-31597][table] Cleanup usage of deprecated TableEnvironment#registerFunction [flink]

2023-11-19 Thread via GitHub


snuyanzin commented on PR #23751:
URL: https://github.com/apache/flink/pull/23751#issuecomment-1818374977

   the failure is unrelated and the reason is FLINK-33568


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-31597][table] Cleanup usage of deprecated TableEnvironment#registerFunction [flink]

2023-11-19 Thread via GitHub


snuyanzin commented on PR #23751:
URL: https://github.com/apache/flink/pull/23751#issuecomment-1818375339

   @flinkbot run azure





[jira] [Created] (FLINK-33595) StreamDependencyTests.test_set_requirements_with_cached_directory failed with ModuleNotFoundError

2023-11-19 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33595:
-

 Summary: 
StreamDependencyTests.test_set_requirements_with_cached_directory failed with 
ModuleNotFoundError
 Key: FLINK-33595
 URL: https://issues.apache.org/jira/browse/FLINK-33595
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.18.0
Reporter: Matthias Pohl


[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54682&view=logs&j=b53e1644-5cb4-5a3b-5d48-f523f39bcf06&t=b68c9f5c-04c9-5c75-3862-a3a27aabbce3&l=24663]
{code:java}
[...]
Nov 18 02:53:13 E   ModuleNotFoundError: No module named 
'python_package1'
Nov 18 02:53:13 E   
Nov 18 02:53:13 E   at 
java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
Nov 18 02:53:13 E   at 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
Nov 18 02:53:13 E   at 
org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:61)
Nov 18 02:53:13 E   at 
org.apache.beam.runners.fnexecution.control.SdkHarnessClient$BundleProcessor$ActiveBundle.close(SdkHarnessClient.java:504)
Nov 18 02:53:13 E   at 
org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory$1.close(DefaultJobBundleFactory.java:555)
Nov 18 02:53:13 E   at 
org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.finishBundle(BeamPythonFunctionRunner.java:421)
Nov 18 02:53:13 E   ... 7 more
Nov 18 02:53:13 E   Caused by: java.lang.RuntimeException: 
Error received from SDK harness for instruction 1: Traceback (most recent call 
last):
Nov 18 02:53:13 E File 
"/__w/2/s/flink-python/.tox/py37-cython/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 295, in _execute
Nov 18 02:53:13 E   response = task()
Nov 18 02:53:13 E File 
"/__w/2/s/flink-python/.tox/py37-cython/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 370, in 
Nov 18 02:53:13 E   lambda: 
self.create_worker().do_instruction(request), request)
Nov 18 02:53:13 E File 
"/__w/2/s/flink-python/.tox/py37-cython/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 630, in do_instruction
Nov 18 02:53:13 E   getattr(request, request_type), 
request.instruction_id)
Nov 18 02:53:13 E File 
"/__w/2/s/flink-python/.tox/py37-cython/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 667, in process_bundle
Nov 18 02:53:13 E   
bundle_processor.process_bundle(instruction_id))
Nov 18 02:53:13 E File 
"/__w/2/s/flink-python/.tox/py37-cython/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 1062, in process_bundle
Nov 18 02:53:13 E   element.data)
Nov 18 02:53:13 E File 
"/__w/2/s/flink-python/.tox/py37-cython/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 231, in process_encoded
Nov 18 02:53:13 E   self.output(decoded_value)
Nov 18 02:53:13 E File 
"apache_beam/runners/worker/operations.py", line 526, in 
apache_beam.runners.worker.operations.Operation.output
Nov 18 02:53:13 E File 
"apache_beam/runners/worker/operations.py", line 528, in 
apache_beam.runners.worker.operations.Operation.output
Nov 18 02:53:13 E File 
"apache_beam/runners/worker/operations.py", line 237, in 
apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
Nov 18 02:53:13 E File 
"apache_beam/runners/worker/operations.py", line 240, in 
apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
Nov 18 02:53:13 E File 
"pyflink/fn_execution/beam/beam_operations_fast.pyx", line 169, in 
pyflink.fn_execution.beam.beam_operations_fast.FunctionOperation.process
Nov 18 02:53:13 E   with self.scoped_process_state:
Nov 18 02:53:13 E File 
"pyflink/fn_execution/beam/beam_operations_fast.pyx", line 196, in 
pyflink.fn_execution.beam.beam_operations_fast.FunctionOperation.process
Nov 18 02:53:13 E   
self.process_element(input_processor.next())
Nov 18 02:53:13 E File 
"/__w/2/s/flink-python/pyflink/fn_execution/table/operations.py", line 102, in 
process_element
Nov 18 02:53:13 E   return self.func(value)
Nov 18 02:53:13 E File "", line 1, in 
Nov 18 02:53:13 E File 
"/__w/2/s/flink-python/pyflink/table/tests/test_dependency.

[jira] [Commented] (FLINK-33595) StreamDependencyTests.test_set_requirements_with_cached_directory failed with ModuleNotFoundError

2023-11-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787852#comment-17787852
 ] 

Matthias Pohl commented on FLINK-33595:
---

I'm linking FLINK-26644 because it's the same test failure. The logs are 
different, though, so I assume that the two issues have different causes.

> StreamDependencyTests.test_set_requirements_with_cached_directory failed with 
> ModuleNotFoundError
> -
>
> Key: FLINK-33595
> URL: https://issues.apache.org/jira/browse/FLINK-33595
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.18.0
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: test-stability
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54682&view=logs&j=b53e1644-5cb4-5a3b-5d48-f523f39bcf06&t=b68c9f5c-04c9-5c75-3862-a3a27aabbce3&l=24663]
> {code:java}
> [...]
> Nov 18 02:53:13 E   ModuleNotFoundError: No module named 
> 'python_package1'
> Nov 18 02:53:13 E   
> Nov 18 02:53:13 E at 
> java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
> Nov 18 02:53:13 E at 
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
> Nov 18 02:53:13 E at 
> org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:61)
> Nov 18 02:53:13 E at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$BundleProcessor$ActiveBundle.close(SdkHarnessClient.java:504)
> Nov 18 02:53:13 E at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory$1.close(DefaultJobBundleFactory.java:555)
> Nov 18 02:53:13 E at 
> org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.finishBundle(BeamPythonFunctionRunner.java:421)
> Nov 18 02:53:13 E ... 7 more
> Nov 18 02:53:13 E   Caused by: java.lang.RuntimeException: 
> Error received from SDK harness for instruction 1: Traceback (most recent 
> call last):
> Nov 18 02:53:13 E File 
> "/__w/2/s/flink-python/.tox/py37-cython/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 295, in _execute
> Nov 18 02:53:13 E   response = task()
> Nov 18 02:53:13 E File 
> "/__w/2/s/flink-python/.tox/py37-cython/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 370, in 
> Nov 18 02:53:13 E   lambda: 
> self.create_worker().do_instruction(request), request)
> Nov 18 02:53:13 E File 
> "/__w/2/s/flink-python/.tox/py37-cython/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 630, in do_instruction
> Nov 18 02:53:13 E   getattr(request, request_type), 
> request.instruction_id)
> Nov 18 02:53:13 E File 
> "/__w/2/s/flink-python/.tox/py37-cython/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 667, in process_bundle
> Nov 18 02:53:13 E   
> bundle_processor.process_bundle(instruction_id))
> Nov 18 02:53:13 E File 
> "/__w/2/s/flink-python/.tox/py37-cython/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 1062, in process_bundle
> Nov 18 02:53:13 E   element.data)
> Nov 18 02:53:13 E File 
> "/__w/2/s/flink-python/.tox/py37-cython/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 231, in process_encoded
> Nov 18 02:53:13 E   self.output(decoded_value)
> Nov 18 02:53:13 E File 
> "apache_beam/runners/worker/operations.py", line 526, in 
> apache_beam.runners.worker.operations.Operation.output
> Nov 18 02:53:13 E File 
> "apache_beam/runners/worker/operations.py", line 528, in 
> apache_beam.runners.worker.operations.Operation.output
> Nov 18 02:53:13 E File 
> "apache_beam/runners/worker/operations.py", line 237, in 
> apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
> Nov 18 02:53:13 E File 
> "apache_beam/runners/worker/operations.py", line 240, in 
> apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
> Nov 18 02:53:13 E File 
> "pyflink/fn_execution/beam/beam_operations_fast.pyx", line 169, in 
> pyflink.fn_execution.beam.beam_operations_fast.FunctionOperation.process
> Nov 18 02:53:13 E   with self.scoped_process_state:
> Nov 18 02:53:13 E  

[jira] [Commented] (FLINK-26974) Python EmbeddedThreadDependencyTests.test_add_python_file failed on azure

2023-11-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787853#comment-17787853
 ] 

Matthias Pohl commented on FLINK-26974:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54682&view=logs&j=821b528f-1eed-5598-a3b4-7f748b13f261&t=6bb545dd-772d-5d8c-f258-f5085fba3295&l=24883

> Python EmbeddedThreadDependencyTests.test_add_python_file failed on azure
> -
>
> Key: FLINK-26974
> URL: https://issues.apache.org/jira/browse/FLINK-26974
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.15.0, 1.16.0, 1.17.0, 1.19.0
>Reporter: Yun Gao
>Assignee: Huang Xingbo
>Priority: Critical
>  Labels: auto-deprioritized-major, stale-assigned, test-stability
>
> {code:java}
> Mar 31 10:49:17 === FAILURES 
> ===
> Mar 31 10:49:17 __ 
> EmbeddedThreadDependencyTests.test_add_python_file __
> Mar 31 10:49:17 
> Mar 31 10:49:17 self = 
>  testMethod=test_add_python_file>
> Mar 31 10:49:17 
> Mar 31 10:49:17 def test_add_python_file(self):
> Mar 31 10:49:17 python_file_dir = os.path.join(self.tempdir, 
> "python_file_dir_" + str(uuid.uuid4()))
> Mar 31 10:49:17 os.mkdir(python_file_dir)
> Mar 31 10:49:17 python_file_path = os.path.join(python_file_dir, 
> "test_dependency_manage_lib.py")
> Mar 31 10:49:17 with open(python_file_path, 'w') as f:
> Mar 31 10:49:17 f.write("def add_two(a):\nraise 
> Exception('This function should not be called!')")
> Mar 31 10:49:17 self.t_env.add_python_file(python_file_path)
> Mar 31 10:49:17 
> Mar 31 10:49:17 python_file_dir_with_higher_priority = os.path.join(
> Mar 31 10:49:17 self.tempdir, "python_file_dir_" + 
> str(uuid.uuid4()))
> Mar 31 10:49:17 os.mkdir(python_file_dir_with_higher_priority)
> Mar 31 10:49:17 python_file_path_higher_priority = 
> os.path.join(python_file_dir_with_higher_priority,
> Mar 31 10:49:17 
> "test_dependency_manage_lib.py")
> Mar 31 10:49:17 with open(python_file_path_higher_priority, 'w') as f:
> Mar 31 10:49:17 f.write("def add_two(a):\nreturn a + 2")
> Mar 31 10:49:17 
> self.t_env.add_python_file(python_file_path_higher_priority)
> Mar 31 10:49:17 
> Mar 31 10:49:17 def plus_two(i):
> Mar 31 10:49:17 from test_dependency_manage_lib import add_two
> Mar 31 10:49:17 return add_two(i)
> Mar 31 10:49:17 
> Mar 31 10:49:17 self.t_env.create_temporary_system_function(
> Mar 31 10:49:17 "add_two", udf(plus_two, DataTypes.BIGINT(), 
> DataTypes.BIGINT()))
> Mar 31 10:49:17 table_sink = source_sink_utils.TestAppendSink(
> Mar 31 10:49:17 ['a', 'b'], [DataTypes.BIGINT(), 
> DataTypes.BIGINT()])
> Mar 31 10:49:17 self.t_env.register_table_sink("Results", table_sink)
> Mar 31 10:49:17 t = self.t_env.from_elements([(1, 2), (2, 5), (3, 
> 1)], ['a', 'b'])
> Mar 31 10:49:17 >   t.select(expr.call("add_two", t.a), 
> t.a).execute_insert("Results").wait()
> Mar 31 10:49:17 
> Mar 31 10:49:17 pyflink/table/tests/test_dependency.py:63: 
> Mar 31 10:49:17 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ _ _ _ _ _ _ _ _ 
> Mar 31 10:49:17 pyflink/table/table_result.py:76: in wait
> Mar 31 10:49:17 get_method(self._j_table_result, "await")()
> Mar 31 10:49:17 
> .tox/py38-cython/lib/python3.8/site-packages/py4j/java_gateway.py:1321: in 
> __call__
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=34001&view=logs&j=821b528f-1eed-5598-a3b4-7f748b13f261&t=6bb545dd-772d-5d8c-f258-f5085fba3295&l=27239





[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-11-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787855#comment-17787855
 ] 

Matthias Pohl commented on FLINK-18356:
---

This is a 1.18 build that failed: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54682&view=logs&j=a9db68b9-a7e0-54b6-0f98-010e0aff39e2&t=cdd32e0b-6047-565b-c58f-14054472f1be&l=11661]

We should provide backports for 499e56f138fb4e81cbb8810385cfb393d16ea454. I'm 
gonna go ahead and create them.

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0, 
> 1.19.0
>Reporter: Piotr Nowojski
>Assignee: Yunhong Zheng
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, 
> image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, 
> image-2023-07-11-19-35-54-530.png, image-2023-07-11-19-41-18-626.png, 
> image-2023-07-11-19-41-37-105.png
>
>
> {noformat}
> ============================= test session starts ==============================
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=8d78fe4f-d658-5c70-12f8-4921589024c3





[jira] [Commented] (FLINK-33531) Nightly Python fails with NPE at metadataHandlerProvider on AZP (StreamDependencyTests.test_add_python_archive)

2023-11-19 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787858#comment-17787858
 ] 

Sergey Nuyanzin commented on FLINK-33531:
-

[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54697&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=c67e71ed-6451-5d26-8920-5a8cf9651901]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54697&view=logs&j=821b528f-1eed-5598-a3b4-7f748b13f261&t=6bb545dd-772d-5d8c-f258-f5085fba3295]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54697&view=logs&j=bf5e383b-9fd3-5f02-ca1c-8f788e2e76d3&t=85189c57-d8a0-5c9c-b61d-fc05cfac62cf]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54697&view=logs&j=e92ecf6d-e207-5a42-7ff7-528ff0c5b259&t=40fc352e-9b4c-5fd8-363f-628f24b01ec2]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54697&view=logs&j=b53e1644-5cb4-5a3b-5d48-f523f39bcf06&t=b68c9f5c-04c9-5c75-3862-a3a27aabbce3]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54697&view=logs&j=3e4dd1a2-fe2f-5e5d-a581-48087e718d53&t=b4612f28-e3b5-5853-8a8b-610ae894217a]

> Nightly Python fails with NPE at metadataHandlerProvider on AZP 
> (StreamDependencyTests.test_add_python_archive)
> ---
>
> Key: FLINK-33531
> URL: https://issues.apache.org/jira/browse/FLINK-33531
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.19.0
>Reporter: Sergey Nuyanzin
>Assignee: Xingbo Huang
>Priority: Blocker
>  Labels: pull-request-available, test-stability
>
> It seems that since 02.11.2023 every master nightly has failed with this (which is
> why it is a blocker), for instance:
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54512&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=c67e71ed-6451-5d26-8920-5a8cf9651901]
> {noformat}
> 2023-11-12T02:10:24.5082784Z Nov 12 02:10:24 if is_error(answer)[0]:
> 2023-11-12T02:10:24.5083620Z Nov 12 02:10:24 if len(answer) > 1:
> 2023-11-12T02:10:24.5084326Z Nov 12 02:10:24 type = answer[1]
> 2023-11-12T02:10:24.5085164Z Nov 12 02:10:24 value = 
> OUTPUT_CONVERTER[type](answer[2:], gateway_client)
> 2023-11-12T02:10:24.5086061Z Nov 12 02:10:24 if answer[1] == 
> REFERENCE_TYPE:
> 2023-11-12T02:10:24.5086850Z Nov 12 02:10:24 >   raise 
> Py4JJavaError(
> 2023-11-12T02:10:24.5087677Z Nov 12 02:10:24 "An 
> error occurred while calling {0}{1}{2}.\n".
> 2023-11-12T02:10:24.5088538Z Nov 12 02:10:24 
> format(target_id, ".", name), value)
> 2023-11-12T02:10:24.5089551Z Nov 12 02:10:24 E   
> py4j.protocol.Py4JJavaError: An error occurred while calling 
> o3371.executeInsert.
> 2023-11-12T02:10:24.5090832Z Nov 12 02:10:24 E   : 
> java.lang.NullPointerException: metadataHandlerProvider
> 2023-11-12T02:10:24.5091832Z Nov 12 02:10:24 E   at 
> java.util.Objects.requireNonNull(Objects.java:228)
> 2023-11-12T02:10:24.5093399Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQueryBase.getMetadataHandlerProvider(RelMetadataQueryBase.java:122)
> 2023-11-12T02:10:24.5094480Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQueryBase.revise(RelMetadataQueryBase.java:118)
> 2023-11-12T02:10:24.5095365Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:844)
> 2023-11-12T02:10:24.5096306Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:307)
> 2023-11-12T02:10:24.5097238Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:337)
> 2023-11-12T02:10:24.5098014Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
> 2023-11-12T02:10:24.5098753Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:420)
> 2023-11-12T02:10:24.5099517Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.executeRuleInstance(HepPlanner.java:243)
> 2023-11-12T02:10:24.5100373Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleInstance$State.execute(HepInstruction.java:178)
> 2023-11-12T02:10:24.5101313Z Nov 12 02:10:24 E   at 
> org.

[jira] [Commented] (FLINK-33568) SinkV2MetricsITCase.testCommitterMetrics fails with NullPointerException

2023-11-19 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787859#comment-17787859
 ] 

Sergey Nuyanzin commented on FLINK-33568:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54697&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=0c010d0c-3dec-5bf1-d408-7b18988b1b2b

> SinkV2MetricsITCase.testCommitterMetrics fails with NullPointerException
> 
>
> Key: FLINK-33568
> URL: https://issues.apache.org/jira/browse/FLINK-33568
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Priority: Critical
>  Labels: pull-request-available
> Attachments: Screenshot 2023-11-16 at 22.30.51.png
>
>
> {code}
> Nov 16 01:48:57 01:48:57.537 [ERROR] Tests run: 2, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 6.023 s <<< FAILURE! - in 
> org.apache.flink.test.streaming.runtime.SinkV2MetricsITCase
> Nov 16 01:48:57 01:48:57.538 [ERROR] 
> org.apache.flink.test.streaming.runtime.SinkV2MetricsITCase.testCommitterMetrics
>   Time elapsed: 0.745 s  <<< ERROR!
> Nov 16 01:48:57 java.lang.NullPointerException
> Nov 16 01:48:57   at 
> org.apache.flink.test.streaming.runtime.SinkV2MetricsITCase.assertSinkCommitterMetrics(SinkV2MetricsITCase.java:254)
> Nov 16 01:48:57   at 
> org.apache.flink.test.streaming.runtime.SinkV2MetricsITCase.testCommitterMetrics(SinkV2MetricsITCase.java:153)
> Nov 16 01:48:57   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> [...]
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54602&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=ea7cf968-e585-52cb-e0fc-f48de023a7ca&l=8546
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54602&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8605





[jira] [Commented] (FLINK-33531) Nightly Python fails with NPE at metadataHandlerProvider on AZP (StreamDependencyTests.test_add_python_archive)

2023-11-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787862#comment-17787862
 ] 

Matthias Pohl commented on FLINK-33531:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54687

> Nightly Python fails with NPE at metadataHandlerProvider on AZP 
> (StreamDependencyTests.test_add_python_archive)
> ---
>
> Key: FLINK-33531
> URL: https://issues.apache.org/jira/browse/FLINK-33531
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.19.0
>Reporter: Sergey Nuyanzin
>Assignee: Xingbo Huang
>Priority: Blocker
>  Labels: pull-request-available, test-stability
>
> It seems that since 02.11.2023 every master nightly has failed with this (which is
> why it is a blocker), for instance:
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54512&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=c67e71ed-6451-5d26-8920-5a8cf9651901]
> {noformat}
> 2023-11-12T02:10:24.5082784Z Nov 12 02:10:24 if is_error(answer)[0]:
> 2023-11-12T02:10:24.5083620Z Nov 12 02:10:24 if len(answer) > 1:
> 2023-11-12T02:10:24.5084326Z Nov 12 02:10:24 type = answer[1]
> 2023-11-12T02:10:24.5085164Z Nov 12 02:10:24 value = 
> OUTPUT_CONVERTER[type](answer[2:], gateway_client)
> 2023-11-12T02:10:24.5086061Z Nov 12 02:10:24 if answer[1] == 
> REFERENCE_TYPE:
> 2023-11-12T02:10:24.5086850Z Nov 12 02:10:24 >   raise 
> Py4JJavaError(
> 2023-11-12T02:10:24.5087677Z Nov 12 02:10:24 "An 
> error occurred while calling {0}{1}{2}.\n".
> 2023-11-12T02:10:24.5088538Z Nov 12 02:10:24 
> format(target_id, ".", name), value)
> 2023-11-12T02:10:24.5089551Z Nov 12 02:10:24 E   
> py4j.protocol.Py4JJavaError: An error occurred while calling 
> o3371.executeInsert.
> 2023-11-12T02:10:24.5090832Z Nov 12 02:10:24 E   : 
> java.lang.NullPointerException: metadataHandlerProvider
> 2023-11-12T02:10:24.5091832Z Nov 12 02:10:24 E   at 
> java.util.Objects.requireNonNull(Objects.java:228)
> 2023-11-12T02:10:24.5093399Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQueryBase.getMetadataHandlerProvider(RelMetadataQueryBase.java:122)
> 2023-11-12T02:10:24.5094480Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQueryBase.revise(RelMetadataQueryBase.java:118)
> 2023-11-12T02:10:24.5095365Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:844)
> 2023-11-12T02:10:24.5096306Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:307)
> 2023-11-12T02:10:24.5097238Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:337)
> 2023-11-12T02:10:24.5098014Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
> 2023-11-12T02:10:24.5098753Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:420)
> 2023-11-12T02:10:24.5099517Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.executeRuleInstance(HepPlanner.java:243)
> 2023-11-12T02:10:24.5100373Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleInstance$State.execute(HepInstruction.java:178)
> 2023-11-12T02:10:24.5101313Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.lambda$executeProgram$0(HepPlanner.java:211)
> 2023-11-12T02:10:24.5102410Z Nov 12 02:10:24 E   at 
> org.apache.flink.calcite.shaded.com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422)
> 2023-11-12T02:10:24.5103343Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:210)
> 2023-11-12T02:10:24.5104105Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepProgram$State.execute(HepProgram.java:118)
> 2023-11-12T02:10:24.5104868Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:205)
> 2023-11-12T02:10:24.5105616Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:191)
> 2023-11-12T02:10:24.5106421Z Nov 12 02:10:24 E   

[jira] (FLINK-33531) Nightly Python fails with NPE at metadataHandlerProvider on AZP (StreamDependencyTests.test_add_python_archive)

2023-11-19 Thread Matthias Pohl (Jira)


[ https://issues.apache.org/jira/browse/FLINK-33531 ]


Matthias Pohl deleted comment on FLINK-33531:
---

was (Author: mapohl):
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54687

> Nightly Python fails with NPE at metadataHandlerProvider on AZP 
> (StreamDependencyTests.test_add_python_archive)
> ---
>
> Key: FLINK-33531
> URL: https://issues.apache.org/jira/browse/FLINK-33531
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.19.0
>Reporter: Sergey Nuyanzin
>Assignee: Xingbo Huang
>Priority: Blocker
>  Labels: pull-request-available, test-stability
>
> It seems that since 02.11.2023 every master nightly has failed with this (which is
> why it is a blocker), for instance:
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54512&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=c67e71ed-6451-5d26-8920-5a8cf9651901]
> {noformat}
> 2023-11-12T02:10:24.5082784Z Nov 12 02:10:24 if is_error(answer)[0]:
> 2023-11-12T02:10:24.5083620Z Nov 12 02:10:24 if len(answer) > 1:
> 2023-11-12T02:10:24.5084326Z Nov 12 02:10:24 type = answer[1]
> 2023-11-12T02:10:24.5085164Z Nov 12 02:10:24 value = 
> OUTPUT_CONVERTER[type](answer[2:], gateway_client)
> 2023-11-12T02:10:24.5086061Z Nov 12 02:10:24 if answer[1] == 
> REFERENCE_TYPE:
> 2023-11-12T02:10:24.5086850Z Nov 12 02:10:24 >   raise 
> Py4JJavaError(
> 2023-11-12T02:10:24.5087677Z Nov 12 02:10:24 "An 
> error occurred while calling {0}{1}{2}.\n".
> 2023-11-12T02:10:24.5088538Z Nov 12 02:10:24 
> format(target_id, ".", name), value)
> 2023-11-12T02:10:24.5089551Z Nov 12 02:10:24 E   
> py4j.protocol.Py4JJavaError: An error occurred while calling 
> o3371.executeInsert.
> 2023-11-12T02:10:24.5090832Z Nov 12 02:10:24 E   : 
> java.lang.NullPointerException: metadataHandlerProvider
> 2023-11-12T02:10:24.5091832Z Nov 12 02:10:24 E   at 
> java.util.Objects.requireNonNull(Objects.java:228)
> 2023-11-12T02:10:24.5093399Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQueryBase.getMetadataHandlerProvider(RelMetadataQueryBase.java:122)
> 2023-11-12T02:10:24.5094480Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQueryBase.revise(RelMetadataQueryBase.java:118)
> 2023-11-12T02:10:24.5095365Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:844)
> 2023-11-12T02:10:24.5096306Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:307)
> 2023-11-12T02:10:24.5097238Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:337)
> 2023-11-12T02:10:24.5098014Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
> 2023-11-12T02:10:24.5098753Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:420)
> 2023-11-12T02:10:24.5099517Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.executeRuleInstance(HepPlanner.java:243)
> 2023-11-12T02:10:24.5100373Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleInstance$State.execute(HepInstruction.java:178)
> 2023-11-12T02:10:24.5101313Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.lambda$executeProgram$0(HepPlanner.java:211)
> 2023-11-12T02:10:24.5102410Z Nov 12 02:10:24 E   at 
> org.apache.flink.calcite.shaded.com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422)
> 2023-11-12T02:10:24.5103343Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:210)
> 2023-11-12T02:10:24.5104105Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepProgram$State.execute(HepProgram.java:118)
> 2023-11-12T02:10:24.5104868Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:205)
> 2023-11-12T02:10:24.5105616Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:191)
> 2023-11-12T02:10:24.5106421Z Nov 12 02:10:24 E   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkHe

[jira] [Commented] (FLINK-33531) Nightly Python fails with NPE at metadataHandlerProvider on AZP (StreamDependencyTests.test_add_python_archive)

2023-11-19 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787861#comment-17787861
 ] 

Sergey Nuyanzin commented on FLINK-33531:
-

[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54687&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=c67e71ed-6451-5d26-8920-5a8cf9651901]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54687&view=logs&j=821b528f-1eed-5598-a3b4-7f748b13f261&t=6bb545dd-772d-5d8c-f258-f5085fba3295]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54687&view=logs&j=bf5e383b-9fd3-5f02-ca1c-8f788e2e76d3&t=85189c57-d8a0-5c9c-b61d-fc05cfac62cf]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54687&view=logs&j=e92ecf6d-e207-5a42-7ff7-528ff0c5b259&t=40fc352e-9b4c-5fd8-363f-628f24b01ec2]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54687&view=logs&j=b53e1644-5cb4-5a3b-5d48-f523f39bcf06&t=b68c9f5c-04c9-5c75-3862-a3a27aabbce3]
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54687&view=logs&j=3e4dd1a2-fe2f-5e5d-a581-48087e718d53&t=b4612f28-e3b5-5853-8a8b-610ae894217a]

> Nightly Python fails with NPE at metadataHandlerProvider on AZP 
> (StreamDependencyTests.test_add_python_archive)
> ---
>
> Key: FLINK-33531
> URL: https://issues.apache.org/jira/browse/FLINK-33531
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.19.0
>Reporter: Sergey Nuyanzin
>Assignee: Xingbo Huang
>Priority: Blocker
>  Labels: pull-request-available, test-stability
>
> It seems that since 02.11.2023 every master nightly has failed with this (which is
> why it is a blocker), for instance:
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54512&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=c67e71ed-6451-5d26-8920-5a8cf9651901]
> {noformat}
> 2023-11-12T02:10:24.5082784Z Nov 12 02:10:24 if is_error(answer)[0]:
> 2023-11-12T02:10:24.5083620Z Nov 12 02:10:24 if len(answer) > 1:
> 2023-11-12T02:10:24.5084326Z Nov 12 02:10:24 type = answer[1]
> 2023-11-12T02:10:24.5085164Z Nov 12 02:10:24 value = 
> OUTPUT_CONVERTER[type](answer[2:], gateway_client)
> 2023-11-12T02:10:24.5086061Z Nov 12 02:10:24 if answer[1] == 
> REFERENCE_TYPE:
> 2023-11-12T02:10:24.5086850Z Nov 12 02:10:24 >   raise 
> Py4JJavaError(
> 2023-11-12T02:10:24.5087677Z Nov 12 02:10:24 "An 
> error occurred while calling {0}{1}{2}.\n".
> 2023-11-12T02:10:24.5088538Z Nov 12 02:10:24 
> format(target_id, ".", name), value)
> 2023-11-12T02:10:24.5089551Z Nov 12 02:10:24 E   
> py4j.protocol.Py4JJavaError: An error occurred while calling 
> o3371.executeInsert.
> 2023-11-12T02:10:24.5090832Z Nov 12 02:10:24 E   : 
> java.lang.NullPointerException: metadataHandlerProvider
> 2023-11-12T02:10:24.5091832Z Nov 12 02:10:24 E   at 
> java.util.Objects.requireNonNull(Objects.java:228)
> 2023-11-12T02:10:24.5093399Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQueryBase.getMetadataHandlerProvider(RelMetadataQueryBase.java:122)
> 2023-11-12T02:10:24.5094480Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQueryBase.revise(RelMetadataQueryBase.java:118)
> 2023-11-12T02:10:24.5095365Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:844)
> 2023-11-12T02:10:24.5096306Z Nov 12 02:10:24 E   at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:307)
> 2023-11-12T02:10:24.5097238Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:337)
> 2023-11-12T02:10:24.5098014Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
> 2023-11-12T02:10:24.5098753Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:420)
> 2023-11-12T02:10:24.5099517Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepPlanner.executeRuleInstance(HepPlanner.java:243)
> 2023-11-12T02:10:24.5100373Z Nov 12 02:10:24 E   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleInstance$State.execute(HepInstruction.java:178)
> 2023-11-12T02:10:24.5101313Z Nov 12 02:10:24 E   at 
> org.

[jira] [Commented] (FLINK-33568) SinkV2MetricsITCase.testCommitterMetrics fails with NullPointerException

2023-11-19 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787863#comment-17787863
 ] 

Sergey Nuyanzin commented on FLINK-33568:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54687&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=ea7cf968-e585-52cb-e0fc-f48de023a7ca

> SinkV2MetricsITCase.testCommitterMetrics fails with NullPointerException
> 
>
> Key: FLINK-33568
> URL: https://issues.apache.org/jira/browse/FLINK-33568
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Priority: Critical
>  Labels: pull-request-available
> Attachments: Screenshot 2023-11-16 at 22.30.51.png
>
>
> {code}
> Nov 16 01:48:57 01:48:57.537 [ERROR] Tests run: 2, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 6.023 s <<< FAILURE! - in 
> org.apache.flink.test.streaming.runtime.SinkV2MetricsITCase
> Nov 16 01:48:57 01:48:57.538 [ERROR] 
> org.apache.flink.test.streaming.runtime.SinkV2MetricsITCase.testCommitterMetrics
>   Time elapsed: 0.745 s  <<< ERROR!
> Nov 16 01:48:57 java.lang.NullPointerException
> Nov 16 01:48:57   at 
> org.apache.flink.test.streaming.runtime.SinkV2MetricsITCase.assertSinkCommitterMetrics(SinkV2MetricsITCase.java:254)
> Nov 16 01:48:57   at 
> org.apache.flink.test.streaming.runtime.SinkV2MetricsITCase.testCommitterMetrics(SinkV2MetricsITCase.java:153)
> Nov 16 01:48:57   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> [...]
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54602&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=ea7cf968-e585-52cb-e0fc-f48de023a7ca&l=8546
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54602&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8605





[jira] [Commented] (FLINK-33502) HybridShuffleITCase caused a fatal error

2023-11-19 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787864#comment-17787864
 ] 

Sergey Nuyanzin commented on FLINK-33502:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54687&view=logs&j=2c3cbe13-dee0-5837-cf47-3053da9a8a78&t=b78d9d30-509a-5cea-1fef-db7abaa325ae

> HybridShuffleITCase caused a fatal error
> 
>
> Key: FLINK-33502
> URL: https://issues.apache.org/jira/browse/FLINK-33502
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: test-stability
> Attachments: image-2023-11-20-14-37-37-321.png
>
>
> [https://github.com/XComp/flink/actions/runs/6789774296/job/18458197040#step:12:9177]
> {code:java}
> Error: 21:21:35 21:21:35.379 [ERROR] Error occurred in starting fork, check 
> output in log
> 9168Error: 21:21:35 21:21:35.379 [ERROR] Process Exit Code: 239
> 9169Error: 21:21:35 21:21:35.379 [ERROR] Crashed tests:
> 9170Error: 21:21:35 21:21:35.379 [ERROR] 
> org.apache.flink.test.runtime.HybridShuffleITCase
> 9171Error: 21:21:35 21:21:35.379 [ERROR] 
> org.apache.maven.surefire.booter.SurefireBooterForkException: 
> ExecutionException The forked VM terminated without properly saying goodbye. 
> VM crash or System.exit called?
> 9172Error: 21:21:35 21:21:35.379 [ERROR] Command was /bin/sh -c cd 
> /root/flink/flink-tests && /usr/lib/jvm/jdk-11.0.19+7/bin/java -XX:+UseG1GC 
> -Xms256m -XX:+IgnoreUnrecognizedVMOptions 
> --add-opens=java.base/java.util=ALL-UNNAMED 
> --add-opens=java.base/java.io=ALL-UNNAMED -Xmx1536m -jar 
> /root/flink/flink-tests/target/surefire/surefirebooter10811559899200556131.jar
>  /root/flink/flink-tests/target/surefire 2023-11-07T20-32-50_466-jvmRun4 
> surefire6242806641230738408tmp surefire_1603959900047297795160tmp
> 9173Error: 21:21:35 21:21:35.379 [ERROR] Error occurred in starting fork, 
> check output in log
> 9174Error: 21:21:35 21:21:35.379 [ERROR] Process Exit Code: 239
> 9175Error: 21:21:35 21:21:35.379 [ERROR] Crashed tests:
> 9176Error: 21:21:35 21:21:35.379 [ERROR] 
> org.apache.flink.test.runtime.HybridShuffleITCase
> 9177Error: 21:21:35 21:21:35.379 [ERROR]  at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:532)
> 9178Error: 21:21:35 21:21:35.379 [ERROR]  at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:479)
> 9179Error: 21:21:35 21:21:35.379 [ERROR]  at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:322)
> 9180Error: 21:21:35 21:21:35.379 [ERROR]  at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:266)
> [...] {code}





[jira] [Commented] (FLINK-33414) MiniClusterITCase.testHandleStreamingJobsWhenNotEnoughSlot fails due to unexpected TimeoutException

2023-11-19 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787865#comment-17787865
 ] 

Sergey Nuyanzin commented on FLINK-33414:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54687&view=logs&j=77a9d8e1-d610-59b3-fc2a-4766541e0e33&t=125e07e7-8de0-5c6c-a541-a567415af3ef

> MiniClusterITCase.testHandleStreamingJobsWhenNotEnoughSlot fails due to 
> unexpected TimeoutException
> ---
>
> Key: FLINK-33414
> URL: https://issues.apache.org/jira/browse/FLINK-33414
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: github-actions, test-stability
>
> We see this test instability in [this 
> build|https://github.com/XComp/flink/actions/runs/6695266358/job/18192039035#step:12:9253].
> {code:java}
> Error: 17:04:52 17:04:52.042 [ERROR] Failures: 
> 9252Error: 17:04:52 17:04:52.042 [ERROR]   
> MiniClusterITCase.testHandleStreamingJobsWhenNotEnoughSlot:120 
> 9253Oct 30 17:04:52 Expecting a throwable with root cause being an instance 
> of:
> 9254Oct 30 17:04:52   
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException
> 9255Oct 30 17:04:52 but was an instance of:
> 9256Oct 30 17:04:52   java.util.concurrent.TimeoutException: Timeout has 
> occurred: 100 ms
> 9257Oct 30 17:04:52   at 
> org.apache.flink.runtime.jobmaster.slotpool.PhysicalSlotRequestBulkCheckerImpl.lambda$schedulePendingRequestBulkWithTimestampCheck$0(PhysicalSlotRequestBulkCheckerImpl.java:86)
> 9258Oct 30 17:04:52   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> 9259Oct 30 17:04:52   at 
> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> 9260Oct 30 17:04:52   ...(27 remaining lines not displayed - this can be 
> changed with Assertions.setMaxStackTraceElementsDisplayed) {code}
> The same error occurred in the [finegrained_resourcemanager stage of this 
> build|https://github.com/XComp/flink/actions/runs/6468655160/job/17563927249#step:11:26516]
>  (as reported in FLINK-33245).





[PR] [FLINK-18356] Update CI image [flink]

2023-11-19 Thread via GitHub


XComp opened a new pull request, #23756:
URL: https://github.com/apache/flink/pull/23756

   ## What is the purpose of the change
   
   1.18 backport PR for parent PR #23717
   
   ## Brief change log
   
   I also backport the change for the docs build for the sake of consistency 
even though it's not necessary to fix FLINK-18356.
   
   ## Verifying this change
   
   CI should pass without problems.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [FLINK-18356] Update CI image [flink]

2023-11-19 Thread via GitHub


XComp opened a new pull request, #23757:
URL: https://github.com/apache/flink/pull/23757

   ## What is the purpose of the change
   
   1.17 backport PR for parent PR #23717
   
   ## Brief change log
   
   I also backport the change for the docs build for the sake of consistency 
even though it's not necessary to fix FLINK-18356.
   
   ## Verifying this change
   
   CI should pass without problems.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable





[jira] [Comment Edited] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-11-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787855#comment-17787855
 ] 

Matthias Pohl edited comment on FLINK-18356 at 11/20/23 7:54 AM:
-

This is a 1.18 build that failed: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54682&view=logs&j=a9db68b9-a7e0-54b6-0f98-010e0aff39e2&t=cdd32e0b-6047-565b-c58f-14054472f1be&l=11661]

We should provide backports for 499e56f138fb4e81cbb8810385cfb393d16ea454. I'm 
gonna go ahead and create them.

*Update*
 * [https://github.com/apache/flink/pull/23756]
 * [https://github.com/apache/flink/pull/23757]


> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
> Issue Type: Bug
> Components: Build System / Azure Pipelines, Tests
> Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0, 1.19.0
> Reporter: Piotr Nowojski
> Assignee: Yunhong Zheng
> Priority: Critical
> Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, image-2023-07-11-19-35-54-530.png, image-2023-07-11-19-41-18-626.png, image-2023-07-11-19-41-37-105.png
>
>
> {noformat}
> = test session starts ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  1%]
> pyflink/common/tests/test_execution_config.py ...[  5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', arguments 'exec -i -u 1002 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=8d78fe4f-d658-5c70-12f8-4921589024c3
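
For readers unfamiliar with the exit code in the log above: by shell convention, a status above 128 usually means the process was terminated by signal number (status - 128), so 137 corresponds to signal 9 (SIGKILL on Linux). On CI agents this is often, though not always, the kernel's out-of-memory killer; the log itself does not say which. A minimal sketch of that decoding follows; the helper name describe_exit_code is hypothetical and is not part of Flink or the Azure tooling.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a shell-style exit status, assuming the 128 + N signal convention."""
    if code > 128:
        signum = code - 128
        try:
            # Resolve the platform's name for this signal number, if it has one.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited normally with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(137))
```

The decoding only identifies the signal; confirming an out-of-memory kill would still require checking the agent's kernel log or container memory limits.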





Re: [PR] [BP-1.18][FLINK-18356] Update CI image [flink]

2023-11-19 Thread via GitHub


flinkbot commented on PR #23756:
URL: https://github.com/apache/flink/pull/23756#issuecomment-1818404081

   
   ## CI report:
   
   * 26fb21e97c0d548e77f99e80ac3a5002dd33e143 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   

