Re: [PR] [FLINK-36160] Support hive advanced configuration [flink]
luoyuxia commented on PR #25258: URL: https://github.com/apache/flink/pull/25258#issuecomment-2585115946 FYI, let's wait for the response to the question I asked in [FLINK-37097](https://issues.apache.org/jira/browse/FLINK-37097?focusedCommentId=17912172&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17912172) before merging this PR...
Re: [PR] [MINOR] Deprecated StreamSink and SinkOperator [flink]
beliefer commented on code in PR #25835: URL: https://github.com/apache/flink/pull/25835#discussion_r1911925092

flink-runtime/src/main/java/org/apache/flink/streaming/api/operators/StreamSink.java:
```
@@ -25,6 +25,7 @@
 import org.apache.flink.streaming.runtime.tasks.ProcessingTimeService;

 /** A {@link StreamOperator} for executing {@link SinkFunction SinkFunctions}. */
+@Deprecated
```
Review Comment:
```
@Internal
@Deprecated
public interface SinkFunction<IN> extends Function, Serializable {
```
[jira] [Commented] (FLINK-37097) Remove Hive connector from core Flink
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17912172#comment-17912172 ] luoyuxia commented on FLINK-37097: -- So the currently pending PRs for the Hive connector have three options: 1: Merge into the Flink repo and then wait to be ported to the external connector along with the other commits that have been merged into the Flink repo. 2: Merge into the external connector without waiting for the commits in the Flink repo to be ported to the external connector. 3: Block the pending PR until the commits in the Flink repo have been ported to the external connector, then merge the PR into the external repo. Which option should we follow given the current situation? > Remove Hive connector from core Flink > - > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove the flink-connector-hive and > flink-sql-connector-hive-* Maven modules from core Flink. > > I am happy to make this change if someone can assign me the Jira.
Re: [PR] [FLINK-36645] [flink-autoscaler] Gracefully handle null execution pla… [flink-kubernetes-operator]
sharath1709 commented on code in PR #930: URL: https://github.com/apache/flink-kubernetes-operator/pull/930#discussion_r1911479006

flink-autoscaler/src/main/java/org/apache/flink/autoscaler/topology/JobTopology.java:
```
@@ -150,6 +151,11 @@ public static JobTopology fromJsonPlan(
         ObjectNode plan = objectMapper.readValue(jsonPlan, ObjectNode.class);
         ArrayNode nodes = (ArrayNode) plan.get("nodes");
+        if (nodes == null || nodes.isEmpty()) {
+            throw new NotReadyException(
+                    new RuntimeException("No nodes found in the plan, job is not ready yet"));
```
Review Comment: Thanks for the quick review! Updated.
Re: [PR] [FLINK-36696] [flink-autoscaler-plugin-jdbc] Switch sql connection usages to datasource [flink-kubernetes-operator]
sharath1709 commented on PR #929: URL: https://github.com/apache/flink-kubernetes-operator/pull/929#issuecomment-2584888730 Thanks a lot @1996fanrui for the quick review. Please find my replies below.
1. Typically, DataSource connections are relinquished automatically after a timeout, and DataSource isn't Closeable. However, some DataSource implementations like HikariDataSource implement the Closeable interface to close the connection pool. We can add a check to see if the DataSource is Closeable and then close it. Additionally, I've attempted to make the classes that hold the DataSource objects AutoCloseable as well, so that they can be created in a try-with-resources block.
2. Thanks, I've tried to fix most of the issues but might need a couple more iterations.
3. Great point, I will try to add a test case for this scenario soon.
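A minimal sketch of the close-if-Closeable pattern described in point 1, assuming an illustrative holder class (not the PR's actual code):

```java
import java.io.Closeable;
import javax.sql.DataSource;

/** Illustrative holder that releases its DataSource only when the implementation supports it. */
class DataSourceHolder implements AutoCloseable {
    private final DataSource dataSource;

    DataSourceHolder(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public void close() throws Exception {
        // javax.sql.DataSource itself has no close(); pooled implementations
        // such as HikariDataSource implement Closeable to shut down the pool.
        if (dataSource instanceof Closeable) {
            ((Closeable) dataSource).close();
        }
    }
}
```

This keeps plain (non-pooled) DataSources untouched while still letting the holder participate in a try-with-resources block.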
Re: [PR] [FLINK-35600] Add timestamp for low and high watermark [flink-cdc]
github-actions[bot] commented on PR #3415: URL: https://github.com/apache/flink-cdc/pull/3415#issuecomment-2584936582 This pull request has been automatically marked as stale because it has not had recent activity for 60 days. It will be closed in 30 days if no further activity occurs.
[jira] [Commented] (FLINK-34932) Translate concepts of Flink-Kubernetes-Operator documentation
[ https://issues.apache.org/jira/browse/FLINK-34932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17912165#comment-17912165 ] Rui Fan commented on FLINK-34932: - [~fernandoha], assigned it to you. Hi [~caicancai], would you mind reviewing it in your free time? Thanks to both of you in advance. > Translate concepts of Flink-Kubernetes-Operator documentation > - > > Key: FLINK-34932 > URL: https://issues.apache.org/jira/browse/FLINK-34932 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Kubernetes Operator >Affects Versions: 1.9.0 >Reporter: Caican Cai >Assignee: yinrhh >Priority: Minor > Fix For: 1.9.0 >
[jira] [Assigned] (FLINK-34932) Translate concepts of Flink-Kubernetes-Operator documentation
[ https://issues.apache.org/jira/browse/FLINK-34932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Fan reassigned FLINK-34932: --- Assignee: yinrhh (was: Caican Cai) > Translate concepts of Flink-Kubernetes-Operator documentation > - > > Key: FLINK-34932 > URL: https://issues.apache.org/jira/browse/FLINK-34932 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Kubernetes Operator >Affects Versions: 1.9.0 >Reporter: Caican Cai >Assignee: yinrhh >Priority: Minor > Fix For: 1.9.0 >
[jira] [Commented] (FLINK-34932) Translate concepts of Flink-Kubernetes-Operator documentation
[ https://issues.apache.org/jira/browse/FLINK-34932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17912164#comment-17912164 ] yinrhh commented on FLINK-34932: Hi, can anybody assign this issue to me? I truly want to fix it. Thanks. > Translate concepts of Flink-Kubernetes-Operator documentation > - > > Key: FLINK-34932 > URL: https://issues.apache.org/jira/browse/FLINK-34932 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Kubernetes Operator >Affects Versions: 1.9.0 >Reporter: Caican Cai >Assignee: Caican Cai >Priority: Minor > Fix For: 1.9.0 >
Re: [PR] [FLINK-28897] [TABLE-SQL] Fail to use udf in added jar when enabling checkpoint [flink]
ammu20-dev commented on PR #25656: URL: https://github.com/apache/flink/pull/25656#issuecomment-2585004128 > In that case the primary question should be - why does the `userClassLoader` class field not contain the actual user class loader in the first place. Should not that be the focus of the fix instead of pulling it out of the resourceManager only for one of the methods? > @afedulov That sounds reasonable. I will investigate this further and try to figure out the reason. Meanwhile, do you think a more targeted fix would be a better approach here? Something that is confined to the [StreamingJobGraphGenerator.prevalidate](https://github.com/apache/flink/blob/4b306699811af0b6ff2cb862914adfda56345996/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGenerator.java#L502) method level, by removing the class loading happening [here](https://github.com/apache/flink/blob/4b306699811af0b6ff2cb862914adfda56345996/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGenerator.java#L535). Due to this issue, the ADD JAR statement causes a job failure when checkpointing is enabled, which prevents this feature from being used in production.
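The mismatch under discussion can be illustrated with a minimal, hypothetical sketch (the class and method names here are illustrative, not Flink's actual fields): validating a UDF class must go through the class loader that saw the ADD JAR statement, not the application class loader.

```java
/** Hypothetical sketch of class resolution during job-graph pre-validation. */
class UdfResolutionSketch {
    /**
     * Resolves a class for validation purposes only. Passing the application
     * class loader here fails for classes that exist solely in a jar
     * registered via ADD JAR; the user class loader must be used instead.
     */
    static Class<?> resolveForValidation(String className, ClassLoader userClassLoader)
            throws ClassNotFoundException {
        // 'false' skips static initialization; validation only needs the Class object.
        return Class.forName(className, false, userClassLoader);
    }
}
```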
[jira] [Created] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
xuyang created FLINK-37094: -- Summary: Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0 Key: FLINK-37094 URL: https://issues.apache.org/jira/browse/FLINK-37094 Project: Flink Issue Type: Bug Components: Connectors / Hive Reporter: xuyang Since the Hive connector is scheduled to be removed from the Flink repo, and the compilation of the remaining Hive connector module within the Flink repo has been skipped, all deprecation work related to the table module performed in version 2.0 needs to be synced to the external Hive connector's separate repository. There will be no further updates to the remaining Hive connector module in the Flink repo.
[jira] [Created] (FLINK-37095) Test case that uses an AI model may hit an insufficient-quota exception
Yanquan Lv created FLINK-37095: -- Summary: Test case that uses an AI model may hit an insufficient-quota exception Key: FLINK-37095 URL: https://issues.apache.org/jira/browse/FLINK-37095 Project: Flink Issue Type: New Feature Components: Flink CDC Affects Versions: cdc-3.2.1 Reporter: Yanquan Lv Fix For: cdc-3.3.0 Encountered the following error when running FlinkPipelineUdfITCase.testTransformWithModel:
{code:java}
2025-01-10T03:12:23.5137158Z Caused by: dev.ai4j.openai4j.OpenAiHttpException: {
2025-01-10T03:12:23.5137225Z   "error": {
2025-01-10T03:12:23.5137854Z     "message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.",
2025-01-10T03:12:23.5137947Z     "type": "insufficient_quota",
2025-01-10T03:12:23.5138020Z     "param": null,
2025-01-10T03:12:23.5138104Z     "code": "insufficient_quota"
2025-01-10T03:12:23.5138172Z   }
2025-01-10T03:12:23.5138234Z }
2025-01-10T03:12:23.5138364Z 	at dev.ai4j.openai4j.Utils.toException(Utils.java:8)
2025-01-10T03:12:23.5138583Z 	at dev.ai4j.openai4j.SyncRequestExecutor.execute(SyncRequestExecutor.java:28)
2025-01-10T03:12:23.5138761Z 	at dev.ai4j.openai4j.RequestExecutor.execute(RequestExecutor.java:59)
2025-01-10T03:12:23.5139026Z 	at dev.langchain4j.model.openai.OpenAiChatModel.lambda$generate$0(OpenAiChatModel.java:124)
2025-01-10T03:12:23.5139207Z 	at dev.langchain4j.internal.RetryUtils.withRetry(RetryUtils.java:26)
2025-01-10T03:12:23.5139274Z 	... 26 more
{code}
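The follow-up PR below (#3849) makes the test tolerate this failure; here is a minimal sketch of such a guard using JUnit 5 assumptions (the cause-chain matching is illustrative, not necessarily the PR's actual change):

{code:java}
import static org.junit.jupiter.api.Assumptions.assumeTrue;

// ... inside testTransformWithModel, around the pipeline execution:
try {
    execution.execute();
} catch (Exception e) {
    // The OpenAiHttpException is usually nested, so walk the cause chain.
    boolean quotaExhausted = false;
    for (Throwable t = e; t != null; t = t.getCause()) {
        if (String.valueOf(t.getMessage()).contains("insufficient_quota")) {
            quotaExhausted = true;
            break;
        }
    }
    // Skip the test when the model endpoint rejects us for billing reasons;
    // any unrelated failure still fails the build.
    assumeTrue(!quotaExhausted, "Skipping test: model quota exhausted");
    throw e;
}
{code}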
[jira] [Resolved] (FLINK-37024) Task can be stuck in deploying state forever when canceling job/failover
[ https://issues.apache.org/jira/browse/FLINK-37024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weihua Hu resolved FLINK-37024. --- Fix Version/s: 2.0.0 Resolution: Fixed merged in master: 37c70c482a09845370cb6e694a26f55950c4699f > Task can be stuck in deploying state forever when canceling job/failover > > > Key: FLINK-37024 > URL: https://issues.apache.org/jira/browse/FLINK-37024 > Project: Flink > Issue Type: Bug > Components: Runtime / Task >Affects Versions: 1.20.0 >Reporter: Zhanghao Chen >Assignee: Zhanghao Chen >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > We observed that a task can be stuck in deploying state forever when the task > loading/instantiating logic has some issues. Cancelling the job / failover > caused by failures of other tasks will also get stuck, as the cancel watchdog > won't work for tasks in CREATED/DEPLOYING state at present. We should make > the cancel watchdog cover tasks in DEPLOYING as well (no need for tasks in > CREATED state, as there is no real logic between CREATED->DEPLOYING).
Re: [PR] [FLINK-37025] Fix generating watermarks in SQL on-periodic (#25921) [flink]
dawidwys merged PR #25935: URL: https://github.com/apache/flink/pull/25935
Re: [PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
flinkbot commented on PR #25947: URL: https://github.com/apache/flink/pull/25947#issuecomment-2582054899 ## CI report: * 0557987216d7d26709023fd6cfc2379ec9f92eae UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Updated] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
[ https://issues.apache.org/jira/browse/FLINK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuyang updated FLINK-37094: --- Issue Type: Improvement (was: Bug) > Adapt the external Hive connector to the removal of deprecated APIs in table > module in 2.0 > -- > > Key: FLINK-37094 > URL: https://issues.apache.org/jira/browse/FLINK-37094 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Reporter: xuyang >Priority: Major > > Since the Hive connector is scheduled to be removed from the Flink repo, and > the compilation of the remaining Hive connector module within the Flink repo > has been skipped, all deprecation work related to the table module performed > in version 2.0 needs to be synced to the external Hive connector's separate > repository. > There will be no further modifications to the remaining Hive connector module > in the Flink repo, as any changes made cannot be guaranteed to be validated > through CI.
Re: [PR] [FLINK-37093][table] store catalog in catalogStoreHolder should after initCatalog and open when createCatalog [flink]
flinkbot commented on PR #25946: URL: https://github.com/apache/flink/pull/25946#issuecomment-2582054718 ## CI report: * 59d74f97e7e7905497e8db467527c9961c3a2f86 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Updated] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
[ https://issues.apache.org/jira/browse/FLINK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuyang updated FLINK-37094: --- Description: Since the Hive connector is scheduled to be removed from the Flink repo, and the compilation of the remaining Hive connector module within the Flink repo has been skipped, all deprecation work related to the table module performed in version 2.0 needs to be synced to the external Hive connector's separate repository. There will be no further modifications to the remaining Hive connector module in the Flink repo, as any changes made cannot be guaranteed to be validated through CI. was:Since the Hive connector is scheduled to be removed from the Flink repo, and the compilation of the remaining Hive connector module within the Flink repo has been skipped, all deprecation work related to the table module performed in version 2.0 needs to be synced to the external Hive connector's separate repository. There will be no further updates to the remaining Hive connector module in the Flink repo. > Adapt the external Hive connector to the removal of deprecated APIs in table > module in 2.0 > -- > > Key: FLINK-37094 > URL: https://issues.apache.org/jira/browse/FLINK-37094 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Reporter: xuyang >Priority: Major > > Since the Hive connector is scheduled to be removed from the Flink repo, and > the compilation of the remaining Hive connector module within the Flink repo > has been skipped, all deprecation work related to the table module performed > in version 2.0 needs to be synced to the external Hive connector's separate > repository. > There will be no further modifications to the remaining Hive connector module > in the Flink repo, as any changes made cannot be guaranteed to be validated > through CI.
Re: [PR] [FLINK-37024][task] Make cancel watchdog cover tasks stuck in DEPLOYING state [flink]
huwh merged PR #25915: URL: https://github.com/apache/flink/pull/25915
[jira] [Updated] (FLINK-35325) Paimon connector miss the position of AddColumnEvent
[ https://issues.apache.org/jira/browse/FLINK-35325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanquan Lv updated FLINK-35325: --- Fix Version/s: cdc-3.2.1 (was: cdc-3.1.1) > Paimon connector miss the position of AddColumnEvent > > > Key: FLINK-35325 > URL: https://issues.apache.org/jira/browse/FLINK-35325 > Project: Flink > Issue Type: Improvement > Components: Flink CDC >Affects Versions: cdc-3.1.1 >Reporter: Yanquan Lv >Assignee: tianzhu.wen >Priority: Minor > Labels: pull-request-available > Fix For: cdc-3.2.1 > > > Currently, new columns are always added in the last position, however some > newly added columns had a specific before and after relationship with other > columns. > Source code: > [https://github.com/apache/flink-cdc/blob/fa6e7ea51258dcd90f06036196618224156df367/flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-paimon/src/main/java/org/apache/flink/cdc/connectors/paimon/sink/PaimonMetadataApplier.java#L137]
[jira] [Resolved] (FLINK-37025) Periodic SQL watermarks can travel back in time
[ https://issues.apache.org/jira/browse/FLINK-37025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Wysakowicz resolved FLINK-37025. -- Resolution: Fixed Fixed in: * master ** 7059ee555c5093e3b233876ac3d84ad693023df8 * 1.20.x ** bfe3576f439466744871d950b70339344c15e44f * 1.19.x ** 868f413bd63db4399a89a403103f02555a5293b7 > Periodic SQL watermarks can travel back in time > --- > > Key: FLINK-37025 > URL: https://issues.apache.org/jira/browse/FLINK-37025 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.20.0, 2.0-preview >Reporter: Dawid Wysakowicz >Assignee: Dawid Wysakowicz >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 1.19.2, 1.20.1 > > > If watermarks are generated through SQL, e.g. using {{WATERMARK FOR ts AS > ts}} > https://github.com/apache/flink/blob/e9353319ad625baa5b2c20fa709ab5b23f83c0f4/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/generated/GeneratedWatermarkGeneratorSupplier.java#L86 > The code in {{onEvent}} can move the {{currentWatermark}} back in time, > because it does not check if it advanced. This is fine in case of > {{on-event}} watermarks, because the underlying {{WatermarkOutput}} performs > that check, but it has confusing outcomes for {{on-periodic}}. > Example: > If events with timestamps: > {code} > {"id":1,"ts":"2024-01-01 00:00:00"} > {"id":3,"ts":"2024-01-03 00:00:00"} > {"id":2,"ts":"2024-01-02 00:00:00"} > {code} > come within the periodic emit interval, the generated watermark will be > "2024-01-02 00:00:00" instead of "2024-01-03 00:00:00". As a result the > watermark for "2024-01-03 00:00:00" is swallowed and a new event is required > to progress the processing of that event.
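The gist of the fix is that the tracked watermark must only ever advance. A simplified stand-in for the generated generator (illustrative, not the actual generated code):

{code:java}
import org.apache.flink.api.common.eventtime.Watermark;
import org.apache.flink.api.common.eventtime.WatermarkGenerator;
import org.apache.flink.api.common.eventtime.WatermarkOutput;

/** Simplified periodic generator that never lets the watermark travel back in time. */
class MonotonicPeriodicGenerator<T> implements WatermarkGenerator<T> {
    private long currentWatermark = Long.MIN_VALUE;

    @Override
    public void onEvent(T event, long eventTimestamp, WatermarkOutput output) {
        // Math.max keeps the out-of-order event (id=2 arriving after id=3 in
        // the example above) from dragging the watermark backwards.
        currentWatermark = Math.max(currentWatermark, eventTimestamp);
    }

    @Override
    public void onPeriodicEmit(WatermarkOutput output) {
        output.emitWatermark(new Watermark(currentWatermark));
    }
}
{code}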
[jira] [Commented] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
[ https://issues.apache.org/jira/browse/FLINK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911842#comment-17911842 ] xuyang commented on FLINK-37094: Btw, I tried to enable the compilation directly in the pom in the Flink repo, but it resulted in compilation errors, and many tests also failed... cc [~luoyuxia] [~Sergey Nuyanzin] > Adapt the external Hive connector to the removal of deprecated APIs in table > module in 2.0 > -- > > Key: FLINK-37094 > URL: https://issues.apache.org/jira/browse/FLINK-37094 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Reporter: xuyang >Priority: Major > > Since the Hive connector is scheduled to be removed from the Flink repo, and > the compilation of the remaining Hive connector module within the Flink repo > has been skipped, all deprecation work related to the table module performed > in version 2.0 needs to be synced to the external Hive connector's separate > repository. > There will be no further modifications to the remaining Hive connector module > in the Flink repo, as any changes made cannot be guaranteed to be validated > through CI.
[PR] [FLINK-35325][transform] Skip insufficient_quota error when running test case using AI model. [flink-cdc]
lvyanquan opened a new pull request, #3849: URL: https://github.com/apache/flink-cdc/pull/3849 Fixes CI failure.
Re: [PR] [FLINK-36957][Datastream] Implement async state version of stream flatmap [flink]
fredia commented on PR #25848: URL: https://github.com/apache/flink/pull/25848#issuecomment-2582074659 > @fredia I am unsure what this change is - please could you add the motivation for the fix in the Jira and the impact on the DataStream API. Does this require a docs change? @davidradl Sorry for the confusion, this PR provides the async state version of stream flatmap; I changed the title and description.
[PR] [FLINK-36494][table] Remove deprecated method Catalog#getTableFactory [flink]
xuyangzhong opened a new pull request, #25948: URL: https://github.com/apache/flink/pull/25948
## What is the purpose of the change
Remove deprecated method Catalog#getTableFactory. Note: no changes are made in the Hive connector repo because of https://issues.apache.org/jira/browse/FLINK-37094.
## Brief change log
- *Remove deprecated method Catalog#getTableFactory*
## Verifying this change
This change is already covered by existing tests.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented?
Re: [PR] [FLINK-37011] Improve get source field value by column name in PreTransformProcessor. [flink-cdc]
ChaomingZhangCN commented on PR #3836: URL: https://github.com/apache/flink-cdc/pull/3836#issuecomment-2582263733 > @leonardBang @yuxiqian Please take a look. ping
Re: [PR] [FLINK-36193] Supports applying TRUNCATE / DROP table ddl... [flink-cdc]
yuxiqian commented on code in PR #3673: URL: https://github.com/apache/flink-cdc/pull/3673#discussion_r1910133465

flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-paimon/src/main/java/org/apache/flink/cdc/connectors/paimon/sink/PaimonMetadataApplier.java:
```
@@ -325,14 +317,32 @@
                 tableChangeList.add(
                         SchemaChangeProvider.updateColumnType(oldName, newType)));
-        catalog.alterTable(
-                new Identifier(event.tableId().getSchemaName(), event.tableId().getTableName()),
-                tableChangeList,
-                true);
+        catalog.alterTable(tableIdToIdentifier(event), tableChangeList, true);
     } catch (Catalog.TableNotExistException
             | Catalog.ColumnAlreadyExistException
             | Catalog.ColumnNotExistException e) {
         throw new SchemaEvolveException(event, e.getMessage(), e);
     }
 }
+
+private void applyTruncateTable(TruncateTableEvent event) throws SchemaEvolveException {
+    try (BatchTableCommit batchTableCommit =
+            catalog.getTable(tableIdToIdentifier(event)).newBatchWriteBuilder().newCommit()) {
```
Review Comment: Yeah, it should be safer to throw an exception in such a configuration.
[PR] [FLINK-36506][table] Remove all deprecated methods in ColumnStats [flink]
xuyangzhong opened a new pull request, #25950: URL: https://github.com/apache/flink/pull/25950
## What is the purpose of the change
Remove all deprecated methods in ColumnStats.
## Brief change log
- Remove all deprecated methods in ColumnStats
## Verifying this change
This change is already covered by existing tests.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented?
[PR] [FLINK-37093][table] store catalog in catalogStoreHolder should after initCatalog and open when createCatalog [flink]
jiefei30 opened a new pull request, #25946: URL: https://github.com/apache/flink/pull/25946
## What is the purpose of the change
This pull request fixes a bug where a catalog that failed validation (because it has no type) still ends up in catalogStoreHolder; such a catalog can then neither be USEd nor ALTERed.
## Brief change log
- Fix a bug where a catalog that failed validation due to a missing type still exists in catalogStoreHolder.
## Verifying this change
This change is already covered by existing tests.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
- The S3 file system connector: (no)
## Documentation
- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not documented)
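A minimal sketch of the reordering the PR title describes, based on the createCatalog snippet quoted in FLINK-37093 below (simplified, not the exact patch):

```java
public void createCatalog(
        String catalogName, CatalogDescriptor catalogDescriptor, boolean ignoreIfExists)
        throws CatalogException {
    // ... null/existence checks unchanged ...

    // Initialize and open first, so a descriptor that fails validation
    // (e.g. a missing 'type' option) never reaches the catalog store.
    Catalog catalog = initCatalog(catalogName, catalogDescriptor);
    catalog.open();
    catalogs.put(catalogName, catalog);

    // Persist only a catalog that was successfully created and opened,
    // mirroring the ordering that alterCatalog() already uses.
    catalogStoreHolder.catalogStore().storeCatalog(catalogName, catalogDescriptor);
}
```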
[jira] [Updated] (FLINK-37093) The catalog that failed validation due to no type still exists in catalogStoreHolder
[ https://issues.apache.org/jira/browse/FLINK-37093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-37093: --- Labels: pull-request-available (was: ) > The catalog that failed validation due to no type still exists in catalogStoreHolder
>
> Key: FLINK-37093
> URL: https://issues.apache.org/jira/browse/FLINK-37093
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Affects Versions: 1.20.0, 2.0-preview
> Environment: mac os
> Reporter: Mingcan Wang
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0, 1.20.0
>
> here's sql-client (1.20 & 2.0-preview1):
>
> *Flink SQL> create catalog cat1;*
> {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color}
> {color:#FF}org.apache.flink.table.api.ValidationException: Catalog options do not contain an option key 'type' for discovering a catalog.{color}
> *Flink SQL> show catalogs;*
> +-----------------+
> |    catalog name |
> +-----------------+
> |            cat1 |
> | default_catalog |
> +-----------------+
> 2 rows in set
> *Flink SQL> use catalog cat1;*
> {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color}
> {color:#FF}org.apache.flink.table.api.ValidationException: Catalog options do not contain an option key 'type' for discovering a catalog.{color}
> *Flink SQL> alter catalog cat1 comment 'no type';*
> {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color}
> {color:#FF}org.apache.flink.table.api.ValidationException: Catalog options do not contain an option key 'type' for discovering a catalog.{color}
> *Flink SQL> create catalog cat1;*
> {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color}
> {color:#FF}org.apache.flink.table.catalog.exceptions.CatalogException: Catalog cat1 already exists.{color}
> *Flink SQL> drop catalog cat1;*
> [INFO] Execute statement succeeded.
> *Flink SQL> show catalogs;*
> +-----------------+
> |    catalog name |
> +-----------------+
> | default_catalog |
> +-----------------+
> 1 row in set
>
> When I create a catalog without a type, the SQL is parsed successfully. According to the code:
> ```java
> public void createCatalog(
>         String catalogName, CatalogDescriptor catalogDescriptor, boolean ignoreIfExists)
>         throws CatalogException {
>     checkArgument(
>             !StringUtils.isNullOrWhitespaceOnly(catalogName),
>             "Catalog name cannot be null or empty.");
>     checkNotNull(catalogDescriptor, "Catalog descriptor cannot be null");
>     boolean catalogExistsInStore = catalogStoreHolder.catalogStore().contains(catalogName);
>     boolean catalogExistsInMemory = catalogs.containsKey(catalogName);
>     if (catalogExistsInStore || catalogExistsInMemory) {
>         if (!ignoreIfExists) {
>             throw new CatalogException(format("Catalog %s already exists.", catalogName));
>         }
>     } else {
>         // Store the catalog in the catalog store
>         catalogStoreHolder.catalogStore().storeCatalog(catalogName, catalogDescriptor);
>         // Initialize and store the catalog in memory
>         Catalog catalog = initCatalog(catalogName, catalogDescriptor);
>         catalog.open();
>         catalogs.put(catalogName, catalog);
>     }
> }
> ```
> The catalog will be stored in the catalog store first and then fail in the initCatalog() method. The result is that the catalog still exists in catalogStoreHolder, and I cannot use, alter or create it again.
> I noticed the alterCatalog() method; the process is:
> ```java
> Catalog newCatalog = initCatalog(catalogName, newDescriptor);
> catalogStore.removeCatalog(catalogName, false);
> if (catalogs.containsKey(catalogName)) {
>     catalogs.get(catalogName).close();
> }
> newCatalog.open();
> catalogs.put(catalogName, newCatalog);
> catalogStoreHolder.catalogStore().storeCatalog(catalogName, newDescriptor);
> ```
> Storing in catalogStoreHolder should happen after initCatalog() and open().
Re: [PR] [FLINK-36899][state/forst] Introduce metrics for forst cache [flink]
fredia commented on code in PR #25884: URL: https://github.com/apache/flink/pull/25884#discussion_r1909974038

flink-state-backends/flink-statebackend-forst/src/main/java/org/apache/flink/state/forst/fs/cache/FileBasedCache.java:
```
@@ -97,6 +129,13 @@ public void delete(Path path) {

     @Override
     FileCacheEntry internalGet(String key, FileCacheEntry value) {
+        if (metricGroup != null) {
```
Review Comment: `key` is reserved for future use. `internalGet` is used for some customized operations like statistics. This PR only focuses on the `hit` and `miss` metrics, which are not related to `key`.
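A minimal, illustrative sketch of the hit/miss accounting under discussion (the wrapper class and counter fields are hypothetical; the actual PR reports through Flink's metric group):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical cache wrapper that counts hits and misses on lookup. */
class CountingCache<K, V> {
    private final Map<K, V> delegate = new ConcurrentHashMap<>();
    final AtomicLong hits = new AtomicLong();
    final AtomicLong misses = new AtomicLong();

    V get(K key) {
        V value = delegate.get(key);
        // Note that the key itself is irrelevant for these two metrics; we
        // only record whether the lookup was served from the cache.
        if (value != null) {
            hits.incrementAndGet();
        } else {
            misses.incrementAndGet();
        }
        return value;
    }

    void put(K key, V value) {
        delegate.put(key, value);
    }
}
```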
[jira] [Updated] (FLINK-36494) Remove deprecated methods `Catalog#getTableFactory` and `Catalog#supportsManagedTable`
[ https://issues.apache.org/jira/browse/FLINK-36494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-36494: --- Labels: pull-request-available (was: ) > Remove deprecated methods `Catalog#getTableFactory` and > `Catalog#supportsManagedTable` > -- > > Key: FLINK-36494 > URL: https://issues.apache.org/jira/browse/FLINK-36494 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: xuyang >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 >
[jira] [Comment Edited] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
[ https://issues.apache.org/jira/browse/FLINK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911849#comment-17911849 ] Sergey Nuyanzin edited comment on FLINK-37094 at 1/10/25 8:51 AM: -- Hi [~xuyangzhong] IIRC the idea was to release the Hive connector externally and then remove it from the main repo (the PR for removal is ready, however we need to wait for the release of the external Hive connector). I think the main issue here is encouraging people to test/vote: about a year ago, when I created an RC, only a couple of people voted, and there were not enough binding votes to proceed. I can try to adapt the new commits and repeat the procedure of RC creation and start a new vote; however, I am not sure it can finish before the feature freeze of 2.0 or even the release of 2.0 (I have no idea how to predict this voting procedure). was (Author: sergey nuyanzin): Hi [~xuyangzhong] IIRC the idea was to release the Hive connector externally and then remove it from the main repo. I think the main issue here is encouraging people to test/vote: about a year ago, when I created an RC, only a couple of people voted, and there were not enough binding votes to proceed. I can try to adapt the new commits and repeat the procedure of RC creation and start a new vote; however, I am not sure it can finish before the feature freeze of 2.0 or even the release of 2.0 (I have no idea how to predict this voting procedure). > Adapt the external Hive connector to the removal of deprecated APIs in table > module in 2.0 > -- > > Key: FLINK-37094 > URL: https://issues.apache.org/jira/browse/FLINK-37094 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Reporter: xuyang >Priority: Major > > Since the Hive connector is scheduled to be removed from the Flink repo, and > the compilation of the remaining Hive connector module within the Flink repo > has been skipped, all deprecation work related to the table module performed > in version 2.0 needs to be synced to the external Hive connector's separate > repository. > There will be no further modifications to the remaining Hive connector module > in the Flink repo, as any changes made cannot be guaranteed to be validated > through CI.
Re: [PR] [FLINK-36494][table] Remove deprecated method Catalog#getTableFactory [flink]
flinkbot commented on PR #25948: URL: https://github.com/apache/flink/pull/25948#issuecomment-2582085953 ## CI report: * 69dbfadecb608f9ccd9702c9a793ea0218a039cb UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Commented] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
[ https://issues.apache.org/jira/browse/FLINK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911849#comment-17911849 ] Sergey Nuyanzin commented on FLINK-37094: - Hi [~xuyangzhong] IIRC the idea was to release the Hive connector externally and then remove it from the main repo. I think the main issue here is encouraging people to test/vote: about a year ago, when I created an RC, only a couple of people voted, and there were not enough binding votes to proceed. I can try to adapt the new commits and repeat the procedure of RC creation and start a new vote; however, I am not sure it can finish before the feature freeze of 2.0 or even the release of 2.0 (I have no idea how to predict this voting procedure). > Adapt the external Hive connector to the removal of deprecated APIs in table > module in 2.0 > -- > > Key: FLINK-37094 > URL: https://issues.apache.org/jira/browse/FLINK-37094 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Reporter: xuyang >Priority: Major > > Since the Hive connector is scheduled to be removed from the Flink repo, and > the compilation of the remaining Hive connector module within the Flink repo > has been skipped, all deprecation work related to the table module performed > in version 2.0 needs to be synced to the external Hive connector's separate > repository. > There will be no further modifications to the remaining Hive connector module > in the Flink repo, as any changes made cannot be guaranteed to be validated > through CI.
Re: [PR] [FLINK-36698][pipeline-connector][elasticsearch] Elasticsearch pipeline sink support authentication [flink-cdc]
beryllw commented on PR #3728: URL: https://github.com/apache/flink-cdc/pull/3728#issuecomment-2582088579 https://github.com/apache/flink-cdc/pull/3849
Re: [PR] [FLINK-35325][transform] Skip insufficient_quota error when running test case using AI model. [flink-cdc]
yuxiqian commented on code in PR #3849: URL: https://github.com/apache/flink-cdc/pull/3849#discussion_r1910020556

flink-cdc-composer/src/test/java/org/apache/flink/cdc/composer/flink/FlinkPipelineUdfITCase.java:
```
@@ -897,7 +899,6 @@ void testTransformWithModel(ValuesDataSink.SinkApi sinkApi) throws Exception {
         // Execute the pipeline
         PipelineExecution execution = composer.compose(pipelineDef);
         execution.execute();
-
```
Review Comment: nit: irrelevant change
[jira] [Commented] (FLINK-36506) Remove all deprecated methods in `ColumnStats`
[ https://issues.apache.org/jira/browse/FLINK-36506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911850#comment-17911850 ] xuyang commented on FLINK-36506: I'll take charge of this task and add you, [~atusharm], as co-author, since the code freeze date for Flink 2.0.0 is approaching. > Remove all deprecated methods in `ColumnStats` > -- > > Key: FLINK-36506 > URL: https://issues.apache.org/jira/browse/FLINK-36506 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: xuyang >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 >
[PR] [FLINK-37089][Runtime] Advanced scheduling for derived async state processing [flink]
Zakelly opened a new pull request, #25949: URL: https://github.com/apache/flink/pull/25949
## What is the purpose of the change
Currently, there is a limit on in-flight records in async state processing. However, we allow developers to initialize a new processing in the middle of another. These derived processings are treated as normal ones for timers or upstream records, which is problematic:
- The new derived processing may increase the number of in-flight requests and cause a forced drain when it reaches the limit. But the drain may require the current processing to finish as well. This is a deadlock.
- The derived processing will queue behind normal processing, which is not the behavior the user wants. The derived ones should be fired right after the current processing, or ASAP.
Thus, this PR changes the behavior in `AsyncExecutionController` and its related buffer (see the sketch below):
- Avoid draining when creating a derived processing.
- Provide a priority for the queued processings and give derived ones greater priority.
## Brief change log
- Add an additional condition to skip the drain when inserting a new record into the `AEC`'s buffer.
- Add a priority to `RecordContext` and consider it when queuing in the blocking queue.
## Verifying this change
This change added tests and can be verified by `AbstractAsyncStateStreamOperatorTest#testManyAsyncProcessWithKey`.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): yes
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
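A minimal sketch of the priority idea (the wrapper type and comparator are illustrative; the actual buffer in `AsyncExecutionController` is more involved):

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Illustrative queued record: derived processings outrank normal ones. */
class QueuedRecord {
    enum Kind { DERIVED, NORMAL } // lower ordinal dequeues first

    final Kind kind;
    final long sequenceNumber; // preserves FIFO order within the same kind

    QueuedRecord(Kind kind, long sequenceNumber) {
        this.kind = kind;
        this.sequenceNumber = sequenceNumber;
    }

    static final Comparator<QueuedRecord> ORDER =
            Comparator.<QueuedRecord>comparingInt(r -> r.kind.ordinal())
                    .thenComparingLong(r -> r.sequenceNumber);
}

class Demo {
    public static void main(String[] args) {
        PriorityQueue<QueuedRecord> queue = new PriorityQueue<>(QueuedRecord.ORDER);
        queue.add(new QueuedRecord(QueuedRecord.Kind.NORMAL, 1));
        // Enqueued later, but fires first because it is derived:
        queue.add(new QueuedRecord(QueuedRecord.Kind.DERIVED, 2));
        System.out.println(queue.poll().kind); // DERIVED
    }
}
```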
[jira] [Updated] (FLINK-37089) Advanced scheduling for derived async state processing
[ https://issues.apache.org/jira/browse/FLINK-37089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-37089: --- Labels: pull-request-available (was: ) > Advanced scheduling for derived async state processing > -- > > Key: FLINK-37089 > URL: https://issues.apache.org/jira/browse/FLINK-37089 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Async State Processing, Runtime / Task >Reporter: Zakelly Lan >Assignee: Zakelly Lan >Priority: Major > Labels: pull-request-available > > Currently, there is a limit on in-flight records in async state processing. > However, we allow developers to initialize a new processing in the middle of > another. These derived processings are treated as normal ones for timers or > upstream records, which is problematic: > * The new derived processing may increase the number of in-flight requests > and cause a forced drain when it reaches the limit. But the drain may > require the current processing to finish as well. This is a deadlock. > * The derived processing will queue behind normal processing, which is not > the behavior the user wants. The derived ones should be fired right after the > current processing or ASAP. > Thus, I will change the behavior in `AsyncExecutionController` and its > related buffer: > * Avoid draining when creating a derived processing. > * Provide a priority for the queued processings and give derived ones > greater priority.
Re: [PR] [FLINK-36506][table] Remove all deprecated methods in ColumnStats [flink]
flinkbot commented on PR #25950: URL: https://github.com/apache/flink/pull/25950#issuecomment-2582291586 ## CI report: * 43820887f2acc648f48dbfc0c8c18fb370f4a55a UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
Re: [PR] [FLINK-37078] Fix for jackson-mapper-asl 1.9.13 vulnerability [flink]
rmetzger commented on code in PR #25890: URL: https://github.com/apache/flink/pull/25890#discussion_r1910162122

flink-connectors/flink-sql-connector-hive-3.1.3/src/main/resources/META-INF/NOTICE:
```
@@ -45,7 +45,7 @@
 the Apache Software License 2.0 (http://www.apache.org/licenses/LICENSE-2.0.txt)
 - org.apache.orc:orc-tools:1.5.8
 - org.apache.parquet:parquet-hadoop-bundle:1.10.0
 - org.codehaus.jackson:jackson-core-asl:1.9.13
-- org.codehaus.jackson:jackson-mapper-asl:1.9.13
+- org.codehaus.jackson:jackson-mapper-asl:1.9.14.jdk17-redhat-1
```
Review Comment: Okay, I see. I would say we do it, given that the tests are passing.
Re: [PR] [FLINK-37078] Fix for jackson-mapper-asl 1.9.13 vulnerability [flink]
rmetzger commented on code in PR #25890: URL: https://github.com/apache/flink/pull/25890#discussion_r1910165407

pom.xml:
```
@@ -328,6 +335,13 @@
 under the License.
 -->
+
+        <dependency>
+            <groupId>org.codehaus.jackson</groupId>
+            <artifactId>jackson-mapper-asl</artifactId>
+            <version>${jackson.mapper.asl.version}</version>
+        </dependency>
```
Review Comment: Does this mean that now any Flink dependency has `jackson-mapper-asl` as a dep? E.g. if I'm depending on `flink-core`, will it pull in `jackson-mapper-asl` as well? (That wouldn't be good.) It seems this is true, right? https://stackoverflow.com/a/38882349
Re: [PR] [FLINK-28897] [TABLE-SQL] Fail to use udf in added jar when enabling checkpoint [flink]
afedulov commented on PR #25656: URL: https://github.com/apache/flink/pull/25656#issuecomment-2582320914 > But while debugging the issue I found that the class loader coming here is the App class loader and not the user class loader. In that case the primary question should be - why does the userClassLoader class field not contain the actual user class loader in the first place. Should not that be the focus of the fix instead of pulling it out of the resourceManager only for one of the methods? > and I see minimum possibility on regressions. Unfortunately, I can't agree with this assessment. There are thousands of SQL jobs out there that might have unintentionally included some classes that could cause the jobs to fail after we implement this change. For instance, the `flink-avro` format includes an unshaded version of the Avro [library](https://github.com/apache/flink/blob/d9c10d584b7ef8532804b849cfbd2b33beac8044/flink-formats/flink-avro/pom.xml#L81), which is directly [used](https://github.com/apache/flink/blob/85937b8c11dcca25fd978d095d07af121d7ae8e3/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroSerializerSnapshot.java#L28) in the core format classes. If the user JAR, for some reason, mistakenly bundles the same library in a different version, those classes would take priority after the change and could potentially break existing jobs. Should those jobs have failed upon the initial submission if the "correct" classloading had been applied? Yes. Can we risk breaking production jobs in a patch release, even by doing the "right" thing? My answer is no. The risk is especially not worth it because the issue has existed since version 1.16 (over 2 years) and is only relevant for the Table API. Moreover, a [workaround](https://issues.apache.org/jira/browse/FLINK-29890?focusedCommentId=17693105&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17693105) exists, and a proper fix is implemented in 2.0.
Re: [PR] [FLINK-36899][state/forst] Introduce metrics for forst cache [flink]
fredia commented on code in PR #25884: URL: https://github.com/apache/flink/pull/25884#discussion_r1909979879

flink-state-backends/flink-statebackend-forst/src/main/java/org/apache/flink/state/forst/fs/ForStFlinkFileSystem.java:
```
@@ -119,7 +121,11 @@ public static FileBasedCache getFileBasedCache(
             new File(cacheBase.toString()), cacheReservedSize, SST_FILE_SIZE);
     }
     return new FileBasedCache(
-            Integer.MAX_VALUE, cacheLimitPolicy, cacheBase.getFileSystem(), cacheBase);
+            Integer.MAX_VALUE,
+            cacheLimitPolicy,
+            cacheBase.getFileSystem(),
```
Review Comment: `cacheBase.getFileSystem()` can throw IOException. If we got the FileSystem in the `FileBasedCache` constructor, the constructor would throw IOException, which is something I don't want to introduce. So I prefer to keep `cacheBase.getFileSystem()` as a parameter.
Re: [PR] [FLINK-36193] Supports applying TRUNCATE / DROP table ddl... [flink-cdc]
lvyanquan commented on code in PR #3673: URL: https://github.com/apache/flink-cdc/pull/3673#discussion_r1910115690

flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-paimon/src/main/java/org/apache/flink/cdc/connectors/paimon/sink/PaimonMetadataApplier.java:
```
@@ -325,14 +317,32 @@
                 tableChangeList.add(
                         SchemaChangeProvider.updateColumnType(oldName, newType)));
-        catalog.alterTable(
-                new Identifier(event.tableId().getSchemaName(), event.tableId().getTableName()),
-                tableChangeList,
-                true);
+        catalog.alterTable(tableIdToIdentifier(event), tableChangeList, true);
     } catch (Catalog.TableNotExistException
             | Catalog.ColumnAlreadyExistException
             | Catalog.ColumnNotExistException e) {
         throw new SchemaEvolveException(event, e.getMessage(), e);
     }
 }
+
+private void applyTruncateTable(TruncateTableEvent event) throws SchemaEvolveException {
+    try (BatchTableCommit batchTableCommit =
+            catalog.getTable(tableIdToIdentifier(event)).newBatchWriteBuilder().newCommit()) {
```
Review Comment: We are currently experiencing some unexpected situations while truncating tables with the option `deletion-vectors.enabled: true`. How about throwing an exception when truncating these tables? WDYT?
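A hypothetical guard along the lines suggested above (the option lookup, the `truncateTable()` call on the commit, and the error message are all assumptions for illustration; whether `SchemaEvolveException` is the right wrapper is for the PR to decide):

```java
private void applyTruncateTable(TruncateTableEvent event) throws SchemaEvolveException {
    try {
        Table table = catalog.getTable(tableIdToIdentifier(event));
        // Deletion-vector tables currently misbehave on truncate; fail fast instead.
        if (Boolean.parseBoolean(
                table.options().getOrDefault("deletion-vectors.enabled", "false"))) {
            throw new SchemaEvolveException(
                    event,
                    "Truncating a table with deletion-vectors.enabled is not supported",
                    null);
        }
        try (BatchTableCommit commit = table.newBatchWriteBuilder().newCommit()) {
            commit.truncateTable();
        }
    } catch (SchemaEvolveException e) {
        throw e;
    } catch (Exception e) {
        throw new SchemaEvolveException(event, e.getMessage(), e);
    }
}
```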
[jira] [Assigned] (FLINK-36811) MySQL CDC source set is processing backlog during snapshot phase
[ https://issues.apache.org/jira/browse/FLINK-36811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruan Hang reassigned FLINK-36811: - Assignee: Xiao Huang > MySQL CDC source set is processing backlog during snapshot phase > > > Key: FLINK-36811 > URL: https://issues.apache.org/jira/browse/FLINK-36811 > Project: Flink > Issue Type: Improvement > Components: Flink CDC >Affects Versions: cdc-3.3.0 >Reporter: Xuannan Su >Assignee: Xiao Huang >Priority: Major > Labels: pull-request-available > > In > [FLIP-309|https://cwiki.apache.org/confluence/display/FLINK/FLIP-309%3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog], > Flink introduced ProcessingBacklog to indicate whether a record should be > processed with low latency or high throughput. ProcessingBacklog can be used > to change the checkpoint interval of a job during runtime. > MySQL CDC source can set the processingBacklog during the snapshot phase so > that the checkpoint interval during the snapshot phase can be changed at > runtime.
[jira] [Assigned] (FLINK-36811) MySQL CDC source set is processing backlog during snapshot phase
[ https://issues.apache.org/jira/browse/FLINK-36811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruan Hang reassigned FLINK-36811: - Assignee: (was: Ruan Hang) > MySQL CDC source set is processing backlog during snapshot phase > > > Key: FLINK-36811 > URL: https://issues.apache.org/jira/browse/FLINK-36811 > Project: Flink > Issue Type: Improvement > Components: Flink CDC >Affects Versions: cdc-3.3.0 >Reporter: Xuannan Su >Priority: Major > Labels: pull-request-available > > In > [FLIP-309|https://cwiki.apache.org/confluence/display/FLINK/FLIP-309%3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog], > Flink introduced ProcessingBacklog to indicate whether a record should be > processed with low latency or high throughput. ProcessingBacklog can be used > to change the checkpoint interval of a job during runtime. > MySQL CDC source can set the processingBacklog during the snapshot phase so > that the checkpoint interval during the snapshot phase can be changed during > runtime. -- This message was sent by Atlassian Jira (v8.20.10#820010)
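For context, FLIP-309 pairs the backlog signal with a dedicated checkpoint interval option. A configuration sketch follows; the option key is taken from the Flink docs as I recall them, so treat it as an assumption to verify:

```yaml
# Short interval for low-latency processing once the backlog is cleared.
execution.checkpointing.interval: 30s
# Longer interval while the source reports a processing backlog (e.g. during
# the snapshot phase), favoring throughput over low latency.
execution.checkpointing.interval-during-backlog: 10min
```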
[jira] [Updated] (FLINK-37093) The catalog that failed validation due to no type still exists in catalogStoreHolder
[ https://issues.apache.org/jira/browse/FLINK-37093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingcan Wang updated FLINK-37093: - Description: here's sql-client (1.20 & 2.0-preview1): *Flink SQL> create catalog cat1;* {color:#ff}[ERROR] Could not execute SQL statement. Reason:{color} {color:#ff}org.apache.flink.table.api.ValidationException: Catalog options do not contain an option key 'type' for discovering a catalog.{color} *Flink SQL> show catalogs;* +-+ | catalog name| +-+ | cat1| |default_catalog| +-+ 2 rows in set *Flink SQL> use catalog cat1;* {color:#ff}[ERROR] Could not execute SQL statement. Reason:{color} {color:#ff}org.apache.flink.table.api.ValidationException: Catalog options do not contain an option key 'type' for discovering a catalog.{color} *Flink SQL> alter catalog cat1 comment 'no type';* {color:#ff}[ERROR] Could not execute SQL statement. Reason:{color} {color:#ff}org.apache.flink.table.api.ValidationException: Catalog options do not contain an option key 'type' for discovering a catalog.{color} *Flink SQL> create catalog cat1;* {color:#ff}[ERROR] Could not execute SQL statement. Reason:{color} {color:#ff}org.apache.flink.table.catalog.exceptions.CatalogException: Catalog cat1 already exists.{color} *Flink SQL> drop catalog cat1;* [INFO] Execute statement succeeded. *Flink SQL> show catalogs;* +-+ | catalog name| +-+ |default_catalog| +-+ 1 row in set When I create a catalog without a type, the SQL is parsed successfully. According to the code: {code:java} public void createCatalog( String catalogName, CatalogDescriptor catalogDescriptor, boolean ignoreIfExists) throws CatalogException { checkArgument( !StringUtils.isNullOrWhitespaceOnly(catalogName), "Catalog name cannot be null or empty."); checkNotNull(catalogDescriptor, "Catalog descriptor cannot be null"); boolean catalogExistsInStore = catalogStoreHolder.catalogStore().contains(catalogName); boolean catalogExistsInMemory = catalogs.containsKey(catalogName); if (catalogExistsInStore || catalogExistsInMemory) { if (!ignoreIfExists) { throw new CatalogException(format("Catalog %s already exists.", catalogName)); } } else { // Store the catalog in the catalog store catalogStoreHolder.catalogStore().storeCatalog(catalogName, catalogDescriptor); // Initialize and store the catalog in memory Catalog catalog = initCatalog(catalogName, catalogDescriptor); catalog.open(); catalogs.put(catalogName, catalog); } }{code} the catalog is stored in the catalog store first, and then fails in the initCatalog() method. The result is that the catalog still exists in catalogStoreHolder, and I cannot use, alter, or create it again. I noticed the alterCatalog() method; the process is: {code:java} Catalog newCatalog = initCatalog(catalogName, newDescriptor); catalogStore.removeCatalog(catalogName, false); if (catalogs.containsKey(catalogName)) { catalogs.get(catalogName).close(); } newCatalog.open(); catalogs.put(catalogName, newCatalog); catalogStoreHolder.catalogStore().storeCatalog(catalogName, newDescriptor); {code} Storing the catalog in catalogStoreHolder should happen after initCatalog() and open(). -- This message was sent by Atlassian Jira (v8.20.10#820010)
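A minimal, self-contained sketch of the reordering the reporter proposes; the types are simplified stand-ins for Flink's CatalogManager internals. Validate and open the catalog before persisting it, so a failed `initCatalog()` leaves no orphan entry in the store:

```java
import java.util.HashMap;
import java.util.Map;

interface Catalog {
    void open();
}

interface CatalogFactory {
    // May throw, e.g. when the 'type' option is missing.
    Catalog initCatalog(String name, String descriptor);
}

class CatalogManagerSketch {
    private final Map<String, Catalog> catalogs = new HashMap<>();
    private final Map<String, String> catalogStore = new HashMap<>();

    void createCatalog(String name, String descriptor, CatalogFactory factory) {
        if (catalogStore.containsKey(name) || catalogs.containsKey(name)) {
            throw new IllegalStateException("Catalog " + name + " already exists.");
        }
        // Initialize and open first: if this throws, nothing has been
        // persisted and the name can be reused.
        Catalog catalog = factory.initCatalog(name, descriptor);
        catalog.open();
        catalogs.put(name, catalog);
        // Persist only after successful initialization, mirroring the order
        // that alterCatalog() already uses.
        catalogStore.put(name, descriptor);
    }
}
```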
Re: [PR] [FLINK-36794] [cdc-composer/cli] pipeline cdc connector support multiple data sources [flink-cdc]
ChaomingZhangCN commented on code in PR #3844: URL: https://github.com/apache/flink-cdc/pull/3844#discussion_r1910131067 ## docs/content.zh/docs/connectors/pipeline-connectors/mysql.md: ## @@ -77,6 +77,32 @@ pipeline: parallelism: 4 ``` +## Multiple data source example + +A pipeline that reads data from multiple MySQL instances and syncs it to Doris can be defined as follows: + +```yaml +source: + type: mysql_mutiple Review Comment: ```yaml sources: - type: mysql name: mysql-instance-00 hostname: localhost port: 3306 - type: mysql name: mysql-instance-01 hostname: localhost port: 3307 ``` And the corresponding PipelineDef looks like this: ```java public class PipelineDef { @Nullable private List<SourceDef> sources; private final SourceDef source; ... } ``` If `sources` is not null then we use these data sources, otherwise we use `source` to build up the DataStream. In this way, the previous usage will not be affected. I want to hear your opinion. @yuxiqian -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-37093) The catalog that failed validation due to no type still exists in catalogStoreHolder
[ https://issues.apache.org/jira/browse/FLINK-37093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911817#comment-17911817 ] Mingcan Wang commented on FLINK-37093: -- can someone assign this issue to me ? thanks :) > The catalog that failed validation due to no type still exists in > catalogStoreHolder > > > Key: FLINK-37093 > URL: https://issues.apache.org/jira/browse/FLINK-37093 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.20.0, 2.0-preview > Environment: mac os >Reporter: Mingcan Wang >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 1.20.0 > > > here's sql-client (1.20 & 2.0-preview1): > > *Flink SQL> create catalog cat1;* > {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color} > {color:#FF}org.apache.flink.table.api.ValidationException: Catalog > options do not contain an option key 'type' for discovering a catalog.{color} > *Flink SQL> show catalogs;* > +-+ > | catalog name | > +-+ > | cat1 | > | default_catalog | > +-+ > 2 rows in set > *Flink SQL> use catalog cat1;* > {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color} > {color:#FF}org.apache.flink.table.api.ValidationException: Catalog > options do not contain an option key 'type' for discovering a catalog.{color} > *Flink SQL> alter catalog cat1 comment 'no type';* > {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color} > {color:#FF}org.apache.flink.table.api.ValidationException: Catalog > options do not contain an option key 'type' for discovering a catalog.{color} > *Flink SQL> create catalog cat1;* > {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color} > {color:#FF}org.apache.flink.table.catalog.exceptions.CatalogException: > Catalog cat1 already exists.{color} > *Flink SQL> drop catalog cat1;* > [INFO] Execute statement succeeded. > *Flink SQL> show catalogs;* > +-+ > | catalog name | > +-+ > | default_catalog | > +-+ > 1 row in set > > When i create a catalog without type , the sql will parsed successfully. > According to the code : > ```java > public void createCatalog( > String catalogName, CatalogDescriptor catalogDescriptor, boolean > ignoreIfExists) > throws CatalogException { > checkArgument( > !StringUtils.isNullOrWhitespaceOnly(catalogName), > "Catalog name cannot be null or empty."); > checkNotNull(catalogDescriptor, "Catalog descriptor cannot be null"); > boolean catalogExistsInStore = > catalogStoreHolder.catalogStore().contains(catalogName); > boolean catalogExistsInMemory = catalogs.containsKey(catalogName); > if (catalogExistsInStore || catalogExistsInMemory) { > if (!ignoreIfExists) { > throw new CatalogException(format("Catalog %s already exists.", catalogName)); > } > } else { > // Store the catalog in the catalog store > catalogStoreHolder.catalogStore().storeCatalog(catalogName, > catalogDescriptor); > // Initialize and store the catalog in memory > Catalog catalog = initCatalog(catalogName, catalogDescriptor); > catalog.open(); > catalogs.put(catalogName, catalog); > } > } > ``` > the catalog will store in the catalog store first, then failed in > initCatalog() method. > the result is the catalog still exists in catalogStoreHolder. And I cannot > use, alter or create it again. 
> I noticed the alterCatalog() method; the process is: > ```java > Catalog newCatalog = initCatalog(catalogName, newDescriptor); > catalogStore.removeCatalog(catalogName, false); > if (catalogs.containsKey(catalogName)) { > catalogs.get(catalogName).close(); > } > newCatalog.open(); > catalogs.put(catalogName, newCatalog); > catalogStoreHolder.catalogStore().storeCatalog(catalogName, newDescriptor); > ``` > Storing in catalogStoreHolder should happen after initCatalog() and open(). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
lsyldliu opened a new pull request, #25947: URL: https://github.com/apache/flink/pull/25947 ## What is the purpose of the change *Remove deprecated interface CatalogLock* ## Brief change log - *Remove deprecated interface CatalogLock* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-30064) Move existing Hive connector code from Flink repo to dedicated Hive repo
[ https://issues.apache.org/jira/browse/FLINK-30064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911818#comment-17911818 ] xuyang commented on FLINK-30064: Hi, should we sync the new commits from the Hive connector module in the Flink repo into the separate Hive connector repo? I'm mentioning this because I removed a lot of deprecated code in flink 2.0, which may impact the Hive connector. > Move existing Hive connector code from Flink repo to dedicated Hive repo > > > Key: FLINK-30064 > URL: https://issues.apache.org/jira/browse/FLINK-30064 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Hive >Reporter: Martijn Visser >Assignee: Sergey Nuyanzin >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-37091) Remove deprecated CatalogLock interface
[ https://issues.apache.org/jira/browse/FLINK-37091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-37091: --- Labels: pull-request-available (was: ) > Remove deprecated CatalogLock interface > --- > > Key: FLINK-37091 > URL: https://issues.apache.org/jira/browse/FLINK-37091 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: dalongliu >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-36899][state/forst] Introduce metrics for forst cache [flink]
fredia commented on code in PR #25884: URL: https://github.com/apache/flink/pull/25884#discussion_r1909979879 ## flink-state-backends/flink-statebackend-forst/src/main/java/org/apache/flink/state/forst/fs/ForStFlinkFileSystem.java: ## @@ -119,7 +121,11 @@ public static FileBasedCache getFileBasedCache( new File(cacheBase.toString()), cacheReservedSize, SST_FILE_SIZE); } return new FileBasedCache( -Integer.MAX_VALUE, cacheLimitPolicy, cacheBase.getFileSystem(), cacheBase); +Integer.MAX_VALUE, +cacheLimitPolicy, +cacheBase.getFileSystem(), Review Comment: I will remove `cacheBase.getFileSystem()`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-36794] [cdc-composer/cli] pipeline cdc connector support multiple data sources [flink-cdc]
linjianchang commented on code in PR #3844: URL: https://github.com/apache/flink-cdc/pull/3844#discussion_r1910064891 ## flink-cdc-composer/src/main/java/org/apache/flink/cdc/composer/flink/FlinkPipelineComposer.java: ## @@ -126,16 +127,28 @@ private void translate(StreamExecutionEnvironment env, PipelineDef pipelineDef) // And required constructors OperatorIDGenerator schemaOperatorIDGenerator = new OperatorIDGenerator(schemaOperatorTranslator.getSchemaOperatorUid()); -DataSource dataSource = -sourceTranslator.createDataSource(pipelineDef.getSource(), pipelineDefConfig, env); +List<SourceDef> sourceDefs = pipelineDef.getSources(); +//DataSource dataSource = +//sourceTranslator.createDataSource(sourceDefs, pipelineDefConfig, env); DataSink dataSink = sinkTranslator.createDataSink(pipelineDef.getSink(), pipelineDefConfig, env); - -boolean isParallelMetadataSource = dataSource.isParallelMetadataSource(); - // O ---> Source -DataStream<Event> stream = -sourceTranslator.translate(pipelineDef.getSource(), dataSource, env, parallelism); +DataStream<Event> stream = null; +DataSource dataSource = null; +for (SourceDef sourceDef : sourceDefs) { +dataSource = sourceTranslator.createDataSource(sourceDef, pipelineDefConfig, env); +DataStream<Event> streamBranch = +sourceTranslator.translate(sourceDef, dataSource, env, parallelism); +if (stream == null) { +stream = streamBranch; +} else { +stream = stream.union(streamBranch); +} +} +boolean isParallelMetadataSource = dataSource.isParallelMetadataSource(); Review Comment: Already modified -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
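The review point on `isParallelMetadataSource` in the diff above generalizes to: when several sources are merged, a per-pipeline flag should be an aggregate over all sources rather than the value of whichever source the loop created last. A small, self-contained sketch; whether `allMatch` or `anyMatch` is the intended semantics is an assumption here:

```java
import java.util.List;

interface DataSource {
    boolean isParallelMetadataSource();
}

final class SourceFlagAggregation {
    private SourceFlagAggregation() {}

    // Derive the pipeline-level flag from every merged source, not just the
    // last one assigned inside the source-building loop.
    static boolean isParallelMetadataSource(List<DataSource> sources) {
        return sources.stream().allMatch(DataSource::isParallelMetadataSource);
    }
}
```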
Re: [PR] [FLINK-36794] [cdc-composer/cli] pipeline cdc connector support multiple data sources [flink-cdc]
linjianchang commented on code in PR #3844: URL: https://github.com/apache/flink-cdc/pull/3844#discussion_r1910065232 ## docs/content.zh/docs/connectors/pipeline-connectors/mysql.md: ## @@ -77,6 +77,32 @@ pipeline: parallelism: 4 ``` +## 多数据源示例 + +单数据源,从多个 MySQL 读取数据同步到 Doris 的 Pipeline 可以定义如下: + +```yaml +source: + type: mysql_mutiple Review Comment: Already modified ## flink-cdc-cli/src/main/java/org/apache/flink/cdc/cli/parser/YamlPipelineDefinitionParser.java: ## @@ -97,6 +97,13 @@ public class YamlPipelineDefinitionParser implements PipelineDefinitionParser { public static final String TRANSFORM_TABLE_OPTION_KEY = "table-options"; +private static final String HOST_LIST = "host_list"; +private static final String COMMA = ","; +private static final String HOST_NAME = "hostname"; +private static final String PORT = "port"; +private static final String COLON = ":"; +private static final String MUTIPLE = "_mutiple"; Review Comment: Already modified -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-37096) If the MongoDB connector is implemented with DebeziumSourceFunction, the curve will appear in the full phase delay curve showing 55year
[ https://issues.apache.org/jira/browse/FLINK-37096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911874#comment-17911874 ] sukang edited comment on FLINK-37096 at 1/10/25 9:22 AM: - I will do it.Just by learning from this, it may be slow to repair. was (Author: JIRAUSER308130): I wll do it.Just by learning from this, it may be slow to repair. > If the MongoDB connector is implemented with DebeziumSourceFunction, the > curve will appear in the full phase delay curve showing 55year > --- > > Key: FLINK-37096 > URL: https://issues.apache.org/jira/browse/FLINK-37096 > Project: Flink > Issue Type: Improvement > Components: Connectors / MongoDB >Affects Versions: cdc-3.2.1 >Reporter: sukang >Priority: Minor > Original Estimate: 612h > Remaining Estimate: 612h > > emitDelay =isInDbSnapshotPhase ? 0L : System.currentTimeMillis() - > messageTimestamp; > In the full phase, messageTimestamp is 0 which will cause an exception here > when we compute it. > !https://intranetproxy.alipay.com/skylark/lark/0/2025/png/109956362/1736480972716-696877e2-e202-454d-bee4-947944159891.png|width=659,height=258,id=u750186d3! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-37096) If the MongoDB connector is implemented with DebeziumSourceFunction, the curve will appear in the full phase delay curve showing 55year
[ https://issues.apache.org/jira/browse/FLINK-37096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911876#comment-17911876 ] sukang commented on FLINK-37096: assign to you > If the MongoDB connector is implemented with DebeziumSourceFunction, the > curve will appear in the full phase delay curve showing 55year > --- > > Key: FLINK-37096 > URL: https://issues.apache.org/jira/browse/FLINK-37096 > Project: Flink > Issue Type: Improvement > Components: Connectors / MongoDB >Affects Versions: cdc-3.2.1 >Reporter: sukang >Priority: Minor > Original Estimate: 612h > Remaining Estimate: 612h > > emitDelay =isInDbSnapshotPhase ? 0L : System.currentTimeMillis() - > messageTimestamp; > In the full phase, messageTimestamp is 0 which will cause an exception here > when we compute it. > !https://intranetproxy.alipay.com/skylark/lark/0/2025/png/109956362/1736480972716-696877e2-e202-454d-bee4-947944159891.png|width=659,height=258,id=u750186d3! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-37081) The debezium version of sqlserver cdc needs to be upgraded to 2.6 or higher
[ https://issues.apache.org/jira/browse/FLINK-37081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhang closed FLINK-37081. - Resolution: Later > The debezium version of sqlserver cdc needs to be upgraded to 2.6 or higher > --- > > Key: FLINK-37081 > URL: https://issues.apache.org/jira/browse/FLINK-37081 > Project: Flink > Issue Type: Improvement > Components: Flink CDC > Environment: Microsoft SQL Server 2016 (SP1-CU15) >Reporter: zhang >Assignee: zhang >Priority: Major > Attachments: image-2025-01-09-14-34-32-514.png > > > The SQL used to get CDC data before Debezium version 2.6 was > {code:java} > SELECT *# FROM [#db].cdc.[fn_cdc_get_all_changes_#](?, ?, N'all update old') > order by [__$start_lsn] ASC, [__$seqval] ASC, [__$operation] ASC {code} > The “order by [__$start_lsn] ASC, [__$seqval] ASC, [__$operation] ASC” part > of the ordering is not necessary; the results of “fn_cdc_get_all_changes” > are sorted by “__$start_lsn, __$command_id, __$seqval, __$operation” by > default. > If the cdc table has a large amount of data, this extra sorting can cause > serious performance problems. > This redundant sorting has been removed in Debezium version 2.6 and above > !image-2025-01-09-14-34-32-514.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-28897] [TABLE-SQL] Fail to use udf in added jar when enabling checkpoint [flink]
ammu20-dev commented on PR #25656: URL: https://github.com/apache/flink/pull/25656#issuecomment-2582186905 > As per the comments please could you investigate: > > * potential regressions > * the loss of the user class loader > * whether existing option could solve this without a code change @davidradl > * whether existing option could solve this without a code change Existing options might not help us in solving the issue, as stated in this [comment](https://github.com/apache/flink/pull/25656#discussion_r1904076204) > * the loss of the user class loader I noticed that at this line here https://github.com/apache/flink/blob/4b306699811af0b6ff2cb862914adfda56345996/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java#L2472 the class loader expected as per the naming convention is the user class loader. But while debugging the issue I found that the class loader arriving here is the App class loader and not the user class loader. In order to pull in the user class loader, I am consuming the user class loader from resourceManager like [this](https://github.com/apache/flink/blob/dfd9e0e644a5ee9d0d3238c9f03a2942dc4a52e6/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java#L1036) and setting it as the thread context class loader, which is later used [here](https://github.com/apache/flink/blob/dfd9e0e644a5ee9d0d3238c9f03a2942dc4a52e6/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java#L2472). This is how the issue was fixed. > * potential regressions Analysing the code, I can confirm that the scope of this fix is limited to the SQL execution path, and I see minimal possibility of regressions. Also, I am restoring the thread class loader back to the original class loader after execution of the SQL query ([here](https://github.com/apache/flink/blob/dfd9e0e644a5ee9d0d3238c9f03a2942dc4a52e6/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java#L1063)) so that it doesn't have any further impact. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
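The fix described above follows a common save-and-restore pattern for the thread context class loader. A generic, self-contained sketch; the names are illustrative, not Flink's actual code:

```java
public final class ClassLoaderScope {
    private ClassLoaderScope() {}

    /**
     * Runs {@code action} with {@code userClassLoader} installed as the
     * thread context class loader, restoring the previous loader afterwards.
     */
    public static void runWithClassLoader(ClassLoader userClassLoader, Runnable action) {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        current.setContextClassLoader(userClassLoader);
        try {
            action.run();
        } finally {
            // Restore so the user class loader does not leak into later work
            // performed on this thread.
            current.setContextClassLoader(original);
        }
    }
}
```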
Re: [PR] [FLINK-36979][rpc] Reverting pekko version bump in Flink 1.20 [flink]
He-Pin commented on PR #25866: URL: https://github.com/apache/flink/pull/25866#issuecomment-2582436133 @davidradl Is there any update on the investigation from your side? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] [hotfix][docs] Typo fix in RBAC docs [flink-kubernetes-operator]
gunnarmorling opened a new pull request, #931: URL: https://github.com/apache/flink-kubernetes-operator/pull/931 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-37011] Improve get source field value by column name in PreTransformProcessor. [flink-cdc]
yuxiqian commented on code in PR #3836: URL: https://github.com/apache/flink-cdc/pull/3836#discussion_r1910216599 ## flink-cdc-runtime/src/main/java/org/apache/flink/cdc/runtime/operators/transform/PreTransformProcessor.java: ## @@ -62,30 +62,18 @@ public CreateTableEvent preTransformCreateTableEvent(CreateTableEvent createTabl public BinaryRecordData processFillDataField(BinaryRecordData data) { List<Object> valueList = new ArrayList<>(); List<Column> columns = tableChangeInfo.getPreTransformedSchema().getColumns(); -for (int i = 0; i < columns.size(); i++) { -valueList.add( -getValueFromBinaryRecordData( -columns.get(i).getName(), -data, -tableChangeInfo.getSourceSchema().getColumns(), -tableChangeInfo.getSourceFieldGetters())); +Map<String, RecordData.FieldGetter> sourceFieldGetters = +tableChangeInfo.getSourceFieldGetters(); +for (Column column : columns) { +RecordData.FieldGetter fieldGetter = sourceFieldGetters.get(column.getName()); +if (fieldGetter != null) { Review Comment: I think `TableChangeInfo` should be able to guarantee the invariant that for each column in `preTransformedSchema`, there is a corresponding `fieldGetter` available. ## flink-cdc-runtime/src/main/java/org/apache/flink/cdc/runtime/operators/transform/PreTransformChangeInfo.java: ## @@ -59,7 +61,11 @@ public PreTransformChangeInfo( this.tableId = tableId; this.sourceSchema = sourceSchema; this.preTransformedSchema = preTransformedSchema; -this.sourceFieldGetters = sourceFieldGetters; +this.sourceFieldGetters = new HashMap<>(sourceSchema.getColumnCount()); +for (int i = 0; i < sourceSchema.getColumns().size(); i++) { +this.sourceFieldGetters.put( +sourceSchema.getColumns().get(i).getName(), sourceFieldGetters[i]); +} Review Comment: Maybe `this.sourceFieldGetters` -> `this.sourceFieldGettersMap`, as it now has a different type and meaning from the constructor parameter. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
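A self-contained sketch of the lookup-by-name refactor under discussion, with the reviewer's invariant made explicit; the generic type `G` stands in for `RecordData.FieldGetter`:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FieldGetterIndex<G> {
    private final Map<String, G> gettersByName = new HashMap<>();

    FieldGetterIndex(List<String> sourceColumnNames, List<G> sourceGetters) {
        // Build the name -> getter map once, instead of scanning the source
        // columns for every field of every record.
        for (int i = 0; i < sourceColumnNames.size(); i++) {
            gettersByName.put(sourceColumnNames.get(i), sourceGetters.get(i));
        }
    }

    // Per the review note: every pre-transformed column must have a getter,
    // so a miss indicates a broken invariant rather than a normal case.
    G getterFor(String columnName) {
        G getter = gettersByName.get(columnName);
        if (getter == null) {
            throw new IllegalStateException("No field getter for column " + columnName);
        }
        return getter;
    }
}
```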
Re: [PR] [FLINK-36506][table] Remove all deprecated methods in ColumnStats [flink]
xuyangzhong commented on code in PR #25950: URL: https://github.com/apache/flink/pull/25950#discussion_r1910221269 ## flink-table/flink-table-common/src/main/java/org/apache/flink/table/plan/stats/ColumnStats.java: ## @@ -118,40 +81,10 @@ public Integer getMaxLen() { return maxLen; } -/** - * Deprecated because Number type max/min is not well supported comparable type, e.g. {@link - * java.util.Date}, {@link java.sql.Timestamp}. - * - * Returns null if this instance is constructed by {@link ColumnStats.Builder}. - */ -@Deprecated -public Number getMaxValue() { -return maxValue; -} - -/** Review Comment: Yes. Because the legacy constructor `{@link ColumnStats#ColumnStats(Long, Long, Double, Integer, Number, Number)}` has been removed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-36964] Fix potential exception when SchemaChange in parallel w… [flink-cdc]
lvyanquan commented on PR #3818: URL: https://github.com/apache/flink-cdc/pull/3818#issuecomment-2582453742 Rebased to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-36608) Support dynamic StreamGraph optimization for AdaptiveBroadcastJoinOperator
[ https://issues.apache.org/jira/browse/FLINK-36608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-36608. --- Resolution: Done 5d0cb98fe7843796e6ca0598060839f2b48f0882 627395acf46b8b38ee2c8db2cb6ba31849db9f27 80a79bb5f6320a97e4051cba6e22c151dc9603cf > Support dynamic StreamGraph optimization for AdaptiveBroadcastJoinOperator > -- > > Key: FLINK-36608 > URL: https://issues.apache.org/jira/browse/FLINK-36608 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Affects Versions: 2.0.0 >Reporter: xingbe >Assignee: xingbe >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > Optimize the AdaptiveBroadcastJoinOperator dynamically at runtime by > transforming qualifying joins into broadcasts. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-36506][table] Remove all deprecated methods in ColumnStats [flink]
davidradl commented on code in PR #25950: URL: https://github.com/apache/flink/pull/25950#discussion_r1910216285 ## flink-table/flink-table-common/src/main/java/org/apache/flink/table/plan/stats/ColumnStats.java: ## @@ -118,40 +81,10 @@ public Integer getMaxLen() { return maxLen; } -/** - * Deprecated because Number type max/min is not well supported comparable type, e.g. {@link - * java.util.Date}, {@link java.sql.Timestamp}. - * - * Returns null if this instance is constructed by {@link ColumnStats.Builder}. - */ -@Deprecated -public Number getMaxValue() { -return maxValue; -} - -/** Review Comment: It looks like you have removed the javadoc associated with the `getMaxValue` method -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-36494][table] Remove deprecated method Catalog#getTableFactory [flink]
davidradl commented on code in PR #25948: URL: https://github.com/apache/flink/pull/25948#discussion_r1910227028 ## flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/factories/TableFactoryUtil.java: ## @@ -70,28 +66,14 @@ public static TableSource findAndCreateTableSource(TableSourceFactory.Con */ @SuppressWarnings("unchecked") public static TableSource findAndCreateTableSource( -@Nullable Catalog catalog, ObjectIdentifier objectIdentifier, CatalogTable catalogTable, ReadableConfig configuration, boolean isTemporary) { TableSourceFactory.Context context = new TableSourceFactoryContextImpl( objectIdentifier, catalogTable, configuration, isTemporary); -Optional factoryOptional = -catalog == null ? Optional.empty() : catalog.getTableFactory(); Review Comment: A question: why is it safe to delete this call rather than using getFactory() as per the deprecation comments in getTableFactory. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
davidradl commented on PR #25947: URL: https://github.com/apache/flink/pull/25947#issuecomment-2582466083 I see in HiveDynamicTableFactory that there are references to RequireCatalogLock -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-37097) Remove
david radley created FLINK-37097: Summary: Remove Key: FLINK-37097 URL: https://issues.apache.org/jira/browse/FLINK-37097 Project: Flink Issue Type: Technical Debt Components: Connectors / Hive Affects Versions: 2.0.0 Reporter: david radley As per [https://github.com/apache/flink/pull/25947] the Hive code has been externalized into a new repo. We should remove flink-connector-hive and the flink-sql-connector-hive-* maven modules from core flink . I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [FLINK-36910][table] Add function call syntax support to time-related dynamic functions [flink]
gustavodemorais opened a new pull request, #25951: URL: https://github.com/apache/flink/pull/25951 ## What is the purpose of the change Enables dynamic time-related functions to be called with or without parentheses ## Brief change log - Add function call syntax for dynamic functions - Support null inputTable for sql from tableapi - Add support for BuiltInFunction test in batch mode - Add TimeFunctionsBatchModeITCase and testSelectDynamicDatetimeFunctions ## Verifying this change - Added TimeFunctionsBatchModeITCase and testSelectDynamicDatetimeFunctions to test all modified functions ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: don't know - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
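For illustration, the change should allow both spellings of a time-related function to parse. This is a sketch of the intended behavior, not output from the PR's tests:

```sql
-- Without function-call syntax (already supported):
SELECT CURRENT_TIMESTAMP;
-- With function-call syntax (enabled by this change):
SELECT CURRENT_TIMESTAMP();
```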
Re: [PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
lsyldliu commented on PR #25947: URL: https://github.com/apache/flink/pull/25947#issuecomment-2582517352 > I just checked the external Hive connector and that refers to the class we are removing. I think we need a plan for the Hive connector before removing it from core Flink We can remove the related code when we bump the Flink version to 2.0 in the Hive connector. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-37097) Remove
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dalongliu reassigned FLINK-37097: - Assignee: david radley > Remove > --- > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove flink-connector-hive and the > flink-sql-connector-hive-* maven modules from core flink . > > I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
davidradl commented on PR #25947: URL: https://github.com/apache/flink/pull/25947#issuecomment-2582518192 I have raised https://issues.apache.org/jira/browse/FLINK-37097 to track the removal of the Hive connector from core flink, -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
lsyldliu commented on PR #25947: URL: https://github.com/apache/flink/pull/25947#issuecomment-2582519779 > I have raised https://issues.apache.org/jira/browse/FLINK-37097 to track the removal of the Hive connector from core flink, Good job, assign to you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
[ https://issues.apache.org/jira/browse/FLINK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911929#comment-17911929 ] xuyang commented on FLINK-37094: [~Sergey Nuyanzin] Okay... IIUC we can adapt, test, and release the Hive connector after releasing flink 2.0.0, similar to other external connectors, and then remove it from flink repo in 2.1 (I believe it's likely that there won't be enough time to delete it in 2.0...). What do you think? When the time comes for the Hive connector to adapt to 2.0.0 where the deprecated table APIs have been removed, I can assist with that. > Adapt the external Hive connector to the removal of deprecated APIs in table > module in 2.0 > -- > > Key: FLINK-37094 > URL: https://issues.apache.org/jira/browse/FLINK-37094 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Reporter: xuyang >Priority: Major > > Since the Hive connector is scheduled to be removed from the Flink repo, and > the compilation of the remaining Hive connector module within the Flink repo > has been skipped, all deprecation work related to the table module performed > in version 2.0 needs to be synced to the external Hive connector's separate > repository. > There will be no further modifications to the remaining Hive connector module > in the Flink repo, as any changes made cannot be guaranteed to be validated > through CI. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-36698][pipeline-connector][elasticsearch] Elasticsearch pipeline sink support authentication [flink-cdc]
lvyanquan commented on PR #3728: URL: https://github.com/apache/flink-cdc/pull/3728#issuecomment-2582478475 Hi @beryllw, would it be convenient to add a test to verify that this authentication is effective? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-37097) Remove
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Nuyanzin closed FLINK-37097. --- Resolution: Duplicate > Remove > --- > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove flink-connector-hive and the > flink-sql-connector-hive-* maven modules from core flink . > > I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-37097) Remove
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911932#comment-17911932 ] Sergey Nuyanzin commented on FLINK-37097: - This is a duplicate of https://issues.apache.org/jira/browse/FLINK-33786 > Remove > --- > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove flink-connector-hive and the > flink-sql-connector-hive-* maven modules from core flink . > > I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-3154][API] Upgrade from Kryo 2.x to Kryo 5.x. Removed twitter … [flink]
davidradl commented on PR #25896: URL: https://github.com/apache/flink/pull/25896#issuecomment-258276 @kurtostfeld on your suggestion - I have raised this on the dev list. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
xuyangzhong commented on PR #25947: URL: https://github.com/apache/flink/pull/25947#issuecomment-2582796136 @davidradl FYI, we can ignore all classes in the Hive connector module in the Flink repo. You can find more details here: https://issues.apache.org/jira/browse/FLINK-37094 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] [FLINK-37100][tests] Fix `test_netty_shuffle_memory_control.sh` in CI for JDK11+ [flink]
ferenc-csaky opened a new pull request, #25955: URL: https://github.com/apache/flink/pull/25955 ## What is the purpose of the change Fixes the test executed by `test_netty_shuffle_memory_control.sh`, which can fail the CI when Netty4 cannot reserve enough memory and Pekko is therefore unable to start up. ## Brief change log - Set `io.netty.tryReflectionSetAccessible=true` for TMs, which reduces the Netty4 memory footprint. ## Verifying this change The existing test should run consistently. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? handled in different PR -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
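One plausible way such a JVM property reaches TaskManager processes is via the JVM options in the Flink configuration. This is a sketch; the PR may wire it differently, e.g. directly in the test script:

```yaml
# Reduces Netty4's direct-memory footprint on JDK11+ so Pekko can start up.
env.java.opts.taskmanager: "-Dio.netty.tryReflectionSetAccessible=true"
```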
[jira] [Updated] (FLINK-37100) Fix test_netty_shuffle_memory_control.sh in CI for JDK11+
[ https://issues.apache.org/jira/browse/FLINK-37100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-37100: --- Labels: pull-request-available (was: ) > Fix test_netty_shuffle_memory_control.sh in CI for JDK11+ > - > > Key: FLINK-37100 > URL: https://issues.apache.org/jira/browse/FLINK-37100 > Project: Flink > Issue Type: Improvement >Reporter: Ferenc Csaky >Assignee: Ferenc Csaky >Priority: Major > Labels: pull-request-available > Fix For: 1.19.2, 1.20.1 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-36979][rpc] Reverting pekko version bump in Flink 1.20 [flink]
ferenc-csaky commented on PR #25866: URL: https://github.com/apache/flink/pull/25866#issuecomment-2582907004 Opened https://github.com/apache/flink/pull/25955 which I believe should supersede this current PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-36753) Adaptive Scheduler actively triggers a Checkpoint
[ https://issues.apache.org/jira/browse/FLINK-36753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911938#comment-17911938 ] Samrat Deb commented on FLINK-36753: Thank you [~fanrui] for the input. For 1: it makes sense to unify scale-up and scale-down with the active trigger. For 3: I'm on board with your thought not to trigger anything when a checkpoint is already running. Regarding point 2: it would be great to hear other opinions and how the community thinks about it. > Adaptive Scheduler actively triggers a Checkpoint > - > > Key: FLINK-36753 > URL: https://issues.apache.org/jira/browse/FLINK-36753 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 2.0-preview >Reporter: Rui Fan >Assignee: Samrat Deb >Priority: Major > > FLIP-461[1] and FLINK-35549[2] support that rescale could be executed after > the next completed checkpoint. It greatly reduces the amount of data replay > after rescale. > In FLIP-461, Adaptive Scheduler waits for the next periodic checkpoint to be > triggered. In most scenarios, a more efficient solution might be Adaptive > Scheduler actively triggers a Checkpoint after all resources are > ready(Technically desire resources are ready). > The idea comes from an offline discussion between [~mxm] and [~fanrui]. > [1][https://cwiki.apache.org/confluence/display/FLINK/FLIP-461%3A+Synchronize+rescaling+with+checkpoint+creation+to+minimize+reprocessing+for+the+AdaptiveScheduler] > [2] https://issues.apache.org/jira/browse/FLINK-35549 -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-36910][table] Add function call syntax support to time-related dynamic functions [flink]
gustavodemorais commented on PR #25806: URL: https://github.com/apache/flink/pull/25806#issuecomment-2582505543 The reason for the failures is the planner changes I made to try to extend testing. They are not valid, and I'll discard them. We'll stick to the testDynamicDatetimeFunctions and testDynamicDatetimeFunctionsAreEqual tests. Closing this in favor of https://github.com/apache/flink/pull/25951 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-36910][table] Add function call syntax support to time-related dynamic functions [flink]
gustavodemorais closed pull request #25806: [FLINK-36910][table] Add function call syntax support to time-related dynamic functions URL: https://github.com/apache/flink/pull/25806 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-36576][runtime] Improving amount-based data balancing distribution algorithm for DefaultVertexParallelismAndInputInfosDecider [flink]
zhuzhurk commented on code in PR #25552: URL: https://github.com/apache/flink/pull/25552#discussion_r1910075634 ## flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/adaptivebatch/VertexInputInfoComputerTestUtil.java: ## @@ -86,22 +90,55 @@ public static List<BlockingInputInfo> createBlockingInputInfos( return blockingInputInfos; } -public static void checkJobVertexInputInfo( +public static void checkParallelism( int targetParallelism, -List<BlockingInputInfo> inputInfos, +Map<IntermediateDataSetID, JobVertexInputInfo> vertexInputInfoMap) { +vertexInputInfoMap +.values() +.forEach( +info -> + assertThat(info.getExecutionVertexInputInfos().size()) +.isEqualTo(targetParallelism)); +} + +public static void checkConsumedSubpartitionGroups( List<Map<IndexRange, IndexRange>> targetConsumedSubpartitionGroups, +List<BlockingInputInfo> inputInfos, Map<IntermediateDataSetID, JobVertexInputInfo> vertexInputInfoMap) { JobVertexInputInfo vertexInputInfo = checkAndGetJobVertexInputInfo(inputInfos, vertexInputInfoMap); List<ExecutionVertexInputInfo> executionVertexInputInfos = vertexInputInfo.getExecutionVertexInputInfos(); - assertThat(executionVertexInputInfos.size()).isEqualTo(targetParallelism); -for (int i = 0; i < targetParallelism; i++) { +for (int i = 0; i < executionVertexInputInfos.size(); i++) { assertThat(executionVertexInputInfos.get(i).getConsumedSubpartitionGroups()) .isEqualTo(targetConsumedSubpartitionGroups.get(i)); } } +public static void checkConsumedDataVolumePerSubtask( +long[] targetConsumedDataVolume, +List<BlockingInputInfo> inputInfos, +Map<IntermediateDataSetID, JobVertexInputInfo> vertexInputs) { +long[] consumedDataVolume = new long[targetConsumedDataVolume.length]; +for (BlockingInputInfo inputInfo : inputInfos) { +JobVertexInputInfo vertexInputInfo = vertexInputs.get(inputInfo.getResultId()); +List<ExecutionVertexInputInfo> executionVertexInputInfos = +vertexInputInfo.getExecutionVertexInputInfos(); +for (int i = 0; i < executionVertexInputInfos.size(); ++i) { +ExecutionVertexInputInfo executionVertexInputInfo = +executionVertexInputInfos.get(i); +consumedDataVolume[i] += + executionVertexInputInfo.getConsumedSubpartitionGroups().entrySet().stream() +.mapToLong( +entry -> +inputInfo.getNumBytesProduced( +entry.getKey(), entry.getValue())) +.sum(); +} +} +assertThat(consumedDataVolume).isEqualTo(targetConsumedDataVolume); +} + public static JobVertexInputInfo checkAndGetJobVertexInputInfo( Review Comment: can be private ## flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/adaptivebatch/VertexInputInfoComputerTestUtil.java: ## @@ -86,22 +90,55 @@ public static List<BlockingInputInfo> createBlockingInputInfos( return blockingInputInfos; } -public static void checkJobVertexInputInfo( +public static void checkParallelism( Review Comment: can be private ## flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/adaptivebatch/VertexInputInfoComputerTestUtil.java: ## @@ -0,0 +1,357 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ */ + +package org.apache.flink.runtime.scheduler.adaptivebatch; + +import org.apache.flink.runtime.executiongraph.ExecutionVertexInputInfo; +import org.apache.flink.runtime.executiongraph.IndexRange; +import org.apache.flink.runtime.executiongraph.JobVertexInputInfo; +import org.apache.flink.runtime.jobgraph.IntermediateDataSetID; + +import java.util.ArrayList; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.stream.Collectors; +import java.util.stream.IntStream; + +import static org.apache.flink.runtim
[jira] [Commented] (FLINK-37097) Remove Hive connector from core Flink
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911962 ] lincoln lee commented on FLINK-37097: - [~davidradl] We're hoping to see this happen, but currently we do not have enough resources to do so. Before the externalization could happen, [~yuxia] finished the work to decouple Hive from the Flink planner[1], then Martijn & Sergey kicked off the removal work, but unfortunately the release hadn't finished yet (this is why #23899 was not merged). Recently we've been moving the table-module-related cleanup work[2] forward and discovered that the Hive connector code in the main repo also needs to be adapted and cleaned up accordingly, hence the current situation with the Hive externalization. If [~Sergey Nuyanzin] is willing to re-prepare the release (adapted to flink-1.20?), we can at least participate in the vote, and continue to complete the corresponding flink-2.0 code adjustments in the Hive repo. # https://issues.apache.org/jira/browse/FLINK-26603 # https://issues.apache.org/jira/browse/FLINK-36476 > Remove Hive connector from core Flink > - > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove flink-connector-hive and the > flink-sql-connector-hive-* maven modules from core flink . > > I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [FLINK-37026] Add Stale PR GitHub Action workflow [flink]
tomncooper opened a new pull request, #25953: URL: https://github.com/apache/flink/pull/25953 ## What is the purpose of the change This is the implementation of the [Stale PR GitHub Action Proposal](https://cwiki.apache.org/confluence/display/FLINK/Stale+PR+Cleanup). This was [voted through](https://lists.apache.org/thread/kc90254wvo5q934doh8o4sbj1fgwvy76) on the dev mailing list. The initial setting is for 6 months of inactivity to mark a PR as stale, with 3 months to respond before the PR is closed. Ideally, once the initial backlog is cleared, we will reduce these down to 3 months and 1 month respectively, to match other Apache projects. There is also configuration for when the action is run and how many PRs it will process in one run. Initially I have set this to every 6 hours and 200 PRs per run, which will allow us to process the backlog in a few days. After that, these can be reduced to daily runs with a lower request limit (e.g. 50). ## Brief change log - Enable Stale PR GitHub action workflow. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
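A sketch of what such a workflow could look like with `actions/stale`, using the numbers from the description above; the labels, version pin, and other details are assumptions, not necessarily what the PR contains:

```yaml
name: Stale PRs
on:
  schedule:
    - cron: "0 */6 * * *" # every 6 hours while the backlog is processed
permissions:
  pull-requests: write
jobs:
  stale:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/stale@v9
        with:
          days-before-pr-stale: 180   # 6 months of inactivity
          days-before-pr-close: 90    # 3 more months to respond
          days-before-issue-stale: -1 # issues live in JIRA; never mark them
          days-before-issue-close: -1
          stale-pr-label: stale
          operations-per-run: 200
```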
Re: [PR] [FLINK-37026] Add Stale PR GitHub Action workflow [flink]
tomncooper commented on PR #25953: URL: https://github.com/apache/flink/pull/25953#issuecomment-2582840912 cc @gyfora @1996fanrui -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-37026) Enable Stale PR Github Action
[ https://issues.apache.org/jira/browse/FLINK-37026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-37026: --- Labels: pull-request-available (was: ) > Enable Stale PR Github Action > - > > Key: FLINK-37026 > URL: https://issues.apache.org/jira/browse/FLINK-37026 > Project: Flink > Issue Type: Improvement >Reporter: Thomas Cooper >Priority: Minor > Labels: pull-request-available > > As part of the Community Health Initiative (CHI) we made a > [proposal|https://cwiki.apache.org/confluence/display/FLINK/Stale+PR+Cleanup] > to clean up old PRs which have seen no activity in a long time (stale PRs). > After > [discussion|https://lists.apache.org/thread/6yoclzmvymxors8vlpt4nn9r7t3stcsz] > on the mailing list the proposal was > [voted|https://lists.apache.org/thread/kc90254wvo5q934doh8o4sbj1fgwvy76] > through. > We now need to enable the Stale PR GitHub action with a stale period of 6 > months, with 3 months to refresh (interact in any way with) the PR. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [FLINK-37090][e2e] Add adaptive broadcast join to e2e tpc-ds tests. [flink]
SinBex opened a new pull request, #25954: URL: https://github.com/apache/flink/pull/25954 ## What is the purpose of the change Add adaptive broadcast join to the e2e TPC-DS tests. In this case, we will set the AdaptiveBroadcastJoinStrategy to RUNTIME_ONLY, significantly increasing the likelihood of triggering adaptive broadcast join. ## Brief change log - *Add adaptive broadcast join to e2e tpc-ds tests.* ## Verifying this change This change is already covered by existing tests, such as *test_tpcds.sh*. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
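As a rough sketch, the setup this test exercises corresponds to cluster configuration entries like the following. The strategy option key is an assumption for illustration (the PR names only the AdaptiveBroadcastJoinStrategy enum value RUNTIME_ONLY); table.optimizer.join.broadcast-threshold is an existing planner option, which the related ticket sets to 0:

```yaml
# Sketch of config.yaml entries for the adaptive broadcast join TPC-DS run.
# NOTE: the strategy key below is assumed for illustration; only the enum
# value RUNTIME_ONLY is named in the PR description.
table.optimizer.adaptive-broadcast-join.strategy: runtime_only
# A threshold of 0 effectively disables compile-time broadcast joins, so any
# broadcast join observed in the run must have been introduced at runtime.
table.optimizer.join.broadcast-threshold: 0
```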
[jira] [Updated] (FLINK-37090) Introduce a TPC-DS E2E case for adaptive broadcast join
[ https://issues.apache.org/jira/browse/FLINK-37090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-37090: --- Labels: pull-request-available (was: ) > Introduce a TPC-DS E2E case for adaptive broadcast join > --- > > Key: FLINK-37090 > URL: https://issues.apache.org/jira/browse/FLINK-37090 > Project: Flink > Issue Type: Sub-task >Reporter: Junrui Lee >Assignee: xingbe >Priority: Major > Labels: pull-request-available > > Introduce a specific TPC-DS case for adaptive broadcast join. > In this case, we will set the AdaptiveBroadcastJoinStrategy to RUNTIME_ONLY > and configure the table.optimizer.join.broadcast-threshold to 0, > significantly increasing the likelihood of triggering adaptive broadcast join. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-37098] Fix selecting time attribute from a view [flink]
flinkbot commented on PR #25952: URL: https://github.com/apache/flink/pull/25952#issuecomment-2582853024 ## CI report: * fdf0105d32c8a346d9a890411098715bd297ddee UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-37026] Add Stale PR GitHub Action workflow [flink]
flinkbot commented on PR #25953: URL: https://github.com/apache/flink/pull/25953#issuecomment-2582853276 ## CI report: * 03ee68c53e6523046cbb64f9a2b6d15823edb659 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-37090][e2e] Add adaptive broadcast join to e2e tpc-ds tests. [flink]
flinkbot commented on PR #25954: URL: https://github.com/apache/flink/pull/25954#issuecomment-2582853466 ## CI report: * 27e503efb6c2fc7742a513cd166399a51312af67 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-37097) Remove Hive connector from core Flink
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911964#comment-17911964 ] david radley commented on FLINK-37097: -- [~Sergey Nuyanzin] it seems that: - V1 (1.20) still ships the Hive connector - master has decoupled the Hive connector from the connector pom, so it is not shipped - So if V2 is not shipping the Hive connector, we should be able to remove the code from master but keep it in the 1.20 v1 branch. WDYT? > Remove Hive connector from core Flink > - > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove flink-connector-hive and the > flink-sql-connector-hive-* maven modules from core Flink. > > I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-37097) Remove Hive connector from core Flink
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911962#comment-17911962 ] lincoln lee edited comment on FLINK-37097 at 1/10/25 1:24 PM: -- [~davidradl] We're hoping to see this happen, but currently we do not have enough resources to do so. Before the externalization could happen, [~yuxia] finished the work to decouple Hive from the Flink planner [1]; then Martijn & Sergey kicked off the removal work, but unfortunately the release hasn't finished yet (this is why #23899 was not merged). Recently we have been working on the table-module-related cleanup [2] and found that the Hive connector code in the main repo also needs to be adapted accordingly, which is how we arrived at the current situation with the Hive externalization. If [~Sergey Nuyanzin] is willing to re-prepare the release (adapted to flink-1.20?), we can at least participate in the vote and continue to complete the corresponding flink-2.0 code adjustments in the hive repo. # https://issues.apache.org/jira/browse/FLINK-26603 # https://issues.apache.org/jira/browse/FLINK-36476 was (Author: lincoln.86xy): [~davidradl] We're hoping to see this happen, but currently we do not have enough resources to do so. Before the externalization could happen, [~yuxia] finished the work to decouple Hive from the Flink planner [1]; then Martijn & Sergey kicked off the removal work, but unfortunately the release hasn't finished yet (this is why #23899 was not merged). Recently we have been working on the table-module-related cleanup [2] and discovered that the Hive connector code in the main repo also needs to be adapted and cleaned up accordingly, which is how we arrived at the current situation with the Hive externalization. If [~Sergey Nuyanzin] is willing to re-prepare the release (adapted to flink-1.20?), we can at least participate in the vote. We can then continue to complete the corresponding flink-2.0 code adjustments in the hive repo. # https://issues.apache.org/jira/browse/FLINK-26603 # https://issues.apache.org/jira/browse/FLINK-36476 > Remove Hive connector from core Flink > - > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove flink-connector-hive and the > flink-sql-connector-hive-* maven modules from core Flink. > > I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)