Re: [PR] [FLINK-36160] Support hive advanced configuration [flink]
luoyuxia commented on PR #25258: URL: https://github.com/apache/flink/pull/25258#issuecomment-2585115946 FYI, let's wait for the response to the question I asked in [FLINK-37097](https://issues.apache.org/jira/browse/FLINK-37097?focusedCommentId=17912172&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17912172) before merging this PR...
Re: [PR] [MINOR] Deprecated StreamSink and SinkOperator [flink]
beliefer commented on code in PR #25835: URL: https://github.com/apache/flink/pull/25835#discussion_r1911925092

flink-runtime/src/main/java/org/apache/flink/streaming/api/operators/StreamSink.java:
```
@@ -25,6 +25,7 @@
 import org.apache.flink.streaming.runtime.tasks.ProcessingTimeService;

 /** A {@link StreamOperator} for executing {@link SinkFunction SinkFunctions}. */
+@Deprecated
```
Review Comment:
```
@Internal
@Deprecated
public interface SinkFunction<IN> extends Function, Serializable {
```
[jira] [Commented] (FLINK-37097) Remove Hive connector from core Flink
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17912172#comment-17912172 ] luoyuxia commented on FLINK-37097: -- So the currently pending PRs for the Hive connector have three options: 1: Merge into the Flink repo and then wait to be ported to the external connector along with the other commits that have been merged into the Flink repo. 2: Merge into the external connector without waiting for the commits in the Flink repo to be ported to the external connector. 3: Block the pending PR until the commits in the Flink repo have been ported to the external connector, then merge the PR into the external repo. Which option should we follow given the current situation? > Remove Hive connector from core Flink > - > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove the flink-connector-hive and > flink-sql-connector-hive-* Maven modules from core Flink. > > I am happy to make this change if someone can assign me the Jira.
Re: [PR] [FLINK-36645] [flink-autoscaler] Gracefully handle null execution pla… [flink-kubernetes-operator]
sharath1709 commented on code in PR #930: URL: https://github.com/apache/flink-kubernetes-operator/pull/930#discussion_r1911479006

flink-autoscaler/src/main/java/org/apache/flink/autoscaler/topology/JobTopology.java:
```
@@ -150,6 +151,11 @@ public static JobTopology fromJsonPlan(
         ObjectNode plan = objectMapper.readValue(jsonPlan, ObjectNode.class);
         ArrayNode nodes = (ArrayNode) plan.get("nodes");
+        if (nodes == null || nodes.isEmpty()) {
+            throw new NotReadyException(
+                    new RuntimeException("No nodes found in the plan, job is not ready yet"));
```
Review Comment: Thanks for the quick review! Updated.
Re: [PR] [FLINK-36696] [flink-autoscaler-plugin-jdbc] Switch sql connection usages to datasource [flink-kubernetes-operator]
sharath1709 commented on PR #929: URL: https://github.com/apache/flink-kubernetes-operator/pull/929#issuecomment-2584888730 Thanks a lot @1996fanrui for the quick review. Please find my replies below.
1. Typically, DataSource connections are relinquished automatically after a timeout, and DataSource isn't Closeable. However, some DataSource implementations like HikariDataSource implement the Closeable interface to close the connection pool. We can add a check to see if the DataSource is Closeable and then close it. Additionally, I've attempted to make the classes that hold the DataSource objects AutoCloseable as well, so that they can be created in a try-with-resources block.
2. Thanks, I've tried to fix most of the issues but might need a couple more iterations.
3. Great point, I will try to add a test case for this scenario soon.
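A minimal sketch of the close-if-Closeable pattern described in point 1, assuming an illustrative holder class (not the PR's actual code):

```java
import java.io.Closeable;
import javax.sql.DataSource;

/** Illustrative holder that releases its DataSource only when the implementation supports it. */
class DataSourceHolder implements AutoCloseable {
    private final DataSource dataSource;

    DataSourceHolder(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public void close() throws Exception {
        // javax.sql.DataSource itself has no close(); pooled implementations
        // such as HikariDataSource implement Closeable to shut down the pool.
        if (dataSource instanceof Closeable) {
            ((Closeable) dataSource).close();
        }
    }
}
```

This keeps plain (non-pooled) DataSources untouched while still letting the holder participate in a try-with-resources block.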
Re: [PR] [FLINK-35600] Add timestamp for low and high watermark [flink-cdc]
github-actions[bot] commented on PR #3415: URL: https://github.com/apache/flink-cdc/pull/3415#issuecomment-2584936582 This pull request has been automatically marked as stale because it has not had recent activity for 60 days. It will be closed in 30 days if no further activity occurs.
[jira] [Commented] (FLINK-34932) Translate concepts of Flink-Kubernetes-Operator documentation
[ https://issues.apache.org/jira/browse/FLINK-34932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17912165#comment-17912165 ] Rui Fan commented on FLINK-34932: - [~fernandoha], assigned it to you. Hi [~caicancai], would you mind reviewing it in your free time? Thanks to both of you in advance. > Translate concepts of Flink-Kubernetes-Operator documentation > - > > Key: FLINK-34932 > URL: https://issues.apache.org/jira/browse/FLINK-34932 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Kubernetes Operator >Affects Versions: 1.9.0 >Reporter: Caican Cai >Assignee: yinrhh >Priority: Minor > Fix For: 1.9.0 >
[jira] [Assigned] (FLINK-34932) Translate concepts of Flink-Kubernetes-Operator documentation
[ https://issues.apache.org/jira/browse/FLINK-34932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Fan reassigned FLINK-34932: --- Assignee: yinrhh (was: Caican Cai) > Translate concepts of Flink-Kubernetes-Operator documentation > - > > Key: FLINK-34932 > URL: https://issues.apache.org/jira/browse/FLINK-34932 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Kubernetes Operator >Affects Versions: 1.9.0 >Reporter: Caican Cai >Assignee: yinrhh >Priority: Minor > Fix For: 1.9.0 >
[jira] [Commented] (FLINK-34932) Translate concepts of Flink-Kubernetes-Operator documentation
[ https://issues.apache.org/jira/browse/FLINK-34932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17912164#comment-17912164 ] yinrhh commented on FLINK-34932: Hi, can anybody assign this issue to me? I truly want to fix it. Thanks. > Translate concepts of Flink-Kubernetes-Operator documentation > - > > Key: FLINK-34932 > URL: https://issues.apache.org/jira/browse/FLINK-34932 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Kubernetes Operator >Affects Versions: 1.9.0 >Reporter: Caican Cai >Assignee: Caican Cai >Priority: Minor > Fix For: 1.9.0 >
Re: [PR] [FLINK-28897] [TABLE-SQL] Fail to use udf in added jar when enabling checkpoint [flink]
ammu20-dev commented on PR #25656: URL: https://github.com/apache/flink/pull/25656#issuecomment-2585004128 > In that case the primary question should be - why does the `userClassLoader` class field not contain the actual user class loader in the first place. Should not that be the focus of the fix instead of pulling it out of the resourceManager only for one of the methods? > @afedulov That sounds reasonable. I will investigate this further and try to figure out the reason. Meanwhile, do you think a more targeted fix would be a better approach here? Something that is confined to the [StreamingJobGraphGenerator.prevalidate](https://github.com/apache/flink/blob/4b306699811af0b6ff2cb862914adfda56345996/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGenerator.java#L502) method level, by removing the class loading happening [here](https://github.com/apache/flink/blob/4b306699811af0b6ff2cb862914adfda56345996/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGenerator.java#L535). Due to this issue, the ADD JAR statement causes a job failure when checkpointing is enabled, which prevents this feature from being used in production.
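The mismatch under discussion can be illustrated with a minimal, hypothetical sketch (the class and method names here are illustrative, not Flink's actual fields): validating a UDF class must go through the class loader that saw the ADD JAR statement, not the application class loader.

```java
/** Hypothetical sketch of class resolution during job-graph pre-validation. */
class UdfResolutionSketch {
    /**
     * Resolves a class for validation purposes only. Passing the application
     * class loader here fails for classes that exist solely in a jar
     * registered via ADD JAR; the user class loader must be used instead.
     */
    static Class<?> resolveForValidation(String className, ClassLoader userClassLoader)
            throws ClassNotFoundException {
        // 'false' skips static initialization; validation only needs the Class object.
        return Class.forName(className, false, userClassLoader);
    }
}
```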
[jira] [Created] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
xuyang created FLINK-37094: -- Summary: Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0 Key: FLINK-37094 URL: https://issues.apache.org/jira/browse/FLINK-37094 Project: Flink Issue Type: Bug Components: Connectors / Hive Reporter: xuyang Since the Hive connector is scheduled to be removed from the Flink repo, and the compilation of the remaining Hive connector module within the Flink repo has been skipped, all deprecation work related to the table module performed in version 2.0 needs to be synced to the external Hive connector's separate repository. There will be no further updates to the remaining Hive connector module in the Flink repo.
[jira] [Created] (FLINK-37095) Test case that uses an AI model may hit an insufficient-quota exception
Yanquan Lv created FLINK-37095: -- Summary: Test case that uses an AI model may hit an insufficient-quota exception Key: FLINK-37095 URL: https://issues.apache.org/jira/browse/FLINK-37095 Project: Flink Issue Type: New Feature Components: Flink CDC Affects Versions: cdc-3.2.1 Reporter: Yanquan Lv Fix For: cdc-3.3.0 Encountered the following error when running FlinkPipelineUdfITCase.testTransformWithModel:
{code:java}
2025-01-10T03:12:23.5137158Z Caused by: dev.ai4j.openai4j.OpenAiHttpException: {
2025-01-10T03:12:23.5137225Z   "error": {
2025-01-10T03:12:23.5137854Z     "message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.",
2025-01-10T03:12:23.5137947Z     "type": "insufficient_quota",
2025-01-10T03:12:23.5138020Z     "param": null,
2025-01-10T03:12:23.5138104Z     "code": "insufficient_quota"
2025-01-10T03:12:23.5138172Z   }
2025-01-10T03:12:23.5138234Z }
2025-01-10T03:12:23.5138364Z 	at dev.ai4j.openai4j.Utils.toException(Utils.java:8)
2025-01-10T03:12:23.5138583Z 	at dev.ai4j.openai4j.SyncRequestExecutor.execute(SyncRequestExecutor.java:28)
2025-01-10T03:12:23.5138761Z 	at dev.ai4j.openai4j.RequestExecutor.execute(RequestExecutor.java:59)
2025-01-10T03:12:23.5139026Z 	at dev.langchain4j.model.openai.OpenAiChatModel.lambda$generate$0(OpenAiChatModel.java:124)
2025-01-10T03:12:23.5139207Z 	at dev.langchain4j.internal.RetryUtils.withRetry(RetryUtils.java:26)
2025-01-10T03:12:23.5139274Z 	... 26 more
{code}
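The follow-up PR below (#3849) makes the test tolerate this failure; here is a minimal sketch of such a guard using JUnit 5 assumptions (the cause-chain matching is illustrative, not necessarily the PR's actual change):

{code:java}
import static org.junit.jupiter.api.Assumptions.assumeTrue;

// ... inside testTransformWithModel, around the pipeline execution:
try {
    execution.execute();
} catch (Exception e) {
    // The OpenAiHttpException is usually nested, so walk the cause chain.
    boolean quotaExhausted = false;
    for (Throwable t = e; t != null; t = t.getCause()) {
        if (String.valueOf(t.getMessage()).contains("insufficient_quota")) {
            quotaExhausted = true;
            break;
        }
    }
    // Skip the test when the model endpoint rejects us for billing reasons;
    // any unrelated failure still fails the build.
    assumeTrue(!quotaExhausted, "Skipping test: model quota exhausted");
    throw e;
}
{code}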
[jira] [Resolved] (FLINK-37024) Task can be stuck in deploying state forever when canceling job/failover
[ https://issues.apache.org/jira/browse/FLINK-37024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weihua Hu resolved FLINK-37024. --- Fix Version/s: 2.0.0 Resolution: Fixed merged in master: 37c70c482a09845370cb6e694a26f55950c4699f > Task can be stuck in deploying state forever when canceling job/failover > > > Key: FLINK-37024 > URL: https://issues.apache.org/jira/browse/FLINK-37024 > Project: Flink > Issue Type: Bug > Components: Runtime / Task >Affects Versions: 1.20.0 >Reporter: Zhanghao Chen >Assignee: Zhanghao Chen >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > We observed that a task can be stuck in deploying state forever when the task > loading/instantiating logic has some issues. Cancelling the job / failover > caused by failures of other tasks will also get stuck, as the cancel watchdog > won't work for tasks in CREATED/DEPLOYING state at present. We should make > the cancel watchdog cover tasks in DEPLOYING as well (no need for tasks in > CREATED state, as there is no real logic between CREATED->DEPLOYING).
Re: [PR] [FLINK-37025] Fix generating watermarks in SQL on-periodic (#25921) [flink]
dawidwys merged PR #25935: URL: https://github.com/apache/flink/pull/25935
Re: [PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
flinkbot commented on PR #25947: URL: https://github.com/apache/flink/pull/25947#issuecomment-2582054899 ## CI report: * 0557987216d7d26709023fd6cfc2379ec9f92eae UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Updated] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
[ https://issues.apache.org/jira/browse/FLINK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuyang updated FLINK-37094: --- Issue Type: Improvement (was: Bug) > Adapt the external Hive connector to the removal of deprecated APIs in table > module in 2.0 > -- > > Key: FLINK-37094 > URL: https://issues.apache.org/jira/browse/FLINK-37094 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Reporter: xuyang >Priority: Major > > Since the Hive connector is scheduled to be removed from the Flink repo, and > the compilation of the remaining Hive connector module within the Flink repo > has been skipped, all deprecation work related to the table module performed > in version 2.0 needs to be synced to the external Hive connector's separate > repository. > There will be no further modifications to the remaining Hive connector module > in the Flink repo, as any changes made cannot be guaranteed to be validated > through CI.
Re: [PR] [FLINK-37093][table] store catalog in catalogStoreHolder should after initCatalog and open when createCatalog [flink]
flinkbot commented on PR #25946: URL: https://github.com/apache/flink/pull/25946#issuecomment-2582054718 ## CI report: * 59d74f97e7e7905497e8db467527c9961c3a2f86 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Updated] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
[ https://issues.apache.org/jira/browse/FLINK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuyang updated FLINK-37094: --- Description: Since the Hive connector is scheduled to be removed from the Flink repo, and the compilation of the remaining Hive connector module within the Flink repo has been skipped, all deprecation work related to the table module performed in version 2.0 needs to be synced to the external Hive connector's separate repository. There will be no further modifications to the remaining Hive connector module in the Flink repo, as any changes made cannot be guaranteed to be validated through CI. was:Since the Hive connector is scheduled to be removed from the Flink repo, and the compilation of the remaining Hive connector module within the Flink repo has been skipped, all deprecation work related to the table module performed in version 2.0 needs to be synced to the external Hive connector's separate repository. There will be no further updates to the remaining Hive connector module in the Flink repo. > Adapt the external Hive connector to the removal of deprecated APIs in table > module in 2.0 > -- > > Key: FLINK-37094 > URL: https://issues.apache.org/jira/browse/FLINK-37094 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Reporter: xuyang >Priority: Major > > Since the Hive connector is scheduled to be removed from the Flink repo, and > the compilation of the remaining Hive connector module within the Flink repo > has been skipped, all deprecation work related to the table module performed > in version 2.0 needs to be synced to the external Hive connector's separate > repository. > There will be no further modifications to the remaining Hive connector module > in the Flink repo, as any changes made cannot be guaranteed to be validated > through CI.
Re: [PR] [FLINK-37024][task] Make cancel watchdog cover tasks stuck in DEPLOYING state [flink]
huwh merged PR #25915: URL: https://github.com/apache/flink/pull/25915
[jira] [Updated] (FLINK-35325) Paimon connector miss the position of AddColumnEvent
[ https://issues.apache.org/jira/browse/FLINK-35325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanquan Lv updated FLINK-35325: --- Fix Version/s: cdc-3.2.1 (was: cdc-3.1.1) > Paimon connector miss the position of AddColumnEvent > > > Key: FLINK-35325 > URL: https://issues.apache.org/jira/browse/FLINK-35325 > Project: Flink > Issue Type: Improvement > Components: Flink CDC >Affects Versions: cdc-3.1.1 >Reporter: Yanquan Lv >Assignee: tianzhu.wen >Priority: Minor > Labels: pull-request-available > Fix For: cdc-3.2.1 > > > Currently, new columns are always added in the last position, however some > newly added columns had a specific before and after relationship with other > columns. > Source code: > [https://github.com/apache/flink-cdc/blob/fa6e7ea51258dcd90f06036196618224156df367/flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-paimon/src/main/java/org/apache/flink/cdc/connectors/paimon/sink/PaimonMetadataApplier.java#L137]
[jira] [Resolved] (FLINK-37025) Periodic SQL watermarks can travel back in time
[ https://issues.apache.org/jira/browse/FLINK-37025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Wysakowicz resolved FLINK-37025. -- Resolution: Fixed Fixed in: * master ** 7059ee555c5093e3b233876ac3d84ad693023df8 * 1.20.x ** bfe3576f439466744871d950b70339344c15e44f * 1.19.x ** 868f413bd63db4399a89a403103f02555a5293b7 > Periodic SQL watermarks can travel back in time > --- > > Key: FLINK-37025 > URL: https://issues.apache.org/jira/browse/FLINK-37025 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.20.0, 2.0-preview >Reporter: Dawid Wysakowicz >Assignee: Dawid Wysakowicz >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 1.19.2, 1.20.1 > > > If watermarks are generated through SQL, e.g. using {{WATERMARK FOR ts AS > ts}} > https://github.com/apache/flink/blob/e9353319ad625baa5b2c20fa709ab5b23f83c0f4/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/generated/GeneratedWatermarkGeneratorSupplier.java#L86 > The code in {{onEvent}} can move the {{currentWatermark}} back in time, > because it does not check if it advanced. This is fine in case of > {{on-event}} watermarks, because the underlying {{WatermarkOutput}} performs > that check, but it has confusing outcomes for {{on-periodic}}. > Example: > If events with timestamps: > {code} > {"id":1,"ts":"2024-01-01 00:00:00"} > {"id":3,"ts":"2024-01-03 00:00:00"} > {"id":2,"ts":"2024-01-02 00:00:00"} > {code} > come within the periodic emit interval, the generated watermark will be > "2024-01-02 00:00:00" instead of "2024-01-03 00:00:00". As a result the > watermark for "2024-01-03 00:00:00" is swallowed and a new event is required > to progress the processing of that event.
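The gist of the fix is that the tracked watermark must only ever advance. A simplified stand-in for the generated generator (illustrative, not the actual generated code):

{code:java}
import org.apache.flink.api.common.eventtime.Watermark;
import org.apache.flink.api.common.eventtime.WatermarkGenerator;
import org.apache.flink.api.common.eventtime.WatermarkOutput;

/** Simplified periodic generator that never lets the watermark travel back in time. */
class MonotonicPeriodicGenerator<T> implements WatermarkGenerator<T> {
    private long currentWatermark = Long.MIN_VALUE;

    @Override
    public void onEvent(T event, long eventTimestamp, WatermarkOutput output) {
        // Math.max keeps the out-of-order event (id=2 arriving after id=3 in
        // the example above) from dragging the watermark backwards.
        currentWatermark = Math.max(currentWatermark, eventTimestamp);
    }

    @Override
    public void onPeriodicEmit(WatermarkOutput output) {
        output.emitWatermark(new Watermark(currentWatermark));
    }
}
{code}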
[jira] [Commented] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
[ https://issues.apache.org/jira/browse/FLINK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911842#comment-17911842 ] xuyang commented on FLINK-37094: Btw, I tried to enable the compilation directly in the pom in the Flink repo, but it resulted in compilation errors, and many tests also failed... cc [~luoyuxia] [~Sergey Nuyanzin] > Adapt the external Hive connector to the removal of deprecated APIs in table > module in 2.0 > -- > > Key: FLINK-37094 > URL: https://issues.apache.org/jira/browse/FLINK-37094 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Reporter: xuyang >Priority: Major > > Since the Hive connector is scheduled to be removed from the Flink repo, and > the compilation of the remaining Hive connector module within the Flink repo > has been skipped, all deprecation work related to the table module performed > in version 2.0 needs to be synced to the external Hive connector's separate > repository. > There will be no further modifications to the remaining Hive connector module > in the Flink repo, as any changes made cannot be guaranteed to be validated > through CI.
[PR] [FLINK-35325][transform] Skip insufficient_quota error when running test case using AI model. [flink-cdc]
lvyanquan opened a new pull request, #3849: URL: https://github.com/apache/flink-cdc/pull/3849 Fixes CI failure.
Re: [PR] [FLINK-36957][Datastream] Implement async state version of stream flatmap [flink]
fredia commented on PR #25848: URL: https://github.com/apache/flink/pull/25848#issuecomment-2582074659 > @fredia I am unsure what this change is - please could you add the motivation for the fix in the Jira and the impact on the DataStream API. Does this require a docs change? @davidradl Sorry for the confusion, this PR provides the async state version of stream flatmap; I changed the title and description.
[PR] [FLINK-36494][table] Remove deprecated method Catalog#getTableFactory [flink]
xuyangzhong opened a new pull request, #25948: URL: https://github.com/apache/flink/pull/25948
## What is the purpose of the change
Remove deprecated method Catalog#getTableFactory. Note: no changes are made in the Hive connector repo because of https://issues.apache.org/jira/browse/FLINK-37094.
## Brief change log
- *Remove deprecated method Catalog#getTableFactory*
## Verifying this change
This change is already covered by existing tests.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented?
Re: [PR] [FLINK-37011] Improve get source field value by column name in PreTransformProcessor. [flink-cdc]
ChaomingZhangCN commented on PR #3836: URL: https://github.com/apache/flink-cdc/pull/3836#issuecomment-2582263733 > @leonardBang @yuxiqian Please take a look. ping
Re: [PR] [FLINK-36193] Supports applying TRUNCATE / DROP table ddl... [flink-cdc]
yuxiqian commented on code in PR #3673: URL: https://github.com/apache/flink-cdc/pull/3673#discussion_r1910133465

flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-paimon/src/main/java/org/apache/flink/cdc/connectors/paimon/sink/PaimonMetadataApplier.java:
```
@@ -325,14 +317,32 @@
                 tableChangeList.add(
                         SchemaChangeProvider.updateColumnType(oldName, newType)));
-        catalog.alterTable(
-                new Identifier(event.tableId().getSchemaName(), event.tableId().getTableName()),
-                tableChangeList,
-                true);
+        catalog.alterTable(tableIdToIdentifier(event), tableChangeList, true);
     } catch (Catalog.TableNotExistException
             | Catalog.ColumnAlreadyExistException
             | Catalog.ColumnNotExistException e) {
         throw new SchemaEvolveException(event, e.getMessage(), e);
     }
 }
+
+private void applyTruncateTable(TruncateTableEvent event) throws SchemaEvolveException {
+    try (BatchTableCommit batchTableCommit =
+            catalog.getTable(tableIdToIdentifier(event)).newBatchWriteBuilder().newCommit()) {
```
Review Comment: Yeah, it should be safer to throw an exception in such a configuration.
[PR] [FLINK-36506][table] Remove all deprecated methods in ColumnStats [flink]
xuyangzhong opened a new pull request, #25950: URL: https://github.com/apache/flink/pull/25950
## What is the purpose of the change
Remove all deprecated methods in ColumnStats.
## Brief change log
- Remove all deprecated methods in ColumnStats
## Verifying this change
This change is already covered by existing tests.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented?
[PR] [FLINK-37093][table] store catalog in catalogStoreHolder should after initCatalog and open when createCatalog [flink]
jiefei30 opened a new pull request, #25946: URL: https://github.com/apache/flink/pull/25946
## What is the purpose of the change
This pull request fixes a bug where a catalog that failed validation (because it has no type) still ends up in catalogStoreHolder; such a catalog can then neither be USEd nor ALTERed.
## Brief change log
- Fix a bug where a catalog that failed validation due to a missing type still exists in catalogStoreHolder.
## Verifying this change
This change is already covered by existing tests.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
- The S3 file system connector: (no)
## Documentation
- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not documented)
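A minimal sketch of the reordering the PR title describes, based on the createCatalog snippet quoted in FLINK-37093 below (simplified, not the exact patch):

```java
public void createCatalog(
        String catalogName, CatalogDescriptor catalogDescriptor, boolean ignoreIfExists)
        throws CatalogException {
    // ... null/existence checks unchanged ...

    // Initialize and open first, so a descriptor that fails validation
    // (e.g. a missing 'type' option) never reaches the catalog store.
    Catalog catalog = initCatalog(catalogName, catalogDescriptor);
    catalog.open();
    catalogs.put(catalogName, catalog);

    // Persist only a catalog that was successfully created and opened,
    // mirroring the ordering that alterCatalog() already uses.
    catalogStoreHolder.catalogStore().storeCatalog(catalogName, catalogDescriptor);
}
```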
[jira] [Updated] (FLINK-37093) The catalog that failed validation due to no type still exists in catalogStoreHolder
[ https://issues.apache.org/jira/browse/FLINK-37093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-37093: --- Labels: pull-request-available (was: ) > The catalog that failed validation due to no type still exists in catalogStoreHolder
>
> Key: FLINK-37093
> URL: https://issues.apache.org/jira/browse/FLINK-37093
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Affects Versions: 1.20.0, 2.0-preview
> Environment: mac os
> Reporter: Mingcan Wang
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0, 1.20.0
>
> here's sql-client (1.20 & 2.0-preview1):
>
> *Flink SQL> create catalog cat1;*
> {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color}
> {color:#FF}org.apache.flink.table.api.ValidationException: Catalog options do not contain an option key 'type' for discovering a catalog.{color}
> *Flink SQL> show catalogs;*
> +-----------------+
> |    catalog name |
> +-----------------+
> |            cat1 |
> | default_catalog |
> +-----------------+
> 2 rows in set
> *Flink SQL> use catalog cat1;*
> {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color}
> {color:#FF}org.apache.flink.table.api.ValidationException: Catalog options do not contain an option key 'type' for discovering a catalog.{color}
> *Flink SQL> alter catalog cat1 comment 'no type';*
> {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color}
> {color:#FF}org.apache.flink.table.api.ValidationException: Catalog options do not contain an option key 'type' for discovering a catalog.{color}
> *Flink SQL> create catalog cat1;*
> {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color}
> {color:#FF}org.apache.flink.table.catalog.exceptions.CatalogException: Catalog cat1 already exists.{color}
> *Flink SQL> drop catalog cat1;*
> [INFO] Execute statement succeeded.
> *Flink SQL> show catalogs;*
> +-----------------+
> |    catalog name |
> +-----------------+
> | default_catalog |
> +-----------------+
> 1 row in set
>
> When I create a catalog without a type, the SQL is parsed successfully. According to the code:
> ```java
> public void createCatalog(
>         String catalogName, CatalogDescriptor catalogDescriptor, boolean ignoreIfExists)
>         throws CatalogException {
>     checkArgument(
>             !StringUtils.isNullOrWhitespaceOnly(catalogName),
>             "Catalog name cannot be null or empty.");
>     checkNotNull(catalogDescriptor, "Catalog descriptor cannot be null");
>     boolean catalogExistsInStore = catalogStoreHolder.catalogStore().contains(catalogName);
>     boolean catalogExistsInMemory = catalogs.containsKey(catalogName);
>     if (catalogExistsInStore || catalogExistsInMemory) {
>         if (!ignoreIfExists) {
>             throw new CatalogException(format("Catalog %s already exists.", catalogName));
>         }
>     } else {
>         // Store the catalog in the catalog store
>         catalogStoreHolder.catalogStore().storeCatalog(catalogName, catalogDescriptor);
>         // Initialize and store the catalog in memory
>         Catalog catalog = initCatalog(catalogName, catalogDescriptor);
>         catalog.open();
>         catalogs.put(catalogName, catalog);
>     }
> }
> ```
> The catalog will be stored in the catalog store first and then fail in the initCatalog() method. The result is that the catalog still exists in catalogStoreHolder, and I cannot use, alter or create it again.
> I noticed the alterCatalog() method; the process is:
> ```java
> Catalog newCatalog = initCatalog(catalogName, newDescriptor);
> catalogStore.removeCatalog(catalogName, false);
> if (catalogs.containsKey(catalogName)) {
>     catalogs.get(catalogName).close();
> }
> newCatalog.open();
> catalogs.put(catalogName, newCatalog);
> catalogStoreHolder.catalogStore().storeCatalog(catalogName, newDescriptor);
> ```
> Storing in catalogStoreHolder should happen after initCatalog() and open().
Re: [PR] [FLINK-36899][state/forst] Introduce metrics for forst cache [flink]
fredia commented on code in PR #25884: URL: https://github.com/apache/flink/pull/25884#discussion_r1909974038

flink-state-backends/flink-statebackend-forst/src/main/java/org/apache/flink/state/forst/fs/cache/FileBasedCache.java:
```
@@ -97,6 +129,13 @@ public void delete(Path path) {

     @Override
     FileCacheEntry internalGet(String key, FileCacheEntry value) {
+        if (metricGroup != null) {
```
Review Comment: `key` is reserved for future use. `internalGet` is used for some customized operations like statistics. This PR only focuses on the `hit` and `miss` metrics, which are not related to `key`.
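A minimal, illustrative sketch of the hit/miss accounting under discussion (the wrapper class and counter fields are hypothetical; the actual PR reports through Flink's metric group):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical cache wrapper that counts hits and misses on lookup. */
class CountingCache<K, V> {
    private final Map<K, V> delegate = new ConcurrentHashMap<>();
    final AtomicLong hits = new AtomicLong();
    final AtomicLong misses = new AtomicLong();

    V get(K key) {
        V value = delegate.get(key);
        // Note that the key itself is irrelevant for these two metrics; we
        // only record whether the lookup was served from the cache.
        if (value != null) {
            hits.incrementAndGet();
        } else {
            misses.incrementAndGet();
        }
        return value;
    }

    void put(K key, V value) {
        delegate.put(key, value);
    }
}
```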
[jira] [Updated] (FLINK-36494) Remove deprecated methods `Catalog#getTableFactory` and `Catalog#supportsManagedTable`
[ https://issues.apache.org/jira/browse/FLINK-36494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-36494: --- Labels: pull-request-available (was: ) > Remove deprecated methods `Catalog#getTableFactory` and > `Catalog#supportsManagedTable` > -- > > Key: FLINK-36494 > URL: https://issues.apache.org/jira/browse/FLINK-36494 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: xuyang >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 >
[jira] [Comment Edited] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
[ https://issues.apache.org/jira/browse/FLINK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911849#comment-17911849 ] Sergey Nuyanzin edited comment on FLINK-37094 at 1/10/25 8:51 AM: -- Hi [~xuyangzhong] IIRC the idea was to release the Hive connector externally and then remove it from the main repo (the PR for removal is ready, however we need to wait for the release of the external Hive connector). I think the main issue here is encouraging people to test/vote: about a year ago, when I created an RC, only a couple of people voted, and there were not enough binding votes to proceed. I can try to adapt the new commits and repeat the procedure of RC creation and start a new vote; however, I am not sure it can finish before the feature freeze of 2.0 or even the release of 2.0 (I have no idea how to predict this voting procedure). was (Author: sergey nuyanzin): Hi [~xuyangzhong] IIRC the idea was to release the Hive connector externally and then remove it from the main repo. I think the main issue here is encouraging people to test/vote: about a year ago, when I created an RC, only a couple of people voted, and there were not enough binding votes to proceed. I can try to adapt the new commits and repeat the procedure of RC creation and start a new vote; however, I am not sure it can finish before the feature freeze of 2.0 or even the release of 2.0 (I have no idea how to predict this voting procedure). > Adapt the external Hive connector to the removal of deprecated APIs in table > module in 2.0 > -- > > Key: FLINK-37094 > URL: https://issues.apache.org/jira/browse/FLINK-37094 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Reporter: xuyang >Priority: Major > > Since the Hive connector is scheduled to be removed from the Flink repo, and > the compilation of the remaining Hive connector module within the Flink repo > has been skipped, all deprecation work related to the table module performed > in version 2.0 needs to be synced to the external Hive connector's separate > repository. > There will be no further modifications to the remaining Hive connector module > in the Flink repo, as any changes made cannot be guaranteed to be validated > through CI.
Re: [PR] [FLINK-36494][table] Remove deprecated method Catalog#getTableFactory [flink]
flinkbot commented on PR #25948: URL: https://github.com/apache/flink/pull/25948#issuecomment-2582085953 ## CI report: * 69dbfadecb608f9ccd9702c9a793ea0218a039cb UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Commented] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
[ https://issues.apache.org/jira/browse/FLINK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911849#comment-17911849 ] Sergey Nuyanzin commented on FLINK-37094: - Hi [~xuyangzhong] IIRC the idea was to release the Hive connector externally and then remove it from the main repo. I think the main issue here is encouraging people to test/vote: about a year ago, when I created an RC, only a couple of people voted, and there were not enough binding votes to proceed. I can try to adapt the new commits and repeat the procedure of RC creation and start a new vote; however, I am not sure it can finish before the feature freeze of 2.0 or even the release of 2.0 (I have no idea how to predict this voting procedure). > Adapt the external Hive connector to the removal of deprecated APIs in table > module in 2.0 > -- > > Key: FLINK-37094 > URL: https://issues.apache.org/jira/browse/FLINK-37094 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Reporter: xuyang >Priority: Major > > Since the Hive connector is scheduled to be removed from the Flink repo, and > the compilation of the remaining Hive connector module within the Flink repo > has been skipped, all deprecation work related to the table module performed > in version 2.0 needs to be synced to the external Hive connector's separate > repository. > There will be no further modifications to the remaining Hive connector module > in the Flink repo, as any changes made cannot be guaranteed to be validated > through CI.
Re: [PR] [FLINK-36698][pipeline-connector][elasticsearch] Elasticsearch pipeline sink support authentication [flink-cdc]
beryllw commented on PR #3728: URL: https://github.com/apache/flink-cdc/pull/3728#issuecomment-2582088579 https://github.com/apache/flink-cdc/pull/3849
Re: [PR] [FLINK-35325][transform] Skip insufficient_quota error when running test case using AI model. [flink-cdc]
yuxiqian commented on code in PR #3849: URL: https://github.com/apache/flink-cdc/pull/3849#discussion_r1910020556

flink-cdc-composer/src/test/java/org/apache/flink/cdc/composer/flink/FlinkPipelineUdfITCase.java:
```
@@ -897,7 +899,6 @@ void testTransformWithModel(ValuesDataSink.SinkApi sinkApi) throws Exception {
         // Execute the pipeline
         PipelineExecution execution = composer.compose(pipelineDef);
         execution.execute();
-
```
Review Comment: nit: irrelevant change
[jira] [Commented] (FLINK-36506) Remove all deprecated methods in `ColumnStats`
[ https://issues.apache.org/jira/browse/FLINK-36506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911850#comment-17911850 ] xuyang commented on FLINK-36506: I'll take charge of this task and add you, [~atusharm], as co-author, since the code freeze date for Flink 2.0.0 is approaching. > Remove all deprecated methods in `ColumnStats` > -- > > Key: FLINK-36506 > URL: https://issues.apache.org/jira/browse/FLINK-36506 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: xuyang >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 >
[PR] [FLINK-37089][Runtime] Advanced scheduling for derived async state processing [flink]
Zakelly opened a new pull request, #25949: URL: https://github.com/apache/flink/pull/25949
## What is the purpose of the change
Currently, there is a limit on in-flight records in async state processing. However, we allow developers to initialize a new processing in the middle of another. These derived processings are treated as normal ones for timers or upstream records, which is problematic:
- The new derived processing may increase the number of in-flight requests and cause a forced drain when it reaches the limit. But the drain may require the current processing to finish as well. This is a deadlock.
- The derived processing will queue behind normal processing, which is not the behavior the user wants. The derived ones should be fired right after the current processing, or ASAP.
Thus, this PR changes the behavior in `AsyncExecutionController` and its related buffer (see the sketch below):
- Avoid draining when creating a derived processing.
- Provide a priority for the queued processings and give derived ones greater priority.
## Brief change log
- Add an additional condition to skip the drain when inserting a new record into the `AEC`'s buffer.
- Add a priority to `RecordContext` and consider it when queuing in the blocking queue.
## Verifying this change
This change added tests and can be verified by `AbstractAsyncStateStreamOperatorTest#testManyAsyncProcessWithKey`.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): yes
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
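A minimal sketch of the priority idea (the wrapper type and comparator are illustrative; the actual buffer in `AsyncExecutionController` is more involved):

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Illustrative queued record: derived processings outrank normal ones. */
class QueuedRecord {
    enum Kind { DERIVED, NORMAL } // lower ordinal dequeues first

    final Kind kind;
    final long sequenceNumber; // preserves FIFO order within the same kind

    QueuedRecord(Kind kind, long sequenceNumber) {
        this.kind = kind;
        this.sequenceNumber = sequenceNumber;
    }

    static final Comparator<QueuedRecord> ORDER =
            Comparator.<QueuedRecord>comparingInt(r -> r.kind.ordinal())
                    .thenComparingLong(r -> r.sequenceNumber);
}

class Demo {
    public static void main(String[] args) {
        PriorityQueue<QueuedRecord> queue = new PriorityQueue<>(QueuedRecord.ORDER);
        queue.add(new QueuedRecord(QueuedRecord.Kind.NORMAL, 1));
        // Enqueued later, but fires first because it is derived:
        queue.add(new QueuedRecord(QueuedRecord.Kind.DERIVED, 2));
        System.out.println(queue.poll().kind); // DERIVED
    }
}
```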
[jira] [Updated] (FLINK-37089) Advanced scheduling for derived async state processing
[ https://issues.apache.org/jira/browse/FLINK-37089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-37089: --- Labels: pull-request-available (was: ) > Advanced scheduling for derived async state processing > -- > > Key: FLINK-37089 > URL: https://issues.apache.org/jira/browse/FLINK-37089 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Async State Processing, Runtime / Task >Reporter: Zakelly Lan >Assignee: Zakelly Lan >Priority: Major > Labels: pull-request-available > > Currently, there is a limit on in-flight records in async state processing. > However, we allow developers to initialize a new processing in the middle of > another. These derived processings are treated as normal ones for timers or > upstream records, which is problematic: > * The new derived processing may increase the number of in-flight requests > and cause a forced drain when it reaches the limit. But the drain may > require the current processing to finish as well. This is a deadlock. > * The derived processing will queue behind normal processing, which is not > the behavior the user wants. The derived ones should be fired right after the > current processing or ASAP. > Thus, I will change the behavior in `AsyncExecutionController` and its > related buffer: > * Avoid draining when creating a derived processing. > * Provide a priority for the queued processings and give derived ones > greater priority.
Re: [PR] [FLINK-36506][table] Remove all deprecated methods in ColumnStats [flink]
flinkbot commented on PR #25950: URL: https://github.com/apache/flink/pull/25950#issuecomment-2582291586 ## CI report: * 43820887f2acc648f48dbfc0c8c18fb370f4a55a UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
Re: [PR] [FLINK-37078] Fix for jackson-mapper-asl 1.9.13 vulnerability [flink]
rmetzger commented on code in PR #25890: URL: https://github.com/apache/flink/pull/25890#discussion_r1910162122

flink-connectors/flink-sql-connector-hive-3.1.3/src/main/resources/META-INF/NOTICE:
```
@@ -45,7 +45,7 @@
 the Apache Software License 2.0 (http://www.apache.org/licenses/LICENSE-2.0.txt)
 - org.apache.orc:orc-tools:1.5.8
 - org.apache.parquet:parquet-hadoop-bundle:1.10.0
 - org.codehaus.jackson:jackson-core-asl:1.9.13
-- org.codehaus.jackson:jackson-mapper-asl:1.9.13
+- org.codehaus.jackson:jackson-mapper-asl:1.9.14.jdk17-redhat-1
```
Review Comment: Okay, I see. I would say we do it, given that the tests are passing.
Re: [PR] [FLINK-37078] Fix for jackson-mapper-asl 1.9.13 vulnerability [flink]
rmetzger commented on code in PR #25890: URL: https://github.com/apache/flink/pull/25890#discussion_r1910165407

pom.xml:
```
@@ -328,6 +335,13 @@
 under the License.
 -->
+
+        <dependency>
+            <groupId>org.codehaus.jackson</groupId>
+            <artifactId>jackson-mapper-asl</artifactId>
+            <version>${jackson.mapper.asl.version}</version>
+        </dependency>
```
Review Comment: Does this mean that now any Flink dependency has `jackson-mapper-asl` as a dep? E.g. if I'm depending on `flink-core`, will it pull in `jackson-mapper-asl` as well? (That wouldn't be good.) It seems this is true, right? https://stackoverflow.com/a/38882349
Re: [PR] [FLINK-28897] [TABLE-SQL] Fail to use udf in added jar when enabling checkpoint [flink]
afedulov commented on PR #25656: URL: https://github.com/apache/flink/pull/25656#issuecomment-2582320914 > But while debugging the issue I found that the class loader coming here is the App class loader and not the user class loader. In that case the primary question should be - why does the userClassLoader class field not contain the actual user class loader in the first place. Should not that be the focus of the fix instead of pulling it out of the resourceManager only for one of the methods? > and I see minimum possibility on regressions. Unfortunately, I can't agree with this assessment. There are thousands of SQL jobs out there that might have unintentionally included some classes that could cause the jobs to fail after we implement this change. For instance, the `flink-avro` format includes an unshaded version of the Avro [library](https://github.com/apache/flink/blob/d9c10d584b7ef8532804b849cfbd2b33beac8044/flink-formats/flink-avro/pom.xml#L81), which is directly [used](https://github.com/apache/flink/blob/85937b8c11dcca25fd978d095d07af121d7ae8e3/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroSerializerSnapshot.java#L28) in the core format classes. If the user JAR, for some reason, mistakenly bundles the same library in a different version, those classes would take priority after the change and could potentially break existing jobs. Should those jobs have failed upon the initial submission if the "correct" classloading had been applied? Yes. Can we risk breaking production jobs in a patch release, even by doing the "right" thing? My answer is no. The risk is especially not worth it because the issue has existed since version 1.16 (over 2 years) and is only relevant for the Table API. Moreover, a [workaround](https://issues.apache.org/jira/browse/FLINK-29890?focusedCommentId=17693105&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17693105) exists, and a proper fix is implemented in 2.0.
Re: [PR] [FLINK-36899][state/forst] Introduce metrics for forst cache [flink]
fredia commented on code in PR #25884: URL: https://github.com/apache/flink/pull/25884#discussion_r1909979879

flink-state-backends/flink-statebackend-forst/src/main/java/org/apache/flink/state/forst/fs/ForStFlinkFileSystem.java:
```
@@ -119,7 +121,11 @@ public static FileBasedCache getFileBasedCache(
             new File(cacheBase.toString()), cacheReservedSize, SST_FILE_SIZE);
     }
     return new FileBasedCache(
-            Integer.MAX_VALUE, cacheLimitPolicy, cacheBase.getFileSystem(), cacheBase);
+            Integer.MAX_VALUE,
+            cacheLimitPolicy,
+            cacheBase.getFileSystem(),
```
Review Comment: `cacheBase.getFileSystem()` can throw IOException. If we got the FileSystem in the `FileBasedCache` constructor, the constructor would throw IOException, which is something I don't want to introduce. So I prefer to keep `cacheBase.getFileSystem()` as a parameter.
Re: [PR] [FLINK-36193] Supports applying TRUNCATE / DROP table ddl... [flink-cdc]
lvyanquan commented on code in PR #3673: URL: https://github.com/apache/flink-cdc/pull/3673#discussion_r1910115690

flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-paimon/src/main/java/org/apache/flink/cdc/connectors/paimon/sink/PaimonMetadataApplier.java:
```
@@ -325,14 +317,32 @@
                 tableChangeList.add(
                         SchemaChangeProvider.updateColumnType(oldName, newType)));
-        catalog.alterTable(
-                new Identifier(event.tableId().getSchemaName(), event.tableId().getTableName()),
-                tableChangeList,
-                true);
+        catalog.alterTable(tableIdToIdentifier(event), tableChangeList, true);
     } catch (Catalog.TableNotExistException
             | Catalog.ColumnAlreadyExistException
             | Catalog.ColumnNotExistException e) {
         throw new SchemaEvolveException(event, e.getMessage(), e);
     }
 }
+
+private void applyTruncateTable(TruncateTableEvent event) throws SchemaEvolveException {
+    try (BatchTableCommit batchTableCommit =
+            catalog.getTable(tableIdToIdentifier(event)).newBatchWriteBuilder().newCommit()) {
```
Review Comment: We are currently experiencing some unexpected situations while truncating tables with the option `deletion-vectors.enabled: true`. How about throwing an exception when truncating these tables? WDYT?
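A hypothetical guard along the lines suggested above (the option lookup, the `truncateTable()` call on the commit, and the error message are all assumptions for illustration; whether `SchemaEvolveException` is the right wrapper is for the PR to decide):

```java
private void applyTruncateTable(TruncateTableEvent event) throws SchemaEvolveException {
    try {
        Table table = catalog.getTable(tableIdToIdentifier(event));
        // Deletion-vector tables currently misbehave on truncate; fail fast instead.
        if (Boolean.parseBoolean(
                table.options().getOrDefault("deletion-vectors.enabled", "false"))) {
            throw new SchemaEvolveException(
                    event,
                    "Truncating a table with deletion-vectors.enabled is not supported",
                    null);
        }
        try (BatchTableCommit commit = table.newBatchWriteBuilder().newCommit()) {
            commit.truncateTable();
        }
    } catch (SchemaEvolveException e) {
        throw e;
    } catch (Exception e) {
        throw new SchemaEvolveException(event, e.getMessage(), e);
    }
}
```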
[jira] [Assigned] (FLINK-36811) MySQL CDC source set is processing backlog during snapshot phase
[ https://issues.apache.org/jira/browse/FLINK-36811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruan Hang reassigned FLINK-36811: - Assignee: Xiao Huang > MySQL CDC source set is processing backlog during snapshot phase > > > Key: FLINK-36811 > URL: https://issues.apache.org/jira/browse/FLINK-36811 > Project: Flink > Issue Type: Improvement > Components: Flink CDC >Affects Versions: cdc-3.3.0 >Reporter: Xuannan Su >Assignee: Xiao Huang >Priority: Major > Labels: pull-request-available > > In > [FLIP-309|https://cwiki.apache.org/confluence/display/FLINK/FLIP-309%3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog], > Flink introduced ProcessingBacklog to indicate whether a record should be > processed with low latency or high throughput. ProcessingBacklog can be used > to change the checkpoint interval of a job during runtime. > MySQL CDC source can set the processingBacklog during the snapshot phase so > that the checkpoint interval during the snapshot phase can be changed at > runtime.
[jira] [Assigned] (FLINK-36811) MySQL CDC source set is processing backlog during snapshot phase
[ https://issues.apache.org/jira/browse/FLINK-36811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruan Hang reassigned FLINK-36811: - Assignee: (was: Ruan Hang) > MySQL CDC source set is processing backlog during snapshot phase > > > Key: FLINK-36811 > URL: https://issues.apache.org/jira/browse/FLINK-36811 > Project: Flink > Issue Type: Improvement > Components: Flink CDC >Affects Versions: cdc-3.3.0 >Reporter: Xuannan Su >Priority: Major > Labels: pull-request-available > > In > [FLIP-309|https://cwiki.apache.org/confluence/display/FLINK/FLIP-309%3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog], > Flink introduced ProcessingBacklog to indicate whether a record should be > processed with low latency or high throughput. ProcessingBacklog can be used > to change the checkpoint interval of a job during runtime. > MySQL CDC source can set the processingBacklog during the snapshot phase so > that the checkpoint interval during the snapshot phase can be changed during > runtime. -- This message was sent by Atlassian Jira (v8.20.10#820010)
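For context, FLIP-309 pairs the backlog signal with a dedicated checkpoint interval option. A configuration sketch follows; the option key is taken from the Flink docs as I recall them, so treat it as an assumption to verify:

```yaml
# Short interval for low-latency processing once the backlog is cleared.
execution.checkpointing.interval: 30s
# Longer interval while the source reports a processing backlog (e.g. during
# the snapshot phase), favoring throughput over low latency.
execution.checkpointing.interval-during-backlog: 10min
```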
[jira] [Updated] (FLINK-37093) The catalog that failed validation due to no type still exists in catalogStoreHolder
[ https://issues.apache.org/jira/browse/FLINK-37093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingcan Wang updated FLINK-37093: - Description: here's sql-client (1.20 & 2.0-preview1): *Flink SQL> create catalog cat1;* {color:#ff}[ERROR] Could not execute SQL statement. Reason:{color} {color:#ff}org.apache.flink.table.api.ValidationException: Catalog options do not contain an option key 'type' for discovering a catalog.{color} *Flink SQL> show catalogs;* +-+ | catalog name| +-+ | cat1| |default_catalog| +-+ 2 rows in set *Flink SQL> use catalog cat1;* {color:#ff}[ERROR] Could not execute SQL statement. Reason:{color} {color:#ff}org.apache.flink.table.api.ValidationException: Catalog options do not contain an option key 'type' for discovering a catalog.{color} *Flink SQL> alter catalog cat1 comment 'no type';* {color:#ff}[ERROR] Could not execute SQL statement. Reason:{color} {color:#ff}org.apache.flink.table.api.ValidationException: Catalog options do not contain an option key 'type' for discovering a catalog.{color} *Flink SQL> create catalog cat1;* {color:#ff}[ERROR] Could not execute SQL statement. Reason:{color} {color:#ff}org.apache.flink.table.catalog.exceptions.CatalogException: Catalog cat1 already exists.{color} *Flink SQL> drop catalog cat1;* [INFO] Execute statement succeeded. *Flink SQL> show catalogs;* +-+ | catalog name| +-+ |default_catalog| +-+ 1 row in set When I create a catalog without a type, the SQL is parsed successfully. According to the code: {code:java} public void createCatalog( String catalogName, CatalogDescriptor catalogDescriptor, boolean ignoreIfExists) throws CatalogException { checkArgument( !StringUtils.isNullOrWhitespaceOnly(catalogName), "Catalog name cannot be null or empty."); checkNotNull(catalogDescriptor, "Catalog descriptor cannot be null"); boolean catalogExistsInStore = catalogStoreHolder.catalogStore().contains(catalogName); boolean catalogExistsInMemory = catalogs.containsKey(catalogName); if (catalogExistsInStore || catalogExistsInMemory) { if (!ignoreIfExists) { throw new CatalogException(format("Catalog %s already exists.", catalogName)); } } else { // Store the catalog in the catalog store catalogStoreHolder.catalogStore().storeCatalog(catalogName, catalogDescriptor); // Initialize and store the catalog in memory Catalog catalog = initCatalog(catalogName, catalogDescriptor); catalog.open(); catalogs.put(catalogName, catalog); } }{code} the catalog is stored in the catalog store first, and then fails in the initCatalog() method. The result is that the catalog still exists in catalogStoreHolder, and I cannot use, alter, or create it again. I noticed the alterCatalog() method; the process is: {code:java} Catalog newCatalog = initCatalog(catalogName, newDescriptor); catalogStore.removeCatalog(catalogName, false); if (catalogs.containsKey(catalogName)) { catalogs.get(catalogName).close(); } newCatalog.open(); catalogs.put(catalogName, newCatalog); catalogStoreHolder.catalogStore().storeCatalog(catalogName, newDescriptor); {code} Storing the catalog in catalogStoreHolder should happen after initCatalog() and open(). -- This message was sent by Atlassian Jira (v8.20.10#820010)
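A minimal, self-contained sketch of the reordering the reporter proposes; the types are simplified stand-ins for Flink's CatalogManager internals. Validate and open the catalog before persisting it, so a failed `initCatalog()` leaves no orphan entry in the store:

```java
import java.util.HashMap;
import java.util.Map;

interface Catalog {
    void open();
}

interface CatalogFactory {
    // May throw, e.g. when the 'type' option is missing.
    Catalog initCatalog(String name, String descriptor);
}

class CatalogManagerSketch {
    private final Map<String, Catalog> catalogs = new HashMap<>();
    private final Map<String, String> catalogStore = new HashMap<>();

    void createCatalog(String name, String descriptor, CatalogFactory factory) {
        if (catalogStore.containsKey(name) || catalogs.containsKey(name)) {
            throw new IllegalStateException("Catalog " + name + " already exists.");
        }
        // Initialize and open first: if this throws, nothing has been
        // persisted and the name can be reused.
        Catalog catalog = factory.initCatalog(name, descriptor);
        catalog.open();
        catalogs.put(name, catalog);
        // Persist only after successful initialization, mirroring the order
        // that alterCatalog() already uses.
        catalogStore.put(name, descriptor);
    }
}
```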
Re: [PR] [FLINK-36794] [cdc-composer/cli] pipeline cdc connector support multiple data sources [flink-cdc]
ChaomingZhangCN commented on code in PR #3844: URL: https://github.com/apache/flink-cdc/pull/3844#discussion_r1910131067 ## docs/content.zh/docs/connectors/pipeline-connectors/mysql.md: ## @@ -77,6 +77,32 @@ pipeline: parallelism: 4 ``` +## Multiple data source example + +A pipeline that reads data from multiple MySQL instances and syncs it to Doris can be defined as follows: + +```yaml +source: + type: mysql_mutiple Review Comment: ```yaml sources: - type: mysql name: mysql-instance-00 hostname: localhost port: 3306 - type: mysql name: mysql-instance-01 hostname: localhost port: 3307 ``` And the corresponding PipelineDef looks like this: ```java public class PipelineDef { @Nullable private List<SourceDef> sources; private final SourceDef source; ... } ``` If `sources` is not null then we use these data sources, otherwise we use `source` to build up the DataStream. In this way, the previous usage will not be affected. I want to hear your opinion. @yuxiqian -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-37093) The catalog that failed validation due to no type still exists in catalogStoreHolder
[ https://issues.apache.org/jira/browse/FLINK-37093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911817#comment-17911817 ] Mingcan Wang commented on FLINK-37093: -- can someone assign this issue to me ? thanks :) > The catalog that failed validation due to no type still exists in > catalogStoreHolder > > > Key: FLINK-37093 > URL: https://issues.apache.org/jira/browse/FLINK-37093 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.20.0, 2.0-preview > Environment: mac os >Reporter: Mingcan Wang >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 1.20.0 > > > here's sql-client (1.20 & 2.0-preview1): > > *Flink SQL> create catalog cat1;* > {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color} > {color:#FF}org.apache.flink.table.api.ValidationException: Catalog > options do not contain an option key 'type' for discovering a catalog.{color} > *Flink SQL> show catalogs;* > +-+ > | catalog name | > +-+ > | cat1 | > | default_catalog | > +-+ > 2 rows in set > *Flink SQL> use catalog cat1;* > {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color} > {color:#FF}org.apache.flink.table.api.ValidationException: Catalog > options do not contain an option key 'type' for discovering a catalog.{color} > *Flink SQL> alter catalog cat1 comment 'no type';* > {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color} > {color:#FF}org.apache.flink.table.api.ValidationException: Catalog > options do not contain an option key 'type' for discovering a catalog.{color} > *Flink SQL> create catalog cat1;* > {color:#FF}[ERROR] Could not execute SQL statement. Reason:{color} > {color:#FF}org.apache.flink.table.catalog.exceptions.CatalogException: > Catalog cat1 already exists.{color} > *Flink SQL> drop catalog cat1;* > [INFO] Execute statement succeeded. > *Flink SQL> show catalogs;* > +-+ > | catalog name | > +-+ > | default_catalog | > +-+ > 1 row in set > > When i create a catalog without type , the sql will parsed successfully. > According to the code : > ```java > public void createCatalog( > String catalogName, CatalogDescriptor catalogDescriptor, boolean > ignoreIfExists) > throws CatalogException { > checkArgument( > !StringUtils.isNullOrWhitespaceOnly(catalogName), > "Catalog name cannot be null or empty."); > checkNotNull(catalogDescriptor, "Catalog descriptor cannot be null"); > boolean catalogExistsInStore = > catalogStoreHolder.catalogStore().contains(catalogName); > boolean catalogExistsInMemory = catalogs.containsKey(catalogName); > if (catalogExistsInStore || catalogExistsInMemory) { > if (!ignoreIfExists) { > throw new CatalogException(format("Catalog %s already exists.", catalogName)); > } > } else { > // Store the catalog in the catalog store > catalogStoreHolder.catalogStore().storeCatalog(catalogName, > catalogDescriptor); > // Initialize and store the catalog in memory > Catalog catalog = initCatalog(catalogName, catalogDescriptor); > catalog.open(); > catalogs.put(catalogName, catalog); > } > } > ``` > the catalog will store in the catalog store first, then failed in > initCatalog() method. > the result is the catalog still exists in catalogStoreHolder. And I cannot > use, alter or create it again. 
> I noticed the alterCatalog() method; the process is: > ```java > Catalog newCatalog = initCatalog(catalogName, newDescriptor); > catalogStore.removeCatalog(catalogName, false); > if (catalogs.containsKey(catalogName)) { > catalogs.get(catalogName).close(); > } > newCatalog.open(); > catalogs.put(catalogName, newCatalog); > catalogStoreHolder.catalogStore().storeCatalog(catalogName, newDescriptor); > ``` > Storing in catalogStoreHolder should happen after initCatalog() and open(). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
lsyldliu opened a new pull request, #25947: URL: https://github.com/apache/flink/pull/25947 ## What is the purpose of the change *Remove deprecated interface CatalogLock* ## Brief change log - *Remove deprecated interface CatalogLock* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-30064) Move existing Hive connector code from Flink repo to dedicated Hive repo
[ https://issues.apache.org/jira/browse/FLINK-30064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911818#comment-17911818 ] xuyang commented on FLINK-30064: Hi, should we sync the new commits from the Hive connector module in the Flink repo into the separate Hive connector repo? I'm mentioning this because I removed a lot of deprecated code in flink 2.0, which may impact the Hive connector. > Move existing Hive connector code from Flink repo to dedicated Hive repo > > > Key: FLINK-30064 > URL: https://issues.apache.org/jira/browse/FLINK-30064 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Hive >Reporter: Martijn Visser >Assignee: Sergey Nuyanzin >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-37091) Remove deprecated CatalogLock interface
[ https://issues.apache.org/jira/browse/FLINK-37091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-37091: --- Labels: pull-request-available (was: ) > Remove deprecated CatalogLock interface > --- > > Key: FLINK-37091 > URL: https://issues.apache.org/jira/browse/FLINK-37091 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: dalongliu >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-36899][state/forst] Introduce metrics for forst cache [flink]
fredia commented on code in PR #25884: URL: https://github.com/apache/flink/pull/25884#discussion_r1909979879 ## flink-state-backends/flink-statebackend-forst/src/main/java/org/apache/flink/state/forst/fs/ForStFlinkFileSystem.java: ## @@ -119,7 +121,11 @@ public static FileBasedCache getFileBasedCache( new File(cacheBase.toString()), cacheReservedSize, SST_FILE_SIZE); } return new FileBasedCache( -Integer.MAX_VALUE, cacheLimitPolicy, cacheBase.getFileSystem(), cacheBase); +Integer.MAX_VALUE, +cacheLimitPolicy, +cacheBase.getFileSystem(), Review Comment: I will remove `cacheBase.getFileSystem()`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-36794] [cdc-composer/cli] pipeline cdc connector support multiple data sources [flink-cdc]
linjianchang commented on code in PR #3844: URL: https://github.com/apache/flink-cdc/pull/3844#discussion_r1910064891 ## flink-cdc-composer/src/main/java/org/apache/flink/cdc/composer/flink/FlinkPipelineComposer.java: ## @@ -126,16 +127,28 @@ private void translate(StreamExecutionEnvironment env, PipelineDef pipelineDef) // And required constructors OperatorIDGenerator schemaOperatorIDGenerator = new OperatorIDGenerator(schemaOperatorTranslator.getSchemaOperatorUid()); -DataSource dataSource = -sourceTranslator.createDataSource(pipelineDef.getSource(), pipelineDefConfig, env); +List<SourceDef> sourceDefs = pipelineDef.getSources(); +//DataSource dataSource = +//sourceTranslator.createDataSource(sourceDefs, pipelineDefConfig, env); DataSink dataSink = sinkTranslator.createDataSink(pipelineDef.getSink(), pipelineDefConfig, env); - -boolean isParallelMetadataSource = dataSource.isParallelMetadataSource(); - // O ---> Source -DataStream<Event> stream = -sourceTranslator.translate(pipelineDef.getSource(), dataSource, env, parallelism); +DataStream<Event> stream = null; +DataSource dataSource = null; +for (SourceDef sourceDef : sourceDefs) { +dataSource = sourceTranslator.createDataSource(sourceDef, pipelineDefConfig, env); +DataStream<Event> streamBranch = +sourceTranslator.translate(sourceDef, dataSource, env, parallelism); +if (stream == null) { +stream = streamBranch; +} else { +stream = stream.union(streamBranch); +} +} +boolean isParallelMetadataSource = dataSource.isParallelMetadataSource(); Review Comment: Already modified -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
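The review point on `isParallelMetadataSource` in the diff above generalizes to: when several sources are merged, a per-pipeline flag should be an aggregate over all sources rather than the value of whichever source the loop created last. A small, self-contained sketch; whether `allMatch` or `anyMatch` is the intended semantics is an assumption here:

```java
import java.util.List;

interface DataSource {
    boolean isParallelMetadataSource();
}

final class SourceFlagAggregation {
    private SourceFlagAggregation() {}

    // Derive the pipeline-level flag from every merged source, not just the
    // last one assigned inside the source-building loop.
    static boolean isParallelMetadataSource(List<DataSource> sources) {
        return sources.stream().allMatch(DataSource::isParallelMetadataSource);
    }
}
```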
Re: [PR] [FLINK-36794] [cdc-composer/cli] pipeline cdc connector support multiple data sources [flink-cdc]
linjianchang commented on code in PR #3844: URL: https://github.com/apache/flink-cdc/pull/3844#discussion_r1910065232 ## docs/content.zh/docs/connectors/pipeline-connectors/mysql.md: ## @@ -77,6 +77,32 @@ pipeline: parallelism: 4 ``` +## 多数据源示例 + +单数据源,从多个 MySQL 读取数据同步到 Doris 的 Pipeline 可以定义如下: + +```yaml +source: + type: mysql_mutiple Review Comment: Already modified ## flink-cdc-cli/src/main/java/org/apache/flink/cdc/cli/parser/YamlPipelineDefinitionParser.java: ## @@ -97,6 +97,13 @@ public class YamlPipelineDefinitionParser implements PipelineDefinitionParser { public static final String TRANSFORM_TABLE_OPTION_KEY = "table-options"; +private static final String HOST_LIST = "host_list"; +private static final String COMMA = ","; +private static final String HOST_NAME = "hostname"; +private static final String PORT = "port"; +private static final String COLON = ":"; +private static final String MUTIPLE = "_mutiple"; Review Comment: Already modified -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-37096) If the MongoDB connector is implemented with DebeziumSourceFunction, the curve will appear in the full phase delay curve showing 55year
[ https://issues.apache.org/jira/browse/FLINK-37096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911874#comment-17911874 ] sukang edited comment on FLINK-37096 at 1/10/25 9:22 AM: - I will do it.Just by learning from this, it may be slow to repair. was (Author: JIRAUSER308130): I wll do it.Just by learning from this, it may be slow to repair. > If the MongoDB connector is implemented with DebeziumSourceFunction, the > curve will appear in the full phase delay curve showing 55year > --- > > Key: FLINK-37096 > URL: https://issues.apache.org/jira/browse/FLINK-37096 > Project: Flink > Issue Type: Improvement > Components: Connectors / MongoDB >Affects Versions: cdc-3.2.1 >Reporter: sukang >Priority: Minor > Original Estimate: 612h > Remaining Estimate: 612h > > emitDelay =isInDbSnapshotPhase ? 0L : System.currentTimeMillis() - > messageTimestamp; > In the full phase, messageTimestamp is 0 which will cause an exception here > when we compute it. > !https://intranetproxy.alipay.com/skylark/lark/0/2025/png/109956362/1736480972716-696877e2-e202-454d-bee4-947944159891.png|width=659,height=258,id=u750186d3! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-37096) If the MongoDB connector is implemented with DebeziumSourceFunction, the curve will appear in the full phase delay curve showing 55year
[ https://issues.apache.org/jira/browse/FLINK-37096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911876#comment-17911876 ] sukang commented on FLINK-37096: assign to you > If the MongoDB connector is implemented with DebeziumSourceFunction, the > curve will appear in the full phase delay curve showing 55year > --- > > Key: FLINK-37096 > URL: https://issues.apache.org/jira/browse/FLINK-37096 > Project: Flink > Issue Type: Improvement > Components: Connectors / MongoDB >Affects Versions: cdc-3.2.1 >Reporter: sukang >Priority: Minor > Original Estimate: 612h > Remaining Estimate: 612h > > emitDelay =isInDbSnapshotPhase ? 0L : System.currentTimeMillis() - > messageTimestamp; > In the full phase, messageTimestamp is 0 which will cause an exception here > when we compute it. > !https://intranetproxy.alipay.com/skylark/lark/0/2025/png/109956362/1736480972716-696877e2-e202-454d-bee4-947944159891.png|width=659,height=258,id=u750186d3! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-37081) The debezium version of sqlserver cdc needs to be upgraded to 2.6 or higher
[ https://issues.apache.org/jira/browse/FLINK-37081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhang closed FLINK-37081. - Resolution: Later > The debezium version of sqlserver cdc needs to be upgraded to 2.6 or higher > --- > > Key: FLINK-37081 > URL: https://issues.apache.org/jira/browse/FLINK-37081 > Project: Flink > Issue Type: Improvement > Components: Flink CDC > Environment: Microsoft SQL Server 2016 (SP1-CU15) >Reporter: zhang >Assignee: zhang >Priority: Major > Attachments: image-2025-01-09-14-34-32-514.png > > > The SQL used to get CDC data before Debezium version 2.6 was > {code:java} > SELECT *# FROM [#db].cdc.[fn_cdc_get_all_changes_#](?, ?, N'all update old') > order by [__$start_lsn] ASC, [__$seqval] ASC, [__$operation] ASC {code} > The “order by [__$start_lsn] ASC, [__$seqval] ASC, [__$operation] ASC” part > of the ordering is not necessary; the results of “fn_cdc_get_all_changes” > are sorted by “__$start_lsn, __$command_id, __$seqval, __$operation” by > default. > If the cdc table has a large amount of data, this extra sorting can cause > serious performance problems. > This redundant sorting has been removed in Debezium version 2.6 and above > !image-2025-01-09-14-34-32-514.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-28897] [TABLE-SQL] Fail to use udf in added jar when enabling checkpoint [flink]
ammu20-dev commented on PR #25656: URL: https://github.com/apache/flink/pull/25656#issuecomment-2582186905 > As per the comments please could you investigate: > > * potential regressions > * the loss of the user class loader > * whether existing option could solve this without a code change @davidradl > * whether existing option could solve this without a code change Existing options might not help us in solving the issue, as stated in this [comment](https://github.com/apache/flink/pull/25656#discussion_r1904076204) > * the loss of the user class loader I noticed that at this line here https://github.com/apache/flink/blob/4b306699811af0b6ff2cb862914adfda56345996/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java#L2472 the class loader expected as per the naming convention is the user class loader. But while debugging the issue I found that the class loader arriving here is the App class loader and not the user class loader. In order to pull in the user class loader, I am consuming the user class loader from resourceManager like [this](https://github.com/apache/flink/blob/dfd9e0e644a5ee9d0d3238c9f03a2942dc4a52e6/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java#L1036) and setting it as the thread context class loader, which is later used [here](https://github.com/apache/flink/blob/dfd9e0e644a5ee9d0d3238c9f03a2942dc4a52e6/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java#L2472). This is how the issue was fixed. > * potential regressions Analysing the code, I can confirm that the scope of this fix is limited to the SQL execution path, and I see minimal possibility of regressions. Also, I am restoring the thread class loader back to the original class loader after execution of the SQL query ([here](https://github.com/apache/flink/blob/dfd9e0e644a5ee9d0d3238c9f03a2942dc4a52e6/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java#L1063)) so that it doesn't have any further impact. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
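The fix described above follows a common save-and-restore pattern for the thread context class loader. A generic, self-contained sketch; the names are illustrative, not Flink's actual code:

```java
public final class ClassLoaderScope {
    private ClassLoaderScope() {}

    /**
     * Runs {@code action} with {@code userClassLoader} installed as the
     * thread context class loader, restoring the previous loader afterwards.
     */
    public static void runWithClassLoader(ClassLoader userClassLoader, Runnable action) {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        current.setContextClassLoader(userClassLoader);
        try {
            action.run();
        } finally {
            // Restore so the user class loader does not leak into later work
            // performed on this thread.
            current.setContextClassLoader(original);
        }
    }
}
```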
Re: [PR] [FLINK-36979][rpc] Reverting pekko version bump in Flink 1.20 [flink]
He-Pin commented on PR #25866: URL: https://github.com/apache/flink/pull/25866#issuecomment-2582436133 @davidradl Is there any update on the investigation from your side? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] [hotfix][docs] Typo fix in RBAC docs [flink-kubernetes-operator]
gunnarmorling opened a new pull request, #931: URL: https://github.com/apache/flink-kubernetes-operator/pull/931 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-37011] Improve get source field value by column name in PreTransformProcessor. [flink-cdc]
yuxiqian commented on code in PR #3836: URL: https://github.com/apache/flink-cdc/pull/3836#discussion_r1910216599 ## flink-cdc-runtime/src/main/java/org/apache/flink/cdc/runtime/operators/transform/PreTransformProcessor.java: ## @@ -62,30 +62,18 @@ public CreateTableEvent preTransformCreateTableEvent(CreateTableEvent createTabl public BinaryRecordData processFillDataField(BinaryRecordData data) { List<Object> valueList = new ArrayList<>(); List<Column> columns = tableChangeInfo.getPreTransformedSchema().getColumns(); -for (int i = 0; i < columns.size(); i++) { -valueList.add( -getValueFromBinaryRecordData( -columns.get(i).getName(), -data, -tableChangeInfo.getSourceSchema().getColumns(), -tableChangeInfo.getSourceFieldGetters())); +Map<String, RecordData.FieldGetter> sourceFieldGetters = +tableChangeInfo.getSourceFieldGetters(); +for (Column column : columns) { +RecordData.FieldGetter fieldGetter = sourceFieldGetters.get(column.getName()); +if (fieldGetter != null) { Review Comment: I think `TableChangeInfo` should be able to guarantee the invariant that for each column in `preTransformedSchema`, there is a corresponding `fieldGetter` available. ## flink-cdc-runtime/src/main/java/org/apache/flink/cdc/runtime/operators/transform/PreTransformChangeInfo.java: ## @@ -59,7 +61,11 @@ public PreTransformChangeInfo( this.tableId = tableId; this.sourceSchema = sourceSchema; this.preTransformedSchema = preTransformedSchema; -this.sourceFieldGetters = sourceFieldGetters; +this.sourceFieldGetters = new HashMap<>(sourceSchema.getColumnCount()); +for (int i = 0; i < sourceSchema.getColumns().size(); i++) { +this.sourceFieldGetters.put( +sourceSchema.getColumns().get(i).getName(), sourceFieldGetters[i]); +} Review Comment: Maybe `this.sourceFieldGetters` -> `this.sourceFieldGettersMap`, as it now has a different type and meaning from the constructor parameter. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
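A self-contained sketch of the lookup-by-name refactor under discussion, with the reviewer's invariant made explicit; the generic type `G` stands in for `RecordData.FieldGetter`:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FieldGetterIndex<G> {
    private final Map<String, G> gettersByName = new HashMap<>();

    FieldGetterIndex(List<String> sourceColumnNames, List<G> sourceGetters) {
        // Build the name -> getter map once, instead of scanning the source
        // columns for every field of every record.
        for (int i = 0; i < sourceColumnNames.size(); i++) {
            gettersByName.put(sourceColumnNames.get(i), sourceGetters.get(i));
        }
    }

    // Per the review note: every pre-transformed column must have a getter,
    // so a miss indicates a broken invariant rather than a normal case.
    G getterFor(String columnName) {
        G getter = gettersByName.get(columnName);
        if (getter == null) {
            throw new IllegalStateException("No field getter for column " + columnName);
        }
        return getter;
    }
}
```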
Re: [PR] [FLINK-36506][table] Remove all deprecated methods in ColumnStats [flink]
xuyangzhong commented on code in PR #25950: URL: https://github.com/apache/flink/pull/25950#discussion_r1910221269 ## flink-table/flink-table-common/src/main/java/org/apache/flink/table/plan/stats/ColumnStats.java: ## @@ -118,40 +81,10 @@ public Integer getMaxLen() { return maxLen; } -/** - * Deprecated because Number type max/min is not well supported comparable type, e.g. {@link - * java.util.Date}, {@link java.sql.Timestamp}. - * - * Returns null if this instance is constructed by {@link ColumnStats.Builder}. - */ -@Deprecated -public Number getMaxValue() { -return maxValue; -} - -/** Review Comment: Yes. Because the legacy constructor `{@link ColumnStats#ColumnStats(Long, Long, Double, Integer, Number, Number)}` has been removed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-36964] Fix potential exception when SchemaChange in parallel w… [flink-cdc]
lvyanquan commented on PR #3818: URL: https://github.com/apache/flink-cdc/pull/3818#issuecomment-2582453742 Rebased to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-36608) Support dynamic StreamGraph optimization for AdaptiveBroadcastJoinOperator
[ https://issues.apache.org/jira/browse/FLINK-36608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-36608. --- Resolution: Done 5d0cb98fe7843796e6ca0598060839f2b48f0882 627395acf46b8b38ee2c8db2cb6ba31849db9f27 80a79bb5f6320a97e4051cba6e22c151dc9603cf > Support dynamic StreamGraph optimization for AdaptiveBroadcastJoinOperator > -- > > Key: FLINK-36608 > URL: https://issues.apache.org/jira/browse/FLINK-36608 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Affects Versions: 2.0.0 >Reporter: xingbe >Assignee: xingbe >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > Optimize the AdaptiveBroadcastJoinOperator dynamically at runtime by > transforming qualifying joins into broadcasts. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-36506][table] Remove all deprecated methods in ColumnStats [flink]
davidradl commented on code in PR #25950: URL: https://github.com/apache/flink/pull/25950#discussion_r1910216285 ## flink-table/flink-table-common/src/main/java/org/apache/flink/table/plan/stats/ColumnStats.java: ## @@ -118,40 +81,10 @@ public Integer getMaxLen() { return maxLen; } -/** - * Deprecated because Number type max/min is not well supported comparable type, e.g. {@link - * java.util.Date}, {@link java.sql.Timestamp}. - * - * Returns null if this instance is constructed by {@link ColumnStats.Builder}. - */ -@Deprecated -public Number getMaxValue() { -return maxValue; -} - -/** Review Comment: It looks like you have removed the javadoc associated with the `getMaxValue` method -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-36494][table] Remove deprecated method Catalog#getTableFactory [flink]
davidradl commented on code in PR #25948: URL: https://github.com/apache/flink/pull/25948#discussion_r1910227028 ## flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/factories/TableFactoryUtil.java: ## @@ -70,28 +66,14 @@ public static TableSource findAndCreateTableSource(TableSourceFactory.Con */ @SuppressWarnings("unchecked") public static TableSource findAndCreateTableSource( -@Nullable Catalog catalog, ObjectIdentifier objectIdentifier, CatalogTable catalogTable, ReadableConfig configuration, boolean isTemporary) { TableSourceFactory.Context context = new TableSourceFactoryContextImpl( objectIdentifier, catalogTable, configuration, isTemporary); -Optional factoryOptional = -catalog == null ? Optional.empty() : catalog.getTableFactory(); Review Comment: A question: why is it safe to delete this call rather than using getFactory() as per the deprecation comments in getTableFactory. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
davidradl commented on PR #25947: URL: https://github.com/apache/flink/pull/25947#issuecomment-2582466083 I see in HiveDynamicTableFactory that there are references to RequireCatalogLock -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-37097) Remove
david radley created FLINK-37097: Summary: Remove Key: FLINK-37097 URL: https://issues.apache.org/jira/browse/FLINK-37097 Project: Flink Issue Type: Technical Debt Components: Connectors / Hive Affects Versions: 2.0.0 Reporter: david radley As per [https://github.com/apache/flink/pull/25947] the Hive code has been externalized into a new repo. We should remove flink-connector-hive and the flink-sql-connector-hive-* maven modules from core flink . I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [FLINK-36910][table] Add function call syntax support to time-related dynamic functions [flink]
gustavodemorais opened a new pull request, #25951: URL: https://github.com/apache/flink/pull/25951 ## What is the purpose of the change Enables dynamic time-related functions to be called with or without parentheses ## Brief change log - Add function call syntax for dynamic functions - Support null inputTable for sql from tableapi - Add support for BuiltInFunction test in batch mode - Add TimeFunctionsBatchModeITCase and testSelectDynamicDatetimeFunctions ## Verifying this change - Added TimeFunctionsBatchModeITCase and testSelectDynamicDatetimeFunctions to test all modified functions ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: don't know - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
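For illustration, the change should allow both spellings of a time-related function to parse. This is a sketch of the intended behavior, not output from the PR's tests:

```sql
-- Without function-call syntax (already supported):
SELECT CURRENT_TIMESTAMP;
-- With function-call syntax (enabled by this change):
SELECT CURRENT_TIMESTAMP();
```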
Re: [PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
lsyldliu commented on PR #25947: URL: https://github.com/apache/flink/pull/25947#issuecomment-2582517352 > I just checked the external Hive connector and that refers to the class we are removing. I think we need a plan for the Hive connector before removing it from core Flink We can remove the related code when we bump the Flink version to 2.0 in the Hive connector. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-37097) Remove
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dalongliu reassigned FLINK-37097: - Assignee: david radley > Remove > --- > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove flink-connector-hive and the > flink-sql-connector-hive-* maven modules from core flink . > > I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
davidradl commented on PR #25947: URL: https://github.com/apache/flink/pull/25947#issuecomment-2582518192 I have raised https://issues.apache.org/jira/browse/FLINK-37097 to track the removal of the Hive connector from core flink, -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
lsyldliu commented on PR #25947: URL: https://github.com/apache/flink/pull/25947#issuecomment-2582519779 > I have raised https://issues.apache.org/jira/browse/FLINK-37097 to track the removal of the Hive connector from core flink, Good job, assign to you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-37094) Adapt the external Hive connector to the removal of deprecated APIs in table module in 2.0
[ https://issues.apache.org/jira/browse/FLINK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911929#comment-17911929 ] xuyang commented on FLINK-37094: [~Sergey Nuyanzin] Okay... IIUC we can adapt, test, and release the Hive connector after releasing flink 2.0.0, similar to other external connectors, and then remove it from flink repo in 2.1 (I believe it's likely that there won't be enough time to delete it in 2.0...). What do you think? When the time comes for the Hive connector to adapt to 2.0.0 where the deprecated table APIs have been removed, I can assist with that. > Adapt the external Hive connector to the removal of deprecated APIs in table > module in 2.0 > -- > > Key: FLINK-37094 > URL: https://issues.apache.org/jira/browse/FLINK-37094 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Reporter: xuyang >Priority: Major > > Since the Hive connector is scheduled to be removed from the Flink repo, and > the compilation of the remaining Hive connector module within the Flink repo > has been skipped, all deprecation work related to the table module performed > in version 2.0 needs to be synced to the external Hive connector's separate > repository. > There will be no further modifications to the remaining Hive connector module > in the Flink repo, as any changes made cannot be guaranteed to be validated > through CI. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-36698][pipeline-connector][elasticsearch] Elasticsearch pipeline sink support authentication [flink-cdc]
lvyanquan commented on PR #3728: URL: https://github.com/apache/flink-cdc/pull/3728#issuecomment-2582478475 Hi @beryllw, would it be convenient to add a test to verify that this authentication is effective? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-37097) Remove
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Nuyanzin closed FLINK-37097. --- Resolution: Duplicate > Remove > --- > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove flink-connector-hive and the > flink-sql-connector-hive-* maven modules from core flink . > > I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-37097) Remove
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911932#comment-17911932 ] Sergey Nuyanzin commented on FLINK-37097: - This is a duplicate of https://issues.apache.org/jira/browse/FLINK-33786 > Remove > --- > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove flink-connector-hive and the > flink-sql-connector-hive-* maven modules from core flink . > > I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-3154][API] Upgrade from Kryo 2.x to Kryo 5.x. Removed twitter … [flink]
davidradl commented on PR #25896: URL: https://github.com/apache/flink/pull/25896#issuecomment-258276 @kurtostfeld on your suggestion - I have raised this on the dev list. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-37091][table-common] Remove deprecated interface CatalogLock [flink]
xuyangzhong commented on PR #25947: URL: https://github.com/apache/flink/pull/25947#issuecomment-2582796136 @davidradl FYI, we can ignore all classes in the Hive connector module in the Flink repo. You can find more details here: https://issues.apache.org/jira/browse/FLINK-37094 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] [FLINK-37100][tests] Fix `test_netty_shuffle_memory_control.sh` in CI for JDK11+ [flink]
ferenc-csaky opened a new pull request, #25955: URL: https://github.com/apache/flink/pull/25955 ## What is the purpose of the change Fixes the test executed by `test_netty_shuffle_memory_control.sh`, which can fail the CI when Netty4 cannot reserve enough memory and Pekko is therefore unable to start up. ## Brief change log - Set `io.netty.tryReflectionSetAccessible=true` for TMs, which reduces the Netty4 memory footprint. ## Verifying this change The existing test should run consistently. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? handled in different PR -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
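One plausible way such a JVM property reaches TaskManager processes is via the JVM options in the Flink configuration. This is a sketch; the PR may wire it differently, e.g. directly in the test script:

```yaml
# Reduces Netty4's direct-memory footprint on JDK11+ so Pekko can start up.
env.java.opts.taskmanager: "-Dio.netty.tryReflectionSetAccessible=true"
```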
[jira] [Updated] (FLINK-37100) Fix test_netty_shuffle_memory_control.sh in CI for JDK11+
[ https://issues.apache.org/jira/browse/FLINK-37100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-37100: --- Labels: pull-request-available (was: ) > Fix test_netty_shuffle_memory_control.sh in CI for JDK11+ > - > > Key: FLINK-37100 > URL: https://issues.apache.org/jira/browse/FLINK-37100 > Project: Flink > Issue Type: Improvement >Reporter: Ferenc Csaky >Assignee: Ferenc Csaky >Priority: Major > Labels: pull-request-available > Fix For: 1.19.2, 1.20.1 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-36979][rpc] Reverting pekko version bump in Flink 1.20 [flink]
ferenc-csaky commented on PR #25866: URL: https://github.com/apache/flink/pull/25866#issuecomment-2582907004 Opened https://github.com/apache/flink/pull/25955 which I believe should supersede this current PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-36753) Adaptive Scheduler actively triggers a Checkpoint
[ https://issues.apache.org/jira/browse/FLINK-36753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911938#comment-17911938 ] Samrat Deb commented on FLINK-36753: Thank you [~fanrui] for the input. For 1: it makes sense to unify scale-up and scale-down with the active trigger. For 3: I'm on board with your thought not to trigger anything when a checkpoint is already running. Regarding point 2: it would be great to hear other opinions and how the community thinks about it. > Adaptive Scheduler actively triggers a Checkpoint > - > > Key: FLINK-36753 > URL: https://issues.apache.org/jira/browse/FLINK-36753 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 2.0-preview >Reporter: Rui Fan >Assignee: Samrat Deb >Priority: Major > > FLIP-461[1] and FLINK-35549[2] support that rescale could be executed after > the next completed checkpoint. It greatly reduces the amount of data replay > after rescale. > In FLIP-461, Adaptive Scheduler waits for the next periodic checkpoint to be > triggered. In most scenarios, a more efficient solution might be Adaptive > Scheduler actively triggers a Checkpoint after all resources are > ready(Technically desire resources are ready). > The idea comes from an offline discussion between [~mxm] and [~fanrui]. > [1][https://cwiki.apache.org/confluence/display/FLINK/FLIP-461%3A+Synchronize+rescaling+with+checkpoint+creation+to+minimize+reprocessing+for+the+AdaptiveScheduler] > [2] https://issues.apache.org/jira/browse/FLINK-35549 -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-36910][table] Add function call syntax support to time-related dynamic functions [flink]
gustavodemorais commented on PR #25806: URL: https://github.com/apache/flink/pull/25806#issuecomment-2582505543 The reason for the failures is the planner changes I made to try to extend testing. They are not valid, and I'll discard them. We'll stick to the testDynamicDatetimeFunctions and testDynamicDatetimeFunctionsAreEqual tests. Closing this in favor of https://github.com/apache/flink/pull/25951 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-36910][table] Add function call syntax support to time-related dynamic functions [flink]
gustavodemorais closed pull request #25806: [FLINK-36910][table] Add function call syntax support to time-related dynamic functions URL: https://github.com/apache/flink/pull/25806 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-36576][runtime] Improving amount-based data balancing distribution algorithm for DefaultVertexParallelismAndInputInfosDecider [flink]
zhuzhurk commented on code in PR #25552: URL: https://github.com/apache/flink/pull/25552#discussion_r1910075634 ## flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/adaptivebatch/VertexInputInfoComputerTestUtil.java: ## @@ -86,22 +90,55 @@ public static List<BlockingInputInfo> createBlockingInputInfos( return blockingInputInfos; } -public static void checkJobVertexInputInfo( +public static void checkParallelism( int targetParallelism, -List<BlockingInputInfo> inputInfos, +Map<IntermediateDataSetID, JobVertexInputInfo> vertexInputInfoMap) { +vertexInputInfoMap +.values() +.forEach( +info -> + assertThat(info.getExecutionVertexInputInfos().size()) +.isEqualTo(targetParallelism)); +} + +public static void checkConsumedSubpartitionGroups( List<Map<IndexRange, IndexRange>> targetConsumedSubpartitionGroups, +List<BlockingInputInfo> inputInfos, Map<IntermediateDataSetID, JobVertexInputInfo> vertexInputInfoMap) { JobVertexInputInfo vertexInputInfo = checkAndGetJobVertexInputInfo(inputInfos, vertexInputInfoMap); List<ExecutionVertexInputInfo> executionVertexInputInfos = vertexInputInfo.getExecutionVertexInputInfos(); - assertThat(executionVertexInputInfos.size()).isEqualTo(targetParallelism); -for (int i = 0; i < targetParallelism; i++) { +for (int i = 0; i < executionVertexInputInfos.size(); i++) { assertThat(executionVertexInputInfos.get(i).getConsumedSubpartitionGroups()) .isEqualTo(targetConsumedSubpartitionGroups.get(i)); } } +public static void checkConsumedDataVolumePerSubtask( +long[] targetConsumedDataVolume, +List<BlockingInputInfo> inputInfos, +Map<IntermediateDataSetID, JobVertexInputInfo> vertexInputs) { +long[] consumedDataVolume = new long[targetConsumedDataVolume.length]; +for (BlockingInputInfo inputInfo : inputInfos) { +JobVertexInputInfo vertexInputInfo = vertexInputs.get(inputInfo.getResultId()); +List<ExecutionVertexInputInfo> executionVertexInputInfos = +vertexInputInfo.getExecutionVertexInputInfos(); +for (int i = 0; i < executionVertexInputInfos.size(); ++i) { +ExecutionVertexInputInfo executionVertexInputInfo = +executionVertexInputInfos.get(i); +consumedDataVolume[i] += + executionVertexInputInfo.getConsumedSubpartitionGroups().entrySet().stream() +.mapToLong( +entry -> +inputInfo.getNumBytesProduced( +entry.getKey(), entry.getValue())) +.sum(); +} +} +assertThat(consumedDataVolume).isEqualTo(targetConsumedDataVolume); +} + public static JobVertexInputInfo checkAndGetJobVertexInputInfo( Review Comment: can be private ## flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/adaptivebatch/VertexInputInfoComputerTestUtil.java: ## @@ -86,22 +90,55 @@ public static List<BlockingInputInfo> createBlockingInputInfos( return blockingInputInfos; } -public static void checkJobVertexInputInfo( +public static void checkParallelism( Review Comment: can be private ## flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/adaptivebatch/VertexInputInfoComputerTestUtil.java: ## @@ -0,0 +1,357 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ */ + +package org.apache.flink.runtime.scheduler.adaptivebatch; + +import org.apache.flink.runtime.executiongraph.ExecutionVertexInputInfo; +import org.apache.flink.runtime.executiongraph.IndexRange; +import org.apache.flink.runtime.executiongraph.JobVertexInputInfo; +import org.apache.flink.runtime.jobgraph.IntermediateDataSetID; + +import java.util.ArrayList; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.stream.Collectors; +import java.util.stream.IntStream; + +import static org.apache.flink.runtim
[jira] [Commented] (FLINK-37097) Remove Hive connector from core Flink
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911962 ] lincoln lee commented on FLINK-37097: - [~davidradl] We're hoping to see this happen, but currently we do not have enough resources to do so. Before the externalization could happen, [~yuxia] finished the work to decouple Hive from the Flink planner[1], then Martijn & Sergey kicked off the removal work, but unfortunately the release hadn't finished yet (this is why #23899 was not merged). Recently we've been moving the table-module-related cleanup work[2] forward and discovered that the Hive connector code in the main repo also needs to be adapted and cleaned up accordingly, hence the current situation with the Hive externalization. If [~Sergey Nuyanzin] is willing to re-prepare the release (adapted to flink-1.20?), we can at least participate in the vote, and continue to complete the corresponding flink-2.0 code adjustments in the Hive repo. # https://issues.apache.org/jira/browse/FLINK-26603 # https://issues.apache.org/jira/browse/FLINK-36476 > Remove Hive connector from core Flink > - > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove flink-connector-hive and the > flink-sql-connector-hive-* maven modules from core flink . > > I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [FLINK-37026] Add Stale PR GitHub Action workflow [flink]
tomncooper opened a new pull request, #25953: URL: https://github.com/apache/flink/pull/25953 ## What is the purpose of the change This is the implementation of the [Stale PR GitHub Action Proposal](https://cwiki.apache.org/confluence/display/FLINK/Stale+PR+Cleanup). This was [voted through](https://lists.apache.org/thread/kc90254wvo5q934doh8o4sbj1fgwvy76) on the dev mailing list. The initial setting is for 6 months of inactivity to mark a PR as stale, with 3 months to respond before the PR is closed. Ideally, once the initial backlog is cleared, we will reduce these down to 3 months and 1 month respectively, to match other Apache projects. There is also configuration for when the action is run and how many PRs it will process in one run. Initially I have set this to every 6 hours and 200 PRs per run, which will allow us to process the backlog in a few days. After that, these can be reduced to daily runs with a lower request limit (e.g. 50). ## Brief change log - Enable Stale PR GitHub action workflow. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
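A sketch of what such a workflow could look like with `actions/stale`, using the numbers from the description above; the labels, version pin, and other details are assumptions, not necessarily what the PR contains:

```yaml
name: Stale PRs
on:
  schedule:
    - cron: "0 */6 * * *" # every 6 hours while the backlog is processed
permissions:
  pull-requests: write
jobs:
  stale:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/stale@v9
        with:
          days-before-pr-stale: 180   # 6 months of inactivity
          days-before-pr-close: 90    # 3 more months to respond
          days-before-issue-stale: -1 # issues live in JIRA; never mark them
          days-before-issue-close: -1
          stale-pr-label: stale
          operations-per-run: 200
```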
Re: [PR] [FLINK-37026] Add Stale PR GitHub Action workflow [flink]
tomncooper commented on PR #25953: URL: https://github.com/apache/flink/pull/25953#issuecomment-2582840912 cc @gyfora @1996fanrui -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-37026) Enable Stale PR Github Action
[ https://issues.apache.org/jira/browse/FLINK-37026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-37026: --- Labels: pull-request-available (was: ) > Enable Stale PR Github Action > - > > Key: FLINK-37026 > URL: https://issues.apache.org/jira/browse/FLINK-37026 > Project: Flink > Issue Type: Improvement >Reporter: Thomas Cooper >Priority: Minor > Labels: pull-request-available > > As part of the Community Health Initiative (CHI) we made a > [proposal|https://cwiki.apache.org/confluence/display/FLINK/Stale+PR+Cleanup] > to clean up old PRs which have seen no activity in a long time (stale PRs). > After > [discussion|https://lists.apache.org/thread/6yoclzmvymxors8vlpt4nn9r7t3stcsz] > on the mailing list the proposal was > [voted|https://lists.apache.org/thread/kc90254wvo5q934doh8o4sbj1fgwvy76] > through. > We now need to enable the Stale PR GitHub action with a stale period of 6 > months, with 3 months to refresh (interact in any way with) the PR. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [FLINK-37090][e2e] Add adaptive broadcast join to e2e tpc-ds tests. [flink]
SinBex opened a new pull request, #25954: URL: https://github.com/apache/flink/pull/25954 ## What is the purpose of the change Add adaptive broadcast join to the e2e TPC-DS tests. In this case, we will set the AdaptiveBroadcastJoinStrategy to RUNTIME_ONLY, significantly increasing the likelihood of triggering adaptive broadcast join. ## Brief change log - *Add adaptive broadcast join to e2e tpc-ds tests.* ## Verifying this change This change is already covered by existing tests, such as *test_tpcds.sh*. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
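As a rough sketch, the setup this test exercises corresponds to cluster configuration entries like the following. The strategy option key is an assumption for illustration (the PR names only the AdaptiveBroadcastJoinStrategy enum value RUNTIME_ONLY); table.optimizer.join.broadcast-threshold is an existing planner option, which the related ticket sets to 0:

```yaml
# Sketch of config.yaml entries for the adaptive broadcast join TPC-DS run.
# NOTE: the strategy key below is assumed for illustration; only the enum
# value RUNTIME_ONLY is named in the PR description.
table.optimizer.adaptive-broadcast-join.strategy: runtime_only
# A threshold of 0 effectively disables compile-time broadcast joins, so any
# broadcast join observed in the run must have been introduced at runtime.
table.optimizer.join.broadcast-threshold: 0
```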
[jira] [Updated] (FLINK-37090) Introduce a TPC-DS E2E case for adaptive broadcast join
[ https://issues.apache.org/jira/browse/FLINK-37090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-37090: --- Labels: pull-request-available (was: ) > Introduce a TPC-DS E2E case for adaptive broadcast join > --- > > Key: FLINK-37090 > URL: https://issues.apache.org/jira/browse/FLINK-37090 > Project: Flink > Issue Type: Sub-task >Reporter: Junrui Lee >Assignee: xingbe >Priority: Major > Labels: pull-request-available > > Introduce a specific TPC-DS case for adaptive broadcast join. > In this case, we will set the AdaptiveBroadcastJoinStrategy to RUNTIME_ONLY > and configure the table.optimizer.join.broadcast-threshold to 0, > significantly increasing the likelihood of triggering adaptive broadcast join. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-37098] Fix selecting time attribute from a view [flink]
flinkbot commented on PR #25952: URL: https://github.com/apache/flink/pull/25952#issuecomment-2582853024 ## CI report: * fdf0105d32c8a346d9a890411098715bd297ddee UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-37026] Add Stale PR GitHub Action workflow [flink]
flinkbot commented on PR #25953: URL: https://github.com/apache/flink/pull/25953#issuecomment-2582853276 ## CI report: * 03ee68c53e6523046cbb64f9a2b6d15823edb659 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-37090][e2e] Add adaptive broadcast join to e2e tpc-ds tests. [flink]
flinkbot commented on PR #25954: URL: https://github.com/apache/flink/pull/25954#issuecomment-2582853466 ## CI report: * 27e503efb6c2fc7742a513cd166399a51312af67 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-37097) Remove Hive connector from core Flink
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911964#comment-17911964 ] david radley commented on FLINK-37097: -- [~Sergey Nuyanzin] it seems that: - V1 (1.20) still ships the Hive connector - master has decoupled the Hive connector from the connector pom, so it is not shipped - So if V2 is not shipping the Hive connector, we should be able to remove the code from master but keep it in the 1.20 v1 branch. WDYT? > Remove Hive connector from core Flink > - > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove flink-connector-hive and the > flink-sql-connector-hive-* maven modules from core Flink. > > I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-37097) Remove Hive connector from core Flink
[ https://issues.apache.org/jira/browse/FLINK-37097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911962#comment-17911962 ] lincoln lee edited comment on FLINK-37097 at 1/10/25 1:24 PM: -- [~davidradl] We're hoping to see this happen, but currently we do not have enough resources to do so. Before the externalization could happen, [~yuxia] finished the work to decouple Hive from the Flink planner [1]; then Martijn & Sergey kicked off the removal work, but unfortunately the release hasn't finished yet (this is why #23899 was not merged). Recently we have been working on the table-module-related cleanup [2] and found that the Hive connector code in the main repo also needs to be adapted accordingly, which is how we arrived at the current situation with the Hive externalization. If [~Sergey Nuyanzin] is willing to re-prepare the release (adapted to flink-1.20?), we can at least participate in the vote and continue to complete the corresponding flink-2.0 code adjustments in the hive repo. # https://issues.apache.org/jira/browse/FLINK-26603 # https://issues.apache.org/jira/browse/FLINK-36476 was (Author: lincoln.86xy): [~davidradl] We're hoping to see this happen, but currently we do not have enough resources to do so. Before the externalization could happen, [~yuxia] finished the work to decouple Hive from the Flink planner [1]; then Martijn & Sergey kicked off the removal work, but unfortunately the release hasn't finished yet (this is why #23899 was not merged). Recently we have been working on the table-module-related cleanup [2] and discovered that the Hive connector code in the main repo also needs to be adapted and cleaned up accordingly, which is how we arrived at the current situation with the Hive externalization. If [~Sergey Nuyanzin] is willing to re-prepare the release (adapted to flink-1.20?), we can at least participate in the vote. We can then continue to complete the corresponding flink-2.0 code adjustments in the hive repo. # https://issues.apache.org/jira/browse/FLINK-26603 # https://issues.apache.org/jira/browse/FLINK-36476 > Remove Hive connector from core Flink > - > > Key: FLINK-37097 > URL: https://issues.apache.org/jira/browse/FLINK-37097 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 2.0.0 >Reporter: david radley >Assignee: david radley >Priority: Minor > > As per [https://github.com/apache/flink/pull/25947] the Hive code has been > externalized into a new repo. We should remove flink-connector-hive and the > flink-sql-connector-hive-* maven modules from core Flink. > > I am happy to make this change if someone can assign me the Jira. -- This message was sent by Atlassian Jira (v8.20.10#820010)