[jira] [Updated] (HUDI-3813) Schema Evolution Support DDL And DML Concurrency

2022-04-07 Thread YangXuan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YangXuan updated HUDI-3813: --- Description: Assume that there are only two fields, name and age, in table hudi_table. 1. Schema evolution is su

[GitHub] [hudi] danny0405 merged pull request #5236: [HUDI-3808] Flink bulk_insert timestamp(3) can not be read by Spark

2022-04-07 Thread GitBox
danny0405 merged PR #5236: URL: https://github.com/apache/hudi/pull/5236 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache

[hudi] branch master updated: [HUDI-3808] Flink bulk_insert timestamp(3) can not be read by Spark (#5236)

2022-04-07 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new e33149be9a [HUDI-3808] Flink bulk_insert timest

[GitHub] [hudi] hudi-bot commented on pull request #5247: [MINOR]if issuedInstant is empty, the endInstant should not skip the …

2022-04-07 Thread GitBox
hudi-bot commented on PR #5247: URL: https://github.com/apache/hudi/pull/5247#issuecomment-1091200158 ## CI report: * 4fc75d16c91766eeb89ee294848df07da439ad45 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7889

[jira] [Created] (HUDI-3815) Hudi Option metadata.compaction.delta_commits

2022-04-07 Thread Ibson (Jira)
Ibson created HUDI-3815: --- Summary: Hudi Option metadata.compaction.delta_commits Key: HUDI-3815 URL: https://issues.apache.org/jira/browse/HUDI-3815 Project: Apache Hudi Issue Type: Wish Comp

[jira] [Commented] (HUDI-2762) Ensure hive can query insert only logs in MOR

2022-04-07 Thread rex xiong (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17518693#comment-17518693 ] rex xiong commented on HUDI-2762: -  as [~mengtao] mentioned, hive will filter out files wh

[GitHub] [hudi] danny0405 commented on pull request #4421: [HUDI-3096] fixed the bug that the cow table(contains decimalType) wriite by flink cannot be read by spark.

2022-04-07 Thread GitBox
danny0405 commented on PR #4421: URL: https://github.com/apache/hudi/pull/4421#issuecomment-1091336679 > > > @danny0405 Thank you for your attention I suggest include to 0.11. What do you suggest? > > > > > > Yes, could you test hive 2.x and 3.x both for this patch ? Thanks in ad

[jira] [Created] (HUDI-3816) Change HudiSplitSource to use async queue and shared executor service

2022-04-07 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-3816: - Summary: Change HudiSplitSource to use async queue and shared executor service Key: HUDI-3816 URL: https://issues.apache.org/jira/browse/HUDI-3816 Project: Apache Hudi

[GitHub] [hudi] pratyakshsharma commented on a diff in pull request #4910: [HUDI-2560][RFC-33] Support full Schema evolution for Spark

2022-04-07 Thread GitBox
pratyakshsharma commented on code in PR #4910: URL: https://github.com/apache/hudi/pull/4910#discussion_r844893684 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -1505,4 +1549,138 @@ private void tryUpgrade(HoodieTableMet

[jira] [Updated] (HUDI-3816) Change HudiSplitSource to use async queue and shared executor service

2022-04-07 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3816: -- Description: Please see this comment on the PR:  [https://github.com/trinodb/trino/pull/10228#issuecomm

[jira] [Created] (HUDI-3817) Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread rex xiong (Jira)
rex xiong created HUDI-3817: --- Summary: Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3 Key: HUDI-3817 URL: https://issues.apache.org/jira/browse/HUDI-3817 Project:

[jira] [Created] (HUDI-3818) hudi doesn't support bytes column as primary key

2022-04-07 Thread rex xiong (Jira)
rex xiong created HUDI-3818: --- Summary: hudi doesn't support bytes column as primary key Key: HUDI-3818 URL: https://issues.apache.org/jira/browse/HUDI-3818 Project: Apache Hudi Issue Type: Bug

[GitHub] [hudi] KnightChess opened a new issue, #5248: [QUESION] Should filter prop "hoodie.datasource.write.operation" when use spark sql create table

2022-04-07 Thread GitBox
KnightChess opened a new issue, #5248: URL: https://github.com/apache/hudi/issues/5248 When I use Spark SQL to create a table and set **hoodie.datasource.write.operation**=upsert, DELETE SQL (like PR #5215), INSERT OVERWRITE SQL, etc. will still use **hoodie.datasource.write.operation** to up

[GitHub] [hudi] KnightChess commented on pull request #5215: [HUDI-3781] fix spark delete sql can not delete record

2022-04-07 Thread GitBox
KnightChess commented on PR #5215: URL: https://github.com/apache/hudi/pull/5215#issuecomment-1091399157 Other commands have the same problem; I opened an issue to discuss it: #5248

[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #4910: [HUDI-2560][RFC-33] Support full Schema evolution for Spark

2022-04-07 Thread GitBox
xiarixiaoyao commented on code in PR #4910: URL: https://github.com/apache/hudi/pull/4910#discussion_r844907147 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -1505,4 +1549,138 @@ private void tryUpgrade(HoodieTableMetaCl

[jira] [Updated] (HUDI-3817) Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread rex xiong (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rex xiong updated HUDI-3817: Priority: Minor (was: Major) > Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi

[GitHub] [hudi] danny0405 merged pull request #4421: [HUDI-3096] fixed the bug that the cow table(contains decimalType) wriite by flink cannot be read by spark.

2022-04-07 Thread GitBox
danny0405 merged PR #4421: URL: https://github.com/apache/hudi/pull/4421

[hudi] branch master updated (e33149be9a -> 531381faff)

2022-04-07 Thread danny0405
danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from e33149be9a [HUDI-3808] Flink bulk_insert timestamp(3) can not be read by Spark (#5236) add 531381faff [HUDI-309

[GitHub] [hudi] toninis opened a new issue, #5249: [SUPPORT] Deltastreamer job does not terminate on Kubernetes when hoodie.metrics.on=true

2022-04-07 Thread GitBox
toninis opened a new issue, #5249: URL: https://github.com/apache/hudi/issues/5249 **Describe the problem you faced** We've noticed that when you enable Hoodie JMX metrics, the shutdown hook is never called. We took a thread dump to check which threads remain in running state and keep

[jira] [Updated] (HUDI-3801) Validate/verify clustering commit metadata preservation

2022-04-07 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3801: -- Status: In Progress (was: Open) > Validate/verify clustering commit metadata preservation > ---

[jira] [Commented] (HUDI-3801) Validate/verify clustering commit metadata preservation

2022-04-07 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17518743#comment-17518743 ] Sagar Sumit commented on HUDI-3801: --- Verified both the points above: [https://gist.gith

[GitHub] [hudi] codope merged pull request #5245: [HUDI-3805] Delete existing corrupted requested rollback plan during rollback

2022-04-07 Thread GitBox
codope merged PR #5245: URL: https://github.com/apache/hudi/pull/5245

[hudi] branch master updated: [HUDI-3805] Delete existing corrupted requested rollback plan during rollback (#5245)

2022-04-07 Thread codope
codope pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 9d744bb35c [HUDI-3805] Delete existing corrupted r

[GitHub] [hudi] RexXiong opened a new pull request, #5250: [HUDI-3817] specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread GitBox
RexXiong opened a new pull request, #5250: URL: https://github.com/apache/hudi/pull/5250 ## What is the purpose of the pull request This PR specifies the parquet version for the hudi-hadoop-mr-bundle module, to resolve the conflict for Hive in which hudi-hadoop-mr-bundle will include 1.

[jira] [Updated] (HUDI-3817) Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3817: - Labels: pull-request-available (was: ) > Need to specify parquet version for hudi-hadoop-mr-bundl

[GitHub] [hudi] hudi-bot commented on pull request #5250: [HUDI-3817] specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread GitBox
hudi-bot commented on PR #5250: URL: https://github.com/apache/hudi/pull/5250#issuecomment-1091544533 ## CI report: * 81ce27f6aad1498e7634d6e5f380e9408ef1e78e UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] codope commented on a diff in pull request #4118: [HUDI-2774] Handle duplicate instants while fetching pending clustering plans

2022-04-07 Thread GitBox
codope commented on code in PR #4118: URL: https://github.com/apache/hudi/pull/4118#discussion_r844982092 ## hudi-common/src/main/java/org/apache/hudi/common/util/ClusteringUtils.java: ## @@ -124,7 +125,16 @@ // get all filegroups in the plan getFileGroupEntrie

[GitHub] [hudi] codope commented on a diff in pull request #5207: [HUDI-3772] Fixing auto adjustment of lock configs for deltastreamer

2022-04-07 Thread GitBox
codope commented on code in PR #5207: URL: https://github.com/apache/hudi/pull/5207#discussion_r844982369 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java: ## @@ -2480,41 +2494,42 @@ protected void setDefaults() { HoodieLa

[GitHub] [hudi] RexXiong commented on pull request #5250: [HUDI-3817] specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread GitBox
RexXiong commented on PR #5250: URL: https://github.com/apache/hudi/pull/5250#issuecomment-1091549449 @xushiyan do you have time to look at this pr?

[GitHub] [hudi] hudi-bot commented on pull request #5250: [HUDI-3817] specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread GitBox
hudi-bot commented on PR #5250: URL: https://github.com/apache/hudi/pull/5250#issuecomment-1091551379 ## CI report: * 81ce27f6aad1498e7634d6e5f380e9408ef1e78e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7893

[GitHub] [hudi] xushiyan commented on a diff in pull request #5051: [HUDI-3643] Fix hive count exception when the table is empty and the path depth is less than 3

2022-04-07 Thread GitBox
xushiyan commented on code in PR #5051: URL: https://github.com/apache/hudi/pull/5051#discussion_r845013508 ## hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieInputFormatUtils.java: ## @@ -324,14 +326,24 @@ public static HoodieTimeline getHoodieTimelineForIncrem

[GitHub] [hudi] xushiyan merged pull request #5051: [HUDI-3643] Fix hive count exception when the table is empty and the path depth is less than 3

2022-04-07 Thread GitBox
xushiyan merged PR #5051: URL: https://github.com/apache/hudi/pull/5051

[hudi] branch master updated (9d744bb35c -> 6a8396420c)

2022-04-07 Thread xushiyan
xushiyan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 9d744bb35c [HUDI-3805] Delete existing corrupted requested rollback plan during rollback (#5245) add 6a8396420c

[GitHub] [hudi] pratyakshsharma commented on a diff in pull request #4910: [HUDI-2560][RFC-33] Support full Schema evolution for Spark

2022-04-07 Thread GitBox
pratyakshsharma commented on code in PR #4910: URL: https://github.com/apache/hudi/pull/4910#discussion_r845019633 ## hudi-common/src/main/java/org/apache/hudi/internal/schema/utils/AvroSchemaEvolutionUtils.java: ## @@ -0,0 +1,142 @@ +/* + * Licensed to the Apache Software Found

[jira] [Closed] (HUDI-3729) Enable Spark vectorized read for non-incremental read paths

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-3729. Resolution: Fixed > Enable Spark vectorized read for non-incremental read paths > --

[jira] [Updated] (HUDI-3817) Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3817: - Fix Version/s: 0.11.0 > Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi > usi

[jira] [Updated] (HUDI-3817) Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3817: - Priority: Critical (was: Minor) > Need to specify parquet version for hudi-hadoop-mr-bundle when compile

[GitHub] [hudi] xiaozhch5 opened a new pull request, #5251: Fix hadoop 3 compile Error, use FSDataOutputStream(OutputStream out, FileSystem.Statistics stats) instead of FSDataOutputStream(OutputStream

2022-04-07 Thread GitBox
xiaozhch5 opened a new pull request, #5251: URL: https://github.com/apache/hudi/pull/5251 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the purpo

[GitHub] [hudi] hudi-bot commented on pull request #5251: Fix hadoop 3 compile Error, use FSDataOutputStream(OutputStream out, FileSystem.Statistics stats) instead of FSDataOutputStream(OutputStream o

2022-04-07 Thread GitBox
hudi-bot commented on PR #5251: URL: https://github.com/apache/hudi/pull/5251#issuecomment-1091629481 ## CI report: * 798fa04c4e6870c0162567d8afb66a78ebc1fb5e UNKNOWN

[GitHub] [hudi] xushiyan commented on a diff in pull request #5250: [HUDI-3817] specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread GitBox
xushiyan commented on code in PR #5250: URL: https://github.com/apache/hudi/pull/5250#discussion_r845031704 ## packaging/hudi-hadoop-mr-bundle/pom.xml: ## @@ -29,6 +29,7 @@ true ${project.parent.basedir} +${hive.parquet.version} Review Comment: shouldn't th

[jira] [Closed] (HUDI-3643) Hive count throws exception when the table is empty and the path depth is less than 3

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-3643. Reviewers: Raymond Xu, Tao Meng Resolution: Fixed > Hive count throws exception when the table is empty

[GitHub] [hudi] hudi-bot commented on pull request #5251: Fix hadoop 3 compile Error, use FSDataOutputStream(OutputStream out, FileSystem.Statistics stats) instead of FSDataOutputStream(OutputStream o

2022-04-07 Thread GitBox
hudi-bot commented on PR #5251: URL: https://github.com/apache/hudi/pull/5251#issuecomment-1091632049 ## CI report: * 798fa04c4e6870c0162567d8afb66a78ebc1fb5e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7895

[jira] [Updated] (HUDI-3643) Hive count throws exception when the table is empty and the path depth is less than 3

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3643: - Priority: Blocker (was: Critical) > Hive count throws exception when the table is empty and the path dept

[GitHub] [hudi] danny0405 commented on a diff in pull request #5251: Fix hadoop 3 compile Error, use FSDataOutputStream(OutputStream out, FileSystem.Statistics stats) instead of FSDataOutputStream(Out

2022-04-07 Thread GitBox
danny0405 commented on code in PR #5251: URL: https://github.com/apache/hudi/pull/5251#discussion_r845038201 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieParquetDataBlock.java: ## @@ -109,7 +109,7 @@ public HoodieLogBlockType getBlockType() {

[jira] [Updated] (HUDI-3817) Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3817: - Sprint: Hudi-Sprint-Apr-05 > Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi

[jira] [Updated] (HUDI-3817) Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3817: - Priority: Blocker (was: Critical) > Need to specify parquet version for hudi-hadoop-mr-bundle when compil

[jira] [Updated] (HUDI-3812) Metadata is not enabled by default on the Read Path

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3812: - Sprint: Hudi-Sprint-Apr-05 > Metadata is not enabled by default on the Read Path > ---

[GitHub] [hudi] codope commented on a diff in pull request #5250: [HUDI-3817] specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread GitBox
codope commented on code in PR #5250: URL: https://github.com/apache/hudi/pull/5250#discussion_r845041708 ## packaging/hudi-hadoop-mr-bundle/pom.xml: ## @@ -29,6 +29,7 @@ true ${project.parent.basedir} +${hive.parquet.version} Review Comment: parquet-avro i

[GitHub] [hudi] xiarixiaoyao commented on pull request #4421: [HUDI-3096] fixed the bug that the cow table(contains decimalType) wriite by flink cannot be read by spark.

2022-04-07 Thread GitBox
xiarixiaoyao commented on PR #4421: URL: https://github.com/apache/hudi/pull/4421#issuecomment-1091640655 @danny0405 @nsivabalan I'm very sorry for taking so long to reply. I have already tested with Hive 3.1.1, Hive 2.3.1, and Hive 1.2.1; it works well with this PR.

[GitHub] [hudi] hudi-bot commented on pull request #5250: [HUDI-3817] specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread GitBox
hudi-bot commented on PR #5250: URL: https://github.com/apache/hudi/pull/5250#issuecomment-1091642684 ## CI report: * 81ce27f6aad1498e7634d6e5f380e9408ef1e78e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7893

[GitHub] [hudi] xiaozhch5 commented on a diff in pull request #5251: Fix hadoop 3 compile Error, use FSDataOutputStream(OutputStream out, FileSystem.Statistics stats) instead of FSDataOutputStream(Out

2022-04-07 Thread GitBox
xiaozhch5 commented on code in PR #5251: URL: https://github.com/apache/hudi/pull/5251#discussion_r845044294 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieParquetDataBlock.java: ## @@ -109,7 +109,7 @@ public HoodieLogBlockType getBlockType() {

[GitHub] [hudi] XuQianJin-Stars commented on issue #5248: [QUESION] Should filter prop "hoodie.datasource.write.operation" when use spark sql create table?

2022-04-07 Thread GitBox
XuQianJin-Stars commented on issue #5248: URL: https://github.com/apache/hudi/issues/5248#issuecomment-1091647916 hi @KnightChess The problem of refraction is to sort out those parameters that cannot be covered.

[GitHub] [hudi] XuQianJin-Stars commented on pull request #5215: [HUDI-3781] fix spark delete sql can not delete record

2022-04-07 Thread GitBox
XuQianJin-Stars commented on PR #5215: URL: https://github.com/apache/hudi/pull/5215#issuecomment-1091656699 hi @KnightChess Retrigger the CI of the github action.

[GitHub] [hudi] codope commented on a diff in pull request #5250: [HUDI-3817] specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread GitBox
codope commented on code in PR #5250: URL: https://github.com/apache/hudi/pull/5250#discussion_r845061214 ## packaging/hudi-hadoop-mr-bundle/pom.xml: ## @@ -29,6 +29,7 @@ true ${project.parent.basedir} +${hive.parquet.version} Review Comment: As discussed w

[GitHub] [hudi] KnightChess commented on issue #5248: [QUESION] Should filter prop "hoodie.datasource.write.operation" when use spark sql create table?

2022-04-07 Thread GitBox
KnightChess commented on issue #5248: URL: https://github.com/apache/hudi/issues/5248#issuecomment-1091660722 @XuQianJin-Stars If I do not set this parameter when creating the table, everything works when using SQL to insert, delete, and so on, because each statement sets the operation itself at runtime. So, I

[GitHub] [hudi] codope commented on a diff in pull request #5250: [HUDI-3817] specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread GitBox
codope commented on code in PR #5250: URL: https://github.com/apache/hudi/pull/5250#discussion_r845064810 ## packaging/hudi-hadoop-mr-bundle/pom.xml: ## @@ -29,6 +29,7 @@ true ${project.parent.basedir} +${hive.parquet.version} Review Comment: For spark3.2,

[jira] [Updated] (HUDI-3817) Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3817: - Status: Patch Available (was: In Progress) > Need to specify parquet version for hudi-hadoop-mr-bundle wh

[jira] [Updated] (HUDI-3817) Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3817: - Status: In Progress (was: Open) > Need to specify parquet version for hudi-hadoop-mr-bundle when compile

[GitHub] [hudi] xiarixiaoyao commented on pull request #5215: [HUDI-3781] fix spark delete sql can not delete record

2022-04-07 Thread GitBox
xiarixiaoyao commented on PR #5215: URL: https://github.com/apache/hudi/pull/5215#issuecomment-1091674787 @KnightChess @XuQianJin-Stars How about forbidding setting hoodie.datasource.write.operation when we create a Hudi table? Maybe we can add a check when we create the table, or ignore this

[jira] [Updated] (HUDI-3817) Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3817: - Epic Link: HUDI-3529 > Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi > usin

[jira] [Updated] (HUDI-3817) Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3817: - Component/s: dependencies > Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi >

[jira] [Created] (HUDI-3819) upgrade spring cve-2022-22965

2022-04-07 Thread Jason-Morries Adam (Jira)
Jason-Morries Adam created HUDI-3819: Summary: upgrade spring cve-2022-22965 Key: HUDI-3819 URL: https://issues.apache.org/jira/browse/HUDI-3819 Project: Apache Hudi Issue Type: Bug

[jira] [Updated] (HUDI-3819) upgrade spring cve-2022-22965

2022-04-07 Thread Jason-Morries Adam (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason-Morries Adam updated HUDI-3819: - Affects Version/s: 0.9.0 > upgrade spring cve-2022-22965 > - >

[GitHub] [hudi] xushiyan commented on a diff in pull request #5205: [HUDI-3726] Harden constraints around switching between different key generators

2022-04-07 Thread GitBox
xushiyan commented on code in PR #5205: URL: https://github.com/apache/hudi/pull/5205#discussion_r845085528 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala: ## @@ -173,6 +175,29 @@ object HoodieWriterUtils { } } + /**

[GitHub] [hudi] xushiyan commented on a diff in pull request #5205: [HUDI-3726] Harden constraints around switching between different key generators

2022-04-07 Thread GitBox
xushiyan commented on code in PR #5205: URL: https://github.com/apache/hudi/pull/5205#discussion_r845087195 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala: ## @@ -173,6 +175,29 @@ object HoodieWriterUtils { } } + /**

[GitHub] [hudi] KnightChess commented on pull request #5215: [HUDI-3781] fix spark delete sql can not delete record

2022-04-07 Thread GitBox
KnightChess commented on PR #5215: URL: https://github.com/apache/hudi/pull/5215#issuecomment-1091690934 @XuQianJin-Stars how to retrigger it?

[jira] [Updated] (HUDI-3817) Need to specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3817: - Reviewers: Raymond Xu, Sagar Sumit (was: Raymond Xu) > Need to specify parquet version for hudi-hadoop-mr

[GitHub] [hudi] codope opened a new pull request, #5252: [HUDI-3454] Fix partition name in all code paths for LogRecordScanner

2022-04-07 Thread GitBox
codope opened a new pull request, #5252: URL: https://github.com/apache/hudi/pull/5252 ## What is the purpose of the pull request HUDI-3454 There were some paths where we did not init partition while building log record scanner. Fixed all such code paths. ## Brief change log

[jira] [Updated] (HUDI-3454) Fix partition name in all code paths for LogRecordScanner

2022-04-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3454: - Labels: pull-request-available (was: ) > Fix partition name in all code paths for LogRecordScanne

[GitHub] [hudi] KnightChess commented on pull request #5215: [HUDI-3781] fix spark delete sql can not delete record

2022-04-07 Thread GitBox
KnightChess commented on PR #5215: URL: https://github.com/apache/hudi/pull/5215#issuecomment-1091693575 > @KnightChess @XuQianJin-Stars How about forbidding setting hoodie.datasource.write.operation when we create a Hudi table? Maybe we can add a check when we create the table, or ignore this param

[GitHub] [hudi] XuQianJin-Stars commented on pull request #5215: [HUDI-3781] fix spark delete sql can not delete record

2022-04-07 Thread GitBox
XuQianJin-Stars commented on PR #5215: URL: https://github.com/apache/hudi/pull/5215#issuecomment-1091695966 > @XuQianJin-Stars how to retrigger it? reopen this pr or commit pr comments.

[GitHub] [hudi] hudi-bot commented on pull request #5252: [HUDI-3454] Fix partition name in all code paths for LogRecordScanner

2022-04-07 Thread GitBox
hudi-bot commented on PR #5252: URL: https://github.com/apache/hudi/pull/5252#issuecomment-1091697968 ## CI report: * 0dbfdb5962671aa72db7dd394e89fe759fd94f26 UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #5252: [HUDI-3454] Fix partition name in all code paths for LogRecordScanner

2022-04-07 Thread GitBox
hudi-bot commented on PR #5252: URL: https://github.com/apache/hudi/pull/5252#issuecomment-1091701466 ## CI report: * 0dbfdb5962671aa72db7dd394e89fe759fd94f26 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7896

[GitHub] [hudi] njalan opened a new issue, #5253: Hudi execution plan not generated properly [SUPPORT]

2022-04-07 Thread GitBox
njalan opened a new issue, #5253: URL: https://github.com/apache/hudi/issues/5253 I am trying to get column lineage from the Spark SQL query plan. Below is my SQL for testing; all the tables are Hudi tables. insert into test.datahub_3 select a.email, b.phone from test.datahub_1

[GitHub] [hudi] njalan commented on issue #5253: Hudi execution plan not generated properly [SUPPORT]

2022-04-07 Thread GitBox
njalan commented on issue #5253: URL: https://github.com/apache/hudi/issues/5253#issuecomment-1091704009 @XuQianJin-Stars Can you please take a look at this issue when you have the time.

[GitHub] [hudi] hudi-bot commented on pull request #5251: Fix hadoop 3 compile Error, use FSDataOutputStream(OutputStream out, FileSystem.Statistics stats) instead of FSDataOutputStream(OutputStream o

2022-04-07 Thread GitBox
hudi-bot commented on PR #5251: URL: https://github.com/apache/hudi/pull/5251#issuecomment-1091732767 ## CI report: * 798fa04c4e6870c0162567d8afb66a78ebc1fb5e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7895

[jira] [Assigned] (HUDI-3820) Validate switching between different key generators will throw exception

2022-04-07 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-3820: - Assignee: Rajesh > Validate switching between different key generators will throw

[jira] [Created] (HUDI-3820) Validate switching between different key generators will throw exception

2022-04-07 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-3820: - Summary: Validate switching between different key generators will throw exception Key: HUDI-3820 URL: https://issues.apache.org/jira/browse/HUDI-3820 Projec

[jira] [Updated] (HUDI-3820) Validate switching between different key generators will throw exception

2022-04-07 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3820: -- Fix Version/s: 0.12.0 > Validate switching between different key generators will throw e

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5205: [HUDI-3726] Harden constraints around switching between different key generators

2022-04-07 Thread GitBox
nsivabalan commented on code in PR #5205: URL: https://github.com/apache/hudi/pull/5205#discussion_r845185260 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala: ## @@ -173,6 +175,29 @@ object HoodieWriterUtils { } } + /*

[GitHub] [hudi] hudi-bot commented on pull request #5252: [HUDI-3454] Fix partition name in all code paths for LogRecordScanner

2022-04-07 Thread GitBox
hudi-bot commented on PR #5252: URL: https://github.com/apache/hudi/pull/5252#issuecomment-1091799514 ## CI report: * 0dbfdb5962671aa72db7dd394e89fe759fd94f26 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7896

[GitHub] [hudi] pratyakshsharma commented on a diff in pull request #4910: [HUDI-2560][RFC-33] Support full Schema evolution for Spark

2022-04-07 Thread GitBox
pratyakshsharma commented on code in PR #4910: URL: https://github.com/apache/hudi/pull/4910#discussion_r845211292 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -1505,4 +1549,138 @@ private void tryUpgrade(HoodieTableMet

[GitHub] [hudi] pratyakshsharma commented on pull request #5126: [HUDI-3689] Fixing nullability in schema for deltastreamer tests

2022-04-07 Thread GitBox
pratyakshsharma commented on PR #5126: URL: https://github.com/apache/hudi/pull/5126#issuecomment-1091842764 Please check CI failures.

[jira] [Closed] (HUDI-3805) Empty requested rollback plan can stay on the timeline forever

2022-04-07 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3805. - Resolution: Fixed > Empty requested rollback plan can stay on the timeline forever > -

[jira] [Created] (HUDI-3821) Unify datasource payload and compaction payload class config

2022-04-07 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-3821: - Summary: Unify datasource payload and compaction payload class config Key: HUDI-3821 URL: https://issues.apache.org/jira/browse/HUDI-3821 Project: Apache Hudi Iss

[jira] [Closed] (HUDI-3801) Validate/verify clustering commit metadata preservation

2022-04-07 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3801. - Resolution: Fixed > Validate/verify clustering commit metadata preservation >

[jira] [Updated] (HUDI-3454) Fix partition name in all code paths for LogRecordScanner

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3454: - Status: In Progress (was: Open) > Fix partition name in all code paths for LogRecordScanner > ---

[jira] [Updated] (HUDI-3812) Metadata is not enabled by default on the Read Path

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3812: - Status: In Progress (was: Open) > Metadata is not enabled by default on the Read Path > -

[jira] [Updated] (HUDI-3812) Metadata is not enabled by default on the Read Path

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3812: - Status: Patch Available (was: In Progress) > Metadata is not enabled by default on the Read Path > --

[jira] [Updated] (HUDI-3454) Fix partition name in all code paths for LogRecordScanner

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3454: - Status: Patch Available (was: In Progress) > Fix partition name in all code paths for LogRecordScanner >

[jira] [Assigned] (HUDI-3680) Update docs to reflect new Bundles Spark compatibility

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-3680: Assignee: Raymond Xu (was: Ethan Guo) > Update docs to reflect new Bundles Spark compatibility >

[jira] [Updated] (HUDI-3609) Create scala version specific artifacts for hudi-spark-client

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3609: - Fix Version/s: 0.12.0 (was: 0.11.0) > Create scala version specific artifacts for h

[jira] [Commented] (HUDI-3791) Test perf for point looks up for bloom filter and col stats partition

2022-04-07 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17518970#comment-17518970 ] Sagar Sumit commented on HUDI-3791: --- [~guoyihua] took up Items #1 and #2 and validated t

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5156: [HUDI-3571] Spark datasource continuous ingestion tool

2022-04-07 Thread GitBox
nsivabalan commented on code in PR #5156: URL: https://github.com/apache/hudi/pull/5156#discussion_r845305133 ## hudi-integ-test/src/main/scala/org/apache/hudi/integ/testsuite/SparkDataSourceContinuousIngest.scala: ## @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Foun

[GitHub] [hudi] hudi-bot commented on pull request #5156: [HUDI-3571] Spark datasource continuous ingestion tool

2022-04-07 Thread GitBox
hudi-bot commented on PR #5156: URL: https://github.com/apache/hudi/pull/5156#issuecomment-1091914191 ## CI report: * 34d7c9d5ede8e0b16bf511be6022b93b46b9868b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7883

[GitHub] [hudi] hudi-bot commented on pull request #5156: [HUDI-3571] Spark datasource continuous ingestion tool

2022-04-07 Thread GitBox
hudi-bot commented on PR #5156: URL: https://github.com/apache/hudi/pull/5156#issuecomment-1091918961 ## CI report: * 34d7c9d5ede8e0b16bf511be6022b93b46b9868b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7883

[GitHub] [hudi] hudi-bot commented on pull request #5156: [HUDI-3571] Spark datasource continuous ingestion tool

2022-04-07 Thread GitBox
hudi-bot commented on PR #5156: URL: https://github.com/apache/hudi/pull/5156#issuecomment-1092009680 ## CI report: * 3cde1c9c32640cfafd34a7fdb84035c21542e62f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7897

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5234: [HUDI-3637] Exclude uncommitted log files from metadata table validation

2022-04-07 Thread GitBox
nsivabalan commented on code in PR #5234: URL: https://github.com/apache/hudi/pull/5234#discussion_r845436003 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieMetadataTableValidator.java: ## @@ -723,6 +722,121 @@ private void validateBloomFilters( } } +

[GitHub] [hudi] nsivabalan merged pull request #5156: [HUDI-3571] Spark datasource continuous ingestion tool

2022-04-07 Thread GitBox
nsivabalan merged PR #5156: URL: https://github.com/apache/hudi/pull/5156
