[GitHub] [hudi] hudi-bot commented on pull request #5177: [HUDI-3746][DO_NOT_MERGE] Test CI

2022-04-07 Thread GitBox
hudi-bot commented on PR #5177: URL: https://github.com/apache/hudi/pull/5177#issuecomment-1092506337 ## CI report: * ec0cd6c76fcf1f8a2c0448c96c6c4c69a9fd9ffb Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7685

[GitHub] [hudi] stym06 opened a new issue, #5262: [SUPPORT]

2022-04-07 Thread GitBox
stym06 opened a new issue, #5262: URL: https://github.com/apache/hudi/issues/5262 **Describe the problem you faced** Deltastreamer job running on spark-on-k8s stops and executors die when ingesting CDC data from Mongo to Azure blob in INSERT mode **To Reproduce** Run Deltastream

[GitHub] [hudi] rahil-c commented on pull request #5256: [MINOR] Update README of docker build setup

2022-04-07 Thread GitBox
rahil-c commented on PR #5256: URL: https://github.com/apache/hudi/pull/5256#issuecomment-1092503783 Should we also add in here examples of docker build commands for building the images locally? -- This is an automated message from the Apache Git Service. To respond to the message, please

[jira] [Commented] (HUDI-3827) Promote the inetAddress picking strategy for NetworkUtils#getHostname

2022-04-07 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519337#comment-17519337 ] Danny Chen commented on HUDI-3827: -- Fixed via master branch: 67215abaf0bff74b85ffc51b2e67

[jira] [Resolved] (HUDI-3827) Promote the inetAddress picking strategy for NetworkUtils#getHostname

2022-04-07 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen resolved HUDI-3827. -- > Promote the inetAddress picking strategy for NetworkUtils#getHostname > --

[GitHub] [hudi] hudi-bot commented on pull request #5177: [HUDI-3746][DO_NOT_MERGE] Test CI

2022-04-07 Thread GitBox
hudi-bot commented on PR #5177: URL: https://github.com/apache/hudi/pull/5177#issuecomment-1092490422 ## CI report: * ec0cd6c76fcf1f8a2c0448c96c6c4c69a9fd9ffb Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7685

[hudi] branch master updated (7a6272fba1 -> 67215abaf0)

2022-04-07 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 7a6272fba1 [HUDI-3781] fix spark delete sql can not delete record (#5215) add 67215abaf0 [HUDI-3827] Promote th

[GitHub] [hudi] danny0405 merged pull request #5260: [HUDI-3827] Promote the inetAddress picking strategy for NetworkUtils…

2022-04-07 Thread GitBox
danny0405 merged PR #5260: URL: https://github.com/apache/hudi/pull/5260

[GitHub] [hudi] hudi-bot commented on pull request #5177: [HUDI-3746][DO_NOT_MERGE] Test CI

2022-04-07 Thread GitBox
hudi-bot commented on PR #5177: URL: https://github.com/apache/hudi/pull/5177#issuecomment-1092487769 ## CI report: * ec0cd6c76fcf1f8a2c0448c96c6c4c69a9fd9ffb Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7685

[GitHub] [hudi] yihua commented on pull request #5251: Fix hadoop 3 compile Error, use FSDataOutputStream(OutputStream out, FileSystem.Statistics stats) instead of FSDataOutputStream(OutputStream out)

2022-04-07 Thread GitBox
yihua commented on PR #5251: URL: https://github.com/apache/hudi/pull/5251#issuecomment-1092485240 This change has already been covered by #4286 which is going to be ready for review soon. I'm wondering if it still makes sense to have this PR independently.

[hudi] branch master updated: [HUDI-3781] fix spark delete sql can not delete record (#5215)

2022-04-07 Thread mengtao
This is an automated email from the ASF dual-hosted git repository. mengtao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 7a6272fba1 [HUDI-3781] fix spark delete sql can n

[GitHub] [hudi] xiarixiaoyao merged pull request #5215: [HUDI-3781] fix spark delete sql can not delete record

2022-04-07 Thread GitBox
xiarixiaoyao merged PR #5215: URL: https://github.com/apache/hudi/pull/5215

[GitHub] [hudi] xiarixiaoyao commented on pull request #5215: [HUDI-3781] fix spark delete sql can not delete record

2022-04-07 Thread GitBox
xiarixiaoyao commented on PR #5215: URL: https://github.com/apache/hudi/pull/5215#issuecomment-1092484913 @KnightChess thanks for your contribution.

[GitHub] [hudi] yihua commented on pull request #5250: [HUDI-3817] specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread GitBox
yihua commented on PR #5250: URL: https://github.com/apache/hudi/pull/5250#issuecomment-1092479640 cc @rahil-c

[GitHub] [hudi] yihua commented on pull request #5250: [HUDI-3817] specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread GitBox
yihua commented on PR #5250: URL: https://github.com/apache/hudi/pull/5250#issuecomment-1092479437 There is a separate effort to address Hudi's compatibility with Hadoop, Hive, and Spark 3.x altogether. Please check this branch which is WIP: https://github.com/rahil-c/hudi/commits/rchertar

[GitHub] [hudi] codope commented on a diff in pull request #5255: [HUDI-3798] Fixing ending of a transaction by different owner and removing some extraneous methods in trxn manager

2022-04-07 Thread GitBox
codope commented on code in PR #5255: URL: https://github.com/apache/hudi/pull/5255#discussion_r845742512 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -1432,13 +1433,13 @@ protected final HoodieTable initTable(WriteOpe

[GitHub] [hudi] hudi-bot commented on pull request #5261: [HUDI-3799] Fixing not deleting empty instants w/o archiving

2022-04-07 Thread GitBox
hudi-bot commented on PR #5261: URL: https://github.com/apache/hudi/pull/5261#issuecomment-1092453959 ## CI report: * a66b3427c2ea32ddab2d3e784ec13343f7fb9e1c UNKNOWN * 19950a8f71e48d6ef5cba6970690cd2b2d2686f9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #5260: [HUDI-3827] Promote the inetAddress picking strategy for NetworkUtils…

2022-04-07 Thread GitBox
hudi-bot commented on PR #5260: URL: https://github.com/apache/hudi/pull/5260#issuecomment-109285 ## CI report: * fe779c1401dea8c6ce71dc705246420e19526863 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7909

[hudi] branch master updated (672974c412 -> df87095ef0)

2022-04-07 Thread codope
This is an automated email from the ASF dual-hosted git repository. codope pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 672974c412 [HUDI-3823] Fix hudi-hive-sync-bundle to include HBase dependencies and shading (#5257) add df87095ef0

[GitHub] [hudi] codope merged pull request #5252: [HUDI-3454] Fix partition name in all code paths for LogRecordScanner

2022-04-07 Thread GitBox
codope merged PR #5252: URL: https://github.com/apache/hudi/pull/5252

[GitHub] [hudi] hudi-bot commented on pull request #5252: [HUDI-3454] Fix partition name in all code paths for LogRecordScanner

2022-04-07 Thread GitBox
hudi-bot commented on PR #5252: URL: https://github.com/apache/hudi/pull/5252#issuecomment-1092429929 ## CI report: * 871b43397f13ebb5b548f8705ad315ffb76e3117 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7908

[GitHub] [hudi] hudi-bot commented on pull request #5244: [HUDI-3812] Fixing Data Skipping configuration to respect Metadata Table configs

2022-04-07 Thread GitBox
hudi-bot commented on PR #5244: URL: https://github.com/apache/hudi/pull/5244#issuecomment-1092416411 ## CI report: * c252f8d6d9a6b38adcebec0ba857d5aafae823cf UNKNOWN * 23a10d80255247f32c77cade8b15d9a8711f7ee1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #5261: [HUDI-3799] Fixing not deleting empty instants w/o archiving

2022-04-07 Thread GitBox
hudi-bot commented on PR #5261: URL: https://github.com/apache/hudi/pull/5261#issuecomment-1092401369 ## CI report: * a66b3427c2ea32ddab2d3e784ec13343f7fb9e1c UNKNOWN * 19950a8f71e48d6ef5cba6970690cd2b2d2686f9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #5261: [HUDI-3799] Fixing not deleting empty instants w/o archiving

2022-04-07 Thread GitBox
hudi-bot commented on PR #5261: URL: https://github.com/apache/hudi/pull/5261#issuecomment-1092400052 ## CI report: * a66b3427c2ea32ddab2d3e784ec13343f7fb9e1c UNKNOWN * 19950a8f71e48d6ef5cba6970690cd2b2d2686f9 UNKNOWN Bot commands @hudi-bot supports the following

[GitHub] [hudi] hudi-bot commented on pull request #5261: [HUDI-3799] Fixing not deleting empty instants w/o archiving

2022-04-07 Thread GitBox
hudi-bot commented on PR #5261: URL: https://github.com/apache/hudi/pull/5261#issuecomment-1092398781 ## CI report: * a66b3427c2ea32ddab2d3e784ec13343f7fb9e1c UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[jira] [Updated] (HUDI-3799) Understand reason behind "Not an avro data file" with hudi

2022-04-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3799: - Labels: pull-request-available (was: ) > Understand reason behind "Not an avro data file" with hu

[GitHub] [hudi] nsivabalan opened a new pull request, #5261: [HUDI-3799] Fixing not deleting empty instants w/o archiving

2022-04-07 Thread GitBox
nsivabalan opened a new pull request, #5261: URL: https://github.com/apache/hudi/pull/5261 ## What is the purpose of the pull request In local and HDFS file schemes, partial files can be created, in which case the commit meta files could be empty. We put in a fix some time back to

[GitHub] [hudi] hudi-bot commented on pull request #5260: [HUDI-3827] Promote the inetAddress picking strategy for NetworkUtils…

2022-04-07 Thread GitBox
hudi-bot commented on PR #5260: URL: https://github.com/apache/hudi/pull/5260#issuecomment-1092397527 ## CI report: * fe779c1401dea8c6ce71dc705246420e19526863 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7909

[GitHub] [hudi] hudi-bot commented on pull request #5260: [HUDI-3827] Promote the inetAddress picking strategy for NetworkUtils…

2022-04-07 Thread GitBox
hudi-bot commented on PR #5260: URL: https://github.com/apache/hudi/pull/5260#issuecomment-1092396281 ## CI report: * fe779c1401dea8c6ce71dc705246420e19526863 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #5193: [DO NOT MERGE][DBG] Enabling Column Stats Index by default for all columns

2022-04-07 Thread GitBox
hudi-bot commented on PR #5193: URL: https://github.com/apache/hudi/pull/5193#issuecomment-1092396173 ## CI report: * 655d7d2a8417ce6ed1a6fdd1560563051672327c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7906

[jira] [Updated] (HUDI-3827) Promote the inetAddress picking strategy for NetworkUtils#getHostname

2022-04-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3827: - Labels: pull-request-available (was: ) > Promote the inetAddress picking strategy for NetworkUtil

[GitHub] [hudi] danny0405 opened a new pull request, #5260: [HUDI-3827] Promote the inetAddress picking strategy for NetworkUtils…

2022-04-07 Thread GitBox
danny0405 opened a new pull request, #5260: URL: https://github.com/apache/hudi/pull/5260 …#getHostname ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.*

[jira] [Created] (HUDI-3827) Promote the inetAddress picking strategy for NetworkUtils#getHostname

2022-04-07 Thread Danny Chen (Jira)
Danny Chen created HUDI-3827: Summary: Promote the inetAddress picking strategy for NetworkUtils#getHostname Key: HUDI-3827 URL: https://issues.apache.org/jira/browse/HUDI-3827 Project: Apache Hudi

[jira] [Commented] (HUDI-3096) fixed the bug that the cow table(contains decimalType) write by flink cannot be read by spark

2022-04-07 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519280#comment-17519280 ] Danny Chen commented on HUDI-3096: -- Fixed via master branch: 531381faffe634e0976a756d4213

[GitHub] [hudi] hudi-bot commented on pull request #5259: [HUDI-3825] Fixing non-partitioned table Partition Records persistence in MT

2022-04-07 Thread GitBox
hudi-bot commented on PR #5259: URL: https://github.com/apache/hudi/pull/5259#issuecomment-1092387993 ## CI report: * cbddee47602d92fd8f327d72152cd3a5406bd14b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7905

[GitHub] [hudi] hudi-bot commented on pull request #5252: [HUDI-3454] Fix partition name in all code paths for LogRecordScanner

2022-04-07 Thread GitBox
hudi-bot commented on PR #5252: URL: https://github.com/apache/hudi/pull/5252#issuecomment-1092384461 ## CI report: * 0dbfdb5962671aa72db7dd394e89fe759fd94f26 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7896

[GitHub] [hudi] hudi-bot commented on pull request #5252: [HUDI-3454] Fix partition name in all code paths for LogRecordScanner

2022-04-07 Thread GitBox
hudi-bot commented on PR #5252: URL: https://github.com/apache/hudi/pull/5252#issuecomment-1092383224 ## CI report: * 0dbfdb5962671aa72db7dd394e89fe759fd94f26 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7896

[GitHub] [hudi] RexXiong commented on a diff in pull request #5250: [HUDI-3817] specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-07 Thread GitBox
RexXiong commented on code in PR #5250: URL: https://github.com/apache/hudi/pull/5250#discussion_r845689751 ## packaging/hudi-hadoop-mr-bundle/pom.xml: ## @@ -29,6 +29,7 @@ true ${project.parent.basedir} +${hive.parquet.version} Review Comment: @codope The

[jira] [Commented] (HUDI-3826) TruncateHoodieTableCommand deletes partitions incorrectly

2022-04-07 Thread Forward Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519270#comment-17519270 ] Forward Xu commented on HUDI-3826: -- [~alexey.kudinkin] Well, this one I'll commit a fix

[GitHub] [hudi] hudi-bot commented on pull request #5244: [HUDI-3812] Fixing Data Skipping configuration to respect Metadata Table configs

2022-04-07 Thread GitBox
hudi-bot commented on PR #5244: URL: https://github.com/apache/hudi/pull/5244#issuecomment-1092370692 ## CI report: * c252f8d6d9a6b38adcebec0ba857d5aafae823cf UNKNOWN * 4071e00dcd1bfedbda291add17478b15495864a2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #4910: [HUDI-2560][RFC-33] Support full Schema evolution for Spark

2022-04-07 Thread GitBox
xiarixiaoyao commented on code in PR #4910: URL: https://github.com/apache/hudi/pull/4910#discussion_r845677851 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -1505,4 +1549,138 @@ private void tryUpgrade(HoodieTableMetaCl

[jira] [Closed] (HUDI-3791) Test perf for point looks up for bloom filter and col stats partition

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-3791. Resolution: Done > Test perf for point looks up for bloom filter and col stats partition > -

[jira] [Assigned] (HUDI-3762) Check performance of key lookup with full scan disabled in metadata table

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-3762: Assignee: sivabalan narayanan (was: Ethan Guo) > Check performance of key lookup with full scan di

[jira] [Updated] (HUDI-3762) Check performance of key lookup with full scan disabled in metadata table

2022-04-07 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3762: - Reviewers: Ethan Guo > Check performance of key lookup with full scan disabled in metadata table > ---

[jira] [Closed] (HUDI-3810) Enabling point look ups does an extra full scan in addition to point look up for log reader readers with metadata

2022-04-07 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan closed HUDI-3810. - Resolution: Fixed > Enabling point look ups does an extra full scan in addition to point l

[jira] [Updated] (HUDI-3799) Understand reason behind "Not an avro data file" with hudi

2022-04-07 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3799: -- Status: In Progress (was: Open) > Understand reason behind "Not an avro data file" with

[jira] [Updated] (HUDI-3798) Allow TransactionManager to work w/ multiple txn owners

2022-04-07 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3798: -- Status: Patch Available (was: In Progress) > Allow TransactionManager to work w/ multip

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5165: [HUDI-3742] Enable parquet enableVectorizedReader for spark inc query to improve peformance

2022-04-07 Thread GitBox
alexeykudinkin commented on code in PR #5165: URL: https://github.com/apache/hudi/pull/5165#discussion_r845671911 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/MergeOnReadIncrementalRelation.scala: ## @@ -95,6 +108,9 @@ class MergeOnReadIncrementalRel

[GitHub] [hudi] KnightChess opened a new pull request, #5215: [HUDI-3781] fix spark delete sql can not delete record

2022-04-07 Thread GitBox
KnightChess opened a new pull request, #5215: URL: https://github.com/apache/hudi/pull/5215 ## What is the purpose of the pull request Fix the bug where records cannot be deleted using Spark SQL; operation.key must be "delete". ## Brief change log *(for example:)* - *Modify Annotatio

[GitHub] [hudi] KnightChess closed pull request #5215: [HUDI-3781] fix spark delete sql can not delete record

2022-04-07 Thread GitBox
KnightChess closed pull request #5215: [HUDI-3781] fix spark delete sql can not delete record URL: https://github.com/apache/hudi/pull/5215

[GitHub] [hudi] hudi-bot commented on pull request #5193: [DO NOT MERGE][DBG] Enabling Column Stats Index by default for all columns

2022-04-07 Thread GitBox
hudi-bot commented on PR #5193: URL: https://github.com/apache/hudi/pull/5193#issuecomment-1092347728 ## CI report: * 0977730e895b6da85db146aec89f8dd748ffd7e8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7880

[GitHub] [hudi] hudi-bot commented on pull request #5193: [DO NOT MERGE][DBG] Enabling Column Stats Index by default for all columns

2022-04-07 Thread GitBox
hudi-bot commented on PR #5193: URL: https://github.com/apache/hudi/pull/5193#issuecomment-1092346312 ## CI report: * 0977730e895b6da85db146aec89f8dd748ffd7e8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7880

[GitHub] [hudi] hudi-bot commented on pull request #5259: [HUDI-3825] Fixing non-partitioned table Partition Records persistence in MT

2022-04-07 Thread GitBox
hudi-bot commented on PR #5259: URL: https://github.com/apache/hudi/pull/5259#issuecomment-1092345098 ## CI report: * cbddee47602d92fd8f327d72152cd3a5406bd14b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7905

[jira] [Updated] (HUDI-3825) Fix tests failing when enabling MT on the Read Path

2022-04-07 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3825: -- Status: Patch Available (was: In Progress) > Fix tests failing when enabling MT on the Read Pat

[jira] [Updated] (HUDI-3825) Fix tests failing when enabling MT on the Read Path

2022-04-07 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3825: -- Story Points: 2 (was: 3) > Fix tests failing when enabling MT on the Read Path > --

[GitHub] [hudi] hudi-bot commented on pull request #5259: [HUDI-3825] Fixing non-partitioned table Partition Records persistence in MT

2022-04-07 Thread GitBox
hudi-bot commented on PR #5259: URL: https://github.com/apache/hudi/pull/5259#issuecomment-1092343789 ## CI report: * cbddee47602d92fd8f327d72152cd3a5406bd14b UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5259: [HUDI-3825] Fixing non-partitioned table Partition Records persistence in MT

2022-04-07 Thread GitBox
alexeykudinkin commented on code in PR #5259: URL: https://github.com/apache/hudi/pull/5259#discussion_r845663828 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java: ## @@ -1051,28 +1043,31 @@ private void initialCommit(

[jira] [Updated] (HUDI-3825) Fix tests failing when enabling MT on the Read Path

2022-04-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3825: - Labels: pull-request-available (was: ) > Fix tests failing when enabling MT on the Read Path > --

[GitHub] [hudi] alexeykudinkin opened a new pull request, #5259: [HUDI-3825] Fixing non-partitioned table Partition Records persistence in MT

2022-04-07 Thread GitBox
alexeykudinkin opened a new pull request, #5259: URL: https://github.com/apache/hudi/pull/5259 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the

[GitHub] [hudi] hudi-bot commented on pull request #5244: [WIP][HUDI-3812] Fixing Data Skipping configuration to respect Metadata Table configs

2022-04-07 Thread GitBox
hudi-bot commented on PR #5244: URL: https://github.com/apache/hudi/pull/5244#issuecomment-1092341251 ## CI report: * c252f8d6d9a6b38adcebec0ba857d5aafae823cf UNKNOWN * 4071e00dcd1bfedbda291add17478b15495864a2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[jira] [Updated] (HUDI-3066) Very slow file listing after enabling metadata for existing tables in 0.10.0 release

2022-04-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3066: Story Points: 2 > Very slow file listing after enabling metadata for existing tables in 0.10.0 > release >

[jira] [Created] (HUDI-3826) TruncateHoodieTableCommand deletes partitions incorrectly

2022-04-07 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-3826: - Summary: TruncateHoodieTableCommand deletes partitions incorrectly Key: HUDI-3826 URL: https://issues.apache.org/jira/browse/HUDI-3826 Project: Apache Hudi

[jira] [Updated] (HUDI-3636) Clustering fails due to marker creation failure

2022-04-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3636: Status: In Progress (was: Open) > Clustering fails due to marker creation failure > ---

[jira] [Updated] (HUDI-3823) Enabling MT by default on Read path makes HiveSync fail

2022-04-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3823: Status: In Progress (was: Open) > Enabling MT by default on Read path makes HiveSync fail > ---

[jira] [Updated] (HUDI-3823) Enabling MT by default on Read path makes HiveSync fail

2022-04-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3823: Status: Patch Available (was: In Progress) > Enabling MT by default on Read path makes HiveSync fail >

[jira] [Closed] (HUDI-3823) Enabling MT by default on Read path makes HiveSync fail

2022-04-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo closed HUDI-3823. --- Resolution: Fixed > Enabling MT by default on Read path makes HiveSync fail >

[jira] [Updated] (HUDI-3823) Enabling MT by default on Read path makes HiveSync fail

2022-04-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3823: Sprint: Hudi-Sprint-Apr-05 > Enabling MT by default on Read path makes HiveSync fail > -

[jira] [Commented] (HUDI-3804) Partition metadata is not properly created for Column Stats

2022-04-07 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519245#comment-17519245 ] Alexey Kudinkin commented on HUDI-3804: --- [~shivnarayan] did you check the AppendHand

[jira] [Updated] (HUDI-3825) Fix tests failing when enabling MT on the Read Path

2022-04-07 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3825: -- Status: In Progress (was: Open) > Fix tests failing when enabling MT on the Read Path > ---

[jira] [Updated] (HUDI-3825) Fix tests failing when enabling MT on the Read Path

2022-04-07 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3825: -- Sprint: Hudi-Sprint-Apr-05 > Fix tests failing when enabling MT on the Read Path > -

[jira] [Assigned] (HUDI-3454) Fix partition name in all code paths for LogRecordScanner

2022-04-07 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin reassigned HUDI-3454: - Assignee: Sagar Sumit (was: Alexey Kudinkin) > Fix partition name in all code paths for

[hudi] branch master updated (ef06e4a526 -> 672974c412)

2022-04-07 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from ef06e4a526 [HUDI-3810] Fixing lazy read for metadata log record readers (#5241) add 672974c412 [HUDI-3823] Fix hudi

[GitHub] [hudi] yihua merged pull request #5257: [HUDI-3823] Fix hudi-hive-sync-bundle to include HBase dependencies and shading

2022-04-07 Thread GitBox
yihua merged PR #5257: URL: https://github.com/apache/hudi/pull/5257

[jira] [Updated] (HUDI-3674) Remove unnecessary HBase-related dependencies from bundles if there is any

2022-04-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3674: Fix Version/s: 0.12.0 > Remove unnecessary HBase-related dependencies from bundles if there is any > ---

[GitHub] [hudi] ChenShuai1981 opened a new issue, #5258: [SUPPORT]

2022-04-07 Thread GitBox
ChenShuai1981 opened a new issue, #5258: URL: https://github.com/apache/hudi/issues/5258 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at dev-su

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5207: [HUDI-3772] Fixing auto adjustment of lock configs for deltastreamer

2022-04-07 Thread GitBox
nsivabalan commented on code in PR #5207: URL: https://github.com/apache/hudi/pull/5207#discussion_r845644741 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java: ## @@ -2480,41 +2494,42 @@ protected void setDefaults() { Hood

[GitHub] [hudi] hudi-bot commented on pull request #5257: [HUDI-3823] Fix hudi-hive-sync-bundle to include HBase dependencies and shading

2022-04-07 Thread GitBox
hudi-bot commented on PR #5257: URL: https://github.com/apache/hudi/pull/5257#issuecomment-1092316074 ## CI report: * f4af6a2ba0a5236ca741208588f18c7da2ff0c79 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7902

[jira] [Commented] (HUDI-3799) Understand reason behind "Not an avro data file" with hudi

2022-04-07 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519241#comment-17519241 ] sivabalan narayanan commented on HUDI-3799: --- We might need to fix how archival h

[jira] [Created] (HUDI-3825) Fix tests failing when enabling MT on the Read Path

2022-04-07 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-3825: - Summary: Fix tests failing when enabling MT on the Read Path Key: HUDI-3825 URL: https://issues.apache.org/jira/browse/HUDI-3825 Project: Apache Hudi Issue

[GitHub] [hudi] yihua commented on a diff in pull request #5244: [WIP][HUDI-3812] Fixing Data Skipping configuration to respect Metadata Table configs

2022-04-07 Thread GitBox
yihua commented on code in PR #5244: URL: https://github.com/apache/hudi/pull/5244#discussion_r845618339 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala: ## @@ -196,12 +191,20 @@ case class HoodieFileIndex(spark: SparkSession,

[GitHub] [hudi] hudi-bot commented on pull request #5257: [HUDI-3823] Fix hudi-hive-sync-bundle to include HBase dependencies and shading

2022-04-07 Thread GitBox
hudi-bot commented on PR #5257: URL: https://github.com/apache/hudi/pull/5257#issuecomment-1092280197 ## CI report: * f4af6a2ba0a5236ca741208588f18c7da2ff0c79 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7902

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5244: [WIP][HUDI-3812] Fixing Data Skipping configuration to respect Metadata Table configs

2022-04-07 Thread GitBox
nsivabalan commented on code in PR #5244: URL: https://github.com/apache/hudi/pull/5244#discussion_r845614938 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala: ## @@ -196,12 +191,20 @@ case class HoodieFileIndex(spark: SparkSession,

[GitHub] [hudi] hudi-bot commented on pull request #5257: [HUDI-3823] Fix hudi-hive-sync-bundle to include HBase dependencies and shading

2022-04-07 Thread GitBox
hudi-bot commented on PR #5257: URL: https://github.com/apache/hudi/pull/5257#issuecomment-1092278854 ## CI report: * f4af6a2ba0a5236ca741208588f18c7da2ff0c79 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[jira] [Updated] (HUDI-3823) Enabling MT by default on Read path makes HiveSync fail

2022-04-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3823: - Labels: pull-request-available (was: ) > Enabling MT by default on Read path makes HiveSync fail

[GitHub] [hudi] yihua opened a new pull request, #5257: [HUDI-3823] Fix hudi-hive-sync-bundle to include HBase dependencies and shading

2022-04-07 Thread GitBox
yihua opened a new pull request, #5257: URL: https://github.com/apache/hudi/pull/5257 ## What is the purpose of the pull request The Hive sync fails with `ClassNotFoundException` on a Hudi table if metadata table is enabled on the read path, because `hudi-hive-sync-bundle` does not p

[hudi] branch master updated: [HUDI-3810] Fixing lazy read for metadata log record readers (#5241)

2022-04-07 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new ef06e4a526 [HUDI-3810] Fixing lazy read for met

[GitHub] [hudi] nsivabalan merged pull request #5241: [HUDI-3810] Fixing lazy read for metadata log record readers

2022-04-07 Thread GitBox
nsivabalan merged PR #5241: URL: https://github.com/apache/hudi/pull/5241 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apach

[GitHub] [hudi] hudi-bot commented on pull request #5256: [MINOR] Update README of docker build setup

2022-04-07 Thread GitBox
hudi-bot commented on PR #5256: URL: https://github.com/apache/hudi/pull/5256#issuecomment-1092268230 ## CI report: * 10abfe1260bfbd28676e043cea6e91c9aca4b067 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7901

[jira] [Updated] (HUDI-3687) Make sure CI run tests against all Spark versions

2022-04-07 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3687: -- Epic Link: HUDI-3824 (was: HUDI-3679) > Make sure CI run tests against all Spark versions > ---

[jira] [Created] (HUDI-3824) [Umbrella] Hudi CI Improvements

2022-04-07 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-3824: - Summary: [Umbrella] Hudi CI Improvements Key: HUDI-3824 URL: https://issues.apache.org/jira/browse/HUDI-3824 Project: Apache Hudi Issue Type: Epic

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5052: [HUDI-3644] hoodie log scan bug cause data duplication bugfix

2022-04-07 Thread GitBox
nsivabalan commented on code in PR #5052: URL: https://github.com/apache/hudi/pull/5052#discussion_r845602394 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordReader.java: ## @@ -346,6 +349,19 @@ public synchronized void scan(Option> keys) {

[GitHub] [hudi] nsivabalan commented on pull request #5042: [HUDI-3802] Three bulk_insert files are concurrently submitted and executed with a difference of 2s, the insert fails occasionally.

2022-04-07 Thread GitBox
nsivabalan commented on PR #5042: URL: https://github.com/apache/hudi/pull/5042#issuecomment-1092264348 if not, feel free to close out the patch. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [hudi] nsivabalan commented on pull request #5042: [HUDI-3802] Three bulk_insert files are concurrently submitted and executed with a difference of 2s, the insert fails occasionally.

2022-04-07 Thread GitBox
nsivabalan commented on PR #5042: URL: https://github.com/apache/hudi/pull/5042#issuecomment-1092264233 @peanut-chenzhong : We have fixed few bugs around multi-writers and archival with 0.11. Can you give it a try w/ latest master and let us know if you are still seeing the issues. -

[jira] [Updated] (HUDI-3812) Make sure Data Skipping respects Metadata Table config

2022-04-07 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3812: -- Epic Link: HUDI-1822 (was: HUDI-1292) > Make sure Data Skipping respects Metadata Table config

[jira] [Created] (HUDI-3823) Enabling MT by default on Read path makes HiveSync fail

2022-04-07 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-3823: - Summary: Enabling MT by default on Read path makes HiveSync fail Key: HUDI-3823 URL: https://issues.apache.org/jira/browse/HUDI-3823 Project: Apache Hudi I

[GitHub] [hudi] hudi-bot commented on pull request #5255: [HUDI-3798] Fixing ending of a transaction by different owner and removing some extraneous methods in trxn manager

2022-04-07 Thread GitBox
hudi-bot commented on PR #5255: URL: https://github.com/apache/hudi/pull/5255#issuecomment-1092212722 ## CI report: * d806482d7ea926a08b145a9d5983e68e56bc0bcf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7899

[GitHub] [hudi] yihua commented on pull request #5256: [MINOR] Update README of docker build setup

2022-04-07 Thread GitBox
yihua commented on PR #5256: URL: https://github.com/apache/hudi/pull/5256#issuecomment-1092174590 @rahil-c -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

[GitHub] [hudi] hudi-bot commented on pull request #5256: [MINOR] Update README of docker build setup

2022-04-07 Thread GitBox
hudi-bot commented on PR #5256: URL: https://github.com/apache/hudi/pull/5256#issuecomment-1092174296 ## CI report: * 10abfe1260bfbd28676e043cea6e91c9aca4b067 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7901

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5252: [HUDI-3454] Fix partition name in all code paths for LogRecordScanner

2022-04-07 Thread GitBox
alexeykudinkin commented on code in PR #5252: URL: https://github.com/apache/hudi/pull/5252#discussion_r845536891 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/FormatUtils.java: ## @@ -144,7 +150,12 @@ public static HoodieMergedLogRecordScanner l

[GitHub] [hudi] hudi-bot commented on pull request #5256: [MINOR] Update README of docker build setup

2022-04-07 Thread GitBox
hudi-bot commented on PR #5256: URL: https://github.com/apache/hudi/pull/5256#issuecomment-1092171769 ## CI report: * 10abfe1260bfbd28676e043cea6e91c9aca4b067 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the
