[GitHub] [hudi] hudi-bot commented on pull request #5177: [HUDI-3746][DO_NOT_MERGE] Test CI

2022-04-08 Thread GitBox
hudi-bot commented on PR #5177: URL: https://github.com/apache/hudi/pull/5177#issuecomment-1092508912 ## CI report: * c99e0c304a6c46ea9ed09e6f3ecf1e94d10f2deb Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=791

[jira] [Resolved] (HUDI-3781) spark delete sql can't delete record

2022-04-08 Thread KnightChess (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KnightChess resolved HUDI-3781. --- > spark delete sql can't delete record > > > Key: HUD

[GitHub] [hudi] hudi-bot commented on pull request #5177: [HUDI-3746][DO_NOT_MERGE] Test CI

2022-04-08 Thread GitBox
hudi-bot commented on PR #5177: URL: https://github.com/apache/hudi/pull/5177#issuecomment-1092546138 ## CI report: * c99e0c304a6c46ea9ed09e6f3ecf1e94d10f2deb Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=791

[GitHub] [hudi] jxlining closed issue #5237: hive query no data

2022-04-08 Thread GitBox
jxlining closed issue #5237: hive query no data URL: https://github.com/apache/hudi/issues/5237 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits

[GitHub] [hudi] danny0405 commented on pull request #5252: [HUDI-3454] Fix partition name in all code paths for LogRecordScanner

2022-04-08 Thread GitBox
danny0405 commented on PR #5252: URL: https://github.com/apache/hudi/pull/5252#issuecomment-1092549772 Left one comment; let's see if we can make some improvements here.

[GitHub] [hudi] danny0405 commented on a diff in pull request #5252: [HUDI-3454] Fix partition name in all code paths for LogRecordScanner

2022-04-08 Thread GitBox
danny0405 commented on code in PR #5252: URL: https://github.com/apache/hudi/pull/5252#discussion_r845828970 ## hudi-common/src/test/java/org/apache/hudi/common/functional/TestHoodieLogFormat.java: ## @@ -589,6 +591,7 @@ public void testBasicAppendAndScanMultipleFiles(ExternalS

[jira] [Updated] (HUDI-3818) hudi doesn't support bytes column as primary key

2022-04-08 Thread rex xiong (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rex xiong updated HUDI-3818: Description:   {code:java} scala> sql("desc extended binary_test1").show(false) +--

[jira] [Updated] (HUDI-3818) hudi doesn't support bytes column as primary key

2022-04-08 Thread rex xiong (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rex xiong updated HUDI-3818: Description: When a bytes column is used as the primary key, Hudi generates a fixed hoodie key, then upserts will
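The failure mode described in HUDI-3818 can be illustrated with a minimal Java sketch. It rests on an assumption not confirmed by the truncated description: that the key is built by calling `toString()` on a `java.nio.ByteBuffer`, whose text depends only on position, limit, and capacity, so every value of the same length yields the same key. The class `BytesKeySketch` and the methods `naiveKey` and `encodedKey` are hypothetical names for illustration only; they are not Hudi APIs, and the Base64 encoding shown is one possible content-based encoding, not necessarily what PR #5264 does.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BytesKeySketch {

    // Hypothetical naive key: ByteBuffer.toString() reports position, limit,
    // and capacity, not the buffer contents.
    static String naiveKey(ByteBuffer value) {
        return value.toString();
    }

    // Hypothetical content-based key: Base64-encode the remaining bytes.
    // duplicate() is used so the caller's buffer position is left untouched.
    static String encodedKey(ByteBuffer value) {
        ByteBuffer copy = value.duplicate();
        byte[] bytes = new byte[copy.remaining()];
        copy.get(bytes);
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer a = ByteBuffer.wrap("id1".getBytes(StandardCharsets.UTF_8));
        ByteBuffer b = ByteBuffer.wrap("id2".getBytes(StandardCharsets.UTF_8));

        // Same length, different contents: the naive keys collide.
        System.out.println(naiveKey(a).equals(naiveKey(b)));     // prints "true"

        // Encoding the contents keeps the two keys distinct.
        System.out.println(encodedKey(a).equals(encodedKey(b))); // prints "false"
    }
}
```

If two records collide on the key in this way, an upsert would treat them as the same record, which is consistent with the symptom the issue title reports.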

[GitHub] [hudi] KnightChess opened a new pull request, #5263: [MINOR] doc: fix error link to community tab

2022-04-08 Thread GitBox
KnightChess opened a new pull request, #5263: URL: https://github.com/apache/hudi/pull/5263 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the pur

[GitHub] [hudi] RexXiong opened a new pull request, #5264: [HUDI-3818] encode bytes column value when generate HoodieKey

2022-04-08 Thread GitBox
RexXiong opened a new pull request, #5264: URL: https://github.com/apache/hudi/pull/5264 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the purpos

[jira] [Updated] (HUDI-3818) hudi doesn't support bytes column as primary key

2022-04-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3818: - Labels: pull-request-available (was: ) > hudi doesn't support bytes column as primary key > -

[GitHub] [hudi] hudi-bot commented on pull request #5264: [HUDI-3818] encode bytes column value when generate HoodieKey

2022-04-08 Thread GitBox
hudi-bot commented on PR #5264: URL: https://github.com/apache/hudi/pull/5264#issuecomment-1092569017 ## CI report: * 550ed6f6f39b1505e835e989d96be3c674120015 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #5264: [HUDI-3818] encode bytes column value when generate HoodieKey

2022-04-08 Thread GitBox
hudi-bot commented on PR #5264: URL: https://github.com/apache/hudi/pull/5264#issuecomment-1092571999 ## CI report: * 550ed6f6f39b1505e835e989d96be3c674120015 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7916

[GitHub] [hudi] hudi-bot commented on pull request #5177: [HUDI-3746][DO_NOT_MERGE] Test CI

2022-04-08 Thread GitBox
hudi-bot commented on PR #5177: URL: https://github.com/apache/hudi/pull/5177#issuecomment-1092576599 ## CI report: * dde4ad3d1c449e47db2cca0aad4a2bc8960d7647 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7915

[GitHub] [hudi] RexXiong commented on a diff in pull request #5250: [HUDI-3817] specify parquet version for hudi-hadoop-mr-bundle when compile hudi using -Dspark3

2022-04-08 Thread GitBox
RexXiong commented on code in PR #5250: URL: https://github.com/apache/hudi/pull/5250#discussion_r845878329 ## packaging/hudi-hadoop-mr-bundle/pom.xml: ## @@ -29,6 +29,7 @@ true ${project.parent.basedir} +${hive.parquet.version} Review Comment: As discussed

[GitHub] [hudi] yesemsanthoshkumar commented on issue #5233: [SUPPORT] _hoodie_is_deleted not working for spark Datasource.

2022-04-08 Thread GitBox
yesemsanthoshkumar commented on issue #5233: URL: https://github.com/apache/hudi/issues/5233#issuecomment-1092616629 @ashah-lightbox Is the `_hoodie_is_deleted` field boolean? AFAIK, deletion seems to work only when that column is a boolean. If it is a char column, it won't.

[GitHub] [hudi] danny0405 merged pull request #5263: [MINOR] doc: fix error link to community tab

2022-04-08 Thread GitBox
danny0405 merged PR #5263: URL: https://github.com/apache/hudi/pull/5263

[hudi] branch asf-site updated: [MINOR] doc: fix error link to community tab (#5263)

2022-04-08 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 6f122fd728 [MINOR] doc: fix error link to c

[GitHub] [hudi] hudi-bot commented on pull request #5264: [HUDI-3818] encode bytes column value when generate HoodieKey

2022-04-08 Thread GitBox
hudi-bot commented on PR #5264: URL: https://github.com/apache/hudi/pull/5264#issuecomment-1092658442 ## CI report: * 550ed6f6f39b1505e835e989d96be3c674120015 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7916

[GitHub] [hudi] xiaozhch5 commented on pull request #5251: Fix hadoop 3 compile Error, use FSDataOutputStream(OutputStream out, FileSystem.Statistics stats) instead of FSDataOutputStream(OutputStream

2022-04-08 Thread GitBox
xiaozhch5 commented on PR #5251: URL: https://github.com/apache/hudi/pull/5251#issuecomment-1092680981 Sorry, I didn't notice that PR; this PR can be closed.

[GitHub] [hudi] xiaozhch5 closed pull request #5251: Fix hadoop 3 compile Error, use FSDataOutputStream(OutputStream out, FileSystem.Statistics stats) instead of FSDataOutputStream(OutputStream out)

2022-04-08 Thread GitBox
xiaozhch5 closed pull request #5251: Fix hadoop 3 compile Error, use FSDataOutputStream(OutputStream out, FileSystem.Statistics stats) instead of FSDataOutputStream(OutputStream out) URL: https://github.com/apache/hudi/pull/5251

[jira] [Commented] (HUDI-2762) Ensure hive can query insert only logs in MOR

2022-04-08 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519490#comment-17519490 ] Sagar Sumit commented on HUDI-2762: --- [~rex_xiong] [~alexey.kudinkin] [~mengtao] [~rmahin

[GitHub] [hudi] codope merged pull request #5259: [HUDI-3825] Fixing non-partitioned table Partition Records persistence in MT

2022-04-08 Thread GitBox
codope merged PR #5259: URL: https://github.com/apache/hudi/pull/5259

[jira] [Closed] (HUDI-3825) Fix tests failing when enabling MT on the Read Path

2022-04-08 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3825. - Resolution: Fixed > Fix tests failing when enabling MT on the Read Path >

[jira] [Closed] (HUDI-3454) Fix partition name in all code paths for LogRecordScanner

2022-04-08 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3454. - Resolution: Fixed > Fix partition name in all code paths for LogRecordScanner > --

[GitHub] [hudi] data-storyteller opened a new pull request, #5265: [HUDI-3571] Spark datasource continuous ingestion tool Checkpoint FS fixed

2022-04-08 Thread GitBox
data-storyteller opened a new pull request, #5265: URL: https://github.com/apache/hudi/pull/5265 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is th

[GitHub] [hudi] hudi-bot commented on pull request #5265: [HUDI-3571] Spark datasource continuous ingestion tool Checkpoint FS fixed

2022-04-08 Thread GitBox
hudi-bot commented on PR #5265: URL: https://github.com/apache/hudi/pull/5265#issuecomment-1092734772 ## CI report: * f91dcda7f03b3ecac73ece93d73940f7ab4cc395 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #5265: [HUDI-3571] Spark datasource continuous ingestion tool Checkpoint FS fixed

2022-04-08 Thread GitBox
hudi-bot commented on PR #5265: URL: https://github.com/apache/hudi/pull/5265#issuecomment-1092737852 ## CI report: * f91dcda7f03b3ecac73ece93d73940f7ab4cc395 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7918

[GitHub] [hudi] nsivabalan merged pull request #5265: [HUDI-3571] Spark datasource continuous ingestion tool Checkpoint FS fixed

2022-04-08 Thread GitBox
nsivabalan merged PR #5265: URL: https://github.com/apache/hudi/pull/5265

[GitHub] [hudi] nsivabalan commented on issue #4230: [SUPPORT] org.apache.hudi.exception.HoodieRemoteException: Failed to create marker file

2022-04-08 Thread GitBox
nsivabalan commented on issue #4230: URL: https://github.com/apache/hudi/issues/4230#issuecomment-1092782144 @bhasudha : can you follow up to add docs on how to set Hudi configs in spark-sql DMLs? I see examples in an older version https://hudi.apache.org/docs/0.9.0/quick-start-guide#use-set-c

[GitHub] [hudi] nsivabalan commented on pull request #4445: [HUDI-3102] Do not store rollback plan in inflight instant

2022-04-08 Thread GitBox
nsivabalan commented on PR #4445: URL: https://github.com/apache/hudi/pull/4445#issuecomment-1092782769 got it, thanks!

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5141: [HUDI-3724] Fixing closure of ParquetReader

2022-04-08 Thread GitBox
nsivabalan commented on code in PR #5141: URL: https://github.com/apache/hudi/pull/5141#discussion_r846036990 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala: ## @@ -333,7 +335,13 @@ object HoodieBaseRelation { partitionedF

[GitHub] [hudi] nsivabalan commented on pull request #4459: [HUDI-3116]Add a new HoodieDropPartitionsTool to let users drop table partitions through a standalone job.

2022-04-08 Thread GitBox
nsivabalan commented on PR #4459: URL: https://github.com/apache/hudi/pull/4459#issuecomment-1092809604 If you can rebase with master, we can land this once CI passes.

[GitHub] [hudi] hudi-bot commented on pull request #5185: [HUDI-3758] Optimize flink partition table with BucketIndex

2022-04-08 Thread GitBox
hudi-bot commented on PR #5185: URL: https://github.com/apache/hudi/pull/5185#issuecomment-1092880295 ## CI report: * bf117b1d1dca16bf62e1d8c8b9bfcf52aa894bb3 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7683

[GitHub] [hudi] hudi-bot commented on pull request #5185: [HUDI-3758] Optimize flink partition table with BucketIndex

2022-04-08 Thread GitBox
hudi-bot commented on PR #5185: URL: https://github.com/apache/hudi/pull/5185#issuecomment-1092883361 ## CI report: * bf117b1d1dca16bf62e1d8c8b9bfcf52aa894bb3 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7683

[GitHub] [hudi] ashah-lightbox commented on issue #5233: [SUPPORT] _hoodie_is_deleted not working for spark Datasource.

2022-04-08 Thread GitBox
ashah-lightbox commented on issue #5233: URL: https://github.com/apache/hudi/issues/5233#issuecomment-1092892296 It's Boolean, and nullable is also set to true.

[GitHub] [hudi] hudi-bot commented on pull request #5185: [HUDI-3758] Optimize flink partition table with BucketIndex

2022-04-08 Thread GitBox
hudi-bot commented on PR #5185: URL: https://github.com/apache/hudi/pull/5185#issuecomment-1092985515 ## CI report: * 823f02dc228d4d8a27e701f3f699d53e842ac00e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7920

[GitHub] [hudi] hudi-bot commented on pull request #4676: [HUDI-3304] support partial update on mor table

2022-04-08 Thread GitBox
hudi-bot commented on PR #4676: URL: https://github.com/apache/hudi/pull/4676#issuecomment-1093014508 ## CI report: * f39affd6e0a1952556ba4123465d7b0b7a5e8d79 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7284

[GitHub] [hudi] hudi-bot commented on pull request #4676: [HUDI-3304] support partial update on mor table

2022-04-08 Thread GitBox
hudi-bot commented on PR #4676: URL: https://github.com/apache/hudi/pull/4676#issuecomment-1093017669 ## CI report: * f39affd6e0a1952556ba4123465d7b0b7a5e8d79 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7284

[GitHub] [hudi] hudi-bot commented on pull request #4676: [HUDI-3304] support partial update on mor table

2022-04-08 Thread GitBox
hudi-bot commented on PR #4676: URL: https://github.com/apache/hudi/pull/4676#issuecomment-1093074272 ## CI report: * c9ee1edc0285fb17a9455cc5ca52072854d66a91 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7921

[GitHub] [hudi] yihua commented on pull request #5256: [MINOR] Update README of docker build setup

2022-04-08 Thread GitBox
yihua commented on PR #5256: URL: https://github.com/apache/hudi/pull/5256#issuecomment-1093137546 > Should we also add in here examples of docker build commands for building the images locally? Yes. Actually, there are examples in the README already. Are they enough? ```shell

[jira] [Created] (HUDI-3832) We need to revisit MOR block merging sequence

2022-04-08 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-3832: - Summary: We need to revisit MOR block merging sequence Key: HUDI-3832 URL: https://issues.apache.org/jira/browse/HUDI-3832 Project: Apache Hudi Issue Type:

[jira] [Created] (HUDI-3830) We need to revisit MOR block merging sequence

2022-04-08 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-3830: - Summary: We need to revisit MOR block merging sequence Key: HUDI-3830 URL: https://issues.apache.org/jira/browse/HUDI-3830 Project: Apache Hudi Issue Type:

[jira] [Created] (HUDI-3831) We need to revisit MOR block merging sequence

2022-04-08 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-3831: - Summary: We need to revisit MOR block merging sequence Key: HUDI-3831 URL: https://issues.apache.org/jira/browse/HUDI-3831 Project: Apache Hudi Issue Type:

[jira] [Created] (HUDI-3828) We need to revisit MOR block merging sequence

2022-04-08 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-3828: - Summary: We need to revisit MOR block merging sequence Key: HUDI-3828 URL: https://issues.apache.org/jira/browse/HUDI-3828 Project: Apache Hudi Issue Type:

[jira] [Created] (HUDI-3833) We need to revisit MOR block merging sequence

2022-04-08 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-3833: - Summary: We need to revisit MOR block merging sequence Key: HUDI-3833 URL: https://issues.apache.org/jira/browse/HUDI-3833 Project: Apache Hudi Issue Type:

[jira] [Created] (HUDI-3829) We need to revisit MOR block merging sequence

2022-04-08 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-3829: - Summary: We need to revisit MOR block merging sequence Key: HUDI-3829 URL: https://issues.apache.org/jira/browse/HUDI-3829 Project: Apache Hudi Issue Type:

[jira] [Closed] (HUDI-3832) We need to revisit MOR block merging sequence

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin closed HUDI-3832. - Resolution: Duplicate > We need to revisit MOR block merging sequence > --

[jira] [Closed] (HUDI-3829) We need to revisit MOR block merging sequence

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin closed HUDI-3829. - Resolution: Duplicate > We need to revisit MOR block merging sequence > --

[jira] [Closed] (HUDI-3830) We need to revisit MOR block merging sequence

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin closed HUDI-3830. - Resolution: Duplicate > We need to revisit MOR block merging sequence > --

[jira] [Closed] (HUDI-3833) We need to revisit MOR block merging sequence

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin closed HUDI-3833. - Resolution: Duplicate > We need to revisit MOR block merging sequence > --

[jira] [Closed] (HUDI-3831) We need to revisit MOR block merging sequence

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin closed HUDI-3831. - Resolution: Duplicate > We need to revisit MOR block merging sequence > --

[jira] [Updated] (HUDI-3826) TruncateHoodieTableCommand deletes partitions incorrectly

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3826: -- Sprint: Hudi-Sprint-Apr-05 > TruncateHoodieTableCommand deletes partitions incorrectly > ---

[jira] [Updated] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3834: -- Sprint: Hudi-Sprint-Apr-05 > Evaluate MT Column Stats Performance > ---

[jira] [Created] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-3834: - Summary: Evaluate MT Column Stats Performance Key: HUDI-3834 URL: https://issues.apache.org/jira/browse/HUDI-3834 Project: Apache Hudi Issue Type: Bug

[jira] [Updated] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3834: -- Status: In Progress (was: Open) > Evaluate MT Column Stats Performance > -

[jira] [Updated] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3834: -- Description: Previously, while evaluating Data Skipping runtime in EMR setting, it was measured

[jira] [Updated] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3834: -- Description: Previously, while evaluating Data Skipping runtime in EMR setting, it was measured

[jira] [Updated] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3834: -- Description: h3. *UPDATE* *TL;DR* After identifying the bottlenecks as Avro 1.10 regr

[jira] [Updated] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3834: -- Attachment: Screen Shot 2022-04-08 at 8.55.43 AM.png Screen Shot 2022-04-08 at 9.

[jira] [Updated] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3834: -- Attachment: (was: Screen Shot 2022-04-08 at 8.55.43 AM.png) > Evaluate MT Column Stats Perfo

[jira] [Updated] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3834: -- Attachment: (was: Screen Shot 2022-04-08 at 9.12.48 AM.png) > Evaluate MT Column Stats Perfo

[jira] [Updated] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3834: -- Attachment: Screen Shot 2022-04-07 at 7.48.04 PM.png Screen Shot 2022-04-07 at 7.

[jira] [Updated] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3834: -- Attachment: (was: Screen Shot 2022-04-08 at 12.14.29 PM.png) > Evaluate MT Column Stats Perf

[jira] [Commented] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519769#comment-17519769 ] Alexey Kudinkin commented on HUDI-3834: --- Original runtime of the query reading whole

[jira] [Updated] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3834: -- Attachment: Screen Shot 2022-04-08 at 9.12.48 AM.png > Evaluate MT Column Stats Performance > -

[jira] [Commented] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519770#comment-17519770 ] Alexey Kudinkin commented on HUDI-3834: --- After working this code-path around we see

[GitHub] [hudi] alexeykudinkin opened a new pull request, #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

2022-04-08 Thread GitBox
alexeykudinkin opened a new pull request, #5266: URL: https://github.com/apache/hudi/pull/5266 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the

[jira] [Updated] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3834: - Labels: pull-request-available (was: ) > Evaluate MT Column Stats Performance >

[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

2022-04-08 Thread GitBox
hudi-bot commented on PR #5266: URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093279292 ## CI report: * 0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

2022-04-08 Thread GitBox
hudi-bot commented on PR #5266: URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093283339 ## CI report: * 0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922

[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

2022-04-08 Thread GitBox
hudi-bot commented on PR #5266: URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093357009 ## CI report: * 0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922

[GitHub] [hudi] hudi-bot commented on pull request #5244: [HUDI-3812] Fixing Data Skipping configuration to respect Metadata Table configs

2022-04-08 Thread GitBox
hudi-bot commented on PR #5244: URL: https://github.com/apache/hudi/pull/5244#issuecomment-1093358630 ## CI report: * c252f8d6d9a6b38adcebec0ba857d5aafae823cf UNKNOWN * 23a10d80255247f32c77cade8b15d9a8711f7ee1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #5193: [DO NOT MERGE][DBG] Enabling Column Stats Index by default for all columns

2022-04-08 Thread GitBox
hudi-bot commented on PR #5193: URL: https://github.com/apache/hudi/pull/5193#issuecomment-1093360309 ## CI report: * 655d7d2a8417ce6ed1a6fdd1560563051672327c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7906

[GitHub] [hudi] hudi-bot commented on pull request #5193: [DO NOT MERGE][DBG] Enabling Column Stats Index by default for all columns

2022-04-08 Thread GitBox
hudi-bot commented on PR #5193: URL: https://github.com/apache/hudi/pull/5193#issuecomment-1093362089 ## CI report: * 655d7d2a8417ce6ed1a6fdd1560563051672327c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7906

[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

2022-04-08 Thread GitBox
hudi-bot commented on PR #5266: URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093363965 ## CI report: * 0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922

[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

2022-04-08 Thread GitBox
hudi-bot commented on PR #5266: URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093365714 ## CI report: * 0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922

[GitHub] [hudi] hudi-bot commented on pull request #5244: [HUDI-3812] Fixing Data Skipping configuration to respect Metadata Table configs

2022-04-08 Thread GitBox
hudi-bot commented on PR #5244: URL: https://github.com/apache/hudi/pull/5244#issuecomment-1093374069 ## CI report: * c252f8d6d9a6b38adcebec0ba857d5aafae823cf UNKNOWN * 23a10d80255247f32c77cade8b15d9a8711f7ee1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[jira] [Closed] (HUDI-3611) Benchmark Data Skipping using MT

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin closed HUDI-3611. - Resolution: Duplicate > Benchmark Data Skipping using MT > > >

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

2022-04-08 Thread GitBox
nsivabalan commented on code in PR #5266: URL: https://github.com/apache/hudi/pull/5266#discussion_r846516629 ## hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java: ## @@ -159,18 +166,34 @@ public static HoodieTableMetaClient reload(HoodieTableMet

[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

2022-04-08 Thread GitBox
hudi-bot commented on PR #5266: URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093425691 ## CI report: * abf82b3871b7b65708932fd4e7115b279f64964b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924

[GitHub] [hudi] hudi-bot commented on pull request #5193: [DO NOT MERGE][DBG] Enabling Column Stats Index by default for all columns

2022-04-08 Thread GitBox
hudi-bot commented on PR #5193: URL: https://github.com/apache/hudi/pull/5193#issuecomment-1093428414 ## CI report: * ed1fad096902d40c83d305606cf4d646bb63dfe9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7923

[GitHub] [hudi] alexeykudinkin opened a new pull request, #5267: [WIP] Fixing Column Stats Index updating sequence

2022-04-08 Thread GitBox
alexeykudinkin opened a new pull request, #5267: URL: https://github.com/apache/hudi/pull/5267 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the

[GitHub] [hudi] hudi-bot commented on pull request #5244: [HUDI-3812] Fixing Data Skipping configuration to respect Metadata Table configs

2022-04-08 Thread GitBox
hudi-bot commented on PR #5244: URL: https://github.com/apache/hudi/pull/5244#issuecomment-1093447210 ## CI report: * c252f8d6d9a6b38adcebec0ba857d5aafae823cf UNKNOWN * 6097520274d65ff9a9c2734a25eaf253ef78529d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #5193: [DO NOT MERGE][DBG] Enabling Column Stats Index by default for all columns

2022-04-08 Thread GitBox
hudi-bot commented on PR #5193: URL: https://github.com/apache/hudi/pull/5193#issuecomment-1093449822 ## CI report: * ed1fad096902d40c83d305606cf4d646bb63dfe9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7923

[GitHub] [hudi] hudi-bot commented on pull request #5267: [WIP] Fixing Column Stats Index updating sequence

2022-04-08 Thread GitBox
hudi-bot commented on PR #5267: URL: https://github.com/apache/hudi/pull/5267#issuecomment-1093449983 ## CI report: * db99045030d04f226d26466b931cfb95cd5a186b UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #5193: [DO NOT MERGE][DBG] Enabling Column Stats Index by default for all columns

2022-04-08 Thread GitBox
hudi-bot commented on PR #5193: URL: https://github.com/apache/hudi/pull/5193#issuecomment-1093452535 ## CI report: * ed1fad096902d40c83d305606cf4d646bb63dfe9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7923

[GitHub] [hudi] hudi-bot commented on pull request #5267: [WIP] Fixing Column Stats Index updating sequence

2022-04-08 Thread GitBox
hudi-bot commented on PR #5267: URL: https://github.com/apache/hudi/pull/5267#issuecomment-1093452743 ## CI report: * db99045030d04f226d26466b931cfb95cd5a186b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7927

[GitHub] [hudi] rahil-c commented on pull request #5256: [MINOR] Update README of docker build setup

2022-04-08 Thread GitBox
rahil-c commented on PR #5256: URL: https://github.com/apache/hudi/pull/5256#issuecomment-1093457630 LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-

[GitHub] [hudi] yihua merged pull request #5256: [MINOR] Update README of docker build setup

2022-04-08 Thread GitBox
yihua merged PR #5256: URL: https://github.com/apache/hudi/pull/5256 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

[hudi] branch master updated (26eb7b8183 -> 1cc7542357)

2022-04-08 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 26eb7b8183 [HUDI-3571] Spark datasource continuous checkpoint should have own fs variable (#5265) add 1cc7542357 [M

[GitHub] [hudi] alexeykudinkin commented on pull request #5244: [HUDI-3812] Fixing Data Skipping configuration to respect Metadata Table configs

2022-04-08 Thread GitBox
alexeykudinkin commented on PR #5244: URL: https://github.com/apache/hudi/pull/5244#issuecomment-1093502499 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[GitHub] [hudi] hudi-bot commented on pull request #5244: [HUDI-3812] Fixing Data Skipping configuration to respect Metadata Table configs

2022-04-08 Thread GitBox
hudi-bot commented on PR #5244: URL: https://github.com/apache/hudi/pull/5244#issuecomment-1093504604 ## CI report: * c252f8d6d9a6b38adcebec0ba857d5aafae823cf UNKNOWN * 6097520274d65ff9a9c2734a25eaf253ef78529d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

2022-04-08 Thread GitBox
hudi-bot commented on PR #5266: URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093516227 ## CI report: * abf82b3871b7b65708932fd4e7115b279f64964b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924

[GitHub] [hudi] hudi-bot commented on pull request #5267: [WIP] Fixing Column Stats Index updating sequence

2022-04-08 Thread GitBox
hudi-bot commented on PR #5267: URL: https://github.com/apache/hudi/pull/5267#issuecomment-1093516253 ## CI report: * db99045030d04f226d26466b931cfb95cd5a186b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7927

[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

2022-04-08 Thread GitBox
hudi-bot commented on PR #5266: URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093518493 ## CI report: * abf82b3871b7b65708932fd4e7115b279f64964b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924

[jira] [Reopened] (HUDI-3825) Fix tests failing when enabling MT on the Read Path

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin reopened HUDI-3825: --- > Fix tests failing when enabling MT on the Read Path > --

[jira] [Updated] (HUDI-3834) Evaluate MT Column Stats Performance

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3834: -- Status: Patch Available (was: In Progress) > Evaluate MT Column Stats Performance > --

[GitHub] [hudi] hudi-bot commented on pull request #5193: [DO NOT MERGE][DBG] Enabling Column Stats Index by default for all columns

2022-04-08 Thread GitBox
hudi-bot commented on PR #5193: URL: https://github.com/apache/hudi/pull/5193#issuecomment-1093522719 ## CI report: * c125819a6310b97bfb656ba7b397a7500992fc37 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7926

[jira] [Commented] (HUDI-3825) Fix tests failing when enabling MT on the Read Path

2022-04-08 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519851#comment-17519851 ] Alexey Kudinkin commented on HUDI-3825: --- More test failures: {code:java} [ERROR] or
