[GitHub] [hudi] hudi-bot commented on pull request #4958: [HUDI-3558] [Stacked 3123/3085] Consistent bucket index: bucket resizing (split&merge) & concurrent write during resizing

2022-04-18 Thread GitBox
hudi-bot commented on PR #4958: URL: https://github.com/apache/hudi/pull/4958#issuecomment-1102161717 ## CI report: * eb1c6e290d676158af9385ef7922e372163113e7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8127

[jira] [Created] (HUDI-3915) Error upserting bucketType UPDATE for partition :0

2022-04-18 Thread Neetu Gupta (Jira)
Neetu Gupta created HUDI-3915: - Summary: Error upserting bucketType UPDATE for partition :0 Key: HUDI-3915 URL: https://issues.apache.org/jira/browse/HUDI-3915 Project: Apache Hudi Issue Type: Bu

[GitHub] [hudi] hudi-bot commented on pull request #5357: [HUDI-3912] Fix lose data when rollback in flink async compact

2022-04-18 Thread GitBox
hudi-bot commented on PR #5357: URL: https://github.com/apache/hudi/pull/5357#issuecomment-1102155299 ## CI report: * d9c0a047103cb4d72ba5d46ee45d5b2c10319458 UNKNOWN * 7de9f669139bdd6b812ff443daab08b940f5319c Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-

[GitHub] [hudi] hudi-bot commented on pull request #5357: [HUDI-3912] Fix lose data when rollback in flink async compact

2022-04-18 Thread GitBox
hudi-bot commented on PR #5357: URL: https://github.com/apache/hudi/pull/5357#issuecomment-1102151249 ## CI report: * d9c0a047103cb4d72ba5d46ee45d5b2c10319458 UNKNOWN * 7de9f669139bdd6b812ff443daab08b940f5319c Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-

[GitHub] [hudi] hudi-bot commented on pull request #5357: [HUDI-3912] Fix lose data when rollback in flink async compact

2022-04-18 Thread GitBox
hudi-bot commented on PR #5357: URL: https://github.com/apache/hudi/pull/5357#issuecomment-1102144730 ## CI report: * d9c0a047103cb4d72ba5d46ee45d5b2c10319458 UNKNOWN * 7de9f669139bdd6b812ff443daab08b940f5319c Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-

[GitHub] [hudi] hudi-bot commented on pull request #5357: [HUDI-3912] Fix lose data when rollback in flink async compact

2022-04-18 Thread GitBox
hudi-bot commented on PR #5357: URL: https://github.com/apache/hudi/pull/5357#issuecomment-1102142708 ## CI report: * cb1112f40390ba58260466d2d7b3e30230e19b76 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812

[GitHub] [hudi] hudi-bot commented on pull request #5357: [HUDI-3912] Fix lose data when rollback in flink async compact

2022-04-18 Thread GitBox
hudi-bot commented on PR #5357: URL: https://github.com/apache/hudi/pull/5357#issuecomment-1102130903 ## CI report: * cb1112f40390ba58260466d2d7b3e30230e19b76 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812

[GitHub] [hudi] hudi-bot commented on pull request #5357: [HUDI-3912] Fix lose data when rollback in flink async compact

2022-04-18 Thread GitBox
hudi-bot commented on PR #5357: URL: https://github.com/apache/hudi/pull/5357#issuecomment-1102129023 ## CI report: * cb1112f40390ba58260466d2d7b3e30230e19b76 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812

[GitHub] [hudi] hudi-bot commented on pull request #5357: [HUDI-3912] Fix lose data when rollback in flink async compact

2022-04-18 Thread GitBox
hudi-bot commented on PR #5357: URL: https://github.com/apache/hudi/pull/5357#issuecomment-1102127065 ## CI report: * cb1112f40390ba58260466d2d7b3e30230e19b76 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8128

[GitHub] [hudi] Guanpx opened a new issue, #5358: [SUPPORT] read hudi cow table with spark, throw exception: File does not exist

2022-04-18 Thread GitBox
Guanpx opened a new issue, #5358: URL: https://github.com/apache/hudi/issues/5358 **Describe the problem you faced** read hudi cow table with spark , throw exception **File does not exist: ** **To Reproduce** Steps to reproduce the behavior: 1. insert

[GitHub] [hudi] hudi-bot commented on pull request #5356: [HUDI-3905] Add S3 related setup in Kafka Connect quick start

2022-04-18 Thread GitBox
hudi-bot commented on PR #5356: URL: https://github.com/apache/hudi/pull/5356#issuecomment-1102123099 ## CI report: * 480b4e9c735edc69b37436376089086443a481ba Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8125

[GitHub] [hudi] zhangyue19921010 commented on a diff in pull request #5344: [HUDI-3879]Suppress exceptions that are not fatal in HoodieMetadataTableValidator

2022-04-18 Thread GitBox
zhangyue19921010 commented on code in PR #5344: URL: https://github.com/apache/hudi/pull/5344#discussion_r852603749 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieMetadataTableValidator.java: ## @@ -399,7 +400,9 @@ public void doMetadataTableValidation() {

[GitHub] [hudi] zhangyue19921010 commented on pull request #5344: [HUDI-3879]Suppress exceptions that are not fatal in HoodieMetadataTableValidator

2022-04-18 Thread GitBox
zhangyue19921010 commented on PR #5344: URL: https://github.com/apache/hudi/pull/5344#issuecomment-1102064611 > when there are no commits in metadata table. Yeap, after this patch, when there are no commits in metadata table, HoodieMetadataTableValidator will skip current loop and log

[GitHub] [hudi] YuweiXiao commented on a diff in pull request #4326: [HUDI-2999] [RFC-42] RFC for consistent hashing index

2022-04-18 Thread GitBox
YuweiXiao commented on code in PR #4326: URL: https://github.com/apache/hudi/pull/4326#discussion_r852602739 ## rfc/rfc-42/rfc-42.md: ## @@ -0,0 +1,230 @@ + +# RFC-42: Consistent Hashing Index for Dynamic Bucket Number + + +## Proposers + +- @HuberyLee +- @hujincalrin +- @stream

[jira] [Closed] (HUDI-3894) Add HBase dependencies and shading in datahub and gcp bundles

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo closed HUDI-3894. --- Resolution: Fixed > Add HBase dependencies and shading in datahub and gcp bundles > --

[jira] [Updated] (HUDI-2673) Add integration/e2e test for kafka-connect functionality

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-2673: Sprint: Hudi-Sprint-Apr-19 > Add integration/e2e test for kafka-connect functionality >

[GitHub] [hudi] hudi-bot commented on pull request #5344: [HUDI-3879]Suppress exceptions that are not fatal in HoodieMetadataTableValidator

2022-04-18 Thread GitBox
hudi-bot commented on PR #5344: URL: https://github.com/apache/hudi/pull/5344#issuecomment-1102046824 ## CI report: * 6da5a939cc985fc21be40168503c43a98d39e566 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8124

[jira] [Updated] (HUDI-2673) Add integration/e2e test for kafka-connect functionality

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-2673: Story Points: 6 > Add integration/e2e test for kafka-connect functionality > ---

[hudi] branch master updated (4f44e6aeb5 -> 9af7b09aec)

2022-04-18 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 4f44e6aeb5 [HUDI-3899] Drop index to delete pending index instants from timeline if applicable (#5342) add 9af7b09a

[GitHub] [hudi] yihua merged pull request #5349: [HUDI-3894] Fix gcp bundle to include HBase dependencies and shading

2022-04-18 Thread GitBox
yihua merged PR #5349: URL: https://github.com/apache/hudi/pull/5349 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

[GitHub] [hudi] hudi-bot commented on pull request #5355: [WIP][DO_NOT_MERGE] Savepoint archival flip

2022-04-18 Thread GitBox
hudi-bot commented on PR #5355: URL: https://github.com/apache/hudi/pull/5355#issuecomment-1102013085 ## CI report: * 35fd8d0aa5d56f1a0b98bb63e0b6f47f1452e35e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8123

[GitHub] [hudi] nsivabalan commented on a diff in pull request #4326: [HUDI-2999] [RFC-42] RFC for consistent hashing index

2022-04-18 Thread GitBox
nsivabalan commented on code in PR #4326: URL: https://github.com/apache/hudi/pull/4326#discussion_r852587530 ## rfc/rfc-42/rfc-42.md: ## @@ -0,0 +1,215 @@ + +# RFC-42: Consistent Hashing Index for Dynamic Bucket Number + + +## Proposers + +- @HuberyLee +- @hujincalrin +- @strea

[GitHub] [hudi] hudi-bot commented on pull request #4309: [HUDI-3016][RFC-43] Proposal to implement Compaction/Clustering Servi…

2022-04-18 Thread GitBox
hudi-bot commented on PR #4309: URL: https://github.com/apache/hudi/pull/4309#issuecomment-1101991348 ## CI report: * fbe27691b5d9de58128cc58158047a4df2b53750 UNKNOWN * ec26a6b6d14f16de6db11dad782fa9c0002dcd04 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #5352: [HUDI-3902] Fallback to `HadoopFsRelation` in cases non-involving Schema Evolution

2022-04-18 Thread GitBox
hudi-bot commented on PR #5352: URL: https://github.com/apache/hudi/pull/5352#issuecomment-1101974409 ## CI report: * 57f622f643f7c623129636f8e5000ffe014b0c0b UNKNOWN * 6f2b0129ba44ba17b6bd1ba552e9724a62f8e96c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #5350: [HUDI-3884][WIP] Adding support to let archival proceed beyond savepointed commits

2022-04-18 Thread GitBox
hudi-bot commented on PR #5350: URL: https://github.com/apache/hudi/pull/5350#issuecomment-1101967426 ## CI report: * 134ef19bbe6dfd63c9fc38c36075c6239281c407 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8122

[jira] [Created] (HUDI-3914) Enhance TestColumnStatsIndex to test indexing with regular writes and table services

2022-04-18 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-3914: - Summary: Enhance TestColumnStatsIndex to test indexing with regular writes and table services Key: HUDI-3914 URL: https://issues.apache.org/jira/browse/HUDI-3914 Project: A

[GitHub] [hudi] wxplovecc commented on issue #5330: [SUPPORT] [BUG] Duplicate fileID ??? from bucket ?? of partition found during the BucketStreamWriteFunction index bootstrap.

2022-04-18 Thread GitBox
wxplovecc commented on issue #5330: URL: https://github.com/apache/hudi/issues/5330#issuecomment-1101965808 > Thanks for the PR @wxplovecc , can you explain why the #5185 patch fixed the bug ? The MOR table rollback did not delete the log files. If the job failed before first succes

[jira] [Updated] (HUDI-3783) Fix HoodieTestTable harness to also properly validate Column Stats

2022-04-18 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3783: -- Fix Version/s: 0.11.1 (was: 0.12.0) > Fix HoodieTestTable harness to also properl

[GitHub] [hudi] hudi-bot commented on pull request #5328: [WIP][HUDI-3883] Fix Bulk Insert to repartition the dataset based on Partition Path

2022-04-18 Thread GitBox
hudi-bot commented on PR #5328: URL: https://github.com/apache/hudi/pull/5328#issuecomment-1101963690 ## CI report: * 6812e0065e1411107d7d53ad2997d02e7ce34d06 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8079

[jira] [Updated] (HUDI-3883) File-sizing issues when writing COW table to S3

2022-04-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3883: - Labels: pull-request-available (was: ) > File-sizing issues when writing COW table to S3 > --

[GitHub] [hudi] hudi-bot commented on pull request #5328: [WIP][HUDI-3883] Fix Bulk Insert to repartition the dataset based on Partition Path

2022-04-18 Thread GitBox
hudi-bot commented on PR #5328: URL: https://github.com/apache/hudi/pull/5328#issuecomment-1101962392 ## CI report: * 6812e0065e1411107d7d53ad2997d02e7ce34d06 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8079

[jira] [Created] (HUDI-3913) Use hudi-trino-bundle in trino connector

2022-04-18 Thread Todd Gao (Jira)
Todd Gao created HUDI-3913: -- Summary: Use hudi-trino-bundle in trino connector Key: HUDI-3913 URL: https://issues.apache.org/jira/browse/HUDI-3913 Project: Apache Hudi Issue Type: Improvement

[GitHub] [hudi] hudi-bot commented on pull request #5357: [HUDI-3912] Fix lose data when rollback in flink async compact

2022-04-18 Thread GitBox
hudi-bot commented on PR #5357: URL: https://github.com/apache/hudi/pull/5357#issuecomment-1101957190 ## CI report: * cb1112f40390ba58260466d2d7b3e30230e19b76 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8128

[jira] [Updated] (HUDI-3207) Hudi Trino connector PR review

2022-04-18 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3207: -- Story Points: 1 (was: 0) > Hudi Trino connector PR review > -- > >

[GitHub] [hudi] hudi-bot commented on pull request #5357: [HUDI-3912] Fix lose data when rollback in flink async compact

2022-04-18 Thread GitBox
hudi-bot commented on PR #5357: URL: https://github.com/apache/hudi/pull/5357#issuecomment-1101955848 ## CI report: * cb1112f40390ba58260466d2d7b3e30230e19b76 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] wxplovecc commented on pull request #5185: [HUDI-3758] Optimize flink partition table with BucketIndex

2022-04-18 Thread GitBox
wxplovecc commented on PR #5185: URL: https://github.com/apache/hudi/pull/5185#issuecomment-1101955739 > Hello @wxplovecc Can you fix the checkstyle for building ? ok

[GitHub] [hudi] hudi-bot commented on pull request #4309: [HUDI-3016][RFC-43] Proposal to implement Compaction/Clustering Servi…

2022-04-18 Thread GitBox
hudi-bot commented on PR #4309: URL: https://github.com/apache/hudi/pull/4309#issuecomment-1101955310 ## CI report: * fbe27691b5d9de58128cc58158047a4df2b53750 UNKNOWN * ec26a6b6d14f16de6db11dad782fa9c0002dcd04 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] wxplovecc commented on pull request #5357: [HUDI-3912] Fix lose data when rollback in flink async compact

2022-04-18 Thread GitBox
wxplovecc commented on PR #5357: URL: https://github.com/apache/hudi/pull/5357#issuecomment-1101954300 cc @danny0405

[jira] [Updated] (HUDI-3912) Avoid data lose in flink async compact when rollbackCompaction

2022-04-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3912: - Labels: pull-request-available (was: ) > Avoid data lose in flink async compact when rollbackComp

[GitHub] [hudi] wxplovecc opened a new pull request, #5357: [HUDI-3912] Fix lose data when rollback in flink async compact

2022-04-18 Thread GitBox
wxplovecc opened a new pull request, #5357: URL: https://github.com/apache/hudi/pull/5357 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the purpo

[GitHub] [hudi] hudi-bot commented on pull request #5352: [HUDI-3902] Fallback to `HadoopFsRelation` in cases non-involving Schema Evolution

2022-04-18 Thread GitBox
hudi-bot commented on PR #5352: URL: https://github.com/apache/hudi/pull/5352#issuecomment-1101952743 ## CI report: * 57f622f643f7c623129636f8e5000ffe014b0c0b UNKNOWN * 6f2b0129ba44ba17b6bd1ba552e9724a62f8e96c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[jira] [Created] (HUDI-3912) Avoid data lose in flink async compact when rollbackCompaction

2022-04-18 Thread konwu (Jira)
konwu created HUDI-3912: --- Summary: Avoid data lose in flink async compact when rollbackCompaction Key: HUDI-3912 URL: https://issues.apache.org/jira/browse/HUDI-3912 Project: Apache Hudi Issue Type: B

[GitHub] [hudi] hudi-bot commented on pull request #4958: [HUDI-3558] [Stacked 3123/3085] Consistent bucket index: bucket resizing (split&merge) & concurrent write during resizing

2022-04-18 Thread GitBox
hudi-bot commented on PR #4958: URL: https://github.com/apache/hudi/pull/4958#issuecomment-1101951219 ## CI report: * 9a9e6149befed76a04fc2e691467772d44d158d5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8106

[GitHub] [hudi] hudi-bot commented on pull request #4958: [HUDI-3558] [Stacked 3123/3085] Consistent bucket index: bucket resizing (split&merge) & concurrent write during resizing

2022-04-18 Thread GitBox
hudi-bot commented on PR #4958: URL: https://github.com/apache/hudi/pull/4958#issuecomment-1101949831 ## CI report: * 9a9e6149befed76a04fc2e691467772d44d158d5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8106

[jira] [Updated] (HUDI-3911) Async indexer blog/doc for 0.11 release

2022-04-18 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3911: -- Status: In Progress (was: Open) > Async indexer blog/doc for 0.11 release > ---

[jira] [Updated] (HUDI-3368) Support metadata bloom index for secondary keys

2022-04-18 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3368: -- Story Points: 4 (was: 3) > Support metadata bloom index for secondary keys > --

[jira] [Closed] (HUDI-2481) Fix Restore and RollbackMetadata in HoodieTestTable

2022-04-18 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-2481. - Fix Version/s: 0.11.0 (was: 0.12.0) Resolution: Fixed > Fix Restore and Roll

[GitHub] [hudi] danny0405 commented on issue #4881: Full incremental Enable index loading to discover duplicate data(index.bootstrap.enabled)

2022-04-18 Thread GitBox
danny0405 commented on issue #4881: URL: https://github.com/apache/hudi/issues/4881#issuecomment-1101946263 I mean, no new records are allowed to be written before the bootstrap finishes?

[jira] [Commented] (HUDI-2481) Fix Restore and RollbackMetadata in HoodieTestTable

2022-04-18 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524024#comment-17524024 ] Sagar Sumit commented on HUDI-2481: --- Both followups done: # set log files in rollback m

[jira] [Updated] (HUDI-3884) Inspect why archival stops at first savepoint. Add support if possible

2022-04-18 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3884: -- Story Points: 2 Remaining Estimate: 4h Original Estimate: 4h > Inspect wh

[jira] [Updated] (HUDI-3848) Restore fails when files pertaining to a commit has been cleaned up

2022-04-18 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3848: -- Status: In Progress (was: Open) > Restore fails when files pertaining to a commit has b

[jira] [Updated] (HUDI-3848) Restore fails when files pertaining to a commit has been cleaned up

2022-04-18 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3848: -- Status: Patch Available (was: In Progress) > Restore fails when files pertaining to a c

[jira] [Updated] (HUDI-3884) Inspect why archival stops at first savepoint. Add support if possible

2022-04-18 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3884: -- Status: In Progress (was: Open) > Inspect why archival stops at first savepoint. Add su

[jira] [Updated] (HUDI-3899) Drop index should delete index commit files from the timeline

2022-04-18 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3899: -- Story Points: 2 > Drop index should delete index commit files from the timeline > --

[jira] [Updated] (HUDI-3519) Make sure every public Hudi Client Method invokes necessary prologue

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3519: - Sprint: Hudi-Sprint-Apr-19 > Make sure every public Hudi Client Method invokes necessary prologue > --

[jira] [Updated] (HUDI-3911) Async indexer blog/doc for 0.11 release

2022-04-18 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3911: -- Sprint: Hudi-Sprint-Apr-12 > Async indexer blog/doc for 0.11 release > -

[jira] [Created] (HUDI-3911) Async indexer blog/doc for 0.11 release

2022-04-18 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-3911: - Summary: Async indexer blog/doc for 0.11 release Key: HUDI-3911 URL: https://issues.apache.org/jira/browse/HUDI-3911 Project: Apache Hudi Issue Type: Task

[jira] [Updated] (HUDI-2597) Improve code quality around Generics with Java 8

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2597: - Priority: Major (was: Blocker) > Improve code quality around Generics with Java 8 > -

[jira] [Updated] (HUDI-3321) HFileWriter, HFileReader and HFileDataBlock should avoid hardcoded key field name

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3321: - Sprint: Hudi-Sprint-Apr-19 > HFileWriter, HFileReader and HFileDataBlock should avoid hardcoded key field

[jira] [Updated] (HUDI-3300) Timeline server FSViewManager should avoid point lookup for metadata file partition

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3300: - Sprint: Hudi-Sprint-Feb-14, Hudi-Sprint-Apr-19 (was: Hudi-Sprint-Feb-14) > Timeline server FSViewManager

[jira] [Updated] (HUDI-3301) MergedLogRecordReader inline reading should be stateless and thread safe

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3301: - Sprint: Hudi-Sprint-Feb-14, Hudi-Sprint-Apr-19 (was: Hudi-Sprint-Feb-14) > MergedLogRecordReader inline r

[jira] [Updated] (HUDI-3317) Partition specific pointed lookup/reading strategy for metadata table

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3317: - Sprint: Hudi-Sprint-Apr-19 > Partition specific pointed lookup/reading strategy for metadata table > -

[jira] [Updated] (HUDI-3288) Partition specific compaction strategy for the metadata table

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3288: - Sprint: Hudi-Sprint-Apr-19 > Partition specific compaction strategy for the metadata table > -

[jira] [Closed] (HUDI-3710) Fix testHoodieAsyncClusteringJob in TestHoodieDeltaStreamer

2022-04-18 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3710. - Resolution: Fixed > Fix testHoodieAsyncClusteringJob in TestHoodieDeltaStreamer >

[jira] [Updated] (HUDI-2954) Code cleanup: HFileDataBock - using integer keys is never used

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2954: - Sprint: Hudi-Sprint-Apr-19 > Code cleanup: HFileDataBock - using integer keys is never used > ---

[jira] [Updated] (HUDI-2736) Redundant metadata table initialization by the metadata writer

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2736: - Sprint: Hudi-Sprint-Apr-19 > Redundant metadata table initialization by the metadata writer >

[GitHub] [hudi] yihua commented on a diff in pull request #5337: [HUDI-3895] Fixing files partitioning sequence for `BaseFileOnlyRelation`

2022-04-18 Thread GitBox
yihua commented on code in PR #5337: URL: https://github.com/apache/hudi/pull/5337#discussion_r852551786 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/BaseFileOnlyRelation.scala: ## @@ -84,21 +84,24 @@ class BaseFileOnlyRelation(sqlContext: SQLContext

[jira] [Assigned] (HUDI-3707) Fix deltastreamer test with schema provider and transformer enabled

2022-04-18 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit reassigned HUDI-3707: - Assignee: Sagar Sumit (was: sivabalan narayanan) > Fix deltastreamer test with schema provider a

[jira] [Closed] (HUDI-3899) Drop index should delete index commit files from the timeline

2022-04-18 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3899. - Resolution: Fixed > Drop index should delete index commit files from the timeline > --

[jira] [Updated] (HUDI-2613) Fix usages of RealtimeSplit to use the new getDeltaLogFileStatus

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2613: - Sprint: Hudi-Sprint-Apr-19 > Fix usages of RealtimeSplit to use the new getDeltaLogFileStatus > --

[jira] [Updated] (HUDI-2460) Async cleaning with metadata table

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2460: - Sprint: Hudi-Sprint-Apr-19 > Async cleaning with metadata table > -- > >

[jira] [Created] (HUDI-3910) Fix HoodieIncrSource test failure

2022-04-18 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-3910: - Summary: Fix HoodieIncrSource test failure Key: HUDI-3910 URL: https://issues.apache.org/jira/browse/HUDI-3910 Project: Apache Hudi Issue Type: Test Re

[jira] [Closed] (HUDI-3707) Fix deltastreamer test with schema provider and transformer enabled

2022-04-18 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3707. - Resolution: Fixed > Fix deltastreamer test with schema provider and transformer enabled >

[jira] [Updated] (HUDI-2481) Fix Restore and RollbackMetadata in HoodieTestTable

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2481: - Sprint: Hudi-Sprint-Apr-19 > Fix Restore and RollbackMetadata in HoodieTestTable > ---

[jira] [Updated] (HUDI-2459) Support async compaction for metadata table

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2459: - Sprint: Hudi-Sprint-Apr-19 > Support async compaction for metadata table > ---

[jira] [Updated] (HUDI-10) Auto tune bulk insert parallelism #555

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-10: --- Priority: Major (was: Blocker) > Auto tune bulk insert parallelism #555 > -

[jira] [Updated] (HUDI-3878) create hudi-aws-bundle to use with utilities-slim and spark/flink on aws

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3878: - Sprint: Hudi-Sprint-Apr-19 > create hudi-aws-bundle to use with utilities-slim and spark/flink on aws > --

[jira] [Created] (HUDI-3909) Fix repair tool test failure

2022-04-18 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-3909: - Summary: Fix repair tool test failure Key: HUDI-3909 URL: https://issues.apache.org/jira/browse/HUDI-3909 Project: Apache Hudi Issue Type: Test Reporte

[jira] [Updated] (HUDI-3878) create hudi-aws-bundle to use with utilities-slim and spark/flink on aws

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3878: - Sprint: (was: Hudi-Sprint-Apr-19) > create hudi-aws-bundle to use with utilities-slim and spark/flink on

[jira] [Updated] (HUDI-3778) Rename module names in hudi-spark-datasource

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3778: - Sprint: (was: Hudi-Sprint-Apr-19) > Rename module names in hudi-spark-datasource > -

[jira] [Updated] (HUDI-3778) Rename module names in hudi-spark-datasource

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3778: - Sprint: Hudi-Sprint-Apr-19 > Rename module names in hudi-spark-datasource > --

[jira] [Updated] (HUDI-3746) CI ignored test failure in TestDataSkippingUtils

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3746: - Sprint: Hudi-Sprint-Apr-19 > CI ignored test failure in TestDataSkippingUtils > --

[GitHub] [hudi] hudi-bot commented on pull request #5354: [HUDI-3904] Claim RFC number for Improve timeline server

2022-04-18 Thread GitBox
hudi-bot commented on PR #5354: URL: https://github.com/apache/hudi/pull/5354#issuecomment-1101939374 ## CI report: * 99024f2b516f52d458aff8ca274bf4e4d15e140f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8120

[jira] [Updated] (HUDI-2638) Rewrite tests around Hudi index

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2638: - Priority: Major (was: Blocker) > Rewrite tests around Hudi index > --- > >

[jira] [Updated] (HUDI-3287) Remove unnecessary deps in hudi-kafka-connect

2022-04-18 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3287: - Sprint: Hudi-Sprint-Mar-01, Hudi-Sprint-Apr-19 (was: Hudi-Sprint-Mar-01) > Remove unnecessary deps in hud

[jira] [Updated] (HUDI-3114) Kafka Connect can not connect Hive by jdbc

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3114: Priority: Blocker (was: Major) > Kafka Connect can not connect Hive by jdbc > -

[jira] [Updated] (HUDI-3113) Kafka Connect create Multiple Embedded Timeline Services

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3113: Priority: Blocker (was: Major) > Kafka Connect create Multiple Embedded Timeline Services > ---

[jira] [Updated] (HUDI-3557) Add support for Glue schema registry to deltastreamer and kafka sink connector

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3557: Fix Version/s: 0.12.0 > Add support for Glue schema registry to deltastreamer and kafka sink connector > ---

[jira] [Updated] (HUDI-3557) Add support for Glue schema registry to deltastreamer and kafka sink connector

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3557: Priority: Blocker (was: Major) > Add support for Glue schema registry to deltastreamer and kafka sink conne

[jira] [Updated] (HUDI-2445) Implement time based bucketing of inserts

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-2445: Priority: Blocker (was: Major) > Implement time based bucketing of inserts > -

[jira] [Updated] (HUDI-2431) Reimplement BufferedWriter in streaming fashion

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-2431: Priority: Blocker (was: Major) > Reimplement BufferedWriter in streaming fashion >

[jira] [Updated] (HUDI-2353) Rewrite the java write client with the right abstractions

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-2353: Priority: Blocker (was: Major) > Rewrite the java write client with the right abstractions > --

[jira] [Updated] (HUDI-2353) Rewrite the java write client with the right abstractions

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-2353: Fix Version/s: 0.12.0 > Rewrite the java write client with the right abstractions >

[jira] [Updated] (HUDI-2337) Implement Multiwriter support for Kafka connect

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-2337: Fix Version/s: 0.12.0 > Implement Multiwriter support for Kafka connect > --

[jira] [Updated] (HUDI-2337) Implement Multiwriter support for Kafka connect

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-2337: Priority: Blocker (was: Major) > Implement Multiwriter support for Kafka connect >

[jira] [Updated] (HUDI-2336) Metadata table integration for Kafka connect

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-2336: Fix Version/s: 0.12.0 > Metadata table integration for Kafka connect > -

[jira] [Updated] (HUDI-2336) Metadata table integration for Kafka connect

2022-04-18 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-2336: Priority: Blocker (was: Major) > Metadata table integration for Kafka connect > ---

[GitHub] [hudi] hudi-bot commented on pull request #5356: [HUDI-3905] Add S3 related setup in Kafka Connect quick start

2022-04-18 Thread GitBox
hudi-bot commented on PR #5356: URL: https://github.com/apache/hudi/pull/5356#issuecomment-1101930926 ## CI report: * 480b4e9c735edc69b37436376089086443a481ba Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8125

[GitHub] [hudi] hudi-bot commented on pull request #5344: [HUDI-3879]Suppress exceptions that are not fatal in HoodieMetadataTableValidator

2022-04-18 Thread GitBox
hudi-bot commented on PR #5344: URL: https://github.com/apache/hudi/pull/5344#issuecomment-1101930886 ## CI report: * e2200282573a65abbae99f5bc0fb81f903996ffe Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8102

[hudi] branch master updated (52d878c52b -> 4f44e6aeb5)

2022-04-18 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 52d878c52b [HUDI-3903] Fix NoClassDefFoundError with Kafka Connect bundle (#5353) add 4f44e6aeb5 [HUDI-3899] Dr
