[GitHub] [hudi] manojpec commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
manojpec commented on a change in pull request #4352: URL: https://github.com/apache/hudi/pull/4352#discussion_r786388066 ## File path: hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java ## @@ -146,6 +155,97 @@ protected BaseTableMetadata(HoodieEngineCon

[GitHub] [hudi] manojpec commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
manojpec commented on a change in pull request #4352: URL: https://github.com/apache/hudi/pull/4352#discussion_r786388149 ## File path: hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java ## @@ -146,6 +155,97 @@ protected BaseTableMetadata(HoodieEngineCon

[GitHub] [hudi] manojpec commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
manojpec commented on a change in pull request #4352: URL: https://github.com/apache/hudi/pull/4352#discussion_r786388388 ## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java ## @@ -125,30 +129,43 @@ private void initIfNeeded() {

[GitHub] [hudi] manojpec commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
manojpec commented on a change in pull request #4352: URL: https://github.com/apache/hudi/pull/4352#discussion_r786388655 ## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java ## @@ -125,30 +129,43 @@ private void initIfNeeded() {

[GitHub] [hudi] manojpec commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
manojpec commented on a change in pull request #4352: URL: https://github.com/apache/hudi/pull/4352#discussion_r786389155 ## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java ## @@ -233,38 +250,78 @@ private void initIfNeeded() { }

[GitHub] [hudi] manojpec commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
manojpec commented on a change in pull request #4352: URL: https://github.com/apache/hudi/pull/4352#discussion_r786389597 ## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java ## @@ -109,55 +191,97 @@ private HoodieMetadataPayload(String k

[GitHub] [hudi] manojpec commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
manojpec commented on a change in pull request #4352: URL: https://github.com/apache/hudi/pull/4352#discussion_r786389763 ## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java ## @@ -109,55 +191,97 @@ private HoodieMetadataPayload(String k

[GitHub] [hudi] manojpec commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
manojpec commented on a change in pull request #4352: URL: https://github.com/apache/hudi/pull/4352#discussion_r786390304 ## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java ## @@ -109,55 +191,97 @@ private HoodieMetadataPayload(String k

[GitHub] [hudi] manojpec commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
manojpec commented on a change in pull request #4352: URL: https://github.com/apache/hudi/pull/4352#discussion_r786390448 ## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java ## @@ -109,55 +194,92 @@ private HoodieMetadataPayload(String k

[jira] [Commented] (HUDI-2873) Support optimize data layout by sql and make the build more fast

2022-01-17 Thread shibei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477539#comment-17477539 ] shibei commented on HUDI-2873: -- [~xiaotaotao] Two things need to be clarified: 1) Compaction

[GitHub] [hudi] ChangbingChen commented on issue #4618: [SUPPORT] When querying a hudi table in hive, there have duplicated records.

2022-01-17 Thread GitBox
ChangbingChen commented on issue #4618: URL: https://github.com/apache/hudi/issues/4618#issuecomment-1015040351 > @ChangbingChen do parquet files exist in your table? If parquet files exist, please set mapreduce.input.fileinputformat.split.maxsize >= (max size of a parquet file) to forbid hi

[GitHub] [hudi] manojpec commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
manojpec commented on a change in pull request #4352: URL: https://github.com/apache/hudi/pull/4352#discussion_r786392523 ## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java ## @@ -124,14 +202,111 @@ public static void deleteMetadataTa

[GitHub] [hudi] hudi-bot commented on pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4352: URL: https://github.com/apache/hudi/pull/4352#issuecomment-1015041615 ## CI report: * 235981abd20a498a3e29e98ce0eda9de35018f99 UNKNOWN * 519a5b123a7315948cf83dfd2e13ee275abcbd39 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org

[GitHub] [hudi] manojpec commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
manojpec commented on a change in pull request #4352: URL: https://github.com/apache/hudi/pull/4352#discussion_r786392823 ## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java ## @@ -399,4 +795,119 @@ public static int mapRecordKeyToFile

[GitHub] [hudi] hudi-bot removed a comment on pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4352: URL: https://github.com/apache/hudi/pull/4352#issuecomment-1014076298 ## CI report: * 235981abd20a498a3e29e98ce0eda9de35018f99 UNKNOWN * 519a5b123a7315948cf83dfd2e13ee275abcbd39 Azure: [FAILURE](https://dev.azure.com/apache-hud

[jira] [Assigned] (HUDI-2873) Support optimize data layout by sql and make the build more fast

2022-01-17 Thread leesf (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] leesf reassigned HUDI-2873: --- Assignee: shibei > Support optimize data layout by sql and make the build more fast > ---

[jira] [Assigned] (HUDI-2645) Rewrite Zoptimize and other files in scala into Java

2022-01-17 Thread leesf (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] leesf reassigned HUDI-2645: --- Assignee: shibei > Rewrite Zoptimize and other files in scala into Java > ---

[GitHub] [hudi] xuzifu666 commented on pull request #4582: [MINOR] standardize HoodieSqlCommon.g4 file

2022-01-17 Thread GitBox
xuzifu666 commented on pull request #4582: URL: https://github.com/apache/hudi/pull/4582#issuecomment-1015045648 @hudi-bot -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [hudi] xuzifu666 commented on pull request #4582: [MINOR] standardize HoodieSqlCommon.g4 file

2022-01-17 Thread GitBox
xuzifu666 commented on pull request #4582: URL: https://github.com/apache/hudi/pull/4582#issuecomment-1015045789 @hudi-bot run azure

[GitHub] [hudi] hudi-bot removed a comment on pull request #4582: [MINOR] standardize HoodieSqlCommon.g4 file

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4582: URL: https://github.com/apache/hudi/pull/4582#issuecomment-1011888990 ## CI report: * 208e50cde75c9ef2e79d8550cfbc441e6e0d2ba6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4582: [MINOR] standardize HoodieSqlCommon.g4 file

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4582: URL: https://github.com/apache/hudi/pull/4582#issuecomment-1015046322 ## CI report: * 208e50cde75c9ef2e79d8550cfbc441e6e0d2ba6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] xushiyan commented on issue #4411: [SUPPORT] - Presto Querying Issue in AWS EMR 6.3.1

2022-01-17 Thread GitBox
xushiyan commented on issue #4411: URL: https://github.com/apache/hudi/issues/4411#issuecomment-1015048564 @rajgowtham24 > we have manually updated the table location to "s3://bucket_name/test/table_name/default" so is `default/` the partition path in your table? I don't quit

[GitHub] [hudi] hudi-bot removed a comment on pull request #4582: [MINOR] standardize HoodieSqlCommon.g4 file

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4582: URL: https://github.com/apache/hudi/pull/4582#issuecomment-1015046322 ## CI report: * 208e50cde75c9ef2e79d8550cfbc441e6e0d2ba6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4582: [MINOR] standardize HoodieSqlCommon.g4 file

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4582: URL: https://github.com/apache/hudi/pull/4582#issuecomment-1015052591 ## CI report: * 208e50cde75c9ef2e79d8550cfbc441e6e0d2ba6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot commented on pull request #4582: [MINOR] standardize HoodieSqlCommon.g4 file

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4582: URL: https://github.com/apache/hudi/pull/4582#issuecomment-1015053946 ## CI report: * 208e50cde75c9ef2e79d8550cfbc441e6e0d2ba6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4582: [MINOR] standardize HoodieSqlCommon.g4 file

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4582: URL: https://github.com/apache/hudi/pull/4582#issuecomment-1015052591 ## CI report: * 208e50cde75c9ef2e79d8550cfbc441e6e0d2ba6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4352: URL: https://github.com/apache/hudi/pull/4352#issuecomment-1015055015 ## CI report: * 235981abd20a498a3e29e98ce0eda9de35018f99 UNKNOWN * 519a5b123a7315948cf83dfd2e13ee275abcbd39 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org

[GitHub] [hudi] hudi-bot removed a comment on pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4352: URL: https://github.com/apache/hudi/pull/4352#issuecomment-1015041615 ## CI report: * 235981abd20a498a3e29e98ce0eda9de35018f99 UNKNOWN * 519a5b123a7315948cf83dfd2e13ee275abcbd39 Azure: [FAILURE](https://dev.azure.com/apache-hud

[GitHub] [hudi] xiarixiaoyao commented on issue #4618: [SUPPORT] When querying a hudi table in hive, there have duplicated records.

2022-01-17 Thread GitBox
xiarixiaoyao commented on issue #4618: URL: https://github.com/apache/hudi/issues/4618#issuecomment-1015059696 @ChangbingChen I know Hudi has a bug for this. If possible, could you please modify the Hudi code and package a new Hudi jar? HoodieParquetRealtimeInputFormat.isSplitable

[GitHub] [hudi] xushiyan commented on issue #4623: [SUPPORT] Delete is not working in python, parquet files

2022-01-17 Thread GitBox
xushiyan commented on issue #4623: URL: https://github.com/apache/hudi/issues/4623#issuecomment-1015070882 @logan-jun hudi table name will be used to generate avro schema record name and hyphen is not a valid character in avro schema name. see https://avro.apache.org/docs/1.8.2/spec.html#n

[GitHub] [hudi] xushiyan commented on issue #4621: [SUPPORT] Integ tests are failing for HUDI

2022-01-17 Thread GitBox
xushiyan commented on issue #4621: URL: https://github.com/apache/hudi/issues/4621#issuecomment-1015074689 moving to https://issues.apache.org/jira/browse/HUDI-3262 for work tracking

[GitHub] [hudi] xushiyan closed issue #4621: [SUPPORT] Integ tests are failing for HUDI

2022-01-17 Thread GitBox
xushiyan closed issue #4621: URL: https://github.com/apache/hudi/issues/4621

[jira] [Created] (HUDI-3262) Integration test suite failure

2022-01-17 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-3262: Summary: Integration test suite failure Key: HUDI-3262 URL: https://issues.apache.org/jira/browse/HUDI-3262 Project: Apache Hudi Issue Type: Bug Components

[GitHub] [hudi] logan-jun commented on issue #4623: [SUPPORT] Delete is not working in python, parquet files

2022-01-17 Thread GitBox
logan-jun commented on issue #4623: URL: https://github.com/apache/hudi/issues/4623#issuecomment-1015074559 Hi. I've already tested changing table name to '500mdeltest' which does not have any hyphen or other special characters. but it returns same error: org.apache.avro.SchemaParseExcepti
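
The Avro naming rule raised in this thread can be checked with a small sketch. It assumes only the rule from the Avro specification (a name starts with a letter or underscore and continues with letters, digits, or underscores). The helper `is_valid_avro_name` and the sample names `my-table` and `my_table` are hypothetical illustrations, not Hudi APIs or names from the issue. Under that rule a leading digit is rejected just like a hyphen, which may be relevant to a name such as '500mdeltest'.

```python
import re

# Avro specification: a name starts with [A-Za-z_] and then
# contains only [A-Za-z0-9_]. This regex encodes just that rule.
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_name(name: str) -> bool:
    """Hypothetical helper: True if `name` satisfies the Avro name rule."""
    return AVRO_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    # 'my-table' and 'my_table' are made-up names; '500mdeltest' is quoted from the thread.
    for candidate in ["my-table", "500mdeltest", "my_table"]:
        print(candidate, is_valid_avro_name(candidate))
```

A check like this can be run on a table name before writing, so a schema parse error surfaces earlier.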

[GitHub] [hudi] hudi-bot commented on pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4352: URL: https://github.com/apache/hudi/pull/4352#issuecomment-1015075020 ## CI report: * 235981abd20a498a3e29e98ce0eda9de35018f99 UNKNOWN * a206a0bf59a3d4d079b19164eb4350a07856d8fa Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org

[GitHub] [hudi] hudi-bot removed a comment on pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4352: URL: https://github.com/apache/hudi/pull/4352#issuecomment-1015055015 ## CI report: * 235981abd20a498a3e29e98ce0eda9de35018f99 UNKNOWN * 519a5b123a7315948cf83dfd2e13ee275abcbd39 Azure: [FAILURE](https://dev.azure.com/apache-hud

[GitHub] [hudi] xushiyan commented on issue #4623: [SUPPORT] Delete is not working in python, parquet files

2022-01-17 Thread GitBox
xushiyan commented on issue #4623: URL: https://github.com/apache/hudi/issues/4623#issuecomment-1015075092 > @logan-jun hudi table name will be used to generate avro schema record name and hyphen is not a valid character in avro schema name. see https://avro.apache.org/docs/1.8.2/spec.html

[GitHub] [hudi] logan-jun commented on issue #4623: [SUPPORT] Delete is not working in python, parquet files

2022-01-17 Thread GitBox
logan-jun commented on issue #4623: URL: https://github.com/apache/hudi/issues/4623#issuecomment-1015075724 Oh, I see. let me test it again

[GitHub] [hudi] xushiyan commented on issue #4604: [SUPPORT] Archive functionality fails

2022-01-17 Thread GitBox
xushiyan commented on issue #4604: URL: https://github.com/apache/hudi/issues/4604#issuecomment-1015077050 @XuQianJin-Stars can you help look into this please? Thanks.

[GitHub] [hudi] hudi-bot removed a comment on pull request #4523: [WIP][HUDI-3173] Add INDEX action type and corresponding commit metadata

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4523: URL: https://github.com/apache/hudi/pull/4523#issuecomment-1014190454 ## CI report: * 6c1c19492e82d894d20095e1b5038d29fd2d3322 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4523: [WIP][HUDI-3173] Add INDEX action type and corresponding commit metadata

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4523: URL: https://github.com/apache/hudi/pull/4523#issuecomment-1015078994 ## CI report: * 6c1c19492e82d894d20095e1b5038d29fd2d3322 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot commented on pull request #4523: [WIP][HUDI-3173] Add INDEX action type and corresponding commit metadata

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4523: URL: https://github.com/apache/hudi/pull/4523#issuecomment-1015080375 ## CI report: * 6c1c19492e82d894d20095e1b5038d29fd2d3322 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4523: [WIP][HUDI-3173] Add INDEX action type and corresponding commit metadata

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4523: URL: https://github.com/apache/hudi/pull/4523#issuecomment-1015078994 ## CI report: * 6c1c19492e82d894d20095e1b5038d29fd2d3322 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4582: [MINOR] standardize HoodieSqlCommon.g4 file

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4582: URL: https://github.com/apache/hudi/pull/4582#issuecomment-1015080437 ## CI report: * 88ccdba40380ee97a48d6c6a299ca55d9d5c3030 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4582: [MINOR] standardize HoodieSqlCommon.g4 file

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4582: URL: https://github.com/apache/hudi/pull/4582#issuecomment-1015053946 ## CI report: * 208e50cde75c9ef2e79d8550cfbc441e6e0d2ba6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] xushiyan commented on issue #4600: [SUPPORT]When hive queries Hudi data, the query path is wrong

2022-01-17 Thread GitBox
xushiyan commented on issue #4600: URL: https://github.com/apache/hudi/issues/4600#issuecomment-1015080532 @xiarixiaoyao can you give some advice here please? Thank you.

[GitHub] [hudi] logan-jun closed issue #4623: [SUPPORT] Delete is not working in python, parquet files

2022-01-17 Thread GitBox
logan-jun closed issue #4623: URL: https://github.com/apache/hudi/issues/4623

[GitHub] [hudi] logan-jun commented on issue #4623: [SUPPORT] Delete is not working in python, parquet files

2022-01-17 Thread GitBox
logan-jun commented on issue #4623: URL: https://github.com/apache/hudi/issues/4623#issuecomment-1015080879 It works. thank you very much

[GitHub] [hudi] xushiyan commented on issue #4593: [SUPPORT] Does Hudi support just column re-order?

2022-01-17 Thread GitBox
xushiyan commented on issue #4593: URL: https://github.com/apache/hudi/issues/4593#issuecomment-1015082464 Great to see rfc 33 moving and thanks @xiarixiaoyao for driving the work there! @WTa-hash if no further questions, let's close this. thank you.

[GitHub] [hudi] xushiyan commented on issue #4585: Target Schema cannot be set in MultiTableDeltaStreamer

2022-01-17 Thread GitBox
xushiyan commented on issue #4585: URL: https://github.com/apache/hudi/issues/4585#issuecomment-1015083274 @pratyakshsharma can you chime in here please? looks like some improvements for `MultiTableDeltaStreamer` ?

[GitHub] [hudi] xushiyan commented on issue #4583: [SUPPORT] Got NoSuchElementException while using hudi 0.10.0 and Flink (COW)

2022-01-17 Thread GitBox
xushiyan commented on issue #4583: URL: https://github.com/apache/hudi/issues/4583#issuecomment-1015085431 @york-yu-ctw let us know if you got it resolved. thanks.

[jira] [Comment Edited] (HUDI-3222) On-call team to triage GH issues, PRs, and JIRAs

2022-01-17 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477396#comment-17477396 ] Raymond Xu edited comment on HUDI-3222 at 1/18/22, 5:34 AM: h4

[GitHub] [hudi] XuQianJin-Stars commented on issue #4604: [SUPPORT] Archive functionality fails

2022-01-17 Thread GitBox
XuQianJin-Stars commented on issue #4604: URL: https://github.com/apache/hudi/issues/4604#issuecomment-1015086623 hi @andykrk Can you provide further steps to help me reproduce this issue?

[jira] [Comment Edited] (HUDI-3222) On-call team to triage GH issues, PRs, and JIRAs

2022-01-17 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477396#comment-17477396 ] Raymond Xu edited comment on HUDI-3222 at 1/18/22, 5:38 AM: h4

[GitHub] [hudi] xushiyan commented on issue #4299: [SUPPORT] Upsert performance decreased after 3 years of data loading

2022-01-17 Thread GitBox
xushiyan commented on issue #4299: URL: https://github.com/apache/hudi/issues/4299#issuecomment-1015089248 @XuQianJin-Stars can you help investigate this issue please? thank you.

[GitHub] [hudi] EchoLee5 opened a new pull request #4624: [HUDI-3261] Read rt table by hive cli throw NoSuchMethodError

2022-01-17 Thread GitBox
EchoLee5 opened a new pull request #4624: URL: https://github.com/apache/hudi/pull/4624 ## What is the purpose of the pull request Shade org.apache.parquet.avro. to org.apache.hudi.org.apache.parquet.avro. for hudi-hadoop-mr-bundle module ## Brief change log Modify the

[jira] [Updated] (HUDI-3261) Read rt table by hive cli throw NoSuchMethodError

2022-01-17 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3261: - Labels: pull-request-available sev:normal (was: sev:normal) > Read rt table by hive cli throw NoS

[GitHub] [hudi] hudi-bot commented on pull request #4624: [HUDI-3261] Read rt table by hive cli throw NoSuchMethodError

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4624: URL: https://github.com/apache/hudi/pull/4624#issuecomment-1015091544 ## CI report: * 315c4c0092660cc10b4dadb833456283cebe6979 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[jira] [Updated] (HUDI-3261) Read rt table by hive cli throw NoSuchMethodError

2022-01-17 Thread Echo Lee (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Echo Lee updated HUDI-3261: --- Description: @Mention someone by typing their name... (was: ``` Exception in thread "main" java.lang.NoSuchMet

[jira] [Updated] (HUDI-3261) Read rt table by hive cli throw NoSuchMethodError

2022-01-17 Thread Echo Lee (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Echo Lee updated HUDI-3261: --- Description: ``` Exception in thread "main" java.lang.NoSuchMethodError: org.apache.parquet.avro.AvroSchemaCo

[GitHub] [hudi] hudi-bot removed a comment on pull request #4624: [HUDI-3261] Read rt table by hive cli throw NoSuchMethodError

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4624: URL: https://github.com/apache/hudi/pull/4624#issuecomment-1015091544 ## CI report: * 315c4c0092660cc10b4dadb833456283cebe6979 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run

[GitHub] [hudi] hudi-bot commented on pull request #4624: [HUDI-3261] Read rt table by hive cli throw NoSuchMethodError

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4624: URL: https://github.com/apache/hudi/pull/4624#issuecomment-1015092655 ## CI report: * 315c4c0092660cc10b4dadb833456283cebe6979 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] xushiyan commented on issue #4082: [SUPPORT] How to write multiple HUDi tables simultaneously in a Spark Streaming task?

2022-01-17 Thread GitBox
xushiyan commented on issue #4082: URL: https://github.com/apache/hudi/issues/4082#issuecomment-1015092759 > @xuranyang : are you referring to MultiTableDeltastreamer. I don't think we have any such functionality for now to stream from multiple and write to diff hudi tables. Had to be done

[jira] [Updated] (HUDI-3261) Read rt table by hive cli throw NoSuchMethodError

2022-01-17 Thread Echo Lee (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Echo Lee updated HUDI-3261: --- Description: When querying the MOR table synchronized from hudi to hive, the following exception is thrown:

[jira] [Updated] (HUDI-3261) Query rt table by hive cli throw NoSuchMethodError

2022-01-17 Thread Echo Lee (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Echo Lee updated HUDI-3261: --- Summary: Query rt table by hive cli throw NoSuchMethodError (was: Read rt table by hive cli throw NoSuchMetho

[jira] [Commented] (HUDI-3261) Query rt table by hive cli throw NoSuchMethodError

2022-01-17 Thread Echo Lee (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477590#comment-17477590 ] Echo Lee commented on HUDI-3261: cc [~danny0405] Can you assign this ticket to me? > Que

[GitHub] [hudi] zhangyue19921010 commented on pull request #4459: [HUDI-3116]Add a new HoodieDropPartitionsTool to let users drop table partitions through a standalone job.

2022-01-17 Thread GitBox
zhangyue19921010 commented on pull request #4459: URL: https://github.com/apache/hudi/pull/4459#issuecomment-1015099938 > this might have to hold until we cleanly fix #4489 . We are discussing on how to do delete partition cleanly here. cleaner has to take care of deleting the partition di

[jira] [Updated] (HUDI-3261) Query rt table by hive cli throw NoSuchMethodError

2022-01-17 Thread Echo Lee (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Echo Lee updated HUDI-3261: --- Description: When query the MOR table synchronized from hudi to hive, the following exception is thrown:  

[GitHub] [hudi] hudi-bot removed a comment on pull request #4523: [WIP][HUDI-3173] Add INDEX action type and corresponding commit metadata

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4523: URL: https://github.com/apache/hudi/pull/4523#issuecomment-1015080375 ## CI report: * 6c1c19492e82d894d20095e1b5038d29fd2d3322 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4523: [WIP][HUDI-3173] Add INDEX action type and corresponding commit metadata

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4523: URL: https://github.com/apache/hudi/pull/4523#issuecomment-1015117561 ## CI report: * bae79196fe6e9e2674ebdfa3d0ecdb176e85af79 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] XuQianJin-Stars commented on issue #4299: [SUPPORT] Upsert performance decreased after 3 years of data loading

2022-01-17 Thread GitBox
XuQianJin-Stars commented on issue #4299: URL: https://github.com/apache/hudi/issues/4299#issuecomment-1015119793 > @XuQianJin-Stars can you help investigate this issue please? thank you. well, I will follow this question later.

[GitHub] [hudi] chrischnweiss commented on issue #4585: Target Schema cannot be set in MultiTableDeltaStreamer

2022-01-17 Thread GitBox
chrischnweiss commented on issue #4585: URL: https://github.com/apache/hudi/issues/4585#issuecomment-1015125841 Hey @nsivabalan, unfortunately our Kafka topic naming schema makes it impossible for us to use it this way. Cheers, Christian

[GitHub] [hudi] hudi-bot removed a comment on pull request #4624: [HUDI-3261] Query rt table by hive cli throw NoSuchMethodError

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4624: URL: https://github.com/apache/hudi/pull/4624#issuecomment-1015092655 ## CI report: * 315c4c0092660cc10b4dadb833456283cebe6979 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4624: [HUDI-3261] Query rt table by hive cli throw NoSuchMethodError

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4624: URL: https://github.com/apache/hudi/pull/4624#issuecomment-1015126541 ## CI report: * 315c4c0092660cc10b4dadb833456283cebe6979 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[jira] [Created] (HUDI-3263) Do not nullify members in HoodieTableFileSystemView#resetViewState to avoid NPE

2022-01-17 Thread Danny Chen (Jira)
Danny Chen created HUDI-3263: Summary: Do not nullify members in HoodieTableFileSystemView#resetViewState to avoid NPE Key: HUDI-3263 URL: https://issues.apache.org/jira/browse/HUDI-3263 Project: Apache H

[jira] [Updated] (HUDI-3263) Do not nullify members in HoodieTableFileSystemView#resetViewState to avoid NPE

2022-01-17 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-3263: - Attachment: 1.png > Do not nullify members in HoodieTableFileSystemView#resetViewState to avoid > NPE > -

[jira] [Updated] (HUDI-3263) Do not nullify members in HoodieTableFileSystemView#resetViewState to avoid NPE

2022-01-17 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-3263: - Attachment: 2.png > Do not nullify members in HoodieTableFileSystemView#resetViewState to avoid > NPE > -

[GitHub] [hudi] wangxianghu merged pull request #4620: [MINOR] Minor improvement in JsonkafkaSource

2022-01-17 Thread GitBox
wangxianghu merged pull request #4620: URL: https://github.com/apache/hudi/pull/4620 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsub

[hudi] branch master updated (f184474 -> 3d93e85)

2022-01-17 Thread wangxianghu
This is an automated email from the ASF dual-hosted git repository. wangxianghu pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from f184474 [HUDI-1558] Struct Stream Source Support Spark3 (#4586) add 3d93e85 [MINOR] Minor improvement in Js

[GitHub] [hudi] danny0405 opened a new pull request #4625: [HUDI-3263] Do not nullify members in HoodieTableFileSystemView#reset…

2022-01-17 Thread GitBox
danny0405 opened a new pull request #4625: URL: https://github.com/apache/hudi/pull/4625 …ViewState to avoid NPE ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull reques

[jira] [Updated] (HUDI-3263) Do not nullify members in HoodieTableFileSystemView#resetViewState to avoid NPE

2022-01-17 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3263: - Labels: pull-request-available sev:normal (was: sev:normal) > Do not nullify members in HoodieTab

[GitHub] [hudi] hudi-bot commented on pull request #4078: [HUDI-2833][Design] Merge small archive files instead of expanding indefinitely.

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4078: URL: https://github.com/apache/hudi/pull/4078#issuecomment-1015132977 ## CI report: * 8f8ae385baf21dacd4b9fedd3670133160001dc0 UNKNOWN * 019e161bb908731244e13cdf36d12781956f0114 UNKNOWN * c36aac530d3350857fb01df858d0f26c123e5766 UNKN

[GitHub] [hudi] hudi-bot removed a comment on pull request #4078: [HUDI-2833][Design] Merge small archive files instead of expanding indefinitely.

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4078: URL: https://github.com/apache/hudi/pull/4078#issuecomment-1009938542 ## CI report: * 8f8ae385baf21dacd4b9fedd3670133160001dc0 UNKNOWN * 019e161bb908731244e13cdf36d12781956f0114 UNKNOWN * c36aac530d3350857fb01df858d0f26c123e5

[GitHub] [hudi] hudi-bot commented on pull request #4625: [HUDI-3263] Do not nullify members in HoodieTableFileSystemView#reset…

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4625: URL: https://github.com/apache/hudi/pull/4625#issuecomment-1015133370 ## CI report: * 31a00a1d995612cc616eab9df6c03b5fff87f098 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[GitHub] [hudi] hudi-bot removed a comment on pull request #4625: [HUDI-3263] Do not nullify members in HoodieTableFileSystemView#reset…

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4625: URL: https://github.com/apache/hudi/pull/4625#issuecomment-1015133370 ## CI report: * 31a00a1d995612cc616eab9df6c03b5fff87f098 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run

[GitHub] [hudi] hudi-bot commented on pull request #4625: [HUDI-3263] Do not nullify members in HoodieTableFileSystemView#reset…

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4625: URL: https://github.com/apache/hudi/pull/4625#issuecomment-1015135059 ## CI report: * 31a00a1d995612cc616eab9df6c03b5fff87f098 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] ChangbingChen commented on issue #4618: [SUPPORT] When querying a hudi table in hive, there have duplicated records.

2022-01-17 Thread GitBox
ChangbingChen commented on issue #4618: URL: https://github.com/apache/hudi/issues/4618#issuecomment-1015144555 Sorry, it doesn't work either. i query the xxx_ro table, the inputformat should be org.apache.hudi.hadoop.HoodieParquetInputFormat? By the way, there are

[GitHub] [hudi] guyuqi opened a new pull request #4617: HUDI-1657: build failed on AArch64, Fedora 33

2022-01-17 Thread GitBox
guyuqi opened a new pull request #4617: URL: https://github.com/apache/hudi/pull/4617 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the

[GitHub] [hudi] 7c00 commented on a change in pull request #4563: [HUDI-3211][RFC-44] Add RFC for Hudi Connector for Presto

2022-01-17 Thread GitBox
7c00 commented on a change in pull request #4563: URL: https://github.com/apache/hudi/pull/4563#discussion_r785717040 ## File path: rfc/rfc-44/rfc-44.md ## @@ -0,0 +1,156 @@ + + +# RFC-44: Hudi Connector for Presto + +## Proposers + +- @7c00 + +## Approvers + +- @ + +## Status

[jira] [Updated] (HUDI-1657) build failed on AArch64, Fedora 33

2022-01-17 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1657: - Labels: pull-request-available sev:triage user-support-issues (was: sev:triage user-support-issue

[jira] [Commented] (HUDI-1657) build failed on AArch64, Fedora 33

2022-01-17 Thread Yuqi Gu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477014#comment-17477014 ] Yuqi Gu commented on HUDI-1657: --- Hudi failed to build on Arm64. Protobuf did not support Ar

[GitHub] [hudi] hudi-bot commented on pull request #4617: HUDI-1657: build failed on AArch64, Fedora 33

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4617: URL: https://github.com/apache/hudi/pull/4617#issuecomment-1014242906 ## CI report: * bc16dc89ea6b1827ac7eb4cd8a2ef4661646e1eb UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[GitHub] [hudi] guyuqi commented on pull request #4617: HUDI-1657: build failed on AArch64, Fedora 33

2022-01-17 Thread GitBox
guyuqi commented on pull request #4617: URL: https://github.com/apache/hudi/pull/4617#issuecomment-1014243001 Successfully build Hudi on Arm64 Fedora33/Ubuntu20: ``` [INFO] Dependency-reduced POM written at: /home/builder/hudi/packaging/hudi-kafka-connect-bundle/target/dependency-reduc

[GitHub] [hudi] hudi-bot commented on pull request #4616: [HUDI-3257] Excluding clustering instants from pending rollback info

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4616: URL: https://github.com/apache/hudi/pull/4616#issuecomment-1014245666 ## CI report: * a4173a03fceab5ba874ebd9e86aba63e005d640d UNKNOWN * 79ce599b8833c1606bdcb463ab74f21bbd01c521 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org

[GitHub] [hudi] hudi-bot removed a comment on pull request #4617: HUDI-1657: build failed on AArch64, Fedora 33

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4617: URL: https://github.com/apache/hudi/pull/4617#issuecomment-1014242906 ## CI report: * bc16dc89ea6b1827ac7eb4cd8a2ef4661646e1eb UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run

[GitHub] [hudi] hudi-bot commented on pull request #4617: HUDI-1657: build failed on AArch64, Fedora 33

2022-01-17 Thread GitBox
hudi-bot commented on pull request #4617: URL: https://github.com/apache/hudi/pull/4617#issuecomment-1014245703 ## CI report: * bc16dc89ea6b1827ac7eb4cd8a2ef4661646e1eb Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4616: [HUDI-3257] Excluding clustering instants from pending rollback info

2022-01-17 Thread GitBox
hudi-bot removed a comment on pull request #4616: URL: https://github.com/apache/hudi/pull/4616#issuecomment-1014197006 ## CI report: * a4173a03fceab5ba874ebd9e86aba63e005d640d UNKNOWN * 79ce599b8833c1606bdcb463ab74f21bbd01c521 Azure: [PENDING](https://dev.azure.com/apache-hud

[GitHub] [hudi] ChangbingChen opened a new issue #4618: [SUPPORT] When querying a hudi table in hive, there have duplicated records.

2022-01-17 Thread GitBox
_timestamp(date_format(update_time, '-MM-dd HH:mm:ss'))*1000 ,date_format(create_time, 'MMdd') as dt from mysql_table_kafka; 2.query in beeline select m ,sum(case when cnt=1 then 1 else 0 end) as one_cnt ,sum(case when cnt=2 then 1 else 0 end)

[GitHub] [hudi] 7c00 commented on pull request #4563: [HUDI-3211][RFC-44] Add RFC for Hudi Connector for Presto

2022-01-17 Thread GitBox
7c00 commented on pull request #4563: URL: https://github.com/apache/hudi/pull/4563#issuecomment-1014264458 @agrawaldevesh For the first question, users will need to change their queries to use the hudi connector at first. For example, change `select * from hive.schema.hudi_table`

[GitHub] [hudi] codope commented on a change in pull request #4588: [HUDI-3072] Fixing conflict resolution in transaction management code path for auto commit code path

2022-01-17 Thread GitBox
codope commented on a change in pull request #4588: URL: https://github.com/apache/hudi/pull/4588#discussion_r785749484 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/TestHoodieClientMultiWriter.java ## @@ -441,6 +445,135 @@ public void testH
