[GitHub] [hudi] xushiyan commented on a change in pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-01-08 Thread GitBox
xushiyan commented on a change in pull request #4489: URL: https://github.com/apache/hudi/pull/4489#discussion_r780645945 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/SparkDeletePartitionCommitActionExecutor.java ## @@ -42,27 +4

[GitHub] [hudi] Guanpx commented on issue #4537: [SUPPORT] spark 2.4.0 write data to hudi ERROR (0.10.0)

2022-01-08 Thread GitBox
Guanpx commented on issue #4537: URL: https://github.com/apache/hudi/issues/4537#issuecomment-1007919256 Duplicate of # -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [hudi] Guanpx closed issue #4537: [SUPPORT] spark 2.4.0 write data to hudi ERROR (0.10.0)

2022-01-08 Thread GitBox
Guanpx closed issue #4537: URL: https://github.com/apache/hudi/issues/4537

[GitHub] [hudi] Guanpx edited a comment on issue #4537: [SUPPORT] spark 2.4.0 write data to hudi ERROR (0.10.0)

2022-01-08 Thread GitBox
Guanpx edited a comment on issue #4537: URL: https://github.com/apache/hudi/issues/4537#issuecomment-1007919256 Duplicate of #4539

[GitHub] [hudi] XuQianJin-Stars commented on a change in pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-01-08 Thread GitBox
XuQianJin-Stars commented on a change in pull request #4489: URL: https://github.com/apache/hudi/pull/4489#discussion_r780649867 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/SparkDeletePartitionCommitActionExecutor.java ## @@ -4

[GitHub] [hudi] hudi-bot removed a comment on pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1007880055 ## CI report: * 7ca735c33a6ac260b763a0ac82215cb9fe99fc4f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1007921813 ## CI report: * 7ca735c33a6ac260b763a0ac82215cb9fe99fc4f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[jira] [Updated] (HUDI-110) Better defaults for Partition extractor for Spark DataSource and DeltaStreamer

2022-01-08 Thread Forward Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Forward Xu updated HUDI-110: Summary: Better defaults for Partition extractor for Spark DataSource and DeltaStreamer (was: Better default

[GitHub] [hudi] hudi-bot commented on pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1007922264 ## CI report: * 7ca735c33a6ac260b763a0ac82215cb9fe99fc4f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1007921813 ## CI report: * 7ca735c33a6ac260b763a0ac82215cb9fe99fc4f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot removed a comment on pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1003690548

[GitHub] [hudi] xushiyan merged pull request #4538: [HUDI-3195] optimize spark3 pom and modify build command

2022-01-08 Thread GitBox
xushiyan merged pull request #4538: URL: https://github.com/apache/hudi/pull/4538

[GitHub] [hudi] hudi-bot commented on pull request #4234: [HUDI-2950] Addressing performance traps in Bulk Insert/Layout Optimization

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4234: URL: https://github.com/apache/hudi/pull/4234#issuecomment-1007699087

[GitHub] [hudi] hudi-bot removed a comment on pull request #4451: [HUDI-3104] Kafka-connect support hadoop config environments and properties

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4451: URL: https://github.com/apache/hudi/pull/4451#issuecomment-1001525163

[GitHub] [hudi] hudi-bot removed a comment on pull request #4538: [HUDI-3195] optimize spark3 pom and modify build command

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4538: URL: https://github.com/apache/hudi/pull/4538#issuecomment-1007510466

[GitHub] [hudi] nsivabalan commented on pull request #4514: [HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation

2022-01-08 Thread GitBox
nsivabalan commented on pull request #4514: URL: https://github.com/apache/hudi/pull/4514#issuecomment-1007379499 yeah, would really appreciate if we can wait until Jan 9 to land this patch. thanks!

[GitHub] [hudi] hudi-bot commented on pull request #4536: [HUDI-3185] HoodieConfig#getBoolean should return false when default …

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4536: URL: https://github.com/apache/hudi/pull/4536#issuecomment-1007382826

[GitHub] [hudi] hudi-bot removed a comment on pull request #4514: [HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4514: URL: https://github.com/apache/hudi/pull/4514#issuecomment-1006428061

[GitHub] [hudi] alexeykudinkin commented on pull request #4531: [HUDI-3191][Stacked on 4520] Rebasing Hive's FileInputFormat onto `AbstractHoodieTableFileIndex`

2022-01-08 Thread GitBox
alexeykudinkin commented on pull request #4531: URL: https://github.com/apache/hudi/pull/4531#issuecomment-1007887286 @hudi-bot run azure

[GitHub] [hudi] codope commented on pull request #4203: [HUDI-2909] Handle logical type in TimestampBasedKeyGenerator

2022-01-08 Thread GitBox
codope commented on pull request #4203: URL: https://github.com/apache/hudi/pull/4203#issuecomment-1007218764 @hudi-bot run azure

[GitHub] [hudi] Guanpx edited a comment on issue #4539: [SUPPORT] spark 2.4.0 write data to hudi ERROR (0.10.0)

2022-01-08 Thread GitBox
Guanpx edited a comment on issue #4539: URL: https://github.com/apache/hudi/issues/4539#issuecomment-1007514186 #3191

[GitHub] [hudi] hudi-bot commented on pull request #4203: [HUDI-2909] Handle logical type in TimestampBasedKeyGenerator

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4203: URL: https://github.com/apache/hudi/pull/4203#issuecomment-1007219440

[GitHub] [hudi] vinothchandar edited a comment on pull request #4514: [HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation

2022-01-08 Thread GitBox
vinothchandar edited a comment on pull request #4514: URL: https://github.com/apache/hudi/pull/4514#issuecomment-1007172166 @leesf few questions. 1. What do the hudi-spark2-extensions/hudi-spark3-extensions do? What code would these have in the future? 2. Users may have spark

[GitHub] [hudi] nsivabalan commented on pull request #4515: [HUDI-3158] Reduce warn logs in Spark SQL INSERT OVERWRITE

2022-01-08 Thread GitBox
nsivabalan commented on pull request #4515: URL: https://github.com/apache/hudi/pull/4515#issuecomment-1007390563 @codope : Can you review this PR. May be there could be some follow ups in other places where we do create similar empty replace commit instants. I

[GitHub] [hudi] nsivabalan commented on pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-01-08 Thread GitBox
nsivabalan commented on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1007436610

[GitHub] [hudi] leesf commented on pull request #4514: [HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation

2022-01-08 Thread GitBox
leesf commented on pull request #4514: URL: https://github.com/apache/hudi/pull/4514#issuecomment-1007441054 > @leesf few questions. > > 1. What do the hudi-spark2-extensions/hudi-spark3-extensions do? What code would these have in the future? > 2. Users may have spark jobs that d

[GitHub] [hudi] hudi-bot commented on pull request #4535: [HUDI-3161] Add Call Produce Command for spark sql

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4535: URL: https://github.com/apache/hudi/pull/4535#issuecomment-1007371885

[GitHub] [hudi] hudi-bot removed a comment on pull request #3946: [HUDI-2711] Fallback to fulltable scan for IncrementalRelation if underlying files have been cleared or moved by cleaner

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #3946: URL: https://github.com/apache/hudi/pull/3946#issuecomment-1007015749

[GitHub] [hudi] hudi-bot removed a comment on pull request #4485: [HUDI-2947] Fixing checkpoint fetch in detlastreamer

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4485: URL: https://github.com/apache/hudi/pull/4485#issuecomment-1003399247

[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4533: [HUDI-2682] Spark schema not updated with new columns on hive sync

2022-01-08 Thread GitBox
xiarixiaoyao commented on a change in pull request #4533: URL: https://github.com/apache/hudi/pull/4533#discussion_r780270395 ## File path: hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java ## @@ -251,10 +251,8 @@ private boolean syncSchema(String t

[GitHub] [hudi] jsbali edited a comment on pull request #3946: [HUDI-2711] Fallback to fulltable scan for IncrementalRelation if underlying files have been cleared or moved by cleaner

2022-01-08 Thread GitBox
jsbali edited a comment on pull request #3946: URL: https://github.com/apache/hudi/pull/3946#issuecomment-1007552923 Adding more context here ![Screenshot 2022-01-07 at 8 52 02 PM](https://user-images.githubusercontent.com/1778470/148575437-1bcacaef-deac-4ec4-b76d-0524798116e5.png) Le

[GitHub] [hudi] hudi-bot commented on pull request #4534: [MINOR] fix typos in DDLExecutor

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4534: URL: https://github.com/apache/hudi/pull/4534#issuecomment-1007263210

[GitHub] [hudi] jsbali commented on pull request #3946: [HUDI-2711] Fallback to fulltable scan for IncrementalRelation if underlying files have been cleared or moved by cleaner

2022-01-08 Thread GitBox
jsbali commented on pull request #3946: URL: https://github.com/apache/hudi/pull/3946#issuecomment-1007552923 Adding more context here ![Screenshot 2022-01-07 at 8 52 02 PM](https://user-images.githubusercontent.com/1778470/148575437-1bcacaef-deac-4ec4-b76d-0524798116e5.png) Since par

[GitHub] [hudi] YannByron edited a comment on pull request #4538: [HUDI-3195] optimize spark3 pom and modify build command

2022-01-08 Thread GitBox
YannByron edited a comment on pull request #4538: URL: https://github.com/apache/hudi/pull/4538#issuecomment-1007859312

[GitHub] [hudi] XuQianJin-Stars commented on pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-01-08 Thread GitBox
XuQianJin-Stars commented on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1007438849

[GitHub] [hudi] dongkelun commented on pull request #4533: [HUDI-3192] Spark metastore schema evolution broken

2022-01-08 Thread GitBox
dongkelun commented on pull request #4533: URL: https://github.com/apache/hudi/pull/4533#issuecomment-1007187752

[GitHub] [hudi] Limess commented on issue #4525: [SUPPORT] Spark metastore schema evolution broken

2022-01-08 Thread GitBox
Limess commented on issue #4525: URL: https://github.com/apache/hudi/issues/4525#issuecomment-1007247570 I think this is the same issue I raised on JIRA previously https://issues.apache.org/jira/browse/HUDI-2682

[GitHub] [hudi] alexeykudinkin commented on pull request #4234: [HUDI-2950] Addressing performance traps in Bulk Insert/Layout Optimization

2022-01-08 Thread GitBox
alexeykudinkin commented on pull request #4234: URL: https://github.com/apache/hudi/pull/4234#issuecomment-1007698786 @hudi-bot run azure

[GitHub] [hudi] nsivabalan commented on a change in pull request #4485: [HUDI-2947] Fixing checkpoint fetch in detlastreamer

2022-01-08 Thread GitBox
nsivabalan commented on a change in pull request #4485: URL: https://github.com/apache/hudi/pull/4485#discussion_r780253945 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java ## @@ -471,6 +468,22 @@ public void refreshTimeline() t

[GitHub] [hudi] xushiyan commented on issue #4474: [SUPPORT] Should we shade all aws dependencies to avoid class conflicts?

2022-01-08 Thread GitBox
xushiyan commented on issue #4474: URL: https://github.com/apache/hudi/issues/4474#issuecomment-1007188747 After some discussions, we think that we should keep cloud provider's jars out of open source bundle jars. Any cloud provider can create its own specific hudi module and hudi bundle j

[GitHub] [hudi] parisni commented on issue #4525: [SUPPORT] Spark metastore schema evolution broken

2022-01-08 Thread GitBox
parisni commented on issue #4525: URL: https://github.com/apache/hudi/issues/4525#issuecomment-1007436024 @Limess @nsivabalan the issue is independent from hudi I guess. CF my [spark-user mail](https://lists.apache.org/thread/9mmrnc5o7w42z723s2yqgcrdpwwtts3x)

[GitHub] [hudi] codope merged pull request #4485: [HUDI-2947] Fixing checkpoint fetch in detlastreamer

2022-01-08 Thread GitBox
codope merged pull request #4485: URL: https://github.com/apache/hudi/pull/4485

[GitHub] [hudi] YannByron commented on pull request #4538: [HUDI-3195] optimize spark3 pom and modify build command

2022-01-08 Thread GitBox
YannByron commented on pull request #4538: URL: https://github.com/apache/hudi/pull/4538#issuecomment-1007509024

[GitHub] [hudi] hudi-bot commented on pull request #4532: [Minor]Fix some code style based on check-sytle plugin

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4532: URL: https://github.com/apache/hudi/pull/4532#issuecomment-1007171945 ## CI report: * 59e48916ff71ca86523e40558a2e15418a3fffcb Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] nsivabalan commented on pull request #4519: [HUDI-3180] Include files from completed commits while bootstrapping metadata table

2022-01-08 Thread GitBox
nsivabalan commented on pull request #4519: URL: https://github.com/apache/hudi/pull/4519#issuecomment-1007437335 @manojpec : can you review the patch please.

[GitHub] [hudi] hudi-bot commented on pull request #4538: [HUDI-3195] optimize spark3 pom and modify build command

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4538: URL: https://github.com/apache/hudi/pull/4538#issuecomment-1007510466

[jira] [Updated] (HUDI-3183) Fix wrong result of HoodieArchivedTimeline loadInstants with TimeRangeFilter

2022-01-08 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-3183: - Fix Version/s: 0.11.0 > Fix wrong result of HoodieArchivedTimeline loadInstants with TimeRangeFilter > ---

[GitHub] [hudi] hudi-bot commented on pull request #4514: [HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4514: URL: https://github.com/apache/hudi/pull/4514#issuecomment-1007440136

[GitHub] [hudi] jardel-lima edited a comment on issue #3879: [SUPPORT] Incomplete Table Migration

2022-01-08 Thread GitBox
jardel-lima edited a comment on issue #3879: URL: https://github.com/apache/hudi/issues/3879#issuecomment-1007434803

[GitHub] [hudi] hudi-bot removed a comment on pull request #4536: [HUDI-3185] HoodieConfig#getBoolean should return false when default …

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4536: URL: https://github.com/apache/hudi/pull/4536#issuecomment-1007382826

[GitHub] [hudi] hudi-bot removed a comment on pull request #4533: [HUDI-3192] Spark metastore schema evolution broken

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4533: URL: https://github.com/apache/hudi/pull/4533#issuecomment-1007166026

[GitHub] [hudi] nsivabalan commented on pull request #4530: [HUDI-3178] Fixing metadata table compaction so as to not include uncommitted data

2022-01-08 Thread GitBox
nsivabalan commented on pull request #4530: URL: https://github.com/apache/hudi/pull/4530#issuecomment-1007385719 @manojpec : Can you review the patch too.

[GitHub] [hudi] hudi-bot commented on pull request #4520: [HUDI-3179][Stacked on 4417] Extracted common `AbstractHoodieTableFileIndex` to be shared across engines

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4520: URL: https://github.com/apache/hudi/pull/4520#issuecomment-1007706224

[GitHub] [hudi] nsivabalan commented on issue #4525: [SUPPORT] Spark metastore schema evolution broken

2022-01-08 Thread GitBox
nsivabalan commented on issue #4525: URL: https://github.com/apache/hudi/issues/4525#issuecomment-1007391256 Closing this as we have a tracking jira and a patch put up on this end. https://github.com/apache/hudi/pull/4533 thanks for reporting.

[GitHub] [hudi] jardel-lima commented on issue #3879: [SUPPORT] Incomplete Table Migration

2022-01-08 Thread GitBox
jardel-lima commented on issue #3879: URL: https://github.com/apache/hudi/issues/3879#issuecomment-1007434803 Hi @nsivabalan. [HERE](https://drive.google.com/file/d/1RsesivvlLUZ9dZh7WbaGJJpnqIcWNbso/view?usp=sharing) is the dataset used to replicate this problem. The file is not public

[GitHub] [hudi] dongkelun commented on a change in pull request #4533: [HUDI-2682] Spark schema not updated with new columns on hive sync

2022-01-08 Thread GitBox
dongkelun commented on a change in pull request #4533: URL: https://github.com/apache/hudi/pull/4533#discussion_r780276868 ## File path: hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java ## @@ -251,10 +251,8 @@ private boolean syncSchema(String tabl

[GitHub] [hudi] hudi-bot removed a comment on pull request #4535: [HUDI-3161] Add Call Produce Command for spark sql

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4535: URL: https://github.com/apache/hudi/pull/4535#issuecomment-1007371885

[GitHub] [hudi] hudi-bot commented on pull request #4495: [HUDI-3139] Shade htrace and parquet-avro in presto bundle

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4495: URL: https://github.com/apache/hudi/pull/4495#issuecomment-1007407141

[GitHub] [hudi] nsivabalan closed issue #4525: [SUPPORT] Spark metastore schema evolution broken

2022-01-08 Thread GitBox
nsivabalan closed issue #4525: URL: https://github.com/apache/hudi/issues/4525

[GitHub] [hudi] YannByron commented on pull request #4471: [HUDI-3125] spark-sql write timestamp directly

2022-01-08 Thread GitBox
YannByron commented on pull request #4471: URL: https://github.com/apache/hudi/pull/4471#issuecomment-1007894563 > LGTM. would be nice to have some UT case covering the nested val case @xushiyan Have added UT for this.

[GitHub] [hudi] nsivabalan merged pull request #4534: [MINOR] fix typos in DDLExecutor

2022-01-08 Thread GitBox
nsivabalan merged pull request #4534: URL: https://github.com/apache/hudi/pull/4534

[GitHub] [hudi] nsivabalan merged pull request #4527: [HUDI-3188] Update quick start guide for Kafka Connect Sink for Hudi

2022-01-08 Thread GitBox
nsivabalan merged pull request #4527: URL: https://github.com/apache/hudi/pull/4527

[GitHub] [hudi] hudi-bot removed a comment on pull request #4203: [HUDI-2909] Handle logical type in TimestampBasedKeyGenerator

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4203: URL: https://github.com/apache/hudi/pull/4203#issuecomment-1007148515

[GitHub] [hudi] vinothchandar commented on pull request #4514: [HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation

2022-01-08 Thread GitBox
vinothchandar commented on pull request #4514: URL: https://github.com/apache/hudi/pull/4514#issuecomment-1007172166 @leesf few questions. 1. What do the hudi-spark2-extensions/hudi-spark3-extensions do? What code would these have in the future? 2. Users may have spark jobs t

[GitHub] [hudi] hudi-bot removed a comment on pull request #4495: [HUDI-3139] Shade htrace and parquet-avro in presto bundle

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4495: URL: https://github.com/apache/hudi/pull/4495#issuecomment-1003993031

[GitHub] [hudi] xushiyan merged pull request #4440: [HUDI-3100] Add config for hive conditional sync

2022-01-08 Thread GitBox
xushiyan merged pull request #4440: URL: https://github.com/apache/hudi/pull/4440

[GitHub] [hudi] nsivabalan commented on pull request #4421: [HUDI-3096] fixed the bug that the cow table(contains decimalType) wriite by flink cannot be read by spark.

2022-01-08 Thread GitBox
nsivabalan commented on pull request #4421: URL: https://github.com/apache/hudi/pull/4421#issuecomment-1007755381 @xiarixiaoyao @danny0405 : Let me know if we are targeting this for 0.10.1. If yes, can you folks follow up and get it to completion. thanks for the cooperation.

[GitHub] [hudi] hudi-bot commented on pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1007458568

[GitHub] [hudi] hudi-bot commented on pull request #4451: [HUDI-3104] Kafka-connect support hadoop config environments and properties

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4451: URL: https://github.com/apache/hudi/pull/4451#issuecomment-1007480064

[GitHub] [hudi] xiarixiaoyao commented on pull request #4421: [HUDI-3096] fixed the bug that the cow table(contains decimalType) wriite by flink cannot be read by spark.

2022-01-08 Thread GitBox
xiarixiaoyao commented on pull request #4421: URL: https://github.com/apache/hudi/pull/4421#issuecomment-1007873879 @nsivabalan I think we can put it in the next version. We may need to test this PR with more versions of Hive/Presto. Thanks.

[GitHub] [hudi] hudi-bot commented on pull request #4485: [HUDI-2947] Fixing checkpoint fetch in detlastreamer

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4485: URL: https://github.com/apache/hudi/pull/4485#issuecomment-1007433563

[GitHub] [hudi] nsivabalan merged pull request #4536: [HUDI-3185] HoodieConfig#getBoolean should return false when default …

2022-01-08 Thread GitBox
nsivabalan merged pull request #4536: URL: https://github.com/apache/hudi/pull/4536

[GitHub] [hudi] nsivabalan commented on pull request #4533: [HUDI-3192] Spark metastore schema evolution broken

2022-01-08 Thread GitBox
nsivabalan commented on pull request #4533: URL: https://github.com/apache/hudi/pull/4533#issuecomment-1007388310

[GitHub] [hudi] nsivabalan commented on pull request #4485: [HUDI-2947] Fixing checkpoint fetch in detlastreamer

2022-01-08 Thread GitBox
nsivabalan commented on pull request #4485: URL: https://github.com/apache/hudi/pull/4485#issuecomment-1007430788 @codope : addressed all comments

[GitHub] [hudi] hudi-bot removed a comment on pull request #4234: [HUDI-2950] Addressing performance traps in Bulk Insert/Layout Optimization

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4234: URL: https://github.com/apache/hudi/pull/4234#issuecomment-1006160112

[GitHub] [hudi] parisni commented on pull request #4533: [HUDI-2682] Spark schema not updated with new columns on hive sync

2022-01-08 Thread GitBox
parisni commented on pull request #4533: URL: https://github.com/apache/hudi/pull/4533#issuecomment-1007442525

[GitHub] [hudi] hudi-bot commented on pull request #4531: [WIP][HUDI-3191][Stacked on 4520] Rebasing Hive's FileInputFormat onto `AbstractHoodieTableFileIndex`

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4531: URL: https://github.com/apache/hudi/pull/4531#issuecomment-1007706281

[GitHub] [hudi] hudi-bot removed a comment on pull request #4534: [MINOR] fix typos in DDLExecutor

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4534: URL: https://github.com/apache/hudi/pull/4534#issuecomment-1007263210

[GitHub] [hudi] xiarixiaoyao commented on pull request #4533: [HUDI-2682] Spark schema not updated with new columns on hive sync

2022-01-08 Thread GitBox
xiarixiaoyao commented on pull request #4533: URL: https://github.com/apache/hudi/pull/4533#issuecomment-1007472831

[GitHub] [hudi] cdmikechen commented on a change in pull request #4451: [HUDI-3104] Kafka-connect support hadoop config environments and properties

2022-01-08 Thread GitBox
cdmikechen commented on a change in pull request #4451: URL: https://github.com/apache/hudi/pull/4451#discussion_r780322456 ## File path: hudi-kafka-connect/src/main/java/org/apache/hudi/connect/writers/KafkaConnectConfigs.java ## @@ -93,6 +93,17 @@ .defaultValue(true)

[GitHub] [hudi] danny0405 commented on issue #4508: [SUPPORT]Duplicate Flink Hudi data

2022-01-08 Thread GitBox
danny0405 commented on issue #4508: URL: https://github.com/apache/hudi/issues/4508#issuecomment-1007233408 I said you should use the right code version.

[GitHub] [hudi] manojpec commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-01-08 Thread GitBox
manojpec commented on a change in pull request #4352: URL: https://github.com/apache/hudi/pull/4352#discussion_r780117989 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java ## @@ -111,13 +124,14 @@ public HoodieBloomInd

[GitHub] [hudi] dongkelun edited a comment on pull request #4533: [HUDI-2682] Spark schema not updated with new columns on hive sync

2022-01-08 Thread GitBox
dongkelun edited a comment on pull request #4533: URL: https://github.com/apache/hudi/pull/4533#issuecomment-1007410108 > @dongkelun : can you please check if [HUDI-3192](https://issues.apache.org/jira/browse/HUDI-3192) and https://issues.apache.org/jira/browse/HUDI-2682 are duplicates. if

[GitHub] [hudi] hudi-bot commented on pull request #3946: [HUDI-2711] Fallback to fulltable scan for IncrementalRelation if underlying files have been cleared or moved by cleaner

2022-01-08 Thread GitBox
hudi-bot commented on pull request #3946: URL: https://github.com/apache/hudi/pull/3946#issuecomment-1007548455

[GitHub] [hudi] hudi-bot removed a comment on pull request #4471: [HUDI-3125] spark-sql write timestamp directly

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4471: URL: https://github.com/apache/hudi/pull/4471#issuecomment-1006360315

[GitHub] [hudi] hudi-bot commented on pull request #4471: [HUDI-3125] spark-sql write timestamp directly

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4471: URL: https://github.com/apache/hudi/pull/4471#issuecomment-1007886735

[GitHub] [hudi] Guanpx commented on issue #4539: [SUPPORT] spark 2.4.0 write data to hudi ERROR (0.10.0)

2022-01-08 Thread GitBox
Guanpx commented on issue #4539: URL: https://github.com/apache/hudi/issues/4539#issuecomment-1007514186 #3191

[GitHub] [hudi] hudi-bot removed a comment on pull request #4532: [Minor]Fix some code style based on check-sytle plugin

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4532: URL: https://github.com/apache/hudi/pull/4532#issuecomment-1007152604 ## CI report: * 59e48916ff71ca86523e40558a2e15418a3fffcb Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4533: [HUDI-3192] Spark metastore schema evolution broken

2022-01-08 Thread GitBox
hudi-bot commented on pull request #4533: URL: https://github.com/apache/hudi/pull/4533#issuecomment-1007186261

[GitHub] [hudi] hudi-bot removed a comment on pull request #4531: [WIP][HUDI-3191][Stacked on 4520] Rebasing Hive's FileInputFormat onto `AbstractHoodieTableFileIndex`

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4531: URL: https://github.com/apache/hudi/pull/4531#issuecomment-1007147660

[GitHub] [hudi] XuQianJin-Stars commented on a change in pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-01-08 Thread GitBox
XuQianJin-Stars commented on a change in pull request #4489: URL: https://github.com/apache/hudi/pull/4489#discussion_r780649867 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/SparkDeletePartitionCommitActionExecutor.java ## @@ -4

[GitHub] [hudi] prashantwason commented on a change in pull request #4530: [HUDI-3178] Fixing metadata table compaction so as to not include uncommitted data

2022-01-08 Thread GitBox
prashantwason commented on a change in pull request #4530: URL: https://github.com/apache/hudi/pull/4530#discussion_r780621073 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java ## @@ -689,7 +689,7 @@ protec

[GitHub] [hudi] xiarixiaoyao merged pull request #4533: [HUDI-2682] Spark schema not updated with new columns on hive sync

2022-01-08 Thread GitBox
xiarixiaoyao merged pull request #4533: URL: https://github.com/apache/hudi/pull/4533

[GitHub] [hudi] hudi-bot removed a comment on pull request #4520: [HUDI-3179][Stacked on 4417] Extracted common `AbstractHoodieTableFileIndex` to be shared across engines

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4520: URL: https://github.com/apache/hudi/pull/4520#issuecomment-1007154511

[GitHub] [hudi] danny0405 commented on issue #4238: [SUPPORT]Merge cow partitioned files manually

2022-01-08 Thread GitBox
danny0405 commented on issue #4238: URL: https://github.com/apache/hudi/issues/4238#issuecomment-1007948170 Yes, we can use clustering first, then bootstrap the table with Flink streaming. The clustering feature is on the way for the Flink side.

[GitHub] [hudi] hudi-bot removed a comment on pull request #4287: [DO NOT MERGE] 0.10.0 release patch for flink

2022-01-08 Thread GitBox
hudi-bot removed a comment on pull request #4287: URL: https://github.com/apache/hudi/pull/4287#issuecomment-1006286289 ## CI report: * 5b7a535559d80359a3febc2d1a80bf9a8ac20cf9 UNKNOWN * 5b9130b16d5931b0031bfc2c6fc051d03fa4f49b Azure: [FAILURE](https://dev.azure.com/apache-hud
