[GitHub] [hudi] nsivabalan commented on pull request #4420: [HUDI-1847] Adding inline scheduling support for spark datasource path for compaction and clustering

2022-02-08 Thread GitBox
nsivabalan commented on pull request #4420: URL: https://github.com/apache/hudi/pull/4420#issuecomment-1032315323 @danny0405 : addressed your feedback -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[GitHub] [hudi] hudi-bot commented on pull request #4420: [HUDI-1847] Adding inline scheduling support for spark datasource path for compaction and clustering

2022-02-08 Thread GitBox
hudi-bot commented on pull request #4420: URL: https://github.com/apache/hudi/pull/4420#issuecomment-1032316246 ## CI report: * 48fd6ddd0a36975f7e065ae3c4c4635422def062 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4420: [HUDI-1847] Adding inline scheduling support for spark datasource path for compaction and clustering

2022-02-08 Thread GitBox
hudi-bot removed a comment on pull request #4420: URL: https://github.com/apache/hudi/pull/4420#issuecomment-1028987325 ## CI report: * 48fd6ddd0a36975f7e065ae3c4c4635422def062 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] pratyakshsharma commented on pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
pratyakshsharma commented on pull request #3646: URL: https://github.com/apache/hudi/pull/3646#issuecomment-1032316498 @nsivabalan @prashantwason Working on this.

[GitHub] [hudi] jaxonzhang commented on pull request #4420: [HUDI-1847] Adding inline scheduling support for spark datasource path for compaction and clustering

2022-02-08 Thread GitBox
jaxonzhang commented on pull request #4420: URL: https://github.com/apache/hudi/pull/4420#issuecomment-1032327356 Hi @nsivabalan, if users have a separate job to execute the already scheduled compaction plans, does that job have to configure OPTIMISTIC CONCURRENCY (lock provider)? If not, wha

[GitHub] [hudi] pratyakshsharma commented on a change in pull request #4385: [HUDI-1436]: provided option to trigger clean every nth commit

2022-02-08 Thread GitBox
pratyakshsharma commented on a change in pull request #4385: URL: https://github.com/apache/hudi/pull/4385#discussion_r801363902 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/TestCleaner.java ## @@ -1148,7 +1167,9 @@ public void testKeepLatest

[GitHub] [hudi] pratyakshsharma commented on pull request #4385: [HUDI-1436]: provided option to trigger clean every nth commit

2022-02-08 Thread GitBox
pratyakshsharma commented on pull request #4385: URL: https://github.com/apache/hudi/pull/4385#issuecomment-1032331957 @nsivabalan Apologies for the delay. I have asked for a few clarifications. Let us land this one once the questions are addressed.

[GitHub] [hudi] hudi-bot commented on pull request #4385: [HUDI-1436]: provided option to trigger clean every nth commit

2022-02-08 Thread GitBox
hudi-bot commented on pull request #4385: URL: https://github.com/apache/hudi/pull/4385#issuecomment-1032334310 ## CI report: * 63dde3aeba8a92dd0e0616ed6e7269ffd59d72b7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4385: [HUDI-1436]: provided option to trigger clean every nth commit

2022-02-08 Thread GitBox
hudi-bot removed a comment on pull request #4385: URL: https://github.com/apache/hudi/pull/4385#issuecomment-1019544968 ## CI report: * 63dde3aeba8a92dd0e0616ed6e7269ffd59d72b7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] pratyakshsharma edited a comment on pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
pratyakshsharma edited a comment on pull request #3646: URL: https://github.com/apache/hudi/pull/3646#issuecomment-1032316498 @nsivabalan @prashantwason I had been waiting for @vinothchandar 's review on this. Let me address the existing comments so we can land this.

[GitHub] [hudi] pengzhiwei2018 closed pull request #4613: [HUDI-2283] Support Clustering Command For Spark Sql

2022-02-08 Thread GitBox
pengzhiwei2018 closed pull request #4613: URL: https://github.com/apache/hudi/pull/4613

[GitHub] [hudi] pratyakshsharma commented on a change in pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
pratyakshsharma commented on a change in pull request #3646: URL: https://github.com/apache/hudi/pull/3646#discussion_r801372069 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java ## @@ -330,6 +349,19 @@ public Clean

[GitHub] [hudi] hudi-bot removed a comment on pull request #3614: [HUDI-2370] Supports data encryption

2022-02-08 Thread GitBox
hudi-bot removed a comment on pull request #3614: URL: https://github.com/apache/hudi/pull/3614#issuecomment-1031057925 ## CI report: * f85aeac825205ef91e31ca4a12183c1501d12d9d UNKNOWN * a4688a962fedeeab27ce030396ce86622e6083d2 UNKNOWN * 26c72bc763d3364c8ce6a39b62942090416ee

[GitHub] [hudi] hudi-bot commented on pull request #3614: [HUDI-2370] Supports data encryption

2022-02-08 Thread GitBox
hudi-bot commented on pull request #3614: URL: https://github.com/apache/hudi/pull/3614#issuecomment-1032340022 ## CI report: * f85aeac825205ef91e31ca4a12183c1501d12d9d UNKNOWN * a4688a962fedeeab27ce030396ce86622e6083d2 UNKNOWN * 26c72bc763d3364c8ce6a39b62942090416eee83 Azur

[GitHub] [hudi] pratyakshsharma commented on a change in pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
pratyakshsharma commented on a change in pull request #3646: URL: https://github.com/apache/hudi/pull/3646#discussion_r801372791 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java ## @@ -402,9 +436,16 @@ private Stri

[GitHub] [hudi] yihua commented on pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-08 Thread GitBox
yihua commented on pull request #4721: URL: https://github.com/apache/hudi/pull/4721#issuecomment-1032340685 CI passes. ![1C1874F4-B53E-4979-A1A7-B9D2895F2861](https://user-images.githubusercontent.com/2497195/152947621-4614b4b3-cfe8-45aa-b05b-0235526c6589.jpeg)

[GitHub] [hudi] pratyakshsharma commented on a change in pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
pratyakshsharma commented on a change in pull request #3646: URL: https://github.com/apache/hudi/pull/3646#discussion_r801373570 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java ## @@ -402,9 +436,16 @@ private Stri

[GitHub] [hudi] yihua merged pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-08 Thread GitBox
yihua merged pull request #4721: URL: https://github.com/apache/hudi/pull/4721

[hudi] branch master updated (0ab1a8e -> 1636876)

2022-02-08 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 0ab1a8e [HUDI-3312] Fixing spark yaml and adding hive validation to integ test suite (#4731) add 1636876 [HUDI-3

[GitHub] [hudi] zhangyue19921010 commented on pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-08 Thread GitBox
zhangyue19921010 commented on pull request #4721: URL: https://github.com/apache/hudi/pull/4721#issuecomment-1032342450 Thanks a lot for your help @yihua and @nsivabalan :)

[GitHub] [hudi] hudi-bot removed a comment on pull request #4753: [HUDI-3370] The files recorded in the commit may not match the actual ones for MOR Compaction

2022-02-08 Thread GitBox
hudi-bot removed a comment on pull request #4753: URL: https://github.com/apache/hudi/pull/4753#issuecomment-1030840032 ## CI report: * c8198192cdeeff6062c07877c2aa71129313 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4753: [HUDI-3370] The files recorded in the commit may not match the actual ones for MOR Compaction

2022-02-08 Thread GitBox
hudi-bot commented on pull request #4753: URL: https://github.com/apache/hudi/pull/4753#issuecomment-1032365874 ## CI report: * c8198192cdeeff6062c07877c2aa71129313 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot commented on pull request #4480: [HUDI-3123] consistent hashing index: basic write path (upsert/insert)

2022-02-08 Thread GitBox
hudi-bot commented on pull request #4480: URL: https://github.com/apache/hudi/pull/4480#issuecomment-1032376929 ## CI report: * c4a2ace5e28fafb29394a1448e1a6c2a0645dda9 UNKNOWN * 4314dccc4029df3e638d974e4d380dec73bbad31 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org

[GitHub] [hudi] hudi-bot removed a comment on pull request #4480: [HUDI-3123] consistent hashing index: basic write path (upsert/insert)

2022-02-08 Thread GitBox
hudi-bot removed a comment on pull request #4480: URL: https://github.com/apache/hudi/pull/4480#issuecomment-1009591034 ## CI report: * c4a2ace5e28fafb29394a1448e1a6c2a0645dda9 UNKNOWN * 4314dccc4029df3e638d974e4d380dec73bbad31 Azure: [SUCCESS](https://dev.azure.com/apache-hud

[GitHub] [hudi] codope merged pull request #4762: [HUDI-3367] Adding support for custom scheduler configs with streaming sink

2022-02-08 Thread GitBox
codope merged pull request #4762: URL: https://github.com/apache/hudi/pull/4762

[hudi] branch master updated (1636876 -> ab73047)

2022-02-08 Thread codope
This is an automated email from the ASF dual-hosted git repository. codope pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 1636876 [HUDI-3320] Hoodie metadata table validator (#4721) add ab73047 Adding support for custom scheduler conf

[GitHub] [hudi] codope merged pull request #4659: [HUDI-3091] Making SIMPLE index as the default index type

2022-02-08 Thread GitBox
codope merged pull request #4659: URL: https://github.com/apache/hudi/pull/4659

[hudi] branch master updated (ab73047 -> 6a32cfe)

2022-02-08 Thread codope
This is an automated email from the ASF dual-hosted git repository. codope pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from ab73047 Adding support for custom scheduler configs with streaming sink (#4762) add 6a32cfe [HUDI-3091] Making

[GitHub] [hudi] codope commented on a change in pull request #4681: [HUDI-2987] Update all deprecated calls to new apis in HoodieRecordPayload

2022-02-08 Thread GitBox
codope commented on a change in pull request #4681: URL: https://github.com/apache/hudi/pull/4681#discussion_r801430779 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/debezium/PostgresDebeziumAvroPayload.java ## @@ -71,6 +72,19 @@ protected boolean should

[GitHub] [hudi] zhangyue19921010 commented on pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-02-08 Thread GitBox
zhangyue19921010 commented on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1032400658 > @zhangyue19921010 : you are bringing up a good point. if I am not wrong, you are talking about a scenario, where someone triggered delete_partition for partition X and

[GitHub] [hudi] pratyakshsharma commented on a change in pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
pratyakshsharma commented on a change in pull request #3646: URL: https://github.com/apache/hudi/pull/3646#discussion_r801435126 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java ## @@ -299,14 +308,24 @@ public Clea

[GitHub] [hudi] pratyakshsharma commented on a change in pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
pratyakshsharma commented on a change in pull request #3646: URL: https://github.com/apache/hudi/pull/3646#discussion_r801446438 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java ## @@ -76,6 +76,11 @@ .withDocu

[GitHub] [hudi] pratyakshsharma commented on a change in pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
pratyakshsharma commented on a change in pull request #3646: URL: https://github.com/apache/hudi/pull/3646#discussion_r801451351 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/TestCleaner.java ## @@ -1267,6 +1271,154 @@ public void testKeepLate

[jira] [Created] (HUDI-3389) Bump flink version to 1.14.3

2022-02-08 Thread Danny Chen (Jira)
Danny Chen created HUDI-3389: Summary: Bump flink version to 1.14.3 Key: HUDI-3389 URL: https://issues.apache.org/jira/browse/HUDI-3389 Project: Apache Hudi Issue Type: Task Components:

[GitHub] [hudi] hudi-bot removed a comment on pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
hudi-bot removed a comment on pull request #3646: URL: https://github.com/apache/hudi/pull/3646#issuecomment-997409117 ## CI report: * 676d9996fea7dec928df3ef578d907ddde76d4f8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/re

[GitHub] [hudi] hudi-bot commented on pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
hudi-bot commented on pull request #3646: URL: https://github.com/apache/hudi/pull/3646#issuecomment-1032432658 ## CI report: * 676d9996fea7dec928df3ef578d907ddde76d4f8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[jira] [Created] (HUDI-3390) Update cleaner blog with KEEP_LATEST_BY_HOURS policy

2022-02-08 Thread Pratyaksh Sharma (Jira)
Pratyaksh Sharma created HUDI-3390: -- Summary: Update cleaner blog with KEEP_LATEST_BY_HOURS policy Key: HUDI-3390 URL: https://issues.apache.org/jira/browse/HUDI-3390 Project: Apache Hudi Is

[GitHub] [hudi] pratyakshsharma commented on a change in pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
pratyakshsharma commented on a change in pull request #3646: URL: https://github.com/apache/hudi/pull/3646#discussion_r801464832 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java ## @@ -76,6 +76,11 @@ .withDocu

[GitHub] [hudi] pratyakshsharma commented on pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
pratyakshsharma commented on pull request #3646: URL: https://github.com/apache/hudi/pull/3646#issuecomment-1032446215 Not sure why Hudi bot is showing FAILURE.

[GitHub] [hudi] pratyakshsharma commented on pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
pratyakshsharma commented on pull request #3646: URL: https://github.com/apache/hudi/pull/3646#issuecomment-1032446874 @hudi-bot run azure

[GitHub] [hudi] pratyakshsharma commented on pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

2022-02-08 Thread GitBox
pratyakshsharma commented on pull request #3646: URL: https://github.com/apache/hudi/pull/3646#issuecomment-1032447339 @nsivabalan Please take a pass.

[GitHub] [hudi] XuQianJin-Stars commented on pull request #4752: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-02-08 Thread GitBox
XuQianJin-Stars commented on pull request #4752: URL: https://github.com/apache/hudi/pull/4752#issuecomment-1032466415 @hudi-bot run azure

[GitHub] [hudi] XuQianJin-Stars removed a comment on pull request #4752: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-02-08 Thread GitBox
XuQianJin-Stars removed a comment on pull request #4752: URL: https://github.com/apache/hudi/pull/4752#issuecomment-1032466415 @hudi-bot run azure

[GitHub] [hudi] hudi-bot commented on pull request #4693: [WIP][HUDI-3175][RFC-45] Implement async metadata indexing

2022-02-08 Thread GitBox
hudi-bot commented on pull request #4693: URL: https://github.com/apache/hudi/pull/4693#issuecomment-1032499400 ## CI report: * 06c6dd9db383efa291c999d5f0140e5d2493eeaf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4693: [WIP][HUDI-3175][RFC-45] Implement async metadata indexing

2022-02-08 Thread GitBox
hudi-bot removed a comment on pull request #4693: URL: https://github.com/apache/hudi/pull/4693#issuecomment-1029330925 ## CI report: * 06c6dd9db383efa291c999d5f0140e5d2493eeaf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[jira] [Closed] (HUDI-382) Move TimestampBasedKeyGenerator to hudi-spark module from hudi-utilities

2022-02-08 Thread Pratyaksh Sharma (Jira)
[ https://issues.apache.org/jira/browse/HUDI-382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pratyaksh Sharma closed HUDI-382. - Resolution: Fixed This key generator is a part of hudi-spark-client now. > Move TimestampBasedKeyG

[jira] [Updated] (HUDI-3264) Make schema registry configs more flexible with MultiTableDeltaStreamer

2022-02-08 Thread Pratyaksh Sharma (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pratyaksh Sharma updated HUDI-3264: --- Status: In Progress (was: Open) > Make schema registry configs more flexible with MultiTableD

[jira] [Assigned] (HUDI-3264) Make schema registry configs more flexible with MultiTableDeltaStreamer

2022-02-08 Thread Pratyaksh Sharma (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pratyaksh Sharma reassigned HUDI-3264: -- Assignee: Pratyaksh Sharma > Make schema registry configs more flexible with MultiTable

[GitHub] [hudi] zhangyue19921010 commented on pull request #4753: [HUDI-3370] The files recorded in the commit may not match the actual ones for MOR Compaction

2022-02-08 Thread GitBox
zhangyue19921010 commented on pull request #4753: URL: https://github.com/apache/hudi/pull/4753#issuecomment-1032578196 @hudi-bot run azure

[GitHub] [hudi] zhangyue19921010 removed a comment on pull request #4753: [HUDI-3370] The files recorded in the commit may not match the actual ones for MOR Compaction

2022-02-08 Thread GitBox
zhangyue19921010 removed a comment on pull request #4753: URL: https://github.com/apache/hudi/pull/4753#issuecomment-1032578196 @hudi-bot run azure

[GitHub] [hudi] zhangyue19921010 commented on pull request #4753: [HUDI-3370] The files recorded in the commit may not match the actual ones for MOR Compaction

2022-02-08 Thread GitBox
zhangyue19921010 commented on pull request #4753: URL: https://github.com/apache/hudi/pull/4753#issuecomment-1032588005 Hi there. I am trying to `find all such places for getting write stats, to eliminate all similar problems`, but it seems that the new commit did not trigger a CI run. Cou

[GitHub] [hudi] FelixKJose commented on issue #4719: Spark Structured Streaming Continuous Mode Failed with HoodieMetadataException

2022-02-08 Thread GitBox
FelixKJose commented on issue #4719: URL: https://github.com/apache/hudi/issues/4719#issuecomment-1032606949 Yes, my is related to #4206.

[GitHub] [hudi] FelixKJose edited a comment on issue #4719: Spark Structured Streaming Continuous Mode Failed with HoodieMetadataException

2022-02-08 Thread GitBox
FelixKJose edited a comment on issue #4719: URL: https://github.com/apache/hudi/issues/4719#issuecomment-1032606949 Yes, my issue is related to #4206.

[GitHub] [hudi] xushiyan commented on a change in pull request #4752: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-02-08 Thread GitBox
xushiyan commented on a change in pull request #4752: URL: https://github.com/apache/hudi/pull/4752#discussion_r801636606 ## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/helpers/TestKafkaOffsetGen.java ## @@ -164,10 +163,17 @@ public void testGetNe

[GitHub] [hudi] nsivabalan commented on pull request #4420: [HUDI-1847] Adding inline scheduling support for spark datasource path for compaction and clustering

2022-02-08 Thread GitBox
nsivabalan commented on pull request #4420: URL: https://github.com/apache/hudi/pull/4420#issuecomment-1032651666 @jaxonzhang: No, it is not required. A lock provider is required if scheduling is also done asynchronously. Scheduling is the critical section here, where no concurrent writers should

[GitHub] [hudi] zhangyue19921010 opened a new pull request #4764: [TEST]Test

2022-02-08 Thread GitBox
zhangyue19921010 opened a new pull request #4764: URL: https://github.com/apache/hudi/pull/4764 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is th

[GitHub] [hudi] zhangyue19921010 closed pull request #4764: [TEST]Test

2022-02-08 Thread GitBox
zhangyue19921010 closed pull request #4764: URL: https://github.com/apache/hudi/pull/4764

[jira] [Updated] (HUDI-3358) Investigate and fix hive query validation in integ test suite

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3358: -- Status: Patch Available (was: In Progress) > Investigate and fix hive query validation

[GitHub] [hudi] zhangyue19921010 opened a new pull request #4765: [TEST] Hudi-3370 test

2022-02-08 Thread GitBox
zhangyue19921010 opened a new pull request #4765: URL: https://github.com/apache/hudi/pull/4765 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is th

[jira] [Closed] (HUDI-3358) Investigate and fix hive query validation in integ test suite

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan closed HUDI-3358. - Resolution: Fixed > Investigate and fix hive query validation in integ test suite > --

[jira] [Updated] (HUDI-3359) On-call GH Issue triaging and community PR reviews, Slack support (Jan31)

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3359: -- Status: Open (was: In Progress) > On-call GH Issue triaging and community PR reviews, S

[jira] [Updated] (HUDI-3091) Make simple index as the default hoodie.index.type

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3091: -- Status: Patch Available (was: In Progress) > Make simple index as the default hoodie.in

[jira] [Closed] (HUDI-3359) On-call GH Issue triaging and community PR reviews, Slack support (Jan31)

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan closed HUDI-3359. - Resolution: Not A Bug > On-call GH Issue triaging and community PR reviews, Slack support

[jira] [Updated] (HUDI-83) Map Timestamp type in spark to corresponding Timestamp type in Hive during Hive sync

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-83?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-83: Sprint: Cont' improve - 2021/01/24, Cont' improve - 2021/01/31 (was: Cont' improve - 2021/

[jira] [Closed] (HUDI-3091) Make simple index as the default hoodie.index.type

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan closed HUDI-3091. - Resolution: Fixed > Make simple index as the default hoodie.index.type > -

[GitHub] [hudi] zhangyue19921010 commented on pull request #4765: [TEST] Hudi-3370 test

2022-02-08 Thread GitBox
zhangyue19921010 commented on pull request #4765: URL: https://github.com/apache/hudi/pull/4765#issuecomment-1032664183 Since https://github.com/apache/hudi/pull/4753 didn't trigger the CI job, I am raising another PR to run the tests.

[jira] [Updated] (HUDI-2941) Show _hoodie_operation in spark sql results

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2941: -- Status: Patch Available (was: In Progress) > Show _hoodie_operation in spark sql result

[jira] [Closed] (HUDI-2941) Show _hoodie_operation in spark sql results

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan closed HUDI-2941. - Resolution: Fixed > Show _hoodie_operation in spark sql results >

[GitHub] [hudi] hudi-bot commented on pull request #4765: [TEST] Hudi-3370 test

2022-02-08 Thread GitBox
hudi-bot commented on pull request #4765: URL: https://github.com/apache/hudi/pull/4765#issuecomment-1032665632 ## CI report: * d59d02fe852356e15c7ca48461540e5152d92ad6 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[GitHub] [hudi] nsivabalan commented on issue #4719: Spark Structured Streaming Continuous Mode Failed with HoodieMetadataException

2022-02-08 Thread GitBox
nsivabalan commented on issue #4719: URL: https://github.com/apache/hudi/issues/4719#issuecomment-1032673537 thanks for confirming. will go ahead and close the issue.

[GitHub] [hudi] nsivabalan closed issue #4719: Spark Structured Streaming Continuous Mode Failed with HoodieMetadataException

2022-02-08 Thread GitBox
nsivabalan closed issue #4719: URL: https://github.com/apache/hudi/issues/4719

[GitHub] [hudi] nsivabalan commented on issue #4690: write hudi mor table always encounter FileNotFoundException hdfs://ns/...0220126193401513.parquet

2022-02-08 Thread GitBox
nsivabalan commented on issue #4690: URL: https://github.com/apache/hudi/issues/4690#issuecomment-1032674381 thanks!

[GitHub] [hudi] nsivabalan commented on issue #4678: [SUPPORT] spark.read.format("hudi").schema(userSpecifiedSchema) doesn't work in version 0.10.0 ,but does work in 0.5.3

2022-02-08 Thread GitBox
nsivabalan commented on issue #4678: URL: https://github.com/apache/hudi/issues/4678#issuecomment-1032675989 @wjcwin: it would help us if you could respond to Yann's question above (code, etc.), since Yann is not able to reproduce the issue.

[GitHub] [hudi] stayrascal commented on a change in pull request #4724: [HUDI-2815] add partial overwrite payload to support partial overwrit…

2022-02-08 Thread GitBox
stayrascal commented on a change in pull request #4724: URL: https://github.com/apache/hudi/pull/4724#discussion_r801702428 ## File path: hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/table/action/commit/FlinkWriteHelper.java ## @@ -105,7 +105,7 @@ public static

[GitHub] [hudi] nsivabalan edited a comment on issue #3834: [SUPPORT] - AWS Athena snapshot query fails if there are two or more record array fields in a MoR table

2022-02-08 Thread GitBox
nsivabalan edited a comment on issue #3834: URL: https://github.com/apache/hudi/issues/3834#issuecomment-997307677 Closing the GitHub issue as it has been root-caused to the parquet upgrade. Feel free to follow the jira for updates. We are looking to get the parquet upgrade for 1.11.0. thanks f

[jira] [Updated] (HUDI-3387) Enable async timeline server by default

2022-02-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3387: - Remaining Estimate: 1h Original Estimate: 1h > Enable async timeline server by default >

[jira] [Updated] (HUDI-1976) Upgrade hive, jackson, log4j, hadoop to remove vulnerability

2022-02-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1976: - Sprint: Cont' improve - 2022/02/07 > Upgrade hive, jackson, log4j, hadoop to remove vulnerability > -

[jira] [Updated] (HUDI-349) Make cleaner retention based on time period to account for higher deviations in ingestion runs

2022-02-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-349: Sprint: Cont' improve - 2022/02/07 > Make cleaner retention based on time period to account for higher devia

[jira] [Updated] (HUDI-2413) Sql source in delta streamer does not work

2022-02-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2413: - Sprint: Cont' improve - 2022/02/07 > Sql source in delta streamer does not work > ---

[jira] [Updated] (HUDI-2875) Concurrent call to HoodieMergeHandler cause parquet corruption

2022-02-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2875: - Sprint: Cont' improve - 2022/02/07 > Concurrent call to HoodieMergeHandler cause parquet corruption > ---

[jira] [Created] (HUDI-3391) presto and hive beeline fails to read MOR table w/ 2 or more array fields

2022-02-08 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-3391: - Summary: presto and hive beeline fails to read MOR table w/ 2 or more array fields Key: HUDI-3391 URL: https://issues.apache.org/jira/browse/HUDI-3391 Proje

[jira] [Updated] (HUDI-3391) presto and hive beeline fails to read MOR table w/ 2 or more array fields

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3391: -- Priority: Critical (was: Major) > presto and hive beeline fails to read MOR table w/ 2

[jira] [Updated] (HUDI-3391) presto and hive beeline fails to read MOR table w/ 2 or more array fields

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3391: -- Fix Version/s: 0.11.0 > presto and hive beeline fails to read MOR table w/ 2 or more arr

[jira] [Updated] (HUDI-1436) Provide Option to run auto clean every nth commit.

2022-02-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1436: - Sprint: Cont' improve - 2022/02/07 > Provide Option to run auto clean every nth commit. > --

[jira] [Assigned] (HUDI-3391) presto and hive beeline fails to read MOR table w/ 2 or more array fields

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-3391: - Assignee: sivabalan narayanan > presto and hive beeline fails to read MOR table w

[jira] [Updated] (HUDI-3391) presto and hive beeline fails to read MOR table w/ 2 or more array fields

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3391: -- Sprint: Cont' improve - 2022/02/07 > presto and hive beeline fails to read MOR table w/

[jira] [Updated] (HUDI-3026) HoodieAppendhandle may result in duplicate key for hbase index

2022-02-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3026: - Sprint: Cont' improve - 2022/02/07 > HoodieAppendhandle may result in duplicate key for hbase index > ---

[jira] [Updated] (HUDI-3362) Hudi 0.8.0 cannot rollback CoW table

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3362: -- Summary: Hudi 0.8.0 cannot rollback CoW table (was: Hudi 0.8.0 cannot roleback CoW tabl

[jira] [Updated] (HUDI-3085) Refactor fileId & writeHandler logic into partitioner for bulk_insert

2022-02-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3085: - Sprint: Cont' improve - 2022/02/07 > Refactor fileId & writeHandler logic into partitioner for bulk_inser

[jira] [Updated] (HUDI-3214) [UMBRELLA] optimize auto partition in spark

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3214: -- Sprint: Cont' improve - 2022/02/07 > [UMBRELLA] optimize auto partition in spark >

[jira] [Updated] (HUDI-3201) Make partition auto discovery configurable

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3201: -- Sprint: Cont' improve - 2022/02/07 > Make partition auto discovery configurable > -

[GitHub] [hudi] stayrascal commented on a change in pull request #4724: [HUDI-2815] add partial overwrite payload to support partial overwrit…

2022-02-08 Thread GitBox
stayrascal commented on a change in pull request #4724: URL: https://github.com/apache/hudi/pull/4724#discussion_r801731047 ## File path: hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/BucketAssignFunction.java ## @@ -141,27 +142,41 @@ public void snapshotState(Func

[GitHub] [hudi] nsivabalan commented on pull request #4749: Set hoodie.parquet.outputtimestamptype to TIMESTAMP_MICROS by default

2022-02-08 Thread GitBox
nsivabalan commented on pull request #4749: URL: https://github.com/apache/hudi/pull/4749#issuecomment-1032708222 can we file a jira please as we try to get consensus -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

[GitHub] [hudi] stayrascal commented on a change in pull request #4724: [HUDI-2815] add partial overwrite payload to support partial overwrit…

2022-02-08 Thread GitBox
stayrascal commented on a change in pull request #4724: URL: https://github.com/apache/hudi/pull/4724#discussion_r801734866 ## File path: hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/table/action/commit/FlinkWriteHelper.java ## @@ -105,7 +105,7 @@ public static

[jira] [Updated] (HUDI-1657) build failed on AArch64, Fedora 33

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1657: -- Fix Version/s: 0.11.0 > build failed on AArch64, Fedora 33 > --

[jira] [Updated] (HUDI-1657) build failed on AArch64, Fedora 33

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1657: -- Sprint: Cont' improve - 2022/02/07 > build failed on AArch64, Fedora 33 >

[GitHub] [hudi] nsivabalan commented on pull request #4589: [MINOR] Fix the check condition in the `readFromVector` method to alway true

2022-02-08 Thread GitBox
nsivabalan commented on pull request #4589: URL: https://github.com/apache/hudi/pull/4589#issuecomment-1032712988 Can we please file a jira. I see lot of discussions going on. So, may not be trivial. -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [hudi] nsivabalan commented on pull request #4557: [WIP] Allow pass rollbackUsingMarkers to Hudi CLI rollback command

2022-02-08 Thread GitBox
nsivabalan commented on pull request #4557: URL: https://github.com/apache/hudi/pull/4557#issuecomment-1032713384 can we please file a jira. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[jira] [Updated] (HUDI-3130) Hive read fails when different partitions have different schemas

2022-02-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3130: -- Sprint: Cont' improve - 2022/02/07 > Hive read fails when different partitions have dif
