[jira] [Updated] (HUDI-3259) Code Refactor: Common prep records commit util for Spark and Flink

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3259: - Labels: (was: sev:normal) > Code Refactor: Common prep records commit util for Spark and Flink > ---

[jira] [Updated] (HUDI-3259) Code Refactor: Common prep records commit util for Spark and Flink

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3259: - Issue Type: Improvement (was: Task) > Code Refactor: Common prep records commit util for Spark and Flink

[jira] [Updated] (HUDI-3317) Partition specific pointed lookup/reading strategy for metadata table

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3317: - Story Points: 4 (was: 2) > Partition specific pointed lookup/reading strategy for metadata table > --

[jira] [Updated] (HUDI-3167) Update RFC27 with the design for the new HoodieIndex type based on metadata indices

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3167: - Priority: Critical (was: Blocker) > Update RFC27 with the design for the new HoodieIndex type based on me

[jira] [Commented] (HUDI-3167) Update RFC27 with the design for the new HoodieIndex type based on metadata indices

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17498750#comment-17498750 ] Raymond Xu commented on HUDI-3167: -- Put on hold until new DAG is implemented. > Update R

[jira] [Updated] (HUDI-3167) Update RFC27 with the design for the new HoodieIndex type based on metadata indices

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3167: - Component/s: docs > Update RFC27 with the design for the new HoodieIndex type based on metadata > indices

[jira] [Updated] (HUDI-3167) Update RFC27 with the design for the new HoodieIndex type based on metadata indices

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3167: - Story Points: 1 (was: 2) > Update RFC27 with the design for the new HoodieIndex type based on metadata >

[jira] [Updated] (HUDI-2657) Make inlining configurable based on diff use-case.

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2657: - Component/s: metadata > Make inlining configurable based on diff use-case. >

[jira] [Created] (HUDI-3525) Introduce JsonkafkaSourceProcessor to support data preprocess before it is transformed to DataSet

2022-02-28 Thread Xianghu Wang (Jira)
Xianghu Wang created HUDI-3525: -- Summary: Introduce JsonkafkaSourceProcessor to support data preprocess before it is transformed to DataSet Key: HUDI-3525 URL: https://issues.apache.org/jira/browse/HUDI-3525

[jira] [Updated] (HUDI-2657) Make inlining configurable based on diff use-case.

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2657: - Issue Type: New Feature (was: Task) > Make inlining configurable based on diff use-case. > -

[GitHub] [hudi] hudi-bot removed a comment on pull request #4910: [RFC-33] [HUDI-2429][Stacked on HUDI-2560] Support full Schema evolution for Spark

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4910: URL: https://github.com/apache/hudi/pull/4910#issuecomment-1053981830 ## CI report: * 52be34d7d5e025180415c46e64a3e2145c29e498 UNKNOWN * 78e86dd1953cc4d6bf10ca808a7bcffe22b4b587 UNKNOWN * 4ff0d57275e8f907d945c60bd93c2bef227c7

[GitHub] [hudi] hudi-bot commented on pull request #4910: [RFC-33] [HUDI-2429][Stacked on HUDI-2560] Support full Schema evolution for Spark

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4910: URL: https://github.com/apache/hudi/pull/4910#issuecomment-1053990045 ## CI report: * 52be34d7d5e025180415c46e64a3e2145c29e498 UNKNOWN * 78e86dd1953cc4d6bf10ca808a7bcffe22b4b587 UNKNOWN * 4ff0d57275e8f907d945c60bd93c2bef227c7c3d Azur

[jira] [Updated] (HUDI-2199) DynamoDB based external index implementation

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2199: - Issue Type: New Feature (was: Task) > DynamoDB based external index implementation >

[jira] [Updated] (HUDI-2657) Make inlining configurable based on diff use-case.

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2657: - Story Points: 2 > Make inlining configurable based on diff use-case. > --

[GitHub] [hudi] hudi-bot removed a comment on pull request #4848: [HUDI-3356][HUDI-3203] HoodieData for metadata index records, bloom and colstats init

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4848: URL: https://github.com/apache/hudi/pull/4848#issuecomment-1053957438 ## CI report: * 48399d1f4e5fc3acf04ded4e9ed6e1fbfb34aebd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4848: [HUDI-3356][HUDI-3203] HoodieData for metadata index records, bloom and colstats init

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4848: URL: https://github.com/apache/hudi/pull/4848#issuecomment-1053992276 ## CI report: * bf80ef66675695d0cbc6eff541226e09567b6e51 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[jira] [Updated] (HUDI-2700) Metadata based bloom index - PoC

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-2700: -- Priority: Blocker (was: Critical) > Metadata based bloom index - PoC >

[jira] [Updated] (HUDI-3525) Introduce JsonkafkaSourceProcessor to support data preprocess before it is transformed to DataSet

2022-02-28 Thread Xianghu Wang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xianghu Wang updated HUDI-3525: --- Description: currently we have `Transform` to transform source to target dataset before writing, but

[jira] [Assigned] (HUDI-3258) Support multiple metadata index partitions - bloom and column stats

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit reassigned HUDI-3258: - Assignee: Sagar Sumit (was: Manoj Govindassamy) > Support multiple metadata index partitions - b

[jira] [Updated] (HUDI-3166) Implement new HoodieIndex based on metadata indices

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3166: -- Priority: Critical (was: Blocker) > Implement new HoodieIndex based on metadata indices >

[jira] [Assigned] (HUDI-2973) Rewrite/re-publish RFC for Data skipping index

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit reassigned HUDI-2973: - Assignee: Sagar Sumit (was: Manoj Govindassamy) > Rewrite/re-publish RFC for Data skipping index

[jira] [Updated] (HUDI-2705) Metadata based column stats index - PoC

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-2705: -- Priority: Blocker (was: Critical) > Metadata based column stats index - PoC > -

[jira] [Updated] (HUDI-1951) Hash Index for HUDI

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-1951: -- Epic Link: HUDI-3000 (was: HUDI-1822) > Hash Index for HUDI > --- > > K

[jira] [Commented] (HUDI-3525) Introduce JsonkafkaSourceProcessor to support data preprocess before it is transformed to DataSet

2022-02-28 Thread Xianghu Wang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17498756#comment-17498756 ] Xianghu Wang commented on HUDI-3525: hi [~shivnarayan], [~xushiyan] any ideas about th

[jira] [Closed] (HUDI-2705) Metadata based column stats index - PoC

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-2705. - Resolution: Done > Metadata based column stats index - PoC > --- > >

[jira] [Commented] (HUDI-2700) Metadata based bloom index - PoC

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17498758#comment-17498758 ] Sagar Sumit commented on HUDI-2700: --- Covered in https://github.com/apache/hudi/pull/4352

[jira] [Closed] (HUDI-2700) Metadata based bloom index - PoC

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-2700. - Resolution: Done > Metadata based bloom index - PoC > > >

[jira] [Updated] (HUDI-3258) Support multiple metadata index partitions - bloom and column stats

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3258: -- Status: Patch Available (was: In Progress) > Support multiple metadata index partitions - bloom and col

[GitHub] [hudi] hudi-bot removed a comment on pull request #4924: [WIP][CI Test Only - 2][HUDI-1180] Upgrade HBase to 2.4.9

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4924: URL: https://github.com/apache/hudi/pull/4924#issuecomment-1053951773 ## CI report: * a9c0bcb8a33a8c046732a4aa135423f84f65235d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4924: [WIP][CI Test Only - 2][HUDI-1180] Upgrade HBase to 2.4.9

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4924: URL: https://github.com/apache/hudi/pull/4924#issuecomment-1054004151 ## CI report: * a9c0bcb8a33a8c046732a4aa135423f84f65235d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[jira] [Closed] (HUDI-3203) Meta bloom index should use the bloom filter type property to construct back the bloom filter instant

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3203. - Resolution: Won't Do > Meta bloom index should use the bloom filter type property to construct back > the

[jira] [Closed] (HUDI-3142) Metadata new Indices initialization during table creation

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3142. - Resolution: Won't Do > Metadata new Indices initialization during table creation > --

[jira] [Closed] (HUDI-1492) Enhance DeltaWriteStat with block level metadata correctly for storage schemes that support appends

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-1492. - Resolution: Won't Do > Enhance DeltaWriteStat with block level metadata correctly for storage > schemes t

[jira] [Closed] (HUDI-2584) Unit tests for bloom filter index based out of metadata table.

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-2584. - Resolution: Won't Do > Unit tests for bloom filter index based out of metadata table. > -

[jira] [Closed] (HUDI-3356) Conversion of write stats to metadata index records should use HoodieData throughout

2022-02-28 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3356. - Resolution: Won't Do > Conversion of write stats to metadata index records should use HoodieData > throug

[jira] [Updated] (HUDI-3364) Support column stats indexing for subset of columns

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3364: - Sprint: Hudi-Sprint-Mar-01 > Support column stats indexing for subset of columns > ---

[jira] [Updated] (HUDI-3374) metadata index for secondary keys

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3374: - Sprint: Hudi-Sprint-Mar-01 > metadata index for secondary keys > - > >

[jira] [Assigned] (HUDI-3368) Support metadata bloom index for secondary keys

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-3368: Assignee: Sagar Sumit (was: Manoj Govindassamy) > Support metadata bloom index for secondary keys

[jira] [Updated] (HUDI-3451) Add checks for metadata table init to avoid possible out-of-sync

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3451: - Story Points: 2 > Add checks for metadata table init to avoid possible out-of-sync > -

[jira] [Updated] (HUDI-3451) Add checks for metadata table init to avoid possible out-of-sync

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3451: - Sprint: Hudi-Sprint-Mar-01 > Add checks for metadata table init to avoid possible out-of-sync > --

[GitHub] [hudi] danny0405 commented on pull request #4909: [HUDI-3516] Implement record iterator for HoodieDataBlock

2022-02-28 Thread GitBox
danny0405 commented on pull request #4909: URL: https://github.com/apache/hudi/pull/4909#issuecomment-1054031304 Thanks for the contribution, I have reviewed and supplied a patch :) [HUDI-3516.patch.zip](https://github.com/apache/hudi/files/8151892/HUDI-3516.patch.zip) -- This is

[jira] [Updated] (HUDI-3288) Partition specific compaction strategy for the metadata table

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3288: - Story Points: 3 (was: 4) > Partition specific compaction strategy for the metadata table > --

[jira] [Commented] (HUDI-1236) [UMBRELLA] Integ Test suite infra

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17498780#comment-17498780 ] Raymond Xu commented on HUDI-1236: -- [~shivnarayan] could you make 1 pass on the open and

[jira] [Updated] (HUDI-2488) Support async metadata index creation while regular writers and table services are in progress

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2488: - Component/s: index metadata > Support async metadata index creation while regular writers

[jira] [Updated] (HUDI-3176) Add index commit metadata

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3176: - Component/s: index > Add index commit metadata > - > > Key: HUDI-3

[jira] [Updated] (HUDI-3386) DROP INDEX comand

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3386: - Component/s: index metadata > DROP INDEX comand > - > > K

[jira] [Updated] (HUDI-3174) Implement metadata filesystem view changes to support INDEX action type

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3174: - Labels: (was: metadata) > Implement metadata filesystem view changes to support INDEX action type >

[jira] [Updated] (HUDI-3174) Implement metadata filesystem view changes to support INDEX action type

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3174: - Component/s: index > Implement metadata filesystem view changes to support INDEX action type > ---

[jira] [Updated] (HUDI-2708) Support indexing of metadata table even when async table service is in progress

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2708: - Component/s: index > Support indexing of metadata table even when async table service is in > progress >

[jira] [Updated] (HUDI-3225) RFC for Async Metadata Index

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3225: - Component/s: docs index > RFC for Async Metadata Index > > >

[jira] [Updated] (HUDI-3173) Introduce new INDEX action type

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3173: - Component/s: index > Introduce new INDEX action type > --- > >

[jira] [Updated] (HUDI-3275) Add tests for async metadata indexing

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3275: - Component/s: tests-ci > Add tests for async metadata indexing > - > >

[GitHub] [hudi] hudi-bot removed a comment on pull request #4910: [RFC-33] [HUDI-2429][Stacked on HUDI-2560] Support full Schema evolution for Spark

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4910: URL: https://github.com/apache/hudi/pull/4910#issuecomment-1053990045 ## CI report: * 52be34d7d5e025180415c46e64a3e2145c29e498 UNKNOWN * 78e86dd1953cc4d6bf10ca808a7bcffe22b4b587 UNKNOWN * 4ff0d57275e8f907d945c60bd93c2bef227c7

[GitHub] [hudi] hudi-bot commented on pull request #4910: [RFC-33] [HUDI-2429][Stacked on HUDI-2560] Support full Schema evolution for Spark

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4910: URL: https://github.com/apache/hudi/pull/4910#issuecomment-1054040283 ## CI report: * 52be34d7d5e025180415c46e64a3e2145c29e498 UNKNOWN * 78e86dd1953cc4d6bf10ca808a7bcffe22b4b587 UNKNOWN * cb8c6f4233cb1bac50aa67de4145df8458499f6d Azur

[GitHub] [hudi] boneanxs commented on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

2022-02-28 Thread GitBox
boneanxs commented on pull request #4905: URL: https://github.com/apache/hudi/pull/4905#issuecomment-1054044282 > parameters.get(DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key()) while instantiating writeConfig may or may not work. Even though we have added alternate keys, not sure w
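The concern quoted above, that a plain `parameters.get(...)` on one key "may or may not work" when alternate keys exist, can be illustrated with a minimal sketch. This is not Hudi's implementation: the class `AlternateKeyLookup`, the `resolve` helper, and the primary key string `hoodie.clustering.async.enabled` are assumptions made for illustration. Only `hoodie.datasource.clustering.async.enable` is taken from the PR title.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: a lookup that honours alternate keys, versus a direct Map.get.
public class AlternateKeyLookup {

    // Check the primary key first, then each alternate key in the order given.
    static Optional<String> resolve(Map<String, String> params,
                                    String primaryKey,
                                    String... alternateKeys) {
        if (params.containsKey(primaryKey)) {
            return Optional.ofNullable(params.get(primaryKey));
        }
        for (String alt : alternateKeys) {
            if (params.containsKey(alt)) {
                return Optional.ofNullable(params.get(alt));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed primary key; the alternate key is the one named in the PR title.
        String primary = "hoodie.clustering.async.enabled";
        String alternate = "hoodie.datasource.clustering.async.enable";

        // The user sets only the alternate key directly.
        Map<String, String> params = new HashMap<>();
        params.put(alternate, "true");

        // A direct get on the primary key misses the user's setting.
        System.out.println(params.get(primary));
        // An alternate-aware lookup finds it.
        System.out.println(resolve(params, primary, alternate));
    }
}
```

In this sketch the direct `get` returns `null` while `resolve` returns the user's value, which is the kind of mismatch the PR title describes. Whether Hudi's write config resolves alternates at that point is not stated in the truncated comment.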

[GitHub] [hudi] xushiyan commented on a change in pull request #4916: Update roadmap page to reflect latest state

2022-02-28 Thread GitBox
xushiyan commented on a change in pull request #4916: URL: https://github.com/apache/hudi/pull/4916#discussion_r815698226 ## File path: website/src/pages/roadmap.md ## @@ -10,13 +10,13 @@ down by areas on our [stack](blog/2021/07/21/streaming-data-lake-platform/#hudi- ## H1

[GitHub] [hudi] hudi-bot commented on pull request #4645: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single target table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4645: URL: https://github.com/apache/hudi/pull/4645#issuecomment-1054051103 ## CI report: * ee9f2eaa28c5836977ea980a1d50b1d65ce342ef Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4645: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single target table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4645: URL: https://github.com/apache/hudi/pull/4645#issuecomment-1018115438 ## CI report: * ee9f2eaa28c5836977ea980a1d50b1d65ce342ef Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4645: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single target table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4645: URL: https://github.com/apache/hudi/pull/4645#issuecomment-1054053920 ## CI report: * ee9f2eaa28c5836977ea980a1d50b1d65ce342ef Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4645: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single target table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4645: URL: https://github.com/apache/hudi/pull/4645#issuecomment-1054051103 ## CI report: * ee9f2eaa28c5836977ea980a1d50b1d65ce342ef Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] watermelon12138 opened a new pull request #4925: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single sink table from multiple source tables

2022-02-28 Thread GitBox
watermelon12138 opened a new pull request #4925: URL: https://github.com/apache/hudi/pull/4925 ## What is the purpose of the pull request *Enable MultiTableDeltaStreamer to update a single target table from multiple source tables.* ## Brief change log - *Modify the Hoodie

[GitHub] [hudi] hudi-bot commented on pull request #4925: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single sink table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4925: URL: https://github.com/apache/hudi/pull/4925#issuecomment-1054064777 ## CI report: * afa42d835dcfc17181cc314b69b052ce374f763c UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[GitHub] [hudi] watermelon12138 commented on a change in pull request #4645: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single target table from multiple source tables

2022-02-28 Thread GitBox
watermelon12138 commented on a change in pull request #4645: URL: https://github.com/apache/hudi/pull/4645#discussion_r815719379 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java ## @@ -370,50 +441,124 @@ publ

[GitHub] [hudi] hudi-bot removed a comment on pull request #4925: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single sink table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4925: URL: https://github.com/apache/hudi/pull/4925#issuecomment-1054064777 ## CI report: * afa42d835dcfc17181cc314b69b052ce374f763c UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run

[GitHub] [hudi] hudi-bot commented on pull request #4925: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single sink table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4925: URL: https://github.com/apache/hudi/pull/4925#issuecomment-1054067340 ## CI report: * afa42d835dcfc17181cc314b69b052ce374f763c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] watermelon12138 commented on a change in pull request #4645: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single target table from multiple source tables

2022-02-28 Thread GitBox
watermelon12138 commented on a change in pull request #4645: URL: https://github.com/apache/hudi/pull/4645#discussion_r815720738 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java ## @@ -370,50 +441,124 @@ publ

[GitHub] [hudi] watermelon12138 commented on a change in pull request #4645: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single target table from multiple source tables

2022-02-28 Thread GitBox
watermelon12138 commented on a change in pull request #4645: URL: https://github.com/apache/hudi/pull/4645#discussion_r815721725 ## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieMultiTableDeltaStreamer.java ## @@ -177,6 +182,73 @@ publi

[GitHub] [hudi] watermelon12138 commented on pull request #4645: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single target table from multiple source tables

2022-02-28 Thread GitBox
watermelon12138 commented on pull request #4645: URL: https://github.com/apache/hudi/pull/4645#issuecomment-1054073152 @nsivabalan @pratyakshsharma Thank you for your advice. I've changed the corresponding content in the code. Because my PR is too late for current version, and many confl

[GitHub] [hudi] watermelon12138 commented on pull request #4645: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single target table from multiple source tables

2022-02-28 Thread GitBox
watermelon12138 commented on pull request #4645: URL: https://github.com/apache/hudi/pull/4645#issuecomment-1054073675 @nsivabalan @pratyakshsharma This is the new PR address.

[GitHub] [hudi] watermelon12138 commented on pull request #4645: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single target table from multiple source tables

2022-02-28 Thread GitBox
watermelon12138 commented on pull request #4645: URL: https://github.com/apache/hudi/pull/4645#issuecomment-1054074075 @nsivabalan @pratyakshsharma https://github.com/apache/hudi/pull/4925/files

[GitHub] [hudi] scxwhite opened a new pull request #4926: add thread factory in BoundedInMemoryExecutor

2022-02-28 Thread GitBox
scxwhite opened a new pull request #4926: URL: https://github.com/apache/hudi/pull/4926 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the purpos

[GitHub] [hudi] hudi-bot commented on pull request #4926: add thread factory in BoundedInMemoryExecutor

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4926: URL: https://github.com/apache/hudi/pull/4926#issuecomment-1054113931 ## CI report: * b0d5367e3209a3b097a1cb20288a06d969ffb693 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[GitHub] [hudi] hudi-bot commented on pull request #4926: add thread factory in BoundedInMemoryExecutor

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4926: URL: https://github.com/apache/hudi/pull/4926#issuecomment-1054116463 ## CI report: * b0d5367e3209a3b097a1cb20288a06d969ffb693 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4926: add thread factory in BoundedInMemoryExecutor

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4926: URL: https://github.com/apache/hudi/pull/4926#issuecomment-1054113931 ## CI report: * b0d5367e3209a3b097a1cb20288a06d969ffb693 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run

[GitHub] [hudi] hudi-bot commented on pull request #4645: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single target table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4645: URL: https://github.com/apache/hudi/pull/4645#issuecomment-1054123412 ## CI report: * d98dd23ee2077ff5c70d7c1f57a89057319850fd Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4645: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single target table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4645: URL: https://github.com/apache/hudi/pull/4645#issuecomment-1054053920 ## CI report: * ee9f2eaa28c5836977ea980a1d50b1d65ce342ef Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot removed a comment on pull request #4925: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single sink table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4925: URL: https://github.com/apache/hudi/pull/4925#issuecomment-1054067340 ## CI report: * afa42d835dcfc17181cc314b69b052ce374f763c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4925: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single sink table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4925: URL: https://github.com/apache/hudi/pull/4925#issuecomment-1054132818 ## CI report: * afa42d835dcfc17181cc314b69b052ce374f763c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] xushiyan commented on pull request #4885: [DO NOT MERGE][HUDI-3221] Support querying a table as of a savepoint

2022-02-28 Thread GitBox
xushiyan commented on pull request #4885: URL: https://github.com/apache/hudi/pull/4885#issuecomment-1054141117 @XuQianJin-Stars close this in favor of https://github.com/apache/hudi/pull/4720 ? -- This is an automated message from the Apache Git Service. To respond to the message, pleas

[GitHub] [hudi] hudi-bot commented on pull request #4925: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single sink table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4925: URL: https://github.com/apache/hudi/pull/4925#issuecomment-1054142814 ## CI report: * afa42d835dcfc17181cc314b69b052ce374f763c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4925: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single sink table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4925: URL: https://github.com/apache/hudi/pull/4925#issuecomment-1054132818 ## CI report: * afa42d835dcfc17181cc314b69b052ce374f763c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] xushiyan commented on issue #4502: [QUESTION] Athena Hudi Time Travel Queries

2022-02-28 Thread GitBox
xushiyan commented on issue #4502: URL: https://github.com/apache/hudi/issues/4502#issuecomment-1054143138 Implementation https://github.com/apache/hudi/pull/4720/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

[GitHub] [hudi] hudi-bot removed a comment on pull request #4925: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single sink table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4925: URL: https://github.com/apache/hudi/pull/4925#issuecomment-1054142814 ## CI report: * afa42d835dcfc17181cc314b69b052ce374f763c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4925: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single sink table from multiple source tables

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4925: URL: https://github.com/apache/hudi/pull/4925#issuecomment-1054145638 ## CI report: * afa42d835dcfc17181cc314b69b052ce374f763c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[jira] [Updated] (HUDI-3441) Clean up markers of completed commits in .hoodie/.temp folder

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3441: - Sprint: Hudi-Sprint-Mar-01 > Clean up markers of completed commits in .hoodie/.temp folder > -

[jira] [Updated] (HUDI-3468) Emit table metadata to Linkedin datahub metadata

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3468: - Sprint: Hudi-Sprint-Mar-01 > Emit table metadata to Linkedin datahub metadata > --

[jira] [Updated] (HUDI-2883) Refactor Hive Sync tool /config to use reflection and move to hudi sync common package

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2883: - Sprint: Hudi-Sprint-Mar-01 > Refactor Hive Sync tool /config to use reflection and move to hudi sync > co

[jira] [Updated] (HUDI-3287) Remove unnecessary deps in hudi-kafka-connect

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3287: - Sprint: Hudi-Sprint-Mar-01 > Remove unnecessary deps in hudi-kafka-connect > -

[jira] [Updated] (HUDI-3411) Incorrect Record Key Field property Handling

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3411: - Sprint: Hudi-Sprint-Mar-01 > Incorrect Record Key Field property Handling > --

[jira] [Updated] (HUDI-3406) Rollback incorrectly relying on FS listing instead of Commit Metadata

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3406: - Sprint: Hudi-Sprint-Mar-01 > Rollback incorrectly relying on FS listing instead of Commit Metadata > -

[GitHub] [hudi] hudi-bot commented on pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1054160824 ## CI report: * 09663f88e7a7ef0d924155f9b4594b5c844e9f31 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1038565187 ## CI report: * 09663f88e7a7ef0d924155f9b4594b5c844e9f31 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[jira] [Updated] (HUDI-3348) HoodieRealtimeFileSplit losing info when serialized/deserialized

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3348: - Sprint: Hudi-Sprint-Mar-01 > HoodieRealtimeFileSplit losing info when serialized/deserialized > --

[jira] [Updated] (HUDI-2283) Support Clustering Command For Spark Sql

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2283: - Sprint: Hudi-Sprint-Mar-01 > Support Clustering Command For Spark Sql > --

[GitHub] [hudi] hudi-bot commented on pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-02-28 Thread GitBox
hudi-bot commented on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1054163323 ## CI report: * 09663f88e7a7ef0d924155f9b4594b5c844e9f31 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-02-28 Thread GitBox
hudi-bot removed a comment on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1054160824 ## CI report: * 09663f88e7a7ef0d924155f9b4594b5c844e9f31 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[jira] [Updated] (HUDI-2283) Support Clustering Command For Spark Sql

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2283: - Sprint: (was: Hudi-Sprint-Mar-01) > Support Clustering Command For Spark Sql > -

[GitHub] [hudi] XuQianJin-Stars commented on pull request #4901: [HUDI-3445] Supporting Clustering Command Based on Call Procedure Command for Spark SQL

2022-02-28 Thread GitBox
XuQianJin-Stars commented on pull request #4901: URL: https://github.com/apache/hudi/pull/4901#issuecomment-1054164686 hi @huberylee pls fixed the CI build. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

[jira] [Updated] (HUDI-3441) Clean up markers of completed commits in .hoodie/.temp folder

2022-02-28 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3441: - Issue Type: Improvement (was: Task) > Clean up markers of completed commits in .hoodie/.temp folder > ---
