[jira] [Created] (HUDI-3405) Query Integration: Graceful fallback when indexes are not available

2022-02-09 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3405: Summary: Query Integration: Graceful fallback when indexes are not available Key: HUDI-3405 URL: https://issues.apache.org/jira/browse/HUDI-3405 Project: Apac

[jira] [Updated] (HUDI-3258) Support multiple metadata index partitions - bloom and column stats

2022-02-09 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3258: - Summary: Support multiple metadata index partitions - bloom and column stats (was: Suppor

[jira] [Updated] (HUDI-3258) Support multiple metadata index partitions - bloom and column stats

2022-02-09 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3258: - Status: In Progress (was: Open) > Support multiple metadata index partitions - bloom and

[jira] [Updated] (HUDI-3258) Support multiple / multi-level metadata index partitions - bloom and column stats

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3258: - Summary: Support multiple / multi-level metadata index partitions - bloom and column stats

[jira] [Updated] (HUDI-3258) Support multiple / multi-level metadata index partitions - bloom and column stats

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3258: - Story Points: 4 (was: 6) > Support multiple / multi-level metadata index partitions - blo

[jira] [Updated] (HUDI-1492) Enhance DeltaWriteStat with block level metadata correctly for storage schemes that support appends

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-1492: - Story Points: 2 (was: 5) > Enhance DeltaWriteStat with block level metadata correctly for

[jira] [Updated] (HUDI-3203) Meta bloom index should use the bloom filter type property to construct back the bloom filter instant

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3203: - Story Points: 1 (was: 2) > Meta bloom index should use the bloom filter type property to

[jira] [Updated] (HUDI-3142) Metadata new Indices initialization during table creation

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3142: - Story Points: 0 (was: 2) > Metadata new Indices initialization during table creation > -

[jira] [Updated] (HUDI-3166) Implement new HoodieIndex based on metadata indices

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3166: - Story Points: 1 (was: 3) > Implement new HoodieIndex based on metadata indices > ---

[jira] [Updated] (HUDI-3356) Conversion of write stats to metadata index records should use HoodieData throughout

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3356: - Story Points: 1 (was: 2) > Conversion of write stats to metadata index records should use

[jira] [Updated] (HUDI-3382) Support removal of bloom and column stats indexes

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3382: - Summary: Support removal of bloom and column stats indexes (was: AsyncIndexing Integratio

[jira] [Updated] (HUDI-3368) Support metadata bloom index for secondary keys

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3368: - Summary: Support metadata bloom index for secondary keys (was: AsyncIndexing Integration:

[jira] [Updated] (HUDI-3364) Support column stats indexing for subset of columns

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3364: - Summary: Support column stats indexing for subset of columns (was: AsyncIndexing Integrat

[jira] [Updated] (HUDI-3368) AsyncIndexing Integration: Support metadata bloom index for secondary keys

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3368: - Summary: AsyncIndexing Integration: Support metadata bloom index for secondary keys (was:

[jira] [Updated] (HUDI-3258) Support multi level metadata index partitions - bloom and column stats

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3258: - Summary: Support multi level metadata index partitions - bloom and column stats (was: Sup

[jira] [Updated] (HUDI-3368) Support metadata bloom index for secondary keys

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3368: - Story Points: 3 (was: 6) > Support metadata bloom index for secondary keys >

[jira] [Reopened] (HUDI-3368) Support metadata bloom index for secondary keys

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy reopened HUDI-3368: -- HUDI-3258 will provide the infra needed for adding any additional index of same type. This

[jira] [Updated] (HUDI-3258) Support multiple level metadata index partitions - bloom and column stats

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3258: - Story Points: 6 (was: 5) > Support multiple level metadata index partitions - bloom and c

[jira] [Updated] (HUDI-3258) Support multiple level metadata index partitions - bloom and column stats

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3258: - Summary: Support multiple level metadata index partitions - bloom and column stats (was:

[jira] [Updated] (HUDI-3258) Support multiple level metadata index partitions of same type - bloom and column stats

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3258: - Summary: Support multiple level metadata index partitions of same type - bloom and column

[jira] [Created] (HUDI-3382) AsyncIndexing Integration: Support removal of bloom and column stats indexes

2022-02-07 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3382: Summary: AsyncIndexing Integration: Support removal of bloom and column stats indexes Key: HUDI-3382 URL: https://issues.apache.org/jira/browse/HUDI-3382 Proj

[jira] [Updated] (HUDI-2584) Unit tests for bloom filter index based out of metadata table.

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-2584: - Story Points: 5 (was: 3) > Unit tests for bloom filter index based out of metadata table.

[jira] [Updated] (HUDI-3258) Support secondary keys via multiple bloom filter partitions

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3258: - Priority: Blocker (was: Major) > Support secondary keys via multiple bloom filter partiti

[jira] [Updated] (HUDI-3258) Support secondary keys via multiple bloom filter partitions

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3258: - Story Points: 5 (was: 4) > Support secondary keys via multiple bloom filter partitions >

[jira] [Closed] (HUDI-3327) Support bloom filter indexing for all columns/fields

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy closed HUDI-3327. Resolution: Duplicate > Support bloom filter indexing for all columns/fields > -

[jira] [Closed] (HUDI-3368) Support metadata bloom index for secondary keys

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy closed HUDI-3368. Resolution: Duplicate > Support metadata bloom index for secondary keys > --

[jira] [Updated] (HUDI-3374) Column stats index initialization for MOR table - handle log files

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3374: - Story Points: 2 > Column stats index initialization for MOR table - handle log files > ---

[jira] [Closed] (HUDI-3219) Summary of performance related issues that MetaIndex would address

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy closed HUDI-3219. Resolution: Done > Summary of performance related issues that MetaIndex would address >

[jira] [Closed] (HUDI-3260) Support column stat index for multiple columns

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy closed HUDI-3260. Resolution: Fixed > Support column stat index for multiple columns > ---

[jira] [Closed] (HUDI-3332) Handle all supported data types for column stats index

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy closed HUDI-3332. Resolution: Fixed > Handle all supported data types for column stats index > ---

[jira] [Commented] (HUDI-3332) Handle all supported data types for column stats index

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488536#comment-17488536 ] Manoj Govindassamy commented on HUDI-3332: -- This is taken care as part of HUDI-12

[jira] [Assigned] (HUDI-3327) Support bloom filter indexing for all columns/fields

2022-02-07 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy reassigned HUDI-3327: Assignee: Manoj Govindassamy > Support bloom filter indexing for all columns/fields

[jira] [Created] (HUDI-3374) Column stats index initialization for MOR table - handle log files

2022-02-07 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3374: Summary: Column stats index initialization for MOR table - handle log files Key: HUDI-3374 URL: https://issues.apache.org/jira/browse/HUDI-3374 Project: Apach

[jira] [Updated] (HUDI-3142) Metadata new Indices initialization during table creation

2022-02-03 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3142: - Reviewers: Sagar Sumit > Metadata new Indices initialization during table creation >

[jira] [Updated] (HUDI-3142) Metadata new Indices initialization during table creation

2022-02-03 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3142: - Status: In Progress (was: Open) > Metadata new Indices initialization during table creati

[jira] [Created] (HUDI-3368) Support metadata bloom index for secondary keys

2022-02-03 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3368: Summary: Support metadata bloom index for secondary keys Key: HUDI-3368 URL: https://issues.apache.org/jira/browse/HUDI-3368 Project: Apache Hudi Iss

[jira] [Updated] (HUDI-3166) Implement new HoodieIndex based on metadata indices

2022-02-03 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3166: - Reviewers: Sagar Sumit, sivabalan narayanan > Implement new HoodieIndex based on metadata

[jira] [Updated] (HUDI-3203) Meta bloom index should use the bloom filter type property to construct back the bloom filter instant

2022-02-03 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3203: - Reviewers: Sagar Sumit > Meta bloom index should use the bloom filter type property to con

[jira] [Updated] (HUDI-3356) Conversion of write stats to metadata index records should use HoodieData throughout

2022-02-03 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3356: - Reviewers: Sagar Sumit > Conversion of write stats to metadata index records should use Ho

[jira] [Created] (HUDI-3364) AsyncIndexing Integration: Support column stats indexing for subset of columns

2022-02-01 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3364: Summary: AsyncIndexing Integration: Support column stats indexing for subset of columns Key: HUDI-3364 URL: https://issues.apache.org/jira/browse/HUDI-3364 Pr

[jira] [Commented] (HUDI-3301) MergedLogRecordReader inline reading should be stateless and thread safe

2022-02-01 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17485455#comment-17485455 ] Manoj Govindassamy commented on HUDI-3301: -- Perf related comments in [https://git

[jira] [Updated] (HUDI-3203) Meta bloom index should use the bloom filter type property to construct back the bloom filter instant

2022-02-01 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3203: - Status: In Progress (was: Open) > Meta bloom index should use the bloom filter type prope

[jira] [Updated] (HUDI-1492) Enhance DeltaWriteStat with block level metadata correctly for storage schemes that support appends

2022-02-01 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-1492: - Status: Open (was: In Progress) > Enhance DeltaWriteStat with block level metadata correc

[jira] [Updated] (HUDI-3356) Conversion of write stats to metadata index records should use HoodieData throughout

2022-02-01 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3356: - Status: In Progress (was: Open) > Conversion of write stats to metadata index records sho

[jira] [Created] (HUDI-3356) Conversion of write stats to metadata index records should use HoodieData throughout

2022-01-31 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3356: Summary: Conversion of write stats to metadata index records should use HoodieData throughout Key: HUDI-3356 URL: https://issues.apache.org/jira/browse/HUDI-3356

[jira] [Updated] (HUDI-3181) Address test failures after enabling metadata index for bloom filters and column stats

2022-01-31 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3181: - Status: Open (was: Patch Available) > Address test failures after enabling metadata index

[jira] [Updated] (HUDI-2589) RFC: Metadata based index for bloom filter and column stats

2022-01-31 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-2589: - Story Points: 0 (was: 1) > RFC: Metadata based index for bloom filter and column stats >

[jira] [Updated] (HUDI-1295) Implement: Metadata based bloom index - write path

2022-01-31 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-1295: - Story Points: 1 (was: 2) > Implement: Metadata based bloom index - write path > -

[jira] [Updated] (HUDI-3166) Implement new HoodieIndex based on metadata indices

2022-01-31 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3166: - Story Points: 3 (was: 5) > Implement new HoodieIndex based on metadata indices > ---

[jira] [Updated] (HUDI-3316) HoodieColumnRangeMetadata doesn't include all statistics for the column

2022-01-31 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3316: - Story Points: 0 (was: 2) > HoodieColumnRangeMetadata doesn't include all statistics for t

[jira] [Updated] (HUDI-3316) HoodieColumnRangeMetadata doesn't include all statistics for the column

2022-01-27 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3316: - Status: In Progress (was: Open) > HoodieColumnRangeMetadata doesn't include all statistic

[jira] [Updated] (HUDI-3260) Support column stat index for multiple columns

2022-01-27 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3260: - Status: In Progress (was: Open) > Support column stat index for multiple columns > --

[jira] [Updated] (HUDI-3260) Support column stat index for multiple columns

2022-01-27 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3260: - Sprint: Hudi-Sprint-Jan-24 > Support column stat index for multiple columns >

[jira] [Updated] (HUDI-3316) HoodieColumnRangeMetadata doesn't include all statistics for the column

2022-01-27 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3316: - Description: HoodieColumnChunkMetadata includes the following stats about a parquet column

[jira] [Updated] (HUDI-3316) HoodieColumnRangeMetadata doesn't include all statistics for the column

2022-01-27 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3316: - Summary: HoodieColumnRangeMetadata doesn't include all statistics for the column (was: Ho

[jira] [Commented] (HUDI-3334) Unable to merge HoodieMetadataPayload during partition listing

2022-01-27 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483422#comment-17483422 ] Manoj Govindassamy commented on HUDI-3334: -- Do you have merge key filter set or i

[jira] [Created] (HUDI-3332) Handle all supported data types for column stats index

2022-01-26 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3332: Summary: Handle all supported data types for column stats index Key: HUDI-3332 URL: https://issues.apache.org/jira/browse/HUDI-3332 Project: Apache Hudi

[jira] [Created] (HUDI-3330) TestHoodieDeltaStreamerWithMultiWriter: Use HoodieTestDataGenerator to generate backfill dataset

2022-01-26 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3330: Summary: TestHoodieDeltaStreamerWithMultiWriter: Use HoodieTestDataGenerator to generate backfill dataset Key: HUDI-3330 URL: https://issues.apache.org/jira/browse/HUDI-33

[jira] [Created] (HUDI-3327) Support bloom filter indexing for all columns/fields

2022-01-25 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3327: Summary: Support bloom filter indexing for all columns/fields Key: HUDI-3327 URL: https://issues.apache.org/jira/browse/HUDI-3327 Project: Apache Hudi

[jira] [Assigned] (HUDI-3325) Query Integration: Util to get aggregate columns ranges across all files from the column index

2022-01-25 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy reassigned HUDI-3325: Assignee: Manoj Govindassamy > Query Integration: Util to get aggregate columns ran

[jira] [Updated] (HUDI-3326) Query Integration: HoodieFileReader should expose API for getting range metadata

2022-01-25 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3326: - Summary: Query Integration: HoodieFileReader should expose API for getting range metadata

[jira] [Created] (HUDI-3326) HoodieFileReader should expose API for getting range metadata

2022-01-25 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3326: Summary: HoodieFileReader should expose API for getting range metadata Key: HUDI-3326 URL: https://issues.apache.org/jira/browse/HUDI-3326 Project: Apache Hud

[jira] [Created] (HUDI-3325) Query Integration: Util to get aggregate columns ranges across all files from the column index

2022-01-25 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3325: Summary: Query Integration: Util to get aggregate columns ranges across all files from the column index Key: HUDI-3325 URL: https://issues.apache.org/jira/browse/HUDI-3325

[jira] [Created] (HUDI-3324) Query Integration: Support returning file names matching the given columns and ranges

2022-01-25 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3324: Summary: Query Integration: Support returning file names matching the given columns and ranges Key: HUDI-3324 URL: https://issues.apache.org/jira/browse/HUDI-3324

[jira] [Created] (HUDI-3323) Refactor: Metadata various partitions payload merging using delegation pattern

2022-01-25 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3323: Summary: Refactor: Metadata various partitions payload merging using delegation pattern Key: HUDI-3323 URL: https://issues.apache.org/jira/browse/HUDI-3323 Pr

[jira] [Created] (HUDI-3321) HFileWriter, HFileReader and HFileDataBlock should avoid hardcoded key field name

2022-01-25 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3321: Summary: HFileWriter, HFileReader and HFileDataBlock should avoid hardcoded key field name Key: HUDI-3321 URL: https://issues.apache.org/jira/browse/HUDI-3321

[jira] [Updated] (HUDI-3317) Partition specific pointed lookup/reading strategy for metadata table

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3317: - Summary: Partition specific pointed lookup/reading strategy for metadata table (was: Part

[jira] [Updated] (HUDI-3260) Support column stat index for multiple columns

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3260: - Story Points: 3 (was: 4) > Support column stat index for multiple columns > -

[jira] [Commented] (HUDI-3317) Partition specific inline reading strategy for metadata table

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17481514#comment-17481514 ] Manoj Govindassamy commented on HUDI-3317: -- Related to * https://issues.apache.o

[jira] [Created] (HUDI-3317) Partition specific inline reading strategy for metadata table

2022-01-24 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3317: Summary: Partition specific inline reading strategy for metadata table Key: HUDI-3317 URL: https://issues.apache.org/jira/browse/HUDI-3317 Project: Apache Hud

[jira] [Assigned] (HUDI-3316) HoodieColumnRangeMetadata doesn't include all Parquet chunk statistics

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy reassigned HUDI-3316: Assignee: Manoj Govindassamy > HoodieColumnRangeMetadata doesn't include all Parque

[jira] [Updated] (HUDI-3316) HoodieColumnRangeMetadata doesn't include all Parquet chunk statistics

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3316: - Issue Type: Task (was: Bug) > HoodieColumnRangeMetadata doesn't include all Parquet chunk

[jira] [Created] (HUDI-3316) HoodieColumnRangeMetadata doesn't include all Parquet chunk statistics

2022-01-24 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3316: Summary: HoodieColumnRangeMetadata doesn't include all Parquet chunk statistics Key: HUDI-3316 URL: https://issues.apache.org/jira/browse/HUDI-3316 Project: A

[jira] [Updated] (HUDI-3142) Metadata new Indices initialization during table creation

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3142: - Story Points: 2 (was: 3) > Metadata new Indices initialization during table creation > -

[jira] [Updated] (HUDI-3142) Metadata new Indices initialization during table creation

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3142: - Story Points: 3 (was: 15) > Metadata new Indices initialization during table creation >

[jira] [Updated] (HUDI-3288) Partition specific compaction strategy for the metadata table

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3288: - Issue Type: Task (was: New Feature) > Partition specific compaction strategy for the meta

[jira] [Updated] (HUDI-3143) Support multiple file groups for metadata table index partitions

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3143: - Story Points: 1 (was: 2) > Support multiple file groups for metadata table index partitio

[jira] [Updated] (HUDI-2714) Benchmark MetaIndex performance w/ bloom and column stat metadata

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-2714: - Story Points: 0 (was: 3) > Benchmark MetaIndex performance w/ bloom and column stat metad

[jira] [Commented] (HUDI-2714) Benchmark MetaIndex performance w/ bloom and column stat metadata

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17481491#comment-17481491 ] Manoj Govindassamy commented on HUDI-2714: -- Benchmarking results are at https://

[jira] [Closed] (HUDI-3144) Metadata table getRecordsByKeys() operations with inline reading has poor performance

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy closed HUDI-3144. Resolution: Duplicate The core issue is discussed under - https://issues.apache.org/jira/br

[jira] [Updated] (HUDI-3144) Metadata table getRecordsByKeys() operations with inline reading has poor performance

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3144: - Summary: Metadata table getRecordsByKeys() operations with inline reading has poor perform

[jira] [Closed] (HUDI-3273) Performance: Metadata table log file scanning and base file merging are repeated for each keys lookup request

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy closed HUDI-3273. Resolution: Duplicate Please find more discussions on this in https://issues.apache.org/jir

[jira] [Commented] (HUDI-3273) Performance: Metadata table log file scanning and base file merging are repeated for each keys lookup request

2022-01-24 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17481457#comment-17481457 ] Manoj Govindassamy commented on HUDI-3273: -- By enabling inline reading at the tim

[jira] [Commented] (HUDI-3300) Timeline server FSViewManager should avoid inline reading for metadata file partition

2022-01-21 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17480362#comment-17480362 ] Manoj Govindassamy commented on HUDI-3300: -- Verified the time line server - it ha

[jira] [Assigned] (HUDI-3260) Support column stat index for multiple columns

2022-01-21 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy reassigned HUDI-3260: Assignee: Manoj Govindassamy > Support column stat index for multiple columns > ---

[jira] [Updated] (HUDI-3260) Support column stat index for multiple columns

2022-01-21 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3260: - Priority: Blocker (was: Major) > Support column stat index for multiple columns > ---

[jira] [Assigned] (HUDI-3301) MergedLogRecordReader inline reading should be stateless and thread safe

2022-01-21 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy reassigned HUDI-3301: Assignee: Manoj Govindassamy (was: Ethan Guo) > MergedLogRecordReader inline readi

[jira] [Updated] (HUDI-3301) Metadata table inline reading should be stateless and thread safe

2022-01-21 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3301: - Epic Link: (was: HUDI-1292) > Metadata table inline reading should be stateless and thre

[jira] [Updated] (HUDI-3301) MergedLogRecordReader inline reading should be stateless and thread safe

2022-01-21 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3301: - Summary: MergedLogRecordReader inline reading should be stateless and thread safe (was: M

[jira] [Updated] (HUDI-3301) Metadata table inline reading should be stateless and thread safe

2022-01-21 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3301: - Epic Link: HUDI-1292 > Metadata table inline reading should be stateless and thread safe >

[jira] [Created] (HUDI-3301) Metadata table inline reading should be stateless and thread safe

2022-01-21 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3301: Summary: Metadata table inline reading should be stateless and thread safe Key: HUDI-3301 URL: https://issues.apache.org/jira/browse/HUDI-3301 Project: Apache

[jira] [Created] (HUDI-3300) Timeline server FSViewManager should avoid inline reading for metadata file partition

2022-01-21 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3300: Summary: Timeline server FSViewManager should avoid inline reading for metadata file partition Key: HUDI-3300 URL: https://issues.apache.org/jira/browse/HUDI-3300

[jira] [Updated] (HUDI-3144) Make metadata table getRecordsByKeys() operations more performant by doing range reads

2022-01-21 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3144: - Status: In Progress (was: Open) > Make metadata table getRecordsByKeys() operations more

[jira] [Commented] (HUDI-3142) Metadata new Indices initialization during table creation

2022-01-21 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479879#comment-17479879 ] Manoj Govindassamy commented on HUDI-3142: -- [~shivnarayan] This needs to be addre

[jira] [Updated] (HUDI-3142) Metadata new Indices initialization during table creation

2022-01-21 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3142: - Priority: Blocker (was: Critical) > Metadata new Indices initialization during table crea

[jira] [Updated] (HUDI-3166) Implement new HoodieIndex based on metadata indices

2022-01-20 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3166: - Status: Open (was: In Progress) > Implement new HoodieIndex based on metadata indices >

[jira] [Updated] (HUDI-3273) Performance: Metadata table log file scanning and base file merging are repeated for each keys lookup request

2022-01-20 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3273: - Status: In Progress (was: Open) > Performance: Metadata table log file scanning and base

[jira] [Updated] (HUDI-3144) Make metadata table getRecordsByKeys() operations more performant by doing range reads

2022-01-20 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3144: - Status: Open (was: In Progress) > Make metadata table getRecordsByKeys() operations more

[jira] [Updated] (HUDI-2584) Unit tests for bloom filter index based out of metadata table.

2022-01-19 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-2584: - Story Points: 3 (was: 2) > Unit tests for bloom filter index based out of metadata table.

[jira] [Updated] (HUDI-2763) Metadata table records key deduplication

2022-01-19 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-2763: - Reviewers: Prashant Wason, Vinoth Chandar (was: sivabalan narayanan) > Metadata table rec
