[jira] [Updated] (HUDI-9665) Repartition the write status RDD for MDT DAG to avoid long processing durations

2025-07-29 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9665: -- Priority: Blocker (was: Major) > Repartition the write status RDD for MDT DAG to avoid long pro

[jira] [Assigned] (HUDI-9665) Repartition the write status RDD for MDT DAG to avoid long processing durations

2025-07-29 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-9665: - Assignee: sivabalan narayanan > Repartition the write status RDD for MDT DAG to avoid lon

[jira] [Updated] (HUDI-9665) Repartition the write status RDD for MDT DAG to avoid long processing durations

2025-07-29 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9665: -- Fix Version/s: 1.1.0 > Repartition the write status RDD for MDT DAG to avoid long processing >

[jira] [Created] (HUDI-9665) Repartition the write status RDD for MDT DAG to avoid long processing durations

2025-07-29 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-9665: - Summary: Repartition the write status RDD for MDT DAG to avoid long processing durations Key: HUDI-9665 URL: https://issues.apache.org/jira/browse/HUDI-9665 Project

[jira] [Assigned] (HUDI-9649) Avoid DAG retriggers in StreamSync (DeltaStreamer)

2025-07-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-9649: - Assignee: sivabalan narayanan > Avoid DAG retriggers in StreamSync (DeltaStreamer) >

[jira] [Created] (HUDI-9649) Avoid DAG retriggers in StreamSync (DeltaStreamer)

2025-07-25 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-9649: - Summary: Avoid DAG retriggers in StreamSync (DeltaStreamer) Key: HUDI-9649 URL: https://issues.apache.org/jira/browse/HUDI-9649 Project: Apache Hudi Issue

[jira] [Updated] (HUDI-9649) Avoid DAG retriggers in StreamSync (DeltaStreamer)

2025-07-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9649: -- Fix Version/s: 1.1.0 > Avoid DAG retriggers in StreamSync (DeltaStreamer) >

[jira] [Created] (HUDI-9633) Add ability to remove custom configs from being passed to kafka consumer

2025-07-23 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-9633: - Summary: Add ability to remove custom configs from being passed to kafka consumer Key: HUDI-9633 URL: https://issues.apache.org/jira/browse/HUDI-9633 Project: Apach

[jira] [Updated] (HUDI-9628) Add bloom filter pruning when looking up keys in metadata files

2025-07-23 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9628: -- Priority: Blocker (was: Major) > Add bloom filter pruning when looking up keys in metadata file

[jira] [Assigned] (HUDI-9628) Add bloom filter pruning when looking up keys in metadata files

2025-07-23 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-9628: - Assignee: Rajesh Mahindra > Add bloom filter pruning when looking up keys in metadata fil

[jira] [Updated] (HUDI-9628) Add bloom filter pruning when looking up keys in metadata files

2025-07-23 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9628: -- Fix Version/s: 1.1.0 > Add bloom filter pruning when looking up keys in metadata files > ---

[jira] [Created] (HUDI-9628) Add bloom filter pruning when looking up keys in metadata files

2025-07-23 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-9628: - Summary: Add bloom filter pruning when looking up keys in metadata files Key: HUDI-9628 URL: https://issues.apache.org/jira/browse/HUDI-9628 Project: Apache Hudi

[jira] [Assigned] (HUDI-9626) Implement pre caching of metadata Hfiles below a certain size threshold

2025-07-23 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-9626: - Assignee: Rajesh Mahindra > Implement pre caching of metadata Hfiles below a certain size

[jira] [Created] (HUDI-9627) Add bloom filter pruning when looking up keys in metadata files

2025-07-23 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-9627: - Summary: Add bloom filter pruning when looking up keys in metadata files Key: HUDI-9627 URL: https://issues.apache.org/jira/browse/HUDI-9627 Project: Apache Hudi

[jira] [Updated] (HUDI-9626) Implement pre caching of metadata Hfiles below a certain size threshold

2025-07-23 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9626: -- Priority: Blocker (was: Major) > Implement pre caching of metadata Hfiles below a certain size

[jira] [Updated] (HUDI-9626) Implement pre caching of metadata Hfiles below a certain size threshold

2025-07-23 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9626: -- Fix Version/s: 1.1.0 > Implement pre caching of metadata Hfiles below a certain size threshold >

[jira] [Created] (HUDI-9626) Implement pre caching of metadata Hfiles below a certain size threshold

2025-07-23 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-9626: - Summary: Implement pre caching of metadata Hfiles below a certain size threshold Key: HUDI-9626 URL: https://issues.apache.org/jira/browse/HUDI-9626 Project: Apache

[jira] [Assigned] (HUDI-8601) Support LEAF_INDEX block type in HFile

2025-04-09 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-8601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-8601: - Assignee: Lin Liu (was: Y Ethan Guo) > Support LEAF_INDEX block type in HFile >

[jira] [Created] (HUDI-9283) Benchmark RLI flow with a large table to improve performance

2025-04-09 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-9283: - Summary: Benchmark RLI flow with a large table to improve performance Key: HUDI-9283 URL: https://issues.apache.org/jira/browse/HUDI-9283 Project: Apache Hudi

[jira] [Updated] (HUDI-9283) Benchmark RLI flow with a large table to improve performance

2025-04-09 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9283: -- Labels: (was: hudi-1.0.2) > Benchmark RLI flow with a large table to improve performance > ---

[jira] [Updated] (HUDI-9283) Benchmark RLI flow with a large table to improve performance

2025-04-09 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9283: -- Fix Version/s: 1.0.2 > Benchmark RLI flow with a large table to improve performance > --

[jira] [Assigned] (HUDI-9283) Benchmark RLI flow with a large table to improve performance

2025-04-09 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-9283: - Assignee: Rajesh Mahindra > Benchmark RLI flow with a large table to improve performance

[jira] [Updated] (HUDI-9283) Benchmark RLI flow with a large table to improve performance

2025-04-09 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9283: -- Sprint: Hudi 1.1 Sprint #1 > Benchmark RLI flow with a large table to improve performance >

[jira] [Updated] (HUDI-9283) Benchmark RLI flow with a large table to improve performance

2025-04-09 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9283: -- Fix Version/s: 1.1.0 > Benchmark RLI flow with a large table to improve performance > --

[jira] [Updated] (HUDI-9283) Benchmark RLI flow with a large table to improve performance

2025-04-09 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9283: -- Component/s: (was: core) (was: index) > Benchmark RLI flow with a large

[jira] [Updated] (HUDI-9283) Benchmark RLI flow with a large table to improve performance

2025-04-09 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9283: -- Labels: hudi-1.0.2 (was: ) > Benchmark RLI flow with a large table to improve performance > ---

[jira] [Updated] (HUDI-9283) Benchmark RLI flow with a large table to improve performance

2025-04-09 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-9283: -- Component/s: index > Benchmark RLI flow with a large table to improve performance >

[jira] [Updated] (HUDI-8068) Hook up source partitions to s3 incr source

2024-08-09 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-8068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-8068: -- Priority: Trivial (was: Major) > Hook up source partitions to s3 incr source >

[jira] [Created] (HUDI-8068) Hook up source partitions to s3 incr source

2024-08-09 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-8068: - Summary: Hook up source partitions to s3 incr source Key: HUDI-8068 URL: https://issues.apache.org/jira/browse/HUDI-8068 Project: Apache Hudi Issue Type: I

[jira] [Created] (HUDI-7940) Pass metrics to ErrorTableWriter to be able to emit metrics for Error Table

2024-06-29 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7940: - Summary: Pass metrics to ErrorTableWriter to be able to emit metrics for Error Table Key: HUDI-7940 URL: https://issues.apache.org/jira/browse/HUDI-7940 Project: Ap

[jira] [Assigned] (HUDI-7855) Add ability to dynamically configure write parallelism for BULK_INSERT for HoodieStreamer

2024-06-10 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7855: - Assignee: Rajesh Mahindra > Add ability to dynamically configure write parallelism for BU

[jira] [Created] (HUDI-7855) Add ability to dynamically configure write parallelism for BULK_INSERT for HoodieStreamer

2024-06-10 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7855: - Summary: Add ability to dynamically configure write parallelism for BULK_INSERT for HoodieStreamer Key: HUDI-7855 URL: https://issues.apache.org/jira/browse/HUDI-7855

[jira] [Created] (HUDI-7816) Pass the source profile to the snapshot query splitter

2024-05-30 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7816: - Summary: Pass the source profile to the snapshot query splitter Key: HUDI-7816 URL: https://issues.apache.org/jira/browse/HUDI-7816 Project: Apache Hudi Is

[jira] [Created] (HUDI-7606) Ensure that rdds persisted by table services are released in SparkRDDWriteClient

2024-04-11 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7606: - Summary: Ensure that rdds persisted by table services are released in SparkRDDWriteClient Key: HUDI-7606 URL: https://issues.apache.org/jira/browse/HUDI-7606 Projec

[jira] [Assigned] (HUDI-7606) Ensure that rdds persisted by table services are released in SparkRDDWriteClient

2024-04-11 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7606: - Assignee: Rajesh Mahindra > Ensure that rdds persisted by table services are released in

[jira] [Created] (HUDI-7517) Add ability to reset the checkpoint for kafka source

2024-03-19 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7517: - Summary: Add ability to reset the checkpoint for kafka source Key: HUDI-7517 URL: https://issues.apache.org/jira/browse/HUDI-7517 Project: Apache Hudi Issu

[jira] [Closed] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-19 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra closed HUDI-7418. - Resolution: Fixed > Implement file extension filter for s3 incr source > -

[jira] [Updated] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-18 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7418: -- Description: We have support for filter the input files based on an extension (custom) for GCS I

[jira] [Updated] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-18 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7418: -- Description: We have support for filter the input files based on an extension (custom) that can

[jira] [Updated] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-18 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7418: -- Priority: Trivial (was: Major) > Implement file extension filter for s3 incr source > -

[jira] [Created] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-18 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7418: - Summary: Implement file extension filter for s3 incr source Key: HUDI-7418 URL: https://issues.apache.org/jira/browse/HUDI-7418 Project: Apache Hudi Issue

[jira] [Updated] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-18 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7418: -- Sprint: Sprint 2023-03-28 > Implement file extension filter for s3 incr source > ---

[jira] [Assigned] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-18 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7418: - Assignee: Rajesh Mahindra > Implement file extension filter for s3 incr source >

[jira] [Updated] (HUDI-7381) Compaction not filling in stats for create and upsert time

2024-02-04 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7381: -- Priority: Minor (was: Major) > Compaction not filling in stats for create and upsert time > ---

[jira] [Created] (HUDI-7381) Compaction not filling in stats for create and upsert time

2024-02-04 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7381: - Summary: Compaction not filling in stats for create and upsert time Key: HUDI-7381 URL: https://issues.apache.org/jira/browse/HUDI-7381 Project: Apache Hudi

[jira] [Assigned] (HUDI-7381) Compaction not filling in stats for create and upsert time

2024-02-04 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7381: - Assignee: Rajesh Mahindra > Compaction not filling in stats for create and upsert time >

[jira] [Created] (HUDI-7161) Add commit action type and extra metadata to write callback on commit message

2023-11-29 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7161: - Summary: Add commit action type and extra metadata to write callback on commit message Key: HUDI-7161 URL: https://issues.apache.org/jira/browse/HUDI-7161 Project:

[jira] [Assigned] (HUDI-7161) Add commit action type and extra metadata to write callback on commit message

2023-11-29 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7161: - Assignee: Rajesh Mahindra > Add commit action type and extra metadata to write callback

[jira] [Assigned] (HUDI-7138) Fix instantiation issues with ErrorTableWriter and Schema Registry Provider

2023-11-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7138: - Assignee: Rajesh Mahindra > Fix instantiation issues with ErrorTableWriter and Schema Reg

[jira] [Created] (HUDI-7138) Fix instantiation issues with ErrorTableWriter and Schema Registry Provider

2023-11-25 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7138: - Summary: Fix instantiation issues with ErrorTableWriter and Schema Registry Provider Key: HUDI-7138 URL: https://issues.apache.org/jira/browse/HUDI-7138 Project: Ap

[jira] [Created] (HUDI-7108) Ensure schema is refreshed for every batch when using KafkaAvroSchemaDeserializer

2023-11-16 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7108: - Summary: Ensure schema is refreshed for every batch when using KafkaAvroSchemaDeserializer Key: HUDI-7108 URL: https://issues.apache.org/jira/browse/HUDI-7108 Proje

[jira] [Updated] (HUDI-7106) Fix SQS deletes logic for S3 events source.

2023-11-16 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7106: -- Priority: Critical (was: Major) > Fix SQS deletes logic for S3 events source. > ---

[jira] [Assigned] (HUDI-7106) Fix SQS deletes logic for S3 events source.

2023-11-16 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7106: - Assignee: Rajesh Mahindra > Fix SQS deletes logic for S3 events source. > ---

[jira] [Updated] (HUDI-7106) Fix SQS deletes logic for S3 events source.

2023-11-16 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7106: -- Affects Version/s: 0.14.1 > Fix SQS deletes logic for S3 events source. > --

[jira] [Created] (HUDI-7106) Fix SQS deletes logic for S3 events source.

2023-11-16 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7106: - Summary: Fix SQS deletes logic for S3 events source. Key: HUDI-7106 URL: https://issues.apache.org/jira/browse/HUDI-7106 Project: Apache Hudi Issue Type: B

[jira] [Updated] (HUDI-7052) Fix partition key validation for key generators.

2023-11-08 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7052: -- Summary: Fix partition key validation for key generators. (was: Fix partition key validation fo

[jira] [Created] (HUDI-7052) Fix partition key validation for custom payloads

2023-11-08 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7052: - Summary: Fix partition key validation for custom payloads Key: HUDI-7052 URL: https://issues.apache.org/jira/browse/HUDI-7052 Project: Apache Hudi Issue Ty

[jira] [Assigned] (HUDI-7052) Fix partition key validation for custom payloads

2023-11-08 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7052: - Assignee: Rajesh Mahindra > Fix partition key validation for custom payloads > --

[jira] [Created] (HUDI-6406) Pass in Spark Engine Context Wrapper for DeltaSync instead of spark engine context

2023-06-17 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-6406: - Summary: Pass in Spark Engine Context Wrapper for DeltaSync instead of spark engine context Key: HUDI-6406 URL: https://issues.apache.org/jira/browse/HUDI-6406 Proj

[jira] [Created] (HUDI-5255) Estimate the actual record size for the first ingestion batch instead of using default

2022-11-21 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-5255: - Summary: Estimate the actual record size for the first ingestion batch instead of using default Key: HUDI-5255 URL: https://issues.apache.org/jira/browse/HUDI-5255

[jira] [Created] (HUDI-4963) Extend InProcessLockProvider to support multiple table ingestion

2022-09-30 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-4963: - Summary: Extend InProcessLockProvider to support multiple table ingestion Key: HUDI-4963 URL: https://issues.apache.org/jira/browse/HUDI-4963 Project: Apache Hudi

[jira] [Created] (HUDI-4960) Upgrade Jetty version for Timeline server

2022-09-30 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-4960: - Summary: Upgrade Jetty version for Timeline server Key: HUDI-4960 URL: https://issues.apache.org/jira/browse/HUDI-4960 Project: Apache Hudi Issue Type: Imp

[jira] [Commented] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp partitioning field

2022-09-19 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606735#comment-17606735 ] Rajesh Mahindra commented on HUDI-4430: --- [~alexey.kudinkin] Will help with this. >

[jira] [Assigned] (HUDI-4432) Checkpoint management for multi-writer scenario

2022-07-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4432: - Assignee: Harshal Patil > Checkpoint management for multi-writer scenario > --

[jira] [Assigned] (HUDI-4452) Include hudi-aws to hudi-spark-bundle to fix cloudwatch reporter issue

2022-07-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4452: - Assignee: Rahil Chertara > Include hudi-aws to hudi-spark-bundle to fix cloudwatch report

[jira] [Commented] (HUDI-4459) Corrupt parquet file created when syncing huge table with 4000+ fields,using hudi cow table with bulk_insert type

2022-07-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17571140#comment-17571140 ] Rajesh Mahindra commented on HUDI-4459: --- [~danny0405] can you help assign this ticke

[jira] [Assigned] (HUDI-4459) Corrupt parquet file created when syncing huge table with 4000+ fields,using hudi cow table with bulk_insert type

2022-07-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4459: - Assignee: Danny Chen > Corrupt parquet file created when syncing huge table with 4000+ fi

[jira] [Assigned] (HUDI-4471) Relocate AWSDmsAvroPayload class to hudi-common

2022-07-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4471: - Assignee: Rahil Chertara > Relocate AWSDmsAvroPayload class to hudi-common >

[jira] [Assigned] (HUDI-4412) Multiple writers NPE when Insert_overwrite

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4412: - Assignee: liujinhui > Multiple writers NPE when Insert_overwrite > --

[jira] [Commented] (HUDI-4415) Support spark writer running on thrift server

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569838#comment-17569838 ] Rajesh Mahindra commented on HUDI-4415: --- [~minihippo] are you planning to work on it

[jira] [Updated] (HUDI-4418) Implement ProtoKafkaSource

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4418: -- Fix Version/s: 0.13.0 > Implement ProtoKafkaSource > -- > >

[jira] [Commented] (HUDI-4422) read parquet failed due to length is 0 or corrupt parquet file

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569836#comment-17569836 ] Rajesh Mahindra commented on HUDI-4422: --- [~JinxinTang] Feel free to raise the PR aft

[jira] [Assigned] (HUDI-4429) Make Spark 3.1.3 the default profile

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4429: - Assignee: Rahil Chertara > Make Spark 3.1.3 the default profile > --

[jira] [Comment Edited] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp partitioning field

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569833#comment-17569833 ] Rajesh Mahindra edited comment on HUDI-4430 at 7/22/22 6:30 AM:

[jira] [Commented] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp partitioning field

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569833#comment-17569833 ] Rajesh Mahindra commented on HUDI-4430: --- Looks like your input column is of type str

[jira] [Assigned] (HUDI-4434) Disable EMRFS and EMR spark related properties

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4434: - Assignee: Rahil Chertara > Disable EMRFS and EMR spark related properties >

[jira] [Assigned] (HUDI-4440) Treat bootstrapped table as non-partitioned in HudiFileIndex if partition column is missing from schema

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4440: - Assignee: Rahil Chertara > Treat bootstrapped table as non-partitioned in HudiFileIndex if

[jira] [Assigned] (HUDI-4439) Fix Amazon CloudWatch reporter for metadata enabled tables

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4439: - Assignee: Rahil Chertara > Fix Amazon CloudWatch reporter for metadata enabled tables > -

[jira] [Updated] (HUDI-4442) Converting from json to avro does not sanitize field names

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4442: -- Fix Version/s: 0.12.0 > Converting from json to avro does not sanitize field names > ---

[jira] [Updated] (HUDI-4443) Add DeltaStreamer support for AWS managed Kafka (MSK)

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4443: -- Labels: blocker (was: ) > Add DeltaStreamer support for AWS managed Kafka (MSK) >

[jira] [Updated] (HUDI-4443) Add DeltaStreamer support for AWS managed Kafka (MSK)

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4443: -- Fix Version/s: 0.13.0 > Add DeltaStreamer support for AWS managed Kafka (MSK) > ---

[jira] [Updated] (HUDI-4445) Fix few things related to S3 Incremental Source

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4445: -- Description: # Decode file resource url before operating on it. # Fix serializability of hadoop

[jira] [Assigned] (HUDI-4448) Remove the latest commit refresh for timeline server

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4448: - Assignee: Danny Chen > Remove the latest commit refresh for timeline server > ---

[jira] [Assigned] (HUDI-4450) Revert the checkpoint abort notification

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4450: - Assignee: Danny Chen > Revert the checkpoint abort notification > ---

[jira] [Commented] (HUDI-3941) Add extrametadata and commitActiontype to write callback

2022-04-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17525836#comment-17525836 ] Rajesh Mahindra commented on HUDI-3941: --- [~shivnarayan] i have a local PR for this,

[jira] [Assigned] (HUDI-3941) Add extrametadata and commitActiontype to write callback

2022-04-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-3941: - Assignee: sivabalan narayanan > Add extrametadata and commitActiontype to write callback

[jira] [Created] (HUDI-3941) Add extrametadata and commitActiontype to write callback

2022-04-21 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-3941: - Summary: Add extrametadata and commitActiontype to write callback Key: HUDI-3941 URL: https://issues.apache.org/jira/browse/HUDI-3941 Project: Apache Hudi

[jira] [Updated] (HUDI-2854) Harden Toast support for Postgres

2022-03-04 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-2854: -- Fix Version/s: 0.12.0 > Harden Toast support for Postgres > - >

[jira] [Assigned] (HUDI-2854) Harden Toast support for Postgres

2022-03-04 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-2854: - Assignee: Rajesh Mahindra > Harden Toast support for Postgres > -

[jira] [Updated] (HUDI-2854) Harden Toast support for Postgres

2022-03-04 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-2854: -- Priority: Blocker (was: Major) > Harden Toast support for Postgres > --

[jira] [Created] (HUDI-3562) Implement MoR table support for Debezium Source

2022-03-04 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-3562: - Summary: Implement MoR table support for Debezium Source Key: HUDI-3562 URL: https://issues.apache.org/jira/browse/HUDI-3562 Project: Apache Hudi Issue Typ

[jira] [Created] (HUDI-3557) Add support for Glue schema registry to deltastreamer and kafka sink connector

2022-03-03 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-3557: - Summary: Add support for Glue schema registry to deltastreamer and kafka sink connector Key: HUDI-3557 URL: https://issues.apache.org/jira/browse/HUDI-3557 Project:

[jira] [Updated] (HUDI-3396) Make sure Spark reads only Projected Columns for both MOR/COW

2022-02-23 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3396: -- Status: Patch Available (was: In Progress) > Make sure Spark reads only Projected Columns for b

[jira] [Updated] (HUDI-3404) Disable metadata table by config with conditions

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3404: -- Sprint: Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-7, Hudi

[jira] [Updated] (HUDI-3449) Async compaction cannot proceed due to archived deltacommit in metadata table

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3449: -- Sprint: Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-14) > Async compaction can

[jira] [Updated] (HUDI-3354) Rebase `HoodieRealtimeRecordReader` to return `HoodieRecord`

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3354: -- Sprint: Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hud

[jira] [Updated] (HUDI-3207) Hudi Trino connector PR review

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3207: -- Sprint: Hudi-Sprint-Jan-10, Hudi-Sprint-Jan-18, Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Spr

[jira] [Updated] (HUDI-3466) Validate metadata table with col stats with long-running jobs

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3466: -- Sprint: Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-14) > Validate metadata ta

[jira] [Updated] (HUDI-3457) Refactor Spark Relations to avoid code duplication

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3457: -- Sprint: Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-14) > Refactor Spark Relat

[jira] [Updated] (HUDI-1127) Handling late arriving Deletes

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-1127: -- Sprint: Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Spri
