[ https://issues.apache.org/jira/browse/HUDI-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-7971:
--------------------------------------
    Description: 
Let's ensure the 1.x reader is fully compatible with reading tables written by 
any of 0.14.x to 0.16.x 

 

Readers :  1.x
 # Spark SQL
 # Spark Datasource
 # Trino/Presto
 # Hive
 # Flink

Writer: 0.16

Table State:
 * COW
 ** few write commits 
 ** Pending clustering
 ** Completed Clustering
 ** Failed writes with no rollbacks
 ** Insert overwrite table/partition
 ** Savepoint for Time-travel query

 * MOR
 ** Same as COW
 ** Pending and completed async compaction (with log-files and no base file)
 ** Custom Payloads (for MOR snapshot queries) (e.g. SQL Expression Payload)
 ** Rollback formats - DELETE, rollback block

Other knobs:
 # Metadata enabled/disabled
 # Column Stats enabled/disabled and data-skipping enabled/disabled
 # RLI enabled with eq/IN queries

 # Non-Partitioned dataset
 # CDC Reads 
 # Incremental Reads
 # Time-travel query
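
One way to keep track of the combinations listed above is to enumerate them as a test matrix. The following is only a minimal sketch: the function name `build_matrix` and the dictionary keys are hypothetical, and only the readers, the two table types and two of the on/off knobs are included. The remaining table states and knobs could be added as further dimensions.

```python
from itertools import product

# Dimensions copied from the lists above; extend as needed.
READERS = ["Spark SQL", "Spark Datasource", "Trino/Presto", "Hive", "Flink"]
TABLE_TYPES = ["COW", "MOR"]
METADATA_ENABLED = [True, False]
COLUMN_STATS_ENABLED = [True, False]


def build_matrix():
    """Return one dict per (reader, table type, metadata, column stats) combination."""
    return [
        {
            "reader": reader,
            "table_type": table_type,
            "metadata": metadata,
            "column_stats": column_stats,
        }
        for reader, table_type, metadata, column_stats in product(
            READERS, TABLE_TYPES, METADATA_ENABLED, COLUMN_STATS_ENABLED
        )
    ]


if __name__ == "__main__":
    for case in build_matrix():
        print(case)
```

Each entry can then be used as the parameter set for one read test against a table written by 0.16.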

 

What to test?
 # Query Results Correctness
 # Performance: measure the benefit of
 ## Partition Pruning
 ## Metadata table: col stats, RLI
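
For the query results correctness check, a simple approach is to compare the rows returned by the 1.x reader against an expected result set, ignoring row order but still counting duplicates. The sketch below assumes the rows have already been collected into Python sequences of hashable values. The helper name `rows_match` is hypothetical and is not part of Hudi.

```python
from collections import Counter


def rows_match(expected, actual):
    """Order-insensitive comparison of two row collections.

    Rows are converted to tuples and counted, so a duplicated or
    missing row makes the comparison fail even if the sets of
    distinct rows are equal.
    """
    return Counter(map(tuple, expected)) == Counter(map(tuple, actual))


if __name__ == "__main__":
    expected = [(1, "a"), (2, "b")]
    actual = [(2, "b"), (1, "a")]
    print(rows_match(expected, actual))
```

Counting rows rather than comparing sets matters for MOR snapshot reads, where a merge bug could surface as a duplicated record.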

 

Corner Case Testing:

 
 # Schema Evolution with different file-groups having different generations of 
the schema
 # Dynamic Partition Pruning
 # Does Column Projection work correctly when reading log files?

  was:
Lets ensure 1.x reader is fully compatible w/ reading any of 0.14.x to 0.16.x 
tables 

 

Readers :  1.x
 # Spark SQL
 # Spark Datasource
 # Trino/Presto
 # Hive
 # Flink

Writer: 0.16

Table State:
 * COW
 * Pending clustering
 * Completed Clustering
 * Failed writes with no rollbacks
 * Insert overwrite table/partition
 * Savepoint for Time-travel query


 * MOR
 * Same as COW
 * Pending and completed async compaction (with log-files and no base file)
 * Custom Payloads (for MOR snapshot queries) (e:g SQL Expression Payload)
 * Rollback formats - DELETE, rollback block

Other knobs:
 # Metadata enabled/disabled
 # Column Stats enabled/disabled and data-skipping enabled/disabled
 # RLI enabled with eq/IN queries


 # Non-Partitioned dataset
 # CDC Reads 
 # Incremental Reads
 # Time-travel query

 

What to test ?
 # Query Results Correctness
 # Performance : See the benefit of 
 # Partition Pruning
 # Metadata  table - col stats, RLI,

 

Corner Case Testing:

 
 # Schema Evolution with different file-groups having different generation of 
schema
 # Dynamic Partition Pruning
 # Does Column Projection work correctly for log files reading 


> Test and Certify 0.14.x to 0.16.x tables are readable in 1.x Hudi reader 
> -------------------------------------------------------------------------
>
>                 Key: HUDI-7971
>                 URL: https://issues.apache.org/jira/browse/HUDI-7971
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: sivabalan narayanan
>            Priority: Major
>             Fix For: 1.0.0
>
>
> Let's ensure the 1.x reader is fully compatible with reading tables written by 
> any of 0.14.x to 0.16.x 
>  
> Readers :  1.x
>  # Spark SQL
>  # Spark Datasource
>  # Trino/Presto
>  # Hive
>  # Flink
> Writer: 0.16
> Table State:
>  * COW
>  ** few write commits 
>  ** Pending clustering
>  ** Completed Clustering
>  ** Failed writes with no rollbacks
>  ** Insert overwrite table/partition
>  ** Savepoint for Time-travel query
>  * MOR
>  ** Same as COW
>  ** Pending and completed async compaction (with log-files and no base file)
>  ** Custom Payloads (for MOR snapshot queries) (e.g. SQL Expression Payload)
>  ** Rollback formats - DELETE, rollback block
> Other knobs:
>  # Metadata enabled/disabled
>  # Column Stats enabled/disabled and data-skipping enabled/disabled
>  # RLI enabled with eq/IN queries
>  # Non-Partitioned dataset
>  # CDC Reads 
>  # Incremental Reads
>  # Time-travel query
>  
> What to test?
>  # Query Results Correctness
>  # Performance: measure the benefit of
>  ## Partition Pruning
>  ## Metadata table: col stats, RLI
>  
> Corner Case Testing:
>  
>  # Schema Evolution with different file-groups having different generations of 
> the schema
>  # Dynamic Partition Pruning
>  # Does Column Projection work correctly when reading log files?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
