parthchandra commented on code in PR #1756:
URL: https://github.com/apache/datafusion-comet/pull/1756#discussion_r2105011816


##########
native/core/src/parquet/mod.rs:
##########
@@ -715,6 +715,7 @@ pub unsafe extern "system" fn 
Java_org_apache_comet_parquet_Native_initRecordBat
             file_groups,
             None,
             data_filters,
+            None,

Review Comment:
   From Spark javadoc - 
   ```
   org. apache. spark. sql. catalyst. util. ResolveDefaultColumns
   
   def constantFoldCurrentDefaultsToExistDefaults(tableSchema: StructType, 
statementType: String): StructType
   
   Finds "current default" expressions in CREATE/ REPLACE TABLE columns and 
constant-folds them.
   
   The results are stored in the "exists default" metadata of the same columns. 
For example, in the event of this statement:
   
   CREATE TABLE T(a INT, b INT DEFAULT 5 + 5)
   
   This method constant-folds the "current default" value, stored in the 
CURRENT_DEFAULT metadata of 
   the "b" column, to "10", storing the result in the "exists default" value 
within the EXISTS_DEFAULT 
   metadata of that same column. Meanwhile the "current default" metadata of 
this "b" column retains its
   original value of "5 + 5".
   
   The reason for constant-folding the EXISTS_DEFAULT is to make the end-user 
visible behavior the same, 
   after executing an ALTER TABLE ADD COLUMNS command with DEFAULT value, as if 
the system had 
   performed an exhaustive backfill of the provided value to all previously 
existing rows in the table instead. 
   
   We choose to avoid doing such a backfill because it would be a 
time-consuming and costly operation. 
   Instead, we elect to store the EXISTS_DEFAULT in the column metadata for 
future reference when 
   querying data out of the data source. In turn, each data source then takes 
responsibility to provide the 
   constant-folded value in the EXISTS_DEFAULT metadata for such columns where 
the value is not 
   present in storage.
   ```
   
   I'll assume that the default values you get are the 'existence' defaults



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to