parthchandra commented on PR #1229:
URL: https://github.com/apache/datafusion-comet/pull/1229#issuecomment-2578930234

   > > > > > Finally, can we include two more things (either in 
spark_parquet_options or in some parquet_conversion_context struct) which has 
the conversion and type promotion options that are also used in 
Java_org_apache_comet_parquet_Native_initColumnReader ?
   > > > > 
   > > > > 
   > > > > Could you clarify which fields you have in mind? I looked at adding 
things like decimal precision, expected precision, scale, etc. but those seem 
like individual column properties (more like schema) than a property of the 
Parquet reader. The timestamp conversion options seem to apply to the entire 
query.
   > > > 
   > > > 
   > > > I meant all of those but I realize that this needs to be per column. 
We need to pass in `useDecimal128` and `useLegacyDateTimestampOrNTZ` at least, 
unless we can access the SQLConf. The type promotion info can be derived in 
native, I think.
   > > 
   > > 
   > > Couldn't we derive a lot of those (timestamp resolution, timezone, 
decimal properties) from the schemas that we have in SchemaAdapter?
   > 
   > The type promotion info can be derived (it already is on the JVM side). 
But the useDecimal128 and useLegacyDateTimestampOrNTZ parameters are from 
SQLConf. If we have a copy of SQLConf accessible in native, we can use that 
instead, but I think we no longer have a copy of the conf in native.
   
   I wonder whether it is feasible to modify the required schema to take these 
parameters into account and have the parquet reader automagically use the 
correct arrow vector type. Let's go ahead with this PR without the requirement 
to pass in these parameters and see if we can handle this entirely in either 
native or the JVM.
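
   To make the idea concrete, here is a rough sketch of what a conversion
context carrying the two SQLConf-derived flags could look like on the native
side. Everything here is hypothetical and not taken from the PR: the struct
name `ParquetConversionContext`, the `DecimalStorage` enum (a stand-in for the
real arrow types), and the precision thresholds (9 and 18 digits), which only
assume that narrower decimals are stored in 32 or 64 bits when `useDecimal128`
is off.

```rust
/// Hypothetical container for the two SQLConf-derived flags named in this
/// thread. Not an existing Comet type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ParquetConversionContext {
    /// Mirrors `useDecimal128`.
    pub use_decimal_128: bool,
    /// Mirrors `useLegacyDateTimestampOrNTZ`. Carried along but unused in
    /// this sketch.
    pub use_legacy_date_timestamp_or_ntz: bool,
}

/// Simplified stand-in for the arrow vector type chosen for a decimal column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DecimalStorage {
    Int32,
    Int64,
    Decimal128,
}

/// Picks a storage width for a decimal column of the given precision.
/// Assumption: with the flag off, up to 9 digits fit in 32 bits and up to
/// 18 digits fit in 64 bits; anything wider falls back to 128 bits.
pub fn decimal_storage(ctx: &ParquetConversionContext, precision: u8) -> DecimalStorage {
    if ctx.use_decimal_128 {
        return DecimalStorage::Decimal128;
    }
    match precision {
        0..=9 => DecimalStorage::Int32,
        10..=18 => DecimalStorage::Int64,
        _ => DecimalStorage::Decimal128,
    }
}
```

   If the required schema were rewritten up front as suggested above, a
function like `decimal_storage` would run once per column while building that
schema, and the reader itself would not need to see the flags.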


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
