Hi all,

My name is Sarah Gilmore, and I am a software developer at MathWorks[1] as well as a committer for the apache/arrow project.

I noticed that the Spark ecosystem is introducing a new data type called TimeType[2] to represent time-of-day values in the upcoming 4.1.0 release, and I'm very excited to see this work come to fruition!

However, I also noticed that the accompanying enhancement to Spark's Parquet reader only adds the ability to read Parquet TIME data if isAdjustedToUTC=false[3]. Does the community have any plans to lift the isAdjustedToUTC=false restriction in the future?

My question stems from the fact that some Parquet writers generate TIME data with isAdjustedToUTC=true to adhere to Parquet's compatibility guidelines[4] with respect to the deprecation of the ConvertedType TIME_MICROS. For example, Arrow's Parquet writer sets isAdjustedToUTC=true[5] even though Arrow's time types themselves are timezone-agnostic. Consequently, even after the TimeType data type is released, Spark's Parquet reader will still be unable to import Parquet files containing TIME data generated by writers that follow the Parquet compatibility guidelines, such as the Arrow Parquet writer.

For context, the MATLAB parquetwrite function leverages Arrow's Parquet writer[6], and many MATLAB users want to read MATLAB-generated Parquet files that contain TIME data in Spark.

I appreciate the community's time and consideration on this topic.

Thanks!

Best,
Sarah Gilmore

[1] https://www.mathworks.com/
[2] https://issues.apache.org/jira/browse/SPARK-51342
[3] https://github.com/apache/spark/blob/77413d443f23dd7a14194e516a12d2c959a357be/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala#L309
[4] https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#deprecated-time-convertedtype
[5] https://github.com/apache/arrow/blob/066b2162206825f2d628f97f4113b0403da1f4ec/cpp/src/parquet/arrow/schema.cc#L434
[6] https://www.mathworks.com/help/matlab/import_export/datatype-mappings-matlab-parquet.html