Tishj commented on issue #519: URL: https://github.com/apache/parquet-format/issues/519#issuecomment-3396567330
I did some thinking on this, and I think I understand the distinction between the two null types. Feel free to correct me if I'm wrong, however.

`VARIANT_NULL` only exists because, in the case of shredding, we need to differentiate between a field whose value is NULL and a field that is missing entirely. This is explained in the spec, [here](https://github.com/apache/parquet-format/blob/master/VariantShredding.md#value-shredding):

> If a Variant is missing in a context where a value is required, readers must return a Variant null (00): basic type 0 (primitive) and physical type 0 (null). For example, if a Variant is required (like measurement above) and both value and typed_value are null, the returned value must be 00 (Variant null).

I don't think there should be a distinction between `SQL NULL` and `VARIANT NULL` in the emitted value when the Variant column is scanned. The distinction should only exist at the storage level.
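To make the storage-level versus scan-level split concrete, here is a minimal Python sketch of how I read the rule. The names `read_shredded`, `emit_scanned`, and `MISSING` are hypothetical and not taken from any reader implementation, and the sketch ignores partially shredded objects where both `value` and `typed_value` are set.

```python
from typing import Any, Optional

# Encoded Variant null: basic type 0 (primitive), physical type 0 (null).
VARIANT_NULL = b"\x00"

# Hypothetical sentinel for "this shredded field is absent".
MISSING = object()


def read_shredded(value: Optional[bytes], typed_value: Optional[Any], required: bool) -> Any:
    """Storage-level view of one shredded (value, typed_value) pair."""
    if typed_value is not None:
        return typed_value          # value was shredded into a typed column
    if value is not None:
        return value                # encoded Variant bytes, possibly VARIANT_NULL
    # Both columns are null: the value is missing. In a context where a
    # value is required, the spec says readers must return Variant null (00).
    return VARIANT_NULL if required else MISSING


def emit_scanned(storage_value: Any) -> Any:
    """Scan-level view as proposed above: Variant null and SQL NULL both
    surface as None (assumption: this is the proposal, not spec behaviour)."""
    if storage_value is None or storage_value == VARIANT_NULL:
        return None
    return storage_value
```

With this split, `read_shredded(None, None, required=True)` yields the encoded Variant null, while `emit_scanned` gives the same result for a Variant null and for a SQL NULL column value, so only the storage layer ever needs to tell them apart.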
