alamb commented on issue #82: URL: https://github.com/apache/parquet-testing/issues/82#issuecomment-2876516422
Thanks @mapleFU -- I agree -- by my reading of https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#value-encoding-grammar the first byte `0x14` is `0b00010100`:

* low 2 bits are `0b00` => `Primitive type`
* high 6 bits are `0b000101` => 5 (aka Int32 per the [Variant basic types](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#encoding-types) table)

I made this value with Spark like this: https://github.com/apache/parquet-testing/blob/2dc8bf140ed6e28652fc347211c7d661714c7f95/variant/regen.py#L66

So perhaps there is a bug in Spark.

What I suggest is twofold:

1. Update the example data with the correct values of an int64 primitive
2. File a ticket in Spark pointing out the problem
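The bit arithmetic above can be checked with a minimal sketch. It assumes the layout described in the comment (low 2 bits are the basic type, high 6 bits are the type-specific header). The function names are hypothetical and do not come from any Variant library. The type ID for int64 (6) is taken from a reading of the spec's primitive type table, not from the comment itself.

```python
# Assumed layout: basic type in the low 2 bits, type-specific header in the high 6 bits.
PRIMITIVE = 0  # basic type ID for "Primitive type" (0b00, as in the comment)

# Partial, illustrative mapping of primitive type IDs.
# 5 = int32 is stated in the comment; 6 = int64 is an assumption from the spec table.
PRIMITIVE_TYPE_NAMES = {5: "int32", 6: "int64"}


def decode_value_header(first_byte):
    """Split the first byte of a Variant value into (basic_type, header)."""
    basic_type = first_byte & 0b11  # low 2 bits
    header = first_byte >> 2        # high 6 bits
    return basic_type, header


def encode_value_header(basic_type, header):
    """Inverse of decode_value_header, used to see what an int64 byte would be."""
    return (header << 2) | basic_type


basic_type, header = decode_value_header(0x14)
print(basic_type, header, PRIMITIVE_TYPE_NAMES.get(header))

# Under the same assumptions, the first byte expected for an int64 primitive:
print(hex(encode_value_header(PRIMITIVE, 6)))
```

If the assumed type IDs are right, the first byte of a real int64 primitive would differ from `0x14`, which is consistent with the example data holding an int32 value.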
