Caizhi Weng created FLINK-26277:
-----------------------------------

             Summary: Java docs & implementation of TimestampColumnReader are contradicting
                 Key: FLINK-26277
                 URL: https://issues.apache.org/jira/browse/FLINK-26277
             Project: Flink
          Issue Type: Bug
          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
    Affects Versions: 1.15.0
            Reporter: Caizhi Weng

(Not sure if this should be classified as a bug, but I don't see a more suitable type.)

The Javadoc of {{TimestampColumnReader}} states:

{code:java}
/**
 * Timestamp {@link ColumnReader}. We only support INT96 bytes now, julianDay(4) + nanosOfDay(8).
 * See https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#timestamp
 * TIMESTAMP_MILLIS and TIMESTAMP_MICROS are the deprecated ConvertedType.
 */
{code}

However, the implementation reads:

{code:java}
ByteBuffer buffer = readDataBuffer(12);
column.setTimestamp(
        rowId + i, int96ToTimestamp(utcTimestamp, buffer.getLong(), buffer.getInt()));
{code}

The implementation contradicts the Javadoc: it reads the 8-byte long first and the 4-byte int second, so {{nanosOfDay(8)}} actually precedes {{julianDay(4)}} in the buffer.

The implementation is also confusing because it relies on the evaluation order of the argument list. Although the [Java Language Specification|https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.7.4] guarantees that argument lists are evaluated from left to right, this is not true of other languages (for example, C++ leaves the order unspecified, so the arguments may be evaluated in any order).
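One possible way to make the read order explicit is to pull the two fields into named local variables before using them. The sketch below is only an illustration, not the Flink code: {{Int96Layout}}, {{Int96}} and {{readInt96}} are hypothetical names, and the little-endian byte order of the buffer is an assumption of this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration; none of these names exist in Flink.
final class Int96Layout {

    /** Simple holder for the two fields of an INT96 timestamp. */
    static final class Int96 {
        final long nanosOfDay;
        final int julianDay;

        Int96(long nanosOfDay, int julianDay) {
            this.nanosOfDay = nanosOfDay;
            this.julianDay = julianDay;
        }
    }

    /**
     * Reads 12 bytes from the buffer: nanosOfDay(8) first, then julianDay(4).
     * Each read is its own statement, so the order does not depend on
     * how an argument list is evaluated.
     */
    static Int96 readInt96(ByteBuffer buffer) {
        long nanosOfDay = buffer.getLong(); // first 8 bytes
        int julianDay = buffer.getInt();    // following 4 bytes
        return new Int96(nanosOfDay, julianDay);
    }

    public static void main(String[] args) {
        // Assumption: the buffer is little-endian.
        ByteBuffer buffer = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
        buffer.putLong(1_000_000_000L); // one second after midnight, in nanoseconds
        buffer.putInt(2_440_588);       // Julian day number of 1970-01-01
        buffer.flip();

        Int96 value = readInt96(buffer);
        System.out.println(
                "nanosOfDay=" + value.nanosOfDay + ", julianDay=" + value.julianDay);
    }
}
```

With the reads separated like this, the Javadoc could describe the layout as nanosOfDay(8) + julianDay(4), and a reader would not need to know the evaluation-order rule to check it against the code.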