wgtmac commented on code in PR #496:
URL: https://github.com/apache/parquet-format/pull/496#discussion_r2161108287


##########
LogicalTypes.md:
##########
@@ -539,6 +544,26 @@ The sort order used for `INTERVAL` is undefined. When writing data, no min/max
 statistics should be saved for this type and if such non-compliant statistics
 are found during reading, they must be ignored.
 
+#### INTERVAL_YEAR_MONTH
+
+`INTERVAL_YEAR_MONTH` is used to represent a year-month time interval, such as
+`4 years and 6 months`. It must annotate an `int32` that stores the total number
+of months as a signed integer, which represents the interval and can be negative.
+The time duration is independent of any timezone.
+
+#### INTERVAL_DAY_TIME
+
+`INTERVAL_DAY_TIME` is used to represent a day-time time interval, such as

Review Comment:
   Have you considered introducing a type parameter like `unit`, so that narrower physical types such as int32 (or int64) can be used to reduce storage overhead?
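
To make the storage trade-off behind this question concrete, here is a minimal sketch of the largest duration each physical width could hold per unit. The helper `max_duration_days` and the `UNITS_PER_SECOND` table are hypothetical names used for illustration only; they are not part of the Parquet format or of any Parquet library.

```python
# Hypothetical illustration, not part of the Parquet specification.
# Number of time units per second for each candidate `unit` value.
UNITS_PER_SECOND = {"MILLIS": 10**3, "MICROS": 10**6, "NANOS": 10**9}

SECONDS_PER_DAY = 86_400


def max_duration_days(bit_width: int, unit: str) -> float:
    """Largest positive duration, in days, for a signed integer of the given width."""
    max_value = 2 ** (bit_width - 1) - 1
    return max_value / UNITS_PER_SECOND[unit] / SECONDS_PER_DAY


if __name__ == "__main__":
    for unit in UNITS_PER_SECOND:
        print(unit, max_duration_days(32, unit), max_duration_days(64, unit))
```

Under these assumptions an int32 with `MILLIS` covers a little under 25 days in either direction, so a narrower physical type would only suit short durations or coarser units.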



##########
LogicalTypes.md:
##########
@@ -539,6 +544,31 @@ The sort order used for `INTERVAL` is undefined. When writing data, no min/max
 statistics should be saved for this type and if such non-compliant statistics
 are found during reading, they must be ignored.
 
+#### YEAR_MONTH_INTERVAL
+
+`YEAR_MONTH_INTERVAL` is used to represent a year-month time interval, such as
+`4 years and 6 months`. It must annotate an `int32` that stores the total number
+of months as a signed integer, which represents the interval and can be negative.
+
+While ANSI SQL systems typically restrict supported intervals to a range of 
+±10,000 years and enforce this constraint internally, the Parquet format 
+does not impose any limitations on the interval values that may be stored.
+
+#### DURATION
+
+`DURATION` is used to represent a span of time, such as `5 days`. It must 
+annotate an `int64` value that stores the total number of time units for the 
+duration. The value is a signed integer, where a negative value indicates the
+duration moves backward in time (e.g., -5 days means going backward for 5 days).  
+The duration is purely a measure of time and is independent of any time zone.
+
+The `DURATION` type takes `unit` as a parameter, and the value must be one of
+`MILLIS`, `MICROS` or  `NANOS`.

Review Comment:
   ```suggestion
   `MILLIS`, `MICROS` or `NANOS`.
   ```



##########
LogicalTypes.md:
##########
@@ -539,6 +544,31 @@ The sort order used for `INTERVAL` is undefined. When writing data, no min/max
 statistics should be saved for this type and if such non-compliant statistics
 are found during reading, they must be ignored.
 
+#### YEAR_MONTH_INTERVAL
+
+`YEAR_MONTH_INTERVAL` is used to represent a year-month time interval, such as
+`4 years and 6 months`. It must annotate an `int32` that stores the total number
+of months as a signed integer, which represents the interval and can be negative.
+
+While ANSI SQL systems typically restrict supported intervals to a range of 
+±10,000 years and enforce this constraint internally, the Parquet format 
+does not impose any limitations on the interval values that may be stored.
+
+#### DURATION
+
+`DURATION` is used to represent a span of time, such as `5 days`. It must 
+annotate an `int64` value that stores the total number of time units for the 
+duration. The value is a signed integer, where a negative value indicates the
+duration moves backward in time (e.g., -5 days means going backward for 5 days).  
+The duration is purely a measure of time and is independent of any time zone.
+
+The `DURATION` type takes `unit` as a parameter, and the value must be one of
+`MILLIS`, `MICROS` or  `NANOS`.
+
+`Duration` can be used to represent DayTime Intervals as defined by ANSI SQL. In 

Review Comment:
   ```suggestion
   `DURATION` can be used to represent day-time intervals as defined by ANSI SQL. In 
   ```
   
   Just to be consistent with the naming and casing used above.
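
For readers following the day-time mapping, a minimal sketch of how a day-time interval could be flattened into a single signed `DURATION` value, assuming a fixed 24-hour day and the three units listed in the hunk. The function `day_time_to_duration` is a hypothetical name for illustration; the PR does not define such a helper.

```python
# Hypothetical illustration, not part of the Parquet specification.
UNITS_PER_SECOND = {"MILLIS": 10**3, "MICROS": 10**6, "NANOS": 10**9}

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def day_time_to_duration(days: int, hours: int, minutes: int, seconds: int, unit: str) -> int:
    """Convert day-time fields to a signed count of `unit`s that fits in an int64.

    Assumes every day is exactly 24 hours. For a backward (negative) interval,
    pass all fields with a negative sign.
    """
    total_seconds = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    value = total_seconds * UNITS_PER_SECOND[unit]
    if not INT64_MIN <= value <= INT64_MAX:
        raise OverflowError(f"duration does not fit in int64 with unit {unit}")
    return value


if __name__ == "__main__":
    print(day_time_to_duration(5, 0, 0, 0, "MILLIS"))
    print(day_time_to_duration(-5, 0, 0, 0, "MILLIS"))
```

With these assumptions, `5 days` in `MILLIS` is stored as 432000000 and `-5 days` as -432000000.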



##########
src/main/thrift/parquet.thrift:
##########
@@ -461,6 +461,30 @@ struct GeographyType {
   2: optional EdgeInterpolationAlgorithm algorithm;
 }
 
+/**
+ * Year-Month Interval logical type annotation
+ *
+ * The data is stored as an 4 byte signed integer which represents the number

Review Comment:
   ```suggestion
    * The data is stored as a 4-byte signed integer which represents the number
   ```
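
As a side note on the encoding this comment describes, a minimal sketch of packing a year-month interval into a 4-byte signed month count and unpacking it again. The function names are hypothetical and only illustrate the arithmetic; the sign convention for unpacking (both fields carry the sign) is an assumption, not something the PR specifies.

```python
# Hypothetical illustration, not part of the Parquet specification.
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def year_month_to_months(years: int, months: int) -> int:
    """Total number of months as a signed integer that fits in an int32."""
    total = years * 12 + months
    if not INT32_MIN <= total <= INT32_MAX:
        raise OverflowError("interval does not fit in a 4-byte signed integer")
    return total


def months_to_year_month(total: int) -> tuple[int, int]:
    """Split a signed month count into (years, months), both carrying the sign."""
    sign = -1 if total < 0 else 1
    years, months = divmod(abs(total), 12)
    return sign * years, sign * months


if __name__ == "__main__":
    print(year_month_to_months(4, 6))
    print(months_to_year_month(-54))
```

Here `4 years and 6 months` is stored as 54, and -54 reads back as -4 years and -6 months.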



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

