wgtmac commented on code in PR #457:
URL: https://github.com/apache/parquet-format/pull/457#discussion_r1797718722


##########
VariantEncoding.md:
##########
@@ -107,7 +108,7 @@ offset_size_minus_one: 2-bit value providing the number of bytes per dictionary
 dictionary_size: `offset_size` bytes. little-endian value indicating the number of strings in the dictionary
 dictionary: <offset>* <bytes>
 offset: `offset_size` bytes. little-endian value indicating the starting position of the ith string in `bytes`. The list should contain `dictionary_size + 1` values, where the last value is the total length of `bytes`.
-bytes: dictionary string values
+bytes: dictionary string UTF-8 encoded values

Review Comment:
   ```suggestion
   bytes: UTF-8 encoded dictionary string values
   ```
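
For reference, the layout quoted in the hunk above can be decoded roughly as follows. This is a minimal sketch, not code from the PR: `read_dictionary` is a hypothetical helper, `buf` is assumed to start at the `dictionary_size` field, and `offset_size` is assumed to have already been derived from the header.

```python
def read_dictionary(buf: bytes, offset_size: int) -> list[str]:
    """Decode dictionary strings from a buffer that starts at `dictionary_size`.

    Hypothetical helper for illustration; header parsing is out of scope here.
    """

    def read_uint(pos: int) -> int:
        # Every integer field is `offset_size` bytes wide, little-endian.
        return int.from_bytes(buf[pos:pos + offset_size], "little")

    dictionary_size = read_uint(0)

    # `dictionary_size + 1` offsets follow; the last is the total length of `bytes`.
    offsets = [read_uint(offset_size * (1 + i)) for i in range(dictionary_size + 1)]

    # `bytes` begins right after the size field and the offset list.
    bytes_start = offset_size * (dictionary_size + 2)
    data = buf[bytes_start:bytes_start + offsets[-1]]

    # String i spans offsets[i]..offsets[i + 1] and is UTF-8 encoded.
    return [data[offsets[i]:offsets[i + 1]].decode("utf-8") for i in range(dictionary_size)]


# Two strings, "a" and "bc", with 1-byte offsets.
example = bytes([2, 0, 1, 3]) + b"abc"
strings = read_dictionary(example, offset_size=1)
```

Because the offsets count bytes rather than characters, a multi-byte UTF-8 string such as "é" occupies two bytes in `bytes`, which is why the encoding needs to be stated explicitly in the spec.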



##########
VariantShredding.md:
##########
@@ -91,7 +91,7 @@ optional group variant_col {
 # Parquet Layout
 
 The `array` and `object` fields represent Variant array and object types, respectively.
-Arrays must use the three-level list structure described in https://github.com/apache/parquet-format/blob/master/LogicalTypes.md.
+Arrays must use the three-level list structure described in [LogicalTypes.md](https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#lists).

Review Comment:
   ```suggestion
   Arrays must use the three-level list structure described in [LogicalTypes.md](LogicalTypes.md).
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

