iemejia commented on code in PR #576:
URL: https://github.com/apache/parquet-format/pull/576#discussion_r3345397490


##########
Encodings.md:
##########
@@ -68,7 +69,7 @@ stores the data in the following format:
 For native types, this outputs the data as little endian. Floating
     point types are encoded in IEEE.
 
-For the byte array type, it encodes the length as a 4 byte little
+For the byte array type, it encodes the length as a 4-byte little
 endian, followed by the bytes.

Review Comment:
   Done.



##########
Encodings.md:
##########
@@ -82,7 +83,7 @@ written first, before the data pages of the column chunk.
 Dictionary page format: the entries in the dictionary using the 
[plain](#PLAIN) encoding.
 
 Data page format: the bit width used to encode the entry ids stored as 1 byte 
(max bit width = 32),
-followed by the values encoded using RLE/Bit packed described above (with the 
given bit width).
+followed by the values encoded using RLE/Bit-Packed described above (with the 
given bit width).

Review Comment:
   Done.



##########
Encodings.md:
##########
@@ -322,8 +323,6 @@ The delta encoding algorithm described above stores a bit 
width per miniblock an
 
 Supported Types: BYTE_ARRAY
 
-This encoding is always preferred over PLAIN for byte array columns.
-

Review Comment:
   Reverted. You are right, this is a spec change and should be discussed 
separately.



##########
LogicalTypes.md:
##########
@@ -219,15 +219,15 @@ scale stores the number of digits of that value that are 
to the right of the
 decimal point, and the precision stores the maximum number of digits supported
 in the unscaled value.
 
-If not specified, the scale is 0. Scale must be zero or a positive integer less
-than or equal to the precision. Precision is required and must be a non-zero 
positive
+If not specified, the scale is 0. Scale must be a non-negative integer less
+than or equal to the precision. Precision is required and must be a positive

Review Comment:
   Reverted. Agreed, the original wording is clearer.



##########
LogicalTypes.md:
##########
@@ -243,7 +243,7 @@ comparison.
 
 *Compatibility*
 
-To support compatibility with older readers, implementations of parquet-format 
should
+To support compatibility with older readers, implementations of parquet-format 
must

Review Comment:
   Reverted. You are right, this is a spec change.



##########
LogicalTypes.md:
##########
@@ -544,7 +545,8 @@ are found during reading, they must be ignored.
 
 ## Embedded Types
 
-Embedded types do not have type-specific orderings.
+Embedded types do not have type-specific orderings beyond the unsigned
+byte-wise comparison of their physical type (`BYTE_ARRAY`).

Review Comment:
   Done. Applied your suggestion.



##########
LogicalTypes.md:
##########
@@ -606,7 +608,7 @@ optional group variant_shredded (VARIANT(1)) {
 ### GEOMETRY
 
 `GEOMETRY` is used for geospatial features in the Well-Known Binary (WKB) 
format
-with linear/planar edges interpolation. It must annotate a `BYTE_ARRAY`

Review Comment:
   Reverted to `edges`.



##########
README.md:
##########
@@ -172,7 +172,7 @@ be computed from the schema (i.e. how much nesting there 
is).  This defines the
 maximum number of bits required to store the levels (levels are defined for all
 values in the column).
 
-Two encodings for the levels are supported BIT_PACKED and RLE. Only RLE is now 
used as it supersedes BIT_PACKED.
+Two encodings for the levels are supported: `BIT_PACKED` and `RLE`. Only `RLE` 
is now used as it supersedes `BIT_PACKED`.

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to