Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-13 Thread via GitHub
cashmand commented on code in PR #461:
URL: https://github.com/apache/parquet-format/pull/461#discussion_r1884404362
## VariantShredding.md: ##
@@ -25,290 +25,320 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous val

Re: [PR] GH-3080: HadoopStreams to support ByteBufferPositionedReadable [parquet-java]

2024-12-13 Thread via GitHub
wgtmac commented on code in PR #3096:
URL: https://github.com/apache/parquet-java/pull/3096#discussion_r1884049013
## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java: ##
@@ -111,14 +79,8 @@ private static Function unwrapByteBuffer * the data, n

Re: [PR] PARQUET-34: implement Size() filter for repeated columns [parquet-java]

2024-12-13 Thread via GitHub
wgtmac commented on PR #3098:
URL: https://github.com/apache/parquet-java/pull/3098#issuecomment-2541673515
Thanks for adding this! This is a large PR that I need to take some time to review. It would be good if @emkornfield @gszadovszky could take a look to see if this is a good use