Re: [PR] PARQUET-34: implement Size() filter for repeated columns [parquet-java]

2024-12-06 Thread via GitHub
clairemcginty commented on code in PR #3098: URL: https://github.com/apache/parquet-java/pull/3098#discussion_r1873889419 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/dictionarylevel/DictionaryFilter.java: ## @@ -493,6 +494,39 @@ public > Boolean visit(Contains co

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-06 Thread via GitHub
emkornfield commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1868938364 ## VariantShredding.md: ## @@ -25,290 +25,318 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous

Re: [PR] MINOR: Clarify offsets etc are unsigned integers [parquet-format]

2024-12-06 Thread via GitHub
emkornfield merged PR #475: URL: https://github.com/apache/parquet-format/pull/475

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-06 Thread via GitHub
emkornfield commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1873830923 ## VariantShredding.md: ## @@ -25,290 +25,316 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous