Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-05 Thread via GitHub
emkornfield commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1871880017 ## VariantShredding.md: ## @@ -25,276 +25,302 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-05 Thread via GitHub
emkornfield commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1871877857 ## VariantShredding.md: ## @@ -25,290 +25,316 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-05 Thread via GitHub
gene-db commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1872656516 ## VariantShredding.md: ## @@ -25,290 +25,316 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous valu

[PR] PARQUET-34: implement Size() filter for repeated columns [parquet-java]

2024-12-05 Thread via GitHub
clairemcginty opened a new pull request, #3098: URL: https://github.com/apache/parquet-java/pull/3098 ### Rationale for this change This PR continues the work outlined in #1452. It implements a `size()` predicate for filtering on the number of elements in repeated fields: ```java Fil

Re: [PR] PARQUET-34: implement Size() filter for repeated columns [parquet-java]

2024-12-05 Thread via GitHub
clairemcginty commented on code in PR #3098: URL: https://github.com/apache/parquet-java/pull/3098#discussion_r1872004619 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -217,6 +219,70 @@ public <T extends Comparable<T>> Boolean visit(Contains<T> co