Re: [PR] GH-3123: Omit level histogram for some max levels [parquet-java]

2025-01-20 Thread via GitHub
wgtmac commented on PR #3124: URL: https://github.com/apache/parquet-java/pull/3124#issuecomment-2603651578 I just merged it. Thanks @etseidl @emkornfield @gszadovszky! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

[PR] GH-3133: Fix SizeStatistics to handle omitted histogram [parquet-java]

2025-01-20 Thread via GitHub
wgtmac opened a new pull request, #3134: URL: https://github.com/apache/parquet-java/pull/3134 ### Rationale for this change If SizeStatistics has an omitted level histogram, creating a `SizeStatistics` will result in a null object, which leads to a NullPointerException when calling gette

Re: [I] Omit level histogram for some max levels without loss of precision [parquet-java]

2025-01-20 Thread via GitHub
wgtmac closed issue #3123: Omit level histogram for some max levels without loss of precision URL: https://github.com/apache/parquet-java/issues/3123

Re: [PR] PARQUET-2471: Add GEOMETRY and GEOGRAPHY logical types [parquet-format]

2025-01-20 Thread via GitHub
wgtmac commented on code in PR #240: URL: https://github.com/apache/parquet-format/pull/240#discussion_r1922976399 ## LogicalTypes.md: ## @@ -599,6 +599,45 @@ optional group variant_shredded (VARIANT) { } ``` +### GEOMETRY + +`GEOMETRY` is used for geometry features in the W

Re: [PR] PARQUET-2471: Add GEOMETRY and GEOGRAPHY logical types [parquet-format]

2025-01-20 Thread via GitHub
jiayuasu commented on code in PR #240: URL: https://github.com/apache/parquet-format/pull/240#discussion_r1923171061 ## LogicalTypes.md: ## @@ -599,6 +599,45 @@ optional group variant_shredded (VARIANT) { } ``` +### GEOMETRY + +`GEOMETRY` is used for geometry features in the

Re: [PR] GH-3127: Enabled `parquet.hadoop.vectored.io.enabled` by default [parquet-java]

2025-01-20 Thread via GitHub
dongjoon-hyun commented on PR #3128: URL: https://github.com/apache/parquet-java/pull/3128#issuecomment-2603736212 Thank you again, @wgtmac and @Fokko!

Re: [PR] GH-3133: Fix SizeStatistics to handle omitted histogram [parquet-java]

2025-01-20 Thread via GitHub
wgtmac commented on PR #3134: URL: https://github.com/apache/parquet-java/pull/3134#issuecomment-2603738886 @gszadovszky I've created this to include the fix only. Please take a look. Thanks!

Re: [PR] PARQUET-2471: Add GEOMETRY and GEOGRAPHY logical types [parquet-format]

2025-01-20 Thread via GitHub
wgtmac commented on code in PR #240: URL: https://github.com/apache/parquet-format/pull/240#discussion_r1922972160 ## LogicalTypes.md: ## @@ -599,6 +599,45 @@ optional group variant_shredded (VARIANT) { } ``` +### GEOMETRY + +`GEOMETRY` is used for geometry features in the W

Re: [PR] GH-3099 add libthrift to parquet-cli shaded jar [parquet-java]

2025-01-20 Thread via GitHub
wgtmac merged PR #3100: URL: https://github.com/apache/parquet-java/pull/3100

Re: [I] NoClassDefFoundError using parquet-cli shaded jar (building with mvn -Plocal) [parquet-java]

2025-01-20 Thread via GitHub
wgtmac closed issue #3099: NoClassDefFoundError using parquet-cli shaded jar (building with mvn -Plocal) URL: https://github.com/apache/parquet-java/issues/3099

Re: [I] Enabled `parquet.hadoop.vectored.io.enabled` by default [parquet-java]

2025-01-20 Thread via GitHub
wgtmac closed issue #3127: Enabled `parquet.hadoop.vectored.io.enabled` by default URL: https://github.com/apache/parquet-java/issues/3127

Re: [PR] GH-3127: Enabled `parquet.hadoop.vectored.io.enabled` by default [parquet-java]

2025-01-20 Thread via GitHub
wgtmac merged PR #3128: URL: https://github.com/apache/parquet-java/pull/3128

Re: [PR] GH-3123: Omit level histogram for some max levels [parquet-java]

2025-01-20 Thread via GitHub
wgtmac merged PR #3124: URL: https://github.com/apache/parquet-java/pull/3124

Re: [PR] PARQUET-2471: Add GEOMETRY and GEOGRAPHY logical types [parquet-format]

2025-01-20 Thread via GitHub
paleolimbot commented on code in PR #240: URL: https://github.com/apache/parquet-format/pull/240#discussion_r1923067914 ## LogicalTypes.md: ## @@ -599,6 +599,45 @@ optional group variant_shredded (VARIANT) { } ``` +### GEOMETRY + +`GEOMETRY` is used for geometry features in

Re: [PR] GH-3125: Add CLI for SizeStatistics [parquet-java]

2025-01-20 Thread via GitHub
gszadovszky commented on code in PR #3126: URL: https://github.com/apache/parquet-java/pull/3126#discussion_r1922337308 ## parquet-column/src/main/java/org/apache/parquet/column/statistics/SizeStatistics.java: ## @@ -136,8 +136,10 @@ public SizeStatistics( List definition

Re: [PR] GH-3125: Add CLI for SizeStatistics [parquet-java]

2025-01-20 Thread via GitHub
gszadovszky commented on code in PR #3126: URL: https://github.com/apache/parquet-java/pull/3126#discussion_r1921950827 ## parquet-column/src/main/java/org/apache/parquet/column/statistics/SizeStatistics.java: ## @@ -136,8 +136,10 @@ public SizeStatistics( List definition

Re: [PR] GH-1452: implement Size() filter for repeated columns [parquet-java]

2025-01-20 Thread via GitHub
wgtmac commented on code in PR #3098: URL: https://github.com/apache/parquet-java/pull/3098#discussion_r1922067091 ## parquet-column/src/main/java/org/apache/parquet/internal/column/columnindex/ColumnIndexBuilder.java: ## @@ -378,6 +379,11 @@ public > PrimitiveIterator.OfInt vi

Re: [PR] GH-3125: Add CLI for SizeStatistics [parquet-java]

2025-01-20 Thread via GitHub
wgtmac commented on code in PR #3126: URL: https://github.com/apache/parquet-java/pull/3126#discussion_r1921972035 ## parquet-column/src/main/java/org/apache/parquet/column/statistics/SizeStatistics.java: ## @@ -136,8 +136,10 @@ public SizeStatistics( List definitionLevel

Re: [PR] GH-1452: implement Size() filter for repeated columns [parquet-java]

2025-01-20 Thread via GitHub
clairemcginty commented on code in PR #3098: URL: https://github.com/apache/parquet-java/pull/3098#discussion_r1922778199 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -217,6 +219,70 @@ public > Boolean visit(Contains co

Re: [PR] GH-1452: implement Size() filter for repeated columns [parquet-java]

2025-01-20 Thread via GitHub
clairemcginty commented on code in PR #3098: URL: https://github.com/apache/parquet-java/pull/3098#discussion_r1922778946 ## parquet-column/src/main/java/org/apache/parquet/internal/column/columnindex/ColumnIndexBuilder.java: ## @@ -378,6 +379,11 @@ public > PrimitiveIterator.Of

Re: [PR] GH-1452: implement Size() filter for repeated columns [parquet-java]

2025-01-20 Thread via GitHub
clairemcginty commented on code in PR #3098: URL: https://github.com/apache/parquet-java/pull/3098#discussion_r1922779893 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -217,6 +219,70 @@ public > Boolean visit(Contains co

Re: [PR] GH-1452: implement Size() filter for repeated columns [parquet-java]

2025-01-20 Thread via GitHub
clairemcginty commented on code in PR #3098: URL: https://github.com/apache/parquet-java/pull/3098#discussion_r1922782135 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -217,6 +219,70 @@ public > Boolean visit(Contains co

Re: [PR] GH-3127: Enabled `parquet.hadoop.vectored.io.enabled` by default [parquet-java]

2025-01-20 Thread via GitHub
dongjoon-hyun commented on PR #3128: URL: https://github.com/apache/parquet-java/pull/3128#issuecomment-2602921089 Thank you, @Fokko.

Re: [PR] Bump org.easymock:easymock from 5.4.0 to 5.5.0 [parquet-java]

2025-01-20 Thread via GitHub
Fokko merged PR #3131: URL: https://github.com/apache/parquet-java/pull/3131

Re: [PR] Bump org.apache.arrow:arrow-vector from 17.0.0 to 18.1.0 [parquet-java]

2025-01-20 Thread via GitHub
Fokko commented on PR #3129: URL: https://github.com/apache/parquet-java/pull/3129#issuecomment-2602745345 Arrow 18.0.0 and onward requires Java 11+; see https://arrow.apache.org/docs/18.0/java/install.html