Re: [PR] GH-465: Clarify backward-compatibility rules on LIST type [parquet-format]

2024-11-15 Thread via GitHub
gszadovszky commented on code in PR #466: URL: https://github.com/apache/parquet-format/pull/466#discussion_r1843395569 ## LogicalTypes.md: ## @@ -609,6 +609,17 @@ that is neither contained by a `LIST`- or `MAP`-annotated group nor annotated by `LIST` or `MAP` should be interp

Re: [PR] GH-3059: Add configuration to disable size statistics [parquet-java]

2024-11-15 Thread via GitHub
wgtmac commented on code in PR #3060: URL: https://github.com/apache/parquet-java/pull/3060#discussion_r1843397209 ## parquet-hadoop/src/test/java/org/apache/parquet/hadoop/TestParquetWriter.java: ## @@ -524,4 +527,71 @@ private void testParquetFileNumberOfBlocks( assertE

Re: [PR] GH-3059: Add configuration to disable size statistics [parquet-java]

2024-11-15 Thread via GitHub
Fokko commented on code in PR #3060: URL: https://github.com/apache/parquet-java/pull/3060#discussion_r1843357739 ## parquet-hadoop/src/test/java/org/apache/parquet/hadoop/TestParquetWriter.java: ## @@ -524,4 +527,71 @@ private void testParquetFileNumberOfBlocks( assertEq

Re: [PR] GH-465: Clarify backward-compatibility rules on LIST type [parquet-format]

2024-11-15 Thread via GitHub
pitrou commented on code in PR #466: URL: https://github.com/apache/parquet-format/pull/466#discussion_r1843418984 ## LogicalTypes.md: ## @@ -609,6 +609,17 @@ that is neither contained by a `LIST`- or `MAP`-annotated group nor annotated by `LIST` or `MAP` should be interpreted

Re: [PR] Bump Scala to 2.12.20 [parquet-java]

2024-11-15 Thread via GitHub
Fokko commented on PR #3044: URL: https://github.com/apache/parquet-java/pull/3044#issuecomment-2478384704 Closing this in favor of https://github.com/apache/parquet-java/pull/3063 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

Re: [PR] Remove `parquet-scala` [parquet-java]

2024-11-15 Thread via GitHub
Fokko commented on PR #3063: URL: https://github.com/apache/parquet-java/pull/3063#issuecomment-2478508793 @pan3793 I noticed that the Scala DSL wasn't mentioned to the list, I've added it 👍

Re: [PR] GH-465: Clarify backward-compatibility rules on LIST type [parquet-format]

2024-11-15 Thread via GitHub
etseidl commented on code in PR #466: URL: https://github.com/apache/parquet-format/pull/466#discussion_r1844245348 ## LogicalTypes.md: ## @@ -609,6 +609,17 @@ that is neither contained by a `LIST`- or `MAP`-annotated group nor annotated by `LIST` or `MAP` should be interprete

Re: [PR] GH-2943: Remove hadoop-2 support [parquet-java]

2024-11-15 Thread via GitHub
steveloughran commented on PR #3061: URL: https://github.com/apache/parquet-java/pull/3061#issuecomment-2479679353 Looking at this, I note that Thrift is downloaded from the Apache archives and built every time. Apache infra might be unhappy about this; archive isn't replicated the way others are a

Re: [PR] Bump jackson.version from 2.17.2 to 2.18.1 [parquet-java]

2024-11-15 Thread via GitHub
Fokko merged PR #3052: URL: https://github.com/apache/parquet-java/pull/3052

Re: [PR] Bump org.codehaus.mojo:buildnumber-maven-plugin from 3.2.0 to 3.2.1 [parquet-java]

2024-11-15 Thread via GitHub
Fokko merged PR #3054: URL: https://github.com/apache/parquet-java/pull/3054

Re: [PR] Bump org.xerial.snappy:snappy-java from 1.1.10.5 to 1.1.10.7 [parquet-java]

2024-11-15 Thread via GitHub
Fokko merged PR #3053: URL: https://github.com/apache/parquet-java/pull/3053

Re: [PR] GH-2943: Remove hadoop-2 support [parquet-java]

2024-11-15 Thread via GitHub
steveloughran commented on PR #3061: URL: https://github.com/apache/parquet-java/pull/3061#issuecomment-2478643711 > I think we want to include the Hadoop 2 Github Actions as well I wanted to see what happened there. Let me work on that.

Re: [I] Cannot read parquet file that was generated from nanoparquet [parquet-java]

2024-11-15 Thread via GitHub
wgtmac commented on issue #3043: URL: https://github.com/apache/parquet-java/issues/3043#issuecomment-2478997069 Thanks for reporting this issue! I can confirm that it has been reproduced on my side. Will take a look later.

Re: [I] Remove support for Hadoop <3.3 [parquet-java]

2024-11-15 Thread via GitHub
steveloughran commented on issue #2943: URL: https://github.com/apache/parquet-java/issues/2943#issuecomment-2478727692 w.r.t format testing, got some more thoughts there which would actually be * make some/all the benchmark tests subclass of the (stable) hadoop fs contract tests in h

[PR] Remove `parquet-scala` [parquet-java]

2024-11-15 Thread via GitHub
Fokko opened a new pull request, #3063: URL: https://github.com/apache/parquet-java/pull/3063 ### Rationale for this change Based on the `[DISCUSS]` thread: https://lists.apache.org/thread/scdq9t2gvvs4glhq0qx4qcvfp62j793s ### What changes are included in this PR?

Re: [PR] Bump Scala to 2.12.20 [parquet-java]

2024-11-15 Thread via GitHub
Fokko closed pull request #3044: Bump Scala to 2.12.20 URL: https://github.com/apache/parquet-java/pull/3044

Re: [PR] MINOR: Bump Hadoop to 3.4.1 [parquet-java]

2024-11-15 Thread via GitHub
Fokko closed pull request #3050: MINOR: Bump Hadoop to 3.4.1 URL: https://github.com/apache/parquet-java/pull/3050

Re: [PR] GH-2943: Remove hadoop-2 support [parquet-java]

2024-11-15 Thread via GitHub
Fokko commented on PR #3061: URL: https://github.com/apache/parquet-java/pull/3061#issuecomment-2478359612 @steveloughran I think we want to include the Hadoop 2 Github Actions as well

Re: [PR] Remove `parquet-scala` [parquet-java]

2024-11-15 Thread via GitHub
pan3793 commented on PR #3063: URL: https://github.com/apache/parquet-java/pull/3063#issuecomment-2478455461 Can we leave some words on README or somewhere to mention this removal?

Re: [PR] GH-465: Clarify backward-compatibility rules on LIST type [parquet-format]

2024-11-15 Thread via GitHub
etseidl commented on PR #466: URL: https://github.com/apache/parquet-format/pull/466#issuecomment-2479863682 Apologies for muddying the waters, but I'm still trying to get this all clear in my head. I'm wondering if rather than adding a new rule, can we simply modify Rule 3 to say ```

Re: [PR] GH-455: Add Variant specification docs [parquet-format]

2024-11-15 Thread via GitHub
alamb commented on PR #456: URL: https://github.com/apache/parquet-format/pull/456#issuecomment-2479905612 Does anyone know of parquet implementations that implement the variant type? I would like to try and organize getting this into the Rust implementation (see https://github.com/ap

[PR] Add map_no_value.parquet [parquet-testing]

2024-11-15 Thread via GitHub
etseidl opened a new pull request, #63: URL: https://github.com/apache/parquet-testing/pull/63 https://github.com/apache/parquet-format/pull/469 recently clarified that a `MAP` in a Parquet file need not have a `values` field in the `key_value` group. This PR adds `map_no_value.parquet` whi