Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-11 Thread via GitHub
emkornfield commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1881046365 ## VariantShredding.md: ## @@ -25,290 +25,316 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-11 Thread via GitHub
emkornfield commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1881045255 ## VariantShredding.md: ## @@ -25,276 +25,302 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-11 Thread via GitHub
emkornfield commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1881053679 ## VariantShredding.md: ## @@ -25,290 +25,320 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-11 Thread via GitHub
emkornfield commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1881054123 ## VariantShredding.md: ## @@ -25,290 +25,316 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-11 Thread via GitHub
emkornfield commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1881064507 ## VariantShredding.md: ## @@ -25,290 +25,320 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous

Re: [PR] MINOR: Fix docstring style [parquet-format]

2024-12-11 Thread via GitHub
Fokko closed pull request #445: MINOR: Fix docstring style URL: https://github.com/apache/parquet-format/pull/445 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe

Re: [PR] MINOR: Fix docstring style [parquet-format]

2024-12-11 Thread via GitHub
Fokko commented on PR #445: URL: https://github.com/apache/parquet-format/pull/445#issuecomment-2537068420 Thanks for raising this PR @alkis, but I'm going to close this one for now. For the thrift definitions, we don't want to lose the history, and if we want to have strict formatting, I t

Re: [PR] MINOR: Test against Java 21 LTS [parquet-format]

2024-12-11 Thread via GitHub
Fokko merged PR #471: URL: https://github.com/apache/parquet-format/pull/471

Re: [PR] GH-3091: Add verification guide and .rat-excludes.txt for release [parquet-java]

2024-12-11 Thread via GitHub
wgtmac commented on PR #3101: URL: https://github.com/apache/parquet-java/pull/3101#issuecomment-2536335161 cc @Fokko @gszadovszky

Re: [PR] GH-3091: Add verification guide and .rat-excludes.txt for release [parquet-java]

2024-12-11 Thread via GitHub
wgtmac commented on code in PR #3101: URL: https://github.com/apache/parquet-java/pull/3101#discussion_r1880426543 ## dev/README.md: ## @@ -91,3 +91,61 @@ Merge hash: 485658a5 Would you like to pick 485658a5 into another branch? (y/n): ``` For now just say n as we have 1 bran

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-11 Thread via GitHub
RussellSpitzer commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1881077834 ## VariantShredding.md: ## @@ -25,290 +25,320 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneo

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-11 Thread via GitHub
emkornfield commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1881108489 ## VariantShredding.md: ## @@ -25,290 +25,320 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous

Re: [I] High Memory Usage and Long GC Times When Writing Parquet Files [parquet-java]

2024-12-11 Thread via GitHub
ccl125 commented on issue #3102: URL: https://github.com/apache/parquet-java/issues/3102#issuecomment-2537600167 I noticed that when I set withDictionaryEncoding(false), the writer switches from using FallbackValuesWriter to PlainValuesWriter. These two have significantly different memory u

Re: [PR] GH-473: Add shredding version [parquet-format]

2024-12-11 Thread via GitHub
rdblue commented on code in PR #474: URL: https://github.com/apache/parquet-format/pull/474#discussion_r1881178197 ## src/main/thrift/parquet.thrift: ## @@ -384,6 +384,11 @@ struct BsonType { * Embedded Variant logical type annotation */ struct VariantType { + // If the Va