Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-10 Thread via GitHub
rdblue commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1878995166 ## VariantShredding.md: ## @@ -25,290 +25,318 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous value

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-10 Thread via GitHub
rdblue commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1878991946 ## VariantShredding.md: ## @@ -25,290 +25,316 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous value

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-10 Thread via GitHub
rdblue commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1878990162 ## VariantShredding.md: ## @@ -25,290 +25,316 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous value

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-10 Thread via GitHub
sfc-gh-saya commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1878991799 ## VariantShredding.md: ## @@ -25,276 +25,299 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-10 Thread via GitHub
rdblue commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1878967986 ## VariantShredding.md: ## @@ -25,276 +25,299 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous value

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-10 Thread via GitHub
rdblue commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1878982761 ## VariantShredding.md: ## @@ -25,276 +25,302 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous value

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-10 Thread via GitHub
rdblue commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1879004931 ## VariantShredding.md: ## @@ -25,290 +25,316 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous value

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-10 Thread via GitHub
rdblue commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1879010171 ## VariantShredding.md: ## @@ -25,290 +25,318 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous value

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-10 Thread via GitHub
rdblue commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1879015316 ## VariantShredding.md: ## @@ -25,276 +25,299 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous value

[I] High Memory Usage and Long GC Times When Writing Parquet Files [parquet-java]

2024-12-10 Thread via GitHub
ccl125 opened a new issue, #3102: URL: https://github.com/apache/parquet-java/issues/3102 ### Describe the usage question you have. Please include as many useful details as possible. In my project, I am using the following code to write Parquet files to the server: `ParquetWr

Re: [I] High Memory Usage and Long GC Times When Writing Parquet Files [parquet-java]

2024-12-10 Thread via GitHub
ccl125 closed issue #3102: High Memory Usage and Long GC Times When Writing Parquet Files URL: https://github.com/apache/parquet-java/issues/3102 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-10 Thread via GitHub
sfc-gh-saya commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1879024065 ## VariantShredding.md: ## @@ -25,276 +25,299 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous

Re: [I] Add more types - time, nano timestamps, UUID to Variant spec [parquet-format]

2024-12-10 Thread via GitHub
emkornfield closed issue #463: Add more types - time, nano timestamps, UUID to Variant spec URL: https://github.com/apache/parquet-format/issues/463

Re: [PR] GH-463: Add more types - time, nano timestamps, UUID to Variant spec [parquet-format]

2024-12-10 Thread via GitHub
emkornfield merged PR #464: URL: https://github.com/apache/parquet-format/pull/464

Re: [PR] GH-463: Add more types - time, nano timestamps, UUID to Variant spec [parquet-format]

2024-12-10 Thread via GitHub
emkornfield commented on PR #464: URL: https://github.com/apache/parquet-format/pull/464#issuecomment-2533118037 Going to merge. Thanks @aihuaxu

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-10 Thread via GitHub
rdblue commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1878909221 ## VariantEncoding.md: ## @@ -39,13 +39,41 @@ Another motivation for the representation is that (aside from metadata) each nes For example, in a Variant containin

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-10 Thread via GitHub
rdblue commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1878945956 ## VariantShredding.md: ## @@ -25,290 +25,316 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous value

Re: [PR] Simplify Variant shredding and refactor for clarity [parquet-format]

2024-12-10 Thread via GitHub
rdblue commented on code in PR #461: URL: https://github.com/apache/parquet-format/pull/461#discussion_r1878954798 ## VariantShredding.md: ## @@ -25,290 +25,318 @@ The Variant type is designed to store and process semi-structured data efficiently, even with heterogeneous value

Re: [PR] GH-473: Add shredding version [parquet-format]

2024-12-10 Thread via GitHub
rdblue commented on code in PR #474: URL: https://github.com/apache/parquet-format/pull/474#discussion_r1879054459 ## src/main/thrift/parquet.thrift: ## @@ -384,6 +384,11 @@ struct BsonType { * Embedded Variant logical type annotation */ struct VariantType { + // If the Va

Re: [PR] GH-473: Add shredding version [parquet-format]

2024-12-10 Thread via GitHub
emkornfield commented on code in PR #474: URL: https://github.com/apache/parquet-format/pull/474#discussion_r1879063140 ## src/main/thrift/parquet.thrift: ## @@ -384,6 +384,11 @@ struct BsonType { * Embedded Variant logical type annotation */ struct VariantType { + // If t