Re: [PR] [SPARK-48873] Use UnsafeRow in JSON parser. [spark]

2024-07-14 Thread via GitHub
LuciferYang commented on code in PR #47310: URL: https://github.com/apache/spark/pull/47310#discussion_r1677272149 ## sql/core/benchmarks/DataSourceReadBenchmark-results.txt: ## @@ -1,431 +1,438 @@ -DataSourceReadBenchmark-jdk21-results.txt===

Re: [PR] [SPARK-48873] Use UnsafeRow in JSON parser. [spark]

2024-07-12 Thread via GitHub
chenhao-db commented on PR #47310: URL: https://github.com/apache/spark/pull/47310#issuecomment-2226150952 @LuciferYang I updated the benchmark and PR description. Please take a look. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

Re: [PR] [SPARK-48873] Use UnsafeRow in JSON parser. [spark]

2024-07-11 Thread via GitHub
LuciferYang commented on PR #47310: URL: https://github.com/apache/spark/pull/47310#issuecomment-2224723225 Can we add corresponding test scenarios in `DataSourceReadBenchmark` or any other more suitable benchmarks?

Re: [PR] [SPARK-48873] Use UnsafeRow in JSON parser. [spark]

2024-07-11 Thread via GitHub
chenhao-db commented on PR #47310: URL: https://github.com/apache/spark/pull/47310#issuecomment-2224132842 @sadikovi please take a look, thanks!

[PR] [SPARK-48873] Use UnsafeRow in JSON parser. [spark]

2024-07-11 Thread via GitHub
chenhao-db opened a new pull request, #47310: URL: https://github.com/apache/spark/pull/47310 ### What changes were proposed in this pull request? This PR uses `UnsafeRow` to represent struct results in the JSON parser, which saves memory compared to the current `GenericInternalRow`. The chang