Re: [PR] [SPARK-51472] Add gRPC `SparkConnectClient` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #7: URL: https://github.com/apache/spark-connect-swift/pull/7#issuecomment-2716194217 Could you review this too when you have some time, @viirya , please? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[PR] [SPARK-51479][SQL] Nullable in Row Level Operation Column is not correct [spark]

2025-03-11 Thread via GitHub
huaxingao opened a new pull request, #50246: URL: https://github.com/apache/spark/pull/50246 ### What changes were proposed in this pull request? fix nullable in Row Level Operation column ### Why are the changes needed? In iceberg/spark 4.0 integration, there are a f

Re: [PR] [SPARK-51466][SQL][HIVE] Eliminate Hive built-in UDFs initialization on Hive UDF evaluation [spark]

2025-03-11 Thread via GitHub
LuciferYang commented on code in PR #50232: URL: https://github.com/apache/spark/pull/50232#discussion_r1990711565 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFEvaluators.scala: ## @@ -111,17 +116,41 @@ class HiveGenericUDFEvaluator( extends HiveUDFEvaluatorBas

Re: [PR] [SPARK-51466][SQL][HIVE] Eliminate Hive built-in UDFs initialization on Hive UDF evaluation [spark]

2025-03-11 Thread via GitHub
LuciferYang commented on code in PR #50232: URL: https://github.com/apache/spark/pull/50232#discussion_r1990710851 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFEvaluators.scala: ## @@ -111,17 +116,41 @@ class HiveGenericUDFEvaluator( extends HiveUDFEvaluatorBas

Re: [PR] Initial Implementation [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun closed pull request #1: Initial Implementation URL: https://github.com/apache/spark-connect-swift/pull/1

Re: [PR] Initial Implementation [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #1: URL: https://github.com/apache/spark-connect-swift/pull/1#issuecomment-2716696574 I'm closing this PR because all contents are merged. - #7 - #9 - #10 After this, I'm moving forward to - Adding more test cases for various data types and

Re: [PR] [SPARK-51483] Add `SparkSession` and `DataFrame` actors [spark-connect-swift]

2025-03-11 Thread via GitHub
viirya commented on code in PR #10: URL: https://github.com/apache/spark-connect-swift/pull/10#discussion_r1990695806 ## Tests/SparkConnectTests/SparkSessionTests.swift: ## @@ -0,0 +1,74 @@ +// +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribut

Re: [PR] [SPARK-51483] Add `SparkSession` and `DataFrame` actors [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #10: URL: https://github.com/apache/spark-connect-swift/pull/10#issuecomment-2716692413 Thank you. For the record, the first MVP (Minimum Viable Product) is focusing on `SQL` area of Apache Spark 4.0.0 including the following. - SPARK-4 Use ANSI S

Re: [PR] [SPARK-51483] Add `SparkSession` and `DataFrame` actors [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #10: URL: https://github.com/apache/spark-connect-swift/pull/10#issuecomment-2716693543 Merged to main~

Re: [PR] [SPARK-51483] Add `SparkSession` and `DataFrame` actors [spark-connect-swift]

2025-03-11 Thread via GitHub
viirya commented on code in PR #10: URL: https://github.com/apache/spark-connect-swift/pull/10#discussion_r1990701937 ## Tests/SparkConnectTests/SparkSessionTests.swift: ## @@ -0,0 +1,74 @@ +// +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribut

Re: [PR] [SPARK-51483] Add `SparkSession` and `DataFrame` actors [spark-connect-swift]

2025-03-11 Thread via GitHub
viirya commented on code in PR #10: URL: https://github.com/apache/spark-connect-swift/pull/10#discussion_r1990676843 ## Sources/SparkConnect/DataFrame.swift: ## @@ -0,0 +1,195 @@ +// +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

Re: [PR] [SPARK-51483] Add `SparkSession` and `DataFrame` actors [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #10: URL: https://github.com/apache/spark-connect-swift/pull/10#issuecomment-2716683724 Thank you for helping this effort so far, @viirya . This is the last of initial implementation. After this, I'm moving forward to - Adding more test cases for various da

Re: [PR] [SPARK-51483] Add `SparkSession` and `DataFrame` actors [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on code in PR #10: URL: https://github.com/apache/spark-connect-swift/pull/10#discussion_r1990696675 ## Tests/SparkConnectTests/SparkSessionTests.swift: ## @@ -0,0 +1,74 @@ +// +// Licensed to the Apache Software Foundation (ASF) under one +// or more co

Re: [PR] [SPARK-51483] Add `SparkSession` and `DataFrame` actors [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on code in PR #10: URL: https://github.com/apache/spark-connect-swift/pull/10#discussion_r1990687505 ## Sources/SparkConnect/DataFrame.swift: ## @@ -0,0 +1,195 @@ +// +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

Re: [PR] [SPARK-51374][CORE] Switch to Using java.util.Map in Logging APIs [spark]

2025-03-11 Thread via GitHub
panbingkun commented on PR #50138: URL: https://github.com/apache/spark/pull/50138#issuecomment-2716655284 LGTM late.

Re: [PR] [SPARK-51483] Add `SparkSession` and `DataFrame` actors [spark-connect-swift]

2025-03-11 Thread via GitHub
viirya commented on code in PR #10: URL: https://github.com/apache/spark-connect-swift/pull/10#discussion_r1990677314 ## Sources/SparkConnect/DataFrame.swift: ## @@ -0,0 +1,195 @@ +// +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

[PR] [SPARK-51483] Add `SparkSession` and `DataFrame` actors [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun opened a new pull request, #10: URL: https://github.com/apache/spark-connect-swift/pull/10 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

Re: [PR] [SPARK-51466][SQL][HIVE] Eliminate Hive built-in UDFs initialization on Hive UDF evaluation [spark]

2025-03-11 Thread via GitHub
LuciferYang commented on code in PR #50232: URL: https://github.com/apache/spark/pull/50232#discussion_r1990665008 ## sql/hive-thriftserver/pom.xml: ## @@ -148,16 +148,6 @@ byte-buddy-agent test - Review Comment: Is it possible for us to add some conf

Re: [PR] [SPARK-51481] Add `RuntimeConf` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #9: URL: https://github.com/apache/spark-connect-swift/pull/9#issuecomment-2716595208 Merged to main.

Re: [PR] [SPARK-51481] Add `RuntimeConf` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun closed pull request #9: [SPARK-51481] Add `RuntimeConf` actor URL: https://github.com/apache/spark-connect-swift/pull/9

Re: [PR] [SPARK-51481] Add `RuntimeConf` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #9: URL: https://github.com/apache/spark-connect-swift/pull/9#issuecomment-2716590137 This will have an independent release cadence (and version pattern and tags).

Re: [PR] [SPARK-51481] Add `RuntimeConf` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #9: URL: https://github.com/apache/spark-connect-swift/pull/9#issuecomment-2716586970 Thank you for the review, @yaooqinn . Sure, for now, I didn't change it, but I'm also thinking of following the `spark-kubernetes-operator` repository style. https://github.co

Re: [PR] [SPARK-51481] Add `RuntimeConf` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
yaooqinn commented on PR #9: URL: https://github.com/apache/spark-connect-swift/pull/9#issuecomment-2716572466 By the way, can we unbind the version tracker of this project to the Spark main repo? It might be inconvenient for the RM of v4.1.0 to generate the release note. FYI, ht

Re: [PR] [SPARK-51358] [SS] Introduce snapshot upload lag detection through StateStoreCoordinator [spark]

2025-03-11 Thread via GitHub
micheal-o commented on code in PR #50123: URL: https://github.com/apache/spark/pull/50123#discussion_r1990599552 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateStoreProvider.scala: ## @@ -38,9 +38,20 @@ import org.apache.spark.sql.types.Str

Re: [PR] [SPARK-51358] [SS] Introduce snapshot upload lag detection through StateStoreCoordinator [spark]

2025-03-11 Thread via GitHub
micheal-o commented on code in PR #50123: URL: https://github.com/apache/spark/pull/50123#discussion_r1990606620 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreCoordinator.scala: ## @@ -129,10 +174,17 @@ class StateStoreCoordinatorRef private

Re: [PR] [SPARK-51358] [SS] Introduce snapshot upload lag detection through StateStoreCoordinator [spark]

2025-03-11 Thread via GitHub
micheal-o commented on code in PR #50123: URL: https://github.com/apache/spark/pull/50123#discussion_r1990598816 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateStoreProvider.scala: ## @@ -38,9 +38,20 @@ import org.apache.spark.sql.types.Str

Re: [PR] [MINOR][PYTHON] Reformat error classes [spark]

2025-03-11 Thread via GitHub
LuciferYang commented on code in PR #50241: URL: https://github.com/apache/spark/pull/50241#discussion_r1990623258 ## python/pyspark/errors/error-conditions.json: ## @@ -1208,4 +1208,4 @@ "Index must be non-zero." ] } -} +} Review Comment: Does this file not n

Re: [PR] [SPARK-51467][UI] Make tables of the environment page filterable [spark]

2025-03-11 Thread via GitHub
yaooqinn closed pull request #50233: [SPARK-51467][UI] Make tables of the environment page filterable URL: https://github.com/apache/spark/pull/50233

Re: [PR] [SPARK-51365][SQL][TESTS] Add Envs to control the number of `SHUFFLE_EXCHANGE/RESULT_QUERY_STAGE` threads used in test cases related to `SharedSparkSession/TestHive` [spark]

2025-03-11 Thread via GitHub
LuciferYang commented on PR #50206: URL: https://github.com/apache/spark/pull/50206#issuecomment-2716537625 https://github.com/apache/spark/actions/runs/13797191525 ![image](https://github.com/user-attachments/assets/b00b3f6e-c293-4e36-a048-59082190d41c) The latest macOS daily

Re: [PR] [SPARK-51402][SQL][TESTS] Test TimeType in UDF [spark]

2025-03-11 Thread via GitHub
MaxGekk commented on code in PR #50194: URL: https://github.com/apache/spark/pull/50194#discussion_r1990554669 ## sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala: ## @@ -1197,6 +1197,35 @@ class UDFSuite extends QueryTest with SharedSparkSession { Row(Row(nul

Re: [PR] [SPARK-51481] Add `RuntimeConf` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #9: URL: https://github.com/apache/spark-connect-swift/pull/9#issuecomment-2716433529 Could you review this PR, @yaooqinn ?

Re: [PR] [SPARK-48922][SQL] Optimize nested data type insertion performance [spark]

2025-03-11 Thread via GitHub
wForget commented on PR #47381: URL: https://github.com/apache/spark/pull/47381#issuecomment-2716359010 > Hi @wForget Just checking if you had a chance I reopened in #50245

Re: [PR] [SPARK-51338][INFRA] Add automated CI build for `connect-examples` [spark]

2025-03-11 Thread via GitHub
LuciferYang commented on code in PR #50187: URL: https://github.com/apache/spark/pull/50187#discussion_r1990515581 ## connect-examples/server-library-example/pom.xml: ## @@ -36,7 +36,8 @@ UTF-8 2.13 2.13.15 -3.25.4 -4.0.0-preview2 +4.29.3 +4.1.0-SN

Re: [PR] [SPARK-51467][UI] Make tables of the environment page filterable [spark]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #50233: URL: https://github.com/apache/spark/pull/50233#issuecomment-2716363691 Oh, indeed. BTW, I realized that the method `click` has a side-effect. Whenever I clicked the header, the table sorting order is flipped like the following. https://github.

[PR] [SPARK-48922][SQL] Avoid redundant array transform of identical expression for map type [spark]

2025-03-11 Thread via GitHub
wForget opened a new pull request, #50245: URL: https://github.com/apache/spark/pull/50245 ### What changes were proposed in this pull request? Similar to #47843, this patch avoids ArrayTransform in `resolveMapType` function if the resolution expression is the same as input pa

[PR] [SPARK-51481] Add `RuntimeConf` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun opened a new pull request, #9: URL: https://github.com/apache/spark-connect-swift/pull/9 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

Re: [PR] [SPARK-51467][UI] Make tables of the environment page filterable [spark]

2025-03-11 Thread via GitHub
yaooqinn commented on PR #50233: URL: https://github.com/apache/spark/pull/50233#issuecomment-2716337142 It seems that you didn't click any of the `` fields, the click event will trigger both sort & filter. But I don't see a up-or-down `arrow` in your screen shot.

Re: [PR] [SPARK-51472] Add gRPC `SparkConnectClient` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #7: URL: https://github.com/apache/spark-connect-swift/pull/7#issuecomment-2716316357 Thank you so much for your review, Liang-Chi! Merged to main.

Re: [PR] [SPARK-51472] Add gRPC `SparkConnectClient` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on code in PR #7: URL: https://github.com/apache/spark-connect-swift/pull/7#discussion_r1990467780 ## Sources/SparkConnect/SparkConnectClient.swift: ## @@ -0,0 +1,228 @@ +// +// Licensed to the Apache Software Foundation (ASF) under one +// or more contri

Re: [PR] [SPARK-51467][UI] Make tables of the environment page filterable [spark]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #50233: URL: https://github.com/apache/spark/pull/50233#issuecomment-2716314059 Maybe, am I hitting a timing issue?

Re: [PR] [SPARK-51467][UI] Make tables of the environment page filterable [spark]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #50233: URL: https://github.com/apache/spark/pull/50233#issuecomment-2716312558 It's loaded like the following but `Runtime Information` didn't have the filter, @yaooqinn . https://github.com/user-attachments/assets/de7814e7-f265-434d-9362-400796d14dd6

Re: [PR] [SPARK-51472] Add gRPC `SparkConnectClient` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on code in PR #7: URL: https://github.com/apache/spark-connect-swift/pull/7#discussion_r1990477755 ## Sources/SparkConnect/SparkConnectClient.swift: ## @@ -0,0 +1,228 @@ +// +// Licensed to the Apache Software Foundation (ASF) under one +// or more contri

Re: [PR] [SPARK-51472] Add gRPC `SparkConnectClient` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on code in PR #7: URL: https://github.com/apache/spark-connect-swift/pull/7#discussion_r1990474308 ## Sources/SparkConnect/SparkConnectClient.swift: ## @@ -0,0 +1,228 @@ +// +// Licensed to the Apache Software Foundation (ASF) under one +// or more contri

Re: [PR] [SPARK-51467][UI] Make tables of the environment page filterable [spark]

2025-03-11 Thread via GitHub
yaooqinn commented on PR #50233: URL: https://github.com/apache/spark/pull/50233#issuecomment-2716288827 https://github.com/user-attachments/assets/388bde03-b18a-4525-af94-c1d1701fa52e

Re: [PR] [SPARK-51179][SQL] Refactor SupportsOrderingWithinGroup so that centralized check [spark]

2025-03-11 Thread via GitHub
beliefer commented on PR #49908: URL: https://github.com/apache/spark/pull/49908#issuecomment-2716282071 @cloud-fan Could you take a review again?

Re: [PR] [SPARK-51467][UI] Make tables of the environment page filterable [spark]

2025-03-11 Thread via GitHub
yaooqinn commented on PR #50233: URL: https://github.com/apache/spark/pull/50233#issuecomment-2716281938 To show the filter input text box, we need to click the table head, and be sure to see this file loaded @dongjoon-hyun ![image](https://github.com/user-attachments/assets/294876bc-3

Re: [PR] [SPARK-51472] Add gRPC `SparkConnectClient` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on code in PR #7: URL: https://github.com/apache/spark-connect-swift/pull/7#discussion_r1990462746 ## Sources/SparkConnect/SparkConnectClient.swift: ## @@ -0,0 +1,228 @@ +// +// Licensed to the Apache Software Foundation (ASF) under one +// or more contri

Re: [PR] [SPARK-51472] Add gRPC `SparkConnectClient` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on code in PR #7: URL: https://github.com/apache/spark-connect-swift/pull/7#discussion_r1990462126 ## Sources/SparkConnect/SparkConnectClient.swift: ## @@ -0,0 +1,228 @@ +// +// Licensed to the Apache Software Foundation (ASF) under one +// or more contri

Re: [PR] [SPARK-51380][SQL] Add visitSQLFunction and visitAggregateFunction to improve the flexibility of V2ExpressionSQLBuilder [spark]

2025-03-11 Thread via GitHub
beliefer commented on PR #50143: URL: https://github.com/apache/spark/pull/50143#issuecomment-2716276079 @cloud-fan Please take a review again.

Re: [PR] [SPARK-51472] Add gRPC `SparkConnectClient` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
viirya commented on code in PR #7: URL: https://github.com/apache/spark-connect-swift/pull/7#discussion_r1990450408 ## Sources/SparkConnect/SparkConnectClient.swift: ## @@ -0,0 +1,228 @@ +// +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor l

Re: [PR] [SPARK-50759][SQL] Deprecate a few legacy Catalog APIs [spark]

2025-03-11 Thread via GitHub
beliefer commented on PR #50085: URL: https://github.com/apache/spark/pull/50085#issuecomment-2716266455 These API looks not used anymore! +1 for this change.

Re: [PR] [SPARK-51467][UI] Make tables of the environment page filterable [spark]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #50233: URL: https://github.com/apache/spark/pull/50233#issuecomment-2716250511 Strangely, I was unable to verify these new changes. I tried in both Safari and Chrome, in both normal and private (Incognito) modes. Is there something for me to do to see this new chan

Re: [PR] [SPARK-51271][PYTHON] Add filter pushdown API to Python Data Sources [spark]

2025-03-11 Thread via GitHub
beliefer commented on code in PR #49961: URL: https://github.com/apache/spark/pull/49961#discussion_r1990448785 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/python/PythonScan.scala: ## @@ -16,26 +16,44 @@ */ package org.apache.spark.sql.execution.d

Re: [PR] [SPARK-51271][PYTHON] Add filter pushdown API to Python Data Sources [spark]

2025-03-11 Thread via GitHub
beliefer commented on code in PR #49961: URL: https://github.com/apache/spark/pull/49961#discussion_r1990447794 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/python/PythonScan.scala: ## @@ -16,26 +16,44 @@ */ package org.apache.spark.sql.execution.d

Re: [PR] [SPARK-51472] Add gRPC `SparkConnectClient` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #7: URL: https://github.com/apache/spark-connect-swift/pull/7#issuecomment-2716252115 Thank you so much, @viirya !

Re: [PR] [SPARK-51472] Add gRPC `SparkConnectClient` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
viirya commented on PR #7: URL: https://github.com/apache/spark-connect-swift/pull/7#issuecomment-2716237805 Looking into this.

Re: [PR] [SPARK-51467][UI] Make tables of the environment page filterable [spark]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #50233: URL: https://github.com/apache/spark/pull/50233#issuecomment-2716229205 Let me build and verify manually too~

Re: [PR] [SPARK-50759][SQL] Deprecate a few legacy Catalog APIs [spark]

2025-03-11 Thread via GitHub
LuciferYang commented on PR #50085: URL: https://github.com/apache/spark/pull/50085#issuecomment-2716219685 ![image](https://github.com/user-attachments/assets/763aae0f-c852-4299-bd49-d33ae9a0e98b)

Re: [PR] [SPARK-51477] Enable autolink to SPARK jira issue [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun closed pull request #8: [SPARK-51477] Enable autolink to SPARK jira issue URL: https://github.com/apache/spark-connect-swift/pull/8

Re: [PR] [SPARK-51477] Enable autolink to SPARK jira issue [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #8: URL: https://github.com/apache/spark-connect-swift/pull/8#issuecomment-2716185603 Thank you, @viirya ! Merged to main.

Re: [PR] [SPARK-50464] Support unsigned integer for Arrow [spark]

2025-03-11 Thread via GitHub
miscenko commented on PR #49022: URL: https://github.com/apache/spark/pull/49022#issuecomment-2716153988 > > Hi, I am curious, what is the status of this PR? Was it abandoned ? > > it seems that nobody is interested in reviewing this pr. ಥ_ಥ It's a pity, I think this is a very u

Re: [PR] [SPARK-50464] Support unsigned integer for Arrow [spark]

2025-03-11 Thread via GitHub
chenkovsky commented on PR #49022: URL: https://github.com/apache/spark/pull/49022#issuecomment-2716146962 > Hi, I am curious, what is the status of this PR? Was it abandoned ? it seems that nobody is interested in reviewing this pr. ಥ_ಥ

Re: [PR] [SPARK-51417][CONNECT] Give a second to wait for Spark Connect server to fully start [spark]

2025-03-11 Thread via GitHub
HyukjinKwon commented on PR #50181: URL: https://github.com/apache/spark/pull/50181#issuecomment-2703145836 cc @cloud-fan I think it's still better to give one sec after the log file is created.

Re: [PR] [CORE][CONNECT] Add cogroup and cogroupSorted variants for additional KeyValueGroupedDatasets. [spark]

2025-03-11 Thread via GitHub
kyle-winkelman commented on PR #49754: URL: https://github.com/apache/spark/pull/49754#issuecomment-2704496368 @hvanhovell, @HyukjinKwon, and @EnricoMi, sorry to keep bothering you. I am hoping to find out if this is something that might be considered in the near future or if I should consi

Re: [PR] [SPARK-51477] Enable autolink to SPARK jira issue [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #8: URL: https://github.com/apache/spark-connect-swift/pull/8#issuecomment-2716135545 Could you review this PR when you have some time, @HyukjinKwon ?

Re: [PR] [SPARK-51472] Add gRPC `SparkConnectClient` actor [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #7: URL: https://github.com/apache/spark-connect-swift/pull/7#issuecomment-2716135412 Could you review this PR when you have some time, @HyukjinKwon ?

Re: [PR] [SPARK-51473][ML][CONNECT] ML transformed dataframe keep a reference to the model [spark]

2025-03-11 Thread via GitHub
WeichenXu123 commented on code in PR #50199: URL: https://github.com/apache/spark/pull/50199#discussion_r1990366864 ## python/pyspark/ml/util.py: ## @@ -185,29 +185,40 @@ def wrapped(self: "JavaWrapper", dataset: "ConnectDataFrame") -> Any: assert isinstance(

Re: [PR] [Minor][SQL][Tests] Remove duplicated plan node check in DataFrameSetOperationsSuite [spark]

2025-03-11 Thread via GitHub
Surbhi-Vijay commented on code in PR #50227: URL: https://github.com/apache/spark/pull/50227#discussion_r1987608890 ## sql/core/src/test/scala/org/apache/spark/sql/DataFrameSetOperationsSuite.scala: ## @@ -1493,8 +1493,6 @@ class DataFrameSetOperationsSuite extends QueryTest

Re: [PR] [SPARK-51473][ML][CONNECT] ML transformed dataframe keep a reference to the model [spark]

2025-03-11 Thread via GitHub
WeichenXu123 commented on code in PR #50199: URL: https://github.com/apache/spark/pull/50199#discussion_r1990355750 ## python/pyspark/ml/util.py: ## @@ -185,29 +185,40 @@ def wrapped(self: "JavaWrapper", dataset: "ConnectDataFrame") -> Any: assert isinstance(

Re: [PR] [SPARK-51271][PYTHON] Add filter pushdown API to Python Data Sources [spark]

2025-03-11 Thread via GitHub
wengh commented on code in PR #49961: URL: https://github.com/apache/spark/pull/49961#discussion_r1990350885 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/python/PythonScan.scala: ## @@ -16,26 +16,43 @@ */ package org.apache.spark.sql.execution.data

Re: [PR] [SPARK-51478][ML] Validate SQLTransformer statement by parsed plan [spark]

2025-03-11 Thread via GitHub
WeichenXu123 commented on code in PR #50244: URL: https://github.com/apache/spark/pull/50244#discussion_r1990350728 ## mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala: ## @@ -78,8 +79,19 @@ class SQLTransformer @Since("1.6.0") (@Since("1.6.0") override val

Re: [PR] [WIP][SPARK-XXXX][Collation] Prevent Regex with collated strings [spark]

2025-03-11 Thread via GitHub
github-actions[bot] commented on PR #49020: URL: https://github.com/apache/spark/pull/49020#issuecomment-2716033807 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-51476][PYTHON][DOCS] Update document type conversion for Pandas UDFs (pyarrow 17.0.0, pandas 2.2.3, Python 3.11) [spark]

2025-03-11 Thread via GitHub
HyukjinKwon commented on code in PR #50242: URL: https://github.com/apache/spark/pull/50242#discussion_r1990319328 ## python/pyspark/sql/pandas/functions.py: ## @@ -366,7 +366,7 @@ def calculate(iterator: Iterator[pd.Series]) -> Iterator[pd.Series]: # Note: DDL formatted s

[PR] [SPARK-51477] Enable autolink to SPARK jira issue [spark-connect-swift]

2025-03-11 Thread via GitHub
dongjoon-hyun opened a new pull request, #8: URL: https://github.com/apache/spark-connect-swift/pull/8 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[PR] [WIP][ML] Validate SQLTransformer statement by parsed plan [spark]

2025-03-11 Thread via GitHub
zhengruifeng opened a new pull request, #50244: URL: https://github.com/apache/spark/pull/50244 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### Ho

[PR] [WIP][SQL] Support typed literals of the TIME data type [spark]

2025-03-11 Thread via GitHub
MaxGekk opened a new pull request, #50228: URL: https://github.com/apache/spark/pull/50228 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

[PR] Update supported_api_gen.py: remove an invalid escape sequence "\_" using a raw string [spark]

2025-03-11 Thread via GitHub
wyattscarpenter opened a new pull request, #50243: URL: https://github.com/apache/spark/pull/50243 (no comment)

Re: [PR] [SPARK-50763][SQL] Add Analyzer rule for resolving SQL table functions [spark]

2025-03-11 Thread via GitHub
cloud-fan commented on code in PR #49471: URL: https://github.com/apache/spark/pull/49471#discussion_r1990231759 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala: ## @@ -1675,6 +1676,86 @@ class SessionCatalog( } } + /** + *

[PR] [SPARK-51476][PYTHON][DOCS] Update document type conversion for Pandas UDFs (pyarrow 17.0.0, pandas 2.2.3, Python 3.11) [spark]

2025-03-11 Thread via GitHub
HyukjinKwon opened a new pull request, #50242: URL: https://github.com/apache/spark/pull/50242 ### What changes were proposed in this pull request? This PR updates the chart generated at [SPARK-25666](https://issues.apache.org/jira/browse/SPARK-25666). ### Why are the changes n

Re: [PR] [SPARK-51250][K8S] Add Support for K8s PriorityClass Configuration fo… [spark]

2025-03-11 Thread via GitHub
zemin-piao commented on PR #49998: URL: https://github.com/apache/spark/pull/49998#issuecomment-2715839216 Bump :P

[PR] [MINOR][PYTHON] Reformat error classes [spark]

2025-03-11 Thread via GitHub
HyukjinKwon opened a new pull request, #50241: URL: https://github.com/apache/spark/pull/50241 ### What changes were proposed in this pull request? This PR proposes to reformat error classes by:

```python
from pyspark.errors.exceptions import _write_self; _write_self()
```

Re: [PR] [SPARK-50464] Support unsigned integer for Arrow [spark]

2025-03-11 Thread via GitHub
miscenko commented on PR #49022: URL: https://github.com/apache/spark/pull/49022#issuecomment-2715779075 Hi, I am curious: what is the status of this PR? Was it abandoned?

[PR] Delay `Join.metadataOutput` computation until `Join` is resolved [spark]

2025-03-11 Thread via GitHub
mihailotim-db opened a new pull request, #50240: URL: https://github.com/apache/spark/pull/50240 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### H

Re: [PR] [SPARK-51350][SQL] Implement Show Procedures [spark]

2025-03-11 Thread via GitHub
szehon-ho commented on code in PR #50109: URL: https://github.com/apache/spark/pull/50109#discussion_r1990117123 ## sql/core/src/test/scala/org/apache/spark/sql/connector/ProcedureSuite.scala: ## @@ -40,15 +40,23 @@ class ProcedureSuite extends QueryTest with SharedSparkSession

Re: [PR] [SPARK-51350][SQL] Implement Show Procedures [spark]

2025-03-11 Thread via GitHub
szehon-ho commented on code in PR #50109: URL: https://github.com/apache/spark/pull/50109#discussion_r1990109864 ## sql/core/src/test/scala/org/apache/spark/sql/connector/ProcedureSuite.scala: ## @@ -40,15 +40,23 @@ class ProcedureSuite extends QueryTest with SharedSparkSession

Re: [PR] [MINOR][BUILD]: Fix merge_spark_pr script for no jira case [spark]

2025-03-11 Thread via GitHub
viirya commented on PR #50237: URL: https://github.com/apache/spark/pull/50237#issuecomment-2715682204 Thank you @dongjoon-hyun

Re: [PR] [SPARK-51350][SQL] Implement Show Procedures [spark]

2025-03-11 Thread via GitHub
szehon-ho commented on code in PR #50109: URL: https://github.com/apache/spark/pull/50109#discussion_r1990109864 ## sql/core/src/test/scala/org/apache/spark/sql/connector/ProcedureSuite.scala: ## @@ -40,15 +40,23 @@ class ProcedureSuite extends QueryTest with SharedSparkSession

Re: [PR] [SPARK-51350][SQL] Implement Show Procedures [spark]

2025-03-11 Thread via GitHub
szehon-ho commented on code in PR #50109: URL: https://github.com/apache/spark/pull/50109#discussion_r1990109152 ## sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/InMemoryTableCatalog.scala: ## @@ -268,6 +268,11 @@ class InMemoryTableCatalog extends BasicInM

Re: [PR] [SPARK-49488][SQL][FOLLOWUP] Use correct MySQL datetime functions when pushing down EXTRACT [spark]

2025-03-11 Thread via GitHub
beliefer commented on PR #50112: URL: https://github.com/apache/spark/pull/50112#issuecomment-2703041060 @cloud-fan @dongjoon-hyun Thank you!

Re: [PR] [SPARK-43221][CORE] Host local block fetching should use a block status of a block stored on disk [spark]

2025-03-11 Thread via GitHub
attilapiros commented on code in PR #50122: URL: https://github.com/apache/spark/pull/50122#discussion_r1990091266 ## core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala: ## @@ -474,6 +474,26 @@ class BlockManagerSuite extends SparkFunSuite with Matchers with P

Re: [PR] [MINOR][BUILD]: Fix merge_spark_pr script for no jira case [spark]

2025-03-11 Thread via GitHub
viirya commented on PR #50237: URL: https://github.com/apache/spark/pull/50237#issuecomment-2715620441 Thanks @HyukjinKwon

Re: [PR] [MINOR][BUILD]: Fix merge_spark_pr script for no jira case [spark]

2025-03-11 Thread via GitHub
viirya closed pull request #50237: [MINOR][BUILD]: Fix merge_spark_pr script for no jira case URL: https://github.com/apache/spark/pull/50237

Re: [PR] [SPARK-XXXXX][SQL] Don't insert redundant ColumnarToRowExec [spark]

2025-03-11 Thread via GitHub
viirya commented on PR #50239: URL: https://github.com/apache/spark/pull/50239#issuecomment-2715607848 I will add JIRA later.

[PR] [SPARK-XXXXX][SQL] Don't insert redundant ColumnarToRowExec [spark]

2025-03-11 Thread via GitHub
viirya opened a new pull request, #50239: URL: https://github.com/apache/spark/pull/50239 ### What changes were proposed in this pull request? This patch fixes a corner case in `ApplyColumnarRulesAndInsertTransitions`. When a plan is required to output rows, if the node suppo

Re: [PR] [SPARK-51446][SQL] Improve the codecNameMap for the compression codec [spark]

2025-03-11 Thread via GitHub
beliefer closed pull request #50221: [SPARK-51446][SQL] Improve the codecNameMap for the compression codec URL: https://github.com/apache/spark/pull/50221

Re: [PR] [SPARK-51452][UI] Improve Thread dump table search [spark]

2025-03-11 Thread via GitHub
dongjoon-hyun commented on PR #50225: URL: https://github.com/apache/spark/pull/50225#issuecomment-2711437716 I also built and manually verified this PR. Merged to master for Apache Spark 4.1.0.
