Re: [PR] [SPARK-50891][BUILD] Remove the explicit dependency on `Guava` from `plugins.sbt` [spark]

2025-01-19 Thread via GitHub
LuciferYang commented on PR #49550: URL: https://github.com/apache/spark/pull/49550#issuecomment-2601643020 Thanks @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

Re: [PR] [SPARK-50891][BUILD] Remove the explicit dependency on `Guava` from `plugins.sbt` [spark]

2025-01-19 Thread via GitHub
HyukjinKwon closed pull request #49550: [SPARK-50891][BUILD] Remove the explicit dependency on `Guava` from `plugins.sbt` URL: https://github.com/apache/spark/pull/49550

Re: [PR] [SPARK-50891][BUILD] Remove the explicit dependency on `Guava` from `plugins.sbt` [spark]

2025-01-19 Thread via GitHub
HyukjinKwon commented on PR #49550: URL: https://github.com/apache/spark/pull/49550#issuecomment-2601635600 Merged to master and branch-4.0.

Re: [PR] [SPARK-50666][SQL] Support hint for reading in JDBC data source [spark]

2025-01-19 Thread via GitHub
beliefer commented on code in PR #49564: URL: https://github.com/apache/spark/pull/49564#discussion_r1921893916 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala: ## @@ -321,4 +323,5 @@ object JDBCOptions { val JDBC_CONNECTION_PROVID

Re: [PR] [MINOR][DOCS] Fix miss semicolon on list file sql example [spark]

2025-01-19 Thread via GitHub
MaxGekk closed pull request #49561: [MINOR][DOCS] Fix miss semicolon on list file sql example URL: https://github.com/apache/spark/pull/49561

Re: [PR] [MINOR][DOCS] Fix miss semicolon on list file sql example [spark]

2025-01-19 Thread via GitHub
MaxGekk commented on PR #49561: URL: https://github.com/apache/spark/pull/49561#issuecomment-2601518547 +1, LGTM. Merging to master/4.0. Thank you, @camilesing and @HyukjinKwon for review.

Re: [PR] [SPARK-50666][SQL] Support hint for reading in JDBC data source [spark]

2025-01-19 Thread via GitHub
pan3793 commented on code in PR #49564: URL: https://github.com/apache/spark/pull/49564#discussion_r1921852024 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala: ## @@ -406,7 +406,7 @@ private case class MySQLDialect() extends JdbcDialect with SQLConfHelpe

[PR] common/unsafe: refine arrayEquals [spark]

2025-01-19 Thread via GitHub
cyb70289 opened a new pull request, #49568: URL: https://github.com/apache/spark/pull/49568 This is a trivial change that switches the loop index from `int` to `long`. Surprisingly, a microbenchmark shows a more than twofold performance uplift. Analysis The hot loop of `arrayEq

Re: [PR] [SPARK-50792][SQL][FOLLOWUP] Improve the push down information for binary [spark]

2025-01-19 Thread via GitHub
beliefer commented on code in PR #49555: URL: https://github.com/apache/spark/pull/49555#discussion_r1921843259 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/expressions/expressions.scala: ## @@ -388,12 +388,13 @@ private[sql] object HoursTransform { } privat

Re: [PR] [SPARK-50793][SQL] Fix MySQL cast function for DOUBLE, LONGTEXT, BIGINT and BLOB types [spark]

2025-01-19 Thread via GitHub
yaooqinn commented on code in PR #49453: URL: https://github.com/apache/spark/pull/49453#discussion_r1921825889 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala: ## @@ -112,6 +112,21 @@ private case class MySQLDialect() extends JdbcDialect with SQLConfHel

Re: [PR] [SPARK-50666][SQL] Support hint for reading in JDBC data source [spark]

2025-01-19 Thread via GitHub
LuciferYang commented on PR #49564: URL: https://github.com/apache/spark/pull/49564#issuecomment-2601392588 also cc @beliefer

Re: [PR] [SPARK-50666][SQL] Support hint for reading in JDBC data source [spark]

2025-01-19 Thread via GitHub
LuciferYang commented on code in PR #49564: URL: https://github.com/apache/spark/pull/49564#discussion_r1921806043 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala: ## @@ -321,4 +323,5 @@ object JDBCOptions { val JDBC_CONNECTION_PRO

Re: [PR] [SPARK-50876][ML][PYTHON][CONNECT] Support Tree Regressors on Connect [spark]

2025-01-19 Thread via GitHub
zhengruifeng commented on PR #49566: URL: https://github.com/apache/spark/pull/49566#issuecomment-2601384743 thanks, merged to master/4.0

Re: [PR] [SPARK-50876][ML][PYTHON][CONNECT] Support Tree Regressors on Connect [spark]

2025-01-19 Thread via GitHub
zhengruifeng closed pull request #49566: [SPARK-50876][ML][PYTHON][CONNECT] Support Tree Regressors on Connect URL: https://github.com/apache/spark/pull/49566

[PR] [SPARK-50877][ML][PYTHON][CONNECT] Support KMeans & BisectingKMeans on Connect [spark]

2025-01-19 Thread via GitHub
zhengruifeng opened a new pull request, #49567: URL: https://github.com/apache/spark/pull/49567 ### What changes were proposed in this pull request? Support KMeans & BisectingKMeans on Connect ### Why are the changes needed? For feature parity ### Does this PR

Re: [PR] [SPARK-50666][SQL] Support hint for reading in JDBC data source [spark]

2025-01-19 Thread via GitHub
LuciferYang commented on PR #49564: URL: https://github.com/apache/spark/pull/49564#issuecomment-2601270687 cc @wangyum

Re: [PR] Remove the explicit dependency on `Guava` from `plugins.sbt` [spark]

2025-01-19 Thread via GitHub
LuciferYang commented on code in PR #49550: URL: https://github.com/apache/spark/pull/49550#discussion_r1921747987 ## project/plugins.sbt: ## @@ -21,9 +21,6 @@ addSbtPlugin("software.purpledragon" % "sbt-checkstyle-plugin" % "4.0.1") // please check pom.xml in the root of the

Re: [PR] [SPARK-50082][CORE] Remove some unnecessary Jersey-related warning logs [spark]

2025-01-19 Thread via GitHub
LuciferYang commented on PR #48611: URL: https://github.com/apache/spark/pull/48611#issuecomment-2601244783 > After doing further research, I found out why there are these two warning logs in Spark 4.0 version, but not in Spark 3.x and earlier versions: > > 1. When Spark 4.0 upgraded

Re: [PR] [SPARK-50876][ML][PYTHON][CONNECT] Support Tree Regressors on Connect [spark]

2025-01-19 Thread via GitHub
zhengruifeng commented on PR #49566: URL: https://github.com/apache/spark/pull/49566#issuecomment-2601239768 cc @wbo4958 @HyukjinKwon

[PR] [SPARK-50876][ML][PYTHON][CONNECT] Support Tree Regressors on Connect [spark]

2025-01-19 Thread via GitHub
zhengruifeng opened a new pull request, #49566: URL: https://github.com/apache/spark/pull/49566 ### What changes were proposed in this pull request? Support Tree Regressors on Connect: - DecisionTreeRegressor - RandomForestRegressor - GBTRegressor ##

Re: [PR] [SPARK-50082][CORE] Remove some unnecessary Jersey-related warning logs [spark]

2025-01-19 Thread via GitHub
wayneguow commented on PR #48611: URL: https://github.com/apache/spark/pull/48611#issuecomment-2601185664 > Thank you so much, @wayneguow . I also observed the issue. > > Since this is a log-related thing, it seems a little hard to validate. > > Could you revise your PR descript

Re: [PR] [SPARK-50082][CORE] Remove some unnecessary Jersey-related warning logs [spark]

2025-01-19 Thread via GitHub
dongjoon-hyun commented on PR #48611: URL: https://github.com/apache/spark/pull/48611#issuecomment-2601181702 Ack! Sure, let's wait for @LuciferYang 's opinion.

Re: [PR] [SPARK-50082][CORE] Remove some unnecessary Jersey-related warning logs [spark]

2025-01-19 Thread via GitHub
wayneguow commented on PR #48611: URL: https://github.com/apache/spark/pull/48611#issuecomment-2601181261 > Given that, this PR is the best and safe way to handle these, right, @wayneguow ? @dongjoon-hyun In my opinion, it's like this. We can also wait for @LuciferYang 's opinion.

Re: [PR] [SPARK-50082][CORE] Remove some unnecessary Jersey-related warning logs [spark]

2025-01-19 Thread via GitHub
dongjoon-hyun commented on PR #48611: URL: https://github.com/apache/spark/pull/48611#issuecomment-2601178221 Given that, this PR is the best and safe way to handle these, right, @wayneguow ?

Re: [PR] [SPARK-50082][CORE] Remove some unnecessary Jersey-related warning logs [spark]

2025-01-19 Thread via GitHub
dongjoon-hyun commented on PR #48611: URL: https://github.com/apache/spark/pull/48611#issuecomment-2601177032 Thank you for sharing that, @wayneguow .

Re: [PR] [WIP][SPARK-49401][BUILD] Upgrade `checkstyle` to 10.18.0 & `scalafmt` to 3.8.3 [spark]

2025-01-19 Thread via GitHub
panbingkun closed pull request #47879: [WIP][SPARK-49401][BUILD] Upgrade `checkstyle` to 10.18.0 & `scalafmt` to 3.8.3 URL: https://github.com/apache/spark/pull/47879

Re: [PR] [SPARK-50882][SQL][TESTS] Skip `TPCDSCollationQueryTestSuite.q22-v2.7` test in GitHub Action CI [spark]

2025-01-19 Thread via GitHub
panbingkun commented on PR #49558: URL: https://github.com/apache/spark/pull/49558#issuecomment-2601121203 +1, Late LGTM.

Re: [PR] [SPARK-50886][BUILD][3.5] Upgrade Avro to 1.11.4 [spark]

2025-01-19 Thread via GitHub
panbingkun commented on PR #49563: URL: https://github.com/apache/spark/pull/49563#issuecomment-2601119703 +1, late LGTM

Re: [PR] [SPARK-50890][PYTHON][TESTS][CONNECT] Skip test_take in Spark Connect only build [spark]

2025-01-19 Thread via GitHub
HyukjinKwon closed pull request #49565: [SPARK-50890][PYTHON][TESTS][CONNECT] Skip test_take in Spark Connect only build URL: https://github.com/apache/spark/pull/49565

Re: [PR] [SPARK-50890][PYTHON][TESTS][CONNECT] Skip test_take in Spark Connect only build [spark]

2025-01-19 Thread via GitHub
HyukjinKwon commented on PR #49565: URL: https://github.com/apache/spark/pull/49565#issuecomment-2601114423 Merged to master and branch-4.0.

[PR] [SPARK-50890][PYTHON][TESTS][CONNECT] Skip test_take in Spark Connect only build [spark]

2025-01-19 Thread via GitHub
HyukjinKwon opened a new pull request, #49565: URL: https://github.com/apache/spark/pull/49565 ### What changes were proposed in this pull request? This PR proposes to skip test_take in Spark Connect only build. ### Why are the changes needed? This particular test is flak

Re: [PR] [SPARK-49646][SQL] add spark config for fixing subquery decorrelation for union/set operations when parentOuterReferences has references not covered in collectedChildOuterReferences [spark]

2025-01-19 Thread via GitHub
HyukjinKwon commented on PR #49536: URL: https://github.com/apache/spark/pull/49536#issuecomment-2601102327 > Does this PR introduce any user-facing change? It does add a user-facing configuration, so please describe it.

Re: [PR] [SPARK-49646][SQL] add spark config for fixing subquery decorrelation for union/set operations when parentOuterReferences has references not covered in collectedChildOuterReferences [spark]

2025-01-19 Thread via GitHub
HyukjinKwon commented on PR #49536: URL: https://github.com/apache/spark/pull/49536#issuecomment-2601101849 @AveryQi115 mind reading https://github.com/apache/spark/pull/49536/checks?check_run_id=35749405174, configuring the CI, and running it?

Re: [PR] [SPARK-50792][SQL][FOLLOWUP] Improve the push down information for binary [spark]

2025-01-19 Thread via GitHub
HyukjinKwon commented on code in PR #49555: URL: https://github.com/apache/spark/pull/49555#discussion_r1921668160 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/expressions/expressions.scala: ## @@ -388,12 +388,13 @@ private[sql] object HoursTransform { } pri

Re: [PR] Remove the explicit dependency on `Guava` from `plugins.sbt` [spark]

2025-01-19 Thread via GitHub
HyukjinKwon commented on code in PR #49550: URL: https://github.com/apache/spark/pull/49550#discussion_r1921670214 ## project/plugins.sbt: ## @@ -21,9 +21,6 @@ addSbtPlugin("software.purpledragon" % "sbt-checkstyle-plugin" % "4.0.1") // please check pom.xml in the root of the

Re: [PR] [SPARK-49711][SQL] Remove ExperimentalMethods [spark]

2025-01-19 Thread via GitHub
github-actions[bot] closed pull request #48390: [SPARK-49711][SQL] Remove ExperimentalMethods URL: https://github.com/apache/spark/pull/48390

Re: [PR] Remove the explicit dependency on `Guava` from `plugins.sbt` [spark]

2025-01-19 Thread via GitHub
HyukjinKwon commented on code in PR #49550: URL: https://github.com/apache/spark/pull/49550#discussion_r1921670352 ## project/plugins.sbt: ## @@ -21,9 +21,6 @@ addSbtPlugin("software.purpledragon" % "sbt-checkstyle-plugin" % "4.0.1") // please check pom.xml in the root of the

Re: [PR] [SPARK-49921][CORE] Add task write data time to SQL tab's graph node [spark]

2025-01-19 Thread via GitHub
github-actions[bot] closed pull request #48408: [SPARK-49921][CORE] Add task write data time to SQL tab's graph node URL: https://github.com/apache/spark/pull/48408

Re: [PR] [SPARK-48922][SQL] Optimize nested data type insertion performance [spark]

2025-01-19 Thread via GitHub
github-actions[bot] commented on PR #47381: URL: https://github.com/apache/spark/pull/47381#issuecomment-2601097295 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-48652][SQL] Fix casting issue in Spark SQL when comparing string column to integer value [spark]

2025-01-19 Thread via GitHub
github-actions[bot] closed pull request #47246: [SPARK-48652][SQL] Fix casting issue in Spark SQL when comparing string column to integer value URL: https://github.com/apache/spark/pull/47246

Re: [PR] [SPARK-50792][SQL][FOLLOWUP] Improve the push down information for binary [spark]

2025-01-19 Thread via GitHub
HyukjinKwon commented on code in PR #49555: URL: https://github.com/apache/spark/pull/49555#discussion_r1921669146 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/expressions/expressions.scala: ## @@ -388,12 +388,13 @@ private[sql] object HoursTransform { } pri

Re: [PR] [SPARK-49273][CONNECT][SQL] Origin support for Spark Connect Scala client [spark]

2025-01-19 Thread via GitHub
HyukjinKwon commented on PR #49373: URL: https://github.com/apache/spark/pull/49373#issuecomment-2601078493 Merged to master.

Re: [PR] [SPARK-49273][CONNECT][SQL] Origin support for Spark Connect Scala client [spark]

2025-01-19 Thread via GitHub
HyukjinKwon closed pull request #49373: [SPARK-49273][CONNECT][SQL] Origin support for Spark Connect Scala client URL: https://github.com/apache/spark/pull/49373

Re: [PR] [SPARK-50862][BUILD] Upgrade rocksdbjni to 9.8.4 [spark]

2025-01-19 Thread via GitHub
dongjoon-hyun closed pull request #49538: [SPARK-50862][BUILD] Upgrade rocksdbjni to 9.8.4 URL: https://github.com/apache/spark/pull/49538

Re: [PR] [SPARK-50082][CORE] Remove some unnecessary Jersey-related warning logs [spark]

2025-01-19 Thread via GitHub
wayneguow commented on PR #48611: URL: https://github.com/apache/spark/pull/48611#issuecomment-2600994654 > Are there any negative impacts of disabling these features? Is it possible to retain these features through upgrading dependencies or code changes? For the first question:

Re: [PR] [SPARK-50082][CORE] Remove some unnecessary Jersey-related warning logs [spark]

2025-01-19 Thread via GitHub
wayneguow commented on PR #48611: URL: https://github.com/apache/spark/pull/48611#issuecomment-2600990458 After doing further research, I found out why there are these two warning logs in Spark 4.0 version, but not in Spark 3.x and earlier versions: 1. When Spark 4.0 upgraded `jetty`