[PR] [SPARK-51136][HISTORYSERVER] Set CallerContext for History Server [spark]

2025-02-09 Thread via GitHub
cnauroth opened a new pull request, #49858: URL: https://github.com/apache/spark/pull/49858 ### What changes were proposed in this pull request? Initialize the Hadoop RPC `CallerContext` during History Server startup, before `FileSystem` access. Calls to HDFS will get tagged in the au

Re: [PR] [SPARK-51136][HISTORYSERVER] Set CallerContext for History Server [spark]

2025-02-09 Thread via GitHub
cnauroth commented on code in PR #49858: URL: https://github.com/apache/spark/pull/49858#discussion_r1948244266 ## core/src/main/scala/org/apache/spark/util/Utils.scala: ## @@ -3151,9 +3152,31 @@ private[spark] object Utils } } -private[util] object CallerContext extends L

Re: [PR] [SPARK-51138][PYTHON][CONNECT][TESTS] Skip pyspark.sql.tests.connect.test_parity_frame_plot_plotly.FramePlotPlotlyParityTests.test_area_plot [spark]

2025-02-09 Thread via GitHub
HyukjinKwon commented on PR #49859: URL: https://github.com/apache/spark/pull/49859#issuecomment-2646679978 cc @xinrong-meng mind taking a look please? I think the plot tests fail with dependencies with different versions. -- This is an automated message from the Apache Git Service. To re

[PR] [SPARK-51138][PYTHON][CONNECT][TESTS] Skip pyspark.sql.tests.connect.test_parity_frame_plot_plotly.FramePlotPlotlyParityTests.test_area_plot [spark]

2025-02-09 Thread via GitHub
HyukjinKwon opened a new pull request, #49859: URL: https://github.com/apache/spark/pull/49859 ### What changes were proposed in this pull request? This PR proposes to skip pyspark.sql.tests.connect.test_parity_frame_plot_plotly.FramePlotPlotlyParityTests.test_area_plot. ### Wh

Re: [PR] [SPARK-51138][PYTHON][CONNECT][TESTS] Skip pyspark.sql.tests.connect.test_parity_frame_plot_plotly.FramePlotPlotlyParityTests.test_area_plot [spark]

2025-02-09 Thread via GitHub
HyukjinKwon commented on PR #49859: URL: https://github.com/apache/spark/pull/49859#issuecomment-2646680247 cc @zhengruifeng too -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

Re: [PR] [TYPING] Add type overloads for inplace dataframe operations [spark]

2025-02-09 Thread via GitHub
github-actions[bot] commented on PR #48662: URL: https://github.com/apache/spark/pull/48662#issuecomment-2646681728 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-49730][SQL] classify syntax errors for pgsql, mysql, sqlserver and h2 [spark]

2025-02-09 Thread via GitHub
github-actions[bot] commented on PR #48368: URL: https://github.com/apache/spark/pull/48368#issuecomment-2646681753 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-49827][SQL] Fetching all partitions from hive metastore in batches [spark]

2025-02-09 Thread via GitHub
github-actions[bot] commented on PR #48337: URL: https://github.com/apache/spark/pull/48337#issuecomment-2646681764 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-50101][SQL] Fix collated behavior of StringToMap expression [spark]

2025-02-09 Thread via GitHub
github-actions[bot] closed pull request #48642: [SPARK-50101][SQL] Fix collated behavior of StringToMap expression URL: https://github.com/apache/spark/pull/48642 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

Re: [PR] [SPARK-35564][SQL] Support subexpression elimination for conditionally evaluated expressions [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #32987: URL: https://github.com/apache/spark/pull/32987#discussion_r1948345527 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala: ## @@ -255,4 +304,26 @@ case class ExpressionEquals(e: Expressio

Re: [PR] [SQL][SPARK-51113] Fix correctness with UNION/EXCEPT/INTERSECT inside a view or EXECUTE IMMEDIATE [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #49835: URL: https://github.com/apache/spark/pull/49835#discussion_r1948346938 ## sql/core/src/test/resources/sql-tests/inputs/view-correctness.sql: ## @@ -0,0 +1,421 @@ +-- This test suite checks the correctness of queries over views + +-- SPAR

Re: [PR] [SPARK-51132][ML][BUILD] Upgrade `JPMML` to 1.7.1 [spark]

2025-02-09 Thread via GitHub
LuciferYang commented on PR #49854: URL: https://github.com/apache/spark/pull/49854#issuecomment-2646829396 Just got back from vacation, I'll take a look tonight. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] [SPARK-51132][ML][BUILD] Upgrade `JPMML` to 1.7.1 [spark]

2025-02-09 Thread via GitHub
wayneguow commented on code in PR #49854: URL: https://github.com/apache/spark/pull/49854#discussion_r1948351826 ## pom.xml: ## @@ -615,16 +615,17 @@ org.jvnet.staxex stax-ex - -jakarta.activation Review Co

Re: [PR] [SPARK-51132][ML][BUILD] Upgrade `JPMML` to 1.7.1 [spark]

2025-02-09 Thread via GitHub
wayneguow commented on PR #49854: URL: https://github.com/apache/spark/pull/49854#issuecomment-2646834860 > Should the change be mentioned in the migration guide? Otherwise LGTM Add a change description in `ml-migration-guide.md`. -- This is an automated message from the Apache Git

[PR] [SPARK-51139][ML][CONNECT] Refine error class `MLAttributeNotAllowedException` [spark]

2025-02-09 Thread via GitHub
zhengruifeng opened a new pull request, #49860: URL: https://github.com/apache/spark/pull/49860 ### What changes were proposed in this pull request? Refine error class `MLAttributeNotAllowedException` ### Why are the changes needed? this error message should contain

Re: [PR] [SPARK-51132][ML][BUILD] Upgrade `JPMML` to 1.7.1 [spark]

2025-02-09 Thread via GitHub
pan3793 commented on code in PR #49854: URL: https://github.com/apache/spark/pull/49854#discussion_r1948320823 ## pom.xml: ## @@ -599,7 +599,7 @@ org.glassfish.jaxb jaxb-runtime -2.3.2 +4.0.5 Review Comment: I mean `com.sun.xml.fasti

Re: [PR] [SPARK-50917][EXAMPLES] Add SparkConnectPi Scala example to work both for Connect and Classic [spark]

2025-02-09 Thread via GitHub
yaooqinn commented on PR #49617: URL: https://github.com/apache/spark/pull/49617#issuecomment-2646776731 Thank you @cloud-fan and @dongjoon-hyun, SparkDataFramePi sounds good to me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Git

Re: [PR] [SPARK-51132][ML][BUILD] Upgrade `JPMML` to 1.7.1 [spark]

2025-02-09 Thread via GitHub
pan3793 commented on PR #49854: URL: https://github.com/apache/spark/pull/49854#issuecomment-2646775721 Should the change be mentioned in the migration guide? Otherwise LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] [SPARK-51119][SQL] Readers on executors resolving EXISTS_DEFAULT should not call catalogs [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #49840: URL: https://github.com/apache/spark/pull/49840#discussion_r1948322440 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala: ## @@ -265,6 +267,19 @@ object Literal { s"Literal must have a corresp

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1948357356 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/v2/SetVariableExec.scala: ## @@ -17,26 +17,29 @@ package org.apache.spark.sql.execution.command.v

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1948357547 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/v2/SetVariableExec.scala: ## @@ -47,21 +50,51 @@ case class SetVariableExec(variables: Seq[Variable

[PR] [SPARK-51142][ML][CONNECT] ML protobufs clean up [spark]

2025-02-09 Thread via GitHub
zhengruifeng opened a new pull request, #49862: URL: https://github.com/apache/spark/pull/49862 ### What changes were proposed in this pull request? ML protobufs clean up ### Why are the changes needed? to follow the guide https://github.com/apache/spark/blob/ece1470

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-02-09 Thread via GitHub
dusantism-db commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1948366905 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/v2/SetVariableExec.scala: ## @@ -47,21 +50,51 @@ case class SetVariableExec(variables: Seq[Varia

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-02-09 Thread via GitHub
dusantism-db commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1948367347 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/v2/SetVariableExec.scala: ## @@ -17,26 +17,29 @@ package org.apache.spark.sql.execution.comman

Re: [PR] [SPARK-51142][ML][CONNECT] ML protobufs clean up [spark]

2025-02-09 Thread via GitHub
zhengruifeng commented on code in PR #49862: URL: https://github.com/apache/spark/pull/49862#discussion_r1948375370 ## sql/connect/common/src/main/protobuf/spark/connect/ml_common.proto: ## @@ -33,24 +33,32 @@ message MlParams { // MLOperator represents the ML operators like

Re: [PR] [SPARK-51142][ML][CONNECT] ML protobufs clean up [spark]

2025-02-09 Thread via GitHub
zhengruifeng commented on PR #49862: URL: https://github.com/apache/spark/pull/49862#issuecomment-2646882798 cc @grundprinzip and @wbo4958 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spe

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1948356321 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala: ## @@ -73,28 +95,49 @@ class ResolveCatalogs(val catalogManager: CatalogM

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1948356233 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala: ## @@ -73,28 +95,49 @@ class ResolveCatalogs(val catalogManager: CatalogM

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1948357083 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/VariableManager.scala: ## @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] [SPARK-51133][BUILD] Upgrade Apache `commons-pool2` to 2.12.1 [spark]

2025-02-09 Thread via GitHub
LuciferYang commented on PR #49856: URL: https://github.com/apache/spark/pull/49856#issuecomment-2646847348 Merged into master. Thanks @wayneguow -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1948354168 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala: ## @@ -266,22 +275,42 @@ trait ColumnResolutionHelper extends Logg

Re: [PR] [SPARK-51133][BUILD] Upgrade Apache `commons-pool2` to 2.12.1 [spark]

2025-02-09 Thread via GitHub
LuciferYang closed pull request #49856: [SPARK-51133][BUILD] Upgrade Apache `commons-pool2` to 2.12.1 URL: https://github.com/apache/spark/pull/49856 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [SPARK-51138][PYTHON][CONNECT][TESTS] Skip pyspark.sql.tests.connect.test_parity_frame_plot_plotly.FramePlotPlotlyParityTests.test_area_plot [spark]

2025-02-09 Thread via GitHub
zhengruifeng commented on PR #49859: URL: https://github.com/apache/spark/pull/49859#issuecomment-2646858360 This failure also happens in https://github.com/apache/spark/actions/workflows/build_python_3.11_macos.yml after checking the history, I think it is due to the `plotly` upgrade

[PR] [SPARK-51143][PYTHON] Pin `plotly==5.24.1` [spark]

2025-02-09 Thread via GitHub
zhengruifeng opened a new pull request, #49863: URL: https://github.com/apache/spark/pull/49863 ### What changes were proposed in this pull request? Pin `plotly==5.24.1` ### Why are the changes needed? the latest plotly 6.0 has caused many plot-related test failures

Re: [PR] [SPARK-51143][PYTHON] Pin `plotly==5.24.1` [spark]

2025-02-09 Thread via GitHub
zhengruifeng commented on PR #49863: URL: https://github.com/apache/spark/pull/49863#issuecomment-2646873479 also cc @cloud-fan we probably need to pin `plotly` in the release docker image -- This is an automated message from the Apache Git Service. To respond to the message, please log o

Re: [PR] [SPARK-51139][ML][CONNECT] Refine error class `MLAttributeNotAllowedException` [spark]

2025-02-09 Thread via GitHub
zhengruifeng closed pull request #49860: [SPARK-51139][ML][CONNECT] Refine error class `MLAttributeNotAllowedException` URL: https://github.com/apache/spark/pull/49860 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

Re: [PR] [SPARK-51139][ML][CONNECT] Refine error class `MLAttributeNotAllowedException` [spark]

2025-02-09 Thread via GitHub
zhengruifeng commented on PR #49860: URL: https://github.com/apache/spark/pull/49860#issuecomment-2646875702 thanks, merged to master/4.0 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci

Re: [PR] [SPARK-51143][PYTHON] Pin `plotly==5.24.1` [spark]

2025-02-09 Thread via GitHub
zhengruifeng commented on PR #49863: URL: https://github.com/apache/spark/pull/49863#issuecomment-2647094992 update in the docker file trigger the refresh of the cache, and `torch` is also upgraded and caused ```

Re: [PR] [SPARK-51132][ML][BUILD] Upgrade `JPMML` to 1.7.1 [spark]

2025-02-09 Thread via GitHub
vruusmann commented on PR #49854: URL: https://github.com/apache/spark/pull/49854#issuecomment-2646143491 A quick memory dump on JPMML-Model evolution: - `1.5.X`. Migrating from PMML schema 4.3 to 4.4. JDK 8 compatible. - `1.6.X`. API upgrades. Specifically, replacing `org.dmg.pmml.

Re: [PR] [SPARK-51132][ML][BUILD] Upgrade `JPMML` to 1.7.1 [spark]

2025-02-09 Thread via GitHub
wayneguow commented on PR #49854: URL: https://github.com/apache/spark/pull/49854#issuecomment-2646149933 > Apache Spark 4.X is targeting JDK 17, no? Yes, Spark 4.x is targeting JDK 17. https://github.com/apache/spark/blob/301b666a1fcbd4c59d96c53fe3a547ea1512f397/pom.xml#L117

Re: [PR] [SPARK-51132][ML][BUILD] Upgrade `JPMML` to 1.7.1 [spark]

2025-02-09 Thread via GitHub
vruusmann commented on PR #49854: URL: https://github.com/apache/spark/pull/49854#issuecomment-2646164469 > Yes, Spark 4.x is targeting JDK 17. Perhaps it is then possible to be even more aggressive on some transitive dependency updates (than JPMML-Model 1.7.1 proposes). My goa

Re: [PR] [SQL][SPARK-51113] Fix correctness with UNION/EXCEPT/INTERSECT inside a view or EXECUTE IMMEDIATE [spark]

2025-02-09 Thread via GitHub
srielau commented on PR #49835: URL: https://github.com/apache/spark/pull/49835#issuecomment-2646656062 > Here's the commit that enabled `SELECT 1 UNION SELECT 2` syntax in SQL parser: #40835. This commit enabled view creation with such SQL text (`parsePlan` is used for `CREATE VIEW`). >

Re: [PR] [SQL][SPARK-51113] Fix correctness with UNION/EXCEPT/INTERSECT inside a view or EXECUTE IMMEDIATE [spark]

2025-02-09 Thread via GitHub
srielau commented on PR #49835: URL: https://github.com/apache/spark/pull/49835#issuecomment-2646655795 > `spark.conf.set("spark.sql.ansi.enforceReservedKeywords", "true")` mitigates the problem, since UNION becomes a reserved keyword. True, a rather hefty mitigation, though. Lots of

Re: [PR] [SPARK-51136][HISTORYSERVER] Set CallerContext for History Server [spark]

2025-02-09 Thread via GitHub
cnauroth commented on PR #49858: URL: https://github.com/apache/spark/pull/49858#issuecomment-2646610534 If approved, can this also go into branch-3.5 please? The cherry-pick would need a minor merge conflict resolution in `FsHistoryProvider` import statements, or I can send a separate pull

Re: [PR] [SPARK-51111][DSTREAM] Avoid consumer rebalancing stuck when starting a spark streaming job [spark]

2025-02-09 Thread via GitHub
yabola commented on PR #49831: URL: https://github.com/apache/spark/pull/49831#issuecomment-2646800578 @tdas Hi~ could you help review this, thanks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [SPARK-51008][SQL] Add ResultStage for AQE [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #49715: URL: https://github.com/apache/spark/pull/49715#discussion_r1948335832 ## sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala: ## @@ -344,56 +346,53 @@ case class AdaptiveSparkPlanExec( if

Re: [PR] [SPARK-48239][INFRA][FOLLOWUP] Update the release docker image to follow what we use in Github Action jobs [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on PR #49851: URL: https://github.com/apache/spark/pull/49851#issuecomment-2646810575 The streaming test failure is unrelated, thanks for the review, merging to master/4.0! -- This is an automated message from the Apache Git Service. To respond to the message, please l

Re: [PR] [SPARK-48239][INFRA][FOLLOWUP] Update the release docker image to follow what we use in Github Action jobs [spark]

2025-02-09 Thread via GitHub
cloud-fan closed pull request #49851: [SPARK-48239][INFRA][FOLLOWUP] Update the release docker image to follow what we use in Github Action jobs URL: https://github.com/apache/spark/pull/49851 -- This is an automated message from the Apache Git Service. To respond to the message, please log o

Re: [PR] [SPARK-51109][SQL] CTE in subquery expression as grouping column [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on PR #49829: URL: https://github.com/apache/spark/pull/49829#issuecomment-2646813304 thanks for the review! merging to master/4.0! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

Re: [PR] [SPARK-50953][PYTHON][CONNECT] Add support for non-literal paths in VariantGet [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on PR #49609: URL: https://github.com/apache/spark/pull/49609#issuecomment-2646812807 The code doesn't compile... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-51109][SQL] CTE in subquery expression as grouping column [spark]

2025-02-09 Thread via GitHub
cloud-fan closed pull request #49829: [SPARK-51109][SQL] CTE in subquery expression as grouping column URL: https://github.com/apache/spark/pull/49829 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [SPARK-35564][SQL] Support subexpression elimination for conditionally evaluated expressions [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #32987: URL: https://github.com/apache/spark/pull/32987#discussion_r1948343729 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala: ## @@ -79,7 +86,12 @@ class EquivalentExpressions( }

Re: [PR] [SPARK-35564][SQL] Support subexpression elimination for conditionally evaluated expressions [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #32987: URL: https://github.com/apache/spark/pull/32987#discussion_r1948344607 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala: ## @@ -99,34 +111,37 @@ class EquivalentExpressions( * only

Re: [PR] [SPARK-50953][PYTHON][CONNECT] Add support for non-literal paths in VariantGet [spark]

2025-02-09 Thread via GitHub
harshmotw-db commented on PR #49609: URL: https://github.com/apache/spark/pull/49609#issuecomment-2646925700 @cloud-fan Thanks, I think we're good now -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] [SQL][SPARK-51113] Fix correctness with UNION/EXCEPT/INTERSECT inside a view or EXECUTE IMMEDIATE [spark]

2025-02-09 Thread via GitHub
vladimirg-db commented on PR #49835: URL: https://github.com/apache/spark/pull/49835#issuecomment-2646625494 Here's the commit that enabled `SELECT 1 UNION SELECT 2` syntax in SQL parser: https://github.com/apache/spark/pull/40835. This commit enabled view creation with such SQL text (`pars

[PR] [SPARK-51140][ML] Sort the params before saving [spark]

2025-02-09 Thread via GitHub
zhengruifeng opened a new pull request, #49861: URL: https://github.com/apache/spark/pull/49861 ### What changes were proposed in this pull request? Sort the params before saving ### Why are the changes needed? to improve debuggability: when developing ml connect, somet

Re: [PR] [SPARK-50917][EXAMPLES] Add Pi Scala example to work both for Connect and Classic [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #49617: URL: https://github.com/apache/spark/pull/49617#discussion_r1948330890 ## examples/src/main/scala/org/apache/spark/examples/sql/SparkDataFramePi.scala: ## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

Re: [PR] [SPARK-51067][SQL] Revert session level collation for DML queries and apply object level collation for DDL queries [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #49772: URL: https://github.com/apache/spark/pull/49772#discussion_r1948333282 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveDefaultStringTypes.scala: ## @@ -18,15 +18,15 @@ package org.apache.spark.sql.catalyst.a

Re: [PR] [SPARK-51135][SQL] Fix ViewResolverSuite for ANSI modes [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on PR #49857: URL: https://github.com/apache/spark/pull/49857#issuecomment-2646800076 thanks, merging to master/4.0! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

Re: [PR] [SPARK-51067][SQL] Revert session level collation for DML queries and apply object level collation for DDL queries [spark]

2025-02-09 Thread via GitHub
cloud-fan commented on code in PR #49772: URL: https://github.com/apache/spark/pull/49772#discussion_r1948333844 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveDefaultStringTypes.scala: ## @@ -79,18 +79,37 @@ object ResolveDefaultStringTypes extends

Re: [PR] [SPARK-51135][SQL] Fix ViewResolverSuite for ANSI modes [spark]

2025-02-09 Thread via GitHub
cloud-fan closed pull request #49857: [SPARK-51135][SQL] Fix ViewResolverSuite for ANSI modes URL: https://github.com/apache/spark/pull/49857 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spec

Re: [PR] [SPARK-51138][PYTHON][CONNECT][TESTS] Skip pyspark.sql.tests.connect.test_parity_frame_plot_plotly.FramePlotPlotlyParityTests.test_area_plot [spark]

2025-02-09 Thread via GitHub
HyukjinKwon closed pull request #49859: [SPARK-51138][PYTHON][CONNECT][TESTS] Skip pyspark.sql.tests.connect.test_parity_frame_plot_plotly.FramePlotPlotlyParityTests.test_area_plot URL: https://github.com/apache/spark/pull/49859 -- This is an automated message from the Apache Git Service. To

Re: [PR] [SPARK-51138][PYTHON][CONNECT][TESTS] Skip pyspark.sql.tests.connect.test_parity_frame_plot_plotly.FramePlotPlotlyParityTests.test_area_plot [spark]

2025-02-09 Thread via GitHub
HyukjinKwon commented on PR #49859: URL: https://github.com/apache/spark/pull/49859#issuecomment-2646721135 Merged to master and branch-4.0. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

Re: [PR] [SPARK-51132][ML][BUILD] Upgrade `JPMML` to 1.7.1 [spark]

2025-02-09 Thread via GitHub
zhengruifeng commented on PR #49854: URL: https://github.com/apache/spark/pull/49854#issuecomment-2646737700 The JPMML upgrade and related code changes LGTM. also ping @LuciferYang and @dongjoon-hyun -- This is an automated message from the Apache Git Service. To respond to the mess

Re: [PR] [SPARK-51132][ML][BUILD] Upgrade `JPMML` to 1.7.1 [spark]

2025-02-09 Thread via GitHub
wayneguow commented on code in PR #49854: URL: https://github.com/apache/spark/pull/49854#discussion_r1948060171 ## pom.xml: ## @@ -599,7 +599,7 @@ org.glassfish.jaxb jaxb-runtime -2.3.2 +4.0.5 Review Comment: There is currently no m

[PR] [SPARK-51135][SQL] Fix ViewResolverSuite for ANSI modes [spark]

2025-02-09 Thread via GitHub
vladimirg-db opened a new pull request, #49857: URL: https://github.com/apache/spark/pull/49857 ### What changes were proposed in this pull request? Fix `ViewResolverSuite` for non-ANSI mode. View column `Cast`s have ANSI evaluation in non-ANSI mode when view schema `COMPENSATION

Re: [PR] [SPARK-50982][SQL] Support more SQL/DataFrame read path functionality in single-pass Analyzer [spark]

2025-02-09 Thread via GitHub
vladimirg-db commented on code in PR #49658: URL: https://github.com/apache/spark/pull/49658#discussion_r1948128138 ## sql/core/src/test/scala/org/apache/spark/sql/analysis/resolver/ViewResolverSuite.scala: ## @@ -0,0 +1,188 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] [SPARK-51132][ML][BUILD] Upgrade `JPMML` to 1.7.1 [spark]

2025-02-09 Thread via GitHub
wayneguow commented on PR #49854: URL: https://github.com/apache/spark/pull/49854#issuecomment-2646172436 @vruusmann Thank you for sharing these views, much appreciated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] [SPARK-51132][ML][BUILD] Upgrade `JPMML` to 1.7.1 [spark]

2025-02-09 Thread via GitHub
wayneguow commented on code in PR #49854: URL: https://github.com/apache/spark/pull/49854#discussion_r1948060923 ## pom.xml: ## @@ -615,16 +615,17 @@ org.jvnet.staxex stax-ex - -jakarta.activation Review Co