Re: [PR] [SPARK-51290][SQL] Enable filling default values in DSv2 writes [spark]

2025-02-24 Thread via GitHub
cloud-fan commented on code in PR #50044: URL: https://github.com/apache/spark/pull/50044#discussion_r1967623966 ## sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala: ## @@ -328,7 +336,7 @@ trait AlterTableTests extends SharedSparkSession with QueryEr

Re: [PR] [SPARK-51281][SQL] DataFrameWriterV2 should respect the path option [spark]

2025-02-24 Thread via GitHub
cloud-fan commented on code in PR #50040: URL: https://github.com/apache/spark/pull/50040#discussion_r1967634418 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -5545,6 +5545,15 @@ object SQLConf { .booleanConf .createWithDefault(false

Re: [PR] [SPARK-51304][DOCS][PYTHON] Use `getCondition` instead of `getErrorClass` in contribution guide [spark]

2025-02-24 Thread via GitHub
LuciferYang closed pull request #50062: [SPARK-51304][DOCS][PYTHON] Use `getCondition` instead of `getErrorClass` in contribution guide URL: https://github.com/apache/spark/pull/50062 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

Re: [PR] [SPARK-51304][DOCS][PYTHON] Use `getCondition` instead of `getErrorClass` in contribution guide [spark]

2025-02-24 Thread via GitHub
LuciferYang commented on PR #50062: URL: https://github.com/apache/spark/pull/50062#issuecomment-2678412727 Merged into master and branch-4.0. Thanks @itholic @HyukjinKwon @beliefer

[PR] [SPARK-50692] Add the LPAD and RPAD pushdown support for H2 [spark]

2025-02-24 Thread via GitHub
beliefer opened a new pull request, #50068: URL: https://github.com/apache/spark/pull/50068 ### What changes were proposed in this pull request? This PR proposes to add the `LPAD` and `RPAD` pushdown support for H2. ### Why are the changes needed? https://github.com/apache/sp
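The pushdown above relies on the standard LPAD/RPAD semantics (pad on the left/right up to a target length, truncating when the input is already longer). As a rough illustration of those semantics only — plain Python, not the Spark or H2 implementation:

```python
def lpad(s: str, length: int, pad: str = " ") -> str:
    # Truncate from the left-aligned prefix when the target is shorter.
    if length <= len(s):
        return s[:length]
    # Repeat the pad string and cut it to the exact missing width.
    return (pad * length)[: length - len(s)] + s

def rpad(s: str, length: int, pad: str = " ") -> str:
    if length <= len(s):
        return s[:length]
    return s + (pad * length)[: length - len(s)]

print(lpad("hi", 5, "*"))  # ***hi
print(rpad("hi", 5, "*"))  # hi***
```

Pushing such an expression down means H2 evaluates it instead of Spark, so both engines must agree on exactly this truncation behavior.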

Re: [PR] [SPARK-51078][SPARK-50963][ML][PYTHON][CONNECT][TESTS][FOLLOW-UP] Add back tests for default value [spark]

2025-02-24 Thread via GitHub
zhengruifeng commented on PR #50067: URL: https://github.com/apache/spark/pull/50067#issuecomment-2678422359 thanks @LuciferYang merged to master/4.0

Re: [PR] [SPARK-51273][SQL] Spark Connect Call Procedure runs the procedure twice [spark]

2025-02-24 Thread via GitHub
cloud-fan commented on code in PR #50031: URL: https://github.com/apache/spark/pull/50031#discussion_r1967645049 ## sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/InMemoryProcedureCatalog.scala: ## @@ -0,0 +1,69 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] [SPARK-51078][SPARK-50963][ML][PYTHON][CONNECT][TESTS][FOLLOW-UP] Add back tests for default value [spark]

2025-02-24 Thread via GitHub
zhengruifeng closed pull request #50067: [SPARK-51078][SPARK-50963][ML][PYTHON][CONNECT][TESTS][FOLLOW-UP] Add back tests for default value URL: https://github.com/apache/spark/pull/50067

Re: [PR] [SPARK-51273][SQL] Spark Connect Call Procedure runs the procedure twice [spark]

2025-02-24 Thread via GitHub
cloud-fan commented on code in PR #50031: URL: https://github.com/apache/spark/pull/50031#discussion_r1967642091 ## sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/InMemoryProcedureCatalog.scala: ## @@ -0,0 +1,69 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] [SPARK-51281][SQL] DataFrameWriterV2 should respect the path option [spark]

2025-02-24 Thread via GitHub
beliefer commented on code in PR #50040: URL: https://github.com/apache/spark/pull/50040#discussion_r1967675305 ## sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala: ## @@ -841,20 +841,24 @@ class DataFrameWriterV2Suite extends QueryTest with SharedSpark

Re: [PR] [SPARK-51187][SQL][SS] Implement the graceful deprecation of incorrect config introduced in SPARK-49699 [spark]

2025-02-24 Thread via GitHub
cloud-fan commented on PR #49983: URL: https://github.com/apache/spark/pull/49983#issuecomment-2678491074 have we merged this graceful deprecation in branch 3.5?

Re: [PR] [SPARK-50994][CORE] Perform RDD conversion under tracked execution [spark]

2025-02-24 Thread via GitHub
cloud-fan commented on code in PR #49678: URL: https://github.com/apache/spark/pull/49678#discussion_r1967698435 ## sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala: ## @@ -2721,6 +2721,25 @@ class DataFrameSuite extends QueryTest parameters = Map("name"

Re: [PR] [SPARK-51299][SQL][UI] MetricUtils.stringValue should filter metric values with initValue rather than a hardcoded value [spark]

2025-02-24 Thread via GitHub
cloud-fan commented on PR #50055: URL: https://github.com/apache/spark/pull/50055#issuecomment-2678515900 I'm not very convinced by the "Why" section. What's the end-to-end problem you hit? BTW https://github.com/apache/spark/pull/47721 has some more context about the SQLMetric initi
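The question above is about when a metric counts as "unset": filtering by each metric's own initial value rather than a single hardcoded sentinel. A minimal sketch of that idea (hypothetical `Metric`/`string_value` names, not Spark's `SQLMetric`/`MetricUtils` code):

```python
from dataclasses import dataclass

@dataclass
class Metric:
    # Hypothetical stand-in: each metric carries its own initial (unset)
    # value instead of the code assuming a global sentinel such as -1.
    value: int
    init_value: int = 0

def string_value(metrics: list) -> str:
    # Keep only metrics that were actually updated, i.e. whose value
    # differs from that metric's own init_value.
    updated = [m.value for m in metrics if m.value != m.init_value]
    return ", ".join(str(v) for v in updated) if updated else "<unset>"

# A metric sitting at -1 is correctly treated as unset only because its
# init_value is -1, not because -1 is special globally.
print(string_value([Metric(10, 0), Metric(-1, -1), Metric(0, 0)]))  # 10
```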

Re: [PR] [SPARK-51256][SQL] Increase parallelism if joining with small bucket table [spark]

2025-02-24 Thread via GitHub
cloud-fan commented on code in PR #50004: URL: https://github.com/apache/spark/pull/50004#discussion_r1967710205 ## sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala: ## @@ -150,10 +150,15 @@ case class EnsureRequirements( // A:

Re: [PR] [SPARK-50785][SQL] Refactor FOR statement to utilize local variables properly. [spark]

2025-02-24 Thread via GitHub
cloud-fan commented on code in PR #50026: URL: https://github.com/apache/spark/pull/50026#discussion_r1967684965 ## sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala: ## @@ -206,6 +207,15 @@ class TriggerToExceptionHandlerMap( def getNotFo

Re: [PR] [SPARK-51256][SQL] Increase parallelism if joining with small bucket table [spark]

2025-02-24 Thread via GitHub
cloud-fan commented on code in PR #50004: URL: https://github.com/apache/spark/pull/50004#discussion_r1967730371 ## sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala: ## @@ -150,10 +150,15 @@ case class EnsureRequirements( // A:

Re: [PR] [SPARK-51273][SQL] Spark Connect Call Procedure runs the procedure twice [spark]

2025-02-24 Thread via GitHub
cloud-fan commented on code in PR #50031: URL: https://github.com/apache/spark/pull/50031#discussion_r1967751946 ## sql/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/ProcedureSuite.scala: ## @@ -0,0 +1,31 @@ +/* + * Licensed to the Apache Software Foundation (AS

Re: [PR] [SPARK-49912] Refactor simple CASE statement to evaluate the case variable only once [spark]

2025-02-24 Thread via GitHub
cloud-fan commented on PR #50027: URL: https://github.com/apache/spark/pull/50027#issuecomment-2678690747 thanks, merging to master/4.0!
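The refactor merged above makes a simple CASE statement evaluate its operand a single time, which matters when the operand has side effects or is expensive. A minimal sketch of the target behavior (plain Python, not Spark's SQL-scripting interpreter):

```python
call_count = 0

def expensive_operand() -> int:
    # Side-effecting operand: evaluating it more than once is observable.
    global call_count
    call_count += 1
    return 2

def simple_case_once(branches, default):
    # Evaluate the CASE operand exactly once, then compare the cached
    # result against each WHEN value in order.
    operand = expensive_operand()
    for when_value, result in branches:
        if operand == when_value:
            return result
    return default

result = simple_case_once([(1, "one"), (2, "two"), (3, "three")], "other")
print(result, call_count)  # two 1
```

Naively rewriting `CASE x WHEN a ... WHEN b ...` into `CASE WHEN x = a ... WHEN x = b ...` would instead re-evaluate the operand per branch.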

Re: [PR] [SPARK-49912] Refactor simple CASE statement to evaluate the case variable only once [spark]

2025-02-24 Thread via GitHub
cloud-fan closed pull request #50027: [SPARK-49912] Refactor simple CASE statement to evaluate the case variable only once URL: https://github.com/apache/spark/pull/50027

Re: [PR] [SPARK-51303] [SQL] [TESTS] Extend `ORDER BY` testing coverage [spark]

2025-02-24 Thread via GitHub
vladimirg-db commented on code in PR #50069: URL: https://github.com/apache/spark/pull/50069#discussion_r1967880226 ## sql/core/src/test/resources/sql-tests/inputs/order-by.sql: ## @@ -0,0 +1,24 @@ +-- Test data. +CREATE OR REPLACE TEMPORARY VIEW testData AS SELECT * FROM VALUES

Re: [PR] [SPARK-51303] [SQL] [TESTS] Extend `ORDER BY` testing coverage [spark]

2025-02-24 Thread via GitHub
mihailoale-db commented on code in PR #50069: URL: https://github.com/apache/spark/pull/50069#discussion_r1967915773 ## sql/core/src/test/resources/sql-tests/inputs/order-by.sql: ## @@ -0,0 +1,24 @@ +-- Test data. +CREATE OR REPLACE TEMPORARY VIEW testData AS SELECT * FROM VALUE

Re: [PR] [SPARK-51303] [SQL] [TESTS] Extend `ORDER BY` testing coverage [spark]

2025-02-24 Thread via GitHub
mihailoale-db commented on code in PR #50069: URL: https://github.com/apache/spark/pull/50069#discussion_r1967916766 ## sql/core/src/test/resources/sql-tests/inputs/order-by.sql: ## @@ -0,0 +1,24 @@ +-- Test data. +CREATE OR REPLACE TEMPORARY VIEW testData AS SELECT * FROM VALUE

Re: [PR] [SPARK-51156][CONNECT][FOLLOWUP] Remove unused `private val AUTH_TOKEN_ON_INSECURE_CONN_ERROR_MSG` from `SparkConnectClient` [spark]

2025-02-24 Thread via GitHub
LuciferYang commented on PR #50070: URL: https://github.com/apache/spark/pull/50070#issuecomment-2678986308 https://github.com/LuciferYang/spark/actions/runs/13502647699/job/37724729211

Re: [PR] [SPARK-51095][CORE][SQL] Include caller context for hdfs audit logs for calls from driver [spark]

2025-02-24 Thread via GitHub
attilapiros closed pull request #49814: [SPARK-51095][CORE][SQL] Include caller context for hdfs audit logs for calls from driver URL: https://github.com/apache/spark/pull/49814

[PR] [SPARK-51156][CONNECT][FOLLOWUP] Remove unused `private val AUTH_TOKEN_ON_INSECURE_CONN_ERROR_MSG` from `SparkConnectClient` [spark]

2025-02-24 Thread via GitHub
LuciferYang opened a new pull request, #50070: URL: https://github.com/apache/spark/pull/50070 ### What changes were proposed in this pull request? This pr aims to remove unused `private val AUTH_TOKEN_ON_INSECURE_CONN_ERROR_MSG` from `SparkConnectClient` because it becomes a useless `p

Re: [PR] [SPARK-51303] [SQL] [TESTS] Extend `ORDER BY` testing coverage [spark]

2025-02-24 Thread via GitHub
mihailoale-db commented on PR #50069: URL: https://github.com/apache/spark/pull/50069#issuecomment-2679037294 @MaxGekk Could you PTAL when you have time? Thanks

Re: [PR] [SPARK-51149][CORE] Log classpath in SparkSubmit on ClassNotFoundException [spark]

2025-02-24 Thread via GitHub
vrozov commented on PR #49870: URL: https://github.com/apache/spark/pull/49870#issuecomment-2679050636 @dongjoon-hyun Please review or advise who may review the PR?

Re: [PR] [SPARK-51182][SQL] DataFrameWriter should throw dataPathNotSpecifiedError when path is not specified [spark]

2025-02-24 Thread via GitHub
vrozov commented on PR #49928: URL: https://github.com/apache/spark/pull/49928#issuecomment-2679052693 @cloud-fan Please review

Re: [PR] [SPARK-51290][SQL] Enable filling default values in DSv2 writes [spark]

2025-02-24 Thread via GitHub
viirya commented on code in PR #50044: URL: https://github.com/apache/spark/pull/50044#discussion_r1968032448 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala: ## @@ -80,7 +80,6 @@ object TableOutputResolver extends SQLConfHelper wi

Re: [PR] [SPARK-51221][CONNECT][TESTS] Use unresolvable host name in SparkConnectClientSuite [spark]

2025-02-24 Thread via GitHub
vrozov commented on PR #49960: URL: https://github.com/apache/spark/pull/49960#issuecomment-2679056088 @HyukjinKwon Please review

Re: [PR] [SPARK-51290][SQL] Enable filling default values in DSv2 writes [spark]

2025-02-24 Thread via GitHub
viirya commented on code in PR #50044: URL: https://github.com/apache/spark/pull/50044#discussion_r1968032448 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala: ## @@ -80,7 +80,6 @@ object TableOutputResolver extends SQLConfHelper wi

Re: [PR] [SPARK-51290][SQL] Enable filling default values in DSv2 writes [spark]

2025-02-24 Thread via GitHub
viirya commented on code in PR #50044: URL: https://github.com/apache/spark/pull/50044#discussion_r1968038807 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3534,7 +3534,8 @@ class Analyzer(override val catalogManager: CatalogManage

Re: [PR] [SPARK-50639][SQL] Improve warning logging in CacheManager [spark]

2025-02-24 Thread via GitHub
vrozov commented on PR #49276: URL: https://github.com/apache/spark/pull/49276#issuecomment-2679043977 @gengliangwang Please see my [response](https://github.com/apache/spark/pull/49276#discussion_r1956993969) to your [comment](https://github.com/apache/spark/pull/49276#discussion_r1956913

Re: [PR] [SPARK-51290][SQL] Enable filling default values in DSv2 writes [spark]

2025-02-24 Thread via GitHub
viirya commented on code in PR #50044: URL: https://github.com/apache/spark/pull/50044#discussion_r1968042471 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3534,7 +3534,8 @@ class Analyzer(override val catalogManager: CatalogManage

Re: [PR] [SPARK-51156][CONNECT][FOLLOWUP] Remove unused `private val AUTH_TOKEN_ON_INSECURE_CONN_ERROR_MSG` from `SparkConnectClient` [spark]

2025-02-24 Thread via GitHub
dongjoon-hyun closed pull request #50070: [SPARK-51156][CONNECT][FOLLOWUP] Remove unused `private val AUTH_TOKEN_ON_INSECURE_CONN_ERROR_MSG` from `SparkConnectClient` URL: https://github.com/apache/spark/pull/50070

Re: [PR] [WIP][SPARK-50892][SQL]Add UnionLoopExec, physical operator for recursion, to perform execution of recursive queries [spark]

2025-02-24 Thread via GitHub
Pajaraja commented on code in PR #49955: URL: https://github.com/apache/spark/pull/49955#discussion_r1968079360 ## sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala: ## @@ -714,6 +717,177 @@ case class UnionExec(children: Seq[SparkPlan]) extends

Re: [PR] [WIP][SPARK-50892][SQL]Add UnionLoopExec, physical operator for recursion, to perform execution of recursive queries [spark]

2025-02-24 Thread via GitHub
Pajaraja commented on code in PR #49955: URL: https://github.com/apache/spark/pull/49955#discussion_r1968079676 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/cteOperators.scala: ## @@ -34,7 +34,9 @@ import org.apache.spark.sql.internal.SQLConf * @p

Re: [PR] [SPARK-50785][SQL] Refactor FOR statement to utilize local variables properly. [spark]

2025-02-24 Thread via GitHub
dusantism-db commented on code in PR #50026: URL: https://github.com/apache/spark/pull/50026#discussion_r1968081531 ## sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala: ## @@ -206,6 +207,15 @@ class TriggerToExceptionHandlerMap( def getNo

Re: [PR] [WIP][SPARK-50892][SQL]Add UnionLoopExec, physical operator for recursion, to perform execution of recursive queries [spark]

2025-02-24 Thread via GitHub
Pajaraja commented on code in PR #49955: URL: https://github.com/apache/spark/pull/49955#discussion_r1968080315 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala: ## @@ -848,6 +848,15 @@ object LimitPushDown extends Rule[LogicalPlan] { c

Re: [PR] [WIP][SPARK-50892][SQL]Add UnionLoopExec, physical operator for recursion, to perform execution of recursive queries [spark]

2025-02-24 Thread via GitHub
Pajaraja commented on code in PR #49955: URL: https://github.com/apache/spark/pull/49955#discussion_r1968080690 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -4421,6 +4421,12 @@ ], "sqlState" : "38000" }, + "RECURSION_LEVEL_LIMIT_EXCEEDED" :
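The `RECURSION_LEVEL_LIMIT_EXCEEDED` error condition added above guards the iterative execution of a recursive CTE. As a hedged sketch of the general technique only (hypothetical `union_loop` helper, not the `UnionLoopExec` operator itself): each pass feeds the previous rows into the recursive step until no new rows appear or the level limit is exceeded.

```python
def union_loop(seed, step, level_limit=100):
    # Iterate the recursive step, accumulating distinct rows; abort with
    # an error once the number of levels exceeds the configured limit.
    rows, current, level = list(seed), list(seed), 0
    while current:
        level += 1
        if level > level_limit:
            raise RuntimeError("RECURSION_LEVEL_LIMIT_EXCEEDED")
        # Keep only rows not seen before, so the loop terminates on cycles.
        current = [r for r in step(current) if r not in rows]
        rows.extend(current)
    return rows

# Transitive closure of a tiny edge set, capped at 100 levels.
edges = {1: [2], 2: [3], 3: []}
reachable = union_loop([1], lambda rs: [n for r in rs for n in edges[r]])
print(reachable)  # [1, 2, 3]
```

Without such a cap, a recursive query whose step keeps producing fresh rows would never terminate, which is why the limit surfaces as a user-facing error condition rather than a silent truncation.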

Re: [PR] [SPARK-51290][SQL] Enable filling default values in DSv2 writes [spark]

2025-02-24 Thread via GitHub
viirya commented on code in PR #50044: URL: https://github.com/apache/spark/pull/50044#discussion_r1968081855 ## sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/InMemoryTableCatalog.scala: ## @@ -122,7 +122,7 @@ class BasicInMemoryTableCatalog extends TableCat

Re: [PR] [SPARK-51290][SQL] Enable filling default values in DSv2 writes [spark]

2025-02-24 Thread via GitHub
viirya commented on code in PR #50044: URL: https://github.com/apache/spark/pull/50044#discussion_r1968084126 ## sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/InMemoryTableCatalog.scala: ## @@ -122,7 +122,7 @@ class BasicInMemoryTableCatalog extends TableCat

Re: [PR] [WIP][SPARK-50892][SQL]Add UnionLoopExec, physical operator for recursion, to perform execution of recursive queries [spark]

2025-02-24 Thread via GitHub
Pajaraja commented on code in PR #49955: URL: https://github.com/apache/spark/pull/49955#discussion_r1968084349 ## sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala: ## @@ -714,6 +717,177 @@ case class UnionExec(children: Seq[SparkPlan]) extends

Re: [PR] [SPARK-51305][SQL][CONNECT] Improve `SparkConnectPlanExecution.createObservedMetricsResponse` [spark]

2025-02-24 Thread via GitHub
dongjoon-hyun closed pull request #50066: [SPARK-51305][SQL][CONNECT] Improve `SparkConnectPlanExecution.createObservedMetricsResponse` URL: https://github.com/apache/spark/pull/50066

Re: [PR] [SPARK-51305][SQL][CONNECT] Improve `SparkConnectPlanExecution.createObservedMetricsResponse` [spark]

2025-02-24 Thread via GitHub
dongjoon-hyun commented on PR #50066: URL: https://github.com/apache/spark/pull/50066#issuecomment-2679154851 Merged to master/4.0.

Re: [PR] [SPARK-50692][SQL] Add the LPAD and RPAD pushdown support for H2 [spark]

2025-02-24 Thread via GitHub
dongjoon-hyun closed pull request #50068: [SPARK-50692][SQL] Add the LPAD and RPAD pushdown support for H2 URL: https://github.com/apache/spark/pull/50068

Re: [PR] [SPARK-50692][SQL][FOLLOWUP] Add the LPAD and RPAD pushdown support for H2 [spark]

2025-02-24 Thread via GitHub
dongjoon-hyun commented on PR #50068: URL: https://github.com/apache/spark/pull/50068#issuecomment-2679192384 Please don't forget `[FOLLOWUP]` in the PR title next time, @beliefer. Or, please use a new JIRA ID.

Re: [PR] [SPARK-50692][SQL][FOLLOWUP] Add the LPAD and RPAD pushdown support for H2 [spark]

2025-02-24 Thread via GitHub
dongjoon-hyun commented on PR #50068: URL: https://github.com/apache/spark/pull/50068#issuecomment-2679188787 Oh, did you aim to use this as a follow-up, @beliefer?

Re: [PR] [SPARK-51290][SQL] Enable filling default values in DSv2 writes [spark]

2025-02-24 Thread via GitHub
viirya commented on code in PR #50044: URL: https://github.com/apache/spark/pull/50044#discussion_r1968132329 ## sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala: ## @@ -328,7 +336,7 @@ trait AlterTableTests extends SharedSparkSession with QueryError
