Re: [PR] [SPARK-48911][SQL][TESTS] Improve collation support testing for various expressions [spark]

2024-08-02 Thread via GitHub
mihailom-db commented on code in PR #47372: URL: https://github.com/apache/spark/pull/47372#discussion_r1701395705 ## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala: ## @@ -2295,6 +2295,827 @@ class CollationSQLExpressionsSuite assert(typeEx

Re: [PR] [SPARK-47430][SQL] Rework group by map type [spark]

2024-08-02 Thread via GitHub
ulysses-you commented on code in PR #47545: URL: https://github.com/apache/spark/pull/47545#discussion_r1701399571 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala: ## @@ -246,8 +246,6 @@ abstract class Optimizer(catalogManager: CatalogManag

Re: [PR] [SPARK-49078][SQL] Support show columns syntax in v2 table [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47568: URL: https://github.com/apache/spark/pull/47568#discussion_r1701404697 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowColumnsTableExec.scala: ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] [SPARK-49078][SQL] Support show columns syntax in v2 table [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47568: URL: https://github.com/apache/spark/pull/47568#discussion_r1701407614 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowColumnsTableExec.scala: ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] [SPARK-49078][SQL] Support show columns syntax in v2 table [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47568: URL: https://github.com/apache/spark/pull/47568#discussion_r1701407799 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowColumnsTableExec.scala: ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] [SPARK-49078][SQL] Support show columns syntax in v2 table [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on PR #47568: URL: https://github.com/apache/spark/pull/47568#issuecomment-2264737580 Also cc @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] [SPARK-49078][SQL] Support show columns syntax in v2 table [spark]

2024-08-02 Thread via GitHub
xuzifu666 commented on code in PR #47568: URL: https://github.com/apache/spark/pull/47568#discussion_r1701414017 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowColumnsTableExec.scala: ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] [MINOR][DOCS] Fix typos in docs/sql-ref-number-pattern.md [spark]

2024-08-02 Thread via GitHub
yaooqinn closed pull request #47557: [MINOR][DOCS] Fix typos in docs/sql-ref-number-pattern.md URL: https://github.com/apache/spark/pull/47557

Re: [PR] [SPARK-49078][SQL] Support show columns syntax in v2 table [spark]

2024-08-02 Thread via GitHub
xuzifu666 commented on code in PR #47568: URL: https://github.com/apache/spark/pull/47568#discussion_r1701415679 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowColumnsTableExec.scala: ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] [SPARK-49078][SQL] Support show columns syntax in v2 table [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47568: URL: https://github.com/apache/spark/pull/47568#discussion_r1701417143 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowColumnsTableExec.scala: ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] [MINOR][DOCS] Fix typos in docs/sql-ref-number-pattern.md [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on PR #47557: URL: https://github.com/apache/spark/pull/47557#issuecomment-2264754956 Merged to master. Can you make a backport PR for branch-3.5?

Re: [PR] [SPARK-48763][CONNECT][BUILD][FOLLOW-UP] Move Spark Connect common/server into sql directory [spark]

2024-08-02 Thread via GitHub
HyukjinKwon commented on PR #47579: URL: https://github.com/apache/spark/pull/47579#issuecomment-2264756460 Thank you!!!

Re: [PR] [SPARK-49090][CORE] Support `JWSFilter` [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47575: URL: https://github.com/apache/spark/pull/47575#discussion_r1701423438 ## dev/deps/spark-deps-hadoop-3-hive-2.3: ## @@ -138,6 +138,7 @@ jersey-server/3.0.12//jersey-server-3.0.12.jar jettison/1.5.4//jettison-1.5.4.jar jetty-util-ajax/11

Re: [PR] [SPARK-49090][CORE] Support `JWSFilter` [spark]

2024-08-02 Thread via GitHub
dongjoon-hyun commented on code in PR #47575: URL: https://github.com/apache/spark/pull/47575#discussion_r1701424363 ## dev/deps/spark-deps-hadoop-3-hive-2.3: ## @@ -138,6 +138,7 @@ jersey-server/3.0.12//jersey-server-3.0.12.jar jettison/1.5.4//jettison-1.5.4.jar jetty-util-aj

Re: [PR] [SPARK-49078][SQL] Support show columns syntax in v2 table [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47568: URL: https://github.com/apache/spark/pull/47568#discussion_r1701428363 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowColumnsTableExec.scala: ## @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] [SPARK-49078][SQL] Support show columns syntax in v2 table [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47568: URL: https://github.com/apache/spark/pull/47568#discussion_r1701429635 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowColumnsTableExec.scala: ## @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] [SPARK-49090][CORE] Support `JWSFilter` [spark]

2024-08-02 Thread via GitHub
dongjoon-hyun commented on PR #47575: URL: https://github.com/apache/spark/pull/47575#issuecomment-2264762491 Thank you, @yaooqinn !

Re: [PR] [SPARK-49078][SQL] Support show columns syntax in v2 table [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47568: URL: https://github.com/apache/spark/pull/47568#discussion_r1701430354 ## sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala: ## @@ -2377,10 +2377,9 @@ class DataSourceV2SQLSuiteV1Filter val t = "testca

Re: [PR] [SPARK-49090][CORE] Support `JWSFilter` [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47575: URL: https://github.com/apache/spark/pull/47575#discussion_r1701432137 ## dev/deps/spark-deps-hadoop-3-hive-2.3: ## @@ -138,6 +138,7 @@ jersey-server/3.0.12//jersey-server-3.0.12.jar jettison/1.5.4//jettison-1.5.4.jar jetty-util-ajax/11

Re: [PR] [SPARK-48936][CONNECT] Makes spark-shell works with Spark connect [spark]

2024-08-02 Thread via GitHub
HyukjinKwon commented on PR #47402: URL: https://github.com/apache/spark/pull/47402#issuecomment-2264769667 The build works with SBT but not with Maven; investigating.

Re: [PR] [SPARK-48763][CONNECT][BUILD][FOLLOW-UP] Move Spark Connect common/server into sql directory [spark]

2024-08-02 Thread via GitHub
HyukjinKwon commented on PR #47579: URL: https://github.com/apache/spark/pull/47579#issuecomment-2264786515 Merged to master.

Re: [PR] [SPARK-48763][CONNECT][BUILD][FOLLOW-UP] Move Spark Connect common/server into sql directory [spark]

2024-08-02 Thread via GitHub
HyukjinKwon closed pull request #47579: [SPARK-48763][CONNECT][BUILD][FOLLOW-UP] Move Spark Connect common/server into sql directory URL: https://github.com/apache/spark/pull/47579

Re: [PR] [SPARK-49078][SQL] Support show columns syntax in v2 table [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47568: URL: https://github.com/apache/spark/pull/47568#discussion_r1701461014 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowColumnsTableExec.scala: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] [SPARK-49080][SQL][TEST] Upgrade `mssql-jdbc` to 12.8.0.jre11 [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on PR #47569: URL: https://github.com/apache/spark/pull/47569#issuecomment-2264808347 Can we also update MsSQLServer to 2022-CU14-ubuntu-22.04?

Re: [PR] [SPARK-49080][SQL][TEST] Upgrade `mssql-jdbc` to 12.8.0.jre11 [spark]

2024-08-02 Thread via GitHub
wayneguow commented on PR #47569: URL: https://github.com/apache/spark/pull/47569#issuecomment-2264812399 > Can we also update MsSQLServer to 2022-CU14-ubuntu-22.04? No problem, let me work on this.

Re: [PR] [SPARK-49080][SQL][TEST] Upgrade `mssql-jdbc` to 12.8.0.jre11 [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on PR #47569: URL: https://github.com/apache/spark/pull/47569#issuecomment-2264817708 If you don't mind, we can also switch the docker registry from mcr to https://hub.docker.com/r/microsoft/mssql-server directly

Re: [PR] [SPARK-49000][SQL][FOLLOWUP] Improve code style and update comments [spark]

2024-08-02 Thread via GitHub
yaooqinn closed pull request #47565: [SPARK-49000][SQL][FOLLOWUP] Improve code style and update comments URL: https://github.com/apache/spark/pull/47565

Re: [PR] [SPARK-49000][SQL][FOLLOWUP] Improve code style and update comments [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on PR #47565: URL: https://github.com/apache/spark/pull/47565#issuecomment-2264820355 Thank you @uros-db @dongjoon-hyun @cloud-fan. Merged to master.

[PR] [MINOR][SQL]remove useless `return` [spark]

2024-08-02 Thread via GitHub
jlfsdtc opened a new pull request, #47580: URL: https://github.com/apache/spark/pull/47580 ### What changes were proposed in this pull request? Remove useless `return` in TSetIpAddressProcessor. ### Why are the changes needed? For code elegance. ### Does this

[PR] docs: use the same Spark version in the readme and quick-start [spark-connect-go]

2024-08-02 Thread via GitHub
haoxins opened a new pull request, #38: URL: https://github.com/apache/spark-connect-go/pull/38 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### Ho

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-08-02 Thread via GitHub
panbingkun commented on code in PR #47364: URL: https://github.com/apache/spark/pull/47364#discussion_r1701552127 ## connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/CheckConnectJvmClientCompatibility.scala: ## @@ -307,6 +307,12 @@ object CheckConn

Re: [PR] [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle [spark]

2024-08-02 Thread via GitHub
lifulong commented on code in PR #32816: URL: https://github.com/apache/spark/pull/32816#discussion_r1701573785 ## sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala: ## @@ -97,27 +97,36 @@ case class AdaptiveSparkPlanExec( AQEUtils.

Re: [PR] [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle [spark]

2024-08-02 Thread via GitHub
lifulong commented on code in PR #32816: URL: https://github.com/apache/spark/pull/32816#discussion_r1701576418 ## sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala: ## @@ -97,27 +97,36 @@ case class AdaptiveSparkPlanExec( AQEUtils.

[PR] [SPARK-49063][SQL]Fix Between with ScalarSubqueries [spark]

2024-08-02 Thread via GitHub
mihailom-db opened a new pull request, #47581: URL: https://github.com/apache/spark/pull/47581 ### What changes were proposed in this pull request? Fix for between with ScalarSubqueries. ### Why are the changes needed? There is a regression from some previous PR. #

Re: [PR] [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage when committed file not consistent with task status [spark]

2024-08-02 Thread via GitHub
akki commented on PR #36564: URL: https://github.com/apache/spark/pull/36564#issuecomment-226498 Hi all, I am facing [this issue](https://stackoverflow.com/q/78789779/3061686) after upgrading to Spark 3.5.1 and wonder whether this change is the root cause. Can anyone here confirm?

Re: [PR] [SPARK-48936][CONNECT] Makes spark-shell works with Spark connect [spark]

2024-08-02 Thread via GitHub
pan3793 commented on code in PR #47402: URL: https://github.com/apache/spark/pull/47402#discussion_r1701610965 ## bin/spark-shell: ## @@ -44,8 +44,53 @@ Scala REPL options: # through spark.driver.extraClassPath is not automatically propagated. SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_

Re: [PR] [SPARK-48292][CORE] Revert [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage when committed file not consistent with task status [spark]

2024-08-02 Thread via GitHub
akki commented on PR #46696: URL: https://github.com/apache/spark/pull/46696#issuecomment-2264999225 Hi all, I am facing [this issue](https://stackoverflow.com/q/78789779/3061686) after upgrading to Spark 3.5.1 and wonder whether this revert would help me. Does anyone here know?

[PR] [SPARK-49082][SQL] Widening type promotions in `AvroDeserializer` [spark]

2024-08-02 Thread via GitHub
wayneguow opened a new pull request, #47582: URL: https://github.com/apache/spark/pull/47582 ### What changes were proposed in this pull request? This PR aims to widen type promotions in `AvroDeserializer`. The following promotions are supported (Avro Type -> Spark Type): - Int -> Long

Re: [PR] [SPARK-48791][FOLLOW-UP] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-02 Thread via GitHub
Ngone51 commented on PR #47578: URL: https://github.com/apache/spark/pull/47578#issuecomment-2265022305 > How much is the actual impact? Under what circumstances? @mridulm We tested on a cluster with 2 executors (8 CPU, 18G memory) and ran a series of SQL queries. The overall perf has sl

[PR] [SPARK-49094][SQL] Fix ignoreCorruptFiles non-functioning for hive orc impl with mergeSchema off [spark]

2024-08-02 Thread via GitHub
yaooqinn opened a new pull request, #47583: URL: https://github.com/apache/spark/pull/47583 ### What changes were proposed in this pull request? ignoreCorruptFiles now applies to all file data sources except for hive orc implementation with mergeSchema off ### Why are t

Re: [PR] [SPARK-49094][SQL] Fix ignoreCorruptFiles non-functioning for hive orc impl with mergeSchema off [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on PR #47583: URL: https://github.com/apache/spark/pull/47583#issuecomment-2265057737 cc @cloud-fan @dongjoon-hyun @HyukjinKwon thanks in advance

Re: [PR] [SPARK-48346][SQL] Support for IF ELSE statements in SQL scripts [spark]

2024-08-02 Thread via GitHub
davidm-db commented on code in PR #47442: URL: https://github.com/apache/spark/pull/47442#discussion_r1701677348 ## sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNodeSuite.scala: ## @@ -77,11 +93,157 @@ class SqlScriptingExecutionNodeSuite extends S

Re: [PR] [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle [spark]

2024-08-02 Thread via GitHub
lifulong commented on code in PR #32816: URL: https://github.com/apache/spark/pull/32816#discussion_r1701680965 ## sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala: ## @@ -97,27 +97,36 @@ case class AdaptiveSparkPlanExec( AQEUtils.

Re: [PR] [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle [spark]

2024-08-02 Thread via GitHub
lifulong commented on code in PR #32816: URL: https://github.com/apache/spark/pull/32816#discussion_r1701681169 ## sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala: ## @@ -97,27 +97,36 @@ case class AdaptiveSparkPlanExec( AQEUtils.

Re: [PR] [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle [spark]

2024-08-02 Thread via GitHub
lifulong commented on code in PR #32816: URL: https://github.com/apache/spark/pull/32816#discussion_r1701681680 ## sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala: ## @@ -97,27 +97,36 @@ case class AdaptiveSparkPlanExec( AQEUtils.

[PR] [SPARK-49095][SQL] Update `DecimalType` compatible logic of `Avro` data source to avoid loss of decimal precision [spark]

2024-08-02 Thread via GitHub
wayneguow opened a new pull request, #47584: URL: https://github.com/apache/spark/pull/47584 ### What changes were proposed in this pull request? This PR aims to enhance comparison logic to avoid precision loss in decimal parts of `Decimal` in Avro data source. It refers to th

Re: [PR] [SPARK-49080][SQL][TEST] Upgrade `mssql-jdbc` to 12.8.0.jre11 [spark]

2024-08-02 Thread via GitHub
wayneguow commented on PR #47569: URL: https://github.com/apache/spark/pull/47569#issuecomment-2265131440 > If you don't mind, we can also switch the docker registry from mcr to https://hub.docker.com/r/microsoft/mssql-server directly Is this the part? `mcr.microsoft.com/mssql/server`

[PR] [SPARK-48763][FOLLOWUP] Make `dev/lint-scala` error message more accurate [spark]

2024-08-02 Thread via GitHub
panbingkun opened a new pull request, #47585: URL: https://github.com/apache/spark/pull/47585 ### What changes were proposed in this pull request? This PR is a follow-up to https://github.com/apache/spark/pull/47157, making the `dev/lint-scala` error message more accurate. ### Why are

Re: [PR] [SPARK-48763][FOLLOWUP] Make `dev/lint-scala` error message more accurate [spark]

2024-08-02 Thread via GitHub
panbingkun commented on PR #47585: URL: https://github.com/apache/spark/pull/47585#issuecomment-2265163805 cc @HyukjinKwon

Re: [PR] [SPARK-48346][SQL] Support for IF ELSE statements in SQL scripts [spark]

2024-08-02 Thread via GitHub
davidm-db commented on code in PR #47442: URL: https://github.com/apache/spark/pull/47442#discussion_r1701776960 ## sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNodeSuite.scala: ## @@ -77,11 +93,157 @@ class SqlScriptingExecutionNodeSuite extends S

Re: [PR] [SPARK-48911][SQL][TESTS] Improve collation support testing for various expressions [spark]

2024-08-02 Thread via GitHub
uros-db commented on code in PR #47372: URL: https://github.com/apache/spark/pull/47372#discussion_r1701789525 ## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala: ## @@ -2295,6 +2295,827 @@ class CollationSQLExpressionsSuite assert(typeExcept

Re: [PR] [SPARK-48911][SQL][TESTS] Improve collation support testing for various expressions [spark]

2024-08-02 Thread via GitHub
uros-db commented on code in PR #47372: URL: https://github.com/apache/spark/pull/47372#discussion_r1701791384 ## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala: ## @@ -2295,6 +2295,827 @@ class CollationSQLExpressionsSuite assert(typeExcept

Re: [PR] [SPARK-48700] [SQL] Mode expression for complex types (all collations) [spark]

2024-08-02 Thread via GitHub
GideonPotok commented on PR #47154: URL: https://github.com/apache/spark/pull/47154#issuecomment-2265373898 > left some comments, code structure looks much better than before in my opinion > > let's just clean this up a bit - not leave behind any comments from prototyping tests, and

[PR] Verifying Run documentation build [spark]

2024-08-02 Thread via GitHub
uros-db opened a new pull request, #47586: URL: https://github.com/apache/spark/pull/47586 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

Re: [PR] Verifying Run documentation build [spark]

2024-08-02 Thread via GitHub
uros-db closed pull request #47586: Verifying Run documentation build URL: https://github.com/apache/spark/pull/47586

[PR] Verifying Run documentation build [spark]

2024-08-02 Thread via GitHub
uros-db opened a new pull request, #47587: URL: https://github.com/apache/spark/pull/47587 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

Re: [PR] [SPARK-48791][FOLLOW-UP] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-02 Thread via GitHub
Ngone51 commented on code in PR #47578: URL: https://github.com/apache/spark/pull/47578#discussion_r1701887383 ## core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala: ## @@ -272,8 +270,18 @@ class TaskMetrics private[spark] () extends Serializable { */ @transi

Re: [PR] [SPARK-49057][SQL] Do not block the AQE loop when submitting query stages [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47533: URL: https://github.com/apache/spark/pull/47533#discussion_r1701895376 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/StaticSQLConf.scala: ## @@ -170,6 +170,16 @@ object StaticSQLConf { .intConf .createWithDefa

Re: [PR] [SPARK-49090][CORE] Support `JWSFilter` [spark]

2024-08-02 Thread via GitHub
dongjoon-hyun closed pull request #47575: [SPARK-49090][CORE] Support `JWSFilter` URL: https://github.com/apache/spark/pull/47575

Re: [PR] [SPARK-49090][CORE] Support `JWSFilter` [spark]

2024-08-02 Thread via GitHub
dongjoon-hyun commented on PR #47575: URL: https://github.com/apache/spark/pull/47575#issuecomment-2265486093 Merged to master for Apache Spark 4.0.0-preview2.

[PR] Bump rexml from 3.2.6 to 3.3.3 in /docs [spark]

2024-08-02 Thread via GitHub
dependabot[bot] opened a new pull request, #47588: URL: https://github.com/apache/spark/pull/47588 Bumps [rexml](https://github.com/ruby/rexml) from 3.2.6 to 3.3.3. Release notes: Sourced from [rexml's releases](https://github.com/ruby/rexml/releases). REXML 3.3.3 - 2024-08-01

Re: [PR] [SPARK-49094][SQL] Fix ignoreCorruptFiles non-functioning for hive orc impl with mergeSchema off [spark]

2024-08-02 Thread via GitHub
dongjoon-hyun commented on code in PR #47583: URL: https://github.com/apache/spark/pull/47583#discussion_r1701918449 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/HiveOrcQuerySuite.scala: ## @@ -415,4 +415,23 @@ class HiveOrcQuerySuite extends OrcQueryTest with TestH

Re: [PR] [SPARK-49094][SQL] Fix ignoreCorruptFiles non-functioning for hive orc impl with mergeSchema off [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47583: URL: https://github.com/apache/spark/pull/47583#discussion_r1701920360 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/HiveOrcQuerySuite.scala: ## @@ -415,4 +415,23 @@ class HiveOrcQuerySuite extends OrcQueryTest with TestHiveSi

Re: [PR] [SPARK-49094][SQL] Fix ignoreCorruptFiles non-functioning for hive orc impl with mergeSchema off [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47583: URL: https://github.com/apache/spark/pull/47583#discussion_r1701921270 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/HiveOrcQuerySuite.scala: ## @@ -415,4 +415,23 @@ class HiveOrcQuerySuite extends OrcQueryTest with TestHiveSi

Re: [PR] [SPARK-49000][SQL][3.5] Fix "select count(distinct 1) from t" where t is empty table by expanding RewriteDistinctAggregates [spark]

2024-08-02 Thread via GitHub
cloud-fan commented on PR #47566: URL: https://github.com/apache/spark/pull/47566#issuecomment-2265518947 The python doc failure is definitely unrelated, thanks, merging to 3.5!

Re: [PR] [SPARK-49094][SQL] Fix ignoreCorruptFiles non-functioning for hive orc impl with mergeSchema off [spark]

2024-08-02 Thread via GitHub
dongjoon-hyun commented on code in PR #47583: URL: https://github.com/apache/spark/pull/47583#discussion_r1701927746 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/HiveOrcQuerySuite.scala: ## @@ -415,4 +415,23 @@ class HiveOrcQuerySuite extends OrcQueryTest with TestH

Re: [PR] [SPARK-49000][SQL][3.5] Fix "select count(distinct 1) from t" where t is empty table by expanding RewriteDistinctAggregates [spark]

2024-08-02 Thread via GitHub
cloud-fan closed pull request #47566: [SPARK-49000][SQL][3.5] Fix "select count(distinct 1) from t" where t is empty table by expanding RewriteDistinctAggregates URL: https://github.com/apache/spark/pull/47566

Re: [PR] [SPARK-49000][SQL][3.5] Fix "select count(distinct 1) from t" where t is empty table by expanding RewriteDistinctAggregates [spark]

2024-08-02 Thread via GitHub
dongjoon-hyun commented on PR #47566: URL: https://github.com/apache/spark/pull/47566#issuecomment-2265531806 Thank you, @uros-db and all.

Re: [PR] [SPARK-48791][FOLLOW-UP] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-02 Thread via GitHub
cloud-fan commented on code in PR #47578: URL: https://github.com/apache/spark/pull/47578#discussion_r1701933267 ## sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java: ## @@ -139,12 +139,15 @@ public void initialize(

Re: [PR] [SPARK-48791][FOLLOW-UP] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-02 Thread via GitHub
cloud-fan commented on code in PR #47578: URL: https://github.com/apache/spark/pull/47578#discussion_r1701934280 ## sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala: ## @@ -179,14 +179,15 @@ class SQLAppStatusListener( // work around a ra

Re: [PR] [SPARK-48791][FOLLOW-UP] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-02 Thread via GitHub
cloud-fan commented on code in PR #47578: URL: https://github.com/apache/spark/pull/47578#discussion_r1701934716 ## sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsTestUtils.scala: ## @@ -311,7 +311,7 @@ object InputOutputMetricsHelper { res.shuffl

Re: [PR] [SPARK-48346][SQL] Support for IF ELSE statements in SQL scripts [spark]

2024-08-02 Thread via GitHub
cloud-fan commented on code in PR #47442: URL: https://github.com/apache/spark/pull/47442#discussion_r1701942845 ## sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingInterpreterSuite.scala: ## @@ -214,4 +215,152 @@ class SqlScriptingInterpreterSuite extends Quer

Re: [PR] [SPARK-48346][SQL] Support for IF ELSE statements in SQL scripts [spark]

2024-08-02 Thread via GitHub
cloud-fan commented on code in PR #47442: URL: https://github.com/apache/spark/pull/47442#discussion_r1701942055 ## sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNodeSuite.scala: ## @@ -38,21 +40,32 @@ class SqlScriptingExecutionNodeSuite extends Sp

Re: [PR] [SPARK-48346][SQL] Support for IF ELSE statements in SQL scripts [spark]

2024-08-02 Thread via GitHub
cloud-fan commented on code in PR #47442: URL: https://github.com/apache/spark/pull/47442#discussion_r1701943293 ## sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingInterpreterSuite.scala: ## @@ -32,7 +32,8 @@ class SqlScriptingInterpreterSuite extends QueryTes

Re: [PR] [SPARK-48346][SQL] Support for IF ELSE statements in SQL scripts [spark]

2024-08-02 Thread via GitHub
cloud-fan commented on code in PR #47442: URL: https://github.com/apache/spark/pull/47442#discussion_r1701944866 ## sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingInterpreterSuite.scala: ## @@ -214,4 +215,152 @@ class SqlScriptingInterpreterSuite extends Quer

Re: [PR] [SPARK-48338][SQL] Improve exceptions thrown from parser/interpreter [spark]

2024-08-02 Thread via GitHub
cloud-fan commented on code in PR #47553: URL: https://github.com/apache/spark/pull/47553#discussion_r1701948494 ## sql/catalyst/src/main/scala/org/apache/spark/sql/exceptions/SqlScriptingException.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (AS

Re: [PR] [SPARK-48338][SQL] Improve exceptions thrown from parser/interpreter [spark]

2024-08-02 Thread via GitHub
cloud-fan commented on code in PR #47553: URL: https://github.com/apache/spark/pull/47553#discussion_r1701949792 ## sql/catalyst/src/main/scala/org/apache/spark/sql/exceptions/SqlScriptingException.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (AS

Re: [PR] [SPARK-48338][SQL] Improve exceptions thrown from parser/interpreter [spark]

2024-08-02 Thread via GitHub
cloud-fan commented on code in PR #47553: URL: https://github.com/apache/spark/pull/47553#discussion_r1701951490 ## sql/catalyst/src/main/scala/org/apache/spark/sql/exceptions/SqlScriptingException.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (AS

[PR] [MINOR][DOCS] Fix broken links in docs [spark]

2024-08-02 Thread via GitHub
wayneguow opened a new pull request, #47589: URL: https://github.com/apache/spark/pull/47589 ### What changes were proposed in this pull request? This PR aims to fix broken links in docs. ### Why are the changes needed? Fix broken links. ### Does this PR in

Re: [PR] [SPARK-49018][SQL] Fix approx_count_distinct not working correctly with collation [spark]

2024-08-02 Thread via GitHub
viktorluc-db commented on PR #47503: URL: https://github.com/apache/spark/pull/47503#issuecomment-2265577555 @cloud-fan Changes made, CI failure unrelated to the changes, please proceed to merge -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [PR] [MINOR] Use LinkedHashSet for ResolveLateralColumnAliasReference to generate stable hash for the plan [spark]

2024-08-02 Thread via GitHub
zhangyt26 commented on PR #47571: URL: https://github.com/apache/spark/pull/47571#issuecomment-2265577920 even easier repro: spark.sql("SELECT 100 as col1, 200 as col2, col1 * col2").semanticHash -- This is an automated message from the Apache Git Service. To respond to the message, pleas

Re: [PR] [SPARK-49063][SQL] Fix Between with ScalarSubqueries [spark]

2024-08-02 Thread via GitHub
cloud-fan commented on code in PR #47581: URL: https://github.com/apache/spark/pull/47581#discussion_r1701960014 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -4603,6 +4603,15 @@ object SQLConf { .booleanConf .createWithDefault(t

Re: [PR] [SPARK-49094][SQL] Fix ignoreCorruptFiles non-functioning for hive orc impl with mergeSchema off [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47583: URL: https://github.com/apache/spark/pull/47583#discussion_r1701961494 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/HiveOrcQuerySuite.scala: ## @@ -415,4 +415,23 @@ class HiveOrcQuerySuite extends OrcQueryTest with TestHiveSi

Re: [PR] [SPARK-49094][SQL] Fix ignoreCorruptFiles non-functioning for hive orc impl with mergeSchema off [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47583: URL: https://github.com/apache/spark/pull/47583#discussion_r1701962015 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/HiveOrcQuerySuite.scala: ## @@ -415,4 +415,23 @@ class HiveOrcQuerySuite extends OrcQueryTest with TestHiveSi

Re: [PR] Bump rexml from 3.2.6 to 3.3.3 in /docs [spark]

2024-08-02 Thread via GitHub
yaooqinn closed pull request #47588: Bump rexml from 3.2.6 to 3.3.3 in /docs URL: https://github.com/apache/spark/pull/47588 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [SPARK-49094][SQL] Fix ignoreCorruptFiles non-functioning for hive orc impl with mergeSchema off [spark]

2024-08-02 Thread via GitHub
dongjoon-hyun commented on code in PR #47583: URL: https://github.com/apache/spark/pull/47583#discussion_r1701966870 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/HiveOrcQuerySuite.scala: ## @@ -415,4 +415,23 @@ class HiveOrcQuerySuite extends OrcQueryTest with TestH

Re: [PR] Bump rexml from 3.2.6 to 3.3.3 in /docs [spark]

2024-08-02 Thread via GitHub
dependabot[bot] commented on PR #47588: URL: https://github.com/apache/spark/pull/47588#issuecomment-2265589485 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let

Re: [PR] [SPARK-48338][SQL] Improve exceptions thrown from parser/interpreter [spark]

2024-08-02 Thread via GitHub
dusantism-db commented on code in PR #47553: URL: https://github.com/apache/spark/pull/47553#discussion_r1701970258 ## sql/catalyst/src/main/scala/org/apache/spark/sql/exceptions/SqlScriptingException.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] [SPARK-49094][SQL] Fix ignoreCorruptFiles non-functioning for hive orc impl with mergeSchema off [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47583: URL: https://github.com/apache/spark/pull/47583#discussion_r1701970747 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/HiveOrcQuerySuite.scala: ## @@ -415,4 +415,23 @@ class HiveOrcQuerySuite extends OrcQueryTest with TestHiveSi

Re: [PR] [SPARK-48292][CORE] Revert [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage when committed file not consistent with task status [spark]

2024-08-02 Thread via GitHub
cloud-fan commented on PR #46696: URL: https://github.com/apache/spark/pull/46696#issuecomment-2265597736 This revert should fix your problem. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] [SPARK-49094][SQL] Fix ignoreCorruptFiles non-functioning for hive orc impl with mergeSchema off [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47583: URL: https://github.com/apache/spark/pull/47583#discussion_r1701978095 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/HiveOrcQuerySuite.scala: ## @@ -415,4 +415,23 @@ class HiveOrcQuerySuite extends OrcQueryTest with TestHiveSi

Re: [PR] [SPARK-49094][SQL] Fix ignoreCorruptFiles non-functioning for hive orc impl with mergeSchema off [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on code in PR #47583: URL: https://github.com/apache/spark/pull/47583#discussion_r1701980955 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/HiveOrcQuerySuite.scala: ## @@ -415,4 +415,23 @@ class HiveOrcQuerySuite extends OrcQueryTest with TestHiveSi

Re: [PR] [SPARK-49078][SQL] Support show columns syntax in v2 table [spark]

2024-08-02 Thread via GitHub
yaooqinn commented on PR #47568: URL: https://github.com/apache/spark/pull/47568#issuecomment-2265624443 Can you retrigger the CI? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific com

[PR] [SPARK-49097][INFRA] Add Python3 environment detection for the `build_error_docs` method in `build_api_docs.rb` [spark]

2024-08-02 Thread via GitHub
wayneguow opened a new pull request, #47590: URL: https://github.com/apache/spark/pull/47590 ### What changes were proposed in this pull request? This PR aims to add Python3 environment detection for the `build_error_docs` method in `build_api_docs.rb`. ### Why are the
