date:20250121

Re: [PR] [SPARK-50792][SQL][FOLLOWUP] Improve the push down information for binary [spark]

2025-01-21 Thread via GitHub

beliefer commented on code in PR #49555: URL: https://github.com/apache/spark/pull/49555#discussion_r1923797461 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/expressions/expressions.scala: ## @@ -388,12 +388,13 @@ private[sql] object HoursTransform { } privat

[PR] [SPARK-50798][SQL][FOLLOWUP] Further improvements to `NormalizePlan` [spark]

2025-01-21 Thread via GitHub

mihailotim-db opened a new pull request, #49585: URL: https://github.com/apache/spark/pull/49585 ### What changes were proposed in this pull request? Improve `NormalizePlan` by fixing normalization of `InheritAnalysisRules` and add normalization for `CommonExpressionId` and expres

[PR] [SPARK-50904][SQL] Fix collation expression walker query execution [spark]

2025-01-21 Thread via GitHub

stefankandic opened a new pull request, #49586: URL: https://github.com/apache/spark/pull/49586 ### What changes were proposed in this pull request? Changing when we collect results in `CollationExpressionWalkerSuite` on borders of changing session default collation. ### Why ar

Re: [PR] [SPARK-50895][SQL] Create common interface for expressions which produce default string type [spark]

2025-01-21 Thread via GitHub

stefankandic commented on code in PR #49576: URL: https://github.com/apache/spark/pull/49576#discussion_r1923877836 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collationExpressions.scala: ## @@ -151,12 +151,15 @@ case class ResolvedCollation(collatio

Re: [PR] [SPARK-48353][SQL] Introduction of Error Handling mechanism in SQL Scripting [spark]

2025-01-21 Thread via GitHub

dusantism-db commented on PR #49427: URL: https://github.com/apache/spark/pull/49427#issuecomment-2605017778 Could we add a test in which a handler is declared but an error is thrown anyway, because it is a different condition? For example declare handler for divide by zero but unresolved c

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-01-21 Thread via GitHub

vladimirg-db commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1923923937 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala: ## @@ -49,6 +49,18 @@ class CatalogManager( // TODO: create a real S

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-01-21 Thread via GitHub

dusantism-db commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1923946713 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala: ## @@ -49,6 +49,18 @@ class CatalogManager( // TODO: create a real S

Re: [PR] [WIP][SPARK-50838][SQL]Performs additional checks inside recursive CTEs to throw an error if forbidden case is encountered [spark]

2025-01-21 Thread via GitHub

milanisvet commented on PR #49518: URL: https://github.com/apache/spark/pull/49518#issuecomment-2605067854 As discussed offline, `checkIfSelfReferenceIsPlacedCorrectly` and `checkDataTypesAnchorAndRecursiveTerm` definitions left in `resolveWithCTE` singleton, but invoked in `checkAnalysis`

Re: [PR] [WIP][SPARK-50838][SQL]Performs additional checks inside recursive CTEs to throw an error if forbidden case is encountered [spark]

2025-01-21 Thread via GitHub

milanisvet commented on code in PR #49518: URL: https://github.com/apache/spark/pull/49518#discussion_r1923949925 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -3117,6 +3117,29 @@ ], "sqlState" : "42602" }, + "INVALID_RECURSIVE_REFERENCE" :

Re: [PR] [SPARK-50718][PYTHON][4.0] Support `addArtifact(s)` for PySpark [spark]

2025-01-21 Thread via GitHub

itholic closed pull request #49583: [SPARK-50718][PYTHON][4.0] Support `addArtifact(s)` for PySpark URL: https://github.com/apache/spark/pull/49583 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

Re: [PR] [SPARK-50718][PYTHON][4.0] Support `addArtifact(s)` for PySpark [spark]

2025-01-21 Thread via GitHub

itholic commented on PR #49583: URL: https://github.com/apache/spark/pull/49583#issuecomment-2603949067 Merged to branch-4.0.

Re: [PR] [SPARK-50582][SQL][PYTHON] Add quote builtin function [spark]

2025-01-21 Thread via GitHub

sarutak commented on code in PR #49191: URL: https://github.com/apache/spark/pull/49191#discussion_r1923271004 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala: ## @@ -3723,3 +3723,40 @@ case class Luhncheck(input: Expression) exte

Re: [PR] [SPARK-50582][SQL][PYTHON] Add quote builtin function [spark]

2025-01-21 Thread via GitHub

sarutak commented on code in PR #49191: URL: https://github.com/apache/spark/pull/49191#discussion_r1923275195 ## sql/core/src/test/scala/org/apache/spark/sql/StringFunctionsSuite.scala: ## @@ -1452,4 +1452,21 @@ class StringFunctionsSuite extends QueryTest with SharedSparkSess

Re: [PR] [SPARK-49646][SQL] add spark config for fixing subquery decorrelation for union/set operations when parentOuterReferences has references not covered in collectedChildOuterReferences [spark]

2025-01-21 Thread via GitHub

cloud-fan commented on PR #49536: URL: https://github.com/apache/spark/pull/49536#issuecomment-2604140193 Can you link to the original PR that did the change?

Re: [PR] [SPARK-50091][SQL] Handle case of aggregates in left-hand operand of IN-subquery [spark]

2025-01-21 Thread via GitHub

cloud-fan commented on code in PR #48627: URL: https://github.com/apache/spark/pull/48627#discussion_r1923402160 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala: ## @@ -246,46 +267,106 @@ object RewritePredicateSubquery extends Rule[Logical

Re: [PR] [SPARK-50091][SQL] Handle case of aggregates in left-hand operand of IN-subquery [spark]

2025-01-21 Thread via GitHub

cloud-fan commented on code in PR #48627: URL: https://github.com/apache/spark/pull/48627#discussion_r1923402967 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala: ## @@ -246,46 +267,106 @@ object RewritePredicateSubquery extends Rule[Logical

Re: [PR] [SPARK-50091][SQL] Handle case of aggregates in left-hand operand of IN-subquery [spark]

2025-01-21 Thread via GitHub

cloud-fan commented on code in PR #48627: URL: https://github.com/apache/spark/pull/48627#discussion_r1923408425 ## sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala: ## @@ -2800,4 +2800,32 @@ class SubquerySuite extends QueryTest checkAnswer(df3, Row(7))

Re: [PR] [SPARK-50902][CORE][K8S][TESTS] Add `CRC32C` test cases [spark]

2025-01-21 Thread via GitHub

LuciferYang commented on PR #49582: URL: https://github.com/apache/spark/pull/49582#issuecomment-2604071321 A failure occurred in https://github.com/dongjoon-hyun/spark/actions/runs/12881008656/job/35910916844 , causing the `KubernetesLocalDiskShuffleDataIOSuite` to not be executed. Alth

Re: [PR] [SPARK-50792][SQL][FOLLOWUP] Improve the push down information for binary [spark]

2025-01-21 Thread via GitHub

cloud-fan commented on code in PR #49555: URL: https://github.com/apache/spark/pull/49555#discussion_r1923350962 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/expressions/expressions.scala: ## @@ -388,12 +388,13 @@ private[sql] object HoursTransform { } priva

Re: [PR] [SPARK-50880][SQL] Add a new visitBinaryComparison method to V2ExpressionSQLBuilder [spark]

2025-01-21 Thread via GitHub

beliefer commented on PR #49556: URL: https://github.com/apache/spark/pull/49556#issuecomment-2604766338 @cloud-fan Thank you!

Re: [PR] [SPARK-50895][SQL] Create common interface for expressions which produce default string type [spark]

2025-01-21 Thread via GitHub

stefankandic commented on PR #49576: URL: https://github.com/apache/spark/pull/49576#issuecomment-2604961817 @MaxGekk I have created a separate PR to unblock the test that is failing in the collation expression walker suite #49586.

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-01-21 Thread via GitHub

cloud-fan commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1923899724 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala: ## @@ -49,6 +49,18 @@ class CatalogManager( // TODO: create a real SYST

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-01-21 Thread via GitHub

cloud-fan commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1923900874 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala: ## @@ -49,6 +49,18 @@ class CatalogManager( // TODO: create a real SYST

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-01-21 Thread via GitHub

dusantism-db commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1923862248 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -154,6 +154,9 @@ case class AnalysisContext( referredTempFunctionN

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-01-21 Thread via GitHub

dusantism-db commented on PR #49445: URL: https://github.com/apache/spark/pull/49445#issuecomment-2604938126 @cloud-fan @MaxGekk I resolved all comments, could you take a look?

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-01-21 Thread via GitHub

vladimirg-db commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1923910427 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala: ## @@ -49,6 +49,18 @@ class CatalogManager( // TODO: create a real S

Re: [PR] [SPARK-48353][SQL] Introduction of Error Handling mechanism in SQL Scripting [spark]

2025-01-21 Thread via GitHub

dusantism-db commented on code in PR #49427: URL: https://github.com/apache/spark/pull/49427#discussion_r1923910371 ## sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionSuite.scala: ## @@ -65,6 +68,426 @@ class SqlScriptingExecutionSuite extends QueryTe

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-01-21 Thread via GitHub

cloud-fan commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1923910719 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala: ## @@ -49,6 +49,18 @@ class CatalogManager( // TODO: create a real SYST

Re: [PR] [SPARK-48353][SQL] Introduction of Error Handling mechanism in SQL Scripting [spark]

2025-01-21 Thread via GitHub

miland-db commented on code in PR #49427: URL: https://github.com/apache/spark/pull/49427#discussion_r1923911342 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ## @@ -159,15 +159,104 @@ class AstBuilder extends DataTypeAstBuilder scrip

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-01-21 Thread via GitHub

vladimirg-db commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1923911359 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala: ## @@ -49,6 +49,18 @@ class CatalogManager( // TODO: create a real S

[PR] [SPARK-50905][SQL][TESTS] Rename `Customer` to `Custom` in `SparkSessionExtensionSuite` [spark]

2025-01-21 Thread via GitHub

dongjoon-hyun opened a new pull request, #49587: URL: https://github.com/apache/spark/pull/49587 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

Re: [PR] [WIP][SPARK-50838][SQL]Performs additional checks inside recursive CTEs to throw an error if forbidden case is encountered [spark]

2025-01-21 Thread via GitHub

milanisvet commented on code in PR #49518: URL: https://github.com/apache/spark/pull/49518#discussion_r1923954633 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveWithCTE.scala: ## @@ -49,17 +51,27 @@ object ResolveWithCTE extends Rule[LogicalPlan] {

Re: [PR] [SPARK-50898][ML][PYTHON][CONNECT] Support `FPGrowth` on connect [spark]

2025-01-21 Thread via GitHub

zhengruifeng commented on PR #49579: URL: https://github.com/apache/spark/pull/49579#issuecomment-2603906459 thanks, merged to master/4.0

Re: [PR] [SPARK-50898][ML][PYTHON][CONNECT] Support `FPGrowth` on connect [spark]

2025-01-21 Thread via GitHub

zhengruifeng closed pull request #49579: [SPARK-50898][ML][PYTHON][CONNECT] Support `FPGrowth` on connect URL: https://github.com/apache/spark/pull/49579

Re: [PR] [SPARK-50582][SQL][PYTHON] Add quote builtin function [spark]

2025-01-21 Thread via GitHub

MaxGekk commented on code in PR #49191: URL: https://github.com/apache/spark/pull/49191#discussion_r1923230780 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala: ## @@ -3723,3 +3723,40 @@ case class Luhncheck(input: Expression) exte

Re: [PR] [SPARK-50879][ML][PYTHON][CONNECT] Support feature scalers on Connect [spark]

2025-01-21 Thread via GitHub

zhengruifeng commented on PR #49581: URL: https://github.com/apache/spark/pull/49581#issuecomment-2603917527 thanks, merged to master/4.0

Re: [PR] [SPARK-50879][ML][PYTHON][CONNECT] Support feature scalers on Connect [spark]

2025-01-21 Thread via GitHub

zhengruifeng closed pull request #49581: [SPARK-50879][ML][PYTHON][CONNECT] Support feature scalers on Connect URL: https://github.com/apache/spark/pull/49581

Re: [PR] [SPARK-50858][PYTHON] Add configuration to hide Python UDF stack trace [spark]

2025-01-21 Thread via GitHub

wengh commented on code in PR #49535: URL: https://github.com/apache/spark/pull/49535#discussion_r1924468968 ## python/pyspark/util.py: ## @@ -468,16 +468,19 @@ def handle_worker_exception(e: BaseException, outfile: IO) -> None: and exception traceback info to outfile. JVM

Re: [PR] [SPARK-50858][PYTHON] Add configuration to hide Python UDF stack trace [spark]

2025-01-21 Thread via GitHub

allisonwang-db commented on code in PR #49535: URL: https://github.com/apache/spark/pull/49535#discussion_r1924503930 ## python/pyspark/util.py: ## @@ -468,16 +468,19 @@ def handle_worker_exception(e: BaseException, outfile: IO) -> None: and exception traceback info to out

Re: [PR] [SPARK-50902][CORE][K8S][TESTS] Add `CRC32C` test cases [spark]

2025-01-21 Thread via GitHub

dongjoon-hyun commented on PR #49582: URL: https://github.com/apache/spark/pull/49582#issuecomment-2605130155 Thank you! Now all tests passed. Merged to master/4.0. ![Screenshot 2025-01-21 at 08 00 57](https://github.com/user-attachments/assets/d74967d6-ea5f-47fd-b241-0781257445c1)

Re: [PR] [SPARK-50082][CORE] Remove some unnecessary Jersey-related warning logs [spark]

2025-01-21 Thread via GitHub

pan3793 commented on PR #48611: URL: https://github.com/apache/spark/pull/48611#issuecomment-2605203551 > If the following two dependencies are in the class path, there will be no corresponding warning logs, but we excluded it in this PR: https://github.com/apache/spark/pull/25481 > - `j

Re: [PR] [WIP][SPARK-50892][SQL]Add UnionLoopExec, physical operator for recursion, to perform execution of recursive queries [spark]

2025-01-21 Thread via GitHub

milanisvet commented on code in PR #49571: URL: https://github.com/apache/spark/pull/49571#discussion_r1924001588 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/InlineCTE.scala: ## @@ -61,7 +61,8 @@ case class InlineCTE( // 1) It is fine to inline a

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-01-21 Thread via GitHub

dusantism-db commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1924348312 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala: ## @@ -49,6 +49,18 @@ class CatalogManager( // TODO: create a real S

Re: [PR] [SPARK-45013][CORE][TEST][3.5] Flaky Test with NPE: track allocated resources by taskId [spark]

2025-01-21 Thread via GitHub

dongjoon-hyun closed pull request #49589: [SPARK-45013][CORE][TEST][3.5] Flaky Test with NPE: track allocated resources by taskId URL: https://github.com/apache/spark/pull/49589

Re: [PR] [SPARK-49646][SQL] add spark config for fixing subquery decorrelation for union/set operations when parentOuterReferences has references not covered in collectedChildOuterReferences [spark]

2025-01-21 Thread via GitHub

AveryQi115 commented on PR #49536: URL: https://github.com/apache/spark/pull/49536#issuecomment-2605444582 I changed the description and linked the original pr in the pr description. Here's the linked change: https://github.com/apache/spark/pull/48109

Re: [PR] [SPARK-50639][SQL] Improve warning logging in CacheManager [spark]

2025-01-21 Thread via GitHub

vrozov commented on PR #49276: URL: https://github.com/apache/spark/pull/49276#issuecomment-2605196148 @gengliangwang Please check my reply.

Re: [PR] [SPARK-50883][SQL] Support altering multiple columns in the same command [spark]

2025-01-21 Thread via GitHub

ctring commented on PR #49559: URL: https://github.com/apache/spark/pull/49559#issuecomment-2605472867 @MaxGekk I updated the PR

Re: [PR] [SPARK-45013][CORE][TEST][3.5] Flaky Test with NPE: track allocated resources by taskId [spark]

2025-01-21 Thread via GitHub

dongjoon-hyun commented on PR #49589: URL: https://github.com/apache/spark/pull/49589#issuecomment-2605714639 Merged to branch-3.5.

[PR] [SPARK-50906][SS] Add nullability check for if inputs of to_avro align with schema [spark]

2025-01-21 Thread via GitHub

fanyue-xia opened a new pull request, #49590: URL: https://github.com/apache/spark/pull/49590 ### What changes were proposed in this pull request? Previously, we don't explicitly check when input of `to_avro` is `null` but the schema does not allow `null`. As a result, a NPE w

Re: [PR] [SPARK-50905][SQL][TESTS] Rename `Customer` to `Custom` in `SparkSessionExtensionSuite` [spark]

2025-01-21 Thread via GitHub

LuciferYang commented on PR #49587: URL: https://github.com/apache/spark/pull/49587#issuecomment-2605266305 late LGTM

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-01-21 Thread via GitHub

davidm-db commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1924312921 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala: ## @@ -49,6 +49,18 @@ class CatalogManager( // TODO: create a real SYST

Re: [PR] [SPARK-48353][SQL] Introduction of Error Handling mechanism in SQL Scripting [spark]

2025-01-21 Thread via GitHub

davidm-db commented on code in PR #49427: URL: https://github.com/apache/spark/pull/49427#discussion_r1924320046 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParserUtils.scala: ## @@ -161,25 +161,26 @@ class SqlScriptingParsingContext { transitionTo(S

Re: [PR] [SPARK-50904][SQL] Fix collation expression walker query execution [spark]

2025-01-21 Thread via GitHub

MaxGekk closed pull request #49586: [SPARK-50904][SQL] Fix collation expression walker query execution URL: https://github.com/apache/spark/pull/49586

Re: [PR] [SPARK-48530][SQL] Support for local variables in SQL Scripting [spark]

2025-01-21 Thread via GitHub

dusantism-db commented on code in PR #49445: URL: https://github.com/apache/spark/pull/49445#discussion_r1924066055 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala: ## @@ -49,6 +49,18 @@ class CatalogManager( // TODO: create a real S

Re: [PR] [SPARK-50905][SQL][TESTS] Rename `Customer` to `Custom` in `SparkSessionExtensionSuite` [spark]

2025-01-21 Thread via GitHub

dongjoon-hyun commented on PR #49587: URL: https://github.com/apache/spark/pull/49587#issuecomment-2605256634 Could you review this PR, @LuciferYang ?

Re: [PR] [SPARK-50905][SQL][TESTS] Rename `Customer` to `Custom` in `SparkSessionExtensionSuite` [spark]

2025-01-21 Thread via GitHub

dongjoon-hyun commented on PR #49587: URL: https://github.com/apache/spark/pull/49587#issuecomment-2605260101 Thank you, @MaxGekk !

Re: [PR] [SPARK-50905][SQL][TESTS] Rename `Customer` to `Custom` in `SparkSessionExtensionSuite` [spark]

2025-01-21 Thread via GitHub

dongjoon-hyun closed pull request #49587: [SPARK-50905][SQL][TESTS] Rename `Customer*` to `Custom*` in `SparkSessionExtensionSuite` URL: https://github.com/apache/spark/pull/49587

Re: [PR] [SPARK-50905][SQL][TESTS] Rename `Customer` to `Custom` in `SparkSessionExtensionSuite` [spark]

2025-01-21 Thread via GitHub

dongjoon-hyun commented on PR #49587: URL: https://github.com/apache/spark/pull/49587#issuecomment-2605264174 `SparkSessionExtensionSuite` passed in the CI. Merged to master/4.0.

Re: [PR] [SPARK-50883][SQL] Support altering multiple columns in the same command [spark]

2025-01-21 Thread via GitHub

ctring commented on code in PR #49559: URL: https://github.com/apache/spark/pull/49559#discussion_r1924272176 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala: ## @@ -1622,60 +1622,84 @@ trait CheckAnalysis extends PredicateHelper with L

Re: [PR] [SPARK-46937][SQL] Improve concurrency performance for FunctionRegistry [spark]

2025-01-21 Thread via GitHub

github-actions[bot] commented on PR #47084: URL: https://github.com/apache/spark/pull/47084#issuecomment-2606014785 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-50858][PYTHON] Add configuration to hide Python UDF stack trace [spark]

2025-01-21 Thread via GitHub

HyukjinKwon commented on code in PR #49535: URL: https://github.com/apache/spark/pull/49535#discussion_r1924528145 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -3459,6 +3459,15 @@ object SQLConf { .checkValues(Set("legacy", "row", "dic

Re: [PR] [SPARK-50855][SS][CONNECT] Spark Connect Support for TransformWithState [spark]

2025-01-21 Thread via GitHub

anishshri-db commented on code in PR #49488: URL: https://github.com/apache/spark/pull/49488#discussion_r1924521840 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala: ## @@ -140,28 +144,50 @@ class KeyValueGroupedDataset[K, V] priva

Re: [PR] [SPARK-39901][CORE][SQL] Redesign `ignoreCorruptFiles` to make it more accurate by adding a new config `spark.files.ignoreCorruptFiles.errorClasses` [spark]

2025-01-21 Thread via GitHub

github-actions[bot] closed pull request #47090: [SPARK-39901][CORE][SQL] Redesign `ignoreCorruptFiles` to make it more accurate by adding a new config `spark.files.ignoreCorruptFiles.errorClasses` URL: https://github.com/apache/spark/pull/47090

[PR] [ML][CONNECT] Support Transformer [spark]

2025-01-21 Thread via GitHub

wbo4958 opened a new pull request, #49588: URL: https://github.com/apache/spark/pull/49588 ### What changes were proposed in this pull request? This PR adds support transformer on ml connect. Currently, VectorAssembler is fully supported. ### Why are the changes needed?

[PR] [SPARK-45013][TEST][3.5] Flaky Test with NPE: track allocated resources by taskId [spark]

2025-01-21 Thread via GitHub

LuciferYang opened a new pull request, #49589: URL: https://github.com/apache/spark/pull/49589 ### What changes were proposed in this pull request? This PR ensures the runningTasks to be updated before subsequent tasks causing NPE ### Why are the changes needed? fix flakey tests

Re: [PR] [SPARK-50855][SS][CONNECT] Spark Connect Support for TransformWithState [spark]

2025-01-21 Thread via GitHub

jingz-db commented on PR #49488: URL: https://github.com/apache/spark/pull/49488#issuecomment-2605402772 > Is the CI failure related - https://github.com/jingz-db/spark/actions/runs/12837471455/job/35801692179 ? Yes it is related to proto file changes. I just rebased on latest master

Re: [PR] [SPARK-50883][SQL] Support altering multiple columns in the same command [spark]

2025-01-21 Thread via GitHub

ctring commented on code in PR #49559: URL: https://github.com/apache/spark/pull/49559#discussion_r1924273748 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2AlterTableCommands.scala: ## @@ -201,49 +201,55 @@ case class RenameColumn( copy(table

Re: [PR] [SPARK-50902][CORE][K8S][TESTS] Add `CRC32C` test cases [spark]

2025-01-21 Thread via GitHub

dongjoon-hyun closed pull request #49582: [SPARK-50902][CORE][K8S][TESTS] Add `CRC32C` test cases URL: https://github.com/apache/spark/pull/49582

Re: [PR] [SPARK-45013][TEST][3.5] Flaky Test with NPE: track allocated resources by taskId [spark]

2025-01-21 Thread via GitHub

LuciferYang commented on PR #49589: URL: https://github.com/apache/spark/pull/49589#issuecomment-2605240660 I hope to backport this fix to branch-3.5, as I encountered similar test failures in the daily tests of branch-3.5: - https://github.com/apache/spark/actions/runs/12885594112/job/35

Re: [PR] [SPARK-50904][SQL] Fix collation expression walker query execution [spark]

2025-01-21 Thread via GitHub

MaxGekk commented on PR #49586: URL: https://github.com/apache/spark/pull/49586#issuecomment-2605237204 +1, LGTM. Merging to master/4.0. Thank you, @stefankandic.

Re: [PR] [SPARK-50895][SQL] Create common interface for expressions which produce default string type [spark]

2025-01-21 Thread via GitHub

MaxGekk commented on PR #49576: URL: https://github.com/apache/spark/pull/49576#issuecomment-2605524819 +1, LGTM. Merging to master/4.0. Thank you, @stefankandic and @stevomitric for review.

Re: [PR] [SPARK-50895][SQL] Create common interface for expressions which produce default string type [spark]

2025-01-21 Thread via GitHub

MaxGekk closed pull request #49576: [SPARK-50895][SQL] Create common interface for expressions which produce default string type URL: https://github.com/apache/spark/pull/49576

Re: [PR] [SPARK-50883][SQL] Support altering multiple columns in the same command [spark]

2025-01-21 Thread via GitHub

scovich commented on code in PR #49559: URL: https://github.com/apache/spark/pull/49559#discussion_r1924223680 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala: ## @@ -1622,60 +1622,84 @@ trait CheckAnalysis extends PredicateHelper with

Re: [PR] [SPARK-50895][SQL] Create common interface for expressions which produce default string type [spark]

2025-01-21 Thread via GitHub

MaxGekk commented on PR #49576: URL: https://github.com/apache/spark/pull/49576#issuecomment-2605535297 @stefankandic Could you please open a PR with a backport to `branch-4.0`?

Re: [PR] [WIP][SPARK-50838][SQL]Performs additional checks inside recursive CTEs to throw an error if forbidden case is encountered [spark]

2025-01-21 Thread via GitHub

dtenedor commented on code in PR #49518: URL: https://github.com/apache/spark/pull/49518#discussion_r1924249528 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveWithCTE.scala: ## @@ -183,4 +184,52 @@ object ResolveWithCTE extends Rule[LogicalPlan] {

Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2025-01-21 Thread via GitHub

hvanhovell commented on code in PR #48818: URL: https://github.com/apache/spark/pull/48818#discussion_r1924244051 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala: ## Review Comment: Take another look at DataWriti

Re: [PR] [SPARK-50858][PYTHON] Add configuration to hide Python UDF stack trace [spark]

2025-01-21 Thread via GitHub

allisonwang-db commented on code in PR #49535: URL: https://github.com/apache/spark/pull/49535#discussion_r1924245423 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -3459,6 +3459,16 @@ object SQLConf { .checkValues(Set("legacy", "row", "

Re: [PR] [SPARK-50858][PYTHON] Add configuration to hide Python UDF stack trace [spark]

2025-01-21 Thread via GitHub

wengh commented on PR #49535: URL: https://github.com/apache/spark/pull/49535#issuecomment-2605498765 @allisonwang-db @ueshin Could you review this PR that adds configuration to hide Python stack trace from analyze_udtf?

Re: [PR] [SPARK-50792][SQL][FOLLOWUP] Improve the push down information for binary [spark]

2025-01-21 Thread via GitHub

cloud-fan commented on code in PR #49555: URL: https://github.com/apache/spark/pull/49555#discussion_r1923352440 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/expressions/expressions.scala: ## @@ -388,12 +388,13 @@ private[sql] object HoursTransform { } priva

Re: [PR] [SPARK-50880][SQL] Add a new visitBinaryComparison method to V2ExpressionSQLBuilder [spark]

2025-01-21 Thread via GitHub

cloud-fan closed pull request #49556: [SPARK-50880][SQL] Add a new visitBinaryComparison method to V2ExpressionSQLBuilder URL: https://github.com/apache/spark/pull/49556

Re: [PR] [SPARK-50091][SQL] Handle case of aggregates in left-hand operand of IN-subquery [spark]

2025-01-21 Thread via GitHub

cloud-fan commented on code in PR #48627: URL: https://github.com/apache/spark/pull/48627#discussion_r1923409502 ## sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala: ## @@ -2800,4 +2800,32 @@ class SubquerySuite extends QueryTest checkAnswer(df3, Row(7))

Re: [PR] [SPARK-50880][SQL] Add a new visitBinaryComparison method to V2ExpressionSQLBuilder [spark]

2025-01-21 Thread via GitHub

cloud-fan commented on PR #49556: URL: https://github.com/apache/spark/pull/49556#issuecomment-2604176635 thanks, merging to master/4.0!

[PR] [WIP][SPARK_50903][CONNECT] Let the plan cache only contain analysed plans [spark]

2025-01-21 Thread via GitHub

changgyoopark-db opened a new pull request, #49584: URL: https://github.com/apache/spark/pull/49584 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ##

Re: [PR] [SPARK-50912][PYTHON][TESTS] Skip FrameTakeAdvParityTests.test_take_adv because of OOM for now [spark]

2025-01-21 Thread via GitHub

HyukjinKwon closed pull request #49593: [SPARK-50912][PYTHON][TESTS] Skip FrameTakeAdvParityTests.test_take_adv because of OOM for now URL: https://github.com/apache/spark/pull/49593

[PR] [SPARK-50912][PYTHON][TESTS] Skip FrameTakeAdvParityTests.test_take_adv because of OOM for now [spark]

2025-01-21 Thread via GitHub

HyukjinKwon opened a new pull request, #49593: URL: https://github.com/apache/spark/pull/49593 ### What changes were proposed in this pull request? This PR proposes to skip `test_take_adv` in the Spark Connect-only build, similar to https://github.com/apache/spark/pull/49565 ###

Re: [PR] [SPARK-50912][PYTHON][TESTS] Skip FrameTakeAdvParityTests.test_take_adv because of OOM for now [spark]

2025-01-21 Thread via GitHub

HyukjinKwon commented on PR #49593: URL: https://github.com/apache/spark/pull/49593#issuecomment-2606091709 Merged to master, branch-4.0, and branch-3.5.

Re: [PR] [SPARK-45013][CORE][TEST][3.5] Flaky Test with NPE: track allocated resources by taskId [spark]

2025-01-21 Thread via GitHub

LuciferYang commented on PR #49589: URL: https://github.com/apache/spark/pull/49589#issuecomment-2606161803 Thanks @dongjoon-hyun

Re: [PR] [SPARK-50915][PYTHON][CONNECT] Add `getCondition` and deprecate `getErrorClass` in `PySparkException` [spark]

2025-01-21 Thread via GitHub

HyukjinKwon commented on code in PR #49594: URL: https://github.com/apache/spark/pull/49594#discussion_r1924621839 ## python/docs/source/reference/pyspark.errors.rst: ## @@ -69,7 +69,7 @@ Methods .. autosummary:: :toctree: api/ -PySparkException.getErrorClass +Py

Re: [PR] [SPARK-50915][PYTHON][CONNECT] Add `getCondition` and deprecate `getErrorClass` in `PySparkException` [spark]

2025-01-21 Thread via GitHub

HyukjinKwon commented on code in PR #49594: URL: https://github.com/apache/spark/pull/49594#discussion_r1924621621 ## python/docs/source/reference/pyspark.errors.rst: ## @@ -69,7 +69,7 @@ Methods .. autosummary:: :toctree: api/ -PySparkException.getErrorClass Review

Re: [PR] [SPARK-50853][CORE] Close temp shuffle file writable channel [spark]

2025-01-21 Thread via GitHub

LuciferYang commented on code in PR #49531: URL: https://github.com/apache/spark/pull/49531#discussion_r1924639668 ## core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferServiceSuite.scala: ## @@ -130,6 +132,62 @@ class NettyBlockTransferServiceSuite assert

Re: [PR] [SPARK-50820][SQL] DSv2: Conditional nullification of metadata columns in DML [spark]

2025-01-21 Thread via GitHub

aokolnychyi commented on code in PR #49493: URL: https://github.com/apache/spark/pull/49493#discussion_r1924643421 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ReplaceDataProjections.scala: ## @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] [SPARK-50578][PYTHON][SS] Add support for new version of state metadata for TransformWithStateInPandas [spark]

2025-01-21 Thread via GitHub

HyukjinKwon commented on PR #49156: URL: https://github.com/apache/spark/pull/49156#issuecomment-2606079406 Just a quick note .. Seems like the test `test_value_state_ttl_expiration` is still flaky when old dependencies are used (https://github.com/apache/spark/actions/runs/12883552117/job/


