Re: [PR] [SPARK-49094][SQL] Fix ignoreCorruptFiles non-functioning for hive orc impl with mergeSchema off [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on PR #47583: URL: https://github.com/apache/spark/pull/47583#issuecomment-2268325751 late LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsu

Re: [PR] [SPARK-49078][SQL] Support show columns syntax in v2 table [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on PR #47568: URL: https://github.com/apache/spark/pull/47568#issuecomment-2268329914 late LGTM

Re: [PR] [SPARK-48346][SQL] Support for IF ELSE statements in SQL scripts [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on PR #47442: URL: https://github.com/apache/spark/pull/47442#issuecomment-2268371062 thanks, merging to master!

Re: [PR] [SPARK-48346][SQL] Support for IF ELSE statements in SQL scripts [spark]

2024-08-05 Thread via GitHub
cloud-fan closed pull request #47442: [SPARK-48346][SQL] Support for IF ELSE statements in SQL scripts URL: https://github.com/apache/spark/pull/47442

Re: [PR] [SPARK-49082][SQL] Widening type promotions in `AvroDeserializer` [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on code in PR #47582: URL: https://github.com/apache/spark/pull/47582#discussion_r1703674860 ## connector/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala: ## @@ -921,6 +921,34 @@ abstract class AvroSuite } } + test("SPARK-49082: Widen

Re: [PR] [SPARK-49082][SQL] Widening type promotions in `AvroDeserializer` [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on code in PR #47582: URL: https://github.com/apache/spark/pull/47582#discussion_r1703675320 ## connector/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala: ## @@ -921,6 +921,34 @@ abstract class AvroSuite } } + test("SPARK-49082: Widen

Re: [PR] [SPARK-49108][EXAMPLE] Add `submit_pi.sh` REST API example [spark]

2024-08-05 Thread via GitHub
dongjoon-hyun commented on PR #47601: URL: https://github.com/apache/spark/pull/47601#issuecomment-2268398685 Thank you, @viirya !

Re: [PR] [SPARK-49060] Clean up Mima rules for SQL-Connect binary compatibility checks [spark]

2024-08-05 Thread via GitHub
xupefei commented on code in PR #47487: URL: https://github.com/apache/spark/pull/47487#discussion_r1703685470 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala: ## @@ -879,7 +879,7 @@ class KeyValueGroupedDataset[K, V] private[sql]

Re: [PR] [SPARK-49048][SS] Add support for reading relevant operator metadata at given batch id [spark]

2024-08-05 Thread via GitHub
HeartSaVioR commented on code in PR #47528: URL: https://github.com/apache/spark/pull/47528#discussion_r1703618747 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StateDataSourceErrors.scala: ## @@ -63,6 +63,12 @@ object StateDataSourceErrors {

Re: [PR] [SPARK-49048][SS] Add support for reading relevant operator metadata at given batch id [spark]

2024-08-05 Thread via GitHub
HeartSaVioR commented on code in PR #47528: URL: https://github.com/apache/spark/pull/47528#discussion_r1703704102 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/metadata/StateMetadataSource.scala: ## @@ -104,7 +104,15 @@ class StateMetadataTable

Re: [PR] [SPARK-49063][SQL] Fix Between with ScalarSubqueries [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on PR #47581: URL: https://github.com/apache/spark/pull/47581#issuecomment-2268432411 thanks, merging to master!

[PR] [SPARK-49109][SQL] Rename leftover BinaryLcase to Lcase [spark]

2024-08-05 Thread via GitHub
mihailom-db opened a new pull request, #47602: URL: https://github.com/apache/spark/pull/47602 ### What changes were proposed in this pull request? Renaming of all leftover binaryLcase to Lcase. ### Why are the changes needed? The code should follow proper naming.

Re: [PR] [SPARK-49048][SS] Add support for reading relevant operator metadata at given batch id [spark]

2024-08-05 Thread via GitHub
HeartSaVioR commented on code in PR #47528: URL: https://github.com/apache/spark/pull/47528#discussion_r1703709138 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/metadata/StateMetadataSource.scala: ## @@ -104,7 +104,15 @@ class StateMetadataTable

Re: [PR] [SPARK-49063][SQL] Fix Between with ScalarSubqueries [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on PR #47581: URL: https://github.com/apache/spark/pull/47581#issuecomment-2268449298 thanks, merging to master!

Re: [PR] [SPARK-49063][SQL] Fix Between with ScalarSubqueries [spark]

2024-08-05 Thread via GitHub
cloud-fan closed pull request #47581: [SPARK-49063][SQL] Fix Between with ScalarSubqueries URL: https://github.com/apache/spark/pull/47581

[PR] [MINOR][SQL] Remove orphans in ProtoToParsedPlanTestSuite and PlanGenerationTestSuite [spark]

2024-08-05 Thread via GitHub
HyukjinKwon opened a new pull request, #47603: URL: https://github.com/apache/spark/pull/47603 ### What changes were proposed in this pull request? This PR proposes to remove orphans in ProtoToParsedPlanTestSuite and PlanGenerationTestSuite ### Why are the changes needed?

Re: [PR] [SPARK-49017][SQL] Insert statement fails when multiple parameters are being used [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on code in PR #47501: URL: https://github.com/apache/spark/pull/47501#discussion_r1703732585 ## sql/core/src/test/scala/org/apache/spark/sql/ParametersSuite.scala: ## @@ -624,6 +624,67 @@ class ParametersSuite extends QueryTest with SharedSparkSession with P

Re: [PR] [SPARK-49017][SQL] Insert statement fails when multiple parameters are being used [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on code in PR #47501: URL: https://github.com/apache/spark/pull/47501#discussion_r1703732988 ## sql/core/src/test/scala/org/apache/spark/sql/ParametersSuite.scala: ## @@ -624,6 +624,67 @@ class ParametersSuite extends QueryTest with SharedSparkSession with P

Re: [PR] [SPARK-49017][SQL] Insert statement fails when multiple parameters are being used [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on code in PR #47501: URL: https://github.com/apache/spark/pull/47501#discussion_r1703734011 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ## @@ -68,23 +68,25 @@ class AstBuilder extends DataTypeAstBuilder with SQLConf

Re: [PR] [SPARK-49017][SQL] Insert statement fails when multiple parameters are being used [spark]

2024-08-05 Thread via GitHub
mihailom-db commented on code in PR #47501: URL: https://github.com/apache/spark/pull/47501#discussion_r1703740899 ## sql/core/src/test/scala/org/apache/spark/sql/ParametersSuite.scala: ## @@ -624,6 +624,67 @@ class ParametersSuite extends QueryTest with SharedSparkSession with

Re: [PR] [SPARK-48338][SQL] Improve exceptions thrown from parser/interpreter [spark]

2024-08-05 Thread via GitHub
dusantism-db commented on code in PR #47553: URL: https://github.com/apache/spark/pull/47553#discussion_r1703748683 ## sql/catalyst/src/main/scala/org/apache/spark/sql/exceptions/SqlScriptingException.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-08-05 Thread via GitHub
panbingkun commented on PR #47364: URL: https://github.com/apache/spark/pull/47364#issuecomment-2268504350 Hi, @mihailom-db, I have a few questions to ask: - Does the `CATALOG` here correspond to spark's `catalog`? (the default value should be `spark_catalog`, not `SYSTEM`) Meanwhile, doe

[PR] [SPARK-49110][SQL] Fix reading metadata columns for tables with CHAR columns [spark]

2024-08-05 Thread via GitHub
tomvanbussel opened a new pull request, #47604: URL: https://github.com/apache/spark/pull/47604 ### What changes were proposed in this pull request? This PR modifies `SubqueryAlias` to always propagate the metadata output of its child, even if the child is not a `SubqueryAlias` or a `

Re: [PR] [SPARK-49043][SQL] Fix interpreted codepath group by on map containing collated strings [spark]

2024-08-05 Thread via GitHub
stefankandic commented on code in PR #47521: URL: https://github.com/apache/spark/pull/47521#discussion_r1703731222 ## sql/core/src/test/resources/sql-tests/results/mode.sql.out: ## @@ -182,15 +182,9 @@ struct> -- !query SELECT mode(col, true) FROM VALUES (map(1, 'a')) AS tab(

[PR] [SPARK-48763][TESTS][FOLLOW-UP] Update project location in PlanGenerationTestSuite [spark]

2024-08-05 Thread via GitHub
HyukjinKwon opened a new pull request, #47605: URL: https://github.com/apache/spark/pull/47605 ### What changes were proposed in this pull request? This PR is a followup of https://github.com/apache/spark/pull/47579 that updates the Spark Connect location in the test `PlanGenerationTe

Re: [PR] [SPARK-44239][SQL] Free memory allocated by large vectors when vectors are reset [spark]

2024-08-05 Thread via GitHub
yaooqinn commented on code in PR #41782: URL: https://github.com/apache/spark/pull/41782#discussion_r1703826363 ## sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java: ## @@ -69,6 +71,12 @@ public void reset() { putNotNulls(0, capaci

Re: [PR] [SPARK-48791][CORE][FOLLOW-UP] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-05 Thread via GitHub
yaooqinn commented on code in PR #47578: URL: https://github.com/apache/spark/pull/47578#discussion_r1703833305 ## core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala: ## @@ -272,8 +270,9 @@ class TaskMetrics private[spark] () extends Serializable { */ @transi

Re: [PR] [SPARK-48763][TESTS][FOLLOW-UP] Update project location in PlanGenerationTestSuite [spark]

2024-08-05 Thread via GitHub
HyukjinKwon closed pull request #47605: [SPARK-48763][TESTS][FOLLOW-UP] Update project location in PlanGenerationTestSuite URL: https://github.com/apache/spark/pull/47605

Re: [PR] [SPARK-48763][TESTS][FOLLOW-UP] Update project location in PlanGenerationTestSuite [spark]

2024-08-05 Thread via GitHub
HyukjinKwon commented on PR #47605: URL: https://github.com/apache/spark/pull/47605#issuecomment-2268680063 Merged to master.

Re: [PR] [SPARK-49108][EXAMPLE] Add `submit_pi.sh` REST API example [spark]

2024-08-05 Thread via GitHub
yaooqinn closed pull request #47601: [SPARK-49108][EXAMPLE] Add `submit_pi.sh` REST API example URL: https://github.com/apache/spark/pull/47601

Re: [PR] [SPARK-49108][EXAMPLE] Add `submit_pi.sh` REST API example [spark]

2024-08-05 Thread via GitHub
yaooqinn commented on PR #47601: URL: https://github.com/apache/spark/pull/47601#issuecomment-2268686337 Merged to master. Thank you @dongjoon-hyun @viirya @HyukjinKwon

Re: [PR] [SPARK-49107][SQL] `ROUTINE_ALREADY_EXISTS` supports RoutineType [spark]

2024-08-05 Thread via GitHub
yaooqinn closed pull request #47600: [SPARK-49107][SQL] `ROUTINE_ALREADY_EXISTS` supports RoutineType URL: https://github.com/apache/spark/pull/47600

Re: [PR] [SPARK-49107][SQL] `ROUTINE_ALREADY_EXISTS` supports RoutineType [spark]

2024-08-05 Thread via GitHub
yaooqinn commented on PR #47600: URL: https://github.com/apache/spark/pull/47600#issuecomment-2268692047 Merged to master. Thank you @zhengruifeng and all.

Re: [PR] [SPARK-49060][CONNECT] Clean up Mima rules for SQL-Connect binary compatibility checks [spark]

2024-08-05 Thread via GitHub
HyukjinKwon commented on PR #47487: URL: https://github.com/apache/spark/pull/47487#issuecomment-2268705241 Merged to master.

Re: [PR] [SPARK-49060][CONNECT] Clean up Mima rules for SQL-Connect binary compatibility checks [spark]

2024-08-05 Thread via GitHub
HyukjinKwon closed pull request #47487: [SPARK-49060][CONNECT] Clean up Mima rules for SQL-Connect binary compatibility checks URL: https://github.com/apache/spark/pull/47487

Re: [PR] [MINOR][SQL] Remove orphans in ProtoToParsedPlanTestSuite and PlanGenerationTestSuite [spark]

2024-08-05 Thread via GitHub
HyukjinKwon commented on PR #47603: URL: https://github.com/apache/spark/pull/47603#issuecomment-2268739202 Merged to master.

Re: [PR] [MINOR][SQL] Remove orphans in ProtoToParsedPlanTestSuite and PlanGenerationTestSuite [spark]

2024-08-05 Thread via GitHub
HyukjinKwon closed pull request #47603: [MINOR][SQL] Remove orphans in ProtoToParsedPlanTestSuite and PlanGenerationTestSuite URL: https://github.com/apache/spark/pull/47603

Re: [PR] [SPARK-49082][SQL] Widening type promotions in `AvroDeserializer` [spark]

2024-08-05 Thread via GitHub
wayneguow commented on code in PR #47582: URL: https://github.com/apache/spark/pull/47582#discussion_r1703943390 ## connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala: ## @@ -194,6 +200,9 @@ private[sql] class AvroDeserializer( case (FLOAT, Flo

[PR] DataSourceV2Strategy - Move method to object [spark]

2024-08-05 Thread via GitHub
urosstan-db opened a new pull request, #47606: URL: https://github.com/apache/spark/pull/47606 ### What changes were proposed in this pull request? Move static method `withProjectAndFilter` to object in DataSourceV2Strategy ### Why are the changes needed? It provides better oppor

[PR] [WIP] Implement Levenshtein distance for utf8_lcase collation [spark]

2024-08-05 Thread via GitHub
viktorluc-db opened a new pull request, #47607: URL: https://github.com/apache/spark/pull/47607 ### What changes were proposed in this pull request? Supporting Levenshtein distance with utf8_lcase collation. ### Why are the changes needed? Levenshtein distance

Re: [PR] [SPARK-48338][SQL] Improve exceptions thrown from parser/interpreter [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on PR #47553: URL: https://github.com/apache/spark/pull/47553#issuecomment-2268936300 thanks, merging to master!

Re: [PR] [SPARK-48338][SQL] Improve exceptions thrown from parser/interpreter [spark]

2024-08-05 Thread via GitHub
cloud-fan closed pull request #47553: [SPARK-48338][SQL] Improve exceptions thrown from parser/interpreter URL: https://github.com/apache/spark/pull/47553

Re: [PR] [SPARK-48763][TESTS][FOLLOW-UP] Update project location in PlanGenerationTestSuite [spark]

2024-08-05 Thread via GitHub
yaooqinn commented on PR #47605: URL: https://github.com/apache/spark/pull/47605#issuecomment-2268957653 Reverted this to restore CI

Re: [PR] [SPARK-48791][CORE][FOLLOW-UP] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-05 Thread via GitHub
yaooqinn closed pull request #47578: [SPARK-48791][CORE][FOLLOW-UP] Fix regression caused by immutable conversion on TaskMetrics#externalAccums URL: https://github.com/apache/spark/pull/47578

Re: [PR] [SPARK-48791][CORE][FOLLOW-UP] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-05 Thread via GitHub
yaooqinn commented on PR #47578: URL: https://github.com/apache/spark/pull/47578#issuecomment-2268981626 Merged to master, thank you all. Can you make backports for 3.4 and 3.5? @Ngone51

[PR] [SPARK-49112][CONNECT][TEST] Make `createLocalRelationProto` support relation with `TimestampType` [spark]

2024-08-05 Thread via GitHub
zhengruifeng opened a new pull request, #47608: URL: https://github.com/apache/spark/pull/47608 ### What changes were proposed in this pull request? Make `createLocalRelationProto` support relation with `TimestampType` ### Why are the changes needed? existing helper function

Re: [PR] [SPARK-49018][SQL] Fix approx_count_distinct not working correctly with collation [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on PR #47503: URL: https://github.com/apache/spark/pull/47503#issuecomment-2269008837 the protobuf check failure is unrelated, thanks, merging to master!

Re: [PR] [SPARK-49018][SQL] Fix approx_count_distinct not working correctly with collation [spark]

2024-08-05 Thread via GitHub
cloud-fan closed pull request #47503: [SPARK-49018][SQL] Fix approx_count_distinct not working correctly with collation URL: https://github.com/apache/spark/pull/47503

Re: [PR] [SPARK-47430][SQL] Rework group by map type to fix bind reference exception [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on code in PR #47545: URL: https://github.com/apache/spark/pull/47545#discussion_r1704081192 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala: ## @@ -246,8 +246,6 @@ abstract class Optimizer(catalogManager: CatalogManager

Re: [PR] [SPARK-47430][SQL] Rework group by map type to fix bind reference exception [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on code in PR #47545: URL: https://github.com/apache/spark/pull/47545#discussion_r1704078860 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/AddMapSortInAggregate.scala: ## @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] [SPARK-49083][CONNECT] Allow from_xml and from_json to natively work with json schemas [spark]

2024-08-05 Thread via GitHub
hvanhovell commented on PR #47573: URL: https://github.com/apache/spark/pull/47573#issuecomment-2269030845 @dongjoon-hyun yeah, working on it.

[PR] [WIP][SPARK-?][SQL] SQL Scripting execution (including Spark Connect) [spark]

2024-08-05 Thread via GitHub
davidm-db opened a new pull request, #47609: URL: https://github.com/apache/spark/pull/47609 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How w

[PR] [SPARK-48989][SQL][FOLLOWUP] Fix SubstringIndex codegen [spark]

2024-08-05 Thread via GitHub
uros-db opened a new pull request, #47610: URL: https://github.com/apache/spark/pull/47610 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

Re: [PR] [SPARK-48989][SQL][FOLLOWUP] Fix SubstringIndex codegen [spark]

2024-08-05 Thread via GitHub
uros-db commented on PR #47610: URL: https://github.com/apache/spark/pull/47610#issuecomment-2269102691 Following up on https://github.com/apache/spark/pull/47481, adding @miland-db and @cloud-fan to review To recap: - SubstringIndex always treats `count` as an `int`, rather than a

[PR] [SPARK-49113] remove assert from datasource v2 strategy [spark]

2024-08-05 Thread via GitHub
milastdbx opened a new pull request, #47611: URL: https://github.com/apache/spark/pull/47611 ### What changes were proposed in this pull request? ### Why are the changes needed? In this PR I propose that we do not assert and fail queries when `V2ExpressionBu

Re: [PR] [SPARK-48791][CORE][FOLLOW-UP] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-05 Thread via GitHub
Ngone51 commented on PR #47578: URL: https://github.com/apache/spark/pull/47578#issuecomment-2269159908 @yaooqinn Sure, will do. Thanks!

Re: [PR] [SPARK-48989][SQL][FOLLOWUP] Fix SubstringIndex codegen [spark]

2024-08-05 Thread via GitHub
cloud-fan commented on PR #47610: URL: https://github.com/apache/spark/pull/47610#issuecomment-2269167709 shall we add a test?

Re: [PR] [SPARK-48989][SQL][FOLLOWUP] Fix SubstringIndex codegen [spark]

2024-08-05 Thread via GitHub
uros-db commented on PR #47610: URL: https://github.com/apache/spark/pull/47610#issuecomment-2269170590 yes, will add tests & description

[PR] [SPARK-48791][CORE][FOLLOW-UP][3.5] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-05 Thread via GitHub
Ngone51 opened a new pull request, #47612: URL: https://github.com/apache/spark/pull/47612 This PR backports https://github.com/apache/spark/pull/47578 to branch-3.5. ### What changes were proposed in this pull request? This is a followup fix for https://github.com/apache/sp

[PR] [SPARK-48791][CORE][FOLLOW-UP][3.4] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-05 Thread via GitHub
Ngone51 opened a new pull request, #47613: URL: https://github.com/apache/spark/pull/47613 This PR backports https://github.com/apache/spark/pull/47578 to branch-3.4. ### What changes were proposed in this pull request? This is a followup fix for https://github.com/apache/sp

Re: [PR] [SPARK-48791][CORE][FOLLOW-UP] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-05 Thread via GitHub
Ngone51 commented on PR #47578: URL: https://github.com/apache/spark/pull/47578#issuecomment-2269222087 Please take a look at the backport PRs: https://github.com/apache/spark/pull/47612, https://github.com/apache/spark/pull/47613. Thanks!

Re: [PR] [WIP][SPARK-?][SQL] SQL Scripting execution (including Spark Connect) [spark]

2024-08-05 Thread via GitHub
miland-db commented on code in PR #47609: URL: https://github.com/apache/spark/pull/47609#discussion_r1704152082 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/SqlScriptingLogicalPlans.scala: ## @@ -73,4 +84,28 @@ case class IfElseStatement( cond

Re: [PR] [WIP][SPARK-?][SQL] SQL Scripting execution (including Spark Connect) [spark]

2024-08-05 Thread via GitHub
davidm-db commented on code in PR #47609: URL: https://github.com/apache/spark/pull/47609#discussion_r1704231326 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/SqlScriptingLogicalPlans.scala: ## @@ -73,4 +84,28 @@ case class IfElseStatement( cond

Re: [PR] [WIP][SPARK-?][SQL] SQL Scripting execution (including Spark Connect) [spark]

2024-08-05 Thread via GitHub
davidm-db commented on code in PR #47609: URL: https://github.com/apache/spark/pull/47609#discussion_r1704235261 ## sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingInterpreter.scala: ## @@ -82,20 +89,36 @@ case class SqlScriptingInterpreter() { .map

Re: [PR] [WIP][SPARK-?][SQL] SQL Scripting execution (including Spark Connect) [spark]

2024-08-05 Thread via GitHub
davidm-db commented on code in PR #47609: URL: https://github.com/apache/spark/pull/47609#discussion_r1704237126 ## sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingInterpreterSuite.scala: ## @@ -27,29 +29,44 @@ import org.apache.spark.sql.test.SharedSparkSessi

Re: [PR] [WIP][SPARK-48344][SQL] SQL Scripting execution (including Spark Connect) [spark]

2024-08-05 Thread via GitHub
davidm-db commented on code in PR #47609: URL: https://github.com/apache/spark/pull/47609#discussion_r1704256694 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/SqlScriptingLogicalPlans.scala: ## @@ -15,37 +15,40 @@ * limitations under the License.

Re: [PR] [SPARK-49014][BUILD] Bump Apache Avro to 1.12.0 [spark]

2024-08-05 Thread via GitHub
Fokko commented on PR #47498: URL: https://github.com/apache/spark/pull/47498#issuecomment-2269348151 @dongjoon-hyun And it has been published :)

Re: [PR] [SPARK-49108][EXAMPLE] Add `submit_pi.sh` REST API example [spark]

2024-08-05 Thread via GitHub
dongjoon-hyun commented on PR #47601: URL: https://github.com/apache/spark/pull/47601#issuecomment-2269399368 Thank you, @yaooqinn and @HyukjinKwon !

Re: [PR] [SPARK-49014][BUILD] Bump Apache Avro to 1.12.0 [spark]

2024-08-05 Thread via GitHub
dongjoon-hyun commented on PR #47498: URL: https://github.com/apache/spark/pull/47498#issuecomment-2269440537 Thank you so much, @Fokko !

Re: [PR] [SPARK-49014][BUILD] Bump Apache Avro to 1.12.0 [spark]

2024-08-05 Thread via GitHub
dongjoon-hyun commented on PR #47498: URL: https://github.com/apache/spark/pull/47498#issuecomment-2269449182 BTW, is this announced in Apache Avro website or GitHub repo? I cannot find this release yet. - https://github.com/apache/avro/releases/ - https://avro.apache.org/project/downl

[PR] [SPARK-48824] Add Identity Column sql syntax [spark]

2024-08-05 Thread via GitHub
zhipengmao-db opened a new pull request, #47614: URL: https://github.com/apache/spark/pull/47614 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### H

Re: [PR] [SPARK-49004][CONNECT] Register Column API internal functions in separate namespace [spark]

2024-08-05 Thread via GitHub
hvanhovell commented on code in PR #47572: URL: https://github.com/apache/spark/pull/47572#discussion_r1704379231 ## sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriterV2.scala: ## @@ -235,6 +235,22 @@ final class DataFrameWriterV2[T] private[sql](table: String, ds: Da

Re: [PR] [SPARK-49004][CONNECT] Register Column API internal functions in separate namespace [spark]

2024-08-05 Thread via GitHub
hvanhovell commented on code in PR #47572: URL: https://github.com/apache/spark/pull/47572#discussion_r1704380657 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -2338,10 +2342,12 @@ class Analyzer(override val catalogManager: Catalog

Re: [PR] [SPARK-49004][CONNECT] Register Column API internal functions in separate namespace [spark]

2024-08-05 Thread via GitHub
hvanhovell commented on code in PR #47572: URL: https://github.com/apache/spark/pull/47572#discussion_r1704382405 ## connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -1614,14 +1614,23 @@ class SparkConnectPlanner( fun: pr

Re: [PR] [SPARK-49014][BUILD] Bump Apache Avro to 1.12.0 [spark]

2024-08-05 Thread via GitHub
Fokko commented on PR #47498: URL: https://github.com/apache/spark/pull/47498#issuecomment-2269531491 @dongjoon-hyun We're in the process of releasing which is quite an effort since we do all the languages simultaneously. When all the convenience binaries are published, the announcement ema

Re: [PR] [SPARK-48791][CORE][FOLLOW-UP][3.5] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-05 Thread via GitHub
dongjoon-hyun closed pull request #47612: [SPARK-48791][CORE][FOLLOW-UP][3.5] Fix regression caused by immutable conversion on TaskMetrics#externalAccums URL: https://github.com/apache/spark/pull/47612

[PR] Fixed comma splice in cluster-overview.md [spark]

2024-08-05 Thread via GitHub
j-j-wright opened a new pull request, #47615: URL: https://github.com/apache/spark/pull/47615 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-48791][CORE][FOLLOW-UP][3.5] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-05 Thread via GitHub
dongjoon-hyun commented on PR #47612: URL: https://github.com/apache/spark/pull/47612#issuecomment-2269530755 Merged to branch-3.5. cc @yaooqinn

Re: [PR] [SPARK-49004][CONNECT] Use separate registry for Column API internal functions [spark]

2024-08-05 Thread via GitHub
hvanhovell commented on code in PR #47572: URL: https://github.com/apache/spark/pull/47572#discussion_r1704380657 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -2338,10 +2342,12 @@ class Analyzer(override val catalogManager: Catalog

Re: [PR] [SPARK-48791][CORE][FOLLOW-UP][3.4] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-05 Thread via GitHub
dongjoon-hyun commented on PR #47613: URL: https://github.com/apache/spark/pull/47613#issuecomment-2269570807 Thank you, @Ngone51 and @mridulm . Merged to branch-3.4. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] [SPARK-48791][CORE][FOLLOW-UP][3.4] Fix regression caused by immutable conversion on TaskMetrics#externalAccums [spark]

2024-08-05 Thread via GitHub
dongjoon-hyun closed pull request #47613: [SPARK-48791][CORE][FOLLOW-UP][3.4] Fix regression caused by immutable conversion on TaskMetrics#externalAccums URL: https://github.com/apache/spark/pull/47613 -- This is an automated message from the Apache Git Service. To respond to the message, ple

Re: [PR] [SPARK-48949][SQL] SPJ: Runtime partition filtering [spark]

2024-08-05 Thread via GitHub
szehon-ho commented on PR #47426: URL: https://github.com/apache/spark/pull/47426#issuecomment-2269612624 Thank you @sunchao and @dongjoon-hyun for quick review! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[PR] [SPARK-49114] Subcategorize cannot load state store errors [spark]

2024-08-05 Thread via GitHub
riyaverm-db opened a new pull request, #47616: URL: https://github.com/apache/spark/pull/47616 ### What changes were proposed in this pull request? - Reclassify an OOM error for both state store providers in loading state store and throw an informative error message - Wi

Re: [PR] [SPARK-48821][SQL] Support Update in DataFrameWriterV2 [spark]

2024-08-05 Thread via GitHub
szehon-ho commented on PR #47233: URL: https://github.com/apache/spark/pull/47233#issuecomment-2269643868 @huaxingao @cloud-fan it makes sense, I made an attempt to move from DataFrame to SparkSession as suggested. I initially keep set() and where() API as it reads more like the SQL statem

Re: [PR] [SPARK-49114] Subcategorize cannot load state store errors [spark]

2024-08-05 Thread via GitHub
chaoqin-li1123 commented on code in PR #47616: URL: https://github.com/apache/spark/pull/47616#discussion_r1704482755 ## sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala: ## @@ -2608,6 +2608,30 @@ private[sql] object QueryExecutionErrors extends

Re: [PR] [SPARK-49114] Subcategorize cannot load state store errors [spark]

2024-08-05 Thread via GitHub
anishshri-db commented on code in PR #47616: URL: https://github.com/apache/spark/pull/47616#discussion_r1704494978 ## sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala: ## @@ -2608,6 +2608,30 @@ private[sql] object QueryExecutionErrors extends

Re: [PR] [SPARK-49048][SS] Add support for reading relevant operator metadata at given batch id [spark]

2024-08-05 Thread via GitHub
anishshri-db commented on code in PR #47528: URL: https://github.com/apache/spark/pull/47528#discussion_r1704544977 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/metadata/StateMetadataSource.scala: ## @@ -207,12 +217,9 @@ class StateMetadataParti

Re: [PR] [SPARK-49048][SS] Add support for reading relevant operator metadata at given batch id [spark]

2024-08-05 Thread via GitHub
anishshri-db commented on code in PR #47528: URL: https://github.com/apache/spark/pull/47528#discussion_r1704546506 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TransformWithStateExec.scala: ## @@ -439,8 +447,9 @@ case class TransformWithStateExec(

Re: [PR] [SPARK-49048][SS] Add support for reading relevant operator metadata at given batch id [spark]

2024-08-05 Thread via GitHub
anishshri-db commented on code in PR #47528: URL: https://github.com/apache/spark/pull/47528#discussion_r1704547296 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StateDataSourceErrors.scala: ## @@ -63,6 +63,12 @@ object StateDataSourceErrors {

Re: [PR] [SPARK-49048][SS] Add support for reading relevant operator metadata at given batch id [spark]

2024-08-05 Thread via GitHub
anishshri-db commented on code in PR #47528: URL: https://github.com/apache/spark/pull/47528#discussion_r1704547549 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/OperatorStateMetadata.scala: ## @@ -291,30 +312,54 @@ class OperatorStateMetadataV2Writer

Re: [PR] [SPARK-49048][SS] Add support for reading relevant operator metadata at given batch id [spark]

2024-08-05 Thread via GitHub
anishshri-db commented on code in PR #47528: URL: https://github.com/apache/spark/pull/47528#discussion_r1704549390 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/OperatorStateMetadata.scala: ## @@ -291,30 +312,54 @@ class OperatorStateMetadataV2Writer

Re: [PR] [SPARK-49083][CONNECT] Allow from_xml and from_json to natively work with json schemas [spark]

2024-08-05 Thread via GitHub
hvanhovell commented on code in PR #47573: URL: https://github.com/apache/spark/pull/47573#discussion_r1704565599 ## sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala: ## @@ -1204,42 +1204,32 @@ class JsonFunctionsSuite extends QueryTest with SharedSparkSess

Re: [PR] [SPARK-48949][SQL] SPJ: Runtime partition filtering [spark]

2024-08-05 Thread via GitHub
viirya commented on code in PR #47426: URL: https://github.com/apache/spark/pull/47426#discussion_r1704570702 ## sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala: ## @@ -429,8 +429,19 @@ case class EnsureRequirements( // expressio

Re: [PR] [SPARK-49083][CONNECT] Allow from_xml and from_json to natively work with json schemas [spark]

2024-08-05 Thread via GitHub
hvanhovell commented on code in PR #47573: URL: https://github.com/apache/spark/pull/47573#discussion_r1704570893 ## sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala: ## @@ -1204,42 +1204,32 @@ class JsonFunctionsSuite extends QueryTest with SharedSparkSess

Re: [PR] [SPARK-49048][SS] Add support for reading relevant operator metadata at given batch id [spark]

2024-08-05 Thread via GitHub
anishshri-db commented on code in PR #47528: URL: https://github.com/apache/spark/pull/47528#discussion_r1704572305 ## sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/OperatorStateMetadataSuite.scala: ## @@ -133,6 +154,21 @@ class OperatorStateMetadataSuit

Re: [PR] [SPARK-49048][SS] Add support for reading relevant operator metadata at given batch id [spark]

2024-08-05 Thread via GitHub
anishshri-db commented on code in PR #47528: URL: https://github.com/apache/spark/pull/47528#discussion_r1704578675 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/metadata/StateMetadataSource.scala: ## @@ -104,7 +104,15 @@ class StateMetadataTable

Re: [PR] [SPARK 49109][SQL] Rename leftover BinaryLcase to Lcase [spark]

2024-08-05 Thread via GitHub
mihailom-db commented on PR #47602: URL: https://github.com/apache/spark/pull/47602#issuecomment-2269826869 @cloud-fan Could you merge this renaming? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] [MINOR][SQL] Remove orphans in ProtoToParsedPlanTestSuite and PlanGenerationTestSuite [spark]

2024-08-05 Thread via GitHub
hvanhovell commented on PR #47603: URL: https://github.com/apache/spark/pull/47603#issuecomment-2269837655 @HyukjinKwon are you sure these orphans do not represent a plan that can be created by an older version of the connect client? If so, then we should keep them. -- This is an automat
