[PR] [SPARK-51773][SQL] Add hashCode and equals [spark]

2025-04-11 Thread via GitHub
vladimirg-db opened a new pull request, #50562: URL: https://github.com/apache/spark/pull/50562 ### What changes were proposed in this pull request? Add `hashCode` and `equals` to file formats ### Why are the changes needed? This is necessary to make `LogicalRelation`s co
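The rationale (making `LogicalRelation`s comparable) hinges on structural equality of the embedded file format. As a hedged sketch of the pattern only — modeled in Python, with an invented class name and fields, not Spark's actual `FileFormat` API — value-based equality and hashing look like this:

```python
# Hypothetical format class illustrating value-based equality: two instances
# built from the same options compare equal and hash identically, so plans
# that embed them can be deduplicated. Not Spark's actual class.
class ParquetLikeFormat:
    def __init__(self, compression, merge_schema):
        self.compression = compression
        self.merge_schema = merge_schema

    def __eq__(self, other):
        return (isinstance(other, ParquetLikeFormat)
                and self.compression == other.compression
                and self.merge_schema == other.merge_schema)

    def __hash__(self):
        # Hash over the same fields used by __eq__, keeping the contract
        # that equal objects have equal hashes.
        return hash((self.compression, self.merge_schema))

a = ParquetLikeFormat("snappy", True)
b = ParquetLikeFormat("snappy", True)
print(a == b)              # True: structural equality
print(hash(a) == hash(b))  # True: consistent hashing
```

Without such overrides, two otherwise identical relations fall back to reference equality and never compare equal.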

Re: [PR] [SPARK-51769][SQL] Add maxRecordsPerOutputBatch to limit the number of record of Arrow output batch [spark]

2025-04-11 Thread via GitHub
viirya closed pull request #50301: [SPARK-51769][SQL] Add maxRecordsPerOutputBatch to limit the number of record of Arrow output batch URL: https://github.com/apache/spark/pull/50301 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
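The idea behind `maxRecordsPerOutputBatch` can be modeled outside Arrow: re-chunk incoming batches so no output batch exceeds the limit. A hedged plain-Python sketch using lists, not Arrow's actual batch API:

```python
def limit_batch_size(batches, max_records):
    # Split each incoming batch so no yielded batch exceeds max_records;
    # batches already at or under the limit pass through whole.
    for batch in batches:
        for start in range(0, len(batch), max_records):
            yield batch[start:start + max_records]

out = list(limit_batch_size([[1, 2, 3, 4, 5], [6, 7]], max_records=2))
print(out)  # [[1, 2], [3, 4], [5], [6, 7]]
```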

Re: [PR] [SPARK-49653][SQL] Single join for correlated scalar subqueries [spark]

2025-04-11 Thread via GitHub
iwanttobepowerful commented on PR #48145: URL: https://github.com/apache/spark/pull/48145#issuecomment-2796202913 ```sql create table correlated_scalar_t1(c1 bigint, c2 bigint); create table correlated_scalar_t2(c1 bigint, c2 bigint); create table correlated_scalar_t3(c1 bigint, c2 b
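The semantics any single-join rewrite must preserve: a correlated scalar subquery yields NULL for zero matching rows, the value for exactly one row, and a runtime error for more. A hedged plain-Python model of that cardinality rule, not Spark's implementation:

```python
def scalar_value(rows):
    # Cardinality rule for a scalar subquery's result set.
    if len(rows) == 0:
        return None          # empty result -> NULL
    if len(rows) == 1:
        return rows[0]       # exactly one row -> its value
    raise ValueError("Scalar subquery returned more than one row")

print(scalar_value([42]))  # 42
print(scalar_value([]))    # None
```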

[PR] [SPARK-51775][SQL] Normalize LogicalRelation and HiveTableRelation by NormalizePlan [spark]

2025-04-11 Thread via GitHub
vladimirg-db opened a new pull request, #50563: URL: https://github.com/apache/spark/pull/50563 ### What changes were proposed in this pull request? Normalize `LogicalRelation` and `HiveTableRelation` by `NormalizePlan`. ### Why are the changes needed? To make single-pass

[PR] [SPARK-51774][CONNECT] Add GRPC Status code to Python Connect GRPC Exception [spark]

2025-04-11 Thread via GitHub
heyihong opened a new pull request, #50564: URL: https://github.com/apache/spark/pull/50564 ### What changes were proposed in this pull request? - Add GRPC Status code to Python Connect GRPC Exception ### Why are the changes needed? - Users can use grpc st
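As a hedged illustration of the goal — letting callers branch on the gRPC status code — here is a minimal Python sketch with an invented exception class; it is not the actual pyspark Connect exception type:

```python
class ConnectGrpcException(Exception):
    # Hypothetical: carries the gRPC status code alongside the message
    # so error-handling code can branch on it.
    def __init__(self, message, status_code):
        super().__init__(message)
        self.status_code = status_code

err = None
try:
    raise ConnectGrpcException("deadline exceeded", status_code=4)
except ConnectGrpcException as exc:
    err = exc

print(err.status_code)  # 4 (gRPC DEADLINE_EXCEEDED)
```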

[PR] [SPARK-51776][SQL] Fix logging in single-pass Analyzer [spark]

2025-04-11 Thread via GitHub
vladimirg-db opened a new pull request, #50565: URL: https://github.com/apache/spark/pull/50565 ### What changes were proposed in this pull request? Fix logging in single-pass Analyzer. ### Why are the changes needed? `lookupMetadataAndResolve` logging is out of place.

Re: [PR] [SPARK-51757] Fix LEAD/LAG Function Offset Exceeds Partition Size [spark]

2025-04-11 Thread via GitHub
cloud-fan commented on code in PR #50552: URL: https://github.com/apache/spark/pull/50552#discussion_r2039125858 ## sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala: ## @@ -183,7 +183,8 @@ abstract class OffsetWindowFunctionFrameBase( ov
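The bug class under discussion: when a LEAD/LAG offset runs past the partition, the frame should produce the default value rather than read out of range. A hedged plain-Python model of that semantics, not `OffsetWindowFunctionFrameBase` itself:

```python
def lead(partition, i, offset, default):
    # LEAD(value, offset, default): look offset rows ahead of row i within
    # the partition; fall back to default when the slot is out of range.
    j = i + offset
    return partition[j] if 0 <= j < len(partition) else default

part = ["a", "b", "c"]
print(lead(part, 1, 1, "-"))  # "c": one row ahead of index 1
print(lead(part, 1, 5, "-"))  # "-": offset exceeds partition size
```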

Re: [PR] [SPARK-51646][SQL] Fix propagating collation in views with default collation [spark]

2025-04-11 Thread via GitHub
cloud-fan commented on code in PR #50436: URL: https://github.com/apache/spark/pull/50436#discussion_r2039134089 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveDDLCommandStringTypes.scala: ## @@ -79,6 +83,8 @@ object ResolveDDLCommandStringTypes ext

Re: [PR] [SPARK-51182][SQL] DataFrameWriter should throw dataPathNotSpecifiedError when path is not specified [spark]

2025-04-11 Thread via GitHub
cloud-fan commented on PR #49928: URL: https://github.com/apache/spark/pull/49928#issuecomment-2796299257 We need basic functionality tests in different languages, but regression tests shouldn't go there if it's not language-specific.

Re: [PR] [SPARK-50131][SQL] Add IN Subquery DataFrame API [spark]

2025-04-11 Thread via GitHub
cloud-fan commented on code in PR #50470: URL: https://github.com/apache/spark/pull/50470#discussion_r2039135991 ## sql/api/src/main/scala/org/apache/spark/sql/Column.scala: ## @@ -803,6 +803,26 @@ class Column(val node: ColumnNode) extends Logging with TableValuedFunctionArgum

Re: [PR] [SPARK-51646][SQL] Fix propagating collation in views with default collation [spark]

2025-04-11 Thread via GitHub
vladimirg-db commented on code in PR #50436: URL: https://github.com/apache/spark/pull/50436#discussion_r2039136459 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveDDLCommandStringTypes.scala: ## @@ -79,6 +83,8 @@ object ResolveDDLCommandStringTypes

Re: [PR] [SPARK-51423][SQL] Add the current_time() function for TIME datatype [spark]

2025-04-11 Thread via GitHub
MaxGekk closed pull request #50336: [SPARK-51423][SQL] Add the current_time() function for TIME datatype URL: https://github.com/apache/spark/pull/50336

Re: [PR] [SPARK-51423][SQL] Add the current_time() function for TIME datatype [spark]

2025-04-11 Thread via GitHub
MaxGekk commented on PR #50336: URL: https://github.com/apache/spark/pull/50336#issuecomment-2796312063 +1, LGTM. Merging to master. Thank you, @the-sakthi.

Re: [PR] [SPARK-51420][SQL][FOLLOWUP] Support all valid TIME precisions in the minute function [spark]

2025-04-11 Thread via GitHub
MaxGekk closed pull request #50551: [SPARK-51420][SQL][FOLLOWUP] Support all valid TIME precisions in the minute function URL: https://github.com/apache/spark/pull/50551

Re: [PR] [SPARK-51423][SQL] Add the current_time() function for TIME datatype [spark]

2025-04-11 Thread via GitHub
the-sakthi commented on PR #50336: URL: https://github.com/apache/spark/pull/50336#issuecomment-2796364972 Thank you very much @MaxGekk !

Re: [PR] [SPARK-51420][SQL][FOLLOWUP] Support all valid TIME precisions in the minute function [spark]

2025-04-11 Thread via GitHub
the-sakthi commented on PR #50551: URL: https://github.com/apache/spark/pull/50551#issuecomment-2796366776 Thank you very much, @MaxGekk !

Re: [PR] [SPARK-51420][SQL] Get minutes of TIME datatype [spark]

2025-04-11 Thread via GitHub
the-sakthi commented on code in PR #50296: URL: https://github.com/apache/spark/pull/50296#discussion_r2039182055 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala: ## @@ -160,3 +161,75 @@ object TryToTimeExpressionBuilder extends Ex

[PR] [SPARK-51777][SQL][CORE] Register sql.columnar.* classes to KryoSerializer [spark]

2025-04-11 Thread via GitHub
yaooqinn opened a new pull request, #50566: URL: https://github.com/apache/spark/pull/50566 ### What changes were proposed in this pull request? Register sql.columnar.* classes to KryoSerializer ### Why are the changes needed? Satisfy cache query cases when `spark

Re: [PR] [SPARK-51420][SQL][FOLLOWUP] Support all valid TIME precisions in the minute function [spark]

2025-04-11 Thread via GitHub
MaxGekk commented on PR #50551: URL: https://github.com/apache/spark/pull/50551#issuecomment-2796352636 +1, LGTM. Merging to master. Thank you, @the-sakthi.

Re: [PR] [SPARK-51757] Fix LEAD/LAG Function Offset Exceeds Partition Size [spark]

2025-04-11 Thread via GitHub
beliefer commented on PR #50552: URL: https://github.com/apache/spark/pull/50552#issuecomment-279675 Could you check the description `The current implementation of the write in OffsetWindowFunctionFrameBase:` Where is it?

Re: [PR] [SPARK-51757] Fix LEAD/LAG Function Offset Exceeds Partition Size [spark]

2025-04-11 Thread via GitHub
beliefer commented on code in PR #50552: URL: https://github.com/apache/spark/pull/50552#discussion_r2039245906 ## sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala: ## @@ -183,7 +183,8 @@ abstract class OffsetWindowFunctionFrameBase( ove

Re: [PR] [SPARK-51757] Fix LEAD/LAG Function Offset Exceeds Partition Size [spark]

2025-04-11 Thread via GitHub
beliefer commented on code in PR #50552: URL: https://github.com/apache/spark/pull/50552#discussion_r2039271225 ## sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala: ## @@ -183,7 +183,8 @@ abstract class OffsetWindowFunctionFrameBase( ove

Re: [PR] [SPARK-51414][SQL] Add the make_time() function [spark]

2025-04-11 Thread via GitHub
MaxGekk commented on code in PR #50269: URL: https://github.com/apache/spark/pull/50269#discussion_r2039329478 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala: ## @@ -349,3 +349,49 @@ object SecondExpressionBuilder extends Expressio

[PR] Revert "[SPARK-47895][SQL] group by alias should be idempotent" [spark]

2025-04-11 Thread via GitHub
mihailotim-db opened a new pull request, #50567: URL: https://github.com/apache/spark/pull/50567 This reverts commit 70e0e20b985f7d497541826ea69ebd5c8c8c5c39. ### What changes were proposed in this pull request? ### Why are the changes needed? ###

Re: [PR] [SPARK-51752][SQL] Enable rCTE referencing from within a CTE [spark]

2025-04-11 Thread via GitHub
cloud-fan commented on code in PR #50546: URL: https://github.com/apache/spark/pull/50546#discussion_r2039394412 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CTESubstitution.scala: ## @@ -202,14 +202,18 @@ object CTESubstitution extends Rule[LogicalPlan]

Re: [PR] [SPARK-51752][SQL] Enable rCTE referencing from within a CTE [spark]

2025-04-11 Thread via GitHub
cloud-fan commented on code in PR #50546: URL: https://github.com/apache/spark/pull/50546#discussion_r2039396283 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CTESubstitution.scala: ## @@ -220,18 +224,31 @@ object CTESubstitution extends Rule[LogicalPlan]

Re: [PR] [SPARK-51752][SQL] Enable rCTE referencing from within a CTE [spark]

2025-04-11 Thread via GitHub
cloud-fan commented on code in PR #50546: URL: https://github.com/apache/spark/pull/50546#discussion_r2039399624 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveWithCTE.scala: ## @@ -83,8 +83,21 @@ object ResolveWithCTE extends Rule[LogicalPlan] {

Re: [PR] [SPARK-51752][SQL] Enable rCTE referencing from within a CTE [spark]

2025-04-11 Thread via GitHub
cloud-fan commented on code in PR #50546: URL: https://github.com/apache/spark/pull/50546#discussion_r2039400525 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveWithCTE.scala: ## @@ -83,8 +83,21 @@ object ResolveWithCTE extends Rule[LogicalPlan] {

Re: [PR] [SPARK-51638][CORE] Fix fetching the remote disk stored RDD blocks via the external shuffle service [spark]

2025-04-11 Thread via GitHub
attilapiros commented on PR #50439: URL: https://github.com/apache/spark/pull/50439#issuecomment-2796989479 cc @peter-toth

Re: [PR] [SPARK-51638][CORE] Fix fetching the remote disk stored RDD blocks via the external shuffle service [spark]

2025-04-11 Thread via GitHub
peter-toth commented on code in PR #50439: URL: https://github.com/apache/spark/pull/50439#discussion_r2039636849 ## core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala: ## @@ -863,36 +863,36 @@ class BlockManagerMasterEndpoint( blockId: BlockId,

Re: [PR] [SPARK-51752][SQL] Enable rCTE referencing from within a CTE [spark]

2025-04-11 Thread via GitHub
Pajaraja commented on code in PR #50546: URL: https://github.com/apache/spark/pull/50546#discussion_r2039667154 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CTESubstitution.scala: ## @@ -220,18 +224,31 @@ object CTESubstitution extends Rule[LogicalPlan]

Re: [PR] [SPARK-51182][SQL] DataFrameWriter should throw dataPathNotSpecifiedError when path is not specified [spark]

2025-04-11 Thread via GitHub
vrozov commented on PR #49928: URL: https://github.com/apache/spark/pull/49928#issuecomment-2797149216 This is really confusing. How do you define basic functionality tests vs regression tests and why error message will go not into API test and be qualified as regression test. IMO, initial

Re: [PR] [SPARK-51728][SQL] Add SELECT EXCEPT Support [spark]

2025-04-11 Thread via GitHub
ik8 commented on PR #50536: URL: https://github.com/apache/spark/pull/50536#issuecomment-2797339700 This is already implemented [here](https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select.html): When spark.sql.parser.quotedRegexColumnNames is true, quoted identifiers (using bac
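The `quotedRegexColumnNames` workaround mentioned here amounts to selecting columns whose names match a regex, so a negative lookahead can emulate `SELECT * EXCEPT (col)`. A hedged Python sketch of the matching logic only; Spark applies the regex during parsing, not like this:

```python
import re

def select_by_regex(columns, pattern):
    # Keep only the column names the pattern fully matches.
    rx = re.compile(pattern)
    return [c for c in columns if rx.fullmatch(c)]

cols = ["id", "name", "secret", "created_at"]
# Negative lookahead excludes one column, emulating SELECT * EXCEPT (secret):
print(select_by_regex(cols, r"(?!secret$).*"))  # ['id', 'name', 'created_at']
```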

Re: [PR] [SPARK-51764] Add collation to AnalysisContext doc [spark]

2025-04-11 Thread via GitHub
ilicmarkodb closed pull request #50556: [SPARK-51764] Add collation to AnalysisContext doc URL: https://github.com/apache/spark/pull/50556

[PR] [SPARK-51778][SQL] Close SQL test gaps discovered during single-pass Analyzer implementation [spark]

2025-04-11 Thread via GitHub
vladimirg-db opened a new pull request, #50568: URL: https://github.com/apache/spark/pull/50568 ### What changes were proposed in this pull request? Close SQL test gaps discovered during single-pass Analyzer implementation. ### Why are the changes needed? To make Spark te

Re: [PR] [SPARK-51752][SQL] Enable rCTE referencing from within a CTE [spark]

2025-04-11 Thread via GitHub
Pajaraja commented on code in PR #50546: URL: https://github.com/apache/spark/pull/50546#discussion_r2039960513 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveWithCTE.scala: ## @@ -83,8 +83,21 @@ object ResolveWithCTE extends Rule[LogicalPlan] {

Re: [PR] [SPARK-50131][SQL] Add IN Subquery DataFrame API [spark]

2025-04-11 Thread via GitHub
ueshin commented on code in PR #50470: URL: https://github.com/apache/spark/pull/50470#discussion_r2040085100 ## sql/api/src/main/scala/org/apache/spark/sql/Column.scala: ## @@ -803,6 +803,26 @@ class Column(val node: ColumnNode) extends Logging with TableValuedFunctionArgum

Re: [PR] [SPARK-51419][SQL][FOLLOWUP] Making hour function to accept any precision of TIME type [spark]

2025-04-11 Thread via GitHub
senthh commented on PR #50554: URL: https://github.com/apache/spark/pull/50554#issuecomment-2797729370 > Could you add a check for the input types, see for instance #50551 Yes @MaxGekk I have added test as you mentioned

Re: [PR] [SPARK-50131][SQL] Add IN Subquery DataFrame API [spark]

2025-04-11 Thread via GitHub
ueshin commented on code in PR #50470: URL: https://github.com/apache/spark/pull/50470#discussion_r2040102986 ## sql/api/src/main/scala/org/apache/spark/sql/Column.scala: ## @@ -803,6 +803,26 @@ class Column(val node: ColumnNode) extends Logging with TableValuedFunctionArgum

Re: [PR] [SPARK-51756][CORE] Computes RowBasedChecksum in ShuffleWriters [spark]

2025-04-11 Thread via GitHub
mridulm commented on code in PR #50230: URL: https://github.com/apache/spark/pull/50230#discussion_r2040084155 ## core/src/main/java/org/apache/spark/shuffle/checksum/RowBasedChecksum.scala: ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] [SPARK-51756][CORE] Computes RowBasedChecksum in ShuffleWriters [spark]

2025-04-11 Thread via GitHub
mridulm commented on code in PR #50230: URL: https://github.com/apache/spark/pull/50230#discussion_r2040102976 ## core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java: ## @@ -171,7 +182,11 @@ public void write(Iterator> records) throws IOException

Re: [PR] [SPARK-51756][CORE] Computes RowBasedChecksum in ShuffleWriters [spark]

2025-04-11 Thread via GitHub
mridulm commented on code in PR #50230: URL: https://github.com/apache/spark/pull/50230#discussion_r2040129005 ## core/src/main/scala/org/apache/spark/MapOutputTracker.scala: ## @@ -169,6 +174,12 @@ private class ShuffleStatus( } else { mapIdToMapIndex.remove(current

Re: [PR] [SPARK-50131][SQL] Add IN Subquery DataFrame API [spark]

2025-04-11 Thread via GitHub
ueshin commented on code in PR #50470: URL: https://github.com/apache/spark/pull/50470#discussion_r2040102986 ## sql/api/src/main/scala/org/apache/spark/sql/Column.scala: ## @@ -803,6 +803,26 @@ class Column(val node: ColumnNode) extends Logging with TableValuedFunctionArgum

Re: [PR] [SPARK-51775][SQL] Normalize LogicalRelation and HiveTableRelation by NormalizePlan [spark]

2025-04-11 Thread via GitHub
mihailoale-db commented on code in PR #50563: URL: https://github.com/apache/spark/pull/50563#discussion_r2040207320 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/NormalizeableRelation.scala: ## @@ -0,0 +1,29 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] [SPARK-51775][SQL] Normalize LogicalRelation and HiveTableRelation by NormalizePlan [spark]

2025-04-11 Thread via GitHub
vladimirg-db commented on code in PR #50563: URL: https://github.com/apache/spark/pull/50563#discussion_r2040229416 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/NormalizeableRelation.scala: ## @@ -0,0 +1,29 @@ +/* + * Licensed to the Apache Software Foun

[PR] [SPARK-51780][SQL] Implement Describe Procedure [spark]

2025-04-11 Thread via GitHub
szehon-ho opened a new pull request, #50569: URL: https://github.com/apache/spark/pull/50569 ### What changes were proposed in this pull request? Implement 'describe procedures' ### Why are the changes needed? User need to understand a procedure before calling it.

Re: [PR] [SPARK-51419][SQL][FOLLOWUP] Making hour function to accept any precision of TIME type [spark]

2025-04-11 Thread via GitHub
MaxGekk commented on PR #50554: URL: https://github.com/apache/spark/pull/50554#issuecomment-2797942085 I think the test failure is not related to the changes: ``` [info] - SPARK-41224: collect data using arrow *** FAILED *** (33 milliseconds) [info] VerifyEvents.this.executeHolde

Re: [PR] [SPARK-51419][SQL][FOLLOWUP] Making hour function to accept any precision of TIME type [spark]

2025-04-11 Thread via GitHub
MaxGekk closed pull request #50554: [SPARK-51419][SQL][FOLLOWUP] Making hour function to accept any precision of TIME type URL: https://github.com/apache/spark/pull/50554

Re: [PR] [SPARK-51419][SQL][FOLLOWUP] Making hour function to accept any precision of TIME type [spark]

2025-04-11 Thread via GitHub
MaxGekk commented on PR #50554: URL: https://github.com/apache/spark/pull/50554#issuecomment-2797943966 +1, LGTM. Merging to master. Thank you, @senthh.

Re: [PR] [SPARK-51553][SQL] Modify EXTRACT to support TIME data type [spark]

2025-04-11 Thread via GitHub
vinodkc commented on PR #50558: URL: https://github.com/apache/spark/pull/50558#issuecomment-2798016878 @MaxGekk , Could you please review this PR?

[PR] [SPARK-46640][FOLLOW-UP] Consider the whole expression tree when excluding subquery references [spark]

2025-04-11 Thread via GitHub
nikhilsheoran-db opened a new pull request, #50570: URL: https://github.com/apache/spark/pull/50570 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ##

Re: [PR] [SPARK-46640][FOLLOW-UP] Consider the whole expression tree when excluding subquery references [spark]

2025-04-11 Thread via GitHub
nikhilsheoran-db commented on PR #50570: URL: https://github.com/apache/spark/pull/50570#issuecomment-2798042587 cc: @agubichev @cloud-fan to take a look.

[PR] [SPARK-51747][SQL][FOLLOW-UP] Data source cached plan conf and migration guide [spark]

2025-04-11 Thread via GitHub
asl3 opened a new pull request, #50571: URL: https://github.com/apache/spark/pull/50571 ### What changes were proposed in this pull request? Follow-up to https://github.com/apache/spark/pull/50538. Add a SQL legacy conf to enable/disable the change to allow users to

Re: [PR] [SPARK-51747][SQL][FOLLOW-UP] Data source cached plan conf and migration guide [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50571: URL: https://github.com/apache/spark/pull/50571#discussion_r2040353600 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -5260,6 +5260,15 @@ object SQLConf { .booleanConf .createWithDefault(f

Re: [PR] [SPARK-51747][SQL][FOLLOW-UP] Data source cached plan conf and migration guide [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50571: URL: https://github.com/apache/spark/pull/50571#discussion_r2040353600 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -5260,6 +5260,15 @@ object SQLConf { .booleanConf .createWithDefault(f

Re: [PR] [SPARK-51747][SQL][FOLLOW-UP] Data source cached plan conf and migration guide [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50571: URL: https://github.com/apache/spark/pull/50571#discussion_r2040354376 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -5260,6 +5260,15 @@ object SQLConf { .booleanConf .createWithDefault(f

Re: [PR] [SPARK-51747][SQL][FOLLOW-UP] Data source cached plan conf and migration guide [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50571: URL: https://github.com/apache/spark/pull/50571#discussion_r2040354852 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -5260,6 +5260,15 @@ object SQLConf { .booleanConf .createWithDefault(f

Re: [PR] [SPARK-51747][SQL][FOLLOW-UP] Data source cached plan conf and migration guide [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50571: URL: https://github.com/apache/spark/pull/50571#discussion_r2040355842 ## sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala: ## @@ -1405,6 +1405,31 @@ abstract class DDLSuite extends QueryTest with DDLSui

Re: [PR] [SPARK-51747][SQL][FOLLOW-UP] Data source cached plan conf and migration guide [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50571: URL: https://github.com/apache/spark/pull/50571#discussion_r2040356163 ## docs/sql-migration-guide.md: ## @@ -64,6 +64,7 @@ license: | - Since Spark 4.0, Views allow control over how they react to underlying query changes. By defau

Re: [PR] [SPARK-51747][SQL][FOLLOW-UP] Data source cached plan conf and migration guide [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50571: URL: https://github.com/apache/spark/pull/50571#discussion_r2040356678 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -5260,6 +5260,15 @@ object SQLConf { .booleanConf .createWithDefault(f

Re: [PR] [SPARK-51747][SQL][FOLLOW-UP] Data source cached plan conf and migration guide [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50571: URL: https://github.com/apache/spark/pull/50571#discussion_r2040356861 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala: ## @@ -257,12 +257,14 @@ class FindDataSourceTable(sparkSession: S

Re: [PR] [SPARK-51747][SQL][FOLLOW-UP] Data source cached plan conf and migration guide [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50571: URL: https://github.com/apache/spark/pull/50571#discussion_r2040358977 ## sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala: ## @@ -1405,6 +1405,31 @@ abstract class DDLSuite extends QueryTest with DDLSui

[PR] SPARK-51779 Use virtual column families for stream-stream joins [spark]

2025-04-11 Thread via GitHub
zecookiez opened a new pull request, #50572: URL: https://github.com/apache/spark/pull/50572 ### What changes were proposed in this pull request? SPARK-51779 This PR includes the join operator's implementation of using virtual column families for stream-stream joins. Th

Re: [PR] [SPARK-51747][SQL][FOLLOW-UP] Data source cached plan conf and migration guide [spark]

2025-04-11 Thread via GitHub
szehon-ho commented on code in PR #50571: URL: https://github.com/apache/spark/pull/50571#discussion_r2040395136 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -5260,6 +5260,15 @@ object SQLConf { .booleanConf .createWithDefault(false

Re: [PR] [SPARK-51747][SQL][FOLLOW-UP] Data source cached plan conf and migration guide [spark]

2025-04-11 Thread via GitHub
szehon-ho commented on code in PR #50571: URL: https://github.com/apache/spark/pull/50571#discussion_r2040394255 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala: ## @@ -257,12 +257,14 @@ class FindDataSourceTable(sparkSession: Spark

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
aokolnychyi commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040392099 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java: ## @@ -297,6 +298,24 @@ public int hashCode() { } } + /** + * Cr

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
aokolnychyi commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040392846 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java: ## @@ -297,6 +298,24 @@ public int hashCode() { } } + /** + * Cr

Re: [PR] [SPARK-51779] [SS] Use virtual column families for stream-stream joins [spark]

2025-04-11 Thread via GitHub
anishshri-db commented on code in PR #50572: URL: https://github.com/apache/spark/pull/50572#discussion_r2040400152 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/OperatorStateMetadata.scala: ## @@ -24,7 +24,8 @@ import scala.reflect.ClassTag import

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040421291 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java: ## @@ -297,6 +298,24 @@ public int hashCode() { } } + /** + *

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040432672 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Util.scala: ## @@ -296,6 +297,49 @@ private[sql] object CatalogV2Util { } }

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040433390 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java: ## @@ -787,4 +806,81 @@ public int hashCode() { return Arrays.hashCo

Re: [PR] [SPARK-51779] [SS] Use virtual column families for stream-stream joins [spark]

2025-04-11 Thread via GitHub
zecookiez commented on code in PR #50572: URL: https://github.com/apache/spark/pull/50572#discussion_r2040436071 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/OperatorStateMetadata.scala: ## @@ -24,7 +24,8 @@ import scala.reflect.ClassTag import or

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
szehon-ho commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040434026 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -811,6 +811,20 @@ }, "sqlState" : "XX000" }, + "CONSTRAINT_ALREADY_EXISTS" : { +

Re: [PR] [SPARK-51747][SQL][FOLLOW-UP] Data source cached plan conf and migration guide [spark]

2025-04-11 Thread via GitHub
szehon-ho commented on code in PR #50571: URL: https://github.com/apache/spark/pull/50571#discussion_r2040458364 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala: ## @@ -257,12 +257,14 @@ class FindDataSourceTable(sparkSession: Spark

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
viirya commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040464042 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java: ## @@ -787,4 +803,82 @@ public int hashCode() { return Arrays.hashCode(clus

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
viirya commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040465935 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java: ## @@ -787,4 +803,82 @@ public int hashCode() { return Arrays.hashCode(clus

Re: [PR] [SPARK-51688][PYTHON] Use Unix Domain Socket between Python and JVM communication [spark]

2025-04-11 Thread via GitHub
ueshin commented on code in PR #50466: URL: https://github.com/apache/spark/pull/50466#discussion_r2040453795 ## core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala: ## @@ -401,33 +415,35 @@ private[spark] abstract class BasePythonRunner[IN, OUT]( }

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
viirya commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040461182 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java: ## @@ -787,4 +803,82 @@ public int hashCode() { return Arrays.hashCode(clus

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
aokolnychyi commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040494480 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java: ## @@ -787,4 +806,81 @@ public int hashCode() { return Arrays.hashCode

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
aokolnychyi commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040494480 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java: ## @@ -787,4 +806,81 @@ public int hashCode() { return Arrays.hashCode

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
aokolnychyi commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040494615 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java: ## @@ -787,4 +803,82 @@ public int hashCode() { return Arrays.hashCode

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
aokolnychyi commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040494831 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java: ## @@ -787,4 +803,82 @@ public int hashCode() { return Arrays.hashCode

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040503839 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java: ## @@ -787,4 +803,82 @@ public int hashCode() { return Arrays.hashCo

Re: [PR] [SPARK-51780][SQL] Implement Describe Procedure [spark]

2025-04-11 Thread via GitHub
szehon-ho commented on PR #50569: URL: https://github.com/apache/spark/pull/50569#issuecomment-2798498648 @aokolnychyi @cloud-fan when you have a chance, thanks!

Re: [PR] [SPARK-51728][SQL] Add SELECT EXCEPT Support [spark]

2025-04-11 Thread via GitHub
Gschiavon commented on PR #50536: URL: https://github.com/apache/spark/pull/50536#issuecomment-2798456325 > This is already implemented [here](https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select.html): > > ``` > When spark.sql.parser.quotedRegexColumnNames is true, >

Re: [PR] [SPARK-51771][SQL] Add DSv2 APIs for ALTER TABLE ADD/DROP CONSTRAINT [spark]

2025-04-11 Thread via GitHub
gengliangwang commented on code in PR #50561: URL: https://github.com/apache/spark/pull/50561#discussion_r2040500082 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -811,6 +811,20 @@ }, "sqlState" : "XX000" }, + "CONSTRAINT_ALREADY_EXISTS" : {

Re: [PR] [SPARK-51728][SQL] Add SELECT EXCEPT Support [spark]

2025-04-11 Thread via GitHub
ik8 commented on PR #50536: URL: https://github.com/apache/spark/pull/50536#issuecomment-2798588230 > > This is already implemented [here](https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select.html): > > ``` > > When spark.sql.parser.quotedRegexColumnNames is true, > > quo