Re: [PR] [SPARK-49479][CORE] Cancel the Timer non-daemon thread on stopping the BarrierCoordinator [spark]

2025-02-27 Thread via GitHub
beliefer commented on code in PR #50020: URL: https://github.com/apache/spark/pull/50020#discussion_r1973314920 ## core/src/main/scala/org/apache/spark/BarrierCoordinator.scala: ## @@ -122,23 +124,40 @@ private[spark] class BarrierCoordinator( // Init a TimerTask for a barr
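Background for the review thread above: `java.util.Timer` runs its tasks on a worker thread that is non-daemon by default, so an uncancelled `Timer` keeps the JVM alive at shutdown; `Timer.cancel()` terminates that thread. A minimal standalone sketch of this JVM behavior (class name, thread name, and delay are illustrative; this is not the Spark `BarrierCoordinator` code):

```java
import java.util.Timer;
import java.util.TimerTask;

public class TimerDemo {
    public static void main(String[] args) {
        // java.util.Timer starts a NON-daemon worker thread by default,
        // which keeps the JVM alive until cancel() is called.
        Timer timer = new Timer("demo-timer");
        timer.schedule(new TimerTask() {
            @Override public void run() { System.out.println("tick"); }
        }, 10_000L); // scheduled far in the future; never fires in this demo
        // Without this cancel(), main() would return but the JVM would not
        // exit, because the non-daemon timer thread is still waiting.
        timer.cancel();
        System.out.println("timer cancelled, JVM can exit");
    }
}
```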

[PR] [SPARK-51337][SQL] Add maxRows to CTERelationRef [spark]

2025-02-27 Thread via GitHub
vladimirg-db opened a new pull request, #50104: URL: https://github.com/apache/spark/pull/50104 ### What changes were proposed in this pull request? Add `maxRows` field to `CTERelationRef`. ### Why are the changes needed? The Analyzer validates scalar subqueries b

Re: [PR] [SPARK-51332][SQL] DS V2 supports push down BIT_AND, BIT_OR, BIT_XOR, BIT_COUNT and BIT_GET [spark]

2025-02-27 Thread via GitHub
beliefer commented on PR #50097: URL: https://github.com/apache/spark/pull/50097#issuecomment-2687554240 ping @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [SPARK-49479][CORE] Cancel the Timer non-daemon thread on stopping the BarrierCoordinator [spark]

2025-02-27 Thread via GitHub
beliefer commented on code in PR #50020: URL: https://github.com/apache/spark/pull/50020#discussion_r1973321076 ## core/src/main/scala/org/apache/spark/BarrierCoordinator.scala: ## @@ -80,8 +81,9 @@ private[spark] class BarrierCoordinator( states.forEachValue(1, clearStat

[PR] [WIP][SQL] Add `TimeType` [spark]

2025-02-27 Thread via GitHub
MaxGekk opened a new pull request, #50103: URL: https://github.com/apache/spark/pull/50103 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

Re: [PR] [SPARK-50849][Connect] Add example project to demonstrate Spark Connect Server Libraries [spark]

2025-02-27 Thread via GitHub
vicennial commented on PR #49604: URL: https://github.com/apache/spark/pull/49604#issuecomment-2687751626 Thanks for the comment @LuciferYang > I have some doubts about merging this part of the code into the main codebase, as it seems more like an independent project to me. It

[PR] [SPARK-51340][ML][CONNECT] Model size estimation for linear classification & regression models [spark]

2025-02-27 Thread via GitHub
zhengruifeng opened a new pull request, #50106: URL: https://github.com/apache/spark/pull/50106 ### What changes were proposed in this pull request? Model size estimation for linear classification & regression models ### Why are the changes needed? pre-training model

Re: [PR] [SPARK-51325] Check in source code for `smallJar.jar` [spark]

2025-02-27 Thread via GitHub
vicennial closed pull request #50092: [SPARK-51325] Check in source code for `smallJar.jar` URL: https://github.com/apache/spark/pull/50092

Re: [PR] [SPARK-51325] Check in source code for `smallJar.jar` [spark]

2025-02-27 Thread via GitHub
vicennial commented on PR #50092: URL: https://github.com/apache/spark/pull/50092#issuecomment-2687688931 Understood. Closing this PR and holding the code changes until we have a direction based on the dev list discussion

Re: [PR] [SPARK-49479][CORE] Cancel the Timer non-daemon thread on stopping the BarrierCoordinator [spark]

2025-02-27 Thread via GitHub
jayadeep-jayaraman commented on code in PR #50020: URL: https://github.com/apache/spark/pull/50020#discussion_r1973338339 ## core/src/main/scala/org/apache/spark/BarrierCoordinator.scala: ## @@ -122,23 +124,40 @@ private[spark] class BarrierCoordinator( // Init a TimerTask

Re: [PR] [SPARK-49479][CORE] Cancel the Timer non-daemon thread on stopping the BarrierCoordinator [spark]

2025-02-27 Thread via GitHub
jayadeep-jayaraman commented on code in PR #50020: URL: https://github.com/apache/spark/pull/50020#discussion_r1973345489 ## core/src/main/scala/org/apache/spark/BarrierCoordinator.scala: ## @@ -80,8 +81,9 @@ private[spark] class BarrierCoordinator( states.forEachValue(1,

[PR] Add java/scala version in response [spark]

2025-02-27 Thread via GitHub
garlandz-db opened a new pull request, #50102: URL: https://github.com/apache/spark/pull/50102 ### What changes were proposed in this pull request? * piggyback off the spark version response and include other env properties like java/scala version ### Why are the ch

Re: [PR] [SPARK-49479][CORE] Cancel the Timer non-daemon thread on stopping the BarrierCoordinator [spark]

2025-02-27 Thread via GitHub
jjayadeep06 commented on code in PR #50020: URL: https://github.com/apache/spark/pull/50020#discussion_r1973391083 ## core/src/main/scala/org/apache/spark/BarrierCoordinator.scala: ## @@ -80,8 +81,9 @@ private[spark] class BarrierCoordinator( states.forEachValue(1, clearS

Re: [PR] [SPARK-49479][CORE] Cancel the Timer non-daemon thread on stopping the BarrierCoordinator [spark]

2025-02-27 Thread via GitHub
jjayadeep06 commented on code in PR #50020: URL: https://github.com/apache/spark/pull/50020#discussion_r1973390612 ## core/src/main/scala/org/apache/spark/BarrierCoordinator.scala: ## @@ -122,23 +124,40 @@ private[spark] class BarrierCoordinator( // Init a TimerTask for a b

[PR] [SPARK-51339] Remove `IllegalImportsChecker` for `scala.collection.Seq/IndexedSeq` from `scalastyle-config.xml` [spark]

2025-02-27 Thread via GitHub
LuciferYang opened a new pull request, #50105: URL: https://github.com/apache/spark/pull/50105 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-51281][SQL] DataFrameWriterV2 should respect the path option [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on code in PR #50040: URL: https://github.com/apache/spark/pull/50040#discussion_r1973575824 ## sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala: ## @@ -841,20 +841,24 @@ class DataFrameWriterV2Suite extends QueryTest with SharedSpar

Re: [PR] [SPARK-51281][SQL] DataFrameWriterV2 should respect the path option [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on PR #50040: URL: https://github.com/apache/spark/pull/50040#issuecomment-2687956522 thanks for the review, merging to master/4.0/3.5!

Re: [PR] [SPARK-51281][SQL] DataFrameWriterV2 should respect the path option [spark]

2025-02-27 Thread via GitHub
cloud-fan closed pull request #50040: [SPARK-51281][SQL] DataFrameWriterV2 should respect the path option URL: https://github.com/apache/spark/pull/50040

Re: [PR] [SPARK-49479][CORE] Cancel the Timer non-daemon thread on stopping the BarrierCoordinator [spark]

2025-02-27 Thread via GitHub
beliefer commented on code in PR #50020: URL: https://github.com/apache/spark/pull/50020#discussion_r1973575314 ## core/src/main/scala/org/apache/spark/BarrierCoordinator.scala: ## @@ -122,23 +124,40 @@ private[spark] class BarrierCoordinator( // Init a TimerTask for a barr

Re: [PR] [SPARK-49479][CORE] Cancel the Timer non-daemon thread on stopping the BarrierCoordinator [spark]

2025-02-27 Thread via GitHub
beliefer commented on code in PR #50020: URL: https://github.com/apache/spark/pull/50020#discussion_r1973607118 ## core/src/main/scala/org/apache/spark/BarrierCoordinator.scala: ## @@ -122,23 +124,40 @@ private[spark] class BarrierCoordinator( // Init a TimerTask for a barr

[PR] [SPARK-51341][SQL] Cancel time task with suitable way. [spark]

2025-02-27 Thread via GitHub
beliefer opened a new pull request, #50107: URL: https://github.com/apache/spark/pull/50107 ### What changes were proposed in this pull request? This PR proposes to cancel task with suitable way. ### Why are the changes needed? According to the discussion at https://github

Re: [PR] [SPARK-49756][SQL][FOLLOWUP] Use correct pgsql datetime fields when pushing down EXTRACT [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on PR #50101: URL: https://github.com/apache/spark/pull/50101#issuecomment-2688008852 and also cc @yaooqinn

Re: [PR] [SPARK-51182][SQL] DataFrameWriter should throw dataPathNotSpecifiedError when path is not specified [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on PR #49928: URL: https://github.com/apache/spark/pull/49928#issuecomment-2688013194 The change LGTM, can we add a test in `DataFrameReaderWriterSuite`?

Re: [PR] [SPARK-48375][SQL] Add support for SIGNAL statement [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on PR #49726: URL: https://github.com/apache/spark/pull/49726#issuecomment-2688007524 Can we implement the SIGNAL statement as a `SELECT raise_error(...)`? Then it's consistent that every scripting statement generates a DataFrame.
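For context on the suggestion above: `raise_error` is an existing Spark SQL function that throws a runtime error when evaluated, so a SIGNAL statement could plausibly desugar into an ordinary single-row query. A hedged sketch (the SIGNAL form shown follows the SQL standard syntax and is illustrative, not necessarily Spark's final grammar):

```sql
-- A scripting statement such as:
--   SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'custom error';
-- could desugar into a one-row query that fails at execution time:
SELECT raise_error('custom error');
```

This would keep the invariant mentioned in the comment: every scripting statement produces a DataFrame, and the error surfaces when that DataFrame is evaluated.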

Re: [PR] [SPARK-51299][SQL][UI] MetricUtils.stringValue should filter metric values with initValue rather than a hardcoded value [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on PR #50055: URL: https://github.com/apache/spark/pull/50055#issuecomment-2688019216 @jiwen624 I think you are right, do you have a concrete metric that has the problem? For metrics that we only need a sum, `0` is OK as the initial value and we don't filter it out.

Re: [PR] [SPARK-51307][SQL] locationUri in CatalogStorageFormat shall be decoded for display [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on code in PR #50074: URL: https://github.com/apache/spark/pull/50074#discussion_r1973628519 ## sql/core/src/test/resources/sql-tests/results/describe.sql.out: ## @@ -890,6 +890,48 @@ a string CONCAT('a\n b\n ', 'c\n

Re: [PR] [SPARK-51307][SQL] locationUri in CatalogStorageFormat shall be decoded for display [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on code in PR #50074: URL: https://github.com/apache/spark/pull/50074#discussion_r1973633764 ## sql/core/src/test/resources/sql-tests/results/describe.sql.out: ## @@ -890,6 +890,48 @@ a string CONCAT('a\n b\n ', 'c\n

Re: [PR] [SPARK-51342][SQL] Add `TimeType` [spark]

2025-02-27 Thread via GitHub
the-sakthi commented on code in PR #50103: URL: https://github.com/apache/spark/pull/50103#discussion_r1974366151 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -6190,6 +6190,12 @@ ], "sqlState" : "42000" }, + "UNSUPPORTED_TIME_PRECISION" : {

Re: [PR] [SPARK-51326][CONNECT][4.0] Remove LazyExpression proto message [spark]

2025-02-27 Thread via GitHub
ueshin commented on PR #50094: URL: https://github.com/apache/spark/pull/50094#issuecomment-263482 Thanks! merging to branch-4.0.

Re: [PR] [SPARK-51342][SQL] Add `TimeType` [spark]

2025-02-27 Thread via GitHub
the-sakthi commented on PR #50103: URL: https://github.com/apache/spark/pull/50103#issuecomment-2689165941 LGTM

Re: [PR] [SPARK-51339][BUILD] Remove `IllegalImportsChecker` for `s.c.Seq/IndexedSeq` from `scalastyle-config.xml` [spark]

2025-02-27 Thread via GitHub
the-sakthi commented on PR #50105: URL: https://github.com/apache/spark/pull/50105#issuecomment-2689144092 LGTM

Re: [PR] [SPARK-51344] Fix `ENV` key value format in `*.template` [spark-docker]

2025-02-27 Thread via GitHub
dongjoon-hyun commented on PR #82: URL: https://github.com/apache/spark-docker/pull/82#issuecomment-2689182775 Thank you, @viirya !

Re: [PR] [SPARK-51344] Fix `ENV` key value format in `*.template` [spark-docker]

2025-02-27 Thread via GitHub
dongjoon-hyun closed pull request #82: [SPARK-51344] Fix `ENV` key value format in `*.template` URL: https://github.com/apache/spark-docker/pull/82

Re: [PR] [SPARK-51270][SQL] Support UUID type in Variant [spark]

2025-02-27 Thread via GitHub
gene-db commented on code in PR #50025: URL: https://github.com/apache/spark/pull/50025#discussion_r1974322433 ## common/variant/src/main/java/org/apache/spark/types/variant/VariantBuilder.java: ## @@ -240,6 +242,18 @@ public void appendBinary(byte[] binary) { writePos += b

Re: [PR] [SPARK-51301][BUILD] Bump zstd-jni 1.5.7-1 [spark]

2025-02-27 Thread via GitHub
luben commented on code in PR #50057: URL: https://github.com/apache/spark/pull/50057#discussion_r1974306557 ## core/benchmarks/ZStandardBenchmark-jdk21-results.txt: ## @@ -2,48 +2,48 @@ Benchmark ZStandardCompressionCodec =

Re: [PR] [SPARK-51326][CONNECT][4.0] Remove LazyExpression proto message [spark]

2025-02-27 Thread via GitHub
ueshin commented on PR #50094: URL: https://github.com/apache/spark/pull/50094#issuecomment-262567 The remaining test failures are not related to this PR.

Re: [PR] [SPARK-51278][FOLLOWUP][DOCS] Update JSON format from documentation [spark]

2025-02-27 Thread via GitHub
the-sakthi commented on PR #50100: URL: https://github.com/apache/spark/pull/50100#issuecomment-2689210402 LGTM

Re: [PR] [SPARK-51270][SQL] Support UUID type in Variant [spark]

2025-02-27 Thread via GitHub
cashmand commented on code in PR #50025: URL: https://github.com/apache/spark/pull/50025#discussion_r1974435292 ## common/variant/src/main/java/org/apache/spark/types/variant/VariantBuilder.java: ## @@ -240,6 +242,18 @@ public void appendBinary(byte[] binary) { writePos +=

Re: [PR] [SPARK-50855][SS][CONNECT] Spark Connect Support for TransformWithState In Scala [spark]

2025-02-27 Thread via GitHub
jingz-db commented on code in PR #49488: URL: https://github.com/apache/spark/pull/49488#discussion_r1974452111 ## sql/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/streaming/TransformWithStateConnectSuite.scala: ## Review Comment: > I've realized that you

Re: [PR] [SPARK-50855][SS][CONNECT] Spark Connect Support for TransformWithState In Scala [spark]

2025-02-27 Thread via GitHub
jingz-db commented on code in PR #49488: URL: https://github.com/apache/spark/pull/49488#discussion_r1974458075 ## sql/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/streaming/TransformWithStateConnectSuite.scala: ## @@ -0,0 +1,489 @@ +/* + * Licensed to the Apac

Re: [PR] [SPARK-51337][SQL] Add maxRows to CTERelationDef and CTERelationRef [spark]

2025-02-27 Thread via GitHub
cloud-fan closed pull request #50104: [SPARK-51337][SQL] Add maxRows to CTERelationDef and CTERelationRef URL: https://github.com/apache/spark/pull/50104

Re: [PR] [SPARK-51337][SQL] Add maxRows to CTERelationDef and CTERelationRef [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on PR #50104: URL: https://github.com/apache/spark/pull/50104#issuecomment-2689469635 thanks, merging to master!

Re: [PR] [SPARK-51303] [SQL] [TESTS] Extend `ORDER BY` testing coverage [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on PR #50069: URL: https://github.com/apache/spark/pull/50069#issuecomment-2689447332 This is a test only PR and other test failures are definitely unrelated. Thanks, merging to master/4.0!

Re: [PR] [SPARK-51303] [SQL] [TESTS] Extend `ORDER BY` testing coverage [spark]

2025-02-27 Thread via GitHub
cloud-fan closed pull request #50069: [SPARK-51303] [SQL] [TESTS] Extend `ORDER BY` testing coverage URL: https://github.com/apache/spark/pull/50069

Re: [PR] [WIP][SPARK-51271][PYTHON] Add filter pushdown API to Python Data Sources [spark]

2025-02-27 Thread via GitHub
wengh commented on code in PR #49961: URL: https://github.com/apache/spark/pull/49961#discussion_r1974559481 ## python/pyspark/sql/tests/test_python_datasource.py: ## @@ -246,6 +248,137 @@ def reader(self, schema) -> "DataSourceReader": assertDataFrameEqual(df, [Row(x=0

Re: [PR] [SPARK-50855][SS][CONNECT] Spark Connect Support for TransformWithState In Scala [spark]

2025-02-27 Thread via GitHub
jingz-db commented on code in PR #49488: URL: https://github.com/apache/spark/pull/49488#discussion_r1974088879 ## sql/connect/common/src/main/protobuf/spark/connect/relations.proto: ## Review Comment: Yes, Scala & Python is sharing the same connect client protocol.

Re: [PR] [SPARK-51278][FOLLOWUP][DOCS] Update JSON format from documentation [spark]

2025-02-27 Thread via GitHub
HyukjinKwon closed pull request #50100: [SPARK-51278][FOLLOWUP][DOCS] Update JSON format from documentation URL: https://github.com/apache/spark/pull/50100

Re: [PR] [SPARK-51278][FOLLOWUP][DOCS] Update JSON format from documentation [spark]

2025-02-27 Thread via GitHub
HyukjinKwon commented on PR #50100: URL: https://github.com/apache/spark/pull/50100#issuecomment-2689394019 Merged to master and branch-4.0.

Re: [PR] Dev/milast/recurisve cte [spark]

2025-02-27 Thread via GitHub
github-actions[bot] closed pull request #48878: Dev/milast/recurisve cte URL: https://github.com/apache/spark/pull/48878

Re: [PR] [SPARK-51335] Publish Apache Spark 3.5.5 to docker registry [spark-docker]

2025-02-27 Thread via GitHub
dongjoon-hyun commented on PR #80: URL: https://github.com/apache/spark-docker/pull/80#issuecomment-2689111825 Thank you for the pointer, @pan3793 . Let me try in this PR.

[PR] [SPARK-51344] Fix `ENV` key value format in `*.template` [spark-docker]

2025-02-27 Thread via GitHub
dongjoon-hyun opened a new pull request, #82: URL: https://github.com/apache/spark-docker/pull/82 ### What changes were proposed in this pull request? This PR aims to fix `ENV` key value format in `*.template`. ### Why are the changes needed? To follow the Docker guidelin
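Background for the template fix above: Docker's guidelines deprecate the legacy space-separated `ENV` form in favor of the `key=value` form, and recent BuildKit versions warn on the old syntax. An illustrative Dockerfile fragment (the variable names and values are examples, not the actual `*.template` contents):

```dockerfile
# Legacy form (deprecated; triggers a build warning in recent Docker):
#   ENV SPARK_HOME /opt/spark
# Recommended key=value form per Docker guidelines:
ENV SPARK_HOME=/opt/spark
```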

Re: [PR] [SPARK-51298][WIP] Support variant in CSV scan [spark]

2025-02-27 Thread via GitHub
the-sakthi commented on PR #50052: URL: https://github.com/apache/spark/pull/50052#issuecomment-2689223159 Hello @chenhao-db thanks for the changes, could we please get some description in the jira and the response to the PR template questions above? Helps a lot with understanding the conte

[PR] [SPARK-51345] Remove all EOL versions from `spark-docker` repository [spark-docker]

2025-02-27 Thread via GitHub
dongjoon-hyun opened a new pull request, #83: URL: https://github.com/apache/spark-docker/pull/83 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

Re: [PR] [SPARK-51345] Remove all EOL versions from `spark-docker` repository [spark-docker]

2025-02-27 Thread via GitHub
dongjoon-hyun commented on PR #83: URL: https://github.com/apache/spark-docker/pull/83#issuecomment-2689220984 cc @Yikun , @yaooqinn , @LuciferYang , @yaooqinn , @itholic , @viirya

Re: [PR] [SPARK-51288][DOCS] Add link for Scala API of Spark Connect [spark]

2025-02-27 Thread via GitHub
the-sakthi commented on PR #50042: URL: https://github.com/apache/spark/pull/50042#issuecomment-2689227610 Nice. LGTM

Re: [PR] [WIP][SPARK-51271][PYTHON] Add filter pushdown API to Python Data Sources [spark]

2025-02-27 Thread via GitHub
wengh commented on code in PR #49961: URL: https://github.com/apache/spark/pull/49961#discussion_r1974189473 ## python/pyspark/sql/tests/test_python_datasource.py: ## @@ -246,6 +248,137 @@ def reader(self, schema) -> "DataSourceReader": assertDataFrameEqual(df, [Row(x=0

Re: [PR] [SPARK-51345] Remove all EOL versions from `spark-docker` repository [spark-docker]

2025-02-27 Thread via GitHub
dongjoon-hyun commented on PR #83: URL: https://github.com/apache/spark-docker/pull/83#issuecomment-2689239268 Thank you, @viirya . No~ AFAIK, the published docker images will exist like the published Maven jars.

Re: [PR] [SPARK-51326][CONNECT] Remove LazyExpression proto message [spark]

2025-02-27 Thread via GitHub
ueshin closed pull request #50093: [SPARK-51326][CONNECT] Remove LazyExpression proto message URL: https://github.com/apache/spark/pull/50093

Re: [PR] [SPARK-51326][CONNECT] Remove LazyExpression proto message [spark]

2025-02-27 Thread via GitHub
ueshin commented on PR #50093: URL: https://github.com/apache/spark/pull/50093#issuecomment-2688892681 I reran the compatibility test after #50094 was merged and it passed. - https://github.com/ueshin/apache-spark/actions/runs/13554631961/job/37946210197

[PR] [SPARK-51347] Enable Ingress and Service Support for Spark Driver [spark-kubernetes-operator]

2025-02-27 Thread via GitHub
jiangzho opened a new pull request, #159: URL: https://github.com/apache/spark-kubernetes-operator/pull/159 ### What changes were proposed in this pull request? This PR adds support for launching Ingress and Services with Spark Applications. ### Why are the changes

Re: [PR] [SPARK-51322][SQL] Better error message for streaming subquery expression [spark]

2025-02-27 Thread via GitHub
cloud-fan closed pull request #50088: [SPARK-51322][SQL] Better error message for streaming subquery expression URL: https://github.com/apache/spark/pull/50088

Re: [PR] [SPARK-51322][SQL] Better error message for streaming subquery expression [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on PR #50088: URL: https://github.com/apache/spark/pull/50088#issuecomment-2687357950 thanks for the review, merging to master/4.0!

[PR] [SPARK-49756][SQL][FOLLOWUP] Use correct pgsql datetime fields when pushing down EXTRACT [spark]

2025-02-27 Thread via GitHub
cloud-fan opened a new pull request, #50101: URL: https://github.com/apache/spark/pull/50101 ### What changes were proposed in this pull request? This is a followup of https://github.com/apache/spark/pull/48210 to fix correctness issues caused by pgsql filter pushdown. These d

Re: [PR] [SPARK-49756][SQL][FOLLOWUP] Use correct pgsql datetime fields when pushing down EXTRACT [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on PR #50101: URL: https://github.com/apache/spark/pull/50101#issuecomment-2687368062 cc @beliefer @MaxGekk

Re: [PR] [SPARK-51281][SQL] DataFrameWriterV2 should respect the path option [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on code in PR #50040: URL: https://github.com/apache/spark/pull/50040#discussion_r1973251531 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -5545,6 +5545,15 @@ object SQLConf { .booleanConf .createWithDefault(false

Re: [PR] [SPARK-50994][CORE] Perform RDD conversion under tracked execution [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on PR #49678: URL: https://github.com/apache/spark/pull/49678#issuecomment-2687493241 thanks, merging to master/4.0!

Re: [PR] [SPARK-48375][SQL] Add support for SIGNAL statement [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on code in PR #49726: URL: https://github.com/apache/spark/pull/49726#discussion_r1973284907 ## sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionSuite.scala: ## @@ -69,6 +70,222 @@ class SqlScriptingExecutionSuite extends QueryTest

[PR] [SPARK-51335] Publish Apache Spark 3.5.5 to docker registry [spark-docker]

2025-02-27 Thread via GitHub
dongjoon-hyun opened a new pull request, #80: URL: https://github.com/apache/spark-docker/pull/80 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

Re: [PR] [SPARK-50994][CORE] Perform RDD conversion under tracked execution [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on code in PR #49678: URL: https://github.com/apache/spark/pull/49678#discussion_r1973281569 ## sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala: ## @@ -2721,6 +2721,25 @@ class DataFrameSuite extends QueryTest parameters = Map("name"

Re: [PR] [SPARK-50994][CORE] Perform RDD conversion under tracked execution [spark]

2025-02-27 Thread via GitHub
cloud-fan closed pull request #49678: [SPARK-50994][CORE] Perform RDD conversion under tracked execution URL: https://github.com/apache/spark/pull/49678

[PR] [SPARK-51336] Upgrade `upload-artifact` to v4 in `spark-docker` repository [spark-docker]

2025-02-27 Thread via GitHub
dongjoon-hyun opened a new pull request, #81: URL: https://github.com/apache/spark-docker/pull/81 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

Re: [PR] [SPARK-51310][SQL] Resolve the type of default string producing expressions [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on PR #50053: URL: https://github.com/apache/spark/pull/50053#issuecomment-2687505463 thanks, merging to master/4.0!

Re: [PR] [SPARK-51310][SQL] Resolve the type of default string producing expressions [spark]

2025-02-27 Thread via GitHub
cloud-fan closed pull request #50053: [SPARK-51310][SQL] Resolve the type of default string producing expressions URL: https://github.com/apache/spark/pull/50053

Re: [PR] [SPARK-51341][CORE] Cancel time task with suitable way. [spark]

2025-02-27 Thread via GitHub
beliefer commented on code in PR #50107: URL: https://github.com/apache/spark/pull/50107#discussion_r1973643904 ## core/src/main/scala/org/apache/spark/BarrierTaskContext.scala: ## @@ -300,11 +300,7 @@ object BarrierTaskContext { @Since("2.4.0") def get(): BarrierTaskConte

Re: [PR] [SPARK-49479][CORE] Cancel the Timer non-daemon thread on stopping the BarrierCoordinator [spark]

2025-02-27 Thread via GitHub
jjayadeep06 commented on code in PR #50020: URL: https://github.com/apache/spark/pull/50020#discussion_r1973705329 ## core/src/main/scala/org/apache/spark/BarrierCoordinator.scala: ## @@ -122,23 +124,40 @@ private[spark] class BarrierCoordinator( // Init a TimerTask for a b

Re: [PR] [SPARK-51307][SQL] locationUri in CatalogStorageFormat shall be decoded for display [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on code in PR #50074: URL: https://github.com/apache/spark/pull/50074#discussion_r1973627334 ## sql/core/src/test/resources/sql-tests/inputs/describe.sql: ## @@ -122,6 +122,12 @@ DESC TABLE EXTENDED e; DESC FORMATTED e; +CREATE TABLE f PARTITIONED BY (B,

Re: [PR] [SPARK-51303] [SQL] [TESTS] Extend `ORDER BY` testing coverage [spark]

2025-02-27 Thread via GitHub
mihailoale-db commented on PR #50069: URL: https://github.com/apache/spark/pull/50069#issuecomment-2688377111 @MaxGekk @cloud-fan Failures don't seem related to changes? Please check when you have time. Thanks

Re: [PR] [SPARK-51336] Upgrade `upload-artifact` to v4 in `spark-docker` repository [spark-docker]

2025-02-27 Thread via GitHub
dongjoon-hyun merged PR #81: URL: https://github.com/apache/spark-docker/pull/81

Re: [PR] [SPARK-51336] Upgrade `upload-artifact` to v4 in `spark-docker` repository [spark-docker]

2025-02-27 Thread via GitHub
dongjoon-hyun commented on PR #81: URL: https://github.com/apache/spark-docker/pull/81#issuecomment-2688408435 Thank you, @viirya !

Re: [PR] [SPARK-51270][SQL] Support UUID type in Variant [spark]

2025-02-27 Thread via GitHub
cashmand commented on code in PR #50025: URL: https://github.com/apache/spark/pull/50025#discussion_r1973900282 ## common/variant/src/main/java/org/apache/spark/types/variant/VariantBuilder.java: ## @@ -240,6 +242,19 @@ public void appendBinary(byte[] binary) { writePos +=

Re: [PR] [SPARK-51335] Publish Apache Spark 3.5.5 to docker registry [spark-docker]

2025-02-27 Thread via GitHub
viirya commented on PR #80: URL: https://github.com/apache/spark-docker/pull/80#issuecomment-2688494398 This error happened (even after re-triggering): ``` 54.80 qemu: uncaught target signal 11 (Segmentation fault) - core dumped 55.22 Segmentation fault (core dumped) 55.27 qem

Re: [PR] [SPARK-51336] Upgrade `upload-artifact` to v4 in `spark-docker` repository [spark-docker]

2025-02-27 Thread via GitHub
LuciferYang commented on PR #81: URL: https://github.com/apache/spark-docker/pull/81#issuecomment-2688468782 late LGTM

Re: [PR] [SPARK-51335] Publish Apache Spark 3.5.5 to docker registry [spark-docker]

2025-02-27 Thread via GitHub
dongjoon-hyun commented on PR #80: URL: https://github.com/apache/spark-docker/pull/80#issuecomment-2688513900 Let me hold on this because this is not a blocker for the Apache Spark 3.5.5 release announcement. After announcing, I'll come back to this. Thank you, @viirya .

Re: [PR] [SPARK-51335] Publish Apache Spark 3.5.5 to docker registry [spark-docker]

2025-02-27 Thread via GitHub
dongjoon-hyun commented on PR #80: URL: https://github.com/apache/spark-docker/pull/80#issuecomment-2688511310 Ya, it seems to come from the Qemu environment on GitHub Action. When I build the docker locally, it seems to work. ``` $ docker build . [+] Building 423.4s (12/12) F

Re: [PR] [SPARK-51335] Publish Apache Spark 3.5.5 to docker registry [spark-docker]

2025-02-27 Thread via GitHub
viirya commented on PR #80: URL: https://github.com/apache/spark-docker/pull/80#issuecomment-2688522180 Yes. Looks like an issue from the infrastructure. Thanks @dongjoon-hyun

Re: [PR] [SPARK-51335] Publish Apache Spark 3.5.5 to docker registry [spark-docker]

2025-02-27 Thread via GitHub
pan3793 commented on PR #80: URL: https://github.com/apache/spark-docker/pull/80#issuecomment-2688551158 This might be a solution: https://github.com/apache/iceberg/pull/12262

Re: [PR] [SPARK-49756][SQL][FOLLOWUP] Use correct pgsql datetime fields when pushing down EXTRACT [spark]

2025-02-27 Thread via GitHub
beliefer commented on code in PR #50101: URL: https://github.com/apache/spark/pull/50101#discussion_r1974672469 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala: ## @@ -303,12 +303,27 @@ private case class PostgresDialect() class PostgresSQLBuilder

Re: [PR] [SPARK-49756][SQL][FOLLOWUP] Use correct pgsql datetime fields when pushing down EXTRACT [spark]

2025-02-27 Thread via GitHub
beliefer commented on code in PR #50101: URL: https://github.com/apache/spark/pull/50101#discussion_r1974672789 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala: ## @@ -303,12 +303,27 @@ private case class PostgresDialect() class PostgresSQLBuilder
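The SPARK-49756 follow-up above adjusts how `PostgresSQLBuilder` renders `EXTRACT` fields, since Spark's datetime field names do not all match PostgreSQL's. As a hypothetical illustration only (the mappings below are assumptions for the sketch, not the actual `PostgresDialect` code from PR #50101), a dialect-specific translation might look like:

```java
// Hypothetical sketch of dialect-specific EXTRACT field renaming.
public class ExtractFieldSketch {
    static String pgField(String sparkField) {
        switch (sparkField) {
            case "DAY_OF_WEEK":  return "DOW";     // caveat: PostgreSQL DOW is 0-based (Sunday = 0)
            case "DAY_OF_YEAR":  return "DOY";
            case "YEAR_OF_WEEK": return "ISOYEAR"; // ISO week-numbering year
            default:             return sparkField; // YEAR, MONTH, DAY, ... pass through
        }
    }

    public static void main(String[] args) {
        for (String f : new String[]{"YEAR", "DAY_OF_WEEK", "YEAR_OF_WEEK"}) {
            System.out.println(f + " -> " + pgField(f));
        }
    }
}
```

Off-by-one semantics (e.g. Spark and PostgreSQL numbering weekdays differently) are exactly the kind of subtlety such a pushdown fix has to account for, beyond plain renaming.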

Re: [PR] [SPARK-51307][SQL] locationUri in CatalogStorageFormat shall be decoded for display [spark]

2025-02-27 Thread via GitHub
yaooqinn commented on code in PR #50074: URL: https://github.com/apache/spark/pull/50074#discussion_r1974620691 ## sql/core/src/test/resources/sql-tests/inputs/describe.sql: ## @@ -122,6 +122,12 @@ DESC TABLE EXTENDED e; DESC FORMATTED e; +CREATE TABLE f PARTITIONED BY (B,

Re: [PR] [SPARK-51307][SQL] locationUri in CatalogStorageFormat shall be decoded for display [spark]

2025-02-27 Thread via GitHub
yaooqinn commented on code in PR #50074: URL: https://github.com/apache/spark/pull/50074#discussion_r1974624507 ## sql/core/src/test/resources/sql-tests/results/describe.sql.out: ## @@ -890,6 +890,48 @@ a string CONCAT('a\n b\n ', 'c\n
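SPARK-51307 above is about decoding `locationUri` before showing it in `DESC` output. A minimal sketch of why this matters, with a made-up path for illustration: `java.net.URI` stores the percent-encoded form, while `getPath()` yields the decoded, human-readable one.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Percent-escapes such as %20 are hard to read in DESC TABLE output;
// decoding the path before display avoids that.
public class LocationDisplaySketch {
    public static void main(String[] args) throws URISyntaxException {
        URI raw = new URI("file:/tmp/ware%20house/t");
        System.out.println(raw.toString()); // encoded form, as stored
        System.out.println(raw.getPath());  // decoded form, nicer for display
    }
}
```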

Re: [PR] [SPARK-51339][BUILD] Remove `IllegalImportsChecker` for `s.c.Seq/IndexedSeq` from `scalastyle-config.xml` [spark]

2025-02-27 Thread via GitHub
LuciferYang commented on PR #50105: URL: https://github.com/apache/spark/pull/50105#issuecomment-2689564664 Merged into master and branch-4.0. Thanks @HyukjinKwon and @the-sakthi

Re: [PR] [SPARK-49756][SQL][FOLLOWUP] Use correct pgsql datetime fields when pushing down EXTRACT [spark]

2025-02-27 Thread via GitHub
cloud-fan commented on code in PR #50101: URL: https://github.com/apache/spark/pull/50101#discussion_r1974712264 ## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/PostgresIntegrationSuite.scala: ## @@ -304,11 +309,25 @@ class PostgresIntegrationSu

Re: [PR] [SPARK-51352] Use Spark 3.5.5 in E2E tests [spark-kubernetes-operator]

2025-02-27 Thread via GitHub
dongjoon-hyun closed pull request #160: [SPARK-51352] Use Spark 3.5.5 in E2E tests URL: https://github.com/apache/spark-kubernetes-operator/pull/160

Re: [PR] [SPARK-51347] Enable Ingress and Service Support for Spark Driver [spark-kubernetes-operator]

2025-02-27 Thread via GitHub
dongjoon-hyun closed pull request #159: [SPARK-51347] Enable Ingress and Service Support for Spark Driver URL: https://github.com/apache/spark-kubernetes-operator/pull/159

Re: [PR] [SPARK-51347] Enable Ingress and Service Support for Spark Driver [spark-kubernetes-operator]

2025-02-27 Thread via GitHub
dongjoon-hyun commented on PR #159: URL: https://github.com/apache/spark-kubernetes-operator/pull/159#issuecomment-2689793279 BTW, I noticed that this is a first commit with a different email, @jiangzho . Are you going to use this one? ``` $ git log --author=zh | grep 'Author:' | sort
