bogao007 commented on PR #47133:
URL: https://github.com/apache/spark/pull/47133#issuecomment-2231824089
> @bogao007 - test failure seems related ?
>
> ```
> [error] /home/runner/work/spark/spark/sql/core/src/main/scala/org/apache/spark/sql/execution/python/TransformWithStateInPand
anishshri-db commented on code in PR #47133:
URL: https://github.com/apache/spark/pull/47133#discussion_r1680065191
##
python/pyspark/sql/tests/pandas/test_pandas_transform_with_state.py:
##
@@ -0,0 +1,152 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or mo
anishshri-db commented on code in PR #47133:
URL: https://github.com/apache/spark/pull/47133#discussion_r1680066012
##
python/pyspark/sql/tests/pandas/test_pandas_transform_with_state.py:
##
@@ -0,0 +1,152 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or mo
anishshri-db commented on code in PR #47133:
URL: https://github.com/apache/spark/pull/47133#discussion_r1680066533
##
python/pyspark/sql/tests/pandas/test_pandas_transform_with_state.py:
##
@@ -0,0 +1,152 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or mo
anishshri-db commented on code in PR #47133:
URL: https://github.com/apache/spark/pull/47133#discussion_r1680066276
##
python/pyspark/sql/tests/pandas/test_pandas_transform_with_state.py:
##
@@ -0,0 +1,152 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or mo
anishshri-db commented on code in PR #47363:
URL: https://github.com/apache/spark/pull/47363#discussion_r1680081616
##
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/RocksDBSuite.scala:
##
@@ -1663,9 +1670,8 @@ class RocksDBSuite extends AlsoTestWithChan
mingkangli-db commented on PR #47361:
URL: https://github.com/apache/spark/pull/47361#issuecomment-2231904382
@cloud-fan Hi Wenchen, since last time you reviewed it, I addressed the
comments and also synced the changes to R, Python, and Java `SparkContext` API,
making it consistent with the
zhengruifeng closed pull request #47342: [SPARK-48892][ML] Avoid per-row param
read in `Tokenizer`
URL: https://github.com/apache/spark/pull/47342
zhengruifeng commented on PR #47342:
URL: https://github.com/apache/spark/pull/47342#issuecomment-2231972215
merged to master
panbingkun opened a new pull request, #47377:
URL: https://github.com/apache/spark/pull/47377
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How
bogao007 commented on code in PR #47133:
URL: https://github.com/apache/spark/pull/47133#discussion_r1680168088
##
python/pyspark/sql/streaming/__init__.py:
##
@@ -19,3 +19,4 @@
from pyspark.sql.streaming.readwriter import DataStreamReader,
DataStreamWriter # noqa: F401
from
HyukjinKwon commented on PR #47341:
URL: https://github.com/apache/spark/pull/47341#issuecomment-2232029263
Merged to master.
HyukjinKwon closed pull request #47341: [SPARK-48883][ML][R] Replace RDD read /
write API invocation with Dataframe read / write API
URL: https://github.com/apache/spark/pull/47341
github-actions[bot] closed pull request #45849:
[SPARK-47602][CORE][K8S][FOLLOWUP] Improve structure logging for
isExecutorIdleTimedOut
URL: https://github.com/apache/spark/pull/45849
github-actions[bot] commented on PR #45776:
URL: https://github.com/apache/spark/pull/45776#issuecomment-2232054848
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
HyukjinKwon commented on PR #47368:
URL: https://github.com/apache/spark/pull/47368#issuecomment-2232077921
@itholic wanna try merging this?
huaxingao commented on PR #47233:
URL: https://github.com/apache/spark/pull/47233#issuecomment-2232086743
I took a look at Delta Lake's implementation for
[update](https://github.com/delta-io/delta/blob/master/spark/src/main/scala/io/delta/tables/DeltaTable.scala#L234),
which uses executeUp
panbingkun commented on PR #47377:
URL: https://github.com/apache/spark/pull/47377#issuecomment-2232087368
It depends on `4.27.0
panbingkun closed pull request #47377: [ONLY TEST][SPARK-48917][BUILD] Upgrade
tink to 1.14.0
URL: https://github.com/apache/spark/pull/47377
panbingkun commented on PR #47377:
URL: https://github.com/apache/spark/pull/47377#issuecomment-2232113942
https://github.com/user-attachments/assets/0ecddfae-fae6-4f0b-bdb6-8458164a15fe
HeartSaVioR closed pull request #47363: [SPARK-48903][SS] Set the RocksDB last
snapshot version correctly on remote load
URL: https://github.com/apache/spark/pull/47363
HeartSaVioR commented on PR #47363:
URL: https://github.com/apache/spark/pull/47363#issuecomment-2232122796
Thanks! Merged to master.
itholic closed pull request #47368: [SPARK-48510][CONNECT][FOLLOW-UP] Fix for
UDAF `toColumn` API when running tests in Maven
URL: https://github.com/apache/spark/pull/47368
itholic commented on PR #47368:
URL: https://github.com/apache/spark/pull/47368#issuecomment-2232155789
Merged to master
hvanhovell commented on code in PR #47378:
URL: https://github.com/apache/spark/pull/47378#discussion_r1680301810
##
sql/api/pom.xml:
##
@@ -86,7 +92,7 @@
true
-../api/src/main/antlr4
+
hvanhovell commented on code in PR #47378:
URL: https://github.com/apache/spark/pull/47378#discussion_r1680302416
##
connector/connect/client/jvm/pom.xml:
##
@@ -116,49 +90,18 @@
false
true
+
Review Comment:
Most of the shading i
hvanhovell commented on code in PR #47378:
URL: https://github.com/apache/spark/pull/47378#discussion_r1680303391
##
connect/server/pom.xml:
##
@@ -36,70 +36,25 @@
- org.apache.spark
Review Comment:
This has all been moved to connect-api.
hvanhovell commented on code in PR #47378:
URL: https://github.com/apache/spark/pull/47378#discussion_r1680303100
##
connect/common/pom.xml:
##
@@ -39,59 +39,6 @@
spark-sql-api_${scala.binary.version}
${project.version}
-
Review Comme
hvanhovell commented on code in PR #47378:
URL: https://github.com/apache/spark/pull/47378#discussion_r1680312565
##
sql/connect-api/pom.xml:
##
@@ -0,0 +1,312 @@
+
+
+
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:sch
hvanhovell commented on code in PR #47378:
URL: https://github.com/apache/spark/pull/47378#discussion_r1680313273
##
sql/connect-api/pom.xml:
##
@@ -0,0 +1,312 @@
+
+
+
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:sch
hvanhovell commented on code in PR #47378:
URL: https://github.com/apache/spark/pull/47378#discussion_r1680313747
##
connector/connect/client/jvm/pom.xml:
##
@@ -116,49 +90,18 @@
false
true
+
Review Comment:
... and yes I still n
williamhyun opened a new pull request, #47379:
URL: https://github.com/apache/spark/pull/47379
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How
hvanhovell commented on code in PR #47378:
URL: https://github.com/apache/spark/pull/47378#discussion_r1680314151
##
project/SparkBuild.scala:
##
@@ -674,23 +664,76 @@ object SparkConnectCommon {
// Exclude `scala-library` from assembly.
(assembly / assemblyPackageScal
williamhyun commented on PR #47379:
URL: https://github.com/apache/spark/pull/47379#issuecomment-2232214327
cc: @yaooqinn , @dongjoon-hyun
hvanhovell commented on code in PR #47378:
URL: https://github.com/apache/spark/pull/47378#discussion_r1680315462
##
project/SparkBuild.scala:
##
@@ -713,85 +756,9 @@ object SparkConnectCommon {
}
}
-object SparkConnect {
- import BuildCommons.protoVersion
-
+object Spark
hvanhovell commented on PR #47378:
URL: https://github.com/apache/spark/pull/47378#issuecomment-2232216536
For the reviewers: most of this PR is mechanical, renaming imports to their
new shaded names. Please focus on the Maven and SBT build files first!
panbingkun commented on PR #47364:
URL: https://github.com/apache/spark/pull/47364#issuecomment-2232225702
Currently it only shows the `normalized` collation name.
HyukjinKwon commented on PR #47378:
URL: https://github.com/apache/spark/pull/47378#issuecomment-2232225650
cc @LuciferYang if you find some time to review.
HyukjinKwon commented on PR #47373:
URL: https://github.com/apache/spark/pull/47373#issuecomment-2232228940
Can you make a PR against the master branch?
HyukjinKwon commented on PR #47371:
URL: https://github.com/apache/spark/pull/47371#issuecomment-2232230103
@allisonwang-db wanna try merging this?
viirya opened a new pull request, #47380:
URL: https://github.com/apache/spark/pull/47380
### What changes were proposed in this pull request?
We got a customer issue where a `MergeInto` query on an Iceberg table worked
earlier but fails after upgrading to Spark 3.4.
allisonwang-db closed pull request #47371: [SPARK-47307][DOCS][FOLLOWUP] Add a
migration guide for the behavior change of base64 function
URL: https://github.com/apache/spark/pull/47371
allisonwang-db commented on PR #47371:
URL: https://github.com/apache/spark/pull/47371#issuecomment-2232238622
Merged to master and branch-3.5
panbingkun commented on PR #47364:
URL: https://github.com/apache/spark/pull/47364#issuecomment-2232241876
If necessary, we can also show columns: `CaseSensitivity` and
`AccentSensitivity`
williamhyun commented on PR #47379:
URL: https://github.com/apache/spark/pull/47379#issuecomment-2232245366
Thank you, @yaooqinn , @dongjoon-hyun , @HyukjinKwon !
pan3793 commented on PR #47378:
URL: https://github.com/apache/spark/pull/47378#issuecomment-2232246362
I remember there are issues with Maven consuming a shaded module in the same
project, i.e. you must run `mvn install -pl ` first, otherwise
`mvn test` or `mvn package` cannot see the sha
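A minimal sketch of the install-first workflow described above, assuming the shaded module is the new `sql/connect-api` seen in the review snippets (module path and commands are illustrative, not confirmed by this thread):
```
# Install the shaded module into the local Maven repository first, so that
# sibling modules resolve the installed shaded artifact rather than reactor class output.
build/mvn -DskipTests install -pl sql/connect-api -am

# Only afterwards run tests for a module that depends on the shaded jar.
build/mvn test -pl connector/connect/client/jvm
```
`-am` additionally builds the listed module's dependencies in the same invocation; whether tests then resolve the shaded jar or the plain classes directory is exactly the concern raised here.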
hvanhovell commented on PR #47378:
URL: https://github.com/apache/spark/pull/47378#issuecomment-2232276170
@pan3793 thanks for the input. I did check maven package and that seemed to
work for packaging (this is the command I used: `build/mvn package -pl
connector/connect/client/jvm -am`). I
wForget opened a new pull request, #47381:
URL: https://github.com/apache/spark/pull/47381
### What changes were proposed in this pull request?
To improve insertion performance, skip adding transform expressions when
there is no conversion needed for complex types.
panbingkun commented on PR #47364:
URL: https://github.com/apache/spark/pull/47364#issuecomment-2232283575
Another option:
Only display `name`, `provider` and `version` when executing `SHOW COLLATIONS
...`, and when executing `DESCRIBE COLLATIONS ...`, display `name`,
`provider`, `ve
HeartSaVioR commented on PR #47339:
URL: https://github.com/apache/spark/pull/47339#issuecomment-2232289776
https://github.com/siying/spark/runs/27529815714
This only failed in the Docker integration test
`org.apache.spark.sql.jdbc.OracleIntegrationSuite`, which is unrelated.
dongjoon-hyun commented on code in PR #47380:
URL: https://github.com/apache/spark/pull/47380#discussion_r1680352198
##
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveSubquerySuite.scala:
##
@@ -299,4 +300,58 @@ class ResolveSubquerySuite extends Analy
HeartSaVioR commented on PR #47339:
URL: https://github.com/apache/spark/pull/47339#issuecomment-2232290247
Thanks! Merging to master/3.5/3.4 (if there's no merge conflict).
LuciferYang commented on PR #47377:
URL: https://github.com/apache/spark/pull/47377#issuecomment-2232290639
Thank you for pinging me, @dongjoon-hyun
Yes, due to the compatibility issue with protobuf-java, although excluding
it may be a workaround, I prefer to wait for the official releas
dongjoon-hyun commented on PR #47380:
URL: https://github.com/apache/spark/pull/47380#issuecomment-2232291809
Do you happen to know which JIRA issue caused this regression, @viirya?
> after upgrading to Spark 3.4.
HeartSaVioR closed pull request #47339: [SPARK-48889][SS] testStream to unload
state stores before finishing
URL: https://github.com/apache/spark/pull/47339
panbingkun opened a new pull request, #47382:
URL: https://github.com/apache/spark/pull/47382
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How
panbingkun commented on PR #47382:
URL: https://github.com/apache/spark/pull/47382#issuecomment-2232303157
> Please use a proper JIRA ID for code change. Especially, this is a kind of
`Fix`.
Okay, let me file it.
yabola commented on PR #46713:
URL: https://github.com/apache/spark/pull/46713#issuecomment-2232302650
I would like to describe the usage scenario:
In a scenario where multiple users share a 2048-core, long-running SQL
cluster, some users may have non-standard queries that use a large
panbingkun commented on code in PR #47382:
URL: https://github.com/apache/spark/pull/47382#discussion_r1680359737
##
common/unsafe/src/test/scala/org/apache/spark/unsafe/types/CollationFactorySuite.scala:
##
@@ -154,8 +151,8 @@ class CollationFactorySuite extends AnyFunSuite wit
LuciferYang commented on PR #47378:
URL: https://github.com/apache/spark/pull/47378#issuecomment-2232303765
Thank you for pinging me, @HyukjinKwon
panbingkun commented on code in PR #47382:
URL: https://github.com/apache/spark/pull/47382#discussion_r1680361637
##
common/unsafe/src/test/scala/org/apache/spark/unsafe/types/CollationFactorySuite.scala:
##
@@ -154,8 +151,8 @@ class CollationFactorySuite extends AnyFunSuite wit
panbingkun closed pull request #47377: [SPARK-48917][BUILD] Upgrade tink to
1.14.0
URL: https://github.com/apache/spark/pull/47377
panbingkun commented on PR #47377:
URL: https://github.com/apache/spark/pull/47377#issuecomment-2232309413
> Unfortunately, it seems that we had a previous PR and we decided to close,
@panbingkun .
>
> * [[SPARK-48814][BUILD] Upgrade `tink` to 1.14.0
#47221](https://github.com/apache
dongjoon-hyun commented on PR #47382:
URL: https://github.com/apache/spark/pull/47382#issuecomment-2232315194
Also, cc @dbatomic , @cloud-fan , @MaxGekk from the original PR.
- #44968
LuciferYang commented on code in PR #46849:
URL: https://github.com/apache/spark/pull/46849#discussion_r1680369394
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/UserDefinedFunctionE2ETestSuite.scala:
##
@@ -388,6 +378,66 @@ class UserDefinedFunctionE2ETestS
LuciferYang commented on PR #47378:
URL: https://github.com/apache/spark/pull/47378#issuecomment-2232321196
> I am not sure how much of an issue this is since we use SBT for CI.
There are now multiple daily test jobs that use Maven.
HyukjinKwon commented on code in PR #46849:
URL: https://github.com/apache/spark/pull/46849#discussion_r1680376356
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/UserDefinedFunctionE2ETestSuite.scala:
##
@@ -388,6 +378,66 @@ class UserDefinedFunctionE2ETestS
dongjoon-hyun commented on PR #47379:
URL: https://github.com/apache/spark/pull/47379#issuecomment-2232348708
Merged to branch-3.5 for Apache Spark 3.5.2. Thank you all!
dongjoon-hyun closed pull request #47379: [SPARK-48920][BUILD][3.5] Upgrade ORC
to 1.9.4
URL: https://github.com/apache/spark/pull/47379
viirya commented on PR #47380:
URL: https://github.com/apache/spark/pull/47380#issuecomment-2232423385
> Do you happen to know which JIRA issue is related to this regression,
@viirya ?
>
> > after upgrading to Spark 3.4.
Thank you for the review, @dongjoon-hyun.
It is not ca
viirya commented on PR #47380:
URL: https://github.com/apache/spark/pull/47380#issuecomment-2232425180
I re-triggered the failed `Run Docker integration tests`.
All CI checks pass now:
https://github.com/viirya/spark-1/actions/runs/9967182407/job/27542878853
dongjoon-hyun commented on PR #47380:
URL: https://github.com/apache/spark/pull/47380#issuecomment-2232428003
Got it. Feel free to merge and backport, @viirya ~
viirya commented on PR #47380:
URL: https://github.com/apache/spark/pull/47380#issuecomment-2232429329
Thank you @dongjoon-hyun. I will keep it open for a day and merge if there are
no more comments.
LuciferYang commented on PR #47378:
URL: https://github.com/apache/spark/pull/47378#issuecomment-2232430016
@hvanhovell local run
```
build/mvn clean install -DskipTests -Phive
build/mvn test -pl connector/connect/client/jvm -Phive
```
then
```
[ERROR] Test
LuciferYang commented on PR #47378:
URL: https://github.com/apache/spark/pull/47378#issuecomment-2232457945
local run
```
./dev/test-dependencies.sh --replace-manifest
git diff
```
```
diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3
b/dev/deps/spark-deps-hadoop-
LuciferYang commented on code in PR #47378:
URL: https://github.com/apache/spark/pull/47378#discussion_r1680434473
##
project/SparkBuild.scala:
##
@@ -674,23 +664,76 @@ object SparkConnectCommon {
// Exclude `scala-library` from assembly.
(assembly / assemblyPackageSca
LuciferYang commented on PR #47378:
URL: https://github.com/apache/spark/pull/47378#issuecomment-2232464638
https://github.com/apache/spark/blob/3a245558be882ae94f507976e4e4fb8c1d9bf344/dev/sparktestsupport/modules.py#L323-L334
Although the `connect-api` module does not have test case