date:20250308

Re: [PR] [SPARK-49479][CORE] Cancel the Timer non-daemon thread on stopping the BarrierCoordinator [spark]

2025-03-08 Thread via GitHub

jjayadeep06 commented on code in PR #50020: URL: https://github.com/apache/spark/pull/50020#discussion_r1983092651 ## core/src/main/scala/org/apache/spark/BarrierCoordinator.scala: ## @@ -80,8 +82,13 @@ private[spark] class BarrierCoordinator( states.forEachValue(1, clear

Re: [PR] [SPARK-51436][CORE][SQL][K8s][SS] Fix bug that cancel Future specified mayInterruptIfRunning with true [spark]

2025-03-08 Thread via GitHub

srowen commented on code in PR #50209: URL: https://github.com/apache/spark/pull/50209#discussion_r1986124610 ## connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/consumer/FetchedDataPool.scala: ## @@ -139,7 +139,7 @@ private[consumer] class FetchedDataPool(

Re: [PR] [SPARK-51436][CORE][SQL][K8s][SS] Fix bug that cancel Future specified mayInterruptIfRunning with true [spark]

2025-03-08 Thread via GitHub

srowen commented on code in PR #50209: URL: https://github.com/apache/spark/pull/50209#discussion_r1986206274 ## connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/consumer/FetchedDataPool.scala: ## @@ -139,7 +139,7 @@ private[consumer] class FetchedDataPool(

Re: [PR] [SPARK-51436][CORE][SQL][K8s][SS] Fix bug that cancel Future specified mayInterruptIfRunning with true [spark]

2025-03-08 Thread via GitHub

beliefer commented on code in PR #50209: URL: https://github.com/apache/spark/pull/50209#discussion_r1986207234 ## connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/consumer/FetchedDataPool.scala: ## @@ -139,7 +139,7 @@ private[consumer] class FetchedDataPool

Re: [PR] [SPARK-51338][INFRA] Add automated CI build for `connect-examples` [spark]

2025-03-08 Thread via GitHub

srowen commented on PR #50187: URL: https://github.com/apache/spark/pull/50187#issuecomment-2708415916 Can the examples module simply point to SNAPSHOT versions like everything else in the build? the main branch code is always pointing at unreleased code, but on release, those SNAPSHOT vers

Re: [PR] [SPARK-51338][INFRA] Add automated CI build for `connect-examples` [spark]

2025-03-08 Thread via GitHub

LuciferYang commented on PR #50187: URL: https://github.com/apache/spark/pull/50187#issuecomment-2708419792 > Can the examples module simply point to SNAPSHOT versions like everything else in the build? the main branch code is always pointing at unreleased code, but on release, those SNAPSH

Re: [PR] [SPARK-51338] Add automated CI build for `connect-examples` [spark]

2025-03-08 Thread via GitHub

hvanhovell commented on PR #50187: URL: https://github.com/apache/spark/pull/50187#issuecomment-2704355339 @HyukjinKwon can you take a look? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

[PR] Change host to ip [spark]

2025-03-08 Thread via GitHub

AryelSouza opened a new pull request, #50216: URL: https://github.com/apache/spark/pull/50216 An issue was opened about the necessity of changing host to ip because ip would be deprecated for the documentation -- This is an automated message from the Apache Git Service. To respond to th

Re: [PR] [SPARK-51182][SQL] DataFrameWriter should throw dataPathNotSpecifiedError when path is not specified [spark]

2025-03-08 Thread via GitHub

vrozov commented on PR #49928: URL: https://github.com/apache/spark/pull/49928#issuecomment-2708396116 @cloud-fan @LuciferYang I prefer to keep the test in the java as it does not hurt and 1. There is similar test in R even though it is not R specific 2. Other tests in `JavaDataFr

Re: [PR] [SPARK-51436][CORE][SQL][K8s][SS] Fix bug that cancel Future specified mayInterruptIfRunning with true [spark]

2025-03-08 Thread via GitHub

beliefer commented on code in PR #50209: URL: https://github.com/apache/spark/pull/50209#discussion_r1986190446 ## connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/consumer/FetchedDataPool.scala: ## @@ -139,7 +139,7 @@ private[consumer] class FetchedDataPool

Re: [PR] [SPARK-51436][CORE][SQL][K8s][SS] Fix bug that cancel Future specified mayInterruptIfRunning with true [spark]

2025-03-08 Thread via GitHub

srowen commented on code in PR #50209: URL: https://github.com/apache/spark/pull/50209#discussion_r1986190918 ## connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/consumer/FetchedDataPool.scala: ## @@ -139,7 +139,7 @@ private[consumer] class FetchedDataPool(

Re: [PR] Remove session string calls [spark]

2025-03-08 Thread via GitHub

github-actions[bot] commented on PR #48974: URL: https://github.com/apache/spark/pull/48974#issuecomment-2708582247 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-51436][CORE][SQL][K8s][SS] Fix bug that cancel Future specified mayInterruptIfRunning with true [spark]

2025-03-08 Thread via GitHub

beliefer commented on code in PR #50209: URL: https://github.com/apache/spark/pull/50209#discussion_r1986202471 ## connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/consumer/FetchedDataPool.scala: ## @@ -139,7 +139,7 @@ private[consumer] class FetchedDataPool

Re: [PR] [SPARK-51436][CORE][SQL][K8s][SS] Fix bug that cancel Future specified mayInterruptIfRunning with true [spark]

2025-03-08 Thread via GitHub

beliefer commented on code in PR #50209: URL: https://github.com/apache/spark/pull/50209#discussion_r1986202471 ## connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/consumer/FetchedDataPool.scala: ## @@ -139,7 +139,7 @@ private[consumer] class FetchedDataPool

Re: [PR] [SPARK-51365][SQL][TESTS] Reduce `SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD/RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD` for tests related to `SharedSparkSession/TestHive` when using `macOS + Apple S

2025-03-08 Thread via GitHub

dongjoon-hyun commented on code in PR #50206: URL: https://github.com/apache/spark/pull/50206#discussion_r1986220526 ## .github/workflows/build_maven_java21_macos15.yml: ## @@ -36,5 +36,9 @@ jobs: os: macos-15 envs: >- { - "OBJC_DISABLE_INITIALIZE

Re: [PR] [SPARK-51365][SQL][TESTS] Reduce `SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD/RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD` for tests related to `SharedSparkSession/TestHive` when using `macOS + Apple S

2025-03-08 Thread via GitHub

LuciferYang commented on code in PR #50206: URL: https://github.com/apache/spark/pull/50206#discussion_r1986220726 ## .github/workflows/build_maven_java21_macos15.yml: ## @@ -36,5 +36,9 @@ jobs: os: macos-15 envs: >- { - "OBJC_DISABLE_INITIALIZE_F

Re: [PR] [SPARK-51365][SQL][TESTS] Reduce `SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD/RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD` for tests related to `SharedSparkSession/TestHive` when using `macOS + Apple S

2025-03-08 Thread via GitHub

dongjoon-hyun commented on code in PR #50206: URL: https://github.com/apache/spark/pull/50206#discussion_r1986220526 ## .github/workflows/build_maven_java21_macos15.yml: ## @@ -36,5 +36,9 @@ jobs: os: macos-15 envs: >- { - "OBJC_DISABLE_INITIALIZE

[PR] [SPARK-51444][CORE] Remove unreachable code from `TaskSchedulerImpl#statusUpdate` [spark]

2025-03-08 Thread via GitHub

LuciferYang opened a new pull request, #50218: URL: https://github.com/apache/spark/pull/50218 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-51402][SQL][TESTS] Test TimeType in UDF [spark]

2025-03-08 Thread via GitHub

MaxGekk commented on code in PR #50194: URL: https://github.com/apache/spark/pull/50194#discussion_r1986236836 ## dev/create-release/release-build.sh: ## @@ -137,6 +137,12 @@ if [[ "$1" == "finalize" ]]; then --repository-url https://upload.pypi.org/legacy/ \ "pyspark_

Re: [PR] [SPARK-51436][CORE][SQL][K8s][SS] Fix bug that cancel Future specified mayInterruptIfRunning with true [spark]

2025-03-08 Thread via GitHub

beliefer commented on PR #50209: URL: https://github.com/apache/spark/pull/50209#issuecomment-2708241246 ping @srowen @dongjoon-hyun @LuciferYang -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [SPARK-51429][Connect] Add "Acknowledgement" message to ExecutePlanResponse [spark]

2025-03-08 Thread via GitHub

vicennial commented on PR #50193: URL: https://github.com/apache/spark/pull/50193#issuecomment-2708400817 Putting this on hold ATM as some unexpected complications popped up (particularly the interactions with response indices and response caching) -- This is an automated message from the

Re: [PR] [SPARK-51298][SQL] Support variant in CSV scan [spark]

2025-03-08 Thread via GitHub

sandip-db commented on code in PR #50052: URL: https://github.com/apache/spark/pull/50052#discussion_r1986121542 ## sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala: ## @@ -760,12 +760,9 @@ class CsvFunctionsSuite extends QueryTest with SharedSparkSession {

Re: [PR] [SPARK-50763][SQL] Add Analyzer rule for resolving SQL table functions [spark]

2025-03-08 Thread via GitHub

cloud-fan commented on code in PR #49471: URL: https://github.com/apache/spark/pull/49471#discussion_r1982863500 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala: ## @@ -1675,6 +1676,91 @@ class SessionCatalog( } } + /** + *

Re: [PR] [SPARK-50892][SQL]Add UnionLoopExec, physical operator for recursion, to perform execution of recursive queries [spark]

2025-03-08 Thread via GitHub

cloud-fan commented on PR #49955: URL: https://github.com/apache/spark/pull/49955#issuecomment-2705416036 @peter-toth Ideally recursive CTE should stop if the last iteration generates no data. Pushing down the LIMIT and applying an early stop is an optimization and should not change the qu

[PR] [SPARK-51443] Fix singleVariantColumn in DSv2 and readStream. [spark]

2025-03-08 Thread via GitHub

chenhao-db opened a new pull request, #50217: URL: https://github.com/apache/spark/pull/50217 ### What changes were proposed in this pull request? The current JSON `singleVariantColumn` mode doesn't work in DSv2 and `spark.readStream`. This PR fixes the two cases: - DSv1 calls `Jso

Re: [PR] [SPARK-45265][SQL] Support Hive 4.0 metastore [spark]

2025-03-08 Thread via GitHub

hidataplus commented on code in PR #48823: URL: https://github.com/apache/spark/pull/48823#discussion_r1986028539 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala: ## @@ -1030,7 +1030,7 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, h

Re: [PR] [SPARK-51338][INFRA] Add automated CI build for `connect-examples` [spark]

2025-03-08 Thread via GitHub

LuciferYang commented on PR #50187: URL: https://github.com/apache/spark/pull/50187#issuecomment-2708296150 I've noticed a rather peculiar issue here. It seems that the `connect-examples` project is dependent on a released version of Spark, which means we can only update to a new version af

[PR] [SPARK-51359][CORE][SQL] Set INT64 as the default timestamp type for Parquet files [spark]

2025-03-08 Thread via GitHub

ganeshashree opened a new pull request, #50215: URL: https://github.com/apache/spark/pull/50215 ### What changes were proposed in this pull request? Changes done to set INT64 as the default timestamp type for Parquet files. ### Why are the changes needed?

Re: [PR] [SPARK-43221][CORE] Host local block fetching should use a block status of a block stored on disk [spark]

2025-03-08 Thread via GitHub

attilapiros commented on code in PR #50122: URL: https://github.com/apache/spark/pull/50122#discussion_r1986090749 ## core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala: ## @@ -474,6 +474,26 @@ class BlockManagerSuite extends SparkFunSuite with Matchers with P

Re: [PR] [SPARK-45265][SQL] Support Hive 4.0 metastore [spark]

2025-03-08 Thread via GitHub

hidataplus commented on code in PR #48823: URL: https://github.com/apache/spark/pull/48823#discussion_r1986020465 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala: ## @@ -177,8 +179,10 @@ private[hive] class HiveClientImpl( // got changed. We

Re: [PR] [SPARK-51365][SQL][TESTS] Reduce `SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD/RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD` for tests related to `SharedSparkSession/TestHive` when using `macOS + Apple S

2025-03-08 Thread via GitHub

LuciferYang commented on PR #50206: URL: https://github.com/apache/spark/pull/50206#issuecomment-2708248561 The PR title and description will be updated after the finalization of the plan. -- This is an automated message from the Apache Git Service. To respond to the message, plea

Re: [PR] [SPARK-51182][SQL] DataFrameWriter should throw dataPathNotSpecifiedError when path is not specified [spark]

2025-03-08 Thread via GitHub

LuciferYang commented on PR #49928: URL: https://github.com/apache/spark/pull/49928#issuecomment-2708280968 > @vrozov can you remove the java test? [#49928 (comment)](https://github.com/apache/spark/pull/49928#discussion_r1981021185) +1, for @cloud-fan's comments -- This is an auto

32 matches
