[jira] [Updated] (SPARK-51252) Adding state store level metrics for last uploaded snapshot version in HDFS State Stores

2025-02-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-51252: --- Labels: pull-request-available (was: ) > Adding state store level metrics for last uploaded

[jira] [Updated] (SPARK-51274) PySparkLogger should respect the expected keyword arguments.

2025-02-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-51274: --- Labels: pull-request-available (was: ) > PySparkLogger should respect the expected keyword

[jira] [Created] (SPARK-51273) Spark Connect Call Procedure runs the procedure twice

2025-02-20 Thread Szehon Ho (Jira)
Szehon Ho created SPARK-51273: - Summary: Spark Connect Call Procedure runs the procedure twice Key: SPARK-51273 URL: https://issues.apache.org/jira/browse/SPARK-51273 Project: Spark Issue Type: B

[jira] [Created] (SPARK-51274) PySparkLogger should respect the expected keyword arguments.

2025-02-20 Thread Takuya Ueshin (Jira)
Takuya Ueshin created SPARK-51274: - Summary: PySparkLogger should respect the expected keyword arguments. Key: SPARK-51274 URL: https://issues.apache.org/jira/browse/SPARK-51274 Project: Spark

[jira] [Updated] (SPARK-51185) Revert simplifications to PartitionedFileUtil API due to increased risk of OOM

2025-02-20 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-51185: Fix Version/s: 3.5.5 > Revert simplifications to PartitionedFileUtil API due to increased risk of

[jira] [Created] (SPARK-51272) Race condition in DagScheduler can result in failure of retrying all partitions for non deterministic partitioning key

2025-02-20 Thread Asif (Jira)
Asif created SPARK-51272: Summary: Race condition in DagScheduler can result in failure of retrying all partitions for non deterministic partitioning key Key: SPARK-51272 URL: https://issues.apache.org/jira/browse/SPARK-5

[jira] [Updated] (SPARK-51272) Race condition in DagScheduler can result in failure of retrying all partitions for non deterministic partitioning key

2025-02-20 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-51272: - Attachment: BugTest.txt > Race condition in DagScheduler can result in failure of retrying all > partitions for

[jira] [Commented] (SPARK-51272) Race condition in DagScheduler can result in failure of retrying all partitions for non deterministic partitioning key

2025-02-20 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17928944#comment-17928944 ] Asif commented on SPARK-51272: -- [^BugTest.txt] [^bugrepro.patch] > Race condition in Da

[jira] [Updated] (SPARK-51272) Race condition in DagScheduler can result in failure of retrying all partitions for non deterministic partitioning key

2025-02-20 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-51272: - Labels: spark-core (was: ) > Race condition in DagScheduler can result in failure of retrying all > partitions

[jira] [Updated] (SPARK-51272) Race condition in DagScheduler can result in failure of retrying all partitions for non deterministic partitioning key

2025-02-20 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asif updated SPARK-51272: - Attachment: bugrepro.patch > Race condition in DagScheduler can result in failure of retrying all > partitions

[jira] [Assigned] (SPARK-51275) Session propagation in python readwrite

2025-02-20 Thread Ruifeng Zheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng reassigned SPARK-51275: - Assignee: Ruifeng Zheng > Session propagation in python readwrite > ---

[jira] [Updated] (SPARK-51262) exceptAll not working with drop_duplicates using subset

2025-02-20 Thread Nicolau Balbino (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolau Balbino updated SPARK-51262: Description: When using drop_duplicate with subset and after use exceptAll method, when c

[jira] [Updated] (SPARK-51273) Spark Connect Call Procedure runs the procedure twice

2025-02-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-51273: --- Labels: pull-request-available (was: ) > Spark Connect Call Procedure runs the procedure tw

[jira] [Updated] (SPARK-50864) Optimize and reeanble slow pytorch tests

2025-02-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-50864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-50864: --- Labels: pull-request-available (was: ) > Optimize and reeanble slow pytorch tests > ---

[jira] [Updated] (SPARK-51278) Use appropriate structure of JSON format for PySparkLogger

2025-02-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-51278: --- Labels: pull-request-available (was: ) > Use appropriate structure of JSON format for PySpa

[jira] [Resolved] (SPARK-51249) Fix version encoding bug in NoPrefixKeyStateEncoder

2025-02-20 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-51249. -- Resolution: Fixed Issue resolved by pull request 49996 [https://github.com/apache/spark/pull/4

[jira] [Assigned] (SPARK-51249) Fix version encoding bug in NoPrefixKeyStateEncoder

2025-02-20 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-51249: Assignee: Eric Marnadi > Fix version encoding bug in NoPrefixKeyStateEncoder > --

[jira] [Updated] (SPARK-50864) Optimize and reeanble slow pytorch tests

2025-02-20 Thread Ruifeng Zheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-50864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng updated SPARK-50864: -- Affects Version/s: 4.1 > Optimize and reeanble slow pytorch tests > --

[jira] [Created] (SPARK-51278) Use appropriate structure of JSON format for PySparkLogger

2025-02-20 Thread Haejoon Lee (Jira)
Haejoon Lee created SPARK-51278: --- Summary: Use appropriate structure of JSON format for PySparkLogger Key: SPARK-51278 URL: https://issues.apache.org/jira/browse/SPARK-51278 Project: Spark Issu

[jira] [Updated] (SPARK-51272) Race condition in DagScheduler can result in failure of retrying all partitions for non deterministic partitioning key

2025-02-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-51272: --- Labels: pull-request-available spark-core (was: spark-core) > Race condition in DagSchedule

[jira] [Created] (SPARK-51276) Enable spark.sql.execution.arrow.pyspark.enabled by default

2025-02-20 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-51276: Summary: Enable spark.sql.execution.arrow.pyspark.enabled by default Key: SPARK-51276 URL: https://issues.apache.org/jira/browse/SPARK-51276 Project: Spark

[jira] [Updated] (SPARK-51276) Enable spark.sql.execution.arrow.pyspark.enabled by default

2025-02-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-51276: --- Labels: pull-request-available (was: ) > Enable spark.sql.execution.arrow.pyspark.enabled b

[jira] [Updated] (SPARK-51276) Enable spark.sql.execution.arrow.pyspark.enabled by default

2025-02-20 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-51276: - Labels: release-notes (was: pull-request-available) > Enable spark.sql.execution.arrow.pyspark.

[jira] [Updated] (SPARK-48516) Turn on Arrow optimization for Python UDFs by default

2025-02-20 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-48516: - Labels: pull-request-available release-notes (was: pull-request-available) > Turn on Arrow opti

[jira] [Created] (SPARK-51277) Implement 0-arg implementation in Arrow-optimized Python UDF

2025-02-20 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-51277: Summary: Implement 0-arg implementation in Arrow-optimized Python UDF Key: SPARK-51277 URL: https://issues.apache.org/jira/browse/SPARK-51277 Project: Spark

[jira] [Comment Edited] (SPARK-51016) result data compromised in case of indeterministic join keys in Outer Join op, when retry happens

2025-02-20 Thread Asif (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17924245#comment-17924245 ] Asif edited comment on SPARK-51016 at 2/20/25 11:20 PM: Pull Req

[jira] [Created] (SPARK-51275) Session propagation in python readwrite

2025-02-20 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-51275: - Summary: Session propagation in python readwrite Key: SPARK-51275 URL: https://issues.apache.org/jira/browse/SPARK-51275 Project: Spark Issue Type: Sub-tas

[jira] [Updated] (SPARK-51275) Session propagation in python readwrite

2025-02-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-51275: --- Labels: pull-request-available (was: ) > Session propagation in python readwrite >

[jira] [Resolved] (SPARK-51267) Match local Spark Connect server logic between Python and Scala

2025-02-20 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-51267. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 50017 [https://gi

[jira] [Assigned] (SPARK-51267) Match local Spark Connect server logic between Python and Scala

2025-02-20 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-51267: Assignee: Hyukjin Kwon > Match local Spark Connect server logic between Python and Scala

[jira] [Resolved] (SPARK-48530) [M1] Support for local variables

2025-02-20 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-48530. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 49445 [https://gith

[jira] [Assigned] (SPARK-48530) [M1] Support for local variables

2025-02-20 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-48530: --- Assignee: David Milicevic > [M1] Support for local variables >

[jira] [Updated] (SPARK-51269) SQLConf should manage the default value for avro compression level

2025-02-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-51269: --- Labels: pull-request-available (was: ) > SQLConf should manage the default value for avro c

[jira] [Updated] (SPARK-51269) Let the SQLConf control the default value for compression level

2025-02-20 Thread Jiaan Geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiaan Geng updated SPARK-51269: --- Description: Currently, the default value of spark.sql.avro.deflate.level is -1. But it managed wit

[jira] [Updated] (SPARK-51269) SQLConf should manage the default value for compression level

2025-02-20 Thread Jiaan Geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiaan Geng updated SPARK-51269: --- Summary: SQLConf should manage the default value for compression level (was: Let the SQLConf contro

[jira] [Updated] (SPARK-51263) Clean up unnecessary `invokePrivate` method calls in the test code.

2025-02-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-51263: --- Labels: pull-request-available (was: ) > Clean up unnecessary `invokePrivate` method calls

[jira] [Updated] (SPARK-49489) 'TTransportException: MaxMessageSize reached' occurs if get a table with a large number of partitions

2025-02-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-49489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-49489: --- Labels: pull-request-available (was: ) > 'TTransportException: MaxMessageSize reached' occu

[jira] [Created] (SPARK-51270) Support new Variant types UUID, Time, and nanosecond timestamp

2025-02-20 Thread David Cashman (Jira)
David Cashman created SPARK-51270: - Summary: Support new Variant types UUID, Time, and nanosecond timestamp Key: SPARK-51270 URL: https://issues.apache.org/jira/browse/SPARK-51270 Project: Spark

[jira] [Updated] (SPARK-51270) Support new Variant types UUID, Time, and nanosecond timestamp

2025-02-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-51270: --- Labels: pull-request-available (was: ) > Support new Variant types UUID, Time, and nanoseco

[jira] [Assigned] (SPARK-51266) Remove the no longer used definition of `private[spark] object TaskDetailsClassNames`

2025-02-20 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-51266: - Assignee: Yang Jie > Remove the no longer used definition of `private[spark] object >

[jira] [Resolved] (SPARK-51266) Remove the no longer used definition of `private[spark] object TaskDetailsClassNames`

2025-02-20 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-51266. --- Fix Version/s: 4.1.0 Resolution: Fixed Issue resolved by pull request 50016 [https://

[jira] [Updated] (SPARK-51266) Remove the no longer used definition of `private[spark] object TaskDetailsClassNames`

2025-02-20 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-51266: -- Parent: SPARK-51166 Issue Type: Sub-task (was: Improvement) > Remove the no longer us

[jira] [Updated] (SPARK-50785) [M1] FOR statement improvements

2025-02-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-50785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-50785: --- Labels: pull-request-available (was: ) > [M1] FOR statement improvements >

[jira] [Created] (SPARK-51271) Python Data Sources Filter Pushdown API

2025-02-20 Thread Haoyu Weng (Jira)
Haoyu Weng created SPARK-51271: -- Summary: Python Data Sources Filter Pushdown API Key: SPARK-51271 URL: https://issues.apache.org/jira/browse/SPARK-51271 Project: Spark Issue Type: New Feature

[jira] [Assigned] (SPARK-51097) Adding state store level metrics for last uploaded snapshot version in RocksDB

2025-02-20 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-51097: Assignee: Zeyu Chen > Adding state store level metrics for last uploaded snapshot version

[jira] [Resolved] (SPARK-51097) Adding state store level metrics for last uploaded snapshot version in RocksDB

2025-02-20 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-51097. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 49816 [https://gi

[jira] [Updated] (SPARK-51269) SQLConf should manage the default value for avro compression level

2025-02-20 Thread Jiaan Geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiaan Geng updated SPARK-51269: --- Summary: SQLConf should manage the default value for avro compression level (was: SQLConf should ma

[jira] [Updated] (SPARK-51092) Skip the v1 FlatMapGroupsWithState tests with timeout on big endian platforms

2025-02-20 Thread Jonathan Albrecht (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Albrecht updated SPARK-51092: -- Summary: Skip the v1 FlatMapGroupsWithState tests with timeout on big endian platforms

[jira] [Created] (SPARK-51269) Let the SQLConf control the default value for compression level

2025-02-20 Thread Jiaan Geng (Jira)
Jiaan Geng created SPARK-51269: -- Summary: Let the SQLConf control the default value for compression level Key: SPARK-51269 URL: https://issues.apache.org/jira/browse/SPARK-51269 Project: Spark

[jira] [Updated] (SPARK-51092) Skip the v1 FlatMapGroupsWithState tests with timeout on big endian platforms

2025-02-20 Thread Jonathan Albrecht (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Albrecht updated SPARK-51092: -- Description: The {{dataType}} of the {{timestampTimeoutAttribute}} is IntegerType in

[jira] [Resolved] (SPARK-51092) Skip the v1 FlatMapGroupsWithState tests with timeout on big endian platforms

2025-02-20 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-51092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-51092. -- Fix Version/s: 4.0.0 Assignee: Jonathan Albrecht Resolution: Fixed Issue resol