user

Messages by Thread

[ANNOUNCE] Apache Spark Connect Swift Client 0.7.0 released Dongjoon Hyun
[ANNOUNCE] Apache Spark 4.2.0 released huaxin gao
[ANNOUNCE] Apache Spark 4.0.3 released Dongjoon Hyun
[SECURITY] Request to bump bundled Netty and ZooKeeper in PySpark (Blocks Enterprise Scanners) - [SPARK-57343] Alam, Shahnoor via user
- Re: [SECURITY] Request to bump bundled Netty and ZooKeeper in PySpark (Blocks Enterprise Scanners) - [SPARK-57343] Hyukjin Kwon
- Re: [External] Re: [SECURITY] Request to bump bundled Netty and ZooKeeper in PySpark (Blocks Enterprise Scanners) - [SPARK-57343] Alam, Shahnoor via user
- Re: [External] Re: [SECURITY] Request to bump bundled Netty and ZooKeeper in PySpark (Blocks Enterprise Scanners) - [SPARK-57343] Alam, Shahnoor via user
- Re: [External] Re: [SECURITY] Request to bump bundled Netty and ZooKeeper in PySpark (Blocks Enterprise Scanners) - [SPARK-57343] Hyukjin Kwon
Seeking Guidance on Spark Contribution (SPARK-56891) Y. Hitesh Reddy
- Re: Seeking Guidance on Spark Contribution (SPARK-56891) Y. Hitesh Reddy
Apache Spark Community Event in Bengaluru, India Rudresh Holani
[Spark XML] Reading an XML after setting ignoreNamespace option leads to an empty DataFrame in pyspark 4.1.1 Johannes Bock
[ANNOUNCE] Apache Spark 4.1.2 released Peter Toth
[ANNOUNCE] Apache Spark Kubernetes Operator 0.9.0 Released Dongjoon Hyun
[ANNOUNCE] Apache Sedona 1.9.0 released Jia Yu
[ANNOUNCE] Apache Celeborn 0.6.3 available Nicholas
[ANNOUNCE] Apache Spark Connect Swift Client 0.6.0 released Dongjoon Hyun
Spark + Hive 4 Integration Guide (Practical Approach) Mich Talebzadeh
[ANNOUNCE] Apache Spark Kubernetes Operator 0.8.0 Released Dongjoon Hyun
CVE-2025-54920: Apache Spark: Spark History Server Code Execution Vulnerability Holden Karau
- Re: CVE-2025-54920: Apache Spark: Spark History Server Code Execution Vulnerability Tres Pittman
- Re: CVE-2025-54920: Apache Spark: Spark History Server Code Execution Vulnerability Holden Karau
- Re: CVE-2025-54920: Apache Spark: Spark History Server Code Execution Vulnerability Tres Pittman
unsuscribe Jose Zuccoli via user
[Community] Seeking advice on PR follow-up process Sergei Repnikov
- Re: [Community] Seeking advice on PR follow-up process Mich Talebzadeh
[FYI] Applications for Travel Assistance to Community Over Code Glasgow now open Dongjoon Hyun
Participation in spark user community Mich Talebzadeh
- Re: Participation in spark user community David Edwards
- Re: Participation in spark user community Mich Talebzadeh
[ANNOUNCE] Apache Spark 4.0.2 released Dongjoon Hyun
- Re: [ANNOUNCE] Apache Spark 4.0.2 released Fengyu Cao
- Re: [ANNOUNCE] Apache Spark 4.0.2 released Dongjoon Hyun
- Re: [ANNOUNCE] Apache Spark 4.0.2 released Fengyu Cao
Issue with $0 Resolution in ksh/bash Scripts When sudo Digest Is Enabled on RHEL 8.10 Satyendra Kumar Paterya via user
- Re: Issue with $0 Resolution in ksh/bash Scripts When sudo Digest Is Enabled on RHEL 8.10 Pasha Finkelshtein
Re: user Digest 20 Jan 2026 04:44:25 -0000 Issue 12172 Matthew Cook
- Re: user Digest 20 Jan 2026 04:44:25 -0000 Issue 12172 DaeJin Jung
[ANNOUNCE] Apache Spark 3.5.8 released Dongjoon Hyun
- Re: [ANNOUNCE] Apache Spark 3.5.8 released Wenchen Fan
[ANNOUNCE] Apache Spark Kubernetes Operator 0.7.0 released Dongjoon Hyun
[ANNOUNCE] Apache Spark Connect Swift Client 0.5.0 released Dongjoon Hyun
[ANNOUNCE] Apache Sedona 1.8.1 released Jia Yu
[QUESTION] Stability of FileScanRDD partitions mapping Artsiom Samasadau
[ANNOUNCE] Apache Kyuubi v1.10.3 is available Akira Ajisaka
[ANNOUNCE] Apache Kyuubi v1.11.0 is available Cheng Pan
[ANNOUNCE] Apache Celeborn 0.6.2 available Nicholas
[Spark SQL] Observation on empty-after-filter DF never materializes (PySpark 3.5.0 / Scala 2.12) Matúš Letko (CZ) via user
Is it safe to use ThreadLocalRandom inside UDAF? Sem
[DISCUSS] Possible workaround for Hadoop ZStandardDecompressor errors triggered by small ZSTD files? FengYu Cao
- Re: [DISCUSS] Possible workaround for Hadoop ZStandardDecompressor errors triggered by small ZSTD files? Ángel Álvarez Pascua
- Re: [DISCUSS] Possible workaround for Hadoop ZStandardDecompressor errors triggered by small ZSTD files? FengYu Cao
- Re: [DISCUSS] Possible workaround for Hadoop ZStandardDecompressor errors triggered by small ZSTD files? Ángel Álvarez Pascua
Has anyone upgraded to using TLS for RPC and Shuffle in Spark 3.* versions ? Guruprasad Veerannavaru
[ANNOUNCE] Apache Spark 4.1.0-preview4 released Dongjoon Hyun
[ANNOUNCE] Apache Spark Kubernetes Operator 0.6.0 released Dongjoon Hyun
- Re: [ANNOUNCE] Apache Spark Kubernetes Operator 0.6.0 released Peter Toth
Palo Alto Data Analytics Platform Meetup Kang, Kenneth
[DISCUSS] Different output precision of now() in Spark 3.5.1 between Java 8 and Java 21 Yu Hong
OutOfMemoryError in EMR for Iceberg Metadata Query Execution Nipuna Shantha
- Re: OutOfMemoryError in EMR for Iceberg Metadata Query Execution Soumasish
Question: Unexpected behavior and potential correctness issue when using groupBy and count_distinct + percentile FengYu Cao
- Re: Question: Unexpected behavior and potential correctness issue when using groupBy and count_distinct + percentile Herman van Hovell
- Re: Question: Unexpected behavior and potential correctness issue when using groupBy and count_distinct + percentile FengYu Cao
- Re: Question: Unexpected behavior and potential correctness issue when using groupBy and count_distinct + percentile FengYu Cao
CVE-2025-55039: Apache Spark, Apache Spark: RPC encryption defaults to unauthenticated AES-CTR mode, enabling man-in-the-middle ciphertext modification attacks Holden Karau
- Re: CVE-2025-55039: Apache Spark, Apache Spark: RPC encryption defaults to unauthenticated AES-CTR mode, enabling man-in-the-middle ciphertext modification attacks Akira Ajisaka
Broken preview link Khizar Qureshi
- Re: Broken preview link Hyukjin Kwon
[ANNOUNCE] Apache Spark Kubernetes Operator 0.5.0 released Dongjoon Hyun
- Re: [ANNOUNCE] Apache Spark Kubernetes Operator 0.5.0 released Peter Toth
[ANNOUNCE] Apache Spark Connect Swift Client 0.4.0 released Dongjoon Hyun
[ANNOUNCE] Apache Spark 3.5.7 released Peter Toth
- Re: [ANNOUNCE] Apache Spark 3.5.7 released Dongjoon Hyun
- Re: [ANNOUNCE] Apache Spark 3.5.7 released Nebi Aydin
- RE: [ANNOUNCE] Apache Spark 3.5.7 released Appel, Kevin
- Re: [ANNOUNCE] Apache Spark 3.5.7 released Peter Toth
- Re: [ANNOUNCE] Apache Spark 3.5.7 released Dongjoon Hyun
Boston Spark Meetup Dana Rosenfarb
- RE: Boston Spark Meetup Jules Damji
[ANNOUNCE] Apache Sedona 1.8.0 released Jia Yu
[ANNOUNCE] Apache Celeborn 0.6.1 available Nicholas
[ANNOUNCE] Apache Spark 4.0.1 released Dongjoon Hyun
- Re: [ANNOUNCE] Apache Spark 4.0.1 released Hyukjin Kwon
[Structured Streaming] failOnDataLoss "false alarm" with Kafka retention policy Christian Winkler
Performance impact of enabling spark.rdd.compres = true Guruprasad Veerannavaru
Reading from MinIO I get SSL errors. Hamish Whittal
[Spark SQL] [How-to] Can columns be excluded from a scan performed as part of an update? William Muesing
Support Required: Issue with PySpark Code Execution Order Karthick N
- Re: Support Required: Issue with PySpark Code Execution Order Bjørn Jørgensen
- Re: Support Required: Issue with PySpark Code Execution Order Ángel Álvarez Pascua
- Re: Support Required: Issue with PySpark Code Execution Order Karthick N
- Re: Support Required: Issue with PySpark Code Execution Order Mich Talebzadeh
- Re: Support Required: Issue with PySpark Code Execution Order Karthick N
- Re: Support Required: Issue with PySpark Code Execution Order Mich Talebzadeh
Spark K8s auto scaling using Keda or similar tools Nimrod Ofek
[PySpark] [Beginner] [Debug] Does Spark ReadStream support reading from a MinIO bucket? Kleckner, Jade
- Re: [PySpark] [Beginner] [Debug] Does Spark ReadStream support reading from a MinIO bucket? 刘唯
- RE: [PySpark] [Beginner] [Debug] Does Spark ReadStream support reading from a MinIO bucket? Bhatt, Kashyap
[ANNOUNCE] Debo CLI v0.1.0 – Unified Hadoop Ecosystem Management Tool Surafel Temesgen
[SPARK-CORE] SerializationDebugger fails on Java 21 Clemens Ballarin
[SPARK-CONNECT] [SPARK-4.0] Encountered end-of-stream mid-frame Manas Bhardwaj
GraphFrames is back with v0.9.2! pip install graphframes-py :) Russell Jurney
Read duration, Write duration, Processing Time metrics Melika Ghiasi
Compatibility Issue: DescribeTopicsResult.all() missing in Kafka 4.0.0 used with Spark 4.0.0 Sandeep Ballu
[Spark SQL]: Python Data Source API and spark.sql.execution.pyspark.python Ilya
Regarding Obtaining Executor ID and GPU Binding in PySpark wuchaowei
[Spark SQL]: Spark 4 logs warning and stack trace when loading dataframe from path containing wildcard Glenn J
spark.api.mode property is not available in spark 4.0.0 Sangram Mohanty
Spark Job Stuck in Active State (v2.4.3, Cluster Mode) Hitesh Vaghela
- Re: Spark Job Stuck in Active State (v2.4.3, Cluster Mode) Ángel Álvarez Pascua
[Spark SQL]: Spark can't read views created via Trino using enableHiveSupport. Tal Haimov
Clarification on failOnDataLoss Behavior in Spark Structured Streaming with Kafka Nimrod Ofek
- Re: Clarification on failOnDataLoss Behavior in Spark Structured Streaming with Kafka Khalid Mammadov
- Re: Clarification on failOnDataLoss Behavior in Spark Structured Streaming with Kafka Nimrod Ofek
- Re: Clarification on failOnDataLoss Behavior in Spark Structured Streaming with Kafka Khalid Mammadov
- RE: Clarification on failOnDataLoss Behavior in Spark Structured Streaming with Kafka Wolfgang Buchner
- RE: Clarification on failOnDataLoss Behavior in Spark Structured Streaming with Kafka Wolfgang Buchner
Question about Spark Tag in TreeNode Yifan Li
[PR REVIEW] Glob based provider for history server Gaurav Waghmare
Spark checkpointing in batch mode fault tolerance problem Martin Aras
[ANNOUNCE] Apache Spark Kubernetes Operator 0.4.0 released Dongjoon Hyun
[PYSPARK] createDataFrame throws exception with Python 3.12.3 Eyck Troschke
Technical Guidance: Dynamic Resource Allocation + External Shuffle Storage Andrew M.
Inquiry About User Impersonation Support in Spark Thrift Server (Spark 1.x to 4.x) Allen Chu
What is the current canonical way to join more than 2 watermarked streams (Spark 3.5.6)? [email protected]
- Re: What is the current canonical way to join more than 2 watermarked streams (Spark 3.5.6)? Jungtaek Lim
pyspark4.0.0 still includes "jackson-mapper-asl.jar" that was supposed to be removed according to release note Haibo.Wang
- RE: pyspark4.0.0 still includes "jackson-mapper-asl.jar" that was supposed to be removed according to release note Haibo.Wang
Spark on kubernete, configmap add log4j2.properties data melin li
[SQL]: Registering spark extensions which utilise DataSourceV2Strategy in Spark 4 Jack Buggins
[ANNOUNCE] Apache Sedona 1.7.2 released Jia Yu
[ANNOUNCE] Apache Spark Kubernetes Operator 0.3.0 released Dongjoon Hyun
[ANNOUNCE] Apache Spark Connect Swift Client 0.3.0 released Dongjoon Hyun
Inquiry: Extending Spark ML Support via Spark Connect to Scala/Java APIs (SPARK-50812 Analogue) Daniel Filev
- Re: Inquiry: Extending Spark ML Support via Spark Connect to Scala/Java APIs (SPARK-50812 Analogue) Daniel Filev
[ANNOUNCE] Apache Spark 4.0.0 released Wenchen Fan
[PYSPARK] df.collect throws exception for MapType with ArrayType as key Eyck Troschke
- Re: [PYSPARK] df.collect throws exception for MapType with ArrayType as key Soumasish
Aligning pom.xml in Bundled PySpark JARs with Effective Runtime Dependencies for SCA Tools Guzarevich, M. (Mikalai)
Reg: spark delta table read failing Akram Shaik
- Re: Reg: spark delta table read failing Bjørn Jørgensen
[ANNOUNCE] Announcing Apache Spark Kubernetes Operator 0.2.0 Dongjoon Hyun
[ANNOUNCE] Announcing Apache Spark Connect Swift Client 0.2.0 Dongjoon Hyun
Structured streaming consumer group offset management in case of consumption of topic with same name from different Kafka clusters megh vidani
- Re: Structured streaming consumer group offset management in case of consumption of topic with same name from different Kafka clusters megh vidani
- Re: Structured streaming consumer group offset management in case of consumption of topic with same name from different Kafka clusters Prashant Sharma
- Re: Structured streaming consumer group offset management in case of consumption of topic with same name from different Kafka clusters megh vidani
- Re: Structured streaming consumer group offset management in case of consumption of topic with same name from different Kafka clusters megh vidani
- Re: Structured streaming consumer group offset management in case of consumption of topic with same name from different Kafka clusters megh vidani
user-unsubscribe Sky Yin
Help requested: Spark security triage and followup Apache Security Team
Apache Sedona + Iceberg GEO meetup in San Francisco Jia Yu
[ANNOUNCE] Announcing Apache Spark Connect Swift Client 0.1.0 Dongjoon Hyun
[ANNOUNCE] Announcing Apache Spark Kubernetes Operator 0.1.0 Dongjoon Hyun
- Re: [ANNOUNCE] Announcing Apache Spark Kubernetes Operator 0.1.0 Mridul Muralidharan
[ML] Does GeneralizedLinearRegression correctly handle interaction between two categorical values? Emil Hofman
Fw: Reg: Supporting inheritance for datatypes in pyspark Vaibhaw
[Spark SQL] spark.sql insert overwrite on existing partition not updating hive metastore partition transient_lastddltime and column_stats Pradeep
- Re: [Spark SQL] spark.sql insert overwrite on existing partition not updating hive metastore partition transient_lastddltime and column_stats Sathi Chowdhury
Issue with Spark Operator nilanjan sarkar
Performance evaluation of Trino 468, Spark 4.0.0-RC2, and Hive 4 on Tez/MR3 Sungwoo Park
- Re: Performance evaluation of Trino 468, Spark 4.0.0-RC2, and Hive 4 on Tez/MR3 Sungwoo Park
- Re: Performance evaluation of Trino 468, Spark 4.0.0-RC2, and Hive 4 on Tez/MR3 Sungwoo Park
Appreciate a second opinion – Metadata Analysis of PDF Files Mich Talebzadeh
Checkpointing in foreachPartition in Spark batck Abhishek Singla
- Re: Checkpointing in foreachPartition in Spark batck daniel williams
- Re: Checkpointing in foreachPartition in Spark batck Ángel Álvarez Pascua
- Re: Checkpointing in foreachPartition in Spark batck Abhishek Singla
- Re: Checkpointing in foreachPartition in Spark batck daniel williams
- Re: Checkpointing in foreachPartition in Spark batck Ángel Álvarez Pascua
- Re: Checkpointing in foreachPartition in Spark batck daniel williams
- Re: Checkpointing in foreachPartition in Spark batck Ángel Álvarez Pascua
- Re: Checkpointing in foreachPartition in Spark batck daniel williams
- Re: Checkpointing in foreachPartition in Spark batck Abhishek Singla
Comparison between union and stack in pyspark Dhruv Singla
Structured Streaming Initial Listing Issue Anastasiia Sokhova
- Re: Structured Streaming Initial Listing Issue 刘唯
- Re: Structured Streaming Initial Listing Issue Andrei L
- Re: Structured Streaming Initial Listing Issue Anastasiia Sokhova
Parallelism for glue pyspark jobs Perez
The use of Python ParamSpec in PySpark Rafał Wojdyła
Spark Streaming Dataset with Multiple S3 Sources is too Slow Jevon Cowell
Is "SORTED BY (col DESC)" Supported for Bucketed Table? Joe Lee
kubernetes spark connect iceberg SparkWrite$WriterFactory not found Razvan Mihai
High count of Active Jobs nayan sharma
- Re: High count of Active Jobs nayan sharma
- Re: High count of Active Jobs nayan sharma
- Re: High count of Active Jobs Ángel Álvarez Pascua
- Re: High count of Active Jobs Ángel Álvarez Pascua
- Re: High count of Active Jobs nayan sharma
- Re: High count of Active Jobs Ángel Álvarez Pascua
- Re: High count of Active Jobs Ángel Álvarez Pascua
Announcing the Community Over Code 2025 Streaming Track James Hughes
Kubeflow Spark-Operator Hamish Whittal
Correctness Issue: UNIX_SECONDS() mismatch with TO_UTC_TIMESTAMP() result in Spark 3.5.1 Miguel Leite
Executors not getting released dynamically once task is over Shivang Modi
- Re: Executors not getting released dynamically once task is over Soumasish
Java coding with spark API tim wade
- Re: Java coding with spark API Jevon Cowell
