user
Messages by Thread
Clarification on failOnDataLoss Behavior in Spark Structured Streaming with Kafka
Nimrod Ofek
Re: Clarification on failOnDataLoss Behavior in Spark Structured Streaming with Kafka
Khalid Mammadov
Question about Spark Tag in TreeNode
Yifan Li
[PR REVIEW] Glob based provider for history server
Gaurav Waghmare
Spark checkpointing in batch mode fault tolerance problem
Martin Aras
[ANNOUNCE] Apache Spark Kubernetes Operator 0.4.0 released
Dongjoon Hyun
[PYSPARK] createDataFrame throws exception with Python 3.12.3
Eyck Troschke
Technical Guidance: Dynamic Resource Allocation + External Shuffle Storage
Andrew M.
Inquiry About User Impersonation Support in Spark Thrift Server (Spark 1.x to 4.x)
Allen Chu
What is the current canonical way to join more than 2 watermarked streams (Spark 3.5.6)?
cheapsolutionarchit...@gmail.com
Re: What is the current canonical way to join more than 2 watermarked streams (Spark 3.5.6)?
Jungtaek Lim
pyspark4.0.0 still includes "jackson-mapper-asl.jar" that was supposed to be removed according to release note
Haibo.Wang
RE: pyspark4.0.0 still includes "jackson-mapper-asl.jar" that was supposed to be removed according to release note
Haibo.Wang
Spark on kubernete, configmap add log4j2.properties data
melin li
[SQL]: Registering spark extensions which utilise DataSourceV2Strategy in Spark 4
Jack Buggins
[ANNOUNCE] Apache Sedona 1.7.2 released
Jia Yu
[ANNOUNCE] Apache Spark Kubernetes Operator 0.3.0 released
Dongjoon Hyun
[ANNOUNCE] Apache Spark Connect Swift Client 0.3.0 released
Dongjoon Hyun
Inquiry: Extending Spark ML Support via Spark Connect to Scala/Java APIs (SPARK-50812 Analogue)
Daniel Filev
Re: Inquiry: Extending Spark ML Support via Spark Connect to Scala/Java APIs (SPARK-50812 Analogue)
Daniel Filev
[ANNOUNCE] Apache Spark 4.0.0 released
Wenchen Fan
[PYSPARK] df.collect throws exception for MapType with ArrayType as key
Eyck Troschke
Re: [PYSPARK] df.collect throws exception for MapType with ArrayType as key
Soumasish
Aligning pom.xml in Bundled PySpark JARs with Effective Runtime Dependencies for SCA Tools
Guzarevich, M. (Mikalai)
Reg: spark delta table read failing
Akram Shaik
Re: Reg: spark delta table read failing
Bjørn Jørgensen
[ANNOUNCE] Announcing Apache Spark Kubernetes Operator 0.2.0
Dongjoon Hyun
[ANNOUNCE] Announcing Apache Spark Connect Swift Client 0.2.0
Dongjoon Hyun
Structured streaming consumer group offset management in case of consumption of topic with same name from different Kafka clusters
megh vidani
Re: Structured streaming consumer group offset management in case of consumption of topic with same name from different Kafka clusters
megh vidani
Re: Structured streaming consumer group offset management in case of consumption of topic with same name from different Kafka clusters
Prashant Sharma
Re: Structured streaming consumer group offset management in case of consumption of topic with same name from different Kafka clusters
megh vidani
Re: Structured streaming consumer group offset management in case of consumption of topic with same name from different Kafka clusters
megh vidani
Re: Structured streaming consumer group offset management in case of consumption of topic with same name from different Kafka clusters
megh vidani
user-unsubscribe
Sky Yin
Help requested: Spark security triage and followup
Apache Security Team
Apache Sedona + Iceberg GEO meetup in San Francisco
Jia Yu
[ANNOUNCE] Announcing Apache Spark Connect Swift Client 0.1.0
Dongjoon Hyun
[ANNOUNCE] Announcing Apache Spark Kubernetes Operator 0.1.0
Dongjoon Hyun
Re: [ANNOUNCE] Announcing Apache Spark Kubernetes Operator 0.1.0
Mridul Muralidharan
[ML] Does GeneralizedLinearRegression correctly handle interaction between two categorical values?
Emil Hofman
Fw: Reg: Supporting inheritance for datatypes in pyspark
Vaibhaw
[Spark SQL] spark.sql insert overwrite on existing partition not updating hive metastore partition transient_lastddltime and column_stats
Pradeep
Re: [Spark SQL] spark.sql insert overwrite on existing partition not updating hive metastore partition transient_lastddltime and column_stats
Sathi Chowdhury
Issue with Spark Operator
nilanjan sarkar
Performance evaluation of Trino 468, Spark 4.0.0-RC2, and Hive 4 on Tez/MR3
Sungwoo Park
Re: Performance evaluation of Trino 468, Spark 4.0.0-RC2, and Hive 4 on Tez/MR3
Sungwoo Park
Re: Performance evaluation of Trino 468, Spark 4.0.0-RC2, and Hive 4 on Tez/MR3
Sungwoo Park
Appreciate a second opinion – Metadata Analysis of PDF Files
Mich Talebzadeh
Checkpointing in foreachPartition in Spark batck
Abhishek Singla
Re: Checkpointing in foreachPartition in Spark batck
daniel williams
Re: Checkpointing in foreachPartition in Spark batck
Ángel Álvarez Pascua
Re: Checkpointing in foreachPartition in Spark batck
Abhishek Singla
Re: Checkpointing in foreachPartition in Spark batck
daniel williams
Re: Checkpointing in foreachPartition in Spark batck
Ángel Álvarez Pascua
Re: Checkpointing in foreachPartition in Spark batck
daniel williams
Re: Checkpointing in foreachPartition in Spark batck
Ángel Álvarez Pascua
Re: Checkpointing in foreachPartition in Spark batck
daniel williams
Re: Checkpointing in foreachPartition in Spark batck
Abhishek Singla
Comparison between union and stack in pyspark
Dhruv Singla
Structured Streaming Initial Listing Issue
Anastasiia Sokhova
Re: Structured Streaming Initial Listing Issue
刘唯
Re: Structured Streaming Initial Listing Issue
Andrei L
Re: Structured Streaming Initial Listing Issue
Anastasiia Sokhova
Parallelism for glue pyspark jobs
Perez
The use of Python ParamSpec in PySpark
Rafał Wojdyła
Spark Streaming Dataset with Multiple S3 Sources is too Slow
Jevon Cowell
Is "SORTED BY (col DESC)" Supported for Bucketed Table?
Joe Lee
kubernetes spark connect iceberg SparkWrite$WriterFactory not found
Razvan Mihai
High count of Active Jobs
nayan sharma
Re: High count of Active Jobs
nayan sharma
Re: High count of Active Jobs
nayan sharma
Re: High count of Active Jobs
Ángel Álvarez Pascua
Re: High count of Active Jobs
Ángel Álvarez Pascua
Re: High count of Active Jobs
nayan sharma
Re: High count of Active Jobs
Ángel Álvarez Pascua
Re: High count of Active Jobs
Ángel Álvarez Pascua
Announcing the Community Over Code 2025 Streaming Track
James Hughes
Kubeflow Spark-Operator
Hamish Whittal
Correctness Issue: UNIX_SECONDS() mismatch with TO_UTC_TIMESTAMP() result in Spark 3.5.1
Miguel Leite
Executors not getting released dynamically once task is over
Shivang Modi
Re: Executors not getting released dynamically once task is over
Soumasish
Java coding with spark API
tim wade
Re: Java coding with spark API
Jevon Cowell
Re: Java coding with spark API
tim wade
Re: Java coding with spark API
Ángel Álvarez Pascua
Re: Java coding with spark API
Sonal Goyal
Re: Java coding with spark API
Ángel Álvarez Pascua
Re: Java coding with spark API
Stephen Coy
Re: Java coding with spark API
Jules Damji
Spark 3.3 job jar assembly with JDK 17 and JRE 11 runtime (java target/source = 8)
Kristopher Kane
Request for Support and Resources for Apache Spark User Groups in Bogotá and Mexico
Juan Diaz
Inquiry in regards to a New onQuery Method for StreamingQueryListener
Jevon Cowell
Re: Inquiry in regards to a New onQuery Method for StreamingQueryListener
Jungtaek Lim
Re: Inquiry in regards to a New onQuery Method for StreamingQueryListener
Jevon Cowell
Re: Inquiry in regards to a New onQuery Method for StreamingQueryListener
Jevon Cowell
performance issue Spark 3.5.2 on kubernetes
Prem Sahoo
Spark Shuffle - in kubeflow spark operator installation on k8s
karan alang
Re: Spark Shuffle - in kubeflow spark operator installation on k8s
karan alang
Re: Spark Shuffle - in kubeflow spark operator installation on k8s
karan alang
Re: Spark Shuffle - in kubeflow spark operator installation on k8s
Mich Talebzadeh
Re: Spark Shuffle - in kubeflow spark operator installation on k8s
megh vidani
Re: Spark Shuffle - in kubeflow spark operator installation on k8s
karan alang
Re: Spark Shuffle - in kubeflow spark operator installation on k8s
karan alang
Motif finding tutorial
Russell Jurney
Re: Spark 3.5.2 and Hadoop 3.4.1 slow performance
Prem Gmail
Re: Spark 3.5.2 and Hadoop 3.4.1 slow performance
Ángel Álvarez Pascua
Re: Spark 3.5.2 and Hadoop 3.4.1 slow performance
Prem Sahoo
Re: Spark 3.5.2 and Hadoop 3.4.1 slow performance
Prem Gmail
Re: Spark 3.5.2 and Hadoop 3.4.1 slow performance
Prem Sahoo
High/Critical CVEs in jackson-mapper-asl (spark 3.5.5)
Mohammad, Ejas Ali
Re: High/Critical CVEs in jackson-mapper-asl (spark 3.5.5)
Ángel Álvarez Pascua
Spark Kubernetes Operator | Release Date
Dheeraj Panangat
[ANNOUNCE] Apache Sedona 1.7.1 released
Jia Yu
Multiple CVE issues in apache/spark-py:3.4.0 + Pyspark 3.4.0
Mohammad, Ejas Ali
Re: Multiple CVE issues in apache/spark-py:3.4.0 + Pyspark 3.4.0
Soumasish
[ANNOUNCE] Apache Celeborn 0.5.4 available
Nicholas
4.1.0 release timeline
Martin Bielik
[ANNOUNCE] Version 2.0.0-beta1 of hnswlib spark released
jelmer
[CONNECT] Question on Spark Connect in Cluster Deply Mode
Yasukazu Nagatomi
Apply pivot only on some columns in pyspark
Dhruv Singla
Re: Apply pivot only on some columns in pyspark
Mich Talebzadeh
Re: Apply pivot only on some columns in pyspark
Dhruv Singla
Re: Apply pivot only on some columns in pyspark
Mich Talebzadeh
Re: Apply pivot only on some columns in pyspark
Dhruv Singla
Re: Apply pivot only on some columns in pyspark
Mich Talebzadeh
Re: Apply pivot only on some columns in pyspark
Bjørn Jørgensen
[ANNOUNCE] Apache Spark 3.5.5 released
Dongjoon Hyun
Optimizing file size of an iceberg table
Pathum Wijethunge
Re: Apache - GSOC'25 projects / Contributions
Mich Talebzadeh
Kafka Connector: producer throttling
Abhishek Singla
Re: Kafka Connector: producer throttling
daniel williams
Re: Kafka Connector: producer throttling
Abhishek Singla
Re: Kafka Connector: producer throttling
Rommel Yuan
Re: Kafka Connector: producer throttling
Jungtaek Lim
Re: Kafka Connector: producer throttling
daniel williams
Re: Kafka Connector: producer throttling
Abhishek Singla
Re: Kafka Connector: producer throttling
Abhishek Singla
Re: Kafka Connector: producer throttling
daniel williams
Re: Kafka Connector: producer throttling
Abhishek Singla
Re: Kafka Connector: producer throttling
daniel williams
Re: Kafka Connector: producer throttling
Abhishek Singla
Re: Kafka Connector: producer throttling
daniel williams
Re: Kafka Connector: producer throttling
Abhishek Singla
Re: Kafka Connector: producer throttling
Rommel Yuan
Re: Kafka Connector: producer throttling
daniel williams
GraphFrames Hackathon - NOW :)
Russell Jurney
Using storage decommissioning on K8S cluster
Enrico Minack
Re: Using storage decommissioning on K8S cluster
Enrico Minack
Spark connect: Table caching for global use?
Tim Robertson
Re: Spark connect: Table caching for global use?
Tim Robertson
Re: Spark connect: Table caching for global use?
Mich Talebzadeh
Re: Spark connect: Table caching for global use?
Tim Robertson
Re: Spark connect: Table caching for global use?
Mich Talebzadeh
Re: Spark connect: Table caching for global use?
Subhasis Mukherjee
Re: Spark connect: Table caching for global use?
Ángel
END OF LIFE DETERMINATION
Izhar Mohammed
Doubt regarding year formatting
Dhruv Singla
Spark Website Styling Issues Partially Resolved
Gengliang Wang
Re: Spark Website Styling Issues Partially Resolved
Reynold Xin
Website Down
Will Dumas
Re: Website Down
walt
Is SSL configuration being used for RPC communication?
Pablo Fernández
Re: Is SSL configuration being used for RPC communication?
Aironman DirtDiver
Re: Is SSL configuration being used for RPC communication?
Pablo Fernández
GraphFrames Hackathon on Friday, February 21
Russell Jurney
Re: GraphFrames Hackathon on Friday, February 21
Russell Jurney
Re: GraphFrames Hackathon on Friday, February 21
Holden Karau
Re: GraphFrames Hackathon on Friday, February 21
Russell Jurney
Drop Python 2 support from GraphFrames?
Russell Jurney
Re: Drop Python 2 support from GraphFrames?
Holden Karau
Re: Drop Python 2 support from GraphFrames?
Jules Damji
Re: Drop Python 2 support from GraphFrames?
Russell Jurney
Re: Drop Python 2 support from GraphFrames?
Mich Talebzadeh
Re: Drop Python 2 support from GraphFrames?
Ángel
Re: Drop Python 2 support from GraphFrames?
Russell Jurney
[Spark SQL]: Are SQL User-Defined Functions on the Roadmap?
Frank Bertsch
Re: [Spark SQL]: Are SQL User-Defined Functions on the Roadmap?
Mich Talebzadeh
Re: [Spark SQL]: Are SQL User-Defined Functions on the Roadmap?
Frank Bertsch
Re: [Spark SQL]: Are SQL User-Defined Functions on the Roadmap?
Soumasish
Re: [Spark SQL]: Are SQL User-Defined Functions on the Roadmap?
Reynold Xin
Re: [Spark SQL]: Are SQL User-Defined Functions on the Roadmap?
Allison Wang
Re: [Spark SQL]: Are SQL User-Defined Functions on the Roadmap?
Frank Bertsch
Re: [Spark Stream]: Batch processing time reduce over time causing Kafka Lag
Mich Talebzadeh
Re: [Spark Stream]: Batch processing time reduce over time causing Kafka Lag
Saurabh Agrawal
Re: [Spark Stream]: Batch processing time reduce over time causing Kafka Lag
Mich Talebzadeh
Re: Feature store in bigquery
Mich Talebzadeh
Re: Feature store in bigquery
Gunjan Kumar
Need a solution
aishwarya talluri
[start-connect-server.sh] connecting with org.apache.spark.deploy.worker.Worker
Andrew Petersen
Re: [start-connect-server.sh] connecting with org.apache.spark.deploy.worker.Worker
Mich Talebzadeh
Re: [start-connect-server.sh] connecting with org.apache.spark.deploy.worker.Worker
Mich Talebzadeh
Re: Re: Increasing Shading & Relocating for 4.0
Mich Talebzadeh
Re: Re: Increasing Shading & Relocating for 4.0
Ángel
[Spark Core][BlockManager] Spark job fails if blockmgr dirs are cleaned up
Olga Averianova
Re: [Spark Core][BlockManager] Spark job fails if blockmgr dirs are cleaned up
Mich Talebzadeh
Help choose a GraphFrames logo
Russell Jurney
Re: Help choose a GraphFrames logo
Denny Lee
Re: Help choose a GraphFrames logo
Matei Zaharia
Re: Help choose a GraphFrames logo
Mich Talebzadeh