user
Messages by Thread
Re: Introducing Comet, a plugin to accelerate Spark execution via DataFusion and Arrow
Mich Talebzadeh
Re: Introducing Comet, a plugin to accelerate Spark execution via DataFusion and Arrow
Manoj Kumar
Null pointer exception while replying WAL
nayan sharma
Re: Null pointer exception while replying WAL
Mich Talebzadeh
Re: Null pointer exception while replying WAL
nayan sharma
Re: Null pointer exception while replying WAL
Mich Talebzadeh
Building an Event-Driven Real-Time Data Processor with Spark Structured Streaming and API Integration
Mich Talebzadeh
Re: Building an Event-Driven Real-Time Data Processor with Spark Structured Streaming and API Integration
Mich Talebzadeh
performance of union vs insert into
Manish Mehra
[ANNOUNCE] Apache Celeborn(incubating) 0.4.0 available
Fu Chen
Community over Code EU 2024 Travel Assistance Applications now open!
Gavin McDonald
[no subject]
Gavin McDonald
deploy spark as cluster
ali sharifi
Create Custom Logs
PRASHANT L
randomsplit has issue?
second_co...@yahoo.com.INVALID
Issue in Creating Temp_view in databricks and using spark.sql().
Karthick Nk
Re: Issue in Creating Temp_view in databricks and using spark.sql().
Jungtaek Lim
Re: Issue in Creating Temp_view in databricks and using spark.sql().
Mich Talebzadeh
Re: Issue in Creating Temp_view in databricks and using spark.sql().
Mich Talebzadeh
[Spark SQL]: Crash when attempting to select PostgreSQL bpchar without length specifier in Spark 3.5.0
Lily Hahn
startTimestamp doesn't work when using rate-micro-batch format
Perfect Stranger
Re: startTimestamp doesn't work when using rate-micro-batch format
Mich Talebzadeh
Re: startTimestamp doesn't work when using rate-micro-batch format
Perfect Stranger
Re: startTimestamp doesn't work when using rate-micro-batch format
Mich Talebzadeh
Some optimization questions about our beloved engine Spark
Aissam Chia
Facing Error org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for s3ablock-0001-
Abhishek Singla
Re: Facing Error org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for s3ablock-0001-
Abhishek Singla
Re: Facing Error org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for s3ablock-0001-
Bjørn Jørgensen
Re: Facing Error org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for s3ablock-0001-
Mich Talebzadeh
[spark.local.dir] comma separated list does not work
Andrew Petersen
Re: [spark.local.dir] comma separated list does not work
Koert Kuipers
Re: [spark.local.dir] comma separated list does not work
Andrew Petersen
Re: [spark.local.dir] comma separated list does not work
Andrew Petersen
Unsubscribe
Andrew Redd
[GraphFrames Spark Package]: Why is there not a distribution for Spark 3.3?
Boileau, Brad
Re: [GraphFrames Spark Package]: Why is there not a distribution for Spark 3.3?
Russell Jurney
Re: [External] Re: [GraphFrames Spark Package]: Why is there not a distribution for Spark 3.3?
Ofir Manor
Best option to process single kafka stream in parallel: PySpark Vs Dask
lab22
Structured Streaming Process Each Records Individually
PRASHANT L
Re: Structured Streaming Process Each Records Individually
Khalid Mammadov
Re: Structured Streaming Process Each Records Individually
Ant Kutschera
Re: Structured Streaming Process Each Records Individually
Mich Talebzadeh
Re: Structured Streaming Process Each Records Individually
Mich Talebzadeh
[Structured Streaming] Avoid one microbatch delay with multiple stateful operations
Andrzej Zera
Re: [Structured Streaming] Avoid one microbatch delay with multiple stateful operations
Ant Kutschera
Re: [Structured Streaming] Avoid one microbatch delay with multiple stateful operations
Jungtaek Lim
Re: [Structured Streaming] Avoid one microbatch delay with multiple stateful operations
Andrzej Zera
[apache-spark] documentation on File Metadata _metadata struct
Jason Horner
Spark Structured Streaming and Flask REST API for Real-Time Data Ingestion and Analytics.
Mich Talebzadeh
Re: Spark Structured Streaming and Flask REST API for Real-Time Data Ingestion and Analytics.
Mich Talebzadeh
Re: Spark Structured Streaming and Flask REST API for Real-Time Data Ingestion and Analytics.
ashok34...@yahoo.com.INVALID
Re: Spark Structured Streaming and Flask REST API for Real-Time Data Ingestion and Analytics.
Mich Talebzadeh
[ANNOUNCE] Apache Celeborn(incubating) 0.3.2 available
Nicholas Jiang
[Structured Streaming] Keeping checkpointing cost under control
Andrzej Zera
Re: [Structured Streaming] Keeping checkpointing cost under control
Mich Talebzadeh
Re: [Structured Streaming] Keeping checkpointing cost under control
Andrzej Zera
Re: [Structured Streaming] Keeping checkpointing cost under control
Mich Talebzadeh
Re: [Structured Streaming] Keeping checkpointing cost under control
Andrzej Zera
Re: [Structured Streaming] Keeping checkpointing cost under control
Mich Talebzadeh
Re: [Structured Streaming] Keeping checkpointing cost under control
Andrzej Zera
Re: [Structured Streaming] Keeping checkpointing cost under control
Mich Talebzadeh
Re: [Structured Streaming] Keeping checkpointing cost under control
Andrzej Zera
Re: [Structured Streaming] Keeping checkpointing cost under control
Mich Talebzadeh
Re: [Structured Streaming] Keeping checkpointing cost under control
Jungtaek Lim
Issue with Spark Session Initialization in Kubernetes Deployment
Atul Patil
Re: Issue with Spark Session Initialization in Kubernetes Deployment
Mich Talebzadeh
Select Columns from Dataframe in Java
PRASHANT L
Re: Select Columns from Dataframe in Java
Grisha Weintraub
Re: Select Columns from Dataframe in Java
PRASHANT L
Re: Select Columns from Dataframe in Java
Grisha Weintraub
Fwd: the life cycle shuffle Dependency
yang chen
Re: the life cycle shuffle Dependency
murat migdisoglu
Pyspark UDF as a data source for streaming
Поротиков Станислав Вячеславович
Re: Pyspark UDF as a data source for streaming
Mich Talebzadeh
RE: Pyspark UDF as a data source for streaming
Поротиков Станислав Вячеславович
RE: Pyspark UDF as a data source for streaming
Поротиков Станислав Вячеславович
RE: Pyspark UDF as a data source for streaming
Поротиков Станислав Вячеславович
Re: Pyspark UDF as a data source for streaming
Hyukjin Kwon
Re: Pyspark UDF as a data source for streaming
Mich Talebzadeh
RE: Pyspark UDF as a data source for streaming
Поротиков Станислав Вячеславович
Re: Pyspark UDF as a data source for streaming
Mich Talebzadeh
Re: Pyspark UDF as a data source for streaming
Mich Talebzadeh
Re: Pyspark UDF as a data source for streaming
Mich Talebzadeh
Re: Validate spark sql
Nicholas Chammas
Re: Validate spark sql
Mich Talebzadeh
Re: Validate spark sql
ram manickam
Re: Validate spark sql
tianlangstudio
Re: Validate spark sql
Mich Talebzadeh
Re: Validate spark sql
Bjørn Jørgensen
Re: Validate spark sql
Gourav Sengupta
Re: Validate spark sql
Bjørn Jørgensen
India Scala & Big Data Job Referral
sri hari kali charan Tummala
About shuffle partition size
Nebi Aydin
[ANNOUNCE] Apache Spark 3.3.4 released
Dongjoon Hyun
Architecture of Spark Connect
Nikhil Goyal
Re: Architecture of Spark Connect
Nikhil Goyal
Re: Architecture of Spark Connect
Kezhi Xiong
Re: Architecture of Spark Connect
Hyukjin Kwon
Does Spark support role-based authentication and access to Amazon S3? (Kubernetes cluster deployment)
Patil, Atul
Does Spark support role-based authentication and access to Amazon S3? (Kubernetes cluster deployment)
Atul Patil
Re: Does Spark support role-based authentication and access to Amazon S3? (Kubernetes cluster deployment)
Koert Kuipers
Re: Does Spark support role-based authentication and access to Amazon S3? (Kubernetes cluster deployment)
Mich Talebzadeh
Re: Does Spark support role-based authentication and access to Amazon S3? (Kubernetes cluster deployment)
Mich Talebzadeh
Cluster-mode job compute-time/cost metrics
Jack Wells
Re: Cluster-mode job compute-time/cost metrics
Jörn Franke
Re: Cluster-mode job compute-time/cost metrics
murat migdisoglu
Spark 3.1.3 with Hive dynamic partitions fails while driver moves the staged files
Shay Elbaz
Spark on Java 17
Faiz Halde
RE: Spark on Java 17
Luca Canali
Re: Spark on Java 17
Faiz Halde
Re: Spark on Java 17
Jörn Franke
Re: Spark on Java 17
Jörn Franke
SSH Tunneling issue with Apache Spark
Venkatesan Muniappan
Re: SSH Tunneling issue with Apache Spark
Venkatesan Muniappan
Re: SSH Tunneling issue with Apache Spark
Nicholas Chammas
Re: SSH Tunneling issue with Apache Spark
Venkatesan Muniappan
ordering of rows in dataframe
Som Lima
Re: ordering of rows in dataframe
Enrico Minack
ML advice
Zahid Rahman
Do we have any mechanism to control requests per second for a Kafka connect sink?
Yeikel Santana
Re: Do we have any mechanism to control requests per second for a Kafka connect sink?
Yeikel Santana
Spark-Connect: Param `--packages` does not take effect for executors.
Xiaolong Wang
Re: Spark-Connect: Param `--packages` does not take effect for executors.
Aironman DirtDiver
Re: Spark-Connect: Param `--packages` does not take effect for executors.
Holden Karau
[PySpark][Spark Dataframe][Observation] Why empty dataframe join doesn't let you get metrics from observation?
Михаил Кулаков
Re: [PySpark][Spark Dataframe][Observation] Why empty dataframe join doesn't let you get metrics from observation?
Enrico Minack
Re: [PySpark][Spark Dataframe][Observation] Why empty dataframe join doesn't let you get metrics from observation?
Enrico Minack
Re: [PySpark][Spark Dataframe][Observation] Why empty dataframe join doesn't let you get metrics from observation?
Михаил Кулаков
ML using Spark Connect
Faiz Halde
[FYI] SPARK-45981: Improve Python language test coverage
Dongjoon Hyun
Re: [FYI] SPARK-45981: Improve Python language test coverage
Hyukjin Kwon
[Streaming (DStream) ] : Does Spark Streaming supports pause/resume consumption of message from Kafka?
Saurabh Agrawal (180813)
Re: [Streaming (DStream) ] : Does Spark Streaming supports pause/resume consumption of message from Kafka?
Mich Talebzadeh
[ANNOUNCE] Apache Spark 3.4.2 released
Dongjoon Hyun
Re:[ANNOUNCE] Apache Spark 3.4.2 released
beliefer
[sql] how to connect query stage to Spark job/stages?
Chenghao Lyu
Tuning Best Practices
Bryant Wright
Re: Tuning Best Practices
Jack Goodson
Re: Tuning Best Practices
Bryant Wright
Classpath isolation per SparkSession without Spark Connect
Faiz Halde
Re: Classpath isolation per SparkSession without Spark Connect
Holden Karau
Re: Classpath isolation per SparkSession without Spark Connect
Faiz Halde
Re: Classpath isolation per SparkSession without Spark Connect
Pasha Finkelshtein
Re: Classpath isolation per SparkSession without Spark Connect
Faiz Halde
Re: Classpath isolation per SparkSession without Spark Connect
Pasha Finkelshtein
Re: Spark structured streaming tab is missing from spark web UI
Jungtaek Lim
[Spark-sql 3.2.4] Wrong Statistic INFO From 'ANALYZE TABLE' Command
Nick Luo
Query fails on CASE statement depending on order of summed columns
Evgenii Ignatev
How exactly does dropDuplicatesWithinWatermark work?
Perfect Stranger
Re: How exactly does dropDuplicatesWithinWatermark work?
Jungtaek Lim
Setting fs.s3a.aws.credentials.provider through a connect server.
Leandro Martelli
Spark-submit without access to HDFS
Eugene Miretsky
Re: Spark-submit without access to HDFS
eab...@163.com
Re: [EXTERNAL] Re: Spark-submit without access to HDFS
Eugene Miretsky
Re: Re: [EXTERNAL] Re: Spark-submit without access to HDFS
eab...@163.com
Re: Spark-submit without access to HDFS
Jörn Franke
Re: Spark-submit without access to HDFS
Mich Talebzadeh
Re: [EXTERNAL] Re: Spark-submit without access to HDFS
Eugene Miretsky
Re: [EXTERNAL] Re: Spark-submit without access to HDFS
Eugene Miretsky
Re: [EXTERNAL] Re: Spark-submit without access to HDFS
Mich Talebzadeh
Re: [EXTERNAL] Re: [EXTERNAL] Re: Spark-submit without access to HDFS
Eugene Miretsky
[Spark Structured Streaming] Two sink from Single stream
Subash Prabanantham
The job failed when we upgraded from spark 3.3.1 to spark3.4.1
Hanyu Huang
The job failed when we upgraded from spark 3.3.1 to spark3.4.1
Hanyu Huang
RE: The job failed when we upgraded from spark 3.3.1 to spark3.4.1
Stevens, Clay
The job failed when we upgraded from spark 3.3.1 to spark3.4.1
Hanyu Huang
Why create/drop/alter/rename partition does not post listener event in ExternalCatalogWithListener?
李响
Pass xmx values to SparkLauncher launched Java process
Deepthi Sathia Raj
How grouping rows without shuffle
Yoel Benharrous
help needed with SPARK-45598 and SPARK-45769
Maksym M
Storage Partition Joins only works for buckets?
Arwin Tio
org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizerFactory ClassNotFoundException
Yi Zheng
[ANNOUNCE] Apache Kyuubi released 1.8.0
Cheng Pan
Spark master shuts down when one of zookeeper dies
Kaustubh Ghode
Re: Spark master shuts down when one of zookeeper dies
Mich Talebzadeh
How to configure authentication from a pySpark client to a Spark Connect server ?
Xiaolong Wang
[Spark SQL] [Bug] Adding `checkpoint()` causes "column [...] cannot be resolved" error
Robin Zimmerman
Parser error when running PySpark on Windows connecting to GCS
Richard Smith
Re: Parser error when running PySpark on Windows connecting to GCS
Mich Talebzadeh
Data analysis issues
Jauru Lin
Re: Data analysis issues
Mich Talebzadeh
Spark / Scala conflict
Harry Jamison
Re: Spark / Scala conflict
Aironman DirtDiver
Re: Spark / Scala conflict
Harry Jamison
Fixed byte array issue
KhajaAsmath Mohammed
jackson-databind version mismatch
moshik.vitas
Re: jackson-databind version mismatch
eab...@163.com
Re: jackson-databind version mismatch
Bjørn Jørgensen
Re: jackson-databind version mismatch
Bjørn Jørgensen
Re: Re: jackson-databind version mismatch
eab...@163.com
RE: jackson-databind version mismatch
moshik.vitas
Elasticity and scalability for Spark in Kubernetes
Mich Talebzadeh
[Structured Streaming] Joins after aggregation don't work in streaming
Andrzej Zera
Re: [Structured Streaming] Joins after aggregation don't work in streaming
Jungtaek Lim
Re: [Structured Streaming] Joins after aggregation don't work in streaming
Andrzej Zera
spark schema conflict behavior records being silently dropped
Carlos Aguni
submitting tasks failed in Spark standalone mode due to missing failureaccess jar file
eab...@163.com
Contribution Recommendations
Phil Dakin
Maximum executors in EC2 Machine
KhajaAsmath Mohammed
Re: Maximum executors in EC2 Machine
Riccardo Ferrari