user
Messages by Thread
Re: How reading works?
Sid
Re: How reading works?
Bjørn Jørgensen
Re: How reading works?
Bjørn Jørgensen
Re: How reading works?
Sid
input file size
mbreuer
Re: input file size
marc nicole
Re: input file size
Yong Walt
Re: input file size
Enrico Minack
Re: input file size
Gourav Sengupta
Re: input file size
Enrico Minack
Re: input file size
marc nicole
Re: input file size
Markus Breuer
how to properly filter a dataset by dates ?
marc nicole
Re: how to properly filter a dataset by dates ?
Sean Owen
Re: how to properly filter a dataset by dates ?
marc nicole
Re: how to properly filter a dataset by dates ?
Sean Owen
Re: how to properly filter a dataset by dates ?
marc nicole
Re: how to properly filter a dataset by dates ?
Stelios Philippou
Re: how to properly filter a dataset by dates ?
marc nicole
Re: how to properly filter a dataset by dates ?
Stelios Philippou
Re: how to properly filter a dataset by dates ?
marc nicole
Re: how to properly filter a dataset by dates ?
marc nicole
Re: how to properly filter a dataset by dates ?
marc nicole
How to update TaskMetrics from Python?
Shay Elbaz
Spark Structured streaming(batch mode) - running dependent jobs concurrently
karan alang
How to recognize and get the min of a date/string column in Java?
marc nicole
Re: How to recognize and get the min of a date/string column in Java?
Sean Owen
Re: How to recognize and get the min of a date/string column in Java?
marc nicole
Re: How to recognize and get the min of a date/string column in Java?
marc nicole
Re: How to recognize and get the min of a date/string column in Java?
Sean Owen
Re: How to recognize and get the min of a date/string column in Java?
marc nicole
Re: How to recognize and get the min of a date/string column in Java?
marc nicole
Stickers and Swag
Xiao Li
Re: Stickers and Swag
Hyukjin Kwon
Re: Stickers and Swag
Gengliang Wang
Re: Stickers and Swag
Reynold Xin
Re: Stickers and Swag
Qian Sun
Redesign approach for hitting the APIs using PySpark
Sid
Re: Redesign approach for hitting the APIs using PySpark
Gourav Sengupta
Re: Redesign approach for hitting the APIs using PySpark
Sid
Re: Redesign approach for hitting the APIs using PySpark
Gourav Sengupta
Re: Redesign approach for hitting the APIs using PySpark
Sid
Re: Redesign approach for hitting the APIs using PySpark
Gourav Sengupta
Re: Redesign approach for hitting the APIs using PySpark
Sid
[no subject]
Rodrigo
Re:
Aironman DirtDiver
Spark streaming / confluent Kafka- messages are empty
KhajaAsmath Mohammed
API Problem
Sid
Re: API Problem
Stelios Philippou
Re: API Problem
Sean Owen
Re: API Problem
Sid
Re: API Problem
Stelios Philippou
Re: API Problem
Sid
Re: API Problem
Enrico Minack
Re: API Problem
Enrico Minack
Re: API Problem
Sid
Re: API Problem
Enrico Minack
Re: API Problem
Sid
Re: API Problem
Enrico Minack
Retrieve the count of spark nodes
Poorna Murali
Re: Retrieve the count of spark nodes
Stephen Coy
Re: Retrieve the count of spark nodes
Poorna Murali
to find Difference of locations in Spark Dataframe rows
Chetan Khatri
Re: to find Difference of locations in Spark Dataframe rows
Bjørn Jørgensen
How the data is distributed
Sid
Re: How the data is distributed
Peyman Mohajerian
Re: How the data is distributed
Sean Owen
Re: How the data is distributed
Sid
Structured streaming with protobuf proto3 schema registry
Kiran Biswal
partitionBy creating lot of small files
Nikhil Goyal
Re: partitionBy creating lot of small files
Enrico Minack
How to convert a Dataset<Row> to a Dataset<String>?
marc nicole
Re: How to convert a Dataset<Row> to a Dataset<String>?
Sean Owen
Re: How to convert a Dataset<Row> to a Dataset<String>?
marc nicole
Re: How to convert a Dataset<Row> to a Dataset<String>?
Sean Owen
Re: How to convert a Dataset<Row> to a Dataset<String>?
marc nicole
Re: How to convert a Dataset<Row> to a Dataset<String>?
Enrico Minack
Re: How to convert a Dataset<Row> to a Dataset<String>?
marc nicole
Re: How to convert a Dataset<Row> to a Dataset<String>?
Enrico Minack
Re: How to convert a Dataset<Row> to a Dataset<String>?
marc nicole
Re: How to convert a Dataset<Row> to a Dataset<String>?
Christophe Préaud
Re: How to convert a Dataset<Row> to a Dataset<String>?
Stelios Philippou
PartitionBy and SortWithinPartitions
Nikhil Goyal
Re: PartitionBy and SortWithinPartitions
Enrico Minack
Re: PartitionBy and SortWithinPartitions
Nikhil Goyal
approx_count_distinct in spark always return 1
marc nicole
Does adaptive auto broadcast respect spark.sql.autoBroadcastJoinThreshold
Henry Quan
What's the expected Spark 3.1.4 release date ?
Sandeep Vinayak
Kotlin API for Apache Spark feedback
finkel
Re: protobuf data as input to spark streaming
Kiran Biswal
Unable to format timestamp values in pyspark
Sid
Re: Unable to format timestamp values in pyspark
Stelios Philippou
Re: Unable to format timestamp values in pyspark
Sid
Unable to convert double values
Sid
Re: Unable to convert double values
Stelios Philippou
Re: Unable to convert double values
marc nicole
Re: Unable to convert double values
marc nicole
k-anonymity with Spark in Java
marc nicole
Issues getting Apache Spark
Martin, Michael
Re: Issues getting Apache Spark
Apostolos N. Papadopoulos
java.lang.NoSuchMethodError: org.apache.hadoop.hive.common.FileUtils.mkdir --> Spark to Hive
Prasanth M Sasidharan
Fwd: java.lang.NoSuchMethodError: org.apache.hadoop.hive.common.FileUtils.mkdir --> Spark to Hive
Prasanth M Sasidharan
Complexity with the data
Sid
Re: Complexity with the data
Apostolos N. Papadopoulos
Re: Complexity with the data
Sid
Re: Complexity with the data
Gavin Ray
Re: Complexity with the data
Sid
Re: Complexity with the data
Sid
Re: Complexity with the data
Bjørn Jørgensen
Re: Complexity with the data
Sid
Re: Complexity with the data
Bjørn Jørgensen
Re: Complexity with the data
Sid
Re: Complexity with the data
Apostolos N. Papadopoulos
Re: Complexity with the data
Sid
Re: Complexity with the data
Bjørn Jørgensen
Re: Complexity with the data
Sid
Re: Complexity with the data
Bjørn Jørgensen
Re: Complexity with the data
Gourav Sengupta
Re: Complexity with the data
Sid
[SPARK SQL] Spark Thrift server, It is not releasing memory.
Ramakrishna Chilaka
GCP Dataproc - adding multiple packages(kafka, mongodb) while submitting spark jobs not working
karan alang
Re: Job migrated from EMR to Dataproc takes 20 hours instead of 90 minutes
Ranadip Chatterjee
Re: Job migrated from EMR to Dataproc takes 20 hours instead of 90 minutes
Ori Popowski
Re: Job migrated from EMR to Dataproc takes 20 hours instead of 90 minutes
Ranadip Chatterjee
Re: Job migrated from EMR to Dataproc takes 20 hours instead of 90 minutes
Aniket Mokashi
Re: Job migrated from EMR to Dataproc takes 20 hours instead of 90 minutes
Ori Popowski
Re: Job migrated from EMR to Dataproc takes 20 hours instead of 90 minutes
Ranadip Chatterjee
Re: Job migrated from EMR to Dataproc takes 20 hours instead of 90 minutes
Gourav Sengupta
Spark Push-Based Shuffle causing multiple stage failures
Han Altae-Tran
Re: Spark Push-Based Shuffle causing multiple stage failures
Mridul Muralidharan
Re: Spark Push-Based Shuffle causing multiple stage failures
Ye Zhou
Re: Spark Push-Based Shuffle causing multiple stage failures
Han Altae-Tran
Re: Spark Push-Based Shuffle causing multiple stage failures
Ye Zhou
how to add a column for percent
wilson
Re: how to add a column for percent
Raghavendra Ganesh
Problem with implementing the Datasource V2 API for Salesforce
Rohit Pant
Re: Problem with implementing the Datasource V2 API for Salesforce
Gourav Sengupta
Final reminder: ApacheCon North America call for presentations closing soon
Rich Bowen
[SQL] Why does a small two-source JDBC query take ~150-200ms with all optimizations (AQE, CBO, pushdown, Kryo, unsafe) enabled? (v3.4.0-SNAPSHOT)
Gavin Ray
Spark 3 migration question
Jason Xu
What does Apache Spark do?
Turritopsis Dohrnii Teo En Ming
Re: What does Apache Spark do?
Pasha Finkelshtein
Stopping streaming after the write commit and before the read commit?
kineret M
A scene with unstable Spark performance
Bowen Song
Re: A scene with unstable Spark performance
Qian SUN
Re: A scene with unstable Spark performance
Bowen Song
Re: A scene with unstable Spark performance
Sungwoo Park
Re: A scene with unstable Spark performance
Chang Chen
Reverse proxy for Spark UI on Kubernetes
bo yang
Re: Reverse proxy for Spark UI on Kubernetes
Holden Karau
Re: Reverse proxy for Spark UI on Kubernetes
bo yang
Re: Reverse proxy for Spark UI on Kubernetes
wilson
Re: Reverse proxy for Spark UI on Kubernetes
bo yang
Re: Reverse proxy for Spark UI on Kubernetes
Holden Karau
Re: Reverse proxy for Spark UI on Kubernetes
bo yang
[Spark SQL]: Configuring/Using Spark + Catalyst optimally for read-heavy transactional workloads in JDBC sources?
Gavin Ray
Re: [Spark SQL]: Configuring/Using Spark + Catalyst optimally for read-heavy transactional workloads in JDBC sources?
Gavin Ray
[Spark SQL]: Does Spark SQL support WAITFOR?
K. N. Ramachandran
Re: [Spark SQL]: Does Spark SQL support WAITFOR?
K. N. Ramachandran
Re: [Spark SQL]: Does Spark SQL support WAITFOR?
Sean Owen
Re: [Spark SQL]: Does Spark SQL support WAITFOR?
K. N. Ramachandran
Re: [Spark SQL]: Does Spark SQL support WAITFOR?
Artemis User
Re: [Spark SQL]: Does Spark SQL support WAITFOR?
Someshwar Kale
Structured streaming help on releasing memory
Xavi Gervilla
Spark on K8s - repeating annoying exception
Shay Elbaz
Re: Spark on K8s - repeating annoying exception
Martin Grigorov
RE: [EXTERNAL] Re: Spark on K8s - repeating annoying exception
Shay Elbaz
How do I read parquet with python object
ben
Re: How do I read parquet with python object
Sean Owen
Need help on migrating Spark on Hortonworks to Kubernetes Cluster
Chetan Khatri
Count() action leading to errors | Pyspark
Sid
Re: Count() action leading to errors | Pyspark
Bjørn Jørgensen
groupby question
Irene Markelic
Re: groupby question
wilson
Something about Spark which has bothered me for a very long time, which I've never understood
Denarian Kislata
Re: Something about Spark which has bothered me for a very long time, which I've never understood
Lalwani, Jayesh
Kafka Spark Structure Streaming Error
nayan sharma
Disable/Remove datasources in Spark
Aditya
Re: Disable/Remove datasources in Spark
wilson
Re: Disable/Remove datasources in Spark
Aditya
Re: Disable/Remove datasources in Spark
wilson
Re: Disable/Remove datasources in Spark
wilson
trouble using spark in kubernetes
Andreas Klos
Re: Spark error with jupyter
Bjørn Jørgensen
Re: Spark error with jupyter
Gourav Sengupta
REMINDER - Travel Assistance available for ApacheCon NA New Orleans 2022
Gavin McDonald
Parse Execution Plan from PySpark
Pablo Alcain
RE: [EXTERNAL] Parse Execution Plan from PySpark
Shay Elbaz
Re: [EXTERNAL] Parse Execution Plan from PySpark
Walaa Eldin Moustafa
Re: [EXTERNAL] Parse Execution Plan from PySpark
Pablo Alcain
Idea for improving performance when reading from hive-like partition folders and specifying a filter [Spark 3.2]
Martin
how spark handle the abnormal values
wilson
Re: how spark handle the abnormal values
wilson
Re: how spark handle the abnormal values
Artemis User
Re: how spark handle the abnormal values
Mich Talebzadeh
Re: how spark handle the abnormal values
wilson
spark null values calculation
wilson
Re: spark null values calculation
wilson
structured streaming- checkpoint metadata growing indefinetely
Wojciech Indyk
Re: structured streaming- checkpoint metadata growing indefinetely
Wojciech Indyk