Re: Naming files while saving a Dataframe

2021-08-12 Thread Eric Beabes
This doesn't work as given here (https://stackoverflow.com/questions/36107581/change-output-filename-prefix-for-dataframe-write), but the answer suggests using the FileOutputFormat class. Will try that. Thanks. Regards. On Sun, Jul 18, 2021 at 12:44 AM Jörn Franke wrote: > Spark heavily depends on H
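For context: Spark always emits `part-*` file names regardless of writer options, so a common alternative to the FileOutputFormat route is simply renaming the files after the write completes. A minimal sketch for output written to a local filesystem (function name and prefix are hypothetical; the same idea applies on HDFS/S3 via the Hadoop FileSystem API):

```python
# Post-write rename workaround: Spark names its output files part-*, so
# rename them once df.write has finished. Local-filesystem sketch only.
import glob
import os

def rename_parts(out_dir: str, prefix: str) -> list:
    """Rename part-* files under out_dir to <prefix>-<n>, keeping extensions."""
    renamed = []
    for i, path in enumerate(sorted(glob.glob(os.path.join(out_dir, "part-*")))):
        base = os.path.basename(path)
        # Preserve any extension such as .csv or .snappy.parquet
        ext = base[base.index("."):] if "." in base else ""
        new_path = os.path.join(out_dir, f"{prefix}-{i:05d}{ext}")
        os.rename(path, new_path)
        renamed.append(new_path)
    return renamed
```

Note this leaves marker files such as `_SUCCESS` untouched, and it is not atomic: downstream readers should not scan the directory while the renames run.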

Replacing BroadcastNestedLoopJoin

2021-08-12 Thread Eric Beabes
We’ve two datasets that look like this: Dataset A: App specific data that contains (among other fields): ip_address Dataset B: Location data that contains start_ip_address_int, end_ip_address_int, latitude, longitude We’re (left) joining these two datasets as: A.ip_address >= B.start_ip_address
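For reference, the usual way to avoid a BroadcastNestedLoopJoin on a range predicate like this is to manufacture an equi-join key by bucketing the IP-integer space: each location row is duplicated into every bucket its [start, end] range overlaps (in PySpark, e.g. via `F.explode(F.sequence(...))`), and the join becomes an equality on the bucket id plus a range filter. A plain-Python sketch of the bucketing arithmetic (bucket width and names are illustrative):

```python
# Turn the non-equi range join (A.ip >= B.start AND A.ip <= B.end) into an
# equi-join on a bucket id, so Spark can choose a hash or sort-merge join
# instead of BroadcastNestedLoopJoin. Bucket width is an illustrative choice.
BUCKET = 2 ** 16  # width of one bucket in IP-integer space

def bucket_of(ip_int: int) -> int:
    """Bucket id for a single IP (side A of the join)."""
    return ip_int // BUCKET

def buckets_for_range(start_ip: int, end_ip: int) -> list:
    """All bucket ids a [start, end] range overlaps (side B gets one
    duplicated row per id, e.g. via F.explode(F.sequence(...)))."""
    return list(range(start_ip // BUCKET, end_ip // BUCKET + 1))
```

After joining on the bucket id, re-apply `start_ip <= ip <= end_ip` as a filter, since a bucket can contain range rows that do not actually match the IP.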

Re: K8S submit client vs. cluster

2021-08-12 Thread Mich Talebzadeh
OK, Amazon is not much different from Google Kubernetes Engine (GKE). When you submit a job, you need a powerful compute server to submit it from. It is another host; you cannot submit from the K8s cluster nodes themselves (I am not aware if one can actually do that). Anyway, you submit something lik
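For context, a cluster-mode submission from such an external host to a K8s API server looks roughly like this (API server host, namespace, image, service account and application path are placeholders):

```shell
spark-submit \
  --master k8s://https://<k8s-apiserver-host>:443 \
  --deploy-mode cluster \
  --name my-app \
  --conf spark.kubernetes.namespace=spark \
  --conf spark.kubernetes.container.image=<registry>/spark-py:3.1.2 \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.executor.instances=2 \
  local:///opt/spark/work-dir/my_app.py
```

The `local://` scheme tells Spark the application file is already inside the container image rather than on the submitting host.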

RE: K8S submit client vs. cluster

2021-08-12 Thread Bode, Meikel, NMA-CFD
On EKS... From: Mich Talebzadeh Sent: Thursday, 12 August 2021 15:47 To: Bode, Meikel, NMA-CFD Cc: user@spark.apache.org Subject: Re: K8S submit client vs. cluster Ok As I see it with PySpark even if it is submitted as cluster, it will be converted to client mode anyway Are you running t

Re: [EXTERNAL] [Marketing Mail] Reading SPARK 3.1.x generated parquet in SPARK 2.4.x

2021-08-12 Thread Gourav Sengupta
Hi Saurabh, a very big note of thanks from Gourav :) Regards, Gourav Sengupta On Thu, Aug 12, 2021 at 4:16 PM Saurabh Gulati wrote: > We had issues with this migration mainly because of changes in spark date > calendars. See >

Re: [EXTERNAL] [Marketing Mail] Reading SPARK 3.1.x generated parquet in SPARK 2.4.x

2021-08-12 Thread Saurabh Gulati
We had issues with this migration mainly because of changes in spark date calendars. See We got this working by setting the below params: ("spark.sql.legacy.parquet.datetimeReba
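The truncated setting above is left as written; for reference, the calendar-rebase parameters Spark 3.x documents for this Julian/Proleptic-Gregorian change are the following (values: LEGACY, CORRECTED or EXCEPTION; the int96 variants exist from Spark 3.1):

```shell
# LEGACY rebases dates/timestamps on write to the hybrid Julian calendar,
# so Spark 2.4.x readers interpret old-calendar values correctly.
--conf spark.sql.legacy.parquet.datetimeRebaseModeInWrite=LEGACY
--conf spark.sql.legacy.parquet.datetimeRebaseModeInRead=LEGACY
--conf spark.sql.legacy.parquet.int96RebaseModeInWrite=LEGACY
--conf spark.sql.legacy.parquet.int96RebaseModeInRead=LEGACY
```

These only matter for dates/timestamps before the Gregorian cutover (and very old timestamps); modern values are unaffected either way.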

Re: K8S submit client vs. cluster

2021-08-12 Thread Mich Talebzadeh
Ok. As I see it, with PySpark even if it is submitted as cluster mode, it will be converted to client mode anyway. Are you running this on AWS or GCP?

RE: K8S submit client vs. cluster

2021-08-12 Thread Bode, Meikel, NMA-CFD
Hi Mich, All PySpark. Best, Meikel From: Mich Talebzadeh Sent: Thursday, 12 August 2021 13:41 To: Bode, Meikel, NMA-CFD Cc: user@spark.apache.org Subject: Re: K8S submit client vs. cluster Is this Spark or PySpark?

Re: K8S submit client vs. cluster

2021-08-12 Thread Mich Talebzadeh
Is this Spark or PySpark?

K8S submit client vs. cluster

2021-08-12 Thread Bode, Meikel, NMA-CFD
Hi all, If we schedule a Spark job on K8s, how are volume mappings handled? In client mode I would expect that the driver's volumes have to be mapped manually in the pod template. Executor volumes are attached dynamically based on submit parameters. Right...? In cluster mode I would expect that volumes
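For reference, Spark's documented `spark.kubernetes.{driver,executor}.volumes.*` submit parameters attach volumes to the pods Spark itself creates (the executors always, and the driver in cluster mode); a client-mode driver running in your own pod is described by your own pod spec, as the question suggests. A sketch (the volume name `data` and claim name are placeholders):

```shell
# Mount a PersistentVolumeClaim into driver and executors (cluster mode).
# The volume type segment can be hostPath, emptyDir, nfs or
# persistentVolumeClaim; the volume name ("data" here) is arbitrary.
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.data.mount.path=/data \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.data.options.claimName=my-pvc \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path=/data \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName=my-pvc
```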