Re: unable to deploy Pyspark application on GKE, Spark installed using bitnami helm chart

2024-08-27 Thread Mat Schaffer
I use https://github.com/kubeflow/spark-operator rather than the bitnami chart, but https://medium.com/@kayvan.sol2/spark-on-kubernetes-d566158186c6 shows running spark-submit by exec'ing into a master pod. Might be something to try.

On Mon, Aug 26, 2024 at 12:22 PM karan alang wrote:
> We are currently us

Spark for offline log processing/querying

2016-05-22 Thread Mat Schaffer
I'm curious about trying to use Spark as a cheap/slow ELK (Elasticsearch, Logstash, Kibana) system. Thinking something like:
- instances rotate local logs
- copy rotated logs to s3 (s3://logs/region/grouping/instance/service/*.logs)
- Spark to convert from raw text logs to parquet
- maybe Presto to

Re: Spark for offline log processing/querying

2016-05-23 Thread Mat Schaffer
ably be much faster on ELK. If your queries are more interactive and
> not about batch processing then it does not make so much sense. I am not
> sure why you plan to use Presto.
>
> On 23 May 2016, at 07:28, Mat Schaffer wrote:
>
> I'm curious about trying to use spark as

Re: kubeflow spark operator & SparkHistoryService on k8s - spark driver/executor logs not showing up Spark History Server

2024-10-23 Thread Mat Schaffer
We have a similar setup (EKS/S3) and use Promtail to collect pod logs into Loki. We haven't tried to get the history UI log links working. Instead, we link to both the history server and the logs from the same job/cluster overview dashboards in Grafana.

On Wed, Oct 23, 2024 at 3:36 PM karan alang wrote

Re: spark k8s submit

2025-01-05 Thread Mat Schaffer
This should work. We use the same setting to specify pod volume configurations.

On Thu, Jan 2, 2025 at 11:41 AM jilani shaik wrote:
> Hi,
>
> I am trying to run Spark on the Kubernetes cluster, but that cluster has
> certain validation to deploy any pod that is not allowing me to run my
> Spark