Hello,

Before I ask my questions, let me say what I am trying to do and briefly
describe the setup I have so far. I am building an API service that serves
an ML model built with Spark ML. I have Spark deployed on Kubernetes in
standalone mode (the default cluster manager) with 2 worker nodes, each
worker running as a pod. The API app itself is written in Python with
FastAPI and uses PySpark to start a Spark application on the Spark cluster.
The Spark Driver is created on app startup and lives in the same pod as the
app, so the Spark application is expected to run for as long as the API pod
is alive, which is indefinitely. I understand that this is not the typical
pattern, since a Spark application is usually a workload that completes at
some point and returns a result.
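
Concretely, the startup looks roughly like this (the app name, master URL,
and endpoint here are simplified placeholders, not the real ones):

    from contextlib import asynccontextmanager
    from fastapi import FastAPI
    from pyspark.sql import SparkSession

    @asynccontextmanager
    async def lifespan(app: FastAPI):
        # The driver is created here and lives as long as the API process/pod
        app.state.spark = (
            SparkSession.builder
            .appName("model-serving-api")           # illustrative name
            .master("spark://spark-master:7077")    # standalone master Service; host/port are placeholders
            .getOrCreate()
        )
        yield
        app.state.spark.stop()

    app = FastAPI(lifespan=lifespan)

    @app.get("/healthz")
    def healthz():
        # the actual model-scoring endpoints use the Spark ML pipeline similarly
        return {"spark": app.state.spark.version}
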
When load on the API increases, I would like the Spark cluster to scale up
automatically by increasing the number of worker replicas, and the Spark
application to scale up by using those new workers. For horizontal pod
autoscaling in Kubernetes we use KEDA <https://keda.sh/>, and in this case
I will probably use certain Prometheus metrics exposed by the Spark Master
as the scaling trigger. This leads to my questions (after a quick sketch of
how I expect the application side to pick up the new workers).
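
On the application side, I am assuming dynamic allocation is what would let
the long-running application actually make use of new executors; something
like this (the executor counts and timeout are just placeholders I have not
tuned yet):

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("model-serving-api")
        .master("spark://spark-master:7077")                    # placeholder
        .config("spark.dynamicAllocation.enabled", "true")
        # shuffle tracking instead of an external shuffle service
        .config("spark.dynamicAllocation.shuffleTracking.enabled", "true")
        .config("spark.dynamicAllocation.minExecutors", "2")    # placeholder values
        .config("spark.dynamicAllocation.maxExecutors", "10")
        .config("spark.dynamicAllocation.executorIdleTimeout", "120s")
        .getOrCreate()
    )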

1. I initially thought the number of pending jobs (jobs waiting for
available executors) in the Spark cluster would be a good metric to use as
a scaling trigger. But after some digging, I found that "pending" is not
one of the job statuses, and Spark doesn't seem to have an obvious concept
of a job queue (the snippet after these questions shows what I mean). Are
there any metrics that would give some indication of the number of jobs in
the cluster waiting to be run?
2. Are there other Spark metrics that would make sense to use as a scaling
trigger?
3. Does this setup of having a perpetually running Spark application even
make much sense in the first place?
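
To make question 1 concrete: from the driver, the only job statuses I can
find look like this (there is no queued/pending state that I can see):

    # `spark` is the SparkSession created at startup (see the sketch above)
    tracker = spark.sparkContext.statusTracker()
    for job_id in tracker.getActiveJobsIds():
        info = tracker.getJobInfo(job_id)  # may be None if the job info is gone
        if info is not None:
            # info.status is one of RUNNING, SUCCEEDED, FAILED, UNKNOWN
            print(job_id, info.status)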

Thanks very much in advance for any advice!
