Hi,
On Mon, Apr 11, 2022 at 7:43 AM Jason Jun wrote:
> the official doc, https://spark.apache.org/docs/latest/job-scheduling.html,
> doesn't mention whether this works on Kubernetes clusters?
>
You could use the Volcano scheduler for more advanced setups on Kubernetes.
Here is an article explaining how...
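For reference, recent Spark releases (3.3 and later) integrate with Volcano through dedicated pod feature steps. A sketch of the submission flags from the Spark-on-Kubernetes docs; the API server URL and container image are placeholders:

```shell
# Spark 3.3+ with the Volcano scheduler (URL/image are placeholders):
spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  --conf spark.kubernetes.scheduler.name=volcano \
  --conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep \
  --conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep \
  ...
```

Note that within-application FAIR scheduling (the job-scheduling doc above) is independent of the cluster manager; Volcano addresses scheduling between applications on the cluster.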
The official doc, https://spark.apache.org/docs/latest/job-scheduling.html,
doesn't mention whether it works on Kubernetes clusters?
Can anyone quickly answer this?
TIA.
Jason
...multiple threads (using a thread pool or futures).
This way you will be able to run multiple writes concurrently on the driver,
which will add all related jobs/tasks to a common queue.
At this point you can decide whether you want FIFO or FAIR. In some cases
(because of data locality) the FAIR scheduler can produce better results,
running several jobs at the same time, using the idle executors.
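The driver-side pattern described above can be sketched with plain Scala futures (no Spark dependency here; `runWrite` is a hypothetical stand-in for a blocking Spark action such as a DataFrame write):

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ConcurrentActions {
  // Hypothetical stand-in for a blocking Spark action (e.g. df.write.parquet).
  def runWrite(id: Int): Int = id * 10

  def runAll(n: Int): Int = {
    // Dedicated pool: each in-flight action occupies one driver-side thread.
    val pool = Executors.newFixedThreadPool(4)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      // Submit all actions at once; under spark.scheduler.mode=FAIR their
      // tasks would share the executors instead of queueing strictly FIFO.
      val all = Future.sequence((1 to n).toList.map(i => Future(runWrite(i))))
      Await.result(all, 30.seconds).sum
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit =
    println(runAll(4)) // prints 100
}
```

In a real application each future would call an action on the shared SparkContext/SparkSession, which is safe to use from multiple driver threads.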
My question is: is this achievable with the FAIR scheduler approach,
and if yes, how?
As I read it, the fair scheduler takes a pool of jobs and then schedules
their tasks in a round-robin fashion. If I submit action 1 and action 2 at
the same time...
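For what it's worth, the standard recipe from the Spark job-scheduling docs is: set `spark.scheduler.mode=FAIR`, then submit action 1 and action 2 from separate driver threads; optionally define pools in an allocation file referenced by `spark.scheduler.allocation.file`. A sketch of that file (pool name and values are illustrative):

```xml
<?xml version="1.0"?>
<!-- fairscheduler.xml, referenced via spark.scheduler.allocation.file -->
<allocations>
  <pool name="pool1">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>2</minShare>
  </pool>
</allocations>
```

With FAIR mode enabled, tasks from both concurrently submitted actions are scheduled round-robin, so the second action can use executors the first leaves idle.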
...lization and handle scenarios of nested parfor.
> >>
> >> At the end of the day, we just want to configure fair scheduling in a
> >> programmatic way without the need for additional configuration files
> >> which is a hassle for a library that is meant to work out-of-the-box.
Simply setting 'spark.scheduler.mode' to FAIR does not do the trick
because we end up with a single default fair scheduler pool in FIFO
mode, which is equivalent to FIFO. Providing a way to set the mode of
the default scheduler would be awesome.
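One possible workaround, not an official API: generate the allocation file at startup and point `spark.scheduler.allocation.file` at it, so nothing ships with the library. The fair scheduler only builds its built-in default pool if a pool named `default` isn't already defined, so declaring one in the generated file should switch the default pool's mode; worth verifying against your Spark version. A sketch:

```scala
import java.nio.file.{Files, Path}

object ProgrammaticPools {
  // Writes a minimal allocation file and returns its path. Pass it as:
  //   conf.set("spark.scheduler.allocation.file", path.toString)
  // Declaring a pool literally named "default" is intended to override the
  // built-in default pool's FIFO mode (verify against your Spark version).
  def writeAllocationFile(mode: String = "FAIR"): Path = {
    val xml =
      s"""<?xml version="1.0"?>
         |<allocations>
         |  <pool name="default">
         |    <schedulingMode>$mode</schedulingMode>
         |    <weight>1</weight>
         |    <minShare>0</minShare>
         |  </pool>
         |</allocations>
         |""".stripMargin
    val path = Files.createTempFile("fairscheduler-", ".xml")
    Files.write(path, xml.getBytes("UTF-8"))
    path
  }
}
```

The temp file is created before the SparkContext, so the scheduler picks it up at startup like a hand-written fairscheduler.xml.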
Regarding why fair scheduling showed generally better performance for
out-of-core datasets, I don't have...
...iterations mapped to Spark tasks). If
the data is too large and non-partitionable, the parfor loop is
executed as a multi-threaded operator in the driver, and each worker
might spawn several data-parallel Spark jobs in the context of the
worker's scheduler pool, for operations that don't fit into the
driver.
We decided to use these fair scheduler pools (w/ fair scheduling
across pools, FIFO per pool) instead of...
Hi all,
for concurrent Spark jobs spawned from the driver, we use Spark's fair
scheduler pools, which are set and unset in a thread-local manner by
each worker thread. Typically (for rather long jobs), this works very
well. Unfortunately, in an application with lots of very short
par...
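Under the hood this is just a thread-local property on the SparkContext: `sc.setLocalProperty("spark.scheduler.pool", name)` to set, `null` to unset. The set/unset discipline can be sketched without Spark using an `InheritableThreadLocal` (which is also what Spark's local properties are built on); names here are illustrative:

```scala
object PoolLocal {
  // Mimics SparkContext local properties, which use InheritableThreadLocals.
  private val pool = new InheritableThreadLocal[String]

  // Runs `body` with the calling thread assigned to the given pool and
  // unsets it afterwards, like the per-worker set/unset described above.
  def withPool[A](name: String)(body: => A): A = {
    pool.set(name)
    try body finally pool.remove()
  }

  def current: Option[String] = Option(pool.get)
}

object Demo {
  def main(args: Array[String]): Unit = {
    val t = new Thread(() => {
      PoolLocal.withPool("worker1") {
        // Spark jobs spawned here would land in pool "worker1".
        println(PoolLocal.current) // prints Some(worker1)
      }
      println(PoolLocal.current) // prints None (unset after the block)
    })
    t.start(); t.join()
  }
}
```

Because the property is inheritable, jobs launched from child threads of a worker also land in that worker's pool, which is what makes the per-worker pooling work.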
@Crystal
You can use Spark on YARN. YARN has a fair scheduler; modify yarn-site.xml to enable it.
Sent from my iPad
> On Aug 11, 2014, at 6:49, Matei Zaharia wrote:
>
> Hi Crystal,
>
> The fair scheduler is only for jobs running concurrently within the same
> SparkContext (i.e. within an application)
Hi Crystal,
The fair scheduler is only for jobs running concurrently within the same
SparkContext (i.e. within an application), not for separate applications on the
standalone cluster manager. It has no effect there. To run more of those
concurrently, you need to set a cap on how many cores each application uses.
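The corresponding standalone-mode settings, per the Spark configuration docs (values here are illustrative):

```
# Per application (SparkConf or spark-defaults.conf): cap the cores this
# app grabs so later applications can also get executors.
spark.cores.max           4

# Or a cluster-wide default, set on the master, for apps that don't
# set spark.cores.max themselves:
spark.deploy.defaultCores 4
```

Without a cap, the first application takes all cores in the cluster and later submissions sit in WAITING state until it finishes.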
Hi
I am trying to switch from FIFO to FAIR in standalone mode.
My environment:
hadoop 1.2.1
spark 0.8.0 using standalone mode
and I modified the code:
ClusterScheduler.scala -> System.getProperty("spark.scheduler.mode", "FAIR")
SchedulerBuilder.scala -> val DEFAULT_SCHEDULING_MODE = ...
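Patching the source shouldn't be necessary: `spark.scheduler.mode` is read as a Java system property, so (assuming Spark 0.8's `SPARK_JAVA_OPTS` mechanism) it can be set at launch instead:

```
# Instead of editing ClusterScheduler.scala, set the property at launch:
SPARK_JAVA_OPTS="-Dspark.scheduler.mode=FAIR" ./spark-shell
```

Note this only changes how jobs are scheduled within one application; scheduling between standalone applications is governed by core allocation, as described above.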