In that case, I must have misunderstood the following (from
http://spark.incubator.apache.org/docs/0.8.1/job-scheduling.html).
Apologies. Ognen
"Inside a given Spark application (SparkContext instance), multiple
parallel jobs can run simultaneously if they were submitted from
separate threads. By “job”, in this section, we mean a Spark action
(e.g. save, collect) and any tasks that need to run to evaluate that
action. Spark’s scheduler is fully thread-safe and supports this use
case to enable applications that serve multiple requests (e.g. queries
for multiple users).
By default, Spark’s scheduler runs jobs in FIFO fashion. Each job is
divided into “stages” (e.g. map and reduce phases), and the first job
gets priority on all available resources while its stages have tasks to
launch, then the second job gets priority, etc. If the jobs at the head
of the queue don’t need to use the whole cluster, later jobs can start
to run right away, but if the jobs at the head of the queue are large,
then later jobs may be delayed significantly.
Starting in Spark 0.8, it is also possible to configure fair sharing
between jobs. Under fair sharing, Spark assigns tasks between jobs in a
“round robin” fashion, so that all jobs get a roughly equal share of
cluster resources. This means that short jobs submitted while a long job
is running can start receiving resources right away and still get good
response times, without waiting for the long job to finish. This mode is
best for multi-user settings.
To enable the fair scheduler, simply set the spark.scheduler.mode
property to FAIR before creating a SparkContext:"
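The docs follow that sentence with a one-line snippet; if I am remembering the 0.8-era, system-property-based configuration correctly, it is roughly this (master URL reused from the example later in this thread):

import org.apache.spark.SparkContext

// In Spark 0.8.x scheduler settings are read from Java system properties,
// so the mode must be set before the SparkContext is constructed.
System.setProperty("spark.scheduler.mode", "FAIR")
val sc = new SparkContext("spark://10.128.228.142:7077", "Simple App")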
On 2/25/14, 12:30 PM, Mayur Rustagi wrote:
The fair scheduler merely reorders tasks. I think he is looking to run
multiple pieces of code on a single context, on demand from
customers... if the code & order are already decided, then the fair
scheduler will ensure that all tasks get equal cluster time :)
Mayur Rustagi
Ph: +919632149971
http://www.sigmoidanalytics.com
https://twitter.com/mayur_rustagi
On Tue, Feb 25, 2014 at 10:24 AM, Ognen Duzlevski
<og...@nengoiksvelzud.com> wrote:
Doesn't the fair scheduler solve this?
Ognen
On 2/25/14, 12:08 PM, abhinav chowdary wrote:
Sorry for not being clear earlier.
"how do you want to pass the operations to the spark context?"
This is partly what I am looking for: how to access the active
Spark context, and possible ways to pass operations to it.
Thanks
On Tue, Feb 25, 2014 at 10:02 AM, Mayur Rustagi
<mayur.rust...@gmail.com> wrote:
how do you want to pass the operations to the spark context?
Mayur Rustagi
Ph: +919632149971
http://www.sigmoidanalytics.com
https://twitter.com/mayur_rustagi
On Tue, Feb 25, 2014 at 9:59 AM, abhinav chowdary
<abhinav.chowd...@gmail.com> wrote:
Hi,
I am looking for ways to share the SparkContext, meaning I need to
be able to perform multiple operations on the same Spark context.
Below is the code of a simple app I am testing:
import org.apache.spark.SparkContext

def main(args: Array[String]) {
  println("Welcome to example application!")
  val sc = new SparkContext("spark://10.128.228.142:7077", "Simple App")
  println("Spark context created!")
  println("Creating RDD!")
Now, once this context is created, I want to access it to
submit multiple jobs/operations.
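For example, something along these lines is what I would like to be able to do (just a sketch; the thread setup and the parallelize calls are placeholders I made up to illustrate several concurrent operations on one context):

import org.apache.spark.SparkContext

object SharedContextExample {
  def main(args: Array[String]) {
    val sc = new SparkContext("spark://10.128.228.142:7077", "Simple App")

    // Two actions submitted from separate threads against the same context.
    // According to the job-scheduling docs quoted above, the scheduler is
    // thread-safe, so these jobs can run concurrently.
    val counter = new Thread(new Runnable {
      def run() { println("count = " + sc.parallelize(1 to 1000).count()) }
    })
    val summer = new Thread(new Runnable {
      def run() { println("sum = " + sc.parallelize(1 to 1000).reduce(_ + _)) }
    })
    counter.start(); summer.start()
    counter.join(); summer.join()

    sc.stop()
  }
}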
Any help is much appreciated
Thanks
--
Warm Regards
Abhinav Chowdary