I have been posting on the Mesos list, as I am trying to find out whether it is possible to share Spark drivers. In standalone cluster mode, the Master handles requests, and you can instantiate a new SparkContext against a currently running master. However, in Mesos (and perhaps YARN) I don't see how this is possible.
I guess I am curious as to why. It could make quite a bit of sense to have one driver act as a master, running as a certain user (ideally running out in the Mesos cluster, which I believe Tim Chen is working on). That driver could belong to a user and serve as a long-lived, resource-controlled instance that the user could use for ad hoc queries. By contrast, running many small drivers out on the cluster seems to be a waste of driver resources, as each driver would be using the same resources, and rarely would many be in use at once (if they were for a user's ad hoc environment). Additionally, the advantages of the shared driver seem to play out for a user as they come back to the environment over and over again.

Does this make sense? I really want to understand how looking at it this way is wrong, either from a Spark paradigm perspective or a technological perspective. I will grant that I am coming from a traditional background, so some of the older ideas for how to set things up may be creeping into my thinking, but if that's the case, I'd love to understand better.

Thanks!

John
