Here's the 1.0.0rc9 version of the docs: https://people.apache.org/~pwendell/spark-1.0.0-rc9-docs/running-on-mesos.html I refreshed them with the goal of steering users towards prebuilt packages rather than compiling from source, and of improving overall formatting and clarity, but I didn't otherwise modify the content. I don't expect any changes for rc10.
It does seem like a problem, though, that classpath issues are preventing that from running. Just to check, have you given the exact same jar a shot when running against a standalone cluster? If it works in standalone mode, I think that's good evidence that there's an issue with the Mesos classloaders in master. I'm running into a similar issue with classpaths failing on Mesos but working in standalone mode, but I haven't coherently written up my observations yet, so I haven't sent them to this list. I'd almost gotten to the point of thinking that my custom code needed to be included in the SPARK_EXECUTOR_URI, but that can't possibly be correct. The Spark workers launched on Mesos slaves should start with the Spark core jars and then transparently get the classes from custom code over the network, or at least that's how I thought it should work. For those who have been using Mesos with previous releases: you've never had to do that before, have you?

On Wed, May 21, 2014 at 3:30 PM, Gerard Maas <gerard.m...@gmail.com> wrote:
> Hi Tobias,
>
> On Wed, May 21, 2014 at 5:45 PM, Tobias Pfeiffer <t...@preferred.jp> wrote:
>> first, thanks for your explanations regarding the jar files!
>
> No prob :-)
>
>> On Thu, May 22, 2014 at 12:32 AM, Gerard Maas <gerard.m...@gmail.com> wrote:
>>> I was discussing it with my fellow Sparkers here and I totally
>>> overlooked the fact that you need the class files to de-serialize the
>>> closures (or whatever) on the workers, so you always need the jar file
>>> delivered to the workers in order for it to work.
>>
>> So the closure as a function is serialized, sent across the wire,
>> deserialized there, and *still* you need the class files? (I am not
>> sure I understand what is actually sent over the network then. Does
>> that serialization only contain the values that I close over?)
>
> I also had that mental lapse. Serialization converts object (not class)
> state, i.e. the current field values, into a byte stream, and
> de-serialization restores the bytes from the wire into a seemingly
> identical object at the receiving side (except for transient variables).
> For that, the receiver requires the class definition of that object to
> know what it needs to instantiate. So yes, the compiled classes need to
> be given to the Spark driver, and it will take care of dispatching them
> to the workers (much better than in the old RMI days ;-)
>
>> If I understand correctly what you are saying, then the documentation
>> at
>> https://people.apache.org/~pwendell/catalyst-docs/running-on-mesos.html
>> (list item 8) needs to be extended quite a bit, right?
>
> The Mesos docs have been recently updated here:
> https://github.com/apache/spark/pull/756/files
> Don't know where the latest version from master is built/available.
>
> -kr, Gerard.
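P.S. To make the serialization point above concrete, here is a minimal sketch in plain Scala (no Spark involved; the Adder class is just a hypothetical stand-in for user code). It shows that the serialized bytes carry a class name plus field values, not bytecode, so the receiving JVM must already have the class definition on its classpath:

  import java.io._

  // Hypothetical example class; serialization captures its field value (n),
  // not the compiled bytecode of apply().
  case class Adder(n: Int) extends (Int => Int) with Serializable {
    def apply(x: Int): Int = x + n
  }

  object SerializationSketch {
    def main(args: Array[String]): Unit = {
      val bytes = new ByteArrayOutputStream()
      val out = new ObjectOutputStream(bytes)
      out.writeObject(Adder(42)) // stream holds the class *name* plus n = 42
      out.close()

      val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
      // readObject resolves the class by name; on a JVM that doesn't have
      // Adder on its classpath, this is exactly where a
      // ClassNotFoundException would surface.
      val f = in.readObject().asInstanceOf[Int => Int]
      println(f(1)) // prints 43
    }
  }

That's why the application jar still has to reach the executors one way or another, e.g. by listing it via SparkConf.setJars(...) or with spark-submit's --jars flag, rather than the classes being reconstructed from the closure bytes themselves.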