Hi, I'm running into a pretty concerning issue with Shark (granted, I'm running v0.8.1).
I have a Spark slave node that has run out of disk space. When I try to start Shark, it attempts to deploy the application to a directory on that node, fails, and eventually gives up (I see a "Master Removed our application" message in the Shark server log).

Is Spark supposed to be able to ignore a slave if something goes wrong on it? I realize the slave probably still appears "alive" enough. I restarted the Spark master in hopes that it would detect that the slave is suffering, but that doesn't seem to be the case. Any thoughts appreciated -- we'll monitor disk space, but I'm a little worried that the whole cluster is non-functional on account of a single slave.
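For what it's worth, the disk-space monitoring I mentioned will probably just be a small per-node check cron'd on each worker, something along these lines (WORK_DIR and the threshold are placeholders for our setup, not anything Spark-specific):

    #!/usr/bin/env python
    # Rough per-node check to cron on each Spark worker.
    # WORK_DIR is a placeholder -- point it at whatever directory the
    # worker actually deploys application files into on that node.
    import os
    import sys

    WORK_DIR = "/path/to/spark/work"   # placeholder work directory
    MIN_FREE_GB = 5.0                  # arbitrary warning threshold

    st = os.statvfs(WORK_DIR)
    free_gb = (st.f_bavail * st.f_frsize) / float(1024 ** 3)

    if free_gb < MIN_FREE_GB:
        print("WARNING: only %.1f GB free under %s" % (free_gb, WORK_DIR))
        sys.exit(1)

    print("OK: %.1f GB free under %s" % (free_gb, WORK_DIR))

That should at least warn us before a node fills up again, but it doesn't address the underlying question of why one bad slave takes the application down with it.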