I posted about the Application WebUI error (specifically the application WebUI, 
not the master WebUI generally) and have spent at least a few hours a day for 
over a week trying to resolve it, so I’d be very grateful for any suggestions. 
It is quite troubling that I appear to be the only one encountering this issue, 
and I’ve tried to include everything here which might be relevant (sorry for 
the length). Please see the thread "Current Build Gives HTTP ERROR" 
https://www.mail-archive.com/user%40spark.apache.org/msg18752.html to see 
specifics about the application webUI issue and the master log.


Environment:

I’m doing my Spark builds and application programming in Scala locally on my 
MacBook Pro in Eclipse, using modified ec2 launch scripts to launch my cluster, 
uploading my Spark builds and models to S3, and uploading applications to and 
submitting them from EC2. I’m using Java 8 locally and also installing and 
using Java 8 on my EC2 instances (which works with Spark 1.2.0). I have a 
Windows machine at home (the MacBook is my work machine), but have not yet 
attempted to launch from there.


Errors:

I’ve built two different recent git versions of spark both multiple times, and 
when running applications both have produced an Application WebUI error and 
this exception: 

Exception in thread "main" java.lang.IllegalArgumentException: Log directory 
/tmp/spark-events does not exist.

While both will display the master WebUI just fine, including running/completed 
applications, registered workers, etc., when I try to access a running or 
completed application’s WebUI by clicking its link, I receive a server error. 
When I manually create the above log directory, the exception goes away, but 
the WebUI problem does not. I don’t have any strong evidence, but I suspect 
these errors and whatever is causing them are related. 
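
For reference, the manual workaround for the exception amounts to something 
like the minimal sketch below. The function name is hypothetical, and the 
default path is simply the one from the exception message; it would need to run 
on whichever machine reports the missing directory.

```python
import os


def ensure_event_log_dir(path="/tmp/spark-events"):
    """Create the event log directory if it is missing.

    Returns True if the directory had to be created, False if it already existed.
    The default path is taken from the exception message, not from Spark itself.
    """
    if os.path.isdir(path):
        return False
    os.makedirs(path)
    return True
```

Calling ensure_event_log_dir() before submitting the application only removes 
the exception; as noted above, it does not fix the WebUI problem.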


Why and How of Modifications to Launch Scripts for Installation of Unreleased 
Spark Versions:

When using a prebuilt version of Spark on my cluster, everything works except 
the new methods I need. I had previously added those to my custom version of 
Spark and used them by building the spark-assembly.jar locally and then 
replacing the assembly file produced through the 1.1.0 ec2 launch scripts. 
However, my pull request has since been accepted and can now be found in the 
apache/spark repository, along with some additional features I’d like to use. 
Because I’d also like a more elegant, permanent solution for launching a 
cluster and installing unreleased versions of Spark on my EC2 clusters, I’ve 
modified the included ec2 launch scripts in this way (credit to gen tang here: 
https://www.mail-archive.com/user%40spark.apache.org/msg18761.html):

1. Clone the most recent git version of Spark
2. Use the make-dist script 
3. Tar the dist folder and upload the resulting 
spark-1.3.0-snapshot-hadoop1.tgz to S3 and change file permissions
4. Fork the mesos/spark-ec2 repository and modify the spark/init.sh script to 
do a wget of my hosted distribution instead of Spark’s stable release
5. Modify my spark_ec2.py script to point to my repository.
6. Modify my spark_ec2.py script to install Java 8 on my EC2 instances. (This 
works and does not produce the above stated errors when using a stable release 
like 1.2.0).
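
The packaging part of step 3 could be scripted roughly as follows. This is only 
a sketch: package_dist and its arguments are hypothetical names, and the upload 
to S3 and the permission change are deliberately left out.

```python
import os
import tarfile


def package_dist(dist_dir, archive_path):
    """Write dist_dir into a gzip-compressed tar archive at archive_path.

    The archive keeps the directory's own name as its single top-level entry,
    so unpacking it recreates that folder.
    """
    top_level = os.path.basename(os.path.normpath(dist_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(dist_dir, arcname=top_level)
    return archive_path
```

For example, package_dist("dist", "spark-1.3.0-snapshot-hadoop1.tgz") would 
produce the archive named in step 3 from a dist folder in the current 
directory.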


Additional Possibly Related Info:

As far as I can tell (I went through line by line), when I launch my recent 
build vs. the most recent stable release, the console prints almost identical 
INFO and WARNING messages, except where you would expect differences, e.g. 
version numbers. I’ve noted that after launch the prebuilt stable version does 
not have a /tmp/spark-events directory, but it is created when the application 
is launched, while it is never created in my build. Further, in my unreleased 
builds the application logs that I find are always stored as .inprogress files 
(when I set the logging directory to /root/ or add the /tmp/spark-events 
directory manually), even after completion. I believe that suffix is supposed 
to change to .completed (or something similar) when the application finishes.
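
A quick way to check which logs are still marked as in progress is something 
like the sketch below. The helper name is hypothetical, and it relies only on 
the .inprogress suffix described above, not on any knowledge of how Spark names 
finished logs.

```python
import os


def split_event_logs(log_dir):
    """Return (in_progress, other) lists of entry names found in log_dir.

    An entry counts as in progress if its name ends with ".inprogress";
    everything else is reported in the second list.
    """
    in_progress, other = [], []
    for name in sorted(os.listdir(log_dir)):
        if name.endswith(".inprogress"):
            in_progress.append(name)
        else:
            other.append(name)
    return in_progress, other
```

Running split_event_logs("/tmp/spark-events") after an application has 
finished should show whether its log was ever renamed.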


Thanks for any help!
