Re: Catalyst dependency on Spark Core

2017-11-05 Thread piyush.mukati
Hi, is there any work going on to remove the Spark Core dependency from Catalyst? Thanks -- Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/ - To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Re: Spark build is failing in amplab Jenkins

2017-11-05 Thread Xin Lu
To prevent this in the future, you could set up something that checks whether each worker has the path. If a worker doesn't satisfy the criteria, then just mark the worker offline or restart the process automatically. It's doable with a maintenance script on the Jenkins master. When it fails like thi
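A minimal sketch of the check suggested above, assuming the PATH value of each worker has already been collected somehow. The function names, worker names, and the Anaconda directory are hypothetical; how the PATH is fetched and how a worker is actually marked offline depend on the Jenkins setup and are deliberately left out.

```python
import os


def path_has_dir(path_value, required_dir):
    """Return True if required_dir is one of the entries of a PATH-style string."""
    wanted = os.path.normpath(required_dir)
    entries = [os.path.normpath(p) for p in path_value.split(os.pathsep) if p]
    return wanted in entries


def workers_to_take_offline(worker_paths, required_dir):
    """Given {worker name: PATH string}, list the workers whose PATH lacks required_dir."""
    return sorted(
        name
        for name, value in worker_paths.items()
        if not path_has_dir(value, required_dir)
    )


# Hypothetical data for illustration only.
anaconda_bin = os.path.join(os.sep, "opt", "anaconda", "bin")
usr_bin = os.path.join(os.sep, "usr", "bin")
example = {
    "worker-a": os.pathsep.join([anaconda_bin, usr_bin]),
    "worker-b": usr_bin,
}
print(workers_to_take_offline(example, anaconda_bin))
```

Comparing whole PATH entries rather than searching for a substring avoids false positives from directories that merely contain the expected name.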

Re: Spark build is failing in amplab Jenkins

2017-11-05 Thread Josh Rosen
Disconnecting and reconnecting each Jenkins worker appears to have resolved the PATH issue: in the System Info page for each worker, I now see a PATH which includes Anaconda. To restart the worker processes, I only needed to hit the "Disconnect" button in the Jenkins master UI for each worker, wai

Re: Spark build is failing in amplab Jenkins

2017-11-05 Thread shane knapp
hello from the canary islands! ;) i just saw this thread, and another one about a quick power loss at the colo where our machines are hosted. the master is on UPS but the workers aren't... and when they come back, the PATH variable specified in the workers' configs gets dropped and we see behavi

Re: Spark build is failing in amplab Jenkins

2017-11-05 Thread Xin Lu
Also, another thing to look at is whether you guys have any kind of nightly cleanup scripts for these workers that completely nuke the conda environments. If there is one, maybe that's why some of them recover after a while. I don't know enough about your infra right now to understand all the things th

Re: Spark build is failing in amplab Jenkins

2017-11-05 Thread Xin Lu
So, right now it looks like 2 and 6 are still broken, but 7 has recovered: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/SparkPullRequestBuilder/buildTimeTrend What I am suggesting is to just perhaps modify the SparkPullRequestBuilder configuration and run "which python" and t
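As a rough illustration of the "which python" check mentioned above, the sketch below resolves the interpreter the way `which` would and reports whether it sits under an expected directory. The helper names and the idea of an "expected prefix" are assumptions for illustration; the real build configuration is not shown in this thread.

```python
import os
import shutil


def is_under(path, prefix):
    """Return True if path is located inside the directory prefix."""
    path = os.path.abspath(path)
    prefix = os.path.abspath(prefix)
    return os.path.commonpath([path, prefix]) == prefix


def interpreter_under_prefix(expected_prefix, command="python", search_path=None):
    """Look up command on the search path and check where it was found.

    Returns (resolved path or None, True if it is under expected_prefix).
    search_path=None means the current process PATH is used.
    """
    resolved = shutil.which(command, path=search_path)
    if resolved is None:
        return None, False
    return resolved, is_under(resolved, expected_prefix)
```

A build step could call `interpreter_under_prefix` with the directory where the Anaconda install is expected and fail early when the second value is False.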

Re: Spark build is failing in amplab Jenkins

2017-11-05 Thread Xin Lu
Thanks, I actually don't have access to the machines or build configs to do proper debugging on this. It looks like these workers are shared with other build configurations like avocado and cannoli as well, and really any of the shared configs could be changing your JAVA_HOME and python environme

RE: Accessing DataFrame inside UserDefinedFunction.

2017-11-05 Thread knowsnothing
Thank you for your response, Anurag. I am not sure I get your point. Are you suggesting that the UDF somehow serializes not only the reference to the Dataset, but also all the data?

RE: Accessing DataFrame inside UserDefinedFunction.

2017-11-05 Thread Anurag Verma
This is expected. You are not accessing the Dataset dict when calling the UDF countPositiveSimilarity. The dict DataFrame as it existed when the UDF was created is encoded into the UDF. If you change dict later on, the changes will not automatically get picked up in the UDF countPositiveSimilarity.
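The "captured when it was created" behaviour described above can be pictured with a plain-Python analogy. This is not Spark code and does not show how Spark encodes a UDF; the names `make_counter`, `count_positive`, and `lookup` are hypothetical, and the snapshot is made explicitly here only to mimic the effect.

```python
def make_counter(positive_words):
    """Build a function that counts how many words are in positive_words.

    The collection is copied here, so the returned function keeps the
    contents as they were when it was created.
    """
    snapshot = frozenset(positive_words)

    def count_positive(words):
        return sum(1 for w in words if w in snapshot)

    return count_positive


lookup = {"good", "great"}
count_positive = make_counter(lookup)

lookup.add("fine")  # change made after the function was created
print(count_positive(["good", "fine", "bad"]))  # prints 1: "fine" is not in the snapshot
```

To pick up the later change in this analogy, the function has to be rebuilt from the updated collection.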