Thank you! That definitely did the trick.

I was trying to use ZEPPELIN_CLASSPATH_OVERRIDES to load the jars and couldn’t 
figure out why it wasn’t working. Also, 
mode.asInstanceOf[Hdfs].conf.get("tmpjars").split(",").foreach(println) was 
exactly the command I was looking for to diagnose this problem. Switching 
over to loading the jars through args.string got everything working.

On Thu, Dec 22, 2016 at 3:37 PM Prasad Wagle <prasadwa...@gmail.com> wrote:


Hi Paul,

It looks like the cascading jars are not distributed to the YARN cluster. Can 
you please try adding "zeppelin/interpreter/scalding/*" to the args.string 
property of the scalding interpreter?

Here's the args.string we use:

-libjars /home/zeppelin-user/zeppelin/interpreter/scalding/*,/home/zeppelin-user/deploy-bundle-201608111417/libs/* -Dscalding.reducer.estimator.classes=com.twitter.scalding.reducer_estimation.InputSizeReducerEstimator -Delephantbird.use.combine.input.format=true -Delephantbird.combine.split.size=134217728 --hdfs --repl

tmpjars contains jars that are distributed to the YARN cluster. You can see its 
contents with the command below:

%scalding 

mode.asInstanceOf[Hdfs].conf.get("tmpjars").split(",").foreach(println)
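If the tmpjars list is long, a small helper can make the check less error-prone. The following is only a rough sketch: missingJars, the jar-name fragments, and the sample string are made up for illustration and are not part of Scalding or Zeppelin. It takes the comma-separated tmpjars value and reports which expected jar-name fragments do not appear in it.

```scala
// Sketch: report which expected jar-name fragments are absent from a tmpjars value.
// `missingJars` is a hypothetical helper, not a Scalding or Zeppelin API.
def missingJars(tmpjars: String, expected: Seq[String]): Seq[String] = {
  // tmpjars is a comma-separated list of jar locations; keep only the file names.
  // Option(...) guards against conf.get returning null when the key is unset.
  val jarNames = Option(tmpjars).getOrElse("")
    .split(",")
    .map(_.trim)
    .filter(_.nonEmpty)
    .map(_.split("/").last)
  expected.filterNot(fragment => jarNames.exists(_.contains(fragment)))
}

// Made-up sample value; in a %scalding paragraph you would pass
// mode.asInstanceOf[Hdfs].conf.get("tmpjars") instead.
val sample =
  "file:/tmp/libs/cascading-core-2.6.1.jar,file:/tmp/libs/scalding-core_2.11-0.16.0.jar"

// Prints each expected fragment that no jar name contains.
missingJars(sample, Seq("cascading-core", "cascading-hadoop")).foreach(println)
```

An empty result suggests every expected jar is at least listed for distribution; anything printed is a candidate to add to -libjars in args.string.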

Thanks,

Prasad

On Thu, Dec 22, 2016 at 9:31 AM, Paul Brenner <pbren...@placeiq.com> wrote:

I'm trying to get Scalding working on Zeppelin while using YARN. I followed the 
steps in the docs 
https://share.polymail.io/v1/z/b/NTg1YzBkOTc5ZTkx/2oW5SQjbADW8zb9nS3JO5g421bQMXDLTC0FeCJ_WR7eecFsW9CWa-tzokB9aSLwG5t9yQ9B6QpcS8AmXjjFFxJ31Thy9lN7HSvilaEeoI6Az7C53CrnFmUoMnta-EYrRI5uEQhbztPSzTrQle-3E00nNiVc7M6poouix37ZlX2VacVqONwmxpu6FSMs2x-_t20QRzFz8S7lneRPUBtpzIyxBRLcRL4CMf1AeMxQIVl3FkoStgA==
 to build the interpreter and set up the classpath override. When I run in 
local mode, code executes properly. However, when I run on my cluster via YARN, 
my jobs fail with:

Error: java.lang.ClassNotFoundException: cascading.CascadingException

or

Error: java.lang.ClassNotFoundException: cascading.tuple.TupleException

What is even stranger to me is that I can go into Zeppelin and execute:

import cascading.tuple.TupleException
import cascading.CascadingException

Both imports succeed with no problem finding those classes. It is only when I 
actually use Scalding on YARN, for example loading data into a typed pipe and 
dumping it, that I get the ClassNotFoundException. Any ideas on how to debug 
this or what to fix?

(I already posted this on Stack Overflow with no luck: 
https://share.polymail.io/v1/z/b/NTg1YzBkOTc5ZTkx/2oW5SQjbADW8zb9nS3JO5g421bQMXDLTC0FeCJ_WR7eecFsW9CWa-tzokB9aSLwG5t9yQ9B6QpcS8AmXjjFFxJ31Thy9lN7HSvilaEeoI6Az7C53CrnFmUoMnta-EYrRI5uEQhbztPSzB6ElJ-PAwFLHk1sne6d3tKW61fUlXHcQZkGEK0jyp-yaS91unqy7hlcPyFvqV_lgfB3NDNd4PGFMQK4UWIUMywUEsXGgXRQGgho1Hlj-0iyZtbcZrE3vMNtVzjO8HWJpq45DjiV-N210k6sXeYh5YrKTEiEpER4B
)
