Thank you! That definitely did the trick. I was trying to use ZEPPELIN_CLASSPATH_OVERRIDES to load the jars and couldn’t figure out why it wasn’t working. Also mode.asInstanceOf[Hdfs].conf.get("tmpjars").split(",").foreach(println) was exactly the command that I was looking for to diagnose this problem. Switching over to loading jars through the args.string got everything working.
On Thu, Dec 22, 2016 at 3:37 PM Prasad Wagle <prasadwa...@gmail.com> wrote:

Hi Paul,

It looks like the cascading jars are not distributed to the YARN cluster. Can you please try adding "zeppelin/interpreter/scalding/*" to the args.string property of the scalding interpreter? Here's the args.string we use:

    -libjars /home/zeppelin-user/zeppelin/interpreter/scalding/*,/home/zeppelin-user/deploy-bundle-201608111417/libs/* -Dscalding.reducer.estimator.classes=com.twitter.scalding.reducer_estimation.InputSizeReducerEstimator -Delephantbird.use.combine.input.format=true -Delephantbird.combine.split.size=134217728 --hdfs --repl

tmpjars contains jars that are distributed to the YARN cluster. You can see its contents with the command below:

    %scalding
    mode.asInstanceOf[Hdfs].conf.get("tmpjars").split(",").foreach(println)

Thanks,
Prasad

On Thu, Dec 22, 2016 at 9:31 AM, Paul Brenner <pbren...@placeiq.com> wrote:

I'm trying to get Scalding working on Zeppelin while using YARN. I followed the steps in the docs (https://share.polymail.io/v1/z/b/NTg1YzBkOTc5ZTkx/2oW5SQjbADW8zb9nS3JO5g421bQMXDLTC0FeCJ_WR7eecFsW9CWa-tzokB9aSLwG5t9yQ9B6QpcS8AmXjjFFxJ31Thy9lN7HSvilaEeoI6Az7C53CrnFmUoMnta-EYrRI5uEQhbztPSzTrQle-3E00nNiVc7M6poouix37ZlX2VacVqONwmxpu6FSMs2x-_t20QRzFz8S7lneRPUBtpzIyxBRLcRL4CMf1AeMxQIVl3FkoStgA==) to build the interpreter and set up the classpath override. When I run in local mode, code executes properly. However when I run on my cluster via YARN my jobs fail with:

    Error: java.lang.ClassNotFoundException: cascading.CascadingException

or

    Error: java.lang.ClassNotFoundException: cascading.tuple.TupleException

What is even stranger to me is that I can go into Zeppelin and execute:

    import cascading.tuple.TupleException
    import cascading.CascadingException

And both appear to have no problem finding those classes.
It is only when I try to actually use Scalding (on YARN), for example by loading data into a typed pipe and dumping it, that I get the ClassNotFoundException. Any ideas on how to debug or what to fix? (I already posted this on Stack Overflow with no luck: https://share.polymail.io/v1/z/b/NTg1YzBkOTc5ZTkx/2oW5SQjbADW8zb9nS3JO5g421bQMXDLTC0FeCJ_WR7eecFsW9CWa-tzokB9aSLwG5t9yQ9B6QpcS8AmXjjFFxJ31Thy9lN7HSvilaEeoI6Az7C53CrnFmUoMnta-EYrRI5uEQhbztPSzB6ElJ-PAwFLHk1sne6d3tKW61fUlXHcQZkGEK0jyp-yaS91unqy7hlcPyFvqV_lgfB3NDNd4PGFMQK4UWIUMywUEsXGgXRQGgho1Hlj-0iyZtbcZrE3vMNtVzjO8HWJpq45DjiV-N210k6sXeYh5YrKTEiEpER4B)
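
A note on the tmpjars check discussed in this thread: printing every entry works, but on a long list it can be easier to ask directly which expected jars are absent. The following is a minimal sketch in plain Scala, not something from the Scalding or Zeppelin APIs. The object name TmpJarsCheck, its two methods, and the jar-name prefixes used in the usage line are all hypothetical and chosen only for illustration. It also guards against a null value, which Hadoop's Configuration.get returns when the property is not set.

```scala
// Hypothetical helper (not part of Scalding or Zeppelin): inspects the
// comma-separated "tmpjars" value and reports which expected jars are absent.
object TmpJarsCheck {

  // Turn the raw tmpjars string into bare jar file names.
  // A null input (property not set) yields an empty list.
  def jarNames(tmpjars: String): Seq[String] =
    Option(tmpjars).toSeq
      .flatMap(_.split(",").toSeq)
      .map(_.trim)
      .filter(_.nonEmpty)
      .map(path => path.substring(path.lastIndexOf('/') + 1))

  // Return the prefixes for which no jar file name in tmpjars starts with them.
  def missing(tmpjars: String, requiredPrefixes: Seq[String]): Seq[String] = {
    val names = jarNames(tmpjars)
    requiredPrefixes.filterNot(prefix => names.exists(_.startsWith(prefix)))
  }
}
```

In a %scalding paragraph this could be called as TmpJarsCheck.missing(mode.asInstanceOf[Hdfs].conf.get("tmpjars"), Seq("cascading-core", "cascading-hadoop")), assuming those prefixes match the jar names in your build. An empty result would suggest that the jars are being shipped to the YARN cluster; any prefix that comes back points at a jar that still needs to be added to -libjars in args.string.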