Strange. mrlegacy still depends on mahout-math and should include it in the job jar. Or did it get that far out of hand after the MR deprecation?
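One quick way to confirm whether a given jar really bundles the class is to look for its entry directly. Below is a minimal sketch, not something from the Mahout code base: the class name JarClassCheck and its two methods are made up for illustration, and it uses only the JDK's java.util.jar API. It only inspects top-level class entries, so classes that sit inside jars nested in the job jar (for example under lib/) would not be found by it.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper (not part of Mahout or Hadoop): checks whether a jar
// contains a class as a top-level entry.
final class JarClassCheck {

    /** Converts a fully qualified class name into its entry path inside a jar. */
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath has an entry for the given class. */
    static boolean jarContainsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getJarEntry(entryNameFor(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarClassCheck <path-to-jar> <fully.qualified.ClassName>");
            return;
        }
        boolean found = jarContainsClass(args[0], args[1]);
        System.out.println(args[1] + (found ? " is in " : " is NOT in ") + args[0]);
    }
}
```

Run it against the job jar with org.apache.mahout.math.Vector as the second argument to see whether the class made it into the packaging.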
On Fri, Jan 9, 2015 at 8:51 AM, mw <[email protected]> wrote:

> I found a solution!
> I had to upload the missing jars to HDFS on the YARN cluster and add the
> following to the Hadoop Configuration:
>
> hadoopConf.set("tmpjars","/lib/mahout-math-1.0-20150108.230237-316.jar,/lib/commons-cli-2.0-mahout.jar");
>
> Best,
> Max
>
> On 01/09/2015 02:13 PM, mw wrote:
>
>> I looked into the submitted job.jar and found that the missing class
>> (org.apache.mahout.math.Vector) is not in it.
>>
>> On 01/09/2015 12:57 PM, mw wrote:
>>
>>> I wrote a message to the Hadoop list about it. I also found this ticket:
>>> https://issues.apache.org/jira/browse/MAHOUT-1498
>>> Could it be a related bug?
>>>
>>> Best,
>>> Max
>>>
>>> On 01/08/2015 06:18 PM, Pat Ferrel wrote:
>>>
>>>> That sounds like a Hadoop list question.
>>>>
>>>> All I can say is there is a job.jar in mrlegacy/target with all
>>>> dependencies packaged. This should have everything needed for lda.
>>>>
>>>> On Jan 8, 2015, at 5:50 AM, mw <[email protected]> wrote:
>>>>
>>>> Hello again,
>>>>
>>>> maybe my question was misleading.
>>>> I am asking whether the intended usage is to provide the job with the
>>>> required libraries and send them together with the job to YARN (if yes,
>>>> how can this be done?), or to add the required classes to the classpath
>>>> of every node in the cluster.
>>>> What is the best practice?
>>>>
>>>> Best,
>>>> Max
>>>>
>>>> On 01/07/2015 06:13 PM, mw wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> the first error was due to a missing property in yarn.xml. However,
>>>>> now I have a different problem.
>>>>>
>>>>> I am working on a web application that should execute LDA on an
>>>>> external YARN cluster.
>>>>>
>>>>> I am uploading all the relevant sequence files onto the YARN cluster.
>>>>> This is how I try to remotely execute LDA on the cluster:
>>>>>
>>>>> try {
>>>>>     ugi.doAs(new PrivilegedExceptionAction<Void>() {
>>>>>         public Void run() throws Exception {
>>>>>             Configuration hdoopConf = new Configuration();
>>>>>             hdoopConf.set("fs.defaultFS", "hdfs://xxx.xxx.xxx.xxx:9000/user/xx");
>>>>>             hdoopConf.set("yarn.resourcemanager.hostname", "xxx.xxx.xxx.xxx");
>>>>>             hdoopConf.set("mapreduce.framework.name", "yarn");
>>>>>             hdoopConf.set("mapred.framework.name", "yarn");
>>>>>             hdoopConf.set("mapred.job.tracker", "xxx.xxx.xxx.xxx");
>>>>>             hdoopConf.set("dfs.permissions.enabled", "false");
>>>>>             hdoopConf.set("hadoop.job.ugi", "xx");
>>>>>             hdoopConf.set("mapreduce.jobhistory.address", "xxx.xxx.xxx.xxx:10020");
>>>>>             CVB0Driver driver = new CVB0Driver();
>>>>>             try {
>>>>>                 driver.run(hdoopConf, sparseVectorIn.suffix("/matrix"),
>>>>>                     topicsOut, k, numTerms,
>>>>>                     doc_topic_smoothening, term_topic_smoothening,
>>>>>                     maxIter, iteration_block_size,
>>>>>                     convergenceDelta,
>>>>>                     sparseVectorIn.suffix("/dictionary.file-0"),
>>>>>                     topicsOut.suffix("/DocumentTopics/"),
>>>>>                     sparseVectorIn,
>>>>>                     seed, testFraction, numTrainThreads,
>>>>>                     numUpdateThreads, maxItersPerDoc,
>>>>>                     numReduceTasks, backfillPerplexity);
>>>>>             } catch (ClassNotFoundException e) {
>>>>>                 e.printStackTrace();
>>>>>             } catch (InterruptedException e) {
>>>>>                 e.printStackTrace();
>>>>>             }
>>>>>             return null;
>>>>>         }
>>>>>     });
>>>>> } catch (InterruptedException e) {
>>>>>     e.printStackTrace();
>>>>> }
>>>>>
>>>>> I am getting the following error message:
>>>>>
>>>>> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>>     at java.lang.Class.forName0(Native Method)
>>>>>     at java.lang.Class.forName(Class.java:344)
>>>>>     at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
>>>>>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
>>>>>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
>>>>>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1929)
>>>>>     at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:837)
>>>>>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:983)
>>>>>     at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
>>>>>     at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
>>>>>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
>>>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
>>>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>>>>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>     at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>>>>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
>>>>>
>>>>> [the identical stack trace is printed three more times]
>>>>>
>>>>> java.lang.InterruptedException: Failed to complete iteration 1 stage 1
>>>>>     at org.apache.mahout.clustering.lda.cvb.CVB0Driver.runIteration(CVB0Driver.java:502)
>>>>>     at org.apache.mahout.clustering.lda.cvb.CVB0Driver.run(CVB0Driver.java:319)
>>>>>     ...
>>>>>
>>>>> So apparently the job is missing some Mahout classes. How can I provide
>>>>> the required classes to YARN?
>>>>>
>>>>> Best,
>>>>>
>>>>> Max
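For anyone reusing the tmpjars workaround quoted above: the value is a single comma-separated string, so a stray comma or an empty entry in a path would silently change the list. The following is a rough sketch of a helper that builds that string from a list of jar paths. TmpJars and join are hypothetical names for illustration, not Hadoop or Mahout API, and the sketch deliberately uses only the JDK so it runs without Hadoop on the classpath. The call to Configuration.set is shown only as a comment, in the form used earlier in this thread.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Hadoop or Mahout): builds the
// comma-separated value that the thread passes to the "tmpjars" property.
final class TmpJars {

    /** Joins jar paths with commas, rejecting entries that would corrupt the list. */
    static String join(List<String> jarPaths) {
        if (jarPaths.isEmpty()) {
            throw new IllegalArgumentException("at least one jar path is required");
        }
        for (String path : jarPaths) {
            if (path == null || path.trim().isEmpty()) {
                throw new IllegalArgumentException("empty jar path");
            }
            if (path.contains(",")) {
                throw new IllegalArgumentException("jar path must not contain a comma: " + path);
            }
        }
        return String.join(",", jarPaths);
    }

    public static void main(String[] args) {
        // The two jar paths reported as working in this thread.
        List<String> jars = Arrays.asList(
                "/lib/mahout-math-1.0-20150108.230237-316.jar",
                "/lib/commons-cli-2.0-mahout.jar");
        String value = join(jars);
        System.out.println(value);
        // With Hadoop on the classpath this would be applied as in the thread:
        // hadoopConf.set("tmpjars", value);
    }
}
```

Whether bare paths such as /lib/... resolve against HDFS depends on the cluster's configured default file system, so treat the paths themselves as specific to Max's setup.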
