Hello all,

I checked out the latest Mahout 0.8 code this morning, but I get an error when I run seq2sparse.
$ mahout seq2sparse -i in -o out --namedVector --weight tf
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
27-Feb-2013 17:08:58 org.slf4j.impl.JCLLoggerAdapter info
INFO: Maximum n-gram size is: 1
27-Feb-2013 17:08:58 org.slf4j.impl.JCLLoggerAdapter info
INFO: Minimum LLR value: 1.0
27-Feb-2013 17:08:58 org.slf4j.impl.JCLLoggerAdapter info
INFO: Number of reduce tasks: 1
27-Feb-2013 17:08:59 org.slf4j.impl.JCLLoggerAdapter info
INFO: Deleting /home/kris/mahoutTest/put/tokenized-documents
27-Feb-2013 17:08:59 org.apache.hadoop.util.NativeCodeLoader <clinit>
WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
27-Feb-2013 17:08:59 org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
27-Feb-2013 17:08:59 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Running job: job_local_0001
27-Feb-2013 17:09:00 org.apache.hadoop.util.ProcessTree isSetsidSupported
INFO: setsid exited with exit code 0
27-Feb-2013 17:09:00 org.apache.hadoop.mapred.Task initialize
INFO: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@51b4a74b
27-Feb-2013 17:09:00 org.apache.hadoop.mapred.LocalJobRunner$Job run
WARNING: job_local_0001
java.lang.NoSuchFieldError: LUCENE_41
    at org.apache.mahout.common.lucene.AnalyzerUtils.createAnalyzer(AnalyzerUtils.java:38)
    at org.apache.mahout.vectorizer.document.SequenceFileTokenizerMapper.setup(SequenceFileTokenizerMapper.java:66)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
27-Feb-2013 17:09:00 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: map 0% reduce 0%
27-Feb-2013 17:09:00 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Job complete: job_local_0001
27-Feb-2013 17:09:00 org.apache.hadoop.mapred.Counters log
INFO: Counters: 0
Exception in thread "main" java.lang.IllegalStateException: Job failed!
    at org.apache.mahout.vectorizer.DocumentProcessor.tokenizeDocuments(DocumentProcessor.java:95)
    at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:256)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:56)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)

Any reason why it would be complaining about LUCENE_41?

Best,
Kris

--
Dr Kris Jack,
http://www.mendeley.com/profiles/kris-jack/
