I am running LDA for the first time. I ran the following command to
test it on the Reuters dataset, but I got an error:
lda -i reuters-vectors/tf-vectors -o reuters-lda-sparse -k 10 -v 7000 -x 20 -ow
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/vineeth_rakesh/src/mahout/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/vineeth_rakesh/src/mahout/examples/target/dependency/slf4j-jcl-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/vineeth_rakesh/src/mahout/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
12/10/18 12:11:17 ERROR driver.MahoutDriver: : Try the new Collapsed Variation Bayes LDA, try bin/mahout cvb or bin/mahout cvb0_local
As I mentioned, this command seems to be for Mahout 0.5. If I have to
use Collapsed Variational Bayes (CVB) LDA instead, how do I pass the
parameters? Is there a website describing the usage of CVB LDA?
On 12-10-18 09:09 AM, Jake Mannix wrote:
For Mahout 0.7, the LDA model files are just a
SequenceFile<IntWritable, VectorWritable>, with the row numbers being the
topicIds and the entries being the (un-normalized) probabilities for each
termId. You can inspect them with vectordump:
bin/vectordump --dictionary <path to dictionary file> \
--dictionaryType <either text or sequencefile> \
--input <path to model files> \
--vectorSize <num entries per topic you want to see> \
--sortVectors
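
To make the "(un-normalized) probabilities" point concrete, here is a
minimal Python sketch of what reading one topic row amounts to: divide each
weight by the row total, then list the heaviest terms first. This is not
Mahout code and does not read SequenceFiles. The function name top_terms
and the sample dictionary and weights are made up for illustration only.

```python
def top_terms(topic_row, dictionary, n=3):
    """Normalize one topic row and return its n most probable terms.

    topic_row:  dict of termId -> un-normalized weight (hypothetical data)
    dictionary: dict of termId -> term string (hypothetical data)
    """
    total = sum(topic_row.values())
    if total <= 0:
        raise ValueError("topic row has no positive weight to normalize")
    # Sort by weight, largest first, like a sorted dump of the vector.
    ranked = sorted(topic_row.items(), key=lambda kv: kv[1], reverse=True)
    # Fall back to the raw termId if the dictionary has no entry for it.
    return [(dictionary.get(term_id, str(term_id)), weight / total)
            for term_id, weight in ranked[:n]]


# Made-up example: one topic over a four-word vocabulary.
dictionary = {0: "oil", 1: "bank", 2: "trade", 3: "wheat"}
topic_row = {0: 6.0, 1: 1.0, 2: 2.0, 3: 1.0}

result = top_terms(topic_row, dictionary, n=2)
for term, prob in result:
    print(f"{term}\t{prob:.2f}")
```

The row total is the only normalizer needed here because each row holds the
weights of a single topic over all terms.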
On Wed, Oct 17, 2012 at 10:11 PM, vineeth <[email protected]> wrote:
Hello,
I am following this website:
http://theglassicon.com/computing/machine-learning/running-lda-algorithm-mahout
(Mahout 0.5). It gives the complete procedure for getting the probabilities
of words and topics using LDA. However, these steps do not work on Mahout
0.7. Can someone point me to an updated version of these steps, or
provide the alternative commands and parameters?
Thank You
Vineeth