> Each vector of term counts should be the size of the vocabulary (so if the
> vocabulary, or dictionary, has 10 words, each vector should have a size of
> 10). This probably means that there will be some elements with zero counts,
> and a sparse vector might be a good way to handle that.
>
> On Wed, Jan 13, 2016 at 6:40 PM, Li Li wrote:
>>
>> It looks like the problem is the vectors of term counts in the corpus
>> are not always the vocabulary size.
>> Do you mean some integers that never occur in the corpus?
>> For example ...
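(A minimal sketch of what this suggests: building each document's counts as a
sparse vector of exactly the vocabulary size, so absent terms become implicit
zeros. The class and variable names here, CountVectors and vocabSize, are
illustrative, not from the thread:)

import java.util.Map;
import java.util.TreeMap;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

public class CountVectors {
    // Turn a document's term-id -> count map into a sparse vector whose
    // length is always the vocabulary size; absent terms get zero counts.
    public static Vector toCountVector(int vocabSize, Map<Integer, Integer> termCounts) {
        // Vectors.sparse expects indices in increasing order
        TreeMap<Integer, Integer> sorted = new TreeMap<>(termCounts);
        int[] indices = new int[sorted.size()];
        double[] values = new double[sorted.size()];
        int i = 0;
        for (Map.Entry<Integer, Integer> e : sorted.entrySet()) {
            indices[i] = e.getKey();
            values[i] = e.getValue();
            i++;
        }
        return Vectors.sparse(vocabSize, indices, values);
    }
}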
> If the vectors are not the vocabulary size, it will not throw an exception,
> but the term indices will start to be incorrect. For a small number of
> iterations it is OK, but increasing iterations causes the indices to get
> larger also. Maybe that is what is going on in the JIRA you linked to?
>
> On Wed, Jan 13, 2016 at 1:17 AM,
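(A quick way to test for the symptom described above: every term id that
describeTopics() returns should fall inside the vocabulary. vocabSize and the
class name are assumptions, not from the thread:)

import scala.Tuple2;
import org.apache.spark.mllib.clustering.DistributedLDAModel;

public class TermIndexCheck {
    // Print any term id from the trained model that falls outside [0, vocabSize)
    public static void checkTermIndices(DistributedLDAModel ldaModel, int vocabSize) {
        Tuple2<int[], double[]>[] topics = ldaModel.describeTopics(10);
        for (int t = 0; t < topics.length; t++) {
            for (int termId : topics[t]._1()) {
                if (termId < 0 || termId >= vocabSize) {
                    System.err.println("topic " + t + ": out-of-range term id " + termId);
                }
            }
        }
    }
}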
I will try Spark 1.6.0 to see whether it is a bug in 1.5.2.
On Wed, Jan 13, 2016 at 3:58 PM, Li Li wrote:
> I have set up a standalone Spark cluster and used the same code. It
> still failed with the same exception.
> I also preprocessed the data to lines of integers and used the Scala ...
>
> I ran your example locally and got output that looks good. Sorry, I don't
> have a YARN cluster setup right now, so maybe the error you are seeing is
> specific to that. Btw, I am running the latest Spark code from the master
> branch. Hope that helps some!
>
> Bryan
>
> On Mon, Jan 4, 2016 at 8:42 PM,
Could anyone help? The problem is very easy to reproduce. What's wrong?
On Wed, Dec 30, 2015 at 8:59 PM, Li Li wrote:
> I used a small data set and reproduced the problem.
> But I don't know whether my code is correct, because I am not familiar
> with Spark.
> So I first post my code:
  }
}));
corpus.cache();
// Cluster the documents into topicNumber topics using LDA
DistributedLDAModel ldaModel = (DistributedLDAModel) new LDA()
    .setMaxIterations(iterNumber)
    .setK(topicNumber)
    .run(corpus);
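(For context, a self-contained sketch of the kind of program this fragment
appears to be cut from, modeled on the standard MLlib Java LDA example: parse
lines of space-separated term counts, index the documents, and run LDA.
iterNumber and topicNumber match the fragment above; the input path, app name,
and parameter values are assumptions:)

import scala.Tuple2;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.clustering.DistributedLDAModel;
import org.apache.spark.mllib.clustering.LDA;
import org.apache.spark.mllib.linalg.Matrix;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

public class LdaExample {
    public static void main(String[] args) {
        int iterNumber = 20;   // assumed values
        int topicNumber = 3;
        JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("LdaExample"));

        // One document per line; each line is space-separated term counts,
        // so every vector already has the vocabulary size
        JavaRDD<Vector> parsed = sc.textFile(args[0]).map(line -> {
            String[] parts = line.trim().split(" ");
            double[] values = new double[parts.length];
            for (int i = 0; i < parts.length; i++) {
                values[i] = Double.parseDouble(parts[i]);
            }
            return Vectors.dense(values);
        });

        // LDA expects (documentId, termCountVector) pairs
        JavaPairRDD<Long, Vector> corpus =
            JavaPairRDD.fromJavaRDD(parsed.zipWithIndex().map(Tuple2::swap));
        corpus.cache();

        // Cluster the documents into topicNumber topics using LDA
        DistributedLDAModel ldaModel = (DistributedLDAModel) new LDA()
            .setMaxIterations(iterNumber)
            .setK(topicNumber)
            .run(corpus);

        Matrix topics = ldaModel.topicsMatrix();
        System.out.println("topicsMatrix: " + topics.numRows() + " x " + topics.numCols());
        sc.stop();
    }
}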
On Wed, Dec 30, 2015 at 3:34 PM, Li Li wrote:
> I will use a p...
> https://issues.apache.org/jira/browse/SPARK-12488
>
> I haven't figured out yet what is causing it. Do you have a small corpus
> which reproduces this error, and which you can share on the JIRA? If so,
> that would help a lot in debugging this failure.
>
> Thanks!
> Joseph
>
> On Sun, Dec 27, 2015,
I ran my LDA example on a YARN 2.6.2 cluster with Spark 1.5.2.
It throws an exception at the line: Matrix topics = ldaModel.topicsMatrix();
But in the YARN job history UI, the job shows as successful. What's wrong?
I submit the job with
./bin/spark-submit --class Myclass \
--master yarn-client \
--num-executors ...
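(For reference, a hypothetical complete form of that submit command; the
resource values and jar name are placeholders, only --class and --master come
from the thread:)

./bin/spark-submit --class Myclass \
  --master yarn-client \
  --num-executors 4 \
  --executor-memory 2g \
  --executor-cores 2 \
  myclass.jar <input-path>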