Re: Dimension mismatch exception

2014-03-21 Thread Herb Roitblat
Computing the cosine between two documents requires that the vectors for each document be the same length (same number of elements, same dimensionality, not the norm). The length of the vector is the length of the vocabulary for the whole set. The two sets will inevitably have different nu
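
A minimal sketch of the point above, assuming plain whitespace tokenization and raw term counts (neither comes from the thread). The class and method names are hypothetical and are not part of Lucene or Apache Commons. Both documents are laid out along one shared, ordered vocabulary, so the two vectors always have the same number of elements.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not Lucene or commons-math API.
final class SharedVocabularyCosine {

    // Count how often each whitespace-separated term occurs in a document.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String term : text.toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) {
                tf.merge(term, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Lay the counts out along a fixed, ordered vocabulary; absent terms get 0.
    static double[] toVector(Map<String, Integer> tf, TreeSet<String> vocabulary) {
        double[] v = new double[vocabulary.size()];
        int i = 0;
        for (String term : vocabulary) {
            v[i++] = tf.getOrDefault(term, 0);
        }
        return v;
    }

    // Cosine of two vectors that must already have the same dimensionality.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "Dimension mismatch: " + a.length + " != " + b.length);
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // convention chosen here for an all-zero vector
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Build both vectors over the union vocabulary, then compare them.
    static double cosineOfTexts(String docA, String docB) {
        Map<String, Integer> tfA = termFrequencies(docA);
        Map<String, Integer> tfB = termFrequencies(docB);
        TreeSet<String> vocabulary = new TreeSet<>(tfA.keySet());
        vocabulary.addAll(tfB.keySet());
        return cosine(toVector(tfA, vocabulary), toVector(tfB, vocabulary));
    }

    public static void main(String[] args) {
        System.out.println(cosineOfTexts("apache lucene index", "lucene index search"));
    }
}
```

For a whole collection, the vocabulary would be built once from every document rather than per pair, which is what "the vocabulary for the whole set" suggests.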

Re: Dimension mismatch exception

2014-03-21 Thread Stefy D.
Hello Herb. Thank you very much for your reply. I want to have the cosine for each a and each b. I'm using code for Lucene I found online, which I will post below. Hello Uwe. Thank you very much for replying. I am using a class DocVector and then a class in which I try to compute the similariti

RE: Dimension mismatch exception

2014-03-20 Thread Uwe Schindler
Hi Stefy, the stack trace you posted has nothing to do with Apache Lucene. It looks like you are using some commons-lang3 classes here, but no Lucene code at all. So I think your question might be better asked on the commons-math mailing list, unless you have some Lucene code around, too. If th

Re: Dimension mismatch exception

2014-03-20 Thread Herb Roitblat
If you want to compute the cosines between pairs of documents (each a compared with each b), then the dimension is 100, the size of each document. If you want to compare the whole index then you will need to make them the same length (number of elements) by padding the shorter with zeroes. There
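
The zero-padding idea can be sketched as follows. This is an illustration under a stated assumption, not code from the thread: position i must refer to the same term in both vectors, otherwise padding removes the exception but the resulting cosine is meaningless. `ZeroPadding`, `padTo`, and `cosinePadded` are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not Lucene or commons-math API.
final class ZeroPadding {

    // Return a copy of v extended to the given length; new slots are 0.0.
    static double[] padTo(double[] v, int length) {
        if (length < v.length) {
            throw new IllegalArgumentException("Target length is shorter than the vector");
        }
        return Arrays.copyOf(v, length);
    }

    // Pad the shorter vector with zeroes, then compute the cosine.
    // Assumes index i means the same term in both vectors.
    static double cosinePadded(double[] a, double[] b) {
        int n = Math.max(a.length, b.length);
        double[] pa = padTo(a, n);
        double[] pb = padTo(b, n);
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < n; i++) {
            dot += pa[i] * pb[i];
            normA += pa[i] * pa[i];
            normB += pb[i] * pb[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // convention chosen here for an all-zero vector
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] shorter = {1.0, 2.0};
        double[] longer = {1.0, 2.0, 3.0};
        System.out.println(cosinePadded(shorter, longer));
    }
}
```

Padding leaves the norm of the shorter vector unchanged, since the added elements contribute nothing to the dot product or to the sum of squares.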