Hi,

I have a problem with cluster labels. I am trying to use ClusterLabels.getLabels(), but in ClusterLabels.scoreDocumentFrequencies I run into situations where k22 becomes negative, which yields an exception.
The numbers I get are:
corpusSize: 435
clusterSize: 181
outDF: 277
  => long k22 = corpusSize - clusterSize - outDF;
    becomes -23 but has to be at least zero.
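
To make the arithmetic easy to check, here is a minimal standalone sketch of that one expression with the numbers above. The class and method names (K22Check, k22, countsAreConsistent) are hypothetical and not part of Mahout. The consistency check rests on an assumption: that outDF is the number of documents outside the cluster that contain the term, so it should never exceed corpusSize - clusterSize.

```java
public class K22Check {

    // Hypothetical helper: the same expression as quoted above.
    static long k22(long corpusSize, long clusterSize, long outDF) {
        return corpusSize - clusterSize - outDF;
    }

    // Assumption: outDF counts documents outside the cluster that contain
    // the term, so it must lie between 0 and (corpusSize - clusterSize).
    static boolean countsAreConsistent(long corpusSize, long clusterSize, long outDF) {
        return outDF >= 0 && outDF <= corpusSize - clusterSize;
    }

    public static void main(String[] args) {
        long corpusSize = 435;
        long clusterSize = 181;
        long outDF = 277;

        // Only 435 - 181 = 254 documents lie outside the cluster,
        // yet outDF claims 277 of them contain the term.
        System.out.println("k22 = " + k22(corpusSize, clusterSize, outDF));
        System.out.println("consistent = " + countsAreConsistent(corpusSize, clusterSize, outDF));
    }
}
```

If that assumption holds, the three counts cannot all come from the same set of documents, which may mean the index and the clustered sequence do not contain the same documents.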

I have no clue which mistake of mine is causing this. I use the same Lucene analyzer for creating the Mahout sequence and for the Lucene index that is passed as a parameter to the ClusterLabels constructor.

Regards
Stefan

