Hello! Can someone point me to some explanatory documentation for outlier detection and removal in clustering in Mahout? I am unable to understand the internal mechanism of outlier detection just by reading the Javadoc:

clusterClassificationThreshold: Is a clustering strictness / outlier removal parameter. Its value should be between 0 and 1. Vectors having pdf below this value will not be clustered.
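
My current guess, and it is only a guess from the wording of the Javadoc, is that each vector gets one score per cluster (the "pdf") and is left out as an outlier when even its best score falls below the threshold. In Java that reading would look roughly like the sketch below. The class and method names are mine, not Mahout's API, and I have assumed the comparison is made against the largest per-cluster value.

```java
public class OutlierThresholdSketch {

    // Hypothetical helper: the largest per-cluster score for one vector.
    public static double maxScore(double[] pdfPerCluster) {
        double max = Double.NEGATIVE_INFINITY;
        for (double p : pdfPerCluster) {
            max = Math.max(max, p);
        }
        return max;
    }

    // Assumption: a vector is clustered only if its best score reaches
    // the threshold; otherwise it is treated as an outlier and skipped.
    public static boolean shouldCluster(double[] pdfPerCluster, double threshold) {
        return maxScore(pdfPerCluster) >= threshold;
    }

    public static void main(String[] args) {
        double threshold = 0.5; // stands in for clusterClassificationThreshold

        double[] nearOneCluster = {0.9, 0.1};  // made-up scores
        double[] ambiguous = {0.4, 0.3, 0.3};  // made-up scores

        System.out.println(shouldCluster(nearOneCluster, threshold)); // true
        System.out.println(shouldCluster(ambiguous, threshold));      // false
    }
}
```

If that is right, a threshold of 0 would keep every vector and a threshold close to 1 would keep only vectors that belong almost entirely to a single cluster, but I could not confirm this from the documentation.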

What does the pdf represent?

Thanks,
Prabhakar
