Hi,

Cardinality is currently missing from spark.mllib as well, but I'd suggest adding it to the simple descriptive statistics for both spark.ml and spark.mllib. It is useful even for double-precision floating-point data, as a way to understand the "uniqueness" of the feature data.
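To make the suggestion concrete, here is a minimal sketch of what a per-feature cardinality statistic would report. It is plain Python rather than any Spark API, and the function name `column_cardinality` and the sample rows are made up for illustration only.

```python
from typing import List, Sequence


def column_cardinality(rows: Sequence[Sequence[float]]) -> List[int]:
    """Return the number of distinct values in each column of `rows`.

    Hypothetical helper for illustration; not part of spark.ml or spark.mllib.
    """
    if not rows:
        return []
    n_cols = len(rows[0])
    # One set of seen values per feature column.
    distinct = [set() for _ in range(n_cols)]
    for row in rows:
        for j, value in enumerate(row):
            distinct[j].add(value)
    return [len(s) for s in distinct]


# Made-up feature rows: column 0 is binary-like, column 1 is all unique,
# column 2 is constant.
rows = [
    [0.0, 1.5, 3.0],
    [0.0, 2.5, 3.0],
    [1.0, 3.5, 3.0],
    [1.0, 4.5, 3.0],
]

cardinalities = column_cardinality(rows)
# Ratio of distinct values to row count, one rough reading of "uniqueness".
uniqueness = [c / len(rows) for c in cardinalities]
```

Even though every column is stored as a double, the counts separate a constant feature, a binary-like feature, and a fully unique one, which is the kind of information the other descriptive statistics do not show directly.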
Cheers,
Brad