Ashu,

There is one main issue, plus a few stylistic and grammatical things I noticed.

1> You take an RDD of type String, which you expect to be comma-separated. This limits usability, since users will have to convert their RDD to that format only for you to split each string again. It would make more sense to take an RDD of type ((col_num: Int, attr_value: Int), frequency: Int). You could also use Long instead of Int.
2> The increment functions could be more along the lines of def incr = {count += 1; count}, which is in a more functional style.

3> The reset functions could simply be def reset_count = count = 1L

4> In https://github.com/codeAshu/Outlier-Detection-with-AVF-Spark/blob/master/OutlierWithAVFModel.scala#L108 you have a key of type String which is basically a string of the form "number, string", when you could just have a tuple of the form (i: Int, word: String).

5> The lines exceed the style guide's 100-character limit.

Thanks,
Anant

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/MLlib-Contributing-Algorithm-for-Outlier-Detection-tp8880p8992.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
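
P.S. To make points 1 to 4 concrete, here is a rough sketch of what I have in mind. It is not taken from your code: AvfSketch, attributeFrequencies and Counter are names I made up, and a plain Seq stands in for the RDD so the snippet runs without Spark. On a real pair RDD the same tuple keys could go through reduceByKey instead of groupBy.

```scala
// Illustrative sketch only; all names here are hypothetical.
object AvfSketch {
  // Point 4: key by (columnIndex, value) rather than a "number,string" String.
  type AttrKey = (Int, String)

  // Points 1 and 4: count how often each (column, value) pair occurs.
  // Assumption: a Seq of rows stands in for the RDD.
  def attributeFrequencies(rows: Seq[Array[String]]): Map[AttrKey, Long] =
    rows
      .flatMap(row => row.zipWithIndex.map { case (value, col) => (col, value) }.toSeq)
      .groupBy(identity)
      .map { case (key, occurrences) => key -> occurrences.size.toLong }

  // Points 2 and 3: incr returns the updated count; reset sets it back to 1.
  class Counter {
    private var count: Long = 1L
    def incr: Long = { count += 1; count }
    def reset(): Unit = { count = 1L }
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq(Array("a", "x"), Array("a", "y"), Array("b", "x"))
    val freqs = attributeFrequencies(rows)
    println(freqs((0, "a"))) // "a" appears twice in column 0, so this prints 2

    val counter = new Counter
    println(counter.incr) // count starts at 1, so this prints 2
  }
}
```

With tuple keys there is no need to parse the column number back out of a String later, and the compiler checks the key type for you.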