Dear Users,

This question is about the MLlib algorithms in general. Consider a hypothetical 
situation where you have a dataset with n records, where n could be very 
large. Will every MLlib algorithm still work on such a dataset when only a very 
minimal cluster is set up, even if performance is degraded? Is there any 
relationship between n, the choice of algorithm, and the hardware setup? If the 
general question is too difficult to answer, can something be said about the 
popular classification and clustering algorithms?

Thanks and regards,



