Dear Users,

This question is about the MLlib algorithms in general. Consider a hypothetical dataset with n records, where n could be very large. Will all the MLlib algorithms work on such a dataset even on a very minimal cluster, accepting degraded performance? Is there any relationship between n, the choice of algorithm, and the hardware setup? If the general question is too difficult to answer, can something be said about the popular classification and clustering algorithms?

Thanks and regards,