Hi, if your data is not very large, you can use the free stack from either
Cloudera or HDP. Cloudera Express is free and 100% open source.
-
Software Developer
SigmoidAnalytics, Bangalore
--
Hi,
Running Spark on YARN should help with memory management for Spark jobs.
Here is a good starting point:
https://spark.apache.org/docs/latest/running-on-yarn.html
YARN integrates well with HDFS and should be a good solution for a large
cluster.
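As a rough sketch of what submitting a job to YARN with explicit memory
limits might look like, here is a small Python helper that builds a
spark-submit command line. The helper name, the application path
(my_job.py) and the memory and executor values are made up for
illustration; pick values that fit your own cluster.

```python
import shlex


def build_spark_submit(app_path, driver_memory="2g", executor_memory="4g",
                       executor_cores=2, num_executors=4):
    """Build a spark-submit argument list for a YARN cluster-mode job.

    All default values are illustrative assumptions, not recommendations.
    """
    return [
        "spark-submit",
        "--master", "yarn",                     # let YARN schedule the job
        "--deploy-mode", "cluster",             # driver runs inside the cluster
        "--driver-memory", driver_memory,       # memory requested for the driver
        "--executor-memory", executor_memory,   # memory requested per executor
        "--executor-cores", str(executor_cores),
        "--num-executors", str(num_executors),
        app_path,                               # hypothetical application file
    ]


cmd = build_spark_submit("my_job.py")
print(shlex.join(cmd))  # shell-quoted command you could paste into a terminal
```

YARN then allocates containers sized from these requests, which is where
the memory management comes from.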
What specific features are you looking for that HDFS doe