Hi, Spark can run on top of HDFS. Spark's RDDs do not need replication, because lost partitions can be rebuilt from their lineage. HDFS, however, inherently replicates data. How do these two concepts fit together? Thank you
- Spark on HDFS with replication Deep Pradhan
- Re: Spark on HDFS with replication Stanley Shi
- Re: Spark on HDFS with replication Deep Pradhan