Hi,

I'm curious whether it's a common approach to have discussions in JIRA rather than here. I don't think it's the ASF way.

Regards,
Jacek Laskowski
http://blog.japila.pl

On 17 May 2014 at 23:55, "Matei Zaharia" <matei.zaha...@gmail.com> wrote:

> We do actually have replicated StorageLevels in Spark. You can use
> MEMORY_AND_DISK_2 or construct your own StorageLevel with your own custom
> replication factor.
>
> BTW you guys should probably have this discussion on the JIRA rather than
> the dev list; I think the replies somehow ended up on the dev list.
>
> Matei
>
> On May 17, 2014, at 1:36 AM, Mridul Muralidharan <mri...@gmail.com> wrote:
>
> > We don't have 3x replication in spark :-)
> > And if we use replicated storagelevel, while decreasing odds of failure, it
> > does not eliminate it (since we are not doing a great job with replication
> > anyway from fault tolerance point of view).
> > Also it does take a nontrivial performance hit with replicated levels.
> >
> > Regards,
> > Mridul
> > On 17-May-2014 8:16 am, "Xiangrui Meng" <men...@gmail.com> wrote:
> >
> >> With 3x replication, we should be able to achieve fault tolerance.
> >> This checkPointed RDD can be cleared if we have another in-memory
> >> checkPointed RDD down the line. It can avoid hitting disk if we have
> >> enough memory to use. We need to investigate more to find a good
> >> solution. -Xiangrui
> >>
> >> On Fri, May 16, 2014 at 4:00 PM, Mridul Muralidharan <mri...@gmail.com>
> >> wrote:
> >>> Effectively this is persist without fault tolerance.
> >>> Failure of any node means complete lack of fault tolerance.
> >>> I would be very skeptical of truncating lineage if it is not reliable.
> >>> On 17-May-2014 3:49 am, "Xiangrui Meng (JIRA)" <j...@apache.org> wrote:
> >>>
> >>>> Xiangrui Meng created SPARK-1855:
> >>>> ------------------------------------
> >>>>
> >>>>              Summary: Provide memory-and-local-disk RDD checkpointing
> >>>>                  Key: SPARK-1855
> >>>>                  URL: https://issues.apache.org/jira/browse/SPARK-1855
> >>>>              Project: Spark
> >>>>           Issue Type: New Feature
> >>>>           Components: MLlib, Spark Core
> >>>>     Affects Versions: 1.0.0
> >>>>             Reporter: Xiangrui Meng
> >>>>
> >>>>
> >>>> Checkpointing is used to cut long lineage while maintaining fault
> >>>> tolerance. The current implementation is HDFS-based. Using the BlockRDD we
> >>>> can create in-memory-and-local-disk (with replication) checkpoints that are
> >>>> not as reliable as HDFS-based solution but faster.
> >>>>
> >>>> It can help applications that require many iterations.
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> This message was sent by Atlassian JIRA
> >>>> (v6.2#6252)