Re: [jira] [Created] (SPARK-1855) Provide memory-and-local-disk RDD checkpointing

Jacek Laskowski Sun, 18 May 2014 11:21:31 -0700

Hi,

I'm curious whether it's common practice to hold discussions in JIRA rather than here on the list.
I don't think that's the ASF way.


Regards,
Jacek Laskowski
http://blog.japila.pl
On 17 May 2014 23:55, "Matei Zaharia" <[email protected]> wrote:

> We do actually have replicated StorageLevels in Spark. You can use
> MEMORY_AND_DISK_2 or construct your own StorageLevel with your own custom
> replication factor.
>
> BTW you guys should probably have this discussion on the JIRA rather than
> the dev list; I think the replies somehow ended up on the dev list.
>
> Matei
>
> On May 17, 2014, at 1:36 AM, Mridul Muralidharan <[email protected]> wrote:
>
> > We don't have 3x replication in spark :-)
> > And if we use replicated storagelevel, while decreasing odds of failure, it
> > does not eliminate it (since we are not doing a great job with replication
> > anyway from fault tolerance point of view).
> > Also it does take a nontrivial performance hit with replicated levels.
> >
> > Regards,
> > Mridul
> > On 17-May-2014 8:16 am, "Xiangrui Meng" <[email protected]> wrote:
> >
> >> With 3x replication, we should be able to achieve fault tolerance.
> >> This checkPointed RDD can be cleared if we have another in-memory
> >> checkPointed RDD down the line. It can avoid hitting disk if we have
> >> enough memory to use. We need to investigate more to find a good
> >> solution. -Xiangrui
> >>
> >> On Fri, May 16, 2014 at 4:00 PM, Mridul Muralidharan <[email protected]> wrote:
> >>> Effectively this is persist without fault tolerance.
> >>> Failure of any node means complete lack of fault tolerance.
> >>> I would be very skeptical of truncating lineage if it is not reliable.
> >>> On 17-May-2014 3:49 am, "Xiangrui Meng (JIRA)" <[email protected]> wrote:
> >>>
> >>>> Xiangrui Meng created SPARK-1855:
> >>>> ------------------------------------
> >>>>
> >>>>             Summary: Provide memory-and-local-disk RDD checkpointing
> >>>>                 Key: SPARK-1855
> >>>>                 URL: https://issues.apache.org/jira/browse/SPARK-1855
> >>>>             Project: Spark
> >>>>          Issue Type: New Feature
> >>>>          Components: MLlib, Spark Core
> >>>>    Affects Versions: 1.0.0
> >>>>            Reporter: Xiangrui Meng
> >>>>
> >>>>
> >>>> Checkpointing is used to cut long lineage while maintaining fault
> >>>> tolerance. The current implementation is HDFS-based. Using the BlockRDD
> >>>> we can create in-memory-and-local-disk (with replication) checkpoints
> >>>> that are not as reliable as HDFS-based solution but faster.
> >>>>
> >>>> It can help applications that require many iterations.
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> This message was sent by Atlassian JIRA
> >>>> (v6.2#6252)
> >>>>
> >>
>
>
