Re: No overwrite flag for saveAsXXFile

Nan Zhu Fri, 06 Mar 2015 07:36:41 -0800

Actually, besides setting spark.hadoop.validateOutputSpecs to false, which 
disables output validation for the whole program, the Spark implementation 
uses a DynamicVariable (in object PairRDDFunctions) internally to disable it 
on a case-by-case basis:

val disableOutputSpecValidation: DynamicVariable[Boolean] = new DynamicVariable[Boolean](false)
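
For anyone unfamiliar with scala.util.DynamicVariable, here is a minimal standalone sketch of how such a flag behaves. The object name ValidationFlag and the helper shouldValidate are hypothetical and exist only for illustration; this is not the Spark source, just the same kind of declaration used in isolation.

```scala
import scala.util.DynamicVariable

object ValidationFlag {
  // Same shape as the declaration above: defaults to false, so validation stays on.
  val disableOutputSpecValidation: DynamicVariable[Boolean] =
    new DynamicVariable[Boolean](false)

  // Hypothetical helper: validate unless the flag is set in the current scope.
  def shouldValidate: Boolean = !disableOutputSpecValidation.value

  def main(args: Array[String]): Unit = {
    println(shouldValidate) // validation is on by default
    disableOutputSpecValidation.withValue(true) {
      println(shouldValidate) // switched off only inside this block
    }
    println(shouldValidate) // previous value is restored after the block
  }
}
```

withValue binds the new value only for the duration of the block on the calling thread, which is what makes the "case-by-case" behaviour possible without touching the global setting.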

I’m not sure the benefit is large enough to make it worth exposing 
this variable to the user…
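
In the meantime, the two workarounds mentioned in this thread (the global setting, or removing the output directory first) might look roughly like the sketch below. The object name, app name, and output path are hypothetical, and it assumes a local Spark installation on the classpath.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.{SparkConf, SparkContext}

object OverwriteWorkarounds {
  def main(args: Array[String]): Unit = {
    // Hypothetical output location, for illustration only.
    val output = new Path("/tmp/overwrite-demo")

    val conf = new SparkConf().setAppName("overwrite-demo").setMaster("local[*]")
    // Option 1 (global): uncomment to skip the output check for the whole program.
    // Old part files are NOT cleaned up in that case.
    // conf.set("spark.hadoop.validateOutputSpecs", "false")
    val sc = new SparkContext(conf)

    // Option 2: remove the old directory first, so no stale part files remain.
    val fs = output.getFileSystem(sc.hadoopConfiguration)
    if (fs.exists(output)) fs.delete(output, true) // true = recursive

    sc.parallelize(1 to 10).saveAsTextFile(output.toString)
    sc.stop()
  }
}
```

Option 2 avoids the mix of old and new partitions described in the quoted discussion below, because the directory is emptied before the new job writes anything.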

Best,  

--  
Nan Zhu
http://codingcat.me


On Friday, March 6, 2015 at 10:22 AM, Ted Yu wrote:

> Found this thread:
> http://search-hadoop.com/m/JW1q5HMrge2
>  
> Cheers
>  
> On Fri, Mar 6, 2015 at 6:42 AM, Sean Owen <so...@cloudera.com 
> (mailto:so...@cloudera.com)> wrote:
> > This was discussed in the past and viewed as dangerous to enable. The
> > biggest problem, by far, comes when you have a job that outputs M
> > partitions, 'overwriting' a directory of data containing N > M old
> > partitions. You suddenly have a mix of new and old data.
> >  
> > It doesn't match Hadoop's semantics either, which won't let you do
> > this. You can of course simply remove the output directory.
> >  
> > On Fri, Mar 6, 2015 at 2:20 PM, Ted Yu <yuzhih...@gmail.com 
> > (mailto:yuzhih...@gmail.com)> wrote:
> > > Adding support for an overwrite flag would make saveAsXXFile more 
> > > user-friendly.
> > >
> > > Cheers
> > >
> > >
> > >
> > >> On Mar 6, 2015, at 2:14 AM, Jeff Zhang <zjf...@gmail.com 
> > >> (mailto:zjf...@gmail.com)> wrote:
> > >>
> > >> Hi folks,
> > >>
> > >> I found that RDD.saveAsXXFile has no overwrite flag, which I think 
> > >> would be very helpful. Is there any reason for this?
> > >>
> > >>
> > >>
> > >> --
> > >> Best Regards
> > >>
> > >> Jeff Zhang
> > >
>
