Since we already have the "spark.hadoop.validateOutputSpecs" config, I think there is not much need to expose disableOutputSpecValidation.
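
For reference, a minimal sketch of turning the check off for a whole application. It assumes the job is launched with spark-submit (so the master is supplied externally); the object name, application name, and output path are placeholders of mine, not anything from this thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; launch it with spark-submit.
object DisableOutputValidation {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("overwrite-sketch") // placeholder name
      // Skip the "output directory already exists" check for every
      // saveAsHadoopFile-style call made by this application.
      .set("spark.hadoop.validateOutputSpecs", "false")

    val sc = new SparkContext(conf)
    try {
      // With validation off, this no longer fails if the directory exists.
      // Placeholder path; existing part files with the same names may be replaced.
      sc.parallelize(1 to 100, 4).saveAsTextFile("/tmp/overwrite-sketch-output")
    } finally {
      sc.stop()
    }
  }
}
```

Note that this setting applies to the whole program, so it does not protect against the mix of old and new partitions described further down the thread.
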
Cheers

On Fri, Mar 6, 2015 at 7:34 AM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
> Actually, besides setting spark.hadoop.validateOutputSpecs to false to
> disable output validation for the whole program, the Spark implementation
> uses a DynamicVariable (in object PairRDDFunctions) internally to disable
> it on a case-by-case basis:
>
> val disableOutputSpecValidation: DynamicVariable[Boolean] =
>   new DynamicVariable[Boolean](false)
>
> I’m not sure there is enough benefit to make it worth exposing this
> variable to the user…
>
> Best,
>
> --
> Nan Zhu
> http://codingcat.me
>
> On Friday, March 6, 2015 at 10:22 AM, Ted Yu wrote:
> > Found this thread:
> > http://search-hadoop.com/m/JW1q5HMrge2
> >
> > Cheers
> >
> > On Fri, Mar 6, 2015 at 6:42 AM, Sean Owen <so...@cloudera.com> wrote:
> > > This was discussed in the past and viewed as dangerous to enable. The
> > > biggest problem, by far, comes when you have a job that outputs M
> > > partitions, 'overwriting' a directory of data containing N > M old
> > > partitions. You suddenly have a mix of new and old data.
> > >
> > > It doesn't match Hadoop's semantics either, which won't let you do
> > > this. You can of course simply remove the output directory.
> > >
> > > On Fri, Mar 6, 2015 at 2:20 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > > Adding support for an overwrite flag would make saveAsXXFile more
> > > > user friendly.
> > > >
> > > > Cheers
> > > >
> > > > On Mar 6, 2015, at 2:14 AM, Jeff Zhang <zjf...@gmail.com> wrote:
> > > > > Hi folks,
> > > > >
> > > > > I found that RDD:saveXXFile has no overwrite flag, which I think
> > > > > would be very helpful. Is there any reason for this?
> > > > >
> > > > > --
> > > > > Best Regards
> > > > >
> > > > > Jeff Zhang
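
The alternative Sean mentions in the quoted thread, removing the output directory before saving, can be sketched roughly as follows. This is only an illustration using the Hadoop FileSystem API; the object name, application name, and path are placeholders, and it again assumes launch via spark-submit.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; launch it with spark-submit.
object DeleteThenSave {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("delete-then-save-sketch"))
    try {
      val output = new Path("/tmp/delete-then-save-output") // placeholder path

      // Resolve the file system (HDFS, local, ...) that owns this path.
      val fs = output.getFileSystem(sc.hadoopConfiguration)

      // Recursively delete any previous output so that no old part files
      // survive next to the new ones.
      if (fs.exists(output)) {
        fs.delete(output, true)
      }

      // Output validation stays enabled; the save succeeds because the
      // directory no longer exists.
      sc.parallelize(1 to 100, 4).saveAsTextFile(output.toString)
    } finally {
      sc.stop()
    }
  }
}
```

Because the whole directory is removed first, a job writing M partitions cannot leave behind any of N > M older partitions.
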