Create an array out of the columns, convert to a DataFrame, explode, distinct, write.

On 19 Sep 2016 19:11, "Saurav Sinha" <sauravsinh...@gmail.com> wrote:
> You can use distinct over your data frame or rdd
>
> rdd.distinct
>
> It will give you the distinct rows.
>
> On Mon, Sep 19, 2016 at 2:35 PM, Abhishek Anand <abhis.anan...@gmail.com>
> wrote:
>
>> I have an rdd which contains 14 different columns. I need to find the
>> distinct across all the columns of rdd and write it to hdfs.
>>
>> How can I achieve this?
>>
>> Is there any distributed data structure that I can use and keep on
>> updating it as I traverse the new rows?
>>
>> Regards,
>> Abhi
>>
>
> --
> Thanks and Regards,
>
> Saurav Sinha
>
> Contact: 9742879062
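
For reference, the approach suggested at the top of this thread (array of columns, explode, distinct, write) could be sketched in PySpark roughly as below. This is only a sketch under some assumptions: it reads "distinct across all the columns" as the set of distinct values pooled over every column (if distinct rows are wanted, rdd.distinct from the quoted reply is enough), it assumes each RDD row is a tuple or list of column values, and the DataFrame variant assumes all columns share a compatible type. The function names and the output path are made up for illustration.

```python
def distinct_values(rdd):
    """Distinct values pooled over every column of every row of an RDD."""
    # flatMap turns each row into its individual column values, which is
    # the RDD analogue of "array of columns, then explode".
    return rdd.flatMap(lambda row: list(row)).distinct()


def distinct_values_df(df):
    """Same idea for a DataFrame: array of columns, explode, distinct."""
    # Imported here so the RDD helper above has no hard PySpark dependency.
    from pyspark.sql import functions as F

    # F.array packs all columns of a row into one array column (the columns
    # are assumed to have a compatible type); F.explode then emits one
    # output row per array element.
    return df.select(F.explode(F.array(*df.columns)).alias("value")).distinct()


def save_distinct_values(rdd, output_path):
    """Write the distinct values as text, one value per line."""
    # output_path is a placeholder, e.g. an hdfs:// directory that does
    # not exist yet; saveAsTextFile refuses to overwrite an existing one.
    distinct_values(rdd).saveAsTextFile(output_path)
```

With a live SparkContext sc, something like save_distinct_values(sc.parallelize(rows), "hdfs:///some/output/dir") would run the whole pipeline. No separate distributed data structure has to be maintained by hand, since distinct already does the de-duplication across partitions.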